International Tables for Crystallography, Volume C: Mathematical, physical and chemical tables. Edited by E. Prince.

International Tables for Crystallography (2006). Vol. C. ch. 8.4, pp. 702-703

Section 8.4.1. The χ² distribution

E. Prince^a and C. H. Spiegelman^b

^a NIST Center for Neutron Research, National Institute of Standards and Technology, Gaithersburg, MD 20899, USA, and ^b Department of Statistics, Texas A&M University, College Station, TX 77843, USA

8.4.1. The χ² distribution


We have seen [equation (8.1.2.1)] that the least-squares estimate is derived by finding the minimum value of a sum of terms of the form

    R_i = w_i [y_i - M_i({\bf x})]^2, \eqno(8.4.1.1)

and, further, that the precision of the estimate is optimized if the weight, w_i, is the reciprocal of the variance of the population from which the observation is drawn, w_i = 1/σ_i². Using this relation, (8.4.1.1) can be written

    R_i = \{[y_i - M_i({\bf x})]/\sigma_i\}^2. \eqno(8.4.1.2)

Each term is the square of a difference between observed and calculated values, expressed as a fraction of the standard uncertainty of the observed value. But, by definition,

    \sigma_i^2 = \left\langle [y_i - M_i({\bf x})]^2 \right\rangle, \eqno(8.4.1.3)

where x has its unknown `correct' value, so that ⟨R_i⟩ = 1, and the expected value of the sum of n such terms is n. It can be shown (Draper & Smith, 1981) that each estimated parameter reduces this expected sum by one, so that, for p estimated parameters,

    \langle S \rangle = \left\langle \sum_{i=1}^{n} \left\{ [y_i - M_i(\widehat{\bf x})]/\sigma_i \right\}^2 \right\rangle = n - p, \eqno(8.4.1.4)

where x̂ is the least-squares estimate. The standard uncertainty of an observation of unit weight, also referred to as the goodness-of-fit parameter, is defined by

    G = \left[ {S \over n - p} \right]^{1/2} = \left[ {\sum_{i=1}^{n} w_i [y_i - M_i(\widehat{\bf x})]^2 \over n - p} \right]^{1/2}. \eqno(8.4.1.5)

From (8.4.1.4), it follows that ⟨G⟩ = 1 for a correct model with weights assigned in accordance with (8.4.1.2).
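As an illustration of (8.4.1.5), the following short sketch (an addition for illustration, not part of the original text; the observations, model values and standard uncertainties are hypothetical placeholders) computes G for a weighted fit in Python:

    import numpy as np

    y     = np.array([10.2, 4.8, 7.1, 3.3, 9.6])   # observations y_i (hypothetical)
    m     = np.array([10.0, 5.0, 7.0, 3.5, 9.5])   # model values M_i(x-hat)
    sigma = np.array([0.2, 0.2, 0.1, 0.2, 0.1])    # standard uncertainties sigma_i
    p = 2                                          # number of fitted parameters

    w = 1.0 / sigma**2                             # weights w_i = 1/sigma_i^2
    S = np.sum(w * (y - m)**2)                     # sum of the terms (8.4.1.2)
    G = np.sqrt(S / (len(y) - p))                  # goodness of fit, (8.4.1.5)
    print(G)                                       # near 1 for a correct model and weights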

A value of G that is close to one, if the weights have been assigned by w_i = 1/σ_i², is an indicator that the model is consistent with the data. It is not necessarily an indicator that the model is `correct', however, because it does not rule out the existence of an alternative model that fits the data as well or better. An assessment of the adequacy of the fit of a given model depends on what is meant by `close to one', which depends in turn on the spread of a probability density function for G. We saw in Chapter 8.1 that least squares with this weighting scheme gives the best linear unbiased estimate of the model parameters, with no restrictions on the p.d.f.s of the populations from which the observations are drawn except for the implicit assumption that the variances of these p.d.f.s are finite. To construct a p.d.f. for G, however, it is necessary to make an assumption about the shapes of the p.d.f.s for the observations. The usual assumption is that these p.d.f.s can be described by the normal p.d.f.,

    \Phi_N(x, \mu, \sigma) = {1 \over \sqrt{2\pi}\,\sigma} \exp\left[ -{(x - \mu)^2 \over 2\sigma^2} \right]. \eqno(8.4.1.6)

The justification for this assumption comes from the central-limit theorem, which states that, under rather broad conditions, the p.d.f. of the arithmetic mean of n observations drawn from a population with mean μ and variance σ² tends, for large n, to a normal distribution with mean μ and variance σ²/n. [For a discussion of the central-limit theorem, see Cramér (1951).]
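The theorem is easily illustrated numerically. The following sketch (an addition for illustration, not from the original text) draws means of n observations from a decidedly non-normal uniform population and checks that their distribution has the predicted mean and variance:

    import numpy as np

    rng = np.random.default_rng(0)
    n, trials = 50, 100000
    # Uniform population on [0, 1): mean mu = 0.5, variance sigma^2 = 1/12.
    means = rng.uniform(0.0, 1.0, size=(trials, n)).mean(axis=1)
    print(means.mean())   # close to mu = 0.5
    print(means.var())    # close to sigma^2/n = 1/(12*50), about 0.00167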

If we make the assumption of a normal distribution of errors and make the substitution z = (x − μ)/σ, (8.4.1.6) becomes

    \Phi_N(z, 0, 1) = {1 \over \sqrt{2\pi}} \exp\left( -{z^2 \over 2} \right). \eqno(8.4.1.7)

The probability that z² will be less than χ² is equal to the probability that z will lie in the interval −χ ≤ z ≤ χ, or

    \Psi(\chi^2) = \int_0^{\chi^2} \Phi(z^2)\,{\rm d}z^2 = \int_{-\chi}^{+\chi} \Phi(z)\,{\rm d}z. \eqno(8.4.1.8)

Letting t = z²/2 and substituting in (8.4.1.7), this becomes

    \Psi(\chi^2) = {1 \over \sqrt{\pi}} \int_0^{\chi^2/2} t^{-1/2} \exp(-t)\,{\rm d}t. \eqno(8.4.1.9)

Φ(χ²) = dΨ(χ²)/dχ², so that

    \Phi(\chi^2) = (2\pi\chi^2)^{-1/2} \exp(-\chi^2/2), \quad \chi^2 > 0,
    \Phi(\chi^2) = 0, \quad \chi^2 \leq 0. \eqno(8.4.1.10)

The joint p.d.f. of the squares of two random variables, z_1 and z_2, drawn independently from the same population with a normal p.d.f. is

    \Phi_J(z_1^2, z_2^2) = {1 \over 2\pi z_1 z_2} \exp\left[ -{z_1^2 + z_2^2 \over 2} \right], \eqno(8.4.1.11)

and the p.d.f. of the sum, s², of these two terms is the integral over the joint p.d.f. of all pairs of z_1² and z_2² that add up to s²:

    \Phi(s^2) = {1 \over 2\pi} \exp\left( -{s^2 \over 2} \right) \int_0^{s^2} \left[ z_1^2 (s^2 - z_1^2) \right]^{-1/2}\,{\rm d}z_1^2. \eqno(8.4.1.12)

This integral can be evaluated by use of the gamma and beta functions. The gamma function is defined for positive real x by

    \Gamma(x) = \int_0^\infty t^{x-1} \exp(-t)\,{\rm d}t. \eqno(8.4.1.13)

Although this function is continuous for all x > 0, its value is of interest in the context of this analysis only for x equal to positive, integral multiples of 1/2. It can be shown that Γ(1/2) = √π, Γ(1) = 1, and Γ(x + 1) = xΓ(x). It follows that, for a positive integer, n, Γ(n) = (n − 1)!, and that Γ(3/2) = √π/2, Γ(5/2) = 3√π/4, etc. The beta function is defined by

    B(x, y) = \int_0^1 t^{x-1} (1 - t)^{y-1}\,{\rm d}t. \eqno(8.4.1.14)

It can be shown (Prince, 1994) that B(x, y) = Γ(x)Γ(y)/Γ(x + y). Making the substitution t = z_1²/s², (8.4.1.12) becomes

    \Phi(s^2) = {1 \over 2\pi} \exp\left( -{s^2 \over 2} \right) \int_0^1 [t(1 - t)]^{-1/2}\,{\rm d}t
              = {1 \over 2\pi} \exp\left( -{s^2 \over 2} \right) B(1/2, 1/2)
              = {1 \over 2} \exp\left( -{s^2 \over 2} \right), \quad s^2 \geq 0. \eqno(8.4.1.15)

By a similar procedure, it can be shown that, if χ² is the sum of ν terms, z_1², z_2², …, z_ν², where all are drawn independently from a population with the p.d.f. given in (8.4.1.10), χ² has the p.d.f.

    \Phi(\chi^2, \nu) = {(\chi^2)^{\nu/2 - 1} \over 2^{\nu/2}\,\Gamma(\nu/2)} \exp\left( -{\chi^2 \over 2} \right), \quad \chi^2 > 0,
    \Phi(\chi^2, \nu) = 0, \quad \chi^2 \leq 0. \eqno(8.4.1.16)

The parameter ν is known as the number of degrees of freedom, but this use of that term must not be confused with the conventional use in physics and chemistry. The p.d.f. in (8.4.1.16) is the chi-squared distribution with ν degrees of freedom. Table 8.4.1.1 gives the values of χ²/ν for which the cumulative distribution function (c.d.f.) Ψ(χ², ν) has various values, for various choices of ν. This table is provided to enable verification of computer codes that may be used to generate more extensive tables. It was generated using a program included in the statistical library DATAPAC (Filliben, unpublished); Fortran code for this program appears in Prince (1994).
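The result (8.4.1.16) can also be checked directly. The sketch below (an illustrative addition, not the DATAPAC code mentioned above) implements the p.d.f. using the gamma function of (8.4.1.13) and verifies by crude midpoint integration that it is normalized and has mean ν:

    import math

    def chi2_pdf(x, nu):
        """Phi(chi^2, nu) of equation (8.4.1.16)."""
        if x <= 0.0:
            return 0.0
        return x**(nu/2 - 1) * math.exp(-x/2) / (2**(nu/2) * math.gamma(nu/2))

    nu, h = 5, 0.001
    xs = [(k + 0.5) * h for k in range(100000)]        # midpoints on (0, 100)
    total = sum(chi2_pdf(x, nu) * h for x in xs)
    mean  = sum(x * chi2_pdf(x, nu) * h for x in xs)
    print(total)   # approximately 1.0
    print(mean)    # approximately nu = 5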

Table 8.4.1.1. Values of χ²/ν for which the c.d.f. Ψ(χ², ν) has the values given in the column headings, for various values of ν

  ν      0.5      0.9      0.95     0.99     0.995
  1      0.4549   2.7055   3.8415   6.6349   7.8795
  2      0.6931   2.3026   2.9957   4.6052   5.2983
  3      0.7887   2.0838   2.6049   3.7816   4.2794
  4      0.8392   1.9449   2.3719   3.3192   3.7151
  6      0.8914   1.7741   2.0986   2.8020   3.0913
  8      0.9180   1.6702   1.9384   2.5113   2.7444
 10      0.9342   1.5987   1.8307   2.3209   2.5188
 15      0.9559   1.4871   1.6664   2.0385   2.1868
 20      0.9669   1.4206   1.5705   1.8783   1.9999
 25      0.9735   1.3753   1.5061   1.7726   1.8771
 30      0.9779   1.3419   1.4591   1.6964   1.7891
 40      0.9834   1.2951   1.3940   1.5923   1.6692
 50      0.9867   1.2633   1.3501   1.5231   1.5898
 60      0.9889   1.2400   1.3180   1.4730   1.5325
 80      0.9917   1.2072   1.2735   1.4041   1.4540
100      0.9933   1.1850   1.2434   1.3581   1.4017
120      0.9945   1.1686   1.2214   1.3246   1.3638
140      0.9952   1.1559   1.2044   1.2989   1.3346
160      0.9958   1.1457   1.1907   1.2783   1.3114
200      0.9967   1.1301   1.1700   1.2472   1.2763
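Since the table is intended for verifying computer codes, a few entries can be checked against an independent implementation. The following sketch (an illustrative addition, using scipy.stats rather than DATAPAC) compares three tabulated values of χ²/ν with the inverse c.d.f. of the chi-squared distribution:

    from scipy.stats import chi2

    # (nu, column heading, tabulated chi^2/nu) for three entries of Table 8.4.1.1.
    for nu, q, expected in [(1, 0.5, 0.4549), (10, 0.95, 1.8307),
                            (200, 0.995, 1.2763)]:
        value = chi2.ppf(q, df=nu) / nu     # inverse c.d.f., scaled by nu
        print(nu, q, round(value, 4), expected)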

The quantity (n − p)G² is the sum of n terms that have mean value (n − p)/n. Because the process of determining the least-squares fit establishes p relations among them, however, only (n − p) of the terms are independent. The number of degrees of freedom is therefore ν = (n − p), and, if the model is correct, and the terms have been properly weighted, χ² = (n − p)G² has the chi-squared distribution with (n − p) degrees of freedom. In crystallography, the number of degrees of freedom tends to be large, and the p.d.f. for G correspondingly sharp, so that even rather small deviations from G² = 1 should cause one or both of the hypotheses of a correct model and appropriate weights to be rejected. It is common practice to assume that the model is correct and that the weights have correct relative values, that is, that they have been assigned by w_i = k/σ_i², where k is a number different from, usually greater than, one. G² is then taken to be an estimate of k, and all elements of (A^T W A)^{-1} (Section 8.1.2) are multiplied by G² to get an estimated variance–covariance matrix. The range of validity of this procedure is limited at best. It is discussed further in Chapter 8.5.
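The following sketch (an illustrative addition with a hypothetical design matrix and data, not a procedure prescribed by the text) shows this rescaling for a weighted linear fit:

    import numpy as np

    A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0],
                  [1.0, 3.0], [1.0, 4.0]])            # design matrix, n x p
    y = np.array([0.1, 1.2, 1.9, 3.2, 3.9])           # observations (hypothetical)
    W = np.diag([4.0, 4.0, 4.0, 4.0, 4.0])            # weights w_i = k/sigma_i^2

    cov_unscaled = np.linalg.inv(A.T @ W @ A)         # (A^T W A)^{-1}
    x_hat = cov_unscaled @ (A.T @ W @ y)              # least-squares estimate
    r = y - A @ x_hat                                 # residuals
    G2 = (r @ W @ r) / (len(y) - A.shape[1])          # G^2 = S/(n - p), estimates k
    cov = G2 * cov_unscaled                           # rescaled variance-covariance matrix
    print(x_hat, cov)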

References

Cramér, H. (1951). Mathematical methods of statistics. Princeton, NJ: Princeton University Press.
Draper, N. & Smith, H. (1981). Applied regression analysis. New York: John Wiley.
Prince, E. (1994). Mathematical techniques in crystallography and materials science, 2nd ed. Berlin/Heidelberg/New York: Springer-Verlag.







































