Ideal probability density distributions

Shmueli, U.; Wilson, A. J. C.

doi:10.1107/97809553602060000554

International Tables for Crystallography, Volume B: Reciprocal space. Edited by U. Shmueli

International Tables for Crystallography (2006). Vol. B, ch. 2.1, pp. 195–197

Section 2.1.5. Ideal probability density distributions

U. Shmueli^a ^* and A. J. C. Wilson^b ^‡

^a School of Chemistry, Tel Aviv University, Tel Aviv 69 978, Israel, and ^b St John's College, Cambridge, England
Correspondence e-mail: ushmueli@post.tau.ac.il

2.1.5. Ideal probability density distributions

In applications of the central-limit theorem, and its extensions, to intensity statistics the $[x_{j}]$'s of equation (2.1.4.19) have the form (atomic scattering factor of the jth atom) times (a trigonometric expression characteristic of the space group and Wyckoff position; also known as the trigonometric structure factor). These trigonometric expressions for all the space groups, and general Wyckoff positions, are given in Tables A1.4.3.1 through A1.4.3.7, and their first few even moments (fixed-index averaging) are given in Table 2.1.7.1. One cannot, of course, conclude that the magnitudes of the structure factor always have a normal distribution – even if the structure is homoatomic; one must look at each problem and see what components of the structure factor can be put in the form (2.1.4.19), deduce the m and $[\sigma^{2}]$ to be used for each, and combine the components to obtain the asymptotic (large N, not large x) expression for the problem in question. Ordinarily the components are the real and the imaginary parts of the structure factor; the structure factor is purely real only if the structure is centrosymmetric, the space-group origin is chosen at a crystallographic centre and the atoms are non-dispersive.

2.1.5.1. Ideal acentric distributions

The ideal acentric distributions are obtained by applying the central-limit theorem to the real and the imaginary parts of the structure factor, as given by equation (2.1.1.1). Consider first a crystal with no rotational symmetry (space group P1). The real part, A, of the structure factor is then given by $[A = \textstyle\sum\limits_{j = 1}^{N}f_{j}\cos \vartheta_{j}, \eqno(2.1.5.1)]$ where N is the number of atoms in the unit cell and $[\vartheta_{j}]$ is the phase angle of the jth atom. The central-limit theorem then states that A tends to be normally distributed about its mean value with variance equal to its mean-square deviation from its mean. Under the assumption that the phase angles $[\vartheta_{j}]$ are uniformly distributed on the 0–2π range, the mean value of each cosine is zero, so that the mean of A is zero and the variance of A is $[\sigma^{2} = \textstyle\sum\limits_{j = 1}^{N}f_{j}^{2}\langle \cos^{2} \vartheta_{j} \rangle. \eqno(2.1.5.2)]$ Under the same assumption, the mean value of each $[\cos^{2} \vartheta]$ is one-half, so that the variance becomes $[\sigma^{2} = (1/2)\textstyle\sum\limits_{j = 1}^{N}f_{j}^{2} = (1/2)\Sigma, \eqno(2.1.5.3)]$ where Σ is the sum of the squares of the atomic scattering factors [cf. equation (2.1.2.4)]. The asymptotic form of the distribution of A is therefore given by $[p(A)\;{\rm d}A = (\pi\Sigma)^{-1/2}\exp(-A^{2}/\Sigma)\;{\rm d}A. \eqno(2.1.5.4)]$ A similar calculation, with sines instead of cosines, gives an analogous distribution for the imaginary part B, so that the joint probability of the real and imaginary parts of F is $[p(A,B)\;{\rm d}A\;{\rm d}B = (\pi\Sigma)^{-1}\exp[-(A^{2}+B^{2})/\Sigma]\;{\rm d}A\;{\rm d}B. \eqno(2.1.5.5)]$ Ordinarily, however, we are more interested in the distribution of the magnitude, $[|F|]$, of the structure factor than in the distribution of A and B. Using polar coordinates in equation (2.1.5.5) [$[A = |F|\cos\phi]$, $[B = |F|\sin\phi]$] and integrating over the angle ϕ gives $[p(|F|)\;{\rm d}|F| = (2|F|/\Sigma)\exp(-|F|^{2}/\Sigma)\;{\rm d}|F|. \eqno(2.1.5.6)]$ It is usually convenient, in structure-factor and intensity statistics, to express the results in terms of the normalized structure factor E and its magnitude $[|E|]$. If $[|F|]$ has been put on an absolute scale (see Section 2.2.4.3), we have $[E = {{F}\over{\sqrt{\Sigma}}}\quad{\rm and}\quad |E| = {{|F|}\over{\sqrt{\Sigma}}}, \eqno(2.1.5.7)]$ so that $[p(|E|)\;{\rm d}|E| = 2|E|\exp(-|E|^{2})\;{\rm d}|E| \eqno(2.1.5.8)]$ is the normalized-structure-factor version of (2.1.5.6).
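The derivation above can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original text: it assumes equal atoms (all $[f_{j} = 1]$), a modest hypothetical N of 20 and independent uniformly distributed phase angles, and it compares the simulated fraction of reflections with $[|E| < 1]$ with the value obtained by integrating (2.1.5.8) from 0 to 1. The function name `simulate_abs_e` and all numerical settings are illustrative choices.

```python
import math
import random


def simulate_abs_e(n_atoms, n_trials, seed=0):
    """Return |E| for n_trials random P1 'structures' of equal atoms."""
    rng = random.Random(seed)
    f = 1.0                      # equal scattering factors (an assumption)
    sigma_sum = n_atoms * f * f  # Sigma = sum of f_j^2, cf. eq. (2.1.5.3)
    values = []
    for _ in range(n_trials):
        a = b = 0.0
        for _ in range(n_atoms):
            theta = rng.uniform(0.0, 2.0 * math.pi)  # uniform phase angle
            a += f * math.cos(theta)  # real part A, eq. (2.1.5.1)
            b += f * math.sin(theta)  # imaginary part B
        values.append(math.hypot(a, b) / math.sqrt(sigma_sum))  # eq. (2.1.5.7)
    return values


abs_e = simulate_abs_e(n_atoms=20, n_trials=10000)
mean_e2 = sum(e * e for e in abs_e) / len(abs_e)
frac_below_1 = sum(e < 1.0 for e in abs_e) / len(abs_e)
ideal_below_1 = 1.0 - math.exp(-1.0)  # integral of eq. (2.1.5.8) from 0 to 1

print(f"<|E|^2> = {mean_e2:.3f}")
print(f"fraction with |E| < 1: {frac_below_1:.3f} (ideal {ideal_below_1:.3f})")
```

Because N is finite, small systematic departures from the asymptotic values are to be expected in addition to the sampling noise.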

Distributions resulting from noncentrosymmetric crystals are known as acentric distributions; those arising from centrosymmetric crystals are known as centric. These adjectives are used to describe distributions, not crystal symmetry.

2.1.5.2. Ideal centric distributions

When a non-dispersive crystal is centrosymmetric, and the space-group origin is chosen at a crystallographic centre of symmetry, the imaginary part B of its structure amplitude is zero. In the simplest case, space group $[P\bar{1}]$, the contribution of the jth atom plus its centrosymmetric counterpart is $[2f_{j}\cos\vartheta_{j}]$. The calculation of $[p(A)]$ goes through as before, with allowance for the fact that there are $[N/2]$ pairs instead of N independent atoms; the variance of A is then $[\textstyle\sum_{j = 1}^{N/2}4f_{j}^{2}\langle\cos^{2}\vartheta_{j}\rangle = \Sigma]$ rather than $[\Sigma/2]$, giving $[p(A)\;{\rm d}A = (2\pi\Sigma)^{-1/2}\exp[-A^{2}/(2\Sigma)]\;{\rm d}A \eqno(2.1.5.9)]$ or equivalently $[p(|F|)\;{\rm d}|F| = [2/(\pi\Sigma)]^{1/2}\exp[-|F|^{2}/(2\Sigma)]\;{\rm d}|F| \eqno(2.1.5.10)]$ or $[p(|E|)\;{\rm d}|E| = (2/\pi)^{1/2}\exp(-|E|^{2}/2)\;{\rm d}|E|. \eqno(2.1.5.11)]$
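As a quick consistency check on the two ideal densities, the sketch below integrates (2.1.5.8) and (2.1.5.11) numerically with a simple midpoint rule. It is an illustration under stated assumptions rather than part of the original text: the helper `moment`, the integration limit and the step count are arbitrary choices. Both densities should integrate to unity and give a mean-square $[|E|]$ of one, while the means of $[|E|]$ differ, $[\pi^{1/2}/2]$ for the acentric and $[(2/\pi)^{1/2}]$ for the centric case.

```python
import math


def p_acentric(e):
    """Ideal acentric density of |E|, eq. (2.1.5.8)."""
    return 2.0 * e * math.exp(-e * e)


def p_centric(e):
    """Ideal centric density of |E|, eq. (2.1.5.11)."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * e * e)


def moment(pdf, order, upper=12.0, steps=24000):
    """Midpoint-rule estimate of <|E|^order> over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        e = (i + 0.5) * h
        total += pdf(e) * e ** order
    return total * h


for name, pdf in (("acentric", p_acentric), ("centric", p_centric)):
    norm, mean_e, mean_e2 = (moment(pdf, k) for k in (0, 1, 2))
    print(f"{name:9s} integral={norm:.5f} <|E|>={mean_e:.5f} <|E|^2>={mean_e2:.5f}")
```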

2.1.5.3. Effect of other symmetry elements on the ideal acentric and centric distributions

Additional crystallographic symmetry elements do not produce any essential alterations in the ideal centric or acentric distribution; their main effect is to replace the parameter Σ by a `distribution parameter', called S by Wilson (1950) and Rogers (1950), in certain groups of reflections. In addition, in noncentrosymmetric space groups, the distribution of certain groups of reflections becomes centric, though the general reflections remain acentric. The changes are summarized in Tables 2.1.3.1 and 2.1.3.2. The values of S are integers for lattice centring, glide planes and those screw axes that produce absences, and approximate integers for rotation axes and mirror planes; the modulations of the average intensity in reciprocal space outlined in Section 2.1.3.2 apply.

It should be noted that if intensities are normalized to the average of the group to which they belong, rather than to the general average, the distributions given in equations (2.1.5.8) and (2.1.5.11) are not affected.

2.1.5.4. Other ideal distributions

The distributions just derived are asymptotic, as they are limiting values for large N. They are the only ideal distributions, in this sense, when there is only strict crystallographic symmetry and no dispersion. However, other ideal (asymptotic) distributions arise when there is noncrystallographic symmetry, or if there is dispersion. The subcentric distribution, $[\eqalignno{p(|E|)\;{\rm d}|E| &= {{2|E|}\over{(1-k^{2})^{1/2}}}\exp[-|E|^{2}/(1-k^{2})] & \cr &\quad\times{} I_{0}\left({{k|E|^{2}}\over{1-k^{2}}}\right)\;{\rm d}|E|, &(2.1.5.12) \cr}]$ where $[I_{0}(x)]$ is a modified Bessel function of the first kind and k is the ratio of the scattering from the centrosymmetric part to the total scattering, arises when a noncentrosymmetric crystal contains centrosymmetric parts or when dispersion introduces effective noncentrosymmetry into the scattering from a centrosymmetric crystal (Srinivasan & Parthasarathy, 1976, ch. III; Wilson, 1980a,b; Shmueli & Wilson, 1983). The bicentric distribution $[p(|E|)\;{\rm d}|E| = \pi^{-3/2}\exp(-|E|^{2}/8)K_{0}(|E|^{2}/8)\;{\rm d}|E| \eqno(2.1.5.13)]$ arises, for example, when the `asymmetric unit in a centrosymmetric crystal is a centrosymmetric molecule' (Lipson & Woolfson, 1952); $[K_{0}(x)]$ is a modified Bessel function of the second kind. There are higher hypercentric, hyperparallel and sesquicentric analogues (Wilson, 1952; Rogers & Wilson, 1953; Wilson, 1956). The ideal subcentric and bicentric distributions are expressed in terms of known functions, but the higher hypercentric and the sesquicentric distributions have so far been studied only through their moments and integral representations. Certain hypersymmetric distributions can be expressed in terms of Meijer's G functions (Wilson, 1987b).
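The subcentric density (2.1.5.12) can be evaluated with nothing more than a power series for $[I_{0}(x)]$. The sketch below is illustrative only: the helper names `bessel_i0`, `p_subcentric` and `integrate`, the series truncation and the integration grid are assumptions, and the series is suitable only for the moderate arguments used here. It checks that the density integrates to unity, that the mean-square $[|E|]$ is one, and that the case k = 0 reproduces the acentric density (2.1.5.8).

```python
import math


def bessel_i0(x, terms=200):
    """Modified Bessel function I_0(x) summed from its power series."""
    total, term = 1.0, 1.0
    for m in range(1, terms):
        term *= (0.5 * x / m) ** 2  # ratio of consecutive series terms
        total += term
        if term < 1e-17 * total:    # remaining terms are negligible
            break
    return total


def p_subcentric(e, k):
    """Subcentric density of |E|, eq. (2.1.5.12), for 0 <= k < 1."""
    w = 1.0 - k * k
    return (2.0 * e / math.sqrt(w)) * math.exp(-e * e / w) * bessel_i0(k * e * e / w)


def integrate(func, upper=8.0, steps=4000):
    """Midpoint-rule integral of func over [0, upper]."""
    h = upper / steps
    return sum(func((i + 0.5) * h) for i in range(steps)) * h


for k in (0.0, 0.3, 0.6):
    norm = integrate(lambda e: p_subcentric(e, k))
    mean_e2 = integrate(lambda e: e * e * p_subcentric(e, k))
    print(f"k={k:.1f}: integral={norm:.5f}, <|E|^2>={mean_e2:.5f}")
```

For values of k close to 1 the argument of $[I_{0}]$ grows quickly, and an exponentially scaled Bessel routine from a numerical library would be a safer choice than the plain series.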

2.1.5.5. Relation to distributions of I

When only the intrinsic probability distributions are being considered, it does not greatly matter whether the variable chosen is the intensity of reflection (I), or its positive square root, the modulus of the structure factor $[(|F|)]$, since both are necessarily real and non-negative. In an obvious notation, the relation between the intensity distribution and the structure-factor distribution is $[p_{I}(I) = (1/2)I^{-1/2}p_{|F|}(I^{1/2}) \eqno(2.1.5.14)]$ or $[p_{|F|}(|F|) = 2|F|p_{I}(|F|^{2}). \eqno(2.1.5.15)]$ Statistical fluctuations in counting rates, however, introduce a small but finite probability of negative observed intensities (Wilson, 1978a, 1980a) and thus of imaginary structure factors. This practical complication is treated in IT C (2004, Parts 7 and 8).
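The change of variable in (2.1.5.14) and (2.1.5.15) is easily expressed as a pair of small functions. The sketch below is an illustration, not part of the original text: it applies (2.1.5.14) to the acentric density (2.1.5.6) and compares the result with the exponential form $[\exp(-I/\Sigma)/\Sigma]$ that follows algebraically. The value of Σ is hypothetical and the function names are arbitrary.

```python
import math

SIGMA = 50.0  # hypothetical value of Sigma, chosen only for illustration


def p_abs_f_acentric(f_abs, sigma=SIGMA):
    """Acentric density of |F|, eq. (2.1.5.6)."""
    return (2.0 * f_abs / sigma) * math.exp(-f_abs ** 2 / sigma)


def intensity_pdf(p_abs_f, intensity):
    """Eq. (2.1.5.14): p_I(I) = (1/2) I^(-1/2) p_|F|(I^(1/2))."""
    root = math.sqrt(intensity)
    return 0.5 / root * p_abs_f(root)


def abs_f_pdf(p_i, f_abs):
    """Eq. (2.1.5.15): p_|F|(|F|) = 2 |F| p_I(|F|^2)."""
    return 2.0 * f_abs * p_i(f_abs ** 2)


for intensity in (1.0, 25.0, 100.0):
    via_transform = intensity_pdf(p_abs_f_acentric, intensity)
    closed_form = math.exp(-intensity / SIGMA) / SIGMA
    print(f"I={intensity:6.1f}  transformed={via_transform:.6e}  exponential={closed_form:.6e}")
```

Applying `abs_f_pdf` to the transformed density returns the original density of $[|F|]$, which is a convenient round-trip test when other distributions are substituted.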

Both the ideal centric and acentric distributions are simple members of the family of gamma distributions, defined by $[\gamma_{n}(x)\;{\rm d}x = [\Gamma(n)]^{-1}x^{n-1}\exp(-x)\;{\rm d}x, \eqno(2.1.5.16)]$ where n is a parameter, not necessarily integral, and $[\Gamma(n)]$ is the gamma function. Thus the ideal acentric intensity distribution is $[\eqalignno{p(I)\;{\rm d}I & = \exp(-I/\Sigma)\;{\rm d}(I/\Sigma) &(2.1.5.17) \cr & = \gamma_{1}(I/\Sigma)\;{\rm d}(I/\Sigma) &(2.1.5.18)}]$ and the ideal centric intensity distribution is $[\eqalignno{p(I)\;{\rm d}I& = [2\Sigma/(\pi I)]^{1/2}\exp[-I/(2\Sigma)]\;{\rm d}[I/(2\Sigma)] &(2.1.5.19) \cr & = \gamma_{1/2}[I/(2\Sigma)]\;{\rm d}[I/(2\Sigma)]. &(2.1.5.20)}]$ The properties of gamma distributions and of the related beta distributions, summarized in Table 2.1.5.1, are used in Section 2.1.6 to derive the probability density functions of sums and of ratios of intensities drawn from one of the ideal distributions.
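The identification of the two ideal intensity distributions with gamma distributions can be verified directly. In the sketch below, which is an illustration with a hypothetical value of Σ, the acentric intensity density is taken as $[\exp(-I/\Sigma)/\Sigma]$ and the centric one as $[(2\pi\Sigma I)^{-1/2}\exp[-I/(2\Sigma)]]$, the latter obtained from (2.1.5.10) by the change of variable (2.1.5.14). Each is compared with $[\gamma_{n}(x)]$ of (2.1.5.16), with n = 1, $[x = I/\Sigma]$ and n = 1/2, $[x = I/(2\Sigma)]$ respectively, multiplied by ${\rm d}x/{\rm d}I$.

```python
import math


def gamma_pdf(x, n):
    """Gamma density of eq. (2.1.5.16): x^(n-1) exp(-x) / Gamma(n)."""
    return x ** (n - 1.0) * math.exp(-x) / math.gamma(n)


def p_i_acentric(i, sigma):
    """Acentric intensity density, exp(-I/Sigma)/Sigma."""
    return math.exp(-i / sigma) / sigma


def p_i_centric(i, sigma):
    """Centric intensity density from (2.1.5.10) via (2.1.5.14)."""
    return math.exp(-i / (2.0 * sigma)) / math.sqrt(2.0 * math.pi * sigma * i)


sigma = 40.0  # hypothetical Sigma, for illustration only
for i in (2.0, 40.0, 150.0):
    # p(I) dI = gamma_n(x) dx, so p(I) = gamma_n(x) * dx/dI
    acentric_gamma = gamma_pdf(i / sigma, 1.0) / sigma
    centric_gamma = gamma_pdf(i / (2.0 * sigma), 0.5) / (2.0 * sigma)
    print(f"I={i:6.1f}  acentric {p_i_acentric(i, sigma):.6e} {acentric_gamma:.6e}"
          f"  centric {p_i_centric(i, sigma):.6e} {centric_gamma:.6e}")
```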

Table 2.1.5.1. Some properties of gamma and beta distributions

If $[x_{1}, x_{2}, \ldots, x_{n}]$ are independent gamma-distributed variables with parameters $[p_{1}, p_{2}, \ldots, p_{n}]$, their sum is a gamma-distributed variable with $[p = p_{1} + p_{2} + \ldots + p_{n}]$.

If x and y are independent gamma-distributed variables with parameters p and q, then the ratio $[u = x/y]$ has the distribution $[\beta_{2} (u\hbox{; } p, q)]$.

With the same notation, the ratio $[v = x/(x + y)]$ has the distribution $[\beta_{1} (v\hbox{; }p, q)]$.

Differences and products of gamma-distributed variables do not lead to simple results. For proofs, details and references see Kendall & Stuart (1977).

Name of the distribution, its functional form, mean and variance
Gamma distribution with parameter p: $[\gamma_{p} (x) = [\Gamma (p)]^{-1} x^{p-1} \exp (-x)\hbox{;} \quad 0 \leq x \leq \infty,\quad p > 0]$ $[\hbox{mean: }\langle x\rangle = p\hbox{;} \quad \hbox{variance: } \langle (x - \langle x\rangle)^{2}\rangle = p.]$
Beta distribution of first kind with parameters p and q: $[\beta_{1} (x\hbox{; } p, q) = {\Gamma (p + q) \over \Gamma (p) \Gamma (q)} x^{p - 1} (1 - x)^{q - 1}\hbox{;} \quad 0 \leq x \leq 1,\quad p, q > 0]$ $[\hbox{mean: }\langle x\rangle = p/(p + q)\hbox{;}]$ $[\hbox{variance: }\langle (x - \langle x \rangle)^{2}\rangle = pq/[(p + q)^{2} (p + q + 1)].]$
Beta distribution of second kind with parameters p and q: $[\beta_{2} (x\hbox{; } p, q) = {\Gamma (p + q) \over \Gamma (p) \Gamma (q)} x^{p - 1} (1 + x)^{-p -q}\hbox{;} \quad 0 \leq x \leq \infty,\quad p, q > 0]$ $[\hbox{mean: }\langle x \rangle = p/(q - 1) \quad (q > 1);]$ $[\hbox{variance: }\langle (x - \langle x \rangle)^{2}\rangle = p(p + q - 1)/[(q - 1)^{2} (q - 2)] \quad (q > 2).]$
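The statements in Table 2.1.5.1 about sums and ratios can be illustrated by sampling. The sketch below is a Monte Carlo illustration, not a proof: the parameters p = 1.5 and q = 6, the sample size and the seed are arbitrary choices. It draws independent gamma variables with the standard-library generator and compares the sample means of x + y, x/y and x/(x + y) with the tabulated means p + q, p/(q − 1) and p/(p + q).

```python
import random

rng = random.Random(12345)
p, q, n = 1.5, 6.0, 50000  # illustrative parameters and sample size

# random.gammavariate(alpha, beta) with beta = 1 samples the density
# x^(alpha-1) exp(-x) / Gamma(alpha), the gamma distribution of the table.
xs = [rng.gammavariate(p, 1.0) for _ in range(n)]
ys = [rng.gammavariate(q, 1.0) for _ in range(n)]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


sum_mean = mean([x + y for x, y in zip(xs, ys)])       # gamma with parameter p + q
u_mean = mean([x / y for x, y in zip(xs, ys)])          # beta of the second kind
v_mean = mean([x / (x + y) for x, y in zip(xs, ys)])    # beta of the first kind

print(f"<x + y>     = {sum_mean:.3f}  (table: {p + q:.3f})")
print(f"<x / y>     = {u_mean:.3f}  (table: {p / (q - 1.0):.3f})")
print(f"<x / (x+y)> = {v_mean:.3f}  (table: {p / (p + q):.3f})")
```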

2.1.5.6. Cumulative distribution functions

The integral of the probability density function $[f(x)]$ from the lower end of its range up to an arbitrary value x is called the cumulative probability distribution, or simply the distribution function, $[F(x)]$, of x. It can always be written $[F(x) = \textstyle\int\limits_{-\infty}^{x}f(u)\;{\rm d}u\hbox{;} \eqno(2.1.5.21)]$ if the lower end of its range is not actually $[-\infty]$ one takes $[f(x)]$ as identically zero between $[-\infty]$ and the lower end of its range. For the distribution of A [equation (2.1.5.4) or (2.1.5.9)] the lower limit is in fact $[-\infty]$; for the distribution of $[|F|]$, $[|E|]$, I and $[I/\Sigma]$ the lower end of the range is zero. In such cases, equation (2.1.5.21) becomes $[F(x) = \textstyle\int\limits_{0}^{x}f(u)\;{\rm d}u. \eqno(2.1.5.22)]$ In crystallographic applications the cumulative distribution is usually denoted by $[N(x)]$, rather than by the capital letter corresponding to the probability density function designation. The cumulative forms of the ideal acentric and centric distributions (Howells et al., 1950) have found many applications. For the acentric distribution of $[|E|]$ [equation (2.1.5.8)] the integration is readily carried out: $[N(|E|) = 2\textstyle\int\limits_{0}^{|E|} y\exp(-y^{2})\;{\rm d}y = 1 - \exp(-|E|^{2}). \eqno(2.1.5.23)]$ The integral for the centric distribution of $[|E|]$ [equation (2.1.5.11)] cannot be expressed in terms of elementary functions, but the integral required has so many important applications in statistics that it has been given a special name and symbol, the error function erf(x), defined by $[{\rm erf}(x) = (2/\pi^{1/2})\textstyle\int\limits_{0}^{x}\exp(-t^{2})\;{\rm d}t. \eqno(2.1.5.24)]$ For the centric distribution, then $[\eqalignno{N(|E|) & = (2/\pi)^{1/2}\textstyle\int\limits_{0}^{|E|}\exp(-y^{2}/2)\;{\rm d}y &(2.1.5.25) \cr & = {\rm erf}(|E|/2^{1/2}). &(2.1.5.26)}]$ The error function is extensively tabulated [see e.g. Abramowitz & Stegun (1972), pp. 310–311, and a closely related function on pp. 966–973].
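Both cumulative distributions are available in closed form through the standard library, since `math.erf` implements the error function of (2.1.5.24). The sketch below, an illustration with arbitrarily chosen sample values of $[|E|]$, tabulates (2.1.5.23) and (2.1.5.26) side by side; the function names are illustrative.

```python
import math


def n_acentric(abs_e):
    """Acentric cumulative distribution, eq. (2.1.5.23)."""
    return 1.0 - math.exp(-abs_e ** 2)


def n_centric(abs_e):
    """Centric cumulative distribution, eq. (2.1.5.26)."""
    return math.erf(abs_e / math.sqrt(2.0))


print(" |E|   N_acentric  N_centric")
for abs_e in (0.25, 0.5, 1.0, 1.5, 2.0):
    print(f"{abs_e:5.2f}   {n_acentric(abs_e):.4f}      {n_centric(abs_e):.4f}")
```

At small $[|E|]$ the centric curve lies well above the acentric one, reflecting the larger proportion of weak reflections in the centric case.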

References

International Tables for Crystallography (2004). Vol. C. Mathematical, physical and chemical tables, edited by E. Prince. Dordrecht: Kluwer Academic Publishers.

Abramowitz, M. & Stegun, I. A. (1972). Handbook of mathematical functions. New York: Dover.

Howells, E. R., Phillips, D. C. & Rogers, D. (1950). The probability distribution of X-ray intensities. II. Experimental investigation and the X-ray detection of centers of symmetry. Acta Cryst. 3, 210–214.

Kendall, M. & Stuart, A. (1977). The advanced theory of statistics, Vol. 1, 4th ed. London: Griffin.

Lipson, H. & Woolfson, M. M. (1952). An extension of the use of intensity statistics. Acta Cryst. 5, 680–682.

Rogers, D. (1950). The probability distribution of X-ray intensities. IV. New methods of determining crystal classes and space groups. Acta Cryst. 3, 455–464.

Rogers, D. & Wilson, A. J. C. (1953). The probability distribution of X-ray intensities. V. A note on some hypersymmetric distributions. Acta Cryst. 6, 439–449.

Shmueli, U. & Wilson, A. J. C. (1983). Generalized intensity statistics: the subcentric distribution and effects of dispersion. Acta Cryst. A39, 225–233.

Srinivasan, R. & Parthasarathy, S. (1976). Some statistical applications in X-ray crystallography. Oxford: Pergamon Press.

Wilson, A. J. C. (1950). The probability distribution of X-ray intensities. III. Effects of symmetry elements on zones and rows. Acta Cryst. 3, 258–261.

Wilson, A. J. C. (1952). Hypercentric and hyperparallel distributions of X-ray intensities. Research (London), 5, 588–589.

Wilson, A. J. C. (1956). The probability distribution of X-ray intensities. VII. Some sesquicentric distributions. Acta Cryst. 9, 143–144.

Wilson, A. J. C. (1978a). On the probability of measuring the intensity of a reflection as negative. Acta Cryst. A34, 474–475.

Wilson, A. J. C. (1980a). Relationship between `observed' and `true' intensity: effects of various counting modes. Acta Cryst. A36, 929–936.

Wilson, A. J. C. (1980b). Effect of dispersion on the probability distribution of X-ray reflections. Acta Cryst. A36, 945–946.

Wilson, A. J. C. (1987b). Functional form of the ideal hypersymmetric distributions of structure factors. Acta Cryst. A43, 554–556.
