Variance–covariance matrices

Sands, D. E.

doi:10.1107/97809553602060000559

International
Tables for
Crystallography
Volume B
Reciprocal space
Edited by U. Shmueli


International Tables for Crystallography (2006). Vol. B. ch. 3.1, pp. 350-351

Section 3.1.10. Variance–covariance matrices

D. E. Sands^a ^*

^a Department of Chemistry, University of Kentucky, Chemistry–Physics Building, Lexington, Kentucky 40506-0055, USA
Correspondence e-mail: sands@pop.uky.edu



Refinement of a crystal structure yields both the parameters that describe the structure and estimates of the uncertainties of those parameters. Refinement by the method of least squares minimizes a weighted sum of squares of residuals. In the matrix notation of Hamilton's classic book (Hamilton, 1964), values of the m parameters to be determined are expressed by the $[m \times 1]$ column vector X given by $[{\bi X} = ({\bi A}^{T} {\bi{PA}})^{-1} {\bi A}^{T} {\bi PF}, \eqno(3.1.10.1)]$ where F is an $[n \times 1]$ matrix representing the observations (structure factors or squares of structure factors), P is an $[n \times n]$ weight matrix that is proportional to the inverse of the variance–covariance matrix of the observed F, A is an $[n \times m]$ design matrix consisting of the derivatives of each element of F with respect to each of the parameters, and $[{\bi A}^{T}]$ is the transpose of A. The variance–covariance matrix of the parameters is then given by $[{\bi M} = {\bi V}^{T} {\bi PV} ({\bi A}^{T} {\bi PA})^{-1}/(n - m). \eqno(3.1.10.2)]$ Here, V is the $[n \times 1]$ matrix of residuals, consisting of the differences between the observed and calculated values of the elements of F. Since $[{\bi V}^{T}{\bi PV}/(n - m)]$ is just a single number, M is proportional to the inverse least-squares matrix $[({\bi A}^{T}{\bi PA})^{-1}]$.
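As a small numerical illustration of (3.1.10.1) and (3.1.10.2), the following sketch fits a straight line to four made-up observations with unit weights. The data, the helper names (`transpose`, `matmul`, `inverse_2x2`, `least_squares`) and the restriction to m = 2 parameters are assumptions made for the example; they are not taken from Hamilton (1964).

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse_2x2(n):
    """Inverse of a 2 x 2 matrix (enough for this two-parameter example)."""
    (a, b), (c, d) = n
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def least_squares(A, P, F):
    """Return the parameters X (3.1.10.1) and their matrix M (3.1.10.2)."""
    n, m = len(A), len(A[0])
    AtP = matmul(transpose(A), P)                       # A^T P
    N_inv = inverse_2x2(matmul(AtP, A))                 # (A^T P A)^-1
    X = matmul(N_inv, matmul(AtP, [[f] for f in F]))    # m x 1 column
    V = [[f - c[0]] for f, c in zip(F, matmul(A, X))]   # observed - calculated
    scale = matmul(matmul(transpose(V), P), V)[0][0] / (n - m)
    M = [[scale * e for e in row] for row in N_inv]
    return [x[0] for x in X], M

# Hypothetical observations F_k = x0 + x1 * t_k, with unit weights.
t = [0.0, 1.0, 2.0, 3.0]
F = [1.1, 2.9, 5.1, 6.9]
A = [[1.0, tk] for tk in t]                             # derivatives dF_k/dx0, dF_k/dx1
P = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

X, M = least_squares(A, P, F)
print(X)   # intercept and slope
print(M)   # variances on the diagonal, covariance off the diagonal
```

The off-diagonal element of M is negative here: an overestimate of the intercept is compensated by an underestimate of the slope, which is the kind of correlation that must be carried into any derived quantity.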

Once the variance–covariance matrix of the parameters is known, the variances and covariances of any quantities derived from these parameters can be computed. The variance of a single function f is given by $[\sigma^{2}(\hskip 2ptf) = {\partial f\over \partial x^{i}} {\partial f\over \partial x\hskip 2pt^{j}} \hbox{cov} (x^{i}, x\hskip 2pt^{j}), \eqno(3.1.10.3)]$ where, as usual, we are using the summation convention and summing over all parameters included in f. A generalization of (3.1.10.3) for two functions is $[\hbox{cov} (\hskip 2ptf_{1}, f_{2}) = {\partial f_{1}\over \partial x^{i}} {\partial f_{2}\over \partial x\hskip 2pt^{j}} \hbox{cov} (x^{i}, x\hskip 2pt^{j}). \eqno(3.1.10.4)]$ [The covariance of two quantities is, of course, just the variance if the two quantities are the same. For an elementary discussion of statistical covariance and correlation, see Sands (1977).] Equation (3.1.10.4) may now be extended to any number of functions (Sands, 1966); the $[k \times k]$ variance–covariance matrix C of k functions of m parameters is given in terms of the $[m \times m]$ variance–covariance matrix of the parameters by $[{\bi C} = {\bi DMD}^{T}, \eqno(3.1.10.5)]$ in which the ijth element of the $[k \times m]$ matrix D is the derivative of function $[f_{i}]$ with respect to parameter j. Element $[C_{II}]$ (no summation implied over I) is the variance of function $[f_{I}]$, and $[C_{IJ}]$ is the covariance of functions $[f_{I}]$ and $[f_{J}]$.
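Equation (3.1.10.5) can be checked on a case small enough to verify by hand. The sketch below takes two hypothetical functions, f1 = x1 + x2 and f2 = x1 x2, of two parameters with an assumed matrix M; the numbers and the function name `propagate` are illustrative only.

```python
def propagate(D, M):
    """Return C = D M D^T for a k x m derivative matrix D and m x m matrix M."""
    k, m = len(D), len(M)
    DM = [[sum(D[i][p] * M[p][j] for p in range(m)) for j in range(m)]
          for i in range(k)]
    return [[sum(DM[i][p] * D[j][p] for p in range(m)) for j in range(k)]
            for i in range(k)]

# Assumed parameter values and their variance-covariance matrix.
x1, x2 = 2.0, 3.0
M = [[0.04, 0.01],
     [0.01, 0.09]]

# Row i of D holds the derivatives of f_i with respect to x1 and x2.
D = [[1.0, 1.0],    # f1 = x1 + x2
     [x2, x1]]      # f2 = x1 * x2

C = propagate(D, M)
print(C)   # C[0][0] = var(f1), C[1][1] = var(f2), C[0][1] = cov(f1, f2)
```

The diagonal elements agree with (3.1.10.3) applied to each function separately: var(f1) = 0.04 + 0.09 + 2(0.01) = 0.15 and var(f2) = 9(0.04) + 4(0.09) + 2(3)(2)(0.01) = 0.84.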

The calculation of C must, of course, include the contributions of all sources of error, so M in (3.1.10.5) should include the variances and covariances of the unit-cell dimensions and of any other relevant parameters with non-negligible uncertainties.

It may be easier, in some cases, to carry out calculations of variances and covariances in steps. For example, the variance–covariance matrix of a set of distances may be computed and then other quantities may be determined as functions of the distances. It is imperative that all non-vanishing covariances be included in every stage of the calculation; only in special cases are the covariances negligible, and often they are large enough to affect the results seriously (Sands, 1977).
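The effect of dropping covariances in a stepwise calculation can be seen in a deliberately simple one-dimensional sketch. Three collinear coordinates with equal, uncorrelated variances are assumed; two distances are formed first, and their sum is then computed from the distances. All values and names are hypothetical.

```python
def propagate(D, M):
    """Return C = D M D^T, as in (3.1.10.5)."""
    k, m = len(D), len(M)
    DM = [[sum(D[i][p] * M[p][j] for p in range(m)) for j in range(m)]
          for i in range(k)]
    return [[sum(DM[i][p] * D[j][p] for p in range(m)) for j in range(k)]
            for i in range(k)]

sigma2 = 0.01                                   # assumed variance of each coordinate
M = [[sigma2 if i == j else 0.0 for j in range(3)] for i in range(3)]

# Step 1: distances d1 = x2 - x1 and d2 = x3 - x2.
D1 = [[-1.0, 1.0, 0.0],
      [0.0, -1.0, 1.0]]
C_d = propagate(D1, M)                          # the two distances share x2, so they are correlated

# Step 2: f = d1 + d2, computed from the distances.
D2 = [[1.0, 1.0]]
var_with_cov = propagate(D2, C_d)[0][0]

# Same step with the covariance of d1 and d2 discarded.
C_diag = [[C_d[i][j] if i == j else 0.0 for j in range(2)] for i in range(2)]
var_without_cov = propagate(D2, C_diag)[0][0]

# Direct calculation: f = x3 - x1.
var_direct = propagate([[-1.0, 0.0, 1.0]], M)[0][0]

print(var_with_cov, var_without_cov, var_direct)
```

With the covariance retained, the stepwise result equals the direct one (twice the coordinate variance); with it discarded, the variance is overestimated by a factor of two.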

These principles may be used to explore the effects of symmetry or of transformations on the variance–covariance matrices of atomic parameters and derived quantities. Using the notation of Sands (1966), with $[x_{A}^{i}]$ and $[x_{B}^{i}]$ the positional parameters i of atoms A and B, respectively, we define $[{\bi M}_{AA}, {\bi M}_{AB}, {\bi M}_{BA}]$ and $[{\bi M}_{BB}]$ as $[3 \times 3]$ matrices with ijth elements $[\hbox{cov} (x_{A}^{i}, x_{A}^{\;j})]$, $[\hbox{cov} (x_{A}^{i}, x_{B}^{\;j})]$, $[\hbox{cov} (x_{B}^{i}, x_{A}^{\;j})]$ and $[\hbox{cov} (x_{B}^{i}, x_{B}^{\;j})]$, respectively. If atom $[B']$ is generated from atom B by symmetry operator S, such that $[\eqalignno{ {\bf x}_{B'} &= {\bi S}{\bf x}_{B} &(3.1.10.6)\cr x_{B'}^{i} &= {S}_{j}^{i} {x}\hskip 2pt^{j}_{B}, &(3.1.10.7)}]$ it is shown in Sands (1966) that the variance–covariance matrices involving atom $[B']$ are $[\eqalignno{ {\bi M}_{AB'} &= {\bi M}_{AB} {\bi S}^{T} &(3.1.10.8)\cr {\bi M}_{B'A} &= {\bi SM}_{BA} &(3.1.10.9)\cr {\bi M}_{B'B'} &= {\bi SM}_{BB} {\bi S}^{T}. & (3.1.10.10)}]$ If symmetry operator S is applied to both atoms A and B to generate atoms $[A']$ and $[B']$, the corresponding matrices may be expressed by the matrix equation $[\pmatrix{{\bi M}_{A'A'} &{\bi M}_{A'B'}\cr {\bi M}_{B'A'} &{\bi M}_{B'B'}\cr} = \pmatrix{{\bi SM}_{AA} {\bi S}^{T} &{\bi SM}_{AB} {\bi S}^{T}\cr {\bi SM}_{BA} {\bi S}^{T} &{\bi SM}_{BB} {\bi S}^{T}\cr}. \eqno(3.1.10.11)]$
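A short sketch of (3.1.10.8)–(3.1.10.10) follows, assuming for illustration that S is a fourfold rotation about the third axis and that the covariance blocks are small integers (in arbitrary units). Neither the operator nor the numbers come from Sands (1966).

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Assumed symmetry operator: fourfold rotation about the third axis.
S = [[0, -1, 0],
     [1, 0, 0],
     [0, 0, 1]]
St = transpose(S)

# Hypothetical covariance blocks, in arbitrary units.
M_BB = [[4, 1, 0],
        [1, 9, 0],
        [0, 0, 16]]
M_AB = [[1, 2, 0],
        [0, 1, 0],
        [0, 0, 1]]
M_BA = transpose(M_AB)            # cov(x_B^i, x_A^j) = cov(x_A^j, x_B^i)

M_ABp = matmul(M_AB, St)                  # (3.1.10.8)
M_BpA = matmul(S, M_BA)                   # (3.1.10.9)
M_BpBp = matmul(matmul(S, M_BB), St)      # (3.1.10.10)

print(M_BpBp)   # [[9, -1, 0], [-1, 4, 0], [0, 0, 16]]
```

For this operator the variances of the first two coordinates are interchanged and their covariance changes sign, while the third coordinate is unaffected; the cross blocks remain transposes of one another.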

If G is a matrix that transforms to a new set of axes, $[{\bf a}' = {\bi G} {\bf a}, \eqno(3.1.10.12)]$ the transformed variance–covariance matrix of the atomic parameters is $[{\bi M}' = ({\bi G^{T}})^{-1} {\bi MG}^{-1}. \eqno(3.1.10.13)]$
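Equation (3.1.10.13) follows from (3.1.10.5) once the transformation of the coordinates is written out. The steps below are a sketch under the assumption that, in (3.1.10.12), a is the column of basis vectors and x is the column of coordinate components of an atom referred to that basis; the symbols r and x are introduced here for the derivation.

```latex
\begin{align*}
\mathbf{r} &= \mathbf{a}^{T}\mathbf{x} = \mathbf{a}'^{T}\mathbf{x}'
            = \mathbf{a}^{T}\mathbf{G}^{T}\mathbf{x}'
  && \text{(the position vector itself is unchanged)} \\
\mathbf{x}' &= (\mathbf{G}^{T})^{-1}\mathbf{x}
  && \text{(so } \mathbf{D} = (\mathbf{G}^{T})^{-1} \text{ in (3.1.10.5))} \\
\mathbf{M}' &= (\mathbf{G}^{T})^{-1}\,\mathbf{M}\,\bigl[(\mathbf{G}^{T})^{-1}\bigr]^{T}
             = (\mathbf{G}^{T})^{-1}\mathbf{M}\mathbf{G}^{-1}
\end{align*}
```

The last equality uses the fact that the transpose of $[({\bi G}^{T})^{-1}]$ is $[{\bi G}^{-1}]$.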

To apply these formulae to calculations of the errors and covariances of interatomic distances and angles, consider the triangle of atoms A, B, C with edges $[l_{1} = AB]$, $[l_{2} = BC]$, $[l_{3} = CA]$, and angles $[\alpha_{1}]$, $[\alpha_{2}]$, $[\alpha_{3}]$ at A, B, C, respectively. If the atoms are not related by symmetry, $[\eqalignno{\sigma^{2}(l_{1})& = {\bi l}_{1}^{T} {\bf g} ({\bi M}_{AA} - {\bi M}_{AB} - {\bi M}_{BA} + {\bi M}_{BB}) {\bf g}{\bi l}_{1}/l_{1}^{2} &(3.1.10.14)\cr \hbox{cov} (l_{1}, l_{2})& = {\bi l}_{1}^{T} {\bf g} ({\bi M}_{AB} - {\bi M}_{AC} - {\bi M}_{BB} + {\bi M}_{BC}) {\bf g}{\bi l}_{2}/l_{1} l_{2}.&(3.1.10.15)\cr}]$ If atom B is generated from atom A by symmetry matrix S, the results, as derived in Sands (1966), are $[\eqalignno{ \sigma^{2}(l_{1}) &= {\bi l}_{1}^{T} {\bf g} ({\bi M}_{AA} - {\bi SM}_{AA} - {\bi M}_{AA} {\bi S}^{T} \cr &\quad + {\bi SM}_{AA} {\bi S}^{T}) {\bf g} {\bi l}_{1}/l_{1}^{2} &(3.1.10.16)\cr \sigma^{2}(l_{2}) &= {\bi l}_{2}^{T} {\bf g} ({\bi SM}_{AA} {\bi S}^{T} - {\bi M}_{CA} {\bi S}^{T} \cr &\quad - {\bi SM}_{AC} + {\bi M}_{CC}) {\bf g} {\bi l}_{2}/l_{2}^{2} &(3.1.10.17) \cr \sigma^{2}(l_{3}) &= {\bi l}_{3}^{T} {\bf g} ({\bi M}_{AA} - {\bi M}_{AC} - {\bi M}_{CA} \cr &\quad + {\bi M}_{CC}) {\bf g} {\bi l}_{3}/l_{3}^{2} &(3.1.10.18)\cr \hbox{cov} (l_{1}, l_{2}) &= {\bi l}_{1}^{T} {\bf g} ({\bi M}_{AA} {\bi S}^{T} - {\bi SM}_{AA} {\bi S}^{T} \cr &\quad - {\bi M}_{AC} + {\bi SM}_{AC}) {\bf g} {\bi l}_{2}/l_{1}l_{2} &(3.1.10.19) \cr \hbox{cov} (l_{1}, l_{3}) &= {\bi l}_{1}^{T} {\bf g} (- {\bi M}_{AA} + {\bi SM}_{AA} \cr &\quad + {\bi M}_{AC} - {\bi SM}_{AC}) {\bf g} {\bi l}_{3}/l_{1}l_{3} &(3.1.10.20) \cr \hbox{cov} (l_{2}, l_{3}) &= {\bi l}_{2}^{T} {\bf g} (- {\bi SM}_{AA} + {\bi M}_{CA} \cr &\quad + {\bi SM}_{AC} - {\bi M}_{CC}) {\bf g} {\bi l}_{3}/l_{2}l_{3}. &(3.1.10.21)}]$ In equations (3.1.10.14)–(3.1.10.21), $[{\bi l}_{i}]$ is a column vector with components the differences of the coordinates of the atoms connected by the vector. Representative formulae involving the angles $[\alpha_{1}]$, $[\alpha_{2}]$, $[\alpha_{3}]$ are $[\eqalignno{ \sigma^{2}(\alpha_{1}) &= [\cos^{2} \alpha_{2}\sigma^{2} (l_{1}) - 2 \cos \alpha_{2} \hbox{ cov} (l_{1}, l_{2}) \cr &\quad + 2 \cos \alpha_{2} \cos \alpha_{3} \hbox{ cov} (l_{1}, l_{3}) + \sigma^{2} (l_{2}) \cr &\quad - 2 \cos \alpha_{3} \hbox{ cov} (l_{2}, l_{3}) \cr &\quad + \cos^{2} \alpha_{3}\sigma^{2} (l_{3})] (l_{2}/l_{1}l_{3} \sin \alpha_{1})^{2} &(3.1.10.22)\cr \hbox{cov} (\alpha_{1}, \alpha_{2}) &= [\cos \alpha_{1} \cos \alpha_{2} \sigma^{2} (l_{1}) \cr &\quad + (\cos \alpha_{2} \cos \alpha_{3} - \cos \alpha_{1}) \hbox{ cov} (l_{1}, l_{2}) \cr &\quad + (\cos \alpha_{1} \cos \alpha_{3} - \cos \alpha_{2}) \hbox{ cov} (l_{1}, l_{3}) \cr &\quad - \cos \alpha_{3} \sigma^{2} (l_{2}) + (1 + \cos^{2} \alpha_{3}) \hbox{ cov} (l_{2}, l_{3}) \cr &\quad - \cos \alpha_{3} \sigma^{2} (l_{3})] / (l_{1}^{2} \sin \alpha_{1} \sin \alpha_{2}) &(3.1.10.23) \cr \hbox{cov} (\alpha_{1}, l_{1}) &= [- \cos \alpha_{2} \sigma^{2} (l_{1}) + \hbox{cov} (l_{1}, l_{2}) \cr &\quad - \cos \alpha_{3} \hbox{ cov} (l_{1}, l_{3})] (l_{2}/l_{1} l_{3} \sin \alpha_{1}) &(3.1.10.24) \cr \hbox{cov} (\alpha_{1}, l_{2}) &= [- \cos \alpha_{2} \hbox{ cov} (l_{1}, l_{2}) + \sigma^{2} (l_{2}) \cr &\quad - \cos \alpha_{3} \hbox{ cov} (l_{2}, l_{3})] (l_{2}/l_{1} l_{3} \sin \alpha_{1}). &(3.1.10.25)}]$ If any of the angles approaches $[0^{\circ}]$ or $[180^{\circ}]$, the denominators in (3.1.10.22)–(3.1.10.25) will become very small, necessitating high-precision arithmetic. Indeterminacies resulting from special relationships between atomic positions may require rederivation of the equations for variances and covariances, to take the relationships into account explicitly and avoid the indeterminacies. A true symmetry condition requiring, for example, a linear bond should cause little problem, and the corresponding variance will be zero. It is the indeterminacies not originating from crystal symmetry that demand caution, in recognizing them and in coping with them correctly.
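The distance formula (3.1.10.14) and the angle formula (3.1.10.22) can be sketched numerically as follows. The example assumes that g is the metric tensor of the cell, that the coordinates are fractional, and that the cell is orthogonal with made-up edges of 5, 6 and 7 Å; the coordinates, variances and function names are likewise hypothetical.

```python
import math

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def distance_variance(xA, xB, g, M_AA, M_AB, M_BA, M_BB):
    """sigma^2(l1) from (3.1.10.14), atoms not related by symmetry."""
    l_vec = [a - b for a, b in zip(xA, xB)]     # coordinate differences
    gl = matvec(g, l_vec)                       # g l1 (g is symmetric)
    l_sq = dot(l_vec, gl)                       # l1^2 = l1^T g l1
    core = [[M_AA[i][j] - M_AB[i][j] - M_BA[i][j] + M_BB[i][j]
             for j in range(3)] for i in range(3)]
    return dot(gl, matvec(core, gl)) / l_sq

def angle_variance(lengths, var, cov):
    """sigma^2(alpha_1) in rad^2 from (3.1.10.22).

    lengths = (l1, l2, l3); var = (s2(l1), s2(l2), s2(l3));
    cov = (cov(l1, l2), cov(l1, l3), cov(l2, l3)).
    """
    l1, l2, l3 = lengths
    c1 = (l1**2 + l3**2 - l2**2) / (2 * l1 * l3)    # cos(alpha_1), angle at A
    c2 = (l1**2 + l2**2 - l3**2) / (2 * l1 * l2)    # cos(alpha_2), angle at B
    c3 = (l2**2 + l3**2 - l1**2) / (2 * l2 * l3)    # cos(alpha_3), angle at C
    s1 = math.sqrt(1.0 - c1 * c1)                   # sin(alpha_1)
    v1, v2, v3 = var
    c12, c13, c23 = cov
    bracket = (c2 * c2 * v1 - 2 * c2 * c12 + 2 * c2 * c3 * c13
               + v2 - 2 * c3 * c23 + c3 * c3 * v3)
    return bracket * (l2 / (l1 * l3 * s1)) ** 2

def diag(a, b, c):
    return [[a, 0.0, 0.0], [0.0, b, 0.0], [0.0, 0.0, c]]

# Assumed orthogonal cell and fractional coordinates.
g = diag(25.0, 36.0, 49.0)
xA, xB = [0.1, 0.2, 0.3], [0.2, 0.2, 0.3]
s = 1e-8                                            # variance of each fractional coordinate
zero = diag(0.0, 0.0, 0.0)
var_l1 = distance_variance(xA, xB, g, diag(s, s, s), zero, zero, diag(s, s, s))

# Equilateral triangle with uncorrelated edge variances of 1e-6.
var_alpha1 = angle_variance((1.0, 1.0, 1.0), (1e-6, 1e-6, 1e-6), (0.0, 0.0, 0.0))

print(math.sqrt(var_l1), math.sqrt(var_alpha1))
```

In the distance example the bond lies along the first axis, so the result reduces to twice the variance of that coordinate expressed in Å². The angle function divides by the sine of the angle, so it inherits the numerical difficulty near 0° and 180° noted in the text.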

A general expression for the variance of a dihedral angle, in terms of the variances and covariances of the coordinates of the four atoms, is (Shmueli, 1974) $[\sigma^{2} (\tau) = {\displaystyle\sum\limits_{k}} {\displaystyle\sum\limits_{n}} {\partial \tau\over \partial x_{(k)}^{i}} {\partial \tau\over \partial x\hskip 1pt_{(n)}^{\;j}} \hbox{cov} [x_{(k)}^{i}, x\hskip 2pt_{(n)}^{\;j}], \eqno(3.1.10.26)]$ where, in addition to the usual tensor summation over i and j from 1 to 3, summation must be carried out over the four atoms (i.e., k and n vary from 1 to 4). Special cases of (3.1.10.26), corresponding to various levels of approximation of diagonal matrices and isotropic errors, are given in Shmueli (1974). Formulae in dyadic notation are given in Waser (1973) for the variances and covariances of dihedral angles, of best planes, of torsion angles, and of other molecular parameters.
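Where analytical derivatives are inconvenient, (3.1.10.26) can be evaluated with numerical derivatives. The sketch below is not the method of Shmueli (1974); it assumes Cartesian coordinates in Å (so that no metric tensor is needed), a 12 × 12 coordinate variance–covariance matrix, and central differences with a step chosen for illustration. The geometry and variances are invented.

```python
import math

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dihedral(p1, p2, p3, p4):
    """Dihedral angle (radians) about the bond p2-p3; sign convention is arbitrary here."""
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    norm_b2 = math.sqrt(dot(b2, b2))
    m1 = cross(n1, [c / norm_b2 for c in b2])
    return math.atan2(dot(m1, n2), dot(n1, n2))

def dihedral_gradient(atoms, h=1e-6):
    """Central-difference derivatives of tau with respect to the 12 coordinates.

    Not valid if tau is within h of +/-180 degrees, where atan2 is discontinuous.
    """
    flat = [c for p in atoms for c in p]
    grad = []
    for a in range(12):
        plus, minus = flat[:], flat[:]
        plus[a] += h
        minus[a] -= h
        tp = dihedral(*[plus[3 * k:3 * k + 3] for k in range(4)])
        tm = dihedral(*[minus[3 * k:3 * k + 3] for k in range(4)])
        grad.append((tp - tm) / (2 * h))
    return grad

def dihedral_variance(atoms, M):
    """Double sum of (3.1.10.26) over all coordinate pairs of the four atoms."""
    grad = dihedral_gradient(atoms)
    return sum(grad[a] * grad[b] * M[a][b] for a in range(12) for b in range(12))

# Hypothetical geometry: unit bond lengths, 90 degree bond angles, |tau| = 90 degrees.
atoms = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
s = 1e-4                                          # isotropic, uncorrelated variance in A^2
M = [[s if a == b else 0.0 for b in range(12)] for a in range(12)]

var_tau = dihedral_variance(atoms, M)
print(var_tau)   # close to 4 * s rad^2 for this geometry
```

For this particular geometry each atom contributes a derivative of unit magnitude, so the variance is four times the coordinate variance; in general the off-diagonal blocks of M contribute as well and must not be omitted.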

References

Hamilton, W. C. (1964). Statistics in physical science. New York: Ronald Press.

Sands, D. E. (1966). Transformations of variance–covariance tensors. Acta Cryst. 21, 868–872.

Sands, D. E. (1977). Correlation and covariance. J. Chem. Educ. 54, 90–94.

Shmueli, U. (1974). On the standard deviation of a dihedral angle. Acta Cryst. A30, 848–849.

Waser, J. (1973). Dyadics and the variances and covariances of molecular parameters, including those of best planes. Acta Cryst. A29, 621–631.
