International
Tables for
Crystallography
Volume F
Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F. ch. 18.5, pp. 404–405

Section 18.5.2. The least-squares method

D. W. J. Cruickshank

Chemistry Department, UMIST, Manchester M60 1QD, England
Correspondence e-mail: dwj_cruickshank@email.msn.com

18.5.2.1. The normal equations

In the unrestrained least-squares method, the residual [R = \textstyle\sum\limits_{3}\displaystyle w(hkl)\Delta^{2} (hkl) \eqno(18.5.2.1)] is minimized, where Δ is either [|F_{o}| - |F_{c}|] for [R_{1}] or [|F_{o}|^{2} - |F_{c}|^{2}] for [R_{2}], and w(hkl) is chosen appropriately. The summation is over crystallographically independent planes.

When R is a minimum with respect to the parameter [u_{j}], [\partial R/\partial u_{j} = 0], i.e., [\textstyle\sum\limits_{3}\displaystyle w\Delta (\partial \Delta / \partial u_{j}) = 0. \eqno(18.5.2.2)] For [R_{1}], [\partial \Delta / \partial u_{j} = -\partial |F_{c}|/\partial u_{j}]; for [R_{2}], [\partial \Delta / \partial u_{j} = -2|F_{c}|\partial |F_{c}|/ \partial u_{j}]. The n parameters have to be varied until the n conditions (18.5.2.2) are satisfied. For a trial set of the [u_{j}] close to the correct values, we may expand Δ as a function of the parameters by a Taylor series to the first order. Thus for [R_{1}], [\Delta ({\bf u} + {\bf e}) = \Delta ({\bf u}) - \textstyle\sum\limits_{i}\displaystyle \varepsilon_{i} (\partial |F_{c}|/ \partial u_{i}), \eqno(18.5.2.3)] where [\varepsilon_{i}] is a small change in the parameter [u_{i}], and [{\bf u}] and [{\bf e}] represent the whole sets of parameters and changes. The minus sign occurs before the summation, since [\Delta = |F_{o}| - |F_{c}|], and the changes in [|F_{c}|] are being considered.

Substituting (18.5.2.3) in (18.5.2.2), we get the normal equations for [R_{1}], [\textstyle\sum\limits_{i}\displaystyle \varepsilon_{i} \left[\textstyle\sum\limits_{3}\displaystyle w(\partial |F_{c}|/ \partial u_{i}) (\partial |F_{c}|/ \partial u_{j})\right] = \textstyle\sum\limits_{3}\displaystyle w\Delta (\partial |F_{c}|/ \partial u_{j}). \eqno(18.5.2.4)] There are n of these equations for [j = 1,\ldots, n] to determine the n unknown [\varepsilon_{j}].

For [R_{2}] the normal equations are [\textstyle\sum\limits_{i}\displaystyle \varepsilon_{i} \left[\textstyle\sum\limits_{3}\displaystyle w(\partial |F_{c}|^{2} / \partial u_{i}) (\partial |F_{c}|^{2} / \partial u_{j})\right] = \textstyle\sum\limits_{3}\displaystyle w\Delta (\partial |F_{c}|^{2} / \partial u_{j}). \eqno(18.5.2.5)] Both forms of the normal equations can be abbreviated to [\textstyle\sum\limits_{i}\displaystyle \varepsilon_{i} a_{ij} = b_{j}. \eqno(18.5.2.6)]
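The construction of [a_{ij}] and [b_{j}] for [R_{1}] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-parameter model for [|F_{c}|] (a scale factor times an exponential fall-off in a variable `s2`), the synthetic error-free data, the unit weights and all function names are hypothetical and are not taken from the text.

```python
import math

def f_calc(u, s2):
    """Toy |F_c| (an assumption): scale u[0] times exp(-u[1] * s2)."""
    return u[0] * math.exp(-u[1] * s2)

def gradients(u, s2):
    """Partial derivatives of the toy |F_c| with respect to u[0] and u[1]."""
    e = math.exp(-u[1] * s2)
    return [e, -u[0] * s2 * e]

def normal_equations(u, s2_values, f_obs, weights):
    """Accumulate a_ij and b_j of (18.5.2.4), abbreviated as (18.5.2.6)."""
    n = len(u)
    a = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for s2, fo, w in zip(s2_values, f_obs, weights):
        delta = fo - f_calc(u, s2)          # Delta = |F_o| - |F_c|
        g = gradients(u, s2)
        for j in range(n):
            b[j] += w * delta * g[j]
            for i in range(n):
                a[i][j] += w * g[i] * g[j]
    return a, b

def solve_2x2(a, b):
    """Solve sum_i eps_i a_ij = b_j for two shifts by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - b[1] * a[0][1]) / det,
            (b[1] * a[0][0] - b[0] * a[1][0]) / det]

def refine(u, s2_values, f_obs, weights, cycles=10):
    """Repeat the linearized step: build the equations, solve, apply shifts."""
    for _ in range(cycles):
        a, b = normal_equations(u, s2_values, f_obs, weights)
        eps = solve_2x2(a, b)
        u = [ui + ei for ui, ei in zip(u, eps)]
    return u

# Synthetic, error-free "observations" generated from known parameters.
s2_values = [0.01 * k for k in range(1, 21)]
true_u = [10.0, 5.0]
f_obs = [f_calc(true_u, s2) for s2 in s2_values]
weights = [1.0] * len(f_obs)

refined = refine([9.0, 4.0], s2_values, f_obs, weights)
```

Because the Taylor expansion is truncated at first order, the shifts from one cycle are only approximate and the cycle is repeated; with this well-behaved toy model and a starting point close to the generating values, `refined` returns to those values.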

For the values of [\partial |F_{c}| / \partial u_{j}] for common parameters see, e.g., Cruickshank (1970).

Some important points in the derivation of the standard uncertainties of the refined parameters can be most easily understood if we suppose that the matrix [a_{ij}] can be approximated by its diagonal elements. Each parameter is then determined by a single equation of the form [\varepsilon_{i} \textstyle\sum\limits_{3}\displaystyle wg^{2} = \textstyle\sum\limits_{3}\displaystyle wg\Delta, \eqno(18.5.2.7)] where [g = \partial |F_{c}| / \partial u_{i}] or [\partial |F_{c}|^{2} / \partial u_{i}]. Hence [\varepsilon_{i} = \left(\textstyle\sum\limits_{3}\displaystyle wg\Delta \right)\bigg/ \left(\textstyle\sum\limits_{3}\displaystyle wg^{2}\right). \eqno(18.5.2.8)] At the conclusion of the refinement, when R is a minimum, the variance (square of the s.u.) of the parameter [u_{i}] due to uncertainties in the Δ's is [\sigma_{i}^{2} = \left[\textstyle\sum\limits_{3}\displaystyle w^{2}g^{2}\sigma^{2}(F)\right] \bigg/ \left(\textstyle\sum\limits_{3}\displaystyle wg^{2}\right)^{2}. \eqno(18.5.2.9)] If the weights have been chosen as [w(hkl) = 1 / \sigma^{2}(|F_{hkl}|)] or [1 / \sigma^{2} (|F_{hkl}|^{2})], this simplifies to [\sigma_{i}^{2} = 1 \bigg/ \left(\textstyle\sum\limits_{3}\displaystyle wg^{2}\right) = 1 / a_{ii}, \eqno(18.5.2.10)] which is appropriate for absolute weights. Equation (18.5.2.10) provides an s.u. for a parameter relative to the s.u.'s [\sigma (|F|)] or [\sigma (|F|^{2})] of the observations.
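The step from (18.5.2.8) to (18.5.2.9) and then to (18.5.2.10) can be written out as follows. This is a sketch that assumes the errors in the individual Δ(hkl) are uncorrelated, so that the variance of the linear combination in (18.5.2.8) is the sum of the squared coefficients times [\sigma^{2}(F)]; that assumption is made here for the illustration and is not stated explicitly in the text.

```latex
% Assumption: the errors in the individual \Delta(hkl) are uncorrelated.
% Requires the amsmath package.
\begin{align*}
\sigma_i^2 = \operatorname{var}(\varepsilon_i)
  &= \sum_3 \left(\frac{wg}{\sum_3 wg^2}\right)^{2} \sigma^2(F)
   = \frac{\sum_3 w^2 g^2 \sigma^2(F)}{\left(\sum_3 wg^2\right)^{2}}, \\
\intertext{and, with absolute weights $w = 1/\sigma^2(F)$, so that $w^2\sigma^2(F) = w$,}
\sigma_i^2
  &= \frac{\sum_3 wg^2}{\left(\sum_3 wg^2\right)^{2}}
   = \frac{1}{\sum_3 wg^2} = \frac{1}{a_{ii}}.
\end{align*}
```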

In general, with the full matrix [a_{ij}] in the normal equations, [\sigma_{i}^{2} = (a^{-1})_{ii}, \eqno(18.5.2.11)] where [(a^{-1})_{ii}] is an element of the matrix inverse to [a_{ij}]. The covariance of the parameters [u_{i}] and [u_{j}] is [\hbox{cov} (i, j) \equiv \sigma_{i}\sigma_{j}\hbox{correl} (i, j) = (a^{-1})_{ij}. ]
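As a minimal numerical sketch of (18.5.2.11) and the covariance relation, the following Python inverts a small normal matrix and extracts the standard uncertainties and correlation coefficients. The 2 × 2 matrix values and the function names are hypothetical, chosen only so that the arithmetic is easy to follow, and absolute weights are assumed.

```python
import math

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augment a with the identity matrix.
    m = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("normal matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * pv for v, pv in zip(m[r], m[col])]
    return [row[n:] for row in m]

def uncertainties(a):
    """Return (a^-1, standard uncertainties, correlation matrix) for normal matrix a."""
    inv = invert(a)
    n = len(a)
    sigma = [math.sqrt(inv[i][i]) for i in range(n)]       # sigma_i^2 = (a^-1)_ii
    correl = [[inv[i][j] / (sigma[i] * sigma[j]) for j in range(n)]
              for i in range(n)]                           # cov(i, j) = (a^-1)_ij
    return inv, sigma, correl

# Hypothetical normal matrix with a non-zero off-diagonal element.
a = [[4.0, 2.0],
     [2.0, 3.0]]
inv, sigma, correl = uncertainties(a)
diagonal_only = [1.0 / a[i][i] for i in range(len(a))]     # (18.5.2.10)
```

For this matrix the full-matrix variances are 0.375 and 0.5, whereas the diagonal approximation gives 0.25 and about 0.333: neglecting the off-diagonal elements of a positive-definite normal matrix can only make the variances smaller or leave them unchanged.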

18.5.2.2. Weights

In the early stages of refinement, artificial weights may be chosen to accelerate refinement. In the final stages, the weights must be related to the precision of the structure factors if parameter variances are being sought. There are two distinct ways, covering two ranges of error, in which this may be done.

  • (1) The weights for [R_{1}], say, may reflect the precision of the [|F_{o}|], so that [w(hkl) = 1 / \sigma^{2} (|F_{hkl}|)], where [\sigma^{2}] is the estimated variance of [|F_{o}|] due to a specific class of experimental uncertainties. These absolute weights are derived from an analysis of the experiment. Weights chosen in this way lead to estimated parameter variances [\sigma_{i}^{2} = (a^{-1})_{ii}], equation (18.5.2.11), which cover only the specific class of experimental uncertainties.

  • (2) The weights may reflect the trends in the [|\Delta| \equiv \bigl||F_{o}| - |F_{c}|\bigr|]. A weighting function with a small number of parameters is chosen so that the averages of [w \Delta^{2}] are constant when the set of [w\Delta^{2}] values is analysed in any pertinent fashion (e.g. in bins of increasing [|F_{o}|] and [2\sin \theta/\lambda]). Weights chosen in this way are relative weights, and the expression for the parameter variances needs a scaling factor, [S^{2} = \left(\textstyle\sum\limits_{3}\displaystyle w\Delta^{2}\right) \bigg/ (n_{\rm obs} - n_{\rm params}). \eqno(18.5.2.12)] Hence, in the full-matrix case, [\sigma_{i}^{2} = \left[\left(\textstyle\sum\limits_{3}\displaystyle w\Delta^{2}\right)\bigg / (n_{\rm obs} - n_{\rm params})\right] (a^{-1})_{ii}, \eqno(18.5.2.13)] which allows for all random experimental errors, for such systematic experimental errors as cannot be simulated in the [|F_{c}|], and for imperfections in the calculated model.
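The scaling in (18.5.2.12) and (18.5.2.13) amounts to a short calculation once the residuals and the diagonal of the inverse normal matrix are available. The Python below is a sketch only: the residuals, the weights, the parameter count and the inverse-matrix diagonal are made-up numbers, and the function names are hypothetical.

```python
def s_squared(weights, deltas, n_params):
    """S^2 of (18.5.2.12): sum of w * Delta^2 over (n_obs - n_params)."""
    n_obs = len(deltas)
    if n_obs <= n_params:
        raise ValueError("need more observations than parameters")
    total = sum(w * d * d for w, d in zip(weights, deltas))
    return total / (n_obs - n_params)

def scaled_variances(inverse_diagonal, s2):
    """Parameter variances of (18.5.2.13): S^2 times (a^-1)_ii."""
    return [s2 * v for v in inverse_diagonal]

# Made-up example: six observations, two refined parameters, relative weights.
weights = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
deltas = [0.5, -0.5, 1.0, -1.0, 0.5, -0.5]
inverse_diagonal = [0.375, 0.5]       # hypothetical (a^-1)_ii values

s2 = s_squared(weights, deltas, n_params=2)
variances = scaled_variances(inverse_diagonal, s2)
```

Here the sum of [w\Delta^{2}] is 3.0 and there are four degrees of freedom, so [S^{2}] is 0.75 and the two variances are 0.75 times the corresponding inverse-matrix elements.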

18.5.2.3. Statistical descriptors and goodness of fit

In recent years, there have been developments and changes in statistical nomenclature and usage. Many aspects are summarized in the reports of the IUCr Subcommittee on Statistical Descriptors in Crystallography (Schwarzenbach et al., 1989, 1995). In the second report, inter alia, the Subcommittee emphasizes the terms uncertainty and standard uncertainty (s.u.). The latter is a replacement for the older term estimated standard deviation (e.s.d.). The Subcommittee classifies uncertainty components in two categories, based on their method of evaluation: type A, estimated by the statistical analysis of a series of observations, and type B, estimated otherwise. As an example of the latter, a type B component could allow for doubts concerning the estimated shape and dimensions of the diffracting crystal and the subsequent corrections made for absorption.

The square root S of the expression [S^{2}], (18.5.2.12) above, is called the goodness of fit when the weights are the reciprocals of the absolute variances of the observations.

One recommendation in the second report does call for comment here. While agreeing that formulae like (18.5.2.13) lead to conservative estimates of parameter variances, the report suggests that this practice is based on the questionable assumption that the variances of the observations by which the weights are assigned are relatively correct but uniformly underestimated. When the goodness of fit [S > 1], then either the weights or the model or both are suspect.

Comment is needed. The account in Section 18.5.2.2 describes two distinct ways of estimating parameter variances, covering two ranges of error. The kind of weights envisaged in the reports (based on variances of type A and/or of type B) are of a class described for method (1). They are not the weights to be used in method (2) (though they may be a component in such weights). Method (2) implicitly assumes from the outset that there are experimental errors, some covered and others not covered by method (1), and that there are imperfections in the calculated model (as is obviously true for proteins). Method (2) avoids exploring the relative proportions and details of these error sources and aims to provide a realistic estimate of parameter uncertainties which can be used in external comparisons. It can be formally objected that method (2) does not conform to the criteria of random-variable theory, since clearly the Δ's are partially correlated through the remaining model errors and some systematic experimental errors. But it is a useful procedure. Method (1) on its own would present an optimistic view of the reliability of the overall investigation, the degree of optimism being indicated by the inverse of the goodness of fit (18.5.2.12). In method (2), if the weights are on an arbitrary scale, then [S^{2}] can have an arbitrary value.
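The closing remark about arbitrarily scaled weights can be checked numerically. The sketch below uses the one-parameter diagonal form of (18.5.2.13), i.e. [S^{2}] divided by the sum of [wg^{2}], with made-up gradients, residuals and weights; the function name and the factor of 100 are arbitrary choices for the illustration.

```python
def s2_and_variance(weights, gradients, deltas, n_params=1):
    """Return S^2 of (18.5.2.12) and the one-parameter variance S^2 / sum(w g^2)."""
    a_ii = sum(w * g * g for w, g in zip(weights, gradients))
    s2 = sum(w * d * d for w, d in zip(weights, deltas)) / (len(deltas) - n_params)
    return s2, s2 / a_ii

# Made-up values for four observations and one refined parameter.
gradients = [1.0, 2.0, 0.5, 1.5]
deltas = [0.2, -0.1, 0.3, -0.2]
weights = [1.0, 0.5, 2.0, 1.0]

s2_a, var_a = s2_and_variance(weights, gradients, deltas)
# Put the same relative weights on a scale 100 times larger.
s2_b, var_b = s2_and_variance([100.0 * w for w in weights], gradients, deltas)
```

Multiplying every weight by a constant multiplies both [S^{2}] and [a_{ii}] by that constant, so `s2_b` is 100 times `s2_a` while `var_b` equals `var_a`: the parameter variance from method (2) does not depend on the scale of the weights, although [S^{2}] does.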

For an advanced-level treatment of many aspects of the refinement of structural parameters, see Part 8 of International Tables for Crystallography, Volume C (2004). The detection and treatment of systematic error are discussed in Chapter 8.5 therein.

References

International Tables for Crystallography (2004). Vol. C. Mathematical, physical and chemical tables, edited by E. Prince. Dordrecht: Kluwer Academic Publishers.
Cruickshank, D. W. J. (1970). Least-squares refinement of atomic parameters. In Crystallographic computing, edited by F. R. Ahmed, S. R. Hall & C. P. Huber, pp. 187–196. Copenhagen: Munksgaard.
Schwarzenbach, D., Abrahams, S. C., Flack, H. D., Gonschorek, W., Hahn, Th., Huml, K., Marsh, R. E., Prince, E., Robertson, B. E., Rollett, J. S. & Wilson, A. J. C. (1989). Statistical descriptors in crystallography: Report of the IUCr subcommittee on statistical descriptors. Acta Cryst. A45, 63–75.
Schwarzenbach, D., Abrahams, S. C., Flack, H. D., Prince, E. & Wilson, A. J. C. (1995). Statistical descriptors in crystallography. II. Report of a working group on expression of uncertainty in measurement. Acta Cryst. A51, 565–569.