The least-squares plane

Marsh, R. E.; Schomaker, V.

doi:10.1107/97809553602060000560

International Tables for Crystallography, Volume B: Reciprocal space. Edited by U. Shmueli

International Tables for Crystallography (2006). Vol. B. ch. 3.2, pp. 353-359
https://doi.org/10.1107/97809553602060000560

Chapter 3.2. The least-squares plane

R. E. Marsh^a ^* and V. Schomaker^b ^‡

^aThe Beckman Institute–139–74, California Institute of Technology, 1201 East California Blvd, Pasadena, California 91125, USA, and ^bDepartment of Chemistry, University of Washington, Seattle, Washington 98195, USA
Correspondence e-mail: rem@xray.caltech.edu

Footnotes

^‡ Deceased.

¹ A simple two-dimensional problem illustrates the point. A regular polygon of n atoms is to define a `best' line (always a central line). If the error matrix (the same for each atom) is isotropic, the weighted sum of squares of deviations from the line is independent of its orientation for $n > 2$, i.e. the problem is a degenerate eigenvalue problem, with two equal eigenvalues. However, if the error ellipsoids are not isotropic and are all oriented radially or all tangentially (these are merely the two orientations tried), the sum has n/2 equal minima for even n and n equal minima for odd n, in the one-$\pi$ range of possible orientations of the line.
Possibly similar peculiarities might be imagined if the anisotropic weights were more complicated (e.g., `star' shaped) than can be described by a non-singular matrix, or by any matrix. Such are of course excluded here.
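The polygon example can be checked numerically. The sketch below is illustrative only and rests on assumptions not stated in the footnote: the atoms sit at unit radius, the line passes through the centre with its unit normal at angle theta, each squared deviation is weighted by the reciprocal of the error variance along that normal, and the variances (4 and 1) are arbitrary. The function names are hypothetical.

```python
import math


def weighted_sum(theta, n, var_radial, var_tangential):
    """Weighted sum of squared deviations of n polygon atoms from a central line.

    theta is the direction of the unit normal to the line. Assumption: the
    weight of each atom is 1 / (variance of its position along the normal).
    """
    total = 0.0
    for i in range(n):
        # Angle between the line normal and the radial direction of atom i.
        u = theta - 2.0 * math.pi * i / n
        deviation_sq = math.cos(u) ** 2  # squared distance from the line (unit radius)
        variance = var_radial * math.cos(u) ** 2 + var_tangential * math.sin(u) ** 2
        total += deviation_sq / variance
    return total


def count_minima(n, var_radial, var_tangential, steps=3600):
    """Count local minima of the weighted sum for theta on a grid over [0, pi)."""
    values = [
        weighted_sum(math.pi * j / steps, n, var_radial, var_tangential)
        for j in range(steps)
    ]
    count = 0
    for j in range(steps):
        previous = values[j - 1]              # wraps around: the sum has period pi
        following = values[(j + 1) % steps]
        if values[j] < previous and values[j] <= following:
            count += 1
    return count


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        radial = count_minima(n, 4.0, 1.0)      # ellipsoids elongated radially
        tangential = count_minima(n, 1.0, 4.0)  # ellipsoids elongated tangentially
        print(n, radial, tangential)
```

Under these assumptions the counts for n = 3, 4, 5, 6 are 3, 2, 5, 3 for both orientations of the ellipsoids, and with equal variances the sum is constant in theta.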
² Ito observes that his method fails when there are only three points to define the plane, his least-squares normal equations becoming singular. But the situation is worse: his equations are singular for any number of points, if the points fit a plane exactly.
³ See first footnote¹.
⁴ Ito's second method (Ito, 1981b), of `substitution', is also a regression, essentially like the regression along z at fixed x and y used long ago by Clews & Cochran (1949, p. 52) and like the regressions of y on fixed x that – despite the fact that both x and y are afflicted with random errors – are commonly taught or practised in schools, universities and laboratories nearly 200 years after Gauss, to the extent that Deming, Lybanon and other followers of Gauss have so far had rather little influence. Kalantar's (1987) short note is a welcome but still rare exception.
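The contrast between regressing y on fixed x and a fit that admits errors in both coordinates can be sketched in two dimensions. The code below is an illustration, not a method taken from the chapter: it assumes equal, uncorrelated error variances in x and y, for which the Deming-type fit reduces to orthogonal regression with a standard closed-form slope. The data and function names are made up for the example.

```python
import math


def centered_sums(xs, ys):
    """Return the centred sums of squares and products (s_xx, s_yy, s_xy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return s_xx, s_yy, s_xy


def slope_y_on_x(xs, ys):
    """Ordinary regression of y on x: all error is attributed to y."""
    s_xx, _, s_xy = centered_sums(xs, ys)
    return s_xy / s_xx


def slope_orthogonal(xs, ys):
    """Orthogonal regression: equal error variances assumed in x and y.

    Requires s_xy != 0.
    """
    s_xx, s_yy, s_xy = centered_sums(xs, ys)
    diff = s_yy - s_xx
    return (diff + math.sqrt(diff * diff + 4.0 * s_xy * s_xy)) / (2.0 * s_xy)


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical data
    ys = [0.1, 0.9, 2.3, 2.8, 4.2]
    print("y on x:          ", slope_y_on_x(xs, ys))
    print("x on y, inverted:", 1.0 / slope_y_on_x(ys, xs))
    print("orthogonal:      ", slope_orthogonal(xs, ys))
```

Exchanging the roles of x and y inverts the orthogonal slope exactly, so the fitted line does not depend on which coordinate is called the dependent one. The two ordinary regressions give different lines, and the orthogonal line lies between them.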
⁵ Is this statement firm for a nonlinear problem? We use it, assuming that at convergence the problem has become effectively linear. But in fact this will depend on how great the nonlinearity is, in comparison with the random errors (variances) that eventually have to be considered. Another caveat may be in order in regard to our limited knowledge of Gauss's second derivation of the method of least squares, the one he preferred [see Whittaker & Robinson (1929)] and which establishes for a linear system that the best linear combination of a set of observations, afflicted by random errors, for estimating any arbitrary derived quantity – best in the sense of being unbiased and having minimal mean-square error – is given by the method of least squares with the weight matrix set equal to the inverse error matrix of the observations. Hamilton, and Whittaker & Robinson, prove this only for the case that the derived parameters are not constrained, whereas here they are. We believe, however, that the best choice of weights is a question concerning only the observations, and that it cannot be affected by the method used for minimizing S subject to any constraints, whether by eliminating some of the parameters by invoking the constraints directly or by the use of Lagrange multipliers.
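The role of inverse-variance weights can be illustrated in the simplest unconstrained case, which is all this sketch covers; it does not address the constrained situation raised in the footnote. It assumes independent observations of a single quantity with known variances (a diagonal error matrix), and the numbers and names are chosen only for illustration.

```python
def combination_variance(coefficients, variances):
    """Variance of sum(a_i * x_i) for independent x_i with the given variances."""
    return sum(a * a * v for a, v in zip(coefficients, variances))


def normalized(weights):
    """Scale weights to sum to one, so the linear combination is unbiased."""
    total = sum(weights)
    return [w / total for w in weights]


variances = [1.0, 4.0, 0.25]  # hypothetical error variances of three observations

inverse_variance = normalized([1.0 / v for v in variances])
equal = normalized([1.0, 1.0, 1.0])
arbitrary = normalized([1.0, 2.0, 3.0])

var_inverse = combination_variance(inverse_variance, variances)
var_equal = combination_variance(equal, variances)
var_arbitrary = combination_variance(arbitrary, variances)

if __name__ == "__main__":
    print(var_inverse, var_equal, var_arbitrary)
```

With weights proportional to the inverse variances the variance of the combination equals the reciprocal of the sum of the inverse variances, and the other two unbiased choices tried here both give a larger value.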
⁶ We do not fully understand the curious situation of this equation. It arises immediately if the isotropic problem is formulated as one of minimizing $[(1 - {\sf b}^{T}{\sf r})^{2}]$ by varying ${\sf b}$, and it fails then [SWMB (1959) referred to it as `an incorrect method'], as it obviously must – observe the denominator – if the plane passes too close to the origin. However, it fails in other circumstances also. The main point about it is perhaps that it is linear in ${\sf b}$ and is obtained as the supposedly exact and unique solution of the isotropic problem, whereas the problem has no unique solution but three solutions instead (SWMB, 1959). From the point of view of Gaussian least squares, the essential fault in minimizing $S_{\rm lin} = [(1 - {\sf b}^{T}{\sf r})^{2}]$ may be that the apparently simple weighting function in it, i.e. the identity, is actually complicated and unreasonable. In terms of distance deviations from the plane, we have $S_{\rm lin} = [w(d - {\sf m}^{T}{\sf r})^{2}]$, with $w = {\sf b}^{T}{\sf b} = d^{-2}$. Prudence requires that the origin be shifted to a point sufficiently far from the plane and close enough to the centroid normal to avoid the difficulties discussed by SWMB. Note that for the one-dimensional problem of fitting a constant to a set of measurements of a single entity the Deming–Lagrange treatment with the condition $1 = cx_{a}$ and weights w reduces immediately to the standard result $1/c = [wx]/[w]$.
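The rewriting of $S_{\rm lin}$ in terms of distance deviations can be checked directly. The sketch below assumes the plane is written as ${\sf b}^{T}{\sf r} = 1$ with ${\sf b} = {\sf m}/d$, where ${\sf m}$ is the unit normal and d the origin-to-plane distance; the function names and sample values are hypothetical. It also evaluates the one-dimensional result $1/c = [wx]/[w]$.

```python
import math


def dot(u, v):
    """Scalar product of two vectors given as sequences."""
    return sum(a * b for a, b in zip(u, v))


def s_lin(b, points):
    """Sum over the points of (1 - b.r)^2, with unit weights."""
    return sum((1.0 - dot(b, r)) ** 2 for r in points)


def s_weighted_distance(b, points):
    """The same sum written as w * (d - m.r)^2 with w = b.b = d**-2."""
    norm_b = math.sqrt(dot(b, b))
    m = [component / norm_b for component in b]  # unit normal to the plane
    d = 1.0 / norm_b                             # origin-to-plane distance
    w = dot(b, b)                                # implied weight, equal to d**-2
    return sum(w * (d - dot(m, r)) ** 2 for r in points)


def reciprocal_constant(xs, ws):
    """Return 1/c = [wx]/[w], i.e. the weighted mean of the measurements."""
    return sum(w * x for w, x in zip(ws, xs)) / sum(ws)


if __name__ == "__main__":
    points = [(1.0, 0.0, 0.1), (0.0, 1.0, -0.1), (1.0, 1.0, 0.05), (0.5, 0.2, 0.0)]
    b = (0.3, -0.2, 2.0)
    print(s_lin(b, points), s_weighted_distance(b, points))
    print(reciprocal_constant([1.0, 2.0, 4.0], [1.0, 1.0, 2.0]))
```

The two sums agree for any nonzero ${\sf b}$. Because the implied weight is $d^{-2}$, it grows without bound as the trial plane approaches the origin, which is one way to see why the unit weighting is less innocent than it looks.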
