International Tables for Crystallography, Volume B: Reciprocal space. Edited by U. Shmueli. © International Union of Crystallography 2006
International Tables for Crystallography (2006). Vol. B. ch. 3.2, pp. 353-355
Section 3.2.2. Least-squares plane based on uncorrelated, isotropic weights
(a) The Beckman Institute–139–74, California Institute of Technology, 1201 East California Blvd, Pasadena, California 91125, USA, and (b) Department of Chemistry, University of Washington, Seattle, Washington 98195, USA
This is surely the most common situation; it is not often that one will wish to take the trouble, or be presumptive enough, to assign anisotropic or correlated weights to the various atoms. And one will sometimes, perhaps even often, not be genuinely interested in the hypothesis that the atoms actually are rigorously coplanar; for instance, one might be interested in examining the best plane through such a patently non-planar molecule as cyclohexane. Moreover, the calculation is simple enough, given the availability of computers and programs, as to be a practical realization of the off-the-cuff treatment suggested in our opening paragraph. The problem of deriving the plane's coefficients is intrinsically nonlinear in the way first discussed by Schomaker et al. (1959; SWMB). Any formulation other than as an eigenvalue–eigenvector problem (SWMB), as far as we can tell, will sometimes go astray. As to the propagation of errors, numerous treatments have been given, but none that we have seen is altogether satisfactory.
We refer all vectors and matrices to Cartesian axes, because that is the most convenient in calculation. However, a more elegant formulation can be written in terms of general axes [e.g., as in Shmueli (1981)].
The notation is troublesome. Indices are needed for atom number and Cartesian direction, and the exponent 2 is needed as well, which is difficult if there are superscript indices. The best way seems to be to write all the indices as subscripts and distinguish among them by context – i, j, 1, 2, 3 for directions; k, l, p (and sometimes K, …) for atoms. In any case, atom first then direction if there are two subscripts; direction, if only one index for a vector component, but atom (in this section at least) if for a weight or a vector. And $\sigma_{d_1}$, e.g., for the standard uncertainty of the distance of atom 1 from a plane. For simplicity in practice, we use Cartesian coordinates throughout.
The first task is to find the plane, which we write as
$$0 = \mathbf{m}\cdot\mathbf{r} - d \equiv \mathsf{m}^{T}\mathsf{r} - d,$$
where r is here the vector from the origin to any point on the plane (but usually represents the measured position of an atom), m is a unit vector parallel to the normal from the origin to the plane, d is the length of the normal, and $\mathsf{m}$ and $\mathsf{r}$ are the column representations of m and r. The least-squares condition is to find the stationary values of $S \equiv \sum_k w_k(\mathsf{m}^{T}\mathsf{r}_k - d)^2$ subject to $\mathsf{m}^{T}\mathsf{m} = 1$, with $\mathsf{r}_k$, $k = 1, \ldots, n$, the vector from the origin to atom k and with weights, $w_k$, isotropic and without interatomic correlations for the n atoms of the plane. We also write S as $[w(\mathsf{m}^{T}\mathsf{r} - d)^2]$, the subscript for atom number being implicit in the Gaussian summations $[\ldots]$ over all atoms, as it is also in the angle-bracket notation $\langle\ldots\rangle$ for the weighted average over all atoms, for example in $\langle\mathsf{r}\rangle$ – the weighted centroid of the group of atoms – just below.
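First solve for d, the origin-to-plane distance:
$$-\tfrac{1}{2}\,\partial S/\partial d = [w(\mathsf{m}^{T}\mathsf{r} - d)] = 0, \qquad d = [w\,\mathsf{m}^{T}\mathsf{r}]/[w] \equiv \mathsf{m}^{T}\langle\mathsf{r}\rangle.$$
Then
$$S = [w(\mathsf{m}^{T}\mathsf{r} - d)^2] = [w\{\mathsf{m}^{T}(\mathsf{r} - \langle\mathsf{r}\rangle)\}^2] \equiv [w(\mathsf{m}^{T}\mathsf{s})^2] = \mathsf{m}^{T}[w\,\mathsf{s}\mathsf{s}^{T}]\mathsf{m} \equiv \mathsf{m}^{T}\mathsf{A}\mathsf{m}.$$
Here $\mathsf{s}_k \equiv \mathsf{r}_k - \langle\mathsf{r}\rangle$ is the vector from the centroid to atom k. Then solve for m. This is the eigenvalue problem – to diagonalize $\mathsf{A}$ (bear in mind that $A_{ij}$ is just $[w s_i s_j]$) by rotating the coordinate axes, i.e., to find the $3 \times 3$ arrays $\mathsf{M}$ and $\mathsf{L}$, $\mathsf{L}$ diagonal, to satisfy
$$\mathsf{M}^{T}\mathsf{A}\mathsf{M} = \mathsf{L}, \qquad \mathsf{M}^{T}\mathsf{M} = \mathsf{I}.$$
$\mathsf{A}$ and $\mathsf{L}$ are symmetric; the columns $\mathsf{m}$ of $\mathsf{M}$ are the direction cosines of, and the diagonal elements of $\mathsf{L}$ are the sums of weighted squares of residuals from, the best, worst and intermediate planes, as discussed by SWMB.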
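The two steps just described (centroid first, then the eigenvalue problem for the weighted second-moment matrix of the centroid-referred positions) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' program: the function names `jacobi_eigen` and `fit_plane` are hypothetical, the eigenproblem is solved by plain cyclic Jacobi rotations so that only the standard library is needed, and the sign of the normal is chosen so that the returned origin-to-plane distance is non-negative.

```python
import math

def jacobi_eigen(a, sweeps=50):
    """Eigenvalues and eigenvectors (columns of v) of a symmetric 3x3 matrix,
    obtained by cyclic Jacobi rotations (an illustrative choice of solver)."""
    a = [list(map(float, row)) for row in a]
    v = [[float(i == j) for j in range(3)] for i in range(3)]
    for _ in range(sweeps):
        off = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
        diag = a[0][0] ** 2 + a[1][1] ** 2 + a[2][2] ** 2
        if off <= 1e-28 * (1.0 + diag):
            break
        for p, q in ((0, 1), (0, 2), (1, 2)):
            if a[p][q] == 0.0:
                continue
            diff = a[q][q] - a[p][p]
            sgn = math.copysign(1.0, diff)
            # rotation angle that zeroes a[p][q], restricted to |theta| <= pi/4
            theta = 0.5 * math.atan2(2.0 * a[p][q] * sgn, abs(diff))
            c, s = math.cos(theta), math.sin(theta)
            for k in range(3):  # a <- a J (columns p, q)
                akp, akq = a[k][p], a[k][q]
                a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
            for k in range(3):  # a <- J^T a (rows p, q)
                apk, aqk = a[p][k], a[q][k]
                a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
            for k in range(3):  # v <- v J accumulates the eigenvectors
                vkp, vkq = v[k][p], v[k][q]
                v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(3)], v

def fit_plane(points, weights):
    """Return (m, d, S_min): unit normal, origin-to-plane distance and the
    weighted sum of squared residuals for the best plane."""
    wsum = sum(weights)
    centroid = [sum(w * p[i] for w, p in zip(weights, points)) / wsum
                for i in range(3)]
    s = [[p[i] - centroid[i] for i in range(3)] for p in points]
    A = [[sum(w * sk[i] * sk[j] for w, sk in zip(weights, s))
          for j in range(3)] for i in range(3)]
    evals, vecs = jacobi_eigen(A)
    best = min(range(3), key=lambda j: evals[j])  # smallest eigenvalue: best plane
    m = [vecs[i][best] for i in range(3)]
    d = sum(m[i] * centroid[i] for i in range(3))
    if d < 0.0:  # sign convention assumed here: d is a non-negative length
        m, d = [-x for x in m], -d
    return m, d, evals[best]
```

The other two eigenvectors returned by `jacobi_eigen` correspond to the intermediate and worst planes; formulating the fit as an eigenproblem is what avoids the failures of other formulations mentioned in the opening paragraph.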
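Waser et al. (1973; WMC) carefully discussed how the random errors of measurement of the atom positions propagate into the derived quantities in the foregoing determination of a least-squares plane. This section presents an extension of their discussion. To begin, however, we first show how standard first-order perturbation theory conveniently describes the propagation of error into $\mathsf{M}$ and $\mathsf{L}$ when the positions $\mathsf{r}_k$ of the atoms are incremented by the amounts $\delta\mathsf{r}_k \equiv \xi_k$ and the corresponding quantities $\mathsf{s}_k \equiv \mathsf{r}_k - \langle\mathsf{r}\rangle$ (the vectors from the centroid to the atoms) by the amounts $\eta_k \equiv \xi_k - \langle\xi\rangle$. (The need to account for the variation in position of the centroid, i.e. to distinguish between $\eta_k$ and $\xi_k$, was overlooked by WMC.) The consequent increments in $\mathsf{A}$, $\mathsf{M}$ and $\mathsf{L}$ are
$$\delta\mathsf{A} = [w\,\eta\,\mathsf{s}^{T}] + [w\,\mathsf{s}\,\eta^{T}] \equiv \alpha, \qquad \delta\mathsf{M} = \mathsf{M}\mu, \qquad \delta\mathsf{L} \equiv \lambda.$$
Here the columns of $\delta\mathsf{M}$ are expressed as linear combinations of the columns of $\mathsf{M}$. Note also that both perturbations, $\mu$ and $\lambda$, which are the adjustments to the orientations and associated eigenvalues of the principal planes, will depend on the reduced coordinates $\mathsf{s}$ and the perturbing influences $\eta$ by way of $\alpha$, which in turn depends only on the reduced coordinates $\mathsf{s}$ and the reduced shifts $\eta$. In contrast, the change in the origin-to-plane distance for the plane defined by the column vector $\mathsf{m}$ of $\mathsf{M}$,
$$\delta d = \delta(\mathsf{m}^{T}\langle\mathsf{r}\rangle) = (\delta\mathsf{m})^{T}\langle\mathsf{r}\rangle + \mathsf{m}^{T}\langle\xi\rangle,$$
depends on the $\langle\mathsf{r}\rangle$ and $\langle\xi\rangle$ directly as well as on the $\mathsf{s}$ and $\eta$ by way of the $\mu$.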
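The first-order results arising from the standard conditions, $\mathsf{M}^{T}\mathsf{M} = \mathsf{I}$, $\mathsf{L}$ diagonal, and $\mathsf{M}^{T}\mathsf{A}\mathsf{M} = \mathsf{L}$, are (with $\alpha \equiv \delta\mathsf{A}$, $\delta\mathsf{M} \equiv \mathsf{M}\mu$ and $\lambda \equiv \delta\mathsf{L}$)
$$\mu^{T} + \mu = 0$$
and
$$\mu^{T}\mathsf{L} + \mathsf{M}^{T}\alpha\mathsf{M} + \mathsf{L}\mu = \lambda.$$
Stated in terms of the matrix components $\mu_{ij}$ and $\lambda_{ij}$, the first condition is $\mu_{ij} = -\mu_{ji}$, hence $\mu_{jj} = 0$, and the second is $\lambda_{ij} = 0$, $i \neq j$. With these results the third condition then reads
$$\lambda_{jj} = (\mathsf{M}^{T}\alpha\mathsf{M})_{jj}, \qquad \mu_{ij} = (\mathsf{M}^{T}\alpha\mathsf{M})_{ij}/(L_{jj} - L_{ii}), \quad i \neq j.$$
All this is analogous to the usual first-order perturbation theory, as, for example, in elementary quantum mechanics.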
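The first-order relations can be checked numerically. The sketch below is illustrative only and rests on assumptions: it works in coordinates in which the unperturbed second-moment matrix is already diagonal (eigenvector matrix equal to the identity), takes an arbitrary small symmetric perturbation `alpha` rather than one built from atomic shifts, and uses the standard first-order formulas, with `mu[i][j] = alpha[i][j] / (L[j] - L[i])` for the rotation of the eigenvectors and `lam[j] = alpha[j][j]` for the eigenvalue shifts. The names `first_order` and `residual_after_update` are hypothetical. If the formulas are right, the updated eigenvector matrix diagonalizes the perturbed matrix up to terms of second order in `alpha`.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def first_order(L, alpha):
    """First-order eigenvector rotation mu (antisymmetric) and eigenvalue
    shifts lam for a diagonal matrix diag(L) perturbed by symmetric alpha."""
    mu = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            if i != j:
                mu[i][j] = alpha[i][j] / (L[j] - L[i])
    lam = [alpha[j][j] for j in range(3)]
    return mu, lam

def residual_after_update(L, alpha):
    """Largest deviation of (I + mu)^T (diag(L) + alpha) (I + mu) from
    diag(L + lam); it should be of second order in alpha."""
    mu, lam = first_order(L, alpha)
    a_new = [[(L[i] if i == j else 0.0) + alpha[i][j] for j in range(3)]
             for i in range(3)]
    m_new = [[(1.0 if i == j else 0.0) + mu[i][j] for j in range(3)]
             for i in range(3)]
    d = matmul(transpose(m_new), matmul(a_new, m_new))
    worst = 0.0
    for i in range(3):
        for j in range(3):
            target = L[i] + lam[i] if i == j else 0.0
            worst = max(worst, abs(d[i][j] - target))
    return worst

def scaled_alpha(eps):
    """An arbitrary symmetric perturbation of overall size eps."""
    base = [[1.0, 2.0, -1.0], [2.0, -0.5, 3.0], [-1.0, 3.0, 0.7]]
    return [[eps * x for x in row] for row in base]

L = [5.0, 3.0, 0.5]  # assumed, well separated eigenvalues
r_full = residual_after_update(L, scaled_alpha(1e-3))
r_half = residual_after_update(L, scaled_alpha(5e-4))
```

Halving the perturbation should reduce the residual by a factor close to four, which is the signature of a correct first-order treatment. The formula for `mu` breaks down when two eigenvalues coincide, as expected from the denominators.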
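Now rotate to the coordinates defined by WMC, with axes parallel to the original eigenvectors ($\mathsf{M} = \mathsf{I}$), restrict attention to the best plane [$\mathsf{m}^{T} = (0, 0, 1)$], and define $\delta d_c$ as $\mathsf{m}^{T}\langle\xi\rangle = \langle\xi_3\rangle$, the weighted mean of the atomic shifts $\xi_k$ along the normal, keeping in mind $\langle\mathsf{s}\rangle = 0$; $d_c$ itself, the original plane-to-centroid distance, of course vanishes. One then finds, with $\eta_k = \xi_k - \langle\xi\rangle$ the centroid-referred shifts,
$$\mu_{13} = -\frac{[w(s_1\eta_3 + s_3\eta_1)]}{[w(s_1^2 - s_3^2)]}, \qquad \mu_{23} = -\frac{[w(s_2\eta_3 + s_3\eta_2)]}{[w(s_2^2 - s_3^2)]},$$
and also
$$\delta d = \langle\xi_3\rangle + \mu_{13}\langle r_1\rangle + \mu_{23}\langle r_2\rangle.$$
These results have simple interpretations. The changes in direction of the plane normal (the $\delta\mathsf{m}$) are rotations, described by $\mu_{13}$ and $\mu_{23}$, in response to changes in moments acting against effective torsion force constants. For $\mu_{23}$, for example, the contribution of atom k to the total relevant moment, about direction 1, is $-w_k s_{k3} s_{k2}$ ($w_k s_{k3}$ the `force' and $s_{k2}$ the lever arm), and its nominally first-order change has two parts, $-w_k\eta_{k3}s_{k2}$ from the change in force and $-w_k s_{k3}\eta_{k2}$ from the change in lever arm; the resisting torsion constant is $[w(s_2^2 - s_3^2)]$, which, reflection will show, is qualitatively reasonable if not quantitatively obvious. The perpendicular displacement of the plane from the original centroid is $\delta d_c$, but there are two further contributions to $\delta d$, the change in distance from origin to plane along the plane normal, that arise from the two components of out-of-plane rotation of the plane about its centroid. Note that $\delta d_c$ is not given by $\langle\eta_3\rangle$, which vanishes identically.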
There is a further, somewhat delicate point: if the group of atoms is indeed essentially coplanar, the out-of-plane coordinates $s_{k3}$ are of the same order of magnitude as the shifts $\eta_{k3}$, unlike the $s_{k1}$, $s_{k2}$, which are in general about as big as the lateral extent of the group. It is then appropriate to drop all terms in $\eta_{k1}$ or $\eta_{k2}$, and, in the denominators, the terms in $s_{k3}^2$.
The covariances of the derived quantities (by covariances we mean here both variances and covariances) can now be written out rather compactly by extending the implicit designation of atom numbers to double sums, the first of each of two similar factors referring to the first atom index and the second to the second, e.g., $[w s_2\, w s_2\, \overline{\eta_3\eta_3}] \equiv \sum_k\sum_l w_k s_{k2}\, w_l s_{l2}\, \overline{\eta_{k3}\eta_{l3}}$. Note that the various covariances, i.e. the averages over the presumed population of random errors of replicated measurements, are indicated by overlines, angle brackets having been pre-empted for averages over sets of atoms.
Interatomic covariance (e.g., $\overline{\eta_{k3}\eta_{l3}}$, $k \neq l$) thus presents no formal difficulty, although actual computation may be tedious. Nonzero covariance for the centroid-referred shifts $\eta_k$ may arise explicitly from interatomic covariance (e.g., $\overline{\xi_{k3}\xi_{l3}}$, $k \neq l$) of the errors $\xi_k$ in the atomic positions $\mathsf{r}_k$, and it will always arise implicitly because $\langle\xi\rangle$ in $\eta_k = \xi_k - \langle\xi\rangle$ includes all the $\xi_k$ and therefore has nonzero covariance with all of them and with itself, even if there is no interatomic covariance among the $\xi_k$'s.
If both types of interatomic covariance (explicit and implicit) are negligible, the covariances simplify a great deal, the double summations reducing to single summations. [The formal expression for does not change, so it will not be repeated.]
When the variances are the same for as for (i.e. , all i, j) and the covariances all vanish , the simplify further. If the variances are also isotropic , all i, j), there is still further simplification to If allowance is made for the difference in definition between and , these expressions are equivalent to the ones (equations 7–9) given by WMC, who, however, do not appear to have been aware of the distinction between and and the possible consequences thereof.
If, finally, for each atom is taken equal to its , all j, there is still further simplification.
For the earlier, more general expressions for the components of it is still necessary to find and in terms of , with .
In the isotropic, `no-correlation' case, for example, these reduce to and Here the difference between the correct covariance values and the values obtained on ignoring the variation in may be important if the number of defining atoms is small, say, 5 or 4 or, in the extreme, 3.
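The effect of the centroid on the covariances can be made concrete with a short calculation. The sketch below is an illustration under stated assumptions, not a transcription of the expressions referred to above: it writes the error in one Cartesian component of atom k as `xi_k`, the centroid-referred shift as `eta_k = xi_k - <xi>` with `<xi>` the weighted mean, assumes uncorrelated errors with variances `sigma2[k]`, and compares direct propagation through the linear map from `xi` to `eta` with the closed form that follows from that definition. The function names are hypothetical.

```python
def eta_covariance(weights, sigma2):
    """Covariance matrix of eta_k = xi_k - <xi> for one Cartesian component,
    given uncorrelated errors xi_k with variances sigma2[k]."""
    n = len(weights)
    wsum = sum(weights)
    # eta = P xi, with P[k][p] = delta_kp - w_p / [w]
    proj = [[(1.0 if k == p else 0.0) - weights[p] / wsum for p in range(n)]
            for k in range(n)]
    return [[sum(proj[k][p] * sigma2[p] * proj[l][p] for p in range(n))
             for l in range(n)] for k in range(n)]

def eta_covariance_closed(weights, sigma2):
    """Closed form of the same quantity:
    delta_kl s_k - w_k s_k/[w] - w_l s_l/[w] + [w^2 s]/[w]^2, s = sigma2."""
    n = len(weights)
    wsum = sum(weights)
    const = sum(w * w * s for w, s in zip(weights, sigma2)) / wsum ** 2
    return [[(sigma2[k] if k == l else 0.0)
             - weights[k] * sigma2[k] / wsum
             - weights[l] * sigma2[l] / wsum
             + const
             for l in range(n)] for k in range(n)]

# Three equally weighted atoms with unit variance (assumed example values).
cov3 = eta_covariance([1.0, 1.0, 1.0], [1.0, 1.0, 1.0])
```

For n equally weighted atoms of equal variance the diagonal elements are reduced from 1 to 1 - 1/n and covariances of -1/n appear between different atoms, so ignoring the variation of the centroid is a poor approximation for three, four or five defining atoms and becomes harmless only as n grows.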
There are two cases, as has been pointed out, e.g., by Ito (1982).
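For example, consider a plane defined by only three atoms, one of overwhelmingly great w at (0, 0, 0), one at (1, 0, 0) and one at (0, 1, 0). The centroid is at (0, 0, 0) and we take $k = 2$, i.e. $\sigma_{d_2}$ is the item of interest. Of course, it is obvious without calculation that the standard uncertainties vanish for the distances of the three atoms from the plane they alone define; the purpose here is only to show, in one case for one of the atoms, that the calculation gives the same result, partly, it will be seen, because the change in orientation of the plane is taken into account. If the only variation in the atom positions is described by $\overline{\xi_{23}^2}$ (the mean-square error in the third coordinate of atom 2), one has for the tilts of the plane normal $\mu_{13} = -\xi_{23}$ and $\mu_{23} = 0$, and for the shift of the centroid along the normal $\langle\xi_3\rangle = 0$. The non-vanishing terms in the desired variance are then
$$\sigma_{d_2}^2 = \overline{\xi_{23}^2} + s_{21}^2\,\overline{\mu_{13}^2} + 2 s_{21}\,\overline{\xi_{23}\mu_{13}} = (1 + 1 - 2)\,\overline{\xi_{23}^2} = 0.$$
If, however, the problem concerns the same plane and a fourth atom at position $(1, 0, r_{43})$, not included in the specification of the plane and uncertain only in respect to $r_{43}$ (which is arbitrary) with $\overline{\xi_{43}^2} = \overline{\xi_{23}^2}$ (the same mean-square variation in direction 3 as for atom 2) and $\overline{\xi_{43}\xi_{23}} = 0$, the calculation for $\sigma_{d_4}^2$ runs the same as before, except for the third term:
$$\sigma_{d_4}^2 = \overline{\xi_{43}^2} + s_{41}^2\,\overline{\mu_{13}^2} + 2 s_{41}\,\overline{\xi_{43}\mu_{13}} = (1 + 1 + 0)\,\overline{\xi_{23}^2} = 2\,\overline{\xi_{23}^2}.$$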
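The three-atom example can be verified independently of the perturbation formulas. The sketch below is an illustration under assumptions and replaces the least-squares machinery by exact geometry: with only three defining atoms the best plane passes through all of them whatever the weights, so the distance of a test atom from the plane is computed from a cross product, its sensitivity to the third coordinate of each atom is obtained by central differences, and the variance is propagated for independent errors. The function names, the step size `h` and the numerical values (height 0.3 for the fourth atom, variance 0.01) are hypothetical choices.

```python
import math

def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def distance_from_plane(atoms, test_index):
    """Signed distance of atoms[test_index] from the plane through the
    first three atoms."""
    a, b, c = atoms[0], atoms[1], atoms[2]
    n = cross(sub(b, a), sub(c, a))
    return dot(sub(atoms[test_index], a), n) / math.sqrt(dot(n, n))

def distance_variance(atoms, var_z, test_index, h=1e-6):
    """First-order variance of the atom-to-plane distance for independent
    errors in the third coordinates, with variances var_z[p]."""
    total = 0.0
    for p in range(len(atoms)):
        plus = [list(r) for r in atoms]
        minus = [list(r) for r in atoms]
        plus[p][2] += h
        minus[p][2] -= h
        g = (distance_from_plane(plus, test_index)
             - distance_from_plane(minus, test_index)) / (2.0 * h)
        total += g * g * var_z[p]
    return total

sigma2 = 0.01                                   # assumed variance
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.3)]
var_z = [0.0, sigma2, 0.0, sigma2]              # only atoms 2 and 4 uncertain
var_atom2 = distance_variance(atoms, var_z, 1)  # defining atom
var_atom4 = distance_variance(atoms, var_z, 3)  # atom not defining the plane
```

The variance vanishes for atom 2, which helps to define the plane, and is twice the single-coordinate variance for the fourth atom, because the tilt of the plane induced by atom 2 is then uncorrelated with the error of the test atom instead of cancelling it.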
Extreme examples of this kind show clearly enough that variation in the direction of the plane normal or in the normal component of the centroid position will sometimes be important, the remarks to the contrary by Shmueli (1981) and, for the centroid, the omission by WMC notwithstanding. If only a few atoms are used to define the plane (e.g., three or, as is often the case, a very few more), both the covariance with the centroid position and uncertainty in the direction of the normal are likely to be important. The uncertainty in the normal may still be important, even if a goodly number of atoms are used to define the plane, whenever the test atom lies near or beyond the edge of the lateral domain defined by the other atoms.
References
Ito, T. (1982). On the estimated standard deviation of the atom-to-plane distance. Acta Cryst. A38, 869–870.
Schomaker, V., Waser, J., Marsh, R. E. & Bergman, G. (1959). To fit a plane or a line to a set of points by least squares. Acta Cryst. 12, 600–604.
Shmueli, U. (1981). On the statistics of atomic deviations from the `best' molecular plane. Acta Cryst. A37, 249–251.
Waser, J., Marsh, R. E. & Cordes, A. W. (1973). Variances and covariances for best-plane parameters including dihedral angles. Acta Cryst. B29, 2703–2708.