(i) McLachlan's first method (McLachlan, 1972, 1982) is iterative and conceptually the simplest. It sets D = AR, in which A and R are both orthogonal, with R being a current estimate of D and A being an adjustment which, at the beginning of each cycle, has a zero rotation angle associated with it. One iterative cycle estimates a non-trivial A, after which the product AR replaces R. Differentiating E with respect to the angle of rotation about an axis l shows that, for the derivative to vanish for all possible rotation axes l, the vector g must vanish, i.e. at the end of the iteration R must be such that the product RM is symmetrical. The vector g represents the couple exerted on the rotating body by forces acting at the atoms. Choosing l parallel to g gives the greatest gradient of E with respect to the rotation angle θ, and this gradient vanishes at a value of θ determined by g together with a matrix N constructed from the current R matrix. A is then constructed from l and this θ, and AR replaces R. The process is iterative because a couple about some new axis can appear when rotation about g eliminates the couple about g.
Note that for each rotation axis l there are two values of θ, differing by π, which reduce the gradient of E to zero, corresponding to maximum and minimum values of E. The minimum is the one for which the second derivative of E with respect to θ is positive; adding π to θ alters R and N and negates this quantity.
Note, too, that the process is essentially characterized as one which makes the product RM symmetrical with R orthogonal. We return to this point in (iii).
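The cycle described in (i) may be sketched numerically. The sketch below is not McLachlan's published algorithm: it assumes M = sum_n w_n x_n X_n^T with unit weights (so that minimizing E is equivalent to maximizing tr(RM)), takes g as the axial vector of the antisymmetric part of RM, and on each cycle rotates about g by the angle that maximizes the trace; the iteration stops when RM is symmetric to within tolerance.

```python
import numpy as np

def rodrigues(l, theta):
    """Rotation by angle theta about the unit axis l."""
    K = np.array([[0.0, -l[2], l[1]],
                  [l[2], 0.0, -l[0]],
                  [-l[1], l[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def mclachlan_first(M, tol=1e-10, max_cycles=1000):
    """Iterate rotations about the couple axis g until R M is symmetric."""
    R = np.eye(3)
    for _ in range(max_cycles):
        B = R @ M
        # g is the axial vector of the antisymmetric part of B = R M;
        # it vanishes exactly when R M is symmetric.
        g = np.array([B[1, 2] - B[2, 1],
                      B[2, 0] - B[0, 2],
                      B[0, 1] - B[1, 0]])
        c = np.linalg.norm(g)
        if c < tol:
            break
        l = g / c
        # trace(A(l, theta) R M) = cos(theta)(t - s) + sin(theta) c + s,
        # so the maximizing angle about l is atan2(c, t - s).
        t = np.trace(B)
        s = l @ B @ l
        R = rodrigues(l, np.arctan2(c, t - s)) @ R
    return R

rng = np.random.default_rng(0)
axis = np.array([1.0, 2.0, 2.0]) / 3.0
R0 = rodrigues(axis, 1.1)
x = rng.standard_normal((10, 3))
X = x @ R0.T                       # exactly superposable sets: X_n = R0 x_n
M = x.T @ X                        # sum_n x_n X_n^T with unit weights
R = mclachlan_first(M)
# At convergence R M is symmetric and, for exact data, R coincides with R0.
```

Each cycle performs an exact one-dimensional maximization about the current couple axis, so the trace increases monotonically; for exactly superposable sets the fixed point is the generating rotation.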
(ii) Kabsch's method (Kabsch, 1976, 1978) minimizes E with respect to the nine elements of D, subject to the six constraints D^T D = I, by using an auxiliary function G in which a symmetric matrix L contains six Lagrange multipliers. The function G then has minima with respect to the elements of D at locations which depend, inter alia, on the elements of L. By suitably choosing L, a minimum of G may be brought into coincidence with the constrained minimum of E. A minimum of G occurs where the gradient of G with respect to the elements of D vanishes and the matrix of second derivatives is positive definite. Thus L must be chosen so as to make the product of M with R (R replacing D since we are now confined to the orthogonal case) a matrix N which is symmetric and positive definite.
(iii) Comparison of the Kabsch and McLachlan methods. Using the initials of these authors as subscripts, we have seen that the Kabsch solution involves solving MR_K = N_K for an orthogonal matrix R_K, given that N_K is symmetrical and positive definite. Thus N_K = (MM^T)^(1/2) and R_K = M^(-1)(MM^T)^(1/2).
By comparison, the McLachlan treatment leads to an orthogonal R matrix satisfying R_M M = N_M, in which N_M is also symmetric and positive definite, which similarly leads to N_M = (M^T M)^(1/2) and R_M = (M^T M)^(1/2) M^(-1).
These seemingly different expressions for R_K and R_M are, in fact, equal, as the following shows. (MM^T)M = M(M^T M), and, since both N matrices are positive definite, the same relation holds between the square roots: (MM^T)^(1/2) M = M(M^T M)^(1/2). Multiplying on both sides by M^(-1) gives M^(-1)(MM^T)^(1/2) = (M^T M)^(1/2) M^(-1), therefore R_K = R_M.
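The agreement of the two solutions can be checked numerically. In the sketch below the Kabsch condition is taken as MR symmetric positive definite and the McLachlan condition as RM symmetric positive definite (the reading used here), the two candidate rotations being M^(-1)(MM^T)^(1/2) and (M^T M)^(1/2) M^(-1); M is assumed nonsingular with positive determinant.

```python
import numpy as np

def spd_sqrt(S):
    """Positive-definite square root via the eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.sqrt(w)) @ V.T

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3))
if np.linalg.det(M) < 0:
    M[0] *= -1.0                  # keep det M > 0 so the result is a proper rotation

R_K = np.linalg.inv(M) @ spd_sqrt(M @ M.T)    # makes M R_K = (M M^T)^(1/2), symmetric p.d.
R_M = spd_sqrt(M.T @ M) @ np.linalg.inv(M)    # makes R_M M = (M^T M)^(1/2), symmetric p.d.
# R_K and R_M coincide and are orthogonal.
```

Both expressions reduce, via the singular value decomposition M = USV^T, to the same product of the singular-vector matrices, which is why the two characterizations determine one and the same rotation.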
(iv) Diamond's first method. This method (Diamond, 1976a) differs from the previous ones in that the transformation D is allowed to be a general transformation which is then factorized into the product of an orthogonal matrix R and a symmetrical matrix T, D = RT. The transformation of x to fit X is thus interpreted as the combination of a homogeneous strain and a pure rotation, in which x is first subjected to the strain T and the result is rotated by R. Furthermore, the solution for D is available directly by unconstrained least squares (in the notation of Kabsch), so that the orthogonal factor R is determined, which may be compared with the results of the previous paragraph.
Although this R matrix by itself (i.e. applied without T) does not produce the best rotational superposition (i.e. the smallest E), it is the one which exactly superposes, onto their equivalents in X, the only three vectors in x whose mutual dispositions are conserved, so that the rotation so found is arguably the best defined one.
Alternative factorizations of D are all easily developed, and these ideas are extended by Diamond (1976a) to include non-homogeneous strains also.
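A minimal numerical sketch of the factorization idea follows, using hypothetical data generated from a small symmetric strain composed with a rotation; the unconstrained least-squares solution for D is formed and then factorized as D = RT by means of the singular value decomposition.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical ground truth: a pure rotation R0 composed with a small
# symmetric (homogeneous) strain T0, so that X_n = R0 T0 x_n.
A = 0.05 * rng.standard_normal((3, 3))
T0 = np.eye(3) + 0.5 * (A + A.T)
U0, _ = np.linalg.qr(rng.standard_normal((3, 3)))
R0 = U0 if np.linalg.det(U0) > 0 else -U0
x = rng.standard_normal((20, 3))
X = x @ (R0 @ T0).T

# Unconstrained least-squares fit of a general 3x3 transformation D.
D = (X.T @ x) @ np.linalg.inv(x.T @ x)

# Factorize D = R T with R orthogonal and T symmetric, via the SVD.
U, s, Vt = np.linalg.svd(D)
R = U @ Vt
T = Vt.T @ np.diag(s) @ Vt
# R T reproduces D, T is symmetric, and for exact data R and T
# recover the generating rotation and strain.
```

The factorization is the (unique) polar decomposition of a nonsingular D, so for noise-free data the generating R0 and T0 are recovered exactly.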
(v) McLachlan's second method. This method (McLachlan, 1979) is based on the properties of the 6 x 6 matrix having null diagonal blocks and M and M^T as off-diagonal blocks, and is immune to singularity of M. If p and q are three-dimensional vectors such that (p, q) is an eigenvector of this matrix with eigenvalue λ, then Mq = λp and M^T p = λq.
If q is negated both equalities are maintained provided λ is also negated. Therefore an orthogonal 6 x 6 matrix exists, consisting of 3 x 3 partitions H and K, which diagonalizes this matrix, the diagonal form Λ containing the non-negative eigenvalues together with their negatives. The reverse transformation shows that M = 2HΛK^T, and multiplying the eigenvectors together shows that 2KH^T is orthogonal, since the orthonormality of the eigenvectors requires 2H^T H = 2K^T K = I. Therefore (2KH^T)M = 2KΛK^T, but 2KH^T is orthogonal and 2KΛK^T is symmetrical, therefore [by paragraphs (i) and (iii) above] R = 2KH^T is the required rotation. Similarly, forming M(2KH^T) = 2HΛH^T corresponds to the Kabsch formulation [paragraphs (ii) and (iii)] since 2HΛH^T is symmetrical and the same rotation, 2KH^T, appears.
Note that the determinant of the orthogonal matrix so found is proportional to the product of the determinants of H and of K, and since the positive eigenvalues are collected into Λ it follows that the sign of the determinant of M is the same as the sign of the determinant of the resulting orthogonal matrix. If this is negative it means that the best superposition is obtained if one vector set is inverted and that x and X are of opposite chirality.
Expanding the expression for E, the weighted sum of squares of errors, for an orthogonal transformation shows that E is least when the trace of the product RM is greatest; in this treatment tr(RM) = λ1 + λ2 + λ3, the sum of the non-negative eigenvalues. Hence, if the eigenvalues in Λ and −Λ are arranged in decreasing order of modulus, and if the determinant of M is negative, then exchanging the third and sixth columns of the matrix of eigenvectors produces a product with positive determinant (i.e. a proper rotation) at minimum cost in residual. Similarly, if M is singular and one or more eigenvalues in Λ vanishes, it is necessary only to complete an orthonormal set of eigenvectors such that the determinants of H and K have the same sign.
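The construction may be sketched as follows, assuming M = sum_n x_n X_n^T with unit weights; the partitions H and K are read off from the eigenvectors belonging to the three non-negative eigenvalues, and the trace identity of the preceding paragraph is verifiable on the result.

```python
import numpy as np

rng = np.random.default_rng(4)

# Exactly superposable sets related by a known proper rotation R0.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
R0 = Q if np.linalg.det(Q) > 0 else -Q
x = rng.standard_normal((8, 3))
X = x @ R0.T                      # X_n = R0 x_n
M = x.T @ X                       # sum_n x_n X_n^T, unit weights

# 6 x 6 matrix with null diagonal blocks and M, M^T off-diagonal.
Z = np.zeros((6, 6))
Z[:3, 3:] = M
Z[3:, :3] = M.T

w, v = np.linalg.eigh(Z)          # eigenvalues ascending; the top three are non-negative
top = v[:, 3:]
H = top[:3, :]                    # p partitions of the eigenvectors
K = top[3:, :]                    # q partitions of the eigenvectors
R = 2.0 * K @ H.T                 # orthogonal, and R M = 2 K Lambda K^T is symmetric
# tr(R M) equals the sum of the three non-negative eigenvalues.
```

For exactly superposable sets the rotation so formed coincides with the generating rotation; the eigenvalues of the 6 x 6 matrix occur in plus/minus pairs, which is the property exploited when one vector set must be inverted.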
(vi) MacKay's method. MacKay (1984) was the first to consider the rotational superposition problem in terms of the vector r of Section 3.3.1.2.1. Using quaternion algebra he showed that if a vector x is rotated to X = Rx then X − x = r × (X + x), in which r is parallel to the rotation axis with a magnitude determined by the rotation angle, as may also be shown from elementary considerations. MacKay then solves this equation for the vector r by least squares given the vector pairs X and x. The individual errors are e_n = (X_n − x_n) − r × (X_n + x_n), and E is their weighted sum of squares. Setting the derivative of E with respect to r to zero gives a linear system which reduces to r = Q^(-1)b, in which Q is a symmetric 3 x 3 matrix and b a vector, both constructed from sums over the weighted vector pairs. Thus a direct solution for r is obtained; its elements are u, v and w, and may be used to construct the orthogonal matrix R as in Section 3.3.1.2.1. θ may be obtained directly from the magnitude of r.
If the requisite rotation is 180°, the matrix Q is singular and cannot be inverted. In this case any row or column of the adjoint of Q is a vector in the direction of the rotation axis. Normalizing this vector to unity, giving l, gives the requisite orthogonal matrix as R = 2ll^T − I.
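A sketch of this linear solution follows, assuming the convention r = tan(θ/2) l (the Gibbs vector, for which the identity X − x = r × (X + x) holds exactly for a pure rotation; this convention may differ from the r of Section 3.3.1.2.1 by a scale factor). The normal equations are accumulated directly, and the rotation is reconstructed from r by the Cayley form.

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

rng = np.random.default_rng(5)
l0 = rng.standard_normal(3)
l0 /= np.linalg.norm(l0)
r0 = np.tan(0.8 / 2) * l0                      # Gibbs vector for a 0.8 rad rotation
S0 = skew(r0)
I = np.eye(3)
R0 = np.linalg.inv(I - S0) @ (I + S0)          # Cayley form of the rotation

x = rng.standard_normal((10, 3))
X = x @ R0.T

# Normal equations for min_r sum_n |(X_n - x_n) - r x (X_n + x_n)|^2,
# using the identity r x s = -skew(s) r.
Q = np.zeros((3, 3))
b = np.zeros(3)
for xn, Xn in zip(x, X):
    A = skew(Xn + xn)
    Q += A.T @ A
    b -= A.T @ (Xn - xn)
r = np.linalg.solve(Q, b)

S = skew(r)
R = np.linalg.inv(I - S) @ (I + S)             # rotation reconstructed from r
theta = 2.0 * np.arctan(np.linalg.norm(r))     # rotation angle from |r|
```

Because tan(θ/2) diverges as θ approaches 180°, the vectors X_n + x_n then become parallel and Q becomes singular, which is the case treated in the paragraph above.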
Note that MacKay's residual E is quadratic in r. E therefore has one minimum and no maximum, and the minimum is reached on the first cycle of least squares. By contrast, the objective function E that is minimized in methods (i), (ii), (v) and (vii) has one minimum, one maximum and two saddle points in the space of the vector r, as shown in (vii).
It may be shown (Diamond, 1989) that if MacKay's solution vector r is denoted by r1, and that given by the other methods [except (iv)] by r2, then r1 and r2 are related through two matrices A and B which are real, symmetric and positive semi-definite. A is positive definite unless all the individual vector sums (X_n + x_n) are parallel, as can happen when the best rotation is 180°. Thus the MacKay method gives the same result as the other methods only in special cases, notably when X and x are exactly superposable.
(vii) Diamond's second method. This is closely related to MacKay's method, but uses a four-dimensional vector ρ with components λ, μ, ν and σ, in which λ, μ and ν are the direction cosines of the rotation axis multiplied by sin(θ/2) and σ is cos(θ/2). In terms of such a vector Diamond (1988) showed that E = E0 − 2ρ^T Pρ, in which E is the weighted sum of squares of coordinate differences, as before, E0 is its value before any rotation is applied and P is a symmetric 4 x 4 matrix constructed from the coordinates. The rotation matrix R corresponding to the vector ρ is then given by the last of the forms for R given in Section 3.3.1.2.1.
The minimum E is therefore E0 minus twice the largest eigenvalue of P, since the four-dimensional vector concerned is of unit length, and a stationary value of E occurs when this vector is any of the four eigenvectors of P. E thus has a maximum, a minimum and two saddle points, in general, and its value may be determined before any coordinates are transformed. Diamond also showed that the orientations giving these stationary values are related by the operations of 222 symmetry. Equivalent results have also been obtained by Kearsley (1989).
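The eigenproblem may be sketched as follows, using a widely used explicit form of a symmetric 4 x 4 matrix built from S[a, b] = sum_n (x_n)_a (X_n)_b with unit weights. This realization follows the Kearsley (1989) and related quaternion treatments rather than reproducing Diamond's P verbatim; the quaternion here carries its scalar part first.

```python
import numpy as np

rng = np.random.default_rng(6)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
R0 = Q if np.linalg.det(Q) > 0 else -Q
x = rng.standard_normal((9, 3))
X = x @ R0.T                      # exactly superposable sets: X_n = R0 x_n

S = x.T @ X                       # S[a, b] = sum_n (x_n)_a (X_n)_b
P = np.array([
    [S[0, 0] + S[1, 1] + S[2, 2], S[1, 2] - S[2, 1], S[2, 0] - S[0, 2], S[0, 1] - S[1, 0]],
    [S[1, 2] - S[2, 1], S[0, 0] - S[1, 1] - S[2, 2], S[0, 1] + S[1, 0], S[2, 0] + S[0, 2]],
    [S[2, 0] - S[0, 2], S[0, 1] + S[1, 0], -S[0, 0] + S[1, 1] - S[2, 2], S[1, 2] + S[2, 1]],
    [S[0, 1] - S[1, 0], S[2, 0] + S[0, 2], S[1, 2] + S[2, 1], -S[0, 0] - S[1, 1] + S[2, 2]],
])

w, V = np.linalg.eigh(P)
q = V[:, -1]                      # eigenvector of the largest eigenvalue: unit quaternion
s, a, b, c = q                    # scalar part first; q and -q give the same rotation
R = np.array([
    [1 - 2*(b*b + c*c), 2*(a*b - s*c), 2*(a*c + s*b)],
    [2*(a*b + s*c), 1 - 2*(a*a + c*c), 2*(b*c - s*a)],
    [2*(a*c - s*b), 2*(b*c + s*a), 1 - 2*(a*a + b*b)],
])
# In this realization the minimum residual is
# sum(|X|^2 + |x|^2) - 2*lambda_max, which vanishes for exact data.
```

The four eigenvectors of the 4 x 4 matrix give the four stationary orientations described above, and the residual at each is read off from the corresponding eigenvalue without transforming any coordinates.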
As an alternative to solving a 4 x 4 eigenproblem, Diamond also showed that the vector r, as in MacKay's solution, may be obtained by iterating a simple recurrence which has the property that if X and x are exactly superposable then the first estimate of r is exact and no iteration is necessary. If X and x are similar but not exactly superposable then a small number of iterations may be required to reach a stable r vector, though the inverse of the matrix Q is not required. As in MacKay's solution, the relevant matrix is singular at the end of the iteration if the required rotation is 180°, but the MacKay and Diamond methods both have the advantage that improper rotations are never generated by these means, and methods based on P and the four-dimensional vector rather than Q and r are trouble-free for 180° rotations. The iterative loop in this method does not require Rx to be redetermined on each cycle.
Finally, it may be shown that if λ1, λ2, λ3, λ4 are the eigenvalues of P arranged in descending order and (λ1 + λ4) is negative, then a closer superposition may be obtained by reversing the chirality of one of the vector sets, and the R matrix constructed from the eigenvector corresponding to λ4 optimally superimposes Rx onto −X, the enantiomer of X (Diamond, 1990b).