International Tables for Crystallography, Volume B: Reciprocal space. Edited by U. Shmueli. © International Union of Crystallography 2006
International Tables for Crystallography (2006). Vol. B. ch. 3.3, pp. 361-367
Section 3.3.1.2. Orthogonal (or rotation) matrices
MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, England
It is a basic requirement for any graphics or molecular-modelling system to be able to control and manipulate the orientation of the structures involved and this is achieved using orthogonal matrices which are the subject of these sections.
If a vector v is expressed in terms of its components resolved onto an axial set of vectors X, Y, Z which are of unit length, mutually perpendicular and right handed in the sense that $(\mathbf{X} \times \mathbf{Y}) \cdot \mathbf{Z} = +1$, and if these components are $v_j$, and if a second set of axes X′, Y′, Z′ is similarly established, with the same origin and chirality, and if v has components $v'_i$ on these axes, then $v'_i = \sum_j a_{ij} v_j$, in which $a_{ij}$ is the cosine of the angle between the ith primed axis and the jth unprimed axis. Evidently the elements $a_{ij}$ comprise a matrix R, such that any row represents one of the primed axial vectors, such as X′, expressed as components on the unprimed axes, and each column represents one of the unprimed axial vectors expressed as components on the primed axes. It follows that $\mathbf{R}^{T} = \mathbf{R}^{-1}$, since the elements of the product $\mathbf{R}\mathbf{R}^{T}$ are scalar products among perpendicular unit vectors.
A real matrix whose transpose equals its inverse is said to be orthogonal.
Since X, Y and Z can simultaneously be superimposed on X′, Y′ and Z′ without deformation or change of scale the relationship is one of rotation, and orthogonal matrices are often referred to as rotation matrices. The operation of replacing the vector v by Rv corresponds to rotating the axes from the unprimed to the primed set with v itself unchanged. Equally, the same operation corresponds to retaining fixed axes and rotating the vector in the opposite sense. The second interpretation is the one more frequently helpful since conceptually it corresponds more closely to rotational operations on objects, and it is primarily in this sense that the following is written.
If three vectors u, v and w form the edges of a parallelepiped, then its volume V is $V = \mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})$, and if these vectors are transformed by the matrix R as above, then the transformed volume V′ is $V' = \mathbf{R}\mathbf{u} \cdot (\mathbf{R}\mathbf{v} \times \mathbf{R}\mathbf{w}) = |\mathbf{R}|\,V$. But the determinant of R is given by $|\mathbf{R}| = (\mathbf{X}' \times \mathbf{Y}') \cdot \mathbf{Z}' = +1$, so that $V' = V$, and the determinant of R must therefore be +1 for a transformation which is a pure rotation. Nevertheless, orthogonal matrices with determinant −1 exist, though these do not describe a pure rotation. They may always be described as the product of a pure rotation and inversion through the origin and are referred to here as improper rotations. In what follows all references to orthogonal matrices refer to those with positive determinant only, unless stated otherwise.
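As a minimal numerical sketch of these properties (the function names are illustrative and not taken from the source), the following checks that a primitive rotation about Z is orthogonal with determinant +1 and preserves the volume of a parallelepiped, whereas the same rotation combined with inversion through the origin has determinant −1 and reverses the sign of the volume.

```python
import math

def rot_z(theta):
    """Right-handed rotation about Z by theta radians, acting on column vectors."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def apply(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def volume(u, v, w):
    """Signed volume u . (v x w) of the parallelepiped with edges u, v, w."""
    return det3([u, v, w])

def is_orthogonal(a, tol=1e-12):
    """True if a multiplied by its transpose is the identity within tol."""
    p = matmul(a, transpose(a))
    return all(abs(p[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(3) for j in range(3))

R = rot_z(math.radians(30.0))
improper = [[-x for x in row] for row in R]  # rotation times inversion

u, v, w = [1.0, 0.0, 0.0], [1.0, 2.0, 0.0], [0.5, 0.5, 3.0]
V = volume(u, v, w)
V_rot = volume(apply(R, u), apply(R, v), apply(R, w))
V_imp = volume(apply(improper, u), apply(improper, v), apply(improper, w))
```

Here `V_rot` agrees with `V` and `V_imp` with `-V`, to rounding error.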
An important general form of an orthogonal matrix in three dimensions was derived as equation (1.1.4.32) and is
$$\mathbf{R} = \begin{pmatrix} l^2 + (m^2 + n^2)\cos\theta & lm(1 - \cos\theta) - n\sin\theta & nl(1 - \cos\theta) + m\sin\theta \\ lm(1 - \cos\theta) + n\sin\theta & m^2 + (n^2 + l^2)\cos\theta & mn(1 - \cos\theta) - l\sin\theta \\ nl(1 - \cos\theta) - m\sin\theta & mn(1 - \cos\theta) + l\sin\theta & n^2 + (l^2 + m^2)\cos\theta \end{pmatrix}$$
or
$$R_{ij} = (1 - \cos\theta)\, l_i l_j + \delta_{ij}\cos\theta - \varepsilon_{ijk}\, l_k \sin\theta,$$
in which l, m and n are the direction cosines of the axis of rotation (which are the same when referred to either set of axes under either interpretation) and θ is the angle of rotation. In this form, and with R operating on column vectors on the right, the sign of θ is such that, when viewed along the rotation axis from the origin towards the point lmn, the object is rotated clockwise for positive θ with a fixed right-handed axial system. If, under the same viewing conditions, the axes are to be rotated clockwise through θ with the object fixed then the components of vectors in the object, on the new axes, are given by R with the same lmn and with θ negated. This is the transpose of R, and if R is constructed from a product, as below, then each factor matrix in the product must be transposed and their order reversed to achieve this. Note that if, for a given rotation, the viewing direction from the origin is reversed, l, m, n and θ are all reversed and the matrix is unchanged.
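The axis-angle form can be sketched in a few lines. The code below assumes the standard right-handed expression $R_{ij} = (1 - \cos\theta) l_i l_j + \delta_{ij}\cos\theta - \varepsilon_{ijk} l_k \sin\theta$ acting on column vectors; `rotation_matrix` and `apply` are illustrative names, not from the source.

```python
import math

def rotation_matrix(l, m, n, theta):
    """Rotation by theta (radians) about the unit axis (l, m, n).

    Assumes l*l + m*m + n*n == 1 and a right-handed sense for positive theta.
    """
    c, s = math.cos(theta), math.sin(theta)
    k = 1.0 - c
    return [
        [l * l * k + c,     l * m * k - n * s, n * l * k + m * s],
        [l * m * k + n * s, m * m * k + c,     m * n * k - l * s],
        [n * l * k - m * s, m * n * k + l * s, n * n * k + c],
    ]

def apply(a, v):
    """Multiply the 3x3 matrix a into the column vector v."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

# A quarter turn about Z carries the X axis onto Y (up to rounding).
quarter_turn = rotation_matrix(0.0, 0.0, 1.0, math.pi / 2.0)
x_rotated = apply(quarter_turn, [1.0, 0.0, 0.0])
```

Negating θ gives the transpose, and reversing l, m, n and θ together leaves the matrix unchanged, as the text states; both properties are easy to confirm numerically with this function.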
Any rotation about a reference axis such that two of the direction cosines are zero is termed a primitive rotation, and it is frequently a requirement to generate or to interpret a general rotation as a product of primitive rotations.
A second important general form is based on Eulerian angles and is the product of three such primitives. It is
$$\mathbf{R} = \begin{pmatrix} \cos\varphi_1 & -\sin\varphi_1 & 0 \\ \sin\varphi_1 & \cos\varphi_1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cos\varphi_2 & 0 & \sin\varphi_2 \\ 0 & 1 & 0 \\ -\sin\varphi_2 & 0 & \cos\varphi_2 \end{pmatrix} \begin{pmatrix} \cos\varphi_3 & -\sin\varphi_3 & 0 \\ \sin\varphi_3 & \cos\varphi_3 & 0 \\ 0 & 0 & 1 \end{pmatrix},$$
which is commonly employed in four-circle diffractometers for which $\varphi_1 = \omega$, $\varphi_2 = \chi$ and $\varphi_3 = \varphi$. In terms of the fixed-axes–moving-object conceptualization this corresponds to a rotation $\varphi_3$ about Z followed by $\varphi_2$ about Y followed by $\varphi_1$ about Z. In the familiar diffractometer example, when $\chi = 0$ the φ and ω axes are both vertical and equivalent. If φ is altered first, then the χ axis is still in the direction of a fixed Y axis, but if ω is altered first it is not. Since all angles are to be rotations about fixed axes to describe a rotating object it follows that it is φ rather than ω which corresponds to $\varphi_3$. In general, when rotating parts are mounted on rotating parts the rotation closest to the moved object must be applied first, forming the right-most factor in any multiple transformation, with the rotation closest to the fixed part as the left-most factor, assuming data supplied as column vectors on the right.
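The ordering rule can be illustrated with a short sketch. It assumes the Eulerian product is $R_Z(\varphi_1) R_Y(\varphi_2) R_Z(\varphi_3)$ with right-handed primitives acting on column vectors; `euler_zyz` and `diffractometer` are hypothetical helper names.

```python
import math

def rot_y(a):
    """Primitive right-handed rotation about Y."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(a):
    """Primitive right-handed rotation about Z."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_zyz(phi1, phi2, phi3):
    """R = Rz(phi1) Ry(phi2) Rz(phi3); the right-most factor acts on the object first."""
    return matmul(rot_z(phi1), matmul(rot_y(phi2), rot_z(phi3)))

def diffractometer(omega, chi, phi):
    """Assumed mapping: omega is closest to the fixed part, phi closest to the crystal."""
    return euler_zyz(omega, chi, phi)

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(3) for j in range(3))

# With chi = 0 the omega and phi axes coincide, so only their sum matters.
same_axis = close(diffractometer(0.3, 0.0, 0.4), rot_z(0.7))
# With chi != 0, exchanging the omega and phi settings gives a different rotation.
order_matters = not close(diffractometer(0.3, 0.5, 0.4), diffractometer(0.4, 0.5, 0.3))
```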
Given an orthogonal matrix, in either numerical or analytical form, it may be required to discover θ and the axis of rotation, or to factorize it as a product of primitives. Writing $a_{ij}$ for the elements of R, we see from the first form that the vector consisting of the antisymmetric part of R, $(a_{32} - a_{23},\ a_{13} - a_{31},\ a_{21} - a_{12})$, has elements $2\sin\theta$ times the direction cosines l, m, n, which establishes the direction immediately, and normalization using $l^2 + m^2 + n^2 = 1$ determines $\sin\theta$. Furthermore, the trace is $1 + 2\cos\theta$ so that the quadrant of θ is also fixed. This method fails, however, if the matrix is symmetrical, which occurs if $\theta = \pi$ (or, trivially, if $\theta = 0$, when the axis is undefined). In this case only the direction of the axis is required, which is given by $l : m : n = a_{23}^{-1} : a_{31}^{-1} : a_{12}^{-1}$ for non-zero elements, or $l = \pm\sqrt{(a_{11} + 1)/2}$ etc., with the signs chosen to satisfy $a_{12} = 2lm$ etc.
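A sketch of this recovery procedure follows. It assumes the right-handed axis-angle form (reproduced in `rotation_matrix` for testing) and resolves the sign ambiguity by taking $\sin\theta \ge 0$, so that $0 \le \theta \le \pi$; the function names and the tolerance are illustrative choices.

```python
import math

def rotation_matrix(l, m, n, theta):
    """Axis-angle matrix (right-handed, column vectors); an assumed standard form."""
    c, s = math.cos(theta), math.sin(theta)
    k = 1.0 - c
    return [[l * l * k + c,     l * m * k - n * s, n * l * k + m * s],
            [l * m * k + n * s, m * m * k + c,     m * n * k - l * s],
            [n * l * k - m * s, m * n * k + l * s, n * n * k + c]]

def axis_angle(R, tol=1e-9):
    """Return ((l, m, n), theta) with 0 <= theta <= pi for a proper rotation R."""
    # Antisymmetric part of R: 2 sin(theta) times (l, m, n).
    ax = (R[2][1] - R[1][2], R[0][2] - R[2][0], R[1][0] - R[0][1])
    two_sin = math.sqrt(sum(a * a for a in ax))          # sin(theta) taken >= 0
    cos_t = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0    # trace = 1 + 2 cos(theta)
    theta = math.atan2(two_sin / 2.0, cos_t)
    if two_sin > tol:
        return tuple(a / two_sin for a in ax), theta
    if cos_t > 0.0:
        return (0.0, 0.0, 1.0), 0.0                      # identity: axis is arbitrary
    # Symmetric case theta = pi, where R = 2 l l^T - I.
    mags = [math.sqrt(max(0.0, (R[i][i] + 1.0) / 2.0)) for i in range(3)]
    i = mags.index(max(mags))                            # largest component, taken positive
    axis = [R[i][j] / (2.0 * mags[i]) for j in range(3)] # from a_ij = 2 l_i l_j
    axis[i] = mags[i]
    return tuple(axis), math.pi
```

Because the matrix is unchanged when l, m, n and θ are all reversed, a rotation built with a negative angle is returned with the axis reversed and a positive angle.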
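The Eulerian form $\mathbf{R} = \mathbf{R}_Z(\varphi_1)\mathbf{R}_Y(\varphi_2)\mathbf{R}_Z(\varphi_3)$ may be factorized by noting that, with $a_{ij}$ the elements of R, $\varphi_1 = \arctan(a_{23}/a_{13})$, $\varphi_2 = \arccos(a_{33})$ and $\varphi_3 = \arctan(a_{32}/(-a_{31}))$. There is then freedom to choose the sign of $\sin\varphi_2$, but the choice then fixes the quadrants of $\varphi_1$ and $\varphi_3$ through the elements in the last row and column, and the primitives may then be constructed. These expressions for $\varphi_1$ and $\varphi_3$ fail if $\sin\varphi_2 = 0$, in which case only the combination $\varphi_1 + \varphi_3$ (for $\varphi_2 = 0$, when the rotation reduces to a primitive rotation about Z) or $\varphi_1 - \varphi_3$ (for $\varphi_2 = \pi$) is determinate.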
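The factorization can be sketched as below, assuming the product $R_Z(\varphi_1) R_Y(\varphi_2) R_Z(\varphi_3)$ with right-handed primitives. The name `factorize_zyz`, the `positive_sin` switch and the convention of returning $\varphi_3 = 0$ in the degenerate cases are assumptions made for illustration.

```python
import math

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_zyz(phi1, phi2, phi3):
    return matmul(rot_z(phi1), matmul(rot_y(phi2), rot_z(phi3)))

def factorize_zyz(R, positive_sin=True, tol=1e-7):
    """Return (phi1, phi2, phi3) such that euler_zyz(phi1, phi2, phi3) reproduces R."""
    phi2 = math.acos(max(-1.0, min(1.0, R[2][2])))   # a33 = cos(phi2)
    if abs(math.sin(phi2)) < tol:
        if R[2][2] > 0.0:
            # phi2 = 0: only phi1 + phi3 is determinate; report it as phi1.
            return math.atan2(R[1][0], R[0][0]), 0.0, 0.0
        # phi2 = pi: only phi1 - phi3 is determinate; report it as phi1.
        return math.atan2(-R[1][0], R[1][1]), math.pi, 0.0
    sign = 1.0 if positive_sin else -1.0             # chosen sign of sin(phi2)
    phi2 *= sign
    # Last column: (cos phi1 sin phi2, sin phi1 sin phi2, cos phi2).
    phi1 = math.atan2(sign * R[1][2], sign * R[0][2])
    # Last row: (-sin phi2 cos phi3, sin phi2 sin phi3, cos phi2).
    phi3 = math.atan2(sign * R[2][1], -sign * R[2][0])
    return phi1, phi2, phi3
```

Both sign choices for $\sin\varphi_2$ yield angle triples that rebuild the same matrix, which gives a simple round-trip check.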
Eulerian angles are unlikely to be the best choice of primitive angles unless they are directly related to the parameters of a system, as with the diffractometer. It is often more important that the changes to primitive angles should be quasi-linearly related to θ for any small rotations, which is not the case with Eulerian angles when the required rotation axis is close to the X axis. In such a case linearized techniques for solving for the primitive angles will fail. Furthermore, if the required rotation is about Z, only the sum of the two Z angles, $\varphi_1 + \varphi_3$, is determinate.
Quasi-linear relationships between θ and the primitive rotations arise if the primitives are one each about X, Y and Z. Any order of the three factors may be chosen, but the choice must then be adhered to since these factors do not commute. For sufficiently small rotations the primitive rotations are then $l\theta$, $m\theta$ and $n\theta$, whilst for larger θ linearized iterative techniques for finding the primitive rotations are likely to be convergent and well conditioned.
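The small-angle statement can be checked numerically. The sketch below assumes the factor order $R_X R_Y R_Z$ (any fixed order would serve) and compares the product of primitives through lθ, mθ and nθ with the exact axis-angle matrix; the discrepancy should shrink roughly as θ². All names are illustrative.

```python
import math

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_matrix(l, m, n, theta):
    """Exact axis-angle matrix (right-handed, column vectors)."""
    c, s = math.cos(theta), math.sin(theta)
    k = 1.0 - c
    return [[l * l * k + c,     l * m * k - n * s, n * l * k + m * s],
            [l * m * k + n * s, m * m * k + c,     m * n * k - l * s],
            [n * l * k - m * s, m * n * k + l * s, n * n * k + c]]

def xyz_error(l, m, n, theta):
    """Largest element-wise difference between the primitive product and the exact matrix."""
    exact = rotation_matrix(l, m, n, theta)
    approx = matmul(rot_x(l * theta), matmul(rot_y(m * theta), rot_z(n * theta)))
    return max(abs(exact[i][j] - approx[i][j]) for i in range(3) for j in range(3))

a = 1.0 / math.sqrt(3.0)            # axis along [111]
err_small = xyz_error(a, a, a, 1e-3)
err_larger = xyz_error(a, a, a, 1e-2)
```

For larger θ the primitive angles would need iterative refinement, which this sketch does not attempt.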
The three-dimensional space of the primitive angles $\varphi_1$, $\varphi_2$ and $\varphi_3$ in either case is non-linearly related to θ. In the Eulerian case the worst non-linearities occur at the origin of φ-space. Equally severe non-linearities occur in the quasi-linear case also but are 90° away from the origin and less likely to be troublesome.
Neither of the foregoing general forms of orthogonal matrix has ideally convenient properties. The first is inconvenient because it uses four non-equivalent variables l, m, n and θ, with a linking equation involving l, m and n, so that they cannot be treated as independent variables for analytical purposes. The second form (the product of primitives) is not ideal because the three angles, though independent, are not equivalent, the non-equivalence arising from the non-commutation of the primitive factors. In the remainder of this section we give two further forms of orthogonal matrix which each use three variables which are independent and strictly equivalent, and a third form using four whose squares sum to unity.
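The first of these is based on the diagonal and uses the three independent variables p, q, r, from which we construct the auxiliary variables
$$P = \pm\sqrt{1 + p - q - r},\quad Q = \pm\sqrt{1 - p + q - r},\quad R = \pm\sqrt{1 - p - q + r},\quad S = \pm\sqrt{1 + p + q + r};$$
then
$$\mathbf{R} = \frac{1}{2}\begin{pmatrix} 2p & PQ - RS & RP + QS \\ PQ + RS & 2q & QR - PS \\ RP - QS & QR + PS & 2r \end{pmatrix}$$
is orthogonal with positive determinant for any of the sixteen sign combinations. The signs of P, Q, R and S are, respectively, the signs of the direction cosines of the rotation axis and of $\sin\theta$. Using also $T = \sqrt{3 - p - q - r}$, which may be deemed positive without loss of generality, $l = P/T$, $m = Q/T$, $n = R/T$, $\sin\theta = ST/2$ and $\cos\theta = (p + q + r - 1)/2$.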
Although p, q and r are independent, the point [pqr] is bound, by the requirement that P, Q, R and S be real, to lie within a tetrahedron whose vertices are the points [111], $[1\bar{1}\bar{1}]$, $[\bar{1}1\bar{1}]$ and $[\bar{1}\bar{1}1]$, corresponding to the identity and to 180° rotations about each of the axes. The facts that the identity occurs at a vertex of the feasible region and that $\theta^2$, rather than θ, is linear on p, q and r in this vicinity make this form suitable only for substantial rotations.
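The diagonal-based form can be explored with the sketch below. It assumes auxiliary variables $P = \pm\sqrt{1+p-q-r}$, $Q = \pm\sqrt{1-p+q-r}$, $R = \pm\sqrt{1-p-q+r}$, $S = \pm\sqrt{1+p+q+r}$ and off-diagonal elements of the form $(PQ \mp RS)/2$; this parametrization is a reconstruction consistent with the axis-angle matrix and should be read as an assumption. The names are illustrative.

```python
import itertools
import math

def from_diagonal(p, q, r, signs=(1, 1, 1, 1)):
    """Rotation matrix with diagonal (p, q, r) for one choice of the four signs."""
    sP, sQ, sR, sS = signs
    P = sP * math.sqrt(max(0.0, 1.0 + p - q - r))
    Q = sQ * math.sqrt(max(0.0, 1.0 - p + q - r))
    Rv = sR * math.sqrt(max(0.0, 1.0 - p - q + r))   # scalar R of the text
    S = sS * math.sqrt(max(0.0, 1.0 + p + q + r))
    return [[p,                  (P * Q - Rv * S) / 2, (Rv * P + Q * S) / 2],
            [(P * Q + Rv * S) / 2, q,                  (Q * Rv - P * S) / 2],
            [(Rv * P - Q * S) / 2, (Q * Rv + P * S) / 2, r]]

def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def is_proper_rotation(a, tol=1e-9):
    """Orthogonal (a a^T = I) with determinant +1, within tol."""
    for i in range(3):
        for j in range(3):
            dot = sum(a[i][k] * a[j][k] for k in range(3))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return abs(det3(a) - 1.0) < tol

# A point [pqr] inside the feasible tetrahedron.
p, q, r = 0.5, 0.2, -0.1
all_sixteen_ok = all(is_proper_rotation(from_diagonal(p, q, r, s))
                     for s in itertools.product((1, -1), repeat=4))
```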
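The second form consists in defining a rotation vector r with components u, v, w such that $u = lt$, $v = mt$, $w = nt$, with $t = \tan(\theta/2)$ and $l^2 + m^2 + n^2 = 1$. Then the matrix
$$\mathbf{R} = \frac{1}{1 + u^2 + v^2 + w^2}\begin{pmatrix} 1 + u^2 - v^2 - w^2 & 2(uv - w) & 2(uw + v) \\ 2(uv + w) & 1 - u^2 + v^2 - w^2 & 2(vw - u) \\ 2(uw - v) & 2(vw + u) & 1 - u^2 - v^2 + w^2 \end{pmatrix}$$
is orthogonal and the variables u, v, w are independent, equivalent and unbounded, and, unlike the previous form, small rotations are quasi-linear on these variables. As examples, $\mathbf{r} = [1, 0, 0]$ gives 90° about X, and $\mathbf{r} = [1, 1, 1]$ gives 120° about [111].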
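R then transforms a vector d according to
$$\mathbf{R}\mathbf{d} = \mathbf{d} + \frac{2}{1 + r^2}\left[\mathbf{r} \times \mathbf{d} + \mathbf{r} \times (\mathbf{r} \times \mathbf{d})\right], \qquad r^2 = \mathbf{r} \cdot \mathbf{r}.$$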
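A sketch of the rotation-vector form follows, assuming $\mathbf{r} = (l, m, n)\tan(\theta/2)$, the corresponding matrix with denominator $1 + u^2 + v^2 + w^2$, and the matrix-free expression $\mathbf{d} + 2[\mathbf{r} \times \mathbf{d} + \mathbf{r} \times (\mathbf{r} \times \mathbf{d})]/(1 + r^2)$. The function names are illustrative.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rotation_from_vector(u, v, w):
    """Matrix for the rotation vector r = (u, v, w) = (l, m, n) tan(theta/2)."""
    d = 1.0 + u * u + v * v + w * w
    return [[(1 + u*u - v*v - w*w) / d, 2 * (u*v - w) / d,         2 * (u*w + v) / d],
            [2 * (u*v + w) / d,         (1 - u*u + v*v - w*w) / d, 2 * (v*w - u) / d],
            [2 * (u*w - v) / d,         2 * (v*w + u) / d,         (1 - u*u - v*v + w*w) / d]]

def rotate(r, d):
    """Rotate the vector d by the rotation vector r without forming a matrix."""
    rd = cross(r, d)
    rrd = cross(r, rd)
    f = 2.0 / (1.0 + dot(r, r))
    return [d[i] + f * (rd[i] + rrd[i]) for i in range(3)]

# r = [1, 0, 0] is 90 degrees about X, carrying Y onto Z.
y_to_z = rotate([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
# r = [1, 1, 1] is 120 degrees about [111], carrying X onto Y.
x_to_y = rotate([1.0, 1.0, 1.0], [1.0, 0.0, 0.0])
```

A rotation of exactly 180° has infinite tan(θ/2) and so cannot be represented by finite u, v, w in this form.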
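Multiplying two such matrices together allows us to establish the manner in which the rotation vectors $\mathbf{r}_1$ and $\mathbf{r}_2$ combine:
$$\mathbf{r} = \frac{\mathbf{r}_1 + \mathbf{r}_2 - \mathbf{r}_1 \times \mathbf{r}_2}{1 - \mathbf{r}_1 \cdot \mathbf{r}_2}$$
for a rotation $\mathbf{r}_1$ followed by $\mathbf{r}_2$, so that rotations expressed in terms of rotation angles and axes may be compounded into a single such rotation without the need to form and decompose a product matrix.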
Note that if $\mathbf{r}_1$ and $\mathbf{r}_2$ are parallel this reduces to the formula for the tangent of the sum of two angles, and that if $\mathbf{r}_1 \cdot \mathbf{r}_2 = 1$ the combined rotation is always 180°. Note, too, that reversing the order of application of the rotations reverses only the vector product.
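The combination rule can be checked against an explicit matrix product. The sketch assumes rotation vectors of magnitude tan(θ/2) and the rule $\mathbf{r} = (\mathbf{r}_1 + \mathbf{r}_2 - \mathbf{r}_1 \times \mathbf{r}_2)/(1 - \mathbf{r}_1 \cdot \mathbf{r}_2)$ for $\mathbf{r}_1$ applied first; `combine` is a hypothetical name.

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def combine(r1, r2):
    """Rotation vector equivalent to r1 followed by r2.

    Raises ZeroDivisionError when r1 . r2 == 1, i.e. a combined rotation of 180 degrees.
    """
    c = cross(r1, r2)
    den = 1.0 - dot(r1, r2)
    return [(r1[i] + r2[i] - c[i]) / den for i in range(3)]

def rotation_from_vector(u, v, w):
    d = 1.0 + u * u + v * v + w * w
    return [[(1 + u*u - v*v - w*w) / d, 2 * (u*v - w) / d,         2 * (u*w + v) / d],
            [2 * (u*v + w) / d,         (1 - u*u + v*v - w*w) / d, 2 * (v*w - u) / d],
            [2 * (u*w - v) / d,         2 * (v*w + u) / d,         (1 - u*u - v*v + w*w) / d]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Parallel case: 30 degrees then 45 degrees about Z gives tan(37.5 degrees).
t1, t2 = math.tan(math.radians(15.0)), math.tan(math.radians(22.5))
parallel = combine([0.0, 0.0, t1], [0.0, 0.0, t2])

# General case: compare with the product of the two matrices (second rotation on the left).
r1, r2 = [0.2, 0.0, 0.1], [0.0, 0.5, -0.3]
direct = rotation_from_vector(*combine(r1, r2))
product = matmul(rotation_from_vector(*r2), rotation_from_vector(*r1))
```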
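If three rotations $\mathbf{r}_1$, $\mathbf{r}_2$ and $\mathbf{r}_3$ are applied successively, $\mathbf{r}_1$ first, then their combined rotation is
$$\mathbf{r} = \frac{\mathbf{r}_1 + \mathbf{r}_2 + \mathbf{r}_3 - (\mathbf{r}_1 \cdot \mathbf{r}_2)\mathbf{r}_3 + (\mathbf{r}_1 \cdot \mathbf{r}_3)\mathbf{r}_2 - (\mathbf{r}_2 \cdot \mathbf{r}_3)\mathbf{r}_1 - \mathbf{r}_1 \times \mathbf{r}_2 - \mathbf{r}_1 \times \mathbf{r}_3 - \mathbf{r}_2 \times \mathbf{r}_3}{1 - \mathbf{r}_1 \cdot \mathbf{r}_2 - \mathbf{r}_2 \cdot \mathbf{r}_3 - \mathbf{r}_3 \cdot \mathbf{r}_1 + (\mathbf{r}_1 \times \mathbf{r}_2) \cdot \mathbf{r}_3}.$$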
Note the irregular pattern of signs in the numerator.
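The three-rotation expression follows from applying the two-rotation rule twice, which the sketch below verifies numerically. It assumes rotation vectors of magnitude tan(θ/2) and the pairwise rule $(\mathbf{r}_1 + \mathbf{r}_2 - \mathbf{r}_1 \times \mathbf{r}_2)/(1 - \mathbf{r}_1 \cdot \mathbf{r}_2)$; the single positive scalar-product term, $(\mathbf{r}_1 \cdot \mathbf{r}_3)\mathbf{r}_2$, is the irregular sign. Function names are illustrative.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def combine(r1, r2):
    """Rotation vector for r1 followed by r2."""
    c = cross(r1, r2)
    den = 1.0 - dot(r1, r2)
    return [(r1[i] + r2[i] - c[i]) / den for i in range(3)]

def combine3(r1, r2, r3):
    """Rotation vector for r1, then r2, then r3, in closed form."""
    d12, d13, d23 = dot(r1, r2), dot(r1, r3), dot(r2, r3)
    c12, c13, c23 = cross(r1, r2), cross(r1, r3), cross(r2, r3)
    den = 1.0 - d12 - d23 - d13 + dot(c12, r3)
    return [(r1[i] + r2[i] + r3[i]
             - d12 * r3[i] + d13 * r2[i] - d23 * r1[i]   # note the lone plus sign
             - c12[i] - c13[i] - c23[i]) / den
            for i in range(3)]

r1, r2, r3 = [0.2, 0.1, -0.3], [-0.4, 0.5, 0.1], [0.3, -0.2, 0.6]
closed_form = combine3(r1, r2, r3)
stepwise = combine(combine(r1, r2), r3)
```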
Similar ideas, using a vector of magnitude , are developed in Aharonov et al. (1977).
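The third form of orthogonal matrix uses four variables, λ, μ, ν and σ, which comprise a four-dimensional vector $\boldsymbol{\rho}$, such that $\lambda = ls$, $\mu = ms$, $\nu = ns$ and $\sigma = \cos(\theta/2)$, with $s = \sin(\theta/2)$ and $l^2 + m^2 + n^2 = 1$, so that $\lambda^2 + \mu^2 + \nu^2 + \sigma^2 = 1$. In terms of these variables
$$\mathbf{R} = \begin{pmatrix} \lambda^2 - \mu^2 - \nu^2 + \sigma^2 & 2(\lambda\mu - \nu\sigma) & 2(\nu\lambda + \mu\sigma) \\ 2(\lambda\mu + \nu\sigma) & -\lambda^2 + \mu^2 - \nu^2 + \sigma^2 & 2(\mu\nu - \lambda\sigma) \\ 2(\nu\lambda - \mu\sigma) & 2(\mu\nu + \lambda\sigma) & -\lambda^2 - \mu^2 + \nu^2 + \sigma^2 \end{pmatrix}.$$
Two further matrices S and T may be defined (Diamond, 1988),
$$\mathbf{S} = \begin{pmatrix} -\sigma & \nu & -\mu & \lambda \\ -\nu & -\sigma & \lambda & \mu \\ \mu & -\lambda & -\sigma & \nu \\ \lambda & \mu & \nu & \sigma \end{pmatrix}, \qquad \mathbf{T} = \begin{pmatrix} \sigma & -\nu & \mu & \lambda \\ \nu & \sigma & -\lambda & \mu \\ -\mu & \lambda & \sigma & \nu \\ -\lambda & -\mu & -\nu & \sigma \end{pmatrix},$$
which are themselves orthogonal (though S has determinant −1) and which have the property that
$$\mathbf{S}^2 = \begin{pmatrix} \mathbf{R} & \mathbf{0} \\ \mathbf{0}^{T} & 1 \end{pmatrix},$$
so that, for example, if homogeneous coordinates are being employed (Section 3.3.1.1.2) $\mathbf{S}^2 (x, y, z, w)^{T}$ is a rotation of (x, y, z, w) through the angle θ about the axis (l, m, n). With suitably pipelined hardware this forms an efficient means of applying rotations since the 'overhead' of establishing S is so trivial.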
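The relation between the four-variable form and the 4 × 4 matrix S can be checked numerically. The sketch assumes $(\lambda, \mu, \nu, \sigma) = (l s, m s, n s, \cos(\theta/2))$ with $s = \sin(\theta/2)$, and uses one arrangement of S whose square has R in its upper-left block; the element layout of `S_matrix` is a reconstruction and should be treated as an assumption rather than a transcription of Diamond (1988).

```python
import math

def rho(l, m, n, theta):
    """Four-vector (lambda, mu, nu, sigma) for rotation theta about unit axis (l, m, n)."""
    s = math.sin(theta / 2.0)
    return (l * s, m * s, n * s, math.cos(theta / 2.0))

def R_matrix(lam, mu, nu, sig):
    """3x3 rotation matrix in terms of the four variables."""
    return [
        [lam*lam - mu*mu - nu*nu + sig*sig, 2 * (lam*mu - nu*sig), 2 * (nu*lam + mu*sig)],
        [2 * (lam*mu + nu*sig), -lam*lam + mu*mu - nu*nu + sig*sig, 2 * (mu*nu - lam*sig)],
        [2 * (nu*lam - mu*sig), 2 * (mu*nu + lam*sig), -lam*lam - mu*mu + nu*nu + sig*sig],
    ]

def S_matrix(lam, mu, nu, sig):
    """Assumed 4x4 form of S (orthogonal, determinant -1)."""
    return [[-sig,  nu,  -mu,  lam],
            [-nu,  -sig,  lam, mu],
            [mu,   -lam, -sig, nu],
            [lam,   mu,   nu,  sig]]

def matmul4(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

q = rho(1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0, 1.0)
R = R_matrix(*q)
S = S_matrix(*q)
S2 = matmul4(S, S)   # expected: R in the upper-left 3x3 block, 1 at the lower right
```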
T has the property that the rotation vector arising from a concatenation of n rotations is
$$\boldsymbol{\rho} = \mathbf{T}_n \mathbf{T}_{n-1} \cdots \mathbf{T}_1 \boldsymbol{\rho}_0,$$
in which $\boldsymbol{\rho}_0$ is the vector (0, 0, 0, 1) which defines a null rotation. This equation may be used as a basis for factorizing a given rotation into a concatenation of rotations about designated axes (Diamond, 1990a).
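A sketch of this concatenation is given below. It assumes $(\lambda, \mu, \nu, \sigma) = (l, m, n)\sin(\theta/2)$ with $\sigma = \cos(\theta/2)$ and takes T to be the 4 × 4 matrix that left-multiplies a four-vector by the rotation it represents; the element layout of `T_matrix` is a reconstruction, so treat it as an assumption.

```python
import math

def rho(l, m, n, theta):
    """Four-vector (lambda, mu, nu, sigma) for rotation theta about unit axis (l, m, n)."""
    s = math.sin(theta / 2.0)
    return (l * s, m * s, n * s, math.cos(theta / 2.0))

def T_matrix(lam, mu, nu, sig):
    """Assumed 4x4 form of T; its last column is the four-vector itself."""
    return [[sig,  -nu,   mu,  lam],
            [nu,    sig, -lam, mu],
            [-mu,   lam,  sig, nu],
            [-lam, -mu,  -nu,  sig]]

def matvec4(a, v):
    return [sum(a[i][k] * v[k] for k in range(4)) for i in range(4)]

def concatenate(rhos):
    """Four-vector of the combined rotation; rhos are listed in order of application."""
    vec = [0.0, 0.0, 0.0, 1.0]            # rho_0, the null rotation
    for r in rhos:                        # T_1 acts first, T_n last
        vec = matvec4(T_matrix(*r), vec)
    return vec

# 90 degrees about X followed by 90 degrees about Y.
combined = concatenate([rho(1.0, 0.0, 0.0, math.pi / 2.0),
                        rho(0.0, 1.0, 0.0, math.pi / 2.0)])
```

For this example the result corresponds to a 120° rotation about the direction [1, 1, −1], i.e. the four-vector (½, ½, −½, ½).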
Finally, an exact rotation of the vector d may be obtained without using matrices at all by writing
$$\mathbf{d}' = \sum_{n=0}^{\infty} \mathbf{d}_n,$$
in which $\mathbf{d}_n = \frac{1}{n}(\boldsymbol{\theta} \times \mathbf{d}_{n-1})$ and $\mathbf{d}_0$ is the initial position which is to be rotated. Here $\boldsymbol{\theta}$ is a vector with direction cosines l, m and n, and magnitude equal to the required rotation angle in radians (Diamond, 1966). This method is particularly efficient when $|\boldsymbol{\theta}| \ll 1$ or when the number of vectors to be transformed is small since the overhead of establishing R is eliminated and the process is simple to program. It is the three-dimensional analogue of the power series for sin θ and cos θ and has the same convergence properties.
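The series is indeed simple to program, as the following sketch suggests. It assumes the recurrence $\mathbf{d}_n = (\boldsymbol{\theta} \times \mathbf{d}_{n-1})/n$ summed from n = 0; the stopping tolerance and the cap on the number of terms are illustrative choices, not part of the source.

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def rotate_series(theta_vec, d0, tol=1e-16, max_terms=200):
    """Rotate d0 about theta_vec by |theta_vec| radians using the vector power series."""
    total = list(d0)
    term = list(d0)
    for n in range(1, max_terms + 1):
        term = [x / n for x in cross(theta_vec, term)]   # d_n from d_(n-1)
        total = [t + x for t, x in zip(total, term)]
        if max(abs(x) for x in term) < tol:              # remaining terms are negligible
            break
    return total

# A quarter turn about Z should carry X onto Y.
quarter = rotate_series([0.0, 0.0, math.pi / 2.0], [1.0, 0.0, 0.0])
```

For small angles only a handful of terms are needed, which is where the method is most economical.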
Given the coordinates of a molecular fragment it is often a requirement to relate the fragment to its image in some standard orientation by a transformation which may be required to be a pure rotation, or may be required to be a combination of rotation and strain. Of the methods reviewed in this section all except (iv) are concerned with pure rotation, ignoring any strain that may be present, and give the best rigid-body superposition. In all these methods, unless inhomogeneous strain is being considered, the best possible superposition is obtained if the centroids of the two images are first brought into coincidence by translation and treated as the origin.
Methods (i) to (v) seek transformations which perform the superposition and impose on these, in various ways, the requirements of orthogonality for the rotational part. All these methods therefore need some defence against indeterminacy that arises in the general transformation if one or both of the fragments is planar, and, if improper rotations are to be excluded, need a defence against these also if the fragment and its image are of opposite chirality. Methods (vi) and (vii) pay no attention to the general transformation and work with variables which are intrinsically rotational in character, and always produce an orthogonal transformation with positive determinant, with no degeneracy arising from planar fragments which need no special attention. Even collinear atoms cause no problem, the superposition being performed correctly but with an arbitrary rotation about the length of the line being present in the result. These methods are therefore to be preferred over the earlier ones unless the purpose of the operation is to detect differences of chirality, although this, too, can be detected with a simple test.
In this review we adopt the same notation for all the methods which, unavoidably, means that symbols are used in ways which differ from the original publications. We use the symbol x for the vector set which is to be rotated and X for the vector set whose orientation is not to be altered, and write the residuals as and, by choice of origin, for weights W. The quadratic residual to be minimized is and we define the matrix and use l for the direction cosines of the rotation axis.
There are several ways of deriving a strictly orthogonal matrix from a given approximately orthogonal matrix, among them the following.
If R is the orthogonal matrix given in Section 3.3.1.2.1 in terms of the direction cosines l, m and n of the axis of rotation, then it is clear that (l, m, n) is an eigenvector of R with eigenvalue unity because
$$\mathbf{R}\begin{pmatrix} l \\ m \\ n \end{pmatrix} = \begin{pmatrix} l \\ m \\ n \end{pmatrix}.$$
Consideration of the determinant $|\mathbf{R} - \lambda\mathbf{I}| = 0$ shows that the sum of the three eigenvalues is $1 + 2\cos\theta$ and that their product is unity. Hence the three eigenvalues are 1, $e^{i\theta}$ and $e^{-i\theta}$. Since R is real, its product with any real vector is also real, yet its product with an eigenvector must, in general, be complex. Thus the eigenvectors must themselves be complex.
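The eigenvalue statement can be confirmed numerically with complex arithmetic. The sketch assumes the right-handed axis-angle matrix and checks that the characteristic determinant vanishes at 1 and at $e^{\pm i\theta}$; `char_det` is an illustrative name.

```python
import cmath
import math

def rotation_matrix(l, m, n, theta):
    """Axis-angle matrix (right-handed, column vectors)."""
    c, s = math.cos(theta), math.sin(theta)
    k = 1.0 - c
    return [[l * l * k + c,     l * m * k - n * s, n * l * k + m * s],
            [l * m * k + n * s, m * m * k + c,     m * n * k - l * s],
            [n * l * k - m * s, m * n * k + l * s, n * n * k + c]]

def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def char_det(R, lam):
    """Determinant of R - lam*I for a real or complex lam."""
    shifted = [[R[i][j] - (lam if i == j else 0.0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)

theta = 0.7
R = rotation_matrix(1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0, theta)
eigenvalues = [1.0, cmath.exp(1j * theta), cmath.exp(-1j * theta)]
residuals = [abs(char_det(R, lam)) for lam in eigenvalues]   # each should be ~0
trace = R[0][0] + R[1][1] + R[2][2]                          # should equal 1 + 2 cos(theta)
```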
The remaining two eigenvectors u may be found using the results of Section 3.3.1.2.1 (q.v.) according to which is solved by any vector of the form for any real vector v, where l is the normalized axis vector, , , . Eigenvectors for the two eigenvalues may have unrelated v vectors though the sign choices are coupled. If the vector v is rotated about l through an angle φ the corresponding vector u is multiplied by and remains an eigenvector. Using superscript signs to denote the sign of θ in the eigenvalue with which each vector is associated, the matrix has the properties that and which places restrictions on v if this is to be the identity. Note that the 23 element vanishes even in the absence of any relationship between and .
A convenient form for U, symmetrical in the elements of l, is obtained by setting and is in which the normalizing denominator is given by
References
Aharonov, Y., Farach, H. A. & Poole, C. P. (1977). Non-linear vector product to describe rotations. Am. J. Phys. 45, 451–454.
Diamond, R. (1966). A mathematical model-building procedure for proteins. Acta Cryst. 21, 253–266.
Diamond, R. (1976a). On the comparison of conformations using linear and quadratic transformations. Acta Cryst. A32, 1–10.
Diamond, R. (1988). A note on the rotational superposition problem. Acta Cryst. A44, 211–216.
Diamond, R. (1989). A comparison of three recently published methods for superimposing vector sets by pure rotation. Acta Cryst. A45, 657.
Diamond, R. (1990a). On the factorisation of rotations with special reference to diffractometry. Proc. R. Soc. London Ser. A, 428, 451–472.
Diamond, R. (1990b). Chirality in rotational superposition. Acta Cryst. A46, 423.
Kabsch, W. (1976). A solution for the best rotation to relate two sets of vectors. Acta Cryst. A32, 922–923.
Kabsch, W. (1978). A discussion of the solution for the best rotation to relate two sets of vectors. Acta Cryst. A34, 827–828.
Kearsley, S. K. (1989). On the orthogonal transformation used for structural comparisons. Acta Cryst. A45, 208–210.
Mackay, A. L. (1984). Quaternion transformation of molecular orientation. Acta Cryst. A40, 165–166.
McLachlan, A. D. (1972). A mathematical procedure for superimposing atomic coordinates of proteins. Acta Cryst. A28, 656–657.
McLachlan, A. D. (1979). Gene duplications in the structural evolution of chymotrypsin. Appendix: Least squares fitting of two structures. J. Mol. Biol. 128, 49–79.
McLachlan, A. D. (1982). Rapid comparison of protein structures. Acta Cryst. A38, 871–873.