International Tables for Crystallography, Volume F: Crystallography of biological macromolecules. Edited by M. G. Rossmann and E. Arnold. © International Union of Crystallography 2006
International Tables for Crystallography (2006). Vol. F, ch. 18.5, pp. 406–409
Section 18.5.4. Two examples of full-matrix inversion
Chemistry Department, UMIST, Manchester M60 1QD, England
G. M. Sheldrick extended his SHELXL96 program (Sheldrick & Schneider, 1997) to provide extra information about protein precision through the inversion of least-squares full matrices. His programs have been used by Deacon et al. (1997) for the high-resolution refinement of native concanavalin A with 237 residues, using data at 110 K to 0.94 Å refined anisotropically. After the convergence and completion of full-matrix restrained refinement for the structure, the unrestrained full matrix (coordinates only) was computed and then inverted in a massive calculation. This led to s.u.'s σ(x), σ(y), σ(z) and σ(r) for all atoms, and to σ(l) and σ(θ) for all bond lengths and angles. Here σ(r) is defined as [σ^{2}(x) + σ^{2}(y) + σ^{2}(z)]^{1/2}. For concanavalin A the restrained full matrix was also inverted, thus allowing the comparison of restrained and unrestrained s.u.'s.
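As a minimal sketch of how such s.u.'s follow from an inverted matrix, the Python below takes a small covariance matrix (standing in for the relevant block of the inverse normal matrix) and derives a positional s.u. and a bond-length s.u. by first-order error propagation. The function names, the 6 × 6 toy covariance and the coordinates are hypothetical illustrations, not values from the concanavalin A refinement, and the root-sum-square definition of σ(r) is assumed here.

```python
import math

def positional_su(cov, atom):
    """sigma(r) for one atom: root of the summed x, y, z variances (assumed definition)."""
    i = 3 * atom
    return math.sqrt(cov[i][i] + cov[i + 1][i + 1] + cov[i + 2][i + 2])

def bond_length_su(cov, xyz, a, b):
    """sigma(l) for bond a-b by first-order error propagation: var(l) = g^T C g."""
    d = [xyz[b][k] - xyz[a][k] for k in range(3)]
    length = math.sqrt(sum(c * c for c in d))
    u = [c / length for c in d]          # unit vector along the bond
    g = [0.0] * len(cov)                 # gradient of l with respect to all coordinates
    for k in range(3):
        g[3 * a + k] = -u[k]
        g[3 * b + k] = u[k]
    n = len(g)
    var = sum(g[i] * cov[i][j] * g[j] for i in range(n) for j in range(n))
    return math.sqrt(var)

# Toy example: two atoms, uncorrelated coordinates, each with an s.u. of 0.01 A.
n = 6
cov = [[(0.01 ** 2 if i == j else 0.0) for j in range(n)] for i in range(n)]
xyz = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]

sigma_r = positional_su(cov, 0)          # sqrt(3) * 0.01 for this toy covariance
sigma_l = bond_length_su(cov, xyz, 0, 1) # sqrt(2) * 0.01 for this toy covariance
```

In a real full-matrix calculation the off-diagonal covariances between bonded atoms are generally non-zero, which is why the bond-length s.u. is propagated through the full covariance block rather than from the two positional s.u.'s alone.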
The results for concanavalin A from the inversion of the coordinate matrices of order 6402 (= 2134 × 3) are plotted in Figs. 18.5.4.1 and 18.5.4.2. Fig. 18.5.4.1 shows σ(r) versus B for the fully occupied atoms of the protein (a few atoms with B > 60 Å^{2} are off-scale). The points are colour-coded black for carbon, blue for nitrogen and red for oxygen. Fig. 18.5.4.1(a) shows the restrained results, and Fig. 18.5.4.1(b) shows the unrestrained diffraction-data-only results. Superposed on both sets of data points are weighted least-squares quadratic fits. At high B, the unrestrained σ(r) can be at least double the restrained σ(r), e.g. for carbon at B = 50 Å^{2}, the unrestrained σ(r) is about 0.25 Å, whereas the restrained σ(r) is about 0.11 Å. For B < 10 Å^{2}, both σ(r)'s fall below 0.02 Å and are around 0.01 Å at B = 6 Å^{2}.

Figure 18.5.4.1. Plots of σ(r) versus B for concanavalin A with 0.94 Å data: (a) restrained full-matrix σ(r), (b) unrestrained full-matrix σ(r). Carbon black, nitrogen blue, oxygen red.

Figure 18.5.4.2. Plots of σ(l) versus average B for concanavalin A with 0.94 Å data: (a) restrained full-matrix σ(l), (b) unrestrained full-matrix σ(l). C–C black, C–N blue, C–O red.
For B < 10 Å^{2}, the better precision of oxygen as compared with nitrogen, and of nitrogen as compared with carbon, can be clearly seen. At the lowest B, the unrestrained σ(r) in Fig. 18.5.4.1(b) are almost as small as the restrained σ(r) in Fig. 18.5.4.1(a). [The quadratic fits of the restrained results in Fig. 18.5.4.1(a) are evidently slightly imperfect in making σ(r) tend almost to 0 as B tends to 0.]
Fig. 18.5.4.2 shows σ(l) versus average B for the bond lengths in the protein. The points are colour-coded black for C–C, blue for C–N and red for C–O. The restrained and unrestrained distributions are very different for high B. The restrained distribution in Fig. 18.5.4.2(a) tends to about 0.02 Å, which is the standard uncertainty of the applied restraint for 1–2 bond lengths, whereas the unrestrained distribution in Fig. 18.5.4.2(b) goes off the scale of the diagram. But for B < 10 Å^{2}, both distributions fall to around 0.01 Å.
The differences between the restrained and unrestrained σ(r) and σ(l) can be understood through the two-atom model for restrained refinement described in Section 18.5.3. For that model, the equation

1/σ_{res}^{2}(l) = 1/σ_{diff}^{2}(l) + 1/σ_{geom}^{2}(l)    (18.5.3.16)

relates the bond-length s.u. in the restrained refinement, σ_{res}(l), to the σ_{diff}(l) of the unrestrained refinement and the s.u. σ_{geom}(l) assigned to the length in the stereochemical dictionary. In the refinements, σ_{geom}(l) was 0.02 Å for all bond lengths. When this is combined in (18.5.3.16) with the unrestrained σ_{diff}(l) of any bond, the predicted restrained σ_{res}(l) is close to that found in the restrained full matrix.
It can be seen from Fig. 18.5.4.2(b) that many bond lengths with average B < 10 Å^{2} have an unrestrained σ_{diff}(l) < 0.02 Å. For these bonds the diffraction data have greater weight than the stereochemical dictionary. Some bonds have an unrestrained σ_{diff}(l) as low as 0.0080 Å, with a restrained σ_{res}(l) around 0.0074 Å. This situation is one consequence of the availability of diffraction data to the high resolution of 0.94 Å. For large σ_{diff}(l) (i.e. high B), equation (18.5.3.16) predicts that σ_{res}(l) tends to 0.02 Å, as is found in Fig. 18.5.4.2(a).
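The behaviour described above can be checked numerically with a short sketch. It assumes that (18.5.3.16) combines the diffraction-only and dictionary s.u.'s by adding their weights 1/σ^{2}, a form that reproduces the 0.0080 Å to 0.0074 Å example quoted in the text; the function name `restrained_bond_su` is a hypothetical helper, and the trial values other than 0.008 Å are illustrative only.

```python
import math

def restrained_bond_su(sigma_diff, sigma_geom=0.02):
    """Combine diffraction-only and dictionary s.u.'s by adding their weights 1/sigma^2.

    Assumed form of the two-atom relation; all values in angstroms.
    """
    weight = 1.0 / sigma_diff ** 2 + 1.0 / sigma_geom ** 2
    return 1.0 / math.sqrt(weight)

# Low sigma_diff: diffraction dominates. High sigma_diff: result tends to sigma_geom.
for sigma_diff in (0.008, 0.02, 0.06, 0.5):
    print(f"sigma_diff(l) = {sigma_diff:.3f} A -> sigma_res(l) = "
          f"{restrained_bond_su(sigma_diff):.4f} A")
```

Because the weights add, the restrained s.u. can never exceed the smaller of the two inputs, which is why the restrained distribution levels off near the 0.02 Å dictionary value at high B.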
In an isotropic approximation, σ(r) = 3^{1/2}σ(x). Equation (18.5.3.12) of the two-atom model can be recast to give equation (18.5.4.1). For low B in concanavalin A, (18.5.4.1) gives quite good predictions of the restrained σ_{res}(r) from the unrestrained σ_{diff}(r). For instance, for a carbon atom with B = 15 Å^{2}, taking σ_{diff}(r) from the quadratic curve for carbon in Fig. 18.5.4.1(b) and using σ_{geom}(l) = 0.02 Å in (18.5.4.1) gives a predicted σ_{res}(r) of 0.028 Å, close to the value shown by the restrained curve in Fig. 18.5.4.1(a).
However, for high B, say B = 50 Å^{2}, the quadratic curve for carbon in Fig. 18.5.4.1(b) shows an unrestrained σ_{diff}(r) of about 0.25 Å, and Fig. 18.5.4.1(a) shows a restrained σ_{res}(r) of about 0.11 Å, whereas (18.5.4.1) leads to a poor estimate of σ_{res}(r).
Thus at high B, equation (18.5.4.1) from the two-atom model does not give a good description of the relationship between the restrained and unrestrained σ(r). The reason is obvious. Most atoms are linked by 1–2 bond restraints to two or three other atoms. Even a carbonyl oxygen atom linked to its carbon atom by a 0.02 Å restraint is also subject to 0.04 Å 1–3 restraints to chain C^{α} and N atoms. Consequently, for a high-B atom, when the restraints are applied it is coupled to several other atoms in a group, and its restrained σ(r) is lower, compared with the diffraction-data-only σ(r), by a greater amount than would be expected from the two-atom model.
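The limitation of a single restraint can be illustrated with a small numerical sketch of a two-atom model. The construction below is an assumption made for illustration, not a transcription of equations (18.5.3.12) or (18.5.4.1): two atoms on a line each carry a diffraction weight a = 1/σ_{diff}^{2}(x), one bond-length restraint adds a weight b = 1/σ_{geom}^{2}(l), the 2 × 2 normal matrix is inverted analytically, and the isotropic approximation σ(r) = 3^{1/2}σ(x) converts between coordinate and positional s.u.'s. The function name is hypothetical.

```python
import math

def two_atom_restrained_su(sigma_diff_r, sigma_geom_l=0.02):
    """Restrained sigma(r) of one atom in an assumed one-dimensional two-atom model."""
    sigma_diff_x = sigma_diff_r / math.sqrt(3.0)   # isotropic approximation
    a = 1.0 / sigma_diff_x ** 2                    # diffraction weight per coordinate
    b = 1.0 / sigma_geom_l ** 2                    # weight of the bond-length restraint
    # Normal matrix for (x1, x2) with a restraint on l = x2 - x1:
    #   [[a + b, -b],
    #    [-b, a + b]]
    det = (a + b) ** 2 - b ** 2
    var_x = (a + b) / det                          # diagonal element of the inverse
    return math.sqrt(3.0 * var_x)

# Unrestrained carbon value at B = 50 A^2 quoted in the text: about 0.25 A.
predicted = two_atom_restrained_su(0.25)
print(f"two-atom prediction for sigma_res(r): {predicted:.3f} A")
```

Under these assumptions the sketch returns roughly 0.18 Å for an unrestrained value of 0.25 Å, well above the 0.11 Å shown by the restrained curve. In this model a single restraint can reduce the positional s.u. by at most a factor of 2^{1/2}, so the larger reduction observed at high B has to come from coupling to several neighbouring atoms.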
Sheldrick has provided the results of the unrestrained lower-resolution refinement of a single-chain immunoglobulin mutant (T39K) with 218 amino-acid residues, with data to 1.70 Å refined isotropically (Usón et al., 1999). Fig. 18.5.4.3 shows σ(r) versus B for the fully occupied protein atoms. Superposed on the data points are least-squares quadratic fits. In a first very rough approximation for σ(r) suggested later by equation (18.5.6.3), the dependence on atom type is controlled by 1/Z, the reciprocal of the atomic number. Sheldrick found that a 1/Z dependence produced too little difference between C, N and O. The proportionalities between the quadratics for σ(r) in Figs. 18.5.4.1 and 18.5.4.3 are based on the reciprocals of the scattering factors at a fixed value of sin θ/λ. For C, N and O, these scattering factors are 2.494, 3.219 and 4.089, respectively. For potential use in later work, the least-squares fits to the σ(r) in Å are recorded here as […] for the immunoglobulin (unrestrained), concanavalin A (unrestrained) and concanavalin A (restrained), respectively.
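The difference between the two atom-type scalings can be seen by comparing the ratios they imply relative to carbon. The sketch below uses the atomic numbers of C, N and O and the three scattering-factor values quoted in the text; the dictionary and function names are hypothetical, and the assumption is simply that σ(r) scales as the reciprocal of the chosen quantity.

```python
Z = {"C": 6, "N": 7, "O": 8}                  # atomic numbers
F = {"C": 2.494, "N": 3.219, "O": 4.089}      # scattering-factor values quoted in the text

def relative_su(strength):
    """sigma(r) relative to carbon, assuming sigma(r) is proportional to 1/strength."""
    return {element: strength["C"] / s for element, s in strength.items()}

by_atomic_number = relative_su(Z)
by_scattering_factor = relative_su(F)

for element in ("C", "N", "O"):
    print(f"{element}: 1/Z ratio = {by_atomic_number[element]:.3f}, "
          f"1/f ratio = {by_scattering_factor[element]:.3f}")
```

The 1/Z scaling puts oxygen at three-quarters of the carbon value, whereas the scattering-factor scaling separates the three curves more widely, consistent with the statement that 1/Z produced too little difference between C, N and O.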

Figure 18.5.4.3. Plot of σ(r) versus B from an unrestrained full matrix for immunoglobulin mutant (T39K) with 1.70 Å data. Carbon black, nitrogen blue, oxygen red.
As might be expected from the lower resolution, the lowest σ(r)'s in the immunoglobulin are about six times the lowest σ(r)'s in concanavalin. But at B = 50 Å^{2}, the immunoglobulin curve for carbon gives a σ(r) that is only 50% larger than the concanavalin value of 0.25 Å.
Fig. 18.5.4.4 shows σ(l) versus average B for the immunoglobulin. Note that the lowest immunoglobulin unrestrained σ(l) is about 0.06 Å, which is three times the 0.02 Å bond restraint.
Geometric restraint dictionaries typically use bond-length weights based on σ(l) of around 0.02 or 0.03 Å. Tables 18.5.7.1–18.5.7.3 show that even 1.5 Å studies have diffraction-only errors of 0.08 Å and upwards. Only for resolutions of 1.0 Å or so are the diffraction-only errors comparable with the dictionary weights. Of course, the dictionary offers no values for many of the configurational parameters of the protein structure, including the centroid and molecular orientation.
The opening contention of this chapter in Section 18.5.1.1 is that the variances and covariances of the structural parameters of proteins can be found from the inverse of the least-squares normal matrix. But there is a caveat, chiefly that explicit account would not be taken of disorder of the solvent or of parts of the protein. Corrections by Babinet's principle of complementarity or by mask bulk-solvent models are only first-order approximations. The consequences of such disorder problems, which make the variation of calculated structure factors nonlinear over the range of interest, may in future be better handled by maximum-likelihood methods (e.g. Read, 1990; Bricogne, 1993; Bricogne & Irwin, 1996; Murshudov et al., 1997). Pannu & Read (1996) have shown how the maximum-likelihood method can be cast computationally into a form akin to least-squares calculations. Full-matrix precision estimates along the lines of the present chapter are probably somewhat low.
It should also be noted that full-matrix estimates of coordinate precision are most reliably derived from matrices involving both coordinates and atomic displacement parameters. This is particularly important for lower-resolution analyses, in which atomic images overlap. The work on the high-resolution analysis of concanavalin A described in Section 18.5.4.1 was based on the very large coordinate matrix, of order 6402. The omission, because of computer limitations, of the anisotropic displacement parameters from the full matrix will have caused the coordinate s.u.'s of atoms with high B to be underestimated.
Much information about the quality of a molecular model can be obtained from the eigenvalues and eigenvectors of the normal matrix (Cowtan & Ten Eyck, 2000).
References
Bricogne, G. (1993). Direct phase determination by entropy maximization and likelihood ranking: status report and perspectives. Acta Cryst. D49, 37–60.
Bricogne, G. & Irwin, J. (1996). Maximum-likelihood structure refinement: theory and implementation within BUSTER + TNT. In Proceedings of the CCP4 study weekend. Macromolecular refinement, edited by E. Dodson, M. Moore, A. Ralph & S. Bailey, pp. 85–92. Warrington: Daresbury Laboratory.
Cowtan, K. & Ten Eyck, L. F. (2000). Eigensystem analysis of the refinement of a small metalloprotein. Acta Cryst. D56, 842–856.
Deacon, A., Gleichmann, T., Kalb (Gilboa), A. J., Price, H., Raftery, J., Bradbrook, G., Yariv, J. & Helliwell, J. R. (1997). The structure of concanavalin A and its bound solvent determined with small-molecule accuracy at 0.94 Å resolution. J. Chem. Soc. Faraday Trans. 93, 4305–4312.
Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Refinement of macromolecular structures by the maximum-likelihood method. Acta Cryst. D53, 240–255.
Pannu, N. S. & Read, R. J. (1996). Improved structure refinement through maximum likelihood. Acta Cryst. A52, 659–668.
Read, R. J. (1990). Structure-factor probabilities for related structures. Acta Cryst. A46, 900–912.
Sheldrick, G. M. & Schneider, T. R. (1997). SHELXL: high resolution refinement. Methods Enzymol. 277, 319–343.
Usón, I., Pohl, E., Schneider, T. R., Dauter, Z., Schmidt, A., Fritz, H.-J. & Sheldrick, G. M. (1999). 1.7 Å structure of the stabilized REI_{V} mutant T39K. Application of local NCS restraints. Acta Cryst. D55, 1158–1167.