International
Tables for
Crystallography
Volume F
Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F. ch. 14.2, pp. 305–306

Section 14.2.2.6. Scoring of trial heavy-atom solutions

T. C. Terwilliger and J. Berendzen

Scoring of potential heavy-atom solutions is an essential part of the Solve algorithm because it allows ranking of solutions and appropriate decision making. Solve scores trial heavy-atom solutions (or anomalously scattering atom solutions) using four criteria: agreement with the Patterson function, cross-validation of heavy-atom sites, figure of merit, and non-randomness of the electron-density map. The scores for each criterion are normalized to those for a group of starting solutions (most of which are incorrect) to obtain Z scores. The total score for a solution is the sum of its Z scores after correction for anomalously high scores in any category.
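The normalization described above can be sketched as follows. This is a minimal illustration, not the Solve implementation: the criterion names, the dictionary layout and the use of the sample standard deviation are assumptions, and the correction for anomalously high scores is omitted because its form is not given here.

```python
from statistics import mean, stdev

# Hypothetical labels for the four criteria named in the text.
CRITERIA = ("patterson", "cross_validation", "figure_of_merit", "map_correlation")


def z_scores(raw, starting):
    """Return one Z score per criterion for a trial solution.

    raw      -- dict mapping each criterion to the trial solution's raw score
    starting -- list of such dicts for the group of starting solutions
    """
    z = {}
    for name in CRITERIA:
        reference = [solution[name] for solution in starting]
        # Assumption: sample standard deviation of the starting group.
        z[name] = (raw[name] - mean(reference)) / stdev(reference)
    return z


def total_score(raw, starting):
    """Sum of the Z scores (no correction for anomalously high values)."""
    return sum(z_scores(raw, starting).values())
```

Because each criterion is expressed in units of the spread among the starting solutions, criteria with very different natural scales contribute comparably to the total.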

The first criterion used by Solve for evaluating a trial heavy-atom solution is the agreement between calculated and observed Patterson functions. Comparisons of this type have always been important in the MIR and MAD methods (Blundell & Johnson, 1976). The score for Patterson-function agreement is the average value of the Patterson function at predicted locations of peaks, after multiplication by a weighting factor based on the number of heavy-atom sites in the trial solution. The weighting factor (Terwilliger & Berendzen, 1999b) is adjusted so that if two solutions have the same mean value at predicted Patterson peaks, the one with the larger number of sites receives the higher score. Typically the weighting factor is approximately given by $N^{1/2}$, where there are N sites in the solution.
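As an illustration of the weighting, the sketch below multiplies the mean Patterson value at the predicted peak positions by $N^{1/2}$. The function name and arguments are hypothetical, and the exact weighting factor used by Solve is only approximately $N^{1/2}$.

```python
import math


def patterson_score(peak_values, n_sites):
    """Mean Patterson value at predicted peaks, weighted by sqrt(n_sites).

    peak_values -- Patterson function values sampled at the predicted
                   interatomic vector positions (hypothetical input)
    n_sites     -- number of heavy-atom sites in the trial solution
    """
    mean_value = sum(peak_values) / len(peak_values)
    # Assumption: the weighting factor is taken as exactly sqrt(N).
    return math.sqrt(n_sites) * mean_value
```

With the same mean peak value of 10, a four-site solution scores 20 and a nine-site solution scores 30, so the solution with more sites ranks higher.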

In some cases, predicted Patterson vectors fall on high peaks that are not related to the heavy-atom solution. To exclude these contributions, the occupancy of each heavy-atom site is refined so that the predicted peak heights approximately match the observed peak heights at the predicted interatomic positions. Then all peaks with heights more than 1σ higher than their predicted values are truncated at this height. The average values are further corrected for instances where more than one predicted Patterson vector falls on the same location by scaling that peak height by the fraction of predicted vectors that are unique.
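The truncation step can be sketched as below, reading 'truncated at this height' as a cap at the predicted height plus 1σ. That reading, the function name and the argument layout are assumptions. The correction for coincident vectors is not included.

```python
def truncate_peaks(observed, predicted, sigma):
    """Cap each observed peak height at its predicted height plus one sigma.

    observed  -- observed Patterson values at the predicted positions
    predicted -- peak heights predicted from the refined occupancies
    sigma     -- r.m.s. value (1 sigma) of the Patterson function
    """
    return [min(obs, pred + sigma) for obs, pred in zip(observed, predicted)]
```

For observed heights 5, 20 and 8, predicted heights 6, 7 and 8, and σ = 2, only the second peak exceeds its limit and is reduced to 9. An unrelated high peak therefore cannot dominate the average.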

A 'cross-validation' difference Fourier analysis is the basis of the second criterion used to evaluate heavy-atom solutions. One at a time, each site in a solution (and any equivalent sites in other derivatives for MIR solutions) is omitted from the heavy-atom model and phases are recalculated. These phases are used in a difference Fourier analysis and the peak height at the location of the omitted site is noted. A similar analysis where a derivative is omitted from phasing and all other derivatives are used to phase a difference Fourier has been used for many years (Dickerson et al., 1961). The score for cross-validation difference Fouriers is the average peak height, after weighting by the same factor used in the difference Patterson analysis.
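The leave-one-out structure of this criterion can be sketched as follows. The phasing and difference Fourier calculation are represented by a caller-supplied function, `peak_height_when_omitted`, which is a hypothetical placeholder and not part of Solve. The weighting is again taken as approximately $N^{1/2}$.

```python
import math


def cross_validation_score(sites, peak_height_when_omitted):
    """Weighted mean difference-Fourier peak height over omitted sites.

    sites                    -- list of heavy-atom sites in the trial solution
    peak_height_when_omitted -- hypothetical callable (remaining_sites, site)
                                that rephases with `remaining_sites` and
                                returns the difference-Fourier peak height
                                at the position of `site`
    """
    heights = []
    for index, site in enumerate(sites):
        # Omit one site at a time and phase with all the others.
        remaining = sites[:index] + sites[index + 1:]
        heights.append(peak_height_when_omitted(remaining, site))
    mean_height = sum(heights) / len(heights)
    # Assumption: same approximate sqrt(N) weighting as the Patterson score.
    return math.sqrt(len(sites)) * mean_height
```

A correct site should reappear as a strong peak when phased by the remaining sites, whereas a spurious site generally will not.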

The mean figure of merit of phasing (m) (Blundell & Johnson, 1976) can be a remarkably useful measure of the quality of phasing despite its susceptibility to systematic error (Terwilliger & Berendzen, 1999b). The overall figure of merit is essentially a measure of the internal consistency of the heavy-atom solution and the data, and is used as the third criterion for solution quality in Solve. As heavy-atom refinement in Solve is carried out using origin-removed Patterson refinement (Terwilliger & Eisenberg, 1983), occupancies of heavy-atom sites are relatively unbiased. This minimizes the problem of high occupancies leading to inflated figures of merit. Additionally, using a single procedure for phasing allows comparison between solutions. The score based on figure of merit is simply the unweighted mean for all reflections included in phasing.

The most important criterion used by a crystallographer in evaluating the quality of a heavy-atom solution is the interpretability of the resulting electron-density map. Although a full implementation of such a criterion is difficult, it is quite straightforward to evaluate instead whether the electron-density map has features that are expected for a crystal of a macromolecule. A number of features of electron-density maps could be used for this purpose, including the connectivity of electron density in the maps (Baker et al., 1993), the presence of clearly defined regions of protein and solvent (Wang, 1985; Podjarny et al., 1987; Zhang & Main, 1990; Xiang et al., 1993; Abrahams et al., 1994; Terwilliger & Berendzen, 1999a,c), and histogram matching of electron densities (Zhang & Main, 1990; Goldstein & Zhang, 1998). We have used the identification of solvent and protein regions as the measure of map quality in Solve. This requires that there be both solvent and protein regions in the electron-density map, but for most macromolecular structures the fraction of the unit cell that is occupied by the macromolecule is in the suitable range of 30–70%. The criterion used in scoring by Solve is based on the connectivity of the solvent and protein regions (Terwilliger & Berendzen, 1999c). The unit cell is divided into boxes approximately twice the resolution of the map on a side, and within each box the r.m.s. electron density is calculated, without including the $F_{000}$ term in the Fourier synthesis. For boxes within the protein region, this r.m.s. electron density will typically be high (as there are some points where atoms are located and other points between atoms), while for those in the solvent region it will be low (as the electron density is fairly uniform). The score based on the connectivity of the protein and solvent regions is simply the correlation coefficient of this r.m.s. electron density for adjacent boxes. If there is a large contiguous protein region and a large contiguous solvent region, then adjacent boxes will have highly correlated values of their r.m.s. electron densities. If the electron density is random, there will be little or no correlation. In practice, for a very good electron-density map, this correlation of local r.m.s. electron density may be as high as 0.5 or 0.6.
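The correlation of local r.m.s. density can be sketched on a gridded map as follows. This is a minimal illustration under stated assumptions: the map is a nested list sampled on a regular grid with the $F_{000}$ term already excluded, the box edge is given in grid points, neighbouring boxes are paired along each axis with periodic wrapping, and `synthetic_map` only generates artificial test data. None of these names come from Solve.

```python
import math
import random


def box_rms(density, box):
    """R.m.s. density in each cubic box of `box` grid points per side.

    Grid points beyond the last complete box along an axis are ignored.
    """
    bx, by, bz = (len(density) // box, len(density[0]) // box,
                  len(density[0][0]) // box)
    sums = [[[0.0] * bz for _ in range(by)] for _ in range(bx)]
    for i in range(bx * box):
        for j in range(by * box):
            for k in range(bz * box):
                sums[i // box][j // box][k // box] += density[i][j][k] ** 2
    n = box ** 3
    return [[[math.sqrt(v / n) for v in row] for row in plane] for plane in sums]


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either list is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def adjacent_box_correlation(density, box):
    """Correlation of r.m.s. density between neighbouring boxes."""
    rms = box_rms(density, box)
    bx, by, bz = len(rms), len(rms[0]), len(rms[0][0])
    first, second = [], []
    for i in range(bx):
        for j in range(by):
            for k in range(bz):
                # Assumption: one neighbour along each axis, wrapping
                # around because the unit cell is periodic.
                for a, b, c in (((i + 1) % bx, j, k),
                                (i, (j + 1) % by, k),
                                (i, j, (k + 1) % bz)):
                    first.append(rms[i][j][k])
                    second.append(rms[a][b][c])
    return pearson(first, second)


def synthetic_map(n, protein_fraction, seed):
    """Artificial n*n*n map: a noisy 'protein' slab and a flat 'solvent' slab."""
    rng = random.Random(seed)
    cut = int(n * protein_fraction)
    return [[[rng.gauss(0.0, 1.0 if i < cut else 0.1) for _ in range(n)]
             for _ in range(n)] for i in range(n)]
```

Calling `adjacent_box_correlation(synthetic_map(16, 0.5, seed=0), box=2)` gives a clearly positive correlation for a map with distinct slabs, whereas a map of uniform noise (`protein_fraction=1.0`) gives a value near zero. Such idealized maps illustrate the trend only and are not comparable to values obtained from real data.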

References

Abrahams, J. P., Leslie, A. G. W., Lutter, R. & Walker, J. E. (1994). Structure at 2.8 Å resolution of F1-ATPase from bovine heart mitochondria. Nature (London), 370, 621–628.
Baker, D., Krukowski, A. E. & Agard, D. A. (1993). Uniqueness and the ab initio phase problem in macromolecular crystallography. Acta Cryst. D49, 186–192.
Blundell, T. L. & Johnson, L. N. (1976). Protein crystallography. p. 368. New York: Academic Press.
Dickerson, R. E., Kendrew, J. C. & Strandberg, B. E. (1961). The crystal structure of myoglobin: phase determination to a resolution of 2 Å by the method of isomorphous replacement. Acta Cryst. 14, 1188–1195.
Goldstein, A. & Zhang, K. Y. J. (1998). The two-dimensional histogram as a constraint for protein phase improvement. Acta Cryst. D54, 1230–1244.
Podjarny, A. D., Bhat, T. N. & Zwick, M. (1987). Improving crystallographic macromolecular images: the real-space approach. Annu. Rev. Biophys. Biophys. Chem. 16, 351–373.
Terwilliger, T. C. & Berendzen, J. (1999a). Discrimination of solvent from protein regions in native Fouriers as a means of evaluating heavy-atom solutions in the MIR and MAD methods. Acta Cryst. D55, 501–505.
Terwilliger, T. C. & Berendzen, J. (1999b). Automated MIR and MAD structure solution. Acta Cryst. D55, 849–861.
Terwilliger, T. C. & Berendzen, J. (1999c). Evaluation of macromolecular electron-density map quality using the correlation of local r.m.s. density. Acta Cryst. D55, 1872–1877.
Terwilliger, T. C. & Eisenberg, D. (1983). Unbiased three-dimensional refinement of heavy-atom parameters by correlation of origin-removed Patterson functions. Acta Cryst. A39, 813–817.
Wang, B.-C. (1985). Resolution of phase ambiguity in macromolecular crystallography. Methods Enzymol. 115, 90–112.
Xiang, S., Carter, C. W. Jr, Bricogne, G. & Gilmore, C. J. (1993). Entropy maximization constrained by solvent flatness: a new method for macromolecular phase extension and map improvement. Acta Cryst. D49, 193–212.
Zhang, K. Y. J. & Main, P. (1990). The use of Sayre's equation with solvent flattening and histogram matching for phase extension and refinement of protein structures. Acta Cryst. A46, 377–381.