R values

Kleywegt, G. J.

doi:10.1107/97809553602060000707

International Tables for Crystallography, Volume F: Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F. ch. 21.1, pp. 504-505

Section 21.1.7.4.1. R values

G. J. Kleywegt^a ^*

^aDepartment of Cell and Molecular Biology, Uppsala University, Biomedical Centre, Box 596, SE-751 24 Uppsala, Sweden
Correspondence e-mail: gerard@xray.bmc.uu.se

The traditional statistic used to assess how well a model fits the experimental data is the crystallographic R value, $R = \sum w\,\bigl|\,|F_{o}| - k|F_{c}|\,\bigr| \big/ \sum |F_{o}|$, where $|F_{o}|$ and $|F_{c}|$ are the observed and calculated structure-factor amplitudes, $k$ is a scale factor, $w$ is a weight and the sums run over the reflections. This statistic is closely related to the standard least-squares crystallographic residual $\sum w\,(|F_{o}| - k|F_{c}|)^{2}$, and its value can be reduced essentially arbitrarily by increasing the number of parameters used to describe the model (e.g. by refining anisotropic ADPs and occupancies for all atoms) or, conversely, by reducing the number of experimental observations (e.g. through resolution and σ cutoffs) or the number of restraints imposed on the model. Therefore, the conventional R value is only meaningful if the number of experimental observations and restraints greatly exceeds the number of model parameters.

In 1992, Brünger introduced the free R value (R_free; Brünger, 1992a, 1993, 1997; Kleywegt & Brünger, 1996), whose definition is identical to that of the conventional R value, except that it is calculated for a small subset of reflections that are not used in the refinement of the model. The free R value therefore measures how well the model predicts experimental observations that were not used to fit it (cross-validation).

Until a few years ago, a conventional R value below 0.25 was generally considered to be a sign that a model was essentially correct (Brändén & Jones, 1990). While this is probably true at high resolution, it was subsequently shown that several intentionally mistraced models could be refined to deceptively low conventional R values (Jones et al., 1991; Kleywegt & Jones, 1995b; Kleywegt & Brünger, 1996). Brünger suggests a threshold value of 0.40 for the free R value, i.e. models with free R values greater than 0.40 should be treated with caution (Brünger, 1997). Tickle and coworkers have developed methods to estimate the expected value of R_free in least-squares refinement (Tickle et al., 1998).
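
As an illustration of the definitions above, the following minimal Python sketch evaluates the conventional and free R values for a handful of reflections. The function names (`r_value`, `work_and_free_r`), the unit weights, the default scale factor k = 1 and the toy amplitudes are all assumptions made for this example; they are not taken from any refinement program, and in practice k and w would come from the refinement itself.

```python
def r_value(f_obs, f_calc, k=1.0, weights=None):
    """R = sum(w * ||Fo| - k|Fc||) / sum(|Fo|) over the given reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    if weights is None:
        weights = [1.0] * len(f_obs)  # assumption: unit weights
    numerator = sum(w * abs(abs(fo) - k * abs(fc))
                    for fo, fc, w in zip(f_obs, f_calc, weights))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def work_and_free_r(f_obs, f_calc, is_free, k=1.0):
    """Return (R, R_free).

    R is computed over the reflections used in refinement (is_free False),
    R_free over the held-out reflections (is_free True).
    """
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if not free]
    held_out = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, is_free) if free]
    r_work = r_value([fo for fo, _ in work], [fc for _, fc in work], k)
    r_free = r_value([fo for fo, _ in held_out], [fc for _, fc in held_out], k)
    return r_work, r_free


# Toy amplitudes, invented for illustration only (not real data).
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0]
f_calc = [90.0, 85.0, 55.0, 50.0, 20.0]
is_free = [False, False, False, True, False]  # one reflection held out

r_work, r_free = work_and_free_r(f_obs, f_calc, is_free)
print(f"R = {r_work:.3f}, R_free = {r_free:.3f}")
```

The only difference between the two statistics in this sketch is the set of reflections over which the sums run, which is the point of the cross-validation argument.
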
Since the difference between the conventional and free R values is partly a measure of the extent to which the model over-fits the data (i.e. some aspects of the model improve the conventional but not the free R value and are therefore likely to fit noise rather than signal in the data), the difference R_free − R should be small (Kleywegt & Jones, 1995a; Kleywegt & Brünger, 1996). Alternatively, the R_free ratio (defined as R_free/R; Tickle et al., 1998) should be close to unity. Various practical aspects of the use of the free R value have been discussed by Kleywegt & Brünger (1996) and by Brünger (1997).
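
The two over-fitting indicators mentioned here are simple to compute once R and R_free are known. The sketch below is illustrative only: the function name `overfitting_indicators` and the input values are hypothetical, and no acceptance thresholds are encoded because the text gives none beyond "small" and "close to unity".

```python
def overfitting_indicators(r_work, r_free):
    """Return (R_free - R, R_free / R) for a conventional and a free R value."""
    if r_work <= 0.0:
        raise ValueError("the conventional R value must be positive")
    return r_free - r_work, r_free / r_work


# Hypothetical values chosen for illustration.
gap, ratio = overfitting_indicators(r_work=0.20, r_free=0.25)
print(f"R_free - R = {gap:.3f}, R_free/R = {ratio:.3f}")
```
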

Self-validation is an alternative to cross-validation; in crystallographic refinement, the Hamilton test (Hamilton, 1965) is a prime example. This method enables one to assess whether a reduction in the R value is statistically significant given the increase in the number of degrees of freedom. Application of this test to macromolecules is complicated by the difficulty of estimating the effect of the combined set of restraints on the (effective) number of degrees of freedom, but some information can nevertheless be gained from such an analysis (Bacchi et al., 1996).
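
A rough sketch of the R-ratio form of the Hamilton test follows, stated under several assumptions. In this form, the ratio of the weighted R factor of the more restricted model to that of the less restricted model is compared with a critical value sqrt(b/(n − m) · F + 1), where b is the number of additional parameters, n the number of observations, m the number of parameters in the less restricted model, and F the critical value of the F distribution with (b, n − m) degrees of freedom at the chosen significance level. The function names are hypothetical, the F critical value must be supplied by the caller (from tables or a statistics library), and the sketch takes n and m at face value, i.e. it ignores the effect of restraints on the effective number of degrees of freedom, which is exactly the difficulty noted above for macromolecules.

```python
import math


def hamilton_critical_ratio(b, n, m, f_crit):
    """Critical R-factor ratio sqrt(b / (n - m) * F + 1).

    b      -- number of additional parameters in the less restricted model
    n      -- number of observations
    m      -- number of parameters in the less restricted model
    f_crit -- critical value of F(b, n - m) at the chosen significance level,
              supplied by the caller (e.g. scipy.stats.f.ppf(1 - alpha, b, n - m))
    """
    if n <= m:
        raise ValueError("need more observations than parameters")
    return math.sqrt(b / (n - m) * f_crit + 1.0)


def improvement_is_significant(r_restricted, r_unrestricted, b, n, m, f_crit):
    """True if the drop in weighted R on adding b parameters passes the test."""
    observed_ratio = r_restricted / r_unrestricted
    return observed_ratio > hamilton_critical_ratio(b, n, m, f_crit)


# Hypothetical numbers: one extra parameter, 120 residual degrees of freedom.
# 3.92 is the approximate tabulated 5% critical value of F(1, 120).
critical = hamilton_critical_ratio(b=1, n=121, m=1, f_crit=3.92)
significant = improvement_is_significant(0.210, 0.200, b=1, n=121, m=1, f_crit=3.92)
print(f"critical ratio = {critical:.4f}, significant = {significant}")
```
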

References

Bacchi, A., Lamzin, V. S. & Wilson, K. S. (1996). A self-validation technique for protein structure refinement: the extended Hamilton test. Acta Cryst. D52, 641–646.

Brändén, C.-I. & Jones, T. A. (1990). Between objectivity and subjectivity. Nature (London), 343, 687–689.

Brünger, A. T. (1992a). Free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature (London), 355, 472–475.

Brünger, A. T. (1993). Assessment of phase accuracy by cross validation: the free R value. Methods and applications. Acta Cryst. D49, 24–36.

Brünger, A. T. (1997). The free R value: a more objective statistic for crystallography. Methods Enzymol. 277, 366–396.

Hamilton, W. C. (1965). Significance tests on the crystallographic R factor. Acta Cryst. 18, 502–510.

Jones, T. A., Zou, J.-Y., Cowan, S. W. & Kjeldgaard, M. (1991). Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Cryst. A47, 110–119.

Kleywegt, G. J. & Brünger, A. T. (1996). Checking your imagination: applications of the free R value. Structure, 4, 897–904.

Kleywegt, G. J. & Jones, T. A. (1995a). Braille for pugilists. In Proceedings of the CCP4 study weekend. Making the most of your model, edited by W. N. Hunter, J. M. Thornton & S. Bailey, pp. 11–24. Warrington: Daresbury Laboratory.

Kleywegt, G. J. & Jones, T. A. (1995b). Where freedom is given, liberties are taken. Structure, 3, 535–540.

Tickle, I. J., Laskowski, R. A. & Moss, D. S. (1998). R_free and the R_free ratio. I. Derivation of expected values of cross-validation residuals used in macromolecular least-squares refinement. Acta Cryst. D54, 547–557.
