International Tables for Crystallography
Volume F
Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F, ch. 21.1, pp. 500–501

Section Merging R values

G. J. Kleywegt^a*

^a Department of Cell and Molecular Biology, Uppsala University, Biomedical Centre, Box 596, SE-751 24 Uppsala, Sweden
Correspondence e-mail:


Possibly the most common mistake in papers describing protein crystal structures is an incorrectly quoted formula for the merging R value (calculated during data reduction),
\[R_{\rm merge} = \sum_{h} \sum_{i} | I_{h,i} - \langle I_{h} \rangle | \Big/ \sum_{h} \sum_{i} I_{h,i},\]
where the outer sum (h) is over the unique reflections (in most implementations, only those reflections that have been measured more than once are included in the summations) and the inner sum (i) is over the set of independent observations of each unique reflection (Drenth, 1994). This statistic is supposed to reflect the spread of multiple observations of the intensity of the unique reflections (where the multiple observations may derive from symmetry-related reflections, different images or different crystals). Unfortunately, R_merge is a very poor statistic, since its value increases with increasing redundancy (Weiss & Hilgenfeld, 1997; Diederichs & Karplus, 1997), even though the signal-to-noise ratio of the average intensities will be higher as more observations are included (in theory, an N-fold increase of the number of independent observations should improve the signal-to-noise ratio by a factor of N^{1/2}). At high redundancy, the value of R_merge is directly related to the average signal-to-noise ratio (Weiss & Hilgenfeld, 1997): R_merge ≃ 0.8/⟨I/σ(I)⟩.
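As an illustration of the definition above, the following minimal Python sketch evaluates R_merge for a small, made-up set of observations. The function name `r_merge`, the dictionary layout (unique reflection index mapped to its list of observed intensities) and the example intensities are assumptions chosen for this sketch; they do not correspond to any particular data-reduction program. Singly measured reflections are skipped, as most implementations do.

```python
from typing import Dict, Hashable, Sequence


def r_merge(observations: Dict[Hashable, Sequence[float]]) -> float:
    """R_merge for a mapping: unique reflection -> repeated intensity observations."""
    numerator = 0.0    # sum_h sum_i |I_hi - <I_h>|
    denominator = 0.0  # sum_h sum_i I_hi
    for intensities in observations.values():
        if len(intensities) < 2:
            continue  # reflections measured only once carry no spread information
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    if denominator == 0.0:
        raise ValueError("no multiply measured reflections with non-zero total intensity")
    return numerator / denominator


# Hypothetical example data (arbitrary intensity units).
example = {
    (1, 0, 0): [100.0, 110.0, 90.0],
    (0, 1, 0): [50.0, 54.0],
    (0, 0, 1): [20.0],  # measured once, so it is ignored
}
print(f"R_merge = {r_merge(example):.4f}")
```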

Diederichs & Karplus (1997) have suggested a number of alternative measures that lack most of the drawbacks of R_merge. Their statistic R_meas is similar to R_merge, but includes a correction for redundancy (m),
\[R_{\rm meas} = \sum_{h} [m/(m-1)]^{1/2} \sum_{i} | I_{h,i} - \langle I_{h} \rangle | \Big/ \sum_{h} \sum_{i} I_{h,i}.\]
Another statistic, the pooled coefficient of variation (PCV), is defined as
\[{\rm PCV} = \sum_{h} \Big\{ [1/(m-1)] \sum_{i} (I_{h,i} - \langle I_{h} \rangle)^{2} \Big\}^{1/2} \Big/ \sum_{h} \langle I_{h} \rangle.\]
Since PCV = 1/⟨I/σ(I)⟩, this quantity also provides an indication as to whether the standard deviations σ(I) have been estimated appropriately. Finally, the statistic R_mrgd-F, used for assessing the quality of the reduced data, enables a direct comparison of this merging R value with the refinement residuals R and R_free.
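The two formulas above can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not code from Diederichs & Karplus: the function names `r_meas` and `pcv`, the input layout (unique reflection mapped to its list of observed intensities) and the example numbers are hypothetical, and m is taken as the number of observations of each individual reflection. Reflections with m < 2 are skipped because the factor 1/(m - 1) is undefined for them.

```python
import math
from typing import Dict, Hashable, List, Sequence


def _multiply_measured(observations: Dict[Hashable, Sequence[float]]) -> List[List[float]]:
    """Keep only reflections observed at least twice (m >= 2)."""
    kept = [list(obs) for obs in observations.values() if len(obs) >= 2]
    if not kept:
        raise ValueError("no multiply measured reflections")
    return kept


def r_meas(observations: Dict[Hashable, Sequence[float]]) -> float:
    """Redundancy-corrected merging R value."""
    numerator = denominator = 0.0
    for obs in _multiply_measured(observations):
        m = len(obs)
        mean_i = sum(obs) / m
        # [m/(m-1)]^(1/2) weights the absolute deviations of this reflection
        numerator += math.sqrt(m / (m - 1)) * sum(abs(i - mean_i) for i in obs)
        denominator += sum(obs)
    return numerator / denominator


def pcv(observations: Dict[Hashable, Sequence[float]]) -> float:
    """Pooled coefficient of variation."""
    numerator = denominator = 0.0
    for obs in _multiply_measured(observations):
        m = len(obs)
        mean_i = sum(obs) / m
        # sample standard deviation of the observations of this reflection
        numerator += math.sqrt(sum((i - mean_i) ** 2 for i in obs) / (m - 1))
        denominator += mean_i
    return numerator / denominator


# Hypothetical example data (arbitrary intensity units).
example = {
    (1, 0, 0): [100.0, 110.0, 90.0],
    (0, 1, 0): [50.0, 54.0],
    (0, 0, 1): [20.0],  # measured once, so it is ignored
}
print(f"R_meas = {r_meas(example):.4f}, PCV = {pcv(example):.4f}")
```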

Ideally, merging statistics should be quoted for all resolution shells (which should not be too broad), as well as for the entire data set. However, as a minimum, the values for the two extreme (low- and high-resolution) shells and for the entire data set should be reported.
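A per-shell report of this kind could be produced along the following lines. The sketch assumes that each unique reflection comes with its resolution (d-spacing in Å) and its list of observed intensities; the function names, the shell limits and the example data are hypothetical choices for illustration, and R_merge is used only as an example statistic.

```python
from typing import List, Optional, Sequence, Tuple

# (d-spacing in Å, repeated intensity observations of one unique reflection)
Reflection = Tuple[float, Sequence[float]]


def r_merge_of_reflections(reflections: Sequence[Reflection]) -> Optional[float]:
    """R_merge over multiply measured reflections; None if nothing can be summed."""
    numerator = denominator = 0.0
    for _, obs in reflections:
        if len(obs) < 2:
            continue
        mean_i = sum(obs) / len(obs)
        numerator += sum(abs(i - mean_i) for i in obs)
        denominator += sum(obs)
    return numerator / denominator if denominator > 0.0 else None


def r_merge_by_shell(
    reflections: Sequence[Reflection], edges: Sequence[float]
) -> List[Tuple[Tuple[float, float], int, Optional[float]]]:
    """Rows of ((d_max, d_min), number of reflections, R_merge): one per shell, then overall.

    `edges` lists the shell limits in Å in descending order, e.g. [50.0, 4.0, 2.0].
    """
    rows = []
    for d_max, d_min in zip(edges[:-1], edges[1:]):
        shell = [r for r in reflections if d_min <= r[0] < d_max]
        rows.append(((d_max, d_min), len(shell), r_merge_of_reflections(shell)))
    in_range = [r for r in reflections if edges[-1] <= r[0] < edges[0]]
    rows.append(((edges[0], edges[-1]), len(in_range), r_merge_of_reflections(in_range)))
    return rows


# Hypothetical example data: two low-resolution and two high-resolution reflections.
data = [
    (10.0, [1000.0, 1020.0, 980.0]),
    (5.0, [400.0, 410.0]),
    (2.5, [50.0, 60.0, 40.0]),
    (2.1, [20.0, 30.0]),
]
rows = r_merge_by_shell(data, [50.0, 4.0, 2.0])
for (d_max, d_min), n, r in rows:
    print(f"{d_max:5.1f} - {d_min:4.1f} Å  n = {n}  R_merge = {r:.4f}")
```

In this made-up example the high-resolution shell has a far larger R_merge than the overall value, which is dominated by the strong low-resolution reflections; this is why the overall figure alone says little about the weakest data.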


Diederichs, K. & Karplus, P. A. (1997). Improved R-factors for diffraction data analysis in macromolecular crystallography. Nature Struct. Biol. 4, 269–275.
Drenth, J. (1994). Principles of protein X-ray crystallography. New York: Springer-Verlag.
Weiss, M. S. & Hilgenfeld, R. (1997). On the use of the merging R factor as a quality indicator for X-ray data. J. Appl. Cryst. 30, 203–205.
