International Tables for Crystallography
Volume F
Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F. ch. 15.2, p. 327

Section 15.2.3.4. Estimating [\sigma_{A}]

R. J. Read (a)*

(a) Department of Haematology, University of Cambridge, Wellcome Trust Centre for Molecular Mechanisms in Disease, CIMR, Wellcome Trust/MRC Building, Hills Road, Cambridge CB2 2XY, England
Correspondence e-mail: rjr27@cam.ac.uk

Srinivasan (1966) showed that the Sim and Luzzati distributions could be combined into a single distribution that had a particularly elegant form when expressed in terms of normalized structure factors, or E values. This functional form still applies to the general distribution that reflects a variety of sources of error; the only difference is the interpretation placed on the parameters (Read, 1990). If [{\bf F}] and [{\bf F}_{C}] are replaced by the corresponding E values, a parameter [\sigma_{A}] plays the role of D, and [\sigma_{\Delta}^{2}] reduces to ([1 - \sigma_{A}^{2}]). (The parameter [\sigma_{A}] is equivalent to D after correction for model completeness: [\sigma_{A} = D(\Sigma_{P}/\Sigma_{N})^{1/2}].) When the structure factors are normalized, overall scale and B-factor effects are also eliminated. The parameter [\sigma_{A}] that characterizes this probability distribution varies as a function of resolution. It must be deduced from the amplitudes [|{\bf F}_{O}|] and [|{\bf F}_{C}|], since the phase (and thus the phase difference) is unknown.
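As a small worked illustration of the relation between [\sigma_{A}] and D, the following Python sketch evaluates [\sigma_{A} = D(\Sigma_{P}/\Sigma_{N})^{1/2}] and the corresponding variance [1 - \sigma_{A}^{2}]. The function names and the numerical inputs (D = 0.9 and a model accounting for 80% of the total scattering) are hypothetical, chosen only to show the arithmetic.

```python
import math


def sigma_a_from_d(d, sigma_p, sigma_n):
    """Return sigma_A = D * sqrt(Sigma_P / Sigma_N).

    sigma_p and sigma_n stand for Sigma_P (scattering from the partial
    model) and Sigma_N (scattering from the complete structure).
    """
    if not 0.0 < sigma_p <= sigma_n:
        raise ValueError("expected 0 < Sigma_P <= Sigma_N")
    return d * math.sqrt(sigma_p / sigma_n)


def variance_from_sigma_a(sigma_a):
    """Return the variance 1 - sigma_A**2 that applies to E values."""
    return 1.0 - sigma_a * sigma_a


if __name__ == "__main__":
    # Hypothetical inputs: D = 0.9, model contains 80% of the scattering.
    sa = sigma_a_from_d(0.9, 0.8, 1.0)
    print(f"sigma_A = {sa:.3f}, variance = {variance_from_sigma_a(sa):.3f}")
```

With these inputs the incompleteness of the model lowers [\sigma_{A}] below D, and the variance of the normalized distribution grows accordingly.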

A general approach to estimating parameters for probability distributions is to maximize a likelihood function. The likelihood function is the overall joint probability of making the entire set of observations, which is a function of the desired parameters. The parameters that maximize the probability of making the set of observations are the most consistent with the data. The idea of using maximum likelihood to estimate model phase errors was introduced by Lunin & Urzhumtsev (1984), who gave a treatment that was valid for space group P1. In a more general treatment that applies to higher-symmetry space groups, allowance is made for the statistical effects of crystal symmetry (centric zones and differing expected intensity factors) (Read, 1986).

The [\sigma_{A}] values are estimated by maximizing the joint probability of making the set of observations of [|{\bf F}_{O}|]. If the structure factors are all assumed to be independent, the joint probability distribution is the product of all the individual distributions:

[L = \prod\limits_{\bf h} p(|{\bf F}_{O}|;\ |{\bf F}_{C}|).]

The assumption of independence is not completely justified in theory, but the results are fairly accurate in practice. The required probability distribution, [p(|{\bf F}_{O}|;\ |{\bf F}_{C}|)], is derived from [p({\bf F};\ {\bf F}_{C})] by integrating over all possible phase differences and neglecting the errors in [|{\bf F}_{O}|] as a measure of [|{\bf F}|]. The form of this distribution, which is given in other publications (Read, 1986, 1990), differs for centric and acentric reflections. (It is important to note that although the distributions for structure factors are Gaussian, the distributions for amplitudes obtained by integrating out the phase are not.) It is more convenient to deal with a sum than a product, so the log likelihood function is maximized instead. In the program SIGMAA, reciprocal space is divided into spherical shells, and a value of the parameter [\sigma_{A}] is refined for each resolution shell. Details of the algorithm are given elsewhere (Read, 1986).
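The estimation for a single resolution shell can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the SIGMAA algorithm: the amplitude distributions are written in terms of E values in the forms commonly quoted for the acentric (Rice-type, with a modified Bessel function I0) and centric (cosh-type) cases, and should be checked against Read (1986) before any serious use; terms that do not depend on [\sigma_{A}] are dropped; a golden-section search stands in for whatever optimizer the program actually uses. All function names, the search bounds and the simulated test data are hypothetical.

```python
import math
import random


def log_i0(x):
    """Approximate log(I0(x)) for x >= 0 in pure Python."""
    if x < 15.0:
        # Power series: I0(x) = sum_k (x^2/4)^k / (k!)^2
        q = 0.25 * x * x
        term, total, k = 1.0, 1.0, 0
        while term > 1e-16 * total:
            k += 1
            term *= q / (k * k)
            total += term
        return math.log(total)
    # Large-argument asymptotic expansion, kept in log form to avoid overflow.
    r = 1.0 / x
    corr = 1.0 + r * (0.125 + r * (0.0703125 + r * 0.0732421875))
    return x - 0.5 * math.log(2.0 * math.pi * x) + math.log(corr)


def log_likelihood(sigma_a, e_obs, e_calc, centric):
    """Sum of log p(|E_O|; |E_C|), omitting terms independent of sigma_A."""
    var = 1.0 - sigma_a * sigma_a
    total = 0.0
    for eo, ec, is_centric in zip(e_obs, e_calc, centric):
        ss = eo * eo + sigma_a * sigma_a * ec * ec
        if is_centric:
            x = sigma_a * eo * ec / var
            # log(cosh(x)) written so that large x cannot overflow
            log_cosh = x + math.log1p(math.exp(-2.0 * x)) - math.log(2.0)
            total += -0.5 * math.log(var) - ss / (2.0 * var) + log_cosh
        else:
            x = 2.0 * sigma_a * eo * ec / var
            total += -math.log(var) - ss / var + log_i0(x)
    return total


def estimate_sigma_a(e_obs, e_calc, centric, lo=0.001, hi=0.999, tol=1e-4):
    """Maximize the log likelihood over sigma_A by golden-section search."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc = log_likelihood(c, e_obs, e_calc, centric)
    fd = log_likelihood(d, e_obs, e_calc, centric)
    while b - a > tol:
        if fc > fd:  # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = log_likelihood(c, e_obs, e_calc, centric)
        else:        # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = log_likelihood(d, e_obs, e_calc, centric)
    return 0.5 * (a + b)


def simulate_acentric_shell(n, sigma_a, rng):
    """Hypothetical test data: E_O = sigma_A * E_C + Gaussian error."""
    s = math.sqrt(0.5)
    noise = math.sqrt(1.0 - sigma_a * sigma_a)
    e_obs, e_calc = [], []
    for _ in range(n):
        ec = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        delta = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        e_obs.append(abs(sigma_a * ec + noise * delta))
        e_calc.append(abs(ec))
    return e_obs, e_calc


if __name__ == "__main__":
    rng = random.Random(0)
    eo, ec = simulate_acentric_shell(1000, 0.7, rng)
    print(estimate_sigma_a(eo, ec, [False] * len(eo)))
```

Only the amplitudes enter the estimate; the phases used to simulate the data are discarded, mirroring the integration over the unknown phase difference. The golden-section search assumes the log likelihood has a single maximum in the search interval, which is an assumption of this sketch.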

The resolution shells must be thick enough to contain several hundred to a thousand reflections each, in order to provide [\sigma_{A}] estimates with a sufficiently small statistical error. A larger number of shells (fewer reflections per shell) can be used for refined structures, since estimates of [\sigma_{A}] become more precise as the true value approaches 1. If there are sufficient reflections per shell, the estimates will vary smoothly with resolution. As discussed below, the smooth variation with resolution can also be exploited through a restraint that allows [\sigma_{A}] values to be estimated from fewer reflections.
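One simple way to obtain shells with enough reflections is to sort the reflections by resolution and cut the sorted list into groups of roughly equal size. The Python sketch below illustrates this; the function name, the use of 1/d² as the resolution measure and the default minimum of 500 reflections per shell are assumptions made for the example, and the binning scheme actually used by SIGMAA is not described here.

```python
def equal_count_shells(inv_d2, min_per_shell=500):
    """Group reflection indices into resolution shells of similar size.

    inv_d2 holds 1/d^2 for each reflection. The number of shells is the
    largest for which every shell still has at least min_per_shell
    reflections (a single shell is returned if there are too few).
    """
    order = sorted(range(len(inv_d2)), key=lambda i: inv_d2[i])
    n_shells = max(1, len(order) // min_per_shell)
    shells = []
    for k in range(n_shells):
        start = k * len(order) // n_shells
        stop = (k + 1) * len(order) // n_shells
        shells.append(order[start:stop])
    return shells
```

For a refined structure, a smaller `min_per_shell` could be passed to obtain more shells, in line with the remark that [\sigma_{A}] estimates become more precise as the true value approaches 1.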

References

Lunin, V. Yu. & Urzhumtsev, A. G. (1984). Improvement of protein phases by coarse model modification. Acta Cryst. A40, 269–277.
Read, R. J. (1986). Improved Fourier coefficients for maps using phases from partial structures with errors. Acta Cryst. A42, 140–149.
Read, R. J. (1990). Structure-factor probabilities for related structures. Acta Cryst. A46, 900–912.
Srinivasan, R. (1966). Weighting functions for use in the early stages of structure analysis when a part of the structure is known. Acta Cryst. 20, 143–144.