Reality

Stubbs, M. T.; Huber, R.

doi:10.1107/97809553602060000680

International Tables for Crystallography
Volume F
Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold


International Tables for Crystallography (2006). Vol. F. ch. 12.2, pp. 258-259

Section 12.2.4. Reality

M. T. Stubbs^a* and R. Huber^b

^a Institut für Pharmazeutische Chemie der Philipps-Universität Marburg, Marbacher Weg 6, D-35032 Marburg, Germany, and ^b Max-Planck-Institut für Biochemie, 82152 Martinsried, Germany
Correspondence e-mail: stubbs@mailer.uni-marburg.de

12.2.4. Reality


12.2.4.1. Treatment of errors


Until now, we have dealt with cases involving perfect data. Although this ideal may now be attainable using MAD techniques, this is not necessarily the usual laboratory situation. In the first place, it is necessary to scale the derivative data $[F_{PH}]$ to the native $[F_{P}]$. One of the most common scaling procedures is based on the expected statistical dependence of intensity on resolution (Wilson, 1949). This may not be particularly accurate when only low-resolution data are available, in which case a scaling through equating the Patterson origin peaks of native and derivative sets may provide better results (Rogers, 1965).
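The two scaling approaches mentioned above can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure of any particular program: the function names, the use of resolution-shell averages, and the straight-line fit of ln(⟨I_P⟩/⟨I_PH⟩) against s² = (sin θ/λ)² are assumptions made for the example.

```python
import math

def patterson_origin_scale(f_native, f_deriv):
    """Scale factor k for derivative amplitudes such that
    sum((k * F_PH)^2) == sum(F_P^2), i.e. equal Patterson origin peaks.
    Assumes both lists hold amplitudes for the same set of reflections."""
    return math.sqrt(sum(f * f for f in f_native) / sum(f * f for f in f_deriv))

def relative_wilson_scale(s2_shells, mean_ip, mean_iph):
    """Least-squares fit of ln(<I_P>/<I_PH>) = ln K - 2 * dB * s^2
    over resolution shells (hypothetical helper, not a library call).
    Returns (K, dB); I_PH * K * exp(-2 * dB * s^2) is on the native scale."""
    ys = [math.log(p / d) for p, d in zip(mean_ip, mean_iph)]
    n = len(s2_shells)
    mean_x = sum(s2_shells) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in s2_shells)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(s2_shells, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope / 2.0
```

With few low-resolution shells the straight-line fit is poorly determined, which is the situation in which the single-parameter origin-peak scale may be the more robust choice.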

A model to account for errors in the data, determination of heavy-atom positions etc. was proposed by Blow & Crick (1959), in which all errors are associated with $[|F_{PH}|_{\rm obs}]$ (Fig. 12.2.4.1); a more detailed treatment has been provided by Terwilliger & Eisenberg (1987). Owing to errors, the triangle formed by $[F_{P}]$, $[F_{PH}]$ and $[F_{H}]$ fails to close. The lack-of-closure error ɛ is a function of the calculated phase angle $[\varphi_{P}]$: $[\varepsilon (\varphi_{P}) = |F_{PH}|_{\rm obs} - |F_{PH}|_{\rm calc}.]$

Once an initial set of heavy-atom positions has been found, it is necessary to refine their parameters (x, y, z, occupancy and thermal parameters). This can be achieved through the minimization of $[{\textstyle\sum\limits_{\bf S}} \varepsilon^{2} / E,]$ where E is the estimated error $[(\simeq \langle (|F_{PH}|_{\rm obs} - |F_{PH}|_{\rm calc})^{2}\rangle)]$ (Rossmann, 1960; Terwilliger & Eisenberg, 1983). This procedure is safest for centrosymmetric reflections (φ restricted to 0 or π) if enough are present. Phase refinement is generally monitored by three factors: $[R_{\rm Cullis} = {\textstyle\sum} \| F_{PH} \pm F_{P}| - |F_{H}|_{\rm calc}|\big/ {\textstyle\sum} |F_{PH} - F_{P}|]$ for centrosymmetric reflections only, for which acceptable values are between 0.4 and 0.6; $[R_{\rm Kraut} = {\textstyle\sum} \|F_{PH}|_{\rm obs} - |F_{PH}|_{\rm calc}| \big/ {\textstyle\sum} |F_{PH}|_{\rm obs},]$ which is useful for monitoring convergence; and the $[\hbox{phasing power} = {\textstyle\sum} |F_{H}|_{\rm calc} /{\textstyle\sum} \|F_{PH}|_{\rm obs} - |F_{PH}|_{\rm calc}|,]$ which should be greater than 1 (if less than 1, then the phase triangle cannot be closed via $[F_{H}]$).
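As an illustration of these quantities, the following Python sketch evaluates the lack of closure for one reflection and the three monitoring statistics for lists of reflections. The function names and argument layout are hypothetical. R_Cullis is evaluated here for centric reflections, and the treatment of the ± sign (taking whichever of the sum or difference of amplitudes lies closer to the calculated heavy-atom amplitude) is an assumption of this example, not a prescription from the text.

```python
import cmath

def lack_of_closure(fp_amp, phi_p, fh_calc, fph_obs):
    """epsilon(phi_P) = |F_PH|obs - |F_PH|calc, where
    |F_PH|calc = |F_P exp(i phi_P) + F_H(calc)| and fh_calc is complex."""
    return fph_obs - abs(cmath.rect(fp_amp, phi_p) + fh_calc)

def r_kraut(fph_obs, fph_calc):
    """sum | |F_PH|obs - |F_PH|calc | / sum |F_PH|obs."""
    return sum(abs(o - c) for o, c in zip(fph_obs, fph_calc)) / sum(fph_obs)

def phasing_power(fh_calc_amp, fph_obs, fph_calc):
    """sum |F_H|calc / sum | |F_PH|obs - |F_PH|calc |."""
    closure = sum(abs(o - c) for o, c in zip(fph_obs, fph_calc))
    return sum(fh_calc_amp) / closure

def r_cullis_centric(fp_amp, fph_amp, fh_calc_amp):
    """Cullis R factor over centric reflections only (amplitudes as input)."""
    num = 0.0
    den = 0.0
    for p, ph, h in zip(fp_amp, fph_amp, fh_calc_amp):
        # Assumption: the +/- is resolved per reflection by choosing the
        # combination closest to |F_H|calc (the '+' case is a sign crossover).
        num += min(abs(abs(ph - p) - h), abs((ph + p) - h))
        den += abs(ph - p)
    return num / den
```

For example, `phasing_power([30, 50], [100, 200], [90, 210])` divides a summed heavy-atom amplitude of 80 by a summed lack of closure of 20.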

Figure 12.2.4.1

The treatment of phase errors. The calculated heavy-atom structure results in a calculated value for both the phase and magnitude of $[F_{H}]$ (red). According to the value of $[\varphi_{P}]$, the triangle $[F_{P}]$ – $[F_{H}]$ – $[F_{PH}]$ will fail to close by an amount ɛ, the lack of closure (green). This gives rise to a phase distribution which is bimodal for a single derivative. The combined probability from a series of derivatives has a most probable phase (the maximum) and a best phase (the centroid of the distribution), for which the overall phase error is minimum.

The resulting phase probability is given by $[P (\varphi_{P}) = \exp \{- \varepsilon^{2} (\varphi_{P}) / 2E^{2}\}.]$ The phases have a minimum error when the best phase $[\varphi_{\rm best}]$, i.e. the centroid of the phase distribution, $[\varphi_{\rm best} = {\textstyle\int} \varphi_{P} P(\varphi_{P})\ \hbox{d}\varphi_{P},]$ is used instead of the most probable phase. The quality of the phases is indicated by the figure of merit m, where $[m = {\textstyle\int} P(\varphi_{P}) \exp (i\varphi_{P})\ \hbox{d}\varphi_{P} \big/ {\textstyle\int} P(\varphi_{P})\ \hbox{d}\varphi_{P}.]$ A value of 1 for m indicates no phase error, a value of 0.5 represents a phase error of about 60°, while a value of 0 means that all phases are equally probable.
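The phase probability and figure of merit can be evaluated numerically by sampling the phase circle. The sketch below is one possible way of doing so, under stated assumptions: the phase is sampled on a uniform grid, the probabilities of several derivatives are combined by multiplication, and the centroid is taken as the complex mean of exp(iφ) weighted by P(φ), so that its argument serves as the best phase and its modulus as m. All function and parameter names are hypothetical.

```python
import cmath
import math

def phase_distribution(fp_amp, derivatives, n_steps=360):
    """Sample P(phi) = prod_j exp(-eps_j(phi)^2 / (2 E_j^2)) on a phase grid.
    derivatives: list of (fh_calc complex, |F_PH|obs, E) per derivative."""
    phis = [2.0 * math.pi * k / n_steps for k in range(n_steps)]
    probs = []
    for phi in phis:
        log_p = 0.0
        for fh_calc, fph_obs, err in derivatives:
            # Lack of closure for this trial protein phase
            eps = fph_obs - abs(cmath.rect(fp_amp, phi) + fh_calc)
            log_p -= eps * eps / (2.0 * err * err)
        probs.append(math.exp(log_p))
    return phis, probs

def best_phase_and_fom(phis, probs):
    """Centroid of the distribution on the unit circle:
    returns (phi_best in radians, figure of merit m)."""
    total = sum(probs)
    centroid = sum(p * cmath.exp(1j * phi) for phi, p in zip(phis, probs)) / total
    return cmath.phase(centroid), abs(centroid)
```

For a single derivative the sampled distribution has two peaks placed symmetrically about the heavy-atom phase, so the centroid falls between them and m drops well below 1; a second derivative with a different heavy-atom phase can remove the ambiguity.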

The best Fourier is calculated from $[\rho_{\rm best}({\bf r}) = (1/V) {\textstyle\sum\limits_{\bf S}} m|F_{P}({\bf S})| \exp \{i\varphi_{P{\rm best}}({\bf S})\},]$ where the electron density should have minimal errors.

12.2.4.2. Automated search procedures


If the derivative shows a high degree of substitution, then the Harker sections become more difficult to interpret. Furthermore, Terwilliger et al. (1987) have shown that the intrinsic noise in the difference Patterson map increases with increasing heavy-atom substitution. It is at this stage that automated procedures are invaluable.

One such automated procedure is implemented in PROTEIN (Steigemann, 1991). The unit cell is scanned for possible heavy-atom sites; for each search point (x, y, z), all possible Harker vectors are calculated, and the difference-Patterson-map values at these points are summed or multiplied. As the origin peak dominates the Patterson function, this region is set to zero. The resulting correlation map should contain peaks at all possible heavy-atom positions. The peak list can then be used to find a set of consistent heavy-atom locations through a subsequent search for difference vectors (cross vectors) between putative sites. It should be possible to locate all major and minor heavy-atom sites through repetition of this procedure. A similar strategy is adopted in the program HEAVY (Terwilliger et al., 1987), but sets of heavy-atom sites are ranked according to the probability that the peaks are not random. The program SOLVE (Terwilliger & Berendzen, 1999) takes this process a stage further, where potential heavy-atom structures are solved and refined to generate an (interpretable) electron density in an automated fashion.
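The Harker-vector scan described above can be sketched as follows. This is a toy illustration and not the implementation used in PROTEIN, HEAVY or SOLVE: the Patterson map is represented by an arbitrary callable `patterson(u)` taking fractional coordinates, the symmetry operators are restricted to diagonal rotations (sufficient for the orthorhombic example P2₁2₁2₁ chosen here), and the origin region is excluded by a radius in fractional units. All names are hypothetical.

```python
import math
from itertools import product

# Symmetry operators of P2(1)2(1)2(1) as (diagonal rotation, translation).
P212121 = [
    ((1, 1, 1), (0.0, 0.0, 0.0)),
    ((-1, -1, 1), (0.5, 0.0, 0.5)),
    ((-1, 1, -1), (0.0, 0.5, 0.5)),
    ((1, -1, -1), (0.5, 0.5, 0.0)),
]

def harker_vectors(site, symops):
    """Vectors r - S(r) between a site and its symmetry mates, modulo 1."""
    vectors = []
    for rot, trans in symops[1:]:  # skip the identity operator
        vectors.append(tuple((x - (r * x + t)) % 1.0
                             for x, r, t in zip(site, rot, trans)))
    return vectors

def frac_dist(u, v=(0.0, 0.0, 0.0)):
    """Shortest periodic distance in fractional units (cell treated as a
    unit cube, a simplification for this sketch)."""
    return math.sqrt(sum(min((a - b) % 1.0, (b - a) % 1.0) ** 2
                         for a, b in zip(u, v)))

def harker_score(site, patterson, symops, origin_radius=0.05, multiply=False):
    """Sum (or product) of Patterson values at the Harker vectors of a site,
    with the region around the origin peak set to zero."""
    values = [0.0 if frac_dist(u) < origin_radius else patterson(u)
              for u in harker_vectors(site, symops)]
    return math.prod(values) if multiply else sum(values)

def harker_search(patterson, symops, n_grid=20, **kwargs):
    """Score every grid point of the cell; best candidates come first."""
    grid = [i / n_grid for i in range(n_grid)]
    scored = [(harker_score(site, patterson, symops, **kwargs), site)
              for site in product(grid, repeat=3)]
    return sorted(scored, reverse=True)
```

The ranked list returned by `harker_search` will contain the true site together with its origin-shifted and enantiomorphic equivalents, which is why a subsequent cross-vector check between putative sites is needed to select a consistent set.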

The search method can also be applied in reciprocal space, where the Fourier transform of the trial heavy-atom structure is calculated, and the resulting $[F_{{H}{\rm calc}}]$ is compared to the measured differences between derivative and native structure-factor amplitudes (Rossmann et al., 1986). In the program XtalView (McRee, 1998), the correlation coefficient between $[|F_{H}|]$ and $[|F_{PH} - F_{P}|]$ is calculated, whilst a correlation between $[F_{H}^{2}]$ and $[(F_{PH} - F_{P})^{2}]$ is used by Badger & Athay (1998). Dumas (1994b,c) calculates the correlation between $[|F_{{H}{\rm calc}}|^{2}]$ and $[|F_{{H}{\rm estimated}}|^{2}]$, based on the estimated lack of isomorphism.
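A reciprocal-space score of this kind can be written compactly. The sketch below is an assumption-laden illustration and does not reproduce any of the cited programs: heavy atoms are treated as point scatterers without form factors or thermal parameters, space-group symmetry is ignored (P1), and the score is a plain Pearson correlation coefficient. Setting `power=2` compares squared quantities instead of amplitudes. All names are hypothetical.

```python
import cmath
import math

def fh_calc(hkl, sites):
    """Structure factor of point heavy atoms in P1.
    sites: list of ((x, y, z) fractional coordinates, occupancy)."""
    total = 0j
    for (x, y, z), occ in sites:
        phase = 2.0 * math.pi * (hkl[0] * x + hkl[1] * y + hkl[2] * z)
        total += occ * cmath.exp(1j * phase)
    return total

def pearson(a, b):
    """Linear correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def score_trial(reflections, sites, power=1):
    """Correlate |F_H|calc with | |F_PH| - |F_P| | for a trial structure.
    reflections: list of (hkl, |F_P|, |F_PH|)."""
    calc = [abs(fh_calc(hkl, sites)) ** power for hkl, _, _ in reflections]
    obs = [abs(fph - fp) ** power for _, fp, fph in reflections]
    return pearson(calc, obs)
```

A trial site or set of sites would be moved through the cell and the configuration giving the highest value of `score_trial` retained.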

Vagin & Teplyakov (1998) have reported a heavy-atom search based on a reciprocal-space translation function. In this case, low-resolution peaks are not removed but weighted down using a Gaussian function. Potential solutions are ranked not only according to their translation-function height, but also through their phasing power, which appears to be a stronger selection criterion.

All these searches are based upon the sequential identification of heavy-atom sites and their incorporation in a heavy-atom partial structure. Problems arise when bogus sites influence the search for further heavy-atom positions. In an attempt to overcome this problem, the heavy-atom search has been reprogrammed using a genetic algorithm, with the Patterson minimum function as a selection criterion (Chang & Lewis, 1994). This approach has the potential to reveal all heavy-atom positions in one calculation, and tests on model data have shown it to be faster than traditional sequential searches.

References

Badger, J. & Athay, R. (1998). Automated and graphical methods for locating heavy-atom sites for isomorphous replacement and multiwavelength anomalous diffraction phase determination. J. Appl. Cryst. 31, 270–274.

Blow, D. M. & Crick, F. H. C. (1959). The treatment of errors in the isomorphous replacement method. Acta Cryst. 12, 794–802.

Chang, G. & Lewis, M. (1994). Using genetic algorithms for solving heavy-atom sites. Acta Cryst. D50, 667–674.

Dumas, P. (1994b). The heavy-atom problem: a statistical analysis. II. Consequences of the a priori knowledge of the noise and heavy-atom powers and use of a correlation function for heavy-atom-site determination. Acta Cryst. A50, 537–546.

Dumas, P. (1994c). The heavy-atom problem: a statistical analysis. II. Consequences of the a priori knowledge of the noise and heavy-atom powers and use of a correlation function for heavy-atom-site determination. Erratum. Acta Cryst. A50, 793.

McRee, D. E. (1998). Practical protein crystallography. San Diego: Academic Press.

Rogers, D. (1965). In Computing methods in crystallography, edited by J. S. Rollett, pp. 133–148. Oxford University Press.

Rossmann, M. G. (1960). The accurate determination of the position and shape of heavy-atom replacement groups in proteins. Acta Cryst. 13, 221–226.

Rossmann, M. G., Arnold, E. & Vriend, G. (1986). Comparison of vector search and feedback methods for finding heavy-atom sites in isomorphous derivatives. Acta Cryst. A42, 325–334.

Steigemann, W. (1991). Recent advances in the PROTEIN program system for the X-ray structure analysis of biological macromolecules. In Crystallographic computing 5: from chemistry to biology, edited by D. Moras, A. D. Podjarny & J. C. Thierry, pp. 115–125. Oxford University Press.

Terwilliger, T. C. & Berendzen, J. (1999). Automated MAD and MIR structure solution. Acta Cryst. D55, 849–861.

Terwilliger, T. C. & Eisenberg, D. (1983). Unbiased three-dimensional refinement of heavy-atom parameters by correlation of origin-removed Patterson functions. Acta Cryst. A39, 813–817.

Terwilliger, T. C. & Eisenberg, D. (1987). Isomorphous replacement: effects of errors on the phase probability distribution. Acta Cryst. A43, 6–13.

Terwilliger, T. C., Kim, S.-H. & Eisenberg, D. (1987). Generalized method of determining heavy-atom positions using the difference Patterson function. Acta Cryst. A43, 1–5.

Vagin, A. & Teplyakov, A. (1998). A translation-function approach for heavy-atom location in macromolecular crystallography. Acta Cryst. D54, 400–402.

Wilson, A. J. C. (1949). The probability distribution of X-ray intensities. Acta Cryst. 2, 318–321.
