Substructure applications

Sheldrick, G. M.; Hauptman, H. A.; Weeks, C. M.; Miller, R.; Usón, I.

doi:10.1107/97809553602060000689

International Tables for Crystallography
Volume F
Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F. ch. 16.1, pp. 343-344

Section 16.1.8.6. Substructure applications

G. M. Sheldrick,^c H. A. Hauptman,^b C. M. Weeks,^b* R. Miller^b and I. Usón^a

^a Institut für Anorganische Chemie, Universität Göttingen, Tammannstrasse 4, D-37077 Göttingen, Germany, ^b Hauptman–Woodward Medical Research Institute, Inc., 73 High Street, Buffalo, NY 14203-1196, USA, and ^c Lehrstuhl für Strukturchemie, Universität Göttingen, Tammannstrasse 4, D-37077 Göttingen, Germany
Correspondence e-mail: weeks@orion.hwi.buffalo.edu

It has been known for some time that conventional direct methods can be a valuable tool for locating heavy-atom substructures using isomorphous (Wilson, 1978) and anomalous (Mukherjee et al., 1989) difference structure factors. Experience has shown that successful substructure applications depend strongly on the accuracy of the difference magnitudes. As the technology for producing selenomethionine-substituted proteins and collecting accurate multiple-wavelength anomalous diffraction (MAD) data has improved (Hendrickson & Ogata, 1997; Smith, 1998), there has been an increased need to locate many selenium sites. For larger structures (e.g. more than about 30 Se atoms), automated Patterson interpretation methods can be expected to run into difficulties, since the number of unique peaks to be analysed increases with the square of the number of atoms. Experimentally measured difference data are only an approximation to the data for the hypothetical substructure, and it is reasonable to expect that conventional direct methods might run into difficulties sooner when applied to such data. Dual-space direct methods provide a more robust foundation for handling such data, which are often extremely noisy. They have the added advantage that the expected number of Se atoms, $N_u$, which is usually known, can be exploited directly by picking the top $N_u$ peaks. Successful applications require great care in data processing, especially if the $F_{A}$ values resulting from a MAD experiment are to be used.
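
The two counting arguments in this paragraph can be illustrated with a minimal Python sketch. It is not taken from SnB or SHELXD. The function names and the peak-list layout (a list of `(height, position)` tuples) are hypothetical, and the vector count assumes `n_atoms` atoms in the unit cell with no overlapping vectors, each centrosymmetric pair of Patterson vectors being counted once.

```python
import heapq


def patterson_vector_count(n_atoms):
    """Non-origin interatomic vectors for n_atoms atoms in the cell.

    Assumption: no accidental overlap; u and -u are counted as one peak.
    """
    return n_atoms * (n_atoms - 1) // 2


def pick_top_peaks(peaks, n_expected):
    """Return the n_expected highest peaks from (height, position) tuples."""
    return heapq.nlargest(n_expected, peaks, key=lambda p: p[0])


# Doubling the number of atoms roughly quadruples the number of vectors.
print(patterson_vector_count(30), patterson_vector_count(60))

# Hypothetical peak list: keep only as many peaks as sites are expected.
peaks = [(5.0, "a"), (9.1, "b"), (2.3, "c"), (7.7, "d")]
print(pick_top_peaks(peaks, 2))
```

Under these assumptions, 30 atoms give 435 vectors and 60 atoms give 1770, whereas the peak-picking step only ever has to rank a density map and keep $N_u$ entries.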

All successful applications of SnB to previously unknown SeMet data sets, as reported in Table 16.1.8.1, actually involved the use of peak-wavelength anomalous difference data $(|E_{\Delta}|)$. The amount of data available for substructure problems is much larger than for full-structure problems with a comparable number of atoms to be located. Consequently, the user can afford to be stringent in eliminating data with uncertain measurements. Guidelines for rejecting uncertain data have been suggested (Smith et al., 1998). Consideration should be limited to those data pairs $(|E_{1}|, |E_{2}|)$ [i.e., isomorphous pairs $(|E_{\rm nat}|, |E_{\rm der}|)$ and anomalous pairs $(|E_{+{\bf H}}|, |E_{-{\bf H}}|)$] for which

$$\min \left[|E_{1}| / \sigma (|E_{1}|), |E_{2}| / \sigma (|E_{2}|)\right] \geq x_{\min} \eqno(16.1.8.2)$$

and

$${\bigl||E_{1}| - |E_{2}|\bigr| \over [\sigma^{2} (|E_{1}|) + \sigma^{2} (|E_{2}|)]^{1/2}} \geq y_{\min}, \eqno(16.1.8.3)$$

where typically $x_{\min} = 3$ and $y_{\min} = 1$. The final choice of maximum resolution to be used should be based on inspection of the spherical shell averages $\langle |E_{\Delta}|^{2} \rangle_s$ versus $\langle s \rangle$. The purpose of this precaution is to avoid spuriously large $|E_{\Delta}|$ values for high-resolution data pairs measured with large uncertainties due to imperfect isomorphism or general fall-off of scattering intensity with increasing scattering angle. Only those $|E_{\Delta}|$ for which

$$|E_{\Delta}| / \sigma (|E_{\Delta}|) \geq z_{\min} \eqno(16.1.8.4)$$

(typically $z_{\min} = 3$) should be deemed sufficiently reliable for subsequent phasing. The probability of very large difference $|E|$'s (e.g. $> 5$) is remote, and data sets that appear to have many such measurements should be examined critically for measurement errors. If a few such data remain even after the adoption of rigorous rejection criteria, it may be best to eliminate them individually. A later paper (Blessing & Smith, 1999) elaborates further data-selection criteria.
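
The rejection criteria (16.1.8.2) to (16.1.8.4) translate directly into a few comparisons. The following Python sketch is illustrative only and is not the code used by SnB or by the cited authors. The function names are hypothetical, the default thresholds are the typical values quoted above, and the outlier limit of 5 is taken from the remark on very large difference $|E|$'s.

```python
import math


def pair_is_usable(e1, sig1, e2, sig2, x_min=3.0, y_min=1.0):
    """Apply criteria (16.1.8.2) and (16.1.8.3) to one data pair."""
    # (16.1.8.2): both members of the pair must be significantly measured.
    strong = min(e1 / sig1, e2 / sig2) >= x_min
    # (16.1.8.3): the difference must be significant relative to the
    # combined standard uncertainty sqrt(sig1^2 + sig2^2).
    significant = abs(e1 - e2) / math.hypot(sig1, sig2) >= y_min
    return strong and significant


def difference_is_reliable(e_delta, sig_delta, z_min=3.0):
    """Apply criterion (16.1.8.4) to one normalized difference magnitude."""
    return e_delta / sig_delta >= z_min


def flag_outliers(e_deltas, e_max=5.0):
    """Indices of suspiciously large differences, for individual inspection."""
    return [i for i, e in enumerate(e_deltas) if e > e_max]
```

For example, `pair_is_usable(2.0, 0.1, 1.5, 0.1)` accepts the pair, whereas `pair_is_usable(2.0, 0.1, 1.95, 0.1)` rejects it because the difference is smaller than its combined uncertainty. Outliers are flagged rather than removed automatically, in keeping with the advice to examine such data critically.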

On the other hand, it is also important that the phase:invariant ratio be maintained at 1:10 in order to ensure that the phases are overdetermined. Since the largest $|E|$'s for the substructure cell are more widely separated than they are in a true small-molecule cell, the relative number of possible triplets involving the largest reciprocal-lattice vectors may turn out to be too small. Consequently, a relatively small number of substructure phases (e.g. $10N_u$) may not have a sufficient number (i.e., $100N_u$) of invariants. Since the number of triplets increases rapidly with the number of reflections considered, the appropriate action in such cases is to increase the number of reflections as suggested in Table 16.1.7.1. This will typically produce the desired overdetermination.
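
A small Python sketch can make the bookkeeping concrete. It computes the target numbers of phases and invariants for a given $N_u$ and counts the three-phase invariants $\varphi_{\bf H} + \varphi_{\bf K} + \varphi_{-{\bf H}-{\bf K}}$ available within a list of reflections. It is a brute-force illustration and not the invariant generator of any particular program. The function names are hypothetical, space group $P1$ is assumed so that only Friedel mates are added, and a triplet and its Friedel inverse are counted once.

```python
from itertools import combinations


def invariant_targets(n_sites, phases_per_site=10, invariants_per_phase=10):
    """Numbers of phases and invariants for a 1:10 phase:invariant ratio."""
    n_phases = phases_per_site * n_sites
    return n_phases, invariants_per_phase * n_phases


def count_triplets(reflections):
    """Count distinct triplets H + K + L = 0 among the given reflections.

    Assumption: space group P1; Friedel mates are generated here.
    """
    full = set()
    for h in reflections:
        full.add(tuple(h))
        full.add(tuple(-x for x in h))
    full.discard((0, 0, 0))

    seen = set()
    for h, k in combinations(sorted(full), 2):
        l = tuple(-(a + b) for a, b in zip(h, k))
        if l in full:
            trip = tuple(sorted((h, k, l)))
            inverse = tuple(sorted(tuple(-x for x in v) for v in trip))
            # A triplet and its Friedel inverse carry the same information.
            seen.add(min(trip, inverse))
    return len(seen)


print(invariant_targets(30))
print(count_triplets([(1, 0, 0), (0, 1, 0), (1, 1, 0)]))
print(count_triplets([(1, 0, 0), (0, 1, 0), (1, 1, 0), (2, 0, 0)]))
```

If the count returned for the strongest $10N_u$ reflections falls short of the second target value, the reflection list would be extended and the count repeated. The pairwise loop scales quadratically with the number of reflections, which is acceptable for a sketch but not for production use.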

It is rare for Se atoms to be closer to each other than 5 Å, and the application of SnB to AdoHcy data truncated to 4 and 5 Å has been successful. Success rates were lower for lower-resolution data, but the CPU time required per trial was also reduced, primarily because much smaller Fourier grids were necessary. Consequently, there was no net increase in the CPU time needed to find a solution.
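
The dependence of Fourier grid size on resolution can be estimated with a short Python sketch. It assumes a grid spacing of about one third of the resolution limit, which is a common rule of thumb and not a value stated in this section. The cell edges are hypothetical and are not those of AdoHcy. Real programs also round each dimension up to a size convenient for the fast Fourier transform.

```python
import math


def grid_points(cell_edges, d_min, points_per_resolution=3.0):
    """Approximate number of grid points for an orthogonal cell.

    Assumption: grid spacing of d_min / points_per_resolution along each edge.
    """
    nx, ny, nz = (math.ceil(points_per_resolution * a / d_min) for a in cell_edges)
    return nx * ny * nz


cell = (60.0, 60.0, 120.0)  # hypothetical cell edges in angstroms
fine = grid_points(cell, 3.0)
coarse = grid_points(cell, 5.0)
print(fine, coarse, fine / coarse)
```

Because the number of grid points scales roughly as $d_{\min}^{-3}$, truncating the data from 3 to 5 Å in this example shrinks the grid by a factor of about 4.6, which is the kind of saving per trial that can offset a lower success rate.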

A special version of SHELXD is being developed that makes extensive use of the Patterson function both in generating starting atoms and in providing an independent figure of merit. It has already successfully located the anomalous scatterers in a number of structures using MAD $F_{A}$ data or simple anomalous differences. A recent example was the unexpected location of 17 anomalous scatterers (sulfur atoms and chloride ions) from the 1.5 Å-wavelength anomalous differences of tetragonal HEW lysozyme (Dauter et al., 1999).

References

Blessing, R. H. & Smith, G. D. (1999). Difference structure-factor normalization for heavy-atom or anomalous-scattering substructure determinations. J. Appl. Cryst. 32, 664–670.

Dauter, Z., Dauter, M., de La Fortelle, E., Bricogne, G. & Sheldrick, G. M. (1999). Can anomalous signal of sulfur become a tool for solving protein crystal structures? J. Mol. Biol. 289, 83–92.

Hendrickson, W. A. & Ogata, C. M. (1997). Phase determination from multiwavelength anomalous diffraction measurements. Methods Enzymol. 276, 494–523.

Mukherjee, A. K., Helliwell, J. R. & Main, P. (1989). The use of MULTAN to locate the positions of anomalous scatterers. Acta Cryst. A45, 715–718.

Smith, G. D., Nagar, B., Rini, J. M., Hauptman, H. A. & Blessing, R. H. (1998). The use of SnB to determine an anomalous scattering substructure. Acta Cryst. D54, 799–804.

Smith, J. L. (1998). Multiwavelength anomalous diffraction in macromolecular crystallography. In Direct methods for solving macromolecular structures, edited by S. Fortier, pp. 211–225. Dordrecht: Kluwer Academic Publishers.

Wilson, K. S. (1978). The application of MULTAN to the analysis of isomorphous derivatives in protein crystallography. Acta Cryst. B34, 1599–1608.
