Locating heavy-atom sites

Stubbs, M. T.; Huber, R.

doi:10.1107/97809553602060000680

International Tables for Crystallography
Volume F
Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F. ch. 12.2, pp. 256-262
https://doi.org/10.1107/97809553602060000680

Chapter 12.2. Locating heavy-atom sites

M. T. Stubbs^a ^* and R. Huber^b

^a Institut für Pharmazeutische Chemie der Philipps-Universität Marburg, Marbacher Weg 6, D-35032 Marburg, Germany, and ^b Max-Planck-Institut für Biochemie, 82152 Martinsried, Germany
Correspondence e-mail: stubbs@mailer.uni-marburg.de

In order to obtain phase information from isomorphous replacement (or from anomalous dispersion), it is necessary to locate the atomic positions of the heavy-atom (or anomalous) scatterers. The topics covered in this chapter include: the origin of the phase problem; the Patterson function; difference Fourier maps; treatment of errors; automated search procedures; and special complications such as lack of isomorphism, space-group problems and high levels of substitution.

Keywords: Fourier maps; Patterson functions; difference Fourier maps; heavy-atom location; isomorphism; isomorphous replacement; lack of isomorphism; noncrystallographic symmetry; phase problem.

12.2.1. The origin of the phase problem

Once a native data set has been collected, the next task is the solution of the structure. There is one major hurdle: the phase problem. To study objects at the atomic level, we must utilize waves with a wavelength in the ångström range, i.e. X-radiation. X-rays interact with electrons and so provide an image of the electron distribution of the sample. Unfortunately, X-rays are refracted by matter only very weakly, and so it is not possible to construct a lens to view molecules at atomic dimensions.

As shown in Chapter 2.1, the diffraction $[F({\bf S})]$ obtained from an electron-density distribution $[\rho ({\bf r})]$ is given by $[F({\bf S}) = {\textstyle\int}\rho ({\bf r}) \exp \{2\pi i{\bf r}\cdot {\bf S}\}\ \hbox{d}^{3}{\bf r},]$ where S is the scattering vector, which is perpendicular to the reflecting planes and has magnitude $[|{\bf S}| = 2\sin \theta/\lambda]$; θ is the Bragg angle (half the scattering angle) and λ is the wavelength. The diffraction pattern is thus the Fourier transform of the electron density. If we have a crystal with cell parameters a, b and c, then the Laue diffraction conditions require that S lies on the reciprocal lattice, i.e. $[{\bf S} = h{\bf a}^{*} + k{\bf b}^{*} + l{\bf c}^{*}]$, where $[{\bf a}^{*}, {\bf b}^{*}]$ and $[{\bf c}^{*}]$ are the reciprocal-lattice vectors, and h, k and l are the integer indices of the diffracted beam. The transform then becomes a sum over the unit cell, $[F(hkl) = V {\textstyle\sum\limits_{xyz}} \rho (xyz) \exp \{2\pi i(hx + ky + lz)\},]$ where V represents the volume of the unit cell, and x, y and z are the fractional coordinates within that cell in the directions of a, b and c.

Since the diffraction pattern is a Fourier transform of the electron density, it follows that the electron density is an inverse Fourier transform of the diffraction pattern: $[\rho ({\bf r}) = {\textstyle\int} F({\bf S}) \exp \{-2\pi i{\bf r}\cdot {\bf S}\}\ \hbox{d}^{3}{\bf S},]$ $[\rho (xyz) = (1/V) {\textstyle\sum\limits_{hkl}} F(hkl) \exp \{-2\pi i(hx + ky + lz)\}.]$

Thus it should be mathematically straightforward to calculate the electron density from the diffraction pattern. This is, unfortunately, not the case. The function $[F({\bf S})]$ describing the diffracted rays is a complex function with a magnitude $[|F({\bf S})|]$ and a phase $[\varphi({\bf S})]$ . The diffraction experiment measures the intensities $[I({\bf S})]$ , however; the relationship between $[I({\bf S})]$ and $[F({\bf S})]$ is: $[I({\bf S}) = F^{*}({\bf S})\cdot F({\bf S}) = |F({\bf S})|^{2},]$ where $[F^{*}({\bf S})]$ is the complex conjugate of $[F({\bf S})]$ . The measured intensities are related directly to the magnitudes of the diffracted beams; the phase information, however, is lost (Fig. 12.2.1.1 ): this is the origin of the phase problem.
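The loss of phase information can be illustrated numerically. The following is a minimal one-dimensional sketch, not taken from the chapter: the grid size, atom positions and weights are invented for illustration, and NumPy's FFT uses the opposite sign convention in the exponent to the formulas above, which does not affect the argument.

```python
import numpy as np

# Toy 1D "crystal": three point atoms on a 32-point grid.
# Positions and weights are illustrative assumptions, not data from the text.
N = 32
rho = np.zeros(N)
rho[[2, 7, 19]] = [6.0, 8.0, 16.0]

F = np.fft.fft(rho)             # structure factors F(h): complex numbers
amplitude = np.abs(F)           # |F(h)|, obtainable from the measured intensity
phase = np.angle(F)             # phi(h), not recorded in the experiment
intensity = amplitude ** 2      # I(h) = F*(h) F(h) = |F(h)|^2

# With amplitudes AND phases, the inverse transform returns the density.
rho_back = np.fft.ifft(amplitude * np.exp(1j * phase)).real

# With amplitudes only (all phases set to zero), it does not.
rho_no_phase = np.fft.ifft(amplitude).real

print(np.allclose(rho_back, rho))        # True
print(np.allclose(rho_no_phase, rho))    # False
```

The second synthesis is a legitimate Fourier summation, but of the wrong complex numbers; nothing in the intensities alone indicates which phases to use.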

Figure 12.2.1.1

Relationships in diffraction space. The diffraction pattern of an object is the Fourier transform of the electron density, consisting of both amplitudes and phases. What we measure in an X-ray diffraction experiment, however, are the diffracted intensities; the phase information is lost. The Fourier transform of the intensities results in the Patterson map, which is related to the electron density as follows. For any two atoms in the structure, the vector between them, centred at the origin, has a value corresponding to the product of their densities. Thus, the red atom and the yellow atom result in the orange cross vectors, the red and blue atoms result in the magenta cross vectors, and the yellow and blue atoms result in green cross vectors. As each atom has a `cross vector' to itself, a large peak is found at the origin; the Patterson map is centrosymmetric.

There are essentially four ways of overcoming the phase problem (Fig. 12.2.1.2):

(1) the use of isomorphous replacement to influence the diffraction pattern, thereby revealing information about the phases (this chapter);
(2) Patterson search techniques, which in essence allow modelling of the phase distribution of an unknown crystal based upon that of a known molecule (Part 13);
(3) the use of anomalous dispersion, which shifts the imaginary component of $[F({\bf S})]$, allowing an experimental measurement of the phase (Part 14); and
(4) direct methods, which make use of probabilistic relationships between different diffracted rays (Part 16).

Figure 12.2.1.2

The effect of introducing a heavy atom or anomalous scatterer. The native two-atom structure gives rise to two diffraction vectors (green and blue) of equal magnitude but different phase (see Chapter 2.1), with a resultant diffraction vector $[F_{P}]$ (black). Isomorphous replacement of the blue atom by the larger red one gives rise to a diffraction vector of greater magnitude but equivalent phase (red), causing a change in the resultant magnitude $[F_{PH}]$ (and hence the intensity) and in the phase. Introduction of an anomalous scatterer results in a phase shift (lilac) of the diffraction vector, resulting in differing amplitudes and phases for $[F_{PH}({\bf S})]$ and $[F_{PH}(-{\bf S})]$.

The method of isomorphous replacement, by which the first macromolecular structures were solved (Green et al., 1954), remains the most widely used technique for ab initio structure determination. However, the availability of synchrotrons, with their facility for selecting a desired wavelength, and of molecular-biology techniques that allow the direct introduction of anomalous scatterers such as selenium or tellurium into the protein of interest (Hendrickson et al., 1990; Budisa et al., 1997) has shown multiwavelength anomalous dispersion (MAD) to be an exceptionally powerful technique for the solution of novel structures. Patterson search techniques (Rossmann, 1972) are ideal if a similar macromolecular structure is already known, while direct methods are more or less confined to very high resolution data (Sheldrick, 1990).

In order to obtain phase information from isomorphous replacement (or from anomalous dispersion), it is necessary to locate the atomic positions of the heavy-atom (or anomalous) scatterers.

12.2.2. The Patterson function

Although the set of measured intensities contains no information regarding the phases, the Fourier transform of the intensities, the so-called Patterson function, contains valuable information. Patterson (1934 ) showed that the inverse Fourier transform of the intensity, $[P(uvw) = (1/V) {\textstyle\sum\limits_{hkl}} I(hkl) \exp \{-2\pi i(hu + kv + lw)\},]$ is related to the electron density by $[P({\bf u}) = {\textstyle\int} \rho ({\bf r}) \rho ({\bf r} + {\bf u})\ \hbox{d}^{3}{\bf r}.]$

The Patterson function $[P({\bf u})]$ is an autocorrelation function of the density. For every vector u that corresponds to an interatomic vector, $[P({\bf u})]$ will contain a peak (Fig. 12.2.1.1 ). These are some properties of the Patterson function:

(1) Every atom makes an `interatomic vector' with itself, and therefore the origin peak, $[P({\bf 0}) = \textstyle\sum \rho^{2}({\bf r})]$ , dominates the Patterson function. This origin peak can be `removed' through subtraction of the average intensity from I(hkl) before Fourier transformation.
(2) For every vector between $[\rho_{i}({\bf r}_{i})]$ and $[\rho_{j}({\bf r}_{j})]$ , the same value (i.e. their product) is found for $[\rho_{j}({\bf r}_{j})]$ to $[\rho_{i}({\bf r}_{i})]$ , and so the Patterson map is centrosymmetric.
(3) For a structure consisting of n atoms, there are $[n(n - 1)]$ non-origin cross vectors ($[n^{2}]$ vectors in total, counting the n self-vectors at the origin), and so the Patterson function is extremely crowded.

For simple crystals, the Patterson map can be used to solve the structure directly. For macromolecular structures, the Patterson map provides a vehicle for solving the phase problem.
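The properties listed above can be checked on a small example. The sketch below is a one-dimensional toy with invented atom positions and weights; it computes the Patterson function as the inverse transform of the intensities, compares it with the direct autocorrelation of the density, and removes the origin peak by subtracting the mean intensity.

```python
import numpy as np

# Illustrative 1D structure: positions and weights are assumptions for this sketch.
N = 32
positions = [2, 7, 19]
weights = [6.0, 8.0, 16.0]
rho = np.zeros(N)
rho[positions] = weights

# Patterson function: inverse Fourier transform of the intensities I(h) = |F(h)|^2
intensity = np.abs(np.fft.fft(rho)) ** 2
patterson = np.fft.ifft(intensity).real

# Direct autocorrelation P(u) = sum_r rho(r) rho(r + u), with periodic wrap-around
autocorr = np.array([np.sum(rho * np.roll(rho, -u)) for u in range(N)])

# Origin removal: subtract the average intensity before transformation
patterson_no_origin = np.fft.ifft(intensity - intensity.mean()).real

n = len(positions)
n_cross_peaks = int(np.sum(patterson[1:] > 1e-6))

print(np.allclose(patterson, autocorr))              # Patterson equals autocorrelation
print(np.isclose(patterson[0], np.sum(rho ** 2)))    # origin peak is the sum of rho^2
print(n_cross_peaks == n * (n - 1))                  # n(n - 1) non-origin peaks
```

In this example the three atoms give six non-origin peaks, at separations of ±5, ±12 and ±17 grid points, each with a height equal to the product of the two atomic weights involved.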

If the crystal contains rotational symmetry elements, then the cross vectors between $[\rho_{i}({\bf r}_{i})]$ and its symmetry mate lie on a plane perpendicular to the symmetry axis – the Harker section (Harker, 1956 ). By way of example, the space group $[P2_{1}]$ has two symmetry-related positions (Fig. 12.2.2.1 ), $[(x, y, z)\ \hbox{and}\ (-x, y + {\textstyle{1 \over 2}}, -z).]$

Figure 12.2.2.1

The Patterson map with symmetry. When the crystal unit cell contains more than one molecule, then additional cross vectors will be formed between differing molecules. If these are related by crystallographic symmetry, there is a geometrical relationship between cross peaks. In this diagram, the peaks of Fig. 12.2.1.1 are supplemented by those between atoms of symmetry-related molecules. The red, yellow and blue peaks of the resulting Patterson function represent those between same atoms (i.e. red to red, yellow to yellow and blue to blue) related by symmetry. These peaks are found on a Harker section.

Cross vectors between symmetry-related points will therefore have the form $[(2x, {\textstyle{1 \over 2}}, 2z),]$ i.e. all cross vectors lie on the plane $[v = {1 \over 2}]$ . For space group $[P2_{1}2_{1}2_{1}]$ , the general coordinates $[\displaylines{(x, y, z), (x + {\textstyle{1 \over 2}}, -y + {\textstyle{1 \over 2}}, -z), (-x + {\textstyle{1 \over 2}}, -y, z + {\textstyle{1 \over 2}}),\cr (-x, y + {\textstyle{1 \over 2}}, -z + {\textstyle{1 \over 2}})}]$ give rise to cross vectors $[({\textstyle{1 \over 2}}, 2y + {\textstyle{1 \over 2}}, 2z), (2x + {\textstyle{1 \over 2}}, 2y, {\textstyle{1 \over 2}}), (2x, {\textstyle{1 \over 2}}, 2z + {\textstyle{1 \over 2}}),]$ i.e. there are three Harker sections: $[u = {1 \over 2}]$ , $[v = {1 \over 2}]$ and $[w = {1 \over 2}]$ . Peaks occurring on the Harker sections must reduce to a self-consistent set of coordinates (x, y, z), allowing reconstruction of the atomic positions.
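As a worked check of the $[P2_{1}2_{1}2_{1}]$ case, the sketch below generates the four general positions listed above for an arbitrary site and forms all difference vectors between them, using exact fractions. The site coordinates and the helper names `symmetry_mates` and `harker_vectors` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction as Fr
from itertools import permutations

HALF = Fr(1, 2)

# General positions of P2_1 2_1 2_1 as given in the text,
# written as (sign, shift) for each of x, y, z.
SYMOPS = [
    ((1, 0), (1, 0), (1, 0)),              # x, y, z
    ((1, HALF), (-1, HALF), (-1, 0)),      # x+1/2, -y+1/2, -z
    ((-1, HALF), (-1, 0), (1, HALF)),      # -x+1/2, -y, z+1/2
    ((-1, 0), (1, HALF), (-1, HALF)),      # -x, y+1/2, -z+1/2
]

def symmetry_mates(site):
    """All symmetry-equivalent positions of a site, reduced to the unit cell."""
    return [tuple((s * c + t) % 1 for (s, t), c in zip(op, site)) for op in SYMOPS]

def harker_vectors(site):
    """Set of difference vectors between distinct symmetry mates (mod 1)."""
    mates = symmetry_mates(site)
    return {tuple((a - b) % 1 for a, b in zip(p, q))
            for p, q in permutations(mates, 2)}

site = (Fr(1, 10), Fr(1, 5), Fr(3, 10))    # arbitrary illustrative site
vectors = harker_vectors(site)

for v in sorted(vectors):
    print(tuple(str(c) for c in v))

# Every vector has exactly one coordinate equal to 1/2,
# i.e. it lies on one of the sections u = 1/2, v = 1/2 or w = 1/2.
print(all(v.count(HALF) == 1 for v in vectors))
```

For this site the vector $[({\textstyle{1 \over 2}}, 2y + {\textstyle{1 \over 2}}, 2z)]$ evaluates to (1/2, 9/10, 3/5), and reading x, y and z back from the peaks on the three sections is the step described in the text as reducing the peaks to a self-consistent set of coordinates.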

If we have two isomorphous (see below) data sets $[F_{PH}]$ and $[F_{P}]$ , then the difference in the two Patterson functions, $[P_{PH} - P_{P} = {\textstyle\int} [F_{PH}^{2} ({\bf S}) - F_{P}^{2} ({\bf S})] \exp \{-2\pi i{\bf r}\cdot {\bf S}\}\ \hbox{d}^{3}{\bf S},]$ will deliver information about the heavy-atom structure. Such a difference function gives rise to non-negligible peaks arising from interference between the $[F_{H}]$ and $[F_{P}]$ terms, however (Perutz, 1956 ). Rossmann (1960 ) showed that these interference terms could be reduced through calculation of the modified Patterson function $[P_{H} = {\textstyle\int} [F_{PH}({\bf S}) - F_{P} ({\bf S})]^{2} \exp \{-2\pi i{\bf r}\cdot {\bf S}\}\ \hbox{d}^{3}{\bf S}.]$

In the case of a single-site derivative, peaks should occur only at the Harker vectors corresponding to the heavy-atom position. Even so, there is a choice of positions for the heavy atom: e.g., in the $[P2_{1}2_{1}2_{1}]$ case, coordinates $[(\pm x + \xi, \pm y + \nu, \pm z + \zeta)]$, where ξ, ν and ζ can each take the value 0 or $[{1 \over 2}]$, will all give rise to the same Harker vectors. This in itself is not a problem, relating to equivalent choices of origin and of handedness, but has important ramifications for multisite derivatives or multiple isomorphous replacement (see below).
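This ambiguity can be verified by enumeration. The sketch below, which uses an arbitrary illustrative site and hypothetical helper names, applies every combination of sign changes and half-cell shifts to the site and confirms that each alternative produces the same set of Harker vectors in $[P2_{1}2_{1}2_{1}]$.

```python
from fractions import Fraction as Fr
from itertools import permutations, product

HALF = Fr(1, 2)

# General positions of P2_1 2_1 2_1 as (sign, shift) per coordinate.
SYMOPS = [
    ((1, 0), (1, 0), (1, 0)),
    ((1, HALF), (-1, HALF), (-1, 0)),
    ((-1, HALF), (-1, 0), (1, HALF)),
    ((-1, 0), (1, HALF), (-1, HALF)),
]

def harker_vectors(site):
    """Set of difference vectors between distinct symmetry mates (mod 1)."""
    mates = [tuple((s * c + t) % 1 for (s, t), c in zip(op, site)) for op in SYMOPS]
    return {tuple((a - b) % 1 for a, b in zip(p, q))
            for p, q in permutations(mates, 2)}

site = (Fr(1, 10), Fr(1, 5), Fr(3, 10))    # arbitrary illustrative site
reference = harker_vectors(site)

# All (+-x + xi, +-y + nu, +-z + zeta) with xi, nu, zeta in {0, 1/2}
alternatives = [
    tuple((sign * c + shift) % 1 for sign, shift, c in zip(signs, shifts, site))
    for signs in product((1, -1), repeat=3)
    for shifts in product((0, HALF), repeat=3)
]

n_same = sum(harker_vectors(alt) == reference for alt in alternatives)
print(len(alternatives), n_same)    # 64 64
```

The 64 sign and shift combinations collapse to fewer distinct choices once the four symmetry operations of the space group are taken into account, but none of them can be distinguished from the Harker vectors of a single site alone.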

If there is more than one site, then there will be two sets of peaks: one set corresponding to the Harker sections (self-vector set) and one set corresponding to the difference vectors between different heavy-atom sites (the cross-vector set). In this case, the choice of one heavy-atom position $[(x_{H1}, y_{H1}, z_{H1})]$ determines the origin and the handedness to which all other peaks must correspond. Thus, in the $[P2_{1}2_{1}2_{1}]$ example, only one cross vector will occur for $[(x_{H1} \pm x_{H2} + \xi, y_{H1} \pm y_{H2} + \nu, z_{H1} \pm z_{H2} + \zeta).]$

An alternative to the Harker-vector approach is Patterson-vector superposition (Sheldrick et al., 1993; Richardson & Jacobson, 1987). The Patterson map contains several images of the structure that have been shifted by interatomic vectors (Fig. 12.2.2.2). If this structure is relatively simple (as is to be hoped for in a `normal' heavy-atom derivative), then it should be possible to deconvolute the superimposed structures by vector shifts (Buerger, 1959).

Figure 12.2.2.2

The vector superposition method. The Patterson map of Fig. 12.2.1.1 can be regarded as the superposition of the structure (and its inverse), with each of its atoms placed alternately at the origin. By shifting each peak of the Patterson function to the origin and calculating the correlation of all remaining peaks with the unshifted map, it is possible to deconvolute the Patterson function.

12.2.3. The difference Fourier

Once the heavy-atom positions have been found, they can be used to calculate approximate phases and Fourier maps. Ideally, difference Fourier maps calculated with phases from a single site should reveal the other positions determined from the Harker search procedure. This ensures that all heavy-atom positions correspond to a single origin and hand. Similarly, phases calculated from derivative H1 should reveal the heavy-atom structure for derivative H2. Merging and refinement of all phase information will result in a phase set that can be used to solve the structure.

12.2.4. Reality

12.2.4.1. Treatment of errors

Until now, we have dealt with cases involving perfect data. Although this ideal may now be attainable using MAD techniques, this is not necessarily the usual laboratory situation. In the first place, it is necessary to scale the derivative data $[F_{PH}]$ to the native $[F_{P}]$. One of the most common scaling procedures is based on the expected statistical dependence of intensity on resolution (Wilson, 1949). This may not be particularly accurate when only low-resolution data are available, in which case scaling by equating the Patterson origin peaks of native and derivative sets may provide better results (Rogers, 1965).

A model to account for errors in the data, determination of heavy-atom positions etc. was proposed by Blow & Crick (1959), in which all errors are associated with $[|F_{PH}|_{\rm obs}]$ (Fig. 12.2.4.1); a more detailed treatment has been provided by Terwilliger & Eisenberg (1987). Owing to errors, the triangle formed by $[F_{P}]$, $[F_{PH}]$ and $[F_{H}]$ fails to close. The lack-of-closure error ɛ is a function of the calculated phase angle $[\varphi_{P}]$: $[\varepsilon (\varphi_{P}) = |F_{PH}|_{\rm obs} - |F_{PH}|_{\rm calc}.]$

Once an initial set of heavy-atom positions has been found, it is necessary to refine their parameters (x, y, z, occupancy and thermal parameters). This can be achieved through the minimization of $[{\textstyle\sum\limits_{\bf S}} \varepsilon^{2} / E^{2},]$ where E is the estimated error $[(E^{2} \simeq \langle (|F_{PH}|_{\rm obs} - |F_{PH}|_{\rm calc})^{2}\rangle)]$ (Rossmann, 1960; Terwilliger & Eisenberg, 1983). This procedure is safest for centrosymmetric reflections (φ restricted to 0 or π) if enough are present.

Phase refinement is generally monitored by three factors: $[R_{\rm Cullis} = {\textstyle\sum} \| F_{PH} \pm F_{P}| - |F_{H}|_{\rm calc}|\big/ {\textstyle\sum} |F_{PH} - F_{P}|]$ for centrosymmetric reflections only; acceptable values are between 0.4 and 0.6; $[R_{\rm Kraut} = {\textstyle\sum} \|F_{PH}|_{\rm obs} - |F_{PH}|_{\rm calc}| \big/ {\textstyle\sum} |F_{PH}|_{\rm obs},]$ which is useful for monitoring convergence; and the $[\hbox{phasing power} = {\textstyle\sum} |F_{H}|_{\rm calc} /{\textstyle\sum} \|F_{PH}|_{\rm obs} - |F_{PH}|_{\rm calc}|,]$ which should be greater than 1 (if less than 1, then the phase triangle cannot be closed via $[F_{H}]$).
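Two of these monitoring statistics depend only on amplitudes and are simple to evaluate. The sketch below implements $[R_{\rm Kraut}]$ and the phasing power exactly as written above; the function names and the four sets of amplitudes are invented for illustration and do not come from any real data set or program.

```python
import numpy as np

def lack_of_closure(fph_obs, fph_calc):
    """epsilon = |F_PH|obs - |F_PH|calc for each reflection."""
    return np.asarray(fph_obs, dtype=float) - np.asarray(fph_calc, dtype=float)

def r_kraut(fph_obs, fph_calc):
    """Sum of |epsilon| divided by the sum of |F_PH|obs."""
    eps = lack_of_closure(fph_obs, fph_calc)
    return np.sum(np.abs(eps)) / np.sum(np.abs(fph_obs))

def phasing_power(fh_calc, fph_obs, fph_calc):
    """Sum of |F_H|calc divided by the sum of |epsilon|."""
    eps = lack_of_closure(fph_obs, fph_calc)
    return np.sum(np.abs(fh_calc)) / np.sum(np.abs(eps))

# Made-up amplitudes for four reflections (illustration only)
fph_obs = np.array([120.0, 85.0, 60.0, 150.0])
fph_calc = np.array([115.0, 90.0, 58.0, 146.0])
fh_calc = np.array([20.0, 12.0, 9.0, 25.0])

print(r_kraut(fph_obs, fph_calc))                   # 16 / 415, about 0.039
print(phasing_power(fh_calc, fph_obs, fph_calc))    # 66 / 16 = 4.125
```

In practice these sums are usually accumulated in resolution shells, so that the resolution at which the phasing power falls below 1 can be identified.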

Figure 12.2.4.1

The treatment of phase errors. The calculated heavy-atom structure results in a calculated value for both the phase and magnitude of $[F_{H}]$ (red). According to the value of $[\varphi_{P}]$ , the triangle $[F_{P}]$ – $[F_{H}]$ – $[F_{PH}]$ will fail to close by an amount ɛ, the lack of closure (green). This gives rise to a phase distribution which is bimodal for a single derivative. The combined probability from a series of derivatives has a most probable phase (the maximum) and a best phase (the centroid of the distribution), for which the overall phase error is minimum.

The resulting phase probability is given by $[P (\varphi_{P}) = \exp \{- \varepsilon^{2} (\varphi_{P}) / 2E^{2}\}.]$ The phases have a minimum error when the best phase $[\varphi_{\rm best}]$ , i.e. the centroid of the phase distribution, $[\varphi_{\rm best} = {\textstyle\int} \varphi_{P} P(\varphi_{P})\ \hbox{d}\varphi_{P},]$ is used instead of the most probable phase. The quality of the phases is indicated by the figure of merit m, where $[m = {\textstyle\int} P(\varphi_{P}) \exp (i\varphi_{P})\ \hbox{d}\varphi_{P} \big/ {\textstyle\int} P(\varphi_{P})\ \hbox{d}\varphi_{P}.]$ A value of 1 for m indicates no phase error, a value of 0.5 represents a phase error of about 60°, while a value of 0 means that all phases are equally probable.
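The bimodal distribution for a single derivative, and the relation between the figure of merit and the phase error, can be reproduced with a short calculation. The sketch below is an illustration under stated assumptions: the amplitudes, the error estimate and the function names are invented, the integrals are replaced by sums over a phase grid, and m and the best phase are taken as the modulus and argument of the centroid of the distribution, which is one common way of evaluating the formulas above.

```python
import numpy as np

def phase_distribution(fp, fph_obs, fh, error, n_points=3600):
    """Phase probability for one reflection and one derivative.

    fp, fph_obs : amplitudes |F_P| and |F_PH|obs
    fh          : complex calculated heavy-atom structure factor F_H
    error       : estimated error E, in the same units as the amplitudes
    """
    phi = np.linspace(-np.pi, np.pi, n_points, endpoint=False)   # trial phases
    fph_calc = np.abs(fp * np.exp(1j * phi) + fh)                # |F_P exp(i phi) + F_H|
    eps = fph_obs - fph_calc                                     # lack of closure
    prob = np.exp(-eps ** 2 / (2.0 * error ** 2))
    return phi, prob

def best_phase_and_fom(phi, prob):
    """Argument and modulus of the centroid of the phase distribution."""
    centroid = np.sum(prob * np.exp(1j * phi)) / np.sum(prob)
    return np.angle(centroid), np.abs(centroid)

# Illustrative reflection: true protein phase 60 degrees, heavy-atom phase 0 degrees
fp, fh, true_phi = 100.0, 20.0 + 0.0j, np.deg2rad(60.0)
fph_obs = abs(fp * np.exp(1j * true_phi) + fh)

phi, prob = phase_distribution(fp, fph_obs, fh, error=2.0)
phi_best, m = best_phase_and_fom(phi, prob)
phi_most_probable = phi[np.argmax(prob)]

print(np.rad2deg(phi_most_probable))   # one of the two maxima, near +60 or -60 degrees
print(np.rad2deg(phi_best), m)         # best phase near 0 degrees, m close to 0.5
```

The two maxima lie symmetrically about the heavy-atom phase, so the centroid falls between them, about 60° from the true phase, with a figure of merit close to 0.5, in line with the correspondence quoted above. A second derivative with a different heavy-atom phase would be needed to break the ambiguity.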

The best Fourier is calculated from $[\rho_{\rm best}({\bf r}) = (1/V) {\textstyle\sum\limits_{\bf S}} m|F_{P}({\bf S})| \exp \{i\varphi_{P{\rm best}}({\bf S})\} \exp \{-2\pi i{\bf r}\cdot {\bf S}\},]$ in which the electron density should have minimal errors.

12.2.4.2. Automated search procedures

If the derivative shows a high degree of substitution, then the Harker sections become more difficult to interpret. Furthermore, Terwilliger et al. (1987) have shown that the intrinsic noise in the difference Patterson map increases with increasing heavy-atom substitution. It is at this stage that automated procedures are invaluable.

One such automated procedure is implemented in PROTEIN (Steigemann, 1991). The unit cell is scanned for possible heavy-atom sites; for each search point (x, y, z), all possible Harker vectors are calculated, and the difference-Patterson-map values at these points are summed or multiplied. As the origin peak dominates the Patterson function, this region is set to zero. The resulting correlation map should contain peaks at all possible heavy-atom positions. The peak list can then be used to find a set of consistent heavy-atom locations through a subsequent search for difference vectors (cross vectors) between putative sites. It should be possible to locate all major and minor heavy-atom sites through repetition of this procedure. A similar strategy is adopted in the program HEAVY (Terwilliger et al., 1987), but sets of heavy-atom sites are ranked according to the probability that the peaks are not random. The program SOLVE (Terwilliger & Berendzen, 1999) takes this process a stage further: potential heavy-atom structures are solved and refined to generate an (interpretable) electron density in an automated fashion.
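The scanning step can be sketched on a toy problem. The example below is not the algorithm of PROTEIN, HEAVY or SOLVE; it is a minimal illustration in space group $[P2_{1}]$, where the only Harker vector is $[(2x, {\textstyle{1 \over 2}}, 2z)]$, using a synthetic error-free Patterson map built from one hypothetical heavy-atom site on a coarse grid. All names and numbers are assumptions made for this sketch.

```python
import numpy as np

N = 16                 # grid points per cell edge (illustrative)
site = (3, 5, 2)       # hypothetical heavy-atom site, in grid units

def p21_mate(ix, iy, iz, n=N):
    """Symmetry mate (-x, y + 1/2, -z) on an n-point grid."""
    return (-ix) % n, (iy + n // 2) % n, (-iz) % n

# Synthetic heavy-atom density and its Patterson map, P = FT^-1(|F|^2)
rho = np.zeros((N, N, N))
rho[site] = 1.0
rho[p21_mate(*site)] = 1.0
patterson = np.fft.ifftn(np.abs(np.fft.fftn(rho)) ** 2).real

# Harker search: score each trial (x, z) by the map value at (2x, 1/2, 2z).
# The Harker vector never coincides with the origin peak because v = 1/2.
score = np.zeros((N, N))
for ix in range(N):
    for iz in range(N):
        score[ix, iz] = patterson[(2 * ix) % N, N // 2, (2 * iz) % N]

# All trial points that reach the maximum score
solutions = {tuple(int(i) for i in idx)
             for idx in np.argwhere(np.isclose(score, score.max()))}
print(sorted(solutions))
```

The search returns eight equally scoring (x, z) pairs: the true site, its inverse and their half-cell shifts, which are the equivalent choices of origin and hand discussed earlier. The y coordinate is not determined because the origin along the twofold screw axis is arbitrary in $[P2_{1}]$. With several Harker vectors per site, the individual map values would be summed or multiplied as described above.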

The search method can also be applied in reciprocal space, where the Fourier transform of the trial heavy-atom structure is calculated, and the resulting $[F_{{H}{\rm calc}}]$ is compared to the measured differences between derivative and native structure-factor amplitudes (Rossmann et al., 1986). In the program XtalView (McRee, 1998), the correlation coefficient between $[|F_{H}|]$ and $[|F_{PH} - F_{P}|]$ is calculated, whilst a correlation between $[F_{H}^{2}]$ and $[(F_{PH} - F_{P})^{2}]$ is used by Badger & Athay (1998). Dumas (1994b,c) calculates the correlation between $[|F_{{H}{\rm calc}}|^{2}]$ and $[|F_{{H}{\rm estimated}}|^{2}]$, based on the estimated lack of isomorphism.

Vagin & Teplyakov (1998) have reported a heavy-atom search based on a reciprocal-space translation function. In this case, low-resolution peaks are not removed but weighted down using a Gaussian function. Potential solutions are ranked not only according to their translation-function height, but also through their phasing power, which appears to be a stronger selection criterion.

All these searches are based upon the sequential identification of heavy-atom sites and their incorporation in a heavy-atom partial structure. Problems arise when bogus sites influence the search for further heavy-atom positions. In an attempt to overcome this problem, the heavy-atom search has been reprogrammed using a genetic algorithm, with the Patterson minimum function as a selection criterion (Chang & Lewis, 1994). This approach has the potential to reveal all heavy-atom positions in one calculation, and tests on model data have shown it to be faster than traditional sequential searches.

12.2.5. Special complications

12.2.5.1. Lack of isomorphism

This problem is by far the most common in protein crystallography. An isomorphous derivative is one in which the crystalline arrangement has not been disturbed by derivatization. An early study of Crick & Magdoff (1956) proposed a rule of thumb that a change in any of the cell dimensions by more than around 5% would result in a lack of isomorphism that would defeat any attempt to locate the heavy-atom positions or extract useful phase information. Lack of isomorphism can, however, be more subtle; sometimes a natural variation in the native crystal form may occur, resulting in poor merging statistics of data obtained from different crystals. Coupling this variation with commonly observed structural changes upon heavy-atom binding can provide a considerable barrier to obtaining satisfactory phases. Dumas (1994a) has provided a theoretical consideration of this problem.

One practical approach is to collect native and derivative data sets from the same crystal, a technique that has been successful in the structure determination of GTP cyclohydrolase I (Nar et al., 1995), the proteasome (Löwe et al., 1995) and a number of other proteins. Nonisomorphism can, however, also be exploited. In the structure solution of carbamoyl sarcosine hydrolase (Romao et al., 1992), derivatives fell into two (related) crystalline classes. By judicious use of two `native' crystal forms, heavy-atom positions could be obtained in each of the two classes. Phasing and resultant averaging between the two classes provided an interpretable electron density. In the case of ascorbate oxidase (Messerschmidt et al., 1989), multiple isomorphous replacement failed to provide an interpretable density. It was possible, however, to place the initial density into a second crystal form, which in turn provided phases of sufficient quality to determine heavy-atom sites in derivatives of the second form. Phase-combination and density-modification techniques in the two crystal forms allowed the solution of the structure.

12.2.5.2. Space-group problems

Although the macromolecular crystallographer is rarely confronted with the problems facing their small-molecule colleagues with regard to determining the correct space group, the simplified heavy-atom structure may often throw some surprises. Certain pseudosymmetries may become `exact' for the heavy-atom difference Patterson map. Thus, cross peaks between different heavy atoms may occur on a Harker section (or `pseudo-Harker section'), complicating interpretation of the Patterson map. Such was the case with azurin (Adman et al., 1978 ; Nar et al., 1991 ), where the heavy-atom structure gave rise to a pseudo-homometric Patterson function, i.e. one in which two possible (nonequivalent) choices were available for the heavy-atom structure, only one of which was correct. This arose from a pseudo-centring of the lattice that became almost exact for the heavy-atom structure.

In the case of human NC1 (Stubbs et al., 1990), the heavy-atom sites of all derivatives appeared to lie on or near the crystallographic twofold axis. This resulted in a partially centrosymmetric heavy-atom structure that failed to deliver sufficient phase information for noncentrosymmetric reflections. To check for problems with the native data set, anomalous difference Patterson maps (coefficients $[[F_{PH}({\bf S}) - F_{PH}(- {\bf S})]^{2}]$) were calculated. Coincidence of the peaks obtained from conventional and anomalous Patterson syntheses showed that the heavy-atom positions were correct, but unfortunately did not lead to a structure solution.

12.2.5.3. High levels of substitution; noncrystallographic symmetry

Most problematic are the cases where many heavy atoms have become incorporated in the asymmetric unit. Not only does this cause difficulties in the scaling of derivative to native data, but also the large number of peaks results in ambiguities in the solution of the Patterson function. In such cases, it may be necessary to obtain primary phase information from a different source (such as, for example, another low-substitution-site derivative). One important subclass of high-level substitution is when the native asymmetric unit contains several copies of a single molecule (noncrystallographic symmetry or NCS).

A major problem in locating complex noncrystallographic axes is that the geometrical relationship between NCS peaks in the Patterson map is nontrivial. Under certain conditions, NCS results in a recognizable local symmetry within the Patterson map (Stubbs et al., 1996 ). In many cases, however, these conditions (that the NCS axes of crystallographic symmetry-related molecules are parallel) are not fulfilled. Under such circumstances, all heavy-atom sites (including all crystallographic symmetry-related positions) must be checked carefully with the rotation function in order to pinpoint the NCS axis. This is relatively trivial for low-order NCS (twofold, threefold), but becomes increasingly complicated for higher orders. It should also always be borne in mind that the heavy-atom positions might not necessarily follow the NCS constraints due to crystal packing. If there is reason to suspect that sites are related by local symmetry, then the orientation of this axis can be used in the initial Harker searches; in practice, however, such searches are extremely sensitive to the correct orientation of the axis.

In the case of high-order NCS (e.g. icosahedral virus structures or symmetric macromolecular complexes), an alternative approach to the usual initial Harker-vector search can be provided by the self-rotation function. Knowledge of the orientation of the NCS axis (from the rotation function) can be used to determine the relative positions of heavy atoms to the NCS axis (Argos & Rossmann, 1976; Arnold et al., 1987; Tong & Rossmann, 1993). The orientation can be refined and the resulting peaks can be used as input in a subsequent translation search of the Harker sections.

References

Adman, E. T., Stenkamp, R. E., Sieker, L. C. & Jensen, L. H. (1978). A crystallographic model for azurin at 3.0 Å resolution. J. Mol. Biol. 123, 35–47.

Argos, P. & Rossmann, M. G. (1976). A method to determine heavy-atom positions for virus structures. Acta Cryst. B32, 2975–2983.

Arnold, E., Vriend, G., Luo, M., Griffith, J. P., Kamer, G., Erickson, J. W., Johnson, J. E. & Rossmann, M. G. (1987). The structure determination of a common cold virus, human rhinovirus 14. Acta Cryst. A43, 346–361.

Badger, J. & Athay, R. (1998). Automated and graphical methods for locating heavy-atom sites for isomorphous replacement and multiwavelength anomalous diffraction phase determination. J. Appl. Cryst. 31, 270–274.

Blow, D. M. & Crick, F. H. C. (1959). The treatment of errors in the isomorphous replacement method. Acta Cryst. 12, 794–802.

Budisa, N., Karnbrock, W., Steinbacher, S., Humm, A., Prade, L., Neuefeind, T., Moroder, L. & Huber, R. (1997). Bioincorporation of telluromethionine into proteins: a promising new approach for X-ray structure analysis of proteins. J. Mol. Biol. 270, 616–623.

Buerger, M. J. (1959). Vector space. New York: Wiley.

Chang, G. & Lewis, M. (1994). Using genetic algorithms for solving heavy-atom sites. Acta Cryst. D50, 667–674.

Crick, F. H. C. & Magdoff, B. S. (1956). The theory of the method of isomorphous replacement for protein crystals. I. Acta Cryst. 9, 901–908.

Dumas, P. (1994a). The heavy-atom problem: a statistical analysis. I. A priori determination of best scaling, level of substitution, lack of isomorphism and phasing power. Acta Cryst. A50, 526–537.

Dumas, P. (1994b). The heavy-atom problem: a statistical analysis. II. Consequences of the a priori knowledge of the noise and heavy-atom powers and use of a correlation function for heavy-atom-site determination. Acta Cryst. A50, 537–546.

Dumas, P. (1994c). The heavy-atom problem: a statistical analysis. II. Consequences of the a priori knowledge of the noise and heavy-atom powers and use of a correlation function for heavy-atom-site determination. Erratum. Acta Cryst. A50, 793.

Green, D. W., Ingram, V. M. & Perutz, M. F. (1954). The structure of haemoglobin IV. Sign determination by the isomorphous replacement method. Proc. R. Soc. London Ser. A, 225, 287–307.

Harker, D. (1956). The determination of the phases of the structure factors of non-centrosymmetric crystals by the method of double isomorphous replacement. Acta Cryst. 9, 1–9.

Hendrickson, W. A., Horton, J. R. & LeMaster, D. M. (1990). Selenomethionyl proteins produced for analysis by multiwavelength anomalous diffraction (MAD): a vehicle for direct determination of three-dimensional structure. EMBO J. 9, 1665–1672.

Löwe, J., Stock, D., Jap, B., Zwickl, P., Baumeister, W. & Huber, R. (1995). Crystal structure of the 20S proteasome from the archaeon T. acidophilum at 3.4 Å resolution. Science, 268, 533–539.

McRee, D. E. (1998). Practical protein crystallography. San Diego: Academic Press.

Messerschmidt, A., Rossi, A., Ladenstein, R., Huber, R., Bolognesi, M., Gatti, G., Marchesini, A., Petruzzelli, R. & Finazzi-Agro, A. (1989). X-ray crystal structure of the blue oxidase ascorbate oxidase from zucchini. Analysis of the polypeptide fold and a model of the copper sites and ligands. J. Mol. Biol. 206, 513–529.

Nar, H., Huber, R., Meining, W., Schmid, C., Weinkauf, S. & Bacher, A. (1995). Atomic structure of GTP cyclohydrolase I. Structure, 3, 459–466.

Nar, H., Messerschmidt, A., Huber, R., van de Kamp, M. & Canters, G. W. (1991). X-ray crystal structure of the two site-specific mutants His35Gln and His35Leu of azurin from Pseudomonas aeruginosa. J. Mol. Biol. 218, 427–447.

Patterson, A. L. (1934). A Fourier series method for the determination of the components of interatomic distances in crystals. Phys. Rev. 46, 372–376.

Perutz, M. F. (1956). Isomorphous replacement and phase determination in non-centrosymmetric space groups. Acta Cryst. 9, 867–873.

Richardson, J. W. & Jacobson, R. A. (1987). In Patterson and Pattersons, edited by J. P. Glusker, B. K. Patterson & M. Rossi. Oxford University Press.

Rogers, D. (1965). In Computing methods in crystallography, edited by J. S. Rollett, pp. 133–148. Oxford University Press.

Romao, M. J., Turk, D., Gomis-Ruth, F. X., Huber, R., Schumacher, G., Mollering, H. & Russmann, L. (1992). Crystal structure analysis, refinement and enzymatic reaction mechanism of N-carbamoylsarcosine amidohydrolase from Arthrobacter sp. at 2.0 Å resolution. J. Mol. Biol. 226, 1111–1130.

Rossmann, M. G. (1960). The accurate determination of the position and shape of heavy-atom replacement groups in proteins. Acta Cryst. 13, 221–226.

Rossmann, M. G. (1972). Editor. The molecular replacement method. New York: Gordon and Breach.

Rossmann, M. G., Arnold, E. & Vriend, G. (1986). Comparison of vector search and feedback methods for finding heavy-atom sites in isomorphous derivatives. Acta Cryst. A42, 325–334.

Sheldrick, G. M. (1990). Phase annealing in SHELX-90: direct methods for larger structures. Acta Cryst. A46, 467–473.

Sheldrick, G. M., Dauter, Z., Wilson, K. S., Hope, H. & Sieker, L. C. (1993). The application of direct methods and Patterson interpretation to high-resolution native protein data. Acta Cryst. D49, 18–23.

Steigemann, W. (1991). Recent advances in the PROTEIN program system for the X-ray structure analysis of biological macromolecules. In Crystallographic computing 5: from chemistry to biology, edited by D. Moras, A. D. Podjarny & J. C. Thierry, pp. 115–125. Oxford University Press.

Stubbs, M. T., Nar, H., Löwe, J., Huber, R., Ladenstein, R., Spangfort, M. D. & Svensson, L. A. (1996). Locating a local symmetry axis from Patterson map cross vectors: application to crystal data from GroEL, GTP cyclohydrolase I and the proteosome. Acta Cryst. D52, 447–452.

Stubbs, M. T., Summers, L., Mayr, I., Schneider, M., Bode, W., Huber, R., Ries, A. & Kühn, K. (1990). Crystals of the NC1 domain of type IV collagen. J. Mol. Biol. 211, 683–684.

Terwilliger, T. C. & Berendzen, J. (1999). Automated MAD and MIR structure solution. Acta Cryst. D55, 849–861.

Terwilliger, T. C. & Eisenberg, D. (1983). Unbiased three-dimensional refinement of heavy-atom parameters by correlation of origin-removed Patterson functions. Acta Cryst. A39, 813–817.

Terwilliger, T. C. & Eisenberg, D. (1987). Isomorphous replacement: effects of errors on the phase probability distribution. Acta Cryst. A43, 6–13.

Terwilliger, T. C., Kim, S.-H. & Eisenberg, D. (1987). Generalized method of determining heavy-atom positions using the difference Patterson function. Acta Cryst. A43, 1–5.

Tong, L. & Rossmann, M. G. (1993). Patterson-map interpretation with noncrystallographic symmetry. J. Appl. Cryst. 26, 15–21.

Vagin, A. & Teplyakov, A. (1998). A translation-function approach for heavy-atom location in macromolecular crystallography. Acta Cryst. D54, 400–402.

Wilson, A. J. C. (1949). The probability distribution of X-ray intensities. Acta Cryst. 2, 318–321.
