International Tables for Crystallography, Volume F: Crystallography of biological macromolecules. Edited by M. G. Rossmann and E. Arnold. © International Union of Crystallography 2006
International Tables for Crystallography (2006). Vol. F. ch. 15.1, pp. 311-318
Section 15.1.2. Density-modification methods
aDivision of Basic Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N., Seattle, WA 90109, USA, bDepartment of Chemistry, University of York, York YO1 5DD, England, and cDepartment of Physics, University of York, York YO1 5DD, England
The aim of density-modification calculations is to obtain new or improved phase estimates for observed structure-factor amplitudes. Often, this includes calculation of phases for previously unphased reflections, for example, in the case of phase extension. The calculation of weights, which indicate the degree of confidence in the new phase estimates, is also an important part of the calculation. Improved phase estimates are obtained by bringing the initial phase estimates into consistency with additional sources of structural information.
One difficulty in combining information from various sources is that the amplitudes and phases are represented in reciprocal space and include good estimates of error, whereas the other constraints are in real space and, in general, represent expectations about the structure which may be hard to quantify. As a result, the method that has been adopted is iterative and divided into real- and reciprocal-space steps. A weighted map is calculated and used as a basis for applying all the real-space modifications. The modified map is then back-transformed to produce a set of amplitudes and phases. The agreement between the observed amplitudes and the amplitudes calculated from the modified map is then used to estimate weights for the modified phases, which are used to combine the modified phases with experimental phases to produce new phases. This process is shown diagrammatically in Fig. 15.1.2.1.
Figure 15.1.2.1. Density-modification calculation showing iterative application of real-space and reciprocal-space constraints.
A broad range of techniques have been applied to electron-density maps to impose chemical or physical information. Some sources of information used in density modification are summarized in Table 15.1.2.1. The list included here is not exhaustive, but covers the most widely used methods. Here, we describe some of the constraints and the techniques through which these constraints are implemented for phase improvement.
Solvent flattening exploits the fact that the electron density in the solvent region is flat at medium resolution, owing to the high thermal motion and disorder of solvent molecules. The flattening of the solvent region suppresses noise in the map and therefore improves phases.
Biological molecules are typically irregular in shape, often taking roughly globular forms. When they are packed regularly to form a crystal lattice, there are gaps left between them, and these spaces are filled with the solvent in which the crystallization was performed. This solvent is a disordered liquid, and thus the arrangement of atoms in the solvent regions varies between unit cells, except in those small regions near the surface of the protein. The X-ray image forms an average of electron density over many cells, so the electron density over much of the solvent region appears to be constant to a good approximation.
The existence of a flat solvent region in a crystal places strong constraints on the structure-factor phases. The constraint of solvent flatness is implemented by identifying the molecular boundaries and replacing the densities in the solvent region by their mean density value.
When solving a structure, the contents of the unit cell are usually known, and so an estimate can be formed of how much of the cell volume is taken up by solvent (Matthews, 1968). If the solvent region can be located in the cell, then we can improve an electron-density map by setting the electron density in this region to the expected constant solvent density. Once the resulting modified phases are combined with the experimental data, an improvement can often be seen in the protein regions of the map (Bricogne, 1974).
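The solvent region of a unit cell may usually be determined even from a poor MIR map using two features: the mean density is generally higher in the protein region than in the solvent region, and the density in the protein region shows greater excursions from the mean than the relatively featureless solvent region.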
A good method for locating the solvent region therefore takes into account information from both low- and high-resolution structure factors. Many methods have been proposed to locate the protein–solvent boundary. The first of these were the visual identification methods. The boundary was identified by digitizing a mini-map with the aid of a graphic tablet (Hendrickson et al., 1975; Schevitz et al., 1981). The hand-digitizing procedure was very time-consuming and prone to subjective errors of judgement. Nevertheless, these methods demonstrated the potential of solvent flattening and stimulated further improvement on boundary-identification methods. An automated method using a linked, high-density approach was first proposed by Bhat & Blow (1982). Based on the fact that the densities are generally higher in the protein region than in the solvent region, they defined the molecular boundary by locating the protein as a region of linked, high-density points.
Convolution techniques were subsequently adopted as an efficient method of molecular-boundary identification. Reynolds et al. (1985) proposed a high mean absolute density value approach. The electron density within the protein region was expected to have greater excursions from the mean density value than the solvent region, which is relatively featureless. The molecular boundary was located based on the value of a smoothed ‘modulus’ electron density, which is the sum of the absolute values of all density points within a small box.
Wang (1985) suggested an automated convolution method for identifying the solvent region which has achieved widespread use. His method involved first calculating a truncated map:

$$\rho_{\rm trunc}(\mathbf{x}) = \begin{cases} \rho(\mathbf{x}), & \rho(\mathbf{x}) > \rho_{\rm solv} \\ \rho_{\rm solv}, & \rho(\mathbf{x}) \le \rho_{\rm solv} \end{cases} \qquad (15.1.2.1)$$

The electron density is simply truncated at the expected solvent value, $\rho_{\rm solv}$; however, since the variations in density in the protein region are much larger than the variations in the solvent region, it is generally only the protein region which will be affected. Thus, the mean density over the protein region is increased. Similar results may be obtained using the mean-squared difference of the density from the expected solvent value.
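A smoothed map is then formed by calculating at each point in the map the mean density over a surrounding sphere of radius R. This operation can be written as a convolution of the truncated map, $\rho_{\rm trunc}(\mathbf{x})$, with a spherical weighting function, $W(\mathbf{r})$,

$$\rho_{\rm ave}(\mathbf{x}) = \sum_{\mathbf{r}} W(\mathbf{r})\,\rho_{\rm trunc}(\mathbf{x} - \mathbf{r}), \qquad (15.1.2.2)$$

where

$$W(\mathbf{r}) = \begin{cases} 1 - |\mathbf{r}|/R, & |\mathbf{r}| < R \\ 0, & |\mathbf{r}| \ge R. \end{cases} \qquad (15.1.2.3)$$

Leslie (1987) noted that the convolution operation required in equation (15.1.2.2) can be very efficiently performed in reciprocal space using fast Fourier transforms (FFTs),

$$\rho_{\rm ave}(\mathbf{x}) = \mathcal{F}^{-1}\{\mathcal{F}[\rho_{\rm trunc}(\mathbf{x})]\,\mathcal{F}[W(\mathbf{r})]\}, \qquad (15.1.2.4)$$

where $\mathcal{F}$ denotes a Fourier transform, and $\mathcal{F}^{-1}$ represents an inverse Fourier transform.

The Fourier transform of the truncated density can be readily calculated using FFTs. The Fourier transform of the weighting function can be calculated analytically; for the weighting function of equation (15.1.2.3), it is proportional to

$$\mathcal{F}[W(\mathbf{r})] \propto Y(2\pi R s) - Z(2\pi R s), \qquad (15.1.2.5)$$

where s is the modulus of the reciprocal-space vector, $Y(u) = 3(\sin u - u\cos u)/u^3$ and $Z(u) = 3[2u\sin u - (u^2 - 2)\cos u - 2]/u^4$.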
Therefore, the averaging of the truncated electron density by a spherical weighting function can be achieved by two FFTs. This greatly reduced the time required for calculating the averaged density. Other weighting functions may be implemented by the same approach.
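The equivalence between the direct convolution and the product of transforms can be checked numerically. The following is a minimal sketch on a one-dimensional periodic map, assuming a naive discrete Fourier transform as a stand-in for an FFT; the function names are illustrative only and are not taken from any published program.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (a stand-in for an FFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = 0j
        for j, v in enumerate(values):
            total += v * cmath.exp(sign * 2j * cmath.pi * j * k / n)
        # Only the inverse transform carries the 1/n normalization.
        out.append(total / n if inverse else total)
    return out


def smooth_direct(rho, weights):
    """Periodic convolution: sum over r of W(r) * rho(x - r)."""
    n = len(rho)
    return [sum(weights[r] * rho[(x - r) % n] for r in range(n))
            for x in range(n)]


def smooth_fourier(rho, weights):
    """The same convolution, evaluated as a product of transforms."""
    product = [a * b for a, b in zip(dft(rho), dft(weights))]
    return [value.real for value in dft(product, inverse=True)]
```

For any periodic map and weighting function sampled on the same grid, `smooth_direct` and `smooth_fourier` should agree to rounding error; with a true FFT the second route scales far better for three-dimensional maps.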
A cutoff value, $\rho_{\rm cut}$, is then calculated, which divides the unit cell into two portions occupying the correct volumes for the protein and solvent regions. All points in the map where the smoothed density $\rho_{\rm ave}(\mathbf{x}) < \rho_{\rm cut}$ can then be assumed to be in the solvent region. A typical mask obtained from an MIR map by this means, and the modified map, are shown in Fig. 15.1.2.2.
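The truncation, smoothing and cutoff steps can be sketched for a one-dimensional periodic map as follows. This is an illustration under stated assumptions rather than Wang's implementation: the weighting function is assumed to be $W(r) = 1 - |r|/R$ inside the radius and zero outside, the radius is given in grid units, and the cutoff is applied implicitly by ranking the smoothed values so that the requested fraction of points is labelled as solvent. All function names are hypothetical.

```python
def truncate(rho, rho_solv):
    """Raise every density value below rho_solv up to rho_solv."""
    return [max(value, rho_solv) for value in rho]


def wang_weights(n, radius):
    """Assumed weighting W(r) = 1 - |r|/R for |r| < R, else 0 (periodic)."""
    weights = []
    for r in range(n):
        dist = min(r, n - r)  # shortest distance on the periodic grid
        weights.append(1.0 - dist / radius if dist < radius else 0.0)
    return weights


def smooth(rho, weights):
    """Periodic convolution of the map with the weighting function."""
    n = len(rho)
    return [sum(weights[r] * rho[(x - r) % n] for r in range(n))
            for x in range(n)]


def solvent_mask(rho, rho_solv, radius, solvent_fraction):
    """Return a list of booleans; True marks a solvent grid point."""
    averaged = smooth(truncate(rho, rho_solv), wang_weights(len(rho), radius))
    n_solvent = round(solvent_fraction * len(rho))
    # Taking the n_solvent lowest smoothed values is equivalent to choosing
    # the cutoff that splits the cell into the correct volumes.
    order = sorted(range(len(rho)), key=lambda i: averaged[i])
    mask = [False] * len(rho)
    for i in order[:n_solvent]:
        mask[i] = True
    return mask
```

In a toy map whose first half contains large peaks and troughs and whose second half is nearly flat, a call such as `solvent_mask(rho, 0.25, 3, 0.5)` labels the flat half as solvent, with some blurring at the boundary as expected for a smoothed envelope.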
The radius of the sphere, R, used in equation (15.1.2.3) for the averaging of electron densities is generally around 8 Å. The molecular envelope derived from such an averaged map tends to lose details of the protein molecular surface. Paradoxically, a large averaging sphere is required for the identification of the protein–solvent boundary based on the difference between the mean density of the protein and solvent, which is very small and can only be distinguished when a sufficiently large area of the map is averaged. Abrahams & Leslie (1996) proposed an alternative method of molecular-boundary identification that uses the standard deviation of the electron density within a given radius relative to the overall mean at every grid point of a map. The local-standard-deviation map is the square root of a convolution of a sphere and the squared map, which can be calculated in reciprocal space in a similar way to the procedure described in equations (15.1.2.4) and (15.1.2.5) as proposed by Leslie (1987). By integrating the histogram of the local-standard-deviation map, the cutoff value of the local standard deviation corresponding to the solvent fraction can be calculated. Using this procedure, a molecular envelope that contains more details of the protein molecular surface can be obtained, since the radius of the averaging sphere can be as low as 4 Å (Abrahams & Leslie, 1996).
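A one-dimensional sketch of the local-standard-deviation approach is given below. It assumes a uniform window of `radius` grid points on either side in place of a sphere, evaluates the convolution directly rather than in reciprocal space, and obtains the cutoff by sorting the values, which is equivalent to integrating their histogram. The function names are illustrative.

```python
import math


def local_std_map(rho, radius):
    """RMS deviation from the overall mean within +/- radius grid points."""
    n = len(rho)
    mean = sum(rho) / n
    squared = [(value - mean) ** 2 for value in rho]
    width = 2 * radius + 1
    return [
        math.sqrt(sum(squared[(x + r) % n]
                      for r in range(-radius, radius + 1)) / width)
        for x in range(n)
    ]


def cutoff_for_fraction(values, solvent_fraction):
    """Value below which the requested fraction of grid points falls."""
    ordered = sorted(values)
    index = round(solvent_fraction * len(values))
    return ordered[min(index, len(values) - 1)]
```

Grid points whose local standard deviation lies below the returned cutoff are then taken as solvent.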
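Once the envelope has been determined, solvent flattening is performed by simply setting the density in the solvent region to the expected value, $\rho_{\rm solv}$:

$$\rho_{\rm mod}(\mathbf{x}) = \begin{cases} \rho(\mathbf{x}), & \mathbf{x} \text{ in the protein region} \\ \rho_{\rm solv}, & \mathbf{x} \text{ in the solvent region.} \end{cases} \qquad (15.1.2.6)$$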
If the electron density has not been calculated on an absolute scale, the solvent density may be set to its mean value.
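As a minimal sketch, the flattening step can be written as a single pass over the map, given a solvent mask in which True marks a solvent grid point. The function name and the mask representation are assumptions made for illustration.

```python
def flatten_solvent(rho, solvent_mask, rho_solv=None):
    """Set solvent points to rho_solv, or to their own mean if it is None."""
    if rho_solv is None:
        # Map not on an absolute scale: fall back to the mean solvent density.
        solvent_values = [v for v, is_solvent in zip(rho, solvent_mask)
                          if is_solvent]
        rho_solv = sum(solvent_values) / len(solvent_values)
    return [rho_solv if is_solvent else v
            for v, is_solvent in zip(rho, solvent_mask)]
```

Density in the protein region is returned unchanged, so the same mask can be reused by other real-space modifications.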
A related method is solvent flipping, developed by Abrahams & Leslie (1996). In this approach, the flattening operation is modified by the introduction of a relaxation factor, γ, where γ is positive, effectively ‘flipping’ the density in the solvent region.
The effect of this modification is to correct for the problem of independence in phase combination and is discussed in Section 15.1.4.3.
Histogram matching seeks to bring the distribution of electron-density values of a map to that of an ideal map. The density histogram of a map is the probability distribution of electron-density values. It provides a global description of the appearance of the map, and all spatial information is discarded. The comparison of the histogram for a given map with that expected for an ideal map can serve as a measure of quality. Furthermore, the initial map can be improved by adjusting density values in a systematic way to make its histogram match the ideal histogram.
Histogram matching is a standard technique in image processing. It is aimed at bringing the density distribution of an image to an ideal distribution, thereby improving the image quality. The first attempt at modifying the electron-density distribution was that by Hoppe & Gassmann (1968), who proposed the ‘3–2’ rule. The electron density was first normalized to a maximum of 1 and modified by imposing positivity. Subsequently, the electron density was modified by $\rho' = 3\rho^2 - 2\rho^3$. Podjarny & Yonath (1977) used the skewness of the density histogram as a measure of quality of the modified map. Harrison (1988) used a Gaussian function as the ideal histogram in his histogram-specification method for protein phase refinement and extension. The choice of the Gaussian function as the ideal electron-density distribution was based on theoretical arguments instead of experimental evaluation. The Gaussian function was also made independent of resolution. Lunin (1988) used the electron-density distribution to retrieve the values of low-angle structure factors whose amplitudes had not been measured during an X-ray experiment. The electron-density distribution was thought to be structure specific and was derived from a homologous structure. Moreover, the histogram was derived from the entire unit cell, including both the protein and the solvent. Zhang & Main (1988) systematically examined the electron-density histogram of several proteins and found that the ideal density histogram is dependent on resolution, the overall temperature factor and the phase error. It is, however, independent of structural conformation. The sensitivity to phase error suggests that the density histogram could be used for phase improvement. The structural conformation independence made it possible to predict the ideal histogram for unknown structures.
Polypeptide structures in particular, and biological macromolecules in general, display a broadly similar atomic composition, and the way in which these atoms bond together is also conserved across a wide range of structures. These similarities between different protein structures can be used to predict the ideal histogram even when positional information for individual atoms is not available in a map. If the positional information is removed from an electron-density map, then what remains is an unlabelled list of density values. This list is the histogram of the electron-density distribution, which is independent of the relative disposition of these densities. The shape of the histogram is primarily based on the presence of atoms and their characteristic distances from each other. This is true for all polypeptide structures.
The frequency distribution, $P(\rho)$, of electron-density values in a map can be constructed by sampling the map and counting the density values in different ranges. In practice, once the electron-density map has been sampled on a discrete grid, this frequency distribution becomes a histogram, but for convenience, it is treated here as a continuous distribution.
At resolutions of better than 6.0 Å and after exclusion of the solvent region, the frequency distribution of electron-density values for protein density over a wide range of proteins varies only with resolution and overall temperature factor to a good approximation. If the overall temperature factor is artificially adjusted, for example, by sharpening to a fixed reference value, then the frequency distributions may be treated as a function of resolution only. Therefore, once a good approximation to the molecular envelope is known, the frequency distribution of electron densities in the protein region as a function of resolution may be assumed to be known. Consequently, the ideal density histogram for an unknown map at a given resolution can be taken from any known structure at the same resolution (Zhang & Main, 1988, 1990a).
The ideal electron-density histogram can also be predicted by an analytical formula (Lunin & Skovoroda, 1991; Main, 1990a). The method adopted by Main (1990a) represents the density histogram by components that correspond to three types of electron density in the map. The first component is the region of overlapping densities, which can be represented by a randomly distributed background noise. The second component is the region of partially overlapping densities. The third component is the region of non-overlapping atomic peaks, which can be represented by a Gaussian.
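The histogram for the overlapping part of the density can be represented by a Gaussian distribution,

$$P_o(\rho) = N\exp\left[-\frac{(\rho - \bar\rho)^2}{2\sigma^2}\right], \qquad (15.1.2.8)$$

where $\bar\rho$ is the mean density and σ is the standard deviation. The region of partially overlapping densities can be modelled by a cubic polynomial function,

$$P_{po}(\rho) = N(a\rho^3 + b\rho^2 + c\rho + d). \qquad (15.1.2.9)$$

The histogram for the non-overlapping part of the density can be derived analytically from a Gaussian atom,

$$P_{no}(\rho) = N\,\frac{A}{\rho}\left[\ln\left(\frac{\rho_0}{\rho}\right)\right]^{1/2}, \qquad (15.1.2.10)$$

where $\rho_0$ is the maximum density, N is a normalizing factor and A is the relative weight of the terms between equation (15.1.2.8) and equation (15.1.2.10).

If we use two threshold values, $\rho_1$ and $\rho_2$, to divide the three density regions, the complete formula can be expressed as

$$P(\rho) = \begin{cases} N\exp\left[-(\rho - \bar\rho)^2/2\sigma^2\right], & \rho \le \rho_1 \\ N(a\rho^3 + b\rho^2 + c\rho + d), & \rho_1 < \rho \le \rho_2 \\ N(A/\rho)\left[\ln(\rho_0/\rho)\right]^{1/2}, & \rho_2 < \rho \le \rho_0. \end{cases} \qquad (15.1.2.11)$$

The parameters a, b, c, d in the cubic polynomial are calculated by matching function values and gradients at $\rho_1$ and $\rho_2$. The parameters in the histogram formula, $\bar\rho$, σ, A, $\rho_0$, $\rho_1$, $\rho_2$, can be obtained from histograms of known structures.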
Zhang & Main (1990a) demonstrated that, at better than 4 Å resolution, the histogram for an MIR map is generally significantly different from the ideal distribution calculated from atomic coordinates. The obvious course is therefore to alter the map in such a way as to make its density histogram equal to the ideal distribution. Unfortunately, there are an infinite number of maps corresponding to any chosen density distribution, so we must choose a systematic method of altering the map.
The conventional method of performing such a modification is to retain the ordering of the density values in the map. The highest point in the original map will be the highest point in the modified map, the second highest points will correspond in the same way, and so on.
Mathematically, this transformation is represented as follows. Let $P(\rho)$ be the current density histogram and $P'(\rho)$ be the desired distribution, normalized such that their sums are equal to 1. The cumulative distribution functions, $N(\rho)$ and $N'(\rho)$, may then be calculated:

$$N(\rho) = \int_{\rho_{\min}}^{\rho} P(t)\,{\rm d}t, \qquad N'(\rho) = \int_{\rho'_{\min}}^{\rho} P'(t)\,{\rm d}t, \qquad (15.1.2.12)$$

where the lower limits are the minimum densities of the respective distributions.
The cumulative distribution function of a variable transforms a value chosen from the distribution into a number between 0 and 1, representing the position of that value in an ordered list of values chosen from the distribution.
The transformation may, therefore, be performed in two stages. A density value is taken from the initial distribution and the cumulative distribution function of the initial distribution is applied to obtain the position of that value in the distribution. The inverse of the cumulative distribution function for the desired distribution is applied to this value to obtain the density value for the corresponding point in the desired distribution. Thus, given a density value, ρ, from the initial distribution, the modified value, ρ′, is obtained by

$$\rho' = N'^{-1}[N(\rho)]. \qquad (15.1.2.13)$$

The distribution of ρ′ will then match the desired distribution after the above transformation. The transformation of an electron-density value by this method is illustrated in Fig. 15.1.2.3.
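The transformation in equation (15.1.2.13) can be achieved through a linear transform represented by

$$\rho' = a_i\rho + b_i, \qquad \rho_i \le \rho < \rho_{i+1}, \qquad (15.1.2.14)$$

where $a_i$ and $b_i$ are the slope and intercept for the ith density bin, chosen so that the transform reproduces equation (15.1.2.13) at the bin boundaries $\rho_i$ and $\rho_{i+1}$, $i = 1, \ldots, n$, and n is the number of density bins. The above linear transform is sufficient if the number of density bins is large enough. An n value of about 200 is usually quite satisfactory.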
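The two-stage transformation can be sketched with binned cumulative distributions and linear interpolation within each bin. This is an illustrative sketch only: it assumes equal-width bins, a target histogram supplied as a list of bin counts spanning `target_min` to `target_max`, and a map that is not constant; none of the names below come from a published program.

```python
from bisect import bisect_right


def cumulative(hist):
    """Cumulative distribution at the upper edge of each bin, ending at 1."""
    total = float(sum(hist))
    running, out = 0.0, []
    for count in hist:
        running += count
        out.append(running / total)
    return out


def interpolate(x, xs, ys):
    """Piecewise-linear y(x) on non-decreasing xs, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1  # guarantees xs[i] <= x < xs[i + 1]
    frac = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])


def match_histogram(rho, target_hist, target_min, target_max, n_bins=200):
    """Map each density value through the current CDF, then back through
    the inverse of the target CDF, preserving the order of the values."""
    lo, hi = min(rho), max(rho)
    width = (hi - lo) / n_bins
    hist = [0] * n_bins
    for value in rho:
        hist[min(int((value - lo) / width), n_bins - 1)] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    cdf = [0.0] + cumulative(hist)
    t_width = (target_max - target_min) / len(target_hist)
    t_edges = [target_min + i * t_width for i in range(len(target_hist) + 1)]
    t_cdf = [0.0] + cumulative(target_hist)
    return [interpolate(interpolate(v, edges, cdf), t_cdf, t_edges)
            for v in rho]
```

Because both interpolations are monotonic, the highest point of the input map remains the highest point of the output map, and so on down the ranking.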
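Various properties of the electron density are specified in the density histogram, such as the minimum, maximum and mean density, the density variance, and the entropy of the map. The mean density of the ideal map can be obtained by

$$\bar\rho = \int_{\rho_{\min}}^{\rho_{\max}} \rho\,P(\rho)\,{\rm d}\rho. \qquad (15.1.2.15)$$

The variance of the density in the ideal map can be obtained by

$$\sigma^2(\rho) = \overline{\rho^2} - \bar\rho^{\,2}, \qquad (15.1.2.16)$$

where

$$\overline{\rho^2} = \int_{\rho_{\min}}^{\rho_{\max}} \rho^2\,P(\rho)\,{\rm d}\rho.$$

The entropy of the ideal map can likewise be calculated from the histogram.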
Therefore, the process of histogram matching applies a minimum and a maximum value to the electron density, imposes the correct mean and variance, and defines the entropy of the new map. The order of electron-density values remains unchanged after histogram matching.
Histogram matching is complementary to solvent flattening since it is applied to the protein region of a map, whereas solvent flattening only operates on the solvent region of the map. The same envelope that was used for isolating the solvent region can be used to determine the protein region of the cell. An alternative approach is to define separate solvent and protein masks, with uncertain regions excluded from either mask and allowed to keep their unmodified values.
15.1.2.2.4. Scaling the observed structure-factor amplitudes according to the ideal density histogram
In the process of density modification, electron density or structure factors from different sources are compared and combined. It is, therefore, crucial to ensure that all the structure factors and maps are on the same scale. The observed structure factors can be put on the absolute scale by Wilson statistics (Wilson, 1949) using a scale and an overall temperature factor. This is accurate when atomic or near atomic resolution data are available. The scale and overall temperature factor obtained from Wilson statistics are less accurate when only medium- to low-resolution data are available. A more robust method of scaling non-atomic resolution data is through the density histogram (Cowtan & Main, 1993; Zhang, 1993).
The ideal density histogram defines the mean and variance of an electron density, as shown in equations (15.1.2.15) and (15.1.2.16). We can scale the observed structure-factor amplitudes to be consistent with the target histogram using the following formulae, obtained from the structure-factor equation and Parseval's theorem. The mean density and the density variance of the observed map can be calculated as

$$\bar\rho = \frac{1}{V}F(000), \qquad (15.1.2.19)$$

$$\sigma^2(\rho) = \frac{1}{V^2}\sum_{\mathbf{h} \ne 0}|F(\mathbf{h})|^2, \qquad (15.1.2.20)$$

where V is the unit-cell volume.
The mean and variance of the electron-density map at the desired resolution are calculated using the target histogram, the mean value of the solvent density and the solvent volume of the cell. The F(000) term can then be evaluated from equations (15.1.2.15) and (15.1.2.19), and the scale of the observed amplitudes can be obtained from equations (15.1.2.16) and (15.1.2.20), in each case by equating the value predicted for the target map with the corresponding expression for the observed map.
This method is adequate for scaling observed structure factors at any resolution.
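The variance-matching idea can be sketched as follows. The sketch rests on several assumptions that are not spelled out in the text: the target variance passed in is already the whole-cell value (protein histogram combined with a flat solvent), the amplitude list covers every non-zero reflection in the full reciprocal-space sphere to the working resolution, and the variance of the observed map is taken from Parseval's theorem as $(1/V^2)\sum|F|^2$. The function names are hypothetical.

```python
import math


def histogram_moments(hist, rho_min, rho_max):
    """Mean and variance of a density histogram with equal-width bins."""
    total = float(sum(hist))
    width = (rho_max - rho_min) / len(hist)
    centres = [rho_min + (i + 0.5) * width for i in range(len(hist))]
    mean = sum(p * c for p, c in zip(hist, centres)) / total
    mean_sq = sum(p * c * c for p, c in zip(hist, centres)) / total
    return mean, mean_sq - mean * mean


def amplitude_scale(f_obs, cell_volume, target_variance):
    """Scale K such that (1/V^2) * sum |K * F_obs|^2 = target variance."""
    observed_variance = sum(f * f for f in f_obs) / cell_volume ** 2
    return math.sqrt(target_variance / observed_variance)
```

Multiplying every observed amplitude by the returned factor places the map variance on the same scale as the target, without relying on a Wilson plot.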
The averaging method enforces the equivalence of electron-density values between grid points in the map related by noncrystallographic symmetry. The averaging procedure can filter noise, correct systematic error and even determine the phases ab initio in favourable cases (Chapman et al., 1992; Tsao et al., 1992).
Noncrystallographic symmetry (NCS) arises in crystals when there are two or more of the same molecules in one asymmetric unit. Such symmetries are local, since they only apply within a sub-region of a single unit cell. A fivefold axis, for example, must be noncrystallographic, since it is not possible to tessellate objects with fivefold symmetry. Since the symmetry does not map the crystal lattice back onto itself, the individual molecules that are related by the noncrystallographic symmetry will be in different environments; therefore, the symmetry relationships are only approximate.
Noncrystallographic symmetries provide phase information by the following means. Firstly, the related regions of the map may be averaged together, increasing the ratio of signal to noise in the map. Secondly, since the asymmetric unit must be proportionally larger to hold multiple copies of the molecule, the number of independent diffraction amplitudes available at any resolution is also proportionally larger. This redundancy in sampling the molecular transform leads to additional phase information which can be used for phase improvement.
The self-rotation symmetry is now routinely solved by the use of a Patterson rotation function (Rossmann & Blow, 1962). The translation symmetry can be determined by a translation function (Crowther & Blow, 1967) when a search model, either an approximate structure of the protein to be determined or the structure of a homologous protein, is available. The searches of the Patterson rotation and translation functions are achieved typically using fast automatic methods, such as X-PLOR (Brünger et al., 1987) or AMoRe (Navaza, 1994). In cases where no search model is available or the Patterson translation function is unsolvable, either the whole electron-density map, or a region which is expected to contain a molecule, may be rotated using the rotation solution and used as a search model in a phased translation function (Read & Schierbeek, 1988).
Once the averaging operators are determined, the mask can be determined using the local density correlation function as developed by Vellieux et al. (1995). This is achieved by a systematic search for extended peaks in the local density correlation, which must be carried out over a volume of several unit cells in order to guarantee finding the whole molecule. The local correlation function distinguishes those volumes of crystal space which map onto similar density under transformation by the averaging operator. Thus, in the case of improper NCS, a local correlation mask will cover only one monomer. In the case of a proper symmetry, a local correlation mask will cover the whole complex (Fig. 15.1.2.4a,b).
Special cases arise when there are combinations of crystallographic and noncrystallographic symmetries, of proper and improper symmetries, or when a noncrystallographic symmetry element maps a cell edge onto itself. In the latter case, the volume of matching density is infinite, and arbitrary limits must be placed upon the mask along one crystal axis.
The initial NCS operation obtained from rotation and translation functions or heavy-atom positions can be fine-tuned by a density-space R-factor search in the six-dimensional rotation and translation space. The density-space R factor measures the disagreement between the density at each point of the set of Cartesian coordinates, $\mathbf{r}$, and the density at the NCS-related set of coordinates, $\mathbf{r}' = \Omega\mathbf{r}$, where Ω represents the NCS operator.
The six-dimensional search is very time-consuming. The search rate can be increased by using only a representative subset of grid points. The NCS operation is systematically altered to find the lowest density-space R factor for the selected subset of grid points.
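The solution of the NCS operation from the six-dimensional search can be further refined by the following least-squares procedure. If $\mathbf{r}'$ is related to $\mathbf{r}$ by the NCS operation, Ω,

$$\mathbf{r}' = \Omega(\mathbf{r}).$$

Here, Ω is a function of ω, where ω represents the rotation and translation components of the NCS operation. The solution to the NCS parameters, ω, can be obtained by minimizing the density residual between the NCS-related molecules,

$$\sum_{\mathbf{r}}[\rho(\mathbf{r}) - \rho(\mathbf{r}')]^2,$$

using a linearized least-squares formula for Δω, the shift to the NCS parameters. Here, by the chain rule,

$$\frac{\partial\rho(\mathbf{r}')}{\partial\omega} = \frac{\partial\rho(\mathbf{r}')}{\partial\mathbf{r}'}\,\frac{\partial\mathbf{r}'}{\partial\omega}.$$

The partial derivatives, $\partial\rho/\partial\mathbf{r}$, can be calculated by Fourier transforms, since differentiating the Fourier summation with respect to a coordinate multiplies each term by a factor proportional to the corresponding component of h, or more efficiently with a single Fourier transform by the use of spectral B-splines (Cowtan & Main, 1998).
$\partial\mathbf{r}'/\partial\omega$ is derived analytically based on the relationship between the Cartesian coordinates, r, and the rotational and translational coordinates of the NCS operation, ω.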
Once the mask and matrices are determined, the electron-density map may be modified by averaging. This can be achieved in one or two stages. The density for each copy of the molecule in the asymmetric unit may be replaced by the averaged density from every copy; however, this becomes slow for high-order NCS (Fig. 15.1.2.4c). Alternatively, a single averaged copy of the molecule may be created in an artificial cell [referred to by Rossmann et al. (1992) as an H-cell], and then each copy of the molecule may be reconstructed in the asymmetric unit from this copy (Fig. 15.1.2.4d). This is more efficient for high-order NCS, but additional errors are introduced in the second interpolation.
Interpolation of electron-density values at non-map grid sites is usually required, since the NCS operators will not normally map grid points onto each other. To obtain accurate interpolated values, either a fine grid or a complex interpolation function is required; suitable functions are described in Bricogne (1974) and Cowtan & Main (1998). Solvent flattening and histogram matching are frequently applied after averaging, since histogram matching tends to correct for any smoothing introduced by density interpolation.
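The single-stage averaging with interpolation can be illustrated on a one-dimensional periodic map. This is a deliberately simplified sketch: it uses linear interpolation, which is cruder than the functions cited above, represents each NCS operator as a hypothetical callable that maps a grid position to its NCS-related position, and updates only the points inside the supplied mask.

```python
import math


def interpolate_periodic(rho, x):
    """Linear interpolation of a periodic 1D map at fractional position x."""
    n = len(rho)
    base = math.floor(x)
    frac = x - base
    return (1.0 - frac) * rho[base % n] + frac * rho[(base + 1) % n]


def ncs_average(rho, mask, operators):
    """Average each masked point with its images under every operator.

    The identity operation is included implicitly, so a single operator
    corresponds to twofold averaging.
    """
    averaged = list(rho)
    for x, inside in enumerate(mask):
        if inside:
            values = [rho[x]]
            values += [interpolate_periodic(rho, op(x)) for op in operators]
            averaged[x] = sum(values) / len(values)
    return averaged
```

To replace every copy by the average, the same function can be applied with a mask over each copy in turn and the operators that map that copy onto the others.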
In the case of flexible proteins, it may be necessary to average only part of the molecule, in which case the averaging mask will exclude some parts of the unit cell which are indicated as protein by the solvent mask. In other cases, it may be necessary to apply multi-domain averaging; in this case, the protein is divided into rigid domains which can appear in differing orientations. Each domain must then have a separate mask and set of averaging matrices.
Averaging may also be performed across similar molecules in multiple crystal forms (Schuller, 1996); in this case, density modification is performed on each crystal form simultaneously, with averaging of the molecular density across all copies of the molecule in all crystal forms. This is a powerful technique for phase improvement, even when no phasing is available in some crystal forms.
The skeletonization method enhances connectivity in the map. This is achieved by locating ridges of density, constructing a graph of linked peaks, and then building a new map using cylinders of density around the graph peaks.
At worse than atomic resolution, the density peaks for bonded atoms are no longer resolved, and so interpretation of the density in terms of atomic positions involves recognition of common motifs in the pattern of ridges in the density. Skeletonization was a tool developed by Greer (1985) to assist model building by tracing high ridges in the electron density to describe the connectivity in the map.
Skeletonization has more recently been adapted to the problem of density modification (Baker, Bystroff et al., 1993; Bystroff et al., 1993; Wilson & Agard, 1993). A skeleton is constructed by tracing the ridges in the map. The resulting ridges form connected ‘trees’. These trees may be pruned to remove small unconnected fragments and break circuits to select for protein-like features. A new map may then be built by building density around the links of the skeleton using the profile of a cylindrically averaged atom at the appropriate resolution.
The skeletonization method has been used to add new features to a partial model of a molecule (Baker, Bystroff et al., 1993). An efficient alternative algorithm for tracing density ridges is given by Swanson (1994).
Sayre's equation constrains the local shape of electron density. It provides a link between all structure-factor amplitudes and phases. It is an exact equation at atomic resolution in an equal-atom system. It is, therefore, very powerful for phase refinement and extension for small molecules at atomic resolution (Sayre, 1952, 1972, 1974). However, its power diminishes as resolution decreases. It can still be an effective tool for macromolecular phase refinement and extension if the shape function can be modified to accommodate the overlap of atoms at non-atomic resolution (Zhang & Main, 1990b).
Sayre's equation (Sayre, 1952, 1972, 1974) expresses the constraint on structure factors when the atoms in a structure are equal and resolved, and the equation has formed the foundation of direct methods. In protein calculations, the resolution is generally too poor for atoms to be resolved, and this is reflected in the bulk of the terms required to calculate the equation for any particular missing structure factor.
For equal and resolved atoms, squaring the electron density changes only the shape of the atomic peaks and not their positions. The original density may therefore be restored by convoluting with some smoothing function, $\psi(\mathbf{x})$, which is a function of atomic shape,

$$\rho(\mathbf{x}) = \int_{\rm cell}\rho^2(\mathbf{y})\,\psi(\mathbf{x} - \mathbf{y})\,{\rm d}\mathbf{y}, \qquad (15.1.2.31)$$

where

$$\psi(\mathbf{x}) = \frac{1}{V}\sum_{\mathbf{h}}\theta(\mathbf{h})\exp(-2\pi i\,\mathbf{h}\cdot\mathbf{x}).$$

Here, $\theta(\mathbf{h})$ is the ratio of scattering factors of real, $f(\mathbf{h})$, and ‘squared’, $g(\mathbf{h})$, atoms, and V is the unit-cell volume, i.e.,

$$\theta(\mathbf{h}) = f(\mathbf{h})/g(\mathbf{h}).$$

Sayre's equation states that the convolution of the squared electron density with a shape function restores the original electron density. It can be seen from equation (15.1.2.31) that Sayre's equation puts constraints on the local shape of electron density. The local shape function is the Fourier transform of the ratio of scattering factors of the real and ‘squared’ atoms.
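Sayre's equation is more frequently expressed in reciprocal space as a system of equations relating structure factors in amplitude and phase:

$$F(\mathbf{h}) = \frac{\theta(\mathbf{h})}{V}\sum_{\mathbf{k}}F(\mathbf{k})\,F(\mathbf{h} - \mathbf{k}),$$

with $\theta(\mathbf{h})$ the reciprocal-space shape function and V the unit-cell volume. The reciprocal-space expression of Sayre's equation can be obtained directly from a Fourier transformation of both sides of equation (15.1.2.31) and the application of the convolution theorem.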
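The central property behind Sayre's equation, that for equal and resolved atoms the ratio of the structure factors of the density to those of the squared density depends only on atomic shape, can be checked numerically. The sketch below assumes a one-dimensional periodic cell, Gaussian atoms placed on grid points and a naive Fourier summation; the function names and parameters are illustrative.

```python
import cmath
import math


def structure_factors(rho, indices):
    """Naive Fourier summation of a sampled 1D periodic density."""
    n = len(rho)
    return {
        h: sum(v * cmath.exp(2j * math.pi * h * x / n)
               for x, v in enumerate(rho)) / n
        for h in indices
    }


def gaussian_atoms(n, positions, sigma):
    """Density of equal Gaussian atoms on a periodic grid of n points."""
    rho = [0.0] * n
    for x in range(n):
        for p in positions:
            d = min(abs(x - p), n - abs(x - p))  # periodic distance
            rho[x] += math.exp(-d * d / (2.0 * sigma * sigma))
    return rho


def shape_ratio(rho, indices):
    """Ratio F(h) / G(h), with G the transform of the squared density."""
    f = structure_factors(rho, indices)
    g = structure_factors([v * v for v in rho], indices)
    return {h: f[h] / g[h] for h in indices}
```

Comparing `shape_ratio` for a one-atom cell and for a two-atom cell with well-separated atoms should give the same values at each index, as long as the structure-factor sum at that index is not close to zero. Once the atoms overlap, the ratio becomes structure dependent, which is why an empirical shape function is needed at non-atomic resolution.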
15.1.2.5.2. The application of Sayre's equation to macromolecules at non-atomic resolution – the θ(h) curve
Sayre's equation is exact for an equal-atom structure at atomic resolution. The reciprocal-space shape function, $\theta(\mathbf{h})$, can be calculated analytically from the ratio of the scattering factors of real and ‘squared’ atoms, which can both be represented by a Gaussian function. At infinite resolution, we expect $\theta(\mathbf{h})$ to be a spherically symmetric function that decreases smoothly with increased h. However, for data at non-atomic resolution, the $\theta(\mathbf{h})$ curve will behave differently because atomic overlap changes the peak shapes. Therefore, a spherical-averaging method is adopted to obtain an estimate of the shape function, $\theta(s)$, empirically from the ratio of the observed structure factors and the structure factors from the squared electron density, where the averaging is carried out over ranges of $|\mathbf{h}|$, i.e., over spherical shells, each covering a narrow resolution range. Here, s represents the modulus of h.
The empirically derived shape function only extends to the resolution of the experimentally observed phases. This is sufficient for phase refinement. However, there are no experimentally observed phases to give the empirical $\theta(s)$ for phase extension. Therefore, a Gaussian function of the form

$$\theta(s) = K\exp(-Bs^2)$$

is fitted to the available values of $\theta(s)$, and the parameters K and B are obtained using a least-squares method. The shape function $\theta(s)$ for the resolution beyond that of the observed phases is extrapolated using the fitted Gaussian function. The derivation of the shape function $\theta(s)$ from a combination of spherical averaging and Gaussian extrapolation is the key to the successful application of Sayre's equation for phase improvement at non-atomic resolution (Zhang & Main, 1990b).
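The Gaussian fit and extrapolation can be sketched as a straight-line fit in logarithmic form. The sketch assumes the Gaussian is written as $K\exp(-Bs^2)$, that all shell values of the shape function are positive, and that an unweighted fit of $\ln\theta$ against $s^2$ is acceptable; a weighted or non-linear fit may be preferable in practice. The function names are hypothetical.

```python
import math


def fit_gaussian(s_values, theta_values):
    """Least-squares fit of ln(theta) = ln(K) - B * s^2; returns (K, B)."""
    xs = [s * s for s in s_values]
    ys = [math.log(t) for t in theta_values]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return math.exp(mean_y - slope * mean_x), -slope


def extrapolate_theta(s, scale_k, b_factor):
    """Evaluate the fitted Gaussian beyond the resolution of the data."""
    return scale_k * math.exp(-b_factor * s * s)
```

The fitted pair (K, B) from the shells with observed phases is then used to supply shape-function values for the shells reached only by phase extension.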
The atomization method uses the fact that the structure underlying the map consists of discrete atoms. It attempts to interpret the map by automatically placing atoms and refining their positions.
Agarwal & Isaacs (1977) proposed a method for the extension of phases to higher resolutions by interpreting an electron-density map in terms of ‘dummy’ atoms. These are so called because at the initial resolution of 3.0 Å, true atom peaks could not be resolved. The placement of dummy atoms is subject to constraints of bonding distance and the number of neighbours. The coordinates and temperature factors of these dummy atoms may then be refined against all the available diffraction amplitudes. Structure factors may then be calculated from the refined coordinates to provide phases for the high-resolution reflections and to improve the phases of the starting set.
The atomization approach has been extended in the ARP program (Lamzin & Wilson, 1997) by the use of difference-map criteria to test dummy-atom assignments, with the aim of removing wrong atoms and introducing missing atoms. With modern refinement algorithms, this technique has become very effective for the solution of structures at high resolution from a poor molecular-replacement model, or even directly from an MIR/MAD map.
Map improvement has also been demonstrated at intermediate resolutions by Perrakis et al. (1997) using a multi-solution variant of the ARP method, and by Vellieux (1998).
The interpretation of an approximately phased map has also been applied very successfully as part of the ‘Shake n' Bake’ direct-methods procedure (Miller et al., 1993; Weeks et al., 1993). The alternating application of phase refinement by the minimum principle in reciprocal space (‘Shake’) and atomization in real space (‘Bake’) has proved to be a very powerful method for solving small protein structures at atomic resolution using only structure-factor amplitudes.