International
Tables for
Crystallography
Volume F
Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F. ch. 25.2, pp. 709-710

Section 25.2.2.5. Code description

K. D. Cowtan, K. Y. J. Zhang and P. Main


The program was designed to be run largely automatically with minimal user intervention. This is achieved by using extensive default settings and by automatic selection of options based on the data used. The program is also modular by design so that additional density-modification methods can be incorporated easily.

A simplified flow diagram for DM is shown in Fig. 25.2.2.2(a). When a reflection-omit calculation is performed, an additional loop is introduced, shown in Fig. 25.2.2.2(b). The Sayre's equation calculation adds another level of complexity, described in Zhang & Main (1990b). Skeletonization imposes the protein histogram and solvent flatness implicitly and so is performed, if necessary, every second or third cycle in place of solvent flattening and histogram matching. Simplified conceptual and actual flow diagrams for DMMULTI are shown in Figs. 25.2.2.3(a) and (b).

Figure 25.2.2.2. (a) Flow chart for a simple DM calculation with free-Sim phase combination. (b) Flow chart for a simple DM calculation with reflection-omit phase combination.

Figure 25.2.2.3. (a) Conceptual flow chart for a DMMULTI multi-crystal calculation. (b) Actual flow chart for a DMMULTI multi-crystal calculation.

Many of the basic approaches used in DM and DMMULTI are described in Chapter 15.1. Some practical aspects of the application and combination of these approaches are described here.

25.2.2.5.1. Scaling

All forms of map modification are affected by the overall temperature factor of the data, and histogram matching in particular is critically dependent on the accurate determination of the scale factor. Wilson statistics have been found inadequate for scaling in this case, especially when the data resolution is worse than 3 Å, because of the dip in scattering below 5 Å.

More accurate estimates of the scale and temperature factors may be achieved by fitting the data to a semi-empirical scattering curve (Cowtan & Main, 1998). This curve is prepared using Parseval's theorem, which relates the sum of the intensities to the variance of the map:

\[\sigma_{\rho}^{2} = {1\over V^{2}}\sum\limits_{{\bf h} \neq 000} |F({\bf h})|^{2}. \eqno(25.2.2.2)\]

Thus, the sum of the intensities in a particular resolution shell is proportional to the difference in variance of maps calculated with and without that shell of data. The empirical curve is therefore calculated from the variance in the protein regions of a group of known structures, calculated as a function of resolution. The curve is scaled to the protein volume of the current structure, and a correction is made for the solvent, which is assumed to be flat.
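The relation in equation (25.2.2.2) can be checked numerically. The following is a minimal sketch, assuming a P1 map sampled on a regular grid and structure factors approximated by a discrete Fourier transform scaled by V/N; the cell volume, grid size and function name are hypothetical and are not taken from DM.

```python
import numpy as np

def map_variance_from_structure_factors(F, volume):
    """Map variance from eq. (25.2.2.2): sum of |F|^2 over all h except 000."""
    intensities = np.abs(F) ** 2
    intensities.flat[0] = 0.0            # drop the F(000) term
    return intensities.sum() / volume ** 2

rng = np.random.default_rng(0)
volume = 1000.0                          # hypothetical cell volume
rho = rng.normal(size=(8, 8, 8))         # hypothetical density on a P1 grid

# Discrete approximation to the Fourier integral: F(h) = (V/N) * sum rho * exp(...)
F = np.fft.fftn(rho) * volume / rho.size

sigma2_reciprocal = map_variance_from_structure_factors(F, volume)
sigma2_real = rho.var()                  # variance computed directly from the map
print(sigma2_reciprocal, sigma2_real)
```

Because each term of the sum is non-negative, removing a resolution shell of reflections lowers the computed variance by exactly the sum of intensities in that shell divided by V squared, which is the property the empirical curve relies on.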

The overall temperature factor is removed, and an absolute scale is imposed by fitting the data to this curve. The use of sharpened F's (with no overall temperature factor) is necessary for histogram matching and often increases the power of averaging for phase extension.
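The fit of a scale and an overall temperature factor to a reference curve might be sketched as below. This is an illustration only: it assumes the usual isotropic form, with shell-averaged intensities obeying I_obs = k exp(-2Bs²) I_ref and s = sin θ/λ, and it treats the semi-empirical curve as a given array. The function names and the placeholder curve are hypothetical and are not DM's.

```python
import numpy as np

def fit_scale_and_b(s_squared, mean_i_obs, mean_i_ref):
    """Least-squares fit of ln(I_obs / I_ref) = ln(k) - 2 B s^2."""
    slope, intercept = np.polyfit(s_squared, np.log(mean_i_obs / mean_i_ref), 1)
    return np.exp(intercept), -0.5 * slope

def to_absolute_sharpened(f_obs, s_squared, k, b):
    """Remove the overall temperature factor and apply the absolute scale."""
    return f_obs * np.exp(b * s_squared) / np.sqrt(k)

# Synthetic shells between 10 and 2.5 Å resolution.
d_spacing = np.linspace(10.0, 2.5, 20)
s_squared = 1.0 / (4.0 * d_spacing ** 2)

# Placeholder reference curve (hypothetical, not the published curve).
i_ref = 1000.0 * np.exp(-5.0 * s_squared) + 50.0
# Simulated observations with scale k = 0.04 and B = 30 Å^2.
i_obs = 0.04 * np.exp(-2.0 * 30.0 * s_squared) * i_ref

k, b = fit_scale_and_b(s_squared, i_obs, i_ref)
f_abs = to_absolute_sharpened(np.sqrt(i_obs), s_squared, k, b)
print(k, b)
```

After the correction, the squared amplitudes follow the reference curve, so they carry no overall temperature factor and are on the scale of the curve.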

Since the solvent content is used in scaling the data, it is important that this value be entered correctly. However, the volume of the solvent mask may be varied independently of the true solvent content, as discussed in Section 25.2.2.3.

25.2.2.5.2. Solvent-mask determination

If the user does not supply a solvent mask, the solvent mask is calculated by Wang's (1985) method, using the reciprocal-space approach of Leslie (1987). A number of variants on this algorithm are implemented; however, the parameter that affects the quality of the solvent mask most dramatically is the radius of the smoothing function (Chapter 15.1). This parameter may be estimated empirically by

\[r_{\rm Wang} = 2r_{\max} \overline{w}^{1/3}, \eqno(25.2.2.3)\]

where \(r_{\max}\) is the resolution limit of the observed amplitudes, and \(\overline{w}\) is the mean figure of merit over the same reflections (with \(w = 0\) for unphased reflections).

Once the smoothed map has been determined, cutoff values are chosen to divide the map into protein and solvent regions. If the protein boundary is poorly defined, the user may specify protein, solvent and excluded volumes, in which case two cutoffs are specified and the intermediate region is marked as neither protein nor solvent.
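The smoothing and cutoff steps can be sketched as follows. This is a simplified illustration, assuming an orthogonal cell with equal grid spacing on every axis, a weighting function w(r) = 1 - r/R applied to the map with negative density set to zero, and cutoffs chosen as quantiles of the smoothed map so that the requested volume fractions are obtained. The function names, the test map and the quantile-based cutoff are assumptions for the example, not DM's implementation.

```python
import numpy as np

def wang_smooth(rho, radius, spacing):
    """Convolve the truncated map with w(r) = 1 - r/R (r < R) using FFTs."""
    truncated = np.clip(rho, 0.0, None)          # negative density set to zero
    # Periodic distance of each grid point from the origin, in length units.
    axes = [np.minimum(np.arange(n), n - np.arange(n)) * spacing for n in rho.shape]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    distance = np.sqrt(gx ** 2 + gy ** 2 + gz ** 2)
    weight = np.clip(1.0 - distance / radius, 0.0, None)
    # Convolution in real space is multiplication in reciprocal space.
    product = np.fft.rfftn(truncated) * np.fft.rfftn(weight)
    return np.fft.irfftn(product, s=rho.shape)

def classify(smoothed, solvent_fraction, protein_fraction):
    """Label grid points: 0 = solvent, 1 = excluded, 2 = protein."""
    low = np.quantile(smoothed, solvent_fraction)
    high = np.quantile(smoothed, 1.0 - protein_fraction)
    labels = np.ones(smoothed.shape, dtype=int)
    labels[smoothed <= low] = 0
    labels[smoothed >= high] = 2
    return labels

# Hypothetical test map: a dense blob at the cell centre plus noise.
rng = np.random.default_rng(2)
n = 16
i, j, k = np.meshgrid(*[np.arange(n)] * 3, indexing="ij")
blob = 5.0 * np.exp(-((i - 8) ** 2 + (j - 8) ** 2 + (k - 8) ** 2) / (2.0 * 2.0 ** 2))
rho = blob + rng.normal(size=(n, n, n))

labels = classify(wang_smooth(rho, radius=4.0, spacing=1.0), 0.45, 0.45)
```

With solvent and protein fractions that sum to less than one, the remaining grid points fall between the two cutoffs and are left as neither protein nor solvent.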

25.2.2.5.3. Averaging-mask determination

If the user does not supply an averaging mask, it is determined by a local correlation method (Vellieux et al., 1995). A large region covering 27 unit cells is selected, and the local correlation between the maps before and after rotation by one of the noncrystallographic symmetry operators is calculated. The largest contiguous region that is in agreement among different NCS operators is isolated from the local correlation map, and a finer local correlation map is calculated over this volume. This process is iterated until a good mask with a detailed boundary is found.

This approach is fully automatic, except in the case where a noncrystallographic symmetry operator intersects a crystallographic symmetry operator, in which case the mask is not uniquely defined, and some user intervention may be required. The method is robust, and by increasing the radius of the sphere within which the local correlation is calculated, it may be used with very poor maps (Cowtan & Main, 1998). The method is easily extended to include information from multiple averaging operators.
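A local correlation map of the kind described above might be computed as in the following sketch. It assumes that the rotated map has already been resampled onto the same periodic grid as the original, and it evaluates local means inside a sphere by FFT convolution; the function names, radius and synthetic maps are hypothetical.

```python
import numpy as np

def sphere_kernel(shape, radius):
    """Normalised indicator of a sphere centred on the grid origin (periodic)."""
    axes = [np.minimum(np.arange(n), n - np.arange(n)) for n in shape]
    grids = np.meshgrid(*axes, indexing="ij")
    inside = sum(g.astype(float) ** 2 for g in grids) <= radius ** 2
    return inside / inside.sum()

def local_mean(values, kernel_ft):
    """Mean of `values` within the sphere around every grid point."""
    return np.fft.irfftn(np.fft.rfftn(values) * kernel_ft, s=values.shape)

def local_correlation(map_a, map_b, radius):
    """Correlation coefficient of two maps within a sphere at each grid point."""
    kernel_ft = np.fft.rfftn(sphere_kernel(map_a.shape, radius))
    mean_a = local_mean(map_a, kernel_ft)
    mean_b = local_mean(map_b, kernel_ft)
    cov = local_mean(map_a * map_b, kernel_ft) - mean_a * mean_b
    var_a = local_mean(map_a * map_a, kernel_ft) - mean_a ** 2
    var_b = local_mean(map_b * map_b, kernel_ft) - mean_b ** 2
    return cov / np.sqrt(np.clip(var_a * var_b, 1e-12, None))

# Synthetic example: the two maps agree only inside a central cube.
rng = np.random.default_rng(4)
map_a = rng.normal(size=(16, 16, 16))
map_b = rng.normal(size=(16, 16, 16))
map_b[4:12, 4:12, 4:12] = map_a[4:12, 4:12, 4:12]

correlation = local_correlation(map_a, map_b, radius=3)
```

A larger radius averages over more grid points, which reduces the noise in the correlation estimate at the cost of a less detailed boundary; this is consistent with using a larger sphere for very poor maps.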

25.2.2.5.4. Fourier transforms

For simplicity of coding, all Fourier transforms are performed in core using real-to-Hermitian and Hermitian-to-real fast Fourier transforms (FFTs). The data are expanded to space group P1 before calculating a map and averaged back to a reciprocal asymmetric unit after inverse transformation. Most of the map modifications preserve crystallographic symmetry, so restricted phases are not constrained except during phase combination.
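The storage saving of a real-to-Hermitian transform can be illustrated with NumPy, used here only as a stand-in for the FFT routines in the program. The sketch assumes a map that has already been expanded to P1; the grid dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
rho = rng.normal(size=(12, 10, 8))       # hypothetical P1 map on a grid

# Real-to-Hermitian transform: only about half of the coefficients are stored,
# because F(-h) is the complex conjugate of F(h) for a real map.
F_half = np.fft.rfftn(rho)
F_full = np.fft.fftn(rho)                # full complex transform, for comparison

# Hermitian-to-real transform recovers the map from the half set.
rho_back = np.fft.irfftn(F_half, s=rho.shape)

print(F_full.shape, F_half.shape)
```

Passing the grid shape to the inverse transform removes the ambiguity between even and odd sizes along the last axis.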

25.2.2.5.5. Histogram matching

The target histograms are calculated from the protein regions of several stationary-atom structures at resolutions from 6 to 1.5 Å, according to the method described by Zhang & Main (1990a). The histogram variances should be consistent with the map variances used in scaling the data. The resolution of the target histogram can be accurately matched to the data resolution by averaging the target histograms on either side of the current resolution.
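Histogram matching and the blending of neighbouring target histograms can be sketched as below. The sketch assumes that the two bracketing histograms share the same bins, that the blend is a weighted average that is linear in resolution, and that matching maps each density value to the target value of equal cumulative frequency. The function names and the flat target histogram are hypothetical.

```python
import numpy as np

def blend_histograms(hist_low, hist_high, res_low, res_high, resolution):
    """Weighted average of the two target histograms bracketing `resolution`."""
    t = (res_low - resolution) / (res_low - res_high)
    return (1.0 - t) * np.asarray(hist_low) + t * np.asarray(hist_high)

def histogram_match(rho, target_counts, bin_edges):
    """Map each density value to the target value of equal cumulative frequency."""
    cumulative = np.cumsum(target_counts) / np.sum(target_counts)
    target_cdf = np.concatenate([[0.0], cumulative])
    # Rank of every grid point in the sorted map, converted to a quantile.
    ranks = np.argsort(np.argsort(rho, axis=None))
    quantiles = (ranks + 0.5) / rho.size
    matched = np.interp(quantiles, target_cdf, bin_edges)
    return matched.reshape(rho.shape)

rng = np.random.default_rng(5)
rho = rng.normal(size=(10, 10, 10))

# Hypothetical target: ten equally populated bins between 0 and 1.
bin_edges = np.linspace(0.0, 1.0, 11)
target = blend_histograms(np.ones(10), np.ones(10), 6.0, 4.0, 5.0)
matched = histogram_match(rho, target, bin_edges)
counts, _ = np.histogram(matched, bins=bin_edges)
```

The mapping is monotonic, so the ordering of density values in the map is preserved while their distribution is replaced by the target.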

25.2.2.5.6. Averaging

Averaging is performed using a single-step approach (Rossmann et al., 1992), in which every copy of the molecule in a 'virtual' asymmetric unit is averaged with every other copy. Density values are obtained at non-grid positions using a 27-point quadratic spectral spline interpolation. A sharpened map is first calculated by dividing by the Fourier transform of the quadratic spline function. The same spline function is then convoluted with the sharpened map to obtain the density value at an arbitrary coordinate (Cowtan & Main, 1998). This approach gives very accurate interpolation from a coarse grid map with relatively little computation and additionally provides gradient information for the refinement of averaging operators.
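The sharpen-then-convolute scheme can be illustrated with a quadratic B-spline on a periodic grid. This is a sketch under stated assumptions rather than the published algorithm: the sharpening divides by the discrete transform of the spline sampled at grid points (values 1/8, 3/4, 1/8 along each axis), so that the interpolant reproduces the map exactly at grid points, and coordinates are given in grid units. The function names and test map are hypothetical.

```python
import numpy as np

def quadratic_bspline(t):
    """Quadratic B-spline basis function with support |t| < 1.5."""
    t = np.abs(t)
    return np.where(t < 0.5, 0.75 - t ** 2,
                    np.where(t < 1.5, 0.5 * (1.5 - t) ** 2, 0.0))

def sharpen_for_spline(rho):
    """Divide the map's transform by the transform of the sampled spline."""
    coeffs = np.fft.fftn(rho)
    for axis, n in enumerate(rho.shape):
        # Transform of the samples (1/8, 3/4, 1/8) along one axis; never zero.
        spline_ft = 0.75 + 0.25 * np.cos(2.0 * np.pi * np.arange(n) / n)
        shape = [1] * rho.ndim
        shape[axis] = n
        coeffs = coeffs / spline_ft.reshape(shape)
    return np.fft.ifftn(coeffs).real

def interpolate(sharpened, point):
    """Density at a fractional grid coordinate from the 27 nearest grid points."""
    point = np.asarray(point, dtype=float)
    nearest = np.rint(point).astype(int)
    value = 0.0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            for dk in (-1, 0, 1):
                index = nearest + np.array([di, dj, dk])
                weight = np.prod(quadratic_bspline(point - index))
                wrapped = tuple(index % np.array(sharpened.shape))
                value += weight * sharpened[wrapped]
    return value

# Smooth periodic test map on a coarse 16 x 16 x 16 grid.
n = 16
x, y, z = np.meshgrid(np.arange(n), np.arange(n), np.arange(n), indexing="ij")
rho = np.cos(2 * np.pi * x / n) + np.sin(2 * np.pi * y / n) * np.cos(2 * np.pi * z / n)

sharpened = sharpen_for_spline(rho)
value = interpolate(sharpened, (3.5, 2.25, 7.75))
exact = (np.cos(2 * np.pi * 3.5 / n)
         + np.sin(2 * np.pi * 2.25 / n) * np.cos(2 * np.pi * 7.75 / n))
```

Because the interpolant is a sum of smooth basis functions, differentiating the weights with respect to the coordinate gives the density gradient at the same cost, which is the property used when refining averaging operators.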

25.2.2.5.7. Multi-crystal averaging

The multi-crystal averaging calculation in DMMULTI is equivalent to several single-crystal averaging calculations running simultaneously, with the exception that during the averaging step, the molecule density is averaged across every copy in every crystal form. This average is weighted by the mean figure of merit of each crystal form; this allows the inclusion of unphased crystal forms, since in the first cycle they will have zero weight and therefore not disrupt the phasing that is already present. In subsequent cycles, the unphased form contains phase information from the back-transformed density.
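The figure-of-merit weighting can be sketched as follows, assuming that every copy of the molecule has already been interpolated onto a common grid and that each copy is weighted by the mean figure of merit of the crystal form it comes from. The function name and the synthetic densities are hypothetical.

```python
import numpy as np

def weighted_average_density(densities, mean_foms):
    """Average molecule copies, weighting each by its crystal form's mean FOM."""
    densities = np.asarray(densities, dtype=float)
    weights = np.asarray(mean_foms, dtype=float)
    total = weights.sum()
    if total == 0.0:
        raise ValueError("at least one crystal form must carry phase information")
    # Contract the copy axis: sum_i w_i * rho_i, then normalise by sum_i w_i.
    return np.tensordot(weights, densities, axes=1) / total

rng = np.random.default_rng(3)
molecule = rng.normal(size=(6, 6, 6))

# Two phased copies and one copy from an unphased form (pure noise).
copies = [molecule + 0.1, molecule - 0.1, rng.normal(size=(6, 6, 6))]
mean_foms = [0.6, 0.6, 0.0]              # the unphased form has zero weight

averaged = weighted_average_density(copies, mean_foms)
```

In this example the noise from the unphased form does not enter the average, which mirrors the behaviour described for the first cycle.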

This technique can be extremely useful, since adding a new crystal form usually provides considerably more phase information than adding a new derivative if the cross-rotation and translation functions can be solved.

In the multi-crystal case, averaging is performed using a two-step approach, first building an averaged molecule from all the copies in all crystal forms, then replacing the density in each crystal form with the averaged values. This approach is computationally more efficient when there are many copies of the molecule.

The conceptual flow chart of simultaneous density-modification calculations across multiple crystal forms is shown in Fig. 25.2.2.3(a); in practice, this scheme is implemented using a single process and looping over every crystal form at each stage (Fig. 25.2.2.3b). Maps are reconstructed from a large data object containing all the reflection data in every crystal form. Averaging is performed using a second data object containing maps of each averaging domain. By this means, an arbitrary number of domains may be averaged across an arbitrary number of crystal forms.

Multi-crystal averaging has been particularly successful in solving structures from very weak initial phasing, since the data redundancy is usually higher than for single-crystal problems.

References

Cowtan, K. D. & Main, P. (1998). Miscellaneous algorithms for density modification. Acta Cryst. D54, 487–493.
Leslie, A. G. W. (1987). A reciprocal-space method for calculating a molecular envelope using the algorithm of B. C. Wang. Acta Cryst. A43, 134–136.
Rossmann, M. G., McKenna, R., Tong, L., Xia, D., Dai, J.-B., Wu, H., Choi, H.-K. & Lynch, R. E. (1992). Molecular replacement real-space averaging. J. Appl. Cryst. 25, 166–180.
Vellieux, F. M. D. A. P., Hunt, J. F., Roy, S. & Read, R. J. (1995). DEMON/ANGEL: a suite of programs to carry out density modification. J. Appl. Cryst. 28, 347–351.
Zhang, K. Y. J. & Main, P. (1990a). Histogram matching as a new density modification technique for phase refinement and extension of protein molecules. Acta Cryst. A46, 41–46.
Zhang, K. Y. J. & Main, P. (1990b). The use of Sayre's equation with solvent flattening and histogram matching for phase extension and refinement of protein structures. Acta Cryst. A46, 377–381.