International Tables for Crystallography, Volume F: Crystallography of biological macromolecules. Edited by M. G. Rossmann and E. Arnold. © International Union of Crystallography 2006
International Tables for Crystallography (2006). Vol. F. ch. 15.1, pp. 314-315
Section 15.1.2.2.2. The prediction of the ideal histogram
a Division of Basic Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N., Seattle, WA 90109, USA, b Department of Chemistry, University of York, York YO1 5DD, England, and c Department of Physics, University of York, York YO1 5DD, England
Polypeptide structures in particular, and biological macromolecules in general, display a broadly similar atomic composition, and the way in which these atoms bond together is also conserved across a wide range of structures. These similarities between different protein structures can be used to predict the ideal histogram even when positional information for individual atoms is not available in a map. If the positional information is removed from an electron-density map, then what remains is an unlabelled list of density values. This list is the histogram of the electron-density distribution, which is independent of the relative disposition of these densities. The shape of the histogram is primarily based on the presence of atoms and their characteristic distances from each other. This is true for all polypeptide structures.
The frequency distribution, $P(\rho)$, of electron-density values in a map can be constructed by sampling the map and counting the density values in different ranges. In practice, once the electron-density map has been sampled on a discrete grid, this frequency distribution becomes a histogram, but for convenience, it is treated here as a continuous distribution.
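The counting procedure described above can be sketched in a few lines. This is a minimal illustration, assuming the map is already available as a NumPy array of density values on a regular grid; the function name `density_histogram` and the default bin count are hypothetical choices, not part of the original method description.

```python
import numpy as np


def density_histogram(density_map, n_bins=100):
    """Return bin centres and a normalized histogram of density values.

    The grid positions are discarded: only the list of density values
    is used, so any rearrangement of the grid points gives the same result.
    """
    values = np.asarray(density_map, dtype=float).ravel()
    # density=True scales the counts so that the histogram integrates to 1.
    frequencies, edges = np.histogram(values, bins=n_bins, density=True)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, frequencies


# Illustrative input only: random numbers standing in for a sampled map.
rng = np.random.default_rng(0)
example_map = rng.normal(size=(8, 8, 8))
centres, frequencies = density_histogram(example_map, n_bins=20)
```

Because the histogram depends only on the list of values, shuffling the grid points of `example_map` leaves `frequencies` unchanged, which is the sense in which the histogram is independent of the relative disposition of the densities.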
At resolutions better than 6.0 Å, and after exclusion of the solvent region, the frequency distribution of electron-density values in the protein region depends, to a good approximation, only on resolution and overall temperature factor across a wide range of proteins. If the overall temperature factor is artificially adjusted, for example, by sharpening to $B = 0$, then the frequency distributions may be treated as a function of resolution only. Consequently, once a good approximation to the molecular envelope is known, the frequency distribution of electron densities in the protein region as a function of resolution may be assumed to be known, and the ideal density histogram for an unknown map at a given resolution can be taken from any known structure at the same resolution (Zhang & Main, 1988, 1990a).
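The text does not say how the temperature-factor adjustment is carried out. One common way to do it, sketched here under stated assumptions, is to rescale each structure-factor amplitude by the inverse of the isotropic Debye-Waller factor $\exp[-B/(4d^2)]$, where $d$ is the resolution of the reflection. The function name `sharpen_amplitudes`, the default target of zero, and the assumption that the overall $B$ has already been estimated (for example from a Wilson plot) are all illustrative.

```python
import numpy as np


def sharpen_amplitudes(amplitudes, d_spacings, b_overall, b_target=0.0):
    """Rescale amplitudes from an overall B of b_overall to b_target.

    Assumes an isotropic overall temperature factor, which attenuates
    each amplitude by exp(-B / (4 d^2)), with d in angstroms and B in
    square angstroms.
    """
    amplitudes = np.asarray(amplitudes, dtype=float)
    d = np.asarray(d_spacings, dtype=float)
    return amplitudes * np.exp((b_overall - b_target) / (4.0 * d ** 2))
```

Maps computed from the rescaled amplitudes at the same resolution cut-off can then be compared through their histograms without a separate dependence on the temperature factor.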
The ideal electron-density histogram can also be predicted by an analytical formula (Lunin & Skovoroda, 1991; Main, 1990a). The method adopted by Main (1990a) represents the density histogram by components that correspond to three types of electron density in the map. The first component is the region of overlapping densities, which can be represented by a randomly distributed background noise. The second component is the region of partially overlapping densities. The third component is the region of non-overlapping atomic peaks, which can be represented by a Gaussian.
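The histogram for the overlapping part of the density can be represented by a Gaussian distribution,

$$P_o(\rho) = N \exp\left[-\frac{(\rho - \bar{\rho})^2}{2\sigma^2}\right], \qquad (15.1.2.8)$$

where $\bar{\rho}$ is the mean density and $\sigma$ is the standard deviation. The region of partially overlapping densities can be modelled by a cubic polynomial function,

$$P_{po}(\rho) = N\left(a\rho^3 + b\rho^2 + c\rho + d\right). \qquad (15.1.2.9)$$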
The histogram for the non-overlapping part of the density can be derived analytically from a Gaussian atom,

$$P_{no}(\rho) = N\,\frac{A}{\rho}\left[\ln\left(\frac{\rho_0}{\rho}\right)\right]^{1/2}, \qquad (15.1.2.10)$$

where $\rho_0$ is the maximum density, $N$ is a normalizing factor and $A$ is the relative weight of the terms between equation (15.1.2.8) and equation (15.1.2.10).
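The dependence of the non-overlapping term on density can be motivated by a short derivation. The sketch below assumes an isolated, isotropic Gaussian atom with peak density $\rho_0$ and width $s$ (the symbol $s$ is introduced here for illustration only) and measures the volume enclosed by each density contour.

```latex
% Density of an isolated Gaussian atom and the radius of the contour at level \rho
\rho(r) = \rho_0 \exp\left(-\frac{r^2}{2s^2}\right)
\;\Longrightarrow\;
r^2 = 2s^2 \ln\left(\frac{\rho_0}{\rho}\right), \qquad 0 < \rho \le \rho_0 .

% Volume in which the density exceeds \rho
V(\rho) = \frac{4\pi}{3}\, r^3
        = \frac{4\pi}{3}\,(2s^2)^{3/2}\left[\ln\left(\frac{\rho_0}{\rho}\right)\right]^{3/2} .

% The histogram is the volume per unit density interval
P(\rho) \propto -\frac{dV}{d\rho}
        = 2\pi\,(2s^2)^{3/2}\,\frac{1}{\rho}\left[\ln\left(\frac{\rho_0}{\rho}\right)\right]^{1/2} .
```

The constant prefactor can be absorbed into the weight and the normalization, leaving a term proportional to $\rho^{-1}[\ln(\rho_0/\rho)]^{1/2}$ that falls to zero at the peak density $\rho_0$.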
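If we use two threshold values, $\rho_1$ and $\rho_2$, to divide the three density regions, the complete formula can be expressed as

$$P(\rho) = \begin{cases}
N \exp\left[-\dfrac{(\rho - \bar{\rho})^2}{2\sigma^2}\right], & \rho \le \rho_1, \\[2ex]
N\left(a\rho^3 + b\rho^2 + c\rho + d\right), & \rho_1 < \rho \le \rho_2, \\[2ex]
N\,\dfrac{A}{\rho}\left[\ln\left(\dfrac{\rho_0}{\rho}\right)\right]^{1/2}, & \rho_2 < \rho \le \rho_0.
\end{cases} \qquad (15.1.2.11)$$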
The parameters $a$, $b$, $c$, $d$ in the cubic polynomial are calculated by matching function values and gradients at $\rho_1$ and $\rho_2$. The parameters in the histogram formula, $\bar{\rho}$, $\sigma$, $A$, $\rho_0$, $\rho_1$, $\rho_2$, can be obtained from histograms of known structures.
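Matching values and gradients at the two thresholds gives four linear equations for the four cubic coefficients. The following sketch solves them and evaluates the three-part histogram. It assumes a Gaussian for the overlapping term and the form $(A/\rho)[\ln(\rho_0/\rho)]^{1/2}$ for the non-overlapping term, sets the normalizing factor to 1, and requires $0 < \rho_1 < \rho_2 < \rho_0$. All function names and the numerical values in the example call are hypothetical; they are not fitted to any real structure.

```python
import numpy as np


def gaussian_value(rho, rho_mean, sigma):
    """Overlapping-density term with the normalizing factor set to 1."""
    return np.exp(-(rho - rho_mean) ** 2 / (2.0 * sigma ** 2))


def gaussian_slope(rho, rho_mean, sigma):
    """Gradient of the overlapping-density term."""
    return -(rho - rho_mean) / sigma ** 2 * gaussian_value(rho, rho_mean, sigma)


def peak_value(rho, weight, rho_0):
    """Non-overlapping term (A / rho) * sqrt(ln(rho_0 / rho)); weight is A."""
    return weight / rho * np.sqrt(np.log(rho_0 / rho))


def peak_slope(rho, weight, rho_0):
    """Gradient of the non-overlapping term (valid for 0 < rho < rho_0)."""
    log_term = np.log(rho_0 / rho)
    return -weight / rho ** 2 * (np.sqrt(log_term) + 0.5 / np.sqrt(log_term))


def cubic_coefficients(rho_mean, sigma, weight, rho_0, rho_1, rho_2):
    """Solve for a, b, c, d by matching values and gradients at rho_1 and rho_2."""
    matrix = np.array([
        [rho_1 ** 3, rho_1 ** 2, rho_1, 1.0],        # value at rho_1
        [3.0 * rho_1 ** 2, 2.0 * rho_1, 1.0, 0.0],   # gradient at rho_1
        [rho_2 ** 3, rho_2 ** 2, rho_2, 1.0],        # value at rho_2
        [3.0 * rho_2 ** 2, 2.0 * rho_2, 1.0, 0.0],   # gradient at rho_2
    ])
    targets = np.array([
        gaussian_value(rho_1, rho_mean, sigma),
        gaussian_slope(rho_1, rho_mean, sigma),
        peak_value(rho_2, weight, rho_0),
        peak_slope(rho_2, weight, rho_0),
    ])
    return np.linalg.solve(matrix, targets)


def ideal_histogram(rho, rho_mean, sigma, weight, rho_0, rho_1, rho_2):
    """Evaluate the unnormalized three-part histogram at the densities rho."""
    rho = np.asarray(rho, dtype=float)
    a, b, c, d = cubic_coefficients(rho_mean, sigma, weight, rho_0, rho_1, rho_2)
    low = rho <= rho_1
    middle = (rho > rho_1) & (rho <= rho_2)
    high = (rho > rho_2) & (rho <= rho_0)
    result = np.zeros_like(rho)  # densities above rho_0 keep the value 0
    result[low] = gaussian_value(rho[low], rho_mean, sigma)
    result[middle] = ((a * rho[middle] + b) * rho[middle] + c) * rho[middle] + d
    result[high] = peak_value(rho[high], weight, rho_0)
    return result


# Illustrative parameter values only.
example = dict(rho_mean=0.0, sigma=0.3, weight=0.05, rho_0=3.0, rho_1=0.5, rho_2=1.5)
histogram = ideal_histogram(np.linspace(-1.0, 3.0, 401), **example)
```

Dividing `histogram` by its numerical integral over the sampled range would supply the normalizing factor. Fitting the six parameters to histograms of known structures is a separate step that this sketch does not attempt.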