Structural features of monomers

Fitzgerald, P. M. D.; Westbrook, J. D.; Bourne, P. E.; McMahon, B.; Watenpaugh, K. D.; Berman, H. M.

doi:10.1107/97809553602060000738

International Tables for Crystallography
Volume G: Definition and exchange of crystallographic data
Edited by S. R. Hall and B. McMahon

International Tables for Crystallography (2006). Vol. G. ch. 3.6, pp. 183-184

Section 3.6.7.5.4. Structural features of monomers

P. M. D. Fitzgerald,^a ^* J. D. Westbrook,^b P. E. Bourne,^c B. McMahon,^d K. D. Watenpaugh^e and H. M. Berman^f

^a Merck Research Laboratories, Rahway, New Jersey, USA, ^b Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers, The State University of New Jersey, Department of Chemistry and Chemical Biology, 610 Taylor Road, Piscataway, New Jersey, USA, ^c Research Collaboratory for Structural Bioinformatics, San Diego Supercomputer Center, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0537, USA, ^d International Union of Crystallography, 5 Abbey Square, Chester CH1 2HU, England, ^e retired; formerly Structural, Analytical and Medicinal Chemistry, Pharmacia Corporation, Kalamazoo, Michigan, USA, and ^f Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers, The State University of New Jersey, Department of Chemistry and Chemical Biology, 610 Taylor Road, Piscataway, New Jersey, USA
Correspondence e-mail: paula_fitzgerald@merck.com

The data items in these categories are as follows:

(a) STRUCT_MON_DETAILS [Scheme scheme153]

(b) STRUCT_MON_NUCL [Scheme scheme154]

(c) STRUCT_MON_PROT [Scheme scheme155]

(d) STRUCT_MON_PROT_CIS [Scheme scheme156]

The bullet ( $[\bullet]$ ) indicates a category key. Where multiple items within a category are marked with a bullet, they must be taken together to form a compound key. The arrow ( $[\rightarrow]$ ) is a reference to a parent data item.

Most macromolecules have complex structures that contain both well defined regions and flexible regions that are difficult to model accurately. Overall measures of the quality of a model, such as the standard crystallographic R factors, do not represent the local quality of the model. During the development of the mmCIF dictionary, it became clear that the biological crystallography community wanted mmCIF to contain data items that allow the local quality of the model to be recorded. These data items are found in the categories STRUCT_MON_DETAILS, STRUCT_MON_NUCL (for nucleotides), and STRUCT_MON_PROT and STRUCT_MON_PROT_CIS (for proteins). Using these categories, quantities that reflect the local quality of the structure, such as isotropic displacement factors, real-space R factors and real-space correlation coefficients, can be given at the monomer and submonomer levels.

In addition, these categories can be used to record the conformation of the structure at the monomer level by listing side-chain torsion angles. These values can be derived from the atom coordinate list, so it would not be common practice to include them in an mmCIF for archiving a structure unless it was to highlight conformations that deviate significantly from expected values (Engh & Huber, 1991). However, there are applications, such as comparative studies across a number of independent determinations of the same structure, where it would be useful to store torsion-angle information without having to recalculate it each time it is needed.
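The remark that torsion angles can be derived from the atom coordinate list can be illustrated with a minimal sketch. The function name `dihedral` and the plain-tuple coordinate format are assumptions made for this illustration; they are not part of the mmCIF dictionary or of any particular software package.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Return the torsion angle p0-p1-p2-p3 in degrees, within [-180, 180].

    Each point is an (x, y, z) tuple of Cartesian coordinates.
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p1, p0)          # first bond vector
    b1 = sub(p2, p1)          # central bond vector (rotation axis)
    b2 = sub(p3, p2)          # last bond vector
    n1 = cross(b0, b1)        # normal to the plane p0-p1-p2
    n2 = cross(b1, b2)        # normal to the plane p1-p2-p3

    # atan2 form: numerically stable and gives the sign of the angle directly
    y = dot(cross(n1, n2), b1) / math.sqrt(dot(b1, b1))
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

For a protein side chain, the four points would be, for instance, the N, CA, CB and CG coordinates of a residue, giving the first side-chain torsion angle.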

The relationships between the categories used to describe the structural features of monomers are shown in Fig. 3.6.7.11.

Figure 3.6.7.11

The family of categories used to describe the structural features of monomers. Boxes surround categories of related data items. Data items that serve as category keys are preceded by a bullet ( $[\bullet]$ ). Lines show relationships between linked data items in different categories with arrows pointing at the parent data items.

Three indicators of the quality of a structure at the local level are included in this version of the dictionary: the mean displacement (B) factor, the real-space correlation coefficient (Jones et al., 1991) and the real-space R factor (Brändén & Jones, 1990). Other indicators are likely to be added as they become available. In the current version of the dictionary, these metrics can be given at the monomer level, or at the levels of main- and side-chain for proteins, or base, phosphate and sugar for nucleic acids (Altona & Sundaralingam, 1972).
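As an illustration of reporting a quality indicator at the monomer and submonomer levels, the following sketch averages B factors over a whole protein residue, its main chain and its side chain. The input format (a list of atom-name and B-factor pairs) and the choice of N, CA, C and O as the main-chain atoms are assumptions made for this example only.

```python
from statistics import fmean

# Assumed definition of the protein main chain for this sketch.
MAIN_CHAIN = {"N", "CA", "C", "O"}

def mean_b_by_part(atoms):
    """Return mean B factors for one residue.

    `atoms` is a list of (atom_name, b_factor) pairs. The result maps
    "all", "main" and "side" to a mean value, or to None when the
    residue has no atoms of that kind (e.g. glycine has no side chain).
    """
    main = [b for name, b in atoms if name in MAIN_CHAIN]
    side = [b for name, b in atoms if name not in MAIN_CHAIN]
    return {
        "all": fmean(b for _, b in atoms),
        "main": fmean(main) if main else None,
        "side": fmean(side) if side else None,
    }

serine = [("N", 10.0), ("CA", 12.0), ("C", 14.0), ("O", 16.0),
          ("CB", 20.0), ("OG", 30.0)]
result = mean_b_by_part(serine)
```

The three values correspond in spirit to per-residue mean B factors for all atoms, main-chain atoms and side-chain atoms; an analogous split into base, phosphate and sugar would apply to nucleotides.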

The variables used when calculating real-space correlation coefficients and real-space R factors, such as the coefficients used to calculate the map being evaluated or the radii used for including points in a calculation, can be recorded using the data items _struct_mon_details.RSC and _struct_mon_details.RSR.
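A minimal sketch of the two real-space indicators follows. It assumes the commonly quoted forms: the real-space R factor as the sum of absolute differences between observed and calculated density divided by the sum of their absolute sums, and the real-space correlation coefficient as the linear correlation between the two densities. The selection of grid points around a residue, which depends on the map coefficients and radii mentioned above, is not shown; the function names are hypothetical.

```python
import math

def real_space_r(rho_obs, rho_calc):
    """Real-space R factor over matched grid points (assumed form)."""
    num = sum(abs(o - c) for o, c in zip(rho_obs, rho_calc))
    den = sum(abs(o + c) for o, c in zip(rho_obs, rho_calc))
    return num / den

def real_space_cc(rho_obs, rho_calc):
    """Linear correlation coefficient between observed and calculated density."""
    n = len(rho_obs)
    mean_o = sum(rho_obs) / n
    mean_c = sum(rho_calc) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(rho_obs, rho_calc))
    var_o = sum((o - mean_o) ** 2 for o in rho_obs)
    var_c = sum((c - mean_c) ** 2 for c in rho_calc)
    return cov / math.sqrt(var_o * var_c)
```

Because the result depends on which grid points are included and how the maps are scaled, values computed with different settings are not directly comparable, which is why the settings themselves are worth recording.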

These categories also provide for recording the full conformation of the macromolecule, with a full set of data items for the torsion angles of both proteins and nucleic acids. Although one could use these data items to describe the whole macromolecule, it is more likely that they would be used to highlight regions of the structure that deviate from expected values (Example 3.6.7.11). Deviations from expected values could imply inaccuracies in the model in poorly defined parts of the structure, but in some cases nonstandard torsion angles are found in very well defined regions and are essential to the proper configurations of active sites or ligand binding pockets.

Example 3.6.7.11. A hypothetical example of the structural features of a single protein residue described with data items in the STRUCT_MON_PROT category.

[Scheme scheme157]

A special case of nonstandard conformation is the occurrence of cis peptides in proteins. As the cis conformation occurs quite often, the category STRUCT_MON_PROT_CIS is provided so that an explicit list can be made of cis peptides. The related data item _struct_mon_details.prot_cis allows an author to specify how far, in degrees, a peptide torsion angle can deviate from the expected value of 0.0° and still be considered to be cis.
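The role of the tolerance can be sketched as a simple test on the peptide torsion angle. The function name `is_cis` and the example tolerance of 30 degrees are assumptions for illustration; the actual tolerance is whatever the author records in _struct_mon_details.prot_cis.

```python
def is_cis(omega_deg, tolerance_deg):
    """Return True if a peptide torsion angle lies within the tolerance of 0 degrees.

    The angle is first wrapped into [-180, 180) so that values such as
    355 degrees are treated as -5 degrees.
    """
    wrapped = (omega_deg + 180.0) % 360.0 - 180.0
    return abs(wrapped) <= tolerance_deg

# Hypothetical angles, checked against an assumed tolerance of 30 degrees.
flags = [is_cis(w, 30.0) for w in (5.0, -12.0, 355.0, 178.0)]
```

Here the first three angles are classified as cis and the last, close to 180 degrees, is not.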

In these categories, properties are listed by residue rather than by individual atom. The only label components needed to identify the residue are *_alt, *_asym, *_comp and *_seq. If the author has provided an alternative labelling system, this can also be used. Since the analysis is by individual residue, there is no need to specify symmetry operations that might be needed to move one residue so that it is next to another.
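Identification by residue rather than by atom can be sketched as a lookup keyed on the four label components. The class name `ResidueKey`, its field names and the stored values are hypothetical and chosen only to mirror the *_alt, *_asym, *_comp and *_seq components.

```python
from typing import NamedTuple, Optional

class ResidueKey(NamedTuple):
    """Residue identifier built from the four label components."""
    asym_id: str                   # *_asym: which entity instance (chain)
    comp_id: str                   # *_comp: monomer type, e.g. "SER"
    seq_id: int                    # *_seq: position in the sequence
    alt_id: Optional[str] = None   # *_alt: alternative conformation, if any

# Per-residue properties; no atom name or symmetry operator is needed.
per_residue = {
    ResidueKey("A", "SER", 35): {"mean_B_all": 17.0},
    ResidueKey("A", "SER", 35, "B"): {"mean_B_all": 21.5},
}

entry = per_residue[ResidueKey("A", "SER", 35)]
```

Including the alternative-conformation component in the key lets two conformers of the same residue carry separate values.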

References

Altona, C. & Sundaralingam, M. (1972). Conformational analysis of the sugar ring in nucleosides and nucleotides. New description using the concept of pseudorotation. J. Am. Chem. Soc. 94, 8205–8212.

Brändén, C.-I. & Jones, T. A. (1990). Between objectivity and subjectivity. Nature (London), 343, 687–689.

Engh, R. A. & Huber, R. (1991). Accurate bond and angle parameters for X-ray protein structure refinement. Acta Cryst. A47, 392–400.

Jones, T. A., Zou, J.-Y., Cowan, S. W. & Kjeldgaard, M. (1991). Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Cryst. A47, 110–119.
