International Tables for Crystallography
Volume F
Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F, ch. 22.4, pp. 559-560

Section 22.4.3. Structural knowledge from the CSD

F. H. Allen,* J. C. Cole and M. L. Verdonk

Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, England
Correspondence e-mail:

The CSD software system

Structural knowledge from the CSD is reflected principally in the geometries of individual molecules, extended crystal structures and, most importantly, through systematic studies of the geometrical characteristics of large subsets of related substructural units. Software facilities for search, retrieval, analysis and visualization of CSD information are fully described in Chapter 24.3. The system allows for the calculation of a very wide range of geometrical parameters, both intramolecular and intermolecular. Most importantly, chemical substructural search fragments may be specified using normal covalent bonding definitions (single, double, triple etc.), limiting non-covalent contact distances and other geometrical constraints. For each instance of a search fragment located in the CSD, the system will compute a user-defined set of geometrical descriptors. The full matrix, G(N, p), of the p geometrical parameters for each of the N fragments located in the CSD can then be analysed using numerical, statistical and visualization techniques to display individual parameter distributions, to compute medians, means and standard deviations, and to examine the geometrical data for correlations or discrete clusters of observations that may exist in the p-dimensional parameter space.

CSD structures and substructures of relevance to protein studies
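As a minimal sketch of the kind of analysis that the matrix G(N, p), described above for the CSD software system, supports, the Python fragment below computes the median, mean and standard deviation of each parameter and a Pearson correlation coefficient between two parameters. It does not use the CSD software itself. The matrix values, the parameter names and the helper functions `column`, `summarize` and `pearson` are assumptions introduced purely for illustration.

```python
import statistics
from math import sqrt

# Hypothetical G(N, p) matrix: N = 5 fragments (rows), p = 2 parameters (columns).
# The numbers are invented for illustration; they are NOT values retrieved from the CSD.
PARAMETER_NAMES = ["bond_length_A", "torsion_deg"]
G = [
    [1.52, 60.0],
    [1.53, 62.0],
    [1.51, 58.0],
    [1.54, 64.0],
    [1.50, 56.0],
]


def column(matrix, j):
    """Return parameter j for every fragment (one column of G)."""
    return [row[j] for row in matrix]


def summarize(values):
    """Median, mean and sample standard deviation of one parameter."""
    return {
        "median": statistics.median(values),
        "mean": statistics.fmean(values),
        "stdev": statistics.stdev(values),
    }


def pearson(x, y):
    """Pearson correlation coefficient between two parameters."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# One summary per column of G, plus the correlation between the two columns.
summaries = {name: summarize(column(G, j)) for j, name in enumerate(PARAMETER_NAMES)}
r = pearson(column(G, 0), column(G, 1))

for name, stats in summaries.items():
    print(name, stats)
print("correlation:", r)
```

Each row of `G` corresponds to one located fragment and each column to one user-defined descriptor, so column-wise summaries give the individual parameter distributions, while pairwise coefficients such as `r` are one simple way to look for correlations in the p-dimensional parameter space.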

The first table below presents statistics for the 3137 structures of amino acids and peptides that are available in the CSD of April 1998 (containing 181 309 entries). Although this represents less than 2% of CSD information, some may consider that these are the only entries of real interest in molecular biology. In certain cases, e.g. for the derivation of very precise molecular dimensions and for some conformational work, this may be true. However, the real issue concerns the transferability of CSD-derived information to the protein environment. It is the biological relevance of a chemical substructure (inter- or intramolecular) that is important, and this consideration immediately brings much larger subsets of CSD entries into play. Information such as van der Waals radii can be derived from the CSD as a whole, while more specific information concerning, for example, biologically important metal coordination geometries can be derived from appreciable subsets of the total database, as shown in the statistics of the second table below.

Table. Summary of amino-acid and peptide structures available in the CSD (April 1998, 181 309 entries)

(a) Overall statistics

Structures                                                  No. of entries
α-Amino acids (any organic)†                                3137
Peptides (standard or modified standard α-amino acids)‡     1430

(b) Peptide statistics

No. of residues    No. of CSD entries
                   Acyclic    Cyclic
2                  543        123
3                  249        45
4                  76         50
5                  62         44
6                  20         73
7                  14         15
8                  19         32
10                 16         19
11                 4          10
12                 2          11
14                 1
15                 3          2
16                 3

† Any organic structure containing the α-amino acid functionality.
‡ The standard amino acids (those normally found in proteins) may be modified by substitution in these peptides.

Table. CSD entry statistics for selected metal-containing structures

CSD entries (R < 0.10) containing M and (N or O). No additional transition metals were allowed to occur in the Na, K, Mg and Ca structures cited.

Metal    No. of CSD entries
Na       1189
K        987
Mg       510
Ca       469
Zn       1996

Geometrical parameters of relevance to protein studies

Precise geometrical knowledge from atomic resolution studies of small molecules is important in the macromolecular domain since it provides: (a) geometrical restraints and standards to be applied during protein structure determination, refinement and validation; (b) model geometries for liganded small molecules and information about their preferred modes of interaction with the host protein; (c) details of metal coordination spheres and geometries that are likely to be observed in metalloproteins; and (d) information from which force field and other parameters may be derived. Thus, the types of study discussed in this chapter are concerned with retrieving systematic knowledge concerning:

  (1) molecular dimensions: bond lengths and valence angles;
  (2) conformational features: torsion angles that describe acyclic and cyclic systems;
  (3) metal coordination-sphere geometries: coordination numbers, metal–ligand distances and inter-ligand valence angles;
  (4) general non-bonded contact distances: van der Waals radii;
  (5) hydrogen-bond geometries: distances, angles, directional properties;
  (6) other non-bonded interactions: identification and geometrical description;
  (7) formation of preferred atomic arrangements or motifs involving non-covalent interactions.
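The intramolecular parameters named in items (1) and (2) can be computed directly from Cartesian atomic coordinates. The Python fragment below is a minimal sketch of such a calculation and is not the CSD system's own implementation: the function names and the test coordinates are hypothetical, and the torsion angle uses the common atan2 formulation, which returns a signed value in degrees.

```python
import math


def sub(a, b):
    """Vector a - b."""
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def norm(a):
    """Length of a 3-vector."""
    return math.sqrt(dot(a, a))


def bond_length(a, b):
    """Distance between atoms a and b, in the units of the coordinates."""
    return norm(sub(a, b))


def valence_angle(a, b, c):
    """Angle a-b-c at the central atom b, in degrees."""
    u, v = sub(a, b), sub(c, b)
    cos_theta = dot(u, v) / (norm(u) * norm(v))
    # Clamp to [-1, 1] to guard against rounding just outside the acos domain.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def torsion_angle(a, b, c, d):
    """Signed torsion angle a-b-c-d about the b-c bond, in degrees."""
    b1, b2, b3 = sub(b, a), sub(c, b), sub(d, c)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = norm(b2) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical orthogonal coordinates, chosen so the results are easy to check by eye.
A = (1.0, 0.0, 0.0)
B = (0.0, 0.0, 0.0)
C = (0.0, 0.0, 1.0)
D = (0.0, 1.0, 1.0)

length_ab = bond_length(A, B)
angle_abc = valence_angle(A, B, C)
torsion_abcd = torsion_angle(A, B, C, D)
print(length_ab, angle_abc, torsion_abcd)
```

Applying functions like these to every located fragment yields one row of geometrical descriptors per fragment, which is the form in which such parameters are tabulated for statistical analysis. The functions assume orthogonal (Cartesian) coordinates, so fractional crystallographic coordinates would first need to be converted using the unit-cell parameters.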

In this short overview, which deals with such a broad range of structural information, our literature coverage is, of necessity, highly selective. In each area, we have tried to cite the more recent papers, from which leading references to earlier studies can be located. We also draw attention to a number of recent monographs in which a variety of CSD analyses are comprehensively cited and discussed: Structure Correlation (Bürgi & Dunitz, 1994), Crystal Structure Analysis for Chemists and Biologists (Glusker et al., 1994), Hydrogen Bonding in Biological Structures (Jeffrey & Saenger, 1991) and Crystal Engineering: the Design of Organic Solids (Desiraju, 1989). Finally, we note the CCDC's own database of published research applications of the CSD. The DBUSE database currently contains literature references and short descriptive abstracts for nearly 700 papers. It forms part of each biannual CSD release and is fully searchable using the Quest3D program.


Bürgi, H.-B. & Dunitz, J. D. (1994). Structure correlation. Weinheim: VCH Publishers.
Desiraju, G. R. (1989). Crystal engineering: the design of organic solids. New York: Academic Press.
Glusker, J. P., Lewis, M. & Rossi, M. (1994). Crystal structure analysis for chemists and biologists. Weinheim: VCH Publishers.
Jeffrey, G. A. & Saenger, W. (1991). Hydrogen bonding in biological structures. Berlin: Springer Verlag.
