International Tables for Crystallography
Volume F
Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F. ch. 24.3, pp. 664-665

Section Derived data and bit-encoded information

F. H. Allenᵃ* and V. J. Hoyᵃ

ᵃCambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, England

Derived data are calculated directly from the evaluated raw data and stored in the master archive for search purposes. Z′, the number of chemical entities in the asymmetric unit, is a typical (real) numerical data item in this category. However, by far the most useful of the derived data items is a set of 682 individual pieces of yes/no information, encoded as a bitmap and referred to as the screen record. The first 155 of these bits record information about (a) the elemental constitution of the compound, (b) the results of the data-validation procedure and (c) summary information about the data content of the entry. These bits can be accessed directly by the user as search keys. The most important parts of the bitmap contain codified yes/no information about the presence or absence of specific features in the complete 2D or 3D structures held in the CSD. When a chemical substructure is entered as a query, its constitution is analysed in the same way to produce a bitmap for the query. Logical comparison of the query bitmap with the bitmap stored for each full CSD entry is computationally rapid and quickly eliminates those entries that do not contain the requested features. Only those entries that pass this initial screening need to enter the detailed and computationally intensive atom-by-atom, bond-by-bond connectivity mapping that finally confirms (or refutes) the presence of the required query substructure.
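The screening step described above can be illustrated with a minimal sketch. It assumes that each bitmap is held as an integer and that an entry passes the screen when every bit set in the query is also set in the entry. The feature names, bit positions and entry identifiers below are hypothetical and chosen only for illustration; they are not the actual CSD screen-record assignments, which are not given in the text.

```python
# Illustrative sketch only. Feature names and bit positions are invented
# and do not correspond to the real CSD screen record.
FEATURE_BITS = {
    "contains_N": 0,
    "contains_O": 1,
    "contains_transition_metal": 2,
    "six_membered_ring": 3,
    "carbonyl_group": 4,
}


def make_bitmap(features):
    """Encode a collection of yes/no features as an integer bitmap."""
    bitmap = 0
    for name in features:
        bitmap |= 1 << FEATURE_BITS[name]
    return bitmap


def passes_screen(entry_bitmap, query_bitmap):
    """True if every bit set in the query is also set in the entry."""
    return (entry_bitmap & query_bitmap) == query_bitmap


def screen(entries, query_bitmap):
    """Return identifiers of entries that survive the bitmap screen."""
    return [
        ident
        for ident, bitmap in entries.items()
        if passes_screen(bitmap, query_bitmap)
    ]


# Hypothetical entries, each with a precomputed bitmap.
entries = {
    "ENTRY01": make_bitmap(["contains_N", "six_membered_ring"]),
    "ENTRY02": make_bitmap(["contains_O", "carbonyl_group"]),
    "ENTRY03": make_bitmap(
        ["contains_N", "contains_O", "six_membered_ring", "carbonyl_group"]
    ),
}

# The query is analysed in the same way as the stored entries.
query = make_bitmap(["contains_N", "carbonyl_group"])

candidates = screen(entries, query)
print(candidates)  # ['ENTRY03']
```

In this sketch only ENTRY03 would go forward to a full connectivity match. Passing the screen shows only that the required features are present somewhere in the entry, not that they occur in the arrangement specified by the query, which is why the atom-by-atom, bond-by-bond mapping is still needed for the survivors.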
