Generalized database support

Westbrook, J. D.; Yang, H.; Feng, Z.; Berman, H. M.

doi:10.1107/97809553602060000755

International Tables for Crystallography
Volume G
Definition and exchange of crystallographic data
Edited by S. R. Hall and B. McMahon

International Tables for Crystallography (2006). Vol. G. ch. 5.5, p. 542

Section 5.5.3.2. Generalized database support

J. D. Westbrook,^a ^* H. Yang,^a Z. Feng^a and H. M. Berman^a

^a Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers, The State University of New Jersey, Department of Chemistry and Chemical Biology, 610 Taylor Road, Piscataway, NJ 08854-8087, USA
Correspondence e-mail: [email protected]

In addition to the data editing and processing functions, ADIT also supports a versatile database loader (mmCIF Loader; http://sw-tools.pdb.org/apps/MMCIF-LOADER) that builds database schemata and extracts the processed data required to load database instances. The relation of the database loader to the central components of the ADIT system is shown in Fig. 5.5.3.4.

Figure 5.5.3.4. Schematic diagram of ADIT database loading functions.

Schemata are defined in a metadata repository that is accessed by the loader application. In the simplest case, a schema can be constructed that is modelled directly from the data dictionary. Since the data model underlying the dictionary description language used to build ADIT data dictionaries is essentially relational, mapping a data dictionary specification to a relational schema is straightforward.
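The direct case can be illustrated with a minimal Python sketch, in which each dictionary category becomes a table, each item becomes a column and the category key items become the primary key. The `Item` class, the `DICTIONARY` fragment and the function `category_to_table` are hypothetical names introduced for illustration only; they do not reproduce the mmCIF Loader or its metadata repository, and the dictionary fragment is heavily simplified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One data item of a dictionary category (simplified, hypothetical)."""
    name: str            # full item name, e.g. "_entity.id"
    type_code: str       # dictionary data type code
    is_key: bool = False


# Hypothetical dictionary fragment: category name -> item definitions.
DICTIONARY = {
    "entity": [
        Item("_entity.id", "code", is_key=True),
        Item("_entity.type", "ucode"),
        Item("_entity.details", "text"),
    ],
}


def category_to_table(category, items):
    """Model a table directly on a category: items map to columns."""
    prefix = f"_{category}."
    columns = []
    primary_key = []
    for item in items:
        if not item.name.startswith(prefix):
            raise ValueError(f"{item.name} does not belong to {category}")
        column = item.name[len(prefix):]   # "_entity.id" -> "id"
        columns.append((column, item.type_code))
        if item.is_key:
            primary_key.append(column)
    return {"table": category, "columns": columns, "primary_key": primary_key}


schema = category_to_table("entity", DICTIONARY["entity"])
```

Because the category and item names carry over unchanged, no separate mapping information is needed in this case.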

In other cases, a mapping is required between the target schema and the data dictionary specification. This mapping is encoded in the schema metadata repository. The database loader uses this mapping information to extract items from data files and translate these data into a form that can be loaded into the target database schema. The definition of the mapping operation can include: selection operations with equijoin constraints (e.g. the value of _entity.type where _entity.id = 1), aggregation (e.g. count, sum, average), collapse (e.g. vector to string), type conversions and existence tests.
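The mapping operations listed above can be sketched in Python as follows. This is an illustration under stated assumptions, not the loader's implementation: a parsed data file is represented as a dictionary of categories holding rows of string values, and the function names `select`, `aggregate`, `collapse`, `convert` and `exists` are hypothetical. The sample values are invented for the example.

```python
# Assumed in-memory form of a parsed data file: category -> list of rows.
DATA = {
    "entity": [
        {"id": "1", "type": "polymer"},
        {"id": "2", "type": "water"},
    ],
    "atom_site": [
        {"id": "1", "label_entity_id": "1", "Cartn_x": "10.5"},
        {"id": "2", "label_entity_id": "1", "Cartn_x": "11.5"},
        {"id": "3", "label_entity_id": "2", "Cartn_x": "3.0"},
    ],
}


def select(data, category, item, where_item, where_value):
    """Selection with an equality constraint on another item of the row."""
    return [row[item] for row in data.get(category, [])
            if row.get(where_item) == where_value]


def aggregate(values, op):
    """Aggregation over selected values: count, sum or average."""
    if op == "count":
        return len(values)
    numbers = [float(v) for v in values]
    if op == "sum":
        return sum(numbers)
    if op == "average":
        return sum(numbers) / len(numbers)
    raise ValueError(f"unknown aggregation: {op}")


def collapse(values, separator=","):
    """Collapse a vector of values into a single string."""
    return separator.join(values)


def convert(value, target_type):
    """Type conversion from the string form used in the data file."""
    return target_type(value)


def exists(data, category, item):
    """Existence test; '?' and '.' are the CIF placeholders for missing values."""
    return any(row.get(item) not in (None, "?", ".")
               for row in data.get(category, []))


entity_type = select(DATA, "entity", "type", "id", "1")
x_coords = select(DATA, "atom_site", "Cartn_x", "label_entity_id", "1")
```

Here `entity_type` corresponds to the example in the text (the value of _entity.type where _entity.id = 1), and `x_coords` can be passed on to `aggregate` or `collapse` before loading.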

Schema definitions are converted by the database loader into SQL instructions that create the defined tables and indices. Loadable data are produced either as SQL insert/update instructions or in the more efficient table copy formats used by popular database engines (i.e. DB2, Sybase, Oracle and MySQL). Loadable data can also be produced in XML.
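The conversion of a schema definition into SQL and loadable data can be sketched as follows. The sketch uses the SQLite engine from the Python standard library purely so that the example is runnable; SQLite is not one of the engines named above, and the `SCHEMA` layout and function names are hypothetical. The tab-delimited text is only a generic stand-in for a table copy format, since the bulk-load formats of the individual engines differ in detail.

```python
import csv
import io
import sqlite3

# Hypothetical schema definition and already-extracted rows.
SCHEMA = {
    "table": "entity",
    "columns": [("id", "INTEGER"), ("type", "TEXT")],
    "primary_key": ["id"],
    "indices": [("entity_type_idx", ["type"])],
}
ROWS = [(1, "polymer"), (2, "water")]


def create_statements(schema):
    """Build the SQL that creates the defined table and its indices."""
    table = schema["table"]
    columns = ", ".join(f"{name} {sql_type}" for name, sql_type in schema["columns"])
    key = ", ".join(schema["primary_key"])
    statements = [f"CREATE TABLE {table} ({columns}, PRIMARY KEY ({key}))"]
    for index_name, index_columns in schema["indices"]:
        indexed = ", ".join(index_columns)
        statements.append(f"CREATE INDEX {index_name} ON {table} ({indexed})")
    return statements


def insert_statement(schema):
    """Build a parameterized SQL insert instruction for one row."""
    table = schema["table"]
    names = ", ".join(name for name, _ in schema["columns"])
    marks = ", ".join("?" for _ in schema["columns"])
    return f"INSERT INTO {table} ({names}) VALUES ({marks})"


def bulk_copy_text(rows):
    """Write rows as tab-delimited text, one row per line."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


connection = sqlite3.connect(":memory:")
for statement in create_statements(SCHEMA):
    connection.execute(statement)
connection.executemany(insert_statement(SCHEMA), ROWS)
loaded = connection.execute("SELECT id, type FROM entity ORDER BY id").fetchall()
```

Row-by-row insert instructions are portable across engines, whereas a delimited file read by an engine's bulk loader avoids per-statement overhead, which is the efficiency advantage noted in the text.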
