International
Tables for Crystallography Volume G Definition and exchange of crystallographic data Edited by S. R. Hall and B. McMahon © International Union of Crystallography 2006 |
International Tables for Crystallography (2006). Vol. G. ch. 5.3, pp. 523-525
Section 5.3.8.3. mmLib : a Python toolkit for bioinformatics applications
a
International Union of Crystallography, 5 Abbey Square, Chester CH1 2HU, England |
While the libraries developed for use within the Protein Data Bank provide powerful functionality, their very size and complexity make them inappropriate for some applications. Indeed, considerable effort may be needed to compile the C++ code on nonstandard platforms. The mmLib toolkit (Painter & Merritt, 2004) addresses this by supplying a library of object-oriented routines implemented in Python (van Rossum, 1991
) that are designed to integrate with existing or new applications in an easy way.
The objective of mmLib is to build a support platform to handle the increasingly rich data about macromolecular structure available to structural biologists. Not only do applications need to be able to handle atomic positions and build appropriate three-dimensional structure representations; but links to and integration with information on sequence, homologous structures, and biochemical, genetic and medical form and function are also demanded from individual program systems. Since much of these data are available from external databases in a variety of formats, mmLib will not be restricted to the handling of files in a single format. Its initial release provides support for mmCIF, for the PDB format files that historically have been used for representation of macromolecular structures (Westbrook & Fitzgerald, 2003) and for the MTZ format used by the CCP4 program suite (Collaborative Computational Project, Number 4, 1994
).
Table 5.3.8.1 lists the main modules in the current release. mmLib.mmCIF and mmLib.PDB are read/write parsers for mmCIF and PDB format files, respectively, which handle file input and output in these formats, and provide support for inspection or modification of such file formats. They are typically used in conjunction with the mmLib.FileLoader component to populate the mmLib.Structure internal representation of the macromolecular structure. The high-level abstraction of such functionality allows for very succinct programmatic constructs. Fig. 5.3.8.3
illustrates this with a program snippet that (apart from the necessary system calls for file management) achieves the conversion of an mmCIF input file to a PDB format representation. This is sufficiently robust and lightweight to act as an input filter to software already designed for handling PDB format files.
|
mmLib.Structure represents the internal representation of a molecular structure and is implemented as an object hierarchy with four basic object classes: Structure, Chain, Fragment and Atom. The Fragment class has subclasses AminoAcidResidue and NucleicAcidResidue. In order to build a complete representation of a structure, the toolkit may need to load data from an input mmCIF or PDB format file, and also from standard data sets of properties of individual monomers and chemical elements; these standard libraries of chemical properties are provided by the mmLib.Library module. The core mmLib source includes a limited library of such chemical properties (accessible through the subclasses mmLib.Elements, mmLib.AminoAcids and mmLib.NucleicAcids) and also provides support for the extensive CCP4 monomer library through the mmLib.Extensions.CCP4Library. The naming of this class expresses the intention that other standard data sources should be made accessible in the same way.
The CCP4 monomer library is in fact included with the software as a directory tree of small files in mmCIF format, which are loaded into the Structure object through the normal use of the toolkit's mmCIF parser.
mmLib.GLViewer is a module provided to support visualization programs using the OpenGL graphics environment. Although it does not by itself provide a stand-alone viewer, it can be incorporated into many common graphics application building environments. An example molecular viewer, mmView, is provided with the distribution as an example of an application using the GTK graphical user interface, a popular toolkit in Linux.
References



