International Tables for Crystallography, Volume G: Definition and exchange of crystallographic data. Edited by S. R. Hall and B. McMahon. © International Union of Crystallography 2006
International Tables for Crystallography (2006). Vol. G. ch. 5.6, pp. 544-545
Section 5.6.1. Introduction
(a) Stanford Linear Accelerator Center, 2575 Sand Hill Road, Menlo Park, CA 94025, USA, and (b) Department of Mathematics and Computer Science, Kramer Science Center, Dowling College, Idle Hour Blvd, Oakdale, NY 11769, USA
CBFlib is a library of ANSI C functions providing a simple mechanism for accessing crystallographic binary files (CBFs) and image-supporting CIF (imgCIF) files (see Chapters 2.3, 3.7 and 4.6). The CBFlib application programming interface (API) consists of a set of low-level functions and a set of high-level functions. The low-level functions are loosely based on the CIFPARSE (Tosic & Westbrook, 2000) API for mmCIF files. As in CIFPARSE, the low-level functions in CBFlib do not perform any semantic integrity checks and simply create, read, modify and write files conforming to the CIF syntax, with additional functionality for working with binary sections. These basic functions are independent of the details of the CBF/imgCIF dictionary and can be used to manage any CIF data set. In contrast, the high-level functions are based on the CBF/imgCIF dictionary and facilitate writing or reading commonly used entries to or from CBF and imgCIF data files.
External to a program, a CBF/imgCIF data set 'lives' in a file. Internally, when managed by CBFlib, a CBF or imgCIF data set has a simple tree structure pointed to by a 'handle' (Fig. 5.6.1.1). At the highest level are named data blocks. Each data block may contain a number of named categories. Within each category, the actual data entries are stored in tabular form with named columns and numbered rows. The software constrains all columns of a given category to have the same number of rows.
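The tree described above can be pictured with a small, self-contained C model. This is only a sketch for illustration: the `sketch_` types, the fixed array capacities and the lookup function are assumptions made here and are not CBFlib's internal data structures or API. Keeping a single row count per category is one simple way to guarantee that every column has the same number of rows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_COLUMNS 8   /* arbitrary capacity for this sketch */
#define SKETCH_MAX_ROWS 16     /* arbitrary capacity for this sketch */

/* A category is a table with named columns and numbered rows. */
typedef struct {
    const char *name;
    size_t n_columns;
    size_t n_rows;             /* one count shared by every column */
    const char *column_names[SKETCH_MAX_COLUMNS];
    const char *cells[SKETCH_MAX_COLUMNS][SKETCH_MAX_ROWS];
} sketch_category;

/* A data block is a named collection of categories. */
typedef struct {
    const char *name;
    size_t n_categories;
    const sketch_category *categories;
} sketch_datablock;

/* A data set is a collection of named data blocks. */
typedef struct {
    size_t n_datablocks;
    const sketch_datablock *datablocks;
} sketch_dataset;

/* Walk the tree level by level: data block, category, column, row.
   Returns the stored value, or NULL if any level is not found. */
const char *sketch_find_value(const sketch_dataset *set,
                              const char *block, const char *category,
                              const char *column, size_t row)
{
    size_t b, c, k;

    assert(set && block && category && column);
    for (b = 0; b < set->n_datablocks; b++) {
        const sketch_datablock *db = &set->datablocks[b];
        if (strcmp(db->name, block) != 0) continue;
        for (c = 0; c < db->n_categories; c++) {
            const sketch_category *cat = &db->categories[c];
            if (strcmp(cat->name, category) != 0) continue;
            for (k = 0; k < cat->n_columns; k++) {
                if (strcmp(cat->column_names[k], column) != 0) continue;
                return row < cat->n_rows ? cat->cells[k][row] : NULL;
            }
        }
    }
    return NULL;
}
```

A caller would fill a `sketch_dataset` and then address a value by data block name, category name, column name and row number, which mirrors the four levels of the tree.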
CBFlib provides functions to create a corresponding data structure in memory; to copy a data set from an external file to the data structure or from the data structure to an external file; to navigate the tree; to scan, add and remove data blocks within data sets, categories (tables) within data blocks, and rows or columns within categories; to read or modify data entries; and finally to delete the structure from memory.
As is common in C programming, all functions return an integer equal to 0 for success or an error code for failure. The CBFlib error codes are given in Table 5.6.1.1.
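The return-code convention lends itself to a small propagation idiom in calling code. The following is a minimal sketch of that idiom in plain C, not CBFlib code: the `SKETCH_` codes, the `SKETCH_TRY` macro and both functions are hypothetical names invented for illustration, and the numeric values are not the CBFlib error codes from the table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical codes for this sketch only; 0 means success. */
enum {
    SKETCH_OK = 0,
    SKETCH_ERR_ARGUMENT = 1,
    SKETCH_ERR_NOTFOUND = 2
};

/* Evaluate a call once and return its code from the enclosing
   function if it is nonzero. */
#define SKETCH_TRY(call) \
    do { int err_ = (call); if (err_) return err_; } while (0)

/* Internal helper: convert one decimal digit character. */
static int sketch_parse_digit(char c, int *out)
{
    assert(out != NULL);            /* callers in this file always pass one */
    if (c < '0' || c > '9') return SKETCH_ERR_NOTFOUND;
    *out = c - '0';
    return SKETCH_OK;
}

/* Add the first two digit characters of text, passing any failure
   code from the helper straight back to the caller. */
int sketch_sum_two_digits(const char *text, int *sum)
{
    int a, b;

    if (!text || !sum) return SKETCH_ERR_ARGUMENT;
    SKETCH_TRY(sketch_parse_digit(text[0], &a));
    SKETCH_TRY(sketch_parse_digit(text[1], &b));
    *sum = a + b;
    return SKETCH_OK;
}
```

Because every function reports through its return value, results travel through pointer arguments, and a caller can chain several calls while keeping the first failure code intact.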
CBFlib is thread-safe, re-entrant and able to operate on multiple data sets at a time. This means that the library maintains no static data and that the object to be operated on must be passed to each function. In CBFlib, this is accomplished by referring to each data set in memory with a unique handle of type cbf_handle. The handle maintains a link to the data structure in memory as well as the current location on the tree (data block, category, column and row). Before reading or creating a data set, the handle is created by calling the cbf_make_handle function. When the data set is no longer required, the resources associated with it can be freed using cbf_free_handle. Most functions in the library expect a handle as the first argument.
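The idea of keeping all state, including the current location, inside a handle rather than in static variables can be sketched in a few lines of plain C. Everything below is a simplified stand-in written for illustration: the `sketch_` names and the single `row` cursor are assumptions, and the signatures should not be read as those of cbf_make_handle or cbf_free_handle.

```c
#include <assert.h>
#include <stdlib.h>

/* All state lives in the object behind the handle; the library
   itself keeps no static data. */
typedef struct {
    size_t n_rows;   /* size of the toy data set */
    size_t row;      /* current location, private to this handle */
} sketch_handle_struct;

typedef sketch_handle_struct *sketch_handle;

/* Create a handle positioned at the first row; 0 on success. */
int sketch_make_handle(sketch_handle *handle, size_t n_rows)
{
    if (!handle) return 1;
    *handle = malloc(sizeof **handle);
    if (!*handle) return 1;
    (*handle)->n_rows = n_rows;
    (*handle)->row = 0;
    return 0;
}

/* Advance this handle's cursor; nonzero when there is no next row. */
int sketch_next_row(sketch_handle handle)
{
    if (!handle) return 1;
    assert(handle->row <= handle->n_rows);
    if (handle->row + 1 >= handle->n_rows) return 1;
    handle->row++;
    return 0;
}

/* Release the resources associated with the handle. */
int sketch_free_handle(sketch_handle handle)
{
    free(handle);
    return 0;
}
```

Two handles made this way can be advanced independently, which is the property that lets a program work on several data sets at once.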
CBF binary data files and imgCIF ASCII data files may have one or more large images or other data sections as values for CIF tags. The focus of CBFlib is to handle large data sections efficiently.
The basic flow of an application reading CBF/imgCIF data with the low-level CBFlib functions is shown in Fig. 5.6.1.2.
The general approach to reading CBF/imgCIF data with CBFlib is to create an empty data structure with cbf_make_handle, load the data structures with cbf_read_file and then use nested loops to work through data blocks, categories, rows and columns in turn to extract values. Conceptually, all data values are held in the memory-resident data structures. In practice, however, text fields containing image data are represented in memory only by pointers; the data themselves remain on disk until explicitly referenced.
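Holding only a pointer to a large section and fetching the bytes on first reference is a form of lazy loading. The sketch below shows one way such a scheme could look using only the standard C library; the `sketch_lazy_` names and the offset-plus-size record are assumptions for illustration and do not describe how CBFlib stores its references internally.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* What stays in memory for a large section: where the bytes are,
   not the bytes themselves. */
typedef struct {
    FILE *file;            /* open stream that still holds the data */
    long offset;           /* start of the section within the stream */
    size_t size;           /* length of the section in bytes */
    unsigned char *data;   /* NULL until the section is first referenced */
} sketch_lazy_section;

/* Record the location of a section without reading it. */
void sketch_lazy_init(sketch_lazy_section *s, FILE *file,
                      long offset, size_t size)
{
    assert(s != NULL);
    s->file = file;
    s->offset = offset;
    s->size = size;
    s->data = NULL;
}

/* Return the section bytes, reading them from disk on first use.
   Returns 0 on success and nonzero on failure. */
int sketch_lazy_get(sketch_lazy_section *s, const unsigned char **data)
{
    if (!s || !data || !s->file) return 1;
    if (!s->data && s->size > 0) {
        unsigned char *buf = malloc(s->size);
        if (!buf) return 1;
        if (fseek(s->file, s->offset, SEEK_SET) != 0 ||
            fread(buf, 1, s->size, s->file) != s->size) {
            free(buf);
            return 1;
        }
        s->data = buf;
    }
    *data = s->data;
    return 0;
}

/* Drop the in-memory copy; the bytes on disk are untouched. */
void sketch_lazy_release(sketch_lazy_section *s)
{
    if (!s) return;
    free(s->data);
    s->data = NULL;
}
```

The design keeps memory use proportional to the number of sections rather than to their total size, at the cost of requiring the file to stay open and unchanged while the record is in use.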
The basic flow of an application writing CBF/imgCIF data with the low-level CBFlib functions is shown in Fig. 5.6.1.3.
The general approach to writing CBF/imgCIF data with CBFlib is to create empty data structures with cbf_make_handle and load the data structures with nested loops, working through data blocks, categories, rows and columns in turn, to store values. The major difference from the nested loops used for reading is that empty columns are created before data are stored into the data structures row by row. Alternatively, the data could be stored column by column. Finally, the fully loaded memory data structures are written out with cbf_write_file. As with reading, text fields with image data are actually held on disk.
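The order of operations for filling one category (create the empty columns first, then store values row by row) can be sketched with a small in-memory table. This is a toy model under stated assumptions: the `sketch_` functions, the fixed capacities and the use of NULL for an unset entry are all invented here, and the code does not call or reproduce CBFlib.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_COLUMNS 4   /* arbitrary capacity for this sketch */
#define SKETCH_MAX_ROWS 8      /* arbitrary capacity for this sketch */

typedef struct {
    size_t n_columns;
    size_t n_rows;             /* shared by all columns */
    const char *names[SKETCH_MAX_COLUMNS];
    const char *cells[SKETCH_MAX_ROWS][SKETCH_MAX_COLUMNS];
} sketch_table;

/* Add an empty column; rows that already exist get an unset entry. */
int sketch_new_column(sketch_table *t, const char *name)
{
    size_t r;

    if (!t || !name || t->n_columns >= SKETCH_MAX_COLUMNS) return 1;
    t->names[t->n_columns] = name;
    for (r = 0; r < t->n_rows; r++) t->cells[r][t->n_columns] = NULL;
    t->n_columns++;
    return 0;
}

/* Add one row spanning every column, with all entries unset. */
int sketch_new_row(sketch_table *t)
{
    size_t c;

    if (!t || t->n_rows >= SKETCH_MAX_ROWS) return 1;
    for (c = 0; c < SKETCH_MAX_COLUMNS; c++) t->cells[t->n_rows][c] = NULL;
    t->n_rows++;
    return 0;
}

/* Store a value at (row, named column); nonzero if either is absent. */
int sketch_set_value(sketch_table *t, size_t row,
                     const char *column, const char *value)
{
    size_t c;

    if (!t || !column || row >= t->n_rows) return 1;
    assert(t->n_columns <= SKETCH_MAX_COLUMNS);
    for (c = 0; c < t->n_columns; c++) {
        if (strcmp(t->names[c], column) == 0) {
            t->cells[row][c] = value;
            return 0;
        }
    }
    return 1;
}
```

Filling column by column instead would use the same three operations in a different loop order: create all rows once, then set every row of one column before moving to the next.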
References
