International
Tables for Crystallography Volume G Definition and exchange of crystallographic data Edited by S. R. Hall and B. McMahon © International Union of Crystallography 2006 |
International Tables for Crystallography (2006). Vol. G. ch. 5.2, pp. 489-491
Section 5.2.2.3. Context in data sets
a
School of Computer Science and Software Engineering, University of Western Australia, 35 Stirling Highway, Crawley, Perth, WA 6009, Australia,bSchool of Biomedical and Chemical Sciences, University of Western Australia, Crawley, Perth, WA 6009, Australia, and cInternational Union of Crystallography, 5 Abbey Square, Chester CH1 2HU, England |
Another indicator of context in the previous example is the data-block header, which was reproduced in the output of Fig. 5.2.2.2.
The STAR File allows data instances in three types of location: in a data block, in a save frame or in a global block.
The usual way to partition a STAR File is by data blocks; each such block represents a data set in which a data name (associated with a single or multiple values) may be declared once only.
Data blocks may include save frames. A save frame is an encapsulated subsidiary data set, effectively insulated from the contents of the surrounding data block, in which data items may occur that have the same names as items in the parent data block. Indeed, `parent' is potentially a misleading term, since no relationship is implied between the data within a save frame and those in the data block in which the save frame occurs. A reference to a save frame may, however, occur as a data value within the data block where the save frame is specified. Recall from Section 2.1.3.6
that save frames within a data block are uniquely identified by the framecode header.
Global blocks may also occur in a STAR File, preceding or interspersed between data blocks. For each data item defined within a global block, that definition is inherited by each succeeding data block that does not contain an internal definition of a data item with the same name. If there is a definition of a data item with the same name within a data block, that internal definition overrides the global definition within that data block. The situation is then re-evaluated in the next data block. If that data block does not contain an internal definition, the global definition holds.
The scope of data values is well defined (see Section 2.1.3.9
). Only data expressed in a global block have values that are inherited in later portions of the STAR File. Data values in data blocks or save frames are restricted in scope to the current data block or save frame, respectively.
A consequence of these rules of scope and encapsulation is that a full description of the context of a STAR data value must also reflect any values carried through as global data or by de-referencing associated save frames. The results are not always intuitive.
Consider Fig. 5.2.2.3, which represents a partial description of a chemical reaction where one of the reactants is expressed as a generic structure described by the save frame save_R1. However, the generic structure in this case is restricted to a small number of alkyl groups, each described in its own save frame. Setting aside this prior knowledge, we see that a request for _atom_identity_symbol must return not only the data values in their embedded save frames, but also the save frames in their entirety and the higher-order data values that reference the matching save frames. It is only in this way that we can guarantee that the value can be used by any application. Fig. 5.2.2.4
demonstrates the full context of the returned requested data values.
![]() |
Context for the requested values of _atom_identity_symbol in the preceding example. See text for details. |
Notice that the requested item occurs (among other places) in the save frame save_carboxylic_acid and this instance of the item is presented solely in the context of the save header and closure strings (it is shown in italics in Fig. 5.2.2.4). However, one of the values extracted from this location is the save-frame reference pointer $R1 that identifies save_R1, and the complete contents of this save frame are presented (because the data structure represented by the save frame is itself one of the values of the requested data item). Further de-referencing of the save-frame pointers within save_R1 results in the extraction also of the complete save frames save_methyl and save_ethyl. In this example it is coincidental that there are instances of the requested data item (_atom_identity_symbol) within these returned save frames as well.
Notice, however, that establishing the full context of the returned data demands also that data values referencing the save_carboxylic_acid frame be presented. In this example, the value of _reaction_component_symbol at the outermost level of the data block is returned, a result that may at first seem surprising. It is only in this way that one can be sure that an arbitrary application will have access to the full semantic information carried by the data item.
Data values declared in global blocks should be presented in the same spirit of supplying the complete context in which the value was instantiated, and not simply the value in isolation. For example, given the trivial STAR File
a request for _example should return the identical file, not the interpolated result
despite the latter's equivalence purely in terms of the non-contextual values returned.