International Tables for Crystallography
Volume G: Definition and exchange of crystallographic data
Edited by S. R. Hall and B. McMahon

International Tables for Crystallography (2006). Vol. G. ch. 5.7, pp. 561-562

Section 5.7.2.6. Automated data validation: checkcif

P. R. Strickland,(a) M. A. Hoyland(a) and B. McMahon(a)*

(a) International Union of Crystallography, 5 Abbey Square, Chester CH1 2HU, England
Correspondence e-mail: bm@iucr.org

The current service for checking structural data submitted to IUCr journals is known as checkcif and is available at http://journals.iucr.org/services/cif/checkcif.html. Versions of this service have been made available to other publishers for some time. In 2003, a general service was introduced at http://checkcif.iucr.org to provide structural checks on CIF data sets destined for publication in non-IUCr journals or for database deposition, and to allow authors to assess the quality of their structure determinations whether or not they wish to publish them.

The tests carried out by checkcif include:

(i) a simple file syntax check: essential in the early days of manual CIF construction, but of less importance now as syntax-preserving editing programs have become more widespread;

(ii) tests for the self-consistency of mutually dependent data items present in the CIF;

(iii) a large collection of analytic tests on structural chemistry and molecular geometry based on the program PLATON (Spek, 2003).
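A test of type (ii) can be illustrated with a minimal sketch. It assumes that the cell parameters and the reported cell volume have already been read from the CIF into a dictionary keyed by the core CIF data names; the function names and the tolerance are illustrative assumptions and are not taken from checkcif itself.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume from cell lengths and cell angles (in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    # Standard triclinic expression; it reduces to a*b*c when all angles are 90.
    return a * b * c * math.sqrt(
        1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg
    )


def volume_is_consistent(items, rel_tol=1e-3):
    """Compare the reported _cell_volume with the value recomputed from
    the cell parameters.

    rel_tol is an illustrative tolerance, NOT the threshold used by checkcif.
    Returns (consistent, recomputed_volume).
    """
    calc = cell_volume(
        items["_cell_length_a"],
        items["_cell_length_b"],
        items["_cell_length_c"],
        items["_cell_angle_alpha"],
        items["_cell_angle_beta"],
        items["_cell_angle_gamma"],
    )
    return math.isclose(calc, items["_cell_volume"], rel_tol=rel_tol), calc


# Hypothetical orthorhombic example: the reported volume matches the parameters.
example = {
    "_cell_length_a": 10.0,
    "_cell_length_b": 12.0,
    "_cell_length_c": 15.0,
    "_cell_angle_alpha": 90.0,
    "_cell_angle_beta": 90.0,
    "_cell_angle_gamma": 90.0,
    "_cell_volume": 1800.0,
}
consistent, recomputed = volume_is_consistent(example)
```

The same pattern, recomputing one data item from the items on which it depends and comparing it with the reported value, applies to other mutually dependent quantities in a CIF.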

The checks carried out at the time of publication (2005) are listed in Appendix 5.7.2 and on the CD-ROM accompanying this volume. The current list is available from http://journals.iucr.org/services/cif/datavalidation.html.

Although the results from checkcif provide valuable indications of possible inconsistencies or data errors, an article for publication is not accepted or rejected on the basis of the checkcif report alone. The report is always read by a reviewer as part of a considered critical appraisal of the article.

Sometimes, particular data values are so far from the expected values that some response is required from the author to explain them. The unusual values may be a consequence of poor experimental conditions that the author was unable to improve, or of poor crystal quality; they may indicate an uncertainty in part of the structure determination that the author considers acceptable, particularly if the purpose of the study is to concentrate on a different part of the structure; or they may genuinely indicate novel chemical features. Whatever the case, anomalous values usually need to be discussed by the author and the reviewer or editor, and often need to be commented on in the article. For Acta Cryst. C and E, checkcif generates in CIF format a list of the tests that have highlighted unusual values in the author's CIF (called 'A alerts'), together with a text field for each of these tests in which the author may justify or discuss the apparently anomalous results (see Fig. 5.7.2.3). Together these comprise a 'validation reply form'. The author can complete this form and paste it into the final version of the CIF submitted for publication. The editor handling the paper can then read the comments in the validation reply form and decide whether to accept the paper for publication. The submission system will automatically return to the author any CIF that generates an A alert but does not contain a completed validation reply form.
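The rule applied by the submission system can be sketched as follows. This is a simplified illustration only: it assumes that the A alerts and the author's replies have already been extracted from the checkcif report and the CIF, and it does not reproduce the actual layout of the validation reply form. The test identifiers and function names are hypothetical.

```python
def unanswered_a_alerts(a_alerts, replies):
    """Return the A-alert identifiers that have no non-empty author reply.

    a_alerts: list of test identifiers that raised an A alert.
    replies:  dict mapping test identifier -> reply text from the author.
    """
    return [test for test in a_alerts if not replies.get(test, "").strip()]


def should_return_to_author(a_alerts, replies):
    """True if any A alert lacks a completed reply."""
    return bool(unanswered_a_alerts(a_alerts, replies))


# Hypothetical identifiers; real checkcif test names are not reproduced here.
alerts = ["TEST_A1", "TEST_A2"]
replies = {
    "TEST_A1": "Crystal quality was poor; no better sample was available.",
    "TEST_A2": "   ",  # a blank reply counts as not completed
}
missing = unanswered_a_alerts(alerts, replies)
```

A CIF that raises no A alerts passes this gate trivially, since there is then nothing for the author to answer.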

Figure 5.7.2.3. Extracts from a checkcif report for a 'publication check' on a CIF to be submitted to an IUCr journal. (a) Alerts of various levels of severity are listed. (b) The journal policy on the handling of alerts is summarized and a validation reply form listing the A alerts is supplied for the author to fill in.

Every article published in Acta Cryst. E has as part of its supplementary material a summary of the checkcif report for the structure described in it. This summary includes any validation reply that the author has supplied. It also includes selected numerical data items identified by the journal editors as characterizing the overall quality and completeness of the structure determination.

The characterization of the 'quality' of a structure is a contentious issue. For journals, where there is active selection of articles for publication, it can be difficult to assign criteria for assessing the quality of the structure determination without these being seen as judging the quality or worth of the scientific work giving rise to the result. Thus journals rely upon the experience and discernment of referees to identify structures 'worth' publishing. However, in a comprehensive collection of structural data sets, such as in a public structural database, it might be possible to identify particular data items that could be used for weighting individual data sets when the database is being 'mined' for particular patterns or characteristic values. It will be interesting to see whether a consensus emerges on what items would be suitable. It is clear that reliance on a single indicator will not be appropriate for sophisticated studies. The old idea that a structure could be classed as 'good' or 'bad' on the basis of its final residual R factor alone has long been abandoned, but it may be possible to stipulate criteria for a set of interrelated data items and use these to filter specific information from a database.
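As a purely illustrative sketch of filtering on a set of data items rather than on a single indicator, the fragment below applies several criteria jointly to database entries held as dictionaries. The data names are from the core CIF dictionary, but the choice of items and every threshold are arbitrary placeholders, not recommended or agreed values.

```python
# Hypothetical criteria: each CIF data name maps to a predicate on its value.
# The thresholds are placeholders chosen only to make the example concrete.
CRITERIA = {
    "_refine_ls_R_factor_gt": lambda v: v <= 0.05,
    "_refine_ls_goodness_of_fit_ref": lambda v: 0.8 <= v <= 1.2,
    "_diffrn_measured_fraction_theta_full": lambda v: v >= 0.98,
}


def passes_all(entry, criteria=CRITERIA):
    """True only if every item is present and satisfies its predicate."""
    return all(
        name in entry and test(entry[name]) for name, test in criteria.items()
    )


def filter_entries(entries, criteria=CRITERIA):
    """Keep the entries that satisfy all of the criteria jointly."""
    return [entry for entry in entries if passes_all(entry, criteria)]


# Two hypothetical entries: the second has a low R factor but incomplete data,
# so it would pass a filter on R alone and fail the combined one.
entries = [
    {
        "_refine_ls_R_factor_gt": 0.041,
        "_refine_ls_goodness_of_fit_ref": 1.03,
        "_diffrn_measured_fraction_theta_full": 0.995,
    },
    {
        "_refine_ls_R_factor_gt": 0.032,
        "_refine_ls_goodness_of_fit_ref": 1.01,
        "_diffrn_measured_fraction_theta_full": 0.81,
    },
]
selected = filter_entries(entries)
```

Entries that omit one of the items are rejected here; a real study might instead treat missing values separately.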

References

Spek, A. L. (2003). Single-crystal structure validation with the program PLATON. J. Appl. Cryst. 36, 7–13.