International Tables for Crystallography
Volume F
Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F, ch. 18.3, pp. 383-384

Treatment of outliers

R. A. Engh (a)* and R. Huber (b)

(a) Pharmaceutical Research, Roche Diagnostics GmbH, Max Planck Institut für Biochemie, 82152 Martinsried, Germany, and (b) Max-Planck-Institut für Biochemie, 82152 Martinsried, Germany

A truly Gaussian distribution should include outliers at high σ values (about 0.01% for 4σ). We should expect, however, that the width of the distribution is affected not only by inherent variation in the variables to be parameterized, but also by variability in the experimental conditions (e.g. resolution) and by erroneous structures. This weakens a strategy of automatic rejection of outliers beyond a specific cutoff value. The possibility of visualizing the distributions with CSD software allows this rejection strategy to be refined, although at the cost of introducing considerable subjectivity into the criteria. For this work, a 4σ cutoff was generally considered a flag for erroneous outliers. However, broad and flat tails in the distribution were relatively frequent and often asymmetric. These deviations from Gaussian behaviour 'artificially' increased σ values. In these cases, the 4σ cutoff rule was not applied automatically, but was applied after examination and rejection of conspicuous outliers. From an algorithmic viewpoint, this amounted to the additional use of skew and kurtosis (third and fourth moments of the distribution) as rejection criteria. In most cases, uncertainty in rejection criteria affected the average values little, but could significantly alter standard deviations.
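
The effect described above can be illustrated with a minimal sketch. It is not the procedure used for this work, which relied on visual examination of the distributions. Here a simple repeated 4σ rejection stands in for that manual step, and skewness and excess kurtosis are computed as diagnostics. The function names, the iteration limit and the bond-length values are all hypothetical and chosen only for illustration.

```python
import math
import statistics


def gaussian_tail_fraction(k):
    """Two-sided fraction of a Gaussian population lying beyond k sigma."""
    return math.erfc(k / math.sqrt(2.0))


def moments(values):
    """Return mean, sample standard deviation, skewness and excess kurtosis.

    Assumes the values are not all identical (non-zero variance).
    """
    n = len(values)
    mean = statistics.fmean(values)
    sigma = statistics.stdev(values)
    # Central moments about the mean (population form, divided by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # zero for a Gaussian
    return mean, sigma, skew, excess_kurtosis


def reject_outliers(values, cutoff=4.0, max_rounds=10):
    """Repeatedly drop values more than `cutoff` sigma from the mean.

    Assumption: re-estimating sigma after each round is used here as an
    automated stand-in for manual rejection of conspicuous outliers.
    """
    kept = list(values)
    rejected = []
    for _ in range(max_rounds):
        mean, sigma, _, _ = moments(kept)
        inside = [v for v in kept if abs(v - mean) <= cutoff * sigma]
        if len(inside) == len(kept):
            break  # nothing left beyond the cutoff
        rejected.extend(v for v in kept if abs(v - mean) > cutoff * sigma)
        kept = inside
    return kept, rejected


# Hypothetical bond lengths in angstroms: a tight cluster plus one suspect entry.
lengths = [1.520, 1.522, 1.524, 1.525, 1.526, 1.528, 1.530] * 4 + [1.700]
kept, rejected = reject_outliers(lengths)

print("fraction beyond 4 sigma:", gaussian_tail_fraction(4.0))
print("before (mean, sigma, skew, excess kurtosis):", moments(lengths))
print("after  (mean, sigma, skew, excess kurtosis):", moments(kept))
print("rejected:", rejected)
```

In this invented data set the single suspect value shifts the mean only slightly but inflates σ roughly tenfold and produces a large positive skew. Once it is removed, the remaining values are symmetric about their mean. This is the pattern noted in the text: rejection criteria affect averages little but can change standard deviations considerably.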
