Tasks performed by SFCHECK

Wodak, S. J.; Vagin, A. A.; Richelle, J.; Das, U.; Pontius, J.; Berman, H. M.

doi:10.1107/97809553602060000708

International Tables for Crystallography
Volume F: Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F. ch. 21.2, pp. 510–511

Section 21.2.3.1.1. Tasks performed by SFCHECK

S. J. Wodak,^a ^* A. A. Vagin,^b J. Richelle,^b U. Das,^b J. Pontius^b and H. M. Berman^c

^aUnité de Conformation de Macromolécules Biologiques, Université Libre de Bruxelles, avenue F. D. Roosevelt 50, CP160/16, B-1050 Bruxelles, Belgium, and EMBL–EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, England, ^bUnité de Conformation de Macromolécules Biologiques, Université Libre de Bruxelles, avenue F. D. Roosevelt 50, CP160/16, B-1050 Bruxelles, Belgium, and ^cDepartment of Chemistry, Rutgers University, 610 Taylor Road, Piscataway, NJ 08854-8087, USA
Correspondence e-mail: shosh@ucmb.ulb.ac.be

21.2.3.1.1.1. Treatment of structure-factor data and scaling

SFCHECK reads the structure-factor data in mmCIF format and then performs the following operations. Reflections are excluded if they are systematically absent, are negative or have flagged σ values (99.9). Equivalent reflections are merged. The amplitudes of missing reflections are approximated by the average value for the corresponding resolution shell.
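As an illustration only, the operations listed above can be sketched in Python as follows. This is not SFCHECK's code: the function names, the (hkl, F, σ) tuple layout and the callables `is_absent`, `to_unique` and `shell_of` are assumptions introduced for the example, and merging equivalents by a simple unweighted mean is likewise an assumption.

```python
from collections import defaultdict
from statistics import mean

FLAGGED_SIGMA = 99.9  # sigma value that marks a reflection as flagged


def clean_and_merge(reflections, is_absent, to_unique):
    """Drop unusable reflections, then average symmetry equivalents.

    reflections: iterable of (hkl, F, sigma) tuples (assumed layout).
    is_absent:   hypothetical callable, True for systematic absences.
    to_unique:   hypothetical callable mapping hkl to its unique index.
    """
    groups = defaultdict(list)
    for hkl, f, sigma in reflections:
        if is_absent(hkl) or f < 0 or sigma == FLAGGED_SIGMA:
            continue  # excluded reflection
        groups[to_unique(hkl)].append(f)
    # Assumption: equivalents are merged by an unweighted mean.
    return {hkl: mean(values) for hkl, values in groups.items()}


def fill_missing(merged, expected, shell_of):
    """Give each missing reflection the mean amplitude of its resolution shell.

    expected: iterable of all unique hkl expected for the data set.
    shell_of: hypothetical callable mapping hkl to a resolution-shell id.
    """
    shells = defaultdict(list)
    for hkl, f in merged.items():
        shells[shell_of(hkl)].append(f)
    filled = dict(merged)
    for hkl in expected:
        shell = shells.get(shell_of(hkl))
        if hkl not in filled and shell:
            filled[hkl] = mean(shell)
    return filled
```

A reflection filled in this way carries no experimental information, which is why such reflections are kept out of the R-factor calculation described later in this section.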

From the model coordinates read from the PDB (or mmCIF) atomic coordinates file, SFCHECK calculates structure factors and scales them to the observed structure factors. The scaling factor, S, is computed using a smooth cutoff for low-resolution data (Vaguine et al., 1999; Table 21.2.3.1). This requires the observed and calculated overall B factors, which are obtained from the standard deviations of the Gaussians fitted to the respective Patterson origin peaks [see Table 21.2.3.1 and Vaguine et al. (1999)]. SFCHECK also estimates the overall anisotropy of the data, following the approach of Sheriff & Hendrickson (1987), and applies the anisotropic scaling after the Patterson scaling has been performed (Murshudov et al., 1998).

Table 21.2.3.1. Parameters computed for the analysis of the structure-factor data

The first column lists the parameter, the second column gives the formula or definition of the parameter and the third column contains a short description of the meaning of the parameters when warranted.

Parameter	Formula/definition	Meaning
Completeness (%)	Percentage of the expected number of reflections for the given crystal space group and resolution
B_overall (Patterson)	$[8\pi^{2} \sigma_{\rm Patt}/(2)^{1/2}]$ ^†	Overall B factor
R_stand(F)	$[\langle \sigma (F)\rangle/\langle F \rangle]$ ^‡	Uncertainty of the structure-factor amplitudes
Optical resolution	$[(\sigma_{\rm Patt}^{2} + \sigma_{\rm sph}^{2})^{1/2}]$ ^† ^§	Expected minimum distance between two resolved atomic peaks
Expected optical resolution	Optical resolution computed considering all reflections
$[\hbox{CC}_{F}]$	$[\displaystyle{\langle F_{\rm obs} F_{\rm calc}\rangle - \langle F_{\rm obs}\rangle\langle F_{\rm calc}\rangle \over \left[(\langle F_{\rm obs}^{2} \rangle - \langle F_{\rm obs}\rangle^{2}) (\langle F_{\rm calc}^{2}\rangle - \langle F_{\rm calc}\rangle^{2})\right]^{1/2}}]$	Correlation coefficient between the observed and calculated structure-factor amplitudes
S	$[\left\{{\textstyle\sum\displaystyle (F_{\rm obs} f_{\rm cutoff})^{2} \over \textstyle\sum\displaystyle \left[F_{\rm calc} \exp (- B_{\rm diff}^{\rm overall} s^{2}) f_{\rm cutoff}\right]^{2}}\right\}^{1/2}]$ ^¶	Factor applied to scale $[F_{\rm calc}]$ to $[F_{\rm obs}]$
$[f_{\rm cutoff}]$	$[1 - \exp (- B_{\rm off} s^{2})]$ ^††	Function applied to obtain a smooth cutoff for low-resolution data

^† $[\sigma_{\rm Patt}]$ is the standard deviation of the Gaussian fitted to the Patterson origin peak.
^‡F is the structure-factor amplitude, and $[\sigma({F})]$ is the structure-factor standard deviation. The brackets denote averages.
^§ $[\sigma _{\rm sph}]$ is the standard deviation of the spherical interference function, which is the Fourier transform of a sphere of radius $[1/d_{\min}]$ , with $[d_{\rm min}]$ being the minimum d spacing.
^¶ $[B_{\rm diff}^{\rm overall} = B_{\rm obs}^{\rm overall} - B_{\rm calc}^{\rm overall}]$ is added to the calculated overall B factor, $[B_{\rm overall}]$, so as to make the width of the calculated Patterson origin peak equal to the observed one; s is the magnitude of the reciprocal-lattice vector.
^†† $[B_{\rm off} = 4 d_{\rm max}^{2}]$ , where s and $[d_{\rm max}]$ , respectively, are the magnitude of the reciprocal-lattice vector and the maximum d spacing.
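The expressions for S and $[f_{\rm cutoff}]$ in Table 21.2.3.1 can be transcribed literally, as in the sketch below. It assumes that the caller supplies s in the same convention as the table and that $[B_{\rm diff}^{\rm overall}]$ has already been obtained from the Patterson origin peaks; the function and argument names are illustrative and are not taken from SFCHECK.

```python
import math


def f_cutoff(s, d_max):
    """Smooth low-resolution cutoff, 1 - exp(-B_off * s^2) with B_off = 4 * d_max^2."""
    b_off = 4.0 * d_max ** 2
    return 1.0 - math.exp(-b_off * s * s)


def scale_factor(f_obs, f_calc, s_values, b_diff, d_max):
    """Scale factor S from Table 21.2.3.1.

    f_obs, f_calc, s_values: parallel sequences, one entry per reflection.
    b_diff: B_obs(overall) - B_calc(overall), assumed to be known already.
    """
    numerator = 0.0
    denominator = 0.0
    for fo, fc, s in zip(f_obs, f_calc, s_values):
        weight = f_cutoff(s, d_max)
        numerator += (fo * weight) ** 2
        denominator += (fc * math.exp(-b_diff * s * s) * weight) ** 2
    return math.sqrt(numerator / denominator)
```

The cutoff function is zero at s = 0 and approaches 1 once s exceeds roughly $[1/d_{\rm max}]$, so the lowest-resolution reflections are down-weighted rather than discarded.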

To assess the quality of the structure-factor data, the program computes four additional quantities (see Table 21.2.3.1 for details): the completeness of the data, the uncertainty of the structure-factor amplitudes, the optical resolution and the expected optical resolution. The optical resolution and the expected optical resolution represent the expected minimum distance between two resolved atomic peaks in the electron-density map when the map is computed with the set of reflections specified by the authors and with all the reflections, respectively.
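Three of these quantities follow directly from the definitions in Table 21.2.3.1 and can be written as short functions. The sketch below is illustrative: the function names are hypothetical, and $[\sigma_{\rm Patt}]$ and $[\sigma_{\rm sph}]$ are assumed to have been determined beforehand, since the Gaussian fit to the Patterson origin peak is not reproduced here.

```python
import math
from statistics import mean


def completeness_percent(n_measured, n_expected):
    """Percentage of the expected reflections that are present."""
    return 100.0 * n_measured / n_expected


def r_stand(amplitudes, sigmas):
    """R_stand(F) = <sigma(F)> / <F>; the brackets denote averages."""
    return mean(sigmas) / mean(amplitudes)


def optical_resolution(sigma_patt, sigma_sph):
    """(sigma_Patt^2 + sigma_sph^2)^(1/2), as given in Table 21.2.3.1."""
    return math.sqrt(sigma_patt ** 2 + sigma_sph ** 2)
```

The expected optical resolution uses the same expression, with $[\sigma_{\rm Patt}]$ obtained from the full set of reflections.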

21.2.3.1.1.2. Global agreement between the model and experimental data

To evaluate the global agreement between the atomic model and the experimental data, the program computes three classical quality indicators: the R factor, $[R_{\rm free}]$ (Brünger, 1992b) and the correlation coefficient $[\hbox{CC}_{F}]$ between the calculated and observed structure-factor amplitudes (Table 21.2.3.1). The R factor is computed using all the reflections considered (except those approximated by their average value in the corresponding resolution shell) and applying the same resolution and σ cutoff as those reported by the authors. $[R_{\rm free}]$ is computed using the subset of reflections specified by the authors. In addition, the R factor is evaluated using the 'non-free' subset of reflections (those not used to compute $[R_{\rm free}]$). The correlation coefficient is computed using all reflections from the reported high-resolution limit, applying the smooth low-resolution cutoff (see Table 21.2.3.1) but no σ cutoff.
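A minimal sketch of these three indicators is given below. It assumes that the calculated amplitudes have already been scaled to the observed ones, uses the conventional definition R = Σ|F_obs − F_calc| / ΣF_obs, and omits the resolution, σ and smooth low-resolution cutoffs described above for brevity. The function names and the boolean free-flag list are illustrative choices, and both the free and non-free subsets are assumed to be non-empty.

```python
import math
from statistics import mean


def r_factor(pairs):
    """Conventional R factor over (F_obs, F_calc) pairs."""
    return sum(abs(fo - fc) for fo, fc in pairs) / sum(fo for fo, _ in pairs)


def r_work_and_free(f_obs, f_calc, free_flags):
    """R factor for the non-free subset and R_free for the flagged subset."""
    rows = list(zip(f_obs, f_calc, free_flags))
    work = [(fo, fc) for fo, fc, flag in rows if not flag]
    free = [(fo, fc) for fo, fc, flag in rows if flag]
    return r_factor(work), r_factor(free)


def cc_f(f_obs, f_calc):
    """Correlation coefficient CC_F as defined in Table 21.2.3.1."""
    mean_obs, mean_calc = mean(f_obs), mean(f_calc)
    covariance = mean([fo * fc for fo, fc in zip(f_obs, f_calc)]) - mean_obs * mean_calc
    var_obs = mean([fo * fo for fo in f_obs]) - mean_obs ** 2
    var_calc = mean([fc * fc for fc in f_calc]) - mean_calc ** 2
    return covariance / math.sqrt(var_obs * var_calc)
```

Unlike the R factor, $[\hbox{CC}_{F}]$ is unchanged when the calculated amplitudes are multiplied by a constant, so it does not depend on the overall scale factor.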

21.2.3.1.1.3. Estimations of errors in atomic positions

The errors associated with the atomic positions are expressed as standard deviations (σ) of these positions. SFCHECK computes three different error measures. The first is the original error measure of Cruickshank (1949). The second is a modified version of this measure, in which the difference between the observed and calculated structure factors is replaced by the error in the experimental structure factors. These two measures are the expected maximal and minimal errors, respectively. The third measure is the diffraction-component precision indicator (DPI). The mathematical expressions for these error measures are given in Table 21.2.3.2, and further details can be found in Vaguine et al. (1999).

Table 21.2.3.2. Estimation of errors in atomic coordinates

Parameter	Formula/definition	Meaning
$[\sigma(x)]$	$[\displaystyle{\sigma\hbox{(slope)} \over \hbox{curvature}}]$ ^†	Standard deviation of the atomic coordinates following Cruickshank (1949) for the minimal and maximal errors (Vaguine et al., 1999)
σ(slope) for maximal error	$[\displaystyle{2\pi \left\{\textstyle\sum\displaystyle \left[h^{2} (F_{\rm obs} - F_{\rm calc})^{2}\right]\right\}^{1/2} \over V_{\rm unit \ cell} a}]$ ^‡	Expression for σ(slope) in the expected maximal error following Cruickshank (1949)
Curvature	$[\displaystyle{2\pi \textstyle\sum\displaystyle (h^{2} F_{\rm obs}) \over V_{\rm unit \ cell} a^{2}}]$	Expression for the curvature following Murshudov et al. (1997)
σ(slope) for minimal error	$[\displaystyle{2\pi \left\{\textstyle\sum\displaystyle \left[h^{2} \sigma (F_{\rm obs})^{2}\right]\right\}^{1/2} \over V_{\rm unit \ cell} a}]$ ^§	Expression for σ(slope) in the expected minimal error, following Cruickshank (1949)
DPI	$[\displaystyle{\sigma (x) = \left({N_{\rm atoms} \over N_{\rm obs} - 4 N_{\rm atoms}}\right)^{1/2} c^{-1/3} d_{\min} R}]$ ^¶	Atomic coordinate error estimate following Cruickshank (1996)

^†σ(slope) and curvature are the slope and curvature of the electron-density map at the atomic centre, in the x direction, for spherically symmetric peaks; $[\sigma (x)\simeq \sigma(y)\simeq \sigma(z)]$ .
^‡a is the crystal unit-cell length, h is the Miller index and $[V_{\rm unit \ cell}]$ is the unit-cell volume.
^§ $[\sigma(F_{\rm obs})]$ is the standard deviation of the structure-factor amplitude.
^¶c is the structure-factor data completeness expressed as a fraction (0–1), R is the conventional R factor, $[N_{\rm atoms}]$ is the total number of atoms in the unit cell, $[N_{\rm obs}]$ is the total number of observed reflections and $[d_{\rm min}]$ is the minimum d spacing.
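The DPI expression in Table 21.2.3.2 is simple enough to evaluate directly. The sketch below is a plain transcription of that formula; the function name, the argument names and the check on the denominator are illustrative additions rather than part of SFCHECK.

```python
import math


def dpi(n_atoms, n_obs, completeness, d_min, r_factor):
    """Diffraction-component precision indicator from Table 21.2.3.2.

    completeness is a fraction between 0 and 1, d_min is the minimum
    d spacing and r_factor is the conventional R factor.
    """
    excess = n_obs - 4 * n_atoms  # observations minus four parameters per atom
    if excess <= 0:
        raise ValueError("DPI is undefined when N_obs <= 4 * N_atoms")
    return math.sqrt(n_atoms / excess) * completeness ** (-1.0 / 3.0) * d_min * r_factor
```

For example, with 100 atoms, 500 observed reflections, complete data, $[d_{\rm min}]$ = 2.0 and R = 0.25, the square-root term equals 1 and the estimate is 2.0 × 0.25 = 0.5, in the units of $[d_{\rm min}]$. The expression fails when the number of observations does not exceed four times the number of atoms, which is why the sketch rejects that case.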

21.2.3.1.1.4. Local agreement between the model and the experimental data

In addition to the global structure quality measures, SFCHECK also determines the quality of the model in specific regions. Several quality estimators can be calculated for each residue in the macromolecule and, whenever appropriate, for solvent molecules and groups of atoms in ligand molecules. These estimators are the normalized atomic displacement (Shift), the correlation coefficient between the calculated and observed electron densities (Density correlation), the local electron-density level (Density index), the average B factor (B-factor) and the connectivity index (Connect), which measures the local electron-density level along the molecular backbone. These quantities are computed for individual atoms and averaged over those composing each residue or group of atoms [see Table 21.2.3.3 and Vaguine et al. (1999) for details].

Table 21.2.3.3. Parameters computed by SFCHECK to assess the quality of the model in specific regions

Parameter	Formula/definition	Meaning
Shift	$[(1/N\sigma)\textstyle\sum\limits_{i}^{N}\displaystyle \Delta_{i},\hbox{ with } \Delta_{i} = (\hbox{gradient}_{i}/\hbox{curvature}_{i})]$ ^†	Normalized average atomic displacement computed over a group of atoms or residue; reflects the tendency of the group of atoms to move from their current position
Density correlation	$[\displaystyle{\textstyle\sum\displaystyle \rho_{\rm calc}(x_{i})[2\rho_{\rm obs}(x_{i}) - \rho_{\rm calc}(x_{i})] \over \left(\left[\textstyle\sum\displaystyle \rho_{\rm calc}^{2} (x_{i})\right]\left\{\textstyle\sum\displaystyle \left[2\rho_{\rm obs}(x_{i}) - \rho_{\rm calc}(x_{i})\right]^{2}\right\}\right)^{1/2}}]$ ^‡	Electron density correlation coefficient computed over a group of atoms or residue; reflects the local agreement of the model with the electron density
Density index	$[\left[\textstyle\prod\displaystyle \rho(x_{i})\right]^{1/N}/\langle \rho \rangle_{\rm all \ atoms}]$ ^§	Reflects the level of the electron density for a group of atoms; is a local measure of the density level
Connect		Same as Density index, but considering only backbone atoms.^¶

^† $[\hbox{gradient}_{i}]$ is the gradient of the $[F_{\rm obs} - F_{\rm calc}]$ map with respect to the atomic coordinates, $[\hbox{curvature}_{i}]$ is the curvature of the model map computed at the atomic centre (see Agarwal, 1978), N is the number of atoms in the group considered and σ is the standard deviation of the $[\Delta_{i}]$ values computed in the structure.
^‡ $[\rho_{\rm calc}(x_{i})]$ and $[\rho_{\rm obs}(x_{i})]$ are, respectively, the electron density computed from calculated and observed structure-factor amplitudes at the atomic centre. The summation is performed over all the atoms in the group considered. For polymer residues, the density correlation is computed separately for backbone and side-chain atoms. For the calculation of the electron density at the atomic centre, see Vaguine et al. (1999).
^§ $[[\prod{\rho (x_{i})}]^{1/N}]$ is the geometric mean of the $[2F_{\rm obs} - F_{\rm calc}]$ electron density of the atom subset considered and $[\langle \rho \rangle_{\rm all \ atoms}]$ is the average electron density of the atoms in the structure. For water molecules or ions which are represented by a single atom, the above expression reduces to the ratio $[\rho(x_i)/\langle \rho \rangle_{\rm all \ atoms}]$.
^¶Backbone atoms are N, C, C^α for proteins and P, O5′, C5′, C3′, O3′ for nucleic acids.
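The per-group estimators of Table 21.2.3.3 can be sketched as follows, taking the densities, gradients and curvatures at the atomic centres as given inputs. How SFCHECK obtains those values is described in Vaguine et al. (1999) and is not reproduced here. The function names and the dictionary keyed by PDB atom name are assumptions made for the example, and the density index as written requires positive densities.

```python
import math

PROTEIN_BACKBONE = ("N", "C", "CA")  # CA is the PDB atom name for C-alpha


def shift(gradients, curvatures, sigma):
    """Normalized mean displacement: (1 / (N * sigma)) * sum(gradient_i / curvature_i)."""
    deltas = [g / c for g, c in zip(gradients, curvatures)]
    return sum(deltas) / (len(deltas) * sigma)


def density_correlation(rho_calc, rho_obs):
    """Correlation between rho_calc and (2 * rho_obs - rho_calc) over a group of atoms."""
    combined = [2.0 * ro - rc for rc, ro in zip(rho_calc, rho_obs)]
    numerator = sum(rc * d for rc, d in zip(rho_calc, combined))
    denominator = math.sqrt(sum(rc * rc for rc in rho_calc) * sum(d * d for d in combined))
    return numerator / denominator


def density_index(rho_group, mean_rho_all_atoms):
    """Geometric mean of the group densities divided by the structure-wide mean."""
    geometric_mean = math.prod(rho_group) ** (1.0 / len(rho_group))
    return geometric_mean / mean_rho_all_atoms


def connect(rho_by_atom_name, mean_rho_all_atoms, backbone=PROTEIN_BACKBONE):
    """Density index restricted to the backbone atoms present in the residue."""
    rho = [rho_by_atom_name[name] for name in backbone if name in rho_by_atom_name]
    return density_index(rho, mean_rho_all_atoms)
```

Because the density index is a geometric mean, a single atom lying in weak density lowers the value for the whole group more strongly than it would lower an arithmetic mean, which is what makes the backbone-only version usable as a connectivity measure.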

References

Brünger, A. T. (1992b). Free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature (London), 355, 472–474.

Cruickshank, D. W. J. (1949). The accuracy of electron-density maps in X-ray analysis with special reference to dibenzyl. Acta Cryst. 2, 65–82.

Murshudov, G. N., Davies, G. J., Isupov, M., Krzywda, S. & Dodson, E. J. (1998). The effect of overall anisotropic scaling in macromolecular refinement. Newsletter on protein crystallography, pp. 37–42. Warrington: Daresbury Laboratory.

Sheriff, S. & Hendrickson, W. A. (1987). Description of overall anisotropy in diffraction from macromolecular crystals. Acta Cryst. A43, 118–121.

Vaguine, A. A., Richelle, J. & Wodak, S. J. (1999). SFCHECK: a unified set of procedures for evaluating the quality of macromolecular structure-factor data and their agreement with the atomic model. Acta Cryst. D55, 191–205.
