International Tables for Crystallography
Volume F
Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F, ch. 25.2, pp. 712-713

Section Symbolic target function

A. T. Brunger, P. D. Adams, W. L. DeLano, P. Gros, R. W. Grosse-Kunstleve, J.-S. Jiang, N. S. Pannu, R. J. Read, L. M. Rice and T. Simonson


One of the key innovative features of CNS is the ability to symbolically define target functions and their first derivatives for crystallographic searches and refinement. This makes it convenient to implement new crystallographic methodologies as they are being developed.

The power of symbolic target functions is illustrated by two examples. In the first example, a target function is defined for simultaneous heavy-atom parameter refinement of three derivatives. The sites for each of the three derivatives can be disjoint or identical, depending on the particular situation. For simplicity, the Blow & Crick (1959) approach is used, although maximum-likelihood targets are also possible (see below). The heavy-atom sites are refined against the target [\sum\limits_{hkl} \left[{(|{\bf F}_{h_{1}} + {\bf F}_{p}| - |{\bf F}_{ph_{1}}|)^{2} \over 2 v_{1}} + {(|{\bf F}_{h_{2}} + {\bf F}_{p}| - |{\bf F}_{ph_{2}}|)^{2} \over 2 v_{2}} + {(|{\bf F}_{h_{3}} + {\bf F}_{p}| - |{\bf F}_{ph_{3}}|)^{2} \over 2 v_{3}}\right].]

[{\bf F}_{h_{1}}], [{\bf F}_{h_{2}}] and [{\bf F}_{h_{3}}] are complex structure factors corresponding to the three sets of heavy-atom sites, [{\bf F}_{p}] represents the structure factors of the native crystal, [|{\bf F}_{ph_{1}}|], [|{\bf F}_{ph_{2}}|] and [|{\bf F}_{ph_{3}}|] are the structure-factor amplitudes of the derivatives, and [v_{1}], [v_{2}] and [v_{3}] are the variances of the three lack-of-closure expressions. The corresponding target expression and its first derivatives with respect to the calculated structure factors are shown in panel (a) of the figure. The derivatives of the target function with respect to each of the three associated structure-factor arrays are specified with the `dtarget' expressions. The `tselection' statement specifies the selected subset of reflections to be used in the target function (e.g. excluding outliers), and the `cvselection' statement specifies a subset of reflections to be used for cross-validation (Brünger, 1992b), i.e. a subset that is not used during refinement but only as a monitor for the progress of refinement.
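The lack-of-closure target and its derivative can be illustrated outside CNS with a minimal Python sketch. This is not CNS code: the function names, the list-based data layout and the boolean `selected` mask (standing in for `tselection') are assumptions made for illustration, and the gradient with respect to a complex structure factor F = A + iB is assumed to be packed as dT/dA + i dT/dB.

```python
def lack_of_closure_target(f_h, f_p, f_ph, v, selected):
    """Sum over derivatives j and selected reflections of
    (|F_hj + F_p| - |F_phj|)^2 / (2 v_j).

    f_h      : one list of complex heavy-atom structure factors per derivative
    f_p      : complex native structure factors, one per reflection
    f_ph     : one list of observed derivative amplitudes per derivative
    v        : lack-of-closure variance, one per derivative
    selected : booleans marking reflections used in the target (hypothetical
               stand-in for the CNS `tselection' statement)
    """
    total = 0.0
    for fh_j, fph_j, v_j in zip(f_h, f_ph, v):
        for fh, fp, fph, use in zip(fh_j, f_p, fph_j, selected):
            if use:
                total += (abs(fh + fp) - fph) ** 2 / (2.0 * v_j)
    return total


def lack_of_closure_dtarget(fh, fp, fph, v):
    """Gradient of one term (|F_h + F_p| - |F_ph|)^2 / (2 v) with respect to
    F_h = A + iB, packed as dT/dA + i dT/dB (an assumed convention)."""
    f_sum = fh + fp
    amplitude = abs(f_sum)
    if amplitude == 0.0:
        # The amplitude is not differentiable at zero; return a zero gradient.
        return 0j
    return (amplitude - fph) / v * f_sum / amplitude


# Illustrative call with made-up numbers: three derivatives, two reflections.
f_p = [3.0 - 1.0j, 0.5 + 2.0j]
f_h = [[1.0 + 2.0j, 0.2 - 0.1j], [0.4 + 0.3j, -1.0 + 0.5j], [0.0 + 1.0j, 0.7 + 0.7j]]
f_ph = [[4.5, 2.1], [3.6, 2.4], [3.2, 3.0]]
value = lack_of_closure_target(f_h, f_p, f_ph, [2.0, 1.5, 1.0], [True, True])
```

Because each derivative contributes an independent term, the gradient with respect to each heavy-atom structure-factor array involves only its own term, which is why one `dtarget' expression per array suffices.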


Figure

Examples of symbolic definition of a refinement target function and its derivatives with respect to the calculated structure-factor arrays. (a) Simultaneous refinement of heavy-atom sites of three derivatives. The target function is defined by the `target' expression. `f_h_1', `f_h_2' and `f_h_3' (in bold) are complex structure factors corresponding to three sets of heavy atoms that are specified using atom selections (see the first target equation). The target function and its derivatives with respect to the three structure-factor arrays are defined symbolically using the structure-factor amplitudes of the native crystal, `f_p', those of the derivatives, `f_ph_1', `f_ph_2', `f_ph_3', the complex structure factors of the heavy-atom models, `f_h_1', `f_h_2', `f_h_3', and the corresponding lack-of-closure variances, `v_1', `v_2' and `v_3'. The summation over the selected structure factors (`tselection') is performed implicitly. (b) Refinement of two independent models against perfectly twinned data. `fcalc1' and `fcalc2' are complex structure factors for the models that are related by a twinning operation (in bold). The target function and its derivatives with respect to the two structure-factor arrays are explicitly defined.

The second example is the refinement of a perfectly twinned crystal with overlapping reflections from two independent crystal lattices. Refinement of the model is carried out against the residual [\sum\limits_{hkl} |{\bf F}_{\rm obs}| - (|{\bf F}_{\rm calc1}|^{2} + |{\bf F}_{\rm calc2}|^{2})^{1/2}.] The symbolic definition of this target is shown in panel (b) of the figure. The twinning operation itself is imposed as a relationship between the two sets of selected atoms (not shown). This example assumes that the two calculated structure-factor arrays (`fcalc1' and `fcalc2') that correspond to the two lattices have been appropriately scaled with respect to the observed structure factors, and that the twinning fractions have been incorporated into the scale factors. However, a more sophisticated target function could be defined which incorporates scaling.
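The twinning residual can likewise be sketched in Python to make the roles of the two structure-factor arrays concrete. This is an illustration rather than the CNS definition in the figure: the function names and the `selected` mask are hypothetical, the residual is coded exactly as written in the text (already scaled arrays, no explicit twinning fraction), and gradients are again assumed to be packed as d/dA + i d/dB.

```python
import math


def twinned_amplitude(fcalc1, fcalc2):
    """Combined amplitude (|F_calc1|^2 + |F_calc2|^2)^(1/2) of two overlapping
    reflections from the two lattices."""
    return math.sqrt(abs(fcalc1) ** 2 + abs(fcalc2) ** 2)


def twin_residual(fobs, fcalc1, fcalc2, selected):
    """Sum over selected reflections of
    |F_obs| - (|F_calc1|^2 + |F_calc2|^2)^(1/2), as written in the text."""
    return sum(
        fo - twinned_amplitude(f1, f2)
        for fo, f1, f2, use in zip(fobs, fcalc1, fcalc2, selected)
        if use
    )


def twin_residual_gradients(fcalc1, fcalc2):
    """Gradients of one residual term with respect to F_calc1 and F_calc2,
    each packed as d/dA + i d/dB (an assumed convention)."""
    amplitude = twinned_amplitude(fcalc1, fcalc2)
    if amplitude == 0.0:
        # Not differentiable at zero amplitude; return zero gradients.
        return 0j, 0j
    return -fcalc1 / amplitude, -fcalc2 / amplitude
```

Both gradients share the same denominator, the combined amplitude, so the two models are coupled through every overlapping reflection even though their atoms are refined as separate selections.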

A major advantage of the symbolic definition of the target function and its derivatives is that any arbitrary function of structure-factor arrays can be used. This means that the scope of possible targets is not limited to least-squares targets. Symbolic definition of numerical integration over unknown variables (such as phase angles) is also possible. Thus, even complicated maximum-likelihood target functions (Bricogne, 1984; Otwinowski, 1991; Pannu & Read, 1996a; Pannu et al., 1998) can be defined using the CNS language. This is particularly valuable at the prototype stage. For greater efficiency, the standard maximum-likelihood targets are provided through CNS source code which can be accessed as functions in the CNS language. For example, the maximum-likelihood target function MLF (Pannu & Read, 1996a) and its derivative with respect to the calculated structure factors are defined as [\eqalign{\hbox{\tt target} &= \hbox{\tt (mlf(fobs, sigma, (fcalc + fbulk), d, sigma\_delta))} \cr \hbox{\tt dtarget} &= \hbox{\tt (dmlf(fobs, sigma, (fcalc + fbulk), d, sigma\_delta))}}] where `mlf( )' and `dmlf( )' refer to internal maximum-likelihood functions, `fobs' and `sigma' are the observed structure-factor amplitudes and corresponding σ values, `fcalc' is the (complex) calculated structure-factor array, `fbulk' is the structure-factor array for a bulk solvent model, and `d' and `sigma_delta' are the cross-validated D and [\sigma_{\Delta}] functions (Read, 1990; Kleywegt & Brünger, 1996; Read, 1997), which are precomputed from the test set of reflections before the MLF target function is invoked. The availability of internal Fortran subroutines for the most computing-intensive target functions and the symbolic definitions involving structure-factor arrays allow for maximal flexibility and efficiency.
Other examples of available maximum-likelihood target functions include MLI (intensity-based maximum-likelihood refinement), MLHL [crystallographic model refinement with prior phase information (Pannu et al., 1998)], and maximum-likelihood heavy-atom parameter refinement for multiple isomorphous replacement (Otwinowski, 1991) and MAD phasing (Hendrickson, 1991; Burling et al., 1996). Work is in progress to define target functions that include correlations between different heavy-atom derivatives (Read, 1994).
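The remark about numerical integration over unknown phase angles can be illustrated with a small Python sketch. It is not the internal `mlf( )' function: it assumes the textbook acentric case with expected intensity factor 1 and no measurement error, in which the complex observed structure factor follows a two-dimensional Gaussian centred on D F_calc with variance parameter σ_Δ². Integrating that Gaussian over the unknown phase should reproduce the closed-form amplitude distribution containing the modified Bessel function I0. All function names and the parameter `sigma_delta_sq` are hypothetical.

```python
import cmath
import math


def bessel_i0(x, terms=60):
    """Modified Bessel function I0(x) from its power series
    sum_k (x/2)^(2k) / (k!)^2 (adequate for moderate x)."""
    total = 0.0
    term = 1.0
    for k in range(terms):
        total += term
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)
    return total


def rice_closed_form(fobs, fcalc, d, sigma_delta_sq):
    """Closed-form acentric probability density of the amplitude |F_obs|
    given the amplitude |F_calc| (assumed: epsilon = 1, no measurement error)."""
    s = sigma_delta_sq
    return (
        (2.0 * fobs / s)
        * math.exp(-(fobs ** 2 + (d * fcalc) ** 2) / s)
        * bessel_i0(2.0 * d * fobs * fcalc / s)
    )


def rice_by_phase_integration(fobs, fcalc, d, sigma_delta_sq, steps=720):
    """Same density obtained by midpoint-rule integration of the 2D Gaussian
    over the unknown phase of F_obs; the factor fobs is the polar Jacobian."""
    s = sigma_delta_sq
    dphi = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        f_complex = cmath.rect(fobs, (i + 0.5) * dphi)
        density = math.exp(-abs(f_complex - d * fcalc) ** 2 / s) / (math.pi * s)
        total += fobs * density * dphi
    return total
```

The two routes agree to numerical precision, which is the sense in which a target involving an integral over phases can first be prototyped by direct integration and later replaced by a faster special-purpose routine.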


References

Blow, D. M. & Crick, F. H. C. (1959). The treatment of errors in the isomorphous replacement method. Acta Cryst. 12, 794–802.
Bricogne, G. (1984). Maximum entropy and the foundations of direct methods. Acta Cryst. A40, 410–445.
Brünger, A. T. (1992b). Free R value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature (London), 355, 472–475.
Burling, F. T., Weis, W. I., Flaherty, K. M. & Brünger, A. T. (1996). Direct observation of protein solvation and discrete disorder with experimental crystallographic phases. Science, 271, 72–77.
Hendrickson, W. A. (1991). Determination of macromolecular structures from anomalous diffraction of synchrotron radiation. Science, 254, 51–58.
Kleywegt, G. J. & Brünger, A. T. (1996). Checking your imagination: applications of the free R value. Structure, 4, 897–904.
Otwinowski, Z. (1991). In Proceedings of the CCP4 study weekend. Isomorphous replacement and anomalous scattering, edited by W. Wolf, P. R. Evans & A. G. W. Leslie, pp. 80–86. Warrington: Daresbury Laboratory.
Pannu, N. S., Murshudov, G. N., Dodson, E. J. & Read, R. J. (1998). Incorporation of prior phase information strengthens maximum-likelihood structure refinement. Acta Cryst. D54, 1285–1294.
Pannu, N. S. & Read, R. J. (1996a). Improved structure refinement through maximum likelihood. Acta Cryst. A52, 659–668.
Read, R. J. (1990). Structure-factor probabilities for related structures. Acta Cryst. A46, 900–912.
Read, R. J. (1994). Maximum likelihood refinement of heavy atoms. Lecture notes for a workshop on isomorphous replacement methods in macromolecular crystallography. American Crystallographic Association Annual Meeting, 1994, Atlanta, GA, USA.
Read, R. J. (1997). Model phases: probabilities and bias. Methods Enzymol. 277, 110–128.
