A standard protocol for NMR structure determination of proteins and nucleic acids

Wüthrich, K.

doi:10.1107/97809553602060000704

International Tables for Crystallography
Volume F: Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold


International Tables for Crystallography (2006). Vol. F. ch. 19.7, pp. 464–466

Section 19.7.2. A standard protocol for NMR structure determination of proteins and nucleic acids

K. Wüthrich^a

^a Institut für Molekularbiologie und Biophysik, Eidgenössische Technische Hochschule-Hönggerberg, CH-8093 Zürich, Switzerland


An NMR structure determination involves sample preparation, NMR measurements, assignment of the NMR lines to individual atoms in the polymer chain, collection of conformational constraints, and structure calculation and refinement. In present practice, the sequence of steps usually corresponds to the flow diagram of Fig. 19.7.2.1. As also indicated in Fig. 19.7.2.1, a special feature of protein structure determination by NMR is that the secondary polypeptide structure, including the connections between individual segments of regular secondary structure, may be known early on from the data used for obtaining the resonance assignments, i.e. before the structure calculation is even started.

Figure 19.7.2.1

Diagram outlining the course of a macromolecular structure determination by NMR in solution.

For the sample preparation, homogeneous macromolecular material is dissolved at about 1 mM concentration in 0.5 ml of water. The ionic strength, pH and temperature, and possibly the concentration of additives, may then be adjusted, for example to ensure near-physiological or denaturing conditions. The NMR study will often include the preparation of compounds enriched with ¹⁵N and/or ¹³C, and possibly with ²H (Kay & Gardner, 1997). Uniformly isotope-labelled recombinant proteins are routinely obtained by expression in Escherichia coli bacteria grown on minimal media. For RNA and DNA, isotope-labelling techniques are more involved, but labelled nucleic acids will also be commonly available in the future. Being able to work with solutions is generally considered to be a great asset of the NMR method, but there are also potential inherent difficulties. For example, in the course of an investigation it may be nontrivial to achieve identical solution conditions in different NMR samples of the same compound; failure to do so typically results in small chemical-shift differences that slow down the combined analysis of different NMR spectra.
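As a rough worked example of the quantities implied by these sample conditions, the amount of material follows directly from concentration and volume. The molecular mass of 10 000 g mol⁻¹ used for the conversion to milligrams is an assumed illustrative value and is not taken from the text.

$$
n = cV = (1\ \mathrm{mmol\,l^{-1}})(0.5 \times 10^{-3}\ \mathrm{l}) = 0.5\ \mu\mathrm{mol}, \qquad
m = nM = (0.5\ \mu\mathrm{mol})(10\,000\ \mathrm{g\,mol^{-1}}) = 5\ \mathrm{mg}.
$$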

The demands on NMR experiments for macromolecular structure determination are currently met by multidimensional NMR at high polarizing magnetic fields (Wüthrich, 1986; Ernst et al., 1987; Cavanagh et al., 1996). With increasing molecular size and concomitant increase of the number of NMR peaks, it becomes more and more difficult to resolve and assign the individual resonances. In heteronuclear three- or four-dimensional (3D or 4D) spectra recorded with compounds that are uniformly labelled with ¹⁵N and/or ¹³C, the peaks are spread out in a third and possibly fourth dimension along the ¹⁵N and ¹³C chemical-shift axes (for a review, see Bax & Grzesiek, 1993). Additional labelling with ²H results in slower spin-relaxation rates and hence improved sensitivity and spectral resolution. Although this approach provides ¹³C and ¹⁵N NMR information, which may be used to support the structure determination (e.g. Wishart et al., 1991) and to provide supplementary information on dynamic features of the molecule studied, the key purpose of heteronuclear NMR experiments is to enhance the spectral resolution for studies of protons and thus to obtain the maximum possible number of ¹H NMR-based conformational constraints.

Resonance assignments in biopolymers are obtained using the sequential assignment strategy (Dubs et al., 1979; Billeter et al., 1982; Wagner & Wüthrich, 1982). Unique segments of two or several sequentially adjoining amino-acid residues are identified by NMR experiments and then attributed to discrete positions in the polypeptide chain by comparison with the chemically determined amino-acid sequence. The desired relations between protons in sequentially neighbouring amino-acid residues $i$ and $i + 1$ can be established by nuclear Overhauser effects (NOEs), manifesting a close approach between $\alpha\mathrm{CH}_{i}$ and $\mathrm{NH}_{i+1}$ ($d_{\alpha\mathrm{N}}$), $\mathrm{NH}_{i}$ and $\mathrm{NH}_{i+1}$ ($d_{\mathrm{NN}}$) (Fig. 19.7.2.2a), and possibly $\beta\mathrm{CH}_{i}$ and $\mathrm{NH}_{i+1}$ ($d_{\beta\mathrm{N}}$) (Wüthrich, 1986). For Xxx-Pro dipeptide segments, corresponding connectivities are observed with $\delta\mathrm{CH}_{2}$ of Pro in the place of the amide proton. For small proteins with up to about 100 amino-acid residues, NOE-based sequential assignments can rely entirely on homonuclear ¹H NMR experiments, and for somewhat bigger proteins they can be established based on the improved resolution of 3D NMR experiments with ¹⁵N-labelled proteins. No prior knowledge of the polypeptide conformation is needed, since at least one of the two distances $d_{\alpha\mathrm{N}}$ or $d_{\mathrm{NN}}$ (Fig. 19.7.2.2a) is always sufficiently short to be observed by NOEs (Billeter et al., 1982). A further attractive feature of this approach is that the identification of the sequential NOEs forms an integral part of the data collection for the protein structure determination (see below). Sequential assignments can alternatively be obtained entirely via heteronuclear scalar couplings, using recombinant isotope-labelled proteins (Fig. 19.7.2.2b). Using 3D and 4D heteronuclear triple-resonance experiments, the resonance lines of sufficiently large mutually overlapping fragments of the polypeptide chain are grouped together to enable sequence-specific resonance assignments (for a review, see Bax & Grzesiek, 1993). With the implementation of transverse relaxation-optimized spectroscopy (TROSY) elements (Pervushin et al., 1997) into triple-resonance experiments (Salzmann et al., 1998), backbone resonance assignments via the spin–spin couplings of Fig. 19.7.2.2(b) can be performed with molecular weights of 100 000 and beyond. For nucleic acids, assignment procedures were largely patterned after those used for proteins and have been used successfully for fragments with 40 nucleotides and beyond.
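The step of attributing an identified segment to a discrete position in the polypeptide chain can be illustrated with a minimal sketch. It assumes that each residue of a sequentially connected segment has already been narrowed down to a set of candidate amino-acid types; the function name `locate_segment`, the toy sequence and the candidate sets are hypothetical and serve only as an illustration, not as a description of any assignment program.

```python
from typing import List, Set

def locate_segment(sequence: str, segment: List[Set[str]]) -> List[int]:
    """Return all 1-based start positions at which the segment fits the sequence.

    `sequence` is the known amino-acid sequence in one-letter code.
    `segment` lists, for each sequentially connected residue, the set of
    amino-acid types compatible with its observed spin system (assumed input).
    """
    hits = []
    for start in range(len(sequence) - len(segment) + 1):
        window = sequence[start:start + len(segment)]
        # The segment fits if every residue in the window is an allowed type.
        if all(res in allowed for res, allowed in zip(window, segment)):
            hits.append(start + 1)
    return hits

# Hypothetical 10-residue sequence, for illustration only.
TOY_SEQUENCE = "GSAVKLAVTE"

# A dipeptide Ala-Val occurs twice, so its position is ambiguous.
dipeptide_hits = locate_segment(TOY_SEQUENCE, [{"A"}, {"V"}])

# Extending the segment by one sequential connectivity makes it unique.
tripeptide_hits = locate_segment(TOY_SEQUENCE, [{"A"}, {"V"}, {"T"}])

# Ambiguous residue types can still yield a unique placement.
ambiguous_hits = locate_segment(TOY_SEQUENCE, [{"A"}, {"V", "I"}, {"K", "R"}])
```

In this toy case the dipeptide Ala-Val fits at two positions, whereas the tripeptide Ala-Val-Thr fits at only one, which illustrates why segments of sufficient length are needed for sequence-specific assignments.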

Figure 19.7.2.2

(a) Sequential resonance assignment based on sequential ¹H–¹H NOEs. In the dipeptide segment -Ala-Val- the dotted lines indicate ¹H–¹H relations which can be established by scalar through-bond spin–spin couplings. The broken arrows connect pairs of protons in sequentially neighbouring residues, $i$ and $i + 1$, which are related by ¹H–¹H NOEs that manifest short sequential distances $d_{\alpha\mathrm{N}}$ (between αCH and the amide proton of the following residue) and $d_{\mathrm{NN}}$ (between the amide protons of neighbouring residues). (b) Segment of a polypeptide chain with indication of the scalar spin–spin couplings that provide the basis for obtaining sequential assignments by triple-resonance experiments with uniformly ¹³C/¹⁵N labelled proteins.

NOE upper-distance constraints contain the crucial information needed for macromolecular structure determination (Wüthrich, 1986, 1989). To obtain a high-quality structure, the maximum possible number of NOE conformational constraints must be collected as input for the structure calculation. This is accomplished by using the chemical-shift lists obtained as a result of the sequence-specific resonance assignments to attribute the cross peaks in 2D [¹H, ¹H]-NOESY spectra, or 3D and 4D heteronuclear-resolved [¹H, ¹H]-NOESY spectra, to distinct pairs of hydrogen atoms. As indicated in Fig. 19.7.2.1, this data collection is achieved in several cycles, where ambiguities in the NOESY cross-peak assignments can usually be resolved by reference to preliminary structures calculated from incomplete input data sets (Güntert et al., 1993). In present practice, each individual NOE constraint has the format of an allowed distance range, which circumvents intrinsic difficulties that might arise from attempts at quantitative distance measurements, and which is also adjusted to account for possible effects from internal mobility. The lower limit is usually taken to correspond to the sum of two hydrogen atomic radii, i.e. 2.0 Å, and the NOE intensities are translated into corresponding upper bounds, typically in steps of 2.5, 3.0 and 4.0 Å. Supplementary conformational constraints, for example from spin–spin coupling constants (Wüthrich, 1986), residual dipole–dipole couplings (Tjandra & Bax, 1997), or pseudocontact shifts and relaxation effects near paramagnetic centres (Banci et al., 1998), are represented in the input by similar allowed ranges, which account for the internal mobility of the 3D structures and the limited accuracy of the individual measurements.

Initially, NMR structures were calculated using distance-geometry techniques, and subsequently the principles of distance geometry have been introduced into molecular-dynamics programs in Cartesian coordinates (Brünger et al., 1986) or in torsion-angle space (Güntert et al., 1997). Model calculations performed in conjunction with the initial protein structure determinations had shown that NMR structure calculation depends critically on the density of NOE distance constraints, while it is remarkably robust with regard to low precision of the individual distance constraints (Havel & Wüthrich, 1985). For the common presentation of an NMR structure, one considers the result of a single structure calculation as representing one molecular geometry that is compatible with the NMR data. To investigate further whether or not this solution is unique, the calculation is repeated with different boundary conditions, where for each calculation, convergence is judged by the residual constraint violations. All satisfactory solutions, by this criterion, are included in a group of conformers that is used to represent the NMR structure (Fig. 19.7.2.3). The precision of the structure determination is reflected by the dispersion among this group of conformers. In proteins, larger variations are typically observed near the chain ends, in exposed loops and for surface amino-acid side chains, which contrasts with the well defined core. For nucleic acids, the 'global folds', for example, formation of duplexes, triplexes, quadruplexes, or loops, can be well defined by NMR, but because of the short range of the NOE distance measurements, certain 'long-range' features, for example, bending of DNA duplexes, may be more difficult to characterize.
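The translation of NOE intensities into allowed distance ranges described above can be sketched as follows. The sketch assumes the isolated spin-pair approximation, in which the NOE intensity scales as $r^{-6}$, and a calibration against a reference cross peak of known distance. The function names, the reference values used in the example, and the choice of assigning the loosest bound to very weak peaks are illustrative assumptions, not a description of any particular program.

```python
from bisect import bisect_left
from typing import Tuple

LOWER_BOUND = 2.0               # Å, sum of two hydrogen atomic radii (from the text)
UPPER_STEPS = (2.5, 3.0, 4.0)   # Å, typical upper-bound steps (from the text)

def estimate_distance(intensity: float, ref_intensity: float,
                      ref_distance: float) -> float:
    """Estimate a distance from an NOE intensity, assuming I ~ r**-6."""
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)

def noe_to_range(intensity: float, ref_intensity: float,
                 ref_distance: float) -> Tuple[float, float]:
    """Convert an NOE intensity into an allowed (lower, upper) range in Å."""
    d = estimate_distance(intensity, ref_intensity, ref_distance)
    # Pick the smallest step that is not below the estimated distance.
    idx = bisect_left(UPPER_STEPS, d)
    # Assumption: peaks weaker than the last step still get the loosest bound.
    idx = min(idx, len(UPPER_STEPS) - 1)
    return (LOWER_BOUND, UPPER_STEPS[idx])

# Hypothetical calibration: a reference peak of intensity 100 at 2.5 Å.
strong = noe_to_range(100.0, ref_intensity=100.0, ref_distance=2.5)
medium = noe_to_range(50.0, ref_intensity=100.0, ref_distance=2.5)
weak = noe_to_range(10.0, ref_intensity=100.0, ref_distance=2.5)
```

Because only a coarse upper bound is retained, errors in the intensity measurement or in the calibration have a limited effect on the input, which is in keeping with the observation that the calculation is robust with regard to low precision of the individual constraints.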

Figure 19.7.2.3

Polypeptide backbone of 19 energy-refined conformers selected to represent the NMR solution structure of the Antennapedia homeodomain. From residues 7 to 59, the structure is well defined by the NMR data. The chain-terminal segments 0–6 and 60–67 are disordered, and additional NMR studies showed that these terminal segments behave like 'flexible tails'. [Drawing prepared with the atomic coordinates from Qian et al. (1989).]
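The dispersion among a group of conformers can be quantified per residue, for example as the root-mean-square deviation of one backbone atom from its mean position. The following minimal sketch assumes that the conformers have already been superimposed and that each is given as a list of Cartesian coordinates with one atom (e.g. Cα) per residue. The function name and the toy coordinates are hypothetical and do not correspond to the structure shown in the figure.

```python
from math import sqrt
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]

def per_residue_dispersion(conformers: Sequence[Sequence[Coord]]) -> List[float]:
    """RMS deviation of each residue's atom from its mean position (Å).

    Assumes all conformers are pre-superimposed and list one atom per
    residue in the same order.
    """
    if not conformers:
        raise ValueError("at least one conformer is required")
    n_conf = len(conformers)
    n_res = len(conformers[0])
    if any(len(c) != n_res for c in conformers):
        raise ValueError("all conformers must have the same number of residues")

    dispersion = []
    for r in range(n_res):
        points = [c[r] for c in conformers]
        # Mean position of this residue over all conformers.
        mean = tuple(sum(p[k] for p in points) / n_conf for k in range(3))
        # Mean squared distance of the individual positions from the mean.
        msd = sum(
            sum((p[k] - mean[k]) ** 2 for k in range(3)) for p in points
        ) / n_conf
        dispersion.append(sqrt(msd))
    return dispersion

# Toy data: three conformers of a three-residue chain. The first two
# residues coincide in all conformers; the last one is spread out.
toy_conformers = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 3.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, -3.0, 0.0)],
]
toy_dispersion = per_residue_dispersion(toy_conformers)
```

Residues with small values would correspond to well defined regions, and large values near the chain ends to disordered segments; any threshold separating the two is a matter of choice and is not specified here.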

References

Banci, L., Bertini, I., Cremonini, M. A., Gori-Savellini, G., Luchinat, C., Wüthrich, K. & Güntert, P. (1998). PSEUDYANA for NMR structure calculation of paramagnetic metalloproteins using torsion angle molecular dynamics. J. Biomol. NMR, 12, 553–557.

Bax, A. & Grzesiek, S. (1993). Methodological advances in protein NMR. Acc. Chem. Res. 26, 131–138.

Billeter, M., Braun, W. & Wüthrich, K. (1982). Sequential resonance assignments in protein ¹H nuclear magnetic resonance spectra: computation of sterically allowed proton–proton distances and statistical analysis of proton–proton distances in single crystal protein conformations. J. Mol. Biol. 155, 321–346.

Brünger, A. T., Clore, G. M., Gronenborn, A. M. & Karplus, M. (1986). Three-dimensional structure of proteins determined by molecular dynamics with interproton distance restraints. Application to crambin. Proc. Natl Acad. Sci. USA, 83, 3801–3805.

Cavanagh, J., Fairbrother, W. J., Palmer, A. G. III & Skelton, N. J. (1996). Protein NMR spectroscopy, principles and practice. New York: Academic Press.

Dubs, A., Wagner, G. & Wüthrich, K. (1979). Individual assignments of amide proton resonances in the proton NMR spectrum of the basic pancreatic trypsin inhibitor. Biochim. Biophys. Acta, 577, 177–194.

Ernst, R. R., Bodenhausen, G. & Wokaun, A. (1987). Principles of nuclear magnetic resonance in one and two dimensions. Oxford: Clarendon Press.

Güntert, P., Berndt, K. D. & Wüthrich, K. (1993). The program ASNO for computer-supported collection of NOE upper distance constraints as input for protein structure determination. J. Biomol. NMR, 3, 601–606.

Güntert, P., Mumenthaler, C. & Wüthrich, K. (1997). Torsion angle dynamics for NMR structure calculation with the new program DYANA. J. Mol. Biol. 273, 283–298.

Havel, T. F. & Wüthrich, K. (1985). An evaluation of the combined use of nuclear magnetic resonance and distance geometry for the determination of protein conformations in solution. J. Mol. Biol. 182, 281–294.

Kay, L. E. & Gardner, K. H. (1997). Solution NMR spectroscopy beyond 25 kDa. Curr. Opin. Struct. Biol. 7, 722–731.

Pervushin, K., Riek, R., Wider, G. & Wüthrich, K. (1997). Attenuated T₂ relaxation by mutual cancellation of dipole–dipole coupling and chemical shift anisotropy indicates an avenue to NMR structures of very large biological macromolecules in solution. Proc. Natl Acad. Sci. USA, 94, 12366–12371.

Salzmann, M., Pervushin, K., Wider, G., Senn, H. & Wüthrich, K. (1998). TROSY in triple-resonance experiments: new perspectives for sequential NMR assignment of large proteins. Proc. Natl Acad. Sci. USA, 95, 13585–13590.

Tjandra, N. & Bax, A. (1997). Direct measurement of distances and angles in biomolecules by NMR in dilute liquid crystalline medium. Science, 278, 1111–1114.

Wagner, G. & Wüthrich, K. (1982). Sequential resonance assignments in protein ¹H nuclear magnetic resonance spectra: basic pancreatic trypsin inhibitor. J. Mol. Biol. 155, 347–366.

Wishart, D. S., Sykes, B. D. & Richards, F. M. (1991). Relationship between nuclear-magnetic-resonance chemical shift and protein secondary structure. J. Mol. Biol. 222, 311–333.

Wüthrich, K. (1986). NMR of proteins and nucleic acids. New York: Wiley.

Wüthrich, K. (1989). Protein structure determination in solution by nuclear magnetic resonance spectroscopy. Science, 243, 45–50.
