International
Tables for
Crystallography
Volume F
Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F. ch. 20.1, pp. 481-488   | 1 | 2 |
https://doi.org/10.1107/97809553602060000705

Chapter 20.1. Molecular-dynamics simulation of protein crystals: convergence of molecular properties of ubiquitin

U. Stockera and W. F. van Gunsterena

aLaboratory of Physical Chemistry, ETH-Zentrum, 8092 Zürich, Switzerland

A unit cell of ubiquitin was simulated for 2 ns using molecular dynamics to investigate the degree of convergence of different molecular properties in crystals. Energies, deviation from the experimentally derived crystal structure, atomic positional fluctuations and dihedral-angle fluctuations were analysed. Most of the properties examined are converged after 2 ns. The atomic positional deviation from the X-ray structure had not converged within 2 ns.

Keywords: molecular dynamics; molecular-dynamics simulation; ubiquitin.

20.1.1. Introduction

| top | pdf |

Molecules in crystals are often believed to have a very rigid structure due to their ordered packing, and the investigation of the molecular motion of such systems is often considered to be of little interest. In contrast to small-molecule crystals, however, the solvent concentration in protein crystals is high, usually with about half of the crystal consisting of water. Thus, in this respect, one can compare protein crystals with very concentrated solutions and expect non-negligible atomic motion. The atomic mobility in proteins can be investigated by experiment (X-ray diffraction, NMR) or by molecular simulation.

Today's experimental techniques are very advanced. They are, however, only able to examine time- and ensemble-averaged structures and properties. In contrast, with simulations one can go beyond averaged properties and examine the motions of a single molecule in the pico- and nanosecond time regime. Such simulations have become possible with the availability of high-resolution structural data, which provide adequate starting structures for biologically relevant systems. Depending on the kind of property in which one is interested, different methods of simulation may be used. Equilibrium properties can be obtained using either Monte Carlo (MC) or molecular-dynamics (MD) simulation techniques, but motions can only be observed with the latter. Current interest in the simulation community mainly focuses on dissolved proteins as they would be in their natural environment. Force fields are parameterized to mimic the behaviour and function of proteins in a solution, and few crystal simulations have been performed. Consequently, a crystal environment provides an excellent opportunity to test a force field on a task for which it should be appropriate, but for which it has not been directly parameterized.

Apart from the analysis of the dynamic properties of a system, MD simulations are also used in structure refinement. In refinement, be it X-ray crystallographic or NMR, a special term is added to the standard physical force field to reflect the presence of experimental data: [V({\bf r}) = V^{\rm phys}({\bf r}) + V^{\rm special} ({\bf r}). \eqno(20.1.1.1)] In NMR, a variety of properties can be measured, and each of these can be used in the definition of an additional term that restrains the generated structures to reproduce given experimental values. Refinement procedures exist that use nuclear-Overhauser-effect (van Gunsteren et al., 1984[link]; Kaptein et al., 1985[link]), J-value (Torda et al., 1993[link]) and chemical-shift (Harvey & van Gunsteren, 1993[link]) restraints. In crystallography, X-ray intensities are used to generate the restraining energy contribution (Brünger et al., 1987[link]; Fujinaga et al., 1989[link]). Combined NMR/X-ray refinement uses both solution and crystal data (Schiffer et al., 1994[link]).

As in an experiment, averages over time and molecules are measured, and instantaneous restraints can lead to artificial rigidity in the molecular system (Torda et al., 1990[link]). This can be circumvented by restraining time or ensemble averages, instead of instantaneous values, to the value of the measured quantity. Time averaging has been applied to nuclear Overhauser effects (Torda et al., 1990[link]) and J values (Torda et al., 1993[link]) in NMR structure determination and to X-ray intensities in crystallography (Gros et al., 1990[link]; Gros & van Gunsteren, 1993[link]; Schiffer et al., 1995[link]). Ensemble averaging has been applied in NMR refinement (Scheek et al., 1991[link]; Fennen et al., 1995[link]). For a more detailed discussion of restrained MD simulations, we refer to the literature (van Gunsteren et al., 1994[link], 1997[link]).

The first unrestrained MD simulations of a protein in a crystal were carried out in the early 1980s (van Gunsteren & Karplus, 1981[link], 1982[link]). The protein concerned was bovine pancreatic trypsin inhibitor (BPTI), a small (58-residue) protein for which high-resolution X-ray diffraction data were available. The initial level of simulation was to neglect solvent, using vacuum boundary conditions. This was improved gradually by the inclusion of Lennard–Jones particles at the density of water as a solvent (van Gunsteren & Karplus, 1982[link]) to let the protein feel random forces and friction from the outside as well as feel a slightly attractive external field. The next step was to use a simple (simple point charge, SPC) water model (van Gunsteren et al., 1983[link]). Further improvement was achieved by incorporating counter ions into the modelled systems to obtain overall charge neutrality (Berendsen et al., 1986[link]).

Despite these early attempts, few unrestrained crystal simulations have been reported in the literature, and, to our knowledge, these involve one to four protein molecules, simulating one unit cell (Shi et al., 1988[link]; Heiner et al., 1992[link]). The maximum time range covered has been less than 100 ps.

In the work described in this chapter, the current state of MD simulation of protein crystals is illustrated. A full unit cell of ubiquitin, containing four ubiquitin and 692 water molecules, has been simulated for a period of two nanoseconds. Since this simulation is an order of magnitude longer than crystal simulations in the literature, it offers the possibility of analysing the convergence of different properties as a function of time and as a function of the number of protein molecules. Converged properties can also be compared with experimental values as a test of the GROMOS96 force field (van Gunsteren et al., 1996[link]). Finally, the motions obtained can be analysed to obtain a picture of the molecular behaviour of ubiquitin in a crystalline environment.

20.1.2. Methods

| top | pdf |

Ubiquitin consists of 76 amino acids with 602 non-hydrogen atoms. Hydrogen atoms attached to aliphatic carbon atoms are incorporated into these (the united-atom approach), and the remaining 159 hydrogen atoms are treated explicitly. Ubiquitin crystallizes in the orthorhombic space group [P2_{1}2_{1}2_{1}], with a = 5.084, b = 4.277 and c = 2.895 nm. There is one molecule in the asymmetric unit. The protein was crystallized at pH 5.6. The amino acids Glu and Asp were taken to be deprotonated, and Lys, Arg and His residues were protonated, leading to a charge of +1 electron charge per chain. Because this is a small value compared with the size of the system, no counter ions were added. Four chains of ubiquitin, making up a full unit cell of the crystal, were simulated together with 692 water molecules modelled using the SPC water model (Berendsen et al., 1981[link]). 232 water molecules were placed at crystallographically observed water sites, and the remaining 460 were added to obtain the experimental density of 1.35 g cm−3, leading to a system size of 3044 protein atoms and 5120 atoms total.

The crystal structure of ubiquitin [Protein Data Bank (Bernstein et al., 1977[link]) code 1UBQ] solved at 1.8 Å resolution (Vijay-Kumar et al., 1987[link]) was used as a starting point. To achieve the appropriate total density, noncrystallographic water molecules were added, using a minimum distance of 0.220605 nm between non-hydrogen protein atoms or crystallographic water oxygen atoms and the oxygen atoms of the added water molecules, which were taken from an equilibrated water configuration (van Gunsteren et al., 1996[link]). Initial velocities were assigned from a Maxwell–Boltzmann distribution at 300 K. The protein and solvent were coupled separately to temperature baths of 300 K with a coupling time of 0.1 ps (Berendsen et al., 1984[link]). No pressure coupling was applied. Another simulation (results not shown) including pressure coupling showed no significant change in the box volume. Bonds were kept rigid using the SHAKE method (Ryckaert et al., 1977[link]), with a relative geometric tolerance of [10^{-4}]. Long-range forces were treated using twin range cutoff radii of 0.8 and 1.4 nm (van Gunsteren & Berendsen, 1990[link]). The pair list for non-bonded interactions was updated every 10 fs. No reaction field correction was applied. All simulations were performed using the GROMOS96 package and force field (van Gunsteren et al., 1996[link]).

The system was initially minimized for 20 cycles using the steepest-descent method. The protein atoms were harmonically restrained (van Gunsteren et al., 1996[link]) to their initial positions with a force constant of 25000 kJ mol−1 nm−2. This minimized structure was then pre-equilibrated in several short MD runs of 500 steps of 0.002 ps each, gradually lowering the restraining force constant from 25000 kJ mol−1 nm−2 to zero. The time origin was then set to zero, and the entire unit cell was simulated for 2 ns. The time step was 0.002 ps, and every 500th configuration was stored for evaluation. The first 400 ps of the run were treated as equilibration time, the remaining 1.6 ns were used for analysis.

20.1.3. Results

| top | pdf |

20.1.3.1. Energetic properties

| top | pdf |

In Fig. 20.1.3.1[link], the non-bonded contributions to the total potential energy are shown. The non-bonded interactions comprise Lennard–Jones and electrostatic interactions. Solvent–solvent, solute–solute and solute–solvent interaction energies are shown separately. All of these appear converged after approximately 100 ps. The solvent–solvent energy remains close to its initial value during the whole simulation, the water molecules having relaxed during the pre-equilibration period, while the protein was restrained. The protein internal energy increases during the first few hundred picoseconds, but this is compensated by a decrease in the protein–solvent energy as the protein adapts to the force field and the pre-relaxed solvent environment. This effect becomes negligible after about 200 ps, from which time point the system can be viewed as equilibrated with respect to the energies. The distribution of kinetic versus potential energy and the total (bonded and non-bonded) energy of the system relaxes even faster (results not shown).

[Figure 20.1.3.1]

Figure 20.1.3.1| top | pdf |

Non-bonded energies (in kJ mol−1) of the simulated system as a function of time.

20.1.3.2. Structural properties

| top | pdf |

Not all properties converge as fast as the energies. Fig. 20.1.3.2[link] shows the root-mean-square atom-position deviation (RMSD) from the X-ray structure for each of the four individual chains for both Cα atoms and all atoms. Here, several relaxation periods can be distinguished. After the initial increase, which occurs during the first 50 ps of the simulation, a plateau is reached, and the system is apparently stable until 300 ps. The values reached are 0.12 nm for the Cα atoms and 0.20 nm if all atoms are considered. These numbers are comparable with results obtained in crystal simulations of other proteins of equivalent length reported in the literature (van Gunsteren et al., 1983[link]; Berendsen et al., 1986[link]; Shi et al., 1988[link]; Heiner et al., 1992[link]; Levitt et al., 1995[link]). After 300 ps, however, the values increase slowly again. For the Cα atoms, there is apparently a second plateau from 300 to 850 ps, but during this period the RMSD for all atoms continues to increase monotonically. After 850 ps, a final plateau is reached. During the second nanosecond of the simulation (1000–2000 ps), the RMSDs are 0.21 nm for the Cα atoms and 0.29 nm for all atoms. The RMSD of chain 1 is an exception. There is a strong increase after 1200 ps due to a movement of a particular part of the chain which will be addressed later. To ensure that the RMSD values have converged, longer runs would be required.

[Figure 20.1.3.2]

Figure 20.1.3.2| top | pdf |

Root-mean-square atom-positional deviations (RMSD) in nm from the X-ray structure of the four different protein molecules in the unit cell as a function of time. Rotational and translational fitting was applied using the Cα atoms of residues 1–72. The upper and lower graphs show the deviations for the Cα atoms and for all atoms, respectively.

Although the RMSD values shown in Fig. 20.1.3.2[link] grow larger than those usually observed in the course of short simulations, the hydrogen-bonding pattern and thus the secondary-structure elements observed in the X-ray structure are reproduced well (Table 20.1.3.1[link]). Most of the hydrogen bonds reported (Vijay-Kumar et al., 1987[link]) show high occupancies during the whole simulation, especially the ones within secondary-structure elements. Only six out of the 44 hydrogen bonds present in the X-ray structure are disrupted during the simulation. Hydrogen bonds in the α-helix (residues 23–34) show very high occupancies, ranging from 75% at the N-terminus to well over 90% inside the helix. Only its C-terminus shows signs of instability, with the α-helix deforming towards a 310-helix. The β-sheet pattern is, apart from chain 1 in the region of residues 49–64, as stable as the α-helix. Occupancies range from 55% up to 95%. The six hydrogen bonds not reproduced (five 310 and one α-helical) can be rationalized as follows. The bridges 10–7 and 65–62 are part of the most mobile regions of the protein. These regions involve residues 7–10, 51–54 and 62–65 (Vijay-Kumar et al., 1987[link]). The hydrogen bond at the C-terminal end of the α-helix (35–31) is lost, and the preceding hydrogen bond (34–30) is partly changed, indicating that the end of the α-helix is deformed towards a 310-helix. The donor of the bond 40–37 is replaced by residue 41, and the four-residue 310-helix that was stabilized by hydrogen bonds 58–55 and 59–56 is replaced by an α-helical hydrogen bond 59–55. The high occupancy of this particular bond and the complete absence of the two observed experimentally indicate an early rearrangement in this part of the structure (before the analysis period) which is stable during the rest of the simulation.

Table 20.1.3.1| top | pdf |
Occurrence of intramolecular hydrogen bonds (%) during the final 1.6 ns of the simulation

The criteria for a hydrogen bond to be present are angle donor–hydrogen–acceptor [\geq 135^{\circ}], distance hydrogen–acceptor [\leq 0.25] nm. Hydrogen bonds are shown if they are either present in the X-ray structure or if at least one of the four protein molecules in the unit cell shows the hydrogen bond of interest for at least 50% of the simulation time. The letter h appended to an amino-acid code indicates that the residue is protonated.

Hydrogen bonds X-ray structureMolecular dynamics
BackboneBackbone Molecule 1 Molecule 2 Molecule 3 Molecule 4
3Ile H15Leu O10094949598
4Phe H65Ser O10085698777
5Val H13Ile O10080908793
6Lysh H67Leu O10085828894
7Thr H11Lysh O10065495462
8Leu H69Leu O05521955
10Gly H7Thr O1000000
13Ile H5Val O10086767087
15Leu H3Ile O10087927282
17Val H1Met O10068397951
21Asp H18Glu O10068848490
23Ile H54Arg O1000748992
24Glu H52Asp O10058696384
26Val H22Thr O10092697861
27Lysh H23Ile O10094979899
28Ala H24Glu O10071718489
29Lysh H25Asn O10091799488
30Ile H26Val O10092769491
31Gln H27Lysh O10085536693
32Asp H28Ala O10082278777
33Lysh H29Lysh O10023138151
33Lysh H30Ile O05923719
34Glu H30Ile O10095546486
35Gly H31Gln O1000000
36Ile H34Glu O062502835
40Gln H37Pro O1000000
41Gln H37Pro O068567220
41Gln H38Pro O10014251450
42Arg H70Val O10082828388
44Ile H68Hish O10084969395
45Phe H48Lysh O10020747791
48Lysh H45Phe O10024625944
50Leu H43Leu O10029889285
54Arg H51GluO10020601969
56Leu H21Asp O1000908181
57Ser H19Pro O1003788683
58Asp H55Thr O1000000
59Tyr H55Thr O10058869285
59Tyr H56Leu O1000000
60Asn H57Ser O10038346058
61Ile H56Leu O1006776356
64Glu H2Gln O100042695
65Ser H62Gln O1000000
67Leu H4Phe O10069748770
68Hish H44Ile O10062688389
69Leu H6Lysh O10079729290
70Val H42Arg O10091899091
72Arg H40Gln O10079598578

Hydrogen bondX-ray structureMolecular dynamics
BackboneSide chainMolecule 1Molecule 2Molecule 3Molecule 4
2Gln H64Glu OE20637840
9Thr H7Thr OG11000000
11Lysh H7Thr OG11000000
18Glu H21Asp OD210080300
20Ser H18Glu OE2000550
25Asn H22Thr OG110031136138
51Glu H59Tyr OH10046875676
55Thr H58Asp OD110029622275
58Asp H55Thr OG110053767286
64Glu H64Glu OE20556160

Hydrogen bondX-ray structureMolecular dynamics
Side chainBackboneMolecule 1Molecule 2Molecule 3Molecule 4
29Lysh HZ216Glu O1000000
33Lysh HZ214Thr O1000000
41Gln HE2127Lysh O10081914771
41Gln HE2236Ile O10090896083
48Lysh HZ346Ala O1000000

Hydrogen bondX-ray structureMolecular dynamics
Side chainSide chainMolecule 1Molecule 2Molecule 3Molecule 4
11Lysh HZ234Glu OE21000000
20Ser HG18Glu OE2000600
27Lysh HZ252Asp OD21000000
49Gln HE2116Glu OE11000000
54Arg HH1258Asp OD11000000
55Thr HG158Asp OD1044832986

Backbone–side-chain hydrogen bonds are less well reproduced than backbone–backbone interactions. While some are present 80–90% of the time, others are present less than 50% of the time. Two out of the seven hydrogen bonds in which a backbone atom is the donor are not observed in the simulation; both involve the OG1 atom of Thr7 as an acceptor. The hydrogen-donor atoms are the backbone nitrogen atoms of residues Thr9 and Lys11, both of which have high experimental B factors (18.32 Å2 for Thr9 and 13.56 Å2 for Lys11). The mean of the experimental B factors is 10.73 Å2 for the backbone atoms and 13.41 Å2 for all protein atoms. Where a side-chain atom is the donor, three out of the five hydrogen bonds present in the X-ray structure are not found in the simulation. All of these involve the side-chain nitrogen atom of a lysine residue as the donor, the experimental B factors of which range from 23.92 Å2 for the NZ atom of Lys48 up to 30.06 Å2 for the NZ atom of Lys33. Of the four side-chain–side-chain hydrogen bonds, not one is observed as in the crystal. The 54–58 hydrogen bond is, however, replaced by a 55–58 hydrogen bond. All the others involve very mobile atoms with large experimental B factors as donors and acceptors.

There is one intermolecular hydrogen bond (Table 20.1.3.2[link]) in the starting structure which is not seen in the simulation. The donor is the side-chain nitrogen atom of Lys6, which has an experimental B factor of 20.55 Å2, and the acceptors are the side-chain oxygen atoms of Glu51, with B factors of 32.13 and 33.44 Å2. Most of the hydrogen bonds not reproduced in the simulation contain at least one mobile atom. Although these atoms do not remain fixed at their equilibrium positions, they may still stabilize the structure on average.

Table 20.1.3.2| top | pdf |
Occurrence of intermolecular hydrogen bonds (%) during the final 1.6 ns of the simulation

The criteria for a hydrogen bond to be present are: angle donor–hydrogen–acceptor [\geq 135^{\circ}], distance hydrogen–acceptor [\leq 0.25] nm. Hydrogen bonds are shown if they are either present in the X-ray structure or if at least one of the four protein molecules in the unit cell shows the hydrogen bond of interest for at least 50% of the simulation time.

Hydrogen bond X-ray structureMolecular dynamics
Molecules 1–4 Molecules 2–3 Molecules 3–2 Molecules 4–1
6 Lys HZ351 Glu OE11000000
12 Thr HG118 Glu OE1056577534
49 Gln H8 Leu O01034067
68 Hish HE232 Asp OD200053 (3–1)13 (4–2)
71 Leu H58 Asp O065000

In Fig. 20.1.3.3[link], the deviation of the Cα atoms of the different chains from the X-ray structure and from the mean MD structure are presented together with the deviation of the mean MD structure from the X-ray structure. Overall, the individual protein molecules remain close to the experimental structure; however, parts of the structure do deviate substantially. The region involving residues 7–11, which experimentally has high B factors (implying high mobility), has, in three out of the four cases, an RMSD for Cα atoms of 0.3 nm or greater. Chain 3, in contrast, is close to the X-ray structure, and the mean MD structure is closer to the X-ray structure than are any of the individual chains. This suggests that the simulation does not deviate systematically from the X-ray structure, but rather that different regions of conformational space are sampled by the different molecules. The same holds for the very mobile region between residues 47 and 64, where chain 1 deviates dramatically from both the mean MD structure and the X-ray structure. In the other chains, this part of the protein remains close to the X-ray structure. Overall, the deviation from the X-ray structure is largest in the C-terminal region. This part is also ill-defined in the experiment, with occupancies of 0.45 for residues 73 and 74, and 0.25 for the terminal two glycines. Other parts of the protein, especially the stable secondary-structure elements, stay close to the X-ray structure. In the average structure, the α-helix, including its C-terminal part which was deformed to a 310-helix, deviates by a maximum of 0.08 nm from the X-ray structure, although, as seen before, the individual chains may deviate more. The β-sheet regions also stay close to the X-ray structure. As with the helix, residues 1–7, 40–45 and 64–72 stay within 0.1 nm RMSD from the X-ray structure. The β-strands formed by residues 10–17 and 48–50 are not as similar to the experiment, since they lie close to mobile regions and are thus influenced by neighbouring mobile residues. For the strand formed by residues 10–17, from residue 12 onwards the same structural similarity is reached as for all other secondary-structure elements, and residues 48–50 are, again, strongly influenced by the moving part of chain 1.

[Figure 20.1.3.3]

Figure 20.1.3.3| top | pdf |

Root-mean-square Cα-atom-position deviation (RMSD) in nm from a reference structure as a function of the residue number using the final 1.6 ns of the simulation. RMSDs of the four protein molecules in the unit cell from the mean molecular-dynamics structure (dashed line) and from the X-ray structure (solid line) are shown in the first four graphs. The bottom graph shows the RMSD of the mean (over the four molecules) MD structure from the X-ray structure.

20.1.3.3. Effect of the translational and rotational fitting procedure

| top | pdf |

In Fig. 20.1.3.4[link], the impact of different fitting protocols on atomic mean-square position fluctuations (RMSFs) is examined. B factors are related to mean-square position fluctuations according to [B_{i} = (8\pi^{2}/3) \left\langle ({\bf r}_{i} - \langle {\bf r}_{i}\rangle)^{2}\right\rangle, \eqno(20.1.3.1)] where the angle brackets indicate a time or a combined time and ensemble average. Molecule 4 was selected because it is the most stable. RMSFs of the Cα atoms were calculated directly from the simulation trajectory (Fig. 20.1.3.4a[link]) after applying a translational fit using the Cα atoms of residues 1–72, which are well defined in the X-ray structure (Fig. 20.1.3.4b[link]), and after applying both a rotational and a translational fit on residues 1–72 (Fig. 20.1.3.4c[link]). In Fig. 20.1.3.4(d)[link], a translational and rotational fit was applied to all Cα atoms (residues 1–76). Removal of the overall translational component of motion reduces the positional fluctuations by 0.04 nm on average. Only the RMSFs in the proximity of the end of the large α-helix formed by residues 23–34 are not affected by the introduction of translational fitting. In contrast, it is exactly this region where the fluctuations are substantially lowered by introducing an additional rotational fit. The regions before residue 27 and after residue 42 are only slightly affected by the removal of overall rotation. These findings suggest that the entire protein translates by about 0.04 nm, while the α-helix region remains close to its initial position, thus rotating relative to the rest of the protein. Inclusion of the four C-terminal residues in the fitting procedure only affects the RMSFs of these residues and residues in the rotating part of the molecule, indicating that these four residues move together with the rest of the molecule. The atom-position fluctuations obtained by applying a full (rotational and translational) fit are determined by internal motion only. The largest RMSFs for residues 1–72 are 0.12 nm. RMSFs of the two C-terminal glycines are 0.26 nm if the last four residues are excluded from the fitting and 0.22 nm otherwise.

[Figure 20.1.3.4]

Figure 20.1.3.4| top | pdf |

Root-mean-square Cα-atom-position fluctuations (RMSFs) in nm are shown for molecule 4 as a function of residue number. RMSFs are averaged over the final 1.6 ns. In (a), no fitting was applied; in (b), translational fitting was applied using the Cα atoms of residues 1–72; and in (c), the rotational component of motion was also removed. In (d), translational and rotational fitting was applied over all Cα atoms (1–76).

If the same properties are examined but averaged over all the chains, similar trends can be observed (Fig. 20.1.3.5[link]). If no fitting is applied, the RMSFs of 0.24 nm, on average, indicate that the different molecules show relative translation and rotation. After translational fitting is applied, the mean RMSFs drop to 0.18 nm. Thus, the molecules translate within the unit cell. If, in addition, the rotational component of overall motion is removed, the whole helix region is much less mobile than before, and the mean RMSFs drop to 0.14 nm. The same holds for region 47–64, dominated by the rotation of part of chain 1. Fluctuations are generally much larger than before when only chain 4 was observed, again indicating that the distinct chains behave in an uncorrelated way. The size of the peaks of the RMSFs averaged over all chains is around 0.22 nm, compared with 0.27 nm when overall rotation is still present. Thus, in addition to internal rotations, relative rotations of the different molecules occur. If the fit is not applied only to the well defined Cα atoms of residues 1–72, the RMSF value becomes slightly higher – apart from the C-terminal region – but this effect is small, with the mean RMSF staying at 0.14 nm. However, the relative heights of the peaks differ, which shows that it is crucial to define a standard fitting protocol that must not be changed during the course of the analysis.

[Figure 20.1.3.5]

Figure 20.1.3.5| top | pdf |

Root-mean-square Cα-atom-position fluctuations (RMSFs) in nm are shown using the same fitting protocols as in Fig. 20.1.3.4[link], but averaged over all four protein molecules in the unit cell.

20.1.3.4. Effect of the averaging period

| top | pdf |

In Fig. 20.1.3.6[link], we concentrate again on molecule 4. Comparing different averaging periods of 400 and 800 ps with different starting points, it can be seen that, in general, the later the simulation, the less motion observed. During the period 800–1200 ps, only the small region between residues 9 and 12 shows more mobility than that between 400 and 800 ps. In the stable region, the rest of the molecule shows the same mobility as in the earlier time period. In the parts that are most mobile between 400 and 800 ps, the motions decrease significantly after the latter time point, indicating that equilibrium is reached. Focusing on the longer averaging periods, 400–1200 ps versus 1200–2000 ps, we see that over the whole chain mobility is comparable, indicating clear equilibrium as far as internal motions are concerned. The fluctuations during the 400–1200 ps period are of the same size as those of the shorter subperiods, 400–800 ps and 800–1200 ps. They are thus determined by movements on a timescale shorter than 400 ps.

[Figure 20.1.3.6]

Figure 20.1.3.6| top | pdf |

Root-mean-square Cα-atom-position fluctuations (RMSFs) in nm are shown for molecule 4, with full translational and rotational fitting over the Cα atoms of residues 1–72. Different averaging periods are compared. (a) shows the RMSFs for the period 400–800 ps, (b) shows those for the period 800–1200 ps. In (c), the results for the previous two periods are averaged (400–1200 ps), and in (d), the results for the period 1200–2000 ps are shown.

In Fig. 20.1.3.7[link], the effect of different averaging periods is again considered, but taking the whole unit cell into account. Comparing RMSFs between 400 and 800 ps with those of the following 400 ps period, no significant difference is seen. In fact, contrary to what was observed when chain 4 was examined, somewhat more mobility is evident in the later period compared with the earlier one. This difference shows that although the configurations of the single chains converge rapidly, the different chains visit different parts of phase space. The fact that the fluctuations in the period 400–1200 ps are between those of the two shorter analysis periods is another indication that the single-chain movements are converged on these short timescales. In the last 800 ps of the simulation, the RMSFs are substantially higher than in the 800 ps window before that. All of the peaks can be traced back to one of the single chains. If only one of the four molecules differs strongly from the other three, this one determines the magnitude of the fluctuations of the average. The peak at residue 10 comes from chain 2, the ones around residue 20 and the whole region 47–64 are determined by chain 1. The peak at residue 33 originates in chain 3, which at this point differs substantially from the mean MD structure (Fig. 20.1.3.3[link]).

[Figure 20.1.3.7]

Figure 20.1.3.7| top | pdf |

Root-mean-square Cα-atom-position fluctuations (RMSFs) in nm are shown using the same averaging periods as in Fig. 20.1.3.6[link], but averaged over all four protein molecules in the unit cell.

20.1.3.5. Internal motions of the proteins

| top | pdf |

Fig. 20.1.3.8[link] displays the atomic root-mean-square position fluctuations for the Cα atoms of the four protein molecules during the whole analysis period, together with corresponding values obtained using equation (20.1.3.1[link]) and the crystallographic B factors. Rotational and translational fitting was applied using the Cα atoms of residues 1–72, and the fluctuations were averaged over the final 1.6 ns. The mobility of the stable secondary-structure elements in the simulation is comparable with that inferred from the experiment. There is a correlation between the more mobile parts of the proteins in the simulation and large B factors in the X-ray structure, but the magnitude of the fluctuations is overestimated in the simulation. The movements of the single chains can be rationalized as follows. In chain 1, the whole region from Gly47 onwards rotates around a stable axis formed by residues 41–46. This part lies, as do all the flexible regions, on the exterior of the protein. Residues 19 and 20, which are stable in all but this single chain, are in contact with this moving part. This rotation, which tends to compact the protein, occurs during the 200 ps period between 1350 and 1550 ps after the start of the simulation, in which the atom-position RMSD from the X-ray structure increases significantly (Fig. 20.1.3.2[link]). Overall, chain 2 is more stable than chain 1. Nevertheless, the end of the unwinding helix shows large fluctuations. In the course of this deformation, the side-chain nitrogen atom of Lys11 moves from close to the OE atom of Glu34 towards the backbone oxygen atom of Lys33, which is associated with a change in the position of Gly10. A similar but smaller motion occurs in chain 4. Both lysines, Lys33 and Lys63, are fully exposed to the solvent and have no intramolecular contacts. In chain 3, the flexible residues are also not part of secondary-structure elements and are located on the outside of the protein. The backbone oxygen atom of Gln62 that moves in all the four chains has, in addition, the closest contact to another heavy atom: the OG1 atom of Ser65 is only 2.51 Å away, and the van der Waals repulsion of these atoms causes them to move further away from one another. The mobile residues in chain 4 are again in contact with the solvent, Gly35, Gly47, Gln62, the end of the helix and Gly10. The terminal residues of all the protein molecules are very mobile, as observed experimentally in the crystal.

[Figure 20.1.3.8]

Figure 20.1.3.8| top | pdf |

Root-mean-square Cα-atom-position fluctuations (RMSFs) in nm for the four protein molecules in the unit cell as a function of the residue number. Full translational and rotational fitting was applied to the Cα atoms of residues 1–72 using the final 1.6 ns of the simulation [(a)–(d)]. (e) shows the corresponding values defined by equation (20.1.3.1[link]), obtained from experimental B factors.

20.1.3.6. Dihedral-angle fluctuations and transitions

| top | pdf |

Backbone dihedral-angle fluctuations and transitions are examined in Tables 20.1.3.3[link] and 20.1.3.4[link] using different analysis periods. After the first 400 ps of analysis, the [\varphi/\psi] dihedral-angle fluctuations differ only slightly, but if longer averaging times are chosen, the different protein molecules show larger differences from one another. These fluctuations also increase for longer analysis times, indicating that they are not yet converged after 2 ns. In the period from 800 to 1200 ps, chain 3 shows a large increase in mean-square dihedral-angle fluctuations, whereas the Cα-atom-position RMSDs with respect to the X-ray structure during the same time fluctuate around a plateau. Thus, there is a lot of flexibility without the simulation structure diverging from the experimental one. Protein molecule 3, for example, shows the largest [\varphi/\psi] fluctuations of all the four molecules, and it shows the lowest atom-position RMSDs of Cα atoms from the X-ray structure at the end of the simulation (Fig. 20.1.3.2[link]), indicating that it explores phase space around the equilibrium structure. If, in contrast, the Cα-atom-position RMSDs, with respect to the X-ray structure, increase significantly, larger dihedral-angle fluctuations are also observed, for example, in molecule 1 after 1200 ps.

Table 20.1.3.3| top | pdf |
Root-mean-square fluctuations of polypeptide backbone and ψ dihedral angles (°) for the different molecules using different time-averaging periods

The bottom row shows the averages over all four protein molecules.

Molecule400–800 ps400–1200 ps400–2000 ps
[\varphi/\psi] [\varphi/\psi] [\varphi/\psi]
118.4/22.919.5/23.731.0/33.6
217.2/17.018.6/18.723.6/26.8
318.5/20.425.6/26.335.3/37.5
419.7/18.819.4/20.321.6/28.8
All26.1/26.228.0/28.635.2/38.1

Table 20.1.3.4| top | pdf |
Number of protein-backbone dihedral-angle transitions per 100 ps for the different molecules using different time periods

Dihedral angles with threefold and sixfold potential-energy wells are distinguished. The bottom rows show the averages over all protein molecules.

(a) 120° transitions

Molecule400–800 ps400–1200 ps400–2000 ps
146.545.447.7
240.541.547.3
350.557.151.3
444.846.446.4
All45.647.648.2

(b) 60° transitions

Molecule400–800 ps400–1200 ps400–2000 ps
1245.5246.6289.3
2271.5272.1261.3
3381.5381.0348.3
4356.8325.4325.4
All313.8306.3306.1

(c) All transitions

Molecule400–800 ps400–1200 ps400–2000 ps
1292.0292.0336.9
2312.0313.7308.6
3432.0438.1399.6
4401.5371.8371.8
All359.4353.9354.2

Concerning relaxation, observations similar to those made before can be made when analysing dihedral-angle transitions (Table 20.1.3.4[link]). The number of transitions for the different chains differs by about 20%. Within a single chain, however, the number of transitions increases in proportion to the observation time. Again, the protein molecules showing the most transitions do not have the largest Cα-atom-position RMSDs from the X-ray structure. Thus, only certain dihedral-angle flips induce large movements that determine the RMSD value.

20.1.3.7. Water diffusion

| top | pdf |

In Fig. 20.1.3.9[link], the number of water oxygen atoms with a given atomic root-mean-square position fluctuation (RMSF) are plotted. The time evolution and the shapes of these curves are similar to those obtained for bulk water, a Gaussian distribution with the maximum at larger RMSF values and larger standard deviations when using longer averaging times. Using a diffusion constant of bulk SPC water at 300 K of 3.9 × 10−3 nm2 ps−1 (Smith & van Gunsteren, 1995[link]), the root-mean-square position fluctuation for an average water molecule would be 1.25 nm for a 400 ps period, 1.77 nm for a 800 ps period, and 2.5 nm for a 1600 ps period. Comparison of these values with the distributions in Fig. 20.1.3.9[link] indicates that the motion of most of the crystal water molecules is restricted by the crystalline environment. A number of water molecules adopt well defined positions for periods of the order of some 100 ps. Several water molecules were also observed to visit the same location repeatedly. This indicates that, although not remaining fixed in space, they stay close to a crystallographically determined site which they revisit periodically, alternating with other water molecules.

[Figure 20.1.3.9]

Figure 20.1.3.9| top | pdf |

The number of water molecules with a given root-mean-square oxygen-position fluctuation (RMSF) in nm are shown for different averaging periods: 400–800 ps (solid line), 400–1200 ps (short-dashed line), 400–2000 ps (long-dashed line).

20.1.4. Conclusions

| top | pdf |

In the present molecular-dynamics simulation, fast convergence in energy, within about 100 ps, was observed. Other properties, such as dihedral-angle fluctuations and backbone atom-position fluctuations, converged on an intermediate timescale of hundreds of picoseconds. Root-mean-square deviations of the simulated protein molecules from the starting X-ray structure required of the order of 1 ns to reach a plateau. Longer simulations would be necessary to obtain convergence for all molecular properties. The convergence of quickly relaxing properties of the system, such as the energies, is not a reliable indicator of the degree of global convergence in such a molecular-dynamics simulation.

The GROMOS96 force field used in this simulation largely reproduces the secondary structure and the relative internal mobility of ubiquitin. The simulation does, however, overestimate the magnitude of the fluctuations in the most mobile regions of the protein. The different protein molecules were observed to translate and rotate relative to one another during this simulation. This indicates that the force field would not be able to reproduce the experimental melting temperature of this crystal under the conditions simulated.

Acknowledgements

The authors wish to thank Dr Thomas Huber for fruitful discussions and Dr Alan Mark for critical reading of the manuscript. Financial support was obtained from the Schweizerischer Nationalfonds (Project 21-41875.94), which is gratefully acknowledged.

References

First citation Berendsen, H. J. C., van Gunsteren, W. F., Zwinderman, H. R. J. & Geurtsen, R. (1986). Simulations of proteins in water. Ann. N. Y. Acad. Sci. 482, 269–285.Google Scholar
First citation Berendsen, H. J. C., Postma, J. P. M., van Gunsteren, W. F., DiNola, A. & Haak, J. R. (1984). Molecular dynamics with coupling to an external bath. J. Chem. Phys. 81, 3684–3690.Google Scholar
First citation Berendsen, H. J. C., Postma, J. P. M.,van Gunsteren, W. F. & Hermans, J. (1981). Interaction models for water in relation to protein hydration. In Intermolecular forces, edited by B. Pullman, pp. 331–342. Dordrecht: Reidel.Google Scholar
First citation Bernstein, F. C., Koetzle, T. F., Williams, G. J. B., Meyer, E. F. Jr, Brice, M. D., Rodgers, J. R., Kennard, O., Shimanouchi, T. & Tasumi, M. (1977). The Protein Data Bank: a computer-based archival file for macromolecular structures. J. Mol. Biol. 112, 535–542.Google Scholar
First citation Brünger, A. T., Kuriyan, J. & Karplus, M. (1987). Crystallographic R-factor refinement by molecular dynamics. Science, 235, 458–460.Google Scholar
First citation Fennen, J., Torda, A. E. & van Gunsteren, W. F. (1995). Structure refinement with molecular dynamics and a Boltzmann-weighted ensemble. J. Biomol. NMR, 6, 163–170.Google Scholar
First citation Fujinaga, M., Gros, P. & van Gunsteren, W. F. (1989). Testing the method of crystallographic refinement using molecular dynamics. J. Appl. Cryst. 22, 1–8.Google Scholar
First citation Gros, P. & van Gunsteren, W. F. (1993). Crystallographic refinement and structure-factor time-averaging by molecular dynamics in the absence of a physical force field. Mol. Simul. 10, 377–395.Google Scholar
First citation Gros, P., van Gunsteren, W. F. & Hol, W. G. J. (1990). Inclusion of thermal motion in crystallographic structures by restrained molecular dynamics. Science, 249, 1149–1152.Google Scholar
First citation Gunsteren, W. F. van & Berendsen, H. J. C. (1990). Computer simulations of molecular dynamics: methodology, applications and perspectives in chemistry. Angew. Chem. Int. Ed. Engl. 29, 992–1023.Google Scholar
First citation Gunsteren, W. F. van, Berendsen, H. J. C., Hermans, J., Hol, W. G. J. & Postma, J. P. M. (1983). Computer simulation of the dynamics of hydrated protein crystals and its comparison with X-ray data. Proc. Natl Acad. Sci. USA, 80, 4315–4319.Google Scholar
First citation Gunsteren, W. F. van, Billeter, S. R., Eising, A. A., Hünenberger, P. H., Krüger, P., Mark, A. E., Scott, W. R. P. & Tironi, I. G. (1996). Biomolecular simulation: the GROMOS96 manual and user guide. Vdf Hochschulverlag, Zürich, Switzerland.Google Scholar
First citation Gunsteren, W. F. van, Bonvin, A. M. J. J., Daura, X. & Smith, L. J. (1997). Aspects of modelling biomolecular structures on the basis of spectroscopic or diffraction data. In Modern techniques in protein NMR, edited by Krishna & Berliner. Plenum.Google Scholar
First citation Gunsteren, W. F. van, Brunne, R. M., Gros, P., van Schaik, R. C., Schiffer, C. A. & Torda, A. E. (1994). Accounting for molecular mobility in structure determination based on nuclear magnetic resonance spectroscopic and X-ray diffraction data. Methods Enzymol. 239, 619–654.Google Scholar
First citation Gunsteren, W. F. van, Kaptein, R. & Zuiderweg, E. R. P. (1984). Use of molecular dynamics computer simulations when determining protein structure by 2D-NMR. In Proceedings of the NATO/CECAM workshop on nucleic acid conformation and dynamics, edited by W. K. Olson, pp. 79–97. France: CECAM.Google Scholar
First citation Gunsteren, W. F. van & Karplus, M. (1981). Effect of constraints, solvent and crystal environment on protein dynamics. Nature (London), 293, 677–678.Google Scholar
First citation Gunsteren, W. F. van & Karplus, M. (1982). Protein dynamics in solution and in crystalline environment: a molecular dynamics study. Biochemistry, 21, 2259–2274.Google Scholar
First citation Harvey, T. S. & van Gunsteren, W. F. (1993). The application of chemical shift calculation to protein structure determination by NMR. In Techniques in protein chemistry IV, pp. 615–622. New York: Academic Press.Google Scholar
First citation Heiner, A. P., Berendsen, H. J. C. &van Gunsteren, W. F. (1992). MD simulation of subtilisin BPN′ in a crystal environment. Proteins, 14, 451–464.Google Scholar
First citation Kaptein, R., Zuiderweg, E. R. P., Scheek, R. M., Boelens, R. & van Gunsteren, W. F. (1985). A protein structure from nuclear magnetic resonance data, lac repressor headpiece. J. Mol. Biol. 182, 179–182.Google Scholar
First citation Levitt, M., Hirshberg, M., Sharon, R. & Daggett, V. (1995). Potential energy function and parameters for simulations of the molecular dynamics of proteins and nucleic acids in solution. Comput. Phys. Commun. 91, 215–231.Google Scholar
First citation Ryckaert, J.-P., Ciccotti, G. & Berendsen, H. J. C. (1977). Numerical integration of the Cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J. Comput. Phys. 23, 327–341.Google Scholar
First citation Scheek, R. M., Torda, A. E., Kemmink, J. & van Gunsteren, W. F. (1991). Structure determination by NMR: the modelling of NMR parameters as ensemble averages. In Computational aspects of the study of biological macromolecules by nuclear magnetic resonance spectroscopy, edited by J. C. Hoch, F. M. Poulsen & C. Redfield, NATO ASI Series, Vol. A225, pp. 209–217. New York: Plenum Press.Google Scholar
First citation Schiffer, C. A., Gros, P. & van Gunsteren, W. F. (1995). Time-averaging crystallographic refinement: possibilities and limitations using α-cyclodextrin as a test system. Acta Cryst. D51, 85–92.Google Scholar
First citation Schiffer, C. A., Huber, R., Wüthrich, K. & van Gunsteren, W. F. (1994). Simultaneous refinement of the structure of BPTI against NMR data measured in solution and X-ray diffraction data measured in single crystals. J. Mol. Biol. 241, 588–599.Google Scholar
First citation Shi, Y.-Y., Yun, R.-H. & van Gunsteren, W. F. (1988). Molecular dynamics simulation of despentapeptide insulin in a crystalline environment. J. Mol. Biol. 200, 571–577.Google Scholar
First citation Smith, P. E. & van Gunsteren, W. F. (1995). Reaction field effects on the simulated properties of liquid water. Mol. Simul. 15, 233–245.Google Scholar
First citation Torda, A. E., Brunne, R. M., Huber, T., Kessler, H. & van Gunsteren, W. F. (1993). Structure refinement using time-averaged J-coupling constant restraints. J. Biomol. NMR, 3, 55–66.Google Scholar
First citation Torda, A. E., Scheek, R. M. & van Gunsteren, W. F. (1990). Time-averaged nuclear Overhauser effect distance restraints applied to tendamistat. J. Mol. Biol. 214, 223–235.Google Scholar
First citation Vijay-Kumar, S., Bugg, C. E. & Cook, W. J. (1987). Structure of ubiquitin refined at 1.8 Å resolution. J. Mol. Biol. 194, 531–544.Google Scholar








































to end of page
to top of page