International Tables for Crystallography
Volume F
Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F, ch. 23.4, pp. 625–627

Section Water distribution around the individual amino-acid residues in protein structures

C. Mattos (a)* and D. Ringe (b)

(a) Department of Molecular and Structural Biochemistry, North Carolina State University, 128 Polk Hall, Raleigh, NC 02795, USA, and (b) Rosenstiel Basic Medical Sciences Research Center, Brandeis University, 415 South St, Waltham, MA 02254, USA

The most comprehensive study of water molecules at the local level of binding to the individual types of amino-acid residues in protein structures was published in a series of papers (Thanki et al., 1988, 1990, 1991; Walshaw & Goodfellow, 1993). The initial database consisted of 16 protein structures solved to better than 1.7 Å resolution and refined to an R factor of 26% or better (Thanki et al., 1988). It was subsequently increased to 24 proteins using the same selection criteria (Thanki et al., 1990, 1991; Walshaw & Goodfellow, 1993). All equivalent side chains, as well as the carbonyl and amide groups present in the database, were brought to a common reference frame constructed from previously established bond lengths and bond angles (Momany et al., 1975). The distribution of water molecules interacting with each of the 20 types of side chains was studied by focusing on particular atoms: water molecules within 3.5 Å of polar N and O side-chain or main-chain atoms, or within 5.0 Å of apolar side-chain carbon atoms, were transformed accordingly into the reference frame.
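
The selection and superposition step described above can be sketched roughly as follows. This is only an illustration, not the procedure used by Thanki et al.: the 3.5 Å and 5.0 Å cutoffs come from the text, but the three-atom local frame, the function names (`waters_near`, `local_frame`, `to_frame`) and the bare coordinate tuples are assumptions made for the example. The original work used a reference frame built from standard bond lengths and angles.

```python
import math

# Cutoffs quoted in the text (Å): polar N/O atoms and apolar side-chain carbons.
CUTOFFS = {"N": 3.5, "O": 3.5, "C": 5.0}

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    n = math.sqrt(dot(a, a))
    return tuple(x / n for x in a)

def waters_near(atom_xyz, element, waters):
    """Keep water positions within the element-specific cutoff of one atom."""
    limit = CUTOFFS[element]
    return [w for w in waters if math.dist(atom_xyz, w) <= limit]

def local_frame(origin, on_x, in_plane):
    """Hypothetical frame: x along origin->on_x, z normal to the three-atom plane."""
    ex = unit(sub(on_x, origin))
    ez = unit(cross(ex, sub(in_plane, origin)))
    ey = cross(ez, ex)
    return ex, ey, ez

def to_frame(point, origin, frame):
    """Express a point in the local frame so equivalent groups can be overlaid."""
    d = sub(point, origin)
    return tuple(dot(d, e) for e in frame)
```

Applying `to_frame` to every selected water of every equivalent group, each with its own `local_frame`, would pile all water sites onto one copy of the group, which is the kind of overlay shown in the stereo figure.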

The stereo figure below (panels a–j) shows the results of these superpositions for the polar main-chain amido and carbonyl groups as well as for some representative side chains: Ser, Tyr, Asp, Asn, Arg, His, Trp and Ala. The overall results show that, despite the complex protein architecture, water molecules interact with hydroxyl, carbonyl and amide moieties, as well as with the sp3-hybridized and ring nitrogen atoms, as expected from their known stereochemical requirements (Baker & Hubbard, 1984). Thus, there are water clusters in positions that optimize interaction with the lone-pair electrons on oxygen atoms and with the hydrogen atoms of amide and hydroxyl groups. Panels (a) and (b) show the distribution of water molecules around the main-chain carbonyl oxygen and amido nitrogen atoms, respectively. The stereochemical requirements mentioned above are satisfied, with the distribution around the carbonyl oxygen clustered in two distinct regions peaking at an O–O distance of 2.7 Å. In contrast, there is a single water cluster interacting with the nitrogen, in line with the N–H bond at an N–O distance of about 2.9 Å. This cluster is much tighter than that seen for the interactions with oxygen, reflecting a greater flexibility of water interaction with the carbonyl oxygen relative to the amido-group nitrogen atom.


Figure

Distribution of water-molecule sites in stereo around: (a) main-chain O, (b) main-chain N, (c) Ser OG, (d) Tyr ring, (e) Asp OD1 and OD2, (f) Asn OD1 and ND2, (g) Arg NH1, NH2 and NE, (h) His ring to 3.5 Å, (i) Trp ring to 3.5 Å, (j) Ala CB. Reprinted with permission from Thanki et al. (1988). Copyright (1988) Academic Press.

Ser and Thr residues present a wide distribution of water molecules around the hydroxyl groups, presumably because the side chain rotates freely. Panel (c) of the stereo figure shows the water-molecule distribution around Ser, which is only slightly different from that for Thr and can be taken as representative of both. In contrast, the Tyr hydroxyl group is involved in resonance stabilization with the aromatic ring and, consequently, water molecules are clustered in the plane of the ring in well defined positions (panel d).

Panel (e) of the stereo figure shows the clustering of water molecules around the Asp side chain into four distinct groups, corresponding to the four available lone pairs of electrons. The distribution around Glu is similar. Most water molecules interact with a single carboxyl oxygen, although about 11% (for Asp) and 15% (for Glu) of water molecules around these side chains interact with both oxygen atoms of a single carboxyl group. Water molecules that interact with Asn and Gln also show four clusters, with the two clusters around the carbonyl group (C=O) less distinct than those around the amido (NH2) group. Panel (f) shows the distribution of water-molecule sites around Asn. In the case of Gln, the difference in water clustering around the carbonyl and amido groups is much less pronounced, possibly because of greater uncertainty in placing this longer side chain in the correct orientation. About 6% of the water molecules that interact with Asn or Gln are involved in hydrogen bonding to both the carbonyl oxygen and the amido nitrogen atoms.
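
Percentages such as the shared-water fractions quoted above could be obtained with a count along the following lines. This is a minimal sketch under a stated assumption: a water is taken to "interact" with an atom when it lies within the 3.5 Å polar cutoff, which may be looser than the hydrogen-bond criteria used in the original papers. The function names and the coordinate dictionary are hypothetical.

```python
import math

CUTOFF = 3.5  # Å, polar-atom cutoff quoted in the text (assumed contact criterion)

def contacts(water, atoms, cutoff=CUTOFF):
    """Names of the atoms lying within `cutoff` of a water position."""
    return {name for name, xyz in atoms.items()
            if math.dist(water, xyz) <= cutoff}

def shared_fraction(waters, atoms, pair):
    """Fraction of interacting waters that contact both atoms named in `pair`."""
    interacting = [w for w in waters if contacts(w, atoms)]
    if not interacting:
        return 0.0
    shared = [w for w in interacting if set(pair) <= contacts(w, atoms)]
    return len(shared) / len(interacting)
```

For an Asp carboxyl group, `atoms` would hold the OD1 and OD2 positions and `pair` would be `("OD1", "OD2")`. The same function applies to OD1/ND2 of Asn or to pairs of Arg nitrogen atoms.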

Water molecules around the planar guanidyl group of Arg cluster distinctly around the Nε atom and on either side of the NH1 and NH2 atoms, as shown in panel (g) of the stereo figure. The clusters peak at a distance of about 3.0 Å from the nitrogen atoms. Of these water molecules, 7% are shared between NH1 and NH2, and only 3% are shared between the Nε and NH1 atoms. The distribution around the Lys side chain is much broader and is qualitatively similar to the one shown for Ser in panel (c), with no particular orientational preferences, mainly because of the freely rotating nature of the Cε–Nζ bond.

His and Trp are the two residues that contain ring nitrogen atoms, which are the main sites of interaction with water molecules for these side chains. The distributions of water molecules within 3.5 Å of these residues are shown in panels (h) and (i) of the stereo figure. The clustering around His shows a peak at 2.7 Å and a larger peak at 3.1 Å. The closer peak corresponds to interactions with the deprotonated nitrogen (Nδ), where the lone pair of electrons renders the deprotonated nitrogen more negatively charged than the corresponding protonated nitrogen (Nε) and, therefore, the deprotonated nitrogen pulls the water molecule closer. The peak at 3.1 Å is due to water interactions with the protonated nitrogen (Nε) of His. There is a strong preference for the water molecules to lie in the plane of the ring. Relatively few water molecules are found within 3.5 Å of Trp. They mostly cluster around the Nε nitrogen at varying distances. When the cutoff is extended to 5.0 Å from the ring, the number of water molecules interacting with His and Trp increases greatly and peaks at a distance of about 4 Å, as discussed below for hydrophobic residues in general (Walshaw & Goodfellow, 1993).
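
Peak distances like those quoted for His can be read from a histogram of water-to-atom distances. The sketch below is illustrative only: the 0.2 Å bin width, the function names and any input distances are assumptions, and the original papers may have binned or smoothed their data differently. Distances are converted to integer hundredths of an ångström before binning to avoid floating-point edge effects.

```python
def distance_histogram(distances, bin_centi=20, max_centi=350):
    """Count distances (Å) in bins of `bin_centi` hundredths of an Å, up to 3.5 Å."""
    n_bins = max_centi // bin_centi + 1
    counts = [0] * n_bins
    for d in distances:
        c = round(d * 100)  # distance in integer hundredths of an Å
        if 0 <= c <= max_centi:
            counts[c // bin_centi] += 1
    return counts

def peak_bins(counts, bin_centi=20):
    """Lower edges (Å) of bins that are strictly higher than both neighbours."""
    peaks = []
    for i, n in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i + 1 < len(counts) else 0
        if n > left and n > right:
            peaks.append(i * bin_centi / 100)
    return peaks
```

With this binning, a population centred near 2.7 Å falls in the bin starting at 2.6 Å and one centred near 3.1 Å in the bin starting at 3.0 Å, so two separated populations would appear as two peak bins.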

Overall, there seem to be weaker geometric constraints on oxygen acceptors than on nitrogen donors. Furthermore, the water interaction with oxygen atoms peaks at a distance of about 2.8 Å, while the interactions with protonated nitrogen atoms occur at a somewhat longer distance of about 3.1 Å. This is possibly due to the larger van der Waals radius of nitrogen (1.8 Å) versus that of oxygen (1.7 Å) (Thanki et al., 1988). A more recent study of hydration around polar residues is based on seven proteins solved to better than 1.4 Å resolution (Roe & Teeter, 1993). The authors used cluster analysis to derive a predictive algorithm to locate water sites around polar side chains on protein surfaces, given the atomic coordinates of the protein alone. These more precise results confirm the general conclusions outlined above. The authors find that the water–oxygen distance is shorter than the water–nitrogen distance by 0.07 Å and suggest that the difference is due to a van der Waals radius of 1.5 Å for nitrogen and 1.4 Å for oxygen (Roe & Teeter, 1993). Although the two groups cite different atomic radii for nitrogen and oxygen, this does not affect the statistical analysis of the data. Roe & Teeter (1993) also find that the clusters associated with nitrogen atoms are approximately twice as dense as those around oxygen atoms.

The analysis of the local water structure around the apolar side chains Ala, Val, Leu, Ile and Phe was extended to a distance of 5.0 Å from the atom of interest, since these residues show only a few water molecules within the 3.5 Å cutoff used to analyse interactions with polar residues. The most noticeable observations from the analysis of apolar side chains are the water peak at a distance of 4 Å from the carbon atoms of interest and the presence of a polar protein atom within hydrogen-bonding distance for 75% of these water molecules (Walshaw & Goodfellow, 1993). Phe prefers in-plane interactions and has peaks corresponding to the directions of the Cε1, Cε2, Cδ1 and Cδ2 atoms from the centre of the ring. Otherwise, any clustering observed for water molecules near apolar side chains is due to interactions with polar protein atoms and, consequently, is modulated by secondary structure.

A study of protein hydration based on atomic and residue hydrophilicity presents general results consistent with those discussed above, but also adds information that can be correlated with various experimentally and computationally derived hydrophilicity–hydrophobicity scales (Kuhn et al., 1995). The authors used 10 837 water molecules found in 56 high-resolution protein crystal structures to obtain the average number of hydrations per occurrence for each amino-acid type and for specific atom types. The hydration of the various amino-acid residues has already been discussed above. The atomic hydrophilicity values calculated for the different protein-atom types are of interest. The atomic-hydration figure and the table of specific hydrophilicity values below show that, regardless of where these atoms are found, neutral oxygen atoms exhibit the greatest hydration level per occurrence, closely followed by negatively charged oxygen atoms, which in turn are followed by positively charged nitrogens and neutral nitrogens, in that order. Carbon and sulfur atoms are indistinguishable in terms of hydration per occurrence and are grouped together as the least hydrated atoms (Kuhn et al., 1995).

Table
Specific hydrophilicity values for protein atoms

Atom type            Hydrations per occurrence
Neutral oxygen       0.53
Negative oxygen      0.51
Positive nitrogen    0.44
Neutral nitrogen     0.35
Carbon, sulfur       0.08

The average number of hydrations per occurrence was calculated over all atoms within each group.
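
The quantity tabulated above is a simple ratio, which can be sketched as follows. The five values in `TABLE` are copied from the table. The helper `hydrations_per_occurrence` and its count dictionaries are hypothetical, since the underlying hydration and occurrence counts of Kuhn et al. (1995) are not given here.

```python
# Values copied from the table of specific hydrophilicity values.
TABLE = {
    "Neutral oxygen": 0.53,
    "Negative oxygen": 0.51,
    "Positive nitrogen": 0.44,
    "Neutral nitrogen": 0.35,
    "Carbon, sulfur": 0.08,
}

def hydrations_per_occurrence(hydrations, occurrences):
    """Average hydrations per atom occurrence for each group (hypothetical inputs).

    `hydrations[g]` is the total number of water contacts counted for group g and
    `occurrences[g]` is the number of atoms of that group in the data set.
    """
    return {g: hydrations[g] / occurrences[g]
            for g in occurrences if occurrences[g] > 0}

def ranked(values):
    """Group names ordered from most to least hydrated."""
    return sorted(values, key=values.get, reverse=True)
```

Calling `ranked(TABLE)` reproduces the ordering stated in the text: neutral oxygen, negative oxygen, positive nitrogen, neutral nitrogen, then carbon and sulfur.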

Figure

Distribution of atomic hydration values. To determine which atoms are similar or distinct with respect to water binding, we plotted the number of atom types (e.g. Ala amide nitrogen, Ala Cα, …) at each hydration per occurrence value. Each atom type contributed one vertical unit to the graph. Oxygen atoms were the most hydrated (top graph), with negatively charged oxygen (black bars) slightly less hydrated on average than neutral oxygen (grey bars). Nitrogens (middle graph) were the next most hydrated, overlapping the oxygen distribution, and positively charged nitrogens (black bars) were somewhat more hydrated than neutral nitrogens (grey bars). Proline's amide nitrogen, with no hydrogen-bonding capacity, had the lowest nitrogen hydration value (leftmost bar). Carbon and sulfur atoms (bottom graph; note change of y-axis scale) were the least hydrated, with sulfur values at 0.05 and 0.15 hydrations per occurrence. Reproduced from Kuhn et al. (1995). Copyright (1995) Wiley-Liss, Inc. Reprinted by permission of Wiley-Liss, Inc., a division of John Wiley & Sons, Inc.


Baker, E. N. & Hubbard, R. E. (1984). Hydrogen bonding in globular proteins. Prog. Biophys. Mol. Biol. 44, 97–179.
Kuhn, L. A., Swanson, C. A., Pique, M. E., Tainer, J. A. & Getzoff, E. D. (1995). Atomic and residue hydrophilicity in the context of folded protein structures. Proteins Struct. Funct. Genet. 23, 536–547.
Momany, F. A., McGuire, R. F., Burgess, A. W. & Scheraga, H. A. (1975). Energy parameters in polypeptides. VII. Geometric parameters, partial atomic charges, nonbonded interactions, hydrogen bond interactions, and intrinsic torsional potentials for the naturally occurring amino acids. J. Phys. Chem. 79, 2361–2381.
Roe, S. M. & Teeter, M. M. (1993). Patterns for prediction of hydration around polar residues in proteins. J. Mol. Biol. 229, 419–427.
Thanki, N., Thornton, J. M. & Goodfellow, J. M. (1988). Distribution of water around amino acid residues in proteins. J. Mol. Biol. 202, 637–657.
Thanki, N., Thornton, J. M. & Goodfellow, J. M. (1990). Influence of secondary structure on the hydration of serine, threonine and tyrosine residues in proteins. Protein Eng. 3, 495–508.
Thanki, N., Umrania, Y., Thornton, J. M. & Goodfellow, J. M. (1991). Analysis of protein main-chain solvation as a function of secondary structure. J. Mol. Biol. 221, 669–691.
Walshaw, J. & Goodfellow, J. M. (1993). Distribution of solvent molecules around apolar side-chains in protein crystals. J. Mol. Biol. 231, 392–414.
