International Tables for Crystallography, Volume F: Crystallography of biological macromolecules. Edited by M. G. Rossmann and E. Arnold. © International Union of Crystallography 2006
International Tables for Crystallography (2006). Vol. F. ch. 23.3, pp. 588-622
https://doi.org/10.1107/97809553602060000716
Chapter 23.3. Nucleic acids
Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095–1570, USA
This chapter is dedicated to Irving Geis, who died on 22 July 1997 at the age of 88, just as the chapter was begun. Irv was a pioneer in the representation of protein and DNA structures, beginning with illustrations for Scientific American articles on myoglobin (Kendrew, 1961
This chapter covers the advances of our knowledge of nucleic acid duplexes, primarily from single-crystal X-ray diffraction, and the biological implications of this new knowledge. The focus is primarily on DNA because much more is known about it, but DNA/RNA hybrids and duplex RNA are also considered. Because the emphasis is on the geometry of the nucleic acid double helix, exotic structures, such as quadruplexes, hammerhead ribozymes and aptamers, are omitted, as are larger-scale structures such as tRNA.
Keywords: A-tract bending; B-DNA; base pairing; DNA; glycosyl bond geometry; helix parameters; nucleic acids; sugar ring conformations; Watson–Crick Z-DNA.
In 1953, James Watson and Francis Crick solved the structure of double-helical DNA (Watson & Crick, 1953; Crick & Watson, 1954). So what has a dedicated cadre of X-ray crystallographers been doing for the subsequent 45 years? That is the subject of this chapter: the advance of our knowledge of nucleic acid duplexes, primarily from single-crystal X-ray diffraction, and the biological implications of this new knowledge. The focus will be primarily on DNA because much more is known about it, but DNA/RNA hybrids and duplex RNA will also be considered. Because the emphasis is on the geometry of the nucleic acid double helix, exotic structures, such as quadruplexes, hammerhead ribozymes and aptamers, will be omitted, as will larger-scale structures such as tRNA.
Fibre diffraction showed that there were two basic forms of DNA duplex: the common B form and a more highly crystalline A form (Fig. 23.3.1.1) that, in some but not all sequences, could be produced by dehydrating the fibre (Franklin & Gosling, 1953; Langridge et al., 1960; Arnott, 1970; Leslie et al., 1980). A- and B-DNA are contrasted in Figs. 23.3.1.2 and 23.3.1.3. The high-humidity B form has base pairs sitting squarely on the helix axis and roughly perpendicular to that axis. In contrast, in the low-humidity A form, the base pairs are displaced off the helix axis by ca 4 Å and are inclined 10–20° away from perpendicularity to that axis. The two grooves in B-DNA are of comparable depth because base pairs sit on the helix axis, but the major groove is wider than the minor because of asymmetry of attachment of base pairs to the backbone chains. In A-DNA, the minor groove is broad and shallow, whereas the major groove is cavernously deep (all the way from the surface of the helix, to the helix axis, and beyond) but can be quite narrow.
Pohl and co-workers had shown in the 1970s that alternating poly(dC-dG) is special in that it undergoes a reversible salt- or alcohol-induced conformation change (Pohl & Jovin, 1972; Pohl, 1976). Hence, it was not surprising that when DNA synthesis methods advanced to the stage where oligonucleotide crystallization became feasible, two separate research groups – those of Alexander Rich at MIT and Richard Dickerson at Caltech – elected to synthesize, crystallize and solve a short, alternating C-G oligomer. The result was a third family of DNA duplexes, Z-DNA (Fig. 23.3.1.4), first as the hexamer C-G-C-G-C-G (Z1) and then the tetramer C-G-C-G (Z3). (References to A-, B- and Z-DNA structures are listed at the end of Tables A23.3.1.1, A23.3.1.2 and A23.3.1.3 in the Appendix, respectively. They are cited by numbers beginning with A, B or Z.) Single-crystal analyses of the traditional helix types soon followed: B-DNA as C-G-C-G-A-A-T-T-C-G-C-G (B1), and A-DNA as both C-C-G-G (A1) and G-G-T-A-T-A-C-C (A2).
Before making detailed comparisons of the three helix types, one must define the parameters by which the helices are characterized. The fundamental feature of all varieties of nucleic acid double helices is two antiparallel sugar–phosphate backbone chains, bridged by paired bases like rungs in a ladder (Fig. 23.3.2.1). Using the convention that the positive direction of a backbone chain is from 5′ to 3′ within a nucleotide, the right-hand chain in Fig. 23.3.2.1 runs downward, while the left-hand chain runs upward. A- or B-DNA is then obtained by twisting the ladder into a right-handed helix. But Z-DNA cannot be obtained from Fig. 23.3.2.1 simply by giving it a left-handed twist; both backbone chains run in the wrong direction for Z-DNA. A more complex adjustment is required, and this will be addressed again later.
The conformation of the backbone chain along each nucleotide is described by six torsion angles, labelled α through ζ, as shown in Fig. 23.3.2.2. An earlier convention termed these same six angles as ω, φ, ψ, ψ′, φ′, ω′ (Sundaralingam, 1975), but the alphabetical nomenclature is now generally employed. Torsion angles are defined in Fig. 23.3.2.3, which also shows three common configurations: gauche− (−60°), trans (180°) and gauche+ (+60°). These three configurations are especially favoured with sp³ hybridization or tetrahedral ligand geometry at the two ends of the bond in question, because their `staggered' arrangement minimizes ligand–ligand interactions across the bond. An `eclipsed' arrangement with ligands at −120°, 0° (cis), and 120° is unfavourable because it brings substituents at the two ends of the bond into opposition. Table 23.3.2.1 lists the mean values and standard deviations of all six main-chain torsion angles for A-, B- and Z-DNA, as recently observed in 96 oligonucleotide crystal structures (Schneider et al., 1997).
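The torsion-angle definition can be made concrete with a short calculation. The following Python sketch is not part of the original chapter: it computes the torsion angle about the bond B–C for four atoms A, B, C, D and reports the nearest of the configurations named above. It assumes the usual sign convention (positive when the far bond C–D is rotated clockwise from the near bond B–A as viewed along B→C), which should be checked against Fig. 23.3.2.3. The function names and the test coordinates are illustrative only.

```python
import math

def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def torsion_angle(a, b, c, d):
    """Torsion angle A-B-C-D in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1 = _cross(b1, b2)          # normal to the plane A-B-C
    n2 = _cross(b2, b3)          # normal to the plane B-C-D
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), b2) / math.sqrt(_dot(b2, b2))
    return math.degrees(math.atan2(y, x))

# Reference angles named in the text (degrees).
NAMED = [(-60.0, "gauche-"), (180.0, "trans"), (60.0, "gauche+"),
         (0.0, "cis (eclipsed)"), (-120.0, "eclipsed"), (120.0, "eclipsed")]

def nearest_configuration(angle_deg):
    """Name of the reference configuration closest to angle_deg."""
    def circular_gap(p, q):
        diff = abs(p - q) % 360.0
        return min(diff, 360.0 - diff)
    return min(NAMED, key=lambda item: circular_gap(angle_deg, item[0]))[1]

if __name__ == "__main__":
    # Hypothetical coordinates: B-C lies along z, A points along +x.
    a, b, c = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
    d = (0.5, math.sqrt(3) / 2, 1.0)
    angle = torsion_angle(a, b, c, d)
    print(round(angle, 1), nearest_configuration(angle))
```

In practice the four points would be atomic coordinates taken from a structure file, for example C5′, C4′, C3′ and O3′ for the angle δ.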
The type of ligand–ligand clash just mentioned is an important element in ensuring that five-membered rings, such as ribose and deoxyribose, are not ordinarily planar, even though the internal bond angle of a regular pentagon, 108°, is close to the 109.5° of tetrahedral geometry. A stable compromise is for one of the five ring atoms to lie out of the plane defined by the other four, as in Fig. 23.3.2.4. This is termed an `envelope' or E conformation, by analogy with a four-cornered envelope having a flap at an angle. Intermediate `twist' or T forms are also possible, in which two adjacent atoms sit on either side of the plane defined by the other three, but this discussion will focus on the simple envelope conformations. In most cases, the accuracy of a nucleic acid crystal structure determination is such that it would be difficult to distinguish clearly between a given E form and its flanking T forms. For this reason, most structure reports consider only the E alternatives.
A convenient and intuitive nomenclature is to name the conformation after the out-of-plane atom and then specify whether it is out of plane on the same side as the C5′ atom (endo) or the opposite side (exo). Ten such conformations exist: five endo and five exo. In Fig. 23.3.2.4 (top), pushing the C3′ atom of the C3′-endo conformation into the plane of the ring would tend to push C2′ below the ring, passing through a T state and creating a C2′-exo conformation. C2′ can, in turn, be returned to the ring plane if C1′ is pushed above the ring, forming C1′-endo, and so on, around the ring. In this way, a contiguous series of alternating endo/exo conformations is produced, as listed in Table 23.3.2.2.
This ten-conformation endo/exo cycle can be generalized to a continuous distribution of intermediate conformations, characterized by a pseudorotation angle, P (Altona et al., 1968; Altona & Sundaralingam, 1972), with the ten endo/exo conformations spaced 36° apart (Table 23.3.2.2). Fig. 23.3.2.5 shows the calculated potential energy of conformations around the pseudorotation cycle (Levitt & Warshel, 1978). Note that C2′-endo and C3′-endo are most stable, that the pathway between them along the right half of the circle remains one of low energy, but that a large 6 kcal mol⁻¹ potential energy barrier (1 kcal mol⁻¹ = 4.184 kJ mol⁻¹) effectively forbids conformations around the left half of the circle.
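The 36° spacing of the ten envelope forms lends itself to a simple lookup. Table 23.3.2.2 is not reproduced in this text, so the sketch below assumes the widely used phase convention in which C3′-endo sits at P = 18°; those specific values, the function name and the use of O4′ for the ring oxygen (called O1′ elsewhere in this chapter) are assumptions made for illustration. The ordering agrees with the sequence described above, in which C3′-endo is followed by C2′-exo and then C1′-endo when moving around the ring.

```python
# Assumed pseudorotation phase angles (degrees) for the ten envelope forms,
# spaced 36 degrees apart with C3'-endo at P = 18.
ENVELOPES = [
    (18.0, "C3'-endo"), (54.0, "C4'-exo"), (90.0, "O4'-endo"),
    (126.0, "C1'-exo"), (162.0, "C2'-endo"), (198.0, "C3'-exo"),
    (234.0, "C4'-endo"), (270.0, "O4'-exo"), (306.0, "C1'-endo"),
    (342.0, "C2'-exo"),
]

def nearest_envelope(p_deg):
    """Return the envelope name whose phase angle is closest to p_deg."""
    p = p_deg % 360.0

    def circular_gap(q):
        diff = abs(p - q) % 360.0
        return min(diff, 360.0 - diff)

    return min(ENVELOPES, key=lambda item: circular_gap(item[0]))[1]

if __name__ == "__main__":
    for p in (18.0, 130.0, 160.0, -20.0):
        print(p, nearest_envelope(p))
```

Because most structure reports consider only the E alternatives, rounding an observed P to the nearest envelope in this way mirrors common reporting practice, although it discards the distinction between an E form and its flanking T forms.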
As Fig. 23.3.2.4 indicates, the main-chain torsion angle, δ, is sensitive to ring conformation, because the C5′—C4′ and C3′—O3′ bonds that define the angle shift as ring puckering changes. The idealized relationship between torsion angle, δ, and pseudorotation angle, P (Saenger, 1984), is
δ = 40° cos(P + 144°) + 120°.
Fig. 23.3.2.6 shows the observed torsion angles, δ, and pseudorotation angles, P, from X-ray crystal structure analyses of synthetic DNA oligonucleotides: 296 examples from A-DNA and 280 from B-DNA. The most striking aspect of this plot is the radically different behaviour of A- and B-DNA. The prototypical sugar conformation for A-DNA obtained from fibre diffraction modelling, C3′-endo, is, in fact, adhered to quite closely in A-DNA crystal structures.
However, B-DNA shows a quite different behaviour. Although earlier fibre diffraction led one to expect C2′-endo sugars, the actual experimental distribution is quite broad, extending up the right-hand side of the pseudorotation circle of Fig. 23.3.2.5, through C1′-exo, O1′-endo and C4′-exo, in some cases all the way to C3′-endo itself. Indeed, the mean value of δ observed in B-DNA oligomer crystal structures is 128° rather than 144° (Table 23.3.2.1), making C1′-exo a better description of sugar conformation in B-DNA than C2′-endo. Old habits die hard, however, and the B-DNA sugar conformation is still colloquially termed C2′-endo, a designation of historical significance but of little practical value. The apparent greater malleability of the B helix compared to A may indeed be one feature that makes B-DNA particularly suitable for expressing its base sequence to drugs and control proteins via local helix structure changes.
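The argument that a mean δ of 128° points to C1′-exo rather than C2′-endo can be checked numerically. The sketch below assumes the idealized relationship δ = 40° cos(P + 144°) + 120°, as commonly attributed to Saenger (1984), together with phase angles of 18°, 54°, 90°, 126° and 162° for the five envelope forms along the low-energy right half of the pseudorotation circle. Both the formula and these P values are assumptions stated for illustration rather than data taken from Table 23.3.2.1 or Table 23.3.2.2.

```python
import math

def delta_from_p(p_deg):
    """Idealized main-chain torsion angle delta (degrees) for phase angle P.

    Assumed form: delta = 40 * cos(P + 144 deg) + 120 deg.
    """
    return 40.0 * math.cos(math.radians(p_deg + 144.0)) + 120.0

# Assumed phase angles for the low-energy pathway from C3'-endo to C2'-endo.
RIGHT_HALF = {
    "C3'-endo": 18.0,
    "C4'-exo": 54.0,
    "O4'-endo": 90.0,
    "C1'-exo": 126.0,
    "C2'-endo": 162.0,
}

def closest_pucker(delta_deg):
    """Envelope on the low-energy pathway whose ideal delta is nearest."""
    return min(RIGHT_HALF,
               key=lambda name: abs(delta_from_p(RIGHT_HALF[name]) - delta_deg))

if __name__ == "__main__":
    for name, p in RIGHT_HALF.items():
        print(f"{name:9s} P = {p:5.1f}  delta = {delta_from_p(p):5.1f}")
    print("mean B-DNA delta of 128 is closest to", closest_pucker(128.0))
```

Under these assumptions the ideal δ is about 144° for C2′-endo and 120° for C1′-exo, so an observed mean of 128° lies nearer to C1′-exo, in line with the conclusion drawn in the text.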
The key to the biological role of DNA is that each of the two purines can pair with only one of the pyrimidines: A with T, and G with C. Hence, genetic information present in one strand is passed on to the complementary strand. The two standard base pairs are shown in Fig. 23.3.2.7 along with the conventional numbering of the atoms. Backbone sugar and phosphate atoms are primed while base atoms are unprimed, as, for example, C1′ and N9 at opposite ends of a purine glycosidic bond. The G·C base pair is held together by three hydrogen bonds, whereas an A·T pair has only two. This means that A·T pairs show less resistance to propeller twisting (counter-rotation of the two bases about their common long axis), and this will have an effect on minor groove width, as seen later. The patterns of hydrogen-bond acceptors (A) and donors (D) on the major and minor groove edges of base pairs are important elements in recognition of base sequence by drugs and control proteins.
[Figure caption: A·T and G·C base pairs with the minor groove edge below and the major groove edge above. A is a hydrogen-bond acceptor; D is a hydrogen-bond donor.]
Other related but nonstandard base pairs are compared in Fig. 23.3.2.8. Inosine (I) is useful in studying properties of DNA in that, when paired with cytosine (C), it creates a G·C-family base pair having overall similarity to A·T. Similarly, diaminopurine (DAP) [also known as 2-aminoadenine (2aA)], when paired with thymine (T), creates a G·C-like pair from A·T-family bases. Hence, in a given experimental situation, one can unscramble the relative significance of number of hydrogen bonds versus identity and location of exocyclic groups.
The conventional Watson–Crick base pairing of Fig. 23.3.2.7 uses the six-membered-ring `end' of the purine base. A different type of base pairing was proposed many years ago by Hoogsteen (1963), in which the upper edge of the purine was used: N7 and N6/O6. Hoogsteen base pairing is shown between the left-hand two bases in each part of Fig. 23.3.2.9. Note that in Hoogsteen base pairing of A and T, each ring provides both a hydrogen-bond donor and an acceptor. Guanine cannot do this, since both its N7 and O6 positions are acceptors. As a consequence, in a G·C pair, C must supply both of the hydrogen-bond donors. It can only form a Hoogsteen base pair with G when the cytosine ring is protonated. This would lead one to expect triplex formation only at low pH. However, the stability of a triplex can, to a certain extent, alter the pKa of the N—H proton itself. (Recall the shift in pKa of buried Asp and His groups in the active sites of enzymes.) Hence, with a single-chain DNA, G-A-G-A-G-A-A-C-C-C-C-T-T-C-T-C-T-C-T-T-T-C-T-C-T-C-T-T, that folds back upon itself twice to build a triplex, NMR experiments indicate a significant amount of triplex remaining even at pH 8.0 (Sklenár & Feigon, 1990; Feigon, 1996).
An important advantage of single-crystal oligonucleotide structures over fibre-based models is that one can actually observe local sequence-based departures from ideal helix geometry. B-DNA fibre models indicated a mean twist of ca 36° per step, or ten base pairs per turn, whereas A-DNA fibre patterns indicated less winding: ca 33° per step or 11 base pairs per turn. Twist, rise per base pair along the helix axis, horizontal displacement of base pairs off that axis, and inclination of base pairs away from perpendicularity to the axis are all intuitively obvious parameters. But when single-crystal structures began appearing in great numbers in the mid-1980s, it became imperative that uniform names and definitions be used for these and for less obvious, but increasingly significant, local helix parameters.
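The conversion between mean twist per step and base pairs per turn quoted above is simple arithmetic, shown here as a short Python sketch. The function names are illustrative, and the list of local twist values at the end is hypothetical, chosen only to show how a mean is formed from step-by-step values of the kind that single-crystal structures provide.

```python
def base_pairs_per_turn(mean_twist_deg):
    """Number of base pairs in one 360-degree turn for a given mean twist."""
    if mean_twist_deg == 0:
        raise ValueError("a twist of zero does not define a helix")
    return 360.0 / abs(mean_twist_deg)

def mean_twist(local_twists_deg):
    """Arithmetic mean of a sequence of local twist angles (degrees)."""
    return sum(local_twists_deg) / len(local_twists_deg)

if __name__ == "__main__":
    print(base_pairs_per_turn(36.0))             # B-DNA fibre value: 10.0
    print(round(base_pairs_per_turn(33.0), 1))   # A-DNA fibre value: 10.9
    # Hypothetical local twists for a short stretch of helix.
    steps = [38.0, 30.0, 41.0, 35.0]
    print(mean_twist(steps), base_pairs_per_turn(mean_twist(steps)))
```

A twist of 33° gives about 10.9 base pairs per turn, which is the basis for the rounded figure of 11 quoted for A-DNA.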
An EMBO workshop on DNA curvature and bending, held at Churchill College, Cambridge, in September 1988, led to an agreement on definitions and conventions that was published simultaneously in four journals (Dickerson et al., 1989). Fig. 23.3.2.10 shows the reference frames for two successive base pairs, and Figs. 23.3.2.11 and 23.3.2.12 illustrate local helix parameters involving rotation and translation, respectively. Subsequent experience has shown the most useful parameters to be inclination, propeller, twist and roll among the rotations, and x displacement, rise and slide among the translations. As mentioned at the beginning of this chapter, inclination and x displacement are the two properties that best differentiate A- from B-DNA. The four most widely used computer programs for calculation of local helix parameters are NEWHELIX by Dickerson (B7, B46), CURVES by Lavery & Sklenar (1988, 1989), BABCOCK by Babcock & Olson (Babcock et al., 1993, 1994; Babcock & Olson, 1994) and FREEHELIX (Dickerson, 1998c). NEWHELIX was the earliest of these, but it performs all calculations relative to a best overall helix axis. This is satisfactory for single-crystal DNA structures, but makes the program unusable for the 180° bending observed in some protein–DNA complexes. CURVES is especially convenient for mapping the axis of a bent or curved helix. FREEHELIX, which evolved from NEWHELIX, calculates all parameters relative to local base-pair geometry, without assuming an overall axis, and permits display of normal vector plots that are especially useful in analysing bending in DNA–protein complexes (Dickerson & Chiu, 1997).
The glycosyl bond angle, χ, about the bond connecting a sugar ring to a base is a special case of torsion angle, and is defined by O4′—C1′—N1—C2 for pyrimidines and O4′—C1′—N9—C4 for purines. In A- and B-DNA, the normal range of χ is 160 to 300°. This is known as the anti conformation (right-hand side of Fig. 23.3.2.13) and swings the sugar ring out away from the minor groove edge of the base pair. In Z-DNA, pyrimidines also exhibit the anti glycosyl bond conformation, but purines adopt the syn geometry shown on the left-hand side of Fig. 23.3.2.13. Now the sugar ring is rotated so that it intrudes into the minor groove, and χ lies in the range 50 to 90°.
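The two ranges of χ quoted above translate directly into a classification rule. The Python sketch below uses only those ranges (160 to 300° for anti, 50 to 90° for syn); the function name, the label `intermediate' for values outside both ranges, and the reduction of negative angles to the interval 0 to 360° are choices made here for illustration.

```python
def glycosyl_class(chi_deg):
    """Classify a glycosyl torsion angle using the ranges quoted in the text."""
    chi = chi_deg % 360.0          # e.g. -100 degrees is treated as 260 degrees
    if 160.0 <= chi <= 300.0:
        return "anti"
    if 50.0 <= chi <= 90.0:
        return "syn"
    return "intermediate"

if __name__ == "__main__":
    for chi in (-100.0, 70.0, 200.0, 120.0):
        print(chi, glycosyl_class(chi))
```

Applied along one strand of a Z-DNA oligomer, such a rule would be expected to return anti at pyrimidines and syn at purines, whereas every nucleotide of A- or B-DNA would be expected to fall in the anti range.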
Figs. 23.3.3.1–23.3.3.3 show the original stereo pairs that were re-drawn by Irving Geis in preparing Figs. 23.3.1.2–23.3.1.4. These stereo pairs were constructed from X-ray structures of A-, B- and Z-DNA oligomers by deleting the outermost base pair from each end, eliminating the backbone as far as the first phosphate group, and then stacking these trimmed-down helices on top of one another, with phosphate groups overlapping, to create an infinite helix. They are improvements over the idealized infinite helices generated from fibre diffraction in that they display local variation in helix parameters that only single-crystal analyses can reveal. In the present context, they are good subjects for discussion of the differences between the three helix types.
A-DNA (Wahl & Sundaralingam, 1996, 1998), B-DNA (Berman, 1996; Dickerson, 1998b) and Z-DNA (Ho & Mooers, 1996; Basham et al., 1998) have each been the subject of recent reviews, to which the reader is referred for details that cannot be covered here. The distinctive properties of the three helices are listed in Table 23.3.3.1. The most obvious distinction is handedness: A and B are right-handed helices, whereas Z is left-handed. Moreover, the position of each base pair relative to the helix axis is quite different. As noted in Fig. 23.3.2.13, the helix axis passes through base pairs in B-DNA, lies on the minor groove side of base pairs in Z-DNA, and on the major groove side in A-DNA. In terms of the helix parameters of Fig. 23.3.2.12, A-DNA has a typical x displacement of dx = +3 to +5 Å, B-DNA has dx = −1 to 0 Å, and Z-DNA has dx = −3 to −4 Å. There is virtually no overlap between these three ranges; x displacement, dx, in fact, is a better criterion for differentiating the three classes of helix than is sugar ring conformation.
†Relative 5′-to-3′ directions of the two backbone chains, when viewed into the minor groove.
A direct consequence of these x displacement values is great differences in depths of major and minor grooves. Both grooves are of equivalent depth in B-DNA because base pairs sit on the helix axis. In A-DNA, a base pair is pushed off-axis so that its minor edge approaches the helix surface, making the minor groove very shallow and the major groove cavernously deep. In Z-DNA, it is the major edge of each base pair that is pushed toward the surface, so that the minor groove is deep and the major groove is so shallow as hardly to be characterized as a groove at all. It is sometimes stated that `Z-DNA has no major groove', but space-filling stereos, such as Fig. 1 of reference Z6 or Fig. 3 of Z23, reveal the shallowest of major grooves running around the helix cylinder, flanked by very slightly higher phosphate backbones.
In both A- and B-DNA, all glycosidic bonds are anti, with sugar rings swung to either side away from the minor groove, as in Fig. 23.3.3.4(a). As mentioned earlier, when viewed into the minor groove, the backbone chains describe a clockwise rotation, with the chain on the right running downward, and that on the left upward, as in Fig. 23.3.2.1. In Z-DNA, both chains run in the opposite direction, leading to a counterclockwise rotation sense viewed into the minor groove. But Z-DNA has yet another striking (and defining) feature. Purines and pyrimidines alternate along each chain. G and C are most strongly favoured by far, but A and T can substitute intermittently at a price in stability. Breaking the strict alternation of purines and pyrimidines is even more unfavourable and is rarely encountered in crystal structures (Table A23.3.1.3). At each purine base, the glycosyl bond is rotated into the minor groove to the syn position, as in Fig. 23.3.3.4(c). This causes the local backbone directions, defined by sugar ring atoms C4′ and C3′, to be parallel in the two strands. Z-DNA avoids becoming a parallel-chain helix by performing a local chain reversal at each pyrimidine. In Fig. 23.3.3.4(c), although the local C4′–C3′ chain direction at the cytosine sugar is downward, the double loop in backbone chain gives it a net upward orientation. In stereo Fig. 23.3.3.3, the ascending backbone chain rises smoothly past each guanine, with a chain path parallel to the helix axis. However, the chain bends abruptly at right angles when passing a cytosine, in a direction tangential to the helix cylinder. Guanine sugar rings point their O4′ oxygen atoms in the backward chain direction (as is also true for all bases in A- and B-DNA), but cytosine sugars point their oxygens in the forward direction. This `up at G, across at C' pathway and inversion of sugar rings is what produces the zigzag backbone pathway that leads to the name Z-DNA. The O4′ atom of each cytosine sugar is stacked on top of the guanine ring of the subsequent nucleotide, and this stacking of a polar O (or N) on top of a polarizable aromatic ring contributes to the stability of the Z helix, as it does to many other base–base interactions to be discussed later (Bugg et al., 1971; Thomas et al., 1982; B32).
Sugar ring conformations in A- and B-DNA have a logical structural basis. The B-DNA backbone is more extended than the A-DNA backbone, with P–P distances of ca 6.6 Å along one chain, compared with ca 5.5 Å in A-DNA. In turn, C2′-endo is a more extended ring conformation than C3′-endo, demonstrable in Fig. 23.3.2.4 by a greater distance between C5′ and O3′ atoms. Hence, it is logical that the more extended ring conformation should be associated with the more extended backbone chain. In Z-DNA, the extended C2′-endo form is adopted at cytosine, where a zigzag double chain reversal must be accommodated, while the more compact C3′-endo occurs at the straight backbone segment running past a guanine.
The cramped syn glycosyl conformation is strongly disfavoured, although not absolutely forbidden, at pyrimidines, most probably because of steric clash between the pyrimidine O2 and the sugar ring (Haschemeyer & Rich, 1967; Davies, 1978; Ho & Mooers, 1996; Basham et al., 1998). Hence, the Z-DNA helix is effectively limited to alternating pyrimidine/purine sequences, with a price that must be paid for intermittent substitution of A and T for G and C, and an even higher price paid for breaking the pyrimidine/purine alternation. This is reflected in the X-ray crystal structures listed in Table A23.3.1.3. Only one non-alternating sequence has been completely solved and published: *C-G-G-G-*C-G (Z40), where adoption of the Z form has been forced by 5-methylation of cytosines (*C). A second non-alternating sequence that includes AT base pairs, *C-G-A-T-*C-G (Z13), was solved in 1985, but its coordinates have never been made public. It, too, required methylation of cytosines to induce the Z form. A third sequence, C-C-G-C-G-G (Z42), opens its terminal base pairs to make intermolecular base pairs with crystal neighbours. The 52 remaining Z-DNA structures in Table A23.3.1.3 all have strict alternation of pyrimidines and purines.
The helical repeat unit in Z-DNA is therefore two successive base pairs, rather than the single base pair of A- and B-DNA. Ho & Mooers (1996) propose that the C-G or 5′pyrimidine-P-purine3′ step be considered the fundamental unit of the Z-helical structure, because of the tight overlap between the two base pairs. As can be seen in Fig. 23.3.3.3, in a C-G step the pyrimidine rings from the two base pairs actually stack over one another, whereas the purine rings are packed against neighbouring sugar O4′ atoms. Helix-axis rotation at this step is only −8°, whereas the preceding and following G-C steps have a mammoth −52° twist. Hence, although Z-DNA has 12 base pairs per turn, it technically is not a dodecamer helix, but a hexamer with a two-base-pair repeating unit and a total rotation of −60° per unit.
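The bookkeeping behind the statement that Z-DNA is a hexamer of two-base-pair units can be written out explicitly. The Python sketch below uses only the two step twists quoted above (−8° at C-G steps and −52° at G-C steps); the function names are illustrative, and the model of a perfectly regular alternation is an idealization.

```python
from itertools import cycle, islice

CG_STEP_TWIST = -8.0    # degrees, C-G (pyrimidine-purine) step
GC_STEP_TWIST = -52.0   # degrees, G-C (purine-pyrimidine) step

def repeat_unit_twist():
    """Total rotation of one two-base-pair repeating unit (degrees)."""
    return CG_STEP_TWIST + GC_STEP_TWIST

def units_per_turn():
    """Number of two-base-pair units in one full 360-degree turn."""
    return 360.0 / abs(repeat_unit_twist())

def cumulative_twist(n_steps):
    """Summed twist over n_steps alternating steps, starting with C-G."""
    steps = islice(cycle((CG_STEP_TWIST, GC_STEP_TWIST)), n_steps)
    return sum(steps)

if __name__ == "__main__":
    print(repeat_unit_twist())        # -60.0
    print(units_per_turn())           # 6.0
    print(2 * units_per_turn())       # 12.0 base pairs per turn
    print(cumulative_twist(12))       # -360.0, one full left-handed turn
```

The negative sign throughout records the left-handed sense of the helix; the magnitudes give six repeating units, and hence 12 base pairs, per turn.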
This virtual restriction to purine/pyrimidine alternation means that Z-DNA cannot be involved in the coding of genetic information. A and B helices have no such restriction; their structures can accommodate a random sequence of bases. Average twist angles are as shown in Table 23.3.3.1, although extreme variation in twist is observed at individual steps in single-crystal structure analyses, from as little as 16° to as much as 55°. Base-sequence preferences for local helix parameters are discussed below.
In both B and Z helices, base pairs are very nearly perpendicular to the helix axis, whereas in the winding double ribbon of A-DNA, the long axis of each base pair is inclined by 10 to 20° away from perpendicularity to the axis. Hence, the rise per base pair for all B-helical steps and for G-C steps of Z-DNA is equal to the thickness of a base pair, 3.4 Å. The rise at a C-G step of Z-DNA is larger because it involves stacking of a sugar oxygen on each purine ring, not ring stacked on ring. For A-DNA, the rise along the helix axis can actually be less than the thickness of a base pair, because adjacent base pairs are stacked at an incline. The perpendicular distance from one base pair to the next in A-DNA is still 3.4 Å. Both A- and B-DNA exhibit considerable base pair propeller twist, especially at A·T pairs with only two hydrogen bonds rather than three. In contrast, Z-DNA, with predominantly G·C pairs, shows only a small propeller twist.
The stacking of base pairs has immediate consequences for crystal growth. For Z-DNA, four base pairs are one-third of a helical turn, and six base pairs are a half turn. Hexamers are the most common crystal form in Table A23.3.1.3 by a large majority. In contrast, octamers and decamers are not simple fractions of a turn, and they stack in a disordered manner. One would predict that dodecamers of Z-DNA might crystallize well if the oligomers were not so long as to fall prey to cylindrical disorder.
By the same principles, B-DNA decamers stack easily and well to build pseudo-infinite helices through the crystal, with ordered cylindrical rods packed in six different space groups. The other common crystallization mode for B-DNA, the dodecamer, has a two-base-pair overlap of ends that both stabilizes the crystals and yields a functional ten-base-pair repeat. (See Fig. 2 of Dickerson et al., 1987.) Because the dodecamers are held by their outer two base pairs, the central eight pairs are unobstructed and accessible in the crystal, making dodecamers particularly good subjects for the study of minor-groove binding drugs.
A-RNA duplexes [Table A23.3.1.1, part (k)] also stack end-for-end in a manner simulating an infinite A helix, even though the end base pairs are inclined and are not perpendicular to the helix axis. This behaviour has been seen for octamers with roughly two-thirds of a helical turn, for nonamers, and for dodecamers with roughly a full turn.
In contrast, crystals of A-DNA behave quite differently. Regardless of chain length, A-DNA helices crystallize with the outer base pair of one helix packed against one wall of the broad, open and relatively hydrophobic minor groove of another helix. This packing mode is sufficiently adaptable to accommodate duplexes of lengths four, six, eight, nine, ten and 12 base pairs. Hence, A-DNA does not simulate infinite helices through the crystal lattice, as A-RNA and B- and Z-DNA do.
So far this discussion has only been concerned with DNA. Which of the three helix types can be adopted by RNA? Fig. 23.3.3.5 shows that addition of a 2′-OH group to a B-DNA helix [part (a)] creates severe steric clash with the phosphate group and sugar ring of the following nucleotide, whereas in an A helix [part (b)], the added hydroxyl group extends radially outward from the helix cylinder and causes no steric problems. Hence, the natural helical form for RNA is the A helix, not the B helix. Table A23.3.1.1 shows several single-crystal analyses of A-RNA and RNA/DNA hybrids; Table A23.3.1.2 shows no B-RNA structures. One RNA/DNA hybrid is known as a Z helix: C-G-c-g-C-G (Z24), in which the two central nucleotides are RNA. If one mentally adds an —OH to each C2′ atom in Fig. 23.3.3.3, on the same side of the ring as O3′, it is apparent that the C2′-OH is not inherently incompatible with the Z helix, as it is with the B helix. At guanine sugars, the C2′-OH points out and away from the helix, while at cytosine sugars it points away from the base into the spacious minor groove.
The B helix is the biologically relevant structure for DNA. The A form might logically be adopted at the stage of transient DNA–RNA duplexes during transcription, but elsewhere the B form holds sway. It was once thought that binding of DNA to a protein surface, most particularly nucleosomal winding, might constitute a sufficient dehydration of bound water molecules from the DNA duplex to shift it to the A form. This proved to be false; nucleosomal DNA clearly retains the B conformation. The closest that one comes to biological A-DNA is local deformations upon binding of B-DNA to a few proteins that have been described as `A-like distortions'. On the other hand, the A helix has been found repeatedly in RNA duplexes, including tRNA and ribozymes.
The situation is even more restrictive with the Z helix. Although its alternating purine/pyrimidine sequence makes it unusable for genetic coding, the suggestion has been made on many occasions that Z-DNA might be an important element in genetic control by being involved in negative supercoiling (Herbert & Rich, 1996). It has been shown that a left-handed DNA conformation can be induced by negative superhelical stress, but it is not absolutely clear that this induced, left-handed conformation is the same as the Z helix seen in crystal structures of small oligomers. As noted by Herbert & Rich (1996), after nearly twenty years of enquiry, it is still far from certain that Z-DNA itself has any demonstrable biological role.
A major stumbling block is the cumbersome mechanism that must be invoked to explain a B-to-Z interconversion. As mentioned previously, a simple twisting of the helix from right to left is not sufficient, because the backbone chains run in opposite directions in the two forms. Fig. 23.3.3.6 demonstrates the steps that must still be undertaken after both B and Z helices have been unwound so as to remove all of their helical character. Note the opposite sense of the backbone strands in B [part (a)] and Z [part (e)]. In order to accomplish the interconversion, base pairs of B-DNA must be pulled apart, as in part (b), and each base pair swung around to the opposite side of the backbone `ladder' [part (c)]. This would automatically lead to syn conformations at both ends of the base pair, as drawn in Fig. 23.3.3.4(b). Returning pyrimidines to an anti conformation would create the zigzag backbone chain (Fig. 23.3.3.4c). Base pairs can then be re-stacked, as in parts (d) and (e) in Fig. 23.3.3.6 (which differ only by rotation of the entire helix about the vertical), to yield the backbone geometry of a Z helix. This is the simplest interconversion and one which was recognized and proposed in the very first Z-DNA structure paper (Z1). Other alternatives have been suggested, involving breaking individual base pairs, swinging the bases independently around their backbone chains, and re-forming the pairs. But one kind of special mechanism or another must be invoked if a B-to-Z interconversion is to be achieved.
Ansevin & Wang (1990) have proposed an alternative left-handed double helix, with many of the properties of Z-DNA, but possessing the same backbone chain orientations as A- and B-DNA. With such a helix, a B-to-Z conversion would require only a twisting of the duplex about its axis – no separation of bases or unpairing, and no pulling apart of the stack. Ansevin & Wang did not challenge the X-ray crystal structure analyses of short Z-DNA oligomers. Instead, they suggested that Z-DNA was globally the most stable form, adopted in short oligomers where chain unravelling and rearrangement is easy, but that their `Watson–Crick' Z-DNA or Z(WC)-DNA was the structure that was actually produced by in vitro or in vivo manipulations of long DNA duplexes. They noted that most solution measurements focus on only two characteristics of the DNA: left-handedness and a dinucleotide repeat, both shared by Z-DNA and Z(WC)-DNA.
The Z(WC) helix is shown in Fig. 23.3.3.7, and a different stereo view appears as Fig. 7 of Dickerson (1992). Like Z-DNA, it is left-handed, with a deep minor groove and shallow major groove. Cytosines with anti glycosyl bonds and guanines with syn bonds alternate along each backbone strand. However, sugar puckering is reversed: cytosines are C3′-endo, while guanines are C2′-endo. In Z-DNA, the backbone chain runs parallel to the helix axis past G, and at right angles to the axis past C. In Z(WC)-DNA, this is reversed: parallel to the helix past C, and at right angles past G. Because of efficient stacking of base pairs, the logical two-base-pair structural unit in Z-DNA is 5′C–G3′; in Z(WC)-DNA it is 5′G–C3′. One such unit is clearly visible in the centre of Fig. 23.3.3.7. This behaviour is reflected in local twist angles:
The Ansevin–Wang helix has been sedulously ignored since its publication in 1990, especially by crystallographers. The Science Citation Index lists an average of one citation of their paper per year since publication, most commonly by spectroscopists. Ho & Mooers (1996) are almost alone among crystallographers in coupling the B-to-Z interconversion dilemma to the possible existence of a different kind of left-handed structure in long polynucleotides. Of course the Z(WC)-DNA structure, as presented here, is only a model; it could be far from the true structure in many respects. But its interest lies in the fact that a left-handed alternating helix with `standard' backbone directions can be built with reasonable bond geometries and with properties that fit the various physical measurements as well as Z-DNA. It calls into question not the correctness of the Z-DNA structure obtained from short oligomers with free helix ends, but the relevance of that structure to the production of left-handed regions in longer duplexes with constrained ends.
Two channels of information exist in B-DNA by which base sequence is expressed to the outside world. One of these is the Watson–Crick base pairing of A with T and G with C that is used in the storage of genetic information and in replication and transcription. The other channel, used in control and regulation of the expression of this genetic information, involves the hydrogen-bonding patterns of base-pair edges along the floors of the grooves and any systematic deformations of local helix structure that result explicitly from the base sequence.
The simplest and most direct expression of this second channel is the passive reading of hydrogen-bonding patterns along the floor of the major and minor grooves. This readout mechanism was first proposed by Seeman et al. (1976), and involves acceptors and donors as marked by A and D in Fig. 23.3.2.7. The wide major groove of B-DNA is read by several classes of control proteins that function by positioning an α-helix within the groove so that its amino-acid side chains can sense the pattern of hydrogen bonding. This category includes prokaryotic and eukaryotic helix-turn-helix or HTH proteins, zinc-finger and other zinc-binding proteins, basic leucine zippers and their basic helix-loop-helix cousins, and others (see Table I of Dickerson & Chiu, 1997). The narrower minor groove is a frequent target for long, planar drug molecules, such as netropsin and distamycin, as listed in Part II of Table A23.3.1.2.
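The difference in information content between the two grooves can be illustrated with a short Python sketch. The pattern strings below are the commonly tabulated textbook assignments, not values read from Fig. 23.3.2.7, and the M (thymine methyl) and H (nonpolar hydrogen) labels are an assumption that goes beyond the A and D marks of that figure; the dictionary and function names are hypothetical.

```python
# Groove-floor patterns read across the groove from the first base of each
# pair to the second (assumed textbook assignments, not taken from the figure).
# A = hydrogen-bond acceptor, D = donor, M = thymine methyl, H = nonpolar H.
MAJOR = {"AT": "ADAM", "TA": "MADA", "GC": "AADH", "CG": "HDAA"}
MINOR = {"AT": "AHA", "TA": "AHA", "GC": "ADA", "CG": "ADA"}

def groups_by_pattern(patterns):
    """Group base pairs that present an identical pattern to a reader."""
    groups = {}
    for pair, pattern in patterns.items():
        groups.setdefault(pattern, []).append(pair)
    return sorted(sorted(members) for members in groups.values())

major_groups = groups_by_pattern(MAJOR)  # one group per base pair
minor_groups = groups_by_pattern(MINOR)  # A.T and T.A merge, as do G.C and C.G
```

Under these assumed patterns, the major groove separates all four base pairs, whereas the minor groove separates only AT-type from GC-type pairs.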
In principle, this readout mechanism would work perfectly well with a regular, ideal, fibre-like B-DNA helix. But other control proteins that recognize the minor groove, such as TATA-binding protein (TBP) and integration host factor (IHF), depend not merely on passive hydrogen bonding to an ideally regular duplex, but on the sequence-dependent deformability of one region of the helix versus another. The remainder of this chapter will be concerned with this effect and its role in DNA recognition.
The simplest and first-noticed sequence-dependent deformability of the B-DNA duplex was variation in minor groove width. The first B-DNA oligomer to be solved, C-G-C-G-A-A-T-T-C-G-C-G (B1–B6), had a narrow minor groove in the central A-A-T-T region, with only ca 3.5 Å of free space between opposing phosphates and sugar rings. (It has become conventional to define the free space between phosphates as the measured minimal P–P separation across the groove, less 5.8 Å to represent two phosphate-group radii. Similarly, the measured distance between sugar oxygens is decreased by 2.8 Å, representing two oxygen van der Waals radii.) The C-G-C-G ends of the helix had the 6–7 Å opening expected for ideal B-DNA, but the situation was clouded, because the outermost two base pairs at each end of the helix interlocked minor grooves with neighbours in the crystal. Hence, the wider ends could possibly be only an artifact of crystal packing.
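The free-space convention described in parentheses above is a simple subtraction, sketched here in Python. The function names and the example separations are illustrative assumptions, not measurements from a particular structure.

```python
# Offsets quoted in the text, in angstroms: two phosphate-group radii and
# two oxygen van der Waals radii.
PHOSPHATE_OFFSET = 5.8
OXYGEN_OFFSET = 2.8

def free_space_from_phosphates(pp_separation):
    """Free groove space from the minimal cross-groove P-P separation."""
    return pp_separation - PHOSPHATE_OFFSET

def free_space_from_sugar_oxygens(oo_separation):
    """Free groove space from the cross-groove sugar-oxygen separation."""
    return oo_separation - OXYGEN_OFFSET

# Hypothetical separations chosen only to illustrate the arithmetic.
narrow = free_space_from_phosphates(9.3)   # about 3.5 A of free space
wide = free_space_from_phosphates(12.8)    # about 7.0 A of free space
```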
After 1991, the situation was clarified by the structures of several decamers [Table A23.3.1.2, Part I(c)], which stack on top of one another without the interlocking of grooves. The normal minor groove opening is ca 7 Å. Regions of four or more AT base pairs can exhibit a significantly narrowed minor groove, although such narrowing is not mandatory. This behaviour is seen with the B-DNA decamer, C-A-A-A-G-A-A-A-A-G, in Fig. 23.3.4.1. The narrowing arises mainly from the larger allowable propeller twist in AT base pairs, which displaces C1′ atoms at opposite ends of the pair in different directions, and moves the backbone chains in such a way as to partially close the groove (Fig. 23.3.4.2).
This is an excellent example of the concept of sequence-dependent helix deformability, rather than simple deformation. The two hydrogen bonds of an AT base pair allow a larger propeller twist but do not require it. Hence, AT regions of helix permit a narrowing of the minor groove but do not demand it. Indeed, this lesson was brought home in the most dramatic way when Pelton & Wemmer (1989, 1990) showed via NMR that a 2:1 complex of distamycin with C-G-C-A-A-A-T-T-G-G-C or C-G-C-A-A-A-T-T-T-G-C-G could exist, in which two drug molecules sat side-by-side within an enlarged central minor groove. Fig. 23.3.4.3 shows a narrow minor groove with a single netropsin molecule, and Fig. 23.3.4.4 shows a wide minor groove enclosing two di-imidazole lexitropsins side-by-side. In summary, an AT-rich region of minor groove is capable of narrowing but is not inevitably narrow, in contrast to GC-rich regions where the third hydrogen bond tends to keep the base pairs flat and the minor groove wide. The AT minor groove is potentially deformable without being inevitably deformed.
Sequence-dependent bendability has been reviewed recently by Dickerson (1998a,b,c) and Dickerson & Chiu (1997). The relative bendability of different regions of B-DNA sequence is an important aspect of recognition, one that is used by countless control proteins that must bind to a particular region of double helix. Catabolite activator protein or CAP (Schultz et al., 1991; Parkinson et al., 1996), lacI (Lewis et al., 1996) and purR (Schumacher et al., 1994) repressors, γδ-resolvase (Yang & Steitz, 1995), EcoRV restriction enzyme (Winkler et al., 1993; Kostrewa & Winkler, 1995), integration host factor or IHF (Rice et al., 1996), and TBP or TATA-binding protein (Kim, Geiger et al., 1993; Kim, Nikolov & Burley, 1993; Nikolov et al., 1996; Juo et al., 1996) are all sequence-specific DNA-binding proteins that bend or deform the nucleic acid duplex severely during the recognition process. IHF in Fig. 23.3.4.5 may be taken as representative of this class of DNA-binding proteins. The bend is produced by two localized rolls of ca 60° in a direction compressing the major groove; the two rolls are additive because they are spaced nine base pairs, or roughly one turn of helix, apart. In IHF, the two helix segments flanking the bend should be straight and unbent, and this is accomplished in one segment via a six-adenine A-tract: -C-A-A-A-A-A-A-G-.
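Why two roll kinks roughly one helical turn apart reinforce each other can be checked with a deliberately idealized rigid-body model. The Python sketch below assumes a uniform twist of 36° per step (10 base pairs per turn), zero tilt, and a step transformation built as half-twist, roll, half-twist; these choices and all function names are assumptions made for illustration, and the resulting angles are properties of the model, not of the IHF crystal structure.

```python
import math

def rot_z(deg):
    """Rotation about the local helix axis (twist)."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(deg):
    """Rotation about the base-pair long axis (roll)."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def net_bend(rolls, twist=36.0):
    """Angle in degrees between the first and last base-pair normals.

    rolls: one roll angle (degrees) per base-pair step.
    twist: uniform twist per step (assumed idealized value).
    """
    frame = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for roll in rolls:
        step = matmul(rot_z(twist / 2), matmul(rot_y(roll), rot_z(twist / 2)))
        frame = matmul(frame, step)
    cos_angle = max(-1.0, min(1.0, frame[2][2]))
    return math.degrees(math.acos(cos_angle))

def kinked(n_steps, kink_steps, roll=60.0):
    """Straight helix except for roll kinks at the given step indices."""
    return [roll if i in kink_steps else 0.0 for i in range(n_steps)]

one_kink = net_bend(kinked(20, {5}))
nine_apart = net_bend(kinked(20, {5, 14}))  # roughly one turn apart
five_apart = net_bend(kinked(20, {5, 10}))  # half a turn apart
```

In this model, kinks of the same sign nine steps apart give a net bend well above that of a single kink, while the same two kinks placed half a turn apart cancel almost exactly.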
The bending locus in IHF is C-A-A-T/A-T-T-G. It is C-G in lacI and purR repressors (Fig. 23.3.4.6), C-A = T-G in CAP (Fig. 10 of Dickerson, 1998b), and T-A in EcoRV, γδ-resolvase and TBP (Fig. 23.3.4.7). Pyrimidine-purine or Y-R steps appear to be especially suitable loci for roll bending. The dashed lines in Figs. 23.3.4.6 and 23.3.4.7 plot tilt, and demonstrate its insignificance in bending, compared with roll. (This is intuitively obvious. Imagine yourself standing near a tall stack of wooden planks in a lumberyard during an earthquake. Where would you prefer to stand: alongside the stack, or at one end?)
In summary, bending of the B-DNA helix nearly always involves roll, not tilt. The easier direction of bending is that which compresses the broad major groove, although examples of roll compression of the minor groove are known. Y-R steps are especially prone to roll bending. Again, the phenomenon is one of sequence-induced bendability, not mandatory bending. No one imagines that the IHF binding sequence of Fig. 23.3.4.5 is permanently kinked at its two C-A-A-T/A-T-T-G steps, wandering deformed through the nucleus, looking for an IHF molecule to bind to. Instead, this sequence has a potential bendability that other sequences, such as A-A-A-A-A-A, lack.
Table 23.3.4.1 summarizes the observed behaviour of Y-R, R-R and R-Y steps from a great many X-ray crystal structure analyses of DNA, with and without bound protein. In the present context, these rules are termed the `Major Canon', since they are well established and generally well understood. Some understanding of the proneness of Y-R steps to bend can be obtained by looking at stereo pairs of two successive base pairs viewed down the helix axis. Fig. 23.3.4.8 gives a few representative examples; many more can be found in Figs. 4–6 of Dickerson (1998b) and in the original literature. In brief, Y-R steps, especially C-A and T-A, tend to orient so that polar exocyclic N and O atoms stack against polarizable rings of the other base pair. This is the same type of polar-on-polarizable stacking stabilization mentioned earlier in connection with O4′ and guanine in Z-DNA (Bugg et al., 1971; Thomas et al., 1982; Hunter & Sanders, 1990; B32). Base pairs in T-A steps tend not to slide over one another along their long axes, keeping pyrimidine O2 stacked over the purine five-membered ring (Fig. 23.3.4.8b). C-A steps can adopt this same stacking, or the base pairs can slide until the pyrimidine O2 sits over the purine six-membered ring instead (Fig. 23.3.4.8a).
Purine-purine or R-R steps behave quite differently (Fig. 23.3.4.8c). They stack ring-on-ring, usually with greater overlap on the purine end than the pyrimidine. The net effect is that the pivot appears to pass through or near the purines, while pyrimidines at the other end of the pairs stack O2-on-ring as with Y-R steps. R-Y steps tend to stack ring-on-ring, with little contribution from exocyclic atoms.
El Hassan & Calladine (1997) have recently examined roll, slide and twist behaviour at 400 different steps observed in crystal structures of 24 A- and 36 B-DNA oligomers. The author has carried out a similar analysis of 1137 steps from 86 sequence-specific protein–DNA complexes (Dickerson, 1998a,c; Dickerson & Chiu, 1997). A striking feature is that trends in local parameters are just the same in DNA crystals and in protein–DNA complexes. The frequently invoked nightmare of `crystal packing deformations' appears to be of only minor significance. In both studies (El Hassan & Calladine, 1997; Dickerson, 1998b), roll versus slide, slide versus twist and twist versus roll plots are presented for all ten possible base-pair steps. Fig. 23.3.4.9 illustrates roll versus slide plots for two Y-R, two R-R and two R-Y steps.
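The count of ten distinct base-pair steps, and their division into Y-R, R-R and R-Y classes, follows from the fact that a step and its reverse complement describe the same pair of stacked base pairs. A minimal Python sketch of that bookkeeping follows; the function names and the choice of the alphabetically first name as the canonical label are illustrative assumptions.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
PURINES = {"A", "G"}

def reverse_complement(step):
    return "".join(COMPLEMENT[base] for base in reversed(step))

def canonical(step):
    """One label for a step and its reverse complement (the same physical step)."""
    return min(step, reverse_complement(step))

def step_class(step):
    """Classify a step read 5' to 3' along one strand as Y-R, R-R or R-Y."""
    first, second = ("R" if base in PURINES else "Y" for base in step)
    if first == second:
        return "R-R"  # a Y-Y step is an R-R step read on the opposite strand
    return first + "-" + second

# Sixteen dinucleotides collapse to ten distinct base-pair steps.
unique_steps = sorted({canonical(a + b) for a, b in product("ACGT", repeat=2)})

by_class = {}
for step in unique_steps:
    by_class.setdefault(step_class(step), []).append(step)
```

The result is three Y-R steps (CA, CG, TA), three R-Y steps (AC, AT, GC) and four R-R steps (AA, AG, CC, GA), where CC stands for its R-R equivalent GG on the other strand.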
Table 23.3.4.2 summarizes observations from these roll/slide/twist plots. These are labelled the `Minor Canon' since they are recent, approximate and not well understood. However, they provide goals for future investigations of helix behaviour.
It has long been known that introduction of short A-tracts into general-sequence B-DNA in phase with the natural 10–10.5 base-pair repeat produced overall curvature that could be detected via electrophoretic gel retardation, ring-cyclization kinetics and other physical measurements in solution (Marini et al., 1982; Wu & Crothers, 1984; Koo et al., 1986; Crothers & Drak, 1992). However, the microscopic source of the observed macroscopic curvature remained unclear. Solution measurements alone cannot discriminate between three alternative curvature models: (1) local bending within the A-tracts themselves; (2) bending at junctions between A-tract B-DNA and general-sequence B-DNA; or (3) inherently straight and unbent A-tracts, with curvature resulting from removal of the normal writhe expected in general-sequence B-DNA (Koo et al., 1990; Crothers et al., 1990). The three curvature models are compared schematically in Fig. 10 of reference B77.
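Locating A-tracts in a sequence is a simple string search. The Python sketch below assumes the working definition used in this chapter's summary, namely a run of four or more AT base pairs containing no T-A step; the function name, the hyphen-stripping and the zero-based offsets are illustrative choices.

```python
import re

def find_a_tracts(sequence, min_length=4):
    """Return (offset, tract) pairs for runs of A/T bases of at least
    min_length that contain no T-A step."""
    seq = sequence.upper().replace("-", "")
    tracts = []
    for run in re.finditer(r"[AT]+", seq):
        offset = run.start()
        # Cut each A/T run between T and A, so no piece spans a T-A step.
        # Splitting on a zero-width pattern needs Python 3.7 or later.
        for piece in re.split(r"(?<=T)(?=A)", run.group()):
            if len(piece) >= min_length:
                tracts.append((offset, piece))
            offset += len(piece)
    return tracts

dodecamer = find_a_tracts("C-G-C-G-A-A-T-T-C-G-C-G")
decamer = find_a_tracts("C-A-A-A-G-A-A-A-A-G")
```

For the dodecamer the central A-A-T-T is returned as a single tract because A-T is not a T-A step, while in the decamer only the four-adenine run qualifies and the three-adenine run is too short.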
X-ray crystallographic results for DNA oligomers come down unequivocally in favour of model (3) above. Short A-tracts of four to six base pairs are straight and unbent in C-G-C-G-[A-A-T-T]-C-G-C-G (B1–B6), C-G-C-[…]-G-C-G (B20), C-G-C-[…]-G-C-G (B31), C-G-C-[…]-G-C-G (B17, B52), C-G-C-G-[…]-G-C (B64) and C-A-A-A-G-[A-A-A-A]-G (B105) (A-tracts are shown in square brackets). It has been claimed (Sprous et al., 1995) and disputed (Dickerson et al., 1994, 1996) that the observed straightness of crystalline A-tracts was only an artifact of crystal packing, or of the high levels of methyl-2,4-pentanediol (MPD) used in the crystallization. This concern now is put to rest by the observation that B-DNA packed against a protein molecule in its biological working environment behaves exactly the same as B-DNA packed against other DNA molecules in the crystal, as borne out by the roll/slide/twist studies of El Hassan & Calladine (1997) for DNA and of Dickerson (1998a,b,c) and Dickerson & Chiu (1997) for protein–DNA complexes. Added support has come from recent molecular-dynamics simulations by Beveridge and co-workers (Sprous et al., 1999), who have demonstrated that the duplex of sequence GGGGGGAAAATTTT[CG]AAAATTTTCCCCCC is severely curved because of a roll kink at the bracketed central CG step, whereas the duplex GGGGGGTTT[TA]AAA[CG]TTT[TA]AAACCCCCC is much less curved because the roll kink at CG is counterbalanced by roll kinks in the opposite direction at the two flanking TA steps. In both cases, A-tracts are straight and completely unbent. (Note that both roll kinks can involve compression of the major groove, as expected, because the kink sites are a half turn of helix apart.)
This similarity of behaviour of DNA in crystals and in protein–DNA complexes should come as no surprise, since the local molecular environments – close intermolecular contacts, partial dehydration, low water activity, low local dielectric constant, high ionic strength, presence of divalent cations – are similar in these two cases and quite different from that of free DNA in dilute aqueous solution. Far from being unwanted `crystal deformations', the local changes in structure resulting from intermolecular contacts in DNA crystals provide positive information about sequence-dependent deformability that is relevant to the protein recognition process. With regard specifically to A-tract behaviour, Occam's Razor would argue in favour of model (3) above for the behaviour of A-tracts in solution. The situation in dilute aqueous solution becomes of secondary importance if what is wanted is an understanding of A-tract B-DNA behaviour in protein–DNA complexes. Here, the answer is unambiguous: A-tracts in their biological setting are inherently rigid structural elements, chosen by natural selection when bending should be avoided.
Three families of nucleic acid double helix have been found – A, B and Z – with widely different structures and usages. The A and B helices are right-handed and have no limitations on base sequence. Z is left-handed and effectively limited to alternating purines and pyrimidines, with G and C overwhelmingly favoured. B is the biologically significant helix for DNA and is used in genetic coding. A is the helix of preference for RNA because it can accommodate the C2′-OH group of ribose, which produces steric clash in the B helix. The Z helix has, as yet, no well established biological function. A left-handed DNA configuration can be induced in longer DNA segments by negative supercoiling in solution, but it is not clear that this left-handed configuration is identical to the Z-DNA seen in short crystalline oligomers, because of the reversed orientation of backbone strands in Z-DNA.
B-DNA is an inherently malleable or deformable duplex. Its sugar ring conformations are much more variable than those of A-DNA. The base sequence of B-DNA is expressed directly via hydrogen bonds between bases of a pair, and indirectly via hydrogen-bond donors and acceptors along the floor of the major and minor groove. Sequence is also expressed as a differential deformability of different regions of the duplex. The two most obvious parameters affected by base sequence are minor groove width and helix bendability. Certain sequences of B-DNA are not statically bent, but are more bendable under stress than are other sequences. Bending occurs via roll, usually in the direction that compresses the broad major groove. Pyrimidine-purine or Y-R steps are most conducive to roll bending, and purine-purine steps are least bendable, particularly A-tracts of four or more AT base pairs without the weak T-A step. Natural selection has engineered Y-R steps into a DNA sequence where a sharp roll bend is wanted, and short A-tracts into a sequence where bending is not desired.
Appendix A23.3.1
References