International Tables for Crystallography, Volume F: Crystallography of biological macromolecules. Edited by M. G. Rossmann and E. Arnold. © International Union of Crystallography 2006
International Tables for Crystallography (2006). Vol. F. ch. 23.3, pp. 602-609
Section 23.3.4. Sequence–structure relationships in B-DNA
Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095–1570, USA
Two channels of information exist in B-DNA by which base sequence is expressed to the outside world. One of these is the Watson–Crick base pairing of A with T and G with C that is used in the storage of genetic information and in replication and transcription. The other channel, used in control and regulation of the expression of this genetic information, involves the hydrogen-bonding patterns of base-pair edges along the floors of the grooves and any systematic deformations of local helix structure that result explicitly from the base sequence.
The simplest and most direct expression of this second channel is the passive reading of hydrogen-bonding patterns along the floors of the major and minor grooves. This readout mechanism was first proposed by Seeman et al. (1976), and involves acceptors and donors as marked by A and D in Fig. 23.3.2.7. The wide major groove of B-DNA is read by several classes of control proteins that function by positioning an α-helix within the groove so that its amino-acid side chains can sense the pattern of hydrogen bonding. This category includes prokaryotic and eukaryotic helix-turn-helix or HTH proteins, zinc-finger and other zinc-binding proteins, basic leucine zippers and their basic helix-loop-helix cousins, and others (see Table I of Dickerson & Chiu, 1997). The narrower minor groove is a frequent target for long, planar drug molecules, such as netropsin and distamycin, as listed in Part II of Table A23.3.1.2.
In principle, this readout mechanism would work perfectly well with a regular, ideal, fibre-like B-DNA helix. But other control proteins that recognize the minor groove, such as TATA-binding protein (TBP) and integration host factor (IHF), depend not merely on passive hydrogen bonding to an ideally regular duplex, but on the sequence-dependent deformability of one region of the helix versus another. The remainder of this chapter will be concerned with this effect and its role in DNA recognition.
The simplest and first-noticed sequence-dependent deformability of the B-DNA duplex was variation in minor groove width. The first B-DNA oligomer to be solved, C-G-C-G-A-A-T-T-C-G-C-G (B1–B6), had a narrow minor groove in the central A-A-T-T region, with only ca 3.5 Å of free space between opposing phosphates and sugar rings. (It has become conventional to define the free space between phosphates as the measured minimal P–P separation across the groove, less 5.8 Å to represent two phosphate-group radii. Similarly, the measured distance between sugar oxygens is decreased by 2.8 Å, representing two oxygen van der Waals radii.) The C-G-C-G ends of the helix had the 6–7 Å opening expected for ideal B-DNA, but the situation was clouded, because the outermost two base pairs at each end of the helix interlocked minor grooves with neighbours in the crystal. Hence, the wider ends could possibly be only an artifact of crystal packing.
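The free-space convention described above amounts to two subtractions, sketched here in Python. The function and constant names are illustrative assumptions, and the input distances in the usage lines are hypothetical values chosen only to reproduce a free space of about 3.5 Å; they are not measurements from any structure.

```python
# Offsets quoted in the text, in angstroms.
PHOSPHATE_OFFSET = 5.8      # two phosphate-group radii
SUGAR_OXYGEN_OFFSET = 2.8   # two oxygen van der Waals radii


def free_space_between_phosphates(min_pp_separation: float) -> float:
    """Free groove space from the minimal P-P separation across the groove."""
    return min_pp_separation - PHOSPHATE_OFFSET


def free_space_between_sugars(sugar_oxygen_distance: float) -> float:
    """Free groove space from the measured distance between sugar oxygens."""
    return sugar_oxygen_distance - SUGAR_OXYGEN_OFFSET


# Hypothetical input distances, for illustration only.
print(round(free_space_between_phosphates(9.3), 1))
print(round(free_space_between_sugars(6.3), 1))
```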
After 1991, the situation was clarified by the structures of several decamers [Table A23.3.1.2, Part I(c)], which stack on top of one another without the interlocking of grooves. The normal minor groove opening is ca 7 Å. Regions of four or more AT base pairs can exhibit a significantly narrowed minor groove, although such narrowing is not mandatory. This behaviour is seen with the B-DNA decamer, C-A-A-A-G-A-A-A-A-G, in Fig. 23.3.4.1. The narrowing arises mainly from the larger allowable propeller twist in AT base pairs, which displaces C1′ atoms at opposite ends of the pair in different directions, and moves the backbone chains in such a way as to partially close the groove (Fig. 23.3.4.2).
This is an excellent example of the concept of sequence-dependent helix deformability, rather than simple deformation. The two hydrogen bonds of an AT base pair allow a larger propeller twist but do not require it. Hence, AT regions of helix permit a narrowing of the minor groove but do not demand it. Indeed, this lesson was brought home in the most dramatic way when Pelton & Wemmer (1989, 1990) showed via NMR that a 2:1 complex of distamycin with C-G-C-A-A-A-T-T-G-G-C or C-G-C-A-A-A-T-T-T-G-C-G could exist, in which two drug molecules sat side-by-side within an enlarged central minor groove. Fig. 23.3.4.3 shows a narrow minor groove with a single netropsin molecule, and Fig. 23.3.4.4 shows a wide minor groove enclosing two di-imidazole lexitropsins side-by-side. In summary, an AT-rich region of minor groove is capable of narrowing but is not inevitably narrow, in contrast to GC-rich regions where the third hydrogen bond tends to keep the base pairs flat and the minor groove wide. The AT minor groove is potentially deformable without being inevitably deformed.
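As a rough illustration of the rule that regions of four or more AT base pairs are candidates for (but not guaranteed to show) minor groove narrowing, the following Python sketch lists such runs in a sequence. The function name and the 0-based start positions are assumptions made for this example; the sketch only flags where narrowing is permitted and says nothing about whether it occurs.

```python
import re


def at_runs(sequence: str, min_length: int = 4):
    """Return (start, run) for each stretch of at least min_length consecutive A/T bases.

    Hyphens are ignored; start positions are 0-based indices into the bare sequence.
    """
    seq = sequence.replace("-", "").upper()
    pattern = re.compile(r"[AT]{%d,}" % min_length)
    return [(m.start(), m.group()) for m in pattern.finditer(seq)]


print(at_runs("C-G-C-G-A-A-T-T-C-G-C-G"))  # [(4, 'AATT')]
print(at_runs("C-A-A-A-G-A-A-A-A-G"))      # [(5, 'AAAA')]
```

In the decamer, the three-adenine stretch near the 5′ end falls below the four-base-pair threshold and is not reported.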
Sequence-dependent bendability has been reviewed recently by Dickerson (1998a,b,c) and Dickerson & Chiu (1997). The relative bendability of different regions of B-DNA sequence is an important aspect of recognition, one that is used by countless control proteins that must bind to a particular region of double helix. Catabolite activator protein or CAP (Schultz et al., 1991; Parkinson et al., 1996), lacI (Lewis et al., 1996) and purR (Schumacher et al., 1994) repressors, γδ-resolvase (Yang & Steitz, 1995), EcoRV restriction enzyme (Winkler et al., 1993; Kostrewa & Winkler, 1995), integration host factor or IHF (Rice et al., 1996), and TBP or TATA-binding protein (Kim, Geiger et al., 1993; Kim, Nikolov & Burley, 1993; Nikolov et al., 1996; Juo et al., 1996) are all sequence-specific DNA-binding proteins that bend or deform the nucleic acid duplex severely during the recognition process. IHF in Fig. 23.3.4.5 may be taken as representative of this class of DNA-binding proteins. The bend is produced by two localized rolls of ca 60°, each in a direction compressing the major groove; the rolls are additive because they are spaced nine base pairs, or roughly one turn of helix, apart. In IHF, the two helix segments flanking the bend should be straight and unbent, and this is accomplished in one segment via a six-adenine A-tract: -C-A-A-A-A-A-A-G-.
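Why roll kinks one helical turn apart reinforce each other can be illustrated with a simple planar vector sum, sketched below in Python. This is an assumption-laden toy model, not a method from this chapter: each roll kink is treated as a bend vector whose direction rotates with the helix by a uniform twist per step, the helical repeat of 10.5 base pairs is a default chosen from the 10–10.5 range, and the addition of bend vectors is strictly valid only for small angles, so the numbers are qualitative.

```python
import cmath
import math


def net_roll_bend(kinks, bp_per_turn: float = 10.5) -> float:
    """Magnitude (degrees) of the vector sum of roll kinks.

    kinks: iterable of (step_index, roll_degrees) pairs.
    Each kink points in a direction set by its position along the helix,
    assuming a uniform twist of 360/bp_per_turn degrees per step.
    """
    twist = 2.0 * math.pi / bp_per_turn
    total = sum(roll * cmath.exp(1j * twist * step) for step, roll in kinks)
    return abs(total)


# Two 60-degree rolls nine steps apart (roughly one turn): nearly additive.
print(round(net_roll_bend([(0, 60.0), (9, 60.0)]), 1))
# The same two rolls half a turn apart (10 bp per turn assumed): they cancel.
print(round(net_roll_bend([(0, 60.0), (5, 60.0)], bp_per_turn=10.0), 1))
```

In this picture, kinks of the same sense spaced a whole turn apart point the same way and add, whereas kinks a half turn apart point in opposite directions and offset one another.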
The bending locus in IHF is C-A-A-T/A-T-T-G. It is C-G in lacI and purR repressors (Fig. 23.3.4.6), C-A = T-G in CAP (Fig. 10 of Dickerson, 1998b), and T-A in EcoRV, γδ-resolvase and TBP (Fig. 23.3.4.7). Pyrimidine-purine or Y-R steps appear to be especially suitable loci for roll bending. The dashed lines in Figs. 23.3.4.6 and 23.3.4.7 plot tilt, and demonstrate its insignificance in bending, compared with roll. (This is intuitively obvious. Imagine yourself standing near a tall stack of wooden planks in a lumberyard during an earthquake. Where would you prefer to stand: alongside the stack, or at one end?)
In summary, bending of the B-DNA helix nearly always involves roll, not tilt. The easier direction of bending is that which compresses the broad major groove, although examples of roll compression of the minor groove are known. Y-R steps are especially prone to roll bending. Again, the phenomenon is one of sequence-induced bendability, not mandatory bending. No one imagines that the IHF binding sequence of Fig. 23.3.4.5 is permanently kinked at its two C-A-A-T/A-T-T-G steps, wandering deformed through the nucleus, looking for an IHF molecule to bind to. Instead, this sequence has a potential bendability that other sequences, such as A-A-A-A-A-A, lack.
Table 23.3.4.1 summarizes the observed behaviour of Y-R, R-R and R-Y steps from a great many X-ray crystal structure analyses, with and without bound DNA. In the present context, these rules are termed the `Major Canon', since they are well established and generally well understood. Some understanding of the proneness of Y-R steps to bend can be obtained by looking at stereo pairs of two successive base pairs viewed down the helix axis. Fig. 23.3.4.8 gives a few representative examples; many more can be found in Figs. 4–6 of Dickerson (1998b) and in the original literature. In brief, Y-R steps, especially C-A and T-A, tend to orient so that polar exocyclic N and O atoms stack against polarizable rings of the other base pair. This is the same type of polar-on-polarizable stacking stabilization mentioned earlier in connection with O4′ and guanine in Z-DNA (Bugg et al., 1971; Thomas et al., 1982; Hunter & Sanders, 1990; B32). Base pairs in T-A steps tend not to slide over one another along their long axes, keeping pyrimidine O2 stacked over the purine five-membered ring (Fig. 23.3.4.8b). C-A steps can adopt this same stacking, or the base pairs can slide until the pyrimidine O2 sits over the purine six-membered ring instead (Fig. 23.3.4.8a).
Purine-purine or R-R steps behave quite differently (Fig. 23.3.4.8c). They stack ring-on-ring, usually with greater overlap on the purine end than the pyrimidine. The net effect is that the pivot appears to pass through or near the purines, while pyrimidines at the other end of the pairs stack O2-on-ring as with Y-R steps. R-Y steps tend to stack ring-on-ring, with little contribution from exocyclic atoms.
El Hassan & Calladine (1997) have recently examined roll, slide and twist behaviour at 400 different steps observed in crystal structures of 24 A- and 36 B-DNA oligomers. The author has carried out a similar analysis of 1137 steps from 86 sequence-specific protein–DNA complexes (Dickerson, 1998a,c; Dickerson & Chiu, 1997). A striking feature is that trends in local parameters are just the same in DNA crystals and in protein–DNA complexes. The frequently invoked nightmare of `crystal packing deformations' appears to be of only minor significance. In both studies (El Hassan & Calladine, 1997; Dickerson, 1998b), roll versus slide, slide versus twist and twist versus roll plots are presented for all ten possible base-pair steps. Fig. 23.3.4.9 illustrates roll versus slide plots for two Y-R, two R-R and two R-Y steps.
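The count of ten possible base-pair steps follows from the fact that a dinucleotide step and its reverse complement (for example C-A and T-G) describe the same two stacked base pairs. A short Python sketch of that bookkeeping is given here; the helper names are illustrative, and treating a Y-Y step as R-R read along the opposite strand is the convention assumed in the code.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
PURINES = set("AG")


def reverse_complement(step: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(step))


def canonical_step(step: str) -> str:
    """A step and its reverse complement are the same physical step; keep one label."""
    return min(step, reverse_complement(step))


def step_class(step: str) -> str:
    first_is_purine, second_is_purine = (base in PURINES for base in step)
    if not first_is_purine and second_is_purine:
        return "Y-R"
    if first_is_purine and not second_is_purine:
        return "R-Y"
    return "R-R"  # a Y-Y step is an R-R step when read on the opposite strand


unique_steps = sorted({canonical_step(a + b) for a, b in product("ACGT", repeat=2)})
by_class = {}
for step in unique_steps:
    by_class.setdefault(step_class(step), []).append(step)

print(len(unique_steps))
print(by_class)
```

The sixteen dinucleotides collapse to ten unique steps: three Y-R, three R-Y and four R-R.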
Table 23.3.4.2 summarizes observations from these roll/slide/twist plots. These are labelled the `Minor Canon' since they are recent, approximate and not well understood. However, they provide goals for future investigations of helix behaviour.
It has long been known that introduction of short A-tracts into general-sequence B-DNA in phase with the natural 10–10.5 base-pair repeat produced overall curvature that could be detected via electrophoretic gel retardation, ring-cyclization kinetics and other physical measurements in solution (Marini et al., 1982; Wu & Crothers, 1984; Koo et al., 1986; Crothers & Drak, 1992). However, the microscopic source of the observed macroscopic curvature remained unclear. Solution measurements alone cannot discriminate between three alternative curvature models: (1) local bending within the A-tracts themselves; (2) bending at junctions between A-tract B-DNA and general-sequence B-DNA; or (3) inherently straight and unbent A-tracts, with curvature resulting from removal of the normal writhe expected in general-sequence B-DNA (Koo et al., 1990; Crothers et al., 1990). The three curvature models are compared schematically in Fig. 10 of reference B77.
X-ray crystallographic results for DNA oligomers come down unequivocally in favour of model (3) above. Short A-tracts of four to six base pairs are straight and unbent in C-G-C-G-A-A-T-T-C-G-C-G (B1–B6), C-G-C-…-G-C-G (B20), C-G-C-…-G-C-G (B31), C-G-C-…-G-C-G (B17, B52), C-G-C-G-…-G-C (B64) and C-A-A-A-G-A-A-A-A-G (B105) (A-tracts are double-underlined). It has been claimed (Sprous et al., 1995) and disputed (Dickerson et al., 1994, 1996) that the observed straightness of crystalline A-tracts was only an artifact of crystal packing, or of the high levels of methyl-2,4-pentanediol (MPD) used in the crystallization. This concern is now put to rest by the observation that B-DNA packed against a protein molecule in its biological working environment behaves exactly the same as B-DNA packed against other DNA molecules in the crystal, as borne out by the roll/slide/twist studies of El Hassan & Calladine (1997) for DNA and of Dickerson (1998a,b,c) and Dickerson & Chiu (1997) for protein–DNA complexes. Added support has come from recent molecular-dynamics simulations by Beveridge and co-workers (Sprous et al., 1999), who have demonstrated that the duplex of sequence GGGGGGAAAATTTTCGAAAATTTTCCCCCC is severely curved because of a roll kink at the central CG step, whereas the duplex GGGGGGTTTTAAAACGTTTTAAAACCCCCC is much less curved because the roll kink at CG is counterbalanced by roll kinks in the opposite direction at the two flanking TA steps. In both cases, A-tracts are straight and completely unbent. (Note that both roll kinks can involve compression of the major groove, as expected, because the kink sites are a half turn of helix apart.)
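The spacing of the kink sites in the two simulated duplexes can be checked by listing the pyrimidine-purine steps in each sequence, as in the Python sketch below. The full 30-base sequences used here are an assumption: they are read off from the description above, with a central CG step in both duplexes and a TA step on either side in the second. The function and variable names are illustrative.

```python
PURINES = set("AG")


def yr_steps(sequence: str):
    """Return (step_index, dinucleotide) for every pyrimidine-purine step.

    Step i lies between bases i and i + 1 (0-based) of the strand as written.
    """
    seq = sequence.upper()
    return [
        (i, seq[i:i + 2])
        for i in range(len(seq) - 1)
        if seq[i] not in PURINES and seq[i + 1] in PURINES
    ]


# Sequences as assumed from the text's description of the two duplexes.
first_duplex = "GGGGGGAAAATTTTCGAAAATTTTCCCCCC"
second_duplex = "GGGGGGTTTTAAAACGTTTTAAAACCCCCC"

print(yr_steps(first_duplex))   # [(14, 'CG')]
print(yr_steps(second_duplex))  # [(9, 'TA'), (14, 'CG'), (19, 'TA')]
```

Under this reading, the first duplex has a single Y-R step, whereas in the second each TA step lies five steps (about half a helical turn) from the central CG step.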
This similarity of behaviour of DNA in crystals and in protein–DNA complexes should come as no surprise, since the local molecular environments – close intermolecular contacts, partial dehydration, low water activity, low local dielectric constant, high ionic strength, presence of divalent cations – are similar in these two cases and quite different from that of free DNA in dilute aqueous solution. Far from being unwanted `crystal deformations', the local changes in structure resulting from intermolecular contacts in DNA crystals provide positive information about sequence-dependent deformability that is relevant to the protein recognition process. With regard specifically to A-tract behaviour, Occam's Razor would argue in favour of model (3) above for the behaviour of A-tracts in solution. The situation in dilute aqueous solution becomes of secondary importance if what is wanted is an understanding of A-tract B-DNA behaviour in protein–DNA complexes. Here, the answer is unambiguous: A-tracts in their biological setting are inherently rigid structural elements, chosen by natural selection when bending should be avoided.