International Tables for Crystallography
Volume F
Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F, ch. 17.2, pp. 358–363

Section 17.2.3. Representation and visualization of molecular data and models

A. J. Olson*

The Scripps Research Institute, La Jolla, CA 92037, USA
Correspondence e-mail: olson@Scripps.edu

In the early days of molecular computer graphics, the major goal was to represent the spatial structures of molecules, principally the locations of the atom centres and the covalent connectivity between them. Using X-ray diffraction analysis, one would first plot the electron density as line contours projected onto a plane and locate the atom centres from multiple projections. The molecule would then be represented by a simple bond diagram. As experimental and computational methods advanced, other representations were used to convey additional information about the structure. Johnson's ORTEP program plotted the thermal ellipsoids of each of the atoms, visualizing the magnitude and direction of their thermal vibrations as derived from the anisotropic temperature factors (Fig. 17.2.3.1). As colour raster displays became available, space-filling CPK representations were used to visualize molecular shape and volume, with an atom-based colour scheme showing atomic composition and distribution (Porter, 1979) (see Fig. 17.2.3.4). The complexity of protein molecules prompted the introduction of simplified representations that replaced the all-atom visualization with tubes or ribbons representing the fold of the protein chain (Branden et al., 1975; Carson, 1991) (Fig. 17.2.3.2). This simplification allowed the comparison of protein folds, and led to the beautiful classification of protein motifs by Richardson (1981).

Figure 17.2.3.1. ORTEP plot of phenylhydroxynorbornanone showing atomic thermal ellipsoids (from Thermal ellipsoid analysis: the fossil footprints of restless atoms, by Carroll K. Johnson and Michael N. Burnett, Buerger Award Lecture at the ACA meeting in St. Louis, July 20–26, 1997).

Figure 17.2.3.2. Ribbon diagram of actin monomer (PDB code: 1atn) (Kabsch et al., 1990).

As more structural information has become available, and as computational hardware has advanced, it has become possible to visualize a variety of molecular properties. Meanwhile, issues of interactivity, intelligibility and interpretability have become increasingly important as the systems under study have become more complex. There are three general approaches to visualizing the structures, properties and relationships of molecular systems: geometric construction, direct volumetric rendering and generic information visualization. Today almost all macromolecular modelling and visualization work is done using geometric representations of bonds, ribbons and surfaces, which are annotated by colour to represent atom type, chain characteristics or electrostatic potential. For these purposes, there are several ‘turn-key’ programs that facilitate display and interaction. Programs such as RASMOL (Sayle & Milner-White, 1995), GRASP (Nichols et al., 1995) and MOLSCRIPT (Kraulis, 1991) are widely used by the molecular-structure community. Some of the fundamentals of representation used in these types of programs, as well as more exploratory techniques, are described below.

While the world is continuous, our measurements of it tend to be finite and sampled. Thus data are usually represented as discrete values on a line, plane, volume or hypervolume. On the other hand, in order to capture the nature of the world, our models tend to be represented as continuous functions. Geometric construction is useful for rendering continuous models, while other techniques, such as volume rendering, lend themselves to the visualization of discrete data.

17.2.3.1. Geometric representation

Geometric construction encompasses dots, lines and surfaces described by lists of three-dimensional coordinates and connectivity, or by analytic or parametric expressions that can generate such information for rendering. Geometric rendering projects the 3-D geometry onto a two-dimensional viewing plane using matrix transformations that account for the viewpoint, perspective and clipping within the viewing volume. For dots and lines, the computation may end there; the only depth information in the rendering might be geometric perspective. Additional depth information can be added by ‘atmospheric perspective’ or depth cueing, where the brightness or colour is modulated by the depth values of the points (Fig. 17.2.3.3). Surface representations permit additional three-dimensional cues such as occlusion and shape-from-shading. Occlusion, or ‘hidden surface removal’, and atmospheric perspective depend on maintaining depth information for all of the picture elements (pixels) in screen space. Such ‘depth-buffer’ algorithms provide visibility information for a given viewpoint, and hardware z-buffers facilitate such calculations in the graphics pipeline. Lighting cues, such as shading, are attained by approximating the ambient, diffuse and specular reflectance of the geometry, with the diffuse term following Lambert's cosine law. Because typical surfaces are composed of polyhedral facets, interpolation schemes are used to produce smooth shaded representations. The most common technique used for molecular graphics is Gouraud shading (Gouraud, 1971), which interpolates the shaded colour values assigned at the vertices across the polyhedral face (Fig. 17.2.3.4). Phong shading (Phong, 1975), a more accurate but more costly technique, interpolates the surface normals across each facet and evaluates the lighting at each pixel to produce a more realistic rendering.

Shading templates for specific geometries, such as spheres, can give very smooth results without having to resort to large polyhedral descriptions for each sphere. In the past, this approach was implemented in the graphics hardware design, resulting in very fast sphere rendering for molecular applications. With the advent of consumer-level 3-D graphics these specialized features have become increasingly rare. Shadows may also provide useful three-dimensional cues in viewing molecular objects, but they can be confusing when they introduce too much visual contrast or clutter. Ray tracing is a general technique for producing a complete reflectance and shadow rendering of a three-dimensional scene. It can, however, be very costly in computational time, since every light ray in the final image must be iteratively traced back to its source. Faster approximations for shadow rendering have been implemented that work well for molecular scenes (Gwilliam & Max, 1989; Lauher, 1990).
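The projection, depth-cueing and diffuse-shading steps described in this section can be illustrated with a short Python sketch. It is not taken from any of the programs cited here: the function names, the linear depth-cueing ramp, the brightness floor and the ambient level are all assumptions chosen for illustration.

```python
import math

def project_point(p, focal_length=1.0):
    """Perspective-project a camera-space point (x, y, z); z > 0 is in front of the viewer."""
    x, y, z = p
    if z <= 0:
        raise ValueError("point is behind the viewer")
    return (focal_length * x / z, focal_length * y / z)

def depth_cue(intensity, z, z_near, z_far, floor=0.2):
    """Dim the intensity linearly from full brightness at z_near to `floor` at z_far."""
    t = (z - z_near) / (z_far - z_near)
    t = min(1.0, max(0.0, t))
    return intensity * (1.0 - t * (1.0 - floor))

def lambert(normal, to_light, ambient=0.1):
    """Ambient plus diffuse brightness; the diffuse term is proportional to the
    cosine of the angle between the surface normal and the light direction."""
    n_len = math.sqrt(sum(c * c for c in normal))
    l_len = math.sqrt(sum(c * c for c in to_light))
    cos_theta = sum(n * l for n, l in zip(normal, to_light)) / (n_len * l_len)
    return min(1.0, ambient + (1.0 - ambient) * max(0.0, cos_theta))
```

In a Gouraud-shaded surface, a function such as `lambert` would be evaluated once per vertex and the resulting colours interpolated across each facet; in Phong shading it would be evaluated per pixel using interpolated normals.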

Figure 17.2.3.3. Simple bonding diagram of a DNA structure (PDB code: 140D) (Mujeeb et al., 1993). On the left all lines are of equal intensity. On the right the lines are depth-cued to show which parts of the structure are closer to the viewer.

Figure 17.2.3.4. CPK representation of the same DNA structure as in Fig. 17.2.3.3. The model on the left uses highly tessellated spheres for the atoms, while the one on the right uses a coarser tessellation. The Gouraud shading model produces some lighting artifacts, such as the star-shaped highlights, which are most apparent on the right-hand figure. This is due to colour interpolation between the facet vertices.

A number of useful surface representations have been developed that describe the interaction of a molecule with the surrounding solvent. Perhaps the most widely used are the solvent-accessible surface (Lee & Richards, 1971) and the molecular surface (Richards, 1977; Connolly, 1983; Sanner et al., 1996), sometimes referred to as the Connolly or solvent-excluded surface (Fig. 17.2.3.5). For large molecules, such as proteins, which have many atoms buried from solvent, these surfaces have proven to be important in studying molecular interactions. They not only help to visualize the complementarity of interacting molecules, but they are also important in quantifying the entropic changes associated with solvent effects upon binding.
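As an illustration of how a solvent-accessible surface can be quantified, the following Python sketch estimates per-atom accessible area by placing test points on each atom's probe-expanded sphere and counting those not buried inside a neighbouring expanded sphere. This is a simple point-sampling approximation, not the slicing procedure of Lee & Richards (1971) nor the analytical methods cited above; the function names, the golden-spiral point layout and the default of 400 points are assumptions.

```python
import math

def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-spiral layout)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        pts.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return pts

def accessible_area(atoms, probe=1.5, n_points=400):
    """atoms: list of (x, y, z, radius) in Å. Returns one accessible area (Å^2) per atom."""
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe  # radius of the sphere traced by the probe centre
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            # A test point is buried if it lies inside any other expanded sphere.
            buried = any(
                (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < (rj + probe) ** 2
                for j, (xj, yj, zj, rj) in enumerate(atoms) if j != i
            )
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * big_r * big_r * exposed / n_points)
    return areas
```

For an isolated atom every test point is exposed, so the estimate equals the full area of the expanded sphere; the accuracy for overlapping atoms depends on the number of test points.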

Figure 17.2.3.5. Solvent-excluded surface of the DNA structure using a water probe radius of 1.5 Å. The figure on the left shows a depth-cued dot surface, while the figure on the right shows a Gouraud-shaded triangulated surface.

Surface representations have made it possible to display a large variety of computed or experimental molecular properties by mapping them onto the surface using colour coding. Electrostatic potential, hydrophobicity, sequence conservation, surface shape and any other characteristic of the molecule that can be projected onto the surface can be colour coded and displayed. Typically, this is accomplished by colouring the vertices of the surface mesh using a colour mapping or scale and interpolating the colour across the polygonal faces of the mesh. Since colour values are interpolated between vertices, this can produce unwanted colour artifacts if there are abrupt spatial changes in the properties displayed, or if the colour interpolation does not correspond to the property mapping (Fig. 17.2.3.6).

Figure 17.2.3.6. Hydrophobicity mapped onto a molecular surface. A spherical-harmonic approximation of the actin monomer solvent-excluded surface is shown. (a) Vertex colouring of a medium-mesh tessellated surface. The hydrophobicity colour scale is shown above. Notice that the colours blend between vertices, producing colour artifacts in relationship to the property scale. (b) The same medium-mesh representation as in (a) but using a property-based (one-dimensional) texture map, applying the same colour scale. Notice that the boundaries between the colours are distinct, even when they intersect vertices. Here the property value is interpolated. (c) A coarser mesh showing the same texture-mapping technique used in (b). Since the properties are only sampled at the vertices of the mesh, the finer details of the mapping are lost at this coarse triangulation. (d) A two-dimensional texture map created as a ‘Mercator-like’ projection in spherical coordinates (θ, ϕ) from the same hydrophobicity scale used in (a)–(c). (e) The 2-D texture map shown in (d) mapped onto the medium-mesh actin surface. Notice that the linear nature of the interpolation seen in (b) using the same mesh is no longer present. (f) The same 2-D texture map applied to the coarse-mesh surface of actin. Notice that, unlike in (c), the detail of the texture map is preserved independent of the mesh.

Another method for projecting information on a surface is texture mapping, an approach that is analogous to applying an image ‘decal’ onto the surface. In this approach, instead of assigning colours to the surface vertices, indices are assigned which serve as coordinates into the image to be mapped. Thus, a great amount of detail may be displayed on a surface mesh that has relatively few polygons describing the geometry. Texture mapping has been used extensively in highly interactive graphics, such as flight simulators and video games, since transformation of the geometry tends to be the computational bottleneck. Since texture mapping requires an indexing scheme that relates an image to a set of geometric vertices on the molecular surface, one needs a rational way of producing such a map. For one-dimensional texture maps, this is accomplished relatively easily by assigning the texture index of each vertex according to an appropriate property scale (Teschner et al., 1994) (Fig. 17.2.3.6). This approach, however, is still tied to the level of triangulation. The more general two-dimensional or location-based surface texture mapping requires a global scheme for assigning texture indices. While the original molecular surface geometry does not lend itself directly to this type of texture mapping, analytical approximations to these surfaces, such as spherical-harmonics-based molecular surfaces (Duncan & Olson, 1993), provide simple hierarchical meshing schemes that can be easily texture mapped by using a ‘Mercator-like’ projection between the image and the molecular surface (Duncan & Olson, 1995) (Fig. 17.2.3.6).
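The difference between vertex colouring and one-dimensional texture mapping can be made concrete with a small Python sketch. In the first case the colours looked up at the two vertices are interpolated; in the second the property value (the texture index) is interpolated and the colour is looked up afterwards. The three-stop blue, white, red scale and all function names are hypothetical and serve only to illustrate the artifact.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t

def colour_scale(value):
    """Hypothetical property scale on [0, 1]: blue -> white -> red, as (r, g, b)."""
    stops = [(0.0, (0.0, 0.0, 1.0)), (0.5, (1.0, 1.0, 1.0)), (1.0, (1.0, 0.0, 0.0))]
    value = min(1.0, max(0.0, value))
    for (v0, c0), (v1, c1) in zip(stops, stops[1:]):
        if value <= v1:
            t = (value - v0) / (v1 - v0)
            return tuple(lerp(a, b, t) for a, b in zip(c0, c1))
    return stops[-1][1]

def vertex_coloured(p0, p1, t):
    """Vertex colouring: look up colours at the vertices, then interpolate the colours."""
    c0, c1 = colour_scale(p0), colour_scale(p1)
    return tuple(lerp(a, b, t) for a, b in zip(c0, c1))

def texture_mapped(p0, p1, t):
    """1-D texture mapping: interpolate the property value, then look up the colour."""
    return colour_scale(lerp(p0, p1, t))

# Halfway along an edge whose vertices carry property values 0 and 1:
print(vertex_coloured(0.0, 1.0, 0.5))  # purple, a colour that is not on the scale
print(texture_mapped(0.0, 1.0, 0.5))   # white, the scale colour for value 0.5
```

With vertex colouring the midpoint receives a blend that corresponds to no property value, whereas the texture look-up always returns a colour that lies on the scale.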

17.2.3.2. Volumetric representation

Molecular properties are not confined to bonds and surfaces. These, in fact, are geometric constructs or abstractions of the time-dependent volumetric characteristics of molecules. In crystallography, electron density is the primary volumetric property to be visualized. Other derived or computed volumetric properties have become important to visualize as well, especially for macromolecules and their complexes. Electrostatic potential and field gradients help establish a molecule's effect at a distance, and a variety of volumetric atomic affinity potentials or grids (Goodford, 1985) can provide a picture of the types of molecular interactions that are energetically favoured.

Traditionally, electron density and other volumetric properties have been displayed as isocontour or isosurface representations, in which lines or surfaces of constant value are rendered in planes or in 3-D space to reveal characteristics of the volumetric property. Early computer-graphic pen plots of planar Fourier projections of electron density were usually sufficient to reveal atomic structure. As the molecules of study became larger and more complex, stacks of two-dimensional slices, creating three-dimensional isocontours, became necessary. The first computer representations of such 3-D isovalue surfaces were composed of three orthogonal sets of 2-D contour plots, giving the impression of a ‘basket weave’. These plots depicted surface isocontours of the three-dimensional density, but had several problems from a computational and representational point of view. Since there were preferred directions of the contours (along the x, y and z axes), particular views were difficult to interpret. Additionally, the three orthogonal contours did not define a well formed triangulated geometric surface, so modern surface-rendering techniques could not be applied directly. Moreover, the computation and recomputation of isosurfaces was relatively inefficient. An algorithm to compute the three-dimensional isosurface directly, called ‘marching cubes’, was devised by Lorensen & Cline (1987) (Fig. 17.2.3.7). This algorithm sped up the contouring process and enabled shaded surface representation of these surfaces. More recently, the recomputation of isosurfaces has been accelerated through the precomputation of seed points that span all values of the volume. Using these seed points to flood-fill an isosurface of a given value reduces the contouring computation from a three-dimensional to a two-dimensional calculation. This enables the interactive modification of contour levels for even very large volumes (Bajaj et al., 1996).
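The two elementary operations performed on each grid cell in a marching-cubes style contouring can be sketched in Python as follows. The sketch covers only the classification of a cell by its eight corner values and the linear interpolation of a surface crossing along one cell edge; the triangle look-up table that turns the case index into facets is omitted. The function names and the bit convention (a bit is set when the corner value is below the contour level) are assumptions.

```python
def cube_index(corner_values, iso):
    """Return the 8-bit case index of a cell: bit k is set when corner k is below iso."""
    if len(corner_values) != 8:
        raise ValueError("a cubic cell has exactly eight corners")
    index = 0
    for k, v in enumerate(corner_values):
        if v < iso:
            index |= 1 << k
    return index

def edge_crossing(p0, p1, v0, v1, iso):
    """Linearly interpolate the point where the isosurface crosses the edge p0-p1.

    Assumes v0 and v1 lie on opposite sides of iso, so v0 != v1.
    """
    t = (iso - v0) / (v1 - v0)
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))
```

Cells with index 0 or 255 lie entirely on one side of the contour level and contribute no surface, which is why most of a typical density map can be skipped quickly.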

Figure 17.2.3.7. Crystallographic electron-density isosurfaces, showing details of a protein iron–sulfur cluster. The surfaces are coloured by the gradient of the electron density, highlighting the iron and sulfur densities. Image by Michael Pique, The Scripps Research Institute.

While isocontours and isosurfaces have been the dominant modes of volumetric representation in molecular graphics, there has been a trend in scientific visualization to use alternative techniques, termed ‘direct volume rendering’. These methods bypass the construction of contours or surfaces to represent values within the volume, and instead use the scalar (or sometimes vector) values within the volume to produce an image directly. A general technique to accomplish this type of volumetric rendering is termed ray casting. If one considers a function that maps the scalar values of a volume into optical properties such as colour and opacity, one can simulate the passage of light rays through the volume, projecting the resulting rays onto the image plane. Given an appropriate transfer function or look-up table, the image represents the distribution of all of the values within the volume, circumventing the need to select only certain values as required for isocontouring. Such techniques have been used extensively in medical tomography (Höhne et al., 1989) and electron microscopy (Kremer et al., 1996; Hessler et al., 1996). Their use has also been explored in the rendering of volumetric properties of molecules (Goodsell et al., 1989). The images that are obtained by direct volume rendering tend to appear cloud-like, with soft edges. While this may be a ‘true’ representation of the molecular characteristics, it is sometimes difficult to interpret visually. Techniques for imparting shading cues into these renderings by using gradient information in the volume have made this type of rendering more interpretable (Drebin et al., 1988). Another potential drawback of these methods is the cost of the computations. Since these methods require computing the effect of every element of the volume, the amount of computation scales as the cube of the linear dimension.

There have been several clever software and hardware approaches to overcoming this problem. One novel hardware approach is to use three-dimensional texture mapping. By stacking texture-mapped planes to represent the colour and opacity of the volume, and using the hardware depth-buffer capabilities to compose the final image in the viewing plane, one can manipulate and render reasonable-size volumes (128³) at highly interactive rates. For molecular visualization, one would like to be able to represent both geometric and volumetric characteristics in the same rendering to visualize, for instance, model and data (Fig. 17.2.3.8). The three-dimensional texture-mapping approach enables this easily, since the planes upon which the volume data are mapped are in fact geometric. Other direct-volume rendering codes provide this capability as well.
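The ray-casting idea described above, in which a transfer function converts scalar values to optical properties that are then accumulated along each ray, can be sketched in a few lines of Python. The sketch uses front-to-back compositing along rays parallel to the z axis of a small nested-list volume. The transfer function, the early-termination threshold and all names are assumptions made for illustration; it is not the method of any particular program cited here.

```python
def transfer(value):
    """Hypothetical transfer function: scalar in [0, 1] -> (grey level, opacity)."""
    value = min(1.0, max(0.0, value))
    return value, 0.5 * value

def cast_ray(samples):
    """Composite samples front to back; return (accumulated intensity, accumulated opacity)."""
    colour, alpha = 0.0, 0.0
    for v in samples:
        c, a = transfer(v)
        colour += (1.0 - alpha) * a * c   # contribution attenuated by what lies in front
        alpha += (1.0 - alpha) * a
        if alpha > 0.99:                  # early ray termination: nothing behind is visible
            break
    return colour, alpha

def render(volume):
    """Orthographic view down z of volume[z][y][x]; returns image[y][x] of intensities."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    return [
        [cast_ray([volume[z][y][x] for z in range(nz)])[0] for x in range(nx)]
        for y in range(ny)
    ]
```

Because every voxel along every ray is visited, the cost of `render` grows with the total number of voxels, which is the cubic scaling noted in the text.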

Figure 17.2.3.8. A difference-electron-density map of a minor-groove drug binding in DNA. This image combines volumetric rendering of the electron density with a geometric model of the DNA molecule. Data courtesy of R. E. Dickerson, UCLA. Image by David Goodsell, The Scripps Research Institute.

17.2.3.3. Information visualization

While molecular-structure research deals directly with objects in three dimensions, it is at times advantageous to abstract this three-dimensional information into diagrams that show relationships that are not readily apparent from examination of the geometric models or volumes themselves. This type of representation is broadly termed ‘information visualization’. In the arena of molecular structure, probably the best known and most widely used diagram of this type is the Ramachandran plot (Ramachandran & Sasisekharan, 1968), which maps each of a protein's amino-acid residues into the backbone torsion-angle space of ϕ and ψ. Such a diagram readily pinpoints the parts of the protein backbone that have unusual (and sometimes erroneous) conformations. It also nicely shows the clustering of residues into the standard secondary-structural motifs and their variations. There have been several enhancements of the Ramachandran plot over the years, some of which superimpose computed energy contours or colour-code residues by characteristics such as sequence order.
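The coordinates of a residue in a Ramachandran plot are two torsion angles, each computed from four consecutive backbone atoms: ϕ from C(i−1), N(i), Cα(i), C(i), and ψ from N(i), Cα(i), C(i), N(i+1). A minimal Python sketch of this calculation follows; the function names are hypothetical and atoms are passed as plain (x, y, z) tuples.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond (0 = cis, 180 = trans)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def phi_psi(prev_c, n, ca, c, next_n):
    """Backbone torsion angles (phi, psi) of one residue, in degrees."""
    return dihedral(prev_c, n, ca, c), dihedral(n, ca, c, next_n)
```

Plotting the resulting (ϕ, ψ) pairs for all residues as points on axes running from −180° to 180° gives the basic diagram; the first and last residues of a chain lack one of the two angles.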

Another visualization approach that has become very useful is the distance matrix plot, and its derivative, the difference distance matrix (Phillips, 1970). By constructing a matrix of the distances between all pairs of amino-acid α-carbons and contouring or colouring the resulting values, one can readily see the patterns of α-helices and β-sheets within the structure. An advantage of this type of visualization is that it is coordinate-frame independent. Thus two structures can be compared for features without first superposing their coordinates in the same frame. This approach also works well when comparing two different structures of the same molecule, where there may be some movement between the two. By computing the distance matrix for each structure, and then computing the difference between the two distance matrices, one obtains a difference distance matrix that indicates those parts of the structure that stay in the same relative relationships and those that move relative to each other (Fig. 17.2.3.9).
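The construction of a distance matrix and a difference distance matrix can be sketched in Python as follows. The sketch assumes that each structure is supplied as a list of α-carbon coordinates in the same residue order; the function names are hypothetical.

```python
import math

def distance_matrix(coords):
    """Matrix of pairwise distances between points given as (x, y, z) tuples."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def difference_distance_matrix(coords_a, coords_b):
    """Element-wise difference (structure B minus structure A) of the distance matrices."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures must contain the same number of atoms")
    da, db = distance_matrix(coords_a), distance_matrix(coords_b)
    return [[y - x for x, y in zip(row_a, row_b)] for row_a, row_b in zip(da, db)]

# A rigid rotation and translation of the same three points leaves every distance unchanged.
a = [(0, 0, 0), (3, 0, 0), (3, 4, 0)]
b = [(10, 10, 10), (10, 13, 10), (6, 13, 10)]
ddm = difference_distance_matrix(a, b)
print(max(abs(v) for row in ddm for v in row))
```

Because only interatomic distances enter the calculation, no superposition of the two structures is needed; non-zero entries mark pairs of residues whose separation differs between the two structures.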

Figure 17.2.3.9. A difference distance matrix plot of the α1–β2 interface of haemoglobin in the T to R transformation. The x axis represents the α1 subunit and the y axis the β2 subunit. Red points indicate residues that are closer following the transformation and blue points indicate residues that move farther apart. Plot by Raj Srinivasan, Johns Hopkins University.

Animating trajectories of molecular structures and changes in volumetric properties over time is one way to look for trends and patterns in molecular dynamics and other time-course simulations. However, other modes of information visualization can assist analysis and communication of results, sometimes more effectively. Plotting an array of small images showing the time course of key properties can reveal patterns that may be difficult to see in a trajectory. For instance, using the program MOLMOL (Koradi et al., 1996), the time course of the seven nucleic-acid backbone torsion angles during a dynamics simulation of an RNA polynucleotide can be plotted on a circular graph (starting from the centre and progressing outward) to uncover patterns of change and correlation between a large number of variables over time.

In addition to the enormous amount of information generated by computational simulations of molecular dynamics, dockings and other multi-structure, multi-modal techniques, the floodgates of molecular information have opened, gushing data from genomics and high-throughput structure determination. Thus, the need for novel visualization methods has become even more acute. Circle maps defining genomic structure at various levels of detail and annotation have become a common graphical form for organizing and communicating the positional and functional aspects of genome structure. Aligned nucleic acid or amino-acid sequences coded by conservation, chemical property, or any number of other functional relationships have become the lingua franca of gene hunters and gatherers. As the protein structure database continues its exponential growth, the opportunities for defining and refining structural family relationships abound. Developing methods for effectively visualizing the relationships that arise from all-by-all computational comparisons of the entire database is an important current challenge in molecular graphics.

References

Bajaj, C. L., Pascucci, V. & Schikore, D. R. (1996). Fast isocontouring for improved interactivity. In Proceedings of the ACM SIGGRAPH/IEEE symposium on volume visualization, pp. 39–46. San Francisco: ACM Press.
Branden, C.-I., Jornvall, H., Eklund, H. & Fureugren, B. (1975). Alcohol dehydrogenase. In The enzymes, edited by P. Boyer, pp. 104–186. New York: Academic Press.
Carson, M. (1991). RIBBONS 2.0. J. Appl. Cryst. 24, 958–961.
Connolly, M. L. (1983). Solvent-accessible surfaces of proteins and nucleic acids. Science, 221, 709–713.
Drebin, R., Carpenter, L. & Hanrahan, P. (1988). Volume rendering. Proc. ACM SIGGRAPH'88 (Atlanta, Georgia, 1–5 August 1988). In Comput. Graphics Proc. Annu. Conf. Ser. 1988 (1993), pp. 65–74. New York: ACM SIGGRAPH.
Duncan, B. S. & Olson, A. J. (1993). Approximation and characterization of molecular surfaces. Biopolymers, 33, 219–229.
Duncan, B. S. & Olson, A. J. (1995). Approximation and visualization of large-scale motion of protein surfaces. J. Mol. Graphics, 13, 250–257.
Goodford, P. J. (1985). A computational procedure for determining energetically favorable binding sites on biologically important molecules. J. Med. Chem. 28, 849–857.
Goodsell, D. S., Mian, I. S. & Olson, A. J. (1989). Rendering of volumetric data in molecular systems. J. Mol. Graphics, 7, 41–47.
Gouraud, H. (1971). Continuous shading of curved surfaces. IEEE Trans. Comput. 20, 623–628.
Gwilliam, M. & Max, N. (1989). Atoms with shadows – an area-based algorithm for cast shadows on space-filling molecular models. J. Mol. Graphics, 7, 54–59.
Hessler, D. S., Young, S. J. & Ellisman, M. H. (1996). A flexible environment for visualization of three-dimensional biological structures. J. Struct. Biol. 116, 113–119.
Höhne, K. H., Bomans, M., Pommert, A., Reimer, M., Schiers, C., Tiede, U. & Wiebecke, G. (1989). 3D-visualization of tomographic volume data using the generalized voxel-model. Volume visualization workshop, Chapel Hill, NC. Department of Computer Science, University of North Carolina at Chapel Hill.
Koradi, R., Billeter, M. & Wuthrich, K. (1996). MOLMOL: a program for display and analysis of macromolecular structures. J. Mol. Graphics, 14, 51–55.
Kraulis, P. J. (1991). MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Cryst. 24, 946–950.
Kremer, J. R., Mastronarde, D. N. & McIntosh, J. R. (1996). Computer visualization of three-dimensional image data using IMOD. J. Struct. Biol. 116, 71–76.
Lauher, J. W. (1990). Chem-Ray: a molecular graphics program featuring an umbra and penumbra shadowing routine. J. Mol. Graphics, 8, 34–38.
Lee, B. & Richards, F. M. (1971). The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol. 55, 379–400.
Lorensen, W. E. & Cline, H. E. (1987). Marching cubes: a high resolution 3D surface construction algorithm. Comput. Graphics, 21, 163–169.
Nichols, W. L., Rose, G. D., Ten Eyck, L. F. & Zimm, B. H. (1995). Rigid domains in proteins: an algorithmic approach to their identification. Proteins Struct. Funct. Genet. 23, 38–45.
Phillips, D. C. (1970). British biochemistry, past and present, edited by T. W. Goodwin, pp. 11–28. Academic Press.
Phong, B. T. (1975). Illumination for computer generated images. Commun. ACM, 18, 311–317.
Porter, T. K. (1979). The shaded surface display of large molecules. Comput. Graphics, 13, 234–236.
Ramachandran, G. N. & Sasisekharan, V. (1968). Conformation of polypeptides and proteins. Adv. Protein Chem. 23, 283–437.
Richards, F. M. (1977). Areas, volumes, packing and protein structure. Annu. Rev. Biophys. Bioeng. A, 6, 151–176.
Richardson, J. S. (1981). The anatomy and taxonomy of protein structure. Adv. Protein Chem. 34, 167–339.
Sanner, M.-F., Olson, A. J. & Spehner, J.-C. (1996). Reduced surface: an efficient way to compute molecular surfaces. Biopolymers, 38, 305–320.
Sayle, R. A. & Milner-White, E. J. (1995). RASMOL: biomolecular graphics for all. Trends Biochem. Sci. 20, 374.
Teschner, M., Henn, C., Vollhardt, H., Reiling, S. & Brinkmann, J. (1994). Texture mapping: a new tool for molecular graphics. J. Mol. Graphics, 12, 98–105.







































