International
Tables for Crystallography Volume D Physical properties of crystals Edited by A. Authier © International Union of Crystallography 2006 |
International Tables for Crystallography (2006). Vol. D. ch. 2.2, pp. 294-313
https://doi.org/10.1107/97809553602060000639 Chapter 2.2. Electrons
a
Institut für Materialchemie, Technische Universität Wien, Getreidemarkt 9/165-TC, A-1060 Vienna, Austria The electronic structure of a solid, characterized by its energy band structure, is the fundamental quantity that determines the ground state of the solid and a series of excitations involving electronic states. In the first part of this chapter, several basic concepts are summarized in order to establish the notation used and to repeat essential theorems from group theory and solid-state physics that provide the definitions that are needed in this context (Brillouin zones, symmetry operators, Bloch theorem, space-group symmetry). Next the quantum-mechanical treatment, especially density functional theory, is described and the commonly used methods of band theory are outlined (the linear combination of atomic orbitals, tight binding, pseudo-potential schemes, the augmented plane wave method, the linear augmented plane wave method, the Korringa–Kohn–Rostocker method, the linear combination of muffin-tin orbitals, the Car–Parinello method etc.). The linear augmented plane wave scheme is presented explicitly so that concepts in connection with energy bands can be explained. The electric field gradient is discussed to illustrate a tensorial quantity. In the last section, a few examples illustrate the topics of the chapter. Keywords: Bloch function; Bloch states; Bloch theorem; Bravais lattices; Brillouin zone; Car–Parrinello method; Korringa–Kohn–Rostocker method; Seitz operator; Sommerfeld model; atomic orbitals; augmented plane wave method; band structure; chemical bonding; core electrons; crystal harmonics; density functional theory; density of states; electric field gradient; energy bands; exchange–correlation; free-electron model; full-potential methods; itinerant electrons; linear combination of atomic orbitals; linear combination of muffin-tin orbitals; linearized augmented plane wave method; local coordinate system; localized electrons; muffin-tin approximation; nuclear quadrupole moment; partial charges; periodic boundary conditions; pseudo-potential; quantum-mechanical treatment; reciprocal lattice; relativistic effects; representations; semi-core states; small representations; tight binding. |
The electronic structure of a solid, characterized by its energy band structure, is the fundamental quantity that determines the ground state of the solid and a series of excitations involving electronic states. In this chapter, we first summarize several basic concepts in order to establish the notation used here and to repeat essential theorems from group theory and solid-state physics that provide definitions which we need in this context. Next the quantum-mechanical treatment, especially density functional theory, is described and the commonly used methods of band theory are outlined. One scheme is presented explicitly so that concepts in connection with energy bands can be explained. The electric field gradient is discussed to illustrate a tensorial quantity and a few examples illustrate the topics of this chapter.
The three unit-cell vectors ,
and
define the parallelepiped of the unit cell. We define
From the seven possible crystal systems one arrives at the 14 possible space lattices, based on both primitive and non-primitive (body-centred, face-centred and base-centred) cells, called the Bravais lattices [see Chapter 9.1
of International Tables for Crystallography, Volume A (2005)
]. Instead of describing these cells as parallelepipeds, we can find several types of polyhedra with which we can fill space by translation. A very important type of space filling is obtained by the Dirichlet construction. Each lattice point is connected to its nearest neighbours and the corresponding bisecting (perpendicular) planes will delimit a region of space which is called the Dirichlet region, the Wigner–Seitz cell or the Voronoi cell. This cell is uniquely defined and has additional symmetry properties.
When we add a basis to the lattice (i.e. the atomic positions in the unit cell) we arrive at the well known 230 space groups [see Part 3
of International Tables for Crystallography, Volume A (2005)
].
Owing to the translational symmetry of a crystal, it is convenient to define a reciprocal lattice, which plays a dominating role in describing electrons in a solid. The three unit vectors of the reciprocal lattice are given according to the standard definition by
where the factor
is commonly used in solid-state physics in order to simplify many expressions. Strictly speaking (in terms of mathematics) this factor should not be included [see Section 1.1.2.4
of the present volume and Chapter 1.1
of International Tables for Crystallography, Volume B (2001)
], since the (complete) reciprocity is lost, i.e. the reciprocal lattice of the reciprocal lattice is no longer the direct lattice.
In analogy to the direct lattice we define
A construction identical to the Wigner–Seitz cell delimits in reciprocal space a cell conventionally known as the first Brillouin zone (BZ), which is very important in the band theory of solids. There are 14 first Brillouin zones according to the 14 Bravais lattices.
The concepts of symmetry operations in connection with a quantum-mechanical treatment of the electronic states are essential for an understanding of the electronic structure. In this context the reader is referred, for example, to the book by Altmann (1994).
For the definition of symmetry operators we use in the whole of this chapter the active picture, which has become the standard in solid-state physics. This means that the whole configuration space is rotated, reflected or translated, while the coordinate axes are kept fixed.
A translation is given bywhere t on the left-hand side corresponds to a symmetry (configuration-space) operator.
Often we are interested in a function (e.g. a wavefunction) and wish to know how it transforms under the configuration operator g which acts on
. For this purpose it is useful to introduce a function-space operator
which defines how to modify the function in the transformed configuration space so that it agrees with the original function
at the original coordinate
:
This must be valid for all points
and thus also for
, leading to the alternative formulation
The symmetry operations form a group G of configuration-space operations
with the related group
of the function-shape operators
. Since the multiplication rules
are preserved, these two groups are isomorphic.
In a quantum-mechanical treatment of the electronic states in a solid we have the following different entities: points in configuration space, functions defined at these points and (quantum-mechanical) operators acting on these functions. A symmetry operation transforms the points, the functions and the operators in a clearly defined way.
Consider an eigenvalue equation of operator (e.g. the Hamiltonian):
where
is a function of
. When g acts on
, the function-space operator
acts [according to (2.2.3.4)
] on
yielding
:
By putting
from (2.2.3.7)
into (2.2.3.6)
, we obtain
Multiplication from the left by
yields
This defines the transformed operator
which acts on the transformed function
that is given by the original function
but at position
.
The most general space-group operation is of the form with the point-group operation p (a rotation, reflection or inversion) followed by a translation w:
With the definition
it is easy to prove the multiplication rule
and define the inverse of a Seitz operator as
which satisfies
where
does not change anything and thus is the identity of the space group G.
Using the Seitz operators, we can classify the most important groups as we need them at the beginning of this chapter:
The electronic structure of an infinite solid looks so complicated that it would seem impossible to calculate it. Two important steps make the problem feasible. One is the single-particle approach, in which each electron moves in an average potential according to a Schrödinger equation,
and has its kinetic energy represented by the first operator. The second important concept is the translational symmetry, which leads to Bloch functions. The single-particle aspect will be discussed later (for details see Sections 2.2.9
and 2.2.10
).
In order to derive the Bloch theorem, we can simplify the problem by considering a one-dimensional case with a lattice constant a. [The generalization to the three-dimensional case can be done easily according to (2.2.3.15).] The one-dimensional Schrödinger equation is
where
is invariant under translations, i.e.
. We define a translation operator t according to (2.2.3.1)
for the translation by one lattice constant as
and apply its functional counterpart
to the potential, which gives [according to (2.2.3.4)
]
The first part in
corresponds to the kinetic energy operator, which is also invariant under translations. Therefore, since
(the lattice translation subgroup) and
(the Schrödinger group),
commutes with
, i.e. the commutator vanishes,
or
. This situation was described above [see (2.2.3.16)
–(2.2.3.18)] and leads to the fundamental theorem of quantum mechanics which states that when two operators commute the eigenvectors of the first must also be eigenvectors of the second. Consequently we have
where
is the eigenvalue corresponding to the translation by the lattice constant a. The second equation can be written explicitly as
and tells us how the wavefunction changes from one unit cell to the neighbouring unit cell. Notice that the electron density must be translationally invariant and thus it follows
which is a necessary (but not sufficient) condition for defining
.
We can expect the bulk properties of a crystal to be insensitive to the surface and also to the boundary conditions imposed, which we therefore may choose to be of the most convenient form. Symmetry operations are covering transformations and thus we have an infinite number of translations in T, which is most inconvenient. A way of avoiding this is provided by periodic boundary conditions (Born–von Karman). In the present one-dimensional case this means that the wavefunction becomes periodic in a domain
(with integer N number of lattice constants a), i.e.
According to our operator notation (2.2.4.6)
, we have the following situation when the translation t is applied n times:
It follows immediately from the periodic boundary condition (2.2.4.9)
that
with the obvious solution
Here it is convenient to introduce a notation
so that we can write
. Note that k is quantized due to the periodic boundary conditions according to (2.2.4.13)
. Summarizing, we have the Bloch condition (for the one-dimensional case):
i.e. when we change x by one lattice constant a the wavefunction at x is multiplied by a phase factor
. At the moment (2.2.4.13)
suggests the use of k as label for the wavefunction
.
Generalization to three dimensions leads to the exponential with
and thus to the Bloch condition
or written in terms of the translational operator
[see (2.2.3.15)
]
The eigenfunctions that satisfy (2.2.4.17)
are called Bloch functions and have the form
where
is a periodic function in the lattice,
and
is a vector in the reciprocal lattice [see (2.2.2.6)
] that plays the role of the quantum number in solids. The
vector can be chosen in the first BZ, because any
that differs from
by just a lattice vector
of the reciprocal lattice has the same Bloch factor and the corresponding wavefunction
satisfies the Bloch condition again, since
where the factor
is unity according to (2.2.2.7)
. Since these two functions,
and
, belong to the same Bloch factor
they are equivalent. A physical interpretation of the Bloch states will be given in Section 2.2.8
.
Let us repeat a few fundamental definitions of group theory: For any symmetry operation , the product
can always be formed for any
and defines the conjugate element of
by g. Given any operation
, its class
is defined as the set of all its conjugates under all operations
. What we need here is an important property of classes, namely that no two classes have any element in common so that any group can be considered as a sum of classes.
Assuming periodic boundary conditions with number of primitive cells along the axes
, respectively, a lump of crystal with
unit cells is studied. The translation subgroup T contains the general translation operators
, which [using (2.2.3.15)
] can be written as
where each factor belongs to one of the three axes. Since T is commutative (Abelian), each operation of T is its own class and thus the number of classes equals its order, namely N. From the general theorem that the squares of the dimensions of all irreducible representations of a group must equal the order of the group, it follows immediately that all N irreducible representations of T must be one-dimensional (see also Section 1.2.3.2
of the present volume). Taking the subgroup along the
axis, we must have
different irreducible representations, which we label (for later convenience) by
and denote as
These representations are one-dimensional matrices, i.e. numbers, and must be exponentials, often chosen of the form
. The constant
must be related to the corresponding label of the irreducible representation. In the three-dimensional case, we have the corresponding representation
where we have used the definitions (2.2.2.6)
and (2.2.2.1)
. Within the present derivation, the vector
corresponds to the label of the irreducible representation of the lattice translation subgroup.
The free-electron model corresponds to the special case of taking a constant potential in the Schrödinger equation (2.2.4.1). The physical picture relies on the assumption that the (metallic) valence electrons can move freely in the field of the positively charged nuclei and the tightly bound core electrons. Each valence electron moves in a potential which is nearly constant due to the screening of the remaining valence electrons. This situation can be idealized by assuming the potential to be constant [
]. This simple picture represents a crude model for simple metals but has its importance mainly because the corresponding equation can be solved analytically. By rewriting equation (2.2.4.1)
, we have
where in the last step the constants are abbreviated (for later convenience) by
. The solutions of this equation are plane waves (PWs)
where C is a normalization constant which is defined from the integral over one unit cell with volume
. The PWs satisfy the Bloch condition and can be written (using the bra–ket notation) as
From (2.2.5.1)
we see that the corresponding energy (labelled by
) is given by
In this context it is useful to consider the momentum of the electron, which classically is the vector , where m and
are the mass and velocity, respectively. In quantum mechanics we must replace
by the corresponding operator
.
Thus a PW is an eigenfunction of the momentum operator with eigenvalue . Therefore the
vector is also called the momentum vector. Note that this is strictly true for a vanishing potential but is otherwise only approximately true (referred to as pseudomomentum).
Another feature of a PW is that its phase is constant in a plane perpendicular to the vector (see Fig. 2.2.5.1
). For this purpose, consider a periodic function in space and time,
which has a constant phase factor
within such a plane. We can characterize the spatial part by
within this plane. Taking the nearest parallel plane (with vector
) for which the same phase factors occur again but at a distance
away (with the unit vector
normal to the plane),
then
must differ from
by
. This is easily obtained from (2.2.5.7)
by multiplication with
leading to
Consequently
is the wavelength and thus the
vector is called the wavevector or propagation vector.
The effect of a space-group operation on a Bloch function, labelled by
, is to transform it into a Bloch function that corresponds to a vector
,
which can be proven by using the multiplication rule of Seitz operators (2.2.3.12)
and the definition of a Bloch state (2.2.4.17)
.
A special case is the inversion operator, which leads to The Bloch functions
and
, where p is any operation of the point group P, belong to the same basis for a representation of the space group G.
The same
cannot appear in two different bases, thus the two bases
and
are either identical or have no
in common.
Irreducible representations of T are labelled by the N distinct vectors in the BZ, which separate in disjoint bases of G (with no
vector in common). If a
vector falls on the BZ edge, application of the point-group operation p can lead to an equivalent
vector that differs from the original by
(a vector of the reciprocal lattice). The set of all mutually inequivalent
vectors of
(
) define the star of the k vector (
) (see also Section 1.2.3.3
of the present volume).
The set of all operations that leave a vector invariant (or transform it into an equivalent
) forms the group
of the
vector. Application of q, an element of
, to a Bloch function (Section 2.2.8
) gives
where the band index j (described below) may change to
. The Bloch factor stays constant under the operation of q and thus the periodic cell function
must show this symmetry, namely
For example, a
-like orbital may be transformed into a
-like orbital if the two are degenerate, as in a tetragonal lattice.
A star of determines an irreducible basis, provided that the functions of the star are symmetrized with respect to the irreducible representation of the group of
vectors, which are called small representations. The basis functions for the irreducible representations are given according to Seitz (1937
) by
written as a row vector
with
, where n is the dimension of the irreducible representation of
with the order
. Such a basis consists of
functions and forms an
-dimensional irreducible representation of the space group. The degeneracies of these representations come from the star of
(not crucial for band calculations except for determining the weight of the
vector) and the degeneracy from
. The latter is essential for characterizing the energy bands and using the compatibility relations (Bouckaert et al., 1930
; Bradley & Cracknell, 1972
).
Each irreducible representation of the space group, labelled by , denotes an energy
, where
varies quasi-continuously over the BZ and the superscript j numbers the band states. The quantization of
according to (2.2.4.13)
and (2.2.4.15)
can be done in arbitrary fine steps by choosing corresponding periodic boundary conditions (see Section 2.2.4.2
). Since
and
belong to the same Bloch state, the energy is periodic in reciprocal space:
Therefore it is sufficient to consider
vectors within the first BZ. For a given
, two bands will not have the same energy unless there is a multidimensional small representation in the group of
or the bands belong to different irreducible representations and thus can have an accidental degeneracy. Consequently, this can not occur for a general
vector (without symmetry).
The vector plays a fundamental role in the electronic structure of a solid. In the above, several interpretations have been given for the
vector that
Starting with one of the 14 Bravais lattices, one can define the reciprocal lattice [according to (2.2.2.4)] by the Wigner–Seitz construction as discussed in Section 2.2.2.2
. The advantage of using the BZ instead of the parallelepiped spanned by the three unit vectors is its symmetry. Let us take a simple example first, namely an element (say copper) that crystallizes in the face-centred-cubic (f.c.c.) structure. With (2.2.2.4)
we easily find that the reciprocal lattice is body-centred-cubic (bcc) and the corresponding BZ is shown in Fig. 2.2.7.1
. In this case, f.c.c. Cu has
symmetry with 48 symmetry operations
(point group). The energy eigenvalues within a star of
(i.e.
) are the same, and therefore it is sufficient to calculate one member in the star. Consequently, it is enough to consider the irreducible wedge of the BZ (called the IBZ). In the present example, this corresponds to 1/48th of the BZ shown in Fig. 2.2.7.1
. To count the number of states in the BZ, one counts each
point in the IBZ with a proper weight
to represent the star of this
vector.
![]() |
The Brillouin zone (BZ) and the irreducible wedge of the BZ for the f.c.c. direct lattice. After the corresponding figure from the Bilbao Crystallographic Server (http://www.cryst.ehu.es/ ). The IBZ for any space group can be obtained by using the option KVEC and specifying the space group (in this case No. 225). |
The BZ is purely constructed from the reciprocal lattice and thus only follows from the translational symmetry (of the 14 Bravais lattices). However, the energy bands , with
lying within the first BZ, possess a symmetry associated with one of the 230 space groups. Therefore one can not simply use the geometrical symmetry of the BZ to find its irreducible wedge, although this is tempting. Since the effort of computing energy eigenvalues increases with the number of
points, one wishes to restrict such calculations to the basic domain, but the latter can only be found by considering the space group of the corresponding crystal (including the basis with all atomic positions).
One possible procedure for finding the IBZ is the following. First a uniform grid in reciprocal space is generated by dividing the three unit-cell vectors by an integer number of times. This is easy to do in the parallelepiped, spanned by the three unit-cell vectors, and yields a (more-or-less) uniform grid of
points. Now one must go through the complete grid of
points and extract a list of non-equivalent
points by applying to each
point in the grid the point-group operations. If a
point is found that is already in the list, its weight is increased by 1, otherwise it is added to the list. This procedure can easily be programmed and is often used when
integrations are needed. The disadvantage of this scheme is that the generated
points in the IBZ are not necessarily in a connected region of the BZ, since one member of the star of
is chosen arbitrarily, namely the first that is found by going through the complete list.
We can provide a physical interpretation for a Bloch function by the following considerations. By combining the group-theoretical concepts based on the translational symmetry with the free-electron model, we can rewrite a Bloch function [see (2.2.4.18)] in the form
where
denotes the plane wave (ignoring normalization) in Dirac's ket notation (2.2.5.3)
. The additional superscript j denotes the band index associated with
(see Section 2.2.6.2
). The two factors can be interpreted most easily for the two limiting cases, namely:
According to another theorem, the mean velocity of an electron in a Bloch state with wavevector and energy
is given by
If the energy is independent of
, its derivative with respect to
vanishes and thus the corresponding velocity. This situation corresponds to the genuinely isolated atomic levels (with band width zero) and electrons that are tied to individual atoms. If, however, there is any nonzero overlap in the atomic wavefunctions, then
will not be constant throughout the zone.
In the general case, different notations are used to characterize band states. Sometimes it is more appropriate to label an energy band by the atomic level from which it originates, especially for narrow bands. In other cases (with a large band width) the free-electron behaviour may be dominant and thus the corresponding free-electron notation is more appropriate.
A description of the electronic structure of solids requires a quantum-mechanical (QM) treatment which can be parameterized (in semi-empirical schemes) but is often obtained from ab initio calculations. The latter are more demanding in terms of computational effort but they have the advantage that no experimental knowledge is needed in order to adjust parameters. The following brief summary is restricted to the commonly used types of ab initio methods and their main characteristics.
Hartree–Fock-based (HF-based) methods (for a general description see, for example, Pisani, 1996) are based on a wavefunction description (with one Slater determinant in the HF method). The single-particle HF equations (written for an atom in Rydberg atomic units) can be written in the following form, which is convenient for further discussions:
with terms for the kinetic energy, the nuclear electronic potential, the classical electrostatic Coulomb potential and the exchange, a function potential which involves the permutation operator
, which interchanges the arguments of the subsequent product of two functions. This exchange term can not be rewritten as a potential times the function
but is truly non-local (i.e. depends on
and
). The interaction of orbital j with itself (contained in the third term) is unphysical, but this self-interaction is exactly cancelled in the fourth term. This is no longer true in the approximate DFT method discussed below. The HF method treats exchange exactly but contains – by definition – no correlation effects. The latter can be added in an approximate form in post-HF procedures such as that proposed by Colle & Salvetti (1990
).
Density functional theory (DFT) is an alternative approach in which both effects, exchange and correlation, are treated in a combined scheme but both approximately. Several forms of DFT functionals are available now that have reached high accuracy, so many structural problems can be solved adequately. Further details will be given in Section 2.2.10.
Most calculations of the electronic structure in solids (Pisani, 1996; Singh, 1994
; Altmann, 1994
) use a linear combination of basis functions in one form or another but differ in the basis sets. Some use a linear combination of atomic orbitals (LCAO) where the AOs are given as Gaussian- or Slater-type orbitals (GTOs or STOs); others use plane-wave (PW) basis sets with or without augmentations; and still others make use of muffin-tin orbitals (MTOs) as in LMTO (linear combination of MTOs; Skriver, 1984
) or ASW (augmented spherical wave; Williams et al., 1979
). In the former cases, the basis functions are given in analytic form, but in the latter the radial wavefunctions are obtained numerically by integrating the radial Schrödinger equation (Singh, 1994
) (see Section 2.2.11
).
Closely related to the choice of basis sets is the explicit form of the wavefunctions, which can be well represented by them, whether they are nodeless pseudo-wavefunctions or all-electron wavefunctions including the complete radial nodal structure and a proper description close to the nucleus.
In the muffin-tin or the atomic sphere approximation (MTA or ASA), each atom in the crystal is surrounded by an atomic sphere in which the potential is assumed to be spherically symmetric [see (2.2.12.5) and the discussion thereof]. While these schemes work reasonably well in highly coordinated, closely packed systems (such as face-centred-cubic metals), they become very approximate in all non-isotropic cases (e.g. layered compounds, semiconductors, open structures or molecular crystals). Schemes that make no shape approximation in the form of the potential are termed full-potential schemes (Singh, 1994
; Blaha et al., 1990
; Schwarz & Blaha, 1996
).
With a proper choice of pseudo-potential one can focus on the valence electrons, which are relevant for chemical bonding, and replace the inner part of their wavefunctions by a nodeless pseudo-function that can be expanded in PWs with good convergence.
If a solid contains only light elements, non-relativistic calculations are well justified, but as soon as heavier elements are present in the system of interest relativistic effects can no longer be neglected. In the medium range of atomic numbers (up to about 54), so-called scalar relativistic schemes are often used (Koelling & Harmon, 1977), which describe the main contraction or expansion of various orbitals (due to the Darwin s-shift or the mass–velocity term) but omit spin–orbit splitting. Unfortunately, the spin–orbit term couples spin-up and spin-down wavefunctions. If one has n basis functions without spin–orbit coupling, then including spin–orbit coupling in the Hamiltonian would lead to a
matrix equation, which requires about eight times as much computer time to solve it (due to the
scaling). Since the spin–orbit effect is generally small (at least for the valence states), one can simplify the procedure by diagonalizing the Hamiltonian including spin–orbit coupling in the space of the low-lying bands as obtained in a scalar relativistic step. This version is called second variational method (see e.g. Singh, 1994
). For very heavy elements it may be necessary to solve Dirac's equation, which has all these terms (Darwin s-shift, mass–velocity and spin–orbit) included. Additional aspects are illustrated in Section 2.2.14
in connection with the uranium atom.
The most widely used scheme for calculating the electronic structure of solids is based on density functional theory (DFT). It is described in many excellent books, for example that by Dreizler & Gross (1990), which contains many useful definitions, explanations and references. Hohenberg & Kohn (1964
) have shown that for determining the ground-state properties of a system all one needs to know is the electron density
. This is a tremendous simplification considering the complicated wavefunction of a crystal with (in principle infinitely) many electrons. This means that the total energy of a system (a solid in the present case) is a functional of the density
, which is independent of the external potential provided by all nuclei. At first it was just proved that such a functional exists, but in order to make this fundamental theorem of practical use Kohn & Sham (1965
) introduced orbitals and suggested the following procedure.
In the universal approach of DFT to the quantum-mechanical many-body problem, the interacting system is mapped in a unique manner onto an effective non-interacting system of quasi-electrons with the same total density. Therefore the electron density plays the key role in this formalism. The non-interacting particles of this auxiliary system move in an effective local one-particle potential, which consists of a mean-field (Hartree) part and an exchange–correlation part that, in principle, incorporates all correlation effects exactly. However, the functional form of this potential is not known and thus one needs to make approximations.
Magnetic systems (with collinear spin alignments) require a generalization, namely a different treatment for spin-up and spin-down electrons. In this generalized form the key quantities are the spin densities , in terms of which the total energy
is
with the electronic contributions, labelled conventionally as, respectively, the kinetic energy (of the non-interacting particles), the electron–electron repulsion, the nuclear–electron attraction and the exchange–correlation energies. The last term
is the repulsive Coulomb energy of the fixed nuclei. This expression is still exact but has the advantage that all terms but one can be calculated very accurately and are the dominating (large) quantities. The exception is the exchange–correlation energy
, which is defined by (2.2.10.1)
but must be approximated. The first important methods for this were the local density approximation (LDA) or its spin-polarized generalization, the local spin density approximation (LSDA). The latter comprises two assumptions:
The most effective way known to minimize by means of the variational principle is to introduce (spin) orbitals
constrained to construct the spin densities [see (2.2.10.7)
below]. According to Kohn and Sham (KS), the variation of
gives the following effective one-particle Schrödinger equations, the so-called Kohn–Sham equations (Kohn & Sham, 1965
) (written for an atom in Rydberg atomic units with the obvious generalization to solids):
with the external potential (the attractive interaction of the electrons by the nucleus) given by
the Coulomb potential (the electrostatic interaction between the electrons) given by
and the exchange–correlation potential (due to quantum mechanics) given by the functional derivative
In the KS scheme, the (spin) electron densities are obtained by summing over all occupied states, i.e. by filling the KS orbitals (with increasing energy) according to the Aufbau principle. Here
are occupation numbers such that
, where
is the symmetry-required weight of point
. These KS equations (2.2.10.3)
must be solved self-consistently in an iterative process, since finding the KS orbitals requires the knowledge of the potentials, which themselves depend on the (spin) density and thus on the orbitals again. Note the similarity to (and difference from) the Hartree–Fock equation (2.2.9.1)
. This version of the DFT leads to a (spin) density that is close to the exact density provided that the DFT functional is sufficiently accurate.
In early applications, the local density approximation (LDA) was frequently used and several forms of functionals exist in the literature, for example by Hedin & Lundqvist (1971), von Barth & Hedin (1972
), Gunnarsson & Lundqvist (1976
), Vosko et al. (1980
) or accurate fits of the Monte Carlo simulations of Ceperley & Alder (1984
). The LDA has some shortcomings, mostly due to the tendency of overbinding, which causes, for example, too-small lattice constants. Recent progress has been made going beyond the LSDA by adding gradient terms or higher derivatives (
and
) of the electron density to the exchange–correlation energy or its corresponding potential. In this context several physical constraints can be formulated, which an exact theory should obey. Most approximations, however, satisfy only part of them. For example, the exchange density (needed in the construction of these two quantities) should integrate to
according to the Fermi exclusion principle (Fermi hole). Such considerations led to the generalized gradient approximation (GGA), which exists in various parameterizations, e.g. in the one by Perdew et al. (1996
). This is an active field of research and thus new functionals are being developed and their accuracy tested in various applications.
The Coulomb potential in (2.2.10.5)
is that of all N electrons. That is, any electron is also moving in its own field, which is physically unrealistic but may be mathematically convenient. Within the HF method (and related schemes) this self-interaction is cancelled exactly by an equivalent term in the exchange interaction [see (2.2.9.1)
]. For the currently used approximate density functionals, the self-interaction cancellation is not complete and thus an error remains that may be significant, at least for states (e.g. 4f or 5f) for which the respective orbital is not delocalized. Note that delocalized states have a negligibly small self-interaction. This problem has led to the proposal of self-interaction corrections (SICs), which remove most of this error and have impacts on both the single-particle eigenvalues and the total energy (Parr et al., 1978
).
The Hohenberg–Kohn theorems state that the total energy (of the ground state) is a functional of the density, but the introduction of the KS orbitals (describing quasi-electrons) are only a tool in arriving at this density and consequently the total energy. Rigorously, the Kohn–Sham orbitals are not electronic orbitals and the KS eigenvalues (which correspond to
in a solids) are not directly related to electronic excitation energies. From a formal (mathematical) point of view, the
are just Lagrange multipliers without a physical meaning.
Nevertheless, it is often a good approximation (and common practice) to partly ignore these formal inconsistencies and use the orbitals and their energies in discussing electronic properties. The gross features of the eigenvalue sequence depend only to a smaller extent on the details of the potential, whether it is orbital-based as in the HF method or density-based as in DFT. In this sense, the eigenvalues are mainly determined by orthogonality conditions and by the strong nuclear potential, common to DFT and the HF method.
In processes in which one removes (ionization) or adds (electron affinity) an electron, one compares the N electron system with one with or
electrons. Here another conceptual difference occurs between the HF method and DFT. In the HF method one may use Koopmans' theorem, which states that the
agree with the ionization energies from state i assuming that the corresponding orbitals do not change in the ionization process. In DFT, the
can be interpreted according to Janak's theorem (Janak, 1978
) as the partial derivative with respect to the occupation number
,
Thus in the HF method
is the total energy difference for
, in contrast to DFT where a differential change in the occupation number defines
, the proper quantity for describing metallic systems. It has been proven that for the exact density functional the eigenvalue of the highest occupied orbital is the first ionization potential (Perdew & Levy, 1983
). Roughly, one can state that the further an orbital energy is away from the highest occupied state, the poorer becomes the approximation to use
as excitation energy. For core energies the deviation can be significant, but one may use Slater's transition state (Slater, 1974
), in which half an electron is removed from the corresponding orbital, and then use the
to represent the ionization from that orbital.
Another excitation from the valence to the conduction band is given by the energy gap, separating the occupied from the unoccupied single-particle levels. It is well known that the gap is not given well by taking as excitation energy. Current DFT methods significantly underestimate the gap (half the experimental value), whereas the HF method usually overestimates gaps (by a factor of about two). A trivial solution, applying the `scissor operator', is to shift the DFT bands to agree with the experimental gap. An improved but much more elaborate approach for obtaining electronic excitation energies within DFT is the GW method in which quasi-particle energies are calculated (Hybertsen & Louie, 1984
; Godby et al., 1986
; Perdew, 1986
). This scheme is based on calculating the dielectric matrix, which contains information on the response of the system to an external perturbation, such as the excitation of an electron.
In some cases, one can rely on the total energy of the states involved. The original Hohenberg–Kohn theorems (Hohenberg & Kohn, 1964) apply only to the ground state. The theorems may, however, be generalized to the energetically lowest state of any symmetry representation for which any property is a functional of the corresponding density. This allows (in cases where applicable) the calculation of excitation energies by taking total energy differences.
Many aspects of DFT from formalism to applications are discussed and many references are given in the book by Springborg (1997).
There are several methods for calculating the electronic structure of solids. They have advantages and disadvantages, different accuracies and computational requirements (speed or memory), and are based on different approximations. Some of these aspects have been discussed in Section 2.2.9. This is a rapidly changing field and thus only the basic concepts of a few approaches in current use are outlined below.
For the description of crystalline wavefunctions (Bloch functions), one often starts with a simple concept of placing atomic orbitals (AOs) at each site in a crystal denoted by , from which one forms Bloch sums in order to have proper translational symmetry:
Then Bloch functions can be constructed by taking a linear combination of such Bloch sums, where the linear-combination coefficients are determined by the variational principle in which a secular equation must be solved. The LCAO can be used in combination with both the Hartree–Fock method and DFT.
A simple version of the LCAO is found by parameterizing the matrix elements and
in a way similar to the Hückel molecular orbital (HMO) method, where the only non-vanishing matrix elements are the on-site integrals and the nearest-neighbour interactions (hopping integrals). For a particular class of solids the parameters can be adjusted to fit experimental values. With these parameters, the electronic structures of rather complicated solids can be described and yield quite satisfactory results, but only for the class of materials for which such a parametrization is available. Chemical bonding and symmetry aspects can be well described with such schemes, as Hoffmann has illustrated in many applications (Hoffmann, 1988
). In more complicated situations, however, such a simple scheme fails.
In many respects, core electrons are unimportant for determining the stability, structure and low-energy response properties of crystals. It is a well established practice to modify the one-electron part of the Hamiltonian by replacing the bare nuclear attraction with a pseudo-potential (PP) operator, which allows us to restrict our calculation to the valence electrons. The PP operator must reproduce screened nuclear attractions, but must also account for the Pauli exclusion principle, which requires that valence orbitals are orthogonal to core ones. The PPs are not uniquely defined and thus one seeks to satisfy the following characteristics as well as possible:
There are many versions of the PP method (norm-conserving, ultrasoft etc.) and the actual accuracy of a calculation is governed by which is used. For standard applications, PP techniques can be quite successful in solid-state calculations. However, there are cases that require higher accuracy, e.g. when core electrons are involved, as in high-pressure studies or electric field gradient calculations (see Section 2.2.15), where the polarization of the charge density close to the nucleus is crucial for describing the physical effects properly.
The partition of space (i.e. the unit cell) between (non-overlapping) atomic spheres and an interstitial region (see Fig. 2.2.12.1) is used in several schemes, one of which is the augmented plane wave (APW) method, originally proposed by Slater (Slater, 1937
) and described by Loucks (1967
), and its linearized version (the LAPW method), which is chosen as the one representative method that is described in detail in Section 2.2.12
.
The basis set is constructed using the muffin-tin approximation (MTA) for the potential [see the discussion below in connection with (2.2.12.5)]. In the interstitial region the wavefunction is well described by plane waves, but inside the spheres atomic-like functions are used which are matched continuously (at the sphere boundary) to each plane wave.
In the KKR scheme (Korringa, 1947; Kohn & Rostocker, 1954
), the solution of the KS equations (2.2.10.3)
uses a Green-function technique and solves a Lippman–Schwinger integral equation. The basic concepts come from a multiple scattering approach which is conceptually different but mathematically equivalent to the APW method. The building blocks are spherical waves which are products of spherical harmonics and spherical Hankel, Bessel and Neumann functions. Like plane waves, they solve the KS equations for a constant potential. Augmenting the spherical waves with numerical solutions inside the atomic spheres as in the APW method yields the KKR basis set. Compared with methods based on plane waves, spherical waves require fewer basis functions and thus smaller secular equations.
The radial functions in the APW and KKR methods are energy-dependent and so are the corresponding basis functions. This leads to a nonlinear eigenvalue problem that is computationally demanding. Andersen (1975) modelled the weak energy dependence by a Taylor expansion where only the first term is kept and thereby arrived at the so-called linear methods LMTO and LAPW.
The LMTO method (Andersen, 1975; Skriver, 1984
) is the linearized counterpart to the KKR method, in the same way as the LAPW method is the linearized counterpart to the APW method. This widely used method originally adopted the atomic sphere approximation (ASA) with overlapping atomic spheres in which the potential was assumed to be spherically symmetric. Although the ASA simplified the computation so that systems with many atoms could be studied, the accuracy was not high enough for application to certain questions in solid-state physics.
Following the ideas of Andersen, the augmented spherical wave (ASW) method was developed by Williams et al. (1979). The ASW method is quite similar to the LMTO scheme.
It should be noted that the MTA and the ASA are not really a restriction on the method. In particular, when employing the MTA only for the construction of the basis functions but including a generally shaped potential in the construction of the matrix elements, one arrives at a scheme of very high accuracy which allows, for instance, the evaluation of elastic properties. Methods using the unrestricted potential together with basis functions developed from the muffin-tin potential are called full-potential methods. Now for almost every method based on the MTA (or ASA) there exists a counterpart employing the full potential.
Conventional quantum-mechanical calculations are done using the Born–Oppenheimer approximation, in which one assumes (in most cases to a very good approximation) that the electrons are decoupled from the nuclear motion. Therefore the electronic structure is calculated for fixed atomic (nuclear) positions. Car & Parrinello (1985) suggested a new method in which they combined the motion of the nuclei (at finite temperature) with the electronic degrees of freedom. They started with a fictitious Lagrangian in which the wavefunctions follow a dynamics equation of motion. Therefore, the CP method combines the motion of the nuclei (following Newton's equation) with the electrons (described within DFT) into one formalism by solving equations of motion for both subsystems. This simplifies the computational effort and allows ab initio molecular dynamics calculations to be performed in which the forces acting on the atoms are calculated from the wavefunctions within DFT. The CP method has attracted much interest and is widely used, with a plane-wave basis, extended with pseudo-potentials and recently enhanced into an all-electron method using the projector augmented wave (PAW) method (Blöchl, 1994
). Such CP schemes can also be used to find equilibrium structures and to explore the electronic structure.
The various techniques outlined so far have one thing in common, namely the scaling. In a system containing N atoms the computational effort scales as , since one must determine a number of orbitals that is proportional to N which requires diagonalization of
matrices, where the prefactor k depends on the basis set and the method used. In recent years much work has been done to devise algorithms that vary linearly with N, at least for very large N (Ordejon et al., 1995
). First results are already available and look promising. When such schemes become generally available, it will be possible to study very large systems with relatively little computational effort. This interesting development could drastically change the accessibility of electronic structure results for large systems.
The electronic structure of solids can be calculated with a variety of methods as described above (Section 2.2.11). One representative example is the (full-potential) linearized augmented plane wave (LAPW) method. The LAPW method is one among the most accurate schemes for solving the effective one-particle (the so-called Kohn–Sham) equations (2.2.10.3)
and is based on DFT (Section 2.2.10
) for the treatment of exchange and correlation.
The LAPW formalism is described in many references, starting with the pioneering work by Andersen (1975) and by Koelling & Arbman (1975
), which led to the development and the description of the computer code WIEN (Blaha et al., 1990
; Schwarz & Blaha, 1996
). An excellent book by Singh (1994
) is highly recommended to the interested reader. Here only the basic ideas are summarized, while details are left to the articles and references therein.
In the LAPW method, the unit cell is partitioned into (non-overlapping) atomic spheres centred around the atomic sites (type I) and an interstitial region (II) as shown schematically in Fig. 2.2.12.1. For the construction of basis functions (and only for this purpose), the muffin-tin approximation (MTA) is used. In the MTA, the potential is assumed to be spherically symmetric within the atomic spheres but constant outside; in the former atomic-like functions and in the latter plane waves are used in order to adapt the basis set optimally to the problem. Specifically, the following basis sets are used in the two types of regions:
The solutions to the Kohn–Sham equations are expanded in this combined basis set of LAPWs, where the coefficients
are determined by the Rayleigh–Ritz variational principle. The convergence of this basis set is controlled by the number of PWs, i.e. by the magnitude of the largest
vector in equation (2.2.12.3)
.
In order to improve upon the linearization (i.e. to increase the flexibility of the basis) and to make possible a consistent treatment of semi-core and valence states in one energy window (to ensure orthogonality), additional (-independent) basis functions can be added. They are called `local orbitals' (Singh, 1994
) and consist of a linear combination of two radial functions at two different energies (e.g. at the
and
energy) and one energy derivative (at one of these energies):
The coefficients
,
and
are determined by the requirements that
should be normalized and has zero value and slope at the sphere boundary.
In its general form, the LAPW method expands the potential in the following form:where
are the crystal harmonics compatible with the point-group symmetry of the corresponding atom represented in a local coordinate system (see Section 2.2.13
). An analogous expression holds for the charge density. Thus no shape approximations are made, a procedure frequently called the `full-potential LAPW' (FLAPW) method.
The muffin-tin approximation (MTA) used in early band calculations corresponds to retaining only the and
component in the first expression of (2.2.12.5)
and only the
component in the second. This (much older) procedure corresponds to taking the spherical average inside the spheres and the volume average in the interstitial region. The MTA was frequently used in the 1970s and works reasonable well in highly coordinated (metallic) systems such as face-centred-cubic (f.c.c.) metals. For covalently bonded solids, open or layered structures, however, the MTA is a poor approximation and leads to serious discrepancies with experiment. In all these cases a full-potential treatment is essential.
The choice of sphere radii is not very critical in full-potential calculations, in contrast to the MTA, where this choice may affect the results significantly. Furthermore, different radii would be found when one uses one of the two plausible criteria, namely based on the potential (maximum between two adjacent atoms) or the charge density (minimum between two adjacent atoms). Therefore in the MTA one must make a compromise, whereas in full-potential calculations this problem practically disappears.
The partition of a crystal into atoms (or molecules) is ambiguous and thus the atomic contribution cannot be defined uniquely. However, whatever the definition, it must follow the relevant site symmetry for each atom. There are at least two reasons why one would want to use a local coordinate system at each atomic site: the concept of crystal harmonics and the interpretation of bonding features.
All spatial observables of the bound atom (e.g. the potential or the charge density) must have the crystal symmetry, i.e. the point-group symmetry around an atom. Therefore they must be representable as an expansion in terms of site-symmetrized spherical harmonics. Any point-symmetry operation transforms a spherical harmonic into another of the same . We start with the usual complex spherical harmonics,
which satisfy Laplacian's differential equation. The
are the associated Legendre polynomials and the normalization
is according to the convention of Condon & Shortley (1953
). For the
-dependent part one can use the real and imaginary part and thus use
and
instead of the
functions, but we must introduce a parity p to distinguish the functions with the same
. For convenience we take real spherical harmonics, since physical observables are real. The even and odd polynomials are given by the combination of the complex spherical harmonics with the parity p either
or − by
The expansion of – for example – the charge density around an atomic site can be written using the LAPW method [see the analogous equation (2.2.12.5)
for the potential] in the form
where we use capital letters
for the indices (i) to distinguish this expansion from that of the wavefunctions in which complex spherical harmonics are used [see (2.2.12.1)
] and (ii) to include the parity p in the index M (which represents the combined index
). With these conventions,
can be written as a linear combination of real spherical harmonics
which are symmetry-adapted to the site symmetry,
i.e. they are either
[(2.2.13.2)
] in the non-cubic cases (Table 2.2.13.1
) or are well defined combinations of
's in the five cubic cases (Table 2.2.13.2
), where the coefficients
depend on the normalization of the spherical harmonics and can be found in Kurki-Suonio (1977
).
|
|
According to Kurki-Suonio, the number of (non-vanishing) terms [e.g. in (2.2.13.3)
] is minimized by choosing for each atom a local Cartesian coordinate system adapted to its site symmetry. In this case, other
terms would vanish, so using only these terms corresponds to the application of a projection operator, i.e. equivalent to averaging the quantity of interest [e.g.
] over the star of
. Note that in another coordinate system (for the L values listed) additional M terms could appear. The group-theoretical derivation led to rules as to how the local coordinate system must be chosen. For example, the z axis is taken along the highest symmetry axis, or the x and y axes are chosen in or perpendicular to mirror planes. Since these coordinate systems are specific for each atom and may differ from the (global) crystal axes, we call them `local' coordinate systems, which can be related by a transformation matrix to the global coordinate system of the crystal.
The symmetry constraints according to (2.2.13.4) are summarized by Kurki-Suonio, who has defined picking rules to choose the local coordinate system for any of the 27 non-cubic site symmetries (Table 2.2.13.1
) and has listed the
combinations, which are defined by (a linear combination of) functions
[see (2.2.13.2)
]. If the
parity appears, both the
and the − combination must be taken. An application of a local coordinate system to rutile TiO2 is described in Section 2.2.16.2
.
In the case of the five cubic site symmetries, which all have a threefold axis in (111), a well defined linear combination of functions (given in Table 2.2.13.2
) leads to the cubic harmonics.
Chemical bonding is often described by considering orbitals (e.g. a or a
atomic orbital) which are defined in polar coordinates, where the z axis is special, in contrast to Cartesian coordinates, where x, y and z are equivalent. Consider for example an atom coordinated by ligands (e.g. forming an octahedron). Then the application of group theory, ligand-field theory etc. requires a certain coordinate system provided one wishes to keep the standard notation of the corresponding spherical harmonics. If this octahedron is rotated or tilted with respect to the global (unit-cell) coordinate system, a local coordinate system is needed to allow an easy orbital interpretation of the interactions between the central atom and its ligands. This applies also to spectroscopy or electric field gradients.
The two types of reasons mentioned above may or may not lead to the same choice of a local coordinate system, as is illustrated for the example of rutile in Section 2.2.16.2.
The electronic structure of a solid is specified by energy bands and the corresponding wavefunctions, the Bloch functions
. In order to characterize energy bands there are various schemes with quite different emphasis. The most important concepts are described below and are illustrated using selected examples in the following sections.
The energy bands are primarily characterized by the wavevector in the first BZ that is associated with the translational symmetry according to (2.2.4.23)
. The star of
determines an irreducible basis provided that the functions of the star are symmetrized with respect to the small representations, as discussed in Section 2.2.6.
Along symmetry lines in the BZ (e.g. from
along
towards X in the BZ shown in Fig. 2.2.7.1
), the corresponding group of the
vector may show a group–subgroup relation, as for example for
and
. The corresponding irreducible representations can then be found by deduction (or by induction in the case of a group–supergroup relation). These concepts define the compatibility relations (Bouckaert et al., 1930
; Bradley & Cracknell, 1972
), which tell us how to connect energy bands. For example, the twofold degenerate representation
(the
symmetry in a cubic system) splits into the
and
manifold in the
direction, both of which are one-dimensional. The compatibility relations tell us how to connect bands. In addition, one can also find an orbital representation and thus knows from the group-theoretical analysis which orbitals belong to a certain energy band. This is very useful for interpretations.
In chemistry and physics it is quite common to separate the electronic states of an atom into those from core and valence electrons, but sometimes this distinction is not well defined, as will be discussed in connection with the so-called semi-core states. For the sake of argument, let us discuss the situation in a solid using the concepts of the LAPW method, keeping in mind that very similar considerations hold for all other band-structure schemes.
A core state is characterized by a low-lying energy (i.e. with a large negative energy value with respect to the Fermi energy) and a corresponding wavefunction that is completely confined inside the sphere of the respective atom. Therefore there is effectively no overlap with the wavefunctions from neighbouring atoms and, consequently, the associated band width is practically zero.
The valence electrons occupy the highest states and have wavefunctions that strongly overlap with their counterparts at adjacent sites, leading to chemical bonding, large dispersion (i.e. a strong variation of the band energy with ) and a significant band width.
The semi-core states are in between these two categories. For example, the 3s and 3p states of the 3d transition metals belong here. They are about 2–6 Ry (1 Ry = 13.6 eV) below the valence bands and have most of the wavefunctions inside their atomic spheres, but a small fraction (a few per cent) of the corresponding charge lies outside this sphere. This causes weak interactions with neighbouring atoms and a finite width of the corresponding energy bands.
Above the valence states are the unoccupied states, which often (e.g. in DFT or the HF method) require special attention.
For interpreting chemical bonding or the physical origin of a given Bloch state at , a decomposition according to its wavefunction is extremely useful but always model-dependent. The charge density
corresponding to the Bloch state at
can be normalized to one per unit cell and is (in principle) an observable, while its decomposition depends on the model used. The following considerations are useful in this context:
Simple metals with valence electrons originating from s- and p-type orbitals form wide bands which are approximately free-electron like (with a large band width W). Such a case corresponds to itinerant electrons that are delocalized and thus cause metallic conductivity.
The other extreme case is a system with 4f (and some 5f) electrons, such as the lanthanides. Although the orbital energies of these electrons are in the energy range of the valence electrons, they act more like core electrons and thus are tightly bound to the corresponding atomic site. Such electrons are termed localized, since they do not hop to neighbouring sites (controlled by a hopping parameter t) and thus do not contribute to metallic conductivity. Adding another of these electrons to a given site would increase the Coulomb repulsion U. A large U (i.e. t) prevents them from hopping.
There are – as usual – borderline cases (e.g. the late 3d transition-metal oxides) in which a delicate balance between t and U, the energy gain by delocalizing electrons and the Coulomb repulsion, determines whether a system is metallic or insulating. This problem of metal/insulator transitions is an active field of research of solid-state physics which shall not be discussed here.
In one example, however, the dual role of f electrons is illustrated for the uranium atom using relativistic wavefunctions (with a large and a small component) characterized by the quantum numbers n, and j. Fig. 2.2.14.1
shows the outermost lobe (the large component) of the electrons beyond the [Xe] core without the 4f and 5d core-like states. One can see the
,
and
(semi-core) electrons, and the
and
(valence) electrons.
![]() |
Relativistic radial wavefunctions (large component) of the uranium atom. Shown are the outer lobes of valence and semi-core states excluding the [Xe] core, and the 4f and 5d core states. |
On the one hand, the radial wavefunction of the orbital has its peak closer to the nucleus than the main lobes of the semi-core states
,
and
, and thus demonstrates the core nature of these 5f electrons. On the other hand, the
orbital decays (with distance) much less than the semi-core states and electrons in this orbital can thus also play the role of valence electrons, like electrons in the
and
orbitals. This dual role of the f electrons has been discussed, for example, by Schwarz & Herzig (1979
).
In a non-fully-relativistic treatment, spin remains a good quantum number. Associated with the spin is a spin magnetic moment. If atoms have net magnetic moments they can couple in various orders in a solid. The simplest cases are the collinear spin alignments as found in ferromagnetic (FM) or antiferromagnetic (AF) systems with parallel (FM) and antiparallel (AF) moments on neighbouring sites. Ferrimagnets have opposite spin alignments but differ in the magnitude of their moments on neighbouring sites, leading to a finite net magnetization. These cases are characterized by the electronic structure of spin-up and spin-down electrons. More complicated spin structures (e.g. canted spins, spin spirals, spin glasses) often require a special treatment beyond simple spin-polarized calculations. In favourable cases, however, as in spin spirals, it is possible to formulate a generalized Bloch theorem and treat such systems by band theory (Sandratskii, 1990).
In a fully relativistic formalism, an additional orbital moment may occur. Note that the orientation of the total magnetic moment (spin and orbital moment) with respect to the crystal axis is only defined in a relativistic treatment including spin–orbit interactions. In a spin-polarized calculation without spin–orbit coupling this is not the case and only the relative orientation (majority-spin and minority-spin) is known. The magnetic structures may lead to a lowering of symmetry, a topic beyond this book.
The density of states (DOS) is the number of one-electron states (in the HF method or DFT) per unit energy interval and per unit cell volume. It is better to start with the integral quantity , the number of states below a certain energy
,
where
is the volume of the BZ, the factor 2 accounts for the occupation with spin-up and spin-down electrons (in a non-spin-polarized case), and
is the step function, the value of which is 1 if
is less than
and 0 otherwise. The sum over
points has been replaced by an integral over the BZ, since the
points are uniformly distributed. Both expressions, sum and integral, are used in different derivations or applications. The Fermi energy is defined by imposing that
, the number of (valence) electrons per unit cell.
The total DOS is defined as the energy derivative of as
with the normalization
where the integral is taken from
if all core states are included or from the bottom of the valence bands, often taken to be at zero. This defines the Fermi energy (note that the energy range must be consistent with N). In a bulk material, the origin of the energy scale is arbitrary and thus only relative energies are important. In a realistic case with a surface (i.e. a vacuum) one can take the potential at infinity as the energy zero, but this situation is not discussed here.
The total DOS can be decomposed into a partial (or projected) DOS by using information from the wavefunctions as described above in Section 2.2.14.3.
If the charge corresponding to the wavefunction of an energy state is partitioned into contributions from the atoms, a site-projected DOS can be defined as
, where the superscript t labels the atom t. These quantities can be further decomposed into
-like contributions within each atom to give
. As discussed above for the partial charges, a further partitioning of the
-like terms according to the site symmetry (point group) can be done (in certain cases) by taking the proper m combinations, e.g. the
and
manifold of the fivefold degenerate d orbitals in a octahedral ligand field. The latter scheme requires a local coordinate system in which the spherical harmonics are defined (see Section 2.2.13
). In this context all considerations as discussed above for the partial charges apply again. Note in particular the difference between site-centred and spatially decomposed wavefunctions, which affects the partition of the DOS into its wavefunction-dependent contributions. For example, in atomic sphere representations as in LAPW we have the decomposition
In the case of spin-polarized calculations, one can also define a spin-projected DOS for spin-up and spin-down electrons.
The study of hyperfine interactions is a powerful way to characterize different atomic sites in a given sample. There are many experimental techniques, such as Mössbauer spectroscopy, nuclear magnetic and nuclear quadrupole resonance (NMR and NQR), perturbed angular correlations (PAC) measurements etc., which access hyperfine parameters in fundamentally different ways. Hyperfine parameters describe the interaction of a nucleus with the electric and magnetic fields created by the chemical environment of the corresponding atom. Hence the resulting level splitting of the nucleus is determined by the product of a nuclear and an extra-nuclear quantity. In the case of quadrupole interactions, the nuclear quantity is the nuclear quadrupole moment (Q) that interacts with the electric field gradient (EFG) produced by the charges outside the nucleus. For a review see, for example, Kaufmann & Vianden (1979).
The EFG tensor is defined by the second derivative of the electrostatic potential V with respect to the Cartesian coordinates , i = 1, 2, 3, taken at the nuclear site n,
where the second term is included to make it a traceless tensor. This is more appropriate, since there is no interaction of a nuclear quadrupole and a potential caused by s electrons. From a theoretical point of view it is more convenient to use the spherical tensor notation because electrostatic potentials (the negative of the potential energy of the electron) and the charge densities are usually given as expansions in terms of spherical harmonics. In this way one automatically deals with traceless tensors (for further details see Herzig, 1985
).
The analysis of experimental results faces two obstacles: (i) The nuclear quadrupole moments (Pyykkö, 1992) are often known only with a large uncertainty, as this is still an active research field of nuclear physics. (ii) EFGs depend very sensitively on the anisotropy of the charge density close to the nucleus, and thus pose a severe challenge to electronic structure methods, since an accuracy of the density in the per cent range is required.
In the absence of a better tool, a simple point-charge model was used in combination with so-called Sternheimer (anti-) shielding factors in order to interpret the experimental results. However, these early model calculations depended on empirical parameters, were not very reliable and often showed large deviations from experimental values.
In their pioneering work, Blaha et al. (1985) showed that the LAPW method was able to calculate EFGs in solids accurately without empirical parameters. Since then, this method has been applied to a large variety of systems (Schwarz & Blaha, 1992
) from insulators (Blaha et al., 1985
), metals (Blaha et al., 1988
) and superconductors (Schwarz et al., 1990
) to minerals (Winkler et al., 1996
).
Several other electronic structure methods have been applied to the calculation of EFGs in solids, for example the LMTO method for periodic (Methfessl & Frota-Pessoa, 1990) or non-periodic (Petrilli & Frota-Pessoa, 1990
) systems, the KKR method (Akai et al., 1990
), the DVM (discrete variational method; Ellis et al., 1983
), the PAW method (Petrilli et al., 1998
) and others (Meyer et al., 1995
). These methods achieve different degrees of accuracy and are more or less suitable for different classes of systems.
As pointed out above, measured EFGs have an intrinsic uncertainty related to the accuracy with which the nuclear quadrupole moment is known. On the other hand, the quadrupole moment can be obtained by comparing experimental hyperfine splittings with very accurate electronic structure calculations. This has recently been done by Dufek et al. (1995a) to determine the quadrupole moment of 57Fe. Hence the calculation of accurate EFGs is to date an active and challenging research field.
The nuclear quadrupole interaction (NQI) represents the interaction of Q (the nuclear quadrupole moment) with the electric field gradient (EFG) created by the charges surrounding the nucleus, as described above. Here we briefly summarize the main ideas (following Petrilli et al., 1998) and provide conversions between experimental NQI splittings and electric field gradients.
Let us consider a nucleus in a state with nuclear spin quantum number with the corresponding nuclear quadrupole moment
, where
is the nuclear charge density around point
and e is the proton's charge. The interaction of this Q with an electric field gradient tensor
,
splits the energy levels
for different magnetic spin quantum numbers
of the nucleus according to
in first order of
, where Q represents the largest component of the nuclear quadrupole moment tensor in the state characterized by
. (Note that the quantum-mechanical expectation value of the charge distribution in an angular momentum eigenstate is cylindrical, which renders the expectation value of the remaining two components with half the value and opposite sign.) The conventional choice is
. Hence,
is the principal component (largest eigenvalue) of the electric field gradient tensor and the asymmetry parameter
is defined by the remaining two eigenvalues
through
(2.2.15.3)
shows that the electric quadrupole interaction splits the (
)-fold degenerate energy levels of a nuclear state with spin quantum number I (
) into I doubly degenerate substates (and one singly degenerate state for integer I). Experiments determine the energy difference
between the levels, which is called the quadrupole splitting. The remaining degeneracy can be lifted further using magnetic fields.
Next we illustrate these definitions for 57Fe, which is the most common probe nucleus in Mössbauer spectroscopy measurements and thus deserves special attention. For this probe, the nuclear transition occurs between the excited state and
ground state, with a 14.4 KeV
radiation emission. The quadrupole splitting between the
and the
state can be obtained by exploiting the Doppler shift of the
radiation of the vibrating sample.
For systems in which the 57Fe nucleus has a crystalline environment with axial symmetry (a threefold or fourfold rotation axis), the asymmetry parameter
is zero and
is given directly by
As
can never be greater than unity, the difference between the values of
given by equation (2.2.15.5)
and equation (2.2.15.6)
cannot be more than about 15%. In the remainder of this section we simplify the expressions, as is often done, by assuming that
. As Mössbauer experiments exploit the Doppler shift of the
radiation, the splitting is expressed in terms of the velocity between sample and detector. The quadrupole splitting can be obtained from the velocity, which we denote here by
, by
where c = 2.9979245580 × 108 m s−1 is the speed of light and Eγ = 14.41 × 103 eV is the energy of the emitted
radiation of the 57Fe nucleus.
Finally, we still need to know the nuclear quadrupole moment Q of the Fe nucleus itself. Despite its utmost importance, its value has been heavily debated. Recently, however, Dufek et al. (1995b) have determined the value Q = 0.16 b for 57Fe (1 b = 10−28 m2) by comparing for fifteen different compounds theoretical
values, which were obtained using the linearized augmented plane wave (LAPW) method, with the measured quadrupole splitting at the Fe site.
Now we relate the electric field gradient to the Doppler velocity via
In the special case of the 57Fe nucleus, we obtain
EFGs can also be obtained by techniques like NMR or NQR, where a convenient measure of the strength of the quadrupole interaction is expressed as a frequency
, related to
by
The value
can then be calculated from the frequency in MHz by
where (h/e) = 4.1356692 × 10−15 V Hz−1. The principal component
is also often denoted as
.
In the literature, two conflicting definitions of are in use. One is given by (2.2.15.10)
, and the other, defined as
differs from the first by a factor of 2 and assumes the value
. Finally, the definition of
has been introduced here. In order to avoid confusion, we will refer here only to the definition given by (2.2.15.10)
. Furthermore, we also adopt the same sign convention for
as Schwarz et al. (1990
) because it has been found to be consistent with the majority of experimental results.
Since the EFG is a ground-state property that is uniquely determined by the charge density distribution (of electrons and nuclei), it can be calculated within DFT without further approximations. Here we describe the basic formalism to calculate EFGs with the LAPW method (see Section 2.2.12). In the LAPW method, the unit cell is divided into non-overlapping atomic spheres and an interstitial region. Inside each sphere the charge density (and analogously the potential) is written as radial functions
times crystal harmonics (2.2.13.4)
and in the interstitial region as Fourier series:
The charge density coefficients
can be obtained from the wavefunctions (KS orbitals) by (in shorthand notation)
where
are Gaunt numbers (integrals over three spherical harmonics) and
denote the LAPW radial functions [see (2.2.12.1)
] of the occupied states
below the Fermi energy
. The dependence on the energy bands in
has been omitted in order to simplify the notation.
For a given charge density, the Coulomb potential is obtained numerically by solving Poisson's equation in form of a boundary-value problem using a method proposed by Weinert (1981). This yields the Coulomb potential coefficients
in analogy to (2.2.15.13)
[see also (2.2.12.5)
]. The most important contribution to the EFG comes from a region close to the nucleus of interest, where only the
terms are needed (Herzig, 1985
). In the limit
(the position of the nucleus), the asymptotic form of the potential
can be used and this procedure yields (Schwarz et al., 1990
) for
:
with
,
and the spherical Bessel function
. The first term in (2.2.15.15)
(called the valence EFG) corresponds to the integral over the respective atomic sphere (with radius R). The second and third terms in (2.2.15.15)
(called the lattice EFG) arise from the boundary-value problem and from the charge distribution outside the sphere considered. Note that our definition of the lattice EFG differs from that based on the point-charge model (Kaufmann & Vianden, 1979
). With these definitions the tensor components are given as
where
and the index M combines m and the partity p (e.g.
). Note that the prefactors depend on the normalization used for the spherical harmonics.
The non-spherical components of the potential come from the non-spherical charge density
. For the EFG only the
terms (in the potential) are needed. If the site symmetry does not contain such a non-vanishing term (as for example in a cubic system with
in the lowest
combination), the corresponding EFG vanishes by definition. According to the Gaunt numbers in (2.2.15.14)
only a few non-vanishing terms remain (ignoring f orbitals), such as the p–p, d–d or s–d combinations (for f orbitals, p–f and f–f would appear), where this shorthand notation denotes the products of the two radial functions
. The s–d term is often small and thus is not relevant to the interpretation. This decomposition of the density can be used to partition the EFG (illustrated for the
component),
where the superscripts p and d are a shorthand notation for the product of two p- or d-like functions.
From our experience we find that the first term in (2.2.15.15) is usually by far the most important and often a radial range up to the first node in the corresponding radial function is all that contributes. In this case the contribution from the other two terms is rather small (a few per cent). For first-row elements, however, which have no node in their 2p functions, this is no longer true and thus the first term amounts only to about 50–70%.
In some cases interpretation is simplified by defining a so-called asymmetry count, illustrated below for the oxygen sites in YBa2Cu3O7 (Schwarz et al., 1990), the unit cell of which is shown in Fig. 2.2.15.1
.
In this case essentially only the O 2p orbitals contribute to the O EFG. Inside the oxygen spheres (all taken with a radius of 0.82 Å) we can determine the partial charges corresponding to the
,
and
orbitals, denoted in short as
,
and
charges.
With these definitions we can define the p-like asymmetry count as and obtain the proportionality
where
is the expectation value taken with the p orbitals. A similar equation can be defined for the d orbitals. The factor
enhances the EFG contribution from the density anisotropies close to the nucleus. Since the radial wavefunctions have an asymptotic behaviour near the origin as
, the p orbitals are more sensitive than the d orbitals. Therefore even a very small p anisotropy can cause an EFG contribution, provided that the asymmetry count is enhanced by a large expectation value.
Often the anisotropy in the ,
and
occupation numbers can be traced back to the electronic structure. Such a physical interpretation is illustrated below for the four non-equivalent oxygen sites in YBa2Cu3O7 (Table 2.2.15.1
). Let us focus first on O1, the oxygen atom that forms the linear chain with the Cu1 atoms along the b axis. In this case, the
orbital of O1 points towards Cu1 and forms a covalent bond, leading to bonding and antibonding states, whereas the other two p orbitals have no bonding partner and thus are essentially nonbonding. Part of the corresponding antibonding states lies above the Fermi energy and thus is not occupied, leading to a smaller
charge of 0.91 e, in contrast to the fully occupied nonbonding states with occupation numbers around 1.2 e. (Note that only a fraction of the charge stemming from the oxygen 2p orbitals is found inside the relatively small oxygen sphere.) This anisotropy causes a finite asymmetry count [(2.2.15.18)
] that leads – according to (2.2.15.19)
– to a corresponding EFG.
|
In this simple case, the anisotropy in the charge distribution, given here by the different p occupation numbers, is directly proportional to the EFG, which is given with respect to the crystal axes and is thus labelled ,
and
(Table 2.2.15.1
). The principal component of the EFG is in the direction where the p occupation number is smallest, i.e. where the density has its highest anisotropy. The other oxygen atoms behave very similarly: O2, O3 and O4 have a near neighbour in the a, b and c direction, respectively, but not in the other two directions. Consequently, the occupation number is lower in the direction in which the bond is formed, whereas it is normal (around 1.2 e) in the other two directions. The principal axis falls in the direction of the low occupation. The higher the anisotropy, the larger the EFG (compare O1 with the other three oxygen sites). Excellent agreement with experiment is found (Schwarz et al., 1990
). In a more complicated situation, where p and d contributions to the EFG occur [see (2.2.15.17)
], which often have opposite sign, the interpretation can be more difficult [see e.g. the copper sites in YBa2Cu3O7; Schwarz et al. (1990
)].
The importance of semi-core states has been illustrated for rutile, where the proper treatment of 3p and 4p states is essential to finding good agreement with experiment (Blaha et al., 1992). The orthogonality between
-like bands belonging to different principal quantum numbers (3p and 4p) is important and can be treated, for example, by means of local orbitals [see (2.2.12.4)
].
In many simple cases, the off-diagonal elements of the EFG tensor vanish due to symmetry, but if they don't, diagonalization of the EFG tensor is required, which defines the orientation of the principal axis of the tensor. Note that in this case the orientation is given with respect to the local coordinate axes (see Section 2.2.13) in which the
components are defined.
The general concepts described above are used in many band-structure applications and thus can be found in the corresponding literature. Here only a few examples are given in order to illustrate certain aspects.
For the simple case of an element, namely copper in the f.c.c. structure, the band structure is shown in Fig. 2.2.16.1 along the
symmetry direction from
to X. The character of the bands can be illustrated by showing for each band state the crucial information that is contained in the wavefunctions. In the LAPW method (Section 2.2.12
), the wavefunction is expanded in atomic like functions inside the atomic spheres (partial waves), and thus a spatial decomposition of the associated charge and its portion of
-like charge (s-, p-, d-like) inside the Cu sphere,
, provides such a quantity. Fig. 2.2.16.1
shows for each state
a circle the radius of which is proportional to the
-like charge of that state. The band originating from the Cu 4s and 4p orbitals shows an approximately free-electron behaviour and thus a
energy dependence, but it hybridizes with one of the d bands in the middle of the
direction and thus the
-like character changes along the
direction.
![]() |
Character of energy bands of f.c.c. copper in the |
This can easily be understood from a group-theoretical point of view. Since the d states in an octahedral environment split into the and
manifold, the d bands can be further partitioned into the two subsets as illustrated in Fig. 2.2.16.2
. The s band ranges from about −9.5 eV below
to about 2 eV above. In the
direction, the s band has
symmetry, the same as one of the d bands from the
manifold, which consists of
and
. As a consequence of the `non-crossing rule', the two states, both with
symmetry, must split due to the quantum-mechanical interaction between states with the same symmetry. This leads to the avoided crossing seen in the middle of the
direction (Fig. 2.2.16.1
). Therefore the lowest band starts out as an `s band' but ends near X as a `d band'. This also shows that bands belonging to different irreducible representations (small representations) may cross. The fact that
splits into the subgroups
and
is an example of the compatibility relations. In addition, group-theoretical arguments can be used (Altmann, 1994
) to show that in certain symmetry directions the bands must enter the face of the BZ with zero slope.
![]() |
Decomposition of the Cu d bands into the |
Note that in a site-centred description of the wavefunctions a similar -like decomposition of the charge can be defined as
(without the
term), but here the partial charges have a different meaning than in the spatial decomposition. In one case (e.g. LAPW),
refers to the partial charge of
-like character inside sphere t, while in the other case (LCAO), it means
-like charge coming from orbitals centred at site t. For the main components (for example Cu d) these two procedures will give roughly similar results, but the small components have quite a different interpretation. For this purpose consider an orbital that is centred on the neighbouring site j, but whose tail enters the atomic sphere i. In the spatial representation this tail coming from the j site must be represented by the (s, p, d etc.) partial waves inside sphere i and consequently will be associated with site i, leading to a small partial charge component. This situation is sometimes called the off-site component, in contrast to the on-site component, which will appear at its own site or in its own sphere, depending on the representation, site-centred or spatially confined.
The well known rutile structure (e.g. TiO2) is tetragonal (see Fig. 2.2.16.3) with the basis consisting of the metal atoms at the
Wyckoff positions, (
) and (
), and anions at the
position, located at (
) and (
) with a typical value of about 0.3 for the internal coordinate u. Rutile belongs to the non-symmorphic space group
(
) in which the metal positions are transformed into each other by a rotation by 90° around the crystal c axis followed by a non-primitive translation of (
). The two metal positions at the centre and at the corner of the unit cell are equivalent when the surrounding octahedra are properly rotated. The metal atoms are octahedrally coordinated by anions which, however, do not form an ideal octahedron. The distortion depends on the structure parameters a,
and u, and results in two different metal–anion distances, namely the apical distance
and the equatorial distance
, the height (z axis) and the basal spacing of the octahedron. For a certain value
the two distances
and
become equal:
For this special value
and an ideal
ratio, the basal plane of the octahedron is quadratic and the two distances are equal. An ideal octahedral coordination is thus obtained with
Although the actual coordination of the metal atoms deviates from the ideal octahedron (as in all other systems that crystallize in the rutile structure), we still use this concept for symmetry arguments and call it octahedral coordination.
The concept of a local coordinate system is illustrated for rutile (TiO2) from two different aspects, namely the crystal harmonics expansion (see Section 2.2.13) and the interpretation of chemical bonding (for further details see Sorantin & Schwarz, 1992
).
In excitations involving core electrons, simplifications are possible that allow an easier interpretation. As one example, (soft) X-ray emission (XES) or absorption (XAS) spectra are briefly discussed. In the one-electron picture, the XES process can be described as sketched in Fig. 2.2.16.4. First a core electron of atom A in state
is knocked out (by electrons or photons), and then a transition occurs between the occupied valence states at energy
and the core hole (the transitions between inner core levels are ignored).
According to Fermi's golden rule, the intensity of such a transition can be described bywhere
comes from the integral over the angular components (Table 2.2.16.1
) and contains the
selection rule,
is the local (within atomic sphere A) partial (
-like) DOS,
is the radial transition probability [see (2.2.16.6)
below], and the last term takes the energy conservation into account.
|
The are defined as the dipole transition (with the dipole operator r) probability between the valence state at
and the core state characterized by quantum numbers
,
In this derivation one makes use of the fact that core states are completely confined inside the atomic sphere. Therefore the integral, which should be taken over the entire space, can be restricted to one atomic sphere (namely A), since the core wavefunction
and thus the integrand vanishes outside this sphere. This is also the reason why XES (or XAS) are related to
, the local DOS weighted with the
-like charge within the atomic sphere A.
The interpretation of XES intensities is as follows. Besides the factor from Fermi's golden rule, the intensity is governed by the
selection rule and the energy conservation. In addition, it depends on the number of available states at
which reside inside sphere A and have an
-like contribution, times the probability for the transition to take place from the valence and to the core hole under energy conservation. For an application, see for example the comparison between theory and experiment for the compounds NbC and NbN (Schwarz, 1977
).
Note again that the present description is based on an atomic sphere representation with partial waves inside the spheres, in contrast to an LCAO-like treatment with site-centred basis functions. In the latter, an equivalent formalism can be defined which differs in details, especially for the small components (off-site contributions). If the tails of an orbital enter a neighbouring sphere and are crucial for the interpretation of XES, there is a semantic difference between the two schemes as discussed above in connection with f.c.c. Cu in Section 2.2.16.1. In the present framework, all contributions come exclusively from the sphere where the core hole resides, whereas in an LCAO representation `cross transitions' from the valence states on one atom to the core hole of a neighbouring atom may be important. The latter contributions must be (and are) included in the partial waves within the sphere in schemes such as LAPW. There is no physical difference between the two descriptions.
In XES, spectra are interpreted on the basis of results from ground-state calculations, although there could be relaxations due to the presence of a core hole. As early as 1979, von Barth and Grossmann formulated a `final state rule' for XES in metallic systems (von Barth & Grossmann, 1979). In this case, the initial state is one with a missing core electron (core hole), whereas the final state is close to the ground state, since the hole in the valence bands (after a valence electron has filled the core hole) has a very short lifetime and is very quickly filled by other valence electrons. They applied time-dependent perturbation theory and could show by model calculations that the main XES spectrum can be explained by ground-state results, whereas the satellite spectrum (starting with two core holes and ending with one) requires a treatment of the core-hole relaxation. This example illustrates the importance of the relevant physical process in experiments related to the energy-band structure: it may not always be the just the ground states that are involved and sometimes excited states must be considered.
There are many more applications of band theory to solids and thus an enormous amount of literature has not been covered here. In this chapter, an attempt has been made to collect relevant concepts, definitions and examples from group theory, solid-state physics and crystallography in order to understand symmetry aspects in combination with a quantum-mechanical treatment of the electronic structure of solids.
Acknowledgements
The author wishes to thank the following persons who contributed to this chapter: P. Blaha, the main author of WIEN; J. Luitz, for help with the figures; and P. Herzig, with whom the author discussed the group-theoretical aspects.
References

































































