International Tables for Crystallography, Volume B: Reciprocal space. Edited by U. Shmueli. © International Union of Crystallography 2006
International Tables for Crystallography (2006). Vol. B, ch. 1.3, pp. 28–34
At the end of the 19th century, Heaviside proposed under the name of `operational calculus' a set of rules for solving a class of differential, partial differential and integral equations encountered in electrical engineering (today's `signal processing'). These rules worked remarkably well but were devoid of mathematical justification (see Whittaker, 1928). In 1926, Dirac introduced his famous δ-function [see Dirac (1958), pp. 58–61], which was found to be related to Heaviside's constructs. Other singular objects, together with procedures to handle them, had already appeared in several branches of analysis [Cauchy's `principal values'; Hadamard's `finite parts' (Hadamard, 1932, 1952); Riesz's regularization methods for certain divergent integrals (Riesz, 1938, 1949)] as well as in the theories of Fourier series and integrals (see e.g. Bochner, 1932, 1959). Their very definition often verged on violating the rigorous rules governing limiting processes in analysis, so that subsequent recourse to limiting processes could lead to erroneous results; ad hoc precautions thus had to be observed to avoid mistakes in handling these objects.
In 1945–1950, Laurent Schwartz proposed his theory of distributions (see Schwartz, 1966), which provided a unified and definitive treatment of all these questions, with a striking combination of rigour and simplicity. Schwartz's treatment of Dirac's δ-function illustrates his approach in a most direct fashion. Dirac's original definition reads:
(i) $\delta(\mathbf{x}) = 0$ for $\mathbf{x} \neq \mathbf{0}$;
(ii) $\int_{\mathbb{R}^n} \delta(\mathbf{x})\,\mathrm{d}^n\mathbf{x} = 1$.
These two conditions are irreconcilable with Lebesgue's theory of integration: by (i), δ vanishes almost everywhere, so that its integral in (ii) must be 0, not 1.
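A better definition consists in specifying that
(iii) $\int_{\mathbb{R}^n} \delta(\mathbf{x})\,\varphi(\mathbf{x})\,\mathrm{d}^n\mathbf{x} = \varphi(\mathbf{0})$
for any function φ sufficiently well behaved near $\mathbf{x} = \mathbf{0}$. This is related to the problem of finding a unit for convolution (Section 1.3.2.2.4). As will now be seen, this definition is still unsatisfactory. Let the sequence $(f_\nu)$ in $L^1(\mathbb{R}^n)$ be an approximate convolution unit, e.g. the normalized Gaussians
$$f_\nu(\mathbf{x}) = \left(\frac{\nu^2}{2\pi}\right)^{n/2} \exp\left(-\tfrac{1}{2}\nu^2\|\mathbf{x}\|^2\right).$$
Then for any well behaved function φ the integrals $\int_{\mathbb{R}^n} f_\nu(\mathbf{x})\,\varphi(\mathbf{x})\,\mathrm{d}^n\mathbf{x}$ exist, and the sequence of their numerical values tends to $\varphi(\mathbf{0})$. It is tempting to combine this with (iii) to conclude that δ is the limit of the sequence $(f_\nu)$ as $\nu \to \infty$. However, $\lim_{\nu\to\infty} f_\nu(\mathbf{x}) = 0$ almost everywhere in $\mathbb{R}^n$, and the crux of the problem is that
$$\varphi(\mathbf{0}) = \lim_{\nu\to\infty} \int_{\mathbb{R}^n} f_\nu(\mathbf{x})\,\varphi(\mathbf{x})\,\mathrm{d}^n\mathbf{x} \neq \int_{\mathbb{R}^n} \left[\lim_{\nu\to\infty} f_\nu(\mathbf{x})\right] \varphi(\mathbf{x})\,\mathrm{d}^n\mathbf{x} = 0,$$
because the sequence $(f_\nu)$ does not satisfy the hypotheses of Lebesgue's dominated convergence theorem.
Schwartz's solution to this problem is deceptively simple: the regular behaviour one is trying to capture is an attribute not of the sequence of functions $(f_\nu)$, but of the sequence of continuous linear functionals
$$T_\nu : \varphi \mapsto \int_{\mathbb{R}^n} f_\nu(\mathbf{x})\,\varphi(\mathbf{x})\,\mathrm{d}^n\mathbf{x},$$
which has as a limit the continuous functional
$$T : \varphi \mapsto \varphi(\mathbf{0}).$$
It is the latter functional which constitutes the proper definition of δ. The previous paradoxes arose because one insisted on writing down the simple linear operation T in terms of an integral.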
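The contrast between the behaviour of the functions and that of the functionals can be checked numerically. The following is a minimal sketch in dimension 1, assuming a normalized Gaussian of width 1/ν as the approximate unit and a hypothetical well behaved function `phi` with `phi(0) = 1`; the midpoint quadrature and all names are illustrative choices, not part of the original text.

```python
import math

def f_nu(x, nu):
    """Normalised Gaussian of width 1/nu, used as an approximate convolution unit."""
    return nu / math.sqrt(2.0 * math.pi) * math.exp(-0.5 * (nu * x) ** 2)

def phi(x):
    """Hypothetical smooth, rapidly decaying function with phi(0) = 1."""
    return math.cos(x) * math.exp(-x * x)

def pairing(nu, half_width=10.0, steps=200_000):
    """Midpoint-rule estimate of the integral of f_nu * phi over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += f_nu(x, nu) * phi(x)
    return total * h

# The numbers T_nu(phi) approach phi(0) = 1 as nu grows ...
pairings = {nu: pairing(nu) for nu in (1, 10, 100)}
# ... while the function values at any fixed x != 0 collapse to 0.
pointwise = {nu: f_nu(0.5, nu) for nu in (1, 10, 100)}

for nu in (1, 10, 100):
    print(nu, pairings[nu], pointwise[nu])
```

The pairings tend to `phi(0)` although the pointwise limit of the functions is 0 away from the origin, which is the failure of dominated convergence described above.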
The essence of Schwartz's theory of distributions is thus that, rather than try to define and handle `generalized functions' via sequences such as $(f_\nu)$ [an approach adopted e.g. by Lighthill (1958) and Erdélyi (1962)], one should instead look at them as continuous linear functionals over spaces of well behaved functions.
There are many books on distribution theory and its applications. The reader may consult in particular Schwartz (1965, 1966), Gel'fand & Shilov (1964), Bremermann (1965), Trèves (1967), Challifour (1972), Friedlander (1982), and the relevant chapters of Hörmander (1963) and Yosida (1965). Schwartz (1965) is especially recommended as an introduction.
The guiding principle which leads to requiring that the functions φ above (traditionally called `test functions') should be well behaved is that correspondingly `wilder' behaviour can then be accommodated in the limiting behaviour of the $f_\nu$ while still keeping the integrals $\int_{\mathbb{R}^n} f_\nu \varphi\,\mathrm{d}^n\mathbf{x}$ under control. Thus:
(i) to minimize restrictions on the limiting behaviour of the $f_\nu$ at infinity, the φ's will be chosen to have compact support;
(ii) to minimize restrictions on the local behaviour of the $f_\nu$, the φ's will be chosen infinitely differentiable.
To ensure further the continuity of functionals such as $T_\nu$ with respect to the test function φ as the $f_\nu$ go increasingly wild, very strong control will have to be exercised in the way in which a sequence $(\varphi_j)$ of test functions will be said to converge towards a limiting φ: conditions will have to be imposed not only on the values of the functions $\varphi_j$, but also on those of all their derivatives. Hence, defining a strong enough topology on the space of test functions φ is an essential prerequisite to the development of a satisfactory theory of distributions.
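With this rationale in mind, the following function spaces will be defined for any open subset Ω of $\mathbb{R}^n$ (which may be the whole of $\mathbb{R}^n$):
(a) $\mathcal{E}(\Omega)$ is the space of complex-valued functions over Ω which are infinitely differentiable;
(b) $\mathcal{D}(\Omega)$ is the subspace of $\mathcal{E}(\Omega)$ consisting of functions with (unspecified) compact support contained in Ω;
(c) $\mathcal{D}_K(\Omega)$ is the subspace of $\mathcal{D}(\Omega)$ consisting of functions whose support is contained within a fixed compact subset K of Ω.
When Ω is unambiguously defined by the context, we will simply write $\mathcal{E}$, $\mathcal{D}$, $\mathcal{D}_K$.
It sometimes suffices to require the existence of continuous derivatives only up to finite order m inclusive. The corresponding spaces are then denoted $\mathcal{E}^{(m)}$, $\mathcal{D}^{(m)}$, $\mathcal{D}_K^{(m)}$, with the convention that if $m = 0$, only continuity is required.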
The topologies on these spaces constitute the most important ingredients of distribution theory, and will be outlined in some detail.
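The topology on $\mathcal{E}(\Omega)$ is defined by the family of semi-norms
$$\varphi \in \mathcal{E}(\Omega) \mapsto \sigma_{p,K}(\varphi) = \sup_{\mathbf{x} \in K} |D^p\varphi(\mathbf{x})|,$$
where p is a multi-index and K a compact subset of Ω. A fundamental system S of neighbourhoods of the origin in $\mathcal{E}(\Omega)$ is given by subsets of $\mathcal{E}(\Omega)$ of the form
$$V(m, \varepsilon, K) = \{\varphi \in \mathcal{E}(\Omega) \mid |p| \le m \Rightarrow \sigma_{p,K}(\varphi) < \varepsilon\}$$
for all natural integers m, positive reals ε and compact subsets K of Ω. Since a countable family of compact subsets K suffices to cover Ω, and since restricted values of ε of the form $\varepsilon = 1/N$ lead to the same topology, S is equivalent to a countable system of neighbourhoods and hence $\mathcal{E}(\Omega)$ is metrizable.
Convergence in $\mathcal{E}$ may thus be defined by means of sequences. A sequence $(\varphi_\nu)$ in $\mathcal{E}$ will be said to converge to 0 if for any given $V(m, \varepsilon, K)$ there exists $\nu_0$ such that $\varphi_\nu \in V(m, \varepsilon, K)$ whenever $\nu > \nu_0$; in other words, if the $\varphi_\nu$ and all their derivatives $D^p\varphi_\nu$ converge to 0 uniformly on any given compact K in Ω.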
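The topology on $\mathcal{D}_K(\Omega)$ is defined by the family of semi-norms
$$\varphi \in \mathcal{D}_K(\Omega) \mapsto \sigma_p(\varphi) = \sup_{\mathbf{x} \in K} |D^p\varphi(\mathbf{x})|,$$
where K is now fixed. The fundamental system S of neighbourhoods of the origin in $\mathcal{D}_K$ is given by sets of the form
$$V(m, \varepsilon) = \{\varphi \in \mathcal{D}_K(\Omega) \mid |p| \le m \Rightarrow \sigma_p(\varphi) < \varepsilon\}.$$
It is equivalent to the countable subsystem of the $V(m, 1/N)$, hence $\mathcal{D}_K(\Omega)$ is metrizable.
Convergence in $\mathcal{D}_K$ may thus be defined by means of sequences. A sequence $(\varphi_\nu)$ in $\mathcal{D}_K$ will be said to converge to 0 if for any given $V(m, \varepsilon)$ there exists $\nu_0$ such that $\varphi_\nu \in V(m, \varepsilon)$ whenever $\nu > \nu_0$; in other words, if the $\varphi_\nu$ and all their derivatives $D^p\varphi_\nu$ converge to 0 uniformly in K.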
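The topology on $\mathcal{D}(\Omega)$ is defined by the fundamental system of neighbourhoods of the origin consisting of sets of the form
$$V((m), (\varepsilon)) = \left\{\varphi \in \mathcal{D}(\Omega) \;\Big|\; |p| \le m_\nu \Rightarrow \sup_{\|\mathbf{x}\| \ge \nu} |D^p\varphi(\mathbf{x})| < \varepsilon_\nu \text{ for all } \nu\right\},$$
where (m) is an increasing sequence $(m_\nu)$ of integers tending to $+\infty$ and (ε) is a decreasing sequence $(\varepsilon_\nu)$ of positive reals tending to 0, as $\nu \to \infty$.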
This topology is not metrizable, because the sets of sequences (m) and (ε) are essentially uncountable. It can, however, be shown to be the inductive limit of the topology of the subspaces $\mathcal{D}_K$, in the following sense: V is a neighbourhood of the origin in $\mathcal{D}$ if and only if its intersection with $\mathcal{D}_K$ is a neighbourhood of the origin in $\mathcal{D}_K$ for any given compact K in Ω.
A sequence $(\varphi_\nu)$ in $\mathcal{D}$ will thus be said to converge to 0 in $\mathcal{D}$ if all the $\varphi_\nu$ belong to some $\mathcal{D}_K$ (with K a compact subset of Ω independent of ν) and if $(\varphi_\nu)$ converges to 0 in $\mathcal{D}_K$.
As a result, a complex-valued functional T on $\mathcal{D}$ will be said to be continuous for the topology of $\mathcal{D}$ if and only if, for any given compact K in Ω, its restriction to $\mathcal{D}_K$ is continuous for the topology of $\mathcal{D}_K$, i.e. maps convergent sequences in $\mathcal{D}_K$ to convergent sequences in $\mathbb{C}$.
This property of $\mathcal{D}$, i.e. having a non-metrizable topology which is the inductive limit of metrizable topologies in its subspaces $\mathcal{D}_K$, conditions the whole structure of distribution theory and dictates that of many of its proofs.
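A distribution T on Ω is a linear form over $\mathcal{D}(\Omega)$, i.e. a map $T : \varphi \mapsto \langle T, \varphi\rangle$ which associates linearly a complex number $\langle T, \varphi\rangle$ to any $\varphi \in \mathcal{D}(\Omega)$, and which is continuous for the topology of that space. In the terminology of Section 1.3.2.2.6.2, T is an element of $\mathcal{D}'(\Omega)$, the topological dual of $\mathcal{D}(\Omega)$.
Continuity over $\mathcal{D}$ is equivalent to continuity over $\mathcal{D}_K$ for all compact K contained in Ω, and hence to the condition that for any sequence $(\varphi_\nu)$ in $\mathcal{D}$ such that
(i) Supp $\varphi_\nu$ is contained in some compact K independent of ν,
(ii) the sequences $(|D^p\varphi_\nu|)$ converge uniformly to 0 on K for all multi-indices p,
then the sequence of complex numbers $\langle T, \varphi_\nu\rangle$ converges to 0 in $\mathbb{C}$.
If the continuity of a distribution T requires (ii) for $|p| \le m$ only, T may be defined over $\mathcal{D}^{(m)}$ and thus $T \in \mathcal{D}'^{(m)}$; T is said to be a distribution of finite order m. In particular, for $m = 0$, $\mathcal{D}^{(0)}$ is the space of continuous functions with compact support, and a distribution $T \in \mathcal{D}'^{(0)}$ is a (Radon) measure as used in the theory of integration. Thus measures are particular cases of distributions.
Generally speaking, the larger a space of test functions, the smaller its topological dual:
$$m < m' \Rightarrow \mathcal{D}^{(m)} \supset \mathcal{D}^{(m')} \Rightarrow \mathcal{D}'^{(m)} \subset \mathcal{D}'^{(m')}.$$
This clearly results from the observation that if the φ's are allowed to be less regular, then less wildness can be accommodated in T if the continuity of the map $\varphi \mapsto \langle T, \varphi\rangle$ with respect to φ is to be preserved.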
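Let f be a complex-valued function over Ω such that $\int_K |f(\mathbf{x})|\,\mathrm{d}^n\mathbf{x}$ exists for any given compact K in Ω; f is then called locally integrable.
The linear mapping from $\mathcal{D}(\Omega)$ to $\mathbb{C}$ defined by
$$\varphi \mapsto \int_\Omega f(\mathbf{x})\,\varphi(\mathbf{x})\,\mathrm{d}^n\mathbf{x}$$
may then be shown to be continuous over $\mathcal{D}(\Omega)$. It thus defines a distribution $T_f \in \mathcal{D}'(\Omega)$:
$$\langle T_f, \varphi\rangle = \int_\Omega f(\mathbf{x})\,\varphi(\mathbf{x})\,\mathrm{d}^n\mathbf{x}.$$
As the continuity of $T_f$ only requires that $\varphi \in \mathcal{D}^{(0)}(\Omega)$, $T_f$ is actually a Radon measure.
It can be shown that two locally integrable functions f and g define the same distribution, i.e. $\langle T_f, \varphi\rangle = \langle T_g, \varphi\rangle$ for all $\varphi \in \mathcal{D}$, if and only if they are equal almost everywhere. The classes of locally integrable functions modulo this equivalence form a vector space denoted $L^1_{\mathrm{loc}}(\Omega)$; each element of $L^1_{\mathrm{loc}}(\Omega)$ may therefore be identified with the distribution $T_f$ defined by any one of its representatives f.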
A distribution $T \in \mathcal{D}'(\Omega)$ is said to vanish on an open subset ω of Ω if it vanishes on all functions in $\mathcal{D}(\omega)$, i.e. if $\langle T, \varphi\rangle = 0$ whenever $\varphi \in \mathcal{D}(\omega)$.
The support of a distribution T, denoted Supp T, is then defined as the complement of the set-theoretic union of those open subsets ω on which T vanishes; or equivalently as the smallest closed subset of Ω outside which T vanishes.
When $T = T_f$ for $f \in L^1_{\mathrm{loc}}(\Omega)$, then Supp T = Supp f, so that the two notions coincide. Clearly, if Supp T and Supp φ are disjoint subsets of Ω, then $\langle T, \varphi\rangle = 0$.
It can be shown that any distribution $T \in \mathcal{D}'$ with compact support may be extended from $\mathcal{D}$ to $\mathcal{E}$ while remaining continuous, so that $T \in \mathcal{E}'$; and that conversely, if $S \in \mathcal{E}'$, then its restriction T to $\mathcal{D}$ is a distribution with compact support. Thus, the topological dual $\mathcal{E}'$ of $\mathcal{E}$ consists of those distributions in $\mathcal{D}'$ which have compact support. This is intuitively clear since, if the condition of having compact support is fulfilled by T, it need no longer be required of φ, which may then roam through $\mathcal{E}$ rather than $\mathcal{D}$.
A sequence $(T_j)$ of distributions will be said to converge in $\mathcal{D}'$ to a distribution T as $j \to \infty$ if, for any given $\varphi \in \mathcal{D}$, the sequence of complex numbers $(\langle T_j, \varphi\rangle)$ converges in $\mathbb{C}$ to the complex number $\langle T, \varphi\rangle$.
A series $\sum_{j=0}^{\infty} T_j$ of distributions will be said to converge in $\mathcal{D}'$ and to have distribution S as its sum if the sequence of partial sums $S_k = \sum_{j=0}^{k} T_j$ converges to S.
These definitions of convergence in $\mathcal{D}'$ assume that the limits T and S are known in advance, and are distributions. This raises the question of the completeness of $\mathcal{D}'$: if a sequence $(T_j)$ in $\mathcal{D}'$ is such that the sequence $(\langle T_j, \varphi\rangle)$ has a limit in $\mathbb{C}$ for all $\varphi \in \mathcal{D}$, does the map $\varphi \mapsto \lim_{j\to\infty} \langle T_j, \varphi\rangle$ define a distribution $T \in \mathcal{D}'$? In other words, does the limiting process preserve continuity with respect to φ? It is a remarkable theorem that, because of the strong topology on $\mathcal{D}$, this is actually the case. An analogous statement holds for series. This notion of convergence does not coincide with any of the classical notions used for ordinary functions: for example, the sequence $(f_\nu)$ with $f_\nu(x) = \cos \nu x$ converges to 0 in $\mathcal{D}'(\mathbb{R})$, but fails to do so by any of the standard criteria.
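The oscillating example can be made concrete numerically. The sketch below assumes the sequence $f_\nu(x) = \cos \nu x$ and pairs it with one particular test function, the standard bump supported on $[-1, 1]$; the quadrature rule and the names `bump` and `pairing` are illustrative assumptions.

```python
import math

def bump(x):
    """Infinitely differentiable test function supported on [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def pairing(nu, steps=20_000):
    """Midpoint-rule estimate of <T_f, bump> for f(x) = cos(nu * x)."""
    h = 2.0 / steps
    total = 0.0
    for k in range(steps):
        x = -1.0 + (k + 0.5) * h
        total += math.cos(nu * x) * bump(x)
    return total * h

# The pairings shrink towards 0 as nu grows, although sup |cos(nu x)| stays equal to 1.
values = {nu: pairing(nu) for nu in (0, 5, 50, 500)}
for nu, value in values.items():
    print(nu, value)
```

Convergence to 0 here is a statement about the numbers $\langle T_{f_\nu}, \varphi\rangle$ for each fixed test function, not about the values of the functions themselves, which keep oscillating between −1 and +1.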
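An example of convergent sequences of distributions is provided by sequences which converge to δ. If $(f_\nu)$ is a sequence of locally summable functions on $\mathbb{R}^n$ such that
(i) $\int_{\|\mathbf{x}\| < b} f_\nu(\mathbf{x})\,\mathrm{d}^n\mathbf{x} \to 1$ as $\nu \to \infty$ for all $b > 0$;
(ii) $\int_{a \le \|\mathbf{x}\| \le 1/a} |f_\nu(\mathbf{x})|\,\mathrm{d}^n\mathbf{x} \to 0$ as $\nu \to \infty$ for all $0 < a < 1$;
(iii) there exist $d > 0$ and $M > 0$ such that $\int_{\|\mathbf{x}\| < d} |f_\nu(\mathbf{x})|\,\mathrm{d}^n\mathbf{x} < M$ for all ν;
then the sequence $(T_{f_\nu})$ of distributions converges to δ in $\mathcal{D}'(\mathbb{R}^n)$.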
As a general rule, the definitions are chosen so that the operations coincide with those on functions whenever a distribution is associated to a function.
Most definitions consist in transferring to a distribution T an operation which is well defined on $\varphi \in \mathcal{D}$ by `transposing' it in the duality product $\langle T, \varphi\rangle$; this procedure will map T to a new distribution provided the original operation maps $\mathcal{D}$ continuously into itself.
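As an illustration of transposition, the sketch below takes differentiation in dimension 1 as the operation and assumes the usual sign convention $\langle T', \varphi\rangle = -\langle T, \varphi'\rangle$, which is not stated in the paragraph above. Applied to the unit step Y, this should give $\langle Y', \varphi\rangle = \varphi(0)$, i.e. the action of δ. The shifted bump `phi` and the helper names are hypothetical choices.

```python
import math

SHIFT = 0.3  # hypothetical offset, so that the test function is not even

def phi(x):
    """Smooth test function supported on [SHIFT - 1, SHIFT + 1]."""
    u = x - SHIFT
    if abs(u) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - u * u))

def dphi(x):
    """Analytic derivative of phi."""
    u = x - SHIFT
    if abs(u) >= 1.0:
        return 0.0
    return phi(x) * (-2.0 * u) / (1.0 - u * u) ** 2

def unit_step(x):
    """Heaviside unit step Y, a locally integrable function."""
    return 1.0 if x >= 0.0 else 0.0

def pair_with_function(f, test, lo=-2.0, hi=2.0, steps=40_000):
    """Midpoint-rule estimate of <T_f, test> = integral of f * test over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += f(x) * test(x)
    return total * h

# Transposed definition: <Y', phi> = -<Y, phi'>.
derivative_pairing = -pair_with_function(unit_step, dphi)
# Action of delta on the same test function.
delta_pairing = phi(0.0)
print(derivative_pairing, delta_pairing)
```

The step function has no classical derivative at the origin, yet the transposed pairing is perfectly well defined because all the differentiation is carried by the test function.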
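The reverse operation from differentiation, namely calculating the `indefinite integral' of a distribution S, consists in finding a distribution T such that $T' = S$.
For all $\chi \in \mathcal{D}$ such that $\chi = \psi'$ with $\psi \in \mathcal{D}$, we must have
$$\langle T, \chi\rangle = -\langle S, \psi\rangle.$$
This condition defines T in a `hyperplane' $\mathcal{H}$ of $\mathcal{D}$, whose equation
$$\langle 1, \chi\rangle \equiv \langle 1, \psi'\rangle = 0$$
reflects the fact that ψ has compact support.
To specify T in the whole of $\mathcal{D}$, it suffices to specify the value of $\langle T, \varphi_0\rangle$ where $\varphi_0 \in \mathcal{D}$ is such that $\langle 1, \varphi_0\rangle = 1$: then any $\varphi \in \mathcal{D}$ may be written uniquely as
$$\varphi = \lambda\varphi_0 + \psi'$$
with $\lambda = \langle 1, \varphi\rangle$, $\chi = \varphi - \lambda\varphi_0$ and $\psi(x) = \int_{-\infty}^{x} \chi(t)\,\mathrm{d}t$, and T is defined by
$$\langle T, \varphi\rangle = \lambda\langle T, \varphi_0\rangle - \langle S, \psi\rangle.$$
The freedom in the choice of $\varphi_0$ means that T is defined up to an additive constant.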
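The product $\alpha T$ of a distribution T on $\mathbb{R}^n$ by a function α over $\mathbb{R}^n$ will be defined by transposition:
$$\langle \alpha T, \varphi\rangle = \langle T, \alpha\varphi\rangle \quad \text{for all } \varphi \in \mathcal{D}.$$
In order that $\alpha T$ be a distribution, the mapping $\varphi \mapsto \alpha\varphi$ must send $\mathcal{D}$ continuously into itself; hence the multipliers α must be infinitely differentiable. The product of two general distributions cannot be defined. The need for a careful treatment of multipliers of distributions will become clear when it is later shown (Section 1.3.2.5.8) that the Fourier transformation turns convolutions into multiplications and vice versa.
If T is a distribution of order m, then α needs only have continuous derivatives up to order m. For instance, δ is a distribution of order zero, and $\alpha\delta = \alpha(\mathbf{0})\delta$ is a distribution provided α is continuous; this relation is of fundamental importance in the theory of sampling and of the properties of the Fourier transformation related to sampling (Sections 1.3.2.6.4, 1.3.2.6.6). More generally, $D^p\delta$ is a distribution of order $|p|$, and the following formula holds for all $\alpha \in \mathcal{E}^{(m)}$ with $m = |p|$:
$$\alpha\,(D^p\delta) = \sum_{q \le p} (-1)^{|p-q|} \binom{p}{q} (D^{p-q}\alpha)(\mathbf{0})\, D^q\delta.$$
The derivative of a product is easily shown to be
$$\partial_i(\alpha T) = (\partial_i\alpha)\,T + \alpha\,(\partial_i T)$$
and generally for any multi-index p
$$D^p(\alpha T) = \sum_{q \le p} \binom{p}{q} (D^{p-q}\alpha)\,(D^q T).$$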
Given a distribution S on $\mathbb{R}^n$ and an infinitely differentiable multiplier function α, the division problem consists in finding a distribution T such that $\alpha T = S$.
If α never vanishes, $T = S/\alpha$ is the unique answer. If $n = 1$, and if α has only isolated zeros of finite order, it can be reduced to a collection of cases where the multiplier is $x^m$, for which the general solution can be shown to be of the form
$$T = U + \sum_{i=0}^{m-1} c_i\,\delta^{(i)},$$
where U is a particular solution of the division problem $x^m U = S$ and the $c_i$ are arbitrary constants.
In dimension $n > 1$, the problem is much more difficult, but is of fundamental importance in the theory of linear partial differential equations, since the Fourier transformation turns the problem of solving these into a division problem for distributions [see Hörmander (1963)].
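Let σ be a smooth non-singular change of variables in $\mathbb{R}^n$, i.e. an infinitely differentiable mapping from an open subset Ω of $\mathbb{R}^n$ to Ω′ in $\mathbb{R}^n$, whose Jacobian
$$J(\sigma) = \det\left[\frac{\partial\sigma(\mathbf{x})}{\partial\mathbf{x}}\right]$$
vanishes nowhere in Ω. By the implicit function theorem, the inverse mapping $\sigma^{-1}$ from Ω′ to Ω is well defined.
If f is a locally summable function on Ω, then the function $\sigma^{\#}f$ defined by
$$(\sigma^{\#}f)(\mathbf{x}) = f[\sigma^{-1}(\mathbf{x})]$$
is a locally summable function on Ω′, and for any $\varphi \in \mathcal{D}(\Omega')$ we may write, by the substitution $\mathbf{x} = \sigma(\mathbf{y})$:
$$\int_{\Omega'} (\sigma^{\#}f)(\mathbf{x})\,\varphi(\mathbf{x})\,\mathrm{d}^n\mathbf{x} = \int_{\Omega'} f[\sigma^{-1}(\mathbf{x})]\,\varphi(\mathbf{x})\,\mathrm{d}^n\mathbf{x} = \int_{\Omega} f(\mathbf{y})\,\varphi[\sigma(\mathbf{y})]\,|J(\sigma)|\,\mathrm{d}^n\mathbf{y}.$$
In terms of the associated distributions
$$\langle T_{\sigma^{\#}f}, \varphi\rangle = \langle T_f, |J(\sigma)|\,(\sigma^{-1})^{\#}\varphi\rangle.$$
This operation can be extended to an arbitrary distribution T by defining its image $\sigma^{\#}T$ under coordinate transformation σ through
$$\langle \sigma^{\#}T, \varphi\rangle = \langle T, |J(\sigma)|\,(\sigma^{-1})^{\#}\varphi\rangle,$$
which is well defined provided that σ is proper, i.e. that $\sigma^{-1}(K)$ is compact whenever K is compact.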
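For instance, if $\sigma : \mathbf{x} \mapsto \mathbf{x} + \mathbf{a}$ is a translation by a vector $\mathbf{a}$ in $\mathbb{R}^n$, then $|J(\sigma)| = 1$; $\sigma^{\#}$ is denoted by $\tau_{\mathbf{a}}$, and the translate $\tau_{\mathbf{a}}T$ of a distribution T is defined by
$$\langle \tau_{\mathbf{a}}T, \varphi\rangle = \langle T, \tau_{-\mathbf{a}}\varphi\rangle.$$
Let $A : \mathbf{x} \mapsto \mathbf{A}\mathbf{x}$ be a linear transformation defined by a non-singular matrix $\mathbf{A}$. Then $J(A) = \det\mathbf{A}$, and
$$\langle A^{\#}T, \varphi\rangle = |\det\mathbf{A}|\,\langle T, (A^{-1})^{\#}\varphi\rangle.$$
This formula will be shown later (Sections 1.3.2.6.5, 1.3.4.2.1.1) to be the basis for the definition of the reciprocal lattice.
In particular, if $\mathbf{A} = -\mathbf{I}$, where $\mathbf{I}$ is the identity matrix, A is an inversion through a centre of symmetry at the origin, and denoting $A^{\#}\varphi$ by $\breve{\varphi}$ we have:
$$\langle \breve{T}, \varphi\rangle = \langle T, \breve{\varphi}\rangle.$$
T is called an even distribution if $\breve{T} = T$, an odd distribution if $\breve{T} = -T$.
If $\mathbf{A} = \lambda\mathbf{I}$ with $\lambda > 0$, A is called a dilation and
$$\langle A^{\#}T, \varphi\rangle = \lambda^n\,\langle T, (A^{-1})^{\#}\varphi\rangle.$$
Writing symbolically δ as $\delta(\mathbf{x})$ and $A^{\#}\delta$ as $\delta(\mathbf{x}/\lambda)$, we have:
$$\delta(\mathbf{x}/\lambda) = \lambda^n\,\delta(\mathbf{x}).$$
If $n = 1$ and f is a function with isolated simple zeros $x_j$, then in the same symbolic notation
$$\delta[f(x)] = \sum_j \frac{1}{|f'(x_j)|}\,\delta(x - x_j),$$
where each $\lambda_j = 1/|f'(x_j)|$ is analogous to a `Lorentz factor' at zero $x_j$.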
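The symbolic rule for δ composed with a function having simple zeros, usually written $\delta[f(x)] = \sum_j \delta(x - x_j)/|f'(x_j)|$, can be checked numerically by replacing δ with a narrow normalized Gaussian. This is only a sketch: the choice $f(x) = x^2 - 1$, the function `phi`, the width `eps` and the quadrature are all hypothetical illustrations.

```python
import math

def gaussian(u, eps):
    """Narrow normalised Gaussian standing in for delta(u)."""
    return math.exp(-0.5 * (u / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))

def f(x):
    """Example function with simple zeros at x = -1 and x = +1."""
    return x * x - 1.0

def df(x):
    """Derivative of f."""
    return 2.0 * x

def phi(x):
    """Smooth function, deliberately not even, against which the pairing is taken."""
    return math.exp(-(x - 0.5) ** 2)

def smeared_pairing(eps, lo=-3.0, hi=3.0, steps=300_000):
    """Midpoint-rule estimate of the integral of gaussian(f(x)) * phi(x)."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += gaussian(f(x), eps) * phi(x)
    return total * h

zeros = (-1.0, 1.0)
# Right-hand side of the rule: each zero weighted by 1 / |f'(x_j)|.
predicted = sum(phi(z) / abs(df(z)) for z in zeros)
# Left-hand side, with delta replaced by a Gaussian of width eps.
approximated = smeared_pairing(eps=0.01)
print(predicted, approximated)
```

The weights $1/|f'(x_j)|$ arise because, near each zero, f acts locally like a dilation by the factor $f'(x_j)$.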
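The purpose of this construction is to extend Fubini's theorem to distributions. Following Section 1.3.2.2.5, we may define the tensor product $L^1_{\mathrm{loc}}(\mathbb{R}^m) \otimes L^1_{\mathrm{loc}}(\mathbb{R}^n)$ as the vector space of finite linear combinations of functions of the form
$$f \otimes g : (\mathbf{x}, \mathbf{y}) \mapsto f(\mathbf{x})\,g(\mathbf{y}),$$
where $\mathbf{x} \in \mathbb{R}^m$, $\mathbf{y} \in \mathbb{R}^n$, $f \in L^1_{\mathrm{loc}}(\mathbb{R}^m)$ and $g \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$.
Let $S_{\mathbf{x}}$ and $T_{\mathbf{y}}$ denote the distributions associated to f and g, respectively, the subscripts $\mathbf{x}$ and $\mathbf{y}$ acting as mnemonics for $\mathbb{R}^m$ and $\mathbb{R}^n$. It follows from Fubini's theorem (Section 1.3.2.2.5) that $f \otimes g \in L^1_{\mathrm{loc}}(\mathbb{R}^m \times \mathbb{R}^n)$, and hence defines a distribution over $\mathbb{R}^m \times \mathbb{R}^n$; the rearrangement of integral signs gives
$$\langle S_{\mathbf{x}} \otimes T_{\mathbf{y}}, \varphi_{\mathbf{x},\mathbf{y}}\rangle = \langle S_{\mathbf{x}}, \langle T_{\mathbf{y}}, \varphi_{\mathbf{x},\mathbf{y}}\rangle\rangle = \langle T_{\mathbf{y}}, \langle S_{\mathbf{x}}, \varphi_{\mathbf{x},\mathbf{y}}\rangle\rangle$$
for all $\varphi_{\mathbf{x},\mathbf{y}} \in \mathcal{D}(\mathbb{R}^m \times \mathbb{R}^n)$. In particular, if $\varphi(\mathbf{x}, \mathbf{y}) = u(\mathbf{x})\,v(\mathbf{y})$ with $u \in \mathcal{D}(\mathbb{R}^m)$, $v \in \mathcal{D}(\mathbb{R}^n)$, then
$$\langle S \otimes T, u \otimes v\rangle = \langle S, u\rangle\,\langle T, v\rangle.$$
This construction can be extended to general distributions $S \in \mathcal{D}'(\mathbb{R}^m)$ and $T \in \mathcal{D}'(\mathbb{R}^n)$. Given any test function $\varphi \in \mathcal{D}(\mathbb{R}^m \times \mathbb{R}^n)$, let $\varphi_{\mathbf{x}}$ denote the map $\mathbf{y} \mapsto \varphi(\mathbf{x}, \mathbf{y})$; let $\varphi_{\mathbf{y}}$ denote the map $\mathbf{x} \mapsto \varphi(\mathbf{x}, \mathbf{y})$; and define the two functions $\theta(\mathbf{x}) = \langle T, \varphi_{\mathbf{x}}\rangle$ and $\omega(\mathbf{y}) = \langle S, \varphi_{\mathbf{y}}\rangle$. Then, by the lemma on differentiation under the $\langle\,,\rangle$ sign of Section 1.3.2.3.9.1, $\theta \in \mathcal{D}(\mathbb{R}^m)$, $\omega \in \mathcal{D}(\mathbb{R}^n)$, and there exists a unique distribution $S \otimes T$ such that
$$\langle S \otimes T, \varphi\rangle = \langle S, \theta\rangle = \langle T, \omega\rangle.$$
$S \otimes T$ is called the tensor product of S and T.
With the mnemonic introduced above, this definition reads identically to that given above for distributions associated to locally integrable functions:
$$\langle S_{\mathbf{x}} \otimes T_{\mathbf{y}}, \varphi_{\mathbf{x},\mathbf{y}}\rangle = \langle S_{\mathbf{x}}, \langle T_{\mathbf{y}}, \varphi_{\mathbf{x},\mathbf{y}}\rangle\rangle = \langle T_{\mathbf{y}}, \langle S_{\mathbf{x}}, \varphi_{\mathbf{x},\mathbf{y}}\rangle\rangle.$$
The tensor product of distributions is associative:
$$(R \otimes S) \otimes T = R \otimes (S \otimes T).$$
Derivatives may be calculated by
$$D^p_{\mathbf{x}} D^q_{\mathbf{y}} (S_{\mathbf{x}} \otimes T_{\mathbf{y}}) = (D^p_{\mathbf{x}} S_{\mathbf{x}}) \otimes (D^q_{\mathbf{y}} T_{\mathbf{y}}).$$
The support of a tensor product is the Cartesian product of the supports of the two factors.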
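The convolution $f * g$ of two functions f and g on $\mathbb{R}^n$ is defined by
$$(f * g)(\mathbf{x}) = \int_{\mathbb{R}^n} f(\mathbf{y})\,g(\mathbf{x} - \mathbf{y})\,\mathrm{d}^n\mathbf{y} = \int_{\mathbb{R}^n} f(\mathbf{x} - \mathbf{y})\,g(\mathbf{y})\,\mathrm{d}^n\mathbf{y}$$
whenever the integral exists. This is the case when f and g are both in $L^1(\mathbb{R}^n)$; then $f * g$ is also in $L^1(\mathbb{R}^n)$. Let S, T and W denote the distributions associated to f, g and $f * g$ respectively: a change of variable immediately shows that for any $\varphi \in \mathcal{D}(\mathbb{R}^n)$,
$$\langle W, \varphi\rangle = \int_{\mathbb{R}^n \times \mathbb{R}^n} f(\mathbf{x})\,g(\mathbf{y})\,\varphi(\mathbf{x} + \mathbf{y})\,\mathrm{d}^n\mathbf{x}\,\mathrm{d}^n\mathbf{y}.$$
Introducing the map σ from $\mathbb{R}^n \times \mathbb{R}^n$ to $\mathbb{R}^n$ defined by $\sigma(\mathbf{x}, \mathbf{y}) = \mathbf{x} + \mathbf{y}$, the latter expression may be written:
$$\langle S_{\mathbf{x}} \otimes T_{\mathbf{y}}, \varphi \circ \sigma\rangle$$
(where $\circ$ denotes the composition of mappings) or by a slight abuse of notation:
$$\langle W, \varphi\rangle = \langle S_{\mathbf{x}} \otimes T_{\mathbf{y}}, \varphi(\mathbf{x} + \mathbf{y})\rangle.$$
A difficulty arises in extending this definition to general distributions S and T because the mapping σ is not proper: if K is compact in $\mathbb{R}^n$, then $\sigma^{-1}(K)$ is a cylinder with base K and generator the `second bisector' $\mathbf{x} + \mathbf{y} = \mathbf{0}$ in $\mathbb{R}^n \times \mathbb{R}^n$. However, $\langle S \otimes T, \varphi \circ \sigma\rangle$ is defined whenever the intersection between Supp $(S \otimes T)$ = (Supp S) × (Supp T) and $\sigma^{-1}$(Supp φ) is compact.
We may therefore define the convolution $S * T$ of two distributions S and T on $\mathbb{R}^n$, with supports A = Supp S and B = Supp T, by
$$\langle S * T, \varphi\rangle = \langle S \otimes T, \varphi \circ \sigma\rangle = \langle S_{\mathbf{x}} \otimes T_{\mathbf{y}}, \varphi(\mathbf{x} + \mathbf{y})\rangle$$
whenever the following support condition is fulfilled:
`the set $\{(\mathbf{x}, \mathbf{y}) \mid \mathbf{x} \in A, \mathbf{y} \in B, \mathbf{x} + \mathbf{y} \in K\}$ is compact in $\mathbb{R}^n \times \mathbb{R}^n$ for all K compact in $\mathbb{R}^n$'.
The latter condition is met, in particular, if S or T has compact support. The support of $S * T$ is easily seen to be contained in the closure of the vector sum
$$A + B = \{\mathbf{x} + \mathbf{y} \mid \mathbf{x} \in A, \mathbf{y} \in B\}.$$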
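Convolution by a fixed distribution S is a continuous operation for the topology on $\mathcal{D}'$: it maps convergent sequences $(T_j)$ to convergent sequences $(S * T_j)$. Convolution is commutative: $S * T = T * S$.
The convolution of p distributions $T_1, \ldots, T_p$ with supports $A_1, \ldots, A_p$ can be defined by
$$\langle T_1 * \cdots * T_p, \varphi\rangle = \langle (T_1)_{\mathbf{x}_1} \otimes \cdots \otimes (T_p)_{\mathbf{x}_p}, \varphi(\mathbf{x}_1 + \cdots + \mathbf{x}_p)\rangle$$
whenever the following generalized support condition:
`the set $\{(\mathbf{x}_1, \ldots, \mathbf{x}_p) \mid \mathbf{x}_1 \in A_1, \ldots, \mathbf{x}_p \in A_p, \mathbf{x}_1 + \cdots + \mathbf{x}_p \in K\}$ is compact in $(\mathbb{R}^n)^p$ for all K compact in $\mathbb{R}^n$'
is satisfied. It is then associative. Interesting examples of associativity failure, which can be traced back to violations of the support condition, may be found in Bracewell (1986, pp. 436–437).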
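It follows from previous definitions that, for all distributions $T \in \mathcal{D}'$, the following identities hold:
(i) $\delta * T = T$: δ is the unit for convolution;
(ii) $\delta_{(\mathbf{a})} * T = \tau_{\mathbf{a}}T$: translation is a convolution with the corresponding translate of δ;
(iii) $(D^p\delta) * T = D^pT$: differentiation is a convolution with the corresponding derivative of δ;
(iv) translates or derivatives of a convolution may be obtained by translating or differentiating any one of the factors: convolution `commutes' with translation and differentiation.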
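The latter property is frequently used for the purpose of regularization: if T is a distribution, α an infinitely differentiable function, and at least one of the two has compact support, then $T * \alpha$ is an infinitely differentiable ordinary function. Since sequences $(\alpha_\nu)$ of such functions α can be constructed which have compact support and converge to δ, it follows that any distribution T can be obtained as the limit of infinitely differentiable functions $T * \alpha_\nu$. In topological jargon: $\mathcal{D}(\mathbb{R}^n)$ is `everywhere dense' in $\mathcal{D}'(\mathbb{R}^n)$. A standard function in $\mathcal{D}$ which is often used for such proofs is defined as follows: put
$$\theta(x) = \frac{1}{A}\exp\left(-\frac{1}{1 - x^2}\right) \text{ for } |x| < 1, \qquad \theta(x) = 0 \text{ for } |x| \ge 1,$$
with
$$A = \int_{-1}^{+1} \exp\left(-\frac{1}{1 - x^2}\right)\mathrm{d}x$$
(so that θ is in $\mathcal{D}(\mathbb{R})$ and is normalized), and put
$$\theta_\varepsilon(x) = \frac{1}{\varepsilon}\,\theta\left(\frac{x}{\varepsilon}\right) \text{ in dimension 1}, \qquad \theta_\varepsilon(\mathbf{x}) = \prod_{j=1}^{n} \theta_\varepsilon(x_j) \text{ in dimension } n.$$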
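Regularization by such a standard function can be sketched in dimension 1. The example below assumes the usual bump $\exp[-1/(1 - x^2)]$ on $(-1, 1)$, normalized to unit integral and rescaled to width ε, and regularizes the discontinuous unit step Y, for which $(Y * \theta_\varepsilon)(x)$ is simply the integral of $\theta_\varepsilon$ up to x. The quadrature and the helper names are illustrative assumptions.

```python
import math

def bump(x):
    """Unnormalised standard bump: exp(-1 / (1 - x^2)) inside (-1, 1), else 0."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def midpoint(func, lo, hi, steps=20_000):
    """Midpoint-rule estimate of the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))

A = midpoint(bump, -1.0, 1.0)  # normalising constant of the bump

def theta_eps(x, eps):
    """Normalised bump rescaled to the support [-eps, eps]."""
    return bump(x / eps) / (A * eps)

def smoothed_step(x, eps, steps=4_000):
    """(Y * theta_eps)(x): integral of theta_eps over (-inf, x], Y being the unit step."""
    upper = min(x, eps)
    if upper <= -eps:
        return 0.0
    return midpoint(lambda t: theta_eps(t, eps), -eps, upper, steps)

eps = 0.1
points = (-0.2, -0.05, 0.0, 0.05, 0.2)
profile = [smoothed_step(x, eps) for x in points]
for x, value in zip(points, profile):
    print(x, value)
```

The regularized step agrees with Y outside $[-\varepsilon, \varepsilon]$ and passes smoothly from 0 to 1 inside that interval; letting ε tend to 0 recovers Y in the sense of distributions.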
Another related result, also proved by convolution, is the structure theorem: the restriction of a distribution $T \in \mathcal{D}'(\mathbb{R}^n)$ to a bounded open set Ω in $\mathbb{R}^n$ is a derivative of finite order of a continuous function.
Properties (i) to (iv) are the basis of the symbolic or operational calculus (see Carslaw & Jaeger, 1948; Van der Pol & Bremmer, 1955; Churchill, 1958; Erdélyi, 1962; Moore, 1971) for solving integro-differential equations with constant coefficients by turning them into convolution equations, then using factorization methods for convolution algebras (Schwartz, 1965).
References
Bochner, S. (1932). Vorlesungen über Fouriersche Integrale. Leipzig: Akademische Verlagsgesellschaft.
Bochner, S. (1959). Lectures on Fourier integrals. Translated from Bochner (1932) by M. Tenenbaum & H. Pollard. Princeton University Press.
Bracewell, R. N. (1986). The Fourier transform and its applications, 2nd ed., revised. New York: McGraw-Hill.
Bremermann, H. (1965). Distributions, complex variables, and Fourier transforms. Reading: Addison-Wesley.
Carslaw, H. S. & Jaeger, J. C. (1948). Operational methods in applied mathematics. Oxford University Press.
Challifour, J. L. (1972). Generalized functions and Fourier analysis. Reading: Benjamin.
Churchill, R. V. (1958). Operational mathematics, 2nd ed. New York: McGraw-Hill.
Dirac, P. A. M. (1958). The principles of quantum mechanics, 4th ed. Oxford: Clarendon Press.
Erdélyi, A. (1962). Operational calculus and generalized functions. New York: Holt, Rinehart & Winston.
Friedlander, F. G. (1982). Introduction to the theory of distributions. Cambridge University Press.
Gel'fand, I. M. & Shilov, G. E. (1964). Generalized functions, Vol. I. New York and London: Academic Press.
Hadamard, J. (1932). Le problème de Cauchy et les équations aux dérivées partielles linéaires hyperboliques. Paris: Hermann.
Hadamard, J. (1952). Lectures on Cauchy's problem in linear partial differential equations. New York: Dover Publications.
Hörmander, L. (1963). Linear partial differential operators. Berlin: Springer-Verlag.
Lighthill, M. J. (1958). Introduction to Fourier analysis and generalized functions. Cambridge University Press.
Moore, D. H. (1971). Heaviside operational calculus. An elementary foundation. New York: American Elsevier.
Riesz, M. (1938). L'intégrale de Riemann–Liouville et le problème de Cauchy pour l'équation des ondes. Bull. Soc. Math. Fr. 66, 153–170.
Riesz, M. (1949). L'intégrale de Riemann–Liouville et le problème de Cauchy. Acta Math. 81, 1–223.
Schwartz, L. (1965). Mathematics for the physical sciences. Paris: Hermann, and Reading: Addison-Wesley.
Schwartz, L. (1966). Théorie des distributions. Paris: Hermann.
Trèves, F. (1967). Topological vector spaces, distributions, and kernels. New York and London: Academic Press.
Van der Pol, B. & Bremmer, H. (1955). Operational calculus, 2nd ed. Cambridge University Press.
Whittaker, E. T. (1928). Oliver Heaviside. Bull. Calcutta Math. Soc. 20, 199–220. [Reprinted in Moore (1971).]
Yosida, K. (1965). Functional analysis. Berlin: Springer-Verlag.