Elements of the theory of distributions

Bricogne, G.

doi:10.1107/97809553602060000551

International Tables for Crystallography
Volume B: Reciprocal space
Edited by U. Shmueli

International Tables for Crystallography (2006). Vol. B. ch. 1.3, pp. 28-34

Section 1.3.2.3. Elements of the theory of distributions

G. Bricogne^a

^a MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, England, and LURE, Bâtiment 209D, Université Paris-Sud, 91405 Orsay, France

1.3.2.3. Elements of the theory of distributions

1.3.2.3.1. Origins

At the end of the 19th century, Heaviside proposed under the name of `operational calculus' a set of rules for solving a class of differential, partial differential and integral equations encountered in electrical engineering (today's `signal processing'). These rules worked remarkably well but were devoid of mathematical justification (see Whittaker, 1928). In 1926, Dirac introduced his famous δ-function [see Dirac (1958), pp. 58–61], which was found to be related to Heaviside's constructs. Other singular objects, together with procedures to handle them, had already appeared in several branches of analysis [Cauchy's `principal values'; Hadamard's `finite parts' (Hadamard, 1932, 1952); Riesz's regularization methods for certain divergent integrals (Riesz, 1938, 1949)] as well as in the theories of Fourier series and integrals (see e.g. Bochner, 1932, 1959). Their very definition often verged on violating the rigorous rules governing limiting processes in analysis, so that subsequent recourse to limiting processes could lead to erroneous results; ad hoc precautions thus had to be observed to avoid mistakes in handling these objects.

In 1945–1950, Laurent Schwartz proposed his theory of distributions (see Schwartz, 1966), which provided a unified and definitive treatment of all these questions, with a striking combination of rigour and simplicity. Schwartz's treatment of Dirac's δ-function illustrates his approach in a most direct fashion. Dirac's original definition reads: $[\displaylines{\quad (\hbox{i})\;\quad\delta ({\bf x}) = 0 \hbox{ for } {\bf x} \neq {\bf 0},\hfill\cr \quad (\hbox{ii})\quad {\textstyle\int_{{\bb R}^{n}}} \delta ({\bf x}) \;\hbox{d}^{n} {\bf x} = 1.\hfill}]$ These two conditions are irreconcilable with Lebesgue's theory of integration: by (i), δ vanishes almost everywhere, so that its integral in (ii) must be 0, not 1.

A better definition consists in specifying that $[\displaylines{\quad (\hbox{iii})\quad {\textstyle\int_{{\bb R}^{n}}} \delta ({\bf x}) \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x} = \varphi ({\bf 0})\hfill}]$ for any function φ sufficiently well behaved near $[{\bf x} = {\bf 0}]$. This is related to the problem of finding a unit for convolution (Section 1.3.2.2.4). As will now be seen, this definition is still unsatisfactory. Let the sequence $[(\;f_{\nu})]$ in $[L^{1} ({\bb R}^{n})]$ be an approximate convolution unit, e.g. the unit-mass Gaussians $[f_{\nu} ({\bf x}) = \left({\nu \over 2\pi}\right)^{n/2} \exp (-{\textstyle{1 \over 2}} \nu \|{\bf x}\|^{2}).]$ Then for any well behaved function φ the integrals $[{\textstyle\int\limits_{{\bb R}^{n}}} f_{\nu} ({\bf x}) \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x}]$ exist, and the sequence of their numerical values tends to $[\varphi ({\bf 0})]$. It is tempting to combine this with (iii) to conclude that δ is the limit of the sequence $[(\;f_{\nu})]$ as $[\nu \rightarrow \infty]$. However, $[\lim f_{\nu} ({\bf x}) = 0 \quad \hbox{as } \nu \rightarrow \infty]$ almost everywhere in $[{\bb R}^{n}]$ and the crux of the problem is that $[\eqalign{\varphi ({\bf 0}) &= \lim\limits_{\nu \rightarrow \infty} {\textstyle\int\limits_{{\bb R}^{n}}} f_{\nu} ({\bf x}) \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x} \cr &\neq {\textstyle\int\limits_{{\bb R}^{n}}} \left[\lim\limits_{\nu \rightarrow \infty} f_{\nu} ({\bf x}) \right] \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x} = 0}]$ because the sequence $[(\;f_{\nu})]$ does not satisfy the hypotheses of Lebesgue's dominated convergence theorem.
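
A small numerical sketch may make the paradox concrete. It assumes dimension 1, a unit-mass Gaussian of variance 1/ν as the approximate convolution unit, and an arbitrary smooth bounded function with φ(0) = 1; the names `f`, `phi` and `pairing` are illustrative choices, not part of the original text. The pairings approach φ(0), while the values of the functions at a fixed point away from the origin collapse to 0.

```python
import math

def f(nu, x):
    """Unit-mass Gaussian of variance 1/nu (one-dimensional case, assumed here)."""
    return math.sqrt(nu / (2.0 * math.pi)) * math.exp(-0.5 * nu * x * x)

def phi(x):
    """An arbitrary smooth, bounded function with phi(0) = 1 (illustrative choice)."""
    return math.cos(x) / (1.0 + x * x)

def pairing(nu, half_width=10.0, steps=100_000):
    """Midpoint-rule estimate of the integral of f_nu * phi over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += f(nu, x) * phi(x)
    return total * h

nus = (1, 10, 100, 1000)
values = {nu: pairing(nu) for nu in nus}        # tends to phi(0) = 1
pointwise = {nu: f(nu, 0.5) for nu in nus}      # tends to 0 at the fixed point x = 0.5

for nu in nus:
    print(nu, values[nu], pointwise[nu])
```

The regular behaviour is visible only in the sequence of numbers `values[nu]`, i.e. in the functionals, not in the pointwise limit of the functions.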

Schwartz's solution to this problem is deceptively simple: the regular behaviour one is trying to capture is an attribute not of the sequence of functions $[(\;f_{\nu})]$ , but of the sequence of continuous linear functionals $[T_{\nu}: \varphi \;\longmapsto\; {\textstyle\int\limits_{{\bb R}^{n}}} f_{\nu} ({\bf x}) \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x}]$ which has as a limit the continuous functional $[T: \varphi \;\longmapsto\; \varphi ({\bf 0}).]$ It is the latter functional which constitutes the proper definition of δ. The previous paradoxes arose because one insisted on writing down the simple linear operation T in terms of an integral.

The essence of Schwartz's theory of distributions is thus that, rather than try to define and handle `generalized functions' via sequences such as $[(\;f_{\nu})]$ [an approach adopted e.g. by Lighthill (1958) and Erdélyi (1962)], one should instead look at them as continuous linear functionals over spaces of well behaved functions.

There are many books on distribution theory and its applications. The reader may consult in particular Schwartz (1965, 1966), Gel'fand & Shilov (1964), Bremermann (1965), Trèves (1967), Challifour (1972), Friedlander (1982), and the relevant chapters of Hörmander (1963) and Yosida (1965). Schwartz (1965) is especially recommended as an introduction.

1.3.2.3.2. Rationale

The guiding principle which leads to requiring that the functions φ above (traditionally called `test functions') should be well behaved is that correspondingly `wilder' behaviour can then be accommodated in the limiting behaviour of the $[f_{\nu}]$ while still keeping the integrals $[{\textstyle\int_{{\bb R}^{n}}} f_{\nu} \varphi \;\hbox{d}^{n} {\bf x}]$ under control. Thus

(i) to minimize restrictions on the limiting behaviour of the $[f_{\nu}]$ at infinity, the φ's will be chosen to have compact support;
(ii) to minimize restrictions on the local behaviour of the $[f_{\nu}]$ , the φ's will be chosen infinitely differentiable.

To ensure further the continuity of functionals such as $[T_{\nu}]$ with respect to the test function φ as the $[f_{\nu}]$ go increasingly wild, very strong control will have to be exercised in the way in which a sequence $[(\varphi_{j})]$ of test functions will be said to converge towards a limiting φ: conditions will have to be imposed not only on the values of the functions $[\varphi_{j}]$ , but also on those of all their derivatives. Hence, defining a strong enough topology on the space of test functions φ is an essential prerequisite to the development of a satisfactory theory of distributions.

1.3.2.3.3. Test-function spaces

With this rationale in mind, the following function spaces will be defined for any open subset Ω of $[{\bb R}^{n}]$ (which may be the whole of $[{\bb R}^{n}]$ ):

(a) $[{\scr E}(\Omega)]$ is the space of complex-valued functions over Ω which are indefinitely differentiable;
(b) $[{\scr D}(\Omega)]$ is the subspace of $[{\scr E}(\Omega)]$ consisting of functions with (unspecified) compact support contained in Ω;
(c) $[{\scr D}_{K} (\Omega)]$ is the subspace of $[{\scr D}(\Omega)]$ consisting of functions whose (compact) support is contained within a fixed compact subset K of Ω.

When Ω is unambiguously defined by the context, we will simply write $[{\scr E},{\scr D},{\scr D}_{K}]$ .

It sometimes suffices to require the existence of continuous derivatives only up to finite order m inclusive. The corresponding spaces are then denoted $[{\scr E}^{(m)},{\scr D}^{(m)},{\scr D}_{K}^{(m)}]$ with the convention that if $[m = 0]$, only continuity is required.

The topologies on these spaces constitute the most important ingredients of distribution theory, and will be outlined in some detail.

1.3.2.3.3.1. Topology on $[{\scr E}(\Omega)]$

It is defined by the family of semi-norms $[\varphi \in {\scr E}(\Omega) \;\longmapsto\; \sigma_{{\bf p}, \, K} (\varphi) = \sup\limits_{{\bf x} \in K} |D^{{\bf p}} \varphi ({\bf x})|,]$ where p is a multi-index and K a compact subset of Ω. A fundamental system S of neighbourhoods of the origin in $[{\scr E}(\Omega)]$ is given by subsets of $[{\scr E}(\Omega)]$ of the form $[V (m, \varepsilon, K) = \{\varphi \in {\scr E}(\Omega)| |{\bf p}| \leq m \Rightarrow \sigma_{{\bf p}, K} (\varphi) \;\lt\; \varepsilon\}]$ for all natural integers m, positive reals ɛ, and compact subsets K of Ω. Since a countable family of compact subsets K suffices to cover Ω, and since restricted values of ɛ of the form $[\varepsilon = 1/N]$ lead to the same topology, S is equivalent to a countable system of neighbourhoods and hence $[{\scr E}(\Omega)]$ is metrizable.

Convergence in $[{\scr E}]$ may thus be defined by means of sequences. A sequence $[(\varphi_{\nu})]$ in $[{\scr E}]$ will be said to converge to 0 if for any given $[V (m, \varepsilon, K)]$ there exists $[\nu_{0}]$ such that $[\varphi_{\nu} \in V (m, \varepsilon, K)]$ whenever $[\nu \gt \nu_{0}]$ ; in other words, if the $[\varphi_{\nu}]$ and all their derivatives $[D^{\bf p} \varphi_{\nu}]$ converge to 0 uniformly on any given compact K in Ω.

1.3.2.3.3.2. Topology on $[{\scr D}_{K} (\Omega)]$

It is defined by the family of semi-norms $[\varphi \in {\scr D}_{K} (\Omega) \;\longmapsto\; \sigma_{\bf p} (\varphi) = \sup\limits_{{\bf x} \in K} |D^{{\bf p}} \varphi ({\bf x})|,]$ where K is now fixed. The fundamental system S of neighbourhoods of the origin in $[{\scr D}_{K}]$ is given by sets of the form $[V (m, \varepsilon) = \{\varphi \in {\scr D}_{K} (\Omega)| |{\bf p}| \leq m \Rightarrow \sigma_{\bf p} (\varphi) \;\lt\; \varepsilon\}.]$ It is equivalent to the countable subsystem of the $[V (m, 1/N)]$, hence $[{\scr D}_{K} (\Omega)]$ is metrizable.

Convergence in $[{\scr D}_{K}]$ may thus be defined by means of sequences. A sequence $[(\varphi_{\nu})]$ in $[{\scr D}_{K}]$ will be said to converge to 0 if for any given $[V(m, \varepsilon)]$ there exists $[\nu_{0}]$ such that $[\varphi_{\nu} \in V(m, \varepsilon)]$ whenever $[\nu \gt \nu_{0}]$ ; in other words, if the $[\varphi_{\nu}]$ and all their derivatives $[D^{\bf p} \varphi_{\nu}]$ converge to 0 uniformly in K.

1.3.2.3.3.3. Topology on $[{\scr D}(\Omega)]$

It is defined by the fundamental system of neighbourhoods of the origin consisting of sets of the form $[\eqalign{&V((m), (\varepsilon)) \cr &\qquad = \left\{\varphi \in {\scr D}(\Omega)| |{\bf p}| \leq m_{\nu} \Rightarrow \sup\limits_{\|{\bf x}\| \leq \nu} |D^{{\bf p}} \varphi ({\bf x})| \;\lt\; \varepsilon_{\nu} \hbox{ for all } \nu\right\},}]$ where (m) is an increasing sequence $[(m_{\nu})]$ of integers tending to $[+ \infty]$ and (ɛ) is a decreasing sequence $[(\varepsilon_{\nu})]$ of positive reals tending to 0, as $[\nu \rightarrow \infty]$ .

This topology is not metrizable, because the sets of sequences (m) and (ɛ) are essentially uncountable. It can, however, be shown to be the inductive limit of the topology of the subspaces $[{\scr D}_{K}]$ , in the following sense: V is a neighbourhood of the origin in $[{\scr D}]$ if and only if its intersection with $[{\scr D}_{K}]$ is a neighbourhood of the origin in $[{\scr D}_{K}]$ for any given compact K in Ω.

A sequence $[(\varphi_{\nu})]$ in $[{\scr D}]$ will thus be said to converge to 0 in $[{\scr D}]$ if all the $[\varphi_{\nu}]$ belong to some $[{\scr D}_{K}]$ (with K a compact subset of Ω independent of ν) and if $[(\varphi_{\nu})]$ converges to 0 in $[{\scr D}_{K}]$ .
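
The difference between convergence in $[{\scr E}]$ and in $[{\scr D}]$ can be illustrated with a rough numerical sketch on the real line. It assumes the standard bump function exp[-1/(1 - x²)] on (-1, 1) as test function and checks only the functions themselves (order 0), not their derivatives; the helper names `bump`, `sup_on`, `translated` and `scaled` are hypothetical. The translates φ(x - ν) eventually vanish on any fixed compact, so they tend to 0 in $[{\scr E}({\bb R})]$, but their supports escape every compact, so they do not converge in $[{\scr D}({\bb R})]$. The rescaled functions φ/ν keep their support in K = [-1, 1] and do tend to 0 in $[{\scr D}_{K}]$.

```python
import math

def bump(x):
    """Standard bump function: smooth, supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def sup_on(func, a, b, samples=2001):
    """Sampled estimate of sup |func| over the compact interval [a, b]."""
    h = (b - a) / (samples - 1)
    return max(abs(func(a + k * h)) for k in range(samples))

def translated(nu):
    """phi_nu(x) = bump(x - nu): support [nu - 1, nu + 1] drifts to infinity."""
    return lambda x: bump(x - nu)

def scaled(nu):
    """phi_nu(x) = bump(x) / nu: support stays inside K = [-1, 1]."""
    return lambda x: bump(x) / nu

# On the fixed compact [-5, 5] the translates are identically 0 once nu >= 6 ...
sup_translated_on_K = {nu: sup_on(translated(nu), -5.0, 5.0) for nu in (1, 4, 7, 10)}
# ... yet each translate still reaches height exp(-1) on its own (moving) support.
sup_translated_on_support = {nu: sup_on(translated(nu), nu - 1.0, nu + 1.0) for nu in (1, 4, 7, 10)}
# The rescaled sequence shrinks uniformly on the fixed compact K = [-1, 1].
sup_scaled = {nu: sup_on(scaled(nu), -1.0, 1.0) for nu in (1, 10, 100)}

print(sup_translated_on_K, sup_translated_on_support, sup_scaled)
```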

As a result, a complex-valued functional T on $[{\scr D}]$ will be said to be continuous for the topology of $[{\scr D}]$ if and only if, for any given compact K in Ω, its restriction to $[{\scr D}_{K}]$ is continuous for the topology of $[{\scr D}_{K}]$ , i.e. maps convergent sequences in $[{\scr D}_{K}]$ to convergent sequences in $[{\bb C}]$ .

This property of $[{\scr D}]$ , i.e. having a non-metrizable topology which is the inductive limit of metrizable topologies in its subspaces $[{\scr D}_{K}]$ , conditions the whole structure of distribution theory and dictates that of many of its proofs.

1.3.2.3.3.4. Topologies on $[{\scr E}^{(m)}, {\scr D}_{K}^{(m)},{\scr D}^{(m)}]$

These are defined similarly, but only involve conditions on derivatives up to order m.

1.3.2.3.4. Definition of distributions

A distribution T on Ω is a linear form over $[{\scr D}(\Omega)]$ , i.e. a map $[T: \varphi \;\longmapsto\; \langle T, \varphi \rangle]$ which associates linearly a complex number $[\langle T, \varphi \rangle]$ to any $[\varphi \in {\scr D}(\Omega)]$ , and which is continuous for the topology of that space. In the terminology of Section 1.3.2.2.6.2, T is an element of $[{\scr D}\,'(\Omega)]$ , the topological dual of $[{\scr D}(\Omega)]$ .

Continuity over $[{\scr D}]$ is equivalent to continuity over $[{\scr D}_{K}]$ for all compact K contained in Ω, and hence to the condition that for any sequence $[(\varphi_{\nu})]$ in $[{\scr D}]$ such that

(i) Supp $[\varphi_{\nu}]$ is contained in some compact K independent of ν,
(ii) the sequences $[(|D^{\bf p} \varphi_{\nu}|)]$ converge uniformly to 0 on K for all multi-indices p;

then the sequence of complex numbers $[\langle T, \varphi_{\nu}\rangle]$ converges to 0 in $[{\bb C}]$ .

If the continuity of a distribution T requires (ii) for $[|{\bf p}| \leq m]$ only, T may be defined over $[{\scr D}^{(m)}]$ and thus $[T \in {\scr D}\,'^{(m)}]$ ; T is said to be a distribution of finite order m. In particular, for $[m = 0, {\scr D}^{(0)}]$ is the space of continuous functions with compact support, and a distribution $[T \in {\scr D}\,'^{(0)}]$ is a (Radon) measure as used in the theory of integration. Thus measures are particular cases of distributions.

Generally speaking, the larger a space of test functions, the smaller its topological dual: $[m \;\lt\; n \Rightarrow {\scr D}^{(m)} \supset {\scr D}^{(n)} \Rightarrow {\scr D}\,'^{(n)} \supset {\scr D}\,'^{(m)}.]$ This clearly results from the observation that if the φ's are allowed to be less regular, then less wildness can be accommodated in T if the continuity of the map $[\varphi \;\longmapsto\; \langle T, \varphi \rangle]$ with respect to φ is to be preserved.

1.3.2.3.5. First examples of distributions

(i) The linear map $[\varphi \;\longmapsto\; \langle \delta, \varphi \rangle = \varphi ({\bf 0})]$ is a measure (i.e. a zeroth-order distribution) called Dirac's measure or (improperly) Dirac's `δ-function'.
(ii) The linear map $[\varphi \;\longmapsto\; \langle \delta_{({\bf a})}, \varphi \rangle = \varphi ({\bf a})]$ is called Dirac's measure at point $[{\bf a} \in {\bb R}^{n}]$ .
(iii) The linear map $[\varphi\;\longmapsto\; (-1)^{|{\bf p}|} D^{\bf p} \varphi ({\bf a})]$ is a distribution of order $[m = |{\bf p}| \gt 0]$ , and hence is not a measure.
(iv) The linear map $[\varphi \;\longmapsto\; {\textstyle\sum_{\nu \gt 0}} \varphi^{(\nu)} (\nu)]$ is a distribution of infinite order on $[{\bb R}]$ : the order of differentiation is bounded for each φ (because φ has compact support) but is not as φ varies.
(v) If $[({\bf p}_{\nu})]$ is a sequence of multi-indices $[{\bf p}_{\nu} = (p_{1\nu}, \ldots, p_{n\nu})]$ such that $[|{\bf p}_{\nu}| \rightarrow \infty]$ as $[\nu \rightarrow \infty]$ , then the linear map $[\varphi \;\longmapsto\; {\textstyle\sum_{\nu \gt 0}} (D^{{\bf p}_{\nu}} \varphi) ({\bf p}_{\nu})]$ is a distribution of infinite order on $[{\bb R}^{n}]$ .
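
Examples (i) to (iii) can be mimicked in one dimension by treating a distribution as a function that takes a test function and returns a number. The sketch below is only an illustration under stated assumptions: the test function is the standard bump function, the derivative is replaced by a central finite difference, and all names (`delta`, `delta_at`, `minus_derivative_at`, `derivative`) are hypothetical.

```python
import math

def bump(x):
    """Standard bump function: smooth, supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def derivative(phi, h=1e-3):
    """Central-difference stand-in for the derivative of a test function."""
    return lambda x: (phi(x + h) - phi(x - h)) / (2.0 * h)

def delta(phi):
    """Example (i): Dirac's measure at the origin."""
    return phi(0.0)

def delta_at(a):
    """Example (ii): Dirac's measure at the point a."""
    return lambda phi: phi(a)

def minus_derivative_at(a):
    """Example (iii) with |p| = 1 in one dimension: phi -> -phi'(a)."""
    return lambda phi: -derivative(phi)(a)

phi = bump
psi = lambda x: x * bump(x)
combo = lambda x: 2.0 * phi(x) - 3.0 * psi(x)

T = minus_derivative_at(0.5)
value_delta = delta(phi)                 # phi(0) = exp(-1)
value_delta_a = delta_at(0.5)(phi)       # phi(0.5) = exp(-4/3)
value_T = T(phi)                         # -phi'(0.5)
# Linearity: <T, 2 phi - 3 psi> = 2 <T, phi> - 3 <T, psi>
linearity_gap = T(combo) - (2.0 * T(phi) - 3.0 * T(psi))
print(value_delta, value_delta_a, value_T, linearity_gap)
```

Each functional is linear in φ; only the third one needs control of a derivative of φ, which is why it is of order 1 rather than a measure.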

1.3.2.3.6. Distributions associated to locally integrable functions

Let f be a complex-valued function over Ω such that $[{\textstyle\int_{K}} | \;f({\bf x}) | \;\hbox{d}^{n} {\bf x}]$ exists for any given compact K in Ω; f is then called locally integrable.

The linear mapping from $[{\scr D}(\Omega)]$ to $[{\bb C}]$ defined by $[\varphi \;\longmapsto\; {\textstyle\int\limits_{\Omega}} f({\bf x}) \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x}]$ may then be shown to be continuous over $[{\scr D}(\Omega)]$ . It thus defines a distribution $[T_{f} \in {\scr D}\,'(\Omega)]$ : $[\langle T_{f}, \varphi \rangle = {\textstyle\int\limits_{\Omega}} f({\bf x}) \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x}.]$ As the continuity of $[T_{f}]$ only requires that $[\varphi \in {\scr D}^{(0)} (\Omega)]$ , $[T_{f}]$ is actually a Radon measure.

It can be shown that two locally integrable functions f and g define the same distribution, i.e. $[\langle T_{f}, \varphi \rangle = \langle T_{g}, \varphi \rangle \quad \hbox{for all } \varphi \in {\scr D},]$ if and only if they are equal almost everywhere. The classes of locally integrable functions modulo this equivalence form a vector space denoted $[L_{\rm loc}^{1} (\Omega)]$ ; each element of $[L_{\rm loc}^{1} (\Omega)]$ may therefore be identified with the distribution $[T_{f}]$ defined by any one of its representatives f.

1.3.2.3.7. Support of a distribution

A distribution $[T \in {\scr D}\,'(\Omega)]$ is said to vanish on an open subset ω of Ω if it vanishes on all functions in $[{\scr D}(\omega)]$ , i.e. if $[\langle T, \varphi \rangle = 0]$ whenever $[\varphi \in {\scr D}(\omega)]$ .

The support of a distribution T, denoted Supp T, is then defined as the complement of the set-theoretic union of those open subsets ω on which T vanishes; or equivalently as the smallest closed subset of Ω outside which T vanishes.

When $[T = T_{f}]$ for $[f \in L_{\rm loc}^{1} (\Omega)]$ , then Supp $[T = \hbox{Supp } f]$ , so that the two notions coincide. Clearly, if Supp T and Supp φ are disjoint subsets of Ω, then $[\langle T, \varphi \rangle = 0]$ .

It can be shown that any distribution $[T \in {\scr D}\,']$ with compact support may be extended from $[{\scr D}]$ to $[{\scr E}]$ while remaining continuous, so that $[T \in {\scr E}\,']$ ; and that conversely, if $[S \in {\scr E}\,']$ , then its restriction T to $[{\scr D}]$ is a distribution with compact support. Thus, the topological dual $[{\scr E}\,']$ of $[{\scr E}]$ consists of those distributions in $[{\scr D}\,']$ which have compact support. This is intuitively clear since, if the condition of having compact support is fulfilled by T, it need no longer be required of φ, which may then roam through $[{\scr E}]$ rather than $[{\scr D}]$ .

1.3.2.3.8. Convergence of distributions

A sequence $[(T_{j})]$ of distributions will be said to converge in $[{\scr D}\,']$ to a distribution T as $[j \rightarrow \infty]$ if, for any given $[\varphi \in {\scr D}]$ , the sequence of complex numbers $[(\langle T_{j}, \varphi \rangle)]$ converges in $[{\bb C}]$ to the complex number $[\langle T, \varphi \rangle]$ .

A series $[{\textstyle\sum_{j=0}^{\infty}} T_{j}]$ of distributions will be said to converge in $[{\scr D}\,']$ and to have distribution S as its sum if the sequence of partial sums $[S_{k} = {\textstyle\sum_{j=0}^{k}} T_{j}]$ converges to S.

These definitions of convergence in $[{\scr D}\,']$ assume that the limits T and S are known in advance, and are distributions. This raises the question of the completeness of $[{\scr D}\,']$ : if a sequence $[(T_{j})]$ in $[{\scr D}\,']$ is such that the sequence $[(\langle T_{j}, \varphi \rangle)]$ has a limit in $[{\bb C}]$ for all $[\varphi \in {\scr D}]$ , does the map $[\varphi \;\longmapsto\; \lim_{j \rightarrow \infty} \langle T_{j}, \varphi \rangle]$ define a distribution $[T \in {\scr D}\,']$ ? In other words, does the limiting process preserve continuity with respect to φ? It is a remarkable theorem that, because of the strong topology on $[{\scr D}]$ , this is actually the case. An analogous statement holds for series. This notion of convergence does not coincide with any of the classical notions used for ordinary functions: for example, the sequence $[(\varphi_{\nu})]$ with $[\varphi_{\nu} (x) = \cos \nu x]$ converges to 0 in $[{\scr D}\,'({\bb R})]$ , but fails to do so by any of the standard criteria.
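
The cos νx example can be checked numerically against a single test function. The sketch below assumes the standard bump function on [-1, 1] and a midpoint quadrature; `pairing` is a hypothetical helper. The pairings shrink rapidly as ν grows, even though the supremum of |cos νx| stays equal to 1 for every ν.

```python
import math

def bump(x):
    """Standard bump function: smooth, supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def pairing(nu, steps=20_000):
    """Midpoint-rule estimate of the integral of cos(nu x) * bump(x) over [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for k in range(steps):
        x = -1.0 + (k + 0.5) * h
        total += math.cos(nu * x) * bump(x)
    return total * h

values = {nu: pairing(nu) for nu in (1, 10, 50, 200)}
for nu, value in values.items():
    print(nu, value)
```

A single test function proves nothing by itself, of course; the point is only to show the kind of averaging that makes rapidly oscillating functions small in $[{\scr D}\,']$.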

An example of convergent sequences of distributions is provided by sequences which converge to δ. If $[(\;f_{\nu})]$ is a sequence of locally summable functions on $[{\bb R}^{n}]$ such that

(i) $[\textstyle{\int_{\|{\bf x}\| \lt\; b}} \;f_{\nu} ({\bf x}) \;\hbox{d}^{n} {\bf x} \rightarrow 1]$ as $[\nu \rightarrow \infty]$ for all $[b \gt 0]$ ;
(ii) $[{\textstyle\int_{a \leq \|{\bf x}\| \leq 1/a}} |\;f_{\nu} ({\bf x})| \;\hbox{d}^{n} {\bf x} \rightarrow 0]$ as $[\nu \rightarrow \infty]$ for all $[0 \;\lt\; a \;\lt\; 1]$ ;
(iii) there exist $[d \gt 0]$ and $[M \gt 0]$ such that $[{\textstyle\int_{\|{\bf x}\|\lt\; d}} |\;f_{\nu} ({\bf x})| \;\hbox{d}^{n} {\bf x}\lt M]$ for all ν;

then the sequence $[(T_{f_{\nu}})]$ of distributions converges to δ in $[{\scr D}\,'({\bb R}^{n})]$ .

1.3.2.3.9. Operations on distributions

As a general rule, the definitions are chosen so that the operations coincide with those on functions whenever a distribution is associated to a function.

Most definitions consist in transferring to a distribution T an operation which is well defined on $[\varphi \in {\scr D}]$ by `transposing' it in the duality product $[\langle T, \varphi \rangle]$ ; this procedure will map T to a new distribution provided the original operation maps $[{\scr D}]$ continuously into itself.

1.3.2.3.9.1. Differentiation

(a) Definition and elementary properties

If T is a distribution on $[{\bb R}^{n}]$ , its partial derivative $[\partial_{i} T]$ with respect to $[x_{i}]$ is defined by $[\langle \partial_{i} T, \varphi \rangle = - \langle T, \partial_{i} \varphi \rangle]$

for all $[\varphi \in {\scr D}]$ . This does define a distribution, because the partial differentiations $[\varphi \;\longmapsto\; \partial_{i} \varphi]$ are continuous for the topology of $[{\scr D}]$ .

Suppose that $[T = T_{f}]$ with f a locally integrable function such that $[\partial_{i}\; f]$ exists and is almost everywhere continuous. Then integration by parts along the $[x_{i}]$ axis gives $[\eqalign{&{\textstyle\int\limits_{-\infty}^{+\infty}} \partial_{i}\; f(x_{1}, \ldots, x_{i}, \ldots, x_{n}) \varphi (x_{1}, \ldots, x_{i}, \ldots, x_{n}) \;\hbox{d}x_{i} \cr &\quad = (\;f\varphi)(x_{1}, \ldots, + \infty, \ldots, x_{n}) - (\;f\varphi)(x_{1}, \ldots, - \infty, \ldots, x_{n}) \cr &\qquad - {\textstyle\int\limits_{-\infty}^{+\infty}} f(x_{1}, \ldots, x_{i}, \ldots, x_{n}) \partial_{i} \varphi (x_{1}, \ldots, x_{i}, \ldots, x_{n}) \;\hbox{d}x_{i}\hbox{;}}]$ the integrated term vanishes, since φ has compact support, and integration over the remaining coordinates then shows that $[\partial_{i} T_{f} = T_{\partial_{i}\; f}]$ .

The test functions $[\varphi \in {\scr D}]$ are infinitely differentiable. Therefore, transpositions like that used to define $[\partial_{i} T]$ may be repeated, so that any distribution is infinitely differentiable. For instance, $[\displaylines{\langle \partial_{ij}^{2} T, \varphi \rangle = - \langle \partial_{j} T, \partial_{i} \varphi \rangle = \langle T, \partial_{ij}^{2} \varphi \rangle, \cr \langle D^{\bf p} T, \varphi \rangle = (-1)^{|{\bf p}|} \langle T, D^{\bf p} \varphi \rangle, \cr \langle \Delta T, \varphi \rangle = \langle T, \Delta \varphi \rangle,}]$ where Δ is the Laplacian operator. The derivatives of Dirac's δ distribution are $[\langle D^{\bf p} \delta, \varphi \rangle = (-1)^{|{\bf p}|} \langle \delta, D^{\bf p} \varphi \rangle = (-1)^{|{\bf p}|} D^{\bf p} \varphi ({\bf 0}).]$

It is remarkable that differentiation is a continuous operation for the topology on $[{\scr D}\,']$ : if a sequence $[(T_{j})]$ of distributions converges to distribution T, then the sequence $[(D^{\bf p} T_{j})]$ of derivatives converges to $[D^{\bf p} T]$ for any multi-index p, since as $[j \rightarrow \infty]$ $[\langle D^{\bf p} T_{j}, \varphi \rangle = (-1)^{|{\bf p}|} \langle T_{j}, D^{\bf p} \varphi \rangle \rightarrow (-1)^{|{\bf p}|} \langle T, D^{\bf p} \varphi \rangle = \langle D^{\bf p} T, \varphi \rangle.]$ An analogous statement holds for series: any convergent series of distributions may be differentiated termwise to all orders. This illustrates how `robust' the constructs of distribution theory are in comparison with those of ordinary function theory, where similar statements are notoriously untrue.

(b) Differentiation under the duality bracket

Limiting processes and differentiation may also be carried out under the duality bracket $[\langle ,\rangle]$ as under the integral sign with ordinary functions. Let the function $[\varphi = \varphi ({\bf x}, \lambda)]$ depend on a parameter $[\lambda \in \Lambda]$ and a vector $[{\bf x} \in {\bb R}^{n}]$ in such a way that all functions $[\varphi_{\lambda}: {\bf x} \;\longmapsto\; \varphi ({\bf x}, \lambda)]$ be in $[{\scr D}({\bb R}^{n})]$ for all $[\lambda \in \Lambda]$ . Let $[T \in {\scr D}^{\prime}({\bb R}^{n})]$ be a distribution, let $[I(\lambda) = \langle T, \varphi_{\lambda}\rangle]$ and let $[\lambda_{0} \in \Lambda]$ be a given parameter value. Suppose that, as λ runs through a small enough neighbourhood of $[\lambda_{0}]$ ,

(i) all the $[\varphi_{\lambda}]$ have their supports in a fixed compact subset K of $[{\bb R}^{n}]$ ;
(ii) all the derivatives $[D^{\bf p} \varphi_{\lambda}]$ have a partial derivative with respect to λ which is continuous with respect to x and λ.

Under these hypotheses, $[I(\lambda)]$ is differentiable (in the usual sense) with respect to λ near $[\lambda_{0}]$ , and its derivative may be obtained by `differentiation under the $[\langle ,\rangle]$ sign': $[{\hbox{d}I \over \hbox{d}\lambda} = \langle T, \partial_{\lambda} \varphi_{\lambda}\rangle.]$

(c) Effect of discontinuities

When a function f or its derivatives are no longer continuous, the derivatives $[D^{\bf p} T_{f}]$ of the associated distribution $[T_{f}]$ may no longer coincide with the distributions associated to the functions $[D^{\bf p} f]$ .

In dimension 1, the simplest example is Heaviside's unit step function $[Y\; [Y(x) = 0 \hbox{ for } x \;\lt\; 0, Y(x) = 1 \hbox{ for } x \geq 0]]$ : $[\langle (T_{Y})', \varphi \rangle = - \langle (T_{Y}), \varphi'\rangle = - {\textstyle\int\limits_{0}^{+ \infty}} \varphi' (x) \;\hbox{d}x = \varphi (0) = \langle \delta, \varphi \rangle.]$ Hence $[(T_{Y})' = \delta]$ , a result long used `heuristically' by electrical engineers [see also Dirac (1958)].
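
The identity $[(T_{Y})' = \delta]$ can be checked against one test function. The sketch below assumes the standard bump function, whose derivative is available in closed form, and a midpoint quadrature; `bump_prime` and `integrate` are hypothetical helper names.

```python
import math

def bump(x):
    """Standard bump function: smooth, supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def bump_prime(x):
    """Exact derivative of the bump function."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def integrate(func, a, b, steps=20_000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

# <(T_Y)', phi> = -<T_Y, phi'> = -(integral of phi' over [0, +infinity));
# the bump vanishes beyond x = 1, so integrating over [0, 1] is enough.
lhs = -integrate(bump_prime, 0.0, 1.0)
# <delta, phi> = phi(0)
rhs = bump(0.0)
print(lhs, rhs)
```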

Let f be infinitely differentiable for $[x \;\lt\; 0]$ and $[x \gt 0]$ but have discontinuous derivatives $[f^{(m)}]$ at $[x = 0]$ [ $[\;f^{(0)}]$ being f itself] with jumps $[\sigma_{m} = f^{(m)} (0 +) - f^{(m)} (0 -)]$ . Consider the functions: $[\eqalign{g_{0} &= f - \sigma_{0} Y \cr g_{1} &= g'_{0} - \sigma_{1} Y \cr &\;\;\vdots \cr g_{k} &= g'_{k - 1} - \sigma_{k} Y.}]$ The $[g_{k}]$ are continuous, their derivatives $[g'_{k}]$ are continuous almost everywhere [which implies that $[(T_{g_{k}})' = T_{g'_{k}}]$ and $[g'_{k} = f^{(k + 1)}]$ almost everywhere]. This yields immediately: $[\eqalign{(T_{f})' &= T_{f'} + \sigma_{0} \delta \cr (T_{f})'' &= T_{f''} + \sigma_{0} \delta' + \sigma_{1} \delta \cr &\;\;\vdots \cr (T_{f})^{(m)} &= T_{f^{(m)}} + \sigma_{0} \delta^{(m - 1)} + \ldots + \sigma_{m - 1} \delta.}]$ Thus the `distributional derivatives' $[(T_{f})^{(m)}]$ differ from the usual functional derivatives $[T_{f^{(m)}}]$ by singular terms associated with discontinuities.
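
As a worked illustration of these jump formulas (the choice of f is an example added here, not taken from the text), consider f(x) = exp(-|x|), which is continuous at the origin but whose first derivative jumps there:

```latex
\begin{aligned}
f(x) &= e^{-|x|}, &\qquad \sigma_{0} &= f(0+) - f(0-) = 0, \\
f'(x) &= -\operatorname{sgn}(x)\, e^{-|x|} \quad (x \neq 0), &\qquad \sigma_{1} &= f'(0+) - f'(0-) = -2, \\
f''(x) &= e^{-|x|} \quad (x \neq 0), \\[4pt]
(T_{f})' &= T_{f'} + \sigma_{0}\,\delta = T_{f'}, \\
(T_{f})'' &= T_{f''} + \sigma_{0}\,\delta' + \sigma_{1}\,\delta = T_{f} - 2\delta .
\end{aligned}
```

The last line says that, in the sense of distributions, -T'' + T = 2δ for T = T_f, whereas the pointwise computation away from the origin would suggest -f'' + f = 0.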

In dimension n, let f be infinitely differentiable everywhere except on a smooth hypersurface S, across which its partial derivatives show discontinuities. Let $[\sigma_{0}]$ and $[\sigma_{\nu}]$ denote the discontinuities of f and its normal derivative $[\partial_{\nu} f]$ across S (both $[\sigma_{0}]$ and $[\sigma_{\nu}]$ are functions of position on S), and let $[\delta_{(S)}]$ and $[\partial_{\nu} \delta_{(S)}]$ be defined by $[\eqalign{\langle \delta_{(S)}, \varphi \rangle &= {\textstyle\int\limits_{S}} \varphi \;\hbox{d}^{n - 1} S \cr \langle \partial_{\nu} \delta_{(S)}, \varphi \rangle &= - {\textstyle\int\limits_{S}} \partial_{\nu} \varphi \;\hbox{d}^{n - 1} S.}]$ Integration by parts shows that $[\partial_{i} T_{f} = T_{\partial_{i}\; f} + \sigma_{0} \cos \theta_{i} \delta_{(S)},]$ where $[\theta_{i}]$ is the angle between the $[x_{i}]$ axis and the normal to S along which the jump $[\sigma_{0}]$ occurs, and that the Laplacian of $[T_{f}]$ is given by $[\Delta (T_{f}) = T_{\Delta f} + \sigma_{\nu} \delta_{(S)} + \partial_{\nu} [\sigma_{0} \delta_{(S)}].]$ The latter result is a statement of Green's theorem in terms of distributions. It will be used in Section 1.3.4.4.3.5 to calculate the Fourier transform of the indicator function of a molecular envelope.

1.3.2.3.9.2. Integration of distributions in dimension 1

The reverse operation from differentiation, namely calculating the `indefinite integral' of a distribution S, consists in finding a distribution T such that $[T' = S]$.

For all $[\chi \in {\scr D}]$ such that $[\chi = \psi']$ with $[\psi \in {\scr D}]$ , we must have $[\langle T, \chi \rangle = - \langle S, \psi \rangle .]$ This condition defines T in a `hyperplane' $[{\scr H}]$ of $[{\scr D}]$ , whose equation $[\langle 1, \chi \rangle \equiv \langle 1, \psi' \rangle = 0]$ reflects the fact that ψ has compact support.

To specify T in the whole of $[{\scr D}]$ , it suffices to specify the value of $[\langle T, \varphi_{0} \rangle]$ where $[\varphi_{0} \in {\scr D}]$ is such that $[\langle 1, \varphi_{0} \rangle = 1]$ : then any $[\varphi \in {\scr D}]$ may be written uniquely as $[\varphi = \lambda \varphi_{0} + \psi']$ with $[\lambda = \langle 1, \varphi \rangle, \qquad \chi = \varphi - \lambda \varphi_{0}, \qquad \psi (x) = {\textstyle\int\limits_{-\infty}^{x}} \chi (t) \;\hbox{d}t,]$ where ψ has compact support because $[\langle 1, \chi \rangle = 0]$ , and T is defined by $[\langle T, \varphi \rangle = \lambda \langle T, \varphi_{0} \rangle - \langle S, \psi \rangle.]$ The freedom in the choice of the value $[\langle T, \varphi_{0} \rangle]$ means that T is defined up to an additive constant.
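
The decomposition $[\varphi = \lambda \varphi_{0} + \psi']$ can be traced numerically. In the sketch below, which is only an illustration, both φ and φ₀ are rescaled bump functions, integrals are midpoint sums over an interval [A, B] containing both supports, and χ is integrated from A, a point to the left of both supports (an assumption made here so that ψ vanishes on both sides). All names are hypothetical.

```python
import math

def bump(x, center=0.0, radius=1.0):
    """Bump function supported in [center - radius, center + radius]."""
    t = (x - center) / radius
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def integrate(func, a, b, steps=20_000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

A, B = -3.0, 3.0                      # interval containing both supports

mass0 = integrate(bump, A, B)
phi0 = lambda x: bump(x) / mass0      # reference function with <1, phi0> = 1
phi = lambda x: bump(x, center=1.0, radius=1.5)   # test function to decompose

lam = integrate(phi, A, B)            # lambda = <1, phi>
chi = lambda x: phi(x) - lam * phi0(x)            # chi lies in the hyperplane <1, chi> = 0
psi = lambda x: integrate(chi, A, x, steps=4000)  # primitive of chi vanishing to the left

total_chi = integrate(chi, A, B)      # should vanish
psi_inside = psi(0.0)                 # generally non-zero inside the supports
psi_right = psi(2.9)                  # should vanish to the right of both supports
print(lam, total_chi, psi_inside, psi_right)
```

Since ψ vanishes on both sides, it is itself a test function, so that ⟨S, ψ⟩ makes sense and the formula for ⟨T, φ⟩ can be applied.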

1.3.2.3.9.3. Multiplication of distributions by functions

The product $[\alpha T]$ of a distribution T on $[{\bb R}^{n}]$ by a function α over $[{\bb R}^{n}]$ will be defined by transposition: $[\langle \alpha T, \varphi \rangle = \langle T, \alpha \varphi \rangle \quad \hbox{for all } \varphi \in {\scr D}.]$ In order that $[\alpha T]$ be a distribution, the mapping $[\varphi \;\longmapsto\; \alpha \varphi]$ must send $[{\scr D}({\bb R}^{n})]$ continuously into itself; hence the multipliers α must be infinitely differentiable. The product of two general distributions cannot be defined. The need for a careful treatment of multipliers of distributions will become clear when it is later shown (Section 1.3.2.5.8) that the Fourier transformation turns convolutions into multiplications and vice versa.

If T is a distribution of order m, then α needs only have continuous derivatives up to order m. For instance, δ is a distribution of order zero, and $[\alpha \delta = \alpha ({\bf 0}) \delta]$ is a distribution provided α is continuous; this relation is of fundamental importance in the theory of sampling and of the properties of the Fourier transformation related to sampling (Sections 1.3.2.6.4, 1.3.2.6.6). More generally, $[D^{{\bf p}}\delta]$ is a distribution of order $[|{\bf p}|]$ , and the following formula holds for all $[\alpha \in {\scr D}^{(m)}]$ with $[m = |{\bf p}|]$ : $[\alpha (D^{{\bf p}}\delta) = {\displaystyle\sum\limits_{{\bf q} \leq {\bf p}}} (-1)^{|{\bf p}-{\bf q}|} \pmatrix{{\bf p}\cr {\bf q}\cr} (D^{{\bf p}-{\bf q}} \alpha) ({\bf 0}) D^{\bf q}\delta.]$
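
As a check of this formula in the simplest non-trivial case (one dimension, p = 1, worked out here directly from the definition by transposition), the product of a function α with δ′ is:

```latex
\begin{aligned}
\langle \alpha\,\delta', \varphi \rangle
  &= \langle \delta', \alpha\varphi \rangle
   = -(\alpha\varphi)'(0) \\
  &= -\alpha'(0)\,\varphi(0) - \alpha(0)\,\varphi'(0) \\
  &= \langle -\alpha'(0)\,\delta + \alpha(0)\,\delta', \varphi \rangle ,
\end{aligned}
\qquad\text{i.e.}\qquad
\alpha\,\delta' = \alpha(0)\,\delta' - \alpha'(0)\,\delta .
```

This agrees with the general sum: the term q = 1 contributes α(0) δ′ and the term q = 0 contributes -α′(0) δ. In particular, taking α(x) = x gives x δ′ = -δ.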

The derivative of a product is easily shown to be $[\partial_{i}(\alpha T) = (\partial_{i}\alpha) T + \alpha (\partial_{i}T)]$ and generally for any multi-index p $[D^{\bf p}(\alpha T) = {\displaystyle\sum\limits_{{\bf q}\leq {\bf p}}} \pmatrix{{\bf p}\cr {\bf q}\cr} (D^{{\bf p}-{\bf q}} \alpha) D^{{\bf q}}T.]$

1.3.2.3.9.4. Division of distributions by functions

Given a distribution S on $[{\bb R}^{n}]$ and an infinitely differentiable multiplier function α, the division problem consists in finding a distribution T such that $[\alpha T = S]$ .

If α never vanishes, $[T = S/\alpha]$ is the unique answer. If $[n = 1]$ , and if α has only isolated zeros of finite order, the problem can be reduced to a collection of cases where the multiplier is $[x^{m}]$ , for which the general solution can be shown to be of the form $[T = U + {\textstyle\sum\limits_{i=0}^{m-1}} c_{i}\delta^{(i)},]$ where U is a particular solution of the division problem $[x^{m} U = S]$ and the $[c_{i}]$ are arbitrary constants.
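To make the case m = 1 concrete, take S = δ. Since $[\langle x\delta', \varphi \rangle = -(x\varphi)'(0) = -\varphi (0)]$ , one has $[x\delta' = -\delta]$ , so $[U = -\delta']$ is a particular solution and the general solution is $[T = -\delta' + c\delta]$ . The sketch below checks this numerically under stated assumptions: δ is replaced by a narrow Gaussian, the trial function `phi` is smooth but not compactly supported, and all function names are hypothetical.

```python
import math

def gauss(x, eps):
    """Narrow normalized Gaussian, used as a stand-in for delta (assumption)."""
    return math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def dgauss(x, eps):
    """Derivative of gauss, used as a stand-in for delta'."""
    return -x / (eps * eps) * gauss(x, eps)

def phi(x):
    """Smooth trial function; the fast decay of the Gaussian makes
    compact support unnecessary for this numerical check."""
    return math.cos(x) + 0.5 * x

def pair_xT(c, eps, half_width=1.0, n=20001):
    """Approximate <x T, phi> = <T, x phi> for T = -delta' + c*delta
    with the trapezoidal rule on [-half_width, half_width]."""
    h = 2 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        x = -half_width + i * h
        t = -dgauss(x, eps) + c * gauss(x, eps)
        w = 0.5 if i in (0, n - 1) else 1.0
        total += w * t * x * phi(x)
    return total * h

# Both values should be close to <delta, phi> = phi(0), whatever c is.
value_c0 = pair_xT(0.0, 0.01)
value_c3 = pair_xT(3.0, 0.01)
```

The constant c drops out because $[x\delta = 0]$ , which is exactly the indeterminacy expressed by the term $[c_{0}\delta]$ in the general solution.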

In dimension $[n \gt 1]$ , the problem is much more difficult, but is of fundamental importance in the theory of linear partial differential equations, since the Fourier transformation turns the problem of solving these into a division problem for distributions [see Hörmander (1963)].

1.3.2.3.9.5. Transformation of coordinates

Let σ be a smooth non-singular change of variables in $[{\bb R}^{n}]$ , i.e. an infinitely differentiable mapping from an open subset Ω of $[{\bb R}^{n}]$ to Ω′ in $[{\bb R}^{n}]$ , whose Jacobian $[J(\sigma) = \det \left[{\partial \sigma ({\bf x}) \over \partial {\bf x}}\right]]$ vanishes nowhere in Ω. By the implicit function theorem, the inverse mapping $[\sigma^{-1}]$ from Ω′ to Ω is well defined.

If f is a locally summable function on Ω, then the function $[\sigma^{\#} f]$ defined by $[(\sigma^{\#} f)({\bf x}) = f[\sigma^{-1}({\bf x})]]$ is a locally summable function on Ω′, and for any $[\varphi \in {\scr D}(\Omega')]$ we may write: $[\eqalign{{\textstyle\int\limits_{\Omega'}} (\sigma^{\#} f) ({\bf x}) \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x} &= {\textstyle\int\limits_{\Omega'}} f[\sigma^{-1} ({\bf x})] \varphi ({\bf x}) \;\hbox{d}^{n} {\bf x} \cr &= {\textstyle\int\limits_{\Omega}} f({\bf y}) \varphi [\sigma ({\bf y})]|J(\sigma)| \;\hbox{d}^{n} {\bf y} \quad \hbox{by } {\bf x} = \sigma ({\bf y}).}]$ In terms of the associated distributions $[\langle T_{\sigma^{\#} f}, \varphi \rangle = \langle T_{f}, |J(\sigma)|(\sigma^{-1})^{\#} \varphi \rangle.]$

This operation can be extended to an arbitrary distribution T by defining its image $[\sigma^{\#} T]$ under coordinate transformation σ through $[\langle \sigma^{\#} T, \varphi \rangle = \langle T, |J(\sigma)|(\sigma^{-1})^{\#} \varphi \rangle,]$ which is well defined provided that σ is proper, i.e. that $[\sigma^{-1}(K)]$ is compact whenever K is compact.

For instance, if $[\sigma: {\bf x} \;\longmapsto\; {\bf x} + {\bf a}]$ is a translation by a vector a in $[{\bb R}^{n}]$ , then $[|J(\sigma)| = 1]$ ; $[\sigma^{\#}]$ is denoted by $[\tau_{\bf a}]$ , and the translate $[\tau_{\bf a} T]$ of a distribution T is defined by $[\langle \tau_{\bf a} T, \varphi \rangle = \langle T, \tau_{-{\bf a}} \varphi \rangle.]$

Let $[A: {\bf x} \;\longmapsto\; {\bf Ax}]$ be a linear transformation defined by a non-singular matrix A. Then $[J(A) = \det {\bf A}]$ , and $[\langle A^{\#} T, \varphi \rangle = |\det {\bf A}| \langle T, (A^{-1})^{\#} \varphi \rangle.]$ This formula will be shown later (Sections 1.3.2.6.5, 1.3.4.2.1.1) to be the basis for the definition of the reciprocal lattice.

In particular, if $[{\bf A} = -{\bf I}]$ , where I is the identity matrix, A is an inversion through a centre of symmetry at the origin, and denoting $[A^{\#} \varphi]$ by $[\breve{\varphi}]$ we have: $[\langle \breve{T}, \varphi \rangle = \langle T, \breve{\varphi} \rangle.]$ T is called an even distribution if $[\breve{T} = T]$ , an odd distribution if $[\breve{T} = -T]$ .

If $[{\bf A} = \lambda {\bf I}]$ with $[\lambda \gt 0]$ , A is called a dilation and $[\langle A^{\#} T, \varphi \rangle = \lambda^{n} \langle T, (A^{-1})^{\#} \varphi \rangle.]$ Writing symbolically δ as $[\delta ({\bf x})]$ and $[A^{\#} \delta]$ as $[\delta ({\bf x}/\lambda)]$ , we have: $[\delta ({\bf x}/\lambda) = \lambda^{n} \delta ({\bf x}).]$ If $[n = 1]$ and f is a function with isolated simple zeros $[x_{j}]$ , then in the same symbolic notation $[\delta [\;f(x)] = \sum\limits_{j} {1 \over |\;f'(x_{j})|} \delta (x - x_{j}),]$ where each $[\lambda_{j} = 1/|\;f'(x_{j})|]$ is analogous to a `Lorentz factor' at zero $[x_{j}]$ .
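The formula for $[\delta [\;f(x)]]$ can be checked numerically. The following is a minimal sketch, not a proof: it assumes that δ may be approximated by a narrow Gaussian, and it uses a hypothetical example $[f(x) = x^{2} - 1]$ (simple zeros at ±1) together with a smooth, rapidly decaying trial function.

```python
import math

def delta_eps(u, eps):
    """Narrow normalized Gaussian approximating delta (assumption)."""
    return math.exp(-u * u / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def f(x):
    """Example function with simple zeros at x = -1 and x = +1."""
    return x * x - 1.0

def fprime(x):
    return 2.0 * x

def phi(x):
    """Smooth, rapidly decaying trial function."""
    return math.exp(-x * x) * (2.0 + x)

def lhs(eps=0.005, a=-3.0, b=3.0, n=60001):
    """Trapezoidal approximation of the integral of delta_eps(f(x)) * phi(x)."""
    h = (b - a) / (n - 1)
    total = 0.0
    for i in range(n):
        x = a + i * h
        w = 0.5 if i in (0, n - 1) else 1.0
        total += w * delta_eps(f(x), eps) * phi(x)
    return total * h

def rhs():
    """Sum over the zeros x_j of phi(x_j) / |f'(x_j)|."""
    zeros = (-1.0, 1.0)
    return sum(phi(xj) / abs(fprime(xj)) for xj in zeros)

left, right = lhs(), rhs()  # the two values should agree closely
```

Each zero contributes with weight $[1/|\;f'(x_{j})|]$ because, near $[x_{j}]$ , f behaves like the dilation $[x \;\longmapsto\; f'(x_{j})(x - x_{j})]$ .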

1.3.2.3.9.6. Tensor product of distributions

The purpose of this construction is to extend Fubini's theorem to distributions. Following Section 1.3.2.2.5, we may define the tensor product $[L_{\rm loc}^{1} ({\bb R}^{m}) \otimes L_{\rm loc}^{1} ({\bb R}^{n})]$ as the vector space of finite linear combinations of functions of the form $[f \otimes g: ({\bf x},{ \bf y}) \;\longmapsto\; f({\bf x})g({\bf y}),]$ where $[{\bf x} \in {\bb R}^{m},{\bf y} \in {\bb R}^{n}, f \in L_{\rm loc}^{1} ({\bb R}^{m})]$ and $[g \in L_{\rm loc}^{1} ({\bb R}^{n})]$ .

Let $[S_{\bf x}]$ and $[T_{\bf y}]$ denote the distributions associated to f and g, respectively, the subscripts x and y acting as mnemonics for $[{\bb R}^{m}]$ and $[{\bb R}^{n}]$ . It follows from Fubini's theorem (Section 1.3.2.2.5) that $[f \otimes g \in L_{\rm loc}^{1} ({\bb R}^{m} \times {\bb R}^{n})]$ , and hence defines a distribution over $[{\bb R}^{m} \times {\bb R}^{n}]$ ; the rearrangement of integral signs gives $[\langle S_{\bf x} \otimes T_{\bf y}, \varphi_{{\bf x}, \,{\bf y}} \rangle = \langle S_{\bf x}, \langle T_{\bf y}, \varphi_{{\bf x}, \,{\bf y}} \rangle\rangle = \langle T_{\bf y}, \langle S_{\bf x}, \varphi_{{\bf x}, \, {\bf y}} \rangle\rangle]$ for all $[\varphi_{{\bf x}, \,{\bf y}} \in {\scr D}({\bb R}^{m} \times {\bb R}^{n})]$ . In particular, if $[\varphi ({\bf x},{ \bf y}) = u({\bf x}) v({\bf y})]$ with $[u \in {\scr D}({\bb R}^{m}),v \in {\scr D}({\bb R}^{n})]$ , then $[\langle S \otimes T, u \otimes v \rangle = \langle S, u \rangle \langle T, v \rangle.]$

This construction can be extended to general distributions $[S \in {\scr D}\,'({\bb R}^{m})]$ and $[T \in {\scr D}\,'({\bb R}^{n})]$ . Given any test function $[\varphi \in {\scr D}({\bb R}^{m} \times {\bb R}^{n})]$ , let $[\varphi_{\bf x}]$ denote the map $[{\bf y} \;\longmapsto\; \varphi ({\bf x}, {\bf y})]$ ; let $[\varphi_{\bf y}]$ denote the map $[{\bf x} \;\longmapsto\; \varphi ({\bf x},{\bf y})]$ ; and define the two functions $[\theta ({\bf x}) = \langle T, \varphi_{\bf x} \rangle]$ and $[\omega ({\bf y}) = \langle S, \varphi_{\bf y} \rangle]$ . Then, by the lemma on differentiation under the $[\langle,\rangle]$ sign of Section 1.3.2.3.9.1, $[\theta \in {\scr D}({\bb R}^{m}),\omega \in {\scr D}({\bb R}^{n})]$ , and there exists a unique distribution $[S \otimes T]$ such that $[\langle S \otimes T, \varphi \rangle = \langle S, \theta \rangle = \langle T, \omega \rangle.]$ $[S \otimes T]$ is called the tensor product of S and T.

With the mnemonic introduced above, this definition reads identically to that given above for distributions associated to locally integrable functions: $[\langle S_{\bf x} \otimes T_{\bf y}, \varphi_{{\bf x}, \, {\bf y}} \rangle = \langle S_{\bf x}, \langle T_{\bf y}, \varphi_{{\bf x}, \, {\bf y}} \rangle\rangle = \langle T_{\bf y}, \langle S_{\bf x}, \varphi_{{\bf x}, \, {\bf y}} \rangle\rangle.]$

The tensor product of distributions is associative: $[(R \otimes S) \otimes T = R \otimes (S \otimes T).]$ Derivatives may be calculated by $[D_{\bf x}^{\bf p} D_{\bf y}^{\bf q} (S_{\bf x} \otimes T_{\bf y}) = (D_{\bf x}^{\bf p} S_{\bf x}) \otimes (D_{\bf y}^{\bf q} T_{\bf y}).]$ The support of a tensor product is the Cartesian product of the supports of the two factors.

1.3.2.3.9.7. Convolution of distributions

The convolution $[f * g]$ of two functions f and g on $[{\bb R}^{n}]$ is defined by $[(\;f * g) ({\bf x}) = {\textstyle\int\limits_{{\bb R}^{n}}} f({\bf y}) g({\bf x} - {\bf y}) \;\hbox{d}^{n}{\bf y} = {\textstyle\int\limits_{{\bb R}^{n}}} f({\bf x} - {\bf y}) g ({\bf y}) \;\hbox{d}^{n}{\bf y}]$ whenever the integral exists. This is the case when f and g are both in $[L^{1} ({\bb R}^{n})]$ ; then $[f * g]$ is also in $[L^{1} ({\bb R}^{n})]$ . Let S, T and W denote the distributions associated to f, g and $[f * g]$ , respectively: a change of variable immediately shows that for any $[\varphi \in {\scr D}({\bb R}^{n})]$ , $[\langle W, \varphi \rangle = {\textstyle\int\limits_{{\bb R}^{n} \times {\bb R}^{n}}} f({\bf x}) g({\bf y}) \varphi ({\bf x} + {\bf y}) \;\hbox{d}^{n}{\bf x} \;\hbox{d}^{n}{\bf y}.]$ Introducing the map σ from $[{\bb R}^{n} \times {\bb R}^{n}]$ to $[{\bb R}^{n}]$ defined by $[\sigma ({\bf x}, {\bf y}) = {\bf x} + {\bf y}]$ , the latter expression may be written: $[\langle S_{\bf x} \otimes T_{\bf y}, \varphi \circ \sigma \rangle]$ (where $[\circ]$ denotes the composition of mappings) or by a slight abuse of notation: $[\langle W, \varphi \rangle = \langle S_{\bf x} \otimes T_{\bf y}, \varphi ({\bf x} + {\bf y}) \rangle.]$

A difficulty arises in extending this definition to general distributions S and T because the mapping σ is not proper: if K is compact in $[{\bb R}^{n}]$ , then $[\sigma^{-1} (K)]$ is a cylinder with base K and generator the `second bisector' $[{\bf x} + {\bf y} = {\bf 0}]$ in $[{\bb R}^{n} \times {\bb R}^{n}]$ . However, $[\langle S \otimes T, \varphi \circ \sigma \rangle]$ is defined whenever the intersection between Supp $[(S \otimes T) = (\hbox{Supp } S) \times (\hbox{Supp } T)]$ and $[\sigma^{-1} (\hbox{Supp } \varphi)]$ is compact.

We may therefore define the convolution $[S * T]$ of two distributions S and T on $[{\bb R}^{n}]$ , with respective supports A and B, by $[\langle S * T, \varphi \rangle = \langle S \otimes T, \varphi \circ \sigma \rangle = \langle S_{\bf x} \otimes T_{\bf y}, \varphi ({\bf x} + {\bf y})\rangle]$ whenever the following support condition is fulfilled:

`the set $[\{({\bf x},{\bf y})|{\bf x} \in A, {\bf y} \in B, {\bf x} + {\bf y} \in K\}]$ is compact in $[{\bb R}^{n} \times {\bb R}^{n}]$ for all K compact in $[{\bb R}^{n}]$ '.

The latter condition is met, in particular, if S or T has compact support. The support of $[S * T]$ is easily seen to be contained in the closure of the vector sum $[A + B = \{{\bf x} + {\bf y}|{\bf x} \in A, {\bf y} \in B\}.]$

Convolution by a fixed distribution S is a continuous operation for the topology on $[{\scr D}\,']$ : it maps convergent sequences $[(T_{j})]$ to convergent sequences $[(S * T_{j})]$ . Convolution is commutative: $[S * T = T * S]$ .
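For finite sums of point masses the definition can be evaluated exactly, since $[\langle \delta_{(a)} \otimes \delta_{(b)}, \varphi (x + y) \rangle = \varphi (a + b)]$ , i.e. $[\delta_{(a)} * \delta_{(b)} = \delta_{(a + b)}]$ . The sketch below uses this to illustrate commutativity and the inclusion of the support in the vector sum of the supports; the dictionary representation `{position: weight}` and the function names are illustrative assumptions.

```python
from collections import defaultdict

def convolve(S, T):
    """Convolve two finite sums of point masses on the line.

    S and T are dicts {position: weight} standing for sum_k weight_k * delta_(position_k).
    Uses delta_(a) * delta_(b) = delta_(a + b) and bilinearity.
    """
    out = defaultdict(float)
    for a, s in S.items():
        for b, t in T.items():
            out[a + b] += s * t
    # Drop positions whose weights cancelled exactly.
    return {k: v for k, v in out.items() if v != 0.0}

def support(S):
    """Support of a finite sum of point masses with nonzero weights."""
    return set(S)

def vector_sum(A, B):
    """The set A + B = {a + b : a in A, b in B}."""
    return {a + b for a in A for b in B}

S = {0: 1.0, 2: -1.0}   # delta - delta_(2)
T = {1: 2.0, 5: 0.5}    # 2 delta_(1) + 0.5 delta_(5)

ST = convolve(S, T)
TS = convolve(T, S)
```

In this example the support of the convolution equals the vector sum; in general cancellations between terms can make it strictly smaller, which is why the text states an inclusion.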

The convolution of p distributions $[T_{1}, \ldots, T_{p}]$ with supports $[A_{1}, \ldots, A_{p}]$ can be defined by $[\langle T_{1} * \ldots * T_{p}, \varphi \rangle = \langle (T_{1})_{{\bf x}_{1}} \otimes \ldots \otimes (T_{p})_{{\bf x}_{p}}, \varphi ({\bf x}_{1} + \ldots + {\bf x}_{p})\rangle]$ whenever the following generalized support condition:

`the set $[\{({\bf x}_{1}, \ldots, {\bf x}_{p})|{\bf x}_{1} \in A_{1}, \ldots, {\bf x}_{p} \in A_{p}, {\bf x}_{1} + \ldots + {\bf x}_{p} \in K\}]$ is compact in $[({\bb R}^{n})^{p}]$ for all K compact in $[{\bb R}^{n}]$ '

is satisfied. Convolution is then associative. Interesting examples of associativity failure, which can be traced back to violations of the support condition, may be found in Bracewell (1986, pp. 436–437).
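A standard example of such a failure (given here as an illustration, and not necessarily one of those discussed by Bracewell) involves, in dimension 1, the constant function 1, the derivative δ′ and the Heaviside step function H. Each pairwise convolution below is defined because one factor has compact support, yet the two groupings disagree:

```latex
\begin{aligned}
(1 * \delta') * H &= (1)' * H = 0 * H = 0, \\
1 * (\delta' * H) &= 1 * H' = 1 * \delta = 1 .
\end{aligned}
```

The generalized support condition is violated here: with supports $[{\bb R}]$ , $[\{0\}]$ and $[[0, +\infty)]$ , the set of triples $[(x_{1}, 0, x_{3})]$ with $[x_{3} \geq 0]$ and $[x_{1} + x_{3} \in K]$ is unbounded, hence not compact.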

It follows from previous definitions that, for all distributions $[T \in {\scr D}\,']$ , the following identities hold:

(i) $[\delta * T = T]$ : $[\delta]$ is the unit element for convolution;
(ii) $[\delta_{({\bf a})} * T = \tau_{\bf a} T]$ : translation is a convolution with the corresponding translate of δ;
(iii) $[(D^{{\bf p}} \delta) * T = D^{{\bf p}} T]$ : differentiation is a convolution with the corresponding derivative of δ;
(iv) translates or derivatives of a convolution may be obtained by translating or differentiating any one of the factors: convolution `commutes' with translation and differentiation, a property used in Section 1.3.4.4.7.7 to speed up least-squares model refinement for macromolecules.
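Identities (i) to (iii) can be checked numerically by representing a distribution as a Python callable that maps a test function to a number, and by implementing the definition $[\langle S * T, \varphi \rangle = \langle S_{\bf x}, \langle T_{\bf y}, \varphi ({\bf x} + {\bf y}) \rangle\rangle]$ directly. This is a minimal sketch in dimension 1 under stated assumptions: derivatives are replaced by central differences, integrals by the trapezoidal rule, and every name below is hypothetical.

```python
import math

def bump(x):
    """A test function: infinitely differentiable, supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def delta_at(a):
    """delta_(a): phi -> phi(a)."""
    return lambda phi: phi(a)

def ddelta(h=1e-4):
    """delta': phi -> -phi'(0), with phi'(0) from a central difference."""
    return lambda phi: -(phi(h) - phi(-h)) / (2 * h)

def regular(g, a, b, n=2001):
    """Distribution associated with g restricted to [a, b] (trapezoidal rule)."""
    step = (b - a) / (n - 1)
    def T(phi):
        total = 0.0
        for i in range(n):
            y = a + i * step
            w = 0.5 if i in (0, n - 1) else 1.0
            total += w * g(y) * phi(y)
        return total * step
    return T

def convolve(S, T):
    """<S * T, phi> = <S_x, <T_y, phi(x + y)>>."""
    return lambda phi: S(lambda x: T(lambda y: phi(x + y)))

def translate(T, a):
    """<tau_a T, phi> = <T, tau_{-a} phi>, where (tau_{-a} phi)(x) = phi(x + a)."""
    return lambda phi: T(lambda x: phi(x + a))

T = regular(math.sin, -3.0, 3.0)      # T associated with sin on [-3, 3]
dT = regular(math.cos, -3.0, 3.0)     # its derivative away from the end points
phi = lambda x: bump(x - 0.3)         # test function supported in [-0.7, 1.3]

check_i = (convolve(delta_at(0.0), T)(phi), T(phi))
check_ii = (convolve(delta_at(0.7), T)(phi), translate(T, 0.7)(phi))
check_iii = (convolve(ddelta(), T)(phi), dT(phi))
```

Each pair should agree up to discretization error. The truncation of sin to [-3, 3] is harmless here because the test function and its translates used above vanish near the end points.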

The latter property is frequently used for the purpose of regularization: if T is a distribution, α an infinitely differentiable function, and at least one of the two has compact support, then $[T * \alpha]$ is an infinitely differentiable ordinary function. Since sequences $[(\alpha_{\nu})]$ of such functions α can be constructed which have compact support and converge to δ, it follows that any distribution T can be obtained as the limit of infinitely differentiable functions $[T * \alpha_{\nu}]$ . In topological jargon: $[{\scr D}({\bb R}^{n})]$ is `everywhere dense' in $[{\scr D}\,'({\bb R}^{n})]$ . A standard function in $[{\scr D}]$ which is often used for such proofs is defined as follows: put $[\eqalign{\theta (x) &= {1 \over A} \exp \left(- {1 \over 1-x^{2}}\right){\hbox to 10.5pt{}} \hbox{for } |x| \lt 1, \cr &= 0 \phantom{\exp \left(- {1 \over x^{2} - 1}\right)a}\quad \hbox{for } |x| \geq 1,}]$ with $[A = \int\limits_{-1}^{+1} \exp \left(- {1 \over 1-x^{2}}\right) \;\hbox{d}x]$ (so that θ is in $[{\scr D}]$ and is normalized), and put $[\eqalign{\theta_{\varepsilon} (x) &= {1 \over \varepsilon} \theta \left({x \over \varepsilon}\right){\hbox to 13.5pt{}}\hbox{ in dimension } 1,\cr \theta_{\varepsilon} ({\bf x}) &= \prod\limits_{j=1}^{n} \theta_{\varepsilon} (x_{j})\quad \hbox{in dimension } n.}]$
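The construction of θ and $[\theta_{\varepsilon}]$ can be sketched numerically in dimension 1 as follows. The normalizing constant A is obtained by the trapezoidal rule, and the regularization of a step function is shown as an example; the helper names and the choice of the step function are illustrative assumptions.

```python
import math

def theta_unnormalized(x):
    """exp(-1 / (1 - x^2)) inside (-1, 1), zero outside."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def trapezoid(func, a, b, n):
    """Trapezoidal rule with n equally spaced nodes on [a, b]."""
    h = (b - a) / (n - 1)
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n - 1):
        total += func(a + i * h)
    return total * h

# Normalizing constant A, so that theta integrates to 1.
A = trapezoid(theta_unnormalized, -1.0, 1.0, 20001)

def theta(x):
    return theta_unnormalized(x) / A

def theta_eps(x, eps):
    """theta_eps(x) = theta(x / eps) / eps, supported in [-eps, eps]."""
    return theta(x / eps) / eps

def regularize(f, x, eps, n=2001):
    """(f * theta_eps)(x) as the integral of f(x - y) theta_eps(y) over [-eps, eps]."""
    return trapezoid(lambda y: f(x - y) * theta_eps(y, eps), -eps, eps, n)

def heaviside(x):
    """Step function used as an example of a non-smooth f."""
    return 1.0 if x >= 0 else 0.0

mass = trapezoid(lambda x: theta_eps(x, 0.1), -0.1, 0.1, 20001)
smoothed = [regularize(heaviside, x, 0.1) for x in (-0.5, 0.0, 0.5)]
```

Away from the jump the regularized function coincides with the original one, while within a distance ε of the jump it interpolates smoothly between 0 and 1; as ε tends to 0, the regularized functions converge to the step function in the sense of distributions.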

Another related result, also proved by convolution, is the structure theorem: the restriction of a distribution $[T \in {\scr D}\,'({\bb R}^{n})]$ to a bounded open set Ω in $[{\bb R}^{n}]$ is a derivative of finite order of a continuous function.

Properties (i) to (iv) are the basis of the symbolic or operational calculus (see Carslaw & Jaeger, 1948; Van der Pol & Bremmer, 1955; Churchill, 1958; Erdélyi, 1962; Moore, 1971) for solving integro-differential equations with constant coefficients by turning them into convolution equations, then using factorization methods for convolution algebras (Schwartz, 1965).

References

Bochner, S. (1932). Vorlesungen über Fouriersche Integrale. Leipzig: Akademische Verlagsgesellschaft.

Bochner, S. (1959). Lectures on Fourier integrals. Translated from Bochner (1932) by M. Tenenbaum & H. Pollard. Princeton University Press.

Bracewell, R. N. (1986). The Fourier transform and its applications, 2nd ed., revised. New York: McGraw-Hill.

Bremermann, H. (1965). Distributions, complex variables, and Fourier transforms. Reading: Addison-Wesley.

Carslaw, H. S. & Jaeger, J. C. (1948). Operational methods in applied mathematics. Oxford University Press.

Challifour, J. L. (1972). Generalized functions and Fourier analysis. Reading: Benjamin.

Churchill, R. V. (1958). Operational mathematics, 2nd ed. New York: McGraw-Hill.

Dirac, P. A. M. (1958). The principles of quantum mechanics, 4th ed. Oxford: Clarendon Press.

Erdélyi, A. (1962). Operational calculus and generalized functions. New York: Holt, Rinehart & Winston.

Friedlander, F. G. (1982). Introduction to the theory of distributions. Cambridge University Press.

Gel'fand, I. M. & Shilov, G. E. (1964). Generalized functions, Vol. I. New York and London: Academic Press.

Hadamard, J. (1932). Le problème de Cauchy et les équations aux dérivées partielles linéaires hyperboliques. Paris: Hermann.

Hadamard, J. (1952). Lectures on Cauchy's problem in linear partial differential equations. New York: Dover Publications.

Hörmander, L. (1963). Linear partial differential operators. Berlin: Springer-Verlag.

Lighthill, M. J. (1958). Introduction to Fourier analysis and generalized functions. Cambridge University Press.

Moore, D. H. (1971). Heaviside operational calculus. An elementary foundation. New York: American Elsevier.

Riesz, M. (1938). L'intégrale de Riemann–Liouville et le problème de Cauchy pour l'équation des ondes. Bull. Soc. Math. Fr. 66, 153–170.

Riesz, M. (1949). L'intégrale de Riemann–Liouville et le problème de Cauchy. Acta Math. 81, 1–223.

Schwartz, L. (1965). Mathematics for the physical sciences. Paris: Hermann, and Reading: Addison-Wesley.

Schwartz, L. (1966). Théorie des distributions. Paris: Hermann.

Trèves, F. (1967). Topological vector spaces, distributions, and kernels. New York and London: Academic Press.

Van der Pol, B. & Bremmer, H. (1955). Operational calculus, 2nd ed. Cambridge University Press.

Whittaker, E. T. (1928). Oliver Heaviside. Bull. Calcutta Math. Soc. 20, 199–220. [Reprinted in Moore (1971).]

Yosida, K. (1965). Functional analysis. Berlin: Springer-Verlag.
