International Tables for Crystallography
Volume F: Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F, ch. 16.2, pp. 347–348.

Section 16.2.2.4. Jaynes' maximum-entropy formalism

G. Bricogne
Laboratory of Molecular Biology, Medical Research Council, Cambridge CB2 2QH, England
Correspondence e-mail: gb10@mrc-lmb.cam.ac.uk


Jaynes (1957) solved the problem of explicitly determining such maximum-entropy distributions in the case of general linear constraints, using an analytical apparatus first exploited by Gibbs in statistical mechanics.

The maximum-entropy distribution $q^{\rm ME}({\bf s})$, under the prior prejudice $m({\bf s})$, satisfying the linear constraint equations
\[ {\cal C}_{j}(q) \equiv \int_{\cal A} q({\bf s})\, C_{j}({\bf s})\, {\rm d}\mu({\bf s}) = c_{j} \quad (j = 1, 2, \ldots, M), \eqno(16.2.2.5) \]
where the ${\cal C}_{j}(q)$ are linear constraint functionals defined by given constraint functions $C_{j}({\bf s})$ and the $c_{j}$ are given constraint values, is obtained by maximizing with respect to $q$ the relative entropy defined by equation (16.2.2.4). An extra constraint is the normalization condition
\[ {\cal C}_{0}(q) \equiv \int_{\cal A} q({\bf s})\, 1\, {\rm d}\mu({\bf s}) = 1, \eqno(16.2.2.6) \]
to which it is convenient to give the label $j = 0$, so that it can be handled together with the others by putting $C_{0}({\bf s}) = 1$, $c_{0} = 1$.
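As a concrete instance of (16.2.2.5) (an illustration added here, not part of the original text), choosing power functions as constraint functions makes the ${\cal C}_{j}$ into moment constraints: with $C_{j}(s) = s^{\,j}$ on a one-dimensional domain,
\[ {\cal C}_{j}(q) = \int_{\cal A} s^{\,j}\, q(s)\, {\rm d}\mu(s) = c_{j} \quad (j = 1, 2, \ldots, M) \]
fixes the first $M$ moments of $q$.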

By a standard variational argument, this constrained maximization is equivalent to the unconstrained maximization of the functional
\[ {\cal S}_{m}(q) + \sum_{j=0}^{M} \lambda_{j}\, {\cal C}_{j}(q), \eqno(16.2.2.7) \]
where the $\lambda_{j}$ are Lagrange multipliers whose values may be determined from the constraints. This new variational problem is readily solved: if $q({\bf s})$ is varied to $q({\bf s}) + \delta q({\bf s})$, the resulting variations in the functionals ${\cal S}_{m}$ and ${\cal C}_{j}$ will be
\[ \eqalign{ \delta {\cal S}_{m} &= \int_{\cal A} \{-1 - \log[q({\bf s})/m({\bf s})]\}\, \delta q({\bf s})\, {\rm d}\mu({\bf s}) \quad\hbox{and} \cr \delta {\cal C}_{j} &= \int_{\cal A} C_{j}({\bf s})\, \delta q({\bf s})\, {\rm d}\mu({\bf s}), } \eqno(16.2.2.8) \]
respectively. If the variation of the functional (16.2.2.7) is to vanish for arbitrary variations $\delta q({\bf s})$, the integrand in the expression for that variation from (16.2.2.8) must vanish identically. Therefore the maximum-entropy density distribution $q^{\rm ME}({\bf s})$ satisfies the relation
\[ -1 - \log[q({\bf s})/m({\bf s})] + \sum_{j=0}^{M} \lambda_{j}\, C_{j}({\bf s}) = 0 \eqno(16.2.2.9) \]
and hence
\[ q^{\rm ME}({\bf s}) = m({\bf s}) \exp(\lambda_{0} - 1) \exp\left[\sum_{j=1}^{M} \lambda_{j}\, C_{j}({\bf s})\right]. \eqno(16.2.2.10) \]
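A quick worked illustration of (16.2.2.10) (added here; this example is not part of the original text): take ${\cal A} = [0, \infty)$ with uniform prior $m(s) = 1$, ${\rm d}\mu(s) = {\rm d}s$, and the single constraint $C_{1}(s) = s$ with value $c_{1} = \bar{s} > 0$. Equation (16.2.2.10) gives $q^{\rm ME}(s) = {\rm e}^{\lambda_{0} - 1}\, {\rm e}^{\lambda_{1} s}$; normalization requires $\lambda_{1} < 0$ and ${\rm e}^{\lambda_{0} - 1} = -\lambda_{1}$, and the mean constraint then reads
\[ \int_{0}^{\infty} s\, (-\lambda_{1})\, {\rm e}^{\lambda_{1} s}\, {\rm d}s = -{1 \over \lambda_{1}} = \bar{s}, \]
so that $\lambda_{1} = -1/\bar{s}$ and $q^{\rm ME}(s) = (1/\bar{s})\, {\rm e}^{-s/\bar{s}}$: the exponential density is the maximum-entropy distribution on $[0, \infty)$ with fixed mean $\bar{s}$.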

It is convenient now to separate the multiplier $\lambda_{0}$ associated with the normalization constraint by putting
\[ \lambda_{0} - 1 = -\log Z, \eqno(16.2.2.11) \]
where $Z$ is a function of the other multipliers $\lambda_{1}, \ldots, \lambda_{M}$. The final expression for $q^{\rm ME}({\bf s})$ is thus
\[ q^{\rm ME}({\bf s}) = {m({\bf s}) \over Z(\lambda_{1}, \ldots, \lambda_{M})} \exp\left[\sum_{j=1}^{M} \lambda_{j}\, C_{j}({\bf s})\right]. \eqno({\rm ME1}) \]
The values of $Z$ and of $\lambda_{1}, \ldots, \lambda_{M}$ may now be determined by solving the initial constraint equations. The normalization condition demands that
\[ Z(\lambda_{1}, \ldots, \lambda_{M}) = \int_{\cal A} m({\bf s}) \exp\left[\sum_{j=1}^{M} \lambda_{j}\, C_{j}({\bf s})\right] {\rm d}\mu({\bf s}). \eqno({\rm ME2}) \]
The generic constraint equations (16.2.2.5) determine $\lambda_{1}, \ldots, \lambda_{M}$ by the conditions that
\[ \int_{\cal A} {m({\bf s}) \over Z} \exp\left[\sum_{k=1}^{M} \lambda_{k}\, C_{k}({\bf s})\right] C_{j}({\bf s})\, {\rm d}\mu({\bf s}) = c_{j} \eqno(16.2.2.12) \]
for $j = 1, 2, \ldots, M$. But, by Leibniz's rule of differentiation under the integral sign, these equations may be written in the compact form
\[ {\partial(\log Z) \over \partial \lambda_{j}} = c_{j} \quad (j = 1, 2, \ldots, M). \eqno({\rm ME3}) \]
Equations (ME1), (ME2) and (ME3) constitute the maximum-entropy equations.
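For several constraints the maximum-entropy equations rarely have closed-form solutions, but (ME3) lends itself directly to numerical root finding. The following sketch (added here; the six-outcome dice setting and all variable names are illustrative assumptions, not from the original text) solves (ME1)-(ME3) for a discrete sample space with a uniform prior and a single mean-value constraint:

# A minimal numerical sketch of equations (ME1)-(ME3), assuming a discrete
# sample space A = {1,...,6}, a uniform prior m(s) and a single constraint
# C_1(s) = s with value c_1 = 4.5 (the classic maximum-entropy dice example).
import numpy as np
from scipy.optimize import brentq

s = np.arange(1, 7)            # sample space A
m = np.ones(6) / 6.0           # prior prejudice m(s)
C = s.astype(float)            # constraint function C_1(s) = s
c1 = 4.5                       # constraint value c_1

def q_of(lam):
    # (ME1)/(ME2): q_ME(s) = m(s) exp(lam * C(s)) / Z(lam);
    # dividing by w.sum() is exactly division by Z.
    w = m * np.exp(lam * C)
    return w / w.sum()

def me3_residual(lam):
    # (ME3): d(log Z)/d(lam) - c_1, since d(log Z)/d(lam) = <C_1> under q.
    return q_of(lam) @ C - c1

lam1 = brentq(me3_residual, -10.0, 10.0)   # solve (ME3) for lambda_1
q_me = q_of(lam1)

print('lambda_1 =', lam1)                  # approx. 0.371
print('q_ME     =', np.round(q_me, 4))
print('<C_1>    =', q_me @ C)              # reproduces c_1 = 4.5

With $M > 1$ constraints the same idea applies, with a multidimensional solver (e.g. scipy.optimize.root) driving the gradient of $\log Z$ to the vector ${\bf c}$.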

The maximal value attained by the entropy is readily found:
\[ \eqalign{ {\cal S}_{m}(q^{\rm ME}) &= -\int_{\cal A} q^{\rm ME}({\bf s}) \log[q^{\rm ME}({\bf s})/m({\bf s})]\, {\rm d}\mu({\bf s}) \cr &= -\int_{\cal A} q^{\rm ME}({\bf s}) \left[-\log Z + \sum_{j=1}^{M} \lambda_{j}\, C_{j}({\bf s})\right] {\rm d}\mu({\bf s}), } \]
i.e. using the constraint equations
\[ {\cal S}_{m}(q^{\rm ME}) = \log Z - \sum_{j=1}^{M} \lambda_{j}\, c_{j}. \eqno(16.2.2.13) \]
The latter expression may be rewritten, by means of equations (ME3), as
\[ {\cal S}_{m}(q^{\rm ME}) = \log Z - \sum_{j=1}^{M} \lambda_{j} {\partial(\log Z) \over \partial \lambda_{j}}, \eqno(16.2.2.14) \]
which shows that, in their dependence on the $\lambda$'s, the entropy and $\log Z$ are related by Legendre duality.
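Continuing the hypothetical dice sketch above (again an added illustration, not part of the original text), (16.2.2.13) and the Legendre relation can be verified numerically:

# Numerical check of (16.2.2.13) and (16.2.2.14) on the same toy problem.
import numpy as np
from scipy.optimize import brentq

s = np.arange(1, 7); m = np.ones(6) / 6.0
C = s.astype(float); c1 = 4.5

def q_of(lam):
    w = m * np.exp(lam * C)
    return w / w.sum()

lam1 = brentq(lambda l: q_of(l) @ C - c1, -10.0, 10.0)
logZ = np.log(np.sum(m * np.exp(lam1 * C)))        # (ME2)
q_me = q_of(lam1)

# Entropy evaluated directly from the definition (16.2.2.4) ...
S_direct = -np.sum(q_me * np.log(q_me / m))
# ... agrees with log Z - lambda_1 c_1, equation (16.2.2.13):
S_dual = logZ - lam1 * c1
print(S_direct, S_dual)                            # equal to machine precision

# Legendre duality (16.2.2.14): a central-difference estimate of
# d(log Z)/d(lambda_1) reproduces c_1, i.e. (ME3) holds at the solution.
eps = 1e-6
dlogZ = (np.log(np.sum(m * np.exp((lam1 + eps) * C)))
         - np.log(np.sum(m * np.exp((lam1 - eps) * C)))) / (2 * eps)
print(dlogZ)                                       # approx. 4.5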

Jaynes' theory relates this maximal value of the entropy to the prior probability ${\cal P}({\bf c})$ of the vector ${\bf c}$ of simultaneous constraint values, i.e. to the size of the sub-ensemble of messages of length $N$ that fulfil the constraints embodied in (16.2.2.5), relative to the size of the ensemble of messages of the same length when the source operates with the symbol probability distribution given by the prior prejudice $m$. Indeed, it is a straightforward consequence of Shannon's second theorem (Section 16.2.2) as expressed in equation (16.2.2.3) that
\[ {\cal P}^{\rm ME}({\bf c}) \propto \exp({\cal S}), \eqno(16.2.2.15) \]
where
\[ {\cal S} = \log Z^{N} - \lambda \cdot {\bf c} = N {\cal S}_{m}(q^{\rm ME}) \eqno(16.2.2.16) \]
is the total entropy for $N$ symbols.

References

Jaynes, E. T. (1957). Information theory and statistical mechanics. Phys. Rev. 106, 620–630.