International Tables for Crystallography
Volume C: Mathematical, physical and chemical tables
Edited by E. Prince

International Tables for Crystallography (2006). Vol. C. ch. 8.2, p. 691

Section 8.2.3.1. Introduction

E. Prince (a) and D. M. Collins (b)

(a) NIST Center for Neutron Research, National Institute of Standards and Technology, Gaithersburg, MD 20899, USA, and (b) Laboratory for the Structure of Matter, Code 6030, Naval Research Laboratory, Washington, DC 20375-5341, USA

Entropy maximization, like least squares, is of interest primarily as a framework within which to find or adjust parameters of a model. Rationalization of the name 'entropy maximization' by analogy to thermodynamics is controversial, but there is formal proof (Shore & Johnson, 1980; Johnson & Shore, 1983) supporting entropy maximization as the unique method of inference that satisfies basic consistency requirements (Livesey & Skilling, 1985). The proof consists of discovering the consequences of four consistency axioms, which may be stated informally as follows:

  • (1) the result of the inference should be unique;

  • (2) the result of the inference should be invariant to any transformations of coordinate system;

  • (3) it should not matter whether independent information is accounted for independently or jointly;

  • (4) it should not matter whether independent subsystems are treated separately in conditional problems or collected and treated jointly.

The term 'entropy' is used in this chapter as a name only, the name for variation functions that include the form [\varphi\ln\varphi], where [\varphi] may represent probability or, more generally, a positive proportion. Any positive measure, either observed or derived, of the relative apportionment of a characteristic quantity among observations can serve as the proportion.

The method of entropy maximization may be formulated as follows: given a set of n observations, [y_i], that are measurements of quantities that can be described by model functions, [M_i({\bf x})], where [{\bf x}] is a vector of parameters, find the prior, positive proportions, [\mu_i=f(y_i)], and the values of the parameters for which the positive proportions [\varphi_i=f[M_i({\bf x})]] make the sum

[S=-\sum_{i=1}^n\varphi_i^{\prime}\ln\left(\varphi_i^{\prime}/\mu_i^{\prime}\right), \eqno (8.2.3.1)]

where [\varphi_i^{\prime}=\varphi_i\big/\sum_j\varphi_j] and [\mu_i^{\prime}=\mu_i\big/\sum_j\mu_j], a maximum. S is called the Shannon–Jaynes entropy. For some applications (Collins, 1982), it is desirable to include in the variation function additional terms or restraints that give S the form

[S=-\sum_{i=1}^n\varphi_i^{\prime}\ln\left(\varphi_i^{\prime}/\mu_i^{\prime}\right)+\lambda_1\xi_1({\bf x},{\bf y})+\lambda_2\xi_2({\bf x},{\bf y})+\ldots, \eqno (8.2.3.2)]

where the [\lambda_i] are undetermined multipliers, but we shall discuss here only applications where [\lambda_i=0] for all i, and an unrestrained entropy is maximized. A necessary condition for S to be a maximum is for the gradient to vanish. Using

[{\partial S\over\partial x_j}=\sum_{i=1}^n\left({\partial S\over\partial\varphi_i}\right)\left({\partial\varphi_i\over\partial x_j}\right) \eqno (8.2.3.3)]

and

[{\partial S\over\partial\varphi_i}=\sum_{k=1}^n\left({\partial S\over\partial\varphi_k^{\prime}}\right)\left({\partial\varphi_k^{\prime}\over\partial\varphi_i}\right), \eqno (8.2.3.4)]

straightforward algebraic manipulation gives equations of the form

[\sum_{i=1}^n\left\{{\partial\varphi_i\over\partial x_j}-\varphi_i^{\prime}\left(\sum_{k=1}^n{\partial\varphi_k\over\partial x_j}\right)\right\}\ln\left({\varphi_i^{\prime}\over\mu_i^{\prime}}\right)=0. \eqno (8.2.3.5)]

It should be noted that, although the entropy function should, in principle, have a unique stationary point corresponding to the global maximum, there are occasional circumstances, particularly with restrained problems where the undetermined multipliers are not all zero, where it may be necessary to verify that a stationary solution actually maximizes entropy.
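As an illustration only, the following Python sketch evaluates the entropy of equation (8.2.3.1) and the left-hand side of equation (8.2.3.5) for a small made-up problem, and compares the result with a finite-difference gradient. Everything specific in it is an assumption introduced for the example and does not come from the text: the function names, the choice of f as the identity, the linear model M_i(x) = x_0 + x_1 t_i, and the data values. Carrying out the algebra of (8.2.3.3) and (8.2.3.4) for the unrestrained case gives the gradient of S with respect to x_j as the left-hand side of (8.2.3.5) multiplied by -1/Σφ_k, which is the relation the check relies on.

```python
import numpy as np

def normalize(p):
    """Scale a vector of positive measures so that it sums to one."""
    p = np.asarray(p, dtype=float)
    return p / p.sum()

def shannon_jaynes_entropy(phi, mu):
    """Entropy S of eq. (8.2.3.1) for unnormalized positive phi and mu."""
    phi_n, mu_n = normalize(phi), normalize(mu)
    return -np.sum(phi_n * np.log(phi_n / mu_n))

def stationarity_lhs(phi, dphi_dx, mu):
    """Left-hand side of eq. (8.2.3.5), one entry per parameter x_j.

    dphi_dx[i, j] holds the derivative of phi_i with respect to x_j.
    """
    phi_n, mu_n = normalize(phi), normalize(mu)
    log_ratio = np.log(phi_n / mu_n)
    dphi_dx = np.asarray(dphi_dx, dtype=float)
    column_sums = dphi_dx.sum(axis=0)            # sum over k of d(phi_k)/d(x_j)
    braces = dphi_dx - np.outer(phi_n, column_sums)
    return braces.T @ log_ratio

# Hypothetical toy problem (assumed for illustration): f is the identity,
# so mu_i = y_i and phi_i = M_i(x) = x_0 + x_1 * t_i.
t = np.arange(5.0)
y = np.array([2.1, 2.4, 3.2, 3.4, 4.1])

def model(x):
    return x[0] + x[1] * t

def model_jacobian(x):
    # Derivatives of the linear model with respect to x_0 and x_1.
    return np.column_stack([np.ones_like(t), t])

x = np.array([1.5, 0.8])
phi = model(x)

# Analytic gradient of S obtained from the left-hand side of (8.2.3.5).
analytic_grad = -stationarity_lhs(phi, model_jacobian(x), y) / phi.sum()

# Central finite differences on S as an independent check.
h = 1e-6
numeric_grad = np.array([
    (shannon_jaynes_entropy(model(x + h * e), y)
     - shannon_jaynes_entropy(model(x - h * e), y)) / (2 * h)
    for e in np.eye(2)
])

print("analytic gradient:", analytic_grad)
print("numeric gradient: ", numeric_grad)
```

In this unrestrained form S cannot exceed zero, and it equals zero when the normalized proportions φ′ coincide with the priors μ′; the left-hand side of (8.2.3.5) then vanishes because every logarithm is zero. A sketch like this verifies only the stationarity condition at a given x. It does not search for the maximizing parameters.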

References

Collins, D. M. (1982). Electron density images from imperfect data by iterative entropy maximization. Nature (London), 298, 49–51.
Johnson, R. W. & Shore, J. E. (1983). Comments on and correction to 'Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy'. IEEE Trans. Inf. Theory, IT-29, 942–943.
Livesey, A. K. & Skilling, J. (1985). Maximum entropy theory. Acta Cryst. A41, 113–122.
Shore, J. E. & Johnson, R. W. (1980). Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy. IEEE Trans. Inf. Theory, IT-26, 26–37.







































