International Tables for Crystallography
Volume F: Crystallography of biological macromolecules
Edited by M. G. Rossmann and E. Arnold

International Tables for Crystallography (2006). Vol. F. ch. 16.2, p. 346

Section 16.2.2.1. Sources of random symbols and the notion of source entropy

G. Bricogne

Laboratory of Molecular Biology, Medical Research Council, Cambridge CB2 2QH, England
Correspondence e-mail: gb10@mrc-lmb.cam.ac.uk

Statistical communication theory uses as its basic modelling device a discrete source of random symbols which, at discrete times [t = 1, 2, \ldots], randomly emits a `symbol' taken from a finite alphabet [{\cal A} = \{ s_{i} \mid i = 1, \ldots, n\}]. Sequences of such randomly produced symbols are called `messages'.
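As an illustration, such a source can be sketched in a few lines of Python. This is only a minimal sketch: it assumes that successive symbols are independent and drawn with fixed probabilities, and the names `emit_message`, `alphabet` and `q`, as well as the numerical values, are hypothetical choices made for the example.

```python
import random

def emit_message(alphabet, probabilities, length, seed=None):
    """Emit a message of `length` symbols from a discrete source.

    Assumption: symbols are drawn independently, each with the fixed
    probability listed for it in `probabilities`.
    """
    rng = random.Random(seed)
    return rng.choices(alphabet, weights=probabilities, k=length)

# Hypothetical three-symbol alphabet with illustrative probabilities.
alphabet = ["s1", "s2", "s3"]
q = [0.5, 0.3, 0.2]

message = emit_message(alphabet, q, length=10, seed=0)
print(message)
```

Fixing `seed` makes the emitted message reproducible; leaving it as `None` gives a different message on each run.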

An important numerical quantity associated with such a discrete source is its entropy per symbol H, which measures the amount of uncertainty involved in the choice of a symbol. Suppose that successive symbols are independent and that symbol [s_{i}] has probability [q_{i}]. Then the general requirements that H should be a continuous function of the [q_{i}], should increase with increasing uncertainty, and should be additive for independent sources of uncertainty suffice to define H uniquely as [H(q_{1}, \ldots, q_{n}) = -k \sum_{i=1}^{n} q_{i}\log q_{i}, \eqno(16.2.2.1)] where k is an arbitrary positive constant [Shannon & Weaver (1949), Appendix 2] whose value depends on the unit of entropy chosen. In the following we use a unit such that [k = 1].
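Equation (16.2.2.1) can be evaluated directly. The following Python sketch is illustrative only: the function name `entropy` is hypothetical, natural logarithms are assumed (a different base only rescales k), and terms with zero probability are taken to contribute nothing, following the usual convention that q log q tends to 0 as q tends to 0. It also checks two of the properties quoted above on example distributions: the uniform distribution over n symbols gives log n, and the entropy of two independent sources is the sum of their entropies.

```python
import math

def entropy(q, k=1.0):
    """Entropy per symbol, H = -k * sum_i q_i log q_i (natural log assumed)."""
    if any(p < 0 for p in q) or not math.isclose(sum(q), 1.0, abs_tol=1e-9):
        raise ValueError("q must be non-negative and sum to 1")
    # Convention: terms with q_i = 0 contribute nothing to the sum.
    return -k * sum(p * math.log(p) for p in q if p > 0)

# Uniform distribution over n = 4 symbols: H = log 4.
h_uniform = entropy([0.25] * 4)

# A certain outcome carries no uncertainty: H = 0.
h_certain = entropy([1.0, 0.0])

# Additivity: two independent sources with distributions p and r.
p = [0.5, 0.3, 0.2]
r = [0.6, 0.4]
joint = [a * b for a in p for b in r]  # distribution of symbol pairs
h_joint = entropy(joint)
h_sum = entropy(p) + entropy(r)

print(h_uniform, h_certain, h_joint, h_sum)
```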

These definitions may be extended to the case where the alphabet [{\cal A}] is a continuous space endowed with a uniform measure μ: in this case the entropy per symbol is defined as [H(q) = -\int_{{\cal A}} q({\bf s}) \log q({\bf s})\, {\rm d}\mu({\bf s}), \eqno(16.2.2.2)] where q is the probability density of the distribution of symbols with respect to the measure μ.
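A numerical sketch of equation (16.2.2.2) follows. It makes several assumptions that are not in the text: the alphabet is taken to be an interval of the real line, μ is taken to be ordinary length (Lebesgue measure), the integral is approximated by a midpoint rule, and natural logarithms are used. The names `continuous_entropy` and `gaussian` are hypothetical. The Gaussian case is compared with its known closed form, (1/2) log(2πeσ²), and the uniform density on an interval of length 0.5 shows that, unlike the discrete case, the continuous entropy may be negative.

```python
import math

def continuous_entropy(density, a, b, n=20_000):
    """Approximate H(q) = -integral of q log q over [a, b] (midpoint rule).

    Assumption: mu is ordinary length on the interval [a, b].
    """
    h = (b - a) / n
    total = 0.0
    for j in range(n):
        s = a + (j + 0.5) * h          # midpoint of the j-th sub-interval
        p = density(s)
        if p > 0:                      # zero density contributes nothing
            total -= p * math.log(p) * h
    return total

def gaussian(sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return lambda s: norm * math.exp(-0.5 * (s / sigma) ** 2)

sigma = 1.5
h_gauss = continuous_entropy(gaussian(sigma), -12 * sigma, 12 * sigma)
h_gauss_exact = 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)

# Uniform density 1/L on [0, L] with L = 0.5: H = log L, which is negative.
h_uniform = continuous_entropy(lambda s: 2.0, 0.0, 0.5)

print(h_gauss, h_gauss_exact, h_uniform)
```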

References

Shannon, C. E. & Weaver, W. (1949). The mathematical theory of communication. Urbana: University of Illinois Press.