International Tables for Crystallography, Volume G: Definition and exchange of crystallographic data
Edited by S. R. Hall and B. McMahon

International Tables for Crystallography (2006). Vol. G, ch. 5.6, p. 552

P. J. Ellis (a) and H. J. Bernstein (b)*

(a) Stanford Linear Accelerator Center, 2575 Sand Hill Road, Menlo Park, CA 94025, USA, and (b) Department of Mathematics and Computer Science, Kramer Science Center, Dowling College, Idle Hour Blvd, Oakdale, NY 11769, USA
Correspondence e-mail: yaya@bernstein-plus-sons.com

5.6.3.1. Canonical-code compression


The canonical-code compression scheme encodes errors in two ways: directly or indirectly. Errors are coded directly using a symbol corresponding to the error value. Errors are coded indirectly using a symbol for the number of bits in the (signed) error, followed by the error itself.
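As an illustration of this distinction, the following C fragment is a minimal sketch, not CBFlib source code: using the direct range −2^(n−1) to 2^(n−1) − 1 for a parameter n (chosen at compression time, as described in the next paragraph), it decides whether a signed error can be represented by a direct symbol and, if not, computes the number of bits in its signed representation, which is the quantity that selects the indirect symbol written before the error itself.

    #include <stdio.h>

    /* Minimal sketch, not CBFlib source: classify a signed error as direct or
       indirect.  An error in the direct range -2^(n-1) to 2^(n-1) - 1 has its
       own symbol; any other error is represented by a symbol giving the number
       of bits in its signed value, followed by the error bits themselves. */

    /* Bits needed to hold 'error' as a signed value (sign bit included). */
    static int signed_bit_count(long error)
    {
        long v = (error < 0) ? ~error : error;   /* fold negatives onto non-negatives */
        int bits = 1;                            /* sign bit                          */
        while (v) {
            bits++;
            v >>= 1;
        }
        return bits;
    }

    /* Direct coding applies when the error lies in [-2^(n-1), 2^(n-1) - 1]. */
    static int is_direct(long error, int n)
    {
        return error >= -(1L << (n - 1)) && error <= (1L << (n - 1)) - 1;
    }

    int main(void)
    {
        const int n = 4;                         /* 2^4 = 16 direct codes */
        long examples[] = { 0, -3, 7, -9, 1000 };
        size_t i;

        for (i = 0; i < sizeof examples / sizeof examples[0]; i++) {
            if (is_direct(examples[i], n))
                printf("error %5ld: direct symbol\n", examples[i]);
            else
                printf("error %5ld: indirect symbol for %d bits, then the error\n",
                       examples[i], signed_bit_count(examples[i]));
        }
        return 0;
    }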

At the start of the compression, CBFlib constructs a table containing a set of symbols, one for each of the 2^n direct codes from −2^(n−1) to 2^(n−1) − 1, one for a stop code and one for each of the maxbits − n indirect codes, where n is chosen at compression time and maxbits is the maximum number of bits in an error. CBFlib then assigns to each symbol a bit code, using a shorter bit code for the more common symbols and a longer bit code for the less common symbols. The bit-code lengths are calculated using a Huffman-type algorithm and the actual bit codes are constructed using the canonical-code algorithm described by Moffat et al. (1997).
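The canonical-code step can be sketched independently of CBFlib. The fragment below uses a hypothetical symbol numbering and is not the library's internal routine; it assigns bit codes from a set of code lengths in the canonical manner described by Moffat et al. (1997), in which symbols of equal length receive consecutive integer codes. Only the lengths need to be stored, because the decoder can repeat the same assignment.

    #include <stdio.h>

    /* Minimal sketch, not CBFlib source: rebuild canonical codes from code
       lengths.  The compressed header stores only the length of each symbol's
       code; the compressor and decompressor derive identical bit codes from
       those lengths with the recurrence below. */

    #define NSYMBOLS 6
    #define MAXLEN   8

    static void print_bits(unsigned code, int len)
    {
        int i;
        for (i = len - 1; i >= 0; i--)
            putchar(((code >> i) & 1) ? '1' : '0');
    }

    int main(void)
    {
        /* Hypothetical Huffman-style code lengths for six symbols. */
        int length[NSYMBOLS] = { 2, 3, 3, 2, 4, 4 };
        unsigned code[NSYMBOLS];

        int count[MAXLEN + 1] = { 0 };      /* number of codes of each length  */
        unsigned next[MAXLEN + 1] = { 0 };  /* next code to assign per length  */
        unsigned c = 0;
        int s, len;

        for (s = 0; s < NSYMBOLS; s++)
            count[length[s]]++;

        /* First code of each length: (first code of the previous length plus
           the number of codes of that length), shifted left by one bit. */
        for (len = 1; len <= MAXLEN; len++) {
            c = (c + count[len - 1]) << 1;
            next[len] = c;
        }

        /* Symbols of equal length receive consecutive codes in symbol order. */
        for (s = 0; s < NSYMBOLS; s++)
            code[s] = next[length[s]]++;

        for (s = 0; s < NSYMBOLS; s++) {
            printf("symbol %d: length %d, code ", s, length[s]);
            print_bits(code[s], length[s]);
            putchar('\n');
        }
        return 0;
    }

With the example lengths {2, 3, 3, 2, 4, 4} this prints the prefix-free codes 00, 100, 101, 01, 1100 and 1101.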

The structure of the compressed data is described in Table 5.6.3.1.

Table 5.6.3.1. Structure of compressed data using the canonical-code scheme

Byte                                        Value
1 to 8                                      Number of elements (64-bit little-endian number)
9 to 16                                     Minimum element
17 to 24                                    Maximum element
25 to 32                                    (Reserved for future use)
33                                          Number of bits directly coded, n
34                                          Maximum number of bits encoded, maxbits
35 to 35 + 2^n − 1                          Number of bits in each direct code
35 + 2^n                                    Number of bits in the stop code
35 + 2^n + 1 to 35 + 2^n + maxbits − n      Number of bits in each indirect code
35 + 2^n + maxbits − n + 1 onwards          Coded data
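As a usage sketch of this layout, the following C fragment locates each part of a compressed block. It is illustrative only, not the CBFlib reader: the structure and field names are hypothetical, each code-length entry is taken to occupy a single byte (as the byte ranges in the table imply), and the minimum and maximum elements are read as 64-bit little-endian fields like the element count.

    #include <stdint.h>
    #include <stddef.h>

    /* Minimal sketch, not the CBFlib reader: locate the fields of a compressed
       block laid out as in Table 5.6.3.1.  Offsets are 0-based, so "byte 1" in
       the table is buf[0]. */

    struct canonical_header {
        uint64_t nelements;               /* bytes 1-8: element count (little-endian)  */
        uint64_t minimum;                 /* bytes 9-16: minimum element                */
        uint64_t maximum;                 /* bytes 17-24: maximum element               */
        int      n;                       /* byte 33: number of bits directly coded     */
        int      maxbits;                 /* byte 34: maximum number of bits encoded    */
        const uint8_t *direct_lengths;    /* 2^n direct-code lengths                    */
        uint8_t        stop_length;       /* stop-code length                           */
        const uint8_t *indirect_lengths;  /* maxbits - n indirect-code lengths          */
        const uint8_t *coded_data;        /* start of the coded bit stream              */
    };

    static uint64_t read_le64(const uint8_t *p)
    {
        uint64_t v = 0;
        int i;
        for (i = 7; i >= 0; i--)
            v = (v << 8) | p[i];
        return v;
    }

    static int parse_header(const uint8_t *buf, size_t len,
                            struct canonical_header *h)
    {
        size_t ndirect, offset;

        if (len < 34)
            return -1;                            /* fixed part missing           */

        h->nelements = read_le64(buf);
        h->minimum   = read_le64(buf + 8);        /* read like the element count  */
        h->maximum   = read_le64(buf + 16);
        /* bytes 25-32 are reserved for future use */
        h->n         = buf[32];
        h->maxbits   = buf[33];

        ndirect = (size_t) 1 << h->n;             /* 2^n direct codes             */
        offset  = 34;                             /* byte 35: first direct code   */

        if (h->maxbits < h->n ||
            len < offset + ndirect + 1 + (size_t) (h->maxbits - h->n))
            return -1;                            /* code-length tables missing   */

        h->direct_lengths   = buf + offset;
        h->stop_length      = buf[offset + ndirect];
        h->indirect_lengths = buf + offset + ndirect + 1;
        h->coded_data       = h->indirect_lengths + (h->maxbits - h->n);
        return 0;
    }

A decoder would then rebuild the canonical bit codes from the stored lengths, as in the earlier sketch, before reading the coded data.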

References

Moffat, A., Bell, T. C. & Witten, I. H. (1997). Lossless compression for text and images. Int. J. High Speed Electron. Syst. 8, 179–231.
