The Winograd algorithms

Bricogne, G.

doi:10.1107/97809553602060000551

International Tables for Crystallography, Volume B: Reciprocal space. Edited by U. Shmueli


International Tables for Crystallography (2006). Vol. B. ch. 1.3, pp. 54-55

Section 1.3.3.2.4. The Winograd algorithms

G. Bricogne^a

^a MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, England, and LURE, Bâtiment 209D, Université Paris-Sud, 91405 Orsay, France

The cyclic convolutions generated by Rader's multiplicative reindexing may be evaluated more economically than through DFTs if they are re-examined within a new algebraic setting, namely the theory of congruence classes of polynomials [see, for instance, Blahut (1985), Chapter 2; Schroeder (1986), Chapter 24].

The set, denoted $\mathbb{K}[X]$, of polynomials in one variable with coefficients in a given field $\mathbb{K}$ has many of the formal properties of the set $\mathbb{Z}$ of rational integers: it is a ring with no zero divisors and has a Euclidean algorithm on which a theory of divisibility can be built.

Given a nonzero polynomial $P(z)$, then for every $H(z)$ there exist unique polynomials $Q(z)$ and $R(z)$ such that

$$H(z) = P(z) Q(z) + R(z)$$

and

$$\text{degree}(R) < \text{degree}(P).$$

$R(z)$ is called the residue of $H(z)$ modulo $P(z)$. Two polynomials $H_{1}(z)$ and $H_{2}(z)$ having the same residue modulo $P(z)$ are said to be congruent modulo $P(z)$, which is denoted by

$$H_{1}(z) \equiv H_{2}(z) \bmod P(z).$$
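The division with remainder described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: polynomials are represented as coefficient lists with the lowest degree first, the field is taken to be the rationals (via `fractions.Fraction`), and the function name `poly_divmod` is hypothetical.

```python
from fractions import Fraction

def poly_divmod(h, p):
    """Return (Q, R) with H = P*Q + R and degree(R) < degree(P).

    Coefficient lists are ordered lowest degree first; the leading
    coefficient of p (its last entry) is assumed to be nonzero.
    """
    r = [Fraction(c) for c in h]
    q = [Fraction(0)] * max(len(r) - len(p) + 1, 1)
    for k in range(len(r) - len(p), -1, -1):
        # Cancel the current leading term of the running remainder.
        q[k] = r[k + len(p) - 1] / p[-1]
        for j, c in enumerate(p):
            r[k + j] -= q[k] * c
    return q, r[:len(p) - 1]

# H(z) = z^3 + 2z + 5 divided by P(z) = z^2 + 1
q, r = poly_divmod([5, 2, 0, 1], [1, 0, 1])
print(q, r)  # Q(z) = z, R(z) = z + 5
```

Here the residue of z^3 + 2z + 5 modulo z^2 + 1 is z + 5, which can be checked by hand from (z^2 + 1) z + (z + 5).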

If $H(z) \equiv 0 \bmod P(z)$, $H(z)$ is said to be divisible by $P(z)$. If a non-constant $H(z)$ has no divisors in $\mathbb{K}[X]$ other than constants and constant multiples of itself, it is said to be irreducible over $\mathbb{K}$ (this notion depends on $\mathbb{K}$). Irreducible polynomials play in $\mathbb{K}[X]$ a role analogous to that of prime numbers in $\mathbb{Z}$, and any polynomial over $\mathbb{K}$ has an essentially unique factorization as a product of irreducible polynomials.

There exists a Chinese remainder theorem (CRT) for polynomials. Let $P(z) = P_{1}(z) \ldots P_{d}(z)$ be factored into a product of pairwise coprime polynomials [i.e. $P_{i}(z)$ and $P_{j}(z)$ have no common factor for $i \neq j$]. Then the system of congruence equations

$$H(z) \equiv H_{j}(z) \bmod P_{j}(z), \quad j = 1, \ldots, d,$$

has a unique solution $H(z)$ modulo $P(z)$. This solution may be constructed by a procedure similar to that used for integers. Let

$$Q_{j}(z) = P(z) / P_{j}(z) = \prod_{i \neq j} P_{i}(z).$$

Then $P_{j}$ and $Q_{j}$ are coprime, and the Euclidean algorithm may be used to obtain polynomials $p_{j}(z)$ and $q_{j}(z)$ such that

$$p_{j}(z) P_{j}(z) + q_{j}(z) Q_{j}(z) = 1.$$

With $S_{i}(z) = q_{i}(z) Q_{i}(z)$, the polynomial

$$H(z) = \sum_{i = 1}^{d} S_{i}(z) H_{i}(z) \bmod P(z)$$

is easily shown to be the desired solution.
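As a small worked illustration of this construction (not taken from the source), consider the simplest non-trivial case, P(z) = z^2 - 1 with the two coprime factors z - 1 and z + 1. The residues modulo factors of degree one are constants, written here as h_1 and h_2; these two symbols are introduced only for this example.

```latex
\begin{align*}
P(z) &= z^{2} - 1, & P_{1}(z) &= z - 1, & P_{2}(z) &= z + 1, \\
Q_{1}(z) &= z + 1, & Q_{2}(z) &= z - 1, \\
1 &= -\tfrac{1}{2} P_{1}(z) + \tfrac{1}{2} Q_{1}(z), &
1 &= \tfrac{1}{2} P_{2}(z) - \tfrac{1}{2} Q_{2}(z), \\
S_{1}(z) &= \tfrac{1}{2} (z + 1), & S_{2}(z) &= -\tfrac{1}{2} (z - 1), \\
H(z) &= \tfrac{1}{2} (z + 1)\, h_{1} - \tfrac{1}{2} (z - 1)\, h_{2}.
\end{align*}
```

Since reduction modulo z - 1 or z + 1 amounts to evaluation at z = 1 or z = -1, the result can be checked directly: H(1) = h_1 and H(-1) = h_2.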

As with integers, it can be shown that the 1:1 correspondence between $H(z)$ and the $H_{j}(z)$ sends sums to sums and products to products, i.e. establishes a ring isomorphism:

$$\mathbb{K}[X] \bmod P \cong (\mathbb{K}[X] \bmod P_{1}) \times \ldots \times (\mathbb{K}[X] \bmod P_{d}).$$

These results will now be applied to the efficient calculation of cyclic convolutions. Let $\mathbf{U} = (u_{0}, u_{1}, \ldots, u_{N - 1})$ and $\mathbf{V} = (v_{0}, v_{1}, \ldots, v_{N - 1})$ be two vectors of length N, and let $\mathbf{W} = (w_{0}, w_{1}, \ldots, w_{N - 1})$ be obtained by cyclic convolution of U and V:

$$w_{n} = \sum_{m = 0}^{N - 1} u_{m} v_{n - m}, \quad n = 0, \ldots, N - 1,$$

where the index $n - m$ is taken modulo N. The very simple but crucial result is that this cyclic convolution may be carried out by polynomial multiplication modulo $(z^{N} - 1)$: if

$$\begin{aligned}
U(z) &= \sum_{l = 0}^{N - 1} u_{l} z^{l}, \\
V(z) &= \sum_{m = 0}^{N - 1} v_{m} z^{m}, \\
W(z) &= \sum_{n = 0}^{N - 1} w_{n} z^{n},
\end{aligned}$$

then the above relation is equivalent to

$$W(z) \equiv U(z) V(z) \bmod (z^{N} - 1).$$

Now the polynomial $z^{N} - 1$ can be factored over the field of rational numbers into irreducible factors called cyclotomic polynomials: if d is the number of divisors of N, including 1 and N, then

$$z^{N} - 1 = \prod_{i = 1}^{d} P_{i}(z),$$

where the cyclotomics $P_{i}(z)$ are well known (Nussbaumer, 1981; Schroeder, 1986, Chapter 22). We may now invoke the CRT, and exploit the ring isomorphism it establishes to simplify the calculation of $W(z)$ from $U(z)$ and $V(z)$ as follows:

(i) compute the d residual polynomials $U_{i}(z) \equiv U(z) \bmod P_{i}(z)$ and $V_{i}(z) \equiv V(z) \bmod P_{i}(z)$, $i = 1, \ldots, d$;
(ii) compute the d polynomial products $W_{i}(z) \equiv U_{i}(z) V_{i}(z) \bmod P_{i}(z)$, $i = 1, \ldots, d$;
(iii) use the CRT reconstruction formula given above to recover $W(z)$ from the $W_{i}(z)$: $W(z) \equiv \sum_{i = 1}^{d} S_{i}(z) W_{i}(z) \bmod (z^{N} - 1)$.
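Steps (i) to (iii) can be sketched in Python as follows. This is a plain illustration of the CRT route in exact rational arithmetic, not Winograd's optimized procedure: it makes no attempt to minimize the number of multiplications. All function names are hypothetical, and polynomials are assumed to be coefficient lists with the lowest degree first.

```python
from fractions import Fraction

def trim(a):
    a = list(a)
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a

def add(a, b):
    n = max(len(a), len(b))
    return trim([x + y for x, y in zip(a + [0] * (n - len(a)), b + [0] * (n - len(b)))])

def mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def divmod_poly(h, p):
    """Return (Q, R) with H = P*Q + R and degree(R) < degree(P)."""
    r, p = [Fraction(c) for c in h], trim(p)
    q = [Fraction(0)] * max(len(r) - len(p) + 1, 1)
    for k in range(len(r) - len(p), -1, -1):
        q[k] = r[k + len(p) - 1] / p[-1]
        for j, c in enumerate(p):
            r[k + j] -= q[k] * c
    return trim(q), trim(r[:max(len(p) - 1, 1)])

def cyclotomic_factors(n):
    """Irreducible rational factors of z^n - 1, one per divisor of n."""
    phi = {}
    for d in range(1, n + 1):
        if n % d == 0:
            f = [Fraction(-1)] + [Fraction(0)] * (d - 1) + [Fraction(1)]  # z^d - 1
            for e in phi:
                if d % e == 0:
                    f = divmod_poly(f, phi[e])[0]
            phi[d] = f
    return list(phi.values())

def inverse_mod(a, m):
    """Return t with a*t = 1 mod m (extended Euclid; a and m assumed coprime)."""
    r0, r1 = trim(m), divmod_poly(a, m)[1]
    t0, t1 = [Fraction(0)], [Fraction(1)]
    while r1 != [0]:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        t0, t1 = t1, add(t0, mul([Fraction(-1)], mul(q, t1)))
    return [c / r0[0] for c in t0]

def cyclic_convolution_crt(u, v):
    n = len(u)
    modulus = [-1] + [0] * (n - 1) + [1]                 # z^N - 1
    w = [Fraction(0)]
    for p_i in cyclotomic_factors(n):
        u_i = divmod_poly(u, p_i)[1]                     # step (i)
        v_i = divmod_poly(v, p_i)[1]
        w_i = divmod_poly(mul(u_i, v_i), p_i)[1]         # step (ii)
        big_q = divmod_poly(modulus, p_i)[0]             # Q_i = (z^N - 1) / P_i
        s_i = mul(inverse_mod(big_q, p_i), big_q)        # S_i = q_i Q_i
        w = add(w, mul(s_i, w_i))                        # step (iii)
    w = divmod_poly(w, modulus)[1]
    return w + [Fraction(0)] * (n - len(w))

def cyclic_convolution_direct(u, v):
    n = len(u)
    return [sum(u[m] * v[(k - m) % n] for m in range(n)) for k in range(n)]

u, v = [1, 2, 3, 4], [5, 6, 7, 8]
print(cyclic_convolution_crt(u, v) == cyclic_convolution_direct(u, v))
```

For N = 4 the three factors are z - 1, z + 1 and z^2 + 1, so the single length-4 convolution is replaced by two scalar products and one product of linear polynomials modulo z^2 + 1.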

When N is not too large, i.e. for `short cyclic convolutions', the $P_{i}(z)$ are very simple, with coefficients 0 or ±1, so that (i) only involves a small number of additions. Furthermore, special techniques have been developed to multiply general polynomials modulo cyclotomic polynomials, thus helping keep the number of multiplications in (ii) and (iii) to a minimum. As a result, cyclic convolutions can be calculated rapidly when N is sufficiently composite.
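The simplicity of the cyclotomic factors for small N can be checked with a short sketch. The helper names `exact_div` and `cyclotomic` are hypothetical, and the code relies on the standard fact that each cyclotomic polynomial is obtained by dividing z^n - 1 by the cyclotomic polynomials of the proper divisors of n; coefficient lists are ordered lowest degree first.

```python
def exact_div(f, g):
    """Quotient f / g for integer coefficient lists, g monic and dividing f exactly."""
    r = list(f)
    q = [0] * (len(f) - len(g) + 1)
    for k in range(len(q) - 1, -1, -1):
        q[k] = r[k + len(g) - 1]
        for j, c in enumerate(g):
            r[k + j] -= q[k] * c
    return q

def cyclotomic(n):
    """n-th cyclotomic polynomial as an integer coefficient list."""
    f = [-1] + [0] * (n - 1) + [1]          # z^n - 1
    for e in range(1, n):
        if n % e == 0:
            f = exact_div(f, cyclotomic(e))
    return f

# Factors of z^6 - 1, indexed by the divisors of 6.
factors_of_z6_minus_1 = {d: cyclotomic(d) for d in range(1, 7) if 6 % d == 0}
print(factors_of_z6_minus_1)
# z - 1, z + 1, z^2 + z + 1, z^2 - z + 1
```

With such factors, reducing a polynomial modulo each of them requires only additions and subtractions of its coefficients.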

It will be recalled that Rader's multiplicative reindexing often gives rise to cyclic convolutions of length $p - 1$ for p an odd prime. Since $p - 1$ is highly composite for all $p \leq 50$ other than 23 and 47, these cyclic convolutions can be performed more efficiently by the above procedure than by DFT.
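The arithmetic behind this remark can be reproduced with a short sketch. The criterion used here, flagging p when the largest prime factor of p - 1 exceeds 7, is only an illustrative assumption standing in for "not highly composite"; the source does not state a threshold.

```python
def factorize(n):
    """Prime factorization of n as a dict {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

odd_primes = [p for p in range(3, 51) if is_prime(p)]
largest_factor = {p: max(factorize(p - 1)) for p in odd_primes}

# Illustrative threshold (an assumption): flag p when p - 1 has a prime factor above 7.
exceptions = [p for p in odd_primes if largest_factor[p] > 7]
print(exceptions)  # [23, 47], since 22 = 2 x 11 and 46 = 2 x 23
```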

These combined algorithms are due to Winograd (1977, 1978, 1980), and are known collectively as `Winograd small FFT algorithms'. Winograd also showed that they can be thought of as bringing the DFT matrix F to the following `normal form':

$$\mathbf{F} = \mathbf{C}\mathbf{B}\mathbf{A},$$

where

A is an integer matrix with entries 0, $\pm 1$, defining the `pre-additions',
B is a diagonal matrix of multiplications,
C is a matrix with entries 0, $\pm 1$, $\pm i$, defining the `post-additions'.

The elements on the diagonal of B can be shown to be either real or pure imaginary, by the same argument as in Section 1.3.3.2.3.1. Matrices A and C may be rectangular rather than square, so that intermediate results may require extra storage space.
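A factorization of this form can be written out explicitly for N = 3 and verified numerically. The matrices below are one commonly quoted choice, given here as an illustration rather than as the form tabulated by Winograd, and the sign convention for the DFT matrix (negative exponent) is an assumption that may differ from the one used elsewhere in this chapter.

```python
import cmath
import math

N = 3
# Assumed convention: F[j][k] = exp(-2*pi*i*j*k/N).
F = [[cmath.exp(-2j * cmath.pi * j * k / N) for k in range(N)] for j in range(N)]

# Pre-additions: x0 + x1 + x2, x1 + x2, x1 - x2.
A = [[1, 1, 1],
     [0, 1, 1],
     [0, 1, -1]]
# Diagonal of B: each entry is real or pure imaginary.
B = [1.0, math.cos(2 * math.pi / 3) - 1.0, -1j * math.sin(2 * math.pi / 3)]
# Post-additions.
C = [[1, 0, 0],
     [1, 1, 1],
     [1, 1, -1]]

CBA = [[sum(C[j][m] * B[m] * A[m][k] for m in range(N)) for k in range(N)]
       for j in range(N)]
max_error = max(abs(CBA[j][k] - F[j][k]) for j in range(N) for k in range(N))
print(max_error)  # close to zero, up to rounding
```

In this small case A and C happen to be square, and only two of the three diagonal entries of B are non-trivial multiplications.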

References

Blahut, R. E. (1985). Fast algorithms for digital signal processing. Reading: Addison-Wesley.

Nussbaumer, H. J. (1981). Fast Fourier transform and convolution algorithms. Berlin: Springer-Verlag.

Schroeder, M. R. (1986). Number theory in science and communication, 2nd ed. Berlin: Springer-Verlag.

Winograd, S. (1977). Some bilinear forms whose multiplicative complexity depends on the field of constants. Math. Syst. Theor. 10, 169–180.

Winograd, S. (1978). On computing the discrete Fourier transform. Math. Comput. 32, 175–199.

Winograd, S. (1980). Arithmetic complexity of computations. CBMS-NSF Regional Conf. Series Appl. Math, Publ. No. 33. Philadelphia: SIAM Publications.
