In this article I want to discuss the Cooley-Tukey algorithm for general composite N=N1N2. The easiest way to see how the algorithm works is to use index mapping. Consider the length-N DFT of a sequence x[n]: X[k]=N−1∑n=0x[n]WnkN with WN=e−j2π/N. Let's assume that N can be factored as N=N1N2. Now use the following index mapping n=n1N2+n2,n1=0,1,…,N1−1,n2=0,1,…,N2−1k=k1+N1k2,k1=0,1,…,N1−1,k2=0,1,…,N2−1 Plugging (2) into (1) gives X[k1+N1k2]=N2−1∑n2=0N1−1∑n1=0x[n1N2+n2]W(n1N2+n2)(k1+k2N1)N The complex exponentials in (3) can be simplified as follows: W(n1N2+n2)(k1+k2N1)N=WN2n1k1NWN1N2n1k2NWn2k1NWN1n2k2N==Wn1k1N1⋅1⋅Wn2k1NWn2k2N2 because WN1N=WN2, WN2N=WN1, and WN1N2N=WNN=1. Plugging (4) back into (3) and rearranging terms gives X[k1+N1k2]=N2−1∑n2=0[Wn2k1N(N1−1∑n1=0x[n1N2+n2]Wn1k1N1)]Wn2k2N2 Equation (5) can be interpreted as follows: the result can be computed by computing N2 length-N1 DFTs of subsampled versions of the input sequence x[n] (this is the inner sum in (5)), then multiplying the result by twiddle factors Wn2k1N, and finally computing N1 length-N2 DFTs (the outer sum in (5)). This procedure is most easily visualized using matrices:
- read the data row-wise into a N1×N2 matrix
- compute DFTs over all N2 columns (this can be done in-place)
- multiply the result element-wise with twiddle factors
- compute DFTs over all N1 rows (in-place)
- read out the data column-wise to get the final length-N result
In practice, an FFT routine which implements N=2M will be sufficient most of the time, because in most cases one is free to choose the DFT length, and the data can usually be zero-padded to the next power of two. However, there are cases where a non-power of 2 length is required, such as in 3GPP Long Term Evolution (LTE), where for a transmission bandwidth of 15MHz an FFT length of 1536 is required. Suppose that on a certain platform you have an FFT routine for N=2M. In this case, since 1536=3×512, the representation (5) can be used to split up the computation of the DFT in 3 FFTs of length 512, which can be computed by the available FFT routine, and 512 length-3 FFTs, the implementation of which is trivial.
This little octave/Matlab script illustrates the procedure for the case N=1536:
N = 1536; % non-power of 2 FFT length used in LTE
N1 = 512; N2 = 3; % 512 x 3 = 1536
x = randn(N,1); % some input signal
% generate N1 x N2 matrix of input data, reading them row wise
X = reshape(x,[N2,N1]).';
% DFT over each column
X = fft(X);
% Multiplication with twiddle factors
X = X.*exp(-1i*2*pi/N*(0:N1-1)'*(0:N2-1));
% DFT over each row
X = (fft(X.')).';
% read out column wise
X = X(:);
So even if you don't want to program your own FFT algorithm it's very useful to understand the Cooley-Tukey method because it can be used to implement non-power-of-two FFTs using basic (power-of-2) FFT routines, if the FFT length N is composed of only a few (small) factors (other than 2), and this will be the case for almost all practical applications.
No comments:
Post a Comment