Wednesday, October 19, 2022

Design of IIR Allpass Filters: a least squares equation error method

The frequency response of a causal $N^{th}$ order allpass filter is given by $$H(e^{j\omega})=\frac{e^{-jN\omega}A^*(e^{j\omega})}{A(e^{j\omega})}\tag{1}$$ with $^*$ denoting complex conjugation, and with the denominator polynomial given by $$A(e^{j\omega})=1+a_1e^{-j\omega}+\ldots +a_Ne^{-jN\omega}\tag{2}$$ Note that from $(1)$, $|H(e^{j\omega})|=1$, and, consequently $$H(e^{j\omega})=e^{j\phi(\omega)}\tag{3}$$ where $\phi(\omega)$ is the filter's phase response.
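The unit magnitude property $(3)$ is easy to confirm numerically. The following NumPy snippet (my own sketch, not part of the original post) evaluates $H(e^{j\omega})$ on a frequency grid for an arbitrary real coefficient vector:

```python
import numpy as np

# Numerical check of eq. (3): for any real coefficients a_n, the allpass
# response H = e^{-jNw} A*(e^{jw}) / A(e^{jw}) has unit magnitude.
rng = np.random.default_rng(0)
N = 8
a = np.concatenate(([1.0], 0.1 * rng.standard_normal(N)))  # A: 1 + a1 z^-1 + ... + aN z^-N
w = np.linspace(0, np.pi, 512)
A = np.polyval(a[::-1], np.exp(-1j * w))   # A(e^{jw}) on the grid
H = np.exp(-1j * N * w) * np.conj(A) / A
print(np.max(np.abs(np.abs(H) - 1.0)))     # deviation from 1 is at rounding level
```

Since the numerator is the conjugate of the denominator (up to the linear phase factor), the magnitude is exactly one at every frequency where $A(e^{j\omega})\neq 0$, regardless of the coefficient values.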

The design problem is to find the coefficients $a_n$ in $(2)$ such that the phase $\phi(\omega)$ approximates a given desired phase response $\phi_d(\omega)$. The filter's phase $\phi(\omega)$ is a highly non-linear function of the filter coefficients $a_n$, so it's difficult to directly minimize some norm of the phase error $\phi(\omega)-\phi_d(\omega)$. It is more straightforward to consider the complex error function $H(e^{j\omega})-e^{j\phi_d(\omega)}$. This error function still depends non-linearly on the filter coefficients. However, multiplying this error function by the denominator of $H(e^{j\omega})$ results in an error function which is a linear function of the filter coefficients: $$\epsilon(\omega)=e^{-jN\omega}A^*(e^{j\omega}) - A(e^{j\omega})e^{j\phi_d(\omega)}\tag{4}$$ This error function is usually called "equation error", and it is also used for designing general IIR filters. In this DSP stackexchange post I describe the basics of that method for the general case, and there are also some relevant links.

Minimization of the error function $(4)$ is equivalent to solving the following overdetermined system of equations: $$e^{-jN\omega_i}A^*(e^{j\omega_i})\overset{!}{=}A(e^{j\omega_i})e^{j\phi_d(\omega_i)},\quad i=1,\ldots,M\tag{5}$$ where we've chosen a dense frequency grid $\omega_i$ in the interval $[0,\pi]$ such that there are considerably more equations than unknown filter coefficients. Since the equations in $(5)$ are linear functions of the filter coefficients $a_n$, we can rewrite $(5)$ using matrix/vector notation: $${\mathbf C}{\mathbf a}\overset{!}{=}{\mathbf d}\tag{6}$$ where $\mathbf C$ is a complex valued $M\times N$ matrix with rows $${\mathbf c}_i=\left[e^{-j(N-1)\omega_i}-e^{j(\phi_d(\omega_i)-\omega_i)},e^{-j(N-2)\omega_i}-e^{j(\phi_d(\omega_i)-2\omega_i)},\ldots,1-e^{j(\phi_d(\omega_i)-N\omega_i)}\right]\tag{7}$$ $\mathbf a$ is the $N\times 1$ vector of unknown filter coefficients $a_n$, and $\mathbf d$ is a complex $M\times 1$ vector with elements $e^{j\phi_d(\omega_i)}-e^{-jN\omega_i}$.

We can solve $(6)$ in a least squares sense either by solving the corresponding normal equations or by directly using the overdetermined system $(6)$ as the input to a QR-solver, as provided by the backslash operator in Matlab or Octave. We could compute the filter coefficients from $(6)$, but we'll add a minor generalization in terms of a weight function, which allows the error to be weighted differently in different frequency regions. We define a positive weight function $W(\omega)$ on the frequency grid and an $M\times M$ diagonal matrix $\mathbf W$ with elements $W(\omega_i)$ on its diagonal. The overdetermined system we are going to solve is then $${\mathbf W}^{1/2}{\mathbf C}{\mathbf a}\overset{!}{=}{\mathbf W}^{1/2}{\mathbf d}\tag{8}$$ We use the square root of the weights in $(8)$ just to be consistent with other weighted least squares methods. Since we're only interested in real-valued solutions, we rewrite $(8)$ as $$\left[\begin{array}{c}{\mathbf W}^{1/2}{\mathbf C}_R\\{\mathbf W}^{1/2}{\mathbf C}_I\end{array}\right]{\mathbf a}\overset{!}{=}\left[\begin{array}{c}{\mathbf W}^{1/2}{\mathbf d}_R\\{\mathbf W}^{1/2}{\mathbf d}_I\end{array}\right]\tag{9}$$ where the subscripts $R$ and $I$ denote real and imaginary parts, respectively. The overdetermined system of linear equations $(9)$ is the one that we need to solve in a least squares sense in order to compute the filter coefficients that minimize the weighted equation error.
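The whole procedure, from the rows $(7)$ to the stacked system $(9)$, fits in a few lines. Below is a NumPy sketch (the function name and interface are mine, loosely mirroring the Matlab/Octave reference implementation iir_ap.m mentioned later), demonstrated on a fractional delay target:

```python
import numpy as np

def iir_ap(N, phi_d, w, weight=None):
    """Equation error least squares design of an N-th order allpass filter.

    phi_d and weight are the desired phase and the weight W(w) sampled on
    the frequency grid w. Returns coefficients [1, a_1, ..., a_N].
    """
    n = np.arange(1, N + 1)
    # Rows of C per eq. (7) and right-hand side d
    C = np.exp(-1j * np.outer(w, N - n)) - np.exp(1j * (phi_d[:, None] - np.outer(w, n)))
    d = np.exp(1j * phi_d) - np.exp(-1j * N * w)
    if weight is not None:
        sw = np.sqrt(weight)
        C, d = sw[:, None] * C, sw * d
    # Stack real and imaginary parts as in eq. (9), solve in the LS sense
    Cr = np.vstack([C.real, C.imag])
    dr = np.concatenate([d.real, d.imag])
    a, *_ = np.linalg.lstsq(Cr, dr, rcond=None)
    return np.concatenate(([1.0], a))

# Fractional delay target: order N = 8, desired delay 7.5 samples
w = np.linspace(0, np.pi, 500)
a = iir_ap(8, -7.5 * w, w)
A = np.polyval(a[::-1], np.exp(-1j * w))
H = np.exp(-1j * 8 * w) * np.conj(A) / A
phase_err = np.angle(H * np.exp(1j * 7.5 * w))  # phi(w) - phi_d(w), wrapped
```

Note that minimizing the equation error is not identical to minimizing the phase error itself (the equation error is implicitly weighted by $|A(e^{j\omega})|$), but in practice the resulting phase approximation is very good away from the band edges.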

It should be noted that the method described here does not guarantee the stability of the designed filter. If the resulting filter turns out to be unstable, reflecting poles about the unit circle is not an option, because that would ruin the phase approximation. Instead, a new design has to be done whereby the desired phase response is modified by a linear phase term, i.e., by adding (or subtracting) a constant delay. In this way, it is always possible to obtain a stable filter. The value of that delay, however, needs to be determined experimentally.
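Checking stability is cheap: writing $A$ as a polynomial in $z$, the poles of $H(z)$ are the roots of $z^N+a_1z^{N-1}+\ldots+a_N$, all of which must lie inside the unit circle. A minimal NumPy check, using a hypothetical coefficient vector chosen so that the poles sit at $z=0.4$ and $z=0.5$:

```python
import numpy as np

# Stability check for an allpass denominator A(z) = 1 - 0.9 z^-1 + 0.2 z^-2.
# The poles are the roots of z^2 - 0.9 z + 0.2, here 0.4 and 0.5 by design.
a = np.array([1.0, -0.9, 0.2])
poles = np.roots(a)                  # roots of z^N + a1 z^(N-1) + ... + aN
stable = np.max(np.abs(poles)) < 1.0
print(stable)                        # True: all poles inside the unit circle
```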

I'll show a few design examples to illustrate the method described above. The first design is a fractional delay. The goal is to design an allpass filter with a fractional delay of $7.5$ samples. We choose a filter order $N=8$ and no weighting, i.e., $W(\omega)=1$ for all frequencies. The figures below show the result of this design. The top figure shows the phase error in radians. The phase error is smaller than $0.05$ radians for more than $90$% of the bandwidth. Only close to the Nyquist frequency does the phase error become relatively large. The bottom figure shows the group delay, which approximates the desired $7.5$ samples quite well for about $85$% of the bandwidth.

In the next design example, the allpass filter is used to equalize the phase distortion of a digital Cauer lowpass filter. The desired phase response of the allpass filter is the negative phase of the Cauer filter with some linear phase subtracted, such that the total desired phase response can be approximated well by a causal filter. I used a $5^{th}$ order Cauer lowpass and a $12^{th}$ order allpass equalizer. A good result was achieved with a desired total delay of $10$ samples. The figure below shows the design result. The top figure shows the phase approximation error in the passband, and the bottom plot shows the group delay functions in the passband for the Cauer filter (blue), the allpass equalizer (orange), and the cascade of Cauer filter and allpass filter (yellow). The total group delay is close to the specified $10$ samples over a large range of the passband.

The final example is the design of a Hilbert transformer. A Hilbert transformer shifts the phase of the input signal by $-\pi/2$ while keeping the amplitude unchanged. An ideal Hilbert transformer is a non-causal system, which is why a delay has to be added to the desired phase so that the approximation error of a causal allpass becomes reasonably small. I chose the filter order $N=10$ and a delay of $9$ samples. The top figure below shows the phase error. As is to be expected, the approximation is more difficult close to DC and Nyquist. However, over a large range of the bandwidth, the resulting filter computes an accurate Hilbert transform. The bottom figure shows the group delay, which approximates the desired value of $9$ samples, with larger errors at low and high frequencies.

A simple Matlab/Octave implementation of this design method can be found on GitHub: iir_ap.m

Monday, October 17, 2022

FIR Filters with prescribed magnitude and phase responses: weighted least squares in the frequency domain

In this post I'm going to explain the least squares design of finite impulse response (FIR) filters in the frequency domain. The least squares criterion is practically useful (it minimizes error energy) and, in the case of FIR filters, it leads to a simple problem statement whose solution can be computed by solving a system of linear equations.

The least squares design of linear phase filters is a special case, and its solution is implemented in the Matlab/Octave function $\texttt{firls.m}$. Here I want to describe the general case of approximating a complex desired frequency response, i.e., a response that prescribes not only the magnitude but also the phase response of the filter. If the prescribed phase response is linear and if the corresponding delay matches the chosen filter length, then the resulting filter will have a linear phase. Otherwise, the designed filter will have a non-linear phase response approximating the prescribed phase response.

The described method can design not only filters with real-valued coefficients but also FIR filters with complex coefficients. This is simply achieved by prescribing a complex desired frequency response that is not conjugate symmetric. Let $D(e^{j\omega})$ denote the complex desired frequency response. A complex FIR filter is obtained if $$D(e^{j\omega})\neq D^*(e^{-j\omega})$$ where $^*$ denotes complex conjugation. If $D(e^{j\omega})$ is conjugate symmetric, i.e., if $D(e^{j\omega}) = D^*(e^{-j\omega})$, then the resulting filter will be real-valued. In sum, the method described below can be used to design linear and non-linear phase filters with real-valued or complex-valued coefficients.

Define the complex desired frequency response $D(e^{j\omega})$ on a dense grid of frequencies $\omega_i$, $i=1\ldots M$, in the interval $(-\pi,\pi]$. Let $H(e^{j\omega})$ be the frequency response of the FIR filter with coefficients $h[n]$: $$H(e^{j\omega})=\sum_{n=0}^{N-1}h[n]e^{-jn\omega}$$ where $N$ is the filter length (i.e., the filter order equals $N-1$). The filter design problem can now simply be stated as $$H(e^{j\omega_i})\overset{!}{=}D(e^{j\omega_i}),\quad i=1\ldots M\tag{1}$$ where $\overset{!}{=}$ means "should be as close as possible" in some sense that is yet to be specified. We assume that $(1)$ is an overdetermined system, i.e., there are more equations than unknowns, or, in other words, there are more frequency points $M$ than unknown filter coefficients $N$.

It can be desirable to modify $(1)$ by introducing a strictly positive weight function $W(\omega)$. For notational convenience when formulating the least squares problem we use the square root of $W(\omega)$ in the overdetermined system: $$\sqrt{W(\omega_i)}\,H(e^{j\omega_i})\overset{!}{=}\sqrt{W(\omega_i)}\,D(e^{j\omega_i}),\quad i=1\ldots M\tag{2}$$ When minimizing the error in the overdetermined system $(2)$, the resulting approximation error will be smaller in frequency regions with a larger weight.

For formulating the weighted least squares approximation problem it is advantageous to write $(2)$ in matrix notation. Let $\mathbf C$ be a complex $M\times N$ matrix with rows ${\mathbf c}_i$ given by $${\mathbf c}_i=\big[1,e^{-j\omega_i},e^{-2j\omega_i},\ldots,e^{-(N-1)j\omega_i}\big]$$ With the filter coefficients $h[n]$ arranged in the column vector $\mathbf h$, the filter's frequency response evaluated at frequency $\omega_i$ can be written as $H(e^{j\omega_i})={\mathbf c}_i{\mathbf h}$. Consequently, the $M\times 1$ vector with elements $H(e^{j\omega_i})$, $i=1,\ldots,M$, is given by ${\mathbf C}{\mathbf h}$. With the complex desired frequency response values at frequencies $\omega_i$ arranged in a vector $\mathbf d$, we can write $(2)$ as $${\mathbf W}^{1/2}{\mathbf C \mathbf h} \overset{!}{=} {\mathbf W}^{1/2}{\mathbf d}\tag{3}$$ where $\mathbf W$ is a diagonal matrix with elements $W(\omega_i)$.
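The correspondence $H(e^{j\omega_i})={\mathbf c}_i{\mathbf h}$ is easy to verify numerically. The sketch below (all names are illustrative, not from the post) compares ${\mathbf C}{\mathbf h}$ with a direct evaluation of the DTFT sum:

```python
import numpy as np

# Sanity check of the matrix formulation: C @ h reproduces the filter's
# frequency response H(e^{jw_i}) evaluated directly as a polynomial in e^{-jw}.
rng = np.random.default_rng(1)
N, M = 16, 100
h = rng.standard_normal(N)
w = np.linspace(-np.pi, np.pi, M, endpoint=False)
C = np.exp(-1j * np.outer(w, np.arange(N)))      # rows [1, e^{-jw_i}, ..., e^{-(N-1)jw_i}]
H_matrix = C @ h
H_direct = np.polyval(h[::-1], np.exp(-1j * w))  # direct DTFT evaluation
print(np.allclose(H_matrix, H_direct))           # True
```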

Solving $(3)$ in a least squares sense is equivalent to minimizing the following error measure: $$\begin{align}\epsilon&=\sum_{i=1}^M W(\omega_i)\big|H(e^{j\omega_i})-D(e^{j\omega_i})\big|^2\\&=({\mathbf C \mathbf h}-{\mathbf d})^H{\mathbf W}({\mathbf C \mathbf h}-{\mathbf d})\\&={\mathbf h}^H{\mathbf C}^H{\mathbf W}{\mathbf C}{\mathbf h}-{\mathbf h}^H{\mathbf C}^H{\mathbf W}{\mathbf d}-{\mathbf d}^H{\mathbf W}{\mathbf C}{\mathbf h}+{\mathbf d}^H{\mathbf W}{\mathbf d}\tag{4}\end{align}$$ where we use $^H$ to denote complex conjugate transposition.

Minimizing $(4)$ results in the following $N\times N$ system of linear equations for the unknown filter coefficients: $${\mathbf C}^H{\mathbf W}{\mathbf C}{\mathbf h} = {\mathbf C}^H{\mathbf W}{\mathbf d} \tag{5}$$

Now we can either solve $(5)$ using some linear equation solver, or we can directly use the overdetermined system $(3)$ as input to a QR-solver. Matlab and Octave offer both possibilities with the backslash `\` operator. Solving the overdetermined system directly using a QR-solver has numerical advantages, whereas solving the square system $(5)$ has the advantage that the full system matrix need not be stored: ${\mathbf C}^H{\mathbf W}{\mathbf C}$ has a Toeplitz structure and is therefore completely determined by its first row. Such a system can be solved efficiently using Levinson's algorithm.
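The Toeplitz structure follows from $({\mathbf C}^H{\mathbf W}{\mathbf C})_{kl}=\sum_i W(\omega_i)e^{-j(l-k)\omega_i}$, which depends only on the difference $l-k$. A quick check (assuming SciPy is available; its `solve_toeplitz` function implements Levinson recursion):

```python
import numpy as np
from scipy.linalg import solve_toeplitz

# The Gram matrix C^H W C is Hermitian Toeplitz, so the normal equations can
# be solved by Levinson recursion from the first column/row alone.
rng = np.random.default_rng(3)
M, N = 300, 10
w = np.linspace(-np.pi, np.pi, M, endpoint=False)
Wt = 1.0 + rng.random(M)                              # positive weights W(w_i)
C = np.exp(-1j * np.outer(w, np.arange(N)))
d = rng.standard_normal(M) + 1j * rng.standard_normal(M)
G = C.conj().T @ (Wt[:, None] * C)                    # C^H W C (full matrix, for reference)
b = C.conj().T @ (Wt * d)                             # C^H W d
col = np.array([np.sum(Wt * np.exp(1j * k * w)) for k in range(N)])  # first column of G
h_lev = solve_toeplitz((col, col.conj()), b)          # Levinson-based solve
h_ref = np.linalg.solve(G, b)
print(np.allclose(h_lev, h_ref))                      # True
```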

In general, the solutions to $(3)$ and $(5)$ are complex-valued. If we want to design real-valued filters, then for every frequency point $\omega_i>0$ there must be a point $-\omega_i$ for which $D(e^{-j\omega_i})=D^*(e^{j\omega_i})$ and $W(-\omega_i)=W(\omega_i)$ are satisfied. Another more efficient way to guarantee real-valued solutions is to slightly modify the systems $(3)$ and $(5)$. With these modifications we only need to specify frequency points in the interval $[0,\pi]$. The overdetermined system $(3)$ becomes $$\left[\begin{array}{c}{\mathbf W}^{1/2}{\mathbf C}_R \\{\mathbf W}^{1/2}{\mathbf C}_I \end{array} \right]{\mathbf h}\overset{!}{=} \left[\begin{array}{c}{\mathbf W}^{1/2}{\mathbf d}_R\\{\mathbf W}^{1/2}{\mathbf d}_I\end{array}\right]\tag{6}$$ where subscripts $R$ and $I$ denote the real and imaginary parts, respectively. The $N\times N$ system $(5)$ becomes $$\textrm{Re}\left\{{\mathbf C}^H{\mathbf W}{\mathbf C}\right\}{\mathbf h} = \textrm{Re}\left\{{\mathbf C}^H{\mathbf W}{\mathbf d}\right\} \tag{7}$$ I.e., if we want a real-valued solution, we just take the real parts of the system matrix and of the right-hand side vector.
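Putting it all together, here is a NumPy sketch of a real-coefficient design via the stacked system $(6)$: a length-$31$ lowpass with linear desired phase (a delay of $15$ samples). The band edges, grid density, and weights are illustrative choices of mine, not taken from the post:

```python
import numpy as np

# Real-coefficient FIR design via the stacked real/imaginary system (6).
N, M = 31, 400
w = np.linspace(0, np.pi, M)                                      # grid on [0, pi] only
D = np.where(w <= 0.4 * np.pi, np.exp(-1j * 15 * w), 0.0)         # lowpass, delay 15
Wgt = np.where((w > 0.4 * np.pi) & (w < 0.5 * np.pi), 0.0, 1.0)   # don't-care transition
C = np.exp(-1j * np.outer(w, np.arange(N)))
sw = np.sqrt(Wgt)[:, None]
A = np.vstack([(sw * C).real, (sw * C).imag])                     # stacked system matrix
b = np.concatenate([(sw[:, 0] * D).real, (sw[:, 0] * D).imag])
h, *_ = np.linalg.lstsq(A, b, rcond=None)                         # real coefficients
H = C @ h
print(np.max(np.abs(H[w <= 0.4 * np.pi] - D[w <= 0.4 * np.pi])))  # passband error
print(np.max(np.abs(H[w >= 0.5 * np.pi])))                        # stopband leakage
```

Since the desired delay of $15$ samples equals half the filter order, the result is (up to numerical precision) a linear phase filter; choosing a non-integer or smaller delay yields a non-linear phase approximation instead.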

I've implemented the method described above in the Matlab/Octave function cfirls.m: