Max Manning

Passive radar signal processing part 1: clutter removal

This is my first post in a series covering signal processing algorithms for passive radar. If you haven’t already, check out the main passive radar page to get an idea of what the project is all about. This post is a bit math-ier than the last one but I will try to keep it as intuitive as possible.

One of the major challenges with passive radar is extracting weak target echoes from noise. This noise includes leakage of the direct path signal into the observation channel as well as strong reflections of the transmitted signal from stationary objects such as mountains and buildings. In radar terminology, these signals are collectively known as ‘clutter.’ For passive radars, the clutter signals are often tens or hundreds of times stronger than the target echoes. A clutter removal algorithm is therefore needed to remove these signals so that the target echoes can be observed.

Before getting into the derivation, here is a quick proof of concept. The animation below shows an example (using real passive radar data) of how a clutter removal algorithm can be used to extract target signals that would otherwise be completely buried in noise. Applying the algorithm lowers the noise floor by almost 40 dB! The spike that emerges from the noise is the target echo.

To derive the clutter removal algorithm, it is necessary to make some assumptions about the nature of the received passive radar signals. We will denote the signal received by the reference channel as \(s_r[n]\) and the signal received by the observation channel as \(s_o[n]\). Next, we can model the observation channel signal as a sum of three components: the desired target echo signal \( s_{tar}[n] \), the clutter signal \( s_{cl}[n] \) and an additive noise term \(\nu[n]\).

$$ \begin{equation}\label{eq:1} \displaystyle s_{o}[n] = s_{tar}[n] + s_{cl}[n] + \nu[n] \end{equation} $$

We will assume for simplicity that the reference signal is a perfect copy of the transmitted signal. Then, since the clutter signal is comprised of a direct-path copy of the transmitted signal and various multi-path reflections from stationary objects, it is natural to model it as a sum of \( M \) scaled and delayed copies of the reference signal. The number \( M \) is chosen so that delays from \(0\) to \(M-1\) samples cover every significant clutter return; the appropriate value varies depending on the situation.

$$ \begin{equation}\label{eq:2} \displaystyle s_{cl}[n] = \sum_{k=0}^{M-1} b_k s_{r}[n-k] \end{equation} $$
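As a quick numeric sketch of this clutter model, here is a minimal NumPy version. A complex white-noise signal stands in for the broadcast waveform, and the block length, number of taps, and coefficients \(b_k\) are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# A white-noise stand-in for the broadcast waveform; all names and
# numbers here are illustrative, not from the original system.
L = 4096                       # samples per processing block
s_r = rng.standard_normal(L) + 1j * rng.standard_normal(L)

# Clutter model: a sum of M scaled, delayed copies of the reference.
M = 8
b = 0.5 ** np.arange(M)        # made-up clutter coefficients b_k
s_cl = np.zeros(L, dtype=complex)
for k in range(M):
    s_cl[k:] += b[k] * s_r[:L - k]   # adds b_k * s_r[n - k]
```

Each pass through the loop shifts the reference by one more sample and accumulates it with its coefficient, exactly mirroring the sum in the equation above.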

Note that we don’t yet know what the coefficients \(b_k\) in the clutter signal are since they depend on the scattering objects that are present in the radar environment. However, if we can use some additional information to find appropriate estimates \(\hat{b}_k\) of the clutter coefficients, they can be used to reconstruct an estimate of the clutter signal. The estimated clutter signal can then be subtracted from the observation channel signal to yield an estimate of the target echo signal.

$$ \begin{equation}\label{eq:3} \displaystyle \hat{s}_{tar}[n] = s_o[n] - \hat{s}_{cl}[n] = s_o[n] - \sum_{k=0}^{M-1} \hat{b}_k s_{r}[n-k] \end{equation} $$

The main objective of the clutter removal algorithm can therefore be fulfilled by finding an estimate of the clutter coefficients. This is a type of adaptive filtering problem. The coefficients are selected based on the input data so that the desired output is obtained. A block diagram of this process is shown below.

So, how can an adaptive filtering algorithm determine the optimal filter coefficients? This seems impossible at first, since it looks like the filter is trying to predict its own output. To explain, we will begin by constructing a signal model for the target echoes.

Similarly to how the clutter signal was modelled in terms of the reference signal, the echo signal from \(N\) different targets can be represented as a sum of \(N\) scaled, delayed and Doppler shifted copies of the reference signal. Here \(w_k\), \(d_k\), and \(f_k\) are the magnitude, delay, and Doppler shift of the \(k^{th}\) target echo, respectively, and \(F_s\) is the sample rate.

$$ \begin{equation}\label{eq:102} \displaystyle s_{tar}[n] = \sum_{k=0}^{N-1} w_{k} s_{r}[n-d_k] \exp\left(\frac{j2\pi n f_k}{F_s} \right) \end{equation} $$

This equation looks a lot like the one for the clutter signal, except that now each element of the sum is multiplied by a Doppler shift term, \( \exp(j2\pi n f_k / F_s) \). This gives us a hint about how the echo signal can be separated from the clutter signal. First, recall that complex exponential functions of different frequencies are orthogonal to one another. This means that because of the Doppler shift, the target echoes are linearly independent from the reference signal provided that the Doppler shift is nonzero and the signals are measured over a sufficiently long time interval. Conversely, we have assumed that the clutter signal is a perfect linear combination of delayed copies of the reference signal. In other words, if we can find all the components of the observation channel signal that are linearly related to delayed copies of the reference signal and subtract them off, all that will remain will be the echo signal.
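For a numeric sketch of the echo model, here is a single-target version. The sample rate, magnitude, delay, and Doppler shift are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# One target with invented magnitude w, delay d (samples), and Doppler f (Hz).
Fs = 2.4e6                     # assumed sample rate
L = 4096
n = np.arange(L)
s_r = rng.standard_normal(L) + 1j * rng.standard_normal(L)

w, d, f = 0.01, 40, 120.0
s_tar = np.zeros(L, dtype=complex)
# w * s_r[n - d] * exp(j 2 pi n f / Fs); zero before the echo arrives
s_tar[d:] = w * s_r[:L - d] * np.exp(2j * np.pi * n[d:] * f / Fs)
```

Note that the Doppler exponential is evaluated at the current sample index \(n\), not at the delayed index, matching the equation above.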

It is convenient to reformulate this idea in terms of vector spaces. Since the clutter signal is comprised of scaled and delayed copies of the reference signal, it occupies a vector space whose basis vectors are delayed copies of the reference signal. We will call this space the clutter space. The clutter coefficients \(b_k\) are the coordinates of the clutter signal in the clutter space. Thanks to its Doppler shift, the echo signal is (over a sufficiently long measurement interval) approximately orthogonal to every delayed copy of the reference signal, and therefore to the clutter space as a whole. We can find the clutter signal by projecting the observation channel signal onto the clutter space. Subtracting off this projection leaves only the echo signal.

A variety of different adaptive filtering algorithms can be used for clutter removal in passive radar. We used an algorithm known as the least squares filter (which is also sometimes called a Wiener filter). The least squares filter is a block algorithm: it divides the input data into blocks many samples in length and computes a single set of filter coefficients for each block. This contrasts with stochastic gradient based approaches such as the least mean squares filter, in which the filter taps are updated as each individual input sample arrives. The performance of these two adaptive filtering schemes, as well as several others, is compared for passive radar clutter removal in this paper. I also strongly recommend the textbook Adaptive Filter Theory by Simon Haykin if you want to read more about adaptive filtering.

One benefit of the least squares algorithm is that its derivation is intuitive given the vector space formulation of the clutter removal process described above. To begin, let \(\mathbf{x_k} \) denote a vector containing \(L\) samples of the reference signal delayed by \( k \) samples, \(s_r[n - k] \). The set of vectors \(\mathbf{x_k} \) for \(k\) from \( 0 \) to \( M-1 \) forms a basis for the clutter space. We can then define an \( L \times M \) matrix \( \mathbf{X} \) whose columns are the basis vectors \( \mathbf{x_k} \).

Next, let \( \boldsymbol{\beta} \) be an arbitrary vector of length \( M \). Note that the product of \( \mathbf{X} \) with \( \boldsymbol{\beta} \) is a vector of length \( L \). \( \mathbf{X} \) transforms any point in the clutter space (a vector of length \( M \)) into its representation in the signal space (a vector of length \( L \)). Next, let \( \mathbf{y} \) be a vector containing \( L \) samples of the observation signal \( s_o[n] \). Finally, for any clutter vector \( \boldsymbol{\beta} \) let \(\boldsymbol{\epsilon} \) denote the residual between \( \mathbf{y} \) and \( \mathbf{X} \boldsymbol{\beta} \).

$$ \begin{equation}\label{eq:103} \displaystyle \boldsymbol{\epsilon} = \mathbf{y} - \mathbf{X} \boldsymbol{\beta} \end{equation} $$

Or equivalently,

$$ \begin{equation}\label{eq:104} \displaystyle \mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon} \end{equation} $$

Comparing this equation with the original signal model for the observation channel signal, we see that if we choose the right clutter vector \( \boldsymbol{\beta} \), then the term \( \mathbf{X} \boldsymbol{\beta} \) represents the clutter signal \( s_{cl} \). \(\boldsymbol{\epsilon} \) then represents the remaining two terms, \(s_{tar} + \nu\).
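Building \(\mathbf{X}\) from delayed copies of the reference signal is straightforward. A minimal sketch, using a white-noise stand-in for the reference and one simple zero-padding convention at the start of each column:

```python
import numpy as np

rng = np.random.default_rng(2)

L, M = 4096, 8
s_r = rng.standard_normal(L) + 1j * rng.standard_normal(L)

# Column k of X is the reference signal delayed by k samples,
# zero-padded at the start (one simple windowing convention).
X = np.zeros((L, M), dtype=complex)
for k in range(M):
    X[k:, k] = s_r[:L - k]
```

Other edge conventions (e.g. carrying samples over from the previous block) are possible; the zero-padding here is just the simplest choice.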

So how do we find this optimal value for \( \boldsymbol{\beta} \) (which we will call \( \boldsymbol{\hat{\beta}} \))? Well, we want the term \( \mathbf{X} \boldsymbol{\beta} \) to capture the largest possible fraction of \(\mathbf{y}\), such that it leaves the smallest component behind after it is subtracted off. The optimal clutter vector \( \boldsymbol{\hat{\beta}} \) can therefore be obtained by minimizing the square magnitude of \(\boldsymbol{\epsilon} \). The corresponding minimum-magnitude vector \(\boldsymbol{\hat{\epsilon}} \) then becomes our best estimate of the target signal. Since the noise term \(\nu\) is uncorrelated with the reference signal it cannot be removed, but as long as it is small compared to the target signal it can be neglected.

This is exciting because it means that we have reduced our problem to an ordinary least squares regression. The solution for this type of problem is well known, and is given by

$$ \begin{equation}\label{eq:105} \displaystyle \boldsymbol{\hat{\beta}} = \mathbf{(X^H X)^{-1} X^H y} \end{equation} $$

Note that in the preceding equation \(\mathbf{X^H}\) represents the Hermitian transpose of \(\mathbf{X}\), since the elements of \(\mathbf{X}\) can be complex numbers.
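In code, this least squares problem is usually solved without forming the explicit inverse. A sketch on synthetic data (coefficients and sizes invented), using NumPy's `lstsq`, which computes the same solution via a more numerically stable decomposition:

```python
import numpy as np

rng = np.random.default_rng(3)

# Build X and a synthetic noiseless observation y = X @ b_true.
L, M = 2048, 8
s_r = rng.standard_normal(L) + 1j * rng.standard_normal(L)
X = np.zeros((L, M), dtype=complex)
for k in range(M):
    X[k:, k] = s_r[:L - k]
b_true = 0.5 ** np.arange(M)
y = X @ b_true

# lstsq solves the same problem as (X^H X)^{-1} X^H y, but via an
# orthogonal decomposition that avoids the explicit inverse.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With noiseless synthetic data the recovered coefficients match `b_true` to machine precision.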

In practice, the matrix \(\mathbf{X^H X}\) is sometimes ill-conditioned (nearly singular), which makes the least squares solution numerically unstable; in the singular limit it is not even uniquely determined. One way to deal with this is to use a regularization technique such as Tikhonov regularization. The modified equation for the clutter coefficient vector \(\boldsymbol{\hat{\beta}}\) using Tikhonov regularization is shown below, where \(\boldsymbol{\Gamma}\) is an \(M \times M\) matrix called the Tikhonov matrix.

$$ \begin{equation}\label{eq:106} \displaystyle \boldsymbol{\hat{\beta}} = (\mathbf{X^H X} + \boldsymbol{\Gamma}^H \boldsymbol{\Gamma})^{-1}\mathbf{X^H y} \end{equation} $$

Different choices for the Tikhonov matrix result in different types of regularization. A common choice is \(\boldsymbol{\Gamma} = \alpha \boldsymbol{I}\), where \(\boldsymbol{I}\) is the identity matrix and \( \alpha \) is a constant. This choice (which is equivalent to \(L_2\) regularization) adds a penalty on the square magnitude of the solution vector to select for solutions with smaller magnitudes.
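A sketch of the regularized solve with \(\boldsymbol{\Gamma} = \alpha \boldsymbol{I}\), again on synthetic data with invented sizes, noise level, and \(\alpha\):

```python
import numpy as np

rng = np.random.default_rng(4)

L, M = 4096, 8
s_r = rng.standard_normal(L) + 1j * rng.standard_normal(L)
X = np.zeros((L, M), dtype=complex)
for k in range(M):
    X[k:, k] = s_r[:L - k]
b_true = 0.5 ** np.arange(M)
y = X @ b_true + 1e-3 * (rng.standard_normal(L) + 1j * rng.standard_normal(L))

# Gamma = alpha * I, so Gamma^H Gamma = alpha^2 * I (L2 regularization).
alpha = 1e-3
beta_hat = np.linalg.solve(
    X.conj().T @ X + alpha**2 * np.eye(M),
    X.conj().T @ y,
)
```

With a small \(\alpha\) the bias introduced by the penalty term is negligible and the clutter coefficients are recovered accurately; larger \(\alpha\) trades accuracy for robustness when \(\mathbf{X^H X}\) is poorly conditioned.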

Once the clutter coefficient vector has been obtained, it is straightforward to recover the target echo signal:

$$ \begin{equation}\label{eq:107} \displaystyle \hat{\boldsymbol{\epsilon}} = \mathbf{y} – \mathbf{X}\boldsymbol{\hat{\beta}} \end{equation} $$
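Putting the pieces together, here is an end-to-end sketch on synthetic data: strong clutter plus one weak Doppler-shifted echo, removed by least squares projection and subtraction. Every parameter (sample rate, block length, number of taps, target delay and Doppler) is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)

Fs, L, M = 2.4e6, 8192, 16
n = np.arange(L)
s_r = rng.standard_normal(L) + 1j * rng.standard_normal(L)

# Clutter space basis: delayed copies of the reference signal.
X = np.zeros((L, M), dtype=complex)
for k in range(M):
    X[k:, k] = s_r[:L - k]

# Strong clutter plus one weak Doppler-shifted echo plus a little noise.
b_true = 0.5 ** np.arange(M)
w, d, f = 0.01, 40, 500.0
s_tar = np.zeros(L, dtype=complex)
s_tar[d:] = w * s_r[:L - d] * np.exp(2j * np.pi * n[d:] * f / Fs)
y = X @ b_true + s_tar + 1e-4 * (rng.standard_normal(L) + 1j * rng.standard_normal(L))

# Least squares clutter estimate, then subtraction.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
eps_hat = y - X @ beta_hat

before = np.mean(np.abs(y) ** 2)        # power before clutter removal
after = np.mean(np.abs(eps_hat) ** 2)   # power after: the weak echo remains
```

On this synthetic example the residual power drops by orders of magnitude, while the weak target echo survives the subtraction essentially intact.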

With that, the clutter removal process is complete! Check out the next post in this series to see how the echo signals are processed to reveal target characteristics.