Digital Filtering¶
Introduction¶
The filters discussed in this chapter are based on the following moving data window which is centered on -th sample:
Here, is a non-negative integer called the window half-length, which represents the number of samples before and after sample . The total window length is .
Handling Endpoints¶
When processing samples near the ends of the input signal, there will not
be enough samples to fill the window defined above.
Therefore the user must specify how to construct the windows near the end points.
This is done by passing an input argument of type gsl_filter_end_t
:
-
type gsl_filter_end_t¶
This data type specifies how to construct windows near end points and can be selected from the following choices:
-
GSL_FILTER_END_PADZERO¶
With this option, a full window of length will be constructed by inserting zeros into the window near the signal end points. Effectively, the input signal is modified to
to ensure a well-defined window for all .
-
GSL_FILTER_END_PADVALUE¶
With this option, a full window of length will be constructed by padding the window with the first and last sample in the input signal. Effectively, the input signal is modified to
-
GSL_FILTER_END_TRUNCATE¶
With this option, no padding is performed, and the windows are simply truncated as the end points are approached.
-
GSL_FILTER_END_PADZERO¶
Linear Digital Filters¶
Gaussian Filter¶
The Gaussian filter convolves the input signal with a Gaussian kernel or window. This filter is often used as a smoothing or noise reduction filter. The Gaussian kernel is defined by
for , and is the size of the kernel. The parameter specifies the number of standard deviations desired in the kernel. So for example setting would define a Gaussian window of length which spans . It is often more convenient to specify the parameter rather than the standard deviation when constructing the kernel, since a fixed value of would correspond to the same shape of Gaussian regardless of the size . The appropriate value of the standard deviation depends on and is related to as
The routines below accept as an input argument instead of .
The Gaussian filter offers a convenient way of differentiating and smoothing an input signal in a single pass. Using the derivative property of a convolution,
the input signal can be smoothed and differentiated at the same time by convolution with a derivative Gaussian kernel, which can be readily computed from the analytic expression above. The same principle applies to higher order derivatives.
-
gsl_filter_gaussian_workspace *gsl_filter_gaussian_alloc(const size_t K)¶
This function initializes a workspace for Gaussian filtering using a kernel of size
K
. Here, . If is even, it is rounded up to the next odd integer to ensure a symmetric window. The size of the workspace is .
-
void gsl_filter_gaussian_free(gsl_filter_gaussian_workspace *w)¶
This function frees the memory associated with
w
.
-
int gsl_filter_gaussian(const gsl_filter_end_t endtype, const double alpha, const size_t order, const gsl_vector *x, gsl_vector *y, gsl_filter_gaussian_workspace *w)¶
This function applies a Gaussian filter parameterized by
alpha
to the input vectorx
, storing the output iny
. The derivative order is specified byorder
, with0
corresponding to a Gaussian,1
corresponding to a first derivative Gaussian, and so on. The parameterendtype
specifies how the signal end points are handled. It is allowed forx
=y
for an in-place filter.
-
int gsl_filter_gaussian_kernel(const double alpha, const size_t order, const int normalize, gsl_vector *kernel)¶
This function constructs a Gaussian kernel parameterized by
alpha
and stores the output inkernel
. The parameterorder
specifies the derivative order, with0
corresponding to a Gaussian,1
corresponding to a first derivative Gaussian, and so on. Ifnormalize
is set to1
, then the kernel will be normalized to sum to one on output. Ifnormalize
is set to0
, no normalization is performed.
Nonlinear Digital Filters¶
The nonlinear digital filters described below are based on the window median, which is given by
The median is considered robust to local outliers, unlike the mean. Median filters can preserve sharp edges while at the same removing signal noise, and are used in a wide range of applications.
Standard Median Filter¶
The standard median filter (SMF) simply replaces the sample by the median of the window : This filter has one tuning parameter given by . The standard median filter is considered highly resistant to local outliers and local noise in the data sequence .
-
gsl_filter_median_workspace *gsl_filter_median_alloc(const size_t K)¶
This function initializes a workspace for standard median filtering using a symmetric centered moving window of size
K
. Here, . If is even, it is rounded up to the next odd integer to ensure a symmetric window. The size of the workspace is .
-
void gsl_filter_median_free(gsl_filter_median_workspace *w)¶
This function frees the memory associated with
w
.
-
int gsl_filter_median(const gsl_filter_end_t endtype, const gsl_vector *x, gsl_vector *y, gsl_filter_median_workspace *w)¶
This function applies a standard median filter to the input
x
, storing the output iny
. The parameterendtype
specifies how the signal end points are handled. It is allowed to havex
=y
for an in-place filter.
Recursive Median Filter¶
The recursive median filter (RMF) is a modification of the SMF to include previous filter outputs in the window before computing the median. The filter’s response is
Sometimes, the SMF must be applied several times in a row to achieve adequate smoothing (i.e. a cascade filter). The RMF, on the other hand, converges to a root sequence in one pass, and can sometimes provide a smoother result than several passes of the SMF. A root sequence is an input which is left unchanged by the filter. So there is no need to apply a recursive median filter twice to an input vector.
-
gsl_filter_rmedian_workspace *gsl_filter_rmedian_alloc(const size_t K)¶
This function initializes a workspace for recursive median filtering using a symmetric centered moving window of size
K
. Here, . If is even, it is rounded up to the next odd integer to ensure a symmetric window. The size of the workspace is .
-
void gsl_filter_rmedian_free(gsl_filter_rmedian_workspace *w)¶
This function frees the memory associated with
w
.
-
int gsl_filter_rmedian(const gsl_filter_end_t endtype, const gsl_vector *x, gsl_vector *y, gsl_filter_rmedian_workspace *w)¶
This function applies a recursive median filter to the input
x
, storing the output iny
. The parameterendtype
specifies how the signal end points are handled. It is allowed to havex
=y
for an in-place filter.
Impulse Detection Filter¶
Impulsive noise is characterized by short sequences of data points distinct from those in the surrounding neighborhood. This section describes a powerful class of filters, also known as impulse rejection filters and decision-based filters, designed to detect and remove such outliers from data. The filter’s response is given by
where is the median value of the window , is a robust estimate of the scatter or dispersion for the window , and is a tuning parameter specifying the number of scale factors needed to determine that a point is an outlier. The main idea is that the median will be unaffected by a small number of outliers in the window, and so a given sample is tested to determine how far away it is from the median in terms of the local scale estimate . Samples which are more than scale estimates away from the median are labeled as outliers and replaced by the window median . Samples which are less than scale estimates from the median are left unchanged by the filter.
Note that when , the impulse detection filter is equivalent to the standard median filter. When , it becomes the identity filter. This means the impulse detection filter can be viewed as a “less aggressive” version of the standard median filter, becoming less aggressive as is increased. Note that this filter modifies only samples identified as outliers, while the standard median filter changes all samples to the local median, regardless of whether they are outliers. This fact, plus the additional flexibility offered by the additional tuning parameter can make the impulse detection filter a better choice for some applications.
It is important to have a robust and accurate scale estimate in order to
detect impulse outliers even in the presence of noise. The window standard deviation is not
typically a good choice, as it can be significantly perturbed by the presence of even one outlier.
GSL offers the following choices (specified by a parameter of type gsl_filter_scale_t
) for
computing the scale estimate , all of which are robust to the presence of impulse outliers.
-
type gsl_filter_scale_t¶
This type specifies how the scale estimate of the window is calculated.
-
GSL_FILTER_SCALE_MAD¶
This option specifies the median absolute deviation (MAD) scale estimate, defined by
This choice of scale estimate is also known as the Hampel filter in the statistical literature. See here for more information.
-
GSL_FILTER_SCALE_IQR¶
This option specifies the interquartile range (IQR) scale estimate, defined as the difference between the 75th and 25th percentiles of the window ,
where is the p-quantile of the window . The idea is to throw away the largest and smallest 25% of the window samples (where the outliers would be), and estimate a scale from the middle 50%. The factor provides an unbiased estimate of the standard deviation for Gaussian data.
-
GSL_FILTER_SCALE_MAD¶
Warning
While the scale estimates defined above are much less sensitive to outliers than the standard deviation, they can suffer from an effect called implosion. The standard deviation of a window will be zero if and only if all samples in the window are equal. However, it is possible for the MAD of a window to be zero even if all the samples in the window are not equal. For example, if or more of the samples in the window are equal to some value , then the window median will be equal to . Consequently, at least of the absolute deviations will be zero, and so the MAD will be zero. In such a case, the Hampel filter will act like the standard median filter regardless of the value of . Caution should also be exercised if dividing by .
-
gsl_filter_impulse_workspace *gsl_filter_impulse_alloc(const size_t K)¶
This function initializes a workspace for impulse detection filtering using a symmetric moving window of size
K
. Here, . If is even, it is rounded up to the next odd integer to ensure a symmetric window. The size of the workspace is .
-
void gsl_filter_impulse_free(gsl_filter_impulse_workspace *w)¶
This function frees the memory associated with
w
.
-
int gsl_filter_impulse(const gsl_filter_end_t endtype, const gsl_filter_scale_t scale_type, const double t, const gsl_vector *x, gsl_vector *y, gsl_vector *xmedian, gsl_vector *xsigma, size_t *noutlier, gsl_vector_int *ioutlier, gsl_filter_impulse_workspace *w)¶
These functions apply an impulse detection filter to the input vector
x
, storing the filtered output iny
. The tuning parameter is provided int
. The window medians are stored inxmedian
and the are stored inxsigma
on output. The number of outliers detected is stored innoutlier
on output, while the locations of flagged outliers are stored in the boolean arrayioutlier
. The inputioutlier
may beNULL
if not desired. It is allowed to havex
=y
for an in-place filter.
Examples¶
Gaussian Example 1¶
This example program illustrates the Gaussian filter applied to smoothing a time series of length with a kernel size of . Three filters are applied with parameters . The results are shown in Fig. 9.
We see that the filter corresponding to applies the most smoothing, while corresponds to the least amount of smoothing. The program is given below.
#include <stdio.h>
#include <stdlib.h>
#include <gsl/gsl_math.h>
#include <gsl/gsl_filter.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_vector.h>
int
main(void)
{
const size_t N = 500; /* length of time series */
const size_t K = 51; /* window size */
const double alpha[3] = { 0.5, 3.0, 10.0 }; /* alpha values */
gsl_vector *x = gsl_vector_alloc(N); /* input vector */
gsl_vector *y1 = gsl_vector_alloc(N); /* filtered output vector for alpha1 */
gsl_vector *y2 = gsl_vector_alloc(N); /* filtered output vector for alpha2 */
gsl_vector *y3 = gsl_vector_alloc(N); /* filtered output vector for alpha3 */
gsl_vector *k1 = gsl_vector_alloc(K); /* Gaussian kernel for alpha1 */
gsl_vector *k2 = gsl_vector_alloc(K); /* Gaussian kernel for alpha2 */
gsl_vector *k3 = gsl_vector_alloc(K); /* Gaussian kernel for alpha3 */
gsl_rng *r = gsl_rng_alloc(gsl_rng_default);
gsl_filter_gaussian_workspace *gauss_p = gsl_filter_gaussian_alloc(K);
size_t i;
double sum = 0.0;
/* generate input signal */
for (i = 0; i < N; ++i)
{
double ui = gsl_ran_gaussian(r, 1.0);
sum += ui;
gsl_vector_set(x, i, sum);
}
/* compute kernels without normalization */
gsl_filter_gaussian_kernel(alpha[0], 0, 0, k1);
gsl_filter_gaussian_kernel(alpha[1], 0, 0, k2);
gsl_filter_gaussian_kernel(alpha[2], 0, 0, k3);
/* apply filters */
gsl_filter_gaussian(GSL_FILTER_END_PADVALUE, alpha[0], 0, x, y1, gauss_p);
gsl_filter_gaussian(GSL_FILTER_END_PADVALUE, alpha[1], 0, x, y2, gauss_p);
gsl_filter_gaussian(GSL_FILTER_END_PADVALUE, alpha[2], 0, x, y3, gauss_p);
/* print kernels */
for (i = 0; i < K; ++i)
{
double k1i = gsl_vector_get(k1, i);
double k2i = gsl_vector_get(k2, i);
double k3i = gsl_vector_get(k3, i);
printf("%e %e %e\n", k1i, k2i, k3i);
}
printf("\n\n");
/* print filter results */
for (i = 0; i < N; ++i)
{
double xi = gsl_vector_get(x, i);
double y1i = gsl_vector_get(y1, i);
double y2i = gsl_vector_get(y2, i);
double y3i = gsl_vector_get(y3, i);
printf("%.12e %.12e %.12e %.12e\n", xi, y1i, y2i, y3i);
}
gsl_vector_free(x);
gsl_vector_free(y1);
gsl_vector_free(y2);
gsl_vector_free(y3);
gsl_vector_free(k1);
gsl_vector_free(k2);
gsl_vector_free(k3);
gsl_rng_free(r);
gsl_filter_gaussian_free(gauss_p);
return 0;
}
Gaussian Example 2¶
A common application of the Gaussian filter is to detect edges, or sudden jumps, in a noisy input signal. It is used both for 1D edge detection in time series, as well as 2D edge detection in images. Here we will examine a noisy time series of length with a single edge. The input signal is defined as
where is Gaussian random noise. The program smooths the input signal with order and Gaussian filters of length with . For comparison, the program also computes finite differences of the input signal without smoothing. The results are shown in Fig. 10.
The finite difference approximation of the first derivative (second row) shows the common problem with differentiating a noisy signal. The noise is amplified and makes it extremely difficult to detect the sharp gradient at sample . The third row shows the first order Gaussian smoothed signal with a clear peak at the location of the edge. Alternatively, one could examine the second order Gaussian smoothed signal (fourth row) and look for zero crossings to determine the edge location.
The program is given below.
#include <stdio.h>
#include <stdlib.h>
#include <gsl/gsl_math.h>
#include <gsl/gsl_filter.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_vector.h>
int
main(void)
{
const size_t N = 1000; /* length of time series */
const size_t K = 61; /* window size */
const double alpha = 3.0; /* Gaussian kernel has +/- 3 standard deviations */
gsl_vector *x = gsl_vector_alloc(N); /* input vector */
gsl_vector *y = gsl_vector_alloc(N); /* filtered output vector */
gsl_vector *dy = gsl_vector_alloc(N); /* first derivative filtered vector */
gsl_vector *d2y = gsl_vector_alloc(N); /* second derivative filtered vector */
gsl_rng *r = gsl_rng_alloc(gsl_rng_default);
gsl_filter_gaussian_workspace *gauss_p = gsl_filter_gaussian_alloc(K);
size_t i;
/* generate input signal */
for (i = 0; i < N; ++i)
{
double xi = (i > N / 2) ? 0.5 : 0.0;
double ei = gsl_ran_gaussian(r, 0.1);
gsl_vector_set(x, i, xi + ei);
}
/* apply filters */
gsl_filter_gaussian(GSL_FILTER_END_PADVALUE, alpha, 0, x, y, gauss_p);
gsl_filter_gaussian(GSL_FILTER_END_PADVALUE, alpha, 1, x, dy, gauss_p);
gsl_filter_gaussian(GSL_FILTER_END_PADVALUE, alpha, 2, x, d2y, gauss_p);
/* print results */
for (i = 0; i < N; ++i)
{
double xi = gsl_vector_get(x, i);
double yi = gsl_vector_get(y, i);
double dyi = gsl_vector_get(dy, i);
double d2yi = gsl_vector_get(d2y, i);
double dxi;
/* compute finite difference of x vector */
if (i == 0)
dxi = gsl_vector_get(x, i + 1) - xi;
else if (i == N - 1)
dxi = gsl_vector_get(x, i) - gsl_vector_get(x, i - 1);
else
dxi = 0.5 * (gsl_vector_get(x, i + 1) - gsl_vector_get(x, i - 1));
printf("%.12e %.12e %.12e %.12e %.12e\n",
xi,
yi,
dxi,
dyi,
d2yi);
}
gsl_vector_free(x);
gsl_vector_free(y);
gsl_vector_free(dy);
gsl_vector_free(d2y);
gsl_rng_free(r);
gsl_filter_gaussian_free(gauss_p);
return 0;
}
Square Wave Signal Example¶
The following example program illustrates the median filters on a noisy square wave signal. Median filters are well known for preserving sharp edges in the input signal while reducing noise. The program constructs a 5 Hz square wave signal with Gaussian noise added. Then the signal is filtered with a standard median filter and recursive median filter using a symmetric window of length . The results are shown in Fig. 11.
Both filters preserve the sharp signal edges while reducing the noise. The recursive median filter achieves a smoother result than the standard median filter. The “blocky” nature of the output is characteristic of all median filters. The program is given below.
#include <stdio.h>
#include <stdlib.h>
#include <gsl/gsl_math.h>
#include <gsl/gsl_filter.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_vector.h>
int
main(void)
{
const size_t N = 1000; /* length of time series */
const size_t K = 7; /* window size */
const double f = 5.0; /* frequency of square wave in Hz */
gsl_filter_median_workspace *median_p = gsl_filter_median_alloc(K);
gsl_filter_rmedian_workspace *rmedian_p = gsl_filter_rmedian_alloc(K);
gsl_vector *t = gsl_vector_alloc(N); /* time */
gsl_vector *x = gsl_vector_alloc(N); /* input vector */
gsl_vector *y_median = gsl_vector_alloc(N); /* median filtered output */
gsl_vector *y_rmedian = gsl_vector_alloc(N); /* recursive median filtered output */
gsl_rng *r = gsl_rng_alloc(gsl_rng_default);
size_t i;
/* generate input signal */
for (i = 0; i < N; ++i)
{
double ti = (double) i / (N - 1.0);
double tmp = sin(2.0 * M_PI * f * ti);
double xi = (tmp >= 0.0) ? 1.0 : -1.0;
double ei = gsl_ran_gaussian(r, 0.1);
gsl_vector_set(t, i, ti);
gsl_vector_set(x, i, xi + ei);
}
gsl_filter_median(GSL_FILTER_END_PADVALUE, x, y_median, median_p);
gsl_filter_rmedian(GSL_FILTER_END_PADVALUE, x, y_rmedian, rmedian_p);
/* print results */
for (i = 0; i < N; ++i)
{
double ti = gsl_vector_get(t, i);
double xi = gsl_vector_get(x, i);
double medi = gsl_vector_get(y_median, i);
double rmedi = gsl_vector_get(y_rmedian, i);
printf("%f %f %f %f\n",
ti,
xi,
medi,
rmedi);
}
gsl_vector_free(t);
gsl_vector_free(x);
gsl_vector_free(y_median);
gsl_vector_free(y_rmedian);
gsl_rng_free(r);
gsl_filter_median_free(median_p);
return 0;
}
Impulse Detection Example¶
The following example program illustrates the impulse detection filter. First, it constructs a sinusoid signal of length with Gaussian noise added. Then, about 1% of the data are perturbed to represent large outliers. An impulse detecting filter is applied with a window size and tuning parameter , using the statistic as the robust measure of scale. The results are plotted in Fig. 12.
The program is given below.
#include <stdio.h>
#include <stdlib.h>
#include <gsl/gsl_math.h>
#include <gsl/gsl_filter.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_vector.h>
int
main(void)
{
const size_t N = 1000; /* length of time series */
const size_t K = 25; /* window size */
const double t = 4.0; /* number of scale factors for outlier detection */
gsl_vector *x = gsl_vector_alloc(N); /* input vector */
gsl_vector *y = gsl_vector_alloc(N); /* output (filtered) vector */
gsl_vector *xmedian = gsl_vector_alloc(N); /* window medians */
gsl_vector *xsigma = gsl_vector_alloc(N); /* window scale estimates */
gsl_vector_int *ioutlier = gsl_vector_int_alloc(N); /* outlier detected? */
gsl_filter_impulse_workspace * w = gsl_filter_impulse_alloc(K);
gsl_rng *r = gsl_rng_alloc(gsl_rng_default);
size_t noutlier;
size_t i;
/* generate input signal */
for (i = 0; i < N; ++i)
{
double xi = 10.0 * sin(2.0 * M_PI * i / (double) N);
double ei = gsl_ran_gaussian(r, 2.0);
double u = gsl_rng_uniform(r);
double outlier = (u < 0.01) ? 15.0*GSL_SIGN(ei) : 0.0;
gsl_vector_set(x, i, xi + ei + outlier);
}
/* apply impulse detection filter */
gsl_filter_impulse(GSL_FILTER_END_TRUNCATE, GSL_FILTER_SCALE_QN, t, x, y,
xmedian, xsigma, &noutlier, ioutlier, w);
/* print results */
for (i = 0; i < N; ++i)
{
double xi = gsl_vector_get(x, i);
double yi = gsl_vector_get(y, i);
double xmedi = gsl_vector_get(xmedian, i);
double xsigmai = gsl_vector_get(xsigma, i);
int outlier = gsl_vector_int_get(ioutlier, i);
printf("%zu %f %f %f %f %d\n",
i,
xi,
yi,
xmedi + t * xsigmai,
xmedi - t * xsigmai,
outlier);
}
gsl_vector_free(x);
gsl_vector_free(y);
gsl_vector_free(xmedian);
gsl_vector_free(xsigma);
gsl_vector_int_free(ioutlier);
gsl_filter_impulse_free(w);
gsl_rng_free(r);
return 0;
}
References and Further Reading¶
The following publications are relevant to the algorithms described in this chapter,
F. J. Harris, On the use of windows for harmonic analysis with the discrete Fourier transform, Proceedings of the IEEE, 66 (1), 1978.
S-J. Ko, Y-H. Lee, and A. T. Fam. Efficient implementation of one-dimensional recursive median filters, IEEE transactions on circuits and systems 37.11 (1990): 1447-1450.
R. K. Pearson and M. Gabbouj, Nonlinear Digital Filtering with Python: An Introduction. CRC Press, 2015.