Re: Artifacts and splines

Ton Van Den Bogert
07-31-1995, 06:30 AM
Dear Biomch-L:

Paolo de Leva asked some specific questions regarding artifacts
created by spline smoothing. I have not studied the mathematics
of splines extensively, but would like to clarify my observations
so as to avoid confusion. These observations are based on
experimentation with the Woltring implementation some ten years
ago, so my memory may not be accurate.

> 1) Are the artifacts present (a) only in the middle of each cubic or
>quintic polynomial, or (b) also in the extremes (corresponding to the instants
>when the raw data were measured)?
> In case (a) only interpolated data would be affected, while in case
> (b) also non-interpolated data might contain unwanted extra-noise.
> Van den Bogert and Glossop wrote that they noticed artifacts in
> interpolated data (case a). I would like to know for sure if artifacts
> can be excluded at the extremes of each polynomial (case b).

I have noticed these artifacts whenever I used a smoothing spline
on data with gaps. Note that I'm still talking about 'smoothing
splines', not about 'interpolating splines' (splines that pass
exactly through all data points). The interpolation across the
gap suffered from large wave-like fluctuations, especially with
quintic splines. This was on kinematic data which were basically
uniformly spaced, except for certain areas where a marker was
'out-of-view'. I'm not sure if the same artifacts would occur if
data are uniformly spaced throughout, and a spline is used before
resampling the data at a higher sampling rate. I think the same
thing could happen, to a lesser extent.

The answer to Paolo's question is (a). The artifacts only affect
the spline between data points, not right at the data points.
But the derivatives may be affected at the data points!

> 2) Does this uncertainty in the results have a clean mathematical
>explanation? It was not clear whether the artifacts were just observed
>by some researchers, USING SOME PARTICULAR SPLINE ROUTINES, or they are
>expected, embedded in the logic of the equations themselves, and cannot be
>avoided, whatever routine you are using.

A smoothing spline is created by minimizing an objective function
which is a combination of smoothness of the whole curve (the
integral of the squared m-th derivative, with m = 2 for a cubic
and m = 3 for a quintic spline) and close fit at the data points.
The relative weighting between the two criteria determines the
amount of smoothing. I think the problem with irregularly spaced
data is that widely spaced data points need more smoothing than
closely spaced data points. Since the whole spline is created
using a single smoothing parameter, there is insufficient
smoothing for those areas where data are further apart.
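
For reference, the objective described above can be written out as
follows. This is only a sketch of the usual textbook form; the
symbols (s for the spline, t_i and y_i for the data, w_i for
optional per-point weights, p for the smoothing parameter) are my
own labels, and scaling conventions differ between implementations.

```latex
C_p(s) \;=\; \sum_{i=1}^{n} w_i \,\bigl(y_i - s(t_i)\bigr)^2
        \;+\; p \int_{t_1}^{t_n} \bigl(s^{(m)}(t)\bigr)^2 \, dt
```

Here s is a spline of degree 2m-1, and a larger p gives a smoother
curve. A single p applies to the whole record, so it cannot adapt
to a region where the t_i are far apart.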

Another mathematical property of splines is that they are
equivalent to digital filters. See the file GCVSPL MEMO which
can be obtained from LISTSERV@nic.surfnet.nl. Roughly, a cubic
spline would be equivalent to a 2nd order Butterworth filter
applied twice (forward and backward in time). A quintic spline
would be equivalent to a 3rd order Butterworth filter applied
twice in the same way. Since higher-order
filters have a tendency for 'ringing' close to sharp transitions
in the data, this may help explain why higher-order splines tend
to create more artifacts, especially in the derivatives.
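
To make the filter analogy concrete, here is a minimal Python
sketch of the gain of a Butterworth low-pass filter applied twice.
It uses the analog magnitude response only, and the 6 Hz cut-off
and the function name are hypothetical values chosen for
illustration; they do not come from the GCVSPL memo.

```python
def double_butterworth_gain(f, fc, order):
    """Gain at frequency f of a Butterworth low-pass applied twice.

    One pass has gain 1/sqrt(1 + (f/fc)**(2*order)); running the
    filter forward and backward multiplies the two gains together.
    """
    return 1.0 / (1.0 + (f / fc) ** (2 * order))


fc = 6.0  # hypothetical cut-off frequency in Hz
for label, order in (("cubic", 2), ("quintic", 3)):
    # Gain below, at, and above the cut-off frequency
    gains = [double_butterworth_gain(f, fc, order) for f in (3.0, 6.0, 12.0)]
    print(label, [round(g, 4) for g in gains])
```

Both cases have a gain of 0.5 at the cut-off, but the order-3
(quintic) case drops off more steeply above it. That sharper
transition is what goes along with the stronger ringing.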

The artifacts are definitely not a problem with specific
software, but inherent in the mathematics.

> 3) Are the artifacts mathematical singularities, that
>occur only in some precise cases, or do they occur unpredictably?

They are predictable, and I have only had problems when
interpolating over relatively large gaps in the data.
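
Since the trouble is tied to large gaps, it may help to flag them
before evaluating the spline there. The following Python sketch is
one way to do that; the function name, the factor of 1.5 and the
sample times are all hypothetical and not part of any spline
package.

```python
from statistics import median


def find_gaps(times, factor=1.5):
    """Return (start, end) pairs where the sample interval is larger
    than factor times the median interval of the record."""
    intervals = [b - a for a, b in zip(times, times[1:])]
    nominal = median(intervals)
    return [(a, b) for a, b in zip(times, times[1:])
            if (b - a) > factor * nominal]


# Hypothetical 100 Hz record with a marker out of view
# between 0.05 s and 0.12 s
times = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05, 0.12, 0.13, 0.14, 0.15]
gaps = find_gaps(times)
print(gaps)
```

Spline values and derivatives evaluated inside a flagged interval
could then be inspected, or excluded, rather than trusted blindly.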

In reply to Jesus Dapena's question:

>time. "Smoothing" means that the spline curve does not pass exactly through
>the raw data points; "interpolating" means that you are using the spline
>functions to calculate data for times in between the times of the original
>data points (although some people reserve the term "interpolating" only for
>zero smoothing).

> Or am I the one that missed the boat here??

No boat was missed. When I talk about interpolation, I'm still
using smoothing splines but calculate the function at times when
no data are available: between samples and across gaps. The
'some people' are correct in their terminology, by the way. But
zero smoothing isn't used in biomechanics, as far as I know.

Since I have used the Woltring package, some final comments:

1. The Fortran version can be obtained by sending 'GET GCVSPL
FORTRAN' to LISTSERV@nic.surfnet.nl. A C version exists (I
think). Its location must have been announced on Biomch-L (that
would require a search through the archives).

2. I have never had good results when using the GCV option, which
automatically determines the optimal amount of smoothing. The
smoothed function is OK, but the derivatives are much too noisy.
Do others have the same experience?

-- Ton van den Bogert