For instance, when using 4 data points, the Savitzky-Golay filter is
y(k) = 0.7 x(k) + 0.4 x(k-1) + 0.1 x(k-2) - 0.2 x(k-3)
where
x(k) is the newest (current) input
x(k-1) is the previous input
x(k-2) is the input before that
x(k-3) is the oldest input used for the 4-point filter
y(k) is the newest (current) filtered (output) value.
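As a minimal sketch, the 4-point filter above could be coded as follows in Python. The class name, the `update` method, and the startup behavior (pre-filling the history with the first sample) are assumptions made for illustration; they are not part of the filter definition.

```python
from collections import deque

# Coefficients of the 4-point filter, newest input first:
# y(k) = 0.7 x(k) + 0.4 x(k-1) + 0.1 x(k-2) - 0.2 x(k-3)
COEFFS_4 = (0.7, 0.4, 0.1, -0.2)


class FourPointFilter:
    """Hypothetical helper that applies the 4-point filter one sample at a time."""

    def __init__(self, coeffs=COEFFS_4):
        self.coeffs = tuple(coeffs)
        # history[0] is x(k), history[1] is x(k-1), and so on.
        self.history = deque(maxlen=len(self.coeffs))

    def update(self, x):
        if not self.history:
            # Startup assumption: treat the first sample as a steady past value.
            self.history.extend([x] * len(self.coeffs))
        else:
            # appendleft on a full deque drops the oldest value on the right.
            self.history.appendleft(x)
        return sum(c * v for c, v in zip(self.coeffs, self.history))


if __name__ == "__main__":
    f = FourPointFilter()
    for x in [0.0, 1.0, 2.0, 3.0, 4.0]:
        print(x, f.update(x))
```

Because the filter is a straight-line fit, a ramp input is reproduced without lag (up to rounding) once the history holds four real samples.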
The special case of a 2-point filter is included only for completeness; you would never need to use it. A least squares straight line fitted to two points passes exactly through both of them, so the least squares estimate is simply the newest input.
The sum of all the coefficients is always 1.0 (within rounding error in the tables above). This is a requirement: when the input is steady, the output must match it, so the step response must settle at the size of the step. You can see that the oldest data always enters the filter with negative coefficients, so the coefficients on the newer data sum to more than 1.0. In response to a step input of 1.0, the output therefore rises above 1.0 until the step reaches the negatively weighted terms. That’s why the step response always overshoots.
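To make the overshoot concrete, here is a small sketch using the 4-point coefficients from the example. It assumes the input was steady at 0 before stepping to 1.0, so the output at each time step is just the running sum of the coefficients that have seen the step so far. Exact fractions are used only to avoid rounding noise in the printout.

```python
from fractions import Fraction
from itertools import accumulate

# 4-point coefficients, newest input first: 0.7, 0.4, 0.1, -0.2
coeffs = [Fraction(7, 10), Fraction(4, 10), Fraction(1, 10), Fraction(-2, 10)]

# Unit step at k = 0 with zero input before it: the output after each
# new sample is the sum of the coefficients applied to the "1.0" samples.
step_response = list(accumulate(coeffs))

print([float(y) for y in step_response])  # [0.7, 1.1, 1.2, 1.0]
print("sum of coefficients:", float(sum(coeffs)))  # 1.0
print("peak output:", float(max(step_response)))   # 1.2
```

The output peaks at 1.2 on the third sample after the step, a 20% overshoot, and only returns to 1.0 when the step reaches the oldest input with its negative coefficient.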
For some filter sizes (5, 8, 11), one of the data inputs has a zero coefficient, so it is completely ignored at each time step. To get the most noise reduction for the amount of computation, you might as well use a filter one size smaller or larger. If you want to cover a time period that requires more than 13 data points, consider sampling more slowly at the input to this filter, taking care to avoid aliasing by first filtering with an exponential filter at the higher sample rate.
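The ignored inputs can be checked with a short sketch. It assumes the tabulated filters are the least squares straight-line fit evaluated at the newest point, which is consistent with the 4-point coefficients in the example. Under that assumption the coefficient on x(k-j) for an n-point filter has the closed form 2(2n - 1 - 3j) / (n(n + 1)); the function name below is hypothetical.

```python
from fractions import Fraction


def endpoint_line_fit_coeffs(n):
    """Coefficients, newest input first, of an n-point least squares
    straight-line fit evaluated at the newest point (assumed form)."""
    return [Fraction(2 * (2 * n - 1 - 3 * j), n * (n + 1)) for j in range(n)]


# The 4-point case reproduces 0.7, 0.4, 0.1, -0.2.
print([float(c) for c in endpoint_line_fit_coeffs(4)])

# Find the filter sizes where some input gets a zero coefficient.
sizes_with_zero = []
for n in range(2, 14):
    coeffs = endpoint_line_fit_coeffs(n)
    assert sum(coeffs) == 1  # coefficients always sum to exactly 1
    if any(c == 0 for c in coeffs):
        sizes_with_zero.append(n)

print(sizes_with_zero)  # [2, 5, 8, 11]
```

Size 2 also appears in the list because the 2-point filter puts all its weight on the newest input. A coefficient is zero whenever 2n - 1 is divisible by 3, which is why the affected sizes are spaced three apart.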
The Savitzky-Golay smoother has advantages over the filter version. One big benefit is that there is no lead or lag, and no overshoot. Also, the area under the curve is maintained. In general, when filtered values are not needed immediately for control and the application can tolerate a delay, smoothing is the better choice. Diagnostics can often tolerate some delay, and the wait is worthwhile if it avoids false alarms.
Copyright 2010 - 2015, Greg Stanley