An extension for PostgreSQL >= 9.6 containing some median-related utilities.
At this time, provides five window functions:
median_filter
which behaves the same as SciPy's medfiltiterated_median_filter
, which applies the median filter iteratively until it converges (no change greater than some small value).rolling_median
, which calculates the median over the precedingn
rows and returnsNULL
for the firstn
rowsbackfilled_rolling_median
, which calculates the median over the precedingn
rows and backfills the firstn
rows with the median over them (returnsNULL
if there are less rows than the window size).rolling_median_impute
:- Calculates the rolling median over all non-null rows with window size
n
- Backfills the first
n
rows with the first median over a complete window - Forward-fills any null row with the nearest rolling median value
- Calculates the rolling median over all non-null rows with window size
Use with any double precision column, for example:
SELECT v, median_filter(v::double precision, 5) over() FROM generate_series(1, 10) as t(v);
v | median_filter
----+---------------
1 | 1
2 | 2
3 | 3
4 | 4
5 | 5
6 | 6
7 | 7
8 | 8
9 | 8
10 | 8
(10 rows)
Or for an iterated version:
SELECT v, iterated_median_filter(v, 3) over() FROM (VALUES (1), (1.1), (0.9), (1.1), (0.95), (2.1), (1.95), (2.0), (2.05), (3.11), (2.99), (3.05), (3.0)) as t(v);
v | iterated_median_filter
------+------------------------
1 | 1
1.1 | 1
0.9 | 1
1.1 | 1.1
0.95 | 1.1
2.1 | 1.95
1.95 | 2
2.0 | 2
2.05 | 2.05
3.11 | 2.99
2.99 | 3
3.05 | 3
3.0 | 3
(13 rows)
Comparing the two:
SELECT median_filter(v, 3) over(), iterated_median_filter(v, 3, 0.0000001) over() FROM (VALUES (1), (1.1), (0.9), (1.1), (0.95), (2.1), (1.95), (2.0), (2.05), (3.11), (2.99), (3.05), (3.0)) as t(v);
median_filter | iterated_median_filter
---------------+------------------------
1 | 1
1 | 1
1.1 | 1
0.95 | 1.1
1.1 | 1.1
1.95 | 1.95
2 | 2
2 | 2
2.05 | 2.05
2.99 | 2.99
3.05 | 3
3 | 3
3 | 3
(13 rows)
Usage of the rolling median functions is similar to the filters:
SELECT v, rolling_median(v::double precision, 5) over() FROM generate_series(1, 10) as t(v);
v | rolling_median
----+----------------
1 |
2 |
3 |
4 |
5 | 3
6 | 4
7 | 5
8 | 6
9 | 7
10 | 8
(10 rows)
Or for the backfilled equivalent:
SELECT v, backfilled_rolling_median(v::double precision, 5) over() FROM generate_series(1, 10) as t(v);
v | backfilled_rolling_median
----+---------------------------
1 | 3
2 | 3
3 | 3
4 | 3
5 | 3
6 | 4
7 | 5
8 | 6
9 | 7
10 | 8
(10 rows)
SELECT v, rolling_median_impute(v, 3) over() FROM (VALUES (1), (1.1), (0.9), (NULL), (NULL), (2.1), (NULL), (2.0), (2.05), (3.11), (2.99), (3.05), (NULL)) as t(v);
rolling_median_impute
-----------------------
1
1.1
0.9
1.1
1.1
2.1
2
2
2.05
3.11
2.99
3.05
2.99
(13 rows)