filter1d

Time domain filtering of 1-D data tables

Synopsis

gmt filter1d [ table ] -Ftypewidth[+h] [ -Dincrement ] [ -E ] [ -Llack_width ] [ -Nt_col ] [ -Qq_factor ] [ -Ssymmetry_factor ] [ -T[min/max/]inc[+a][+e|i|n] |-Tfile|list ] [ -V[level] ] [ -aflags ] [ -bbinary ] [ -dnodata[+ccol] ] [ -eregexp ] [ -fflags ] [ -ggaps ] [ -hheaders ] [ -iflags ] [ -jflags ] [ -oflags ] [ -qflags ] [ -:[i|o] ] [ --PAR=value ]

Note: No space is allowed between the option flag and the associated arguments.

Description

filter1d is a general time domain filter for multiple column time series data. The user specifies which column is the time (i.e., the independent variable); see the -N option below. The fastest operation occurs when the input time series are equally spaced, have no gaps or outliers, and need none of the special options. For unevenly sampled data with gaps, filter1d offers the -L, -Q, and -S options. For spatial series there is an option to compute along-track distances and use them as the independent variable for filtering.

Required Arguments

table: One or more ASCII (or binary, see -bi[ncols][type]) data table file(s) holding a number of data columns. If no tables are given then we read from standard input.

-Ftypewidth[+h]

Sets the filter type. Choose among convolution and non-convolution filters. Append the filter code followed by the full filter width (i.e., \(6 \sigma\)) in same units as time column. By default we perform low-pass filtering. Append +h to select high-pass filtering. Some filters allow for optional arguments and a modifier. Available convolution filter types are:

(b) Boxcar: All weights are equal.

(c) Cosine Arch: Weights follow a cosine arch curve.

(f) Custom: Instead of width, give name of a one-column file with your own weight coefficients.

(g) Gaussian: Weights are given by the Gaussian function.

For further information, see the Filtering of Data in GMT section.
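As a rough illustration of what a full filter width of \(6 \sigma\) means for the Gaussian case, the sketch below builds normalized weights for equally spaced samples. The function name gaussian_weights, the cutoff at half a width from the window center, and the normalization to unit sum are assumptions made for illustration; this is not GMT's actual implementation.

```python
import math

def gaussian_weights(width, dt):
    """Normalized Gaussian weights for samples spaced dt apart.

    Assumes the full filter width spans 6 sigma and that samples
    farther than width / 2 from the window center get no weight.
    """
    sigma = width / 6.0
    # Number of samples on each side of the window center.
    half = int(math.floor(0.5 * width / dt + 1e-9))
    raw = [math.exp(-0.5 * ((k * dt) / sigma) ** 2) for k in range(-half, half + 1)]
    total = sum(raw)
    # Scale so the weights sum to one.
    return [w / total for w in raw]

weights = gaussian_weights(width=6.0, dt=1.0)
print(len(weights), sum(weights))
```

With width 6 and unit spacing the window holds seven samples and sigma equals one sample interval.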

Non-convolution filter types are:

(m) Median: Returns median value.

(p) Maximum likelihood probability (a mode estimator): Return modal value. If more than one mode is found we return their average value. Append +l or +u if you would rather return the lowermost or uppermost of the modal values.

(l) Lower: Return the minimum of all values.

(L) Lower: Return minimum of all positive values only.

(u) Upper: Return maximum of all values.

(U) Upper: Return maximum of all negative values only.

Upper case type B, C, F, G, M and P will use robust filter versions: i.e., before filtering we replace outliers (2.5 x L1 scale off the median, using 1.4826 * median absolute deviation [MAD] as L1 scale) with the median during filtering.
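The outlier rule described above can be sketched as follows, assuming it is applied to the values inside one filter window. The function name replace_outliers and the sample window are hypothetical; only the 2.5 threshold and the 1.4826 * MAD scale come from the text.

```python
import statistics

def replace_outliers(values, threshold=2.5):
    """Return a copy of values with outliers replaced by the window median."""
    med = statistics.median(values)
    # L1 scale: 1.4826 times the median absolute deviation (MAD).
    scale = 1.4826 * statistics.median([abs(v - med) for v in values])
    # A value is an outlier if it lies more than threshold * scale off the median.
    return [med if abs(v - med) > threshold * scale else v for v in values]

window = [10.0, 11.0, 9.0, 10.5, 50.0]
cleaned = replace_outliers(window)
print(cleaned)
```

Here the median is 10.5 and the MAD is 0.5, so only the value 50.0 exceeds the cutoff and is replaced.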

In the case of L|U it is possible that no data will pass the initial sign test; in that case the filter will return 0.0. Apart from custom coefficients (f), the other filters may accept variable filter widths by passing width as a two-column time-series file with filter widths in the second column. The filter-width file does not need to be co-registered with the data as we obtain the required filter width at each output location via interpolation. For multi-segment data files the filter file must either have the same number of segments or just a single segment to be used for all data segments.
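To make the variable-width idea concrete, here is a minimal sketch of looking up a filter width at an arbitrary output time from a two-column width table. The text only says the width is obtained via interpolation; linear interpolation, clamping outside the table, and the name width_at are assumptions for illustration.

```python
def width_at(t, width_times, widths):
    """Linearly interpolate the filter width at output time t.

    Times outside the table are clamped to the nearest end value
    (an assumption, not documented behavior).
    """
    if t <= width_times[0]:
        return widths[0]
    if t >= width_times[-1]:
        return widths[-1]
    for i in range(1, len(width_times)):
        if t <= width_times[i]:
            t0, t1 = width_times[i - 1], width_times[i]
            frac = (t - t0) / (t1 - t0)
            return widths[i - 1] + frac * (widths[i] - widths[i - 1])

# Hypothetical width table: times in the first list, widths in the second.
width_times = [0.0, 10.0, 20.0]
widths = [2.0, 4.0, 8.0]
print(width_at(5.0, width_times, widths))
```

Because the lookup depends only on the output time, the width table does not have to share sample times with the data.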

Optional Arguments

-Dincrement: increment is used when series is not equidistantly sampled. Then increment will be the abscissae resolution, i.e., all abscissae will be rounded off to a multiple of increment. Alternatively, resample data with sample1d.
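The rounding that -D describes amounts to snapping each abscissa to the nearest multiple of increment. A minimal sketch, with the hypothetical helper name snap_to_increment:

```python
def snap_to_increment(t, increment):
    """Round t to the nearest multiple of increment.

    Python's round() sends exact ties to the even neighbor; how GMT
    treats ties is not stated in the text.
    """
    return round(t / increment) * increment

print(snap_to_increment(12.37, 0.5))
print(snap_to_increment(12.2, 0.5))
```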

-E: Include Ends of time series in output. Default loses half the filter-width of data at each end.

-Llack_width: Checks for Lack of data condition. If input data has a gap exceeding lack_width then no output will be given at that point [Default does not check Lack].

-Nt_col: Indicates which column contains the independent variable (time). The left-most column is 0, while the right-most is (n_cols - 1) [Default is 0].

-Qq_factor: Assess Quality of output value by checking mean weight in convolution. Enter q_factor between 0 and 1. If mean weight < q_factor, output is suppressed at this point [Default does not check Quality].

-Ssymmetry_factor: Checks symmetry of data about window center. Enter a factor between 0 and 1. If ( (abs(n_left - n_right)) / (n_left + n_right) ) > factor, then no output will be given at this point [Default does not check Symmetry].
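The symmetry test can be written out directly from the expression above. In this sketch n_left and n_right are the number of data points on either side of the window center; the function name passes_symmetry and the handling of an empty window are assumptions.

```python
def passes_symmetry(n_left, n_right, factor):
    """Return True if output would be given under the -S symmetry test."""
    total = n_left + n_right
    if total == 0:
        # Assumption: a window without data yields no output.
        return False
    # Output is withheld when the imbalance ratio exceeds factor.
    return abs(n_left - n_right) / total <= factor

print(passes_symmetry(6, 4, 0.3))
print(passes_symmetry(8, 2, 0.3))
```

With a factor of 0.3, a 6/4 split (ratio 0.2) passes while an 8/2 split (ratio 0.6) is rejected.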

-T[min/max/]inc[+a][+e|i|n] |-Tfile|list: Make evenly spaced time-steps from min to max by inc [Default uses input times]. For details on array creation, see Generate 1-D Array.

-V[level]: Select verbosity level [w].

-a[[col=]name[,…]]: Set aspatial column associations col=name.

-birecord[+b|l]: Select native binary format for primary table input.

-borecord[+b|l]: Select native binary format for table output [Default is same as input].

-d[i|o][+ccol]nodata: Replace input columns that equal nodata with NaN and do the reverse on output.

-e[~]"pattern" | -e[~]/regexp/[i]: Only accept data records that match the given pattern.

-f[i|o]colinfo: Specify data types of input and/or output columns.

-gx|y|z|d|X|Y|Dgap[u][+a][+ccol][+n|p]: Determine data gaps and line breaks.

-h[i|o][n][+c][+d][+msegheader][+rremark][+ttitle]: Skip or produce header record(s).

-icols[+l][+ddivisor][+sscale|d|k][+ooffset][,…][,t[word]]: Select input columns and transformations (0 is first column, t is trailing text, append word to read one word only).

-je|f|g: Determine how spherical distances or coordinate transformations are calculated.

-ocols[+l][+ddivisor][+sscale|d|k][+ooffset][,…][,t[word]]: Select output columns and transformations (0 is first column, t is trailing text, append word to write one word only).

-q[i|o][~]rows|limits[+ccol][+a|t|s]: Select input or output rows or data limit(s) [all].

-:[i|o]: Swap 1st and 2nd column on input and/or output.

-^ or just -: Print a short message about the syntax of the command, then exit (Note: on Windows just use -).
-+ or just +: Print an extensive usage (help) message, including the explanation of any module-specific option (but not the GMT common options), then exit.
-? or no arguments: Print a complete usage (help) message, including the explanation of all options, then exit.
--PAR=value: Temporarily override a GMT default setting; repeatable. See gmt.conf for parameters.

Units

For map distance unit, append unit d for arc degree, m for arc minute, and s for arc second, or e for meter [Default unless stated otherwise], f for foot, k for km, M for statute mile, n for nautical mile, and u for US survey foot. By default we compute such distances using a spherical approximation with great circles (-jg) using the authalic radius (see PROJ_MEAN_RADIUS). You can use -jf to perform “Flat Earth” calculations (quicker but less accurate) or -je to perform exact geodesic calculations (slower but more accurate; see PROJ_GEODESIC for method used).

ASCII Format Precision

The ASCII output formats of numerical data are controlled by parameters in your gmt.conf file. Longitude and latitude are formatted according to FORMAT_GEO_OUT, absolute time is under the control of FORMAT_DATE_OUT and FORMAT_CLOCK_OUT, whereas general floating point values are formatted according to FORMAT_FLOAT_OUT. Be aware that the format in effect can lead to loss of precision in ASCII output, which can lead to various problems downstream. If you find the output is not written with enough precision, consider switching to binary output (-bo if available) or specify more decimals using the FORMAT_FLOAT_OUT setting.

Generate 1-D Array

We will demonstrate the use of options for creating 1-D arrays via gmt math. Make an evenly spaced coordinate array from min to max in steps of inc, e.g.:

gmt math -o0 -T3.1/4.2/0.1 T =
3.1
3.2
3.3
3.4
3.5
3.6
3.7
...

Append +b if we should take \(\log_2\) of min and max, get their nearest integers, build an equidistant \(\log_2\)-array using inc integer increments in \(\log_2\), then undo the \(\log_2\) conversion. E.g., -T3/20/1+b will produce this sequence:

gmt math -o0 -T3/20/1+b T =
4
8
16
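The +b construction can be followed step by step in a short sketch. The function name log2_array is hypothetical, and Python's round() stands in for "nearest integer"; how GMT resolves exact ties is not stated here.

```python
import math

def log2_array(tmin, tmax, inc=1):
    """Powers of two between the rounded log2 of tmin and tmax."""
    lo = round(math.log2(tmin))  # nearest integer exponent for min
    hi = round(math.log2(tmax))  # nearest integer exponent for max
    # Step through the exponents by inc, then undo the log2 conversion.
    return [2 ** k for k in range(lo, hi + 1, inc)]

print(log2_array(3, 20, 1))
```

For 3 and 20 the exponents round to 2 and 4, which gives 4, 8, 16 as in the listing above.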

Append +l if we should take \(\log_{10}\) of min and max and build an array where inc can be 1 (every magnitude), 2 (1, 2, 5 times magnitude), or 3 (1-9 times magnitude). E.g., -T7/135/2+l will produce this sequence:

gmt math -o0 -T7/135/2+l T =
10
20
50
100
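A minimal sketch of the +l selection for positive inc values, assuming that only candidates falling inside the range from min to max are kept. The names MANTISSAS and log10_array are made up for illustration.

```python
import math

# Multipliers used within each decade for inc = 1, 2, or 3.
MANTISSAS = {1: [1], 2: [1, 2, 5], 3: list(range(1, 10))}

def log10_array(tmin, tmax, inc):
    """Logarithmic sequence between tmin and tmax for inc in {1, 2, 3}."""
    lo = math.floor(math.log10(tmin))  # decade at or below tmin
    hi = math.ceil(math.log10(tmax))   # decade at or above tmax
    out = []
    for exponent in range(lo, hi + 1):
        for m in MANTISSAS[inc]:
            value = m * 10.0 ** exponent
            # Keep only candidates inside the requested range.
            if tmin <= value <= tmax:
                out.append(value)
    return out

sequence = log10_array(7, 135, 2)
print(sequence)
```

For the range 7 to 135 with inc = 2 this yields 10, 20, 50, 100, matching the listing above.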

For output values less frequently than every magnitude, use a negative integer inc:

gmt math -o0 -T1e-4/1e4/-2+l T =
0.0001
0.01
1
100
10000

Append +i if inc is a fractional number and it is cleaner to give its reciprocal value instead. To set up times for a 24-frames per second animation lasting 1 minute, run:

gmt math -o0 -T0/60/24+i T =
0
0.0416666666667
0.0833333333333
0.125
0.166666666667
...
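The +i modifier simply means the step is 1/inc. A short sketch, with the hypothetical function name reciprocal_steps:

```python
def reciprocal_steps(tmin, tmax, per_unit):
    """Coordinates from tmin to tmax with a step of 1 / per_unit."""
    n = round((tmax - tmin) * per_unit)  # number of intervals
    # Divide by per_unit rather than accumulating a rounded step.
    return [tmin + k / per_unit for k in range(n + 1)]

times = reciprocal_steps(0, 60, 24)
print(len(times), times[1], times[3])
```

One minute at 24 frames per second gives 1440 intervals, hence 1441 time values from 0 to 60.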

Append +n if inc is meant to indicate the number of equidistant coordinates instead. To have exactly 5 equidistant values from 3.44 and 7.82, run:

gmt math -o0 -T3.44/7.82/5+n T =
3.44
4.535
5.63
6.725
7.82
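The arithmetic behind +n is a division of the range into n - 1 equal intervals. A sketch with the hypothetical function name equidistant:

```python
def equidistant(tmin, tmax, n):
    """Return n equally spaced values from tmin to tmax inclusive."""
    step = (tmax - tmin) / (n - 1)  # n points span n - 1 intervals
    return [tmin + k * step for k in range(n)]

values = equidistant(3.44, 7.82, 5)
print([round(v, 3) for v in values])
```

For 3.44 to 7.82 with five values the step is 1.095, giving 3.44, 4.535, 5.63, 6.725, and 7.82.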

Alternatively, let inc be a file with output coordinates in the first column, or provide a comma-separated list of specific coordinates, such as the first 6 Fibonacci numbers:

gmt math -o0 -T0,1,1,2,3,5 T =
0
1
1
2
3
5

Notes: (1) If you need to pass the list nodes via a dataset file yet be understood as a list (i.e., no interpolation), then you must set the file header to contain the string “LIST”. (2) Should you need to ensure that the coordinates are unique and sorted (in case the file or list are not sorted or have duplicates) then supply the +u modifier.

If you only want a single value then you must append a comma to distinguish the list from the setting of an increment.

If the module allows you to set up an absolute time series, append a valid time unit from the list year, month, day, hour, minute, and second to the given increment; add +t to ensure time column (or use -f). Note: The internal time unit is still controlled independently by TIME_UNIT. The first 7 days of March 2020:

gmt math -o0 -T2020-03-01T/2020-03-07T/1d T =
2020-03-01T00:00:00
2020-03-02T00:00:00
2020-03-03T00:00:00
2020-03-04T00:00:00
2020-03-05T00:00:00
2020-03-06T00:00:00
2020-03-07T00:00:00

A few modules allow for +a which will paste the coordinate array to the output table.

Likewise, if the module allows you to set up a spatial distance series (with distances computed from the first two data columns), specify a new increment as inc with a geospatial distance unit from the list degree (arc), minute (arc), second (arc), meter, foot, kilometer, Miles (statute), nautical miles, or survey foot; see -j for calculation mode. To interpolate Cartesian distances instead, you must use the special unit c.

Finally, if you are only providing an increment and will obtain min and max from the data, then it is possible (max - min)/inc is not an integer, as required. If so, then inc will be adjusted to fit the range. Alternatively, append +e to keep inc exact and adjust max instead (keeping min fixed).
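The two ways of reconciling inc with the data range can be sketched as below. The function name fit_increment and the choice of rounding the interval count to the nearest integer are assumptions; the text does not say how GMT picks the count.

```python
def fit_increment(tmin, tmax, inc, keep_inc=False):
    """Return (inc, max) adjusted so that (max - min) / inc is an integer.

    keep_inc=False mimics the default (adjust inc to fit the range);
    keep_inc=True mimics +e (keep inc exact, move max, keep min fixed).
    """
    # Assumption: use the nearest whole number of intervals, at least one.
    n = max(round((tmax - tmin) / inc), 1)
    if keep_inc:
        return inc, tmin + n * inc
    return (tmax - tmin) / n, tmax

print(fit_increment(0.0, 10.0, 3.0))
print(fit_increment(0.0, 10.0, 3.0, keep_inc=True))
```

For a range of 0 to 10 and inc = 3, the default path stretches inc to 10/3, while the +e path keeps inc = 3 and moves max to 9.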

Examples

Note: Below are some examples of valid syntax for this module. The examples that use remote files (file names starting with @) can be cut and pasted into your terminal for testing. Other commands requiring input files are just dummy examples of the types of uses that are common but cannot be run verbatim as written.

To filter the remote CO2 data set in the file MaunaLoa_CO2.txt (year, CO2) with a 5 year Gaussian filter, try

gmt filter1d @MaunaLoa_CO2.txt -Fg5 > CO2_trend.txt

Data along track often have uneven sampling and gaps which we do not want to interpolate using sample1d. To find the median depth in a 50 km window every 25 km along the track of cruise v3312, stored in v3312.txt, checking for gaps of 10 km and asymmetry of 0.3:

gmt filter1d v3312.txt -FM50 -T0/100000/25 -L10 -S0.3 > v3312_filt.txt

To smooth a noisy geospatial track using a Gaussian filter of full-width 100 km and not shorten the track, and add the distances every 2 km to the file, use

gmt filter1d track.txt -T2k+a -E -Fg100 > smooth_track.txt