1. Proposed Long-Options Implementation
1.1. What is the problem?
Since GMT was first initiated, the modules have used a terse UNIX-inspired command-line syntax:
gmt blockmean -R0/5/0/6 -I1 my_table.txt > new_table.txt
It is not immediately obvious to a new user what the two options mean. UNIX tools have the same problem, e.g., for sorting data into numerical order, one may use
sort -g my_table.txt > sorted_table.txt
and nobody will know without reading the documentation for sort what -g means. However, one can now use the equivalent command
sort --sort=general-numeric my_table.txt > sorted_table.txt
which is pretty self-explanatory. In the case of GMT, similar long-options alternatives will be implemented. For instance, the above blockmean command can also be written
gmt blockmean --region=0/5/0/6 --increment=1 my_table.txt > new_table.txt
which now becomes much easier to parse (for humans). Unfortunately, GMT syntax is a bit more complicated for many options. Due to the inexorable growth of new capabilities, many options have become more complex and may have optional modifiers appended to them. Consider the common option -i, which specifies which input columns the modules should read. Not only can it select columns (e.g., -i3,2,7-9), it also allows optional modifiers, repeatable for each column (or column group), that handle basic data transformations. For instance, let us imagine that the above example needs column 3 to be used as is, but column 2 needs to be converted by the log10 operator and columns 7-9 must be scaled by 10 and offset by -5. In standard (short) GMT syntax we would write
-i3,2+l,7-9+s10+o-5
which only makes immediate sense to those who wrote the parser. In contrast, for the long-format syntax it will instead be
--incols=3,2+log10,7-9+scale=10+offset=-5
which most users might be able to decipher.
1.2. Abstraction
So, two steps are needed to implement this scheme across all GMT modules:
Build a set of long-option to short-option equivalences for the standard GMT Common Options. This is only about 30 options.
Build sets of long-option to short-option equivalences for all the unique module options spread across ~150 modules (which includes the supplements).
Clearly, these translation tables will need to address not only the longer names for the options but also the longer names for modifiers. We can now be more abstract and state that a general GMT short-format option actually follows a specific syntax:
-option[directive][arg][+modifier1[arg1]][+modifier2[arg2]][...]
where option, directive, and any modifier are single characters and there may be zero, one, or more modifiers following the initial option and the optional directive (and optional arg). As we saw in the case of -i, the sequence of “optional directive followed by optional modifiers” may in fact be repeated by separating these sequences with a comma. The corresponding long-format syntax is:
--long-option[=[directive:]arg][+modifier1[=arg1]][+modifier2[=arg2]][...]
where the key differences are
The option is a mnemonic word and not a single letter
Optional directives are appended after an equal sign
Optional arguments are appended after an equal sign or after the directive colon
Optional modifiers use mnemonic words and not a single letter
Optional modifier arguments are appended after another equal sign
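To make the long-format template concrete, here is a minimal C sketch that splits one long-format argument into its name, directive, argument, and modifiers. It is purely syntactic and is not GMT code: the names long_option_parts and split_long_option are hypothetical, the fixed buffer sizes are arbitrary, and a value without a colon is stored as the argument because only the translation tables can tell a bare directive from an argument. Values that themselves contain a colon or a plus sign are not handled.

```c
#include <assert.h>
#include <string.h>

#define MAX_MODS 8    /* arbitrary limits chosen for this sketch */
#define MAX_LEN  64

/* Hypothetical container for the pieces of one long-format option */
struct long_option_parts {
    char name[MAX_LEN];                 /* e.g. "header" */
    char directive[MAX_LEN];            /* text before ':' (empty if absent) */
    char arg[MAX_LEN];                  /* text after '=' or after ':' */
    int  n_mods;                        /* number of +modifier items found */
    char mod_name[MAX_MODS][MAX_LEN];
    char mod_arg[MAX_MODS][MAX_LEN];    /* empty if the modifier has no =arg */
};

/* Copy len bytes of src into dst (truncating to MAX_LEN-1) and terminate */
static void copy_span(char *dst, const char *src, size_t len) {
    if (len >= MAX_LEN) len = MAX_LEN - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Split "--name[=[directive:]arg][+mod[=arg]]..." into its parts.
 * Returns 0 on success, -1 if text does not start with "--". */
int split_long_option(const char *text, struct long_option_parts *p) {
    assert(text != NULL && p != NULL);
    memset(p, 0, sizeof *p);
    if (strncmp(text, "--", 2) != 0) return -1;
    text += 2;

    size_t head_len = strcspn(text, "+");    /* everything before the first modifier */
    size_t name_len = strcspn(text, "=+");   /* option name ends at '=' or '+' */
    copy_span(p->name, text, name_len);

    if (name_len < head_len) {               /* an '=' follows the name */
        const char *value = text + name_len + 1;
        size_t value_len = head_len - name_len - 1;
        const char *colon = memchr(value, ':', value_len);
        if (colon != NULL) {                 /* directive:arg */
            size_t dir_len = (size_t)(colon - value);
            copy_span(p->directive, value, dir_len);
            copy_span(p->arg, colon + 1, value_len - dir_len - 1);
        } else {
            copy_span(p->arg, value, value_len);
        }
    }

    const char *m = text + head_len;         /* at '+' or at the end of the string */
    while (*m == '+' && p->n_mods < MAX_MODS) {
        m++;
        size_t mod_len = strcspn(m, "+");    /* whole modifier item */
        size_t key_len = strcspn(m, "=+");   /* modifier name */
        copy_span(p->mod_name[p->n_mods], m, key_len);
        if (key_len < mod_len)
            copy_span(p->mod_arg[p->n_mods], m + key_len + 1, mod_len - key_len - 1);
        p->n_mods++;
        m += mod_len;
    }
    return 0;
}
```

For an illustrative input such as --header=in:2+columns+title=Depth, the sketch stores header as the name, in as the directive, 2 as the argument, and two modifiers (columns without an argument, title with the argument Depth).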
1.3. Implementation Details
1.3.1. Common Options
The approach taken has been to create a master translation table that relates the short and long option syntax formats so that a function can be used to translate any general long-option argument to the equivalent short-option argument. That way, we only need to call this function at the start of a module and do the replacement. Then, the specific parsers we already have for common and module options will work as is. This design simplifies the coding tremendously and only requires us to create the translation tables. The approach has already been implemented and tested for the ~30 GMT Common Options and developers can play with this by adding the compiler flag -DUSE_COMMON_LONG_OPTIONS when building GMT. The translations for the GMT common options are encapsulated in a single include file (gmt_common_longoptions.h) that populates a gmt_common_kw structure and looks like this:
/* separator, short_option, long_option, short_directives, long_directives, short_modifiers, long_modifiers */
{ 0, 'B', "frame", "", "", "b,g,i,n,o,s,t,w,x,y,z", "box,fill,interior,noframe,pole,subtitle,title,pen,yzfill,xzfill,xyfill" },
{ 0, 'B', "axis", "x,y,z", "x,y,z", "a,f,l,L,p,s,S,u", "angle,fancy,label,hlabel,prefix,alt_label,alt_hlabel,unit" },
{ 0, 'J', "projection", "", "", "", ""},
{ 0, 'R', "region", "", "", "r,u", "rectangular,unit"},
{ 0, 'U', "timestamp", "", "", "c,j,o", "command,justify,offset"},
{ 0, 'V', "verbosity", "", "", "", ""},
{ 0, 'X', "xshift", "a,c,f,r", "absolute,center,fixed,relative", "", ""},
{ 0, 'Y', "yshift", "a,c,f,r", "absolute,center,fixed,relative", "", ""},
{ 0, 'a', "aspatial", "", "", "", ""},
{ 0, 'b', "binary", "", "", "b,l", "big_endian,little_endian"},
{ 0, 'c', "panel", "", "", "", ""},
{ 0, 'd', "nodata", "i,o", "in,out", "", ""},
{ 0, 'e', "find", "", "", "f", "file"},
{ ',', 'f', "coltypes", "i,o", "in,out", "", ""},
{ 0, 'g', "gap", "", "", "n,p", "negative,positive"},
{ 0, 'h', "header", "i,o", "in,out", "c,d,r,t", "columns,delete,remark,title"},
{ ',', 'i', "incols", "", "", "l,o,s", "log10,offset,scale"},
{ 0, 'j', "distance", "e,f,g", "ellipsoidal,flatearth,spherical", "", ""},
{ 0, 'l', "legend", "", "", "D,G,H,L,N,S,V,f,g,j,o,p,s,w", "hline,gap,header,linetext,ncols,size,vline,font,fill,justify,offset,pen,scale,width"},
{ 0, 'n', "interpolation", "b,c,l,n", "bspline,bicubic,linear,nearneighbor", "a,b,c,t", "anti_alias,bc,clip,threshold"},
{ ',', 'o', "outcols", "", "", "", ""},
{ 0, 'p', "perspective", "x,y,z", "x,y,z", "v,w", "view,world"},
{ ',', 'q', "inrows", "~", "invert", "a,c,f,s", "byset,column,byfile,bysegment"}, /* Actually -qi */
{ ',', 'q', "outrows", "~", "invert", "a,c,f,s", "byset,column,byfile,bysegment"}, /* Actually -qo */
{ 0, 'r', "registration", "g,p", "gridline,pixel", "", ""},
{ 0, 's', "skiprows", "", "", "a,r", "any,reverse"},
{ 0, 't', "transparency", "", "", "", ""},
{ 0, 'w', "wrap", "a,y,w,d,h,m,s,p", "annual,year,week,day,hour,min,sec,period", "c", "column"},
{ 0, 'x', "cores", "", "", "", ""},
Here, separator is a comma if more than one repetition of the sequence is allowed, otherwise it is 0.
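As a rough illustration of how such a table can drive the translation, the following C sketch converts one long-format argument into its short-format equivalent. It is not the GMT implementation: the struct kw_entry layout simply mirrors the seven fields listed in the comment line above, and kw_entry, demo_kw, short_for, append, translate_sequence, and translate_long_option are all hypothetical names. The sketch assumes every short directive and modifier is a single character, that the short and long lists are paired by position, and that arguments never contain the separator or a plus sign.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirror of one table row; field order follows the comment line above */
struct kw_entry {
    char separator;            /* ',' if the sequence may repeat, otherwise 0 */
    char short_option;
    const char *long_option;
    const char *short_directives, *long_directives;
    const char *short_modifiers, *long_modifiers;
};

/* Four rows copied from the table above, plus an end marker (short_option 0) */
static const struct kw_entry demo_kw[] = {
    { 0,   'R', "region", "", "", "r,u", "rectangular,unit" },
    { 0,   'h', "header", "i,o", "in,out", "c,d,r,t", "columns,delete,remark,title" },
    { ',', 'i', "incols", "", "", "l,o,s", "log10,offset,scale" },
    { 0,   'r', "registration", "g,p", "gridline,pixel", "", "" },
    { 0,  '\0', "", "", "", "", "" }
};

/* Return the short letter paired with word[0..len) in the long list, or 0 if absent */
static char short_for(const char *shorts, const char *longs, const char *word, size_t len) {
    for (size_t k = 0; *longs; k++) {
        size_t n = strcspn(longs, ",");
        if (n == len && strncmp(longs, word, len) == 0)
            return (2 * k < strlen(shorts)) ? shorts[2 * k] : 0;
        longs += n;
        if (*longs == ',') longs++;
    }
    return 0;
}

/* Append len bytes of src to the string in dst (capacity cap); -1 if it does not fit */
static int append(char *dst, size_t cap, const char *src, size_t len) {
    size_t used = strlen(dst);
    if (used + len + 1 > cap) return -1;
    memcpy(dst + used, src, len);
    dst[used + len] = '\0';
    return 0;
}

/* Translate one "[directive[:arg]|arg][+modifier[=arg]]..." sequence of length len */
static int translate_sequence(const struct kw_entry *e, const char *seq, size_t len,
                              char *out, size_t cap) {
    size_t head = 0, pos;
    int err = 0;
    while (head < len && seq[head] != '+') head++;     /* part before the modifiers */
    const char *colon = memchr(seq, ':', head);
    size_t dir_len = colon ? (size_t)(colon - seq) : head;
    char c = short_for(e->short_directives, e->long_directives, seq, dir_len);
    if (c) {                                            /* known directive */
        err |= append(out, cap, &c, 1);
        if (colon) err |= append(out, cap, colon + 1, head - dir_len - 1);
    } else {
        err |= append(out, cap, seq, head);             /* plain argument, copied as is */
    }
    for (pos = head; pos < len; ) {                     /* seq[pos] is '+' here */
        size_t end = pos + 1, key_end = pos + 1;
        while (end < len && seq[end] != '+') end++;
        while (key_end < end && seq[key_end] != '=') key_end++;
        c = short_for(e->short_modifiers, e->long_modifiers, seq + pos + 1, key_end - pos - 1);
        if (c) {
            err |= append(out, cap, "+", 1);
            err |= append(out, cap, &c, 1);
            if (key_end < end) err |= append(out, cap, seq + key_end + 1, end - key_end - 1);
        } else {
            err |= append(out, cap, seq + pos, end - pos);  /* unknown modifier: keep verbatim */
        }
        pos = end;
    }
    return err ? -1 : 0;
}

/* Translate "--long-option..." into short form in out; 0 on success, -1 on failure */
int translate_long_option(const struct kw_entry *table, const char *text, char *out, size_t cap) {
    assert(table != NULL && text != NULL && out != NULL && cap > 2);
    if (strncmp(text, "--", 2) != 0) return -1;
    text += 2;
    size_t name_len = strcspn(text, "=+");
    const struct kw_entry *e = table;
    while (e->short_option &&
           !(strlen(e->long_option) == name_len && strncmp(e->long_option, text, name_len) == 0))
        e++;
    if (!e->short_option) return -1;                    /* name not in the table */
    out[0] = '-'; out[1] = e->short_option; out[2] = '\0';
    const char *rest = text + name_len;
    if (*rest == '=') rest++;
    char sep[2] = { e->separator, '\0' };               /* empty string when separator is 0 */
    while (*rest) {
        size_t n = strcspn(rest, sep);                  /* one sequence (all of rest if no separator) */
        if (translate_sequence(e, rest, n, out, cap)) return -1;
        rest += n;
        if (*rest) { if (append(out, cap, rest, 1)) return -1; rest++; }
    }
    return 0;
}
```

With the demo rows, calling translate_long_option on --incols=3,2+log10,7-9+scale=10+offset=-5 fills the output buffer with -i3,2+l,7-9+s10+o-5, and --registration=pixel becomes -rp. Because the result is an ordinary short-format argument, existing parsers can consume it unchanged, which is the point of the table-driven design described above.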
1.3.2. Module Options
For the ~150 individual modules it is probably not a good idea to introduce ~150 new include files as was done for the common options above. Instead, the translation structure can be stored directly in the module C file. For instance, the local module_kw structure embedded in the blockmean.c module C code looks like this:
/* separator, short_option, long_option, short_directives, long_directives, short_modifiers, long_modifiers */
{ 0, 'A', "fields", "", "", "", "" },
{ 0, 'C', "center", "", "", "", "" },
{ 0, 'E', "extend", "", "", "P,p", "prop-simple,prop-weighted" },
{ 0, 'G', "gridfile", "", "", "", "" },
GMT_INCREMENT_KW, /* Defined in gmt_constant.h since not a true GMT common option (but almost) */
{ 0, 'S', "select", "m,n,s,w", "mean,count,sum,weight", "", "" },
{ 0, 'W', "weights", "i,o", "in,out", "s", "sigma" },
{ 0, '\0', "", "", "", "", ""} /* End of list marked with empty option and strings */
Given these translations we can execute long-format commands like this:
gmt blockmean --region=0/20/10/56 --increment=1 --registration=pixel --select=sum data.txt > sums.txt
which sums all the values that fall inside each bin.
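Since a command like the one above mixes common options (--region, --registration) with module options (--increment, --select), a long option name has to be resolved against two tables. The C sketch below shows one way to do that lookup. It is an assumption-laden illustration, not GMT code: kw_entry, demo_common_kw, demo_blockmean_kw, and find_short_option are hypothetical names, DEMO_INCREMENT_KW is a stand-in for GMT_INCREMENT_KW (whose real definition in gmt_constant.h is not shown here), and searching the module table before the common table is an arbitrary choice made for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirror of one table row, as in the comment line of the tables above */
struct kw_entry {
    char separator;
    char short_option;
    const char *long_option;
    const char *short_directives, *long_directives;
    const char *short_modifiers, *long_modifiers;
};

/* Stand-in for GMT_INCREMENT_KW; the real macro's contents are not shown in this text */
#define DEMO_INCREMENT_KW { 0, 'I', "increment", "", "", "", "" }

/* Two rows from the common-option table plus the end marker */
static const struct kw_entry demo_common_kw[] = {
    { 0, 'R', "region", "", "", "r,u", "rectangular,unit" },
    { 0, 'r', "registration", "g,p", "gridline,pixel", "", "" },
    { 0, '\0', "", "", "", "", "" }
};

/* A subset of the blockmean rows shown above plus the end marker */
static const struct kw_entry demo_blockmean_kw[] = {
    { 0, 'C', "center", "", "", "", "" },
    DEMO_INCREMENT_KW,
    { 0, 'S', "select", "m,n,s,w", "mean,count,sum,weight", "", "" },
    { 0, '\0', "", "", "", "", "" }
};

/* Return the short letter for a "--name[=...]" argument, or 0 if the argument
 * is not a long option or its name appears in neither table. */
char find_short_option(const struct kw_entry *module_kw,
                       const struct kw_entry *common_kw, const char *arg) {
    assert(module_kw != NULL && common_kw != NULL && arg != NULL);
    if (strncmp(arg, "--", 2) != 0) return 0;     /* file names, short options */
    arg += 2;
    size_t len = strcspn(arg, "=+");              /* the name ends at '=' or '+' */
    const struct kw_entry *tables[2] = { module_kw, common_kw };
    for (int t = 0; t < 2; t++)                   /* module table first (assumption) */
        for (const struct kw_entry *e = tables[t]; e->short_option; e++)
            if (strlen(e->long_option) == len && strncmp(e->long_option, arg, len) == 0)
                return e->short_option;
    return 0;
}
```

Applied to the arguments of the blockmean command above, this lookup maps --region to R, --increment to I, --registration to r, and --select to S, while data.txt is left alone because it does not start with two dashes.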