Core-Style Arguments for Script Commands

Cyan Ogilvie
Ruby Lane, Inc.

Beyond 2 or 3 positional parameters, readability suffers:

proc searchdb {progcs progq usecache errorsev spid sites style db
    ipp id cs ss page uid readahead _results {regioncode ""}
    {minprice ""} {maxprice ""} {types ""} {redtagpercentoff 0}
    {addedsince ""} {sortby ""} {trialsiteid ""} {trialclickrate ""}
    {showsoldprice 0} {onsale ""} {lane ""} {website ""} {facets ""}
} {
   ...
}

searchdb "" $ss 0 notice "" "" $style $db $maxresults "" "" "" 1 \
    "" 0 progresults "" "" "" "" 0 "" "" $userid $newtestrate 0 "" \
    "" $website

Core commands typically use optional, named parameters:

glob -nocomplain -type f -tails -directory $spooldir *@*
lsort -index 0 -stride 2 -dictionary $search_counts
entry .pw -show * -textvariable pw -width 15

The majority of core commands with more than 3 parameters use this scheme:

binary, chan, clock, exec, fconfigure, fcopy, file, glob, interp, load, lsearch, lsort, namespace, package, puts, read, regexp, regsub, return, socket, source, string, subst, switch, unload, unset, zlib, pack, clipboard, place, event, wm, focus, font, winfo, grab, selection, send, grid, tk, bell and all Tk widget constructors and instance commands.

The only exceptions I could find were [trace remove], [lreplace] and the control structures.

But script-defined commands are second-class citizens

  • No support provided for these patterns to scripts
  • proc, apply, TclOO's constructors and methods support:
    • Positional parameters
    • Default values
    • Variadic arguments
  • coroutines support only variadic arguments
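As a minimal sketch of what [proc] offers natively, the hypothetical command below (the name "label" and its parameters are made up for illustration) combines a positional parameter, two defaults and variadic arguments. It also shows the limitation: a later default can only be overridden by restating every earlier one.

```tcl
# Hypothetical example: one positional parameter, two defaulted
# parameters and a variadic tail collected in "args".
proc label {text {colour black} {size 12} args} {
    list $text $colour $size $args
}

puts [label hello]                       ;# all defaults apply
puts [label hello red]                   ;# override the first default
puts [label hello black 14]              ;# to change size, colour must be restated
puts [label hello red 14 bold italic]    ;# extra words land in args
```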

Tcl scripts can implement core-style argument parsing, but:
  • it's slow, which is a problem for hot code
  • it clutters proc implementations with tangential code
  • it obscures proc signatures
  • it makes stack traces less clear when incorrect args are passed

Therefore it is very seldom done
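
To make the cost concrete, here is a rough sketch of hand-written core-style parsing in plain Tcl. The command name "page_info" and its options are hypothetical (loosely borrowed from the benchmark later in this talk). It is one of many possible approaches, not a recommended implementation. Note how little of the body is about what the command actually does.

```tcl
proc page_info args {
    # Defaults for the optional named parameters
    set title    ""
    set category {}
    set rating   1.0
    set nocase   0

    set i    0
    set argc [llength $args]
    while {$i < $argc} {
        set arg [lindex $args $i]
        switch -exact -- $arg {
            -title - -category - -rating {
                if {[incr i] >= $argc} {
                    error "missing value for $arg"
                }
                set [string range $arg 1 end] [lindex $args $i]
            }
            -nocase { set nocase 1 }
            --      { incr i; break }
            default {
                if {[string index $arg 0] eq "-"} {
                    error "bad option \"$arg\""
                }
                break
            }
        }
        incr i
    }
    # Whatever is left over is treated as positional arguments
    set rest [lrange $args $i end]

    if {$title eq ""} { error "-title is required" }
    if {![string is double -strict $rating]} {
        error "-rating must be a number"
    }

    # The actual work of the command finally starts here
    list $title $category $rating $nocase $rest
}

puts [page_info -title Tcl -rating 2.5 -nocase -- -x y]
```

The signature says only "args", so a reader has to scan the loop to learn which options exist.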

Conventions established by the core

-foo: the boolean toggle “foo” is enabled; its absence means that the toggle is disabled (e.g. -nocase, -all).
-foo bar: the argument named “foo” is assigned the value “bar”. In some cases, omitting the argument means that it takes a default value (e.g. regexp -start); in other cases, omission triggers behaviour different from that of any possible value (e.g. lsort -command).
-foo / -bar / -baz: a set of mutually exclusive boolean-style arguments that select a value for a single logical argument (e.g. lsort -ascii / -integer / -dictionary).
--: signals the end of the named options; further arguments are interpreted as positional parameters even if they would have matched a named argument (not universal).

When contradictory arguments are given, later arguments override earlier ones:
lsort -increasing -ascii -dictionary -decreasing $list
uses dictionary comparison, decreasing order (possibly not universal)
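
The conventions above can be tried directly against core commands. This is a small illustrative sketch; the sample string and list are made up for the purpose.

```tcl
set str "flag -nocase appears here"

# Boolean toggle: present means enabled, absent means disabled.
set with_toggle    [regexp -nocase {FLAG} $str]
set without_toggle [regexp {FLAG} $str]

# Named value: -start consumes the following word as its value.
# Starting at index 5 skips the only occurrence of "flag".
set from_offset [regexp -start 5 {flag} $str]

# Mutually exclusive options: the later ones win, so this sorts
# in dictionary mode and decreasing order.
set sorted [lsort -increasing -ascii -dictionary -decreasing {a1 a10 a2}]

# "--" ends option processing, so "-nocase" is taken as the pattern.
set literal_dash [regexp -- {-nocase} $str]

puts [list $with_toggle $without_toggle $from_offset $sorted $literal_dash]
# prints: 1 0 0 {a10 a2 a1} 1
```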

parse_args

Extension that provides support for these patterns to scripts

Primary design goals:

  • Feel like Tcl
  • Fast
  • Parameter signatures should be obvious at a glance
  • Terse
proc glob args {
    parse_args $args {
        -directory  {}  
        -join       {-boolean}
        -nocomplain {-boolean}
        -path       {}  
        -tails      {-boolean}
        -types      {-default {}} 
        args        {-name patterns}
    }

    if {$join} {
        set patterns [list [file join {*}$patterns]]
    }

    if {[llength $patterns] == 0 && $nocomplain} return

    foreach pattern $patterns {
        if {[info exists directory]} {
            ...
        }
    }
    ...
}
proc regexp args {
    parse_args $args {
        -about      {-boolean}
        -expanded   {-boolean}
        -indices    {-boolean}
        -line       {-boolean}
        -linestop   {-boolean}
        -lineanchor {-boolean}
        -nocase     {-boolean}
        -all        {-boolean}
        -inline     {-boolean}
        -start      {-default 0}
        exp         {-required}
        string      {-required}
        matchvar    {}  
        args        {-name submatchvars}
    }   
    ... 
}
proc lsearch args {
    parse_args $args {
        -exact      {-name matchtype -multi}
        -glob       {-name matchtype -multi}
        -regexp     {-name matchtype -multi}

        -sorted     {-boolean}
        -all        {-boolean}
        -inline     {-boolean}
        -not        {-boolean}
        -start      {-default 0}

        -ascii      {-name compare_as -multi -default ascii}
        -dictionary {-name compare_as -multi}
        -integer    {-name compare_as -multi}
        -real       {-name compare_as -multi}

        -nocase     {-boolean}

        -decreasing {-name order -multi}
        -increasing {-name order -multi -default increasing}
        -bisect     {-boolean}

        -index      {}
        -subindices {-boolean}
    }

    if {$sorted && [info exists matchtype] && $matchtype in {glob regexp}} {
        error "-sorted is mutually exclusive with -glob and -regexp"
    }
    if {![info exists matchtype]} {set matchtype glob}
}

Performance

Two design goals are in conflict:

  • intuitive definitions (easy for humans)
  • high performance (easy for machines)
parse_args $args {
    -directory  {}
    -join       {-boolean}
    -nocomplain {-boolean}
    -path       {}
    -tails      {-boolean}
    -types      {-default {}}
    args        {-name patterns}
}

Information cached in the Tcl_Obj intrep of the spec:
  • Tcl_GetIndexFromObj lookup for named params
  • default values
  • required params
  • validators
  • enum choices

Using the intrep of a custom Tcl_ObjType to store this:

  • Efficiently caches the parse info
  • Gets lifecycle management for free
  • Prevents cached Tcl_Objs from being shared across threads
proc native {t_a title c_a category  w_a wiki {r_a rating} {rating 1.0}} {
    list $title $category $wiki $rating
}   

proc using_parse_args args {
    parse_args $args {
        -title      {-required}
        -category   {-default {}} 
        -wiki       {-required}
        -rating     {-default 1.0 -validate {string is double -strict}}
    }   
    list $title $category $wiki $rating
}

            | microseconds
tcl parsing | 24.540
     native |  0.535
 parse_args |  0.838
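
The slides do not say how these figures were measured. One plausible way to get a comparable per-call number is Tcl's built-in [time] command, sketched here for the "native" baseline only. The argument values and the iteration count are arbitrary assumptions, and results will vary by machine and Tcl version.

```tcl
# Baseline from the slide: positional parameters that mimic named
# arguments by swallowing the option names into dummy parameters.
proc native {t_a title c_a category w_a wiki {r_a rating} {rating 1.0}} {
    list $title $category $wiki $rating
}

# Arbitrary iteration count, chosen only to smooth out timer noise
set iterations 100000

# [time] returns a string of the form "N microseconds per iteration"
set report [time {
    native -title Tcl -category lang -wiki tcl -rating 2.0
} $iterations]

puts "native: [lindex $report 0] microseconds per call"
```

The same harness can wrap the other variants so that all three are timed under identical conditions.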

Beyond [proc] argument parsing

coroutine foo apply [list {} {
    set res     {}
    set options {-code 0 -level 0}
    while 1 {
        catch {
            parse_args [yieldto return -options $options $res] {
                -foo    {-default xyzzy}
                -count  {-required}
            }

            # ... generate next value
        } res options
    }
}]
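
The control flow of the coroutine above can be tried without the extension. In the sketch below, [dict merge] stands in for parse_args (a deliberate simplification, not the extension's behaviour), and the coroutine name "repeater" and its body are hypothetical. It assumes Tcl 8.6 or later for [coroutine] and [yieldto].

```tcl
coroutine repeater apply [list {} {
    set res     {}
    set options {-code 0 -level 0}
    while 1 {
        catch {
            # yieldto hands the previous result (or error) back to the
            # caller and returns the list of words of the next call.
            set argv [yieldto return -options $options $res]

            # Stand-in for parse_args: apply a default, require -count
            set opts [dict merge {-foo xyzzy} $argv]
            if {![dict exists $opts -count]} {
                error "missing -count"
            }
            string repeat [dict get $opts -foo] [dict get $opts -count]
        } res options
    }
}]

puts [repeater -count 2]             ;# default -foo is used
puts [repeater -foo ab -count 3]     ;# both values supplied
puts [catch {repeater -foo ab} msg]  ;# the error surfaces in the caller
puts $msg
```

Because the result and the return options of each iteration are passed through [yieldto return -options], an argument error is raised at the call site and the coroutine stays alive for the next call.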

Future Work

  • C API (or perhaps Ns_ParseObjv is a better fit?)
    Ns_ObjvSpec opts[] = {
        {"-cache",       Ns_ObjvTime,   &ttlPtr,  NULL},
        {"-nocache",     Ns_ObjvBool,   &nocache, INT2PTR(NS_TRUE)},
        {"-tcl",         Ns_ObjvBool,   &tcl,     INT2PTR(NS_TRUE)},
        {"--",           Ns_ObjvBreak,  NULL,     NULL},
        {NULL, NULL, NULL, NULL}
    };
    Ns_ObjvSpec args[] = {
        {"file",  Ns_ObjvString, &file,  NULL},
        {"?args", Ns_ObjvArgs,   &nargs, NULL},
        {NULL, NULL, NULL, NULL}
    };
    if (Ns_ParseObjv(opts, args, interp, 1, objc, objv) != NS_OK) {
        return TCL_ERROR;
    }
  • Support positional parameters interspersed with named parameters
  • Documentation
  • Introspection