TIP #457: ADD SUPPORT FOR NAMED ARGUMENTS =========================================== Version: $Revision: 1.22 $ Author: Mathieu Lafon Andreas Leitgeb State: Draft Type: Project Tcl-Version: 8.7 Vote: Pending Created: Monday, 21 November 2016 URL: https://tip.tcl-lang.org457.html Post-History: ------------------------------------------------------------------------- ABSTRACT ========== This TIP proposes an enhancement of the Tcl language to support named arguments and additional features when calling a procedure. RATIONALE =========== The naming of arguments to procedures is a computer language feature which allow developers to specify the name of an argument when calling a function. This is especially useful when dealing with arguments with default values, as this does not require to specify all previous arguments when only one argument is required to be specified. As such, this is a commonly requested feature by Tcl developers, who have created various code snippets [] to simulate it. These snippets have drawbacks: not intuitive for new users, require to add extra code at the start of each procedure, no standard on the format to use, few errors handling, etc. After discussing various possibilities with the community, it has been decided to extend the argument specification of the /proc/ command and allow users to define options on arguments. This can be used to support named arguments but also add additional enhancements: flag arguments, pass-by-name (/upvar/) arguments, non-required arguments, ... The others possibilities discussed are detailed in the /Discussion/ section at the end of the document. SPECIFICATION =============== The /proc/ documentation currently define argument specifiers as a list of one or two fields where the first field is the name of the argument and the optional second field is its default value. The proposed modification is to support an alternate specifier format where the first field is also the name of the argument, followed by a paired list of options and their values. This format does not prevent the original format to be used as they can be easily distinguished: the new format uses an odd size list with a minimal size of three fields. AVAILABLE ARGUMENT SPECIFIERS ------------------------------- The following argument specifiers are defined in this TIP: * *-default VALUE* defines the default value for the argument. It is ignored if /-required 1/ is also used. % proc p { { a -default {} } } { list a $a } % p a {} % p foo a foo * *-name NAME* defines the argument to be a named argument. NAME defines the name of the argument when it is defined as a single string. If NAME is a list of strings, it is the list of names that can be used to refer to the argument (i.e. aliases). On the call-site, the name of the argument is prefixed by a single dash and followed by the value. % proc p1 { { v -name val } } { list v $v } % p1 -val 1 v 1 % proc p2 { { v -name {v val value} } } { list v $v } % p2 -value 2 v 2 % p2 -v 2 v 2 * *-switch SWITCHES* defines that the argument is defined on the call-site as a flag-only/switch parameter. SWITCHES is a list of possible switches. Each switch is defined either as a single string (switch name) or as a list of two entries (switch name and related value). On the call-site, the name of the switch is prefixed by a single dash and is not followed by any value. The value assigned to the argument is either the switch name or the related value depending on how it was defined. % proc p { { dbg -default 0 -switch debug } } { list dbg $dbg } % p dbg 0 % p -debug dbg debug % proc p { { level -switch {{quiet 0} {verbose 9}} } { list level $level } % p -quiet level 0 % p -verbose level 9 * *-required BOOLEAN* defines that the value is required to be set. If set to true, the argument is required and any default value is ignored. It is the default handling for non-named argument without a default value. If set to false, the argument is not required to be set and the related argument will be left unset if there is no default value. It is the default handling for named argument. % proc p { { v -required 0 } } { if {[info exist v]} {list v $v} {return "v is unset"} } % p 5 v 5 % p v is unset * *-upvar LEVEL* defines, that the local argument will become an alias to the variable in the frame at level LEVEL corresponding to the parameter value. This is similar to what is achieved when using the /upvar/ command. This specifier is incompatible with the /-switch/ specifier. % proc p { { v -upvar 1 } } { incr v } % set a 2 2 % p a 3 % set a 3 Further argument specifiers may be added in future TIP. Examples of new argument specifiers which may be added in the future: * type assertion (*-assume TYPE*) * argument documentation (*-docstring DOC*) * ... NAMED ARGUMENTS ----------------- The following rules define how named arguments are expected to be specified on the call-site: * Named arguments must always be specified using their name, they can't be specified as positional arguments. % proc p { {a -name A} } { list a $a } % p aa wrong # args: should be "p |-A a|" % p -A aa a aa * When several names (using /-name/ or /-switch/ options) are specified for the same argument, only one is required to be used on the call-site, unless a default value is also specified. If more than one is used, the latest value/switch is kept. % proc p { { v -name {v val} } } { list v $v } % p -v 6 -val 8 v 8 * Both /-name/ and /-switch/ specifiers can be used on the same argument. % proc p { { level -name level -switch {{quiet 0} {verbose 9}} } { list level $level } % p -level 4 level 4 % p -verbose level 9 * A group of contiguous named arguments are handled together and are not required to be specified in the same order as defined. % proc p { {a -name A} {b -name B} } { list a $a b $b } % p -B bb -A aa % a aa b bb * The handling of a group of contiguous named arguments (which can be only one argument) is ended on the first argument which is either a parameter not starting with a dash or the special /--/ end-of-options marker. Remaining arguments will then be assigned to following positional arguments. % proc p { {o -name opt} args } { list o $o args $args } % p -opt O 5 o O args 5 % p -opt O -1 0 wrong # args: should be "p |-opt o| ?arg ...?" % p -opt O -- -1 0 o O args {-1 0} * If there is a fixed number of non-optional positional arguments and no special /args/ variable after the named group, the handling of a named group will also be ended when the remaining arguments to assign will be equal to the number of positional arguments after the group. % proc p { {o -name opt} posarg } { list o $o posarg $posarg } % p -opt O -1 o O posarg -1 GENERATED USAGE DESCRIPTION ----------------------------- The error message, automatically generated when the input arguments are invalid, is updated regarding new options: * Pass-by-name arguments (specified using /-upvar level/ option) are surrounded by the '&' character. % proc p { { v -upvar 1 } } { } % p wrong # args: should be "p &v&" * Named arguments are showed how they should be called and surounded by the '|' character. If several names have been specified, they are grouped together. % proc p { { l -name level -switch {high low} -required 1} } {} % p wrong # args: should be "p |-level l|-high|-low|" * When an argument is optional, '?' is used. % proc p { { v -name var } a } {} % p wrong # args: should be "p ?|-var v|? a" INTROSPECTION --------------- The /info argspec proc/ command is added to get the argument specification of all arguments or of a specific argument. % proc p { a { b 1 } { c -name c } } {} % info argspec proc p a { b -default 1 } { c -name c } % info argspec proc p c -name c Similar /info argspec/ subcommands are also added for lambda, object method and object constructor. The /info argspec specifiers/ command is added to get the specifiers supported by the current interpreter. % info argspec specifiers -default -name -required -switch -upvar OTHER USE CASES ----------------- Extended argument specifiers can also be used with other /proc/-like functions. The following functions are supported and can use extended argument specifiers: * anonymous functions (lambda), used with /apply/ command ; * TclOO constructor or methods. PERFORMANCE ------------- The proposed modification has no significant performance impact: * existing code (and code not using extended argspec) is not impacted by the change as the current initialisation code is still available and used ; * code using extended argspec /may/ be impacted because the initialisation code is different and is required to loop on each argument, but initial testing does not show a significant slowdown. When using named arguments specifiers to replace a similar handling done in Tcl-pure code, there is however a significant increase in performance. See [] for details on performance testing done on the proposed implementation. IMPLEMENTATION ================ This document proposes the following changes to the Tcl core: 1. Add ExtendedArgSpec structure which is linked from CompiledLocal and contains information about extended argument specification; 2. Add a flags field in the Proc structure to later identify a proc with at least one argument defined with an extended argument specification (PROC_HAS_EXT_ARG_SPEC); 3. Update proc creation to handle the extended argument specification and fill the ExtendedArgSpec structure; 4. Update InitArgsAndLocals to initialize the compiled locals using a dedicated function if the PROC_HAS_EXT_ARG_SPEC flag has been set on the proc. If unset, the original initialization code is still used. 5. Update ProcWrongNumArgs to generate an appropriate error message when an argument has been defined using an extended argument specification; 6. Add /info argspec/ command; 7. Update documentation in doc/proc.n and doc/info.n; 8. Update impacted tests and add dedicated tests in tests/proc-enh.test. REFERENCE IMPLEMENTATION -------------------------- The reference implementation is available in the tip-457 [] branch. The code is licensed under the BSD license. DISCUSSION ============ This section details some of the alternate solutions for this feature or specific comments about it. Initial approaches that tried to work with unmodified procedures are not detailed here for clarity. DEDICATED BUILTIN COMMAND --------------------------- A dedicated command can be used to handle the named arguments, using an /-option value/ syntax, before calling the target procedures with all arguments correctly prepared. % call -opts myproc -optC foo -optB {5 5} -- "some pos arg" An implementation of this proposal is available at []. This proposal was abandoned as it was not enough intuitive for users. MODIFICATION IN HOW PROC ARE DEFINED -------------------------------------- Tcl-pure procedures can be defined in a way which state that the procedure will automatically handle /-option value/ arguments. % proc -np myproc { varA { optB defB } { optC defC } { optD defD } args } { .. } % myproc -optC foo -optB {5 5} -- "some pos arg" An other possibility is to support options on arguments and allow name specification: % proc myproc { varA { optB -default defB -name B } args } { .. } % myproc a -B b zz This is the currently proposed solution in this TIP. It requires the procedures to be modified but allow additional features. Some people have expressed concern about the modification of the /proc/ command, which is a core command of Tcl. A particular attention has been paid to ensure that existing code will not be impacted and that future usage could be later added by adding new specifiers. ARGUMENT PARSING COMMAND -------------------------- Cyan Ogilvie's paper from Tcl2016 [] describes a C extension to provide core-like argument parsing at speed comparable to /proc/ argument handling, in a terse and self-documenting way. Alexandre Ferrieux has proposed [] to use the same argument specifiers than this proposal, but with a dedicated command which can be called from the proc body. This has the advantage to not alter the /proc/ command and could be located in an extension. Although the /proc/ usage will not be modified, this new command will probably have to access or modify internal proc structures, for example to support introspection. Having to declare final local variables in the body, also seems confusing for users. PREVENTING DATA-DEPENDENT BUGS -------------------------------- It has been proposed by Christian Gollwitzer [] to make the special '--' end-of-options marker mandatory when the number of positional arguments after the named group is not fixed. This would suppress any potential Data-Dependent bugs related to the search of the initial dash and remove any unwanted object stringification, at the expense of forcing the user to explicitely use the end-of-option marker. This proposal is currently not implemented but the documentation has been modified to list the cases for which '--' should be use. COPYRIGHT =========== This document has been placed in the public domain. ------------------------------------------------------------------------- TIP AutoGenerator - written by Donal K. Fellows