TIP: 460
Title: An Alternative to Upvar
Version: $Revision: 1.5 $
Author: Don Hathway
State: Draft
Type: Project
Vote: Pending
Created: 08-Dec-2016
Post-History:
Keywords: Tcl,variable,link,upvar
Tcl-Version: 9.0

~ Abstract

Variable linking with the ''upvar'' command is not as intuitive or efficient as it should be. This TIP proposes an alternative through automatic variable linking.

~ Rationale

The current strategy used to link a variable in a called procedure to the caller is to pass the name of the variable to the procedure and use the ''upvar'' command to create a new variable, which is then linked to the original. Thus linking to a variable requires two components: the variable name and a newly created variable. It is possible to instruct Tcl to do this linking automatically in an idiomatic way and dispense with the ''upvar'' command call. Also, the requirement (by ''upvar'') that the name of the new link variable be different from the original is arguably counter-intuitive.

Benefits of this TIP as proposed:

 1. '''No code to perform explicit linking within the procedure's body.''' Unlike ''upvar'', this method requires no additional code in the body of the procedure. Less code, fewer bugs, easier to use! It has been said that Tcl'ers should make more use of variable linking in their code. Making it easier for them should have an encouraging effect, similar to how most Tcl'ers prefer ''$var'' over ''set var''.

 2. '''Clearly defines links in the procedure's parameter list.''' Readers should instantly know what the links are. Clarity is important, especially for people who read code all day. There are no special project naming conventions to follow. A reader doesn't have to rely on docs or assume that a parameter named "varName", "_var", or "varLnk" is to be linked by an ''upvar'' call, which may be pushed down in the procedure's body by comments or other code.

 3. '''Alleviates arguably messy ''upvar'' chain linking.'''

~~ Upvar Chaining Example

Below are three ''upvar''s with the same arguments. As you can see, there is quite a bit of arguably unnecessary code duplication, which is bug prone.

| proc foo {a} {
|     upvar 1 $a la
|     # maybe do something with la
|     bar la
| }
| proc bar {a} {
|     upvar 1 $a la
|     # maybe do something with la
|     baz la
| }
| proc baz {a} {
|     upvar 1 $a la
|     # maybe do something with la
| }
| foo begin

This could be written more succinctly:

| proc foo {*a} {
|     # maybe do something with a
|     bar a
| }
| proc bar {*a} {
|     # maybe do something with a
|     baz a
| }
| proc baz {*a} {
|     # maybe do something with a
| }
| foo begin

~ Specification

Extend procedure handling so that a procedure definition can mark the parameters that are intended to be linked to variables in the caller. The asterisk character "'''*'''" declares this intent and shall prefix the parameter's name. Consequently, the "'''*'''" character becomes special, but only inside the procedure parameter list.

A procedure definition using this facility would then have the signature:

| proc foo {*a *b} {...}

Here '''*a''' and '''*b''' are the procedure's parameters to be linked to the variables named by the caller's arguments. New variables are created for the links: in this example '''*a''' creates a new link variable named '''a''', and '''*b''' likewise creates one named '''b'''. The formal parameters '''*a''' and '''*b''' themselves shall retain the values passed in by the caller.

The link variable's name shall always have one '''*''' symbol fewer than its counterpart parameter, for the sake of consistency. For example, a parameter named '''***a''' shall have a counterpart link variable named '''**a'''. Similarly, '''**a''' shall have a counterpart link named '''*a'''.

Where there are duplicate link parameter names (e.g. proc P {*a *a}) the behavior shall be the same as if there were duplicate '''upvar''' statements.

It is legal to have empty link variable names. This shall be possible with a single '''*''' in the procedure's parameter list (e.g. proc P {*} {incr ""}). The same duplicate name rule applies.

If the variable to be linked does not exist, it shall be created if necessary. The behavior in such instances shall be the same as that of '''upvar 1'''.

When a link's construction fails, the behavior shall be the same as if '''upvar''' had failed: the procedure will return with an error before any other commands in its body are executed (with the exception of any commands involved in the link's construction).

It is illegal for a link parameter to have a default value. Attempting to declare one shall raise an error at procedure creation time, and procedure creation shall fail with the error code:

| Tcl_SetErrorCode(interp, "TCL", "OPERATION", "PROC", "FORMALARGUMENTFORMAT", NULL);

An example of such an error for:

| proc P {{*a foo}} {...}

would be: "procedure "P": formal parameter "*a" is to be linked and must not have a default value"

In that example, proc '''P''' is never created, because the attempt failed with the error.

It is the caller's responsibility to provide the names of variables to be linked. This constraint exists in the spirit of promoting good coding practices and to help avoid obscure and subtle bugs. For the same reasons, this TIP only searches one level up. Therefore, it shall have the same behavior as '''upvar 1'''.

'''*args''' is a valid parameter name. For example, '''args''' is simply a link in:

| proc foo {a *args} {
|     incr args
| }

Note that, at the time of writing, ''proc foo {args args} {...}'' is legal Tcl. In this instance only the first ''scalar'' '''args''' is usable by the procedure. The rest of the arguments are inaccessible to the script. They're not internally lost, but Tcl's variable lookup mechanics will choose whichever variable is found first when a script references the name.
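
The linking rules in this section can be approximated in current Tcl to get a feel for the proposed semantics. The sketch below is only an emulation built on ''upvar 1''; it is not the reference implementation. The helper name ''linkproc'' is hypothetical and is not part of this TIP or of Tcl. It assumes that creating the links at the start of the body is an acceptable stand-in for creating them during argument binding, and it does not set the error code described above.

```tcl
# Hypothetical helper (an assumption for illustration, not part of this TIP):
# defines a procedure whose "*name" parameters are linked with "upvar 1".
proc linkproc {name params body} {
    set prelude ""
    foreach spec $params {
        # A parameter specifier is "name" or "name default".
        set formal [lindex $spec 0]
        if {[string index $formal 0] ne "*"} {
            continue
        }
        # Link parameters must not carry a default value.
        if {[llength $spec] > 1} {
            return -code error "procedure \"$name\": formal parameter\
                \"$formal\" is to be linked and must not have a default value"
        }
        # The link variable has one "*" fewer than the formal parameter.
        set link [string range $formal 1 end]
        # The formal parameter keeps the caller's value (a variable name);
        # the link variable is bound to that variable one level up.
        append prelude "upvar 1 \[set [list $formal]\] [list $link]\n"
    }
    uplevel 1 [list proc $name $params $prelude$body]
}

# The foo/bar pair from this TIP, written with the emulation.
linkproc bar {*a *b} {
    incr a
    incr b
}
linkproc foo {*a *b} {
    bar a b
}

set v1 0
set v2 1
foo v1 v2
puts "$v1 $v2"
# prints: 1 2

# A default value on a link parameter is rejected at definition time.
set failed [catch {linkproc P {{*a foo}} {...}} msg]
puts $msg
```

Because the emulation rewrites the body, it pays for an ''upvar'' call on every invocation; the proposal instead resolves the links while the arguments are being bound.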

The first-match lookup described for ''proc foo {args args} {...}'' is inherited by ''proc foo {*args args} {...}'', where '''args''' will be a link.

The following example further illustrates this proposal:

| proc foo {*a *b} {
|     bar a b
| }
| proc bar {*a *b} {
|     incr a
|     incr b
| }
| set v1 0
| set v2 1
| foo v1 v2
| puts $v1
| # prints 1
| puts $v2
| # prints 2
|
| # Version of foo using upvar:
| proc foo {a b} {
|     # Note, upvar $a a would be an error.
|     upvar 1 $a la $b lb
|     bar la lb
| }
| proc bar {a b} {
|     upvar 1 $a la $b lb
|     incr la
|     incr lb
| }

The "'''*'''" character was chosen primarily because it resembles a star or a snowflake and is pleasant to look at. It is one of the few ASCII characters that '''sticks out''' from its surrounding text. It is also familiar to users of other languages where the same symbol exhibits similar semantics (to wit: a link in Tcl acts as a reference to another variable, so writing through it updates the original rather than a copy, as would happen if it weren't a link). However, unlike other languages, the Tcl core does not expose operations to user scripts that work directly on memory, so the "'''*'''" character should not be expected to behave the same as, or suffer from the same pitfalls as, it does in C, C++, Golang, etc. The '''*''' symbol simply instructs Tcl to create a link if it is able to do so.

~~ Consequences

 1. Breaks scripts using the special "'''*'''" as the first character of their procedures' parameter names (e.g. '''*var''').

 > The impact of this should be minimal because such variable names must be wrapped in curly braces (e.g. '''${*var}''') to fetch their values, unless the script uses the less common form '''set varname'''.

~ Reference Implementation

See branch ''dah-proc-arg-upvar''.

~~ Implementation Notes

tclInt.h: Add a new field named ''numArgsCompiledLocals'' to the Proc struct. The new field holds the number of parameters plus any other relevant local variables that follow immediately after the parameters.
For this TIP, these additional locals are variables with the VAR_LINK flag, to be resolved as links to the values of the arguments they have been configured to link with. Adding the field was a hard choice, but it is necessary because ''TclProcCompileProc'' enforces that ''procPtr->numCompiledLocals'' has the same value as ''procPtr->numArgs''. The local variable table is evidently not growable until later.

tclProc.c: Modify ''InitArgsAndLocals'' to do the automatic linking. Note that this is a ''very hot'' function, and that was kept in mind while making the necessary adjustments. There are two additional branches in the function. The first checks whether the command has any parameters that need linking and, if so, processes them with the link handling code. The second is visited only when an error occurs and simply checks whether the link handling code set that error, so it should not be a concern for performance. Due to branch prediction and this function being so hot, there should be virtually no performance impact on any code that doesn't make use of the new automatic linking facility.

tclProc.c: Modify ''TclCreateProc'' to add additional locals after the list of parameter locals (if any) when there are parameters flagged for automatic linking.

~ Copyright

This document has been placed in the public domain.