TIP #170: BETTER SUPPORT FOR NESTED LISTS =========================================== Version: $Revision: 1.6 $ Author: Sergey Babkin Don Porter Donal K. Fellows State: Draft Type: Project Tcl-Version: 8.7 Vote: Pending Created: Friday, 30 January 2004 URL: https://tip.tcl-lang.org170.html Post-History: ------------------------------------------------------------------------- ABSTRACT ========== Nested lists are easy to create with Tcl but then manipulating them is not easy. For example, think about how to change a value nested in a list 2 levels deep? How about 4 levels deep? The proposed new commands make such manipulation easy, and make the nested lists a great replacement for arrays and structures in C-like languages. RATIONALE AND SPECIFICATION ============================= The new proposed commands start with the prefix "ldeep". They are desgined to resemble the classic list commands with "l" names. In the cases when the meaning of "ldeep" command differs substantially from the "l" command, the name has been selected to be different (such as "ldeeprep", not "ldeepreplace"). The commands have been extensively used in the Not A Commander project [] and have been adjusted to the observed needs of practical use. All these commands use the concept of "path" (see [TIP #22]) and handle the concatenation of paths for reasons of convenience (e.g. a "base" path followed by a "local" path). COMMANDS ---------- The new proposed commands are: *ldeepset* /listvar path/ ?/path/?... /value/ Set a /value/ in a nested list variable. If the variable did not exist previously, it is created. If the intermediate lists specified in the path did not exist, they are created as empty lists. This includes the "filler" elements: for example, if /listvar/ contained a list of one element and the path starts with "5 ...", the elements with indexes 1, 2, 3, 4 and 5 will all be created (and will contain empty strings) and the the further creation within the element at index 5 will proceed. A special meaning is assigned by this command to the negative indexes in the path: they mean "add a new element at the end of the list". So this command also doubles as a nested version of lappend. For example, ldeepset listvar -1 value means the same thing as lappend listvar value This merging has happened because it's often neccessary to add elements to the lists in the middle of the path. The particular value used to indicate the addition of an element can be changed to something more symbolic, for example to "append" instead of a negative value. The /ldeepset/ command returns nothing. Since the version without value as in the common set can not be used, returning the value did not seem to make sense. Also when experimenting with large lists from the command line, returning the value that is a large list itself would cause a long and unpleasant printout of it. /(This is only partially superceded by [TIP #33] and [TIP #331].)/ *ldeepincr* /lstvar path/ ?/path/?... /int-value/ Increase a value within a nested list by /int-value/. Note that since the amount of increase has to be differentiated from the /path/, it's mandatory even for the value of 1. This is a convenient and often used shortcut for the *ldeepindex*-*expr*-*ldeepset* sequence. It returns the value of the element after increase. *ldeeprep* /lstvar path/ ?/path/?... /first last element-list/ Replace a range of elements in a sublist identified by the /path/ with the elements from the /element-list/. It returns nothing. This command is different from *lreplace* in two ways, hence the name change. First, it acts on data in a variable, not on a list as an argument. Second, the elements for replacement are contained in a list, not as separate elements on command line. Both differences were created for convenience of practical use, plus to allow the path to pick up the variable number of arguments. I have found that I always need to replace elements in a variable, not as a pass-through operator, and that I almost always need to insert elements from another list, not just some fixed set of elements. *ldeeppop* /lstvar path/ ?/path/?... /count/ Remove /count/ elements from the end of the sublist identified by /path/ in the variable and return them in a list. This command was inspired by the pop operator in Perl. Somehow I've never has a very string need for the other similar commands (which would be /ldeeppush/ to add elements, and /ldeepshift/ and /ldeepunshift/ for operations on the start of the list) but they can be easily added as well for completeness. The command returns the list of the popped elements in the original order. For example, if /lstvar/ contained {0 1 2 3 4 5}, "ldeeppop lstvar {} 2" would return {4 5}, NOT {5 4}. OTHER EXTENSIONS FOR LIST SUPPORT =================================== In my practice I have found that a few other commands make working with lists much more convenient. They are not directly related to the nested lists but to the lists in general. *lconcat* /sublist/ ?/sublist/?... Concatenate the argument lists and return the resulting list. This command is similar to *concat* but avoids converting the values to strings, concatenating the strings and then re-parsing the strings. When the lists involved grow to a few megabytes, *concat* can become very inefficient both in the sense of time and memory usage; *lconcat* resolved this inefficiency. Note that it does /not/ replace *concat*, which can be used to assemble lists from pieces in different argument strings. The command returns the concatenated list. /(Note that this is largely possible through using/ *list* /and [TIP #157]/[TIP #293], and that/ *concat* /is more likely to handle the lists where this matters efficiently anyway right now due to efficiency tweaks to its implementation. It remains to be seen whether there is still a need for/ *lconcat* /in other areas...)/ REFERENCE IMPLEMENTATION ========================== The reference implementation is available as part of the Not A Commander project [], the source file /cutil.c/. To include the new commands into Tcl, the error messages will have to be adjusted to match the style used in Tcl, and the man pages will have to be written. The current implementation has been tested in fair amounts for both correctness and efficiency by usage in the Not A Commander project, the formal test suite would have to be written. Further progress in this direction depends on acceptance of this proposal. COMMENTS ========== Messages on the TCLCORE mailing list have pointed out that nearly everything proposed here is either equivalent to, or trivially created by composition of the existing commands *lindex*, *lrange*, *lset*, *lassign*, and *lrepeat*. The only novel thing proposed is the ability of *lset* to create new list elements. If that's still desired, a separate new TIP proposing that alone would be the best way to deal with that. I call on the author to withdraw this TIP. COPYRIGHT =========== This document has been placed in the public domain. ------------------------------------------------------------------------- APPENDIX: REMOVED COMMANDS ============================ /These commands were originally part of this TIP, but have been removed to this appendix on the grounds that the functionality they describe is now part of Tcl via other TIPs./ *ldeepindex* /list path/ ?/path/?... Extract an element from a nested list. The element is identified by the logical concatenation of the paths. It returns the extracted element. /(Obsoleted by [TIP #22].)/ *ldeeplen* /list path/ ?/path/?... Find the length of a nested sublist. The sublist is identified by the logical concatenation of the paths. It returns the found length. A non-existing sublist is assumed to have the length of 0. /(Obsoleted by [TIP #22] and normal llength.)/ *ldeeprange* /list path/ ?/path/?... /first last/ Extract a sublist from a nested list. This command is a convenient equivalent of lrange [ldeepindex list path ?path?] first last] "end" is supported as the first or last index. It returns the extracted sublist. /(Obsoleted by [TIP #22] and normal lrange.)/ *mset* /list-of-variable-names list-of-values/ Set the values from the value list to the variable in the variable list at the same index. The command name stands for "multiple set". This command is inspired by assignments in Perl. If there were more variables than the values, the rest of variables are set to an empty value. If there were more values than variables, then the last variable is assigned the whole end of a list (as a list). A special variable name "*-*" can be used to throw away a value, or the whole end of the value list if it's specified as the last one in the variable list. The command returns nothing. It can be argued that it would make sense to return the length of the original list, or the difference between the length of the values list and the length of the variables list. I don't know which one is better. This command is particularly convenient for returning multiple values from a procedure. For example: proc xxx {a b} { return [list [expr $b+$a] [expr $b*$a]] } mset {sum product} [xxx 1 2] /(Obsoleted by [TIP #57].)/ *ldup* /count element/ ?/element/?... This returns a list produced by duplicating the sequence of elements /count/ times. This command is inspired by the operator "x" in Perl, and it is a logical equivalent of: proc ldup {count args} { set res {} for {set i 0} {$i < $count} {incr i} { set res [lconcat $res $args] } return $res } /(Obsoleted by [TIP #136].)/ *lvdup* /count element-list/ This returns a list produced by duplicating the /element-list count/ times. This command is inspired by the operator "x" in Perl, and it is a logical equivalent of: proc lvdup {count list} { set res {} for {set i 0} {$i < $count} {incr i} { set res [lconcat $res $list] } return $res } /(Obsoleted by [TIP #136] and [TIP #157]/[TIP #293].)/ ------------------------------------------------------------------------- TIP AutoGenerator - written by Donal K. Fellows