TIP 170: Better Support for Nested Lists

Login
Author:         Sergey Babkin <[email protected]>
Author:         Don Porter <[email protected]>
Author:		Donal K. Fellows <[email protected]>
State:          Draft
Type:           Project
Vote:           Pending
Created:        30-Jan-2004
Post-History:   
Tcl-Version:    8.7
Obsoleted-By:	22
Obsoleted-By:	136
Obsoleted-By:	157
Implementation-URL: http://nac.sf.net/

Abstract

Nested lists are easy to create with Tcl but then manipulating them is not easy. For example, think about how to change a value nested in a list 2 levels deep? How about 4 levels deep? The proposed new commands make such manipulation easy, and make the nested lists a great replacement for arrays and structures in C-like languages.

Rationale and Specification

The new proposed commands start with the prefix "ldeep". They are desgined to resemble the classic list commands with "l" names. In the cases when the meaning of "ldeep" command differs substantially from the "l" command, the name has been selected to be different (such as "ldeeprep", not "ldeepreplace").

The commands have been extensively used in the Not A Commander project http://nac.sf.net/ and have been adjusted to the observed needs of practical use. All these commands use the concept of "path" (see [22]) and handle the concatenation of paths for reasons of convenience (e.g. a "base" path followed by a "local" path).

Commands

The new proposed commands are:

ldeepset listvar path ?path?... value

Set a value in a nested list variable. If the variable did not exist previously, it is created. If the intermediate lists specified in the path did not exist, they are created as empty lists. This includes the "filler" elements: for example, if listvar contained a list of one element and the path starts with "5 ...", the elements with indexes 1, 2, 3, 4 and 5 will all be created (and will contain empty strings) and the the further creation within the element at index 5 will proceed.

A special meaning is assigned by this command to the negative indexes in the path: they mean "add a new element at the end of the list". So this command also doubles as a nested version of lappend. For example,

   ldeepset listvar -1 value

means the same thing as

   lappend listvar value

This merging has happened because it's often neccessary to add elements to the lists in the middle of the path. The particular value used to indicate the addition of an element can be changed to something more symbolic, for example to "append" instead of a negative value.

The ldeepset command returns nothing. Since the version without value as in the common set can not be used, returning the value did not seem to make sense. Also when experimenting with large lists from the command line, returning the value that is a large list itself would cause a long and unpleasant printout of it.

(This is only partially superceded by [33] and [331].)

ldeepincr lstvar path ?path?... int-value

Increase a value within a nested list by int-value. Note that since the amount of increase has to be differentiated from the path, it's mandatory even for the value of 1. This is a convenient and often used shortcut for the ldeepindex-expr-ldeepset sequence. It returns the value of the element after increase.

ldeeprep lstvar path ?path?... first last element-list

Replace a range of elements in a sublist identified by the path with the elements from the element-list. It returns nothing.

This command is different from lreplace in two ways, hence the name change. First, it acts on data in a variable, not on a list as an argument. Second, the elements for replacement are contained in a list, not as separate elements on command line. Both differences were created for convenience of practical use, plus to allow the path to pick up the variable number of arguments. I have found that I always need to replace elements in a variable, not as a pass-through operator, and that I almost always need to insert elements from another list, not just some fixed set of elements.

ldeeppop lstvar path ?path?... count

Remove count elements from the end of the sublist identified by path in the variable and return them in a list.

This command was inspired by the pop operator in Perl. Somehow I've never has a very string need for the other similar commands (which would be ldeeppush to add elements, and ldeepshift and ldeepunshift for operations on the start of the list) but they can be easily added as well for completeness.

The command returns the list of the popped elements in the original order. For example, if lstvar contained {0 1 2 3 4 5}, "ldeeppop lstvar {} 2" would return {4 5}, NOT {5 4}.

Other Extensions for List Support

In my practice I have found that a few other commands make working with lists much more convenient. They are not directly related to the nested lists but to the lists in general.

lconcat sublist ?sublist?...

Concatenate the argument lists and return the resulting list. This command is similar to concat but avoids converting the values to strings, concatenating the strings and then re-parsing the strings. When the lists involved grow to a few megabytes, concat can become very inefficient both in the sense of time and memory usage; lconcat resolved this inefficiency. Note that it does not replace concat, which can be used to assemble lists from pieces in different argument strings. The command returns the concatenated list.

(Note that this is largely possible through using list and [157]/[293], and that concat is more likely to handle the lists where this matters efficiently anyway right now due to efficiency tweaks to its implementation. It remains to be seen whether there is still a need for lconcat in other areas...)

Reference Implementation

The reference implementation is available as part of the Not A Commander project http://nac.sf.net/ , the source file cutil.c. To include the new commands into Tcl, the error messages will have to be adjusted to match the style used in Tcl, and the man pages will have to be written. The current implementation has been tested in fair amounts for both correctness and efficiency by usage in the Not A Commander project, the formal test suite would have to be written. Further progress in this direction depends on acceptance of this proposal.

Comments

Messages on the TCLCORE mailing list have pointed out that nearly everything proposed here is either equivalent to, or trivially created by composition of the existing commands lindex, lrange, lset, lassign, and lrepeat. The only novel thing proposed is the ability of lset to create new list elements. If that's still desired, a separate new TIP proposing that alone would be the best way to deal with that.

I call on the author to withdraw this TIP.

Copyright

This document has been placed in the public domain.


Appendix: Removed Commands

These commands were originally part of this TIP, but have been removed to this appendix on the grounds that the functionality they describe is now part of Tcl via other TIPs.

ldeepindex list path ?path?...

Extract an element from a nested list. The element is identified by the logical concatenation of the paths. It returns the extracted element. (Obsoleted by [22].)

ldeeplen list path ?path?...

Find the length of a nested sublist. The sublist is identified by the logical concatenation of the paths. It returns the found length. A non-existing sublist is assumed to have the length of 0. (Obsoleted by [22] and normal llength.)

ldeeprange list path ?path?... first last

Extract a sublist from a nested list. This command is a convenient equivalent of

   lrange [ldeepindex list path ?path?] first last]

"end" is supported as the first or last index. It returns the extracted sublist. (Obsoleted by [22] and normal lrange.)

mset list-of-variable-names list-of-values

Set the values from the value list to the variable in the variable list at the same index. The command name stands for "multiple set". This command is inspired by assignments in Perl.

If there were more variables than the values, the rest of variables are set to an empty value. If there were more values than variables, then the last variable is assigned the whole end of a list (as a list).

A special variable name "-" can be used to throw away a value, or the whole end of the value list if it's specified as the last one in the variable list.

The command returns nothing. It can be argued that it would make sense to return the length of the original list, or the difference between the length of the values list and the length of the variables list. I don't know which one is better.

This command is particularly convenient for returning multiple values from a procedure. For example:

   proc xxx {a b} {
      return [list [expr $b+$a] [expr $b*$a]]
   }
   mset {sum product} [xxx 1 2]

(Obsoleted by [57].)

ldup count element ?element?...

This returns a list produced by duplicating the sequence of elements count times. This command is inspired by the operator "x" in Perl, and it is a logical equivalent of:

   proc ldup {count args} {
      set res {}
      for {set i 0} {$i < $count} {incr i} {
         set res [lconcat $res $args]
      }
      return $res
   }

(Obsoleted by [136].)

lvdup count element-list

This returns a list produced by duplicating the element-list count times. This command is inspired by the operator "x" in Perl, and it is a logical equivalent of:

   proc lvdup {count list} {
      set res {}
      for {set i 0} {$i < $count} {incr i} {
         set res [lconcat $res $list]
      }
      return $res
   }

(Obsoleted by [136] and [157]/[293].)