# Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

36 messages
12
Open this post in threaded view
|

## Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 5.2 REFERENCE MANUAL "Unless a __len metamethod is given, the length of a table t is only defined if the table is a sequence, that is, the set of its positive numeric keys is equal to {1..n} for some integer n. In that case, n is its length." *** Conclusion ***: Without a __len metamethod, an empty table has undefined length. 5.1 REFERENCE MANUAL "The length of a table t is defined to be any integer index n such that t[n] is not nil and t[n+1] is nil; moreover, if t[1] is nil, n can be zero. For a regular array, with non-nil values from 1 to a given n, its length is exactly that n, the index of its last value. If the array has "holes" (that is, nil values between other non-nil values), then #t can be any of the indices that directly precedes a nil value (that is, it may consider any such nil value as the end of the array)." *** Conclusion ***: An empty table (t[1] is nil) may or may not have a length of zero. =========================================================================== So I'm assuming the following idiomatic code invokes undefined behavior: a = {} a[#a + 1] = 'foo' As does this code (because of the implicit #a+1): a = {} table.insert(a, 'foo') =========================================================================== Yet such code appears repeatedly in PIL (3rd edition), and I haven't seen any errata for it: PAGE 94 threads = {}    -- list of all live threads -- [...]    table.insert(threads, co) PAGE 95 local timedout = {} -- [...] timedout[#timedout + 1] = res PAGE 99 and PAGE 100 local words = {} for w in pairs(counter) do    words[#words + 1] = w end PAGE 113 and PAGE 114 local t = {} for line in io.lines() do    t[#t + 1] = -- [...] end PAGE 115 path = path or {} visited = visited or {} if visited[curr] then   -- node already visited?    return nil            -- no path here end visited[curr] = true    -- mark node as visited path[#path + 1] = curr  -- add it to path PAGE 129 function Set.tostring (set)    local l = {}     -- list to put all elements from the set    for e in pairs(set) do      l[#l + 1] = e    end PAGE 195 t = {} for line in io.lines() do     table.insert(t, line) end PAGE 196 a = {} for n in pairs(lines) do a[#a + 1] = n end PAGE 197 local a = {} for n in pairs(t) do a[#a + 1] = n end PAGE 202 local t = {}                    -- table to store the indices local i = 0 while true do    i = string.find(s, "\n", i+1) -- find next newline    if i == nil then break end    t[#t + 1] = i end PAGE 203 words = {} for w in string.gmatch(s, "%a+") do    words[#words + 1] = w end PAGE 212 function encode (t)    local b = {}    for k,v in pairs(t) do      b[#b + 1] = (escape(k) .. "=" .. escape(v))    end PAGE 218 local a = {} a[#a + 1] = -- [...] PAGE 223 local lines = {} -- read the lines in table 'lines' for line in io.lines() do lines[#lines + 1] = line end
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 Hi, I typed =#{} into the Lua console and got: 0 which seems pretty OK to me as it has 0 elements before the first non-existing. -- Oliver
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 In reply to this post by polyglot Quoth polyglot: > 5.2 REFERENCE MANUAL > > "Unless a __len metamethod is given, the length of a table t is only defined > if the table is a sequence, that is, the set of its positive numeric keys is > equal to {1..n} for some integer n. In that case, n is its length." > > *** Conclusion ***: Without a __len metamethod, an empty table has undefined > length. Not so. An empty table is a still considered a sequence, just with no elements. The manual could be a little more clear on this, but it's not saying the table must contain elements to be considered sequential. It is only differentiating between types of non-empty tables. > 5.1 REFERENCE MANUAL > > "The length of a table t is defined to be any integer index n such that t[n] > is not nil and t[n+1] is nil; moreover, if t[1] is nil, n can be zero. For a > regular array, with non-nil values from 1 to a given n, its length is > exactly that n, the index of its last value. If the array has "holes" (that > is, nil values between other non-nil values), then #t can be any of the > indices that directly precedes a nil value (that is, it may consider any > such nil value as the end of the array)." > > *** Conclusion ***: An empty table (t[1] is nil) may or may not have a > length of zero. Not so. The phrase "can be zero" should not be taken to mean it may sometimes not be zero when there are no other elements. Again, it is speaking about the situation of the non-empty table which has non-sequential data, i.e. t[1] is nil but t[2] is not nil. It could be more clear about the empty-table case, but in my experience, this is rather intuitive and has never caused any ambiguity in my mind. > =========================================================================== > > So I'm assuming the following idiomatic code invokes undefined behavior: > > a = {} > a[#a + 1] = 'foo' > > As does this code (because of the implicit #a+1): > > a = {} > table.insert(a, 'foo') Not so. Since all the examples you listed from the book hinge on this principle, they are not invoking undefined behavior and are written correctly. -Steven
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 In reply to this post by Oliver Kroth Quoth Oliver Kroth: > Hi, I typed > > =#{} > > into the Lua console and got: > > 0 > > which seems pretty OK to me as it has 0 elements before the first > non-existing. I believe the OP is claiming that the result appears to be undefined according to the manual, not that it can never be 0. -Steven
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 In reply to this post by Steven Degutis On Wed, Sep 17, 2014 at 7:16 AM, Steven Degutis <[hidden email]> wrote: > Quoth polyglot: >> 5.2 REFERENCE MANUAL >> >> "Unless a __len metamethod is given, the length of a table t is only defined >> if the table is a sequence, that is, the set of its positive numeric keys is >> equal to {1..n} for some integer n. In that case, n is its length." >> >> *** Conclusion ***: Without a __len metamethod, an empty table has undefined >> length. > > Not so. An empty table is a still considered a sequence, just with no > elements. The manual could be a little more clear on this, but it's > not saying the table must contain elements to be considered > sequential. It is only differentiating between types of non-empty > tables. More precisely: What is the set of positive numeric keys for an empty table? Does the set of positive numeric keys for an empty table obey this definition? This set of keys is {}, the empty set. It IS a member of the set of sets of the form {1..n}, with n = 0. (Try it with a for loop -- for n = 1, 0 do print(n) end -- and see that this is true.) Therefore, an empty table is a sequence, and it is a sequence of length 0. And because an empty table is a sequence with length 0, the length operator is defined on it. /s/ Adam
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 In reply to this post by Steven Degutis On 2014-09-18 00:16, Steven Degutis wrote: > An empty table is a still considered a sequence Hmm. I don't think that's immediately obvious from the definition of a sequence: "We use the term sequence to denote a table where the set of all positive numeric keys is equal to {1..n} for some integer n, which is called the length of the sequence (see §3.4.6)." The most natural reading for me is that a sequence has at least 1 as a positive numeric key.
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 In reply to this post by Oliver Kroth On 2014-09-18 00:16, Oliver Kroth wrote: > Hi, I typed > > =#{} > > into the Lua console and got: > > 0 > > which seems pretty OK to me as it has 0 elements before the first > non-existing. > -- > Oliver Yeah, it certainly makes sense to me too, but I'm wondering whether the reference manual needs a revision.
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 In reply to this post by Coda Highland On 2014-09-18 01:02, Coda Highland wrote: > > This set of keys is {}, the empty set. It IS a member of the set of > sets of the form {1..n}, with n = 0. "We use the term sequence to denote a table where the set of all positive numeric keys is equal to {1..n} for some integer n, which is called the length of the sequence (see §3.4.6)." Taking n = 0 seems pathological to me, even though 0 is indeed an integer. The main problem, however, is that any integer less than 1 produces the empty set. Try n = -42 with your for loop. So the length of the sequence wouldn't be uniquely defined. Obviously 0 makes the most sense, but the question is how one gets there from the manual.
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 On 17 September 2014 11:47, wrote:The main problem, however, is that any integer less than 1 produces the empty set. Try n = -42 with your for loop.Is that a problem?    \$ lua -e 'print(#{[-42]=true})'    0
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 On 2014-09-18 02:00, Daurnimator wrote: > On 17 September 2014 11:47, <[hidden email]> wrote: > >> The main problem, however, is that any integer less than 1 produces >> the >> empty set. Try n = -42 with your for loop. > > Is that a problem? Depends. a = {} a[#a + 1] = 'foo' Would you be happy with #a = -42? What about #a = (some random integer <= 0)?
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 In reply to this post by polyglot 2014-09-17 17:47 GMT+02:00  <[hidden email]>: > On 2014-09-18 01:02, Coda Highland wrote: >> >> >> This set of keys is {}, the empty set. It IS a member of the set of >> sets of the form {1..n}, with n = 0. > > > "We use the term sequence to denote a table where the set of all positive > numeric keys is equal to {1..n} for some integer n, which is called the > length of the sequence (see §3.4.6)." > > Taking n = 0 seems pathological to me, even though 0 is indeed an integer. > The main problem, however, is that any integer less than 1 produces the > empty set. Try n = -42 with your for loop. > > So the length of the sequence wouldn't be uniquely defined. Obviously > 0 makes the most sense, but the question is how one gets there from > the manual. If you want "non-negative" spelt out, you can find it in the 5.3 manual in §3.4.7.
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 In reply to this post by Steven Degutis On Sep 17, 2014, at 7:16 AM, Steven Degutis <[hidden email]> wrote:Quoth polyglot:5.2 REFERENCE MANUAL"Unless a __len metamethod is given, the length of a table t is only definedif the table is a sequence, that is, the set of its positive numeric keys isequal to {1..n} for some integer n. In that case, n is its length."*** Conclusion ***: Without a __len metamethod, an empty table has undefinedlength.Not so. An empty table is a still considered a sequence, just with noelements. The manual could be a little more clear on this, but it'snot saying the table must contain elements to be consideredsequential. It is only differentiating between types of non-emptytables.No, I think the OP is correct insofar as what the reference manual says. My reading of the strict definition of sequence is that an empty table is NOT a sequence, and therefore the result of # is indeed undefined, regardless of what it actually returns in practice.(And yes, I’m on record as not liking this definition.)As I have noted before, since Lua provides no means to determine if a table is a sequence, it follows that unless you have control over the creation of a table, the length of the table is undefined, since your code cannot know if it is a sequence. This makes writing robust APIs that consume 3rd party tables problematic.—Tim
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 On Wed, Sep 17, 2014 at 9:49 AM, Tim Hill <[hidden email]> wrote: > Quoth polyglot: > > 5.2 REFERENCE MANUAL > > "Unless a __len metamethod is given, the length of a table t is only defined > if the table is a sequence, that is, the set of its positive numeric keys is > equal to {1..n} for some integer n. In that case, n is its length." I'd say it's at least ambiguous, particularly given that Lua does not recognize a naked zero as an integer in its string library functions that return indices, but somewhat paradoxically recognizes -1 as such: "When indexing a string in Lua, the first character is at position 1 (not at 0, as in C)." Ref. Manual, 6.4. I realize that -1 is a token in that context but still the superficial illogic of recognizing a negative value as an integer whilst denying zero citizenship is kind of fun. ...  :-) Regards, Paul -- [Notice not included in the above original message:  The U.S. National Security Agency neither confirms nor denies that it intercepted this message.]
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 In reply to this post by Dirk Laurie-2 On 2014-09-18 02:36, Dirk Laurie wrote: > > If you want "non-negative" spelt out, you can find it in the 5.3 manual > in §3.4.7. Looks good to me, though obviously the definition of sequence in section 2.1 needs changing as well. Where does it leave 5.2 users? Should the 5.2 manual be updated?
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 In reply to this post by polyglot On Wed, Sep 17, 2014 at 8:47 AM,  <[hidden email]> wrote: > On 2014-09-18 01:02, Coda Highland wrote: >> >> >> This set of keys is {}, the empty set. It IS a member of the set of >> sets of the form {1..n}, with n = 0. > > > "We use the term sequence to denote a table where the set of all positive > numeric keys is equal to {1..n} for some integer n, which is called the > length of the sequence (see §3.4.6)." > > Taking n = 0 seems pathological to me, even though 0 is indeed an integer. > The main problem, however, is that any integer less than 1 produces the > empty set. Try n = -42 with your for loop. > > So the length of the sequence wouldn't be uniquely defined. Obviously > 0 makes the most sense, but the question is how one gets there from > the manual. I'm applying the strict rules of set logic here.  It's well-defined in mathematics -- it's a DEGENERATE case, yes, but it's not an exception. Would you complain that 1 isn't a power of 2 just because you can't get there by multiplying a nonzero number of twos? It's the same general concept. n = -42 doesn't count because the definition specifically says "positive numeric keys". A table containing { [-42] = "blah" } *is* a sequence. The -42 isn't PART of the sequence, but the table is a sequence. Likewise, { "a" = "blah" } is ALSO a sequence -- also a zero-length one. /s/ Adam
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 In reply to this post by polyglot On Sep 17, 2014, at 7:33 PM, [hidden email] wrote: > Where does it leave 5.2 users? "And yet it moves” —  Galileo Galilei, allegedly
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 In reply to this post by Coda Highland On 2014-09-18 04:07, Coda Highland wrote: > On Wed, Sep 17, 2014 at 8:47 AM,  <[hidden email]> wrote: >> "We use the term sequence to denote a table where the set of all >> positive >> numeric keys is equal to {1..n} for some integer n, which is called >> the >> length of the sequence (see §3.4.6)." > > n = -42 doesn't count because the definition specifically says > "positive numeric keys". Why would n = 0 count but n = -42 not count? They're both integers, and neither is a positive numeric key.
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 In reply to this post by polyglot 2014-09-17 19:33 GMT+02:00  <[hidden email]>: > On 2014-09-18 02:36, Dirk Laurie wrote: >> >> >> If you want "non-negative" spelt out, you can find it in the 5.3 manual >> in §3.4.7. > > > Looks good to me, though obviously the definition of sequence in section > 2.1 needs changing as well. > > Where does it leave 5.2 users? Should the 5.2 manual be updated? #angels_dancing_on_the_head_of_a_pin
Open this post in threaded view
|

## Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

 In reply to this post by polyglot On Wed, Sep 17, 2014 at 11:20 AM,  <[hidden email]> wrote: > On 2014-09-18 04:07, Coda Highland wrote: >> >> On Wed, Sep 17, 2014 at 8:47 AM,  <[hidden email]> wrote: >>> >>> "We use the term sequence to denote a table where the set of all positive >>> numeric keys is equal to {1..n} for some integer n, which is called the >>> length of the sequence (see §3.4.6)." >> >> >> n = -42 doesn't count because the definition specifically says >> "positive numeric keys". > > > Why would n = 0 count but n = -42 not count? They're both > integers, and neither is a positive numeric key. > Ah, sorry, that's an aliasing problem. Different contexts, and I should have been more cautious in how I phrased it. 0 is just as irrelevant to the sequenceness of a table as -42 is. The difference between 0 and -42 in the aforementioned loop iteration is that 0 is a cardinal number (that is, a number that can be used to describe the size of a set), but -42 is not. /s/ Adam