Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

classic Classic list List threaded Threaded
36 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

polyglot
5.2 REFERENCE MANUAL

"Unless a __len metamethod is given, the length of a table t is only
defined
if the table is a sequence, that is, the set of its positive numeric
keys is
equal to {1..n} for some integer n. In that case, n is its length."

*** Conclusion ***: Without a __len metamethod, an empty table has
undefined
length.

5.1 REFERENCE MANUAL

"The length of a table t is defined to be any integer index n such that
t[n]
is not nil and t[n+1] is nil; moreover, if t[1] is nil, n can be zero.
For a
regular array, with non-nil values from 1 to a given n, its length is
exactly that n, the index of its last value. If the array has "holes"
(that
is, nil values between other non-nil values), then #t can be any of the
indices that directly precedes a nil value (that is, it may consider any
such nil value as the end of the array)."

*** Conclusion ***: An empty table (t[1] is nil) may or may not have a
length of zero.

===========================================================================

So I'm assuming the following idiomatic code invokes undefined behavior:

a = {}
a[#a + 1] = 'foo'

As does this code (because of the implicit #a+1):

a = {}
table.insert(a, 'foo')

===========================================================================

Yet such code appears repeatedly in PIL (3rd edition), and I haven't
seen any errata for it:

PAGE 94

threads = {}    -- list of all live threads
-- [...]
   table.insert(threads, co)

PAGE 95

local timedout = {}
-- [...]
timedout[#timedout + 1] = res

PAGE 99 and PAGE 100

local words = {}
for w in pairs(counter) do
   words[#words + 1] = w
end

PAGE 113 and PAGE 114

local t = {}
for line in io.lines() do
   t[#t + 1] = -- [...]
end

PAGE 115

path = path or {}
visited = visited or {}
if visited[curr] then   -- node already visited?
   return nil            -- no path here
end
visited[curr] = true    -- mark node as visited
path[#path + 1] = curr  -- add it to path

PAGE 129

function Set.tostring (set)
   local l = {}     -- list to put all elements from the set
   for e in pairs(set) do
     l[#l + 1] = e
   end

PAGE 195

t = {}
for line in io.lines() do
    table.insert(t, line)
end

PAGE 196

a = {}
for n in pairs(lines) do a[#a + 1] = n end

PAGE 197

local a = {}
for n in pairs(t) do a[#a + 1] = n end

PAGE 202

local t = {}                    -- table to store the indices
local i = 0
while true do
   i = string.find(s, "\n", i+1) -- find next newline
   if i == nil then break end
   t[#t + 1] = i
end

PAGE 203

words = {}
for w in string.gmatch(s, "%a+") do
   words[#words + 1] = w
end

PAGE 212

function encode (t)
   local b = {}
   for k,v in pairs(t) do
     b[#b + 1] = (escape(k) .. "=" .. escape(v))
   end

PAGE 218

local a = {}
a[#a + 1] = -- [...]

PAGE 223

local lines = {}
-- read the lines in table 'lines'
for line in io.lines() do lines[#lines + 1] = line end



Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

Oliver Kroth
Hi, I typed

=#{}

into the Lua console and got:

0

which seems pretty OK to me as it has 0 elements before the first
non-existing.
--
Oliver

Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

Steven Degutis
In reply to this post by polyglot
Quoth polyglot:
> 5.2 REFERENCE MANUAL
>
> "Unless a __len metamethod is given, the length of a table t is only defined
> if the table is a sequence, that is, the set of its positive numeric keys is
> equal to {1..n} for some integer n. In that case, n is its length."
>
> *** Conclusion ***: Without a __len metamethod, an empty table has undefined
> length.

Not so. An empty table is a still considered a sequence, just with no
elements. The manual could be a little more clear on this, but it's
not saying the table must contain elements to be considered
sequential. It is only differentiating between types of non-empty
tables.

> 5.1 REFERENCE MANUAL
>
> "The length of a table t is defined to be any integer index n such that t[n]
> is not nil and t[n+1] is nil; moreover, if t[1] is nil, n can be zero. For a
> regular array, with non-nil values from 1 to a given n, its length is
> exactly that n, the index of its last value. If the array has "holes" (that
> is, nil values between other non-nil values), then #t can be any of the
> indices that directly precedes a nil value (that is, it may consider any
> such nil value as the end of the array)."
>
> *** Conclusion ***: An empty table (t[1] is nil) may or may not have a
> length of zero.

Not so. The phrase "can be zero" should not be taken to mean it may
sometimes not be zero when there are no other elements. Again, it is
speaking about the situation of the non-empty table which has
non-sequential data, i.e. t[1] is nil but t[2] is not nil. It could be
more clear about the empty-table case, but in my experience, this is
rather intuitive and has never caused any ambiguity in my mind.

> ===========================================================================
>
> So I'm assuming the following idiomatic code invokes undefined behavior:
>
> a = {}
> a[#a + 1] = 'foo'
>
> As does this code (because of the implicit #a+1):
>
> a = {}
> table.insert(a, 'foo')

Not so. Since all the examples you listed from the book hinge on this
principle, they are not invoking undefined behavior and are written
correctly.

-Steven

Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

Steven Degutis
In reply to this post by Oliver Kroth
Quoth Oliver Kroth:

> Hi, I typed
>
> =#{}
>
> into the Lua console and got:
>
> 0
>
> which seems pretty OK to me as it has 0 elements before the first
> non-existing.

I believe the OP is claiming that the result appears to be undefined
according to the manual, not that it can never be 0.

-Steven

Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

Coda Highland
In reply to this post by Steven Degutis
On Wed, Sep 17, 2014 at 7:16 AM, Steven Degutis <[hidden email]> wrote:

> Quoth polyglot:
>> 5.2 REFERENCE MANUAL
>>
>> "Unless a __len metamethod is given, the length of a table t is only defined
>> if the table is a sequence, that is, the set of its positive numeric keys is
>> equal to {1..n} for some integer n. In that case, n is its length."
>>
>> *** Conclusion ***: Without a __len metamethod, an empty table has undefined
>> length.
>
> Not so. An empty table is a still considered a sequence, just with no
> elements. The manual could be a little more clear on this, but it's
> not saying the table must contain elements to be considered
> sequential. It is only differentiating between types of non-empty
> tables.

More precisely:

What is the set of positive numeric keys for an empty table? Does the
set of positive numeric keys for an empty table obey this definition?

This set of keys is {}, the empty set. It IS a member of the set of
sets of the form {1..n}, with n = 0. (Try it with a for loop -- for n
= 1, 0 do print(n) end -- and see that this is true.)

Therefore, an empty table is a sequence, and it is a sequence of length 0.

And because an empty table is a sequence with length 0, the length
operator is defined on it.

/s/ Adam

Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

polyglot
In reply to this post by Steven Degutis
On 2014-09-18 00:16, Steven Degutis wrote:
> An empty table is a still considered a sequence

Hmm. I don't think that's immediately obvious from the definition
of a sequence:

"We use the term sequence to denote a table where the set of all
positive numeric keys is equal to {1..n} for some integer n, which
is called the length of the sequence (see §3.4.6)."

The most natural reading for me is that a sequence has at least
1 as a positive numeric key.

Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

polyglot
In reply to this post by Oliver Kroth
On 2014-09-18 00:16, Oliver Kroth wrote:

> Hi, I typed
>
> =#{}
>
> into the Lua console and got:
>
> 0
>
> which seems pretty OK to me as it has 0 elements before the first
> non-existing.
> --
> Oliver

Yeah, it certainly makes sense to me too, but I'm wondering whether the
reference manual needs a revision.

Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

polyglot
In reply to this post by Coda Highland
On 2014-09-18 01:02, Coda Highland wrote:
>
> This set of keys is {}, the empty set. It IS a member of the set of
> sets of the form {1..n}, with n = 0.

"We use the term sequence to denote a table where the set of all
positive
numeric keys is equal to {1..n} for some integer n, which is called the
length of the sequence (see §3.4.6)."

Taking n = 0 seems pathological to me, even though 0 is indeed an
integer.
The main problem, however, is that any integer less than 1 produces the
empty set. Try n = -42 with your for loop.

So the length of the sequence wouldn't be uniquely defined. Obviously
0 makes the most sense, but the question is how one gets there from
the manual.


Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

Daurnimator
On 17 September 2014 11:47, <[hidden email]> wrote:
The main problem, however, is that any integer less than 1 produces the
empty set. Try n = -42 with your for loop.

Is that a problem?

    $ lua -e 'print(#{[-42]=true})'
    0
Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

polyglot
On 2014-09-18 02:00, Daurnimator wrote:
> On 17 September 2014 11:47, <[hidden email]> wrote:
>
>> The main problem, however, is that any integer less than 1 produces
>> the
>> empty set. Try n = -42 with your for loop.
>
> Is that a problem?

Depends.

a = {}
a[#a + 1] = 'foo'

Would you be happy with #a = -42? What about #a = (some random integer
<= 0)?

Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

Dirk Laurie-2
In reply to this post by polyglot
2014-09-17 17:47 GMT+02:00  <[hidden email]>:

> On 2014-09-18 01:02, Coda Highland wrote:
>>
>>
>> This set of keys is {}, the empty set. It IS a member of the set of
>> sets of the form {1..n}, with n = 0.
>
>
> "We use the term sequence to denote a table where the set of all positive
> numeric keys is equal to {1..n} for some integer n, which is called the
> length of the sequence (see §3.4.6)."
>
> Taking n = 0 seems pathological to me, even though 0 is indeed an integer.
> The main problem, however, is that any integer less than 1 produces the
> empty set. Try n = -42 with your for loop.
>
> So the length of the sequence wouldn't be uniquely defined. Obviously
> 0 makes the most sense, but the question is how one gets there from
> the manual.

If you want "non-negative" spelt out, you can find it in the 5.3 manual
in §3.4.7.

Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

Tim Hill
In reply to this post by Steven Degutis

On Sep 17, 2014, at 7:16 AM, Steven Degutis <[hidden email]> wrote:

Quoth polyglot:
5.2 REFERENCE MANUAL

"Unless a __len metamethod is given, the length of a table t is only defined
if the table is a sequence, that is, the set of its positive numeric keys is
equal to {1..n} for some integer n. In that case, n is its length."

*** Conclusion ***: Without a __len metamethod, an empty table has undefined
length.

Not so. An empty table is a still considered a sequence, just with no
elements. The manual could be a little more clear on this, but it's
not saying the table must contain elements to be considered
sequential. It is only differentiating between types of non-empty
tables.

No, I think the OP is correct insofar as what the reference manual says. My reading of the strict definition of sequence is that an empty table is NOT a sequence, and therefore the result of # is indeed undefined, regardless of what it actually returns in practice.

(And yes, I’m on record as not liking this definition.)

As I have noted before, since Lua provides no means to determine if a table is a sequence, it follows that unless you have control over the creation of a table, the length of the table is undefined, since your code cannot know if it is a sequence. This makes writing robust APIs that consume 3rd party tables problematic.

—Tim

Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

Paul E. Merrell, J.D.
On Wed, Sep 17, 2014 at 9:49 AM, Tim Hill <[hidden email]> wrote:
> Quoth polyglot:
>
> 5.2 REFERENCE MANUAL
>
> "Unless a __len metamethod is given, the length of a table t is only defined
> if the table is a sequence, that is, the set of its positive numeric keys is
> equal to {1..n} for some integer n. In that case, n is its length."

I'd say it's at least ambiguous, particularly given that Lua does not
recognize a naked zero as an integer in its string library functions
that return indices, but somewhat paradoxically recognizes -1 as such:
"When indexing a string in Lua, the first character is at position 1
(not at 0, as in C)." Ref. Manual, 6.4.

I realize that -1 is a token in that context but still the superficial
illogic of recognizing a negative value as an integer whilst denying
zero citizenship is kind of fun. ...  :-)

Regards,

Paul

--
[Notice not included in the above original message:  The U.S. National
Security Agency neither confirms nor denies that it intercepted this
message.]

Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

polyglot
In reply to this post by Dirk Laurie-2
On 2014-09-18 02:36, Dirk Laurie wrote:
>
> If you want "non-negative" spelt out, you can find it in the 5.3 manual
> in §3.4.7.

Looks good to me, though obviously the definition of sequence in section
2.1 needs changing as well.

Where does it leave 5.2 users? Should the 5.2 manual be updated?

Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

Coda Highland
In reply to this post by polyglot
On Wed, Sep 17, 2014 at 8:47 AM,  <[hidden email]> wrote:

> On 2014-09-18 01:02, Coda Highland wrote:
>>
>>
>> This set of keys is {}, the empty set. It IS a member of the set of
>> sets of the form {1..n}, with n = 0.
>
>
> "We use the term sequence to denote a table where the set of all positive
> numeric keys is equal to {1..n} for some integer n, which is called the
> length of the sequence (see §3.4.6)."
>
> Taking n = 0 seems pathological to me, even though 0 is indeed an integer.
> The main problem, however, is that any integer less than 1 produces the
> empty set. Try n = -42 with your for loop.
>
> So the length of the sequence wouldn't be uniquely defined. Obviously
> 0 makes the most sense, but the question is how one gets there from
> the manual.

I'm applying the strict rules of set logic here.  It's well-defined in
mathematics -- it's a DEGENERATE case, yes, but it's not an exception.
Would you complain that 1 isn't a power of 2 just because you can't
get there by multiplying a nonzero number of twos? It's the same
general concept.

n = -42 doesn't count because the definition specifically says
"positive numeric keys". A table containing { [-42] = "blah" } *is* a
sequence. The -42 isn't PART of the sequence, but the table is a
sequence. Likewise, { "a" = "blah" } is ALSO a sequence -- also a
zero-length one.

/s/ Adam

Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

Petite Abeille
In reply to this post by polyglot

On Sep 17, 2014, at 7:33 PM, [hidden email] wrote:

> Where does it leave 5.2 users?

"And yet it moves” —  Galileo Galilei, allegedly


Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

polyglot
In reply to this post by Coda Highland
On 2014-09-18 04:07, Coda Highland wrote:
> On Wed, Sep 17, 2014 at 8:47 AM,  <[hidden email]> wrote:
>> "We use the term sequence to denote a table where the set of all
>> positive
>> numeric keys is equal to {1..n} for some integer n, which is called
>> the
>> length of the sequence (see §3.4.6)."
>
> n = -42 doesn't count because the definition specifically says
> "positive numeric keys".

Why would n = 0 count but n = -42 not count? They're both
integers, and neither is a positive numeric key.

Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

Dirk Laurie-2
In reply to this post by polyglot
2014-09-17 19:33 GMT+02:00  <[hidden email]>:

> On 2014-09-18 02:36, Dirk Laurie wrote:
>>
>>
>> If you want "non-negative" spelt out, you can find it in the 5.3 manual
>> in §3.4.7.
>
>
> Looks good to me, though obviously the definition of sequence in section
> 2.1 needs changing as well.
>
> Where does it leave 5.2 users? Should the 5.2 manual be updated?

#angels_dancing_on_the_head_of_a_pin

Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

Coda Highland
In reply to this post by polyglot
On Wed, Sep 17, 2014 at 11:20 AM,  <[hidden email]> wrote:

> On 2014-09-18 04:07, Coda Highland wrote:
>>
>> On Wed, Sep 17, 2014 at 8:47 AM,  <[hidden email]> wrote:
>>>
>>> "We use the term sequence to denote a table where the set of all positive
>>> numeric keys is equal to {1..n} for some integer n, which is called the
>>> length of the sequence (see §3.4.6)."
>>
>>
>> n = -42 doesn't count because the definition specifically says
>> "positive numeric keys".
>
>
> Why would n = 0 count but n = -42 not count? They're both
> integers, and neither is a positive numeric key.
>

Ah, sorry, that's an aliasing problem. Different contexts, and I
should have been more cautious in how I phrased it. 0 is just as
irrelevant to the sequenceness of a table as -42 is.

The difference between 0 and -42 in the aforementioned loop iteration
is that 0 is a cardinal number (that is, a number that can be used to
describe the size of a set), but -42 is not.

/s/ Adam

Reply | Threaded
Open this post in threaded view
|

Re: Does PIL (3rd edition) repeatedly misuse the length operator on tables and invoke undefined behavior?

polyglot
On 2014-09-18 05:17, Coda Highland wrote:
>
> The difference between 0 and -42 in the aforementioned loop iteration
> is that 0 is a cardinal number (that is, a number that can be used to
> describe the size of a set), but -42 is not.

That's why 0 makes more sense than -42, but the trouble with undefined
behavior is that it's not required to do what makes more sense. The
length value of -42 for an empty table seems perfectly consistent
with the wording of the 5.2 reference manual.

Just going by the 5.2 reference manual, I wouldn't know whether I
could depend on #table being 0 for an empty table. I'd have to peek
under the hood of each particular Lua implementation, and
portability wouldn't be assured.

PIL does a better job with this when it says "In particular, a
table with no [positive] numeric keys is a sequence with length
zero." I think such a direct statement in the normative reference
would be very helpful.

Anyway, practically speaking, I feel my question has been answered,
and I'll continue assuming my empty tables have length zero. :-)


12