table.sort gives wrong result on partially ordered set

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

table.sort gives wrong result on partially ordered set

Egor Skriptunoff-2
$ lua
Lua 5.4.2  Copyright (C) 1994-2020 Lua.org, PUC-Rio
> nan = 0/0
> t = {nan, nan, 20, 10}
> table.sort(t)
> print(table.concat(t, ", "))
-nan, 20, -nan, 10

Reply | Threaded
Open this post in threaded view
|

Re: table.sort gives wrong result on partially ordered set

Jonathan Goble
On Sun, Jan 3, 2021 at 5:48 PM Egor Skriptunoff <[hidden email]> wrote:
$ lua
Lua 5.4.2  Copyright (C) 1994-2020 Lua.org, PUC-Rio
> nan = 0/0
> t = {nan, nan, 20, 10}
> table.sort(t)
> print(table.concat(t, ", "))
-nan, 20, -nan, 10


Garbage in, garbage out. Every sort function I've ever seen has undefined behavior when you introduce NaNs, because a < b and b < a are both false when either is a NaN, so the sort function gets confused.

Don't feed NaNs to a sort function, or if you must, define your own comparison function that detects NaNs and treats them specially (e.g by designating them as less than any other value).
Reply | Threaded
Open this post in threaded view
|

Re: table.sort gives wrong result on partially ordered set

Andrew Gierth
In reply to this post by Egor Skriptunoff-2
>>>>> "Egor" == Egor Skriptunoff <[hidden email]> writes:

 > $ lua
 > Lua 5.4.2  Copyright (C) 1994-2020 Lua.org, PUC-Rio
 >> nan = 0/0
 >> t = {nan, nan, 20, 10}
 >> table.sort(t)
 >> print(table.concat(t, ", "))
 > -nan, 20, -nan, 10

That's fairly standard for sort functions; the comparison function must
be at least a strict weak ordering if not a total order, a partial order
isn't sufficient, and if the comparison function isn't adequate then the
result may be garbage (or for some sort functions in some languages, it
may even crash).

The standard NaN semantics where all comparisons against NaN are false
means that sorting lists containing NaN is impossible; but testing for
this case would complicate the sort function.

The specific requirement that is violated is that the < relation used
for the sort must be a strict partial order with this additional
requirement: the relation (not (a < b or b < a)) must be an equivalence
relation. For the conventional < operator in the presence of NaN, this
is not the case: define (a EQ b) as (not (a < b or b < a)), and we find
that (a EQ nan) and (nan EQ b) is always true, but (a EQ b) may be false.

--
Andrew.
Reply | Threaded
Open this post in threaded view
|

Re: table.sort gives wrong result on partially ordered set

Egor Skriptunoff-2
In reply to this post by Jonathan Goble
On Mon, Jan 4, 2021 at 2:09 AM Jonathan Goble wrote:
Every sort function I've ever seen has undefined behavior when you introduce NaNs

Numbers with NaN is just the simplest example.
The same problem will be observed with table.sort() on many other partially ordered sets (using custom sort function).


Don't feed NaNs to a sort function

Why not?
The Lua manual claims that table.sort() can work with partially ordered sets.
Probably it should not have promised that :-)


BTW, why doesn't table.sort() return the table anymore as it was in Lua 5.1?
Reply | Threaded
Open this post in threaded view
|

Re: table.sort gives wrong result on partially ordered set

Egor Skriptunoff-2
In reply to this post by Andrew Gierth
On Mon, Jan 4, 2021 at 2:19 AM Andrew Gierth wrote:
The specific requirement that is violated is that the < relation used
for the sort must be a strict partial order with this additional
requirement: the relation (not (a < b or b < a)) must be an equivalence
relation.

Interesting.
But this condition is too complex to be included in the manual.
It would be more natural to simply replace "strict partial order" with "strict total order" :-)
Reply | Threaded
Open this post in threaded view
|

Re: table.sort gives wrong result on partially ordered set

Spar
In reply to this post by Egor Skriptunoff-2
It was returning new table in Lua 3.2.
Lua manual for 4.0 says:

setglobal and sort no longer return a value;
On 4 Jan 2021, 02:09 +0300, wrote:

BTW, why doesn't table.sort() return the table anymore as it was in Lua 5.1?
Reply | Threaded
Open this post in threaded view
|

Re: table.sort gives wrong result on partially ordered set

Andrew Gierth
In reply to this post by Egor Skriptunoff-2
>>>>> "Egor" == Egor Skriptunoff <[hidden email]> writes:

 >> The specific requirement that is violated is that the < relation
 >> used for the sort must be a strict partial order with this
 >> additional requirement: the relation (not (a < b or b < a)) must be
 >> an equivalence relation.

 Egor> Interesting.
 Egor> But this condition is too complex to be included in the manual.
 Egor> It would be more natural to simply replace "strict partial order"
 Egor> with "strict total order" :-)

"strict weak order" is the correct minimum requirement - a total order
is not necessary. (A strict weak order can be considered as a total
order over classes of equivalent elements; if you're sorting data where
distinguishable values can sort equally, e.g. objects sorted by some
property value, this distinction matters.)

The documentation is incorrect to state that a strict partial order is
sufficient. (I believe I may have pointed this out before.)

--
Andrew.
Reply | Threaded
Open this post in threaded view
|

Re: table.sort gives wrong result on partially ordered set

Roberto Ierusalimschy
In reply to this post by Egor Skriptunoff-2
> $ lua
> Lua 5.4.2  Copyright (C) 1994-2020 Lua.org, PUC-Rio
> > nan = 0/0
> > t = {nan, nan, 20, 10}
> > table.sort(t)
> > print(table.concat(t, ", "))
> -nan, 20, -nan, 10

Thanks for the report.

-- Roberto
Reply | Threaded
Open this post in threaded view
|

Re: table.sort gives wrong result on partially ordered set

Roberto Ierusalimschy
In reply to this post by Andrew Gierth
> "strict weak order" is the correct minimum requirement - a total order
> is not necessary. (A strict weak order can be considered as a total
> order over classes of equivalent elements; if you're sorting data where
> distinguishable values can sort equally, e.g. objects sorted by some
> property value, this distinction matters.)
>
> The documentation is incorrect to state that a strict partial order is
> sufficient. (I believe I may have pointed this out before.)

Thanks for the explanation. (I don't think you did.)

-- Roberto
Reply | Threaded
Open this post in threaded view
|

Re: table.sort gives wrong result on partially ordered set

Philippe Verdy-2
In reply to this post by Andrew Gierth
Comparing NaN with "<=" or ">=" is not significant at all, these operators are not suitable even for a partial order.

Sorting however requires a partial order. The sort function by default assumes that "<=" and ">=" are the logical negation of ">" and "<".

A total order is not always required for sorting (but it is needed if you want a "stable" sort). For a total order, you also need a semantic for "==" (which must be the logical inverse of "!="), something that is also false by default with NaN.

I wonder if the default sort() function should not already have a builtin test for NaN, and a *boolean* parameter saying how we want to sort them (lower than everything by default when it is nil/false, or higher than everything otherwise).

This way no need to specify a custom compare function, it would be implemented natively, may be faster than when using a custom compare function in Lua, and calls to the custom compare functions could be avoided when one of the two values is NaN (and unless you want for a "stable" total order, it can be avoided when both values are NaN; this could optimize for performance in many cases where NaN are very frequent, notably with data selected from database tables with outer joins, i.e. for relations with 0-1 or 0-N cardinality, something that occurs extremely frequently in many apps).

Le lun. 4 janv. 2021 à 00:19, Andrew Gierth <[hidden email]> a écrit :
>>>>> "Egor" == Egor Skriptunoff <[hidden email]> writes:

 > $ lua
 > Lua 5.4.2  Copyright (C) 1994-2020 Lua.org, PUC-Rio
 >> nan = 0/0
 >> t = {nan, nan, 20, 10}
 >> table.sort(t)
 >> print(table.concat(t, ", "))
 > -nan, 20, -nan, 10

That's fairly standard for sort functions; the comparison function must
be at least a strict weak ordering if not a total order, a partial order
isn't sufficient, and if the comparison function isn't adequate then the
result may be garbage (or for some sort functions in some languages, it
may even crash).

The standard NaN semantics where all comparisons against NaN are false
means that sorting lists containing NaN is impossible; but testing for
this case would complicate the sort function.

The specific requirement that is violated is that the < relation used
for the sort must be a strict partial order with this additional
requirement: the relation (not (a < b or b < a)) must be an equivalence
relation. For the conventional < operator in the presence of NaN, this
is not the case: define (a EQ b) as (not (a < b or b < a)), and we find
that (a EQ nan) and (nan EQ b) is always true, but (a EQ b) may be false.

--
Andrew.
Reply | Threaded
Open this post in threaded view
|

Re: table.sort gives wrong result on partially ordered set

Egor Skriptunoff-2
In reply to this post by Roberto Ierusalimschy
On Mon, Jan 4, 2021 at 4:26 PM Roberto Ierusalimschy wrote:
Thanks for the report.

It seems that Lua team had known the truth in 2013 :-)
http://lua-users.org/lists/lua-l/2013-03/msg00532.html