|
|
$ lua Lua 5.4.2 Copyright (C) 1994-2020 Lua.org, PUC-Rio > nan = 0/0 > t = {nan, nan, 20, 10} > table.sort(t) > print(table.concat(t, ", ")) -nan, 20, -nan, 10
|
|
On Sun, Jan 3, 2021 at 5:48 PM Egor Skriptunoff < [hidden email]> wrote: $ lua Lua 5.4.2 Copyright (C) 1994-2020 Lua.org, PUC-Rio > nan = 0/0 > t = {nan, nan, 20, 10} > table.sort(t) > print(table.concat(t, ", ")) -nan, 20, -nan, 10
Garbage in, garbage out. Every sort function I've ever seen has undefined behavior when you introduce NaNs, because a < b and b < a are both false when either is a NaN, so the sort function gets confused.
Don't feed NaNs to a sort function, or if you must, define your own comparison function that detects NaNs and treats them specially (e.g by designating them as less than any other value).
|
|
>>>>> "Egor" == Egor Skriptunoff < [hidden email]> writes:
> $ lua
> Lua 5.4.2 Copyright (C) 1994-2020 Lua.org, PUC-Rio
>> nan = 0/0
>> t = {nan, nan, 20, 10}
>> table.sort(t)
>> print(table.concat(t, ", "))
> -nan, 20, -nan, 10
That's fairly standard for sort functions; the comparison function must
be at least a strict weak ordering if not a total order, a partial order
isn't sufficient, and if the comparison function isn't adequate then the
result may be garbage (or for some sort functions in some languages, it
may even crash).
The standard NaN semantics where all comparisons against NaN are false
means that sorting lists containing NaN is impossible; but testing for
this case would complicate the sort function.
The specific requirement that is violated is that the < relation used
for the sort must be a strict partial order with this additional
requirement: the relation (not (a < b or b < a)) must be an equivalence
relation. For the conventional < operator in the presence of NaN, this
is not the case: define (a EQ b) as (not (a < b or b < a)), and we find
that (a EQ nan) and (nan EQ b) is always true, but (a EQ b) may be false.
--
Andrew.
|
|
On Mon, Jan 4, 2021 at 2:09 AM Jonathan Goble wrote: Every sort function I've ever seen has undefined behavior when you introduce NaNs
Numbers with NaN is just the simplest example.
The same problem will be observed with table.sort() on many other partially ordered sets (using custom sort function).
Don't feed NaNs to a sort function
Why not? The Lua manual claims that table.sort() can work with partially ordered sets.
Probably it should not have promised that :-)
BTW, why doesn't table.sort() return the table anymore as it was in Lua 5.1?
|
|
On Mon, Jan 4, 2021 at 2:19 AM Andrew Gierth wrote:
The specific requirement that is violated is that the < relation used
for the sort must be a strict partial order with this additional
requirement: the relation (not (a < b or b < a)) must be an equivalence
relation.
Interesting.
But this condition is too complex to be included in the manual.
It would be more natural to simply replace "strict partial order" with "strict total order" :-)
|
|
It was returning new table in Lua 3.2.
Lua manual for 4.0 says:
setglobal and sort no longer return a value;
On 4 Jan 2021, 02:09 +0300, wrote:
BTW, why doesn't table.sort() return the table anymore as it was in Lua 5.1?
|
|
>>>>> "Egor" == Egor Skriptunoff < [hidden email]> writes:
>> The specific requirement that is violated is that the < relation
>> used for the sort must be a strict partial order with this
>> additional requirement: the relation (not (a < b or b < a)) must be
>> an equivalence relation.
Egor> Interesting.
Egor> But this condition is too complex to be included in the manual.
Egor> It would be more natural to simply replace "strict partial order"
Egor> with "strict total order" :-)
"strict weak order" is the correct minimum requirement - a total order
is not necessary. (A strict weak order can be considered as a total
order over classes of equivalent elements; if you're sorting data where
distinguishable values can sort equally, e.g. objects sorted by some
property value, this distinction matters.)
The documentation is incorrect to state that a strict partial order is
sufficient. (I believe I may have pointed this out before.)
--
Andrew.
|
|
> $ lua
> Lua 5.4.2 Copyright (C) 1994-2020 Lua.org, PUC-Rio
> > nan = 0/0
> > t = {nan, nan, 20, 10}
> > table.sort(t)
> > print(table.concat(t, ", "))
> -nan, 20, -nan, 10
Thanks for the report.
-- Roberto
|
|
> "strict weak order" is the correct minimum requirement - a total order
> is not necessary. (A strict weak order can be considered as a total
> order over classes of equivalent elements; if you're sorting data where
> distinguishable values can sort equally, e.g. objects sorted by some
> property value, this distinction matters.)
>
> The documentation is incorrect to state that a strict partial order is
> sufficient. (I believe I may have pointed this out before.)
Thanks for the explanation. (I don't think you did.)
-- Roberto
|
|
Comparing NaN with "<=" or ">=" is not significant at all, these operators are not suitable even for a partial order. Sorting however requires a partial order. The sort function by default assumes that "<=" and ">=" are the logical negation of ">" and "<".
A total order is not always required for sorting (but it is needed if you want a "stable" sort). For a total order, you also need a semantic for "==" (which must be the logical inverse of "!="), something that is also false by default with NaN.
I wonder if the default sort() function should not already have a builtin test for NaN, and a *boolean* parameter saying how we want to sort them (lower than everything by default when it is nil/false, or higher than everything otherwise).
This way no need to specify a custom compare function, it would be implemented natively, may be faster than when using a custom compare function in Lua, and calls to the custom compare functions could be avoided when one of the two values is NaN (and unless you want for a "stable" total order, it can be avoided when both values are NaN; this could optimize for performance in many cases where NaN are very frequent, notably with data selected from database tables with outer joins, i.e. for relations with 0-1 or 0-N cardinality, something that occurs extremely frequently in many apps). Le lun. 4 janv. 2021 à 00:19, Andrew Gierth < [hidden email]> a écrit : >>>>> "Egor" == Egor Skriptunoff <[hidden email]> writes:
> $ lua
> Lua 5.4.2 Copyright (C) 1994-2020 Lua.org, PUC-Rio
>> nan = 0/0
>> t = {nan, nan, 20, 10}
>> table.sort(t)
>> print(table.concat(t, ", "))
> -nan, 20, -nan, 10
That's fairly standard for sort functions; the comparison function must
be at least a strict weak ordering if not a total order, a partial order
isn't sufficient, and if the comparison function isn't adequate then the
result may be garbage (or for some sort functions in some languages, it
may even crash).
The standard NaN semantics where all comparisons against NaN are false
means that sorting lists containing NaN is impossible; but testing for
this case would complicate the sort function.
The specific requirement that is violated is that the < relation used
for the sort must be a strict partial order with this additional
requirement: the relation (not (a < b or b < a)) must be an equivalence
relation. For the conventional < operator in the presence of NaN, this
is not the case: define (a EQ b) as (not (a < b or b < a)), and we find
that (a EQ nan) and (nan EQ b) is always true, but (a EQ b) may be false.
--
Andrew.
|
|
In reply to this post by Roberto Ierusalimschy
On Mon, Jan 4, 2021 at 4:26 PM Roberto Ierusalimschy wrote:
|
|