Nitpicking about string.sub argument validity check

Egor Skriptunoff-2
Hi!
 
It is reasonable to warn the user about "string.sub" usage mistakes such as
   str:sub(1, 3.14)
which is why in Lua 5.3 string.sub may raise the "number has no integer representation" error.
 
But it appears that math.huge as an argument also raises this error.
IMO, the following use cases should be considered valid:
   string.sub("abc", 2, math.huge)
   string.sub("abc", -math.huge, -2)
Yes, I know we could use -1 and 1 respectively to get the same results,
but infinite values fit the semantics of the function.
 
It also looks strange that
   ("abc"):sub(2, 10^15)
   ("abc"):sub(-10^15, -2)
are OK, but
   ("abc"):sub(2, 10^155)
   ("abc"):sub(-10^155, -2)
raise the error.
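
The difference between these cases can be checked directly. A minimal sketch, assuming Lua 5.3 or later (math.tointeger does not exist in earlier versions); the helper name accepts_position is made up for illustration:

```lua
-- Hypothetical helper: reports whether string.sub accepts `pos` as its
-- end position without raising an error (assumes Lua 5.3+).
local function accepts_position(pos)
   return (pcall(string.sub, "abc", 2, pos))
end

print(accepts_position(10^15))      -- float with an exact integer value
print(accepts_position(10^155))     -- float outside the integer range
print(accepts_position(math.huge))  -- infinity on typical builds
print(accepts_position(3.14))       -- float with a fractional part

-- math.tointeger returns nil for floats that have no integer representation,
-- which is the same condition string.sub complains about.
print(math.tointeger(10^15), math.tointeger(10^155), math.tointeger(math.huge))
```

Only the first call is accepted; the other three hit the same "no integer representation" check, which is why 10^155 and math.huge are treated like 3.14.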

The suggestion:
1) any float position beyond the integer range should be treated as an infinite value;
2) an infinite value should be considered valid despite not having an integer representation.
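
Until something like this is in the library, the two rules can be approximated in user code. This is only a sketch of the suggestion, not how string.sub is implemented; it assumes Lua 5.3+, and the names to_position and sub_clamped are invented here:

```lua
-- Hypothetical helper: turn a position into an integer, saturating floats
-- that lie outside the integer range (including +/- infinity).
local function to_position(x)
   if math.type(x) == "integer" then return x end
   assert(math.type(x) == "float", "number expected")
   if x >= math.maxinteger + 0.0 then return math.maxinteger end
   if x <= math.mininteger + 0.0 then return math.mininteger end
   -- In-range floats must still be whole numbers; NaN and 3.14 are rejected.
   return assert(math.tointeger(x), "number has no integer representation")
end

-- Hypothetical wrapper around string.sub that applies the rules above.
local function sub_clamped(s, i, j)
   return string.sub(s, to_position(i), to_position(j or -1))
end

print(sub_clamped("abc", 2, math.huge))   -- same result as ("abc"):sub(2, -1)
print(sub_clamped("abc", -math.huge, -2)) -- same result as ("abc"):sub(1, -2)
```

A position such as 3.14 still raises an error, so the usage-mistake check is kept.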
 
 
P.S.
A small bonus (a hidden surprise for users with a lot of RAM): string.rep is unable to repeat a string 2^31 times.
Why?
(Tested on Lua 5.3 on an average PC with 16 GB of RAM.)
 
> local a = ("a"):rep(2^30); a = a..a; print(#a)
2147483648
> local a = ("a"):rep(2^31); print(#a)
stdin:1: resulting string too large
stack traceback:
        [C]: in function 'string.rep'
        stdin:1: in main chunk
        [C]: in ?


Re: Nitpicking about string.sub argument validity check

Albert Chan


On Aug 31, 2018, at 5:03 PM, Egor Skriptunoff <[hidden email]> wrote:

> Hi!
>  
> It is reasonable to warn user about "string.sub" usage mistakes such as
>    str:sub(1, 3.14)
> that's why in Lua 5.3 string.sub might raise "number has no integer representation" error.
>  
> But it appears that math.huge as argument also raises this error.
> IMO, the following use cases should be considered as being valid:
>    string.sub("abc", 2, math.huge)
>    string.sub("abc", -math.huge, -2)
> Yes, I know we could use -1 and 1 values respectively to get the same results,
> but infinite values are suitable for the semantic of the function.

How can one tell that 3.14 is a mistake and math.huge is not?

I just feel uneasy about using infinity as an end-point ...
It is especially bad if the infinite value is hidden behind a variable.

What is gained by not using 1 for the first position and -1 for the last?



Re: Nitpicking about string.sub argument validity check

Egor Skriptunoff-2
On Sat, Sep 1, 2018 at 1:09 AM, Albert Chan wrote:
> > But it appears that math.huge as argument also raises this error.
> > IMO, the following use cases should be considered as being valid:
> >    string.sub("abc", 2, math.huge)
> >    string.sub("abc", -math.huge, -2)
> > Yes, I know we could use -1 and 1 values respectively to get the same results,
> > but infinite values are suitable for the semantic of the function.
>
> I just felt uneasy using infinity as end-point ...
> Especially bad if the infinity value is hidden behind a variable.
>
> What is gain by not using 1 for first location, -1 for last ?

local function get_suffix_of_length(str, len)
   return str:sub(-len)
end

get_suffix_of_length(s, math.huge)  -- Easily understandable
get_suffix_of_length(s, -1)  -- What does "the suffix of length (-1)" mean?

So, the gain is readability.
We get the expected behavior when the variable suffix_length = math.huge.
We don't need to introduce a magic number in the documentation: "(-1) means maximal length".
Infinity is a very natural value; every user is aware of its meaning.

Re: Nitpicking about string.sub argument validity check

Sean Conner
It was thus said that the Great Egor Skriptunoff once stated:

> On Sat, Sep 1, 2018 at 1:09 AM, Albert Chan wrote:
>
> > > But it appears that math.huge as argument also raises this error.
> > > IMO, the following use cases should be considered as being valid:
> > >    string.sub("abc", 2, math.huge)
> > >    string.sub("abc", -math.huge, -2)
> > > Yes, I know we could use -1 and 1 values respectively to get the same results,
> > > but infinite values are suitable for the semantic of the function.
> >
> > I just felt uneasy using infinity as end-point ...
> > Especially bad if the infinity value is hidden behind a variable.
> >
> > What is gain by not using 1 for first location, -1 for last ?
> >
>
> local function get_suffix_of_length(str, len)
>    return str:sub(-len)
> end
>
> get_suffix_of_length(s, math.huge)  -- Easily understandable
> get_suffix_of_length(s, -1)  -- What does "the suffix of length (-1)" mean?

  If the code is expected to run under Lua 5.3, then why not

        get_suffix_of_length(s,math.maxinteger) --?

  And if you want to run it under previous versions:

        get_suffix_of_length(s,math.maxinteger or -1)

which shows intent *and* is backwards compatible (yes, the -1 looks odd).
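
  A quick check of that fallback, as a sketch only. It reuses the get_suffix_of_length function quoted above; the name WHOLE is made up here. On versions before 5.3, math.maxinteger is nil, so the "or" picks -1 and the call becomes str:sub(1):

```lua
local function get_suffix_of_length(str, len)
   return str:sub(-len)
end

-- Hypothetical constant: "as long as possible" on any Lua version.
-- math.maxinteger exists from Lua 5.3 on; before that the expression yields -1.
local WHOLE = math.maxinteger or -1

print(get_suffix_of_length("abcdef", 2))      -- last two characters
print(get_suffix_of_length("abcdef", WHOLE))  -- the whole string
```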

  -spc



Re: Nitpicking about string.sub argument validity check

Albert Chan

>> local function get_suffix_of_length(str, len)
>>  return str:sub(-len)
>> end
>>
>> get_suffix_of_length(s, math.huge)  -- Easily understandable
>> get_suffix_of_length(s, -1)  -- What does "the suffix of length (-1)" mean?
>
> If the code is expected to run under Lua 5.3, then why not
>
>   get_suffix_of_length(s,math.maxinteger) --?
>
> And if you want to run it under previous versions:
>
>   get_suffix_of_length(s,math.maxinteger or -1)
>
> which shows intent *and* is backwards compatible (yes, the -1 looks odd).
>
> -spc

I did not even know that Lua "trims edges" like that.
I think the exact length is more readable, and I would skip this "feature".

> get_suffix_of_length(s, #s)



Re: Nitpicking about string.sub argument validity check

Lorenzo Donati-3
In reply to this post by Egor Skriptunoff-2
On 01/09/2018 00:56, Egor Skriptunoff wrote:

> On Sat, Sep 1, 2018 at 1:09 AM, Albert Chan wrote:
>
>>> But it appears that math.huge as argument also raises this error.
>>> IMO, the following use cases should be considered as being valid:
>>>    string.sub("abc", 2, math.huge)
>>>    string.sub("abc", -math.huge, -2)
>>> Yes, I know we could use -1 and 1 values respectively to get the same results,
>>> but infinite values are suitable for the semantic of the function.
>>
>> I just felt uneasy using infinity as end-point ...
>> Especially bad if the infinity value is hidden behind a variable.
>>
>> What is gain by not using 1 for first location, -1 for last ?
>>
>
> local function get_suffix_of_length(str, len)
>    return str:sub(-len)
> end
>
> get_suffix_of_length(s, math.huge)  -- Easily understandable
> get_suffix_of_length(s, -1)  -- What does "the suffix of length (-1)" mean?
>
> So, the gain is readability.
> We have expected behavior for the case when variable suffix_length = math.huge
> We don't need to introduce magic number in a documentation: "(-1) means maximal length"
> Infinity is very natural value, every user is aware of its meaning.
>

The problem is that math.huge is not guaranteed to be, semantically,
infinity. The docs say:

"The float value HUGE_VAL, a value larger than any other numeric value. "

This doesn't even guarantee that it is an IEEE754 "inf" value: you may
be working on a non-IEEE754-compliant machine, or you may have compiled
Lua with different float support, or whatever.

This may lead to programs that behave differently on different
platforms/machines/Lua versions (if you care about that).

As Sean Conner said in another message, you may use math.maxinteger,
which is more explicit.
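
If you want to know what a given build actually does, a rough runtime check is possible. This is a sketch under the assumption that dividing 1 by an infinity yields exactly zero (true for IEEE754 arithmetic); the function name is_infinite is made up:

```lua
-- Hypothetical helper: true only if x behaves like an infinity,
-- i.e. its reciprocal is exactly zero. A NaN or any finite value gives false.
local function is_infinite(x)
   return 1 / x == 0
end

print(is_infinite(math.huge))  -- true on an IEEE754 build
print(is_infinite(1e308))      -- false: large, but finite
```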

Cheers!

-- Lorenzo




Re: Nitpicking about string.sub argument validity check

Egor Skriptunoff-2
On Mon, Sep 3, 2018 at 12:07 PM, Lorenzo Donati wrote:
> The problem is that math.huge it is not guaranteed to be, semantically, infinity.


Even for a non-standard floating-point implementation (for example, one without infinite values), Lua guarantees that "math.huge" is larger than any other FP value.
For a sane FP implementation, this means that the number is large enough that we can expect it to be greater than the length of the longest string you could create in the Lua VM.
Hence, there would be no problem with using "math.huge" as the value for a "far-far-away-position-in-a-string".

There exist non-standard FP implementations that could potentially be problematic because their maximal number is rather small.
For example, ARM CPUs support a "half-precision" 16-bit infinity-less format (1 sign + 5 exponent + 10 significand bits) which is able to represent numbers up to 131008.0
A string could be longer than 128 KBytes, so this is a potential problem.
But the ARM manual says: "The __fp16 type is a storage format only. For purposes of arithmetic and other operations, __fp16 values in C or C++ expressions are automatically promoted to float."
So, there is no sane reason to build Lua for ARM with 16-bit FP numbers.

There exist a lot of non-standard FP formats
http://www.quadibloc.com/comp/cp0201.htm
and happily almost all of them have 8 bits for the exponent, which means their maximal finite number is about 10^38 or more (so, no problem!).
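
The numbers above can be reproduced with a few lines of Lua. A sketch only: it assumes a binary format with an implicit leading 1 bit and a bias of 2^(exp_bits-1) - 1, as in IEEE754, and the function name max_finite is made up:

```lua
-- Hypothetical helper: largest finite value of a binary FP format.
-- reserves_top_exponent is true when the all-ones exponent is used for
-- inf/NaN (IEEE754) and false for an infinity-less format.
local function max_finite(exp_bits, frac_bits, reserves_top_exponent)
   local bias = 2^(exp_bits - 1) - 1
   local max_field = 2^exp_bits - 1
   if reserves_top_exponent then max_field = max_field - 1 end
   return (2 - 2^-frac_bits) * 2^(max_field - bias)
end

print(max_finite(5, 10, false))  -- infinity-less 16-bit format: 131008.0
print(max_finite(8, 23, true))   -- 8-bit exponent, IEEE754 single: about 3.4e38
```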