string:sub

string:sub

David Burgess-3
It seems to me that negative values of string.sub() are inconsistent
with not only the first parameter but also with string.find().

Examples:

 >s="123456789"
 >=s:sub(-1)
9
 >=s:sub(1,0)

 >=s:sub(1,-1)
123456789
 >=s:sub(1,-2)
12345678
 >=s:find("56")
5 6
 >=s:sub(1,6-2)
1234
 >=s:sub(-1)
9

I am boldly suggesting that the second parameter with negative indices
behave the same as the first. I guess this implies that the value 0 for
the second parameter would return the length of the string.
Maybe I have used too many similar functions in other languages but -1
meaning the last character seems intuitive to me.

Anyone else of a like view?

DB

Re: string:sub

Daurnimator


On 6 March 2010 11:29, David Burgess <[hidden email]> wrote:
> It seems to me that negative values of string.sub() are inconsistent
> with not only the first parameter but also with string.find().
>
> Examples:
>
>  >s="123456789"
>  >=s:sub(-1)
> 9
>  >=s:sub(1,0)
>
>  >=s:sub(1,-1)
> 123456789
>  >=s:sub(1,-2)
> 12345678
>  >=s:find("56")
> 5 6
>  >=s:sub(1,6-2)
> 1234
>  >=s:sub(-1)
> 9
>
> I am boldly suggesting that the second parameter with negative indices
> behave the same as the first. I guess this implies that the value 0 for
> the second parameter would return the length of the string.
> Maybe I have used too many similar functions in other languages but -1
> meaning the last character seems intuitive to me.
>
> Anyone else of a like view?
>
> DB

I don't understand what your question or proposal is. Everything seems normal and sensible to me.
1 is the first character; -1 is the last character.

Re: string:sub

Philippe Lhoste
In reply to this post by David Burgess-3
On 06/03/2010 01:29, David Burgess wrote:
> It seems to me that negative values of string.sub() are inconsistent
> with not only the first parameter but also with string.find().

Perhaps that's because I am French, but I have a hard time understanding
that sentence. string.sub() returns strings, not negative values.
If you refer to negative values of parameters, I don't see how they are
inconsistent with "the first parameter",
or why they are inconsistent with string.find().
For both functions, negative values for indexes are explained at
the start of the String Manipulation chapter: "When indexing a string in
Lua, the first character is at position 1 (not at 0, as in C). Indices
are allowed to be negative and are interpreted as indexing backwards,
from the end of the string. Thus, the last character is at position -1,
and so on."

The results of the examples you show seem consistent with this definition.

> I am boldly suggesting that the second parameter with negative indices
> behave the same as the first.

It does. If they are negative, their absolute value is relative to the
end of the string, going backward.

> Maybe I have used too many similar functions in other languages but -1
> meaning the last character seems intuitive to me.

Perhaps you can use a wrapper around this function so it behaves as you
expect? Or your own function.
Lua, in many ways, isn't like the other languages, that's part of its
charm... :-)

--
Philippe Lhoste
--  (near) Paris -- France
--  http://Phi.Lho.free.fr
--  --  --  --  --  --  --  --  --  --  --  --  --  --


Re: string:sub

WU Jun
In reply to this post by David Burgess-3
-1 means the last char.
0 means the position before the first char.

s="123456789"
s:sub(-3,-1) -- "789", last three chars.
s:sub(1,1) -- "1", first char to first char
s:sub(2,-2) -- "2345678", second char to the char before last one.
s:sub(1,0) -- "", nothing

If a is negative, s:sub(a, b) is the same as s:sub(a+#s+1, b).
If b is negative, s:sub(a, b) is the same as s:sub(a, b+#s+1).
It makes sense.

Python uses -1 indicating the last element in lists as well:
a=[111,222,333] # a is a list
a[-1] # last one, 333
a[-3] # 111
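
The two rules above for negative a and b can be sketched in plain Lua. This is only an illustration that follows the behaviour described in the manual, not the library's actual code; the helper names resolve and sub_bounds are made up for the example.

```lua
-- Hypothetical helper: turn a possibly negative index into a position
-- counted from the start of a string of length len.
local function resolve(pos, len)
  if pos < 0 then pos = pos + len + 1 end  -- -1 becomes len, -2 becomes len-1, ...
  if pos < 0 then pos = 0 end              -- too far back: clamp to 0
  return pos
end

-- Hypothetical helper: the inclusive range that s:sub(i, j) covers.
-- If first > last, the result of s:sub(i, j) is the empty string.
local function sub_bounds(s, i, j)
  local len = #s
  local first = resolve(i, len)
  local last = resolve(j or -1, len)       -- j defaults to -1, as in string.sub
  if first < 1 then first = 1 end
  if last > len then last = len end
  return first, last
end

local s = "123456789"
print(sub_bounds(s, -3, -1))  -- 7 9, the same range as s:sub(7, 9)
print(sub_bounds(s, 2, -2))   -- 2 8
print(sub_bounds(s, 1, 0))    -- 1 0, an empty range
```

With these bounds, 0 is not treated as negative, so it is never counted from the end; s:sub(1, 0) simply ends before it starts.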



Re: string:sub

Shmuel Zeigerman
In reply to this post by David Burgess-3
David Burgess wrote:
> It seems to me that negative values of string.sub() are inconsistent
> with not only the first parameter but also with string.find().

See a somewhat related post [1]. Nowadays I feel fully consistent with
how Lua treats indexes (and wouldn't post [1] again).

[1] http://lua-users.org/lists/lua-l/2008-04/msg00175.html

--
Shmuel

Re: string:sub

David Burgess-3
In reply to this post by Philippe Lhoste
What's not very "Luaish" is the inconsistency. As I was unclear:
I have no problems with parameter 1; it is #2 that's the problem.

The inconsistency is what the value 0 does on parameter 2,
seemingly the same as -1. I would have expected -1 on parameter 2
to mean 1 less than zero, i.e. -1 from the end.

DB
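
For what it's worth, a wrapper of the kind Philippe suggested could give this behaviour. The sketch below is one reading of the proposal, under the assumption that an end index of 0 should mean "through the last character" and -1 should mean "stop one character before the end"; the name sub_from_end is hypothetical and nothing like it exists in the standard library.

```lua
-- Hypothetical wrapper around string.sub, not part of standard Lua.
-- An end index j <= 0 is taken as an offset from the end of the string:
--   0 keeps everything up to the last character, -1 drops the last character.
-- The start index i keeps its usual string.sub meaning.
local function sub_from_end(s, i, j)
  if j == nil then j = 0 end
  if j <= 0 then
    j = #s + j
    if j < 1 then return "" end  -- offset reaches past the start: nothing left
  end
  return string.sub(s, i, j)
end

local s = "123456789"
print(sub_from_end(s, 1, 0))   -- 123456789
print(sub_from_end(s, 1, -1))  -- 12345678
print(sub_from_end(s, 1, 4))   -- 1234
```

Note that with this convention there is no longer a way to ask for the empty prefix with an end index of 0.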



Re: string:sub

Erik Lindroos
On Sat, Mar 6, 2010 at 1:59 PM, David Burgess <[hidden email]> wrote:
> The inconsistency is what the value 0 does on parameter 2,
> seemingly the same as -1. I would have expected -1 on parameter 2
> to mean 1 less than zero, i.e. -1 from the end.

What do you mean by "seemingly the same"? Clearly it's not the same. sub(1, 0) gives you an empty string, sub(1, -1) gives you the whole string, just as you pasted.


Re: string:sub

David Burgess-3
Maybe you have identified the issue:

Lua 5.1.2  Copyright (C) 1994-2007 Lua.org, PUC-Rio
 > =("12345"):sub(1,0)
12345

If you get nothing then I guess that's where the confusion comes
from. I will check into this.

Thanks.


Re: string:sub

Enrico Colombini
David Burgess wrote:
> Lua 5.1.2  Copyright (C) 1994-2007 Lua.org, PUC-Rio
>> =("12345"):sub(1,0)
> 12345
>
> If you get nothing then I guess thats where the confusion comes
> from. I will check into this.

I too get nothing (5.1.2, Windows XP).

   Enrico

Re: string:sub

Erik Lindroos
In reply to this post by David Burgess-3
> Maybe you have identified the issue:
>
> Lua 5.1.2  Copyright (C) 1994-2007 Lua.org, PUC-Rio
> > =("12345"):sub(1,0)
> 12345
>
> If you get nothing then I guess thats where the confusion comes
> from. I will check into this.

Well, in the original post you quoted the result:
 >=s:sub(1,0)
 

When did you get that then?

Re: string:sub

spir ☣
In reply to this post by David Burgess-3
On Sat, 06 Mar 2010 11:29:25 +1100
David Burgess <[hidden email]> wrote:

> It seems to me that negative values of string.sub() are inconsistent
> with not only the first parameter but also with string.find().
>
> Examples:
>
>  >s="123456789"
>  >=s:sub(-1)
> 9
>  >=s:sub(1,0)
>
>  >=s:sub(1,-1)
> 123456789
>  >=s:sub(1,-2)
> 12345678
>  >=s:find("56")
> 5 6
>  >=s:sub(1,6-2)
> 1234
>  >=s:sub(-1)
> 9
>
> I am boldly suggesting that the second parameter with negative indices
> behave the same as the first.

That's what it does. (Unlike e.g. in Python.) What do you mean?

> I guess this implies that the value 0 for
> the second parameter would return the length of the string.

So you wish to break the symmetry between positive and negative indices?
This may make sense for half-open intervals, where in fact s:sub(i,j) == s[i,j[.

> Maybe I have used too many similar functions in other languages but -1
> meaning the last character seems intuitive to me.

-1 precisely points to the last character. s:sub(-1) == s:sub(#s)

So, what's wrong? (These points precisely are Lua features I'm very happy with :-)

Denis
--
________________________________

la vita e estrany

spir.wikidot.com


Re: string:sub

spir ☣
In reply to this post by Enrico Colombini
On Sat, 06 Mar 2010 14:17:02 +0100
Enrico Colombini <[hidden email]> wrote:

> David Burgess wrote:
> > Lua 5.1.2  Copyright (C) 1994-2007 Lua.org, PUC-Rio  
> >> =("12345"):sub(1,0)  
> > 12345
> >
> > If you get nothing then I guess thats where the confusion comes
> > from. I will check into this.  
>
> I too get nothing (5.1.2, Windows XP).
>
>    Enrico

Lua 5.1.4  Copyright (C) 1994-2008 Lua.org, PUC-Rio
> =("12345"):sub(1,0)

>

on Ubuntu 9.10

Denis


Re: string:sub

Scott Vokes
> Lua 5.1.4  Copyright (C) 1994-2008 Lua.org, PUC-Rio
>> =("12345"):sub(1,0)
   The reference manual says (pg. 74 /
http://www.lua.org/manual/5.1/manual.html#5.4):
"Indices are allowed to be negative and are interpreted as indexing
backwards, from the end of the string. Thus, the last character is at
position -1, and so on." To me, that means sub(1, 0) should return the
string between the first character and the last character, i.e., the
empty string.

Scott

Re: string:sub

joao lobato
Both arguments have the sa[nm]e meaning: the sign serves only to
identify the direction (like in physics); in practice you don't want
to use 0.

And since

s:sub(0,#s) == s:sub(1,#s) and (''):sub(n,0) == '', for any string s
and any integer n

and

> =("12345"):sub(1,2.5)
12
> =("12345"):sub(1,2.6)
123

I suppose it would be better all around if string.sub only accepted
non-null integers as indices
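
As a rough sketch of what a stricter variant might look like, the wrapper below rejects non-integer indices before calling string.sub. The names is_integer and strict_sub are made up for this example, and testing with math.floor is just one possible way to detect a fractional number; zero is still accepted here.

```lua
-- Hypothetical helper: true if n is a number with no fractional part.
local function is_integer(n)
  return type(n) == "number" and n == math.floor(n)
end

-- Hypothetical strict wrapper around string.sub: raises an error
-- instead of silently converting a fractional index.
local function strict_sub(s, i, j)
  if j == nil then j = -1 end  -- same default as string.sub
  if not is_integer(i) then
    error("start index must be an integer", 2)
  end
  if not is_integer(j) then
    error("end index must be an integer", 2)
  end
  return string.sub(s, i, j)
end

print(strict_sub("12345", 1, 2))          -- 12
print(pcall(strict_sub, "12345", 1, 2.5)) -- false, followed by the error message
```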


Re: string:sub

Shmuel Zeigerman
joao lobato wrote:
> I suppose it would be better all around if string.sub only accepted
> non-null integers as indices

If s:sub(5,4) is valid then s:sub(1,0) should be valid too.

--
Shmuel

Re: string:sub

Roberto Ierusalimschy
> joao lobato wrote:
> >I suppose it would be better all around if string.sub only accepted
> >non-null integers as indices
>
> If s:sub(5,4) is valid then s:sub(1,0) should be valid too.

Sure. s:sub(1,n) is the prefix of 's' with length 'n', so zero must be
a valid index to allow the empty prefix. zero is not negative, so it
counts from the beginning.

-- Roberto
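
A small sketch of the prefix argument above (the helper name prefix is made up for this example): s:sub(1, n) has exactly n characters for every n from 0 up to #s, and n = 0 gives the empty prefix.

```lua
-- Hypothetical helper: the prefix of s with length n.
local function prefix(s, n)
  return s:sub(1, n)
end

local s = "12345"
for n = 0, #s do
  -- brackets make the empty prefix visible in the output
  print(n, "[" .. prefix(s, n) .. "]")
end
```

The first line printed is 0 followed by [], and the last is 5 followed by [12345].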

Re: string:sub

joao lobato
In reply to this post by Shmuel Zeigerman
On 3/6/10, Shmuel Zeigerman <[hidden email]> wrote:
> joao lobato wrote:
>> I suppose it would be better all around if string.sub only accepted
>> non-null integers as indices
>
> If s:sub(5,4) is valid then s:sub(1,0) should be valid too.
>
> --
> Shmuel
>
You are right. string.sub's behaviour depends on string.find's; it
must therefore accept both cases you pointed.

But why accept non-integers?
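
The link with string.find can be illustrated by a small sketch. The helper split_around is hypothetical; it cuts a string into the text before a match, the match itself, and the text after it. When the match starts at position 1, the "before" part is s:sub(1, 0), which has to be valid and empty.

```lua
-- Hypothetical helper: split s around the first plain-text occurrence of text.
-- Returns nil if text does not occur in s.
local function split_around(s, text)
  local i, j = s:find(text, 1, true)  -- true: plain search, no pattern matching
  if not i then return nil end
  return s:sub(1, i - 1), s:sub(i, j), s:sub(j + 1)
end

local s = "123456789"
print(split_around(s, "56"))  -- 1234  56  789
print(split_around(s, "12"))  -- (empty)  12  3456789
print(split_around(s, "89"))  -- 1234567  89  (empty)
```

In the second call the first result comes from s:sub(1, 0), and in the third the last result comes from s:sub(10), one past the end; both are empty strings rather than errors.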

Re: string:sub

joao lobato
In reply to this post by Roberto Ierusalimschy
On 3/6/10, Roberto Ierusalimschy <[hidden email]> wrote:
>> If s:sub(5,4) is valid then s:sub(1,0) should be valid too.
>
> Sure. s:sub(1,n) is the prefix of 's' with length 'n', so zero must be
> a valid index to allow the empty prefix. zero is not negative, so it
> counts from the beginning.
>
> -- Roberto
>

It's just that since s:sub(2,n) doesn't mean "the substring of 's'
with length 'n' starting at the second position of 's'", it just feels
that your point is a happy coincidence rather than a particular case
of a rule.

I feel like I managed to misuse the "principle of [Roberto's] least
surprise" and break list etiquette.

Re: string:sub

spir ☣
In reply to this post by joao lobato
On Sat, 6 Mar 2010 19:40:04 +0000
joao lobato <[hidden email]> wrote:

> And since
>
> s:sub(0,#s) == s:sub(1,#s) and (''):sub(n,0) == '', for any string s
> and any integer n
>
> and
>
> > =("12345"):sub(1,2.5)  
> 12
> > =("12345"):sub(1,2.6)  
> 123
>
> I suppose it would be better all around if string.sub only accepted
> non-null integers as indices

++

Denis


Re: string:sub

Philippe Lhoste
In reply to this post by joao lobato
On 06/03/2010 20:40, joao lobato wrote:
> I suppose it would be better all around if string.sub only accepted
> non-null integers as indices

Yes, but unlike some languages like Java, which prefer to hold the hand
of users (might be useful in large projects, I reckon), Lua's
philosophy, not unlike C's, is: "garbage in, garbage out".
I.e., most functions just don't check the validity of their input, relying
on the intelligence of programmers to provide valid data, at the risk of
crashing or having undefined behavior (security breaches...). But
skipping such checking was valuable on 1MHz processors, meaning higher
speed. :-)

--
Philippe Lhoste
