[PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

[PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

François Perrad
In Lua 5.3, the `string` library embeds three minilanguages:
  - a text formatting minilanguage in the `format` function
  - a regexp minilanguage in the `find`, `gmatch`, `gsub` and `match` functions
  - a binary packing/unpacking minilanguage in the `pack`, `packsize` and `unpack` functions

These minilanguages are embedded in Lua strings, which are interpreted only at runtime.
Before runtime, there is neither a syntax check nor an argument type check.
They are outside the Lua grammar.
They are not friendly to JIT optimization.

The text formatting minilanguage is based on that of C's `sprintf`.
This proposal is based on its C++ replacement, the iostream library.
The strict replacement is the output stream, but with this model it is easy to add the input stream counterpart.
The binary pack/unpack could also be unified in this model.
Two minilanguages are replaced by method chaining on a new userdata representing a string buffer.

    printf("x = %d  y = %d", 10, 20);                           -- C
    string.format("x = %d  y = %d", 10, 20)                     -- Lua 5.0
    ("x = %d  y = %d"):format(10, 20)                           -- Lua 5.1
    cout << "x = " << 10 << "  y = " << 20;                     -- C++
    string.buffer():put'x = ':put(10):put'  y = ':put(20)       -- proposal

    string.format("x = %#x", 200)                               --> "x = 0xc8"
    string.buffer():hex():showbase(true):put'x = ':put(200)

    string.format("pi = %.4f", math.pi)                         --> "pi = 3.1416"
    string.buffer():put'pi = ':fixed(true):precision(4):put(math.pi)

    d = 5; m = 11; y = 1990
    string.format("%02d/%02d/%04d", d, m, y)                    --> "05/11/1990"
    string.buffer():fill'0':width(2):put(d):put'/':width(2):put(m):put'/':width(4):put(y)

The implementation defines a new userdata based on `luaL_Buffer` from the Lua/C API. `string.buffer` is the constructor.
The method names come from C++: `put`, `precision`, `width`, `fill`, `left`, `right`, `internal`, `dec`, `oct`, `hex`, `fixed`, `scientific`, `showbase`, `showpoint`, `showpos`, `uppercase`, `endl`, `ends`.
The methods `__tostring`, `len` and `add` come from Lua.
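
To make the chaining style concrete, here is a minimal pure-Lua sketch of a few of the output methods. It is an emulation written for illustration, not the attached C patch: the names `Buffer` and `new_buffer` are hypothetical, only a subset of the methods is covered, and the rule that `width` applies to the next `put` only is borrowed from C++ rather than confirmed by the proposal.

```lua
-- Hypothetical pure-Lua emulation of some proposed output methods.
-- This is NOT the attached patch; it only mirrors the chaining style.
local Buffer = {}
Buffer.__index = Buffer

local function new_buffer()
  return setmetatable({
    parts = {},        -- accumulated output chunks
    fillchar = ' ',    -- padding character (sticky, as in C++)
    fieldwidth = 0,    -- minimum width of the next put only (assumption)
    base = 'd',        -- 'd' or 'x' in this sketch
    show_base = false,
    is_fixed = false,
    prec = 6,
  }, Buffer)
end

function Buffer:fill(c)      self.fillchar = c;   return self end
function Buffer:width(n)     self.fieldwidth = n; return self end
function Buffer:hex()        self.base = 'x';     return self end
function Buffer:dec()        self.base = 'd';     return self end
function Buffer:showbase(b)  self.show_base = b;  return self end
function Buffer:fixed(b)     self.is_fixed = b;   return self end
function Buffer:precision(n) self.prec = n;       return self end

function Buffer:put(v)
  local s
  if math.type(v) == 'integer' then
    -- '#' is only meaningful for non-decimal bases
    local flag = (self.show_base and self.base ~= 'd') and '#' or ''
    s = string.format('%' .. flag .. self.base, v)
  elseif math.type(v) == 'float' then
    local conv = self.is_fixed and 'f' or 'g'
    s = string.format('%.' .. self.prec .. conv, v)
  else
    s = tostring(v)
  end
  -- right-justify with the fill character; the width resets after each put
  s = string.rep(self.fillchar, self.fieldwidth - #s) .. s
  self.fieldwidth = 0
  self.parts[#self.parts + 1] = s
  return self
end

Buffer.__tostring = function(self) return table.concat(self.parts) end

local d, m, y = 5, 11, 1990
local date = new_buffer():fill'0':width(2):put(d):put'/':width(2):put(m):put'/':width(4):put(y)
print(tostring(date))                                              --> 05/11/1990
print(tostring(new_buffer():hex():showbase(true):put'x = ':put(200)))  --> x = 0xc8
```

The sketch delegates the actual number formatting to `string.format`, so it says nothing about speed; it only shows how formatting state travels with the buffer instead of with a format string.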

As the conversion from `int` to `char` makes sense only in C, the format "%c" must be rewritten with an explicit call to `string.char`:
    string.format("%c", 0x41)   --> 'A'
    string.buffer():put(string.char(0x41))

The feature of the "%q" format is supplied by a new function, `string.repl` (named as in Python):
    string.format("%q", 'a string with "quotes"')               --> "a string with \"quotes\""
    string.repl('a string with "quotes"')

This userdata also supplies some input methods: `get`, `getline`, `pos`.
The interface of `get` resembles that of Lua's `io.read`:
    local sb = string.buffer'05/11/1990'
    print(sb:get'i')            --> 5
    assert(sb:get(1) == '/')
    print(sb:get'i')            --> 11
    assert(sb:get(1) == '/')
    print(sb:get'i')            --> 1990
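
The input side can be emulated in a similar way. The sketch below is a hypothetical pure-Lua reader (the names `Reader` and `new_reader` are not from the patch) that supports only `get'i'` and `get(n)`; returning nil when nothing matches is an assumption modelled on `io.read`.

```lua
-- Hypothetical pure-Lua emulation of the proposed `get` method.
local Reader = {}
Reader.__index = Reader

local function new_reader(s)
  return setmetatable({ src = s, at = 1 }, Reader)
end

-- get'i' reads an optionally signed decimal integer;
-- get(n) reads up to n characters.
-- Both return nil when nothing can be read (assumption).
function Reader:get(what)
  if what == 'i' then
    -- '^' anchors the match at the current position; '()' captures
    -- the position just after the digits
    local digits, nextpos = self.src:match('^([+-]?%d+)()', self.at)
    if not digits then return nil end
    self.at = nextpos
    return tonumber(digits)
  elseif type(what) == 'number' then
    local chunk = self.src:sub(self.at, self.at + what - 1)
    if chunk == '' then return nil end
    self.at = self.at + #chunk
    return chunk
  end
  error('unsupported format: ' .. tostring(what))
end

local sb = new_reader('05/11/1990')
local day = sb:get'i'
local sep1 = sb:get(1)
local month = sb:get'i'
local sep2 = sb:get(1)
local year = sb:get'i'
print(day, sep1, month, sep2, year)
```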

In order to replace `string.pack` & `string.unpack`, this userdata supplies these methods:
`pack`, `packsize`, `unpack`, `little`, `align`, `int`, `num`, `str`.

    string.pack("iii", 3, -27, 450)
    string.buffer():int'int':pack(3):pack(-27):pack(450)

    string.pack("i7", 1 << 54)
    string.buffer():int(7):pack(1 << 54)

    string.pack("c1", "hello")
    string.buffer():str('prefix', 1):pack'hello'

    string.pack("<i2 i2", 500, 24)
    string.buffer():little(true):int(2):pack(500):pack(24)
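
As a rough check that the chained form can produce the same bytes, the sketch below emulates `little`, `int` and `pack` on top of the existing `string.pack`. The names `Packer` and `new_packer` are hypothetical, `little(false)` is assumed to mean big-endian, and alignment is ignored (without the `!` option `string.pack` adds no padding, so packing values one at a time and concatenating gives the same result as one call).

```lua
-- Hypothetical emulation of the proposed packing methods using string.pack.
local Packer = {}
Packer.__index = Packer

local function new_packer()
  -- '=' is native endianness and 'i' is a native int, the string.pack defaults
  return setmetatable({ parts = {}, endian = '=', item = 'i' }, Packer)
end

function Packer:little(b)
  self.endian = b and '<' or '>'   -- assumption: false selects big-endian
  return self
end

function Packer:int(size)
  -- int'int' selects the native int; int(n) selects an n-byte integer
  self.item = (size == 'int') and 'i' or ('i' .. size)
  return self
end

function Packer:pack(v)
  self.parts[#self.parts + 1] = string.pack(self.endian .. self.item, v)
  return self
end

Packer.__tostring = function(self) return table.concat(self.parts) end

local chained = tostring(new_packer():little(true):int(2):pack(500):pack(24))
print(chained == string.pack("<i2 i2", 500, 24))   --> true
```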

pack/unpack were introduced only in Lua 5.3, so I think they could be deprecated in Lua 5.4.
For historical reasons, `format` cannot be deprecated.
`format` is good for small things, and the new way is good for serious things.
In the same way, the regex minilanguage was not deprecated or replaced by LPeg.

Attached is a patch of lstrlib.c against Lua 5.3.4.

François


Attachment: 0001-experiment-string.buffer.patch (28K)

Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

Javier Guerra Giraldez
On 26 May 2017 at 16:32, François Perrad <[hidden email]> wrote:
> `format` is good for small things, and the new way is good for serious
> things.

Am I the only one that thinks iostream is the second ugliest part of C++?

and of course, far, far less readable than `printf()` style


--
Javier


Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

Luiz Henrique de Figueiredo
In reply to this post by François Perrad
Why is this a proposal for Lua 5.4 instead of a separate library?


Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

Enrico Colombini
In reply to this post by Javier Guerra Giraldez
On 26-May-17 17:44, Javier Guerra Giraldez wrote:
> Am I the only one that thinks iostream is the second ugliest part of C++?

You are not alone. And possibly the most inefficient, too.

--
   Enrico


Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

Charles Heywood
In reply to this post by Luiz Henrique de Figueiredo
It's meant to, by the looks of it, replace things that couldn't be optimized in the "old" Lua string library.

On Fri, May 26, 2017, 10:48 Luiz Henrique de Figueiredo <[hidden email]> wrote:
Why this is a proposal for Lua 5.4 instead of being a separate library?

--
--

Software Developer / System Administrator

Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

Francisco Olarte
In reply to this post by Javier Guerra Giraldez
Javier

On Fri, May 26, 2017 at 5:44 PM, Javier Guerra Giraldez
<[hidden email]> wrote:
> On 26 May 2017 at 16:32, François Perrad <[hidden email]> wrote:
>> `format` is good for small things, and the new way is good for serious
>> things.
> Am I the only one that thinks iostream is the second ugliest part of C++?
> and of course, far, far less readable than `printf()` style

Nope. One of the first things I've had to do in several C++ projects
is to add printf-like formatting to logging and I/O frameworks, as the
source was otherwise impossible to understand. And I've done it in Java
and Lua too, where it is much easier (as you can catch
format/parameter mismatches more easily).

Francisco Olarte.


Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

Scott Morgan
In reply to this post by Javier Guerra Giraldez
On 05/26/2017 04:44 PM, Javier Guerra Giraldez wrote:
> On 26 May 2017 at 16:32, François Perrad <[hidden email]> wrote:
>> `format` is good for small things, and the new way is good for serious
>> things.
>
> Am I the only one that thinks iostream is the second ugliest part of C++?

I'm sure I saw on one of the many CppCon videos that even the C++
committee and other big names (Stroustrup etc.) hate the thing when it
comes to string formatting.

Scott


Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

Gé Weijers
In reply to this post by François Perrad

On Fri, May 26, 2017 at 8:32 AM, François Perrad <[hidden email]> wrote:
 
    d = 5; m = 11; y = 1990
    string.format("%02d/%02d/%04d", d, m, y)                    --> "05/11/1990"
    string.buffer():fill'0':width(2):put(d):put'/':width(2):put(m):put'/':width(4):put(y)



You're performing a method (table) lookup and a call for each chunk of data you're formatting. Is this *really* faster than a 'printf' style interface? A JIT could optimize some of it away.
Some numbers would be helpful.

Alternatively, 'string.format' could be augmented to 'compile' each format specification into a 'program' for a trivial formatting VM and cache the result, which would be about as fast or even faster, and would not require an interface change. It could be a library that provided a replacement for string.format.
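
A minimal sketch of that "compile and cache" idea, written as a separate pure-Lua function rather than a change to `string.format`. The names `compile` and `cformat` are hypothetical, the "program" is just a list of literal chunks and single conversions, and each conversion is still rendered by `string.format`, so this shows the structure only and makes no claim about speed.

```lua
local cache = {}

-- Split a format string into literal chunks and conversion specs once,
-- then return a closure that replays that list on each call.
local function compile(fmt)
  local ops, pos = {}, 1
  while pos <= #fmt do
    local s, e, spec = fmt:find('(%%[-+ #0]*%d*%.?%d*[%a%%])', pos)
    if not s then
      ops[#ops + 1] = { literal = fmt:sub(pos) }
      break
    end
    if s > pos then ops[#ops + 1] = { literal = fmt:sub(pos, s - 1) } end
    if spec == '%%' then
      ops[#ops + 1] = { literal = '%' }
    else
      ops[#ops + 1] = { spec = spec }
    end
    pos = e + 1
  end
  return function(...)
    local out, argi = {}, 0
    for i, op in ipairs(ops) do
      if op.literal then
        out[i] = op.literal
      else
        argi = argi + 1
        out[i] = string.format(op.spec, (select(argi, ...)))
      end
    end
    return table.concat(out)
  end
end

-- Drop-in style wrapper: compile on first use, reuse afterwards.
local function cformat(fmt, ...)
  local f = cache[fmt]
  if not f then
    f = compile(fmt)
    cache[fmt] = f
  end
  return f(...)
end

print(cformat("%02d/%02d/%04d", 5, 11, 1990))   --> 05/11/1990
```

Timing `cformat` against `string.format` in an `os.clock` loop would supply the numbers asked for above; no measurements are claimed here.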



Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

Frank Kastenholz-2
In reply to this post by François Perrad
Hi

What is the problem that this is meant to solve?
Embedding formatting instructions in strings is not a problem; modern C compilers, for example, evaluate printf() format strings at compile time and report on format/argument mismatches.  If it can be done in C it can be done in Lua/LuaJIT, no?

Frank





Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

云风 Cloud Wu
In reply to this post by François Perrad



Sent from my iPhone
> On May 26, 2017, at 11:32 PM, François Perrad <[hidden email]> wrote:
>
> pack/unpack are introduced only since Lua 5.3, so, I think they could be deprecated in Lua 5.4.

string.pack/unpack are the best feature introduced by Lua 5.3 (IMHO), and I don't think any JIT technology can do better.

A minilanguage is the key to high cohesion; that is why people prefer printf to iostream.

Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

Fontana Nicola
In reply to this post by François Perrad
On Fri, 26 May 2017 17:32:49 +0200, François Perrad <[hidden email]> wrote:

> ...
> Two minilanguages are replaced by method chaining on a new userdata
> representing a string buffer.
> ...

Hi François,

when localization comes into play, the printf approach is far
superior. And C++ itself has to reimplement it:

http://www.boost.org/doc/libs/1_64_0/libs/locale/doc/html/localized_text_formatting.html

IMO iostream is just flawed by design.
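
A small illustration of the localization point: with format strings, a whole sentence is one translatable unit that can live in a message table. The `messages` table, the sample translations and the `notify` function below are made up for this sketch. Note that Lua's `string.format` has no positional arguments such as `%1$s`, so every translation here has to keep the same argument order.

```lua
-- Hypothetical message catalogue: one complete format string per locale.
local messages = {
  en = "%s has %d new messages",
  fr = "%s a %d nouveaux messages",
  it = "%s ha %d nuovi messaggi",
}

-- Look up the sentence template for the locale, falling back to English.
local function notify(lang, user, count)
  return string.format(messages[lang] or messages.en, user, count)
end

print(notify('it', 'Alice', 3))   --> Alice ha 3 nuovi messaggi
```

With a stream-style chain, the same sentence would be split into fragments around each value, and a translator would have to work on the fragments instead of the sentence.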

Ciao.
--
Nicola


Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

Aaron B.
In reply to this post by François Perrad
On Fri, 26 May 2017 17:32:49 +0200
François Perrad <[hidden email]> wrote:

>     printf("x = %d  y = %d", 10, 20);                           -- C
>     string.format("x = %d  y = %d", 10, 20)                     -- Lua 5.0
>     ("x = %d  y = %d"):format(10, 20)                           -- Lua 5.1
>     cout << "x = " << 10 << "  y = " << 20;                     -- C++
>     string.buffer():put'x = ':put(10):put'  y = ':put(20)       -- proposal

The C/Lua 5.0/Lua 5.1 way looks a lot more readable to me, compared to
the C++/proposal method.

The reason being, with the format string, in a glance I can see what
the output will look like. With the stream, I have to put all the pieces
together in my mind first.


--
Aaron B. <[hidden email]>


Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

J Doe

> On May 26, 2017, at 3:03 PM, Aaron B. <[hidden email]> wrote:
>
> On Fri, 26 May 2017 17:32:49 +0200
> François Perrad <[hidden email]> wrote:
>
>>    printf("x = %d  y = %d", 10, 20);                           -- C
>>    string.format("x = %d  y = %d", 10, 20)                     -- Lua 5.0
>>    ("x = %d  y = %d"):format(10, 20)                           -- Lua 5.1
>>    cout << "x = " << 10 << "  y = " << 20;                     -- C++
>>    string.buffer():put'x = ':put(10):put'  y = ':put(20)       -- proposal
>
> The C/Lua 5.0/Lua 5.1 way looks a lot more readable to me, compared to
> the C++/proposal method.

Hi,

As an aside to proposals for string formatting in Lua 5.4, is there a place on lua.org that lists proposals and/or progress on upcoming Lua releases? I checked the download section for source code for possibly related project notes but didn't find what I was looking for.

Thanks,

- J



Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

Charles Heywood
I think /work ? Sometimes the code for in-progress versions is released on the website, but I don't think any are available now. However, Roberto did mention perhaps moving to Git, which might lead to Lua's development being on GitHub completely? (Don't quote me please)

--
--

Software Developer / System Administrator

Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

Tobias Kieslich
In reply to this post by 云风 Cloud Wu

Quoting 云风 <[hidden email]>:

> Sent from my iPhone
>> On May 26, 2017, at 11:32 PM, François Perrad <[hidden email]> wrote:
>>
>> pack/unpack are introduced only since Lua 5.3, so, I think they  
>> could be deprecated in Lua 5.4.
>
> string.pack/unpack are the best feature introduced by lua 5.3  
> (IMHO), and I don't think any jit technology can do better.
>
> Minilanguage is the key for high cohesion , that is why people  
> prefer printf than iostream.


Hi there,

as part of a library I'm working on, I have developed a different type of
pack/unpack mechanism, which basically does some pre-compilation of sorts but
most importantly allows for a much more expressive way of writing this stuff.
I think this makes a compelling case for Luiz's argument for putting things
into separate libraries.

Please note this is from lua-t, which is under heavy development, so it's a
very moving target. Also, C strings, Pascal strings and alignment are not yet
handled :-)
https://github.com/tobbik/lua-t

Example:

> Pack = require't.Pack'
> i = Pack('>I6')              -- Single packer instance
> i
t.Pack.UInt6B: 0xd73f1e8
> p = Pack('>I3<i2bB>I5<I4h')  -- Packer sequence instance
> p
t.Pack.Sequence[7]: 0xd73eb08
> b = 'aBcDeüHiJkLmNoPö'
> #p                           -- Packer behaves like a table
7
> p[3]                         -- Access to single packers
t.Pack.Field[5](Int1L): 0xd74b238 -- Field with an offset of 5 bytes from
                                  -- the start, 1 byte long
> p[3](b)                      -- Reading from a string is done by __call
-61
> x = p(b)                     -- create a table with results
> for i=1,#p do print(p[i], x[i] ) end
t.Pack.Field[0](UInt3B): 0xd749868      6373987
t.Pack.Field[3](Int2L): 0xd749048       25924
t.Pack.Field[5](Int1L): 0xd7491b8       -61
t.Pack.Field[6](UInt1L): 0xd7492e8      188
t.Pack.Field[7](UInt5B): 0xd748bc8      311004130124
t.Pack.Field[12](UInt4L): 0xd748d68     1349471853
t.Pack.Field[16](Int2L): 0xd748648      -18749

There are some more conveniences: you can have named fields, nested
structures and arrays of packers, and you can have bit-sized resolution
(using 'r' and 'R' as format strings, or 'v' for a single-bit boolean):
> p   = Pack(
          { threeUInt  = '>I3' }
        , { twoIntegs  = '<i2' }
        , { twoBytes      = Pack(
                  { signedByte = 'b' }
                , { unsignByte = 'B' }
        ) }
        , { bits       = Pack( 'r4','R7',Pack( 'r3', 8 ),'r15','v','v','R1','R1','r1','r1' ) }
        , { fiveSInt   = '>I5' }
        , { fourSInt   = '<I4' }
        , { signShort  = 'h' }
)
> p.bits[3][4]
t.Pack.Field[9](SBit3:0): 0xd73f5b8  -- signed bit, 3 bits wide, 9 byte offset

Additionally, if you use a t.Buffer instance, provided in the same library,
you can write to a partial section of it without recreating strings over and
over again (think crude but mutable strings):

> Buffer = require't.Buffer'
> b   = Buffer( string.rep( string.char( 0xFF ), 25 ) )
> b
T.Buffer[25]: 0xd72d1b8
> b:toHex()
FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
> p.bits[2](b,0) -- overwriting the bits in 'R7' with 0
> b:toHex()
FF FF FF FF FF FF FF F0 1F FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF






Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

Sean Conner
In reply to this post by Frank Kastenholz-2
It was thus said that the Great Frank Kastenholz once stated:
> Hi
>
> What is the problem that this is meant to solve?
>
> Embedding formatting instructions in strings is not a problem; modern C
> compilers, for example, evaluate printf() format strings at compile time
> and report on format/argument mismatches.  If it can be done in C it can
> be done in Lua/LuaJIT, no?

  To a degree.  C can check the following:

        printf("%s is %d units high\n",unit->name,unit->height);

because the format string is an immediate value, and because C has types,
the parameters can be type-checked as well.  C can't, however, do a
check like:

        printf(unit->format,unit->name,unit->height);

because the string is not available at compile time to do the check (but
this is still legal C code).  So there are limits that even C has.

  Now, for Lua:

        string.format("%s is %d units high\n",unit.name,unit.height)

About the only thing that can happen at compile time is to check that the
number of format characters matches the number of parameters given to the
function.  There's no way to check that unit.name is a string, or that
unit.height is a number, because there's no definition of what 'unit' is.
And it's values that have types, not variables (or fields in a table).
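
  The counting check described above can be sketched in Lua itself. The functions `count_specs` and `checked_format` below are hypothetical, and the check runs when the call is made rather than at compile time; a static tool could apply the same counting to literal format strings. As far as I know, `string.format` silently ignores extra arguments and raises an error for missing ones, so the wrapper mainly adds the "too many" case.

```lua
-- Count the conversions in a format string, ignoring the literal "%%".
local function count_specs(fmt)
  local n = 0
  for conv in fmt:gmatch('%%[-+ #0]*%d*%.?%d*([%a%%])') do
    if conv ~= '%' then n = n + 1 end
  end
  return n
end

-- Hypothetical wrapper: compare the number of conversions with the
-- number of arguments before delegating to string.format.
local function checked_format(fmt, ...)
  local want, got = count_specs(fmt), select('#', ...)
  if want ~= got then
    error(('format expects %d argument(s), got %d'):format(want, got), 2)
  end
  return string.format(fmt, ...)
end

print(count_specs("%s is %d units high\n"))   --> 2
```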

  What I *would* like to see for string.format() is an option (or new
function) that coerces the parameters to the given type instead of throwing
an error.  I can live with the current situation though (annoying as it is).
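
  Such a coercing variant can be approximated in pure Lua. In the sketch below, `lenient_format` and `coerce` are hypothetical names, and the coercion policy (non-numeric values become 0, floats are floored for integer conversions, everything goes through `tostring` for `%s`) is an arbitrary assumption chosen only to show the mechanism.

```lua
local INTEGER = { c = true, d = true, i = true, o = true, u = true, x = true, X = true }
local FLOAT   = { a = true, A = true, e = true, E = true, f = true, g = true, G = true }

-- Coerce one argument to what the conversion letter expects.
local function coerce(conv, v)
  if INTEGER[conv] then
    local n = tonumber(v)
    if not n then return 0 end                 -- assumption: fall back to 0
    -- floor floats; inf and NaN have no integer form and also become 0
    return math.tointeger(n) or math.tointeger(math.floor(n)) or 0
  elseif FLOAT[conv] then
    return tonumber(v) or 0.0
  elseif conv == 's' or conv == 'q' then
    return tostring(v)
  end
  return v
end

-- Like string.format, but coerces each argument instead of raising an error.
local function lenient_format(fmt, ...)
  local args, n = table.pack(...), 0
  for conv in fmt:gmatch('%%[-+ #0]*%d*%.?%d*([%a%%])') do
    if conv ~= '%' then
      n = n + 1
      args[n] = coerce(conv, args[n])
    end
  end
  return string.format(fmt, table.unpack(args, 1, n))
end

print(lenient_format("%s is %d units high", "tower", 12.7))   --> tower is 12 units high
```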

  -spc


Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

J Doe
In reply to this post by Charles Heywood

> On May 26, 2017, at 3:50 PM, Charles Heywood <[hidden email]> wrote:
>
> I think /work ? Sometimes the code for in-progress versions is released on the website, but I don't think any are available now. However, Roberto did mention perhaps moving to Git, which might lead to Lua's development being on GitHub completely? (Don't quote me please)

Hi Charles,

Ok, thanks.  I will poke around in /work and in the meantime, I look forward to future, in-progress releases.  Git would also be great, as well, but whatever works best for the project.

Thanks,

- J



Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

François Perrad
In reply to this post by Aaron B.


2017-05-26 21:03 GMT+02:00 Aaron B. <[hidden email]>:
> On Fri, 26 May 2017 17:32:49 +0200
> François Perrad <[hidden email]> wrote:
>
> >     printf("x = %d  y = %d", 10, 20);                           -- C
> >     string.format("x = %d  y = %d", 10, 20)                     -- Lua 5.0
> >     ("x = %d  y = %d"):format(10, 20)                           -- Lua 5.1
> >     cout << "x = " << 10 << "  y = " << 20;                     -- C++
> >     string.buffer():put'x = ':put(10):put'  y = ':put(20)       -- proposal
>
> The C/Lua 5.0/Lua 5.1 way looks a lot more readable to me, compared to
> the C++/proposal method.
>
> The reason being, with the format string, in a glance I can see what
> the output will look like. With the stream, I have to put all the pieces
> together in my mind first.
>
> --
> Aaron B. <[hidden email]>



Well, at this point the feedback seems clear: everybody is happy with minilanguages.

As a seasoned C developer, I know (and use daily) `sprintf` formatting, so I am biased.
But the pack/unpack minilanguage is not obviously readable.
The same goes for `os.date`, which uses the C `strftime` minilanguage.

An implementation based on a minilanguage doesn't allow user extension; for example, you cannot add a "%Q" option to `string.format` in pure Lua.

François

Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

Charles Heywood
I agree with your point on Lua's string.{un,}pack not being too readable. However, I feel that your implementation leads to too many function calls. I'd much prefer (and, in FusionScript, plan to utilize) a function that generates a format string and returns a pack() and unpack() function or object to use that string.
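
A minimal sketch of that idea in plain Lua (not FusionScript): a function that builds the format string once from readable item names and returns `pack` and `unpack` closures bound to it. The `OPTIONS` table and the `codec` function are hypothetical names chosen for this example; the option letters on the right-hand side are the standard `string.pack` ones.

```lua
-- Hypothetical readable names mapped to string.pack options.
local OPTIONS = {
  little = '<', big = '>', native = '=',
  int8  = 'i1', uint8  = 'I1',
  int16 = 'i2', uint16 = 'I2',
  int32 = 'i4', uint32 = 'I4',
  cstring = 'z',
}

-- Build the format string once and return closures that reuse it.
local function codec(...)
  local items = {}
  for i, name in ipairs({ ... }) do
    items[i] = assert(OPTIONS[name], 'unknown item: ' .. tostring(name))
  end
  local fmt = table.concat(items, ' ')
  local function pack(...) return string.pack(fmt, ...) end
  local function unpack(blob, pos) return string.unpack(fmt, blob, pos) end
  return pack, unpack, fmt
end

local pack, unpack, fmt = codec('little', 'int16', 'int16')
local blob = pack(500, 24)
print(fmt)             -- the generated format string: < i2 i2
print(unpack(blob))    -- the two values followed by the next read position
```

Each pack or unpack is then a single call into `string.pack`/`string.unpack`, while the layout is still written with descriptive names.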

--
--

Software Developer / System Administrator

Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)

Sean Conner
In reply to this post by François Perrad
It was thus said that the Great François Perrad once stated:

> 2017-05-26 21:03 GMT+02:00 Aaron B. <[hidden email]>:
>
> > On Fri, 26 May 2017 17:32:49 +0200
> > François Perrad <[hidden email]> wrote:
> >
> > >     printf("x = %d  y = %d", 10, 20);                           -- C
> > >     string.format("x = %d  y = %d", 10, 20)                     -- Lua 5.0
> > >     ("x = %d  y = %d"):format(10, 20)                           -- Lua 5.1
> > >     cout << "x = " << 10 << "  y = " << 20;                     -- C++
> > >     string.buffer():put'x = ':put(10):put'  y = ':put(20)       -- proposal
> >
> > The C/Lua 5.0/Lua 5.1 way looks a lot more readable to me, compared to
> > the C++/proposal method.
> >
> > The reason being, with the format string, in a glance I can see what
> > the output will look like. With the stream, I have to put all the pieces
> > together in my mind first.
> >
> >
> > --
> > Aaron B. <[hidden email]>
> >
> >
>
> Well, at this point the feedback seems clear: everybody is happy with
> minilanguages.
>
> As a seasoned C developer, I know (and use daily) `sprintf` formatting,
> so I am biased.
> But the pack/unpack minilanguage is not obviously readable.
> And same thing with `os.date` which uses the C `strftime` minilanguage.

  I don't find the pack/unpack minilanguage all that bad, per se.  Lowercase
letters are signed quantities, uppercase unsigned, and there's some mnemonic
meaning to the letters used.  But it can get silly (sample from an SMPP
parser):

        result.service_type,
        result.source.addr_ton,
        result.source.addr_npi,
        result.source.addr,
        result.dest.addr_ton,
        result.dest.addr_npi,
        result.dest.addr,
        result.esm_class,
        result.protocol_id,
        result.prority,
        result.schedule_time,
        result.validity_period,
        result.registered_delivery,
        result.replace_if_present,
        result.data_coding,
        result.sm_default_msg_id,
        result.message =
        string.unpack(">z I1 I1 z I1 I1 z I1 I1 I1 z z I1 I1 I1 I1 s1",blob,13)

  It was hard to debug, and the obvious solution:

        result.service_type,pos    = string.unpack(">z",blob,pos)
        result.source.addr_ton,pos = string.unpack(">I1",blob,pos)
        result.source.addr_npi,pos = string.unpack(">I1",blob,pos)
        --- and so on

just *feels* a lot slower to me.  One could try to create another
minilanguage for this, and I've tried, but I haven't created one that I
like (for me, the *same* language should create both encoder and decoder).
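
  One possible shape for a single description that drives both directions is a list of (field name, pack item) pairs. The sketch below is only an illustration of that idea: `layout` is a hypothetical helper, the field list is a flattened subset of the sample above, nested tables such as `result.source` are not handled, and nil field values are not supported.

```lua
-- Each entry pairs a field name with one string.pack item, so the layout
-- is written once and used for both decoding and encoding.
local function layout(endian, fields)
  local names, items = {}, {}
  for i, f in ipairs(fields) do
    names[i], items[i] = f[1], f[2]
  end
  local fmt = endian .. table.concat(items, ' ')

  -- One string.unpack call; values are then assigned by name.
  local function decode(blob, pos)
    local values = table.pack(string.unpack(fmt, blob, pos))
    local result = {}
    for i, name in ipairs(names) do result[name] = values[i] end
    return result, values[#names + 1]   -- the extra value is the next position
  end

  -- One string.pack call; values are collected in layout order.
  local function encode(t)
    local values = {}
    for i, name in ipairs(names) do values[i] = t[name] end
    return string.pack(fmt, table.unpack(values, 1, #names))
  end

  return decode, encode
end

local decode, encode = layout('>', {
  { 'service_type', 'z'  },
  { 'addr_ton',     'I1' },
  { 'addr_npi',     'I1' },
  { 'addr',         'z'  },
})

local blob = encode{ service_type = 'CMT', addr_ton = 1, addr_npi = 1, addr = '5551212' }
local result, pos = decode(blob)
print(result.service_type, result.addr_ton, result.addr_npi, result.addr, pos)
```

  Keeping the name next to its format item makes a mismatch between the two lists harder to introduce, at the cost of building an intermediate table on every decode.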

  I agree with the strftime minilanguage---I have to look that one up every
single time I use it.  

> An implementation based on a minilanguage doesn't allow user extension, for
> example, you cannot add a "%Q" option to `string.format` in pure Lua.

  That doesn't *have* to be the case though.  A minilanguage *could* be
designed to allow user-defined extensions, but the problem there is mixing
pieces of code that each install their own extensions to the same
minilanguage---there could be conflicts.  And someone maintaining the code
will have to know where to find the code that defines the extensions.
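
  As a sketch of what such an extensible minilanguage might look like in pure Lua: a wrapper around `string.format` with a registry of user-defined conversion letters. The names `register` and `xformat` are hypothetical, and the meaning given to "%Q" here (SQL-style single quoting) is an arbitrary choice for the example. The registry refuses to redefine a letter, which is exactly where the conflicts between independent modules would surface.

```lua
-- Hypothetical registry of user-defined conversions, keyed by letter.
local extensions = {}

local function register(letter, fn)
  assert(not extensions[letter], 'conversion already defined: ' .. letter)
  extensions[letter] = fn
end

-- Like string.format, but conversions found in the registry are handled
-- by the registered function; all others fall through to string.format.
local function xformat(fmt, ...)
  local args, n = table.pack(...), 0
  local out = fmt:gsub('%%([-+ #0]*%d*%.?%d*)([%a%%])', function(mods, conv)
    if conv == '%' then return '%' end
    n = n + 1
    local ext = extensions[conv]
    if ext then return ext(args[n], mods) end
    return string.format('%' .. mods .. conv, args[n])
  end)
  return out
end

-- Example extension: quote a value with doubled single quotes.
register('Q', function(v)
  return "'" .. (tostring(v):gsub("'", "''")) .. "'"
end)

print(xformat("name = %Q, n = %03d", "O'Brien", 7))   --> name = 'O''Brien', n = 007
```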

  -spc
