Design questions for a small s-expression library

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Design questions for a small s-expression library

Antonio Vieiro
Hi all,

I would like to build a small s-expression ([1]) library in Lua (5.1),
so I need to design cons-cells ([2]) first.

I would appreciate some advice in the design of the cons-cells. I am
thinking of two approaches:

A) Design cons-cells as Lua tables, in plain Lua, with a "car" and a
"cdr" being other Lua objects (possibly other cons-cells).
B) Design cons-cells as a C structure (Lua allocated and garbage
collected with a __gc metamethod), that keeps references to other Lua
objects using luaL_ref and luaL_unref (possibly with references to other
cons-cells).

I am thinking that approach B is going to be more performant and consume
less memory. These are my assumptions:

1. B is better because using a Lua table as a cons cell is too heavy
memory-wise.
2. B Is better because I can abuse LUA_REGISTRYINDEX to store references
to Lua objects, that's better than having as many Lua tables in memory.
3. LuaJIT will be as performant in approach B (invoking C functions for
"car", "cdr", etc.) as in approach A (using Lua native code).
4. There's no problem in having circular dependencies using luaL_ref's
between different cons cells, because the Lua GC will detect that and
cleanse cons-cells appropriately.

Am I right in these assumptions? I'm a little worried about point 4.

Thanks in advance,
Antonio


[1] http://en.wikipedia.org/wiki/S-expressions
[2] http://en.wikipedia.org/wiki/Cons

Reply | Threaded
Open this post in threaded view
|

Re: Design questions for a small s-expression library

Philipp Janda
Am 18.01.2014 15:04 schröbte Antonio Vieiro:
> Hi all,

Hi!

>
> I would like to build a small s-expression ([1]) library in Lua (5.1),
> so I need to design cons-cells ([2]) first.
>
> I would appreciate some advice in the design of the cons-cells. I am
> thinking of two approaches:
>
> A) Design cons-cells as Lua tables, in plain Lua, with a "car" and a
> "cdr" being other Lua objects (possibly other cons-cells).
> B) Design cons-cells as a C structure (Lua allocated and garbage
> collected with a __gc metamethod), that keeps references to other Lua
> objects using luaL_ref and luaL_unref (possibly with references to other
> cons-cells).

C) Use closures for cons-cells. See [3].
D) Similar to A, but use the keys `1` and `2` (i.e. the array part of
the table) for head and tail.

`cons`, `car`, `cdr`, `setcar`, and `setcdr` are the only functions that
need to know the difference, so you can even change the implementation
later ...

Memory per cons-cell in bytes (as reported by `collectgarbage"count"`):
table-based Lua 5.x:   144
table-based LuaJIT:     80
closure-based Lua 5.1: 136
closure-based Lua 5.2: 128
closure-based LuaJIT:   76
array-based Lua 5.x:    96
array-based LuaJIT:     56

>
> I am thinking that approach B is going to be more performant and consume
> less memory. These are my assumptions:
>
> 1. B is better because using a Lua table as a cons cell is too heavy
> memory-wise.
> 2. B Is better because I can abuse LUA_REGISTRYINDEX to store references
> to Lua objects, that's better than having as many Lua tables in memory.
> 3. LuaJIT will be as performant in approach B (invoking C functions for
> "car", "cdr", etc.) as in approach A (using Lua native code).
> 4. There's no problem in having circular dependencies using luaL_ref's
> between different cons cells, because the Lua GC will detect that and
> cleanse cons-cells appropriately.
>
> Am I right in these assumptions? I'm a little worried about point 4.

With good reason. You are basically putting reference counting on top of
garbage collection. Reference counting cannot handle cycles, so the
cycles never become garbage. Even if you have chains instead of cycles
you probably need multiple garbage collection runs before the whole
chain is collected (the `__gc` function has to run before the garbage
collector can realize that the next cons-cell may be collected as well).
So I would advise against B even if you can make sure that there will be
no cycles.

>
> Thanks in advance,
> Antonio

Philipp


>
> [1] http://en.wikipedia.org/wiki/S-expressions
> [2] http://en.wikipedia.org/wiki/Cons
>

   [3]: http://permalink.gmane.org/gmane.comp.lang.lua.general/104428



Reply | Threaded
Open this post in threaded view
|

Re: Design questions for a small s-expression library

H. Conrad Cunningham

On Sat, Jan 18, 2014 at 10:38 AM, Philipp Janda <[hidden email]> wrote:
Am 18.01.2014 15:04 schröbte Antonio Vieiro:

I would like to build a small s-expression ([1]) library in Lua (5.1),
so I need to design cons-cells ([2]) first.

I would appreciate some advice in the design of the cons-cells. I am
thinking of two approaches:

A) Design cons-cells as Lua tables, in plain Lua, with a "car" and a
"cdr" being other Lua objects (possibly other cons-cells).
B) Design cons-cells as a C structure (Lua allocated and garbage
collected with a __gc metamethod), that keeps references to other Lua
objects using luaL_ref and luaL_unref (possibly with references to other
cons-cells).

C) Use closures for cons-cells. See [3].
D) Similar to A, but use the keys `1` and `2` (i.e. the array part of the table) for head and tail.

`cons`, `car`, `cdr`, `setcar`, and `setcdr` are the only functions that need to know the difference, so you can even change the implementation later ...

I have a set of example modules I created for the course I taught in my university's Fall semester (late August to mid-December) on my website at 

However, I intentionally did not include anything like a setcar or setcdr, instead forcing creation or a new Cons cell. I wanted immutability of the cells.

My item 15(a) is more or less approach A above.  My item 15(e) is more or less approach C (which I implemented after a similar suggestion by Phillipp Janda to a post to this list).  ... My item 15(b) was an attempt to enforce immutability of the cells.

Approach D would be a relatively straightforward mod to my 15(a).

The core of the module are six primitives: constructors Nil and Cons, field accessors head and tail, and query functions isNil and isList.  These functions "hide" the implementation of the Cons cell from the rest of the module's functions and from outside.

My item 15(c) and 15(d) are attempts to provide "lazy" lists -- requiring a bit of macro processing in the code that uses the module.

- Conrad Cunningham

Reply | Threaded
Open this post in threaded view
|

Re: Design questions for a small s-expression library

Antonio Vieiro
In reply to this post by Antonio Vieiro
El 18/01/14 17:58, [hidden email] escribió:

> Date: Sat, 18 Jan 2014 17:38:31 +0100
> From: Philipp Janda <[hidden email]>
> Subject: Re: Design questions for a small s-expression library
> To: [hidden email]
> Message-ID: <lbealr$6nt$[hidden email]>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> Am 18.01.2014 15:04 schröbte Antonio Vieiro:
> [...]
> C) Use closures for cons-cells. See [3].
> D) Similar to A, but use the keys `1` and `2` (i.e. the array part of
> the table) for head and tail.
>
> `cons`, `car`, `cdr`, `setcar`, and `setcdr` are the only functions that
> need to know the difference, so you can even change the implementation
> later ...
>
> Memory per cons-cell in bytes (as reported by `collectgarbage"count"`):
> table-based Lua 5.x:   144
> table-based LuaJIT:     80
> closure-based Lua 5.1: 136
> closure-based Lua 5.2: 128
> closure-based LuaJIT:   76
> array-based Lua 5.x:    96
> array-based LuaJIT:     56

I thought closures would be more expensive than tables, I may try that.
Thanks, Philipp, for the idea!

>> 4. There's no problem in having circular dependencies using luaL_ref's
>> between different cons cells, because the Lua GC will detect that and
>> cleanse cons-cells appropriately.
>>
>> Am I right in these assumptions? I'm a little worried about point 4.
>
> With good reason. You are basically putting reference counting on top of
> garbage collection. Reference counting cannot handle cycles, so the
> cycles never become garbage. Even if you have chains instead of cycles
> you probably need multiple garbage collection runs before the whole
> chain is collected (the `__gc` function has to run before the garbage
> collector can realize that the next cons-cell may be collected as well).
> So I would advise against B even if you can make sure that there will be
> no cycles.
>

Well, this seems logical: Lua has no way to know that a userdata keeps a
reference to another Lua object, so it's reasonable it cannot detect
cycles between objects storing luaL_ref's among them.

Another option could be doing GC on top of GC, i.e., building a custom
garbage collector for cons-cells. It would be great if one could run
this custom garbage-collector immediately before Lua's garbage
collector. But as far as I can tell there's no "hook" available in Lua
that runs a custom C function before Lua's GC is run.

There's a LUA_MASKCOUNT hook [1], though, that could be used, but only
if nobody else is using it. I may explore using it.

Thanks for the advice,
Antonio


[1] http://www.lua.org/manual/5.1/manual.html#lua_sethook



Reply | Threaded
Open this post in threaded view
|

Re: Design questions for a small s-expression library

Philipp Janda
In reply to this post by H. Conrad Cunningham
Am 19.01.2014 05:35 schröbte H. Conrad Cunningham:

> On Sat, Jan 18, 2014 at 10:38 AM, Philipp Janda <[hidden email]> wrote:
>>
>> `cons`, `car`, `cdr`, `setcar`, and `setcdr` are the only functions that
>> need to know the difference, so you can even change the implementation
>> later ...
>
>
> I have a set of example modules I created for the course I taught in my
> university's Fall semester (late August to mid-December) on my website at
> http://www.cs.olemiss.edu/~hcc/csci658/notes/658lectureNotes.html#cellList
>
> However, I intentionally did not include anything like a setcar or setcdr,
> instead forcing creation or a new Cons cell. I wanted immutability of the
> cells.

I started my experiments that way too. And then I tried to implement
some functions from SRFI-1[1] using only `cons`, `car`, `cdr`, and
recursion. And I got stack overflows for large input lists. Some
functions I could fix easily (e.g. length), but others are difficult to
implement efficiently without mutation. So I added `setcar` and `setcdr`
(and started using explicit loops in the implementation) ...

>
> - Conrad Cunningham
>

Philipp

   [1]: http://srfi.schemers.org/srfi-1/srfi-1.html



Reply | Threaded
Open this post in threaded view
|

Re: Design questions for a small s-expression library

Elias Barrionovo
In reply to this post by Antonio Vieiro

On Jan 18, 2014 12:05 PM, "Antonio Vieiro" <[hidden email]> wrote:
> 3. LuaJIT will be as performant in approach B (invoking C functions for "car", "cdr", etc.) as in approach A (using Lua native code).

Not really, since the "classic" C API aborts LuaJIT traces, while plain Lua is properly JITted (except for a few builtins, such as next()).

Reply | Threaded
Open this post in threaded view
|

Re: Design questions for a small s-expression library

Antonio Vieiro
In reply to this post by Antonio Vieiro
El 21/01/14 10:24, [hidden email] escribió:

> Date: Mon, 20 Jan 2014 10:56:54 -0200
> From: Elias Barrionovo <[hidden email]>
> Subject: Re: Design questions for a small s-expression library
> To: Lua mailing list <[hidden email]>
>
> On Jan 18, 2014 12:05 PM, "Antonio Vieiro" <[hidden email]> wrote:
>> 3. LuaJIT will be as performant in approach B (invoking C functions for
> "car", "cdr", etc.) as in approach A (using Lua native code).
>
> Not really, since the "classic" C API aborts LuaJIT traces, while plain Lua
> is properly JITted (except for a few builtins, such as next()).

Yep, I imagine I would have to use the LuaJIT FFI library.

Anyway the "in C" approach proved to be a pain in the neck from the
memory management point of view. Plain Lua is not doing bad.

Thanks,
Antonio

Reply | Threaded
Open this post in threaded view
|

Re: Design questions for a small s-expression library

Philipp Janda
Am 22.01.2014 08:29 schröbte Antonio Vieiro:

> El 21/01/14 10:24, [hidden email] escribió:
>> Date: Mon, 20 Jan 2014 10:56:54 -0200
>> From: Elias Barrionovo <[hidden email]>
>> Subject: Re: Design questions for a small s-expression library
>> To: Lua mailing list <[hidden email]>
>>
>> On Jan 18, 2014 12:05 PM, "Antonio Vieiro" <[hidden email]>
>> wrote:
>>> 3. LuaJIT will be as performant in approach B (invoking C functions for
>> "car", "cdr", etc.) as in approach A (using Lua native code).
>>
>> Not really, since the "classic" C API aborts LuaJIT traces, while
>> plain Lua
>> is properly JITted (except for a few builtins, such as next()).
>
> Yep, I imagine I would have to use the LuaJIT FFI library.
>
> Anyway the "in C" approach proved to be a pain in the neck from the
> memory management point of view. Plain Lua is not doing bad.

Btw., the closure approach could also be implemented in C and would
consume even less memory than the other approaches:

Lua 5.1 C closure-based:  72
Lua 5.2 C closure-based:  64
LuaJIT C closure-based:   48

The remark about aborted traces in LuaJIT is still true, although I've
heard that the latest LuaJIT can trace "around" C API calls. Not sure if
FFI can help in case you want to store arbitrary Lua values ...


>
> Thanks,
> Antonio
>

Philipp