Re: More about packaging (fwd)

Re: More about packaging (fwd)

Diego Nehab-3
Hi,

> If the current scheme is to be modified:
> 1) Unloadlib support please. (For Win32 - FreeLibrary())

This will require some support in C. Probably the best way
to do this would be to have a GC method associated with the
namespace table. While we are at it, we might create a _LOADED
table specific to loadlib, just like require has one. I don't
like using the same table for both require and loadlib. Perhaps
_LOADLIB, with weak values.

I will try to implement something when I am back on Tuesday
and submit to the list to see if everyone is happy. Ideally,
whatever we come up with could become part of lauxlib and
thus make it even more likely that developers will use it.
If I am happy enough with it, I will bother Roberto with it. :o)

[]s,
Diego.


Re: More about packaging (fwd)

Edgar Toernig
[hidden email] wrote:
>
> Hi,
> 
> > If the current scheme is to be modified:
> > 1) Unloadlib support please. (For Win32 - FreeLibrary())
> 
> This will require some support in C. Probably the best way
> to do this would be to have a GC method associated with the
> namespace table.

IMHO, that would be too fragile - a script could crash the application[1].

What about the method outlined below?

|[ http://lua-users.org/lists/lua-l/2002-03/msg00199.html ]
|
| I meant real garbage collection.  If there's no reference to the dynamic
| binary it should be unloaded.  Because the only thing it can export are
| functions that's not that difficult.
|
| At the start you only have the startfunc.  All future pushcclosure with
| a function from that binary has to be from startfunc or from some function
| created by it.  And so on.
|
| So, if one could keep track of a "function creation history" you can
| detect when a dynamic binary is no longer in use.  Luckily that's easy *g*
| You just keep a reference to the binary in the upvalues:
|
|  - the change to Lua's core: lua_pushcclosure always appends the last
|    upvalue of the currently active cclosure to the new cclosure (or nil
|    if there is no active cclosure).
|
|  - loadlib creates a userdata containing the handle for that binary.
|    A GC call of the userdata will unload the binary.
|
|  - loadlib attaches that userdata as the single upvalue for the created
|    cclosure of startfunc.  (Needs a special pushcclosure or a magic
|    nupvalue so that lua_pushcclosure will not do its append stuff.)
|
| That's all.  Automatic garbage collection of dynamic loaded binaries.

Ciao, ET.

PS: Btw, what happens under Windows when an application tries to load a dll
    multiple times?  Is it allowed?  How many FreeLibrary-calls are needed?
    Does each instance get its own data/bss segment?

[1] loadlib"io"  f=io.open"foo"  io=nil  GC()  print(f:read"*a")


Re: More about packaging (fwd)

Asko Kauppi-3

As far as I know:

Multiple DLL usage within the same process goes to same DLL instance. Needs as many frees as there have been loads.

Other processes will see their own instance (code pages etc. shared of course), unless the DLL uses inter-process data sharing.
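
The load/free pairing described above can be modelled with a small reference-counted table. This is only a sketch of the behaviour, not a call into the Windows API: the names model_load, model_free and lib_entry are made up for illustration, and the fixed table size is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LIBS 8   /* arbitrary limit, assumption for this sketch */

/* One entry per distinct library name; all names here are hypothetical. */
struct lib_entry {
    char name[64];
    int refcount;    /* number of loads not yet matched by a free */
};

static struct lib_entry libs[MAX_LIBS];

/* Returns the shared entry for `name`, creating it on the first load. */
struct lib_entry *model_load(const char *name)
{
    struct lib_entry *free_slot = NULL;
    assert(name != NULL && strlen(name) < sizeof libs[0].name);
    for (size_t i = 0; i < MAX_LIBS; i++) {
        if (libs[i].refcount > 0 && strcmp(libs[i].name, name) == 0) {
            libs[i].refcount++;      /* same instance, one more reference */
            return &libs[i];
        }
        if (libs[i].refcount == 0 && free_slot == NULL)
            free_slot = &libs[i];
    }
    if (free_slot == NULL)
        return NULL;                 /* table full */
    strcpy(free_slot->name, name);
    free_slot->refcount = 1;
    return free_slot;
}

/* Drops one reference; returns 1 if this call really unloaded the library. */
int model_free(struct lib_entry *lib)
{
    assert(lib != NULL && lib->refcount > 0);
    return --lib->refcount == 0;
}
```

Loading the same name twice yields the same entry, and only the second model_free reports an unload, which matches "as many frees as there have been loads".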


http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/freelibrary.asp
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/process_and_thread_functions.asp

ps.
What's the real usage/need scenario for the unloadlib? Server apps or something (plugin stuff) is the only place where I could see it being really needed.


On 6.6.2004 at 01:06, Edgar Toernig wrote:

> PS: Btw, what happens under Windows when an application tries to load a dll
> multiple times? Is it allowed? How many FreeLibrary-calls are needed?
> Does each instance get its own data/bss segment?



Re: More about packaging (fwd)

Edgar Toernig
Asko Kauppi wrote:
>
> Edgar Toernig wrote:
> >
> > Btw, what happens under Windows when an application tries
> > to load a dll multiple times?  Is it allowed?  How many
> > FreeLibrary-calls are needed?  Does each instance get
> > its own data/bss segment?
>
> As far as I know:
>
> Multiple DLL usage within the same process goes to same DLL instance.   
> Needs as many frees as there have been loads.
>
> Other processes will see their own instance (code pages etc. shared of  
> course), unless the DLL uses inter-process data sharing.

Thank you.  One last clarification: what happens with the data
section of the dll (i.e. global vars)?  Do all instances of the
dll within a single process get the same data segment (and thus
share their data), or does each instance get a private data
segment?

> What's the real usage/need scenario for the unloadlib?

What's the usage of GCing stuff that was loaded with 'loadfile'?
Why should 'loadlib' leak memory?  Whether code is implemented
in Lua or in a dll should be an implementation detail the user
shouldn't care about.  The interfaces are already pretty similar
(input a file name, output a function).  Maybe someday loadlib
gets incorporated into loadfile the same way as precompiled Lua
scripts.  The user wouldn't even know what kind of code he's
loading, a Lua source, a precompiled script or a shared library.
Shouldn't then the GC behaviour be the same too?

Ciao, ET.



> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/ 
> dllproc/base/freelibrary.asp
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/ 
> dllproc/base/process_and_thread_functions.asp

These links don't seem to work... both give the same "Welcome" page.


Re: More about packaging (fwd)

Mike Pall-43
In reply to this post by Edgar Toernig
Hi,

about "unloadlib" ...

Well, let's face it:

- 99.9% of all developers will never need this because 99.9% of all apps
  have a static working set of libraries.

- Those who need it, know what they are doing. They will take extra
  precautions to remove all references to the library before unloading.

- An extensive framework to support automatic unloading is very difficult
  to get right and will not benefit most developers.
  [I'm not putting down your solution, Edgar. I just don't think we need
   a general solution of that proportion.]

- There is no way we can prevent everyone from shooting themselves in the
  foot. There are far more loopholes than we could possibly close.

- Neither Perl nor Python has such a feature. The import/require/use
  statements and the associated library management never unload libraries.
  But both provide a low-level way to explicitly load/unload a dynamic
  library file. And the docs state that you need to be extra careful
  if you give any such library reference to the hands of the internal GC
  and then try to unload the library.

- Lua is a lightweight framework that should promote lightweight solutions.

The current solution already passes a handle to the library as an upvalue
to the function loaded from the library. So this function has the ability
to provide support for unloading. It's just that no libraries I know of do
that. Maybe because it's undocumented?

Another possibility would be to add the library handle (lightuserdata) as
a second return value to loadlib(), so you can write:

func, handle = loadlib(...)
[use library]
...
[carefully destroy all references to library]
unloadlib(handle)

That would be real simple to implement, but I don't know whether this is
really needed. If you write your own libraries you might as well provide
an _unload() function in your function table. If third-party libraries
do not provide such a function then they are probably not safe to unload.

Anyway, I second Asko Kauppi's request for use cases. Can we please have
an indication whether this is an issue about own libraries or third-party
libraries in general?

And about the danger of crashing Lua: If you can use loadlib(), you can
always provoke a crash by referencing an innocent symbol:
  loadlib("libc.so.6", "read")() -> Segmentation fault
So I think this is a non-issue.

BTW: Is there a bug tracker for Lua? While experimenting with this I
     discovered by accident that ...
       debug.getupvalue(function() end, 0)
     ... crashes Lua 5.0.2 and 5.1-work0 due to a missing range check in
     src/lapi.c:aux_upvalue().

Bye,
     Mike


Re: More about packaging (fwd)

Edgar Toernig
Mike Pall wrote:
>
> - An extensive framework to support automatic unloading is very difficult
>   to get right and will not benefit most developers.

Sorry?  2-5 lines of code in the Lua core plus the dlclose handling in
the loader library.  That's all!

> - There is no way we can prevent everyone from shooting themselves in the
>   foot. There are far more loopholes than we could possibly close.

Hmm?  Afaik, any way to crash the app from a Lua script is considered
a serious bug.

> The current solution already passes a handle to the library as an upvalue
> to the function loaded from the library. So this function has the ability
> to provide support for unloading. It's just that no libraries I know of do
> that. Maybe because it's undocumented?

A library cannot unload itself.  The code to perform that has to be
in another segment, else the dlclose call returns to memory that has
just been unmapped.  Besides that, you would require system-dependent
code in each library.

> And about the danger of crashing Lua: If you can use loadlib(), you can
> always provoke a crash by referencing an innocent symbol:
>   loadlib("libc.so.6", "read")() -> Segmentation fault

That's only because of the trivial loadlib implementation.  A sane
version would perform checks similar to those in the precompiled-
script loader.  I.e. in my implementation, every dynamically loaded
binary has a structure with a fixed name in the .so that describes
the library: version info, number size and type, start function.
That data is checked and only if everything's ok, the cclosure of
the start function is returned.

Ciao, ET.


Re: More about packaging (fwd)

D Burgess-2
In reply to this post by Diego Nehab-3
I concur with Edgar's comments.

However, it seems that this minor issue has hijacked Diego's
original agenda, namely the best way to manage (and mask) the
difference between static and dynamically loaded libraries.

Unload support would be a nice addition but is not critical.

The original question was in

http://lua-users.org/lists/lua-l/2004-06/msg00041.html 

regards
David B


Re: More about packaging (fwd)

Peter Loveday-2
Just to raise another issue (yet again).

This would be an excellent time to settle on some 'standard' names for Lua .dll/.so files, so that modules might actually be compatible.....

Love, Light and Peace,
- Peter Loveday
Director of Development, eyeon Software

----- Original Message -----
From: "D Burgess" <[hidden email]>
To: "Lua list" <[hidden email]>
Sent: Monday, June 07, 2004 8:42 AM
Subject: Re: More about packaging (fwd)

> I concur with Edgars comments.
>
> However, it seems that this minor issue has hijacked Diegos
> original agenda, namely; the best way to manage (and mask) the
> difference between static and dynamically loaded libraries.
>
> Unload support would be a nice addition but is not critical.
>
> The original question was in
>
> http://lua-users.org/lists/lua-l/2004-06/msg00041.html
>
> regards
> David B


Re: More about packaging (fwd)

Edgar Toernig
In reply to this post by D Burgess-2
D Burgess wrote:
>
> [...] the best way to manage (and mask) the
> difference between static and dynamically loaded libraries.

I played around with that already.  My attempt was like this:

lua.h defines a macro LUA_MODULE(module_name, start_func).
Every module (a static or dynamically loaded library) has one
instance of this macro in one of its C files, e.g.:

foo.c:
    LUA_MODULE("foo", foo_init)
    ...
    int foo_init(lua_State *L)
    {
        ...// a regular lua_CFunction to initialize the module
    }
    ...

Depending on whether the module should be compiled for static
or dynamic loading, the LUA_MODULE macro expands to different
(possibly system-dependent) code.  The idea here is that the
makefile specifies whether a library becomes static or dynamic.
The library source is always exactly the same.

E.g., for a dynamic shared library the macro expands to:

    int foo_init(lua_State *L);  /* prototype; lua_CFunction itself is a pointer type */
    lua_Dynlib lua_Dynlib_info = {
        LUA_DYNLIB_VERSION,
        sizeof(lua_Dynlib),
        sizeof(lua_Number), (lua_Number)LUA_DYNLIB_TESTNUMBER,
        "foo", foo_init
    };

The dynloader fetches the "lua_Dynlib_info" symbol, checks whether
the data in the struct-fields match the currently running inter-
preter, and, if everything's ok, pushes a cclosure for foo_init.
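
A minimal sketch of such a conformance check follows. It does not use the real Lua headers: sketch_Number, sketch_InitFunc, sketch_Dynlib, the field names and the two constants are stand-ins invented for illustration, and the real struct layout in the implementation described above is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for lua_Number and lua_CFunction; assumptions for this sketch. */
typedef double sketch_Number;
typedef int (*sketch_InitFunc)(void *L);

#define SKETCH_DYNLIB_VERSION    1        /* hypothetical ABI version */
#define SKETCH_DYNLIB_TESTNUMBER 1234.5   /* hypothetical probe value */

typedef struct sketch_Dynlib {
    int version;                /* descriptor format version */
    size_t struct_size;         /* sizeof the descriptor as the module saw it */
    size_t number_size;         /* sizeof the number type the module was built with */
    sketch_Number test_number;  /* catches a same-size but different number type */
    const char *name;           /* module name */
    sketch_InitFunc init;       /* start function */
} sketch_Dynlib;

/* Returns the start function if the descriptor matches the running
   interpreter, or NULL if any field disagrees. */
sketch_InitFunc sketch_check_dynlib(const sketch_Dynlib *info)
{
    assert(info != NULL);
    if (info->version != SKETCH_DYNLIB_VERSION) return NULL;
    if (info->struct_size != sizeof(sketch_Dynlib)) return NULL;
    if (info->number_size != sizeof(sketch_Number)) return NULL;
    if (info->test_number != (sketch_Number)SKETCH_DYNLIB_TESTNUMBER) return NULL;
    if (info->name == NULL || info->init == NULL) return NULL;
    return info->init;
}

/* Example start function a module might export. */
static int sketch_foo_init(void *L) { (void)L; return 0; }
```

A loader would fetch the descriptor symbol from the library, pass it to sketch_check_dynlib, and only push a closure when the result is non-NULL.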


In the static case, the macro may become something like this:

    int foo_init(lua_State *L);  /* prototype; lua_CFunction itself is a pointer type */
    static void module_ctor(void) __attribute__((constructor));
    static void module_ctor(void)
    {
        lua_registermodule("foo", foo_init);
    }

The purpose of lua_registermodule is to collect a list of all
builtin modules.  The dynamic loader may search this list to
find a module before trying dlopen.

The above version requires compiler support for constructor functions
(called before main is entered).  If that is not available (or if one
does not want this magic), one could run a script from the makefile
to collect all LUA_MODULE statements of the selected sources and
create a helper c-file that may look like this:

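    /* generated helper file: prototypes of the collected start functions */
    int foo_init(lua_State *L);
    int bar_init(lua_State *L);
    #define LUA_MODULE(name,func) lua_registermodule(name, func);
    /* not static: the application calls this from another file */
    void register_lua_modules(void)
    {
        LUA_MODULE("foo", foo_init)
        LUA_MODULE("bar", bar_init)
    }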

and which is linked to the executable together with the foo.o and
bar.o (the LUA_MODULE macro expands to nothing in foo and bar.o).
Then it's the application's duty to call register_lua_modules once
during startup.
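
For illustration, the registry behind lua_registermodule could look roughly like the following. This is a sketch under assumptions: the sketch_ names, the fixed-size table and the lookup function are invented here, and sketch_InitFunc stands in for lua_CFunction so the code compiles without the Lua headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*sketch_InitFunc)(void *L);   /* stand-in for lua_CFunction */

#define SKETCH_MAX_MODULES 32              /* arbitrary limit for the sketch */

static struct {
    const char *name;
    sketch_InitFunc init;
} sketch_modules[SKETCH_MAX_MODULES];
static size_t sketch_nmodules;

/* Records a builtin module; returns 0 on success, -1 if the table is full. */
int sketch_registermodule(const char *name, sketch_InitFunc init)
{
    assert(name != NULL && init != NULL);
    if (sketch_nmodules == SKETCH_MAX_MODULES)
        return -1;
    sketch_modules[sketch_nmodules].name = name;
    sketch_modules[sketch_nmodules].init = init;
    sketch_nmodules++;
    return 0;
}

/* Looks up a builtin module by name; NULL tells the loader to try dlopen. */
sketch_InitFunc sketch_findmodule(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sketch_nmodules; i++)
        if (strcmp(sketch_modules[i].name, name) == 0)
            return sketch_modules[i].init;
    return NULL;
}

/* Example start functions of two statically linked modules. */
static int sketch_foo_init(void *L) { (void)L; return 0; }
static int sketch_bar_init(void *L) { (void)L; return 0; }
```

The table only stores pointers, so the name strings must outlive the registry; string literals passed from the macro satisfy that.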


[Btw, both versions (constructor and helper c-file) would allow
building a hybrid shared library consisting of multiple static
libraries.]


Ideally, all this functionality is hidden inside loadfile.
loadfile first checks the registered modules table, then
searches on disk and decides whether the file is a dll/so,
a precompiled file, or a simple source.  Or, to make the
order selectable, one could add a magic directory (like
$builtin$) to the LUA_PATH variable to tell the loader
when to look into the module table, e.g.:

   $builtin$/?;/usr/lib/lua5/?.so;/usr/share/lua5/?.lua
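
How a loader might walk such a path can be sketched as below. The function name, its return codes and the exact "$builtin$/" prefix test are assumptions made for this example; only the ';'-separated templates and the '?' placeholder come from the path shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUILTIN_PREFIX "$builtin$/"   /* magic directory, as in the path above */

/*
 * Expands the index-th ';'-separated template of `path` into `out`,
 * replacing every '?' with `name`.
 * Returns 1 for an on-disk candidate, 2 for the magic builtin entry,
 * 0 if there is no such entry, and -1 if `out` is too small.
 */
int sketch_path_candidate(const char *path, const char *name, size_t index,
                          char *out, size_t outsize)
{
    assert(path != NULL && name != NULL && out != NULL && outsize > 0);

    const char *start = path;
    for (size_t i = 0; i < index; i++) {      /* skip earlier entries */
        start = strchr(start, ';');
        if (start == NULL)
            return 0;
        start++;
    }
    if (*start == '\0')
        return 0;                             /* nothing after a trailing ';' */

    const char *end = strchr(start, ';');
    if (end == NULL)
        end = start + strlen(start);

    size_t prefix_len = strlen(SKETCH_BUILTIN_PREFIX);
    int builtin = (size_t)(end - start) >= prefix_len &&
                  strncmp(start, SKETCH_BUILTIN_PREFIX, prefix_len) == 0;

    size_t used = 0, namelen = strlen(name);
    for (const char *p = start; p < end; p++) {
        if (*p == '?') {                      /* substitute the module name */
            if (used + namelen >= outsize)
                return -1;
            memcpy(out + used, name, namelen);
            used += namelen;
        } else {                              /* copy the template character */
            if (used + 1 >= outsize)
                return -1;
            out[used++] = *p;
        }
    }
    out[used] = '\0';
    return builtin ? 2 : 1;
}
```

A loader would call this with index 0, 1, 2, ... until it returns 0, consulting the registered-module table on a return of 2 and the file system on a return of 1.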



The first point from the OP's posting (namespaces) could be
discussed independently.  It's usage policy.  I would prefer
fixed global names.  You have to manage the module-name namespace
anyway.  If you get collisions in the global variable
names, you're likely to get collisions in the module names
too.  If I say "to use this module you have to require'foo'"
I could also add "and then you can access its functions via
the table Foo".  Besides, it makes reading others' sources easier
when each module has a fixed name.  Else you may end up with
"str.toupper", "S.toupper" and "String.toupper".  Then, what
if a module (like, e.g., "stdlib") wants to export multiple
tables?  As a last con from me: it's more complicated!  You have
to cache results of the first require call so that other
scripts can fetch the namespace table later.

But as I said, this can (and should) be discussed independently.

Ciao, ET.


static & dynamic linkage

Asko Kauppi-3
In reply to this post by D Burgess-2

Just for a reference, this is how LuaX allows static/dynamic linkage of a module:

- No changes necessary in module source code (this is important!)

- By default, modules are compiled dynamically, 'GluaModule()' being their entry point.

- Defining 'GLUA_MODULE_NAME=GluaModule_sys' (or similar) at compilation, the entry point can be changed and the module linked statically.

I use static linkage for the 'sys' module only, because it's an easy way to get directory access and similar services. If one wants, multiple modules can be linked statically, since each entry point may be uniquely named.

LuaX does not use the regular 'loadlib' approach for linkage, so this approach may not suit others.

-ak


On 7.6.2004 at 02:12, D Burgess wrote:

> I concur with Edgars comments.
>
> However, it seems that this minor issue has hijacked Diegos
> original agenda, namely; the best way to manage (and mask) the
> difference between static and dynamically loaded libraries.
>
> Unload support would be a nice addition but is not critical.
>
> The original question was in
>
> http://lua-users.org/lists/lua-l/2004-06/msg00041.html
>
> regards
> David B


Re: More about packaging (fwd)

Diego Nehab-3
In reply to this post by D Burgess-2
Hi,

> However, it seems that this minor issue has hijacked Diegos
> original agenda, namely; the best way to manage (and mask) the
> difference between static and dynamically loaded libraries.
>
> Unload support would be a nice addition but is not critical.

Since the unloading seems to be an area of less agreement, I started
with the original problem. So far, I have an implementation of
"requirelib", just like Tiago Dionizio's "requireso", which is a better
way of doing things than my previous approach. ("lib" seemed more
appropriate since other systems don't know what "so" is).

The only difference is that just like "require" uses the _LOADED table,
"requirelib" uses the _LOADEDLIB table to cache.  I made both  _LOADED
and _LOADEDLIB be weak tables to allow for garbage collection.  The
function uses the LUA_PATHLIB environment variable for a search path.

Whenever a C library is loaded, it should set its return value in the
_LOADEDLIB table. That way, if it is linked statically and a user calls
"requirelib", the function will work as expected. The same applies to
"require": Lua modules should also set their return value in the _LOADED
table, in case they are loaded statically (precompiled).

This seems to solve the "dynamic vs. static" issue, but I agree it's
somewhat artificial. Any suggestions?

The interfaces are similar, but I am not sure requirelib and require
should be unified. Maybe it's good to know when a library is C, and when
it's Lua...

[]s,
Diego.


Re: More about packaging (fwd)

Tiago Dionizio-3
Hi,

On Mon, 7 Jun 2004 [hidden email] wrote:

> Since the unloading seems to be an area of less agreement, I started
> with the original problem. So far, I have an implementation of
> "requirelib", just like Tiago Dionizio's "requireso", which is a better
> way of doing things than my previous approach. ("lib" seemed more
> appropriate since other systems don't know what "so" is).

I agree.

> The only difference is that just like "require" uses the _LOADED table,
> "requirelib" uses the _LOADEDLIB table to cache.  I made both  _LOADED
> and _LOADEDLIB be weak tables to allow for garbage collection.  The
> function uses the LUA_PATHLIB environment variable for a search path.

If you call this function directly, the _LOADED table would be useful.
But personally I don't call it directly to load external libraries. If
I have a library I want to load, I create a simple Lua script (for
example 'http.lua') and put something like this in it:
>> http = requirelib("http", "luaopen_http")()
>> ...etc

Later, when I need to use the http library, I call
>> require('http')

This way, the http library is only loaded once. Writing another Lua file
just for the loading doesn't bother me, and it also hides the details of
which function to call from the dynamic module.

Tiago



Re: More about packaging (fwd)

Roberto Ierusalimschy
In reply to this post by Mike Pall-43
> Is there a bug tracker for Lua?

The list ;) The bug is already tracked. Thank you.

-- Roberto


RE: More about packaging (fwd)

Marius Gheorghe
In reply to this post by Diego Nehab-3
> This will require some support in C. Probably the best way
> to do this would be to have a GC method associated with the
> namespace table. While we are at it, we might create a _LOADED
> table specific to loadlib, just like require has one. I don't
> like using the same table for both require and loadlib. Perhaps
> _LOADLIB, with weak values.

I don't know about other OSs but GetModuleHandle("xxx.dll") can be used
under Windows to determine whether a library is already loaded (statically
or dynamically). This mechanism is not necessarily a replacement of _LOADLIB
but could help.

Marius



Re: More about packaging (fwd)

Jamie Webb-3
In reply to this post by Diego Nehab-3
On Monday 07 June 2004 12:42, [hidden email] wrote:
> The interfaces are similar, but I am not sure requirelib and require
> should be unified. Maybe it's good to know when a library is C, and when
> it's Lua...

I think I like the LuaCheia method: all modules are fronted by Lua scripts, 
which may either be the module itself (for pure Lua modules), be a stub which 
loads a C library using some method, or be a mix of C and Lua. All modules 
are then consistently loaded using cheia.load().

-- Jamie Webb


Re: More about packaging (fwd)

Mike Pall-43
In reply to this post by Edgar Toernig
Hi,

Edgar Toernig wrote:
> > - An extensive framework to support automatic unloading is very difficult
> >   to get right and will not benefit most developers.
> 
> Sorry?  2-5 lines of code in the Lua core plus the dlclose handling in
> the loader library.  That's all!

Ok, after thinking about this thoroughly, you convinced me. The overhead
is negligible and the benefit is added orthogonality. I can see the value
in this, even though the average app won't need it right now. But since
dynamic frameworks (aka 'pluggable' apps) are getting more and more common,
it will pay back sooner or later.

The real question is how to get this into the core (or else we can forget
about it). Reading the 3/2002 lua-l thread, it looks like it was not
adopted back then? :-/

Ok, back to the original topic:

I like your way for unifying static and dynamic C libraries. Comparable
approaches (Linux kernel modules, Python loadable modules ...) have proven
their value in practice, so please let's do it! I have been bitten by API
incompatibilities with plugins in other projects, so I can appreciate the
value of stringent and safe conformance checks on loadable modules.

And about the other open issues:

- C vs. Lua libraries: Diego seems to lean towards providing two different
  functions for loading them. Edgar would like to integrate both into
  loadfile().
  My opinion: the importing code should NOT need to know what kind of
  library it is loading. I'm all in favour of having one and ONLY ONE
  function to do all of this (whatever it is named or wherever the code
  for it is located).

- Namespaces: I think there is a consensus that the inconsistencies in the
  current model (globals, _LOADED, return from require) should be removed
  (static vs. dynamic C libraries vs. Lua libraries all behave differently).

  Diego's proposal (local foo = require "foo" and not setting globals) has
  some merit, because it requires consistency from the coder. Otherwise
  you tend to forget to import a library and just because another library
  that was loaded earlier imported it, you get away with it (until someone
  changes the load order ...).

  Edgar is right that a 'set globals only' model simplifies many things.
  Libraries offering multiple namespaces have no straightforward equivalent
  in the other model.

  My opinion: maybe we can get the benefits of both:

  - Imported libraries set one or more globals, the primary name being one
    of them. The return value does not need to be cached (i.e. _LOADED
    could be dropped except for backwards compatibility).

  - require(name) checks whether _G[name] is set and either just returns
    this value or loads the library and then returns _G[name].

  - Lazy coders can use 'require "foo"' and just use the provided globals.
    Some coders may prefer to add a bunch of calls to require during
    initialization and then do not need to worry about it anymore.

  - Less lazy coders can use 'local foo = require "foo"' and get the
    speed benefit of GETUPVAL vs. GETGLOBAL.

  - Coders that want a clean namespace and stringent checks can use a
    special function at the top of their files (probably called 'module'
    or 'package'). This function sets up a new environment table for the
    caller, populated only with the require function. This forces the
    use of 'local foo = require "foo"' for all libraries (even for "io",
    "string" and so on).

  - So everyone is happy, and it even mixes well because the choice is up to
    the coder of the importing file (and not the coder of the imported
    library, the language used or the linking model).
    Oh, and I'm not picky about the name of the one-grand-unified import
    function (I chose 'require' in the examples; 'import' may be
    another popular choice).

Bye,
     Mike


Re: More about packaging (fwd)

Mike Pall-43
In reply to this post by Jamie Webb-3
Hi,

Jamie Webb wrote:
> I think I like the LuaCheia method: all modules are fronted by Lua scripts, 
> which may either be the module itself (for pure Lua modules), be a stub which 
> loads a C library using some method, or be a mix of C and Lua. All modules 
> are then consistently loaded using cheia.load().

Another benefit: Searching for dynamic libraries in a large library path
may be quite expensive, depending on the OS. This means we get a net benefit
if we assume that a 'grown-up' Lua library collection has more Lua code than
C code (as the Python and Perl precedents show).

But we still need to clear up 'loads a C library using some method'.
An approach that does not require hardcoding system dependencies in the
stub would be nice. Added bonus if the stub can be a symlink. :-)

BTW: A bad example for automatic searching for all possible module
     variants:

$ strace -o foo python -c pass; grep open foo
...
open("/usr/lib/python23.zip/os.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT
open("/usr/lib/python23.zip/osmodule.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT
open("/usr/lib/python23.zip/os.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT
open("/usr/lib/python23.zip/os.pyc", O_RDONLY|O_LARGEFILE) = -1 ENOENT
open("/usr/lib/python2.3/os.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT
open("/usr/lib/python2.3/osmodule.so", O_RDONLY|O_LARGEFILE) = -1 ENOENT
open("/usr/lib/python2.3/os.py", O_RDONLY|O_LARGEFILE) = 4
open("/usr/lib/python2.3/os.pyc", O_RDONLY|O_LARGEFILE) = 5
...

Multiply that by the number of import statements and by the path length
(this installation has an extremely short default path of two elements).
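
A rough way to see the multiplication is to enumerate the candidates a naive searcher would try. The sketch below is only an illustration: candidates is a hypothetical helper, and the directory and suffix lists are copied from the trace above.

```lua
-- Hypothetical helper: list every filename a naive searcher would probe,
-- in directory-major order, for one module name.
local function candidates(name, dirs, suffixes)
  local list = {}
  for _, dir in ipairs(dirs) do
    for _, suffix in ipairs(suffixes) do
      list[#list + 1] = dir .. "/" .. name .. suffix
    end
  end
  return list
end

local dirs = { "/usr/lib/python23.zip", "/usr/lib/python2.3" }
local suffixes = { ".so", "module.so", ".py", ".pyc" }

-- Worst case for one import: #dirs * #suffixes open() calls.
local probes = candidates("os", dirs, suffixes)
```

With two directories and four variants that is already up to eight probes per import; a longer path scales the count linearly.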

Bye,
     Mike


Re: More about packaging (fwd)

RLak-2
In reply to this post by Diego Nehab-3

I am sorry I missed this interesting discussion, and I apologise for joining it a bit late. (I just moved and my ISP was lamentably slow in reconnecting me to the internet.)

1. I personally feel quite strongly that library loading should not automatically insert anything in the global namespace, whether the library in question is a Lua library or a C library. Furthermore, the end user should be protected from knowing details like "what is the entry point of the library in a {.dylib, .so, .dll} file?" So I would like a simple uniform interface like this:

local string = require "string"
coro = require "coroutine"

It does not then bother me that the name I use in the module is "standard"; the require statement includes the standard name.

2. I also believe that it is necessary to distinguish between two use cases:

  1) Module A requires Module B in order to install itself.
  2) Module A will require Module B during execution of its functions.

This allows for modules to have circular dependencies, which seems to be useful. There was a long discussion about this some time ago; although I am not a big fan of circular dependencies, I can see that there are circumstances where they are unavoidable.

3. Finally, I believe it important that it be easy to sandbox module tables. As I have mentioned before, this requires that each sandbox have its own table. It is entirely legitimate to override a module function (or other member), or to add new ones; but sandboxes need to be protected from each other doing this.

One of the wonderful things about Lua is that all of this machinery is quite simple to implement. However, as Diego points out, a little standardisation would help a lot in implementing it.

At a bit more length:

1. Module tables vs. module instantiators

Internally, modules are always generated by executing a function, either a Lua chunk or some lua_CFunction, typically luaopen_*(L). I'm calling this the module instantiator. Currently, the calling conventions for instantiators vary considerably between proposals; my suggestion is that the standard calling convention be that the instantiator is passed a table as its first parameter, which it fills in and returns. The module system then caches both instantiators and tables. The basic logic of require is "if the table has been cached, return it; if the module has an instantiator, use it to create the table; otherwise, find the Lua or C library and get the instantiator from that." A rough implementation is found below.
 
A simple example in Lua looks like this:

--module complex

local math = _MODULE.math -- see below

return function(complex, name)
  function complex.new(real, imag) return {r = real, i = imag} end
  function complex.add(a, b) return {r = a.r + b.r, i = a.i + b.i} end
  function complex.abs(a) return math.sqrt(a.r^2 +a.i^2) end
  -- etc.
  return complex
end

In C, this looks about the same (considerably simplified):

/* ... some code at end of message ... */

LUALIB_API int luaopen_complex(lua_State *L) {
  luaL_newmetatable(L, "Complex");
  /* Written out here, but the simplification to luaL_openlib() is obvious */
  lua_pushliteral(L, "new"); lua_pushcfunction(L, complex_new); lua_settable(L, 1);
  lua_pushliteral(L, "add"); lua_pushcfunction(L, complex_add); lua_settable(L, 1);
  lua_pushliteral(L, "abs"); lua_pushcfunction(L, complex_abs); lua_settable(L, 1);
  /* ... */
  lua_settop(L, 1);
  return 1;
}

The second argument ("name") is provided for autoloaders. This is a slight change from the current autoloader behaviour, where the name is passed to the chunk through a global. Functionally, there is no difference, though.

Note that in the first case, executing the Lua chunk returns an instantiator. In the second case, the instantiator is luaopen_complex. So in both cases, we have an instantiator function to play with. Static libraries (and dynamically generated modules, for that matter) can be registered by adding their instantiator function to the instantiator cache.

The rough implementation below assumes that C module entry points will always be in the form luaopen_<name>, whether or not they are static; this seems like a simple convention and avoids issues with hypothetical OSs which don't like multiply defined dynamically loaded symbols.

(Edgar is quite right that basic version/sanity checks should be performed prior to calling the entry function.)

The advantage of instantiators is that they can be re-executed in different sandboxes, in order to fill independent module tables.
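
As a small illustration of that point, the sketch below runs one instantiator twice. The sandbox_a and sandbox_b tables are hypothetical stand-ins for real sandboxes; the instantiator follows the "receive a table, fill it in, return it" convention described above.

```lua
-- Instantiator: receives a table, fills it in, returns it.
local function complex_instantiator(complex, name)
  function complex.new(real, imag) return { r = real, i = imag } end
  function complex.add(a, b) return complex.new(a.r + b.r, a.i + b.i) end
  return complex
end

-- Each (hypothetical) sandbox gets its own module table from the same instantiator.
local sandbox_a = { complex = complex_instantiator({}, "complex") }
local sandbox_b = { complex = complex_instantiator({}, "complex") }

-- Sandbox A overrides a member; sandbox B must not see the change.
sandbox_a.complex.add = function() return "overridden" end

local cb = sandbox_b.complex
local sum_b = cb.add(cb.new(1, 2), cb.new(3, 4))
```

Because the two tables are distinct, the override in one sandbox leaves the other untouched.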
   
2. Global namespace

I am not so much concerned here about name collision; as Diego (I think) pointed out, modules need to have unique names one way or another. The problem is that there is no reliable way for require() to know which environment table to insert the module into. Furthermore, internal references within the module may or may not be in the same environment as the caller of require(), so hacks like using getfenv to walk up the stack are going to lead to unpredictable results. This is particularly an issue for foreign ("C") modules, which have a different environment than any sandbox. Returning the module table as the result of the require() is much simpler all round.

3. Preloading / circular dependencies

The code below creates _MODULE with a __index metamethod which creates the module table, ready to be filled in. This can be used directly for the case where the library only needs to exist, and will not be called until later. The module tables are then created with a __index metamethod which acts as a trigger; this is based on the implementation in LTN 11.
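
To make the circular case concrete, here is a compact, self-contained sketch of the same trigger idea. It is not the implementation from this post: the instantiators table stands in for the file search, and the even/odd modules are invented for the example.

```lua
local instantiators = {}                          -- name -> instantiator (replaces the path search)
local names = setmetatable({}, { __mode = "k" })  -- module table -> name

-- Trigger: the first member access fills the module table in.
local trigger = {}
function trigger:__index(key)
  local name = names[self]
  local make = assert(instantiators[name], "no instantiator for module")
  setmetatable(self, nil)   -- disarm before instantiating
  make(self, name)
  return self[key]
end

-- Referencing an unknown module name creates its (still empty) table.
local modules = setmetatable({}, {
  __index = function(t, name)
    local mod = setmetatable({}, trigger)
    rawset(t, name, mod)
    names[mod] = name
    return mod
  end,
})

-- Two modules that need each other only when their functions run.
instantiators.even = function(even)
  local odd = modules.odd   -- the table exists, but is not filled in yet
  function even.is(n) if n == 0 then return true end return odd.is(n - 1) end
  return even
end
instantiators.odd = function(odd)
  local even = modules.even
  function odd.is(n) if n == 0 then return false end return even.is(n - 1) end
  return odd
end

local ten_is_even = modules.even.is(10)
local ten_is_odd = modules.odd.is(10)
```

Neither instantiator touches a member of the other module while it is being installed, so the cycle never recurses; the second table is filled in on its first call.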

4. Multiple Lua modules in a single C library.

Some of the base library .c files register more than one Lua module; a couple of them also register globals. In order to get the code presented below to work with this, it will be necessary to at least create separate luaopen_* functions for the different Lua modules. I don't think that is a huge change; the remaining changes to luaL_openlib() actually simplify it.

[CODE SAMPLE 1 -- basic module loading system]
It was easier to write this in Lua, but it needs to be part of the base library in order to get proper integration of static and dynamic libraries. Rewriting it in C would be tedious and arguably unnecessary; it is not going to be of much use in a Lua compiled without at least the byte-code loader, so it can be compiled into the lua executable as bytecode.


-- make a weak keyed table
local function Weak() return setmetatable({}, {__mode = "k"}) end

-- maps module tables to instantiator functions
local MODULE_MAKER = Weak()
-- maps module tables to their names
local MODULE_NAME = Weak()

-- Find the module. The precise definition of find_in_path might vary,
-- depending on OS interfaces, but a simple implementation would be
-- to try all the possibilities one at a time.

local function find_module(name)
  local filename = find_in_path(name, LUA_PATH)
  if filename then
    return assert(assert(loadfile(filename))())
  end
  filename = find_in_path(name, LUA_DYNPATH)
  if filename then return assert(loadlib(filename, "luaopen_" .. name)) end
  error("Could not find module '"..name.."'")
end

-- Unloaded modules lazily instantiate themselves; they are created with this
-- metatable.
-- require() forces the instantiation by referencing a module member.
-- I left out the code to verify that the module does not require itself.
local mod_meta = {}
function mod_meta:__index(key)
  local instantiator, name = MODULE_MAKER[self], MODULE_NAME[self]
  if instantiator == nil then
    -- We don't have an instantiator
    instantiator = assert(find_module(name))
    MODULE_MAKER[self] = instantiator
  end
  -- disable the trigger. This should actually change the trigger so that
  -- an error is signalled if the module requires itself, and then set it
  -- to nil after the instantiation
  setmetatable(self, nil)
  -- should do more error checking
  instantiator(self, name)
  return self[key]  -- the trigger has been turned off so rawget is not necessary.
end

-- The _MODULE table uses the following metamethod to automatically create a new table
-- for so-far-unreferenced modules. The unloaded modules are given the metatable above.

local use_meta = {}
function use_meta:__index(name)
  local mod = setmetatable({}, mod_meta)
  self[name], MODULE_NAME[mod] = mod, name
  return mod
end

_MODULE = setmetatable({}, use_meta)

function require(name)
  local mod = _MODULE[name]  -- get the module table
  local _ = mod._VERSION  -- trigger the load if necessary
  -- it doesn't matter if mod._VERSION doesn't exist. But it should :)
  return mod
end

-- Finally, we need a way to register static modules (whether foreign or Lua):

function register_instantiator(name, instantiator)
  local mod = _MODULE[name]
  MODULE_MAKER[mod] = instantiator
end

-- A C stub for registering static libraries on startup:

LUALIB_API void luaL_registerlib(lua_State *L, const char *name, lua_CFunction instantiator) {
  lua_getglobal(L, "register_instantiator");  /* should check that this worked */
  lua_pushstring(L, name);
  lua_pushcfunction(L, instantiator);
  lua_pcall(L, 2, 0, 0);  /* no error handler; should do something on error here */
}

luaL_registerlib(L, "string", luaopen_string);
luaL_registerlib(L, "table", luaopen_table);
/* ... */

[CODE SAMPLE 2 ---- partial C code for Complex example]
#include <math.h>
#include <lua.h>
#include <lauxlib.h>

typedef struct Complex {lua_Number r, i;} Complex;

static int pushComplex(lua_State *L, lua_Number r, lua_Number i) {
  Complex *rv = lua_newuserdata(L, sizeof(Complex));
  luaL_getmetatable(L, "Complex");
  lua_setmetatable(L, -2);  /* the userdata sits just below the metatable */
  rv->r = r;
  rv->i = i;
  return 1;
}

static Complex *checkComplex(lua_State *L, int narg) {
  Complex *rv = luaL_checkudata(L, narg, "Complex");
  if (rv == NULL) luaL_typerror(L, narg, "Complex");
  return rv;
}

static int complex_new(lua_State *L) {
  lua_Number r = luaL_checknumber(L, 1);
  lua_Number i = luaL_checknumber(L, 2);
  return pushComplex(L, r, i);
}

static int complex_add(lua_State *L) {
  Complex *a = checkComplex(L, 1);
  Complex *b = checkComplex(L, 2);
  return pushComplex(L, a->r + b->r, a->i + b->i);
}

static int complex_abs(lua_State *L) {
  Complex *a = checkComplex(L, 1);
  lua_pushnumber(L, sqrt(a->r * a->r + a->i * a->i));
  return 1;
}

Re: More about packaging (fwd)

RLak-2

A couple of quick notes, since I think my previous post was too long :)

1) Inserting modules into the global namespace is just bad practice. However, for lazy coders there is no problem setting a metamethod on the global table which will trigger the require -- the only issue is figuring out which globals might be module names. Alternatively, you could define:

function import(name)  _G[name] = require(name) end

which will work fine if the project doesn't change environment tables, and will satisfy the lazy programmer market, I hope.
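
The metamethod variant could look roughly like the sketch below. It assumes the set of auto-loadable names is known in advance (KNOWN_MODULES is invented for the example), and it attaches the trigger to a separate table instead of the real global table so that the example has no side effects; setting the same metatable on _G would work the same way.

```lua
-- Assumption: only these names may be auto-required.
local KNOWN_MODULES = { string = true, table = true, math = true }

-- Hypothetical helper: build an environment that requires modules on first use.
local function make_lazy_env()
  return setmetatable({}, {
    __index = function(env, key)
      if KNOWN_MODULES[key] then
        local mod = require(key)
        rawset(env, key, mod)   -- cache it, so the metamethod fires once per name
        return mod
      end
      return nil                -- any other undefined name stays nil
    end,
  })
end

local env = make_lazy_env()
local upper = env.string.upper("lazy")   -- first access triggers require "string"
```

The hard part remains the one named above: deciding which undefined globals are module names. Here that decision is simply a fixed list.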

2) In my opinion, the standard name for a library module ought to define a published interface description, not a particular implementation of the interface. Having a system which allows me to have multiple implementations (or multiple versions of a single implementation) simultaneously makes regression testing much simpler. If a module clobbered the globals table on instantiation, this would be much harder to do.

There is another pending discussion on environment tables, but I won't get into that right now.

3) The sample code I provided allows me to use the same .c code for a static or a dynamic library with no modifications whatsoever, but it does have the weakness that the name of the module must correspond to the name of the entry point. After trying a few alternatives, I decided I could live with this. I use a script to build a makefile for the lib subdirectory; the script creates a little stub which registers (but does not load) the static libraries it encounters there. I personally prefer to have my Lua executables built with static libraries for production use, so it is convenient for me to be able to just move the source file into the Lua build directory and do a make.

R.