changes in 'require'

82 messages

changes in 'require'

Roberto Ierusalimschy
We are trying two changes in 'require' for Lua 5.1:

1) Following Diego's suggestion, require "a.b" would first require "a".
Unlike his suggestion, however, failures in that require are not silently
ignored. (You can simply add an empty "init.lua" file into directory
"a" to satisfy that require.)

2) The Lua loader is called before the C loader. The motivation is that
Lua is more dynamic than C, and so it is easier to change/correct Lua
modules and they need precedence for those changes/corrections to have
effect.
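
A rough sketch of change (1), written as a plain Lua wrapper rather than as the actual change inside require. The name require_with_parents and its optional req parameter are made up for illustration; any error raised while requiring a parent propagates instead of being ignored.

```lua
-- Hypothetical helper: requires every parent of a dotted module name
-- before requiring the module itself. 'req' defaults to the stock require.
local function require_with_parents(name, req)
  req = req or require
  local pos = 0
  while true do
    pos = name:find(".", pos + 1, true)  -- plain-text search for the next dot
    if not pos then break end
    req(name:sub(1, pos - 1))            -- errors here are not swallowed
  end
  return req(name)
end

-- Usage: record the order in which modules are requested.
local order = {}
local function tracer(modname)
  order[#order + 1] = modname
  return true
end
require_with_parents("a.b.c", tracer)
print(table.concat(order, " "))  --> a a.b a.b.c
```

With the stock require as 'req', a package without a parent module would fail at the first step, which is the case the empty "init.lua" is meant to cover.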

-- Roberto


Re: changes in 'require'

Tomas-14
	Hi Roberto,

> 1) Following Diego's suggestion, require "a.b" would first require "a".
> Unlike his suggestion, however, failures in that require are not silently
> ignored. (You can simply add an empty "init.lua" file into directory
> "a" to satisfy that require.)
	Then, if a package doesn't have module "a", it will fail to
load.  A real case is LuaSQL: it is installed as a unique dynamic library,
inside directory luasql (in "cpath").  Now it will have another file
which will be installed in another directory tree ("path").  In a Unix box:

/usr/local/lib/lua/5.1/luasql/driver.so
/usr/local/share/lua/5.1/luasql.lua (or luasql/init.lua)

	Am I correct?

> 2) The Lua loader is called before the C loader. The motivation is that
> Lua is more dynamic than C, and so it is easier to change/correct Lua
> modules and they need precedence for those changes/corrections to have
> effect.
	And there will be a way to force the load of the C module?
I mean, inside a.lua, how could I load a.so?  I know that today this is
not possible (LuaSocket does a trick to achieve that), but do you plan
to provide that?
		Tomas


Re: changes in 'require'

Roberto Ierusalimschy
>  	Then, if a package doesn't have module "a", it will fail to
> load.  A real case is LuaSQL: it is installed as a unique dynamic library,
> inside directory luasql (in "cpath").  Now it will have another file
> which will be installed in another directory tree ("path").  In a Unix box:
> 
> /usr/local/lib/lua/5.1/luasql/driver.so
> /usr/local/share/lua/5.1/luasql.lua (or luasql/init.lua)
> 
>  	Am I correct?

Not quite. In our view that complete separation of paths between C
modules and Lua modules does not make sense when a single package uses
both. (After all, one of the main ideas of packages is to keep all its
files together.)  So we also changed the default Lua path (forgot to
mention ;) to include the C path (with proper extensions). All you
have to do to correct luasql is to add a file "init.lua" in directory
/usr/local/lib/lua/5.1/luasql/.
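
To make the effect of that default concrete, here is a sketch of how a Lua path is expanded into candidate file names. The helper name candidates and the two lib templates are assumptions for illustration, not the actual defaults.

```lua
-- Hypothetical helper: lists the files a path would try for a module name.
local function candidates(modname, path)
  local fname = modname:gsub("%.", "/")  -- assumes "/" as directory separator
  local out = {}
  for template in path:gmatch("[^;]+") do
    -- A function replacement avoids "%" escapes in the substituted name.
    out[#out + 1] = (template:gsub("%?", function() return fname end))
  end
  return out
end

-- Assumed path: the usual share entry plus two lib entries.
local path = "/usr/local/share/lua/5.1/?.lua;" ..
             "/usr/local/lib/lua/5.1/?.lua;" ..
             "/usr/local/lib/lua/5.1/?/init.lua"
for _, file in ipairs(candidates("luasql", path)) do
  print(file)
end
```

The last candidate printed is /usr/local/lib/lua/5.1/luasql/init.lua, which is where the suggested "init.lua" would sit next to the driver.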


> And there will be a way to force the load of the C module?

You can call package.loadlib (new name for loadlib) directly (or
you can call the C loader).

-- Roberto


Re: changes in 'require'

Tomas-14
	Hi Roberto,

> Not quite. In our view that complete separation of paths between C
> modules and Lua modules does not make sense when a single package uses
> both. (After all, one of the main ideas of packages is to keep all its
> files together.)  So we also changed the default Lua path (forgot to
> mention ;) to include the C path (with proper extensions). All you
> have to do to correct luasql is to add a file "init.lua" in directory
> /usr/local/lib/lua/5.1/luasql/.
	Ok.  It means that .lua files should now be installed
in ../lib/.. ?

>> And there will be a way to force the load of the C module?
>
> You can call package.loadlib (new name for loadlib) directly (or
> you can call the C loader).
	The C loader will have a name, like package.C_loader?

	Tomas


Re: changes in 'require'

Roberto Ierusalimschy
> It means that .lua files should now be installed in ../lib/.. ?

No. It means that .lua files *can* now be installed in ../lib/..


> The C loader will have a name, like package.C_loader?

I don't think so. Those uses outside require should be kept to a
minimum, mainly for dirty tricks. Healthy packages should not use it.
(Probably it won't even be documented.)

-- Roberto


Re: changes in 'require'

Diego Nehab-3
In reply to this post by Roberto Ierusalimschy
Hi,

> 1) Following Diego's suggestion, require "a.b" would first require "a".
> Unlike his suggestion, however, failures in that require are not silently
> ignored. (You can simply add an empty "init.lua" file into directory
> "a" to satisfy that require.)

Any reason for this?

[]s,
Diego.


Re: changes in 'require'

Diego Nehab-3
In reply to this post by Roberto Ierusalimschy
Hi,

The separation of binaries into lib and scripts into share is
in accordance with the Filesystem Hierarchy Standard, I hope. This means that
files in lib can be architecture dependent (but don't have to be), whereas
those in share have to be independent.

I would suggest Lua scripts be placed in the lib tree only if they are
precompiled. This would make things more "standard" and obvious.

I am waiting for a reply to my last post before I elaborate on the
implications of having split libraries.

[]s,
Diego.


Re: changes in 'require'

Mark Hamburg-4
I have a third suggestion for a change to require:

Don't load the names into the global namespace by default. Modules can do
this themselves if they insist, but they generally shouldn't need to.

Argument in favor: It avoids code fragility from a missing require being
hidden by other modules happening to do the require.

Argument against: It complicates the command line environment.

I would be inclined to resolve the argument against by providing an easy way
to install an __index metamethod on the command-line globals environment
that automatically does the require and populates the globals.
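
A minimal sketch of such an __index hook, assuming it is acceptable to treat every unknown name as a candidate module. The function name install_autorequire is hypothetical; an interactive shell would pass its globals table, while the demo below uses a separate table to keep the effect contained.

```lua
-- Hypothetical helper: makes lookups of missing names in 'env' try
-- require(name) and cache the result in 'env' on success.
local function install_autorequire(env)
  return setmetatable(env, {
    __index = function(t, name)
      local ok, mod = pcall(require, name)
      if not ok then return nil end  -- unknown names stay nil
      rawset(t, name, mod)           -- populate so the hook runs only once
      return mod
    end,
  })
end

-- Demo module registered through package.preload so no file is needed.
package.preload["autodemo"] = function()
  return { answer = 42 }
end

local env = install_autorequire({})
print(env.autodemo.answer)  --> 42
```

One cost of this design is that every misspelled global triggers a full module search before yielding nil.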

Mark



Re: changes in 'require'

D Burgess-4
In reply to this post by Diego Nehab-3
1) Lua modules before C.
I think this is a retrograde step. The argument that Lua should be first
because "it is more flexible" is fallacious. The argument could also be
that it makes a package installation more easily corrupted. We have
been using different versions of require for over a year now and we find
the C before Lua protocol to work rather well. This change is also
rather disruptive; we have got used to the way it searches and
find it convenient.

2) Search path. Am I to understand that the meaning of also using the
"C search path" means that you intend to search the %PATH% or do you
mean package.cpath? Extending the number of places to search comes at
a price. I have carefully measured the timings of the existing
search strategies on two platforms and the overhead is minimal. It
would be nice to keep it this way. (The final downside to this is
that I will need to repeat the performance measurements).

3) There seems to be an assumption that require will only be used
on systems with hierarchical file systems. There are three flat world
environments that I use. I trust that the current restructuring will
take into account flat filing systems.

4) My specific point of interest is embedding Lua code inside Win32
DLLs as resources. We implement this by populating preload with
lazy initialization functions. e.g.

  preloadfnc()
    load the resource()
    execute the resource
  end

The ONLY problem that we have had with work6 require is the question
of how to boot these preloads? This, as I read it, was Diego's original
question. An external Lua script (like init.lua) defeats the original aim
of loading the lua code from DLLs. Adding a resource loader to the
loader chain is problematic because it is application-wide: it impacts
all packages not just those that are resource loaded.
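
The preload scheme in (4) might look roughly like this in Lua. The resources table and load_resource are hypothetical stand-ins for reading Lua source out of a DLL resource; the sketch does not address how the preloads themselves get booted, which is the open question.

```lua
-- Hypothetical stand-in for reading Lua source embedded as a DLL resource.
local resources = {
  ["app.util"] = "return { greet = function() return 'hello' end }",
}
local function load_resource(name)
  return resources[name]
end

-- Lua 5.1 compiles strings with loadstring; later versions use load.
local compile = loadstring or load

-- Register one lazy initializer per resource; nothing is compiled
-- until the module is first required.
for name in pairs(resources) do
  package.preload[name] = function(modname)
    local chunk = assert(compile(load_resource(modname), "=" .. modname))
    return chunk(modname)
  end
end

local util = require("app.util")
print(util.greet())  --> hello
```

Because package.preload is consulted per module name, this affects only the names registered there, unlike an extra loader in the chain.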

regards
David Burgess



Re: changes in 'require'

Tomas-14
> 1) Lua modules before C.
> I think this is a retrograde step. The argument that Lua should be first
> because "it is more flexible" is fallacious. The argument could also be
> that it makes a package installation more easily corrupted. We have
> been using different versions of require for over a year now and we find
> the C before Lua protocol to work rather well. This change is also
> rather disruptive; we have got used to the way it searches and
> find it convenient.
	I think it doesn't matter.  There shouldn't be two files
with the same name and different extensions, right?

	Tomas


Re: changes in 'require'

Roberto Ierusalimschy
In reply to this post by Diego Nehab-3
> Any reason for this? [require failing on parent modules]

We think it is simpler and more reliable. And we don't see good
arguments against that.


> The separation of binaries into lib and scripts into share is
> in accordance with the Filesystem Hierarchy Standard, I hope.

It is, but as you said, files in lib don't need to be architecture
dependent. It seems much simpler to install a package with all its
modules together under the same directory. (Python, for instance, puts
all its libraries in /usr/lib. Perl too. Tcl too. Your own suggestion
for installing Luasocket puts .so and .lua under the same directory.)


> 1) Lua modules before C.
> I think this is a retrograde step. [...]

That may be true. We want more input on that topic. About the change
being disruptive, I thought that conflicting C and Lua modules should
be the exception rather than the norm (as Tomas pointed out).


> 2) Search path. Am I to understand that the meaning of also using the
> "C search path" means that you intend to search the %PATH% or do you
> mean package.cpath?

It only means that the default path for Lua modules will include
"/usr/local/lib/...". It's a change only in the defaults.


> 3) There seems to be an assumption that require will only be used
> on systems with hierarchical file systems. There are three flat world
> environments that I use. I trust that the current restructuring will
> take into account flat filing systems.

As far as I can see the current changes (I wouldn't call that a
"restructuring") are "orthogonal" to hierarchical file systems.  (Except
for the default paths, which of course assume a hierarchical file
system).


> The ONLY problem that we have had with work6 require is the question
> of how to boot these preloads? This, as I read it, was Diego's original
> question. An external Lua script (like init.lua) defeats the original aim
> of loading the lua code from DLLs.

There is no need for an external Lua script. The whole point of that
kind of DLL is to have a "parent" module (at least to preload its
descendants). The point about adding an init.lua is for packages that
naturally do not have a parent module.


> I have a third suggestion for a change to require:
> 
> Don't load the names into the global namespace by default.

Require has never done that. Modules do.

-- Roberto



Re: changes in 'require'

Mike Pall-56
In reply to this post by Roberto Ierusalimschy
Hi,

here are my 2 cents worth (Euro cents, mind you):

* require "a.b" forces require "a":

I'm +1 on this change. It makes my life easier, too.
And I'm +1 on fail-if-parent-is-not-there, too.

* Lua loader before C loader:

I'm mostly agnostic about the loader order. But I'm only +0 for the
'easier to change' argument (problem: you really can't move
C modules around because of the luaopen_<name>() conventions --
but see below).

Since module loading usually happens at startup, this is not a
performance issue. If anyone is loading tons of modules at runtime,
he/she can easily add a specific loader in front of the others.

* Adding .../lib/... to the Lua module search path:

Placing non-architecture specific modules into architecture-specific
paths is against established standards. I guess distribution
maintainers will rip this out right away. This makes it kind of
a moot exercise.

The keep-packages-together vs. split-packages-across-the-filesystem
debate is old. There is a pretty solid precedent in current POSIX
systems for the latter.

Mixed-architecture systems were out of fashion for a long time,
but have gained momentum recently with mixed x86/x64 systems
(this is in fact the common setup for Linux x64 systems):

$ file /usr/lib*/libpng.so.3.1.2.5
/usr/lib/libpng.so.3.1.2.5:   ELF 32-bit LSB shared object, Intel 80386
/usr/lib64/libpng.so.3.1.2.5: ELF 64-bit LSB shared object, AMD x86-64

Yes, this means a Lua x64 binary must be built with the C module
search path changed to '...;/usr/local/lib64/lua/5.1/?.so'
(in the distribution-specific package script for Lua).

Oh, and about the Perl/Python precedent: setting up either package
for a mixed-architecture system is a _real_ pain. Please don't copy
their approach or encourage module authors to do so.

* Length of the module search paths:

I'm all in favour of keeping the search paths as short as possible.
Try "strace python -c 'pass'" and you know what I mean.

This is another argument against adding .../lib/... to the Lua module
search path but also an argument against the 51w6 change to add
another two instances of 'l?.so' (note the 'l') to the C module
search path.

This is really useless and a holdover from the time when C modules
were not stored in extra directories. It adds ambiguities and this
change breaks LuaSocket under 51w6, too ("lsocket" and "socket"
refer to the same module).

I know why this was added, because lhf told me on IRC. But I think
there are less intrusive measures to solve that problem (like
changing the Makefiles of his modules ;-) ).

Please revert the paths to the 51w5 setting. Simple is better.

* Wrapping a C module with a Lua module (require() name conflict):

There is only one namespace for require() and C and Lua modules
must share them. Lua wrappers need to load C modules under a
different name. Workarounds like calling package.loadlib() directly
or calling the C loader directly are ugly and should be avoided.

For one this means there should be a naming convention for this
case, to avoid everyone reinventing the wheel. I have been pushing
for an underscore prefix for C modules that should not be loaded
directly (mirroring the Python convention).

I.e. require("socket") loads socket.lua and this in turn uses
require("_socket") to load socket.so. I think the underscore makes
it clear that this module is 'internal' and applications should not
load it directly (I know Diego doesn't like this convention, but
LuaSocket was the easiest example :-) ).

But as I said above, it's not possible to add a Lua wrapper on top
of a C module when the latter has not been designed for that. The
init function must be renamed (i.e. it must be 'luaopen__socket()'
for _socket.so).

I think a convenient solution would be to strip an underscore prefix
from the module name before generating the init function name.
This way it's easy to move (say) foo.so to _foo.so and load it with
require("_foo") from foo.lua. Here is the trivial change to loader_C():

  funcname = luaL_gsub(L, name, ".", LUA_OFSEP);
+ if (funcname[0] == '_') funcname++;
  funcname = lua_pushfstring(L, "%s%s", POF, funcname);
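
The same naming rule, restated in Lua for experimentation. It assumes the luaopen_ prefix and "_" as the separator that replaces dots; init_funcname is a made-up name, not part of any API.

```lua
-- Hypothetical helper mirroring the patched loader_C() name generation:
-- dots become "_", one leading underscore is dropped, then the prefix
-- "luaopen_" is added.
local function init_funcname(modname)
  local funcname = modname:gsub("%.", "_")  -- keeps only gsub's first result
  if funcname:sub(1, 1) == "_" then
    funcname = funcname:sub(2)
  end
  return "luaopen_" .. funcname
end

print(init_funcname("socket"))   --> luaopen_socket
print(init_funcname("_socket"))  --> luaopen_socket
```

Both names map to the same init function, so socket.so could be renamed to _socket.so without touching its C source.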

* Windows module search path issues:

We have seen a number of complaints about the fixed base directory
('C:\Program Files\Lua51') for the module search paths under Windows
in Lua 5.1.

I think no amount of changing this to a 'better' (but still fixed)
path will solve this. It's simply against the conventions to encode
a fixed path in a Windows binary. And of course Windows users are the
least likely to patch C sources and recompile. Even developers apt
enough to build their own C modules commonly ask for prebuilt Lua
binaries.

[Umm, and no -- setting environment variables doesn't cut it either
 on Windows.]

I propose to fix this once and for all and follow the suggestion of
David Burgess to get the path dynamically. This should work roughly
like this:

- Change LUA_ROOT to "!" inside the _WIN32 part at the top of luaconf.h.
- Add a function that uses GetModuleFileName() and luaL_gsub()
  to replace all "!"'s in a string with the dynamic path (with the
  executable name stripped). Put this inside the LUA_DL_DLL #ifdef.
- Apply this function to *both* the C and the Lua module paths
  (default paths or from environment).

A Lua binary stored in (say) 'D:\FooBar\lua51.exe' gets
  package.path ->  '?.lua;D:\FooBar\lua\?.lua;D:\FooBar\lua\?\init.lua'
  package.cpath -> '?.dll;D:\FooBar\dll\?.dll'

This should be a 10 line change and it only adds code for the Windows
case. I can create a patch, if needed.
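
The substitution step can be prototyped in Lua before writing the C version. This is only a sketch: expand_exe_dir is a hypothetical name, the executable path is passed in as a string instead of coming from GetModuleFileName(), and the templates are inferred from the example above.

```lua
-- Hypothetical helper: replaces every "!" in a path template with the
-- directory of the executable (file name stripped).
local function expand_exe_dir(path, exe_path)
  -- Everything before the last "/" or "\"; "." if there is no separator.
  local dir = exe_path:match("^(.*)[/\\][^/\\]*$") or "."
  -- A function replacement avoids "%" escapes in the directory name.
  return (path:gsub("!", function() return dir end))
end

local exe = "D:\\FooBar\\lua51.exe"
print(expand_exe_dir("?.lua;!\\lua\\?.lua;!\\lua\\?\\init.lua", exe))
print(expand_exe_dir("?.dll;!\\dll\\?.dll", exe))
```

For the executable above, the two lines printed match the package.path and package.cpath values listed in the example.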

Bye,
     Mike


RE: changes in 'require'

Danilo Tuler-2
Hi,

> We have seen a number of complaints about the fixed base 
> directory ('C:\Program Files\Lua51') for the module search 
> paths under Windows in Lua 5.1.

Now I found a previous message regarding this.
http://lua-users.org/lists/lua-l/2005-02/msg00374.html
Sorry for reraising the subject in another thread.

> I propose to fix this once and for all and follow the 
> suggestion of David Burgess to get the path dynamically.

I am happy with this solution.

Regards,
Danilo



Re: changes in 'require'

Diego Nehab-3
In reply to this post by Roberto Ierusalimschy
Hi,

>> Any reason for this? [require failing on parent modules]
>
> We think it is simpler and more reliable. And we don't see good
> arguments against that.

I am assuming that every implicit call to require will behave *exactly*
like the explicit call, trying all loaders too. I.e., given the
structure below,

    lib/a.so
    share/a/b.lua
    lib/a/b/c.so
    share/a/b/c/d.lua

a call to require"a.b.c.d" would execute all these files, in order.

>> The separation of binaries into lib and scripts into share is
>> in accordance with the Filesystem Hierarchy Standard, I hope.
>
> It is, but as you said, files in lib don't need to be architecture
> dependent. It seems much simpler to install a package with all its
> modules together under the same directory. (Python, for instance, puts
> all its libraries in /usr/lib. Perl too. Tcl too. Your own suggestion
> for installing Luasocket puts .so and .lua under the same directory.)

That is because I was unaware we had two directories. I will fix that
in the next release.

>> 1) Lua modules before C.
>> I think this is a retrograde step. [...]
>
> That may be true. We want more input on that topic. About the change
> being disruptive, I thought that conflicting C and Lua modules should
> be the exception rather than the norm (as Tomas pointed out).

Maybe it is the price for an extra search that bothers David. Any
functionality implemented in C will be loaded only after a search for a
Lua script failed. It used to be the other way around.

>> I have a third suggestion for a change to require:
>>
>> Don't load the names into the global namespace by default.
>
> Require has never done that. Modules do.

Require is used, by default, to load modules, which call module() by
default, which sets the global namespace by default, right?  By
transitivity of the "default" operator... :)

This default behavior is to ensure users can use require in two ways

    local a = require"a"
    a.b()

and

    require"a"
    a.b()

and it doesn't bother me too much. Is it easy to override (besides
rewriting module())?

[]s,
Diego.



Re: changes in 'require'

Mark Hamburg-4
In reply to this post by Roberto Ierusalimschy
on 7/8/05 6:24 AM, Roberto Ierusalimschy at [hidden email] wrote:

>> I have a third suggestion for a change to require:
>> 
>> Don't load the names into the global namespace by default.
> 
> Require has never done that. Modules do.

My bad. I just remembered reading pieces of the package system and seeing it
do so. I must have been reading the module code. I would argue that modules
shouldn't do that either for the same reasons.

Mark



Re: changes in 'require'

Mike Pall-56
In reply to this post by Diego Nehab-3
Hi,

Diego Nehab wrote:
> This default behavior is to ensure users can use require in two ways
> 
>     local a = require"a"
>     a.b()
> 
> and
> 
>     require"a"
>     a.b()
> 
> and it doesn't bother me too much. Is it easy to override (besides
> rewriting module())?

There is no need to use module() at all, if you don't want to spill
into the global namespace.

Replace:
  module(...)
  local function internal() ... end
  function foo() ... internal() ... end   -- module-internal call (good)
  function bar() ... foo() ... end        -- intra-module call (avoid)

with:
  local _M = {}
  local function internal() ... end
  function _M.foo() ... internal() ... end
  function _M.bar() ... _M.foo() ... end
  return _M

In fact it's more efficient, because there is no indirection for the
globals table (modified by module()). The little additional syntax
is quite useful, because you can easily spot exported functions.

Note that the intra-module calls to exported functions (_M.foo())
are rare in practice. If you have one of these, it's a good idea
to refactor it to use a common module-internal (local) function
(it's faster, too).

For C modules, replace:
  ... push upvalues ...
  luaL_openlib(L, "modulename", funcs, nupvalues);
  return 1;  /* returns the table created and filled by luaL_openlib */

with:
  lua_newtable(L);
  ... push upvalues ...
  luaL_openlib(L, NULL, funcs, nupvalues);
  return 1;  /* returns the created table that is still on the stack */

So it's up to the module author to decide which way his/her module
can be loaded. I'm a proponent of the 'clean namespace' camp,
so you can guess what I would recommend.

Bye,
     Mike


Re: changes in 'require'

Roberto Ierusalimschy
In reply to this post by Diego Nehab-3
> I am assuming that every implicit call to require will behave *exactly*
> like the explicit call, trying all loaders too.

Sure. That is the idea behind my argument of "simplicity".


> Maybe it is the price for an extra search that bothers David. 

Does anyone know how much that price is? As far as I can guess it seems
to be much smaller than the price of loading the module itself (assuming
a typical .lua module).


> Is it easy to override (besides rewriting module())?

It is easy not to use "module". Just put all your stuff inside a table
and return it. (Anyway, I strongly favor the definition of the global
name.)

-- Roberto



Re: changes in 'require'

Roberto Ierusalimschy
In reply to this post by Mike Pall-56
> * Windows module search path issues:
> [...]
> I propose to fix this once and for all and follow the suggestion of
> David Burgess to get the path dynamically.
> [...]
> This should be a 10 line change and it only adds code for the Windows
> case. I can create a patch, if needed.

Instead of a patch, I would appreciate some clarification :) Should we
use NULL as the hModule argument?  Should we use "PathRemoveFileSpec" to
strip the executable name? Is there a "reasonable" maximum size for the
file name?


> I think a convenient solution would be to strip an underscore prefix
> from the module name before generating the init function name.

An alternative would be to use only the sub-module name, without the
parents. Instead of renaming "foo.so" to "_foo.so", rename it to
"p/foo.so". It seems to be a good idea to hide a private module outside
the main directory, and we simplify the generation of the funcname (no
more LUA_OFSEP). (On the other hand, that could create name conflicts
for homonymous sub-modules from different packages...)

-- Roberto



Re: changes in 'require'

Diego Nehab-3
In reply to this post by Mike Pall-56
Hi,

> I.e. require("socket") loads socket.lua and this in turn uses
> require("_socket") to load socket.so. I think the underscore makes
> it clear that this module is 'internal' and applications should not
> load it directly (I know Diego doesn't like this convention, but
> LuaSocket was the easiest example :-) ).

There you go... if I had listened to you, I wouldn't have the 'l'
problem I have now! :)

What I am currently using in my development version is a core.so, inside
the socket directory:

    lib/socket/core.so
    share/socket/init.lua

A call to require"socket" finds init.lua. This in turn calls
require"socket.core". According to the new scheme, there will be an
implicit, recursive, second call to require"socket".  Since require()
checks for recursion, this won't be a problem.

What is not possible to do is to load just the core, not the sugar
that init.lua adds to it. Under the new scheme, init.lua will *always*
be called.

If the implicit calls to require() could be identified by the modules
being loaded, on the other hand, init.lua could refrain from loading the
sugars if they had not been required explicitly. This is my last bid for
a way to tell them apart.

> I think a convenient solution would be to strip an underscore prefix
> from the module name before generating the init function name.
> This way it's easy to move (say) foo.so to _foo.so and load it with
> require("_foo") from foo.lua.

I like this. This solves an entirely different problem and I have no
objections to the _ prefix in this case.

[]s,
Diego.


Re: changes in 'require'

Diego Nehab-3
In reply to this post by Mike Pall-56

> There is no need to use module() at all, if you don't want to spill
> into the global namespace.

Haha. That's not what I meant, of course. But since Roberto also thought
the same, it's obviously my fault.

I am assuming module writers *will* call module(). I was asking if there
is a way to avoid the global namespace pollution without changing every
package in your system not to use module, and without rewriting the
module function either.

I prefer the clean namespace, but by 7.42% only.

[]s,
Diego.
