dependencies between C modules, RTLD_GLOBAL

Josh Haberman
Short question: what is the rationale for Lua loading C
extensions as RTLD_LOCAL by default?

Long story: I have several C extensions with dependencies
between them.

         +---+
     +-->| A |<--+
     |   +---+   |
     |           |
   +---+       +---+
   | B |       | C |
   +---+       +---+

These dependencies are C-level dependencies; i.e. A contains
a C function foo() that B and C call.

Now I could tell the linker about these dependencies when
I compile these modules.  That would make the OS aware of
these dependencies, and make it automatically load A when
B or C are loaded.

But this approach has a couple downsides:

1. Users now need to set both LUA_CPATH *and* LD_LIBRARY_PATH
   correctly.  Kind of a big pain, especially since the two
   have different failure modes so it's not immediately
   obvious which one you got wrong.  Also, the LD_LIBRARY_PATH
   part is inherently OS-specific, and is different on OS X.

2. If the user wants to install your library, they will expect
   to *not* have to set LD_LIBRARY_PATH, so you have to install
   into /usr/lib or /usr/local/lib, and give it an soname that
   won't conflict with anything else on the system.  But this
   is a little weird, because the library isn't intended to
   be used from C, it's *only* intended to be loaded by Lua.

What I wish I could do is guarantee, from Lua, that A is always
loaded before B or C.  This is easy to do with some trivial
wrappers:

-- b.lua
require("a")
return require("b_cext")

-- c.lua
require("a")
return require("c_cext")

This would be a really nice solution that solves both of the
above problems.  But unfortunately it is not possible because
require() loads C libraries with RTLD_LOCAL.  That means that
the symbols loaded in A are not available to be linked against
from B or C.

Any thoughts or workarounds?

Thanks,
Josh

Re: dependencies between C modules, RTLD_GLOBAL

Luiz Henrique de Figueiredo
> But unfortunately it is not possible because require() loads C libraries
> with RTLD_LOCAL.

Since Lua 5.2, package.loadlib accepts an optional argument that allows
you to use RTLD_GLOBAL. See
        http://www.lua.org/manual/5.3/manual.html#pdf-package.loadlib

So the answer is to use package.loadlib directly.

Re: dependencies between C modules, RTLD_GLOBAL

Josh Haberman
In reply to this post by Josh Haberman
Wow, my original message got mangled, sorry about that. Such a shame, I did nice ASCII art and everything. :)

I'm going to try once more and see if my original formatting is preserved.

--

Short question: what is the rationale for Lua loading C
extensions as RTLD_LOCAL by default?

Long story: I have several C extensions with dependencies
between them.

         +---+
     +-->| A |<--+
     |   +---+   |
     |           |
   +---+       +---+
   | B |       | C |
   +---+       +---+

These dependencies are C-level dependencies; i.e. A contains
a C function foo() that B and C call.

Now I could tell the linker about these dependencies when
I compile these modules.  That would make the OS aware of
these dependencies, and make it automatically load A when
B or C are loaded.

But this approach has a couple downsides:

1. Users now need to set both LUA_CPATH *and* LD_LIBRARY_PATH
   correctly.  Kind of a big pain, especially since the two
   have different failure modes so it's not immediately
   obvious which one you got wrong.  Also, the LD_LIBRARY_PATH
   part is inherently OS-specific, and is different on OS X.

2. If the user wants to install your library, they will expect
   to *not* have to set LD_LIBRARY_PATH, so you have to install
   into /usr/lib or /usr/local/lib, and give it an soname that
   won't conflict with anything else on the system.  But this
   is a little weird, because the library isn't intended to
   be used from C, it's *only* intended to be loaded by Lua.

What I wish I could do is guarantee, from Lua, that A is always
loaded before B or C.  This is easy to do with some trivial
wrappers:

-- b.lua
require("a")
return require("b_cext")

-- c.lua
require("a")
return require("c_cext")

This would be a really nice solution that solves both of the
above problems.  But unfortunately it is not possible because
require() loads C libraries with RTLD_LOCAL.  That means that
the symbols loaded in A are not available to be linked against
from B or C.

Any thoughts or workarounds?

Thanks,
Josh


Re: dependencies between C modules, RTLD_GLOBAL

Josh Haberman
In reply to this post by Luiz Henrique de Figueiredo
On Tue, May 12, 2015 at 4:29 PM, Luiz Henrique de Figueiredo
<[hidden email]> wrote:
>
> Since Lua 5.2, package.loadlib accepts an optional argument that allows
> you to use RTLD_GLOBAL. See
>         http://www.lua.org/manual/5.3/manual.html#pdf-package.loadlib
>
> So the answer is to use package.loadlib directly.

Since package.loadlib bypasses require(), it seems like this approach
will involve re-implementing the require() logic? Stuff like checking
package.preload, splitting LUA_CPATH on ";", looking in each
directory, etc?

Re: dependencies between C modules, RTLD_GLOBAL

Jonathan Goble
On Tue, May 12, 2015 at 9:42 PM, Josh Haberman <[hidden email]> wrote:

> On Tue, May 12, 2015 at 4:29 PM, Luiz Henrique de Figueiredo
> <[hidden email]> wrote:
>>
>> Since Lua 5.2, package.loadlib accepts an optional argument that allows
>> you to use RTLD_GLOBAL. See
>>         http://www.lua.org/manual/5.3/manual.html#pdf-package.loadlib
>>
>> So the answer is to use package.loadlib directly.
>
> Since package.loadlib bypasses require(), it seems like this approach
> will involve re-implementing the require() logic? Stuff like checking
> package.preload, splitting LUA_CPATH on ";", looking in each
> directory, etc?

Yes, you'd have to check package.preload, but if not found there,
`package.searchpath(libname, package.cpath)` ought to return the
filename for passing to package.loadlib.

(Note that I'm basing this on a reading of the manual; I've never
actually had a need to do anything like this, so there may be
something here that I'm missing, but I don't think so.)
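
A rough sketch of that lookup in Lua, assuming Lua 5.2 or later (where
package.searchpath exists). The helper name locate_c_module is made up
for illustration and is not part of Lua:

```lua
-- Hypothetical helper: report where a C module would come from.
-- Returns "preload" plus the loader function if package.preload has an
-- entry, "file" plus the path if the module is found on package.cpath,
-- or nil plus an error message otherwise.
local function locate_c_module(name)
  local loader = package.preload[name]
  if loader ~= nil then
    return "preload", loader
  end

  -- package.searchpath returns nil plus a message listing every file
  -- it tried when nothing matches.
  local path, tried = package.searchpath(name, package.cpath)
  if path then
    return "file", path
  end
  return nil, "module '" .. name .. "' not found:" .. tostring(tried)
end
```

When the result is "file", the path is what would be handed to
package.loadlib(path, "*") to link the library with global visibility.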

Re: dependencies between C modules, RTLD_GLOBAL

William Ahern
In reply to this post by Josh Haberman
On Tue, May 12, 2015 at 06:28:22PM -0700, Josh Haberman wrote:
<snip>
> Short question: what is the rationale for Lua loading C
> extensions as RTLD_LOCAL by default?

Because the alternative is even worse. With RTLD_GLOBAL unrelated modules
could (and would) cause symbol conflicts, either because they didn't
properly define the scope of internal functions, or because libraries that
they depend on had conflicting symbols. Using a default of RTLD_LOCAL is the
safest, sanest approach. If you _know_ there won't be conflicts, you can
explicitly load modules using RTLD_GLOBAL.

See my specific comments inline. But, quickly, I would use package.preload
to solve this.

  local lib_A --> XXX: not sure if we need to anchor the reference
  package.preload["A"] = function ()
    local path = assert(package.searchpath("A", package.cpath))
    lib_A = assert(package.loadlib(path, "*")) --> load global
    local luaopen_A = assert(package.loadlib(path, "luaopen_A"))
    return assert(luaopen_A("A", path))
  end

  package.preload["B"] = function ()
    require"A"
    local path = assert(package.searchpath("B", package.cpath))
    local luaopen_B = assert(package.loadlib(path, "luaopen_B"))
    return assert(luaopen_B("B", path))
  end

  package.preload["C"] = function ()
    require"A"
    local path = assert(package.searchpath("C", package.cpath))
    local luaopen_C = assert(package.loadlib(path, "luaopen_C"))
    return assert(luaopen_C("C", path))
  end


> Long story: I have several C extensions with dependencies
> between them.
>
>          +---+
>      +-->| A |<--+
>      |   +---+   |
>      |           |
>    +---+       +---+
>    | B |       | C |
>    +---+       +---+
>
> These dependencies are C-level dependencies; i.e. A contains
> a C function foo() that B and C call.
>
> Now I could tell the linker about these dependencies when
> I compile these modules.  That would make the OS aware of
> these dependencies, and make it automatically load A when
> B or C are loaded.
>
> But this approach has a couple downsides:
>
> 1. Users now need to set both LUA_CPATH *and* LD_LIBRARY_PATH
>    correctly.  Kind of a big pain, especially since the two
>    have different failure modes so it's not immediately
>    obvious which one you got wrong.  Also, the LD_LIBRARY_PATH
>    part is inherently OS-specific, and is different on OS X.

Not if you use rpath. Both ELF and Mach-O support embedding a directory name
in the module for use when searching for the dependency. (For Mach-O you
also need to use install_name_tool(1) on the dependency.) Alternatively, just
also install the module in the normal library path, e.g. libA.so.

In any event, I'm not sure this strategy would work. The loader may end up
installing the module twice. The glibc ELF loader, IIRC, indexes libraries
by path name. If the path name is the same, a library will only get loaded
once, no matter how many times it's listed as a dependency or loaded with
dlopen. If the path is different it'll be re-loaded.

> 2. If the user wants to install your library, they will expect
>    to *not* have to set LD_LIBRARY_PATH, so you have to install
>    into /usr/lib or /usr/local/lib, and give it an soname that
>    won't conflict with anything else on the system.  But this
>    is a little weird, because the library isn't intended to
>    be used from C, it's *only* intended to be loaded by Lua.

Again, you would simply embed the directory path for A within modules B and
C using the linker's widely supported rpath option.
 
> What I wish I could do is guarantee, from Lua, that A is always
> loaded before B or C.  This is easy to do with some trivial
> wrappers:
>
> -- b.lua
> require("a")
> return require("b_cext")
>
> -- c.lua
> require("a")
> return require("c_cext")
>
> This would be a really nice solution that solves both of the
> above problems.  But unfortunately it is not possible because
> require() loads C libraries with RTLD_LOCAL.  That means that
> the symbols loaded in A are not available to be linked against
> from B or C.
>
> Any thoughts or workarounds?

The solution I gave above should implement precisely this. However, it
requires Lua 5.2+.

Lua 5.1 has no way to specify symbol visibility. Furthermore, Lua 5.1 doesn't
pass the path name to the loaded module the way Lua 5.2 does. But most
platforms (notably not AIX, but including OS X and every ELF-based platform
I've tested) support dladdr(3). The A module can use dladdr and dlopen to
upgrade its visibility instead of relying on loadlib(path, "*"). I use the
following trick in a Lua-Perl module at work to export the Perl XS API so
that Perl code can load other Perl modules.

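  #define _GNU_SOURCE /* exposes dladdr on glibc */
  #include <dlfcn.h>
  #include <lua.h>
  #include <lauxlib.h>

  int luaopen_A(lua_State *L) {
    Dl_info info;

    /* find the path this module was loaded from */
    if (!dladdr((void *)&luaopen_A, &info) || !info.dli_fname)
      return luaL_error(L, "unable to locate path");

    /* reopen ourselves to promote our symbols to global visibility */
    if (!dlopen(info.dli_fname, RTLD_GLOBAL|RTLD_NOW|RTLD_NOLOAD))
      return luaL_error(L, "unable to export symbols");

    // normal module loading stuff (leave the module table on the stack)
    return 1;
  }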

Full disclosure: I've used dladdr+dlopen to pin a module in memory so that
it can't be unloaded, and I know it works on Linux glibc, Linux musl, OS X,
FreeBSD, NetBSD, OpenBSD, and Solaris. For example, if I've installed local
functions as callbacks with OpenSSL, I have to make sure the module is never
unloaded. However, I've only ever tested the visibility hack on Linux and
maybe OS X. I don't think RTLD_NOLOAD is needed, but it's a good way to be
sure that we're definitely only reloading ourselves.


Re: dependencies between C modules, RTLD_GLOBAL

Josh Haberman
In reply to this post by Jonathan Goble
On Tue, May 12, 2015 at 7:16 PM, Jonathan Goble <[hidden email]> wrote:
> On Tue, May 12, 2015 at 9:42 PM, Josh Haberman <[hidden email]> wrote:
>> Since package.loadlib bypasses require(), it seems like this approach
>> will involve re-implementing the require() logic? Stuff like checking
>> package.preload, splitting LUA_CPATH on ";", looking in each
>> directory, etc?
>
> Yes, you'd have to check package.preload, but if not found there,
> `package.searchpath(libname, package.cpath)` ought to return the
> filename for passing to package.loadlib.

I tried this out and it seems to work!

One caveat: require() will load the module RTLD_LOCAL. On OS X at
least, once you load a module as RTLD_LOCAL, subsequent loads as
RTLD_GLOBAL don't appear to have any effect.

So ultimately my a.lua looks like this:

-- Ensure the library is loaded as RTLD_GLOBAL.
package.loadlib(package.searchpath("a_cext", package.cpath), "*")

-- Let require load the module in the normal way.
require "a_cext"

Turning these two statements around makes it not work anymore! This
also means I probably need to do a bit more work to give a nicer error
message if the first load fails (for example, if LUA_CPATH was set
incorrectly).
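
One way the nicer error message might look, as a sketch only. It assumes
Lua 5.2 or later; link_global is a hypothetical helper name, and the
error wording is invented for illustration:

```lua
-- Hypothetical helper: link a C module with global symbol visibility,
-- raising a descriptive error if the file cannot be found or opened.
local function link_global(name)
  local path, tried = package.searchpath(name, package.cpath)
  if not path then
    -- 'tried' lists every file that was checked on package.cpath
    error(("C module '%s' not found; check LUA_CPATH. Tried:%s")
      :format(name, tostring(tried)), 2)
  end

  -- With "*", package.loadlib only links the library. On failure it
  -- returns a false value, an error message, and the failing stage.
  local ok, err, where = package.loadlib(path, "*")
  if not ok then
    error(("could not link '%s' (%s): %s")
      :format(path, tostring(where), tostring(err)), 2)
  end
  return path
end
```

In a.lua this would be used as link_global("a_cext") followed by
require "a_cext", in that order, for the reason described above.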

Re: dependencies between C modules, RTLD_GLOBAL

Josh Haberman
In reply to this post by William Ahern
Thanks for all the info here!

On Wed, May 13, 2015 at 1:02 PM, William Ahern
<[hidden email]> wrote:
> The solution I gave above should implement precisely this. However, it
> requires Lua 5.2+.

How many people are still on Lua 5.1? 5.2 has been out for almost 4
years now. And LuaJIT supports the package.loadlib(name, "*") feature.

I am favoring this solution for now (what you posted is roughly
equivalent to what I was thinking of doing).

Re: dependencies between C modules, RTLD_GLOBAL

William Ahern
In reply to this post by Josh Haberman
On Wed, May 13, 2015 at 01:03:40PM -0700, Josh Haberman wrote:

> On Tue, May 12, 2015 at 7:16 PM, Jonathan Goble <[hidden email]> wrote:
> > On Tue, May 12, 2015 at 9:42 PM, Josh Haberman <[hidden email]> wrote:
> >> Since package.loadlib bypasses require(), it seems like this approach
> >> will involve re-implementing the require() logic? Stuff like checking
> >> package.preload, splitting LUA_CPATH on ";", looking in each
> >> directory, etc?
> >
> > Yes, you'd have to check package.preload, but if not found there,
> > `package.searchpath(libname, package.cpath)` ought to return the
> > filename for passing to package.loadlib.
>
> I tried this out and it seems to work!
>
> One caveat: require() will load the module RTLD_LOCAL. On OS X at
> least, it appears that once you load a module as RTLD_LOCAL,
> subsequent loads as RTLD_GLOBAL don't appear to have any effect.

That's because Lua will only load the library once, regardless of the
scoping. As it's already been loaded before it won't call dlopen again with
RTLD_GLOBAL, but instead uses the cached handle.

See ll_loadfunc in the Lua 5.2 source and lookforfunc in the 5.3 source.