Patchless modification of Lua source code

Re: Patchless modification of Lua source code

Sean Conner
It was thus said that the Great Philippe Verdy once stated:

> On Sun, 25 Nov 2018 at 05:24, Sean Conner <[hidden email]> wrote:
>
> >   Here's the same, but under Linux:
> >
> >         % cc    -c -o main.o main.c
> >         % cc    -c -o myfunc2.o myfunc2.c
> >         % cc -shared -fPIC -o func1.ss func1.c
> >         % cc -shared -fPIC -o func2.ss func2.c
> >         % cc -shared -o libfunc.so func1.ss func2.ss
> >         % cc  -Wl,-rpath,/tmp/foo -o smain4 main.o myfunc2.o libfunc.so
> >         % ./smain4
> >         Hello from main
> >                 Hello from func1
> >                         Hello from myfunc2 ********
> >                 Back to func1
> >         Back to main
> >                         Hello from myfunc2
> >         Back to main
> >
> > Again, look closely at the line marked with '********'.  Notice the function
> > that was called---myfunc2().  NOT func2().  myfunc2().
>
> You've used the PIC option, which means that the shared library does not
> call the functions directly, but passes through a "call gate" to perform
> the indirection.

        % cc -c -o main.o main.c
        % cc -c -o myfunc2.o myfunc2.c
        % cc -shared -o func1.ss func1.c    *** < NOTE LACK OF -fPIC
        % cc -shared -o func2.ss func2.c    *** < NOTE LACK OF -fPIC
        % cc -shared -o libfunc.so func1.ss func2.ss
        % cc -Wl,-rpath,/tmp/foo -o smain4 main.o myfunc2.o libfunc.so
        % ./smain4
        Hello from main
                Hello from func1
                        Hello from myfunc2 *** < NOTE myfunc2 called!
                Back to func1
        Back to main
                        Hello from myfunc2
        Back to main

And to further head off the objection that I used "-shared" when compiling
func1.c and func2.c:

        % cc -c -o main.o main.c
        % cc -c -o myfunc2.o myfunc2.c
        % cc -c -o func1.ss func1.c    *** < NOTE LACK OF -shared -fPIC
        % cc -c -o func2.ss func2.c    *** < NOTE LACK OF -shared -fPIC
        % cc -shared -o libfunc.so func1.ss func2.ss
        % cc -Wl,-rpath,/tmp/foo -o smain4 main.o myfunc2.o libfunc.so
        % ./smain4
        Hello from main
                Hello from func1
                        Hello from myfunc2   *** < NOTE myfunc2 called!
                Back to func1
        Back to main
                        Hello from myfunc2
        Back to main
        % ldd ./smain4
                libfunc.so => /tmp/foo/libfunc.so (0x00a9c000) *** < NOTE this is loading it at runtime!
                libc.so.6 => /lib/tls/libc.so.6 (0x00b90000)
                /lib/ld-linux.so.2 (0x00b76000)

And to head off the objection that this is Linux-specific, I'll do this for
Solaris as well:

        % cc -c -o main.o main.c
        % cc -c -o myfunc2.o myfunc2.c
        % cc -c -o func1.ss func1.c    *** < NOTE LACK OF -shared -xcode=pic32
        % cc -c -o func2.ss func2.c    *** < NOTE LACK OF -shared -xcode=pic32
        % cc -shared -o libfunc.so func1.ss func2.ss
        % cc -Wl,-R/lusr/home/spc/foo -o smain4 main.o myfunc2.o -L/lusr/home/spc/foo -lfunc
        % ldd smain4
        libfunc.so =>    ./libfunc.so
        libc.so.1 =>     /usr/lib/libc.so.1
        libm.so.2 =>     /usr/lib/libm.so.2
        /platform/SUNW,Sun-Fire-T1000/lib/libc_psr.so.1
        % ./smain4
        Hello from main
                Hello from func1
                        Hello from myfunc2   *** < NOTE myfunc2 called!
                Back to func1
        Back to main
                        Hello from myfunc2
        Back to main

  [ rest of possibly wrong post snipped ]

  -spc (I guess you couldn't bother to at least try it before writing ... )

Re: Patchless modification of Lua source code

Philippe Verdy
In reply to this post by Sean Conner


On Sun, 25 Nov 2018 at 06:04, Sean Conner <[hidden email]> wrote:
It was thus said that the Great Philippe Verdy once stated:
> So you just demonstrated that you just want to polemicize,

  You said, and I'm quoting you here, "DLL modules on Windows, as well as on
OS/2, are based on directly on the ELF format," which runs counter to how I
understand computing history.

  Microsoft first created DLLs for Windows, starting with version 1.0
(released in 1985).  OS/2 (released in 1987) inherited the concept from
Windows (OS/2 being a joint project between IBM and Microsoft at the time).

  System V, Release 4 (released in 1988), a combined project between AT&T
and Sun Microsystems, was when the ELF format was introduced.  At best, one
could say that ELF was inspired by Windows DLLs, but what you said is ...
not based upon facts.  That in turn, makes me discount what you say due to
the inaccuracy. 

> you don't
> provide any useful help with your very shortsighted view when the
> discussion was much more generic...
> We were giving some examples, now you interpret the examples as being the
> only available options (which they are not, programmers will always invent
> new alternatives).

  I initially gave four examples on two platforms, Linux and Solaris.  Then
Ivan provided some of the examples on Windows.  The results were:

main() calls func1() calls func2()

func1(),func2() in same translation unit, in a library, providing myfunc2(),
static compilation:
        Linux:          failed to link
        Solaris:        failed to link
        Windows:        EXAMPLE NOT PROVIDED

func1(),func2() in different translation units, in a library, providing
myfunc2(), static compilation:
        Linux:          linked, run, myfunc2() called.
        Solaris:        linked, run, myfunc2() called.
        Windows:        linked, run, myfunc2() called.

func1(),func2() in same translation unit, as a shared library, providing
myfunc2():
        Linux:          linked, run, myfunc2() called.
        Solaris:        linked, run, myfunc2() called.
        Windows:        EXAMPLE NOT PROVIDED

func1(),func2() in different translation units, as a shared library,
providing myfunc2():
        Linux:          linked, run, myfunc2() called.
        Solaris:        linked, run, myfunc2() called.
        Windows:        linked, run, func2() called (NOTE DIFFERENCE!)

So, what damn examples did YOU give that I interpreted as being the only
available options?  Because I've seen very little in the way of actual
examples given, or citations to references, or anything from you.

Hmmmm... your examples are visibly fabricated in your head. The description you give here even contains errors (results that are impossible to get). No source code is given explicitly. The link options are not clearly given... All this is highly suspect (notably the last "example" supposedly for Windows; the others are not even tested on Windows, and those for Linux/Solaris are not even correctly described).

You also don't seem to understand call gates, and you seem to generalize the use of PIC code to everything, although it is not the default option (and in fact not the best one except in specific cases, like the CRT library, which is extremely unlikely to be replaced safely unit by unit because its internal dependencies are very complex). But even in this case, call gates and PIC are unnecessary (except if you want to rebase the CRTDLL for "ASLR", something that is now actually proven to offer no real protection against external attacks that inject code via stack buffer overflows, redirecting return addresses to known RVAs instead of causing segmentation faults: this is the only thing that ASLR protects against, but at the price of heavy paging, because no DLL code is then actually shared across processes).

Call gates are not necessary to perform system calls (these are implemented by software interrupts instead, without passing through virtual memory addresses, and are implemented in different protection rings; the system itself is generally not paged out, or uses a different paging pool, independent of the paging pools used by user processes).

ELF and COFF have a common history; both have been used under Unix and even Linux; PE is also used in Linux! Windows can support ELF (MZ+PE is not the only option); it also supports MZ+NE, MZ+LE and some variants still used for VXD drivers. COFF has many variants. ELF itself has its own variants (some extensions were tried in Linux, then abandoned...). Anyway, there are no fundamental differences between today's variants of ELF and COFF (the major difference is in how they support debugging information and runtime type info; ELF was created to support more architectures, and this was done as well in COFF, then NE, LE, PE, PE64). Various user programs or libraries can still use or implement their own loaders to support the binary format they prefer, or just to emulate other platforms.

I started with just one example, but you drifted onto this topic, which is completely independent of what C compilers do, because you still confuse the C standard with what linkers must do, which is absolutely not defined by the C standard itself.

The C standard only defines the "linkage" options that you can use to bind non-local symbols (i.e. variables not declared as auto or register, and not on the stack): it describes the "static" linkage (which is invisible to external linkers and creates a scope local to the compiled unit), and the "extern" linkage, possibly decorated with a "C" or "C++" label to indicate the name-mangling convention that allows distinguishing or not the symbols used from other compiled units.

Re: Patchless modification of Lua source code

Sean Conner
It was thus said that the Great Philippe Verdy once stated:
>
> Hmmmm... your examples are visibly fabricated in your head. The description
> you give here even contains errors (results that are impossible to get).
> No source code is given explicitly. The link options are not clearly
> given... All this is highly suspect (notably the last "example" supposedly
> for Windows; the others are not even tested on Windows, and those for
> Linux/Solaris are not even correctly described).

  Fine.  Source code:

/* main.c */
#include <stdio.h>

extern void func1(void);
extern void func2(void);

int main(void)
{
  puts("Hello from main");
  func1();
  puts("Back to main");
  func2();
  puts("Back to main");
  return 0;
}

/* myfunc2.c */
#include <stdio.h>

void func2(void)
{
  puts("\t\tHello from myfunc2");
}

/* func.c */
#include <stdio.h>

void func2(void)
{
  puts("\t\tHello from func2");
}

void func1(void)
{
  puts("\tHello from func1");
  func2();
  puts("\tBack to func1");
}

/* func1.c */
#include <stdio.h>

extern void func2(void);

void func1(void)
{
  puts("\tHello from func1");
  func2();
  puts("\tBack to func1");
}

/* func2.c */
#include <stdio.h>

void func2(void)
{
  puts("\t\tHello from func2");
}

  I've given the commands used.  RUN THE DAMN PROGRAMS AND SEE FOR YOURSELF!
I'll believe my eyes over your mad ramblings.  The results I got for Linux
and Solaris I RAN MYSELF!  I do not have Windows, so I am trusting the
results I got from Ivan.

  -spc

Re: Patchless modification of Lua source code

Viacheslav Usov
In reply to this post by Sean Conner
On Sat, Nov 24, 2018 at 11:23 PM Sean Conner <[hidden email]> wrote:

>  No, the output I saw under Linux was consistent with Solaris, NOT Windows.

Since you did not say that in the original message that had only the output on Solaris, I could not assume that.

Instead, I relied on the message from Philippe Verdy which had this output:

Hello from main
    Hello from func1
        Hello from func2 (sic!)
    Back to func1
Back to main
    Hello from myfun2 (sic!)
Back to main

His message did not say that it was done on Linux, that was my guess. And it did not show the command line options he used to get the result, nor did it specify the details of the toolchain.

ELF shared libraries allow for symbol interposition, so that a shared library may end up calling a function defined elsewhere even if it is also defined in the shared library. This is I think what your results demonstrate. However, that is both inefficient and questionable from the security standpoint so there are ways controlled by a bunch of different options to suppress this behaviour. Even default behaviour in this respect may have changed over the years and tool versions, so it is not completely surprising that you and Philippe got different results.

On the other hand, I do not think such behaviour is available with PE DLLs. I am not sure about MacOS.

I'd say all this demonstrates "undefined behaviour" quite convincingly.

Cheers,
V.

Re: Patchless modification of Lua source code

Sean Conner
It was thus said that the Great Viacheslav Usov once stated:
> On Sat, Nov 24, 2018 at 11:23 PM Sean Conner <[hidden email]> wrote:
>
> >  No, the output I saw under Linux was consistent with Solaris, NOT Windows.
>
> Since you did not say that in the original message that had only the output
> on Solaris, I could not assume that.

  I did say that.   Twice.  Granted, they were in parentheticals but they
are there.  And the reason I reported the Solaris results is that you
explicitly called out the GNU tools as a possible reason, and I wanted to
get a counter to that.

  I could have probably been clearer on that point though.

> Instead, I relied on the message from Philippe Verdy which had this output:
>
> Hello from main
>     Hello from func1
>         Hello from func2 (sic!)
>     Back to func1
> Back to main
>     Hello from myfun2 (sic!)
> Back to main
>
> His message did not say that it was done on Linux, that was my guess. And
> it did not show the command line options he used to get the result, nor did
> it specify the details of the toolchain.
>
> ELF shared libraries allow for symbol interposition, so that a shared
> library may end up calling a function defined elsewhere even if it is also
> defined in the shared library. This is I think what your results
> demonstrate. However, that is both inefficient and questionable from the
> security standpoint so there are ways controlled by a bunch of different
> options to suppress this behaviour. Even default behaviour in this respect
> may have changed over the years and tool versions, so it is not completely
> surprising that you and Philippe got different results.
>
> On the other hand, I do not think such behaviour is available with PE DLLs.

  Ivan Krylov did the experiment under Windows (gave the commands, etc).  He
was able to replicate my results with static compilation (with separate
translation units in the library), but the DLL version behaved differently
than Solaris/Linux.

> I am not sure about MacOS.

  I am not sure either.

> I'd say all this demonstrates "undefined behaviour" quite convincingly.

  True.  But I did find this, written by Microsoft, about the Microsoft LINK
program (about statically linking):

        For example, imagine that a C programmer has written two versions of
        a function named _myfunc()_ that is called by the program MYPROG.C.
        One version of _myfunc()_ is for debugging; its object module is
        found in MYFUNC.OBJ.  The other is a production version whose object
        module resides in MYLIB.LIB.  Under normal circumstances, the
        programmer links the production version of _myfunc()_ by using
        MYLIB.LIB.  To use the debugging version of _myfunc()_, the
        programmer explicitly includes its object module (MYFUNC.OBJ) when
        LINK is executed.  This causes LINK to build the debugging version
        of _myfunc()_ into the executable file because it encounters the
        debugging version in MYFUNC.OBJ before it finds the other version in
        MYLIB.LIB.

        To exploit the order in which LINK resolves external references, it
        is important to know LINK's library search strategy: Each individual
        library is searched repeatedly (from first library to last, in the
        sequence in which they are input to LINK) until no further external
        references can be resolved.

                                        _The MS-DOS Encyclopedia_

  Given Microsoft's history of backwards compatibility, I'm sure this is
still true to this day.

  -spc


Re: Patchless modification of Lua source code

Viacheslav Usov
On Sun, Nov 25, 2018 at 12:00 PM Sean Conner <[hidden email]> wrote:

>  I did say that.   Twice.  Granted, they were in parentheticals but they are there.

One of them was "(for the record, it worked under Linux)", whence I could not deduce it produced the same output.

>                                         _The MS-DOS Encyclopedia_

When it comes to static linking, you might as well use a contemporary source:

"Object files on the command line are processed in the order they appear on the command line. Libraries are searched in command line order as well, with the following caveat: [...]"


My point was about PE DLLs, though. In PE, there is a clear distinction between imports and exports. To create an equivalent of an elfian public symbol in a shared library, a DLL export would also have to be an import and would need a body, which to my knowledge is impossible using Microsoft's tools. I am not sure whether one could do that with a special tool exploiting PE's dark corners, but then we would end up with the "original" DLL's having to be built in some weird (for Windows) way, which kind of nullifies the whole point of the exercise.

Cheers,
V.

Re: Patchless modification of Lua source code

Andrew Gierth
In reply to this post by Sean Conner
Compare these cases (this is FreeBSD, clang, GNU ld):

% cc -c -fPIC func.c func1.c func2.c main.c myfunc2.c
% cc -shared -o libfunc.so func1.o func2.o      
% cc -Wl,-rpath,$PWD -o smain main.o myfunc2.o libfunc.so
% ./smain
Hello from main
        Hello from func1
                Hello from myfunc2
        Back to func1
Back to main
                Hello from myfunc2
Back to main

So far so good; we overrode the library version of the function.

Alternative #1:

% cc -shared -Wl,-Bsymbolic -o libfunc.so func1.o func2.o
% ./smain
Hello from main
        Hello from func1
                Hello from func2
        Back to func1
Back to main
                Hello from myfunc2
Back to main

Now we have two func2's, one visible to the library and one to the main
program.

Or we could do:

% printf '{ global: func1; local: func2; };\n' >libfunc.x
% cc -shared -Wl,--version-script,libfunc.x -o libfunc.so func1.o func2.o
% ./smain
Hello from main
        Hello from func1
                Hello from func2
        Back to func1
Back to main
                Hello from myfunc2
Back to main

Again, two versions.

Let's do it without a version script (NOTE: this is what Lua does on
platforms that support it, see LUAI_FUNC)

% printf '/void func2/s/^/__attribute__((visibility("hidden"))) /\nwq\n' | ed func2.c
% cc -c -fPIC func2.c
% cc -shared -o libfunc.so func1.o func2.o
% ./smain                                
Hello from main
        Hello from func1
                Hello from func2
        Back to func1
Back to main
                Hello from myfunc2
Back to main

Two versions again.

The PostgreSQL project went through a more involved version of this mess
recently, to the extent that we now have a source file that exists for
no reason other than to detect mis-linking. (Our case is more complex
because it's to do with dynamic loading, but similar principles apply.)

--
Andrew.

Re: Patchless modification of Lua source code

Philippe Verdy
In reply to this post by Viacheslav Usov
DLLs in Windows have a DllEntry point which can be used for various customizations of dynamic linking and dependency resolution. DllEntry points are normal C functions, comparable to the main function in C, but taking other arguments (not from a command line; they can still use the environment if needed and can modify it; however, they cannot load other modules/libraries themselves due to possible deadlock conditions; likewise they cannot start or terminate any thread or process, and instead return a status that allows the loader to refuse to create or terminate a thread or process).

It is called when the DLL is first loaded, but before actual linking to other programs (actually it is used not only at the per-process level but also for each thread). If the DLL does not need per-thread initialization (e.g. if it does not allocate any TLS data for each thread to keep track of external resources used by the thread when it uses the exported DLL functions), it can disable these notifications, and the DLL entry will only be called on process termination. The DLL entry point can control the search path and load other libraries itself.


Re: Patchless modification of Lua source code

Roberto Ierusalimschy
In reply to this post by Philippe Verdy
> On Sun, 25 Nov 2018 at 06:04, Sean Conner <[hidden email]> wrote:
>
> [...]

As this discussion seems quite private between a very few members of
this list, and has almost no relationship to Lua, would you all mind
moving it to some other forum?

Many thanks,

-- Roberto
