Patchless modification of Lua source code

Re: Patchless modification of Lua source code

Philippe Verdy
There's no elephant here: Windows is no different from Unix or other POSIX-compatible systems in how it handles simple libraries or shared libraries (DLLs). Not all POSIX systems support shared libraries, but in that case it is only the possibility of creating **non-shared** libraries that can produce unspecified behavior, because **non-shared** libraries are unordered collections of units (the order is unpredictable), so they cannot contain multiple distinct units defining the same exported symbol.
This problem does not occur with shared libraries/DLLs, whose internal dependencies are fully resolved and whose remaining unresolved symbols (those not in their own list of exported symbols) are restricted to a set of unique identifiers: the order of resolution is fully specified with shared libraries.

On Sat, 24 Nov 2018 at 14:23, Ivan Krylov <[hidden email]> wrote:
Hi!

On Sat, 24 Nov 2018 13:04:50 +0100
Viacheslav Usov <[hidden email]> wrote:

> We have an elephant in the room here, and let me just name it:
> Windows. Once somebody demonstrates similar techniques on Windows, we
> could talk about real-world portability.

For what it's worth, Sean's experiment can be reproduced on MSVC 2010:

 >cl /c main.c
 >cl /c func1.c
 >cl /c func2.c
 >cl /c myfunc2.c
 >lib /out:func.lib func1.obj func2.obj
 >link /out:main.exe main.obj myfunc2.obj func.lib
 >main.exe
 main
 func1
 myfunc2

> First, other libraries that the executable depends on will not have
> been linked statically to the same "replacement" functions, so they
> will have to resort to the "original" dependencies. Your program, as
> whole, may end up using different versions of the same set of
> functions simultaneously.

I've certainly encountered such nasty cases in the past when I was
learning C, but right now I cannot produce such an example that would
also be concise and easy to reproduce. If you have one handy, I would be
glad to test it on Windows, too.

--
Best regards,
Ivan


Re: Patchless modification of Lua source code

Viacheslav Usov
In reply to this post by Philippe Verdy
On Sat, Nov 24, 2018 at 2:18 PM Philippe Verdy <[hidden email]> wrote:

> In other words, the program is linked with two distinct versions of func2(), which exist simultaneously! One version is statically used by func1() from inside the shared library where func1 is used by main. The second version (myfunc1.o), however, is now used by main but does **not** replace the version used internally by func1() inside the shared library (even if mylib.so also exports the symbol for "func2", this export is not used when linking main because you've specified myfunc2.o **before** mylib.so when linking main).

Which is part of concern "First" in my response to Luiz earlier in this thread, to which you responded with "I disagree". I still do not know what exactly you disagree with, because you never answered my follow-on questions specifically enough.

You might also want to experiment with concern "Second".

Cheers,
V.

Re: Patchless modification of Lua source code

Viacheslav Usov
In reply to this post by Ivan Krylov
On Sat, Nov 24, 2018 at 2:39 PM Ivan Krylov <[hidden email]> wrote:
 
 Hello from main
         Hello from func1
                 Hello from func2
         Back to func1
 Back to main
                 Hello from myfunc2
 Back to main

Note that the same code but with the static func.lib produces

Hello from main
        Hello from func1
        Hello from myfunc2
        Back to func1
Back to main
        Hello from myfunc2
Back to main

Which demonstrates how undefined behaviour can produce different results even in one and the same implementation.

This also illustrates the invalidity of the claim that "the difference between [shared and static libraries] is not important for this disussion [sic]", and why "pre-linked" is an important aspect of shared libraries.

Cheers,
V.

Re: Patchless modification of Lua source code

Philippe Verdy
If you use non-shared libraries, the result is in fact unpredictable; you could just as well have seen:

Hello from main
        Hello from func1
        Hello from func2
        Back to func1
Back to main
        Hello from func2
Back to main

That's what I was demonstrating: there's undefined behavior ONLY if you use non-shared libraries, because the order of units in these libraries is NOT significant.
And that in fact makes the use of "pre-linked" even more relevant to this discussion.

Only non-shared libraries are non-portable when multiple units define the same symbol.
But you are not required to use them if, instead, you use a linker directly and specify a full list of unit names (or shared library names), so that the order of resolution is entirely determined.


Re: Patchless modification of Lua source code

Philippe Verdy
Note that Windows and the ELF format in general allow a shared library to be "pre-linked" within its set of internal references, but still be replaceable by a linker: the "pre-link" is kept as a default if no other implementation is specified, but this default can be overridden by specifying a prior unit that takes priority, in which case that unit becomes the implementation used by the shared library itself at run-time. The default implementation will still be part of the modules possibly loaded in memory, but it won't be used (and a VMM may not even need to page in these unused parts if they are larger than a single memory page, so this won't take too much memory; the unused implementation will still use some amount of virtual memory).
For that case, the same symbol may be present in both the list of imports and the list of exports of a "pre-linked" module (executable file or shared library).

A "pre-linked" module may also specify restrictions in its list of imports, such as requiring that a symbol be loaded from specific module names, or from modules matching a specific digital signature, or meeting other security constraints (like a version number). The module also specifies, for its list of exports, a set of properties that allows other modules to check these constraints; notably, the export list contains the name of the module itself, its version, and some data signature or globally unique identifier.



Re: Patchless modification of Lua source code

Philippe Verdy
Also note that a POSIX system does not need any specific file format for **non-shared** libraries: they can be represented simply as an archive (similar to a .zip file) or as a directory. In both cases, the units stored in these archives or directories are in random order (which is not guaranteed to be stable over time, notably if you add or remove units or any other files from the archive or directory). The archive or directory may expose the list of its contents in arbitrary random order, or ordered in some arbitrary collation order (depending on the user's locale), so this is neither stable nor portable.

But you can make the order of units predictable by adding a supplementary "manifest" file in this directory or archive, to specify the explicit order in which units present in that directory or archive must be resolved.


Re: Patchless modification of Lua source code

Viacheslav Usov
In reply to this post by Philippe Verdy
On Sat, Nov 24, 2018 at 4:01 PM Philippe Verdy <[hidden email]> wrote:

> That's what I was demonstrating: there's undefined behavior ONLY if you use non-shared libraries because the order of units in these libraries is NOT significant.

The C standard has no notion of shared (or not) libraries, so this distinction is irrelevant. Per the standard, multiple external symbols are undefined behaviour no matter how that is managed in any given implementation. The standard has no notion of the "order of units", so what you say here and later about it is meaningless per the standard.

> Only non-shared libraries are not portable if you have multiple units defining the same symbol.

Again, multiple definitions of an external symbol cause undefined behaviour per the C standard, so there is no portability under the standard. 

Speaking of real-world portability, so far it has only been demonstrated that one gets consistent results using shared libraries on Windows and (probably) Linux (and, pedantically, it is unknown if this is stable if we use different linkers). The behaviour on Solaris, while consistent with itself, seems different, wherein the patched function seems to be used even when called indirectly, while the former platforms use it only when it is directly called. It is possible that some linker option magic can make the behaviour consistent among all three platforms, but that is yet to be demonstrated.

For the record, the demonstrated behaviour on Solaris seems more useful for patching than that on Linux and Windows, but I suspect there ain't no such thing as a free lunch so we are yet to see what tradeoffs that really involves.

Cheers,
V.

Re: Patchless modification of Lua source code

Sean Conner
In reply to this post by Viacheslav Usov
It was thus said that the Great Viacheslav Usov once stated:
>
> We have an elephant in the room here, and let me just name it: Windows.
> Once somebody demonstrates similar techniques on Windows, we could talk
> about real-world portability.

  Ivan Krylov ran a similar experiment on Windows, and with static linking,
each function in its own translation unit and turned into a library.  Then
Ivan got the following results:

        >cl /c main.c
        >cl /c func1.c
        >cl /c func2.c
        >cl /c myfunc2.c
        >lib /out:func.lib func1.obj func2.obj
        >link /out:main.exe main.obj myfunc2.obj func.lib
        >main.exe
        main
        func1
        myfunc2

  Which matches what I saw on Linux and Solaris in the same scenario.  He
then did it with shared objects (DLLs), each function in its own translation
unit before making it into a shared library, on Windows:

        >cl /c main.c
        >cl /c func1.c
        >cl /c func2.c
        >cl /c myfunc2.c
        >link /dll /out:mylib.dll func1.obj func2.obj
          Creating library mylib.lib and object mylib.exp
        >link /out:main.exe main.obj myfunc2.obj mylib.lib
        >main
        Hello from main
                Hello from func1
                        Hello from func2
                Back to func1
        Back to main
                        Hello from myfunc2
        Back to main

Which is different from what I saw on Linux and Solaris in the same
scenario:

        Hello from main
        Hello from func1
        Hello from myfunc2

  -spc


Re: Patchless modification of Lua source code

Sean Conner
In reply to this post by Viacheslav Usov
It was thus said that the Great Viacheslav Usov once stated:
>
> Speaking of real-world portability, so far it has only been demonstrated
> that one gets consistent results using shared libraries on Windows and
> (probably) Linux (and, pedantically, it is unknown if this is stable if we
> use different linkers).

  No, the output I saw under Linux was consistent with Solaris, NOT Windows.

> The behaviour on Solaris, while consistent with
> itself, seems different, wherein the patched function seems to be used even
> when called indirectly, while the former platforms use it only when it is
                                    ^^^^^^^^^^^^^^^^
> directly called.

  Again, Windows only exhibited that behavior, not Linux.

> For the record, the demonstrated behaviour on Solaris

and Linux

> seems more useful for
> patching than that on Linux and Windows, but I suspect there ain't no such
> thing as a free lunch so we are yet to see what tradeoffs that really
> involves.

  -spc


Re: Patchless modification of Lua source code

Dibyendu Majumdar
In reply to this post by Dirk Laurie-2
On Mon, 19 Nov 2018 at 09:10, Dirk Laurie <[hidden email]> wrote:

>
> Over the years, there have been several replies to suggested patches
> on this list, saying "no need to patch Lua".
>
> So all that is necessary is:
>
> 1. In the Lua source directory, save your modified files with another
> prefix, say "my", instead of "l" starting their names, e.g.
> 'myctype.c' and 'mylex.c'.
> 2. When you want a new 'lua' including your most recent modifications:
>     rm my*.o     # necessary because the Makefile does not know about them
>     make linux -e "LUA_O=  lua.o myctype.o mylex.o"

Hi, I think this is a bad idea; this is okay for quick hacks but not
the best way to build production software. I think the issues have
been discussed in detail so I won't go into that. But here is an
interesting take on how to build software:

Jonathan Blow on Libraries for his new games programming language:
https://youtu.be/3TwEaRZ4H3w

Regards
Dibyendu


Re: Patchless modification of Lua source code

Philippe Verdy
In reply to this post by Sean Conner


On Sat, 24 Nov 2018 at 23:20, Sean Conner <[hidden email]> wrote:
> It was thus said that the Great Viacheslav Usov once stated:
> >
> > We have an elephant in the room here, and let me just name it: Windows.
> > Once somebody demonstrates similar techniques on Windows, we could talk
> > about real-world portability.
>
>   Ivan Krylov ran a similar experiment on Windows, and with static linking,
> each function in its own translation unit and turned into a library.  Then
> Ivan got the following results:
>
>         >cl /c main.c
>         >cl /c func1.c
>         >cl /c func2.c
>         >cl /c myfunc2.c
>         >lib /out:func.lib func1.obj func2.obj
>         >link /out:main.exe main.obj myfunc2.obj func.lib
>         >main.exe
>         main
>         func1
>         myfunc2
>
>   Which matches what I saw on Linux and Solaris in the same scenario.  He
> then did it with shared objects (DLLs), each function in its own translation
> unit before making it into a shared library, on Windows:
>
>         >cl /c main.c
>         >cl /c func1.c
>         >cl /c func2.c
>         >cl /c myfunc2.c
>         >link /dll /out:mylib.dll func1.obj func2.obj
>           Creating library mylib.lib and object mylib.exp
>         >link /out:main.exe main.obj myfunc2.obj mylib.lib
>         >main
>         Hello from main
>                 Hello from func1
>                         Hello from func2
>                 Back to func1
>         Back to main
>                         Hello from myfunc2
>         Back to main
>
> Which is different from what I saw on Linux and Solaris in the same
> scenario:
>
>         Hello from main
>         Hello from func1
>         Hello from myfunc2

This last is certainly not the same scenario! My scenario made it clear which version was used by testing not only the two calls from main, but also the second function when it is called from the first one.

This is exactly where you can see that simple (unlinked) libraries do not behave like prelinked libraries, and that it is the use of simple libraries which is completely inconsistent (independently of the language you used to create them; it may be C or any other).

Simple libraries ARE NOT suitable and NOT compatible with the C standard. They can never be strictly portable. Shared (prelinked) libraries are necessary for portability because they ensure a strict and stable order for resolving external dependencies. In summary, don't use simple libraries at all with C! Instead use the linker and provide the **full list** of the libraries you want to link to, specifying their expected order of resolution.

The basic syntax like "-lm" for invoking the linker, or the C compiler with linker capabilities, is inconsistent. You can never predict which unit will be included, unless the ".a" libraries are built to ensure that they will NEVER contain any pair of units defining the same symbol (i.e. each symbol defined in the library belongs to one and only one unit of the library).

The old "ar" tool of Unix (or "tar" or "shar", which give equivalent results) is clearly buggy: it's no different from a simple ".zip" file, or from a filesystem directory containing all compiled .o units listed in an unspecified/unstable order (and not containing any additional metadata file like a "manifest" to specify the order of resolution)! The old "*ar" tools were never seriously designed to be suitable for programming; they were made only for backing up and transferring collections of files of any type between systems by packing them into a single file transferable in a single session (e.g. by FTP)!

There's no such limitation or portability issue when you don't use ANY static library but only use "prelinked" shared libraries (like DLLs on Windows or ELF libraries on Linux; note that executable and DLL modules on Windows, as well as on OS/2, are based directly on the ELF format)!

Let's be serious! Prelinked library formats (or archives containing manifests) are the only suitable formats for developing in any language allowing separate compilation of units (now almost all of them); the alternative is to use unit names with their full pathname (and not just the unit name) so that you can choose between multiple possible implementations in separate modules (with distinct paths) defining the same exported symbol.


Re: Patchless modification of Lua source code

Sean Conner
It was thus said that the Great Philippe Verdy once stated:

> On Sat, 24 Nov 2018 at 23:20, Sean Conner <[hidden email]> wrote:
>
> >         >cl /c main.c
> >         >cl /c func1.c
> >         >cl /c func2.c
> >         >cl /c myfunc2.c
> >         >link /dll /out:mylib.dll func1.obj func2.obj
> >           Creating library mylib.lib and object mylib.exp
> >         >link /out:main.exe main.obj myfunc2.obj mylib.lib
> >         >main
> >         Hello from main
> >                 Hello from func1
> >                         Hello from func2
> >                 Back to func1
> >         Back to main
> >                         Hello from myfunc2
> >         Back to main
> >
> > Which is different from what I saw on Linux and Solaris in the same
> > scenario:
> >
> >         Hello from main
> >         Hello from func1
> >         Hello from myfunc2
>
> This last is certainly not the same scenario ! My scenario made it clear
> which version was used by testing not the two calls only from main, but
> also testing the second function when it is called from the 1st one.

  Okay.

        % uname
        Linux
        % cc    -c -o main.o main.c
        % cc    -c -o myfunc2.o myfunc2.c
        % cc -shared -fPIC -o func.ss func.c
        % ar rv libfuncall.so func.ss
        % ar: creating libfuncall.so
        % a - func.ss
        % cc  -Wl,-rpath,/tmp/foo -o smain2 main.o myfunc2.o libfuncall.so
        % ./smain2
        Hello from main
                Hello from func1
                        Hello from myfunc2
                Back to func1
        Back to main
                        Hello from myfunc2
        Back to main

  Happy now?

> There's no such limitation and portability issue when you don't use ANY
> static library but only use "prelinked" shared libraries (like DLL on
> windows or ELF libraries on Linux.. note that executable and DLL modules on
> Windows, as well as on OS/2, are based on directly on the ELF format) !

  Citation needed.

  -spc


Re: Patchless modification of Lua source code

Philippe Verdy


On Sun, 25 Nov 2018 at 02:23, Sean Conner <[hidden email]> wrote:
> It was thus said that the Great Philippe Verdy once stated:
> > On Sat, 24 Nov 2018 at 23:20, Sean Conner <[hidden email]> wrote:
>         % uname
>         Linux
>         % cc    -c -o main.o main.c
>         % cc    -c -o myfunc2.o myfunc2.c
>         % cc -shared -fPIC -o func.ss func.c
>         % ar rv libfuncall.so func.ss
>         % ar: creating libfuncall.so
>         % a - func.ss
>         % cc  -Wl,-rpath,/tmp/foo -o smain2 main.o myfunc2.o libfuncall.so
>         % ./smain2
>         Hello from main
>                 Hello from func1
>                         Hello from myfunc2
>                 Back to func1
>         Back to main
>                         Hello from myfunc2
>         Back to main
>
>   Happy now?

That's what I wanted, and it demonstrates what I wanted to show: this is the only portable and expected behavior!

>> There's no such limitation and portability issue when you don't use ANY
>> static library but only use "prelinked" shared libraries (like DLL on
>> windows or ELF libraries on Linux.. note that executable and DLL modules on
>> Windows, as well as on OS/2, are based on directly on the ELF format) !

 > Citation needed.

I gave those as examples, because I cannot find any counter-example anywhere, using shared/prelinked libraries (or archive formats containing an explicit manifest of the expected link order), where this is not true.

So instead I ask you to demonstrate the existence of any counter-example!

The C standard itself does not really indicate how units are linked into a working program: it does not even need the existence of "libraries". It just speaks of the possibility of creating programs from collections of separately compilable units (that's the meaning of the "extern" keyword in C); you then need to use an external linker and to specify the units needed for a working program in a well-defined resolution order (and basic libraries, which are only collections of units in unspecified order, are not compatible with this goal).

But shared/prelinked libraries are compatible with this goal, as their role is to prelink units partially, using the same linker and the same specified order, which then gives a predictable resolution order for symbols in the library (something impossible to reach using only basic static libraries containing an arbitrary number of object units in random order).

It's not a basic "librarian" tool that resolves symbols. The librarian is just a way to archive and pack multiple files into a single one, from which they can be extracted without modification, because a single file is generally faster to process than a large collection of small files; that speed advantage was certainly real with past filesystems, but it's no longer the case with modern filesystems. Only a true linker (actually made for building programs) does the work of resolving symbols accurately and predictably!


Re: Patchless modification of Lua source code

Philippe Verdy
In fact the C standard NEVER forbids different compiled units to redefine the same external symbols.

In fact it even **encourages** such use, which is required, for example, so that every C program can have its own implementation of the "main()" function, which is externalized to the same symbol once compiled with the same default "C" linkage.

Then neither the C language nor the compiler dictates how these compiled units will be linked together. This is only specified at the linker level (which is not part of the C standard itself). But all linkers require you to specify unambiguously the order of units.

If **any** linker allows you to use "libraries" (containing multiple units in random order), it can process them ONLY if there's no pair of distinct units in the same library that define the same symbol (otherwise the result would be completely unpredictable).

Shared libraries (or libraries and directories containing an ordering manifest file) are assured to respect this constraint because they are already the result of a previous pass of the linker, for which the order of resolution was already specified and checked.

Shared libraries are not a requirement for any POSIX system compatible with C: these systems can work perfectly without supporting them (it's enough on these systems to have a linker tool that supports a manifest in its supported archive format, or that will look for a manifest file if libraries are represented as a filesystem directory containing all compiled units).

In practice, linkers **already look for a manifest file** in the archive formats they recognize, containing not just the order of units, but a full mapping index of symbols, associating each with the name of the unit in which it is defined and exported. This allows much faster linking, without having to process every unit file inside the library completely; such a unique mapping index cannot be built and inserted inside the library if two units in the library define the same symbol.

But:
- old archiving tools (like old versions of "ar" on old versions of Unix) did not check that and did not build this index/manifest; it had to be built and added separately to the archive, and updated each time you added, updated, or removed a unit from the library. The old "ar" tool is now completely deprecated. For backup/archiving purposes, "tar" is much better and universally supported on all Unix/Linux variants, and most archives are now compressed ("taz", "tgz", "zip", "gz", "xz", you have the choice...)
- archive formats containing a manifest or index solve the problem for linkers; this solution is used by linkers inside virtual machines like Java, .Net, Perl, Python... (even Lua uses such a solution, although it's hidden behind the concept of "loaders", which are actually linkers that programs themselves can control and tune for their needs)
- shared libraries are in fact much cleaner, more compact, and much faster to process when creating native programs: the ELF format (or similar variants) is now almost universal across Unix/Linux/Windows and many other systems (and they still allow compiled programs to invoke or use the system linker themselves using "loaders")

This solution based on shared libraries (or archives with manifests) and the concept of generic "loaders" is not just for linking standalone programs; it also exists in scripting languages and in programs running on virtual machines. These programs can themselves control the resolution order and the environment path in which external libraries or units will be found; they can check constraints like security requirements, access rights, and digital signatures; they can use network services to download the units; they can use conditions like the user's locale and other preferences; they can try to best match the architecture, such as i386 vs. x64 vs. i686, when they have the choice; they can perform comparative benchmark tests before deciding which implementation to use...

**Absolutely nothing in the C standard** requires the units needed or used in the same program to be limited to sets of unique symbols!

All that the C standard says is that the compiler will never create a separate unit containing multiple instances of the same extern symbol with "C" linkage. (Some compilers for specific systems may be exceptions to this rule: they may still create **simultaneously** multiple implementations of the same source, compiled for different architectures, or for different goals such as debugging, or with different levels or methods of optimization which may not be safe in all situations, such as relocatable vs. reentrant versions for multithreading. In that case the object format will actually be like an archive with several distinct entry points for the same symbol, distinguished by some encoded goals or by some "decoration" in the encoded exported symbols; but this is equivalent to exporting distinct symbols, and it works only if the linker recognizes the multiple encoded symbols and knows the rules to locate them and to decide predictably which implementation is the most suitable.)


Le dim. 25 nov. 2018 à 02:45, Philippe Verdy <[hidden email]> a écrit :


Le dim. 25 nov. 2018 à 02:23, Sean Conner <[hidden email]> a écrit :
It was thus said that the Great Philippe Verdy once stated:
> Le sam. 24 nov. 2018 à 23:20, Sean Conner <[hidden email]> a écrit :
        % uname
        Linux
        % cc    -c -o main.o main.c
        % cc    -c -o myfunc2.o myfunc2.c
        % cc -shared -fPIC -o func.ss func.c
        % ar rv libfuncall.so func.ss
        % ar: creating libfuncall.so
        % a - func.ss
        % cc  -Wl,-rpath,/tmp/foo -o smain2 main.o myfunc2.o libfuncall.so
        % ./smain2
        Hello from main
                Hello from func1
                        Hello from myfunc2
                Back to func1
        Back to main
                        Hello from myfunc2
        Back to main

  Happy now?

That's what I wanted. And demonstrates what I wanted to show: this is the only portable and expected behavior !

>> There's no such limitation and portability issue when you don't use ANY
>> static library but only use "prelinked" shared libraries (like DLL on
>> windows or ELF libraries on Linux.. note that executable and DLL modules on
>> Windows, as well as on OS/2, are based on directly on the ELF format) !

 > Citation needed.

I gave citations as examples, because I cannot find any counter-example anywhere, using shared/prelinked libraries (or archive formats containing an explicit manifest of the expected link order), where this is not true.

So instead I ask you to demonstrate the existence of any counter-example !

The C standard itself does not really indicate how units are linked into a working program: it does not even require the existence of "libraries". It just speaks of the possibility of creating programs from collections of separately compilable units (that's the meaning of the "extern" keyword in C); you then need to use an external linker and to specify the units you need for a working program in a well-defined resolution order (and basic libraries, which are only collections of units in unspecified order, are not compatible with this goal).

But shared/prelinked libraries are compatible with this goal, as their role is to prelink the units partially, using the same linker and the same specified order, which then gives a predictable resolution order for symbols in the library (something impossible to achieve using only basic static libraries containing an arbitrary number of object units in random order).

It's not a basic "librarian" tool that resolves symbols. The librarian is just a way to archive and pack multiple files into a single one, from which they can be extracted without modification, because a single file is generally faster to process than a large collection of small files (this was certainly true with past filesystems, but it's no longer the case with modern filesystems). Only a true linker (actually made for building programs) does the work of resolving symbols accurately and predictably!


Re: Patchless modification of Lua source code

Sean Conner
In reply to this post by Philippe Verdy
Let me try this one more time ...

It was thus said that the Great Philippe Verdy once stated:
> note that executable and DLL modules on
> Windows, as well as on OS/2, are based on directly on the ELF format) !

  I would like for you to provide a citation that states the Windows
executable format, known as PE (for "Portable Executable") is based directly
on ELF (for "Executable and Linkable Format").  They are two different formats.

  -spc





Re: Patchless modification of Lua source code

Philippe Verdy
"Based on" does not mean they are the same... There are other formats, working on the same principles (for example COFF)...
And no! The PE format is not used directly when building, it's only one of the final formats supported by the native loader/linker.
Note that the Windows kernel (as well as several Unix/Linux kernels) also supports other formats (PE is not the exclusive one for Windows), and so it implements several native loaders.
I just took one example.



Re: Patchless modification of Lua source code

Philippe Verdy
So you have just demonstrated that you just want to polemicize; you don't provide any useful help with your very shortsighted view, when the discussion was much more generic...
We were giving some examples, now you interpret the examples as being the only available options (which they are not, programmers will always invent new alternatives).


Re: Patchless modification of Lua source code

Sean Conner
In reply to this post by Philippe Verdy
It was thus said that the Great Philippe Verdy once stated:

> On Sun, 25 Nov 2018 at 02:23, Sean Conner <[hidden email]> wrote:
>
> > It was thus said that the Great Philippe Verdy once stated:
> > > On Sat, 24 Nov 2018 at 23:20, Sean Conner <[hidden email]> wrote:
> >         % uname
> >         Linux
> >         % cc    -c -o main.o main.c
> >         % cc    -c -o myfunc2.o myfunc2.c
> >         % cc -shared -fPIC -o func.ss func.c
> >         % ar rv libfuncall.so func.ss
> >         % ar: creating libfuncall.so
> >         % a - func.ss
> >         % cc  -Wl,-rpath,/tmp/foo -o smain2 main.o myfunc2.o libfuncall.so
> >         % ./smain2
> >         Hello from main
> >                 Hello from func1
> >                         Hello from myfunc2
> >                 Back to func1
> >         Back to main
> >                         Hello from myfunc2
> >         Back to main
> >
> >   Happy now?
>
> That's what I wanted. And demonstrates what I wanted to show: this is the
> only portable and expected behavior !

  I am confused.  

  Here's the Windows example.  This has func1() and func2() in separate C
files that are compiled, and both files are used to create the shared
library:

        >cl /c main.c
        >cl /c func1.c
        >cl /c func2.c
        >cl /c myfunc2.c
        >link /dll /out:mylib.dll func1.obj func2.obj
          Creating library mylib.lib and object mylib.exp
        >link /out:main.exe main.obj myfunc2.obj mylib.lib
        >main
        Hello from main
                Hello from func1
                        Hello from func2 ********
                Back to func1
        Back to main
                        Hello from myfunc2
        Back to main

Look closely at the line marked with '********' (which I just added by hand,
to the output).  Notice the function that was called---func2().  NOT
myfunc2().  func2().  

  Here's the same, but under Linux:

        % cc    -c -o main.o main.c
        % cc    -c -o myfunc2.o myfunc2.c
        % cc -shared -fPIC -o func1.ss func1.c
        % cc -shared -fPIC -o func2.ss func2.c
        % cc -shared -o libfunc.so func1.ss func2.ss
        % cc  -Wl,-rpath,/tmp/foo -o smain4 main.o myfunc2.o libfunc.so
        % ./smain4
        Hello from main
                Hello from func1
                        Hello from myfunc2 ********
                Back to func1
        Back to main
                        Hello from myfunc2
        Back to main

Again, look closely at the line marked with '********'.  Notice the function
that was called---myfunc2().  NOT func2().  myfunc2().

Do you notice the difference between the two?

BOTH examples used shared libraries.  So what is the "portable and expected"
answer?  What should the output be for shared libraries?

  -spc (Before this, I would have said the Linux version, but I was not familiar
        with Windows, so I was wrong in that regard)


Re: Patchless modification of Lua source code

Sean Conner
In reply to this post by Philippe Verdy
It was thus said that the Great Philippe Verdy once stated:
> So you have just demonstrated that you just want to polemicize,

  You said, and I'm quoting you here, "DLL modules on Windows, as well as on
OS/2, are based on directly on the ELF format," which runs counter to how I
understand computing history.

  Microsoft first created DLLs for Windows, starting with version 1.0
(released in 1985).  OS/2 (released in 1987) inherited the concept from
Windows (OS/2 being a joint project between IBM and Microsoft at the time).

  System V, Release 4 (released in 1988), a combined project between AT&T
and Sun Microsystems, was when the ELF format was introduced.  At best, one
could say that ELF was inspired by Windows DLLs, but what you said is ...
not based upon facts.  That in turn, makes me discount what you say due to
the inaccuracy.  

> you don't
> provide any useful help with your very shortsighted view when the
> discussion was much more generic...
> We were giving some examples, now you interpret the examples as being the
> only available options (which they are not, programmers will always invent
> new alternatives).

  I initially provided four examples on two platforms, Linux and Solaris.  Then Ivan
provided some of the examples on Windows.  The results were:

main() calls func1() calls func2()

func1(),func2() in same translation unit, in a library, providing myfunc2(),
static compilation:
        Linux: failed to link
        Solaris: failed to link
        Windows: EXAMPLE NOT PROVIDED

func1(),func2() in different translation units, in a library, providing
myfunc2(), static compilation:
        Linux: linked, run, myfunc2() called.
        Solaris: linked, run, myfunc2() called.
        Windows: linked, run, myfunc2() called.

func1(),func2() in same translation unit, as a shared library, providing
myfunc2():
        Linux: linked, run, myfunc2() called.
        Solaris: linked, run, myfunc2() called.
        Windows: EXAMPLE NOT PROVIDED

func1(),func2() in different translation units, as a shared library,
providing myfunc2():
        Linux: linked, run, myfunc2() called.
        Solaris: linked, run, myfunc2() called.
        Windows: linked, run, func2() called (NOTE DIFFERENCE!)

So, what damn examples did YOU give that I interpreted as being the only
available options?  Because I've seen very little in the way of actual
examples given, or citations to references, or anything from you.

  -spc


Re: Patchless modification of Lua source code

Philippe Verdy
In reply to this post by Sean Conner


On Sun, 25 Nov 2018 at 05:24, Sean Conner <[hidden email]> wrote:

  I am confused. 

  Here's the Windows example.  This has func1() and func2() in separate C
files that are compiled, and both files are used to create the shared
library:

        >cl /c main.c
        >cl /c func1.c
        >cl /c func2.c
        >cl /c myfunc2.c
        >link /dll /out:mylib.dll func1.obj func2.obj
          Creating library mylib.lib and object mylib.exp
        >link /out:main.exe main.obj myfunc2.obj mylib.lib
        >main
        Hello from main
                Hello from func1
                        Hello from func2 ********
                Back to func1
        Back to main
                        Hello from myfunc2
        Back to main

Look closely at the line marked with '********' (which I just added by hand,
to the output).  Notice the function that was called---func2().  NOT
myfunc2().  func2(). 

  Here's the same, but under Linux:

        % cc    -c -o main.o main.c
        % cc    -c -o myfunc2.o myfunc2.c
        % cc -shared -fPIC -o func1.ss func1.c
        % cc -shared -fPIC -o func2.ss func2.c
        % cc -shared -o libfunc.so func1.ss func2.ss
        % cc  -Wl,-rpath,/tmp/foo -o smain4 main.o myfunc2.o libfunc.so
        % ./smain4
        Hello from main
                Hello from func1
                        Hello from myfunc2 ********
                Back to func1
        Back to main
                        Hello from myfunc2
        Back to main

Again, look closely at the line marked with '********'.  Notice the function
that was called---myfunc2().  NOT func2().  myfunc2().

You've used the PIC option, which means that the shared library does not call the functions directly, but passes through a "call gate" to perform the indirection.
Call gates ARE overridable when you link a DLL with other units specified before it, because these call gates are NOT part of the linked module but are part of a table indexed by the exported functions.
This is the difference from direct calls which, once linked, are not overridden, as the target is resolved internally in the code sections.

PIC code called via call gates is in fact NOT resolved internally by these shared library formats, so it can be overridden. Still, the order of resolution is predictable (unlike with classic libraries): the "pre-linked" shared library in that case is guaranteed to resolve all internal calls through the same call gate.

But function calls via call gates suffer from an additional indirection (which causes a runtime performance penalty). However, they have the advantage that the PIC code does not need to be "patched" in paged memory, so it can live in shared read-only pages; only the memory segment containing the call gates is private per process. Using PIC code is then only interesting if a shared library is likely to be used by a LOT of concurrent processes (e.g. the standard C library); there's no advantage when shared libraries are used by very few processes (the total number of threads using it does not matter, as all threads in the same process share the same virtual memory, except for their TLS data and interrupt stack; the normal stack is usually allocated in the shared heap unless the program uses TLS allocation to create new threads).
