mergesort patch for lua-5.1.4

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view

mergesort patch for lua-5.1.4

Dave Rodgers

Here is a patch that replaces the lua-5.1.4 table.sort() quicksort algorithm
with a mergesort algorithm implementation. The primary motivation for
writing this patch was to produce a stable sort (needed for deterministic
operation across multiple hosts). The speedup came as a bonus.

1. it's a stable sort
2. it's faster  (see the results)
3. it's safer  (modifying the original table during sorting has no effect)

1. it uses more memory  (but cleans up after itself immediately)
2. it drags some lower-level code into ltablib.c
   (not an end-user issue, and could be mitigated with some refactoring)

Here are some results:
  garbage collection enabled:
  garbage collection disabled:

The 'custom' comparison function was as follows:
  local t = {} -- filled with data
  local comps = 0
  local function comp(a, b)
    local x = t[1]
    comps = comps + 1  
    return a < b

Note that the performance is dominated by whether or not a custom function
is being used (for both algorithms). I guess that all those lua_call()'s
add up...


  luaok (patch/ISO-C)
    Ordered keys, including the addition:
      prev(t [, key]) & rpairs(t) -- reverse iteration
      table.sortkeys(t [, function])
      table.insertkey(t, oldkey, newkey [, value])
      table.usearray(t [,boolean]) -> boolean | t  -- for `all hash' keying
  xnum (package/C99)
    Userdata types for i8, u8, i16, u16, i32, u32, i64, u64, f32, f64.
    Include bitwise operations, and querying to number of the host's C types
    (ex: size_t, ptrdiff_t, time_t, etc...)
  pool (package/C99)
    Memory access with sub-classing, see:
    Sub-classing is made possible by using the userdata `env' for pool
    Included sub-class examples are matrix4x4 and mmap.
  luagl3 (package/C99)
    Complete OpenGL 3.2 wrapper with RAII for GL objects.
    No limitations on GLenum values.
    All calls can be intercepted using per-call or global C-side
#define'd hooks.
  luasfml (package/C99)
    Complete wrapper for sfml-window
    Partial wrapper for sfml-image  (with the additional
image:FlipY(image) function)

* Ironically, all WIPs are almost complete except for xnum.