# fastest way to maintain nested tables of counters? Classic List Threaded 8 messages Open this post in threaded view
|

## fastest way to maintain nested tables of counters?

 is the following code the best way to handle auto-initialization of nested table cells to zero? function inc3(c,x,y,z) -- inc 3d to zero    c[x] = c[x] or {}   c[x][y] = c[x][y] or {}    c[x][y][z] = c[x][y][z] or 0    c[x][y][z] = c[x][y][z] + 1 end i ask since this task is the core of  my naive bayes classifier. so it must be fast. notes: the table index names won't be known till runtime, so you can auto-init the data with the right names as a pre-processor.oh-  i've read the automagical table entry (http://lua-users.org/wiki/AutomagicTables) and i cant quite justify to myself that that approach is preferred to the above.  but i'd like the advice of some Lua gurus!ADVAthanxNCEshare and enjoy!:-)tim menzies
Open this post in threaded view
|

## Re: fastest way to maintain nested tables of counters?

 On 15 May 2011 20:48, Tim Menzies <[hidden email]> wrote: > is the following code the best way to handle auto-initialization of nested > table cells to zero? >  function inc3(c,x,y,z) -- inc 3d to zero >    c[x] = c[x] or {} >    c[x][y] = c[x][y] or {} >    c[x][y][z] = c[x][y][z] or 0 >    c[x][y][z] = c[x][y][z] + 1 > end > i ask since this task is the core of  my naive bayes classifier. so it must > be fast. > notes: the table index names won't be known till runtime, so you can > auto-init the data with the right names as a pre-processor. > oh-  i've read the automagical table entry > (http://lua-users.org/wiki/AutomagicTables) and i cant quite justify to > myself that that approach is preferred to the above. > but i'd like the advice of some Lua gurus! > ADVAthanxNCE > share and enjoy! > :-) > tim menzies Perhaps: function inc3(t, x, y, z)     if not t[x] then t[x] = {} end     t = t[x]     if not t[y] then t[y] = {} end     t = t[y]     t[z] = (t[z] or 0) + 1 end There might be a better way again but at least this reduces the number of table lookups.
Open this post in threaded view
|

## Re: fastest way to maintain nested tables of counters?

 On Sun, May 15, 2011 at 02:56:03PM +0200, Mason Larobina wrote: > function inc3(t, x, y, z) >     if not t[x] then t[x] = {} end >     t = t[x] >     if not t[y] then t[y] = {} end >     t = t[y] >     t[z] = (t[z] or 0) + 1 > end > > There might be a better way again but at least this reduces the number > of table lookups. If t[x][y] already exists then this code has 6 lookups; if t[x], 7; if t only, 8; if not even t, there is an error, as there should be. The following code reduces 6,7,8 to 4,5,6. function inc3(t, x, y, z)     local tx = t[x]     if not tx then tx={}; t[x] = tx end     local txy = tx[y]     if not txy then txy={}; tx[y] = txy end     local txyz = txy[z];     if not txyz then txy[z]=1 else txy[z]=txyz+1 end end Dirk
Open this post in threaded view
|

## Re: fastest way to maintain nested tables of counters?

 I would wonder if creating a hash out of x,y,z and using that as an index would be cheaper. Creating the hash might be expensive but you'd avoid multiple lookups - depending on whether x,y,z end up being hashed themselves for the lookup?.   Kaj On Sun, May 15, 2011 at 8:51 AM, Dirk Laurie wrote: On Sun, May 15, 2011 at 02:56:03PM +0200, Mason Larobina wrote:> function inc3(t, x, y, z)>     if not t[x] then t[x] = {} end>     t = t[x]>     if not t[y] then t[y] = {} end >     t = t[y]>     t[z] = (t[z] or 0) + 1> end>> There might be a better way again but at least this reduces the number> of table lookups.If t[x][y] already exists then this code has 6 lookups; if t[x], 7; if t only, 8; if not even t, there is an error, as there should be.The following code reduces 6,7,8 to 4,5,6. function inc3(t, x, y, z)   local tx = t[x]   if not tx then tx={}; t[x] = tx end   local txy = tx[y]   if not txy then txy={}; tx[y] = txy end   local txyz = txy[z];   if not txyz then txy[z]=1 else txy[z]=txyz+1 end endDirk
Open this post in threaded view
|

## Re: fastest way to maintain nested tables of counters?

 In reply to this post by Tim Menzies On May 15, 2011, at 5:48 AM, Tim Menzies wrote: > is the following code the best way to handle auto-initialization of nested table cells to zero? > >  function inc3(c,x,y,z) -- inc 3d to zero >    c[x] = c[x] or {} >    c[x][y] = c[x][y] or {} >    c[x][y][z] = c[x][y][z] or 0 >    c[x][y][z] = c[x][y][z] + 1 > end function inc3( c, x, y, z )         local cx = c[ x ]         if not cx then cx = { }; c[ x ] = cx end         local cxy = cx[ y ]         if not cxy then cxy = { }; cx[ y ] = cxy end         local cxyz = cxy[ z ]         if not cxyz then cxyz = 0 end -- No need to assign since we will do so below         cxy[ z ] = cxyz + 1 end Faster still may be to use metatables. local setmetatable = setmetatable local xyzmeta = { __index = function( t, k ) return 0 end } local xymeta = { __index = function( t, k ) local v = setmetatable( { }, xyzmeta ); t[ k ] = v; return v end } local xmeta = { __index = function( t, k ) local v = setmetatable( { }, xymeta ); t[ k ] = v; return v end } Now, initialize c with:         c = setmetatable( { }, xmeta ) And do the increment with:         local cxy = c[ x ][ y ]         cxy[ z ] = cxy[ z ] + 1 The benefit here is that the table misses get detected in the VM index instruction rather than in a subsequent test. On the other hand, the miss handling is more expensive because it incurs function call overhead. So, whether this is faster in practice depends on how often you miss. (And a really optimized version of the above code might try to share code between xmeta and xymeta in order to reduce the memory footprint for the code, but at that point we're really getting tweaky.) Mark
Open this post in threaded view
|

## Re: fastest way to maintain nested tables of counters?

 In reply to this post by Dirk Laurie On May 15, 2011, at 8:51 AM, Dirk Laurie wrote: > If t[x][y] already exists then this code has 6 lookups; if t[x], 7; > if t only, 8; if not even t, there is an error, as there should be. > > The following code reduces 6,7,8 to 4,5,6. > > function inc3(t, x, y, z) >    local tx = t[x] >    if not tx then tx={}; t[x] = tx end >    local txy = tx[y] >    if not txy then txy={}; tx[y] = txy end >    local txyz = txy[z]; >    if not txyz then txy[z]=1 else txy[z]=txyz+1 end > end Missed this one when writing my response. Same as mine except the last line is more clever. Cute. Mark
 In reply to this post by Tim Menzies A boring sunday and I was curious about the performance of the different implementations, especially regarding the differences between Lua and LuaJIT. A quick summary - explanation further down:   -- ORIG ---- lua 7.23s, luajit 0.24s/0.62s ---   -- OPT ----- lua 2.91s, luajit 0.14s/0.32s ---   -- NAIVE --- lua 2.03s, luajit 3.83s/21.34s --- (no typo!)   -- TUNED --- lua 1.84s, luajit 0.15s/1.21s --- I noticed that LuaJIT is sometimes MUCH slower if the test loop is slightly modified (Lua shows no difference) so there are two numbers for LuaJIT, the first for this test loop: |-- x y n z |for x=1,100 do |  for y=1,100 do |    for n=1,10 do |      for z=1,20 do |        inc3(c,x,y,z) |end end end end and the second for this one: |-- n x y z |for n=1,10 do |  for x=1,100 do |    for y=1,100 do |      for z=1,20 do |        inc3(c,x,y,z) |end end end end To the original question ... Tim Menzies wrote: > > is the following code the best way to handle auto-initialization of nested > table cells to zero? |--\[[-- ORIG ---- lua 7.23s, luajit 0.24s/0.62s --- |local function inc3(c,x,y,z) |    c[x] = c[x] or {} |    c[x][y] = c[x][y] or {} |    c[x][y][z] = c[x][y][z] or 0 |    c[x][y][z] = c[x][y][z] + 1 |end |local c = {} |--]]---------------------------------------- Best depends.  But it's definitely not the fastest ;-) You do a lot of redundant table lookups and they are not for free.  LuaJIT shows what it can do with such code: 12 to 30 times faster then Lua. An optiomized version would be: |--\[[-- OPT ----- lua 2.91s, luajit 0.14s/0.32s --- |local function inc3(c,x,y,z) |    local b = c[x]  if b==nil then b={} c[x] = b end |    local a = b[y]  if a==nil then a={} b[y] = a end |    a[z] = (a[z] or 0) + 1 |end |local c = {} |--]]---------------------------------------- Table lookups/stores are minimized.  And it shows: about two times faster then the original version - both for Lua and LuaJIT. > oh-  i've read the automagical table entry [...] and > i cant quite justify to myself that that approach is > preferred to the above. Well, autotables move the nil-tests to the C side.  You'll get a performance that is similar to pre-populated tables. And, IMO more important, you don't have to perform these tests for every use.  You don't have to write functions like inc3 (see next paragraph). The call "inc3(c,x,y,z)" in the test-loop is replaced by the inline code: |   local t=c[x][y]  t[z]=t[z]+1 First a naive implementation of autotables: |--\[[-- NAIVE --- lua 2.03s, luajit 3.83s/21.34s --- |local function zero() return 0 end |local function auto_tab(a,b,c,d,e) |    return setmetatable({}, { |        __index = function(t,k) |            local v = a(b,c,d,e) |            t[k] = v |            return v |        end |    }) |end |local c = auto_tab(auto_tab, auto_tab, zero) |--]]---------------------------------------- It's expensive on the memory side (each table has its own metatable and a unique index-function with upvalues) but Lua can handle it quite well, about 30% faster then the optimized version though most of this speedup is due to the inlined increment code (one less function call). But what's LuaJIT doing?  Especially with the alternative test loop?  21s?!?  Ten times slower than Lua! Ok, time for a tuned "unrolled" auto-table version: |--\[[-- TUNED --- lua 1.84s, luajit 0.15s/1.21s --- |local auto_0, auto_t0, auto_tt0 |local meta_0   = { __index = function(t,k) local v=0         t[k]=v return v end } |local meta_t0  = { __index = function(t,k) local v=auto_0()  t[k]=v return v end } |local meta_tt0 = { __index = function(t,k) local v=auto_t0() t[k]=v return v end } |function auto_0()   return setmetatable({}, meta_0) end |function auto_t0()  return setmetatable({}, meta_t0) end |function auto_tt0() return setmetatable({}, meta_tt0) end |local c = auto_tt0() |--]]---------------------------------------- This version is only slightly faster (1.84s vs 2.03s) but it needs no additional memory. LuaJIT is back on track, about the same runtime as the optimized non-auto-table version (impressive) but it still chokes on the alternative test-loop (8 times slower but still faster than Lua). > but i'd like the advice of some Lua gurus! Well, the usual advice: don't optimize too early ;-) I.e.: For 2 million increments you can save at max 7.2seconds. If the total runtime to generated these 2 million increments is 10 minutes, you save 1.2% runtime - not worth looking at. Ciao, ET. Full test code attached ... autotabs.lua (1K) Download Attachment