Iterating pairs of keys from a table

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Iterating pairs of keys from a table

Tim Hill
I recently had a need to iterate table key combinations in pairs. For example, for a table with keys (not values) of {a, b, c, d}, I wanted to iterate:

    (a,b), (a,c), (a,d), (b,c), (b,d), (c,d)

That is, I wanted unique combinations (not permutations) of non-identical keys. Sort order was not important.

Anyway, I came up with this generator function:

function keypairs(t)
    local s = next(t)
    local n, i = next(t, s), s
    return function()
        i = next(t, i)
        if i == nil then
            s, n = n, next(t, n)
            i = n
            if i == nil then s = nil end
        end
        return s, i
    end
end

Example use:

t = { a=true, b=true, c=true, d=true }
for k1, k2 in keypairs(t) do
    print(k1, k2)
end

However, I was wondering if anyone has a better (faster? smaller? cleaner?) solution they would like to share (or have I missed something very obvious in Lua?). I poked around in the archives and didn’t find anything, but it’s quite possible my limited search missed something.

(The only tweak I thought of was to swap the return order of s and i, which would make the nested “if” unnecessary. However I liked the fact that as written the generator returns the pairs in (major, minor) order that matches that returned by pairs().)

—Tim



Reply | Threaded
Open this post in threaded view
|

Re: Iterating pairs of keys from a table

Dirk Laurie-2
2017-02-28 5:38 GMT+02:00 Tim Hill <[hidden email]>:

> I recently had a need to iterate table key combinations in pairs. For example, for a table with keys (not values) of {a, b, c, d}, I wanted to iterate:
>
>     (a,b), (a,c), (a,d), (b,c), (b,d), (c,d)
>
> That is, I wanted unique combinations (not permutations) of non-identical keys. Sort order was not important.
>
> Anyway, I came up with this generator function:
>
> function keypairs(t)
>     local s = next(t)
>     local n, i = next(t, s), s
>     return function()
>         i = next(t, i)
>         if i == nil then
>             s, n = n, next(t, n)
>             i = n
>             if i == nil then s = nil end
>         end
>         return s, i
>     end
> end
>
> Example use:
>
> t = { a=true, b=true, c=true, d=true }
> for k1, k2 in keypairs(t) do
>     print(k1, k2)
> end
>
> However, I was wondering if anyone has a better (faster? smaller? cleaner?) solution they would like to share (or have I missed something very obvious in Lua?). I poked around in the archives and didn’t find anything, but it’s quite possible my limited search missed something.
>
> (The only tweak I thought of was to swap the return order of s and i, which would make the nested “if” unnecessary. However I liked the fact that as written the generator returns the pairs in (major, minor) order that matches that returned by pairs().)

One could use a coroutine. That saves a lot of bookkeeping.

keypairs = function(tbl)
  return coroutine.wrap(
    function()
      for k in next,tbl do
        for m in next,tbl,k do
          coroutine.yield(k,m)
        end
      end
    end)
  end

Reply | Threaded
Open this post in threaded view
|

Re: Iterating pairs of keys from a table

Tim Hill

> On Feb 28, 2017, at 12:24 AM, Dirk Laurie <[hidden email]> wrote:
>
> 2017-02-28 5:38 GMT+02:00 Tim Hill <[hidden email]>:
>> Anyway, I came up with this generator function:
>>
>> function keypairs(t)
>>    local s = next(t)
>>    local n, i = next(t, s), s
>>    return function()
>>        i = next(t, i)
>>        if i == nil then
>>            s, n = n, next(t, n)
>>            i = n
>>            if i == nil then s = nil end
>>        end
>>        return s, i
>>    end
>> end
>>
>
> One could use a coroutine. That saves a lot of bookkeeping.
>
> keypairs = function(tbl)
>  return coroutine.wrap(
>    function()
>      for k in next,tbl do
>        for m in next,tbl,k do
>          coroutine.yield(k,m)
>        end
>      end
>    end)
>  end
>

Very nice idea, and very clean, though I worry about the performance. Out of curiosity I ran a simple timing test with both approaches:

t = {}
for i = 1, 10000 do
        t["k" .. tostring(i)] = true
end

t1 = os.clock()
for k1, k2 in keypairs(t) do
        ;
end
t2 = os.clock()
print("elapsed time=", t2 - t1)

Results averaged over 5 passes:
Non-coroutine: 7.65 seconds
Coroutine: 14.34 seconds

Of course, this does nothing WITH the pairs, and in a real world situation its quite possible that any processing overhead would render the difference in iterator performance moot.

—Tim





Reply | Threaded
Open this post in threaded view
|

Re: Iterating pairs of keys from a table

Chris Berardi-3

On Tue, Feb 28, 2017, at 06:18 PM, Tim Hill wrote:

>
> > On Feb 28, 2017, at 12:24 AM, Dirk Laurie <[hidden email]> wrote:
> >
> > 2017-02-28 5:38 GMT+02:00 Tim Hill <[hidden email]>:
> >> Anyway, I came up with this generator function:
> >>
> >> function keypairs(t)
> >>    local s = next(t)
> >>    local n, i = next(t, s), s
> >>    return function()
> >>        i = next(t, i)
> >>        if i == nil then
> >>            s, n = n, next(t, n)
> >>            i = n
> >>            if i == nil then s = nil end
> >>        end
> >>        return s, i
> >>    end
> >> end
> >>
> >
> > One could use a coroutine. That saves a lot of bookkeeping.
> >
> > keypairs = function(tbl)
> >  return coroutine.wrap(
> >    function()
> >      for k in next,tbl do
> >        for m in next,tbl,k do
> >          coroutine.yield(k,m)
> >        end
> >      end
> >    end)
> >  end
> >
>
> Very nice idea, and very clean, though I worry about the performance. Out
> of curiosity I ran a simple timing test with both approaches:
>
> t = {}
> for i = 1, 10000 do
> t["k" .. tostring(i)] = true
> end
>
> t1 = os.clock()
> for k1, k2 in keypairs(t) do
> ;
> end
> t2 = os.clock()
> print("elapsed time=", t2 - t1)
>
> Results averaged over 5 passes:
> Non-coroutine: 7.65 seconds
> Coroutine: 14.34 seconds
>
> Of course, this does nothing WITH the pairs, and in a real world
> situation its quite possible that any processing overhead would render
> the difference in iterator performance moot.
>
> —Tim

What about creating a tmp array?

local t = {};
for i=1,10000 do
   t['k'..tostring(i)] = true
end

t1 = os.clock();
-- create tmp array
local tbl_,ix = {},1
for k,v in pairs(t) do tbl_[ix] = k; ix = ix + 1 end

local k1,k2,num
for i=1,#tbl_ do
   for j=i+1,#tbl_ do
      k1,k2 = t[i],t[j]
      num = num + 1
   end
end
t2 = os.clock();
print('elasped time for '..num..' => '..(t2-t1)..' sec')

Result average over 5 passes: 2.48 seconds

Reply | Threaded
Open this post in threaded view
|

Re: Iterating pairs of keys from a table

Tim Hill

> On Feb 28, 2017, at 3:50 PM, Chris Berardi <[hidden email]> wrote:
>
>
> On Tue, Feb 28, 2017, at 06:18 PM, Tim Hill wrote:
>>
>
> What about creating a tmp array?
>
> local t = {};
> for i=1,10000 do
>   t['k'..tostring(i)] = true
> end
>
> t1 = os.clock();
> -- create tmp array
> local tbl_,ix = {},1
> for k,v in pairs(t) do tbl_[ix] = k; ix = ix + 1 end
>
> local k1,k2,num
> for i=1,#tbl_ do
>   for j=i+1,#tbl_ do
>      k1,k2 = t[i],t[j]
>      num = num + 1
>   end
> end
> t2 = os.clock();
> print('elasped time for '..num..' => '..(t2-t1)..' sec')
>
> Result average over 5 passes: 2.48 seconds
>

I thought of that originally, but I didn’t like the memory overhead involved in creating the array every time the generator was used (particularly for large tables).

—Tim


Reply | Threaded
Open this post in threaded view
|

Re: Iterating pairs of keys from a table

Dirk Laurie-2
In reply to this post by Tim Hill
2017-03-01 1:18 GMT+02:00 Tim Hill <[hidden email]>:

> Very nice idea, and very clean, though I worry about the
> performance.

If performance is an issue, simply code the two loops every time.

      for k in next,tbl do
        for m in next,tbl,k do
          -- use k and m
        end
      end

If you must have an iterator factory, and wish to avoid a coroutine
(damn! I so seldom have good reason to use one, your problem
seemed to be a case in point), it is possible to do it all in a single
generic `for` by making your routine stateless. Exercise for the
reader.

keypairs = function(tbl)
  return function(tbl,k1,k2)
    -- no upvalues required
   return k1,k2
  end
end

Reply | Threaded
Open this post in threaded view
|

Re: Iterating pairs of keys from a table

Chris Berardi-3
In reply to this post by Tim Hill
On Tue, Feb 28, 2017, at 08:00 PM, Tim Hill wrote:
>
> I thought of that originally, but I didn’t like the memory overhead
> involved in creating the array every time the generator was used
> (particularly for large tables).
>
> —Tim

I thought that might be a concern. It's an example of the classic space
vs time tradeoff.

Reply | Threaded
Open this post in threaded view
|

Re: Iterating pairs of keys from a table

Tim Hill
In reply to this post by Dirk Laurie-2

> On Feb 28, 2017, at 9:43 PM, Dirk Laurie <[hidden email]> wrote:
> If you must have an iterator factory, and wish to avoid a coroutine
> (damn! I so seldom have good reason to use one, your problem
> seemed to be a case in point), it is possible to do it all in a single
> generic `for` by making your routine stateless. Exercise for the
> reader.
>
> keypairs = function(tbl)
>  return function(tbl,k1,k2)
>    -- no upvalues required
>   return k1,k2
>  end
> end
>

Well, you have me beaten. For a start, I was under the impression Lua only passed two arguments to the iterator function, yet you have three? Clearly I’m missing something.

I did look at cunning ways to bury the state is the (s, v) pair maintained by the for loop as a way to avoid upvalues, but I didn’t really see any advantage over upvalues from a performance perspective.

—Tim


Reply | Threaded
Open this post in threaded view
|

Re: Iterating pairs of keys from a table

Dirk Laurie-2
2017-03-02 12:09 GMT+02:00 Tim Hill <[hidden email]>:

>
>> On Feb 28, 2017, at 9:43 PM, Dirk Laurie <[hidden email]> wrote:
>> If you must have an iterator factory, and wish to avoid a coroutine
>> (damn! I so seldom have good reason to use one, your problem
>> seemed to be a case in point), it is possible to do it all in a single
>> generic `for` by making your routine stateless. Exercise for the
>> reader.
>>
>> keypairs = function(tbl)
>>  return function(tbl,k1,k2)
>>    -- no upvalues required
>>   return k1,k2
>>  end
>> end
>>
>
> Well, you have me beaten. For a start, I was under the impression
> Lua only passed two arguments to the iterator function, yet you have
> three? Clearly I’m missing something.

You're right. I did not reread the manual page just before posting
the above. Should have done. Great! It does need a coroutine
after all!