Behavior of newkey(): intentional?

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Behavior of newkey(): intentional?

Paul Du Bois
I've been examining the rather neat hash algorithm+implementation in
lua 5. I have been trying to figure out this line of code in newkey(),
the one that checks whether the mainposition is used or not:

  if (!ttisnil(gval(mp))) {  /* main position is not free? */

In particular, I'm curious why it checks gval(mp) instead of gkey(mp).
In the case that mp _was_ used In this case, the new key will get its
mainposition, but it will re-use the old chain.

A concrete example: assume 8 nodes, and the hash function hash(str) = str[0]

t.a = 1  -- a hashes to node 1
t.q = 1  -- q hashes to node 1; gets node 7. table looks like 1<a,1> -> 7<q,1>
t.i = 1 -- i hashes to node 1; gets node 6. table looks like 1<a,1> ->
6<i,1> -> 7<q,1>
t.i = nil -- table looks like 1<a,1> -> 6<i,nil> -> 7<q,1>
t.f = 6 -- f hashes to node 6. table looks like 1<a,1> -> 6<f,6> -> 7<q,1>

Now keys that hash to 1 and keys that hash to 6 share the same chain.

So, it seems like checking gkey(mp) would keep the chains separate, at
the cost of increasing the amount of re-hashing (moving an item out of
mainposition uses up a free node). Is this in fact the tradeoff that's
being made?

paul
Reply | Threaded
Open this post in threaded view
|

Re: Behavior of newkey(): intentional?

Roberto Ierusalimschy
> I've been examining the rather neat hash algorithm+implementation in
> lua 5. I have been trying to figure out this line of code in newkey(),
> the one that checks whether the mainposition is used or not:
>
>   if (!ttisnil(gval(mp))) {  /* main position is not free? */
>
> In particular, I'm curious why it checks gval(mp) instead of gkey(mp).
> In the case that mp _was_ used In this case, the new key will get its
> mainposition, but it will re-use the old chain.
>
> [...]
>
> So, it seems like checking gkey(mp) would keep the chains separate, at
> the cost of increasing the amount of re-hashing (moving an item out of
> mainposition uses up a free node). Is this in fact the tradeoff that's
> being made?

Yes. As far as I remember, the difference between both implementations
is small. (Such "collisions" are rare.) The main reason for the use of
gval is that it is more consistent with the rest of the code, which
always assumes that gval=nil means an empty slot. (In particular, Lua
never accesses the gkey of an empty slot.)

-- Roberto