Interesting article on hash table performance

Daurnimator
I read this today:
https://probablydance.com/2017/02/26/i-wrote-the-fastest-hashtable/
I thought it might be interesting reading for many of you.

I'm eager to see how Lua's algorithm compares, and what lessons might be learnt.

Re: Interesting article on hash table performance

Xavier Wang
2017-02-27 19:12 GMT+08:00 Daurnimator <[hidden email]>:
> I read this today:
> https://probablydance.com/2017/02/26/i-wrote-the-fastest-hashtable/
> I thought it might be interesting reading for many of you.
>
> I'm eager to see how lua's algorithm compares, and what lessons might be learnt.
>

I feel Lua's implementation is much faster than this. The article uses a
bounded linear search, whereas Lua threads a linked list through its
open-addressed table, and with that linked list getting a free node is much
easier for Lua than for the author's scheme.

On insert, Lua also moves a colliding node back into its main position. In my
amoeba library [1], with 1000+ elements in the hash table the longest chain
has only 3 entries, and the lookup loop is even more compact than the author's:

static const am_Entry *am_gettable(const am_Table *t, am_Symbol key) {
    const am_Entry *e;
    if (t->size == 0 || key.id == 0) return NULL;   /* empty table or null key */
    e = am_mainposition(t, key);                    /* start at the key's main position */
    for (; e->key.id != key.id; e = am_index(e, e->next))  /* walk the collision chain */
        if (e->next == 0) return NULL;              /* end of chain: key not present */
    return e;
}
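
For reference, a minimal self-contained sketch of the "move the colliding node
into its main position" insertion strategy. This is written for this thread,
not the actual Lua or amoeba code: the Node/Table types, the toy hash, and the
array-index chaining are all simplifications made up for illustration.

#include <stdio.h>
#include <string.h>

#define TBL_SIZE 8                      /* power of two so the hash can mask */

typedef struct Node {
    int key;                            /* 0 means "empty slot" in this sketch */
    int next;                           /* array index of next node in the chain, -1 ends it */
} Node;

typedef struct Table {
    Node node[TBL_SIZE];
    int  lastfree;                      /* free-slot scan position, moves downward */
} Table;

static int mainposition(int key) {
    return (int)((key * 2654435761u) & (TBL_SIZE - 1));   /* toy hash */
}

static int getfreepos(Table *t) {
    while (t->lastfree > 0) {           /* scan down from the end for an empty slot */
        t->lastfree--;
        if (t->node[t->lastfree].key == 0) return t->lastfree;
    }
    return -1;                          /* no free slot: a real table would rehash */
}

static void insert(Table *t, int key) {
    int mp = mainposition(key);
    if (t->node[mp].key != 0) {                     /* main position is taken */
        int othermp = mainposition(t->node[mp].key);
        int f = getfreepos(t);
        if (f < 0) return;                          /* full: a real table grows here */
        if (othermp != mp) {
            /* the occupant is a "squatter" from another chain: relocate it to
             * the free slot, relink its chain, and give mp to the new key */
            int prev = othermp;
            while (t->node[prev].next != mp) prev = t->node[prev].next;
            t->node[prev].next = f;
            t->node[f] = t->node[mp];
        } else {
            /* the occupant owns mp: put the new key in the free slot and chain it */
            t->node[f].key = key;
            t->node[f].next = t->node[mp].next;
            t->node[mp].next = f;
            return;
        }
    }
    t->node[mp].key = key;                          /* new key sits in its main position */
    t->node[mp].next = -1;
}

int main(void) {
    Table t;
    int i;
    memset(&t, 0, sizeof t);
    t.lastfree = TBL_SIZE;
    insert(&t, 5); insert(&t, 13); insert(&t, 21);  /* all three hash to the same slot */
    for (i = 0; i < TBL_SIZE; i++)
        printf("slot %d: key=%d next=%d\n", i, t.node[i].key, t.node[i].next);
    return 0;
}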

The most inspiring thing is the bounded linear search; in Lua terms it would
mean bounding the lastfree scan. It is not easy to add this change to my
hash table, so I will do some research on it, but in my current tests the
worst case for the lastfree loop is the whole array length, so bounding it
could be a big improvement.
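
Applied to the sketch above (same Table type), a bounded free-slot scan might
look like the few lines below. The idea of capping the probe count comes from
the article; the cap parameter and the grow-instead-of-scan policy are
assumptions for this sketch, not something Lua or amoeba currently does.

/* give up after `limit` probes and ask the caller to grow/rehash instead of
 * potentially walking the whole array looking for a free slot */
static int getfreepos_bounded(Table *t, int limit) {
    while (t->lastfree > 0 && limit-- > 0) {
        t->lastfree--;
        if (t->node[t->lastfree].key == 0) return t->lastfree;
    }
    return -1;      /* caller grows the table rather than scanning further */
}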

It's a really impressive technique, TIL :-)

--
regards,
Xavier Wang.

Re: Interesting article on hash table performance

Ahmed Charles
In reply to this post by Daurnimator
On 2/27/2017 3:12 AM, Daurnimator wrote:
> I read this today:
> https://probablydance.com/2017/02/26/i-wrote-the-fastest-hashtable/
> I thought it might be interesting reading for many of you.
>
> I'm eager to see how lua's algorithm compares, and what lessons might be learnt.

I'm not familiar with Lua's algorithm for hash tables, but removing the
bounds check from the inner loop by allocating a slightly larger array
reminds me of removing the bounds check in insertion sort by placing the
smallest/largest element at the beginning/end, respectively, depending on
which way you search the array. That's a really awesome insight.
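
For anyone who hasn't seen that trick, here is a minimal sketch (an
illustration written for this thread, not code from the article): one
preliminary pass parks the smallest element in a[0], after which the inner
loop can drop its j > 0 test because a[0] acts as a sentinel that always
stops the scan.

#include <stdio.h>

static void sentinel_insertion_sort(int *a, int n) {
    int i, j, min = 0;
    for (i = 1; i < n; i++)            /* find the smallest element... */
        if (a[i] < a[min]) min = i;
    { int tmp = a[0]; a[0] = a[min]; a[min] = tmp; }   /* ...and park it at a[0] */

    for (i = 2; i < n; i++) {
        int key = a[i];
        j = i;
        while (a[j - 1] > key) {       /* no j > 0 check: a[0] <= key stops the loop */
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}

int main(void) {
    int a[] = { 5, 2, 9, 1, 7, 3 };
    int i, n = (int)(sizeof a / sizeof a[0]);
    sentinel_insertion_sort(a, n);
    for (i = 0; i < n; i++) printf("%d ", a[i]);   /* prints 1 2 3 5 7 9 */
    printf("\n");
    return 0;
}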

Re: Interesting article on hash table performance

Francisco Olarte
Ahmed:

On Wed, Mar 1, 2017 at 10:52 AM, Ahmed Charles <[hidden email]> wrote:
> I'm not familiar with Lua's algorithm for hash tables, but removing the
> bounds check for the array in the inner loop by having allocating a
> slightly larger array reminds me of removing the bounds check in
> insertion sort by placing the smallest/largest element at the
> beginning/end, respectively, based on which way you search the array.

You normally (especially when the table is something like a table of
pointers in C, or of references in other languages) just use the inserted
element itself as the sentinel (that is what they are typically called),
since the smallest/largest element may be difficult to construct or may not
even exist (especially in qsort(3)-style functions, which take a table, an
element size, and a comparator function). The same trick was typically used
for linear search (back when testing the loop counter was really expensive;
these days I do it more to reduce instruction count, and thus bug surface,
than for speed).
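
For reference, the linear-search version of the trick as a small sketch
(again an illustration, not code from any library mentioned here): the needle
itself is written into a spare slot past the end, so the loop needs no index
check, and one extra element of capacity is the only cost.

#include <stdio.h>

/* a[n] must be a writable spare slot; returning n means "not found" */
static int sentinel_search(int *a, int n, int needle) {
    int i = 0;
    a[n] = needle;                  /* plant the sentinel in the spare slot */
    while (a[i] != needle) i++;     /* no i < n test needed */
    return i;
}

int main(void) {
    int a[6 + 1] = { 4, 8, 15, 16, 23, 42 };    /* one spare slot for the sentinel */
    printf("%d\n", sentinel_search(a, 6, 15));  /* 2 */
    printf("%d\n", sentinel_search(a, 6, 99));  /* 6, i.e. not found */
    return 0;
}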

Francisco Olarte.

Re: Interesting article on hash table performance

Russell Haley
In reply to this post by Xavier Wang
On Mon, Feb 27, 2017 at 4:41 AM, Xavier Wang <[hidden email]> wrote:

> 2017-02-27 19:12 GMT+08:00 Daurnimator <[hidden email]>:
>> I read this today:
>> https://probablydance.com/2017/02/26/i-wrote-the-fastest-hashtable/
>> I thought it might be interesting reading for many of you.
>>
>> I'm eager to see how lua's algorithm compares, and what lessons might be learnt.
>>
>
> I feel Lua's implement is much faster than this. it use a limited
> linear search, but Lua's use a linked list into open address. with a
> linked list, Lua's get a free node is much easier than author's.
>
> when insert, Lua also move the colliding node into it's main position.
> in my amoeba library[1], in a 1000+ elements hash table, the most link
> has only 3 elements. and the lookup loop looks even more compact than
> author's:
>
> static const am_Entry *am_gettable(const am_Table *t, am_Symbol key) {
>     const am_Entry *e;
>     if (t->size == 0 || key.id == 0) return NULL;
>     e = am_mainposition(t, key);
>     for (; e->key.id != key.id; e = am_index(e, e->next))
>         if (e->next == 0) return NULL;
>     return e;
> }
>
> the most inspired thing is the limited linear search, in Lua it means
> limited search in lastfree, it's not easy to add this change to my
> hashtable, and I will do some research for this change. but in current
> test, the worst case for max loop for lastfree is the whole list
> length. so maybe it will have a big improve.
>
> it's really a very impressive thing, TIL :-)
>
> --
> regards,
> Xavier Wang.
>
Did you mean to add this link as reference [1]?
https://github.com/starwing/amoeba

Very slick. Some months back I found a round-robin database for
application monitoring here: https://github.com/shmul/mule. I didn't
drill in too far, but I wonder out loud whether your library could be
applied to relative logic in things like object tracking?

I'll keep your library in mind when I am creating my
killer-artificially-intelligent-robot in Lua.

Russ