sortedforeach()

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

sortedforeach()

Stephan Herrmann-2
Hi,

I'd like to discuss an issue of a modified version of "foreach" 
which I'm using for a while now. The function is called "sortedforeach" 
and traverses the fields of a table in ascending order of their indices.
I needed this function, because in many situations I must know in which 
order the fields are processed, which is not given by "foreach".
I will share the code at the end of this mail.
The main modification is one memcpy and one qsort plus the provision
of a compare-function.

I'm uncertain in whether to propose adding this function to Lua or not.
My main concern is, to be able to use "sortedforeach" accross 
releases of Lua without digging too deep into the interpreter.
Would it be a harm to have foreach always do the sorting?
Should a hook be provided to process the fields of a table 
before iterating?

Or is it just okay to have this function in a separate C library?
How stable are the interfaces I'm using???

I didn't yet mention two more differences between foreach and sortedforeach:
1. The modified version is robust against modifying the original table
   from within the iteration (due to the fact that the table is in fact
   copied). Please don't argue about this style of programming ;-)
2. Two optional arguments may be used to filter only certain fields
   according to the tag of their index (and value).

Does any of this make sense to you?

And then, there's two points where one might argue for an even more general
solution: 

A.) Of course "sortedforeach" will not remain the last wish:
On which level should all the nice "map, filter, reduce, mapconcat ..."
etc. functions be implemented? I didn't mean to provide a general filter
function by my solution. But all these functions would be nice to have, huh?
Would a "foldleft" or so serve as a good basis for some of the others?

B.) Last but not least I thought about permitting differnt orders and also 
sorting userdata etc. But then I stumbled again across the issue EQUALITY
versus COMPARISON. IFF a tagmethod for comparison in the style of 
strcmp/strcoll (returning -1,0,1) would exist, I would be glad to use this TM 
within `entrycompare()' below. 

I still feel it's a pitty, that equality for userdata cannot be redefined 
(one library I use creates many different handles for the same object and 
provides its specific test for equality. I'd really like to use that test 
function as implementation for u1 == u2 concerning this special tag of 
userdata, BUT, ...). Wouldn't life be much easier, if we had just one
comparison tag method (yielding -1,0,1) and make <,<=,>,>= and == rely on
that function? Maybe someone can enlighten me about the reasons for 
separating 4 tag methods and one hardcoded test (equality).

Cheers
Stephan

--------------------------------------------------------------------------
 Stephan Herrmann
 ----------------
 Technical University Berlin
 Software Engineering Research Group
 http://swt.cs.tu-berlin.de/~stephan
 phone: ++49 30 314 73174         
--------------------------------------------------------------------------


---------- here comes the code ----------------------------------------

#include <stdlib.h>
#include "lapi.h"
#include "ltable.h"
#include "ldo.h"
#include "ltm.h"

#define intcmp(i1,i2) ((i1<i2)?(-1):((i1==i2)?0:1))

/** Compare function to be used with qsort() */
int entrycompare (const void *e1, const void *e2) {
  TObject *o1 = ref((Node*)e1);
  TObject *o2 = ref((Node*)e2);
  int t1 = luaT_efectivetag(o1);
  int t2 = luaT_efectivetag(o2);
  /* Really compare only if same type: */
  if (t1 == t2) {
    if (t1 == LUA_T_STRING) 
      return strcoll(svalue(o1), svalue(o2));
    else if (t1 == LUA_T_NUMBER)
      return intcmp(nvalue(o1),nvalue(o2));
    return 0; /* Other types not sorted */
  } 
  /* Numbers rank first: */
  else if (t1 == LUA_T_NUMBER) return -1;
  else if (t2 == LUA_T_NUMBER) return 1;
  /* Strings rank second: */
  else if (t1 == LUA_T_STRING) return -1;
  else if (t2 == LUA_T_STRING) return 1;
  /* All others types in 'natural' order: */
  return -intcmp(t1,t2);
}


/* Function: sortedforeach (table, func[, indtag[, valtag]])
 * Iterate all elements of `table' in ascending order of their indices.
 * Type order is: number string others (others according to their tag value).
 * Perform `func(i,v)' on each element and return the first non-nil result.
 * If `indtag' is given call `func' only for elements with an index of this tag
 * If `valtag' is given call `func' only for elements with an value of this tag
 */
static void sortedforeach (void)
{
  TObject t = *luaA_Address(luaL_tablearg(1));
  TObject f = *luaA_Address(luaL_functionarg(2));

  lua_Object a3 = lua_getparam(3);
  int index_tag = 42;
  int value_tag = (int)luaL_opt_number(4,42);
  int i, n = avalue(&t)->nhash;
  Node *sorttab = (Node *)malloc(n * sizeof(Node));
  memcpy(sorttab, avalue(&t)->node, (n * sizeof(Node)));
  qsort(sorttab,n,sizeof(Node),entrycompare);

  if (lua_isnumber(a3)) 
    index_tag = (int)lua_getnumber(a3);

  for (i=0; i<n; i++) {
    Node *nd = &(sorttab[i]);
    if ((ttype(ref(nd)) != LUA_T_NIL) && (ttype(val(nd)) != LUA_T_NIL)
        && ((index_tag == 42) || (luaT_efectivetag(ref(nd)) == index_tag))
        && ((value_tag == 42) || (luaT_efectivetag(val(nd)) == value_tag))) {
      luaA_pushobject(&f);
      luaA_pushobject(ref(nd));
      luaA_pushobject(val(nd));
      luaD_call((L->stack.top-L->stack.stack)-2, 1);
      if (ttype(L->stack.top-1) != LUA_T_NIL) {
        free(sorttab);
        return;
      }
      L->stack.top--;
    }
  }
  free(sorttab);
}

---------- here is a sample usage in Lua (results are indented) ----------

a={but="here", follow="some", hashed="values"}
a[5]="list"
a[3]='in'
a[1]='some'
a[2]='strings'
a[4]='a'
-- unsorted printout:
foreach(a, print)
   1       some
   2       strings
   3       in
   but     here
   5       list
   4       a
   hashed  values
   follow  some
-- normal sorted printout:
sortedforeach(a, print)
   1       some
   2       strings
   3       in
   4       a
   5       list
   but     here
   follow  some
   hashed  values
-- sorted printout only of fields with a 'number' index:
sortedforeach(a, print, tag(1))
   1       some
   2       strings
   3       in
   4       a
   5       list
-- sorted printout only of fields with a 'string' index:
sortedforeach(a, print, tag(""))
   but     here
   follow  some
   hashed  values

Reply | Threaded
Open this post in threaded view
|

Re: sortedforeach()

Roberto Ierusalimschy
Som quick considerations:

1) how does your routine handle GC? What happens if you delete an element
from the original table, that element is collected, and then you traverse
it (it is still in the copy)?

2) What happens if there is an error in the function called? You will not
free the "malloc" block... (I think you can fix 1 and 2 by creating a
"complete" table to keep the copy, and pushing it on the stack.)

3) We can only regret about having four different methods for order. That
was really a mistake. In fact, only one would be much better. We were
considering using only the "lt" tag method, and doing the others as
(a<=b <==> not b<a, etc). But we do not know if now it pays the
incompatibilities...

4) The problem with an equality tag method is table indexing. We always 
have that a==b => t[a] is the same element that t[b]. It would be very
confusing to break this rule, we think. And it would be very inefficient
and almost impossible to keep right if we had table indexing using a tag
method for equality. (For instance, what would happen to the old hashings
when the program redefines equality?) That is why == is hardcoded
(toghether with table indexing).

5) Lua 3.2 will have a built-in function sort, that gets a table and an 
optional "less-than" function (and does the sort in-place). A
"foreachsorted" is really a very useful function, but maybe a little
redundant with a plain sort.  We will think about that.


-- Roberto