GC "survey"

classic Classic list List threaded Threaded
45 messages Options
123
Reply | Threaded
Open this post in threaded view
|

GC "survey"

Roberto Ierusalimschy
As lhf wrote, we are thinking about a generational collector. The
problem is not how to implement it, but whether it would do any good.
For some applications, a generational collector can be a real booster,
for others it can make no difference from mark & sweep.

To try to improve garbage collection in Lua, we need data about it in
real programs. So, if you think you can have problems with it, please
consider taking some time to tell us about your problems. If you prefer,
send the data directly to us [lua at tecgraf...]; we will keep it
confidential. We cannot promise anything, but such data will certainly
increase the chances.

What we want to know?

* the "size" of your problem (e.g. it should run in Xms but it takes
10Xms);

* the "size" of your data (how many tables? typical size of tables?
how many closures? etc.)

* do you know the time profile of the GC in your program? Where does
it take most time? (mark, sweep, traverse of tables, sweep of strings,
actually freeing memory, etc.)

* does your program respect the "generational hypothesis"? (that is,
most objects die young, before 2-5 GC cycles?)

* does your program have lots of constant data (such as tables that you
set up once, and after that never [or seldom] change again.)? How much?
(This is very important; mainly, those are the data a generational GC
can avoid visiting, and so is the source of its improvements.)

* feel free to tell anything else you think is relevant;

* don't forget to tell which version of Lua you are using, and whether
you modified it in any way that may affect the GC.

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: GC "survey"

Erik Hougaard
----- Original Message -----
> To try to improve garbage collection in Lua, we need data about it in
> real programs. So, if you think you can have problems with it, please
> consider taking some time to tell us about your problems. If you prefer,
> send the data directly to us [lua at tecgraf...]; we will keep it
> confidential. We cannot promise anything, but such data will certainly
> increase the chances.

Let me start with one of my projects.

Crazy Ivan is autonomous robot that participates in competitions. We use Lua
because we need a tool where we can change everything right in the pit
without having to compile/download/flash etc.etc.... And we need advanced
data structures to handle our data.

The main structure of the software is a loop.

repeat
    Collect Data From Sensors
    Store And Calculate
    Change Motor Speeds
until I win

This is run on a machine with a Motorla Coldfire 5206e (50 Mips) - a CPU
without MMU.  The loop should run somewhere 80-100 times/second, but lot of
the time was used by the collect data bus.

What we would see from time to time was the robot running off the curse
because a GC cycle used ~ 200 - 500 ms, and ofcause this would not take
place when the wheel where pointing forward :-) Our solution is a forced GC
cycle with the motors stopped for every 1000 (main loop) cycles - not very
nice but quite effective.

In the collection state we would keep tables with the last 10-100 samples
from every sensor and doing calculations on those.

Another dimension in this is the present of static objects. When we where
running all functions and a lot of global variables would not be touched -
So some sort of "disregards from GC" scope could be very interesting,
something like:

F = 34
enterrealtime()
x={}
repeat
  x[Mycounter] = DataFromSensors()
until finish
exitrealtime()

would ever spend time considering F and all functions - but a mark and sweep
GC should do this, right?

/Erik

p.s. We are building a new robot where the GC cycle take place while some
other housekeeping (Download from camera in C) are running.



Reply | Threaded
Open this post in threaded view
|

Re: GC "survey"

Roberto Ierusalimschy
> Another dimension in this is the present of static objects. When we where
> running all functions and a lot of global variables would not be touched -
> So some sort of "disregards from GC" scope could be very interesting,

This is exactly what a generational collector does: it disregards "old"
objects (where "old" means the object has been alive and has not
been modified for a while).

(Actually, that definition of "old" is a simplification, but that also
simplifies the implementation of the algorithm.)

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: GC "survey" (2)

Roberto Ierusalimschy
In reply to this post by Erik Hougaard
(forgot this part)

> would ever spend time considering F and all functions - but a mark and sweep
> GC should do this, right?

Right. But the question is whether the time spent "considering F and all
functions" is significant or not.

-- Roberto

Reply | Threaded
Open this post in threaded view
|

RE: GC "survey"

Nick Trout-4
In reply to this post by Roberto Ierusalimschy
> > Another dimension in this is the present of static objects. 
> When we where
> > running all functions and a lot of global variables would 
> not be touched -
> > So some sort of "disregards from GC" scope could be very 
> interesting,
> 
> This is exactly what a generational collector does: it 
> disregards "old"
> objects (where "old" means the object has been alive and has not
> been modified for a while).
> 
> (Actually, that definition of "old" is a simplification, but that also
> simplifies the implementation of the algorithm.)

But presumably everything has to be checked at somepoint which will result
in the same problems as are being encountered now? Would adding a
"const"/"static" keyword help?
eg.

const function foo() ... end

foo = 1 -- error

You could now move the closure referenced by foo to a "safe" area. I think
this would have 2 benefits. Firstly you wouldnt have to garbage collect the
foo closure as foo would always reference it. Secondly, you could possibly
call it faster as you would not necessary have to hash foo to the object
referenced, it would always reference the same object, so you could give it
a permanent pointer?

I'm not sure if "static" or "const" is the correct keyword here. It would be
static data, but a const reference. Oh, it might not work either... over to
the Lua experts...

Nick



Reply | Threaded
Open this post in threaded view
|

RE: GC "survey"

Brownsword
I am (still) in the process of evaluating Lua, so I can't answer most of the
questions below.  The biggest hesitation in adopting it is caused by the
mark-and-sweep garbage collector.  If I understand the current
implementation, it is a monolithic task which happens at times that aren't
directly under user control.  It walks across a lot of data, and on some
platforms (i.e. PS2) that is a very painful operation since the machine uses
high latency memory and very small caches.  Our normal solution for this
kind of problem is reference counting, which tends to spread the overhead
evenly across execution (I'm not suggesting that this is applicable to Lua,
its just what we usually use).


Another request I have is that all memory allocations be vector-able... and
that the "free" function take the number of bytes being freed.  This is very
useful if using a fast lightweight small block allocator.  C++ supports this
by the "operator delete (void*, size_t)" overload.




-----Original Message-----
From: Roberto Ierusalimschy [[hidden email]]
Sent: Friday, August 23, 2002 5:36 AM
To: Multiple recipients of list
Subject: GC "survey"


As lhf wrote, we are thinking about a generational collector. The
problem is not how to implement it, but whether it would do any good.
For some applications, a generational collector can be a real booster,
for others it can make no difference from mark & sweep.

To try to improve garbage collection in Lua, we need data about it in
real programs. So, if you think you can have problems with it, please
consider taking some time to tell us about your problems. If you prefer,
send the data directly to us [lua at tecgraf...]; we will keep it
confidential. We cannot promise anything, but such data will certainly
increase the chances.

What we want to know?

* the "size" of your problem (e.g. it should run in Xms but it takes
10Xms);

* the "size" of your data (how many tables? typical size of tables?
how many closures? etc.)

* do you know the time profile of the GC in your program? Where does
it take most time? (mark, sweep, traverse of tables, sweep of strings,
actually freeing memory, etc.)

* does your program respect the "generational hypothesis"? (that is,
most objects die young, before 2-5 GC cycles?)

* does your program have lots of constant data (such as tables that you
set up once, and after that never [or seldom] change again.)? How much?
(This is very important; mainly, those are the data a generational GC
can avoid visiting, and so is the source of its improvements.)

* feel free to tell anything else you think is relevant;

* don't forget to tell which version of Lua you are using, and whether
you modified it in any way that may affect the GC.

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: GC "survey"

Roberto Ierusalimschy
> If I understand the current implementation, it is a monolithic task
> which happens at times that aren't directly under user control.

Actually, you can have complete control over when it happens. You
can stop it, or you can force a GC cycle. That is, you can run the
GC exactly when you want. (But yes, (1) once you start it it will go
monolithically until the end, and (2) if you do not call it with a
reasonably frequency your program will use too much memory.)


> Another request I have is that all memory allocations be vector-able...

What does that mean?


> and that the "free" function take the number of bytes being freed.

Lua already provides that.

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: GC "survey"

Roberto Ierusalimschy
In reply to this post by Nick Trout-4
> But presumably everything has to be checked at somepoint which will result
> in the same problems as are being encountered now?

Old immutable objects are only "checked" at full collections. Usually
you have to run a full collection from time to time, but this is much
less frequent than "short" collections. For some particular tunned
applications, you can completely avoid full collections.


> Would adding a "const"/"static" keyword help?

Lua still have to check that nobody changes such objects, so the overhead
would be similar (small in both cases). But with a generational collection
Lua would detect those "const" objects for you.

-- Roberto

Reply | Threaded
Open this post in threaded view
|

RE: GC "survey"

Brownsword
In reply to this post by Roberto Ierusalimschy
>>> Another request I have is that all memory allocations be vector-able...
>What does that mean?

Oops, I meant "revectorable".  Doesn't really matter since we have the
source, but I prefer to be able to get new releases without having to update
it with changes each time.

>>> and that the "free" function take the number of bytes being freed.
>Lua already provides that.

I wasn't aware of that -- is it new to 5.0?


Reply | Threaded
Open this post in threaded view
|

Lua 5.0 alpha long string quirk

Björn De Meyer
In reply to this post by Roberto Ierusalimschy
Suppose I want to print the string following
three line string in lua:
Hello brackets!
[[]
How are you?


Then it's impossible to do this using the
following single long string: 
[[
Hello brackets!
[[]
How are you?]]]]

As the two final brackets will be added in the string.

The lua lexer tries to balance opening [[ with ]], 
to allow for nesting comments. However, this balancing 
has some quirky effects when using [[ in such a long string. 
It's a a minor quirk that I wanted to point out. Maybe it 
would be better not to try and match the [[]] brackets? 
Or only match[[ bracets with ]] when in a comment?

-- 
"No one knows true heroes, for they speak not of their greatness." -- 
Daniel Remar.
Björn De Meyer 
[hidden email]

Reply | Threaded
Open this post in threaded view
|

RE: GC "survey"

Eric Ries
In reply to this post by Brownsword
I would think that doing reference counting to catch the easy cases would
solve most of the problem. It?s not like ref counting ever gives you any
false positives (objects that should not be collected). If you used ref
counting to amortize the cost of freeing the simple garbage, the additional
overhead of mark-and-sweep would only be high in cases where you have lots
of circular data structures. Anyone in that category?

Eric



Reply | Threaded
Open this post in threaded view
|

Re: GC "survey"

Edgar Toernig
Eric Ries wrote:
> 
> I would think that doing reference counting to catch the easy cases would
> solve most of the problem. It?s not like ref counting ever gives you any
> false positives (objects that should not be collected).

Why do you think that ref-counting is faster than a GC?  Most of the time
it's slower.  It takes more memory.  You still need a GC.  And the run-time
behaviour is also hard to predict.  I see no benefit...

> If you used ref
> counting to amortize the cost of freeing the simple garbage, the additional
> overhead of mark-and-sweep would only be high in cases where you have lots
> of circular data structures. Anyone in that category?

It happens more often than one may think.  I.e. every "back/parent-pointer"
creates a loop...

Ciao, ET.

Reply | Threaded
Open this post in threaded view
|

Re: GC "survey"

John Belmonte-2
In reply to this post by Roberto Ierusalimschy
I have little interest in a generational collector. It's going to add complexity for uncertain return. It won't help graphical applications that have to meet a hard video frame deadline (such as when doing interlace video with only one output buffer) very much. (Or perhaps not at all, now that I see this talk of needing an "occasional" full gc even with a generational collector unless you do some programming acrobatics.)

What I'd like to see is for the authors to do the exercise of adding a real-time write-barrier type gc to Lua. Now, I'm not suggesting that this become the default, because such a gc, although it will reduce worst case gc times, will increase program execution time. Rather, from the exercise, we can gather how to make the gc framework general enough to support both mark-and-sweep and real-time collectors, and discover if this generality can be done without hurting Lua's current performance. Of course, the game people will also gain a useful gc in the process :-).

-John



--
http:// i      .   /


Reply | Threaded
Open this post in threaded view
|

Re: GC "survey"

Thatcher Ulrich
On Aug 24, 2002 at 12:25 +0900, John Belmonte wrote:

> I have little interest in a generational collector.  It's going to
> add complexity for uncertain return.  It won't help graphical
> applications that have to meet a hard video frame deadline (such as
> when doing interlace video with only one output buffer) very much.
> (Or perhaps not at all, now that I see this talk of needing an
> "occasional" full gc even with a generational collector unless you
> do some programming acrobatics.)

Incremental *and* generational is the holy grail I think, if it's not
too bloated.  Either one will be an improvement over vanilla
mark-and-sweep.  I agree with you that incremental is more desirable
for games.

I'm not totally against reference-counting as a solution either -- we
use it at work in our C++ code and it performs decently.  We use weak
pointers to get around the problem with cycles.

> What I'd like to see is for the authors to do the exercise of adding
> a real-time write-barrier type gc to Lua.  Now, I'm not suggesting
> that this become the default, because such a gc, although it will
> reduce worst case gc times, will increase program execution time.
> Rather, from the exercise, we can gather how to make the gc
> framework general enough to support both mark-and-sweep and
> real-time collectors, and discover if this generality can be done
> without hurting Lua's current performance.  Of course, the game
> people will also gain a useful gc in the process :-).

Any incremental GC will need a write barrier, which is IMO the most
difficult part to get right; I'd be pretty happy with a macro in the
Lua core that encapsulated all pointer writes into the heap or stack.
Such a macro could just do the normal write by default, or be
optionally redefined by an incremental GC patch.  When I was last
looking into this, it almost seemed like the Lua core already had this
macro, but I wasn't totally certain of what I was seeing.

[hidden email] posted a while back about an experimental incremental GC
patch he made, but the link seems to be dead.  Paul, what's the status
of that?

-- 
Thatcher Ulrich
http://tulrich.com

Reply | Threaded
Open this post in threaded view
|

Re: GC "survey"

Thatcher Ulrich
On Aug 24, 2002 at 12:40 -0400, Thatcher Ulrich wrote:
> 
> [hidden email] posted a while back about an experimental incremental GC
> patch he made, but the link seems to be dead.  Paul, what's the status
> of that?

Hm, I seem to have it on my hard drive :) Time to look at the code...

-- 
Thatcher Ulrich
http://tulrich.com

pau
Reply | Threaded
Open this post in threaded view
|

Re: GC "survey"

pau
In reply to this post by Thatcher Ulrich
On Sat, Aug 24, 2002 at 12:40:57AM -0400, Thatcher Ulrich wrote:
> 
> [hidden email] posted a while back about an experimental incremental GC
> patch he made, but the link seems to be dead.  Paul, what's the status
> of that?

The link'd gone dead a few months back when I went for vacation. It
has been up and running after I returned :-P 

	http://www.gime.org/twiki/bin/view/Gime/WebDownload

Regards,
.paul.

pau
Reply | Threaded
Open this post in threaded view
|

Re: GC "survey"

pau
In reply to this post by Roberto Ierusalimschy
My experience is summarized as below:

size of (constant) data table		12709 tables, 23019 strings,
					1928217 bytes compiled as luac.out
number of declared functions 		2509

The most typical GC delay is obviously due the traversal of large
nested string table. Without forcing GC to be executed, the default
collection sometimes takes up 2 to 3 seconds.

After using my patch of setconstant, which avoided the traversal
of static (old) tables and strings, the default collection time
dropped to around 1 second. 

A notable behavior of this Lua program is that it uses quite a lot
of small tables to construct temporary function callbacks (closure)
and also anonymous functions, which become garbage after the
closure is invoked. This takes up quite a big portion of sweep.

Then I forces a collection every 200 frames, it takes around 90ms
to collect each time.

My configuration is a laptop (toshiba portege 3440) PIII 500, 196MB
ram running Linux 2.4.17, and used Lua 4.0

I'd believe my program will benefit from a generation collector, 
but maybe not in a very significant manner. It is unlikely that
each collection will take less than 10ms. I'd much prefer a 5ms
delay every frame than a 40ms delay every 200 frames.

The ideal implementation would be that it is overall a generational
collector, but it can be controlled progressively at collecting the
new generation. Oh well, I've got no clue on this ...

Regards,
.paul.

Reply | Threaded
Open this post in threaded view
|

Re: GC "survey"

Thomas Lavergne-3
In reply to this post by Roberto Ierusalimschy
We use lua in some application but gc cycle was a problem for only one, in it we use lua for data base handling because we don't have space for a big classical database manager, we only have static table a need to make query in it. The data base was stored in very huge lua table (75000 item last time I see it) with a lot of string and user data. Theses table need a lot of time for gc traversal and because it was modified very often, most of time this time was lose. I've modified the lua gc to allow definition of _const_ object and with a garbage function that allow traversal of a table at any time. So we use const tables and functions and each time we have enought time we start garbage the table and collect the little amount of dead object in it. Without this patch a gc cycle take 17 second in worst case with it it take less than 1 second. This app work under a little embeded system with slow memory so traversal of all lua object was slow (it was also why we have choose lua, ie hash lookup need a small look in memory).

But this sample was very specific and probably a unique case, this patch was very usefull for this app but I've never used it for other application because of small difference in classical code.


--
Thomas Lavergne                       "Le vrai rêveur est celui qui rêve
                                       de l'impossible."  (Elsa Triolet)
[hidden email]
[hidden email]    ICQ:#137121910     http://assoc.wanadoo.fr/thallium/


Reply | Threaded
Open this post in threaded view
|

Re: GC "survey"

Thatcher Ulrich
In reply to this post by pau
On Aug 24, 2002 at 12:56 +0800, [hidden email] wrote:
> On Sat, Aug 24, 2002 at 12:40:57AM -0400, Thatcher Ulrich wrote:
> > 
> > [hidden email] posted a while back about an experimental incremental GC
> > patch he made, but the link seems to be dead.  Paul, what's the status
> > of that?
> 
> The link'd gone dead a few months back when I went for vacation. It
> has been up and running after I returned :-P 
> 
> 	http://www.gime.org/twiki/bin/view/Gime/WebDownload

I took a quick look through the diff -- it looks promising.  I think
the size of the collection data within objects can be condensed to
just two pointers (one of which replaces the existing "mark"
pointer).  The idea is to keep separate lists for each type of object
(thus eliminating the vtype variable) and not storing an explicit
"marked" value -- instead, the marked status is determined by which
list the object is linked into (white, gray, or black).

Hopefully I will get a chance to fiddle around with this in the near
future.  I think the first task is to make a "luaC_checkinvariants()"
function, which does an expensive traversal of everything and ensures
that the GC state is consistent.  This would be used only for
debugging.  Then, assemble a test suite and run it, calling
luaC_checkinvariants very frequently.  Hopefully this will smoke out
any missing write barriers.

Thanks for posting your stuff!

-- 
Thatcher Ulrich
http://tulrich.com

Reply | Threaded
Open this post in threaded view
|

Re: GC "survey"

Joe Myers-3
This has been a very interesting thread.

Since there are always tradeoffs involved in various approaches, what I
personally would find most useful would be a couple of (totally
optional) commands one could give the compiler to 'do something
special'.

In other words, say (for example) we're using mark & sweep, but (as
[hidden email] mentioned), we have some tables we KNOW will stay constant,
so a declaration like
	live_forever(table1, table2, ...)

would remove the referenced objects from garbage collection. Well, sure,
the downside is you might run out of memory. In which case you remove the
statement. ... On the other foot, if you're running up against a specific
problem and are bumping up against constraints, judicious hints to the
compiler could be very useful.

Another compiler 'hint' I'd like to see:
	clobber(object1, something_else2, ...)
I'm thinking of the case:

	s = {'a', 'b', 'c'}
	t = s
	...   -- a few hundred lines of code later
	t = nil

Well, that sure eliminates t, but s still lives! And (here I don't know),
if even we did a
	s = nil
wouldn't table {'a', 'b', 'c'} be floating around (unreferenced) til the
next garbage collect?

So I'd like 'clobber' to brutally annihilate the underlying objects
allocated, and release the space immediately to the free pool.

((Aside: is this what happens right now to locals when they go out of
scope?))

((Aside#2: heck I'm not a purist, these 'hints' could simply be C
functions manipulating the lua VM!))

Joe










123