Garbage Collection

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Garbage Collection

Gavin Wraith-2
Everybody refers to reference counting as
though it cannot catch cyclic structures.
Rafael Lins, whom I believe is now in the
Departamento de Electronica e Sistemas at
the Universidade de Pernambuco, produced
some beautiful gc reference counting
algorithms that do catch cyclic structures.
See "Cyclic Reference Counting with Lazy
Mark-Scan", UKC Technical Report 75
(1991). I believe he is now a world-expert
on gc, and has co-authored books on the topic.
Seek advice from your eminent compatriots, 
ye progenitors of Lua.
-- 
Gavin Wraith ([hidden email])
Home page: http://www.wraith.u-net.com/


Reply | Threaded
Open this post in threaded view
|

Re: Garbage Collection

Roberto Ierusalimschy
> Everybody refers to reference counting as though it cannot catch cyclic
> structures. Rafael Lins, whom I believe is now in the Departamento de
> Electronica e Sistemas at the Universidade de Pernambuco, produced
> some beautiful gc reference counting algorithms that do catch cyclic
> structures. See "Cyclic Reference Counting with Lazy Mark-Scan", UKC
> Technical Report 75 (1991). I believe he is now a world-expert on gc,
> and has co-authored books on the topic. Seek advice from your eminent
> compatriots, ye progenitors of Lua.

I know him quite well, but I could never "extract" from him the details
of this algorithm about cyclic reference counting. Curiosly, his main
book about garbage collection (which is, I believe, *the* book about
garbage collection) is from 1996 (so, five years after that paper), but
it does not include that algorithm...

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Garbage Collection

Thatcher Ulrich
On Oct 29, 2002 at 03:39 -0300, Roberto Ierusalimschy wrote:
> > Everybody refers to reference counting as though it cannot catch cyclic
> > structures. Rafael Lins, whom I believe is now in the Departamento de
> > Electronica e Sistemas at the Universidade de Pernambuco, produced
> > some beautiful gc reference counting algorithms that do catch cyclic
> > structures. See "Cyclic Reference Counting with Lazy Mark-Scan", UKC
> > Technical Report 75 (1991). I believe he is now a world-expert on gc,
> > and has co-authored books on the topic. Seek advice from your eminent
> > compatriots, ye progenitors of Lua.
> 
> I know him quite well, but I could never "extract" from him the details
> of this algorithm about cyclic reference counting. Curiosly, his main
> book about garbage collection (which is, I believe, *the* book about
> garbage collection) is from 1996 (so, five years after that paper), but
> it does not include that algorithm...

This is a very good book, by the way, clear and comprehensive.  I
recently went back and skimmed the chapter on reference-counting, when
the topic popped up on the list here.  In my copy, there's a
discussion of Lins's algorithm on pages 62-67.  I haven't read it in
enough detail yet to summarize it, but I'll quote a few summary
sentences from the book:

"Lins's algorithm is lazy in the sense that the mark-sweep garbage
collection is performed on demand.  [...big snip...]  Like
generational garbage collection, Lins's method works best when
side-effects are comparatively rare, for example for programs written
in a functional style.  Its success also rests on the assumptions that
the great majority of nodes are uniquely referenced and can be
reclaimed without resort to garbage collection; and that the
sub-graphs traversed are sufficiently small to make the garbage
collection delay small.  The drawback is that Lins traces garbage
whereas standard mark-sweep algorithms trace only active cells.
Unfortunately, implementations of functional programming languages
generate copious amounts of garbage: collection rates of over 80
percent of the heap are common.  No thorough comparisons of cyclic
reference counting against other methods have been carried out."

My spin: reference counting plus an occasional mark-and-sweep to get
the cycles is perfectly valid and correct, and relatively simple.  The
problem is, the mark-and-sweep is just as bad as it is currently,
because you have to traverse the whole active heap.  That's where
Lins's algorithm comes in -- it's an attempt to make the occasional
cyclic collection less painful.

However, pragmatically speaking, in many games you may be able to code
defensively and avoid creating (much) cyclic garbage, and therefore
avoid having to call the mark-and-sweep at all; or only call it at
certain convenient times.  With the addition of weak tables in Lua,
this becomes especially feasible.

-- 
Thatcher Ulrich
http://tulrich.com

Reply | Threaded
Open this post in threaded view
|

Re: Garbage Collection

Roberto Ierusalimschy
> "Lins's algorithm is lazy in the sense that the mark-sweep garbage
> collection is performed on demand.  [...big snip...]  Like
> generational garbage collection, Lins's method works best when
> side-effects are comparatively rare, for example for programs written
> in a functional style.  Its success also rests on the assumptions that
> the great majority of nodes are uniquely referenced and can be
> reclaimed without resort to garbage collection; and that the
> sub-graphs traversed are sufficiently small to make the garbage
> collection delay small.

Most of these are not true in Lua: most programs use mainly side-effects
to do their jobs, and many nodes have more than one reference (because a
single assignment or parameter passing duplicates the reference). Maybe
that's why I forgot about it ;-)

In a related subject: I don't want to go into the merits of reference
counting x games, but I feel that a correct implementation of reference
counting in Lua is not going to be simple. For instance, everytime Lua
pops its stack (a simple "L->top--", for instance) it is destroying
references. The stack is pervasive in Lua's implementation, and to
control correctly all these pushs and pops seems a difficult task. (More
difficult yet if there is any concern with performance...)

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Garbage Collection

Steve Dekorte-4
In reply to this post by Gavin Wraith-2

On Tuesday, October 29, 2002, at 09:54 AM, Gavin Wraith wrote:
Everybody refers to reference counting as
though it cannot catch cyclic structures.
Rafael Lins, whom I believe is now in the
Departamento de Electronica e Sistemas at
the Universidade de Pernambuco, produced
some beautiful gc reference counting
algorithms that do catch cyclic structures.
See "Cyclic Reference Counting with Lazy
Mark-Scan", UKC Technical Report 75
(1991). I believe he is now a world-expert
on gc, and has co-authored books on the topic.
Seek advice from your eminent compatriots,
ye progenitors of Lua.

It sounds like this would only reduce the frequency of the mark/sweep pauses, but not eliminate them. Unfortunately this wouldn't be a solution as any pauses are unacceptable.

Cheers,
Steve
OSX freeware and shareware: http://www.dekorte.com/downloads.html


Reply | Threaded
Open this post in threaded view
|

Re: Garbage Collection

Thatcher Ulrich
In reply to this post by Roberto Ierusalimschy
On Oct 30, 2002 at 09:51 -0300, Roberto Ierusalimschy wrote:
> > "Lins's algorithm is lazy in the sense that the mark-sweep garbage
> > collection is performed on demand.  [...big snip...]  Like
> > generational garbage collection, Lins's method works best when
> > side-effects are comparatively rare, for example for programs written
> > in a functional style.  Its success also rests on the assumptions that
> > the great majority of nodes are uniquely referenced and can be
> > reclaimed without resort to garbage collection; and that the
> > sub-graphs traversed are sufficiently small to make the garbage
> > collection delay small.
> 
> Most of these are not true in Lua: most programs use mainly side-effects
> to do their jobs, and many nodes have more than one reference (because a
> single assignment or parameter passing duplicates the reference). Maybe
> that's why I forgot about it ;-)

Yeah, I agree it doesn't match Lua's requirements too well.

> In a related subject: I don't want to go into the merits of reference
> counting x games, but I feel that a correct implementation of reference
> counting in Lua is not going to be simple. For instance, everytime Lua
> pops its stack (a simple "L->top--", for instance) it is destroying
> references. The stack is pervasive in Lua's implementation, and to
> control correctly all these pushs and pops seems a difficult task. (More
> difficult yet if there is any concern with performance...)

You're right, popping the stack destroys a reference.  You can defer
the actual destruction of the reference, though, until something is
pushed into that slot in the stack, or until some later convenient
time, and destroy all the extra references at once.

I believe an incremental mark-sweep, and possibly also generational,
suffers from a similar problem: you must have a write-barrier to
enforce your invariants.  Any time you change a reference, some GC
checks must be done.  The different algorithms and variations have
different invariants, so there may be "safe" situations when you can
change references, but of course this is what makes these things so
complicated.

I think a generic write-barrier macro used throughout the Lua core, as
Russell Webb is suggesting, would be good for GC experiments.
Actually, I just looked at lobject.h in 5.0-alpha, and I think the
infrastructure is already in place:

sethvalue() and the other similar macros can serve as the write
barrier.  Just redefine those macros to the write-barrier code
(possibly a function call).

Does the Lua core always use the set* macros to change a Value's
state?  If so, then we're in good shape -- I think pretty much any GC
can be added without touching much of the rest of the core, by
defining those macros appropriately, and adding the necessary support
code.

-- 
Thatcher Ulrich
http://tulrich.com

Reply | Threaded
Open this post in threaded view
|

Re: Garbage Collection

Roberto Ierusalimschy
> Does the Lua core always use the set* macros to change a Value's state?

Yes (or it should), but this is not enough. As we discussed for the
stack, we still need some way to catch creation/destroying of references
(and not only changes).

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Garbage Collection

Roberto Ierusalimschy
In reply to this post by Thatcher Ulrich
> I believe an incremental mark-sweep, and possibly also generational,
> suffers from a similar problem: you must have a write-barrier to
> enforce your invariants.

For a generational GC, we can consider that the stack is always new, so
we do not need a write-barrier when manipulating the stack. That greatly
simplifies the implementation (and also improves the performance).

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Garbage Collection

Thatcher Ulrich
In reply to this post by Roberto Ierusalimschy
On Oct 30, 2002 at 04:10 -0300, Roberto Ierusalimschy wrote:
> > Does the Lua core always use the set* macros to change a Value's state?
> 
> Yes (or it should), but this is not enough. As we discussed for the
> stack, we still need some way to catch creation/destroying of references
> (and not only changes).

Good point.  If new Value objects are always initialized to nil, then
that should take care of creation.  I suppose that's not already being
done though.

For destruction of stack references, it can be safely deferred, like
we discussed.  So, in the GC's where this is necessary, it can be
deferred all the way until luaC_collect is called.  To get earlier
collection of things referred to by the stack, an explicit
lua_cleanstack() API could be added, or perhaps it could be done in
luaD_checkstack().  Basically, it needs to call setnilvalue() on all
entries of the stack above the current top.  So I think this stack
cleaning could easily be part of a GC patch -- it doesn't need to be
included in the core.

For destruction of non-stack references, it would suffice to set all
Values to nil with setnilvalue() when they are destroyed.  I suppose
this also is not being done currently, but then I haven't looked.

-- 
Thatcher Ulrich
http://tulrich.com

Reply | Threaded
Open this post in threaded view
|

Re: Garbage Collection

Roberto Ierusalimschy
> Good point.  If new Value objects are always initialized to nil, then
> that should take care of creation.  I suppose that's not already being
> done though.

But we need a different kind of "setnil". For reference counting, the
usual "setnil" must decrement the count of the previous value, but
the "setnil" of a new reference cannot do that.


> For destruction of non-stack references, it would suffice to set all
> Values to nil with setnilvalue() when they are destroyed.  I suppose
> this also is not being done currently, but then I haven't looked.

Destruction of non-stack references ocurr mainly when Lua destroys an
object, so this is already controled by the garbage collector.

-- Roberto