Functions and memory usage

classic Classic list List threaded Threaded
21 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Functions and memory usage

John Donovan
Hi,
I'm new to the list, and normally I'd just lurk for a couple of weeks
getting the feel for the list. However, time is of the essence at the
moment.
We're developing a game on a couple of well-known games consoles, and I've
been given the task of implementing a scripting system. After looking at
various options, I picked Lua. Now I'm trying to convince my boss that Lua
is A Good Thing. He's worried about memory fragmentation.
As most of you will be aware, consoles have a very limited amount of memory,
and I'm going to be given as small a piece of that as possible, so I've got
to make every byte count. After looking at the code, and how it allocates
and deallocates memory, we noticed that when it loads up a function (with
LoadFunction()) it loads locals, lines, constants, and code in that order.
However when it comes to free the function (with luaF_freeproto()) it does
it in a different order. My question is, if I shuffle the deallocation of
the memory LoadFunction() allocates so it does it in reverse order, could we
implement a second memory allocator that uses a simple stack? So when
LoadFunction() pushes locals, lines, constants, and code, we could safely
free it in reverse order without worrying if something else has realloc'd or
otherwise messed around with our function's memory?
A search of the archives turned up nothing on this specific question, but I
have noticed that fragmentation is a reported problem that some people have
managed to keep to a minimum.

-John Donovan
Programmer - Magenta Software Ltd.


The content of this message and any attached file are confidential and/or
privileged and are for the intended recipient only. If you are not the
intended recipient, any unauthorised use, disclosure, copying, distribution
or other dissemination is strictly prohibited. If you receive this message
in error please notify the sender immediately by email, telephone or fax and
then delete this email. Any attachment with this message should be checked
for viruses before it is opened.  Magenta Software Limited cannot be held
responsible for any failure by the recipient to check for viruses before
opening any attachment. Copyright in this email and attachments created by
us belongs to Magenta Software Limited. Should you communicate with anyone
at Magenta Software Limited by email you consent to us monitoring and
reading any such correspondence.



Reply | Threaded
Open this post in threaded view
|

Functions and memory usage

Basile STARYNKEVITCH
>>>>> "John" == John Donovan <[hidden email]> writes:

    John> Hi, I'm new to the list, and normally I'd just lurk for a
    John> couple of weeks getting the feel for the list. However, time
    John> is of the essence at the moment.  We're developing a game on
    John> a couple of well-known games consoles, and I've been given
    John> the task of implementing a scripting system. After looking
    John> at various options, I picked Lua. Now I'm trying to convince
    John> my boss that Lua is A Good Thing. He's worried about memory
    John> fragmentation.  As most of you will be aware, consoles have
    John> a very limited amount of memory, [...]

In my opinion the only real tool against memory fragmentation is a
moving (or copying, or compacting) garbage collector. The LUA GC is a
mark&sweep one, so it does not compact memory and neither does the
ordinary manual malloc+free (as in C) or new+delete (as in C++) memory
management. Fragmentation issues do appear in C or C++. There are lots
of literature on compacting or moving garbage collection (and there is
a GC mailing list, even if it is a bit quiet these days).

I am not sure that fragmentation is really an issue for game scripting
(I really don't know what these are exactly - I don't play games on my
computer, and never coded any game).

However, if you feel that fragmentation is an issue:

1. you cannot use traditional manual memory management (ie malloc+free
   or new+delete) because it has the same fragmentation problem.

2. you should consider compacting or moving garbage collection
   techniques, which do have some drawbacks (in particular, real-time
   or interactive-time constraints may require tricky tradeoffs or
   tuning with such moving GCs).

3. you could perhaps change Lua GC to be compacting. The simple case
   would be to compact the memory once in a while, in your idle loop,
   when the Lua call stack is nearly empty. If you do this, please
   contribute (if possible) your compacting GC to the Lua community.

4. you might consider alternative scripting languages (or virtual
   machines). I feel that Ocaml is a good scripting language, and its
   garbage collection can be tuned to perform very well in an
   interactive setting (like a game). Ocaml garbage collection is one
   of the finest I know (it is generational, copying for new objects,
   and marking & compacting for old ones). See www.ocaml.org 

5. You might use a copying GC in C. This is painful but doable. My
Qish runtime is actually such a beast. See my home page and/or send me
email.

[Sorry for mentionning another language on the Lua mailing list, but I
think that there are cases where Lua might not be the best solution]

Regards.


-- 

Basile STARYNKEVITCH         http://starynkevitch.net/Basile/ 
email: basile<at>starynkevitch<dot>net 
alias: basile<at>tunes<dot>org 
8, rue de la Faïencerie, 92340 Bourg La Reine, France

Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

Nick Trout-5
In reply to this post by John Donovan
> In my opinion the only real tool against memory fragmentation is a
> moving (or copying, or compacting) garbage collector. The LUA GC is a
> mark&sweep one, so it does not compact memory and neither does the
> ordinary manual malloc+free (as in C) or new+delete (as in C++) memory
> management. Fragmentation issues do appear in C or C++. There are lots
> of literature on compacting or moving garbage collection (and there is
> a GC mailing list, even if it is a bit quiet these days).

Useful resources here:
http://www.memorymanagement.org/
http://lua-users.org/wiki/GarbageCollection
http://lua-users.org/wiki/MemoryAllocation

Compacting may not be very good solution for games as it may take to
much time. You don't really want to spend 10-20% of your CPU time moving
memory around and have to deal with all the indirection as well. Some
points are made here:
http://www.memorymanagement.org/articles/recycle.html#copying 

I think a lot of fragmentation problems are down to the allocation
algorithm. There is some useful info here and I think this allocator
looks pretty good. I think I pulled it off this list a while ago. Its
fast and should be good for a scripting system allocating and
deallocating little blocks of similar sizes.
http://g.oswego.edu/dl/html/malloc.html

It wont solve the fragmentation problem but it might reduce it to an
acceptable level. If you try not to create many objects at runtime you
will probably suffer less as well:
http://lua-users.org/wiki/OptimisingGarbageCollection



> [Sorry for mentionning another language on the Lua mailing list, but I
> think that there are cases where Lua might not be the best solution]

There is nothing wrong with suggesting other solutions. I wish some more
Lua comparisons would appear on the wiki: 
http://lua-users.org/wiki/LuaComparison

This list and Luas progress is quite pragmatic. If you were to
demonstrate features of Ocaml that you liked and would benefit Lua you
might find them included. The authors are intending adding a
generational GC to Lua so perhaps you could shed light on this and/or
other optimisations.

Nick








Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

Luiz Henrique de Figueiredo
In reply to this post by John Donovan
>I think a lot of fragmentation problems are down to the allocation
>algorithm. There is some useful info here and I think this allocator
>looks pretty good. I think I pulled it off this list a while ago. Its
>fast and should be good for a scripting system allocating and
>deallocating little blocks of similar sizes.
>http://g.oswego.edu/dl/html/malloc.html

We'd be very interested in knowing whether simply replacing the system malloc
and friends by dlmalloc (the one at the URL above) helps. It's difficult for
us to test memory allocation pattern in typical, real-life cases. So, someone
with memory fragmentation problems or suspicion of such, could you please try
dlmalloc and see if makes any difference? Please report here. Thanks.

Another thing: Lua typically does not deallocating little blocks of similar
sizes: strings and userdata allocated their "data" arrays right after the
"header" part.
--lhf

Reply | Threaded
Open this post in threaded view
|

Re: Functions and memory usage

terenctb
Hi,

We are did try to use dlmalloc to replace the regular malloc but it 
didn't seem to make much difference...well other than the fact we are 
probably doing something wrong anyway ;-)>

I also found out(not sure if the info that valid) that the dlmalloc 
is the same one used by in linux anyways so it seemed kinda silly to 
replace it.

Terence
--- In lua-l@y..., Luiz Henrique de Figueiredo <lhf@t...> wrote:
> >I think a lot of fragmentation problems are down to the allocation
> >algorithm. There is some useful info here and I think this 
allocator
> >looks pretty good. I think I pulled it off this list a while ago. 
Its
> >fast and should be good for a scripting system allocating and
> >deallocating little blocks of similar sizes.
> >http://g.oswego.edu/dl/html/malloc.html
> 
> We'd be very interested in knowing whether simply replacing the 
system malloc
> and friends by dlmalloc (the one at the URL above) helps. It's 
difficult for
> us to test memory allocation pattern in typical, real-life cases. 
So, someone
> with memory fragmentation problems or suspicion of such, could you 
please try
> dlmalloc and see if makes any difference? Please report here. 
Thanks.
> 
> Another thing: Lua typically does not deallocating little blocks of 
similar
> sizes: strings and userdata allocated their "data" arrays right 
after the
> "header" part.
> --lhf


Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

jmckenn
In reply to this post by John Donovan
From: Basile STARYNKEVITCH [[hidden email]]

>However, if you feel that fragmentation is an issue:
>
>1. you cannot use traditional manual memory management (ie malloc+free
>   or new+delete) because it has the same fragmentation problem.

That's not entirely true.  If you're doing manual allocation, you can
control when each allocation is done.  If you're careful about how you
do things, you can avoid fragmentation.  It requires a detailed plan
up front, and a big stick to beat programmers who break it.

View memory as a stack, and forbid freeing anything until everything
that was allocated after it has been freed.  You can be a bit more
flexible than that - use a stack of states, and insist that each state
frees all of the memory it allocated before it leaves.

With garbage collection, being careful doesn't help.  It might be a
while before the system notices that a block is no longer in use.  By
then you might have allocated new blocks beyond it, fragmenting memory.

I suppose you could force a collection on each state transition.  That
might work.  You still need discipline, of course.  And a big stick. 

Whichever language you use, patch its memory allocation routines so you
can produce a log of all allocation: where, how much, and who did it.
We have an off-line graphical viewer that nicely shows where fragmentation
occurs, and who to blame.  It's much easier to do this in Lua than in
C++.

>2. you should consider compacting or moving garbage collection
>   techniques, which do have some drawbacks (in particular, real-time
>   or interactive-time constraints may require tricky tradeoffs or
>   tuning with such moving GCs).

You mean spend CPU time on something that's not directly related to
the game?  Heresy! :-)

John
-Virus scanned and cleared ok

Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

John Donovan
Thanks everyone for your replies, it's very enlightening! One thing I forgot
to tell everyone is that I'm using dlmalloc, and after a bit of tweaking
works a treat. Strangely enough, before I implemented it, I used the
"native" malloc on this particular console, and the profile information
suggested it was the same algorithm.

> View memory as a stack, and forbid freeing anything until everything
> that was allocated after it has been freed.  You can be a bit more
> flexible than that - use a stack of states, and insist that each state
> frees all of the memory it allocated before it leaves.

That's the sort of approach my boss wants to take. I'm under the impression
that it's not going to be that simple. However, he wants this scripting
system to last for a good few games, so spending time now making it as
robust as possible is going to pay off. Any hints on how I might go about
this?

> I suppose you could force a collection on each state transition.  That
> might work.  You still need discipline, of course.  And a big stick. 

The thing is, I'd probably be the only person being beaten with that stick!
I have enough of that at the weekends (I'm a member of a Viking and
Anglo-Saxon reenactment group, nothing kinky)!

> Whichever language you use, patch its memory allocation 
> routines so you
> can produce a log of all allocation: where, how much, and who did it.
> We have an off-line graphical viewer that nicely shows where 
> fragmentation
> occurs, and who to blame.  It's much easier to do this in Lua than in
> C++.

That's pretty much what I'm doing at the moment. I'm just using .csv files
and Excel at the moment.

> You mean spend CPU time on something that's not directly related to
> the game?  Heresy! :-)

LOL! This is true. Especially as it may occur at any time during the game,
and take an indeterminate amount of time. Dropping frames at random
intervals is A Bad Thing on a console. Of course the graphics guys get all
the time they want, and the sound guys have a completely separate chip to
play around with *mumble mumble* :)

John Donovan
Programmer - Magenta Software Ltd.


The content of this message and any attached file are confidential and/or
privileged and are for the intended recipient only. If you are not the
intended recipient, any unauthorised use, disclosure, copying, distribution
or other dissemination is strictly prohibited. If you receive this message
in error please notify the sender immediately by email, telephone or fax and
then delete this email. Any attachment with this message should be checked
for viruses before it is opened.  Magenta Software Limited cannot be held
responsible for any failure by the recipient to check for viruses before
opening any attachment. Copyright in this email and attachments created by
us belongs to Magenta Software Limited. Should you communicate with anyone
at Magenta Software Limited by email you consent to us monitoring and
reading any such correspondence.



Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

Fabio Reis Cecin
In reply to this post by jmckenn
Here's some material on the Java garbage collector(s) (there is more than one
and it seems that you can select wich one you want) :

http://java.sun.com/products/hotspot/whitepaper.html#5
http://developer.java.sun.com/developer/technicalArticles/Networking/HotSpot/

One of them uses the "train" algorythm, wich supposedly eliminates
undesirable 1-sec pauses when "mark-compact"-like garbage collection
would occur.

Another idea: have a couple of functions on the Lua API for the
programmer to know an estimate of how bad fragmentation is, and to
be able to force a full GC and full memory compaction. This could be
useful for the programmer to call, when the user "pauses the game",
or is detected to be away for the program, or when the game is
changing levels, etc.

Fabio

On 17 Oct 2002, at 9:17, John Mckenna wrote:

> >2. you should consider compacting or moving garbage collection
> >   techniques, which do have some drawbacks (in particular, real-time
> >   or interactive-time constraints may require tricky tradeoffs or
> >   tuning with such moving GCs).
> 
> You mean spend CPU time on something that's not directly related to
> the game?  Heresy! :-)
> 
> John
> -Virus scanned and cleared ok
> 
> Esta mensagem foi verificada pelo E-mail Protegido Terra.
> Scan engine: VirusScan / Atualizado em 16/10/2002 / Versão: 1.3.13
> Proteja o seu e-mail Terra: http://www.emailprotegido.terra.com.br/
> 


--
[]'s
Fábio R. Cecin
[hidden email]

Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

Luiz Henrique de Figueiredo
In reply to this post by John Donovan
>Another idea: have a couple of functions on the Lua API for the
>programmer to know an estimate of how bad fragmentation is

How can we do that using ANSI C functions?
--lhf

Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

Fabio Reis Cecin
I guess you can always use some sort of indirect measure. For example, you can
sample the average time elapsed with memory allocations, to see if it's growing too
much. Or compare the size of the heap (last address allocated - first address
alloc. ?) with the size of allocated structures. If ratio above a certain amount, the
memory should be fragmented (or not?).

The goal would be just to the programmer to know, with some margin for error,
that it's time to force GC+compact. Then it can tell the user: ok, move away
from the line of fire and stand in a dark corner while the world cleans itself
up... :-)

Fabio

On 17 Oct 2002, at 11:32, Luiz Henrique de Figueiredo wrote:

> >Another idea: have a couple of functions on the Lua API for the
> >programmer to know an estimate of how bad fragmentation is
> 
> How can we do that using ANSI C functions?
> --lhf
--
[]'s
Fábio R. Cecin
[hidden email]

Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

Benoit Germain
In reply to this post by John Donovan
I use dlmalloc too, and I have slightly modified it so that I can manage
several memory pools. That way, I ensure that fragmentation won't affect the
entire RAM, but will be confined to some area of my choosing. All I have to
do is dedicate one pool to LUA, et voilà :-).

> -----Original Message-----
> From: terenctb [[hidden email]]
> Sent: jeudi 17 octobre 2002 08:12
> To: Multiple recipients of list
> Subject: Re: Functions and memory usage
> 
> 
> Hi,
> 
> We are did try to use dlmalloc to replace the regular malloc but it 
> didn't seem to make much difference...well other than the fact we are 
> probably doing something wrong anyway ;-)>
> 
> I also found out(not sure if the info that valid) that the dlmalloc 
> is the same one used by in linux anyways so it seemed kinda silly to 
> replace it.
> 
> Terence
> --- In lua-l@y..., Luiz Henrique de Figueiredo <lhf@t...> wrote:
> > >I think a lot of fragmentation problems are down to the allocation
> > >algorithm. There is some useful info here and I think this 
> allocator
> > >looks pretty good. I think I pulled it off this list a while ago. 
> Its
> > >fast and should be good for a scripting system allocating and
> > >deallocating little blocks of similar sizes.
> > >http://g.oswego.edu/dl/html/malloc.html
> > 
> > We'd be very interested in knowing whether simply replacing the 
> system malloc
> > and friends by dlmalloc (the one at the URL above) helps. It's 
> difficult for
> > us to test memory allocation pattern in typical, real-life cases. 
> So, someone
> > with memory fragmentation problems or suspicion of such, could you 
> please try
> > dlmalloc and see if makes any difference? Please report here. 
> Thanks.
> > 
> > Another thing: Lua typically does not deallocating little blocks of 
> similar
> > sizes: strings and userdata allocated their "data" arrays right 
> after the
> > "header" part.
> > --lhf
> 

Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

John Donovan
In reply to this post by John Donovan
Yes, I've allocated a portion of memory just for Lua's use. You have to
override sbrk in the dlmalloc code. Right near the bottom of dlmalloc.c
there is a Win32 version, and a MacOS version commented out. I used that one
as a starting point.

John Donovan
Programmer - Magenta Software Ltd.


> -----Original Message-----
> From: Benoit Germain [[hidden email]] 
> Sent: 17 October 2002 16:30
> To: Multiple recipients of list
> Subject: RE: Functions and memory usage
> 
> 
> I use dlmalloc too, and I have slightly modified it so that I 
> can manage
> several memory pools. That way, I ensure that fragmentation 
> won't affect the
> entire RAM, but will be confined to some area of my choosing. 
> All I have to
> do is dedicate one pool to LUA, et voilà :-).
> 
> > -----Original Message-----
> > From: terenctb [[hidden email]]
> > Sent: jeudi 17 octobre 2002 08:12
> > To: Multiple recipients of list
> > Subject: Re: Functions and memory usage
> > 
> > 
> > Hi,
> > 
> > We are did try to use dlmalloc to replace the regular malloc but it 
> > didn't seem to make much difference...well other than the 
> fact we are 
> > probably doing something wrong anyway ;-)>
> > 
> > I also found out(not sure if the info that valid) that the dlmalloc 
> > is the same one used by in linux anyways so it seemed kinda 
> silly to 
> > replace it.
> > 
> > Terence
> > --- In lua-l@y..., Luiz Henrique de Figueiredo <lhf@t...> wrote:
> > > >I think a lot of fragmentation problems are down to the 
> allocation
> > > >algorithm. There is some useful info here and I think this 
> > allocator
> > > >looks pretty good. I think I pulled it off this list a 
> while ago. 
> > Its
> > > >fast and should be good for a scripting system allocating and
> > > >deallocating little blocks of similar sizes.
> > > >http://g.oswego.edu/dl/html/malloc.html
> > > 
> > > We'd be very interested in knowing whether simply replacing the 
> > system malloc
> > > and friends by dlmalloc (the one at the URL above) helps. It's 
> > difficult for
> > > us to test memory allocation pattern in typical, real-life cases. 
> > So, someone
> > > with memory fragmentation problems or suspicion of such, 
> could you 
> > please try
> > > dlmalloc and see if makes any difference? Please report here. 
> > Thanks.
> > > 
> > > Another thing: Lua typically does not deallocating little 
> blocks of 
> > similar
> > > sizes: strings and userdata allocated their "data" arrays right 
> > after the
> > > "header" part.
> > > --lhf
> > 
> 


The content of this message and any attached file are confidential and/or
privileged and are for the intended recipient only. If you are not the
intended recipient, any unauthorised use, disclosure, copying, distribution
or other dissemination is strictly prohibited. If you receive this message
in error please notify the sender immediately by email, telephone or fax and
then delete this email. Any attachment with this message should be checked
for viruses before it is opened.  Magenta Software Limited cannot be held
responsible for any failure by the recipient to check for viruses before
opening any attachment. Copyright in this email and attachments created by
us belongs to Magenta Software Limited. Should you communicate with anyone
at Magenta Software Limited by email you consent to us monitoring and
reading any such correspondence.



Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

Alberto Demichelis
In reply to this post by John Donovan
In our game we managed to reduce fragmetation to 0%.
What we do is managing small allocations(smaller than 513 bytes) and big allocations in a different way.
for every allocation size between 1 and 512 bytes we allocate a 4K page where we will store only object of a certain size;
for example 1 page with only 4 bytes chunks 1 page with only 32 bytes page.we round the allocations size to multiple of 4.
for all chunk bigger than 512 bytes, we allocate it with a normal general pourpose allocator using a "best fit" algorithm.
we "waste" 200/300k of memory every 100Mb allocated.
With this solution we have an allocator that is almost CPU free for small alloc/free and relatively fast for big alloc/free
because the number of chunk managed by the "best fit" are not so many.

we do not have a dedicated pool for lua but a dedicated pool for all game.

ciao
Alberto

---------------------
Alberto Demichelis
[hidden email]
Crytek Studios GmbH
www.crytek.com
---------------------



-----Original Message-----
From: Benoit Germain [[hidden email]]
Sent: Donnerstag, 17. Oktober 2002 17:30
To: Multiple recipients of list
Subject: RE: Functions and memory usage


I use dlmalloc too, and I have slightly modified it so that I can manage
several memory pools. That way, I ensure that fragmentation won't affect the
entire RAM, but will be confined to some area of my choosing. All I have to
do is dedicate one pool to LUA, et voilà :-).

> -----Original Message-----
> From: terenctb [[hidden email]]
> Sent: jeudi 17 octobre 2002 08:12
> To: Multiple recipients of list
> Subject: Re: Functions and memory usage
> 
> 
> Hi,
> 
> We are did try to use dlmalloc to replace the regular malloc but it 
> didn't seem to make much difference...well other than the fact we are 
> probably doing something wrong anyway ;-)>
> 
> I also found out(not sure if the info that valid) that the dlmalloc 
> is the same one used by in linux anyways so it seemed kinda silly to 
> replace it.
> 
> Terence
> --- In lua-l@y..., Luiz Henrique de Figueiredo <lhf@t...> wrote:
> > >I think a lot of fragmentation problems are down to the allocation
> > >algorithm. There is some useful info here and I think this 
> allocator
> > >looks pretty good. I think I pulled it off this list a while ago. 
> Its
> > >fast and should be good for a scripting system allocating and
> > >deallocating little blocks of similar sizes.
> > >http://g.oswego.edu/dl/html/malloc.html
> > 
> > We'd be very interested in knowing whether simply replacing the 
> system malloc
> > and friends by dlmalloc (the one at the URL above) helps. It's 
> difficult for
> > us to test memory allocation pattern in typical, real-life cases. 
> So, someone
> > with memory fragmentation problems or suspicion of such, could you 
> please try
> > dlmalloc and see if makes any difference? Please report here. 
> Thanks.
> > 
> > Another thing: Lua typically does not deallocating little blocks of 
> similar
> > sizes: strings and userdata allocated their "data" arrays right 
> after the
> > "header" part.
> > --lhf
> 

Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

Nick Trout-5
In reply to this post by John Donovan
> In our game we managed to reduce fragmetation to 0%.

By your own admission, the fragmentation is not 0%! :  ''we "waste"
200/300k of memory every 100Mb allocated.'' :-)

> What we do is managing small allocations(smaller than 513 
> bytes) and big allocations in a different way.
> for every allocation size between 1 and 512 bytes we allocate 
> a 4K page where we will store only object of a certain size;
> for example 1 page with only 4 bytes chunks 1 page with only 
> 32 bytes page.we round the allocations size to multiple of 4.
> for all chunk bigger than 512 bytes, we allocate it with a 
> normal general pourpose allocator using a "best fit" algorithm.
> we "waste" 200/300k of memory every 100Mb allocated.
> With this solution we have an allocator that is almost CPU 
> free for small alloc/free and relatively fast for big alloc/free
> because the number of chunk managed by the "best fit" are not so many.

The schema you suggest would actually enforce fragmentation due to
unfilled 4k pages. But, such a recursive allocator is good way of
localising fragmentation with the benefits of fast pool allocation. I
assume each pool size would have an most recently used stack, only
allocating a new pool once you've checked the existing pools for space?

Regards,
Nick






Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

Alberto Demichelis
In reply to this post by John Donovan
We spent a while testing the allocator and at least for our usage the memory
is perfectly stable.
The 4k pages thing was an experiment but we noticed that at the end all pages were always
full and cut in half the time spent for freeing memory(that cut in half the lua GC time).

200k/300k of internal structures of the allocator on 100Mb of allocated memory sound
pretty good to me, I do not understand the smile :-)

> I assume each pool size would have an most recently used stack, only
> allocating a new pool once you've checked the existing pools for space?

I don't really understand the question :P
What do you mean as most recently used stack?

ciao
Alberto

---------------------
Alberto Demichelis
[hidden email]
Crytek Studios GmbH
www.crytek.com
---------------------

-----Original Message-----
From: Nick Trout [[hidden email]]
Sent: Donnerstag, 17. Oktober 2002 23:07
To: Multiple recipients of list
Subject: RE: Functions and memory usage



> In our game we managed to reduce fragmetation to 0%.

By your own admission, the fragmentation is not 0%! :  ''we "waste"
200/300k of memory every 100Mb allocated.'' :-)

> What we do is managing small allocations(smaller than 513 
> bytes) and big allocations in a different way.
> for every allocation size between 1 and 512 bytes we allocate 
> a 4K page where we will store only object of a certain size;
> for example 1 page with only 4 bytes chunks 1 page with only 
> 32 bytes page.we round the allocations size to multiple of 4.
> for all chunk bigger than 512 bytes, we allocate it with a 
> normal general pourpose allocator using a "best fit" algorithm.
> we "waste" 200/300k of memory every 100Mb allocated.
> With this solution we have an allocator that is almost CPU 
> free for small alloc/free and relatively fast for big alloc/free
> because the number of chunk managed by the "best fit" are not so many.

The schema you suggest would actually enforce fragmentation due to
unfilled 4k pages. But, such a recursive allocator is good way of
localising fragmentation with the benefits of fast pool allocation. I
assume each pool size would have an most recently used stack, only
allocating a new pool once you've checked the existing pools for space?

Regards,
Nick






Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

RLak
In reply to this post by John Donovan
> What we do is managing small allocations(smaller than 513
> bytes) and big allocations in a different way.
> for every allocation size between 1 and 512 bytes we allocate
> a 4K page where we will store only object of a certain size;
> for example 1 page with only 4 bytes chunks 1 page with only
> 32 bytes page.we round the allocations size to multiple of 4.
> for all chunk bigger than 512 bytes, we allocate it with a
> normal general pourpose allocator using a "best fit" algorithm.
> we "waste" 200/300k of memory every 100Mb allocated.
> With this solution we have an allocator that is almost CPU
> free for small alloc/free and relatively fast for big alloc/free
> because the number of chunk managed by the "best fit" are not so many.

If I'm not mistaken, this is a very similar algorithm to the one used by
dlmalloc. Have you compared your performance and fragmentation with it?
dlmalloc is a very mature and fast allocator.

This strategy works well if there are only a few sizes of blocks allocated,
and if deallocation-time is correlated with allocation-time. These are both
reasonable assumptions in many applications. However it is very easy to
come up with pathological cases.

An example of a pathological case is something like the following:

do
  local a = {}
  b = {}
  for i = 1, big_number do
    a[i] = {}
    b[i] = {}
  end
end

a, being local to the outermost do, will disappear at the end of it. b,
being global, won't. However, the allocations for the elements of a and b
were interleaved, so when a is garbage collected, it is going to leave 50%
fragmentation.

Most people would avoid code like that, but it comes up sometimes in
computations which memoise a lot of temporary values; it is not always easy
to notice that it is happening. Of course, no non-compacting garbage
collector is likely to help you with that sort of thing.

There is some evidence that first-fit is less likely to fragment than
best-fit, by the way. I believe that dlmalloc is almost-first-fit, but
don't quote me on that.




Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

Nick Trout-5
In reply to this post by John Donovan
> 200k/300k of internal structures of the allocator on 100Mb of 
> allocated memory sound
> pretty good to me, I do not understand the smile :-)

Then its not memory wasted, its used to manage the memory management. I
assumed you meant that 200k-300k is lost to spaces at the end of pages,
and rounding up, as you describe.

The smile was to show that while I might be aiming a criticism at you
there is no malice intended! Its all a bit dry this technical mumbo
jumbo sometimes and email can often be read ambiguously.

> > I assume each pool size would have an most recently used stack, only
> > allocating a new pool once you've checked the existing 
> pools for space?
> 
> I don't really understand the question :P
> What do you mean as most recently used stack?

What happens when you deallocate/free an object? Lets say you allocated
10 pools (4k blocks) of 16 byte objects and then free a load of them. If
you then wanted to allocate a new 16 byte object, how would that be
done? I assume for fast allocation you would keep track of pools, and
group them by size of object, so you could quickly find a pool
containing 16 byte objects to allocate from. I was speculating that a
set of stacks of pools might be a good way to organise this. When a new
pool of a certain size is required it is pushed onto the stack and can
be accessed quickly. However when deallocation takes place this might
not be the best idea. I'd be interested to know how this is managed
since you mention your free time was greatly reduced.

If you are allocating 100Mb then this would be beyond the scope of
certain platforms and so a system like yours may not be as efficient or
suitable for all applications. The dlmalloc system may be a better
general solution?

Nick





Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

Alberto Demichelis
In reply to this post by John Donovan
> What happens when you deallocate/free an object?
Instead of keeping track of the pages we keep a linked list of feer blocks for every chunk size,
because they are "empty" we use the block himself as node of the linked list so no memory overhead.
When we create a page, the address of the first chunk is aligned in a way that allow us to 
retrieve the page start address from the chunk address.The page is ref counted(first 4 bytes).
When we allocate a chunk we pop it from the free list and add a reference to the page.
When we free we put it back in the free list and we decrement the reference if is 0 we free the page.

I've took a quick look at dlmalloc, it looks not so far from our approach exept for the implementation.
I'll test it.


ciao
Alberto

PS: btw 200k/300k are internal structures + "waste".

---------------------
Alberto Demichelis
[hidden email]
Crytek Studios GmbH
www.crytek.com
---------------------


-----Original Message-----
From: Nick Trout [[hidden email]]
Sent: Freitag, 18. Oktober 2002 01:06
To: Multiple recipients of list
Subject: RE: Functions and memory usage



> 200k/300k of internal structures of the allocator on 100Mb of 
> allocated memory sound
> pretty good to me, I do not understand the smile :-)

Then its not memory wasted, its used to manage the memory management. I
assumed you meant that 200k-300k is lost to spaces at the end of pages,
and rounding up, as you describe.

The smile was to show that while I might be aiming a criticism at you
there is no malice intended! Its all a bit dry this technical mumbo
jumbo sometimes and email can often be read ambiguously.

> > I assume each pool size would have an most recently used stack, only
> > allocating a new pool once you've checked the existing 
> pools for space?
> 
> I don't really understand the question :P
> What do you mean as most recently used stack?

What happens when you deallocate/free an object? Lets say you allocated
10 pools (4k blocks) of 16 byte objects and then free a load of them. If
you then wanted to allocate a new 16 byte object, how would that be
done? I assume for fast allocation you would keep track of pools, and
group them by size of object, so you could quickly find a pool
containing 16 byte objects to allocate from. I was speculating that a
set of stacks of pools might be a good way to organise this. When a new
pool of a certain size is required it is pushed onto the stack and can
be accessed quickly. However when deallocation takes place this might
not be the best idea. I'd be interested to know how this is managed
since you mention your free time was greatly reduced.

If you are allocating 100Mb then this would be beyond the scope of
certain platforms and so a system like yours may not be as efficient or
suitable for all applications. The dlmalloc system may be a better
general solution?

Nick





Reply | Threaded
Open this post in threaded view
|

RE: Functions and memory usage

Nick Trout-5
> > What happens when you deallocate/free an object?
> Instead of keeping track of the pages we keep a linked list 
> of feer blocks for every chunk size,
> because they are "empty" we use the block himself as node of 
> the linked list so no memory overhead.
> When we create a page, the address of the first chunk is 
> aligned in a way that allow us to 
> retrieve the page start address from the chunk address.The 
> page is ref counted(first 4 bytes).
> When we allocate a chunk we pop it from the free list and add 
> a reference to the page.
> When we free we put it back in the free list and we decrement 
> the reference if is 0 we free the page.

So you could have lots of pages allocated that are half full? Eg. If
longer living objects were mixed with shorter living ones. I don't
suppose there is an easy way around this. I like the fact that
fragmentation of the similar size objects is localised. Given that you
have a general system though and the allocation could go in any of the
pages I wonder if it would be more efficient for subsystems to be able
to create their own pages. This might help with cache coherence? Eg. If
a particle system allocated X pages its update and rendering may
benefit. The other benefit may be that if you kill it then all its pages
disappear and you may not have a number of them lingering because they
have a few chunks still used in them. I think there is a danger in your
system that after a time you might end up with lots of pages allocated
containing a few chunks, and fragmentation will become a problem? 

There are lots of factors. Your method works for your game and platform
but I'm not sure how well it would work on a more restricted system with
a very free flowing environment.

> I've took a quick look at dlmalloc, it looks not so far from 
> our approach exept for the implementation.
> I'll test it.

That would be great. 

Regards,
Nick




Reply | Threaded
Open this post in threaded view
|

Re: Functions and memory usage

terenctb
In reply to this post by Alberto Demichelis
I have been thinking of doing something simliar to what Alberto has 
been describing(we already do something like this for our other 
memory manager) but I am not sure that it will be enough to cut down 
the amount of data being accessed because of the method that the 
allocation/gc is being done already does lend it self to speed.

The way I was going to do it was to just throw memory at it(since I 
am in eviable position of using LUA on server with alot of 
memory)..taking it from a pre-allocated area and sizing it so that 
each allocatable area is tuned to the largest size possible. This 
means no best fit, no frag and hopefully big boost in speed. BUT I 
put some tracers on the way LUA alllocates/deallocates memory and it 
seems a better way would be just to predefine a memory 'pool' for 
each script..since each script is basically running the 'same' logic, 
pre-allocation of all necessary data would faster..

Hmmm..now if only I figure out how to do it the short time I have;-)
Do you think the new version of LUA will have multiple 'memory' 
managers instead of 1, cause I think right now this is the biggest 
thing that people always end up changing in lua.


Terence
Phoenix-GameStudios

www.phoenix-gamestudios.com

--- In lua-l@y..., "Alberto Demichelis" <alberto@c...> wrote:
> > What happens when you deallocate/free an object?
> Instead of keeping track of the pages we keep a linked list of feer 
blocks for every chunk size,
> because they are "empty" we use the block himself as node of the 
linked list so no memory overhead.
> When we create a page, the address of the first chunk is aligned in 
a way that allow us to 
> retrieve the page start address from the chunk address.The page is 
ref counted(first 4 bytes).
> When we allocate a chunk we pop it from the free list and add a 
reference to the page.
> When we free we put it back in the free list and we decrement the 
reference if is 0 we free the page.
> 
> I've took a quick look at dlmalloc, it looks not so far from our 
approach exept for the implementation.
> I'll test it.
> 
> 
> ciao
> Alberto
> 
> PS: btw 200k/300k are internal structures + "waste".
> 
> ---------------------
> Alberto Demichelis
> alberto@c...
> Crytek Studios GmbH
> www.crytek.com
> ---------------------
> 
> 
> -----Original Message-----
> From: Nick Trout [[hidden email]...]
> Sent: Freitag, 18. Oktober 2002 01:06
> To: Multiple recipients of list
> Subject: RE: Functions and memory usage
> 
> 
> 
> > 200k/300k of internal structures of the allocator on 100Mb of 
> > allocated memory sound
> > pretty good to me, I do not understand the smile :-)
> 
> Then its not memory wasted, its used to manage the memory 
management. I
> assumed you meant that 200k-300k is lost to spaces at the end of 
pages,
> and rounding up, as you describe.
> 
> The smile was to show that while I might be aiming a criticism at 
you
> there is no malice intended! Its all a bit dry this technical mumbo
> jumbo sometimes and email can often be read ambiguously.
> 
> > > I assume each pool size would have an most recently used stack, 
only
> > > allocating a new pool once you've checked the existing 
> > pools for space?
> > 
> > I don't really understand the question :P
> > What do you mean as most recently used stack?
> 
> What happens when you deallocate/free an object? Lets say you 
allocated
> 10 pools (4k blocks) of 16 byte objects and then free a load of 
them. If
> you then wanted to allocate a new 16 byte object, how would that be
> done? I assume for fast allocation you would keep track of pools, 
and
> group them by size of object, so you could quickly find a pool
> containing 16 byte objects to allocate from. I was speculating that 
a
> set of stacks of pools might be a good way to organise this. When a 
new
> pool of a certain size is required it is pushed onto the stack and 
can
> be accessed quickly. However when deallocation takes place this 
might
> not be the best idea. I'd be interested to know how this is managed
> since you mention your free time was greatly reduced.
> 
> If you are allocating 100Mb then this would be beyond the scope of
> certain platforms and so a system like yours may not be as 
efficient or
> suitable for all applications. The dlmalloc system may be a better
> general solution?
> 
> Nick


12