Table Garbage Collection

classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|

Table Garbage Collection

Nick Nugent
I'm trying to wrap my head around some garbage collection basics,
here, and I'm having a hard time really getting to the bottom of
things.  I'm hoping someone can explain (or point to an explanation
for) the following behavior.  For the record, I am using this example
with the Lua 5.1.1 standalone interpreter on Linux.

Consider the following chunk:

-- BEGIN
collectgarbage("collect");
print(collectgarbage("count"));

do
	for i = 1, 1000000 do
		local t = {};
	end
end

print(collectgarbage("count"));
collectgarbage("collect");
print(collectgarbage("count"));

io.read();
-- END

In this case, everything works as I would expect.  The output shows
moderate memory growth before the final garbage collection and then a
return to the VM's initial size.

If, however, I replace the contents of the do block with the
following, I see something very different.

-- BEGIN
do
	local g = {};
	for i = 1, 1000000 do
		local t = {};
		g[i] = t;
	end
end
-- END

Now, as I understand it, at the end of this block, all the strong
references should be out of scope, and the garbage collector should be
able to reap everything I've just done.  The script output actually
appears to agree, showing significant growth before the collection,
and a return to the initial size again afterward.  In this second
case, however, in contrast to the first, inspecting the size of the
interpreter's process in the host operating system shows that very
little of the total memory allocated by this script is ever freed.

Am I missing something glaring in either my script or my analysis?
I've searched pretty significantly online and in the list archives for
the answer, but I haven't stumbled upon it yet, so I thought I'd ask.

Thanks.

--Nick

Reply | Threaded
Open this post in threaded view
|

Re: Table Garbage Collection

Javier Guerra Giraldez
On Wednesday 22 November 2006 4:32 pm, Nick Nugent wrote:
> case, however, in contrast to the first, inspecting the size of the
> interpreter's process in the host operating system shows that very
> little of the total memory allocated by this script is ever freed.

in Linux (like any Unix, AFAIK), it's very rare for a process to return memory 
to the OS; any free()'d memory is returned to an in-process free list.  the 
process size (as reported by /proc, ps, top, and friends) is roughly the max 
size it has reached.

-- 
Javier

Attachment: pgpw8Sal6AgE8.pgp
Description: PGP signature

Reply | Threaded
Open this post in threaded view
|

Re: Table Garbage Collection

Rob Kendrick
On Wed, 2006-11-22 at 16:40 -0500, Javier Guerra wrote:
> On Wednesday 22 November 2006 4:32 pm, Nick Nugent wrote:
> > case, however, in contrast to the first, inspecting the size of the
> > interpreter's process in the host operating system shows that very
> > little of the total memory allocated by this script is ever freed.
> 
> in Linux (like any Unix, AFAIK), it's very rare for a process to return memory 
> to the OS; any free()'d memory is returned to an in-process free list.  the 
> process size (as reported by /proc, ps, top, and friends) is roughly the max 
> size it has reached.

Additionally, the reason for this is that the memory is allocated
logically in a flat space, so if you have one block still allocated
right at the end, but megabytes of free memory before it, this can't
easily be returned to the OS if the memory was allocated by brk() (which
extends the process's memory.)

What happens may be very different if the memory was allocated via
mmap(), however.

B.


Reply | Threaded
Open this post in threaded view
|

Re: Table Garbage Collection

Nick Nugent
On 11/22/06, Rob Kendrick wrote:
On Wed, 2006-11-22 at 16:40 -0500, Javier Guerra wrote:
> in Linux (like any Unix, AFAIK), it's very rare for a process to return memory
> to the OS; any free()'d memory is returned to an in-process free list.  the
> process size (as reported by /proc, ps, top, and friends) is roughly the max
> size it has reached.

Additionally, the reason for this is that the memory is allocated
logically in a flat space, so if you have one block still allocated
right at the end, but megabytes of free memory before it, this can't
easily be returned to the OS if the memory was allocated by brk() (which
extends the process's memory.)

What happens may be very different if the memory was allocated via
mmap(), however.

All right, that is a satisfying if frustrating answer.  I've verified
that the interpreter does end up calling brk(), of course, in which
case it would all seem to make sense.

I came across this trying to write some benchmark and memory leak
analysis utilities for a Lua-fronted package using the C API.  The
real application would likely never grow so large in the first place,
as it would be garbage collecting along the way.  It seems it is not
quite appropriate to benchmark the garbage collector in this manner,
and that what I thought was a red flag turns out to be a red herring
instead.

As an interesting follow up, a second loop that allocates out of the
"free" region as before, though not quite as much as the first, which
is then garbage collected again returns the VM almost to its original
size.  Weird.

In any case, thanks for the help.

--Nick

Reply | Threaded
Open this post in threaded view
|

Re: Table Garbage Collection

Sam Roberts-2
On Wed, Nov 22, 2006 at 02:33:34PM -0800, Nick Nugent wrote:
> >What happens may be very different if the memory was allocated via
> >mmap(), however.

Fwiw, its easy to replace the allocator used by lua, so if you found an
allocator that used mmap, and was better at returning memory for the
usage profile of a GCing interpreter like lua, that could be used.

On the other hand, maybe it doesn't matter  that the heap is large, the
pages will get swapped out, and if you never touch them, who cares?

Sam



Reply | Threaded
Open this post in threaded view
|

Re: Table Garbage Collection

Nick Nugent
On 11/22/06, Sam Roberts wrote:
On the other hand, maybe it doesn't matter  that the heap is large, the
pages will get swapped out, and if you never touch them, who cares?

Unless you're embedded and have no swap.  :(.

--Nick

Reply | Threaded
Open this post in threaded view
|

Re: Table Garbage Collection

Glenn Maynard
In reply to this post by Sam Roberts-2
On Wed, Nov 22, 2006 at 05:18:42PM -0800, Sam Roberts wrote:
> On the other hand, maybe it doesn't matter  that the heap is large, the
> pages will get swapped out, and if you never touch them, who cares?

You're going to touch them the next time you allocate more memory, and
the swapper--not knowing that memory was stale--will swap the old data
back in; whatever the user is doing that needs that memory will be
blocked while that happens.  I care if applications are wasting my
system's performance by increasing swap file activity.

By the way, it might be possible to improve Lua's memory use by having
the allocation function use a C++ allocator.  Knowing the size on
deallocation can give better characteristics than malloc/free can do,
especially for small allocations (eg. 4-byte allocations that really use
4 bytes).  I don't know if it would have an appreciable impact on total
memory use by Lua.

-- 
Glenn Maynard

Reply | Threaded
Open this post in threaded view
|

Re: Table Garbage Collection

Glenn Maynard
In reply to this post by Nick Nugent
On Wed, Nov 22, 2006 at 05:24:41PM -0800, Nick Nugent wrote:
> On 11/22/06, Sam Roberts wrote:
> >On the other hand, maybe it doesn't matter  that the heap is large, the
> >pages will get swapped out, and if you never touch them, who cares?
> 
> Unless you're embedded and have no swap.  :(.

I don't agree with "who cares", but it doesn't matter so much for
embedded use.  On a PC, it's nice to let go of memory when you're
done with it, so other applications can use it--but embedded, with
(usually) no other applications running, there's nothing to return
memory to and no other applications that could be using it, so the
problem sort of doesn't exist.

-- 
Glenn Maynard

Reply | Threaded
Open this post in threaded view
|

Re: Table Garbage Collection

David Given
In reply to this post by Rob Kendrick
Rob Kendrick wrote:
[...]
> Additionally, the reason for this is that the memory is allocated
> logically in a flat space, so if you have one block still allocated
> right at the end, but megabytes of free memory before it, this can't
> easily be returned to the OS if the memory was allocated by brk() (which
> extends the process's memory.)
> 
> What happens may be very different if the memory was allocated via
> mmap(), however.

glibc's allocator does use mmap() for large allocations, but that's not
particularly frequent (especially for Lua, which does lots of small allocations).

Beware: mmap() can only allocate complete pages. On Linux on IA-32, a page is
4kB, which means a 100 byte memory allocation will waste 3996 bytes of memory.
But on IA-64, a page is 4MB! Which means you're wasting a lot more.

A true embedded system typically doesn't do any dynamic memory allocation ---
the C or C++ part of the system will used statically allocated memory, and the
Lua part of the system will allocate memory out of a fixed-size static pool,
so this isn't normally a problem. It's only the medium-sized systems which are
bigger than embedded but too small for swap that suffer...

-- 
âââDavid GivenâââMcQââ "A line dancer near a graduated cylinder, the
ââ [hidden email]âââââ blithe spirit inside some wheelbarrow, and a tomato
ââ([hidden email])ââ are what made America great!" --- received via spam
âââwww.cowlark.comââââ

Attachment: signature.asc
Description: OpenPGP digital signature

Reply | Threaded
Open this post in threaded view
|

Re: Table Garbage Collection

Michael Broughton
I think IA-64 supports a few page sizes, but none of them are above 64K, as far as I know.

I have heard that Windows can use 'super pages' which are 4M, but I don't think that is the normal case for user space.

Mike


David Given wrote:
Rob Kendrick wrote:

But on IA-64, a page is 4MB! Which means you're wasting a lot more.


Reply | Threaded
Open this post in threaded view
|

Re: Table Garbage Collection

Michael Broughton
Sorry, looked it up and IA-64 does support page sizes larger than 64K. Still, I don't know that pages that big are normal used.

Mike



Michael Broughton wrote:
I think IA-64 supports a few page sizes, but none of them are above 64K, as far as I know.

I have heard that Windows can use 'super pages' which are 4M, but I don't think that is the normal case for user space.

Mike


David Given wrote:
Rob Kendrick wrote:

But on IA-64, a page is 4MB! Which means you're wasting a lot more.



Reply | Threaded
Open this post in threaded view
|

Re: Table Garbage Collection

David Given
In reply to this post by Michael Broughton
Michael Broughton wrote:
> I think IA-64 supports a few page sizes, but none of them are above 64K, 
> as far as I know.
> 
> I have heard that Windows can use 'super pages' which are 4M, but I 
> don't think that is the normal case for user space.

Whoops. You're right; I'd misread the documentation and conflated IA64 pages
with 'huge pages', which is a Linux kernel feature to save TLB space; plus,
IA32 can be configured to use 4MB small pages instead of 4kB ones, which must
be where Windows super pages come from.

-- 
âââDavid GivenâââMcQââ "A line dancer near a graduated cylinder, the
ââ [hidden email]âââââ blithe spirit inside some wheelbarrow, and a tomato
ââ([hidden email])ââ are what made America great!" --- received via spam
âââwww.cowlark.comââââ

Attachment: signature.asc
Description: OpenPGP digital signature

Reply | Threaded
Open this post in threaded view
|

Re: Table Garbage Collection

Rici Lake-2
In reply to this post by Glenn Maynard

On 22-Nov-06, at 9:06 PM, Glenn Maynard wrote:

On Wed, Nov 22, 2006 at 05:18:42PM -0800, Sam Roberts wrote:
On the other hand, maybe it doesn't matter that the heap is large, the
pages will get swapped out, and if you never touch them, who cares?

You're going to touch them the next time you allocate more memory, and
the swapper--not knowing that memory was stale--will swap the old data
back in; whatever the user is doing that needs that memory will be
blocked while that happens.  I care if applications are wasting my
system's performance by increasing swap file activity.

Some Unix systems have a way of telling the kernel that memory is stale, through the madvise or posix_madvise interface. However, the only Unix I work with a lot that really implements this feature in a way that malloc() can make use of it is FreeBSD, which provides the MADV_FREE advice option to madvise. Linux, as I understand it, will interpret MADV_DONTNEED to mean that a page of zeros can be provided instead of swapping back in, if the page was originally allocated through mmap with anonymous backing store.

The FreeBSD standard malloc() implementation can be configured to use this feature but it is rarely a win; although it does reduce swapping, it also increases overall overhead because of increased bookkeeping on page tables.

In my experience, swapping out is more expensive than swapping in; in the case where large regions can be marked stale, the FreeBSD facility can speed things up quite a bit, but that is not very common.

glibc malloc()'s can be configured to return memory to the operating system by providing a negative argument to sbrk(); this can only be done if the memory to be returned is at the end of the heap. See mallopt() for details (there's no manpage for it in the Linux system I have access to right now, but it is documented on the glibc info pages at http://www.gnu.org/software/libc/manual/html_node/Malloc-Tunable- Parameters.html


By the way, it might be possible to improve Lua's memory use by having
the allocation function use a C++ allocator.  Knowing the size on
deallocation can give better characteristics than malloc/free can do,
especially for small allocations (eg. 4-byte allocations that really use
4 bytes).  I don't know if it would have an appreciable impact on total
memory use by Lua.

FreeBSD's standard malloc does not store allocation sizes in the allocated block; all allocations on a given page are the same size. (That's slightly imprecise, the details are available in the documentation.) This can be a big win, but it also can be a memory hog; it means that 40-byte allocations occupy 64 bytes of memory, instead of 48 bytes.


Reply | Threaded
Open this post in threaded view
|

Re: Table Garbage Collection

Taj Khattra
On 11/23/06, Rici Lake <[hidden email]> wrote:
See mallopt() for details (there's no manpage for it in the Linux system I
have access to right now, but it is documented on the glibc info pages
at
http://www.gnu.org/software/libc/manual/html_node/Malloc-Tunable-Parameters.html

btw, these options can also be set using environment variables
MALLOC_MMAP_MAX_, MALLOC_MMAP_THRESHOLD_ and MALLOC_TRIM_THRESHOLD_,
which can be useful when used in combination with strace.

Reply | Threaded
Open this post in threaded view
|

Re: Table Garbage Collection

Dave Dodge
In reply to this post by David Given
On Thu, Nov 23, 2006 at 05:01:20PM +0000, David Given wrote:
> Whoops. You're right; I'd misread the documentation and conflated IA64 pages
> with 'huge pages', which is a Linux kernel feature to save TLB space;

Just a data point: on my Linux/IA-64 system, normal page size is 16K
and hugepage size is 256M.  Those are the defaults that SGI's kernel
provides.

Hugepages can be a hugepain to deal with, though there is a relatively
new libhugetlbfs which supposedly simplifies things and allows
malloc() to use them.

                                                  -Dave Dodge