in ,

Writing a Simple Garbage Collector in C, Hacker News


To begin, we need to write a memory allocator, or as we will be calling it, a malloc function. The simplest malloc implementations maintain a linked-list of free blocks of memory that can be partitioned and given out as needed. When a user requests a chunk of memory, a block of the right size is removed from the free list and returned. If no blocks of the right size exist, either a block of a larger size is partitioned into smaller blocks or more memory is requested from the kernel. Freeing a chunk of memory simply adds it back to the free list.

Each chunk of memory in the free list begins with a header describing the block. Our header will contain two fields, one indicating the size of the chunk and the second pointing to the next free block of memory:

typedefstruct(header{     unsignedintsize;     structheader*next; }header_t;

Using headers that are embedded in the memory we allocate is really the only sensible way of doing this, but it has the added benefit of automatically word-aligning the chunks, which is important.

Because we will need to keep track of the blocks of memory currently in use as well as the blocks that are not, we will have a used list in addition to a free list. Items will be added to the used list when they are removed from the free list, and vice-versa.

We are almost ready to complete the first step and write our malloc implementation. Before we do that, we first need to understand how to request memory from the kernel.

Dynamically allocated memory resides in the so-called heap, a section memory between the stack and the BSS (uninitialized data segment – all your global variables that have the default value of zero). The heap starts at a low address bordering the BSS and ends at the program break, which resides somewhere between the BSS and the stack. Attempting to access any memory between the stack and the break will cause an access violation (unless you access within the amount the stack can be extended by, but that’s a whole separate conversation). In order to obtain more memory from the kernel, we simply extend the break, thus allowing us to access more memory. To do this, we call the Unix sbrk system call, which extends the break by its argument and returns the address of the previous break on success, thus giving the program more memory. On failure, sbrk returns -1 casted to a void pointer, which is a terrible convention that no one likes.

We can use this knowledge to create two functions:morecoreandadd_to_free_list. In the case theta we are out of blocks in the free list, we will callmorecoreto request more memory. Since requesting the kernel for more memory is expensive, we will do it in page-size chunks. Knowing what a page is is not important right now, but a terse explanation is that it is the smallest unit of virtual memory that can be mapped to any particular location in physical memory. We will use the functionadd_to_free_listto do exactly what it sounds like.

staticheader_t(base;/ *Zero sized block to get us started.* /staticheader_t*freep=& base;/ *Points to first free block of memory.* /staticheader_t*usedp;/ *Points to first used block of memory.* // ** Scan the free list and look for a place to put the block. Basically, we're* looking for any block the to be freed block might have been partitioned from.* /staticvoidadd_to_free_list(header_t*bp) {     header_t*p;      for(p=freep;!(bp>p && bp  (next); p=p->next)         if(p>=p->next && (bp>p || bpnext) )             break;      if(bp   bp->size==p->next) {         bp->size =p->next->size;         bp->next=p->next->next;     }else        bp->next=p->next;      if(p   p->size==bp) {         p->size =bp->size;         p->next=bp->next;     }else        p->next=bp;      freep=p; }# defineMIN_ALLOC_SIZE4096/ *We allocate blocks in page sized chunks.* // ** Request more memory from the kernel.* /staticheader_t*morecore(size_tnum_units) {     void*vp;     header_t*up;      if(num_units>MIN_ALLOC_SIZE)         num_units=MIN_ALLOC_SIZE /sizeof(header_t);      if((vp=sbrk (num_units*sizeof(header_t)))==(void-1)         returnNULL;      up=(header_tvp;     up->size=num_units;     add_to_free_list (up);     returnfreep; }

Now that we have our two helper functions, writing our malloc function is pretty straight forward. We simply scan the free list and use the first block that is at least as big as the chunk we’re trying to find. Because we use the first block we find instead of trying to find a “better” block, this algorithm is known as first fit.

A quick note to clarify: the size field in the header struct is measured in header-sized blocks, and not bytes.

/ ** Find a chunk from the free list and put it in the used list.* /void*GC_malloc(size_talloc_size) {     size_tnum_units;     header_t*p, *prevp;      num_units=(alloc_size sizeof(header_t) - 1) /sizeof(header_t )   1;     prevp=freep;      for(p=prevp->next ;; prevp=p, p=p->next) {         if(p->size>=num_units) {/ *Big enough.* /            if(p->size==num_units)/ *Exact size.* /                prevp->next=p->next;             else{                 p->size -=num_units;                 p =p->size;                 p->size=num_units;             }              freep=prevp;              / *Add to p to the used list.* /            if(usedp==NULL)                 usedp=p->next=p;             else{                 p->next=usedp->next;                 usedp->next=p;             }              return(void(p   1);         }         if(p==freep) {/ *Not enough memory.* /            p=morecore (num_units);             if(p==NULL)/ *Request for more memory failed.* /                returnNULL;         }     } }

Although this code is not going to win any awards for low fragmentation, it’ll work. And if it works, that means we can finally get to the fun part – the garbage collection!

****************** (Read More) *******************************

What do you think?

Leave a Reply

Your email address will not be published. Required fields are marked *

GIPHY App Key not set. Please check settings

Antonio Brown Has Built the Biggest Brand in Sports – Too Bad It's Worthless, Crypto Coins News

Antonio Brown Has Built the Biggest Brand in Sports – Too Bad It's Worthless, Crypto Coins News

artagnon / clayoven, Hacker News

artagnon / clayoven, Hacker News