Heap Memory Allocator in C Programming

PART2

Greetings, Systems devs!

In this article on Building our own Memory Allocator, we dive into malloc library function. Please first read PART1 to understand sbrk() system call provided by Linux.

malloc(size) function allocates the size bytes of memory in heap and returns a pointer to the allocated memory. Our simple malloc will look like:

void *malloc(size_t size) {
    void *block;
    block = sbrk(size);
    if(block == (void*)-1)
        return NULL;
    return block;
}

In the above code, we call sbrk() with the given size. On success, size bytes are allocated on the heap. That was easy, doesn't it?

The tricky part is freeing this memory. The free(ptr) function frees the memory block pointed to by ptr, which must have been returned by a previous call to malloc(), calloc() or realloc(). But to free a block of memory, the first order of business is to know the size of the memory block to be freed. In the current scheme of things, this is not possible as the size information is not stored anywhere. So, we will have to find a way to store the size of an allocated block somewhere.

Moreover, we need to understand that the heap memory the operating system has provided is contiguous. So we can only release memory which is at the end of the heap. We can’t release a block of memory in the middle to the OS.

To address this issue of not being able to release memory that’s not at the end of the heap, we will make a distinction between freeing memory and releasing memory. From now on, freeing a block of memory does not necessarily mean we release memory back to OS. It just means that we keep the block marked as free. This block marked as free may be reused on a later malloc() call. Since memory not at the end of the heap can’t be released, this is the only way ahead for us.

So now, we have two things to store for every block of allocated memory:

  1. size
  2. Whether a block is free or not-free?

To store this information, we will add a header to every newly allocated memory block. The header will look something like this:

struct header_t {
    size_t size;
    unsigned is_free;
 };

The idea is simple. When a program requests for size bytes of memory, we calculate total_size = header_size + size, and call sbrk(total_size). We use this memory space returned by sbrk() to fit in both the header and the actual memory block. The header is internally managed, and is kept completely hidden from the calling program.

Now, each one of our memory blocks will look like:

We can’t be completely sure the blocks of memory allocated by our malloc is contiguous. Imagine the calling program has a foreign sbrk(), or there’s a section of memory mmap()ed in between our memory blocks. We also need a way to traverse through our blocks for memory (why traverse? we will get to know when we look at the implementation of free()). So to keep track of the memory allocated by our malloc, we will put them in a linked list. Our header will now look like:

struct header_t {
    size_t size;
    unsigned is_free;
    struct header_t *next;
}

Now, the linked list of memory blocks looks like this:

Now, let’s wrap the entire header struct in a union along with a stub variable of size 16 bytes. This makes the header end up on a memory address aligned to 16 bytes. Recall that the size of a union is the larger size of its members. So the union guarantees that the end of the header is memory aligned. The end of the header is where the actual memory block begins and therefore the memory provided to the caller by the allocator will be aligned to 16 bytes.

typedef char ALIGN[16];

union header {
    struct {
        size_t size;
        unsigned is_free;
        union header* next;
    }s;
    ALIGN stub;
};

typedef union header header_t;

We will have a head and tail pointer to keep track of the list.

header_t *head, *tail;

To prevent two or more threads from concurrently accessing memory, we will put a basic locking mechanism in place. We’ll have a global lock, and before every action on memory you have to acquire the lock, and once you are done you have to release the lock.

pthread_mutex_t global_malloc_lock;

Our malloc is now modified to:

void *malloc(size_t size) {
    size_t total_size;
    void *block;
    header_t *header;
    if(!size)
        return NULL;
    pthread_mutex_lock(&global_malloc_lock);
    header = get_free_block(size);
    if(header) {
        header->s.is_free = 0;
        pthread_mutex_unlock(&global_malloc_lock);
        return (void*)(header+1);
    }
    total_size = sizeof(header_t) + size;
    block = sbrk(total_size);
    if(block == (void*)-1) {
        pthread_mutex_unlock(&global_malloc_lock);
        return NULL;
    }
    header = block;
    header->s.size = size;
    header->s.is_free = 0;
    header->next = NULL;

    if(!head)
        head = header;
    if(tail)
        tail->s.next = header;
    tail = header;
    pthread_mutex_unlock(&global_malloc_lock);
    return (void*)(header+1);
}

header_t *get_free_block(size_t size) {
    header_t *curr = head;
    while(curr) {
        if(curr->s.is_free && curr->s.size >= size)
            return curr;
        curr = curr->s.next;
    }
    return NULL;
}

We check if the requested size is zero. If it is, then we return NULL.
For a valid size, we first acquire the lock. The we call get_free_block() - it traverses the linked list and see if there already exist a block of memory that is marked as free and can accomodate the given size. Here, we take a first-fit approach in searching the linked list.

If a sufficiently large free block is found, we will simply mark that block as not-free, release the global lock, and then return a pointer to that block. In such a case, the header pointer will refer to the header part of the block of memory we just found by traversing the list. Remember, we have to hide the very existence of the header to an outside party. When we do (header + 1), it points to the byte right after the end of the header. This is incidentally also the first byte of the actual memory block, the one the caller is interested in. This is cast to (void*) and returned.

If we have not found a sufficiently large free block, then we have to extend the heap by calling sbrk(). The heap has to be extended by a size that fits the requested size as well a header. For that, we first compute the total size: total_size = sizeof(header_t) + size;. Now, we request the OS to increment the program break: sbrk(total_size).

In the memory thus obtained from the OS, we first make space for the header. In C, there is no need to cast a void* to any other pointer type, it is always safely promoted. That’s why we don’t explicitly do: header = (header_t *)block;
We fill this header with the requested size (not the total size) and mark it as not-free. We update the next pointer, head and tail so to reflect the new state of the linked list. As explained earlier, we hide the header from the caller and hence return (void*)(header + 1). We make sure we release the global lock as well.

That's it for today. We will cover free(), calloc(), and realloc() functions in the next writeup.

Thank you for reading. All comments and feedback is welcome.

Happy learning!

References

https://wiki.osdev.org/Memory_Allocation

https://www.jamesgolick.com/2013/5/15/memory-allocators-101.html