Heap Memory Allocator in C Programming

PART 1

Welcome Reader.

Hope you're doing well. If you are a creature, curious in the area of System Programming like me, hope you enjoy reading this article.

Today I am all set to explore a topic of Memory allocation, which is very much familiar to every one who came around the concept of Dynamic Memory Allocation in our C Programming. Actually, to those who likes playing around C, this concept can be a more challenging as well as more rewarding in terms of learning.

As an Operating Systems enthusiast, for me, It's really important to understand how things works at a low level as it helps me improve my skills as a Programmer and plays vital role in avoiding at least few mistakes that could lead to security problems in my code.

We know that, we use C library functions malloc(), calloc(), realloc(), and free() for Dynamic Memory allocation. I want to explore how these library functions implemented at a system level. As a side note, this is neither fast nor efficient but it's a memory allocator that works, coding just for learning.

Have some fun, just to learn!

Now, to the Actual stuff.

A memory allocator's responsibility is to manage free blocks of memory. If you've never read a malloc implementation, you may have assumed that calling free simply causes memory to be released to the operating system. But acquiring memory from the OS has a cost, so allocators tend to keep free chunks around for a while for possible re-use before deciding to release them.

Every process runs within its own virtual address space that’s distinct from the virtual address spaces of other processes. This address space consists of 5 sections as shown in below image.

Memory layout of a Program

MemoryLayout.png

Text Section : It is the space used for storing the binary instructions to be executed by the Processor.

Data Section : Used to store non-zero initialized static data.

Block Started by Symbol(BSS) : Contains zero-initialized static data. Static data uninitialized in program is initialized 0 and goes here.

Heap : It contains dynamically allocated data.

Stack : It contains all the automatic variables, function parameters, base pointer copy, etc.

As you can see in the image, the stack and the heap grow in the opposite directions. Sometimes the data, bss and heap sections are collectively referred to as the “data segment”, the end of which is demarcated by a pointer named program break or brk.

That is, brk points to the end of the heap.

Now if we want to allocate more memory in the heap, we need to request the system to increment brk. Similarly, to release memory we need to request the system to decrement brk.

Assuming we run Linux, we can make use of sbrk() system call that lets us manipulate the program break.

There are better alternatives like mmap() available today. sbrk() is not really thread safe. It can can only grow or shrink in LIFO order. However, the glibc implementation of malloc still uses sbrk() for allocating memory that’s not too big in size. So, we will go ahead with sbrk() for our simple memory allocator.

sbrk() example usage:

sbrk(0) gives the current address of program break.

sbrk(x) with a positive value increments brk by x bytes, as a result allocating memory.

**sbrk(-x) *with a negative value decrements brk by x bytes, as a result releasing memory. On failure, sbrk() returns (void) -1.

That's it for now.

Will continue this series to cover the remaining interesting topics including:

PART 2 : Our way of malloc() implementation

PART 3: Our way of free(), calloc(), and realloc() implementation

I am really excited to explore these implementations, and learn to code few things on the go.

Keep Learning!

Signing out for now....

Bhanuprakash Eagala