Custom Memory Allocator in C
This project is my introductory dive into manual memory management, something I had no prior experience with. Having previously worked with languages that use garbage collectors, managing memory and working with pointers were entirely new concepts for me. This project showcases my attempt to understand these concepts and implement a memory allocator inspired by C's built-in allocator.
The allocator is designed to handle dynamic memory requests from the program and manage memory efficiently. By tracking memory blocks through a linked list, the allocator can dynamically allocate, free, and reallocate memory while minimizing fragmentation.
For more information and to explore the codebase, please refer to the GitHub repository.
Project Overview
The memory allocator is designed to manage a single large pool of memory, dividing it into blocks as needed by the program. It dynamically allocates, frees, and reallocates memory by implementing C's well-known functions: malloc
, free
and realloc
. To organize the memory, a linked list is used to track both free and used blocks.
In the subsequent sections we will inspect the key components of the allocator, including how it utilizes the memory for organizing metadata, how linked lists are used to track memory blocks, and interesting code snippets from the codebase.
Key Components
The project is structured around several core components.
- Allocator Object (
Allocator
): The allocator source file contains key function implementations, such asallocator_malloc
,allocator_free
, andallocator_realloc
, along with important auxiliary functions that ensure the allocator works properly. Increate_allocator
, there is a single call to C'smalloc
to obtain the initial memory pool that the allocator will manage. This pool is then divided into two parts: a user pool and a reserved pool.The user pool contains blocks of allocated data, while the reserved pool holds metadata related to the allocator, such as the linked list, nodes, and member variables. Inspired by the heap-stack relationship, the reserved pool grows downwards, and the user pool grows upwards.
- Linked List for Memory Blocks (
LinkedList
): A linked list is utilized to keep track of memory blocks that are free or in use within the user pool.During allocator creation, the linked list instantiates a single node, which contains the entire user pool set to free. When the
allocator_malloc
function is called, depending on the search strategy, the allocator will go through the linked list to find a node with an available memory block. If found, it will split the node into two: one containing the allocated memory block, and the other containing the remaining free memory. - Data Object (
MemoryData
): Since the node implementation is generic, the payload of a node is aMemoryData
object. This object contains all the necessary information about the memory block the node holds, such as the start address of the memory block, its size, and whether it is free or allocated. - Static Variable (
current_alloc
): To avoid the need to pass a pointer to the allocator object as an argument when using the allocator functions, the static variablecurrent_alloc
is used instead. By using theset_allocator
andrelease_allocator
functions, any allocator function that requires the allocator object can retrieve it fromcurrent_alloc
.
Code Overview
In this section, we will highlight essential parts of the code that make up the memory allocator.
create_allocator.c
Script
The create_allocator
function is responsible for initializing the allocator, allocating the heap, and setting up the initial metadata. To set up the metadata, we must be diligent as we deal with pointer arithmetic. As previously mentioned, the reserved pool grows downward, starting from the highest memory. This is reflected in the following pseudocode of the create_allocator
function. When we instantiate an object in the reserved pool, we first have to make room for it by increasing the size of the reserved pool.
The align_size
function is simply turning the integer value into a multiple of eight to accommodate best practices in 64-bit systems, which organize memory in blocks of eight.
Allocator* create_allocator(size_t heap_size) {
// Realign to a factor of 8 for memory efficency
heap_size = align_size(heap_size);
// Utilize the built-in C malloc to acquire the managed heap
char* heap_start = (char*) malloc(heap_size);
if (heap_start == NULL) {
// Memory error from malloc()
return NULL;
}
// Initialize Allocator object at the end of the heap
char* heap_end = heap_start + heap_size;
Allocator* alloc = (Allocator*) (heap_end - align_size(sizeof(Allocator)));
// Set Allocator member variables (pseudocode)
// alloc->member_variable = ...;
// Increase the reserved pool to accommodate for the LinkedList
alloc->reserved_pool_border -= align_size(sizeof(LinkedList));
alloc->reserved_pool_size += align_size(sizeof(LinkedList));
// Initialize LinkedList object
LinkedList* list = (LinkedList*) alloc->reserved_pool_border;
// Set LinkedList member variables (pseudocode)
// list->head = ...;
// Set Allocator member variable
alloc->list = list;
// Initialize a Node referencing the entire user memory pool
void* memory_start = heap_start;
// Increase the reserved pool to accommodate for the MemoryData
alloc->reserved_pool_border -= align_size(sizeof(MemoryData));
alloc->reserved_pool_size += align_size(sizeof(MemoryData));
MemoryData* data = (MemoryData*) alloc->reserved_pool_border;
// Set MemoryData member variables (pseudocode)
// data->member_variable = ...;
// Increase the reserved pool to accommodate for the Node
alloc->reserved_pool_border -= align_size(sizeof(Node));
alloc->reserved_pool_size += align_size(sizeof(Node));
// Create Node containing the MemoryData
Node* node = (Node*) alloc->reserved_pool_border;
// Set Node member variables (pseudocode)
// node->member_variable = ...;
/*
* The Node is a generic structure that can carry different types of data.
* In this line, we set the payload of the Node to the previously created
* MemoryData, which stores information about the memory block.
*/
node->data = data;
// Add the newly created Node to the LinkedList
add(list, node);
return alloc;
}
allocator_malloc.c
Script
The allocator_malloc
function can be considered as having two parts. The first is concerned with finding a node that has an available memory block. Multiple search strategies are available, such as first-fit, best-fit, and next-fit. Once a suitable node has been found, we enter the second part. Here, we determine how to utilize the found node. In the implementation, if the block is larger than needed, it is split. The block is divided, allocating the required portion and leaving the remainder as a free block.
The following is the pseudocode for the allocator_malloc
function.
void* allocator_malloc(size_t required_size) {
// Realign to a factor of 8 for memory efficency
required_size = align_size(required_size);
// Search for a Node with an available memory block
Node* available_node = naive_search(required_size);
if (available_node == NULL) {
/*
* No available Node was found, perform memory
* pool cleansing (reduce fragmentation, remove unused metadata).
*/
cleanse_user_pool();
cleanse_reserved_pool();
// Try again to find an available memory block
available_node = naive_search(required_size);
if (available_node == NULL) {
// The managed heap is full
return NULL;
}
}
/*
* A Node has been found. Inspect the Node to see how to split
* up into allocated memory Node and residual memory Node.
*/
// Retrieve Node memory block data
MemoryData* available_data = (MemoryData*) available_node->data;
size_t node_block_size = available_data->block_size;
/*
* Logic for splitting up the Node into allocated memory
* and residual memory Nodes (pseudocode).
*/
// ...
// Return the pointer to the start of the allocated memory
return available_data->memory_start;
}
allocator_free.c
Script
Similar to the allocator_malloc
function, the allocator_free
function consists of two parts. Firstly, it verifies that the input pointer ptr
was previously allocated by the allocator. Thereafter, if a corresponding node is found, the function marks it as free and tries to merge the node with adjacent nodes that are also free in order to reduce memory fragmentation.
The following is the pseudocode for the allocator_malloc
function.
void allocator_free(void* ptr) {
if (current_alloc == NULL) {
// There is no Allocator object to process
return;
}
LinkedList* list = current_alloc->list;
LinkedListIterator iter;
iter.current = get_head(list);
// Search for the Node corresponding to 'ptr' in the LinkedList
Node* matched_node = NULL;
while (has_next(&iter)) {
Node* node = next(&iter);
MemoryData* data = (MemoryData*) node->data;
if (data->memory_start == ptr && !data->is_free) {
matched_node = node; // Node found
break;
}
}
if (!matched_node) {
// The Allocator did not allocate this pointer
return;
}
// Mark the memory block as free
MemoryData* matched_data = (MemoryData*) matched_node->data;
matched_data->is_free = true;
// Reset the iterator to look for adjacent free Nodes
iter.current = get_head(list);
while (has_next(&iter)) {
Node* node = next(&iter);
// Skip if its the matched Node itself
if (node->id == matched_node->id) { continue; }
MemoryData* data = (MemoryData*) node->data;
/*
* Check if the Node is adjacent to the freed memory block,
* and if it is free, merge the Nodes (pseudocode).
*/
if (data->is_free && is_adjacent(matched_data, data)) {
merge_meta_data_nodes(list, matched_node, node);
// Update the matched Node and memory boundaries (pseudocode)
// matched_memory_start = ...;
// matched_memory_end = ...;
}
}
}
Conclusion
This project serves as a starting point for exploring concepts such as manual memory management, pointer manipulation, and the structure of a simple memory allocator. In creating efficient allocators, the process can be viewed as a game of placement: how can we place allocated data most efficiently, with the constraint that once placed, it cannot be moved, only freed when no longer needed. Memory allocators focus on two important aspects: designing good search strategies, with the simplest being the naive first-fit search, and creating a solid allocator design that allows for efficient defragmentation. In this project, we touched on both aspects, making it a valuable entry point into memory allocation.
Future improvements could include developing a more specialized linked list and node structure, rather than relying on generic implementations. We could also use system calls like mmap, which would allow us to completely detach from C's built-in allocator, giving us full control. Furthermore, implementing segregated linked lists, where free nodes and nodes in use are kept in separate lists could improve efficiency.