C programming | Working with pointers
Accessing memory locations is one of the greatest features of the C programming language, although it requires some responsibility. The word pointer often scares programmers away, but it shouldn't.
Pointers support dynamic memory allocation, give finer control over how data flows through a program, and sit close to the hardware, which can make code more efficient.
What are pointers?
In C programming, every variable holds its value at a specific memory address. A pointer is a variable that holds the address of another variable or function, and its type records what kind of object lives at that address. This gives direct access to memory locations anywhere in the program's address space.
Given the variable var, &var is a pointer to var.
You can think about pointers and variables like license plates and vehicles. While vehicles come in many types and shapes, license plates usually come in a unified form.
How pointers work
With pointers it's possible to access any memory location and change the data contained at that location.
A pointer is declared by adding an asterisk (*) in front of the variable name in the declaration statement.
It's strongly recommended to initialize pointers to NULL: a newly created local pointer isn't initialized, so it holds an indeterminate value that could point anywhere in memory.
int variable_name = 8;          // define a variable
int *variable_pointer = NULL;   // define a pointer to a variable
NULL is a macro for the null pointer, conventionally address 0. A null pointer never points to a valid object, so it must not be dereferenced. It can be defined like this:
#define NULL ((void*)0)
— In order to work with declared pointers, we have two basic pointer operators:
&
Address-of operator. Given a variable, it yields the memory address of that variable.
/* we pass a variable asking for its memory address */
printf("%p\n", (void *)&variable_name);
/* the program should print a memory address, for example 0xfbee324b */
*
Content of a memory address. Given a pointer, it yields the value stored at the address the pointer holds. This is usually called pointer dereferencing.
/* make our pointer point to the address of the given variable */
variable_pointer = &variable_name;
/* we dereference the pointer asking for the value it points to */
printf("%d\n", *variable_pointer);
/* the program should print the content of the memory address: 8 */
Let's make a quick reminder of how to work with simple pointers:
#include <stdio.h>

int main(void)
{
    /* define a variable for a number */
    int num;
    /* define a pointer to num */
    int *int_ptr = NULL;

    /* add a value to num */
    num = 14;
    /* now make the pointer point to num. This assigns the address of num to int_ptr */
    int_ptr = &num;

    /* let's check what value each variable contains */
    printf("num = %d\n", num);
    printf("&num = %p\n", (void *)&num);
    printf("int_ptr = %p\n", (void *)int_ptr);
    printf("*int_ptr = %d\n", *int_ptr);
    printf("&int_ptr = %p\n", (void *)&int_ptr);

    /* int_ptr points to num, so writing through *int_ptr modifies num too */
    *int_ptr = 8;
    printf("modified *int_ptr = %d modified num to num = %d\n", *int_ptr, num);
    return 0;
}
The result of that program should be similar to this:
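num = 14
&num = 0x6fff86d087a5
int_ptr = 0x6fff86d087a5
*int_ptr = 14
&int_ptr = 0x6fff86d08798
modified *int_ptr = 8 modified num to num = 8
Note that &int_ptr differs from &num: the pointer is a variable of its own, stored at its own address.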
Pointer utilities
We've seen a quick refresh of how pointers work. Now let's take a look at some options pointers give to us.
— We can have multiple pointers pointing to the same variable.
int main(void)
{
    int num;
    int *first_ptr;
    int *second_ptr;

    num = 14;
    first_ptr = &num;
    second_ptr = first_ptr;
    return 0;
}
Since first_ptr and second_ptr both hold the address of num, dereferencing either one reads or writes the same value.
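A minimal sketch of what that aliasing means in practice. The function name alias_demo is made up for this illustration: a write through one pointer is visible through the other pointer and through the variable itself.

```c
#include <stdio.h>

/* Hypothetical helper: writes through one alias, reads through the other. */
int alias_demo(void)
{
    int num = 14;
    int *first_ptr = &num;
    int *second_ptr = first_ptr;   /* both now hold the address of num */

    *second_ptr = 20;              /* write through the second alias */

    printf("num = %d, *first_ptr = %d\n", num, *first_ptr);
    return *first_ptr;             /* reads the value written above */
}
```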
— We can pass pointers as function arguments.
Passing data using a pointer allows the function to modify the external data.
If we try to do the same with data passed as values instead of pointers then we only modify the function parameter, and not the original value since the addresses of the parameter and the variable in main are not the same.
#include <stdio.h>

void ModifyData(int *data);

int main(void)
{
    int externalData = 10;

    printf("\nExternal data value before being modified is %d", externalData);
    ModifyData(&externalData);
    printf("\nExternal data value after being modified is %d", externalData);
    return 0;
}

void ModifyData(int *data)
{
    *data = 0;
}
The result should be:
External data value before being modified is 10
External data value after being modified is 0
However, we can't change the caller's pointer itself, since the function receives a copy of the pointer.
A common practical example using pointers as function parameters is a swap function.
void SwapFloat(float *a, float *b)
{
    float tmp = *a;
    *a = *b;
    *b = tmp;
}
— We can pass pointer to a pointer as a function argument.
This way the function can modify the original pointer rather than its copy, just as passing the address of a variable in the previous example let the function modify the variable.
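The difference can be sketched with two small functions. The names reset_copy, reset_original and fallback are assumptions made for this illustration, not part of any library.

```c
static int fallback = -1;   /* hypothetical default target */

/* Receives a copy of the pointer: reassigning it changes only the copy. */
void reset_copy(int *ptr)
{
    ptr = &fallback;
    (void)ptr;              /* silence "set but unused" warnings */
}

/* Receives the address of the pointer: the caller's pointer is updated. */
void reset_original(int **ptr)
{
    *ptr = &fallback;
}
```

Given int value = 10; int *p = &value; a call to reset_copy(p) leaves p pointing at value, while reset_original(&p) makes p point at fallback.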
— We can return pointers.
This is pretty straightforward. We have to declare the return type to be a pointer to the appropriate data type.
int *RoundFloat(float *num);
An example implementing the function:
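#include <stdio.h>
#include <stdlib.h>

int *RoundFloat(float *num);

int main(void)
{
    float fnum = 5.23f;
    int *frounded;

    frounded = RoundFloat(&fnum);
    if (frounded == NULL)
        return 1;
    printf("rounded value from %f is %d.", fnum, *frounded);
    free(frounded);
    return 0;
}

int *RoundFloat(float *num)
{
    /* the result must outlive this function, so it lives on the heap */
    int *tmp = malloc(sizeof *tmp);
    if (tmp != NULL)
        *tmp = (int)(*num + 0.5f);   /* rounds non-negative input */
    return tmp;
}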
— We can create function pointers.
A function pointer is a variable that stores the address of a function to be used later in the program.
typedef float (*OperationsTable)(float, float);
When we call a function, we might need to pass it the data to process along with pointers to subroutines that determine how it processes that data.
typedef float (*OperationsTable)(float, float);

float Add(float x, float y) { return x + y; }
float Sub(float x, float y) { return x - y; }

float Operate(OperationsTable opTable, float x, float y)
{
    return opTable(x, y);
}

int main(void)
{
    int a = 5;
    int b = 10;

    Operate(Add, a, b);   /* returns 15.0f */
    return 0;
}
Another option is to store function pointers in arrays and later call the functions using the array index notation.
float Add(float x, float y) { return x + y; }
float Sub(float x, float y) { return x - y; }

float (*OperationsTable[2])(float, float) = { Add, Sub };

int main(void)
{
    int a = 5;
    int b = 10;

    OperationsTable[0](a, b);   /* calls Add */
    return 0;
}
— We can use pointers with structs.
Normally we access struct members with a dot (.), but when we hold a pointer to a struct, we access its members using the arrow operator (->).
Note that it's still possible to use a dot, but then the member access is written as (*foo).variable.
#include <stdio.h>

typedef struct Vector3 {
    int x;
    int y;
    int z;
} Vector3;

int main(void)
{
    Vector3 origin = {3, 5, 10};
    Vector3 target;
    Vector3 *point = &target;   /* point must refer to real storage before use */

    point->x = origin.x;
    point->y = 0;
    point->z = origin.z;
    printf("\npoint values are x = %d | y = %d | z = %d", (*point).x, (*point).y, (*point).z);
    return 0;
}
— We can define strings.
There's no built-in "string" type in C. Strings in C are arrays of characters terminated with a NUL character (represented as \0).
char *title = "unixworks";
The array way of creating a string literal would be:
char title[] = "unixworks";
Which is the equivalent to:
char title[] = {'u', 'n', 'i', 'x', 'w', 'o', 'r', 'k', 's', '\0'};
Note that a string created through the pointer approach must not be modified later: it points to a string literal, which is supposed to be treated as const, and writing to it is undefined behavior.
However, we can still work with the string pointer as an array. Dereferencing it (*title, or title[0]) yields the first character, since the variable points exactly to the beginning of the string.
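As a rough sketch of reading a string through its pointer, the two helpers below (count_chars and first_char, both names invented for this example) walk the characters until the terminating NUL and index the pointer like an array.

```c
#include <stddef.h>

/* Hypothetical helper: counts characters by walking the pointer until NUL. */
size_t count_chars(const char *text)
{
    const char *cursor = text;
    while (*cursor != '\0') {
        cursor++;
    }
    return (size_t)(cursor - text);
}

/* Indexing a pointer works exactly like indexing an array. */
char first_char(const char *text)
{
    return text[0];   /* same as *text */
}
```

The parameters are declared const char * so the compiler rejects accidental writes to a string literal.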
Pointers and arrays
Although pointers and arrays aren't the same thing, they work hand in hand in C. In most expressions, the name of an array is converted to a pointer to its first element.
- An array notation like array[index] can be achieved using pointers with *(array + index).
- In the same way, the array notation &array[index] can be achieved using the pointer notation array + index.
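Both equivalences can be checked with a short sketch. The function names sum_with_pointers and same_address are assumptions made for this example.

```c
#include <stddef.h>

/* Sums an array using pointer notation only. */
int sum_with_pointers(const int *array, size_t length)
{
    int total = 0;
    for (size_t index = 0; index < length; index++) {
        total += *(array + index);   /* same as array[index] */
    }
    return total;
}

/* Returns 1 when both notations yield the same address, 0 otherwise. */
int same_address(const int *array, size_t index)
{
    return &array[index] == array + index;
}
```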
Arrays in C need to have their size declared when we create them, or at least that's what we are told when learning. Other programming languages offer dynamic arrays whose size doesn't have to be declared up front.
The fact is that we can create dynamic arrays in C by combining pointers and arrays. Managing memory at run time is extremely useful for arrays whose size is only known while the program runs.
The only prerequisite for creating a dynamic array using pointers is to reserve memory for it. That is achieved by calling malloc().
int *num_ptr;
num_ptr = (int *)malloc(MAX_NUMBERS * sizeof(int));
where:
- (int *) casts the returned void * to the element type. The cast is optional in C, but required if the code is compiled as C++.
- MAX_NUMBERS can be whatever value determines the maximum number of elements in the array.
- sizeof(int) is the number of bytes that each element in the array holds.
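Putting the allocation to work, a minimal sketch could look like this. The helper make_squares is a hypothetical name, and the element count is assumed to be known only at run time.

```c
#include <stdlib.h>

/* Hypothetical helper: builds an array of `count` squares on the heap.
   Returns NULL if count is 0 or the allocation fails.
   The caller owns the result and must free() it. */
int *make_squares(size_t count)
{
    if (count == 0) {
        return NULL;
    }
    /* sizeof *squares follows the element type; C needs no cast here */
    int *squares = malloc(count * sizeof *squares);
    if (squares == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < count; i++) {
        squares[i] = (int)(i * i);   /* indexed like a regular array */
    }
    return squares;
}
```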
Dynamic array of void pointers
A useful utility we can build by mixing pointers and arrays is a dynamic array of void pointers.
We start by defining a struct as follows:
typedef struct Set {
    void **data;
    int capacity;
    int count;
} *Set_t;
Where we have:
- void **data, the void pointers stored as a dynamic array. When used in a pointer, void defines a generic pointer (a pointer to data of any type).
- capacity, which is the total number of items currently allowed.
- count, which is the current number of items. It acts as an index for the stored data.
We can initialize our Set structure with the same approach used for dynamic arrays, using malloc():
Set_t set = malloc(sizeof(struct Set));
*set = (struct Set) {
    .count = 0,
    .capacity = 1,
    .data = malloc(sizeof(void *))
};
If we want to add data to our set, we can increase the count value of our struct:
set->count += 1;
and compare it against the capacity value.
if (set->count == set->capacity) {
    set->capacity *= 2;
    set->data = realloc(set->data, set->capacity * sizeof(void *));
}
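The growth step can be wrapped into a complete add operation. The sketch below is one possible arrangement, not the only one: SetCreate and SetAdd are hypothetical names, and this version checks the capacity before storing the value instead of after incrementing count. It also keeps the old buffer when realloc fails.

```c
#include <stdlib.h>

typedef struct Set {
    void **data;
    int capacity;
    int count;
} *Set_t;

/* Hypothetical constructor; returns NULL on allocation failure. */
Set_t SetCreate(void)
{
    Set_t set = malloc(sizeof(struct Set));
    if (set == NULL) {
        return NULL;
    }
    set->count = 0;
    set->capacity = 1;
    set->data = malloc(sizeof(void *));
    if (set->data == NULL) {
        free(set);
        return NULL;
    }
    return set;
}

/* Hypothetical helper: appends value, doubling the storage when full.
   Returns 0 on success, -1 if realloc fails (the set is left unchanged). */
int SetAdd(Set_t set, void *value)
{
    if (set->count == set->capacity) {
        int new_capacity = set->capacity * 2;
        void **grown = realloc(set->data, (size_t)new_capacity * sizeof(void *));
        if (grown == NULL) {
            return -1;
        }
        set->data = grown;
        set->capacity = new_capacity;
    }
    set->data[set->count] = value;
    set->count += 1;
    return 0;
}
```

Assigning the result of realloc to a temporary first avoids losing the original buffer if the call returns NULL.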
As with most data structures, this dynamic array of pointers becomes useful when we create some functions to work with it.
As an example we can make a function to get a value from an index of the set, and another one to check if a value is contained in an index of the set.
#include <stdio.h>
#include <stdlib.h>

void *IndexValue(Set_t set, int index)
{
    /* valid indexes run from 0 to count - 1 */
    if (index < 0 || index >= set->count) {
        printf("Index is out of bounds.\n");
        exit(1);
    }
    return set->data[index];
}

void SetContains(Set_t set, void *value)
{
    for (int i = 0; i < set->count; i++) {
        if (IndexValue(set, i) == value) {
            printf("Value is in the set.\n");
            return;   /* stop at the first match */
        }
    }
    printf("Value is not in the set.\n");
}
Linked lists
Arrays are fine, but they can be inefficient depending on the program and the target device architecture.
A linked list is a data structure that, instead of requesting one large contiguous block of memory to store a whole array, requests memory for one element at a time, each in a separate request.
Let's say we have some data we want to store as a list.
int x, y, z, w;
Stored this way, the data ends up in non-contiguous memory blocks, so we need to link the blocks together in some way.
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|  x  |     |     |  y  |     |  z  |     |     |     |  w  |     |     |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
One common solution is to store, next to each data value, the memory address of the next data block.
This can be represented in C by creating a struct, which we can name Node, where we store the data value and the address of the next node (a pointer):
typedef struct Node {
    int data;
    struct Node *next;
} Node_t;
This way we'll have something like this:
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|  x  |y_mem|     |  y  |z_mem|  z  |w_mem|     |  w  |  0  |     |     |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
instead of
int arr[4] = {x, y, z, w};
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|     | a[0]| a[1]| a[2]| a[3]|     |     |     |     |     |     |     |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
We are using some extra memory in the list compared with the array method, but that gives us the ability to create and free nodes dynamically where we need, and when we need.
It's worth mentioning that the last node of the list stores NULL (or 0) as the next node address, indicating there's no more data in the list.
On the other hand, the address of the first node of the list gives us access to the complete linked list. Usually this first node is called head.
Node_t *head = NULL;
If we need to add another element to our list, we first create a separate node, and then store its address in the last node's next field instead of NULL. In an empty list, the new node simply becomes the head.
Node_t *nodeA;
nodeA = malloc(sizeof(Node_t));
nodeA->data = 10;
nodeA->next = NULL;
head = nodeA;
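For a list that already holds nodes, linking the last node to the new one means walking to the tail first. A minimal sketch, assuming the Node_t struct shown earlier and a hypothetical helper named AppendNode:

```c
#include <stdlib.h>

typedef struct Node {
    int data;
    struct Node *next;
} Node_t;

/* Hypothetical helper: appends a new node at the tail of the list.
   Takes the address of head so an empty list can be updated too.
   Returns the new node, or NULL if malloc fails. */
Node_t *AppendNode(Node_t **head, int data)
{
    Node_t *node = malloc(sizeof(Node_t));
    if (node == NULL) {
        return NULL;
    }
    node->data = data;
    node->next = NULL;

    if (*head == NULL) {
        *head = node;             /* empty list: the new node becomes the head */
        return node;
    }

    Node_t *last = *head;
    while (last->next != NULL) {  /* walk to the current last node */
        last = last->next;
    }
    last->next = node;            /* replace the NULL link with the new node */
    return node;
}
```

Walking to the tail costs one step per node, so a program that appends often might keep a separate pointer to the last node instead.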
We can also insert nodes anywhere in the list. The only thing to take care of is updating the next pointers of the affected nodes.
Node_t *head = NULL;
Node_t *nodeA, *nodeB;

nodeA = malloc(sizeof(Node_t));
nodeB = malloc(sizeof(Node_t));
nodeA->data = 10;
nodeB->data = 43;

/* nodeB goes in front of nodeA */
head = nodeB;
nodeB->next = nodeA;
nodeA->next = NULL;
At this point the process starts to repeat itself a lot, and programming is intended to automate tasks. We can organize this a bit, creating a function that creates nodes for us.
Node_t *CreateNode(int data)
{
    Node_t *result = malloc(sizeof(Node_t));
    result->data = data;
    result->next = NULL;
    return result;
}
This way we can start working dynamically and add nodes each time we need them.
Node_t *head = NULL;
Node_t *dummy;

dummy = CreateNode(10);
head = dummy;

/* each new node is linked in front of the current head */
dummy = CreateNode(43);
dummy->next = head;
head = dummy;
These data structures are more useful if we implement functions to work with them. As an example, we can make a function to locate a node inside the list.
Node_t *LocateNode(Node_t *head, int data)
{
    Node_t *tmp = head;

    while (tmp != NULL) {
        if (tmp->data == data)
            return tmp;
        tmp = tmp->next;
    }
    return NULL;   /* no node holds that value */
}
Another great feature in linked lists is the possibility to insert nodes at a certain point of the list. We can make a function for it too:
void InsertNodeAt(Node_t *insertPoint, Node_t *newNode)
{
    /* link the new node to the rest of the list first, then hook it in */
    newNode->next = insertPoint->next;
    insertPoint->next = newNode;
}
Summing up
Pointers open a huge field of possibilities in C but remember, with great power comes great responsibility.
We have to take care of how we use the heap. Pointers give us the power to allocate elements in memory dynamically, and careless use of that power can lead to out-of-memory errors.
When using malloc() to allocate memory, we know that it will return NULL if it runs out of memory, so a good practice is to check that we really got the memory we asked for.
pointerVar = malloc(sizeof(type_t));
if (pointerVar == NULL) {
    printf("\nOut of memory");
    exit(1);
}
Another good practice is to free up memory once we are done using it. A reminder on working with memory can be found here.
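For the linked list built earlier, freeing memory could look like the sketch below. FreeList is a hypothetical helper name, and the code assumes every node was obtained from malloc().

```c
#include <stdlib.h>

typedef struct Node {
    int data;
    struct Node *next;
} Node_t;

/* Hypothetical helper: releases every node and returns how many were freed. */
int FreeList(Node_t **head)
{
    int freed = 0;
    Node_t *current = *head;

    while (current != NULL) {
        Node_t *next = current->next;   /* save the link before freeing */
        free(current);
        current = next;
        freed++;
    }
    *head = NULL;                       /* avoid leaving a dangling pointer */
    return freed;
}
```

Setting head back to NULL makes any later use of the freed list easy to detect with a plain NULL check.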