C programming
January 11, 2021

C programming | Working with threads

When we run a program inside the OS, a process is created to handle the task. If our code isn't designed in a concurrent way, the process uses only one thread to run the main function. This makes the program perform its actions sequentially, but we can take advantage of threads to do more than one thing at a time if needed.

Modern microprocessors are built with multiple processors (cores). To achieve concurrency in our programs we can consider two scenarios:

  • Multiple threads running inside one process.
  • Multiple processes running at the same time.

Concurrent programming defines an environment where created tasks can be performed at the same time, but it doesn't mean that all tasks are going to be executed in parallel.

A process consists of a running program plus the resources that allow the program's execution. Processes can have multiple threads running inside them.

We can check running processes inside *nix using commands like ps, pstree or top.

In this article, we are going to focus on the first scenario: multiple threads running inside one process.

What is a thread?

A thread is an independent sequence of instructions that runs inside a process, alongside the program's main flow, and that the operating system can schedule on its own.

Threads give us concurrency without isolation: they work in the same process and share its memory space, which lets them communicate with each other.

Creating a thread is cheaper than creating a process, and ending a thread is faster than ending a process.

Until now, all the examples shown in previous articles have been made using serial or sequential computation. That's not wrong, but we were using only one thread in one process to achieve our functionality.

Sequential commands run like this:

start -> job_a -> job_b -> job_c -> ... -> end

which in code is as we usually call functions in main:

int main() {
    job_a();
    job_b();
    job_c();
    ...
    
    return 0;
}

A thread set is executed like this:

       -> job_a ->
      /           \
start ->  job_b   -> end
      \           /
       -> job_c ->

which in pseudo code would look like this:

int main() {
    /* pass the job itself, without calling it */
    create_thread(job_a);
    create_thread(job_b);
    create_thread(job_c);
    
    ...
    
    /* wait for every job to finish */
    join_thread(job_a);
    join_thread(job_b);
    join_thread(job_c);
    
    return 0;
}
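With the POSIX calls introduced later in this article, the pseudo code above could be sketched as follows. The names job() and run_jobs() are made up for this illustration, and the jobs only record that they ran.

```c
#include <pthread.h>

/* Hypothetical job: records that it ran by setting its own slot to 1. */
static void *job(void *arg) {
    int *done = arg;
    *done = 1;
    return NULL;
}

/* Starts three jobs concurrently, waits for all of them, and returns
 * how many of them completed. */
int run_jobs(void) {
    pthread_t threads[3];
    int done[3] = {0, 0, 0};
    int created = 0, finished = 0;

    /* create the threads; stop early if one cannot be created */
    while (created < 3 &&
           pthread_create(&threads[created], NULL, job, &done[created]) == 0)
        created++;

    /* join only the threads that were actually created */
    for (int i = 0; i < created; i++) {
        pthread_join(threads[i], NULL);
        finished += done[i];
    }
    return finished;
}
```

Calling run_jobs() from main() should return 3 when all three threads could be created.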

In order for a program to take advantage of threads, it needs to be able to be organized into discrete, independent tasks which can execute concurrently.

|-- job_a --| |-- job_b --| ... |-- job_n --|

Considering our sequential code from above, there are three situations that tell us whether threading is possible in our program:

  • Jobs or routines can be interchanged and the result is not modified.
|-- job_b --| |-- job_a --| ... |-- job_n --|
  • Jobs or routines can be interleaved and the result is not modified.
|- rA -| |- rB -| |- rA -| |- rB -| |- rA -| |- rN -|
  • Jobs or routines can be overlapped and the result is not modified.
|-- job_a --|        |-- job_n --|
        |-- job_b --|

We can take a look at the internal workflow inside an IDE (Integrated Development Environment). An IDE usually contains various spaces inside a workspace.

When we launch the program, a process is created by the operating system. That process contains the required threads for the IDE to run the multiple operations it needs, like the integrated terminal emulator, the file explorer, the text editor, or the syntax checker.

We can introduce threading in our program by trial and error, or apply it to one specific task at the beginning and increase the number of tasks and threads as the program evolves and its thread model grows. If we want to start from proven ground, there are some well-known models for threaded programs, commonly described alongside POSIX threads. They are not designed for any specific kind of application, but are worth knowing:

— Pipeline model

The pipeline model takes a long input stream and processes each of the inputs through a series of stages or sub-operations. Each stage can handle a different unit of input at a time.

input -> thread_a -> thread_b -> thread_n -> output

The overall throughput of a pipeline is limited by the thread that processes its slowest stage, meaning threads that follow it in the pipeline have to wait until it has finished. With this threading model it is a good idea to design the program so that all stages take about the same amount of time to finish.

The standard Graphics Pipeline uses this threading model.

— Thread pool

In this model, one thread is in charge of work assignments for the other threads. The thread in charge deals with requests and communications that arrive asynchronously, while the other threads handle the requests and process the data.

This model is also known as the manager-worker model.

input_a ->              -> worker_a
          \            /
input_b    -> manager  ->  worker_b
          /            \
input_c ->              -> worker_c

This model fits well in database servers, or desktop related tasks like window managing.

— Peer model

In this model, a thread must create all the other peer threads when the program starts but after that, all threads work concurrently on their tasks without a specific leader. This makes each thread responsible for its own input.

       -> thread_a ->
      /              \
input ->  thread_b   -> output
      \              /
       -> thread_c ->

Given the lack of a manager thread, peers need to synchronize their access to common input sources.
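As a rough sketch of that synchronization, the peers below take items from one common input array and protect the shared index with a mutex (mutexes are covered later in this article). All names here (peer(), run_peers(), input, next_item) are invented for the example.

```c
#include <pthread.h>

#define PEERS 4
#define ITEMS 1000

static int  input[ITEMS];     /* common input source */
static int  next_item = 0;    /* index of the next unread item */
static long total = 0;        /* shared result */
static pthread_mutex_t input_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Every peer runs the same code: take the next item, add it, repeat. */
static void *peer(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&input_mutex);
        if (next_item == ITEMS) {          /* input exhausted */
            pthread_mutex_unlock(&input_mutex);
            return NULL;
        }
        total += input[next_item++];
        pthread_mutex_unlock(&input_mutex);
    }
}

/* Fills the input with 1..ITEMS and lets the peers add it up. */
long run_peers(void) {
    pthread_t peers[PEERS];
    int created = 0;

    next_item = 0;
    total = 0;
    for (int i = 0; i < ITEMS; i++)
        input[i] = i + 1;

    while (created < PEERS &&
           pthread_create(&peers[created], NULL, peer, NULL) == 0)
        created++;
    for (int i = 0; i < created; i++)
        pthread_join(peers[i], NULL);
    return total;
}
```

Because each item is taken while holding the mutex, no item is read twice or skipped, whichever peer gets to it first.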

Implement threading in C

We can work with threads in C, but traditionally the language hasn't offered a built-in solution for this. On Unix-like machines we have a set of POSIX types and calls declared in a header named pthread.h that lets us access threading functions in C. So before we even start, we need to add the header to our code.

#include <pthread.h>

Let's create our first thread boilerplate. It's easier than you may expect.

In short, we need a function that we want to execute alongside main(), then we need to create a thread, assign the desired function to it, make sure it gets executed, and terminate the thread once we're done.

Create a function to use as the entry point

The standard prototype for a function that is going to be passed to a thread follows the scheme void *function_name(void *arg)

void *thread_job(void *arg) {
    printf("We are in a new thread\n");
    return NULL;
}

Create a thread

pthread_t thread;
pthread_create(&thread, NULL, function_to_execute, &value_to_pass);

We need to pass the following parameters to the thread creation:

  • A pointer to the pthread_t variable where the ID of the created thread is stored.
  • The attributes we want to use to create the thread. Pass NULL if you don't need any special ones, so defaults are applied.
  • A pointer to the function to execute by the thread.
  • A pointer to the thread argument.

This returns 0 if thread creation is successful and nonzero if not.

It's good practice to check the return value of the thread creation function, so failures don't go unnoticed.

if (pthread_create(&thread, NULL, function_to_execute, &value_to_pass) != 0)
    printf("Failed to create Thread\n");
else
    printf("Thread created!\n");
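The pthread functions report failures through their return value, which is an error number, rather than through errno. A small sketch of a more descriptive check could look like this; noop() and start_and_wait() are hypothetical names, and pthread_join() is explained a bit further down.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical thread function that does nothing. */
static void *noop(void *arg) {
    return arg;
}

/* Creates a thread and waits for it.
 * Returns 0 on success, or the error number reported by pthreads. */
int start_and_wait(void) {
    pthread_t thread;
    int err = pthread_create(&thread, NULL, noop, NULL);

    if (err != 0) {
        /* strerror() turns the error number into a readable message */
        fprintf(stderr, "pthread_create failed: %s\n", strerror(err));
        return err;
    }

    err = pthread_join(thread, NULL);
    if (err != 0)
        fprintf(stderr, "pthread_join failed: %s\n", strerror(err));
    return err;
}
```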

— Note that we are creating threads from the main() function of the program, but we can create them from inside other threads too.

            -> thread_c ->
           /              \
thread_a ->    thread_b    -> thread_d ...

Once a thread is created, it has a life cycle that consists of four states:

  • Ready state, meaning the thread is waiting for a processor, and able to run.
  • Running state, when the thread is currently executing.
  • Blocked state, meaning the thread is waiting for a synchronization mechanism or an I/O operation to complete.
  • Terminated state, once the thread is done or cancelled.
blocked --> ready <---> running --> terminated
   |                       |
   └----------<------------┘

Ensure thread execution

We can use pthread_join() as a thread synchronization call to ensure that our main thread waits until the second thread finishes:

pthread_join(thread, NULL);
Note that we are passing NULL as the second argument. We'll use this second argument later on to return data from our thread.

Terminate a thread

Threads normally terminate once they have finished their work. However, there are more ways to terminate a thread.

  • A thread can explicitly terminate itself by calling pthread_exit():
pthread_exit(NULL);
  • We can ask another thread to terminate using pthread_cancel():
pthread_cancel(thread);
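As an illustration of cancellation, the sketch below starts a thread that would never finish on its own and then cancels it. It assumes the default cancellation settings (enabled and deferred), under which the request takes effect at a cancellation point such as sleep(). The names endless_job() and cancel_demo() are invented for the example.

```c
#include <pthread.h>
#include <unistd.h>

/* Worker that never finishes on its own.
 * sleep() is a cancellation point, so a pending cancel acts there. */
static void *endless_job(void *arg) {
    (void)arg;
    for (;;)
        sleep(1);
}

/* Returns 1 if the thread ended because it was cancelled, 0 otherwise. */
int cancel_demo(void) {
    pthread_t thread;
    void *result = NULL;

    if (pthread_create(&thread, NULL, endless_job, NULL) != 0)
        return 0;

    pthread_cancel(thread);          /* request cancellation */
    pthread_join(thread, &result);   /* wait until the thread really stops */

    /* a cancelled thread reports PTHREAD_CANCELED as its result */
    return result == PTHREAD_CANCELED;
}
```

Note that pthread_cancel() only sends the request; joining the thread is what tells us it has actually stopped.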

— After following the steps, our code should look like this:

#include <stdio.h>
#include <pthread.h>

int error_close(void) {
    printf("Failed to create Thread\n");
    return 1;
}

void *thread_job(void *arg) {
    (void)arg; /* unused in this example */
    printf("We are in a new thread\n");

    /* returning from the function ends the thread;
     * calling pthread_exit(NULL) here would be equivalent */
    return NULL;
}

int main(void) {
    pthread_t thread;

    if (pthread_create(&thread, NULL, thread_job, NULL) != 0)
        return error_close();
    printf("Thread created!\n");

    printf("We are inside Main()\n");

    /* wait for the thread to finish before leaving main() */
    pthread_join(thread, NULL);

    return 0;
}

Tell the compiler to use pthread lib

To compile our program using threads we need to link it along with the POSIX thread library. Adding the -pthread flag to the compiler should work.

$ gcc -pthread -o test_threads main.c
$ ./test_threads

Sharing data between threads

Threads can communicate with each other, but they need fast communication methods. Most thread communication involves using memory, since all threads created by the program live in the same process and share the same memory space.

We have three types of memory to work with (see the article about managing memory in C) where we can place data to be shared between threads.

Global memory

Global memory is a good fit when we know there will be only one instance of an object in our multithreaded program, like a mutex, that should not live inside any individual thread.

Stack memory

Storing data in this memory location is recommended for data used only by a thread routine, since its lifetime is the same as the routine's execution.

Dynamic memory

Storing data dynamically requires some memory management routine like malloc(). Data stored in this type of memory has a lifetime scoped between memory allocation and memory deallocation.

This is usually recommended for managing persistent context, since its lifetime doesn't depend on any of the program's threads.

We can find the following shared data between threads in a process:

  • Memory space.
  • Global variables.
  • Opened files.
  • Child processes.
  • Timers.
  • Semaphores and signals.

Threads also have private data. Variables declared within the thread function are local to the thread.

Other private data from a thread is:

  • Thread ID.
  • Registers.
  • Thread status.
  • Thread context when it's not executing.

— A thread doesn't keep track of the other created threads, nor does it know the thread that created it. As part of the POSIX thread header functions, we can take advantage of pthread_self() to get the running thread's id.

Inside the thread_job() function we can add the following lines:

void *thread_job(void *arg) {
    /* pthread_t is an opaque type; casting it to unsigned long
     * works on Linux but is not guaranteed to be portable */
    printf("We are in a new thread with ID: %lu\n", (unsigned long)pthread_self());
    pthread_exit(NULL);
}

Since pthread_self() returns the thread handle of the calling thread, we can use it in combination with pthread_equal() to identify a thread when entering a routine.
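A minimal sketch of that idea: the same routine is run once by the main thread and once by a new thread, and it uses pthread_equal() to tell which one is calling. The names main_id, who_am_i() and identify_demo() are hypothetical.

```c
#include <pthread.h>

static pthread_t main_id;   /* ID of the thread that starts the demo */

/* Stores 1 in *arg when called from the main thread, 0 otherwise. */
static void *who_am_i(void *arg) {
    int *is_main = arg;
    *is_main = pthread_equal(pthread_self(), main_id) != 0;
    return NULL;
}

/* Returns 1 if both calls were identified correctly, 0 if not, -1 on error. */
int identify_demo(void) {
    pthread_t thread;
    int from_main = -1, from_thread = -1;

    main_id = pthread_self();
    who_am_i(&from_main);                      /* direct call, same thread */

    if (pthread_create(&thread, NULL, who_am_i, &from_thread) != 0)
        return -1;
    pthread_join(thread, NULL);                /* new thread has written its answer */

    return from_main == 1 && from_thread == 0;
}
```

Thread IDs should be compared with pthread_equal() rather than with ==, since pthread_t is not guaranteed to be a plain number.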

Passing arguments to threads

Thread functions take a void pointer as an argument, and return a void pointer as result. Since this is generic data, it leaves us almost total freedom to operate with our data.

Let's modify our current code. We are going to define a thread count, and create as many threads as that value says.

We are going to print which thread we are in when running the thread_job() function. To know which one is working, we pass the thread counter as the argument value.

#include <stdio.h>
#include <stdlib.h>   /* malloc(), free() */
#include <unistd.h>   /* sleep() */
#include <pthread.h>

#define THREAD_COUNT 10

void *thread_job(void *value) { 
    long t_num;
    t_num = (long)value;    
    
    /* pthread_t is opaque; the cast to unsigned long works on Linux
     * but is not guaranteed to be portable */
    printf("Thread %ld with ID %lu is working...\n", t_num, (unsigned long)pthread_self());
    
    /* sleep acts as a dummy, simulating some work being made */
    sleep(2);

    printf("Thread %ld with ID %lu is done!\n", t_num, (unsigned long)pthread_self());
    
    return NULL;
}

int main() {
    /* reserve room for all the thread IDs, not just one */
    pthread_t *threads = (pthread_t*)malloc(sizeof(pthread_t) * THREAD_COUNT);
    long i;
    
    if(threads == NULL)
        return 1;
    
    for(i = 0; i < THREAD_COUNT; i++){
        printf("We are inside Main()\n");
        
        if(pthread_create(&threads[i], NULL, thread_job, (void *)i) != 0){
            printf("error creating thread[%ld]", i);
            return 1;
        }
    }

    for(i = 0; i < THREAD_COUNT; i++) {
        pthread_join(threads[i], NULL);
    }
    
    free(threads);
    return 0;
}

This is nice but in real life we'll probably need to pass more than one argument to our thread on creation. We can collect all the data we need to pass to a thread inside a struct type.

typedef struct tdata {
    int      amount;
    char     *account_name;
    e_action action;
}tdata_t;

When we pass a pointer to a struct into the thread job function, we can access its data by casting the void pointer back to a pointer to the struct type:

void *thread_job(void *data) { 
    tdata_t received_data;
    
    received_data.amount = ((tdata_t*)data)->amount;
    received_data.account_name = ((tdata_t*)data)->account_name;
    received_data.action = ((tdata_t*)data)->action;

    
    printf("Thread job. Account name is: %s\n", received_data.account_name);

    /* free data struct before leaving if not needed anymore */
    free(data);
    return NULL;
}
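To see the whole round trip, here is a small self-contained sketch. It uses a simplified version of the struct (the action field is left out and a hypothetical balance field is added), and in this variant the creating function frees the struct after joining instead of the thread. The name struct_demo() is made up for the example.

```c
#include <pthread.h>
#include <stdlib.h>

/* Simplified version of the struct shown above. */
typedef struct tdata {
    int         amount;
    const char *account_name;
    int         balance;        /* hypothetical field the thread updates */
} tdata_t;

static void *thread_job(void *data) {
    tdata_t *received = data;   /* void* converts to tdata_t* implicitly in C */
    received->balance += received->amount;
    return NULL;
}

/* Returns the balance after the thread has run, or -1 on error. */
int struct_demo(void) {
    pthread_t thread;
    int balance;
    tdata_t *data = malloc(sizeof *data);

    if (data == NULL)
        return -1;
    data->amount = 50;
    data->account_name = "savings";
    data->balance = 100;

    if (pthread_create(&thread, NULL, thread_job, data) != 0) {
        free(data);
        return -1;
    }
    pthread_join(thread, NULL);   /* after this, the thread is done with data */

    balance = data->balance;
    free(data);
    return balance;
}
```

Whoever frees the struct, the rule is the same: it must stay alive for as long as the thread may still read it.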

Returning values from threads

Sometimes we may need our thread to make some operations and return something from it.

We can return almost anything, since thread functions return a void pointer. The important point here is to allocate memory for the value we want to return. A pointer to a local variable would be invalid once the function returns, because the variable lives on the thread's stack, and using it is undefined behavior (often a segmentation fault).

Allocate some memory in the thread function.

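void *thread_job(void *value) { 

    /* allocate some memory for our desired return value */ 
    int *t_int = (int *)malloc(sizeof(int));
    if(t_int == NULL)
        return NULL;
    
    /* malloc() doesn't initialize memory, so start counting from zero */
    *t_int = 0;
    
    for(long i = 0; i < (long)value; i++)
        (*t_int)++;

    /* return the value */
    return t_int;
}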

Inside our external function that controls the thread creation and execution, we can create a variable to hold what is returned from the thread job.

int *ext_result;

Using pthread_join() we can get the value returned by the thread function through its second argument:

pthread_join(thread, (void **)&ext_result);

Now we can use the returned value in the rest of our program.

Following the good practice of freeing memory when we are done using it, the external function has to free ext_result after using it (we cannot do it inside the thread job function, since both pointers refer to the same memory address).

free(ext_result);
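Put together, the steps above might look like the following sketch. The names count_up() and return_demo() are invented here; the thread counts up to the value it receives and hands the result back through pthread_join().

```c
#include <pthread.h>
#include <stdlib.h>

/* Counts from 0 up to the limit passed in the argument and returns
 * the result in freshly allocated memory (NULL if malloc fails). */
static void *count_up(void *value) {
    long limit = (long)value;
    int *t_int = malloc(sizeof *t_int);

    if (t_int == NULL)
        return NULL;
    *t_int = 0;
    for (long i = 0; i < limit; i++)
        (*t_int)++;
    return t_int;
}

/* Runs count_up() in a thread and returns its result, or -1 on error. */
int return_demo(long limit) {
    pthread_t thread;
    void *raw = NULL;
    int value;

    if (pthread_create(&thread, NULL, count_up, (void *)limit) != 0)
        return -1;
    pthread_join(thread, &raw);   /* raw now holds what the thread returned */
    if (raw == NULL)
        return -1;

    value = *(int *)raw;
    free(raw);                    /* the caller owns the memory, so it frees it */
    return value;
}
```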

Explicit synchronization

In concurrent programs it's not possible to determine what is going to happen at run time just by looking at the code. Threads run concurrently and the execution order depends on the scheduler, but we can intentionally make a thread wait for another one to finish.

If more than one thread accesses the same memory location and at least one of them writes to it, we can run into a situation known as a race condition.

 race condition between two threads accessing and writing the same memory
   
   thread_a      memory      thread_b           threads' steps
     
              |00|0A|0B|0C|
             /             \
|00|0A|0B|0C|               |00|0A|0B|0C|    1. read the value
     |                           |
   |08|                        |0E|          2. modify the value
     |                           |
|00|08|0B|0C|               |00|0E|0B|0C|    3. write the value
             \             /
              |00|0E|0B|0C|
              
          this time thread_b wins

Avoiding these situations can be achieved via mechanisms that manage read/write locks and barriers such as mutexes or semaphores.

— If we use threads to run completely independent functions that have no relation to each other, synchronization isn't a problem, and we can choose to skip this process.

MUTEX

A mutex is the basic pthread synchronization mechanism. Its name stands for mutual exclusion lock. It's useful to solve unpredictable race conditions by serializing the execution of threads.

If a thread succeeds in locking a mutex, any other thread that tries to lock it is blocked until the owner thread unlocks the mutex.

The pthreads API provides mutex functions and operations to work with.

In order to create a mutex we need to declare a pthread_mutex_t. We can do it in a static or a dynamic way:

  • Static, declaring it outside any function:
/* just the mutex */
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;


/* A mutex inside a struct, holding protected data */
typedef struct m_data {
    pthread_mutex_t mutex;
    int             value;
} m_data_t;

m_data_t data = {PTHREAD_MUTEX_INITIALIZER, 0};
  • Dynamic, declaring it when we allocate memory to it:
/* A mutex inside a struct, holding protected data */
typedef struct m_data {
    pthread_mutex_t mutex;
    int             value;
} m_data_t;

...

void foo(void) {
    m_data_t *data;
    data = (m_data_t*)malloc(sizeof(m_data_t));
    pthread_mutex_init(&data->mutex, NULL);
    ...
}

Remember to initialize the mutex before creating any threads.

Once it's initialized, we can lock it and unlock it using the following functions:

pthread_mutex_lock(&mutex); 

/* code to execute in between */

pthread_mutex_unlock(&mutex);

If a thread calls the mutex lock, the code between the lock function call and the unlock function call can only be accessed by a single thread until the mutex is unlocked.

Serializing code this way removes parallelism inside the locked section, but threads still help responsiveness in places like user interfaces: we can have one thread doing the I/O while the rest calculate whatever is needed in the background.

The example below calculates the first 21 Fibonacci numbers using a separate thread for each one. Try commenting out the mutex lock and unlock calls and running the program several times. Without the mutex, a thread may print a result that another thread has just overwritten. The order in which the threads print depends on the scheduler either way.

#include <stdio.h>
#include <unistd.h>   /* sleep() */
#include <pthread.h>

#define THREAD_COUNT 21

int result;
pthread_mutex_t result_mutex = PTHREAD_MUTEX_INITIALIZER;

int calc_fibonacci (long num) {
    if (num <= 1) {
        return 1;
    }
    return calc_fibonacci(num -1) + calc_fibonacci(num -2);
}

void *thread_job(void *value) { 
    pthread_mutex_lock(&result_mutex); 
    
    result = calc_fibonacci((long)value);
    
    /* print while still holding the lock, so no other thread can
     * overwrite result between the write and the read */
    printf("We are in thread num %ld, and result is %d\n", (long)value, result);
    
    pthread_mutex_unlock(&result_mutex);
    
    sleep(1);
    return NULL;
}


int main() {
    pthread_t threads[THREAD_COUNT];
    long i;
    
    for(i = 0; i < THREAD_COUNT; i++){
        
        if(pthread_create(&threads[i], NULL, thread_job, (void *)i) != 0){
            printf("error creating thread[%ld]", i);
            return 1;
        }
    }

    /* wait for every thread before leaving main() */
    for(i = 0; i < THREAD_COUNT; i++){
        pthread_join(threads[i], NULL);
    }
    
    return 0;
}
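A smaller sketch of the same idea, with a result that is easy to check: several threads increment one shared counter, and the mutex guarantees that no increment is lost. The names add_many() and count_with_mutex() are hypothetical.

```c
#include <pthread.h>

#define WORKERS    4
#define INCREMENTS 100000

static long counter = 0;
static pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Each worker adds INCREMENTS to the shared counter, one step at a time. */
static void *add_many(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_mutex);
        counter++;                       /* read, modify and write as one unit */
        pthread_mutex_unlock(&counter_mutex);
    }
    return NULL;
}

/* Runs the workers and returns the final value of the counter. */
long count_with_mutex(void) {
    pthread_t workers[WORKERS];
    int created = 0;

    counter = 0;
    while (created < WORKERS &&
           pthread_create(&workers[created], NULL, add_many, NULL) == 0)
        created++;
    for (int i = 0; i < created; i++)
        pthread_join(workers[i], NULL);
    return counter;
}
```

With the lock in place the result is WORKERS * INCREMENTS whenever all workers could be created. Removing the lock and unlock calls turns counter++ into a race condition, and the total will usually come out lower.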

When implementing mutexes, we need to take care of a few factors:

  • Waiting threads are not good for performance. It's good practice to protect unrelated pieces of data with several small mutexes rather than using a single mutex that locks them all at once.
If the data to lock is independent, it's a good idea to use separate mutexes. Complications arise when the data isn't fully independent.
  • It takes time to lock and unlock mutexes. This affects performance, so the first point should be balanced with the common sense of locking only the critical parts.

CONDITION VARIABLES

Condition variables are a signal mechanism associated with mutexes and their protected shared data. They control threads' access to data, and let threads synchronize between them based on the value of the data.

We can think about condition variables as a notification system among threads.

To create a condition variable, the process is fairly familiar:

/* using an initializer macro at declaration */
pthread_cond_t condition_var = PTHREAD_COND_INITIALIZER;

/* or declaring it first and initializing it with the function call */
pthread_cond_t condition_var;
pthread_cond_init(&condition_var, NULL);

Once a condition variable has been initialized, we can use it with a thread in the following two ways:

  • Make the thread wait on the condition variable.
pthread_cond_wait(&condition_var, &mutex);

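/* or specifying a timeout; abs_timeout is a struct timespec
 * holding the absolute time at which to stop waiting */

pthread_cond_timedwait(&condition_var, &mutex, &abs_timeout);

Calling any of the waiting functions requires passing a locked mutex along with the condition variable.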

  • Make the thread signal other threads waiting on the condition variable.
/* signal only one of the waiting threads */
pthread_cond_signal(&condition_var); 

/* signal all the waiting threads */
pthread_cond_broadcast(&condition_var);

Neither function locks or unlocks the mutex. They are usually called while holding the mutex that protects the shared data, and that mutex must be unlocked afterwards so the woken threads can continue.
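A minimal sketch of waiting and signaling together: one thread produces a value and signals, and the caller waits until the value is ready. The names producer(), wait_for_value(), ready and shared_value are invented for the example. The wait sits inside a while loop because pthread_cond_wait() may wake up without the condition being true.

```c
#include <pthread.h>

static pthread_mutex_t mutex      = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  data_ready = PTHREAD_COND_INITIALIZER;
static int ready = 0;          /* the condition the waiter checks */
static int shared_value = 0;   /* data protected by the mutex */

/* Produces the value and notifies the waiting thread. */
static void *producer(void *arg) {
    (void)arg;
    pthread_mutex_lock(&mutex);
    shared_value = 42;
    ready = 1;
    pthread_cond_signal(&data_ready);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Waits until the producer has published its value and returns it
 * (or -1 if the thread could not be created). */
int wait_for_value(void) {
    pthread_t thread;
    int value;

    ready = 0;
    if (pthread_create(&thread, NULL, producer, NULL) != 0)
        return -1;

    pthread_mutex_lock(&mutex);
    while (!ready)                                /* re-check after every wakeup */
        pthread_cond_wait(&data_ready, &mutex);   /* releases the mutex while waiting */
    value = shared_value;
    pthread_mutex_unlock(&mutex);

    pthread_join(thread, NULL);
    return value;
}
```

The ready flag is what makes this safe: if the producer signals before the caller starts waiting, the caller sees ready == 1 and never blocks.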

SEMAPHORE

A semaphore is a synchronization mechanism made from an unsigned int whose changes can't be interrupted. It's stored in a memory location accessible by all the processes that need to synchronize their operations.

The semaphore header is separate from pthreads. In order to use semaphores in our project, the header semaphore.h is required.

The main difference from a mutex is that semaphores don't have a concept of ownership. A mutex expects to be unlocked by the same thread that locked it, so we cannot lock it from one thread and unlock it from another, whereas with semaphores one thread can wait and a different one can signal.

In most case scenarios, using mutexes and condition variables is more than enough to solve synchronization problems.

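— In order to have a semaphore inside our code we need to declare it and initialize it:

sem_t semaphore;

/* 0 means the semaphore is shared between the threads of this process,
 * N is its initial value */
sem_init(&semaphore, 0, N);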

We can work with semaphores using two operations:

  • WAIT operation, which decrements the semaphore value if it is greater than zero. If not, it waits until it is.
sem_wait(&semaphore);
  • SIGNAL operation, which increments the value of the semaphore and returns.
sem_post(&semaphore);

As with most resources in C, we need to create the semaphore before using it, and destroy it after using it so we don't leak resources.

A complete overview on how to implement a semaphore could look like this:

#include <semaphore.h>

#define N 6 /* can be any positive value */

/* create a semaphore */
sem_t semaphore;

int main(void) {
    /* initialize semaphore */
    sem_init(&semaphore, 0, N); 

    /* allocate a resource */
    sem_wait(&semaphore);

    ...

    /* return semaphore to pool */
    sem_post(&semaphore);

    ...

    sem_destroy(&semaphore);

    return 0;
}

— We can use a semaphore in a similar way to a mutex by using a binary semaphore (define N as 1), to protect critical parts of the code from race conditions.

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <semaphore.h>

#define THREADS 4
sem_t semaphore;
int counter = 0;

void* thread_job(void* args) {
    printf("Hi from thread %d\n", *(int*)args);
    
    /* only one thread at a time gets past this point */
    sem_wait(&semaphore);
    counter++;
    printf("Counter value is: %d\n",counter);
    sem_post(&semaphore);
    
    free(args);
    return NULL;
}

int main(void) {
    
    pthread_t *threads = malloc(sizeof(pthread_t) * THREADS);
    if(threads == NULL)
        return 1;
    
    sem_init(&semaphore, 0, 1); //we can change 1 to other value and have more threads running at a time
    int i;
    
    for(i = 0; i < THREADS; i++) {
        int *a = malloc(sizeof(int));
        if(a == NULL)
            return 1;
        *a = i;
        if(pthread_create(&threads[i], NULL, &thread_job, a) !=0){
            printf("cannot create thread.\n");
            return 1;
        }
    }
    
    for(i = 0; i < THREADS; i++) {
        if(pthread_join(threads[i], NULL) !=0){
            printf("cannot join thread.\n");
        }
    }
    
    sem_destroy(&semaphore);
    free(threads);
    return 0;
}
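The lack of ownership mentioned earlier also lets us use a semaphore as a plain notification between threads. In the sketch below, one thread posts and a different one waits, which a mutex would not allow. The names worker(), handoff_demo(), done_sem and message are made up, and the example assumes a platform that supports unnamed semaphores through sem_init(), such as Linux.

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t done_sem;      /* starts at 0: nothing to collect yet */
static int   message = 0;

/* Writes the message and signals that it is ready. */
static void *worker(void *arg) {
    (void)arg;
    message = 7;
    sem_post(&done_sem);    /* posted by the worker thread... */
    return NULL;
}

/* Returns the message written by the worker, or -1 on error. */
int handoff_demo(void) {
    pthread_t thread;
    int value;

    if (sem_init(&done_sem, 0, 0) != 0)
        return -1;
    if (pthread_create(&thread, NULL, worker, NULL) != 0) {
        sem_destroy(&done_sem);
        return -1;
    }

    sem_wait(&done_sem);    /* ...and waited on by the calling thread */
    value = message;

    pthread_join(thread, NULL);
    sem_destroy(&done_sem);
    return value;
}
```

Starting the semaphore at 0 makes sem_wait() block until the worker has posted, so the message is always written before it is read.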

A working example

In the previous article we worked on a fictitious weather forecast program to explain how to save files. Let's grow our program a bit.

The single thread program

In a serialized way, if we want the user to interact with the program, we can think of four main functions to implement:

  • Add data to the program.
  • Return data from the program.
  • Generate new data from existing data.
  • Exit the program when done, or requested.

This can be translated into code this way:

typedef enum {
    EXIT = 0,
    WRITE,
    READ,
    OPERATE,
} e_action;

And so, our main function can deal with the type of action, one at a time:

/* simple error message handling */
int handle_error(char* msg) {
    printf("%s\n", msg);
    return 1;
}

int main(int argc, char *argv[]) {
    
    /* Get the desired action (this time from argv[1]) */
    int action;
    if(argc > 1)
        action = atoi(argv[1]);
    else
        action = -1;

    
    switch(action) {
        case WRITE:
            func_write();
            break;
        
        case READ:
            func_read();
            break;
        
        case OPERATE:
            func_operate();
            break;
        
        case EXIT:
            func_end_program();
            break;
            
        default:
            /* propagate the error code instead of returning 0 */
            return handle_error("No action passed to argv[1]");
    }

    return 0;
}

If the program is going to be used from a single terminal by a single user, there isn't much complication, but let's scale our fictitious program a bit.

Let's consider that the weather forecast data is coming from several automatic stations around a country's region. That data is sent to a server along with the action to perform and, once done, the server responds back.

If we maintain a serial version of the program, the moment many automatic weather stations send actions, the performance of the server is going to degrade quickly.

The multi-threaded program

If we want to keep server performance in a good state, one solution is to add threads to our program. Looking at the general tasks, we can create threads that operate independently from each other.

Since we need to pass more than one argument to the threads we create, we can use a struct to do so:

typedef struct tdata {
    int              action;    /* the action to perform */
    e_operation      operation; /* the operation to perform, if any */
    daily_forecast_t day;       /* the data to work with */
} tdata_t;


int main(int argc, char *argv[]) {
    tdata_t   *thread_data;
    pthread_t *thread;
    
    int action;
    if(argc > 1)
        action = atoi(argv[1]);
    else
        action = -1;
    
    thread_data = (tdata_t*)malloc(sizeof(tdata_t));
    
    thread = (pthread_t*)malloc(sizeof(pthread_t));
    
    ...

This way the data handling falls into the thread's function:

void *thread_job(void *data) { 
   
    tdata_t received_data;
    received_data.action = ((tdata_t*)data)->action;
    received_data.operation = ((tdata_t*)data)->operation;
    received_data.day = ((tdata_t*)data)->day;

 
    switch(received_data.action) {
        case WRITE:
            func_write();
            break;
        
        case READ:
            func_read();
            break;
        
        case OPERATE:
            func_operate();
            break;
            
        default:
            handle_error("No valid action passed to argv[1]");
            break;
    }
 
    free(data);
    return NULL;
}

— Now instead of creating a new thread each time a station needs to perform an action, we can define a maximum number of threads, initialize them at the beginning of the program, and reuse them in a thread pool.

A thread pool needs to take care of the following things:

  • The total number of available threads, so we can limit the number of data requests at the same time.
#define NUM_THREADS 10
  • The max size for the data queue, so we can limit the number of requests waiting for service.
#define QUEUE_SIZE 10

Since the queue is a critical part, we need some sort of control over it. We can have a counter to keep track of it, and a mutex to prevent several threads from modifying the queue at the same time.

tdata_t data_queue[QUEUE_SIZE];
int data_count = 0;
pthread_mutex_t data_mutex = PTHREAD_MUTEX_INITIALIZER;
  • A way to behave when all threads are working and the data queue is full, so we don't lose data.
  • A way to behave if the data queue is empty, so idle threads don't waste processor time checking it over and over.
pthread_cond_t data_cond = PTHREAD_COND_INITIALIZER;

— In terms of design, we could figure out the main behavior of the program in the following steps:

  1. The thread pool is waiting until a job is created.
  2. The main thread creates a job and signals the thread pool.
  3. The thread pool gets the task and executes it.
  4. If required, a result is sent back to the main thread.

First of all, we need to define what our threads are going to do when created.

void* start_thread(void *arg) {
    (void)arg; /* unused */

    /* create a struct var to hold data */
    tdata_t data;
    
    /* keep serving jobs, so the thread can be reused by the pool */
    for ( ;; ) {
        /* lock critical part with mutex */
        pthread_mutex_lock(&data_mutex); 
    
        /* if we don't have any data in the queue, we tell the thread to wait */    
        while (data_count == 0) {
            pthread_cond_wait(&data_cond, &data_mutex);
        }
    
        /* if we receive data, then we assign the first element of the queue 
         * to our data holder, and shift the data queue */    
        data = data_queue[0];
        for(int i = 0; i < data_count -1; i++) {
            data_queue[i] = data_queue[i +1];
        }
    
        /* keep track of the data slots */
        data_count--;
    
        /* unlock mutex when done */    
        pthread_mutex_unlock(&data_mutex); 
    
        /* execute the thread job */
        thread_job(&data);
    }
}

Our function thread_job() no longer needs to return a void pointer, so we can declare it as a plain void function.

void thread_job(void *data) { 
   
    tdata_t received_data;
    received_data.action = ((tdata_t*)data)->action;
    received_data.operation = ((tdata_t*)data)->operation;
    received_data.day = ((tdata_t*)data)->day;

 
    switch(received_data.action) {
        case WRITE:
            func_write();
            break;
        
        case READ:
            func_read();
            break;
        
        case OPERATE:
            func_operate();
            break;
            
        default:
            handle_error("No valid action passed to argv[1]");
            break;
    }
}

Then we need a function to submit jobs with data to the waiting threads:

void submit_job(tdata_t data) {
    
    /* managing the data queue is a critical part so let's lock it
     * before doing anything */
    pthread_mutex_lock(&data_mutex); 
    
    /* assign the data to our data queue and
     * keep track of the data slots */
    data_queue[data_count] = data;
    data_count++;
    
    /* unlock the mutex when done */
    pthread_mutex_unlock(&data_mutex);
 
    /* Wake up one thread */
    pthread_cond_signal(&data_cond);
}
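The function above writes to the queue without checking whether there is room left. One possible way to handle a full queue, sketched here with a hypothetical try_submit_job() and a stand-in job_t type in place of tdata_t, is to report the failure to the caller so the data can be retried instead of lost.

```c
#include <pthread.h>

#define QUEUE_SIZE 10

/* Stand-in for the article's tdata_t, reduced to one field. */
typedef struct job {
    int action;
} job_t;

static job_t data_queue[QUEUE_SIZE];
static int   data_count = 0;
static pthread_mutex_t data_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  data_cond  = PTHREAD_COND_INITIALIZER;

/* Returns 0 when the job was queued, -1 when the queue is full. */
int try_submit_job(job_t job) {
    int queued = -1;

    pthread_mutex_lock(&data_mutex);
    if (data_count < QUEUE_SIZE) {        /* never write past the last slot */
        data_queue[data_count] = job;
        data_count++;
        queued = 0;
    }
    pthread_mutex_unlock(&data_mutex);

    /* wake up one waiting thread only if there is something new to do */
    if (queued == 0)
        pthread_cond_signal(&data_cond);
    return queued;
}
```

Another option would be to make the caller wait on a second condition variable until a slot is free. Which one fits better depends on whether the stations can resend their data.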

Inside the main function, we can create an infinite loop that listens to user input after we create the thread pool:

The expression for ( ;; ) is the same as while(1)
for ( ;; ) {
    
    printf("\nAutomatic weather forecast station\nWrite action to take: ");
    scanf("%s", buffer);
    action = atoi(buffer);
    
    if(action == EXIT) {
        printf("\nExiting...\n");
        
        free(buffer);
        free(thread_data);
        free(thread);
        
        break;
    }
    
    thread_data->action = action;
    
    submit_job(*thread_data);
}

If we run the code right now, text in the terminal emulator is going to overlap. We need to signal the menu when we are done executing a thread job so the text appears in order.

There are many ways to handle this. Since in this article we talked about semaphores, let's create a binary semaphore that signals when our thread job is done.

Using a simple integer that changes from 0 to 1 can do the trick too.
/* create the semaphore */
sem_t ready_sem;

/* initialize it in the main function, before using it 
 * note that the value is 1, so we can print the menu for the first time */
int main(int argc, char *argv[]) {
    sem_init(&ready_sem, 0, 1);
    ...
}

We need the semaphore to wait before printing the menu:

for ( ;; ){
    sem_wait(&ready_sem);
    ...

And we need to signal once our thread job has finished:

void thread_job(void *data) {
    
    ...
    sem_post(&ready_sem);
}

Now we can operate from the command line without overlapping text messages.

Another option could be not printing any confirmation message from the thread jobs, leaving only errors to be printed in the terminal emulator. That way we can experiment with multiple tasks at a time from a single machine.
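The snippets in this section don't show how the pool threads are created or stopped. As a rough, self-contained sketch of the whole cycle, the code below uses plain integers as jobs and adds a stopping flag so the workers can leave once the queue is drained. Every name in it (pool_worker(), run_pool(), job_queue, stopping, jobs_total) is invented for the example and is not part of the weather program.

```c
#include <pthread.h>

#define NUM_THREADS 4
#define QUEUE_SIZE  16

static int  job_queue[QUEUE_SIZE];   /* jobs are plain ints in this sketch */
static int  job_count  = 0;
static int  stopping   = 0;          /* set when no more jobs will arrive */
static long jobs_total = 0;          /* stands in for the real work */
static pthread_mutex_t pool_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  pool_cond  = PTHREAD_COND_INITIALIZER;

static void *pool_worker(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&pool_mutex);
        while (job_count == 0 && !stopping)
            pthread_cond_wait(&pool_cond, &pool_mutex);
        if (job_count == 0) {              /* stopping and nothing left to do */
            pthread_mutex_unlock(&pool_mutex);
            return NULL;
        }
        int job = job_queue[--job_count];  /* take the most recent job */
        jobs_total += job;                 /* "process" it under the lock */
        pthread_mutex_unlock(&pool_mutex);
    }
}

/* Sends the jobs 1..jobs (at most QUEUE_SIZE) through the pool and
 * returns the sum computed by the workers, or -1 on error. */
long run_pool(int jobs) {
    pthread_t workers[NUM_THREADS];
    int created = 0;

    if (jobs < 0 || jobs > QUEUE_SIZE)
        return -1;
    job_count = 0;
    stopping = 0;
    jobs_total = 0;

    /* create the pool once, before any job arrives */
    while (created < NUM_THREADS &&
           pthread_create(&workers[created], NULL, pool_worker, NULL) == 0)
        created++;

    for (int i = 1; i <= jobs; i++) {
        pthread_mutex_lock(&pool_mutex);
        job_queue[job_count++] = i;
        pthread_mutex_unlock(&pool_mutex);
        pthread_cond_signal(&pool_cond);
    }

    pthread_mutex_lock(&pool_mutex);
    stopping = 1;                          /* ask the workers to leave when idle */
    pthread_mutex_unlock(&pool_mutex);
    pthread_cond_broadcast(&pool_cond);

    for (int i = 0; i < created; i++)
        pthread_join(workers[i], NULL);
    return created > 0 ? jobs_total : -1;
}
```

The broadcast at the end wakes every idle worker so each one can notice the stopping flag. Workers that still find jobs in the queue finish them first.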

Working examples along with compiling instructions are going to be uploaded at unixworks' repo.

Summing up

Threading in computer programs is an extensive field. Covering threads in depth would require more than one article to do it right. However, after diving a bit into threading, applied to POSIX and C in this article, we can see that most of it is a game of locking and releasing, waiting and signaling.

Although using threads is not always the best idea to make a program faster, knowing how to implement them can help in our programming design workflows.

There is a newer header for threads, introduced in C11 and named threads.h, which may replace pthread.h in the future. Right now using it reduces portability, since it is only available in some major C compilers.

Also OpenMP is a multi threading implementation worth mentioning for larger projects. It is an industry standard and is portable and multi-platform.