POSIX Threads in OS

Overview

The p in pthread stands for POSIX. POSIX is short for Portable Operating System Interface, a family of standards defined by the IEEE to keep code portable across operating systems. Let us see what “portability” means in computer science and what gave rise to POSIX.

The Idea Behind POSIX

Let us say that your professor gave you a programming assignment. You read the problem, came up with a solution, and hurray! The program you wrote ran perfectly. You submitted the assignment happily, but when the grades were released a few days later, you got a shock: instead of an outstanding grade, you got a bad one. You went to the professor, and he said that your code didn’t give the right answer on his PC. Where do you think the submission went wrong?

POSIX

From the above scenario, we understand that the code you wrote was compatible with your PC and executed perfectly fine there. On the other hand, the code wasn’t compatible with your professor’s PC. So, the solution to this problem is to write a program that runs correctly on both PCs. But how? The answer is simple: follow a common standard that both PCs support. In computer science, a standard is a set of rules defined by an organization and agreed upon by all the parties that work with the thing being standardized. That thing can be software, a set of code, a style of writing code, etc.

This example shows that when our code follows a standard, we can compile and run it on a different machine easily. The ability of a program to run correctly on different machines or platforms is what portability means. POSIX specifically standardizes APIs (Application Programming Interfaces) and command-line shells across variants of UNIX and other operating systems.

POSIX threads, also known as pthreads, is a parallel execution model. What does a parallel execution model mean? Threads let a program work on different tasks at the same time. A single process is divided into threads, and each thread has its own work to do. When the threads run in parallel on different processor cores, the process can finish its work faster.

In the case of pthreads, threads are created and controlled by calling the POSIX threads API.

What is an API? An API is an interface through which two pieces of software communicate, for example two programs, two applications, or a client and a server. An API defines a set of functions, and callers use these functions to carry out the communication. So, to understand the pthread API, let's look at the pthread library.

pthread library and the thread functions

The pthread library is a standardized API, and pthreads can be used from C/C++ through this library. The header to include is pthread.h. The library provides a set of functions, constants, and data types for creating, managing, and controlling threads.

Let’s understand some important functions that the pthread library provides:

pthread_create()

Creation of a thread can be done using the pthread_create() function in the pthread library.

Syntax

int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);

Number of arguments: 4

1st argument is a pointer to a pthread_t variable in which the TID (thread ID) of the new thread is stored. The TID uniquely identifies a thread within its process.
2nd argument is a pointer to a thread attributes object that specifies the properties of the new thread. Passing NULL selects the default attributes.
3rd argument is the function in which the new thread starts executing. When pthread_create() is called, it sets up a context for the thread, that is, all the data the thread needs to start its execution, and execution begins in this function.
4th argument is a pointer to the argument that is passed to the function given as the 3rd argument.

pthread_create() returns 0 on success and an error number on failure.

pthread_join()

The pthread_join() function is used to wait for a thread to complete its execution.

Syntax

int pthread_join(pthread_t thread, void **retval);

Number of arguments: 2

1st argument is the TID (thread ID) of the thread that the current thread is waiting for.
2nd argument is a pointer to the location where the return value of the thread named in the 1st argument is stored. It can be NULL if the return value is not needed.

pthread_exit()

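A thread ends its own execution by calling the pthread_exit() function. The function terminates the calling thread, not some other thread.

Syntax

void pthread_exit(void *retval);

Number of arguments: 1

retval is a pointer to the thread's return value. Because it is a void pointer, it can point to any kind of data, such as an integer status code. Another thread can obtain this pointer through the 2nd argument of pthread_join(). It should not point to a local variable of the exiting thread, because that variable no longer exists once the thread has ended.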

These 3 functions are among the most important ones for creating and managing threads. The pthread library provides several other functions as well, such as pthread_cancel() and pthread_equal().

Why are pthreads used?

pthreads offer the following benefits:

Pthreads are used to leverage the power of multiple processors. A process is broken into threads, each thread can run on its own processor, and because multiple processors execute threads at the same time, the work is done in parallel.
This parallelism improves the performance of the program or application that uses the threads.
The process completes faster.
All the threads of a process share one address space, which they can use to communicate with each other. This saves memory and makes communication among threads faster than communication among processes.
Using pthreads puts less overhead on the operating system. Overhead can be understood with a small example. Let’s say you want to have a sandwich: before you can eat it, you have to buy cheese, cut veggies, etc. This preparatory work is the overhead. So overhead is the extra time and effort spent on top of the end goal itself. With threads, the OS overhead is lower than with separate processes, because creating threads and switching between them is cheaper than creating processes and switching between them.

Let's write some code!

Example 1

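Compile (assuming the program is saved as example1.c, a file name chosen here for illustration)

gcc example1.c -o example1 -pthread

Execute

./example1

Output

Hello Scaler!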

Explanation

A thread, in general, is used to do a task or perform some functionality. In this example, a function named "samplejob" does something simple, printing Hello Scaler!, and this function is passed to the thread as the task to run. In the main function, a thread variable of data type pthread_t is declared. The thread is then created using pthread_create(), and main waits for it to finish using pthread_join().

Example 2

Now, to check whether different threads really run at the same time, let's create 2 different threads and test them.

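Compile (assuming the program is saved as example2.c, a file name chosen here for illustration)

gcc example2.c -o example2 -pthread

Execute

./example2

Output

Hello Scaler!
Hello Scaler!
How are you?
How are you?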

Explanation

Here the function "samplejob" executes the first print statement, Hello Scaler!, and then, after a pause of 3 seconds, the second print statement, How are you?. In the main function, we create 2 threads using pthread_create() and wait for them to complete using pthread_join(). In the output, Hello Scaler! is printed 2 times, and after 3 seconds How are you? is printed 2 times. This indicates that both threads, first_thread and second_thread, run at the same time: if they ran one after the other, each Hello Scaler! would be followed by its own How are you?. From this example, we understand that we can leverage the power of parallel execution using pthreads.

Summary

pthreads is a parallel execution model, and we use the pthread library to create and manage threads.
The pthread library provides an IEEE-standardized API that consists of different functions for performing thread operations. The standardization makes code portable.
pthreads help in achieving parallelism with less overhead on the operating system.
The header used for the pthread library is pthread.h.
pthread_create() is used to create a thread. pthread_join() is used to wait for a thread to complete its execution. pthread_exit() is used to end the calling thread.