In programming, regardless of the language, execution speed matters. There are many techniques for improving performance, and one highly effective approach is to leverage multiprocessing.
In this tutorial, I explore the basics of multiprocessing and how to implement it in Python.
Multiprocessing is a technique in computer engineering that allows running multiple processes concurrently. It improves execution performance on a multi-core system.
Each process has its own memory, which means processes can run independently of one another.
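A quick way to see this memory isolation is to modify a global variable in a child process and then check it in the parent. This is my own sketch, not part of the tutorial's examples:

```python
import multiprocessing

counter = 0  # global variable in the parent process

def increment():
    # Runs in a child process, which has its own copy of `counter`
    global counter
    counter += 1
    print(f'Child sees counter = {counter}')

if __name__ == '__main__':
    p = multiprocessing.Process(target=increment)
    p.start()
    p.join()
    # The parent's copy is untouched, because child memory is separate
    print(f'Parent sees counter = {counter}')  # prints 0
```

The child increments its own copy, but the parent still sees 0: the two processes do not share memory.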
Python supports multiprocessing through the multiprocessing module, which I find the easiest way to implement it.
Here is a simple example demonstrating multiprocessing in Python.
import multiprocessing

def worker(num):
    """Process worker function"""
    print(f'Worker: {num}')

if __name__ == '__main__':
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        jobs.append(p)
        p.start()
Output:
Worker: 3
Worker: 0
Worker: 2
Worker: 1
Worker: 4
If I execute the code again, I might get the output in a different order. This is because all the processes run independently and concurrently: whichever process finishes first prints its result first.
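If I need the output in a fixed order, I can wait for each process with join() before starting the next one. Here is a sketch based on the example above; note that this serializes the work and gives up the parallelism:

```python
import multiprocessing

def worker(num):
    print(f'Worker: {num}')

if __name__ == '__main__':
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        p.start()
        p.join()  # wait for this process to finish before starting the next
```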
Explanation:
- multiprocessing.Process() creates a new process.
- The worker() function runs independently in each process; it is passed to the process through the target argument.

In the above example, I created one process per call to the worker() function. Suppose I have a 4-core system: only 4 processes can run in parallel at a time. In that case, it doesn't make sense to create 5 processes, because the extra process only adds overhead.
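To pick a sensible number of processes for the machine I'm on, I can ask Python how many cores are available. A minimal sketch:

```python
import multiprocessing

# Number of CPU cores visible to Python on this machine
num_cores = multiprocessing.cpu_count()
print(f'This machine has {num_cores} cores')
```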
To overcome this situation, I use multiprocessing.Pool() to control how many worker processes are created.
Let’s take an example where I have to find the cube of numbers from 0 to 9.
import multiprocessing

def cube(val):
    return val * val * val

if __name__ == '__main__':
    with multiprocessing.Pool(4) as pool:
        results = pool.map(cube, range(10))
    print(results)
Output:
[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]
Explanation:
- pool.map() maps the function over the input values: the first parameter is the function name (here, cube()) and the second is the input list.
- pool.map() returns the result of each process execution as a list, in the same order as the input.

In the above two examples, I have not used any shared object; all the processes run independently.
There are many use cases where I have to share state, values, or objects between processes.
Here is an example that counts the number of processes created. It shares a counter value, and each process increments it by one.
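Besides sharing a value directly, another common pattern is passing data between processes with multiprocessing.Queue. This is a sketch of my own, not one of the tutorial's examples:

```python
import multiprocessing

def produce(queue):
    # The child process puts a result onto the shared queue
    queue.put('hello from the child process')

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=produce, args=(queue,))
    p.start()
    print(queue.get())  # blocks until the child has put a value
    p.join()
```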
import multiprocessing

def count_process(shared_counter):
    with shared_counter.get_lock():
        shared_counter.value += 1

if __name__ == '__main__':
    counter = multiprocessing.Value('i', 0)
    processes = [
        multiprocessing.Process(target=count_process, args=(counter,))
        for _ in range(4)
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print(f'Total number of processes: {counter.value}')
Output:
Total number of processes: 4
Explanation:
- multiprocessing.Value is used to share the counter between multiple processes.
- get_lock() ensures that only one process increments or updates the counter at a time.

I have explained what multiprocessing is and how it can be used in Python to improve the performance of your code. I hope these examples of using a pool of workers and sharing state between processes make your learning easier.
If you have any questions or doubts, feel free to ask me in the comment section below.
Meanwhile, you can also explore the other features of the multiprocessing module in Python to get the most out of multiprocessing.
Happy coding!