Multiprocessing in Python | Example Explained with Code

In programming, whatever the language, speed of execution matters. There are many techniques for improving performance, and one effective option for CPU-heavy work is multiprocessing.

In this tutorial, I’m exploring the basics of multiprocessing and how to implement multiprocessing in Python.

What is multiprocessing?

Multiprocessing is a technique in computer engineering that allows running multiple processes concurrently. On a multi-core system, the operating system can schedule those processes on different cores, which improves performance for work that can be split up.

Each process has its own memory space, so processes run independently of one another.
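
To make the "own memory" point concrete, here is a minimal sketch. The names counter and increment are made up for this illustration. The child process changes its own copy of a module-level variable, and the parent's copy stays untouched.

```python
import multiprocessing

counter = 0  # module-level variable; every process works on its own copy


def increment():
    """Change the global, but only inside the process that runs this call."""
    global counter
    counter += 1
    print(f"child sees counter = {counter}")    # child sees counter = 1


if __name__ == "__main__":
    p = multiprocessing.Process(target=increment)
    p.start()
    p.join()  # wait for the child to finish
    print(f"parent sees counter = {counter}")   # parent sees counter = 0
```

The parent still prints 0 because the increment happened in the child's memory, not in the parent's.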

Advantages of using Multiprocessing

Here are the three things I like about using multiprocessing in my Python projects:

  1. Performance improvement:
    By using multiprocessing, I divide a task into pieces that separate processes can work on.
  2. Parallel execution:
    It allows me to execute multiple processes simultaneously on different cores, making execution faster.
  3. Avoid GIL:
    The GIL (Global Interpreter Lock) allows only one thread at a time to execute Python bytecode within a process. In multiprocessing, each process has its own interpreter and its own GIL. In other words, by using multiprocessing, we bypass the bottleneck that the GIL creates for CPU-bound work in a multithreaded program.
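
The GIL point can be checked with a rough experiment. The sketch below is only an illustration: the helper names cpu_bound and timed_map and the workload size are my own choices. It runs the same CPU-bound function once in a thread pool and once in a process pool. The timings depend on the machine and the Python build, so I print them rather than promise specific numbers.

```python
import time
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool


def cpu_bound(n):
    """Pure-Python loop; while it runs, the thread holds the GIL."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_map(pool_factory, inputs):
    """Map cpu_bound over inputs with a 4-worker pool; return (results, seconds)."""
    start = time.perf_counter()
    with pool_factory(4) as pool:
        results = pool.map(cpu_bound, inputs)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    inputs = [2_000_000] * 4  # illustrative workload size
    thread_results, thread_secs = timed_map(ThreadPool, inputs)
    process_results, process_secs = timed_map(Pool, inputs)
    print(f"threads:   {thread_secs:.2f}s")
    print(f"processes: {process_secs:.2f}s")
    print(thread_results == process_results)  # both pools compute the same values
```

On a multi-core machine with a standard CPython build, I would expect the process pool to finish sooner, since the threads take turns holding a single GIL.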

Basic Example Explaining Multiprocessing in Python

Python supports multiprocessing through the multiprocessing module. I found this module the easiest way to implement multiprocessing.

Here I’m writing a simple example demonstrating multiprocessing in Python.

import multiprocessing

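def worker(num):
    """Process worker function"""
    print(f'Worker: {num}')

if __name__ == '__main__':
    jobs = []
    for i in range(5):
        # Each Process runs worker(i) in its own OS process
        p = multiprocessing.Process(target=worker, args=(i,))
        jobs.append(p)
        p.start()

    # Wait for every child process to finish
    for p in jobs:
        p.join()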

Output:

Worker: 3
Worker: 0
Worker: 2
Worker: 1
Worker: 4

If I execute the code again, I might get the output in a different order. This is because all the processes run independently and simultaneously. Whichever process reaches its print() call first appears first in the output.

Explanation:

  • You don’t have to install anything, as the multiprocessing module is part of Python’s standard library.
  • Here I am creating 5 processes using a for loop.
  • To create a process I’m using the multiprocessing.Process class.
  • Each process executes the function worker() independently. I pass the function itself as the target argument and its arguments as the args tuple.
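
To see that each call really runs in a separate OS process, a small variation can print process IDs. This is a sketch rather than part of the example above: the name= values are arbitrary labels, and exit_codes is a helper variable I added.

```python
import multiprocessing
import os


def worker(num):
    """Report which OS process is running this call."""
    print(f"Worker {num} running in PID {os.getpid()}")


if __name__ == "__main__":
    print(f"Parent PID: {os.getpid()}")
    jobs = [
        multiprocessing.Process(target=worker, args=(i,), name=f"worker-{i}")
        for i in range(3)
    ]
    for p in jobs:
        p.start()
    for p in jobs:
        p.join()  # block until this child has exited

    # exitcode is 0 for a child that finished without an error
    exit_codes = [p.exitcode for p in jobs]
    print(exit_codes)
```

Each worker line should show a PID different from the parent's. The actual numbers change from run to run.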

Using a Pool of Workers

In the above example, I created one process for each call to worker(). Suppose I have a 4-core system. Only 4 processes can actually run at the same time, so a fifth process has to wait for a free core. In this case, creating 5 processes doesn’t buy anything, and the extra process is pure overhead.

To overcome this situation, I use multiprocessing.Pool() to set the number of worker processes that run concurrently.

Let’s take an example where I have to find the cube of numbers from 0 to 9.

import multiprocessing

def cube(val):
    return val * val * val

if __name__ == '__main__':
    with multiprocessing.Pool(4) as pool:
        results = pool.map(cube, range(10))
    print(results)

Output:

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]

Explanation:

  • I defined a pool of four processes. That means at most four tasks are executed concurrently.
  • The pool.map() function applies a function to every item of the input. The first parameter is the function and the second parameter is an iterable of inputs.
  • range(10) produces the numbers from 0 to 9. Each number is passed as an input to the worker function cube().
  • pool.map() returns the results as a list, in the same order as the inputs.
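
Instead of hard-coding 4, the pool size can be taken from the machine. The sketch below is a variation I put together, not the only way to do it. It sizes the pool with multiprocessing.cpu_count() and also tries pool.imap_unordered(), which yields results as they complete rather than in input order.

```python
import multiprocessing


def cube(val):
    return val * val * val


if __name__ == "__main__":
    workers = multiprocessing.cpu_count()  # number of CPUs reported by the OS
    with multiprocessing.Pool(processes=workers) as pool:
        # map() keeps the input order
        ordered = pool.map(cube, range(10))
        # imap_unordered() may yield in any order; consume it inside the with block
        unordered = list(pool.imap_unordered(cube, range(10)))

    print(ordered)
    print(sorted(unordered) == ordered)  # same values, possibly a different order
```

I would reach for imap_unordered() only when the order of results does not matter, for example when summing them.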

Sharing State between Processes

In the above two examples, I have not used any shared object. All the processes run independently.

There are many use cases where I have to share state, values, or objects between processes.

Here is an example where I count the number of processes created. The processes share a single counter, and each process increases its value by one.

import multiprocessing

def count_process(shared_counter):
    with shared_counter.get_lock():
        shared_counter.value += 1

if __name__ == '__main__':
    counter = multiprocessing.Value('i', 0)
    processes = [
        multiprocessing.Process(
            target=count_process,
            args=(counter,)
        ) for _ in range(4)
    ]

    for p in processes:
        p.start()
    for p in processes:
        p.join()

    print(f'Total number of processes: {counter.value}')

Output:

Total number of processes: 4

Explanation:

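  • Here we use Value to share the counter between multiple processes. 'i' is the type code for a signed integer and 0 is the initial value.
  • get_lock() returns the lock that belongs to the Value. Holding it ensures that only one process updates the counter at a time.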
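
Value holds a single number. For a fixed-size sequence, the same module offers multiprocessing.Array. Here is a minimal sketch, with fill_slot and squares being names I made up, where each process writes to its own slot of a shared array.

```python
import multiprocessing


def fill_slot(shared_squares, index):
    """Write index squared into this worker's own slot of the shared array."""
    shared_squares[index] = index * index


if __name__ == "__main__":
    squares = multiprocessing.Array('i', 4)  # four signed ints, initialized to 0
    processes = [
        multiprocessing.Process(target=fill_slot, args=(squares, i))
        for i in range(4)
    ]

    for p in processes:
        p.start()
    for p in processes:
        p.join()

    result = squares[:]  # copy the shared values into a regular list
    print(result)  # [0, 1, 4, 9]
```

Because every process touches a different index, the order in which they run does not change the result.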

Conclusion

I have explained what multiprocessing is and how to use it in Python to improve the performance of your code. I hope these examples of using a pool of workers and sharing state between multiple processes make your learning easier.

If you have any questions or doubts, feel free to ask me in the comment section below.

Meanwhile, you can also explore the other features of Python’s multiprocessing module to get more out of multiprocessing.

Happy coding!
