In this article, written by Jalem Raj Rohit, author of the book Julia Cookbook, we will cover the following recipes:
- Basic concepts of parallel computing
- Data movement
- Parallel maps and loop operations
- Channels
In this article, you will learn about performing parallel computing and using it to handle big data. Concepts such as data movement, sharded arrays, and the Map-Reduce framework are important to know in order to handle large amounts of data by computing on them with parallelized CPUs. All the concepts discussed in this article will help you build solid parallel computing and multiprocessing basics, including efficient data handling and code optimization.
Parallel computing is a way of carrying out many computations at the same time. This can be done by connecting multiple computers as a cluster and using their CPUs to carry out the computations.
This style of computation is used when handling large amounts of data and when running complex algorithms over significantly large data. The computations execute faster because multiple CPUs run them in parallel, each with direct access to its own RAM.
Julia has built-in support for parallel computing and multiprocessing, so these computations rarely require any external libraries.
First, start the Julia REPL in multiprocessing mode with two worker processes:
julia -p 2
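Alternatively, worker processes can be added from inside an already running session. Here is a minimal sketch using the standard addprocs() function (part of the Distributed standard library on Julia 1.x):
# Add two worker processes to the current Julia session
addprocs(2)
# nprocs() counts all processes (master plus workers); workers() lists the worker ids
nprocs()
workers()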
# Ask worker 2 to execute rand(2, 2); this returns immediately with a remote reference
task = remotecall(2, rand, 2, 2)
The preceding command returns a remote reference to the pending result, because the computation runs asynchronously on worker 2.
# Wait for the remote computation to finish and retrieve its result
fetch(task)
The preceding command returns the 2 x 2 matrix of random numbers that was generated on worker 2.
# Run a follow-up computation on worker 2, adding 5 to each element of the fetched result
task2 = @spawnat 2 5 .+ fetch(task)
# Read the element at row 1, column 1 of the remote result, directly on worker 2
remotecall_fetch(2, getindex, task2, 1, 1)
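Note that the argument order above follows the older Julia releases the book targets. On recent Julia versions (0.5 and later, including 1.x), the function comes before the worker id. A sketch of the equivalent calls, assuming a 1.x session:
# When Julia is started with -p, the Distributed standard library is loaded automatically
task = remotecall(rand, 2, 2, 2)           # run rand(2, 2) on worker 2
fetch(task)                                # retrieve the 2 x 2 matrix
task2 = @spawnat 2 5 .+ fetch(task)        # add 5 to each element, again on worker 2
remotecall_fetch(getindex, 2, task2, 1, 1) # read element [1, 1] of the remote result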
In parallel computing, data movements are quite common, and they should be minimized because each movement adds time and network overhead. In this recipe, we will see how data movement can be optimized to reduce latency as much as possible.
To get ready for this recipe, you need to have the Julia REPL started in the multiprocessing mode. This is explained in the Getting ready section of the preceding recipe.
# Create a 200 x 200 matrix of random numbers on the master process
mat = rand(200, 200)
# Spawn the squaring of the matrix on one of the available workers
exec_mat = @spawn mat^2
# Retrieve the result of the remote computation
fetch(exec_mat)
The preceding command returns the 200 x 200 matrix obtained by squaring mat on the worker.
# Create and square the random matrix in a single remote computation,
# so that the matrix never has to move between processes
mat = @spawn rand(200, 200)^2
fetch(mat)
The preceding command also returns a squared 200 x 200 matrix; this time, the random matrix was both created and squared on the worker.
In the first example, we construct a 200 x 200 matrix on the master process and then use the @spawn macro to execute the squaring for us on a worker. The @spawn macro picks one of the two running worker processes and carries out the computation there, which means mat has to be copied from the master to that worker first.
In the second example, you learned how to use the @spawn macro directly, without the extra initialization step: the matrix is created and squared on the same worker, so no data has to move between processes. The fetch() function retrieves the result of a spawned computation back to the master process. More on this will be covered in the following recipes.
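To see the cost of data movement directly, you can time both variants with the @time macro. This is a rough sketch; the exact numbers depend on your machine:
# Variant 1: mat lives on the master process and must be copied to the worker
mat = rand(200, 200)
@time fetch(@spawn mat^2)
# Variant 2: the matrix is both created and squared on the worker, so mat never moves
@time fetch(@spawn rand(200, 200)^2)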
In this recipe, you will learn a bit about the famous Map-Reduce framework and why it is one of the most important ideas in the domains of big data and parallel computing. You will learn how to parallelize loops and run reducing functions over their results across several CPUs and machines, using the parallel computing concepts you learned in the previous recipes.
Just like in the previous sections, Julia only needs to be running in multiprocessing mode to follow along with these examples. This can be done using the instructions given in the first section.
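The example below loads a file named count_heads.jl, whose contents are not shown in this excerpt. A minimal sketch of what it could contain, modeled on the parallel computing example in the Julia manual:
# count_heads.jl
# Count how many of n simulated coin flips come up heads
function count_heads(n)
    c = 0
    for i = 1:n
        c += rand(Bool)    # rand(Bool) contributes 1 for heads, 0 for tails
    end
    return c
end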
require("count_heads")
a = @spawn count_heads(100)
b = @spawn count_heads(100)
fetch(a) + fetch(b)
# Parallelize the loop across the workers (map) and combine the
# partial sums with the + operator (reduce);
# on Julia 1.0 and later, @parallel has been renamed @distributed
nheads = @parallel (+) for i = 1:200
    Int(rand(Bool))
end
The count_heads() function is a simple Julia function that adds up random bits over the loop iterations. It was created just to demonstrate Map-Reduce operations.
In the second step, we spawn two separate processes to execute the function, then fetch both results and add them up.
However, this is not a very clean way to carry out the parallel computation of functions and loops. Instead, the @parallel macro provides a better way: it lets the user parallelize the loop across the workers (the map step) and then combine the computations through a reduction operator (the reduce step). Together, these two steps make up the Map-Reduce operation.
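For per-element work, Julia also provides the pmap() function, a parallel map that distributes individual calls over the available workers. A minimal sketch reusing count_heads():
# Run count_heads(100) four times, with the calls spread across the workers;
# the result is a vector of the four partial counts
partial_counts = pmap(count_heads, [100, 100, 100, 100])
sum(partial_counts)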
Channels are like the background plumbing for parallel computing in Julia. They are the reservoirs from which the individual processes access their data.
The prerequisites are similar to those of the previous sections. This is mostly a theoretical section, so you just need to run the experiments on your own. For that, run your Julia REPL in multiprocessing mode.
Channels are shared queues with a fixed length. They are common data reservoirs for the running processes.
Channels are common data resources that multiple readers or workers can access. The data can be read through the fetch() function, which we already discussed in the previous sections.
The workers can also write to the channel through the put!() function. This means that the workers can add more data to the resource, which can be accessed by all the workers running a particular computation.
Closing a channel after usage is a good practice to avoid data corruption and unnecessary memory usage. It can be done using the close() function.
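As a minimal sketch of these ideas on recent Julia versions (note that take!() removes an item from the channel, while fetch() reads it without removing it):
# Create a channel that can hold up to 10 Int values
c = Channel{Int}(10)
# A worker or task writes data into the channel with put!()
put!(c, 1)
put!(c, 2)
# fetch() reads the first value without removing it
println(fetch(c))   # prints 1
# take!() removes and returns values in order
println(take!(c))   # prints 1
println(take!(c))   # prints 2
# Close the channel once it is no longer needed
close(c)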
In this article, we covered the basic concepts of parallel computing and the data movement that takes place across the network.
We also learned about parallel maps and loop operations, along with the famous Map-Reduce framework. At the end, we got a brief understanding of channels and how individual processes access their data through them.