Multi-node Parallelism
-----------------------

.. admonition:: Overview
   :class: Overview

   * **Tutorial:** 20 min
   * **Exercises:** 5 min

   **Objectives:**

   #. Learn about distributed computing using message passing.
   #. Learn about the *Broadcast* operation in MPI.
   #. Learn about the *Gather* operation in MPI.

The forms of parallelism covered so far are beneficial, but they are all limited to a single node. To truly scale up an application, we need to use multiple nodes, i.e., distributed computing. The main challenge with distributed computing is that each node has its own separate memory, so a thread on one node cannot directly access data on another node.

.. image:: ../figs/multinodePrallelism.drawio.png

We overcome this challenge by using message passing. The **Message Passing Interface (MPI)** is a standardized, portable specification for message passing in parallel computing. It enables different processes, possibly running on different machines in a distributed system, to communicate and coordinate their actions by sending and receiving messages. MPI is widely used for developing parallel applications and handles tasks ranging from simple data exchange to complex inter-process communication in high-performance computing (HPC) environments. Its key features include support for both point-to-point and collective communication, process synchronization, and error handling.

.. image:: ../figs/MPI.png

Broadcast Operation
*******************

The **MPI broadcast operation** is a collective communication function that allows one process, known as the "root" process, to send a message to all other processes in a communicator. In other words, the root process "broadcasts" a message to every other process in the group, so that each process ends up with the same data.

.. image:: ../figs/bcast.png

This operation is often used to distribute initial data or configuration information to all processes participating in a parallel computation.
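
As a rough illustration of the broadcast described above, the following sketch uses the ``mpi4py`` Python bindings. This is an assumption, since this section does not state which language the exercise scripts use. The function name ``broadcast_settings`` and the contents of the dictionary are hypothetical and chosen only for illustration.

.. code-block:: python

    from mpi4py import MPI


    def broadcast_settings(comm, root=0):
        """Return the same settings dictionary on every rank of ``comm``."""
        rank = comm.Get_rank()
        # Only the root process holds the data before the broadcast;
        # every other rank starts with a placeholder.
        settings = {"steps": 100, "tolerance": 1e-6} if rank == root else None
        # Collective call: every rank in the communicator must reach this line.
        return comm.bcast(settings, root=root)


    if __name__ == "__main__":
        comm = MPI.COMM_WORLD
        settings = broadcast_settings(comm, root=0)
        print(f"rank {comm.Get_rank()} of {comm.Get_size()} received {settings}")

A script like this would typically be launched with a command along the lines of ``mpiexec -n 4 python bcast_sketch.py`` (the file name is hypothetical), after which each rank should report the same dictionary.
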
The broadcast operation is an efficient way to keep data consistent across multiple processes.

GPU-aware MPI and All-Gather Operation
**************************************

The **MPI Allgather** operation is a collective communication function that allows all processes in a communicator to exchange and collect data from every other process. Specifically, each process contributes its own data and receives the combined contributions of all processes, ordered by rank. As a result, every process has a complete view of the data held across the communicator.

.. image:: ../figs/allgather.png

The MPI Allgather operation is useful when every process needs to access or aggregate information from all other processes, such as in parallel data processing and aggregation tasks.

Exercise
*********

1. What distinguishes blocking MPI from non-blocking MPI?

   .. code-block:: console
      :linenos:

      qsub 6_mpi.pbs
      qsub 7_mpi_non_blocking.pbs

2. How will you perform the broadcast from *Process 1*?

   .. code-block:: console
      :linenos:

      qsub 8_mpi_bcast.pbs

3. Can you aggregate the gathered values in *Process 0*?

   .. code-block:: console
      :linenos:

      qsub 9_mpi_gpu.pbs

.. admonition:: Key Points
   :class: hint

   #. MPI is an effective tool for distributed computing.
   #. Message passing incurs a communication cost.