Message Passing Interface (MPI)

Message Passing Interface (MPI) is a standardized, portable message-passing interface for parallel programming on distributed-memory systems. Unlike OpenMP, which targets shared memory, MPI is designed for systems where each process has its own private memory and processes communicate by explicitly sending and receiving messages.

The Message Passing Interface Standard (MPI) is a message-passing library standard based on the consensus of the MPI Forum, which has over 40 participating organizations, including vendors, researchers, software library developers, and users. The goal of MPI is to establish a portable, efficient, and flexible standard that will be widely used for writing message-passing programs. As such, MPI is the first standardized, vendor-independent message-passing library interface, and the advantages of developing message-passing software with MPI closely match those design goals of portability, efficiency, and flexibility. MPI is not an IEEE or ISO standard, but it has in fact become the "industry standard" for writing message-passing programs on HPC platforms.

MPI is a language-independent communications protocol used to program parallel computers. Both point-to-point and collective communication are supported. MPI "is a message-passing application programmer interface, together with protocol and semantic specifications for how its features must behave in any implementation." MPI's goals are high performance, scalability, and portability. MPI remains the dominant model used in high-performance computing today.

Why do we use MPI?

MPI is used to:

• Solve very large problems that don’t fit in a single machine’s memory
• Scale applications across hundreds or thousands of nodes
• Achieve high performance in distributed systems

It’s essential when:

• Shared-memory parallelism (as in OpenMP) is not enough, because the problem outgrows a single node
• You need fine-grained control over communication

When should you use MPI?

Use MPI when:

• You are working on clusters, supercomputers, or cloud HPC systems
• Your application needs distributed memory parallelism
• The workload can be divided among processes that compute largely independently and exchange data through messages

Typical use cases:

• Climate modeling
• Large-scale simulations
• Computational physics
• Big data processing
• Machine learning at scale

Key Features

• Explicit communication (send/receive messages)
• Scalability to thousands of processes
• Portability across platforms
• High performance on distributed systems
• Supports both blocking (including synchronous-mode) and non-blocking (asynchronous) communication

Key Components

1. Point-to-Point Communication

Direct communication between two processes:

• MPI_Send
• MPI_Recv

2. Collective Communication

Operations involving multiple processes:

• Broadcast (MPI_Bcast)
• Reduce (MPI_Reduce)
• Gather/Scatter (MPI_Gather, MPI_Scatter)

3. Communicators

Define groups of processes that can communicate:

• MPI_COMM_WORLD (predefined communicator containing all processes started with the program)

4. Process Management

Each process has:

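• A rank: a unique integer ID within a communicator, numbered from 0 (queried with MPI_Comm_rank)
• Access to the total number of processes in that communicator (queried with MPI_Comm_size)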

Advantages

• Works on distributed memory systems
• Extremely scalable (used in supercomputers)
• Gives fine control over communication
• Efficient for large-scale problems

Disadvantages

• More complex than OpenMP
• Requires explicit data communication
• Debugging can be difficult
• Communication overhead can hurt performance

The MPI Specification

• MPI is a specification for the developers and users of message passing libraries. By itself, it is NOT a library - but rather the specification of what such a library should be.

• MPI primarily addresses the message-passing parallel programming model: data is moved from the address space of one process to that of another process through cooperative operations on each process.

• Simply stated, the goal of the Message Passing Interface is to provide a widely used standard for writing message passing programs. The interface attempts to be:
- Practical
- Portable
- Efficient
- Flexible

• The MPI standard has gone through a number of revisions. MPI-3 (2012) has since been followed by MPI-4.0 (2021) and later releases, so check the MPI Forum for the current version.

• Interface specifications (language bindings) are defined for C and Fortran:
- The C++ bindings, deprecated in MPI-2.2, were removed in MPI-3
- MPI-3 also provides support for Fortran 2003 and 2008 features

• Actual MPI library implementations differ in which version and features of the MPI standard they support. Developers/users will need to be aware of this.

The MPI interface is meant to provide essential virtual topology, synchronization, and communication functionality between a set of processes (that have been mapped to nodes/servers/computer instances) in a language-independent way, with language-specific syntax (bindings), plus a few language-specific features. MPI programs always work with processes, but programmers commonly refer to the processes as processors. Typically, for maximum performance, each CPU (or core in a multi-core machine) will be assigned just a single process. This assignment happens at runtime through the agent that starts the MPI program, normally called mpirun or mpiexec.