MPI is a library-based specification for parallel and distributed computing, with official bindings for C and Fortran. MPI is designed to express parallelism explicitly in distributed-memory environments using processes and library calls, while POSIX threads and OpenMP are used for shared-memory parallel programming with threads. MPI supports many parallel architectures, not just distributed systems. Compilation tools for MPI (such as mpicc) are typically wrappers around existing compilers that set the required flags and environment.
#include <mpi.h>
int main(int argc, char *argv[])
{
    /* No MPI call before this line */
    MPI_Init(&argc, &argv);
    /* MPI calls go here */
    MPI_Finalize();
    /* No MPI call after this line */
    return 0;
}
int MPI_Init(int *argc_p, char ***argv_p)
is used to initialize the MPI environment and must be called by all processes in a parallel program before any other MPI function can be called. MPI_Init is responsible for setting up the communication infrastructure that allows the processes to communicate and coordinate their work. argc_p and argv_p are pointers to the parameters of the main function (they may also be NULL). For programs that combine MPI with threads, MPI_Init_thread is used instead of MPI_Init, so that the required level of thread support can be requested.
Dually, int MPI_Finalize(void)
de-allocates the message buffers and cleans up all MPI state; no MPI call is allowed after it.
MPI_COMM_WORLD
is the default communicator and contains all the processes started with the program. A communicator is a collection of processes that can communicate with each other, and a program can create many communicators.
The int MPI_Comm_size(MPI_Comm comm, int *size)
function returns the number of processes in the communicator, while the int MPI_Comm_rank(MPI_Comm comm, int *rank)
function returns the rank of the calling process in the communicator. These functions are used to identify and distinguish processes within the communicator.
The return value of these functions is an error code, while the relevant data is written into the variable pointed to by the last argument.
#include <stdio.h>
#include <mpi.h>
int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);
    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);
    // Print off a hello world message
    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);
    // Finalize the MPI environment.
    MPI_Finalize();
    return 0;
}
The MPI_Send
routine is used to send a message to another process in a communicator.
int MPI_Send(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)
The MPI_Recv
routine is used to receive a message from another process in a communicator.
int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)
Both are blocking routines: MPI_Recv does not return until a matching message has been received, and MPI_Send does not return until the send buffer can safely be reused (which may require the matching receive to have started). A minimal example is sketched after the following parameter list.
- buf is the array ready to receive data, or the data ready to be sent.
- count states how many elements of the given datatype will be sent, or the maximum number that fits in the receive buffer.
- The status parameter is used to store information about the received message.
- dest specifies the rank of the process in the communicator comm to which the message is sent, while source specifies the rank of the process from which the message is received. source can be a specific rank, such as 0 or 1, or the special value MPI_ANY_SOURCE, which indicates that the message can be received from any process in the communicator comm. This can be useful if you don't know the rank of the process that will be sending the message.
- tag is used to distinguish messages travelling on the same connection (nonnegative int).
- Note that sending and receiving messages introduces synchronization between processes.
- Where there is synchronization, deadlocks may arise.
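As a minimal sketch of how the two routines fit together (the value 42 and the variable names are only illustrative, and the program is assumed to run with at least two processes), rank 0 sends a single integer to rank 1:
#include <stdio.h>
#include <mpi.h>
int main(int argc, char *argv[]) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        value = 42;
        // Send one MPI_INT to rank 1 with tag 0
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        // Block until the message with tag 0 from rank 0 arrives
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 received %d\n", value);
    }
    MPI_Finalize();
    return 0;
}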
The MPI_Datatype argument defines the type of the data carried by the message.
Possible datatypes:
- MPI_INT
- MPI_CHAR
- MPI_BYTE
- MPI_FLOAT
- MPI_DOUBLE
- MPI_SHORT
- MPI_LONG
- MPI_LONG_DOUBLE
- MPI_UNSIGNED
- MPI_UNSIGNED_CHAR
- MPI_UNSIGNED_SHORT
- MPI_UNSIGNED_LONG
Typically all processes can access the standard output, while only process 0 can access the standard input.
int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
Broadcasts count elements from the buffer of the root process to every other process in the communicator.
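A minimal sketch of a broadcast (the value 100 is only illustrative): rank 0 sets a variable and, after the call, every process holds a copy of it.
#include <stdio.h>
#include <mpi.h>
int main(int argc, char *argv[]) {
    int rank, data = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        data = 100;   // only the root holds the value initially
    // After the call, data == 100 on every process in MPI_COMM_WORLD
    MPI_Bcast(&data, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Rank %d has data = %d\n", rank, data);
    MPI_Finalize();
    return 0;
}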
int MPI_Gather(const void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)
Joins equally sized chunks of data from all processes in a communicator. The result, stored in the recvbuf of the root process, is a single vector of size = communicator size * recvcount elements.
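For instance, in the following sketch (the per-process contributions are made up for illustration) each process contributes a single integer and rank 0 ends up with a vector of communicator-size elements:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    int contribution = rank * rank;   // one element per process
    int *gathered = NULL;
    if (rank == 0)
        gathered = malloc(size * sizeof(int));   // only the root needs the receive buffer
    // Collect one int from every process into gathered on rank 0
    MPI_Gather(&contribution, 1, MPI_INT, gathered, 1, MPI_INT, 0, MPI_COMM_WORLD);
    if (rank == 0) {
        for (int i = 0; i < size; i++)
            printf("gathered[%d] = %d\n", i, gathered[i]);
        free(gathered);
    }
    MPI_Finalize();
    return 0;
}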
int MPI_Scatter(const void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)
Scatters data evenly across all processes in the communicator. sendcount is the number of elements sent to each process, i.e. stored in each process's recvbuf.
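A sketch of the symmetric operation (buffer contents are illustrative): rank 0 distributes one element of its array to each process.
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    int *sendbuf = NULL;
    if (rank == 0) {   // only the root fills the send buffer
        sendbuf = malloc(size * sizeof(int));
        for (int i = 0; i < size; i++)
            sendbuf[i] = 10 * i;
    }
    int chunk;
    // Each process receives exactly one element of sendbuf
    MPI_Scatter(sendbuf, 1, MPI_INT, &chunk, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Rank %d received %d\n", rank, chunk);
    if (rank == 0)
        free(sendbuf);
    MPI_Finalize();
    return 0;
}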
int MPI_Scatterv(const void *sendbuf, const int sendcounts[], const int displs[], MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)
Scatters data across all processes according to a user-defined distribution. The process with rank k receives sendcounts[k] elements, starting from position displs[k] of the sendbuf.
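A possible sketch of an uneven distribution (the choice of giving rank k exactly k+1 elements is arbitrary and only meant to show how sendcounts and displs are filled):
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    // Counts, displacements and the send buffer are only needed on the root
    int *sendbuf = NULL, *sendcounts = NULL, *displs = NULL;
    if (rank == 0) {
        sendcounts = malloc(size * sizeof(int));
        displs = malloc(size * sizeof(int));
        int total = 0;
        for (int k = 0; k < size; k++) {
            sendcounts[k] = k + 1;   // rank k gets k+1 elements
            displs[k] = total;       // starting offset of rank k's chunk
            total += sendcounts[k];
        }
        sendbuf = malloc(total * sizeof(int));
        for (int i = 0; i < total; i++)
            sendbuf[i] = i;
    }
    int recvcount = rank + 1;
    int *recvbuf = malloc(recvcount * sizeof(int));
    MPI_Scatterv(sendbuf, sendcounts, displs, MPI_INT,
                 recvbuf, recvcount, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Rank %d received %d elements\n", rank, recvcount);
    free(recvbuf);
    if (rank == 0) { free(sendbuf); free(sendcounts); free(displs); }
    MPI_Finalize();
    return 0;
}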
int MPI_Reduce(const void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)
Reduces the values in the sendbuf using the specified operation and stores the result in the recvbuf of the root process.
Note that count specifies how many elements of the send/input buffer should be reduced. A sketch follows the table of supported operations.
MPI_Op | Operation |
---|---|
MPI_MAX | Maximum |
MPI_MIN | Minimum |
MPI_SUM | Sum |
MPI_PROD | Product |
MPI_LAND | Logical AND (Between booleans) |
MPI_BAND | Bit-wise AND |
MPI_LOR | Logical OR |
MPI_BOR | Bit-wise OR |
MPI_LXOR | Logical XOR |
MPI_BXOR | Bit-wise XOR |
MPI_MAXLOC | Maximum value and location |
MPI_MINLOC | Minimum value and location |
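As a sketch of MPI_Reduce, each process contributes the value rank + 1 and rank 0 receives the sum, i.e. 1 + 2 + ... + size (the choice of values is only illustrative):
#include <stdio.h>
#include <mpi.h>
int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    int local = rank + 1;   // each process contributes one value
    int sum = 0;
    // Combine the local values with MPI_SUM; only rank 0 receives the result
    MPI_Reduce(&local, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("Sum of 1..%d = %d\n", size, sum);
    MPI_Finalize();
    return 0;
}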
int MPI_Allreduce(const void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)
MPI_Allreduce is a collective operation that combines the values of all the processes in a communicator and broadcasts the result to all of them, while MPI_Reduce returns the result only to a single process (the root). In other words, it is the same as Reduce, but the result of the reduction is stored in the recvbuf of every process in the communicator.
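A minimal sketch (the local values are made up): every process contributes one value and every process ends up with the maximum.
#include <stdio.h>
#include <mpi.h>
int main(int argc, char *argv[]) {
    int rank, local, global_max;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    local = rank * rank;   // illustrative per-process value
    // Every process obtains the maximum over all local values
    MPI_Allreduce(&local, &global_max, 1, MPI_INT, MPI_MAX, MPI_COMM_WORLD);
    printf("Rank %d sees global max = %d\n", rank, global_max);
    MPI_Finalize();
    return 0;
}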
int MPI_Barrier(MPI_Comm comm)
provides explicit synchronization for all processes in a communicator: no process returns from the call until every process in the communicator has reached it.
// Synchronize all processes with a barrier
MPI_Barrier(MPI_COMM_WORLD);
double MPI_Wtime(void)
returns a processor-dependent wall-clock time, which can be used to measure the computing time of a portion of the program.
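A common timing pattern is sketched below (the loop is only a stand-in for the work being measured; the barrier is optional and just makes all processes start timing together):
#include <stdio.h>
#include <mpi.h>
int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);
    MPI_Barrier(MPI_COMM_WORLD);   // optional: start timing together
    double start = MPI_Wtime();
    // ... portion of the program to be timed ...
    double x = 0.0;
    for (long i = 1; i <= 10000000L; i++)
        x += 1.0 / i;
    double elapsed = MPI_Wtime() - start;
    printf("Elapsed time: %f seconds (x = %f)\n", elapsed, x);
    MPI_Finalize();
    return 0;
}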
int MPI_Comm_split(MPI_Comm comm, int color, int key, MPI_Comm *newcomm)
Partitions the group associated with comm into disjoint subgroups, one for each value of color; key controls the rank ordering within each new communicator.
MPI_Comm new_comm;
int color = 0;
int key = 0;
if (rank < n/2)
    color = 0;
else
    color = 1;
MPI_Comm_split(MPI_COMM_WORLD, color, key, &new_comm);
/*
Processes with `rank` less than `n/2` get color 0 and end up in one new communicator,
while the others get color 1 and end up in a different one. Since every process passes
the same `key`, the ranks in each new communicator keep the original ordering.
*/