Generate and write to file a random matrix in a distributed, balanced way with MPI
I am trying to generate a random 4x4 matrix in a parallel, "even" way: given the dimension N of the 1D array representing the matrix (16 in this case) and the number of processes NP, each process generates N/NP values (not necessarily aligned with row boundaries, so that the load is balanced across all processes; I treat the data as a plain 1D vector here and will handle it as a proper matrix in later steps), and then each process writes its part of the matrix to a file. I am currently running the following code with 8 processes (the run command is shown further below):
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int size, rank;

int distribute_elems_count(int elems_no) {
    int base_elems = elems_no / size;
    int spare_elems = elems_no % size;
    return rank < spare_elems ? ++base_elems : base_elems;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    MPI_Comm comm;
    MPI_Comm_dup(MPI_COMM_WORLD, &comm);
    MPI_Comm_size(comm, &size);
    MPI_Comm_rank(comm, &rank);

    int max = 30, min = -30;
    int rows = 4, cols = 4;

    // generating by columns
    int elements_count = distribute_elems_count(rows * cols);
    double *part = (double *) malloc(elements_count * sizeof(double));

    // random value generation
    for (int i = 0; i < elements_count; i++) {
        part[i] = rand() % (max + 1 - min) + min;
        printf("[proc. %d] generated: %f\n", rank, part[i]);
    }

    int rank_norm = rank / cols;
    char *filename = "filename";
    MPI_Datatype filetype;
    MPI_File file;
    MPI_Status status;

    int gsizes[1], distribs[1], dargs[1], psizes[1];
    gsizes[0] = rows * cols;            // size of the main matrix (1d array)
    distribs[0] = MPI_DISTRIBUTE_BLOCK; // block distribution
    dargs[0] = elements_count;
    psizes[0] = size;                   // all processes can work

    MPI_Type_create_darray(size, rank_norm, 1,
                           gsizes, distribs, dargs, psizes,
                           MPI_ORDER_C, MPI_DOUBLE, &filetype);
    MPI_Type_commit(&filetype);

    MPI_File_open(comm, filename,
                  MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &file);
    MPI_File_set_view(file, 0, MPI_DOUBLE, filetype, "native", MPI_INFO_NULL);
    MPI_File_write_all(file, part, size, MPI_DOUBLE, &status);

    MPI_Barrier(comm);
    MPI_File_close(&file);

    printf("proc. %d still alive at the end\n", rank);
    return 0;
}
The way I compile and run it is the following:
$ rm -f filename && mpicc -Wall test2.c && mpirun -np 8 --oversubscribe ./a.out
[proc. 0] generated: -1.000000
[proc. 0] generated: 24.000000
[proc. 2] generated: -1.000000
[proc. 2] generated: 24.000000
[proc. 4] generated: -1.000000
[proc. 4] generated: 24.000000
[proc. 1] generated: -1.000000
[proc. 3] generated: -1.000000
[proc. 3] generated: 24.000000
[proc. 5] generated: -1.000000
[proc. 5] generated: 24.000000
[proc. 1] generated: 24.000000
[proc. 6] generated: -1.000000
[proc. 6] generated: 24.000000
[proc. 7] generated: -1.000000
[proc. 7] generated: 24.000000
proc. 3 still alive at the end
proc. 7 still alive at the end
proc. 0 still alive at the end
proc. 2 still alive at the end
proc. 6 still alive at the end
proc. 4 still alive at the end
--------------------------------------------------------------------------
mpirun has exited due to process rank 6 with PID 0 on
node compaq exiting improperly. There are three reasons this could occur:
1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.
2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"
3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
orte_create_session_dirs is set to false. In this case, the run-time cannot
detect that the abort call was an abnormal termination. Hence, the only
error message you will receive is this one.
This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
You can avoid this message by specifying -quiet on the mpirun command line.
--------------------------------------------------------------------------
I expect the final matrix to be something like this:
-1 24 -1 24
-1 24 -1 24
-1 24 -1 24
-1 24 -1 24
with each process generating 2 items for the matrix, like this:
proc. 0 -> -1 24
proc. 1 -> -1 24 // end of 1st row
proc. 2 -> -1 24
proc. 3 -> -1 24 // end of 2nd row
proc. 4 -> -1 24
proc. 5 -> -1 24 // end of 3rd row
proc. 6 -> -1 24
proc. 7 -> -1 24 // end of 4th row
(rand() has poor randomization and the seed is never initialized, but that is not a problem right now; it is actually useful for debugging.)
When I read the file back, I get garbage like this:
$ od -t f8 filename
0000000 -1 24
*
0000040 2,12202817e-314 0
0000060 3,16e-322 0
0000100 0 0
*
0000200 1,6e-322 6,37e-322
0000220 1,6e-322 4e-322
0000240 0 0
*
0000400 2,437226570133198e-152 4,824071254712238e+228
0000420 6,9023403697521e-310 5e-324
0000440 0 0
*
0000600 2,645218138042807e+185 3,946618425203878e+180
0000620 8,66284807723416e+217 1,384331367053311e+219
0000640
What am I missing? I suspect something is wrong with the parameters I pass to MPI_Type_create_darray. I would also like this code to work with an arbitrary number of processes (i.e. with a different elements_count per process), but for now I would be happy just to make it work when every process generates the same number of values.
Answers
The parameters you pass to MPI_Type_create_darray are only part of the problem: the rank argument you give it, the count you pass to MPI_File_write_all, and the missing MPI_Finalize call all need fixing. Here's a corrected version of your code:
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int size, rank;

int distribute_elems_count(int elems_no) {
    int base_elems = elems_no / size;
    int spare_elems = elems_no % size;
    return rank < spare_elems ? base_elems + 1 : base_elems;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    MPI_Comm comm;
    MPI_Comm_dup(MPI_COMM_WORLD, &comm);
    MPI_Comm_size(comm, &size);
    MPI_Comm_rank(comm, &rank);

    int max = 30, min = -30;
    int rows = 4, cols = 4;

    // each process generates its contiguous block of the 1D array
    int elements_count = distribute_elems_count(rows * cols);
    double *part = (double *) malloc(elements_count * sizeof(double));

    // random value generation
    for (int i = 0; i < elements_count; i++) {
        part[i] = rand() % (max + 1 - min) + min;
        printf("[proc. %d] generated: %f\n", rank, part[i]);
    }

    char *filename = "filename";
    MPI_Datatype filetype;
    MPI_File file;
    MPI_Status status;

    // 1D block distribution of the rows*cols array over all processes
    int gsizes[1], distribs[1], dargs[1], psizes[1];
    gsizes[0] = rows * cols;            // global size of the matrix seen as a 1D array
    distribs[0] = MPI_DISTRIBUTE_BLOCK; // contiguous blocks
    dargs[0] = elements_count;          // block size per process
    psizes[0] = size;                   // all processes take part

    // pass the real rank, not rank / cols: the darray type must describe
    // a different block of the file for every process
    MPI_Type_create_darray(size, rank, 1,
                           gsizes, distribs, dargs, psizes,
                           MPI_ORDER_C, MPI_DOUBLE, &filetype);
    MPI_Type_commit(&filetype);

    MPI_File_open(comm, filename,
                  MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &file);
    MPI_File_set_view(file, 0, MPI_DOUBLE, filetype, "native", MPI_INFO_NULL);
    // write elements_count doubles, not size doubles
    MPI_File_write_all(file, part, elements_count, MPI_DOUBLE, &status);

    MPI_Barrier(comm);
    MPI_File_close(&file);

    printf("proc. %d still alive at the end\n", rank);

    MPI_Type_free(&filetype);
    free(part);
    MPI_Finalize();
    return 0;
}
Here are the key changes:
- MPI_Type_create_darray now receives the actual rank instead of rank_norm (rank / cols). With rank / cols, ranks 0-3 all described the first block of the file and ranks 4-7 the second, so most of the file was never written; combined with the wrong write count, this is where the garbage in your od dump came from.
- MPI_File_write_all now writes elements_count doubles instead of size doubles, so each process stays inside its 2-element buffer.
- MPI_Finalize is called before main returns; its absence is what made mpirun complain about the job exiting improperly. MPI_Type_free and free(part) are added as well.
- distribute_elems_count returns base_elems + 1 instead of ++base_elems; the result is identical, it just reads more clearly.
- The darray arrays themselves (gsizes, distribs, dargs, psizes) were already fine for the even case: a 1D block distribution of rows * cols elements over size processes with a block of elements_count per process.
With these changes each process writes its own contiguous block of the 1D array, the file is exactly 16 doubles (128 bytes), and od -t f8 shows the repeated -1 24 pattern you expect.
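For the second part of your question (an arbitrary number of processes, with a different elements_count per process): MPI_DISTRIBUTE_BLOCK with an explicit darg uses the same block size for every rank, so it no longer matches what distribute_elems_count hands out once the counts become unequal. In that case it is simpler to drop the darray view and have each process write at an explicit byte offset derived from its rank. The following is only a sketch of that idea, assuming it replaces the file I/O part of main above, so comm, rank, size, rows, cols, part and elements_count are already set up:
// Each rank's block starts right after all lower ranks' blocks:
// ranks < spare own base + 1 elements, the remaining ranks own base elements.
int total = rows * cols;
int base  = total / size;
int spare = total % size;
int start = rank * base + (rank < spare ? rank : spare);
MPI_Offset offset = (MPI_Offset) start * sizeof(double);

MPI_File file;
MPI_Status status;
MPI_File_open(comm, "filename", MPI_MODE_CREATE | MPI_MODE_WRONLY,
              MPI_INFO_NULL, &file);
// Collective write at an explicit offset; no file view or derived datatype needed.
MPI_File_write_at_all(file, offset, part, elements_count, MPI_DOUBLE, &status);
MPI_File_close(&file);
The same start index can also be obtained with MPI_Exscan(&elements_count, &start, 1, MPI_INT, MPI_SUM, comm) (setting start to 0 on rank 0 yourself), which keeps the offset computation in sync with whatever distribution function you end up using.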