# Journal of Supercomputing Submission Branch

## Overview
This branch contains the codebase used for the malleable Conjugate Gradient (CG) experiments whose results are presented in the paper described in the section "Paper Information". The experiments were run on the Nasp system. The code represents the state of the project at the time of submission and is tagged accordingly.

## Paper Information
- **Title:** Proteo: A Framework for the Generation and Evaluation of Malleable MPI Applications
- **Authors:** Iker Martín-Álvarez, José I. Aliaga, Maribel Castillo, Sergio Iserte
- **Journal:** Journal of Supercomputing
- **Submission Date:** 30/11/2023

## Branch Structure
This branch is divided into the following 4 directories:
- **BashScripts**: Contains utility scripts used by the scripts that execute the code.
- **Exec**: Contains the scripts to execute the CG and perform a post-mortem analysis.
- **Main**: Contains code related to CG only.
- **malleability**: Contains all code needed to perform resizes over the CG.

The branch also contains the test sparse matrix `bcsstk01.rsa` and a Makefile.

## Installation

### Prerequisites
Before installing, ensure you have the following prerequisites:
- MPI (MPICH) installed on your system. This code has been tested with MPICH versions 3.4.1 and 4.0.3 with the OFI netmod.
- Slurm installed on your system. This code has been tested with slurm-wlm 19.05.5.

The following prerequisites are optional and only needed to process and analyse the data:
- Python 3 (optional). Only needed for the post-mortem processing or the data analysis.
- Numpy 1.24.3 (optional). Only needed for the post-mortem processing or the data analysis.
- Pandas 1.5.3 (optional). Only needed for the post-mortem processing or the data analysis.
- Seaborn 0.12.2 (optional). Only needed for the data analysis.
- Matplotlib 3.7.1 (optional). Only needed for the data analysis.
- Scipy 1.10.1 (optional). Only needed for the data analysis.
- scikit-posthocs 0.7.0 (optional). Only needed for the data analysis.
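The prerequisites above can be checked before installing with a short script. The following is a minimal sketch, not part of this branch: the command names `mpicc` and `sbatch` are assumptions about how MPICH and Slurm are exposed on your system, and the helper names are hypothetical. It only reports what it finds and does not enforce the tested versions.

```python
import shutil
from importlib import metadata

# Optional Python packages and the versions listed in the prerequisites.
OPTIONAL_PACKAGES = {
    "numpy": "1.24.3",
    "pandas": "1.5.3",
    "seaborn": "0.12.2",
    "matplotlib": "3.7.1",
    "scipy": "1.10.1",
    "scikit-posthocs": "0.7.0",
}

# Assumption: MPICH provides the `mpicc` wrapper and Slurm provides `sbatch`.
REQUIRED_COMMANDS = ("mpicc", "sbatch")


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Map each command to its path and each package to its version (or None)."""
    report = {cmd: shutil.which(cmd) for cmd in REQUIRED_COMMANDS}
    for package in OPTIONAL_PACKAGES:
        report[package] = installed_version(package)
    return report


if __name__ == "__main__":
    report = check_environment()
    for cmd in REQUIRED_COMMANDS:
        print(f"{cmd}: {report[cmd] or 'NOT FOUND (required)'}")
    for package, tested in OPTIONAL_PACKAGES.items():
        found = report[package] or "not installed (optional)"
        print(f"{package}: {found} (tested with {tested})")
```

A missing optional package only affects the post-mortem processing and the data analysis, not the compilation or execution of the CG.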


### Steps
1. Clone the repository to your local machine:

    ```bash
    $ git clone http://lorca.act.uji.es/gitlab/martini/malleable_cg.git
    $ cd malleable_cg
    $ git checkout JournalSupercomputing23/24
    ```

2. Compile the code using the `make` command:

    ```bash
    $ make install_slurm
    ```

    This command compiles the code using the MPI (MPICH) library.

3. Test the installation:
    ```bash
    $ cd Results
    $ bash ../Exec/runTest.sh
    ```
    This test launches a Slurm job with the provided sparse matrix `bcsstk01.rsa`.
    When the job ends, the results are written to the job output file. Example of a successful run with the expected output:

    ```bash
    $ cat job_output.txt
    Test numP=2 numC=4 Meths=0 0 1 -- Is_synch=0 qty=1
    Working with general format
    Start CG
    T_spawn: X
    T_SR: X
    T_AR: X
    T_Malleability: X
    T_total: X
    End(Y) --> (Y,    Y)
    shrink cleaning on node 0 (rank 0 in comm 0): shrink cleaning 0
    ```

    The `X` values are times in seconds, while the `Y` values are CG values.
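To confirm that a test run reported all the timers shown in the expected output, the job output file can be parsed. This is a minimal sketch under stated assumptions: each timer is printed on its own line as `T_name: value`, the value is a plain floating-point number, and the output file path is passed as the first argument. The function names are hypothetical and not part of this branch.

```python
import re
import sys

# Timer labels that appear in the expected output of the test run.
EXPECTED_TIMERS = ("T_spawn", "T_SR", "T_AR", "T_Malleability", "T_total")

# Assumption: one timer per line, written as "T_name: value".
TIMER_LINE = re.compile(r"^\s*(T_\w+):\s*(\S+)\s*$")


def parse_timers(text):
    """Return a dict {timer label: seconds} for every numeric timer line."""
    timers = {}
    for line in text.splitlines():
        match = TIMER_LINE.match(line)
        if match is None:
            continue
        try:
            timers[match.group(1)] = float(match.group(2))
        except ValueError:
            continue  # Skip placeholders or other non-numeric values.
    return timers


def missing_timers(timers):
    """List the expected timer labels that were not found."""
    return [label for label in EXPECTED_TIMERS if label not in timers]


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("usage: python3 check_output.py <job output file>")
    else:
        with open(sys.argv[1], encoding="utf-8") as handle:
            found = parse_timers(handle.read())
        for label, seconds in found.items():
            print(f"{label}: {seconds} s")
        print("missing timers:", missing_timers(found) or "none")
```

An empty list of missing timers only shows that the run printed every timer. It does not validate the CG values on the `End` line.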
### Clean Up
To clean the installation and remove compiled binaries, use:

```bash
$ make clean
```

## Reproducing Experiments
To reproduce the experiments performed with the malleable CG, follow these steps:

1. Download the Queen_4147 sparse matrix in Rutherford Boeing (RB) format: https://sparse.tamu.edu/Janna/Queen_4147

2. Move the matrix to the main directory of this branch.

3. From the main directory of this branch execute:
    ```bash
    $ cd Results
    $ bash ../Exec/runOFI.sh 5 > runOfi.txt
    ```

4. (Optional) When the experiments end, you can process the data. This task requires the optional prerequisites listed in the installation section. To process the data:
    ```bash
    $ cd Exec/
    $ python3 MallTimes.py [slurm] ../Results/ dataCG_G
    $ python3 CreateResizeDataframe.py dataCG_G.pkl dataCG_M
    ```
    After these commands, you will have two files, `dataCG_G.pkl` and `dataCG_M.pkl`. These files can be opened in Pandas as dataframes to analyse the data.
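The generated files can be opened with `pandas.read_pickle`. The sketch below assumes it is run from the directory that contains both `.pkl` files. The helper names are hypothetical, and no column names are assumed, so it only reports the size and columns of each dataframe. Pickle files can execute code when loaded, so only open files you generated yourself.

```python
from pathlib import Path

import pandas as pd


def load_results(path):
    """Load one pickled dataframe written by the processing scripts."""
    return pd.read_pickle(path)


def summarize(df):
    """Return the number of rows and the column names of a dataframe."""
    return {"rows": len(df), "columns": list(df.columns)}


if __name__ == "__main__":
    # Assumption: both files are in the current working directory.
    for name in ("dataCG_G.pkl", "dataCG_M.pkl"):
        path = Path(name)
        if not path.exists():
            print(f"{name}: not found, run the processing step first")
            continue
        df = load_results(path)
        print(name, summarize(df))
        print(df.head())
```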

<!-- Finish with step 5 -->