Testing the latest and the greatest

The goal of this project is to install the latest hypre distribution, to compare its performance with that of the previous revision through the matrix "ij" interface, and to test the "struct" interface to the PCG.

Do not forget to read our Beowulf cluster Web pages!

Installing the latest Hypre

On Beowulf, use the "env" command to check whether you are running the C shell or the TC shell. If you are not, use the "chsh" command to change your shell to "/bin/tcsh". If you want the shell change on Beowulf to be permanent, change the shell on math.
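
The check described above amounts to looking at the SHELL variable that "env" prints. Here is a minimal sketch, not part of the original instructions. The helper name is_csh_family is hypothetical, the script is plain sh rather than tcsh so that it can be saved and run as a file, and the "-s" flag of chsh is assumed to be available, as it is on most Linux systems.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given shell path is csh or tcsh.
is_csh_family() {
    case "$(basename "$1")" in
        csh|tcsh) return 0 ;;
        *)        return 1 ;;
    esac
}

# Inspect the login shell recorded in the SHELL environment variable.
if is_csh_family "${SHELL:-unknown}"; then
    echo "shell OK: ${SHELL}"
else
    echo "shell is ${SHELL:-unknown}; consider: chsh -s /bin/tcsh"
fi
```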

Get the latest Hypre alpha in your beowulf account terminal window:

cd
cp ~aknyazev/hypre_03_19_04.tgz .
tar xzvf hypre_03_19_04.tgz

This creates a new directory, linear_solvers.
There might be several options to compile this hypre distribution on UCD Beowulf, but at the moment only one of them is tested, that is gcc-scali:

./configure
make test

The configure command finds and uses the mpicc script, which reads:
#!/bin/csh -f
gcc $* -O2 -fomit-frame-pointer -D_REENTRANT -I/opt/scali/include \
-L/opt/scali/lib -lmpi

Check the make output to make sure that there are no errors (warnings are OK).
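
If the make output is long, it may be easier to save it to a file and search it. The sketch below is one way to do that, under stated assumptions: the log file name make.log and the helper name count_error_lines are hypothetical, and a plain search for the word "error" can also match harmless lines (for instance a file name that contains it), so the matching lines should still be read by eye.

```shell
#!/bin/sh
# Hypothetical helper: print the number of lines in a saved make log that
# mention "error" (case-insensitive). Warning lines are not counted unless
# they also contain the word "error".
count_error_lines() {
    grep -c -i 'error' "$1" || true
}

# Usage, assuming the build output was saved first, e.g. in tcsh:
#   make test >& make.log
# and then, from an sh script that defines the function above:
#   count_error_lines make.log
```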

All drivers are now located in the test directory, so cd to test before you run. The runs are started in the same way as before for the gcc-scali compile of the previous version.

Examples (interactive job):

mpimon ij -solver 1 -- node1 2 node2 2

Examples (background job using PBS):

scasub -mpimon -np 6 -npn 2 struct -n 50 50 50

Retesting the ij interface to the PCG

Please run ij with solvers 1, 2, 12 and 43 and compare the results with those for the gcc-scali compile of the previous revision. Expected outcome: no noticeable difference.

Testing the struct interface to the PCG

Please run the struct test driver with all PCG solvers, 10, 11 and 17-19 (solver 12 is apparently broken in this release):

                        10 - CG with SMG precond
                        11 - CG with PFMG precond
                        17 - CG with 2-step Jacobi
                        18 - CG with diagonal scaling
                        19 - CG
and compare the results with those for ij with solvers 1, 2, 12 and 43. Expected outcome: struct solvers should run several times faster than the corresponding ij solvers.

Attention: the input parameters and the defaults in the ij and struct interface drivers are completely different in the present version of Hypre. Namely, in struct, the -n option specifies the local problem size PER processor, so the global problem size is Px*nx by Py*ny by Pz*nz. Also, the default -P option in struct is different from that of ij, as are the default tolerance and the maximum number of iterations.
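
The global size rule for struct can be checked with a few lines of shell arithmetic. This is only an illustrative sketch: the helper name struct_global_size is hypothetical, and it simply multiplies each per-processor dimension by the matching entry of the -P processor grid.

```shell
#!/bin/sh
# Hypothetical helper: global grid size for the struct driver, given the
# per-processor size (-n nx ny nz) and the processor grid (-P Px Py Pz).
struct_global_size() {
    nx=$1; ny=$2; nz=$3; px=$4; py=$5; pz=$6
    echo "$((px * nx)) $((py * ny)) $((pz * nz))"
}

# Example: struct -n 100 50 100 -P 1 2 1 corresponds to a 100 x 100 x 100 grid.
struct_global_size 100 50 100 1 2 1
```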

To make the default tolerance, the maximum number of iterations and the verbosity level consistent with those of the ij driver, you need to change lines 1263, 1264 and 1267 of the struct.c file:
HYPRE_PCGSetMaxIter( (HYPRE_Solver)solver, 50 );
HYPRE_PCGSetTol( (HYPRE_Solver)solver, 1.0e-06 );
HYPRE_PCGSetPrintLevel( (HYPRE_Solver)solver, 1 );
into
HYPRE_PCGSetMaxIter( (HYPRE_Solver)solver, 1000 );
HYPRE_PCGSetTol( (HYPRE_Solver)solver, 1.0e-09 );
HYPRE_PCGSetPrintLevel( (HYPRE_Solver)solver, 2 );
and recompile, using
make struct
in the test directory. Unlike the ij driver, the struct driver does not seem to have a command-line -tol option.
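
The three-line change to struct.c can also be applied with sed instead of an editor. The following is a sketch only, with these assumptions: the helper name patch_pcg_defaults is hypothetical, the three calls are assumed to sit on lines 1263, 1264 and 1267 in the order listed above, and the result is written to standard output so that it can be reviewed before struct.c is replaced.

```shell
#!/bin/sh
# Hypothetical helper: print FILE with the PCG defaults changed on the three
# given line numbers. A line is only rewritten if it contains the expected
# call with the old value, so a wrong line number leaves that line unchanged.
patch_pcg_defaults() {
    file=$1; l_iter=$2; l_tol=$3; l_print=$4
    sed \
        -e "${l_iter}s/\(HYPRE_PCGSetMaxIter(.*\), 50 );/\1, 1000 );/" \
        -e "${l_tol}s/\(HYPRE_PCGSetTol(.*\), 1\.0e-06 );/\1, 1.0e-09 );/" \
        -e "${l_print}s/\(HYPRE_PCGSetPrintLevel(.*\), 1 );/\1, 2 );/" \
        "$file"
}

# Usage (assumes the calls are on lines 1263, 1264 and 1267, in that order):
#   patch_pcg_defaults struct.c 1263 1264 1267 > struct.c.new
#   diff struct.c struct.c.new    # review, then: mv struct.c.new struct.c
```

Passing the line numbers as arguments keeps the sketch usable if they shift in another revision of struct.c.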

In order to have consistent problem size, follow these examples:

mpimon ./ij -solver 1 -n 100 100 100 -P 1 2 1 -- node1 2
mpimon ./struct -solver 11 -n 100 50 100 -P 1 2 1 -iout 0 -- node1 2

solve the problem with the 3D Laplacian of the same size 100x100x100 on one node with 2 CPUs, while

mpimon ./ij -solver 1 -n 100 200 100 -P 1 4 1 -- node1 2 node2 2
mpimon ./struct -solver 11 -n 100 50 100 -P 1 4 1 -iout 0 -- node1 2 node2 2

solve the problem with the 3D Laplacian of the same size 100x200x100 on 2 nodes with 2 CPUs each.
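
The struct -n values in these examples can be derived from the ij -n values by dividing each dimension by the matching entry of -P. The sketch below assumes, as the example pairs above imply, that the ij -n option gives the global size; the helper name struct_local_n is hypothetical, and the helper refuses sizes that do not divide evenly across the processor grid.

```shell
#!/bin/sh
# Hypothetical helper: per-processor size for struct -n, given the global
# size used with ij -n and the processor grid used with -P.
struct_local_n() {
    gx=$1; gy=$2; gz=$3; px=$4; py=$5; pz=$6
    if [ $((gx % px)) -ne 0 ] || [ $((gy % py)) -ne 0 ] || [ $((gz % pz)) -ne 0 ]; then
        echo "global size is not divisible by the processor grid" >&2
        return 1
    fi
    echo "$((gx / px)) $((gy / py)) $((gz / pz))"
}

# Example: ij -n 100 200 100 -P 1 4 1 matches struct -n 100 50 100 -P 1 4 1.
struct_local_n 100 200 100 1 4 1
```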

The -iout 0 option in struct prevents it from generating output files we do not need. Even with the -iout 0 option, the struct driver generates a file called zout.A.00000 that you need to remove manually. I could not find out how to tell the struct driver NOT to generate this file.
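
Removing the leftover file after each run can be scripted. This is a minimal sketch with a hypothetical helper name, cleanup_struct_output; it deletes only the file named zout.A.00000 in the given directory and touches nothing else.

```shell
#!/bin/sh
# Hypothetical helper: delete the leftover struct output file in a directory
# (default: the current one). Only the exact file name from the text is removed.
cleanup_struct_output() {
    rm -f "${1:-.}/zout.A.00000"
}

# Usage, after a struct run in the test directory:
#   cleanup_struct_output
```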

The default right-hand side (vector of ones) and the default initial guess (vector of zeros) seem to be the same in both ij and struct drivers.

Here are the scalability plots obtained by the students. (Check if the default struct 1.0e-06 tolerance has been used for these tests, which is different from the default ij 1.0e-09 tolerance.)