I have a large landscape under analysis and it is taking far too long, so I'm thinking of using the multicore facility provided through the CLI. Has anyone had experience doing this, particularly using Condor (http://www.ucs.cam.ac.uk/scientific/camgrid/technical/mpi) as the core manager? We could improve the example in the CLI docs for Graphab.
In my particular case, I'm trying to run a --delta IIC analysis on a fairly large dataset (~22k patches) on a multicore (8-core) computer, but it is still very slow, around 1% per day. However, I have access to a cluster, which may speed things up. From the CLI docs, though, I cannot figure out how to set up my Condor script to send the process to MPI.
So, my question is twofold:
1) What is the advantage of, or difference between, running a process on multiple cores versus on a cluster? I imagine that multicore would be preferable, as communication between cores should be much faster, but machines with many (>30) cores are not easily available, which would justify using a cluster setup.
2) What are the differences in Graphab between -proc and launching an MPI process?
Any help, hint, example or suggestion is welcome, so that I can try this on my large landscape.
You can use two types of parallelism in Graphab:
- the threaded approach, on one computer with several cores or processors
- the MPI approach, on a cluster of several computers
The first can be used on any computer from the GUI (Graphical User Interface) or the CLI (Command Line Interface) of Graphab.
It's the simplest way, limited only by the number of cores of the computer.
In the GUI, you can set the number of cores used by Graphab in the menu File | Preferences | Processors.
In the CLI, set the -proc option:
java -jar graphab-1.2.jar --project myproject.xml -proc 8 ...
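If you would rather not hard-code the core count, here is a minimal sketch of a wrapper that fills in -proc from the machine it runs on. It assumes a POSIX shell where `getconf _NPROCESSORS_ONLN` is available (Linux, macOS) and that graphab-1.2.jar sits in the current directory. The function name `graphab_cmd` and the `CORES` override variable are hypothetical, not part of Graphab. It only prints the command, so nothing runs until you remove the `echo`.

```shell
#!/bin/sh
# Hypothetical helper: build the threaded Graphab command for this machine.
# Assumes graphab-1.2.jar is in the current directory.
graphab_cmd() {
    project="$1"; shift
    # Use CORES if set; otherwise ask the system for the number of online
    # cores, falling back to 1 if getconf cannot report it.
    cores=${CORES:-$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)}
    # Dry run: print the command instead of executing it.
    echo java -jar graphab-1.2.jar --project "$project" -proc "$cores" "$@"
}

graphab_cmd myproject.xml
```

Any further Graphab options go after the project file and are passed through unchanged.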
To use MPI with Graphab, you need a cluster supporting Java and Open MPI.
The Graphab command is:
java -jar graphab-1.2.jar --mpi --project myproject.xml ...
The full MPI command, using 64 cores:
mpirun -np 64 java -jar graphab-1.2.jar --mpi --project myproject.xml ...
This command may differ, however, depending on your cluster configuration.
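As a rough sketch of what a job script could look like, the following checks that java and mpirun are on the PATH before building the MPI command shown above. The function names `missing_tools` and `mpi_cmd` and the `NP` variable are hypothetical. How the script is handed to Condor (or any other scheduler) depends on your cluster and is not covered here. Again it only prints the command, so remove the `echo` in `mpi_cmd` to actually launch.

```shell
#!/bin/sh
# Hypothetical pre-flight check and launcher for the MPI run.
# NP is the number of MPI processes; 64 matches the example above.
NP=${NP:-64}

# Print the name of every listed tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Build the MPI command line; extra Graphab options are passed through.
mpi_cmd() {
    project="$1"; shift
    # Dry run: print the command instead of executing it.
    echo mpirun -np "$NP" java -jar graphab-1.2.jar --mpi --project "$project" "$@"
}

missing=$(missing_tools java mpirun)
if [ -n "$missing" ]; then
    echo "not found on PATH:" $missing >&2
else
    mpi_cmd myproject.xml
fi
```

Running the check on a cluster node, not on the login machine, tells you whether the nodes themselves have Java and Open MPI available.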