Multiple threads & MPI evaluation (Read 3235 times)
Oscar Harari
Guest


Multiple threads & MPI evaluation
Feb 23rd, 2006 at 7:46am
 
Hi,
I've been deploying HyPhy on several platforms and it rocks!

My aim is to run it on parallel platforms:

- MPI: The linker reports the following error:

Linking HYPHYMPI
obj_MPI/likefunc.cpp.o(.text+0xf86f): In function `_LikelihoodFunction::Optimize()':
: undefined reference to `_hy_mpi_node_rank'
collect2: ld returned 1 exit status

Emulating the GTK version, I added the following lines to main-unix.cxx:
(line 89)        int             _hy_mpi_node_rank;
(line 697)         _hy_mpi_node_rank = rank;

It linked, but now I don’t know which algorithm to run to test it as a parallel job.

- Multiple threads:
     I was able to compile the MP and MP2 versions without problems. When I run the process (with the command-line option CPU=4, because I have two dual-core processors), I see only one processor fully loaded while the others stay idle.
     Again, I don’t know which algorithms make use of the pthreads facilities.
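
For the MPI linker error above, a minimal C++ sketch of the declaration/definition mismatch may help. It assumes the variable is only declared `extern` where it is used and needs exactly one definition elsewhere; `set_node_rank` and `is_master_node` are hypothetical helpers, and treating rank 0 as the master is an assumption, not something taken from the HyPhy sources.

```cpp
// Sketch only: the real HyPhy sources differ. Every name here except
// _hy_mpi_node_rank is hypothetical.

// What a file such as likefunc.cpp effectively relies on: a declaration
// promising that the variable is defined in some other translation unit.
extern int _hy_mpi_node_rank;

// The single definition that satisfies that promise. If no linked file
// provides it, the linker reports
// "undefined reference to `_hy_mpi_node_rank'".
int _hy_mpi_node_rank = 0;

// Hypothetical stand-in for the startup code that stores the rank.
// In an MPI build the value would come from MPI_Comm_rank.
void set_node_rank(int rank) {
    _hy_mpi_node_rank = rank;
}

// Hypothetical consumer that branches on the rank.
// Assumption: rank 0 plays the master role.
bool is_master_node() {
    return _hy_mpi_node_rank == 0;
}
```

This mirrors the two added lines quoted in the post: one definition at file scope and one assignment once the rank is known.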

Regards, Oscar Harari
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: Multiple threads & MPI evaluation
Reply #1 - Feb 23rd, 2006 at 8:37am
 
Dear Oscar,

Thanks for the heads-up about the missing variable definition. I was actually updating the GTK version to run in MPI mode and forgot to add the same global variable to main-unix.cxx.

I'll fix that before I build the next snapshot.

Re: pthreads: all likelihood function optimizations use pthreads; the effect is more noticeable when the data sets are large. Since most analyses need likelihood function optimizations, you should see multiple threads spawned and running. You can try running BatchFiles/speedtest.bf with various numbers of CPUs to see whether the speed improves. That dataset is small, so you won't see a 2x (or 4x) speedup, but you should see some. Some more obscure algorithms (e.g. ancestral state sampling) also make use of pthreads.
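
One common way to reason about why a small dataset gives less than a 2x or 4x speedup is Amdahl's law. The sketch below is a generic illustration, not a measurement of HyPhy; the function name and the example fraction are assumptions.

```cpp
#include <stdexcept>

// Amdahl's law: predicted speedup on `threads` workers when a fraction
// `parallel_fraction` (between 0 and 1) of the run time can be spread
// across them and the rest stays serial.
double amdahl_speedup(double parallel_fraction, int threads) {
    if (parallel_fraction < 0.0 || parallel_fraction > 1.0 || threads < 1) {
        throw std::invalid_argument("need 0 <= fraction <= 1 and threads >= 1");
    }
    const double serial_fraction = 1.0 - parallel_fraction;
    return 1.0 / (serial_fraction + parallel_fraction / threads);
}
```

For example, if only half of the run time were parallelizable, `amdahl_speedup(0.5, 4)` would predict about 1.6x on four threads rather than 4x.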

Cheers,
Sergei
« Last Edit: Feb 23rd, 2006 at 7:42pm by Sergei »  

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
 
Oscar Harari
Guest


Re: Multiple threads & MPI evaluation
Reply #2 - Feb 24th, 2006 at 7:30am
 
Thank you very much!
I was able to confirm the speedup when running the MP version.

Right now I am testing the MPI version. Basically, I am running MPITest.bf. It seems to exchange messages among the nodes. I would like to know whether the algorithms use all of the available nodes, or whether one has to code a batch file to produce a phylogenetic tree.

Regards, Oscar
 
Sergei
Re: Multiple threads & MPI evaluation
Reply #3 - Feb 24th, 2006 at 9:46am
 
Dear Oscar,

You are correct that MPITest.bf doesn't do very much. Many standard analyses have been MPI-enabled, including most tree reconstruction methods. Please note that HyPhy is SLOW when it comes to tree reconstruction, because we never got around to implementing tree-search-specific speed tricks (à la PhyML, for example).

You can tell whether an analysis is MPI-enabled by looking at its source .bf file and checking for references to the MPI_NODE_COUNT variable. In general, HyPhy batch files have to be written with MPI in mind (even though that is easy to do in many cases).
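
The check described above can be automated with a small scanner. This is a rough C++ sketch under stated assumptions: the helper names are hypothetical, and it does a plain whole-identifier text search, so it does not parse the batch language and will also count a mention inside a comment.

```cpp
#include <cctype>
#include <fstream>
#include <sstream>
#include <string>

// True for characters that can appear inside an identifier.
bool is_identifier_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// True if `source` contains `name` as a whole identifier, i.e. not as
// part of a longer name such as MY_MPI_NODE_COUNT_MAX.
bool references_identifier(const std::string& source, const std::string& name) {
    if (name.empty()) return false;
    std::size_t pos = source.find(name);
    while (pos != std::string::npos) {
        const bool start_ok = pos == 0 || !is_identifier_char(source[pos - 1]);
        const std::size_t end = pos + name.size();
        const bool end_ok = end == source.size() || !is_identifier_char(source[end]);
        if (start_ok && end_ok) return true;
        pos = source.find(name, pos + 1);
    }
    return false;
}

// Reads a batch file and reports whether it mentions MPI_NODE_COUNT.
// Returns false if the file cannot be opened.
bool batch_file_mentions_mpi(const std::string& path) {
    std::ifstream in(path);
    if (!in) return false;
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return references_identifier(buffer.str(), "MPI_NODE_COUNT");
}
```

A hit only suggests that the batch file was written with MPI in mind; reading the file is still the way to see how the nodes are actually used.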

Finally, there are a few command-line flags, most notably:

  • MPI_OPTIMIZER: automatically spreads the calculations for models with N rate categories over Master+N processors (no .bf modification needed).
  • MPI_PARTITIONS: automatically allocates the computation of a likelihood function on P data partitions over N<=P slave nodes.
  • AUTO_PARALLELIZE_OPTIMIZE: a run-time flag which, if set to 1 in a batch file, will cause HyPhy to attempt to automatically distribute a single likelihood function optimization (for almost all types of functions) across all available MPI nodes.


Cheers,
Sergei