HyPhy message board - Multiple threads & MPI evaluation

(Moderators: Sergei, Simon)

Multiple threads & MPI evaluation (Read 3235 times)

Oscar Harari

Guest

Multiple threads & MPI evaluation
Feb 23rd, 2006 at 7:46am

Hi,
I've been deploying HyPhy on several platforms and it rocks!

My aim is to run it on parallel platforms:

- MPI: The linker reports the following error:

Linking HYPHYMPI
obj_MPI/likefunc.cpp.o(.text+0xf86f): In function `_LikelihoodFunction::Optimize()':
: undefined reference to `_hy_mpi_node_rank'
collect2: ld returned 1 exit status

Emulating the GTK version, I added the following lines to main-unix.cxx:
(line 89) int _hy_mpi_node_rank;
(line 697) _hy_mpi_node_rank = rank;

It linked. Now I don’t know which algorithm to execute to test it as a parallel job.

- Multiple threads:
I was able to compile the MP and MP2 versions without problems. When I run the process (with the command-line option CPU=4, because I have two dual-core processors), I see only one processor fully loaded while the others stay idle.
Again, I don’t know which algorithms make use of the pthreads facilities.

Regards, Oscar Harari

Sergei

YaBB Administrator

Offline

Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male

Re: Multiple threads & MPI evaluation
Reply #1 - Feb 23rd, 2006 at 8:37am

Dear Oscar,

Thanks for the heads-up about the missing variable definition. I was actually updating the GTK version to run in MPI mode and forgot to add the same global variable to main-unix.cxx.

I'll fix that before I build the next snapshot.
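The linker error discussed above comes down to a variable that is declared and used in one source file but never defined in any. Here is a minimal sketch of the pattern. Only the name `_hy_mpi_node_rank` comes from this thread. The helpers `record_node_rank` and `is_master_node` are hypothetical, and treating rank 0 as the master is a common MPI convention assumed here, not something confirmed for HyPhy.

```cpp
#include <cassert>

// Declaration: this is all that a file such as likefunc.cpp needs in order to
// compile code that reads the variable. It does not create any storage.
extern int _hy_mpi_node_rank;

// Definition: exactly one source file (main-unix.cxx in the thread) must
// provide this, otherwise the linker reports an "undefined reference".
int _hy_mpi_node_rank = 0;

// Hypothetical helper: store the rank reported by the MPI runtime.
// In a real MPI build the argument would come from MPI_Comm_rank.
void record_node_rank(int rank) {
    assert(rank >= 0);  // MPI ranks are non-negative
    _hy_mpi_node_rank = rank;
}

// Hypothetical helper: assumes rank 0 plays the master role.
bool is_master_node() {
    return _hy_mpi_node_rank == 0;
}
```

Calling `record_node_rank(rank)` once, right after the rank is obtained at startup, corresponds to the `_hy_mpi_node_rank = rank;` line added in the original post.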

Re: pthreads: all likelihood function optimizations use pthreads; the effect is more noticeable when the data sets are large. Since most analyses need likelihood function optimizations, you should see multiple threads spawned and run. You can try to run BatchFiles/speedtest.bf with various numbers of CPUs and see if the speed improves. That dataset is small, so you won't see a 2x (or 4x) speedup, but you should see some. Some more obscure algorithms (e.g. ancestral state sampling) also make use of pthreads.
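To compare timed runs of speedtest.bf at different CPU counts, the usual quantities are speedup and parallel efficiency. The sketch below is illustrative only and is not part of HyPhy. The function names are hypothetical, and the Amdahl's law bound is a standard textbook estimate brought in here to show why a small dataset (with a small parallel fraction) falls short of 2x or 4x.

```cpp
#include <cassert>

// Speedup of the N-CPU run relative to the 1-CPU run (wall-clock seconds).
double speedup(double seconds_one_cpu, double seconds_n_cpus) {
    assert(seconds_one_cpu > 0.0 && seconds_n_cpus > 0.0);
    return seconds_one_cpu / seconds_n_cpus;
}

// Parallel efficiency: 1.0 would mean perfect scaling across all CPUs.
double efficiency(double seconds_one_cpu, double seconds_n_cpus, int cpus) {
    assert(cpus > 0);
    return speedup(seconds_one_cpu, seconds_n_cpus) / cpus;
}

// Amdahl's law: upper bound on speedup when only `parallel_fraction`
// of the single-CPU runtime can be spread over `cpus` threads.
double amdahl_bound(double parallel_fraction, int cpus) {
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(cpus > 0);
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cpus);
}
```

For example, a run that takes 10 s on one CPU and 5 s on four gives a speedup of 2 and an efficiency of 0.5.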

Cheers,
Sergei

« Last Edit: Feb 23rd, 2006 at 7:42pm by Sergei »

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego

Oscar Harari

Guest

Re: Multiple threads & MPI evaluation
Reply #2 - Feb 24th, 2006 at 7:30am

Thank you very much!
I was able to check the speed up executing the MP version.

Right now I am testing the MPI version. Basically, I am running MPITest.bf. It seems to exchange messages among the nodes. I would like to know whether the algorithms use all of the available nodes, or whether one has to code a batch file to produce a phylogenetic tree.

Regards, Oscar

Sergei

Re: Multiple threads & MPI evaluation
Reply #3 - Feb 24th, 2006 at 9:46am

Dear Oscar,

You are correct in that MPITest.bf doesn't do very much. Many standard analyses have been MPI enabled, including most tree reconstruction methods. Please note that HyPhy is SLOW when it comes to tree reconstruction, because we never got around to implementing tree-search specific speed tricks (à la PhyML, for example).

You can tell if an analysis is MPI enabled by looking at its source .bf file and seeing if there are references to the MPI_NODE_COUNT variable. In general, it is necessary to write HyPhy batch files with MPI in mind (even though it is easy to do in many cases).
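The check described above is a plain text search, so it is easy to automate over many batch files. Here is a rough sketch, not a HyPhy tool. Both function names are hypothetical, and a substring match is only a heuristic: it also matches comments and longer identifiers that happen to contain the name.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Returns true if any line read from `in` contains `name` as a substring.
bool mentions_identifier(std::istream& in, const std::string& name) {
    assert(!name.empty());  // an empty name would match every line
    std::string line;
    while (std::getline(in, line)) {
        if (line.find(name) != std::string::npos) {
            return true;
        }
    }
    return false;
}

// Convenience wrapper for text that is already in memory.
bool text_mentions_identifier(const std::string& text, const std::string& name) {
    std::istringstream in(text);
    return mentions_identifier(in, name);
}
```

To scan a file on disk, open it with `std::ifstream` and pass the stream to `mentions_identifier` together with `"MPI_NODE_COUNT"`.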

Also see:

MPI_OPTIMIZER: to automatically spread calculations of models with N rate categories over Master+N processors (no .bf modification needed).
MPI_PARTITIONS: to automatically allocate the computation of a likelihood function on P data partitions over N<=P slave nodes.
AUTO_PARALLELIZE_OPTIMIZE: a run-time flag which, if set to 1 in a batch file, will cause HyPhy to attempt to automatically distribute the optimization of a single likelihood function (almost all types of functions) across all available MPI nodes.
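The node counts implied by the first two options can be worked out before requesting processors from a cluster. The sketch below only restates the arithmetic in the descriptions above. The function names are hypothetical, and it makes no claim about how HyPhy actually schedules work internally.

```cpp
#include <algorithm>
#include <cassert>

// MPI_OPTIMIZER: N rate categories are spread over Master+N processors,
// so a full run needs one process per category plus the master.
int processes_for_rate_categories(int rate_categories) {
    assert(rate_categories > 0);
    return rate_categories + 1;
}

// MPI_PARTITIONS: P partitions are computed on N <= P slave nodes, so
// slave nodes beyond the number of partitions would have nothing to do.
int usable_slave_nodes(int partitions, int available_slaves) {
    assert(partitions > 0 && available_slaves >= 0);
    return std::min(partitions, available_slaves);
}
```

For a model with 4 rate categories this suggests launching 5 MPI processes. For 3 partitions on a cluster with 8 free slave nodes, only 3 of them would be used under this reading.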

Cheers,
Sergei
