sample sizes and software run times (Read 720 times)
Sweeney
Guest


sample sizes and software run times
Apr 5th, 2005 at 12:11pm
 
I am just now getting to know HyPhy, so pardon the basic question, but how long might it take an analysis of say 50 nodes using  "distribution 3" and an MG94xREV, dual rate model, to run on a G5?  For example, might it take hours or days?  What kinds of major things affect the rate at which the program runs?
Sergei
YaBB Administrator


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: sample sizes and software run times
Reply #1 - Apr 5th, 2005 at 12:20pm
 
Greetings,

Quote:
I am just now getting to know HyPhy, so pardon the basic question, but how long might it take an analysis of say 50 nodes using  "distribution 3" and an MG94xREV, dual rate model, to run on a G5?  For example, might it take hours or days?  What kinds of major things affect the rate at which the program runs?


I presume you are thinking of running the dNdSRateAnalysis.bf analysis? A data set of 50 sequences with 3x3 rates, using MG94xREV with the dual rate model, will probably take a day or so to run on the G5.

For this specific analysis the factors that affect speed are:
  • Whether estimated or approximate branch lengths are used. For most data sets, the use of approximate branch lengths is reasonable and results in a dramatic (several-fold) reduction in computational time
  • Size of the data set, both in terms of sequence count (run time grows slightly worse than linearly) and length in bp (slightly better than linearly)
  • Number of rate categories (e.g. 4x4 is roughly 16/9 times slower than 3x3)
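
To make the rate-category arithmetic above concrete, here is a minimal Python sketch. It assumes run time is simply proportional to the number of rate-class combinations (synonymous classes x non-synonymous classes), which is the rough rule behind the 16/9 figure. The function names and the 24-hour baseline (standing in for "a day or so") are illustrative assumptions, not part of HyPhy.

```python
from fractions import Fraction


def category_slowdown(base=(3, 3), new=(4, 4)):
    """Relative cost of a new rate grid versus a baseline grid.

    Assumption: cost is proportional to the number of rate-class
    combinations, i.e. the product of the two grid dimensions.
    """
    base_cells = base[0] * base[1]
    new_cells = new[0] * new[1]
    return Fraction(new_cells, base_cells)


def estimate_hours(baseline_hours, base=(3, 3), new=(4, 4)):
    """Scale a measured baseline run time by the category ratio.

    baseline_hours is whatever you observed for the baseline grid;
    the result is only a rough guide, not a prediction.
    """
    return float(baseline_hours * category_slowdown(base, new))


if __name__ == "__main__":
    print(category_slowdown())             # 16/9
    print(round(estimate_hours(24), 1))    # 42.7
```

Under this assumption, a 3x3 run that takes about 24 hours would take roughly 43 hours at 4x4. Sequence count and alignment length are deliberately left out, since the post only says they scale slightly worse and slightly better than linearly, without giving exponents.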


HTH,
Sergei
Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego