HyPhy message board
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1112728303

Message started by Sweeney on Apr 5th, 2005 at 12:11pm

Title: sample sizes and software run times
Post by Sweeney on Apr 5th, 2005 at 12:11pm
I am just now getting to know HyPhy, so pardon the basic question, but how long might an analysis of, say, 50 nodes using "distribution 3" and an MG94xREV dual rate model take to run on a G5? For example, might it take hours or days? What major factors affect how fast the program runs?

Title: Re: sample sizes and software run times
Post by Sergei on Apr 5th, 2005 at 12:20pm
Greetings,




I presume you are thinking of running the dNdSRateAnalysis.bf analysis? A data set of 50 sequences with 3x3 rate categories, using MG94xREV with the dual rate model, will probably take a day or so to run on the G5.

For this specific analysis the factors that affect speed are:

  • Whether estimated or approximate branch lengths are used. For most data sets, the use of approximate branch lengths is reasonable and results in a dramatic (several-fold) reduction in computational time.
  • Size of the data set, both in terms of sequence count (slightly worse than linear) and length in bp (slightly better than linear).
  • Number of rate categories (e.g. 4x4 is roughly 16/9 times slower than 3x3).
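
As a rough illustration of how these factors combine, here is a minimal Python sketch (not part of HyPhy) that scales run time relative to a reference analysis. Only the category ratio (4x4 vs 3x3 = 16/9) comes from the list above. The function name, the reference alignment length of 1000 bp, and the exponents 1.1 and 0.9 are hypothetical placeholders standing in for "slightly worse than linear" and "slightly better than linear".

```python
def relative_runtime(n_seqs, n_sites, syn_cats, nonsyn_cats,
                     ref_seqs=50, ref_sites=1000, ref_cats=3 * 3,
                     seq_exp=1.1, site_exp=0.9):
    """Estimate run time relative to a reference analysis (1.0 = same cost).

    ASSUMPTIONS (not from the post): ref_sites, seq_exp and site_exp are
    illustrative guesses; only the rate-category ratio is stated in the thread.
    """
    # Cost grows with the number of rate category combinations (e.g. 16/9).
    cat_factor = (syn_cats * nonsyn_cats) / ref_cats
    # Slightly worse than linear in the number of sequences (assumed exponent).
    seq_factor = (n_seqs / ref_seqs) ** seq_exp
    # Slightly better than linear in alignment length (assumed exponent).
    site_factor = (n_sites / ref_sites) ** site_exp
    return cat_factor * seq_factor * site_factor


if __name__ == "__main__":
    # Same data set, 4x4 instead of 3x3 rate categories.
    print(relative_runtime(50, 1000, 4, 4))
    # Twice as many sequences, everything else unchanged.
    print(relative_runtime(100, 1000, 3, 3))
```

Multiplying the result by the "day or so" quoted above for the 50-sequence 3x3 case gives only a ballpark figure. The several-fold gain from approximate branch lengths is left out because the post gives no specific number for it.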


HTH,
Sergei
