How can I profile a HYPHY execution?
avilella
Oct 17th, 2005 at 5:31am
 
Dear all,

HyPhy is taking a very long time to run the "Replace stop codons with gaps
in codon data" analysis on a file of very long sequences:

echo -e "4\n1\n1\n/home/avb/wallace/eukarya/drosophila/concat/obacs/data/Dmsye.fasta\n4\n3\n1\n1\n/home/avb/wallace/eukarya/drosophila/concat/obacs/data/Dmsye.nogaps.fasta\n" | ./HYPHYMP

I was wondering how I can profile the run to see where it is spending
so much time.

Are there any tools that you, the HyPhy gurus, use for that?

Bests,

    Albert.
avilella
Reply #1 - Oct 17th, 2005 at 5:36am
 
http://gromit.bio.ub.es/Dmsye.fasta.bz2

This is an example file with long sequences and a good number of stop codons.
Sergei
Reply #2 - Oct 17th, 2005 at 4:31pm
 
Dear Albert,

I fixed the bug which was causing the slowdown.

It is effectively a 'stream problem'. Imagine you have a very long string A and a string B that you want to populate with a modified version of A.

The code I had in the file (and it works fine for short sequences) is like this:

[code]
B = "";

for (k = 0; k < Abs(A); k = k + 1)
{
    if (A[k] == something)
    {
        B = B + (function of A[k]);
    }
    else
    {
        B = B + (another function of A[k]);
    }
}
[/code]

The slowdown comes when Abs(A) is large. In B=B+something, new memory of length(B)+length(something) has to be allocated and the old contents of B copied into it. When length(B) is large, each append therefore costs on the order of length(B), so the overall complexity of the loop can be on the order of length(A)^2.
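To make the quadratic growth concrete, here is a minimal Python sketch that counts how many characters get copied if every append reallocates the whole string. It is only a cost model, not HyPhy code: the function name `naive_append_cost` and the `piece_len` parameter are made up for illustration, and it counts copies rather than timing real string appends (Python's own strings may be optimized differently).

```python
def naive_append_cost(n, piece_len=1):
    """Count characters copied when every append reallocates the whole string.

    Models B = B + piece: each step allocates len(B) + piece_len characters
    and copies both the old contents and the new piece into the new block.
    """
    copied = 0
    length = 0
    for _ in range(n):
        copied += length + piece_len  # old contents plus the new piece
        length += piece_len
    return copied


if __name__ == "__main__":
    # A 10x longer input should cost roughly 100x more copying.
    for n in (1_000, 10_000, 100_000):
        print(n, naive_append_cost(n))
```

With one character appended per step, the count is n(n+1)/2, which is why the run time blows up on very long sequences while short ones look fine.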

Here's a streamed version of the same algorithm:

[code]
B = "";
B * 8192; /* make B into a stream string and allocate some initial storage */

for (k = 0; k < Abs(A); k = k + 1)
{
    if (A[k] == something)
    {
        B * (function of A[k]); /* '*' in this context is an 'add to stream' operation */
    }
    else
    {
        B * (another function of A[k]);
    }
}
B * 0; /* trim unused memory; B is now a regular string */
[/code]

Now, when the time comes to allocate more memory for stream B, the new allocation is not length(function of A[k]) but rather length(B)/2 (i.e. if the string is already long, we allocate a lot more memory than is immediately needed, but then save on memory allocations). Now we only have on the order of Log(length(A)) allocations (assuming the functions of A[k] return constant-length strings), and execution time is at most on the order of Log(length(A)) * length(A). Because the reallocated sizes grow geometrically, the total amount of copying actually adds up to order length(A).
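The effect of growing the buffer by half its size can be checked with a small Python cost model. This is a sketch under assumptions, not HyPhy's actual implementation: `stream_append_cost` is a hypothetical name, the 8192 starting capacity is taken from the example above, and the growth rule (add half of the current capacity whenever the buffer is full) is my reading of the description.

```python
def stream_append_cost(n, piece_len=1, initial_capacity=8192):
    """Return (reallocations, characters copied during reallocations).

    Models a stream string whose capacity grows by half of its current size
    whenever an append does not fit (assumed growth rule).
    """
    capacity = initial_capacity
    length = 0
    reallocations = 0
    copied = 0
    for _ in range(n):
        if length + piece_len > capacity:
            # Grow until the new piece fits, then move the contents once.
            while length + piece_len > capacity:
                capacity += max(capacity // 2, 1)
            reallocations += 1
            copied += length  # existing contents move to the new block
        length += piece_len
    return reallocations, copied


if __name__ == "__main__":
    for n in (1_000, 100_000, 1_000_000):
        print(n, stream_append_cost(n))
```

In this model the number of reallocations grows only logarithmically with the input length, and the characters copied during reallocation stay within a small constant multiple of the final string length.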

The fixed version of CleanStopCodons.bf is at [url]http://www.hyphy.org/pubs/BFs/CleanStopCodons.bf[/url] and will be rolled into the next update.

Having a HyPhy batch code profiler is a good idea! I will add it soon.

Cheers,
Sergei

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
Sergei
Reply #3 - Oct 21st, 2005 at 3:39pm
 
Dear Albert,

I have added a simple code profiler to HyPhy as of today's (Oct 21st) build. Take a look at the profile_test.bf file in Examples/BatchLanguage for a trivial example.

Cheers,
Sergei