HyPhy message board
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
HYPHY Package >> HyPhy feedback >> Huge data
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1179752675

Message started by KE on May 21st, 2007 at 6:04am

Title: Huge data
Post by KE on May 21st, 2007 at 6:04am
Dear Sergei,

I tried to apply AnalyseCodonData.bf (MG94, global) via the Analyses menu to a huge file: approx. 10^7 nt and 9 species. Everything went well until 99% of the likelihood function optimisation, when I got an error message:

Failed to allocate 1086581776 bytes. Current BL Command: Optimize storing into, res, the following likelihood function: lf;

What happened? Is it possible to get the results somehow, or is the file simply too big? Could I at least use another method or a more powerful computer? And is it possible to run FEL on such huge files?

Many thanks,
Kate

Title: Re: Huge data
Post by Sergei on May 21st, 2007 at 7:52am
Dear Kate,

I am not sure why you got the memory error; a 10 Mbase analysis on 9 sequences should not actually take very much memory. A few things:

1). Do you have the latest HYPHY version? I doubt that this is the problem, but if you have the latest version, then we will at least be talking about the same version. Try running dNdSRatesAnalysis.bf under Codon Selection Analyses (see the manual, section 2).

2). FEL is probably not the best choice, because its complexity is proportional to the length of the alignment; it would take a long time! REL is probably the way to go.

3). What computer/OS are you running HyPhy on?

Cheers,
Sergei

Title: Re: Huge data
Post by KE on May 23rd, 2007 at 5:41am
Dear Sergei,

I installed HyPhy on 22.02.2007. My computer has an AMD Sempron 3100+ processor (1.8 GHz), 512 MB RAM, and Windows XP Professional SP2. I used HYPHY-AthlonXP.exe.

When I use REL, how many bins and which distribution should I take to get interpretable results on 9 sequences, for lengths ranging from several hundred nt to 10 Mb? Can using many bins substitute for FEL (and is it correct to speak of replacing a continuous distribution by a discrete approximation here)? And what are the rules for choosing these parameters in general?

Thank you very much for the manual, I'm going to work it over.

Kate

Title: Re: Huge data
Post by KE on May 23rd, 2007 at 6:00am
I tried dNdSRatesAnalysis.bf and got the same error message, unfortunately.

Title: Re: Huge data
Post by Sergei on May 23rd, 2007 at 6:20am
Dear Kate,

This is odd; could be a bug in a recent version. I would like to try debugging the code on your file. Could you perhaps compress the alignment and e-mail it to me for testing?

Cheers,
Sergei

Title: Re: Huge data
Post by KE on May 23rd, 2007 at 8:51am
I've e-mailed it  :)

Title: Re: Huge data
Post by Sergei on May 23rd, 2007 at 12:07pm
Dear Kate,

Try editing the source of AnalyzeCodonData.bf (inside TemplateBatchFiles) to include these two lines (the first line is optional; the second is needed to avoid the memory overflow).

[code]
OPTIMIZE_SUMMATION_ORDER_PARTITION = 500;
CACHE_SUBTREES                     = 0;
[/code]

Your data has about 1,000,000 unique codon patterns, and the cache allocator (controlled by the second line above) was asking for something like 1,000,000 * 3 * 61 * sizeof(double) bytes to speed up likelihood calculations; that allocation is what caused the memory error.

Without the caching scheme the calculations will be a bit slower, but not orders of magnitude slower.
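For reference, here is a quick back-of-the-envelope sketch of the cache size implied by those figures, assuming 8-byte doubles (the pattern count and per-pattern constants are taken from the post above; the exact bookkeeping inside HyPhy may differ):

```python
# Rough estimate of the subtree cache allocation described above.
# Assumptions: ~1,000,000 unique codon patterns, 3 cached vectors per
# pattern, 61 sense codons (universal genetic code), sizeof(double) = 8.
unique_patterns = 1_000_000
vectors_per_pattern = 3
codon_states = 61
sizeof_double = 8

cache_bytes = unique_patterns * vectors_per_pattern * codon_states * sizeof_double
print(f"{cache_bytes:,} bytes ~= {cache_bytes / 2**30:.2f} GiB")
```

That comes to roughly 1.4 GiB, the same order as the ~1.09 GB allocation that failed, and well beyond the 512 MB of RAM on the machine described earlier in the thread.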

Cheers,
Sergei

Title: Re: Huge data
Post by KE on May 28th, 2007 at 1:42am
Thank you very much!!! It works  :)
