KE
Huge data
May 21st, 2007 at 6:04am
 
Dear Sergei,

I tried to apply AnalyzeCodonData.bf (MG94, global) via the Analyses menu to a huge file: approx. 10^7 nt and 9 species. Everything went well until the likelihood function optimisation reached 99%, when I got an error message:

Failed to allocate 1086581776 bytes. Current BL Command: Optimize storing into, res, the following likelihood function: lf;

What happened? Is it possible to get the results somehow, or is the file simply too big? Perhaps with another method or a more powerful computer? And is it possible to run FEL on files this large?

Many thanks,
Kate
 
Sergei
Re: Huge data
Reply #1 - May 21st, 2007 at 7:52am
 
Dear Kate,

I am not sure why you got the memory error; a 10-Mbase analysis on 9 sequences should not actually take very much memory. A few things:

1). Do you have the latest HyPhy version? I doubt that this is the problem, but if you are running the latest version, at least we will be talking about the same code. Try running dNdSRatesAnalysis.bf under Codon Selection Analyses (see section 2 of the manual).

2). FEL is probably not the best choice, because its complexity is proportional to the length of the alignment; it will take a long time! REL is probably the way to go.

3). What computer/OS are you running HyPhy on?

Cheers,
Sergei
 
KE
Re: Huge data
Reply #2 - May 23rd, 2007 at 5:41am
 
Dear Sergei,

I installed HyPhy on 22.02.2007. My computer has an AMD Sempron 3100+ processor (1.8 GHz), 512 MB of RAM and Windows XP Professional SP2. I used HYPHY-AthlonXP.exe.

When I use REL, how many bins and which distribution should I choose to get interpretable results on 9 sequences, for alignments ranging from a few hundred nt to 10 Mb? Can using many bins substitute for FEL (and is it correct to think of this as approximating a continuous distribution by a discrete one)? And what are the rules for choosing these parameters in general?

Thank you very much for the manual; I'm going to work through it.

Kate
 
KE
Re: Huge data
Reply #3 - May 23rd, 2007 at 6:00am
 
I tried dNdSRatesAnalysis.bf and got the same error message, unfortunately.
 
Sergei
Re: Huge data
Reply #4 - May 23rd, 2007 at 6:20am
 
Dear Kate,

This is odd; it could be a bug in a recent version. I would like to try debugging the code on your file. Could you perhaps compress the alignment and e-mail it to me for testing?

Cheers,
Sergei
 
KE
Re: Huge data
Reply #5 - May 23rd, 2007 at 8:51am
 
I've e-mailed it. :)
 
Sergei
Re: Huge data
Reply #6 - May 23rd, 2007 at 12:07pm
 
Dear Kate,

Try editing the source of AnalyzeCodonData.bf (inside TemplateBatchFiles) to include these two lines (the first line is optional; the second is needed to avoid the memory overflow):

Code:
OPTIMIZE_SUMMATION_ORDER_PARTITION = 500;
CACHE_SUBTREES                     = 0;

Your data has about 1,000,000 unique codon patterns, and the cache allocator controlled by the second setting was asking for something like 1,000,000*3*61*sizeof(double) bytes to speed up likelihood calculations - that allocation caused the memory error.
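
As a rough sanity check (assuming the usual 8-byte double; the variable names below are just illustrative), that works out to about 1.5 GB - the same order of magnitude as the 1,086,581,776-byte request in the error message, and far more than the 512 MB of RAM on your machine:

Code:
/* back-of-the-envelope estimate of the requested cache size */
uniquePatterns = 1000000;                      /* approximate unique codon patterns */
cacheBytes     = uniquePatterns * 3 * 61 * 8;  /* 61 sense codons, 8 = sizeof(double) */
fprintf (stdout, "Approximate cache size: ", cacheBytes, " bytes\n");
/* prints roughly 1.46e9, i.e. about 1.5 GB */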

Without the caching scheme the calculations will be a bit slower, but not orders of magnitude slower.
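
For orientation, the top of the edited file would look roughly like the sketch below; the surrounding contents (elided here) depend on your HyPhy version, so treat it as a guide rather than a literal copy.

Code:
/* TemplateBatchFiles/AnalyzeCodonData.bf -- add the two settings near the
   top of the file, before the likelihood function is set up and optimised */

OPTIMIZE_SUMMATION_ORDER_PARTITION = 500;  /* optional */
CACHE_SUBTREES                     = 0;    /* needed to avoid the memory overflow */

/* ... the rest of AnalyzeCodonData.bf remains unchanged ... */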

Cheers,
Sergei
 
KE
Re: Huge data
Reply #7 - May 28th, 2007 at 1:42am
 
Thank you very much!!! It works. :)