Running GARD on many alignments
kamushkina
Sep 20th, 2010 at 10:46am
 
Dear Datamonkeys,

I am new to HyPhy, so maybe my questions are easy and already addressed elsewhere.

I want to run GARD on many (~5000) different alignments.
I managed to install HYPHYMPI and get GARD.bf running; now I need to automate it. Is it possible to create an options file in which I provide the input file name, the model of evolution, the rate variation across sites (I would use General Discrete with 4 rate categories), and the output file name? If so, how do I do that?

My other questions are:
Is it possible to run GARD locally on protein alignments (that is what I would prefer to do)?
What are the codes for different evolutionary models? So far I know only 010010 (stands for HKY85).

Thanks a lot!
 
Sergei
Reply #1 - Sep 20th, 2010 at 11:07am
 
Hi there,

1). To learn how to make options files, please take a look at section 2.8 in [link not preserved]
2). The local version of GARD does not support protein alignments at the moment
3). Page 16 of [link not preserved] should help with encoding nucleotide models

Sergei

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
 
kamushkina
Reply #2 - Sep 20th, 2010 at 1:52pm
 
Hi Sergei,

Thank you very much for being so very fast and helpful!

To the 1st point.
I run GARD.bf in the terminal like this:
$mpirun -np 4 HYPHYMPI GARD.bf

then I give input file name (for instance, gr_1000_al_dna1)
code for a model (for instance, 010010)
choose rate across sites variation (2, general discrete)
choose number of bins (4)
specify output file (for instance, gr_1000_al_dna1_gard_res2)
GARD runs just fine.

Now I create a file (run_gard) with the following content:
inputRedirect = {};
inputRedirect[’’01’’]=’’./gr_1000_al_dna1’’;
inputRedirect[’’02’’]=’’010010’’;
inputRedirect[’’03’’]=’’2’’;
inputRedirect[’’05’’]=’’4’’;
inputRedirect[’’06’’]=’’./gr_1000_al_dna1_gard_res2’’;
ExecuteAFile (’’./GARD.bf’’, inputRedirect);


Now I try to execute it in the terminal like this, and here is what I get:
$mpirun -np 4 HYPHYMPI run_gard
Error:MPI Node:0
Bad symbols in expression inputRedirect[????’01’’]=’’./gr_1000_al_dna1’’
Current BL Command:inputRedirect[’’01’’]=’’./gr_1000_al_dna1’’

Am I missing something obvious here?
Is it a problem of mpi or of hyphy?




To the 2nd point.
If I understand correctly, in the 6-symbol string, identical digits mark substitution rates that share a single parameter, so those rates are constrained to take the same value when estimated from the data. Thus, if I want to specify the general time reversible model (I understand the danger of using this model), I will put 01234?
Is that correct?


Thank you very much,
Olga.
 
Sergei
Reply #3 - Sep 20th, 2010 at 10:01pm
 
Dear Olga,

1. Replace back quotes (``) with double quotes (") in your file and the error should go away.
2. GTR = 012345, so you are essentially correct (your understanding of the notation seems to be spot on).


Sergei
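
To make the notation in point 2 concrete, here is a minimal Python sketch that groups the six substitution rates by the character assigned to each. It assumes the six positions correspond to the rates AC, AG, AT, CG, CT, GT in that order; that ordering is not stated in this thread, so check it against the HyPhy documentation. The function name rate_classes is made up for this illustration.

```python
# Assumed (not stated in the thread) order of the six positions in the string.
RATE_ORDER = ("AC", "AG", "AT", "CG", "CT", "GT")


def rate_classes(model):
    """Group the six substitution rates by the character assigned to each."""
    if len(model) != len(RATE_ORDER):
        raise ValueError("model designation must have exactly 6 characters")
    classes = {}
    for symbol, rate in zip(model, RATE_ORDER):
        # Rates that share a symbol share one estimated parameter.
        classes.setdefault(symbol, []).append(rate)
    return classes


print(rate_classes("010010"))  # {'0': ['AC', 'AT', 'CG', 'GT'], '1': ['AG', 'CT']}
print(len(rate_classes("012345")))  # 6
```

Under that assumed ordering, 010010 puts the two transitions (AG and CT) in one class and the four transversions in another, which matches HKY85, while 012345 gives every rate its own class, which matches GTR.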
 
kamushkina
Reply #4 - Sep 21st, 2010 at 9:29am
 
Yes, it did work! At least it started. However, it still does not accept all the options:

This is my current control file:
inputRedirect = {};
inputRedirect["01"]="./gr_1000_al_dna1";
inputRedirect["02"]="012345";
inputRedirect["03"]="2";
inputRedirect["05"]="4";
inputRedirect["06"]="./gr_1000_al_dna1_gard_res2";
ExecuteAFile ("./GARD.bf", inputRedirect);


This is what I am getting when trying to run hyphy:
[okamneva@sheep4 HyPhy]$ mpirun -np 4 HYPHYMPI run_gard
Initialized GARD on 4 MPI nodes.
Population size is 6 models

Please enter a 6 character model designation (e.g:010010 defines HKY85):Error:MPI Node:0
Not a valid option: '2' passed to Choice List 'Rate variation options' using redirected stdin input
Current BL Command:Choice List for Rate variation options with choice list:SKIP_NONE. Store result in rvChoice
[okamneva@sheep4 HyPhy]$



Seems like I need to pass a choice from the list somehow differently than just as a string.


Thanks for any suggestions,
Olga.




 
Art Poon
Reply #5 - Sep 21st, 2010 at 9:54am
 
Shouldn't that argument list be 0-indexed?
So you should start your first entry with ["00"]
 
kamushkina
Reply #6 - Sep 21st, 2010 at 10:07am
 
That does not seem to work either.




[okamneva@sheep4 HyPhy]$ more run_gard
inputRedirect = {};
inputRedirect["00"]="./gr_1000_al_dna1";
inputRedirect["01"]="012345";
inputRedirect["02"]="2";
inputRedirect["03"]="4";
inputRedirect["04"]="./gr_1000_al_dna1_gard_res2";
ExecuteAFile ("./GARD.bf", inputRedirect);
[okamneva@sheep4 HyPhy]$ mpirun -np 4 HYPHYMPI run_gard
Initialized GARD on 4 MPI nodes.
Population size is 6 models

Please enter a 6 character model designation (e.g:010010 defines HKY85):Error:MPI Node:0
Not a valid option: '2' passed to Choice List 'Rate variation options' using redirected stdin input
Current BL Command:Choice List for Rate variation options with choice list:SKIP_NONE. Store result in rvChoice
[okamneva@sheep4 HyPhy]$
 
Art Poon
Reply #7 - Sep 21st, 2010 at 10:26am
 
Well, whenever I'm using input redirection, I send the analysis option strings as arguments.  So instead of a "2", try "Yes" or "Global" or whatever the string is for that menu?
- art.
 
Art Poon
Reply #8 - Sep 21st, 2010 at 10:27am
 
So in your example, try replacing "2" with "Beta-Gamma".
 
kamushkina
Reply #9 - Sep 21st, 2010 at 12:39pm
 
Cool, it works!
Just for those who might be interested, the control file is as follows:

inputRedirect = {};
inputRedirect["01"]="./gr_1000_al_dna1";
inputRedirect["02"]="012345";
inputRedirect["03"]="General Discrete";
inputRedirect["04"]="4";
inputRedirect["05"]="./gr_1000_al_dna1_gard_res2";
ExecuteAFile ("./GARD.bf", inputRedirect);



Thanks to everyone!
Olga.
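
For the original goal of running GARD on thousands of alignments, the working file above can be generated once per alignment. The following is a minimal Python sketch under stated assumptions: the prompt order and the "General Discrete" option string are taken from the file above, while the function names, the run_gard_ prefix, the _gard_res suffix, and the use of absolute paths are hypothetical choices made for illustration. It only writes the wrapper files; it does not launch HyPhy.

```python
from pathlib import Path


def gard_wrapper(alignment, output, model="012345",
                 rate_variation="General Discrete", rate_classes=4,
                 gard_path="./GARD.bf"):
    """Build the text of an HBL wrapper that answers GARD's prompts in order."""
    answers = [alignment, model, rate_variation, str(rate_classes), output]
    lines = ["inputRedirect = {};"]
    for number, answer in enumerate(answers, start=1):
        # Keys "01", "02", ... follow the numbering used in the thread.
        lines.append(f'inputRedirect["{number:02d}"]="{answer}";')
    lines.append(f'ExecuteAFile ("{gard_path}", inputRedirect);')
    return "\n".join(lines) + "\n"


def write_wrappers(alignment_dir, script_dir, pattern="*", gard_path="./GARD.bf"):
    """Write one wrapper per alignment file and return the wrapper paths."""
    script_dir = Path(script_dir)
    script_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for alignment in sorted(Path(alignment_dir).glob(pattern)):
        if not alignment.is_file():
            continue
        # Absolute paths are an illustrative choice to avoid ambiguity.
        source = alignment.resolve().as_posix()
        script = script_dir / f"run_gard_{alignment.name}.bf"
        script.write_text(gard_wrapper(source, source + "_gard_res",
                                       gard_path=gard_path))
        written.append(script)
    return written
```

Each generated file could then be run the same way as in this thread, for example with mpirun -np 4 HYPHYMPI followed by the wrapper's path. The default gard_path is relative, so it probably needs to be replaced with the actual location of GARD.bf when the wrappers are written to a different directory.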
 
Art Poon
Reply #10 - Sep 21st, 2010 at 3:00pm
 
Great, glad it worked!  But please don't call it a "control file" - that's so PAML.  :)
We much prefer "HBL script".
 
hs
Reply #11 - Nov 5th, 2010 at 10:43am
 
(1) Of the 1295 alignments, 28 failed GARD because of the following HYPHY error:
----------
Constrained optimization failed, since a starting point within the domain specified for the variables couldn't be found. Set it by hand, or check your constraints for compatibility. Failed constraint:c_scale:=RS_1*PS_1+1*(1-PS_1)*PS_2+1*RS_3*(1-PS_1)*(1-PS_2)*PS_3+1*RS_3*RS_4*(1-PS_1)*(1-PS_2)*(1-PS_3) must be in [0,10000]. Current value = 10000.
----------
I wonder how to solve this GARD problem.

(2) Could you give me examples of batch commands for running the single breakpoint (SBP) method? I could not find any information in the HYPHY Documentation.
 
Sergei
Reply #12 - Nov 7th, 2010 at 7:10pm
 
Hi Olga,

The error you are seeing is unusual -- it occurs when HyPhy can't initialize model parameters to sensible values BEFORE optimization.
This typically happens when something is not appropriately reset between consecutive runs. Could you possibly attach one of the 28 problematic files so that I can troubleshoot the bug?

SBP is implemented in 'SingleBreakpointRecomb.bf', which is a standard analysis under the recombination rubric. You can find the file itself inside the 'TemplateBatchFiles' directory.

Sergei
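
When only a handful of runs out of more than a thousand fail, it helps to list the failing alignments automatically so they can be examined or rerun on their own. The following minimal Python sketch assumes that the output of each run was captured in its own log file (for example by redirecting the output of each mpirun command); the .log naming and the function name failed_runs are hypothetical. The marker string is the start of the error message quoted above.

```python
from pathlib import Path

# Start of the HyPhy error message quoted in this thread.
FAILURE_MARKER = "Constrained optimization failed"


def failed_runs(log_dir, pattern="*.log"):
    """Return the names of log files that contain the failure marker."""
    failed = []
    for log in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" keeps the scan going if a log has odd bytes.
        if FAILURE_MARKER in log.read_text(errors="replace"):
            failed.append(log.name)
    return failed
```

Calling failed_runs on the directory of logs returns a sorted list of file names, which can be mapped back to the alignments that need attention.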
 
hs
Reply #13 - Nov 7th, 2010 at 7:45pm
 
I have attached one of the 28 problematic files (pdf file generated from phylip format '1609.phy').
Thanks,
hs