positive selection during speciation (Read 9012 times)
LMC
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 10
positive selection during speciation
Oct 19th, 2011 at 10:17am
 
Sergei,

I'm new to HyPhy, so I apologize if this is a naive question.
I have sequences for many genes from two populations of very closely related species.  I would like to identify positive selection that may have occurred during the differentiation of these two species.  For example, suppose this is my phylogenetic tree, where A and B are the different species and #1, #2, and #3 are labels for each branch (I'm using the PAML labeling system here):

((A1 #2, A2 #2), A3 #2) #1, ((B1 #3, B2 #3), B3 #3) #1

Then, #1 would have some proportion of sites under positive selection and #2 and #3 may not be equal to each other, but neither would have sites under positive selection.

Is it possible for me to perform this test using the existing batch files?  Perhaps with BranchSiteREL?

Many thanks for any help!
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Reply #1 - Oct 19th, 2011 at 3:01pm
 
Hi LMC,

Yeah, run your data through BranchSiteREL (on Datamonkey); it is a bit more flexible than what you were after (i.e. all branches will have their own selective regimes), but what we found is that forcing some branches to share distributions of omega can lead to pretty poor statistical properties. When you run BranchSiteREL, you can also look at uncorrected p-values (or p-values corrected for TWO tests) for the two branches labeled #1, because you hypothesized a priori that only a subset of branches is of interest to you. Hope this helps.

Sergei



Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
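
The "corrected for TWO tests" remark in the reply above can be illustrated with a minimal sketch. It assumes the simplest correction, Bonferroni, which the reply does not name; the function name and the example p-values are hypothetical and are not BranchSiteREL output.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping the result at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical uncorrected p-values for the two branches labeled #1.
corrected = bonferroni([0.01, 0.04])
```

Because only the two a priori branches are tested, the multiplier is 2 rather than the total number of branches in the tree.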
 
LMC
Reply #2 - Oct 19th, 2011 at 3:08pm
 
Hi Sergei,
Thanks.  That makes sense.  I'll try analyzing a few genes with Datamonkey first.  Due to the large number of genes I am looking at, I was planning to set up a wrapper for command-line HyPhy.  I am assuming that BranchSiteREL will work the same on the command line as on the Datamonkey server?
 
Sergei
Reply #3 - Oct 19th, 2011 at 4:12pm
 
Hi LMC,

Indeed, BranchSiteREL is the file you want to work with using HyPhy.

Sergei
 
LMC
Reply #4 - Oct 27th, 2011 at 1:02pm
 
Hi Sergei,

I've been playing around with a few alignments and I have a couple of questions.  The random effects branch-site model is slow on my server, which is ok, but I guess I'm just an impatient sort of girl. And I have some 40,000 alignments to cycle through, so it could end up taking quite a long time.  I could seek out a larger cluster to do the analysis on and/or I could speed this up ~30 fold if I were only analyzing my branches of interest. How do the approximate likelihoods at a site for subtree-specific selection pressure differ from the BranchSiteREL method?  How about episodic directional selection on a set of labeled branches (MEDS)?  Would either of these be an appropriate and computationally faster method for looking at positive (directional) selection along a lineage?

One last question.. I have been using the default of 3 classes of omega in the branch site REL test; however, I am guessing that there is a more appropriate way to choose the number of classes.  Do you have any suggestions?

Thank you again for your time!
 
LMC
Reply #5 - Oct 28th, 2011 at 2:44pm
 
Hi Sergei,
I am still exploring possibly faster options.  I am currently using YangNielsenBranchSite2005.bf, which, if I understand it correctly, does force BG (and FG) lineages to evolve at a common rate, resulting in lower power, but also possibly false positives?

At any rate, I'm using command-line HYPHYMP with a batch file to process my many alignments through model A and the null models, but I am unsure how to print the likelihood (and possibly BEB data) to an output file instead of to stdout.  I have a feeling I'll need to modify YangNielsenBranchSite2005.bf, but I'm not sure how to go about doing so.

Thanks once more...
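
One way to get per-gene output files without touching the batch file is to capture stdout from the outside. The following is a minimal sketch of that idea in Python, not a HyPhy feature: `run_and_capture` is a hypothetical helper, and the command passed to it is whatever command line already works for a single alignment (no HyPhy flags are assumed here).

```python
import subprocess
from pathlib import Path


def run_and_capture(argv, out_path):
    """Run a command, write everything it prints to stdout into out_path,
    and return the command's exit code."""
    out_path = Path(out_path)
    # Create the output directory if it does not exist yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w") as handle:
        completed = subprocess.run(argv, stdout=handle, check=False)
    return completed.returncode
```

Calling this once per alignment, with a different `out_path` each time, leaves one text file per gene that can be parsed for the likelihood afterwards.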
 
Sergei
Reply #6 - Oct 31st, 2011 at 7:19am
 
Hi LMC,

1). We have a version of BranchSiteREL which only applies the test to a subset of user-selected branches; I'll ask my postdoc who developed it to post the file and a brief description of its use to this thread. We applied it (successfully) to an analysis of 10000+ genes and 30+ species in a reasonable amount of time on the cluster.

2). 3 rate classes for BranchSiteREL are hard-coded at the moment. In most cases (at least based on our simulations) there seems to be no power to find more than 3 rate classes unless your alignment is very long (e.g., 5000 codons).

3). I would not use the YangNielsen etc; episodic selection is difficult to detect in general, so maximizing one's power and accuracy is worthwhile, especially in genome-wide scans, where one generally cannot postulate anything that will hold for all genes for all background lineages.

4). The other methods (MEME etc) are geared towards detecting sites under episodic selection; they will only succeed by pooling multiple branches to gain power. If your object is to test for selection along a single lineage, this will have to be done with branch-centric methods that pool information across sites to gain power.

Sergei
 
LMC
Reply #7 - Oct 31st, 2011 at 12:31pm
 
Thank you for the advice and explanation, Sergei!  The BranchSiteREL you mentioned would be extremely useful if you could provide it.
 
Joel
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 4
Reply #8 - Oct 31st, 2011 at 2:49pm
 
Hi LMC,

I have attached a version of BranchSiteREL which can be used for testing a predefined subset of all branches in the phylogeny for positive selection.

The bf requires an input alignment and phylogeny. For ease of use, it is best to name the branches in the treefile with unique identifiers.

If you are using an inputRedirect file, insert a blank input line after the branch identifiers.

Cheers,
Joel
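
The two practical points in the post above (unique branch names, and a blank line after the branch identifiers) can be checked before launching a long run. This is a minimal sketch under stated assumptions: it handles plain Newick strings with unquoted names, the regular expression is a simplification rather than a full Newick parser, and all function names are hypothetical.

```python
import re
from collections import Counter

# A name is a run of characters that follows "(", "," or ")" and stops at
# Newick punctuation. Branch lengths follow ":" and are therefore skipped.
# Quoted names are not supported by this simplified pattern.
_NAME = re.compile(r"(?<=[(,)])\s*([^\s(),:;]+)")


def node_names(newick):
    """Return leaf and internal node names in the order they appear."""
    return _NAME.findall(newick)


def check_foreground(newick, foreground):
    """Raise ValueError unless every foreground name occurs exactly once."""
    counts = Counter(node_names(newick))
    bad = [name for name in foreground if counts[name] != 1]
    if bad:
        raise ValueError(f"missing or duplicated branch names: {bad}")


def branch_answers(foreground):
    """Branch identifiers followed by the blank entry that ends the selection."""
    return list(foreground) + [""]
```

For a tree such as `((A1:0.1,A2:0.2)TESTBRANCH:0.05,(B1:0.1,B2:0.1)NodeB:0.05);`, `check_foreground(tree, ["TESTBRANCH"])` passes silently, and `branch_answers(["TESTBRANCH"])` gives the two entries that follow the tree path in a redirect list.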
[Attachment, 14 KB: available to registered forum members only]
 
LMC
Reply #9 - Nov 12th, 2011 at 11:24am
 
Hi Joel,

Thanks for providing the amended version of BranchSiteREL.  I am struggling to get it to work in a wrapper.  With my wrapper, it processes the first alignment fine, but when it gets to the second, I get errors.  All the files seem to run fine if I am not using the wrapper.

Here is the error output:

Code:
Not a valid (or duplicate) option: 'tree/01g00300.CDS.tree' passed to ChoiceList (with multiple selections) 'Choose the foreground branch' using redirected stdin input

Function call stack
1 : Choice List for Choose the foreground branch with choice list:NO_SKIP. Store result in stOption
        Standard input redirect:
                04 : TESTBRANCH
                05 :
                06 : /output/01g00300.CDS.out
                -------
2 : ExecuteAFile from file fileToExe using basepath /. reading input from inputRedirect
{"01":"Universal",
"04":"TESTBRANCH",
"05":"",
"02":"/aln/01g00300.CDS.nex",
"03":"/tree/01g00300.CDS.tree",
"06":"/output/01g00300.CDS.out"}
 



And here is the wrapper:

Code:
fileToExe = HYPHY_BASE_DIRECTORY + "TemplateBatchFiles" + DIRECTORY_SEPARATOR + "BranchSiteRELMarked.bf";

/* a  list of file paths */
SetDialogPrompt ( "Provide a list of files to process:" );
fscanf ( PROMPT_FOR_FILE, "Lines", _inDirectoryPaths );

fprintf (stdout, "[READ ", Columns (_inDirectoryPaths), " file path lines]\n");

inputRedirect = {};
inputRedirect["01"]="Universal";
inputRedirect["04"]="TESTBRANCH";
inputRedirect["05"]="";

for ( _fileLine = 0; _fileLine < Columns ( _inDirectoryPaths ); _fileLine = _fileLine + 1 ) {
	inputRedirect["02"]="/aln/"+_inDirectoryPaths[ _fileLine ]+".nex";
	inputRedirect["03"]="/tree/"+_inDirectoryPaths[ _fileLine ]+".tree";
	inputRedirect["06"]="/out/"+_inDirectoryPaths[ _fileLine ]+".bsm";
	ExecuteAFile ( fileToExe, inputRedirect );

}
 



Do you have any thoughts about what I'm doing wrong here?

Thank you.
 
Art Poon
Global Moderator
*****
Offline


Feed your monkey!

Posts: 0
Reply #10 - Dec 10th, 2011 at 7:20pm
 
It looks to me like your standard input redirection contents are not lined up with your batch file options.  Specifically, you're feeding a path to a file to a ChoiceList that anticipates one or more tree node names. 

Cheers,
- Art.
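
The misalignment described above can be reproduced with a toy model. This sketch assumes that redirected answers are consumed one per prompt in key order, as the numbered listing in the error output suggests; the prompt labels are hypothetical, and the code does not reproduce HyPhy's internals.

```python
def pair_prompts(prompts, redirect):
    """Pair each prompt with the next redirected answer, in sorted key order."""
    answers = [redirect[key] for key in sorted(redirect)]
    return dict(zip(prompts, answers))


# Redirect contents as shown in the error output.
redirect = {
    "01": "Universal",
    "02": "/aln/01g00300.CDS.nex",
    "03": "/tree/01g00300.CDS.tree",
    "04": "TESTBRANCH",
    "05": "",
    "06": "/output/01g00300.CDS.out",
}

# Hypothetical prompt labels for a run where every prompt is issued.
first_pass = ["genetic code", "alignment file", "tree file",
              "foreground branch", "end of branch list", "output file"]
# Same run with the tree-file prompt skipped: every later answer shifts by one.
second_pass = [p for p in first_pass if p != "tree file"]

aligned = pair_prompts(first_pass, redirect)
shifted = pair_prompts(second_pass, redirect)
```

In `aligned`, the foreground-branch prompt receives `TESTBRANCH`; in `shifted`, it receives the tree path, which matches the rejected option named in the error message.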
 
LMC
Reply #11 - Jan 27th, 2012 at 3:39pm
 
I am finally back to playing around with this analysis. 
Yes, that is what the error output says.  I do not understand why it runs correctly for the first file (redirect and batch file options match) and then breaks with the second file.  It seems as if the batch file options are not the same.  I'm wondering if at the end of the loop I need to "clear" my input redirect somehow?
 
Sergei
Reply #12 - Jan 30th, 2012 at 5:16pm
 
Hi LMC,

I think this is what happens:

  • When the first alignment file is read, there is no tree present in the file, so between options ["02"] and ["03"], HyPhy does not prompt the user to answer "Y/N" about using the tree present in the alignment file.
  • After the first analysis is finished, the tree from it remains in memory (and occupies the variables named DATAFILE_TREE and IS_TREE_PRESENT_IN_DATA), hence on the second pass, between the two options, HyPhy is expecting a "Y/N" answer to whether or not that tree should be used.


Try adding

Code:
IS_TREE_PRESENT_IN_DATA = 0;
DATAFILE_TREE = 0;
 



at the end of the loop and see if the problem goes away.
Sergei
 
LMC
Reply #13 - Jan 31st, 2012 at 9:49am
 
Hi Sergei,
I'm still stuck on this.  Your suggestion seemed to make sense, but it is not working for me.  I still get the same error when I add the lines at the end of the loop to set DATAFILE_TREE and IS_TREE_PRESENT_IN_DATA to 0.  The third inputRedirect is still looking for the foreground branch instead of a tree.

Out of curiosity I changed the loop to try different inputRedirects after the first file.  The tree input is missing, but no Y/N response.  This is the same whether or not I set DATAFILE_TREE and IS_TREE_PRESENT_IN_DATA to 0.

Thanks for any further thoughts you may have about my problem!
 
Sergei
Reply #14 - Feb 1st, 2012 at 11:59am
 
Hi LMC,

The problem was actually with BranchSiteRELMarked.bf; try using the version I attach here.

Sergei
[Attachment, 14 KB: available to registered forum members only]