BranchClassDNDS documentation and/or examples?
Jamie
Jan 24th, 2011 at 2:39am
 
Where can I learn more about the theory and practical implementation of the BranchClassDNDS.bf analysis?

I've just worked my way through the PhyloHandbook examples (generally an EXCELLENT resource -- thanks!) but I can't find any discussion of this particular standard analysis there.  Nor can I find much about it in the discussion board or other docs.  Makes me feel like either I'm not searching correctly or am seriously just missing something obvious.

In any case, I'd love a bit more description of what exactly this bf tests and how to prepare the requisite files.

Thanks very much,

Jamie
 
Sergei
Reply #1 - Jan 28th, 2011 at 4:39pm
 
Hi Jamie,

Sorry it took me a while to get around to responding. First, a disclaimer and an apology -- HyPhy is always a work in progress and it is quite common to find standard analyses without documentation. BranchClassDNDS fits a model that permits different sets of tree branches to have different selective regimes. The original impetus for the analysis was to permit the analysis of lineage-specific selection in the context of HIV transmission, where you have two sets of clonal sequences (one from the source and one from the recipient) separated by the transmission branch. The tree is then split into the patient clades and the transmission branch, and a separate dN/dS is fit to each branch class, i.e. all branches within each patient have the same dN/dS and the transmission branch has a separate dN/dS, as in the attached picture (part of the archive).

The inputs to the analysis are the alignment file and a HyPhy batch file (see attached) specifying the partitions. The analysis also includes several options for site-to-site rate variation.


HTH,
Sergei
[Attachment (187 KB): available to registered forum members only]

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
 
Jamie
Reply #2 - Jan 29th, 2011 at 7:34am
 
Sergei,

Thanks for the examples and explanation.  Very much appreciated.  I actually had a very helpful conversation last week over here in Cambridge with Simon about this.  He pointed out that HyPhy will 'automatically' generate such 'branch-specification' tree files as output from SlatkinMaddison.bf. This tree file can then be used as input to BranchClassDNDS.bf.  So my sense was that even if one isn't interested in the SlatkinMaddison analysis, it is a useful tool for formatting the tree file to use with BranchClassDNDS.  I mention it here for the benefit of anyone else who is interested.

Cheers,

Jamie
 
Sergei
Reply #3 - Jan 29th, 2011 at 7:49am
 
Hi Jamie,

Good point about SlatkinMaddison.bf
If you are in Cambridge, then Simon is an excellent local resource on all things HyPhy.


Because a part of the SM test is a parsimony reconstruction of 'migration' events (the meaning of migration is very much context specific), you can automatically identify the lineages where such events took place, and the remainder of the lineages are placed in the corresponding 'population'.  However, SlatkinMaddison.bf requires that your sequences be named in a regular way, i.e. all of the sequences from all but one population must be identifiable by a unique regular expression. One of my postdocs has put together a script that converts color annotations from TreeFig into HyPhy partition files to make the process a bit more interactive. If you are interested in the latter, I'll see if I can track it down.
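
The naming requirement can be illustrated with a minimal Python sketch. It is not part of SlatkinMaddison.bf: the sequence names, the `^SRC_` pattern, and the `assign_populations` helper are all hypothetical. It only shows the idea that every population except one is matched by its own regular expression, and whatever is left over falls into the remaining population.

```python
import re


def assign_populations(names, patterns, default):
    """Map each sequence name to a population label.

    `patterns` maps a population label to a regular expression (hypothetical
    examples below); names that match no pattern go to `default`.
    """
    compiled = {pop: re.compile(rx) for pop, rx in patterns.items()}
    assignment = {}
    for name in names:
        hits = [pop for pop, rx in compiled.items() if rx.search(name)]
        if len(hits) > 1:
            # A name claimed by two populations means the naming is not regular.
            raise ValueError(f"{name!r} matches more than one population: {hits}")
        assignment[name] = hits[0] if hits else default
    return assignment


# Hypothetical sequence names for a source/recipient pair.
names = ["SRC_clone01", "SRC_clone02", "REC_clone01", "REC_clone02"]
populations = assign_populations(names, {"source": r"^SRC_"}, default="recipient")
for name, pop in populations.items():
    print(name, pop)
```

With two populations a single pattern is enough; with more populations, one pattern per population except the last would be needed.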

Sergei
 
Djm59
Reply #4 - Apr 4th, 2011 at 3:53am
 
Hi Sergei,

This follows on from previous discussion about the BranchClassDNDS batch file.

I have used the information below to get this analysis running (nice tip from Jamie about the SlatkinMaddison.bf output, although a user-friendly interface such as the one your postdoc has put together would be great for compartmentalization!). My analysis simply compartmentalizes 2 sister clades from the rest of the tree.


The output is obviously point dN/dS estimates for the specified clades and their separating branches. My question: is there an easy way to compare these estimates statistically, e.g. to test whether values differ significantly between clade 1 and clade 2, or between the branch leading to clade 1 and the branch leading to clade 2? At the minute I have no feel for the variation in these estimates.  Any thoughts, or ideas on a more appropriate approach?

Cheers

Dan
P.S. Sergei, I am also interested in the new BranchSiteREL analysis in HyPhy 2.0. Do you have some documentation for this? :D
 
Sergei
Reply #5 - Apr 4th, 2011 at 10:04pm
 
Hi Dan,

I'd better finally start writing decent documentation :)
Take a look at [link available to registered forum members only]

Let me know if this is helpful.

The BranchSiteREL manuscript is currently in revision at MBE; I will post complete documentation shortly. In the meantime, look at [link available to registered forum members only]

Sergei
 
Djm59
Reply #6 - Apr 5th, 2011 at 2:30am
 
Morning (here) Sergei,

I am familiar with the SelectionLRT analysis and found it very useful in another study, which I published in MBE last year. I used HyPhy a lot in the paper (thanks!), if you are interested: Mol Biol Evol. 2010 Aug;27(8):1886-1902.

However, for my current work, this analysis is not as helpful, since I am looking at duplicated genes with the topology: ((Paralog1_Species1,Paralog1_Species4),(Paralog1_Species2,Paralog1_Species3)),(Paralog2_Species1,(Paralog2_Species4,(Paralog2_Species2,Paralog2_Species3))),Unduplicated_Outgroups)

I am interested in comparing (preferably in a single analysis, since I am considering setting up a high-throughput analysis with 100s of genes) gene-wide constraints for 5 compartments, i.e. Paralog 1 orthologs, the branch leading to Paralog 1, Paralog 2 orthologs, the branch leading to Paralog 2 and Outgroup orthologs. The same LRT approach as in SelectionLRT but with these 5 compartments (or a subset based on specific hypotheses) would be perfect.

With SelectionLRT I could only infer selection for one paralog clade vs. its branch vs. all other separating branches (including the other paralog clade plus more basal unduplicated outgroups).

Thoughts?  ;D

All the best

Dan

ps, the HyPhy wikis are an excellent resource- you should have many PAML converts soon!
 
Sergei
Reply #7 - Apr 13th, 2011 at 2:12pm
 
Hi Dan,

Thanks for the reference -- a very nice paper! In terms of handling your analysis:

  • Does the same tree apply to all genes?
  • There are five compartments, but many (52) different ways to bin 5 rates (i.e. all the same, all different, paralog 1 separate all others the same etc). Do you want to test all such models?
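
The figure of 52 is the number of ways to partition a set of five items into non-empty groups (the fifth Bell number). A minimal Python sketch that enumerates them follows; the compartment labels are hypothetical shorthand for the five compartments described above, and each group within a partition stands for compartments constrained to share one dN/dS.

```python
def set_partitions(items):
    """Yield every way to split `items` into non-empty, unordered groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing group in turn...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ...or give it a group of its own.
        yield [[first]] + partition


# Hypothetical labels for the five compartments.
COMPARTMENTS = [
    "paralog1_clade",
    "paralog1_stem",
    "paralog2_clade",
    "paralog2_stem",
    "outgroup",
]

models = list(set_partitions(COMPARTMENTS))
print(len(models))  # 52
```

The one-group partition is the "all the same" model and the five-group partition is the "all different" model; the other 50 lie in between.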


There is no predefined analysis to perform this type of test, but SelectionLRT.bf can be modified, e.g. by prompting for two branches basal to Paralog 1 and Paralog 2 species, without too much hassle.

Sergei

 
Djm59
Reply #8 - Apr 14th, 2011 at 2:10am
 
Hi Sergei,

Thanks for the response!

The tree topology is the same across genes, only the number of orthologs in each clade would change.

I see the issue with binning rates into the 5 compartments. I would not want all 52 possibilities compared!

I am not quite sure I get what you mean about adapting SelectionLRT. Would I need to keep 3 compartments, presumably? In your example, would I be comparing the Paralog 1 clade vs. the branch leading to the Paralog 1 clade vs. the branch leading to the Paralog 2 clade, or would this just be 2 x the normal 3 compartments in one analysis (i.e. paralog 1 vs. its branch vs. other branches AND paralog 2 vs. its branch vs. other branches)? Would you clarify please?

More generally, I am not convinced that the compartmentalization approach is necessarily the best for the questions I wish to ask. Namely, something like:

1.      Are dN/dS significantly different between paralog 1 orthologs, paralog 2 orthologs and unduplicated orthologs (ignoring branches leading to each paralog clade)?

2.      Are dN/dS significantly different for the 2 branches leading to paralog 1 and 2?

3.      For paralog 1: is dN/dS for its branch (i.e. before speciation events) significantly different to paralog 1 orthologs? Same question for paralog 2.

I could use a local model to estimate dN and dS for each branch in the tree and could bootstrap the procedure to get branch SDs/CIs, but this is not hypothesis testing. I know there is a way in HyPhy to do what I need, I just don’t have the brain for it! Any thoughts on the best approach?

Thanks again!

Dan
 
Sergei
Reply #9 - Apr 14th, 2011 at 10:11am
 
Hi Dan,

First, I must say that I always get confused by the paralog/ortholog terminology (not being a properly educated biologist) :)

My general advice for answering your questions would be the following.

2) This is the easiest. The alternative model is "local dN/dS"; the null model is dN/dS (MRCA Paralogs 1) = dN/dS (MRCA Paralogs 2). The test is a one d.f. LRT.

For (1) and (3) the formulation is a bit more tricky. IF we are willing to assume that dN/dS is homogeneous within a given paralog clade (i.e. all branches in the paralog 1 ortholog subtree have the same dN/dS) then we could do the following.

1) Alternative model: deep branches have their own dN/dS; the Paralog 1 clade, Paralog 2 clade and Unduplicated clade each have a clade-wide dN/dS, and these may differ between clades (3 parameters, say omega1, omega2 and omegaUD). Several nulls could be tested (by a 1 or 2 d.f. LRT):

  (a) omega1 = omega2 = omegaUD
  (b) omega1 = omegaUD
  (c) omega1 = omega2
  (d) omega2 = omegaUD

3) This is analogous.
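
The arithmetic of these likelihood ratio tests can be sketched in Python as follows. This is a minimal illustration using only the standard library, not HyPhy output: the log-likelihood values are invented placeholders, and `chi2_sf` and `lrt` are hypothetical helper names. The degrees of freedom equal the number of constraints the null imposes: 2 for null (a) and 1 for each of (b), (c) and (d).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df 1 or 2 only)."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only supports 1 or 2 degrees of freedom")


def lrt(lnl_null, lnl_alt, df):
    """Return (test statistic, p-value) for nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)


# Placeholder log-likelihoods, invented purely for illustration.
LNL_ALT = -2500.0
NULLS = {
    "(a) omega1 = omega2 = omegaUD": (-2504.2, 2),
    "(b) omega1 = omegaUD": (-2501.1, 1),
    "(c) omega1 = omega2": (-2503.0, 1),
    "(d) omega2 = omegaUD": (-2500.4, 1),
}

for label, (lnl_null, df) in NULLS.items():
    stat, p = lrt(lnl_null, LNL_ALT, df)
    print(f"{label}: LRT = {stat:.2f}, d.f. = {df}, p = {p:.4f}")
```

Testing several nulls against the same alternative is a multiple-comparison situation, so some correction of the p-values may be appropriate.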

The more challenging test would be to see if the distributions of dN/dS within clades are different. Roughly speaking, what happens if in the paralog 1 clade there are four branches, all with dN/dS of 0.1, but in the paralog 2 clade 3 branches have dN/dS of 0 and 1 branch has dN/dS of 0.4? On average (which is what the above test would examine), there is probably not much difference between clades, but biologically there appears to be something quite different. There is not an easy way (that I know of) to examine this general distributional difference. If paralog clades have the same number of branches you could simply test the hypothesis that at least one pair of paralogs has different dN/dS. This can be done by using the local model as the alternative, and constraining a pair of branches (e.g. Paralog1_speciesX, Paralog2_speciesX) to have the same dN/dS.
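
The numbers in this example can be checked with a short Python sketch. It treats the clade-wide value as a plain unweighted average of the per-branch values, which is a simplifying assumption made only for illustration; a fitted clade-wide dN/dS would also depend on branch lengths and the data.

```python
from statistics import mean, pstdev

# Per-branch dN/dS values taken from the example above.
paralog1 = [0.1, 0.1, 0.1, 0.1]
paralog2 = [0.0, 0.0, 0.0, 0.4]

mean1, mean2 = mean(paralog1), mean(paralog2)
spread1, spread2 = pstdev(paralog1), pstdev(paralog2)

# The averages agree, but the spread across branches does not.
print("means:  ", mean1, mean2)
print("spreads:", spread1, spread2)
```

A test that compares only clade-wide averages would see no difference here, even though one clade is uniform and the other has a single elevated branch.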

Which is more appropriate in your case?

Sergei



 
Djm59
Reply #10 - Apr 15th, 2011 at 3:32am
 
Hi Sergei,

Thanks for the ongoing advice, very helpful.

I would not be particularly interested in knowing if the distribution of dN/dS was different within clades (I won’t bore you with details why), so your other hypothesis suggestions would be quite appropriate to implement. However, I am having issues in executing the analysis, since I have not used the GUI extensively and am self-taught. This is my attempt so far based on literature I can find:

-I load the codon file into the GUI

-I specify all sites as a single partition, specify that it is codon data, specify a tree and substitution model, set parameters to LOCAL and equilibrium freqs to PARTITION.

-I then build the Likelihood function and optimize it in the table of parameter values interface.

-This gives me a list of dN and dS for every branch in the tree. I know how to constrain parameters for different branch combinations and also how to implement the LRT, but do not know how to get a SINGLE dN/dS estimate for each branch. Thus, I can’t currently constrain dN/dS in clade/branch combinations to go on and do the hypothesis testing you suggest.

-I am sure this is a VERY simple one for you! Once I know how to do this I will be able to go on and do some LRTs for various hypotheses based on your above suggestions…

Thanks again for your support!

Dan
:)
 
Sergei
Reply #11 - Apr 18th, 2011 at 1:34pm
 
Hi Dan,

The GUI approach will work for a few analyses (since you have to redefine the constraints for each data set); for batch processing I can write a simple script which will take an annotated list to determine which branches fall into which dN/dS class:

A,1
B,1
AB,2
C,3
D,3

for the tree ((A,B)AB,C,D)

Alternatively, something like ((A:1,B:1):2,C:3,D:3) could be used. Let me know if this will be useful; I can script it up fairly quickly and would only ask for a grant # acknowledgement if you ever publish based on it (the joys of NIH reporting).
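
Reading the annotated list is straightforward. The Python sketch below is not the script offered above; `parse_branch_classes` is a hypothetical helper that groups branch names by their class label, which is the information a batch script would need before constraining each class to share one dN/dS.

```python
from collections import defaultdict

# The annotated list from the post, for the tree ((A,B)AB,C,D).
ANNOTATION = """\
A,1
B,1
AB,2
C,3
D,3
"""


def parse_branch_classes(text):
    """Group branch names by class label from 'branch,class' lines."""
    classes = defaultdict(list)
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        name, sep, label = line.partition(",")
        if not sep or not name.strip() or not label.strip():
            raise ValueError(f"line {line_no}: expected 'branch,class', got {line!r}")
        classes[label.strip()].append(name.strip())
    return dict(classes)


branch_classes = parse_branch_classes(ANNOTATION)
print(branch_classes)  # {'1': ['A', 'B'], '2': ['AB'], '3': ['C', 'D']}
```

Class labels are kept as strings so that non-numeric labels would work as well.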

To constrain ALL branches in the tree (or a clade) to have the same dN/dS in the GUI, use the ReplicateConstraint command as described around page 34 of [link available to registered forum members only]

Serge

 
Djm59
Reply #12 - May 16th, 2011 at 2:47am
 
Hi Sergei,

Apologies for the slow response to this, I have been on paternity leave!

The script you suggest would be great and most helpful. I would of course acknowledge any grant you require.


Thanks very much for this!

Dan