Questions about use of GARD and GABranch
smurray
Apr 15th, 2009 at 4:35am
 
Dear Sergei and other smart people on here,

I was wondering if you could answer a few questions about my analyses. I sent you an email before I discovered this forum today, so please feel free to ignore it!

1. I have some alignments with gene orthologues and paralogues together. I believe there has been a duplication followed by positive selection just along that one branch, following the duplication. Can I use GABranch to test this?
What about if recombination has been detected using GARD?

2. Are there biological processes other than recombination that could lead to break points in the alignment? I am not entirely sure about recombination in this case.

3. What is the minimum number of sequences to get meaningful results using both GARD and selection detection programs? I have some alignments with only 3 or 4 sequences, about 300 - 800 bp.

Many thanks for your time!
 
Sergei
Reply #1 - Apr 16th, 2009 at 10:26am
 
Dear smurray,

To answer your questions:

1). GABranch will be useful, and so will the tests described in section 1.5 of [link hidden by the forum; visible to registered members only].

2). Yes, there are many. One is convergent evolution, another is local rate variation or heterotachy (when different branches in different parts of the alignment evolve under different evolutionary regimes).

3). GARD will work fine with 3 to 4 sequences; selection detection programs can really only address alignment-wide questions (e.g. is there selection somewhere in the alignment), but not site-specific questions (e.g. which sites are under selection), for alignments of this size.

Sergei

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
 
smurray
Reply #2 - Apr 17th, 2009 at 10:28pm
 
Many thanks for your reply, it's very useful.

Do you know of any methods that attempt to distinguish between the signals of recombination and those of local rate variation or heterotachy?

Is this theoretically possible?
Shauna



 
Art Poon
Reply #3 - May 1st, 2009 at 9:38am
 
Hi Shauna,

GARD accommodates heterotachy and local rate variation in the sense that local branch lengths (heterotachy) vary between split trees, as well as overall tree length (local rate variation).  As a result, the signal of recombination (non-congruent topologies among split trees) should be relatively free of that influence.  I think.
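
To make "non-congruent topologies" concrete, here is a minimal Python sketch that compares two trees by their branching pattern only, so differences in branch lengths or overall tree length do not count. It is an illustration of the idea, not how GARD itself searches for or scores breakpoints. The nested-tuple tree format, the taxon names A to E and the function names are all hypothetical.

```python
def taxa_of(tree):
    """Collect the leaf names of a tree written as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(taxa_of(child) for child in tree))


def informative_splits(tree):
    """Return the non-trivial bipartitions of an unrooted tree (branch lengths ignored)."""
    all_taxa = taxa_of(tree)
    reference = min(all_taxa)  # fixed taxon, so each split is always written the same way
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        # Only splits with at least two taxa on each side carry topological information.
        if 2 <= len(below) <= len(all_taxa) - 2:
            found.add(all_taxa - below if reference in below else below)
        return below

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: number of splits found in exactly one of the two trees."""
    return len(informative_splits(tree_a) ^ informative_splits(tree_b))


# Hypothetical trees for the segments on either side of a putative breakpoint.
left_tree = (("A", "B"), ("C", "D"), "E")
left_tree_rerooted = ("A", "B", (("C", "D"), "E"))  # same unrooted topology
right_tree = (("A", "C"), ("B", "D"), "E")          # different topology

print(rf_distance(left_tree, left_tree_rerooted))
print(rf_distance(left_tree, right_tree))
```

A distance of zero means the two segments support the same unrooted topology, however different their branch lengths are; a non-zero distance is the kind of incongruence that a recombination breakpoint would be expected to produce.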

- Art.
 
smurray
Reply #4 - Jul 29th, 2009 at 1:32am
 
Hi there,

Regarding question 3), about how to analyse very small alignments, do you mean they can only be analysed by distance methods, i.e. in MEGA, or do you mean they can be analysed by likelihood methods, but only as implemented in PARRIS, for example?

I am also interested in your opinion about our GABranch results, which clearly show a difference in selective regime on a branch after a gene duplication event, as might be expected.
However, the alignment has some recombination as predicted by GARD. I would like to know how to determine whether the recombination is real, and whether it has potentially affected the detection of selection on the particular branch after duplication.

thanks
 
Art Poon
Reply #5 - Jul 29th, 2009 at 10:47pm
 
Hi Shauna,

Sure, distance methods such as those implemented in MEGA would be reasonable for very small alignments (with fewer than 10 sequences) as they use the information at all sites simultaneously.  Likelihood methods would be fine also, so long as you are assessing selection at an appropriate level, i.e., not on a site-by-site basis as one would do with Datamonkey.  PARRIS would be handy in this respect as recombination breakpoints (if they occur) would partition your alignment so that you're not just looking at the gene-wide mean dN/dS.

[In case you're wondering: ML estimation can be done within our stand-alone application (HyPhy) by using "Open Data File.." to open the alignment, selecting all sites for a codon partition, selecting a tree and codon model (MG94xHKY84 should be fine) with global parameters and then fitting the likelihood function.  Gene-wide dN/dS is given by the parameter estimate R.]

More later,
- Art.

 
Art Poon
Reply #6 - Jul 29th, 2009 at 11:37pm
 
Hi again,
I think it should be possible to modify the PARRIS script to assess branch-specific rates, using the ReplicateConstraint() command on the respective branches in each tree.  Probably somewhere around line 697 in PARRIS.bf:

Code:
ExecuteCommands ("ReplicateConstraint (\"this1.?.synRate:=this2.?.t__/codonFactor\",givenTree_" + fileID + ",nucTree_" + fileID + ");");
 



and something like what's done in YangNielsenBranchSite2005.bf:

Code:
	/* constrain the foreground branch first */
	for (bc = 0; bc < Columns (stOption); bc = bc + 1)
	{
		ExecuteCommands ("givenTree."+choiceMatrix[stOption[bc]][0]+".nonSynRate:=omega_FG*givenTree."+choiceMatrix[stOption[bc]][0]+".synRate;");
	}

	/* constrain other branches next */
	ReplicateConstraint ("this1.?.nonSynRate:=omega_BG*this2.?.synRate",givenTree,givenTree);
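
For readers who do not use the HyPhy batch language, the effect of the two constraints above can be mimicked in a few lines of Python: every branch gets nonSynRate = omega * synRate, with one shared omega for the foreground branches and another for the rest. This is only a sketch of the bookkeeping, not of the likelihood optimisation. The branch names, rates and omega values are made up for illustration.

```python
def constrain_nonsyn_rates(syn_rates, foreground, omega_fg, omega_bg):
    """Tie each branch's non-synonymous rate to its synonymous rate.

    Foreground branches share omega_fg; all other branches share omega_bg,
    mirroring the two-step constraint (foreground first, then the rest).
    """
    non_syn = {}
    for branch, syn in syn_rates.items():
        omega = omega_fg if branch in foreground else omega_bg
        non_syn[branch] = omega * syn
    return non_syn


# Hypothetical synonymous rates (branch lengths) for one partition.
syn_rates = {"dup_branch": 0.05, "paralog_1": 0.10, "paralog_2": 0.20}

rates = constrain_nonsyn_rates(
    syn_rates,
    foreground={"dup_branch"},  # the branch following the duplication
    omega_fg=4.0,               # illustrative value only
    omega_bg=0.5,               # illustrative value only
)
for branch, rate in rates.items():
    print(branch, round(rate, 4))
```

Because only two omega values are free, the question "is the foreground branch evolving differently?" reduces to comparing the estimates of omega_fg and omega_bg.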
 



I'll get back to this later, it's getting late..

- Art.
 
smurray
Reply #7 - Aug 8th, 2009 at 11:32pm
 
Dear Art,

Thanks so much for your in-depth replies. I'm sorry I didn't notice the second part of your reply until now; I have been in Europe, travelling with my 2-year-old daughter...

Now back to work.
I love the idea that I may be able to use PARRIS to assess branch-specific selection rates for the 3 alignments where there has been a gene duplication, as well as the other shorter alignments. I could then use the same method for all, and I can understand and justify that particular choice of method.

As a relative newcomer to selection analyses I am finding there's a dizzying array of algorithms, each justified by pages of equations that I only understand with great effort!

I am no programmer so if possible could you explain exactly where and in what way I should modify PARRIS?
 
Art Poon
Reply #8 - Aug 10th, 2009 at 3:27pm
 
Hi Shauna,

I've got a kid myself (11 months old) so I understand completely.

This is a tall order.  The PARRIS batch file is bucket-loads of code to absorb, and modifying it to address your question will take at least several days even for an experienced programmer.

I suppose one way to start is if you could send me a toy data set that I could muck about with in PARRIS under default settings before I break it. 

- Art.
 
smurray
Reply #9 - Aug 10th, 2009 at 7:27pm
 
Hi there Art,

Congratulations .... 11 months is a lovely age. Well each age so far has been lovely, but it was such fun when she started walking and talking.

Here is the largest alignment I have, with 9 taxa and 1029 nucleotides. There is a tree in there too, which is one I inferred based on the proteins, with ML and more outgroups. But I guess that tree won't be the only relevant one for branch-specific rate analyses if the GARD result is correct and there are 6 breakpoints, 4 of which are significant.

What do you think about another analysis I have done with 6 taxa, where GARD found breakpoints but they were not significant? Would you consider this alignment too small to use GABranch, and should the possible recombination still be taken into account in any selection detection analysis?

thanks again!
[Attachment, 2 KB: visible to registered members only]
 
Art Poon
Reply #10 - Aug 17th, 2009 at 2:21pm
 
Okay, I've merged the PARRIS.bf and YangNielsenBranchSite2005.bf files.  You need to put the attached file into the /TemplateBatchFiles directory for it to work properly.  The file assumes a fully-reversible nucleotide substitution model (you can edit this on line 22) --- this might be an over-parameterization for your data.  

When you run the file, it will ask you how many files to input.  Enter '1' in the console window.  Select a file containing the NEXUS output from a GARD analysis of your data.  It should contain multiple partitions.  For each partition and tree, you will be asked to define foreground and background branches.  These two levels are differentiated by their dN/dS rate ratio.  Note that you can select MORE THAN ONE branch to place in the foreground.

The analysis depends on your being able to define meaningful foreground and background for every partition and tree!  I don't have any way of automating this.

Then HyPhy will optimize the likelihood function consisting of the tree whose synonymous rates (branch lengths) are constrained to nucleotide trees estimated from each partition and substitution rate biases estimated from the entire alignment.

The post-processing steps (i.e. interpretation of results) that were present in the YangNielsenBranchSite2005.bf batch file have been stripped out, because they would barf on the multiple partitions.

[Attachment, 8 KB: visible to registered members only]
 
smurray
Reply #11 - Aug 18th, 2009 at 4:03am
 
Fantastic!
thanks very much.
I'll let you know how it goes.
Shauna
 
smurray
Reply #12 - Sep 8th, 2009 at 4:49am
 
Dear Art,

I ran the analysis with the data set, and it worked perfectly!
Now I have a question about the interpretation of the results.
 
smurray
Reply #13 - Sep 8th, 2009 at 5:37am
 
...take two...

These are the relevant results:

Log Likelihood = -3477.74288102097;
Shared Parameters:
codonFactor=0.0318823
P_0=0.706115
P_1_aux=0.23038
omega_0=0.155844
omega_2=4.07621
kappa_inv=0.479152
P_1=Min(P_1_aux,1-P_0)=0.23038
omega_FG=(site_kind==1)*omega_0+(site_kind==2)+(site_kind>2)*omega_2=4.07621
omega_BG=((site_kind==1)+(site_kind==3))*omega_0+(site_kind==2)+(site_kind==4)=1


Followed by the trees with the branch lengths optimised. How do I interpret omega_FG and omega_BG?

thanks again
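
One way to read the omega_FG and omega_BG lines in the output above is as lookup rules: for each site class (site_kind 1 to 4) they give the dN/dS applied to foreground and background branches. The Python sketch below transcribes the two expressions directly and tabulates them using the quoted estimates. The class weights are an assumption: they follow the usual branch-site "model A" layout, and the thread does not confirm that the merged batch file uses exactly these proportions.

```python
# Shared parameter estimates copied from the output quoted above.
P_0, P_1 = 0.706115, 0.23038
OMEGA_0, OMEGA_2 = 0.155844, 4.07621


def omega_fg(site_kind):
    """Foreground dN/dS per site class, transcribed from the omega_FG expression."""
    return (site_kind == 1) * OMEGA_0 + (site_kind == 2) + (site_kind > 2) * OMEGA_2


def omega_bg(site_kind):
    """Background dN/dS per site class, transcribed from the omega_BG expression."""
    return ((site_kind == 1) + (site_kind == 3)) * OMEGA_0 + (site_kind == 2) + (site_kind == 4)


def class_weights(p0, p1):
    """ASSUMPTION: standard branch-site model A proportions for the four classes."""
    rest = 1.0 - p0 - p1
    return [p0, p1, rest * p0 / (p0 + p1), rest * p1 / (p0 + p1)]


for kind, weight in zip(range(1, 5), class_weights(P_0, P_1)):
    print(f"site_kind={kind}  weight={weight:.4f}  "
          f"omega_BG={omega_bg(kind):.4f}  omega_FG={omega_fg(kind):.4f}")
```

Read this way, the printed values omega_FG=4.07621 and omega_BG=1 match the expressions evaluated at site_kind 4, so they appear to be the last class's values rather than branch-wide averages. The quantities that carry the biological meaning are omega_2 (the foreground-only rate) and the share of sites assigned to classes 3 and 4.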
 
smurray
Reply #14 - Oct 8th, 2009 at 1:52pm
 
Hi there,

I have done LRTs on the results from the different models and the results all make sense. In one dataset I had a FG/BG omega of 11, but when I did the LRT with a null model assuming no positive selection, it could not be rejected. I am guessing this is because the other branches in that tree were too variable in their selective regimes. Still, for the other alignments the results made sense.

I am just finishing the paper and have acknowledged your help. Many thanks
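
For anyone repeating the likelihood ratio tests mentioned above, the arithmetic is short enough to sketch in Python. The alternative log likelihood below is the one quoted earlier in the thread; the null log likelihood is a made-up placeholder, since the thread does not report it. The sketch assumes the two models differ by one free parameter and uses a chi-square distribution with 1 degree of freedom, which is generally regarded as a conservative choice when the null fixes a parameter at a boundary.

```python
import math


def lrt_one_df(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns the test statistic 2 * (lnL_alt - lnL_null) and its p-value under
    a chi-square distribution with 1 degree of freedom.
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)  # guard against tiny negative values
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


LNL_ALT = -3477.74288102097  # from the output quoted in this thread
LNL_NULL = -3479.5           # HYPOTHETICAL value, for illustration only

stat, p_value = lrt_one_df(LNL_NULL, LNL_ALT)
print(f"LRT statistic = {stat:.3f}, p = {p_value:.4f}")
```

A large foreground omega with a non-significant test, as in the dataset described above, is what this arithmetic gives when the two log likelihoods are close: the point estimate can be high while the data barely distinguish it from the null.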