Welcome, Guest. Please Login
YaBB - Yet another Bulletin Board
 
  HomeHelpSearchLogin  
 
Equilibrium Frequencies with branch-specific model (Read 1183 times)
Aidan Budd
YaBB Newbies
*
Offline


Monkey fed...

Posts: 29
Heidelberg, Germany
Gender: male
Equilibrium Frequencies with branch-specific model
May 15th, 2008 at 9:56am
 
Hi All/Sergei,

I'm working with binary characters evolving over gene trees.

I want to test whether there is better support for a model assuming equal or different rates of gain and loss of these characters - so compare a time-reversible model with a non-time-reversible one.

I also want to see if a model that allows different rates on branches that are daughters of duplication nodes fits the data better than one where forward/backward rates are the same all over the tree - so I apply different substitution models to different branches of the tree.

However - I'm not sure what to do about the equilibrium frequencies for the branch-specific model analyses.

For comparing the rev/nonRev models where all branches are associated with the same model, I use:
[code]
eqFrequencies = {{backwardRate/(forwardRate+ backwardRate), forwardRate/(forwardRate+ backwardRate)}};[/code]


But what to do with these branch-specific analyses? If one of these models were run over all branches of the tree, I'd expect it to yield different equilibrium frequencies than if the other model were used in the same way..?

Should I try and use different equilibrium frequencies for the two different models? Sounds maybe a bit dodgy, but I find it hard to put my finger on why...

So - my question is: What would be an appropriate way to describe (and code) an appropriate set of equilibrium frequencies into this set of model-testing scenarios?

I attach the code I'm using below - any comments much appreciated.

All the best,

Aidan

[code]
fprintf (stdout, "\nBegin new analysis\n\n");

/*Specify models */
global forwardRate   = 1;
global backwardRate  = 1;
global forwardPostDuplicationBranchRateScalar = 1;
global backwardPostDuplicationBranchRateScalar = 1;

binaryNonRevQ = {{*, forwardRate*t}
                {backwardRate*t,*}};
           
dupnNonRevBinaryQ = {{*, forwardPostDuplicationBranchRateScalar*forwardRate*t}
                   {backwardPostDuplicationBranchRateScalar*backwardRate*t,*}};


eqFrequencies = {{backwardRate/(forwardRate+ backwardRate), forwardRate/(forwardRate+ backwardRate)}};

/* Get character data */
dataString = "$BASESET:01\n#seqA\n11\n#seqB\n11\n#seqC\n00\n#seqD\n11\n#seqE\n11";
DataSet myData = ReadFromString(dataString);
DataSetFilter myFilter = CreateFilter (myData,1);

/* Specify format for tree data */
ACCEPT_BRANCH_LENGTHS = 1;
ACCEPT_ROOTED_TREES  = 1;
LIKELIHOOD_FUNCTION_OUTPUT=1;

/* "daughter" is the label given to the post-dupn branches in the trees provided by Claudia */
Model daughter = (dupnNonRevBinaryQ, eqFrequencies);
Model binaryNonRevModel = (binaryNonRevQ, eqFrequencies);

UseModel (USE_NO_MODEL);
Tree dummy = "((((seqD:0.028306,seqA:0.142698):0.109803,seqE:0.107957):0.096862{daughter},seqB:0.179313):0.573944,seqC:0.455613):0.0;";

branchNames       = BranchName   (dummy,-1); /* populate a vector of branch names; post order traversal */
branchLengthsNow  = BranchLength (dummy,-1); /* populate a vector of branch lengths; post order traversal */
   
UseModel (binaryNonRevModel);

Tree myTree = "((((seqD:0.028306,seqA:0.142698):0.109803,seqE:0.107957):0.096862{daughter},seqB:0.179313):0.573944,seqC:0.455613):0.0;";

bn = BranchName (myTree, -1);
for (j = 0; j < Columns (bn) -1; j=j+1) {
   ExecuteCommands ("myTree." + bn[j] + ".t:=" + branchLengthsNow[j] + ";");
}

LikelihoodFunction lf = (myFilter, myTree);
fprintf(stdout, "Likelihood Function with starting parameters: ", lf, "\n");

Optimize (parameterValues,lf);
fprintf(stdout, "Likelihood Function after optimising over all four parameters: ", lf, "\n");

/* do optimisation with forwardPostDuplicationBranchRateScalar parameter set to 1 */
forwardPostDuplicationBranchRateScalar := 1;
Optimize (parameterValues,lf);
fprintf(stdout, "Likelihood Function after optimising over different forward-reverse rates and backwardPostDuplicationBranchRateScalar, but fixing forwardPostDuplicationBranchRateScalar to 1: \n", lf, "\n");

/* do optimisation with backwardPostDuplicationBranchRateScalar parameter set to 1 */
forwardPostDuplicationBranchRateScalar = 1;
backwardPostDuplicationBranchRateScalar := 1;
Optimize (parameterValues,lf);
fprintf(stdout, "Likelihood Function after optimising over different forward-reverse rates and forwardPostDuplicationBranchRateScalar, but fixing backwardPostDuplicationBranchRateScalar to 1: \n", lf, "\n");

/* do optimisation with both duplication parameters set to 1 */
forwardPostDuplicationBranchRateScalar: = 1;
Optimize (parameterValues,lf);
fprintf(stdout, "Likelihood Function after optimising over different forward-reverse rates, with both duplicationsRateScalars set to 1: \n", lf, "\n");

/* do analysis forcing rates to be the same, forward and back */
forwardRate := backwardRate;
Optimize (parameterValues,lf);
fprintf(stdout, "Likelihood Function after OPTIMISATION and fixing forward-reverse rate the same and with dupnScalar set = 1: \n", lf, "\n");
[/code]

Back to top
 

Aidan Budd&&Computational Biologist&&EMBL Heideberg, Germany
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: Equilibrium Frequencies with branch-specific m
Reply #1 - May 15th, 2008 at 10:16am
 
Dear Aidan,

For binary models, if you want to preserve stationary character frequencies, then you lose the ability to tweak the rates.
In the reversible model
Code:
binaryNonRevQ = {{*, f*t}
	   {b*t,*}};
 



you are going to have stationary frequencies defined by

Code:
{{b/(f+b),f/(f+b)}}
 



When you change the relative values of b and f, you automatically alter stationary frequencies.
Hence when two different models are applied to different parts of the same Alsotree, you are making
a statement about the rates and automatically about the base frequencies. That's not necessarily a
problem - one simply has to think carefully how to interpret the rates. Your biological intuition
actually suggests that after duplication there may be character redistribution - i.e. a frequency
altering event.

Also, you should include the third argument in 'Model' declarations and set it to 0, because otherwise
HyPhy will automatically post-multiply rate matrix entries by base frequencies, and this will actually
break the equilibrium frequencies. E.g.

Code:
Model binaryNonRevModel = (binaryNonRevQ, eqFrequencies);
 



Will construct the following rate matrix:

{
{binaryNonRevQ[0][0]*eqFrequencies[0],binaryNonRevQ[0][1]*eqFrequencies[1]}
{binaryNonRevQ[1][0]*eqFrequencies[0],binaryNonRevQ[1][1]*eqFrequencies[1]}
}

Cheers,
Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged