HyPhy message board
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
HYPHY Package >> Feature Additions >> dNdS trees - Bootstrap - Var estimates - UI
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1127985283

Message started by avilella on Sep 29th, 2005 at 2:14am

Title: dNdS trees - Bootstrap - Var estimates - UI
Post by avilella on Sep 29th, 2005 at 2:14am
Hi all,

I have been looking at HYPHY to try to save the dNdS trees (and the
variance estimates) for each of the bootstrap runs of a bootstrap
analysis.

My ultimate goal would be to have the value of each of the distances
in the dN and dS trees for each of the bootstrap runs,

and,

the variance estimates for the bootstrap at each of those distances.

But after looking at the code in post_variance.bf, post_sns.bf,
post_dNdS.bf, simpleBootstrap.bf I realized that my goal can't be done
without some code reshuffling:

It seems to me that the likelihood parameters, with which the sysSubs
and nsynSubs are calculated:

     synSubs  = (horOnes*(aRateMx$synM))*vertOnes;
     nsynSubs = (horOnes*(aRateMx$nonSynM))*vertOnes;

would be lost after the bootstrap execution. So that the post_sns.bf
calculations should be done just after the bootstrap run, and saved to
a desired file.

I would like to ask for some advice about how all this should be taken
into consideration to add these features in the cleanest possible way,

Bests,

   Albert.

Title: Re: dNdS trees - Bootstrap - Var estimates - UI
Post by Sergei on Sep 29th, 2005 at 5:45pm
Dear Albert,

There are two steps involved in computing expected syn and non-syn subs for each branch (in the batch language):

1). Based on the genetic code (which is stored in the _Genetic_Code variable when you are using chooseGeneticCode.def to select a genetic code), one needs to compute a stencil for rate matrix elements: which entries correspond to syn (or non-syn) substitutions. You can check whether this variable is defined by checking if Abs(_Genetic_Code ) is positive) This step should only be done once and can be lifted straight out of 'postdNdS.bf':


Code (]
/* do once */

branchNames = BranchName (givenTree,-1);
T = Columns (branchNames);

ExecuteCommands(
"GetInformation (aRateMx, givenTree."+branchNames[0):

+");");

/* make syn and non-syn template matrices */

nonStopCount = Columns (aRateMx);

vertOnes = {nonStopCount,1};
horOnes  = {1,nonStopCount};

for (h1 = 0; h1<nonStopCount; h1=h1+1)
{
     vertOnes [h1] = 1;
     horOnes  [h1] = 1;
}

/* do for each replicate, here simulatedCodonEFV is the vector of
equilibrium codon frequencies for the iterate which is computed to define the model for a replicate, e.g. in simpleBootstrap.bf */

synM    = {nonStopCount,nonStopCount};
nonSynM = {nonStopCount,nonStopCount};

hShift = 0;
for (h1 = 0; h1 < 64; h1=h1+1)
{
     gc1 = _Genetic_Code[h1];
     if (gc1 == 10)
     {
           hShift = hShift+1;
     }
     else
     {
           vShift = hShift;
           for (v1 = h1+1; v1 < 64; v1=v1+1)
           {
                 gc2 = _Genetic_Code[v1];
                 if (gc2 == 10)
                 {
                       vShift = vShift + 1;
                 }
                 else
                 {
                       if (gc1 == gc2)
                       {
                             synM [h1-hShift][v1-vShift] = simulatedCodonEFV[h1-hShift];
                             synM [v1-vShift][h1-hShift] = simulatedCodonEFV[v1-vShift];
                       }
                       else
                       {
                             nonSynM [h1-hShift][v1-vShift] = simulatedCodonEFV[h1-hShift];
                             nonSynM [v1-vShift][h1-hShift] = simulatedCodonEFV[v1-vShift];
                       }
                 }
           }
     }
}



2). For every replicate, after it has been fitted, and assuming the tree used for the replicate likelihood function is 'simulatedTree', compute a pair of associative arrays (indexed by branch name) which will be populated with expected number of syn and non-syn subs (per codon here; divide by 3 if you want it by nucleotide):


Code (]
synSubsAVL = {};
nsSubsAVL  = {};

for (h1=0; h1 < T-1; h1=h1+1)
{
     abn = branchNames[h1):

;
     ExecuteCommands("GetInformation (aRateMx, givenTree."+abn+");");
     synSubs  = (horOnes*(aRateMx$synM))*vertOnes;
     nsynSubs = (horOnes*(aRateMx$nonSynM))*vertOnes;      
     synSubsAVL[abn] = synSubs;
     nsSubsAVL [abn] = nsynSubs;
}


Then you can do various things with those replicates: write them to file, compute means, etc.

HTH,
Sergei

Title: Re: dNdS trees - Bootstrap - Var estimates - UI
Post by avilella on Oct 18th, 2005 at 12:26pm
Dears,

I have been working my way through the source code to try to better
understand it, and get this Var Estimates feature done, to no avail.

So I am in need to ask you for implementing the code to have the
Value, Std. Dev. and 95% Conf. Intervals in the same E[Syn subs] and
E[nonSyn subs] units as in the dSdN trees option.

Thanks in advance,

Bests,

   Albert.

Title: Re: dNdS trees - Bootstrap - Var estimates - UI
Post by Sergei on Oct 19th, 2005 at 6:43pm
Dear Albert,

I'll be happy to write this for you. I am in the middle of rather frantic grant revisions at the moment, but in a few days I should have some time to do this.

Cheers,
Sergei

Title: Re: dNdS trees - Bootstrap - Var estimates - UI
Post by avilella on Nov 4th, 2005 at 11:56pm
The units of the same dS and dN branch parameters in the sliding
windows output are also expressed in their ML units, not E[Syn subs]
and E[nonSyn subs] units.

I'm still cutting my teeth on HYPHY language, and I understand that
there could be a function call (somewhere) that calculates the
E[(non)Syn subs] values for the given matrix and prints them out
instead of the ML units.

Again, this is still beyond my reach... :-(

Thanks in advance,

   Albert.

Title: Re: dNdS trees - Bootstrap - Var estimates - UI
Post by Sergei on Nov 10th, 2005 at 5:16am
Dear Albert,

Please take a look at today's build (November 10th). I modified simpleBootstrap.bf to output branch lengths (split by dN and dS and total) to the detailed results file (CSV).

Effectively, when you call the new BranchLength function in the new version, HyPhy will check if BRANCH_LENGTH_STENCIL (a matrix) is defined. If it is, then BranchLength will compute branch lengths based only on some substitutions (those for which entries of BRANCH_LENGTH_STENCIL are >0). The functions in dNdSTreeTools.ibf allow define the stencils to include only synonymous and non-synonymous substitutions.

Hopefully there is enough detail in the implementation so that you can modify the output of SlidingWindow as well.

Cheers,
Sergei

Title: Re: dNdS trees - Bootstrap - Var estimates - UI
Post by avilella on Nov 10th, 2005 at 7:33am
Awesome!

Sergei, you rock man! And I seriously owe you a beer at least!

Cheers,

   Albert.

HyPhy message board » Powered by YaBB 2.5.2!
YaBB Forum Software © 2000-2024. All Rights Reserved.