Welcome, Guest. Please Login
YaBB - Yet another Bulletin Board
 
  HomeHelpSearchLogin  
 
dNdS trees - Bootstrap - Var estimates - UI (Read 4877 times)
avilella
YaBB Newbies
*
Offline


I love YaBB 1G - SP1!

Posts: 35
dNdS trees - Bootstrap - Var estimates - UI
Sep 29th, 2005 at 2:14am
 
Hi all,

I have been looking at HYPHY to try to save the dNdS trees (and the
variance estimates) for each of the bootstrap runs of a bootstrap
analysis.

My ultimate goal would be to have the value of each of the distances
in the dN and dS trees for each of the bootstrap runs,

and,

the variance estimates for the bootstrap at each of those distances.

But after looking at the code in post_variance.bf, post_sns.bf,
post_dNdS.bf, simpleBootstrap.bf I realized that my goal can't be done
without some code reshuffling:

It seems to me that the likelihood parameters, with which the sysSubs
and nsynSubs are calculated:

     synSubs  = (horOnes*(aRateMx$synM))*vertOnes;
     nsynSubs = (horOnes*(aRateMx$nonSynM))*vertOnes;

would be lost after the bootstrap execution. So that the post_sns.bf
calculations should be done just after the bootstrap run, and saved to
a desired file.

I would like to ask for some advice about how all this should be taken
into consideration to add these features in the cleanest possible way,

Bests,

    Albert.
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: dNdS trees - Bootstrap - Var estimates - UI
Reply #1 - Sep 29th, 2005 at 5:45pm
 
Dear Albert,

There are two steps involved in computing expected syn and non-syn subs for each branch (in the batch language):

1). Based on the genetic code (which is stored in the _Genetic_Code variable when you are using chooseGeneticCode.def to select a genetic code), one needs to compute a stencil for rate matrix elements: which entries correspond to syn (or non-syn) substitutions. You can check whether this variable is defined by checking if Abs(_Genetic_Code ) is positive) This step should only be done once and can be lifted straight out of 'postdNdS.bf':

Code:
/* do once */

branchNames = BranchName (givenTree,-1);
T = Columns (branchNames);

ExecuteCommands(
"GetInformation (aRateMx, givenTree."+branchNames[0]+");");

/* make syn and non-syn template matrices */

nonStopCount = Columns (aRateMx);

vertOnes = {nonStopCount,1};
horOnes  = {1,nonStopCount};

for (h1 = 0; h1<nonStopCount; h1=h1+1)
{
	vertOnes [h1] = 1;
	horOnes  [h1] = 1;
}

/* do for each replicate, here simulatedCodonEFV is the vector of
equilibrium codon frequencies for the iterate which is computed to define the model for a replicate, e.g. in simpleBootstrap.bf */

synM    = {nonStopCount,nonStopCount};
nonSynM = {nonStopCount,nonStopCount};

hShift = 0;
for (h1 = 0; h1 < 64; h1=h1+1)
{
	gc1 = _Genetic_Code[h1];
	if (gc1 == 10)
	{
		hShift = hShift+1;
	}
	else
	{
		vShift = hShift;
		for (v1 = h1+1; v1 < 64; v1=v1+1)
		{
			gc2 = _Genetic_Code[v1];
			if (gc2 == 10)
			{
				vShift = vShift + 1;
			}
			else
			{
				if (gc1 == gc2)
				{
					synM [h1-hShift][v1-vShift] = simulatedCodonEFV[h1-hShift];
					synM [v1-vShift][h1-hShift] = simulatedCodonEFV[v1-vShift];
				}
				else
				{
					nonSynM [h1-hShift][v1-vShift] = simulatedCodonEFV[h1-hShift];
					nonSynM [v1-vShift][h1-hShift] = simulatedCodonEFV[v1-vShift];
				}
			}
		}
	}
}

 



2). For every replicate, after it has been fitted, and assuming the tree used for the replicate likelihood function is 'simulatedTree', compute a pair of associative arrays (indexed by branch name) which will be populated with expected number of syn and non-syn subs (per codon here; divide by 3 if you want it by nucleotide):

Code:
synSubsAVL = {};
nsSubsAVL  = {};

for (h1=0; h1 < T-1; h1=h1+1)
{
	abn = branchNames[h1];
	ExecuteCommands("GetInformation (aRateMx, givenTree."+abn+");");
	synSubs  = (horOnes*(aRateMx$synM))*vertOnes;
	nsynSubs = (horOnes*(aRateMx$nonSynM))*vertOnes;
	synSubsAVL[abn] = synSubs;
	nsSubsAVL [abn] = nsynSubs;
}
 



Then you can do various things with those replicates: write them to file, compute means, etc.

HTH,
Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
avilella
YaBB Newbies
*
Offline


I love YaBB 1G - SP1!

Posts: 35
Re: dNdS trees - Bootstrap - Var estimates - UI
Reply #2 - Oct 18th, 2005 at 12:26pm
 
Dears,

I have been working my way through the source code to try to better
understand it, and get this Var Estimates feature done, to no avail.

So I am in need to ask you for implementing the code to have the
Value, Std. Dev. and 95% Conf. Intervals in the same E[Syn subs] and
E[nonSyn subs] units as in the dSdN trees option.

Thanks in advance,

Bests,

    Albert.
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: dNdS trees - Bootstrap - Var estimates - UI
Reply #3 - Oct 19th, 2005 at 6:43pm
 
Dear Albert,

I'll be happy to write this for you. I am in the middle of rather frantic grant revisions at the moment, but in a few days I should have some time to do this.

Cheers,
Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
avilella
YaBB Newbies
*
Offline


I love YaBB 1G - SP1!

Posts: 35
Re: dNdS trees - Bootstrap - Var estimates - UI
Reply #4 - Nov 4th, 2005 at 11:56pm
 
The units of the same dS and dN branch parameters in the sliding
windows output are also expressed in their ML units, not E[Syn subs]
and E[nonSyn subs] units.

I'm still cutting my teeth on HYPHY language, and I understand that
there could be a function call (somewhere) that calculates the
E[(non)Syn subs] values for the given matrix and prints them out
instead of the ML units.

Again, this is still beyond my reach... :-(

Thanks in advance,

    Albert.
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: dNdS trees - Bootstrap - Var estimates - UI
Reply #5 - Nov 10th, 2005 at 5:16am
 
Dear Albert,

Please take a look at today's build (November 10th). I modified simpleBootstrap.bf to output branch lengths (split by dN and dS and total) to the detailed results file (CSV).

Effectively, when you call the new BranchLength function in the new version, HyPhy will check if BRANCH_LENGTH_STENCIL (a matrix) is defined. If it is, then BranchLength will compute branch lengths based only on some substitutions (those for which entries of BRANCH_LENGTH_STENCIL are >0). The functions in dNdSTreeTools.ibf allow define the stencils to include only synonymous and non-synonymous substitutions.

Hopefully there is enough detail in the implementation so that you can modify the output of SlidingWindow as well.

Cheers,
Sergei
Back to top
« Last Edit: Nov 10th, 2005 at 7:21am by Sergei »  

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
avilella
YaBB Newbies
*
Offline


I love YaBB 1G - SP1!

Posts: 35
Re: dNdS trees - Bootstrap - Var estimates - UI
Reply #6 - Nov 10th, 2005 at 7:33am
 
Awesome!

Sergei, you rock man! And I seriously owe you a beer at least!

Cheers,

    Albert.
Back to top
 
 
IP Logged