Welcome, Guest. Please Login
YaBB - Yet another Bulletin Board
 
  HomeHelpSearchLogin  
 
How to get detailed Results from command line Ver (Read 7023 times)
SAN
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 8
How to get detailed Results from command line Ver
Feb 15th, 2011 at 10:40am
 
I am trying to implement HYPHY to estimate the dN/dS ratios of all the branches (58 branches) of 1600 codon alignments I have. I was able to compile the command line binary ( HYPHYMP) and successfully implement the analysis. The results are ‘stdout’ to the screen which contains the Log Likelihood and Tree givenTree. Is it possible that could direct the output to a file? Also when I used the GUI I was able to get the branch wise dN/dS from Analysis -> Results ->dN-dS for branches. But I was not able to retrieve this data in the command line version.

It would be great if you could let me know how could I retrieve it? I can’t use the GUI since I have like 1600 codon alignments. If I could output the dN-dS for branches for each alignment separately, I could just use a sh script to do it again and again for each of the alignment.

Please let me know.
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: How to get detailed Results from command line Ver
Reply #1 - Feb 18th, 2011 at 9:01am
 
Hi SAN,

SAN wrote on Feb 15th, 2011 at 10:40am:
I am trying to implement HYPHY to estimate the dN/dS ratios of all the branches (58 branches) of 1600 codon alignments I have. I was able to compile the command line binary ( HYPHYMP) and successfully implement the analysis. The results are ‘stdout’ to the screen which contains the Log Likelihood and Tree givenTree. Is it possible that could direct the output to a file?


Yes. You can either use the shell output redirect by including '>' on the command line (search Google for examples), or modify the batch file you are using to use a file name in place of 'stdout' when calls to fprintf are made. 

Quote:
Also when I used the GUI I was able to get the branch wise dN/dS from Analysis -> Results ->dN-dS for branches. But I was not able to retrieve this data in the command line version. It would be great if you could let me know how could I retrieve it? I can’t use the GUI since I have like 1600 codon alignments. If I could output the dN-dS for branches for each alignment separately, I could just use a sh script to do it again and again for each of the alignment.

Please let me know.


You will need to write a wrapper file (search these message boards for examples, but first take a look at section 2.8 of Multimedia File Viewing and Clickable Links are available for Registered Members only!!  You need to Login Login. You would call the analysis to fit a codon model (presumably AnalyzeCodonData.bf) followed by a call to TemplateBatchFiles/post_dNdS.bf to retrieve dN-dS for branches.

Sergei


Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
SAN
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 8
Re: How to get detailed Results from command line Ver
Reply #2 - Feb 21st, 2011 at 3:56pm
 
It would be great if you could show me an example of how to write the wrapper file. Do I need to first call for AnalyzeCodonData.bf and use those results to callin for post_dNdS.bf to retreive dN-dS for branches.
I am little confused here and it would be great if you could get me a little push here.

SAN
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: How to get detailed Results from command line Ver
Reply #3 - Feb 22nd, 2011 at 10:58am
 
Hi SAN,

Save this to a file (e.g. wrapper.bf)

Code:
RequireVersion ("2.0020110101");

fprintf (stdout, "Process this file:");
fscanf (stdin, "String", filePath);

LoadFunctionLibrary ("AnalyzeCodonData",
					  {"00": "Universal",
					   "01": filePath,
					   "02": "MG94CUSTOM",
					   "03": "Local",
					   "04": "012345",
					   "05": "Y"});
LoadFunctionLibrary ("post_dNdS.bf");

 



Then call

Code:
$(echo /path/to/some/file) | /path/to/HYPHYMP CPU=2 BASEPATH=/path/to/HYPHY /path/to/wrapper.bf

 



Replace /path/to with appropriate directory paths on your system. You can change the options I pass to AnalyzeCodonData in the above example to whatever you need.

Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
SAN
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 8
Re: How to get detailed Results from command line Ver
Reply #4 - Feb 22nd, 2011 at 2:20pm
 
Thanks for the valuable input. I guess I would have to input a tree file as well as codon alignment file..right> where do I specify it? I presume it is at echo /path/to/some/file). if that is the case do I need to put the alignment file and tree in one folder for all of my alignments?

Also . what exactly is the BASEPATH=/path/to/HYPHY? is it the directory of the HYPHY source code?

Please let me know.
SAN
Back to top
 
 
IP Logged
 
SAN
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 8
Re: How to get detailed Results from command line Ver
Reply #5 - Feb 24th, 2011 at 8:07am
 
I did explored more on the wrapper and changed the parameters. I want to get the dN/dS ratio of each of the branches. SO I believe I would have to input an alignment and tree. I want to use the model "GY94". I modified the the wrapper.bf as below:

Code:
RequireVersion ("2.0020110101");

fprintf (stdout, "Process this file:");
fscanf (stdin, "String", filePath);
fscanf (stdin, "Tree", TreePath);

LoadFunctionLibrary ("AnalyzeCodonData.bf",
                                          {"00": "Universal",
                                           "01": filePath,
                                           "02": "GY94",
                                           "03": "Local",
                                           "04": TreePath});
LoadFunctionLibrary ("post_dNdS.bf");

 



After I ran the command I got the following error message:

Code:
Process this file:Error:
All entries in the associative array used as input redirect argument to ExecuteCommands/ExecuteAFile must be strings. The following key was not: 04

Function call stack
1 : LoadFunctionLibrary from file"AnalyzeCodonData.bf" using basepath /Users/sandeep/Documents/HYPHY/. reading input from {"00":"Universal","01":filePath,"02":"GY94","03":"Local","04":TreePath}
-------
 



Would you please look into it and suggest a solution for this.
I am also attaching one alignment file and a tree file. I have to do it for 2000 such alignments to get the dN/dS ratios of each branches in each alignment.
Please let me know what might have gone wrong.
Back to top
 
 
IP Logged
 
SAN
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 8
Re: How to get detailed Results from command line Ver
Reply #6 - Feb 24th, 2011 at 9:19am
 
I got the wrapper working and I got the result file:
Code:
Process this file:
______________READ THE FOLLOWING DATA______________
58 species:{BACI,BALH,BC,BCAH1_,BCAH8_,BCB4_,BCE,BCZK,BT,G924_,GBAA_,bce10_,bce11_,bce12_,bce13_,bce14_,bce15_,bce16_,bce17_,bce18_,bce19_,bce1_,bce20_,bce21_,bce22_,bce23_,bce25_,bce26_,bce27_,bce28_,bce29_,bce2_,bce30_,bce4_,bce5_,bce6_,bce7_,bce98_,bce9_,bceKB_,bmyc1_,bmyc2_,bmyc3_,bpmyx,bth10_,bth11_,bth12_,bth13_,bth14_,bth1_,bth2_,bth3_,bth4_,bth5_,bth6_,bth7_,bth8_,bth9_};
Total Sites:984;
Distinct Sites:304
Please enter a 6 character model designation (e.g:010010 defines HKY85):
______________RESULTS______________
Log Likelihood = -12072.9379512379;
Shared Parameters:
GT=0.541603
CT=1.01131
CG=0.315775
AT=0.285311
AC=0.476575

Tree givenTree=((((bce22_:0.0976799,((bmyc2_:0,bmyc3_:0)Node7:0,bpmyx:0.0032812)Node6:0.0643221)Node4:0.0238212,bce98_:0.131946)Node3:0.216181,((((bce14_:0.00101548,bceKB_:0.00219835)Node15:0.0172679,(bce7_:0.0156655,bmyc1_:0.294627)Node18:0.0214159)Node14:0.0131025,bce26_:0.0171199)Node13:0.00101741,(bce29_:0,bce30_:0)Node22:0.0379876)Node12:0.00883453)Node2:0.00195322,((bce17_:0,bce20_:0)Node26:0.00309761,bce19_:0.00346193)Node25:0.0325879,((bce9_:0.296845,((bce28_:0.288973,bce6_:0.020992)Node34:0,(((((BCAH1_:0,bce13_:0)Node41:0.297827,bce1_:0.00227104)Node40:0,BCE:0.00343703)Node39:0,((bce10_:0.00113824,bth1_:0.00113733)Node47:0,G924_:0.00214067)Node46:0)Node38:0.280163,(BCZK:0,(((((((((BCAH8_:0.317937,bce16_:0)Node61:0,bth10_:0.00111754)Node60:0.0595776,bth7_:0)Node59:0,GBAA_:0.289237)Node58:0,bth12_:0.00110489)Node57:0,bth9_:0.28566)Node56:0,BT:0.00110196)Node55:0,BACI:0.286604)Node54:0.00221898,((BALH:0,bce4_:0)Node72:0,bce21_:0.00110873)Node71:0)Node53:0.00222426)Node51:0.0139253)Node37:0.0140406)Node33:0)Node31:0.00544632,((((((((BC:0,bce15_:0)Node83:0.0483543,bce12_:0.0011444)Node82:0,(bce18_:0.00113288,bce27_:0.0496362)Node87:0)Node81:0.195484,bth5_:0.0924736)Node80:0,(BCB4_:0.0010064,bce11_:0)Node91:0.101749)Node79:0.191855,((((bce23_:0.0735883,bce25_:0)Node97:0,bth6_:0.00219817)Node96:0.00220004,bce5_:0.0719465)Node95:0.00101108,bce2_:0)Node94:0)Node78:0.00379512,bth11_:0.00285239)Node77:0.0187235,((bth13_:0.296283,(bth14_:0,bth4_:0.0126539)Node107:0.0145902)Node105:0.00579535,((bth2_:0,bth8_:0)Node111:0,bth3_:0)Node110:0.0336835)Node104:0.00337846)Node76:0.0139159)Node30:0.00457504);

Leaf branches:

bce22_ : 0
bmyc2_ : 0
bmyc3_ : 0
bpmyx  : 0
bce98_ : 0
bce14_ : 0
bceKB_ : 0
bce7_  : 0
bmyc1_ : 0
bce26_ : 0
bce29_ : 0
bce30_ : 0
bce17_ : 0
bce20_ : 0
bce19_ : 0
bce9_  : 0
bce28_ : 0
bce6_  : 0
BCAH1_ : 0
bce13_ : 0
bce1_  : 0
BCE    : 0
bce10_ : 0
bth1_  : 0
G924_  : 0
BCZK   : 0
BCAH8_ : 0
bce16_ : 0
bth10_ : 0
bth7_  : 0
GBAA_  : 0
bth12_ : 0
bth9_  : 0
BT     : 0
BACI   : 0
BALH   : 0
bce4_  : 0
bce21_ : 0
BC     : 0
bce15_ : 0
bce12_ : 0
bce18_ : 0
bce27_ : 0
bth5_  : 0
BCB4_  : 0
bce11_ : 0
bce23_ : 0
bce25_ : 0
bth6_  : 0
bce5_  : 0
bce2_  : 0
bth11_ : 0
bth13_ : 0
bth14_ : 0
bth4_  : 0
bth2_  : 0
bth8_  : 0
bth3_  : 0

Internal Branches:

Node7  : 0
Node6  : 0
Node4  : 0
Node3  : 0
Node15 : 0
Node18 : 0
Node14 : 0
Node13 : 0
Node22 : 0
Node12 : 0
Node2  : 0
Node26 : 0
Node25 : 0
Node34 : 0
Node41 : 0
Node40 : 0
Node39 : 0
Node47 : 0
Node46 : 0
Node38 : 0
Node61 : 0
Node60 : 0
Node59 : 0
Node58 : 0
Node57 : 0
Node56 : 0
Node55 : 0
Node54 : 0
Node72 : 0
Node71 : 0
Node53 : 0
Node51 : 0
Node37 : 0
Node33 : 0
Node31 : 0
Node83 : 0
Node82 : 0
Node87 : 0
Node81 : 0
Node80 : 0
Node91 : 0
Node79 : 0
Node97 : 0
Node96 : 0
Node95 : 0
Node94 : 0
Node78 : 0
Node77 : 0
Node107: 0
Node105: 0
Node111: 0
Node110: 0
Node104: 0
Node76 : 0
Node30 : 0

E[Syn subs/nucleotide site] tree:
	((((bce22_:0.0742224,((bmyc2_:0,bmyc3_:0):0,bpmyx:0.00227493):0.0607191):0.0097918,bce98_:0.108035):0.159719,((((bce14_:0,bceKB_:0.00219835):0.0131878,(bce7_:0.0126398,bmyc1_:0.215356):0.0151017):0.0110628,bce26_:0.0171199):0,(bce29_:0,bce30_:0):0.0276719):0.00781854):0.00195322,((bce17_:0,bce20_:0):0.00208478,bce19_:0.00346193):0.0244208,((bce9_:0.221631,((bce28_:0.212442,bce6_:0.0179442):0,(((((BCAH1_:0,bce13_:0):0.222642,bce1_:0.00227104):0,BCE:0.00343703):0,((bce10_:0.00113824,bth1_:0.00113733):0,G924_:0.00113341):0):0.206168,(BCZK:0,(((((((((BCAH8_:0.237812,bce16_:0):0,bth10_:0.00111754):0.0482675,bth7_:0):0,GBAA_:0.211175):0,bth12_:0.00110489):0,bth9_:0.208783):0,BT:0.00110196):0,BACI:0.21152):0.00221898,((BALH:0,bce4_:0):0,bce21_:0.00110873):0):0.00222426):0.0129056):0.0140406):0):0.004428,((((((((BC:0,bce15_:0):0.0391534,bce12_:0.0011444):0,(bce18_:0.00113288,bce27_:0.0404362):0):0.195484,bth5_:0.00681538):0,(BCB4_:0,bce11_:0):0.0232256):0.191855,((((bce23_:0.064369,bce25_:0):0,bth6_:0.00219817):0.00220004,bce5_:0.0627291):0,bce2_:0):0):0.00379512,bth11_:0.00285239):0.0156825,((bth13_:0.221967,(bth14_:0,bth4_:0.00447804):0.0123304):0.000924139,((bth2_:0,bth8_:0):0,bth3_:0):0.0275685):0.00337846):0.0139159):0.00457504)

E[Non-syn subs/nucleotide site] tree:
	((((bce22_:0.0234575,((bmyc2_:0,bmyc3_:0):0,bpmyx:0.00100627):0.00360306):0.0140294,bce98_:0.0239114):0.0564612,((((bce14_:0.00101548,bceKB_:0):0.00408006,(bce7_:0.00302561,bmyc1_:0.0792714):0.00631422):0.00203966,bce26_:0):0.00101741,(bce29_:0,bce30_:0):0.0103157):0.00101599):0,((bce17_:0,bce20_:0):0.00101283,bce19_:0):0.00816713,((bce9_:0.0752134,((bce28_:0.0765304,bce6_:0.00304787):0,(((((BCAH1_:0,bce13_:0):0.0751849,bce1_:0):0,BCE:0):0,((bce10_:0,bth1_:0):0,G924_:0.00100726):0):0.0739955,(BCZK:0,(((((((((BCAH8_:0.0801247,bce16_:0):0,bth10_:0):0.0113101,bth7_:0):0,GBAA_:0.0780614):0,bth12_:0):0,bth9_:0.0768777):0,BT:0):0,BACI:0.0750832):0,((BALH:0,bce4_:0):0,bce21_:0):0):0):0.00101972):0):0):0.00101831,((((((((BC:0,bce15_:0):0.00920091,bce12_:0):0,(bce18_:0,bce27_:0.00920002):0):0,bth5_:0.0856582):0,(BCB4_:0.0010064,bce11_:0):0.0785229):0,((((bce23_:0.00921931,bce25_:0):0,bth6_:0):0,bce5_:0.00921738):0.00101108,bce2_:0):0):0,bth11_:0):0.00304104,((bth13_:0.0743164,(bth14_:0,bth4_:0.00817586):0.00225975):0.00487121,((bth2_:0,bth8_:0):0,bth3_:0):0.00611502):0):0):0)

Check messages.log details of this run.

 



I think now it is working.
But I would like to have some help in interpreting the dN/dS

to get the dN/dS for each branch do I need to take value from E[Non-syn subs/nucleotide site] tree: and E[Syn subs/nucleotide site] tree: from each branch and take the ratio?

for example: for the branch bce22_, dN/dS =0.0234575/0.0742224 = 0.3159

Is that right?
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: How to get detailed Results from command line Ver
Reply #7 - Feb 24th, 2011 at 3:14pm
 
Hi SAN,

You could get dN/dS out of the tree, but if that's what you want as the primary output, the easier way to go is to simply add some code in the wrapper to do it. For example, if you use any of the MG class models with local parameter options, you can add the following loop in the wrapper instead of loading post_dNdS.bf

Code:
fprintf (stdout,"\nLeaf branches: \n\n");

T = TipCount (givenTree);

for (h = 0; h<T; h=h+1)
{
	bName = TipName (givenTree,h);
	fprintf (stdout,bName);
	for (v = Abs(bName); v<maxLength; v=v+1)
	{
		fprintf (stdout," ");
	}
	fprintf (stdout,": ",Eval("givenTree."+bName+".nonSynRate/givenTree."+bName+".synRate"),"\n");
}

T = BranchCount (givenTree);

if (T>0)
{
	fprintf (stdout,"\nInternal Branches:\n\n");
	for (h = 0; h<T; h=h+1)
	{
		bName = BranchName (givenTree,h);
		fprintf (stdout,bName);
		for (v = Abs(bName); v<maxLength; v=v+1)
		{
			fprintf (stdout," ");
		}
		  fprintf (stdout,": ",Eval("givenTree."+bName+".nonSynRate/givenTree."+bName+".synRate"),"\n");
	}
}

 



Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
SAN
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 8
Re: How to get detailed Results from command line Ver
Reply #8 - Mar 14th, 2011 at 1:59pm
 
Hi,
I was able to get the dN/dS of each branch. Thanks.

I have a small question. Is there a way to get a confidence interval (S.E) of those estimated values using HYPHY.
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: How to get detailed Results from command line Ver
Reply #9 - Mar 15th, 2011 at 3:16pm
 
Hi San,

Do you want the approximate confidence intervals on omega (i.e. dN/dS)? Are you using the code snippet from my previous post to get dN/dS values?

Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
SAN
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 8
Re: How to get detailed Results from command line Ver
Reply #10 - Mar 16th, 2011 at 7:06am
 
Yes, I want the approximate confidence interval on omega for each branch. Yes, I am using the code snippet from your previous post to get the dN/dS values.

Thanks
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: How to get detailed Results from command line Ver
Reply #11 - Mar 16th, 2011 at 7:47am
 
Hi SAN,

Use the code below. Note that if a branch has dS = 0, this code won't be able to compute dN/dS and it's CI; it can be done with a different model parameterization.

Sergei

Code:
RequireVersion ("2.0020110101");

fprintf (stdout, "Process this file:");
fscanf (stdin, "String", filePath);

LoadFunctionLibrary ("AnalyzeCodonData",
					  {"00": "Universal",
					   "01": filePath,
					   "02": "MG94CUSTOM",
					   "03": "Local",
					   "04": "012345",
					   "05": "Y"});

fprintf (stdout,"\nLeaf branches: \n\n");

T = TipCount (givenTree);

for (h = 0; h<T; h=h+1)
{
	bName = TipName (givenTree,h);
	fprintf (stdout,bName);
	for (v = Abs(bName); v<maxLength; v=v+1)
	{
		fprintf (stdout," ");
	}
	branchOmegaValues = getCIAndValue (bName);
	fprintf (stdout,": ",branchOmegaValues[1],"\t[",branchOmegaValues[0], "-", branchOmegaValues[2],"]\n");
}

T = BranchCount (givenTree);

if (T>0)
{
	fprintf (stdout,"\nInternal Branches:\n\n");
	for (h = 0; h<T; h=h+1)
	{
		bName = BranchName (givenTree,h);
		fprintf (stdout,bName);
		for (v = Abs(bName); v<maxLength; v=v+1)
		{
			fprintf (stdout," ");
		}
		branchOmegaValues = getCIAndValue (bName);
		fprintf (stdout,": ",branchOmegaValues[1],"\t[",branchOmegaValues[0], ",", branchOmegaValues[2],"]\n");
	}
}

function getCIAndValue (branchName)
{
	if (Eval ("givenTree."+branchName+".synRate") > 0)
	{
		saveNS = Eval ("givenTree."+branchName+".nonSynRate");
		global myOmega = Eval ("givenTree."+branchName+".nonSynRate/givenTree."+branchName+".synRate");
		ExecuteCommands ("givenTree."+branchName+".nonSynRate:=myOmega*givenTree."+branchName+".synRate");
		COVARIANCE_PRECISION = 0.95;
		COVARIANCE_PARAMETER = "myOmega";
		CovarianceMatrix (cmx, lf);
		ExecuteCommands ("givenTree."+branchName+".nonSynRate=saveNS");
		return cmx;
	}
	else
	{
		return {{0,5000,10000}};
	}
}
 

Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
SAN
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 8
Re: How to get detailed Results from command line Ver
Reply #12 - Nov 10th, 2011 at 11:49am
 
I came across a situation where certain branches has dS=0 so it is not able to compute dN/dS and its CI. As you mentioned, this could be solved using a different model parameterization. Would you please suggest that different model parameterization?
Back to top
 
 
IP Logged