HyPhy message board
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
Methodology Questions >> How to >> How to get detailed Results from command line Ver
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1297795207

Message started by SAN on Feb 15th, 2011 at 10:40am

Title: How to get detailed Results from command line Ver
Post by SAN on Feb 15th, 2011 at 10:40am
I am trying to implement HYPHY to estimate the dN/dS ratios of all the branches (58 branches) of 1600 codon alignments I have. I was able to compile the command line binary ( HYPHYMP) and successfully implement the analysis. The results are ‘stdout’ to the screen which contains the Log Likelihood and Tree givenTree. Is it possible that could direct the output to a file? Also when I used the GUI I was able to get the branch wise dN/dS from Analysis -> Results ->dN-dS for branches. But I was not able to retrieve this data in the command line version.

It would be great if you could let me know how could I retrieve it? I can’t use the GUI since I have like 1600 codon alignments. If I could output the dN-dS for branches for each alignment separately, I could just use a sh script to do it again and again for each of the alignment.

Please let me know.

Title: Re: How to get detailed Results from command line Ver
Post by Sergei on Feb 18th, 2011 at 9:01am
Hi SAN,


SAN wrote on Feb 15th, 2011 at 10:40am:
I am trying to implement HYPHY to estimate the dN/dS ratios of all the branches (58 branches) of 1600 codon alignments I have. I was able to compile the command line binary ( HYPHYMP) and successfully implement the analysis. The results are ‘stdout’ to the screen which contains the Log Likelihood and Tree givenTree. Is it possible that could direct the output to a file?


Yes. You can either use the shell output redirect by including '>' on the command line (search Google for examples), or modify the batch file you are using to use a file name in place of 'stdout' when calls to fprintf are made.  


Quote:
Also when I used the GUI I was able to get the branch wise dN/dS from Analysis -> Results ->dN-dS for branches. But I was not able to retrieve this data in the command line version. It would be great if you could let me know how could I retrieve it? I can’t use the GUI since I have like 1600 codon alignments. If I could output the dN-dS for branches for each alignment separately, I could just use a sh script to do it again and again for each of the alignment.

Please let me know.


You will need to write a wrapper file (search these message boards for examples, but first take a look at section 2.8 of Multimedia File Viewing and Clickable Links are available for Registered Members only!!  You need to Login Login. You would call the analysis to fit a codon model (presumably AnalyzeCodonData.bf) followed by a call to TemplateBatchFiles/post_dNdS.bf to retrieve dN-dS for branches.

Sergei



Title: Re: How to get detailed Results from command line Ver
Post by SAN on Feb 21st, 2011 at 3:56pm
It would be great if you could show me an example of how to write the wrapper file. Do I need to first call for AnalyzeCodonData.bf and use those results to callin for post_dNdS.bf to retreive dN-dS for branches.
I am little confused here and it would be great if you could get me a little push here.

SAN

Title: Re: How to get detailed Results from command line Ver
Post by Sergei on Feb 22nd, 2011 at 10:58am
Hi SAN,

Save this to a file (e.g. wrapper.bf)


Code (]

RequireVersion ("2.0020110101");

fprintf (stdout, "Process this file:");
fscanf (stdin, "String", filePath);

LoadFunctionLibrary ("AnalyzeCodonData",
                               {"00": "Universal",
                                "01": filePath,
                                "02": "MG94CUSTOM",
                                "03": "Local",
                                "04": "012345",
                                "05": "Y"});
LoadFunctionLibrary ("post_dNdS.bf");

[/code):

Then call

[code]

$(echo /path/to/some/file) | /path/to/HYPHYMP CPU=2 BASEPATH=/path/to/HYPHY /path/to/wrapper.bf


Replace /path/to with appropriate directory paths on your system. You can change the options I pass to AnalyzeCodonData in the above example to whatever you need.

Sergei

Title: Re: How to get detailed Results from command line Ver
Post by SAN on Feb 22nd, 2011 at 2:20pm
Thanks for the valuable input. I guess I would have to input a tree file as well as codon alignment file..right> where do I specify it? I presume it is at echo /path/to/some/file). if that is the case do I need to put the alignment file and tree in one folder for all of my alignments?

Also . what exactly is the BASEPATH=/path/to/HYPHY? is it the directory of the HYPHY source code?

Please let me know.
SAN

Title: Re: How to get detailed Results from command line Ver
Post by SAN on Feb 24th, 2011 at 8:07am
I did explored more on the wrapper and changed the parameters. I want to get the dN/dS ratio of each of the branches. SO I believe I would have to input an alignment and tree. I want to use the model "GY94". I modified the the wrapper.bf as below:


Code (]
RequireVersion ("2.0020110101");

fprintf (stdout, "Process this file:");
fscanf (stdin, "String", filePath);
fscanf (stdin, "Tree", TreePath);

LoadFunctionLibrary ("AnalyzeCodonData.bf",
                                         {"00": "Universal",
                                          "01": filePath,
                                          "02": "GY94",
                                          "03": "Local",
                                          "04": TreePath});
LoadFunctionLibrary ("post_dNdS.bf");

[/code):



After I ran the command I got the following error message:

[code]
Process this file:Error:
All entries in the associative array used as input redirect argument to ExecuteCommands/ExecuteAFile must be strings. The following key was not: 04

Function call stack
1 : LoadFunctionLibrary from file"AnalyzeCodonData.bf" using basepath /Users/sandeep/Documents/HYPHY/. reading input from {"00":"Universal","01":filePath,"02":"GY94","03":"Local","04":TreePath}
-------


Would you please look into it and suggest a solution for this.
I am also attaching one alignment file and a tree file. I have to do it for 2000 such alignments to get the dN/dS ratios of each branches in each alignment.
Please let me know what might have gone wrong.

Title: Re: How to get detailed Results from command line Ver
Post by SAN on Feb 24th, 2011 at 9:19am
I got the wrapper working and I got the result file:

Code (]
Process this file:
______________READ THE FOLLOWING DATA______________
58 species:{BACI,BALH,BC,BCAH1_,BCAH8_,BCB4_,BCE,BCZK,BT,G924_,GBAA_,bce10_,bce11_,bce12_,bce13_,bce14_,bce15_,bce16_,bce17_,bce18_,bce19_,bce1_,bce20_,bce21_,bce22_,bce23_,bce25_,bce26_,bce27_,bce28_,bce29_,bce2_,bce30_,bce4_,bce5_,bce6_,bce7_,bce98_,bce9_,bceKB_,bmyc1_,bmyc2_,bmyc3_,bpmyx,bth10_,bth11_,bth12_,bth13_,bth14_,bth1_,bth2_,bth3_,bth4_,bth5_,bth6_,bth7_,bth8_,bth9_};
Total Sites:984;
Distinct Sites:304
Please enter a 6 character model designation (e.g:010010 defines HKY85):
______________RESULTS______________
Log Likelihood = -12072.9379512379;
Shared Parameters:
GT=0.541603
CT=1.01131
CG=0.315775
AT=0.285311
AC=0.476575

Tree givenTree=((((bce22_:0.0976799,((bmyc2_:0,bmyc3_:0)Node7:0,bpmyx:0.0032812)Node6:0.0643221)Node4:0.0238212,bce98_:0.131946)Node3:0.216181,((((bce14_:0.00101548,bceKB_:0.00219835)Node15:0.0172679,(bce7_:0.0156655,bmyc1_:0.294627)Node18:0.0214159)Node14:0.0131025,bce26_:0.0171199)Node13:0.00101741,(bce29_:0,bce30_:0)Node22:0.0379876)Node12:0.00883453)Node2:0.00195322,((bce17_:0,bce20_:0)Node26:0.00309761,bce19_:0.00346193)Node25:0.0325879,((bce9_:0.296845,((bce28_:0.288973,bce6_:0.020992)Node34:0,(((((BCAH1_:0,bce13_:0)Node41:0.297827,bce1_:0.00227104)Node40:0,BCE:0.00343703)Node39:0,((bce10_:0.00113824,bth1_:0.00113733)Node47:0,G924_:0.00214067)Node46:0)Node38:0.280163,(BCZK:0,(((((((((BCAH8_:0.317937,bce16_:0)Node61:0,bth10_:0.00111754)Node60:0.0595776,bth7_:0)Node59:0,GBAA_:0.289237)Node58:0,bth12_:0.00110489)Node57:0,bth9_:0.28566)Node56:0,BT:0.00110196)Node55:0,BACI:0.286604)Node54:0.00221898,((BALH:0,bce4_:0)Node72:0,bce21_:0.00110873)Node71:0)Node53:0.00222426)Node51:0.0139253)Node37:0.0140406)Node33:0)Node31:0.00544632,((((((((BC:0,bce15_:0)Node83:0.0483543,bce12_:0.0011444)Node82:0,(bce18_:0.00113288,bce27_:0.0496362)Node87:0)Node81:0.195484,bth5_:0.0924736)Node80:0,(BCB4_:0.0010064,bce11_:0)Node91:0.101749)Node79:0.191855,((((bce23_:0.0735883,bce25_:0)Node97:0,bth6_:0.00219817)Node96:0.00220004,bce5_:0.0719465)Node95:0.00101108,bce2_:0)Node94:0)Node78:0.00379512,bth11_:0.00285239)Node77:0.0187235,((bth13_:0.296283,(bth14_:0,bth4_:0.0126539)Node107:0.0145902)Node105:0.00579535,((bth2_:0,bth8_:0)Node111:0,bth3_:0)Node110:0.0336835)Node104:0.00337846)Node76:0.0139159)Node30:0.00457504);

Leaf branches:

bce22_ : 0
bmyc2_ : 0
bmyc3_ : 0
bpmyx  : 0
bce98_ : 0
bce14_ : 0
bceKB_ : 0
bce7_  : 0
bmyc1_ : 0
bce26_ : 0
bce29_ : 0
bce30_ : 0
bce17_ : 0
bce20_ : 0
bce19_ : 0
bce9_  : 0
bce28_ : 0
bce6_  : 0
BCAH1_ : 0
bce13_ : 0
bce1_  : 0
BCE    : 0
bce10_ : 0
bth1_  : 0
G924_  : 0
BCZK   : 0
BCAH8_ : 0
bce16_ : 0
bth10_ : 0
bth7_  : 0
GBAA_  : 0
bth12_ : 0
bth9_  : 0
BT     : 0
BACI   : 0
BALH   : 0
bce4_  : 0
bce21_ : 0
BC     : 0
bce15_ : 0
bce12_ : 0
bce18_ : 0
bce27_ : 0
bth5_  : 0
BCB4_  : 0
bce11_ : 0
bce23_ : 0
bce25_ : 0
bth6_  : 0
bce5_  : 0
bce2_  : 0
bth11_ : 0
bth13_ : 0
bth14_ : 0
bth4_  : 0
bth2_  : 0
bth8_  : 0
bth3_  : 0

Internal Branches:

Node7  : 0
Node6  : 0
Node4  : 0
Node3  : 0
Node15 : 0
Node18 : 0
Node14 : 0
Node13 : 0
Node22 : 0
Node12 : 0
Node2  : 0
Node26 : 0
Node25 : 0
Node34 : 0
Node41 : 0
Node40 : 0
Node39 : 0
Node47 : 0
Node46 : 0
Node38 : 0
Node61 : 0
Node60 : 0
Node59 : 0
Node58 : 0
Node57 : 0
Node56 : 0
Node55 : 0
Node54 : 0
Node72 : 0
Node71 : 0
Node53 : 0
Node51 : 0
Node37 : 0
Node33 : 0
Node31 : 0
Node83 : 0
Node82 : 0
Node87 : 0
Node81 : 0
Node80 : 0
Node91 : 0
Node79 : 0
Node97 : 0
Node96 : 0
Node95 : 0
Node94 : 0
Node78 : 0
Node77 : 0
Node107: 0
Node105: 0
Node111: 0
Node110: 0
Node104: 0
Node76 : 0
Node30 : 0

E[Syn subs/nucleotide site):

tree:
     ((((bce22_:0.0742224,((bmyc2_:0,bmyc3_:0):0,bpmyx:0.00227493):0.0607191):0.0097918,bce98_:0.108035):0.159719,((((bce14_:0,bceKB_:0.00219835):0.0131878,(bce7_:0.0126398,bmyc1_:0.215356):0.0151017):0.0110628,bce26_:0.0171199):0,(bce29_:0,bce30_:0):0.0276719):0.00781854):0.00195322,((bce17_:0,bce20_:0):0.00208478,bce19_:0.00346193):0.0244208,((bce9_:0.221631,((bce28_:0.212442,bce6_:0.0179442):0,(((((BCAH1_:0,bce13_:0):0.222642,bce1_:0.00227104):0,BCE:0.00343703):0,((bce10_:0.00113824,bth1_:0.00113733):0,G924_:0.00113341):0):0.206168,(BCZK:0,(((((((((BCAH8_:0.237812,bce16_:0):0,bth10_:0.00111754):0.0482675,bth7_:0):0,GBAA_:0.211175):0,bth12_:0.00110489):0,bth9_:0.208783):0,BT:0.00110196):0,BACI:0.21152):0.00221898,((BALH:0,bce4_:0):0,bce21_:0.00110873):0):0.00222426):0.0129056):0.0140406):0):0.004428,((((((((BC:0,bce15_:0):0.0391534,bce12_:0.0011444):0,(bce18_:0.00113288,bce27_:0.0404362):0):0.195484,bth5_:0.00681538):0,(BCB4_:0,bce11_:0):0.0232256):0.191855,((((bce23_:0.064369,bce25_:0):0,bth6_:0.00219817):0.00220004,bce5_:0.0627291):0,bce2_:0):0):0.00379512,bth11_:0.00285239):0.0156825,((bth13_:0.221967,(bth14_:0,bth4_:0.00447804):0.0123304):0.000924139,((bth2_:0,bth8_:0):0,bth3_:0):0.0275685):0.00337846):0.0139159):0.00457504)

E[Non-syn subs/nucleotide site] tree:
     ((((bce22_:0.0234575,((bmyc2_:0,bmyc3_:0):0,bpmyx:0.00100627):0.00360306):0.0140294,bce98_:0.0239114):0.0564612,((((bce14_:0.00101548,bceKB_:0):0.00408006,(bce7_:0.00302561,bmyc1_:0.0792714):0.00631422):0.00203966,bce26_:0):0.00101741,(bce29_:0,bce30_:0):0.0103157):0.00101599):0,((bce17_:0,bce20_:0):0.00101283,bce19_:0):0.00816713,((bce9_:0.0752134,((bce28_:0.0765304,bce6_:0.00304787):0,(((((BCAH1_:0,bce13_:0):0.0751849,bce1_:0):0,BCE:0):0,((bce10_:0,bth1_:0):0,G924_:0.00100726):0):0.0739955,(BCZK:0,(((((((((BCAH8_:0.0801247,bce16_:0):0,bth10_:0):0.0113101,bth7_:0):0,GBAA_:0.0780614):0,bth12_:0):0,bth9_:0.0768777):0,BT:0):0,BACI:0.0750832):0,((BALH:0,bce4_:0):0,bce21_:0):0):0):0.00101972):0):0):0.00101831,((((((((BC:0,bce15_:0):0.00920091,bce12_:0):0,(bce18_:0,bce27_:0.00920002):0):0,bth5_:0.0856582):0,(BCB4_:0.0010064,bce11_:0):0.0785229):0,((((bce23_:0.00921931,bce25_:0):0,bth6_:0):0,bce5_:0.00921738):0.00101108,bce2_:0):0):0,bth11_:0):0.00304104,((bth13_:0.0743164,(bth14_:0,bth4_:0.00817586):0.00225975):0.00487121,((bth2_:0,bth8_:0):0,bth3_:0):0.00611502):0):0):0)

Check messages.log details of this run.


I think now it is working.
But I would like to have some help in interpreting the dN/dS

to get the dN/dS for each branch do I need to take value from E[Non-syn subs/nucleotide site] tree: and E[Syn subs/nucleotide site] tree: from each branch and take the ratio?

for example: for the branch bce22_, dN/dS =0.0234575/0.0742224 = 0.3159

Is that right?

Title: Re: How to get detailed Results from command line Ver
Post by Sergei on Feb 24th, 2011 at 3:14pm
Hi SAN,

You could get dN/dS out of the tree, but if that's what you want as the primary output, the easier way to go is to simply add some code in the wrapper to do it. For example, if you use any of the MG class models with local parameter options, you can add the following loop in the wrapper instead of loading post_dNdS.bf

[code]
fprintf (stdout,"\nLeaf branches: \n\n");

T = TipCount (givenTree);

for (h = 0; h<T; h=h+1)
{
     bName = TipName (givenTree,h);
     fprintf (stdout,bName);
     for (v = Abs(bName); v<maxLength; v=v+1)
     {
           fprintf (stdout," ");
     }
     fprintf (stdout,": ",Eval("givenTree."+bName+".nonSynRate/givenTree."+bName+".synRate"),"\n");
}

T = BranchCount (givenTree);

if (T>0)
{
     fprintf (stdout,"\nInternal Branches:\n\n");
     for (h = 0; h<T; h=h+1)
     {
           bName = BranchName (givenTree,h);
           fprintf (stdout,bName);
           for (v = Abs(bName); v<maxLength; v=v+1)
           {
                 fprintf (stdout," ");
           }
             fprintf (stdout,": ",Eval("givenTree."+bName+".nonSynRate/givenTree."+bName+".synRate"),"\n");
     }
}

[/code]

Sergei

Title: Re: How to get detailed Results from command line Ver
Post by SAN on Mar 14th, 2011 at 1:59pm
Hi,
I was able to get the dN/dS of each branch. Thanks.

I have a small question. Is there a way to get a confidence interval (S.E) of those estimated values using HYPHY.

Title: Re: How to get detailed Results from command line Ver
Post by Sergei on Mar 15th, 2011 at 3:16pm
Hi San,

Do you want the approximate confidence intervals on omega (i.e. dN/dS)? Are you using the code snippet from my previous post to get dN/dS values?

Sergei

Title: Re: How to get detailed Results from command line Ver
Post by SAN on Mar 16th, 2011 at 7:06am
Yes, I want the approximate confidence interval on omega for each branch. Yes, I am using the code snippet from your previous post to get the dN/dS values.

Thanks

Title: Re: How to get detailed Results from command line Ver
Post by Sergei on Mar 16th, 2011 at 7:47am
Hi SAN,

Use the code below. Note that if a branch has dS = 0, this code won't be able to compute dN/dS and it's CI; it can be done with a different model parameterization.

Sergei


Code (]

RequireVersion ("2.0020110101");

fprintf (stdout, "Process this file:");
fscanf (stdin, "String", filePath);

LoadFunctionLibrary ("AnalyzeCodonData",
                               {"00": "Universal",
                                "01": filePath,
                                "02": "MG94CUSTOM",
                                "03": "Local",
                                "04": "012345",
                                "05": "Y"});

fprintf (stdout,"\nLeaf branches: \n\n");

T = TipCount (givenTree);

for (h = 0; h<T; h=h+1)
{
     bName = TipName (givenTree,h);
     fprintf (stdout,bName);
     for (v = Abs(bName); v<maxLength; v=v+1)
     {
           fprintf (stdout," ");
     }
     branchOmegaValues = getCIAndValue (bName);
     fprintf (stdout,": ",branchOmegaValues[1):

,"\t[",branchOmegaValues[0], "-", branchOmegaValues[2],"]\n");
}

T = BranchCount (givenTree);

if (T>0)
{
     fprintf (stdout,"\nInternal Branches:\n\n");
     for (h = 0; h<T; h=h+1)
     {
           bName = BranchName (givenTree,h);
           fprintf (stdout,bName);
           for (v = Abs(bName); v<maxLength; v=v+1)
           {
                 fprintf (stdout," ");
           }
           branchOmegaValues = getCIAndValue (bName);
           fprintf (stdout,": ",branchOmegaValues[1],"\t[",branchOmegaValues[0], ",", branchOmegaValues[2],"]\n");
     }
}

function getCIAndValue (branchName)
{
     if (Eval ("givenTree."+branchName+".synRate") > 0)
     {
           saveNS = Eval ("givenTree."+branchName+".nonSynRate");
           global myOmega = Eval ("givenTree."+branchName+".nonSynRate/givenTree."+branchName+".synRate");
           ExecuteCommands ("givenTree."+branchName+".nonSynRate:=myOmega*givenTree."+branchName+".synRate");
           COVARIANCE_PRECISION = 0.95;
           COVARIANCE_PARAMETER = "myOmega";
           CovarianceMatrix (cmx, lf);
           ExecuteCommands ("givenTree."+branchName+".nonSynRate=saveNS");
           return cmx;
     }
     else
     {
           return {{0,5000,10000}};
     }
}

Title: Re: How to get detailed Results from command line Ver
Post by SAN on Nov 10th, 2011 at 11:49am
I came across a situation where certain branches has dS=0 so it is not able to compute dN/dS and its CI. As you mentioned, this could be solved using a different model parameterization. Would you please suggest that different model parameterization?

HyPhy message board » Powered by YaBB 2.5.2!
YaBB Forum Software © 2000-2024. All Rights Reserved.