HyPhy message board
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
Methodology Questions >> How to >> Individual substitutions on branches?
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1157919486

Message started by Sarah on Sep 10th, 2006 at 1:18pm

Title: Individual substitutions on branches?
Post by Sarah on Sep 10th, 2006 at 1:18pm
Is there any way for HyPhy to list the inferred amino acid substitutions on tree branches? I could export trees to PAUP and rerun the ML analysis there, but it would be nice to stick with one program.

Thanks,
Sarah

Title: Re: Individual substitutions on branches?
Post by artpoon on Sep 11th, 2006 at 5:56pm
Dear Sarah,

As a matter of fact, I need to generate a batch file to do just this thing for my own research.  Theoretically, one could use ReconstructAncestors(lf) to infer the sequences at internal nodes and traverse the tree to assign substitutions to branches, but that would be assuming no more than one substitution per branch.  I'm planning to implement a recently published method that allows for more than one substitution.

After optimizing your likelihood function with the HyPhy GUI, you can select Analysis > Results > Ancestral Sequences to save the reconstructed sequences to a flat-formatted file.  In the batch language, you can use:

DataSet ancseq = ReconstructAncestors(lf);

where lf identifies the likelihood function that you just optimized.  I'll need to look over some notes to figure out how to write a batch file to do the next step, i.e. assign substitutions to branches given the intervening nodes.  I'll post something up here as soon as I do.
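
A minimal sketch of the "traverse the tree and compare each node with its parent" step, written in Python rather than the HyPhy batch language. It assumes the reconstructed ancestral sequences and the tip sequences have already been exported and read into a dictionary, and that the tree is available as a child-to-parent mapping. The function name, node names, and data layout are hypothetical. As noted above, comparing the two ends of a branch only recovers the net change per site, so it cannot see multiple substitutions at one site on the same branch.

```python
# Sketch only: node names, data layout, and function name are hypothetical.

def branch_substitutions(parent_of, seqs, unit=3):
    """Return {child_node: [(site, parent_state, child_state), ...]}.

    parent_of maps each non-root node to its parent; seqs maps every node
    (tips and reconstructed ancestors) to an aligned sequence. unit=3
    compares codons, unit=1 compares single characters. Sites are 1-based.
    """
    changes = {}
    for child, parent in parent_of.items():
        p, c = seqs[parent], seqs[child]
        if len(p) != len(c):
            raise ValueError("sequences must be aligned to the same length")
        diffs = []
        for start in range(0, len(c), unit):
            before, after = p[start:start + unit], c[start:start + unit]
            if before != after:
                diffs.append((start // unit + 1, before, after))
        changes[child] = diffs
    return changes


# Toy tree: root -> (anc1 -> (A, B), C)
parent_of = {"anc1": "root", "C": "root", "A": "anc1", "B": "anc1"}
seqs = {
    "root": "ATGAAACCC",
    "anc1": "ATGAAGCCC",
    "A": "ATGAAGCCC",
    "B": "ATGAAGCCA",
    "C": "ATGAAACCC",
}
for branch, diffs in branch_substitutions(parent_of, seqs).items():
    print(branch, diffs)
```

In the toy data, the branch leading to anc1 carries an AAA to AAG change at codon 2 and the branch leading to B carries a CCC to CCA change at codon 3; the other branches are unchanged.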

As far as I know, however, there isn't an existing batch file to do what I think you're asking for exactly (i.e. the strictly correct way), not yet.  If it's okay to assume single substitutions, however, then it's straight-forward (says the post-doc who's going to spend some time now figuring out exactly how to write this into a HYPHY BF).
- Art.

Title: Re: Individual substitutions on branches?
Post by artpoon on Sep 11th, 2006 at 6:53pm
Dear Sarah,

I poked around a little more and realized that there are some batch files in the UserAddins directory that may be of some use to you.  You might want to try CodonMutationToTreeMapper.bf, which will take your likelihood function (it has to be called "lf") and generate a PostScript file depicting the inferred tree with the branch or branches highlighted with codon substitutions.

Sorry if you've tried this out already =)  I'm going through the batch file to see how I can adapt it.
- Art.

Title: Re: Individual substitutions on branches?
Post by Sergei on Sep 13th, 2006 at 9:00am
Dear Sarah,

As Art pointed out, there are some existing scripts that can report, graphically, the inferred substitutions for a given site. If you give me an idea of what kind of output you want, I can modify them quite easily.

Cheers,
Sergei

Title: Re: Individual substitutions on branches?
Post by Sarah on Sep 14th, 2006 at 6:43pm
Thanks, Art & Sergei.

I had missed CodonMutationToTreeMapper.bf. It looks good.

I have an embarrassing question. Working from the GUI, once I build a likelihood and optimize it under some model, how do I save the likelihood function? I thought I had saved it using File > Save > lf (as data file) in the console after I had optimized my MG94-REV_3x4_DualRV_GDD model. I put copies in both the UserAddins and general HyPhy directories. CodonMutation...bf gives me an initialization error. Am I really saving the likelihood function (= param + seq + tree)? I can see it as an object in Object Explorer but cannot save from there.

I've been rereading the HyPhy manual and GUI examples and am making slow progress.

Thanks for your help.

Sarah

Title: Re: Individual substitutions on branches?
Post by artpoon on Sep 15th, 2006 at 1:11pm
Dear Sarah,

To export your likelihood function to a file from the HyPhy GUI, you need to select the menu option Analysis > Results > Save Results, which will bring up a "Likelihood Function Display" dialog window.  Select the last option "Export Analysis" and hit OK, which will bring up a file directory window in which you can specify a filename and save.

For importing your likelihood function into CodonMutationToTreeMapper.bf, you need to ensure that your likelihood function identifier is "lf".  If your likelihood function takes some other name (you can check this via Object Inspector), you will need to open the file that you've exported your LF to, and find the line that looks like this (it should be near the bottom):

LikelihoodFunction yourLF = (yourDataFilter,yourTree);

and change the identifier (i.e. "yourLF") to "lf".  Then open the exported LF file as a batch file.  Now "lf" should appear in your Object Inspector.  Run CodonMutationToTreeMapper.bf .
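
The manual edit described above could also be scripted. The following is a rough Python sketch, not part of HyPhy: it finds the identifier declared on the `LikelihoodFunction ... = (...)` line of an exported file and renames it to "lf". The function name is hypothetical. It renames every whole-word occurrence of the old identifier, on the assumption that other lines in the file may refer to it; this goes slightly beyond the one-line change described above, and it would also touch the name inside comments or string literals.

```python
import re


def rename_likelihood_function(script_text, new_name="lf"):
    """Rename the identifier declared by `LikelihoodFunction <name> = (...)`."""
    match = re.search(r"\bLikelihoodFunction\s+(\w+)\s*=", script_text)
    if match is None:
        raise ValueError("no LikelihoodFunction declaration found")
    old_name = match.group(1)
    if old_name == new_name:
        return script_text
    # Whole-word replacement only, so longer names such as yourLF_tree
    # are left untouched.
    return re.sub(r"\b%s\b" % re.escape(old_name), new_name, script_text)


exported = (
    "/* ... earlier lines of the exported file ... */\n"
    "LikelihoodFunction yourLF = (yourDataFilter,yourTree);\n"
)
print(rename_likelihood_function(exported))
```

To apply it to a real file, read the exported file into a string, pass it through the function, and write the result to a new file so that the original export is preserved.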

I'm working on writing documentation for CodonMutationToTreeMapper.bf, and will try to get it to accept any LF identifier.

- Art.



Title: Re: Individual substitutions on branches?
Post by Sergei on Sep 15th, 2006 at 2:10pm
Dear Art,

Here's a way to select a likelihood function identifier:


Code:
tIndex = -1;
avLFCount = Rows ("LikelihoodFunction");
if (avLFCount>1)
{
     choices = {avLFCount,2};
     for (k=0; k<avLFCount;k=k+1)
     {
           GetString (tName, LikelihoodFunction, k);
           choices[k][0] = tName;
           choices[k][1] = "Function "+tName;
     }
     
     ChoiceList (tIndex,"Simulate from this likelihood function:",1,SKIP_NONE,choices);
}
else
{
   tIndex = avLFCount - 1;
}

if (tIndex>=0)
{
      GetString (tName, LikelihoodFunction, tIndex);
}
else
{
/* no lf's defined */
return 0;
}

/* now tName has the ID of the LF */



Cheers,
Sergei

Title: Re: Individual substitutions on branches?
Post by artpoon on Sep 15th, 2006 at 2:55pm
Thanks, Sergei!  I had gotten stuck on that part. :-/
- Art.

Title: Re: Individual substitutions on branches?
Post by Sarah on Sep 19th, 2006 at 1:33pm
Thanks. I'm still missing something, though. :-[ I can now save, modify, and open the lf just fine. When I try opening CodonMutationToTreeMapper.bf, I get:


Quote:
Likelihood Functionlf has not been initialized
Current BL Command:Reconstruct Ancestors into ancestralSeqs from the likelihood function lf Current Task has been terminated. Would you like to see the remaining error messages, if there are any?

Clicking 'yes':


Quote:
Problem occurred in line:setupMapToTree(0)
Current BL Command:setupMapToTree(0) Current Task has been terminated....

On my second attempt, I opened the Object Explorer to view the lf, thinking that might somehow initialize it. I also tried 'recalculate.' After trying to open CodonMutation...bf, I got:


Quote:
DataSet/DataSetFilter filteredData has not been initialized
Current BL Command:Harvest Frequencies into matrix observedCEFV from DataSet filteredData with unit size = 3 with atom size = 3 with position specific flag = 0 Partition: Current Task has been terminated. Would you like to see the remaining error messages, if there are any?

Clicking 'yes,' I get the same second message as before.

Am I missing something obvious?

For this analysis I'm working with HyPhy on Windows.

Sarah

Title: Re: Individual substitutions on branches?
Post by Sarah on Sep 19th, 2006 at 2:05pm
I'm also unable to open the saved lf file on my intel mac. I've tried playing with the extensions (.bf, nothing) and have also checked to confirm that the file opens fine in Windows. The error I get is


Quote:
[Dead end] An error occured:
Node names should begin with a letter, a number, or an underscore....


Sarah

Update: I restarted my computer. Now HyPhy on my Intel iMac can't open any sequence files (including the bundled example files) without crashing. It crashes when I try to load a file under 'New Analysis' and when I use Open > Data file.

Update 2: It will open p51.nex if I try to run a basic analysis.

Title: Re: Individual substitutions on branches?
Post by Sergei on Sep 19th, 2006 at 4:46pm
Dear Sarah,
Have you tried selecting "Show All Files" from the topmost drop-down menu in the file open dialog? This usually allows you to select any file to open.

Sorry for the crashes; I am at a loss why this would happen (Universal Binary Builds work fine on my G5 and Intel iMac).
Could you perhaps paste in the crash log? You can open Applications->Utilities->Console program, open the Log drawer, look under ~/Library/Logs/ then CrashReporter then HYPHY.crash.log and paste in whatever the system wrote out when the program crashed.

Thanks!
Sergei

P.S. Could you also e-mail me the lf fit file which was giving you error messages?

Title: Re: Individual substitutions on branches?
Post by Sarah on Sep 21st, 2006 at 3:29pm
I was thinking that the error message might come from the way Windows handles carriage returns. I didn't modify the exported likelihood function in any way (except to rename the function itself as Art showed) after saving it in Windows and opening it on my Mac.

The interesting news now is that after building and optimizing a likelihood function in my newly reinstalled HyPhy (on my Mac), I am unable to save results. Analysis > Results is gray. I think I might try uninstalling and reinstalling and working with sample files only to pinpoint the problems.

Sarah


Title: Re: Individual substitutions on branches?
Post by Sergei on Sep 21st, 2006 at 4:24pm
Dear Sarah,

Carriage returns should not matter; HyPhy should automatically handle all 3 types (UNIX/Mac/Windows).

That the 'Results' menu is grayed out is odd - it usually happens only if HyPhy can't find the TemplateBatchFiles directory.

Cheers,
Sergei

Title: Re: Individual substitutions on branches?
Post by Sarah on Sep 21st, 2006 at 5:12pm
How can I check if/how HyPhy recognizes the Template Batch Files directory? It looks like it's there. This is the install from today (universal binary uploaded 9/14). I can run a standard analysis (AnalyzeCodonData.bf) with no problem. I run into problems when I do the following:

Open data file
Create partition
Define partition type as codon
Select tree from file (tree looks good in window)
Define substitution model type
Set parameters to local
Analysis > Build function
Analysis > Optimize

The optimization appears good (messages log and console report no problems); I can see a tree and LF. Analysis > Results remains gray. I repeated this procedure with three different data sets and encountered the same problem.

Maybe there's something wrong with my computer... I'm sorry for all these troubles... thought they might be useful for the record.

Sarah

Title: Re: Individual substitutions on branches?
Post by Sergei on Sep 21st, 2006 at 5:33pm
Dear Sarah,

The Analysis->Results menu is usually enabled after you run one of the standard (non-GUI) analyses. For your setting (running things via the interface), please use the little gears button in the bottom right of the console window, and choose 'ExportLikelihoodFunction' from the list.

Sorry for the confusion.

Cheers,
Sergei

Title: Still trying
Post by Sarah on Nov 15th, 2006 at 10:59am
I'm returning to this task after a two-month hiatus. HyPhy is no longer crashing, but I also can't get CodonMutationToTreeMapper.bf to work using the GUI on my Mac.

Here's what I do:

1. I open a data set, create a partition, select a tree and a model, and set parameters to local. I then build a likelihood function. I save the likelihood function using the gears button, as Sergei suggested. A window pops up asking me if I want to include an optimization step; I say yes. (Should I optimize now instead?)

2. I then open the file to which I saved the likelihood function and change the function's name as Art described to "lf". (I haven't included the code Sergei wrote.)

3. I then select File > Open Batch File and choose CodonMutationToTreeMapper.bf. The GUI terminal window says it's reconstructing ancestors, prints 10410.4 two times on the screen, and a "Dead End" window appears, saying,

"An error occured:
DataSet/DataSetFilter filteredData has not been initialized
Current BL Command:Harvest Frequencies into matrix observedCEFV from DataSet filteredData with unit size = 3 with atom size = 3 with position specific flag = 0
Partition:
Current operation/job has been terminated."

Another "Dead End" window appears, complaining about setupMapToTree(0).

Is there a reference for this bf short of the code itself? Should I scrap the GUI and try command line on Linux?

Thanks so much for your help and patience.

Sarah

Title: Re: Still trying
Post by Sergei on Nov 15th, 2006 at 11:29am
Dear Sarah,

CodonMutationToTreeMapper.bf is really quite unfriendly - I wrote it to make figures for one of our papers. It expects things (likelihood functions, trees, filters) to be named a certain way, or it fails.

A simple way to get it to work, though (using the GUI), is to run a standard analysis (e.g. AnalyzeCodonData.bf), as it will produce appropriately named objects. When your analysis is done, you can save the LF if you want (for later), and use the gears button to call CodonMutationToTreeMapper.bf. It will write a PostScript file mapping all inferred substitutions for a given site. You can view PS files in Preview on a Mac, or ghostscript (ghostview) on Linux/Windows.

Cheers,
Sergei

P.S. I'll make the mapping utility more user friendly, if I ever find the time.

Title: Re: Individual substitutions on branches?
Post by Sarah on Nov 16th, 2006 at 10:57am
Great! It works! I ran AnalyzeCodonData.bf as you suggested and then chose CodonMutationToTreeMapper.bf from the gears button.

I'm thinking I might try to modify the bf so that the tree shows tip labels. I have a hard time reading on which branches the substitutions occur (the branches are also very dense relative to the size of the letters). Is there a way to print substitutions from a range of codons on the tree, rather than the substitutions associated with one residue at a time?

I think it'd be good practice to try to modify the code myself. My goal is to show the (dis)appearance of glycosylation sites on the tree.
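
As a rough illustration of that goal, here is a Python sketch (outside HyPhy) that reports which glycosylation sites are gained or lost along one branch, given the translated parent and child sequences. It assumes N-linked glycosylation, using the usual N-X-[S/T] sequon in which X is any residue except proline, and it assumes the two protein sequences are aligned without gaps so that positions are comparable. The function names and the toy sequences are hypothetical.

```python
import re

# N-linked sequon: N, any residue except P, then S or T. The lookahead lets
# overlapping sequons (e.g. "NNTS") be found.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(protein):
    """Return the 1-based positions of the N in each N-X-[S/T] sequon."""
    return {m.start() + 1 for m in SEQUON.finditer(protein)}


def sequon_changes(parent_protein, child_protein):
    """Return (gained, lost) sequon positions along one branch."""
    before = sequon_positions(parent_protein)
    after = sequon_positions(child_protein)
    return sorted(after - before), sorted(before - after)


# Toy example: the sequon at position 2 is lost (N -> D) and a new one
# appears at position 6 (P -> A removes the proline block).
print(sequon_changes("MNGTKNPSA", "MDGTKNASA"))
```

Running this for every parent and child pair on the tree would give a per-branch list of sites appearing and disappearing, which could then be used to decide which codons to map.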

Thanks again,
Sarah

Title: Re: Individual substitutions on branches?
Post by Sergei on Nov 16th, 2006 at 11:09am
Dear Sarah,

The font scaling issue should be resolved by removing lines 452 and 453 in the file (which fix the font size at 11) and letting the drawing routines decide what size is appropriate.

If you want to label the leaves, you can edit line 307 to read


Code:
nodeSpec ["TREE_OUTPUT_BRANCH_LABEL"] = "__FONT_SIZE__ 2 idiv\n__FONT_SIZE__ 3 idiv\nneg\nrmoveto\n("+cd3+") show";


PostScript based tree drawing is pretty ugly to get your head around, I am afraid...

Cheers,
Sergei
