Ancestral States of a Protein family?
jpnoel1964
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 4
Ancestral States of a Protein family?
Dec 1st, 2009 at 10:11am
 
Hi. I am totally new to this software package which upon first view seems quite impressive.

I am quite interested in reconstructing ancestral protein sequences of a set of orthologous secondary metabolic enzymes of varying degrees of identity/similarity. Many fall within the 80-95% identity range, but because these are enzymes of secondary metabolism, even small changes give rise to substantial changes in functional output.

We are very interested in examining how distal changes, accumulated as these genes/enzymes evolve, affect the product spectrum of these enzymes, which provide protection against pathogens in the plant's local ecosystem. Since the sequences in question exhibit high levels of identity/similarity, we believe that ancestral reconstructions in these cases will be much more realistic than previous attempts using sequences of much wider divergence.

See:

O'Maille, P.E., Malone, A., Dellas, N., Hess, B.A., Smentek, L., Sheehan, I., Greenhagen, B.T., Chappell, J., Manning, G. and Noel, J.P. (2008) Quantitative exploration of the catalytic landscape separating divergent plant sesquiterpene synthases. Nat. Chem. Biol. 4: 617-623. Epub 2008 Sep 7. PMCID: PMC2664519.

Can HyPhy provide the most probable nucleotide/codon at each node in a properly reconstructed tree?
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: Ancestral States of a Protein family?
Reply #1 - Dec 1st, 2009 at 10:27am
 
Hi jpnoel1964,

Indeed, HyPhy can do that and more. You can reconstruct ancestral sequences (supplying a protein alignment and [optionally] a tree) using our online server, Datamonkey (ASR is the analysis you want). Please let me know if you have any issues running that analysis. My suggestions for running this via Datamonkey would be:

1). Upload your alignment (make sure you specify that it's a protein alignment)
2). Screen for recombination/gene conversion (unless it's biologically improbable)
3). Run the automatic model selection procedure to select the best fitting evolutionary model
4). Run the ASR analysis

HyPhy will return 3 different ancestral reconstructions:

1). Joint: the joint assignment of all ancestral characters that maximizes the conditional probability of observing the extant sequences, over all possible joint assignments
2). Marginal: the character assignment at a given node that maximizes the probability of the data conditional on the character at that node being fixed (the rest of the tree is 'summed' out)
3). Sampled: ancestral characters sampled from the joint distribution in (1); useful for 'noisy' data and to assess reliability.

Be careful with the sites/nodes where the three approaches disagree - there are usually multiple probable reconstructions there.
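
To make the difference between the three reconstructions concrete, here is a toy sketch for a single site. It is not HyPhy's implementation: the tree, the two-letter alphabet, the branch change probabilities and all function names are made up for illustration, and the internal states are simply enumerated by brute force (which only works for tiny trees).

```python
import itertools
import random

STATES = "AB"  # toy two-letter alphabet, not real amino acids

# Hypothetical rooted tree: root -> (inner, leaf3); inner -> (leaf1, leaf2).
# child: (parent, probability that the character changes along the branch)
BRANCHES = {
    "inner": ("root", 0.3),
    "leaf1": ("inner", 0.1),
    "leaf2": ("inner", 0.1),
    "leaf3": ("root", 0.2),
}
INTERNAL = ["root", "inner"]


def branch_prob(parent_state, child_state, p_change):
    """Symmetric two-state transition probability for one branch."""
    return 1.0 - p_change if parent_state == child_state else p_change


def joint_table(leaves):
    """Probability of the tips and each joint assignment of internal states."""
    table = {}
    for combo in itertools.product(STATES, repeat=len(INTERNAL)):
        states = dict(zip(INTERNAL, combo))
        states.update(leaves)
        prob = 1.0 / len(STATES)  # uniform prior on the root state
        for child, (parent, p_change) in BRANCHES.items():
            prob *= branch_prob(states[parent], states[child], p_change)
        table[combo] = prob
    return table


def reconstruct(leaves, seed=0):
    """Return joint, marginal and sampled reconstructions for one site."""
    table = joint_table(leaves)
    total = sum(table.values())
    # Joint: the single best assignment of all internal nodes at once.
    joint = max(table, key=table.get)
    # Marginal: best state per node after summing over the other nodes.
    marginal = []
    for i in range(len(INTERNAL)):
        post = {s: sum(p for combo, p in table.items() if combo[i] == s) / total
                for s in STATES}
        marginal.append(max(post, key=post.get))
    # Sampled: one draw from the joint distribution over assignments.
    rng = random.Random(seed)
    sampled = rng.choices(list(table), weights=list(table.values()))[0]
    return {
        "joint": dict(zip(INTERNAL, joint)),
        "marginal": dict(zip(INTERNAL, marginal)),
        "sampled": dict(zip(INTERNAL, sampled)),
    }


if __name__ == "__main__":
    print(reconstruct({"leaf1": "A", "leaf2": "A", "leaf3": "B"}))
```

Repeating the sampled draw with different seeds and counting how often each state appears at a node gives a rough feel for how many plausible reconstructions there are at that site.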

Sergei
Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
 
jpnoel1964
Re: Ancestral States of a Protein family?
Reply #2 - Dec 1st, 2009 at 11:21am
 
Sergei,
Thank you so much! I will read more and give it a go. Second question if you don't mind - would it be more reliable to use the gene sequence or the protein sequence? In this case, the gene sequences are quite similar as well and I was thinking they would provide a more "realistic" model of the evolution of the actual coding sequences.
Joe
 
Sergei
Re: Ancestral States of a Protein family?
Reply #3 - Dec 1st, 2009 at 11:23am
 
Hi Joe,

You can use both codon and protein sequences and compare the results as a further hedge against reconstruction error. Protein models are actually better for residue reconstruction (in general) because codon models (currently implemented in Datamonkey and almost every other package) do not account for different residue exchangeabilities.
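
One way to do that comparison offline, as a rough sketch: translate the codon-based ancestral sequence for a node and list the sites where it differs from the protein-based one. This assumes you have already exported both reconstructions for the same node as plain strings covering the same alignment columns; the function names are hypothetical, the standard genetic code is assumed, and this is not how Datamonkey itself reports results.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(codon_seq):
    """Translate an in-frame nucleotide sequence; unknown codons become 'X'."""
    codon_seq = codon_seq.upper()
    if len(codon_seq) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(
        CODON_TABLE.get(codon_seq[i:i + 3], "X")
        for i in range(0, len(codon_seq), 3)
    )


def residue_mismatches(codon_ancestor, protein_ancestor):
    """List (site, codon-based residue, protein-based residue) disagreements."""
    translated = translate(codon_ancestor)
    if len(translated) != len(protein_ancestor):
        raise ValueError("reconstructions cover different numbers of sites")
    return [
        (site, c, p)
        for site, (c, p) in enumerate(zip(translated, protein_ancestor), start=1)
        if c != p
    ]


if __name__ == "__main__":
    # Made-up three-codon example, not real data.
    print(residue_mismatches("ATGGCTTGG", "MSW"))
```

Sites are numbered from 1. Any site that shows up in the mismatch list is a candidate for the kind of ambiguous reconstruction worth checking by hand.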

Sergei
 
jpnoel1964
Re: Ancestral States of a Protein family?
Reply #4 - Dec 1st, 2009 at 1:26pm
 
Pardon my naive notions here! LOL

After I run the automatic model selection procedure to select the best fitting evolutionary model, how do I best interpret the results to then go back to run the ASR?
 
Sergei
Re: Ancestral States of a Protein family?
Reply #5 - Dec 1st, 2009 at 6:37pm
 
Hi Joe,

Could you paste in the URL with model selection results on your data? Model interpretation will be easier on a concrete example.

Sergei
 
jpnoel1964
Re: Ancestral States of a Protein family?
Reply #6 - Dec 1st, 2009 at 7:18pm
 
datamonkey.org/spool/upload.452120129811344.1_pmodel.php
 
Sergei
Re: Ancestral States of a Protein family?
Reply #7 - Dec 2nd, 2009 at 11:56am
 
Dear Joe,

The results of the model selection suggest JTT+F (alignment frequencies) as the best fitting model, based on the default criterion (c-AIC). If you click on the [Information/Other analyses] link at the top of the model selection page, and select to run (or re-run) ASR, this model will automatically be filled in for you on the analysis setup page. You can also select other models manually using pull-down menus.
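
For readers wondering how such a ranking is computed, here is a minimal sketch. It assumes c-AIC means the usual small-sample corrected AIC, AICc = 2k - 2 lnL + 2k(k+1)/(n-k-1), with n taken to be the sample size (for example, the number of alignment columns). The function names, model labels and numbers are hypothetical and are not taken from the results page above.

```python
def aicc(log_likelihood, n_params, sample_size):
    """Small-sample corrected AIC; lower values indicate a better trade-off."""
    if sample_size - n_params - 1 <= 0:
        raise ValueError("sample_size must exceed n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    correction = (2 * n_params * (n_params + 1)) / (sample_size - n_params - 1)
    return aic + correction


def best_model(fits, sample_size):
    """Pick the model with the lowest AICc from {name: (log_likelihood, k)}."""
    scores = {
        name: aicc(log_l, k, sample_size)
        for name, (log_l, k) in fits.items()
    }
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Made-up fits: (log-likelihood, number of estimated parameters).
    fits = {"model_a": (-100.0, 2), "model_b": (-99.5, 21)}
    print(best_model(fits, sample_size=50))
```

The point of the penalty terms is that a model with many extra parameters (such as empirically estimated frequencies) has to improve the likelihood enough to pay for them, which matters most for short alignments.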

Sergei