Site specific rate estimation (SiteRates.bf) (Read 7020 times)
Travis_Clark
YaBB Newbies
Posts: 3
New Haven, CT
Site specific rate estimation (SiteRates.bf)
Oct 26th, 2006 at 11:51am
 
Hello,

I need to estimate site-specific rates for my alignments, as DNArates does.  I chose HyPhy because it is really easy to use and will also estimate site rates for protein alignments.

I compared my output for a nucleotide file with that of DNArates and noticed a few differences.  The branch lengths of my input tree affect the rate estimates in DNArates: if I halve all my branch lengths, the rate estimates change.  With HyPhy, if I halve my branch lengths, shift the decimal point, or even input a tree with zero branch lengths, I get the same rate estimates.  I was curious whether HyPhy was using the input tree at all, so I changed the topology and got slightly different rates.

I would appreciate knowing how HYPHY is estimating the substitution rates in reference to the input tree.  I am using the rate estimates to profile phylogenetic informativeness of genes and would like to use and recommend HYPHY instead of DNArates.

Thank you!
 
Sergei
YaBB Administrator
Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: Site specific rate estimation (SiteRates.bf)
Reply #1 - Oct 26th, 2006 at 11:59am
 
Greetings,

For this analysis, HyPhy does not use the branch lengths supplied with the tree at all. What HyPhy reports is rate * tree length, because evolutionary rates and times are confounded in the usual way. Branch lengths are estimated from the entire alignment before the site-by-site estimation of rates is carried out.

If you want to decouple rates and times (e.g. you have some information to date the tree), then it's a simple matter of dividing the output by the [known] tree length.
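
As a rough illustration of that division, here is a minimal Python sketch. The function name `decouple_rates` and all the numbers are made up for this example; it is not part of HyPhy or SiteRates.bf, and it assumes the known tree length comes from an independent dating of the tree.

```python
def decouple_rates(scaled_rates, known_tree_length):
    """Divide each reported value (rate * tree length) by a known tree length.

    Hypothetical helper for illustration; not a HyPhy function.
    """
    if known_tree_length <= 0:
        raise ValueError("known_tree_length must be positive")
    return [value / known_tree_length for value in scaled_rates]


# Made-up per-site values, standing in for the rate * tree length output.
scaled = [0.5, 1.0, 2.5]
# Made-up tree length in time units, e.g. summed over a dated tree.
tree_length = 50.0

rates_per_unit_time = decouple_rates(scaled, tree_length)
print(rates_per_unit_time)
```

The result is only as good as the tree length that is divided out, so the dating has to come from outside this analysis.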

Hope this helps,
Sergei

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
 
Travis_Clark
Re: Site specific rate estimation (SiteRates.bf)
Reply #2 - Oct 26th, 2006 at 12:15pm
 
WOW, Thank you for the ultra-quick reply.

I was inputting a chronogram (made with r8s) as my input tree.  If I understand your reply correctly, the HyPhy output is the number of times a site changed on the tree, not the actual rate?  I will divide the output by the tree length, as you suggest!

Thank you again!
 
Sergei
Re: Site specific rate estimation (SiteRates.bf)
Reply #3 - Oct 26th, 2006 at 12:43pm
 
Dear Travis,

Essentially, this should work. Rates are measured in expected substitutions/site/unit time, not quite in the number of times the site changed, but the same in spirit.

Cheers,
Sergei
 
Travis_Clark
Ex Member


Re: Site specific rate estimation (SiteRates.bf)
Reply #4 - Jan 4th, 2007 at 6:49am
 
Hi Sergei,

I have come back to my original question because I am not confident that I am analyzing my data correctly.

In the analysis, HyPhy outputs a Newick-formatted tree with branch lengths. Is this the tree length that HyPhy estimates in order to do the analysis?  If so, is this the proper tree length to divide the data by to get the rate?

I believe I was in error when I previously divided by the tree length of my final phylogeny for the rate estimates of different genes.  I was finding odd results when analyzing the same gene as nucleotide versus protein sequences.

Best regards,

Travis
 
Sergei
Re: Site specific rate estimation (SiteRates.bf)
Reply #5 - Jan 4th, 2007 at 11:03am
 
Dear Travis,

You should NOT use the tree output by HyPhy to determine the total tree length, because that length is the product of (mean rate) * (evolutionary time). Normalizing your site-specific rate estimate by this length will therefore result in
Code:
(site rate)/((mean rate) * (evolutionary time))

Site rates, as output by HyPhy, are relative to the 'mean' (not necessarily average, I use this term loosely) rate for the entire alignment, and can only be compared to each other. Because all sites in the same gene (most likely) evolved for the same duration of time, this comparison is meaningful.

If you have some other means of estimating tree lengths (e.g. based on molecular clock), then you can divide out the time.
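
To see why the within-gene comparison survives any common scaling, here is a small Python sketch. The helper `relative_to_mean` and the numbers are made up for illustration, and the arithmetic average is used only as a stand-in for the loosely defined 'mean' rate above.

```python
def relative_to_mean(values):
    """Express each site value relative to the average over all sites.

    Hypothetical helper for illustration; not a HyPhy function.
    """
    mean = sum(values) / len(values)
    return [value / mean for value in values]


# Made-up per-site values for one gene.
site_values = [0.4, 0.8, 1.6]

# Dividing every site by the same constant (any tree length, any time span)
# leaves the relative rates within the gene essentially unchanged.
for common_scale in (1.0, 0.2, 37.5):
    rescaled = [value / common_scale for value in site_values]
    print(common_scale, relative_to_mean(rescaled))
```

The same invariance is why these values cannot be compared across genes without an independent estimate of time: the common factor differs from one alignment to the next and cancels only within an alignment.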

What are some of the oddities you are seeing?

Hope this helps.
Cheers,
Sergei
 
Nicola
YaBB Newbies
Curious HyPhy user

Posts: 13
Gender: male
Re: Site specific rate estimation (SiteRates.bf)
Reply #6 - Nov 30th, 2013 at 11:10am
 
Dear Sergei,

I have the opposite problem from Travis.
I want to use HyPhy for dating, assuming that I specify my transition matrix Q with the correct mutation rates.
Now, Q is a bit complicated (PoMo).
Nevertheless, I understand from this thread that I can estimate the time t defining the probability matrix e^{tQ} of a branch, using the HyPhy output branch length T:
t = T / (-\sum_i \pi_i q_{ii})
with \pi_i the equilibrium frequency of state i and q_{ii} the i-th diagonal entry of Q.

Do you think this is correct?
Best wishes,
Nicola De Maio
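
For readers who want to check the arithmetic of the formula in this post, here is a minimal Python sketch. It only evaluates t = T / (-\sum_i \pi_i q_{ii}) as written; whether the branch length reported by HyPhy matches this definition for a given model is exactly the open question above. The two-state matrix, the frequencies and the function names are made up for illustration and have nothing to do with PoMo.

```python
def expected_rate(Q, pi):
    """Expected substitutions per unit time: -sum_i pi_i * q_ii."""
    return -sum(p * Q[i][i] for i, p in enumerate(pi))


def branch_time(T, Q, pi):
    """Convert a branch length T (expected substitutions/site) to time t."""
    mu = expected_rate(Q, pi)
    if mu <= 0:
        raise ValueError("expected rate must be positive")
    return T / mu


# Toy 2-state rate matrix (rows sum to zero); made-up values.
Q = [[-0.3, 0.3],
     [0.1, -0.1]]
# Its equilibrium frequencies (they satisfy pi * Q = 0).
pi = [0.25, 0.75]

T = 0.3  # made-up branch length in expected substitutions per site
print(expected_rate(Q, pi))   # close to 0.15
print(branch_time(T, Q, pi))  # close to 2.0 time units
```

The sketch assumes Q is expressed in absolute (unscaled) rate units; if the matrix is normalized before fitting, the conversion factor changes accordingly.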