HyPhy message board
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
Methodology Questions >> How to >> substitutions pr site and positive selection http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1170175339 Message started by KBR on Jan 30th, 2007 at 8:42am
Title: substitutions pr site and positive selection Post by KBR on Jan 30th, 2007 at 8:42am
Hi
I really need help. I do not know much about these methods, and it is difficult for me to choose one over another. I have a dataset of 204 influenza HA sequences from 1999 up to today. I have used SLAC and FEL in HyPhy to predict positively selected sites. When they give different results, which should I believe? Should I publish the results from both methods? Is a threshold of 0.1 OK, or is it too strict? Is there a good (most correct) way to estimate the global dN/dS ratio? Until now I have used the value that is reported when I run the positive selection analysis. Which option should I use if I would like to know the transition/transversion ratio?
Also, I would like to estimate substitutions/site/year. Is this what is called the evolutionary rate? Does it imply a molecular clock? Which program can I use to calculate this?
Finally, I have a dataset of 19 sequences, also from 1999-2006, of some other influenza genes (internal genes). Is there any point in testing for positively selected sites on such a small dataset?
I hope you have time to help me.
Kindly,
Karoline
Title: Re: substitutions pr site and positive selection Post by KBR on Feb 8th, 2007 at 1:09am
Thanks Sergei
I have now tried it out and it works fine except for one gene. I have 15 sequences, 300 bases long. I have run SLAC and FEL with a 0.1 threshold and REL with a Bayes factor threshold of 50. I get a result for REL that I do not trust: SLAC and FEL give no positively selected sites, while REL gives 4. Four sites in this gene is VERY unlikely. Is a Bayes factor of 50 too liberal? It worked fine for the other genes. Any suggestions? This gene is spliced, but I have done the analyses on each of the two coding frames in the gene.
I see that many publications report estimated nucleotide changes/site/year on influenza datasets. First: I have a suspicion that they do not calculate it based on likelihood or a molecular clock, but simply divide the overall number of transitions and transversions by the number of nucleotides, for each year. What do you think about that? That is not an acceptable way to do it for a publication, right? Second: I can see in my dataset that the evolution of the genes is not constant, but is highly influenced by reassortments and the introduction of new strains from elsewhere. Based on this, it cannot be right to assume a molecular clock to calculate substitutions/site/year, right? Would I be better off not mentioning anything about substitution rates in my publication, when I do not believe that there has been a constant rate of evolution?
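For readers unsure what the crude calculation suspected above amounts to, here is a minimal Python sketch. It is an illustration only, not a recommended estimator: the function name `naive_rate` and the two toy sequences are hypothetical, and the approach ignores multiple hits at a site, shared ancestry, and rate variation, which is exactly why the post questions it.

```python
def naive_rate(seq_early, seq_late, year_early, year_late):
    """Raw differences per site per year between two aligned sequences.

    Hypothetical helper: no tree, no substitution model, no correction
    for multiple substitutions at the same site.
    """
    if len(seq_early) != len(seq_late):
        raise ValueError("sequences must be aligned to the same length")
    if year_late <= year_early:
        raise ValueError("year_late must be after year_early")
    # Count positions where the two aligned sequences differ.
    diffs = sum(a != b for a, b in zip(seq_early, seq_late))
    return diffs / len(seq_early) / (year_late - year_early)


# Toy data (made up): 10 aligned bases sampled in 1999 and 2006.
rate = naive_rate("ATGGCATTAC", "ATGACATTGC", 1999, 2006)
print(rate)
```

Because the result depends entirely on which pair of sequences is compared, reassortment or the introduction of a strain from elsewhere can change it arbitrarily, which matches the concern raised in the post.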
Title: Re: substitutions pr site and positive selection Post by Sergei on Feb 8th, 2007 at 4:37pm
Dear Karoline,
For your problem case, check to see if SLAC and FEL are borderline significant (e.g. p = 0.15) for the same sites. REL tends to be the most liberal of the tests, but it rarely produces a large number of false positives for > 10 sequences, unless there is something strange going on (e.g. very low sequence divergence). What do you mean by two reading frames? Is it a dual coding gene (the tail of PB2)?
In terms of substitution rates, I think you are quite right not to report estimates which are based on a molecular clock (i.e. constant rates over time), because they may have little to do with real substitution rates when the assumption breaks down to a great extent.
Cheers,
Sergei
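The cross-check suggested above can be sketched in a few lines of Python. This is a rough sketch under stated assumptions: the per-site values are made up, the function name `borderline_support` is hypothetical, and the numbers are assumed to have been copied by hand from the SLAC, FEL, and REL result tables rather than read through any HyPhy API.

```python
def borderline_support(rel_sites, slac_p, fel_p, relaxed=0.15):
    """For each site flagged by REL, report whether the SLAC and FEL
    p-values fall at or below a relaxed cutoff (0.15, as in the example above)."""
    report = {}
    for site in sorted(rel_sites):
        p_slac = slac_p.get(site)  # None if the site is missing from the table
        p_fel = fel_p.get(site)
        report[site] = {
            "slac_p": p_slac,
            "fel_p": p_fel,
            "slac_borderline": p_slac is not None and p_slac <= relaxed,
            "fel_borderline": p_fel is not None and p_fel <= relaxed,
        }
    return report


# Hypothetical values: four sites flagged by REL, with p-values for
# positive selection at the same sites from SLAC and FEL.
rel_sites = {12, 45, 78, 90}
slac_p = {12: 0.14, 45: 0.62, 78: 0.11, 90: 0.33}
fel_p = {12: 0.12, 45: 0.48, 78: 0.20, 90: 0.09}

report = borderline_support(rel_sites, slac_p, fel_p)
for site, row in report.items():
    print(site, row)
```

Sites that REL flags but that are nowhere near borderline in either SLAC or FEL (site 45 in the toy data) would be the ones to treat with the most suspicion.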