Hi
Can someone explain to me why the PB1-F2 protein of influenza has a dN/dS ratio between 9 and 13, while all other influenza genes have a ratio between 0.03 and 0.4?
Some publications claim that the result is biased for PB1-F2 because this protein is derived from a much larger protein via a +1 reading frame. I believe the protein is only 90 aa. It should also be very conserved. How is it then possible to get such a high ratio?
Should one believe that this small protein is under extreme positive selection pressure, or what? I don't quite believe it.
How would you explain it, so that I understand?
I also wonder about predicting individual positively selected sites in spliced genes. The NS and M genes of influenza are spliced into NS1/NS2 and M1/M2, respectively. The M2 protein has a higher dN/dS ratio than other proteins that we know are more prone to positive selection. Several sites in M1 are also identified as positively selected, although this protein should not be affected by selection pressure. Some explain the high dN/dS ratio of spliced genes as a bias, because dS is suppressed in overlapping regions. I don't understand this. If you have extracted the M1 protein's sequence from an alignment and run a selection analysis on it, how can it then be biased by overlapping regions?
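To make the overlap argument concrete, here is a minimal Python sketch of how I understand it. It uses a made-up 12-nucleotide sequence (not influenza data), the standard genetic code, and hypothetical helper names (`effect`, `count_synonymous`). The idea it shows: a nucleotide change that is silent in one reading frame can still change the amino acid in the overlapping frame, so it may be removed by selection on the other protein, and fewer synonymous changes would be observed than a single-gene model expects.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): AMINO[i]
               for i, c in enumerate(product(BASES, repeat=3))}


def effect(seq, pos, new_base, frame):
    """Classify a single-nucleotide change at 0-based `pos` in reading
    frame `frame` (0 or 1). Returns 'synonymous', 'nonsynonymous', or
    None if `pos` is not inside a complete codon of that frame."""
    if pos < frame:
        return None
    start = frame + ((pos - frame) // 3) * 3
    if start + 3 > len(seq):
        return None
    codon = seq[start:start + 3]
    offset = pos - start
    mutated = codon[:offset] + new_base + codon[offset + 1:]
    same = CODON_TABLE[codon] == CODON_TABLE[mutated]
    return "synonymous" if same else "nonsynonymous"


def count_synonymous(seq):
    """Count all single-nucleotide changes that are synonymous in
    frame 0, and how many of those are also synonymous in frame +1."""
    syn_frame0 = 0
    syn_both = 0
    for pos, ref in enumerate(seq):
        for alt in BASES:
            if alt == ref:
                continue
            if effect(seq, pos, alt, 0) == "synonymous":
                syn_frame0 += 1
                if effect(seq, pos, alt, 1) == "synonymous":
                    syn_both += 1
    return syn_frame0, syn_both


# Toy sequence (an assumption for illustration only).
# Frame 0: ATG GCT GAA ACC    Frame +1: TGG CTG AAA
SEQ = "ATGGCTGAAACC"

# T -> C at position 5: GCT -> GCC in frame 0, CTG -> CCG in frame +1.
print(effect(SEQ, 5, "C", 0))
print(effect(SEQ, 5, "C", 1))
print(count_synonymous(SEQ))
```

If this picture is right, extracting only the M1 codons from the alignment would not remove the effect, because the substitutions present in the alignment were already filtered by selection acting on the other reading frame as well.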
When I look at the HA protein as a whole, I find no positively selected sites, but when I look at the HA1 region only, I find 25 sites. Why does SLAC suddenly identify these sites? Is it only because the alignment is shorter than the original one? If I shorten it further, will I then find more sites?
How can I then trust that the sites I find are truly under positive selection?
Please help
This is very hard!
KBR