Viral positive selection detection
Jan
Sep 26th, 2009 at 4:24pm
 
Hi all,

I'm working on characterising the metagenome of a virus, and want to identify areas of the genome under positive evolutionary pressure. It looks like the HyPhy package would do the trick, but unfortunately I'm not well versed in genomic statistics or the programs associated with them, so I was hoping someone could point me in the right direction.

Some background: The virus has low overall NA diversity (~2% over ~6 kb), with its most variable gene (~1.5 kb) being 97% conserved. The virus does have subgroups, and these subgroups are the origin of most of the variation (that is, the polymorphisms are conserved in a subtype-specific manner). This leads me to conclude that those sites would be under positive selection, correct?
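
For reference, a figure like "~2% diversity" can be reproduced as the mean pairwise proportion of differing sites. The snippet below is only a minimal sketch under my own assumptions: the sequences are already aligned to equal length, columns with a gap or N in either sequence are skipped, and the function names and toy sequences are made up for illustration.

```python
from itertools import combinations

def p_distance(a, b):
    """Fraction of differing sites, skipping columns with a gap or N in either sequence."""
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            diffs += 1
    return diffs / compared if compared else 0.0

def mean_pairwise_diversity(seqs):
    """Average p-distance over all unordered pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)

# Toy alignment (hypothetical data): two identical sequences, one with a single change.
toy = ["ATGGCTAAAGGT", "ATGGCTAAAGGT", "ATGGCAAAAGGT"]
print(mean_pairwise_diversity(toy))
```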

As a bit of an exercise, I plugged the alignment of my most variable gene into the REL app on the Datamonkey.org site, and got some results that puzzled me. The analysis showed only sites under negative selection, and the output was sensitive to the number of sequences I put in, even though identical sequences are apparently pruned out prior to analysis. I think I am missing something fundamental here...
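
To check how many distinct sequences actually go into the analysis, identical rows of the alignment can be collapsed locally first. This is a minimal sketch, not what Datamonkey does internally (I don't know its exact pruning rule); it assumes sequences count as identical only if the aligned strings match exactly after upper-casing, and the record names are hypothetical.

```python
def collapse_identical(records):
    """Group (name, aligned_sequence) records by exact sequence identity.

    Returns a list of (representative_name, sequence, copy_count),
    in order of first appearance.
    """
    groups = {}  # sequence -> [representative name, count]
    for name, seq in records:
        key = seq.upper()
        if key in groups:
            groups[key][1] += 1
        else:
            groups[key] = [name, 1]
    return [(rep, seq, count) for seq, (rep, count) in groups.items()]

# Hypothetical toy input: three samples, two of them identical.
records = [
    ("sample_01", "ATGGCTAAAGGT"),
    ("sample_02", "atggctaaaggt"),
    ("sample_03", "ATGGCAAAAGGT"),
]
unique = collapse_identical(records)
print(len(records), "samples ->", len(unique), "unique sequences")
```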

So,
I guess my questions are:
1) Given the low sequence diversity and small sequence size, what would be the best method for analysing selective pressure?
2) Would it be most accurate to analyse the whole genome, including non-coding and regulatory regions, or analyse each gene separately?

Thanks for any help!
Sergei
UCSD
Reply #1 - Sep 28th, 2009 at 11:46am
 
Hi Jan,

Generally speaking, the level of divergence in your alignment is probably too low (unless you have many, i.e. 50+ sequences) to detect positive selection at the level of a single site reliably (which is why REL gives you unstable results). Typically, the best you can do in these cases is to test for the action of selection on all (or part) of the gene (i.e. are there any sites under selection), in place of specific sites (i.e. which sites are under selection).


What I would recommend you do instead is try to look for region specific selection, e.g. using PARRIS on Datamonkey to test for evidence of selection on the most variable gene.
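
As a rough illustration of what a gene-wide test asks, the general idea is a likelihood ratio test: fit a null model that does not allow a class of positively selected sites, fit an alternative that does, and compare the two log-likelihoods. The sketch below is generic and is not PARRIS's own code; the chi-square reference distribution with 2 degrees of freedom is an assumption made for illustration, and the log-likelihood values are made up. Datamonkey reports its own p-value, which is the one to use.

```python
import math

def lrt_chi2_df2(lnl_null, lnl_alt):
    """Likelihood ratio test against a chi-square with 2 degrees of freedom.

    For 2 degrees of freedom the chi-square survival function is exactly
    exp(-x / 2), so no statistics library is needed.
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)  # guard against tiny negative values
    p_value = math.exp(-stat / 2.0)
    return stat, p_value

# Hypothetical log-likelihoods from fitting the two models to one gene.
stat, p = lrt_chi2_df2(lnl_null=-2450.30, lnl_alt=-2449.10)
print(f"LRT statistic = {stat:.2f}, p = {p:.3f}")
```

A small improvement in log-likelihood, as in this made-up example, gives a large p-value: the alternative model does not fit well enough to claim that any sites in the gene are under positive selection.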

If you haven't had a chance to look at it yet, you may find this document useful: [link visible to registered members only].

Sergei
Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
Jan
Reply #2 - Jan 1st, 1970 at 4:35pm
 
Hi Sergei,

I have run all of my genes through PARRIS (out of 64 samples I get ~35 unique sequences), including the most variable one (the capsid protein gene), and found no evidence of positive selection at p<0.1. This was also the case when I ran the capsid gene in ~100 nt windows, or when I used only a 198 nt region of the capsid gene encoding an antigenic loop where one subtype had 2 AA substitutions.
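
For anyone wanting to repeat the windowed runs, this is roughly how an alignment can be cut into windows that stay in frame. It is a minimal sketch under assumptions of my own: the alignment is in frame starting at position 0, windows are non-overlapping blocks of 33 codons (99 nt, the closest in-frame size to ~100 nt), and the function name and toy sequences are hypothetical.

```python
def codon_windows(aligned_seqs, window_codons=33, step_codons=33):
    """Split an in-frame codon alignment into windows that respect codon boundaries.

    Returns a list of (start_nt, end_nt, [window sequence for each input sequence]).
    Trailing nucleotides that do not form a complete codon are dropped.
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    n_codons = length // 3
    windows = []
    for start in range(0, n_codons, step_codons):
        end = min(start + window_codons, n_codons)
        windows.append((start * 3, end * 3, [s[start * 3:end * 3] for s in aligned_seqs]))
    return windows

# Toy alignment (hypothetical data): two sequences of 100 codons each.
toy_alignment = ["ATG" * 100, "ATG" * 100]
windows = codon_windows(toy_alignment)
for start_nt, end_nt, _ in windows:
    print(start_nt, end_nt)
```

Note that the last window can be much shorter than the others (here a single codon), so it may be worth merging it into the previous one before testing.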

So I'm at a bit of a loss to explain the data. The best I can think of is that:

From a visual examination, I would expect the conserved, subtype-specific AA changes in the capsid protein, especially those in the antigenic loops, to be under positive (diversifying) selective pressure. However, PARRIS detected no regions under positive selection. Could it be that these two subtypes have diverged enough that they exist as separate populations, and are thus no longer under positive selective pressure in relation to each other?

Thanks again for your help!