BSR sensitivity and specificity (Read 3973 times)
jlee337
YaBB Newbies
Curious HyPhy user

Posts: 8
BSR sensitivity and specificity
Sep 24th, 2012 at 3:58pm
 
I have a question that I can't seem to find an answer to in either the paper describing BSR or any of the tutorials (which are great, by the way!).

How do the Type I and Type II error rates vary with alignment length?

I ask because I am screening alignments for lineages under positive selection, but my sequences have a lot of recombination breakpoints, so I have divided them into rather short segments to run BSR (since I assume BSR is sensitive to recombination?).

I notice that the shorter fragments often give strange results, so I'm wondering if you could tell me about the error rates at various alignment lengths.

Thanks,

Justin
 
Sergei
YaBB Administrator
Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: BSR sensitivity and specificity
Reply #1 - Sep 25th, 2012 at 12:55pm
 
Hi Justin,

The only way to establish these values for a particular size/divergence data set is to perform simulations under these conditions.
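
As a rough illustration of what such a simulation study yields: once the test has been run on alignments simulated without selection (null) and with selection (alternative), the error rates are just rejection frequencies. The sketch below assumes the p-values have already been collected; the numbers and the names `null_p` and `alt_p` are made up for illustration and do not come from any BSR run.

```python
def rejection_rate(p_values, alpha=0.05):
    """Fraction of replicates in which the test rejects the null at level alpha."""
    if not p_values:
        raise ValueError("need at least one replicate")
    return sum(p <= alpha for p in p_values) / len(p_values)


# Hypothetical p-values, one per simulated alignment (illustrative numbers only).
null_p = [0.62, 0.04, 0.33, 0.91, 0.18, 0.47, 0.75, 0.29, 0.55, 0.08]
alt_p = [0.01, 0.20, 0.03, 0.002, 0.40, 0.04, 0.009, 0.12, 0.03, 0.06]

type_i = rejection_rate(null_p)  # false positives among null replicates
power = rejection_rate(alt_p)    # detections among replicates with selection
type_ii = 1.0 - power            # missed detections

print(f"Type I: {type_i:.2f}, power: {power:.2f}, Type II: {type_ii:.2f}")
```

Repeating this for each alignment length and divergence level of interest gives the error-rate curves asked about; in practice far more than ten replicates per condition would be needed for stable estimates.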

With short alignments there will be two issues:

1). Lack of statistical power to detect selection: for example, if 5% of 50 codons are under selection, then the method is very unlikely to find it based on the evolution of ~2.5 codons, whereas 5% of 1000 codons (~50 codons) is a lot more manageable.

2). Numerical issues: the standard BSR model with 3 rates per branch is going to be overparameterized for short alignments. Statistically, this is generally not a problem: you simply won't have the information to estimate those parameters accurately, and the likelihood ratio tests will just not be significant. However, the optimization algorithms can fail to converge if an overly complex model is fitted to a small dataset. This could manifest as false positives, for example, if the alternative model is mis-estimated. BSR has some built-in checks for that type of behavior, however.
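
The arithmetic in point 1 can be sketched as follows. This is only a back-of-the-envelope illustration, not part of BSR: it assumes each codon is independently under selection with probability 5% (a binomial model, which is an assumption introduced here), and the function names are hypothetical.

```python
from math import comb


def expected_selected(n_codons, fraction):
    """Expected number of codons under selection."""
    return n_codons * fraction


def prob_at_most(k, n_codons, fraction):
    """P(X <= k) for X ~ Binomial(n_codons, fraction)."""
    return sum(
        comb(n_codons, i) * fraction**i * (1 - fraction) ** (n_codons - i)
        for i in range(k + 1)
    )


# Compare a short and a long alignment at the same 5% selected fraction.
for n in (50, 1000):
    mean = expected_selected(n, 0.05)
    few = prob_at_most(2, n, 0.05)
    print(f"{n} codons: expect {mean:.1f} selected, P(at most 2 selected) = {few:.3g}")
```

Under this assumption a 50-codon alignment has a substantial chance of containing only two or fewer selected codons, while for 1000 codons that chance is negligible, which is the power gap described above.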

What types of strange results are you seeing?

Sergei

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
 
jlee337
Re: BSR sensitivity and specificity
Reply #2 - Oct 2nd, 2012 at 7:12am
 
Sergei,

Thanks for the detailed response.

I've attached a tree from one of the BSR runs with an alignment length of about 600 bp. The tree looks strange, and the results suggest that one sample is under selection:

Sample: EF455610
Pr {w=w+}: 0.03
w+: 10000.00
p-value: 0.001


Justin

 
jlee337
Re: BSR sensitivity and specificity
Reply #3 - Oct 2nd, 2012 at 7:13am
 
Here's a BSR tree from the neighboring non-recombinant fragment of about 500 bp for comparison...

Thanks for your time and help!

J
 
Sergei
Re: BSR sensitivity and specificity
Reply #4 - Oct 2nd, 2012 at 10:42am
 
Hi Justin,

The first tree is definitely fubared. Infinite branch lengths are a sign of convergence issues or alignment problems. Could you attach the underlying alignment and the tree so that I can try to diagnose what is going on?
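
For screening many fragments, a quick way to spot this symptom is to scan each fitted tree for implausibly long branches. The sketch below is a plain regex scan of a Newick string, not a HyPhy or Datamonkey feature; the cutoff of 10 substitutions per site is an arbitrary assumption, and the example tree is invented (only the label EF455610 comes from this thread).

```python
import re

# Matches "label:length" pairs; the label is empty for unnamed internal nodes.
BRANCH = re.compile(r"([^(),:;\s]*):([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)")


def long_branches(newick, threshold=10.0):
    """Return (label, length) pairs whose branch length exceeds threshold."""
    flagged = []
    for label, length in BRANCH.findall(newick):
        value = float(length)
        if value > threshold:
            flagged.append((label or "<internal>", value))
    return flagged


# Made-up tree for illustration; the branch lengths are invented.
tree = "((A:0.012,B:0.020):0.005,(EF455610:10000,C:0.031):0.008);"
print(long_branches(tree))
```

A fragment whose tree trips this check is a candidate for a convergence or alignment problem and is worth inspecting before its selection results are trusted.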

Sergei
 
jlee337
Re: BSR sensitivity and specificity
Reply #5 - Oct 2nd, 2012 at 12:30pm
 
Here's the alignment...

I used the NJ trees produced in DataMonkey when running BSR.

I appreciate you taking the time to look into this...

Justin