SBP and GARD
Ana
YaBB Newbies
*
Offline


Curious HyPhy user

Posts: 2
SBP and GARD
Apr 30th, 2012 at 11:03am
 
Hi,

A couple of naive questions…

I’m trying to run SBP and/or GARD on a huge dataset of two paralogous genes (34 sequences, 4905 bp), and because of its size the analysis is always stopped before convergence by the CPU time limit per job. I don’t want to reduce the number of sequences, so is it valid to cut the sequences in half and run each half independently?

My other question: when performing positive selection analyses on each gene, if I run GARD first, will the subsequent REL/SLAC/FEL/MEME output already be corrected for recombination? I am assuming it will be. I also think most of the breakpoints I obtain result from rate variation rather than recombination. The only way to check this is to analyze the trees inferred for the fragments defined by the breakpoints, right?

I hope this makes sense…
Thank you!
Cheers
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: SBP and GARD
Reply #1 - Apr 30th, 2012 at 1:56pm
 
Hi Ana,

Regarding the large GARD analysis, there are several things you can do.

1). If you send me the alignment, I could run it through GARD on the cluster without a time limit.
2). You could use the preliminary GARD results (I assume they show some evidence of recombination) to partition the alignment by sites, i.e. if GARD finds one breakpoint, you can independently run the section to the left of it and then the one to the right of it.
3). You could randomly split your data set by sequences (e.g. into 1/2 or 1/4 of the sequences) and run all of the subsets to see if the results are consistent between splits. If they are not, the likely cause is that one or more sequences present in one part of the data set but not in the others are driving the recombination signal.
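
Options 2 and 3 above can be sketched in a few lines of Python. This is only an illustration, not part of HyPhy or Datamonkey: the function names (read_fasta, split_by_sites, split_by_sequences) are hypothetical, the input is assumed to be an aligned FASTA file, and moving each cut to a codon boundary is an assumption made here so that the fragments remain usable for codon-based selection analyses.

```python
import random


def read_fasta(text):
    """Parse FASTA text into an ordered list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def split_by_sites(records, breakpoints, codon_aligned=True):
    """Cut every sequence at the same alignment columns (option 2).

    breakpoints are 0-based column indices at which a new fragment starts.
    With codon_aligned=True each cut is moved down to a multiple of 3
    (an assumption of this sketch, to keep codons intact).
    """
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("all sequences must have the same aligned length")
    cuts = []
    for bp in sorted(breakpoints):
        if codon_aligned:
            bp -= bp % 3
        if 0 < bp < length and bp not in cuts:
            cuts.append(bp)
    edges = [0] + cuts + [length]
    return [
        [(name, seq[start:end]) for name, seq in records]
        for start, end in zip(edges, edges[1:])
    ]


def split_by_sequences(records, n_parts, seed=0):
    """Randomly assign sequences to n_parts subsets of near-equal size (option 3)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_parts] for i in range(n_parts)]
```

Each fragment or subset would then be written back to its own file and run through GARD separately. Fixing the seed makes a random split reproducible, so an inconsistent result can be traced to the sequences in that particular subset.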

In terms of correcting selection analyses, you should select 'GARD Trees' from the 'Use this tree set' option list. You can check whether recombination is affecting the results by comparing a single-partition analysis with the analysis run on the GARD-inferred set of trees. Take a look at the first exercise in [link available to registered members only] for an example.
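
One simple way to make that comparison, assuming you have already extracted the list of sites each analysis reports as selected, is a set comparison. The function name and the site numbers in the usage line are made up for illustration; they are not output from any real run.

```python
def compare_selected_sites(single_tree_sites, gard_tree_sites):
    """Group sites by which analysis reported them as selected."""
    single = set(single_tree_sites)
    gard = set(gard_tree_sites)
    return {
        "both": sorted(single & gard),
        "single_tree_only": sorted(single - gard),
        "gard_trees_only": sorted(gard - single),
    }


# Hypothetical site numbers, for illustration only.
summary = compare_selected_sites([12, 45, 101], [45, 101, 230])
```

Sites that appear only in the single-tree list are the ones to look at first, since they may reflect recombination rather than selection.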

Sergei

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
 
Ana
Re: SBP and GARD
Reply #2 - May 3rd, 2012 at 9:32am
 
Hi Sergei,

Thank you so much for your reply.

Regarding the GARD analysis of the large data set, I followed your suggestion of randomly splitting the sequences, but the results are not consistent between analyses, and the runs still don't finish.
So, if you don't mind, I would like to try your first suggestion and send you the alignment. Please just let me know.

Thank you so much!
Best

ana
 
Sergei
Re: SBP and GARD
Reply #3 - May 3rd, 2012 at 9:43am
 
Hi Ana,

Sure, send me the alignment. spond at ucsd dot edu

Sergei