How to fetch the sequence Ids for recombinant sequences
RTuteja
YaBB Newbies
*
Offline


Curious HyPhy user

Posts: 13
Ireland
Gender: female
How to fetch the sequence Ids for recombinant sequences
Nov 22nd, 2012 at 3:54am
 
Hi Sergei,

I have GARD output for a few gene families, but it only reports whether each breakpoint (with its position) is significant. Is there any way to get the IDs of the sequences in which recombination was detected? I want to discard those sequences before proceeding to selection analysis.

Maybe I am missing something in the documentation. Can you please help?

Thanks,
Reetu
RTuteja
Re: How to fetch the sequence Ids for recombinant sequences
Reply #1 - Nov 23rd, 2012 at 6:28am
 
I tried to identify the sequences involved in recombination events using GENECONV. In a few cases, however, GENECONV does not identify any recombinant fragments even though GARD reports a breakpoint. Can you please suggest what I should do in such cases?

I also have a few gene families with 506 sequences. Does GARD work well for such a large dataset?
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: How to fetch the sequence Ids for recombinant sequences
Reply #2 - Nov 28th, 2012 at 4:40am
 
Hi Reetu,

GARD is not designed to identify which sequences are recombinant. As we say in [link not preserved] (page 8; this also applies to your large dataset question):

Quote:
GARD is geared towards mapping the breakpoints and detecting segments of the alignment
which can be adequately described by a single tree topology; as we discuss in the next section, this
is necessary to allow more complex analyses to handle alignments with recombinant sequences.
Because GARD allows arbitrary tree changes across breakpoints, there are certain cases when it
will not perform well; for example, short alignments with many sequences. GARD requires about
approximately 4 times as many sites as sequences to run; otherwise the number of samples (sites)
is less than the number of model parameters (branch lengths and rates). Another case occurs when
only a few sequences in a large alignment have undergone recombination, in which instance the
cost of adding many new branch length parameters for one or more trees will likely outweigh the
likelihood improvement due to several local subtree rearrangements.
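
For readers who want to check the "about 4 times as many sites as sequences" rule of thumb from the quote against their own alignment, here is a minimal Python sketch. It is not part of HyPhy or GARD; the function names (`alignment_shape`, `min_sites_for_gard`, `enough_sites`) are made up for illustration, the input is assumed to be aligned FASTA text, and "sites" is assumed to mean alignment columns as supplied. The factor of 4 is approximate, so treat the result as a sanity check rather than a hard limit.

```python
# Rule of thumb from the quoted passage: roughly 4 sites per sequence.
# Hypothetical helper names; this is not a HyPhy/GARD API.
SITES_PER_SEQUENCE = 4


def alignment_shape(fasta_text):
    """Return (number of sequences, alignment length) for aligned FASTA text."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    if len(set(lengths)) > 1:
        raise ValueError("sequences have different lengths; is this an alignment?")
    return len(lengths), (lengths[0] if lengths else 0)


def min_sites_for_gard(n_sequences, ratio=SITES_PER_SEQUENCE):
    """Approximate minimum number of sites suggested by the rule of thumb."""
    return ratio * n_sequences


def enough_sites(n_sequences, n_sites, ratio=SITES_PER_SEQUENCE):
    """True if the alignment has at least ratio * n_sequences sites."""
    return n_sites >= min_sites_for_gard(n_sequences, ratio)
```

For the 506-sequence family mentioned in the question, `min_sites_for_gard(506)` gives 2024, so an alignment much shorter than about 2000 sites would fall into the "short alignments with many sequences" case described in the quote.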


Generally, we recommend that you don't exclude recombinant sequences from downstream analyses,
but rather use those analyses which can handle recombinant sequences (see the link above for examples).

The fact that geneconv doesn't find any recombinant sequences isn't that unusual: it could reflect, for example,
ancient recombination events.

Sergei
Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego