What is maximum alignment size allowed? (Read 2949 times)
Brian T. Foley, PhD
Guest


What is maximum alignment size allowed?
Feb 16th, 2005 at 1:34pm
 
I attempted to upload an alignment of a few hundred sequences, each about 7,000 bases long.  I got a "data file too large" error, but it did not tell me what the maximum file size is.

BrianFoley
YaBB Newbies
Maximum is 150 species, or 10K codons
Reply #1 - Feb 16th, 2005 at 4:09pm
 
Later, with a smaller data set, I did get a message stating that the maximum number of sequences is 150, or a maximum of 10,000 codons, on the computers this is currently (Feb 16 2005) running on.

I also found out that internal stop codons are not allowed.
Sergei
YaBB Administrator
Re: What is maximum alignment size allowed?
Reply #2 - Feb 16th, 2005 at 7:50pm
 
Dear Brian,

There are two limitations on file size in datamonkey:

1). Raw file size (4MB). This is basically to prevent hackers from trying to crash the server; these oversized files never get processed.
2). For valid alignments, the current size limits are 150 sequences and/or 10000 codons. This is mostly to keep a single job from running longer than 3-4 minutes. You can always run a larger analysis locally in HyPhy (it is an option in Standard Analyses); data sets larger than 1000 sequences can cause underflow problems (we are working on fixing that, but it is rather involved).

Internal stop codons are indeed disallowed. The analyses are mostly targeted towards a single-gene setting, in which there should be no premature stop codons.
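
The limits above can be checked locally before uploading. The following is only a rough sketch and not Datamonkey's actual validation code. It assumes FASTA input, the universal genetic code (TAA, TAG, TGA as stops), that "4MB" means 4 * 1024 * 1024 bytes, that the codon limit applies to the alignment length, and that the last codon of a sequence does not count as internal. The names `check_alignment`, `parse_fasta` and `internal_stop_codons` are made up for this illustration.

```python
# Assumed values taken from the limits quoted in this thread (Feb 2005).
MAX_FILE_BYTES = 4 * 1024 * 1024   # assumption: "4MB" read as 4 MiB
MAX_SEQUENCES = 150
MAX_CODONS = 10_000                # assumption: applies to alignment length
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: universal genetic code


def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def internal_stop_codons(seq):
    """Return 0-based codon indices of in-frame stops, ignoring the last codon."""
    seq = seq.upper().replace("U", "T")
    n_codons = len(seq) // 3
    return [i for i in range(n_codons - 1) if seq[3 * i:3 * i + 3] in STOP_CODONS]


def check_alignment(text):
    """Return a list of human-readable problems; an empty list means no problem found."""
    problems = []
    if len(text.encode("utf-8")) > MAX_FILE_BYTES:
        problems.append("raw file is larger than 4 MB")
    records = parse_fasta(text)
    if len(records) > MAX_SEQUENCES:
        problems.append(f"{len(records)} sequences (limit {MAX_SEQUENCES})")
    n_codons = max((len(seq) for _, seq in records), default=0) // 3
    if n_codons > MAX_CODONS:
        problems.append(f"{n_codons} codons (limit {MAX_CODONS})")
    for name, seq in records:
        stops = internal_stop_codons(seq)
        if stops:
            problems.append(f"{name}: internal stop codon(s) at codon index {stops}")
    return problems
```

Calling `check_alignment(open("my_alignment.fas").read())` (file name hypothetical) returns a list of messages, so an empty list suggests the file is within the limits quoted here.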

HyPhy has a tool ('Standard Analyses: Data file tools: CleanStopCodons.bf') which reads in an alignment and replaces all stop codons with gaps. This might be a reasonable way to preprocess alignments with premature stop codons (e.g., due to sequencing errors).

If there is enough interest, I could implement a user-selectable option for the datamonkey upload module to

(a). Replace all detected stop codons with gaps
(b). Replace stop codons with site consensus (good for sequencing error correction).
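
To make options (a) and (b) concrete, here is a small sketch of both behaviours. It is not the CleanStopCodons.bf code and not an existing Datamonkey feature. It assumes in-frame, equal-length aligned sequences, the universal genetic code, and that "site consensus" means the most common non-stop, gap-free codon at that site (falling back to gaps if there is none). The function names are invented for this example, and the output is upper-cased.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: universal genetic code
GAP_CODON = "---"


def split_codons(seq):
    """Split an in-frame sequence into upper-case codons."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def site_consensus(column):
    """Most common codon at a site, ignoring stop codons and gapped codons."""
    counts = Counter(c for c in column if c not in STOP_CODONS and "-" not in c)
    return counts.most_common(1)[0][0] if counts else GAP_CODON


def replace_stop_codons(seqs, mode="gap"):
    """Replace every stop codon with gaps (mode='gap') or the site consensus."""
    if mode not in ("gap", "consensus"):
        raise ValueError("mode must be 'gap' or 'consensus'")
    rows = [split_codons(s) for s in seqs]
    if len({len(r) for r in rows}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    n_sites = len(rows[0]) if rows else 0
    for site in range(n_sites):
        column = [r[site] for r in rows]
        if not any(c in STOP_CODONS for c in column):
            continue
        fill = GAP_CODON if mode == "gap" else site_consensus(column)
        for r in rows:
            if r[site] in STOP_CODONS:
                r[site] = fill
    return ["".join(r) for r in rows]
```

For example, `replace_stop_codons(["ATGTAAGGG", "ATGAAAGGG", "ATGAAAGGG"], mode="consensus")` fills the stop codon in the first sequence with the codon AAA seen in the other two. Note that this sketch also replaces terminal stop codons, matching the "replace all stop codons" wording above.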

HTH,
Sergei
Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego