File Ext for Positive Selection (Read 2771 times)
Becky
Guest


File Ext for Positive Selection
Jul 31st, 2005 at 12:29pm
 
I have attempted to run a SLAC analysis and run into problems. I think they may be due to the type of input files I use.

First, the data set is large: 369 taxa, 1617 characters.
I use a FASTA file for the data and a Newick tree file from MEGA. The analysis completes phase 4, but I then get this message when I use HYPHY-P3 on Windows.

"The number of columns in the data matrix must match the dimension of the header matrix in call to Open Window (Chart_Window)
Current BL Command:Open window of type CHARTWINDOW with parameters from {{Data Rates} {labels}, {fullSites}, {Contrast Bars}, } Current task has been terminated would you like to see the remaining error messages if there are any?"

I assumed this had something to do with the files I used as inputs, so I tried NEXUS files, but I always get this error message:

"The number of tree tips in givenTree (369) is not equal to the number of species in the data filter associated with the tree (0).
Current BL Command:Construct the following likelihood function:nucData; givenTree; Current Task has been terminated. Would you like to see the remaining error messages, if there are any?"

Can you tell me the optimum type of input file format to use and which program to create this file in?
I have tried trees from MEGA and PAUP4b, and input files from MEGA and BioEdit.

I have used DataMonkey with success, but would like to increase the number of taxa in the analysis, hence my attempt at HyPhy.

Any suggestions you have would be helpful.
Thanks
Becky
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: File Ext for Positive Selection
Reply #1 - Jul 31st, 2005 at 10:06pm
 
Dear Becky,

I am not entirely sure what causes either of the problems you reported - the size of the data set should not matter.

1). The error report indicates that HyPhy finished the analysis but was unable to open a graphical chart display at the end of the run. The QuickSelectionDetection.bf analysis has grown to contain many options, and they are quite confusing! Based on the error message, it would appear that you chose to run the "Rate Distribution" option of the analysis, which is still experimental and is not recommended for general use. The posting at [link not available] contains the list of options one needs to choose to replicate the SLAC analyses from the paper. Please try these settings and let me know if the problem persists.

2). For the NEXUS file, it would appear that HyPhy did not read the file properly. I'd be very thankful if you could send me (by e-mail) the file which generated this error, so that I can reproduce the error and try to determine what NEXUS formatting issues caused it.
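
While the NEXUS problem is being tracked down, a quick check outside HyPhy can confirm that the tree and the alignment name the same taxa, since the second error is about a mismatch between tree tips and species in the data. The following is only a minimal sketch and is not part of HyPhy: the function names are made up for illustration, and it assumes a simple Newick string with unquoted tip labels and no bracketed comments. HyPhy's own name handling may differ (for example in how it treats case or special characters).

```python
import re

def fasta_names(fasta_text):
    """Return sequence names from FASTA text (first token after each '>')."""
    names = []
    for line in fasta_text.splitlines():
        if line.startswith(">") and line[1:].strip():
            names.append(line[1:].split()[0])
    return names

def newick_tip_names(newick):
    """Return tip labels from a simple Newick string.

    Assumption: labels are unquoted and the tree has no [comments].
    A tip label is a name that directly follows '(' or ','; labels of
    internal nodes follow ')' and are therefore not matched.
    """
    return re.findall(r"[(,]\s*([^\s(),:;]+)", newick)

def compare_taxa(fasta_text, newick):
    """Return (names only in the FASTA file, names only in the tree)."""
    seqs = set(fasta_names(fasta_text))
    tips = set(newick_tip_names(newick))
    return sorted(seqs - tips), sorted(tips - seqs)

# Hypothetical toy input, not taken from the data set in this thread.
example_fasta = ">A\nACGT\n>B some description\nACGT\n>D\nACGT\n"
example_tree = "((A:0.1,B:0.2):0.05,C:0.3);"
only_in_fasta, only_in_tree = compare_taxa(example_fasta, example_tree)
print("only in FASTA:", only_in_fasta)
print("only in tree:", only_in_tree)
```

If both lists come back empty, the names agree and the problem is more likely in how the data file itself is being parsed.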

Best,
Sergei

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
Becky
Guest


Re: File Ext for Positive Selection
Reply #2 - Aug 1st, 2005 at 11:57am
 
Hi Sergei, thanks so much for the information. In the first attempt I had selected the "Rate Distribution" option. I went back and was able to successfully run a SLAC analysis using a FASTA file and a nwk file. In only 20 minutes too!!

For NEXUS files, I found that those produced in Clustal X are not compatible with nwk files from MEGA.
Also, MEGA NEXUS files must be interleaved in order to be compatible with MEGA nwk files in HyPhy. I will email you the Clustal X NEXUS file. I use this file as an input for PAUP4b, and would eventually like to use tree files I produce in PAUP4b for HyPhy analysis.

Thanks for your help
Becky
Sergei
Re: File Ext for Positive Selection
Reply #3 - Aug 3rd, 2005 at 12:59pm
 
Dear Becky,

I have identified some problems with my NEXUS reader which caused your file to be rejected.

Specifically, your file had the following NEXUS block:

Code:
BEGIN DATA;
dimensions ntax=189 nchar=1620;
format missing=?
symbols="ABCDEFGHIKLMNPQRSTUVWXYZ"
datatype=DNA interleave  gap= -;
 



HyPhy expected the symbols command to contain characters separated by spaces (e.g. "ABC..." should be "A B C..."). I looked at the NEXUS format specification and realized that this is not actually required, and that "ABC..." is acceptable. This has been fixed in today's build.

Also, HyPhy had a bug which caused 'gap= -' to be interpreted incorrectly because of the space after '='. This has also been fixed.

You can either download the new version, or modify the header to read:

Code:
BEGIN DATA;
dimensions ntax=189 nchar=1620;
format missing=?
datatype=DNA interleave  gap=-;
 



Note that the SYMBOLS command actually defines a new character alphabet, and on its own it would mean your data are not interpreted as DNA. In the original header, the DNA datatype will override the SYMBOLS command. I am pretty sure that SYMBOLS is inappropriate in this context and is output incorrectly. As written, the block implies that all listed characters (not just ACGT) are valid DISTINCT states, whereas in reality they should be defined as ambiguities. At any rate, I would advise you to remove the SYMBOLS statement altogether.
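
If you have many files with this kind of header, the manual edit above can be scripted. The following is only a rough sketch, not a HyPhy tool: the function name is made up, and it assumes the SYMBOLS value is enclosed in double quotes as in your file. It removes the SYMBOLS statement and any whitespace around '=' in the gap and missing settings, and leaves everything else alone.

```python
import re

def clean_format_command(header):
    """Remove SYMBOLS="..." and tidy 'gap= -' style assignments.

    Assumption: the SYMBOLS value is double-quoted, as in the header
    quoted in this thread. Nothing else in the text is changed.
    """
    # Drop symbols="..." together with the whitespace that precedes it.
    header = re.sub(r'\s*symbols\s*=\s*"[^"]*"', "", header,
                    flags=re.IGNORECASE)
    # Remove whitespace on either side of '=' for gap and missing.
    header = re.sub(r"\b(gap|missing)\s*=\s*", r"\1=", header,
                    flags=re.IGNORECASE)
    return header

original_header = (
    "BEGIN DATA;\n"
    "dimensions ntax=189 nchar=1620;\n"
    "format missing=?\n"
    'symbols="ABCDEFGHIKLMNPQRSTUVWXYZ"\n'
    "datatype=DNA interleave  gap= -;"
)
print(clean_format_command(original_header))
```

Applied to the original block, this should print the same four lines as the modified header given above. It is worth running it on a copy of the file first and checking the result by eye.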

Cheers,
Sergei