HyPhy message board
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
HYPHY Package >> HyPhy bugs >> File Ext for Positive Selection
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1122838199

Message started by Becky on Jul 31st, 2005 at 12:29pm

Title: File Ext for Positive Selection
Post by Becky on Jul 31st, 2005 at 12:29pm
I have attempted to run a SLAC analysis and run into problems. I think they may be due to the type of input files I use.

First, the data set is large: 369 taxa, 1617 characters.
I use a FASTA file for the data and a Newick tree file from MEGA. The analysis completes phase 4, but I then get this message when I use HYPHY-P3 on Windows:

"The number of columns in the data matrix must match the dimension of the header matrix in call to Open Window (Chart_Window)
Current BL Command:Open window of type CHARTWINDOW with parameters from [Data Rates} {labels}, {fullSites}, {Contrast Bars}, } Current task has been terminated would you like to see the remaining error messages if there are any?"

I assume this has something to do with the files I used as inputs, so I tried NEXUS files, but I always get this error message:

"The number of tree tips in givenTree (369) is not equal to the number of species in the data filter associated with the tree (0).
Current BL Command:Construct the following likelihood function:nucData; givenTree; Current Task has been terminated. Would you like to see the remaining error messages, if there are any?"

Can you tell me the optimum type of input file format to use and which program to create this file in?
I have tried trees from MEGA and PAUP4b, and input files from MEGA and BioEdit.

I have used DataMonkey with success, but would like to increase the number of taxa in the analysis, hence my attempt at HyPhy.

Any suggestions you have would be helpful.
Thanks
Becky

Title: Re: File Ext for Positive Selection
Post by Sergei on Jul 31st, 2005 at 10:06pm
Dear Becky,

I am not entirely sure what causes either of the problems you reported - the size of the data set should not matter.

1). The error report indicates that HyPhy finished the analysis but was unable to open a graphical chart display at the end of the run. The QuickSelectionDetection.bf analysis has grown to contain many options, and they are quite confusing! Based on the error message, it would appear that you had chosen to run the "Rate Distribution" option of the analysis, which is currently still experimental and is not recommended for general usage. An earlier posting on this board (the link is visible to registered members only) contains the list of options one needs to choose to replicate the SLAC analyses from the paper. Please try these settings and let me know if the problem persists.

2). For the NEXUS file, it would appear that HyPhy did not read the file properly. I'd be very thankful if you could send me (by e-mail) the file which generated this error, so that I can reproduce the error and try to determine what NEXUS formatting issues caused it.
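
Because the second error is a mismatch between tree tips and sequences, a quick pre-flight check outside HyPhy can help narrow down which file is at fault. The following is only a rough sketch and not part of HyPhy: the function names are made up for illustration, and it assumes unquoted, named taxon labels in the Newick string.

```python
import re


def count_fasta_records(fasta_text: str) -> int:
    """Count sequences in FASTA text by counting header lines ('>')."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


def count_newick_tips(newick: str) -> int:
    """Count tip labels in a Newick string.

    Assumption: labels are unquoted and non-empty. A tip label is a token
    that directly follows '(' or ','; internal node labels follow ')'
    and are therefore not counted.
    """
    return len(re.findall(r"[(,]\s*[^\s(),:;]+", newick))


if __name__ == "__main__":
    # Toy inputs, not the data set discussed in this thread.
    tree = "((A:0.1,B:0.2):0.3,C:0.4);"
    alignment = ">A\nACGT\n>B\nACGT\n>C\nACGT\n"
    print(count_newick_tips(tree), count_fasta_records(alignment))
```

If the two counts agree but HyPhy still reports 0 species, the alignment file itself is probably not being parsed, which points to a formatting problem rather than a tree problem.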

Best,
Sergei

Title: Re: File Ext for Positive Selection
Post by Becky on Aug 1st, 2005 at 11:57am
Hi Sergei, thanks so much for the information. In the first attempt I had selected the "Rate Distribution" option. I went back and was able to successfully run a SLAC analysis using a FASTA file and a nwk file. In only 20 minutes too!!

For NEXUS files, I found that those produced in Clustal X are not compatible with nwk files from MEGA.
Also, MEGA NEXUS files must be interleaved in order to be compatible with MEGA nwk files in HyPhy. I will email you the Clustal X NEXUS file. I use this file as an input for PAUP4b, and would eventually like to use tree files I produce in PAUP4b for HyPhy analysis.

Thanks for your help
Becky

Title: Re: File Ext for Positive Selection
Post by Sergei on Aug 3rd, 2005 at 12:59pm
Dear Becky,

I have identified some problems with my NEXUS reader which caused your file to be rejected.

Specifically, your file had the following NEXUS block:


[code]
BEGIN DATA;
dimensions ntax=189 nchar=1620;
format missing=?
symbols="ABCDEFGHIKLMNPQRSTUVWXYZ"
datatype=DNA interleave  gap= -;
[/code]



HyPhy expected the symbols command to contain characters separated by a space (e.g. "ABC..." should be "A B C..."). I looked at the NEXUS format specification and realized that this is not actually necessary and that "ABC..." is acceptable. This has been fixed in today's build.

Also, HyPhy had a bug which caused the 'gap= -' to be interpreted incorrectly, because of the space after '='. This has also been fixed.

You can either download the new version, or modify the header to read:

[code]
BEGIN DATA;
dimensions ntax=189 nchar=1620;
format missing=?
datatype=DNA interleave  gap=-;
[/code]

Note that the SYMBOLS command actually defines a new character alphabet, and your data will not be interpreted as DNA. In the original header, the DNA datatype will override the symbols command. I am pretty sure that SYMBOLS is inappropriate in this context and is incorrectly output. As written, the block implies that all listed characters (not just ACGT) are valid DISTINCT states, whereas in reality they should be defined as ambiguities. At any rate, I would advise you to remove the SYMBOLS statement altogether.
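
For anyone with many files to fix, the manual header edit described above could be scripted. The following is a minimal sketch, not a HyPhy utility: the function name is hypothetical, and it assumes it is given only the header text (up to the end of the FORMAT statement), not the data matrix.

```python
import re


def clean_nexus_header(header: str) -> str:
    """Apply the two manual fixes to a NEXUS DATA block header.

    1. Drop the symbols="..." item entirely.
    2. Remove whitespace around '=' so that 'gap= -' becomes 'gap=-'.

    Assumption: `header` holds only the header lines, so the '=' rule
    cannot touch sequence data.
    """
    # Remove the symbols item together with the whitespace before it.
    header = re.sub(r'\s*symbols\s*=\s*"[^"]*"', "", header, flags=re.IGNORECASE)
    # Collapse whitespace on either side of every '='.
    header = re.sub(r"\s*=\s*", "=", header)
    return header


if __name__ == "__main__":
    original = "\n".join([
        "BEGIN DATA;",
        "dimensions ntax=189 nchar=1620;",
        "format missing=?",
        'symbols="ABCDEFGHIKLMNPQRSTUVWXYZ"',
        "datatype=DNA interleave  gap= -;",
    ])
    print(clean_nexus_header(original))
```

Running this on the header quoted earlier in this post should give the same four-line header as the hand-edited version.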

Cheers,
Sergei
