Defining custom alphabets for sequence data.
Sergei
May 23rd, 2005 at 8:52am
 
Received by e-mail:
Quote:
... but how do I "tell"
hyphy to read in sequences with 7 different variables??
Can you please give me a short example of how to run hyphy with different
dimensions model (other then 4 and 20).


HyPhy can be told to interpret character data using a custom alphabet by specifying the details in the alignment file. The FASTA and PHYLIP formats do not directly provide a mechanism for defining a new character set, but HyPhy recognizes custom instructions for this purpose.

The little fragment below shows an alignment of 3 sequences over a three-character alphabet (A, B and C) plus a gap character:

Code:
$BASESET:"ABC"
$TOKEN:"-"="ABC"

>seq1
AABBCC
>seq2
A-BBAC
>seq3
CABBCC
 



The BASESET command tells HyPhy to use A, B and C as character states, and the TOKEN command maps '-' to a 3-way ambiguity (any of A, B or C), i.e. a gap.
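To make the effect of these two directives concrete, here is a minimal Python sketch of how such a header could be interpreted. It is only an illustration and not HyPhy's actual parser; the function names `parse_alphabet_header` and `encode` are hypothetical. It assumes that states are numbered in the order they appear in BASESET and that a TOKEN resolves to the set of states listed on its right-hand side.

```python
def parse_alphabet_header(lines):
    """Collect the state string and ambiguity tokens from $BASESET / $TOKEN lines."""
    states = ""
    ambiguities = {}
    for line in lines:
        line = line.strip()
        if line.startswith("$BASESET:"):
            # Everything after the colon, with the surrounding quotes removed.
            states = line.split(":", 1)[1].strip().strip('"')
        elif line.startswith("$TOKEN:"):
            token, resolved = line.split(":", 1)[1].split("=", 1)
            ambiguities[token.strip().strip('"')] = resolved.strip().strip('"')
    return states, ambiguities


def encode(sequence, states, ambiguities):
    """Turn each character into the set of state codes it is compatible with."""
    code_of = {ch: i for i, ch in enumerate(states)}
    encoded = []
    for ch in sequence:
        if ch in code_of:
            encoded.append({code_of[ch]})
        elif ch in ambiguities:
            encoded.append({code_of[c] for c in ambiguities[ch]})
        else:
            raise ValueError("character %r is not in the alphabet" % ch)
    return encoded


header = ['$BASESET:"ABC"', '$TOKEN:"-"="ABC"']
states, ambiguities = parse_alphabet_header(header)
seq2 = encode("A-BBAC", states, ambiguities)
print(seq2)
```

In this sketch the gap in seq2 becomes the full set of states, which is what "3-way ambiguity" means: the character carries no information about which of A, B or C is present.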

The NEXUS format includes mechanisms to define a custom alphabet directly, e.g.

Code:
#NEXUS

BEGIN TAXA;
	DIMENSIONS NTAX = 3;
	TAXLABELS
		'seq1' 'seq2' 'seq3' ;
END;

BEGIN CHARACTERS;
	DIMENSIONS NCHAR = 6;
	FORMAT

	SYMBOLS = "A B C"

	EQUATE="- = ABC"
		MISSING=-
	;

MATRIX
	'seq1'  AABBCC
	'seq2'  A-BBAC
	'seq3'  CABBCC;
END;
 



Data files like this can be read and filtered by HyPhy. If one then defines a 3-state rate matrix for the above example, e.g.

rateMx = {{*,a,b}
          {a,*,c}
          {b,c,*}};

then a model can be fitted by ML. Here the (i,j) entry is the rate of substituting character i with character j, i,j = 0..2. In the example above 'A' has code 0 since it was listed first, 'B' has code 1 and 'C' has code 2.
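As a rough illustration of the indexing convention, the Python sketch below builds a numeric version of this 3-state matrix. It is not HyPhy code: the function name `build_rate_matrix` and the parameter values are made up, and it assumes that each `*` on the diagonal stands for minus the sum of the other entries in its row, the usual convention for a continuous-time Markov rate matrix.

```python
def build_rate_matrix(a, b, c):
    """Return a symmetric 3x3 rate matrix for states A=0, B=1, C=2."""
    off_diagonal = [
        [0.0, a, b],   # row 0: rates A->B and A->C
        [a, 0.0, c],   # row 1: rates B->A and B->C
        [b, c, 0.0],   # row 2: rates C->A and C->B
    ]
    q = [row[:] for row in off_diagonal]
    for i in range(3):
        # Assumed meaning of '*': the diagonal makes the row sum to zero.
        q[i][i] = -sum(off_diagonal[i])
    return q


# Hypothetical parameter values, chosen only for the example.
Q = build_rate_matrix(a=1.0, b=2.0, c=0.5)
for row in Q:
    print(row)
```

With this layout, Q[0][1] is the A to B rate (parameter a) and Q[1][2] is the B to C rate (parameter c), matching the codes assigned by the order of the characters in the alphabet.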

HTH,
Sergei
Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
Steven
Guest


Re: Defining custom alphabets for sequence data.
Reply #1 - May 24th, 2005 at 3:10am
 
Thanks so much

This feature is amazing

HyPhy rules all dimensions 8)