HyPhy message board
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
Methodology Questions >> How to >> Doublet model for RNA
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1165591111

Message started by Jose_Patane on Dec 8th, 2006 at 7:18am

Title: Doublet model for RNA
Post by Jose_Patane on Dec 8th, 2006 at 7:18am
Hi all,

How can I implement a doublet model in HyPhy, using a batch file? I took a look at the documentation and help files, but I wasn't able to figure it out. I've tried MrBayes, but the doublet model in there considers the 16 possible states... but I only have 8, so I'd rather try a less parametric model.

Title: Re: Doublet model for RNA
Post by Sergei on Dec 8th, 2006 at 9:07am
Dear Jose,

There are a number (quite a few more than MrBayes) of 16x16 stem RNA models implemented in HyPhy (available via StandardAnalyses>Basic Analyses>AnalyzeDiNucData.bf).

When you say you have only 8 states, I presume you mean that only 8 pairings (AU,UA,CG,GC,GU,UG and 2 more most likely) are actually observed in the alignment?

You could implement your analysis along the lines of, for example, the file Examples/SimpleAnalyses/HKY85shared.bf, with a few tricks.

1). You need to create a doublet filter and explicitly exclude the 8 states you don't want to model. For example
[code]
DataSetFilter RNASomeStates = CreateFilter (myData,2,"","","AC,AG,UC");
[/code]
would create a 13-state filter (16 states minus "AC,AG,UC"). This construct also assumes that your data are linearly arranged, i.e. site 1 is paired with site 2, site 3 with site 4, etc.

2). When defining the rate matrix to model the evolution of your 8 states, it is important to understand how they are ordered (from 1st to last). For 16 states, the arrangement is lexicographic (AA,AC,AG,AU,....,UA,UC,UG,UU). For a reduced set, the ordering will be the same, except the excluded states won't be there. This mapping will help you define the rate matrix properly.
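
The ordering described in point 2 can be checked with a small Python sketch. This is only an illustration of the lexicographic mapping, not HyPhy code: the function name `doublet_states` and the 0-based positions it prints are assumptions made for the example, and the excluded set is the "AC,AG,UC" example from point 1.

```python
from itertools import product

NUCLEOTIDES = "ACGU"  # alphabetical order gives AA, AC, AG, AU, CA, ..., UU


def doublet_states(excluded=()):
    """Return doublet states in lexicographic order, skipping excluded ones."""
    excluded = set(excluded)
    states = ["".join(pair) for pair in product(NUCLEOTIDES, repeat=2)]
    return [state for state in states if state not in excluded]


# The 13-state example from point 1: all 16 doublets minus AC, AG and UC.
reduced = doublet_states(["AC", "AG", "UC"])

# Position of each remaining state (0-based here), i.e. which row/column
# of the reduced rate matrix it would occupy under this ordering.
index_of = {state: position for position, state in enumerate(reduced)}

for state in reduced:
    print(index_of[state], state)
```

With this exclusion, "AU" moves up to the second position because "AC" and "AG" are gone; the same bookkeeping applies to any other excluded set.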

HTH,
Sergei

Title: Re: Doublet model for RNA
Post by Jose_Patane on Dec 8th, 2006 at 9:40am
Hmm... I got the whole idea... but, anyway, is there a simpler way to input the alignment in the 'datasetfilter' step? I'm asking because, the way you're suggesting, I'd have to re-edit the columns of my alignment, putting the stem sites side by side and making them start from position 1. Would using "Paired intervals", as suggested in the PDF language manual under the 'CreateFilter' topic, be a workaround for that?


And thanks again for your help so far!

Title: Re: Doublet model for RNA
Post by Sergei on Dec 8th, 2006 at 9:52am
Dear Jose,

Yes, indeed, you could filter the alignment differently, without having to edit it at all. Which format do you currently use to identify paired nucleotides in an alignment? If you use the parenthetical notation, I have a converter file which will produce a filter for HyPhy to work on...

Cheers,
Sergei

Title: Re: Doublet model for RNA
Post by Jose_Patane on Dec 8th, 2006 at 10:17am
I annotated stems and loops using BioEdit, but not in a DCSE way, nor did I use any other kind of mask (until now I had no need to: in MrBayes, I simply wrote a list of the pairing sites). Do you know of any software that converts graphical annotations made by color-coding columns into parenthetical notation, or something remotely related to that? Done manually, it would be a very error-prone process...

Title: Re: Doublet model for RNA
Post by Sergei on Dec 8th, 2006 at 10:32am
Dear Jose,

If you have a list of pairing sites handy, you can easily stick it into HyPhy (with the caveat that HyPhy starts numbering from 0).

E.g.

[code]
DataSetFilter RNAData = CreateFilter (ds, 2, "0,5,1,4...
[/code]

Alternatively, if you have runs of sites (e.g. 0-4 is paired with 15-11), you can write "0-4&15-11" to generate the list 0,15,1,14,2,13,3,12,4,11.
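
To make the paired-interval expansion concrete, here is a minimal Python sketch of the interleaving described above. It is not HyPhy's parser: the function name `expand_paired_intervals` is hypothetical, and it handles only a single "a-b&c-d" specification of the kind shown in the example.

```python
def expand_paired_intervals(spec):
    """Expand a spec such as "0-4&15-11" into interleaved paired sites."""
    left, right = spec.split("&")

    def expand(interval):
        # An interval may run upward (0-4) or downward (15-11).
        start, end = (int(value) for value in interval.split("-"))
        step = 1 if end >= start else -1
        return list(range(start, end + step, step))

    first, second = expand(left), expand(right)
    if len(first) != len(second):
        raise ValueError("paired intervals must have the same length")

    # Interleave so that each site is followed by its pairing partner.
    sites = []
    for site, partner in zip(first, second):
        sites.extend([site, partner])
    return sites


print(",".join(str(site) for site in expand_paired_intervals("0-4&15-11")))
# prints 0,15,1,14,2,13,3,12,4,11
```

Checking the expanded list by eye against the stem annotation is a cheap way to catch an interval typed in the wrong direction.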

Also, HyPhy can read CHARSET blocks from NEXUS files if you have those...

Cheers,
Sergei

Title: Re: Doublet model for RNA
Post by Jose_Patane on Dec 8th, 2006 at 10:49am
Great! It seems to me it's gonna work... thanks a lot Sergei!

Title: Re: Doublet model for RNA
Post by Jose_Patane on Dec 8th, 2006 at 10:55am
Just some more questions: how does HyPhy read PAUP's 'charset' command? What further steps do I need to put in datasetfilter so that such data is read properly? Does HyPhy automatically 0-index the sites?

Title: Re: Doublet model for RNA
Post by Sergei on Dec 8th, 2006 at 11:36am
Dear Jose,

Glad I could help. HyPhy will automatically -1 shift all the indices in the CHARSET.  
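
The shift above applies to CHARSET blocks; per the earlier caveat that HyPhy numbers sites from 0, a pair list typed by hand needs the same adjustment. A minimal Python sketch of that conversion, assuming the pairs are held as 1-based (i, j) tuples (the function name `pairs_to_site_list` is hypothetical and not part of HyPhy):

```python
def pairs_to_site_list(pairs_1based):
    """Turn 1-based (i, j) pairs into an interleaved 0-based site string."""
    sites = []
    for i, j in pairs_1based:
        if i < 1 or j < 1:
            raise ValueError("site numbers are expected to be 1-based")
        # Subtract 1 from each site to move from 1-based to 0-based numbering.
        sites.extend([i - 1, j - 1])
    return ",".join(str(site) for site in sites)


# Sites 1 and 6 pair, as do sites 2 and 5 (1-based numbering).
print(pairs_to_site_list([(1, 6), (2, 5)]))
# prints 0,5,1,4
```

The resulting string has the same interleaved form as the site list passed to CreateFilter earlier in the thread.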

I made a little example (look in the HYPHY block) to show how to create filters from CHARSET specifications and diagnose them.


Cheers,
Sergei

Title: Re: Doublet model for RNA
Post by Jose_Patane on Dec 8th, 2006 at 2:37pm
Wow... those are really neat tricks... thanks once more!
