HyPhy message board - GARD for analyze multiple data files


Miguel

CBMSO, CSIC (Spain)

GARD for analyze multiple data files
Nov 26th, 2008 at 10:07am

Hi,

Is it possible to analyze multiple data files with GARD (in the fastest case, with homogeneous rates), e.g. through Datamonkey?
Or do I have to modify the batch file "GARD.bf" for that? In that case, I see that the batch file is not very friendly; perhaps it was written to be run from HTML and in an MPI environment. Is there another GARD batch file that is easier to work with? Thanks.

Regards,
Miguel


Sergei

UCSD
Associate Professor, Division of Infectious Diseases, Division of Biomedical Informatics, School of Medicine, University of California San Diego

Re: GARD for analyze multiple data files
Reply #1 - Nov 26th, 2008 at 10:43am

Dear Miguel,

By multiple files, do you mean running a batch of files through GARD? It's probably easiest to modify the GARD.bf file. What kind of output were you looking for?

Cheers,
Sergei

Miguel

Re: GARD for analyze multiple data files
Reply #2 - Nov 27th, 2008 at 1:39am

Dear Sergei,

I have 1200 data files (alignments of DNA sequences). The idea is to analyze them independently with GARD to obtain the corresponding 1200 outputs. Each output should contain the breakpoints (or fragment intervals) and a phylogenetic tree for each fragment, as Datamonkey's GARD produces for a single alignment (but here I have 1200). NEXUS is the ideal output format for me (Datamonkey generates it). For example:

#NEXUS

[
Generated by HYPHY 1.0020080624beta(MPI) for Linux on x86_64 on Sat Nov 22 09:41:23 2008

]

BEGIN TAXA;
DIMENSIONS NTAX = 11;
TAXLABELS
'seq00001' 'seq00002' 'seq00003' 'seq00004' 'seq00005' 'seq00006' 'seq00007' 'seq00008' 'seq00009' 'seq00010' 'outgroup' ;
END;

BEGIN CHARACTERS;
DIMENSIONS NCHAR = 999;
FORMAT
DATATYPE = DNA

GAP=-
MISSING=?
NOLABELS
;

MATRIX
"sequences"
END;

BEGIN TREES;
TREE part_1 = (((seq00001:0.00989392,((seq00004:0,seq00005:0):0,seq00006:0):0.0312366):0.010171,(seq00003:0,seq00008:0.00630028):0.0189827):0,((seq00002:0.00631735,seq00009:0.0063189):0.00619616,seq00010:0.01929):0.0324627,(seq00007:0.0262537,outgroup:0.122788):0.0263472);
TREE part_2 = ((((seq00001:0.00854775,seq00002:0.0171899):0,seq00009:0.0171894):0.00861809,seq00007:0.0172076):0.00859177,((((seq00003:0,(seq00005:0,seq00006:0):0.00854154):0,seq00008:0):0.0171795,seq00010:0):0.0171975,outgroup:0.0815803):0.0174034,seq00004:0.00863121);
TREE part_3 = (((seq00001:0.0206147,(seq00002:0,seq00010:0.0204341):0.01012):0.0311044,((seq00004:0,seq00009:0):0,seq00007:0.0308687):0.0202623):0,(seq00003:0,seq00008:0):0.0206851,((seq00005:0.0101502,seq00006:0):0.0649942,outgroup:0.0876822):0.0306586);
TREE part_4 = (((seq00001:0.0578165,((seq00004:0,seq00007:0):0.0344496,outgroup:0.157492):0.0307402):0.029442,(((seq00002:0.0102012,seq00010:0):0.0205504,seq00009:0):0.064412,(seq00005:0,seq00006:0):0.00971875):0.0332362):0.00913766,seq00003:0,seq00008:0.0101911);
TREE part_5 = ((((seq00001:0.021233,((seq00002:0.005375,seq00010:0.00295404):0.0432586,outgroup:0.0940907):0.0156284):0.012343,(seq00003:0.00276942,seq00008:0):0.0143391):0.0135234,seq00004:0.030578):0.00384256,(seq00005:0,seq00006:0):0,(seq00007:0,seq00009:0):0);
TREE part_6 = (((seq00001:0.0061545,(seq00009:0.0314923,seq00010:0):0.0188424):0.0189173,(seq00004:0.025081,(seq00005:0,seq00006:0):0):0.00620545):0.00620585,((seq00002:0,seq00007:0):0.0191061,outgroup:0.114216):0.0122447,(seq00003:0,seq00008:0):0);

END;

BEGIN ASSUMPTIONS;
CHARSET span_1 = 1-159;
CHARSET span_2 = 160-277;
CHARSET span_3 = 278-376;
CHARSET span_4 = 377-475;
CHARSET span_5 = 476-837;
CHARSET span_6 = 838-999;

END;
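
For readers who want the fragment intervals from an output like the one above, the CHARSET lines can be read back with a few lines of Python. This is only a sketch, not part of HyPhy or GARD.bf: the function names `read_spans` and `breakpoints` are made up for illustration, and reporting a breakpoint as the last column of each span except the final one is an assumed convention.

```python
import re

# Matches lines such as "CHARSET span_2 = 160-277;"
# Columns are treated as 1-based and inclusive, as in the example output.
CHARSET_RE = re.compile(
    r"^\s*CHARSET\s+(\S+)\s*=\s*(\d+)\s*-\s*(\d+)\s*;",
    re.IGNORECASE | re.MULTILINE,
)


def read_spans(nexus_text):
    """Return [(name, start, end), ...] in the order the CHARSETs appear."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in CHARSET_RE.finditer(nexus_text)
    ]


def breakpoints(spans):
    """Assumed convention: a breakpoint is the last column of every span but the final one."""
    return [end for _, _, end in spans[:-1]]


# The ASSUMPTIONS block copied from the example output.
example = """BEGIN ASSUMPTIONS;
CHARSET span_1 = 1-159;
CHARSET span_2 = 160-277;
CHARSET span_3 = 278-376;
CHARSET span_4 = 377-475;
CHARSET span_5 = 476-837;
CHARSET span_6 = 838-999;

END;
"""

spans = read_spans(example)
print(spans)
print(breakpoints(spans))
```

Running the same two functions over each of the 1200 output files would give one list of intervals per alignment.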

At the moment I am trying to modify GARD.bf, but this batch file is a bit complex: it has many functions, relies on an MPI environment, and seems to be meant for execution from HTML (I'm not sure). So the question is: is there another GARD batch file that is easier to work with?
Thanks again!

Miguel


Miguel

Re: GARD for analyze multiple data files
Reply #3 - Dec 4th, 2008 at 6:18am

I got it!! Thanks anyway.

Cheers,
Miguel

Sergei

Re: GARD for analyze multiple data files
Reply #4 - Dec 8th, 2008 at 4:41pm

Dear Miguel,

Were you able to work out the formatting? Sorry - I was away on holiday/business for a few weeks. Let me know if you still need my code.

Cheers,
Sergei

Miguel

Re: GARD for analyze multiple data files
Reply #5 - Dec 9th, 2008 at 1:44am

Dear Sergei,

GARD.bf generates an output file in NEXUS format, so I just modified the batch file to run multiple data sets in a loop and removed the interactive prompts (I set the parameter values directly inside the batch file). It is now running well.
Don't worry about being away, that is normal. I just tried it step by step and got it working.
Thanks a lot for your message! ;)
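
Miguel did the looping inside GARD.bf itself, and that modified batch file is not shown in this thread. A different way to get the same "one run per alignment" effect, sketched here under stated assumptions, is an outer driver script that calls an external command once per file. Everything in this sketch is hypothetical: `run_batch` and `stand_in_command` are illustrative names, and the stand-in command only reads the file with Python. It would have to be replaced with whatever command line actually launches your prompt-free GARD.bf on your installation.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_batch(input_dir, pattern, make_command):
    """Run one external command per matching file; return {file name: exit code}."""
    results = {}
    for path in sorted(Path(input_dir).glob(pattern)):
        cmd = make_command(path)  # the caller decides the real command line
        completed = subprocess.run(cmd, capture_output=True, text=True)
        results[path.name] = completed.returncode
    return results


def stand_in_command(path):
    """Placeholder only: reads the file with Python instead of running an analysis.

    Replace with the command that runs your edited batch file on `path`.
    """
    return [sys.executable, "-c", "import sys; open(sys.argv[1]).read()", str(path)]


# Small demonstration on two throwaway alignment files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("aln_001.fas", "aln_002.fas"):
        Path(tmp, name).write_text(">seq1\nACGT\n>seq2\nACGA\n")
    demo_results = run_batch(tmp, "*.fas", stand_in_command)

print(demo_results)
```

Keeping the exit code per file makes it easy to see afterwards which of the runs failed and need to be repeated.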

Cheers,
Miguel


Zhuofei

Belgium

Re: GARD for analyze multiple data files
Reply #6 - Nov 12th, 2013 at 1:28am

Dear All,

I have an identical analysis goal. I have 1600 FASTA-format files of DNA alignments, and the idea is to detect recombination breakpoints in each one independently and split the alignments using GARD. If an alignment has recombination breakpoints, the output should be the alignment split into two or more FASTA-format files.
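
The splitting step on its own does not need HyPhy once the fragment intervals are known. The following is a minimal sketch, assuming the intervals are 1-based, inclusive column ranges (as in the CHARSET lines of the NEXUS output shown earlier in the thread) and that every sequence in the alignment has the same length. The names `read_fasta` and `split_alignment` and the toy alignment are illustrative only.

```python
def read_fasta(text):
    """Parse FASTA text into [(name, sequence), ...], joining wrapped sequence lines."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def split_alignment(records, spans):
    """Cut every sequence at the given (start, end) column ranges.

    spans are 1-based and inclusive; one FASTA string is returned per span.
    """
    parts = []
    for start, end in spans:
        lines = []
        for name, seq in records:
            lines.append(">" + name)
            lines.append(seq[start - 1:end])  # convert to a 0-based slice
        parts.append("\n".join(lines) + "\n")
    return parts


# Toy alignment with one assumed breakpoint after column 4.
fasta = ">seq1\nACGTACGTAC\n>seq2\nACGTTCGTAA\n"
parts = split_alignment(read_fasta(fasta), [(1, 4), (5, 10)])
for part in parts:
    print(part)
```

Each returned string can be written to its own file, which gives the "two or more FASTA-format files" per alignment described above.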

I don't know how to prepare a proper batch file (GARD.bf) for running multiple data sets in HyPhy. I would greatly appreciate your help with this analysis!

I have another question, about running GARD.bf in HyPhy on Windows. I have downloaded the Windows version of HyPhy 2.2.0. When I selected Standard analysis, recombination, GARD.bf, it showed the following error:
[ERROR] This analysis requires an MPI environment to run

Is it possible to run GARD on Windows, or should I install HyPhy and MPI on Linux to run GARD?

Thanks in advance!

Best regards,
Zhuofei
