GARD for analyze multiple data files (Read 4937 times)
Miguel
Junior Member
**
Offline


Hi Hyphy!

Posts: 53
CBMSO, CSIC (Spain)
Gender: male
GARD for analyze multiple data files
Nov 26th, 2008 at 10:07am
 
Hi,

Is it possible to analyze multiple data files with GARD (the fastest case, with homogeneous rates), e.g. through Datamonkey?
Or do I have to modify the batch file "GARD.bf" for that? If so, I find the batch file not very friendly; perhaps it was written to run with HTML and in an MPI environment. Is there another GARD batch file that is easier to work with? Thanks.

Regards,
Miguel

 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: GARD for analyze multiple data files
Reply #1 - Nov 26th, 2008 at 10:43am
 
Dear Miguel,

By multiple files, do you mean running a batch of files through GARD? It's probably easiest to modify the GARD.bf file. What kind of output were you looking for?

Cheers,
Sergei

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
 
Miguel
Re: GARD for analyze multiple data files
Reply #2 - Nov 27th, 2008 at 1:39am
 
Dear Sergei,

I have 1200 data files (alignments of DNA sequences). The idea is to analyze each of them independently with GARD to obtain the corresponding 1200 outputs. Each output should contain the breakpoints (or fragment intervals) and a phylogenetic tree for each fragment, as GARD on Datamonkey produces for a single alignment. NEXUS format is the ideal kind of output for me (Datamonkey generates it). For example:

#NEXUS

[
Generated by HYPHY 1.0020080624beta(MPI) for Linux on x86_64 on Sat Nov 22 09:41:23 2008

]

BEGIN TAXA;
     DIMENSIONS NTAX = 11;
     TAXLABELS
           'seq00001' 'seq00002' 'seq00003' 'seq00004' 'seq00005' 'seq00006' 'seq00007' 'seq00008' 'seq00009' 'seq00010' 'outgroup' ;
END;

BEGIN CHARACTERS;
     DIMENSIONS NCHAR = 999;
     FORMAT
           DATATYPE = DNA

           GAP=-
           MISSING=?
           NOLABELS
     ;

MATRIX
"sequences"
END;

BEGIN TREES;
     TREE part_1 = (((seq00001:0.00989392,((seq00004:0,seq00005:0):0,seq00006:0):0.0312366):0.010171,(seq00003:0,seq00008:0.00630028):0.0189827):0,((seq00002:0.00631735,seq00009:0.0063189):0.00619616,seq00010:0.01929):0.0324627,(seq00007:0.0262537,outgroup:0.122788):0.0263472);
     TREE part_2 = ((((seq00001:0.00854775,seq00002:0.0171899):0,seq00009:0.0171894):0.00861809,seq00007:0.0172076):0.00859177,((((seq00003:0,(seq00005:0,seq00006:0):0.00854154):0,seq00008:0):0.0171795,seq00010:0):0.0171975,outgroup:0.0815803):0.0174034,seq00004:0.00863121);
     TREE part_3 = (((seq00001:0.0206147,(seq00002:0,seq00010:0.0204341):0.01012):0.0311044,((seq00004:0,seq00009:0):0,seq00007:0.0308687):0.0202623):0,(seq00003:0,seq00008:0):0.0206851,((seq00005:0.0101502,seq00006:0):0.0649942,outgroup:0.0876822):0.0306586);
     TREE part_4 = (((seq00001:0.0578165,((seq00004:0,seq00007:0):0.0344496,outgroup:0.157492):0.0307402):0.029442,(((seq00002:0.0102012,seq00010:0):0.0205504,seq00009:0):0.064412,(seq00005:0,seq00006:0):0.00971875):0.0332362):0.00913766,seq00003:0,seq00008:0.0101911);
     TREE part_5 = ((((seq00001:0.021233,((seq00002:0.005375,seq00010:0.00295404):0.0432586,outgroup:0.0940907):0.0156284):0.012343,(seq00003:0.00276942,seq00008:0):0.0143391):0.0135234,seq00004:0.030578):0.00384256,(seq00005:0,seq00006:0):0,(seq00007:0,seq00009:0):0);
     TREE part_6 = (((seq00001:0.0061545,(seq00009:0.0314923,seq00010:0):0.0188424):0.0189173,(seq00004:0.025081,(seq00005:0,seq00006:0):0):0.00620545):0.00620585,((seq00002:0,seq00007:0):0.0191061,outgroup:0.114216):0.0122447,(seq00003:0,seq00008:0):0);

END;

BEGIN ASSUMPTIONS;
     CHARSET span_1 = 1-159;
     CHARSET span_2 = 160-277;
     CHARSET span_3 = 278-376;
     CHARSET span_4 = 377-475;
     CHARSET span_5 = 476-837;
     CHARSET span_6 = 838-999;

END;

At the moment I am trying to modify GARD.bf, but this batch file is a bit complex: it has many functions, expects an MPI environment, and seems to be meant for running with HTML (I'm not sure). So the question is: is there another GARD batch file that is easier to work with?
Thanks again!

Miguel
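
For readers who want to post-process output like the example above, here is a minimal Python sketch (not part of HyPhy) that pulls the CHARSET spans and the per-partition trees out of such a NEXUS file. It assumes the TREES and ASSUMPTIONS blocks look like the ones in this post; the function names `parse_gard_nexus` and `breakpoints` and the toy input are made up for illustration.

```python
import re

# "CHARSET span_1 = 1-159;" -> ("span_1", "1", "159")
CHARSET_RE = re.compile(r"\bCHARSET\s+(\S+)\s*=\s*(\d+)\s*-\s*(\d+)\s*;", re.IGNORECASE)
# "TREE part_1 = (...);" -> ("part_1", "(...);"), allowing the tree to wrap over lines
TREE_RE = re.compile(r"\bTREE\s+(\S+)\s*=\s*(.*?;)", re.IGNORECASE | re.DOTALL)


def parse_gard_nexus(text):
    """Return (spans, trees) from NEXUS text.

    spans is a list of (name, start, end) with 1-based inclusive coordinates;
    trees maps each tree name to its Newick string with whitespace removed,
    which also rejoins trees that were wrapped across lines.
    """
    spans = [(name, int(start), int(end)) for name, start, end in CHARSET_RE.findall(text)]
    trees = {name: re.sub(r"\s+", "", newick) for name, newick in TREE_RE.findall(text)}
    return spans, trees


def breakpoints(spans):
    """Return the last site of every span except the final one."""
    ordered = sorted(spans, key=lambda span: span[1])
    return [end for _, _, end in ordered[:-1]]


# Toy input (hypothetical, much smaller than real GARD output).
example = """
BEGIN TREES;
     TREE part_1 = (seq00001:0.01,(seq00002:0,
seq00003:0):0.02);
     TREE part_2 = ((seq00001:0.01,seq00002:0):0,seq00003:0);
END;
BEGIN ASSUMPTIONS;
     CHARSET span_1 = 1-159;
     CHARSET span_2 = 160-277;
END;
"""

spans, trees = parse_gard_nexus(example)
print(spans)
print(breakpoints(spans))
print(trees["part_1"])
```

The regular expressions only cover the layout shown in this thread; a full NEXUS parser would also need to handle comments in square brackets and quoted names.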



 
Miguel
Re: GARD for analyze multiple data files
Reply #3 - Dec 4th, 2008 at 6:18am
 
I got it!! :)
Thanks anyway.
Cheers,

Miguel
 
Sergei
Re: GARD for analyze multiple data files
Reply #4 - Dec 8th, 2008 at 4:41pm
 
Dear Miguel,

Were you able to work out the formatting? Sorry - I was away on holiday/business for a few weeks. Let me know if you still need my code.

Cheers,
Sergei
 
Miguel
Re: GARD for analyze multiple data files
Reply #5 - Dec 9th, 2008 at 1:44am
 
Dear Sergei,

GARD.bf generates an output file in NEXUS format, so I just modified the batch file to run multiple data sets in a loop and removed the interactive prompts (I set the parameter values directly inside the batch file). Now it is running well.
Don't worry about being away, that is normal. I just worked through it step by step and got it working.
Thanks a lot for your message! ;)

Cheers,
Miguel
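
Miguel's modified batch file is not posted in this thread. As an alternative to looping inside GARD.bf, the same one-run-per-alignment idea can be sketched with an external Python driver. Everything below is an assumption for illustration: the function name `run_batch`, the `*.fas` pattern, the `.nex` and `.log` output names, and above all the command itself, which is a harmless stand-in (a file copy) rather than a real HyPhy invocation. Substitute whatever command runs GARD on your installation.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_batch(input_dir, output_dir, command_template, pattern="*.fas"):
    """Run one command per alignment file and return {file name: exit code}.

    command_template is a list of arguments; "{input}" and "{output}" in any
    argument are replaced by the alignment path and the per-file output path.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for alignment in sorted(Path(input_dir).glob(pattern)):
        output = output_dir / (alignment.stem + ".nex")
        command = [part.format(input=alignment, output=output) for part in command_template]
        completed = subprocess.run(command, capture_output=True, text=True)
        # Keep stdout/stderr of every run so failures can be inspected later.
        log = output_dir / (alignment.stem + ".log")
        log.write_text(completed.stdout + completed.stderr)
        results[alignment.name] = completed.returncode
    return results


# Demo with a stand-in command that just copies the input to the output path.
# This is NOT a HyPhy command line; replace it with your own.
stand_in = [
    sys.executable, "-c",
    "import shutil, sys; shutil.copy(sys.argv[1], sys.argv[2])",
    "{input}", "{output}",
]

with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "alignments"
    data.mkdir()
    for i in (1, 2):
        (data / f"aln{i}.fas").write_text(">a\nACGT\n>b\nACGA\n")
    demo_results = run_batch(data, Path(tmp) / "results", stand_in)

print(demo_results)
```

Running each alignment as a separate process means that one failed data set does not stop the other runs, and the exit codes show which files need a second look.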

 
Zhuofei
YaBB Newbies
*
Offline


Curious HyPhy user

Posts: 3
Belgium
Gender: male
Re: GARD for analyze multiple data files
Reply #6 - Nov 12th, 2013 at 1:28am
 
Dear All,

I have an identical analysis goal. I have 1600 FASTA-format files of DNA alignments. The idea is to detect recombination breakpoints in each alignment independently and split the alignment using GARD. If an alignment has recombination breakpoints, the output should be the split alignment, written as two or more FASTA-format files.

I don't know how to prepare a proper batch file (GARD.bf) for running multiple data sets in HyPhy. I would very much appreciate your help with this analysis!

I have another question about running GARD.bf in HyPhy on Windows. I have downloaded the Windows version of HyPhy 2.2.0. When I selected Standard analysis, recombination, GARD.bf, it showed the following error:
[ERROR] This analysis requires an MPI environment to run

Is it possible to run GARD on Windows, or should I install HyPhy and MPI on Linux to run GARD?

Thanks in advance!

Best regards,
Zhuofei
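
The splitting step described above can be done outside HyPhy once the breakpoints are known. Below is a minimal Python sketch, assuming the spans are given as 1-based inclusive intervals like the CHARSET lines earlier in this thread. The function names, the `<stem>.<span>.fas` file naming, and the toy alignment are all hypothetical.

```python
import tempfile
from pathlib import Path


def read_fasta(text):
    """Parse FASTA text into {name: sequence}, joining wrapped sequence lines."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def split_alignment(records, spans):
    """Cut every sequence at the given 1-based, inclusive (name, start, end) spans."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("sequences are not aligned to the same length")
    length = lengths.pop()
    pieces = {}
    for span_name, start, end in spans:
        if not 1 <= start <= end <= length:
            raise ValueError(f"{span_name} ({start}-{end}) is outside 1-{length}")
        # Convert the 1-based inclusive interval to a Python slice.
        pieces[span_name] = {n: seq[start - 1:end] for n, seq in records.items()}
    return pieces


def write_pieces(pieces, out_dir, stem):
    """Write one FASTA file per span, named <stem>.<span>.fas; return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for span_name, records in pieces.items():
        path = out_dir / f"{stem}.{span_name}.fas"
        path.write_text("".join(f">{n}\n{seq}\n" for n, seq in records.items()))
        paths.append(path)
    return paths


# Toy alignment with one breakpoint after column 4 (made-up data).
toy = ">seq1\nACGTACGTAC\n>seq2\nACGTTTGTAA\n"
toy_spans = [("span_1", 1, 4), ("span_2", 5, 10)]
pieces = split_alignment(read_fasta(toy), toy_spans)

with tempfile.TemporaryDirectory() as tmp:
    written = [p.name for p in write_pieces(pieces, tmp, "toy")]

print(pieces["span_2"])
print(written)
```

An alignment without breakpoints simply has a single span covering all columns, so it produces one output file identical in content to the input.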