PGDSpider version 3.0.0.0 (April 2021)

An automated data conversion tool for connecting population genetics and genomics programs

Introduction

System requirements

Download and Installation Instructions

Formats supported by PGDSpider:

Help

Screenshot

How to cite PGDSpider and License

Contact and bug report




Introduction

PGDSpider is a powerful automated data conversion tool for population genetics and genomics programs. It facilitates data exchange between programs for a vast range of data types (e.g. DNA, RNA, NGS, microsatellite, SNP, RFLP, AFLP, multi-allelic data, allele frequency or genetic distances). Besides the conventional population genetics formats, PGDSpider integrates population genomics data formats commonly used to store and handle next-generation sequencing (NGS) data. Currently, PGDSpider is not meant to convert very large NGS files, because it loads the whole input file into memory and the file size may exceed the available RAM. However, since PGDSpider can convert specific subsets of these NGS files into any other format, this feature can be used to calculate parameters or statistics for specific regions, and thus to perform sliding window analyses over large genomic regions.

Originally, PGDSpider was designed to use a newly developed PGD (Population Genetics Data) format as an intermediate step in the conversion process. PGD is a file format designed to store various kinds of population genetics data, including different data types (e.g. DNA sequences, microsatellites, AFLP or SNPs) and ploidy levels. PGD is based on the XML format and is therefore independent of any particular computer system and extensible for future needs. PGDSpider used PGD to connect population genetics and genomics programs like a spider spins a web. Since version 3.0.0.0, the intermediate PGD file has been replaced by a PGD object.

PGDSpider is written in Java and is therefore platform independent. It is user friendly thanks to its intuitive graphical user interface. PGDSpider allows users to store their preferred conversion settings for repeated conversions of similar input formats. A command line version of PGDSpider is also provided, making it possible to embed PGDSpider in data analysis pipelines.
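As a rough illustration of embedding the command line version in a pipeline, the following Python sketch assembles and runs a conversion command. The jar file name, the option names (-inputfile, -inputformat, -outputfile, -outputformat, -spid), the heap setting and the format identifiers are all assumptions modeled on the syntax documented for earlier PGDSpider releases; check them against the manual shipped with your download before relying on them.

```python
import subprocess

# ASSUMPTION: the jar name and all option names below are illustrative and
# must be checked against the PGDSpider manual for the installed version.
CLI_JAR = "PGDSpider3-cli.jar"


def build_command(input_file, input_format, output_file, output_format,
                  spid_file=None, jar=CLI_JAR, max_heap="1024m"):
    """Return the argument list for one conversion (nothing is executed)."""
    cmd = [
        "java", f"-Xmx{max_heap}", "-jar", str(jar),
        "-inputfile", str(input_file),
        "-inputformat", input_format,
        "-outputfile", str(output_file),
        "-outputformat", output_format,
    ]
    if spid_file is not None:
        # Stored conversion settings (SPID file) for repeated conversions.
        cmd += ["-spid", str(spid_file)]
    return cmd


def convert(*args, **kwargs):
    """Run one conversion; raises CalledProcessError if Java exits non-zero."""
    return subprocess.run(build_command(*args, **kwargs), check=True)
```

Keeping command construction separate from execution makes it possible to log or inspect the command in a pipeline before Java is actually invoked.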


System requirements

PGDSpider is written in Java and is therefore platform independent, but Sun Java RE 1.7 (or a newer version) has to be installed. Java RE can be downloaded from the following link:

http://www.oracle.com/technetwork/java/javase/downloads/index.html


Download and Installation Instructions

1st step:

Install the Java 7 RE (or a newer version)

Windows:

Linux:

Mac:


2nd step:

Download the PGDSpider application and unzip it on the local drive: PGDSpider (3.0.0.0) - Download

Execute PGDSpider GUI:

Execute PGDSpider-cli (command line):


Formats supported by PGDSpider:

PGDSpider is able to parse 33 and to write 36 different file formats:

Data format | Version | References | External dependency | Input format | Output format
PGD | 1.1 | Lischer and Excoffier, 2012 |  | x | x
Arlequin | 3.5 (12.4.2015) | Excoffier and Lischer, 2010 |  | x | x
BAM | 1 (7.1.2021) | Li et al., 2009 | SAMtools, BCFtools | x | x
BAMOVA | 1.02 (27.9.2011) | Gompert and Buerkle, 2011; Gompert et al., 2010 |  |  | x
BAPS | 6.0 (17.12.2012) | Tang et al., 2009 |  | x | x
BATWING | (2003) | Wilson et al., 2003 |  | x | x
Bayenv | 2.0 (20.11.2013) | Coop et al., 2010; Gunther and Coop, 2013 |  |  | x
BCF | (14.5.2011) |  | SAMtools, BCFtools | x | x
CONVERT | 1.31 (March 2005) | Glaubitz, 2004 |  | x |
EIGENSOFT | 7.2.1 (June 2017) | Patterson et al., 2006; Price et al., 2006 |  | x | x
extended multi-FASTA (XMFA) |  |  |  | x | x
FASTA |  | Pearson, 1990 |  | x | x
FASTQ |  | Cock et al., 2010 |  | x | x
FDist2 (datacal) |  | Beaumont and Nichols, 1996; Flint et al., 1999 |  |  | x
FSTAT | 2.9.4 (November 2003) | Goudet, 2001 |  | x | x
GDA | 1.1 (7.1.2002) | Lewis, 2001 |  | x | x
GENELAND | 4.0.7 (28.6.2019) | Guedj and Guillot, 2011; Guillot, 2008; Guillot et al., 2005; Guillot and Santos, 2009; Guillot and Santos, 2010; Guillot et al., 2008 |  | x | x
GENEPOP | 4.7.2 (23.6.2019) | Rousset, 2008 |  | x | x
GENETIX | 4.05 (5.5.2004) | Belkhir, 1996-2004 |  | x | x
GESTE / BayeScan | GESTE: 2.0 / BayeScan: 2.01 (21.1.2012) | GESTE: Foll and Gaggiotti, 2006; BayeScan: Fischer et al., 2011; Foll et al., 2010; Foll and Gaggiotti, 2008 |  |  | x
HGDP Stanford |  |  |  | x |
HGDP-CEPH (Arlequin + log file) |  |  |  | x |
Immanc / BayesAss | Immanc: 5.0 (8.10.1998) / BayesAss: 3.04 (2.3.2018) | Immanc: Rannala, Mountain, 1997; BayesAss: Wilson, Rannala, 2003 |  | x | x
IM / IMa | (17.12.2009) | IM: Hey and Nielsen, 2004; Nielsen and Wakeley, 2001; IMa: Hey and Nielsen, 2007 |  | x | x
IMa2 / IMa3 | IMa2: (26.8.2011) / IMa3: (3.6.2019) | IMa2: Hey, 2010 / IMa3: Hey et al., 2018 |  | x | x
KML | 2.2 | Google 2009 |  |  | x
MAF | 1.0 |  |  | x |
MEGA | 10.1 (9.9.2019) | Kumar et al., 2018 |  | x | x
MIGRATE | 3.2.6 (13.10.2010) | Beerli, 2009 |  | x | x
MSA | 4.05 | Dieringer, Schlotterer, 2003 |  | x | x
MSVar | 0.4.1.b (7.4.1999) | Beaumont, 1999 |  |  | x
NewHybrids | 1.1 beta (7.4.2003) | Anderson and Thompson, 2002 |  | x | x
NEXUS |  | Maddison et al., 1997 (able to read CharSet definitions within a MrBayes block) |  | x | x
ONeSAMP | 2.0 | Tallmon et al., 2008 |  | x | x
PED | 1.9 (16.4.2021) | Chang et al., 2015 |  | x | x
PHYLIP / RAxML | PHYLIP: 3.69 (April 2013); RAxML: 8.2.12 (2018) | PHYLIP: Felsenstein, 1989; Felsenstein, 2004; RAxML: Stamatakis, 2014 |  | x | x
SAM | 1 (7.1.2021) | Li et al., 2009 | SAMtools, BCFtools | x | x
Structurama |  | Huelsenbeck et al., 2011 |  |  | x
STRUCTURE / fastSTRUCTURE | STRUCTURE: 2.3.4 (July 2012); fastSTRUCTURE: 1.0 | STRUCTURE: Falush et al., 2003; Falush et al., 2007; Pritchard et al., 2000; Hubisz et al., 2009; fastSTRUCTURE: Raj et al., 2014 |  | x | x
VCF | 4.1 (2.8.2012), without structural variants (only SNP and INDELs) |  | SAMtools, BCFtools | x | x
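To check in a script whether a given conversion is possible, the input and output columns of the table can be encoded as sets. The sketch below is one reading of the table: which single-direction formats are input-only and which are output-only is inferred from the stated totals (33 parsed, 36 written), and the short format labels are illustrative, not identifiers used by PGDSpider itself.

```python
# Formats marked "x" in both the input and the output column.
BOTH = {
    "PGD", "Arlequin", "BAM", "BAPS", "BATWING", "BCF", "EIGENSOFT", "XMFA",
    "FASTA", "FASTQ", "FSTAT", "GDA", "GENELAND", "GENEPOP", "GENETIX",
    "Immanc/BayesAss", "IM/IMa", "IMa2/IMa3", "MEGA", "MIGRATE", "MSA",
    "NewHybrids", "NEXUS", "ONeSAMP", "PED", "PHYLIP/RAxML", "SAM",
    "STRUCTURE/fastSTRUCTURE", "VCF",
}
# ASSUMPTION: the split below is inferred from the table and its totals.
INPUT_ONLY = {"CONVERT", "HGDP Stanford", "HGDP-CEPH", "MAF"}
OUTPUT_ONLY = {"BAMOVA", "Bayenv", "FDist2", "GESTE/BayeScan", "KML",
               "MSVar", "Structurama"}

READABLE = BOTH | INPUT_ONLY    # formats that can be parsed
WRITABLE = BOTH | OUTPUT_ONLY   # formats that can be written


def can_convert(source, target):
    """True if `source` can be parsed and `target` can be written."""
    return source in READABLE and target in WRITABLE
```

With these sets, `len(READABLE)` and `len(WRITABLE)` come out as 33 and 36, which matches the totals stated above the table.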

Note that PGDSpider is currently not meant to convert large NGS files, because it loads the whole input file into memory, which may lead to memory issues. However, PGDSpider can convert specific subsets of these NGS files into any other format, and this approach can be used to perform sliding window analyses on large NGS files.
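The window coordinates for such an analysis could be generated as in the following sketch. It only computes 1-based, inclusive (start, end) pairs; the function name and parameters are hypothetical, and how each window is then selected as a subset during conversion depends on the conversion settings and is not shown here.

```python
def sliding_windows(region_start, region_end, window_size, step):
    """Yield inclusive (start, end) windows covering region_start..region_end.

    The last window is truncated at region_end so the region is fully covered.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    start = region_start
    while start <= region_end:
        end = min(start + window_size - 1, region_end)
        yield (start, end)
        if end == region_end:
            break  # the region is covered; avoid emitting nested tail windows
        start += step


# Example: overlapping windows of 4 positions, shifted by 3 positions each.
windows = list(sliding_windows(1, 10, window_size=4, step=3))
```

Each window can then be converted and analysed separately, so that only a small part of the data is processed per run.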


Help

If you have any problems:


Screenshot

PGDSpider GUI: [screenshot]

SPID Editor: [screenshot]


How to cite PGDSpider and License

Lischer HEL and Excoffier L (2012) PGDSpider: An automated data conversion tool for connecting population genetics and genomics programs. Bioinformatics 28: 298-299.

Copyright (c) 2007-2021, Heidi E.L. Lischer. All rights reserved.

PGDSpider is distributed under the BSD 3-Clause License. For the full text of the license, see the file LICENSE.txt. By using, modifying or distributing this software you agree to be bound by the terms of this license.


Contact and bug report

If you find a bug, please send me an e-mail with a short description of the bug and the input and output file formats involved. If possible, also attach the input file that caused the problem.

PGDSpider is an ongoing project. For comments or suggestions for further file formats, please send me an e-mail.

e-mail: heidi.tschanz-lischer(at)bioinformatics.unibe.ch

Heidi Tschanz-Lischer
Interfaculty Bioinformatics Unit (IBU)
University of Berne
3012 Bern
Switzerland

member of the Swiss Institute of Bioinformatics (SIB)

26.04.2021