PGDSpider version 3.0.0.0 (April 2021)
An automated data conversion tool for connecting population genetics and genomics programs
System requirements
Download and Installation Instructions
Formats supported by PGDSpider:
Help
Screenshot
How to cite PGDSpider and License
Contact and bug report
Introduction
PGDSpider is a powerful automated data conversion tool for population genetics and genomics programs. It facilitates data exchange between programs for a vast range of data types (e.g. DNA, RNA, NGS, microsatellite, SNP, RFLP, AFLP, multi-allelic data, allele frequency or genetic distances). Besides the conventional population genetics formats, PGDSpider integrates population genomics data formats commonly used to store and handle next-generation sequencing (NGS) data. Currently, PGDSpider is not meant to convert very large NGS files, because it loads the whole input file into memory and the file size may exceed the available RAM. However, since PGDSpider can convert specific subsets of these NGS files into any other format, this feature can be used to calculate parameters or statistics for specific regions, and thus to perform sliding-window analyses over large genomic regions.
Initially, PGDSpider was designed to use a newly developed PGD (Population Genetics Data) format as an intermediate step in the conversion process. PGD is a file format designed to store various kinds of population genetics data, including different data types (e.g. DNA sequences, microsatellites, AFLP or SNPs) and ploidy levels. PGD is based on the XML format and is therefore independent of any particular computer system and extensible for future needs. PGDSpider used PGD to connect population genetics and genomics programs like a spider knits a web. Since version 3.0.0.0, the intermediate PGD file has been replaced by a PGD object.
PGDSpider is written in Java and is therefore platform independent. It is user friendly due to its intuitive graphical user interface. PGDSpider allows users to store their preferred conversion settings for repeated conversions of similar input formats. A command line version of PGDSpider is also provided, making it possible to embed PGDSpider in data analysis pipelines.
System requirements
PGDSpider is written in Java and is therefore platform independent, but Java RE 1.7 (or a newer version) has to be installed. Java RE can be downloaded from the following link:
http://www.oracle.com/technetwork/java/javase/downloads/index.html
Download and Installation Instructions
1st step:
Install the Java 7 RE (or a newer version)
Windows:
- download and install Java RE from the following link: http://www.oracle.com/technetwork/java/javase/downloads/index.html
Linux:
Ubuntu/Debian: Execute the following command as a root user:
apt-get install openjdk-8-jre
Other Linux distributions: http://www.oracle.com/technetwork/java/javase/downloads/index.html
Mac:
- Apple Computer supplies their own version of Java. Use the Software Update feature (available on the Apple menu) to check that you have the most up-to-date version of Java for your Mac. If you have problems with downloading, installing or using Java on Mac, please contact Apple Computer Technical Support.
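Before moving on to the 2nd step, it can help to confirm that the installed Java meets the 1.7 minimum. `java -version` prints a first line such as `java version "1.8.0_292"` or `openjdk version "11.0.11" 2021-04-20` (on standard error), where releases before Java 9 use the `1.x` numbering scheme. The following is a minimal sketch that extracts the major version from such a line; the function name `java_major` is illustrative and not part of PGDSpider.

```shell
# Extract the major Java version from the first line of `java -version` output.
# java_major is a hypothetical helper name used only for this example.
java_major() {
    # Pull out the quoted version string, e.g. 1.8.0_292 or 11.0.11
    v=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case $v in
        1.*) v=${v#1.} ;;   # legacy scheme: 1.7 -> 7, 1.8 -> 8
    esac
    # Keep only the leading digits (drop minor version, update number, suffixes)
    printf '%s\n' "${v%%[!0-9]*}"
}
```

It could be called as `java_major "$(java -version 2>&1 | head -n 1)"`; a result of 7 or higher satisfies the requirement above.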
2nd step:
Download the PGDSpider application and unzip it on the local drive: PGDSpider (3.0.0.0) - Download
PGDSpider user manual: PGDSpider manual_v3-0-0-0.pdf
changes: changelog.txt
simple example files to do some trial format conversion with PGDSpider: examples.zip
Execute PGDSpider GUI:
Windows: execute the file
PGDSpider3.exe
to start the program
Linux: execute the command
./PGDSpider3.sh
to start the program
Mac and others: execute the command
java -Xmx1024m -Xms512m -jar PGDSpider3.jar
to start the program
Execute PGDSpider-cli (command line):
Windows: execute the command
PGDSpider3-cli.exe
Linux: execute the command
java -Xmx1024m -Xms512m -jar PGDSpider3-cli.jar
Mac and others: execute the command
java -Xmx1024m -Xms512m -jar PGDSpider3-cli.jar
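For pipeline use, the command above is normally extended with options that name the input and output files, their formats, and a SPID file holding the conversion settings. The following is a minimal sketch of a wrapper function. It assumes the option names `-inputfile`, `-inputformat`, `-outputfile`, `-outputformat` and `-spid` as described in the PGDSpider user manual; please check them against the manual for your version. The function name `pgdspider_convert` and the `JAVA` and `PGDSPIDER_JAR` variables are hypothetical conveniences, not part of PGDSpider.

```shell
# Run one PGDSpider-cli conversion.
# Arguments: $1 input file, $2 input format, $3 output file,
#            $4 output format, $5 SPID file with the conversion settings.
# Assumption: the option names below match the user manual; verify before use.
# JAVA and PGDSPIDER_JAR are hypothetical overrides for the launcher and jar path.
pgdspider_convert() {
    "${JAVA:-java}" -Xmx1024m -Xms512m -jar "${PGDSPIDER_JAR:-PGDSpider3-cli.jar}" \
        -inputfile "$1" -inputformat "$2" \
        -outputfile "$3" -outputformat "$4" \
        -spid "$5"
}
```

A call might look like `pgdspider_convert input.vcf VCF output.txt GENEPOP settings.spid`. The file names and format names here are placeholders; the accepted format names are listed in the user manual.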
Formats supported by PGDSpider:
PGDSpider is able to parse 33 and to write 36 different file formats:
Data format | Version | References | External dependency | Input format | Output format |
---|---|---|---|---|---|
PGD | 1.1 | Lischer and Excoffier, 2012 | | x | x |
Arlequin | 3.5 (12.4.2015) | Excoffier and Lischer, 2010 | | x | x |
BAM | 1 (7.1.2021) | Li et al., 2009 | SAMtools, BCFtools | x | x |
BAMOVA | 1.02 (27.9.2011) | Gompert and Buerkle, 2011; Gompert et al., 2010 | | | x |
BAPS | 6.0 (17.12.2012) | Tang et al., 2009 | | x | x |
BATWING | (2003) | Wilson et al., 2003 | | x | x |
Bayenv | 2.0 (20.11.2013) | Coop et al., 2010; Gunther and Coop, 2013 | | | x |
BCF | (14.5.2011) | | SAMtools, BCFtools | x | x |
CONVERT | 1.31 (March 2005) | Glaubitz, 2004 | | x | |
EIGENSOFT | 7.2.1 (June 2017) | Patterson et al., 2006; Price et al., 2006 | | x | x |
extended multi-FASTA (XMFA) | | | | x | x |
FASTA | | Pearson, 1990 | | x | x |
FASTQ | | Cock et al., 2010 | | x | x |
FDist2 (datacal) | | Beaumont and Nichols, 1996; Flint et al., 1999 | | | x |
FSTAT | 2.9.4 (November 2003) | Goudet, 2001 | | x | x |
GDA | 1.1 (7.1.2002) | Lewis, 2001 | | x | x |
GENELAND | 4.0.7 (28.6.2019) | Guedj and Guillot, 2011; Guillot, 2008; Guillot et al., 2005; Guillot and Santos, 2009; Guillot and Santos, 2010; Guillot et al., 2008 | | x | x |
GENEPOP | 4.7.2 (23.6.2019) | Rousset, 2008 | | x | x |
GENETIX | 4.05 (5.5.2004) | Belkhir, 1996-2004 | | x | x |
GESTE / BayeScan | GESTE: 2.0 / BayeScan: 2.01 (21.1.2012) | GESTE: Foll and Gaggiotti, 2006; BayeScan: Fischer et al., 2011; Foll et al., 2010; Foll and Gaggiotti, 2008 | | | x |
HGDP | | Stanford | | x | |
HGDP-CEPH (Arlequin + log file) | | | | x | |
Immanc / BayesAss | Immanc: 5.0 (8.10.1998) / BayesAss: 3.04 (2.3.2018) | Immanc: Rannala and Mountain, 1997; BayesAss: Wilson and Rannala, 2003 | | x | x |
IM / IMa | (17.12.2009) | IM: Hey and Nielsen, 2004; Nielsen and Wakeley, 2001; IMa: Hey and Nielsen, 2007 | | x | x |
IMa2 / IMa3 | IMa2: (26.8.2011) / IMa3: (3.6.2019) | IMa2: Hey, 2010; IMa3: Hey et al., 2018 | | x | x |
KML | 2.2 | Google 2009 | | | x |
MAF | 1.0 | | | x | |
MEGA | 10.1 (9.9.2019) | Kumar et al., 2018 | | x | x |
MIGRATE | 3.2.6 (13.10.2010) | Beerli, 2009 | | x | x |
MSA | 4.05 | Dieringer and Schlotterer, 2003 | | x | x |
MSVar | 0.4.1.b (7.4.1999) | Beaumont, 1999 | | | x |
NewHybrids | 1.1 beta (7.4.2003) | Anderson and Thompson, 2002 | | x | x |
NEXUS | | Maddison et al., 1997 (able to read CharSet definitions within a MrBayes block) | | x | x |
ONeSAMP | 2.0 | Tallmon et al., 2008 | | x | x |
PED | 1.9 (16.4.2021) | Chang et al., 2015 | | x | x |
PHYLIP / RAxML | PHYLIP: 3.69 (April 2013) / RAxML: 8.2.12 (2018) | PHYLIP: Felsenstein, 1989; Felsenstein, 2004; RAxML: Stamatakis, 2014 | | x | x |
SAM | 1 (7.1.2021) | Li et al., 2009 | SAMtools, BCFtools | x | x |
Structurama | | Huelsenbeck et al., 2011 | | | x |
STRUCTURE / fastSTRUCTURE | STRUCTURE: 2.3.4 (July 2012) / fastSTRUCTURE: 1.0 | STRUCTURE: Falush et al., 2003; Falush et al., 2007; Pritchard et al., 2000; Hubisz et al., 2009; fastSTRUCTURE: Raj et al., 2014 | | x | x |
VCF | 4.1 (2.8.2012) | Note: without structural variants (only SNPs and INDELs) | SAMtools, BCFtools | x | x |
Note that PGDSpider is currently not meant to convert large NGS files, because it loads the whole input file into memory, which may lead to memory issues. However, PGDSpider can convert specific subsets of these NGS files into any other format, and this approach can be used to perform sliding-window analyses on large NGS files.
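The subset-based approach can be sketched as follows: enumerate fixed-size windows along a contig, convert each window separately, and compute the statistic of interest on each converted subset. The shell function below only generates the window coordinates. The function name `windows`, the `contig:start-end` notation and the example numbers are illustrative assumptions; how a region is actually passed to PGDSpider (through the conversion settings) is not shown here and should be taken from the user manual.

```shell
# Print 1-based, inclusive windows along a contig, one per line.
# Arguments: $1 contig name, $2 contig length, $3 window size, $4 step size.
# windows is a hypothetical helper; the contig:start-end notation is an assumption.
windows() {
    contig=$1
    length=$2
    size=$3
    step=$4
    start=1
    while [ "$start" -le "$length" ]; do
        end=$((start + size - 1))
        # Clip the last window at the end of the contig
        if [ "$end" -gt "$length" ]; then
            end=$length
        fi
        printf '%s:%d-%d\n' "$contig" "$start" "$end"
        start=$((start + step))
    done
}
```

For example, `windows chr1 10 4 3` prints `chr1:1-4`, `chr1:4-7`, `chr1:7-10` and `chr1:10-10`; a step smaller than the window size gives overlapping windows, and the final window is clipped at the contig end.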
Help
If you have any problems:
- read the user manual: PGDSpider manual_v3-0-0-0.pdf
- read the help file, which can be found in the "Config" menu of PGDSpider
Screenshot
PGDSpider GUI:
SPID Editor:
How to cite PGDSpider and License
Copyright (c) 2007-2021, Heidi E.L. Lischer. All rights reserved.
PGDSpider is distributed under the BSD 3-Clause License. For the full text of the license, see the file LICENSE.txt. By using, modifying or distributing this software you agree to be bound by the terms of this license.
Contact and bug report
If you find a bug, please send me an e-mail with a short description of the bug and the input and output file formats. If possible, also attach the input file that caused the problem.
PGDSpider is an ongoing project. For any comments or suggestions for further file formats, please send me an e-mail.
e-mail: heidi.tschanz-lischer(at)bioinformatics.unibe.ch
Heidi Tschanz-Lischer
Interfaculty Bioinformatics Unit (IBU)
University of Berne
3012 Bern
Switzerland
member of the Swiss Institute of Bioinformatics (SIB)
26.04.2021