The Dlugosch Lab @ The University of Arizona
The Dlugosch Lab webpage has moved to a new site! Software downloads are available there with their associated publications. Follow this link to the new lab page at medium.com/dlugosch-lab


evopipes.net
Bioinformatics software and scripts from the Dlugosch Lab are made available through EvoPipes.net, a resource for evolutionary and ecological bioinformatics. These include the following:

AllelePipe
DOWNLOAD HERE
AllelePipe takes in assembled sequence contigs from one or more individuals and passes them through the following steps to extract allelic variation at putative individual loci:

1) Similarity is assessed among all sequences from all individuals using SSAHA2 according to user-defined minimum similarity and alignment length thresholds.

2) Alignment throughout the region of overlap is verified.

3) Sequences are clustered by either single-linkage clustering or MCL as desired, with the option of re-starting the clustering with alternative methods/granularities.

4) Multiple alignments are created for the sequences within each cluster, and a consensus sequence is generated for each, using CAP3. A single consensus genomic reference FASTA file is generated for the whole dataset, which can be reused in other analyses.

5) Optionally, putatively chimeric clusters are removed, assuming that these are clusters where only one sequence bridges an internal region of the multiple alignment. This step is only appropriate for datasets with many individuals and good coverage of loci, where many sequences should be aligning across the length of each locus.

6) SNPs (currently excluding indels) are identified using ssahaSNP against the reference sequence for the same or different sets of individuals, as desired. The program can be restarted from this step for additional analyses with different parameters and/or new individuals.

7) Clusters are sorted as being single or multi-locus, based upon user settings for the maximum number of alleles allowed per individual.
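
To make steps 3 and 7 more concrete, here is a minimal Python sketch of single-linkage clustering followed by single/multi-locus sorting. It is an illustration only, not AllelePipe's own code: the function names, the `max_alleles` parameter, and the toy data are all hypothetical, and the pairwise hits are assumed to have already passed the similarity and alignment length thresholds from step 1.

```python
from collections import Counter, defaultdict


def single_linkage_clusters(seq_ids, hits):
    """Group sequences so that any two linked by a chain of hits share a cluster."""
    parent = {s: s for s in seq_ids}

    def find(x):
        # Walk up to the cluster root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = defaultdict(list)
    for s in seq_ids:
        clusters[find(s)].append(s)
    return sorted(sorted(members) for members in clusters.values())


def classify_cluster(members, individual_of, max_alleles=2):
    """Call a cluster multi-locus if any individual exceeds max_alleles sequences."""
    counts = Counter(individual_of[s] for s in members)
    return "single-locus" if max(counts.values()) <= max_alleles else "multi-locus"


# Hypothetical toy data: sequence ID -> individual, plus pairwise hits.
individual_of = {
    "a1": "indA", "a2": "indA", "a3": "indA",
    "b1": "indB", "b2": "indB", "c1": "indC",
}
hits = [("a1", "a2"), ("a2", "b1"), ("a3", "b1"), ("b2", "c1")]

clusters = single_linkage_clusters(list(individual_of), hits)
labels = [classify_cluster(c, individual_of, max_alleles=2) for c in clusters]
for members, label in zip(clusters, labels):
    print(label, members)
```

In this toy example the first cluster holds three sequences from one individual, so with a maximum of two alleles per individual (as for a diploid) it is flagged as multi-locus.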
Citation:
Dlugosch KM, Lai Z, Bonin A, Hierro J, Rieseberg LH.  2013. Allele identification for transcriptome-based population genomics in the invasive plant Centaurea solstitialis. G3 3: 359-367.
SnoWhite
DOWNLOAD HERE
SnoWhite is a pipeline designed to flexibly and aggressively clean sequence reads (gDNA or cDNA) prior to assembly. It takes in and returns FASTQ- or FASTA-formatted sequence files.

The pipeline employs several steps that can be turned on and off as desired. Briefly, these are:
  • File splitting or multiplexed barcode parsing
  • Quality trimming
  • End clipping
  • TagDust filtering
  • SeqClean trimming
  • PolyA trimming
  • Conversion between FASTQ and FASTA
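
As a rough illustration of what three of these steps do (quality trimming, polyA trimming, and FASTQ to FASTA conversion), here is a minimal Python sketch. It is not SnoWhite's implementation, and steps such as TagDust filtering and SeqClean trimming are not represented. All function names, thresholds (`min_q`, `min_run`, `min_len`), and the toy reads are hypothetical, and Phred+33 quality encoding is assumed.

```python
def parse_fastq(lines):
    """Yield (name, sequence, quality) from four-line FASTQ records."""
    records = [line.rstrip("\n") for line in lines if line.strip()]
    for i in range(0, len(records) - 3, 4):
        header, seq, _plus, qual = records[i:i + 4]
        yield header[1:], seq, qual


def trim_low_quality_tail(seq, qual, min_q=20, offset=33):
    """Drop 3' bases until the last remaining base has quality >= min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def trim_poly_a_tail(seq, qual, min_run=6):
    """Drop a trailing run of A's if it is at least min_run bases long."""
    stripped = seq.rstrip("Aa")
    if len(seq) - len(stripped) >= min_run:
        return stripped, qual[:len(stripped)]
    return seq, qual


def clean_to_fasta(fastq_lines, min_len=10):
    """Apply both trimming steps, discard short reads, and return FASTA text."""
    out = []
    for name, seq, qual in parse_fastq(fastq_lines):
        seq, qual = trim_low_quality_tail(seq, qual)
        seq, qual = trim_poly_a_tail(seq, qual)
        if len(seq) >= min_len:
            out.append(f">{name}\n{seq}")
    return "\n".join(out)


# Hypothetical toy reads: read1 has a polyA tail followed by two low-quality
# bases; read2 is too short to keep once its low-quality tail is removed.
toy_fastq = [
    "@read1", "ACGTACGTACGTC" + "A" * 8 + "TT", "+", "I" * 21 + "##",
    "@read2", "ACGTAC", "+", "IIII##",
]
fasta = clean_to_fasta(toy_fastq)
print(fasta)
```

The order of operations matters here: the low-quality bases are removed first so that the polyA run is exposed at the end of the read before it is trimmed.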
    
NOTE: Version 2.x involves a major overhaul to handle both FASTQ and FASTA input file types, as well as a larger variety of common functions, such as splitting via barcode.

Citation:
Dlugosch KM, Lai Z, Bonin A, Hierro J, Rieseberg LH.  2013. Allele identification for transcriptome-based population genomics in the invasive plant Centaurea solstitialis. G3 3: 359-367.

SCARF
Scaffolded and Corrected Assembly of Roche 454
Designed especially for assembling 454 EST sequences against high quality reference sequences from related species.

Citation:
Barker MS*, KM Dlugosch*, ACC Reddy, SN Amyotte & LH Rieseberg (2009)  SCARF: Maximizing next-generation EST assemblies for evolutionary and population genomic analyses.  Bioinformatics 25: 535-536.  *co-first authors
NU-IN
DOWNLOAD HERE
Neutral evolution and input genome module for EvolSimulator2.1.0

Citation:
Dlugosch KM, MS Barker & LH Rieseberg  (2010)  NU-IN: Nucleotide evolution and input module for the EvolSimulator genome simulation platform.  BMC Research Notes 3: 217.


All contents © copyright 2011-2015 Katrina M Dlugosch