Final Assembly files

Final assemblies now available:

Genome Final assemblies
Staphylococcus aureus
Download all assemblies for this genome
Rhodobacter sphaeroides
Download all assemblies for this genome
H. sapiens Chr 14
Download all assemblies for this genome
Bombus impatiens
Download all assemblies for this genome


Download the analysis scripts used for validation (last updated 04/02/12). These scripts require MUMmer 3.23 or higher. The scripts assume that your reference file name ends in a standard FASTA extension ("fasta" or "fa") and that your assembly file names (contigs, scaffolds) end in "contig", "scafSeq", "fa", "fna", "fasta", or "final". If your files have a different extension, please rename them before running the scripts. To run an analysis:

  • Download MUMmer 3.23, available here
  • Run: tar xvzf MUMmer3.23.tar.gz
  • Run: cd MUMmer3.23
  • Run: make
  • Run: tar xvzf gage-validation.tar.gz
  • Run: sh <Reference Fasta> <Contig Fasta> <Scaffold Fasta>
  • To evaluate assembly contiguity without a reference, run: java GetFastaStats -o -min 200 -genomeSize <Genome Expected Size> <Contig Fasta/Scaffold Fasta>
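
For readers who want to see what a reference-free contiguity statistic looks like, the following is a minimal Python sketch of an N50 calculation with a minimum sequence length and an optional expected genome size. It is not the GetFastaStats implementation, and its output is not guaranteed to match that tool. The function names (read_fasta_lengths, n50) and the choice to return 0 when half the genome size is never reached are assumptions made for illustration.

```python
def read_fasta_lengths(lines):
    """Yield the length of each sequence found in FASTA-formatted lines."""
    length = None  # None until the first header line is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths, min_length=200, genome_size=None):
    """Return the N50 of the sequences that are at least min_length long.

    If genome_size is given, the halfway target is measured against it
    instead of the total assembly length. Returns 0 if the target is
    never reached (an illustrative convention, not taken from GAGE).
    """
    kept = sorted((n for n in lengths if n >= min_length), reverse=True)
    total = genome_size if genome_size is not None else sum(kept)
    target = total / 2
    running = 0
    for n in kept:
        running += n
        if running >= target:
            return n
    return 0
```

With an assembly file opened as `handle`, a call such as `n50(read_fasta_lengths(handle), min_length=200, genome_size=expected_size)` mirrors the intent of the `-min` and `-genomeSize` options above, where `expected_size` is a value you supply.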

Download an updated version of the analysis scripts with minor changes to scaffold validation (last updated 11/8/12).
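
Because the scripts rely on file extensions to recognize their inputs, it can help to check file names before starting a run. The sketch below is a hypothetical helper, not part of the distributed scripts. It assumes that each extension is preceded by a dot and that matching is case-sensitive; neither detail is stated on this page.

```python
# Extensions listed in the instructions above.
REFERENCE_EXTENSIONS = ("fasta", "fa")
ASSEMBLY_EXTENSIONS = ("contig", "scafSeq", "fa", "fna", "fasta", "final")


def has_allowed_extension(filename, allowed):
    """Return True if filename ends in '.<ext>' for one of the allowed extensions.

    Assumption: a dot separates the extension and matching is case-sensitive.
    """
    return any(filename.endswith("." + ext) for ext in allowed)


def check_inputs(reference, contigs, scaffolds):
    """Return the file names that would need renaming before a run."""
    problems = []
    if not has_allowed_extension(reference, REFERENCE_EXTENSIONS):
        problems.append(reference)
    for assembly in (contigs, scaffolds):
        if not has_allowed_extension(assembly, ASSEMBLY_EXTENSIONS):
            problems.append(assembly)
    return problems
```

An empty list from `check_inputs` means all three names use one of the listed extensions; any name returned should be renamed first.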

Final analysis now available:

03/30/2012: The SOAPdenovo scaffold validation results were updated to correct a mistake affecting the two bacterial genomes. The correction reduced the scaffold error counts and increased the corrected N50 for both bacterial genomes.

Thank you to Francesco Vezzi for reporting the discrepancy.