The contigs from assembly 1.03 were polished using the SOLiD and Solexa data. Polishing included single-base error correction and indel correction (mostly in homopolymer runs).
Contamination from E. coli and vector sequences was removed. Organellar sequences were separated and are therefore not included in this data set.
Several structural inconsistencies were resolved.
Contigs from fully sequenced BACs were integrated.
Superscaffolds were built using clone-end information (BAC and fosmid ends).
Assembly stats
Contigs (only the contigs that make up the scaffolds): 29,736 sequences, 733.0 Mb, 50% of assembly in 2,754 contigs of 69,257 bp or longer
Scaffolds: 3,584 sequences, 781.2 Mb, 50% of assembly in 27 scaffolds of 7.8 Mb or longer
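The "50% of assembly in N sequences of X bp or longer" figures correspond to what is commonly called the N50 statistic (with N being the sequence count, often called L50). The sketch below shows one common way to compute it from a list of sequence lengths. The function name and the toy lengths are illustrative assumptions, and the assembler's own reporting may handle edge cases differently.

```python
def n50(lengths):
    """Return (count, length): the fewest sequences, taken longest first,
    whose combined length reaches half of the total, and the length of
    the shortest sequence in that set."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return count, length


# Toy lengths in bp (hypothetical, not taken from this assembly).
toy = [10, 8, 6, 4, 2]
print(n50(toy))  # (2, 8): 10 + 8 = 18 reaches half of the 30 bp total
```

Read this way, the scaffold line above says that the 27 longest scaffolds, each at least 7.8 Mb long, together hold half of the 781.2 Mb.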
Version 1.03 (2010-01-22)
Changes
During assembly, we screened for E. coli sequences to prevent the E. coli contamination (from the SBM data) that was found in version 1.00.
Two new 454 runs (3 Kb and 20 Kb) were added to the assembly.
This assembly was made with an updated version of the assembler.
Assembler
This assembly was made with Newbler version 2.3-PostRelease-01/11/2010
Assembly input
A new filtered 454 data set was created to include the two new 454 runs (3 Kb and 20 Kb)
56 million reads, 20.8 Gb, approx. 22.0X coverage
SBM and clone-end input were the same as in version 1.00 (see below)
Assembly stats
Contigs: 110,872 sequences, 762.0 Mb, 50% of assembly in 3,641 contigs of 55,730 bp or longer
Scaffolds: 3,761 sequences, 781.7 Mb, 50% of assembly in 52 scaffolds of 4.4 Mb or longer
Version 1.00 (2009-11-27)
Assembler
This assembly was made with Newbler version 2.3-PostRelease-11/19/2009
Assembly input
This assembly contains 454 sequences, Selected BAC clone Mixture (SBM) sequences, and BAC/fosmid end sequences.
454 (filtered): 55 million reads, 20.5 Gb, approx. 21.6X coverage
SBM (filtered): 3.8 million reads, 3.1 Gb, approx. 3.3X coverage
Approx. 39 runs from WGS libraries, 19 runs from 3 Kb libraries, 12 runs from 8 Kb libraries, and 10 runs from 20 Kb libraries, resulting in approx. 20 Gb from WGS reads and 8.3 Gb from paired-end reads
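The coverage figures in these notes can be cross-checked with simple arithmetic. The notes do not state the genome size used for the calculation. Assuming coverage is total filtered bases divided by genome size, the sketch below back-calculates the genome size implied by each data set. The function names are illustrative, and the input figures are the rounded values quoted in these notes.

```python
def coverage(total_bases, genome_size):
    """Fold coverage, assuming coverage = total bases / genome size."""
    return total_bases / genome_size


def implied_genome_size(total_bases, coverage_x):
    """Genome size implied by a reported base count and fold coverage."""
    return total_bases / coverage_x


# (total bases, reported coverage) as quoted in the release notes.
inputs = {
    "454, version 1.00": (20.5e9, 21.6),
    "SBM, version 1.00": (3.1e9, 3.3),
    "454, version 1.03": (20.8e9, 22.0),
}

for name, (bases, cov) in inputs.items():
    size_mb = implied_genome_size(bases, cov) / 1e6
    print(f"{name}: implied genome size about {size_mb:.0f} Mb")
```

All three data sets imply a genome size of roughly 0.94 to 0.95 Gb. The small differences between them are consistent with rounding in the reported figures.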