*************
Release notes
*************

Version 1.50 (2010-05-14)
=========================

Changes
-------

* The contigs from assembly 1.03 were polished using the SOLiD and SOLEXA data. Polishing included single-base error correction and indel correction (mostly in homopolymers).
* Contamination from E. coli and vector sequences was removed. Organellar sequences were separated (and are thus not included in this data set).
* Several structural inconsistencies were resolved.
* Contigs from fully sequenced BACs were integrated.
* Superscaffolds were built using clone-end information (BAC and fosmid ends).

Assembly stats
--------------

* Contigs (only the contigs that make up the scaffolds): 29,736 sequences, 733.0 Mb, 50% of assembly in 2,754 contigs of 69,257 bp or longer
* Scaffolds: 3,584 sequences, 781.2 Mb, 50% of assembly in 27 scaffolds of 7.8 Mb or longer

Version 1.03 (2010-01-22)
=========================

Changes
-------

* During the assembly we screened for E. coli sequences to prevent the E. coli contamination (from the SBM data) found in version 1.00
* Two new 454 runs (3 kb and 20 kb) were added to the assembly
* This assembly was made with an updated version of the assembler

Assembler
---------

* This assembly was made with Newbler version 2.3-PostRelease-01/11/2010

Assembly input
--------------

* A new filtered 454 data set was created because of the addition of 2 new 454 runs (3 kb and 20 kb)

  - 56 million reads, 20.8 Gb, approx. 22.0X coverage

* SBM and clone-end input were the same as in version 1.00 (see below)

Assembly stats
--------------

* Contigs: 110,872 sequences, 762.0 Mb, 50% of assembly in 3,641 contigs of 55,730 bp or longer
* Scaffolds: 3,761 sequences, 781.7 Mb, 50% of assembly in 52 scaffolds of 4.4 Mb or longer

Version 1.00 (2009-11-27)
=========================

Assembler
---------

* This assembly was made with Newbler version 2.3-PostRelease-11/19/2009

Assembly input
--------------

* This assembly contains 454 sequences, Selected BAC clone Mixture (SBM) sequences, and BAC/fosmid end sequences.

  - 454 (filtered): 55 million reads, 20.5 Gb, approx. 21.6X coverage
  - SBM (filtered): 3.8 million reads, 3.1 Gb, approx. 3.3X coverage
  - BAC ends (filtered): 308,490 sequences (incl. 135,271 pairs), 180 Mb, approx. 0.18X coverage
  - Fosmid ends (filtered): 151,299 sequences (incl. 64,722 pairs), 83 Mb, approx. 0.087X coverage

Assembly stats
--------------

* Contigs: 118,692 sequences, 762.5 Mb, 50% of assembly in 4,238 contigs of 47,298 bp or longer
* Scaffolds: 7,409 sequences, 794.6 Mb, 50% of assembly in 50 scaffolds of 4.5 Mb or longer

Raw data
========

* 454

  - Approx. 80 runs, 78.6 million reads, 27.7 Gb, approx. 29.2X coverage
  - Data sequenced with the GS FLX Titanium
  - Approx. 39 runs from WGS libraries, 19 runs from 3 kb libraries, 12 runs from 8 kb libraries, and 10 runs from 20 kb libraries, resulting in approx. 20 Gb from WGS reads and 8.3 Gb from paired-end reads

* SBM (= Selected BAC clone Mixture)

  - 4,039,383 sequences, 4.9 Gb, approx. 5.2X coverage
  - Shotgun sequencing from BAC clone pools, using the BAC end sequences available at the SGN website
  - http://www.kazusa.or.jp/tomato/

* BAC ends: 309,305 sequences, 180 Mb, approx. 0.19X coverage

  - BAC ends come from 3 enzyme libraries (HindIII, MboI, EcoRI)

* Fosmid ends: 151,301 sequences, 83 Mb, approx. 0.08X coverage
* SOLiD: 2 runs 1 kb, 2 runs 5 kb, 2 runs 10 kb, 2 runs 7 kb (forward only)

  - The SOLiD data has not been used in the assembly yet (up to version 1.03)
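
Reading the statistics
======================

The assembly stats above report "50% of assembly in N sequences of X bp or longer", and the input sections report "approx. NX coverage". The sketch below shows one common way such figures can be computed; it is an illustration, not the script used to produce these notes. The function names are hypothetical, the toy lengths are made up, and the genome size of 950 Mb is an assumption inferred from the reported ratios (for example 27.7 Gb at approx. 29.2X); the notes themselves do not state a genome size.

```python
def n50_stats(lengths):
    """Return (count, length): the smallest number of sequences, taken
    longest first, that together hold at least 50% of the total bases,
    and the length of the shortest sequence in that set."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return count, length
    raise ValueError("no sequence lengths given")


def coverage(total_bases, genome_size):
    """Average depth of coverage: sequenced bases per genome base."""
    return total_bases / genome_size


# ASSUMPTION: genome size is not given in these notes; 950 Mb is inferred
# from the reported Gb-to-coverage ratios and is used for illustration only.
ASSUMED_GENOME_SIZE = 950e6

# Toy contig lengths (illustrative only, not taken from the assembly).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50_stats(toy_lengths))  # (2, 70)

# Raw 454 data: 27.7 Gb of sequence.
print(round(coverage(27.7e9, ASSUMED_GENOME_SIZE), 1))  # 29.2
```

Read this way, "50% of assembly in 27 scaffolds of 7.8 Mb or longer" means that the 27 longest scaffolds together contain at least half of the assembled bases, and the shortest of those 27 is 7.8 Mb long.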