There are 3 different protein datasets for a unigene build.

 + Proteins predicted by longest6frame (an SGN script that translates the sequence in the 6 reading frames and keeps the longest)
 + Proteins predicted by estscan (http://www.ch.embnet.org/software/ESTScan2.html)
 + Preferred proteins (for each unigene, both methods are compared and the longest protein is kept)

Each dataset has two files (cds and protein).
A version of the preferred protein dataset with annotations compatible with the ProteinPilot program is also provided.

----------------------------------------
Report:
----------------------------------------

Files:

  * cds sequences:
      - cds fasta file: Antirrhinum_majus_cds_predicted_by_longest6frame.v1.fasta
      - number of sequences: 13722
      - total bases: 6506067
      - average sequence length: 474
      - maximum sequence length: 3690
      - minimum sequence length: 18

  * protein sequences:
      - protein fasta file: Antirrhinum_majus_protein_predicted_by_longest6frame.v1.fasta
      - number of sequences: 13722
      - total amino acids: 2162747
      - average sequence length: 157
      - maximum sequence length: 1230
      - minimum sequence length: 6

  * cds sequences:
      - cds fasta file: Antirrhinum_majus_cds_predicted_by_estscan.v1.fasta
      - number of sequences: 12054
      - total bases: 6771951
      - average sequence length: 561
      - maximum sequence length: 3810
      - minimum sequence length: 51

  * protein sequences:
      - protein fasta file: Antirrhinum_majus_protein_predicted_by_estscan.v1.fasta
      - number of sequences: 12054
      - total amino acids: 2266044
      - average sequence length: 187
      - maximum sequence length: 1272
      - minimum sequence length: 17

  * cds sequences:
      - cds fasta file: Antirrhinum_majus_cds_predicted_by_preferred.v1.fasta
      - number of sequences: 12666
      - total bases: 7058067
      - average sequence length: 557
      - maximum sequence length: 3810
      - minimum sequence length: 18

  * protein sequences:
      - protein fasta file: Antirrhinum_majus_protein_predicted_by_preferred.v1.fasta
      - number of sequences: 12666
      - total amino acids: 2356967
      - average sequence length: 186
      - maximum sequence length: 1272
      - minimum sequence length: 6

----------------------------------------
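
Example (illustrative only, not the SGN pipeline code):

The "preferred" rule described at the top of this file (for each unigene, keep the longer of the two predicted proteins) can be sketched in Python as follows. This is a minimal sketch under several assumptions: the function names read_fasta and pick_preferred are hypothetical, the first word of each FASTA header is assumed to be the unigene identifier shared by both files, ties are assumed to go to longest6frame, and a unigene predicted by only one method is assumed to be kept. The sketch does not try to reproduce the sequence counts in the report above, so the real pipeline may apply rules that are not described here.

```python
def read_fasta(lines):
    """Parse an iterable of FASTA lines into a dict {sequence_id: sequence}.

    Assumption: the first whitespace-delimited word after '>' is the
    unigene identifier.
    """
    records = {}
    current_id = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current_id = line[1:].split()[0]
            records[current_id] = ""
        elif current_id is not None:
            records[current_id] += line
    return records


def pick_preferred(longest6frame, estscan):
    """For each unigene, keep the longer protein of the two predictions.

    Assumptions (not stated in this README): ties go to longest6frame,
    and a unigene present in only one dataset is kept as-is.
    Returns {unigene_id: (method_name, protein_sequence)}.
    """
    preferred = {}
    for unigene_id in sorted(set(longest6frame) | set(estscan)):
        by_frame = longest6frame.get(unigene_id, "")
        by_estscan = estscan.get(unigene_id, "")
        if len(by_estscan) > len(by_frame):
            preferred[unigene_id] = ("estscan", by_estscan)
        else:
            preferred[unigene_id] = ("longest6frame", by_frame)
    return preferred


if __name__ == "__main__":
    # File names are the ones listed in the report above.
    with open("Antirrhinum_majus_protein_predicted_by_longest6frame.v1.fasta") as fh:
        frame_proteins = read_fasta(fh)
    with open("Antirrhinum_majus_protein_predicted_by_estscan.v1.fasta") as fh:
        estscan_proteins = read_fasta(fh)
    chosen = pick_preferred(frame_proteins, estscan_proteins)
    for unigene_id, (method, protein) in chosen.items():
        print(unigene_id, method, len(protein))
```

Returning the method name next to the sequence makes it easy to check, per unigene, which predictor supplied the preferred protein.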