CUSTOMIZED TARGET-DECOY DATABASE OF NICOTIANA BENTHAMIANA PROTEOME

0. AUTHORS

Sergio Ciordia and Fabio Pasin [Centro Nacional de Biotecnología (CNB-CSIC), Madrid, Spain]

1. BACKGROUND

In proteomic analysis software, long descriptions or special characters in sequence annotations can be a source of database search errors. Searching a concatenated 'target-decoy' sequence list [Elias et al., Nat Methods. 2005, 2(9):667-75; Elias and Gygi, Methods Mol Biol. 2010, 604:55-71] is a widely used way to estimate false positive identifications.

We provide a minimal Nicotiana benthamiana sequence database to overcome database search errors. For decoy composite searches and false discovery rate (FDR) estimations, all sequences are also included in reversed orientation. Sequence descriptions of target proteins can be retrieved using the attached Excel document, which facilitates the evaluation of results.

2. FILE LIST

The database "Niben.genome.v0.4.4.proteins.annotated.fasta" [Bombarely et al., Mol Plant Microbe Interact. 2012 Dec;25(12):1523-30], which contains predicted proteins from a draft N. benthamiana nuclear genome sequence, was used as input for the following files.

2.1 Niben.v0.4.4.TargetDecoy.fasta

Sequence file to be used in MS/MS ion searches and off-line FDR estimations. Only N. benthamiana protein sequences and their identifiers were maintained. N. tabacum plastid and mitochondrial proteomes were included [GenBank: NC_001879.2; GenBank: NC_006581.1]. In this concatenated target-decoy database, all protein sequences were also included in the reverse orientation, and the identifiers of the reversed sequences were marked with the "_REVERSED" text flag.

2.2 Niben.v0.4.4.TargetDecoy_Annotations.xlsm

Protein annotation file to retrieve full-length descriptions of MS/MS identified proteins. Macros must be enabled.

3. REFERENCES

Using the described approach, we performed iTRAQ proteomic experiments on N. benthamiana samples. Results were presented in the following study (which might be used as reference):

Pasin F, Simón-Mateo C, García JA. The hypervariable amino-terminus of P1 protease modulates potyviral replication and host defense responses. PLoS Pathog. 2014 Mar 6;10(3):e1003985. doi: 10.1371/journal.ppat.1003985.

The CNB-CSIC proteomics facility provides scientific assistance on proteomic experimental design and analysis. A list of services can be found at: http://proteo.cnb.csic.es/proteomica/

If you have any questions or comments about these documents, or service requests, please contact:
Sergio Ciordia [sciordia@cnb.csic.es]
Fabio Pasin [fpasin@cnb.csic.es]

4. CUSTOMIZING

To customize the provided files (i.e., addition of new entries, update of protein descriptions):
- Add the protein sequence of interest, in FASTA format, to the database file "2.1". Also add its reversed sequence, with the "_REVERSED" text flag added to the identifier of the reversed sequence;
- Update the annotation file "2.2" by adding the new protein identifier and its description to the "2.DB_Annotations" spreadsheet.
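
The first customizing step (adding a target entry plus its reversed decoy to the database file) could be scripted roughly as follows. This is only a minimal sketch and not the procedure used to build the provided files. It assumes that the "_REVERSED" flag is appended to the end of the identifier and that sequences are wrapped at 60 characters per line; neither detail is stated in this document, so check the existing entries before use. The function names and the identifier "MyProtein01" are hypothetical.

```python
DECOY_FLAG = "_REVERSED"  # text flag from section 2.1; placing it as a suffix is an assumption
LINE_WIDTH = 60           # assumed FASTA line width, not specified in this README


def format_fasta(identifier: str, sequence: str) -> str:
    """Return one FASTA record, with the sequence split into fixed-width lines."""
    lines = [sequence[i:i + LINE_WIDTH] for i in range(0, len(sequence), LINE_WIDTH)]
    return ">" + identifier + "\n" + "\n".join(lines) + "\n"


def target_decoy_records(identifier: str, sequence: str) -> str:
    """Return the target record followed by its reversed (decoy) record."""
    # Remove whitespace and line breaks that may come from a pasted sequence.
    sequence = "".join(sequence.split()).upper()
    target = format_fasta(identifier, sequence)
    decoy = format_fasta(identifier + DECOY_FLAG, sequence[::-1])
    return target + decoy


def append_entry(fasta_path: str, identifier: str, sequence: str) -> None:
    """Append a target/decoy pair to an existing FASTA file.

    Assumes the file already ends with a newline; otherwise the new
    header would be joined to the last sequence line.
    """
    with open(fasta_path, "a", encoding="utf-8") as handle:
        handle.write(target_decoy_records(identifier, sequence))
```

For example, target_decoy_records("MyProtein01", "MSTKLV") returns a record for "MyProtein01" with sequence MSTKLV, followed by a record for "MyProtein01_REVERSED" with sequence VLKTSM. The annotation file "2.2" still has to be updated by hand in the "2.DB_Annotations" spreadsheet.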