README for ATH_GO annotation and GOSLIM file.

The ATH_GO document is a tab-delimited file containing GO annotations for Arabidopsis genes annotated by TAIR and TIGR with terms from the Gene Ontology Consortium controlled vocabularies (see www.geneontology.org). This file includes an updated set of literature-based annotations and >40,000 electronic annotations based upon matches to INTERPRO domains supplied by Nicola Mulder from SWISS PROT/INTERPRO.

Ontologies and annotations can also be searched/browsed at: http://godatabase.org/cgi-bin/go.cgi
You can search this file by locus name online at: http://godot.ncgr.org/tools/bulk/go/

This file is updated on a weekly basis.

Please cite this paper when using TAIR's GO annotations in your research:

Berardini, TZ, Mundodi, S, Reiser, R, Huala, E, Garcia-Hernandez, M, Zhang, P, Mueller, LM, Yoon, J, Doyle, A, Lander, G, Moseyko, N, Yoo, D, Xu, I, Zoeckler, B, Montoya, M, Miller, N, Weems, D, and Rhee, SY (2004) Functional annotation of the Arabidopsis genome using controlled vocabularies. Plant Physiol. 135(2):1-11.

Column headers: explanation

1. locus name: standard AGI convention name.
2. TAIR accession: the unique identifier for an object in the TAIR database. The object type is the prefix, followed by a unique accession number (e.g. gene:12345).
3. object name: the name of the object (gene, protein, locus) being annotated.
4. GO term: the actual string of letters corresponding to the GO ID.
5. GO ID: the unique identifier for a GO term.
6. TAIR Keyword ID: the unique identifier for a keyword in the TAIR database.
7. Aspect: F=molecular function, C=cellular component, P=biological process.
8. GOslim term: high-level GO term that helps in functional categorization.
9. Evidence code: three-letter code for evidence types (see: http://www.geneontology.org/GO.evidence.html).
10. Reference: either a TAIR accession for a reference (reference table: reference_id) or a reference from PubMed (e.g. PMID:1234).
11. Annotating database: TAIR or TIGR.
12. Date annotated: date the annotation was made.

Reference explanations:

1. publication or PMID: For literature-based annotations, the associated reference is given for either PubMed (PMID) or TAIR (publication:reference_id).
2. communication:1345790: Used for annotations to unknown function, process or component when no data were available from publications to annotate to any specific function, process or component. Used with ND or NAS evidence codes.
3. For electronic annotations (IEA evidence code) the references are as follows:

   AnalysisReference:1144958: refers to annotations made by regular expression matching of molecular function GO terms to the definition lines for genes supplied in the July 15 release of the Arabidopsis genome sequence from TIGR.

   AnalysisReference:1346599: refers to annotations made by matching Arabidopsis proteins in the SwissProt database to INTERPRO domains and mapping the INTERPRO domains to GO terms. The INTERPRO scan and mapping were generously provided by Nicola Mulder from Swiss Prot. The Interpro to GO mapping was also created by Nicola Mulder.

   AnalysisReference:1346600: similar to 1346599, but done using the ATH1.pep protein sequences from TIGR's August 2001 release.

   AnalysisReference:1445901: refers to annotations made during the creation of the AraCyc database using Pathologic Software. A hand-edited file of enzyme names, based on TIGR annotations, was used as the input parameter and the output was manually curated.

_______________________________________________________________________________
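The 12-column layout described above can be read with a short script. The following is a minimal Python sketch, not an official TAIR tool. The field names (locus_name, go_id, and so on), the helper functions, and the sample row are all hypothetical labels and placeholder values chosen for illustration; the sketch also assumes one annotation per line with no header row, and it skips any line that does not have exactly 12 tab-separated fields.

```python
import csv
import io
from collections import namedtuple

# Illustrative labels for the 12 columns listed in this README.
# These names are assumptions; the file itself does not define them.
FIELDS = [
    "locus_name", "tair_accession", "object_name", "go_term", "go_id",
    "tair_keyword_id", "aspect", "goslim_term", "evidence_code",
    "reference", "annotating_database", "date_annotated",
]
Annotation = namedtuple("Annotation", FIELDS)

# Aspect codes as given in the column description (column 7).
ASPECTS = {
    "F": "molecular function",
    "C": "cellular component",
    "P": "biological process",
}


def read_annotations(handle):
    """Yield one Annotation per tab-delimited line that has 12 columns."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != len(FIELDS):
            continue  # assumption: blank or malformed lines are skipped
        yield Annotation(*(cell.strip() for cell in row))


def is_electronic(annotation):
    """True for electronic annotations (IEA evidence code)."""
    return annotation.evidence_code == "IEA"


if __name__ == "__main__":
    # Placeholder row with made-up values, only to show the column order.
    sample = (
        "AT1G01010\tgene:12345\tAT1G01010\texample GO term\tGO:0000000\t"
        "0000\tF\texample GOslim term\tIEA\tAnalysisReference:1346599\t"
        "TAIR\tYYYY-MM-DD\n"
    )
    for ann in read_annotations(io.StringIO(sample)):
        print(ann.locus_name, ann.go_id, ASPECTS[ann.aspect], is_electronic(ann))
```

To read a real copy of the file, pass an open file handle instead of the in-memory sample, for example open(path, newline="") where path is wherever the file was saved.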