
Forum Topic: Annotation file Solanum tuberosum ITAG/TAIR

Hello, I would like to access the potato annotation file in ITAG nomenclature together with the equivalent TAIR code for each gene. Is that possible? I cannot find an annotation file that includes this information. I need to match the nomenclature accepted by DAVID (the Database for Annotation, Visualization and Integrated Discovery). For example, in the PGSC annotation available in Phytozome: PGSC0003DMG400000001 = AT1G12600.1

This topic was started by Stephanie Riquelme.
Posted by Cabal Owen on 2022-10-17 10:08:55
Potato (Solanum tuberosum L.) is an important staple crop with a highly heterozygous and
complex genome. Potato improvement efforts have been held back by the relative lack of genetic
resources available to producers and breeders. This work has focused on expanding the available
genomic and transcriptomic resources for potato, specifically by predicting gene regulatory
mechanisms involved in the response to nitrogen (N) supplementation and by assembling two draft
genomes for the potato landraces S. tuberosum subsp. andigena and S. stenotomum subsp. goniocalyx.
The response to N supplementation is important for potato production because insufficient
N can have negative impacts on yield and tuber quality while excessive N can be harmful to the
environment. In total, thirty genes were found to be consistently over-expressed and nine genes
were found to be consistently under-expressed in potatoes from three different cultivars (Shepody,
Russet Burbank, and Atlantic) grown in fields with supplemented N. The 1000 nt upstream
flanking regions of N responsive genes were analyzed and nine overrepresented motifs were found
using three motif discovery algorithms (Seeder, Weeder and MEME). These putative regulatory
motifs could be key to understanding the genetic response to N supplementation in commercial
potato cultivars.
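The upstream-region step described above can be illustrated with a minimal Python sketch. It assumes 1-based, inclusive gene coordinates (as in GFF3) and a chromosome held in memory as a string; the function names and the toy sequence are hypothetical and are not taken from the original analysis.

```python
# Hypothetical helper names; a sketch, not the pipeline used in the study.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(chrom_seq, gene_start, gene_end, strand, flank=1000):
    """Return up to `flank` nt upstream of a gene, read 5'->3' on the gene's strand.

    gene_start and gene_end are 1-based and inclusive (GFF3 convention).
    The region is truncated at the chromosome ends.
    """
    if strand == "+":
        # Upstream lies before gene_start on the forward strand.
        lo = max(0, gene_start - 1 - flank)
        return chrom_seq[lo:gene_start - 1]
    # Minus strand: upstream lies after gene_end on the forward strand,
    # so take that slice and flip it to the gene's orientation.
    return reverse_complement(chrom_seq[gene_end:gene_end + flank])


if __name__ == "__main__":
    toy_chrom = "ATGCATTTGG"  # made-up sequence for illustration only
    print(upstream_region(toy_chrom, 5, 7, "+", flank=3))
    print(upstream_region(toy_chrom, 2, 4, "-", flank=3))
```

Handling the minus strand explicitly matters because motif finders read their input 5'->3'; passing forward-strand sequence for a minus-strand gene would present the motif as its reverse complement.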
Genome re-sequencing data from two potato landraces (S. tuberosum subsp. andigena and
S. stenotomum subsp. goniocalyx) were used to identify structural variation relative to the
potato reference genome. Using copy number variation (CNV) detection software, a significant
number of deletions and duplications were identified in both landraces, affecting genes with
functions ranging from carbohydrate metabolism to disease resistance. Additionally, draft
genomes were assembled de novo for each landrace, providing evidence for large-scale structural
variation between the two landraces. A number of putative novel sequences that are currently not
included in the potato reference genome were also discovered in these two potato varieties. While
significant work remains to improve the assembled genomes for subsp. andigena and goniocalyx,
this study provides evidence that structural variation in these wild potato species merits further

Potato (Solanum tuberosum L.) is widely recognized as the most important non-grain staple crop
worldwide. The latest FAO statistics indicate that over 380 million tonnes of potatoes were
produced in 2014 alone (Food and Agriculture Organization 2016), illustrating its international
economic and agricultural importance. Potato is a member of the Solanaceae family, which
includes other significant agricultural species such as tomato, pepper, and tobacco. The cultivated
forms of potato are vegetatively propagated and are predominantly autotetraploids (2n = 4x = 48).
However, ploidy ranges from diploid to hexaploid in cultivated potato (Hawkes 1990; for a
review of potato genetic diversity, see Machida-Hirano 2015). Potatoes were domesticated in the
Andes approximately 10,000 years ago and the landraces have a wide variety of shapes, skin and
tuber colors, often not seen in modern varieties (Ovchinnikova et al. 2011). In the Andes, it is
fairly common for landraces of all ploidy levels to be grown in the same field and near
wild relatives, which facilitates cross-hybridization and gene flow (Huamán & Spooner 2002).
Potatoes are valued for their nutritious properties and their wide eco-geographical range.
However, due to their high heterozygosity, complex polysomic inheritance, and narrow genetic
base, they are difficult to improve through classical breeding methods. Because they are typically
vegetatively propagated, many modern cultivars are separated by only a few meiotic generations
(Gebhardt et al. 2004; Simko et al. 2006), which keeps genetic diversity among cultivars very low.
They are also susceptible to many pests and suffer from acute inbreeding depression.
The scientific and economic importance of potato is not new. While other crops such as
maize and wheat have seen great increases in yield as a consequence of genetic improvement in
the last few decades, this has not been the case with potato. Instead, evidence suggests that yield
increases are mostly due to improved agricultural practices. The majority of cultivated potato still
comes from a narrow group of cultivars, including Russet Burbank, which was originally released
in 1874 (Douches et al. 1996; Iovene et al. 2013). While many more recent cultivars have been
released since the late 1800s, these have been bred mostly based on phenotypic selection, not
genetic information, and they have been developed with a very particular use in mind, such as
processing for the potato chip or the French fry industries (Hirsch et al. 2013). Worldwide demand
for potato is increasing; therefore, scientists have begun to study potato genetics with the hope that
it can provide breeders with more tools to aid crop improvement in terms of yield and disease resistance.
Until recently, the genomic understanding of this crop was held back by its relatively
complex genome. The challenges associated with potato improvement have prompted a number
of significant genomic and transcriptomic studies in this species and its close relatives, which will
provide tools for breeders and additionally shed light on the mechanisms behind important molecular
processes. In 2011, the first potato reference genome and transcriptome were published (Massa et
al. 2011; The Potato Genome Sequencing Consortium 2011), and two years later, an update was
released substantially improving the scaffolds and pseudomolecules of the initial reference
(Sharma et al. 2013). Recently, the first draft genome of a wild potato species, Solanum
commersonii, was also released (Aversano et al. 2015), in addition to many other genome
sequencing efforts in related species, such as tomato (Solanum lycopersicum; The Tomato Genome
Consortium 2012), chili pepper (Capsicum annuum; Kim et al. 2014), tobacco (Nicotiana
tabacum; Sierro et al. 2014) and the parental genomes of petunia (Petunia axillaris and Petunia
integrifolia; Bombarely et al. 2016) which collectively have also provided valuable information
on potato.
The potato reference genome is also a starting point for the exploration of biodiversity
between potato cultivars and subspecies. Using genome re-sequencing, it is now possible to
assemble separate genomes as a reference for specific varieties. These new assemblies can provide
useful information about the structural differences between different potato subspecies (Solanum
tuberosum subsp. andigena, S. stenotomum subsp. goniocalyx, S. stenotomum subsp. stenotomum
and S. tuberosum subsp. tuberosum; species definition from Hawkes 1990) from Single
Nucleotide Polymorphisms (SNPs) and Copy Number Variation (CNV) to large-scale structural
variation. Indeed, recent research is already pointing to significant differences in gene copy
number between different potato populations (Hardigan et al. 2016).
The main focus of this work is to continue to build upon the current foundation of potato
genomics and transcriptomics studies by exploring the potential regulatory mechanisms behind the
long-term response to N supplementation in field-grown potatoes, as well as the genomic
differences between the potato reference genome and two potato landraces.

1. Three potato cultivars (Shepody, Russet Burbank, and Atlantic) share a group of common genes
that are responsive to differences in N supplementation.
2. Three potato cultivars (Shepody, Russet Burbank, and Atlantic) share overrepresented motifs
in the upstream flanking regions of N responsive genes.
3. The genome of the potato landrace S. tuberosum subsp. andigena has significant CNVs, novel
sequences and structural variants when compared to the potato reference genome.
4. The genome of the potato landrace S. stenotomum subsp. goniocalyx has significant CNVs,
novel sequences and structural variants when compared to the potato reference genome.
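As an illustration of what "a group of common genes" means in the first item above, the following Python sketch keeps only genes that are called N responsive in every cultivar and that change in the same direction in all of them. The input layout (one dictionary of log2 fold changes per cultivar, already filtered for significance) and the gene identifiers are assumptions made for this example, not the format used in the study.

```python
def consistent_genes(de_by_cultivar):
    """Split genes shared by all cultivars into consistently up and down sets.

    de_by_cultivar maps cultivar name -> {gene_id: log2 fold change}, and is
    assumed to contain only genes already called significant, with at least
    one cultivar present.
    """
    tables = list(de_by_cultivar.values())
    # Genes reported as responsive in every cultivar.
    shared = set(tables[0]).intersection(*tables[1:])
    # Keep only genes whose direction of change agrees across cultivars.
    up = sorted(g for g in shared if all(t[g] > 0 for t in tables))
    down = sorted(g for g in shared if all(t[g] < 0 for t in tables))
    return up, down


if __name__ == "__main__":
    # Placeholder gene IDs and made-up fold changes, for illustration only.
    example = {
        "Shepody": {"gene_1": 1.2, "gene_2": -0.8, "gene_3": 2.0},
        "Russet Burbank": {"gene_1": 0.9, "gene_2": -1.1, "gene_3": -0.5},
        "Atlantic": {"gene_1": 2.1, "gene_2": -0.3, "gene_4": 1.0},
    }
    print(consistent_genes(example))
```

Requiring the same sign in every cultivar is stricter than a plain intersection: a gene that is induced in one cultivar and repressed in another is shared but not consistent, and is dropped.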
1. Analyze RNA-seq data obtained from three potato cultivars (Shepody, Russet Burbank, and
Atlantic) treated with different amounts of N supplementation to detect common N responsive genes.
2. Analyze the available gene annotation data to find overrepresented metabolic pathways and
Gene Ontology (GO) terms associated with N responsive genes in Shepody, Russet Burbank
and Atlantic.
3. Analyze the upstream regions of N responsive genes in three potato cultivars (Shepody, Russet
Burbank, and Atlantic) with different bioinformatics algorithms (Seeder, MEME and Weeder)
to detect common overrepresented motifs.
4. Make adjustments to the Seeder program to improve its use within a High Performance
Computing (HPC) environment.
5. Develop a strategy to deal with redundancy in motif finding results, particularly to identify
instances where the same motif is reported more than once by the motif discovery software.
6. Using genome re-sequencing data, assemble new reference genomes for two potato landraces
(S. tuberosum subsp. andigena and S. stenotomum subsp. goniocalyx).
7. Compare the new assemblies to each other and to the reference genome to identify potential
structural and genetic differences such as CNVs and novel sequences.
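One simple way to approach the redundancy problem named in item 5 is sketched below in Python. It treats a motif and its reverse complement as the same motif and groups reports whose consensus strings match exactly after that normalization. This is an assumed, deliberately minimal strategy for illustration: it is not the approach developed in this work, the input format is hypothetical, and it does not handle degenerate IUPAC symbols, shifted matches, or motifs of different lengths.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(consensus):
    """Return one representative for a motif and its reverse complement."""
    seq = consensus.upper()
    rc = seq.translate(_COMPLEMENT)[::-1]
    # Pick the alphabetically smaller string so both strands map to one key.
    return min(seq, rc)


def group_redundant(reports):
    """Group motif reports that describe the same consensus.

    reports is an iterable of (tool_name, consensus) pairs; only motifs
    reported more than once are returned.
    """
    groups = defaultdict(list)
    for tool, consensus in reports:
        groups[canonical(consensus)].append(tool)
    return {motif: tools for motif, tools in groups.items() if len(tools) > 1}


if __name__ == "__main__":
    # Made-up consensus strings, for illustration only.
    reports = [
        ("Seeder", "ACGTGG"),
        ("MEME", "CCACGT"),
        ("Weeder", "TTGACC"),
        ("MEME", "acgtgg"),
    ]
    print(group_redundant(reports))
```

In practice, motif comparison is usually done on position weight matrices with a similarity score rather than on exact consensus strings, so this sketch only catches the most obvious duplicates.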
