- Plant Genomic Databases, and useful sites for info about proteins
- In this module we'll be exploring several plant databases including Ensembl Plants, Gramene, PLAZA, SUBA, TAIR and Araport. The information in these databases allows us to easily identify functional regions within gene products, view subcellular localization, find homologs in other species, and even explore pre-computed gene trees to see if our gene of interest has undergone a gene duplication event in another species, all at the click of a mouse!
- Expression Analysis
- Vast databases of gene expression and nifty visualization tools allow us to explore where and when a gene is expressed. Often this information can be used to help guide a search for a phenotype if we don't see a phenotype in a gene mutant under "normal" growth conditions. We explore several tools for Arabidopsis data (eFP Browser, Genevestigator, TraVA DB, Araport) along with NCBI's Genome Data Viewer for RNA-seq data for other plant species. We also examine the MPSS database of small RNAs and degradation products to see if our example gene has any potential microRNA targets.
- Coexpression Tools
- Grouping genes by similar patterns of expression across expression data sets, using algorithms like WGCNA, is a very useful way of organizing the data. Clusters of genes with similar patterns of expression can then be subjected to Gene Ontology term enrichment analysis (see Module 5) or examined to see if they are part of the same pathway. What's even more powerful is being able to identify genes with similar patterns of expression without doing a single expression profiling experiment, by mining gene expression databases! There are several tools that allow you to do this in many plant species simply by entering a query gene identifier. The genes that are returned are often in the same biological process as the query gene, and thus this "guilt-by-association" paradigm is an excellent tool for hypothesis generation.
- Sectional Quiz 1
- Promoter Analysis
- The regulation of gene expression is one of the main ways by which a plant can control the abundance of a gene product (post-translational modifications and protein degradation are some others). When and where a gene is expressed is controlled to a large extent by short sequence motifs, called cis-elements, present in the promoter of the gene. These in turn are bound by transcription factors, which may themselves be induced in response to environmental stresses or during specific developmental programs. Thus understanding which transcription factors can bind to which promoters can help us understand the role the downstream genes might be playing in a biological system.
- Functional Classification and Pathway Visualization
- Often the results of 'omics experiments are large lists of genes, such as those that are differentially expressed. We can use a "cherry picking" approach to explore individual genes in those lists, but it's nice to have an automated way of analyzing them. Here tools for performing Gene Ontology enrichment analysis are invaluable and can tell you if any particular biological processes or molecular functions are over-represented in your gene list. We'll explore AgriGO, AmiGO, tools at TAIR and the BAR, and g:Profiler, which all allow you to do such analyses. Another useful analysis is to map your gene lists (along with associated data such as expression values) onto pathway representations, and we'll use AraCyc and MapMan to do this. In this way it is easy to see if certain biosynthetic reactions are upregulated, which can help you interpret your 'omics data!
- Network Exploration (PPIs, PDIs, GRNs)
- Molecules inside the cell rarely operate in isolation. Proteins act together to form complexes, or are part of signal transduction cascades. Transcription factors bind to cis-elements in promoters or elsewhere and can act as activators or repressors of transcription. MicroRNAs can affect gene expression in other ways. One of the main themes to have emerged in biology in the past two decades is that of networks. In protein-protein interaction networks, proteins that are highly connected with others are often crucial for biological function: when these “hubs” are perturbed, we see large phenotypic effects. In gene regulatory networks, transcription factors interact with downstream promoters, and some drive the expression of other transcription factors that in turn regulate genes combinatorially with the upstream transcription factors. This wiring can have an important biological effect in terms of modulating the kind of output achieved. The tools described in this lab can help us to explore molecular interactions in a network context, perhaps with the eventual goal of modeling the behaviour of a given system.
- Sectional Quiz 2 and Final Assignment
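
To make the "guilt-by-association" idea from the Coexpression Tools module concrete, here is a minimal sketch in Python. It assumes expression data are available as a simple mapping from gene identifier to a list of expression values across the same samples, and it ranks all other genes by Pearson correlation with a query gene. The gene names and values are made up for illustration, and the function names `pearson` and `rank_coexpressed` are hypothetical; the actual coexpression tools covered in the module may use different similarity measures and much larger compendia.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat profile has no defined correlation; treat it as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_coexpressed(query, expression):
    """Return (gene, r) pairs for all non-query genes, most similar first."""
    query_profile = expression[query]
    scores = [
        (gene, pearson(query_profile, profile))
        for gene, profile in expression.items()
        if gene != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy data: hypothetical genes measured across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
    "geneD": [1.0, 3.0, 2.0, 4.0],  # loosely follows geneA
}

ranked = rank_coexpressed("geneA", expression)
for gene, r in ranked:
    print(f"{gene}\t{r:.2f}")
```

In this toy example the genes at the top of the ranking would be the candidates to inspect for shared biological processes, for instance with the enrichment tools from the Functional Classification module.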