Bioinformatic Methods II
- Protein Motifs
- In this module we'll be exploring conserved regions within protein families. Such regions can help us understand the biology of a sequence, in that they are likely important for biological function, and can also be used to help ascribe function to sequences where we can't identify any homologs in the databases. There are various ways of describing the conserved regions, from simple regular expressions to profiles to profile hidden Markov models (HMMs).
- Protein-Protein Interactions
- In this module we'll be exploring protein-protein interactions (PPIs). Protein-protein interactions are important because proteins don't act in isolation, and an examination of the interaction partners of a given protein (determined in an unbiased, perhaps high-throughput way) can often tell us a lot about its biology. We'll talk about some different methods used to determine PPIs and go over their strengths and weaknesses. In the lab we'll use three different tools and two different databases to examine interaction partners of BRCA2, a protein that we examined in last module's lab. Finally, we'll touch on a "foundational" concept, Gene Ontology (GO) term enrichment analysis, to give us an overview of the proteins interacting with our example.
- Protein Structure
- Determining a protein's tertiary (three-dimensional) structure can tell us a lot about the biology of that protein. In this module's mini-lecture, we'll talk about some different methods used to determine a protein's tertiary structure and cover the main database for protein structure data, the Protein Data Bank (PDB). In the lab we'll explore the PDB and VAST, an online tool for searching for structural (as opposed to sequence) similarity. We'll then use a nice piece of stand-alone software, PyMOL, to explore several protein structures in more detail.
- Review: Protein Motifs, Protein-Protein Interactions, and Protein Structure
- Gene Expression Analysis I
- When and where genes are expressed (active) in tissues or cells is one of the main determinants of what makes that tissue or cell the way it is, both in terms of morphology and in terms of response to external stimuli. Several different methods exist for generating gene expression levels for all of the genes in the genome in tissues or even at cell-type-specific resolution. In this class we'll be processing and then examining some gene expression data generated using RNA-seq. We'll explore one of the main databases for RNA-seq expression data, the Sequence Read Archive (SRA), and then use Bioconductor, an open-source suite of programs in R, to process the raw reads from four RNA-seq data sets, to summarize their expression levels, to select significantly differentially expressed genes, and finally to visualize these as a heat map.
- Gene Expression Analysis II
- When and where genes are expressed (active) in tissues or cells is one of the main determinants of what makes that tissue or cell the way it is, both in terms of morphology and in terms of response to external stimuli. Several different methods exist for generating gene expression levels for all of the genes in the genome in tissues or even at cell-type-specific resolution. In this class we'll be hierarchically clustering our significantly differentially expressed genes from last time using Bioconductor and the built-in function of an online tool called Expression Browser. Then we'll be using another online tool that uses a similarity metric, the Pearson correlation coefficient, to identify genes responding in a similar manner to our gene of interest, in this case AP3. We'll use a second tool, ATTED-II, to corroborate our gene list. We'll also be exploring some online databases of gene expression and an online tool for doing a Gene Ontology enrichment analysis.
- Cis Regulatory Systems
- When and where genes are expressed in tissues or cells is one of the main determinants of what makes that tissue or cell the way it is, both in terms of morphology and in terms of response to external stimuli. Gene expression is controlled in part by the presence of short sequences in the promoters (and other parts) of genes, called cis-elements, which permit transcription factors and other regulatory proteins to bind and direct the patterns of expression in certain tissues or cells or in response to environmental stimuli. We'll explore a couple of sets of promoters of genes that are coexpressed with AP3 from Arabidopsis, and with INSULIN from human, for the presence of known cis-elements, and we'll also try to predict some new ones using a couple of different methods.
- Review: Gene Expression Analysis and Cis Regulatory Systems + Final Assignment
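
To give a feel for the simplest motif description mentioned under Protein Motifs, a regular expression, here is a minimal Python sketch that scans a sequence for the N-glycosylation site pattern N-{P}-[ST]-{P} (Asn, any residue except Pro, Ser or Thr, any residue except Pro). The choice of this pattern, the toy sequence, and the helper name `find_motif` are assumptions made for illustration; they are not taken from the course labs.

```python
import re

# N-glycosylation site motif written as a Python regular expression:
# Asn, then any residue except Pro, then Ser or Thr, then any residue except Pro.
N_GLYCOSYLATION = re.compile(r"N[^P][ST][^P]")


def find_motif(sequence, pattern=N_GLYCOSYLATION):
    """Return (1-based start position, matched residues) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence.upper())]


# Made-up sequence for illustration only (hypothetical, not a real protein).
toy_sequence = "MKNGSAPLNPTQWNATE"

hits = find_motif(toy_sequence)
for position, residues in hits:
    print(f"motif {residues} at position {position}")
```

A regular expression either matches or it doesn't, which is why profiles and profile HMMs, which score each position, are usually preferred for more divergent families.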
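
The coexpression step described under Gene Expression Analysis II ranks genes by how closely their expression profiles follow that of a gene of interest, using the Pearson correlation coefficient. The sketch below shows the idea in plain Python. The expression values are invented, the gene names other than AP3 are placeholders, and the function names are hypothetical; the online tools used in the lab may compute and filter their rankings differently.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles.

    Assumes neither profile is constant (a constant profile has zero variance).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rank_coexpressed(expression, bait):
    """Rank every other gene by its correlation with the bait gene, highest first."""
    bait_profile = expression[bait]
    scores = {
        gene: pearson(bait_profile, profile)
        for gene, profile in expression.items()
        if gene != bait
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented expression levels across five samples (illustration only).
toy_expression = {
    "AP3": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneA": [2.1, 3.9, 6.2, 8.0, 9.8],  # rises with AP3
    "geneB": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls as AP3 rises
    "geneC": [3.0, 1.0, 4.0, 1.0, 5.0],  # only loosely related
}

ranking = rank_coexpressed(toy_expression, "AP3")
for gene, r in ranking:
    print(f"{gene}\t{r:.3f}")
```

Genes at the top of such a ranking respond most like the gene of interest, while strongly negative values flag genes with the opposite pattern.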