Bioinformatic Methods I
- NCBI/Blast I
- In this module we'll be exploring the amazing resources available at NCBI, the National Center for Biotechnology Information, run by the National Library of Medicine in the USA. We'll also be doing a Blast search to find similar sequences in the enormous NR (non-redundant) sequence database. We can use similar sequences to infer homology, which is the primary predictor of gene or protein function.
- Blast II/Comparative Genomics
- In this module we'll continue exploring the incredible resources available at NCBI, the National Center for Biotechnology Information. We will be performing several different kinds of Blast searches: BlastP, PSI-Blast, and Translated Blast. We can use similar sequences identified by such methods to infer homology, which is the primary predictor of gene or protein function. We'll also be comparing parts of the genomes of a couple of different species, to see how similar they are.
- Multiple Sequence Alignments
- In this module we'll be doing multiple sequence alignments with Clustal and MUSCLE (as implemented in MEGA), and MAFFT. Multiple sequence alignments can tell you where in a sequence the conserved and variable regions are, which is important for understanding the biology of the sequences under investigation. They also have practical applications, such as designing PCR primers that will amplify sequences from a number of different species.
- Review: NCBI/Blast I, Blast II/Comparative Genomics, and Multiple Sequence Alignments
- Phylogenetics
- In this module we'll be using the multiple sequence alignments we generated last lab to do some phylogenetic analyses with both neighbour-joining and maximum likelihood methods. The tree-like structure generated by such analyses tells us how closely sequences are related to one another, and suggests when in evolutionary time a speciation or gene duplication event occurred.
- Selection Analysis
- In this module we'll take a set of orthologous sequences from bacteria and use DataMonkey to analyze them for sites under positive, negative or neutral selection. Such an analysis can help us understand the biology of a set of protein-coding sequences by identifying residues that might be important for biological function (those under negative selection) or that might be involved in the response to external influences, such as drugs, pathogens or other factors (those under positive selection).
- 'Next Gen' Sequence Analysis (RNA-Seq) / Metagenomics
- In this module we'll explore some of the data that have been generated as a result of the rapid decrease in the cost of sequencing DNA. We'll be exploring a couple of RNA-Seq data sets that can tell us where any given gene is expressed, and also how that gene might be alternatively spliced. We'll also be looking at a couple of metagenome data sets that can tell us about the kinds of species (especially microbial species that might otherwise be hard to culture) that are in a given environmental niche.
- Review: Phylogenetics, Selection Analysis, and 'Next Gen' Sequence Analysis (RNA-Seq)/Metagenomics + Final Assignment