Topic Name Description
File Table of Contents
File A Quick Guide to Large-Scale Genomic Data Mining
For the first several hundred years of research in cellular biology, the main bottleneck to scientific progress was data collection. Our newfound wealth of data, however, has shifted this bottleneck from collection to analysis. While a variety of options exist for examining any one experimental dataset, we are still discovering what new biological questions can be answered by mining thousands of genomic datasets in tandem, potentially spanning different molecular activities, technological platforms, and model organisms. As an analogy, consider the difference between searching one document for a keyword and executing an online search. While the tasks are conceptually similar, they require vastly different underlying methodologies, and they differ correspondingly in their potential for knowledge discovery.
Topic 1 File An overview of functional genomic data
File Genomic data resources
Topic 2 File Scalable machine learning
File Machine learning: An example using Weka
In this short demonstration, we will use Weka {Hall, 2009} to perform a toy analysis associating metagenomic reads from the {Turnbaugh, 2006} mouse gut microbiome with their functional categories as defined by IMG/M {Markowitz, 2008}.
File Machine learning resources
Topic 3 File High-throughput sequencing
File Second-Generation Sequencing
Topic 4 File An introduction to metagenomics
File Coupon collector's problem
File Metagenomics: An example using MG-RAST
In this short demonstration, we will use the MG-RAST server {Meyer, 2008} to view the sequence read statistics, phylogenetic distribution, and functional profile of a human gut microbiome as sampled in {Turnbaugh, 2008}. We will also compare its taxonomic structure to that of a very different ocean microbiome as sampled in {Venter, 2004}.
File Metagenomics resources
Topic 5 File Genomic data integration
File Merging Curated and in silico Interaction Data in Network Analysis
File Prioritizing GWA data
File Data integration: An example using HEFalMp
In this short demonstration, we will use the HEFalMp server {Huttenhower, 2009} to view the functional relationships, functional network neighborhood, processes, and genetic disorders associated with human genes.
File Host Interaction Example
File GWAS PPI Example
File Genomic data integration resources
Topic 6 File Scalable systems
File Scalability