mtx2021

Statistical approaches for differential expression analysis in metatranscriptomics
Description

This page shares the datasets and analysis code for a project evaluating statistical approaches for differential expression (DE) in metatranscriptomics. We simulated large-scale synthetic microbial communities across a broad range of settings representing different scenarios (e.g. spiked vs. null associations; confounded associations arising from species abundance, gene loss/gain, or sequencing depth; paired metatranscriptomic (MTX) and metagenomic (MGX) data vs. MTX only).

We also provide the statistical results obtained by applying six statistical models for identifying DE to the simulated datasets. The models make different assumptions about MTX normalization, MGX availability, and the relationship between DNA and RNA copy numbers.

Citation

If you find these data useful, please cite our paper:

Yancong Zhang, Kelsey N. Thompson, Curtis Huttenhower, Eric A. Franzosa*. Statistical approaches for differential expression analysis in metatranscriptomics. Bioinformatics, 37(Supplement_1): i34-i41 (2021).

Datasets

 Simulated datasets

  • synth_mgx_mtx.tar.gz: 11 synthetic datasets containing simulated counts of metatranscriptomic and/or metagenomic features, along with lists of expected interactions (MD5: 8c3e28c0dad79bd4536d561755ff169c)
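
The MD5 value listed above can be used to confirm that a download completed intact. The following is a minimal sketch, assuming the archive has been saved under its original file name in the current directory; the helper names `md5_of_file` and `verify_archive` are hypothetical and not part of the project's code. Other archives from this page can be checked the same way by adding their file names and MD5 values to the dictionary.

```python
import hashlib
from pathlib import Path

# MD5 checksum copied from this page.
EXPECTED_MD5 = {
    "synth_mgx_mtx.tar.gz": "8c3e28c0dad79bd4536d561755ff169c",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the value recorded for its name."""
    path = Path(path)
    return md5_of_file(path) == expected[path.name]


if __name__ == "__main__":
    archive = Path("synth_mgx_mtx.tar.gz")  # assumed download location
    if archive.exists():
        status = "OK" if verify_archive(archive) else "MISMATCH"
        print(f"{archive.name}: {status}")
    else:
        print(f"{archive.name}: not found, download it first")
```

A mismatch usually points to a truncated or corrupted download, in which case fetching the archive again is the simplest remedy.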

 Statistical results

  • global_filtering.tar.gz: statistical results of simulated datasets from different models for DE identification (MD5: 74605ff463ae562a9b5b55773c15310a)

  • local_filtering.tar.gz: statistical results of simulated datasets from different models for DE identification, ignoring samples in which both RNA and DNA are zero for a given gene (MD5: 26f871640c0ba28a2ced2fb540b3c8f3)

  • strict_filtering.tar.gz: statistical results of simulated datasets from different models for DE identification, ignoring samples in which either RNA or DNA is zero for a given gene (MD5: f2d0207a774f0c37c78015fd85fcb9cc)
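
The local and strict results differ only in which samples are dropped for a given gene: local filtering drops a sample when RNA and DNA are both zero, while strict filtering drops it when either one is zero. The sketch below illustrates that distinction on made-up counts for a single gene. The function name `kept_samples` and the example values are hypothetical, and this is not the project's implementation. Global filtering is left out because its rule is not described on this page.

```python
def kept_samples(rna, dna, mode):
    """Return one boolean per sample for a single gene (True = sample is kept).

    rna, dna: per-sample counts for the same gene, in the same sample order.
    mode: "local" drops samples where RNA and DNA are both zero;
          "strict" drops samples where RNA or DNA is zero.
    """
    if len(rna) != len(dna):
        raise ValueError("rna and dna must have one value per sample")
    if mode == "local":
        return [not (r == 0 and d == 0) for r, d in zip(rna, dna)]
    if mode == "strict":
        return [r != 0 and d != 0 for r, d in zip(rna, dna)]
    raise ValueError(f"unknown mode: {mode!r}")


# Illustrative counts for one gene across four samples (not real data).
rna = [0, 5, 0, 3]
dna = [0, 2, 4, 0]

print(kept_samples(rna, dna, "local"))   # [False, True, True, True]
print(kept_samples(rna, dna, "strict"))  # [False, True, False, False]
```

In this toy case strict filtering keeps only the second sample, so it always retains a subset of what local filtering retains.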
Code availability

This analysis relies mainly on the open-source R package MaAsLin 2. For analysis-specific programs, see the GitHub repositories for DE analysis (MTX_model) and synthetic data generation (MTX_synthetic).