This page shares the datasets and analysis code for a project evaluating statistical approaches for differential expression (DE) analysis in metatranscriptomics (MTX). We simulated large-scale synthetic microbial communities across a broad range of settings representing different scenarios (e.g., spiked vs. null associations; confounded associations arising from species abundance, gene loss/gain, or sequencing depth; paired MTX and metagenomics (MGX) vs. MTX only).
We also provide the results of applying six statistical models for identifying DE to the simulated datasets. The models make different assumptions about MTX normalization, MGX availability, and the relationship between DNA and RNA copy numbers.
If you find these data useful, please cite our paper:
Yancong Zhang, Kelsey N. Thompson, Curtis Huttenhower, Eric A. Franzosa*. Statistical approaches for differential expression analysis in metatranscriptomics. Bioinformatics, 37(Supplement_1): i34-i41 (2021).
Simulated datasets
- synth_mgx_mtx.tar.gz: 11 synthetic datasets comprising simulated counts of metatranscriptomic and/or metagenomic features, plus lists of expected interactions (MD5: 8c3e28c0dad79bd4536d561755ff169c)
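The published MD5 value can be used to confirm that a download is intact. The following is a minimal sketch in Python, assuming the archive has been saved locally under its original file name; the helper names `md5sum` and `verify` are illustrative and not part of the project's code.

```python
import hashlib
from pathlib import Path

# Checksum published on this page for the simulated-data archive.
# Entries for other archives can be added in the same way.
EXPECTED_MD5 = {
    "synth_mgx_mtx.tar.gz": "8c3e28c0dad79bd4536d561755ff169c",
}


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the published value for its name."""
    return md5sum(path) == expected[Path(path).name]
```

For example, `verify("synth_mgx_mtx.tar.gz")` should return `True` for an uncorrupted download in the current directory (the local path is an assumption).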
Statistical results
Dataset | MD5 | Description |
---|---|---|
global_filtering.tar.gz | 74605ff463ae562a9b5b55773c15310a | statistical results of simulated datasets from different models for DE identification |
local_filtering.tar.gz | 26f871640c0ba28a2ced2fb540b3c8f3 | statistical results of simulated datasets from different models for DE identification which ignored samples with zero RNA and zero DNA for a given gene |
strict_filtering.tar.gz | f2d0207a774f0c37c78015fd85fcb9cc | statistical results of simulated datasets from different models for DE identification which ignored samples with zero RNA or zero DNA for a given gene |
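The difference between the local and strict filtering rules described above comes down to "and" versus "or". The sketch below illustrates it for a single gene; it is a hypothetical helper written for this page, not the code used to produce the archives, and the function name `keep_samples` and the example counts are assumptions.

```python
def keep_samples(rna, dna, mode):
    """Return the indices of samples retained for one gene.

    rna, dna: per-sample counts for the same gene, in the same sample order.
    mode: "local" drops samples where RNA and DNA are both zero;
          "strict" drops samples where either RNA or DNA is zero.
    """
    if len(rna) != len(dna):
        raise ValueError("rna and dna must have one value per sample")
    pairs = list(enumerate(zip(rna, dna)))
    if mode == "local":
        return [i for i, (r, d) in pairs if not (r == 0 and d == 0)]
    if mode == "strict":
        return [i for i, (r, d) in pairs if not (r == 0 or d == 0)]
    raise ValueError(f"unknown mode: {mode!r}")


# Illustrative counts for one gene across four samples (made-up values).
rna = [5, 0, 0, 3]
dna = [2, 0, 4, 0]
print(keep_samples(rna, dna, "local"))   # [0, 2, 3]
print(keep_samples(rna, dna, "strict"))  # [0]
```

Strict filtering always retains a subset of the samples retained by local filtering, so it trades sample size for the guarantee that both RNA and DNA were observed.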
This analysis relies mainly on the open-source R package MaAsLin 2. For analysis-specific programs, see the GitHub repositories MTX_model (DE analysis) and MTX_synthetic (synthetic data generation).