A novel statistical method for handling zeros in microbiome data
- Juxin Liu, University of Saskatchewan
Modern sequencing technologies, such as 16S rRNA sequencing, provide a valuable approach to large-scale profiling of microbial communities. However, the sequencing data are compositional, over-dispersed, and zero-inflated due to the limitations of the sequencing technologies. Extensive work has addressed these challenges; this project focuses on how to handle zeros. Handling zeros well matters because almost all downstream analyses, such as network analysis, rely on the quality of the imputed data.
To our knowledge, none of the existing zero-imputation methods use phylogenetic distances. The proposed project aims to fill this gap in the literature. We will first classify each zero by its source: a biological zero (the taxon is truly absent from the sample) or a sampling zero (the taxon is present but was not detected). We will then impute only the sampling zeros, borrowing information from taxa that are phylogenetically close.
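To make the idea concrete, the following is a minimal sketch of phylogeny-informed imputation. It is not the method the project will develop. The function name `impute_sampling_zeros`, the `radius` cutoff, the exponential distance weights, and the rule that a zero with no observed close relative in the same sample is treated as biological are all assumptions chosen for illustration.

```python
import math


def impute_sampling_zeros(counts, dist, radius=1.0):
    """Illustrative sketch only; all rules here are assumptions.

    counts: list of samples, each a list of per-taxon read counts.
    dist:   symmetric matrix of phylogenetic distances between taxa.
    radius: hypothetical cutoff defining "phylogenetically close".
    """
    n_taxa = len(dist)
    imputed = [[float(v) for v in row] for row in counts]
    for i, row in enumerate(counts):
        for j, value in enumerate(row):
            if value != 0:
                continue
            # Close relatives of taxon j that were observed in this sample.
            neighbours = [
                k for k in range(n_taxa)
                if k != j and dist[j][k] <= radius and row[k] > 0
            ]
            if not neighbours:
                # Assumed rule: no close relative observed, so treat the
                # zero as biological and leave it untouched.
                continue
            # Assumed rule: treat the zero as a sampling zero and replace it
            # with a distance-weighted mean of the relatives' original counts.
            weights = [math.exp(-dist[j][k]) for k in neighbours]
            weighted_sum = sum(w * row[k] for w, k in zip(weights, neighbours))
            imputed[i][j] = weighted_sum / sum(weights)
    return imputed


# Toy data: taxa 0 and 1 are close relatives, taxon 2 is distant.
toy_counts = [[10, 0, 0]]
toy_dist = [[0.0, 0.5, 3.0], [0.5, 0.0, 3.0], [3.0, 3.0, 0.0]]
print(impute_sampling_zeros(toy_counts, toy_dist, radius=1.0))
```

In the toy data, the zero for taxon 1 is filled in from its close relative, taxon 0, while the zero for taxon 2 is left as it is because no observed taxon lies within the cutoff. A real method would need a principled, model-based way to separate the two kinds of zeros and would have to respect the compositional nature of the data.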
- Expected team size: 2
- Student Experience Level: Advanced: students who have taken multiple upper-level mathematics courses
Where course numbers are given, students should look for the closest equivalent course at their home institution.
- STAT 241 (Probability Theory)
- STAT 242 (Statistical Theory and Methodology)
- STAT 344 (Elementary Statistical Concepts)
- R programming