Source Themes

Strategies for addressing pseudoreplication in multi-patient scRNA-seq data
The rapidly evolving field of single-cell transcriptomics has provided a powerful means for understanding cellular heterogeneity. Large-scale studies with multiple biological samples hold promise for discovering differentially expressed biomarkers with a higher level of confidence through a better characterization of the target population. However, the hierarchical nature of these experiments introduces a significant challenge for downstream statistical analysis. Indeed, despite the availability of numerous differential expression methods, only a select few can accurately address the within-patient correlation of single-cell expression profiles. Furthermore, the high computational cost of some of these methods limits their practical use. In this manuscript, we undertake a comprehensive assessment of different strategies to address the hierarchical correlation structure in multi-sample scRNA-seq data. We employ synthetic data generated from a simulator that retains the original correlation structure of multi-patient data while making minimal assumptions, providing a robust platform for benchmarking method performance. Our analyses indicate that neglecting within-patient correlation jeopardizes type I error control. We show that, in line with some previous reports and in contrast with others, Poisson Generalized Estimating Equations (GEEs) provide a useful and flexible framework for addressing these issues. We also show that pseudobulk approaches outperform single-cell level methods across the board. In this work, we resolve the conflicting results regarding the utility of GEEs and their performance relative to pseudobulk approaches. As such, we provide valuable guidelines for researchers navigating the complex landscape of gene expression modeling, and offer insights on choosing the most appropriate methods based on the specific structure and design of their datasets.
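To make the term "pseudobulk" concrete, the following is a minimal sketch of the aggregation step that such approaches typically perform before any testing: counts from all cells of one patient are summed, so that the patient rather than the cell becomes the unit of analysis. It assumes a cells-by-genes count matrix and one patient label per cell. The function name `pseudobulk` and the toy data are hypothetical illustrations and are not taken from the manuscript, which does not specify an implementation.

```python
import numpy as np


def pseudobulk(counts, patient_ids):
    """Sum a cells x genes count matrix over the cells of each patient.

    Hypothetical helper for illustration only.
    Returns the sorted unique patient labels and a patients x genes matrix.
    """
    counts = np.asarray(counts)
    # inverse[i] is the row of the output that cell i belongs to
    patients, inverse = np.unique(np.asarray(patient_ids), return_inverse=True)
    summed = np.zeros((patients.size, counts.shape[1]), dtype=counts.dtype)
    # Unbuffered add, so repeated patient indices accumulate correctly
    np.add.at(summed, inverse, counts)
    return patients, summed


# Toy data (assumed): 4 cells, 2 genes, 2 patients
toy_counts = np.array([[1, 0],
                       [2, 1],
                       [0, 3],
                       [4, 4]])
toy_patients = ["p1", "p1", "p2", "p2"]

labels, bulk = pseudobulk(toy_counts, toy_patients)
print(labels)
print(bulk)
```

After aggregation there is one row per patient, so a downstream test sees as many observations as there are patients and does not treat correlated cells from the same patient as independent replicates.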
MultiNicheNet: a flexible framework for differential cell-cell communication analysis from multi-sample multi-condition single-cell transcriptomics data
Dysregulated cell-cell communication is a hallmark of many disease phenotypes. Due to recent advances in single-cell transcriptomics and computational approaches, it is now possible to study intercellular communication on a genome- and tissue-wide scale. However, most current cell-cell communication inference tools have limitations when analyzing data from multiple samples and conditions. Their main limitation is that they do not address inter-sample heterogeneity adequately, which could lead to false inference. This issue is crucial for analyzing human cohort scRNA-seq datasets, complicating the comparison between healthy and diseased subjects. Therefore, we developed MultiNicheNet (https://github.com/saeyslab/multinichenetr), a novel framework to better analyze cell-cell communication from multi-sample multi-condition single-cell transcriptomics data. The main goals of MultiNicheNet are inferring the differentially expressed and active ligand-receptor pairs between conditions of interest and predicting the putative downstream target genes of these pairs. To achieve this goal, MultiNicheNet applies the principles of state-of-the-art differential expression algorithms for multi-sample scRNA-seq data. As a result, users can analyze differential cell-cell communication while adequately addressing inter-sample heterogeneity, handling complex multifactorial experimental designs, and correcting for batch effects and covariates. Moreover, MultiNicheNet uses NicheNet-v2, our new and substantially improved version of NicheNet’s ligand-receptor network and ligand-target prior knowledge model. We applied MultiNicheNet to patient cohort data of several diseases (breast cancer, squamous cell carcinoma, multisystem inflammatory syndrome in children, and lung fibrosis). For these diseases, MultiNicheNet uncovered known and novel aberrant cell-cell signaling processes. 
We also demonstrated MultiNicheNet’s potential to perform non-trivial analysis tasks, such as studying between- and within-group differences in cell-cell communication dynamics in response to therapy. As a final example, we used MultiNicheNet to elucidate dysregulated intercellular signaling in idiopathic pulmonary fibrosis while correcting for batch effects in integrated atlas data. Given the anticipated increase in multi-sample scRNA-seq datasets due to technological advancements and extensive atlas-building integration efforts, we expect that MultiNicheNet will be a valuable tool to uncover differences in cell-cell communication between healthy and diseased states.