Demultiplexing

To keep quality control consistent across the datasets used by the sceQTL-Gen Consortium, demultiplexing and doublet detection have been built into a Snakemake pipeline that runs in a Singularity environment, so that results are consistent across different compute infrastructures. We selected five software tools for the consortium by testing combinations of 10 demultiplexing and doublet detection tools with multiple intersectional methods (manuscript in preparation). The five tools comprise two SNP-based demultiplexing and doublet detection tools (popscle-demuxlet and souporcell) and three transcriptome-based doublet detection tools (scds, scrublet and DoubletDetection).

The complete pipeline is built in Snakemake to provide reproducibility across labs and users, and it comes with a Singularity image so that the software is consistent across systems. Most of the tools (popscle-demuxlet, souporcell and scds) run without any user interaction beyond providing input files. However, both DoubletDetection and scrublet require users to check that the thresholding used is effective for their data. The pipeline is therefore built to stop at that point, and it runs the remaining jobs only after users have provided input for each pool they are demultiplexing. Let’s first prepare the data and software that we will need to run this pipeline.

If you have any questions or problems, feel free to open an issue or email Drew Neavin directly (d.neavin @ garvan.org.au).