How to integrate more than two scATAC-seq datasets? #455

Answered by timoast
zrcjessica asked this question in Q&A

I would recommend first creating a unified set of peaks across all of the datasets, then quantifying that unified peak set in each dataset, and finally merging the resulting Seurat objects together. You can use the GenomicRanges::reduce() function to build the unified peak set, and there's an example in the merge vignette: https://satijalab.org/signac/articles/merging.html
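
As a rough illustration of the peak-unification step, here is a minimal sketch using toy coordinates. The three peak sets, their coordinates, and the width thresholds are made up for the example; in practice you would read each dataset's peak calls from its own BED file. Only the GenomicRanges package is assumed to be installed.

```r
library(GenomicRanges)

# Toy peak sets for three hypothetical datasets (coordinates are invented).
peaks_a <- GRanges("chr1", IRanges(start = c(100, 1000), end = c(300, 1200)))
peaks_b <- GRanges("chr1", IRanges(start = c(250, 5000), end = c(450, 5300)))
peaks_c <- GRanges("chr1", IRanges(start = c(1100, 5200), end = c(1400, 5600)))

# Concatenate all peak sets, then collapse overlapping ranges into
# single intervals to get one unified peak set.
combined_peaks <- reduce(c(peaks_a, peaks_b, peaks_c))

# Optionally drop very short or very long merged peaks.
# These cutoffs are illustrative assumptions, not recommended values.
peak_widths <- width(combined_peaks)
combined_peaks <- combined_peaks[peak_widths > 20 & peak_widths < 10000]

print(combined_peaks)
```

The same `reduce(c(...))` call works for any number of datasets, which is why this approach extends naturally beyond two. The unified set would then be quantified separately in each dataset (the merge vignette does this with Signac's `FeatureMatrix()`) before the objects are merged.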

If you find that you see a batch effect (cells separate by both cell state and dataset of origin) after merging the objects, then you could also apply data integration methods to remediate this. For >2 scATAC datasets, I'd recommend trying Harmony (example here). We're working on some updates to the Seurat integration to better support single-cell c…

Answer selected by timoast