Find consensus of SingleR and CellAssign cell type labels; use as reference for copy number alterations methods #853
jaclyn-taroni started this conversation in Propose a new analysis
Proposed analysis
The processed data available as part of our data releases includes the cell type annotation results from two reference-based, automated methods: SingleR and CellAssign (https://scpca.readthedocs.io/en/stable/processing_information.html#cell-type-annotation).
Our rationale for including two methods is that we expect they will agree sometimes; when they do, we might be more confident in those cell type calls. Our intuition – given that we use references generated from healthy tissues – is that when there is agreement, those cells are more likely to be non-malignant. There's also a particular focus on immune cells in the references, so we might expect any consensus between the methods to capture immune infiltrate in solid tumor samples. We have yet to explore these expectations and intuition systematically.
I am proposing that we:
Scientific goals
The goals of the proposed analysis are to:
From the first rounds of cell typing, it's apparent that folks want to see the output of methods like CopyKAT, inferCNV, and SCEVAN as part of their cell typing workflow (and the reference matters very much!). If we can eventually get this working across many samples (i.e., port it to OpenScPCA-nf), it would likely significantly speed up cell typing for some cancer types (or at least generate negative results in a way that saves time).
Methods or approach
This proposal is relatively big and will likely end up as at least two modules before porting, so I will focus here on the first steps.
The first challenge we'll need to overcome is what it means for SingleR and CellAssign to agree, given that they use different references. We should leverage ontologies here and develop relatively simple rules to determine agreement. Here is an example of a rule we might use: if two cell types in different references share a common parent term, use the parent term as the consensus label. We might also want to do some kind of permutation testing – if you shuffle the cell type labels, how likely are you to see agreement? I expect when we don't see agreement between methods, the consensus label would be "unknown."
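To make the example rule and the shuffling idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the toy child-to-parent mapping `TOY_PARENTS` stands in for a real ontology, the label strings are made up rather than taken from the actual SingleR or CellAssign references, and the helper names (`consensus_label`, `agreement_rate`, `permutation_p_value`) are hypothetical. It also makes two choices the proposal does not specify: it picks the most specific shared ancestor, and it treats a shared ancestor that is too broad (here, only the root term `cell`) as no agreement.

```python
import random

# Hypothetical toy ontology (child term -> parent term); not real ontology IDs.
TOY_PARENTS = {
    "CD4 T cell": "T cell",
    "CD8 T cell": "T cell",
    "T cell": "lymphocyte",
    "B cell": "lymphocyte",
    "lymphocyte": "immune cell",
    "macrophage": "myeloid cell",
    "monocyte": "myeloid cell",
    "myeloid cell": "immune cell",
    "immune cell": "cell",
    "fibroblast": "cell",
}

# Assumption: ancestors this broad are not informative enough to count as agreement.
TOO_BROAD = {"cell"}


def ancestors(term, parents=TOY_PARENTS):
    """Return [term, parent, grandparent, ...] up to the root of the toy ontology."""
    chain = [term]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return chain


def consensus_label(label_a, label_b, parents=TOY_PARENTS, too_broad=TOO_BROAD):
    """Most specific term shared by both lineages, or "unknown" if none is useful."""
    lineage_b = set(ancestors(label_b, parents))
    for term in ancestors(label_a, parents):  # walks from specific to general
        if term in lineage_b:
            return "unknown" if term in too_broad else term
    return "unknown"


def agreement_rate(labels_a, labels_b):
    """Fraction of cells whose two labels yield a consensus other than "unknown"."""
    calls = [consensus_label(a, b) for a, b in zip(labels_a, labels_b)]
    if not calls:
        return 0.0
    return sum(call != "unknown" for call in calls) / len(calls)


def permutation_p_value(labels_a, labels_b, n_perm=1000, seed=0):
    """Shuffle one set of labels to estimate how often chance agreement
    is at least as high as the observed agreement."""
    rng = random.Random(seed)
    observed = agreement_rate(labels_a, labels_b)
    shuffled = list(labels_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if agreement_rate(labels_a, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Made-up per-cell labels standing in for the two methods' output.
singler_labels = ["CD4 T cell", "B cell", "macrophage", "fibroblast"]
cellassign_labels = ["CD8 T cell", "B cell", "monocyte", "macrophage"]
observed, p_value = permutation_p_value(singler_labels, cellassign_labels)
```

With real data, the mapping would come from an ontology rather than a hand-written dictionary, and the cutoff for "too broad" would need to be decided as part of the rules. Shuffling labels across all cells in a sample is only one possible null model; shuffling within clusters would answer a different question.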
I would expect the first pass at this (but not necessarily the first pull request!) to include:
Existing modules
It is not expected to consume results from any module. However, it is conceptually related to the cell typing modules.
Input data
No response
Scientific literature
No response
Other details
No response