Find consensus of SingleR and CellAssign cell type labels; use as reference for copy number alterations methods #853

jaclyn-taroni · 2024-11-05T13:27:05Z

jaclyn-taroni
Nov 5, 2024
Maintainer

Proposed analysis

The processed data available as part of our data releases includes the cell type annotation results from two reference-based, automated methods: SingleR and CellAssign (https://scpca.readthedocs.io/en/stable/processing_information.html#cell-type-annotation).

Our rationale for including two methods is that we expect they will agree sometimes; when they do, we might be more confident in those cell type calls. Our intuition – given that we use references generated from healthy tissues – is that when there is agreement, those cells are more likely to be non-malignant. There's also a particular focus on immune cells in the references, so we might expect any consensus between the methods to capture immune infiltrate in solid tumor samples. We have yet to explore these expectations and intuition systematically.

I am proposing that we:

Create a set of consensus cell type labels when SingleR and CellAssign agree for each sample
For samples where we believe the consensus labels are capturing a subset of non-malignant cells in the sample, use these cell populations as normal reference cells for copy number alteration methods

Scientific goals

The goals of the proposed analysis are to:

Understand how and when the automated cell type annotation methods agree
Create a set of consensus cell type labels
Use that set of consensus cell type labels to generate normal reference populations for samples where we are confident the consensus represents a subset of the non-malignant cells in a sample
Run several copy number alterations methods on samples where it is feasible

From the first rounds of cell typing, it's apparent that folks want to see the output of methods like CopyKAT, inferCNV, and SCEVAN as part of their cell typing workflow (and the reference matters very much!). If we can eventually get this working across many samples (i.e., port it to OpenScPCA-nf), it would likely significantly speed up cell typing for some cancer types (or at least generate negative results in a way that saves time).

Methods or approach

This proposal is relatively big and will likely end up as at least two modules before porting, so I will focus here on the first steps.

The first challenge we'll need to overcome is what it means for SingleR and CellAssign to agree, given that they use different references. We should leverage ontologies here and develop relatively simple rules to determine agreement. Here is an example of a rule we might use: if two cell types in different references share a common parent term, use the parent term as the consensus label. We might also want to do some kind of permutation testing – if you shuffle the cell type labels, how likely are you to see agreement? I expect when we don't see agreement between methods, the consensus label would be "unknown."
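To make the example rule concrete, here is a minimal sketch of the "shared parent term" idea. It assumes a toy child-to-parents map; the term names, the `consensus_label` helper, and the `too_broad` cutoff are all hypothetical and are not the actual Cell Ontology or any existing OpenScPCA code. A real implementation would load a proper ontology and would need a deliberate decision about which ancestors are too general to count as agreement.

```python
# Toy child -> parents map. Term names are illustrative only (hypothetical),
# not real ontology identifiers.
TOY_PARENTS = {
    "cell": [],
    "leukocyte": ["cell"],
    "lymphocyte": ["leukocyte"],
    "T cell": ["lymphocyte"],
    "B cell": ["lymphocyte"],
    "CD4-positive T cell": ["T cell"],
    "CD8-positive T cell": ["T cell"],
    "myeloid cell": ["leukocyte"],
    "macrophage": ["myeloid cell"],
    "epithelial cell": ["cell"],
}


def ancestors_with_distance(term, parents):
    """Return {ancestor: steps from term}, including the term itself at 0."""
    distance = {term: 0}
    frontier = [term]
    while frontier:  # breadth-first walk up the hierarchy
        next_frontier = []
        for current in frontier:
            for parent in parents.get(current, []):
                if parent not in distance:
                    distance[parent] = distance[current] + 1
                    next_frontier.append(parent)
        frontier = next_frontier
    return distance


def consensus_label(label_a, label_b, parents, too_broad=frozenset({"cell"})):
    """Closest shared term for two labels, or "unknown" if none is specific enough.

    `too_broad` is an assumed cutoff: shared ancestors listed there do not
    count as agreement.
    """
    if label_a not in parents or label_b not in parents:
        return "unknown"  # label has no ontology mapping
    dist_a = ancestors_with_distance(label_a, parents)
    dist_b = ancestors_with_distance(label_b, parents)
    shared = [t for t in dist_a if t in dist_b and t not in too_broad]
    if not shared:
        return "unknown"
    # Prefer the shared term closest to both labels; break ties by name.
    return min(shared, key=lambda t: (dist_a[t] + dist_b[t], t))


singler_call = "CD4-positive T cell"
cellassign_call = "CD8-positive T cell"
print(consensus_label(singler_call, cellassign_call, TOY_PARENTS))
```

With this toy hierarchy, two different T cell subtypes collapse to their shared parent, while an epithelial call paired with a T cell call shares only the root term and therefore falls back to "unknown."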

I would expect the first pass at this (but not necessarily the first pull request!) to include:

Making sure we have ontology term mappings
Developing a set of rules for agreement that leverages ontology structures
Checking, for a selection of samples across multiple projects, whether those rules yield (known) consensus cell type labels
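The permutation testing idea from the methods description (shuffle the cell type labels and ask how often agreement appears by chance) could look something like the sketch below. It is only an illustration under simplifying assumptions: agreement is treated as an exact string match between per-cell labels, and the function names, the number of permutations, and the fixed seed are hypothetical choices. In practice, the exact-match comparison would be swapped for whatever ontology-based agreement rule is adopted.

```python
import random


def agreement_rate(labels_a, labels_b):
    """Fraction of cells where the two methods assign the same label."""
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def permutation_p_value(labels_a, labels_b, n_permutations=1000, seed=0):
    """Compare observed agreement with agreement after shuffling one label set.

    Returns (observed agreement, empirical p-value). The +1 terms keep the
    p-value from ever being exactly zero.
    """
    rng = random.Random(seed)
    observed = agreement_rate(labels_a, labels_b)
    shuffled = list(labels_b)  # copy so the caller's list is not modified
    at_least_as_high = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks the cell-to-label pairing
        if agreement_rate(labels_a, shuffled) >= observed:
            at_least_as_high += 1
    p_value = (at_least_as_high + 1) / (n_permutations + 1)
    return observed, p_value


# Toy per-cell labels for one sample (made up for illustration).
method_a_labels = ["T cell"] * 50 + ["B cell"] * 50
method_b_labels = ["T cell"] * 50 + ["B cell"] * 50
observed, p_value = permutation_p_value(method_a_labels, method_b_labels)
print(observed, p_value)
```

One design consideration: if a sample is dominated by a single label in both methods, shuffled labels will agree almost as often as the real ones, so a high raw agreement rate alone would not be convincing evidence for that sample.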

Existing modules

This analysis is not expected to consume results from any existing module. However, it is conceptually related to the cell typing modules.

Input data

No response

Scientific literature

No response

Other details

No response
