Annotate cell types in desmoplastic small round cell tumor samples on the Portal #583
-
Hi @danhtruong. I'm Jen, the Scientific Community Manager at the Data Lab. Thank you for sharing your proposed analysis! Our team is currently reviewing your proposal, and we're looking forward to discussing it with you soon. We'll get back to you with next steps within 3 business days. We will also be setting up an Amazon Web Services account for you. Once we do, you should receive an email with an invitation to finish setting up your account. I'll reach out again when you should be expecting to see this! In the meantime, please let me know if you have any questions about OpenScPCA. We look forward to working together!
-
@danhtruong, your AWS account has been created and you should receive an email to complete setup. Here are instructions for setting up AWS!
-
This sounds great @danhtruong and we are excited to have you on board! I had a few questions regarding your analysis plan.

You mention curating a marker gene list and using that marker gene list to annotate the tumor cells. How exactly are you planning on doing that? You mention using

Based on my experience with the Ewing data, classifying cells solely based on gene expression can be quite challenging, especially if there's no clear separation between tumor and normal cells. However, if you have distinct populations of normal and tumor cells, using a method like

I was curious if you expect to find normal cells in these samples. I will note that for these particular samples, four of them came directly from the tumor, and the other three are matched PDX samples. My naive expectation is that you will be more likely to see a mix of tumor and normal cells in the samples that came directly from the tumor than in the PDX samples, so those are probably the better samples to start with. But I'm not a DSRCT expert, so I'm sure you have a better idea of what to expect than I do!

I did also notice that in Henon et al. they looked at the expression of two different DSRCT gene signatures to label tumor cells, and they have a nice supplemental table (Table S2A) of the genes that are differentially expressed in the tumor cells and normal cell types across all 12 patients. Those should be great starting points! That list of DEGs that they provide could potentially be used to create a reference that you might be able to use with a marker gene based assignment method like

I know you mentioned looking at gene expression of marker genes across clusters, which I think is a very reasonable approach. I just want to caution you that if you do plan on using that approach, you will need to spend time first doing the clustering.

We do provide clustering assignments in the processed

You also mention using

I would also encourage you to check out these helpful sections of our documentation to get started: Technical setup

Please let me know if you have specific questions on how to get started!
-
Hi @danhtruong, I wanted to respond to provide some additional guidance as you are ready to get started.

In an effort to keep the analysis across modules uniform and transparent, we ask that you start your analysis with the processed objects available on the ScPCA Portal (

When you are ready to get started, you can download the data using the

Please follow the steps below to start contributing to the project:
After you have initiated your module, you will be ready to continue with the rest of the analysis that you proposed. I would recommend that you break up your work into the following steps, where each bullet point would be an issue and at least one subsequent pull request:
For more information on contributing to the project, I recommend you review these sections of the documentation:
-
Proposed analysis
This analysis aims to annotate cell types for the existing desmoplastic small round cell tumor samples on the Portal. These can be found in SCPCP000013.
Specifically, we plan to add three levels of annotations:
The first will annotate cells as tumor cells and normal cells.
The second will annotate the normal cells, providing a distinct cell type (e.g., fibroblasts, T cells).
The last will classify cell states/lineages found in the tumor cells (e.g., epithelial, mesenchymal, neuroendocrine, or muscle).
In addition to providing human-readable names for cell types, we will also provide cell ontology identifiers where possible.
As part of this analysis, we will generate a reference of marker genes associated with desmoplastic small round cell tumor cells and cell states. We will also create a reference dataset containing well-annotated desmoplastic small round cell tumor cells. These references can be used to annotate other samples from desmoplastic small round cell tumor cells.
Scientific goals
The goal of this analysis is to provide validated cell type annotations that can be used for downstream analysis of the desmoplastic small round cell tumor samples. The analysis will accomplish the following goals:
Additionally, these annotations can be added to the existing objects available on the ScPCA Portal. This will give users the validated annotations without performing their own cell-type annotations.
Methods or approach
As part of this analysis, we will annotate all desmoplastic small round cell tumor samples and provide a reference that can be used by the community. To do this, we will perform the following steps on some of the samples (2-3) to first create a well-annotated reference dataset for desmoplastic small round cell tumors. Then, we can use this as a reference to annotate cells from all other samples using a reference-based approach like SingleR.
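To make the reference-based idea concrete, here is a minimal Python sketch of correlation-based label transfer. It is only loosely in the spirit of SingleR, not its implementation: SingleR itself restricts to marker genes, aggregates correlations over multiple reference samples per label, and adds a fine-tuning step. The reference profiles, gene choices, and expression values below are invented for illustration, and `assign_label` is a hypothetical helper name.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no ranking information
    return cov / (sx * sy)


def assign_label(cell_profile, reference):
    """Return (label, score) for the reference profile that correlates best."""
    scores = {label: spearman(cell_profile, ref) for label, ref in reference.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Gene order shared by all profiles: PTPRC, CD3E, COL1A1, PECAM1 (toy values).
reference = {
    "T cell": [3.0, 2.5, 0.1, 0.2],
    "fibroblast": [0.1, 0.0, 3.2, 0.3],
    "endothelial cell": [0.2, 0.1, 0.4, 3.1],
}
cell_a = [2.2, 1.8, 0.0, 0.1]
cell_b = [0.0, 0.1, 2.9, 0.5]

label_a, score_a = assign_label(cell_a, reference)
label_b, score_b = assign_label(cell_b, reference)
```

In the proposed workflow, the reference would instead be the well-annotated DSRCT dataset built from the first 2-3 samples, with tumor and normal labels side by side.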
Step 1
Before starting to identify any cell types or cell states, it will be helpful to curate a list of marker genes that are associated with desmoplastic small round cell tumor cells or specific cell states. This may help assign cell types or can be used to validate assigned cell types and cell states. To do this, we will use our pre-existing data and literature to identify markers that can be used to identify:
Desmoplastic small round cell tumor cells (e.g., ST6GALNAC5 and CACNA2D2)
Cell lineages (e.g., epithelial, mesenchymal)
Step 2
Next, we must identify which cells in each sample are tumor or normal.
There are multiple ways to do this:
Use the Seurat function AddModuleScore to create a summary metric to identify tumor cells
In addition, we can perform unsupervised clustering, which should separate the tumor cells and normal cells. Next, we can annotate the clusters based on the marker genes.
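As a rough illustration of the summary-metric idea, the Python sketch below computes a simplified module score per cell: the mean expression of the tumor marker genes minus the mean expression of a set of control genes. This is not Seurat's implementation, which draws its control genes from expression-matched bins. The control gene names (`CTRL1`, `CTRL2`), the expression values, and the cutoff of 0 are all assumptions made for the example.

```python
from statistics import mean

# Tumor markers named in Step 1; control genes are hypothetical placeholders.
TUMOR_MARKERS = ["ST6GALNAC5", "CACNA2D2"]
CONTROL_GENES = ["CTRL1", "CTRL2"]

# Toy log-normalized expression per cell (values invented for illustration).
cells = {
    "cell_1": {"ST6GALNAC5": 2.4, "CACNA2D2": 1.9, "CTRL1": 1.0, "CTRL2": 1.2},
    "cell_2": {"ST6GALNAC5": 0.0, "CACNA2D2": 0.1, "CTRL1": 1.1, "CTRL2": 0.9},
    "cell_3": {"ST6GALNAC5": 1.8, "CACNA2D2": 2.2, "CTRL1": 0.9, "CTRL2": 1.1},
}


def module_score(expr, marker_genes, control_genes):
    """Mean marker expression minus mean control expression for one cell."""
    marker_mean = mean(expr.get(gene, 0.0) for gene in marker_genes)
    control_mean = mean(expr.get(gene, 0.0) for gene in control_genes)
    return marker_mean - control_mean


scores = {
    cell_id: module_score(expr, TUMOR_MARKERS, CONTROL_GENES)
    for cell_id, expr in cells.items()
}

# Assumed cutoff: a positive score means the markers exceed background.
labels = {cell_id: "tumor" if s > 0 else "normal" for cell_id, s in scores.items()}
```

In practice a fixed cutoff is fragile, so the score distribution would be inspected per sample (and compared against the clusters) before calling cells tumor or normal.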
Step 3
Next, we want to classify the types of normal cells that are present. Here, we can pull out any cells that are classified as normal cells and use a reference-based method to classify those cells, such as SingleR.
For the reference, we can use a publicly available reference from celldex containing both immune cells and other non-immune stromal cells (fibroblasts, endothelial cells).
The benefit of using celldex is the presence of cell ontology terms included in the reference datasets.
We will then need to validate these findings, which we can do by first curating a list of known markers for the normal cell types in our dataset. Then, we can plot the expression of those known markers across the cell types. We expect that the cells assigned to the cell type where that gene is a marker gene would have the highest expression. One thing to note is that there are some cases where we expect to see correlated expression of multiple marker genes, indicating a specific cell type. In addition, we will try not to subtype the normal cells unless there are verifiable markers.
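The validation check described above can be sketched in a few lines of Python: for each known marker, compute its mean expression within each assigned cell type and confirm that the expected cell type ranks highest. The marker-to-cell-type pairs are common textbook examples rather than a curated list, the expression values are invented, and `check_markers` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Example marker -> expected cell type pairs (illustrative, not curated).
expected = {"CD3E": "T cell", "COL1A1": "fibroblast", "PECAM1": "endothelial cell"}

# (assigned cell type, toy log-normalized expression) for each cell.
annotated_cells = [
    ("T cell", {"CD3E": 2.1, "COL1A1": 0.0, "PECAM1": 0.1}),
    ("T cell", {"CD3E": 1.7, "COL1A1": 0.2, "PECAM1": 0.0}),
    ("fibroblast", {"CD3E": 0.0, "COL1A1": 3.0, "PECAM1": 0.2}),
    ("fibroblast", {"CD3E": 0.1, "COL1A1": 2.6, "PECAM1": 0.1}),
    ("endothelial cell", {"CD3E": 0.0, "COL1A1": 0.3, "PECAM1": 2.8}),
]


def mean_expression_by_type(cells, gene):
    """Mean expression of one gene within each assigned cell type."""
    by_type = defaultdict(list)
    for cell_type, expr in cells:
        by_type[cell_type].append(expr.get(gene, 0.0))
    return {cell_type: mean(values) for cell_type, values in by_type.items()}


def check_markers(cells, expected_markers):
    """For each marker, report the top-expressing type and whether it matches."""
    report = {}
    for gene, cell_type in expected_markers.items():
        means = mean_expression_by_type(cells, gene)
        top_type = max(means, key=means.get)
        report[gene] = (top_type, top_type == cell_type)
    return report


report = check_markers(annotated_cells, expected)
```

A marker whose top-expressing cell type does not match the expectation flags an annotation worth inspecting by hand, for example with the expression plots mentioned above.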
Step 4
The last thing we will want to do is identify any tumor cells that can be further classified into known cell states or lineages. Desmoplastic small round cell tumors are driven by the pathognomonic EWS::WT1 fusion gene. However, literature has shown tumor cells can display multi-lineage expression, including epithelial, mesenchymal, muscle, and neuroendocrine markers.
Immunohistochemistry has identified markers associated with the different lineages in desmoplastic small round cell tumor specimens. Upon identifying the tumor cells, we will analyze classical markers associated with the lineages. Cells may express markers from only one lineage or multiple lineages. We can perform unsupervised clustering on tumor cells to better separate and identify these cell states.
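Because a tumor cell may express markers from one lineage or several, one simple way to record this is to score each lineage separately and keep every lineage that passes a cutoff. The Python sketch below does that with mean marker expression. The marker genes are placeholders for the lineages named above (cytokeratins, vimentin, desmin, and neuroendocrine genes), not the curated list from Step 1, and the expression values and the cutoff of 1.0 are assumptions.

```python
from statistics import mean

# Placeholder lineage markers; the curated list from Step 1 would replace these.
LINEAGE_MARKERS = {
    "epithelial": ["KRT8", "KRT18"],
    "mesenchymal": ["VIM"],
    "muscle": ["DES"],
    "neuroendocrine": ["ENO2", "SYP"],
}

# Toy log-normalized expression for cells already classified as tumor.
tumor_cells = {
    "tumor_1": {"KRT8": 2.0, "KRT18": 1.5, "VIM": 0.2, "DES": 0.1, "ENO2": 0.0, "SYP": 0.0},
    "tumor_2": {"KRT8": 1.2, "KRT18": 1.4, "VIM": 0.3, "DES": 2.2, "ENO2": 0.1, "SYP": 0.2},
    "tumor_3": {"KRT8": 0.1, "KRT18": 0.0, "VIM": 0.4, "DES": 0.2, "ENO2": 0.3, "SYP": 0.1},
}


def expressed_lineages(expr, lineage_markers, cutoff=1.0):
    """Return the sorted lineages whose mean marker expression exceeds the cutoff."""
    hits = []
    for lineage, genes in lineage_markers.items():
        if mean(expr.get(gene, 0.0) for gene in genes) > cutoff:
            hits.append(lineage)
    return sorted(hits)


lineage_calls = {
    cell_id: expressed_lineages(expr, LINEAGE_MARKERS) or ["unassigned"]
    for cell_id, expr in tumor_cells.items()
}
```

Keeping a list of lineages per cell, rather than forcing a single label, preserves the multi-lineage expression that the literature describes; the unsupervised clusters can then be summarized by which lineage combinations dominate.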
Existing modules
Yes, this module is based on an existing module of Ewing sarcoma samples in discussion.
Input data
This analysis will use the processed SingleCellExperiment objects for SCPCP000013. Depending on the methods and tools implemented, we may also need to use the processed AnnData objects.
Additionally, we will obtain a reference dataset from the celldex package to annotate normal cells.
Scientific literature
The following papers may be helpful in curating marker gene lists to use for this analysis:
New transcriptional-based insights into the pathogenesis of desmoplastic small round cell tumors (DSRCTs)
Desmoplastic small round cell tumor is dependent on the EWS-WT1 transcription factor
Single-cell multiomics profiling reveals heterogeneous transcriptional programs and microenvironment in DSRCTs
Comprehensive Molecular Profiling of Desmoplastic Small Round Cell Tumor
Other details
Computational resources
All analysis will be done on my local machine if possible.
Timeline
This analysis will be done in stages, where each stage is, at minimum, a single pull request:
Stage 1: Curate lists of marker genes from the literature
Stage 2: Classify cells as tumor vs. normal in a single sample
Stage 3: Classify normal cells in a single sample
Stage 4: Classify tumor cell states in a single sample
Stage 5: Apply this analysis to 2-3 samples to create a reference dataset
Stage 6: Create the code to run SingleR on all remaining desmoplastic small round cell tumor samples using our well-annotated reference dataset