This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

scripts for immune deconvolution #292

Merged
merged 29 commits into from
Jan 7, 2020

komalsrathi
Collaborator

Purpose/implementation Section

What scientific question is your analysis addressing?

Immune profiling of PBTA histologies.

What was your approach?

To deconvolute immune cell types across 20 histologies from the PBTA cohort, we use the R package immunedeconv, which provides a unified interface for cell type quantification from RNA-seq gene expression data using multiple methods (n = number of cell types each can deconvolute): EPIC (n = 6), TIMER (n = 6), MCP-counter (n = 8), quanTIseq (n = 10), CIBERSORT (n = 22) and xCell (n = 64). Of the six methods, we chose the two most comprehensive, CIBERSORT and xCell, which deconvolute 22 and 64 cell types, respectively. In a benchmarking study, xCell was also shown to robustly outperform all other methods, including CIBERSORT.

In general, immune deconvolution is a hard problem: methods do not always agree, because they differ in their underlying algorithms and in how well they deconvolute particular cell types. To compare the scores from these two prominent methods, we took the 13 cell types common to both and created a correlation plot per cell type per histology.

To look for histology-specific enrichment of certain cell types, we created heatmaps for xCell and CIBERSORT representing average immune cell scores normalized across histologies.

What GitHub issue does your pull request address?

#15

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

Is there anything that you want to discuss further?

Any suggestions are most welcomed.

Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?

Yes

Results

What types of results are included (e.g., table, figure)?

01-immune-deconv.R creates an RData containing immune scores corresponding to xCell and CIBERSORT.

02-summary-plots.R creates a correlation plot between cell types common between xCell and CIBERSORT as well as heatmaps containing average immune scores per cell type per histology for each method.

What is your summary of the results?

Correlation plot:
Overall, across all histologies and cell types, the predictions did not correlate well (Pearson correlation 0.12), but we did observe a high correlation (> 0.5) for certain cell types, namely Macrophages M2, Monocytes, Neutrophils and T cell CD8+, across specific histologies.

Heatmaps:
We observed a similar pattern in the heatmaps: the cell types that were highly correlated between xCell and CIBERSORT are also the ones found in high proportions compared to other cell types. Using the profiling method, we could distinctly identify three histologies enriched in specific cell types. Histiocytic tumors (n = 5), which are known to be characterized by an increase in Monocytes, Macrophages and Dendritic cells[3], were enriched in Myeloid dendritic cell activated, Monocytes, Neutrophils and Macrophage M1/M2; Lymphomas (n = 1) were enriched in B cell plasma, Class-switched memory B cell, B cell naive, B cell, B cell memory, T cell regulatory (Tregs) and Common Lymphoid progenitor[4]; and Germ cell tumors (n = 13), as expected[5], were enriched in T cell CD8+ and T cell CD4+ naive cells.

Because we only have a sample size of 1 for Lymphomas, we could not compute a correlation between the xCell and CIBERSORT immune scores across the various B cell types. Looking at the heatmap of average scores per cell type per histology, however, it is highly likely that the two would correlate.

Reproducibility Checklist

  • The dependencies required to run the code in this pull request have been added to the project Dockerfile.
  • This analysis has been added to continuous integration.

@jaclyn-taroni
Member

Hi @komalsrathi, thank you for filing this! We are unlikely to provide a full review prior to the holiday this week; thanks for your patience.

I noticed that you are adding the CIBERSORT R script here (analyses/immune-deconv/CIBERSORT.R). Unfortunately, I believe that the CIBERSORT licensing terms are incompatible with the license we use for this repository, which is BSD-3 for source code. Tagging @cgreene who has run into this issue with CIBERSORT before if I recall correctly.

If this is in fact the case, I'd recommend the following:

  • Removing analyses/immune-deconv/CIBERSORT.R from the repository (that is, stop tracking it in git) and adding this file to a .gitignore file in the analyses/immune-deconv directory.
  • Making the CIBERSORT step optional such that you are able to skip it in continuous integration. This is required because when the repository gets checked out first thing in continuous integration CIBERSORT.R will not be available. This is probably relatively straightforward in analyses/immune-deconv/01-immune-deconv.R. (You may find the Passing variables only in CI section of the README helpful depending on how you choose to implement this.) I have two thoughts about how to possibly deal with this in analyses/immune-deconv/02-summary-plots.R:
    • Checking for the existence of the CIBERSORT results before making the correlation plot and the CIBERSORT heatmap. Because we also want to ensure that the correlation plot steps execute in continuous integration, you could check for correlation between xCell and another method specified as an option.
    • Because you're adding analyses/immune-deconv/deconv-output.RData as part of this pull request, you could have logic in analyses/immune-deconv/01-immune-deconv.R where you check for the existence of the analyses/immune-deconv/deconv-output.RData file and only overwrite it if specified as an option. You would elect not to overwrite the file in CI and then it would use this one that you've committed in 02-summary-plots.R as is.
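
As a rough illustration of the second option, a wrapper could skip step 01 whenever the committed results file exists and no overwrite was requested. This is only a sketch: `should_run_deconv` and the `OVERWRITE_DECONV` variable are hypothetical names, and the real module might expose the choice as a command-line option to the R script instead.

```shell
# Sketch: only regenerate the committed results file when explicitly asked.
# OVERWRITE_DECONV and should_run_deconv are hypothetical names for illustration.
should_run_deconv() {
  output_file="$1"
  overwrite="${OVERWRITE_DECONV:-0}"
  if [ -f "$output_file" ] && [ "$overwrite" != "1" ]; then
    return 1   # file exists and overwrite not requested: reuse it as is
  fi
  return 0     # file missing, or overwrite requested: run step 01
}

# Example use (commented out so the sketch has no side effects):
#   if should_run_deconv analyses/immune-deconv/deconv-output.RData; then
#     Rscript analyses/immune-deconv/01-immune-deconv.R
#   fi
```

In CI the variable would simply be left unset, so 02-summary-plots.R would pick up the committed file.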

Looking forward to your thoughts on implementation if we need to work around CIBERSORT.

Thank you again for the contribution and have a happy holiday break!

@komalsrathi
Collaborator Author

@jaclyn-taroni Thank you for your feedback. I agree, we shouldn't have the CIBERSORT.R and LM22.txt available (my bad for overlooking that). I am going to think about this a little more and get back with my thoughts after the break. Have a Happy Thanksgiving!

@cgreene
Collaborator

cgreene commented Nov 27, 2019

CIBERSORT's license used to be... extensive. @dhimmel made a copy of it here:
https://gist.github.com/dhimmel/58dcd9b512e669f20a65ddf73997b733

Is the license the same or have they modified it? It seems plausible that they could have updated it so that you could add a command to download it to the CI system with wget if there's a URL where it's available.

@jaclyn-taroni
Member

Ah, okay it does look like it's been updated @cgreene, I am copying the text from the modal that comes up when you click Non-Commercial Terms of Use on the About page (I do not have an account):

The Board of Trustees of the Leland Stanford Junior University (“Stanford”) provides CIBERSORT website features and services (“Service”) free of charge for non-commercial use only. Use of the Service by any commercial entity for any purpose, including research, is prohibited.

By using the Service, you agree to be bound by the terms of this Agreement. Please read it carefully.

You agree not to use the Service for commercial advantage, or in the course of for-profit activities. You agree not to use the Service on behalf of any organization that is not a non-profit organization. Commercial entities wishing to use this Service should contact Stanford University’s Office of Technology Licensing and reference docket S13-279.
You agree not to use the Service for diagnosis, treatment, cure, prevention, or mitigation of disease or other conditions in man or other animals. You acknowledge that the Service and any information obtained therefrom is not intended to substitute for care by a licensed healthcare professional.

THE SERVICE IS OFFERED “AS IS”, AND, TO THE EXTENT PERMITTED BY LAW, STANFORD MAKES NO REPRESENTATIONS AND EXTENDS NO WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED. STANFORD SHALL NOT BE LIABLE FOR ANY CLAIMS OR DAMAGES WITH RESPECT TO ANY LOSS OR OTHER CLAIM BY YOU OR ANY THIRD PARTY ON ACCOUNT OF, OR ARISING FROM THE USE OF THE SERVICE. YOU HEREBY AGREE TO DEFEND AND INDEMNIFY STANFORD, ITS TRUSTEES, EMPLOYEES, OFFICERS, STUDENTS, AGENTS, FACULTY, REPRESENTATIVES, AND VOLUNTEERS (“STANFORD INDEMNITEES”) FROM ANY LOSS OR CLAIM ASSERTED AGAINST STANFORD INDEMNITEES ARISING FROM YOUR USE OF THE SERVICE.

All rights not expressly granted to you in this Agreement are reserved and retained by Stanford or its licensors or content providers. This Agreement provides no license under any patent.
You agree that this Agreement and any dispute arising under it is governed by the laws of the State of California, United States of America, applicable to agreements negotiated, executed, and performed within California.

Subject to your compliance with the terms and conditions set forth in this Agreement, Stanford grants you a revocable, non-exclusive, non-transferable right to access and make use of the Service.

@jaclyn-taroni
Member

Hi @komalsrathi, I hope you had a great break! I wanted to check in. I have not requested or provided a review of this yet because of the expected CIBERSORT changes. Do you have an idea of when you will have the opportunity to revisit this pull request? Thank you!

@komalsrathi
Collaborator Author

@jaclyn-taroni I should have the necessary changes by end of next week.

@jaclyn-taroni
Member

Sounds good, thanks!

@komalsrathi
Collaborator Author

@jaclyn-taroni I am working on this today, will let you know when this is ready for review.

@komalsrathi
Collaborator Author

@jaclyn-taroni I have made the changes that you suggested:

  1. Added CIBERSORT.R and LM22.txt to analyses/immune-deconv/.gitignore
  2. Use xCell as method 1 because it has the most immune cell types to deconvolute and performs better than others (I have added citations in corresponding methods) and pass the second method as a variable.
  3. For the second method, the script will look for the env variable OPENPBTA_DECONV_METHOD. If it is set to cibersort_abs, then the script will look for CIBERSORT.R and LM22.txt files under analyses/immune-deconv and run CIBERSORT.
  4. If OPENPBTA_DECONV_METHOD is not set, then the second method defaults to mcp_counter (for MCP-counter). This is one of the three best methods (along with xCell and CIBERSORT) found in a benchmarking study.
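
The selection logic in points 3 and 4 could be sketched in shell along these lines. The variable name and the two file names are taken from the list above; the function name `select_second_method` and its directory argument are hypothetical, and the actual check may live inside the R script rather than in a wrapper.

```shell
# Sketch: choose the second deconvolution method from an environment variable.
# select_second_method and its argument are hypothetical names for illustration.
select_second_method() {
  module_dir="${1:-analyses/immune-deconv}"
  # Fall back to MCP-counter when OPENPBTA_DECONV_METHOD is unset
  # (this sketch also treats an empty value as unset).
  method="${OPENPBTA_DECONV_METHOD:-mcp_counter}"

  if [ "$method" = "cibersort_abs" ]; then
    # CIBERSORT needs both files, which are git-ignored and must be supplied locally.
    for f in CIBERSORT.R LM22.txt; do
      if [ ! -f "$module_dir/$f" ]; then
        echo "Missing $module_dir/$f; cannot run cibersort_abs" >&2
        return 1
      fi
    done
  fi
  echo "$method"
}
```

With the variable unset, as in CI, the function prints mcp_counter and never looks for the CIBERSORT files.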

Please review when you can and let me know any changes/suggestions. Thank you!!

@jaclyn-taroni jaclyn-taroni self-requested a review December 30, 2019 13:52
@jaclyn-taroni
Member

The CIBERSORT.R and LM22.txt files are ignored but they have not been removed/deleted yet. I will delete these at the same time as updating this branch to be in sync with master and then provide a review.
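
For context, listing a path in .gitignore does not untrack a file that git already tracks; it has to be removed from the index as well. A minimal sketch of that step follows, where `untrack_ignored_files` is a hypothetical helper name and not something in the repository:

```shell
# Sketch: stop tracking files that are already listed in .gitignore.
# `git rm --cached` removes a path from the index but leaves the local copy in place.
# untrack_ignored_files is a hypothetical helper name for illustration.
untrack_ignored_files() {
  repo_dir="$1"
  shift
  for path in "$@"; do
    git -C "$repo_dir" rm --cached --quiet -- "$path" || return 1
  done
}

# Example use (commented out so the sketch has no side effects):
#   untrack_ignored_files . analyses/immune-deconv/CIBERSORT.R analyses/immune-deconv/LM22.txt
#   git commit -m "Stop tracking CIBERSORT files"
```

Note that this only affects new commits; the files remain in earlier commits of the branch history.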

Member

@jaclyn-taroni jaclyn-taroni left a comment

Hi @komalsrathi,

Thanks for adding this analysis! I've added specific comments aimed at bringing the folder organization more in line with the rest of the repository and using rprojroot to head off any potential issues with sourcing your plot theme file. In addition to the line comments I've added, I have some more general comments and questions about this analysis:

Thanks again!

@cgreene
Collaborator

cgreene commented Jan 3, 2020

@komalsrathi : can you help us understand why you needed to force push here:

komalsrathi force-pushed the komalsrathi:immune-deconv branch from c79de3a to 401b9ad 37 minutes ago

Do you know what changes were wiped out by this? I'm guessing they were edits that were made by others or on GitHub.

@komalsrathi
Collaborator Author

komalsrathi commented Jan 4, 2020 via email

@cgreene
Collaborator

cgreene commented Jan 4, 2020

@komalsrathi Just corresponded a bit with @jaclyn-taroni. Fortunately she checked it out locally so she could force push back to the state things were at when she was last making changes.

How many changes were you intending to make (vs. how many of the changes in the commits came in from the merge)?

@komalsrathi
Collaborator Author

@cgreene I guess I was just trying to pull the changes that Jaclyn had made here into my local branch, so that I had all the changes on my local system and could then work on her other suggestions. There were some changes that I had already accepted, e.g. where she had asked to create a plots directory.

@cgreene
Collaborator

cgreene commented Jan 4, 2020

Ok! I see merge markers so it sounds like you pulled, there were merge conflicts, and then when you tried to push it didn't work b/c it wasn't yet merged. I will let @jaclyn-taroni propose next steps on this, but I think her force pushing and you re-integrating those changes will be easier than any other alternative. Thanks for your help understanding what the goal was here! It'll help us get the best path to a successful merge. 😁
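
For anyone retracing this, one quick way to spot leftover merge markers before pushing is to search the working tree for lines that start a conflict block. This is a minimal sketch: `find_conflict_markers` is a hypothetical helper name, and it assumes a grep that supports recursive search (`-r`), as GNU and BSD grep do.

```shell
# Sketch: list files that still contain git conflict markers.
# find_conflict_markers is a hypothetical helper name for illustration.
find_conflict_markers() {
  dir="${1:-.}"
  # -r: recurse, -l: print file names only, -E: extended regex.
  # Only the opening and closing markers are matched; a bare "=======" line
  # is skipped because it also appears in ordinary text files.
  grep -rlE '^(<<<<<<<|>>>>>>>)( |$)' "$dir"
}
```

The function exits non-zero when no file matches, so it can be used directly in an `if` test.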

@jaclyn-taroni
Member

The suggestions that were applied earlier are marked as resolved so they can no longer be applied. I will mark them as unresolved so we're able to apply the changes via the GitHub interface. That will get us back to the point where I submitted my review.

@jaclyn-taroni
Member

Alright! You are all set to apply my suggestions via the GitHub interface. Once you've applied the suggestions, you essentially want to get the same state of the branch on your local machine as what's here remotely. You could do what is called a reset in git parlance to overwrite what you have locally (should take care of the merge conflicts).

This StackOverflow post has a lot of good information in it and is what I referenced while writing this up.

The command for that is (warning this will get rid of any local changes!):

git reset --hard origin/immune-deconv

I am happy to walk you through all of that when we meet in person, but I wanted to document this while it was fresh in my mind.

komalsrathi and others added 5 commits January 6, 2020 11:15
Co-Authored-By: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
Co-Authored-By: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
Co-Authored-By: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
Co-Authored-By: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
Co-Authored-By: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
@komalsrathi
Collaborator Author

@jaclyn-taroni Hi Jaclyn, I think this should be ok now. This is what I did:

git fetch origin immune-deconv
git reset --hard origin/immune-deconv

Then, I applied the changes via the GitHub interface and ran the same commands again:

git fetch origin immune-deconv
git reset --hard origin/immune-deconv

This is the log on my local system:

➜  OpenPBTA-analysis git:(immune-deconv) git log --oneline
b75cd7f (HEAD -> immune-deconv, origin/immune-deconv) Update analyses/immune-deconv/run-immune-deconv.sh
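
Besides reading the log, the same check can be done by comparing commit hashes directly. A small sketch, where `branch_in_sync` is a hypothetical helper name and the remote ref is whatever was last fetched:

```shell
# Sketch: check that the local HEAD points at the same commit as a remote-tracking ref.
# branch_in_sync is a hypothetical helper name for illustration.
branch_in_sync() {
  repo_dir="$1"
  remote_ref="$2"   # e.g. origin/immune-deconv
  local_sha="$(git -C "$repo_dir" rev-parse HEAD)" || return 2
  remote_sha="$(git -C "$repo_dir" rev-parse "$remote_ref")" || return 2
  [ "$local_sha" = "$remote_sha" ]
}

# Example use (commented out so the sketch has no side effects):
#   git fetch origin immune-deconv
#   branch_in_sync . origin/immune-deconv && echo "local branch matches remote"
```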

Now, I just need to make and commit the other changes that you suggested in the review. Can you confirm whether this sounds right?

@jaclyn-taroni
Member

Yep, this looks correct 👍 thanks!

@komalsrathi
Collaborator Author

komalsrathi commented Jan 7, 2020

Hi @jaclyn-taroni, here are my responses:

Hi @komalsrathi,

Thanks for adding this analysis! I've added specific comments aimed at bringing the folder organization more in line with the rest of the repository and using rprojroot to head off any potential issues with sourcing your plot theme file. In addition to the line comments I've added, I have some more general comments and questions about this analysis:

Done

  • Similarly, can you add a README to immune-deconv here? I think this addition will help with some of the outstanding questions I have (see below).

Here: https://github.com/komalsrathi/OpenPBTA-analysis/blob/immune-deconv/analyses/immune-deconv/README.md

  • How do you intend for this module to be used? Specifically, do you expect that folks will run only two methods (e.g., xCell and cibersort_abs) or is your intention to make this flexible enough that people can run multiple methods that they can compare downstream? I'm asking in part because you made reference to creating correlation plots for all methods in the past #15 (comment).

Please see the output section of: https://github.com/komalsrathi/OpenPBTA-analysis/blob/immune-deconv/analyses/immune-deconv/README.md#choice-of-method

https://github.com/komalsrathi/OpenPBTA-analysis/blob/immune-deconv/analyses/immune-deconv/README.md#01-immune-deconvr

  • You're combining the two RNA-seq matrices that have been split up because we observed technical effects. How does combining the matrices (or not) affect your results?

MCP-counter gives exactly the same result for polyA and stranded, whether the matrix is combined or not. CIBERSORT (abs.) also gives the same result for polyA, whether combined or not. For stranded data, however, there is some loss of information when using the combined matrix because polyA has fewer genes than stranded; if the stranded data is run independently, a few more cell types appear. For xCell, not much changes when using the combined matrix vs. the independent matrices (Pearson correlation 0.99). So, to address the issue with CIBERSORT (abs.), I have changed 01-immune-deconv.R so that we run the methods on both datasets independently and then combine the output. The results haven't changed much after applying this change.

Thanks again!

@jaclyn-taroni
Member

Modules at a glance added with 9fc1439

@komalsrathi
Collaborator Author

@jaclyn-taroni, I have made the discussed changes. Let me know whenever you get a chance to review this. And thanks for helping out!

Member

@jaclyn-taroni jaclyn-taroni left a comment

👍 looks good to me! Thanks for the updates! I'm going to update this branch to be in sync with the master branch (the CI failure is not related to your changes, but has been fixed elsewhere) and if this passes, I will merge.
