This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

CI calculation added for odds ratio #1204

Merged
jaclyn-taroni merged 8 commits into AlexsLemonade:master on Nov 28, 2021

Conversation

runjin326
Collaborator

@runjin326 runjin326 commented Nov 15, 2021

Purpose/implementation Section

What scientific question is your analysis addressing?

Add confidence interval calculation for odds ratio in interaction-plots module

What was your approach?

Three columns (standard_error_or, or_ci_lower_bound and or_ci_upper_bound) were added to the results tables, calculated as follows:

      standard_error_or = sqrt(1/mut11 + 1/mut00 + 1/mut10 + 1/mut01),
      or_ci_lower_bound = exp(log(odds_ratio) - 1.96 * standard_error_or),
      or_ci_upper_bound = exp(log(odds_ratio) + 1.96 * standard_error_or)
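
As a worked illustration of the calculation above, here is a minimal R sketch. The counts are made up, and it assumes that odds_ratio is the sample odds ratio of the 2x2 table, with mut11 counting samples mutated in both genes, mut10 and mut01 samples mutated in only one gene, and mut00 samples mutated in neither. The module may derive odds_ratio differently (for example from a Fisher's exact test), so treat this as a sketch rather than the module's code.

```r
# Made-up counts for illustration only.
mut11 <- 10  # both genes mutated (assumed meaning)
mut10 <- 20  # only gene 1 mutated (assumed meaning)
mut01 <- 15  # only gene 2 mutated (assumed meaning)
mut00 <- 55  # neither gene mutated (assumed meaning)

# Assumption: the sample odds ratio of the 2x2 table.
odds_ratio <- (mut11 * mut00) / (mut10 * mut01)

# Standard error of log(odds_ratio), as in the PR description.
standard_error_or <- sqrt(1/mut11 + 1/mut00 + 1/mut10 + 1/mut01)

# 95% interval computed on the log scale, then back-transformed.
or_ci_lower_bound <- exp(log(odds_ratio) - 1.96 * standard_error_or)
or_ci_upper_bound <- exp(log(odds_ratio) + 1.96 * standard_error_or)

print(c(odds_ratio = odds_ratio,
        lower = or_ci_lower_bound,
        upper = or_ci_upper_bound))
```

Because the interval is built on the log scale, it is asymmetric around odds_ratio but always brackets it when all four counts are positive.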

What GitHub issue does your pull request address?

#1203

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

Please check that the added CI-related columns are what was wanted.

Is there anything that you want to discuss further?

No.

Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?

Yes.

Results

What types of results are included (e.g., table, figure)?

All *.tsv files in the results folder have been updated with the added CI-related columns.

What is your summary of the results?

I spot-checked a couple of entries, and the numbers make sense.

Reproducibility Checklist

  • The dependencies required to run the code in this pull request have been added to the project Dockerfile.
  • This analysis has been added to continuous integration.

Documentation Checklist

  • This analysis module has a README and it is up to date.
  • This analysis is recorded in the table in analyses/README.md and the entry is up to date.
  • The analytical code is documented and contains comments.

Member

@jashapiro jashapiro left a comment

Hi @runjin326-
Thank you for implementing this!

Unfortunately, the simple OR CI calculation that you implemented here is probably not quite what we will want, as it results in Inf/NA any time we have a 0 count. There are two potential solutions that I can think of:

One is to implement Haldane's correction when there are zero counts. This would mean adding 0.5 to all counts in the table if any counts are zero, then performing the same standard error and CI calculation as before. This is probably the easiest to implement, but as it starts to get a bit messier, I would pull it out into a separate function, similar to the row_fisher() function. One note here is that the CI calculated this way starts from the "corrected" OR, which means that we often end up with a CI that does not include the uncorrected OR of 0, so I would probably set the lower bound value to min(or_uncorrected, or_lowerbound).

Another option is to use the PropCIs::orscoreci() function, which is probably more accurate (it uses a method from Agresti), but would involve more backend work, as it requires adding a new package to the Docker image.
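
The first option can be sketched in base R roughly as follows. The function name or_ci_haldane and its argument order are hypothetical, chosen only for illustration, and the odds ratio is assumed to be the sample odds ratio of the 2x2 table; this is not the code that was merged.

```r
# Hypothetical helper; name, arguments, and return shape are assumptions.
or_ci_haldane <- function(mut11, mut10, mut01, mut00, z = 1.96) {
  counts <- c(mut11, mut10, mut01, mut00)

  # Odds ratio from the raw counts; can be 0, Inf, or NaN with zero counts.
  or_uncorrected <- (mut11 * mut00) / (mut10 * mut01)

  # Haldane's correction: add 0.5 to every cell, only if some cell is zero.
  if (any(counts == 0)) {
    counts <- counts + 0.5
  }

  or_corrected <- (counts[1] * counts[4]) / (counts[2] * counts[3])
  se <- sqrt(sum(1 / counts))
  lower <- exp(log(or_corrected) - z * se)
  upper <- exp(log(or_corrected) + z * se)

  # Keep the interval from excluding the uncorrected estimate on the low side.
  # A NaN estimate (zeros in both numerator and denominator) is skipped.
  if (!is.nan(or_uncorrected)) {
    lower <- min(or_uncorrected, lower)
  }

  c(lower = lower, upper = upper)
}

# The plain formula breaks with a zero count: 1/0 is Inf in R.
naive_se <- sqrt(1/0 + 1/55 + 1/20 + 1/15)

# With the correction, both bounds are finite.
ci_zero <- or_ci_haldane(mut11 = 0, mut10 = 20, mut01 = 15, mut00 = 55)
print(ci_zero)
```

This sketch only adjusts the lower bound, as described above. It does not handle the mirror case where a zero in mut10 or mut01 makes the uncorrected OR infinite; that would need a decision that is not covered here.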

@runjin326
Collaborator Author

@jashapiro, thanks so much for your prompt review - I would probably take the first approach and ping you for another review when completed.

@runjin326
Collaborator Author

@jashapiro, I pulled the CI calculation out into a function, and the results now have only one CI column, using the calculation that you specified. Could you please check whether this is what you expect? Thanks!

Member

@jashapiro jashapiro left a comment

Hi @runjin326, I saw that some of the intervals still did not make sense, which I think has to do with when the function is being applied. I also modified the code to more accurately reflect the full adjustment as it has to be applied (if odds_ratio is zero, you can't take the log, which is part of the reason for this procedure).

I have not tested the code I wrote, but hopefully it works, or does so mostly!

@runjin326
Collaborator Author

@jashapiro, thanks so much for the detailed review and instructions. I fixed the scripts based on your suggestions. Please check whether this now looks good.

Member

@jashapiro jashapiro left a comment

LGTM! I spot-checked some of the confidence intervals, and they seem correct to me.

@jaclyn-taroni jaclyn-taroni merged commit a16bb9d into AlexsLemonade:master Nov 28, 2021
LauraEgolf added a commit to LauraEgolf/OpenPBTA-analysis that referenced this pull request Jan 5, 2022
commit ddb6a8c
Author: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
Date:   Tue Jan 4 09:33:28 2022 -0500

    OncoPrint figures (AlexsLemonade#1200)

    * Initial pass at histology palette overhaul

    * Add display columns for legends

    * Remove histology label palette NB from CI

    * Add broad histology order

    * Add oncoprint_include

    * Add oncoprint grouping info to palette df

    * Make cancer_group other darker; rerun

    * Pretty sure change in AlexsLemonade#1161 was unintentional

    * We have been manually editing the oncoprint palette I see

    * Remove/move/rename to prep for an oncoprint fig2

    * Missed the instance where I recode based on other hex code

    * Rerun oncoprint now that palette is fixed

    * Change up approach to oncoprint

    * Remove oncoprint script that assembled from PNGs

    * Remove oncoprint PNGs

    * Add new oncoprint figure script

    saves individual panels as PDFs, also uses new palette

    * Add PDFs of oncoprints and legends

    * Add PDF draft of assembled figure 2

    * Update figure shell script

    * Add fig 2 README

    * Add PNG version of draft figure 2

    * Tweak the cancer group palette

    * Reorder HGAT legend; rerun with new palette

    * Remove outdated assembled figure

    * Add compiled versions

    Sizing and spacing probably still need work!

    * Update oncoprint palette

    * Rerun module with new palette

    * Rerun pub ready figure versions of oncoprints

    * Add complex event to oncoprint palette

    * Rerun module with complex event in palette

    * Rerun oncoprint fig script with complex events

    * Darker complex event hex code

    * Try a "wine" color for complex event

    * Remove outdated PNG & PDF of compiled fig

    * Update figures/generate-figures.sh

    Co-authored-by: Chante Bethell  <43576623+cbethell@users.noreply.github.com>

    Co-authored-by: Chante Bethell  <43576623+cbethell@users.noreply.github.com>

commit 32a81d2
Author: runjin326 <47674661+runjin326@users.noreply.github.com>
Date:   Fri Dec 3 17:29:51 2021 -0500

    tables and script added (AlexsLemonade#1199)

    * tables and script added

    * column name changed

    * modified per suggestion + add README

    * tables modified

    * add MB WGS here

    * suggestion implemented and docker file modified

    * re-run on docker

    * dockerfile reverted

    * space removed

    * openxlsx added back to docker

    Co-authored-by: Jo Lynne Rokita <jolynnerokita@d3b.center>
    Co-authored-by: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>

commit 6f3e98c
Author: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
Date:   Fri Dec 3 08:34:43 2021 -0700

    Revamp `figures` documentation (AlexsLemonade#1209)

    * WIP: figure docs overhaul

    * Polish figures/README.md

    * Add TOC

    * Remove SNV consensus; add breakpoint

    * Update doc to reflect removal of consensus SNV

    * Rearrange table for figure scripts

    * Update figures/README.md

    Co-authored-by: Candace Savonen <cansav09@gmail.com>

    * Update figures/README.md

    Co-authored-by: Candace Savonen <cansav09@gmail.com>

commit c2b5a70
Author: Komal Rathi <komalsrathi@users.noreply.github.com>
Date:   Tue Nov 30 08:42:07 2021 -0500

    add molecular subtype to wgs-only samples (AlexsLemonade#1206)

    Co-authored-by: Jo Lynne Rokita <jolynnerokita@d3b.center>

commit 21d501f
Author: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
Date:   Sun Nov 28 19:00:41 2021 -0500

    Update interaction plot to use new cancer group palette (AlexsLemonade#1197)

    * Initial pass at histology palette overhaul

    * Add display columns for legends

    * Remove histology label palette NB from CI

    * Add broad histology order

    * Add oncoprint_include

    * Add oncoprint grouping info to palette df

    * Make cancer_group other darker; rerun

    * Save the scatterplot panel as a PDF

    * Add a shell script to run the chromothripsis module

    Also adjust CI accordingly

    * Add session info to R Markdown files

    * Rerun module with changes & commit output

    * Add chromothripsis module to figures script

    * Add figures script for chromothripsis bar plot by cancer group

    * Add Rscript to figures shell script

    Also add the copied version of the other chromothripsis panel

    * Remove lettering from panels

    * Use PDF output for plots with all disease types

    * Add ggpattern to Docker

    * Add patterning to stacked bar chart

    * Move over main interaction figure, now a PDF

    * Missed the instance where I recode based on other hex code

    * Outline the stacked bars in case that's helpful

    * Change up approach to oncoprint

    * Tweak the cancer group palette

    * Remove > 0 filter; rerun with new palette

    * With new cancer group palette, remove the patterns

    * Apply suggestions from code review

    Co-authored-by: Ally Hawkins <54039191+allyhawkins@users.noreply.github.com>

    * Rerun plotting with code changes

    * ggpattern no longer req'd in this branch

    * Remove from script as well

    * Apply suggestions from code review

    Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>

    Co-authored-by: Ally Hawkins <54039191+allyhawkins@users.noreply.github.com>
    Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>

commit 2e6cc1b
Author: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
Date:   Sun Nov 28 17:45:33 2021 -0500

    Update TP53 classifier plots by broad histology and cancer group to fit new conventions (AlexsLemonade#1205)

    * Update tp53 by histology plotting to use new palette; save as PDF

    * Add last plotting script to module shell script

    * Add documentation for recently added script

    * Update variable name

    * Rerun with fix

commit a16bb9d
Author: runjin326 <47674661+runjin326@users.noreply.github.com>
Date:   Sun Nov 28 16:23:01 2021 -0500

    CI calculation added for odds ratio (AlexsLemonade#1204)

    * CI calc added for odds ratio

    * add 0.5 if 0

    * function modified

    * function def

    * rerun and add results

    Co-authored-by: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>

commit 67b555b
Author: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
Date:   Sun Nov 28 14:21:18 2021 -0500

    Update Figure 4 single panel PDFs, add compiled figure (AlexsLemonade#1194)

    * Initial pass at histology palette overhaul

    * Add display columns for legends

    * Remove histology label palette NB from CI

    * Add broad histology order

    * Add oncoprint_include

    * Add oncoprint grouping info to palette df

    * Allow for alteration of alpha values in custom function

    * Remove immune deconv, rework to use new palette

    * Use updated palette paradigm in telomerase activity box plot

    * Move up telomerase activity plotting so panel can be used downstream

    * Take an approach where individual panels are saved as PDFs

    * panels sub directory

    * Make cancer_group other darker; rerun

    * Rerun telomerase activities

    * Compile individual panels via AI & document

    * Missed the instance where I recode based on other hex code

    * Change up approach to oncoprint

    * Add theme_pubr() to UMAP panel

    * Tweak the cancer group palette

    * Rerun with new cancer group palette

    * Remove outdated assembled figure

    * Updated versions of compiled figure

    * Remove commented out code

commit 6755273
Author: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
Date:   Sun Nov 28 10:51:18 2021 -0500

    Add Figure 1A component (AlexsLemonade#1195)

    * Add most final component of Fig1 from Biorender

    * Move into panels subdir

commit 0ed1f9b
Author: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
Date:   Wed Nov 17 15:45:35 2021 -0500

    Update chromothripsis panels to use new cancer group palette (Fig 3) (AlexsLemonade#1196)

    * Initial pass at histology palette overhaul

    * Add display columns for legends

    * Remove histology label palette NB from CI

    * Add broad histology order

    * Add oncoprint_include

    * Add oncoprint grouping info to palette df

    * Make cancer_group other darker; rerun

    * Save the scatterplot panel as a PDF

    * Add a shell script to run the chromothripsis module

    Also adjust CI accordingly

    * Add session info to R Markdown files

    * Rerun module with changes & commit output

    * Add chromothripsis module to figures script

    * Add figures script for chromothripsis bar plot by cancer group

    * Add Rscript to figures shell script

    Also add the copied version of the other chromothripsis panel

    * Missed the instance where I recode based on other hex code

    * Change up approach to oncoprint

    * Tweak the cancer group palette

    * Remove > 0 filter; rerun with new palette

    * Apply suggestions from code review

    Co-authored-by: Ally Hawkins <54039191+allyhawkins@users.noreply.github.com>

    * Rerun plotting with code changes

    Co-authored-by: Ally Hawkins <54039191+allyhawkins@users.noreply.github.com>

commit 5f57826
Author: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
Date:   Mon Nov 15 16:00:39 2021 -0500

    Create minimal main display item palette (AlexsLemonade#1176) (AlexsLemonade#1193)

    * Initial pass at histology palette overhaul

    * Add display columns for legends

    * Remove histology label palette NB from CI

    * Add broad histology order

    * Add oncoprint_include

    * Add oncoprint grouping info to palette df

    * Make cancer_group other darker; rerun

    * Missed the instance where I recode based on other hex code

    * Change up approach to oncoprint

    * Tweak the cancer group palette

    * Apply suggestions from code review

    Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>

    * Also display other hex codes with legend()

    Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>

commit 4951c98
Author: Jo Lynne Rokita <jolynnerokita@d3b.center>
Date:   Mon Nov 15 15:57:33 2021 -0500

    Tp53 figures by broad histology and cancer group (AlexsLemonade#1202)

    * add plots of tp53 scores by histology

    * update path to root

commit 78220da
Author: runjin326 <47674661+runjin326@users.noreply.github.com>
Date:   Fri Oct 22 20:21:18 2021 -0400

    Oncoprint output count tables  (AlexsLemonade#1191)

commit b00d344
Author: Krutika Gaonkar <34580719+kgaonkar6@users.noreply.github.com>
Date:   Sat Oct 23 02:30:22 2021 +0530

    N per cancer_group oncoprint (AlexsLemonade#1189)

    * adding ns per cancer_group

    * add Ns to data outputfolder

    Co-authored-by: kgaonkar6 <gaonkark@chop.edu>
    Co-authored-by: Jo Lynne Rokita <jolynnerokita@d3b.center>

commit 8557352
Author: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
Date:   Fri Oct 22 14:03:12 2021 -0400

    Take "sufficiently non-zero" approach to exposures for CNS signature fitting (AlexsLemonade#1192)

    * Save & ignore full fit_signatures output

    * Make it easy to just run the CNS fitting part

    * Run all the steps in CI

    * Update analyses/mutational-signatures/run_mutational_signatures.sh

    Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>

    * Set exposures to zero when lower confidence interval < 0.01

    * Rerun CNS fitting steps

    * Add info about lower interval to README

    Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>

commit d76ac06
Author: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
Date:   Thu Oct 21 16:26:51 2021 -0400

    Save output of fit_signatures(), make it easier just to run CNS fitting steps (AlexsLemonade#1190)

    * Save & ignore full fit_signatures output

    * Make it easy to just run the CNS fitting part

    * Run all the steps in CI

    * Update analyses/mutational-signatures/run_mutational_signatures.sh

    Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>

    Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>

commit 380b6f1
Author: Jo Lynne Rokita <jolynnerokita@d3b.center>
Date:   Wed Oct 20 15:23:13 2021 -0400

    add Ns and percents to co-occurrence tables for paper (AlexsLemonade#1188)

    * add Ns and percents for paper

    - n_mutated_gene1
    - n_mutated_gene2
    - perc_mutated_gene1
    - perc_mutated_gene2
    - perc_cooccur_or_mutexcl

    * add n_mutated_gene comment

    * clarify `perc_cooccur_or_mutexcl`

    * add all sample percent calculations and update definitions

    clarified to:
    - perc_cooccur_all_samples
    - perc_mutexcl_all_samples
    - perc_cooccur_gene1_mutated
    - perc_mutexcl_gene1_mutated

    * remove ifelse

    * update `perc_cooccur` calculation

    perc_cooccur =  mut11*100/(mut11 + mut10 + mut01)

    * fix typo in perc_mutexcl calculation

    * Update analyses/interaction-plots/scripts/cooccur_functions.R

    Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>

    * Update analyses/interaction-plots/scripts/cooccur_functions.R

    Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>

    Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>