[DL Edition] T036: Uncertainty estimation #286

Merged
gerritgr merged 20 commits into DL_edition from mb-036-uncertainty-estimation
Apr 11, 2023

mbackenkoehler
Collaborator

@mbackenkoehler mbackenkoehler commented Dec 6, 2022

Details

  • Talktorial ID: 036
  • Title: Uncertainty estimation
  • Original authors: Michael Backenköhler
  • Reviewer(s): TBD
  • Date of review: TBD

Content

  • One line summary: Illustrate basic uncertainty estimation in ML on small molecules
  • Potential labels or categories (e.g. machine learning, small molecules, online APIs): machine learning, small molecules, ensemble methods
  • Time it took to execute (approx.): TBD
  • I have used the talktorial template and followed the content and formatting suggestions there
  • Packages must be open source and should be installable from conda-forge. If you are adding new packages to the TeachOpenCADD environment, please check whether an already installed package provides the same functionality and, if not, leave a sentence explaining why the new addition is needed. If a new package is not on conda-forge, please list it and its intended usage here.
    • package1: Already in TeachOpenCADD
    • package2 (conda-forge): I use it for XXX
    • package3 (pip only): I use it for XXX
  • Data must be publicly available, preferably accessible via a webserver or downloadable via a URL. Please list the data resources that you use and how to access them:
    • EGFR binding affinities: Talktorial 022

Content style

  • Talktorial includes cross-references to other talktorials if applicable
  • The table of contents reflects the talktorial story-line; order of #, ##, ### headers is correct
  • URLs are linked with meaningful words, instead of pasting the URL directly or linking words like here.
  • I have spell-checked the notebook
  • Images have enough resolution to be rendered with quality, without being too heavy.
  • All figures have a description
  • Markdown cell content is still in-line with code cell output (whenever results are discussed)
  • I have checked that cell outputs are not incredibly long (this applies also to DataFrames)
  • Formatting looks correct on the Sphinx render (bold, italics, figure placing)

Code style

  • Variable and function names follow snake case rules (e.g. a_variable_name vs aVariableName)
  • Spacing follows PEP8 (run Black on the code cells if needed)
  • Code lines are under 99 characters each (run black-nb -l 99)
  • Comments are useful and well placed
  • There are no unpythonic idioms like for i in range(len(list)) (see slides)
  • All 3rd party dependencies are listed at the top of the notebook
  • I have marked all code cells whose output is referenced in markdown cells with the label # NBVAL_CHECK_OUTPUT
  • I have identified potential candidates for a code refactor / useful functions
  • All import ... lines are in the top cell of the practice part, ordered by standard library / 3rd party packages / our own (teachopencadd.*)
  • I have used absolute paths instead of relative paths
    HERE = Path(_dh[-1])
    DATA = HERE / "data"
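
A minimal sketch of the absolute-path pattern above, for readers who want to see why it works. `_dh` is the directory history list that IPython places in the notebook namespace, so it is not defined in a plain Python script. The helper `notebook_dir` and its fallback to the current working directory are assumptions added for illustration and are not part of the talktorial template.

```python
from pathlib import Path


def notebook_dir(namespace):
    """Return an absolute directory for locating data files.

    Hypothetical helper: inside IPython/Jupyter the namespace contains `_dh`,
    a list of visited directories; outside of it we fall back to the current
    working directory.
    """
    dir_history = namespace.get("_dh")
    if dir_history:
        return Path(dir_history[-1]).resolve()
    return Path.cwd().resolve()


# In a notebook, globals() holds `_dh`; in a script, the fallback is used.
HERE = notebook_dir(globals())
DATA = HERE / "data"
```

Because `HERE` is resolved to an absolute path, `DATA` does not depend on later calls that change the working directory.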

Website

We present our talktorials on our TeachOpenCADD website (https://projects.volkamerlab.org/teachopencadd/), so we also have to check that the Jupyter notebook renders well there.

  • If this PR adds a new talktorial, please follow these steps:
    • Add your talktorial to the complete list of talktorials here (at the end).
    • Add your talktorial to one or multiple of the collections here. Or propose a new collection section in your PR.
    • Add your talktorial's nblink file by running python generate_nblinks.py from within the directory teachopencadd/docs/talktorials.
    • Please compile the website following the instructions here.
  • Check the rendering of the talktorial of this PR.
  • Is your talktorial listed in the talktorial list?
  • Is your talktorial listed in the talktorial collections?
    • Add a picture for your talktorial in the collection view by following these instructions.

@mbackenkoehler mbackenkoehler changed the title [DL Edition] Uncertainty estimation [DL Edition] T0036: Uncertainty estimation Dec 6, 2022
@mbackenkoehler mbackenkoehler changed the title [DL Edition] T0036: Uncertainty estimation [DL Edition] T036: Uncertainty estimation Dec 6, 2022
@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

@AndreaVolkamer AndreaVolkamer added the new talktorial label Dec 8, 2022
@dominiquesydow dominiquesydow mentioned this pull request Dec 27, 2022
9 tasks
@gerritgr
Collaborator

gerritgr commented Feb 7, 2023

  • Intro (or discussion): maybe be more critical of the claims of uncertainty estimation (UE) in general, or clarify the scope of UE.
  • Calibration: maybe start with an example or, at least, clarify the setting, like: "Assume we have an ML model that incorporates uncertainty. How do we evaluate and improve the predicted uncertainty?" Or even put this after the methods part.
  • "varying the model's architecture explicitely or via a Bayesian network with probabilistic dropout." -> explain a little bit more. Typo: explicitely -> explicitly
  • "we can compute confidence intervals based on the standard deviations, we get out of our model ensemble. According to the definition of the confidence interval..." Can we get an actual confidence interval here? Maybe refer to a rigorous definition. Why use the standard deviation and not the samples in a non-parametric way?
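
To make the last two points concrete, here is a minimal sketch (not the talktorial's code) that contrasts an interval built from the ensemble standard deviation with a non-parametric percentile interval, and checks calibration through empirical coverage. The data are synthetic, and all names (`std_interval`, `percentile_interval`, `coverage`, the array shapes) are assumptions for illustration. Neither interval is a confidence interval in the strict frequentist sense without further assumptions; both describe the spread of the ensemble predictions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for ensemble output: rows are ensemble members,
# columns are molecules. Targets are drawn from the same per-molecule
# distribution as the members, i.e. an idealized, well-calibrated ensemble.
n_members, n_molecules = 50, 200
center = rng.normal(6.0, 1.0, size=n_molecules)
sigma = rng.uniform(0.2, 1.0, size=n_molecules)
predictions = center + sigma * rng.standard_normal((n_members, n_molecules))
targets = center + sigma * rng.standard_normal(n_molecules)


def std_interval(preds, z=1.96):
    """Parametric interval: ensemble mean +/- z * ensemble standard deviation.

    Assumes the ensemble predictions are roughly normally distributed.
    """
    mean = preds.mean(axis=0)
    std = preds.std(axis=0, ddof=1)
    return mean - z * std, mean + z * std


def percentile_interval(preds, level=0.95):
    """Non-parametric interval taken directly from the ensemble samples."""
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(preds, [alpha, 1.0 - alpha], axis=0)
    return lower, upper


def coverage(lower, upper, observed):
    """Fraction of observed values that fall inside their interval."""
    return float(np.mean((observed >= lower) & (observed <= upper)))


std_lower, std_upper = std_interval(predictions)
pct_lower, pct_upper = percentile_interval(predictions)

std_coverage = coverage(std_lower, std_upper, targets)
pct_coverage = coverage(pct_lower, pct_upper, targets)
print(f"std-based coverage:  {std_coverage:.2f}")
print(f"percentile coverage: {pct_coverage:.2f}")
```

For a well-calibrated model, the empirical coverage should be close to the nominal level (here 95%). Coverage far below that suggests overconfident uncertainties, and coverage far above it suggests intervals that are wider than needed. With few ensemble members the percentile interval tends to undercover somewhat, which is one reason to report coverage rather than rely on the nominal level.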

@gerritgr gerritgr merged commit 1fb5b6e into DL_edition Apr 11, 2023
@mbackenkoehler mbackenkoehler deleted the mb-036-uncertainty-estimation branch January 29, 2024 10:02
Labels
new talktorial New talktorial

4 participants