update Gradient descent notebook #218

Merged (3 commits) on Mar 19, 2024
4 changes: 4 additions & 0 deletions CHANGES.md
@@ -1,4 +1,8 @@
# ChangeLog
## v3.6.0
* moved `gradient_descent_mr_pet_ct.ipynb` to the Introductory folder and fixed some issues such that it does gradient descent for all modalities.
* updated devcontainer to pull docker images from ghcr.io

## v3.5.0
* added `environment.yml` for `conda` users. This also installs CIL.
* added a [`.devcontainer` folder](https://containers.dev/) which enables GitHub Codespaces as well as Visual Studio Code devcontainers.
8 changes: 6 additions & 2 deletions notebooks/Introductory/README.md
@@ -14,12 +14,16 @@ This notebook is not required for being able to use SIRF, but could help with understanding.
The [introduction](introduction.ipynb) notebook serves as a starting point for all SIRF jupyter notebooks.
The notebook shows how MR, PET and CT images can be created and manipulated.

The [acquisition_model_mr_pet_ct](acquisition_model_mr_pet_ct.ipynb) notebook shows how to create an MR, PET and CT acquisition model and go from images to raw data and back again for each modality. (Do check the notebooks for each modality for more information on the acquisition models.)
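
The forward-and-back idea can be mimicked with a toy linear operator; the matrix `A`, the image and the shapes below are hypothetical stand-ins for a real SIRF acquisition model, not part of SIRF itself:

```python
import numpy as np

# Toy stand-in for an acquisition model: a linear operator A maps an
# "image" x to "raw data" y = A x; the adjoint A.T back-projects raw
# data into image space (note: back-projection is not the inverse).
rng = np.random.default_rng(42)
A = rng.standard_normal((12, 6))   # hypothetical system matrix
image = rng.standard_normal(6)

raw_data = A @ image               # forward: image -> raw data
back_projected = A.T @ raw_data    # backward: raw data -> image space
```

Real acquisition models expose the same pattern through `forward`/`backward` (or, in CIL, `direct`/`adjoint`) methods rather than explicit matrix products.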

The [gradient_descent_mr_pet_ct notebook](gradient_descent_mr_pet_ct.ipynb) shows how to write
a simple gradient descent algorithm for MR, PET and CT (using CIL for the latter).
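
The loop that notebook builds can be sketched in plain NumPy; the quadratic objective, the operator `A` and the data `b` below are toy stand-ins rather than SIRF objects:

```python
import numpy as np

# Toy least-squares problem: minimise f(x) = 0.5 * ||A x - b||^2,
# whose gradient is A.T @ (A x - b).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
x_true = rng.standard_normal(5)
b = A @ x_true

def obj_fun(x):
    r = A @ x - b
    return 0.5 * r @ r

def obj_fun_grad(x):
    return A.T @ (A @ x - b)

# A safe fixed step-size for this quadratic: 1 / ||A||^2 (a Lipschitz bound).
step_size = 1.0 / np.linalg.norm(A, 2) ** 2

x = np.zeros(5)
values = [obj_fun(x)]
for _ in range(300):
    x = x - step_size * obj_fun_grad(x)   # step against the gradient
    values.append(obj_fun(x))
```

Each iteration moves against the gradient, so for this convex problem the objective values decrease monotonically towards the minimiser.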

## Learning objectives

- Ensure you have a working version of SIRF (whether you're viewing this via a virtual machine, a docker image, an Azure instance or a SIRF installation on your own machine).
By the end of these notebooks, you should feel more comfortable with:
- Getting a feel for Jupyter notebooks (and the basic Python they require)
- Reading and displaying images
- Doing basic simulations and reconstruction in three different modalities
- Querying for help to improve knowledge of SIRF functionality.
@@ -7,7 +7,7 @@
"# Gradient descent for MR, PET and CT\n",
"This demonstration shows how to do image reconstruction using gradient descent for different modalities. \n",
"\n",
"It builds on the notebook [acquisition_model_mr_pet_ct.ipynb](acquisition_model_mr_pet_ct.ipynb). The first part of the notebook, which creates acquisition models and simulates data from the brainweb, is the same code but with fewer comments. If anything is unclear, then please refer to the other notebook for further information.\n",
"\n",
"This demo is a jupyter notebook, i.e. intended to be run step by step.\n",
"You could export it as a Python file and run it in one go, but that might\n",
@@ -18,20 +18,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Author: Christoph Kolbitsch, Edoardo Pasca, Kris Thielemans \n",
"First version: 23rd of April 2021\n",
"\n",
"CCP SyneRBI Synergistic Image Reconstruction Framework (SIRF). \n",
"Copyright 2015 - 2017, 2021 Rutherford Appleton Laboratory STFC. \n",
"Copyright 2015 - 2019, 2024 University College London. \n",
"Copyright 2021 Physikalisch-Technische Bundesanstalt.\n",
"\n",
"This is software developed for the Collaborative Computational\n",
@@ -109,6 +101,23 @@
" print(\"CIL ASTRA plugin is not installed\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's make the PET output less verbose about warnings etc."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pet.set_verbosity(0)\n",
"_ = pet.MessageRedirector(warn=None)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -424,7 +433,7 @@
"tags": []
},
"source": [
"# Gradient descent"
]
},
{
@@ -461,15 +470,15 @@
"metadata": {},
"outputs": [],
"source": [
"obj_fun_pet_sirf = pet.make_Poisson_loglikelihood(raw_pet, acq_model=acq_mod_pet)\n",
"obj_fun_pet_sirf.set_up(bwd_pet)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As this demo will implement gradient descent, we need to multiply the log-likelihood by -1, so we write a quick wrapper around it."
]
},
{
Expand All @@ -479,14 +488,14 @@
"outputs": [],
"source": [
"def obj_fun_pet(curr_image_estimate):\n",
" return(-obj_fun_pet_sirf(curr_image_estimate))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And of course the same for the gradient."
]
},
{
Expand All @@ -498,7 +507,7 @@
"def obj_fun_grad_pet(curr_image_estimate):\n",
" # The 0 here means, only the gradient for subset 0 is returned. \n",
" # We will just accept this as is here, because subsets are too advanced for this demo.\n",
" return(-obj_fun_pet_sirf.gradient(curr_image_estimate, 0))"
]
},
{
@@ -515,12 +524,7 @@
"outputs": [],
"source": [
"def make_positive(image_data):\n",
" image_data = image_data.maximum(0)\n",
" return image_data"
]
},
@@ -579,8 +583,7 @@
"outputs": [],
"source": [
"def obj_fun_grad_ct(curr_image_estimate):\n",
" return(least_squares_cil.gradient(curr_image_estimate))"
]
},
{
@@ -594,7 +597,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"And last but not least __MR__. If you want to know more about the objective function of __MR__ and its gradient, then please have a look at the notebook *../MR/d_undersampled_reconstructions.ipynb*."
]
},
{
@@ -615,15 +618,30 @@
"outputs": [],
"source": [
"def obj_fun_grad_mr(curr_image_estimate):\n",
" return(acq_mod_mr.backward(acq_mod_mr.forward(curr_image_estimate) - raw_mr))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that we could have used CIL's `LeastSquares` as well here."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Implement gradient descent for one modality (and hence all)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we have all our `obj_fun` and `obj_fun_grad` functions, we will select one modality and then implement the gradient descent approach. We also need an image `init_image` to start with. Here we will simply use the simple reconstruction which we did above.\n",
"\n",
"We will implement the code using some common names. Set `curr_modality` to what you want to try."
]
},
{
@@ -632,22 +650,28 @@
"metadata": {},
"outputs": [],
"source": [
"curr_modality = 'mr' # pet, ct, mr\n",
"\n",
"if curr_modality.lower() == 'pet':\n",
" obj_fun = obj_fun_pet\n",
" obj_fun_grad = obj_fun_grad_pet\n",
" init_image = bwd_pet\n",
" raw_data = raw_pet\n",
" acq_model = acq_mod_pet\n",
"elif curr_modality.lower() == 'ct':\n",
" if not have_astra:\n",
" raise ImportError('ASTRA toolbox is not installed')\n",
" obj_fun = obj_fun_ct\n",
" obj_fun_grad = obj_fun_grad_ct\n",
" init_image = bwd_ct\n",
" raw_data = raw_ct\n",
" acq_model = acq_mod_ct\n",
"elif curr_modality.lower() == 'mr':\n",
" obj_fun = obj_fun_mr\n",
" obj_fun_grad = obj_fun_grad_mr\n",
" init_image = bwd_mr\n",
" raw_data = raw_mr\n",
" acq_model = acq_mod_mr\n",
"else:\n",
" raise NameError('{} not recognised'.format(curr_modality))"
]
@@ -656,13 +680,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we come to probably the most important parameter for gradient descent, the infamous __step-size__. Unfortunately, the gradient only gives us the direction along which we need to update the image, but does not tell us by how much we have to go in this direction. Therefore, we need to define by hand how far we step along the gradient direction in each iteration. \n",
"\n",
"To make sure the step-size is adapted to each modality as much as possible, we won't define the step-size directly. If the step-size is too small, we have very slow convergence. If the step-size is too large, then we are overstepping our target image and the objective function starts to oscillate. You may have to optimise it per modality to ensure good convergence.\n",
"\n",
"Note that in the implementation below, we have used a \"relative step-size\" `tau` such that at the first iteration, the norm of the increment (or step) is `tau curr_image_estimate.norm()`, which is intuitive. Feel free to experiment with other step-size rules (see the end of this notebook for some suggestions)."
]
},
{
Expand All @@ -671,7 +693,7 @@
"metadata": {},
"outputs": [],
"source": [
"# Relative step-size (this value is ok for PET, but not so much for MR!)\n",
"tau = .3"
]
},
@@ -696,7 +718,21 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Then we need to create an image as a starting point for the gradient descent algorithm, and we will also create a numpy array in which we can store the image estimate at each iteration.\n",
"\n",
"Gradient descent is an additive algorithm, so it is a bit sensitive to the \"scale\" of the initial image. Using the wrong values could lead to slow convergence initially. Here we will fix the scale of the initial image by making sure that its forward projection roughly matches the raw data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"scale = raw_data.norm() / acq_model.direct(init_image).norm()\n",
"print(scale)\n",
"# let's use a slightly larger scale as illustration.\n",
"curr_image_estimate = init_image * (scale * 1.1)"
]
},
{
Expand All @@ -705,9 +741,6 @@
"metadata": {},
"outputs": [],
"source": [
"# Array for images at each iteration\n",
"image_iteration = numpy.ndarray((num_iters + 1,), dtype=object)\n",
"image_iteration[0] = curr_image_estimate\n",
@@ -734,11 +767,12 @@
" # First we calculate the gradient to find out how we need to update our current image estimate\n",
" grad = obj_fun_grad(curr_image_estimate)\n",
"\n",
" # Compute step-size relative to initial image-norm (avoiding division by zero)\n",
" if i == 1:\n",
" step_size = tau * curr_image_estimate.norm() / (grad.norm() + 1e-19)\n",
" print(step_size)\n",
" # Perform gradient descent step\n",
" curr_image_estimate = curr_image_estimate - step_size * grad\n",
"\n",
" # In PET and CT we have to ensure values are positive. \n",
" if curr_modality.lower() == 'ct' or curr_modality.lower() == 'pet':\n",
Expand All @@ -760,7 +794,7 @@
"# Plot objective function values\n",
"plt.figure()\n",
"plt.title('Objective function value')\n",
"plt.plot(obj_func_values[2:], 'o-b')"
]
},
{
@@ -785,10 +819,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"You could now experiment with different step-size schemes. Some suggestions:\n",
"- use a decreasing step-size (sometimes called \"relaxation\"), e.g. $\\alpha/(n+\\beta)$ with $n$ the iteration number\n",
"- for MR and CT, the maximum step-size can be determined from the norm of the forward model (or the Lipschitz constant of the gradient of the objective function)\n",
"- back-tracking line search\n",
"\n",
"You could also have a look at the `cil.optimisation.algorithms.GD` algorithms from CIL, as illustrated in the [01_optimisation_gd_fista CIL-demo](\n",
9 changes: 3 additions & 6 deletions notebooks/Synergistic/README.md
@@ -1,8 +1,10 @@
# Synergistic notebooks

You should find in this directory a few notebooks that are designed to give you a feeling for synergistic reconstructions.
We recommend that you review the [gradient_descent_mr_pet_ct notebook](../Introductory/gradient_descent_mr_pet_ct.ipynb)
first, which shows how to write a simple gradient descent algorithm for MR, PET and CT (using CIL for the latter).
It is not really a "synergistic" notebook in itself, but can serve as the basis for any synergistic algorithm that uses alternating optimisation between the different modalities.
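
The alternating-optimisation pattern mentioned here can be sketched with a toy quadratic problem; the operators, data and coupling penalty below are hypothetical stand-ins, not SIRF or CIL objects:

```python
import numpy as np

# Two coupled "modalities" x and y, each with its own quadratic data fit;
# the term beta/2 * ||x - y||^2 plays the role of a synergistic prior.
rng = np.random.default_rng(1)
A = rng.standard_normal((15, 4)); a = rng.standard_normal(15)
B = rng.standard_normal((15, 4)); b = rng.standard_normal(15)
beta = 0.5
I = np.eye(4)

def joint_objective(x, y):
    return (0.5 * np.sum((A @ x - a) ** 2)
            + 0.5 * np.sum((B @ y - b) ** 2)
            + 0.5 * beta * np.sum((x - y) ** 2))

def update_x(y):
    # exact minimiser of the joint objective over x with y held fixed
    return np.linalg.solve(A.T @ A + beta * I, A.T @ a + beta * y)

def update_y(x):
    # exact minimiser of the joint objective over y with x held fixed
    return np.linalg.solve(B.T @ B + beta * I, B.T @ b + beta * x)

x = np.zeros(4)
y = np.zeros(4)
start = joint_objective(x, y)
for _ in range(50):
    x = update_x(y)
    y = update_y(x)
```

Because each update exactly minimises the joint objective over its own block, the objective is non-increasing from one alternation to the next; in a real synergistic algorithm the exact solves would be replaced by a few gradient descent steps per modality.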

1. [Generate the data](#gen_data)
2. [de Pierro MAPEM with the Bowsher prior](#de_pierro)
3. [Dual PET](#dual_pet)
@@ -11,11 +13,6 @@ You should find in this directory a few notebooks that are designed to give you
6. [Joint-TV for PET](#Joint-TV_P)
7. [Joint-TV for PET/SPECT](#Joint-TV_PS)


## 1. Generate the data <a name="gen_data"></a>
