From 863c063018416c5395091f48ac919f1d88954d80 Mon Sep 17 00:00:00 2001
From: Janos Gabler
Date: Tue, 5 Nov 2024 07:56:37 +0100
Subject: [PATCH] Last polishing.

---
 .../how_to/how_to_algorithm_selection.ipynb | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/docs/source/how_to/how_to_algorithm_selection.ipynb b/docs/source/how_to/how_to_algorithm_selection.ipynb
index 5b15c8352..fceb44038 100644
--- a/docs/source/how_to/how_to_algorithm_selection.ipynb
+++ b/docs/source/how_to/how_to_algorithm_selection.ipynb
@@ -15,17 +15,17 @@
     "\n",
     "- There is no optimizer that works well for all problems \n",
     "- Making the right choice can lead to enormous speedups\n",
-    "- Making the wrong choice can mean that you don't solve your problem at all. Sometimes, \n",
+    "- Making the wrong choice can mean that you [don't solve your problem at all](algo-selection-how-important). Sometimes,\n",
     "optimizers fail silently!\n",
     "\n",
     "\n",
     "## The three steps for selecting algorithms\n",
     "\n",
     "Algorithm selection is a mix of theory and experimentation. We recommend the following \n",
-    "four steps:\n",
+    "steps:\n",
     "\n",
     "1. **Theory**: Based on the properties of your problem, start with 3 to 5 candidate algorithms. \n",
-    "You may use the [decision tree below](link)\n",
+    "You may use the decision tree below.\n",
     "2. **Experiments**: Run the candidate algorithms for a small number of function \n",
     "evaluations and compare the results in a *criterion plot*. As a rule of thumb, use \n",
     "between `n_params` and `10 * n_params` evaluations. \n",
@@ -68,12 +68,12 @@
     "```{note}\n",
     "Many books on numerical optimization focus strongly on the inner workings of algorithms.\n",
     "They will, for example, describe the difference between a trust-region algorithm and a \n",
-    "line-search algorithm in a lot of detail. We have an [intuitive explanation](../explanation/explanation_of_numerical_optimizers.md) of this too. However, these details are not \n",
-    "very relevant for algorithm selection. For example, If you have a scalar, differentiable \n",
-    "problem without nonlinear constraints, the decision tree suggests `fides` and `lbfgsb`.\n",
-    "`fides` is a trust-region algorithm, `lbfgsb` is a line-search algorithm. Both are \n",
-    "designed to solve the same kinds of problems and which one works best needs to be \n",
-    "found out through experimentation.\n",
+    "line-search algorithm in a lot of detail. We have an [intuitive explanation](../explanation/explanation_of_numerical_optimizers.md) of this too. Understanding these details is important for configuring and\n",
+    "troubleshooting optimizations, but not for algorithm selection. For example, if you have\n",
+    "a scalar, differentiable problem without nonlinear constraints, the decision tree \n",
+    "suggests `fides` and two variants of `lbfgsb`. `fides` is a trust-region algorithm, \n",
+    "while `lbfgsb` is a line-search algorithm. Both are designed to solve the same kinds \n",
+    "of problems, and which one works best must be found out through experimentation.\n",
     "```\n",
     "\n",
     "(algo-selection-example-problem)=\n",
@@ -134,7 +134,7 @@
     "Let's go through the decision tree for the Trid function:\n",
     "\n",
     "1. **No** nonlinear constraints our solution needs to satisfy\n",
-    "2. **No** no least-squares structure we can exploit \n",
+    "2. **No** least-squares structure we can exploit \n",
     "3. **Yes**, the function is differentiable. We even have a closed form gradient that \n",
     "we would like to use. \n",
     "\n",
@@ -152,9 +152,9 @@
     "### Step 2: Experiments\n",
     "\n",
     "To find out which algorithms work well for our problem, we simply run optimizations with\n",
-    "all algorithms in a loop and store the result in a dictionary. We limit the number of \n",
-    "function evaluations to 8. Since some algorithms only support a maximum number of iterations \n",
-    "as stopping criterion we also limit the number of iterations to 8.\n"
+    "all candidate algorithms in a loop and store the results in a dictionary. We limit the \n",
+    "number of function evaluations to 8. Since some algorithms only support a maximum number\n",
+    "of iterations as a stopping criterion, we also limit the number of iterations to 8.\n"
    ]
   },
   {
@@ -248,6 +248,8 @@
     "experiments. See [here](how_to_derivatives.ipynb) to learn more about derivatives.\n",
     "\n",
     "\n",
+    "(algo-selection-how-important)=\n",
+    "\n",
     "## How important was it?\n",
     "\n",
     "The Trid function is differentiable and very well behaved in almost every aspect. \n",