From ac2c7460f2f74f073050d295092817b71708bb73 Mon Sep 17 00:00:00 2001 From: Patrick Peglar Date: Tue, 10 Mar 2020 21:52:13 +0000 Subject: [PATCH 1/8] Correct and improve dev-guide section on fixing graphics-tests. --- .../src/developers_guide/graphics_tests.rst | 81 +++++++++++-------- 1 file changed, 46 insertions(+), 35 deletions(-) diff --git a/docs/iris/src/developers_guide/graphics_tests.rst b/docs/iris/src/developers_guide/graphics_tests.rst index 684ccfa4ab..8c4489db6e 100644 --- a/docs/iris/src/developers_guide/graphics_tests.rst +++ b/docs/iris/src/developers_guide/graphics_tests.rst @@ -75,43 +75,54 @@ If you notice that a graphics test in the Iris testing suite has failed following changes in Iris or any of its dependencies, this is the process you now need to follow: -#. Create a directory in iris/lib/iris/tests called 'result_image_comparison'. -#. From your Iris root directory, run the tests by using the command: - ``python setup.py test``. -#. Navigate to iris/lib/iris/tests and run the command: ``python idiff.py``. - This will open a window for you to visually inspect the changes to the - graphic and then either accept or reject the new result. -#. Upon acceptance of a change or a new image, a copy of the output PNG file - is added to the reference image repository in - https://github.com/SciTools/test-images-scitools. The file is named - according to the image hash value, as ``.png``. -#. The hash value of the new result is added into the relevant set of 'valid - result hashes' in the image result database file, - ``tests/results/imagerepo.json``. -#. The tests must now be re-run, and the 'new' result should be accepted. - Occasionally there are several graphics checks in a single test, only the - first of which will be run should it fail. If this is the case, then you - may well encounter further graphical test failures in your next runs, and - you must repeat the process until all the graphical tests pass. -#. 
To add your changes to Iris, you need to make two pull requests. The first - should be made to the test-images-scitools repository, and this should - contain all the newly-generated png files copied into the folder named - 'image_files'. -#. The second pull request should be created in the Iris repository, and should - only include the change to the image results database - (``tests/results/imagerepo.json``) : - This pull request must contain a reference to the matching one in - test-images-scitools. +#. Create a new, empty directory to store temporary image results, at the path + ``lib/iris/tests/result_image_comparison`` in your Iris repository checkout. + +#. From your Iris root directory, run test sourcefiles directly as python + scripts, or by using a command such as + ``python -m unittest discover paths/to/test/files``. + +#. Navigate to ``iris/lib/iris/tests`` and run the command: ``python idiff.py``. + This will open a window for you to visually inspect 'old', 'new' and + 'change' images for each failed graphics test output. + Hit a button to either accept or reject each new result. + +#. Upon acceptance of a change or a new image : + + * (A) The imagehash value of the new result image is added into the relevant + set of 'valid result hashes' in the image result database file, + ``tests/results/imagerepo.json`` ; + + * (B) the relevant output file in ``tests/result_image_comparison`` is + renamed according to the image hash value, as ``.png``. + A copy of this new PNG file must be added into the reference image + repository at https://github.com/SciTools/test-images-scitools. + (See below). + +#. Now re-run the tests. The 'new' result should now be recognised and the + relevant test should pass. However, certain tests do perform *multiple* + graphics checks within a single test function : In those cases, any failing + check will prevent the others from being run, so a test re-run can encounter + further (new) graphical test failures. 
Simply repeat the check-and-accept + process until all graphical tests pass. + +#. To add your changes to Iris, you need to make two pull requests : + + * (A) The first PR is made in the test-images-scitools repository, at + https://github.com/SciTools/test-images-scitools. This should contain all + the newly-generated PNG files, which must be added to the ``images/v4`` + directory. In your Iris repo, these files are to be found in the + temporary results folder ``iris/tests/result_image_comparison``. + **Note**: this location is covered by a project ``.gitignore`` setting, + so those files do not show up in a ``git status`` output. + + * (B) The second PR is created in the Iris repository, and + should only include the change to the image results database, + ``tests/results/imagerepo.json`` : + The description box of this pull request should contain a reference to + the matching one in test-images-scitools. Note: the Iris pull-request will not test out successfully in Travis until the test-images-scitools pull request has been merged : This is because there is an Iris test which ensures the existence of the reference images (uris) for all the targets in the image results database. - - -Fixing a failing graphics test -============================== - - -Adding a new graphics test -========================== From 7cf47b26e2edfb1312360405613ff98325409602 Mon Sep 17 00:00:00 2001 From: Patrick Peglar Date: Wed, 18 Mar 2020 14:36:59 +0000 Subject: [PATCH 2/8] Review changes + general rethink. 
--- .../src/developers_guide/graphics_tests.rst | 146 ++++++++++-------- 1 file changed, 83 insertions(+), 63 deletions(-) diff --git a/docs/iris/src/developers_guide/graphics_tests.rst b/docs/iris/src/developers_guide/graphics_tests.rst index 8c4489db6e..8eb3a56614 100644 --- a/docs/iris/src/developers_guide/graphics_tests.rst +++ b/docs/iris/src/developers_guide/graphics_tests.rst @@ -10,9 +10,10 @@ For this, a basic 'graphics test' assertion operation is provided in the method match against a stored reference. A "graphics test" is any test which employs this. -At present (Iris version 1.10), such tests include the testing for modules -`iris.tests.test_plot` and `iris.tests.test_quickplot`, and also some other -'legacy' style tests (as described in :ref:`developer_tests`). +At present, such tests include the testing for modules `iris.tests.test_plot` +and `iris.tests.test_quickplot`, all output plots from the gallery examples +(contained in `docs/iris/example_tests`), and a few other 'legacy' style tests +(as described in :ref:`developer_tests`). It is conceivable that new 'graphics tests' of this sort can still be added. However, as graphics tests are inherently "integration" style rather than true unit tests, results can differ with the installed versions of dependent @@ -38,91 +39,110 @@ Testing actual plot results introduces some significant difficulties : Graphics Testing Strategy ========================= -Prior to Iris 1.10, all graphics tests compared against a stored reference -image with a small tolerance on pixel values. +In the Iris Travis matrix, and over time, graphics tests must run with +multiple versions of Python, and of key dependencies such as matplotlib. +To make this manageable, the "check_graphic" test routine tests against +multiple alternative 'correct' results. It does this using an image "hash" +comparison technique which avoids storing reference images in the Iris +repository itself, to avoid space problems. 
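The image-"hash" comparison technique that this patch describes can be illustrated with a minimal sketch. This is *not* Iris's actual implementation — the real tests use the third-party `imagehash` package on PIL images — so the `average_hash` helper, the tiny pixel grids, and the idea of comparing bit-tuples directly are all simplified stand-ins:

```python
# Simplified perceptual "average hash", illustrating the comparison
# technique described above.  The real Iris tests use the third-party
# 'imagehash' package; this stand-in works on a plain 2-D list of
# greyscale pixel values instead of a PIL image.

def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is above the mean value."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)

def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two equal-length hashes."""
    return sum(a != b for a, b in zip(hash_a, hash_b))

# Two nearly identical "images" hash to nearby (here, identical) values,
# so a small tolerance on the Hamming distance accepts minor rendering
# differences without storing any reference image in the repository.
image_a = [[0, 0, 255, 255],
           [0, 0, 255, 255]]
image_b = [[0, 10, 250, 255],   # slight rendering difference
           [0, 0, 255, 255]]

distance = hamming_distance(average_hash(image_a), average_hash(image_b))
print(distance)  # -> 0 : images judged equivalent
```

Only the short hash needs to be stored per acceptable result, which is what keeps the Iris repository small.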
-From Iris v1.11 onward, we want to support testing Iris against multiple -versions of matplotlib (and some other dependencies). -To make this manageable, we have now rewritten "check_graphic" to allow -multiple alternative 'correct' results without including many more images in -the Iris repository. This consists of : - * using a perceptual 'image hash' of the outputs (see - https://github.com/JohannesBuchner/imagehash) as the basis for checking + * The 'check_graphic' funciton uses a perceptual 'image hash' of the outputs + (see https://github.com/JohannesBuchner/imagehash) as the basis for checking test results. - * storing the hashes of 'known accepted results' for each test in a - database in the repo (which is actually stored in - ``lib/iris/tests/results/imagerepo.json``). - * storing associated reference images for each hash value in a separate public - repository, currently in https://github.com/SciTools/test-images-scitools , - allowing human-eye judgement of 'valid equivalent' results. - * a new version of the 'iris/tests/idiff.py' assists in comparing proposed - new 'correct' result images with the existing accepted ones. + * The hashes of 'known accepted results' for each test are stored in a + lookup dictionary, saved to the repo file + ``lib/iris/tests/results/imagerepo.json`` . + * An actual reference image for each hash value is stored in a *separate* + public repository : https://github.com/SciTools/test-iris-imagehash . + * The reference images allow human-eye assessment of whether a new output is + judged to be 'close enough' to the older ones, or not. + * The utility script ``iris/tests/idiff.py`` automates checking, enabling the + developer to easily compare proposed new 'correct' result images against the + existing accepted reference images, for each failing test. -BRIEF... 
-There should be sufficient work-flow detail here to allow an iris developer to: - * understand the new check graphic test process - * understand the steps to take and tools to use to add a new graphic test - * understand the steps to take and tools to use to diagnose and fix an graphic test failure +How to Add New Known-Valid Result Images +======================================== - -Basic workflow -============== - -If you notice that a graphics test in the Iris testing suite has failed -following changes in Iris or any of its dependencies, this is the process -you now need to follow: +When you find that a graphics test in the Iris testing suite has failed, +following changes in Iris or the run dependencies, this is the process +you should follow: #. Create a new, empty directory to store temporary image results, at the path ``lib/iris/tests/result_image_comparison`` in your Iris repository checkout. -#. From your Iris root directory, run test sourcefiles directly as python - scripts, or by using a command such as +#. **In your Iris repo root directory**, run the relevant (failing) tests + directly as python scripts, or by using a command such as ``python -m unittest discover paths/to/test/files``. -#. Navigate to ``iris/lib/iris/tests`` and run the command: ``python idiff.py``. - This will open a window for you to visually inspect 'old', 'new' and - 'change' images for each failed graphics test output. - Hit a button to either accept or reject each new result. +#. **In the** ``iris/lib/iris/tests`` **folder**, run the command: ``python idiff.py``. + This will open a window for you to visually inspect side-by-side 'old', 'new' + and 'difference' images for each failed graphics test. + Hit a button to either "accept", "reject" or "skip" each new result ... 
+ + * If the change is *"accepted"* : + + * the imagehash value of the new result image is added into the relevant + set of 'valid result hashes' in the image result database file, + ``tests/results/imagerepo.json`` ; + + * the relevant output file in ``tests/result_image_comparison`` is + renamed according to the image hash value, as ``<hash>.png``. + A copy of this new PNG file must then be added into the reference image + repository at https://github.com/SciTools/test-iris-imagehash. + (See below). + + * If a change is *"skipped"* : + + * no further changes are made in the repo. -#. Upon acceptance of a change or a new image : + when you run idiff again, the skipped choice will be presented again. - * (A) The imagehash value of the new result image is added into the relevant - set of 'valid result hashes' in the image result database file, - ``tests/results/imagerepo.json`` ; + * If a change is *"rejected"* : - * (B) the relevant output file in ``tests/result_image_comparison`` is - renamed according to the image hash value, as ``<hash>.png``. - A copy of this new PNG file must be added into the reference image - repository at https://github.com/SciTools/test-images-scitools. - (See below). + * the output image is deleted from ``result_image_comparison``. + + * when you run idiff again, the rejected result will not appear, unless and until the relevant failing test is re-run. #. Now re-run the tests. The 'new' result should now be recognised and the - relevant test should pass.
However, some tests can perform *multiple* graphics + checks within a single testcase function : In those cases, any failing + check will prevent the following ones from being run, so a test re-run may + encounter further (new) graphical test failures. If that happens, simply + repeat the check-and-accept process until all tests pass. #. To add your changes to Iris, you need to make two pull requests : - * (A) The first PR is made in the test-images-scitools repository, at - https://github.com/SciTools/test-images-scitools. This should contain all - the newly-generated PNG files, which must be added to the ``images/v4`` - directory. In your Iris repo, these files are to be found in the - temporary results folder ``iris/tests/result_image_comparison``. - **Note**: this location is covered by a project ``.gitignore`` setting, - so those files do not show up in a ``git status`` output. + * (1) The first PR is made in the test-iris-imagehash repository, at + https://github.com/SciTools/test-iris-imagehash. + + * First, add all the newly-generated referenced PNG files into the + ``images/v4`` directory. In your Iris repo, these files are to be found + in the temporary results folder ``iris/tests/result_image_comparison``. + + .. Note:: + + The ``result_image_comparison`` folder is covered by a project + ``.gitignore`` setting, so those files *will not show up* in a + ``git status`` check. + + * Then, run ``python recreate_v4_files_listing.py``, to update the file + which lists available images, ``v4_files_listing.txt``. + + * Create a PR proposing these changes, in the usual way. - * (B) The second PR is created in the Iris repository, and + * (2) The second PR is created in the Iris repository, and should only include the change to the image results database, ``tests/results/imagerepo.json`` : The description box of this pull request should contain a reference to - the matching one in test-images-scitools. + the matching one in test-iris-imagehash. 
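The database change carried by PR (2) can be sketched as below. This is a hypothetical illustration of the edit that `idiff.py` performs for you on acceptance — the test name and hash strings are invented, and the exact JSON layout of ``imagerepo.json`` is assumed from the description above (each test name mapping to its list of acceptable result hashes):

```python
import json

# Hypothetical excerpt of tests/results/imagerepo.json: each test name
# maps to the list of all its currently acceptable result hashes.
repo = {
    "iris.tests.test_plot.TestContourf.test_simple": [
        "af1e9355ce2a6a95",            # a previously accepted result (invented)
    ],
}

# "Accepting" a new result in idiff amounts to appending its hash to
# the relevant list, so both old and new outputs remain valid.
test_name = "iris.tests.test_plot.TestContourf.test_simple"
new_hash = "af1e9355ce2e6a95"           # invented value
if new_hash not in repo[test_name]:
    repo[test_name].append(new_hash)

# The updated dictionary is what gets written back and committed.
updated = json.dumps(repo, indent=4, sort_keys=True)
print(updated)
```

Because only this JSON file changes in the Iris PR, the reviewers rely on the companion test-iris-imagehash PR for the human-viewable reference images.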
Note: the Iris pull-request will not test out successfully in Travis until the -test-images-scitools pull request has been merged : This is because there is +test-iris-imagehash pull request has been merged : This is because there is an Iris test which ensures the existence of the reference images (uris) for all -the targets in the image results database. +the targets in the image results database. N.B. likewise, it will *also* fail +if you forget to run ``recreate_v4_files_listing.py`` to update the image-listing +file in test-iris-imagehash. From 67f44206c1bc84103d641144031d760033837e38 Mon Sep 17 00:00:00 2001 From: Patrick Peglar Date: Wed, 18 Mar 2020 14:45:22 +0000 Subject: [PATCH 3/8] Reduce duplication between 'graphics-tests' and general 'tests' page. --- docs/iris/src/developers_guide/tests.rst | 13 +------------ 1 file changed, 1 insertion(+), 12 deletions(-) diff --git a/docs/iris/src/developers_guide/tests.rst b/docs/iris/src/developers_guide/tests.rst index 929073b569..417db96f32 100644 --- a/docs/iris/src/developers_guide/tests.rst +++ b/docs/iris/src/developers_guide/tests.rst @@ -139,16 +139,5 @@ This is the only way of testing the modules :mod:`iris.plot` and :mod:`iris.quickplot`, but is also used for some other legacy and integration-style testcases. -Prior to Iris version 1.10, a single reference image for each testcase was -stored in the main Iris repository, and a 'tolerant' comparison was performed -against this. - -From version 1.11 onwards, graphics testcase outputs are compared against -possibly *multiple* known-good images, of which only the signature is stored. -This uses a sophisticated perceptual "image hashing" scheme (see: -). -Only imagehash signatures are stored in the Iris repo itself, thus freeing up -valuable space. Meanwhile, the actual reference *images* -- which are required -for human-eyes evaluation of proposed new "good results" -- are all stored -elsewhere in a separate public repository.
+There are specific mechanisms for handling this. See :ref:`developer_graphics_tests`. From ddee22a9d40a35afdc37060a360bb62fd5d6069a Mon Sep 17 00:00:00 2001 From: Patrick Peglar Date: Wed, 18 Mar 2020 15:30:55 +0000 Subject: [PATCH 4/8] Update docs/iris/src/developers_guide/graphics_tests.rst Co-Authored-By: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> --- docs/iris/src/developers_guide/graphics_tests.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/iris/src/developers_guide/graphics_tests.rst b/docs/iris/src/developers_guide/graphics_tests.rst index 8eb3a56614..4038c8e311 100644 --- a/docs/iris/src/developers_guide/graphics_tests.rst +++ b/docs/iris/src/developers_guide/graphics_tests.rst @@ -48,7 +48,7 @@ repository itself, to avoid space problems. This consists of : - * The 'check_graphic' funciton uses a perceptual 'image hash' of the outputs + * The 'check_graphic' function uses a perceptual 'image hash' of the outputs (see https://github.com/JohannesBuchner/imagehash) as the basis for checking test results. 
* The hashes of 'known accepted results' for each test are stored in a From 295a49434687ee00fb938299910d1f024cc2d6ea Mon Sep 17 00:00:00 2001 From: Patrick Peglar Date: Wed, 18 Mar 2020 16:54:19 +0000 Subject: [PATCH 5/8] Update docs/iris/src/developers_guide/graphics_tests.rst Co-Authored-By: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> --- docs/iris/src/developers_guide/graphics_tests.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/iris/src/developers_guide/graphics_tests.rst b/docs/iris/src/developers_guide/graphics_tests.rst index 4038c8e311..3c00091933 100644 --- a/docs/iris/src/developers_guide/graphics_tests.rst +++ b/docs/iris/src/developers_guide/graphics_tests.rst @@ -51,7 +51,7 @@ This consists of : * The 'check_graphic' function uses a perceptual 'image hash' of the outputs (see https://github.com/JohannesBuchner/imagehash) as the basis for checking test results. - * The hashes of 'known accepted results' for each test are stored in a + * The hashes of known 'acceptable' results for each test are stored in a lookup dictionary, saved to the repo file ``lib/iris/tests/results/imagerepo.json`` . 
* An actual reference image for each hash value is stored in a *separate* From 4c1430efc8c54eabe7bd230205db8f4744167a49 Mon Sep 17 00:00:00 2001 From: Patrick Peglar Date: Wed, 18 Mar 2020 16:54:45 +0000 Subject: [PATCH 6/8] Update docs/iris/src/developers_guide/graphics_tests.rst Co-Authored-By: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> --- docs/iris/src/developers_guide/graphics_tests.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/iris/src/developers_guide/graphics_tests.rst b/docs/iris/src/developers_guide/graphics_tests.rst index 3c00091933..0027c4a0ba 100644 --- a/docs/iris/src/developers_guide/graphics_tests.rst +++ b/docs/iris/src/developers_guide/graphics_tests.rst @@ -42,7 +42,7 @@ Graphics Testing Strategy In the Iris Travis matrix, and over time, graphics tests must run with multiple versions of Python, and of key dependencies such as matplotlib. To make this manageable, the "check_graphic" test routine tests against -multiple alternative 'correct' results. It does this using an image "hash" +multiple alternative 'acceptable' results. It does this using an image "hash" comparison technique which avoids storing reference images in the Iris repository itself, to avoid space problems. 
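The acceptance decision sketched by the strategy above — a result passes if its hash matches *any* of the known 'acceptable' hashes — can be illustrated as follows. The helper names, the hex-string hash format, and the tolerance value are illustrative assumptions, not Iris's actual ``check_graphic`` code:

```python
# Sketch of the multi-result acceptance check described above: a test
# output passes if its perceptual hash is "close enough" to ANY of the
# known acceptable hashes for that test.

HAMMING_TOLERANCE = 2   # max number of differing bits (assumed value)

def bit_distance(hex_a, hex_b):
    """Hamming distance between two equal-length hex hash strings."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

def result_acceptable(result_hash, acceptable_hashes):
    """True if the result hash is within tolerance of any known hash."""
    return any(bit_distance(result_hash, known) <= HAMMING_TOLERANCE
               for known in acceptable_hashes)

# 'known' plays the role of one entry in the imagerepo.json lookup.
known = ["0f0f0f0f", "aaaa5555"]
print(result_acceptable("0f0f0f0e", known))  # one bit off -> True
print(result_acceptable("ffffffff", known))  # far from both -> False
```

This is why accepting a new rendering never invalidates older ones: the new hash is simply added alongside the existing entries.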
From 9d8a9b575967cd9e745787eed9caf024831d81f6 Mon Sep 17 00:00:00 2001 From: Patrick Peglar Date: Wed, 18 Mar 2020 16:54:58 +0000 Subject: [PATCH 7/8] Update docs/iris/src/developers_guide/graphics_tests.rst Co-Authored-By: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> --- docs/iris/src/developers_guide/graphics_tests.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/iris/src/developers_guide/graphics_tests.rst b/docs/iris/src/developers_guide/graphics_tests.rst index 0027c4a0ba..02f57f2f02 100644 --- a/docs/iris/src/developers_guide/graphics_tests.rst +++ b/docs/iris/src/developers_guide/graphics_tests.rst @@ -59,7 +59,7 @@ This consists of : * The reference images allow human-eye assessment of whether a new output is judged to be 'close enough' to the older ones, or not. * The utility script ``iris/tests/idiff.py`` automates checking, enabling the - developer to easily compare proposed new 'correct' result images against the + developer to easily compare proposed new 'acceptable' result images against the existing accepted reference images, for each failing test. From 0c9c97598042209364948dfffed5e905f243c583 Mon Sep 17 00:00:00 2001 From: Patrick Peglar Date: Wed, 18 Mar 2020 16:55:11 +0000 Subject: [PATCH 8/8] Update docs/iris/src/developers_guide/graphics_tests.rst Co-Authored-By: Martin Yeo <40734014+trexfeathers@users.noreply.github.com> --- docs/iris/src/developers_guide/graphics_tests.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/iris/src/developers_guide/graphics_tests.rst b/docs/iris/src/developers_guide/graphics_tests.rst index 02f57f2f02..2782f319ec 100644 --- a/docs/iris/src/developers_guide/graphics_tests.rst +++ b/docs/iris/src/developers_guide/graphics_tests.rst @@ -63,7 +63,7 @@ This consists of : existing accepted reference images, for each failing test. 
-How to Add New Known-Valid Result Images -======================================== +How to Add New 'Acceptable' Result Images to Existing Tests +=========================================================== When you find that a graphics test in the Iris testing suite has failed,