
Tighten some assertCMLApproxData rtols. #2224

Merged: 1 commit into SciTools:v1.11.x on Nov 14, 2016

Conversation

@pp-mo (Member) commented Oct 28, 2016

The really big 'rtol' values bothered me, so here I've reduced them, but only as far as still lets the tests succeed with both numpy 1.10 and 1.11.

I think we _still_ have a problem with the collapse calculations (see below).
However, I don't know yet if that is relevant to these result differences between numpy 1.10 and 1.11.

I also tried to retest with numpy 1.8, but the dependencies don't resolve so easily for the latest Iris, so I gave up.
I think we really don't care about 1.8 any more.
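
For context on what tightening an 'rtol' means here: assertCMLApproxData is Iris's approximate-data test assertion, and the following is only a minimal sketch of a relative-tolerance check using numpy.testing.assert_allclose, not the actual helper. The two values are taken from the demonstration further down.

import numpy as np

# The "accurate" and "inaccurate" float32 means from the example below.
expected = 888.33325195
actual = 888.09002686

# A loose tolerance hides the error (~2.7e-4 relative), so the check passes:
np.testing.assert_allclose(actual, expected, rtol=1e-3)

# A tightened tolerance exposes it; this line would raise AssertionError:
# np.testing.assert_allclose(actual, expected, rtol=1e-6)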

@pp-mo (Member, Author) commented Oct 28, 2016

we still have a problem with the collapse calculations

We discovered in the past that numpy 1.8 had some serious problems with summations over large arrays -- which impacts average and standard-deviation calculations.

This was mostly fixed in 1.9, I think by adopting "pairwise" summation
(i.e. dividing the sum into sub-operations; see the relevant numpy 1.9 release notes section, and https://en.wikipedia.org/wiki/Pairwise_summation ).
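
A rough sketch of why pairwise summation helps, using float32 throughout; this illustrates the idea only, and is not numpy's actual implementation:

import numpy as np

def naive_sum(x):
    # Accumulate left-to-right: each add rounds against an ever-larger total,
    # so the error grows roughly with len(x).
    total = np.float32(0.0)
    for value in x:
        total = np.float32(total + value)
    return total

def pairwise_sum(x):
    # Recursively split and add the halves: partial sums stay comparable in
    # size, so the error grows only roughly with log(len(x)).
    if len(x) <= 8:
        return naive_sum(x)
    mid = len(x) // 2
    return np.float32(pairwise_sum(x[:mid]) + pairwise_sum(x[mid:]))

values = np.full(42000, 888.0 + 1.0 / 3, dtype=np.float32)
print(naive_sum(values) / len(values))     # noticeably off from 888.333...
print(pairwise_sum(values) / len(values))  # much closer to 888.333...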

From my recent experiments with numpy 1.10 and 1.11, it seems that when an array is non-contiguous you can still get the "old" (bad) behaviour.
Iris Cube.collapsed triggers this by the way it transposes + reshapes the data to combine the calculation onto one axis. See here
The real problem occurs with the standard-deviation value, which is very sensitive to this and comes out significantly different.

What I've observed:

  • at 1.10 this occurs with our test in test_cdm, because of the transpose+reshape
  • at 1.11 it no longer does

So it does look like we are now in the clear after all, in that the newest version, 1.11, definitely produces "better" results.

_However..._ it seems this only works with numpy.ma.average, and not with numpy.average, so the problem is still not completely gone.

It is worth re-iterating that the differences are in most cases very small, and only really show up with 32-bit floats. The problem testcase here is a particularly tricky example.
However, you can easily see where some calculations are (still) wrong like this ...

>>> import numpy as np
>>> np.__version__
'1.11.1'
>>> 
>>> data = np.zeros((6, 70, 100, 100), dtype=np.float32)
>>> data.fill(888.0 + 1.0/3)  # A value with a non-terminating representation
>>> data = np.transpose(data, [3,0,1,2]).reshape([100, 42000])  # What Cube.collapsed does
>>> 
>>> # result with np.ma.average is "not too bad"...
... np.ma.average(data, axis=-1)[:2]
masked_array(data = [888.333251953125 888.333251953125],
             mask = [False False],
       fill_value = 1e+20)
>>> 
>>> # result with np.average is still "not so good"...
... np.average(data, axis=-1)[:2]
array([ 888.09002686,  888.09002686], dtype=float32)
>>> 
>>> # but a *COPY* of this data is contiguous, and that works ok ...
... np.average(data.copy(), axis=-1)[:2]
array([ 888.33325195,  888.33325195], dtype=float32)
>>> 

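As a quick check of the contiguity point, continuing the same session with the same data array: the transposed-and-reshaped view is not C-contiguous, while its copy is.

>>> data.flags['C_CONTIGUOUS']
False
>>> data.copy().flags['C_CONTIGUOUS']
True
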
@pp-mo mentioned this pull request on Oct 28, 2016
@marqh (Member) commented Nov 2, 2016

This looks sensible and I would like to merge.

The analysis on the ticket is really relevant and I would like to keep it.

@pp-mo
Please could you open a new issue, so we can continue to work on this larger issue once this PR is closed?

@marqh merged commit 8c116ad into SciTools:v1.11.x on Nov 14, 2016
@QuLogic added this to the v1.11 milestone on Nov 14, 2016
@pp-mo (Member, Author) commented Nov 15, 2016

The analysis on the ticket is really relevant and I would like to keep it.
Please could you open a new issue, so we can continue to work on this larger issue once this PR is closed?

I've considered this, but I don't know quite what to do about it.
The problem is that the bug is still potentially there, in some odd cases, but at present Iris is sidestepping it by using masked-array calculations for all our stats calculations, which seems to fix it.
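
For illustration, the sidestep amounts to routing plain arrays through numpy.ma for the statistic. This is only a minimal sketch of the idea, not the actual Iris code path, and safe_average is a made-up name:

import numpy as np

def safe_average(arr, axis=None):
    # Hypothetical helper: take the average via numpy.ma, which (as observed
    # above) avoids the inaccurate summation on non-contiguous float32 views.
    result = np.ma.average(np.ma.masked_array(arr), axis=axis)
    # With no masked points this just hands back the unmasked result.
    return np.ma.filled(result)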

I tried a few other approaches, but I couldn't construct a testcase that Iris trips up on.
So although we still have a problem in numpy, I'm struggling to make out a case that it is actually relevant to Iris.

I'm tempted to let this drop now unless we can show some relevance.
Otherwise, it's sounding to me like #2239 (comment)

  • as in "the issue doesn't belong to Iris"

What do you think, @marqh ?
