Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Nel.traverseEither and Nel.traverseValidated incorrectly traversi… #2362

Closed
wants to merge 4 commits into from
Closed

Conversation

kyay10
Copy link
Collaborator

@kyay10 kyay10 commented Apr 4, 2021

…ng the nel's head twice

Bug found by Mike Wilkes on Slack.
In short, the bug caused the Nel's head to be traversed both at the very start of the resulting Nel and at the very end of the resulting Nel. I couldn't use foldRight because it was quite awkward in this case, and so I resorted to just using the stdlib's fold. Added a couple of tests too to make sure that that bug never happens again ;)

kyay10 added 3 commits April 4, 2021 18:40
# Conflicts:
#	arrow-libs/core/arrow-core/src/main/kotlin/arrow/core/NonEmptyList.kt
Copy link
Member

@1Jajen1 1Jajen1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is great 👍

The note is a bit inaccurate, other than that this looks correct^^

Comment on lines +416 to +417
// Using foldRight here is very awkward since we have to somehow special-case the head of the Nel,
// but fold just works perfectly. In reality no one should really be relying on the order of execution of this traverse (hopefully)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a bit wrong. traverse's order should be left to right in all cases (you can go backwards by using the correct applicative which swaps the order). foldRight is not about execution order at all, it is about associativity of the combine function (or at least it should be, and imo our Iterable.foldRight is just wrong and needs to be changed).
Either way, the root problem why foldRight does not work is simply that foldRight already iterates over all elements, including the head. Thus we iterate over all, but also include the head in the default/empty case, which is where the duplication comes from.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see exactly what you mean, but foldRight still won't work in this case even if it was tail.foldRight because the head would be added to the very end of the list while the rest of the elements would be in the correct order. In other words, I think that semantically, as you describe it, the order of parameters in foldRight doesn't really allow for using the head as an empty case. I guess if we really really wanted to we could have a truly empty case, but that would violate the contract of how Nel works in the first place, so tbh it doesn't seem like that good of an idea.

Thank you so much for the insightful response btw. I've barely got any experience with FP and so I've only picked a few things up from reading articles and just seeing code samples that use Arrow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that is due to the very confusing ways we implement foldRight and after talking to @nomisRev about it, it might make sense to just throw it away as it provides no benefit over fold anyway.

So foldLeft/fold/foldRight: (as lazy and strict variants. Strict as in it needs to evaluate its arguments, in arrow we use Eval for lazy values)

  • strict foldLeft = fold This iterates over the elements in order left to right and accumulates along the way
  • strict foldRight This iterates over the elements in order right to left. (Haskell docs even state that this is hardly ever what you want ^^)
  • lazy foldLeft This builds a lazy computation that is left nested, which means to produce the final lazy computation we must traverse the entire list first. Highly inefficient when we want to short circuit or when we actually need every result applied.
  • lazy foldRight: This returns a lazy computation that is right nested. This means it only ever iterates the list as far as necessary to calculate a result. Thus it is also safe for infinite structures!

See the haskell docs for Foldable (they are quite good, even without haskell knowledge): https://hackage.haskell.org/package/base-4.15.0.0/docs/Data-Foldable.html#t:Foldable

So in kotlin we have fold and foldRight which are both strict folds that are either left or right associative, but arguably strict right folds are basically never what a user wants in the first place.
So the next question to ask is: Why no lazy folds?

  • The easy answer is: Because there are no lazy values in kotlin, but this is superficial as they in this specific case are not needed.
  • The actual answer: fold is inline. This means we can exit the fold whenever we are done using non-local returns. This makes laziness inside the fold completely unnecessary as we can exit whenever we need.

Oh and lastly unrelated to the fold bits:
Don't worry about using empty values or even mutable lists as accumulators for traverse over nonempty lists. traverse needs to follow certain rules, one of which being: "traverse has to retain the structure of whatever it traverses" This means no dropping or adding elements and thus the invariant for nonempty lists will always hold. Sure it ain't as pretty, but it'll work. (This follows from the typeclass laws of traverse, which https://hackage.haskell.org/package/base-4.15.0.0/docs/Data-Traversable.html#overview describes quite well)

If you need a reference: I, after seeing your pr, checked the behavior of traverse across arrow and found several bugs and inaccuracies, this pr should address all of them: #2365

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be interesting to refer to this explanation in the doc

@kyay10
Copy link
Collaborator Author

kyay10 commented Apr 5, 2021

Note: this should be closed if #2365 is merged instead

@1Jajen1
Copy link
Member

1Jajen1 commented Apr 6, 2021

Note: this should be closed if #2365 is merged instead

It is, thanks for kicking this all off ❤️

@1Jajen1 1Jajen1 closed this Apr 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants