Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: slices: func CollectError #70631

Open
nicois opened this issue Dec 2, 2024 · 8 comments
Open

proposal: slices: func CollectError #70631

nicois opened this issue Dec 2, 2024 · 8 comments
Labels
Milestone

Comments

@nicois
Copy link

nicois commented Dec 2, 2024

Proposal Details

Idiomatic go functions are often in the form func (...) (T, err).
When creating iterators, it is equally common for the underlying functions to potentially return an error as each element in the iteration is created. For example, records might be read from a database, or requests from a client, etc.

Therefore, it will be relatively common for iterators of the form iter.Seq2[T, error] to be defined, as each element may produce an error. Sometimes a non-nil error will indicate a problem with that element only, but in many situations it will correspond to a termination of the iterator.

slices.Collect() is a helpful function to generate []T from iter.Seq[T]. It would be helpful to have an analogue to handle iter.Seq2[T, error]

My proposal is to add something similar to the following to the slices package:

func CollectError[T any](seq iter.Seq2[T, error]) ([]T, error) {
        var err error
        result := slices.Collect[T](func(yield func(t T) bool) {
                for k, v := range seq {
                        if v != nil {
                                err = v
                                return
                        }
                        if !yield(k) {
                                return
                        }
                }
        })
        return result, err
}

If an iteration includes a non-nil error, a slice of []T will be returned including all elements up to (but not including) the erroring iteration, along with the error.
Should no error be found, it will return ([]T, nil).

While this is not a large or complex function, iterator syntax is a little unwieldy and this would allow gophers to more easily consume iter.Seq2[T, err] iterators without being exposed to yield.

As mentioned earlier, some iter.Seq2[T, err] iterators will return a non-nil error and then continue. They would not be suitable to use with CollectError - but it would not be encouraged to collect these elements into a slice anyway; they should be processed as they arrive instead.

@nicois nicois added the Proposal label Dec 2, 2024
@gabyhelp
Copy link

gabyhelp commented Dec 2, 2024

Related Issues

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

@gopherbot gopherbot added this to the Proposal milestone Dec 2, 2024
@mateusz834
Copy link
Member

mateusz834 commented Dec 2, 2024

Similar (but more general) idea: #70084 (comment)

var err error
slice := slices.Collect(iter.ErrorAbsorber(ErrorIter(), &err))
if err != nil {
    // error handling
}

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Dec 2, 2024
@apparentlymart
Copy link

apparentlymart commented Dec 2, 2024

While I acknowledge that an iterator over T, error is one plausible way to represent fallible iteration, the iterator pattern was added primarily to allow user types to be used with for ... range and yet I don't think there's any well-adopted idiom for writing loops like this:

for v, err := range Something() {
    if err != nil {
        return err // or save the error somewhere and break? or ...?
    }

    // do something with v
}

It feels weird to me to introduce a helper function that assumes a pattern that doesn't actually exist yet, particularly if it functions as a "backdoor" way to introduce a new pattern without proposing that pattern explicitly.

The original proposal for iterators and sequences had some discussion about some different ways that fallible sequences could be represented (starting at #61405 (comment)), including iter.Seq2[T, error] as one possibility, but it seemed like there were lots of drawbacks to every idea and no clearly-superior path forward and so the discussion put that aside and focused on infallible sequences instead. I'd personally prefer to see the language settle on one "blessed" pattern for fallible iteration -- including adopting it for at least some actual iterator cases in the standard library -- before introducing a utility function like CollectError which can be designed around whatever that pattern turns out to be.

Edit: Thanks to some cross-linking work by others, I think #70084 has become the current "main place" to discuss what the pattern ought to be for fallible sequences.


FWIW I do quite like the general idea of a slices.CollectError function and am only hesitant about the fine details of how it's designed, since whatever pattern it encourages is likely to become the de-facto pattern for fallible iteration.

Therefore I was inspired to think about other ways this could potentially work, and the following is one. I'm not actually proposing to do this yet, since it seems to be rather awkward, but I'm posting it anyway to try to add something more concrete to the discussion than just a broad concern about patterns. The main thing I care about is actually settling on an idiom for fallible sequences before blessing that idiom with a helper function.

One of the other possible patterns discussed in that previous thread was something shaped like bufio.Scanner, which includes a function that advances the iterator and returns a bool for whether there's another element and then a separate function to call after the loop to check if the iteration stopped due to an error:

sc := bufio.NewScanner(...)
for sc.Next() {
    // ...
}
if err := sc.Err(); err != nil {
    return err
}

It's awkward to adapt that to range-over-func because a function pointer can't directly carry data beyond the actual function pointer and the function's closure. A range-over-func-shaped version of this would presumably involve a separate type with two methods, which I'll write as an interface here just for notation convenience:

package iter

interface FallibleSeq[T any] {
    Items() Seq[T]
    Err() error
}

slices.CollectError could then consume this API instead:

package slices

func CollectError[T any](seq iter.FallibleSeq[T]) ([]T, error) {
    result := slices.Collect[T](seq.Items())
    return seq.Err()
}

Under this pattern it would be up to the FallibleSeq implementation to decide whether encountering an error causes the sequence to stop immediately, or whether iteration is allowed to continue and finally produce an error that somehow combines all of the accumulated errors encountered along the way. errors.Join would be a plausible way to produce a "mega-error" to eventually return from Err(), but that's ultimately up to the FallibleSeq implementer to decide.

(The original discussion of this particular shape of fallible iterator noted that it's easy to forget to check the error after iteration, and that remains true here. I don't intend this to be a solution to that problem, only to describe what slices.CollectError might look like if that were chosen as the preferred pattern for representing fallible sequences.)

@atdiar
Copy link

atdiar commented Dec 3, 2024

One thing I had been thinking about: given that a fallible sequence will necessarily return a garbage value, it should almost necessarily return the garbage value, along with an error.

We could easily expect that these would be wrapped.

type Maybe[T any] struct{
   val T
   err error
}

func(m Maybe[T]) Values() (T, err) {...}

So using the same exact mechanism as the infallible iterators.

var Err error
for i, v:= range it{
   vv, err:= v.Values()
   if err! = nil{ Err = err; break}
  //... 

} 

Then the only thing that may need to be adapted is the iterator.
As iter. Seq[Maybe[T]] if that makes sense.

Does this idea make sense or am I overlooking something, especially regarding the value conversion?

Seems a bit too obvious.

@nicois
Copy link
Author

nicois commented Dec 3, 2024

I can see the benefit to exposing the returned value alongside the failed element. By adding another yield, this is easily achievable. It would result in the final element of []T holding this value.

func CollectError[T any](seq iter.Seq2[T, error]) ([]T, error) {
        var err error
        result := slices.Collect[T](func(yield func(t T) bool) {
                for k, v := range seq {
                        if v != nil {
                                err = v
                                yield(k)
                                return
                        }
                        if !yield(k) {
                                return
                        }
                }
        })
        return result, err
}

@nicois
Copy link
Author

nicois commented Dec 3, 2024

I appreciate the other feedback too. If the jury is out on the idiomatic way to implement fallable errors, perhaps this proposal is premature - though my personal opinion is that the decision of whether a iter.Seq2[T, error] is truly infallable can (and will) be superseded by the consumer. For example, a database server might happily continue returning records after an error, but a consumer might want to stop as soon as an error is seen, in their particular case. For this reason, I expect some implementation of CollectError is already useful for infallable iterators.

@jba
Copy link
Contributor

jba commented Dec 20, 2024

If the function that returns a fallible iterator also returns an error function along with the iterator, you don't need slices.CollectError:

iter, errf := NewFallibleIterator(...)
s := slices.Collect(iter)
if err := errf(); err != nil {
    return err
}

@apparentlymart
Copy link

apparentlymart commented Dec 20, 2024

With this discussion now unfortunately split across two issues I guess I'll redundantly note here that I ended up implementing that FallibleSeq idea from my earlier comment as part of my seqreader prototype.

The FallibleSeq variation of the pattern in the previous comment would be this:

// assuming that "seq" is a FallibleSeq[T]
s := slices.Collect(seq.Items())
if err := seq.Err(); err != nil {
    // handle err
}

Of course, this comes with the tradeoff that nothing is forcing you to actually call seq.Err. For me this is a familiarity vs. robustness tradeoff: returning a function that returns an error is a pretty unusual thing to do in today's Go, whereas returning an object with a method that returns error is a more normal thing with existing precedent in stdlib. Of course, "familiar" does not necessarily imply "better".

If slices.CollectError existed at all in that world then its implementation would be essentially just the above, with the marginal advantage that anyone using it would find it harder to accidentally ignore the error result.

(I think @jba's design also marginally benefits from slices.CollectError, but for the reason of making it the more familiar signature returning a result and an error rather than a result and a function that returns an error.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

7 participants