Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

maps: remove Keys and Values for Go 1.21 #61538

Closed
rsc opened this issue Jul 23, 2023 · 60 comments
Closed

maps: remove Keys and Values for Go 1.21 #61538

rsc opened this issue Jul 23, 2023 · 60 comments

Comments

@rsc
Copy link
Contributor

rsc commented Jul 23, 2023

Given #61405 and the likelihood of additional operations on those kinds of iterator functions (like Map, Filter, and Reduce), it seems possible that we would want Keys and Values to return iterator functions that can be used with those and do not provoke O(N) storage. I imagine there would also be a slices.Collect that collects values from any iterator into a slice, so existing uses of maps.Keys(m) that still want a slice would become slices.Collect(maps.Keys(m)).

If there's a good chance we're going to want Keys and Values to return iterators in Go 1.22, it seems short-sighted to ship them in Go 1.21 returning slices. I wonder whether we should pull them. It is late in the cycle, but not too late. git grep says there are no mentions of maps.Keys or maps.Values in the standard library.

@rsc rsc added this to the Go1.21 milestone Jul 23, 2023
@rsc rsc added this to Proposals Jul 23, 2023
@rsc rsc moved this to Incoming in Proposals Jul 23, 2023
@zephyrtronium
Copy link
Contributor

I'm not clear on the value of using iterators for those methods rather than the normal range loops that exist today. We don't have map, filter, or reduce that work on iterators. Whatever package provides them can also trivially provide a function to adapt any rangeable value to an iterator.

@jimmyfrasche
Copy link
Member

s/Slices/Values/g?

@DmitriyMV
Copy link
Contributor

We have relevant experience with something like maps.Keys and its rather unpleasant one, because most of the time you don't need all keys: you likely don't need 90% of them, since the result of maps.Keys in most cases is going into something like slices.Filter with very specific filter pattern.

And the reason it's unpleasant is because you never know how many keys are in the map now - each time there is chance of unbounded memory growth, that could not happened during original code review, but things had changed since then and now instead of 10 keys you constantly filtering from 10000 keys.

But thats our experience anyway.

@changkun
Copy link
Member

Instead of slices.Collect(maps.Keys(m)), what if we keep maps.Keys but do something like slices.Collect(maps.KeyIter(m)). This would make the signature reads more clear that Keys returns an slice, but KeyIter returns an iterator?

@rsc
Copy link
Contributor Author

rsc commented Jul 23, 2023

@jimmyfrasche Yes thank you. Changed Keys and Slices -> Keys and Values.

@zephyrtronium

I'm not clear on the value of using iterators for those methods rather than the normal range loops that exist today. We don't have map, filter, or reduce that work on iterators. Whatever package provides them can also trivially provide a function to adapt any rangeable value to an iterator.

That's true, but if the rangeable value is the current maps.Keys(m), then Keys builds the entire slice in memory even if you only just want to iterate through the keys and look at them. That's O(N) space that could be O(1) space instead, a large and potentially important difference for big maps. I don't think we should lock ourselves into the O(N) iteration strategy for map keys and map values.

@DmitriyMV: Yes, exactly, that's the situation I am concerned about. If Keys returned an iterator instead of a []K, you wouldn't have the memory or latency hit of collecting all the keys. You might still have the latency of filtering if you are going to go through all of them, but you avoid the memory hit with a filtered iterator. And if you stop early, you avoid the latency hit too.

@changkun Yes, if we keep Keys and Values as they are we will have to introduce some other iterator form for efficiency, maybe called KeysIter and ValuesIter. But I suspect most uses for Keys and Values work better with the iterator form (which would fix @DmitriyMV's example), meaning we should save the good names Keys and Values for them instead of giving them second-rate names.

@changkun
Copy link
Member

instead of giving them second-rate names

Why is the Iter suffix a second-rate name? Values vs ValuesIter Comparably we already have artifact pairs like Handler vs HandlerFunc etc.

@zephyrtronium
Copy link
Contributor

That's true, but if the rangeable value is the current maps.Keys(m), then Keys builds the entire slice in memory even if you only just want to iterate through the keys and look at them. That's O(N) space that could be O(1) space instead, a large and potentially important difference for big maps. I don't think we should lock ourselves into the O(N) iteration strategy for map keys and map values.

I don't see how we would be "locked into" an O(n) space iteration strategy for map keys or values. If I want O(1) space iteration over the keys of a map m, I write for k := range m. I don't see why I would ever use maps.Keys for that. When I use maps.Keys, it is to build a list of all the keys, usually followed by sorting the list. That is the most common purpose for which I use package maps, even. I didn't realize that is unusual.

Of course, I could still do this with a slices.Collect that accepts an iterator. And if that doesn't happen, maps.Keys and maps.Values can of course be added later. It just seems unfortunate to me to revise an already accepted proposal to remove the feature I use most frequently from its experimental version because of a language change that might happen and that might be followed by an API addition that can't even be proposed until that language change is decided.

@zephyrtronium
Copy link
Contributor

meaning we should save the good names Keys and Values for them instead of giving them second-rate names.

Another way to provide a "first-class" name for composable map iterators would be to introduce a package iter which might have iter.Map, iter.Keys, iter.Collect, &c. maps.Keys isn't the only option.

@AndrewHarrisSPU
Copy link

AndrewHarrisSPU commented Jul 23, 2023

More general formulations of algorithms might want to parameterize on something like type Pair[K comparable, V any] struct{Key K, Value V}, to preserve the key:value association. Also, there might be a type OrderedValue[V any] Pair[int, V] for slices and OrderedPair[K comparable, V any] OrderedValue[Pair[K,V]] for ordered map elements.

I'm quickly unsure if or when this is "bad idea" territory, but it's probably not in some languages - it's just that I definitely agree about not getting locked in.

@PaluMacil
Copy link

PaluMacil commented Jul 23, 2023

@rsc I think we can flip this backwards: instead of talking about leaving the preferred name for iterators, think about how not using a suffix will cause confusion in the rest of the standard library. There are things all over that return slices, and with iterators, it could be valuable to add methods and functions that return them. This won't be the only place, so while we can use a suffix and have the new iterators be consistent, we cannot have the other way. Be consistent. Also, "ThingIter" sounds like a name of a thing iterator and "Things" sounds like I already have those things allocated.

@rsc
Copy link
Contributor Author

rsc commented Jul 24, 2023

Pretty much everything in the standard library that returns a slice today should return an iterator in a future version. Adding "Iter" suffixes may be OK given the history but it's not where we want to end up. A v2 of those packages might flip the default. Or maybe there are good names available that don't use Iter. Either way, maps and slices are brand new packages. We don't want to go straight to a v2 just to fix this detail. We can avoid making the mistake to begin with if we just leave them out for this release and give ourselves another cycle to get it right. Maybe we'll end up back where we are today, but maybe not. I suspect not.

@PaluMacil
Copy link

I don't think it's entirely a mistake though. There are choices all over where we use slices as a return instead of some sort of iterator and have performance that is just fine. Anything that isn't CPU bound won't have a noticeable improvement either way.

However, at any given time, most software developers are fairly junior and the concept of an iterator is more difficult than the concept of a slice. Python has had generators and iterators for quite some time, but most of the time you see people using lists. It's totally unnecessary, but outside shared libraries, it rarely causes a difference. I don't think you'll stop seeing slices used a lot more often than iterators even if we one day have a Go 2 where they are the more common option.

That said, I personally haven't seen much indication that we would ever get to a point where such a big break would be worth it, so using words that are less specific in hopes that you can standardize on that eventually doesn't seem particularly helpful

@blackgreen100
Copy link

blackgreen100 commented Jul 24, 2023

I think iterators shouldn't be the default choice, in general. They should be something the programmer consciously opts in to when needed. It simply follows the principle of least surprise. It is intuitive and unsurprising that maps.Keys and maps.Values return slices, just as the names imply. I strongly agree with @zephyrtronium 's comment above:

If I want O(1) space iteration over the keys of a map m, I write for k := range m. I don't see why I would ever use maps.Keys for that.

PS: the title of this issue should be updated too: "remove Keys and Slices" -> "remove Keys and Values"

@seankhliao seankhliao changed the title proposal: maps: remove Keys and Slices for Go 1.21 proposal: maps: remove Keys and Values for Go 1.21 Jul 24, 2023
@doggedOwl
Copy link

doggedOwl commented Jul 24, 2023

I think iterators shouldn't be the default choice, in general. They should be something the programmer consciously opts in to when needed.

Normally you would want the most costly and specific option to be consciously opted in and clearly returning a slice is more costly than returning an Iterator. having the full slice of elements upfront is usefull only when you are sure that you would need all of them and can justify the memory used. The more generic case is that you would need only some of the values and/or potentially exit early from the loop making Iterators better suited for the default choice.

You can collect the elements of the iterator in a slice if it's needed but while you still can create an iterator that loops over the slice, the "damage" of the memory use is already done and creating an iterator at that point is almost useless.

@earthboundkid
Copy link
Contributor

But I suspect most uses for Keys and Values work better with the iterator form

I don’t suspect that. Python 3 changed from returning a list to returning an iterator, and it was often a pain because I needed to wrap the keys method in a list conversion. For maps.Keys today, my most frequent operation is to take the keys and sort them, which requires it being a slice. I think there should be an easy to use slice function. If it’s called KeySlice instead of Keys, that’s fine, but it is so frequently used it really is a pain to have to add a collect step in.

@jimmyfrasche
Copy link
Member

If you need sorted map keys maybe you need a sorted map instead? Even without such a thing, if you need to collect the keys in a slice once it's not a lot of code (with or without stdlib changes) and if you need it more than once you can throw that in a function.

@earthboundkid
Copy link
Contributor

if you need it more than once you can throw that in a function.

And that function could be in the standard library. ;-)

@jimmyfrasche
Copy link
Member

I'm not opposed to functions that collects the keys/values in a slice, but I agree with @rsc that is less general than the iterator form and hence the short name should go to the iter form. For the specific use case you bring up, I wonder if a more generally useful way to expose that would be a sorting func that takes an iterator as input and returns a slice.

@earthboundkid
Copy link
Contributor

I agree that this is a release blocker and should be resolved definitively before 1.21 releases, but I also think this is a bit of retreading previous decisions. Iterators were discussed as part of the planning for the maps package. In #47330 (reply in thread) @ianlancetaylor wrote,

A normal map iterator will be over key/value pairs. I don't think that an iterator over just keys or just values will be the common case. While I can imagine adding them (if we ever have iterators), I don't think we need to reserve better names for them.

Did something change to make us rethink that?

@gazerro
Copy link
Contributor

gazerro commented Jul 24, 2023

Reading the keys of a map to obtain a sorted slice of keys is a use case that I have encountered several times. Without the slices and maps package, I would have written

i := 0
s := make([]T, len(m))
for k := range m {
	s[i] = k
	i++
}
sort.Slice(s, func(i, j int) { return s[i] < s[j] })

With the slices and maps packages, you can write

s := maps.Keys(m)
slices.Sort(s)

but with the change of this proposal I suggest introducing a slices.Sorted function

s := slices.Sorted(maps.Keys(m))

declared as

Sorted[T comparable](iter func(func(T) bool) bool) []T

This would be a good improvement, using only the standard library, compared to the current situation.

@heschi
Copy link
Contributor

heschi commented Jul 24, 2023

A process comment: we say to people that they should write code against release candidates and deploy them to production. At this point these functions have been in RCs for about a month, and I think it's plausible that people have uses functions in their codebase that will break when they update from the last RC to the final release. Are we concerned about the risk of breakage?

@cuiweixie
Copy link
Contributor

The current version of maps.Keys is impl by me. I think https://go-review.googlesource.com/c/go/+/481435 and https://go-review.googlesource.com/c/go/+/481436. I do not know the detail impl of iters. but I think the current version of maps.Keys/Values could be faster that the Iter version. I think performance is also an important consideration.

@eihigh
Copy link

eihigh commented Jul 25, 2023

As many have already mentioned, the main use of maps.Keys is for sorting.

GitHub Code search results

In this case, what we need is not an iterator, but a slice to be sorted.
(The method proposed by @gazerro does not allow us to make([]T, len(m)) before sorting, which would be inefficient.)

bradfitz pushed a commit to tailscale/go that referenced this issue Aug 2, 2023
Preserve the names in case we want them to return an iterator.
Keep the efficient runtime implementations for now,
as we will probably want them under some name, perhaps KeysSlice
and ValuesSlice.

Fixes golang#61538

Change-Id: I6b03010bf071fb4531cb2f967dad46425962fcb8
Reviewed-on: https://go-review.googlesource.com/c/go/+/513476
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-on: https://go-review.googlesource.com/c/go/+/513715
Run-TryBot: Russ Cox <rsc@golang.org>
oxisto added a commit to oxisto/money-gopher that referenced this issue Aug 12, 2023
Almost removing `golang.org/x/exp` but not yet completly because of golang/go#61538.
oxisto added a commit to oxisto/money-gopher that referenced this issue Aug 12, 2023
Almost removing `golang.org/x/exp` but not yet completly because of golang/go#61538.
oxisto added a commit to oxisto/money-gopher that referenced this issue Aug 12, 2023
Almost removing `golang.org/x/exp` but not yet completly because of golang/go#61538.
database64128 added a commit to database64128/shadowsocks-go that referenced this issue Aug 12, 2023
- Remove packages copied from std and change references.
- Move functions in [maps] that were dropped by std right before release to maphelper. golang/go#61538
- Use min, max.
eric pushed a commit to fancybits/go that referenced this issue Sep 7, 2023
Preserve the names in case we want them to return an iterator.
Keep the efficient runtime implementations for now,
as we will probably want them under some name, perhaps KeysSlice
and ValuesSlice.

Fixes golang#61538

Change-Id: I6b03010bf071fb4531cb2f967dad46425962fcb8
Reviewed-on: https://go-review.googlesource.com/c/go/+/513476
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
eric pushed a commit to fancybits/go that referenced this issue Sep 7, 2023
Preserve the names in case we want them to return an iterator.
Keep the efficient runtime implementations for now,
as we will probably want them under some name, perhaps KeysSlice
and ValuesSlice.

Fixes golang#61538

Change-Id: I6b03010bf071fb4531cb2f967dad46425962fcb8
Reviewed-on: https://go-review.googlesource.com/c/go/+/513476
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
@dablelv
Copy link

dablelv commented Jan 30, 2024

That's true, but if the rangeable value is the current maps.Keys(m), then Keys builds the entire slice in memory even if you only just want to iterate through the keys and look at them. That's O(N) space that could be O(1) space instead, a large and potentially important difference for big maps. I don't think we should lock ourselves into the O(N) iteration strategy for map keys and map values.

I don't see how we would be "locked into" an O(n) space iteration strategy for map keys or values. If I want O(1) space iteration over the keys of a map m, I write for k := range m. I don't see why I would ever use maps.Keys for that. When I use maps.Keys, it is to build a list of all the keys, usually followed by sorting the list. That is the most common purpose for which I use package maps, even. I didn't realize that is unusual.

Of course, I could still do this with a slices.Collect that accepts an iterator. And if that doesn't happen, maps.Keys and maps.Values can of course be added later. It just seems unfortunate to me to revise an already accepted proposal to remove the feature I use most frequently from its experimental version because of a language change that might happen and that might be followed by an API addition that can't even be proposed until that language change is decided.

Can't agree more, please add maps.Keys and maps.Values back to the standard library.

@earthboundkid
Copy link
Contributor

This issue is closed. See #61900 for the proposal to bring back maps.Keys and maps.Values as iterators.

@cuishuang
Copy link
Contributor

Maybe I have limited information and less knowledge than the members of the go team, but if it is to support the functions of the iter package, why not add two new function names, such as maps.KeysIter and maps.ValuesIter, instead of changing the behavior of the previous method.

@PaluMacil
Copy link

PaluMacil commented Dec 25, 2024

Maybe I have limited information and less knowledge than the members of the go team, but if it is to support the functions of the iter package, why not add two new function names, such as maps.KeysIter and maps.ValuesIter, instead of changing the behavior of the previous method.

This was discussed in depth in this thread. I preferred this approach because the implementation of an iterator technically hides complexity that could be expensive if you don’t read the implementation and in most trivial cases an iterator doesn’t actually improve efficiency. However, the consensus was that iterators should be the default so that if all you need is iteration, you are less likely to materialize the whole slice somewhere and be able to stream process with good efficiency. Regardless of your opinion, the ship has sailed, and you can try this now. If you need a slice, simply collect it

@cuishuang
Copy link
Contributor

This was discussed in depth in this thread. I preferred this approach because the implementation of an iterator technically hides complexity that could be expensive if you don’t read the implementation and in most trivial cases an iterator doesn’t actually improve efficiency. However, the consensus was that iterators should be the difficult so that if all you need is iteration, you are less likely to materialize the whole slice somewhere and be able to stream process with good efficiency. Regardless of your opinion, the ship has sailed now and you can try this now. If you need a slice, simply collect it

Thanks for your reply and discussion!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests