Skip to content
This repository has been archived by the owner on Jul 8, 2024. It is now read-only.

Inconsistent treatment of iterable argument #35

Closed
ajklein opened this issue Dec 18, 2018 · 12 comments
Closed

Inconsistent treatment of iterable argument #35

ajklein opened this issue Dec 18, 2018 · 12 comments

Comments

@ajklein
Copy link

ajklein commented Dec 18, 2018

All the methods in this proposal take a single argument, iterable. But there are two different treatments of that argument:

  • For the operations which create new sets, the iterator protocol is always used on iterable to get at its elements.
  • For the isXXX operations (except for isSupersetOf), there's a check for the [[SetData]] internal slot, and if it's already a Set, then no iteration is performed (instead, the receiver's [[SetData]] is, unobservably, iterated). For non-Set iterables, a new Set is constructed (which will, of course, run the iteration protocol itself).

Why are there two different approaches here? If possible, it would be ideal for these methods to deal with their argument in a uniform manner.

@ljharb
Copy link
Member

ljharb commented Dec 18, 2018

I'd personally be in favor of always preferring the [[SetData]], and falling back to iterableness, unless there's a strong reason to go with "always iterable".

@bakkot
Copy link
Collaborator

bakkot commented Dec 18, 2018

If we intend Set to be subclassable, I think it makes more sense to always use the iteration protocol. For example, if you want to make a Set subclass which distinguishes -0 and 0, you'd override Symbol.iterator so that it would emit -0 at the appropriate time. But your [[SetData]] internal slot would not contain -0. It would be surprising to me if such an object behaved correctly everywhere except when passed to Set.prototype.isSubsetOf.

@ljharb
Copy link
Member

ljharb commented Dec 18, 2018

That's a fair point.

@gsathya
Copy link
Member

gsathya commented Dec 18, 2018

Thanks for the feedback, Adam!

Why are there two different approaches here?

In isSubsetOf and isDisjointWith, I call Set.prototype.has with the argument as the receiver. This would throw if the argument is not a Set. None of the other methods require calling a Set.prototype method with the argument as the receiver which is why I don't do the conversion inline for the others.

If we intend Set to be subclassable, I think it makes more sense to always use the iteration protocol.

Just to confirm that we're thinking of the same solution here -- this would mean just removing the [[SetData]] check and constructing a new Set object always ( and causing the iteration protocol to run)? I think this is fine from an implementation POV -- I can optimize away the extra allocation depending on whether the Symbol.iterator has been patched or not.

@ljharb
Copy link
Member

ljharb commented Dec 18, 2018

For isSubsetOf and isDisjointWith, since they're both calling into the potentially overridden has method, i suppose it does seem more consistent to delegate the slot checking to that method.

@ljharb
Copy link
Member

ljharb commented Dec 18, 2018

Even if the set is always iterated over; does it make sense to require the [[SetData]] internal slot anyways? Subclasses wouldn't have an issue with that, as long as they called super.

@bakkot
Copy link
Collaborator

bakkot commented Dec 18, 2018

Just to confirm that we're thinking of the same solution here -- this would mean just removing the [[SetData]] check and constructing a new Set object always ( and causing the iteration protocol to run)?

Yeah, that's the thought. Though it's a little weird to still use the potentially overridden Set.prototype.has at that point. Personally I'd be fine with always using the builtin Set and has, i.e., the equivalent of

const $Set = Set;
const $has = Function.prototype.call.bind(Set.prototype.has);

Set.prototype.isSubsetOf = function(iterable) {
  const set = new $Set(iterable);
  for (const e of this.[[SetData]]) { // not sure whether to use [[SetData]] or Symbol.iterator
    if (!has(set, e)) {
      return false;
    }
  }
  return true;
};

Another alternative would be to make it a "protocol" like thenables - don't check anything, don't accept iterables, just invoke arg.has repeatedly. If you want to use an arbitrary iterable, you have to wrap it in new Set() first yourself. But I like that somewhat less.

Even if the set is always iterated over; does it make sense to require the [[SetData]] internal slot anyways?

I don't think so. Set.prototype.union doesn't (and IMO shouldn't).

@ajklein
Copy link
Author

ajklein commented Dec 19, 2018

I think the back-and-forth above is useful, just wanted to chime in with some more understanding I have after @gsathya's reply (and some offline discussion):

  1. IsDisjointWith ought to be easy to do in terms of iterables: iterate the argument, and ask the receiver if it has each item, returning true only if all the has checks return false. This makes it uniform with the other methods.
  2. Neither IsDisjointWith nor IsSupersetOf should need to do a [[SetData]] check on the receiver.

This makes it clear that IsSubsetOf is the tricky one, and I think it's good to continue to the discussion of how it might best be done. But I don't think there's a need to talk about it in terms of "consistency" or "uniformity", since it does have a conceptual difference: it needs to treat the iterable argument like a Set (something that's not needed for any of the others).

@ljharb
Copy link
Member

ljharb commented Dec 19, 2018

@bakkot ah, i guess what i meant was, before getting has on the input, do the slot check. the check wouldn't be needed if it was only iterating.

@bakkot
Copy link
Collaborator

bakkot commented Dec 20, 2018

isSubsetOf needs its receiver to support iteration (whatever that means) and its argument to support membership testing (whatever that means). isSupersetOf needs the reverse.

I think it makes sense for these methods to "mean the same thing" by the above terms: so however isSupersetOf does iteration of its argument and membership testing of its receiver, isSubsetOf should do iteration of its receiver and membership testing of its argument in the same way. That is, if nothing else changes, isSubsetOf should should iterate over its receiver using the iteration protocol (rather than by reading [[SetData]]) and should test membership in its argument by using its argument's .has method (rather than possibly converting it to a Set first).

Having isSubsetOf iterate over its receiver using the iteration protocol rather than by reading [[SetData]] is consistent with union etc, too (which invoke the iteration protocol on their receiver by passing it to the Set constructor). It is not consistent with the existing methods on Set.prototype, particularly forEach (and neither is union). I'd argue that's OK because one might reasonably want to invoke the methods in this proposal on a non-set iterable, whereas you would never want to invoke Set.prototype.forEach on a non-set iterable.

That makes isSubsetOf unique in that its argument would not be treated as an iterable, but rather a thing with has. But, likewise, isSupersetOf and isDisjointWith are unique in that they don't treat their receiver as an iterable, but rather a thing with has. I think that's ok. It might trip someone up the first time, but as long as they're not trying to pass some iterable which happens to have a "has" property we should be able to give them a reasonably clear error. (Let's be sure not to add has to Array.prototype or String.prototype, I guess.)

(I would also be OK with having "membership testing" mean "look at at the [[SetData]] slot" in all cases, but that's strictly less useful and I think only very slightly easier to optimize, so I prefer using has.)


This differs from the strategy I was suggesting above, in that I am now suggesting "always assume it's a set and call has" instead of "always coerce it to a set". Sorry about that. Either approach is OK, I think, although my preferred approach is the one I've described here. I do think isSubsetOf and isSupersetOf should match.

@bakkot
Copy link
Collaborator

bakkot commented Dec 20, 2018

Short version of above comment:


But I don't think there's a need to talk about it in terms of "consistency" or "uniformity", since it does have a conceptual difference: it needs to treat the iterable argument like a Set (something that's not needed for any of the others).

isSubsetOf is conceptually different from the other methods in that particular way, but conceptually similar to isSupersetOf and isDisjointWith in that those methods need to treat their receiver like a Set, which is not needed for any of the others. So I think there's still a case to be made for a particular kind of consistency.

@gsathya
Copy link
Member

gsathya commented Jan 14, 2019

Fixed in #41

@gsathya gsathya closed this as completed Jan 14, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants