Skip to content
This repository has been archived by the owner on Jun 3, 2023. It is now read-only.

Official spec for transformer protocol #20

Closed
tgriesser opened this issue Mar 12, 2015 · 38 comments
Closed

Official spec for transformer protocol #20

tgriesser opened this issue Mar 12, 2015 · 38 comments

Comments

@tgriesser
Copy link

@swannodette if you recall, this was something I had asked about and while I don't recall the exact reason for not having a transformer protocol implemented in transducers-js, I do remember there was a reason.

I've been thinking about it a bit lately, and I do think it's worthwhile to be able to define, particularly for JavaScript where it's much simpler to bake this into the prototype rather than defining as a lookup map of handlers as in transit-js.

As more libraries in JavaScript begin implementing this protocol (this was prompted by @kevinbeaty's excellent transducer PR on the ramda.js project), I wanted to see if there could be an agreed upon spec for the transformer before things get too far along.

This is the implementation kicked off by @jlongster

var t = require('./transducers');
Immutable.Vector.prototype[t.protocols.transformer] = {
  init: function() {
    return Immutable.Vector().asMutable();
  },
  result: function(vec) {
    return vec.asImmutable();
  },
  step: function(vec, x) {
    return vec.push(x);
  }
};

For starters - the use of a Symbol('transformer') is problematic as Symbol() creates a unique value (Symbol('transformer') !== Symbol('transformer')), and so you lose any interop when defined independently in multiple libraries.

I'd propose all transformer protocols be implemented as a @@transformer string (similar to @@iterator) until if/when the transformer is officially recognized in the well known symbols list.

I'd also propose that the spec behave similarly to an iterator, in that it is a function which returns the transformer, rather than as just an object proposed above:

Immutable.Vector.prototype['@@transformer'] = function() {
  return {
    init: function() {
      return new Immutable.Vector().asMutable();
    },
    result: function(vec) {
      return vec.asImmutable();
    },
    step: function(vec, x) {
      return vec.push(x);
    }
  };
}

This makes the implementation more useful, as it can refer to the current object in the init to know what value makes to init, should the object be subclassed:

SomeObject.prototype['@@transformer'] = function() {
  var obj = this;
  return {
    init: function() {
      return new obj.constructor()
    },
    step: function(result, arr) {
      return result.set(arr[0], arr[1]);
    },
    result: function(obj) {
      return obj;
    }    
  };
}
@kevinbeaty
Copy link

For starters - the use of a Symbol('transformer') is problematic as Symbol() creates a unique value (Symbol('transformer') !== Symbol('transformer')), and so you lose any interop when defined independently in multiple libraries.

This is a very good point. I'm going to open a new PR on Ramda to remove the Symbol until this is worked out. It's not being exposed, nor documented, nor otherwise used internally other than to initialize _into.

kevinbeaty added a commit to kevinbeaty/ramda that referenced this issue Mar 12, 2015
Prompted by cognitect-labs/transducers-js#20 from @tgriesser, removing
the internal transformer symbol until issues are worked out. This symbol
is currently only supported in jlongster/transducers.js and
interoperability is questionable as (Symbol('transformer') !==
Symbol('transformer')).

This feature was not documented, nor do the tests depend on it, and
only effects an undocumented feature of into. Better to remove it now
until details are worked out.  If an "official spec" is agreed upon,
I will open a new PR.
kevinbeaty added a commit to kevinbeaty/ramda that referenced this issue Mar 12, 2015
Prompted by cognitect-labs/transducers-js#20 from @tgriesser, removing
the internal transformer symbol until issues are worked out. This symbol
is currently only supported in jlongster/transducers.js and
interoperability is questionable as (Symbol('transformer') !==
Symbol('transformer')).

This feature was not documented, nor do the tests depend on it, and
only effects an undocumented feature of into. Better to remove it now
until details are worked out.  If an "official spec" is agreed upon,
I will open a new PR.
kevinbeaty added a commit to kevinbeaty/ramda that referenced this issue Mar 12, 2015
Prompted by cognitect-labs/transducers-js#20 from @tgriesser, removing
the internal transformer symbol until issues are worked out. This symbol
is currently only supported in jlongster/transducers.js and
interoperability is questionable as (Symbol('transformer') !==
Symbol('transformer')).

This feature was not documented, nor do the tests depend on it, and
only effects an undocumented feature of into. Better to remove it now
until details are worked out.  If an "official spec" is agreed upon,
I will open a new PR.
@swannodette
Copy link
Contributor

@tgriesser @kevinbeaty

Seems to me the logic should be more like the following:

var TRANSFORMER = null;

if(typeof Symbol != "undefined") {
    TRANSFORMER = Symbol.for("cognitect/transformer");
} else {
    TRANSFORMER = "@@cognitect/transformer";
}

Globally stealing names seems like a really bad idea.

@tgriesser
Copy link
Author

Aren't we looking for something which behaves similar to __transducers_reduced__? - but in this case a standard duck-type for defining an object as a transformable which works between libraries?

I agree that stealing top-level names via Symbol.for wouldn't be ideal

@swannodette
Copy link
Contributor

@tgriesser I'm suggesting that Symbols are the right path forward, using Symbol.for is fine as long as you namespace.

var ITransducer = {
    reduced: Symbol.for("cognitect/reduced"),
    transformer: Symbol.for("cognitect/transformer"),
    /* ... */
};

This seems like a good approach to me. Perhaps this is what @jlongster already did?

@jlongster
Copy link

I thought I had done that but I know I reworked the protocol in v2 of my lib so I must have dropped that: https://github.com/jlongster/transducers.js/blob/master/transducers.js#L8

Symbol.for is the right approach, but since it is supposed to be library-agnostic maybe transduce/transformer or something like that? I agree about namespacing. Symbols were supposed to get around namespacing but are only good for library-specific approaches... Here it doesn't really matter whether we use a symbol or not, imho

@jlongster
Copy link

I would suggest maybe we could have a separate library that defines these symbols that we all use, but that requires an equality check. And because npm loves to duplicate libraries, I don't think that will really work. Bah.

@swannodette
Copy link
Contributor

@jlongster Symbol.for always returns the same thing. Dupes would only be across JS Contexts but I've always considered that a lost cause honestly. I suspect that Symbols will consistently compare faster than Strings in the future, that's the primary reason to prefer them.

@tgriesser
Copy link
Author

Symbol.for is the right approach, but since it is supposed to be library-agnostic

Yeah, this is the main thing I'm looking for, just a way of making it library agnostic...

Then the second concern was whether the value is a function which defines/returns the transformer (which makes things a bit more flexible and is similar to an iterator), or just a plain object.

@swannodette
Copy link
Contributor

Libraries could do the following:

var stringProp = "@@cognitect/transformer",
    symProp = typeof Symbol != "undefined" ? Symbol.for("cognitect/transformer") : stringProp;

x[stringProp] = x[symProp] = function() {
   /* ... */
};

@jlongster
Copy link

@jlongster Symbol.for always returns the same thing. Dupes would only be across JS Contexts but I've always considered that a lost cause honestly. I suspect that Symbols will consistently compare faster than Strings in the future, that's the primary reason to prefer them.

Ah I see. Semantically I didn't see any benefits but that's true.

Then the second concern was whether the value is a function which defines/returns the transformer (which makes things a bit more flexible and is similar to an iterator), or just a plain object.

Is using obj.constructor like that supported everywhere? You usually want to handle data structures in a specific way, like with the immutable vector returns a mutable version for performance.

@swannodette
Copy link
Contributor

Also I'm personally not against transducer/transformer etc. as long as people don't mind that we're taking that top level name.

@tgriesser
Copy link
Author

Is using obj.constructor like that supported everywhere?

Afaik, yes.

You usually want to handle data structures in a specific way, like with the immutable vector returns a mutable version for performance.

Agreed, the benefit here is that if you're eventually able to subclass vector, you would only need to define the transformer once at the top-level Vector object and it'd be inherited properly (and create one of that instance when init rather than a generic Vector)

@jlongster
Copy link

Here's an idea: we break apart the object into three functions on the prototype:

SomeObject.prototype['@@transducer/init'] = function() {
  return new this.constructor();
};

SomeObject.prototype['@@transducer/step'] = function(result, arr) {
  return result.set(arr[0], arr[1]);
};

SomeObject.prototype['@@transducer/result'] = function(obj) {
  return obj;
};

If we are going to namespace the symbols anyway. If that init function works (I can't remember if this.constructor works like that), it could be the default one and init is optional.

@jlongster
Copy link

Also with the above I think 95% of data structures would just need to implement the step function the default init and result ones should be fine.

@tgriesser
Copy link
Author

Here's an idea: we break apart the object into three functions on the prototype:

That approach sounds great to me!

people don't mind that we're taking that top level name.

I wouldn't mind - though if performance is what we're after, I don't believe the latest benchmarks are leaning in favor of symbols (for now).

@swannodette
Copy link
Contributor

@jlongster I'm assuming we also want "@@transducer/reduced" no?

@jlongster
Copy link

reduced doesn't need to be part of a transducer, does it? I've lost context for transducers a bit, so you'll have to remind me why that's not just a library function that uses the agreed upon __transducers_reduced__ property or whatever that was.

@swannodette
Copy link
Contributor

@jlongster well transducers need to detect it, wether they use a library function to do it shouldn't matter. Just saying that if we're defining the required properties it seems they should all follow the same naming convention. Speaking of which it seems to me we probably need to supply the name to get the reduced value, "@@transducers/value". The benefit is again we're not stealing value from programmers and there's a data oriented way to deal with this stuff (no API).

@jlongster
Copy link

@swannodette This protocol is for implementing the "bottom" transducer though in a data structure so that's it's "transducable". Detecting if something is reduced isn't data structure-specific, right? I guess I have a hard time seeing why this (https://github.com/jlongster/transducers.js/blob/master/transducers.js#L111) belongs here

Edit: oops messed up line numbers

@swannodette
Copy link
Contributor

@jlongster but that's obviously broken right if not handled carefully? If you wrap your value in your own Reduced how can a different transducer from a different library detect it and extract the value? Sorry for not being clear, I think the protocol needs to be more general than the bottom transducer otherwise we're not all going to be able to compose the way we want.

@jlongster
Copy link

@swannodette That's what the __transducers_reduced__ property is for, which makes it compatible across libraries. Wait, are you just saying we should rename that to follow this convection? Not to actually attach it to SomeObject.prototype, but that our library's isReduced should check @@transducer/reduced? Yeah we can rename that, definitely.

@swannodette
Copy link
Contributor

@jlongster yep just saying we should rename to make this whole thing uniform. So the protocol is not just about the transformers, but how to deal with reduced values as well. Then you can really mix and match libraries.

@jlongster
Copy link

Absolutely! Sorry for the confusion. So we're in agreement?

  • Data structures can implement 3 methods: @@transduce/step, @@transduce/init, and @@transduce/result. Only step is required.
  • A "reduced" object is indicated by a @@transduce/reduced property that equals "true". If this is a true, the value can be retrieved on the value property of the same object.

@swannodette
Copy link
Contributor

@jlongster I think the value should be stored in @@transducer/value, the truth is that people duck type in the craziest ways you'd never expect. Putting the value in a key unlikely to clash just avoids all of this.

@Gozala
Copy link

Gozala commented Mar 20, 2015

This all sounds great.

While we're at it, what do you think of using @@transducer/reduce to mark a dispatch method of type function(reducingFunction, init) for use similar to IReduceInit in Clojure? (Useful with eduction, etc.)

I would like that, but would rather use different function signature (structure, reducer, init) => state, which is what I actually us already:
https://github.com/Gozala/transducers/blob/1f0efdfef18fe253d1c9b3759967401d431a23d1/src/transducers.js#L277-L311

@Gozala
Copy link

Gozala commented Mar 21, 2015

I would also like to propose that target[Symbol.for("transducer/reduce")] was invoked as follows:

target[Symbol.for("transducer/reduce")](step, initial, collection)

Note that step here has a (result, input) => result signature, as unless I'm mistaken reduce does not need to do init neither it needs to do result both are done by a transduce.

@kevinbeaty
Copy link

I was thinking about it a little differently. The idea is that the reducible object does not need to be concerned with transducers or transformers. It would be defined similar to Array.prototype.reduce. You could do something like this:

function reduce(xf, init, coll){
  if(isArray(coll)){
    return arrayReduce(xf, init, coll)
  }
  if(isFunction(coll[Symbol.for('transducer/reduce')])){
    return methodReduce(xf, init, coll)
  }
  // whatever else
  return iteratorReduce(xf, init, coll) 
}

function methodReduce(xf, init, coll){
  var result = coll[Symbol.for('transducer/reduce')](function(result, value){
      return xf.step(result, value)
    }, init)
  return xf.result(result)
}

@kevinbeaty
Copy link

... And the reducible object would just have to be concerned with implementing a contract similar to Array.prototype.reduce:

function eduction(t, coll) {
  return new Eduction(t, coll)
}
function Eduction(t, coll){
  this.t = t
  this.coll = coll
}
Eduction.prototype[Symbo.iterator] = function(){
  return sequence(this.t, this.coll)[Symbol.iterator]()
}
Eduction.prototype[Symbol.for('transducer/reduce')] = function(rf, init){
  return transduce(this.t, rf, init, this.coll)
}

@swannodette
Copy link
Contributor

@kevinbeaty going to leave transducer/reduce alone for now as I think this requires some more thought. The above minimal protocol is enough to get everyone on the same page about the fundamentals.

@swannodette
Copy link
Contributor

Implemented and released. Thanks everyone!

@jlongster
Copy link

Same here! My project conforms.

@Gozala
Copy link

Gozala commented Apr 3, 2015

https://github.com/gozala/transducers also uses this interface ;)

kevinbeaty added a commit to transduce/transduce that referenced this issue Apr 4, 2015
donabrams pushed a commit to donabrams/immutable-js that referenced this issue Jul 16, 2015
This was referenced Aug 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants