Spec: Formal Subscriptions Definition #305

Merged
merged 22 commits into from
May 17, 2017
Changes from 14 commits
61 changes: 61 additions & 0 deletions spec/Section 5 -- Validation.md
@@ -198,6 +198,67 @@ query getName {
}
```

### Subscription Operation Definitions

#### Single root field

**Formal Specification**

* For each subscription operation definition {subscription} in the document
* Let {rootFields} be the top level selection set on {subscription}.
* {rootFields} must be a set of one.

**Explanatory Text**

Subscription operations must have exactly one root field.

@robzhu what is the reason for that? Why can't I subscribe to more than one subscription with one call?
So far I thought this was an implementation limitation.

Contributor

I agree with Rob's text that starting with just one root field is reasonable. A later version of the spec could support multiple fields, but that requires figuring out a lot of edge cases around errors, when subfields are re-executed, and more.

Contributor

@OlegIlyenko OlegIlyenko May 9, 2017

I also would like to know the reason for this rule. I find it quite a hard limitation, and in a way it goes against GraphQL principles, where the client has the power to decide what it wants to get.

that requires figuring out a lot of edge cases around errors

I think it would be helpful to list and discuss these edge cases. From my experience, there are definitely things to consider when merging different event streams from different GraphQL subscription fields, but I find it quite manageable.

Contributor

@OlegIlyenko OlegIlyenko May 9, 2017

I listed most of the points I found as I was implementing it in sangria here:

#282 (comment)

I think that it boils down to points 3.i, 5 and 6. My take on it is here: #282 (comment)

IMHO, if we need to make a trade off, I would rather disallow not-null root subscription field types than allow only a single subscription field in a query.


@stubailo if one wants to experiment that's fine, but I don't think it should be in the official spec unless there is a reason (which is not implementation) behind it.

Contributor

@stubailo thanks a lot for describing it with an example!

So you're sending two subscription fields in the request, but you're never going to get that entire response.

I don't see it as a problem but rather as natural behavior. This is an inherent property of event-based interaction. This is also the reason why I suggested disallowing not-null root subscription field types.

you can now only unsubscribe to the whole thing at once (maybe a benefit)

I don't see it as an issue either. Can you describe in more detail why this behavior can be disadvantageous? (considering that you can still make 2 separate and isolated subscription queries if that better suits the use case at hand)

it's not clear what to do when one of them has a fatal error - do both get unsubscribed?

I feel that either behavior is fine as long as it is defined in the spec. Though in this case I would suggest drawing inspiration from streaming libraries: if 2 event streams are joined/merged together in a single result stream, then an error in either of these will also cause the result stream to fail. If one of the event streams naturally completes (because of exhaustion), then the result stream still continues to emit events until all of the source streams are exhausted. I think this behavior is quite intuitive and widespread.
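The merge behavior described above could be sketched roughly as follows. This is only an illustration: `createStream` and `merge` are hypothetical helpers written for this example, not part of any GraphQL library, and the streams are synchronous and push-based to keep the sketch short.

```javascript
// Minimal push-based stream. emit/fail/complete are driven by the caller.
function createStream() {
  const observers = [];
  return {
    subscribe(observer) { observers.push(observer); },
    emit(value) { observers.forEach(o => o.next(value)); },
    fail(error) { observers.forEach(o => o.error(error)); },
    complete() { observers.forEach(o => o.complete()); },
  };
}

// Merge: forwards every event, fails on the first source error,
// and completes only after all sources have completed.
function merge(sources) {
  const result = createStream();
  let remaining = sources.length;
  let done = false;
  for (const source of sources) {
    source.subscribe({
      next(value) { if (!done) result.emit(value); },
      error(err) { if (!done) { done = true; result.fail(err); } },
      complete() {
        remaining -= 1;
        if (!done && remaining === 0) { done = true; result.complete(); }
      },
    });
  }
  return result;
}

const likes = createStream();
const comments = createStream();
const log = [];
merge([likes, comments]).subscribe({
  next: v => log.push(v),
  error: e => log.push(`error: ${e.message}`),
  complete: () => log.push('complete'),
});

likes.emit('like 1');
likes.complete();           // one source finishing does not end the merged stream
comments.emit('comment 1');
comments.complete();        // now every source is exhausted
// log is now ['like 1', 'comment 1', 'complete']
```

Under these assumptions, a consumer that needs isolated failure handling would still use two separate subscriptions.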

Contributor Author

@robzhu robzhu May 11, 2017

@OlegIlyenko @DxCx To give a little bit of context on where this came from, we discovered early on that it was better to use subscriptions for modeling granular events. For example, consider the three main subscriptions that operate on a Facebook post: live comments, live likes, and typing indicators. These are individual subscriptions as opposed to a single "postLiveUpdate" subscription. Keeping these subscriptions granular on the client made natural sense. @laneyk and @dschafer may be able to add more perspective here.

Thinking this through, if we include multiple root fields like so:

subscription sub(...) {
  liveLike (...) {...}
  liveComment (...) {...}
}

So far, everyone seems to assume this subscription should publish data when either "live like" or "live comment" publishes. Is that clearly the intent of this query? What if there were a desire to trigger the publish only when both root fields have a publish payload available? How would we describe that? By limiting the selection to a single root field, we sidestep all that.

I also don't think the single-root-field-rule introduces any practical limitations to the client. In fact, it results in simpler client-side code, like so:

likeSubscription.subscribe(payload => updateLikeState(payload));

For subscriptions containing more than one root field, if we assume the "or" behavior, as @stubailo points out, you'll never have more than one event trigger at a time, so the code would end up looking like:

genericSubscription.subscribe(payload => {
  if (payload.subscriptionA) { updateA(payload.subscriptionA); }
  else if (payload.subscriptionB) { updateB(payload.subscriptionB); }
  // etc.
});

Can someone help me understand a compelling use case that is served by multi-root subscription operations that would not be equally served by separate individual subscriptions?

Collaborator

Just to add a meta-point to this conversation:

In my view, there's nothing stopping us from working through these details and figuring out the edge and corner cases of allowing multiple subscription fields in a single request, however the choice we do have is to address those concerns now, or allow for more time to do so. In previous conversations @robzhu has had over the last few months about subscriptions, he has convinced many that this is far more complicated than we originally thought and may not have clear answers. This limitation is added mostly in a desire to expedite the addition of subscriptions to the spec, while reserving the ability to continue to work out how or if multi-field subscriptions should be allowed.

Had this limitation been omitted while also not making mention of how to address multi-field subscriptions in the spec, then we would see divergence of behavior and that could tie our hands in the future for deciding how to address these edge cases.

Contributor

@OlegIlyenko OlegIlyenko May 12, 2017

Thanks a lot @robzhu and @leebyron for the insightful comments! I think I now have a better understanding of the issue. I was also in two minds on this. On one hand, I wanted to get a better understanding of the motivation behind the inclusion of this rule. But on the other hand, I don't want to delay the progress on the inclusion of subscriptions in the spec. I tend to agree that it is a good idea to disallow multiple fields for now and start a separate discussion. I think it is a discussion worth having. I am actually very glad that this point is considered in the spec since I was also quite concerned about the semantics of multiple subscription fields.

Can someone help me understand a compelling use case that is served by multi-root subscription operations that would not be equally served by separate individual subscriptions?

In general, I found it very valuable to have as much information from the client as possible in a single query. For example, the fact that a client can express its requirements for a view or particular part of the application in a single query allowed us to make very interesting optimizations which would be quite hard to do otherwise (it is quite hard to correlate seemingly independent requests/queries). So by allowing the client to better express its requirements with several subscription fields in a single query, we open a door for potential server-side optimizations.

Now that I'm equipped with new insights, I will give it another thought. This thread was definitely helpful in this respect.

Before we introduce this rule though, I think it is important to consider the nullability of the subscription fields, as I mentioned above. It is possible to make a nullable field not-null later on in a backwards-compatible way. If we allow subscription fields to be not-null now, it might become a challenge in the future to allow multiple subscription fields, if we decide to do so.

Collaborator

Nullability may mean two different things in this context - this is a great point we should address.

One thing it may mean is that a subscription may not exist given some inputs in a way that isn't considered an error. I think this interpretation is both not what you were referring to, and also probably confusing to think about. The schema talks about the type of the payload result - so we're talking about the types of responses. We should probably make this point in the spec to clarify.

Secondly the nullability of the responses. This is one of the concerns with multi-field subscriptions to address later. For example, should it be legal to have a subscription field streamThings: String? where it is legal for any payload in the event sequence to in fact be null? I don't see a compelling reason to explicitly disallow this - though it is an edge case.

I think handling the payloads of multi-field subscriptions will need to account for this.


Valid examples:

```graphql
subscription sub {
  newMessage {
    body
    sender
  }
}
```

```graphql
fragment newMessageFields on Message {
  body
  sender
}

subscription sub {
  newMessage {
    ... newMessageFields
  }
}
```

Invalid:

```!graphql
subscription sub {
  newMessage {
    body
    sender
  }
  disallowedSecondRootField
}
```

Introspection fields are counted. The following example is also invalid:

```!graphql
subscription sub {
  newMessage {
    body
    sender
  }
  __typename
Contributor

Not clear what we would name this one if we wanted to get rid of the comment.

Contributor Author

Added a description above.

}
```
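A minimal sketch of how a validator might check this rule, assuming a simplified, hypothetical AST shape (plain objects with `operation`, `name`, and `selections`; this is not the graphql-js AST). It counts only the top level selections, as the formal specification above describes, so introspection fields such as `__typename` are counted too.

```javascript
// Hypothetical, simplified AST: { operation, name, selections: [{ name }] }.
// Returns a list of error messages; an empty list means the rule passes.
function validateSingleRootField(operations) {
  const errors = [];
  for (const op of operations) {
    // The rule applies to subscription operations only.
    if (op.operation !== 'subscription') continue;
    const rootFields = op.selections;
    if (rootFields.length !== 1) {
      errors.push(
        `Subscription "${op.name}" must select exactly one root field, ` +
          `found ${rootFields.length}.`
      );
    }
  }
  return errors;
}

const valid = {
  operation: 'subscription',
  name: 'sub',
  selections: [{ name: 'newMessage' }],
};
const invalid = {
  operation: 'subscription',
  name: 'sub',
  selections: [{ name: 'newMessage' }, { name: '__typename' }],
};

console.log(validateSingleRootField([valid]));   // no errors
console.log(validateSingleRootField([invalid])); // one error
```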

## Fields

121 changes: 119 additions & 2 deletions spec/Section 6 -- Execution.md
@@ -103,8 +103,9 @@ Note: This algorithm is very similar to {CoerceArgumentValues()}.
## Executing Operations

The type system, as described in the “Type System” section of the spec, must
provide a query root object type. If mutations are supported, it must also
provide a mutation root object type.
provide a query root object type. If mutations or subscriptions are supported, it must also provide a mutation and subscription root object type, respectively.
Collaborator

nit: line break to maintain roughly 80c?


### Query

If the operation is a query, the result of the operation is the result of
executing the query’s top level selection set with the query root object type.
@@ -123,6 +124,8 @@ ExecuteQuery(query, schema, variableValues, initialValue):
selection set.
* Return an unordered map containing {data} and {errors}.

### Mutation

If the operation is a mutation, the result of the operation is the result of
executing the mutation’s top level selection set on the mutation root
object type. This selection set should be executed serially.
@@ -143,6 +146,120 @@ ExecuteMutation(mutation, schema, variableValues, initialValue):
selection set.
* Return an unordered map containing {data} and {errors}.
Contributor

Putting a comment here because silly GitHub won't let me comment on the lines above. Up on lines 106-107 it says:

If mutations are supported, it must also provide a mutation root object type.

That should probably say:

If mutations or subscriptions are supported, it must also provide a mutation and subscription root object type, respectively.

Or words to that effect.

Collaborator

Also noticed that a step should be added to {ExecuteRequest} above to handle subscription requests.


### Subscription
Contributor

Should we include some example responses here, and perhaps an example of a lifecycle for a simple subscription?

Contributor Author

Good idea, I've added an example after the unsubscribe section.


We define an event stream as a sequence of discrete events over time that can be
observed. An observer of an event stream may cancel observation to stop
receiving future events.

If the operation is a subscription, the result is an event stream called the
"Publish Stream" where each event in the stream is called a "Publish Payload".
Collaborator

Flip the order of these two paragraphs.

Also for the para on event streams, some additional detail is necessary around event stream completion. Maybe also some examples for color. Perhaps:

An event stream represents a sequence of discrete events over time which can be observed. As an example, a "Pub-Sub" system may produce an event stream when "subscribing to a topic", with an event occurring on that event stream for each "publish" to that topic. Event streams may produce an infinite sequence of events or may complete at any point. Event streams may complete in response to an error or simply because no more events will occur. An observer may at any point decide to stop observing an event stream, after which it must receive no more events from that event stream.

Note: If an event stream's observer has stopped observing, that may be a good opportunity to clean up any associated resources such as closing any connections which are no longer necessary.

Collaborator

For the other paragraph, I'm not sure the events in the stream need a name so much as they need a brief description:

If the operation is a subscription, the result is an event stream called the "Response Stream" where each event in the event stream is the result of executing the operation for each new event on an underlying "Source Stream".

Collaborator

I fear "Publish Stream" may be too ambiguous - it could be too easy to confuse it with the underlying stream. How about: "Source Stream" and "Response Stream"?


#### Subscribe

Executing a subscription creates a persistent function on the server that
maps an underlying event stream to the Publish Stream. The logic to create the
Collaborator

maps an underlying Source Stream to a returned Response Stream

underlying event stream is domain-specific and takes the root field and query
variables as inputs.

Subscribe(schema, subscription, operationName, variableValues, initialValue):
Collaborator

operationName isn't used here. Other related algos like ExecuteQuery look like:

ExecuteQuery(query, schema, variableValues, initialValue):

So perhaps:

ExecuteSubscription(subscription, schema, variableValues, initialValue):


* Let {subscriptionType} be the root Subscription type in {schema}.
* Assert: {subscriptionType} is an Object type.
* Let {selectionSet} be the top level Selection Set in {subscription}.
* Let {rootField} be the first top level field in {selectionSet}.
* Let {eventStream} be the result of running {CreateUnderlyingEventStream(rootField, variableValues)}.
* Let {publishStream} be the result of running {MapEventToPayload(eventStream)}
Collaborator

Add a step which returns {publishStream}


CreateUnderlyingEventStream(rootField, variableValues):

* *Application-specific logic to map from root field/variables to events*
Collaborator

This should probably be filled in. In particular, nothing is referencing the algorithm in Publish yet, which should probably be referenced here.

Or perhaps there's something missing? By name, I would expect this algorithm to map one set of events to another. Maybe this one should be renamed CreateSubscriptionEvents or something along those lines which should actually just be a delegation to provided logic, and then separately include MapSubscriptionEvents which calls into the Publish algo.

Also - this section on mapping event streams is a great opportunity to mention that while described as an inline algorithm, this step is often implemented across services, with the original event stream belonging to a subscription management service, and the execution happening on an API middleware service.

Contributor Author

Yeah, you're right, "map" is misleading here. Changed to "CreateSubscriptionEvents" and hooked up the pseudocode from above and below.

Collaborator

As a reflection of the feedback on graphql/graphql-js#846 - let's break out the top portion of Subscribe and name this second algorithm "Resolve" since it's resolving from a field.

CreateSourceEventStream(schema, subscription, operationName, variableValues, initialValue)

ResolveFieldEventStream(subscriptionType, rootValue, fieldName, argumentValues) <- should mirror ResolveFieldValue - look there for an example of getting these values

Collaborator

Once the Subscribe() step is only these two steps, I think you could make a reference to your bottom most section by adding a Note: to clarify that in many real services, this Subscribe algorithm is run on a separate service than the execute step, and to refer to the section below on Considerations when supporting subscriptions


#### Publish Stream
Collaborator

"Response Stream"?


Each event in the underlying event stream triggers execution of the subscription
selection set.

MapEventToPayload(eventStream):
Collaborator

Missing the arguments referred to below

Collaborator

Perhaps MapSourceToResponseEventStream?


* For each {event} on {eventStream}:
Collaborator

We need to return something, so perhaps we can add a step before this:

* Return a new event stream which yields events as follows:

* Let {publishPayload} be the result of running
{ExecuteSelectionSet(selectionSet, subscriptionType, event, variableValues)}
*normally* (allowing parallelization).
* Let {errors} be any *field errors* produced while executing the
selection set.
* Yield an unordered map containing {publishPayload} and, optionally,
Collaborator

Break this into two such that:

  • Let {response} be an unordered map containing {data} and {errors}.
  • Yield an event containing {response}.

Collaborator

Perhaps we should define ExecuteSubscription to mirror ExecuteQuery? It should be identical except for the first step. Then we could just have `Let {response} be the result of {ExecuteSubscription(...)}`.

Contributor Author

It's pretty close after the accumulated edits.

{errors} on {publishStream}.
* At any time while the publish stream is active, the client or server may
Collaborator

This is not really a step in the algorithm, it can be removed. However, we should have a step here which clarifies what to do when the source event stream runs out:

`* Once {sourceEventStream} has completed, this Event Stream also completes.`

Contributor Author

I think this is clear from the "for each" iteration of {sourceStream}.

Unsubscribe().

Common reasons for Unsubscribe() include:
* client no longer wants to receive payloads for a subscription.
* The underlying event stream has produced an error or has naturally ended.
Collaborator

Move this paragraph below to be within the Unsubscribe section. Also I think this would read easier as prose instead of bullet points.


#### Unsubscribe

Unsubscribe cancels the Publish Stream. This is also a good opportunity for the
Collaborator

cancels the Response Stream?

server to clean up the underlying event stream and any other resources used by
the subscription.

Unsubscribe()

* Cancel {publishStream}
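
The Subscribe, MapEventToPayload, and Unsubscribe steps above can be sketched as follows. This is a rough illustration under stated assumptions: it uses a synchronous generator over an in-memory array as a stand-in for an asynchronous event source, and `executeSelectionSet`, `subscribe`, and the sample events are hypothetical names and data invented for the example.

```javascript
// Hypothetical stand-in for ExecuteSelectionSet: builds one response per event.
function executeSelectionSet(event) {
  return { data: { newMessage: { sender: event.sender, text: event.text } } };
}

// MapEventToPayload: yields one publish payload per source event. The finally
// block runs when the source ends or when the consumer cancels via return().
function* mapEventToPayload(sourceEvents, onCleanup) {
  try {
    for (const event of sourceEvents) {
      yield executeSelectionSet(event);
    }
  } finally {
    onCleanup();
  }
}

// Subscribe: create the underlying event stream, then map it.
function subscribe(createUnderlyingEventStream, onCleanup) {
  const eventStream = createUnderlyingEventStream();
  return mapEventToPayload(eventStream, onCleanup);
}

let cleanedUp = false;
const publishStream = subscribe(
  () => [
    { sender: 'Hagrid', text: "You're a wizard!" },
    { sender: 'Harry', text: 'Hello' }, // illustrative second event
  ],
  () => { cleanedUp = true; }
);

const first = publishStream.next().value; // first publish payload
publishStream.return();                   // Unsubscribe: cancel the publish stream
```

Cancelling with `return()` runs the `finally` block, which is where a server would release the underlying event stream and any other resources held by the subscription.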

#### Example
Collaborator

Elsewhere we use **Example** to avoid creating sections in the TOC for examples. Also, what do you think about moving the Example directly above the algorithms as the last part of the top level "Subscription" section?

Contributor Author

Great suggestion!


As an example, consider a chat application. To subscribe to new messages posted
to the chat room, the client sends a request like so:

```graphql
subscription NewMessages {
  newMessage(roomId: 123) {
    sender
    text
  }
}
```

While the client is subscribed, whenever new messages are posted to the chat
room with ID "123", the selection for "sender" and "text" will be evaluated and
published to the client, for example:

```js
{
  "data": {
    "newMessage": {
      "sender": "Hagrid",
      "text": "You're a wizard!"
    }
  }
}
```

#### Recommendations and Considerations for Supporting Subscriptions
Collaborator

This is a long section title and it's specifically about scale. So perhaps just Supporting Subscriptions at Scale?


Supporting subscriptions is a large change for any GraphQL server. Query and
Collaborator

"significant" instead of "large"?

mutation operations are stateless, allowing scaling via cloning GraphQL server
instances. Subscriptions, by contrast, are stateful. The pieces of state for a
subscription are:

* Subscriber/client channel
* Subscription document
* Variables
* Execution context (for example, current logged-in user, locale, etc.)
* Event stream resulting from Subscribe step (above)
Collaborator

I find this list of things a bit confusing and overly detailed for the point you're trying to make. Perhaps just:

Subscriptions, by contrast, are long-lived and stateful and require maintaining the GraphQL document, variables, and other context over the lifetime of the request.


We recommend thinking about the behavior of your system when this state is lost
Collaborator

Spec shouldn't speak in the third person. Instead:

Consider the behavior of your system when state is lost due to the failure of a single machine in a service. Durability and availability may be improved by having a separate dedicated service for managing this state.

due to single-node failures. We can improve durability and availability by using
dedicated sub-systems to manage this state. For example, event streams can be
built using modern pub-sub systems, and payload delivery to clients can be
handled by a dedicated client gateway tier.

For systems with high capacity, availability, and durability requirements, we
recommend keeping the GraphQL server stateless and delegating all state
persistence to sub-systems that are designed for stateful scaling. Note that
subscription types are still defined in the original schema along with queries
and mutations.
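
One way to picture the stateless arrangement described above is the sketch below. It is only an assumption-laden illustration: the in-memory `Map` stands in for a dedicated stateful sub-system, and `saveSubscription`, `handleEvent`, the IDs, and the context fields are hypothetical names invented for the example.

```javascript
// Hypothetical external store standing in for a dedicated stateful sub-system.
const subscriptionStore = new Map();

// Persist everything needed to execute the subscription later, so that any
// server instance can pick it up.
function saveSubscription(id, { document, variables, context, channel }) {
  subscriptionStore.set(id, JSON.stringify({ document, variables, context, channel }));
}

// A stateless handler: rebuilds the subscription state from the store for
// each incoming event, then executes and returns what should be delivered.
function handleEvent(id, event, execute) {
  const raw = subscriptionStore.get(id);
  if (raw === undefined) return null; // subscription was cancelled or lost
  const state = JSON.parse(raw);
  return { channel: state.channel, payload: execute(state, event) };
}

saveSubscription('sub-1', {
  document: 'subscription NewMessages { newMessage(roomId: 123) { sender text } }',
  variables: {},
  context: { userId: 'user-42', locale: 'en' },
  channel: 'client-7',
});

const delivery = handleEvent(
  'sub-1',
  { sender: 'Hagrid', text: "You're a wizard!" },
  (state, event) => ({ data: { newMessage: event } })
);
```

Because the handler keeps nothing in memory between events, losing a single server instance does not lose the subscription, provided the store itself is durable.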
Collaborator

This note is a bit out of place. You can define this stuff wherever the heck you want; that's an implementation detail. Also, this paragraph is really just repeating the point made in the previous one. I think you can probably just remove it entirely, or else move a few of the more fine-pointed words into the prior paragraph.


## Executing Selection Sets
