Interface fragment expansion causes conflicting field types #1257
Comments
Thanks for the report. And you are obviously correct, this is an issue. Unfortunately, for the current
And so, because the 0.x line is now essentially in "maintenance" mode, I think this will remain a limitation of those 0.x versions (barring someone coming up with an alternative fix that is simple enough to be confidently put in a maintenance release). Essentially, you should ensure that interface implementations use the exact same types as the interface to avoid running into this problem.

The brighter news is that the upcoming major version of federation (version 2, currently in alpha) already does better here. For instance, in the example of the description, querying the interface field would not expand into fragments internally and you won't run into this issue.

The slightly less good news is that this doesn't mean that this is fully fixed in federation 2 (yet), because there are cases where expanding the interface is kind of necessary. That's because in federation, you could have one implementation where a field is resolved "locally" but another implementation where that same field is external and resolved by another subgraph, which implies internally making different kinds of queries depending on the concrete implementation (that's why current federation always does this btw). Anyway, I believe that means you can still run into this problem in federation 2, though an example of that is a tad more involved (and so hopefully you are less likely to run into it in practice).

Thankfully I think this can be reasonably fixed (by making the implementation a bit more judicious regarding what is expanded and what isn't), but I need to take some time to double check. So stay tuned.
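The advice above (keep implementation field types identical to the interface's) can be checked mechanically. The following is only a minimal sketch, assuming a hypothetical, simplified schema model where each field type is the plain SDL string; the type names `TextPost` and `ImagePost` are invented for illustration, and none of this is the gateway's or composition's actual code.

```typescript
// Hypothetical, simplified schema model: field name -> type as written in SDL.
type FieldTypes = Record<string, string>;

interface TypeDef {
  name: string;
  fields: FieldTypes;
}

// Returns one message per implementation field whose type string differs
// from the type declared for the same field on the interface.
function findTypeMismatches(iface: TypeDef, impls: TypeDef[]): string[] {
  const mismatches: string[] = [];
  for (const impl of impls) {
    for (const [fieldName, ifaceType] of Object.entries(iface.fields)) {
      const implType = impl.fields[fieldName];
      if (implType !== undefined && implType !== ifaceType) {
        mismatches.push(
          `${impl.name}.${fieldName} is ${implType} but ${iface.name}.${fieldName} is ${ifaceType}`
        );
      }
    }
  }
  return mismatches;
}

// Illustrative data only: one implementation narrows `likes` to non-nullable.
const postInterface: TypeDef = { name: "Post", fields: { likes: "Int" } };
const postImpls: TypeDef[] = [
  { name: "TextPost", fields: { likes: "Int" } },
  { name: "ImagePost", fields: { likes: "Int!" } },
];
const postMismatches = findTypeMismatches(postInterface, postImpls);
```

Comparing type strings is deliberately strict: `Int!` is a valid implementation of `Int` in GraphQL, but it is exactly the kind of difference that triggers the problem discussed here.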
Well, I double-checked and that idea of mine unfortunately doesn't work. So using aliases is the only idea that I can see working properly at the moment. And as mentioned above, while doing so is relatively simple conceptually, I suspect the implementation is somewhat involved: the gateway will need a way to know when aliases are used in this manner to merge the responses back, and either we use a special naming scheme for those aliases, but we'd have to make sure it doesn't conflict with user aliases, or we add something new to query plans, and it's not obvious what this could look like. Anyway, I really want this fixed eventually, but to set expectations, this may take a little while to get prioritized given current priorities and the facts that:
But in the meantime, what I would suggest is to detect the problematic cases early and throw a meaningful error, instead of failing at runtime and letting users figure it out. Adding such validation is comparatively simple and I've pushed a PR for this at #1318.
Hey @pcmanus, thanks for investigating, I really appreciate the effort. Just wanted to give an update on my side for you or anyone else experiencing the same problem. As a workaround we looked into adjusting the type implementing the interface with the non-nullable field so that the field is nullable, at the cost of having to find and update all consumers of that type. After investigating all consumers we were fortunate enough in our particular case to be able to remove the consuming code completely. I think the biggest risk (and certainly the biggest impact for my company when we experienced it) is the fact that this problem can arise without warning when porting to federation. So I think adding a meaningful error message is a great start. Maybe even a note or warning in the federation docs could be helpful too.
…ter issue This patch detects the cases that would lead to the runtime issue described on apollographql#1257 and rejects composition. That is, this is not a fix for apollographql#1257 but it ensures that, until we get to fix that issue properly, we at least error out earlier and with a more helpful error.
For the record, I'd like to note that the PR I pushed (#1318) does not handle all cases, and that detecting all cases is probably quite involved. The reason is that the patch essentially checks, for each interface field, if we may need to "type-explode" for that field, which federation 2 avoids unless there is an But sometimes we may type-explode for a field where the type is the same everywhere, but fetching that field may require other fields (keys for one, but And it's pretty hard to validate because this cannot be validated on a single subgraph, but rather depends essentially on which

Additionally, this can happen with unions, not just interfaces. Consider the

```graphql
type Query {
  u: U
}

union U = A | B

type A @key(fields: "id") {
  id: Int
  f: Int @external
  g: Int
}

type B @key(fields: "id") {
  id: Int
  f: Int @external
  g: String
}
```

And consider we do the following query:

```graphql
{
  u {
    ... on A {
      f
    }
    ... on B {
      f
    }
  }
}
```

This looks innocuous because

```graphql
type A @key(fields: "id") {
  id: Int
  f: Int @requires(fields: "g")
  g: Int @external
}

type B @key(fields: "id") {
  id: Int
  f: Int @requires(fields: "g")
  g: String @external
}
```

Then the query to the first subgraph would actually include

```graphql
{
  u {
    ... on A {
      __typename
      id
      g
    }
    ... on B {
      __typename
      id
      g
    }
  }
}
```

but that is invalid due to And it's quite hard to validate why the example should be rejected, because you cannot infer it from either subgraph individually.

So I still suggest we finish reviewing and commit #1318, because it's ready and it's better than nothing, but we definitely need to find cycles for fixing this properly.
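To make the conflict in the last generated fetch concrete, here is a minimal sketch of a check in the spirit of the GraphQL spec's `FieldsInSetCanMerge` rule. It assumes a hypothetical flattened view of the selection set (`SelectedField`) and compares types as plain strings; the real rule is more nuanced, and this is not how graphql-js or the query planner implements it.

```typescript
// Hypothetical flattened view of a selection set: each entry is a field
// selected under a type condition, along with the field's declared type.
interface SelectedField {
  typeCondition: string; // e.g. "A" for `... on A`
  responseName: string;  // alias if present, otherwise the field name
  fieldType: string;     // type as written in the schema, e.g. "Int"
}

interface Conflict {
  responseName: string;
  types: string[];
}

// Groups fields by response name and reports every name that is selected
// with more than one distinct type.
function findConflicts(fields: SelectedField[]): Conflict[] {
  const typesByName = new Map<string, Set<string>>();
  for (const f of fields) {
    const seen = typesByName.get(f.responseName) ?? new Set<string>();
    seen.add(f.fieldType);
    typesByName.set(f.responseName, seen);
  }
  const conflicts: Conflict[] = [];
  typesByName.forEach((types, responseName) => {
    if (types.size > 1) {
      conflicts.push({ responseName, types: Array.from(types).sort() });
    }
  });
  return conflicts;
}

// The generated fetch from the example: `g` is Int on A but String on B.
const generatedFetch: SelectedField[] = [
  { typeCondition: "A", responseName: "__typename", fieldType: "String!" },
  { typeCondition: "A", responseName: "id", fieldType: "Int" },
  { typeCondition: "A", responseName: "g", fieldType: "Int" },
  { typeCondition: "B", responseName: "__typename", fieldType: "String!" },
  { typeCondition: "B", responseName: "id", fieldType: "Int" },
  { typeCondition: "B", responseName: "g", fieldType: "String" },
];
const fetchConflicts = findConflicts(generatedFetch);
```

Note that the user's query only asked for `f`; the conflict on `g` appears only once the planner adds the fields needed by `@requires`, which is why neither subgraph can be blamed in isolation.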
I wanted to sum up where we are on this issue since there's been a bit of back-and-forth on my part, and even a related PR merged, and this might have muddied the waters.

The problem

First, the problem: GraphQL specifies that if 2 fields within a selection set have the same response name, then they must have the exact same type. So if you have some field

```graphql
{
  t {
    ... on A {
      x
    }
    ... on B {
      x
    }
  }
}
```

then both

Now, if a user does this, it's invalid and they need to use an alias for at least one of the However, there are cases where the query planner can generate queries that fall into the invalid "pattern" we just described, even if the original user query does not have that pattern. Afaict, there are 2 main reasons for this:
Potential solutions

As mentioned previously, the only proper solution to this issue is to ensure that the query planner does what any user would do when faced with this pattern: it should use aliases in the subgraph fetches to ensure that fields with conflicting types end up having different response names.

However, while the principle is simple, I think the concrete implementation is a bit of effort. First, this isn't a query-planner-only change. Even assuming the QP knows to add aliases in the proper cases to make the subgraph query valid, it means that some data in the result of that fetch will essentially have the wrong name (the alias, instead of the actual field name). So the execution part of the query planner would need to "transform" the returned subgraph data, renaming the aliased fields back to their original names, and this before merging the subgraph data into the in-memory result data. Which probably means that the query plan for such fetches should list a number of post-query rewrites that need to be performed by execution (which means an addition to the query plan format in particular). And all that needs to account for the fact that the original query may have aliases in the first place (at least for the type-explosion case; for the case of non-queried

On the side of the query planner implementation, there are also a few questions. In particular, is it easier to detect the cases where we must add an alias and only add it then, or is it easier to add aliases more routinely, even if it's not always useful? To be fair, we probably want to do the former, mostly because the latter would probably change tons of existing query plans and this could be a scare for users on upgrade, but I'm unsure the former is the simpler/most efficient option.

So that's my "summary" (I'm bad at this, right!?). And I'm taking the time to lay this out because, as it is not a trivial chunk of work, I'm not yet sure when this will rise to the top of the TODO list for the good folks here at Apollo, so if someone feels like scratching that itch in the meantime, this might help that someone get started.
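The planner-side half of this idea (alias only when a conflict is detected, and record what has to be renamed back) might look roughly like the sketch below. Everything here is an assumption made for illustration: the `PlannedField` and `Rewrite` shapes, the `addAliases` helper, and the `__alias_N` naming scheme are hypothetical and are not the actual query planner code or query plan format.

```typescript
// Hypothetical planner-side model of one field in a subgraph fetch.
interface PlannedField {
  typeCondition: string; // e.g. "B" for `... on B`
  fieldName: string;
  fieldType: string;     // type as written in the schema
  alias?: string;        // user alias, or an alias added by addAliases
}

// What execution must undo after the fetch: rename `from` back to `to`.
interface Rewrite {
  typeCondition: string;
  from: string;
  to: string;
}

// Aliases a field only when its response name was already used with a
// different type; generated aliases never reuse an existing response name.
function addAliases(fields: PlannedField[]): { fields: PlannedField[]; rewrites: Rewrite[] } {
  const used = new Set(fields.map((f) => f.alias ?? f.fieldName));
  const firstType = new Map<string, string>();
  const rewrites: Rewrite[] = [];
  const out = fields.map((f): PlannedField => {
    const responseName = f.alias ?? f.fieldName;
    const known = firstType.get(responseName);
    if (known === undefined) {
      firstType.set(responseName, f.fieldType);
      return f;
    }
    if (known === f.fieldType) return f;
    // Conflict: pick the first generated alias that is not already in use.
    let n = 0;
    let alias = `${responseName}__alias_${n}`;
    while (used.has(alias)) {
      n += 1;
      alias = `${responseName}__alias_${n}`;
    }
    used.add(alias);
    rewrites.push({ typeCondition: f.typeCondition, from: alias, to: responseName });
    return { ...f, alias };
  });
  return { fields: out, rewrites };
}

// Renders a field as it would appear in the subgraph query.
function renderField(f: PlannedField): string {
  return f.alias ? `${f.alias}: ${f.fieldName}` : f.fieldName;
}

const planned = addAliases([
  { typeCondition: "A", fieldName: "g", fieldType: "Int" },
  { typeCondition: "B", fieldName: "g", fieldType: "String" },
]);
const renderedFields = planned.fields.map(renderField);
```

Because fields without a conflict are returned untouched, fetches that were already valid stay unchanged, which matches the stated preference for not altering existing query plans.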
In a few situations, the query planner was generating queries where the same response name was queried at the same "level" with incompatible types, resulting in invalid queries (the queries were failing the [`FieldsInSetCanMerge`](https://spec.graphql.org/draft/#FieldsInSetCanMerge()) validation of the GraphQL spec). This commit detects this case, and when it would happen, aliases one of the occurrences in the fetch to make the query valid. Once the fetch result is received, the aliased value is rewritten to its original response name. Fixes apollographql#1257
* Use alias in QP when querying conflicting fields

  In a few situations, the query planner was generating queries where the same response name was queried at the same "level" with incompatible types, resulting in invalid queries (the queries were failing the [`FieldsInSetCanMerge`](https://spec.graphql.org/draft/#FieldsInSetCanMerge()) validation of the GraphQL spec). This commit detects this case, and when it would happen, aliases one of the occurrences in the fetch to make the query valid. Once the fetch result is received, the aliased value is rewritten to its original response name. Fixes #1257

* Review feedback: add test for alias conflicts and fix related code
* Regen error doc
The query planner in Federation 2.3 adds a new concept of "data rewrites" for fetches in order to support both `@interfaceObject` and to support the fix for apollographql/federation#1257. Those "rewrites" describe simple updates that need to be performed either on the inputs (the "representations" passed to `_entities`; need to rewrite the `__typename` when sending queries to an `@interfaceObject`) or the output of a fetch (needed when a field has been aliased to permit the subgraph query to be valid, but that field needs to be "un-aliased" to its original name after the fetch). This commit implements those rewrites.
The query planner in Federation 2.3 adds a new concept of "data rewrites" for fetches in order to support both `@interfaceObject` and to support the fix for apollographql/federation#1257. Those "rewrites" describe simple updates that need to be performed either on the inputs (the "representations" passed to `_entities`; need to rewrite the `__typename` when sending queries to an `@interfaceObject`) or the output of a fetch (needed when a field has been aliased to permit the subgraph query to be valid, but that field needs to be "un-aliased" to its original name after the fetch). This commit implements those rewrites. Co-authored-by: Geoffroy Couprie <geoffroy@apollographql.com>
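As a rough illustration of the two kinds of "data rewrites" described in these commit messages, the sketch below applies a rewrite to a single flat JSON object. The `DataRewrite` shapes, the `applyRewrite` helper, and the `Book`/`Media` type names are assumptions made for this example; the actual query plan format and its handling of nested paths are not reproduced here.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

// Hypothetical rewrite shapes, loosely modeled on the description above.
interface RenameKey {
  kind: "rename-key";
  typeCondition: string; // only applies to objects with this __typename
  from: string;          // alias used in the subgraph query
  to: string;            // original response name
}
interface SetValue {
  kind: "set-value";
  typeCondition: string;
  key: string;
  value: Json;
}
type DataRewrite = RenameKey | SetValue;

// Applies one rewrite to one object and returns a new object; objects whose
// __typename does not match the rewrite's type condition are returned as-is.
function applyRewrite(obj: JsonObject, rewrite: DataRewrite): JsonObject {
  if (obj["__typename"] !== rewrite.typeCondition) return obj;
  if (rewrite.kind === "set-value") {
    return { ...obj, [rewrite.key]: rewrite.value };
  }
  const result: JsonObject = {};
  for (const [key, value] of Object.entries(obj)) {
    result[key === rewrite.from ? rewrite.to : key] = value;
  }
  return result;
}

// Output rewrite: "un-alias" a field after the fetch.
const unaliased = applyRewrite(
  { __typename: "B", id: 2, g__alias_0: "two" },
  { kind: "rename-key", typeCondition: "B", from: "g__alias_0", to: "g" }
);

// Input rewrite: change the __typename of a representation before sending it
// to a subgraph that only knows the type through an @interfaceObject.
const representation = applyRewrite(
  { __typename: "Book", id: 1 },
  { kind: "set-value", typeCondition: "Book", key: "__typename", value: "Media" }
);
```

Keeping both rewrites as plain data keeps the executor generic: it only needs to know how to rename a key or set a value, not why the planner asked for it.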
I have an interface with a nullable field, where one implementer has the field marked nullable and another implementer has it marked non-nullable. (I've changed the names of the types from the originals, but the concept is the same).
When querying against the interface, the field is nullable regardless of the returned type.
It seems that when receiving a query like this, the gateway will use information about the implementers of the interface to expand the fields under `post` into fragments like this:

In this query the type of `post.likes` could be `Int` or `Int!`. This results in a GraphQL validation error because there are two different types trying to be bound to the same field.

I believe the way the gateway expands interface fields into inline fragments is incorrect or problematic as it causes problems like these. If I ran the source query directly on the subgraph it would have no problem, but the changes the gateway makes result in an invalid query.
Potential solutions would be to not expand interface fields into inline fragments, or to give the common field in the fragments different aliases (as suggested by the error message), then rename them to the original name before returning the query.
Here are the packages I used:

* `@apollo/gateway` (gateway) 0.43.1
* `apollo-server` (gateway) 3.5.0
* `@apollo/subgraph` (subgraph) 0.1.4
* `apollo-server-koa` (subgraph) 3.4.0
* `graphql` (subgraph) 15.7.2