Discussion: Cursor-based pagination on unions #344

Closed
darrellwarde opened this issue Jul 21, 2021 · 1 comment

@darrellwarde
Contributor

Ok, three ways I think you can do this:

  1. Filter out the empty nodes prior to collection in the Cypher command.

I'm not a Cypher expert, but couldn't you run a WHERE filter on the returned edges to filter out the empty ones before returning them to the GraphQL layer? For example:

CALL {
    WITH this
    CALL {
        WITH this
        OPTIONAL MATCH (this)-[this_wrote:WROTE]->(this_Book:Book)
        WITH { words: this_wrote.words, node: { __resolveType: "Book", title: this_Book.title } } AS edge
        RETURN edge
    UNION
        WITH this
        OPTIONAL MATCH (this)-[this_wrote:WROTE]->(this_Journal:Journal)
        WITH { words: this_wrote.words, node: { __resolveType: "Journal", subject: this_Journal.subject } } AS edge
        RETURN edge
    }
    // <------ FILTER OUT EMPTY NODES HERE?
    WITH collect(edge) as edges, count(edge) as totalCount
    RETURN { edges: edges, totalCount: totalCount } AS publicationsConnection
}
  2. Drop the opinionated implementation on unions and return empty nodes.

I'm in favor of this one. It keeps the count consistent, and the worst-case scenario is that you get an edge like the one below, which is still helpful as it indicates the resolveType and existence of a node, even if no properties are returned.

// empty edge
{
  cursor: "1234579348",
  node: {
    __resolveType: "Movie"
  }
}

  3. Keep the opinionated implementation and do fuzzy math to keep the cursors consistent.

You can swap the order of the union filter and the createConnection functions so that you generate the connection based on all the returned values but then filter out the empty ones before returning to the client. The forward cursor-based pagination will still work since the cursors will be calculated in order, but it means that the number of items requested by the user might be different than what is returned.

There's also some wonky math with the totalCount you'd have to keep track of, but I suppose it's possible.
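
For what it's worth, option 3 could be sketched roughly like this in TypeScript. This is only an illustration under assumptions, not the library's implementation: `createFilteredConnection`, `isEmptyNode`, the `offset:N` cursor format and the type names are all made up for the example, and an "empty" node is assumed to be one whose only non-null field is `__resolveType`.

```typescript
// Hypothetical shapes for illustration; not the library's actual types.
interface UnionNode {
  __resolveType: string;
  [property: string]: unknown;
}

interface Edge {
  cursor: string;
  node: UnionNode;
}

interface Connection {
  edges: Edge[];
  totalCount: number;
}

// Assumption: a node is "empty" when every field other than
// __resolveType is missing or null.
function isEmptyNode(node: UnionNode): boolean {
  return Object.entries(node).every(
    ([key, value]) => key === "__resolveType" || value == null
  );
}

// Cursors are assigned over ALL rows first so positions stay stable,
// and only then are the empty edges dropped.
function createFilteredConnection(
  nodes: UnionNode[],
  first: number,
  offset = 0
): Connection {
  const page: Edge[] = nodes.slice(offset, offset + first).map((node, i) => ({
    cursor: `offset:${offset + i}`, // placeholder cursor encoding
    node,
  }));
  const edges = page.filter((edge) => !isEmptyNode(edge.node));
  // The "wonky math": totalCount has to ignore empty nodes as well.
  const totalCount = nodes.filter((node) => !isEmptyNode(node)).length;
  return { edges, totalCount };
}

const rows: UnionNode[] = [
  { __resolveType: "Book", title: "Dune" },
  { __resolveType: "Journal", subject: null },
  { __resolveType: "Book", title: "Emma" },
];

const filteredConnection = createFilteredConnection(rows, 2);
```

Here two items are requested but only one edge comes back, while the surviving cursor still reflects its original position. That is the mismatch between requested and returned items mentioned above.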

Originally posted by @litewarp in #334 (comment)

@darrellwarde
Contributor Author

So! #362 introduces some further changes into how it is decided which union members will be projected, which fits much better into calculating the total count. From the PR:

Which union members are returned by a Query are dictated by the where filter applied.

For example, the following will return all user content, and you will specifically get the title of each blog.

query GetUsersWithBlogs {
    users {
        name
        content {
            ... on Blog {
                title
            }
        }
    }
}

The query below, by contrast, will only return blogs. For instance, we could use a filter to check that the title is not null to essentially return all blogs:

query GetUsersWithAllContent {
    users {
        name
        content(where: { Blog: { title_NOT: null }}) {
            ... on Blog {
                title
            }
        }
    }
}

Conceptually, this maps to the WHERE clauses of the subquery unions in Cypher. Going back to the first example with no where argument, each subquery has a similar structure:

CALL {
    WITH this
    OPTIONAL MATCH (this)-[has_content:HAS_CONTENT]->(blog:Blog)
    RETURN { __resolveType: "Blog", title: blog.title }
UNION
    WITH this
    OPTIONAL MATCH (this)-[has_content:HAS_CONTENT]->(post:Post)
    RETURN { __resolveType: "Post" }
}

Now if we were to leave both subqueries and add a WHERE clause for blogs, it would look like this:

CALL {
    WITH this
    OPTIONAL MATCH (this)-[has_content:HAS_CONTENT]->(blog:Blog)
    WHERE blog.title IS NOT NULL
    RETURN { __resolveType: "Blog", title: blog.title }
UNION
    WITH this
    OPTIONAL MATCH (this)-[has_content:HAS_CONTENT]->(post:Post)
    RETURN { __resolveType: "Post" }
}

As you can see, the subqueries are now "unbalanced", which could result in massive overfetching of Post nodes.

So, when a where argument is passed in, we only include the union members which appear in the where object. It is essentially acting as a logical OR gate across union members, which is different from the rest of our where arguments:

CALL {
    WITH this
    OPTIONAL MATCH (this)-[has_content:HAS_CONTENT]->(blog:Blog)
    WHERE blog.title IS NOT NULL
    RETURN { __resolveType: "Blog", title: blog.title }
}
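
The member selection described here could be sketched roughly as follows in TypeScript. This is a sketch under assumptions rather than the code from the PR: `membersToProject` and `UnionWhere` are hypothetical names, and an empty where object is not special-cased.

```typescript
// Hypothetical type: per-member filters keyed by union member name.
type UnionWhere = Record<string, Record<string, unknown>>;

// With no `where`, every union member gets a subquery. With a `where`,
// only the members it names do, which gives the OR-like behaviour.
function membersToProject(allMembers: string[], where?: UnionWhere): string[] {
  if (where === undefined) {
    return allMembers;
  }
  return allMembers.filter((member) =>
    Object.prototype.hasOwnProperty.call(where, member)
  );
}

const unionMembers = ["Blog", "Post"];

// No filter: both Blog and Post subqueries are generated.
const unfilteredMembers = membersToProject(unionMembers);

// Filter on Blog only: the Post subquery is dropped entirely.
const blogMembersOnly = membersToProject(unionMembers, {
  Blog: { title_NOT: null },
});
```

Dropping the unnamed members up front is what keeps the generated subqueries balanced, since no unfiltered Post branch is left to overfetch.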

With this in place, I believe it should now be a simple enough job to introduce the first and after arguments to union fields. Perhaps not so much of a discussion necessary now!
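
As a rough idea of what first and after could look like once the set of projected union members is fixed, here is a small TypeScript sketch. Everything in it is an assumption made for illustration: `paginateUnion`, `encodeCursor`, `decodeCursor` and the `offset:N` cursor format are hypothetical and are not taken from the library.

```typescript
// Hypothetical argument and result shapes for illustration only.
interface PageArgs {
  first?: number;
  after?: string;
}

interface Page<T> {
  edges: { cursor: string; node: T }[];
  totalCount: number;
  hasNextPage: boolean;
}

// Placeholder cursor format: the item's position in the full result list.
function encodeCursor(index: number): string {
  return `offset:${index}`;
}

function decodeCursor(cursor: string): number {
  const prefix = "offset:";
  const index = Number.parseInt(cursor.slice(prefix.length), 10);
  if (!cursor.startsWith(prefix) || Number.isNaN(index)) {
    throw new Error(`Invalid cursor: ${cursor}`);
  }
  return index;
}

// Forward pagination: start just after the `after` cursor and take
// at most `first` items; totalCount covers the whole list.
function paginateUnion<T>(items: T[], args: PageArgs): Page<T> {
  const start = args.after === undefined ? 0 : decodeCursor(args.after) + 1;
  const end = args.first === undefined ? items.length : start + args.first;
  const edges = items.slice(start, end).map((node, i) => ({
    cursor: encodeCursor(start + i),
    node,
  }));
  return { edges, totalCount: items.length, hasNextPage: end < items.length };
}

const blogs = [
  { __resolveType: "Blog", title: "A" },
  { __resolveType: "Blog", title: "B" },
  { __resolveType: "Blog", title: "C" },
  { __resolveType: "Blog", title: "D" },
];

const firstPage = paginateUnion(blogs, { first: 2 });
const lastCursor = firstPage.edges[firstPage.edges.length - 1].cursor;
const secondPage = paginateUnion(blogs, { first: 2, after: lastCursor });
```

Because only the members named in where are projected, totalCount here is simply the length of the result list, with no empty nodes to subtract.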
