Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial query extraction using AST #553

Merged
merged 16 commits into from
Aug 31, 2018
Merged

Initial query extraction using AST #553

merged 16 commits into from
Aug 31, 2018

Conversation

evans
Copy link
Contributor

@evans evans commented Aug 14, 2018

This is a query extraction technique. We currently output map from sha to query string in a file, manifest.json. The output location could be customizable and probably should be stdout by default, so that a user can pipe it into a file should they choose.

fixes #556

@evans evans requested a review from abernix August 14, 2018 22:05
Copy link
Member

@abernix abernix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Super glad to see the foundation is concise and straightforward. Just a couple comments...

import { Command, flags } from '@oclif/command';
import * as Listr from 'listr';
import { print } from 'graphql';
import { sha512 } from 'js-sha512';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a particular reason for using js-sha512, a JavaScript implementation of the SHA-512 routine, rather than Node.js’ native crypto.createHash (which is implemented in C/C++).

@@ -0,0 +1,80 @@
import 'apollo-codegen-core/lib/polyfills';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What’s this polyfill relatated to? Might be worth putting a comment about its necessity/purpose since it’s imported in its entirety rather generically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why it's included myself. I'll dig around and see. My first guess is lack of native support for the functions in Node 8

import { sha512 } from 'js-sha512';
import * as fs from 'fs';

import { loadQueryDocuments } from 'apollo-codegen-core/lib/loading';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Longer term, is it confusing to leave the loadQueryDocuments functionality in apollo-codegen-core? (It doesn’t really sound like a utility library and we’re not doing code-gen.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm we could probably make the judgement of apollo-codegen-core in general. Let's keep it in mind as we continue digging into the functionality

{
title: 'Outputing manifest',
task: async ctx => {
fs.writeFileSync('manifest.json', JSON.stringify(ctx.mapping));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should really be an option before this gets merged. People will riot otherwise 😂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😆 Totally! What do you think about the default being output to stdout and then having an option to specify the file with --output? Or should it automatically send it to the manifest?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be one option, but consider the rest of the cli that just takes the output path as the last arg without a flag.

@evans evans force-pushed the evans/query-extraction branch 2 times, most recently from 7b7d144 to 63c76f2 Compare August 29, 2018 05:48
@evans
Copy link
Contributor Author

evans commented Aug 29, 2018

This can be tested with a command similar to

(cd ~/engine/frontend \
&& ~/apollo-cli/packages/apollo-cli/bin/run queries:extract \
    --queries=**/*.jsx --queries=./imports/**/*.js \
&& cat manifest.json | jq)

matches = extractDocumentsWithAST(content, options);
} catch (e) {
console.log('Extraction using AST failed with \n', e, '\nRetrying using regex');
const re = new RegExp(tagName + 's*`([^`]*)`', 'g');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious what the s* here is meant to implement? Tag plurality? (Is that a thing?; i.e. gqls, graphqls?)

Copy link
Member

@abernix abernix Aug 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should look into why the parse is failing since it seems to necessitate a switch from the underlying parser to something which supports more modern ECMAScript features — a configuration tweak, usually.

All in all, I'm opposed to falling back to a regular expression parse technique in the event that an AST parse has failed. While it will work in some cases, it's still fragile and could mis-identify or under-match an operation.

For example, this regular expression won't support a properly-escaped backtick within a document, and it will fail to properly concatenate two operations which have been joined together (e.g. with +, Array.prototype.join, etc.).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like gqls is not a thing prettier/prettier#2092. I wasn't sure about it and hadn't pinned down the usage. Let's remove it!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the regex fallback makes sense. It will cause some configuration pain in some cases., i.e. actively excluding .d.ts files, since there are many that don't parse well. For the most part, I trust the parsers. Running it on spectrum and the engine frontend yielded good results.

import { format } from "../schema/check";
import { resolveDocumentSets } from "../../config";
import { loadConfigStep } from "../../load-config";
import { engineFlags } from '../../engine-cli';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to revert the Prettier changes present in 7214296? We should run it on the whole repository outside the scope of this PR if that's what we want, but for the sake of accurately reviewing this PR it should be kept as a separate concern.

@evans evans force-pushed the evans/query-extraction branch 3 times, most recently from a99c0aa to d45e43c Compare August 29, 2018 22:38
@evans
Copy link
Contributor Author

evans commented Aug 29, 2018

The behavior of the query:extract now mirrors schema:download in terms of output location.

The PR also improves fragment inclusion for query checking, fixing #556. The changes also apply to mobile, since the fragment inclusions work for .gql and .graphql files.

@evans evans force-pushed the evans/query-extraction branch from 27e9209 to ea03f70 Compare August 30, 2018 17:08
@evans evans changed the title [WIP] Initial query extraction Initial query extraction using AST Aug 31, 2018
@evans evans force-pushed the evans/query-extraction branch from a72d3dc to 0d1c0b6 Compare August 31, 2018 02:44
Outputs map from sha to query string in a file, manifest.json. This
could be customizable and really probably should be piped to std out by
default, so that a user can pipe it into a file should they choose.

* [squash] fix hash digest

* remove unused code left from query:check

* use native node crypto library

* query:extract remove unused json flag
@evans evans force-pushed the evans/query-extraction branch 2 times, most recently from fcf239f to 15d1d97 Compare August 31, 2018 17:17
evans added 13 commits August 31, 2018 11:09
This approach becomes much too fragile with dynamic imports, so a
different technique is used in a later commit.
Also output the sha in base64
Fallback on the naive regext search for gql` in the cases that the
parsing fails. This also removes the code that chased down imports and
attempted to resolve them with variable/import analysis. This strategy
proves to be too fragile when attempting to construct an interpolated
string. There are too many different ways to import values. Additionally
in the future world of import(), which returns a promise, would be
impossible to support.

* import apollo-engine-reporting for AST manipulation

* remove unused variables

* continue traversing AST for nested template literal
Updates the @bable/parser version. Improves error message for parser
failure.

* add clarity to parser choosing logic
We add to the Error's message from parser instead of throwing our own
error
This moves the extraction and combination to apollo-codegen-core. These
operations are shared across other operations, such as type generation.
In loading.ts and extract.ts
This matches the queries:download arguments
This isolates the test cases that are failing and indicates that the
file globing should be done in a single location.
The tests depend on the fact that the git remote is set to apollo-cli
loadConfigStep pulls the queries value from the flags passed to the
command, so there is no need to use a custom parsing function. Though
the name of the function requires an in-depth look at the
implementation.
@evans evans force-pushed the evans/query-extraction branch 6 times, most recently from 3bd807f to b0f4a74 Compare August 31, 2018 18:57
Dogfoods apollo queries:check
@evans evans force-pushed the evans/query-extraction branch from b0f4a74 to 53b2724 Compare August 31, 2018 19:04
@evans evans merged commit ebae069 into master Aug 31, 2018
@evans evans deleted the evans/query-extraction branch August 31, 2018 19:26
@evans
Copy link
Contributor Author

evans commented Sep 4, 2018

Comments from @benjamn to optimize the solution:

  1. use a traversal that does not construct NodePath objects, since we don’t need to know about the ancestry of the node
  2. use a regular expression to find possible gql identifiers, and then ignore subtrees that don’t contain any
  3. (harder) modify the parser so that just calls a callback when it hits a gql-tagged template string, and doesn’t generate AST nodes we don’t care about

Hopefully these are useful in the future 🤞

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Having issues with Graphql Fragments
3 participants