Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query from url #254

Merged

Conversation

greebie
Copy link
Contributor

@greebie greebie commented Feb 21, 2017

My colleagues at the Qualitative Data Repository requested the ability to link to a specific page and show only a specific user's annotations.

This change provides a solution by using Hypothesis's existing search filter and the "#annotations:" url fragment. By using the url

https://www.example.com#annotations:query__{filter}

The client opens the search filter window and filters the annotations for {filter}, provided that the filter is base64 compliant (alphanumeric, underscore or hyphen only). Including the terms user__USERNAME, any__ANYTHING or tag__ATAG will create the filter for user: USERNAME, any: ANYTHING and tag: ATAG respectively.

So https://www.example.com#annotations:query__user__USERNAMEany__ANYTHING will filter for "user:USERNAME any:ANYTHING."

#Limitations:

The current solution does not support multi-word queries (because % and whitespace are illegal in url fragments). This behavior can be imitated using query__any__WORDany__ANOTHERWORDany__ANDANOTHER.

The syntax was selected to impact the hypothesis architecture as little as possible. A more intuitive solution could be to permit a #filter: fragment instead or even to permit colons directly. This is why the functions to detect the query and prepare the query are separated. Colons are not currently permissible in base64, so I opted for the double-underscore rather than trying to permit something that otherwise hadn't been.

The current syntax will not work in the bouncer. I have a solution for that will be submitted for pull request shortly.

@codecov-io
Copy link

codecov-io commented Feb 21, 2017

Codecov Report

Merging #254 into master will increase coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #254      +/-   ##
==========================================
+ Coverage   76.16%   76.19%   +0.02%     
==========================================
  Files         117      117              
  Lines        5811     5817       +6     
  Branches      950      952       +2     
==========================================
+ Hits         4426     4432       +6     
  Misses       1385     1385
Impacted Files Coverage Δ
src/sidebar/host-config.js 100% <ø> (ø)
src/annotator/sidebar.coffee 71.05% <ø> (ø)
src/annotator/config.js 88.23% <100%> (ø)
src/sidebar/reducers/selection.js 99% <100%> (+0.01%)
src/annotator/util/extract-annotation-query.js 100% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d28669c...2cb0ec8. Read the comment docs.

@robertknight robertknight self-assigned this Feb 21, 2017
@robertknight
Copy link
Member

Hello @greebie ,

The client opens the search filter window and filter the annotations for {filter}, provided that the filter is base64 compliant (alphanumeric, underscore or hyphen only)

Is the reason for this restriction to avoid changing in the extractIDFromURL function in the client and browser extension?

I don't think there is anything stopping us from changing the format of the fragment after the initial #annotations: string. You could for example change the client and browser extension to allow either an base64 ID or q:<the normal query syntax for the top bar>.

@greebie
Copy link
Contributor Author

greebie commented Feb 21, 2017

The only reason for this restriction is that this was the existing restriction for #annotations. I can edit to make the alterations to accommodate Q: [a-zA-z_-:] no problem. While we are at it, any concerns about Q: [a-zA-z_:\+-]? This would fix the multi-word query problem as well (replace '+' with whitespace).

@greebie
Copy link
Contributor Author

greebie commented Feb 21, 2017

Okay the change in reference should be fixed now. Thanks.

The new syntax is https://www.example.com#annotations:(query: or Q:){filter}

the filter can include any of user:, tag:, any: or text:

@robertknight
Copy link
Member

Hello @greebie - Sorry for taking such a long time to get back to you on this. Would you mind squashing everything down to one commit on your branch?

@greebie
Copy link
Contributor Author

greebie commented Mar 8, 2017

Yes. No worries! Will have this done today.

@robertknight
Copy link
Member

robertknight commented Mar 8, 2017

I've had another read through this, some thoughts:

  1. The query is still limited to alphanumeric characters. We have users writing annotations in non-Latin scripts, using math unicode symbols etc. so I don't see any reason for this restriction. It looks like from one of the test cases that you had a security-based reason in mind. However, since the sidebar's search input accepts input from the user, it has to assume that the query is untrusted anyway and make sure that it properly escapes anything extracted from it which is subsequently displayed in the UI.
  2. I see you've made the functions that extract the ID from the annotation more generic, but it still has the same name. I think it would be clearer if it were given a more generic name (eg. extractAnnotationQuery) and returned an object with helpfully named fields (eg. {annotation: id, query: string})
  3. Thanks for adding tests, but I see there are bunch of test cases for which there isn't an assert statement using them.
  4. Can you please keep to the hyphenated-naming-scheme for new files, for consistency with the rest of the code.

This project enables the user to include a default filter query as part of a url.

The syntax works with the #annotations: fragment.  If a Q: or Query: (case insensitive)
is included, whatever follows the query will be included in a search filter as a
default.

user tags may be included in this query string. For instance user:USERNAME will
filter for the user called USERNAME. Using an @hypothes.is will search for exactly
that user. So far, the user, tag or any tags are the main permissible items.

So http://www.example.com/path/to/file#annotations:Q:user:USERNAMEtag:TAGany:KEYWORD

will filter the page for annotations by USERNAME tagged TAG and containing KEYWORD.
@greebie
Copy link
Contributor Author

greebie commented Mar 10, 2017

All squished and fixed from my perspective! Thanks for the helpful feedback.

Just a note that this commit takes a slightly modified approach to have extractAnnotationQuery return an object. The function now creates a setting variable called "query" that sets the default filter. Thus some minor changes to config and the sidebar.

filter = null;
}
} catch (err) {
filter = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What specific error are you trying to guard against here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was just following the previous codebase (function extractURLFromID formerly) I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was looking at the code and the only possible error is a URIError which would not be caught by this code. I'll remove the try/catch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two tests that fail if the try-catch code is not included, so I kept it in.

var annotationIds = require('../extract-annotation-query');

describe('annotation queries', function () {
var annotation = annotationIds.extractAnnotationQuery('http://localhost:3000#annotations:alphanum3ric_-only');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a utility called unroll for writing data-driven tests which I think would be useful here. See this test for an example.

var annotation = annotationIds.extractAnnotationQuery('http://localhost:3000#annotations:alphanum3ric_-only');
var queryVarA = annotationIds.extractAnnotationQuery('http://localhost:3000#annotations:q:user:USERNAME');
var queryVarB = annotationIds.extractAnnotationQuery('http://localhost:3000#annotations:QuerY:user:USERNAME');
var invalid = annotationIds.extractAnnotationQuery('http://localhost:3000#annotations:\"TRYINGTOGETIN\";EVILSCRIPT()');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A general comment about these tests is that it is easier to follow if the preparation/arrangement is part of the test case instead of being a separate step that happens in beforeEach. The exception is if you're using a particular fixture across several test cases. We generally try to follow the 'Arrange Act Assert' formula for structuring individual test cases.

greebie added 2 commits March 13, 2017 17:02
URL queries should now use uri encoded strings for all default filter queries.
Copy link
Member

@robertknight robertknight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this with a range of queries and IDs. All looks good. Thank-you!

I left a couple of notes about minor issues which can be addressed separately.

//
// The Chrome extension or proxy may already have provided this config
// via a tag injected into the DOM, which avoids the problem where the page's
// JS rewrites the URL before Hypothesis loads.
//
// In environments where the config has not been injected into the DOM,
// we try to retrieve it from the URL here.
var directLinkedID = annotationIDs.extractIDFromURL(window_.location.href);
var directLinkedID = annotationQuery.extractAnnotationQuery(window_.location.href);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of this variable needs an update.

}

module.exports = {
extractAnnotationQuery: extractAnnotationQuery,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually if a module exports a single function then we use module.exports = <function name> rather than exporting an object with one field.

@robertknight robertknight merged commit 9f4ab73 into hypothesis:master Mar 17, 2017
@adam3smith
Copy link

Thanks Ryan, Robert! This is terrific.
Robert -- how does your deployment timeline look? When can we test this in production/how can we tell when it's live?

@robertknight
Copy link
Member

@adam3smith - The client gets uploaded to a CDN a few minutes after the GitHub release is created. It takes about half an hour to become available after that. It is now available through Via.

You may need to clear your browser cache if you don't see it straight away. You can check the version via Account -> Help in the client. The version of the client with this release is v1.4.0.

The Chrome extension takes an hour or two to go through malware checks in the Chrome Web Store.

@robertknight
Copy link
Member

Note that the PR that automatically activates the extension when a URL contains #annotations:query:{query} has not yet been merged, so you'll still need activate the extension manually. We should get that resolved soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants