As an API user, I want to know in the response how many hits are returned for an API query. #68
Comments
@tdddblog FYI, this is the hits field you were talking about yesterday.
Here is a problem. Some requests like /bundle/lidvid/products and /product/lidvid/bundles require double de-referencing to obtain the result. In the case of /product/lidvid/bundles it probably does not matter, but for /bundle/lidvid/products the double de-referencing is going to make calculating hits expensive.
The first lookup is the bundle from the lidvid. This gives a list of collection lids from ref_collection_lid in the bundle. These are then converted to lidvids and looked up in the registry_ref_index to get a list of results, with each result containing the product_lidvids list of the products that collection contains. Therefore, to compute hits, one would have to sum up the lengths of all the product_lidvids lists, which requires traversing the entire list of results. Since each collection can have a different number of products, there are no shortcuts to walking the entire list. In turn, walking the entire list will undo the efficiency gains of registry-api-service#13.
Now, let me turn that explanation into a concrete example using pds-gamma and our notebook pds-api-client-ovirs-part1-explore-a-collection, the bundle
This is a battle of conflicting requirements and will probably take some time to reach a consensus on which requirement wins. In the interim, I will return -1 hits for those places where extra time is required to compute the hits, that is, where they cannot be extracted directly from the Elasticsearch query.
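The double de-referencing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory dictionaries stand in for the bundle lookup and the reference index, the lids and lidvids are made up, the lid-to-lidvid conversion is skipped, and the function names (iter_products, page, count_hits) are hypothetical rather than anything in registry-api-service. It shows that a page from start to start+limit can stop reading early, while a hits count has to touch every reference document.

```python
from itertools import islice
from typing import Dict, Iterator, List

# Hypothetical stand-in for the bundle lookup: bundle lidvid -> collection lids
# (playing the role of ref_collection_lid). Identifiers are made up.
BUNDLES: Dict[str, List[str]] = {
    "urn:example:bundle::1.0": ["urn:example:bundle:col_a", "urn:example:bundle:col_b"],
}

# Hypothetical stand-in for the reference index: each collection has one or
# more documents, each carrying a product_lidvids list.
REF_DOCS: Dict[str, List[Dict[str, List[str]]]] = {
    "urn:example:bundle:col_a": [
        {"product_lidvids": ["a1::1.0", "a2::1.0", "a3::1.0"]},
        {"product_lidvids": ["a4::1.0"]},
    ],
    "urn:example:bundle:col_b": [
        {"product_lidvids": ["b1::1.0", "b2::1.0"]},
    ],
}


def iter_products(bundle_lidvid: str, stats: Dict[str, int]) -> Iterator[str]:
    """Lazily yield product lidvids; stats["docs_read"] counts documents touched."""
    for collection_lid in BUNDLES[bundle_lidvid]:   # first de-reference
        for doc in REF_DOCS[collection_lid]:        # second de-reference
            stats["docs_read"] += 1
            yield from doc["product_lidvids"]


def page(bundle_lidvid: str, start: int, limit: int, stats: Dict[str, int]) -> List[str]:
    """Return products in [start, start + limit); reading stops once the page is full."""
    return list(islice(iter_products(bundle_lidvid, stats), start, start + limit))


def count_hits(bundle_lidvid: str, stats: Dict[str, int]) -> int:
    """Count all products; this walks every document of every collection."""
    return sum(1 for _ in iter_products(bundle_lidvid, stats))
```

With this toy data, asking for the first two products reads a single reference document, whereas counting hits reads all three. That gap is the cost being discussed, and it grows with the number of collections and documents.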
@al-niessner copy. I think I made some sense of that. Would the following help? @tdddblog when we ingest data into the registry via harvest, how easy or hard would it be for us to add a product count to the ref_registry_index documents?
Adding product_count to ref_registry_index does not help the problem as stated. You still have nested loops to do the computation that are not needed to process the request from start to start+limit.
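To make the point about nested loops concrete, here is a small Python sketch. It assumes a hypothetical product_count field on each reference document, as proposed in the question; the data and the function name hits_from_counts are invented for illustration. The per-document list no longer has to be measured, but the sum still visits every document of every collection.

```python
from typing import Dict, List, Tuple

# Hypothetical reference documents per collection, each with a precomputed
# product_count (the field proposed above; it is not an existing field).
REF_DOCS: Dict[str, List[Dict[str, int]]] = {
    "col_a": [{"product_count": 3}, {"product_count": 1}],
    "col_b": [{"product_count": 2}],
}


def hits_from_counts(collection_lids: List[str]) -> Tuple[int, int]:
    """Return (hits, docs_visited) by summing product_count over all documents."""
    hits = 0
    docs_visited = 0
    for lid in collection_lids:          # loop over the bundle's collections
        for doc in REF_DOCS[lid]:        # loop over each collection's documents
            hits += doc["product_count"]
            docs_visited += 1
    return hits, docs_visited
```

In this sketch both loops remain, which matches the objection: serving a single page never needs to visit documents past start+limit, but the total does.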
@al-niessner copy that. So I guess I am reading the above as the problem statement, but how would you propose we change or update the registry indexes in ES to fix it? Or are you saying it is un-fixable?
@jordanpadams @al-niessner The new fields can look like this:
Examples
(2) Collection with 100 primary and 300 secondary product refs, 3 ES docs in
I can also add total primary and secondary product ref counts to the collection document in
I would recommend one of two choices:
Closed per NASA-PDS/registry-api-service#61.
Motivation
...so that I can use that information to efficiently paginate or understand at a glance what the query returns
Additional Details
Similar to other search responses, we want to include a hits field with the number of matching products.
Acceptance Criteria
General
Given a deployed pds-registry-app with X number of products ingested
When I perform a query against any of the endpoints
Then I expect a hits field in the response with the value equal to the number of matching products returned by the registry
Specific
Given a deployed pds-registry-app with X number of products ingested
When I perform a query against the /products endpoint for all products
Then I expect a hits field in the response with value equal to X
Engineering Details
Similar to the ESDIS CMR response, something like:
may vary slightly depending on the response format (e.g. JSON vs. pds4+json)
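As a rough illustration of how a client could use the hits field for pagination, here is a Python sketch. Only the hits field name comes from this issue; the summary/data envelope, the start and limit fields, and the helper names build_response and next_start are assumptions made for the example and are not the actual PDS or CMR response format. It also handles the interim -1 value mentioned in the comments, where the total is not known.

```python
from typing import Any, Dict, List, Optional

UNKNOWN_HITS = -1  # interim value for endpoints where hits is too costly to compute


def build_response(items: List[Dict[str, Any]], hits: int,
                   start: int, limit: int) -> Dict[str, Any]:
    """Wrap one page of results together with the total number of matches."""
    # Hypothetical envelope: only the "hits" name is taken from the issue.
    return {"summary": {"hits": hits, "start": start, "limit": limit}, "data": items}


def next_start(response: Dict[str, Any]) -> Optional[int]:
    """Return the start offset of the next page, or None when there is none."""
    summary = response["summary"]
    page_size = len(response["data"])
    if page_size == 0:
        return None
    consumed = summary["start"] + page_size
    if summary["hits"] == UNKNOWN_HITS:
        # Without a total, a short page is the only sign that results ran out.
        return consumed if page_size == summary["limit"] else None
    return consumed if consumed < summary["hits"] else None
```

With a known total, the client can tell up front how many pages to expect; with -1 it has to keep requesting pages until one comes back short or empty, which is the trade-off discussed in the comments.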