-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
summary does not contain the property values #171
Comments
Filling in the properties is a problem. In order to correctly fill them in, the code would have to request all of the data from opensearch then collect only those that exist. It will make summary only inefficient in the sense that more data has to be exchanged between the Java code and opensearch. An alternative is to return what the user asked for. In the case given, it would be a correct list since no fields were asked for only the defaults would be returned. However, consider the case where the user literally asks for
|
@al-niessner @tloubrieu-jpl can we use the _mapping api? https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-get-mapping.html |
Yes and no. I am tired of that answer as I am sure everyone else is. As best I understand the documentation and limited testing. Yes, the mapping can be changed on the fly but nothing is re-indexed. You have to use the reindex API for that but it also comes with caveats. You cannot reindex into an existing index. Not really a problem if one had the foresight to know that things would change or used dynamic (first instance starts the indexing which is pretty clever). If one plans on index changes, then assign an alias and use it. For instance, registry would be the alias for r1 say. Then when we update the mapping we put it into r2 then reindex r1 into r2. When all that work is done, change the alias so that registry points at r2. I am sure that comes with caveats as well like no document (old db record) inserts during reindex or something along those lines. If you like, I can look at adding an alias to the index and use it so that we can update and move around a smidge easier in the future but if you think this is the last change we will ever need to make to the index mapping then I can keep it as is. |
Hi @al-niessner @jordanpadams , What we had before When summary-only=true, we read the results anyway and sent the unique properties of the page. This was inaccurate and costly, in addition this will not work if we replace summary-only with limit=0. What we need
This is a requirement to be able to get the list of properties before returning the results since otherwise the user will have no clue how to apply request criteria on the registry. Other thoughts Note that eventually we also want faceting which is the ability to see statistics on the values of the properties. One level higher. |
I do not understand the requirements:
All the fields available to the request? That would mean walking every document to build that list since not all keywords are indexed and many documents have unique elements. If dynamic were true, we could just use the index. Since dynamic cannot be true, then must build it manually at ingestion. Separate index or stuff it in registry as yet another document with a special lidvid like 1. Do they really want the default fields that the output would use anyway? Long list for json and short list for kvp+json? What is the pagination limit for in the requirement? Is that when summary-only existed and find the unique fields in the first limit of matching documents?
Why intersection? We return the union. If a user asks for field=foobar with json output then get all the default fields plus foobar not just foobar. Is the requirement wrong or our implementation wrong? Independent of your answers to the requirements, it seems like you want efficient and accurate which means adding something new during ingestion. Is this correct? |
fixed by #166 |
🐛 Describe the bug
The list of properties is missing in the summary.
📜 To Reproduce
curl --request GET 'http://localhost:8080/collections/urn:nasa:pds:insight_rad:data_derived::7.0/products?summary-only=true'
Result:
{
"summary": {
"q": "",
"hits": 7,
"took": 25,
"start": 0,
"limit": 100,
"sort": [],
"properties": []
}
}
🕵️ Expected behavior
The properties used by the products of this collection should be listed.
📚 Version of Software Used
all
🩺 Test Data / Additional context
🏞Screenshots
🖥 System Info
🦄 Related requirements
⚙️ Engineering Details
The text was updated successfully, but these errors were encountered: