Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance Query API key information API #101691

Closed
SiddharthMantri opened this issue Nov 1, 2023 · 11 comments
Closed

Enhance Query API key information API #101691

SiddharthMantri opened this issue Nov 1, 2023 · 11 comments
Assignees
Labels
>enhancement :Security/Security Security issues without another label Team:Security Meta label for security team

Comments

@SiddharthMantri
Copy link

SiddharthMantri commented Nov 1, 2023

Summary:

Enhance Query API key information API to support API key management page use cases

Description:

To improve user experience Customers with a large number of API keys are unable to use the API key management page today as there is a lack of pagination support in the GET endpoint. To enable such users to use this feature, we (Kibana security team) are moving away from the Get API key information API to the Query API. We require some changes to the Query API key API to support enhanced searching and term aggregation.

  • Search Capabilities

    • Description: Extend the current search capabilities of the API endpoint to accept new query fields.
    • Requirement: Add a new query field called type to the existing ones
    • Possible values: cross_cluster, rest. The filter will only call one or the other and never both at the same time
  • Term Aggregation:

    • Description: When loading a subset of API keys, the endpoint should provide all associated usernames, types and if the keys are active/expired, irrespective of the number of keys loaded.
    • Use Case: For instance, a user visits a UI page and loads 10 API keys. These 10 keys are a subset of a total of 20 keys, each created by a different user. The UI's Owner dropdown should display all 20 usernames, even if only 10 are loaded. Similarly, having that information for active/expired keys and types of keys would be needed as well.
    • Note: The term aggregation should honor the privileges manage_api_key and manage_own_api_key for the results. For those users who have the manage_own_api_key privileges, the results should not include the keys that do not belong to them.

UI

image

This is the existing UI which loads all the API keys and then displays them using an in-memory table.

We intend to move away from this in-memory table. To support a 1:1 mapping between the old and new features, we would also require some enhancements to the API to support use cases of the search bar

Free text search
If a user searches with the term exp, the table currently displays results as follows

image

This also works similarly if the user searches by API key “type” as ‘personal/cross_cluster` or any other criteria on that page.

image

To keep the user experience similar across this change, a free text query capability like this one would be a very welcome addition to the Query API key information API.

We’d like to inspect the following fields for free text search:

  • Username
  • Type
  • Name
  • Id
@SiddharthMantri SiddharthMantri added >enhancement Team:Security Meta label for security team needs:triage Requires assignment of a team area label labels Nov 1, 2023
@elasticsearchmachine elasticsearchmachine removed the Team:Security Meta label for security team label Nov 1, 2023
@SiddharthMantri SiddharthMantri added the Team:Security Meta label for security team label Nov 1, 2023
@elasticsearchmachine elasticsearchmachine removed the Team:Security Meta label for security team label Nov 1, 2023
@astefan astefan added :Security/Security Security issues without another label and removed needs:triage Requires assignment of a team area label labels Nov 2, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Security Meta label for security team label Nov 2, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-security (Team:Security)

@albertzaharovits
Copy link
Contributor

@SiddharthMantri

  1. I think we can expose allowing searching on the API Key type (rest or cross_cluster) without any problems.
  2. Wrt to the "Free text search" I think ES should expose the multi-match query type https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html (it does not today), and Kibana should call that to search on both the username and name fields (given the user typed-in text)
  3. Wrt to aggregations, regular ES searches actually specify an aggs param to _search requests https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html. Currently, the Query API key doesn't allow for it though. There are 2 options: we either allow aggs similarly to the _search endpoint (I don't think we need to do any validation, but I'll check for it), or we offer some hard-coded specific (term) aggregations, e.g. for username, type and expiry ones, that can be toggled via a simple boolean/enum parameter to the Query API Key API.

I'm going to raise this with the broader team tomorrow, but I would like to first check with you if the above sounds that it's sufficient to achieve the UI features you've described.

@n1v0lg
Copy link
Contributor

n1v0lg commented Nov 8, 2023

@albertzaharovits re:

I think we can expose allowing searching on the API Key type (rest or cross_cluster) without any problems.

It's a little more complicated than it looks, though nothing insurmountable. What complicates this is that API keys created before we introduced the type field, the field is (logically) undefined; these API keys are implicitly rest keys. This means that you can't simply query for rest and get all rest keys; you need to query for s.th. like (rest OR undefined) -- this could happen client-side or we come up with an alternative approach for special handling of the type in the QueryApiKey API. ES-5990 (no link since it's a private Jira issue) gives more context. We mainly didn't prioritize working on this because there was no clear need for it. Now that we have an ask though, we can revisit 👍

@SiddharthMantri
Copy link
Author

@albertzaharovits

Wrt point 3, If i understand correctly, we could query the API as

{
  "aggs": {
    "agg-api-keys": {
      "terms": {
        "field": "username"
      }
    }
  }
}

And then add multiple aggs for each of the fields so that we get the "hits" for each of these fields?

Overall, from reading the docs, this looks like a good solution to how we're expecting the UI to function.

@albertzaharovits
Copy link
Contributor

And then add multiple aggs for each of the fields so that we get the "hits" for each of these fields?

Yeah, something like that. It's on the caller to define the aggs names and types (even nested ones).

@albertzaharovits
Copy link
Contributor

Wrt to aggregations, regular ES searches actually specify an aggs param to _search requests https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html. Currently, the Query API key doesn't allow for it though. There are 2 options: we either allow aggs similarly to the _search endpoint (I don't think we need to do any validation, but I'll check for it), or we offer some hard-coded specific (term) aggregations, e.g. for username, type and expiry ones, that can be toggled via a simple boolean/enum parameter to the Query API Key API.

We discussed this in yesterday team's meeting. We're OK to provide the aggs parameter to the Query API Key API (_security/_query/api_key) in an identical fashion to how it's used for the regular _search endpoint, possibly limiting the types of aggs.

@kc13greiner
Copy link
Contributor

Heya @albertzaharovits! Do you have a rough estimate on when this work will be picked up? There is growing interest in this feature!

@albertzaharovits
Copy link
Contributor

@kc13greiner I'm starting to work on this now.

@albertzaharovits
Copy link
Contributor

albertzaharovits commented Dec 12, 2023

@kc13greiner @SiddharthMantri I've raised a draft PR that should address all the requirements in the description.
I would appreciate if you could give it a try while it goes through reviews and testing, in order to ensure that it really meets the requirements. Thanks!

albertzaharovits added a commit that referenced this issue Jan 11, 2024
This adds support for the type parameter to the Query API key API.
The type for an API Key can currently be either rest or cross_cluster.

Relates: #101691
albertzaharovits added a commit that referenced this issue Jan 19, 2024
…104132)

This adds support for the simple_query_string query type to the Query API key Information API.
In addition, this also adds support for querying all the API Key metadata fields simultaneously,
rather than requiring each to be specified, such as metadata.x, metadata.y, etc.

Relates: #101691
albertzaharovits added a commit that referenced this issue Jan 30, 2024
This adds support for the `match` query type to the Query API key Information API.
Note that since string values associated to API Keys are mapped as `keywords`,
a `match` query with no analyzer parameter is effectively equivalent to a `term` query
for such fields (e.g. `name`, `username`, `realm_name`).

Relates: #101691
benwtrent added a commit that referenced this issue Jan 31, 2024
* Change release version lookup to an instance method (#104902)

* Upgrade to Lucene 9.9.2 (#104753)

This commit upgrades to Lucene 9.9.2.

* Improve `CANNOT_REBALANCE_CAN_ALLOCATE` explanation (#104904)

Clarify that in this situation there is a rebalancing move that would
improve the cluster balance, but there's some reason why rebalancing is
not happening. Also points at the `can_rebalance_cluster_decisions` as
well as the node-by-node decisions since the action needed could be
described in either place.

* Get from translog fails with large dense_vector (#104700)

This change fixes the engine to apply the current codec when retrieving documents from the translog.
We need to use the same codec than the main index in order to ensure that all the source data is indexable.
The internal codec treats some fields differently than the default one, for instance dense_vectors are limited to 1024 dimensions.
This PR ensures that these customizations are applied when indexing document for translog retrieval.

Closes #104639

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

* [Connector Secrets] Add delete API endpoint (#104815)

* Add DELETE endpoint for /_connector/_secret/{id}
* Add endpoint to write_connector_secrets cluster privilege

* Merge Aggregations into InternalAggregations (#104896)

This commit merges Aggregations into InternalAggregations in order to remove the unnecessary hierarchy.

* [Profiling] Simplify cost calculation (#104816)

* [Profiling] Add the number of cores to HostMetadata

* Update AWS pricelist (remove cost_factor, add usd_per_hour)

* Switch cost calculations from 'cost_factor' to 'usd_per_hour'

* Remove superfluous CostEntry.toXContent()

* Check for Number type in CostEntry.fromSource()

* Add comment

* Retry get_from_translog during relocations (#104579)

During a promotable relocation, a `get_from_translog` sent by the
unpromotable  shard to handle a real-time get might encounter
`ShardNotFoundException` or  `IndexNotFoundException`. In these cases,
we should retry.

This is just for `GET`. I'll open a second PR for `mGET`.  The relevant
IT is in the  Stateless PR.

Relates ES-5727

* indicating fix for 8.12.1 for int8_hnsw (#104912)

* Removing the assumption from some tests that the request builder's request() method always returns the same object (#104881)

* [DOCS] Adds get setting and update settings asciidoc files to security API index (#104916)

* [DOCS] Adds get setting and update settings asciidoc files to security API index.

* [DOCS] Fixes references in docs.

* Reuse APMMeterService of APMTelemetryProvider (#104906)

* Mute more tests that tend to leak searchhits (#104922)

* ESQL: Fix SearchStats#count(String) to count values not rows (#104891)

SearchStats#count incorrectly counts the number of documents (or rows)
 in which a document appears instead of the actual number of values.
This PR fixes this by looking at the term frequency instead of the doc
 count.

Fix #104795

* Adding request source for cohere (#104926)

* Fixing a broken javadoc comment in ReindexDocumentationIT (#104930)

This fixes a javadoc comment that was broken by #104881

* Fix enabling / disabling of APM agent "recording" in APMAgentSettings (#104324)

* Add `type` parameter support, for sorting, to the Query API Key API (#104625)

This adds support for the `type` parameter, for sorting, to the Query API key API.
The type for an API Key can currently be either `rest` or `cross_cluster`.
This was overlooked in #103695 when support for the `type` parameter
was first introduced only for querying.

* Apply publish plugin to es-opensaml-security-api project (#104933)

* Support `match` for the Query API Key API (#104594)

This adds support for the `match` query type to the Query API key Information API.
Note that since string values associated to API Keys are mapped as `keywords`,
a `match` query with no analyzer parameter is effectively equivalent to a `term` query
for such fields (e.g. `name`, `username`, `realm_name`).

Relates: #101691

* [Connectors API] Relax strict response parsing for get/list operations (#104909)

* Limit concurrent shards per node for ESQL (#104832)

Today, we allow ESQL to execute against an unlimited number of shards 
concurrently on each node. This can lead to cases where we open and hold
too many shards, equivalent to opening too many file descriptors or
using too much memory for FieldInfos in ValuesSourceReaderOperator.

This change limits the number of concurrent shards to 10 per node. This 
number was chosen based on the _search API, which limits it to 5.
Besides the primary reason stated above, this change has other
implications:

We might execute fewer shards for queries with LIMIT only, leading to 
scenarios where we execute only some high-priority shards then stop. 
For now, we don't have a partial reduce at the node level, but if we
introduce one in the future, it might not be as efficient as executing
all shards at the same time.  There are pauses between batches because
batches are executed sequentially one by one.  However, I believe the
performance of queries executing against many shards (after can_match)
is less important than resiliency.

Closes #103666

* [DOCS] Support for nested functions in ES|QL STATS...BY (#104788)

* Document nested expressions for stats

* More docs

* Apply suggestions from review

- count-distinct.asciidoc
  - Content restructured, moving the section about approximate counts to end of doc.

- count.asciidoc
  - Clarified that omitting the `expression` parameter in `COUNT` is equivalent to `COUNT(*)`, which counts the number of rows.

- percentile.asciidoc
  - Moved the note about `PERCENTILE` being approximate and non-deterministic to end of doc.

- stats.asciidoc
  - Clarified the `STATS` command
  -  Added a note indicating that individual `null` values are skipped during aggregation

* Comment out mentioning a buggy behavior

* Update sum with inline function example, update test file

* Fix typo

* Delete line

* Simplify wording

* Fix conflict fix typo

---------

Co-authored-by: Liam Thompson <leemthompo@gmail.com>
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>

* [ML] Passing input type through to cohere request (#104781)

* Pushing input type through to cohere request

* switching logic to allow request to always override

* Fixing failure

* Removing getModelId calls

* Addressing feedback

* Switching to enumset

* [Transform] Unmute 2 remaining continuous tests: HistogramGroupByIT and TermsGroupByIT (#104898)

* Adding ActionRequestLazyBuilder implementation of RequestBuilder (#104927)

This introduces a second implementation of RequestBuilder (#104778). As opposed
to ActionRequestBuilder, ActionRequestLazyBuilder does not create its request
until the request() method is called, and does not hold onto that request (so each
call to request() gets a new request instance).
This PR also updates BulkRequestBuilder to inherit from ActionRequestLazyBuilder
as an example of its use.

* Update versions to skip after backport to 8.12 (#104953)

* Update/Cleanup references to old tracing.apm.* legacy settings in favor of the telemetry.* settings (#104917)

* Exclude tests that do not work in a mixed cluster scenario (#104935)

* ES|QL: Improve type validation in aggs for UNSIGNED_LONG and better support for VERSION (#104911)

* [Connector API] Make update configuration action non-additive (#104615)

* Save allocating enum values array in two hot spots (#104952)

Our readEnum code instantiates/clones enum value arrays on read.
Normally, this doesn't matter much but the two spots adjusted here are
visibly hot during bulk indexing, causing GBs of allocations during e.g.
the http_logs indexing run.

* ESQL: Correct out-of-range filter pushdowns (#99961)

Fix pushed down filters for binary comparisons that compare a
byte/short/int/long with an out of range value, like
WHERE some_int_field < 1E300.

* [DOCS] Dense vector element type should be float for OpenAI (#104966)

* Fix test assertions (#104963)

* Move functions that generate lucene geometries under a utility class (#104928)

We have functions that generate lucene geometries scattered in different places of the code. This commit moves 
everything under a utility class.

* fixing index versions

---------

Co-authored-by: Simon Cooper <simon.cooper@elastic.co>
Co-authored-by: Chris Hegarty <62058229+ChrisHegarty@users.noreply.github.com>
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Navarone Feekery <13634519+navarone-feekery@users.noreply.github.com>
Co-authored-by: Ignacio Vera <ivera@apache.org>
Co-authored-by: Tim Rühsen <tim.ruehsen@gmx.de>
Co-authored-by: Pooya Salehi <pxsalehi@users.noreply.github.com>
Co-authored-by: Keith Massey <keith.massey@elastic.co>
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
Co-authored-by: Moritz Mack <mmack@apache.org>
Co-authored-by: Costin Leau <costin@users.noreply.github.com>
Co-authored-by: Jonathan Buttner <56361221+jonathan-buttner@users.noreply.github.com>
Co-authored-by: Albert Zaharovits <albert.zaharovits@elastic.co>
Co-authored-by: Mark Vieira <portugee@gmail.com>
Co-authored-by: Jedr Blaszyk <jedrazb@gmail.com>
Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
Co-authored-by: Liam Thompson <leemthompo@gmail.com>
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
Co-authored-by: Przemysław Witek <przemyslaw.witek@elastic.co>
Co-authored-by: Joe Gallo <joe.gallo@elastic.co>
Co-authored-by: Lorenzo Dematté <lorenzo.dematte@elastic.co>
Co-authored-by: Luigi Dell'Aquila <luigi.dellaquila@gmail.com>
Co-authored-by: Armin Braun <me@obrown.io>
Co-authored-by: Alexander Spies <alexander.spies@elastic.co>
Co-authored-by: David Kyle <david.kyle@elastic.co>
@albertzaharovits
Copy link
Contributor

This has been resolved by implementing:

@albertzaharovits
Copy link
Contributor

As well as support for the match query type.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Security/Security Security issues without another label Team:Security Meta label for security team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants