[BUG] Kibana throws errors 500/401 one hour after login when using SAML #828

GuiTeK · 2021-09-15T13:01:07Z

Describe the bug
1 hour after login, Kibana will show one of these 2 errors instead of the requested page:

On the root domain of Kibana: {"statusCode":500,"error":"Internal Server Error","message":"An internal server error occurred."}
On all other pages: {"statusCode":401,"error":"Unauthorized","message":"Response Error"}

The only way to work around this issue is to delete cookies.

To Reproduce
Steps to reproduce the behavior:

1. Log in to Kibana via a SAML Identity Provider (e.g. Okta)
2. Wait for 1 hour
3. Try to refresh/browse to a new Kibana page
4. See that Kibana shows {"statusCode":401,"error":"Unauthorized","message":"Response Error"} instead of showing the requested page
5. See that Kibana shows {"statusCode":500,"error":"Internal Server Error","message":"An internal server error occurred."} when manually going to the root Kibana domain

Note: to create the Okta app, I followed the instructions here: AWS - Add Single Sign-On (SSO) to Open Distro for Elasticsearch Kibana using SAML and Okta.

Expected behavior
The internal JWT created by OpenDistro (I'm not sure exactly what component creates it) should be automatically renewed and Kibana shouldn't throw an error (either 401 or 500) when visiting a page 1 hour or more after initial login.

Logs
ES Node Logs

[2021-09-15T09:08:24,861][TRACE][c.a.o.s.a.BackendRegistry] [es3.logs.example.com] Rest authentication request from 10.0.3.4:32528 [original: /10.0.3.4:32528]
[2021-09-15T09:08:24,861][DEBUG][c.a.o.s.a.BackendRegistry] [es3.logs.example.com] Check authdomain for rest noop/0 or 2 in total
[2021-09-15T09:08:24,861][TRACE][c.a.o.s.a.BackendRegistry] [es3.logs.example.com] Try to extract auth creds from clientcert http authenticator
[2021-09-15T09:08:24,861][TRACE][c.a.o.s.h.HTTPClientCertAuthenticator] [es3.logs.example.com] No CLIENT CERT, send 401
[2021-09-15T09:08:24,861][TRACE][c.a.o.s.a.BackendRegistry] [es3.logs.example.com] No 'Authorization' header, send 403
[2021-09-15T09:08:24,861][DEBUG][c.a.o.s.a.BackendRegistry] [es3.logs.example.com] Check authdomain for rest noop/1 or 2 in total
[2021-09-15T09:08:24,861][TRACE][c.a.o.s.a.BackendRegistry] [es3.logs.example.com] Try to extract auth creds from saml http authenticator
[2021-09-15T09:08:24,862][INFO ][c.a.d.a.h.j.AbstractHTTPJwtAuthenticator] [es3.logs.example.com] Extracting JWT token from [REDACTED JWT TOKEN HERE] failed
com.amazon.dlic.auth.http.jwt.keybyoidc.BadCredentialsException: The token has expired
	at com.amazon.dlic.auth.http.jwt.keybyoidc.JwtVerifier.getVerifiedJwtToken(JwtVerifier.java:85) ~[opendistro_security-1.13.1.0.jar:1.13.1.0]
	at com.amazon.dlic.auth.http.jwt.AbstractHTTPJwtAuthenticator.extractCredentials0(AbstractHTTPJwtAuthenticator.java:108) [opendistro_security-1.13.1.0.jar:1.13.1.0]
	at com.amazon.dlic.auth.http.jwt.AbstractHTTPJwtAuthenticator.access$000(AbstractHTTPJwtAuthenticator.java:47) [opendistro_security-1.13.1.0.jar:1.13.1.0]
	at com.amazon.dlic.auth.http.jwt.AbstractHTTPJwtAuthenticator$1.run(AbstractHTTPJwtAuthenticator.java:90) [opendistro_security-1.13.1.0.jar:1.13.1.0]
	at com.amazon.dlic.auth.http.jwt.AbstractHTTPJwtAuthenticator$1.run(AbstractHTTPJwtAuthenticator.java:87) [opendistro_security-1.13.1.0.jar:1.13.1.0]
	at java.security.AccessController.doPrivileged(AccessController.java:312) [?:?]
	at com.amazon.dlic.auth.http.jwt.AbstractHTTPJwtAuthenticator.extractCredentials(AbstractHTTPJwtAuthenticator.java:87) [opendistro_security-1.13.1.0.jar:1.13.1.0]
	at com.amazon.dlic.auth.http.saml.HTTPSamlAuthenticator.extractCredentials(HTTPSamlAuthenticator.java:148) [opendistro_security-1.13.1.0.jar:1.13.1.0]
	at com.amazon.opendistroforelasticsearch.security.auth.BackendRegistry.authenticate(BackendRegistry.java:421) [opendistro_security-1.13.1.0.jar:1.13.1.0]
	at com.amazon.opendistroforelasticsearch.security.filter.OpenDistroSecurityRestFilter.checkAndAuthenticateRequest(OpenDistroSecurityRestFilter.java:177) [opendistro_security-1.13.1.0.jar:1.13.1.0]
	at com.amazon.opendistroforelasticsearch.security.filter.OpenDistroSecurityRestFilter.access$000(OpenDistroSecurityRestFilter.java:66) [opendistro_security-1.13.1.0.jar:1.13.1.0]
	at com.amazon.opendistroforelasticsearch.security.filter.OpenDistroSecurityRestFilter$1.handleRequest(OpenDistroSecurityRestFilter.java:113) [opendistro_security-1.13.1.0.jar:1.13.1.0]
	at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:258) [elasticsearch-7.10.2.jar:7.10.2]
	at org.elasticsearch.rest.RestController.tryAllHandlers(RestController.java:340) [elasticsearch-7.10.2.jar:7.10.2]
	at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:191) [elasticsearch-7.10.2.jar:7.10.2]
	at com.amazon.opendistroforelasticsearch.security.ssl.http.netty.ValidatingDispatcher.dispatchRequest(ValidatingDispatcher.java:63) [opendistro_security-1.13.1.0.jar:1.13.1.0]
	at org.elasticsearch.http.AbstractHttpServerTransport.dispatchRequest(AbstractHttpServerTransport.java:319) [elasticsearch-7.10.2.jar:7.10.2]
	at org.elasticsearch.http.AbstractHttpServerTransport.handleIncomingRequest(AbstractHttpServerTransport.java:384) [elasticsearch-7.10.2.jar:7.10.2]
	at org.elasticsearch.http.AbstractHttpServerTransport.incomingRequest(AbstractHttpServerTransport.java:309) [elasticsearch-7.10.2.jar:7.10.2]
	at org.elasticsearch.http.netty4.Netty4HttpRequestHandler.channelRead0(Netty4HttpRequestHandler.java:42) [transport-netty4-client-7.10.2.jar:7.10.2]
	at org.elasticsearch.http.netty4.Netty4HttpRequestHandler.channelRead0(Netty4HttpRequestHandler.java:28) [transport-netty4-client-7.10.2.jar:7.10.2]
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at org.elasticsearch.http.netty4.Netty4HttpPipeliningHandler.channelRead(Netty4HttpPipeliningHandler.java:58) [transport-netty4-client-7.10.2.jar:7.10.2]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) [netty-codec-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) [netty-codec-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286) [netty-handler-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1518) [netty-handler-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1267) [netty-handler-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1314) [netty-handler-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501) [netty-codec-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440) [netty-codec-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) [netty-codec-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:615) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:578) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.49.Final.jar:4.1.49.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.49.Final.jar:4.1.49.Final]
	at java.lang.Thread.run(Thread.java:832) [?:?]
Caused by: org.apache.cxf.rs.security.jose.jwt.JwtException: The token has expired
	at org.apache.cxf.rs.security.jose.jwt.JwtUtils.validateJwtExpiry(JwtUtils.java:58) ~[cxf-rt-rs-security-jose-3.4.0.jar:3.4.0]
	at com.amazon.dlic.auth.http.jwt.keybyoidc.JwtVerifier.validateClaims(JwtVerifier.java:119) ~[opendistro_security-1.13.1.0.jar:1.13.1.0]
	at com.amazon.dlic.auth.http.jwt.keybyoidc.JwtVerifier.getVerifiedJwtToken(JwtVerifier.java:81) ~[opendistro_security-1.13.1.0.jar:1.13.1.0]
	... 74 more
[2021-09-15T09:08:24,865][DEBUG][c.o.s.a.AuthnRequest     ] [es3.logs.example.com] AuthNRequest --> <samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" ID="ONELOGIN_{SOME_UUID}" Version="2.0" IssueInstant="2021-09-15T09:08:24Z" ForceAuthn="true" Destination="https://subdomain.okta.com/app/xxx/yyy/sso/saml" ProtocolBinding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" AssertionConsumerServiceURL="https://kb.logs.example.com/_opendistro/_security/saml/acs"><saml:Issuer>logs-kibana-saml</saml:Issuer><samlp:NameIDPolicy Format="urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified" AllowCreate="true" /></samlp:AuthnRequest>
[2021-09-15T09:08:24,865][TRACE][o.e.i.b.in_flight_requests] [es3.logs.example.com] [in_flight_requests] Adjusted breaker by [0] bytes, now [0]
[2021-09-15T09:08:24,866][TRACE][o.e.h.HttpTracer         ] [es3.logs.example.com] [422][8027f54c-0e8c-4318-b8c8-a5fab657dd63][UNAUTHORIZED][text/plain; charset=UTF-8][0] sent response to [Netty4HttpChannel{localAddress=/10.0.5.8:9200, remoteAddress=/10.0.3.4:32528}] success [true]
[2021-09-15T09:08:24,866][TRACE][c.a.o.s.a.BackendRegistry] [es3.logs.example.com] No 'Authorization' header, send 401 and 'WWW-Authenticate Basic'

Kibana Server Logs

Sep 14 15:33:17 kb1.logs.example.com kibana[493]: {"type":"log","@timestamp":"2021-09-14T15:33:17Z","tags":["error","elasticsearch","data"],"pid":493,"message":"[ResponseError]: Response Error"}
Sep 14 15:33:17 kb1.logs.example.com kibana[493]: {"type":"log","@timestamp":"2021-09-14T15:33:17Z","tags":["error","http"],"pid":493,"message":"{ ResponseError: Response Error\n    at IncomingMessage.response.on (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:272:25)\n    at IncomingMessage.emit (events.js:203:15)\n    at endReadableNT (_stream_readable.js:1145:12)\n    at process._tickCallback (internal/process/next_tick.js:63:19)\n  name: 'ResponseError',\n  meta:\n   { body: '',\n     statusCode: 401,\n     headers:\n      { 'x-opaque-id': '{SOME_UUID}',\n        'www-authenticate':\n         'X-Security-IdP realm=\"Open Distro Security\" location=\"https://subdomain.okta.com/app/xxx/yyy/sso/saml?SAMLRequest=some-base64-data\" requestId=\"ONELOGIN_{SOME_UUID}\"',\n        'content-type': 'text/plain; charset=UTF-8',\n        'content-length': '0' },\n    meta:\n      { context: null,\n        request: [Object],\n        name: 'elasticsearch-js',\n        connection: [Object],\n        attempts: 0,\n        aborted: false } },\n  isBoom: true,\n  isServer: false,\n  data: null,\n  output:\n   { statusCode: 401,\n     payload:\n      { statusCode: 401,\n        error: 'Unauthorized',\n        message: 'Response Error' },\n     headers: {} },\n  reformat: [Function],\n  [Symbol(SavedObjectsClientErrorCode)]: 'SavedObjectsClient/notAuthorized' }"}
Sep 14 15:33:17 kb1.logs.example.com kibana[493]: {"type":"error","@timestamp":"2021-09-14T15:33:17Z","tags":[],"pid":493,"level":"error","error":{"message":"Internal Server Error","name":"Error","stack":"Error: Internal Server Error\n    at HapiResponseAdapter.toInternalError (/usr/share/kibana/src/core/server/http/router/response_adapter.js:69:19)\n    at Router.handle (/usr/share/kibana/src/core/server/http/router/router.js:177:34)\n    at process._tickCallback (internal/process/next_tick.js:68:7)"},"url":{"protocol":null,"slashes":null,"auth":null,"host":null,"port":null,"hostname":null,"hash":null,"search":null,"query":{},"pathname":"/","path":"/","href":"/"},"message":"Internal Server Error"}
Sep 14 15:33:17 kb1.logs.example.com kibana[493]: {"type":"response","@timestamp":"2021-09-14T15:33:17Z","tags":[],"pid":493,"method":"get","statusCode":500,"req":{"url":"/","method":"get","headers":{"x-forwarded-for":"38.103.45.2","x-forwarded-proto":"https","x-forwarded-port":"443","host":"kb.logs.example.com","x-amzn-trace-id":"Root=some-id","sec-ch-ua":"\"Google Chrome\";v=\"93\", \" Not;A Brand\";v=\"99\", \"Chromium\";v=\"93\"","sec-ch-ua-mobile":"?0","sec-ch-ua-platform":"\"macOS\"","upgrade-insecure-requests":"1","user-agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36","accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9","sec-fetch-site":"none","sec-fetch-mode":"navigate","sec-fetch-user":"?1","sec-fetch-dest":"document","accept-encoding":"gzip, deflate, br","accept-language":"en-GB,en-US;q=0.9,en;q=0.8,fr;q=0.7"},"remoteAddress":"10.0.1.36","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36"},"res":{"statusCode":500,"responseTime":56,"contentLength":9},"message":"GET / 500 56ms - 9.0B"}
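
For anyone trying to read these logs: the 401 that Elasticsearch sends back carries a 'www-authenticate' header with an X-Security-IdP challenge, including the IdP URL to restart the SAML flow. Below is a minimal Python sketch of how that header could be split into its parts. The function name parse_idp_challenge is hypothetical and the parsing is an assumption based only on the header format visible in the log above; this is not how the Kibana plugin actually handles it.

```python
import re


def parse_idp_challenge(header: str) -> dict:
    """Split an 'X-Security-IdP key="value" ...' challenge into a dict.

    Hypothetical helper: the format is inferred from the logged header only.
    """
    scheme, _, rest = header.partition(" ")
    # Collect every key="value" pair that follows the scheme name.
    params = dict(re.findall(r'(\w+)="([^"]*)"', rest))
    return {"scheme": scheme, **params}


# Header value copied from the Kibana log above (already redacted there).
header = (
    'X-Security-IdP realm="Open Distro Security" '
    'location="https://subdomain.okta.com/app/xxx/yyy/sso/saml?SAMLRequest=some-base64-data" '
    'requestId="ONELOGIN_{SOME_UUID}"'
)
challenge = parse_idp_challenge(header)
print(challenge["location"])
```

A client that sees this challenge could, in principle, redirect the browser to the location value instead of surfacing the raw 401/500, which is roughly the behavior the report asks for.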

Host/Environment (please complete the following information):

OS: Ubuntu 20.04
ElasticSearch: OSS 7.10.2
Kibana (opendistroforelasticsearch-kibana): 1.13.2
OpenDistro versions:

opendistro-alerting               1.13.1.0-1 
opendistro-anomaly-detection      1.13.0.0-1
opendistro-asynchronous-search    1.13.0.1-1
opendistro-index-management       1.13.2.0-1
opendistro-job-scheduler          1.13.0.0-1
opendistro-knn                    1.13.0.0-1
opendistro-knnlib                 1.13.0.0
opendistro-performance-analyzer   1.13.0.0-1
opendistro-reports-scheduler      1.13.0.0-1
opendistro-security               1.13.1.0-1
opendistro-sql                    1.13.2.0-1
opendistroforelasticsearch        1.13.2-1

GuiTeK · 2021-11-05T16:58:50Z

Any news on this issue?

I just tested OpenSearch 1.0.0 and 1.1.0 and the issue is still present.

The only way to work around it is to delete cookies, which is a nightmare from a UX/usability point of view.

The JWT expiry setting could maybe be a workaround for this issue, but unfortunately this feature also has a bug: opensearch-project/security#1448

GuiTeK · 2021-11-05T17:01:51Z

cc @dblock just so you're aware of it as I believe it is a rather high impact bug.

davidlago · 2021-11-05T17:13:20Z

Thanks for the report, @GuiTeK. Although not a security vulnerability, I agree that this is painful from the user experience standpoint. The team has not had time to look into it yet, but I've removed the untriaged label to make sure we have it on the list of issues ready to take on.

GuiTeK · 2021-11-08T16:00:33Z

Thank you for your reply @davidlago!

Related issues (although the root cause is not the parameters described in these two issues):

rmelilloii · 2021-12-03T11:14:28Z

Hello, @GuiTeK all good!? Did you find a solution to this issue? I am currently validating the downgrade of the ES plugin "opendistro_security" from: 1.13.1.0 to: 1.13.0.0. Keeping the Kibana plugin as it is.
It works, at least did not break anything. I will proceed with the same for my clusters and see how it behaves.
Thanks for the info regarding OpenSearch versions.

rmelilloii · 2021-12-08T16:17:10Z

Replying to self:
ended up with:
Kibana 1.12 (7.10.0) + opendistroSecurityKibana 1.13.0.0
My case with multiple Tenants suffers a lot from the lack of real support for them + the session/cookie issue.

With this combination, I can finally see the 1-hour timeout (from Azure). Before, it was a matter of 15 min or less.

But I feel that it will not be enough. So next I may try to change my infra (load balancer/request flow) or test with the latest OpenSearch version.

I guess that everyone is off for the year, is that right? Cheers!! :)

mvanderlee · 2022-01-26T17:50:00Z

We observed this behaviour when the session keepalive is set to true. Setting it to false fixed it.

In dashboards.yml

opensearch_security.session.keepalive: false

However, we found another issue regarding SAML timeouts. The IDP provides an expiry time, but OpenSearch only honors a specific option in the SAMLResponse. Auth0, for example, sends it via a different option. Not sure who is in the wrong here, but in our case the workaround was to set the jwt.expiry setting manually.

#159 (comment)

sandervandegeijn · 2022-04-08T03:26:18Z

Confirmed with both saml and oidc

rmelilloii · 2022-04-08T06:45:09Z

Hello hello! Forgot to post back. I ended up with normal settings (exactly like the official docs), no fancy stuff, normal versions all around. But I use load balancers: the previous person had several ES node types under the LB, and Kibana was pointed to this LB for auth. After pointing it to a single client node, it never happened again.
I didn't test with multiple endpoints, but it is not really required for me and I am moving to OpenSearch. Will test there ;) Cheers and thanks for all the ideas.

mhoydis13 · 2022-05-19T20:27:24Z

This is still an issue for anyone using openid_auth_domain

sandervandegeijn · 2022-05-19T20:31:39Z

Yip. Tried different ttl values and combination of settings. Problem still persists

mhoydis13 · 2022-07-25T13:38:36Z

This issue is still present in version 2.1.0

sandervandegeijn · 2022-07-25T15:19:48Z

Confirmed and very annoying.

SakuraAxy · 2022-08-05T08:34:45Z

I have the same problem

sandervandegeijn · 2022-09-20T19:32:29Z

Any news on this one? This is one of the reasons I can't migrate an Elastic cluster to OpenSearch: the team doesn't want to deal with this session error.

FryggFR · 2023-01-23T08:29:31Z

Hello,

I have the same problem with Azure AD

stephen-crawford · 2023-01-30T20:17:36Z

[Triage 1/30/2023] This issue seems to be related to the cookie storage and potentially the access & refresh tokens expiring. We are passing a token but it does not have a good method of dealing with expiration between front-end and backend systems. @davidlago could you link this to the to-be-created ticket for session management so that this can be a considered use case. Thank you.

Also linking a pair of associated issues

jperhamcatchteam · 2023-02-09T18:37:00Z

Also running into this issue with Azure AD. I have not found a way to resolve this without manually clearing cookies in the affected browser.

SergioIbIGZ · 2023-12-19T08:41:54Z

Hello @jochen-kressin . Thank you for your answers.
I tried adding "offline_access" to the scope, but our IDP rejects it as invalid :(

Apart from this, yesterday I ran an interesting POC. I installed the latest version of OpenSearch (2.11) with the latest version of OpenSearch Dashboards 2, using Helm and applying the exact SAME config as we use in OpenSearch 1.x, including OpenID.

To my surprise, the invalid cookie expiry error appears in the OpenSearch 2.x logs, but the BadCredentialsException that exists in the 1.x version does NOT appear in OpenSearch Dashboards 2.x. As a reminder:

[2023-12-14T11:27:07,592][INFO ][c.a.d.a.h.j.AbstractHTTPJwtAuthenticator] [tip-master-1] Extracting JWT token from eyJraWQiOiJyc2ExIiwiYWxnIjoiUlMyNTYifQ.eyJzdWIiOiJYRTk1NDQ0IiwiY291bnRyeSI6IkVTUCIsInJvbGVzIjpbIkJCVkFfSE8tVklFV0VSX0NUSSJdLCJraWQiOiJyc2ExIiwiaXNzIjoiaHR0cHM6XC9cL2lkcC5saXZlLmdsb2JhbC5wbGF0Zm9ybS5iYnZhLmNvbVwvb2lkY1wvIiwicHJlZmVycmVkX3VzZXJuYW1lIjoiWEU5NTQ0NCIsImdpdmVuX25hbWUiOiJTRVJHSU8iLCJhdWQiOiIxMGMzYzhhNi0zZjc4LTQ5M2UtYTc4Ny02MTJkODcwMjFhZGQiLCJuYW1lIjoiW0V4dGVybmFsXSBTRVJHSU8gSUJBw5FFWiBET01JTkdPIiwiZXhwIjoxNzAyNTUyMTU1LCJpYXQiOjE3MDI1NTE1NTUsImp0aSI6IjQxZGM0ZGViLThmMjMtNDUwYi04OTIzLTc1ZjM3ZjM5NWI4NiIsImVtYWlsIjoiU0VSR0lPLklCQU5FWi5DT05UUkFDVE9SQEJCVkEuQ09NIn0.mwMA0tnhvOPK14kY39MjPawYiklH3TlnHMwBM63K8AABfAIFtz-Ra_8uwy3AfHODKdTzDhOigYU5oFNJlVoudnzRAZek7uDk2YUsYRb3of4zIKJt3tBPkaHOrYOGHUGIQsgeX3kNQewKMPIvmiLgGw-r0Ep5kKbm228TRXhlbWOt_Y_TDj1KqF5SCv2rkr60wiJVt19nPSbzK2WlLhkE_227ywC1gwo9N1lvSH6qoO82o4If75O4L0ddO6crvaE97amgeCxi9jaI6_QM0U9lSX9kGoAvK1kb0ik90NJGnaVBqetw-WWVykZMsxUFDL8dlDP8eMNCRCOwayrDM2_EUQ failed
com.amazon.dlic.auth.http.jwt.keybyoidc.BadCredentialsException: The token has expired

So when a user logs in to OpenSearch Dashboards 2.x, the 401/Unauthorized error 10 minutes after login does NOT occur. The session is kept alive, or maybe the token can be refreshed.
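
As a side note on the "10 minutes": a JWT's lifetime can be read straight from its payload without verifying the signature. The Python sketch below is an illustration only. It builds a synthetic token whose iat and exp values are the ones that appear to be in the token logged above (by my reading of its payload), and the helper names jwt_claims and lifetime_seconds are hypothetical.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments are encoded."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload))


def lifetime_seconds(claims: dict) -> int:
    """Token lifetime as the difference between 'exp' and 'iat'."""
    return claims["exp"] - claims["iat"]


# Synthetic, unsigned token for illustration only (assumed claim values).
header = b64url(json.dumps({"alg": "none"}).encode())
body = b64url(json.dumps({"iat": 1702551555, "exp": 1702552155}).encode())
token = f"{header}.{body}.signature-placeholder"

print(lifetime_seconds(jwt_claims(token)))  # 600 seconds, i.e. 10 minutes
```

If those values are right, the IdP issues tokens that live for 600 seconds, which would match the 401 showing up 10 minutes after login on 1.x.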

My conclusion is that "com.amazon.dlic.auth.http.jwt.keybyoidc" is different, or maybe a different version, in 2.x, and that it works properly with our IDP and its cookies.

Can you tell me about it? Would it be possible to look into a fix for 1.x?

Thanks for your time and dedication :)

sandervandegeijn · 2024-02-02T23:53:00Z

Any updates on this one? Would really love for this to be solved in 2.12. This one is very annoying :)

stephen-crawford · 2024-02-05T14:31:42Z

Hi @sandervandegeijn, I don't know of any active efforts to fix this. I will remove the Triaged label so the maintainers can review this issue during the Triaging meeting later today and add an update below.

stephen-crawford · 2024-02-05T16:27:20Z

[Triage] Seems like this is still an issue and something is going wrong with the behavior when using external IdPs and dashboards. Based on the discussion, any data in the active session is lost if a redirect to refresh the token happens while changes are being made, i.e. if you set up filters and a token refresh then occurs, you lose all of the filters. We should try to prioritize this based on the long life of this issue.

derek-ho · 2024-02-07T20:13:39Z

@GuiTeK @sandervandegeijn I am beginning to pick this up, and am new to this space. Can you share your settings for opensearch_security.session.ttl and opensearch_security.session.keepalive as well as any settings from the SAML provider that you have, such as the expiry/assertion lifespan? I am seeing some behavior on my local and not sure if it is what you are talking about - is the bug that after a certain amount of time OSD is re-routed back to the SAML provider to re-authenticate even though the assertion should still be valid? I do see this on my local, but I do not see the 401s/error pages that you folks were mentioning.

sandervandegeijn · 2024-02-07T20:18:08Z

We are at the defaults. I'm not running SAML anymore, but with OpenID it's basically the same thing. Leave the dashboards app for 30 minutes or so, click on the next page and it will kick off the authentication flow (which is better than it was in the past, when it would just throw the 401 and be done with it). This is fast, and another problem is that it loses all the state that you had, like filters. :)

andrew-landsverk-win · 2024-02-07T20:57:32Z

I am doing the same as @sandervandegeijn - using OpenID via Azure AD and the same thing happens to us.

derek-ho · 2024-02-07T21:21:43Z

Understood. I do think there is a bug/confusion here regarding the whole management of sessions. Would you folks be able to provide some feedback on this issue? #1711? Additionally a few questions/comments for you (I am still trying to wrap my head around it so there will be more to come!)

Is there anything preventing you from updating keepalive to false and opensearch_security.session.ttl to match the expiry of the assertion/exp field? This might not solve for cases in which the exp is actually short lived (that may be a separate issue of doing the best to preserve URLs, filters, etc.), but this should solve for the redirection. I believe the defaults here: https://github.com/opensearch-project/security-dashboards-plugin/blob/main/server/index.ts#L78 are the root cause of this
Can you folks help me understand in which case people would want the session within OSD to actually be shorter than the validity of the result received from the IDP? We may want to set some sensible defaults in the case that it is not set in a way we expect by the IDP, but are there any other reasons to have this mismatch? I need to dive deep into the code, but if there is no use case in having a shorter session in dashboards only, I would be in favor of removing that to handle everything in a single place and have a single source of truth (OpenSearch backend), although an intermediate fix may be to set the cookie expiry to the max of the existing cookie expiry, expiry from IDP, or current time + ttl. What do you folks think?
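
To make the intermediate fix concrete, here is a rough Python sketch of the "max of the existing cookie expiry, expiry from IDP, or current time + ttl" rule. It only illustrates the arithmetic under my own assumptions: the function name next_cookie_expiry_ms and its parameters are hypothetical and do not correspond to anything in the plugin code.

```python
from typing import Optional


def next_cookie_expiry_ms(
    now_ms: int,
    ttl_ms: int,
    existing_expiry_ms: Optional[int] = None,
    idp_expiry_ms: Optional[int] = None,
) -> int:
    """Pick the latest of: now + ttl, the current cookie expiry, the IdP expiry.

    Hypothetical helper; all values are epoch milliseconds.
    """
    candidates = [now_ms + ttl_ms]
    for value in (existing_expiry_ms, idp_expiry_ms):
        if value is not None:  # skip expiries that are not known
            candidates.append(value)
    return max(candidates)


# Example values (assumed): a 1-hour ttl and an IdP assertion valid for 8 hours.
now = 1_707_340_000_000
one_hour = 60 * 60 * 1000
idp_exp = now + 8 * one_hour

result = next_cookie_expiry_ms(
    now, one_hour, existing_expiry_ms=now + 10_000, idp_expiry_ms=idp_exp
)
print(result == idp_exp)  # the IdP expiry wins because it is the latest
```

With this rule the Dashboards cookie would never expire before the IdP result does, which is the mismatch being asked about.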

sandervandegeijn · 2024-02-07T21:59:28Z

Actually, I checked, I set the timeouts to one day. Removed the settings, so now I'm on the defaults. Will test again.

I do not understand why you would override the timeout from the IDP. If you need it shorter, you should fix it at the IDP's side I would suppose?

This also seems related: #159 (comment)

SergioIbIGZ · 2024-02-08T07:44:43Z

Hello again,
Probably this information is not useful here but I would like to share it with you.

As I said in #828 (comment), in our case the IDP and its cookie are configured to set 12 hours as the Dashboards session ttl.
But in the end that session is closed with a 401 error after only 10 minutes (I don't know why it is only that long in our case).

Our partial workaround is configuring these settings:

opensearch_security.openid.refresh_tokens: false
opensearch_security.session.keepalive: true
opensearch_security.session.ttl: 180000
opensearch_security.cookie.ttl: 180000

As you can see, the session ttl is only 3 minutes. But what we achieve with that is a browser "auto-reload" to the Dashboards login at each interval. It is annoying, but with this the 401 error does NOT occur. Users have to save their work continuously so as not to lose it, of course.
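
A small Python sketch of the reasoning behind these numbers, under the assumption that the 401 appears when the Dashboards session is still alive but the token behind it has already expired. The helper name session_outlives_token is hypothetical; the figures are the ones quoted above (ttl in milliseconds, the 10 minutes and 12 hours from this comment).

```python
def session_outlives_token(session_ttl_ms: int, token_lifetime_s: int) -> bool:
    """True if the Dashboards session would still be alive after the token expires."""
    return session_ttl_ms > token_lifetime_s * 1000


TOKEN_LIFETIME_S = 10 * 60                 # session closed after "only 10 minutes"
WORKAROUND_TTL_MS = 180_000                # opensearch_security.session.ttl above
INTENDED_TTL_MS = 12 * 60 * 60 * 1000      # the 12 hours the IDP tries to set

print(WORKAROUND_TTL_MS / 60_000)                                  # 3.0 minutes
print(session_outlives_token(WORKAROUND_TTL_MS, TOKEN_LIFETIME_S))  # False
print(session_outlives_token(INTENDED_TTL_MS, TOKEN_LIFETIME_S))    # True
```

With the 3-minute ttl the session always ends before the token does, so the user is sent back to login rather than hitting the 401; with a 12-hour session the token expires first.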

Thanks!

sandervandegeijn · 2024-02-08T15:20:39Z

Problem persists.

atbohmer · 2024-02-08T16:11:59Z

From a user perspective: mighty irritating! Had a Discover window open, did some work, went to fetch a coffee, came back, and the session reset in front of my eyes. All selections and filtering gone. So back to the ELK setup for daily work.

derek-ho · 2024-02-13T16:44:24Z

@SergioIbIGZ If I am reading your situation correctly, it seems like you have issues on the 1.x line, but the issue is not present on the 2.x line? Unfortunately we do not develop for 1.x anymore, and would recommend you upgrade to 2.x: https://opensearch.org/releases.html. That being said, I am going to shortly post a summary on this issue and close it out with the merging of a recent PR. Feel free to open another issue if something that is affecting 2.x comes up, or if I am not understanding your problem correctly. Thanks!

derek-ho · 2024-02-13T16:49:02Z

This issue is getting a little long in the tooth and it's getting hard for me to diagnose/help individual folks with their problems. That being said, it seems to me like there are several issues mentioned, some related, some not, and some based on opendistro, which may or may not be out of date. From what I see the issues are:

SAML assertion assumptions not allowing auth0 to be properly read. I created a follow-up issue in the security repo since SAML assertion reading is done there: [FEATURE] Support custom SAML headers for expiry time security#4046. @mvanderlee can you comment on whether customizing the SAML assertion fits your use case? I am new to the space, so I am not sure whether I am reading it right that this might help.
Any OIDC cookie expiry issues where the expiry is anything less than the opendistro_security.cookie.ttl, which defaults to 1 hour, should be handled in Fix cookie expiry issues from IDP/JWT auth methods, disables keepalive for JWT/IDP #1773
Any cookie expiry issues where keepalive is true (which is the default) should be handled in Fix cookie expiry issues from IDP/JWT auth methods, disables keepalive for JWT/IDP #1773

I will be closing this issue with the merging of #1773. Anybody please feel free to open a follow-up issue with detailed reproduction steps (IDP, opensearch_dashboards.yml settings, opensearch security backend config, etc.) so I can better address individual concerns. Thanks!

Additionally, we have an RFC, #1711, to discuss confusion around some of the settings. If anyone has any thoughts, please leave them there, thanks!

@GuiTeK @rmelilloii @sandervandegeijn @mhoydis13 @SakuraAxy @FryggFR @jperhamcatchteam @Beeez @K3ndu @tr0k
@mkhpalm @SergioIbIGZ @andrew-landsverk-win @atbohmer @jochen-kressin

sandervandegeijn · 2024-02-13T16:54:19Z

Thanks for the effort Derek, we haven't made it easy for you ;)

SergioIbIGZ · 2024-02-14T07:21:21Z

That's right @derek-ho. I already tested it in 2.x and my issue is not present there. So I will suggest updating to that version.
Thanks a lot :)

GuiTeK added the Beta, bug (Something isn't working), and untriaged labels Sep 15, 2021

davidlago removed the Beta and untriaged labels Nov 5, 2021

davidlago added the help wanted (Extra attention is needed, need help from community) label Feb 18, 2022

sandervandegeijn mentioned this issue Sep 20, 2022: [BUG] JWT expiry setting not honored opensearch-project/security#1448 (Open)

davidlago added the triaged label Oct 10, 2022

gdiazlo mentioned this issue Oct 14, 2022: SAML logout/session renewal bugs wazuh/wazuh-dashboard-plugins#4595 (Closed, 2 tasks)

gdiazlo mentioned this issue Oct 31, 2022: SAML logout error "not found" wazuh/wazuh-dashboard-plugins#4779 (Closed, 1 task)

jotacarma90 mentioned this issue Dec 30, 2022: Release 4.4.0 - Alpha 2 - E2E UX tests - SAML SSO wazuh/wazuh#15765 (Closed, 3 tasks)

davidlago mentioned this issue Jan 31, 2023: [RFC] Improved session management #1311 (Closed)

jochen-kressin mentioned this issue Dec 28, 2023: [RFC] Remove confusion about conflicting token expiration and cookie + session config settings #1711 (Open)

davidjiglesias mentioned this issue Jan 11, 2024: Release 4.8.0 - Alpha 2 - E2E UX tests - SAML SSO wazuh/wazuh#21368 (Closed, 2 tasks)

stephen-crawford removed the triaged label Feb 5, 2024

stephen-crawford added the triaged and v2.12.0 (Items targeting 2.12.0) labels and removed the help wanted (Extra attention is needed, need help from community) label Feb 5, 2024

davidjiglesias mentioned this issue Feb 6, 2024: Release 4.8.0 - Beta 1 - E2E UX tests - SAML SSO wazuh/wazuh#21768 (Closed, 2 tasks)

davidlago assigned derek-ho Feb 7, 2024

derek-ho mentioned this issue Feb 9, 2024: Fix cookie expiry issues from IDP/JWT auth methods, disables keepalive for JWT/IDP #1773 (Merged, 3 tasks)

DarshitChanpura closed this as completed in #1773 Feb 23, 2024

dlundgren mentioned this issue Jun 10, 2024: [BUG] Session timeout with SAML wazuh/wazuh-dashboard#193 (Closed)

juliamagan mentioned this issue Jun 20, 2024: Release 4.9.0 - Alpha 2 - E2E UX tests - SAML SSO wazuh/wazuh#24214 (Closed, 2 tasks)

davidjiglesias mentioned this issue Jul 22, 2024: Release 4.9.0 - Alpha 3 - E2E UX tests - SAML SSO wazuh/wazuh#24855 (Closed, 2 tasks)

juliamagan mentioned this issue Sep 20, 2024: Release 4.9.1 - RC 1 - E2E UX tests - SAML SSO wazuh/wazuh#25829 (Closed, 2 tasks)
