[BUG] Kibana throws errors 500/401 one hour after login when using SAML #828
Comments
Any news on this issue? I just tested OpenSearch 1.0.0 and 1.1.0 and the issue is still present. The only way to work around it is to delete cookies, which is a nightmare from a UX and usability point of view. The JWT expiry setting could maybe be a workaround for this issue, but unfortunately this feature also has a bug: opensearch-project/security#1448 |
cc @dblock just so you're aware of it as I believe it is a rather high impact bug. |
Thanks for the report, @GuiTeK. Although not a security vulnerability, I agree that this is painful from the user experience standpoint. The team has not had time to look into it yet, but I've removed the untriaged label to make sure we have it on the list of issues ready to take on. |
Thank you for your reply @davidlago! Related issues (although the root cause is not the parameters described in these two issues): |
Hello, @GuiTeK all good!? Did you find a solution to this issue? I am currently validating the downgrade of the ES plugin "opendistro_security" from: 1.13.1.0 to: 1.13.0.0. Keeping the Kibana plugin as it is. |
Replying to self: With this combination, I can finally see the 1-hour timeout (from Azure). Before, it was a matter of 15 minutes or less. But I feel that it will not be enough. So next I may try to change my infra (load balancer/request flow) or test with the latest OpenSearch version. I guess that everyone is off for the year, is that right? Cheers!! :) |
We observed this behaviour when the session keepalive is set to true. Setting it to false fixed it. In dashboards.yml
However, we found another issue regarding SAML timeouts. The IDP provides an expiry time, but OpenSearch only honors a specific option in the SAMLResponse. Auth0, for example, sends it via a different option. Not sure who is in the wrong here. But in our case the workaround was to set the jwt.expiry setting manually |
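The comment above does not say which option is honored and which one Auth0 sends, so that part stays open. As a rough aid for comparing IdPs, here is a minimal Python sketch that lists the places where a SAML 2.0 assertion can carry an expiry timestamp. The function names are hypothetical, it assumes you have already base64-decoded the SAMLResponse into XML, and it says nothing about which of these fields the security plugin actually reads.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from typing import Dict

SAML_NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}

# (element, attribute) pairs where SAML 2.0 allows an expiry timestamp.
EXPIRY_FIELDS = [
    ("AuthnStatement", "SessionNotOnOrAfter"),
    ("SubjectConfirmationData", "NotOnOrAfter"),
    ("Conditions", "NotOnOrAfter"),
]


def parse_saml_time(value: str) -> datetime:
    # SAML timestamps are UTC with a trailing "Z"; older Pythons need an explicit offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def saml_expiry_fields(assertion_xml: str) -> Dict[str, datetime]:
    """Return every expiry timestamp found in the decoded SAML XML, keyed by location."""
    root = ET.fromstring(assertion_xml)
    found = {}
    for element, attribute in EXPIRY_FIELDS:
        node = root.find(f".//saml:{element}", SAML_NS)
        if node is not None and attribute in node.attrib:
            found[f"{element}/@{attribute}"] = parse_saml_time(node.attrib[attribute])
    return found
```

Running this against the responses of two different IdPs should show whether they put the session lifetime in different fields, which is the mismatch described above.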
Confirmed with both saml and oidc |
Hello hello! Forgot to post back. I ended up with normal settings (exactly like the official docs), no fancy stuff, normal versions all around. But I use load balancers: the previous person had several ES node types under the LB, and Kibana was pointed at this LB for auth. After pointing it to a single client node, it never happened again. |
This is still an issue for anyone using openid_auth_domain |
Yip. Tried different ttl values and combination of settings. Problem still persists |
This issue is still present in version 2.1.0 |
Confirmed and very annoying. |
I have the same problem |
Any news on this one? This is one of the reasons I can't migrate an Elastic cluster to OpenSearch, because the team doesn't want to deal with this session error. |
Hello, I have the same problem with Azure AD |
[Triage 1/30/2023] This issue seems to be related to the cookie storage and potentially the access & refresh tokens expiring. We are passing a token but it does not have a good method of dealing with expiration between front-end and backend systems. @davidlago could you link this to the to-be-created ticket for session management so that this can be a considered use case. Thank you. Also linking a pair of associated issues |
Also running into this issue with Azure AD. I have not found a way to resolve this without manually clearing cookies in the affected browser. |
Hello @jochen-kressin. Thank you for your answers. Apart from this, yesterday I did an interesting POC. I installed the latest version of OpenSearch (2.11) with the latest version of OpenSearch Dashboards 2, using Helm and applying the exact SAME config as we use in OpenSearch 1.x, including OpenID. To my surprise, the OpenSearch 2.x logs show the cookie expire invalid error, but OpenSearch Dashboards 2.x does NOT show the BadCredentialsException that exists in the 1.x version. As a reminder:
So when a user logs in to OpenSearch Dashboards 2.x, the 401/Unauthorized error 10 minutes after login does NOT occur. The session is kept alive, or maybe the token can be refreshed. My conclusion is that "com.amazon.dlic.auth.http.jwt.keybyoidc" is different, or maybe another version, in 2.x, and it works properly with our IDP and its cookies. Can you tell me about it? Would it be possible to look into a fix for 1.x? Thanks for your time and dedication :) |
Any updates on this one? Would really love for this to be solved in 2.12. This one is very annoying :) |
Hi @sandervandegeijn, I don't know of any active efforts to fix this. I will remove the Triaged label so the maintainers can review this issue during the Triaging meeting later today and add an update below. |
[Triage] Seems like this is still an issue and something is going wrong with the behavior when using external IdPs and dashboards. Based on the discussion, any data in the active session is lost if a redirect to refresh the token happens while changes are being made. For example, if you set up filters and a token refresh then occurs, you lose all of the filters. We should try to prioritize this based on the long life of this issue. |
@GuiTeK @sandervandegeijn I am beginning to pick this up, and am new to this space. Can you share your settings for |
We are at the defaults. I'm not running SAML anymore, but with openid it's basically the same thing. Leave the dashboards app for 30 minutes or so, click on the next page and it will kick off the authentication flow (which is better than it was in the past, when it would just throw the 401 and be done with it). This is fast, and another problem is that it loses all the state that you had, like filters. :) |
I am doing the same as @sandervandegeijn - using OpenID via Azure AD and the same thing happens to us. |
Understood. I do think there is a bug/confusion here regarding the whole management of sessions. Would you folks be able to provide some feedback on this issue, #1711? Additionally, a few questions/comments for you (I am still trying to wrap my head around it so there will be more to come!)
|
Actually, I checked, I set the timeouts to one day. Removed the settings, so now I'm on the defaults. Will test again. I do not understand why you would override the timeout from the IDP. If you need it shorter, you should fix it at the IDP's side I would suppose? This also seems related: #159 (comment) |
Hello again. As I said in #828 (comment), in our case the IDP and its cookie are configured to set 12 hours as the Dashboards session TTL. Our partial workaround is to configure these settings:
As you can see, the session TTL is only 3 minutes. What we achieve with that is a browser "auto-reload" to the Dashboards login at each such interval. It is annoying, but with this the 401 error does NOT occur. The users have to save their work continuously so they don't lose it, of course. Thanks! |
Problem persists. |
From a user perspective: mighty irritating! Had a discover window open, did some work, went to fetch a coffee, back and a session reset in front of my eyes. All selections and filtering gone. So back to the ELK setup for daily work. |
@SergioIbIGZ If I am reading your situation correctly, it seems like you have issues on the 1.x line, but the issue is not on the 2.x line? Unfortunately we do not develop for 1.x anymore, and would recommend you upgrade to 2.x. https://opensearch.org/releases.html. That being said, I am going to shortly post a summary on this issue and close it out with the merging of a recent PR. Feel free to open another issue if something that is affecting 2.x comes up, or if I am not understanding your problem correctly. Thanks! |
This issue is getting a little long in the tooth and it's getting hard for me to diagnose/help individual folks with their problems. That being said, it seems to me like there are several issues mentioned, some related, and some not, and some based on opendistro, which may or may not be out of date. From what I see the issues are:
I will be closing this issue with the merging of #1773. Anybody please feel free to open a follow-up issue with detailed reproduction steps (IDP, opensearch_dashboards.yml settings, opensearch security backend config, etc.) so I can better address individual concerns. Thanks! Additionally, we have an RFC #1711 to discuss confusion around some of the settings. If anyone has any thoughts, please leave them there, thanks! @GuiTeK @rmelilloii @sandervandegeijn @mhoydis13 @SakuraAxy @FryggFR @jperhamcatchteam @Beeez @K3ndu @tr0k |
Thanks for the effort Derek, we haven't made it easy for you ;) |
That's right @derek-ho. I already tested it in 2.x and my issue is not present. So I would suggest updating to that version. |
Describe the bug
1 hour after login, Kibana will show one of these 2 errors instead of the requested page:
{"statusCode":500,"error":"Internal Server Error","message":"An internal server error occurred."}
{"statusCode":401,"error":"Unauthorized","message":"Response Error"}
The only way to work around this issue is to delete cookies.
To Reproduce
Steps to reproduce the behavior:
{"statusCode":401,"error":"Unauthorized","message":"Response Error"}
instead of showing the requested page
{"statusCode":500,"error":"Internal Server Error","message":"An internal server error occurred."}
when manually going to the root Kibana domain
Note: to create the Okta app, I followed the instructions here: AWS - Add Single Sign-On (SSO) to Open Distro for Elasticsearch Kibana using SAML and Okta.
Expected behavior
The internal JWT created by OpenDistro (I'm not sure exactly what component creates it) should be automatically renewed and Kibana shouldn't throw an error (either 401 or 500) when visiting a page 1 hour or more after initial login.
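To make the expected behavior concrete, here is a minimal Python sketch of the check a client could run before the token lapses. It assumes the internal token is a standard JWT with an `exp` claim, which this issue does not confirm; `jwt_expiry`, `needs_renewal` and the 5-minute margin are hypothetical names and values for illustration, not part of OpenDistro or Kibana. The signature is deliberately not verified, since the point is only to read the expiry.

```python
import base64
import json
import time
from typing import Optional


def jwt_expiry(token: str) -> Optional[int]:
    """Return the `exp` claim (seconds since epoch) without verifying the signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are unpadded base64url; restore the padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return claims.get("exp")


def needs_renewal(token: str, now: Optional[float] = None, margin_s: int = 300) -> bool:
    """True when the token expires within `margin_s` seconds (or has already expired)."""
    exp = jwt_expiry(token)
    if exp is None:
        # No expiry claim: nothing to renew against (an assumption of this sketch).
        return False
    current = time.time() if now is None else now
    return current >= exp - margin_s
```

With a one-hour token, `needs_renewal` would start returning True at the 55-minute mark, which is where a renewal (rather than a 401 or 500) would be expected to kick in.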
Logs
ES Node Logs
Kibana Server Logs
Host/Environment (please complete the following information):
opendistroforelasticsearch-kibana
): 1.13.2