-
Notifications
You must be signed in to change notification settings - Fork 435
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] Unstopped thread when using Kerberos authentication #918
Comments
Hi @jansohn We have removed this piece of code in PR #839 and the fix was part of v7.1.2-preview. Would like you to test and confirm if the new driver version resolves your problem? Thanks, |
I just tested it with |
Hi @jansohn Can you confirm if you see the exact same error as in above stacktrace or some other by using newer version of the driver? Please make sure your application doesn't have multiple driver versions referenced as according to the stacktrace, the code path in question (Reflective access to Config.getKDCList()) no longer exists as I mentioned in the PR link above. Please verify in a standalone Java App and do share us repro it you see the issue. |
I'll check this next week! |
Updated stacktrace with
|
Reproducer: https://github.com/jansohn/kerberos-timer |
Hi @jansohn, thanks for your investigation. I can also see that a daemon thread has been left behind with your reproduction code. We'll start looking into this issue, and will let you know when we have an update. |
Hi @jansohn, That particular piece of code was introduced in PR #40, to automatically append the realm name to the SPN for Kerberos cross-realm authentication (when server's realm is different than the default realm). It makes use of the JNDI DNS, which spins up a daemon thread in the background, but does not provide an API to release it. I agree that leaving a thread running is an issue and did some research to find alternatives to JNDI DNS, but looks like there is no other JDK library that would let us achieve the same functionality. There are however some third party libraries that provide API to retrieve DNS records, but it is highly unlikely that we will introduce a new dependency to the driver, mainly because there is already a connection property All of that being said, we are currently leaning toward reverting PR #40 and asking users to explicitly specify realm name in I am adding @pierresouchay - the author of the PR #40 to the thread to get his opinion on this issue too. |
I was unaware of such Thread being initialized when using native DNS resolutions. I have a few remarks:
I really think removing this feature would be a shame: Kerberos is complicated (look at the workaround I had to do to make it work on multiple JVMs) and I spent lots of time on fixing it, and removing the feature would break some users code. |
Hi @pierresouchay, Thank you for the detailed explanation. You are correct to say that the thread gets reused when reloading the driver, so this does not seem like a memory leak, but the ideal behavior would be to not have the thread left behind. Another reason why we were thinking of reverting the PR was because there are lots of workarounds in the code as you mentioned and it is not straightforward (sometimes nearly impossible) to maintain them, given the limitations and changes from different JDK vendors/versions. @jansohn, do you have any particular issue in your application that's caused by the daemon thread or it is just a warning reported in Tomcat? |
@ulvii The code is complicated for some reasons: JVM support for Kerberos is not very consistent. Some JVMs (SUN/Oracle) do parse the krb5.conf using the class DNS is cool to solve REALMs, unfortunately, it is not very often properly configured, thus, the 2nd method (using DNS) is less likely work since it expects lots of configuration to be in place. Moreover, the file
While you might choose to remove reflection to simplify code (in that case, users will have to configure all of their DNS records properly), but removing the feature is IMHO a huge mistake, because nobody understand Kerberos completely and having the logic in the driver is what make it work. In my company, we are doing cross REALMs calls using it across 9 different REALMs with different levels of trust (mixing Hadoop and AD domains) for years, it had been impossible before that. So, yes, this is not fun, but I think the driver HAS to hide the gory details, otherwise, what is the point? |
Out of curiosity, can the connection property serverSPN not be set from your application? |
@cheenamalhotra yes it can, if you know what you are doing, but it is not trivial, especially in cross Realm calls: how you know that serverbak1.acme.com is cname to server.acme.com which is in realm UACME while you are in MY.ACME.NET. How do you compute all of this? Well, you have to learn how it works. I bet than less than 0.1% of devs do, and less than 2% of admins |
@pierresouchay @ulvii My vote would go to introducing a lightweight dependency which takes care of the DNS resolution. We also need cross-realm Kerberos support so completely removing the feature would be pretty bad... |
@ulvii the big issue is to determine it. That's fine for small shops with 2 databases, that's complicated with cross realms and subrealms, especially with modern discovery systems such as Consul |
@pierresouchay, please correct me if I am wrong, doesn't realm name have to match the domain name of the server for PR #40 to work?
|
@ulvii no: it can be overridden by the correct section in krb5.conf (oracle JVM with correct implementation) or with DNS records as specified in RFC. Realm taken into account will be the longest as well (aka: db.acme.corp will be used instead of acme.corp if two realms match) Ex in our datacenters: MyMachine.am5.hpc.acme.com am5.hpc.acme.com is in fact in realm AMS. HPC.AD.ACME.COM (as specified with realms section of krb5.conf or DNS records, while machine.hpc.acme.com is in realm AD.AM5.ACME.COM) |
I just had a quick lookup into the implementation of krb5.config file and turns out the only thing different or additional in their implementation is looking up for DNS server record using TCP(_tcp) along-with UDP(_udp). Do you think implementing similarly in DNSKerberosLocator.isRealmValid() after fetching records using UDP would resolve this problem? Was this considered when the class was written in PR #40 and wondering if there were any limitations? |
Hi @cheenamalhotra, To be fair, I used (very few) shortcuts in the DNS implementation, because we do use Oracle JVM + Unix for most of our tools, and while I tested it on several Windows versions to make it work smoothly (for instance with DBeaver GUI), the advanced stuff work better with Oracle JVMs in complicated setups (we have Hadoop + AD) So the DNS implementation lacks a few stuff to be closer to reflection based implementation:
What I know is that those implementations are well tested with several Windows versions (2008, 2012, 2016, 7, 10), Mac OS (which is using Heimdal) and Linux and several JVMs (6, 7, 8 at the time - hence the test with |
Thanks for clarifying. I'd like to summarize the discussion to conclude few things, based on our discussions:
Lastly, regarding the open thread issue, it seems like this is deep embedded in |
@cheenamalhotra Yes, this is well summarized. I agree that DNSKerberosLocator is probably sufficient for most use-cases as it is. While we had in the past the need for REALM -> domains configuration, I think we did trash that - so, at least for my company - it would not be an issue, and since very few people do connect to MS SQL Servers outside realms managed by AD, very few would have the issue. In that case, it is also correct to say that setting the SPN as a workaround (which is the way we took initially before #40 was merge by your team ) is working. It is true that the implementation of DNSKerberosLocator might be enhanced ( And finally, if some Kerberos internals do create daemon thread (very likely since it might itself perform some InetLookups with the same interface I used (very used in many many old school Java libraries and JDK internals), as suggested in my first comment
We are using this Driver (with another Driver on top of it) for quite a long time (before #40 got merged since this code is basically code I wrote before this PR) on more than 50 intensive applications performing lots of cross REALM calls, and we never had any issue regarding a leak or a failure initializing Kerberos by using simply IPs or CNAMEs for target databases. Regards |
Thanks @pierresouchay for your explainations. @jansohn Also I believe this thread is created for caching purpose looking at its source code. As the name suggests, it's an "Address Change Listener", which awaits change in DNS Address entries by calling Native JNI code, and has a role to play when request is received to load DNS configuration from OS. If nothing has changed on OS and available DNS data has not expired, Java reuses available DNS information for next request but if address change notification has received, it would always load latest DNS configuration from OS. On that note, I will close the issue. If you have more concerns, you may reach out to OpenJDK Team, but according to my understanding, this daemon thread should not cause any harm to your application. |
Driver version
7.0.0
SQL Server version
Client Operating System
Windows 10
JAVA/JVM version
Problem description
Expected behaviour: All threads should be stopped when shutting down the application in Tomcat
Actual behaviour: One thread is still running after shutdown and produces a possible memory leak
Error message/stack trace:
I debugged the web application and found out that the thread is started in a static initialization block in
sun.net.dns.ResolverConfigurationImpl
(see stack trace below):The thread itself is started by a Java core class but my question is if we can prevent the thread from starting in the first place (stopping it correctly seems to be out of the question as there is no reference available).
I think others using Kerberos for authentication (with username/password) should experience this problem, too. It would be nice to know if this is a general problem or if I'm doing something wrong.
Currently the code is embedded in a bigger web application (with JPA and others) but I'll try to strip it down to a minimal example if necessary.
The text was updated successfully, but these errors were encountered: