spring-integration-sftp leads to sshd-SshClient thread leaks #9373
Any chance to get more info about those threads?
I'm not sure if it's safe to provide a thread dump.
So, according to debugging, I see that those threads come from the Apache MINA SSHD `SshClient`: every started client instance creates its own timer thread and NIO2 thread pool.
And that NIO2 thread pool size defaults to the number of available processors plus one.
Now it would be great to see where they are used, so we can spot that dead-lock. Eventually it leads me to the conclusion that this NIO2 stuff is used by the client session for all of its I/O.
So, I'm not sure how we can have so many of those threads.
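(For illustration only, a minimal sketch using plain Apache MINA SSHD outside of Spring: every started `SshClient` spawns its own `sshd-SshClient[...]` threads, and only `stop()` releases them.)

```java
import org.apache.sshd.client.SshClient;

public class SshClientThreadsDemo {

    public static void main(String[] args) throws Exception {
        // Each SshClient instance owns its own timer and NIO2 executor:
        // starting it spawns the "sshd-SshClient[...]-*" threads
        // seen in the dump above.
        SshClient client = SshClient.setUpDefaultClient();
        client.start();

        // Print the client's threads to confirm where they come from.
        Thread.getAllStackTraces().keySet().stream()
                .map(Thread::getName)
                .filter(name -> name.startsWith("sshd-SshClient"))
                .forEach(System.out::println);

        // Stopping the client shuts those pools down; a client that is
        // started but never stopped leaks its threads.
        client.stop();
    }
}
```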
No need for the whole one, but at least some stack traces to follow back into Spring code, where we can prove that we indeed do something wrong or so.
Here are some excerpts:
`"sshd-SshClient[100a02a3]-timer-thread-1" #3625 [693138] daemon prio=5 os_prio=0 cpu=39.63ms elapsed=1001.82s tid=0x00007fc6a0380ca0 nid=693138 waiting on condition [0x00007fc63acfe000] "Thread-147" #3626 [693139] daemon prio=5 os_prio=0 cpu=1.57ms elapsed=1001.82s tid=0x00007fc6a037edd0 nid=693139 runnable [0x00007fc63abfe000] "sshd-SshClient[100a02a3]-nio2-resume-thread-1" #3627 [693140] daemon prio=5 os_prio=0 cpu=1.26ms elapsed=1001.82s tid=0x00007fc6a03a8fb0 nid=693140 waiting on condition [0x00007fc63aafe000] "sshd-SshClient[100a02a3]-nio2-thread-1" #3628 [693141] daemon prio=5 os_prio=0 cpu=3.19ms elapsed=1001.81s tid=0x00007fc6904c27b0 nid=693141 waiting on condition [0x00007fc63a9fe000] "sshd-SshClient[100a02a3]-nio2-thread-2" #3629 [693142] daemon prio=5 os_prio=0 cpu=5.29ms elapsed=1001.81s tid=0x00007fc6904fd9e0 nid=693142 waiting on condition [0x00007fc63a8fe000] "sshd-SshClient[100a02a3]-nio2-thread-3" #3630 [693143] daemon prio=5 os_prio=0 cpu=1.39ms elapsed=1001.81s tid=0x00007fc68514d300 nid=693143 waiting on condition [0x00007fc63a7fe000] "sshd-SshClient[3940b428]-timer-thread-1" #3674 [693936] daemon prio=5 os_prio=0 cpu=17.17ms elapsed=401.09s tid=0x00007fc6983fd9e0 nid=693936 runnable [0x00007fc63a5fe000] "Thread-148" #3675 [693937] daemon prio=5 os_prio=0 cpu=1.45ms elapsed=401.09s tid=0x00007fc6984132b0 nid=693937 runnable [0x00007fc63fafe000] "sshd-SshClient[3940b428]-nio2-resume-thread-1" #3676 [693938] daemon prio=5 os_prio=0 cpu=1.52ms elapsed=401.03s tid=0x00007fc698368340 nid=693938 waiting on condition [0x00007fc63a6fe000] "sshd-SshClient[3940b428]-nio2-thread-1" #3677 [693939] daemon prio=5 os_prio=0 cpu=1.63ms elapsed=401.02s tid=0x00007fc68cdd6810 nid=693939 waiting on condition [0x00007fc63a2fe000] "sshd-SshClient[3940b428]-nio2-thread-2" #3678 [693940] daemon prio=5 os_prio=0 cpu=6.40ms elapsed=401.02s tid=0x00007fc68cd90820 nid=693940 waiting on condition [0x00007fc63a1fe000] "sshd-SshClient[3940b428]-nio2-thread-3" #3679 [693941] daemon prio=5 os_prio=0 cpu=9.22ms elapsed=401.01s tid=0x00007fc6904ee540 nid=693941 waiting on condition [0x00007fc63a4fe000] "sshd-SshClient[30922ac9]-timer-thread-1" #3680 [693944] daemon prio=5 os_prio=0 cpu=16.17ms elapsed=400.27s tid=0x00007fc6983fe690 nid=693944 waiting on condition [0x00007fc63a0fe000] "Thread-149" #3681 [693945] daemon prio=5 os_prio=0 cpu=1.33ms elapsed=400.27s tid=0x00007fc6983c6860 nid=693945 runnable [0x00007fc639ffe000] "sshd-SshClient[30922ac9]-nio2-resume-thread-1" #3682 [693946] daemon prio=5 os_prio=0 cpu=1.26ms elapsed=400.27s tid=0x00007fc69836a0a0 nid=693946 waiting on condition [0x00007fc639efe000] "sshd-SshClient[30922ac9]-nio2-thread-1" #3683 [693947] daemon prio=5 os_prio=0 cpu=5.23ms elapsed=400.26s tid=0x00007fc6f417d0f0 nid=693947 waiting on condition [0x00007fc639dfe000] "sshd-SshClient[30922ac9]-nio2-thread-2" #3684 [693948] daemon prio=5 os_prio=0 cpu=1.20ms elapsed=400.26s tid=0x00007fc6f416cdd0 nid=693948 waiting on condition [0x00007fc639cfe000] "sshd-SshClient[30922ac9]-nio2-thread-3" #3685 [693949] daemon prio=5 os_prio=0 cpu=2.67ms elapsed=400.25s tid=0x00007fc6e830af20 nid=693949 waiting on condition [0x00007fc639bfe000] "task-751" #3707 [694143] prio=5 os_prio=0 cpu=3.34ms elapsed=170.26s tid=0x00007fc688582550 nid=694143 waiting on condition [0x00007fc6429fe000] "task-752" #3708 [694144] prio=5 os_prio=0 cpu=7.25ms elapsed=170.20s tid=0x00007fc688582c60 nid=694144 waiting on condition [0x00007fc6485fe000] "task-753" #3709 [694145] prio=5 os_prio=0 cpu=5.08ms 
elapsed=170.11s tid=0x00007fc6884775b0 nid=694145 waiting on condition [0x00007fc648bfe000] "task-754" #3710 [694146] prio=5 os_prio=0 cpu=6.06ms elapsed=170.07s tid=0x00007fc6885ba460 nid=694146 waiting on condition [0x00007fc648dfe000] "task-755" #3711 [694147] prio=5 os_prio=0 cpu=1.25ms elapsed=170.07s tid=0x00007fc6804e1e60 nid=694147 waiting on condition [0x00007fc65d2fe000] "task-756" #3712 [694148] prio=5 os_prio=0 cpu=1.20ms elapsed=170.06s tid=0x00007fc6885bb350 nid=694148 waiting on condition [0x00007fc63a3fe000] "task-757" #3713 [694151] prio=5 os_prio=0 cpu=3.89ms elapsed=166.01s tid=0x00007fc6884759b0 nid=694151 waiting on condition [0x00007fc6443fe000] "task-758" #3714 [694152] prio=5 os_prio=0 cpu=4.50ms elapsed=166.00s tid=0x00007fc6812f4e10 nid=694152 waiting on condition [0x00007fc63e2fe000] "GC Thread#1" os_prio=0 cpu=11063.12ms elapsed=41099.28s tid=0x00007fc694006310 nid=657467 runnable "VM Thread" os_prio=0 cpu=3038.21ms elapsed=41101.83s tid=0x00007fc71c29e010 nid=657428 runnable "VM Periodic Task Thread" os_prio=0 cpu=26401.58ms elapsed=41101.84s tid=0x00007fc71c284fe0 nid=657427 waiting on condition "G1 Service" os_prio=0 cpu=1779.66ms elapsed=41101.88s tid=0x00007fc71c2744c0 nid=657426 runnable "G1 Refine#0" os_prio=0 cpu=15093.16ms elapsed=41101.88s tid=0x00007fc71c2734f0 nid=657425 runnable "G1 Conc#0" os_prio=0 cpu=56208.98ms elapsed=41101.88s tid=0x00007fc71c0a12c0 nid=657424 runnable "G1 Main Marker" os_prio=0 cpu=94.64ms elapsed=41101.89s tid=0x00007fc71c0a0300 nid=657423 runnable "GC Thread#0" os_prio=0 cpu=11148.85ms elapsed=41101.89s tid=0x00007fc71c095020 nid=657422 runnable ` |
Maybe `@Scheduled` can be a reason; I often had issues with different libraries, e.g. Google libraries, when calling them from `@Scheduled` methods.
Your thread dump shows several distinct `SshClient` instances. I don't see any dead-locks in your dump: the pool threads are apparently just idle, waiting for tasks to arrive into the thread pool queue. So, apparently these threads are not guilty (and even not suffering). All of these are just my assumptions.
I'll try to create a project over the weekend that reproduces this problem.
I don't understand why these threads aren't released; I use a `DefaultSftpSessionFactory`, not a `CachingSessionFactory`.
If you create a new `DefaultSftpSessionFactory` for every operation, each instance starts its own `SshClient` with its own threads, so you have to call `destroy()` on it when you don't need it anymore. As an alternative, you may take a look into a shared `DefaultSftpSessionFactory` declared as a single bean, so the application context destroys it for you on shutdown.
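A minimal sketch of that arrangement (host details are illustrative): the single factory is a container-managed bean, so Spring calls its `destroy()` on shutdown, and the template reuses it.

```java
import org.apache.sshd.sftp.client.SftpClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.file.remote.session.SessionFactory;
import org.springframework.integration.sftp.session.DefaultSftpSessionFactory;
import org.springframework.integration.sftp.session.SftpRemoteFileTemplate;

@Configuration
public class SftpConfig {

    // One shared, thread-safe factory; Spring calls destroy() on context
    // shutdown, which stops the underlying SshClient and its nio2 threads.
    @Bean
    public DefaultSftpSessionFactory sftpSessionFactory() {
        DefaultSftpSessionFactory factory = new DefaultSftpSessionFactory();
        factory.setHost("sftp.example.com"); // illustrative connection details
        factory.setPort(22);
        factory.setUser("user");
        factory.setPassword("secret");
        return factory;
    }

    // The template just borrows sessions from the shared factory.
    @Bean
    public SftpRemoteFileTemplate sftpRemoteFileTemplate(
            SessionFactory<SftpClient.DirEntry> sessionFactory) {
        return new SftpRemoteFileTemplate(sessionFactory);
    }
}
```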
Thank you very much, I will try to use a single `DefaultSftpSessionFactory` instance; I didn't know about this nuance. I also didn't know that `DefaultSftpSessionFactory` is thread-safe: this behavior is not documented, and how can people know that they have to destroy something? I spent several hours searching for destroy/delete/release methods everywhere, but it's hard to find anything if you don't know what to look for.
You use a Spring library, so it is expected that everything should be based on the Spring dependency injection container.
I initially focused on the `SftpRemoteFileTemplate` without considering its connection to the Spring context. Moving forward, I'll make sure to check its dependencies as well. I assumed that the file template would automatically handle resource cleanup, and checked whether it had methods for releasing resources. However, I have now discovered that I needed to pass the factory in as a bean or destroy it manually. I didn't expect this, because I thought an SFTP integration would require multiple instances to connect to different servers depending on the requirements, and therefore the factory wouldn't be used as a bean. I also can't assume that something is thread-safe based on its relation to Spring; I have had threading issues too often, and now I don't use classes that aren't marked as thread-safe in their documentation. I hope this ticket will help other people avoid my mistake. Thank you again!
The `DefaultSftpSessionFactory` is designed to be managed as a bean, so its `destroy()` is called by the application context. Yes, it is thread-safe. Can we agree that there is nothing to do from this project's perspective and close the issue? Thank you for your patience!
I checked again; the issue was solved after I added a `destroy()` call. It's also interesting that `DefaultFtpSessionFactory` doesn't need to be destroyed.
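For reference, a sketch of the manual lifecycle implied here, assuming the factory is created outside the container (host details are illustrative):

```java
import org.springframework.integration.sftp.session.DefaultSftpSessionFactory;
import org.springframework.integration.sftp.session.SftpRemoteFileTemplate;

public class ManualSftpLifecycle {

    public static void transferOnce() throws Exception {
        // A factory created outside the Spring container must be destroyed by hand.
        DefaultSftpSessionFactory factory = new DefaultSftpSessionFactory();
        factory.setHost("sftp.example.com"); // illustrative connection details
        factory.setUser("user");
        factory.setPassword("secret");
        try {
            SftpRemoteFileTemplate template = new SftpRemoteFileTemplate(factory);
            // ... perform transfers with the template ...
        }
        finally {
            // Stops the underlying SshClient; without this call its
            // "sshd-SshClient[...]-*" threads stay alive forever.
            factory.destroy();
        }
    }
}
```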
Would you mind clarifying what you mean with the `DefaultFtpSessionFactory` remark?
We have added a paragraph to the docs to explain situations like this.
I mean the inconsistency: `DefaultFtpSessionFactory` does not have to be destroyed, while `DefaultSftpSessionFactory` does.
I think this inconsistency is the main reason why we made this mistake: to add SFTP support we just copy-pasted the already implemented FTP logic, but used spring-integration-sftp instead of spring-integration-ftp and replaced the appropriate classes. As programmers, we often don't look at class implementations, and it's easy to miss that some factories need to be destroyed.
Yeah... It's just a coincidence that `DefaultFtpSessionFactory` doesn't hold long-lived resources: it opens a fresh FTP connection per session, while the SFTP factory manages a shared `SshClient` that has to be stopped.
I always try to do it in the Spring way, but a `DefaultSftpSessionFactory` contains the details of one host, while we need to use different hosts based on YAML settings that can change while the application is running.
Sounds like you need a `DelegatingSessionFactory`, which lets you select the target session factory at runtime.
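A minimal sketch of that idea (wiring and keys are illustrative): one `DefaultSftpSessionFactory` per host, all container-managed, with a `DelegatingSessionFactory` choosing among them via a thread-bound key.

```java
import java.util.Map;

import org.apache.sshd.sftp.client.SftpClient;
import org.springframework.integration.file.remote.session.DelegatingSessionFactory;
import org.springframework.integration.file.remote.session.SessionFactory;

public class MultiHostSftp {

    private final DelegatingSessionFactory<SftpClient.DirEntry> delegating;

    public MultiHostSftp(Map<Object, SessionFactory<SftpClient.DirEntry>> factoriesByHost,
            SessionFactory<SftpClient.DirEntry> defaultFactory) {
        // Each map entry is typically a container-managed DefaultSftpSessionFactory,
        // so destroy() is still handled by Spring on shutdown.
        this.delegating = new DelegatingSessionFactory<>(factoriesByHost, defaultFactory);
    }

    public void withHost(String hostKey, Runnable sftpWork) {
        this.delegating.setThreadKey(hostKey); // select the factory for this thread
        try {
            sftpWork.run(); // e.g. calls through an SftpRemoteFileTemplate
        }
        finally {
            this.delegating.clearThreadKey();
        }
    }
}
```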
In what version(s) of Spring Integration are you seeing this issue?
6.3.2
Describe the bug
Recently I got this exception:
```
Caused by: java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
```
First I checked the result of:
```
systemctl status myService
```
It showed me:
```
Tasks: 4012 (limit: 9385)
```
(Usually this number is several thousand; I didn't check it at the exact moment the OOM happened, but I think it was near 9k.)
Then I made a thread dump and saw that almost all threads are sshd ones, e.g.:
```
"sshd-SshClient[96ccf2f]-nio2-thread-2" #13237 [581081] daemon prio=5 os_prio=0 cpu=6.82ms elapsed=19393.97s tid=0x00007ff599190fa0 nid=581081 waiting on condition [0x00007ff4512fe000]
```
To Reproduce
I can't provide a relevant example right now.
Expected behavior
spring-integration-sftp must not leak threads.