Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid FFI::AutoPointer is causing infinite loop in Finalizer thread starting from JRuby 9.3.4 #854

Closed
rsim opened this issue Jul 7, 2022 · 7 comments · Fixed by #856
Closed
Assignees
Labels

Comments

@rsim
Copy link
Contributor

rsim commented Jul 7, 2022

After upgrading from JRuby 9.3.3.0 to 9.3.6.0 we noticed that our Rails app with the appsignal gem consumes more CPU. From the thread dump we identified that the Java Finalizer thread is stuck in an infinite loop:

"Finalizer" - Thread t@3
   java.lang.Thread.State: RUNNABLE
        at org.jruby.RubyException.checkCircularCause(RubyException.java:399)
        at org.jruby.RubyException.setCause(RubyException.java:370)
        at org.jruby.Ruby.newRaiseException(Ruby.java:4299)
        at org.jruby.Ruby.newArgumentError(Ruby.java:3631)
        at org.jruby.Ruby.newArgumentError(Ruby.java:3626)
        at org.jruby.runtime.Arity.checkArity(Arity.java:167)
        at org.jruby.runtime.Arity.checkArity(Arity.java:161)
        at org.jruby.ext.ffi.jffi.NativeInvoker.call(NativeInvoker.java:70)
        at org.jruby.ext.ffi.jffi.DefaultMethod.call(DefaultMethod.java:106)
        at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:222)
        at org.jruby.RubyMethod.call(RubyMethod.java:119)
        at org.jruby.RubyMethod$INVOKER$i$call.call(RubyMethod$INVOKER$i$call.gen)
        at org.jruby.internal.runtime.methods.JavaMethod$JavaMethodZeroOrOneOrNBlock.call(JavaMethod.java:354)
        at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:173)
        at home.rails.eazybi.production.shared.bundle.jruby.$2_dot_6_dot_0.gems.ffi_minus_1_dot_15_dot_5_minus_java.lib.ffi.autopointer.invokeOther1:call(/home/rails/eazybi/production/shared/bundle/jruby/2.6.0/gems/ffi-1.15.5-java/lib/ffi/autopointer.rb:175)
        at home.rails.eazybi.production.shared.bundle.jruby.$2_dot_6_dot_0.gems.ffi_minus_1_dot_15_dot_5_minus_java.lib.ffi.autopointer.RUBY$method$release$0(/home/rails/eazybi/production/shared/bundle/jruby/2.6.0/gems/ffi-1.15.5-java/lib/ffi/autopointer.rb:175)
        at java.lang.invoke.LambdaForm$DMH/0x0000000801af7840.invokeStatic(LambdaForm$DMH)
        at java.lang.invoke.LambdaForm$MH/0x00000008006f2040.invokeExact_MT(LambdaForm$MH)
        at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:165)
        at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:185)
        at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:218)
        at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:173)
        at home.rails.eazybi.production.shared.bundle.jruby.$2_dot_6_dot_0.gems.ffi_minus_1_dot_15_dot_5_minus_java.lib.ffi.autopointer.invokeOther3:release(/home/rails/eazybi/production/shared/bundle/jruby/2.6.0/gems/ffi-1.15.5-java/lib/ffi/autopointer.rb:150)
        at home.rails.eazybi.production.shared.bundle.jruby.$2_dot_6_dot_0.gems.ffi_minus_1_dot_15_dot_5_minus_java.lib.ffi.autopointer.RUBY$method$call$0(/home/rails/eazybi/production/shared/bundle/jruby/2.6.0/gems/ffi-1.15.5-java/lib/ffi/autopointer.rb:150)
        at java.lang.invoke.LambdaForm$DMH/0x0000000801af7440.invokeStatic(LambdaForm$DMH)
        at java.lang.invoke.LambdaForm$MH/0x00000008006f2040.invokeExact_MT(LambdaForm$MH)
        at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:139)
        at org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:162)
        at org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:185)
        at org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:218)
        at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:173)
        at org.jruby.RubyBasicObject$Finalizer.callFinalizer(RubyBasicObject.java:1971)
        at org.jruby.RubyBasicObject$Finalizer.finalize(RubyBasicObject.java:1960)
        at java.lang.System$2.invokeFinalize(System.java:2125)
        at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:87)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:171)

Infinite loop is caused by checkCircularCause which was added in jruby/jruby@bd2595c which solved jruby/jruby#7035. It seems that somehow circular exception cause loop is created which does not include the latest top exception.

But further in the stack we can see that the root cause is FFI::AutoPointer at autopointer.rb:175 which calls

@proc.call(ptr)

and then checkArity fails for this call.

As I found the appsignal gem defines FFI::AutoPointer in three places:

@pointer = FFI::AutoPointer.new(
  pointer,
  Extension.method(:appsignal_free_transaction)
)

@pointer = FFI::AutoPointer.new(
  pointer,
  Extension.method(:appsignal_free_span)
)

@pointer = FFI::AutoPointer.new(
  pointer,
  Extension.method(:appsignal_free_data)
)

And these methods are defined as foreign functions in the following way:

attach_function :appsignal_free_transaction,
  [],
  :void

attach_function :appsignal_free_span,
  [:pointer],
  :void

attach_function :appsignal_free_data, [], :void

As I see appsignal_free_span is defined with a :pointer argument but appsignal_free_transaction and appsignal_free_data are defined without any arguments which I suspect is wrong.

My guess is that when FFI::AutoPointer tries to call the finalizer code with either appsignal_free_transaction or appsignal_free_data then the method arity check fails (as it tries to call it with a pointer argument but they are defined without arguments). And now the new checkCircularCause method results in an infinite loop and exposes this problem, previously I suspect it was silently failing and was not noticeable.

@rsim rsim added the bug label Jul 7, 2022
@rsim rsim changed the title Invalid FFI::AutoPointer is causing infinite loop in Finalizer thread srating from JRuby 9.3.4 Invalid FFI::AutoPointer is causing infinite loop in Finalizer thread starting from JRuby 9.3.4 Jul 7, 2022
@shairyar
Copy link
Member

shairyar commented Jul 7, 2022

rsim added a commit to eazybi/appsignal-ruby that referenced this issue Jul 8, 2022
@rsim
Copy link
Contributor Author

rsim commented Jul 8, 2022

We patched the appsignal gem with this change eazybi@bae84fd and now in production, we do not see anymore the Finalizer thread with an infinite loop in checkCircularCause. Please verify this patch and include in the next version of the gem.

@shairyar
Copy link
Member

Thanks @rsim for the patch, we will take a look into that.

tombruijn added a commit that referenced this issue Jul 22, 2022
The `appsignal_free_transaction` and `appsignal_free_data` Rust
extension function calls were missing their argument definitions in the
FFI declaration.

These are the function definitions from the `appsignal.h` file,
downloaded upon installation.

```c
void appsignal_free_transaction(appsignal_transaction_t*);
void appsignal_free_data(appsignal_data_t*);
```

Fixes #854 where users saw very high CPU usage because the function call
was being retried indefinitely.

Co-authored-by: Raimonds Simanovskis <raimonds.simanovskis@gmail.com>
@tombruijn
Copy link
Member

I've created a PR for the fix you sent in here: #856

@tombruijn tombruijn self-assigned this Jul 22, 2022
@tombruijn
Copy link
Member

Looks like our CI didn't show us this issue because we don't test against JRuby 9.3, in which the commit that highlights this issue was added. I've created an issue to add JRuby 9.3 to our CI build in #858.

@tombruijn tombruijn mentioned this issue Jul 22, 2022
2 tasks
tombruijn added a commit that referenced this issue Jul 22, 2022
The `appsignal_free_transaction` and `appsignal_free_data` Rust
extension function calls were missing their argument definitions in the
FFI declaration.

These are the function definitions from the `appsignal.h` file,
downloaded upon installation.

```c
void appsignal_free_transaction(appsignal_transaction_t*);
void appsignal_free_data(appsignal_data_t*);
```

Fixes #854 where users saw very high CPU usage because the function call
was being retried indefinitely.

Co-authored-by: Raimonds Simanovskis <raimonds.simanovskis@gmail.com>
@tombruijn
Copy link
Member

We're hoping to release the fix for this next week.

@tombruijn
Copy link
Member

@rsim The fix for this has been released in Ruby gem 3.1.0. And we're already up to 3.1.2 now :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants