One semaphore to exclusively load them all #550
Note, this whole problem goes away if we autoload this memoized method once at boot, before puma worker threads are spawned.
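To illustrate the "load once at boot" idea in isolation, here is a minimal plain-Ruby sketch. `ToyEnvironment` and `expensive_load` are hypothetical stand-ins, not the real `Api::Environment` code, and no Rails or puma API is involved; the point is only that an unsynchronized `||=` memo is harmless when the first call completes before any worker thread exists.

```ruby
# Toy stand-in for a lazily memoized class method (all names are hypothetical).
class ToyEnvironment
  @load_count = 0

  class << self
    attr_reader :load_count

    # Unsynchronized `||=`: safe only if the first call finishes before
    # other threads start calling it.
    def time_attributes
      @time_attributes ||= expensive_load
    end

    private

    # Stands in for the autoload + columns_hash work done on first access.
    def expensive_load
      @load_count += 1
      %w[created_on updated_on].freeze
    end
  end
end

# "Boot" phase: warm the memo on the main thread, before any workers exist.
ToyEnvironment.time_attributes

# "Worker" phase: threads only read the already-computed value.
workers = Array.new(4) { Thread.new { ToyEnvironment.time_attributes } }
results = workers.map(&:value)
```

With the warm-up call in place, the expensive path runs exactly once and every worker sees the same frozen array.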
76a2d68 to b96b498
@jrafanie no BZ, but we might need one... the bug can appear in older versions as well.
b96b498 to 7605f3e
7605f3e to 67a74c5
Checked commit jrafanie@67a74c5 with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0 |
result << name if %w(date datetime).include?(typeobj.type.to_s)
klass.columns_hash.each do |name, typeobj|
  result << name if %w(date datetime).include?(typeobj.type.to_s)
end
👍
I feel bad that I don't understand the whole rails code reloading process well enough to fix something like this 😞
# Temporary measure to avoid thread race condition which could lead to a deadlock
ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
# Ensure we're the only thread trying to autoload classes and their columns
ActiveSupport::Dependencies.interlock.loading do
Note, this also works:
Rails.application.executor.wrap do
...
@Fryguy do you prefer the executor wrapper? I can research more tomorrow if that makes sense over the interlock.loading.
Yes, the executor wrapper seemed like a better approach.
haha, executor.wrap deadlocks if I hit compute -> cloud after login 🤣 ... back to .loading
I need to circle around to @kbrock and @NickLaMuro regarding the |
@kbrock @NickLaMuro When @jrafanie went through this, we came to the conclusion that vanilla Rails' built-in columns_hash (and thus load_schema) never goes through the autoload process, whereas once ar_virtual is introduced, load_schema is changed to autoload some things. This is the fundamental problem. Since load_schema happens during the code require process, the const_missing ends up causing a cascading require. In a single thread this is fine for some reason, but with multiple threads you end up with a collision that manifests as a circular dependency error. (I don't think it's actually circular; instead, a shared "did I see this yet" list shows a class as already being loaded. This all gets really confusing to follow.)
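The shared "did I see this yet" list can be sketched with a toy loader. This is a hypothetical illustration in plain Ruby, not ActiveSupport's implementation: `ToyLoader`, `autoload_constant`, and the `"Vm"` name are all made up. It assumes a single process-wide list of in-progress loads with no per-thread bookkeeping, which is enough to produce a "circular dependency" report when no cycle exists.

```ruby
# Toy loader (hypothetical, not ActiveSupport code) with one process-wide
# "currently loading" list shared by every thread.
class ToyLoader
  class CircularDependency < StandardError; end

  def initialize
    @loading = [] # names whose load has started but not finished
    @loaded  = {}
  end

  def autoload_constant(name)
    return @loaded[name] if @loaded.key?(name)

    # A name already in the list is assumed to be a cycle in *this* call
    # chain, even when a different thread put it there.
    if @loading.include?(name)
      raise CircularDependency, "Circular dependency detected while autoloading #{name}"
    end

    @loading << name
    begin
      yield if block_given? # stands in for the slow file require
      @loaded[name] = name.to_sym
    ensure
      @loading.delete(name)
    end
  end
end

loader  = ToyLoader.new
entered = Queue.new
release = Queue.new

# Thread A starts loading "Vm" and pauses in the middle of the load.
thread_a = Thread.new do
  loader.autoload_constant("Vm") do
    entered << true
    release.pop
  end
end
entered.pop # wait until thread A is mid-load

# The main thread asks for the same constant. There is no real cycle,
# but the shared list makes it look like one.
error = begin
  loader.autoload_constant("Vm")
  nil
rescue ToyLoader::CircularDependency => e
  e
end

release << true
thread_a.join
```

The two queues only make the interleaving deterministic; in the real bug the same overlap happens by chance, which would explain why the error appears "most of the time" rather than always.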
67a74c5 to f9203f8
Not sure if it was documented elsewhere but... The issue is with virtual attribute's It is the |
A work around could be for us to explicitly state the type of the attribute delegated. That way loading one class is not dependent upon other classes. One interesting note, |
Still testing this... |
https://bugzilla.redhat.com/show_bug.cgi?id=1671458

Fixes ManageIQ#544

From MartinH's findings in ManageIQ#544, we had one thread at the `cspec[:klass].constantize` line while another thread was trying to run `klass.columns_hash`, causing a `Circular dependency detected while autoloading...` error.

I then tried a bunch of things until I found a way to reliably recreate this error:

```ruby
require_relative 'config/environment'

threads = []
4.times do
  threads << Thread.new { Api::Environment.time_attributes }
end
threads.collect(&:join)
```

I could then run the above script several times in my shell and get the `circular dependency` error most of the time.

```
for x in `seq 1 10`; do bundle exec ruby test.rb; done
```

With this test in place, I then tried a few solutions:

1) Move the `klass.columns_hash` block into the `permit_concurrent_loads` block:

```
diff --git a/lib/api/environment.rb b/lib/api/environment.rb
index 87b34f99..f4b6554a 100644
--- a/lib/api/environment.rb
+++ b/lib/api/environment.rb
@@ -22,9 +22,10 @@ module Api
         # Temporary measure to avoid thread race condition which could lead to a deadlock
         ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
           klass = cspec[:klass].constantize
-        end
-        klass.columns_hash.each do |name, typeobj|
-          result << name if %w(date datetime).include?(typeobj.type.to_s)
+
+          klass.columns_hash.each do |name, typeobj|
+            result << name if %w(date datetime).include?(typeobj.type.to_s)
+          end
         end
       end
     end
```

This did not fix the `circular dependency` error. Perhaps `permit_concurrent_loads` doesn't handle arbitrarily deep nested autoloads across threads?

2) I tried Mutex#synchronize and this worked, but I'd rather we work with the interlock provided by rails.

```
diff --git a/lib/api/environment.rb b/lib/api/environment.rb
index 87b34f99..f51de73f 100644
--- a/lib/api/environment.rb
+++ b/lib/api/environment.rb
@@ -1,5 +1,6 @@
 module Api
   class Environment
+    ONE_AUTOLOADER_LOCK = Mutex.new
     def self.url_attributes
       @url_attributes ||= Set.new(%w(href))
     end
@@ -19,12 +20,13 @@ module Api
         next if cspec[:klass].blank?
 
         klass = nil
-        # Temporary measure to avoid thread race condition which could lead to a deadlock
-        ActiveSupport::Dependencies.interlock.permit_concurrent_loads do
+        # Ensure we're the only thread trying to autoload classes and their columns
+        ONE_AUTOLOADER_LOCK.synchronize do
           klass = cspec[:klass].constantize
-        end
-        klass.columns_hash.each do |name, typeobj|
-          result << name if %w(date datetime).include?(typeobj.type.to_s)
+
+          klass.columns_hash.each do |name, typeobj|
+            result << name if %w(date datetime).include?(typeobj.type.to_s)
+          end
         end
       end
     end
```

3) I tried Sync.new with SH and EX locks instead of Mutex and this failed with Thread killed errors.

4) I changed the `permit_concurrent_loads` on the interlock to `loading` and this worked, but Yuri found this caused a deadlock.

5) Use `AS::Dep.interlock.loading` from 4) and move the `columns_hash` call into the `loading` block. This solves the circular dependency and avoids the deadlock encountered in 4).

I noticed that `permit_concurrent_loads` calls `yield_shares` with `compatible: [:load]`, and that method has this in the source code comment:

https://github.com/rails/rails/blob/bb22fe9d4a6102d2a28cb1adfd6fe9d38fc9bb22/activesupport/lib/active_support/concurrency/share_lock.rb#L166-L168

```
Temporarily give up all held Share locks while executing the supplied block, allowing any +compatible+ exclusive lock request to proceed.
```

Perhaps, since we're loading code, `permit_concurrent_loads` is too permissive of other exclusive lock requests and we really need to ensure nothing else is trying to load.
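The common thread in options 2) and 5) is that both steps, resolving the class and reading its columns, run under one exclusive lock. A minimal plain-Ruby sketch of that shape follows. Everything here is a hypothetical stand-in: `LOADER_LOCK` plays the role of `ONE_AUTOLOADER_LOCK`, `FAKE_COLUMNS` replaces `klass.columns_hash`, and `LoadTracker` exists only to measure how many threads are inside the critical section at once. It uses a plain `Mutex`, as in option 2), not the Rails interlock.

```ruby
# One lock for the whole "resolve class, then read columns" sequence.
LOADER_LOCK = Mutex.new

# Hypothetical column tables standing in for `klass.columns_hash`.
FAKE_COLUMNS = {
  "Vm"   => { "name" => :string, "created_on" => :datetime, "retires_on" => :date },
  "Host" => { "hostname" => :string, "updated_on" => :datetime }
}.freeze

# Records the highest number of threads seen inside the tracked block.
class LoadTracker
  attr_reader :max_seen

  def initialize
    @guard    = Mutex.new
    @active   = 0
    @max_seen = 0
  end

  def track
    @guard.synchronize do
      @active += 1
      @max_seen = [@max_seen, @active].max
    end
    yield
  ensure
    @guard.synchronize { @active -= 1 }
  end
end

TRACKER = LoadTracker.new

def time_attributes_for(name)
  LOADER_LOCK.synchronize do
    TRACKER.track do
      columns = FAKE_COLUMNS.fetch(name) # step 1: stands in for constantize
      sleep 0.005                        # widen the window a race would need
      # step 2: stands in for filtering columns_hash by type
      columns.select { |_, type| %i[date datetime].include?(type) }.keys
    end
  end
end

threads = %w[Vm Host Vm Host].map { |name| Thread.new { time_attributes_for(name) } }
results = threads.map(&:value)
```

Because both steps sit inside the same `synchronize` block, the tracker never sees more than one thread in flight. Note that a plain `Mutex` is not reentrant, so this shape only works if nothing inside the block tries to take `LOADER_LOCK` again.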
f9203f8 to 8ea4c9c
It turns out |
One semaphore to exclusively load them all (cherry picked from commit e75c9e0) https://bugzilla.redhat.com/show_bug.cgi?id=1673039
Hammer backport details:
|