This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

Stuck in Moped::Errors::ConnectionFailure: Could not connect to a primary node for replica set #348

Open
dblock opened this issue Feb 8, 2015 · 29 comments

Comments

@dblock
Contributor

dblock commented Feb 8, 2015

Moped::Errors::ConnectionFailure: Could not connect to a primary node for replica set #<Moped::Cluster:128953180 @seeds=[<Moped::Node resolved_address="10.95.128.244:27017">, <Moped::Node resolved_address="10.184.156.102:27017">]>
…avity/ruby/2.0.0/gems/moped-2.0.3/lib/moped/cluster.rb: 254:in `with_primary'
…ty/ruby/2.0.0/gems/moped-2.0.3/lib/moped/collection.rb: 124:in `insert'
…by/2.0.0/gems/mongoid-4.0.0/lib/mongoid/query_cache.rb: 117:in `insert_with_clear_cache'
…ems/mongoid-4.0.0/lib/mongoid/persistable/creatable.rb:  79:in `insert_as_root'

Occasionally we see a machine or two stuck in this. I am not sure when this happens, but about 10% of nodes end up in this state every 24 hours. The MongoDB cluster is doing fine.

This issue could probably use more detail; please tell me what to look for the next time I have a machine in this state.

@wandenberg
Contributor

Hi @dblock, could you check whether the code in #352 solves this problem?

@niedfelj

@dblock Please see my PR #338
We were having these errors too, and I'm guessing that you are actually having a pool saturation problem and not primary node connection issues. In general, the logging in mongoid is pretty terrible. Are you running Puma? And have you tuned pool_size and pool_timeout?

@steve-rodriguez

How do you go about tuning those? How do you know what to set them to? Are there guidelines?

@niedfelj

In general, you should have a pool_size that is equal to or greater than the number of threads you are running. You shouldn't need to tune pool_timeout. Here is an update submitted to mongoid for generating the mongoid.yml, giving more details on those configs:

https://github.com/mongoid/mongoid/pull/3883/files
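
To illustrate why pool_size should be at least the thread count, here is a minimal Ruby sketch that uses only the standard library. `TinyPool` and `count_pool_timeouts` are hypothetical names invented for this illustration; this is not Moped's pool implementation, and the `hold` and `timeout` values are arbitrary assumptions.

```ruby
# Hypothetical stand-in for a driver connection pool; this is NOT Moped's code.
class TinyPool
  class PoolTimeout < StandardError; end

  def initialize(size:, timeout:)
    @timeout   = timeout # seconds to wait for a free connection
    @available = Array.new(size) { |i| "conn-#{i}" }
    @mutex     = Mutex.new
    @freed     = ConditionVariable.new
  end

  # Check out a connection, yield it, and always return it to the pool.
  def with
    conn = checkout
    yield conn
  ensure
    checkin(conn) if conn
  end

  private

  def checkout
    deadline = now + @timeout
    @mutex.synchronize do
      while @available.empty?
        remaining = deadline - now
        raise PoolTimeout, "no free connection after #{@timeout}s" if remaining <= 0
        @freed.wait(@mutex, remaining)
      end
      @available.pop
    end
  end

  def checkin(conn)
    @mutex.synchronize do
      @available.push(conn)
      @freed.signal
    end
  end

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

# Run `threads` workers that each hold a connection for `hold` seconds and
# return how many of them timed out waiting for the pool.
def count_pool_timeouts(threads:, pool_size:, hold: 0.2, timeout: 0.05)
  pool = TinyPool.new(size: pool_size, timeout: timeout)
  workers = Array.new(threads) do
    Thread.new do
      begin
        pool.with { sleep hold }
        :ok
      rescue TinyPool::PoolTimeout
        :timeout
      end
    end
  end
  workers.map(&:value).count(:timeout)
end
```

Under these assumptions, `count_pool_timeouts(threads: 4, pool_size: 4)` returns 0, while a pool smaller than the thread count (for example `pool_size: 2`) should produce timeouts, because the extra threads give up before a connection is freed.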

@niedfelj

These PRs might also be useful to you; they add more and better logging in error situations and give per-request metrics in Rails:

https://github.com/mongoid/mongoid/pull/3885
https://github.com/mongoid/mongoid/pull/3884

@dblock
Contributor Author

dblock commented Feb 20, 2015

#352 has so far been good to us in production (72 hours), so I'd say it has improved things.

@fedenusy
Contributor

I'm seeing this error as well.

@ajsharp

ajsharp commented Mar 8, 2015

+1. We see this a couple of times per day, seemingly on a random basis.

@wnkz

wnkz commented Mar 10, 2015

+1 also seeing this.

@InvisibleMan

I think Moped's thread-safety code is also wrong.

#353 (comment)

@ajsharp

ajsharp commented Mar 13, 2015

Interesting. Does anyone see this behavior with unicorn? I've seen it with puma (threads), but don't have anything in production with unicorn.

Wondering if switching the app server to unicorn might be an easy "fix", because it seems like the real fix could take a bit of time.

@ajsharp

ajsharp commented Mar 13, 2015

@arthurnn any thoughts on this issue?

@InvisibleMan

I'm using the sidekiq gem, so I have no choice.

@glebtv

glebtv commented Mar 21, 2015

I just spent 20 minutes debugging an issue with this error message, and I found that calling .find(nil) in moped results in this (incorrect) error message.

> session[:test].find(nil).first
Moped::Errors::ConnectionFailure: Could not connect to a primary node for replica set #<Moped::Cluster:69729780 @seeds=[<Moped::Node resolved_address="127.0.0.1:27017">]>

Whereas without arguments it's ok:

> session[:test].find().first
=> nil

The expected error message would be something along the lines of InvalidFind.
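
Until the driver reports this more clearly, an application-level guard could reject a bad selector before it reaches the driver. The sketch below is only an illustration: `InvalidFind`, `validate_selector!`, and `checked_find` are hypothetical names (the error name is borrowed from the suggestion above and is not part of Moped), and it assumes selectors are always plain Hashes.

```ruby
# Hypothetical error class; the name comes from the suggestion above and is
# not part of Moped.
class InvalidFind < ArgumentError; end

# Return the selector unchanged if it is a Hash, otherwise raise InvalidFind.
def validate_selector!(selector)
  return selector if selector.is_a?(Hash)

  raise InvalidFind, "find expects a Hash selector, got #{selector.inspect}"
end

# Hypothetical wrapper: `collection` is anything that responds to #find,
# such as session[:test] in the example above.
def checked_find(collection, selector = {})
  collection.find(validate_selector!(selector))
end
```

With this guard, `checked_find(session[:test], nil)` would raise `InvalidFind` with a message naming the offending value instead of surfacing as a connection failure.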

@sahin

sahin commented Jun 6, 2015

+1

@wandenberg
Contributor

Who is still having this problem and can help me with the environment setup and a description of how to reproduce it?

@sahin

sahin commented Jun 7, 2015

+1 @wandenberg, we still have this problem in production. It is simple to reproduce: shut down one of the servers in the replica set, or close its port.
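
When reproducing this, it may help to confirm from the application host which seed addresses still accept TCP connections. The following is a rough sketch using Ruby's standard `socket` library; `port_reachable?` and `seed_reachability` are hypothetical helper names, and a successful TCP connect only shows that the port is open, not that the node is the primary.

```ruby
require "socket"

# Return true if a TCP connection to host:port succeeds within `timeout` seconds.
def port_reachable?(host, port, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

# Map each "host:port" seed string to whether it accepted a TCP connection.
def seed_reachability(seeds, timeout: 1)
  seeds.each_with_object({}) do |seed, result|
    host, port = seed.split(":")
    result[seed] = port_reachable?(host, Integer(port), timeout: timeout)
  end
end
```

For example, `seed_reachability(["10.95.128.244:27017", "10.184.156.102:27017"])` returns a hash from each seed string to true or false, which can be logged next to the error when a machine gets stuck.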

@nofxx

nofxx commented Jun 7, 2015

+1. Monkey-patching POOL_SIZE to a larger value seems to give more time between errors.
Also, it looks like sidekiq is playing a major role.
I have 90 sidekiq workers on 3 servers, plus 10 or so unicorns. I still don't get the pool size of 5...

@davidleroy

+1

@brand-it

brand-it commented Jul 1, 2015

👍

@mhuggins

mhuggins commented Oct 9, 2015

We're seeing this error crop up in some sidekiq jobs.

@chenqiangzhishen

chenqiangzhishen commented Jul 26, 2016

+1, still see the issue

Moped::Errors::ConnectionFailure

Could not connect to a primary node for replica set #<Moped::Cluster:50526920 @seeds=[<Moped::Node resolved_address="10.23.84.206:27018">, <Moped::Node resolved_address="10.23.84.207:27018">]>

traceback

vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/cluster.rb:254:in `with_primary'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/read_preference/primary.rb:55:in `block in with_node'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/retryable.rb:30:in `call'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/retryable.rb:30:in `with_retry'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/retryable.rb:39:in `rescue in with_retry'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/retryable.rb:29:in `with_retry'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/retryable.rb:39:in `rescue in with_retry'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/retryable.rb:29:in `with_retry'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/read_preference/primary.rb:54:in `with_node'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/cursor.rb:139:in `load_docs'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/query_cache.rb:234:in `block in load_docs'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/query_cache.rb:135:in `with_cache'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/query_cache.rb:234:in `load_docs'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/cursor.rb:28:in `each'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/query.rb:78:in `each'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/contextual/mongo.rb:122:in `each'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/contextual.rb:20:in `each'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/criteria/findable.rb:107:in `entries'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/criteria/findable.rb:107:in `from_database'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/criteria/findable.rb:75:in `multiple_from_db'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/criteria/findable.rb:19:in `execute_or_raise'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/criteria/findable.rb:40:in `find'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/findable.rb:90:in `find'
....
....

@dennislysenko

How do you properly set up a moped pool if not using mongoid? Here is how I'm doing it, and still occasionally getting these errors:

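# Note: connection_pool's timeout is given in seconds, so 3000 here is 50 minutes.
$mongo_pool = ConnectionPool.new(size: 30, timeout: 3000) do
  # Each pool entry gets its own Moped session, switched to the target database.
  mongo_client = Moped::Session.new(Moped::Uri.new(uri_string).hosts)
  mongo_client.use(dbname)
end

# Also keep one main session open outside the pool.
mongo_client = Moped::Session.new(Moped::Uri.new(uri_string).hosts)
$mongo = mongo_client.use(dbname)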

where uri_string is in the format: mongodb://1.2.3.4:27017/desired_db_name

Might end up just dropping moped as I'm not even using mongoid and that seems to be the biggest use/support case :/

@elenatanasoiu

It could be that mongo is not running. Have you tried:

sudo rm /var/lib/mongodb/mongod.lock
sudo service mongodb start

@deepthawtz

@elenatanasoiu the problem is that mongod is running and the replica set is healthy, but these error messages crop up nevertheless.

@bastoune

bastoune commented Aug 1, 2017

Hey, did you find a solution?

@shivamv

shivamv commented Sep 11, 2017

@bastoune we used sidekiq (for background jobs) and puma. Both are multithreaded, with 25 and 16 threads by default, respectively.
Now, mongoid has a default pool size of 5, so there were evidently situations in which the pool was exhausted, resulting in:
Moped::Errors::ConnectionFailure: Could not connect to a primary node for replica set #<Moped::Cluster:1223353180 @seeds=[<Moped::Node resolved_address="xx.xxx.xxx.xxx:27017">, <Moped::Node resolved_address="xx.xxx.xxx.xxx:27017">]>

Fixed it by tuning the pool size along with the sidekiq and puma thread counts.
Here is an article; it is about SQL databases, though I suppose it clarifies the fundamentals.
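
The arithmetic behind this can be sketched in a few lines of Ruby. `pool_shortfall` is a hypothetical helper, and the sketch assumes that each process has its own pool and that each thread holds at most one connection at a time; the thread counts and the pool size of 5 are the defaults quoted in this comment.

```ruby
# Return how many more pooled connections a process needs so that every
# thread can hold one at the same time (0 means the pool is large enough).
def pool_shortfall(pool_size:, threads:)
  [threads - pool_size, 0].max
end

DEFAULT_POOL_SIZE = 5 # default pool size quoted in the comment above

# Thread counts are the defaults quoted in the comment above.
{ "sidekiq" => 25, "puma" => 16 }.each do |name, threads|
  missing = pool_shortfall(pool_size: DEFAULT_POOL_SIZE, threads: threads)
  puts "#{name}: #{threads} threads, pool of #{DEFAULT_POOL_SIZE}, short by #{missing}"
end
```

Under those assumptions, a sidekiq process would be short by 20 connections and a puma process by 11, which matches the advice earlier in the thread to keep pool_size at or above the thread count.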

@bastoune

@shivamv Thanks for the reply, going to spend more time to understand this ;)

@yanghoxom

yanghoxom commented Aug 27, 2018

@shivamv thanks, that's helpful. Maybe somebody forgot to turn on the Docker container that has mongoid inside?
