Skip to content
This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

Message for pool saturated, shutdown pool instead of disconnect, logging around failover, etc #338

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

niedfelj
Copy link

In most scenarios, node.disconnect doesn't make much sense (ie connection pool is saturated, not going to be very helpful to try to get a connection again just to disconnect it).

This is especially true given the way the new underlying connection pool works, which means if you are operating on separate pool.with blocks (not nested), the connection pool may have already returned the connection you were working with to the general pool (if it's the last on the stack of connections it goes back in the available pool and since pool.with always ensures that a connection is checked in, we would have to do a checkout and ensure checkin on our own if we want to work on the same connection or have disconnect happen within the existing block).

It seems much cleaner to just shutdown the pool (as that is the only API we really have from the connection pool to disconnect all connections) and create a new one, so we can ensure we have a fresh pool and connections. Otherwise we need to change the underlying way a node deals with connections and the pool. In general, it's a bit frustrating the way sessions/clusters/nodes are not shared at all across threads, as it prevents them from really communicating events/failovers/state etc without all of them discovering it separately? Correct me if I'm wrong, but this seems at least true with Mongoid's implementation via the Session::SessionFactory when you call Mongoid.default_session? I was working on another PR to move the pools to the node objects to simplify that logic, when I realized every thread had a different copy of the node, which is clearly why ConnectionManager is there in the first place? If the current architecture remains, does it make sense to have a NodeManager that can keep the state across them to record and mark them down, etc? IMHO, there are 3 or 4 layers now of thread protections that seem redundant, but I'm not familiar enough with all of the pipelining etc that is happening, even though it already seems like that's built into the Executable module.

I've made some more comments in the code below. But in general, this PR rescues a Timeout::Error which might be thrown by ConnectionPool in response to either not being able to get a connection within pool_timeout because they are all in use and/or it couldn't create any new connection instances because there are already pool_size connections created (this Timeout::Error has nothing to do with whether a connection can be established to a host). It raises a new exception Errors::PoolSaturated (both Errors::PoolSaturated and Errors::PoolTimeout already existed in the Moped codebase, I chose the former) with a message about not being able to get a new connection in certain amount of time and a suggestion to increase pool_size (might be nice in the future to have an auto-increment for pool size?). In combination with this PR for Mongoid, https://github.com/mongoid/mongoid/pull/3884, exceptions that happen DURING a command/query should be logged with them so they don't look like they are succeeding.

I'm also adding logging around events that seem like they are crucial, especially given failover events that might need to be recorded. I'm going to continue expanding logging/debugging in PR's, as Mongoid/Moped are rather devoid of it, and while 99% of the time we have zero issues with either, we've recently run into some situations where it would have been extremely helpful.

…e disconnect shutdown a pool/connections to start fresh, add logging around failover events and better debug messaging in places.
@@ -13,6 +13,9 @@ module Manager
# Used for synchronization of pools access.
MUTEX = Mutex.new

# Used for synchronization of pool shutdown.
SHUTDOWN_MUTEX = Mutex.new
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate Mutex for shutdown so we don't block other threads trying to get/create a new one while this thread shuts down.

@arthurnn
Copy link
Contributor

kinda related to #337

@arthurnn arthurnn added this to the 2.0.3 milestone Dec 20, 2014
@niedfelj
Copy link
Author

niedfelj commented Jan 5, 2015

It is very similar to #337, although I have some issues with the logic in that PR (namely, other sessions/nodes are memoizing @pool, which I've tried to resolve in mine because they will end up with a reference to a pool in a shutdown state...and my PR to shutdown a pool isn't blocking for other nodes/threads).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants