Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix "not master" error in mongo plugin #7743

Closed
wants to merge 7 commits into from

Conversation

zhecksum
Copy link

Fixes #7742.

Related: #2973

@hashicorp-cla
Copy link

hashicorp-cla commented Oct 26, 2019

CLA assistant check
All committers have signed the CLA.

func (c *mongoDBConnectionProducer) Close() error {
c.Lock()
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Locking inside Close() always produces a deadlock, as all of the calling functions (from mongodb.go) have locked the mutex already.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Close can be called externally which is why we need to lock.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I see two possibilities for dealing with that then.

  1. Unlocking before calling it, and then locking again once done. (downside being that the lock needs to be acquired twice, and that another process can interfere inbetween the locking and unlocking)

  2. Not calling it at all, and just closing the session by referencing the mongoDBConnectionProducer in mongodb.go. I had a commit where I did exactly this, and it worked ok.

What do you prefer?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@calvn Close now locks again!

}
case strings.Contains(err.Error(), "not master"):
// Close connection and reconnect if connected node is not primary.
if err := m.Close(); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to call Close here, getConnection should take care of closing the session. Do we also want to retry the query afterwards? If so, we might just combine the EOF case and this one into one (similar to what you're doing in RevokeUser.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does getConnection really close the session in this case though? What happens if the session is not nil already and the ping succeeds?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see getConnection calls Connection() in connection_producer.go which only closes the session if the session is not nil and the ping fails.

The ping won't fail though, as the node is still up (it's just not the primary). I don't see a way to check whether it's master with the ping command.

@@ -251,7 +279,7 @@ func (m *MongoDB) RevokeUser(ctx context.Context, statements dbplugin.Statements
err = session.DB(db).RemoveUser(username)
switch {
case err == nil, err == mgo.ErrNotFound:
case err == io.EOF, strings.Contains(err.Error(), "EOF"):
case err == io.EOF, strings.Contains(err.Error(), "EOF"), strings.Contains(err.Error(), "not master"):
if err := m.Close(); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised we've not bumped into a deadlock here, but we should remove this Close call.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I removed it and replcaed it with a different way of closing that doesn't lead to deadlock.

@chrishoffman chrishoffman added the bug Used to indicate a potential bug label Oct 29, 2019
@zhecksum
Copy link
Author

I can confirm that the current state of this PR fixes the bug. Let me know if you need any revisions to the code. Looking forward to getting this merged.

@zhecksum
Copy link
Author

zhecksum commented Nov 7, 2019

Pinging @calvn

What do you think of the changes?

What would it take to get this merged?

if m.mongoDBConnectionProducer.session != nil {
m.mongoDBConnectionProducer.session.Close();
}
m.mongoDBConnectionProducer.session = nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could move this into the conditional above so that we always nil out the session after we close it.

return errwrap.Wrapf("error closing EOF'd mongo connection: {{err}}", err)
case err == io.EOF, strings.Contains(err.Error(), "EOF"), strings.Contains(err.Error(), "not master"):
if m.mongoDBConnectionProducer.session != nil {
m.mongoDBConnectionProducer.session.Close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we set the session to nil afterwards?

// Close connection and reconnect if connected node is not primary.
if m.mongoDBConnectionProducer.session != nil {
m.mongoDBConnectionProducer.session.Close();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we set the session to nil afterwards?

@calvn calvn added this to the 1.3.1 milestone Nov 11, 2019
@briankassouf briankassouf modified the milestones: 1.3.1, 1.3.2 Dec 17, 2019
@briankassouf briankassouf modified the milestones: 1.3.2, 1.4 Jan 17, 2020
@briankassouf briankassouf modified the milestones: 1.4, triaged Feb 15, 2020
@briankassouf
Copy link
Contributor

I believe this should be fixed by #8140, going to close for now. Please let me know if you think this is a mistake.

@pbernal pbernal removed this from the triaged milestone Feb 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Used to indicate a potential bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Mongo plugin] vault read database/creds/<role> throws "not master" error after primary step down
7 participants