
Limit number and lifetime of connections #208

Merged 3 commits on Jul 5, 2017

Conversation

@arvenil (Contributor) commented Jun 15, 2017

I'm trying to hunt down some leaking-connection issues. While this doesn't nail the issue, it's something I was surprised not to see in the exporter.

If the user forgets to limit the number of connections on the database side, we still won't create hundreds of them. And even if the user sets MAX_USER_CONNECTIONS to 3, the Go driver won't even try to create a connection beyond the limit.

Also, SetConnMaxLifetime gets rid of idle and stale/broken connections.

Let me know what you think.

https://golang.org/pkg/database/sql/#DB.SetMaxIdleConns
https://golang.org/pkg/database/sql/#DB.SetMaxOpenConns
golang/go@0c516c1
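
For illustration, a minimal sketch of the knobs this PR is about, applied right after sql.Open (the DSN and the exact values here are placeholders, not the PR's code):

    package main

    import (
        "database/sql"
        "log"
        "time"

        _ "github.com/go-sql-driver/mysql"
    )

    func main() {
        // Placeholder DSN; the real exporter reads it from configuration.
        db, err := sql.Open("mysql", "user:password@tcp(127.0.0.1:3306)/")
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()

        db.SetMaxOpenConns(3)                  // never ask the server for more than MAX_USER_CONNECTIONS
        db.SetMaxIdleConns(3)                  // let idle connections be reused between queries
        db.SetConnMaxLifetime(1 * time.Minute) // recycle idle and stale/broken connections
    }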

@SuperQ (Member) commented Jun 15, 2017

I don't think the max open/idle is going to do anything. We create a new connection handler for every scrape. There is no connection pooling/recycling involved.

The max lifetime could be useful.

@arvenil (Contributor, Author) commented Jun 15, 2017

Yes, you are right... however I ran into issues where I continuously had 10 open idle connections and only a restart of the exporter helped. I looked into the code and couldn't find any obvious bug like a missing rows.Close(). That's why I said this doesn't nail the issue.

So this is more of a precautionary step. In the code, this db handle is used by several scrapers; if one misbehaves, it can eat an unlimited number of connections. With limits set on the DB handle, a bad code path can't eat all the connections.

Running db.Open for every HTTP request also leads to another issue I've encountered. Since each HTTP call is handled in parallel, you can end up with several db.Open instances, so the exporter can effectively eat all the DB connections. If there were only one db.Open, kept open for the lifetime of the program as the godoc suggests, you could limit the number of connections for the whole program.

@SuperQ (Member) commented Jun 15, 2017

The only values that make sense there are:

db.SetMaxOpenConns(1)
db.SetMaxIdleConns(0)

The only way to prevent this problem is to use server-side enforcement with MAX_USER_CONNECTIONS in the grant.
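For reference, a minimal sketch of that client-side configuration, with the server-side grant shown in a comment (the exporter user and privilege list are illustrative, not from this PR):

    import "database/sql"

    // limitToSingleConnection applies the client-side limits suggested above.
    // The server-side enforcement lives in the MySQL grant, e.g.:
    //
    //   GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.*
    //     TO 'exporter'@'localhost' WITH MAX_USER_CONNECTIONS 3;
    func limitToSingleConnection(db *sql.DB) {
        db.SetMaxOpenConns(1) // at most one open connection to MySQL at a time
        db.SetMaxIdleConns(0) // close each connection as soon as it goes idle
    }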

@arvenil (Contributor, Author) commented Jun 15, 2017

With db.SetMaxIdleConns(0) the scrapers won't be able to reuse connections like they do now, which means higher overhead (every Query() will open a new connection).

db.SetMaxOpenConns(1) is fine for me, but why then do you suggest setting MAX_USER_CONNECTIONS to 3?

They all make sense if you want to avoid dev bugs affecting production, like this one: f3fb793

@SuperQ (Member) commented Jun 15, 2017

I read through a bit of the DB driver; it seems like it will keep one connection open for up to 5 minutes before closing it when you set db.SetMaxIdleConns(0). But maybe I'm misunderstanding the timeout code.

MAX_USER_CONNECTIONS 3 is just a recommendation, and is of course site-install dependent. It's necessary to allow simultaneous scrapes from multiple Prometheus servers, and it also allows for one or two slow scrapes before connections start being rejected.

Prometheus clients must be designed such that more than one server may be polling them simultaneously.

@arvenil (Contributor, Author) commented Jun 15, 2017

MAX_USER_CONNECTIONS 3 makes sense then. However, right now the exporter code doesn't protect itself from receiving 20 HTTP calls, which will end up creating 20 sql.Open instances.

To be honest, there should be only one instance of sql.Open. Otherwise it gets really messy, and I continuously see issues where the exporter uses all 3 connections and stops working because it can't open more. With one sql.Open you have full control over connections.

A program also shouldn't need an external requirement (or suggestion) to set MAX_USER_CONNECTIONS 3 just because it could otherwise eat an unlimited number of connections.

So, given that the exporter only uses one connection at a time, what would you say about:

db.SetMaxOpenConns(1)
db.SetMaxIdleConns(1)
db.SetConnMaxLifetime(1 * time.Minute)

and a global db.Open, as the Go standard library docs suggest, to avoid multiple db.Open instances?
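
A rough sketch of what that global-handle approach could look like (names and structure are hypothetical, not the exporter's actual code, and it assumes, as above, that the exporter needs only one connection at a time):

    package main

    import (
        "database/sql"
        "time"

        _ "github.com/go-sql-driver/mysql"
    )

    // db is opened once at startup and shared by every scrape, so the
    // connection limits below bound the whole program, no matter how
    // many HTTP requests arrive in parallel.
    var db *sql.DB

    func openDB(dsn string) (*sql.DB, error) {
        d, err := sql.Open("mysql", dsn)
        if err != nil {
            return nil, err
        }
        d.SetMaxOpenConns(1)                  // one connection at a time for the whole program
        d.SetMaxIdleConns(1)                  // keep that connection around for the next scrape
        d.SetConnMaxLifetime(1 * time.Minute) // but recycle it before it can go stale
        return d, nil
    }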

@SuperQ (Member) commented Jun 15, 2017

Because of the way we intend the exporter to work, we explicitly do want to open a new connection for every scrape. This allows each scrape to act as a full test of the MySQL server; otherwise you will get misleading mysql_up results.

Setting the per-connection limits as you propose is fine.

db.SetMaxIdleConns(3)
// close idle connections after max lifetime
db.SetConnMaxLifetime(5 * time.Minute)
// by design exporter should use maximum one connection at any given time
A Member commented on the last code comment above:

Please capitalize the sentence and end it with a period. Also maybe "one connection per request" might be more correct.

@SuperQ (Member) left a review:

LGTM!

@guidoiaquinti (Contributor) commented:

Right now we don't have control over the concurrent # of connections. We can run out of MySQL connections if multiple clients are scraping the exporter at the same time.

I'm 👍 to avoid multiple db.Open instances as @arvenil suggested. I don't think the # of connections should depend on the # of concurrent HTTP calls.

Please correct me if I misunderstood something @SuperQ 🙇

@SuperQ (Member) commented Jul 5, 2017

@guidoiaquinti There is no way to avoid multiple concurrent opens. This change is a no-op for the number of connections; it only limits the connection lifetime. By definition, the way Prometheus works, there may be more than one server asking a target for metrics simultaneously. Every new HTTP request for metrics opens a new connection. This is the only way it can work without breaking the required operational semantics.

@guidoiaquinti (Contributor) commented:

I might be missing something, but why can't we use a global db.Open?

I think this exporter is really useful and I don't want to limit its usage to Prometheus scraping only. On the other hand, I can't increase the concurrent # of MySQL connections linearly with the concurrent # of HTTP requests just to make it work 😞

@SuperQ (Member) commented Jul 5, 2017

The reason is Prometheus does not work that way.

  • We must support multiple simultaneous scrapes in order to support high availability.
  • In order to provide a full test of a MySQL server, we want to exercise the db.Open functionality for every request. If we recycled connections, we would mask authentication failures.

The only proper way to fix this is to adjust MAX_USER_CONNECTIONS in the database grant.
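
For contrast, a rough sketch of the per-scrape pattern described here (the handler shape and names are hypothetical, not the exporter's actual code):

    package main

    import (
        "database/sql"
        "net/http"

        _ "github.com/go-sql-driver/mysql"
    )

    // scrapeHandler opens a fresh connection for every request, so each
    // scrape exercises the full connect-and-authenticate path; a failure
    // here surfaces as mysql_up = 0 instead of being masked by a pool.
    func scrapeHandler(dsn string) http.HandlerFunc {
        return func(w http.ResponseWriter, r *http.Request) {
            db, err := sql.Open("mysql", dsn)
            if err != nil {
                http.Error(w, err.Error(), http.StatusInternalServerError)
                return
            }
            defer db.Close() // tear the connection down when this scrape ends

            if err := db.Ping(); err != nil { // forces a real connection + auth
                // The real exporter would report mysql_up 0 here.
                http.Error(w, err.Error(), http.StatusServiceUnavailable)
                return
            }
            // ... run the scrapers against db and write the metrics ...
        }
    }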

@SuperQ SuperQ merged commit bfc59d8 into prometheus:master Jul 5, 2017
@arvenil arvenil deleted the limit_connections branch July 5, 2017 18:53
@guidoiaquinti (Contributor) commented:

I still don't get how we can support

multiple simultaneous scrapes in order to support high availability

if the number of simultaneous scrapes is directly dependent on an external setting unrelated to this tool.

I'm also not sure the "full test of a MySQL server" should be a requirement of the exporter. Anyway, I don't think a global db.Open would mask authentication failures (example: MySQL credential rotation), since this PR limits the connection lifetime to 60 seconds. A low db.SetConnMaxLifetime plus a global db.Open should probably let us achieve all our goals without adding any penalty to HTTP rps.
