
What is the best way to handle an inaccessible postgres server? #434

Closed

voxpelli opened this issue Sep 4, 2013 · 14 comments

@voxpelli

voxpelli commented Sep 4, 2013

I'm not sure how my app should best handle an inaccessible Postgres server. I want to build it so that it survives the database server crashing and recovers on its own once the database is back up.

Using pg.Client I would expect to be able to reconnect like this if the database is down on the initial connection attempt (but with a better timer mechanism of course):

var db = new pg.Client(process.env.DATABASE_URL);
db.connect(function (error) {
  if (error) {
    setTimeout(function () {
      db.connect();
    }, 10000);
  }
});
db.on('error', function () {});

The problem with this approach is that db.connect() attaches a lot of listeners every time it is called. Apart from any errors the listeners themselves can cause by being invoked multiple times, this floods the connection with listeners until Node starts warning about it. What would be the correct way to reconnect here?

Using pg.connect instead:

pg.connect(process.env.DATABASE_URL, function (err, client, done) {
  if (err) {
     // Handle the error, do a fallback or retry or something
  }
});

By using pg.connect on each request to my Node app instead, I thought I could make reconnecting to the database easier while also gaining the advantage of connection pooling. My idea was to just provide a good fallback when the database can't be reached, but instead I get unhandled errors thrown at me if I kill my database server while my code is running – which makes it hard to handle the inaccessible server gracefully:

events.js:72
        throw er; // Unhandled 'error' event
              ^
error: terminating connection due to administrator command
    at Connection.parseE (/node_modules/pg/lib/connection.js:526:11)
    at Connection.parseMessage (/node_modules/pg/lib/connection.js:371:17)
    at Socket.<anonymous> (/node_modules/pg/lib/connection.js:86:20)
    at Socket.EventEmitter.emit (events.js:95:17)
    at Socket.<anonymous> (_stream_readable.js:746:14)
    at Socket.EventEmitter.emit (events.js:92:17)
    at emitReadable_ (_stream_readable.js:408:10)
    at emitReadable (_stream_readable.js:404:5)
    at readableAddChunk (_stream_readable.js:165:9)
    at Socket.Readable.push (_stream_readable.js:127:10)

So I'm kind of wondering – what is the best practice when implementing node-postgres so that I can handle an inaccessible database without major errors being thrown, and so that it reconnects once the database becomes available again?

@voxpelli
Author

voxpelli commented Sep 4, 2013

The solution for pg.connect seems to be to set a global error handler:

pg.on('error', function (err) {
  console.log('Database error!', err);
});

Still don't know the correct solution to use with pg.Client though.

@brianc
Owner

brianc commented Sep 4, 2013

Hi @voxpelli - I did some work on the pool to make it recover more gracefully from the backend server going down.

How it works: whenever a client is idling in the pool and receives an error, that error is consumed by the pool. The pool destroys that client and removes it from the pool, so the next time you request a client you get a "fresh" one. The pool itself then re-emits the error so you can be aware of what's going on. Does that sound like a sane way to handle that?

As for an individual client, they're not supposed to be reconnected. The best thing to do (and what the pool does) when a client receives an error for any reason other than a query error is to close that client and get a new one. It does take maybe 50 milliseconds to do the initial postgres handshake for a new connection, so you don't want to create a new client on every http request or in a tight loop, but there are plenty of cases where you want to keep a single client open for a long time. Definitely get a new client if the one you're using experienced an error.
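A minimal sketch of that "replace, don't reconnect" pattern, assuming a hypothetical freshClient helper and an arbitrary 10-second retry delay:

var pg = require('pg');

// Hypothetical helper: never reconnect an errored client – end it and
// build a brand-new one instead.
function freshClient(onReady) {
  var client = new pg.Client(process.env.DATABASE_URL);

  client.connect(function (err) {
    if (err) {
      // Initial connect failed – discard this client and retry later.
      return setTimeout(function () { freshClient(onReady); }, 10000);
    }

    client.on('error', function () {
      // Non-query error on a live client: close it and get a new one.
      client.end();
      freshClient(onReady);
    });

    onReady(client); // note: called again with a new client after each recovery
  });
}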

@brianc
Owner

brianc commented Sep 4, 2013

Does that help?

@voxpelli
Author

voxpelli commented Sep 4, 2013

Yeah, that helps to explain it. Thanks for the quick answer - and for a great module!

I think it would be good if the pool only re-emitted the error when an error listener has been attached, as Node will otherwise see an uncaught error and quit. I would expect a call to pg.connect() to require no configuration outside of that specific call – for it to be pretty self-contained, with anything happening outside of it only being sent to me if I specifically requested it.
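A rough sketch of the guard I mean, using Node's EventEmitter.listenerCount (reemitError is a hypothetical name):

var EventEmitter = require('events').EventEmitter;

// Re-emit only when someone is actually listening, so an unhandled
// 'error' event can't crash the process.
function reemitError(pool, err) {
  if (EventEmitter.listenerCount(pool, 'error') > 0) {
    pool.emit('error', err);
  }
}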

I also feel that adding listeners directly to the main module object is a bit problematic as that listener is then shared with all code that imports the library and will receive errors that are totally unrelated to the code where it was defined. It's probably useful in some cases, but it can create some weird and confusing situations - especially if the listener is required.

@brianc
Owner

brianc commented Sep 30, 2013

About the error coming off the pg object...I agree it's ugly. I just don't like the alternative of hiding the error altogether. My reasoning was that it's probably better to have an unexpected error throw and crash your process (if you're using domains & cluster it shouldn't be too big of a deal) so you can see something is wrong, rather than having an unexpected error silently go off in the background while things start failing for no apparent reason.

As for the listener being shared with all code that imports the library...that's a bigger issue I didn't fully consider until you brought it up. I'm wondering if something like making the pool initialization explicit would be better...something like this:

var pg = require('pg');
var pool = pg.createPool('your connection params');

pool.checkout(function (err, client, release) {
  // do your stuff
});

pool.on('error', function (err, client) {
  // an idle client in the pool emitted an error
  // and has been removed from the pool and destroyed
});

That would be a big breaking change but probably for the best overall?

@thelinuxlich

+1 for this

@luizcarvalho

+1 for this

@voxpelli
Author

An issue today in one of my modules made me revisit this (something I have meant to do for over two years, but I guess other stuff got in the way 😛).

The issue is about unwanted exceptions being thrown from my Postgres Express session handler when the database is down: voxpelli/node-connect-pg-simple#29. Those exceptions come from an unhandled error event.

Tricky thing with that module is:

It bundles its own pg module that it can use to make its own connections if no other is given, so it has to listen for any error events itself. That can of course be handled, but if the pg module gets shared with other modules through deduplication, then – as said above – that listener will be called when it shouldn't be.

And even trickier:

npm 3 does deduplication by default nowadays 😃, which means the module will almost certainly be shared. So if I add an error handler, the default behavior of throwing an exception is effectively disabled for the whole application, since my handler will swallow all of the application's pg errors.

So – especially with npm 3 in mind – emitting events on the top-level object should probably go away in favor of something like your suggested API above. Have you thought some more about that?

@greaber

greaber commented Mar 19, 2016

I am a newbie using docker and docker-compose. A scenario that could happen is that you make a configuration change to the db container, and then it goes away and comes back with a new IP: https://docs.docker.com/compose/networking/. I guess the best behavior in this scenario would be for my node app to just reconnect (probably serving 500s on any pending requests, although in principle it could retry them). But maybe it is fine to crash node too, since this should be a rare event? Just trying to get a sense of what a reasonable thing to do is and how I can implement it.

@vincentvent

@brianc your pool creation idea would be great. I am getting the following warning because of all the event listeners the current approach adds:

warning: possible EventEmitter memory leak detected. 11 listeners added. Use emitter.setMaxListeners() to increase limit.

@ADumaine

I ran into a similar issue in a different scenario. The app requires some customer data to be stored in separate databases, and it had been built using the pg pool. That was pulled out and replaced with pgbouncer, which runs as a separate daemon (or service on Windows) and makes connecting to multiple databases simpler. Each time a new account is created, a new database is created, and pgbouncer.ini has to be updated and pgbouncer restarted to recognize it. Any queries that try to connect during the restart get an error.

After trying various loops and timeouts, I finally resorted to wrapping client.connect() in a promise and using the promiseRetry module to reconnect on errors. That module has retry and timeout options that can be adjusted for different scenarios.

I'd much prefer some sort of configurable connection retry mechanism in the Client, as the current approach just adds overhead.

This uses Hapi.js but shows how it can be implemented. It also has extra console.logs for testing. Any suggestions for improving it, or simpler ways to accomplish this, would be much appreciated.

const debug = require('debug')('RUNQUERY');
const promiseRetry = require('promise-retry');
const Client = require('pg').Client;

// this creates a method attached to the server
// access via server.methods.run.pgQuery(dbname, query, values, callback)

exports.register = function (server, options, next) {

    var runQuery = function (dbname, query, values, cb) {

        if (!dbname) dbname = server.settings.app.database;
        const config = {
            host: server.settings.app.bouncerhost,
            port: server.settings.app.bouncerport,
            database: dbname,
            user: server.settings.app.pguser,
            password: server.settings.app.pgpw,
        };

        var client;

        // options for promiseRetry; client.connect() takes about 50ms
        const retryOptions = { retries: 13, factor: 2, minTimeout: 100, maxTimeout: 800 };

        function tryConnect(done) {
            client.connect(function (err) {
                console.log('try');
                if (err) {
                    console.log(err.code);
                    client.end();
                    return done(err);
                }
                return done(null, 'connected');
            });
        }

        promiseRetry(function (retry, number) {
            console.log('attempt number', number);
            return new Promise(function (resolve, reject) {
                // a fresh client for every attempt
                client = new Client(config);
                tryConnect(function (err, result) {
                    if (result === 'connected') return resolve(result);
                    return reject(err);
                });
            })
            .then(function (val) {
                console.log('promise output', val);
            })
            .catch(retry);
        }, retryOptions)
        .then(function () {
            console.log('completed', query);
            // run the query
            client.query(query, values, function (err, result) {
                // release the client
                client.end();
                if (err) return cb ? cb('runQuery: ' + err + '  ' + query) : debug(err);
                return cb ? cb(null, result) : result;
            });
        }, function (err) {
            console.log('ERROR:', err);
            return cb ? cb('runQuery: ' + err + '  ' + query) : debug(err);
        })
        .catch(function (err) {
            console.log('CATCH ERROR:', err);
        });
    };

    server.method('run.pgQuery', runQuery, {});

    next();
};

exports.register.attributes = {
    name: 'runQuery'
};

@robhaswell

robhaswell commented Apr 12, 2017

I also have the problem that my clients and databases are started simultaneously. Here is my approach:

const pg = require('pg')
const retry = require('promise-retry')
const log = console // stand-in: the original assumes an app-level logger

function connect() {
  function attempt () {
    const client = new pg.Client()

    return new Promise((resolve, reject) => {
      client.connect((err) => {
        if (err) return reject(err)
        resolve(client)
      })
    })
  }

  return retry((retry_, number) => {
    log.info('connecting to database', { number })
    return attempt().catch(retry_)
  }, { minTimeout: 500 })
  .then((client) => {
    log.info('connected to database')

    client.on('error', (err) => {
      /* Database has gone away, bump this problem up to the process manager. */
      process.exit(1)
    })

    // ... program starts here ...
  })
}

This doesn't handle disconnecting Postgres servers properly, as my app goes on to close this connection and open a pool.

@charmander
Collaborator

If you create a Client, you have to handle disconnects yourself. (In uses where that’s considered a problem, though, you’d generally want to use a pool instead.)
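For reference, a minimal sketch of the pool-based approach, assuming the pg.Pool API that ships with pg 6+ (connection details come from the standard PG* environment variables here):

const { Pool } = require('pg')
const pool = new Pool() // reads PGHOST, PGUSER, PGDATABASE, etc. from the environment

pool.on('error', (err) => {
  // An idle client errored (e.g. the backend went away). The pool has
  // already destroyed and removed it, so log and carry on.
  console.error('idle client error', err.message)
})

pool.query('SELECT 1', (err, res) => {
  if (err) return console.error('query failed', err.message)
  console.log(res.rows)
})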

@eljefedelrodeodeljefe

Frankly, just saying "use a pool" is an oversimplification, imo. E.g. I can hardly use pools, because our DB is tenant-based, with thousands of databases I cannot hold references to all the time – especially not in a load-balanced environment.

If you are not willing to change the client error handling, I would suggest very strong docs instead – e.g. how to hack around it with JS timeouts after pg.connect, attaching to the error handler with minimal risk of memory leaks.
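As an illustration, a sketch of that kind of recipe, using the pg.connect API from earlier in this thread: one module-level error handler attached exactly once (so listeners can't pile up), plus a hypothetical connectWithRetry helper with an arbitrary 5-second backoff:

const pg = require('pg')

// attached once at module load, so listeners never accumulate
pg.on('error', (err) => {
  console.error('idle pool client error', err.message)
})

function connectWithRetry(done) {
  pg.connect(process.env.DATABASE_URL, (err, client, release) => {
    if (err) {
      // back off and retry instead of surfacing the error immediately
      return setTimeout(() => connectWithRetry(done), 5000)
    }
    done(client, release)
  })
}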
