fix: drop replication slot when db deletes wal segment #154

Merged on May 22, 2021 (2 commits)

@w3b6x9 w3b6x9 commented May 21, 2021

What kind of change does this PR introduce?

Bug fix

What is the current behavior?

The database deletes a WAL segment that the Realtime server still needs to replay. Realtime then gets stuck in a restart loop because it cannot find the WAL segment.

What is the new behavior?

The database deletes a WAL segment that the Realtime server still needs to replay. Realtime recognizes this and drops the replication slot; on restart, it creates a new replication slot. All is normal again.
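
For readers skimming this thread, here is a minimal Python sketch of the recovery flow described above. It is not the code in this PR (Realtime itself is not written in Python). It assumes the failure surfaces as Postgres's "requested WAL segment ... has already been removed" error text, and the helper names, the `example_slot` slot name, and the `pgoutput` plugin default are hypothetical choices for illustration. `pg_drop_replication_slot` and `pg_create_logical_replication_slot` are the standard Postgres functions.

```python
import re

# Message text Postgres emits when a requested WAL segment was already recycled.
MISSING_WAL_SEGMENT = re.compile(
    r"requested WAL segment (\S+) has already been removed"
)


def missing_wal_segment(error_message):
    """Return the segment name if the error reports a removed WAL segment, else None."""
    match = MISSING_WAL_SEGMENT.search(error_message)
    return match.group(1) if match else None


def drop_slot_on_missing_wal(error_message, slot_name):
    """On detection: return (sql, params) to drop the stale slot, or None (hypothetical helper)."""
    if missing_wal_segment(error_message) is None:
        return None  # Unrelated error: leave the slot alone.
    return ("SELECT pg_drop_replication_slot(%s)", (slot_name,))


def create_slot_on_restart(slot_name, plugin="pgoutput"):
    """On restart: return (sql, params) to create a fresh slot (plugin name is an assumption)."""
    return (
        "SELECT pg_create_logical_replication_slot(%s, %s)",
        (slot_name, plugin),
    )
```

The statements are returned as parameterized `(sql, params)` pairs so any Postgres driver could execute them. A freshly created slot starts from the latest WAL position, so changes between the lost segment and the new slot are not replayed.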

Additional context

@w3b6x9 w3b6x9 requested a review from kiwicopple May 21, 2021 04:25
@kiwicopple
Member

Yeah, I agree with this approach for our current use case. If we ever want to use this as an auditing tool, however, we'll have to re-engineer this. Perhaps we can add this to the FAQs in the README. Something like:

  • q: does this provide guaranteed delivery?
  • a: not yet, due to XXX.

I think as part of our re-engineering of realtime/sockets/workflows we can come up with some solutions here.

@w3b6x9
Member Author

w3b6x9 commented May 21, 2021

@kangmingtay

"If we ever want to use this as an auditing tool however, we'll have to re-engineer this."
"I think as part of our re-engineering of realtime/sockets/workflows we can come up with some solutions here"

Yep, totally agree!

"add this to the FAQ's in the readme"

Yep, good idea! Let me add it to the README and I'll ping you for feedback.

README.md Outdated

1. Postgres database runs out of disk space due to Write-Ahead Logging (WAL) buildup, which will crash the database and prevent Realtime server from streaming replication and broadcasting changes.
2. Realtime server will crash due to a larger replication lag than available memory, forcing the creation of a new replication slot and resetting streaming replication to read from the latest WAL data.
3. When Realtime server falls too far behind for any reason, for example disconnecting from database as WAL continues to build up, then database will delete WAL segments the server still needs to read from, for example after reconnecting.
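
To make "falls too far behind" in the list above concrete, here is a small Python sketch that measures how far a replication slot lags behind the database, in bytes of WAL. It is an illustration only, not part of this PR. The function names and the threshold parameter are hypothetical, and the query assumes Postgres 10 or later, where `pg_current_wal_lsn()` and the `pg_replication_slots` view's `restart_lsn` column are available.

```python
# Query a monitor could run to fetch each slot's position (Postgres 10+).
SLOT_POSITIONS_SQL = (
    "SELECT slot_name, restart_lsn, pg_current_wal_lsn() "
    "FROM pg_replication_slots"
)


def lsn_to_int(lsn):
    """Convert a Postgres LSN such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    # An LSN is a 64-bit value printed as two 32-bit hexadecimal halves.
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(current_lsn, restart_lsn):
    """Bytes of WAL the database must retain for the slot."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)


def is_falling_behind(current_lsn, restart_lsn, max_lag_bytes):
    """True when retained WAL exceeds a caller-chosen threshold (hypothetical policy)."""
    return replication_lag_bytes(current_lsn, restart_lsn) > max_lag_bytes
```

A check like this only reports the lag; what to do once the threshold is crossed (alerting, or dropping the slot) is a separate decision.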
Member

Looks great, @w3b6x9. I would just update some of the "will"s to "can"s, e.g.:

which can crash the database

Member Author

@kiwicopple yep, good call

Member

lgtm 👍 feel free to merge

@w3b6x9 w3b6x9 merged commit 48edd9e into master May 22, 2021
@w3b6x9 w3b6x9 deleted the db-deletes-wal branch May 22, 2021 02:48
@github-actions

🎉 This PR is included in version 0.14.1 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

w3b6x9 pushed a commit that referenced this pull request Nov 4, 2022