During rolling deploy it is possible for the old application pod to interact with the updated database #1867
@jobara - We understand that this is not a priority at this time. Is that right? The sense of our team is that we can turn off the rolling updates, but then we will have downtime for each deployment. This might not be worth our time. Do you agree?
@colleenskemp I'll have to think some more on this. I'll check in with @michelled when she's back.
At the dev check-in meeting with @JureUrsic, @peterhebert, and @michelled we discussed using Laravel's maintenance mode for this. While the deploy is happening, the script would call […]
@JureUrsic I was thinking about this today and wondering when/where it should run. I was thinking it could go around the migration step in DeployGlobal.php, but I'm not sure, because wouldn't the old web head need to come down before we take the site out of maintenance mode? Also, are you able to take on this task?
@jobara it should go into the "local" command, at the start and at the end
I can run some tests on dev, just give me the commands to run |
@JureUrsic thanks, you can use the […]
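The maintenance-mode approach discussed above can be sketched roughly as follows. The `php artisan down`, `php artisan migrate --force`, and `php artisan up` commands are real Laravel artisan commands; the surrounding structure (the `run()` helper, the `DRY_RUN` flag, the function name) is purely illustrative and is not taken from DeployGlobal.php:

```python
import subprocess

DRY_RUN = True  # set to False on a real deploy host with artisan available

def run(cmd):
    """Run one deploy step, or just record it in dry-run mode."""
    if not DRY_RUN:
        subprocess.run(cmd.split(), check=True)
    return cmd

def migrate_with_maintenance_mode():
    """Hypothetical sketch: wrap the migration step in maintenance mode."""
    steps = []
    steps.append(run("php artisan down"))             # stop serving requests
    steps.append(run("php artisan migrate --force"))  # apply schema changes
    steps.append(run("php artisan up"))               # resume serving
    return steps
```

This only illustrates the ordering under discussion; as noted below, the open question is when it is safe to call `php artisan up`, since the old web head may still be serving traffic.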
@JureUrsic the other day I manually reset the database in the dev deploy. As part of that I put the site in maintenance mode. However, after bringing the site back up using […]
So the problem with maintenance mode currently is that the health check on the pods also gets the maintenance-mode response, so the pods are considered unhealthy and the load balancer doesn't forward connections to them. We will take the following actions to fix this: […]
@jobara I've made the necessary changes in the branch associated with this issue. Let me know if you'd like me to create a PR for it.
@marvinroman thanks for working on this. Yes, please file a PR for the changes.
Regarding the health check: glancing at your branch, it looks like it checks the DB now, but I guess that won't really tell us whether the site is actually being served up properly. Is there a way to check different things depending on whether the site is in maintenance mode or not? Regarding turning maintenance mode on/off in the global deploy: will that affect the original instance as well, and not just the two new ones that are in the process of spinning up?
@marvinroman also, in your branch I noticed that it brings the site back up after 5 minutes. These kinds of timers are always risky, since we don't know whether the task is still running or finished some time earlier. Is it possible to get a hook into when the pods are actually being used, and/or when the old pods are all removed?
This is a health check of the pod, not of the site, used to decide whether the load balancer should forward connections to the pod. In other words, it asks whether the pod's services are running properly. We have a separate external check that determines site health and will notify us of site issues. When maintenance mode is activated, it applies across all the pods.
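The pod-versus-site distinction above can be sketched as a small decision function. This is not the code from the branch under review; `health_status` and its parameters are hypothetical names used only to illustrate why maintenance mode alone must not fail the pod check:

```python
def health_status(check_db, in_maintenance):
    """Return an HTTP status code for a pod health endpoint (sketch).

    Unlike a whole-site check, this only asks whether the pod's
    services (here, the database) are reachable. Maintenance mode is
    deliberately ignored: if it failed the check, the load balancer
    would drop every pod and the site could never come back up.
    """
    if not check_db():
        return 503  # pod genuinely unhealthy: stop forwarding traffic
    return 200      # healthy, even while the app is in maintenance mode
```

The key design choice is that `in_maintenance` never influences the result, which matches the behaviour described above: maintenance mode applies to all pods at once, so treating it as "unhealthy" would take the whole deployment out of rotation.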
I agree that there are risks associated with a timer, but we haven't found an alternative at this time. We have determined that lifecycle hooks aren't possible to use in our infrastructure at this time. |
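For context on the timer trade-off discussed above, one way the risk could in principle be reduced is to poll for the disappearance of the old pods and keep the fixed timeout only as a safety net. This is a hypothetical sketch, not the team's implementation: it assumes pod status can be queried at all (e.g. via some wrapper around kubectl, injected here as `old_pods_remaining`), which may not hold in this infrastructure:

```python
import time

def wait_for_old_pods(old_pods_remaining, timeout=300, interval=5,
                      sleep=time.sleep):
    """Poll until no old pods remain, up to a hard timeout (sketch).

    old_pods_remaining: callable returning the count of old pods still
    running (hypothetical; e.g. a wrapper around a kubectl query).
    Returns True if the old pods are gone, False if the timeout hit.
    """
    waited = 0
    while waited < timeout:
        if old_pods_remaining() == 0:
            return True   # safe to bring the site out of maintenance mode
        sleep(interval)
        waited += interval
    return False          # timed out; old pods may still be serving
```

Compared with a bare 5-minute timer, this usually ends the maintenance window as soon as the rollout actually completes, and only degrades to timer-like behaviour when the poll never succeeds.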
Describe the bug
In our current rolling deploy system, as new pods are being deployed, an old pod sticks around until the new ones are ready for use. However, there is a single shared database that all the pods connect to. The issue is that a user may be interacting with the old pod after the database has been migrated to a new structure. This could lead to data corruption and/or 500 errors for the user, because the old application's expectations of the data no longer match the current database schema.
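The failure mode described above can be reproduced in miniature. The table and column names here are hypothetical, chosen only to show an "old pod" query breaking after a "new pod" migration against the shared database:

```python
import sqlite3

# Shared database, original schema (hypothetical example schema).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users (name) VALUES ('alice')")

# The old application code queries the original schema and works fine.
old_query = "SELECT name FROM users"
assert db.execute(old_query).fetchone() == ("alice",)

# A new pod's migration changes the schema underneath the old pod.
db.execute("ALTER TABLE users RENAME COLUMN name TO full_name")

# The old pod, still serving traffic, now fails against the migrated DB.
try:
    db.execute(old_query)
    outcome = "ok"
except sqlite3.OperationalError:
    outcome = "error"  # surfaces to the user as a 500
```

Destructive migrations (drops, renames, type changes) trigger this immediately; purely additive migrations are usually safe for the old code, which is why some teams restrict rolling deploys to additive schema changes.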
Expected behavior
We should minimize or eliminate the possibility of the old application and the new database interacting with each other.