Fix change notification of backend shard #835

Merged 1 commit into master from jm-fix-shard on Aug 9, 2021

Conversation

jcmoraisjr (Owner)

HAProxy backends can be sharded into smaller pieces so that unchanged backends don't need to be rebuilt, reducing the time spent regenerating the configuration file, especially on deployments with thousands of backends.
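
For illustration only, a minimal sketch of the sharding idea, assuming backends are assigned to a shard by hashing the backend name modulo the configured shard count; `backendShard` and the shard count of `5` are hypothetical names for this example, not the controller's actual API:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// backendShard assigns a backend to one of shardCount shards, so only the
// shard files whose backends changed need to be regenerated.
// Hypothetical helper for illustration, not the controller's implementation.
func backendShard(name string, shardCount int) int {
	h := fnv.New32a()
	h.Write([]byte(name))
	return int(h.Sum32() % uint32(shardCount))
}

func main() {
	for _, name := range []string{"default_app_8080", "default_web_80"} {
		fmt.Printf("%s -> shard %d\n", name, backendShard(name, 5))
	}
}
```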

HAProxy Ingress controls which shards should be rebuilt via the partial parsing implementation of the ingress converter. Partial parsing tracks everything that needs to be rebuilt, including backends. The internal HAProxy model decides which shards to rebuild based on the backends tracked by the ingress converter. This means that changes made outside this controlled environment can fall out of sync with the configuration saved to disk, which is what HAProxy reads when it is restarted. This desynchronization may eventually be fixed by newly received events, but sometimes those events don't trigger a HAProxy reload, leaving the internal model in sync with the disk but out of sync with the running HAProxy instance.
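
A rough sketch of the dirty-shard bookkeeping described above; `shardedBackends`, `AcquireBackend` and the shard function are hypothetical names chosen for the example, not the controller's real types:

```go
package main

import "fmt"

// Backend is a minimal stand-in for the internal backend model.
type Backend struct {
	Name      string
	Endpoints []string
}

// shardedBackends groups backends into shards and remembers which shards
// were touched, so only those shard files are regenerated on the next sync.
type shardedBackends struct {
	shards  []map[string]*Backend
	changed map[int]bool
	shardOf func(name string) int
}

func newShardedBackends(count int, shardOf func(string) int) *shardedBackends {
	s := &shardedBackends{
		shards:  make([]map[string]*Backend, count),
		changed: map[int]bool{},
		shardOf: shardOf,
	}
	for i := range s.shards {
		s.shards[i] = map[string]*Backend{}
	}
	return s
}

// AcquireBackend returns the named backend, creating it if needed, and marks
// its shard as changed, mirroring how tracked backends flag shard rewrites.
func (s *shardedBackends) AcquireBackend(name string) *Backend {
	idx := s.shardOf(name)
	b, found := s.shards[idx][name]
	if !found {
		b = &Backend{Name: name}
		s.shards[idx][name] = b
	}
	s.changed[idx] = true
	return b
}

func main() {
	sb := newShardedBackends(3, func(name string) int { return len(name) % 3 })
	sb.AcquireBackend("default_app_8080").Endpoints = []string{"10.0.0.1:8080"}
	fmt.Println("shards to rewrite:", sb.changed)
}
```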

This desynchronization happens when HAProxy needs to be reloaded: all backends are iterated and new empty endpoints are added if needed. If a backend that received new endpoints wasn't tracked due to API changes, it is considered a read-only backend and its configuration file won't be updated. However, the internal model has this new endpoint and will eventually use it, leading to a "No such server" response from the HAProxy client socket, which the controller was ignoring up to v0.12.5.

This behavior has been mitigated since v0.12.6, where unexpected responses trigger a reload with synced configuration files, and this update fixes a root cause of the problem.
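
As a sketch of the idea behind the fix, under the assumption of hypothetical `backend` and `padEndpoints` names rather than the actual patch: a backend that gains empty endpoint slots during the reload path also reports its shard as changed, so the shard file on disk is rewritten before HAProxy is restarted and stays in sync with the in-memory model.

```go
package main

import "fmt"

// backend is a minimal stand-in for the internal backend model.
type backend struct {
	name      string
	shard     int
	endpoints []string
}

// padEndpoints adds empty endpoint slots before a reload and reports which
// shards were touched, so the caller can regenerate those shard files too,
// even for backends that weren't tracked by the converter.
func padEndpoints(backends []*backend, want int) map[int]bool {
	changed := map[int]bool{}
	for _, b := range backends {
		for len(b.endpoints) < want {
			b.endpoints = append(b.endpoints, "") // empty slot, enabled later via the admin socket
			changed[b.shard] = true               // shard file must be rewritten before the reload
		}
	}
	return changed
}

func main() {
	bs := []*backend{
		{name: "default_app_8080", shard: 2, endpoints: []string{"10.0.0.1:8080"}},
		{name: "default_web_80", shard: 0},
	}
	fmt.Println("shards to rewrite:", padEndpoints(bs, 2))
}
```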

This should be merged back as far as v0.11.

Related to #810
Should fix the root cause of #807

jcmoraisjr merged commit 9238d5d into master on Aug 9, 2021
jcmoraisjr deleted the jm-fix-shard branch on Aug 9, 2021, 21:45