Incompatibility with Zookeeper 3.9 #53749
Comments
I think it is related to this change in ZK v3.9: https://issues.apache.org/jira/browse/ZOOKEEPER-4492. More detail in the PR: apache/zookeeper#1837 (comment)
…ickhouse and zookeeper 3.9.0, see details in apache/zookeeper#1837 (comment) return `:latest` default value after resolve ClickHouse/ClickHouse#53749
Cool kids use ClickHouse Keeper.
We would but we had stability issues and we believe it was from the |
@bputt-e sorry, but the clickhouse-keeper manifests are not related to clickhouse-keeper itself. @alexey-milovidov the root cause is how eBay/NuRaft stores quorum peers and how it updates the quorum state.
@bputt-e, you are pointing to a third-party ClickHouse operator from Altinity, which is unrelated to ClickHouse, so it can contain mistakes. And having --force-recovery in the operator is 100% a mistake. Do not use this operator with Keeper.
@Slach you are pointing to a new I have no idea why someone needs this command. ClickHouse Keeper works perfectly without a |
@alexey-milovidov Could you explain how to create cluster with 3 Because even when you change XML config with Please don't remove |
I would ask about another completely reasonable scenario: two DCs, A and B, with 3 Keeper nodes in DC A (they participate in the quorum). What if DC A is completely down and we need to switch to DC B, that is, reconfigure the Keeper nodes in DC B without the quorum being up? It's quite a common approach for companies that value their data and the ability to survive any cataclysm.
Until first disaster?
Switching leaders without a quorum can lead to data loss (of the data that was present in the unavailable datacenter). A bulletproof approach is to have three Keeper nodes in three different data centers, but not too far from each other (say, less than 30 ms RTT). An approach where you switch the leader manually makes sense, but only when you can accept data loss. It is similar to, say, changing the master in MySQL replication (a source of many horror stories, especially if done with some automation).
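To make the quorum argument concrete, here is a minimal sketch of majority-quorum arithmetic. It is not Keeper or NuRaft code; the function name `has_quorum` and the datacenter labels are hypothetical, and it only counts voting members (learners are ignored).

```python
def has_quorum(voters_by_dc, failed_dcs):
    """Return True if the surviving voters still form a strict majority.

    voters_by_dc: mapping of datacenter name -> number of voting nodes there.
    failed_dcs:   set of datacenter names that are completely down.
    """
    total = sum(voters_by_dc.values())
    alive = sum(n for dc, n in voters_by_dc.items() if dc not in failed_dcs)
    # A majority quorum needs more than half of ALL configured voters.
    return alive >= total // 2 + 1


# All three voters in DC A: losing DC A leaves no voters at all.
single_dc = has_quorum({"A": 3}, failed_dcs={"A"})

# One voter in each of three DCs: losing any single DC keeps 2 of 3 voters.
three_dcs = has_quorum({"A": 1, "B": 1, "C": 1}, failed_dcs={"A"})
```

In the first layout the cluster cannot make progress after the outage without a forced reconfiguration; in the second it keeps its quorum.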
The datacenter is already gone, at least temporarily, so this DC-specific data is already lost from the user's perspective. Plus, learners should be pretty up to date with the latest changes in Keeper, much more so than ClickHouse replication (just because of the data size).
30 ms RTT is too much for quorum for my taste.
Normal replication in ClickHouse is also for people who can accept data loss (no quorum during write, async replication).
Don't do this, it's an antipattern. Single-node [Zoo]Keeper clusters are good for dev/staging env, but I would not recommend it for production.
They do not care about their data if they reconfigure a coordination service forcefully without a quorum.
But I agree that As for the incompatibility with ZooKeeper 3.9, it's a minor issue because:
So we can reopen this issue and hope that some good person from the community will send us a PR.
* add connection to gcs and use different context for upload incase it got cancel by another thread
* save
* keep ctx
* keep ctx
* use v2
* change to GCS_CLIENT_POOL_SIZE
* pin zookeeper to 3.8.2 version for resolve incompatibility between clickhouse and zookeeper 3.9.0, see details in apache/zookeeper#1837 (comment) return `:latest` default value after resolve ClickHouse/ClickHouse#53749
* Revert "add more precise disk re-balancing for not exists disks, during download, partial fix Altinity#561" This reverts commit 20e250c.
* fix S3 head object Server Side Encryption parameters, fix Altinity#709
* change timeout to 60m, TODO make tests Parallel

Co-authored-by: Slach <bloodjazman@gmail.com>
This commit enables the read-only flag when connecting to the ZooKeeper server. This flag is enabled by sending one extra byte when connecting, and then receiving one extra byte during the first response.

In addition to that, we modify createIfNotExists to not complain about attempting to alter a read-only ZooKeeper cluster if the node already exists.

This makes ClickHouse more useful in the event of a loss of quorum: user credentials are still accessible, which makes it possible to connect to the cluster and run read queries. Any DDL or DML query on a Distributed database or ReplicatedMergeTree table will correctly fail, since it needs to write to ZooKeeper to execute the query. Any non-distributed query will be possible, which is OK: since the query was never replicated in the first place, there is no loss of consistency.

Fixes ClickHouse#53749 as it seems to be the only thing 3.9 enforced.
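As a rough illustration of the "one extra byte" described above, the following sketch builds a ZooKeeper connect request with and without the trailing read-only flag, and parses a connect response that may or may not carry it. This is not the ClickHouse implementation. The field layout (protocol version, last seen zxid, timeout, session id, password buffer, optional read-only boolean, all big-endian and preceded by a 4-byte frame length) is an assumption based on ZooKeeper's `ConnectRequest`/`ConnectResponse` records; the function names and the 30000 ms timeout are hypothetical.

```python
import struct


def build_connect_request(timeout_ms, read_only=None, session_id=0,
                          passwd=b"\x00" * 16, last_zxid_seen=0,
                          protocol_version=0):
    """Return a length-prefixed connect request frame.

    read_only=None omits the trailing byte (the older request shape);
    True/False appends it as a single boolean byte.
    """
    body = struct.pack(">iqiq", protocol_version, last_zxid_seen,
                       timeout_ms, session_id)
    # The password is a length-prefixed byte buffer.
    body += struct.pack(">i", len(passwd)) + passwd
    if read_only is not None:
        body += struct.pack(">?", read_only)  # the one extra byte
    return struct.pack(">i", len(body)) + body


def parse_connect_response(body):
    """Parse a connect response body (frame length already stripped)."""
    header = ">iiqi"  # protocol version, timeout, session id, passwd length
    protocol_version, timeout_ms, session_id, passwd_len = \
        struct.unpack_from(header, body, 0)
    offset = struct.calcsize(header)
    passwd = body[offset:offset + passwd_len]
    offset += passwd_len
    # The server echoes one extra byte only if the flag is part of the exchange.
    read_only = bool(body[offset]) if len(body) > offset else None
    return {
        "protocol_version": protocol_version,
        "timeout_ms": timeout_ms,
        "session_id": session_id,
        "passwd": passwd,
        "read_only": read_only,
    }


legacy = build_connect_request(30000)                    # no flag byte
flagged = build_connect_request(30000, read_only=False)  # flag byte appended
```

Under these assumptions the two frames differ only by the final byte and by the frame length field that precedes the body.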
There is an incompatibility of ClickHouse with Zookeeper 3.9. See:
- apache/zookeeper#2146
- apache/zookeeper#1837
- ClickHouse/ClickHouse#53749
It seems ZK 3.9 has changed something in its protocol and ClickHouse can't connect to it.
The error seems to be related to the handshake:
ZK 3.8.2 is fine.
Keeper is fine too.