wrong users generation causing clickhouse crash #1332
Comments
For more information, my operator config:
```yaml
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: operatorgroup
  namespace: mynamespace
spec:
  targetNamespaces:
    - mynamespace
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: clickhouse
  namespace: mynamespace
spec:
  channel: latest
  name: clickhouse
  source: operatorhubio-catalog
  sourceNamespace: olm
  installPlanApproval: Automatic
---
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseOperatorConfiguration"
metadata:
  name: "chop-config-01"
  namespace: "mynamespace"
spec:
  ################################################
  ##
  ## Watch Section
  ##
  ################################################
  watch:
    # List of namespaces where clickhouse-operator watches for events.
    # Concurrently running operators should watch on different namespaces
    #namespaces: ["dev", "test"]
    namespaces: []
  clickhouse:
    configuration:
      ################################################
      ##
      ## Configuration Files Section
      ##
      ################################################
      file:
        path:
          # Path to the folder where ClickHouse configuration files common for all instances within a CHI are located.
          common: config.d
          # Path to the folder where ClickHouse configuration files unique for each instance (host) within a CHI are located.
          host: conf.d
          # Path to the folder where ClickHouse configuration files with users settings are located.
          # Files are common for all instances within a CHI.
          user: users.d
      ################################################
      ##
      ## Configuration Users Section
      ##
      ################################################
      user:
        default:
          # Default values for ClickHouse user configuration
          # 1. user/profile - string
          # 2. user/quota - string
          # 3. user/networks/ip - multiple strings
          # 4. user/password - string
          profile: default
          quota: default
          networksIP:
            - "::1"
            - "127.0.0.1"
          password: "default"
      ################################################
      ##
      ## Configuration Network Section
      ##
      ################################################
      network:
        # Default host_regexp to limit network connectivity from outside
        hostRegexpTemplate: "(chi-{chi}-[^.]+\\d+-\\d+|clickhouse\\-{chi})\\.{namespace}\\.svc\\.cluster\\.local$"
    ################################################
    ##
    ## Access to ClickHouse instances
    ##
    ################################################
    access:
      # ClickHouse credentials (username, password and port) to be used by operator to connect to ClickHouse instances
      # for:
      # 1. Metrics requests
      # 2. Schema maintenance
      # 3. DROP DNS CACHE
      # User with such credentials can be specified in additional ClickHouse .xml config files,
      # located in `chUsersConfigsPath` folder
      username: "clickhouse_operator"
      password: "clickhouse_operator_password"
      secret:
        # Location of k8s Secret with username and password to be used by operator to connect to ClickHouse instances
        # Can be used instead of explicitly specified username and password
        namespace: ""
        name: ""
      # Port where to connect to ClickHouse instances to
      port: 8123
  ################################################
  ##
  ## Templates Section
  ##
  ################################################
  template:
    chi:
      # Path to the folder where ClickHouseInstallation .yaml manifests are located.
      # Manifests are applied in sorted alpha-numeric order.
      path: templates.d
  ################################################
  ##
  ## Reconcile Section
  ##
  ################################################
  reconcile:
    runtime:
      # Max number of concurrent CHI reconciles in progress
      reconcileCHIsThreadsNumber: 10
      # Max number of concurrent shard reconciles in progress
      reconcileShardsThreadsNumber: 1
      # The maximum percentage of cluster shards that may be reconciled in parallel
      reconcileShardsMaxConcurrencyPercent: 50
    statefulSet:
      create:
        # What to do in case created StatefulSet is not in Ready after `statefulSetUpdateTimeout` seconds
        # Possible options:
        # 1. abort - do nothing, just break the process and wait for admin
        # 2. delete - delete newly created problematic StatefulSet
        # 3. ignore - ignore error, pretend nothing happened and move on to the next StatefulSet
        onFailure: ignore
      update:
        # How many seconds to wait for created/updated StatefulSet to be Ready
        timeout: 300
        # How many seconds to wait between checks for created/updated StatefulSet status
        pollInterval: 5
        # What to do in case updated StatefulSet is not in Ready after `statefulSetUpdateTimeout` seconds
        # Possible options:
        # 1. abort - do nothing, just break the process and wait for admin
        # 2. rollback - delete Pod and rollback StatefulSet to previous Generation.
        #    Pod would be recreated by StatefulSet based on rollback-ed configuration
        # 3. ignore - ignore error, pretend nothing happened and move on to the next StatefulSet
        onFailure: rollback
    host:
      wait:
        exclude: "true"
        include: "false"
  ################################################
  ##
  ## Annotations management
  ##
  ################################################
  annotation:
    # Applied when:
    # 1. Propagating annotations from the CHI's `metadata.annotations` to child objects' `metadata.annotations`,
    # 2. Propagating annotations from the CHI Template's `metadata.annotations` to CHI's `metadata.annotations`,
    # Include annotations from the following list:
    # Applied only when not empty. Empty list means "include all, no selection"
    include: []
    # Exclude annotations from the following list:
    exclude: []
  ################################################
  ##
  ## Labels management
  ##
  ################################################
  label:
    # Applied when:
    # 1. Propagating labels from the CHI's `metadata.labels` to child objects' `metadata.labels`,
    # 2. Propagating labels from the CHI Template's `metadata.labels` to CHI's `metadata.labels`,
    # Include labels from the following list:
    # Applied only when not empty. Empty list means "include all, no selection"
    include: []
    # Exclude labels from the following list:
    exclude: []
    # Whether to append *Scope* labels to StatefulSet and Pod.
    # Full list of available *scope* labels check in labeler.go
    #   LabelShardScopeIndex
    #   LabelReplicaScopeIndex
    #   LabelCHIScopeIndex
    #   LabelCHIScopeCycleSize
    #   LabelCHIScopeCycleIndex
    #   LabelCHIScopeCycleOffset
    #   LabelClusterScopeIndex
    #   LabelClusterScopeCycleSize
    #   LabelClusterScopeCycleIndex
    #   LabelClusterScopeCycleOffset
    appendScope: "no"
  ################################################
  ##
  ## StatefulSet management
  ##
  ################################################
  statefulSet:
    revisionHistoryLimit: 0
  ################################################
  ##
  ## Pod management
  ##
  ################################################
  pod:
    # Grace period for Pod termination.
    # How many seconds to wait between sending
    # SIGTERM and SIGKILL during Pod termination process.
    # Increase this number in case of slow shutdown.
    terminationGracePeriod: 30
  ################################################
  ##
  ## Log parameters
  ##
  ################################################
  logger:
    logtostderr: "true"
    alsologtostderr: "false"
    v: "1"
    stderrthreshold: ""
    vmodule: ""
    log_backtrace_at: ""
```
My clickhouse configuration
```yaml
---
## https://github.com/Altinity/clickhouse-operator/blob/master/deploy/operator/clickhouse-operator-install-bundle.yaml
apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
  name: clickhouse
  namespace: mynamespace
  annotations:
    prometheus.io/scrape: "false"
    prometheus.io/port: "8888"
    ad.datadoghq.com/clickhouse.checks: |
      {
        "openmetrics": {
          "init_config": {},
          "instances": [
            {
              "openmetrics_endpoint": "http://%%host%%:8888/metrics",
              "use_openmetrics": "false"
            }
          ]
        }
      }
    ad.datadoghq.com/clickhouse.logs: '[{"source":"clickhouse","service":"clickhouse_cluster"}]'
    argocd.argoproj.io/sync-wave: "3"
spec:
  configuration:
    settings:
      logger/level: "error"
      # to allow scrape metrics via embedded prometheus protocol
      prometheus/endpoint: /metrics
      prometheus/port: 8888
      prometheus/metrics: false
      prometheus/events: false
      prometheus/asynchronous_metrics: false
    users:
      superset/networks/ip: "::/0"
      superset/password: apassword
      superset/profile: default
    clusters:
      - name: clickhouse
        layout:
          replicasCount: 1
          shardsCount: 1
        templates:
          volumeClaimTemplate: storage-clickhouse
          podTemplate: pod-template
          serviceTemplate: svc-template
    files:
      config.d/01-clickhouse-02-logger.xml: |
        <yandex>
          <logger>
            <!-- Possible levels: https://github.com/pocoproject/poco/blob/develop/Foundation/include/Poco/Logger.h#L105 -->
            <level>error</level>
            <log>/var/log/clickhouse-server/clickhouse-server.log</log>
            <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
            <size>1000M</size>
            <count>10</count>
            <!-- Default behavior is autodetection (log to console if not daemon mode and is tty) -->
            <console>1</console>
          </logger>
        </yandex>
  templates:
    serviceTemplates:
      - name: svc-template
        spec:
          type: ClusterIP
    podTemplates:
      - name: pod-template
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:23.11.5-alpine
              resources:
                limits:
                  cpu: "1"
                  memory: 7Gi
                requests:
                  cpu: "1"
                  memory: 7Gi
    volumeClaimTemplates:
      - name: storage-clickhouse
        spec:
          storageClassName: "csi-cinder-classic"
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Gi
```
The erroneous generated config
Found the reason, work in progress. Please keep an eye on the operatorhub releases; we will publish 0.23.1 soon.
@salimidruide thanks for the detailed description of the case. It really helped us reproduce and catch the issue.
Hello there,
After upgrading to clickhouse-operator 0.23.0, we noticed that the ClickHouseInstallation object has some erroneous injected users configuration:
    users:
      /networks/ip:
        - 10.2.2.100
      /profile: clickhouse_operator
      default/networks/host_regexp: >-
        (chi-clickhouse-[^.]+\d+-\d+|clickhouse\-clickhouse)\.analytics\.svc\.cluster\.local$
      default/networks/ip:
        - '::1'
        - 127.0.0.1
        - 10.2.14.136
      default/profile: default
It seems that the operator is injecting entries with an empty user name: `/networks/ip` and `/profile` have nothing before the first slash.
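A minimal sketch for spotting such entries, assuming the `users` map of the ClickHouseInstallation has already been loaded into a Python dict (for example from the object's YAML). The function name `find_empty_user_keys` is hypothetical and is not part of the operator.

```python
def find_empty_user_keys(users):
    """Return keys of a CHI `users` map whose user-name segment is empty.

    Keys are expected to look like `<user>/<setting path>`; a key such as
    `/networks/ip` has nothing before the first slash.
    """
    bad = []
    for key in users:
        # Everything before the first "/" is taken as the user name.
        user, _, _path = key.partition("/")
        if user.strip() == "":
            bad.append(key)
    return bad


# Sample data copied from the generated configuration reported in this issue.
users = {
    "/networks/ip": ["10.2.2.100"],
    "/profile": "clickhouse_operator",
    "default/networks/ip": ["::1", "127.0.0.1", "10.2.14.136"],
    "default/profile": "default",
}
print(find_empty_user_keys(users))
```

With the sample above, only the two keys that start with a slash are reported.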
clickhouse-server then tries to parse the generated configuration and crashes with the error:
<Error> Application: Code: 36. DB::Exception: Either 'password' or 'password_sha256_hex' or 'password_double_sha1_hex' or 'no_password' or 'ldap' or 'kerberos or 'ssl_certificates' or 'ssh_keys' or 'http_authentication' must be specified for user networks.: while parsing user 'networks' in users configuration file: while loading configuration file '/etc/clickhouse-server/users.xml'. (BAD_ARGUMENTS), Stack trace (when copying this message, always include the lines below):
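One plausible reading of why the error names a user called `networks`, offered as an assumption and not as a description of the operator's or ClickHouse's actual code: if slash-separated keys are expanded into nested sections and the empty leading segment is dropped, then `networks` and `profile` end up at the level where user names are expected. The helper `nest_users` below is a hypothetical, simplified model of that expansion.

```python
def nest_users(flat):
    """Expand `a/b/c: value` keys into nested dicts.

    Assumption: empty path segments are skipped, so `/networks/ip`
    is treated the same as `networks/ip`.
    """
    tree = {}
    for key, value in flat.items():
        parts = [p for p in key.split("/") if p]
        if not parts:
            continue  # nothing usable in this key
        node = tree
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return tree


flat = {
    "/networks/ip": ["10.2.2.100"],
    "/profile": "clickhouse_operator",
    "default/profile": "default",
}
tree = nest_users(flat)
# Top-level names are what would be read as user names in this model.
print(sorted(tree))
```

In this model `networks` sits next to `default` as if it were a user, and it has no password setting, which would be consistent with the message above.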
I am using the latest clickhouse-server version: clickhouse/clickhouse-server:23.11.5-alpine (self-hosted on Kubernetes).
Is there a way to fix it, or a workaround?
Thank you in advance