[BUG]rabbitmq cluster created failed when setting 3.8.14 as service version #8906

Closed
tianyue86 opened this issue Feb 11, 2025 · 1 comment · Fixed by apecloud/kubeblocks-addons#1479

tianyue86 commented Feb 11, 2025

Describe the env

Kubernetes: v1.31.1-aliyun.1
KubeBlocks: 1.0.0-beta.28
kbcli: 1.0.0-beta.11

To Reproduce
Steps to reproduce the behavior:

  1. Create a RabbitMQ cluster with the YAML below
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: rabbitmq-donotd
  namespace: default
spec:
  clusterDef: rabbitmq
  topology: clustermode
  terminationPolicy: DoNotTerminate
  componentSpecs:
    - name: rabbitmq
      serviceVersion: 3.8.14
      replicas: 3
      resources:
        requests:
          cpu: 500m
          memory: 0.5Gi
        limits:
          cpu: 500m
          memory: 0.5Gi
      volumeClaimTemplates:
        - name: data
          spec:
            storageClassName:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
  2. Check cluster status
k get cluster -A
NAMESPACE   NAME              CLUSTER-DEFINITION   TERMINATION-POLICY   STATUS     AGE
default     rabbitmq-donotd   rabbitmq             DoNotTerminate       Failed     105s

k get pod
NAME                         READY   STATUS             RESTARTS        AGE
rabbitmq-donotd-rabbitmq-0   1/2     CrashLoopBackOff   2 (12s ago)     62s

k describe pod rabbitmq-donotd-rabbitmq-0
Events:
  Type     Reason                  Age                    From                     Message
  ----     ------                  ----                   ----                     -------
  Normal   Scheduled               4m55s                  default-scheduler        Successfully assigned default/rabbitmq-donotd-rabbitmq-0 to cn-zhangjiakou.10.0.0.128
  Normal   SuccessfulAttachVolume  4m55s                  attachdetach-controller  AttachVolume.Attach succeeded for volume "d-8vb2ic9tapwfvcgj6l9p"
  Normal   AllocIPSucceed          4m45s                  terway-daemon            Alloc IP 10.0.0.62/24 took 30.787677ms
  Normal   Pulled                  4m45s                  kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/kubeblocks-tools:1.0.0-beta.28" already present on machine
  Normal   Created                 4m45s                  kubelet                  Created container init-kbagent
  Normal   Started                 4m45s                  kubelet                  Started container init-kbagent
  Normal   Pulled                  4m45s                  kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/rabbitmq:3.13.2-management" already present on machine
  Normal   Started                 4m44s                  kubelet                  Started container kbagent-worker
  Normal   Created                 4m44s                  kubelet                  Created container kbagent-worker
  Normal   Pulling                 4m44s                  kubelet                  Pulling image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/rabbitmq:3.8.14-management"
  Normal   Pulled                  4m40s                  kubelet                  Successfully pulled image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/rabbitmq:3.8.14-management" in 3.282s (3.282s including waiting). Image size: 87254007 bytes.
  Normal   Pulled                  4m40s                  kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/rabbitmq:3.13.2-management" already present on machine
  Normal   Created                 4m40s                  kubelet                  Created container kbagent
  Normal   Started                 4m40s                  kubelet                  Started container kbagent
  Warning  Unhealthy               4m35s                  kubelet                  Startup probe failed: dial tcp 10.0.0.62:5672: connect: connection refused
  Normal   Created                 4m12s (x3 over 4m40s)  kubelet                  Created container rabbitmq
  Normal   Started                 4m12s (x3 over 4m40s)  kubelet                  Started container rabbitmq
  Normal   Pulled                  4m12s (x2 over 4m34s)  kubelet                  Container image "apecloud-registry.cn-zhangjiakou.cr.aliyuncs.com/apecloud/rabbitmq:3.8.14-management" already present on machine
  Warning  BackOff                 3m50s (x5 over 4m28s)  kubelet                  Back-off restarting failed container rabbitmq in pod rabbitmq-donotd-rabbitmq-0_default(13e1b5f5-2a17-45fd-88c0-3919ac43dbd7)
  3. See the error
Configuring logger redirection
06:38:52.306 [error] You've tried to set log.file.rotation.compress, but there is no setting with that name.
06:38:52.306 [error]   Did you mean one of these?
06:38:52.426 [error]     log.file.rotation.count
06:38:52.426 [error]     log.file.rotation.date
06:38:52.426 [error]     log.file.rotation.size
06:38:52.427 [error] Error preparing configuration in phase transform_datatypes:
06:38:52.427 [error]   - Conf file attempted to set unknown variable: log.file.rotation.compress
06:38:52.429 [error] 

06:38:52.429 [error] BOOT FAILED
BOOT FAILED
06:38:52.429 [error] ===========
===========
06:38:52.429 [error] Error during startup: {error,failed_to_prepare_configuration}
Error during startup: {error,failed_to_prepare_configuration}
06:38:52.429 [error] 

06:38:53.431 [error] Supervisor rabbit_prelaunch_sup had child prelaunch started with rabbit_prelaunch:run_prelaunch_first_phase() at undefined exit with reason failed_to_prepare_configuration in context start_error
06:38:53.431 [error] CRASH REPORT Process <0.152.0> with 0 neighbours exited with reason: {{shutdown,{failed_to_start_child,prelaunch,failed_to_prepare_configuration}},{rabbit_prelaunch_app,start,[normal,[]]}} in application_master:init/4 line 138
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbitmq_prelaunch,{{shutdown,{failed_to_start_child,prelaunch,failed_to_prepare_configuration}},{rabbit_prelaunch_app,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,rabbitmq_prelaunch,{{shutdown,{failed_to_start_child,prelaunch,failed_to_prepare_configuration}},{rabbit_prelaunch_app,start,

Crash dump is being written to: erl_crash.dump...done
  4. Further testing:
===>The 3.8.14 service version leads to cluster creation failure
k get cluster -A
NAMESPACE   NAME                  CLUSTER-DEFINITION   TERMINATION-POLICY   STATUS     AGE
default     rabbitmq-donotd       rabbitmq             DoNotTerminate       Failed     105s
default     rabbitmq-jytplg       rabbitmq             WipeOut              Updating   5m39s
default     rabbitmq-qykmgk       rabbitmq             DoNotTerminate       Failed     15m
default     rabbitmq-zozegi       rabbitmq             Delete               Running    18m
default     rabbitmq-tryversion   rabbitmq             DoNotTerminate       Running    94s

@tianyue86 tianyue86 added the kind/bug Something isn't working label Feb 11, 2025
@tianyue86 tianyue86 added this to the Release 1.0.0 milestone Feb 11, 2025
shanshanying (Contributor) commented

terminationPolicy won't affect Cluster creation. Please check serviceVersion.
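
For anyone checking the serviceVersion side: the boot log in the report shows the 3.8.14 image rejecting `log.file.rotation.compress` and suggesting `log.file.rotation.count`, `log.file.rotation.date` and `log.file.rotation.size` instead. Below is a minimal Python sketch of a pre-flight check that flags settings a given version does not know about. It is not how RabbitMQ or KubeBlocks validates configuration. The function `unknown_settings`, the key set `KNOWN_KEYS_3_8_14` (limited to the three keys named in the log) and the sample config values are all hypothetical and exist only for illustration.

```python
import difflib

# Assumption: only the three keys that the 3.8.14 boot log itself suggests are
# listed here. The real schema for that version contains many more settings.
KNOWN_KEYS_3_8_14 = {
    "log.file.rotation.count",
    "log.file.rotation.date",
    "log.file.rotation.size",
}


def unknown_settings(conf_text, known_keys):
    """Return (key, suggestions) pairs for settings missing from known_keys."""
    problems = []
    for raw in conf_text.splitlines():
        # Drop '#' comments and surrounding whitespace (a simplification:
        # values that contain '#' are not handled).
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key = line.split("=", 1)[0].strip()
        if key not in known_keys:
            hints = difflib.get_close_matches(key, sorted(known_keys), n=3, cutoff=0.6)
            problems.append((key, hints))
    return problems


# Hypothetical rabbitmq.conf fragment; the values are made up for illustration.
SAMPLE_CONF = """
log.file.rotation.size = 10485760
log.file.rotation.count = 5
log.file.rotation.compress = true
"""

for key, hints in unknown_settings(SAMPLE_CONF, KNOWN_KEYS_3_8_14):
    print(f"unknown setting: {key}; close matches: {', '.join(hints)}")
```

With a complete key list per serviceVersion, a check like this could run against the rendered config before the pod starts, instead of surfacing as a CrashLoopBackOff.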

@shanshanying shanshanying removed their assignment Feb 12, 2025
@tianyue86 tianyue86 changed the title [BUG]rabbitmq cluster created failed when setting DoNotTerminate for terminationPolicy [BUG]rabbitmq cluster created failed when setting 3.8.14 as service version Feb 12, 2025
@xuriwuyun xuriwuyun linked a pull request Feb 19, 2025 that will close this issue