OCPBUGS-46010: wait for at least 3 kube-apiserver instances #9302

Merged (1 commit) on Jan 15, 2025

@tkashem tkashem commented Dec 10, 2024

No description provided.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference (Indicates that this PR references a valid Jira ticket of any type.) and jira/invalid-bug (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) labels Dec 10, 2024
@openshift-ci-robot
Contributor

@tkashem: This pull request references Jira Issue OCPBUGS-45924, which is invalid:

  • expected the bug to target the "4.19.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) Dec 10, 2024
@tkashem
Contributor Author

tkashem commented Dec 10, 2024

/cc @benluddy

@openshift-ci openshift-ci bot requested a review from benluddy December 10, 2024 19:51
@benluddy
Contributor

This wouldn't resolve whatever is allowing concurrent etcd installers in 45924, so I opened a separate bug to track.

/retitle [WIP] OCPBUGS-46010: wait for at least 3 kube-apiserver instances

@openshift-ci openshift-ci bot changed the title from "[WIP] OCPBUGS-45924: wait for at least 3 kube-apiserver instances" to "[WIP] OCPBUGS-46010: wait for at least 3 kube-apiserver instances" Dec 10, 2024
@openshift-ci-robot
Contributor

@tkashem: This pull request references Jira Issue OCPBUGS-46010, which is invalid:

  • expected the bug to target the "4.19.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.


@@ -57,7 +57,7 @@ is_topology_ha() {

##
## for HA cluster, we mark the bootstrap process as complete when there
## are at least two IP addresses available to the endpoints

It's worth updating to mention here that the bootstrap instance is itself one of the endpoints. We don't want to tear down the bootstrap instance until there are enough permanent instances to tolerate 1 unavailable instance.
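
The arithmetic behind this review comment can be sketched roughly as follows. This is only an illustration: the function names are hypothetical and are not taken from the installer script, and the threshold of two permanent instances is an assumption read from "tolerate 1 unavailable instance" together with the PR title.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; not part of the installer script.

# $1: total number of kube-apiserver endpoint addresses, bootstrap included.
# Prints how many permanent instances remain once the bootstrap one is gone.
permanent_after_teardown() {
    local total_endpoints="$1"
    # The bootstrap instance accounts for exactly one of the endpoints.
    echo $(( total_endpoints - 1 ))
}

# Succeeds when at least two permanent instances would remain after teardown,
# so that one of them can be unavailable while another still serves requests.
# The value 2 is an assumption, see the paragraph above.
safe_to_teardown_bootstrap() {
    local total_endpoints="$1"
    local permanent
    permanent="$(permanent_after_teardown "${total_endpoints}")"
    [ "${permanent}" -ge 2 ]
}

for n in 2 3 4; do
    if safe_to_teardown_bootstrap "${n}"; then
        echo "${n} endpoints: safe to tear down bootstrap"
    else
        echo "${n} endpoints: keep waiting"
    fi
done
```

Under these assumptions, two endpoints are not enough, because one of the two is the bootstrap instance itself.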

@tkashem tkashem force-pushed the fix-wait-for-ha-api branch from 62abaf4 to 19e98f9 Compare December 11, 2024 14:19
@benluddy
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) Dec 12, 2024
@tkashem tkashem changed the title from "[WIP] OCPBUGS-46010: wait for at least 3 kube-apiserver instances" to "OCPBUGS-46010: wait for at least 3 kube-apiserver instances" Dec 12, 2024
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) Dec 12, 2024
@tkashem
Contributor Author

tkashem commented Dec 13, 2024

/cc @patrickdillon

@patrickdillon
Contributor

/approve

@openshift-ci openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Dec 13, 2024
@patrickdillon
Contributor

/test ?

Contributor

openshift-ci bot commented Dec 13, 2024

@patrickdillon: The following commands are available to trigger required jobs:

/test altinfra-images
/test aro-unit
/test artifacts-images
/test e2e-agent-compact-ipv4
/test e2e-aws-ovn
/test e2e-aws-ovn-edge-zones-manifest-validation
/test e2e-aws-ovn-upi
/test e2e-azure-ovn
/test e2e-azure-ovn-upi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upi
/test e2e-metal-ipi-ovn-ipv6
/test e2e-openstack-ovn
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi
/test gofmt
/test golint
/test govet
/test images
/test integration-tests
/test integration-tests-nodejoiner
/test openstack-manifests
/test shellcheck
/test terraform-images
/test terraform-verify-vendor
/test tf-lint
/test unit
/test verify-codegen
/test verify-vendor
/test yaml-lint

The following commands are available to trigger optional jobs:

/test altinfra-e2e-aws-custom-security-groups
/test altinfra-e2e-aws-ovn
/test altinfra-e2e-aws-ovn-fips
/test altinfra-e2e-aws-ovn-imdsv2
/test altinfra-e2e-aws-ovn-localzones
/test altinfra-e2e-aws-ovn-proxy
/test altinfra-e2e-aws-ovn-shared-vpc
/test altinfra-e2e-aws-ovn-shared-vpc-local-zones
/test altinfra-e2e-aws-ovn-shared-vpc-wavelength-zones
/test altinfra-e2e-aws-ovn-single-node
/test altinfra-e2e-aws-ovn-wavelengthzones
/test altinfra-e2e-azure-capi-ovn
/test altinfra-e2e-azure-ovn-shared-vpc
/test altinfra-e2e-gcp-capi-ovn
/test altinfra-e2e-gcp-ovn-byo-network-capi
/test altinfra-e2e-gcp-ovn-secureboot-capi
/test altinfra-e2e-gcp-ovn-xpn-capi
/test altinfra-e2e-ibmcloud-capi-ovn
/test altinfra-e2e-nutanix-capi-ovn
/test altinfra-e2e-openstack-capi-ccpmso
/test altinfra-e2e-openstack-capi-ccpmso-zone
/test altinfra-e2e-openstack-capi-dualstack
/test altinfra-e2e-openstack-capi-dualstack-upi
/test altinfra-e2e-openstack-capi-dualstack-v6primary
/test altinfra-e2e-openstack-capi-externallb
/test altinfra-e2e-openstack-capi-nfv-intel
/test altinfra-e2e-openstack-capi-ovn
/test altinfra-e2e-openstack-capi-proxy
/test altinfra-e2e-vsphere-capi-multi-vcenter-ovn
/test altinfra-e2e-vsphere-capi-ovn
/test altinfra-e2e-vsphere-capi-static-ovn
/test altinfra-e2e-vsphere-capi-zones
/test azure-ovn-marketplace-images
/test e2e-agent-4control-ipv4
/test e2e-agent-5control-ipv4
/test e2e-agent-compact-ipv4-appliance-diskimage
/test e2e-agent-compact-ipv4-none-platform
/test e2e-agent-compact-ipv6-minimaliso
/test e2e-agent-ha-dualstack
/test e2e-agent-sno-ipv4-pxe
/test e2e-agent-sno-ipv6
/test e2e-aws-default-config
/test e2e-aws-overlay-mtu-ovn-1200
/test e2e-aws-ovn-custom-iam-profile
/test e2e-aws-ovn-edge-zones
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-heterogeneous
/test e2e-aws-ovn-imdsv2
/test e2e-aws-ovn-proxy
/test e2e-aws-ovn-public-ipv4-pool
/test e2e-aws-ovn-public-ipv4-pool-disabled
/test e2e-aws-ovn-public-subnets
/test e2e-aws-ovn-shared-vpc-custom-security-groups
/test e2e-aws-ovn-shared-vpc-edge-zones
/test e2e-aws-ovn-single-node
/test e2e-aws-ovn-techpreview
/test e2e-aws-ovn-upgrade
/test e2e-aws-ovn-workers-rhel8
/test e2e-aws-upi-proxy
/test e2e-azure-default-config
/test e2e-azure-ovn-resourcegroup
/test e2e-azure-ovn-shared-vpc
/test e2e-azure-ovn-techpreview
/test e2e-azurestack
/test e2e-azurestack-upi
/test e2e-crc
/test e2e-external-aws
/test e2e-external-aws-ccm
/test e2e-gcp-default-config
/test e2e-gcp-ovn-byo-vpc
/test e2e-gcp-ovn-heterogeneous
/test e2e-gcp-ovn-techpreview
/test e2e-gcp-ovn-xpn
/test e2e-gcp-secureboot
/test e2e-gcp-upgrade
/test e2e-gcp-upi-xpn
/test e2e-gcp-user-provisioned-dns
/test e2e-ibmcloud-ovn
/test e2e-metal-assisted
/test e2e-metal-ipi-ovn
/test e2e-metal-ipi-ovn-dualstack
/test e2e-metal-ipi-ovn-swapped-hosts
/test e2e-metal-ipi-ovn-virtualmedia
/test e2e-metal-single-node-live-iso
/test e2e-nutanix-ovn
/test e2e-openstack-ccpmso
/test e2e-openstack-ccpmso-zone
/test e2e-openstack-dualstack
/test e2e-openstack-dualstack-upi
/test e2e-openstack-externallb
/test e2e-openstack-nfv-intel
/test e2e-openstack-proxy
/test e2e-openstack-singlestackv6
/test e2e-powervs-capi-ovn
/test e2e-vsphere-multi-vcenter-ovn
/test e2e-vsphere-ovn-multi-network
/test e2e-vsphere-ovn-techpreview
/test e2e-vsphere-ovn-upi-zones
/test e2e-vsphere-ovn-zones
/test e2e-vsphere-ovn-zones-techpreview
/test e2e-vsphere-static-ovn
/test okd-scos-e2e-aws-ovn
/test okd-scos-images
/test tf-fmt

Use /test all to run the following jobs that were automatically triggered:

pull-ci-openshift-installer-master-altinfra-images
pull-ci-openshift-installer-master-aro-unit
pull-ci-openshift-installer-master-artifacts-images
pull-ci-openshift-installer-master-e2e-aws-ovn
pull-ci-openshift-installer-master-gofmt
pull-ci-openshift-installer-master-golint
pull-ci-openshift-installer-master-govet
pull-ci-openshift-installer-master-images
pull-ci-openshift-installer-master-okd-scos-e2e-aws-ovn
pull-ci-openshift-installer-master-shellcheck
pull-ci-openshift-installer-master-tf-fmt
pull-ci-openshift-installer-master-tf-lint
pull-ci-openshift-installer-master-unit
pull-ci-openshift-installer-master-verify-codegen
pull-ci-openshift-installer-master-verify-vendor
pull-ci-openshift-installer-master-yaml-lint

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@benluddy
Contributor

/jira refresh

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug label (Indicates that a referenced Jira bug is valid for the branch this PR is targeting.) and removed the jira/invalid-bug label (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) Dec 13, 2024
@openshift-ci-robot
Contributor

@benluddy: This pull request references Jira Issue OCPBUGS-46010, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.19.0) matches configured target version for branch (4.19.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @wangke19

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

/jira refresh


@openshift-ci openshift-ci bot requested a review from wangke19 December 13, 2024 18:37
@patrickdillon
Contributor

/test e2e-agent-compact-ipv4

@tkashem
Contributor Author

tkashem commented Dec 13, 2024

/hold

(make sure agent based install works)

@tkashem
Contributor Author

tkashem commented Jan 13, 2025

tested the new command against a cluster

$ oc --kubeconfig="$KUBECONFIG" get kubeapiservers cluster -o jsonpath='{range @.status.nodeStatuses[?(@.currentRevision>0)]}{.nodeName}{" "}{end}'

ip-10-0-121-118.us-east-2.compute.internal ip-10-0-16-56.us-east-2.compute.internal ip-10-0-58-223.us-east-2.compute.internal
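
A check built on this command could look roughly like the sketch below, which counts the node names it prints and compares the count with a required minimum. The helper names, the default of 3, and the 5-second polling interval are assumptions made for illustration; this is not the code merged in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; not the installer's actual code.

# Count the whitespace-separated node names in the jsonpath output.
count_nodes() {
    local names="$1"
    local -a nodes
    # Split on whitespace; an empty string yields an empty array.
    read -r -a nodes <<< "${names}"
    echo "${#nodes[@]}"
}

# Succeeds when at least `required` nodes (default 3, an assumption based on
# the PR title) are listed in the output.
have_enough_nodes() {
    local names="$1"
    local required="${2:-3}"
    [ "$(count_nodes "${names}")" -ge "${required}" ]
}

# Poll the KubeAPIServer operator status until enough nodes report a current
# revision greater than zero. Defined here but not invoked, since it needs a
# live cluster and the oc client.
wait_for_kube_apiservers() {
    local names
    until names="$(oc --kubeconfig="$KUBECONFIG" get kubeapiservers cluster \
            -o jsonpath='{range @.status.nodeStatuses[?(@.currentRevision>0)]}{.nodeName}{" "}{end}')" \
          && have_enough_nodes "${names}" 3; do
        sleep 5
    done
}
```

With the output shown above, `count_nodes` would see three names, so `have_enough_nodes` would succeed for a minimum of 3.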

@tkashem
Contributor Author

tkashem commented Jan 13, 2025

$ oc --kubeconfig="$KUBECONFIG" get kubeapiservers cluster -o yaml

apiVersion: operator.openshift.io/v1
kind: KubeAPIServer
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    release.openshift.io/create-only: "true"
  creationTimestamp: "2025-01-13T19:31:19Z"
  generation: 8
  name: cluster
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: 3ac2fb0f-db7b-48e5-8ef8-e32e8ebfb69a
  resourceVersion: "32678"
  uid: f651e87a-6378-4d76-9e78-16d8db8a79fc
spec:
  logLevel: Normal
  managementState: Managed
  observedConfig:
    admission:
      pluginConfig:
        PodSecurity:
          configuration:
            defaults:
              audit: restricted
              audit-version: latest
              enforce: restricted
              enforce-version: latest
              warn: restricted
              warn-version: latest
        network.openshift.io/ExternalIPRanger:
          configuration:
            allowIngressIP: false
            apiVersion: network.openshift.io/v1
            kind: ExternalIPRangerAdmissionConfig
        network.openshift.io/RestrictedEndpointsAdmission:
          configuration:
            apiVersion: network.openshift.io/v1
            kind: RestrictedEndpointsAdmissionConfig
            restrictedCIDRs:
            - 10.128.0.0/14
            - 172.30.0.0/16
    apiServerArguments:
      api-audiences:
      - https://kubernetes.default.svc
      authentication-token-webhook-config-file:
      - /etc/kubernetes/static-pod-resources/secrets/webhook-authenticator/kubeConfig
      authentication-token-webhook-version:
      - v1
      etcd-servers:
      - https://10.0.121.118:2379
      - https://10.0.16.56:2379
      - https://10.0.58.223:2379
      - https://localhost:2379
      feature-gates:
      - AWSEFSDriverVolumeMetrics=true
      - AdditionalRoutingCapabilities=true
      - AdminNetworkPolicy=true
      - AlibabaPlatform=true
      - AzureWorkloadIdentity=true
      - BareMetalLoadBalancer=true
      - BuildCSIVolumes=true
      - ChunkSizeMiB=true
      - CloudDualStackNodeIPs=true
      - DisableKubeletCloudCredentialProviders=true
      - GCPLabelsTags=true
      - HardwareSpeed=true
      - IngressControllerLBSubnetsAWS=true
      - KMSv1=true
      - ManagedBootImages=true
      - ManagedBootImagesAWS=true
      - MultiArchInstallAWS=true
      - MultiArchInstallGCP=true
      - NetworkDiagnosticsConfig=true
      - NetworkLiveMigration=true
      - NewOLM=true
      - NodeDisruptionPolicy=true
      - OpenShiftPodSecurityAdmission=true
      - PrivateHostedZoneAWS=true
      - SetEIPForNLBIngressController=true
      - VSphereControlPlaneMachineSet=true
      - VSphereDriverConfiguration=true
      - VSphereMultiVCenters=true
      - VSphereStaticIPs=true
      - ValidatingAdmissionPolicy=true
      - AWSClusterHostedDNS=false
      - AutomatedEtcdBackup=false
      - BootcNodeManagement=false
      - CPMSMachineNamePrefix=false
      - ClusterAPIInstall=false
      - ClusterAPIInstallIBMCloud=false
      - ClusterMonitoringConfig=false
      - ClusterVersionOperatorConfiguration=false
      - ConsolePluginContentSecurityPolicy=false
      - DNSNameResolver=false
      - DynamicResourceAllocation=false
      - EtcdBackendQuota=false
      - EventedPLEG=false
      - Example=false
      - ExternalOIDC=false
      - GCPClusterHostedDNS=false
      - GatewayAPI=false
      - HighlyAvailableArbiter=false
      - ImageStreamImportMode=false
      - IngressControllerDynamicConfigurationManager=false
      - InsightsConfig=false
      - InsightsConfigAPI=false
      - InsightsOnDemandDataGather=false
      - InsightsRuntimeExtractor=false
      - KMSEncryptionProvider=false
      - MachineAPIMigration=false
      - MachineAPIOperatorDisableMachineHealthCheckController=false
      - MachineAPIProviderOpenStack=false
      - MachineConfigNodes=false
      - MaxUnavailableStatefulSet=false
      - MetricsCollectionProfiles=false
      - MinimumKubeletVersion=false
      - MixedCPUsAllocation=false
      - MultiArchInstallAzure=false
      - NetworkSegmentation=false
      - NodeSwap=false
      - NutanixMultiSubnets=false
      - OVNObservability=false
      - OnClusterBuild=false
      - PersistentIPsForVirtualization=false
      - PinnedImages=false
      - PlatformOperators=false
      - ProcMountType=false
      - RouteAdvertisements=false
      - RouteExternalCertificate=false
      - ServiceAccountTokenNodeBinding=false
      - SignatureStores=false
      - SigstoreImageVerification=false
      - TranslateStreamCloseWebsocketRequests=false
      - UpgradeStatus=false
      - UserNamespacesPodSecurityStandards=false
      - UserNamespacesSupport=false
      - VSphereHostVMGroupZonal=false
      - VSphereMultiDisk=false
      - VSphereMultiNetworks=false
      - VolumeAttributesClass=false
      - VolumeGroupSnapshot=false
      runtime-config:
      - admissionregistration.k8s.io/v1beta1=true
      send-retry-after-while-not-ready-once:
      - "false"
      service-account-issuer:
      - https://kubernetes.default.svc
      service-account-jwks-uri:
      - https://api.ci-ln-wni5ym2-76ef8.aws-2.ci.openshift.org:6443/openid/v1/jwks
      shutdown-delay-duration:
      - 129s
    authConfig:
      oauthMetadataFile: /etc/kubernetes/static-pod-resources/configmaps/oauth-metadata/oauthMetadata
    corsAllowedOrigins:
    - //127\.0\.0\.1(:|$)
    - //localhost(:|$)
    gracefulTerminationDuration: "194"
    imagePolicyConfig:
      internalRegistryHostname: image-registry.openshift-image-registry.svc:5000
    servicesSubnet: 172.30.0.0/16
    servingInfo:
      bindAddress: 0.0.0.0:6443
      bindNetwork: tcp4
      cipherSuites:
      - TLS_AES_128_GCM_SHA256
      - TLS_AES_256_GCM_SHA384
      - TLS_CHACHA20_POLY1305_SHA256
      - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
      - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
      - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
      - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
      - TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256
      - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256
      minTLSVersion: VersionTLS12
      namedCertificates:
      - certFile: /etc/kubernetes/static-pod-certs/secrets/localhost-serving-cert-certkey/tls.crt
        keyFile: /etc/kubernetes/static-pod-certs/secrets/localhost-serving-cert-certkey/tls.key
      - certFile: /etc/kubernetes/static-pod-certs/secrets/service-network-serving-certkey/tls.crt
        keyFile: /etc/kubernetes/static-pod-certs/secrets/service-network-serving-certkey/tls.key
      - certFile: /etc/kubernetes/static-pod-certs/secrets/external-loadbalancer-serving-certkey/tls.crt
        keyFile: /etc/kubernetes/static-pod-certs/secrets/external-loadbalancer-serving-certkey/tls.key
      - certFile: /etc/kubernetes/static-pod-certs/secrets/internal-loadbalancer-serving-certkey/tls.crt
        keyFile: /etc/kubernetes/static-pod-certs/secrets/internal-loadbalancer-serving-certkey/tls.key
      - certFile: /etc/kubernetes/static-pod-resources/secrets/localhost-recovery-serving-certkey/tls.crt
        keyFile: /etc/kubernetes/static-pod-resources/secrets/localhost-recovery-serving-certkey/tls.key
  operatorLogLevel: Normal
  unsupportedConfigOverrides: null
status:
  conditions:
  - lastTransitionTime: "2025-01-13T19:33:46Z"
    reason: NoUnsupportedConfigOverrides
    status: "True"
    type: UnsupportedConfigOverridesUpgradeable
  - lastTransitionTime: "2025-01-13T19:33:46Z"
    message: Kubelet and API server minor versions are synced.
    reason: KubeletMinorVersionsSynced
    status: "True"
    type: KubeletMinorVersionUpgradeable
  - lastTransitionTime: "2025-01-13T19:33:46Z"
    message: All master nodes are ready
    reason: MasterNodesReady
    status: "False"
    type: NodeControllerDegraded
  - lastTransitionTime: "2025-01-13T19:33:46Z"
    reason: ExpectedReason
    status: "False"
    type: PodSecurityCustomerEvaluationConditionsDetected
  - lastTransitionTime: "2025-01-13T19:33:46Z"
    reason: ExpectedReason
    status: "False"
    type: PodSecurityOpenshiftEvaluationConditionsDetected
  - lastTransitionTime: "2025-01-13T19:33:46Z"
    reason: ExpectedReason
    status: "False"
    type: PodSecurityRunLevelZeroEvaluationConditionsDetected
  - lastTransitionTime: "2025-01-13T19:33:46Z"
    reason: ExpectedReason
    status: "False"
    type: PodSecurityDisabledSyncerEvaluationConditionsDetected
  - lastTransitionTime: "2025-01-13T19:33:46Z"
    reason: DefaultCertRotationBase
    status: "True"
    type: CertRotationTimeUpgradeable
  - lastTransitionTime: "2025-01-13T19:35:16Z"
    status: "False"
    type: MutatingAdmissionWebhookConfigurationError
  - lastTransitionTime: "2025-01-13T19:36:06Z"
    status: "False"
    type: ValidatingAdmissionWebhookConfigurationError
  - lastTransitionTime: "2025-01-13T19:39:38Z"
    status: "False"
    type: CRDConversionWebhookConfigurationError
  - lastTransitionTime: "2025-01-13T19:33:46Z"
    status: "False"
    type: VirtualResourceAdmissionError
  - lastTransitionTime: "2025-01-13T19:33:47Z"
    status: "False"
    type: StartupMonitorPodContainerExcessiveRestartsDegraded
  - lastTransitionTime: "2025-01-13T19:33:47Z"
    status: "False"
    type: StartupMonitorPodDegraded
  - lastTransitionTime: "2025-01-13T19:33:47Z"
    status: "False"
    type: EncryptionStateControllerDegraded
  - lastTransitionTime: "2025-01-13T19:33:47Z"
    status: "False"
    type: Encrypted
  - lastTransitionTime: "2025-01-13T19:33:47Z"
    status: "False"
    type: EncryptionPruneControllerDegraded
  - lastTransitionTime: "2025-01-13T19:33:47Z"
    status: "False"
    type: EncryptionKeyControllerDegraded
  - lastTransitionTime: "2025-01-13T19:33:47Z"
    status: "False"
    type: EncryptionMigrationControllerDegraded
  - lastTransitionTime: "2025-01-13T19:33:47Z"
    status: "False"
    type: EncryptionMigrationControllerProgressing
  - lastTransitionTime: "2025-01-13T19:52:58Z"
    reason: AsExpected
    status: "False"
    type: GuardControllerDegraded
  - lastTransitionTime: "2025-01-13T19:33:47Z"
    status: "False"
    type: StaticPodFallbackRevisionDegraded
  - lastTransitionTime: "2025-01-13T19:42:39Z"
    status: "False"
    type: RevisionControllerDegraded
  - lastTransitionTime: "2025-01-13T19:33:48Z"
    message: latency profile not set on cluster
    reason: ProfileEmpty
    status: "True"
    type: WorkerLatencyProfileComplete
  - lastTransitionTime: "2025-01-13T19:33:48Z"
    reason: ProfileEmpty
    status: "False"
    type: WorkerLatencyProfileProgressing
  - lastTransitionTime: "2025-01-13T19:35:57Z"
    status: "False"
    type: InstallerControllerDegraded
  - lastTransitionTime: "2025-01-13T19:42:49Z"
    status: "False"
    type: NodeInstallerDegraded
  - lastTransitionTime: "2025-01-13T19:57:29Z"
    message: 3 nodes are at revision 7
    reason: AllNodesAtLatestRevision
    status: "False"
    type: NodeInstallerProgressing
  - lastTransitionTime: "2025-01-13T19:37:32Z"
    message: 3 nodes are active; 3 nodes are at revision 7
    reason: ""
    status: "True"
    type: StaticPodsAvailable
  - lastTransitionTime: "2025-01-13T19:33:48Z"
    reason: AsExpected
    status: "False"
    type: MissingStaticPodControllerDegraded
  - lastTransitionTime: "2025-01-13T19:33:49Z"
    reason: AsExpected
    status: "False"
    type: WorkerLatencyProfileDegraded
  - lastTransitionTime: "2025-01-13T19:35:42Z"
    reason: AsExpected
    status: "False"
    type: NodeKubeconfigControllerDegraded
  - lastTransitionTime: "2025-01-13T19:33:49Z"
    status: "False"
    type: InstallerPodContainerWaitingDegraded
  - lastTransitionTime: "2025-01-13T19:33:49Z"
    status: "False"
    type: InstallerPodNetworkingDegraded
  - lastTransitionTime: "2025-01-13T19:33:49Z"
    status: "False"
    type: InstallerPodPendingDegraded
  - lastTransitionTime: "2025-01-13T19:34:20Z"
    status: "False"
    type: CertRotation_ExternalLoadBalancerServing_Degraded
  - lastTransitionTime: "2025-01-13T19:34:26Z"
    status: "False"
    type: ConfigObservationDegraded
  - lastTransitionTime: "2025-01-13T19:34:04Z"
    status: "False"
    type: CertRotation_CheckEndpointsClient_Degraded
  - lastTransitionTime: "2025-01-13T19:34:24Z"
    status: "False"
    type: CertRotation_KubeSchedulerClient_Degraded
  - lastTransitionTime: "2025-01-13T19:33:53Z"
    status: "False"
    type: AuditPolicyDegraded
  - lastTransitionTime: "2025-01-13T19:33:54Z"
    status: "False"
    type: CertRotation_InternalLoadBalancerServing_Degraded
  - lastTransitionTime: "2025-01-13T19:42:48Z"
    reason: AsExpected
    status: "False"
    type: BackingResourceControllerDegraded
  - lastTransitionTime: "2025-01-13T19:33:56Z"
    status: "False"
    type: StaticPodsDegraded
  - lastTransitionTime: "2025-01-13T19:34:00Z"
    status: "False"
    type: ResourceSyncControllerDegraded
  - lastTransitionTime: "2025-01-13T19:43:12Z"
    reason: AsExpected
    status: "False"
    type: KubeAPIServerStaticResourcesDegraded
  - lastTransitionTime: "2025-01-13T19:34:09Z"
    status: "False"
    type: CertRotation_KubeAPIServerToKubeletClientCert_Degraded
  - lastTransitionTime: "2025-01-13T19:34:12Z"
    status: "False"
    type: CertRotation_AggregatorProxyClientCert_Degraded
  - lastTransitionTime: "2025-01-13T19:34:14Z"
    status: "False"
    type: CertRotation_LocalhostServing_Degraded
  - lastTransitionTime: "2025-01-13T19:34:16Z"
    status: "False"
    type: CertRotation_NodeSystemAdminClient_Degraded
  - lastTransitionTime: "2025-01-13T19:34:18Z"
    status: "False"
    type: CertRotation_LocalhostRecoveryServing_Degraded
  - lastTransitionTime: "2025-01-13T19:34:22Z"
    status: "False"
    type: CertRotation_ServiceNetworkServing_Degraded
  - lastTransitionTime: "2025-01-13T19:34:23Z"
    status: "False"
    type: CertRotation_KubeControllerManagerClient_Degraded
  - lastTransitionTime: "2025-01-13T19:34:26Z"
    status: "False"
    type: CertRotation_ControlPlaneNodeAdminClient_Degraded
  - lastTransitionTime: "2025-01-13T19:43:21Z"
    status: "False"
    type: TargetConfigControllerDegraded
  latestAvailableRevision: 7
  nodeStatuses:
  - currentRevision: 7
    lastFailedCount: 0
    lastFailedReason: ""
    lastFailedRevision: 0
    lastFallbackCount: 0
    nodeName: ip-10-0-121-118.us-east-2.compute.internal
    targetRevision: 0
  - currentRevision: 7
    lastFailedCount: 0
    lastFailedReason: ""
    lastFailedRevision: 0
    lastFallbackCount: 0
    nodeName: ip-10-0-16-56.us-east-2.compute.internal
    targetRevision: 0
  - currentRevision: 7
    lastFailedCount: 0
    lastFailedReason: ""
    lastFailedRevision: 0
    lastFailedRevisionErrors:
    - |
      installer: ing-cert-002",
        (string) (len=21) "user-serving-cert-003",
        (string) (len=21) "user-serving-cert-004",
        (string) (len=21) "user-serving-cert-005",
        (string) (len=21) "user-serving-cert-006",
        (string) (len=21) "user-serving-cert-007",
        (string) (len=21) "user-serving-cert-008",
        (string) (len=21) "user-serving-cert-009"
       },
       CertConfigMapNamePrefixes: ([]string) (len=4 cap=4) {
        (string) (len=20) "aggregator-client-ca",
        (string) (len=9) "client-ca",
        (string) (len=29) "control-plane-node-kubeconfig",
        (string) (len=26) "check-endpoints-kubeconfig"
       },
       OptionalCertConfigMapNamePrefixes: ([]string) (len=1 cap=1) {
        (string) (len=17) "trusted-ca-bundle"
       },
       CertDir: (string) (len=57) "/etc/kubernetes/static-pod-resources/kube-apiserver-certs",
       ResourceDir: (string) (len=36) "/etc/kubernetes/static-pod-resources",
       PodManifestDir: (string) (len=25) "/etc/kubernetes/manifests",
       Timeout: (time.Duration) 2m0s,
       StaticPodManifestsLockFile: (string) "",
       PodMutationFns: ([]installerpod.PodMutationFunc) <nil>,
       KubeletVersion: (string) ""
      })
      I0113 19:40:44.064637       1 cmd.go:413] Getting controller reference for node ip-10-0-58-223.us-east-2.compute.internal
      I0113 19:40:44.073268       1 cmd.go:426] Waiting for installer revisions to settle for node ip-10-0-58-223.us-east-2.compute.internal
      I0113 19:40:44.073308       1 envvar.go:172] "Feature gate default state" feature="WatchListClient" enabled=false
      I0113 19:40:44.073321       1 envvar.go:172] "Feature gate default state" feature="InformerResourceVersion" enabled=false
      I0113 19:40:44.076651       1 cmd.go:518] Waiting additional period after revisions have settled for node ip-10-0-58-223.us-east-2.compute.internal
      I0113 19:41:14.077032       1 cmd.go:524] Getting installer pods for node ip-10-0-58-223.us-east-2.compute.internal
      F0113 19:41:28.080207       1 cmd.go:109] Get "https://172.30.0.1:443/api/v1/namespaces/openshift-kube-apiserver/pods?labelSelector=app%3Dinstaller": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
    lastFailedTime: "2025-01-13T19:42:44Z"
    lastFallbackCount: 0
    nodeName: ip-10-0-58-223.us-east-2.compute.internal
    targetRevision: 0
  readyReplicas: 0
  serviceAccountIssuers:
  - name: https://kubernetes.default.svc

@tkashem tkashem changed the title from "[WIP] OCPBUGS-46010: wait for at least 3 kube-apiserver instances" to "OCPBUGS-46010: wait for at least 3 kube-apiserver instances" Jan 13, 2025
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) Jan 13, 2025
@tkashem
Contributor Author

tkashem commented Jan 14, 2025

/retest

@tkashem
Contributor Author

tkashem commented Jan 14, 2025

/label acknowledge-critical-fixes-only

@openshift-ci openshift-ci bot added the acknowledge-critical-fixes-only label (Indicates if the issuer of the label is OK with the policy.) Jan 14, 2025
@tkashem
Contributor Author

tkashem commented Jan 15, 2025

/retest-required

@tkashem
Contributor Author

tkashem commented Jan 15, 2025

/test e2e-agent-compact-ipv4

@benluddy
Contributor

/test e2e-agent-ha-dualstack

@benluddy
Contributor

/test e2e-agent-sno-ipv6

Contributor

openshift-ci bot commented Jan 15, 2025

@tkashem: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-azure-ovn | 19e98f9 | link | true | /test e2e-azure-ovn |
| ci/prow/e2e-azure-ovn-upi | 19e98f9 | link | true | /test e2e-azure-ovn-upi |
| ci/prow/e2e-azure-ovn-resourcegroup | ba82c38 | link | false | /test e2e-azure-ovn-resourcegroup |
| ci/prow/e2e-vsphere-static-ovn | ba82c38 | link | false | /test e2e-vsphere-static-ovn |
| ci/prow/e2e-vsphere-ovn-multi-network | ba82c38 | link | false | /test e2e-vsphere-ovn-multi-network |
| ci/prow/okd-scos-e2e-aws-ovn | ba82c38 | link | false | /test okd-scos-e2e-aws-ovn |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@benluddy
Contributor

benluddy commented Jan 15, 2025

/test e2e-agent-sno-ipv6

The latest failure looks like a transient image registry error. The other two agent jobs are green.

Error: copying system image from manifest list: reading blob sha256:0d1baa9a1e65a2a5993ffaad35ff54cfb88cd20846309f34746d67cc421dbfe9: fetching blob: received unexpected HTTP status: 502 Bad Gateway

@benluddy
Contributor

/lgtm

(assuming the last agent job passes)

@openshift-ci openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) Jan 15, 2025
@tkashem
Contributor Author

tkashem commented Jan 15, 2025

/hold cancel

(e2e-agent-compact-ipv4 has passed)

@openshift-ci openshift-ci bot removed the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) Jan 15, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit b2723a8 into openshift:main Jan 15, 2025
17 of 21 checks passed
@openshift-ci-robot
Contributor

@tkashem: Jira Issue OCPBUGS-46010: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-46010 has been moved to the MODIFIED state.


@benluddy
Contributor

/cherry-pick release-4.18

@openshift-cherrypick-robot

@benluddy: new pull request created: #9372

In response to this:

/cherry-pick release-4.18


@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

Distgit: ose-installer-terraform-providers
This PR has been included in build ose-installer-terraform-providers-container-v4.19.0-202501152337.p0.gb2723a8.assembly.stream.el9.
All builds following this will include this PR.

@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

Distgit: ose-installer-altinfra
This PR has been included in build ose-installer-altinfra-container-v4.19.0-202501152337.p0.gb2723a8.assembly.stream.el9.
All builds following this will include this PR.

@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

Distgit: ose-baremetal-installer
This PR has been included in build ose-baremetal-installer-container-v4.19.0-202501152337.p0.gb2723a8.assembly.stream.el9.
All builds following this will include this PR.

@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

Distgit: ose-installer-artifacts
This PR has been included in build ose-installer-artifacts-container-v4.19.0-202501152337.p0.gb2723a8.assembly.stream.el9.
All builds following this will include this PR.

@wangke19

/jira refresh

@openshift-ci-robot
Contributor

@wangke19: Jira Issue OCPBUGS-46010 is in an unrecognized state (Verified) and will not be moved to the MODIFIED state.

In response to this:

/jira refresh


Labels
acknowledge-critical-fixes-only Indicates if the issuer of the label is OK with the policy. approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
8 participants