Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit event on Description-only healthcheck update #2072

Merged

Conversation

DamianSawicki
Copy link
Contributor

@DamianSawicki DamianSawicki commented Apr 7, 2023

Since #2008, the health check description is modified if the service health check is configured with a BackendConfig CRD. The purpose of the present PR is to emit an event on the service on this occasion.

The changes are guarded with the flag flags.F.EnableUpdateCustomHealthCheckDescription from #2018.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 7, 2023
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 7, 2023
@k8s-ci-robot
Copy link
Contributor

Hi @DamianSawicki. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Apr 7, 2023
@DamianSawicki
Copy link
Contributor Author

/assign aojea

@@ -338,3 +341,32 @@ func ApplyProbeSettingsToHC(p *v1.Probe, hc *HealthCheck) {

hc.Description = DescriptionForHealthChecksFromReadinessProbe
}

// GetObjectKind implements runtime.Object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this change related?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. EventRecorder.Event requires the first argument to be runtime.Object, and this interface contains two methods: GetObjectKind() schema.ObjectKind and DeepCopyObject() Object.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the object in this case is the Service or just pass the Namespace, doing this to the healtcheck object seems very brittle

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can get it from defaultBackendSvc types.NamespacedName

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This again requires implementing the above methods either for string (in the case of namespace) or for types.NamespacedName (in the case of service), and when trying to do that, I'm getting the following error cannot define new methods on non-local type "k8s.io/apimachinery/pkg/types".NamespacedName. I guess I can write my own alias type for one of these and then implement the methods for the alias. @aojea, do you think it is a good idea?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no sorry, my point is that we should not modify this type, and use one of the Kubernetes types associated to the healtcheck, it can be the Service or the Namespace, ideally the Service

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification! Done, please have a look.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you forgot to remove this code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -142,6 +144,11 @@ func (h *HealthChecks) sync(hc *translator.HealthCheck, bchcc *backendconfigv1.H
changes := calculateDiff(filter(existingHC), filter(hc), bchcc)
if changes.hasDiff() {
klog.V(2).Infof("Health check %q needs update (%s)", existingHC.Name, changes)
if flags.F.EnableUpdateCustomHealthCheckDescription && changes.size() == 1 && changes.has("Description") {
message := fmt.Sprintf("Healthcheck will be updated and the only field updated is Description.\nOld: %+v\nNew: %+v\nDiff: %+v", existingHC, hc, changes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems a very large message from an event, can we just send the diff or do we miss some important information?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if guarantees that the diff contains only the Description update. Recall from our discussions that this event emission was intended as a precaution during Description update, so the event would not serve this purpose well if it contained only the "positive scenario" information.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the message contains the New and Old too, can we send only the changes? do we need the others?

%+v\nNew: %+v\nDiff: %+v", existingHC, hc, changes)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm trying to say that we need the others. The changes aka "diff" inside the if consists only of the modified Description (changes.size() == 1 && changes.has("Description")). However, we want to include more information in the message as a precaution.

A background fact that I should probably mention is that calculateDiff() does not return the full diff. It checks selected fields only (recently Description was added as one of them), sometimes just 2 of them.

@aojea
Copy link
Member

aojea commented Apr 12, 2023

/ok-to-test

looks good overall, some comments that needs to be addressed

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 12, 2023
@DamianSawicki DamianSawicki force-pushed the emit-event-on-desc-change branch 2 times, most recently from d396a05 to 9d5fea2 Compare April 12, 2023 09:18
@DamianSawicki DamianSawicki force-pushed the emit-event-on-desc-change branch 2 times, most recently from 66f7f13 to ad6a07b Compare April 13, 2023 17:00
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 13, 2023
if flags.F.EnableUpdateCustomHealthCheckDescription && changes.size() == 1 && changes.has("Description") {
message := fmt.Sprintf("Healthcheck will be updated and the only field updated is Description.\nOld: %+v\nNew: %+v\nDiff: %+v", existingHC, hc, changes)
h.eventRecorder.Event(
hc.Service, "Warning", "HealthcheckDescriptionUpdate", message)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, I have to apologize, but now that I look it carefully , it seems Service is not the right object to pass, in this case this method is trying to sync bchcc *backendconfigv1.HealthCheckConfig , I think this object is implementing runtime.Object, isnt't it?, that will be the best thing since we can do this only with 4-5 lines of code

Sorry again, I've should have checked this better before doing any suggestion

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aojea we want to emit the even on the Service because UHC is created per Service and one BackendConfig can be assign to multiple Services but all Customer cares is a configuration of HealthCheck for Service.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, Damien commented offline, sounds good

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, Damien commented offline, sounds good

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*Damian

@@ -41,7 +42,7 @@ type Backends struct {
}

// Backends is a Pool.
var _ Pool = (*Backends)(nil)
var _ interfaces.Pool = (*Backends)(nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change was needed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to rebase

Copy link
Contributor Author

@DamianSawicki DamianSawicki Apr 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline, done.

GetService(namespace, name string) (*api_v1.Service, error)
}

func NewEmptyServiceGetter() ServiceGetter {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewFake

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


type fakeServiceGetter struct{}

func (fsg *fakeServiceGetter) GetService(namespace, name string) (*api_v1.Service, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be better to return some default Service here. I don't think is good to return the empty object. Normally when you get nil error you expect that the Object is properly constructed. This may lead to some crashes I think.

Copy link
Contributor Author

@DamianSawicki DamianSawicki Apr 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the comments ,api_v1.Service has only one non-optional field of type metav1.TypeMeta, which in turn has only optional fields. So it seems that the "empty object" is a valid object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can at least fill name and namespace to match function params.
Where is it used?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be enough

	svc := &v1.Service{
		ObjectMeta: metav1.ObjectMeta{
			Name:      name,
			Namespace: namespace,
		},
	}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just used as one of the Event parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aojea Yeah, I missed your comment earlier, but this is exactly what we ended up with.

@@ -114,7 +115,8 @@ func NewLoadBalancerController(
Interface: ctx.KubeClient.CoreV1().Events(""),
})

healthChecker := healthchecks.NewHealthChecker(ctx.Cloud, ctx.HealthCheckPath, ctx.DefaultBackendSvcPort.ID.Service)
eventRecorder := ctx.Recorder(ctx.Namespace)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the namespace for recorder needs to be the namespace of the Service you wants to emit the event for. And health checker is created one per controller you need to postpone this function to the place where you have your Service.
You probably need context there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@aojea
Copy link
Member

aojea commented Apr 14, 2023

just please don't use calls to the apiserver on the getter, remember that you have informers that already have Listers and Getter to a local cache with all the objects information. The other alternative is to plumb down the information on initialization, I'm not familiar with the code, so I can't suggest which one is the best option

@DamianSawicki DamianSawicki force-pushed the emit-event-on-desc-change branch 3 times, most recently from cb61a41 to 60aa3d0 Compare April 14, 2023 12:45
@DamianSawicki DamianSawicki force-pushed the emit-event-on-desc-change branch 2 times, most recently from d851dac to d726038 Compare April 14, 2023 15:16
@DamianSawicki DamianSawicki marked this pull request as ready for review April 17, 2023 21:50
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 17, 2023
@k8s-ci-robot k8s-ci-robot requested a review from aojea April 17, 2023 21:51
@@ -67,9 +69,23 @@ func (h *HealthChecks) new(sp utils.ServicePort) *translator.HealthCheck {
hc.Name = sp.BackendName()
hc.Port = sp.NodePort
hc.RequestPath = h.pathFromSvcPort(sp)
service := h.mainService(sp)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll flag-gate the following lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@DamianSawicki DamianSawicki force-pushed the emit-event-on-desc-change branch from d726038 to daa03ce Compare April 18, 2023 08:30
@DamianSawicki
Copy link
Contributor Author

/uncc kl52752

@DamianSawicki
Copy link
Contributor Author

/assign cezarygerard

@@ -142,6 +160,11 @@ func (h *HealthChecks) sync(hc *translator.HealthCheck, bchcc *backendconfigv1.H
changes := calculateDiff(filter(existingHC), filter(hc), bchcc)
if changes.hasDiff() {
klog.V(2).Infof("Health check %q needs update (%s)", existingHC.Name, changes)
if flags.F.EnableUpdateCustomHealthCheckDescription && changes.size() == 1 && changes.has("Description") {
message := fmt.Sprintf("Healthcheck will be updated and the only field updated is Description.\nOld: %+v\nNew: %+v\nDiff: %+v", existingHC, hc, changes)
h.recorderGetter.Recorder(hc.Service.Namespace).Event(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hc.Service Can be null here,

if it error occurred while assigning the hc.Service in pkg/healthchecks/healthchecks.go

if flags.F.EnableUpdateCustomHealthCheckDescription {
		service := h.mainService(sp)
		var err error
		hc.Service, err = h.serviceGetter.GetService(service.Namespace, service.Name)
		if err != nil {
			klog.Errorf("Service %s/%s not found: %v.", service.Namespace, service.Name, err)
		}
	}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -67,9 +69,25 @@ func (h *HealthChecks) new(sp utils.ServicePort) *translator.HealthCheck {
hc.Name = sp.BackendName()
hc.Port = sp.NodePort
hc.RequestPath = h.pathFromSvcPort(sp)
if flags.F.EnableUpdateCustomHealthCheckDescription {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should not be using flags this deep down the logic

convert them into fields somewhere in context or close to main

it makes it much more testable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had this discussion already with kl52752, panslava, and bowei: #2018 (comment). This is the same flag as there and we're not adding any new non-parallelizable tests in this PR, so I'll stick to the approach bowei suggested in #2018 and keep the flag deep down.

var err error
hc.Service, err = h.serviceGetter.GetService(service.Namespace, service.Name)
if err != nil {
klog.Errorf("Service %s/%s not found: %v.", service.Namespace, service.Name, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. why erorr?
    we keep going so it should be a Warning

  2. Add one more sentence to this log, what are the consequences (not big, we keep going)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -67,9 +69,25 @@ func (h *HealthChecks) new(sp utils.ServicePort) *translator.HealthCheck {
hc.Name = sp.BackendName()
hc.Port = sp.NodePort
hc.RequestPath = h.pathFromSvcPort(sp)
if flags.F.EnableUpdateCustomHealthCheckDescription {
service := h.mainService(sp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

service is a confusing name here
It suggests it's v1.Service
but it is just name and namespace

see my next comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 72 to 79
if flags.F.EnableUpdateCustomHealthCheckDescription {
service := h.mainService(sp)
var err error
hc.Service, err = h.serviceGetter.GetService(service.Namespace, service.Name)
if err != nil {
klog.Errorf("Service %s/%s not found: %v.", service.Namespace, service.Name, err)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this whole part can be well hidden:
see this:

// new returns a *HealthCheck with default settings and specified port/protocol
func (h *HealthChecks) new(sp utils.ServicePort) *translator.HealthCheck {
	var hc *translator.HealthCheck
	if sp.NEGEnabled && !sp.L7ILBEnabled {
		hc = translator.DefaultNEGHealthCheck(sp.Protocol)
	} else if sp.L7ILBEnabled {
		hc = translator.DefaultILBHealthCheck(sp.Protocol)
	} else {
		hc = translator.DefaultHealthCheck(sp.NodePort, sp.Protocol)
	}
	// port is the key for retrieving existing health-check
	// TODO: rename backend-service and health-check to not use port as key
	hc.Name = sp.BackendName()
	hc.Port = sp.NodePort
	hc.RequestPath = h.pathFromSvcPort(sp)
	hc.Service = h.getService(sp)

	return hc
}

func (h *HealthChecks) getService(sp utils.ServicePort) *v1.Service {
	if flags.F.EnableUpdateCustomHealthCheckDescription {
		namespacedName := h.mainService(sp)
		var err error
		service, err := h.serviceGetter.GetService(namespacedName.Namespace, namespacedName.Name)
		if err != nil {
			klog.Errorf("Service %s/%s not found: %v.", namespacedName.Namespace, namespacedName.Name, err)
		}
		return service
	}
	return nil
}

func (h *HealthChecks) mainService(sp utils.ServicePort) types.NamespacedName {
	service := h.defaultBackendSvc
	if sp.ID.Service.Name != "" {
		service = sp.ID.Service
	}
	return service
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better, I somehow missed it!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if flags.F.EnableUpdateCustomHealthCheckDescription && changes.size() == 1 && changes.has("Description") {
message := fmt.Sprintf("Healthcheck will be updated and the only field updated is Description.\nOld: %+v\nNew: %+v\nDiff: %+v", existingHC, hc, changes)
h.recorderGetter.Recorder(hc.Service.Namespace).Event(
hc.Service, "Warning", "HealthcheckDescriptionUpdate", message)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is Info type of message, not Warning

be cautious what we log to customer projects ;-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also use const from this package
vendor/k8s.io/api/core/v1/types.go

as we do in other places

// Valid values for event types (new types could be added in future)
const (
	// Information only and will not cause any problems
	EventTypeNormal string = "Normal"
	// These events are to warn that something might go wrong
	EventTypeWarning string = "Warning"
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 61 to 114

// ServiceGetter is an interface to retrieve Kubernetes Services.
type ServiceGetter interface {
GetService(namespace, name string) (*v1.Service, error)
}

func NewFakeServiceGetter() ServiceGetter {
return &fakeServiceGetter{}
}

type fakeServiceGetter struct{}

func (fsg *fakeServiceGetter) GetService(namespace, name string) (*v1.Service, error) {
return &v1.Service{
ObjectMeta: metav1.ObjectMeta{Namespace: namespace, Name: name},
}, nil
}

type RecorderGetter interface {
Recorder(ns string) record.EventRecorder
}

func NewFakeRecorderGetter(bufferSize int) RecorderGetter {
return &fakeRecorderGetter{bufferSize}
}

type fakeRecorderGetter struct {
bufferSize int
}

// Returns a different record.EventRecorder for every call.
func (frg *fakeRecorderGetter) Recorder(namespace string) record.EventRecorder {
return record.NewFakeRecorder(frg.bufferSize)
}

type singletonFakeRecorderGetter struct {
recorder *record.FakeRecorder
}

// Returns the same record.EventRecorder irrespective of the namespace.
func (sfrg *singletonFakeRecorderGetter) Recorder(namespace string) record.EventRecorder {
return sfrg.FakeRecorder()
}

func (sfrg *singletonFakeRecorderGetter) FakeRecorder() *record.FakeRecorder {
if sfrg.recorder == nil {
panic("singletonFakeRecorderGetter not initialised: recorder is nil.")
}
return sfrg.recorder
}

func NewFakeSingletonRecorderGetter(bufferSize int) *singletonFakeRecorderGetter {
return &singletonFakeRecorderGetter{recorder: record.NewFakeRecorder(bufferSize)}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all fakes should be moved to _test.go file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to a new flakes.go file because somehow when I moved it to healthchecks_test.go it was not visible in other packages.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah... so it has to be compiled into final binary

I have tried to use build constraint

//go:build test

or

//go:build testing

but it does not work

golang keeps amazing me :-D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DamianSawicki DamianSawicki force-pushed the emit-event-on-desc-change branch from 497877a to fe47055 Compare April 21, 2023 08:23
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 21, 2023
@cezarygerard
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 21, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cezarygerard, DamianSawicki

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit f965f77 into kubernetes:master Apr 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants