Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A simple long running test #598

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

mojtaba-esk
Copy link
Contributor

@mojtaba-esk mojtaba-esk commented Dec 19, 2024

This PR proposes a simple long running test. Some of the underlying code had to change to allow for that.

Summary by CodeRabbit

  • New Features

    • Introduced new end-to-end tests for instance management and command execution.
    • Enhanced logging for file operations in the storage module.
    • Added a method to create ClusterRoleBinding resources in the test suite.
    • Added a skipCleanup option to control cleanup behavior in the Knuu struct.
  • Bug Fixes

    • Improved error handling for existing Kubernetes resources in various methods.
    • Added specific error messages for existing service accounts, roles, and role bindings.
  • Refactor

    • Updated various methods to use a more modern and fluent API for Kubernetes probes and configurations.
    • Transitioned to a declarative approach for managing Kubernetes resources.
  • Chores

    • Cleaned up indirect dependencies in the go.mod file and updated several module versions.

@mojtaba-esk mojtaba-esk added the enhancement New feature or request label Dec 19, 2024
@mojtaba-esk mojtaba-esk requested a review from a team December 19, 2024 10:32
@mojtaba-esk mojtaba-esk self-assigned this Dec 19, 2024
Copy link
Contributor

coderabbitai bot commented Dec 19, 2024

Walkthrough

This pull request introduces significant updates across multiple packages in the project, focusing on modernizing Kubernetes resource configuration and error handling. The changes primarily involve transitioning from direct Kubernetes API types to apply configurations, updating import statements, and enhancing error management for various Kubernetes resources like service accounts, roles, and role bindings. The modifications aim to improve type safety, provide more declarative resource management, and create more robust error handling mechanisms.

Changes

File Change Summary
e2e/basic/probe_test.go Updated probe definition using fluent interface and method chaining
e2e/longrunning/simple_test.go Added new test for instance creation and command execution
e2e/system/start_callback_test.go Updated Kubernetes API import and probe configuration method
go.mod Cleaned up indirect dependencies and updated module versions
pkg/instance/* Transitioned probe and security context types to apply configurations
pkg/k8s/pod.go Comprehensive update to use apply configurations for pod resources
pkg/k8s/prob.go Added DeepCopyProbe function for probe configuration
pkg/k8s/errors.go Introduced new error variables for Kubernetes resource management
pkg/k8s/role.go Enhanced error handling for role and cluster role creation
pkg/k8s/rolebinding.go Improved error handling for role binding creation
pkg/k8s/serviceaccount.go Added error handling for existing service accounts
pkg/k8s/replicaset.go Updated ReplicaSet methods to use apply configurations
pkg/k8s/test_suite_test.go Added method for creating ClusterRoleBinding in test suite

Poem

🐰 Hop, hop, through Kubernetes land,
Configs now dance at our command!
Apply, configure, with grace so neat,
Our code's evolution is quite a treat!
Errors handled, types refined,
A rabbit's coding design! 🚀

Finishing Touches

  • 📝 Generate Docstrings

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🔭 Outside diff range comments (1)
pkg/instance/build.go (1)

Line range hint 324-341: Add deep copy for imagePullPolicy in clone method

The clone method needs to handle the deep copy of imagePullPolicy pointer field.

Apply this diff to fix the clone method:

 return &build{
     instance:  nil,
     imageName: b.imageName,
+    imagePullPolicy: ptr.To[v1.PullPolicy](*b.imagePullPolicy),
     //TODO: This does not create a deep copy of the builderFactory. Implement it in another PR
     builderFactory: b.builderFactory,
     command:    commandCopy,
     args:       argsCopy,
     env:        envCopy,
     imageCache: &imageCacheClone,
 }
🧹 Nitpick comments (14)
pkg/k8s/replicaset.go (2)

37-43: Consider preserving or merging existing annotations and labels only if needed.
Right now, you are overwriting the spec with the existing ReplicaSet’s labels/annotations. If the new config sets additional fields, ensure they do not get accidentally removed if the existing object has them.


171-187: Nice transition to ApplyConfiguration for the ReplicaSet.
Using this approach is more in line with the server-side apply model and helps avoid direct conflicts.

pkg/k8s/pod.go (5)

393-396: EmptyDir usage.
Since emptyDir volumes are ephemeral, confirm this meets your persistence requirements for the scenario where volumesAmount == 0 and filesAmount != 0.


449-449: Empty containerFiles slice.
If container files are handled independently later, consider documenting the usage or logic behind appending an empty slice here.


475-478: Careful with mounting subPath for nested structures.
TrimPrefix approach is good. Check if multiple nested paths (e.g. /files/etc/) all mount properly.


562-575: Ports naming convention.
TCP and UDP ports are named automatically based on the port number. Good approach for clarity.


620-620: preparePodVolumes function is small and cohesive.
Straightforwardly delegates to buildPodVolumes, which keeps code DRY.

e2e/basic/probe_test.go (1)

26-34: Properly configured liveness probe.
Setting the HTTP GET path and using the known port is correct. The 10s initial delay also seems reasonable.

pkg/k8s/prob.go (1)

19-45: Consider adding header validation for HTTPGet probe

While the implementation correctly copies all fields, consider adding validation for HTTPGet headers if they're supported in your Kubernetes version.

 if probe.HTTPGet != nil {
   copy.HTTPGet = &corev1.HTTPGetActionApplyConfiguration{
     Path:   probe.HTTPGet.Path,
     Port:   probe.HTTPGet.Port,
     Host:   probe.HTTPGet.Host,
     Scheme: probe.HTTPGet.Scheme,
+    HTTPHeaders: append([]*corev1.HTTPHeaderApplyConfiguration{}, probe.HTTPGet.HTTPHeaders...),
   }
 }
e2e/longrunning/simple_test.go (2)

14-17: Consider reducing test timeout

15 minutes seems excessive for this simple test. Consider reducing it to a more reasonable duration like 2-3 minutes.

 const (
   alpineImage = "alpine:3.20.3"
-  testTimeout = time.Minute * 15
+  testTimeout = time.Minute * 3
 )

72-74: Consider using a more robust long-running command

Using sleep infinity might not be the best approach for a long-running container. Consider using a more robust command that can handle signals properly.

- if err := ins.Build().SetStartCommand("sleep", "infinity"); err != nil {
+ if err := ins.Build().SetStartCommand("tail", "-f", "/dev/null"); err != nil {
pkg/instance/security.go (2)

84-85: Consider using WithXXX methods for apply configuration

The apply configuration API provides WithXXX methods for more idiomatic initialization.

Consider using the builder pattern:

- sc := &corev1.SecurityContextApplyConfiguration{}
+ sc := corev1.SecurityContext().
+   WithPrivileged(s.privileged).
+   WithCapabilities(corev1.Capabilities().
+     WithAdd(capabilities))

95-97: Simplify capabilities initialization

The current implementation could be more concise using direct type conversion.

Consider this more idiomatic approach:

- sc.Capabilities = &corev1.CapabilitiesApplyConfiguration{
-   Add: capabilities,
- }
+ sc.Capabilities = corev1.Capabilities().
+   WithAdd(capabilities)
pkg/k8s/errors.go (1)

150-152: Consider grouping related errors together.

For better code organization and maintainability, consider moving these new error definitions next to their related errors:

  • Move ErrServiceAccountAlreadyExists near other service account related errors
  • Move ErrRoleAlreadyExists near ErrCreatingRole (around line 71)
  • Move ErrRoleBindingAlreadyExists near ErrCreatingRoleBinding (around line 72)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 83a1a24 and f429274.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (19)
  • e2e/basic/probe_test.go (2 hunks)
  • e2e/longrunning/simple_test.go (1 hunks)
  • e2e/system/start_callback_test.go (2 hunks)
  • go.mod (0 hunks)
  • pkg/instance/build.go (3 hunks)
  • pkg/instance/execution.go (2 hunks)
  • pkg/instance/instance.go (2 hunks)
  • pkg/instance/monitoring.go (5 hunks)
  • pkg/instance/security.go (3 hunks)
  • pkg/instance/storage.go (1 hunks)
  • pkg/k8s/errors.go (1 hunks)
  • pkg/k8s/pod.go (12 hunks)
  • pkg/k8s/prob.go (1 hunks)
  • pkg/k8s/replicaset.go (3 hunks)
  • pkg/k8s/role.go (3 hunks)
  • pkg/k8s/rolebinding.go (3 hunks)
  • pkg/k8s/serviceaccount.go (2 hunks)
  • pkg/knuu/errors.go (0 hunks)
  • pkg/traefik/traefik.go (2 hunks)
💤 Files with no reviewable changes (2)
  • pkg/knuu/errors.go
  • go.mod
🔇 Additional comments (35)
pkg/k8s/replicaset.go (3)

12-13: Good addition of apply client configs.
This facilitates more declarative resource management and aligns with newer Kubernetes best practices.


35-35: Validate that the ReplicaSet specification is fully built before applying.
Ensure that any optional fields (e.g., labels or additional metadata) are properly set in the returned apply configuration.


46-48: Check status after applying changes.
Ensure that applying the new configuration results in the expected rolling update behavior. For instance, verifying that the updated ReplicaSet correctly manages Pods and respects your FieldManager name.

pkg/k8s/pod.go (16)

17-17: New import for corev1 apply configurations.
This is consistent with your switch to server-side apply for Pod resources.


91-94: Switching to Apply.
Using “Apply” for Pods offers more granular, declarative updates. Confirm that other code (e.g., watchers or readiness checks) continue to function correctly with this approach.


365-368: Memory optimization: reusing capacity.
You’ve allocated the slice with the right capacity (len(envMap)), which avoids unnecessary re-allocation. Good job!


376-377: Early check for zero volumes.
Great to optimize for this scenario by returning early with an empty slice.


380-384: PersistentVolumeClaim usage.
Be mindful that the PVC must exist already. If it doesn’t, the Pod will fail to schedule. Ensure there is a preceding step to create or reference the PVC if needed.


403-408: ConfigMap usage.
Ensure that the ConfigMap is created beforehand. Otherwise, the Pod may fail to mount this volume.


Line range hint 422-444: Combining volumes for normal volumes and file-based volumes.
Make sure the subPath logic is correct if multiple files share the same directory structure.


455-455: Graceful early return.
Returning an empty slice when no volumes and no files exist is a good approach, prevents accidental conflicts in the init container.

Also applies to: 457-457


460-460: Mounting knuuPath.
This approach is correct for the init container to handle volumes. Confirm that the final container also receives these volumes if needed.

Also applies to: 463-465


483-483: Appending file volume mounts in containerFiles.
Be sure to handle collisions if multiple files share the same subPath.

Also applies to: 485-488


550-556: Double-check CPU and memory usage.
Ensure requests/limits are balanced, to avoid scheduling or resource starvations.


582-583: Only partial definition.
Since you’re returning a ContainerApplyConfiguration, confirm that no required fields are omitted (e.g., ephemeral storage requests).


Line range hint 582-599: prepareContainer well-structured.
This function appears well organized, with the relevant fields set from config. Overall, good job.


601-606: Init containers only if volumes or files.
This is a good optimization to avoid extra overhead if there’s nothing to prepare.


624-628: preparePodSpec merges sidecar containers.
Watch out for potential conflicts in naming or volumes when sidecars are added.


646-652: Using Pod(...) builder function is consistent with apply pattern.
This is consistent with the approach used in ReplicaSets. Good example of code consistency.

pkg/k8s/serviceaccount.go (2)

7-7: Importing apierrs is aligned with the existing code.
It’s good to unify the error handling approach across resources.


31-33: Graceful handling for pre-existing ServiceAccounts.
Returning a specialized error (ErrServiceAccountAlreadyExists) improves clarity and debugging.

e2e/basic/probe_test.go (1)

8-8: Leveraging corev1 apply configurations for probes.
This promotes a consistent pattern across your codebase and simplifies probe creation.

pkg/k8s/prob.go (1)

5-17: LGTM! Proper nil handling and field copying.

The implementation correctly handles nil input and copies all basic probe configuration fields.

e2e/system/start_callback_test.go (1)

26-32: LGTM! Clean fluent interface usage.

The updated probe configuration using method chaining improves readability while maintaining the same functionality.

e2e/longrunning/simple_test.go (1)

46-81: ⚠️ Potential issue

Add cleanup to prevent resource leaks

The createInstance function should implement cleanup to prevent resource leaks in case of test failures.

Let's verify if there are any other tests that might need cleanup:

pkg/k8s/role.go (1)

40-42: LGTM: Improved error handling for existing roles

The error handling enhancement properly identifies and wraps "already exists" errors, improving error clarity.

pkg/instance/monitoring.go (3)

8-17: LGTM: Modernized Kubernetes type usage

The transition from direct Kubernetes API types to apply configurations is a good improvement that aligns with modern Kubernetes practices for declarative resource management.


92-104: LGTM: Proper deep copy implementation

The clone method correctly implements deep copying of probe configurations using k8s.DeepCopyProbe.


Line range hint 35-44: LGTM: Consistent probe configuration updates

The probe setter methods have been consistently updated to use ProbeApplyConfiguration, maintaining a uniform approach across all probe types (liveness, readiness, and startup).

Let's verify the probe configuration usage across the codebase:

Also applies to: 51-60, 67-76

✅ Verification successful

Probe configuration types are consistently used across the codebase

The verification confirms that all probe-related code consistently uses ProbeApplyConfiguration:

  • All probe fields in pkg/k8s/pod.go use ProbeApplyConfiguration
  • All probe setters in pkg/instance/monitoring.go use ProbeApplyConfiguration
  • The deep copy function in pkg/k8s/prob.go handles ProbeApplyConfiguration
  • The only v1.Probe() usages are in test files where they are appropriately used for test setup
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining direct v1.Probe usage that might need updating
# and verify that all probe configurations use the new type

# Search for any remaining v1.Probe usage
echo "Checking for remaining v1.Probe usage..."
rg "v1\.Probe"

# Search for ProbeApplyConfiguration usage
echo "Checking ProbeApplyConfiguration usage..."
rg "ProbeApplyConfiguration"

Length of output: 2977

pkg/instance/instance.go (1)

64-64: LGTM: Using recommended pointer initialization

Good use of ptr.To helper from k8s.io/utils/ptr for initializing the pointer type, which is the recommended approach for Kubernetes types.

pkg/instance/build.go (2)

21-21: LGTM: Consistent pointer type handling

The imagePullPolicy field and its accessor methods have been properly updated to use pointer types, maintaining consistency with Kubernetes patterns.

Also applies to: 38-44


247-248: LGTM: Improved build directory caching

Good improvement to cache the build directory path, preventing redundant path joining operations on subsequent calls.

pkg/traefik/traefik.go (1)

5-5: LGTM: Improved error handling for service account creation.

The changes appropriately handle the case where a service account already exists, preventing unnecessary errors while still catching and wrapping other potential issues.

Also applies to: 56-58

pkg/instance/execution.go (1)

314-327: LGTM: Consistent error handling for Kubernetes resources.

The changes implement a consistent pattern for handling "already exists" errors across service accounts, roles, and role bindings. This improves robustness by:

  • Allowing idempotent resource creation
  • Properly wrapping other errors for better error reporting
pkg/k8s/errors.go (1)

150-152: LGTM! The new error definitions follow consistent patterns.

The new error variables for handling "already exists" scenarios follow the established naming conventions and message formatting patterns. They complement the existing creation errors and enhance error handling for service accounts, roles, and role bindings.

Comment on lines +38 to +54
Name string // Name to assign to the Container
Image string // Name of the container image to use for the container
ImagePullPolicy *v1.PullPolicy // Image pull policy for the container
Command []string // Command to run in the container
Args []string // Arguments to pass to the command in the container
Env map[string]string // Environment variables to set in the container
Volumes []*Volume // Volumes to mount in the Pod
MemoryRequest resource.Quantity // Memory request for the container
MemoryLimit resource.Quantity // Memory limit for the container
CPURequest resource.Quantity // CPU request for the container
LivenessProbe *corev1.ProbeApplyConfiguration // Liveness probe for the container
ReadinessProbe *corev1.ProbeApplyConfiguration // Readiness probe for the container
StartupProbe *corev1.ProbeApplyConfiguration // Startup probe for the container
Files []*File // Files to add to the Pod
SecurityContext *corev1.SecurityContextApplyConfiguration // Security context for the container
TCPPorts []int // TCP ports to expose on the Pod
UDPPorts []int // UDP ports to expose on the Pod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ensure pointer fields are non-nil when needed.
Pointer fields like ImagePullPolicy and Probes can lead to nil pointer issues if not carefully checked. Double-check usage to avoid runtime exceptions.

Comment on lines 33 to 36
_, err = ins1.Execution().ExecuteCommand(ctx, "echo", fileContent, ">", "/tmp/test-id")
require.NoError(t, err)

time.Sleep(1 * time.Second)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Replace arbitrary sleep with proper readiness check

Using time.Sleep is not reliable for testing. Consider implementing a proper readiness check or using a probe.

- _, err = ins1.Execution().ExecuteCommand(ctx, "echo", fileContent, ">", "/tmp/test-id")
- require.NoError(t, err)
- 
- time.Sleep(1 * time.Second)
+ _, err = ins1.Execution().ExecuteCommand(ctx, "echo", fileContent, ">", "/tmp/test-id")
+ require.NoError(t, err)
+ 
+ // Wait for file to be ready
+ err = waitForFile(ctx, ins1, "/tmp/test-id")
+ require.NoError(t, err)

Consider adding this helper function:

func waitForFile(ctx context.Context, ins *instance.Instance, path string) error {
    for i := 0; i < 10; i++ {
        _, err := ins.Execution().ExecuteCommand(ctx, "test", "-f", path)
        if err == nil {
            return nil
        }
        select {
        case <-ctx.Done():
            return ctx.Err()
        case <-time.After(100 * time.Millisecond):
        }
    }
    return fmt.Errorf("file %s not ready", path)
}

pkg/k8s/role.go Outdated
Comment on lines 70 to 73
if apierrs.IsAlreadyExists(err) {
return ErrClusterRoleAlreadyExists.WithParams(name).Wrap(err)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Potential race condition in CreateClusterRole

The Get-then-Create pattern could lead to race conditions in concurrent scenarios. Consider using Create with error handling instead, similar to the CreateRole implementation.

Apply this diff to fix the race condition:

- _, err := c.clientset.RbacV1().ClusterRoles().Get(ctx, name, metav1.GetOptions{})
- if apierrs.IsAlreadyExists(err) {
-   return ErrClusterRoleAlreadyExists.WithParams(name).Wrap(err)
- }

 role := &rbacv1.ClusterRole{
   ObjectMeta: metav1.ObjectMeta{
     Name:   name,
     Labels: labels,
   },
   Rules: policyRules,
 }
 _, err = c.clientset.RbacV1().ClusterRoles().Create(ctx, role, metav1.CreateOptions{})
+ if apierrs.IsAlreadyExists(err) {
+   return ErrClusterRoleAlreadyExists.WithParams(name).Wrap(err)
+ }
 return err
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if apierrs.IsAlreadyExists(err) {
return ErrClusterRoleAlreadyExists.WithParams(name).Wrap(err)
}
role := &rbacv1.ClusterRole{
ObjectMeta: metav1.ObjectMeta{
Name: name,
Labels: labels,
},
Rules: policyRules,
}
_, err = c.clientset.RbacV1().ClusterRoles().Create(ctx, role, metav1.CreateOptions{})
if apierrs.IsAlreadyExists(err) {
return ErrClusterRoleAlreadyExists.WithParams(name).Wrap(err)
}
return err

Comment on lines 86 to 88
if err != nil && apierrs.IsAlreadyExists(err) {
return ErrClusterRoleBindingAlreadyExists.WithParams(name).Wrap(err)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix redundant error check condition

The condition err != nil && apierrs.IsAlreadyExists(err) is redundant since IsAlreadyExists already includes a nil check.

Apply this diff to simplify the condition:

- if err != nil && apierrs.IsAlreadyExists(err) {
+ if apierrs.IsAlreadyExists(err) {
    return ErrClusterRoleBindingAlreadyExists.WithParams(name).Wrap(err)
  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if err != nil && apierrs.IsAlreadyExists(err) {
return ErrClusterRoleBindingAlreadyExists.WithParams(name).Wrap(err)
}
if apierrs.IsAlreadyExists(err) {
return ErrClusterRoleBindingAlreadyExists.WithParams(name).Wrap(err)
}

Comment on lines +345 to +347
// TODO: remove this
fmt.Printf("\nfile: %#v\n", file)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Replace debug print with structured logging.

The current debug print has several issues:

  1. Uses fmt.Printf instead of the structured logger
  2. Prints directly to stdout which could interfere with program output
  3. TODO comment lacks context

Replace with structured logging:

-	// TODO: remove this
-	fmt.Printf("\nfile: %#v\n", file)
+	s.instance.Logger.WithField("file", file).Debug("adding file to instance")

Committable suggestion skipped: line range outside the PR's diff.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (11)
pkg/k8s/k8s.go (1)

39-40: Add documentation for the fieldManager constant

Please add a comment explaining the purpose and usage of this constant. Field managers are important in Kubernetes for tracking which component makes changes to resources, especially with Server-Side Apply operations.

+// FieldManager is the identifier used to track changes made by this component
+// when applying Kubernetes resources using Server-Side Apply
 fieldManager = "knuu"
pkg/k8s/test_suite_test.go (1)

100-117: Consider making the helper method more flexible

The helper method is well-structured and follows the established patterns. However, it could be more reusable by accepting a slice of subjects instead of a single service account.

Consider this alternative implementation:

-func (s *TestSuite) createClusterRoleBinding(name string, labels map[string]string, clusterRole string, serviceAccount string) error {
+func (s *TestSuite) createClusterRoleBinding(name string, labels map[string]string, clusterRole string, subjects []rbacv1.Subject) error {
 	_, err := s.client.Clientset().RbacV1().ClusterRoleBindings().
 		Create(context.Background(), &rbacv1.ClusterRoleBinding{
 			ObjectMeta: metav1.ObjectMeta{
 				Name:   name,
 				Labels: labels,
 			},
 			RoleRef: rbacv1.RoleRef{
 				Kind:     "ClusterRole",
 				Name:     clusterRole,
 				APIGroup: rbacv1.GroupName,
 			},
-			Subjects: []rbacv1.Subject{
-				{Kind: "ServiceAccount", Name: serviceAccount},
-			},
+			Subjects: subjects,
 		}, metav1.CreateOptions{})
 	return err
}
pkg/k8s/rolebinding_test.go (1)

140-142: Consider adding a comment to clarify test setup

While using createClusterRoleBinding is appropriate, adding a comment would make it clearer that this setup is intentionally creating a pre-existing binding for the test.

 setupMock: func() {
+	// Create a pre-existing cluster role binding to test the "already exists" scenario
 	s.createClusterRoleBinding("existing-clusterrolebinding", map[string]string{"app": "existing"}, "existing-clusterrole", "existing-sa")
 },
pkg/k8s/pod.go (8)

416-424: Confirm SubPath usage correctness.
By using SubPath derived from “strings.TrimLeft(volume.Path, "/")”, mount path collisions can occur if multiple volumes share the same subpath. Double-check you’re not unintentionally overshadowing subpaths in the same container.


435-438: Avoid potential conflicts with the container’s root path.
Trimming “/” can produce empty SubPath if the directory is exactly “/”. This is minor, but might cause unexpected mounting behavior in edge cases.


Line range hint 449-472: Check for large volume ownership changes in init container.
The init container can recursively chown large volumes. This may lead to performance overhead for large data sets. Check logs and performance constraints if volumes are large.


477-482: Runtime overhead for copying files in init container.
Copying files individually may be slow if many files exist; consider using a more optimized approach (e.g., tar-then-untar).


556-569: Naming collisions with repeated port usage.
The port name is being auto-generated from the port number. If a port is used multiple times, the name might conflict. Consider a fallback or unique suffix.


595-604: Multiple init containers in a single Pod.
Currently, you only return a single init container if volumes or files exist. If you need to chain multiple init processes in the future, consider appending to a slice rather than overwriting.


614-614: Potentially redundant wrapper
preparePodVolumes is a direct call to buildPodVolumes. Evaluate whether a direct call is simpler if no additional logic is present.


618-622: Consider nil returns instead of empty pointer
Returning a pointer to a PodSpecApplyConfiguration is fine. Alternatively, you might opt for a struct literal if usage patterns are consistent.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f429274 and 0a73c1d.

📒 Files selected for processing (10)
  • pkg/k8s/k8s.go (1 hunks)
  • pkg/k8s/pod.go (12 hunks)
  • pkg/k8s/pod_test.go (4 hunks)
  • pkg/k8s/replicaset.go (3 hunks)
  • pkg/k8s/replicaset_test.go (5 hunks)
  • pkg/k8s/role.go (3 hunks)
  • pkg/k8s/role_test.go (1 hunks)
  • pkg/k8s/rolebinding.go (3 hunks)
  • pkg/k8s/rolebinding_test.go (2 hunks)
  • pkg/k8s/test_suite_test.go (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/k8s/rolebinding.go
  • pkg/k8s/role.go
🔇 Additional comments (27)
pkg/k8s/test_suite_test.go (1)

13-13: LGTM: Import added for RBAC types

The addition of the rbacv1 import is appropriate for the new ClusterRoleBinding functionality.

pkg/k8s/role_test.go (1)

163-163: LGTM: Simplified error handling

The change to use errInternalServerError directly makes the error handling more consistent with other test cases and follows the principle of keeping test assertions simple and focused.

pkg/k8s/rolebinding_test.go (1)

152-156: LGTM: Consistent error simulation

The mock setup for simulating a creation error is well-structured and follows the established pattern in other test cases.

pkg/k8s/pod.go (11)

91-94: Confirm the 'Apply' usage with appropriate PatchType.
DeployPod now calls the Apply method instead of Create on pods. This ensures a declarative approach, but confirm you're using the correct PatchType (e.g., ApplyPatchType) in the background. Some older clusters or mocks could reject partial patches.


359-362: Initialization of EnvVarApplyConfiguration is correct.
Looks good. The usage of pointers for Name and Value ensures a safe conversion to string references.


370-379: Validate PVC existence for persistent volumes.
A persistent volume claim is referenced by “ClaimName” in line 378. Ensure the PVC is created or handled before. If the PVC does not exist, the pod creation will fail.


387-390: EmptyDir used for ephemeral files.
Great use of EmptyDir for ephemeral storage. Just confirm that ephemeral usage is acceptable for your scenario.


397-402: ConfigMap volume for files.
Ensure that the ConfigMap resource (same name as the Pod) exists. If not, this causes mounting issues.


443-443: ContainerFiles not used in the appended slice?
The variable is declared but appended only at return. This might be intended, but verify it’s needed.


544-550: ResourceRequests & ResourceLimits.
Using v1.ResourceList inside an apply configuration is valid. Check that the cluster’s admission controllers override or patch these resources if necessary.
[approve]


Line range hint 576-590: Transitioning to ContainerApplyConfiguration.
The transition from v1.Container to ContainerApplyConfiguration looks correct. Ensure the result is consistent for all container-building logic (initContainers, sidecars).


640-646: PodApplyConfiguration usage
Switching from v1.Pod to PodApplyConfiguration is consistent with the overall shift. Confirm you include the correct APIVersion/Kind for custom resources if used.


17-17: Check for manifested usage references in code.
The import for "k8s.io/client-go/applyconfigurations/core/v1" has been added, shifting the usage toward apply configuration objects. Verify that all references in the code (both old v1 types and new corev1 apply types) align consistently to avoid confusion.


38-54: Ensure pointer fields are non-nil when needed.
The ContainerConfig struct now uses pointer fields for ImagePullPolicy, LivenessProbe, ReadinessProbe, StartupProbe, and SecurityContext. In some workflows, they could remain nil if not set, potentially causing nil-pointer dereferences. Please confirm correct usage and guard checks.

✅ Verification successful

Let me try a more specific pattern to find potential issues with these pointer fields.


Based on the search results, I can now provide a final response:

Pointer fields are safely handled in the codebase
The pointer fields in ContainerConfig are used appropriately throughout the codebase:

  • ImagePullPolicy is set through a dedicated setter that ensures non-nil value
  • Probes (Liveness, Readiness, Startup) are set through monitoring methods that include state validation
  • SecurityContext is created via prepareSecurityContext() which always returns a non-nil value
  • All fields are directly passed to Kubernetes API without dereferencing in pkg/k8s/pod.go
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Identify direct pointer dereferences for these fields without nil-checks.
ast-grep --pattern $'$_.$_Field.$_' 

Length of output: 36


Script:

#!/bin/bash
# Search for usage of these pointer fields
rg -U "ImagePullPolicy|LivenessProbe|ReadinessProbe|StartupProbe|SecurityContext" --type go -A 3 -B 3

# Also look for the ContainerConfig struct usage
rg "ContainerConfig" --type go -A 3 -B 3

Length of output: 20160

pkg/k8s/replicaset.go (3)

12-14: Switch to applyconfigurations for ReplicaSet
You’ve introduced “applyv1” and “corev1” package references. Confirm that the cluster supports server-side apply for apps/v1.


Line range hint 35-55: Handling existing ReplicaSet
Lines 37-43 fetch the existing ReplicaSet to reuse selector, labels, and annotations. Good approach. Just ensure you handle mismatched selectors or conflicting changes if the existing ReplicaSet was created differently.


171-187: prepareReplicaSet usage
The method returns an applyv1.ReplicaSetApplyConfiguration. This is consistent with the new approach. Confirm that all references to the old appv1.ReplicaSet have been replaced in calling code.

pkg/k8s/pod_test.go (5)

5-5: Minor import addition
Looks like a new “fmt” usage or reordering. No issues noted.


32-47: Mock patch reactor.
The test now uses a patch reactor for the “pods” resource. This is correct given the server-side apply under the hood. Ensure the returned Pod object is valid enough for subsequent assertions.


62-62: Client error simulation
The test setup simulates an internal server error on patch. This is consistent.


67-67: Direct error checking
Error is directly compared to errInternalServerError, removing extra wrapping. This might unify error handling across tests.


105-116: Patch-based replacement
Same logic as the creation test, but for replacing a pod. Looks consistent.

pkg/k8s/replicaset_test.go (5)

5-5: Minor import change
Imports “fmt”; no immediate issues. Looks fine.


39-55: Mock patch reactor for replicaset
Similar to pod patch tests, verifying the patch logic. This indicates consistent approach across resources.


75-82: Client error injection
The test simulates a patch failure by returning errInternalServerError. Straightforward approach to validate error handling.


130-144: Graceful replacement
Ensures the old replicaset is found and patched with new changes, reusing metadata. Looks good.


220-234: Patch-based replacement test
Consistent with earlier patterns. The patch approach in ReplaceReplicaSet is properly tested.

Comment on lines +39 to +40
// FieldManager is the field manager to use for the Kubernetes client
fieldManager = "knuu"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider exporting the fieldManager constant

Since this constant is used across packages (referenced in test files), it should be exported to maintain proper encapsulation and visibility. Consider renaming it from fieldManager to FieldManager.

-fieldManager = "knuu"
+FieldManager = "knuu"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// FieldManager is the field manager to use for the Kubernetes client
fieldManager = "knuu"
// FieldManager is the field manager to use for the Kubernetes client
FieldManager = "knuu"

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
pkg/knuu/knuu.go (1)

98-101: Consider adding debug-level logging for normal cleanup

The skip case is logged at info level, but there's no corresponding log when cleanup proceeds normally. This makes it harder to track cleanup behavior in logs.

 func (k *Knuu) CleanUp(ctx context.Context) error {
 	if k.skipCleanup {
 		k.Logger.Info("skipping cleanup")
 		return nil
 	}
+	k.Logger.Debug("performing cleanup")
 	return k.K8sClient.DeleteNamespace(ctx, k.Scope)
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0a73c1d and 929f2f0.

📒 Files selected for processing (2)
  • e2e/longrunning/simple_test.go (1 hunks)
  • pkg/knuu/knuu.go (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • e2e/longrunning/simple_test.go
🔇 Additional comments (3)
pkg/knuu/knuu.go (3)

44-44: LGTM! Field additions follow Go conventions

The new skipCleanup fields are well-designed:

  • Public SkipCleanup in Options allows external control
  • Private skipCleanup in Knuu maintains proper encapsulation

Also applies to: 56-56


74-74: LGTM! Proper initialization of skipCleanup

The field is correctly initialized from the options struct.


98-101: Verify resource cleanup in long-running tests

When skipCleanup is true, resources could be left running indefinitely. Please ensure that:

  1. Long-running tests have proper cleanup mechanisms
  2. The timeout handler is still effective when skipCleanup is true
  3. There's documentation about the responsibility to clean up when using this option

Let's check if the timeout handler respects skipCleanup:

✅ Verification successful

Let me gather more information about the timeout and cleanup mechanisms.


Let me check one more aspect about the timeout handling implementation.


Based on the gathered information, I can now provide the final response:

Resource cleanup is properly handled even with skipCleanup

The skipCleanup option is safely implemented with proper timeout handling:

  1. A timeout handler is always created regardless of skipCleanup setting (via handleTimeout in knuu.go)
  2. The timeout handler runs as a separate instance with type TimeoutHandlerInstance
  3. The timeout handler will clean up resources after the specified timeout (default 60 minutes)
  4. The HandleStopSignal implementation ensures cleanup on SIGINT/SIGTERM

The long-running test in e2e/longrunning/simple_test.go uses this feature correctly with a 5-minute timeout, ensuring resources won't be left indefinitely.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any interaction between timeout handler and skipCleanup
rg -A 5 'skipCleanup.*timeout|timeout.*skipCleanup'

# Check if there are any tests using skipCleanup
rg -A 10 'SkipCleanup.*true|skipCleanup.*true'

Length of output: 660


Script:

#!/bin/bash
# Check timeout handler implementation
ast-grep --pattern 'func $_ HandleStopSignal($_) {
  $$$
}'

# Check cleanup implementation
ast-grep --pattern 'func $_ cleanup($_, $_) {
  $$$
}'

# Look for any other cleanup related code
rg -A 5 'cleanup|Cleanup'

Length of output: 3665


Script:

#!/bin/bash
# Check HandleStopSignal implementation and any timeout related code
rg -A 10 'HandleStopSignal|func.*HandleStopSignal'

# Look for timeout related cleanup
rg -A 10 'Timeout|timeout.*cleanup|cleanup.*timeout'

Length of output: 44536

@smuu
Copy link
Member

smuu commented Jan 9, 2025

Is this PR based on another PR? I can see many changes here and not only adding a test.
What files should I review here?

@mojtaba-esk
Copy link
Contributor Author

Is this PR based on another PR? I can see many changes here and not only adding a test. What files should I review here?

No I don't think so, if I recall well, to run the test, some parts of the knuu code had to change.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
pkg/k8s/pod.go (1)

545-570: Consider adding CPU limits.

While memory limits are set, CPU limits are missing. This is often intentional but worth documenting why.

Consider adding a comment explaining why CPU limits are omitted, as this is a common source of confusion.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 929f2f0 and 6297ac8.

📒 Files selected for processing (4)
  • pkg/instance/build.go (3 hunks)
  • pkg/instance/execution.go (2 hunks)
  • pkg/k8s/pod.go (12 hunks)
  • pkg/knuu/knuu.go (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/instance/execution.go
  • pkg/knuu/knuu.go
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: test (./e2e/netshaper, 60m)
  • GitHub Check: test (./e2e/basic, 15m)
🔇 Additional comments (7)
pkg/instance/build.go (3)

21-21: Ensure pointer field is properly initialized.

The change from value type to pointer type for imagePullPolicy requires careful handling of nil cases.


39-44: LGTM! Methods updated correctly for pointer type.

The ImagePullPolicy and SetImagePullPolicy methods have been properly updated to handle the pointer type.


257-258: LGTM! Improved state management.

Good improvement to store the build directory path in the struct field before returning it, ensuring consistency for subsequent calls.

pkg/k8s/pod.go (4)

38-54: LGTM! Modernized Kubernetes type system.

The ContainerConfig struct has been updated to use apply configurations, which is a more modern and type-safe approach. This change aligns with Kubernetes best practices for declarative configuration.


92-95: Good transition to server-side apply.

The change from Create to Apply is a positive improvement that:

  1. Enables declarative updates
  2. Handles conflicts better
  3. Supports partial updates

Line range hint 577-648: LGTM! Comprehensive container configuration.

The container preparation logic has been well-updated to use apply configurations consistently throughout. The changes maintain all the necessary configuration options while providing better type safety.


Line range hint 360-483: Verify pointer safety in build functions.

The transition to apply configurations is good, but ensure all pointer fields are properly initialized to avoid nil dereferences.

✅ Verification successful

Pointer safety verification completed - No issues found

All pointer field initializations using ptr.To[T]() are safe from nil dereferences. The code properly handles pointer conversions with direct values and protected field accesses.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for potential nil pointer dereferences in pointer field assignments
rg -A 2 "ptr\.To\[.*\]\(" pkg/k8s/

Length of output: 4161

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants