
fix #3285: adding waiting for list contexts #3286

Merged: 1 commit merged into fabric8io:master from the after_wait branch on Jul 21, 2021

Conversation

@shawkins (Contributor) commented Jun 30, 2021

Description

Adds waiting methods for list contexts as per #3285. Also adds methods to directly return wait futures; to expose these, the resource handler generation was simplified.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Chore (non-breaking change which doesn't affect codebase;
    test, version modification, documentation, etc.)

Checklist

  • Code contributed by me aligns with current project license: Apache 2.0
  • I added a CHANGELOG entry regarding this change
  • I have implemented unit tests to cover my changes
  • I have added/updated the javadocs and other documentation accordingly
  • No new bugs, code smells, etc. in SonarCloud report
  • I tested my code in Kubernetes
  • I tested my code in OpenShift

@centos-ci

Can one of the admins verify this patch?

@manusa manusa added this to the 5.6.0 milestone Jun 30, 2021
@manusa manusa added the wip label Jun 30, 2021
@shawkins shawkins force-pushed the after_wait branch 2 times, most recently from 6444b41 to d661f2f on June 30, 2021 at 19:27
@shawkins (Contributor, Author) commented Jul 1, 2021

After implementing the logic in NamespaceVisitFromServerGetWatchDeleteRecreateWaitApplicableListImpl to directly use futures exposed by waitable methods, I decided that does not seem worth it yet. While it does save some threads, we are still creating 2 threads per watch. The cleanup to the resource handler generation seems valid, so I'm leaving that in.

The function is now proposed as:

  /**
   * Wait for the entire list available at this context to satisfy the given predicate.
   * @param condition the {@link Predicate} to test
   * @param pollInterval the polling interval to use, a value less than 1 indicates that every event should be tested
   * @param timeout after which a {@link KubernetesClientTimeoutException} is thrown
   * @param timeUnit unit of the pollInterval and timeout
   * @return the list of items after the condition is met
   * @throws KubernetesClientTimeoutException if time runs out before the condition is met
   */
  List<T> waitUntilListCondition(Predicate<List<T>> condition, long pollInterval, long timeout, TimeUnit timeUnit);

This lets you test at an interval or test on every event - under the covers it would be trivial to skip the interval tests when the resourceVersion has not changed, but I've left that out for now.

The usage looks like https://github.com/fabric8io/kubernetes-client/pull/3286/files#diff-e3b6fec210f5b68dea994bde82d20006708a6e34d5e64f7f62211290dde44f0fR216
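For illustration, a call against the proposed signature could look like the following sketch (the namespace, label, and readiness check are invented for the example, and client is assumed to be an existing KubernetesClient):

  // illustrative only: wait up to 2 minutes for at least 3 matching pods to be Running,
  // re-testing every 500 ms rather than on every watch event
  List<Pod> ready = client.pods()
      .inNamespace("default")
      .withLabel("app", "example")
      .waitUntilListCondition(
          pods -> pods.size() >= 3
              && pods.stream().allMatch(p -> "Running".equals(p.getStatus().getPhase())),
          500, 120_000, TimeUnit.MILLISECONDS);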

The only other thought is that if anyone doesn't want this method on WatchListDeletable, something similar could be added directly to SharedInformer.

After a preliminary review by @manusa or @rohanKanojia, I'll add a changelog and more tests.

@metacosm metacosm self-requested a review July 6, 2021 15:32
@shawkins (Contributor, Author) commented Jul 6, 2021

Separated the template changes as #3301

@shawkins shawkins changed the title from "WIP fix #3285: adding waiting for list contexts" to "fix #3285: adding waiting for list contexts" on Jul 6, 2021
@shawkins shawkins removed the wip label Jul 6, 2021
@shawkins (Contributor, Author) commented Jul 9, 2021

It may be worth mentioning that you could also just make this a method of the Informable interface rather than something that is directly on WatchListDeletable.

* @return the list of items after the condition is met
* @throws KubernetesClientTimeoutException if time runs out before the condition is met
*/
List<T> waitUntilListCondition(Predicate<List<T>> condition, long pollInterval, long timeout, TimeUnit timeUnit);
@manusa (Member) commented on the lines above:
should we provide a

default List<T> waitUntilListCondition(Predicate<List<T>> condition) {
  return waitUntilListCondition(condition, 0, TimeUnit.SECONDS);
}

for a non-polling default method?

@shawkins (Contributor, Author) replied:

Okay, I'll add:

default List<T> waitUntilListCondition(Predicate<List<T>> condition, long timeout, TimeUnit timeUnit) {
  return waitUntilListCondition(condition, 0, timeout, timeUnit);
}

And move the methods onto Informable.

A Collaborator replied:
I'm not a big fan of multiplying method overrides when they don't provide an obvious value. This leads to higher cognitive load and potential confusion in my opinion.

@manusa (Member) commented Jul 13, 2021:

Well, my suggestion is based on usability. For the average user I don't see the added value of being able to specify a polling interval.

Right now, I wouldn't be able to say what's the recommended approach for any given scenario. I suspect that we'll end up getting questions about this.

I think the reason to add this parameter is driven by the underlying implementation and whether or not informers are used. So in the end, I guess it boils down to this question: given that informers + cache should provide the exact same behavior, why would we want to use a polling mechanism instead?

@shawkins (Contributor, Author) replied:

> So in the end, I guess it boils down to this question: given that informers + cache should provide the exact same behavior, why would we want to use a polling mechanism instead?

The polling approach was to match the existing logic of the RollingUpdater - it would of course work without polling as well. The other thoughts as to why a polling option makes sense are for cases where a resource is rapidly changing (so testing on every event may not be desired), or where you want the test to have a temporal component - for example, test every 500 ms, but every 5 seconds emit a log or perform some other more involved action. We have a scenario where, while waiting for a resource to be ready, we periodically check for unscheduled pods.
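As a rough sketch of that kind of usage (the names, thresholds, and intervals are invented for the example; client is assumed to be a KubernetesClient), the predicate itself can carry the periodic side effect:

  // illustration only: test every 500 ms, but log progress roughly every 5 seconds (every 10th poll)
  AtomicInteger polls = new AtomicInteger();
  Predicate<List<Pod>> allScheduled = pods -> {
    long unscheduled = pods.stream()
        .filter(p -> p.getSpec().getNodeName() == null)
        .count();
    if (polls.incrementAndGet() % 10 == 0) {
      System.out.println(unscheduled + " pods are still unscheduled");
    }
    return unscheduled == 0;
  };
  client.pods().inNamespace("default")
      .waitUntilListCondition(allScheduled, 500, 60_000, TimeUnit.MILLISECONDS);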

Another viewpoint is that the polling is like a resync, but at a list level. If that and other handling were added to the ResourceEventHandler - onRefresh(List), onResync(List) - then we wouldn't necessarily need a DSL method, as this scenario could be implemented directly on the Informer.

Predicate<List<Pod>> predicate = ...
CompletableFuture<Void> future = ...
SharedInformer<Pod> informer = ...pods().inform(null, 1000);
informer.addEventHandler(new ResourceEventHandler<HasMetadata>() {
    public void onRefresh(List<HasMetadata> objs) {
      test(objs);
    }

    public void onResync(List<HasMetadata> objs) {
      test(objs);
    }
   
    public void onAdd(HasMetadata obj) {
      // do something if I want to test every add
      // requires access to the Informer Cache
    }
  
    ...

    void test(List<HasMetadata> objs) {
      // test the predicate, complete the future
    }
});

A built-in implementation of this ResourceEventHandler could be supplied - that would be somewhat similar to the ReadinessWatcher (although that looks like it's no longer needed).
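A rough sketch of what such a built-in handler might look like (ListConditionHandler is a made-up name, and onRefresh/onResync are the hypothetical callbacks from the snippet above, not existing interface methods; the lister would typically be backed by the informer's store):

  // hypothetical sketch only - completes a future once the list-level predicate is satisfied
  class ListConditionHandler<T extends HasMetadata> implements ResourceEventHandler<T> {
    private final Predicate<List<T>> condition;
    private final Supplier<List<T>> lister; // e.g. the informer's cache contents
    private final CompletableFuture<List<T>> future = new CompletableFuture<>();

    ListConditionHandler(Predicate<List<T>> condition, Supplier<List<T>> lister) {
      this.condition = condition;
      this.lister = lister;
    }

    CompletableFuture<List<T>> getFuture() {
      return future;
    }

    private void test(List<T> objs) {
      if (!future.isDone() && condition.test(objs)) {
        future.complete(objs);
      }
    }

    // proposed list-level callbacks, not part of the current interface
    public void onRefresh(List<T> objs) { test(objs); }
    public void onResync(List<T> objs) { test(objs); }

    @Override
    public void onAdd(T obj) { test(lister.get()); }
    @Override
    public void onUpdate(T oldObj, T newObj) { test(lister.get()); }
    @Override
    public void onDelete(T obj, boolean deletedFinalStateUnknown) { test(lister.get()); }
  }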

A Member replied:

So if it does make sense to have both options, IMHO it is clearer to have the 2 methods, one that implies the non-polling behavior and another one where it's actually expected.
For me it's confusing that a "magic" 0 will disable polling and assume you want the async behavior instead.
Anyway, we can discuss this point in today's meeting.

Commit: "also updating existing logic to use the wait methods"
@shawkins (Contributor, Author) commented:
I've made a couple of changes/simplifications.

  1. The ResourceEventHandler now has a method onNothing to be notified when there is nothing. This was the simplest / most consistent change I could think of to put this behavior on ResourceEventHandler. Alternatives would include adding methods such as onList - and even possibly delegating the behavior to the event handler, but that gets complicated.

  2. Changed the method signature to Informable.informOnCondition(Predicate<List<T>>), which returns a CompletableFuture. That provides the flexibility to set up multiple of these conditions and then wait across all of them (a sketch of that follows the example below), rather than trying to force everything into a single condition with a poll/resync for alternative behavior. For callers on Java 9+ this will look like:

    client.pods()
        .inNamespace(namespace)
        .withLabel("test", "WaitUntilReadyIT")
        .informOnCondition(Collection::isEmpty).orTimeout(60, TimeUnit.SECONDS).get();
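Since each call returns its own future, several conditions can be combined; a minimal sketch (the namespaces, labels, and conditions are invented for the example):

  // wait for two independent conditions, with one overall timeout (orTimeout is Java 9+)
  CompletableFuture<List<Pod>> podsGone = client.pods()
      .inNamespace("test")
      .withLabel("app", "demo")
      .informOnCondition(Collection::isEmpty);
  CompletableFuture<List<ConfigMap>> configPresent = client.configMaps()
      .inNamespace("test")
      .informOnCondition(l -> !l.isEmpty());
  CompletableFuture.allOf(podsGone, configPresent)
      .orTimeout(60, TimeUnit.SECONDS)
      .get();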

@manusa (Member) left a comment:

LGTM, thx!

@manusa manusa requested a review from rohanKanojia July 20, 2021 09:44
@manusa manusa linked an issue Jul 20, 2021 that may be closed by this pull request
@manusa manusa merged commit 8ed9692 into fabric8io:master Jul 21, 2021
Linked issue that may be closed by this pull request: Implement waiting for lists (#3285)