Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clientv3: Canceling Watch() doesn't send &pb.WatchCancelRequest #9416

Closed
yudai opened this issue Mar 9, 2018 · 10 comments · Fixed by #11850
Closed

clientv3: Canceling Watch() doesn't send &pb.WatchCancelRequest #9416

yudai opened this issue Mar 9, 2018 · 10 comments · Fixed by #11850
Assignees
Labels
Milestone

Comments

@yudai
Copy link
Contributor

yudai commented Mar 9, 2018

Summary

Canceling a context that has been used to invoke Watch() is supposed to stop subscribing events for the Watch(), meaning that canceling should release resources in the server side.

However, clientv3 doesn't send a &pb.WatchCancelRequest message to the server when a substream is canceled, which leads resource leak in the side until the parent watch stream itself is closed.

Issue

Tested with the master branch (00b8423).

watchGrpcStream serves watch substreams in serveSubstream and accepts context canceling for them there as well (using ws.initReq.ctx).
When ws.initReq.ctx is canceled, it eventually sends the ws to w.closingc, and the ws is handled by the watchGrpcStream.run(). Then it calls w.closeSubstream(ws) and the substream is removed from internal w.substreams. (the client side is fine, it releases the resources for substreams)

However, this canceling procedure for substreams doesn't send any cancel messages to the server. The server therefore doesn't know a substream is canceled by the client at this moment.

When the watched key of a substream is actively updated, an event to the key eventually resolve this inconsistency here.

On the other hand, when the key is not updated at all, the resources for the substream in the server side kept forever until the watch stream itself is canceled or the client is dropped.

This behavior can be an issue when there are long-running clients that dynamically watch/unwatch different keys that are not actively updated after unwatching.

Reproducing

You can reproduce the situation by the following code.

package main

import "github.com/coreos/etcd/clientv3"

import (
	"context"
	"fmt"
	"log"
	"time"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("Start")

	cli.Watch(context.Background(), "/") // this watch keeps the stream alive

	num := 1000
	for i := 0; i < num; i++ {
		ctx, cancel := context.WithCancel(context.Background())
		key := fmt.Sprintf("/%d", i)
		cli.Watch(ctx, key)
		cancel()
	}

	log.Printf("Done")

	time.Sleep(time.Hour)
        // etcd_debugging_mvcc_watch_stream_total 1
        // etcd_debugging_mvcc_watcher_total 1001   <- expected to be 1
}

Proposal

I think we can simply send a cancel request when w.closingc receives a closing stream.

https://github.com/coreos/etcd/blob/release-3.2/clientv3/watch.go#L513

Adding something like this?

		case ws := <-w.closingc:
			cr := &pb.WatchRequest_CancelRequest{
				CancelRequest: &pb.WatchCancelRequest{
					WatchId: ws.id,
				},
			}
			req := &pb.WatchRequest{RequestUnion: cr}
			wc.Send(req)
@gyuho
Copy link
Contributor

gyuho commented Mar 9, 2018

I think this is a valid use case for multiplexed watch stream requests + gRPC proxy. Right now, only way to cancel mvcc watchers is closing the whole watch stream.

What do you think? @heyitsanthony @xiang90

@heyitsanthony
Copy link
Contributor

@yudai that's how it works today. A server-side mechanism to LRU evict watches in the v3rpc layer might be easier/cover more cases than reworking the client's cancel path.

@yudai
Copy link
Contributor Author

yudai commented Mar 9, 2018

Thanks for the comments. Please correct me if I misunderstand something below.

I think it's fine and working as expected that a mvcc.WatchStream is kept running when there is a corresponding watch stream in the client side. We need to close the whole watch stream to close mvcc.WatchStream.

The issue I wanted to mention was that mvcc.WatchStream.Cancel(id WatchID) is not called properly when you cancel a context for a clientv3.Watch() call (which creates a substream). mvcc.WatchStream.Cancel(id WatchID) releases mvcc.watcher (I'm a bit confused. this seems a way to cancel mvcc watchers without closing the whole watch stream), but since we don't call it, instances of mvcc.watcher piles up and not going to be freed, until you close the master watch stream.

In the v3rpc layer, v3rpc.serverWatchStream calls this mvcc.WatchStream.Cancel(id WatchID) properly when it receives *pb.WatchRequest_CancelRequest(https://github.com/coreos/etcd/blob/master/etcdserver/api/v3rpc/watch.go#L265-L268).

I therefore think we already have a proper implementation for substream canceling in the server side, and the only missing part is in the client side. The client library simply doesn't send *pb.WatchRequest_CancelRequest when users cancel contexts.

I assume this issue could be critical for gRPC proxy as well. I actually realized this issue when I was processing watch requests in our private gRPC proxy. Since clients don't send substream cancel requests, our proxy cannot release its internal resources for watch substreams as long as watch streams are alive. I think that the same thing should happen for gRPC proxy.

The mvcc layer also keeps unnecessary mvcc.watchers and other resources for context-canceled watch substreams as long as the watch stream for them is alive, as you can see etcd_debugging_mvcc_watcher_total in the reproducing code above. If a client repeats calling clientv3.Watch() and canceling it (it uses only one watch stream and generates many substreams), this can happen in any environments.

In my opinion, it would be better to make the client send cancel requests when users cancel contexts. It's pretty confusing for users that canceling contexts is not always canceling watch. Sending canceling requests on context canceling would be more straight forward and complies better with the defined protocol.

@yudai
Copy link
Contributor Author

yudai commented Apr 4, 2018

Created reproduction code:

package main

import "github.com/coreos/etcd/clientv3"

import (
  "context"
  "log"
  "time"
)

func main() {
  cli, err := clientv3.New(clientv3.Config{
    Endpoints:   []string{"localhost:2379"},
    DialTimeout: 5 * time.Second,
  })
  if err != nil {
    log.Fatal(err)
  }
  log.Printf("Start")

  cli.Watch(context.Background(), "/")

  for {
    ctx, cancel := context.WithCancel(context.Background())
    cli.Watch(ctx, "/")
    cancel()
  }

  log.Printf("Done")
}

You can confirm etcd_debugging_mvcc_watcher_total keeps increasing and never decrease with this code (as long as you don't put something to /).

@gyuho gyuho self-assigned this May 25, 2018
@gyuho
Copy link
Contributor

gyuho commented May 25, 2018

@yudai @heyitsanthony I will run some benchmarks to measure the overhead of canceling each watcher on closing channel receive, from client-side. Or investigate if it can be easily handled in storage layer.

@scotwells
Copy link

We seem to be experiencing a similar issue with our current production setup running on etcd v3.2.24. The number of open watches on the server continues to climb until we hit between ~23k and ~32k open watchers. Once the number of etcd open watches flat lines, etcd will fail to send watches out to clients and we see a sharp rise in Slow Watchers.

While diving into the code, we noticed that the watch request on the etcd server was not closed until the client got an event from the server after the client context was canceled. As an attempted workaround, we configured the clients to use WithProgressNotify. This change seemed to only delay the pile up, instead of resolving the issue.

Has there been any progress on determining if there is significant overhead in canceling a watch request when the context gets canceled on the client side?

Code executed when the client context is canceled

etcd/clientv3/watch.go

Lines 513 to 520 in e06761e

case ws := <-w.closingc:
w.closeSubstream(ws)
delete(closing, ws)
if len(w.substreams)+len(w.resuming) == 0 {
// no more watchers on this stream, shutdown
return
}
}

Code executed when an event is received from the server after the context is closed

etcd/clientv3/watch.go

Lines 489 to 496 in e06761e

cancelSet[pbresp.WatchId] = struct{}{}
cr := &pb.WatchRequest_CancelRequest{
CancelRequest: &pb.WatchCancelRequest{
WatchId: pbresp.WatchId,
},
}
req := &pb.WatchRequest{RequestUnion: cr}
wc.Send(req)

Number of Open Watches vs Slow Watchers on the server
image

@xiang90 xiang90 self-assigned this Feb 26, 2019
@gyuho gyuho modified the milestones: etcd-v3.4, etcd-v3.5 Aug 5, 2019
@stale
Copy link

stale bot commented Apr 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 6, 2020
@stale stale bot closed this as completed Apr 27, 2020
@fcvarela
Copy link

fcvarela commented May 4, 2020

@xiang90 Any chance this is actively being worked on? This has caused us production issues several times before and we'll invest the time to fix it if no one else is working on it.

@spzala spzala reopened this May 4, 2020
@stale stale bot removed the stale label May 4, 2020
@spzala
Copy link
Member

spzala commented May 4, 2020

@xiang90 Any chance this is actively being worked on? This has caused us production issues several times before and we'll invest the time to fix it if no one else is working on it.
Reopening the issue per your comment, thanks @fcvarela

@jackkleeman
Copy link
Contributor

I will run some benchmarks to measure the overhead of canceling each watcher on closing channel receive, from client-side

Can there be an overhead? Once a watch is closed on a client, we will eventually send a cancel request when the next message arrives. So, there are two cases by cancelling proactively:

  1. We end up cancelling earlier, as we don't wait for a message. Arguably this saves resources.
  2. We cancel where we otherwise wouldn't have, as no message would ever come. This is the case that is causing @monzo and presumably others production issues, and I don't know if leaving streams open indefinitely should be considered a performance improvement over closing them when no longer required.

I'm working on a PR to do what others above have suggested.

jackkleeman added a commit to jackkleeman/etcd that referenced this issue May 6, 2020
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes etcd-io#9416
Heavy inspiration drawn from the solutions proposed there.
jackkleeman added a commit to jackkleeman/etcd that referenced this issue May 6, 2020
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes etcd-io#9416
Heavy inspiration drawn from the solutions proposed there.
jackkleeman added a commit to jackkleeman/etcd that referenced this issue May 7, 2020
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes etcd-io#9416
Heavy inspiration drawn from the solutions proposed there.
jackkleeman added a commit to jackkleeman/etcd that referenced this issue May 7, 2020
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes etcd-io#9416
Heavy inspiration drawn from the solutions proposed there.
jackkleeman added a commit to monzo/etcd that referenced this issue May 11, 2020
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes etcd-io#9416
Heavy inspiration drawn from the solutions proposed there.
jackkleeman added a commit to monzo/etcd that referenced this issue May 11, 2020
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes etcd-io#9416
Heavy inspiration drawn from the solutions proposed there.
tangcong pushed a commit to tangcong/etcd that referenced this issue Jun 23, 2020
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes etcd-io#9416
Heavy inspiration drawn from the solutions proposed there.
tangcong pushed a commit to tangcong/etcd that referenced this issue Jun 23, 2020
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes etcd-io#9416
Heavy inspiration drawn from the solutions proposed there.
tangcong pushed a commit to tangcong/etcd that referenced this issue Jun 23, 2020
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes etcd-io#9416
Heavy inspiration drawn from the solutions proposed there.
cierpuchaw pushed a commit to cloudwan/gohan that referenced this issue Aug 24, 2020
The new fork at github.com/cloudwan/etcd_for_gohan contains backported
to v3.3.18 fix for etcd-io/etcd#9416 . We
use watches heavily and with that fix, the memory usage in etcd server
no longer grows uncontrollably.

Regenerated go.sum while at it.
cierpuchaw pushed a commit to cloudwan/etcd_for_gohan that referenced this issue Aug 26, 2020
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes etcd-io#9416
Heavy inspiration drawn from the solutions proposed there.

This a port to v3.3.8 of a change made by @jackkleeman in
87aa5a9.
fcvarela pushed a commit to monzo/etcd that referenced this issue Aug 27, 2020
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes etcd-io#9416
Heavy inspiration drawn from the solutions proposed there.
cierpuchaw added a commit to cloudwan/gohan that referenced this issue Sep 7, 2020
The new fork at github.com/cloudwan/etcd_for_gohan contains backported
to v3.3.18 fix for etcd-io/etcd#9416 . We
use watches heavily and with that fix, the memory usage in etcd server
no longer grows uncontrollably.

Regenerated go.sum while at it.
suhailpatel pushed a commit to monzo/etcd that referenced this issue Feb 3, 2022
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes etcd-io#9416
Heavy inspiration drawn from the solutions proposed there.
suhailpatel pushed a commit to monzo/etcd that referenced this issue Feb 14, 2022
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes etcd-io#9416
Heavy inspiration drawn from the solutions proposed there.
suhailpatel pushed a commit to monzo/etcd that referenced this issue Feb 14, 2022
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes etcd-io#9416
Heavy inspiration drawn from the solutions proposed there.
suhailpatel pushed a commit to monzo/etcd that referenced this issue Feb 14, 2022
Currently, watch cancel requests are only sent to the server after a
message comes through on a watch where the client has cancelled. This
means that cancelled watches that don't receive any new messages are
never cancelled; they persist for the lifetime of the client stream.
This has negative connotations for locking applications where a watch
may observe a key which might never change again after cancellation,
leading to many accumulating watches on the server.

By cancelling proactively, in most cases we simply move the cancel
request to happen earlier, and additionally we solve the case where the
cancel request would never be sent.

Fixes etcd-io#9416
Heavy inspiration drawn from the solutions proposed there.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging a pull request may close this issue.

8 participants