
improve @enforce_ordering to leverage a wait channel to avoid spinlocks #144

Merged (10 commits) on May 12, 2016

Conversation

sachinrekhi (Contributor)

@andrewgodwin I tried my hand at implementing the approach we discussed for improving @enforce_ordering, to avoid spinlocks and to stop hitting the current retry limits.

Discussion here: #141

Have a look and let me know if this is what you were generally thinking. This seems to be working well for me in my use-case in development thus far.

For the wait channel name, I'm currently using "wait.%s" % reply_channel.name

I had to force the channel session to save before the method finishes, in order to enforce the strict ordering we discussed: the next message order is marked as available for running before the wait-channel messages are re-queued.

while True:
    wait_channel = "wait.%s" % message.reply_channel.name
    channel, content = message.channel_layer.receive_many([wait_channel], block=False)
    if channel and content:
andrewgodwin (Member):

You only need "if channel" here (if channel is None, content always will be).

sachinrekhi (Contributor, author):

Updated this.
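For reference, the loop under discussion can be sketched with the reviewer's simplification applied. InMemoryChannelLayer and drain_wait_channel below are illustrative stand-ins so the snippet runs without a channel layer backend; they are not the real channels API.

```python
from collections import deque

# Toy in-memory stand-in for the channel layer: one FIFO queue per channel.
class InMemoryChannelLayer:
    def __init__(self):
        self.queues = {}

    def send(self, channel, content):
        self.queues.setdefault(channel, deque()).append(content)

    def receive_many(self, channels, block=False):
        for name in channels:
            queue = self.queues.get(name)
            if queue:
                return name, queue.popleft()
        return None, None

def drain_wait_channel(layer, reply_channel_name):
    # Requeue every parked message back onto the main channel. Per the review,
    # checking `channel` alone is enough: content is None whenever channel is.
    wait_channel = "wait.%s" % reply_channel_name
    moved = 0
    while True:
        channel, content = layer.receive_many([wait_channel], block=False)
        if channel is None:
            break
        layer.send(reply_channel_name, content)
        moved += 1
    return moved
```

Messages come back on the main channel in the order they were parked, since each per-channel queue is FIFO.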

@andrewgodwin (Member)

Yeah, this seems just like what I was thinking. A couple of things:

@sachinrekhi (Contributor, author)

  • Renamed the wait channel to __wait__
  • You are correct that in your scenario the decorator would just enqueue #4 on the wait channel and not try to run #3. But when the currently running task #2 completes, it will dequeue #3 and #4 and put them back on the main channel, allowing a subsequent worker to pull them off. What you are suggesting, having the worker directly run available items from the wait queue, may be a potential optimization, but I was trying to stick to the rule that the decorator either executes the consumer for the specific message it was given or puts it on the wait channel; it never runs a consumer for a message it was not asked to run. Though you do end up waiting for a subsequent worker to pick up the requeued message, this simplifies the implementation, because you only have easy access to func (the consumer code) for the task you've been asked to run, not for anything on the wait queue (which could require a different consumer func). Otherwise I think I'd need to get the router logic in here somehow?
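The run-or-park rule described above can be sketched as follows. The TinyLayer, session dict, and message shapes are simplified stand-ins, not the real decorator; the one behaviour the sketch is careful about is updating the expected order before requeueing, matching the save-ordering discussed earlier.

```python
from collections import deque

# Minimal stand-in channel layer (FIFO queue per channel name).
class TinyLayer:
    def __init__(self):
        self.queues = {}

    def send(self, channel, content):
        self.queues.setdefault(channel, deque()).append(content)

    def receive_many(self, channels, block=False):
        for name in channels:
            q = self.queues.get(name)
            if q:
                return name, q.popleft()
        return None, None

def enforce_ordering_sketch(consumer, message, layer, session):
    wait_channel = "__wait__.%s" % message["reply_channel"]
    order = message["order"]
    if order != session.get("next_order", 0):
        # Not this message's turn: park it. The decorator never runs a
        # consumer for a message it was not asked to run.
        layer.send(wait_channel, message)
        return False
    consumer(message)
    # Mark the next order as runnable *before* requeueing, so a parked
    # message picked up by another worker sees the updated session.
    session["next_order"] = order + 1
    while True:
        channel, content = layer.receive_many([wait_channel], block=False)
        if channel is None:
            break
        # Requeue onto the main channel for a subsequent worker to pull off.
        layer.send(message["reply_channel"], content)
    return True
```

For example, if message order 1 arrives before order 0, it is parked; once order 0 runs, the parked message is moved back onto the main channel rather than executed in place.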

@andrewgodwin (Member)

Yes, the only situation I can come up with where #3 ends up on the wait channel while #2 is not running is if #2 has died, at which point you arguably don't want any other packets processing anyway. You'll end up with a zombie socket that has no way of telling that it's queueing everything into a void, but at least its messages will get cleaned up after the expiry delay.

Also, given the new backpressure stuff I added this week, you'll need to handle the possibility that send() might raise the channel_layer.ChannelFull exception, which means (unsurprisingly) that the channel is full. I suggest we just re-raise it as a more explicit error, and maybe provide some way for people to hook it into closing the WebSocket, without hardcoding WebSocket channel names into the decorator.

@sachinrekhi (Contributor, author)

I added handling for the channel_layer.ChannelFull exception by re-raising the same exception with a more specific message about what in fact happened.

I attempted to address your suggestion on enabling closing the WebSocket by adding an optional close_on_error parameter to the decorator, defaulting to True, which sends a "close": True message to the reply channel whenever the ChannelFull exception is raised.
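A hedged sketch of that handling, as it stood at this point in the PR (the close_on_error behaviour was later pulled because it is WebSocket-specific). ChannelFull here is a stand-in for channel_layer.ChannelFull so the snippet runs without channels installed, and send_or_close is an illustrative helper, not the real decorator code.

```python
# Stand-in for channel_layer.ChannelFull.
class ChannelFull(Exception):
    pass

def send_or_close(layer_send, channel, content, reply_channel=None,
                  close_on_error=True):
    try:
        layer_send(channel, content)
    except ChannelFull:
        if close_on_error and reply_channel is not None:
            try:
                # Ask the WebSocket to close before surfacing the error.
                layer_send(reply_channel, {"close": True})
            except ChannelFull:
                pass  # reply channel is full too; nothing more we can do
        # Re-raise the same exception type with a more explicit message.
        raise ChannelFull(
            "Cannot add message to channel %r: the channel layer is at "
            "capacity" % channel
        )
```

The caller still sees a ChannelFull, so existing except clauses keep working; only the message gets more explicit.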

I'm not sure why CI is failing on these changes. The Travis CI error report seems to suggest channel_layer is None in message.channel_layer, but I'm not sure how that could be.

@andrewgodwin (Member)

Ah, some of the unit tests set channel_layer to None if they're just testing message parsing (it's one of the arguments to the Message constructor). You likely just need to patch the offending test to take channels.channel_layers[DEFAULT_CHANNEL_LAYER] instead, and make sure it uses ChannelTestCase so the channel layers are swapped out for temporary ones.
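The pattern being described can be illustrated with plain unittest. FakeLayer, Message, and OrderingTests below are stand-ins for channels' real ChannelTestCase and channel_layers machinery; the one real detail mirrored is that the channel layer is a constructor argument, so a test that passes None gets message.channel_layer == None.

```python
import unittest

# Stand-in for a temporary test channel layer.
class FakeLayer:
    def send(self, channel, content):
        pass

class Message:
    # Mirrors the relevant detail of channels' Message: the channel layer is
    # passed in explicitly, so tests that pass None see None back.
    def __init__(self, content, channel_name, channel_layer):
        self.content = content
        self.channel_name = channel_name
        self.channel_layer = channel_layer

class OrderingTests(unittest.TestCase):
    def setUp(self):
        # Analogue of ChannelTestCase swapping in a temporary layer per test.
        self.layer = FakeLayer()

    def test_message_has_a_layer(self):
        msg = Message({"order": 0}, "websocket.receive", self.layer)
        # The CI failure came from this attribute being None.
        self.assertIsNotNone(msg.channel_layer)
```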

@sachinrekhi (Contributor, author) commented May 10, 2016

Got it. I updated all the @enforce_ordering unit tests to also make more relevant assertions against the wait channel, since they no longer raise ConsumeLater.

We should be all set now.

@andrewgodwin (Member)

Urgh, sorry to be nitpicky again, but your close_on_error implementation is currently specific to the WebSocket endpoint; other protocols are likely to turn up with an 'order' field but without the same close mechanism. Can we pull it out for now, and I'll merge it like that?

@sachinrekhi (Contributor, author)

Oh, good point; I've only been using WebSockets myself. I pulled it.

@andrewgodwin merged commit 363b5a0 into django:master on May 12, 2016
@andrewgodwin (Member)

Merged! Thank you so much for your work on this.

Krukov pushed a commit to Krukov/channels that referenced this pull request May 18, 2016
…ks (django#144)

* improved @enforce_ordering to leverage a wait channel to avoid spinlocks

* addressed pyflake issues

* renamed wait channel to __wait__.<reply channel>

* handled potential ChannelFull exception

* updated sessions unit tests

* updated enforce_ordering tests to reflect new approach of leveraging wait channels

* addressed pyflake issues

* more pyflake fixes

* removed close_on_error handling on enforce_ordering since only worked on websockets