-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zmq: support single port configuration #3077
Conversation
@matthewrmshin small change and a test added since your review. |
Unit tests got stuck in Travis. Checked out the branch locally and the same happened:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's a hanging thread/socket somewhere. I wasn't going to use any linux tool to check files, but @hjoliver taught me (thanks!) about the shortcuts in htop
, so quickly inspected pytest
process, and found that it hung while waiting a file descriptor type stream (I believe these are used by PyZMQ under the hood). 💥
(fd=14
is the file descriptor it's using with poll
)
So suspected PyZMQ was not closing its sockets properly. And after reading some of its docs (opening the classes via reference in the IDE), tried the following that worked:
diff --git a/lib/cylc/network/server.py b/lib/cylc/network/server.py
index 44e860979..26e6506c5 100644
--- a/lib/cylc/network/server.py
+++ b/lib/cylc/network/server.py
@@ -105,6 +105,9 @@ class ZMQServer(object):
self.port = self.socket.bind_to_random_port(
'tcp://*', min_port, max_port)
except zmq.error.ZMQError as exc:
+ if self.socket:
+ self.socket.close()
+ del self.context
raise CylcError('could not start Cylc ZMQ server: %s' % str(exc))
# start accepting requests
@@ -120,6 +123,8 @@ class ZMQServer(object):
LOG.debug('stopping zmq server...')
self.queue.put('STOP')
self.thread.join() # wait for the listener to return
+ self.socket.close()
+ del self.context
LOG.debug('...stopped')
def register_endpoints(self):
diff --git a/lib/cylc/tests/test_zmq.py b/lib/cylc/tests/test_zmq.py
index b609365e3..4727274a6 100644
--- a/lib/cylc/tests/test_zmq.py
+++ b/lib/cylc/tests/test_zmq.py
@@ -33,6 +33,6 @@ def test_single_port():
with pytest.raises(CylcError) as exc:
serv2.start(port, port)
- assert 'Address already in use' in str(exc)
+ assert 'Address already in use' in str(exc)
serv1.stop()
Hope that helps
@@ -278,7 +286,7 @@ def __init__(self, schd): | |||
self, | |||
encrypt, | |||
decrypt, | |||
lambda: get_secret(schd.suite) | |||
partial(get_secret, schd.suite) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that for performance? I think lambdas are a bit easier to follow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TLDR; lambdas are considered by some (incl guido) to have been a mistake, one of "Python's glitches"!
There are some benefits of partial
including:
- There is apparently a slight improvement in performance.
functools.partial
is also a little more Pythonic than the a with Python as it understands that it is an adapter and knows what it is passing arguments through to which presumably means it can work nicely with IDEs etc.
But really it's about lambas falling from favour rather than partial delivering superiour functionality, nowerdays Python has turned against lambdas:
- They were never properly implemented (e.g. can only be defined on one line)
- Named lambdas break
pycodestyle
(the argument being you should just use a function instead) - Late binding issues cause endless confusion (for example).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting! I've never used functools.partial
🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha! But in this case lambdas and partials deliver pretty much the same feature, right?
In Java/Scala lambdas and closures are still being used, and the equivalent for partials are currying in Scala (supported out of the box by the language by simply not providing all parameters) and in Java by (as verbose as always) creating some objects instantiating classes from java.util.function
.
I think both are still used, in some cases developers prefer lambdas for clarity/simplicity, while others prefer currying to be closer for FP. And there are cases when one or the other can be misused causing unwanted bugs, performance regressions, etc. But I think none are discouraged yet.
If I understood correctly, in this case, we have something similar to this code:
from functools import partial
class Scheduler(object):
def __init__(self):
self.suite = "sugar"
self.testa = lambda: self.echovalue(self.suite)
self.testb = partial(self.echovalue, self.suite)
def echovalue(self, value):
print(value)
s = Scheduler()
s.testa() # sugar
s.testb() # sugar
s.suite = "salt"
s.testa() # salt
s.testb() # sugar, because partial is similar to a... FP applicative functor? keeping the value inside it.. I think...
So I think it means that if we ever touch sched.suite
and change its value, then the partial object we passed would be outdated, and we would actually have to create a new one. I think?
So much to learn in Python 😟 and quite late here, so sorry if wrote anything silly ☕
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So much to learn in Python worried and quite late here, so sorry if wrote anything silly coffee
Nothing silly whatsoever.
Gotcha! But in this case lambdas and partials deliver pretty much the same feature, right?
Exactly!
In Java/Scala lambdas and closures are still being used
Lambdas can be nice, but as a concept they have never really been properly integrated into Python. Their scope is limited, only one expression and no line breaks permitted.
But I think none are discouraged yet.
Lambdas are discouraged in Python to some extent if not entirely:
- List comprehensions and generator expressions are preferred to lambdas where possible removing most lambda use cases.
- Named lambdas are actively discouraged.
- The reason that the
map
function is discouraged is actually to do with thelambda
use case e.g.map([1, 2, 3], lambda x: x**2)
.
So I think it means that if we ever touch sched.suite and change its value, then the partial object we passed would be outdated
Yep, to me that's the correct behaviour!
Late closures can be useful but I think they are likely to catch you out more often than they are to come in handy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, to me that's the correct behaviour!
Well, I'm -0 here. Fine if others (cc @cylc/core) prefer the partial
. For me if I need to change the .suite
property of Scheduler
, I wouldn't want to have to remember that I also need to update the encode and the decode method.
Late closures can be useful but I think they are likely to catch you out more often than they are to come in handy.
Agreed, but in this case I think I would stick with the lambda to prevent a possible bug. -0 and -1 because we don't change the value of .suite
now. As far as I can tell, renaming a suite causes that suite to be re-registered, creating a new instance of Scheduler
. But if that changed later to actually reuse the Scheduler
(for reasons), then we would have a bug where the suite was not able to authenticate I think.
lib/cylc/network/server.py
Outdated
else: | ||
self.port = self.socket.bind_to_random_port( | ||
'tcp://*', min_port, max_port) | ||
except zmq.error.ZMQError as exc: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to catch ZMQBindError
... ZMQError
- strangely - is not the parent class of all errors in PyZMQ... so ZMQBindError
extends ZMQBaseError
... ZMQError
also extends ZMQBaseError
.
bind_to_random_port
may raise a ZMQBindError
I think... my IDE shows the documentation of the API used, and luckily PyZMQ documents return/args/raises/etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ZMQError
- strangely - is not the parent class of all errors in PyZMQ
Strange, good spot!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, just tried this, turns out that ZMQ raises a ZMQError
rather than the more specific ZMQBindError
documented!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the pyzmq code:
for i in range(max_tries):
try:
port = random.randrange(min_port, max_port)
self.bind('%s:%s' % (addr, port))
except ZMQError as exception:
en = exception.errno
if en == zmq.EADDRINUSE:
continue
elif sys.platform == 'win32' and en == errno.EACCES:
continue
else:
raise
else:
return port
raise ZMQBindError("Could not bind socket to random port.")
So it raises ZMQError
if it fails to bind to a specific port and ZMQBindError
if it fails to find a socket to bind to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will catch both...
e4937ba
to
88191e0
Compare
Thanks Bruno for your excellent spotting! Tests pass without hanging for me, hows it at your end? |
88191e0
to
10a27a7
Compare
Code looks good, Travis seems to be going to sleep (must be in NZ time), so just kicked its unit test stage. It failed with:
Will take another look in the morning and hopefully merge it! 🎉 |
TB has timed out again. Strange, I don't get that locally, do you?
+ del self.context
|
10a27a7
to
1580174
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to update one use of partial
as per comment to get Travis build passing.
1580174
to
e717256
Compare
Poke. |
Sorry for missing this one! Merging now! 🎉 🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine to me (I went thorough it and tested it about a week ago, but got somewhat distracted by testing the ZMQ random port grabbing functionality - for a multi-port range, as opposed to the point of this PR - which didn't seem to work as I recall ... I always go a failure on trying to start a 2nd suite ... need to revisit that though). Anyway, long story short ... I'll approve this, but do we have agreement on the partial vs lambda discussion?
Haha, @kinow bet me to it! 👍 |
Sorry for merging it before you had time to comment!
Not a consensus I think. No strong opinion on enforcing lambdas, was more curious why one was used replacing the other. Both would work here I think, and some libraries (e.g. graphql) have code & examples with lambdas (sure we can find some with partial too, and the ones with lambdas could possibly be re-written with partial). Possibly like |
closes #3075
Also adds a nice error message in the event that ZMQ fails to bind to the specified port(s).