Refactor suite runtime server API logic #2373

Merged
merged 7 commits into from
Sep 7, 2017

matthewrmshin
Contributor

@matthewrmshin matthewrmshin commented Jul 31, 2017

Highlights:

  • Combine client classes into a single service client class.
  • Combine server classes into a single service class.
  • Flatten API method serve points. E.g.:
    • https://host:port/command/hold_suite => https://host:port/hold_suite
    • https://host:port/broadcast/get => https://host:port/get_broadcast
    • It is now very easy to remount the service under a different hierarchy of the web server.
  • Move non-network related logic away from the cylc.network package.
    • Various related singletons, including SuiteConfig, are now normal object instances.
    • Broadcast management + external trigger management in new cylc.broadcast_mgr module.
    • cylc.flags.pflag is now an attribute of the TaskEventsMgr.
    • State summary management in new cylc.state_summary_mgr module.
  • Remove cylc.network.https package.
  • Single method call from GUI on idle suite.
  • Removed 30+ source files and 1000+ source lines.
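
The endpoint flattening listed above can be pictured as a lookup from the old two-level paths to the new single-level ones. This is only an illustrative sketch: the table holds just the two examples quoted in the description, and `OLD_TO_NEW`, `flatten_path` and the `mount_point` argument are hypothetical names, not part of cylc.

```python
# Hypothetical mapping built only from the two examples in the PR description.
OLD_TO_NEW = {
    "/command/hold_suite": "/hold_suite",
    "/broadcast/get": "/get_broadcast",
}


def flatten_path(old_path, mount_point=""):
    """Return the flat path for an old-style path.

    mount_point is an optional prefix, to illustrate serving the same flat
    API under a different part of a web server's hierarchy.
    """
    new_path = OLD_TO_NEW[old_path]
    return mount_point.rstrip("/") + new_path
```

With a flat API, moving the whole service elsewhere only changes the prefix, e.g. `flatten_path("/command/hold_suite", "/suites/my_suite")` gives `/suites/my_suite/hold_suite`.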

@matthewrmshin matthewrmshin added this to the soon milestone Jul 31, 2017
@matthewrmshin matthewrmshin self-assigned this Jul 31, 2017
@matthewrmshin matthewrmshin force-pushed the refactor-network-api branch 4 times, most recently from 8efedbb to 80e24d4 Compare August 1, 2017 15:18
@hjoliver
Member

hjoliver commented Aug 1, 2017

@matthewrmshin - from your description, this looks great. Could you just clarify a few things though:

  • what's the rationale for consolidating the client and server classes (into one of each)?
  • ditto for flattening the API; and what do you mean by "remount the service under a different hierarchy of the web service"?
  • remove https package (the intention of a separate package here was to allow, potentially, supporting other comms layers in future)

(Just want to make sure I understand all of this!)

@matthewrmshin
Contributor Author

matthewrmshin commented Aug 1, 2017

The main rationale is to reduce complexity. This is achieved here by combining the API into a simple set of methods under a single façade, all visible under the same roof.

A side effect of this pattern is that it potentially allows us to turn the relationship between the web server and the suite upside down. E.g. it will be possible to start up a web server on its own and then mount multiple suites to it!

I cannot really see any case of alternate communication methods to HTTPS at the moment. The current approach has been a bit awkward, with cherrypy logic leaked back into the general network package. (The original approach made sense when we were doing remote object calls in Pyro, but it has become awkward now that we are supposed to have a RESTful interface.) This new approach should be very easy to change to a different implementation should there be a case to do so. (But now that you have said it, I think I'll rename the modules to make it clearer that they are the HTTPS client, the HTTPS server and the service façade based on cherrypy.)

@matthewrmshin
Contributor Author

Profiling by running the complex suite shows minor improvement to Elapsed/CPU time against master but very little change to memory:

1st run:

Version               Run            Elapsed Time (s)  CPU Time - Total (s)  Max Memory (kb)
7.4.0                 complex suite  1285.7            479.8                 84560.0        
master                complex suite  1280.9            469.3                 72040.0        
refactor-network-api  complex suite  1189.0            450.5                 71948.0        

2nd run:

Version               Run            Elapsed Time (s)  CPU Time - Total (s)  Max Memory (kb)
7.4.0                 complex suite  1294.4            485.6                 92948.0        
master                complex suite  1293.0            470.3                 71816.0        
refactor-network-api  complex suite  1209.2            454.6                 71960.0        

where:

  • master = 7.4.0-243-g337f15fa9
  • refactor-network-api = 7.4.0-249-gca7160cbe

@hjoliver
Member

hjoliver commented Aug 4, 2017

Re profiling results - is that pretty much what you expected?

@matthewrmshin
Contributor Author

I expected a mainly neutral result. The small performance gain is a bonus.

@hjoliver
Member

hjoliver commented Aug 4, 2017

excellent :-)

@trwhitcomb
Collaborator

@matthewrmshin you remark "I cannot really see any case of alternate communication methods to HTTPS at the moment." I have a few things that may require an alternate communication method, but haven't started yet on implementing them. Only reason to bring it up here is to serve as a reminder that there are other options that others are thinking about (and has nothing to do with this PR).

@matthewrmshin
Contributor Author

To reassure, this change should make it easier, not harder, to add other communication protocols. Logic unrelated to the network has been moved to where it belongs. The API is now in one place and simplified.

There is a lot more to do to really complete the refactor, but I do feel that this change is taking us in the right direction for the future.

@trwhitcomb
Collaborator

@matthewrmshin thanks for your feedback - lots of this looked simplified, but upon further review what set off my alarm bells turned out to be me misreading something! This looks very helpful toward what I'm thinking about.

@matthewrmshin matthewrmshin force-pushed the refactor-network-api branch 2 times, most recently from 35b8877 to 124a51e Compare August 18, 2017 07:27
@matthewrmshin
Contributor Author

Conflict resolved. Branch re-based.

@oliver-sanders
Member

Adding my profiling results:

Version                    Run            Elapsed Time (s)  CPU Time - Total (s)  Max Memory (kb)
matt/refactor-network-api  complex suite  1036.5            294.1                 71424.0
95fded2                    complex suite  1056.0            307.2                 72004.0

@oliver-sanders oliver-sanders added the efficiency For notable efficiency improvements label Aug 29, 2017
Member

@hjoliver hjoliver left a comment

That certainly tidied up a bunch of stuff. Nice work. I didn't find any real problems.

)
'WARNING, %s:'
' not explicitly defined in dependency graphs (deprecated)'
) % name
Member

I think this code block can be removed? - implicit-cycling is now obsolete, not just deprecated. (I'm slightly confused though, as I can't trigger this block with implicit cycling even back at d2c77 where it was first introduced 2 years ago).

return (list(cancel_keys) in
[prune[2:] for prune in prunes if prune[2:]])
def _settings_to_keys_list(broadcasts):
"""Return a list containing each setting dict keys as a list.
Member

(don't need "as a list" on the end of that sentence?)

try:
    self.pool.join()
except AssertionError:
    pass
Member

Why do we need to catch AssertionError here?

# is set to tell process_tasks() that task processing is required.
broadcast_mgr = self.task_events_mgr.broadcast_mgr
broadcast_mgr.add_ext_triggers(
    self.ext_trigger_queue)
Member

(no need for a line break here)

@matthewrmshin matthewrmshin modified the milestones: soon, next release Sep 5, 2017
@matthewrmshin
Contributor Author

Branch rebased. Comments addressed.

Member

@oliver-sanders oliver-sanders left a comment

All looks good to me, some comments:


sent = False
i_try = 0
while not sent and i_try < max_n_tries:
Member

Could this be:

for i_try in range(0, max_n_tries):
    try:
        ...
    except _:
        ...
    else:
        ...
        break
else:
    sys.exit('ERROR: send failed')

Contributor Author

Done with similar logic.
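
For reference, the for/else retry pattern suggested in this thread can be written out as a small runnable sketch. It is not the code that was merged into `cylc ext-trigger`: `send_with_retries`, the `send` callable and `retry_intvl_secs` are hypothetical names, it is written for Python 3, and it raises an exception where the review snippet calls `sys.exit`.

```python
import time


def send_with_retries(send, max_n_tries, retry_intvl_secs=0.0):
    """Call send() until it succeeds, at most max_n_tries times.

    Return the number of the try that succeeded (counting from 1).
    Raise RuntimeError if every try fails.
    """
    for i_try in range(max_n_tries):
        try:
            send()
        except Exception as exc:  # a real client would catch a narrower error
            print("send failed on try %d of %d: %s" % (
                i_try + 1, max_n_tries, exc))
            if i_try + 1 < max_n_tries:
                time.sleep(retry_intvl_secs)
        else:
            # Success: break out, which skips the loop's else clause.
            break
    else:
        # Reached only if the loop finished without hitting break.
        raise RuntimeError("ERROR: send failed")
    return i_try + 1
```

The loop's `else` clause replaces the `sent` flag and the manual `i_try` counter: it runs only when no try ended in `break`.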

self._report()
self._check_access_priv('full-read')
if isinstance(pruned, basestring):
    pruned = ast.literal_eval(pruned)
Member

Should these ast.literal_eval calls be wrapped in try/except statements?

Contributor Author

Now wrapped by a static method - raise 400 with a useful error message on exception.
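
A minimal sketch of the kind of wrapper described in this reply, assuming a hypothetical `BadRequest` exception in place of the web framework's own HTTP 400 error. The name `literal_eval_or_400` and the message format are illustrative, not taken from the merged code, and the sketch targets Python 3 (`str` rather than `basestring`).

```python
import ast


class BadRequest(Exception):
    """Hypothetical stand-in for the framework's HTTP 400 error."""

    status = 400


def literal_eval_or_400(name, value):
    """Evaluate a string argument as a Python literal.

    Values that are not strings are assumed to be decoded already and are
    returned unchanged. A string that is not a valid literal raises
    BadRequest with a message naming the offending argument.
    """
    if not isinstance(value, str):
        return value
    try:
        return ast.literal_eval(value)
    except (SyntaxError, ValueError, TypeError) as exc:
        raise BadRequest(
            "Bad argument value: %s=%r (%s)" % (name, value, exc))
```

Keeping the try/except in one helper means each API method can convert its arguments in a single line, and a malformed request produces a client error rather than a server traceback.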

self.clients[uuid] = time()
self._housekeep()

def _report_id_requests(self):
Member

Nice!

except (IOError, OSError):
    size = 0
if size == prev_size:
    return [], prev_size
Member

return '', prev_size?

Contributor Author

Done.


if METHOD in ["https", "http"]:
from cylc.network.https.daemon import CommsDaemon
def unicode_encode(data):
Member

This doesn't encode non-unicode strings, purposeful?

Contributor Author

This function comes from https://github.com/cylc/cylc/blob/master/lib/cylc/network/https/util.py#L65 and I have not made any logical change to it, so not sure.

Member

This line confuses me: {unicode_encode(key): unicode_encode(value)}.

Why would we try to unicode_encode the key if unicode_encode doesn't encode strings?

Member

Perhaps the purpose of this logic is to ensure any unicode conforms to UTF-8 rather than to encode strings to unicode. Perhaps a name change could alleviate this confusion.

Contributor

@dvalters dvalters Sep 7, 2017

Yes - a unicode data type is not encoded. u'my string'.encode() converts the unicode string into bytes. So if it's already a str (which is already bytes) it is skipped, otherwise if it's a unicode string it gets encoded to UTF-8.
https://docs.python.org/2/howto/unicode.html#python-2-x-s-unicode-support
https://stackoverflow.com/questions/10288016/usage-of-unicode-and-encode-functions-in-python

Contributor Author

Will rename this utf8_enforce.
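
The behaviour discussed in this thread can be sketched as below. The body of the real function is not shown in this conversation, so this is a reconstruction under stated assumptions: it is a Python 3 analogue (text `str` is encoded, `bytes` passes through, mirroring Python 2's `unicode` and `str`), the dict handling follows the `{unicode_encode(key): unicode_encode(value)}` line quoted above, and the list handling is an assumption.

```python
def utf8_enforce(data):
    """Return a copy of data with every text string encoded to UTF-8 bytes.

    Text strings are encoded; byte strings and non-string values are
    returned unchanged. Dicts and lists are processed recursively.
    """
    if isinstance(data, str):
        return data.encode("utf-8")
    if isinstance(data, dict):
        return {
            utf8_enforce(key): utf8_enforce(value)
            for key, value in data.items()}
    if isinstance(data, list):
        return [utf8_enforce(item) for item in data]
    return data
```

Seen this way, the new name fits: the function does not turn strings into unicode, it makes sure any text present ends up as UTF-8 bytes.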

@cherrypy.expose
@cherrypy.tools.json_in()
@cherrypy.tools.json_out()
def clear_broadcast(
Member

Some thoughts on these methods:

  • The privilege levels (e.g. 'full-control') could be constants.
  • The self._check_access_priv(<privlevel>) method could be a decorator (e.g. @privilege_full_control)
  • A self._report() call is present in every method? Could this be factored out?

Contributor Author

Added constants.

self._check_access_priv(<privlevel>) and self._report() are now combined into a single method call. Not done via a decorator, but it is now a single line per method.
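
To compare the two options discussed in this thread, here is a toy sketch showing both the single combined method call and the decorator form the reviewer suggested. Everything in it is hypothetical: the class, the method and constant names, the `requires` decorator, and the assumption that 'full-control' ranks above 'full-read'. Only those two level names appear in this review.

```python
import functools

# Hypothetical constants; the ordering (lowest first) is an assumption.
PRIV_FULL_READ = "full-read"
PRIV_FULL_CONTROL = "full-control"
PRIVILEGE_LEVELS = [PRIV_FULL_READ, PRIV_FULL_CONTROL]


class AccessDenied(Exception):
    """Hypothetical error raised when the client lacks the privilege."""


def requires(priv):
    """Decorator form: run the combined check before the wrapped method."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            self._check_access_priv_and_report(priv, method.__name__)
            return method(self, *args, **kwargs)
        return wrapper
    return decorator


class Service(object):
    """Toy service facade; not the cylc implementation."""

    def __init__(self, client_priv):
        self.client_priv = client_priv
        self.reports = []

    def _check_access_priv_and_report(self, required_priv, command):
        """Check the client's privilege, then record the request."""
        have = PRIVILEGE_LEVELS.index(self.client_priv)
        need = PRIVILEGE_LEVELS.index(required_priv)
        if have < need:
            raise AccessDenied("%s requires %s" % (command, required_priv))
        self.reports.append(command)

    def hold_suite(self):
        # Single-line form, as described in the reply above.
        self._check_access_priv_and_report(PRIV_FULL_CONTROL, "hold_suite")
        return True

    @requires(PRIV_FULL_READ)
    def get_broadcast(self):
        # Decorator form, as suggested by the reviewer.
        return {}
```

Both forms keep the check to one line per API method; the explicit call keeps the method signature untouched when it is stacked with other decorators, while the decorator keeps the method body free of access logic.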

Combine multiple client classes into a single one.
Single HTTPS call from GUI on idle suite.

Combine all server classes into a single service facade class.
Move non-network related logic from the `cylc.network` package.
Remove `cylc.network.https` package.
Flatten API method serve points.
Remove singleton suite config logic.
Remove deprecated test in validate.
Improve doc string in broadcast manager.
Improve comment for AssertionError on joining multiprocessing pool.
Remove unnecessary new line in scheduler.
Connect scheduler to server after config is loaded to prevent rare
traceback from scan.
Improve retry message send loop in `cylc ext-trigger`.
Constants for access privilege levels.
Wrap `ast.literal_eval` in httpserver.
* return 400 with useful error message on failure.
Check access privilege and report are now done by single method call.
Fix bad return in `cylc.suite_logging`.
Rename `unicode_encode` to `utf8_enforce`.
@oliver-sanders oliver-sanders merged commit 6b0723d into cylc:master Sep 7, 2017
@matthewrmshin matthewrmshin deleted the refactor-network-api branch September 7, 2017 18:22
@hjoliver
Member

hjoliver commented Sep 7, 2017

@matthewrmshin - did you consider backward compatibility (for client-server interaction) with this change? (We don't have it, but maybe you considered it!)

@matthewrmshin
Contributor Author

I considered it. I have put in compatibility with scan for obvious reasons. The other methods are not compatible at the moment. I can put something in if you think it is desirable to do so.

@hjoliver
Member

hjoliver commented Sep 7, 2017

I'm reasonably happy to tell users to deal with it, but you have a lot more users - so I'll go with you on this.

"""Start quick web service."""
# cherrypy.config["tools.encode.on"] = True
# cherrypy.config["tools.encode.encoding"] = "utf-8"
cherrypy.config["server.socket_host"] = get_suite_host()
Contributor Author

Note: This was changed from '0.0.0.0' in lib/cylc/network/https/daemon.py because Codacy was complaining about possible binding to all interfaces.

Member

From Bandit documentation (the security analyzer run by Codacy):

Binding to all network interfaces can potentially open up a service to traffic on unintended interfaces, that may not be properly documented or secured. This plugin test looks for a string pattern “0.0.0.0” that may indicate a hardcoded binding to all network interfaces.

https://docs.openstack.org/bandit/latest/plugins/b104_hardcoded_bind_all_interfaces.html?highlight=binding%20all%20interfaces
