Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

port scan inactivity timeout #2060

Merged

Conversation

matthewrmshin
Copy link
Contributor

@matthewrmshin matthewrmshin commented Nov 4, 2016

Manage custom process pool instead of using mutliprocess.Pool.
This allows the logic to terminate processes with hanging connections - about 10 seconds of inactivity.
Report each host:port that causes a process to hang or has timed out connection.

Also:

  • Default connection timeout reduced to 5 seconds - a long time for modern intra-network traffic.
  • Child processes are used as workers.
    Each worker runs a loop to receive messages from main process to scan the next host:port.
    Previously, each process pool job scans all ports (in range) of a host.
  • Single UUID for all jobs.
  • Remove cylc 6.5 compatibility logic.
    Cylc 7 port scan does not work with cylc 6 suites any way.
  • Improve returned data structure.

@matthewrmshin matthewrmshin added this to the next release milestone Nov 4, 2016
@matthewrmshin matthewrmshin self-assigned this Nov 4, 2016
@matthewrmshin matthewrmshin force-pushed the port-scan-inactivity-timeout branch from 6fa5c83 to 0e25102 Compare November 4, 2016 14:12
@matthewrmshin
Copy link
Contributor Author

@arjclark @oliver-sanders please review.

@hjoliver
Copy link
Member

An off-piste review: can you print out the port number of connections that hang and timeout? It can be useful information (someone suspended a non-detached suite with Ctrl-Z ...)

@arjclark
Copy link
Contributor

Review 1: seems fine to me and not too horrendous in terms of waiting time now when tested. @hjoliver's suggestion about printing hung ports sounds like it'd be useful

@matthewrmshin
Copy link
Contributor Author

can you print out the port number of connections that hang and timeout?

I thought I have done this. However, on a revisit, the logic does not work for the SIGSTOP (Ctrl-Z) case. This is because connection timeout is actually caught in the new code (e.g. requests.exceptions.ReadTimeout), which then get repackaged into a cylc.network.ConnectionError, which is normally ignored. I'll add a new handler for this. Hopefully, we'll then be able to report both connection timeout and hanged connection. (I'll add a test for the former, but I am yet unable to mock the latter.)

@matthewrmshin matthewrmshin force-pushed the port-scan-inactivity-timeout branch 4 times, most recently from 1edd042 to 8f8b418 Compare November 10, 2016 14:38
@matthewrmshin
Copy link
Contributor Author

(Trying to get Travis CI to work with the new test. Please hold.)

@matthewrmshin matthewrmshin force-pushed the port-scan-inactivity-timeout branch 9 times, most recently from f070285 to 2ac5e09 Compare November 10, 2016 21:25
Manage custom process pool instead of using `multiprocessing.Pool`.
This allows the logic to terminate processes with hanging connecitons -
about 10 seconds of inactivity.
Report each `host:port` that times out or causes a process to hang.

Also:
* Child processes are used as workers.
  Each process runs a loop to receive messages from the main process to scan the
  next `host:port`.
  Previously, each process pool job scans all ports (in range) of a host.
* Single UUID for all jobs.
* Remove cylc 6.5 compatibility logic.
  Cylc 7 port scan does not work with cylc 6 suites any way.
* Improve returned data structure.
@matthewrmshin matthewrmshin force-pushed the port-scan-inactivity-timeout branch from 2ac5e09 to bb2ff88 Compare November 11, 2016 09:14
@matthewrmshin
Copy link
Contributor Author

@arjclark @oliver-sanders Travis CI now back in green. Please continue review.

Copy link
Member

@oliver-sanders oliver-sanders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK. Tested as solving the issue originally reported.

@oliver-sanders
Copy link
Member

@arjclark If you're happy with the alterations I'll merge.

@arjclark
Copy link
Contributor

@oliver-sanders - yup all good with me.

@oliver-sanders oliver-sanders merged commit e5882ce into cylc:master Nov 11, 2016
@matthewrmshin matthewrmshin deleted the port-scan-inactivity-timeout branch November 11, 2016 11:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants