Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Serve] Refactor ray.serve for compatibility #4759

Closed
wants to merge 11 commits into from

Conversation

simon-mo
Copy link
Contributor

  • Use flask instead of starlette for http framework for now
  • Move example actors to a file instead of a module.
  • Add serve to travis ci

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-Perf-Integration-PRB/745/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-Perf-Integration-PRB/746/
Test PASSed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/14163/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-Perf-Integration-PRB/747/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-Perf-Integration-PRB/749/
Test PASSed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/14167/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/14170/
Test FAILed.

@simon-mo
Copy link
Contributor Author

@robertnishihara This PR is ready for review. (I can't request a reviewer somehow)

.travis.yml Outdated
@@ -173,6 +173,11 @@ script:
- if [ $RAY_CI_RLLIB_AFFECTED == "1" ]; then ./ci/suppress_output python python/ray/rllib/tests/test_optimizers.py; fi
- if [ $RAY_CI_RLLIB_AFFECTED == "1" ]; then ./ci/suppress_output python python/ray/rllib/tests/test_evaluators.py; fi

# ray serve tests
- if [ $RAY_CI_SERVE_AFFECTED == "1" ] && [ $RAY_CI_PY3 == "1" ]; then ./ci/suppress_output python python/ray/experimental/serve/tests/test_actors.py; fi
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't actually run the test, right? You need to do

Suggested change
- if [ $RAY_CI_SERVE_AFFECTED == "1" ] && [ $RAY_CI_PY3 == "1" ]; then ./ci/suppress_output python python/ray/experimental/serve/tests/test_actors.py; fi
- if [ $RAY_CI_SERVE_AFFECTED == "1" ] && [ $RAY_CI_PY3 == "1" ]; then ./ci/suppress_output python -m pytest -v --durations=5 --timeout=300 python/ray/experimental/serve/tests/test_actors.py; fi

same with the others.

@robertnishihara
Copy link
Collaborator

@simon-mo sorry for the delay, I'm still looking over it.

The reason for switching to flask is to support Python 3.5? Did you notice any performance changes there?

are used for testing as well as demoing purpose.
"""

from __future__ import absolute_import, division, print_function
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from __future__ import absolute_import, division, print_function
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

- if [ $RAY_CI_SERVE_AFFECTED == "1" ] && [ $RAY_CI_PY3 == "1" ]; then ./ci/suppress_output python python/ray/experimental/serve/tests/test_actors.py; fi
- if [ $RAY_CI_SERVE_AFFECTED == "1" ] && [ $RAY_CI_PY3 == "1" ]; then ./ci/suppress_output python python/ray/experimental/serve/tests/test_deadline_router.py; fi
- if [ $RAY_CI_SERVE_AFFECTED == "1" ] && [ $RAY_CI_PY3 == "1" ]; then ./ci/suppress_output python python/ray/experimental/serve/tests/test_default_app.py; fi

# ray tests
# Python3.5+ only. Otherwise we will get `SyntaxError` regardless of how we set the tester.
- if [ $RAY_CI_PYTHON_AFFECTED == "1" ]; then python -c 'import sys;exit(sys.version_info>=(3,5))' || python -m pytest -v --durations=5 --timeout=300 python/ray/experimental/test/async_test.py; fi
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the RAY_CI_PY3 variable here.

@@ -54,6 +69,10 @@ def list_changed_files(commit_range):
RAY_CI_RLLIB_AFFECTED = 1
RAY_CI_LINUX_WHEELS_AFFECTED = 1
RAY_CI_MACOS_WHEELS_AFFECTED = 1
elif changed_file.startswith("python/ray/experimental/serve"):
RAY_CI_SERVE_AFFECTED = 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to add RAY_CI_SERVE_AFFECTED = 1 to the python block, the src bloc, and both else blocks.

.travis.yml Outdated
# ray serve tests
- if [ $RAY_CI_SERVE_AFFECTED == "1" ] && [ $RAY_CI_PY3 == "1" ]; then ./ci/suppress_output python python/ray/experimental/serve/tests/test_actors.py; fi
- if [ $RAY_CI_SERVE_AFFECTED == "1" ] && [ $RAY_CI_PY3 == "1" ]; then ./ci/suppress_output python python/ray/experimental/serve/tests/test_deadline_router.py; fi
- if [ $RAY_CI_SERVE_AFFECTED == "1" ] && [ $RAY_CI_PY3 == "1" ]; then ./ci/suppress_output python python/ray/experimental/serve/tests/test_default_app.py; fi
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be really great to figure out why these tests are failing when we run them with one big pytest command. I'll look into it a little.

@robertnishihara
Copy link
Collaborator

robertnishihara commented May 17, 2019

@simon-mo, I just noticed that we're calling ray.experimental.get_actor a bunch of times in the router. This call isn't particularly lightweight, and so we should be sure to do it only once per actor.

EDIT: It's not that the actual call is slow or anything, but the raylet currently has to do some bookkeeping for each actor handle.

@robertnishihara
Copy link
Collaborator

After creating a router (and not doing any work), and letting it loop for 30 seconds, there are about 100K actor handles that the raylet is keeping track of and 100K local objects.

@pcmoritz
Copy link
Contributor

Should we use aiohttp instead of flask to avoid the blocking ray.get here? In that case we can use Ray's non-blocking async get.

@robertnishihara
Copy link
Collaborator

@pcmoritz are you talking about for the HTTP frontend? One benefit of flask is that it will work with Python 2 (which I realize we don't support yet here, but we could..).

@pcmoritz
Copy link
Contributor

pcmoritz commented May 17, 2019

Something like this:

from ray.experimental import async_api

async_api.init()

async def handle(request):
    object_id = # Call actor
    result = await async_api.as_future(object_id)
    # Return jsonified result

@simon-mo
Copy link
Contributor Author

Yes. Aiohttp has a good async support for python 3.5. I’ll start with that.

I realized flask is probably not idea because it’s a threaded server.

@pcmoritz
Copy link
Contributor

@robertnishihara Yes, we might want to consider only supporting python 3 until somebody asks for python 2 support (which will be EOL pretty soon).

@robertnishihara
Copy link
Collaborator

robertnishihara commented May 17, 2019 via email

@robertnishihara
Copy link
Collaborator

robertnishihara commented May 23, 2019

Ok, once #4838 is merged, we should be able to run all of the serve tests with a single pytest command.

EDIT: Ok, #4838 partially fixed the issue, but not completely.

EDIT: Ok, once #4844 is merged, the issue should be fixed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-Perf-Integration-PRB/937/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/14403/
Test FAILed.

@simon-mo
Copy link
Contributor Author

Closed by #5541

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants