[data] Unit-test physical execution of Datasets #37106

Closed
stephanie-wang opened this issue Jul 5, 2023 · 2 comments
Labels: data (Ray Data-related issues), enhancement (Request for new feature and/or capability), P1 (Issue that should be fixed within a few weeks), ray-2.9 (Issues targeting Ray 2.9 release, ~Q4 CY2023)

Comments

@stephanie-wang
Contributor

Description

We should unit-test Datasets' interaction with the Ray core layer by tracing the calls to f.remote() and checking that these match the expected behavior. This could help prevent performance regressions like the following:

Ideally all (performance-critical) Dataset tests should explicitly say what Ray core tasks they expect to produce.
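
As a rough illustration of what such a test could look like, here is a minimal sketch that runs without Ray. Everything in it is hypothetical: `FakeRemoteFunction` is a stand-in that only mimics the `.remote()` call shape, and `run_map_stage` is a toy "physical execution" that launches one task per block. It is not how Ray Data is implemented; it only shows the idea of recording each `f.remote()` call and asserting on the resulting task counts.

```python
from collections import Counter


class FakeRemoteFunction:
    """Hypothetical stand-in for a Ray remote function (not Ray's API).

    It records the name of the wrapped function on every .remote() call
    and then runs the function eagerly instead of scheduling a task.
    """

    def __init__(self, fn, call_log):
        self._fn = fn
        self._call_log = call_log

    def remote(self, *args, **kwargs):
        self._call_log.append(self._fn.__name__)
        return self._fn(*args, **kwargs)


def map_block(block):
    """Toy per-block transform."""
    return [x * 2 for x in block]


def run_map_stage(blocks, call_log):
    """Toy execution plan (assumption): one map_block task per input block."""
    remote_map = FakeRemoteFunction(map_block, call_log)
    return [remote_map.remote(block) for block in blocks]


# The test states the tasks it expects, not just the output it expects.
call_log = []
out = run_map_stage([[1, 2], [3, 4], [5]], call_log)
task_counts = Counter(call_log)

assert out == [[2, 4], [6, 8], [10]]
assert task_counts == {"map_block": 3}  # three blocks, so three tasks
```

With a check like the last assertion, a change that started launching extra tasks per block would fail the test even if the output stayed correct.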

Use case

No response

@stephanie-wang added the enhancement, triage, P1, data, and Ray-2.7 labels and removed the triage label on Jul 5, 2023
@raulchen
Contributor

raulchen commented Jul 7, 2023

The tracking code can be added to python/ray/data/_internal/remote_fn.py::cached_remote_fn
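
A sketch of what hooking the tracking in at a cached factory might look like, written without Ray so it runs standalone. The names here (`cached_remote_fn_sketch`, `TracedRemote`, `EagerRemote`, `TASK_TRACE`) are all hypothetical, and the real `cached_remote_fn` may differ in signature and behavior. The only assumption taken from the comment above is that there is a single function through which remote functions are created and cached, which makes it a convenient place to wrap `.remote()`.

```python
_REMOTE_FN_CACHE = {}  # hypothetical cache: plain function -> wrapped remote
TASK_TRACE = []        # hypothetical global trace that tests would inspect


class EagerRemote:
    """Stand-in for a remote function object; runs the function inline."""

    def __init__(self, fn):
        self._fn = fn

    def remote(self, *args, **kwargs):
        return self._fn(*args, **kwargs)


class TracedRemote:
    """Wraps a remote-like object and records every .remote() call."""

    def __init__(self, inner, name):
        self._inner = inner
        self._name = name

    def remote(self, *args, **kwargs):
        TASK_TRACE.append(self._name)
        return self._inner.remote(*args, **kwargs)


def cached_remote_fn_sketch(fn, make_remote=EagerRemote):
    """Return a cached, traced remote wrapper for fn (illustrative only)."""
    if fn not in _REMOTE_FN_CACHE:
        _REMOTE_FN_CACHE[fn] = TracedRemote(make_remote(fn), fn.__name__)
    return _REMOTE_FN_CACHE[fn]


def double_block(block):
    return [x * 2 for x in block]


first = cached_remote_fn_sketch(double_block)
second = cached_remote_fn_sketch(double_block)  # served from the cache
result = first.remote([1, 2, 3])
```

Because every caller goes through the one factory, tests could clear `TASK_TRACE`, run a pipeline, and compare the trace against the expected task list without patching each call site.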

@anyscalesam
Contributor

The relevant issues you described are closed, @stephanie-wang; can we close this issue?

@anyscalesam added the ray-2.9 label and removed the ray-2.8 label on Nov 8, 2023
3 participants