[Data] Update Dataset.zip() docs (#46757)
Update Dataset.zip() docs to remove incorrect statement about materializing the dataset
Fix ExecutionPlan.__repr__() string typo

Signed-off-by: Scott Lee <sjl@anyscale.com>
scottjlee authored Jul 24, 2024
1 parent 9cd160d commit dca3fb5
Showing 2 changed files with 2 additions and 3 deletions.
1 change: 1 addition & 0 deletions python/ray/data/_internal/plan.py
@@ -96,6 +96,7 @@ def __repr__(self) -> str:
f"ExecutionPlan("
f"dataset_uuid={self._dataset_uuid}, "
f"snapshot_operator={self._snapshot_operator}"
+ f")"
)

def get_plan_as_string(self, dataset_cls: Type["Dataset"]) -> str:
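
The effect of the one-line `__repr__` change can be reproduced outside Ray. The sketch below is only an illustration: `PlanSketch` and `repr_before` are hypothetical names, and only the string-building pattern is taken from the hunk above. Adjacent f-string literals inside parentheses are concatenated, so leaving out the final `f")"` piece yields a repr with an unbalanced parenthesis.

```python
class PlanSketch:
    """Hypothetical stand-in for ExecutionPlan; only the repr logic is mirrored."""

    def __init__(self, dataset_uuid, snapshot_operator):
        self._dataset_uuid = dataset_uuid
        self._snapshot_operator = snapshot_operator

    def repr_before(self) -> str:
        # Pre-fix shape: the literals are concatenated, but no piece
        # closes the "ExecutionPlan(" parenthesis.
        return (
            f"ExecutionPlan("
            f"dataset_uuid={self._dataset_uuid}, "
            f"snapshot_operator={self._snapshot_operator}"
        )

    def __repr__(self) -> str:
        # Post-fix shape: the added f")" literal closes the parenthesis.
        return (
            f"ExecutionPlan("
            f"dataset_uuid={self._dataset_uuid}, "
            f"snapshot_operator={self._snapshot_operator}"
            f")"
        )


plan = PlanSketch("abc123", "Read")
print(plan.repr_before())  # ExecutionPlan(dataset_uuid=abc123, snapshot_operator=Read
print(repr(plan))          # ExecutionPlan(dataset_uuid=abc123, snapshot_operator=Read)
```
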
4 changes: 1 addition & 3 deletions python/ray/data/dataset.py
@@ -2256,7 +2256,7 @@ def sort(

@PublicAPI(api_group=SMD_API_GROUP)
def zip(self, other: "Dataset") -> "Dataset":
- """Materialize and zip the columns of this dataset with the columns of another.
+ """Zip the columns of this dataset with the columns of another.
The datasets must have the same number of rows. Their column sets are
merged, and any duplicate column names are disambiguated with suffixes like
@@ -2277,8 +2277,6 @@ def zip(self, other: "Dataset") -> "Dataset":
>>> ds1.zip(ds2).take_batch()
{'id': array([0, 1, 2, 3, 4]), 'id_1': array([0, 1, 2, 3, 4])}
- Time complexity: O(dataset size / parallelism)
Args:
other: The dataset to zip with on the right hand side.
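
The column-merging behavior the docstring describes can be sketched in plain Python. This is not Ray's implementation: `zip_columns` is a hypothetical helper that works on dicts of lists, and the `_1`, `_2`, ... suffix scheme is an assumption inferred from the `id` / `id_1` output in the docstring example. It also assumes each input has at least one column.

```python
from typing import Dict, List


def zip_columns(left: Dict[str, List], right: Dict[str, List]) -> Dict[str, List]:
    """Merge two column dicts, renaming duplicate column names from `right`."""
    # Both inputs must have the same number of rows.
    n_left = len(next(iter(left.values())))
    n_right = len(next(iter(right.values())))
    if n_left != n_right:
        raise ValueError("datasets must have the same number of rows")

    merged = dict(left)
    for name, values in right.items():
        new_name = name
        suffix = 1
        # Assumed scheme: append _1, _2, ... until the name is unused.
        while new_name in merged:
            new_name = f"{name}_{suffix}"
            suffix += 1
        merged[new_name] = values
    return merged


result = zip_columns({"id": [0, 1, 2, 3, 4]}, {"id": [0, 1, 2, 3, 4]})
print(result)  # {'id': [0, 1, 2, 3, 4], 'id_1': [0, 1, 2, 3, 4]}
```
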