[SPARK-47500][PYTHON][CONNECT][FOLLOWUP] Restore error message for `DataFrame.select(None)`

### What changes were proposed in this pull request?
The refactoring PR #45636 changed the error raised by `DataFrame.select(None)` from `PySparkTypeError` to `AssertionError`. This PR restores the previous error message.
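
For illustration, the restored behavior can be sketched outside of Spark. This is a minimal sketch, not the PySpark implementation: `Column` and `PySparkTypeError` below are hypothetical stand-ins for the real classes, and `validate_select_args` is a made-up helper name that only mirrors the check added in this change (unwrap a single list argument, then reject anything that is not a `str` or `Column`).

```python
from typing import Any, Dict, List


class Column:
    """Hypothetical stand-in for the Spark Connect Column class."""

    def __init__(self, name: str) -> None:
        self.name = name


class PySparkTypeError(TypeError):
    """Hypothetical stand-in for pyspark.errors.PySparkTypeError."""

    def __init__(self, error_class: str, message_parameters: Dict[str, str]) -> None:
        super().__init__(f"[{error_class}] {message_parameters}")
        self.error_class = error_class
        self.message_parameters = message_parameters


def validate_select_args(*cols: Any) -> List[Any]:
    """Mirror the check from the diff (helper name is an assumption)."""
    # A single list argument is unwrapped first, as in DataFrame.select.
    if len(cols) == 1 and isinstance(cols[0], list):
        cols = tuple(cols[0])
    # Anything that is neither a str nor a Column is rejected up front.
    if any(not isinstance(c, (str, Column)) for c in cols):
        raise PySparkTypeError(
            error_class="NOT_LIST_OF_COLUMN_OR_STR",
            message_parameters={"arg_name": "columns"},
        )
    return list(cols)


# Valid inputs pass through unchanged.
ok = validate_select_args(["id", Column("value")])

# None is rejected with the structured error rather than an AssertionError.
try:
    validate_select_args(None)
except PySparkTypeError as exc:
    caught = (exc.error_class, exc.message_parameters)
```

Because the check runs after the list is unwrapped, a call such as `validate_select_args([None])` fails the same way as `validate_select_args(None)`.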

### Why are the changes needed?
error message improvement

### Does this PR introduce _any_ user-facing change?
yes, error message improvement

### How was this patch tested?
Added a test (`test_select_none` in `test_connect_error.py`).

### Was this patch authored or co-authored using generative AI tooling?
no

Closes #46930 from zhengruifeng/py_restore_select_error.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
zhengruifeng authored and HyukjinKwon committed Jun 11, 2024
1 parent 3fe6abd commit 1e4750e
Showing 2 changed files with 16 additions and 0 deletions.
5 changes: 5 additions & 0 deletions python/pyspark/sql/connect/dataframe.py
@@ -223,6 +223,11 @@ def select(self, __cols: Union[List[Column], List[str]]) -> ParentDataFrame:
     def select(self, *cols: "ColumnOrName") -> ParentDataFrame:  # type: ignore[misc]
         if len(cols) == 1 and isinstance(cols[0], list):
             cols = cols[0]
+        if any(not isinstance(c, (str, Column)) for c in cols):
+            raise PySparkTypeError(
+                error_class="NOT_LIST_OF_COLUMN_OR_STR",
+                message_parameters={"arg_name": "columns"},
+            )
         return DataFrame(
             plan.Project(self._plan, [F._to_col(c) for c in cols]),
             session=self._session,
11 changes: 11 additions & 0 deletions python/pyspark/sql/tests/connect/test_connect_error.py
@@ -21,6 +21,7 @@
 from pyspark.errors.exceptions.base import SessionNotSameException
 from pyspark.sql.types import Row
 from pyspark.testing.connectutils import should_test_connect
+from pyspark.errors import PySparkTypeError
 from pyspark.errors.exceptions.connect import AnalysisException
 from pyspark.sql.tests.connect.test_connect_basic import SparkConnectSQLTestCase

@@ -214,6 +215,16 @@ def test_column_cannot_be_constructed_from_string(self):
         with self.assertRaises(TypeError):
             Column("col")

+    def test_select_none(self):
+        with self.assertRaises(PySparkTypeError) as e1:
+            self.connect.range(1).select(None)
+
+        self.check_error(
+            exception=e1.exception,
+            error_class="NOT_LIST_OF_COLUMN_OR_STR",
+            message_parameters={"arg_name": "columns"},
+        )
+

 if __name__ == "__main__":
     from pyspark.sql.tests.connect.test_connect_error import *  # noqa: F401
