Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-45141][PYTHON][INFRA][TESTS] Pin pyarrow==12.0.1 in CI #42897

Closed
wants to merge 1 commit into from

Conversation

zhengruifeng
Copy link
Contributor

What changes were proposed in this pull request?

Pin pyarrow==12.0.1 in CI

Why are the changes needed?

to fix test failure, https://github.com/apache/spark/actions/runs/6167186123/job/16738683632

======================================================================
FAIL [0.095s]: test_from_to_pandas (pyspark.pandas.tests.data_type_ops.test_datetime_ops.DatetimeOpsTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/testing/pandasutils.py", line 122, in _assert_pandas_equal
    assert_series_equal(
  File "/usr/local/lib/python3.9/dist-packages/pandas/_testing/asserters.py", line 931, in assert_series_equal
    assert_attr_equal("dtype", left, right, obj=f"Attributes of {obj}")
  File "/usr/local/lib/python3.9/dist-packages/pandas/_testing/asserters.py", line 415, in assert_attr_equal
    raise_assert_detail(obj, msg, left_attr, right_attr)
  File "/usr/local/lib/python3.9/dist-packages/pandas/_testing/asserters.py", line 599, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: Attributes of Series are different

Attribute "dtype" are different
[left]:  datetime64[ns]
[right]: datetime64[us]

Does this PR introduce any user-facing change?

No

How was this patch tested?

CI and manually test

Was this patch authored or co-authored using generative AI tooling?

No

@zhengruifeng
Copy link
Contributor Author

zhengruifeng commented Sep 13, 2023

I guess the reason is:

#42842 disable old cache, and rebuild the whole image with pyarrow=13.0.0

before #42842 the cached pyarrow==12.0.1 is used, even though pyarrow=13.0.0 has been released.

@zhengruifeng
Copy link
Contributor Author

cc @HyukjinKwon @Yikun

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to not forget, could you add a TODO PR for PyArrow 13.x, @zhengruifeng ?

@zhengruifeng
Copy link
Contributor Author

@dongjoon-hyun thanks for the reminder, I create https://issues.apache.org/jira/browse/SPARK-45143 to track this issue

@HyukjinKwon HyukjinKwon changed the title [SPARK-45141][TESTS] Pin pyarrow==12.0.1 in CI [SPARK-45141][PYTHON][TESTS] Pin pyarrow==12.0.1 in CI Sep 13, 2023
@HyukjinKwon HyukjinKwon changed the title [SPARK-45141][PYTHON][TESTS] Pin pyarrow==12.0.1 in CI [SPARK-45141][PYTHON][INFRA][TESTS] Pin pyarrow==12.0.1 in CI Sep 13, 2023
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM.

@zhengruifeng
Copy link
Contributor Author

all python tests passed, merged to master

@zhengruifeng zhengruifeng deleted the pin_pyarrow branch September 13, 2023 07:52
@itholic
Copy link
Contributor

itholic commented Sep 13, 2023

Thanks!

@dongjoon-hyun
Copy link
Member

According to #45545 (comment) , let me backport this to branch-3.5 and 3.4 to recover the CIs.

dongjoon-hyun pushed a commit that referenced this pull request Mar 17, 2024
### What changes were proposed in this pull request?
Pin `pyarrow==12.0.1` in CI

### Why are the changes needed?
to fix test failure,  https://github.com/apache/spark/actions/runs/6167186123/job/16738683632

```
======================================================================
FAIL [0.095s]: test_from_to_pandas (pyspark.pandas.tests.data_type_ops.test_datetime_ops.DatetimeOpsTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/testing/pandasutils.py", line 122, in _assert_pandas_equal
    assert_series_equal(
  File "/usr/local/lib/python3.9/dist-packages/pandas/_testing/asserters.py", line 931, in assert_series_equal
    assert_attr_equal("dtype", left, right, obj=f"Attributes of {obj}")
  File "/usr/local/lib/python3.9/dist-packages/pandas/_testing/asserters.py", line 415, in assert_attr_equal
    raise_assert_detail(obj, msg, left_attr, right_attr)
  File "/usr/local/lib/python3.9/dist-packages/pandas/_testing/asserters.py", line 599, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: Attributes of Series are different

Attribute "dtype" are different
[left]:  datetime64[ns]
[right]: datetime64[us]
```

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI and manually test

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #42897 from zhengruifeng/pin_pyarrow.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
(cherry picked from commit e3d2dfa)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
dongjoon-hyun pushed a commit that referenced this pull request Mar 17, 2024
Pin `pyarrow==12.0.1` in CI

to fix test failure,  https://github.com/apache/spark/actions/runs/6167186123/job/16738683632

```
======================================================================
FAIL [0.095s]: test_from_to_pandas (pyspark.pandas.tests.data_type_ops.test_datetime_ops.DatetimeOpsTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/testing/pandasutils.py", line 122, in _assert_pandas_equal
    assert_series_equal(
  File "/usr/local/lib/python3.9/dist-packages/pandas/_testing/asserters.py", line 931, in assert_series_equal
    assert_attr_equal("dtype", left, right, obj=f"Attributes of {obj}")
  File "/usr/local/lib/python3.9/dist-packages/pandas/_testing/asserters.py", line 415, in assert_attr_equal
    raise_assert_detail(obj, msg, left_attr, right_attr)
  File "/usr/local/lib/python3.9/dist-packages/pandas/_testing/asserters.py", line 599, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: Attributes of Series are different

Attribute "dtype" are different
[left]:  datetime64[ns]
[right]: datetime64[us]
```

No

CI and manually test

No

Closes #42897 from zhengruifeng/pin_pyarrow.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
(cherry picked from commit e3d2dfa)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 8049a20)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
szehon-ho pushed a commit to szehon-ho/spark that referenced this pull request Aug 7, 2024
Pin `pyarrow==12.0.1` in CI

to fix test failure,  https://github.com/apache/spark/actions/runs/6167186123/job/16738683632

```
======================================================================
FAIL [0.095s]: test_from_to_pandas (pyspark.pandas.tests.data_type_ops.test_datetime_ops.DatetimeOpsTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/testing/pandasutils.py", line 122, in _assert_pandas_equal
    assert_series_equal(
  File "/usr/local/lib/python3.9/dist-packages/pandas/_testing/asserters.py", line 931, in assert_series_equal
    assert_attr_equal("dtype", left, right, obj=f"Attributes of {obj}")
  File "/usr/local/lib/python3.9/dist-packages/pandas/_testing/asserters.py", line 415, in assert_attr_equal
    raise_assert_detail(obj, msg, left_attr, right_attr)
  File "/usr/local/lib/python3.9/dist-packages/pandas/_testing/asserters.py", line 599, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: Attributes of Series are different

Attribute "dtype" are different
[left]:  datetime64[ns]
[right]: datetime64[us]
```

No

CI and manually test

No

Closes apache#42897 from zhengruifeng/pin_pyarrow.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
(cherry picked from commit e3d2dfa)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 8049a20)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants