Skip to content

Commit

Permalink
[SPARK-50706][PYTHON][TESTS] Skip test_value_state_ttl_expiration in …
Browse files Browse the repository at this point in the history
…Coverage build

### What changes were proposed in this pull request?

This PR proposes to skip `test_value_state_ttl_expiration` in Coverage build for now.

### Why are the changes needed?

To make the build passing for now. It fails when the Coverage is on (https://github.com/apache/spark/actions/runs/12544995465/job/34978553717):

```
======================================================================
ERROR [12.848s]: test_value_state_ttl_expiration (pyspark.sql.tests.pandas.test_pandas_transform_with_state.TransformWithStateInPandasTests.test_value_state_ttl_expiration)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/__w/spark/spark/python/pyspark/sql/tests/pandas/test_pandas_transform_with_state.py", line 403, in test_value_state_ttl_expiration
    q.processAllAvailable()
  File "/__w/spark/spark/python/pyspark/sql/streaming/query.py", line 351, in processAllAvailable
    return self._jsq.processAllAvailable()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/spark/spark/python/lib/py4j-0.10.9.8-src.zip/py4j/java_gateway.py", line 1355, in __call__
    return_value = get_return_value(
                   ^^^^^^^^^^^^^^^^^
  File "/__w/spark/spark/python/pyspark/errors/exceptions/captured.py", line 253, in deco
    raise converted from None
pyspark.errors.exceptions.captured.StreamingQueryException: [STREAM_FAILED] Query [id = 623e9008-52cb-4b9d-9343-432e7bd855bb, runId = cc06b909-37fd-4acd-98ff-8809b9df92c7] terminated with exception: [FOREACH_BATCH_USER_FUNCTION_ERROR] An error occurred in the user provided function in foreach batch sink. Reason: An exception was raised by the Python Proxy. Return Message: Traceback (most recent call last):
  File "/__w/spark/spark/python/lib/py4j-0.10.9.8-src.zip/py4j/clientserver.py", line 641, in _call_proxy
    return_value = getattr(self.pool[obj_id], method)(*params)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/spark/spark/python/pyspark/sql/utils.py", line 157, in call
    raise e
  File "/__w/spark/spark/python/pyspark/sql/utils.py", line 154, in call
    self.func(DataFrame(jdf, wrapped_session_jdf), batch_id)
  File "/__w/spark/spark/python/pyspark/sql/tests/pandas/test_pandas_transform_with_state.py", line 334, in check_results
    assertDataFrameEqual(
  File "/__w/spark/spark/python/pyspark/testing/utils.py", line 1074, in assertDataFrameEqual
    assert_rows_equal(actual_list, expected_list, maxErrors=maxErrors, showOnlyDiff=showOnlyDiff)
  File "/__w/spark/spark/python/pyspark/testing/utils.py", line 1030, in assert_rows_equal
    raise PySparkAssertionError(
pyspark.errors.exceptions.base.PySparkAssertionError: [DIFFERENT_ROWS] Results do not match: ( 75.00000 % )
*** actual ***
  Row(id='count-0', count=2)
  Row(id='count-1', count=2)
! Row(id='ttl-count-0', count=1)
! Row(id='ttl-count-1', count=1)
! Row(id='ttl-list-state-count-0', count=1)
! Row(id='ttl-list-state-count-1', count=1)
! Row(id='ttl-map-state-count-0', count=1)
! Row(id='ttl-map-state-count-1', count=1)

*** expected ***
  Row(id='count-0', count=2)
  Row(id='count-1', count=2)
! Row(id='ttl-count-0', count=2)
! Row(id='ttl-count-1', count=2)
! Row(id='ttl-list-state-count-0', count=3)
! Row(id='ttl-list-state-count-1', count=3)
! Row(id='ttl-map-state-count-0', count=2)
! Row(id='ttl-map-state-count-1', count=2)
```

### Does this PR introduce _any_ user-facing change?

No, test-only.

### How was this patch tested?

Will monitor the Coverage build.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#49337 from HyukjinKwon/SPARK-50706.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
  • Loading branch information
HyukjinKwon committed Dec 31, 2024
1 parent 6099de7 commit 5ef556b
Showing 1 changed file with 3 additions and 0 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -922,6 +922,9 @@ def test_transform_with_state_in_pandas_batch_query_initial_state(self):

# This test covers mapState with TTL, an empty state variable
# and additional test against initial state python runner
@unittest.skipIf(
"COVERAGE_PROCESS_START" in os.environ, "Flaky with coverage enabled, skipping for now."
)
def test_transform_with_map_state_metadata(self):
checkpoint_path = tempfile.mktemp()

Expand Down

0 comments on commit 5ef556b

Please sign in to comment.