This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch branch-4.0 in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-4.0 by this push: new 25bee87585fe [SPARK-51758][SS][FOLLOWUP][TESTS] Fix flaky test around watermark due to additional batch causing empty df 25bee87585fe is described below commit 25bee87585fe0d2490c5fa5e0a51a32db25a8a3b Author: Anish Shrigondekar <anish.shrigonde...@databricks.com> AuthorDate: Thu Apr 17 14:33:56 2025 +0900 [SPARK-51758][SS][FOLLOWUP][TESTS] Fix flaky test around watermark due to additional batch causing empty df ### What changes were proposed in this pull request? Fix flaky test around watermark due to additional batch causing empty df ### Why are the changes needed? Without the changes, sometimes we would see the test fail because the extra batch with empty df case was not accounted for. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Test only change ``` Will test the following Python tests: ['pyspark.sql.tests.pandas.test_pandas_transform_with_state TransformWithStateInPandasTests.test_transform_with_state_with_wmark_and_non_event_time'] python3.9 python_implementation is CPython python3.9 version is: Python 3.9.20 Starting test(python3.9): pyspark.sql.tests.pandas.test_pandas_transform_with_state TransformWithStateInPandasTests.test_transform_with_state_with_wmark_and_non_event_time (temp output: /Users/anish.shrigondekar/spark/spark/python/target/ad6814cf-2fcb-4f6c-a6c3-d63ce94013d4/python3.9__pyspark.sql.tests.pandas.test_pandas_transform_with_state_TransformWithStateInPandasTests.test_transform_with_state_with_wmark_and_non_event_time__7nb2vczb.log) Finished test(python3.9): pyspark.sql.tests.pandas.test_pandas_transform_with_state TransformWithStateInPandasTests.test_transform_with_state_with_wmark_and_non_event_time (40s) Tests passed in 40 seconds ``` ### Was this patch authored or co-authored using generative AI tooling? No Closes #50615 from anishshri-db/task/SPARK-51758-test. Authored-by: Anish Shrigondekar <anish.shrigonde...@databricks.com> Signed-off-by: Jungtaek Lim <kabhwan.opensou...@gmail.com> (cherry picked from commit 2b78cec410cafad994ca3a2a3204c1b17fec86fb) Signed-off-by: Jungtaek Lim <kabhwan.opensou...@gmail.com> --- python/pyspark/sql/tests/pandas/test_pandas_transform_with_state.py | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/python/pyspark/sql/tests/pandas/test_pandas_transform_with_state.py b/python/pyspark/sql/tests/pandas/test_pandas_transform_with_state.py index 44ea90ab6659..ed343085854d 100644 --- a/python/pyspark/sql/tests/pandas/test_pandas_transform_with_state.py +++ b/python/pyspark/sql/tests/pandas/test_pandas_transform_with_state.py @@ -641,11 +641,14 @@ class TransformWithStateInPandasTestsMixin: assert set(batch_df.sort("id").collect()) == { Row(id="a", timestamp="4"), } - else: + elif batch_id == 2: # watermark for late event = 10 and min event = 2 with no filtering assert set(batch_df.sort("id").collect()) == { Row(id="a", timestamp="2"), } + else: + for q in self.spark.streams.active: + q.stop() self._test_transform_with_state_in_pandas_event_time( MinEventTimeStatefulProcessor(), check_results, "None" --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org