This is an automated email from the ASF dual-hosted git repository.
ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new be10dc03bbb0 [SPARK-55801][PYTHON] Fix type hint of _SimpleStreamReaderWrapper.getCache
be10dc03bbb0 is described below
commit be10dc03bbb0da40f0d5b3058c8907fdffaac41a
Author: Tian Gao <[email protected]>
AuthorDate: Tue Mar 3 10:58:23 2026 +0800
[SPARK-55801][PYTHON] Fix type hint of _SimpleStreamReaderWrapper.getCache
### What changes were proposed in this pull request?
Add `None` to type hint of `getCache` method.
### Why are the changes needed?
The `getCache` method can return `None` but the type hint says otherwise.
This confuses the type checker.
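A minimal sketch of the mismatch, not the PySpark code itself: `get_cached` and its `found` flag are hypothetical stand-ins for a cache lookup that can miss. Declaring the return type as `Optional[...]` lets the `return None` branch type-check without a `# type: ignore`, and it makes callers narrow the result before iterating.

```python
from typing import Iterator, List, Optional, Tuple


def get_cached(rows: List[Tuple], found: bool) -> Optional[Iterator[Tuple]]:
    """Hypothetical stand-in for a cache lookup that may miss."""
    if not found:
        # With a bare Iterator[Tuple] return type, a checker such as mypy
        # would reject this line unless it is silenced with a type: ignore.
        return None
    return iter(rows)


hit = get_cached([(1,), (2,)], found=True)
miss = get_cached([(1,), (2,)], found=False)

# Callers narrow the Optional before iterating.
hit_rows = list(hit) if hit is not None else []
miss_rows = list(miss) if miss is not None else []
```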
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
CI.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #54582 from gaogaotiantian/get-cache-type-hint.
Authored-by: Tian Gao <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
---
python/pyspark/sql/datasource_internal.py | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/python/pyspark/sql/datasource_internal.py b/python/pyspark/sql/datasource_internal.py
index 2ac6c280e822..34039216c1d8 100644
--- a/python/pyspark/sql/datasource_internal.py
+++ b/python/pyspark/sql/datasource_internal.py
@@ -19,7 +19,7 @@
 import json
 import copy
 from itertools import chain
-from typing import Iterator, List, Sequence, Tuple, Type, Dict
+from typing import Iterator, List, Optional, Sequence, Tuple, Type, Dict
 from pyspark.sql.datasource import (
     DataSource,
@@ -143,7 +143,7 @@ class _SimpleStreamReaderWrapper(DataSourceStreamReader):
             assert self.cache[-1].end == end
         return [SimpleInputPartition(start, end)]
-    def getCache(self, start: dict, end: dict) -> Iterator[Tuple]:
+    def getCache(self, start: dict, end: dict) -> Optional[Iterator[Tuple]]:
         start_idx = -1
         end_idx = -1
         for idx, entry in enumerate(self.cache):
@@ -155,7 +155,7 @@ class _SimpleStreamReaderWrapper(DataSourceStreamReader):
                 end_idx = idx
                 break
         if start_idx == -1 or end_idx == -1:
-            return None  # type: ignore[return-value]
+            return None
         # Chain all the data iterator between start offset and end offset
         # need to copy here to avoid exhausting the original data iterator.
         entries = [copy.copy(entry.iterator) for entry in self.cache[start_idx : end_idx + 1]]
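
The copy-then-chain step in the last hunk can be sketched in isolation. This is an illustration under assumptions, not the PySpark implementation: the `cached` list is hypothetical, and list-backed iterators are used because they are known to support `copy.copy` (generator objects, for example, do not).

```python
import copy
from itertools import chain

# Hypothetical cached entries; list iterators support copy.copy.
cached = [iter([(1,), (2,)]), iter([(3,)])]

# Copy each iterator before chaining so the cached originals are not consumed.
first_read = list(chain(*[copy.copy(it) for it in cached]))

# A second read sees the same rows because only the copies were exhausted.
second_read = list(chain(*[copy.copy(it) for it in cached]))
```

Chaining the cached iterators directly would drain them on the first read, so a later read over the same offset range would come back empty.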