This is an automated email from the ASF dual-hosted git repository.
zhengruifeng pushed a commit to branch branch-4.x
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-4.x by this push:
new 465c5e113967 [SPARK-54993][PYTHON] Add type hints to NameTypeHolder
and IndexNameTypeHolder classes
465c5e113967 is described below
commit 465c5e1139673e478b5d7b9634c2e9e051cd9819
Author: adith-os <[email protected]>
AuthorDate: Wed May 13 15:01:13 2026 +0800
[SPARK-54993][PYTHON] Add type hints to NameTypeHolder and
IndexNameTypeHolder classes
### What changes were proposed in this pull request?
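This PR adds explicit type annotations to the `name`, `tpe`, and `short_name`
class attributes of `NameTypeHolder` and `IndexNameTypeHolder` in
`python/pyspark/pandas/typedef/typehints.py`, and removes the
`# type: ignore[assignment]` comments that are no longer needed.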
### Why are the changes needed?
This is a subtask of
[SPARK-54953](https://issues.apache.org/jira/browse/SPARK-54953) (upgrading mypy
to the latest version). With explicit annotations, mypy can check assignments to
these attributes instead of relying on `# type: ignore` suppressions.
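For illustration only, a minimal sketch of why the unannotated attributes were flagged. `FakeDtype`, `UnannotatedHolder`, and `AnnotatedHolder` are hypothetical stand-ins, not names from PySpark; the real classes use `Dtype` from the PySpark typing module.

```python
from typing import Optional, Type, Union


class FakeDtype:
    """Hypothetical stand-in for a pandas/numpy dtype; not part of PySpark."""


class UnannotatedHolder:
    # With no annotation, mypy takes the type of `tpe` from its initializer,
    # which is None, so assigning a dtype later is reported as [assignment].
    tpe = None


class AnnotatedHolder:
    # An explicit Optional annotation keeps the None default while allowing
    # a class or a dtype-like instance to be assigned later.
    name: Optional[str] = None
    tpe: Optional[Union[type, FakeDtype]] = None
    short_name: str = "NameType"


# Same pattern as in typehints.py: create a subclass dynamically, then set `tpe`.
new_class: Type[AnnotatedHolder] = type(
    AnnotatedHolder.short_name, (AnnotatedHolder,), {}
)
new_class.tpe = FakeDtype()  # accepted by mypy without a `# type: ignore`
```

Runtime behavior is the same in both variants; only the statically declared type of the attribute changes.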
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Errors reported for typehints.py before the changes (with the `# type: ignore[assignment]` comments deleted):
```
starting mypy annotations test...
annotations failed mypy checks:
python/pyspark/pandas/typedef/typehints.py:729: error: Incompatible types in assignment (expression has type "ExtensionDtype", variable has type "None")  [assignment]
python/pyspark/pandas/typedef/typehints.py:896: error: Incompatible types in assignment (expression has type "ExtensionDtype", variable has type "None")  [assignment]
python/pyspark/pandas/typedef/typehints.py:900: error: Incompatible types in assignment (expression has type "type[Any] | Any", variable has type "None")  [assignment]
python/pyspark/pandas/typedef/typehints.py:917: error: Incompatible types in assignment (expression has type "ExtensionDtype", variable has type "None")  [assignment]
python/pyspark/pandas/typedef/typehints.py:920: error: Incompatible types in assignment (expression has type "type[Any] | Any", variable has type "None")  [assignment]
```
After changes:
Verified with mypy that typehints.py reports no type errors.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: GitHub Copilot GPT 5.2-Codex
Closes #55807 from adith-os/SPARK-54993-improve-typehints.
Authored-by: adith-os <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
(cherry picked from commit 7d245ea0f8381195662cd5c35049e85898bb5600)
Signed-off-by: Ruifeng Zheng <[email protected]>
---
python/pyspark/pandas/typedef/typehints.py | 30 ++++++++++++------------------
1 file changed, 12 insertions(+), 18 deletions(-)
diff --git a/python/pyspark/pandas/typedef/typehints.py b/python/pyspark/pandas/typedef/typehints.py
index 5c3ea90f34ee..658f3fb3bc62 100644
--- a/python/pyspark/pandas/typedef/typehints.py
+++ b/python/pyspark/pandas/typedef/typehints.py
@@ -25,7 +25,7 @@ import sys
import typing
from collections.abc import Iterable
from inspect import isclass
-from typing import Any, Callable, Generic, List, Tuple, Union, Type, get_type_hints
+from typing import Any, Callable, Generic, List, Optional, Tuple, Union, Type, get_type_hints
import numpy as np
import pandas as pd
@@ -127,15 +127,15 @@ class UnknownType:
class IndexNameTypeHolder:
- name = None
- tpe = None
- short_name = "IndexNameType"
+ name: Optional[str] = None
+ tpe: Optional[Union[type, Dtype]] = None
+ short_name: str = "IndexNameType"
class NameTypeHolder:
- name = None
- tpe = None
- short_name = "NameType"
+ name: Optional[str] = None
+ tpe: Optional[Union[type, Dtype]] = None
+ short_name: str = "NameType"
def as_spark_type(
@@ -722,7 +722,7 @@ def create_type_for_series_type(param: Any) -> Type[SeriesType]:
new_class: Type[NameTypeHolder]
if isinstance(param, ExtensionDtype):
new_class = type(NameTypeHolder.short_name, (NameTypeHolder,), {})
- new_class.tpe = param # type: ignore[assignment]
+ new_class.tpe = param
else:
if LooseVersion(pd.__version__) < "3.0.0":
new_class = param.type if isinstance(param, np.dtype) else param
@@ -889,13 +889,11 @@ def _new_type_holders(
new_param.name = param.start
if LooseVersion(pd.__version__) < "3.0.0":
if isinstance(param.stop, ExtensionDtype):
- new_param.tpe = param.stop # type: ignore[assignment]
+ new_param.tpe = param.stop
else:
# When the given argument is a numpy's dtype instance.
new_param.tpe = (
- param.stop.type # type: ignore[assignment]
- if isinstance(param.stop, np.dtype)
- else param.stop
+ param.stop.type if isinstance(param.stop, np.dtype) else param.stop
)
else:
new_param.tpe = param.stop
@@ -910,13 +908,9 @@ def _new_type_holders(
)
if LooseVersion(pd.__version__) < "3.0.0":
if isinstance(param, ExtensionDtype):
- new_type.tpe = param # type: ignore[assignment]
+ new_type.tpe = param
else:
- new_type.tpe = (
- param.type # type: ignore[assignment]
- if isinstance(param, np.dtype)
- else param
- )
+ new_type.tpe = param.type if isinstance(param, np.dtype) else param
else:
new_type.tpe = param
new_types.append(new_type)