Copilot commented on code in PR #756: URL: https://github.com/apache/sedona-db/pull/756#discussion_r3076021938
########## python/sedonadb/tests/geography/test_geog_measures.py: ########## @@ -0,0 +1,53 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +import pytest +from sedonadb.testing import BigQuery, PostGIS, SedonaDB, geog_or_null +import sedonadb + +if "s2geography" not in sedonadb.__features__: + pytest.skip("Python package built without s2geography") Review Comment: `pytest.skip()` is being called at module import time; pytest requires `allow_module_level=True` for module-level skips. Without it, running the tests in an environment without `s2geography` will raise an error during collection instead of cleanly skipping the module. ```suggestion pytest.skip("Python package built without s2geography", allow_module_level=True) ``` ########## python/sedonadb/tests/geography/test_geog_accessors.py: ########## @@ -0,0 +1,51 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +import pytest +import sedonadb +from sedonadb.testing import BigQuery, SedonaDB, geog_or_null + +if "s2geography" not in sedonadb.__features__: + pytest.skip("Python package built without s2geography") Review Comment: `pytest.skip()` is being called at module import time; pytest requires `allow_module_level=True` for module-level skips. Without it, running the tests in an environment without `s2geography` will raise an error during collection instead of cleanly skipping the module. ```suggestion pytest.skip("Python package built without s2geography", allow_module_level=True) ``` ########## python/sedonadb/tests/geography/test_constructors_parsers_formatters.py: ########## @@ -0,0 +1,83 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + + +import pytest +import sedonadb +from sedonadb.testing import BigQuery, PostGIS, SedonaDB, geog_or_null, val_or_null + +if "s2geography" not in sedonadb.__features__: + pytest.skip("Python package built without s2geography") Review Comment: `pytest.skip()` is being called at module import time; pytest requires `allow_module_level=True` for module-level skips. Without it, running the tests in an environment without `s2geography` will raise an error during collection instead of cleanly skipping the module. ```suggestion pytest.skip("Python package built without s2geography", allow_module_level=True) ``` ########## python/sedonadb/python/sedonadb/testing.py: ########## @@ -657,6 +658,206 @@ def __init__(self, uri=None): cur.execute("SET max_parallel_workers_per_gather TO 0") +class BigQuery(DBEngine): + """A BigQuery implementation of the DBEngine using ADBC + + Uses the ADBC BigQuery driver. Authentication uses Application Default + Credentials (ADC) by default — run ``gcloud auth application-default login`` + once to set that up. Set the following environment variables to configure + the connection: + + - SEDONADB_BIGQUERY_TEST_PROJECT_ID: GCP project identifier. Defaults to + "sedonadb-testing". + - SEDONADB_BIGQUERY_TEST_DATASET_ID: Datasert identifier. In general data is Review Comment: Typo in the BigQuery engine docstring: "Datasert" should be "Dataset". ```suggestion - SEDONADB_BIGQUERY_TEST_DATASET_ID: Dataset identifier. In general data is ``` ########## python/sedonadb/python/sedonadb/testing.py: ########## @@ -657,6 +658,206 @@ def __init__(self, uri=None): cur.execute("SET max_parallel_workers_per_gather TO 0") +class BigQuery(DBEngine): + """A BigQuery implementation of the DBEngine using ADBC + + Uses the ADBC BigQuery driver. Authentication uses Application Default + Credentials (ADC) by default — run ``gcloud auth application-default login`` + once to set that up. Set the following environment variables to configure + the connection: + + - SEDONADB_BIGQUERY_TEST_PROJECT_ID: GCP project identifier. Defaults to + "sedonadb-testing". + - SEDONADB_BIGQUERY_TEST_DATASET_ID: Datasert identifier. In general data is + not scanned for these tests because doing so would incur cost. + - SEDONADB_BIGQUERY_TEST_CREDENTIALS_FILE: (optional) Path to a service + account JSON key file. When omitted, ADC is used instead. + + Unless modifying these tests, the cached results should allow these tests + to run without an active connection (and should allow tests to run locally + much faster as opening a connection to BigQuery is slow). + """ + + _CACHE_DIR = Path(__file__).resolve().parent.parent.parent / "tests" / "geography" + _shared_cache: "ArrowSQLCache | None" = None + + def __init__(self, cache_path: "Path | None" = None): + self._cache_path = cache_path or self._CACHE_DIR / "bigquery_cache.yml" + if cache_path is not None or BigQuery._shared_cache is None: + BigQuery._shared_cache = ArrowSQLCache("bigquery", self._cache_path) + self._file_cache = BigQuery._shared_cache + self.con = None + self._con_attempted = False + + def _ensure_con(self): + import adbc_driver_bigquery + import adbc_driver_bigquery.dbapi + + if self.con is not None or self._con_attempted: + return + self._con_attempted = True + + project_id = os.environ.get( + "SEDONADB_BIGQUERY_TEST_PROJECT_ID", "sedonadb-testing" + ) + dataset_id = os.environ.get( + "SEDONADB_BIGQUERY_TEST_DATASET_ID", "sedonadb_test" + ) + credentials_file = os.environ.get("SEDONADB_BIGQUERY_TEST_CREDENTIALS_FILE") + + db_kwargs = { + adbc_driver_bigquery.DatabaseOptions.PROJECT_ID.value: project_id, + adbc_driver_bigquery.DatabaseOptions.DATASET_ID.value: dataset_id, + } + + if credentials_file: + db_kwargs[adbc_driver_bigquery.DatabaseOptions.AUTH_TYPE.value] = ( + adbc_driver_bigquery.DatabaseOptions.AUTH_VALUE_JSON_CREDENTIAL_FILE.value + ) + db_kwargs[adbc_driver_bigquery.DatabaseOptions.AUTH_CREDENTIALS.value] = ( + credentials_file + ) + + self.con = adbc_driver_bigquery.dbapi.connect(db_kwargs=db_kwargs) + + def close(self): + """Close the connection and flush any new cache entries to disk""" + self._file_cache.flush() + if self.con: + self.con.close() + + def __del__(self): + self.close() Review Comment: Calling `close()` from `__del__` can be unsafe because it may run during interpreter shutdown when modules/filesystem state are partially torn down, potentially causing noisy ignored exceptions (and it performs disk I/O via cache flush). Consider removing `__del__` and relying on explicit `close()`/context manager usage, or guard `close()` in `__del__` with broad exception handling. ```suggestion try: self.close() except Exception: pass ``` ########## python/sedonadb/python/sedonadb/testing.py: ########## @@ -657,6 +658,206 @@ def __init__(self, uri=None): cur.execute("SET max_parallel_workers_per_gather TO 0") +class BigQuery(DBEngine): + """A BigQuery implementation of the DBEngine using ADBC + + Uses the ADBC BigQuery driver. Authentication uses Application Default + Credentials (ADC) by default — run ``gcloud auth application-default login`` + once to set that up. Set the following environment variables to configure + the connection: + + - SEDONADB_BIGQUERY_TEST_PROJECT_ID: GCP project identifier. Defaults to + "sedonadb-testing". + - SEDONADB_BIGQUERY_TEST_DATASET_ID: Datasert identifier. In general data is + not scanned for these tests because doing so would incur cost. + - SEDONADB_BIGQUERY_TEST_CREDENTIALS_FILE: (optional) Path to a service + account JSON key file. When omitted, ADC is used instead. + + Unless modifying these tests, the cached results should allow these tests + to run without an active connection (and should allow tests to run locally + much faster as opening a connection to BigQuery is slow). + """ + + _CACHE_DIR = Path(__file__).resolve().parent.parent.parent / "tests" / "geography" + _shared_cache: "ArrowSQLCache | None" = None + + def __init__(self, cache_path: "Path | None" = None): + self._cache_path = cache_path or self._CACHE_DIR / "bigquery_cache.yml" + if cache_path is not None or BigQuery._shared_cache is None: + BigQuery._shared_cache = ArrowSQLCache("bigquery", self._cache_path) + self._file_cache = BigQuery._shared_cache + self.con = None + self._con_attempted = False + + def _ensure_con(self): + import adbc_driver_bigquery + import adbc_driver_bigquery.dbapi + Review Comment: `BigQuery` defers importing `adbc_driver_bigquery` until query execution. If a query is missing from the cache, this will raise `ImportError` (or other connection errors) at execution time, bypassing the existing `create_or_skip()` skip behavior. Consider catching `ImportError`/connection failures in `_ensure_con()` and converting them into a skip (when skipping is allowed) or a clearer exception that includes `install_hint()` and the relevant environment variables. ########## python/sedonadb/tests/geography/test_geog_transformations.py: ########## @@ -0,0 +1,39 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +import pytest +from sedonadb.testing import BigQuery, PostGIS, SedonaDB, geog_or_null +import sedonadb + +if "s2geography" not in sedonadb.__features__: + pytest.skip("Python package built without s2geography") Review Comment: `pytest.skip()` is being called at module import time; pytest requires `allow_module_level=True` for module-level skips. Without it, running the tests in an environment without `s2geography` will raise an error during collection instead of cleanly skipping the module. ```suggestion pytest.skip("Python package built without s2geography", allow_module_level=True) ``` ########## python/sedonadb/tests/geography/test_geog_predicates.py: ########## @@ -0,0 +1,51 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. + +import pytest +import sedonadb +from sedonadb.testing import BigQuery, PostGIS, SedonaDB, geog_or_null + +if "s2geography" not in sedonadb.__features__: + pytest.skip("Python package built without s2geography") Review Comment: `pytest.skip()` is being called at module import time; pytest requires `allow_module_level=True` for module-level skips. Without it, running the tests in an environment without `s2geography` will raise an error during collection instead of cleanly skipping the module. ```suggestion pytest.skip("Python package built without s2geography", allow_module_level=True) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
