jaychia commented on PR #355: URL: https://github.com/apache/iceberg-python/pull/355#issuecomment-1928068519
> Should we also have some sanity checks, for example:
>
> https://github.com/apache/iceberg-python/blob/a4856bc2eadf90ac85dec96d4502ca3517bb1bb5/tests/integration/test_reads.py#L184

We could either do this in this PR or as a follow-up. Let me know your preference!

I have some tests written up, but I'm having some trouble starting the dev environment locally. Is `make test-integration` still the recommended way to run the integration tests? I get some errors when provisioning data:

```
docker-compose -f dev/docker-compose-integration.yml exec -T spark-iceberg ipython ./provision.py
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
24/02/05 20:33:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Could not initialize FileIO: pyiceberg.io.pyarrow.PyArrowFileIO
24/02/05 20:33:58 WARN BaseTransaction: Failed to load metadata for a committed snapshot, skipping clean-up
24/02/05 20:33:58 WARN BaseTransaction: Failed to load metadata for a committed snapshot, skipping clean-up
24/02/05 20:34:04 WARN BaseTransaction: Failed to load metadata for a committed snapshot, skipping clean-up
24/02/05 20:34:05 WARN metastore: Failed to connect to the MetaStore Server...
24/02/05 20:34:06 WARN metastore: Failed to connect to the MetaStore Server...
24/02/05 20:34:07 WARN metastore: Failed to connect to the MetaStore Server...
---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
File /opt/spark/provision.py:51
     27 catalogs = {
     28     'rest': load_catalog(
     29         "rest",
   (...)
     47     ),
     48 }
     50 for catalog_name, catalog in catalogs.items():
---> 51     spark.sql(
     52         f"""
     53         CREATE DATABASE IF NOT EXISTS {catalog_name}.default;
     54         """
     55     )
     57 schema = Schema(
     58     NestedField(field_id=1, name="uuid_col", field_type=UUIDType(), required=False),
     59     NestedField(field_id=2, name="fixed_col", field_type=FixedType(25), required=False),
     60 )
     62 catalog.create_table(identifier="default.test_uuid_and_fixed_unpartitioned", schema=schema)

File /opt/spark/python/pyspark/sql/session.py:1440, in SparkSession.sql(self, sqlQuery, args, **kwargs)
   1438 try:
   1439     litArgs = {k: _to_java_column(lit(v)) for k, v in (args or {}).items()}
-> 1440     return DataFrame(self._jsparkSession.sql(sqlQuery, litArgs), self)
   1441 finally:
   1442     if len(kwargs) > 0:

File /opt/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
   1316 command = proto.CALL_COMMAND_NAME +\
   1317     self.command_header +\
   1318     args_command +\
   1319     proto.END_COMMAND_PART
   1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
   1323     answer, self.gateway_client, self.target_id, self.name)
   1325 for temp_arg in temp_args:
   1326     if hasattr(temp_arg, "_detach"):

File /opt/spark/python/pyspark/errors/exceptions/captured.py:169, in capture_sql_exception.<locals>.deco(*a, **kw)
    167 def deco(*a: Any, **kw: Any) -> Any:
    168     try:
--> 169         return f(*a, **kw)
    170     except Py4JJavaError as e:
    171         converted = convert_exception(e.java_exception)

File /opt/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py:326, in get_return_value(answer, gateway_client, target_id, name)
    324 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
    325 if answer[1] == REFERENCE_TYPE:
--> 326     raise Py4JJavaError(
    327         "An error occurred while calling {0}{1}{2}.\n".
    328         format(target_id, ".", name), value)
    329 else:
    330     raise Py4JError(
    331         "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
    332         format(target_id, ".", name, value))

Py4JJavaError: An error occurred while calling o41.sql.
: org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive Metastore
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org