jaychia commented on PR #355: URL: https://github.com/apache/iceberg-python/pull/355#issuecomment-1928068519
> Should we also have some sanity checks, for example:
>
> https://github.com/apache/iceberg-python/blob/a4856bc2eadf90ac85dec96d4502ca3517bb1bb5/tests/integration/test_reads.py#L184

We could either do this in this PR or as a follow-up. Let me know your preference!

I have some tests written up, but I'm having some trouble starting the dev environment locally. Is `make test-integration` still the recommended way to run the integration tests? I get some errors when provisioning data:

```
docker-compose -f dev/docker-compose-integration.yml exec -T spark-iceberg ipython ./provision.py
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
24/02/05 20:33:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Could not initialize FileIO: pyiceberg.io.pyarrow.PyArrowFileIO
24/02/05 20:33:58 WARN BaseTransaction: Failed to load metadata for a committed snapshot, skipping clean-up
24/02/05 20:33:58 WARN BaseTransaction: Failed to load metadata for a committed snapshot, skipping clean-up
24/02/05 20:34:04 WARN BaseTransaction: Failed to load metadata for a committed snapshot, skipping clean-up
24/02/05 20:34:05 WARN metastore: Failed to connect to the MetaStore Server...
24/02/05 20:34:06 WARN metastore: Failed to connect to the MetaStore Server...
24/02/05 20:34:07 WARN metastore: Failed to connect to the MetaStore Server...
---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
File /opt/spark/provision.py:51
     27 catalogs = {
     28     'rest': load_catalog(
     29         "rest",
   (...)
     47     ),
     48 }
     50 for catalog_name, catalog in catalogs.items():
---> 51     spark.sql(
     52         f"""
     53         CREATE DATABASE IF NOT EXISTS {catalog_name}.default;
     54         """
     55     )
     57 schema = Schema(
     58     NestedField(field_id=1, name="uuid_col", field_type=UUIDType(), required=False),
     59     NestedField(field_id=2, name="fixed_col", field_type=FixedType(25), required=False),
     60 )
     62 catalog.create_table(identifier="default.test_uuid_and_fixed_unpartitioned", schema=schema)

File /opt/spark/python/pyspark/sql/session.py:1440, in SparkSession.sql(self, sqlQuery, args, **kwargs)
   1438 try:
   1439     litArgs = {k: _to_java_column(lit(v)) for k, v in (args or {}).items()}
-> 1440     return DataFrame(self._jsparkSession.sql(sqlQuery, litArgs), self)
   1441 finally:
   1442     if len(kwargs) > 0:

File /opt/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
   1316 command = proto.CALL_COMMAND_NAME +\
   1317     self.command_header +\
   1318     args_command +\
   1319     proto.END_COMMAND_PART
   1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
   1323     answer, self.gateway_client, self.target_id, self.name)
   1325 for temp_arg in temp_args:
   1326     if hasattr(temp_arg, "_detach"):

File /opt/spark/python/pyspark/errors/exceptions/captured.py:169, in capture_sql_exception.<locals>.deco(*a, **kw)
    167 def deco(*a: Any, **kw: Any) -> Any:
    168     try:
--> 169         return f(*a, **kw)
    170     except Py4JJavaError as e:
    171         converted = convert_exception(e.java_exception)

File /opt/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/protocol.py:326, in get_return_value(answer, gateway_client, target_id, name)
    324 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
    325 if answer[1] == REFERENCE_TYPE:
--> 326     raise Py4JJavaError(
    327         "An error occurred while calling {0}{1}{2}.\n".
    328         format(target_id, ".", name), value)
    329 else:
    330     raise Py4JError(
    331         "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
    332         format(target_id, ".", name, value))

Py4JJavaError: An error occurred while calling o41.sql.
: org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive Metastore
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org