djchapm opened a new issue, #8767: URL: https://github.com/apache/iceberg/issues/8767
### Feature Request / Improvement

Hi, I'm writing this in an effort to improve the documentation. I spent a crazy amount of time writing parquet-avro files to S3 with Iceberg and the Glue catalog, but could never query the data using Athena. I thought it had to do with all the missing metadata on the Glue tables, but this was a red herring. The problem was that writing files does not automatically update the table metadata.

The API documentation for Table.io() made me think that using an OutputFile obtained via Table.io() would update the metadata. My usage:

```
OutputFile outputFile = table.io().newOutputFile(location);
appenderLocation.put(messageType, location);
FileAppender<GenericRecord> appender = Parquet.write(outputFile)
    .forTable(table)
    .setAll(propsBuilder)
    .createWriterFunc(ParquetAvroWriter::buildWriter)
    .build();
```

On closing the appender, the file is written but the metadata is not updated. My table comes from GlueCatalog.loadTable(). I'm new, but I could not find it documented anywhere that you then have to look the file up again as an InputFile, create a transaction on the table, and commit it:

```
log.info("Closing appender for message type {}", key);
value.close(); // Appender from above
// one attempt, does nothing:
// tables.get(key).rewriteManifests();
log.info("Committing {} file {}", key, appenderLocation.get(key));
InputFile inputFile = tables.get(key).io().newInputFile(appenderLocation.get(key));
DataFile dataFile = DataFiles.builder(tables.get(key).spec())
    .withInputFile(inputFile)
    .withMetrics(value.metrics())
    .withFormat(FileFormat.PARQUET)
    .build();
Transaction t = tables.get(key).newTransaction();
t.newAppend().appendFile(dataFile).commit();
// commit all changes to the table
t.commitTransaction();
```

So I would like to see improvements to the documentation and the AWS integration for writing Parquet data using GlueCatalog.
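To make the distinction above concrete, here is a minimal sketch in plain Java. It is a toy model, not Iceberg's API: ToyTable, writeFile, visibleFiles, and the s3://bucket/... path are all hypothetical names made up for illustration. It only tries to show why a file that exists in storage stays invisible to a query engine until an append that references it is committed to the table metadata.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: this class is hypothetical and is NOT part of Iceberg.
final class ToyTable {
    // Stands in for object storage (e.g. S3): every file that was ever written.
    private final Set<String> storage = new HashSet<>();
    // Stands in for table metadata: only these files are visible to a query engine.
    private final List<String> committedFiles = new ArrayList<>();
    // Files staged by an append that has not been committed yet.
    private final List<String> pendingAppend = new ArrayList<>();

    // Analogous to closing a file appender: the file lands in storage,
    // but the table metadata is left untouched.
    void writeFile(String location) {
        storage.add(location);
    }

    // Analogous to staging a data file on an append operation.
    ToyTable appendFile(String location) {
        if (!storage.contains(location)) {
            throw new IllegalArgumentException("No such file in storage: " + location);
        }
        pendingAppend.add(location);
        return this;
    }

    // Analogous to committing the append: staged files become part of the metadata.
    void commit() {
        committedFiles.addAll(pendingAppend);
        pendingAppend.clear();
    }

    // What a query engine that reads the table metadata would scan.
    List<String> visibleFiles() {
        return new ArrayList<>(committedFiles);
    }

    public static void main(String[] args) {
        ToyTable table = new ToyTable();
        String location = "s3://bucket/data/part-0.parquet"; // hypothetical path

        table.writeFile(location);
        System.out.println("after write:  " + table.visibleFiles());

        table.appendFile(location).commit();
        System.out.println("after commit: " + table.visibleFiles());
    }
}
```

In this model the first print shows no visible files even though the file has been written, which mirrors the Athena symptom described above; only the explicit append plus commit makes the file part of the table.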
Or at least a test or example that people could follow for writing files and updating the corresponding catalog metadata using public APIs (the JUnit tests do all kinds of metadata updates, but with protected APIs that we cannot access). Let me know your thoughts.

### Query engine

Athena

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org