jiyis opened a new issue, #11329: URL: https://github.com/apache/iceberg/issues/11329
### Query engine

Iceberg API 1.6.1

### Question

### Schema

```java
TableIdentifier tableIdentifier = TableIdentifier.of("default", "example_table");

Schema schema = new Schema(
    Types.NestedField.optional(1, "event_id", Types.StringType.get()),
    Types.NestedField.optional(2, "username", Types.StringType.get()),
    Types.NestedField.optional(3, "userid", Types.IntegerType.get()),
    Types.NestedField.optional(4, "api_version", Types.StringType.get()),
    Types.NestedField.optional(5, "command", Types.StringType.get())
);

PartitionSpec spec = PartitionSpec.builderFor(schema)
    .bucket("event_id", 10)
    .build();
```

### Insert data

```java
TableIdentifier name = TableIdentifier.of("default", "example_table");
Table table = catalog.loadTable(name);
Schema schema = table.schema();

GenericAppenderFactory appenderFactory = new GenericAppenderFactory(schema);

int partitionId = 1, taskId = 1;
OutputFileFactory outputFileFactory = OutputFileFactory.builderFor(table, partitionId, taskId)
    .format(FileFormat.AVRO)
    .build();

final PartitionKey partitionKey = new PartitionKey(table.spec(), table.spec().schema());

PartitionedFanoutWriter<Record> partitionedFanoutWriter = new PartitionedFanoutWriter<>(
    table.spec(), FileFormat.AVRO, appenderFactory, outputFileFactory, table.io(), 10 * 1024 * 1024) {
  @Override
  protected PartitionKey partition(Record record) {
    partitionKey.partition(record);
    return partitionKey;
  }
};

GenericRecord genericRecord = GenericRecord.create(table.schema());
List<String> levels = Arrays.asList("info", "debug", "error", "warn");
Random random = new Random();

for (int i = 0; i < 10000; i++) {
  GenericRecord record = genericRecord.copy();
  String eventId = UUID.randomUUID().toString();
  record.setField("event_id", eventId);
  record.setField("username", levels.get(random.nextInt(levels.size())));
  record.setField("userid", random.nextInt(10000000));
  record.setField("api_version", "1.0");
  record.setField("command", eventId);
  partitionedFanoutWriter.write(record);
}

AppendFiles appendFiles = table.newAppend();
Arrays.stream(partitionedFanoutWriter.dataFiles()).forEach(appendFiles::appendFile);
Snapshot newSnapshot = appendFiles.apply();
appendFiles.commit();
```

### Query

I'd like to filter data by the bucket partition, but no data is being retrieved. I have confirmed that the data exists, and I can retrieve it by filtering on other fields.

```java
// empty result
CloseableIterable<Record> result = IcebergGenerics.read(tbl)
    .where(Expressions.equal("event_id", "9c83f47c-9a07-4a6b-949c-3bedc31852fe"))
    .build();

// empty result
CloseableIterable<Record> result = IcebergGenerics.read(tbl)
    .where(Expressions.equal(Expressions.bucket("event_id", 10), 1))
    .build();

// has result
// Record(9c83f47c-9a07-4a6b-949c-3bedc31852fe, info, 2377306, 1.0, 9c83f47c-9a07-4a6b-949c-3bedc31852fe)
CloseableIterable<Record> result = IcebergGenerics.read(tbl)
    .where(Expressions.equal("command", "9c83f47c-9a07-4a6b-949c-3bedc31852fe"))
    .build();
```

How should I query data by the partition (bucket) field?
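To cross-check which bucket a given `event_id` is expected to land in, the value can be computed by hand. The following is a minimal standalone sketch, assuming the bucket transform described in the Iceberg table spec for strings (32-bit Murmur3 over the UTF-8 bytes with seed 0, sign bit masked off, then modulo the bucket count). The class `BucketSketch` and its methods are hypothetical names for illustration and are not part of the Iceberg API; the result may be compared against the partition value recorded for the data files.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Iceberg: recomputes a string bucket value by hand.
class BucketSketch {

    // Standard Murmur3 x86 32-bit hash over a byte array.
    static int murmur3_32(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h1 = seed;
        int len = data.length;
        int nblocks = len / 4;

        // Body: process 4-byte little-endian blocks.
        for (int i = 0; i < nblocks; i++) {
            int k1 = (data[4 * i] & 0xff)
                | ((data[4 * i + 1] & 0xff) << 8)
                | ((data[4 * i + 2] & 0xff) << 16)
                | ((data[4 * i + 3] & 0xff) << 24);
            k1 *= c1;
            k1 = Integer.rotateLeft(k1, 15);
            k1 *= c2;
            h1 ^= k1;
            h1 = Integer.rotateLeft(h1, 13);
            h1 = h1 * 5 + 0xe6546b64;
        }

        // Tail: remaining 1 to 3 bytes (the fallthrough is intentional).
        int tail = nblocks * 4;
        int k1 = 0;
        switch (len & 3) {
            case 3:
                k1 ^= (data[tail + 2] & 0xff) << 16;
            case 2:
                k1 ^= (data[tail + 1] & 0xff) << 8;
            case 1:
                k1 ^= (data[tail] & 0xff);
                k1 *= c1;
                k1 = Integer.rotateLeft(k1, 15);
                k1 *= c2;
                h1 ^= k1;
        }

        // Finalization mix.
        h1 ^= len;
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }

    // Assumed bucket rule from the spec: (hash & Integer.MAX_VALUE) % numBuckets.
    static int bucket(String value, int numBuckets) {
        int hash = murmur3_32(value.getBytes(StandardCharsets.UTF_8), 0);
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        // The event_id used in the queries above, with the table's 10 buckets.
        String eventId = "9c83f47c-9a07-4a6b-949c-3bedc31852fe";
        System.out.println("expected bucket: " + bucket(eventId, 10));
    }
}
```

If the printed bucket differs from the literal used in the `Expressions.bucket("event_id", 10)` predicate, that predicate would not be expected to match the row.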