stevenzwu commented on code in PR #11513:
URL: https://github.com/apache/iceberg/pull/11513#discussion_r1934818356

##########
core/src/main/java/org/apache/iceberg/actions/RewriteFilePlan.java:
##########
@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.actions;
+
+import java.util.Map;
+import java.util.stream.Stream;
+import org.apache.iceberg.DataFile;
+import org.apache.iceberg.FileScanTask;
+import org.apache.iceberg.StructLike;
+
+/** Result of the data file rewrite planning. */
+public class RewriteFilePlan

Review Comment:
   Similarly, I found it confusing to have both `RewriteFilePlan` and `FileRewritePlan`. We need names that distinguish the two classes by more than word order. Maybe `FileRewritePlan` could be called `RewriteFilePlanBase`?



##########
core/src/main/java/org/apache/iceberg/actions/RewritePositionDeletesGroupPlanner.java:
##########
@@ -0,0 +1,241 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.actions;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.stream.Stream;
+import org.apache.iceberg.DeleteFile;
+import org.apache.iceberg.MetadataTableType;
+import org.apache.iceberg.MetadataTableUtils;
+import org.apache.iceberg.Partitioning;
+import org.apache.iceberg.PositionDeletesScanTask;
+import org.apache.iceberg.PositionDeletesTable;
+import org.apache.iceberg.RewriteJobOrder;
+import org.apache.iceberg.StructLike;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.TableProperties;
+import org.apache.iceberg.actions.RewritePositionDeleteFiles.FileGroupInfo;
+import org.apache.iceberg.expressions.Expression;
+import org.apache.iceberg.expressions.Expressions;
+import org.apache.iceberg.io.CloseableIterable;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableSet;
+import org.apache.iceberg.relocated.com.google.common.collect.Iterables;
+import org.apache.iceberg.relocated.com.google.common.collect.Lists;
+import org.apache.iceberg.relocated.com.google.common.collect.Maps;
+import org.apache.iceberg.types.Types;
+import org.apache.iceberg.util.PartitionUtil;
+import org.apache.iceberg.util.PropertyUtil;
+import org.apache.iceberg.util.StructLikeMap;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Groups specified files in the {@link Table} by {@link RewriteFileGroup}s. These will be grouped
+ * by partitions. Extends the {@link SizeBasedFileRewritePlanner} with {@link
+ * RewritePositionDeleteFiles#REWRITE_JOB_ORDER} handling.
+ */
+public class RewritePositionDeletesGroupPlanner

Review Comment:
   `RewritePositionDeletesPlanner` is probably a more accurate name?



##########
core/src/main/java/org/apache/iceberg/actions/RewriteFileGroup.java:
##########
@@ -31,26 +31,26 @@ import org.apache.iceberg.util.DataFileSet;
 
 /**
- * Container class representing a set of files to be rewritten by a RewriteAction and the new files
- * which have been written by the action.
+ * Container class representing a set of data files to be rewritten by a RewriteAction and the new
+ * files which have been written by the action.
  */
-public class RewriteFileGroup {
-  private final FileGroupInfo info;
-  private final List<FileScanTask> fileScanTasks;
-
+public class RewriteFileGroup extends FileRewriteGroup<FileGroupInfo, FileScanTask, DataFile> {

Review Comment:
   It is confusing to have both `RewriteFileGroup` and `FileRewriteGroup`, with the former extending the latter. Judging by the names alone, they look identical.



##########
core/src/main/java/org/apache/iceberg/actions/FileRewriteExecutor.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.actions;
+
+import java.util.Map;
+import java.util.Set;
+import org.apache.iceberg.ContentFile;
+import org.apache.iceberg.ContentScanTask;
+
+/**
+ * A class for rewriting content file groups ({@link FileRewriteGroup}).
+ *
+ * @param <I> the Java type of the plan info
+ * @param <T> the Java type of the tasks to read content files
+ * @param <F> the Java type of the content files
+ * @param <G> the Java type of the planned groups
+ * @param <P> the Java type of the plan to execute
+ */
+public interface FileRewriteExecutor<
+    I,
+    T extends ContentScanTask<F>,
+    F extends ContentFile<F>,
+    G extends FileRewriteGroup<I, T, F>,
+    P extends FileRewritePlan<I, T, F, G>> {
+
+  /** Returns a description for this rewriter. */
+  default String description() {
+    return getClass().getName();
+  }
+
+  /**
+   * Returns a set of supported options for this rewriter. Only options specified in this list will
+   * be accepted at runtime. Any other options will be rejected.
+   */
+  Set<String> validOptions();
+
+  /**
+   * Initializes this rewriter using provided options.
+   *
+   * @param options options to initialize this rewriter
+   */
+  void init(Map<String, String> options);

Review Comment:
   I also found this confusing. Can `init` and `initPlan` be combined into one method? Alternatively, can the plan be passed in via the `rewrite` method?
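
For illustration, the two alternatives raised in the last comment could look roughly like the sketch below. This is only a sketch under assumptions: the signature of `initPlan` is not shown in this message, and `Plan`, `CombinedInitExecutor`, `PlanPerCallExecutor`, and `SimpleExecutor` are hypothetical stand-ins, not the Iceberg types from the PR. Groups are modelled as plain strings to keep the example self-contained.

```java
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; none of these types are the Iceberg classes from the PR. */
public class RewriteExecutorSketch {

  /** Hypothetical stand-in for a rewrite plan. */
  static final class Plan {
    private final long maxFileSize;

    Plan(long maxFileSize) {
      this.maxFileSize = maxFileSize;
    }

    long maxFileSize() {
      return maxFileSize;
    }
  }

  /** Alternative 1 (hypothetical): one init method receives both the options and the plan. */
  interface CombinedInitExecutor {
    void init(Map<String, String> options, Plan plan);

    List<String> rewrite(String group);
  }

  /** Alternative 2 (hypothetical): init takes only options; the plan travels with each call. */
  interface PlanPerCallExecutor {
    void init(Map<String, String> options);

    List<String> rewrite(Plan plan, String group);
  }

  /** Toy implementation of alternative 2 that only builds an output file name. */
  static final class SimpleExecutor implements PlanPerCallExecutor {
    private String prefix = "rewritten";

    @Override
    public void init(Map<String, String> options) {
      // Options configure the executor once, independently of any plan.
      this.prefix = options.getOrDefault("prefix", "rewritten");
    }

    @Override
    public List<String> rewrite(Plan plan, String group) {
      // Plan-level settings are read from the argument, so no plan state is stored.
      return List.of(prefix + "-" + group + "-max" + plan.maxFileSize());
    }
  }

  public static void main(String[] args) {
    SimpleExecutor executor = new SimpleExecutor();
    executor.init(Map.of("prefix", "out"));
    System.out.println(executor.rewrite(new Plan(512L), "group-1"));
  }
}
```

With the second shape the executor keeps no plan state between calls, so one initialized instance could in principle serve several plans. The first shape keeps `rewrite` narrower but ties each instance to a single plan.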
