keith-turner commented on issue #5254: URL: https://github.com/apache/accumulo/issues/5254#issuecomment-2651539056
> @keith-turner - I was thinking that colsToRead was being used because I updated [this](https://github.com/apache/accumulo/blob/ec779446cfd926ae10bb0a4362f1ed65ab76925d/server/base/src/main/java/org/apache/accumulo/server/metadata/iterators/TabletMetadataCheckIterator.java#L105) line to pass it to TabletMetadata.convertRow() but looking again all it does is set the columns which were fetched and doesn't do any filtering.

Yeah, that does not do any filtering. The info passed to convertRow is only used to validate that data requested from the TabletMetadata object was actually fetched by the scan.

> What is the best way to filter in this case? Should we create another iterator to wrap the source to skip columns or make a change to TabletMetadata.convertRow() to skip columns not specified as part of colsToRead etc?

We could add something like the following.

```java
class TabletMetadata {
  enum ColumnType {
    public static Set<ByteSequence> resolveFamilies(Set<ColumnType> columns) {
      // TODO build a set of the families used by these column types
    }
  }
}
```

Then TabletMetadataCheckIterator could make the following change so that the source iterator filters on families. This would not filter on qualifiers, but I think that is fine; filtering on qualifiers as well would need an additional iterator. Filtering only on families is still correct, narrows the data read, and lets locality groups kick in.

```java
if (colsToRead.equals(TabletMetadataCheck.ALL_COLUMNS)) {
  // want all columns so no need to filter on families
  source.seek(new Range(tabletRow), Set.of(), false);
} else {
  // only read the families backing the requested column types
  Set<ByteSequence> families = TabletMetadata.ColumnType.resolveFamilies(colsToRead);
  source.seek(new Range(tabletRow), families, true);
}
```
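
As a rough illustration of the `resolveFamilies` TODO, here is a minimal, self-contained sketch. It is not the Accumulo implementation: the class `FamilyResolverSketch`, the constants `TYPE_A`, `TYPE_B`, `TYPE_C`, and the family names `fam1` and `fam2` are all hypothetical, and plain `String` stands in for `ByteSequence` so the example runs without Accumulo on the classpath. The real method would map each actual `ColumnType` to the family it is stored under.

```java
import java.util.EnumSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for TabletMetadata; nothing here is real Accumulo code.
class FamilyResolverSketch {

  enum ColumnType {
    // Made-up constants. Two types share "fam1" to show why the result is a set:
    // several column types can live in the same column family.
    TYPE_A("fam1"),
    TYPE_B("fam1"),
    TYPE_C("fam2");

    private final String family;

    ColumnType(String family) {
      this.family = family;
    }

    // Collect the distinct families backing the requested column types.
    // The real version would presumably return Set<ByteSequence> instead.
    static Set<String> resolveFamilies(Set<ColumnType> columns) {
      Set<String> families = new LinkedHashSet<>();
      for (ColumnType column : columns) {
        families.add(column.family);
      }
      return families;
    }
  }

  public static void main(String[] args) {
    // Both requested types map to the same family, so one family is returned.
    System.out.println(ColumnType.resolveFamilies(EnumSet.of(ColumnType.TYPE_A, ColumnType.TYPE_B)));
    // Types from different families yield both families.
    System.out.println(ColumnType.resolveFamilies(EnumSet.of(ColumnType.TYPE_A, ColumnType.TYPE_C)));
  }
}
```

Because the result is only a set of families, passing it to `seek` with the inclusive flag set to true narrows the scan to those families but can still return qualifiers that were not requested, which matches the point above that family-only filtering is correct but coarse.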
