eric-maynard commented on code in PR #273:
URL: https://github.com/apache/polaris/pull/273#discussion_r2017894960
##########
extension/persistence/eclipselink/src/main/java/org/apache/polaris/extension/persistence/impl/eclipselink/PolarisEclipseLinkMetaStoreSessionImpl.java:
##########
@@ -441,27 +445,57 @@ public List<EntityNameLookupRecord>
lookupEntityActiveBatchInCurrentTxn(
entity.getParentId(),
entity.getName(),
entity.getTypeCode(),
- entity.getSubTypeCode()));
+ entity.getSubTypeCode()),
+ pageToken);
}
@Override
- public @Nonnull <T> List<T> listEntitiesInCurrentTxn(
+ public @Nonnull <T> PolarisPage<T> listEntitiesInCurrentTxn(
@Nonnull PolarisCallContext callCtx,
long catalogId,
long parentId,
@Nonnull PolarisEntityType entityType,
- int limit,
@Nonnull Predicate<PolarisBaseEntity> entityFilter,
- @Nonnull Function<PolarisBaseEntity, T> transformer) {
- // full range scan under the parent for that type
- return this.store
- .lookupFullEntitiesActive(localSession.get(), catalogId, parentId,
entityType)
- .stream()
- .map(ModelEntity::toEntity)
- .filter(entityFilter)
- .limit(limit)
- .map(transformer)
- .collect(Collectors.toList());
+ @Nonnull Function<PolarisBaseEntity, T> transformer,
+ @Nonnull PageToken pageToken) {
+ List<T> data;
+ if (entityFilter.equals(Predicates.alwaysTrue())) {
+ // In this case, we can push the filter down into the query
+ data =
+ this.store
+ .lookupFullEntitiesActive(
+ localSession.get(), catalogId, parentId, entityType,
pageToken)
+ .stream()
+ .map(ModelEntity::toEntity)
+ .filter(entityFilter)
+ .map(transformer)
+ .collect(Collectors.toList());
+ } else {
+ // In this case, we cannot push the filter down into the query. We must
therefore remove
+ // the page size limit from the PageToken and filter on the client side.
+ // TODO Implement a generic predicate that can be pushed down into
different metastores
+ PageToken unlimitedPageSizeToken =
pageToken.withPageSize(Integer.MAX_VALUE);
+ List<ModelEntity> rawData =
+ this.store.lookupFullEntitiesActive(
+ localSession.get(), catalogId, parentId, entityType,
unlimitedPageSizeToken);
Review Comment:
Luckily, we do not use this filter very often at all... but yes, this
situation is less than ideal.
If the metastore itself provided a lazily-fetched `Iterator` here instead of
a `List` (presumably with its own pagination/cache built into the client), we
should definitely prefer to use that. But I think the right place for that
logic is in the metastore itself and not at this layer, so I kept with the
simplest approach. I think we need a solution for predicate push down in place
eventually anyway.
There is actually a third option as well, which is that we just return more
results than the client asked for. This is what we currently do actually, since
we don't respect the page token.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]