lnbest0707-uber opened a new pull request, #14546: URL: https://github.com/apache/pinot/pull/14546
`feature` `bugfix` `backward-incompat`

This PR converges the open-source SchemaConformingTransformerV2 with Uber's internal version. The latter has been running in production for a long time on large-scale data. The convergence removes functionality that we found not useful and adds functionality that we found necessary. A complete user manual will also be released to support broader public usage.

Clean up:
- Shingling in merged text index generation

New functionalities:
- Enhance case-insensitive search by adding extra values to the merged text index, enabled with `optimizeCaseInsensitiveSearch = true`.
- Customize the merged text index to search in either key:value order or value:key order with `reverseTextIndexKeyValueOrder`.
- Customize the document begin anchor, end anchor, and key/value separator with `mergedTextIndexBeginOfDocAnchor`, `mergedTextIndexEndOfDocAnchor`, and `jsonKeyValueSeparator`. This can optimize prefix and suffix matching and avoids confusion when searching for ":".
- Add the ability to skip indexing some special fields with `fieldPathsToSkipStorage`.
- Add the ability to index document keys that contain ".". For example, {"a.b": 1} could previously only be put into json_data even when a dedicated column "a.b" existed, because "a.b" was always translated to {"a": {"b": 1}}. With the config `useAnonymousDotInFieldNames` enabled, both forms end up in the same "a.b" column.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

For additional commands, e-mail: commits-h...@pinot.apache.org
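The dotted-key behavior described for `useAnonymousDotInFieldNames` may be easier to follow with a toy model. The sketch below is not Pinot's implementation (which is in Java) and uses only hypothetical names: `route_fields`, `dedicated_columns`, and the `anonymous_dot` flag are illustrative stand-ins, and the routing rule is an assumption inferred from the description above. It flattens a record into dotted paths and sends each leaf either to a dedicated column or to the leftover `json_data` bucket.

```python
from typing import Any, Dict, Set, Tuple


def route_fields(
    record: Dict[str, Any],
    dedicated_columns: Set[str],
    anonymous_dot: bool,
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Toy model (hypothetical, not Pinot code) of dotted-key routing.

    Returns (columns, json_data): leaves that matched a dedicated column,
    and leaves that fell through to the catch-all bucket.
    """
    columns: Dict[str, Any] = {}
    json_data: Dict[str, Any] = {}

    def walk(node: Dict[str, Any], prefix: str, has_literal_dot: bool) -> None:
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            # Remember whether any key on this path contained a literal ".".
            literal = has_literal_dot or "." in key
            if isinstance(value, dict):
                walk(value, path, literal)
            elif path in dedicated_columns and (anonymous_dot or not literal):
                # Assumed rule: a literal-dot key only matches a dedicated
                # column when the anonymous-dot behavior is enabled.
                columns[path] = value
            else:
                json_data[path] = value

    walk(record, "", False)
    return columns, json_data


# Without the flag, the literal key "a.b" misses the dedicated column.
print(route_fields({"a.b": 1}, {"a.b"}, anonymous_dot=False))
# With the flag, the literal and nested forms land in the same column.
print(route_fields({"a.b": 1}, {"a.b"}, anonymous_dot=True))
print(route_fields({"a": {"b": 1}}, {"a.b"}, anonymous_dot=True))
```

Under these assumptions, the first call leaves the value in `json_data`, while the second and third calls both place it in the "a.b" column.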