yupeng9 commented on pull request #6845: URL: https://github.com/apache/incubator-pinot/pull/6845#issuecomment-830411219
> > @mcvsubbu @yupeng9, unlimited recursion (however deep the hierarchy is) should be allowed, since that was one of the things we had discussed in the design doc. If the user knows what they are doing (storage explosion) and chooses to flatten everything outright, then they get tabular data with all leaves as individual columns, thus allowing all SQL to work on nested data. No need to use the JSON index in such cases.

> > It is ok to have a limit (as eventually we might run out of stack memory and the server might crash). The user should go back and update the flatten config to maybe choose a different sub-tree that needs to be flattened, or specify the depth of hierarchy up to which we should flatten.

> > In any case, our algorithm should be robust enough to handle any arbitrary level of flattening, with the depth being configurable.

> > If we have unlimited and we crash on one record in a realtime ingested stream, there is no way we can move forward. It is better to specify a limit in config and default it to unlimited. In case of problems, we can set the limit and move forward.

@mcvsubbu as specified in the design doc, the unnesting is opt-in: users must provide the list of collections that they want to unnest. Such unnesting will usually change cardinality and can create the issues you describe. However, map flattening does not change cardinality, so it's hard for me to understand why such a level limit is needed.
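To make the distinction in this thread concrete, here is a minimal sketch in Python. It is not Pinot's implementation. The names `flatten_map`, `unnest`, and `max_depth`, and the `.` delimiter, are hypothetical and chosen only for illustration. It assumes records are plain dicts, shows that map flattening keeps one row per record while unnesting a collection emits one row per element, and uses an explicit stack so that deep nesting cannot exhaust the call stack even when no depth limit is set.

```python
from typing import Any, Dict, List, Optional


def flatten_map(record: Dict[str, Any], max_depth: Optional[int] = None,
                delimiter: str = ".") -> Dict[str, Any]:
    """Flatten nested dicts into a single-level dict (hypothetical helper).

    max_depth=None means unlimited. Otherwise only max_depth levels of
    nesting are flattened and deeper maps are kept whole under their key.
    """
    flat: Dict[str, Any] = {}
    # Explicit stack instead of recursion: deep input cannot overflow the call stack.
    stack = [("", record, 0)]
    while stack:
        prefix, node, depth = stack.pop()
        for key, value in node.items():
            name = f"{prefix}{delimiter}{key}" if prefix else key
            can_descend = max_depth is None or depth < max_depth
            if isinstance(value, dict) and can_descend:
                stack.append((name, value, depth + 1))
            else:
                flat[name] = value
    return flat


def unnest(record: Dict[str, Any], field: str) -> List[Dict[str, Any]]:
    """Emit one output row per element of the collection stored in `field`."""
    rows = []
    for element in record[field]:
        row = {k: v for k, v in record.items() if k != field}
        row[field] = element
        rows.append(row)
    return rows


record = {"a": {"b": {"c": 1}}, "tags": ["x", "y"]}
fully_flat = flatten_map(record)             # still a single row
one_level = flatten_map(record, max_depth=1) # "a.b" keeps its nested map
rows = unnest(record, "tags")                # two rows, one per tag
```

In this sketch, flattening only adds columns, so any limit on it bounds column count and traversal depth rather than row count. Unnesting is the operation that multiplies rows.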