chenboat commented on PR #12680: URL: https://github.com/apache/pinot/pull/12680#issuecomment-2027833690
> What do you think about enabling this by default/not hiding this feature behind a config? It seems that we should be able to infer whether a query is valid single term lucene syntax/phrase search or requires a span query
>
> e.g. for StandardAnalyzer parsed text
> `'*istributed'` - valid lucene, unchanged behavior
> `'distribute*'` - valid lucene, unchanged behavior
> `'"Distributed systems"'` - valid lucene, unchanged behavior
> `'*istributed systems*'` - not valid lucene, modify it to be a span query
> `'/.*istributed systems.*/'` - valid lucene, incompatible with StandardAnalyzer, unchanged behavior

For now I think it is better to hide this feature behind a config. A few reasons:

1. `'*istributed systems*'` is still a valid, parsable boolean query today, without this PR. It is just not clear what the user's real intent is. We should probably leave it this way to preserve the current behavior. Only when a table owner explicitly sets the config flag will this query pattern be treated as a phrase query with wildcard matching.
2. The feature is still costly when enabled. It goes beyond most patterns recommended by Lucene (e.g., no leading `*`), so it is better kept as an opt-in only feature.
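
As a rough illustration of the gating described in point 1, here is a minimal Java sketch. It is not the code in this PR and does not use Lucene or Pinot APIs: the class name, the `Kind` enum, and the `flagEnabled` parameter are all hypothetical, and the classification is a plain string heuristic based only on the five example queries quoted above.

```java
// Hypothetical sketch: decide whether a text-match query should be rewritten
// as a wildcard phrase query. Names here are illustrative, not Pinot's API.
final class WildcardPhraseGateSketch {

    enum Kind { REGEX, QUOTED_PHRASE, SINGLE_TERM, MULTI_TERM_WILDCARD, MULTI_TERM_PLAIN }

    // Classify a raw query string using simple string checks only.
    // Limitation: boolean operators (AND, OR) are not recognized, so
    // "foo AND bar*" is also reported as MULTI_TERM_WILDCARD.
    static Kind classify(String query) {
        String q = query.trim();
        if (q.length() >= 2 && q.startsWith("/") && q.endsWith("/")) {
            return Kind.REGEX;            // e.g. /.*istributed systems.*/
        }
        if (q.length() >= 2 && q.startsWith("\"") && q.endsWith("\"")) {
            return Kind.QUOTED_PHRASE;    // e.g. "Distributed systems"
        }
        if (q.split("\\s+").length == 1) {
            return Kind.SINGLE_TERM;      // e.g. *istributed, distribute*
        }
        boolean hasWildcard = q.indexOf('*') >= 0 || q.indexOf('?') >= 0;
        return hasWildcard ? Kind.MULTI_TERM_WILDCARD : Kind.MULTI_TERM_PLAIN;
    }

    // Only rewrite when the table owner has opted in (hypothetical flag).
    // With the flag off, every query keeps its existing parsing behavior.
    static boolean treatAsWildcardPhrase(String query, boolean flagEnabled) {
        return flagEnabled && classify(query) == Kind.MULTI_TERM_WILDCARD;
    }
}
```

With the flag off, `treatAsWildcardPhrase("*istributed systems*", false)` is false, so the query would still be parsed as a boolean query as it is today; only with the flag on would it be treated as a phrase with wildcard matching.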