Deepak
On Fri, May 11, 2018 at 8:15 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 5/10/2018 2:22 PM, Deepak Goel wrote:
>
>> Are there any benchmarks for this approach? If not, I can give it a spin.
>> Also wondering if there are any alternative approach (i guess lucene
>> stores data in a inverted field format)
>
> Here is the only other query I know of that can find documents missing a
> field:
>
> q=*:* -field:*
>
> The potential problem with this query is that it uses a wildcard. On
> non-point fields with very low cardinality, the performance might be
> similar. But if the field is a Point type, or has a large number of unique
> values, then performance would be a lot worse than the range query I
> mentioned before. The range query is the best general purpose option.

I wonder if giving the field a default value would help. Lucene stores all the document IDs that contain the default value (that is, where the user has not changed it) in a single block (inverted index format), so they could be retrieved much faster.

> The *:* query, despite appearances, does not use wildcards. It is special
> query syntax.
>
> Thanks,
> Shawn
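
To make the default-value idea above concrete, here is a minimal sketch using a toy in-memory inverted index. It is not Lucene's actual data structure or API. The sentinel value `__missing__`, the field name `color`, and all function names are hypothetical, chosen only for illustration. It contrasts finding documents without a field by exclusion (the spirit of `q=*:* -field:*`) with a single term lookup on a default value written at index time.

```python
from collections import defaultdict

# Hypothetical sentinel written at index time when the field is absent.
DEFAULT = "__missing__"

def build_index(docs, field, default=None):
    """Map each value of `field` to the set of document IDs that contain it."""
    postings = defaultdict(set)
    for doc_id, doc in docs.items():
        value = doc.get(field, default)
        if value is not None:
            postings[value].add(doc_id)
    return postings

def missing_by_exclusion(docs, postings):
    """All documents minus the union of every term's postings for the field."""
    have_field = set().union(*postings.values())
    return set(docs) - have_field

def missing_by_default(postings, default=DEFAULT):
    """With a default value indexed, one term lookup finds the same documents."""
    return set(postings.get(default, set()))

# Toy corpus: documents 2 and 4 have no "color" field.
docs = {
    1: {"color": "red"},
    2: {},
    3: {"color": "blue"},
    4: {},
}

plain = build_index(docs, "color")
with_default = build_index(docs, "color", default=DEFAULT)

missing_a = missing_by_exclusion(docs, plain)   # visits every term of the field
missing_b = missing_by_default(with_default)    # visits a single term

# Corresponding query strings, assuming the same illustrative names.
exclusion_query = "*:* -color:*"
default_query = f"color:{DEFAULT}"
```

In this toy model the exclusion approach has to visit every distinct value of the field, while the default-value approach reads one postings set. Whether that translates into a real speedup in Solr would need benchmarking. Note also that with a default value the field is no longer truly missing, so any query or facet on that field would have to account for the sentinel.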