I actually thought seriously about whether to mention wildcard vs. range, but... it annoys me that the Lucene and query parser folks won't fix either PrefixQuery or the query parsers to do the right/optimal thing for a single-asterisk query. I wrote up a Jira for it years ago, but for whatever reason the difficulty persists. At one point one of the Lucene guys told me that there was a filter query that could do both * and -* very efficiently, but later that was disputed, not to mention that the filter query in question is now gone. In any case, with the newer AutomatonQuery the single-asterisk PrefixQuery case should always perform at least semi-reasonably no matter what, especially since it is now a constant-score query, which it wasn't many years ago.

Whether [* TO *] is actually a lot more (or less) efficient than PrefixQuery for an empty prefix these days is... unknown to me, but I won't give anybody grief for using it as a way of compensating for the brain-damaged way that Lucene and Solr handle single-asterisk and negated single-asterisk queries.

-- Jack Krupansky

On Tue, Feb 16, 2016 at 8:17 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 2/15/2016 9:22 AM, Jack Krupansky wrote:
> > I should also have noted that your full query:
> >
> > (-persons:*)AND(-places:*)AND(-orgs:*)
> >
> > can be written as:
> >
> > -persons:* -places:* -orgs:*
> >
> > Which may work as is, or can also be written as:
> >
> > *:* -persons:* -places:* -orgs:*
>
> Salman,
>
> One fact of Lucene operation is that purely negative queries do not
> work. A negative query clause is like a subtraction. If you make a
> query that only says "subtract these values", then you aren't going to
> get anything, because you did not start with anything.
>
> Adding the "*:*" clause at the beginning of the query says "start with
> everything."
>
> You might ask why a query of -field:value works, when I just said that
> it *won't* work. This is because Solr has detected the problem and
> fixed it. When the query is very simple (a single negated clause), Solr
> is able to detect the unworkable situation and implicitly add the "*:*"
> starting point, producing the expected results. With more complex
> queries, like the one you are trying, this detection fails, and the
> query is executed as-is.
>
> Jack is an awesome member of this community. I do not want to disparage
> him at all when I tell you that the rewritten query he provided will
> work, but is not optimal. It can be optimized as the following:
>
> *:* -persons:[* TO *] -places:[* TO *] -orgs:[* TO *]
>
> A query clause of the format "field:*" is a wildcard query. Behind the
> scenes, Solr will interpret this as "all possible values for field" --
> which sounds like it would be exactly what you're looking for, except
> that if there are ten million possible values in the field you're
> searching, the constructed Lucene query will quite literally include all
> ten million values. Wildcard queries tend to use a lot of memory and
> run slowly.
>
> The [* TO *] syntax is an all-inclusive range query, which will usually
> be much faster than a wildcard query.
>
> Thanks,
> Shawn
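
For anyone assembling these queries in client code, here is a minimal sketch of the pattern discussed in this thread: start from "*:*" and subtract one negated clause per field. The function name missing_fields_query and the use_range flag are hypothetical, made up for illustration; the code only builds the query string and does not talk to Solr, so it says nothing about which form is actually faster.

```python
def missing_fields_query(fields, use_range=True):
    """Build a Solr query string for documents with no value in any of `fields`.

    Hypothetical helper for illustration only. The leading "*:*" supplies the
    "start with everything" clause, so the query is never purely negative.
    """
    if not fields:
        raise ValueError("at least one field name is required")

    # "[* TO *]" is the all-inclusive range form; "*" is the single-asterisk form.
    any_value = "[* TO *]" if use_range else "*"

    # One subtraction per field, applied to the full document set.
    negated = [f"-{name}:{any_value}" for name in fields]
    return " ".join(["*:*"] + negated)


if __name__ == "__main__":
    print(missing_fields_query(["persons", "places", "orgs"]))
    print(missing_fields_query(["persons", "places", "orgs"], use_range=False))
```

Always emitting the "*:*" clause means the result does not depend on Solr detecting and fixing a purely negative query, which per the quoted message only happens reliably for a single negated clause.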