Will do, thx.

On 04.12.2017 at 9:27 PM, "Emir Arnautović" <emir.arnauto...@sematext.com> wrote:
> Hi Faraz,
> When you say query without sort, I assume that you mean you omit sort, so
> you expect it to be sorted by score. It is expected to be slower than an
> equivalent query that does not calculate score - e.g. the same query run as fq.
> What you observe can be explained by:
> * Solr is calculating score even when not sorting by score and not returning
>   it (do you return score? Plus I am not sure about this - did not check the
>   code)
> * The field that you are using for sorting does not have doc values, so it
>   has to be uninverted
> * The field that you are using for sorting is not in the OS cache, so it is
>   read from disk.
>
> Try comparing the same query running as q=... and fq=... Make sure that your
> filter cache is disabled if you are repeating the same queries and
> averaging.
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
> > On 4 Dec 2017, at 14:54, Faraz Fallahi <faraz.fall...@googlemail.com> wrote:
> >
> > Hi guys,
> >
> > Sorry to bother you again, but I am really confused:
> >
> > I've used the Solr admin website and created a query with lots of ORs
> > using Solr 4.7.
> >
> > When I execute the query without a sort, it executes in roughly 3.5 - 4
> > seconds.
> > When I execute it with a sort on a field called pubdate, it takes about
> > 4 - 4.5 seconds.
> > When I execute it with a sort on the guid field, it takes about 7 - 8
> > seconds !!!
> >
> > After your explanations I was expecting the query without a sort to be
> > the slowest. What am I missing here?
> >
> > Best regards
> > Faraz
> >
> > On 30.11.2017 at 09:29, "Faraz Fallahi" <faraz.fall...@googlemail.com> wrote:
> >
> >> Uff... I see... thx for the explanation :)
> >>
> >> On 30.11.2017 at 3:13 PM, "Emir Arnautović" <emir.arnauto...@sematext.com> wrote:
> >>
> >>> Hi Faraz,
> >>> It is a bit worse than that - it also needs to calculate score, so for
> >>> each matching doc of one query part it has to check if it appears in
> >>> the results of the other query parts. If you use term query parser, you
> >>> avoid calculating score - all docs will have score 1.
> >>> Solr is based on Lucene, which is mainly an inverted index:
> >>> https://en.wikipedia.org/wiki/Inverted_index
> >>> so knowing that helps to understand how expensive some queries are. It
> >>> is relatively easy to figure out what steps are needed for different
> >>> query types. Of course, Lucene includes a lot of smartness, and it is
> >>> probably not using the naive approach, but it cannot avoid the
> >>> limitations of an inverted index.
> >>>
> >>> HTH,
> >>> Emir
> >>>
> >>>> On 30 Nov 2017, at 02:39, Faraz Fallahi <faraz.fall...@googlemail.com> wrote:
> >>>>
> >>>> Hi Toke,
> >>>>
> >>>> Just to be clear and to understand: does this mean that a query of the
> >>>> form
> >>>> author:name1 OR author:name2 OR author:name3
> >>>>
> >>>> is being processed like e.g.
> >>>>
> >>>> 1 query against the index with author:name1 getting 4 results
> >>>> Then 1 query against the index with author:name2 getting 3 results
> >>>> Then 1 query against the index with author:name3 getting 1 result
> >>>>
> >>>> And in the end all results are merged and I get a result of 8?
> >>>>
> >>>> So a query with a thousand authors will be split into a thousand single
> >>>> queries against the index?
> >>>>
> >>>> Do I understand this correctly?
> >>>>
> >>>> Thx for the help
> >>>> Faraz
> >>>>
> >>>> On 28.11.2017 at 15:39, "Toke Eskildsen" <t...@kb.dk> wrote:
> >>>>
> >>>> On Tue, 2017-11-28 at 11:07 +0100, Faraz Fallahi wrote:
> >>>>> I have a question regarding Solr queries.
> >>>>> My query basically contains thousands of OR conditions for authors
> >>>>> (author:name1 OR author:name2 OR author:name3 OR author:name4 ...)
> >>>>> The execution time on my index is huge (around 15 sec). When I tag
> >>>>> all the associated documents with a custom field and value like
> >>>>> authorlist:1 and then change my query to just search for
> >>>>> authorlist:1, it executes in 78 ms. How come there is such a big
> >>>>> difference in exec-time?
> >>>>
> >>>> Due to the nature of inverted indexes (which lie at the heart of
> >>>> Solr), your thousands of OR-queries mean thousands of lookups, whereas
> >>>> your authorlist means a single lookup. Adding to this, the results for
> >>>> each author need to be merged with the other author results - for
> >>>> authorlist the results are there directly.
> >>>>
> >>>> If your author lists are static, indexing them as you did in your test
> >>>> is the best solution.
> >>>>
> >>>> If they are not static, using a filter-query will ensure that they are
> >>>> at least cached subsequently, so that only the first call will be
> >>>> slow.
> >>>>
> >>>> If they are semi-static and there are not too many of them, you could
> >>>> do warm-up filter-queries for all the different groups so that the
> >>>> users do not pay the first-call penalty. This requires your
> >>>> filter-cache to be large enough to hold all the author lists.
> >>>>
> >>>> - Toke Eskildsen, Royal Danish Library
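
For anyone reading this thread later: the difference Toke describes between many OR clauses and one pre-tagged term can be sketched with a toy inverted index. This is only a minimal illustration in plain Python, not how Lucene or Solr is actually implemented. The helper names (build_index, or_query, single_term_query) and the sample documents are made up for the example, and scoring is left out entirely.

```python
from collections import defaultdict


def build_index(docs):
    """Toy inverted index: map each (field, value) term to the doc ids containing it."""
    postings = defaultdict(list)
    for doc_id, fields in enumerate(docs):
        for field, value in fields.items():
            postings[(field, value)].append(doc_id)
    return postings


def or_query(postings, field, values):
    """One postings lookup per OR clause, then a merge (union) of all hits."""
    lookups = 0
    merged = set()
    for value in values:
        lookups += 1
        merged.update(postings.get((field, value), []))
    return sorted(merged), lookups


def single_term_query(postings, field, value):
    """A single lookup; the postings list already is the result."""
    return list(postings.get((field, value), [])), 1


# Hypothetical sample data: docs by the wanted authors carry authorlist=1.
docs = [
    {"author": "name1", "authorlist": "1"},
    {"author": "name2", "authorlist": "1"},
    {"author": "name3", "authorlist": "1"},
    {"author": "other"},
    {"author": "name1", "authorlist": "1"},
]
index = build_index(docs)

or_hits, or_lookups = or_query(index, "author", ["name1", "name2", "name3"])
tag_hits, tag_lookups = single_term_query(index, "authorlist", "1")

print(or_hits, or_lookups)    # [0, 1, 2, 4] 3
print(tag_hits, tag_lookups)  # [0, 1, 2, 4] 1
```

Both queries return the same documents, but the OR form needs one lookup per author plus the merge step, so its cost grows with the number of clauses, while the tagged form stays at one lookup. With thousands of authors and score calculation on top, that gap is at least consistent with the 15 sec versus 78 ms reported above.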