Raised https://issues.apache.org/jira/browse/SOLR-13808. Thanks, Jochen!
On Mon, Sep 30, 2019 at 4:26 PM Mikhail Khludnev <m...@apache.org> wrote:

> Jochen, right! Sorry I didn't get your point earlier. {!bool filter=}
> means a Lucene filter, not Solr's. I suppose a {!bool cache=true} flag could
> easily be added, but so far there is no laconic syntax for it. Don't
> hesitate to raise a JIRA for it.
>
> On Mon, Sep 30, 2019 at 3:18 PM Jochen Barth <ba...@ub.uni-heidelberg.de>
> wrote:
>
>> Here is the corrected equivalent query, which gives the same results as
>> the JsonQueryDSL version (and is still much faster):
>>
>> +filter(+((_query_:"{!graph from=parent_ids to=id }(meta_title_txt:muller
>> meta_name_txt:muller meta_subject_txt:muller meta_shelflocator_txt:muller)"
>> _query_:"{!graph from=id to=parent_ids traversalFilter=\"class_s:meta
>> -type_s:multivolume_work -type_s:periodical -type_s:issue
>> -type_s:journal\"}(meta_title_txt:muller meta_name_txt:muller
>> text_ocr_ft:muller text_heidicon_ft:muller text_watermark_ft:muller
>> text_catalogue_ft:muller text_index_ft:muller text_tei_ft:muller
>> text_abstract_ft:muller text_pdf_ft:muller)") ) +class_s:meta )
>> -_query_:"{!join to=id from=parent_ids}(filter(+((_query_:\"{!graph
>> from=parent_ids to=id }(meta_title_txt:muller meta_name_txt:muller
>> meta_subject_txt:muller meta_shelflocator_txt:muller)\" _query_:\"{!graph
>> from=id to=parent_ids traversalFilter=\\\"class_s:meta
>> -type_s:multivolume_work -type_s:periodical -type_s:issue
>> -type_s:journal\\\"}(meta_title_txt:muller meta_name_txt:muller
>> text_ocr_ft:muller text_heidicon_ft:muller text_watermark_ft:muller
>> text_catalogue_ft:muller text_index_ft:muller text_tei_ft:muller
>> text_abstract_ft:muller text_pdf_ft:muller)\") ) +class_s:meta ))"
>>
>> I first run the "core" of the above query (the string before
>> »-_query_:"{!join«) for faceting;
>> then the next query is the one above [ like »+(a) -{!join...}(a)« ].
>>
>> Now the second query runs in much less time because the result of
>> term "a" is cached.
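The nesting used in the corrected query above can be sketched in Python: each time a query string is embedded in _query_:"...", backslashes and double quotes gain one more level of escaping. This is a minimal sketch, not Solr code; the helper names as_subquery and exclude_joined are hypothetical, and the shortened core query is only for illustration.

```python
def as_subquery(raw: str) -> str:
    # Escape backslashes first, then double quotes, so that text which is
    # already escaped gains exactly one more level of escaping.
    escaped = raw.replace("\\", "\\\\").replace('"', '\\"')
    return '_query_:"' + escaped + '"'


def exclude_joined(core: str) -> str:
    # Shape from the thread: +filter(a) -{!join ...}(filter(a)), so the
    # same filter(a) clause appears in both places.
    cached = "filter(" + core + ")"
    join = as_subquery("{!join to=id from=parent_ids}(" + cached + ")")
    return "+" + cached + " -" + join


# Shortened, illustrative core query (not the full one from the thread).
graph = as_subquery(
    '{!graph from=id to=parent_ids traversalFilter="class_s:meta -type_s:journal"}'
    "(meta_title_txt:muller)"
)
core = "+(" + graph + ") +class_s:meta"
query = exclude_joined(core)
print(query)
```

Building both occurrences of filter(a) from the same string keeps them textually identical, which is the property the second query relies on for reusing the cached result.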
>>
>> Caching seems not to work with {boolean=>{must=>"*:*", filter=>...}}.
>>
>> Kind regards,
>> Jochen
>>
>> On 30.09.19 at 11:02, Jochen Barth wrote:
>>
>> Oops... JSON is returning 48652 docs, StandardQueryParser 827...
>>
>> Must check this.
>>
>> Sorry,
>>
>> Jochen
>>
>> On 30.09.19 at 10:39, Jochen Barth wrote:
>>
>> The *:* appears twice in the JsonQueryDSL version because »filter(...)«
>> occurs twice in the StandardQueryParser version.
>>
>> I added some System.out.println calls to FastLRU, LRU and LFUCache;
>> here is the logging with JsonQueryDSL (Solr 8.1.1):
>>
>> Fast-get +*:* #(+(([[meta_title_txt:muller meta_name_txt:muller
>> meta_subject_txt:muller
>> meta_shelflocator_txt:muller],parent_ids=id][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false]
>> [[meta_title_txt:muller meta_name_txt:muller text_ocr_ft:muller
>> text_heidicon_ft:muller text_watermark_ft:muller text_catalogue_ft:muller
>> text_index_ft:muller text_tei_ft:muller text_abstract_ft:muller
>> text_pdf_ft:muller],id=parent_ids] [TraversalFilter: class_s:meta
>> -type_s:multivolume_work -type_s:periodical -type_s:issue
>> -type_s:journal][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false]))
>> +class_s:meta) valLen=null
>>
>> Fast-get DocValuesFieldExistsQuery [field=id] valLen=38
>>
>> Fast-get DocValuesFieldExistsQuery [field=parent_ids] valLen=38
>>
>> Fast-put +*:* #(+(([[meta_title_txt:muller meta_name_txt:muller
>> meta_subject_txt:muller
>> meta_shelflocator_txt:muller],parent_ids=id][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false]
>> [[meta_title_txt:muller meta_name_txt:muller text_ocr_ft:muller
>> text_heidicon_ft:muller text_watermark_ft:muller text_catalogue_ft:muller
>> text_index_ft:muller text_tei_ft:muller text_abstract_ft:muller
>> text_pdf_ft:muller],id=parent_ids] [TraversalFilter: class_s:meta
>> -type_s:multivolume_work -type_s:periodical -type_s:issue
>> -type_s:journal][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false]))
>> +class_s:meta)
>>
>> ...
>>
>> Fast(LRUCache)-get is called only once, but it should have been called
>> twice: the first time to find out that this filter is not already cached,
>> and the second time for the identical part of the subquery.
>>
>> So now analyzing cache access with StandardQueryParser:
>> Fast-get +(+[[meta_title_txt:muller meta_name_txt:muller
>> meta_subject_txt:muller
>> meta_shelflocator_txt:muller],parent_ids=id][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false]
>> +[[meta_title_txt:muller meta_name_txt:muller
>> text_ocr_ft:muller text_heidicon_ft:muller
>> text_watermark_ft:muller text_catalogue_ft:muller text_index_ft:muller
>> text_tei_ft:muller text_abstract_ft:muller
>> text_pdf_ft:muller],id=parent_ids] [TraversalFilter: class_s:meta
>> -type_s:multivolume_work -type_s:periodical -type_s:issue
>> -type_s:journal][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false])
>> +class_s:meta valLen=null
>> Fast-get DocValuesFieldExistsQuery [field=id] valLen=null
>> Fast-put DocValuesFieldExistsQuery [field=id]
>> Fast-get DocValuesFieldExistsQuery [field=parent_ids] valLen=null
>> Fast-put DocValuesFieldExistsQuery [field=parent_ids]
>> Fast-put +(+[[meta_title_txt:muller meta_name_txt:muller
>> meta_subject_txt:muller
>> meta_shelflocator_txt:muller],parent_ids=id][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false]
>> +[[meta_title_txt:muller meta_name_txt:muller
>> text_ocr_ft:muller text_heidicon_ft:muller
>> text_watermark_ft:muller text_catalogue_ft:muller text_index_ft:muller
>> text_tei_ft:muller text_abstract_ft:muller
>> text_pdf_ft:muller],id=parent_ids] [TraversalFilter: class_s:meta
>> -type_s:multivolume_work -type_s:periodical -type_s:issue
>> -type_s:journal][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false])
>> +class_s:meta
>> Fast-get +filter(+(+(+[[meta_title_txt:muller
>> meta_name_txt:muller
>> meta_subject_txt:muller
>> meta_shelflocator_txt:muller],parent_ids=id][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false]
>> +[[meta_title_txt:muller meta_name_txt:muller
>> text_ocr_ft:muller text_heidicon_ft:muller
>> text_watermark_ft:muller text_catalogue_ft:muller text_index_ft:muller
>> text_tei_ft:muller text_abstract_ft:muller
>> text_pdf_ft:muller],id=parent_ids] [TraversalFilter: class_s:meta
>> -type_s:multivolume_work -type_s:periodical -type_s:issue
>> -type_s:journal][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false]))
>> +class_s:meta) valLen=null
>> Fast-get +(+[[meta_title_txt:muller meta_name_txt:muller
>> meta_subject_txt:muller
>> meta_shelflocator_txt:muller],parent_ids=id][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false]
>> +[[meta_title_txt:muller meta_name_txt:muller
>> text_ocr_ft:muller text_heidicon_ft:muller
>> text_watermark_ft:muller text_catalogue_ft:muller text_index_ft:muller
>> text_tei_ft:muller text_abstract_ft:muller
>> text_pdf_ft:muller],id=parent_ids] [TraversalFilter: class_s:meta
>> -type_s:multivolume_work -type_s:periodical -type_s:issue
>> -type_s:journal][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false])
>> +class_s:meta valLen=40
>> Fast-put +filter(+(+(+[[meta_title_txt:muller meta_name_txt:muller
>> meta_subject_txt:muller
>> meta_shelflocator_txt:muller],parent_ids=id][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false]
>> +[[meta_title_txt:muller meta_name_txt:muller
>> text_ocr_ft:muller text_heidicon_ft:muller
>> text_watermark_ft:muller text_catalogue_ft:muller text_index_ft:muller
>> text_tei_ft:muller text_abstract_ft:muller
>> text_pdf_ft:muller],id=parent_ids] [TraversalFilter: class_s:meta
>> -type_s:multivolume_work -type_s:periodical -type_s:issue
>> -type_s:journal][maxDepth=-1][returnRoot=true][onlyLeafNodes=false][useAutn=false]))
>> +class_s:meta)
>>
>> Two times Fast(LRUCache)-get +(+([[...
>> as expected.
>>
>> Kind regards,
>> Jochen
>>
>> On 30.09.19 at 10:01, Jochen Barth wrote:
>>
>> Dear Mikhail,
>>
>> maybe I am wrong,
>>
>> but this query (StandardQueryParser):
>>
>> +filter(+((+((+(_query_:"{!graph from=parent_ids to=id
>> }(meta_title_txt:muller meta_name_txt:muller meta_subject_txt:muller
>> meta_shelflocator_txt:muller)") +(_query_:"{!graph from=id to=parent_ids
>> traversalFilter=\"class_s:meta -type_s:multivolume_work -type_s:periodical
>> -type_s:issue -type_s:journal\"}(meta_title_txt:muller meta_name_txt:muller
>> text_ocr_ft:muller text_heidicon_ft:muller text_watermark_ft:muller
>> text_catalogue_ft:muller text_index_ft:muller text_tei_ft:muller
>> text_abstract_ft:muller text_pdf_ft:muller)"))))) +(class_s:meta))
>> -(+(_query_:"{!join from=parent_ids
>> to=id}(+filter(+((+((+(_query_:\"{!graph from=parent_ids to=id
>> }(meta_title_txt:muller meta_name_txt:muller meta_subject_txt:muller
>> meta_shelflocator_txt:muller)\") +(_query_:\"{!graph from=id to=parent_ids
>> traversalFilter=\\\"class_s:meta -type_s:multivolume_work
>> -type_s:periodical -type_s:issue -type_s:journal\\\"}(meta_title_txt:muller
>> meta_name_txt:muller text_ocr_ft:muller text_heidicon_ft:muller
>> text_watermark_ft:muller text_catalogue_ft:muller text_index_ft:muller
>> text_tei_ft:muller text_abstract_ft:muller text_pdf_ft:muller)\")))))
>> +(class_s:meta)))"))
>>
>> is twice as fast as this equivalent one (JsonQueryDSL, "canonical" for
>> stable key order):
>>
>> {"query":{"bool":{"filter":{"bool":{"must":[{"bool":{"should":[{"bool":{"should":[{"graph":{"from":"parent_ids","query":"meta_title_txt:muller
>> meta_name_txt:muller meta_subject_txt:muller
>> meta_shelflocator_txt:muller","to":"id"}},{"graph":{"from":"id","query":"meta_title_txt:muller
>> meta_name_txt:muller text_ocr_ft:muller text_heidicon_ft:muller
>> text_watermark_ft:muller text_catalogue_ft:muller text_index_ft:muller
>> text_tei_ft:muller text_abstract_ft:muller
>> text_pdf_ft:muller","to":"parent_ids","traversalFilter":"class_s:meta
>> -type_s:multivolume_work -type_s:periodical -type_s:issue
>> -type_s:journal"}}]}}]}},"class_s:meta"]}},"must":"*:*","must_not":[{"join":{"from":"parent_ids","query":{"bool":{"filter":{"bool":{"must":[{"bool":{"should":[{"bool":{"should":[{"graph":{"from":"parent_ids","query":"meta_title_txt:muller
>> meta_name_txt:muller meta_subject_txt:muller
>> meta_shelflocator_txt:muller","to":"id"}},{"graph":{"from":"id","query":"meta_title_txt:muller
>> meta_name_txt:muller text_ocr_ft:muller text_heidicon_ft:muller
>> text_watermark_ft:muller text_catalogue_ft:muller text_index_ft:muller
>> text_tei_ft:muller text_abstract_ft:muller
>> text_pdf_ft:muller","to":"parent_ids","traversalFilter":"class_s:meta
>> -type_s:multivolume_work -type_s:periodical -type_s:issue
>> -type_s:journal"}}]}}]}},"class_s:meta"]}},"must":"*:*"}},"to":"id"}}]}}}
>>
>> Kind regards,
>> Jochen
>>
>> On 29.09.19 at 21:28, Mikhail Khludnev wrote:
>>
>> On Sun, Sep 29, 2019 at 8:37 PM Barth, Jochen <ba...@ub.uni-heidelberg.de>
>> wrote:
>>
>> Thanks for your hint. The documentation does not say if the result of
>> filter is cached here (like fq=...) (I could test this).
>>
>> 'filter' implies caching.
>>
>> Is *:* more expensive (query time) than filter() (*:* not required in
>> StandardQueryParser)?
>>
>> Either I don't get the question or it isn't worth worrying about.
>>
>> Kind regards,
>> Jochen
>>
>> ________________________________________
>> From: Mikhail Khludnev <m...@apache.org>
>> Sent: Saturday, 28 September 2019 22:58
>> To: solr-user
>> Subject: Re: filter in JSON Query DSL
>>
>> Given
>> https://lucene.apache.org/solr/guide/8_0/other-parsers.html#boolean-query-parser
>> something like
>> '{"query": { "bool": { "must": ["*:*"] , "filter": [
>> "meta_subject_txt:globe" ] } } }'
>> I'm not sure why you would put filter under must; they should be siblings.
>>
>> On Fri, Sep 27, 2019 at 4:34 PM Jochen Barth <ba...@ub.uni-heidelberg.de>
>> wrote:
>>
>> Dear reader,
>>
>> this query works as expected:
>>
>> curl -XGET http://localhost:8982/solr/Suchindex/query -d '
>> {"query": { "bool": { "must": "*:*" } },
>> "filter": [ "meta_subject_txt:globe" ] }'
>>
>> this does not (nor without the curly braces around "filter"):
>>
>> curl -XGET http://localhost:8982/solr/Suchindex/query -d '
>> {"query": { "bool": { "must": [ "*:*", { "filter": [
>> "meta_subject_txt:globe" ] } ] } } }'
>>
>> Is "filter" within deeper queries possible?
>>
>> I've got some complex queries with a "kernel" somewhat below the top
>> level...
>>
>> Is "canonical" JSON important to match a query cache entry?
>>
>> Would it help to serialize these queries to standard syntax and then use
>> filter(...)?
>>
>> Kind regards,
>>
>> Jochen
>>
>> --
>> Jochen Barth * Universitätsbibliothek Heidelberg, IT * Telefon 06221 54-2580
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>
> --
> Sincerely yours
> Mikhail Khludnev

--
Sincerely yours
Mikhail Khludnev
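
For reference, the sibling form of must and filter suggested above, together with the question about "canonical" JSON, can be sketched in Python. This is only an illustration: the helper names bool_query and canonical are hypothetical, and the thread does not establish whether key order affects cache lookups, so sorting keys is merely a way to get a stable request body. As noted at the top of the thread, {!bool filter=} is a Lucene filter, so this form is not expected to populate Solr's filter cache (see SOLR-13808).

```python
import json


def bool_query(must, filters):
    # "must" and "filter" are siblings inside "bool", as suggested in the
    # thread, rather than "filter" being nested under "must".
    return {"query": {"bool": {"must": must, "filter": filters}}}


def canonical(body):
    # Stable key order and no extra whitespace, so the same logical query
    # always serializes to the same string.
    return json.dumps(body, sort_keys=True, separators=(",", ":"))


body = bool_query(["*:*"], ["meta_subject_txt:globe"])
payload = canonical(body)
print(payload)
```

The resulting string could be sent as the request body in place of the hand-written JSON in the curl commands quoted above.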