Mikhail, I totally see the point: the corresponding wiki page (https://cwiki.apache.org/confluence/display/solr/BlockJoin+Faceting) does not mention any such parameters and describes BlockJoin faceting as an experimental feature.
Is it correct that no additional options (limit, mincount, etc.) can be set at all? Or, more specifically, is there any workaround to control the output of the query at hand (perhaps something beyond the faceting options):

/bjqfacet?q={!parent%20which=type_s:doc}type_s:doc.enriched.text.keywords&child.facet.field=text_t&rows=0&fq={!parent%20which=type_s:doc}type_s:doc.userData%20%2BSubject_t:california&wt=json&indent=true

>>RETURNS:
>>
>>{
>>  "responseHeader":{
>>    "status":0,
>>    "QTime":1},
>>  "response":{"numFound":19,"start":0,"docs":[]
>>  },
>>  "facet_counts":[
>>    "facet_fields",[
>>      "text_t",[
>>        "128x",1,
>>        "18xx",1,
>>        "1x",1,
>>        "2",2,
>>        "30",1,
>>        "60",1,
>>        "78xx",1,
>>        "82xx",1,
>>        "ab",2,
>>        "access",5,
>>        "account",1,
>>        "accounts",1,
>>...
>>"california",13,
>>...
>>"enron",9,
>>...
>>]]]}

>Tuesday, 29 March 2016, 13:40 -04:00 from Mikhail Khludnev <mkhlud...@griddynamics.com>:
>
>Alisa,
>
>There is no such thing as child.facet.limit, etc.
>
>On Tue, Mar 29, 2016 at 6:27 PM, Alisa Z. <prol...@mail.ru> wrote:
>
>> So the first issue was eventually solved by adding
>> facet: {top_terms_by_doc: "unique(_root_)"} and sorting the outer facet
>> buckets by this facet:
>>
>> curl http://localhost:8985/solr/enron_path_w_ts/query -d
>> 'q={!parent%20which="type_s:doc"}type_s:doc.userData%20%2BSubject_t:california&rows=0&
>> json.facet={
>>   filter_by_child_type :{
>>     type:query,
>>     q:"type_s:doc.enriched.text.keywords",
>>     domain: { blockChildren : "type_s:doc" },
>>     facet:{
>>       top_keywords_text : {
>>         type: terms,
>>         field: text_t,
>>         limit: 10,
>>         sort: "top_terms_by_doc desc",
>>         facet: {
>>           top_terms_by_doc: "unique(_root_)"
>>         }
>>       }
>>     }
>>   }
>> }'
>>
>>
>> The BlockJoin Faceting part is still open: I've tried all the conventional
>> faceting parameters: facet.limit, child.facet.limit, f.text_t.facet.limit
>> ... nothing worked :(
>>
>>
>> >Monday, 28 March 2016, 17:20 -04:00 from Alisa Z. <prol...@mail.ru>:
>> >
>> >Ok, so for the 1st question, I think I'm getting closer: adding
>> >facet: {top_terms_by_doc: "unique(_root_)"} as indicated in
>> >http://blog.griddynamics.com/search/label/~Mikhail%20Khludnev returns
>> >correct counts. However, sorting is done by the outer facet's count, not
>> >by unique(_root_):
>> >
>> >
>> >curl http://localhost:8985/solr/my_collection/query -d
>> >'q={!parent%20which="type_s:doc"}type_s:doc.userData%20%2BSubject_t:california&rows=0&
>> >json.facet={
>> >  filter_by_child_type :{
>> >    type:query,
>> >    q:"type_s:doc.enriched.text.keywords",
>> >    domain: { blockChildren : "type_s:doc" },
>> >    facet:{
>> >      top_keywords_text : {
>> >        type: terms,
>> >        field: text_t,
>> >        limit: 10,
>> >        facet: {
>> >          top_terms_by_doc: "unique(_root_)"
>> >        }
>> >      }
>> >    }
>> >  }
>> >}'
>> >
>> >RETURNS
>> >
>> >{
>> >  "responseHeader":{
>> >    "status":0,
>> >    "QTime":25,
>> >    "params":{
>> >      "q":"{!parent which=\"type_s:doc\"}type_s:doc.userData +Subject_t:california",
>> >      "json.facet":"{\n filter_by_child_type :{\n type:query,\n q:\"type_s:doc.enriched.text.keywords\",\n domain: { blockChildren : \"type_s:doc\" },\n facet:{\n top_keywords_text : {\n type: terms,\n field: text_t,\n limit: 10,\n facet: {\n top_terms_by_doc: \"unique(_root_)\"\n }\n }\n }\n }\n}",
>> >      "rows":"0"}},
>> >  "response":{"numFound":19,"start":0,"docs":[]
>> >  },
>> >  "facets":{
>> >    "count":19,
>> >    "filter_by_child_type":{
>> >      "count":686,
>> >      "top_keywords_text":{
>> >        "buckets":[{
>> >            "val":"enron",
>> >            "count":57,
>> >            "top_terms_by_doc":9},
>> >          {
>> >            "val":"california",
>> >            "count":22,
>> >            "top_terms_by_doc":13},
>> >          {
>> >            "val":"power",
>> >            "count":21,
>> >            "top_terms_by_doc":7},
>> >          {
>> >            "val":"rate",
>> >            "count":15,
>> >            "top_terms_by_doc":5},
>> >          {
>> >            "val":"plan",
>> >            "count":13,
>> >            "top_terms_by_doc":3},
>> >          {
>> >            "val":"hou",
>> >            "count":12,
>> >            "top_terms_by_doc":5},
>> >          {
>> >            "val":"energy",
>> >            "count":11,
>> >            "top_terms_by_doc":5},
>> >          {
>> >            "val":"na",
>> >            "count":11,
>> >            "top_terms_by_doc":5},
>> >          {
>> >            "val":"mckinsey",
>> >            "count":10,
>> >            "top_terms_by_doc":1},
>> >          {
>> >            "val":"socal",
>> >            "count":10,
>> >            "top_terms_by_doc":4}]}}}}
>> >
>> >Nice, but I want them to be ordered by the "top_terms_by_doc"
>> >frequencies, not by the "count" frequencies.
>> >Any suggestions?
>> >
>> >Thanks,
>> >Alisa
>> >
>> >
>> >>Monday, 28 March 2016, 15:39 -04:00 from Alisa Z. <prol...@mail.ru>:
>> >>
>> >>Hi all,
>> >>
>> >>I am trying to perform faceting of parent docs by nested document
>> >>fields. I've tried the 2 approaches named in the subject, yet with the
>> >>first the results are not quite correct, and with the second I cannot
>> >>get the query right. So I need help with either of them, and any
>> >>explanation, documentation, or blog posts on this behavior would be
>> >>much appreciated.
>> >>
>> >>In words, the query is as follows: "Find the top 10 keywords for all
>> >>documents with "california" in the email subject line"
>> >>
>> >>Here are the queries with their responses:
>> >>
>> >>==== Json Facet API ====
>> >>
>> >>curl http://localhost:8985/solr/my_collection/query -d
>> >>'q={!parent%20which="type_s:doc"}type_s:doc.userData%20%2BSubject_t:california&rows=0&
>> >>json.facet={
>> >>  filter_by_child_type :{
>> >>    type:query,
>> >>    q:"type_s:doc.enriched.text.keywords",
>> >>    domain: { blockChildren : "type_s:doc" },
>> >>    facet:{
>> >>      top_keywords_text : {
>> >>        type: terms,
>> >>        field: text_t,
>> >>        limit: 10
>> >>      }
>> >>    }
>> >>  }
>> >>}'
>> >>
>> >>RETURNS:
>> >>
>> >>{
>> >>  "responseHeader":{
>> >>    "status":0,
>> >>    "QTime":134,
>> >>    "params":{
>> >>      "q":"{!parent which=\"type_s:doc\"}type_s:doc.userData +Subject_t:california",
>> >>      "json.facet":"{\n filter_by_child_type :{\n type:query,\n q:\"type_s:doc.enriched.text.keywords\",\n domain: { blockChildren : \"type_s:doc\" },\n facet:{\n top_keywords_text : {\n type: terms,\n field: text_t,\n limit: 10\n }\n }\n }\n}",
>> >>      "rows":"0"}},
>> >>  "response":{"numFound":19,"start":0,"docs":[]
>> >>  },
>> >>  "facets":{
>> >>    "count":19,
>> >>    "filter_by_child_type":{
>> >>      "count":686,
>> >>      "top_keywords_text":{
>> >>        "buckets":[{
>> >>            "val":"enron",
>> >>            "count":57},
>> >>          {
>> >>            "val":"california",
>> >>            "count":22},
>> >>          {
>> >>            "val":"power",
>> >>            "count":21},
>> >>          {
>> >>            "val":"rate",
>> >>            "count":15},
>> >>          {
>> >>            "val":"plan",
>> >>            "count":13},
>> >>          {
>> >>            "val":"hou",
>> >>            "count":12},
>> >>          {
>> >>            "val":"energy",
>> >>            "count":11},
>> >>          {
>> >>            "val":"na",
>> >>            "count":11},
>> >>          {
>> >>            "val":"mckinsey",
>> >>            "count":10},
>> >>          {
>> >>            "val":"socal",
>> >>            "count":10}]}}}}
>> >>
>> >>
>> >>QUESTION: where do the counts greater than 19 (the total number of
>> >>top-level documents returned by the query) come from? How can I adjust
>> >>the query to facet only on the top-level documents (so that no count
>> >>is greater than 19)?
>> >>
>> >>
>> >>===== BlockJoin Faceting ======
>> >>Following the example on
>> >>https://cwiki.apache.org/confluence/display/solr/BlockJoin+Faceting ,
>> >>I've tried this:
>> >>
>> >>/bjqfacet?q={!parent%20which=type_s:doc}type_s:doc.enriched.text.keywords&child.facet.field=text_t&child.facet.limit=10&child.facet.mincount=5&rows=0&fq={!parent%20which=type_s:doc}type_s:doc.userData%20%2BSubject_t:california&wt=json&indent=true
>> >>
>> >>RETURNS:
>> >>
>> >>{
>> >>  "responseHeader":{
>> >>    "status":0,
>> >>    "QTime":1},
>> >>  "response":{"numFound":19,"start":0,"docs":[]
>> >>  },
>> >>  "facet_counts":[
>> >>    "facet_fields",[
>> >>      "text_t",[
>> >>        "128x",1,
>> >>        "18xx",1,
>> >>        "1x",1,
>> >>        "2",2,
>> >>        "30",1,
>> >>        "60",1,
>> >>        "78xx",1,
>> >>        "82xx",1,
>> >>        "ab",2,
>> >>        "access",5,
>> >>        "account",1,
>> >>        "accounts",1,
>> >>...
>> >>"california",13,
>> >>...
>> >>"enron",9,
>> >>...
>> >>]]]}
>> >>
>> >>QUESTION: This looks very close to what I want, yet why are
>> >>child.facet.limit=10&child.facet.mincount=5 ignored? How can I get the
>> >>top 10 most frequent terms?
>> >>
>> >>
>> >>Thank you for your help in advance!
>> >>
>> >>--
>> >>Alisa Zhila
>>
>
>
>--
>Sincerely yours
>Mikhail Khludnev
>Principal Engineer,
>Grid Dynamics
>
><http://www.griddynamics.com>
><mkhlud...@griddynamics.com>
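
Since, per the reply quoted above, there is no child.facet.limit or similar parameter, one possible workaround is to trim the BlockJoin facet output on the client side. Below is a minimal Python sketch, assuming the response has the shape shown in this thread ("facet_counts" as a flat list in which each name is followed by its value, and terms alternate with counts). The function name top_child_facets and its limit/mincount arguments are hypothetical helpers, not Solr parameters, and the sample data is only an excerpt of the counts quoted above.

```python
import json


def top_child_facets(response, field, limit=10, mincount=1):
    """Return up to `limit` (term, count) pairs for `field`, most frequent first.

    Assumes the flat layout seen in this thread:
    "facet_counts": ["facet_fields", [field, [term, count, term, count, ...]]]
    """
    facet_counts = response["facet_counts"]
    # In a flat name/value list, the value follows its name.
    fields = facet_counts[facet_counts.index("facet_fields") + 1]
    flat = fields[fields.index(field) + 1]
    # Pair every term (even index) with its count (odd index).
    pairs = zip(flat[0::2], flat[1::2])
    kept = [(term, count) for term, count in pairs if count >= mincount]
    # Highest count first; ties are broken alphabetically by term.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return kept[:limit]


# Excerpt of the facet counts quoted in this thread (not the full response).
sample = json.loads("""
{"facet_counts": ["facet_fields", ["text_t", [
    "2", 2, "ab", 2, "access", 5, "account", 1,
    "california", 13, "enron", 9]]]}
""")

for term, count in top_child_facets(sample, "text_t", limit=10, mincount=5):
    print(term, count)
```

This only post-processes what the handler already returns, so the full term list still travels over the wire; for large fields the JSON Facet API request with sort: "top_terms_by_doc desc" shown earlier in the thread is likely the better fit.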