Hi all,

I am trying to facet parent documents by fields of their nested (child) documents. I've tried the two approaches named in the subject line, but the first returns results that are not quite correct, and with the second I cannot get the query right. I need help with either of them, and any explanation, documentation, or blog posts on this behavior would be much appreciated.
In words, the query is: "Find the top 10 keywords for all documents with 'california' in the email subject line."

Here are the queries with their responses:

==== Json Facet API ====

curl http://localhost:8985/solr/my_collection/query -d 'q={!parent%20which="type_s:doc"}type_s:doc.userData%20%2BSubject_t:california&rows=0&
json.facet={
  filter_by_child_type :{
    type:query,
    q:"type_s:doc.enriched.text.keywords",
    domain: { blockChildren : "type_s:doc" },
    facet:{
      top_keywords_text : {
        type: terms,
        field: text_t,
        limit: 10
      }
    }
  }
}'

RETURNS:

{
  "responseHeader":{
    "status":0,
    "QTime":134,
    "params":{
      "q":"{!parent which=\"type_s:doc\"}type_s:doc.userData +Subject_t:california",
      "json.facet":"{\n filter_by_child_type :{\n type:query,\n q:\"type_s:doc.enriched.text.keywords\",\n domain: { blockChildren : \"type_s:doc\" },\n facet:{\n top_keywords_text : {\n type: terms,\n field: text_t,\n limit: 10\n }\n }\n }\n}",
      "rows":"0"}},
  "response":{"numFound":19,"start":0,"docs":[]
  },
  "facets":{
    "count":19,
    "filter_by_child_type":{
      "count":686,
      "top_keywords_text":{
        "buckets":[{
            "val":"enron",
            "count":57},
          {
            "val":"california",
            "count":22},
          {
            "val":"power",
            "count":21},
          {
            "val":"rate",
            "count":15},
          {
            "val":"plan",
            "count":13},
          {
            "val":"hou",
            "count":12},
          {
            "val":"energy",
            "count":11},
          {
            "val":"na",
            "count":11},
          {
            "val":"mckinsey",
            "count":10},
          {
            "val":"socal",
            "count":10}]}}}}

QUESTION: Where do the counts greater than 19 (the total number of top-level documents returned by the query) come from? How can I adjust the query to facet only on the top-level documents, so that no count is greater than 19?
==== BlockJoin Faceting ====

Following the example at https://cwiki.apache.org/confluence/display/solr/BlockJoin+Faceting , I've tried this:

/bjqfacet?q={!parent%20which=type_s:doc}type_s:doc.enriched.text.keywords&child.facet.field=text_t&child.facet.limit=10&child.facet.mincount=5&rows=0&fq={!parent%20which=type_s:doc}type_s:doc.userData%20%2BSubject_t:california&wt=json&indent=true

RETURNS:

{
  "responseHeader":{
    "status":0,
    "QTime":1},
  "response":{"numFound":19,"start":0,"docs":[]
  },
  "facet_counts":[
    "facet_fields",[
      "text_t",[
        "128x",1,
        "18xx",1,
        "1x",1,
        "2",2,
        "30",1,
        "60",1,
        "78xx",1,
        "82xx",1,
        "ab",2,
        "access",5,
        "account",1,
        "accounts",1,
        ...
        "california",13,
        ...
        "enron",9,
        ...
      ]]]}

QUESTION: This looks very close to what I want, but why are child.facet.limit=10 and child.facet.mincount=5 ignored? How can I get only the 10 most frequent terms?

Thank you in advance for your help!

-- 
Alisa Zhila