Using Solr 4.8.1. I am creating an index containing Solr documents both with and without nested documents. When indexing documents from a single SolrJ client on a single thread, if I do not call commit() after each document add(), I see some erroneous documents returned from my "child of" or "parent which" block join queries. Here is an example:
I have a required field 'doc-type' used for the block join. For example, I have 3 types {'parent', 'child', 'single'}, where a parent has 1 or more nested child docs and a single never has a nested doc.

CASE 1:
================================
I add / commit docs in this order:

add() - single001
add() - parent001 : [child001, child002]
commit()

then query all child docs of every parent doc:

{!child of='doc-type:parent'}doc-type:parent

The response contains *single001*, child001, child002 -- INCORRECT. *single001* is not a child.
================================

CASE 2:
================================
I add / commit docs in this order:

add() - single001
commit()  *added commit here*
add() - parent001 : [child001, child002]
commit()

then query all child docs of every parent doc:

{!child of='doc-type:parent'}doc-type:parent

The response contains only {child001, child002} -- CORRECT.
================================

CASE 2 will only work when a single Solr client is adding docs. If I spawn a bunch of threads that do adds, I will not be able to guarantee that commit() is called between adds. Additionally, I am not 100% sure calling commit() is the correct solution; it just seems to work where I have tested.

I'd greatly appreciate it if someone could shed some light on this problem or help me better understand the limitations of block indexing and block join.
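
To make the two cases easier to reason about, here is a toy model in plain Java that reproduces both results. It is only a sketch under an assumption I have not verified against Solr itself: that within one index segment a block is written children first and parent last, and that the "of" filter is what marks where each block ends, so every doc sitting between two filter matches is treated as a child. The names BlockJoinSketch, Doc and childOf are made up for illustration; nothing here calls Solr or SolrJ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BlockJoinSketch {

    /** Toy document: an id plus its 'doc-type' value (hypothetical model, not a Solr class). */
    public static final class Doc {
        public final String id;
        public final String docType;

        public Doc(String id, String docType) {
            this.id = id;
            this.docType = docType;
        }
    }

    /**
     * Models {!child of=<filter>}<parent query> over a single segment.
     * ASSUMPTION: docs matched by the 'of' filter are the only block boundaries,
     * and everything between two boundaries belongs to the later one.
     *
     * @param segment         docs in index order (children before their parent)
     * @param ofFilterTypes   doc-type values matched by the 'of' filter
     * @param parentQueryType doc-type value matched by the parent query
     */
    public static List<String> childOf(List<Doc> segment,
                                       List<String> ofFilterTypes,
                                       String parentQueryType) {
        List<String> children = new ArrayList<>();
        int blockStart = 0; // index of the first doc after the previous boundary
        for (int i = 0; i < segment.size(); i++) {
            Doc doc = segment.get(i);
            if (!ofFilterTypes.contains(doc.docType)) {
                continue; // not a boundary, so it stays inside the current block
            }
            if (doc.docType.equals(parentQueryType)) {
                // Every doc since the previous boundary is reported as a child.
                for (int j = blockStart; j < i; j++) {
                    children.add(segment.get(j).id);
                }
            }
            blockStart = i + 1;
        }
        return children;
    }

    public static void main(String[] args) {
        // CASE 1: no commit between the adds, so all four docs share one segment.
        List<Doc> oneSegment = Arrays.asList(
                new Doc("single001", "single"),
                new Doc("child001", "child"),
                new Doc("child002", "child"),
                new Doc("parent001", "parent"));

        // of='doc-type:parent' -> single001 is swept into parent001's block.
        System.out.println(childOf(oneSegment, Arrays.asList("parent"), "parent"));

        // A filter that also matches 'single' docs makes single001 a boundary.
        System.out.println(childOf(oneSegment, Arrays.asList("parent", "single"), "parent"));

        // CASE 2: the extra commit leaves parent001's block in its own segment.
        List<Doc> secondSegment = oneSegment.subList(1, 4);
        System.out.println(childOf(secondSegment, Arrays.asList("parent"), "parent"));
    }
}
```

If this model is right, the extra commit() in CASE 2 only helps because single001 ends up in a different segment, which might stop being true once segments are merged. An "of" filter that matches every top-level doc, for example of='doc-type:(parent single)', would then look like the more robust option, but I have not confirmed that against a real index.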