Using Solr 4.8.1.
I am creating an index containing Solr documents both with and without
nested documents. When indexing documents from a single SolrJ client on a
single thread, if I do not call commit() after each add(), I see
some erroneous documents returned from my "child of" or "parent which"
block join queries. Here is an example:
I have a required field 'doc-type' used for the block join. For example, I
have 3 types {'parent', 'child', 'single'}, where a parent has 1 or more
nested child docs and a single never has a nested doc.
CASE 1:
================================
I add / commit docs in this order
add() - single001
add() - parent001 : [child001, child002]
commit()
then query all child docs of every parent doc
..... {!child of='doc-type:parent'}doc-type:parent
response contains *single001*, child001, child002 -- INCORRECT, *single001* is
not a child.
================================
CASE 2:
================================
I add / commit docs in this order
add() - single001
commit()   <-- added commit here
add() - parent001 : [child001, child002]
commit()
then query all child docs of every parent doc
..... {!child of='doc-type:parent'}doc-type:parent
response contains only {child001, child002} -- CORRECT
================================
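For reference, the query string used in both cases, and its "parent which" counterpart, can be assembled as in the minimal sketch below. The class and method names are hypothetical, and the 'doc-type:child' clause in the second query is an illustrative assumption; only the {!child of=...} and {!parent which=...} local-params forms are Solr syntax.

```java
public class BlockJoinQueries {

    // Builds "{!child of='<parentFilter>'}<parentQuery>".
    // Assumes parentFilter contains no single quotes (no escaping is done here).
    static String childOf(String parentFilter, String parentQuery) {
        return "{!child of='" + parentFilter + "'}" + parentQuery;
    }

    // Builds "{!parent which='<parentFilter>'}<childQuery>".
    // Same assumption: parentFilter contains no single quotes.
    static String parentWhich(String parentFilter, String childQuery) {
        return "{!parent which='" + parentFilter + "'}" + childQuery;
    }

    public static void main(String[] args) {
        // The query from CASE 1 and CASE 2: all child docs of every parent doc.
        System.out.println(childOf("doc-type:parent", "doc-type:parent"));
        // Illustrative counterpart: all parent docs that have a child doc.
        System.out.println(parentWhich("doc-type:parent", "doc-type:child"));
    }
}
```

The resulting strings would be passed as the q parameter of the request, exactly as in the two cases above.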
CASE 2 will only work when a single Solr client is adding docs. If I spawn
a bunch of threads that do adds, I will not be able to guarantee that commit()
is called between adds.
Additionally, I am not 100% sure that calling commit() is the correct solution;
it just seems to work where I have tested.
I'd greatly appreciate it if someone could shed some light on this problem or
help me better understand the limitations of block indexing and block
join.