join query parser performance

Ron Haines Thu, 25 May 2023 05:51:48 -0700

I've been using the 'join' query parser to filter out related documents
that should not be part of the result set.  Functionally, it works
fine.  However, when we throw a realistic level of customer traffic at it,
it pretty much brings Solr to its knees.  CPU usage increases a lot, close
to 3X, when I enable this feature in our system.  Solr response times and
thread counts shoot up as well.  Before I give up on the join query parser,
I thought I'd seek some advice here.


So, when this feature is enabled, this negative &fq gets added:
-{!join fromIndex=primary_rollup from=group_id_mv to=group_member_id score=none}${q}
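
For anyone wanting to reproduce the request shape, here is a minimal sketch of how the filter above could be attached to a query.  The helper name `build_params` and the `exclude_grouped` switch are hypothetical, standing in for however the feature is toggled; only the `q` and `fq` parameters and the filter string itself come from the setup described here.

```python
from urllib.parse import urlencode

# The negative join filter exactly as described above. ${q} is left
# untouched so that Solr substitutes the main query string into the join.
JOIN_FQ = (
    "-{!join fromIndex=primary_rollup from=group_id_mv "
    "to=group_member_id score=none}${q}"
)


def build_params(user_query, exclude_grouped=True):
    """Return a URL-encoded query string for a select request.

    `exclude_grouped` is a hypothetical feature flag: when it is True,
    the negative join filter is added as an extra fq parameter.
    """
    params = [("q", user_query)]
    if exclude_grouped:
        params.append(("fq", JOIN_FQ))
    return urlencode(params)
```

Building the parameters as a list of pairs keeps any other fq values separate from the join filter, which makes it easy to compare timings with and without it.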

The 'local' collection holds about 27 million docs, but only about 125k of
them actually contain a 'group_member_id'.  The 'fromIndex' collection has
only 80k documents, and they all have the 'group_id_mv' field.  The
'fromIndex' collection is a single shard, with a replica on each shard of
the local collection.  The local collection has only about 300k docs per
shard, across 96 shards.
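
To put those numbers side by side, here is a rough back-of-the-envelope sketch.  It assumes documents (including the ones carrying 'group_member_id') are spread evenly across the 96 shards, which may not hold in practice; the variable names are only illustrative.

```python
# Figures quoted above (all approximate).
LOCAL_DOCS = 27_000_000        # docs in the local collection
LOCAL_SHARDS = 96              # shards in the local collection
DOCS_WITH_TO_FIELD = 125_000   # local docs that contain group_member_id
FROM_DOCS = 80_000             # docs in the fromIndex collection

# Assumption: an even spread of documents across shards.
docs_per_shard = LOCAL_DOCS / LOCAL_SHARDS
to_docs_per_shard = DOCS_WITH_TO_FIELD / LOCAL_SHARDS

# Share of local docs that can ever match the 'to' side of the join.
fraction_with_to_field = DOCS_WITH_TO_FIELD / LOCAL_DOCS

print(f"docs per shard:               {docs_per_shard:,.0f}")
print(f"joinable docs per shard:      {to_docs_per_shard:,.0f}")
print(f"share of docs with the field: {fraction_with_to_field:.2%}")
print(f"fromIndex docs per replica:   {FROM_DOCS:,}")
```

Under that assumption each shard holds roughly 281k docs, of which only about 1,300 carry 'group_member_id' (under half a percent), while every shard has a full 80k-doc copy of the 'fromIndex' collection to join against.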

I guess I'm just trying to understand why this appears to be causing such
problems for Solr, as the amount of work (the # of documents involved)
seems relatively small.

I hope I'm missing something...
Thanks for any input.
