innerJoin is a merge join and hashJoin is a hash join. The merge join can support joins of unlimited size without running out of memory, but it requires that both sides of the join be sorted on the join keys.
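
To make the sorting requirement concrete, here is a minimal Python sketch of a merge join. This is not Solr's implementation; the function name merge_join, the dict-shaped tuples, and the sample data are all assumptions made for illustration. It walks two inputs that are already sorted ascending on the join key and only ever buffers the rows that share the current key.

```python
from itertools import groupby
from operator import itemgetter


def merge_join(left, right, key):
    """Inner-join two iterables of dicts, both sorted ascending on `key`.

    Illustrative only: a generic merge join, not Solr's innerJoin code.
    """
    left_groups = groupby(left, key=itemgetter(key))
    right_groups = groupby(right, key=itemgetter(key))
    done = (None, None)  # sentinel returned when a side is exhausted
    lk, lrows = next(left_groups, done)
    rk, rrows = next(right_groups, done)
    while lrows is not None and rrows is not None:
        if lk < rk:
            # Left key is behind: skip ahead on the left side.
            lk, lrows = next(left_groups, done)
        elif lk > rk:
            # Right key is behind: skip ahead on the right side.
            rk, rrows = next(right_groups, done)
        else:
            # Keys match: buffer only the right rows for this one key.
            rbuf = list(rrows)
            for lrow in lrows:
                for rrow in rbuf:
                    yield {**lrow, **rrow}
            lk, lrows = next(left_groups, done)
            rk, rrows = next(right_groups, done)


# Hypothetical sample data, both sides sorted on "id".
people = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}, {"id": 4, "name": "Dee"}]
pets = [{"id": 2, "pet": "cat"}, {"id": 2, "pet": "dog"},
        {"id": 3, "pet": "eel"}, {"id": 4, "pet": "fox"}]
joined = list(merge_join(people, pets, "id"))
```

If either input is not sorted on the key, this sketch silently drops matches, which is why the sort requirement matters.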

The hash join reads one side of the join into a hash map keyed on the join keys. This doesn't require any specific sort but it is limited in size by how much data can fit in the hash map.

You can parallelize both joins using the parallel function to improve scalability and performance.

Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, Mar 24, 2017 at 4:49 AM, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:

> Hi,
>
> What is the main difference between hashJoin and innerJoin in Solr
> Streaming Expression?
>
> I understand that both will emit a tuple containing the fields of both
> tuples.
>
> When I tried both hashJoin and innerJoin with the same query, I get exactly
> the same results, and there is no difference in performance.
>
> Under what circumstances should we use hashJoin, and under what
> circumstances should we use innerJoin?
>
> Regards,
> Edwin
>
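
For comparison, the hash join described in the reply can be sketched in Python along the following lines. Again this is only an illustration under assumed names (hash_join, the stream and hashed arguments, the sample rows), not Solr's code. One side is loaded entirely into a dictionary keyed on the join key, and the other side is streamed past it in whatever order it arrives.

```python
from collections import defaultdict


def hash_join(stream, hashed, key):
    """Inner-join `stream` against `hashed`; neither side needs to be sorted.

    Illustrative only: a generic hash join, not Solr's hashJoin code.
    """
    # Build phase: the whole `hashed` side must fit in memory.
    table = defaultdict(list)
    for row in hashed:
        table[row[key]].append(row)
    # Probe phase: the other side is read once, in arrival order.
    for row in stream:
        for match in table.get(row[key], ()):
            yield {**row, **match}


# Hypothetical sample data; note that `orders` is not sorted on "id".
orders = [{"id": 3, "item": "pen"}, {"id": 1, "item": "ink"}, {"id": 3, "item": "pad"}]
users = [{"id": 1, "name": "Ann"}, {"id": 3, "name": "Cy"}]
joined = list(hash_join(orders, users, "id"))
```

The memory cost is tied to the size of the hashed side, so in a sketch like this you would pass the smaller input as hashed.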