When doing things that require all the results (like joins), you need to specify the /export handler in the search function:
qt="/export"

The search function defaults to the /select handler, which is designed to return the top N results. The /export handler always returns all results that match the query. Also keep in mind that the /export handler requires that sort fields and fl fields have docValues set.

Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, May 13, 2016 at 5:36 PM, Ryan Cutter <ryancut...@gmail.com> wrote:

> Question #1:
>
> triple_type collection has a few hundred docs and triple has 25M docs.
>
> When I search for a particular subject_id in triple which I know has 14
> results and do not pass in 'rows' params, it returns 0 results:
>
> innerJoin(
>     search(triple, q=subject_id:1656521, fl="triple_id,subject_id,type_id",
>         sort="type_id asc"),
>     search(triple_type, q=*:*, fl="triple_type_id,triple_type_label",
>         sort="triple_type_id asc"),
>     on="type_id=triple_type_id"
> )
>
> When I do the same search with rows=10000, it returns 14 results:
>
> innerJoin(
>     search(triple, q=subject_id:1656521, fl="triple_id,subject_id,type_id",
>         sort="type_id asc", rows=10000),
>     search(triple_type, q=*:*, fl="triple_type_id,triple_type_label",
>         sort="triple_type_id asc", rows=10000),
>     on="type_id=triple_type_id"
> )
>
> Am I doing this right? Is there a magic number to pass into rows which
> says "give me all the results which match this query"?
>
>
> Question #2:
>
> Perhaps related to the first question but I want to run the innerJoin()
> without the subject_id - rather have it use the results of another query.
> But this does not return any results. I'm saying "search for this entity
> based on id then use that result's entity_id as the subject_id to look
> through the triple/triple_type collections":
>
> hashJoin(
>     innerJoin(
>         search(triple, q=*:*, fl="triple_id,subject_id,type_id",
>             sort="type_id asc"),
>         search(triple_type, q=*:*, fl="triple_type_id,triple_type_label",
>             sort="triple_type_id asc"),
>         on="type_id=triple_type_id"
>     ),
>     hashed=search(entity,
>         q=id:"urn:sid:entity:455dfa1aa27eedad21ac2115797c1580bb3b3b4e",
>         fl="entity_id,entity_label", sort="entity_id asc"),
>     on="subject_id=entity_id"
> )
>
> Am I doing this hashJoin right?
>
> Thanks very much, Ryan
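
For reference, here is a sketch of the first expression from the quoted message with qt="/export" added to both search calls. It assumes that every field named in fl and sort (triple_id, subject_id, type_id, triple_type_id, triple_type_label) has docValues enabled in the schema, which the original message does not confirm:

```text
innerJoin(
    search(triple, q=subject_id:1656521, fl="triple_id,subject_id,type_id",
        sort="type_id asc", qt="/export"),
    search(triple_type, q=*:*, fl="triple_type_id,triple_type_label",
        sort="triple_type_id asc", qt="/export"),
    on="type_id=triple_type_id"
)
```

With /export in place, the rows parameter should no longer be needed, since that handler returns every document that matches the query.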