For operations that need the full result set (like joins), you need to
specify the /export handler in the search function:

qt="/export"

The search function defaults to the /select handler, which is designed to
return only the top N results. The /export handler always returns all results
that match the query. Also keep in mind that the /export handler requires
docValues to be enabled on every field used in sort and fl.
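
As a rough sketch, here is the first join from the question below with
qt="/export" added to both search calls. It assumes that triple_id,
subject_id, type_id, triple_type_id and triple_type_label all have docValues
enabled in the schema, which the original message does not confirm:

```
innerJoin(
    search(triple, q="subject_id:1656521", fl="triple_id,subject_id,type_id",
           sort="type_id asc", qt="/export"),
    search(triple_type, q="*:*", fl="triple_type_id,triple_type_label",
           sort="triple_type_id asc", qt="/export"),
    on="type_id=triple_type_id"
)
```

With /export there is no rows parameter to tune, since every matching
document is streamed. The same change would presumably apply to the search
calls inside the hashJoin in the second question.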

Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, May 13, 2016 at 5:36 PM, Ryan Cutter <ryancut...@gmail.com> wrote:

> Question #1:
>
> triple_type collection has a few hundred docs and triple has 25M docs.
>
> When I search for a particular subject_id in triple which I know has 14
> results and do not pass in 'rows' params, it returns 0 results:
>
> innerJoin(
>     search(triple, q=subject_id:1656521, fl="triple_id,subject_id,type_id",
> sort="type_id asc"),
>     search(triple_type, q=*:*, fl="triple_type_id,triple_type_label",
> sort="triple_type_id asc"),
>     on="type_id=triple_type_id"
> )
>
> When I do the same search with rows=10000, it returns 14 results:
>
> innerJoin(
>     search(triple, q=subject_id:1656521, fl="triple_id,subject_id,type_id",
> sort="type_id asc", rows=10000),
>     search(triple_type, q=*:*, fl="triple_type_id,triple_type_label",
> sort="triple_type_id asc", rows=10000),
>     on="type_id=triple_type_id"
> )
>
> Am I doing this right?  Is there a magic number to pass into rows which
> says "give me all the results which match this query"?
>
>
> Question #2:
>
> Perhaps related to the first question, but I want to run the innerJoin()
> without the subject_id and instead have it use the results of another query.
> But this does not return any results.  I'm saying "search for this entity
> based on id, then use that result's entity_id as the subject_id to look
> through the triple/triple_type collections":
>
> hashJoin(
>     innerJoin(
>         search(triple, q=*:*, fl="triple_id,subject_id,type_id",
> sort="type_id asc"),
>         search(triple_type, q=*:*, fl="triple_type_id,triple_type_label",
> sort="triple_type_id asc"),
>         on="type_id=triple_type_id"
>     ),
>     hashed=search(entity,
> q=id:"urn:sid:entity:455dfa1aa27eedad21ac2115797c1580bb3b3b4e",
> fl="entity_id,entity_label", sort="entity_id asc"),
>     on="subject_id=entity_id"
> )
>
> Am I doing this hashJoin right?
>
> Thanks very much, Ryan
>
