Re: retrieving large number of docs

2015-06-04 Thread Robust Links
That worked, but I seem unable to 1) run phrase queries, i.e. core1/select?fl=title&q={!join from=id to=id fromIndex=core0} titleNormalized:"text pdf"&facet=true&facet.field=tags, or 2) run filters on core0: core1/select?fl=title&q={!join from=id to=id fromIndex=core0} titleNormalized:"te
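For reference, here is a hedged sketch of how the phrase-query variant of that join might be built and URL-encoded (core and field names are the ones from this thread; whether the phrase parses as intended inside the local-params syntax is exactly what is in question here):

```python
from urllib.parse import urlencode

# Hypothetical parameters mirroring the core and field names in this thread.
params = {
    "fl": "title",
    # The join: the phrase query runs against titleNormalized in core0,
    # and matching ids select documents in core1 (the core being queried).
    "q": '{!join from=id to=id fromIndex=core0}titleNormalized:"text pdf"',
    "facet": "true",
    "facet.field": "tags",
}
query_string = urlencode(params)
print(query_string)
```

Encoding the parameters this way at least rules out URL-escaping problems (the quotes and braces must reach Solr intact) as the cause of the failure.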

Re: retrieving large number of docs

2015-06-04 Thread Alessandro Benedetti
Hi Rob, according to your use case you have to call the /select from core1 in this way: core1/select?fl=title&q={!join from=id to=id fromIndex=core0} titleNormalized:pdf&facet=true&facet.field=tags Hope this clarifies your problem. Cheers 2015-06-04 15:00 GMT+01:00 Robust Links : > m

Re: retrieving large number of docs

2015-06-04 Thread Robust Links
My requirement is to join core1 onto core0. Restating the requirements: I have 2 cores, core0 (field: id, field: text) and core1 (field: id, field: tag). I want to 1) query the text field of core0, together with filters, 2) use the {id} of matches (which can be >>10K) to retrieve the docs
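For concreteness, the two schemas described above might contain field declarations along these lines (a hypothetical sketch of Solr schema.xml fragments, assuming common field types; the actual schemas in the thread are not shown):

```xml
<!-- core0: searchable text keyed by id -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="text" type="text_general" indexed="true" stored="true"/>

<!-- core1: tags keyed by the same id -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="tag" type="string" indexed="true" stored="true"/>
```

Note that any field named in a query or in facet.field must be declared in the core actually serving the request, which is relevant to the "undefined field" error discussed later in the thread.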

Re: retrieving large number of docs

2015-06-04 Thread Alessandro Benedetti
Let's try to make some points clear: Index TO: the one you are using to call the select request handler. Index FROM: Tags. Is titleNormalized present in the "Tags" index? Because that is where the query will run. The documents in Tags satisfying the query will be joined with the index TO. Th
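The FROM/TO semantics Alessandro describes can be illustrated with a toy Python simulation (not Solr code; the documents are made up): with {!join from=id to=id fromIndex=core0}, the query matches documents in the FROM index, and their ids select documents in the TO index, which is the core actually serving the request:

```python
core0 = [  # FROM index: where titleNormalized lives, so the query runs here
    {"id": "1", "titleNormalized": "report pdf"},
    {"id": "2", "titleNormalized": "summary doc"},
]
core1 = [  # TO index: the core handling the /select request
    {"id": "1", "tags": ["finance"]},
    {"id": "2", "tags": ["hr"]},
    {"id": "3", "tags": ["misc"]},
]

def join_query(from_docs, to_docs, predicate):
    # Run the query on the FROM index, collect the join keys...
    matched_ids = {d["id"] for d in from_docs if predicate(d)}
    # ...then return the TO-index documents sharing those keys.
    return [d for d in to_docs if d["id"] in matched_ids]

results = join_query(core0, core1, lambda d: "pdf" in d["titleNormalized"])
print(results)  # docs from core1 whose id matched the query in core0
```

Because the returned documents come from the TO index, any faceting or filtering fields must exist there, which is the crux of the disagreement in this thread.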

Re: retrieving large number of docs

2015-06-04 Thread Robust Links
Try it for yourself and see if it works, Alessandro. Not only can't I get facets, but I even get field errors when I run such join queries: select?fl=title&q={!join from=id to=id fromIndex=Tags}titleNormalized:pdf returns "undefined field titleNormalized" (400). On Thu, Jun 4, 2015 at 5:19 AM, Alessandro Be

Re: retrieving large number of docs

2015-06-04 Thread Alessandro Benedetti
Hi Rob, reading your use case I cannot understand why the Query Time Join is not a fit for you! The documents returned by the Query Time Join will be from core1, so faceting and filter querying on that core would definitely be possible! Honestly, I cannot see your problem! Cheers 2015-06-04 1:4

Re: retrieving large number of docs

2015-06-03 Thread Robust Links
That doesn't work either, and even if it did, joining is not going to be a solution since I can't query one core and facet on the result of the other. To sum up, my problem is: core0 (field: id, field: text), core1 (field: id, field: tag). I want to 1) query the text field of core0, 2) use the {

Re: retrieving large number of docs

2015-06-03 Thread Jack Krupansky
Specify the join query parser for the main query. See: https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-JoinQueryParser -- Jack Krupansky On Wed, Jun 3, 2015 at 3:32 PM, Robust Links wrote: > Hi Erick > > they are on the same JVM. I had already tried the core join st

Re: retrieving large number of docs

2015-06-03 Thread Robust Links
Hi Erick, they are on the same JVM. I had already tried the core join strategy but that doesn't solve the faceting problem... i.e. if I have 2 cores, core0 and core1, and I run this query on core0: /select?&q=fq={!join from=id1 to=id2 fromIndex=core1}&facet=true&facet.field=tag it has 2 problems: 1) I n
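As an aside, the join expression above is embedded as "q=fq={...}", whereas the join local-params syntax is usually written as a standalone query in either q or a separate fq parameter. A hedged sketch of one well-formed variant (field and core names are taken from the message above; the faceting limitation the thread discusses still applies, since facet fields must exist in the core being queried):

```python
from urllib.parse import urlencode

# Hypothetical: query core0's text field, keeping only documents whose
# id1 matches an id2 joined in from core1. The join goes in fq on its own,
# with *:* as the inner query so every core1 document contributes its key.
params = {
    "q": "text:pdf",
    "fq": "{!join from=id1 to=id2 fromIndex=core1}*:*",
}
query_string = urlencode(params)
print(query_string)
```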

Re: retrieving large number of docs

2015-06-03 Thread Joel Bernstein
Erick makes a great point: if they are in the same VM, try the cross-core join first. It might be fast enough for you. A custom solution would be to build a custom query or post filter that works with your specific scenario. For example, if the docIDs are integers you could build a fast PostFilter
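Joel's point about integer docIDs can be illustrated with a toy post filter: after the main query produces scored hits, a cheap membership test against a pre-built set of IDs drops everything outside the allowed list. (In Solr this would be a Java PostFilter; this Python sketch with made-up data just shows the idea, and integer keys are what make a compact hash set or bitset possible.)

```python
# Pre-built set of allowed integer docIDs (hypothetical values).
allowed_ids = frozenset([3, 7, 42, 1001])

def post_filter(hits):
    # hits: iterable of (doc_id, score) pairs produced by the main query.
    # Keep only hits whose id is in the allowed set; order is preserved.
    return [(doc_id, score) for doc_id, score in hits if doc_id in allowed_ids]

hits = [(1, 2.5), (3, 2.1), (42, 1.9), (500, 1.2)]
print(post_filter(hits))  # [(3, 2.1), (42, 1.9)]
```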

Re: retrieving large number of docs

2015-06-03 Thread Robust Links
What would be a custom solution? On Wed, Jun 3, 2015 at 1:58 PM, Joel Bernstein wrote: > You may have to do something custom to meet your needs. > > 10,000 docIDs is not huge but your latency requirements are pretty low. > > Are your docIDs by any chance integers? This can make custom PostFi

Re: retrieving large number of docs

2015-06-03 Thread Erick Erickson
Are these indexes on different machines? Because if they're in the same JVM, you might be able to use cross-core joins. Be aware, though, that joining on high-cardinality fields (which, by definition, docID probably is) is where pseudo joins perform worst. Have you considered flattening the data a

Re: retrieving large number of docs

2015-06-03 Thread Joel Bernstein
You may have to do something custom to meet your needs. 10,000 docIDs is not huge but your latency requirements are pretty low. Are your docIDs by any chance integers? This can make custom PostFilters run much faster. You should also be aware of the Streaming API in Solr 5.1 which will give y

Re: retrieving large number of docs

2015-06-03 Thread Robust Links
Hey Joel, see below. On Wed, Jun 3, 2015 at 1:43 PM, Joel Bernstein wrote: > A few questions for you: > How large can the list of filtering IDs be? >10K > What's your expectation on latency? 10 < latency < 100 > What version of Solr are you using? 5.0.0 > SolrCloud or not?

Re: retrieving large number of docs

2015-06-03 Thread Joel Bernstein
A few questions for you: How large can the list of filtering IDs be? What's your expectation on latency? What version of Solr are you using? SolrCloud or not? Joel Bernstein http://joelsolr.blogspot.com/ On Wed, Jun 3, 2015 at 1:23 PM, Robust Links wrote: > Hi > > I have a set of document

retrieving large number of docs

2015-06-03 Thread Robust Links
Hi, I have a set of document IDs from one core and I want to query another core using the IDs retrieved from the first core... the constraint is that the size of the doc ID set can be very large. I want to: 1) retrieve these docs from the 2nd index, 2) facet on the results. I can think of 3 solutions:
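One approach later discussion circles around can be sketched directly: after the first core returns its IDs, pass them to the second core as a filter query, for example via Solr's {!terms} query parser (assuming a Solr version that has it, 4.10+; the ID values below are hypothetical). The catch, as this thread explores, is that with >>10K IDs the request itself becomes very large:

```python
from urllib.parse import urlencode

# Hypothetical IDs retrieved from the first core.
doc_ids = ["12", "77", "345"]

# Query the second core, restricting to the retrieved IDs and faceting
# on its tag field.
params = {
    "q": "*:*",
    "fq": "{!terms f=id}" + ",".join(doc_ids),
    "facet": "true",
    "facet.field": "tag",
}
query_string = urlencode(params)
print(query_string)
```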