[Subquery] Transform Documents across Collections

2020-08-12 Thread Norbert Kutasi
Hello,

We have been using [subquery] to build arbitrarily complex hierarchies
in our document responses.

It works well as long as the documents are in the same collection. Based on
the reference guide, I inferred that it can also bring in documents from a
different collection, but when we try that it throws an error.
https://lucene.apache.org/solr/guide/8_2/transforming-result-documents.html#subquery


We are on Solr 8.2, and in this sandbox we have a 2-node SolrCloud cluster
where both collections have 1 shard and 2 NRT replicas, so that each node has
a core from each collection.
Basic authentication is enabled.

Simple steps to reproduce this issue in this 2 node environment:
./solr create -c Collection1 -s 1 -rf 2
./solr create -c Collection2 -s 1 -rf 2

Note: these collections are schemaless; however, we observed the same with
ones that have schemas.

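Collection 1:

  id: 1, first_name: John
  id: 2, first_name: Peter

Collection 2:

  id: 3, first_name: Thomas, reporting_to: 2
  id: 4, first_name: Charles, reporting_to: 1
  id: 5, first_name: Susan, reporting_to: 3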


http://localhost:8983/solr/Collection1/query
{
  params: {
    q: "*",
    fq: "*",
    rows: 5,
    fl:"*,subordinate:[subquery fromIndex=Collection2]",
    subordinate.fl:"*",
    subordinate.q:"{!field f=reporting v=$row.id}",
    subordinate.fq:"*",
    subordinate.rows:"5"
  }
}

{
  "error":{
    "metadata":[
      "error-class","org.apache.solr.common.SolrException",
      "root-error-class","org.apache.solr.common.SolrException"],
    "msg":"while invoking subordinate:[subqueryfromIndex=Collection2] on doc=SolrDocument{id=stored,indexed,tokenized,omitNorms,indexOptions=DOCS, first_name=[stored,index",
    "code":400}}


Where are we making a mistake?

Thank you in advance,
Norbert
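
For anyone reproducing this, the sample documents could be loaded through Solr's JSON update handler. The following is a minimal sketch, not the commands actually used in this thread: the field names first_name and reporting_to are taken from the JSON responses shown later in the thread, the helper name build_update_request is hypothetical, and the Basic Auth header is left out.

```python
import json
import urllib.request

# Sample documents from this thread; field names follow the JSON responses
# shown later in the thread.
COLLECTION1_DOCS = [
    {"id": "1", "first_name": "John"},
    {"id": "2", "first_name": "Peter"},
]
COLLECTION2_DOCS = [
    {"id": "3", "first_name": "Thomas", "reporting_to": 2},
    {"id": "4", "first_name": "Charles", "reporting_to": 1},
    {"id": "5", "first_name": "Susan", "reporting_to": 3},
]


def build_update_request(base_url, collection, docs):
    """Build (but do not send) a POST to the collection's /update handler."""
    url = f"{base_url}/{collection}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The base URL is an assumption matching the host used in this thread.
BASE_URL = "http://localhost:8983/solr"
req1 = build_update_request(BASE_URL, "Collection1", COLLECTION1_DOCS)
req2 = build_update_request(BASE_URL, "Collection2", COLLECTION2_DOCS)
# To actually index, pass each request to urllib.request.urlopen(...).
```

Building the requests separately from sending them keeps the sketch safe to run without a cluster; credentials would need to be added before use against a secured node.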


Re: [Subquery] Transform Documents across Collections

2020-08-12 Thread Norbert Kutasi
Hi Dominique,

Sorry, I was in a hurry to create a simple case that is still similar to the
one we face internally.

reporting_to is indeed the right field, but the same error still persists.
Something is seemingly wrong when invoking [subquery] with fromIndex:

{
  params: {
    q: "*",
    fq: "*",
    rows: 5,
    fl:"*,subordinate:[subquery fromIndex=Collection2]",
    subordinate.fl:"*",
    subordinate.q:"{!field f=reporting_to v=$row.id}",
    subordinate.fq:"*",
    subordinate.rows:"5",
  }
}

{
  "error":{
    "metadata":[
      "error-class","org.apache.solr.common.SolrException",
      "root-error-class","org.apache.solr.common.SolrException"],
    "msg":"while invoking subordinate:[subqueryfromIndex=Collection2] on doc=SolrDocument{id=stored,indexed,tokenized,omitNorms,indexOptions=DOCS, first_name=[stored,index",
    "code":400}}

Any help is much appreciated; hopefully it's just an error in the syntax I've
been using.

Regards,
Norbert
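
For reference, the result this request is aiming for can be modelled in plain Python. This is only an illustration of the intended semantics, under the assumption that the subquery runs once per parent row with $row.id replaced by that row's id; it is not how Solr implements the transformer, and the function name attach_subordinates is hypothetical.

```python
# Sample data from this thread.
parents = [
    {"id": "1", "first_name": "John"},
    {"id": "2", "first_name": "Peter"},
]
children = [
    {"id": "3", "first_name": "Thomas", "reporting_to": 2},
    {"id": "4", "first_name": "Charles", "reporting_to": 1},
    {"id": "5", "first_name": "Susan", "reporting_to": 3},
]


def attach_subordinates(parent_docs, child_docs):
    """Model [subquery]: run one child lookup per parent row."""
    results = []
    for row in parent_docs:
        # Equivalent of v=$row.id: the parent's id is substituted per row.
        row_id = str(row["id"])
        matches = [c for c in child_docs if str(c["reporting_to"]) == row_id]
        results.append(
            {**row, "subordinate": {"numFound": len(matches), "docs": matches}}
        )
    return results


for doc in attach_subordinates(parents, children):
    names = [c["first_name"] for c in doc["subordinate"]["docs"]]
    print(doc["first_name"], names)
# John ['Charles']
# Peter ['Thomas']
```

Susan reports to id 3, which is not in Collection1, so she is attached to neither parent.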

On Wed, 12 Aug 2020 at 12:49, Dominique Bejean 
wrote:

> Hi Norbert,
>
> The field name in collection2 is  "reporting_to" not "reporting".
>
> Dominique


Re: [Subquery] Transform Documents across Collections

2020-08-12 Thread Norbert Kutasi
   {
  "id":"3",
      "reporting_to":[2],
  "first_name":["Thomas"],
  "_version_":1674811297814806528}]
}}]
  }}

I don't remember when I changed it to !fields; the documentation had it
with !terms, which does not seem to work either:
q=name:john&fl=name,id,depts:[subquery]&depts.q={!terms f=id v=$row.dept_id}&depts.rows=10

Erick, thanks for the suggestion of adding:
&subordinate.collection=Collection2

The solution is
http://torvmlnx03.temenosgroup.com:8983/solr/Collection1/query?q=*&fl=*,subordinate:[subquery]&subordinate.fl=*&subordinate.collection=Collection2&subordinate.q={!term%20f=reporting_to%20v=$row.id}
Regards,
Norbert
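
The working request can also be assembled programmatically, which avoids hand-escaping the local-params syntax. This is a minimal sketch under stated assumptions: the function name build_subquery_url and the localhost base URL are hypothetical, and only parameters that appear in this thread are used (subordinate.collection in place of fromIndex, and the term query parser).

```python
from urllib.parse import urlencode


def build_subquery_url(base_url, collection, child_collection, child_field,
                       label="subordinate"):
    """Build a /query URL whose [subquery] targets another collection."""
    params = {
        "q": "*",
        "fl": f"*,{label}:[subquery]",
        f"{label}.fl": "*",
        # Cross-collection lookup via <label>.collection, not fromIndex.
        f"{label}.collection": child_collection,
        # $row.id is resolved by Solr for each parent document.
        f"{label}.q": f"{{!term f={child_field} v=$row.id}}",
    }
    return f"{base_url}/{collection}/query?{urlencode(params)}"


# Base URL is an assumption; substitute the node you are querying.
url = build_subquery_url(
    "http://localhost:8983/solr", "Collection1", "Collection2", "reporting_to"
)
print(url)
```

urlencode percent-encodes the braces, brackets and "$" and turns the spaces into "+", so the URL can be pasted into a client without further escaping.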


On Wed, 12 Aug 2020 at 14:41, Erick Erickson 
wrote:

> This works from a browser:
>
> http://localhost:8981/solr/Collection1/query?q=*&fl=*,subordinate:[subquery]&subordinate.q=*:*&subordinate.fl=*&subordinate.collection=Collection2
>
> One problem you’re having is that “fromIndex” is a _core_ not a
> collection. See:
> https://lucene.apache.org/solr/guide/8_2/transforming-result-documents.html
>
> It’s vaguely possible you could make it work by specifying something like
> fromIndex=Collection2_shard1_replica_n1
> if it was colocated on the node you’re querying, but you don’t want to go
> there…
>
> Best,
> Erick
>


Re: [Subquery] Transform Documents across Collections

2020-08-12 Thread Norbert Kutasi
The version it's working on is 8.5!


On Wed, 12 Aug 2020 at 17:16, Norbert Kutasi 
wrote:

> I see what you mean; however, the request results in a Cartesian product
> because of subordinate.q=*:* :
>
> http://localhost:8981/solr/Collection1/query?q=*&fl=*,subordinate:[subquery]&subordinate.q=*:*&subordinate.fl=*&subordinate.collection=Collection2
>
> {
>   "responseHeader":{
>     "zkConnected":true,
>     "status":0,
>     "QTime":0,
>     "params":{
>       "q":"*",
>       "fl":"*,subordinate:[subquery]",
>       "subordinate.fl":"*",
>       "subordinate.collection":"Collection2",
>       "subordinate.q":"*:*"}},
>   "response":{"numFound":2,"start":0,"docs":[
>       {
>         "id":"1",
>         "first_name":["John"],
>         "_version_":1674811207656144896,
>         "subordinate":{"numFound":3,"start":0,"docs":[
>             {
>               "id":"3",
>               "reporting_to":[2],
>               "first_name":["Thomas"],
>               "_version_":1674811297814806528},
>             {
>               "id":"4",
>               "reporting_to":[1],
>               "first_name":["Charles"],
>               "_version_":1674811297816903680},
>             {
>               "id":"5",
>               "reporting_to":[3],
>               "first_name":["Susan"],
>               "_version_":1674811297816903681}]
>         }},
>       {
>         "id":"2",
>         "first_name":["Peter"],
>         "_version_":1674811207659290624,
>         "subordinate":{"numFound":3,"start":0,"docs":[
>             {
>               "id":"3",
>               "reporting_to":[2],
>               "first_name":["Thomas"],
>               "_version_":1674811297814806528},
>             {
>               "id":"4",
>               "reporting_to":[1],
>               "first_name":["Charles"],
>               "_version_":1674811297816903680},
>             {
>               "id":"5",
>               "reporting_to":[3],
>               "first_name":["Susan"],
>               "_version_":1674811297816903681}]
>         }}]
>   }}
>
>
> Once I add back the "join" criteria q={!fields f=reporting_to v=$row.id},
> the error comes back...
>
>
> http://localhost:8983/solr/Collection1/query?q=*&fl=*,subordinate:[subquery]&subordinate.fl=*&subordinate.collection=Collection2&subordinate.q={!fields f=reporting_to v=$row.id}
>
> {
>   "error":{
>     "metadata":[
>       "error-class","org.apache.solr.common.SolrException",
>       "root-error-class","org.apache.solr.common.SolrException"],
>     "msg":"while invoking subordinate:[subquery] on doc=SolrDocument{id=stored,indexed,tokenized,omitNorms,indexOptions=DOCS, first_name=[stored,index",
>     "code":400}}
>
>
> While I was writing an extensive response, I just came across what seems to
> be the solution:
>
>
> http://localhost:8983/solr/Collection1/query?q=*&fl=*,subordinate:[subquery]&subordinate.fl=*&subordinate.collection=Collection2&subordinate.q={!term f=reporting_to v=$row.id}
>
> {
>   "responseHeader":{
> "zkConnected":true,
> "status":0,
> "QTime":1,
> "params":{
>       "json":"{\r\n  params: {\r\nq: \"*\",\r\nfq: \"*\",\r\n rows: 5,\r\n\tfl:\"*,subordinate:[subquery]\",\r\n subordinate.fl:\"*\",\r\nsubordinate.q:\"{!term f=reporting_to v=$row.id}\",\r\nsubordinate.fq:\"*\",\r\n subordinate.rows:\"5\",\r\nsubordinate.collection:\"Collection2\"\r\n }\r\n}\r\n\r\n\r\n\r\n"}},
>   "response":{"numFound":2,"start":0,"docs":[
>   {
> "id":"1",
> "first_name":["John"],
> "_versi