Yes, that was the missing piece. Thanks a lot!

On Thu, Jun 22, 2017 at 5:20 PM, Joel Bernstein <joels...@gmail.com> wrote:

> Here is the pseudo code:
>
> rollup(sort(fetch(gatherNodes()), by="schematype asc"),
>        over="schematype", count(schematype))
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Thu, Jun 22, 2017 at 5:19 PM, Joel Bernstein <joels...@gmail.com> wrote:
>
> > You'll need to use the sort expression to sort the nodes by schematype
> > first. The rollup expression is doing a MapReduce rollup that requires
> > the records to be sorted by the "over" fields.
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> > On Thu, Jun 22, 2017 at 2:49 PM, Pratik Patel <pra...@semandex.net> wrote:
> >
> >> Hi,
> >>
> >> I have a streaming expression that uses the rollup function. My
> >> understanding is that rollup takes an incoming stream and aggregates
> >> over the given buckets. However, with the following query the result
> >> contains duplicate tuples.
> >>
> >> The streaming expression is as follows.
> >>
> >> rollup(
> >>     fetch(
> >>         collection1,
> >>         gatherNodes(
> >>             collection1,
> >>             gatherNodes(collection1,
> >>                         walk="54227b412a1c4e574f88f2bb->eventParticipantID",
> >>                         gather="eventID"
> >>             ),
> >>             walk="eventID->conceptid",
> >>             gather="conceptid",
> >>             trackTraversal="true", scatter="branches,leaves"
> >>         ),
> >>         fl="schematype",
> >>         on="node=conceptid"
> >>     ),
> >>     over="schematype",
> >>     count(schematype)
> >> )
> >>
> >> The result returned is as follows.
> >>
> >> {
> >>   "result-set": {
> >>     "docs": [
> >>       {
> >>         "count(schematype)": 1,
> >>         "schematype": "Company"
> >>       },
> >>       {
> >>         "count(schematype)": 1,
> >>         "schematype": "Founding Event"
> >>       },
> >>       {
> >>         "count(schematype)": 1,
> >>         "schematype": "Customer"
> >>       },
> >>       {
> >>         "count(schematype)": 1,
> >>         "schematype": "Founding Event"  // duplicate
> >>       },
> >>       {
> >>         "count(schematype)": 1,
> >>         "schematype": "Employment"      // duplicate
> >>       },
> >>       {
> >>         "count(schematype)": 1,
> >>         "schematype": "Founding Event"
> >>       },
> >>       {
> >>         "count(schematype)": 4,
> >>         "schematype": "Employment"
> >>       },......
> >>     ]
> >>   }
> >> }
> >>
> >> As you can see, there is more than one tuple for 'Founding Event' and
> >> for 'Employment'.
> >>
> >> Am I missing something here?
> >>
> >> In case it helps, here is the content of the stream that rollup wraps.
> >>
> >> // stream on which rollup is working
> >> {
> >>   "result-set": {
> >>     "docs": [
> >>       {
> >>         "node": "54227b412a1c4e574f88f2bb",
> >>         "schematype": "Company",
> >>         "collection": "collection1",
> >>         "field": "node",
> >>         "level": 0
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166aea5",
> >>         "schematype": "Founding Event",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166ae99",
> >>         "schematype": "Customer",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166aea1",
> >>         "schematype": "Founding Event",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166ae78",
> >>         "schematype": "Employment",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "54ee6178b54c1d65412b5f9f",
> >>         "schematype": "Founding Event",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166ae7c",
> >>         "schematype": "Employment",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166ae80",
> >>         "schematype": "Employment",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166ae8a",
> >>         "schematype": "Employment",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166ae94",
> >>         "schematype": "Employment",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166ae9d",
> >>         "schematype": "Customer",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "EOF": true,
> >>         "RESPONSE_TIME": 38
> >>       }
> >>     ]
> >>   }
> >> }
> >>
> >> If I roll up on the level field, the results are as expected, but not
> >> when the field is schematype. Any idea what's going on here?
> >>
> >>
> >> Thanks,
> >>
> >> Pratik
> >>
> >
> >
>
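
The duplicates reported in this thread can be reproduced outside Solr. Below is a minimal Python sketch, assuming that rollup simply starts a new bucket whenever the "over" value changes between consecutive tuples. The helper name rollup_count is made up for illustration and is not part of Solr; the input list is the schematype column of the stream posted above, in the same order.

```python
from itertools import groupby

# schematype values in the order they appear in the stream posted in the
# thread (the trailing EOF tuple is omitted).
STREAM = [
    "Company", "Founding Event", "Customer", "Founding Event",
    "Employment", "Founding Event", "Employment", "Employment",
    "Employment", "Employment", "Customer",
]


def rollup_count(values):
    """Emit one (key, count) bucket per run of equal adjacent values.

    Hypothetical helper: it only models the "new bucket whenever the key
    changes" behaviour assumed above, not Solr's actual implementation.
    """
    return [(key, sum(1 for _ in run)) for key, run in groupby(values)]


# Unsorted input: the same key can start several separate buckets.
unsorted_buckets = rollup_count(STREAM)

# Sorted input: equal keys are adjacent, so each key yields one bucket.
sorted_buckets = rollup_count(sorted(STREAM))

print(unsorted_buckets)
print(sorted_buckets)
```

On the unsorted list this produces separate buckets for 'Founding Event' and 'Employment' with counts 1, 1, 1, 1, 1, 1, 4, which matches the result in the original question. After sorting, each schematype appears exactly once. Rolling up on level presumably worked because the stream already arrives grouped by level.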
