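You'll need to use the sort expression to sort the nodes by schematype
first. The rollup expression does a MapReduce-style rollup, which requires
the records to be sorted by the "over" fields. On unsorted input it emits a
new tuple every time the value of the "over" field changes, which is why
the same schematype shows up more than once.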
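
To illustrate the behaviour (this is only a rough Python sketch, not Solr
code): itertools.groupby also groups consecutive equal keys only, so it
reproduces the counts from your output when fed the schematype values in the
order your stream delivers them. The helper name rollup_counts is made up
for this example. In the real expression, the equivalent fix would be
something like wrapping the fetch(...) in sort(..., by="schematype asc")
before handing it to rollup; please check the exact syntax against the
reference guide for your Solr version.

```python
from itertools import groupby

# schematype values in the order the unsorted stream delivers them
# (copied from the quoted stream output in this thread)
stream = [
    "Company", "Founding Event", "Customer", "Founding Event",
    "Employment", "Founding Event", "Employment", "Employment",
    "Employment", "Employment", "Customer",
]


def rollup_counts(values):
    """Count each run of consecutive equal keys (hypothetical helper).

    Like a streaming rollup, this never looks back: a key that reappears
    after a different key starts a new bucket.
    """
    return [(key, sum(1 for _ in run)) for key, run in groupby(values)]


unsorted_result = rollup_counts(stream)          # duplicate buckets
sorted_result = rollup_counts(sorted(stream))    # one bucket per schematype

print(unsorted_result)
print(sorted_result)
```

With the input sorted first, every schematype forms a single run, so each
one gets exactly one bucket.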

Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, Jun 22, 2017 at 2:49 PM, Pratik Patel <pra...@semandex.net> wrote:

> Hi,
>
> I have a streaming expression which uses the rollup function. My
> understanding is that rollup takes an incoming stream and aggregates over
> the given buckets. However, with the following query the result contains
> duplicate tuples.
>
> Following is the streaming expression.
>
> rollup(
>     fetch(
>         collection1,
>         gatherNodes(
>             collection1,
>             gatherNodes(collection1,
>                         walk="54227b412a1c4e574f88f2bb->eventParticipantID",
>                         gather="eventID"
>             ),
>             walk="eventID->conceptid",
>             gather="conceptid",
>             trackTraversal="true", scatter="branches,leaves"
>         ),
>         fl="schematype",
>         on="node=conceptid"
>     ),
>     over="schematype",
>     count(schematype)
> )
>
> The result returned is as follows.
>
> {
>   "result-set": {
>     "docs": [
>       {
>         "count(schematype)": 1,
>         "schematype": "Company"
>       },
>       {
>         "count(schematype)": 1,
>         "schematype": "Founding Event"
>       },
>       {
>         "count(schematype)": 1,
>         "schematype": "Customer"
>       },
>       {
>         "count(schematype)": 1,
>         "schematype": "Founding Event"  // duplicate
>       },
>       {
>         "count(schematype)": 1,
>         "schematype": "Employment"      // duplicate
>       },
>       {
>         "count(schematype)": 1,
>         "schematype": "Founding Event"
>       },
>       {
>         "count(schematype)": 4,
>         "schematype": "Employment"
>       },......
>      ]
>  }
>
> As you can see, there is more than one tuple for 'Founding Event' and
> for 'Employment'.
>
> Am I missing something here?
>
> Following is the content of the stream that rollup wraps, if it helps.
>
> // stream on which rollup is working
> {
>   "result-set": {
>     "docs": [
>       {
>         "node": "54227b412a1c4e574f88f2bb",
>         "schematype": "Company",
>         "collection": "collection1",
>         "field": "node",
>         "level": 0
>       },
>       {
>         "node": "543004f0c92c0a651166aea5",
>         "schematype": "Founding Event",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166ae99",
>         "schematype": "Customer",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166aea1",
>         "schematype": "Founding Event",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166ae78",
>         "schematype": "Employment",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "54ee6178b54c1d65412b5f9f",
>         "schematype": "Founding Event",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166ae7c",
>         "schematype": "Employment",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166ae80",
>         "schematype": "Employment",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166ae8a",
>         "schematype": "Employment",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166ae94",
>         "schematype": "Employment",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166ae9d",
>         "schematype": "Customer",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "EOF": true,
>         "RESPONSE_TIME": 38
>       }
>     ]
>   }
> }
>
> If I roll up on the level field then the results are as expected, but not
> when the field is schematype. Any idea what's going on here?
>
>
> Thanks,
>
> Pratik
>
