Here is the pseudocode:

rollup(sort(fetch(gatherNodes()), by="schematype asc"), over="schematype")
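
To illustrate why the sort matters, here is a rough Python analogy. It is not Solr's actual implementation; it only assumes that rollup emits one tuple per run of adjacent records sharing the same "over" value. The list holds the schematype values in the order they appear in the unsorted stream quoted below, and `rollup_count` is a hypothetical helper name.

```python
from itertools import groupby

# schematype values in the order the unsorted stream delivered them
stream = [
    "Company", "Founding Event", "Customer", "Founding Event", "Employment",
    "Founding Event", "Employment", "Employment", "Employment", "Employment",
    "Customer",
]


def rollup_count(values):
    """Count each run of equal adjacent values, emitting one pair per run."""
    return [(key, sum(1 for _ in run)) for key, run in groupby(values)]


# Unsorted input: the same key shows up in several separate runs.
unsorted_result = rollup_count(stream)

# Sorted input: every key forms exactly one run, so each key is counted once.
sorted_result = rollup_count(sorted(stream))

print(unsorted_result)
print(sorted_result)
```

On the unsorted list this produces the same repeated buckets as the result quoted below (for example 'Employment' once with a count of 1 and again with a count of 4). After sorting, each schematype appears exactly once.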

Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, Jun 22, 2017 at 5:19 PM, Joel Bernstein <joels...@gmail.com> wrote:

> You'll need to use the sort expression to sort the nodes by schematype
> first. The rollup expression is doing a MapReduce rollup that requires
> the records to be sorted by the "over" fields.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Thu, Jun 22, 2017 at 2:49 PM, Pratik Patel <pra...@semandex.net> wrote:
>
>> Hi,
>>
>> I have a streaming expression which uses the rollup function. My
>> understanding is that rollup takes an incoming stream and aggregates over
>> the given buckets. However, with the following query the result contains
>> duplicate tuples.
>>
>> The streaming expression is as follows.
>>
>> rollup(
>>     fetch(
>>         collection1,
>>         gatherNodes(
>>             collection1,
>>             gatherNodes(collection1,
>>                         walk="54227b412a1c4e574f88f2bb->eventParticipantID",
>>                         gather="eventID"
>>             ),
>>             walk="eventID->conceptid",
>>             gather="conceptid",
>>             trackTraversal="true", scatter="branches,leaves"
>>         ),
>>         fl="schematype",
>>         on="node=conceptid"
>>     ),
>>     over="schematype",
>>     count(schematype)
>> )
>>
>> The result returned is as follows.
>>
>> {
>>   "result-set": {
>>     "docs": [
>>       {
>>         "count(schematype)": 1,
>>         "schematype": "Company"
>>       },
>>       {
>>         "count(schematype)": 1,
>>         "schematype": "Founding Event"
>>       },
>>       {
>>         "count(schematype)": 1,
>>         "schematype": "Customer"
>>       },
>>       {
>>         "count(schematype)": 1,
>>         "schematype": "Founding Event"  // duplicate
>>       },
>>       {
>>         "count(schematype)": 1,
>>         "schematype": "Employment"      // duplicate
>>       },
>>       {
>>         "count(schematype)": 1,
>>         "schematype": "Founding Event"
>>       },
>>       {
>>         "count(schematype)": 4,
>>         "schematype": "Employment"
>>       },
>>       ......
>>     ]
>>   }
>> }
>>
>> As you can see, there is more than one tuple for 'Founding Event' and
>> for 'Employment'.
>>
>> Am I missing something here?
>>
>> The following is the content of the stream that rollup wraps, in case
>> it helps.
>>
>> // stream on which rollup is working
>> {
>>   "result-set": {
>>     "docs": [
>>       {
>>         "node": "54227b412a1c4e574f88f2bb",
>>         "schematype": "Company",
>>         "collection": "collection1",
>>         "field": "node",
>>         "level": 0
>>       },
>>       {
>>         "node": "543004f0c92c0a651166aea5",
>>         "schematype": "Founding Event",
>>         "collection": "collection1",
>>         "field": "eventID",
>>         "level": 1
>>       },
>>       {
>>         "node": "543004f0c92c0a651166ae99",
>>         "schematype": "Customer",
>>         "collection": "collection1",
>>         "field": "eventID",
>>         "level": 1
>>       },
>>       {
>>         "node": "543004f0c92c0a651166aea1",
>>         "schematype": "Founding Event",
>>         "collection": "collection1",
>>         "field": "eventID",
>>         "level": 1
>>       },
>>       {
>>         "node": "543004f0c92c0a651166ae78",
>>         "schematype": "Employment",
>>         "collection": "collection1",
>>         "field": "eventID",
>>         "level": 1
>>       },
>>       {
>>         "node": "54ee6178b54c1d65412b5f9f",
>>         "schematype": "Founding Event",
>>         "collection": "collection1",
>>         "field": "eventID",
>>         "level": 1
>>       },
>>       {
>>         "node": "543004f0c92c0a651166ae7c",
>>         "schematype": "Employment",
>>         "collection": "collection1",
>>         "field": "eventID",
>>         "level": 1
>>       },
>>       {
>>         "node": "543004f0c92c0a651166ae80",
>>         "schematype": "Employment",
>>         "collection": "collection1",
>>         "field": "eventID",
>>         "level": 1
>>       },
>>       {
>>         "node": "543004f0c92c0a651166ae8a",
>>         "schematype": "Employment",
>>         "collection": "collection1",
>>         "field": "eventID",
>>         "level": 1
>>       },
>>       {
>>         "node": "543004f0c92c0a651166ae94",
>>         "schematype": "Employment",
>>         "collection": "collection1",
>>         "field": "eventID",
>>         "level": 1
>>       },
>>       {
>>         "node": "543004f0c92c0a651166ae9d",
>>         "schematype": "Customer",
>>         "collection": "collection1",
>>         "field": "eventID",
>>         "level": 1
>>       },
>>       {
>>         "EOF": true,
>>         "RESPONSE_TIME": 38
>>       }
>>     ]
>>   }
>> }
>>
>> If I rollup on the level field then the results are as expected but not
>> when the field is schematype. Any idea what's going on here?
>>
>>
>> Thanks,
>>
>> Pratik
>>
>
>
