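You'll need to use the sort expression to sort the nodes by schematype first. The rollup expression does a MapReduce-style rollup that requires the records to be sorted by the "over" fields. It starts a new bucket each time the "over" value changes, so unsorted input produces repeated buckets. Rolling up on level looks correct only because the stream already arrives ordered by level.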
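A rough sketch of how that could look for the expression quoted below. It only wraps the existing fetch(...) in sort(...); the ascending direction is an arbitrary choice, and this has not been run against your collection. Keep in mind that sort re-orders the tuples in memory, so it is best suited to streams of modest size like this one.

```
rollup(
    sort(
        fetch(
            collection1,
            gatherNodes(
                collection1,
                gatherNodes(collection1,
                    walk="54227b412a1c4e574f88f2bb->eventParticipantID",
                    gather="eventID"
                ),
                walk="eventID->conceptid",
                gather="conceptid",
                trackTraversal="true", scatter="branches,leaves"
            ),
            fl="schematype",
            on="node=conceptid"
        ),
        by="schematype asc"
    ),
    over="schematype",
    count(schematype)
)
```

With the tuples sorted, each schematype value should reach rollup as one contiguous run and come out as a single bucket.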
Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, Jun 22, 2017 at 2:49 PM, Pratik Patel <pra...@semandex.net> wrote:

> Hi,
>
> I have a streaming expression which uses rollup function. My understanding
> is that rollup takes an incoming stream and aggregates over given buckets.
> However, with following query the result contains duplicate tuples.
>
> Following is the streaming expression.
>
> rollup(
>     fetch(
>         collection1,
>         gatherNodes(
>             collection1,
>             gatherNodes(collection1,
>                 walk="54227b412a1c4e574f88f2bb->eventParticipantID",
>                 gather="eventID"
>             ),
>             walk="eventID->conceptid",
>             gather="conceptid",
>             trackTraversal="true", scatter="branches,leaves"
>         ),
>         fl="schematype",
>         on="node=conceptid"
>     ),
>     over="schematype",
>     count(schematype)
> )
>
> The result returned is as follows.
>
> {
>   "result-set": {
>     "docs": [
>       {
>         "count(schematype)": 1,
>         "schematype": "Company"
>       },
>       {
>         "count(schematype)": 1,
>         "schematype": "Founding Event"
>       },
>       {
>         "count(schematype)": 1,
>         "schematype": "Customer"
>       },
>       {
>         "count(schematype)": 1,
>         "schematype": "Founding Event" // duplicate
>       },
>       {
>         "count(schematype)": 1,
>         "schematype": "Employment" // duplicate
>       },
>       {
>         "count(schematype)": 1,
>         "schematype": "Founding Event"
>       },
>       {
>         "count(schematype)": 4,
>         "schematype": "Employment"
>       },......
>     ]
>   }
>
> As you can see, there are more than one tuples for 'Founding
> Event'/'Employment'
>
> Am I missing something here?
>
> Following is the content of stream which is wrapped by rollup, if it helps.
>
> // stream on which rollup is working
> {
>   "result-set": {
>     "docs": [
>       {
>         "node": "54227b412a1c4e574f88f2bb",
>         "schematype": "Company",
>         "collection": "collection1",
>         "field": "node",
>         "level": 0
>       },
>       {
>         "node": "543004f0c92c0a651166aea5",
>         "schematype": "Founding Event",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166ae99",
>         "schematype": "Customer",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166aea1",
>         "schematype": "Founding Event",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166ae78",
>         "schematype": "Employment",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "54ee6178b54c1d65412b5f9f",
>         "schematype": "Founding Event",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166ae7c",
>         "schematype": "Employment",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166ae80",
>         "schematype": "Employment",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166ae8a",
>         "schematype": "Employment",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166ae94",
>         "schematype": "Employment",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "node": "543004f0c92c0a651166ae9d",
>         "schematype": "Customer",
>         "collection": "collection1",
>         "field": "eventID",
>         "level": 1
>       },
>       {
>         "EOF": true,
>         "RESPONSE_TIME": 38
>       }
>     ]
>   }
> }
>
> If I rollup on the level field then the results are as expected but not
> when the field is schematype. Any idea what's going on here?
>
>
> Thanks,
>
> Pratik
>