Yes, that was the missing piece. Thanks a lot!

On Thu, Jun 22, 2017 at 5:20 PM, Joel Bernstein <joels...@gmail.com> wrote:
> Here is the pseudocode:
>
> rollup(sort(fetch(gatherNodes())))
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Thu, Jun 22, 2017 at 5:19 PM, Joel Bernstein <joels...@gmail.com> wrote:
>
> > You'll need to use the sort expression to sort the nodes by schematype
> > first. The rollup expression is doing a MapReduce rollup that requires
> > the records to be sorted by the "over" fields.
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> > On Thu, Jun 22, 2017 at 2:49 PM, Pratik Patel <pra...@semandex.net> wrote:
> >
> >> Hi,
> >>
> >> I have a streaming expression which uses the rollup function. My
> >> understanding is that rollup takes an incoming stream and aggregates
> >> over the given buckets. However, with the following query the result
> >> contains duplicate tuples.
> >>
> >> Following is the streaming expression.
> >>
> >> rollup(
> >>   fetch(
> >>     collection1,
> >>     gatherNodes(
> >>       collection1,
> >>       gatherNodes(collection1,
> >>         walk="54227b412a1c4e574f88f2bb->eventParticipantID",
> >>         gather="eventID"
> >>       ),
> >>       walk="eventID->conceptid",
> >>       gather="conceptid",
> >>       trackTraversal="true", scatter="branches,leaves"
> >>     ),
> >>     fl="schematype",
> >>     on="node=conceptid"
> >>   ),
> >>   over="schematype",
> >>   count(schematype)
> >> )
> >>
> >> The result returned is as follows.
> >>
> >> {
> >>   "result-set": {
> >>     "docs": [
> >>       {
> >>         "count(schematype)": 1,
> >>         "schematype": "Company"
> >>       },
> >>       {
> >>         "count(schematype)": 1,
> >>         "schematype": "Founding Event"
> >>       },
> >>       {
> >>         "count(schematype)": 1,
> >>         "schematype": "Customer"
> >>       },
> >>       {
> >>         "count(schematype)": 1,
> >>         "schematype": "Founding Event" // duplicate
> >>       },
> >>       {
> >>         "count(schematype)": 1,
> >>         "schematype": "Employment" // duplicate
> >>       },
> >>       {
> >>         "count(schematype)": 1,
> >>         "schematype": "Founding Event"
> >>       },
> >>       {
> >>         "count(schematype)": 4,
> >>         "schematype": "Employment"
> >>       },......
> >>     ]
> >>   }
> >>
> >> As you can see, there is more than one tuple for
> >> 'Founding Event'/'Employment'.
> >>
> >> Am I missing something here?
> >>
> >> Following is the content of the stream which is wrapped by rollup, if
> >> it helps.
> >>
> >> // stream on which rollup is working
> >> {
> >>   "result-set": {
> >>     "docs": [
> >>       {
> >>         "node": "54227b412a1c4e574f88f2bb",
> >>         "schematype": "Company",
> >>         "collection": "collection1",
> >>         "field": "node",
> >>         "level": 0
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166aea5",
> >>         "schematype": "Founding Event",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166ae99",
> >>         "schematype": "Customer",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166aea1",
> >>         "schematype": "Founding Event",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166ae78",
> >>         "schematype": "Employment",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "54ee6178b54c1d65412b5f9f",
> >>         "schematype": "Founding Event",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166ae7c",
> >>         "schematype": "Employment",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166ae80",
> >>         "schematype": "Employment",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166ae8a",
> >>         "schematype": "Employment",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166ae94",
> >>         "schematype": "Employment",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "node": "543004f0c92c0a651166ae9d",
> >>         "schematype": "Customer",
> >>         "collection": "collection1",
> >>         "field": "eventID",
> >>         "level": 1
> >>       },
> >>       {
> >>         "EOF": true,
> >>         "RESPONSE_TIME": 38
> >>       }
> >>     ]
> >>   }
> >> }
> >>
> >> If I rollup on the level field then the results are as expected, but
> >> not when the field is schematype. Any idea what's going on here?
> >>
> >> Thanks,
> >>
> >> Pratik
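
For anyone finding this thread later: the behaviour Joel describes (rollup only merges adjacent tuples that share the "over" value) can be reproduced outside Solr. The sketch below is only an analogy, not Solr code. It uses Python's itertools.groupby, which also groups consecutive equal keys, on the schematype values in the order they appear in the stream quoted above. The helper name rollup_counts is made up for illustration.

```python
from itertools import groupby

# schematype values in the order they appear in the stream quoted above
schematypes = [
    "Company", "Founding Event", "Customer", "Founding Event", "Employment",
    "Founding Event", "Employment", "Employment", "Employment", "Employment",
    "Customer",
]


def rollup_counts(values):
    """Count each run of equal adjacent values (hypothetical helper).

    groupby starts a new group whenever the key changes, so unsorted
    input yields several groups for the same key.
    """
    return [(key, sum(1 for _ in run)) for key, run in groupby(values)]


# Unsorted input: the same schematype shows up in more than one bucket.
unsorted_counts = rollup_counts(schematypes)

# Sorted input: one bucket per schematype.
sorted_counts = rollup_counts(sorted(schematypes))

print(unsorted_counts)
print(sorted_counts)
```

The unsorted run gives the same fragmented counts as the rollup output in the original question (including the Employment bucket with a count of 4), while the sorted run gives one count per schematype. In the streaming expression itself, the equivalent fix is to wrap the fetch in a sort, for example sort(fetch(...), by="schematype asc"), before passing it to rollup. Rolling up on level presumably worked without a sort because the stream already arrives ordered by level.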