Great, I think I can achieve what I want by combining the "select" and "cartesian" functions in my expression. Thanks a lot for the help!
Regards,
Pratik

On Wed, Mar 15, 2017 at 10:21 AM, Joel Bernstein <joels...@gmail.com> wrote:

> I haven't created the jira ticket for this yet. It's fairly quick to
> implement but the Solr 6.5 release is just around the corner. So most
> likely it would be in the Solr 6.6. It will be committed fairly soon
> though so if you want to use master, or branch_6x you can experiment with
> it earlier.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Tue, Mar 14, 2017 at 7:53 PM, Pratik Patel <pra...@semandex.net> wrote:
>
> > Wow, this is interesting! Is it going to be a new addition to solr or is
> > it already available? I cannot find it in the documentation. I am using
> > solr version 6.4.1.
> >
> > On Tue, Mar 14, 2017 at 7:41 PM, Joel Bernstein <joels...@gmail.com>
> > wrote:
> >
> > > I'm going to add a "cartesian" function that creates a cartesian product
> > > from a multi-value field. This will turn a single tuple with a
> > > multi-value field into multiple tuples with a single-value field. This
> > > will allow the fetch operation to work on ancestors. It also has many
> > > other use cases. Sample syntax:
> > >
> > > fetch(collection1,
> > >       cartesian(field=ancestors,
> > >                 having(gatherNodes(collection1,
> > >                                    search(collection1,
> > >                                           q="*:*",
> > >                                           fl="conceptid",
> > >                                           sort="conceptid asc",
> > >                                           fq=storeid:"524efcfd505637004b1f6f24",
> > >                                           fq=tags:"Company",
> > >                                           fq=tags:"Prospects2",
> > >                                           qt="/export"),
> > >                                    walk=conceptid->eventParticipantID,
> > >                                    gather="eventID",
> > >                                    trackTraversal="true",
> > >                                    scatter="leaves",
> > >                                    count(*)),
> > >                        gt(count(*),1))),
> > >       fl="concept_name",
> > >       on="ancestors=conceptid")
> > >
> > > Joel Bernstein
> > > http://joelsolr.blogspot.com/
> > >
> > > On Tue, Mar 14, 2017 at 11:51 AM, Pratik Patel <pra...@semandex.net>
> > > wrote:
> > >
> > > > Hi, Joel. Thanks for the reply.
> > > >
> > > > So, I need to do some graph traversal queries for my use case. In my
> > > > data set, I have concepts and events.
> > > >
> > > > > concept : {name, address, bio ......},
> > > > > event: {name, date, participantIds:[concept1, concept2...] .....}
> > > >
> > > > Events connect two or more concepts. So, this is graph data where
> > > > concepts are connected to each other via events. Each event stores
> > > > links to the concepts that it connects. So the field which stores those
> > > > links is multivalued. This is a natural structure for my data, on which
> > > > I wanted to do some advanced graph traversal queries with a streaming
> > > > expression. However, the gatherNodes() function does not support
> > > > multivalued fields yet. So, I changed my index structure to be
> > > > something like this.
> > > >
> > > > > concept : {conceptId, name, address, bio ......},
> > > > > event: {eventId, name, date, participantIds:[concept1, concept2...] .....}
> > > > > *****create eventLink documents for each participantId in each event********
> > > > > eventLink:{eventid, conceptid, id}
> > > >
> > > > I created eventLink documents from each event so that I can traverse
> > > > the data using the gatherNodes() function. With this change, I was able
> > > > to do the graph query and get the ids of the concepts I wanted. However,
> > > > I only have the ids of the concepts. Now, using these ids, I want
> > > > additional data from the concept documents, like concept_name or address
> > > > or bio. This is what I was trying to achieve with the fetch() function,
> > > > but it seems I hit the multivalued limitation again :) The reason I am
> > > > storing only the ids in eventLink documents is that I don't want to
> > > > duplicate data unnecessarily. It would complicate keeping the index
> > > > consistent when deletes/updates happen.
> > > > Is there any way I can achieve this?
> > > >
> > > > Thanks!
> > > > Pratik
> > > >
> > > > On Tue, Mar 14, 2017 at 11:24 AM, Joel Bernstein <joels...@gmail.com>
> > > > wrote:
> > > >
> > > > > Wow that's an interesting expression!
> > > > >
> > > > > The problem is that you are trying to fetch using the ancestors
> > > > > field, which is multi-valued. fetch doesn't support multi-value join
> > > > > keys. I never thought someone might try to do that.
> > > > >
> > > > > So, you're attempting to get the concept names for ancestors?
> > > > >
> > > > > Can you explain a little more about the use case?
> > > > >
> > > > > Joel Bernstein
> > > > > http://joelsolr.blogspot.com/
> > > > >
> > > > > On Tue, Mar 14, 2017 at 11:08 AM, Pratik Patel <pra...@semandex.net>
> > > > > wrote:
> > > > >
> > > > > > I have two types of documents in my index: eventLink and
> > > > > > conceptData.
> > > > > >
> > > > > > eventLink ---- { ancestors:[<id1>,<id2>] }
> > > > > > conceptData-----{ id:id1, conceptid, concept_name .....<some more data> }
> > > > > >
> > > > > > Both are in the same collection.
> > > > > > In my query, I am doing a gatherNodes query wrapped in some other
> > > > > > function, and ultimately I am getting a bunch of eventLink
> > > > > > documents. Now, I am trying to get the conceptData document for each
> > > > > > id specified in eventLink's ancestors field. I am trying to do that
> > > > > > using the fetch() function. Here is a simplified form of my query.
> > > > > >
> > > > > > > fetch(collection1,
> > > > > > >       function to get eventLinks,
> > > > > > >       fl="concept_name",
> > > > > > >       on="ancestors=conceptid"
> > > > > > > )
> > > > > >
> > > > > > On executing this query, I am getting back the same set of documents
> > > > > > that results from my streaming expression containing the
> > > > > > gatherNodes() function. No fields are added to the tuples.
> > > > > > From the documentation, it seems like fetch would fetch additional
> > > > > > data and add it to the tuples. However, that is not happening. The
> > > > > > resulting tuples do not have the concept_name field in them. What
> > > > > > am I missing here? I really need to get this additional data from
> > > > > > one solr query so that I don't have to iterate over the eventLinks
> > > > > > and get additional data by individual queries. That would badly
> > > > > > impact performance.
> > > > > > Any suggestions?
> > > > > >
> > > > > > Here is my actual query and the response.
> > > > > >
> > > > > > > fetch(collection1,
> > > > > > >       having(
> > > > > > >              gatherNodes(collection1,
> > > > > > >                          search(collection1,q="*:*",fl="conceptid",sort="conceptid asc",fq=storeid:"524efcfd505637004b1f6f24",fq=tags:"Company",fq=tags:"Prospects2", qt="/export"),
> > > > > > >                          walk=conceptid->eventParticipantID,
> > > > > > >                          gather="eventID",
> > > > > > >                          trackTraversal="true", scatter="leaves",
> > > > > > >                          count(*)
> > > > > > >              ),
> > > > > > >              gt(count(*),1)
> > > > > > >       ),
> > > > > > >       fl="concept_name",
> > > > > > >       on="ancestors=conceptid"
> > > > > > > )
> > > > > >
> > > > > > Response :
> > > > > >
> > > > > > > {
> > > > > > >   "result-set": {
> > > > > > >     "docs": [
> > > > > > >       {
> > > > > > >         "node": "524f03355056c8b53b4ed199",
> > > > > > >         "field": "eventID",
> > > > > > >         "level": 1,
> > > > > > >         "count(*)": 2,
> > > > > > >         "collection": "collection1",
> > > > > > >         "ancestors": [
> > > > > > >           "524f02845056c8b53b4e9871",
> > > > > > >           "524f02755056c8b53b4e9269"
> > > > > > >         ]
> > > > > > >       },
> > > > > > >       .........
> > > > > > > }
> > > > > >
> > > > > > Thanks,
> > > > > > Pratik
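
For anyone following the thread, here is a rough Python sketch of the idea Joel describes: expand each tuple with a multi-valued "ancestors" field into one tuple per value, so that a single-valued join key is available for the fetch step. This is not Solr code and not how Solr implements it. The function names `cartesian` and `fetch` only mirror the names used in the thread, the in-memory `concepts` lookup stands in for the collection, and the concept names are made up for illustration. Leaving a tuple unchanged when the join key is a list is an assumption that imitates the behaviour reported above.

```python
from typing import Any, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


def cartesian(tuples: Iterable[Row], field: str) -> Iterator[Row]:
    """Emit one copy of each tuple per value of a multi-valued field."""
    for t in tuples:
        values = t.get(field)
        if not isinstance(values, list):
            # Field is missing or already single-valued: pass through.
            yield dict(t)
            continue
        for value in values:
            out = dict(t)
            out[field] = value  # the field now holds a single value
            yield out


def fetch(tuples: Iterable[Row], lookup: Dict[str, Row],
          left_key: str, fl: List[str]) -> Iterator[Row]:
    """Copy the fields listed in `fl` from the matching lookup document."""
    for t in tuples:
        key = t.get(left_key)
        # Assumption: a multi-valued key matches nothing, so the tuple
        # comes back unchanged, as reported in the thread.
        doc = None if isinstance(key, list) else lookup.get(key)
        out = dict(t)
        if doc is not None:
            for name in fl:
                if name in doc:
                    out[name] = doc[name]
        yield out


# Tuple shaped like the response in the thread (ids taken from it).
event_links = [{
    "node": "524f03355056c8b53b4ed199",
    "ancestors": ["524f02845056c8b53b4e9871", "524f02755056c8b53b4e9269"],
}]

# Hypothetical concept documents keyed by conceptid; names are invented.
concepts = {
    "524f02845056c8b53b4e9871": {"concept_name": "Concept A"},
    "524f02755056c8b53b4e9269": {"concept_name": "Concept B"},
}

expanded = list(fetch(cartesian(event_links, "ancestors"),
                      concepts, "ancestors", ["concept_name"]))
for row in expanded:
    print(row["ancestors"], row["concept_name"])
```

Each input tuple yields one output tuple per ancestor id, and each of those can then pick up its own concept_name, which is the shape of result the original question was after.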