Re: Using fetch function with streaming expression

Pratik Patel Wed, 15 Mar 2017 08:54:52 -0700

Great, I think I can achieve what I want by combining the "select" and
"cartesian" functions in my expression. Thanks a lot for the help!


Regards,
Pratik

On Wed, Mar 15, 2017 at 10:21 AM, Joel Bernstein <joels...@gmail.com> wrote:

> I haven't created the Jira ticket for this yet. It's fairly quick to
> implement, but the Solr 6.5 release is just around the corner, so most
> likely it will be in Solr 6.6. It will be committed fairly soon,
> though, so if you want to use master or branch_6x you can experiment with
> it earlier.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Tue, Mar 14, 2017 at 7:53 PM, Pratik Patel <pra...@semandex.net> wrote:
>
> > Wow, this is interesting! Is it going to be a new addition to Solr, or is it
> > already available? I cannot find it in the documentation. I am using Solr
> > version 6.4.1.
> >
> > On Tue, Mar 14, 2017 at 7:41 PM, Joel Bernstein <joels...@gmail.com>
> > wrote:
> >
> > > I'm going to add a "cartesian" function that creates a cartesian product
> > > from a multi-value field. This will turn a single tuple with a multi-value
> > > field into multiple tuples with a single-value field. This will allow the
> > > fetch operation to work on ancestors. It also has many other use cases.
> > > Sample syntax:
> > >
> > > fetch(collection1,
> > >       cartesian(field=ancestors,
> > >                 having(gatherNodes(collection1,
> > >                                    search(collection1,
> > >                                           q="*:*",
> > >                                           fl="conceptid",
> > >                                           sort="conceptid asc",
> > >                                           fq=storeid:"524efcfd505637004b1f6f24",
> > >                                           fq=tags:"Company",
> > >                                           fq=tags:"Prospects2",
> > >                                           qt="/export"),
> > >                                    walk=conceptid->eventParticipantID,
> > >                                    gather="eventID",
> > >                                    trackTraversal="true",
> > >                                    scatter="leaves",
> > >                                    count(*)),
> > >                        gt(count(*),1))),
> > >       fl="concept_name",
> > >       on="ancestors=conceptid")
> > >
> > > Joel Bernstein
> > > http://joelsolr.blogspot.com/
> > >
> > > On Tue, Mar 14, 2017 at 11:51 AM, Pratik Patel <pra...@semandex.net>
> > > wrote:
> > >
> > > > Hi, Joel. Thanks for the reply.
> > > >
> > > > So, I need to do some graph traversal queries for my use case. In my data
> > > > set, I have concepts and events.
> > > >
> > > > concept : {name, address, bio ......},
> > > > event: {name, date, participantIds:[concept1, concept2...] .....}
> > > >
> > > >
> > > > Events connect two or more concepts. So, this is graph data where
> > > > concepts are connected to each other via events. Each event stores links
> > > > to the concepts that it connects, so the field which stores those links
> > > > is multivalued. This is a natural structure for my data, on which I
> > > > wanted to do some advanced graph traversal queries with streaming
> > > > expressions. However, the gatherNodes() function does not support
> > > > multivalued fields yet. So, I changed my index structure to be something
> > > > like this.
> > > >
> > > > concept : {conceptId, name, address, bio ......},
> > > > event: {eventId, name, date, participantIds:[concept1, concept2...] .....}
> > > > *****create eventLink documents for each participantId in each event********
> > > > eventLink:{eventid, conceptid, id}
> > > >
> > > >
> > > >
> > > > I created eventLink documents from each event so that I can traverse the
> > > > data using the gatherNodes() function. With this change, I was able to
> > > > do the graph query and get the ids of the concepts I wanted. However, I
> > > > only have the ids of concepts. Now, using these ids, I want additional
> > > > data from the concept documents, like concept_name or address or bio.
> > > > This is what I was trying to achieve with the fetch() function, but it
> > > > seems I hit the multivalued limitation again :) The reason I am storing
> > > > only the ids in eventLink documents is that I don't want to duplicate
> > > > data unnecessarily. It would complicate keeping the index consistent
> > > > when deletes/updates happen. Is there any way I can achieve this?
> > > >
> > > > Thanks!
> > > > Pratik
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Mar 14, 2017 at 11:24 AM, Joel Bernstein <joels...@gmail.com>
> > > > wrote:
> > > >
> > > > > Wow that's an interesting expression!
> > > > >
> > > > > The problem is that you are trying to fetch using the ancestors field,
> > > > > which is multi-valued. fetch doesn't support multi-value join keys. I
> > > > > never thought someone might try to do that.
> > > > >
> > > > > So, you're attempting to get the concept names for ancestors?
> > > > >
> > > > > Can you explain a little more about the use case?
> > > > >
> > > > >
> > > > > Joel Bernstein
> > > > > http://joelsolr.blogspot.com/
> > > > >
> > > > > On Tue, Mar 14, 2017 at 11:08 AM, Pratik Patel <pra...@semandex.net>
> > > > > wrote:
> > > > >
> > > > > > I have two types of documents in my index: eventLink and conceptData.
> > > > > >
> > > > > > eventLink ---- { ancestors:[<id1>,<id2>] }
> > > > > > conceptData-----{ id:id1, conceptid, concept_name .....<some more data> }
> > > > > >
> > > > > > Both are in the same collection.
> > > > > > In my query, I am doing a gatherNodes query wrapped in some other
> > > > > > function, and ultimately I am getting a bunch of eventLink documents.
> > > > > > Now, I am trying to get the conceptData document for each id specified
> > > > > > in eventLink's ancestors field. I am trying to do that using the
> > > > > > fetch() function. Here is a simplified form of my query.
> > > > > >
> > > > > > fetch(collection1,
> > > > > >   function to get eventLinks,
> > > > > >   fl="concept_name",
> > > > > >   on="ancestors=conceptid"
> > > > > > )
> > > > > >
> > > > > >
> > > > > > On executing this query, I am getting back the same set of documents
> > > > > > that my streaming expression containing the gatherNodes() function
> > > > > > returns. No fields are added to the tuples. From the documentation, it
> > > > > > seems like fetch would fetch additional data and add it to the tuples.
> > > > > > However, that is not happening. The resulting tuples do not have the
> > > > > > concept_name field in them. What am I missing here? I really need to
> > > > > > get this additional data from one Solr query so that I don't have to
> > > > > > iterate over the eventLinks and get additional data by individual
> > > > > > queries. That would badly impact performance. Any suggestions?
> > > > > >
> > > > > > Here is my actual query and the response.
> > > > > >
> > > > > >
> > > > > > fetch(collection1,
> > > > > >   having(
> > > > > >     gatherNodes(collection1,
> > > > > >       search(collection1,q="*:*",fl="conceptid",sort="conceptid asc",
> > > > > >         fq=storeid:"524efcfd505637004b1f6f24",fq=tags:"Company",
> > > > > >         fq=tags:"Prospects2",qt="/export"),
> > > > > >       walk=conceptid->eventParticipantID,
> > > > > >       gather="eventID",
> > > > > >       trackTraversal="true", scatter="leaves",
> > > > > >       count(*)
> > > > > >     ),
> > > > > >     gt(count(*),1)
> > > > > >   ),
> > > > > >   fl="concept_name",
> > > > > >   on="ancestors=conceptid"
> > > > > > )
> > > > > >
> > > > > >
> > > > > >
> > > > > > Response :
> > > > > >
> > > > > > {
> > > > > >   "result-set": {
> > > > > >     "docs": [
> > > > > >       {
> > > > > >         "node": "524f03355056c8b53b4ed199",
> > > > > >         "field": "eventID",
> > > > > >         "level": 1,
> > > > > >         "count(*)": 2,
> > > > > >         "collection": "collection1",
> > > > > >         "ancestors": [
> > > > > >           "524f02845056c8b53b4e9871",
> > > > > >           "524f02755056c8b53b4e9269"
> > > > > >         ]
> > > > > >       },
> > > > > >       .........
> > > > > > }
> > > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > Pratik
> > > > > >
> > > > >
> > > >
> > >
> >
>
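
The behaviour discussed in this thread can be illustrated outside Solr. What follows is a minimal Python sketch, not Solr code and not how Solr implements either function. The helper names `cartesian` and `fetch` only mirror the names used in the thread, the lookup table stands in for the collection, and the concept names are made up. It shows why joining on a multi-valued "ancestors" field adds nothing to the tuple, and how expanding that field into one tuple per value first lets a single-valued join succeed.

```python
from typing import Any, Dict, Iterable, Iterator, List

Doc = Dict[str, Any]


def cartesian(tuples: Iterable[Doc], field: str) -> Iterator[Doc]:
    """Emit one copy of each tuple per value of a multi-valued field."""
    for t in tuples:
        values = t.get(field)
        if isinstance(values, list):
            for value in values:
                expanded = dict(t)       # shallow copy, other fields unchanged
                expanded[field] = value  # the field is now single-valued
                yield expanded
        else:
            yield dict(t)


def fetch(tuples: Iterable[Doc], lookup: Dict[str, Doc],
          key_field: str, fl: List[str]) -> Iterator[Doc]:
    """Copy the fields in fl from the matching lookup document into each tuple."""
    for t in tuples:
        out = dict(t)
        key = t.get(key_field)
        # Assumption: only a single string key can be joined. A list key matches
        # nothing, which mimics the "no fields added" result reported above.
        doc = lookup.get(key) if isinstance(key, str) else None
        if doc is not None:
            for name in fl:
                if name in doc:
                    out[name] = doc[name]
        yield out


# Tuple shaped like the response in the thread (ids copied from it).
node = {
    "node": "524f03355056c8b53b4ed199",
    "ancestors": ["524f02845056c8b53b4e9871", "524f02755056c8b53b4e9269"],
}

# Hypothetical concept documents keyed by conceptid; the names are invented.
concepts = {
    "524f02845056c8b53b4e9871": {"concept_name": "Concept A"},
    "524f02755056c8b53b4e9269": {"concept_name": "Concept B"},
}

# Joining directly on the multi-valued field leaves the tuple unchanged.
without_expansion = list(fetch([node], concepts, "ancestors", ["concept_name"]))

# Expanding first yields two tuples, each with a single ancestor and its name.
with_expansion = list(
    fetch(cartesian([node], "ancestors"), concepts, "ancestors", ["concept_name"])
)

for t in with_expansion:
    print(t["ancestors"], t["concept_name"])
```

Each expanded tuple keeps the other fields of the original node, so results can be regrouped by "node" afterwards if one record per event is needed.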
