Hi Mikhail,

Thank you very much for the information; it is very helpful. I am going
through the links you sent.
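
For the record, here is a minimal Python sketch of the problem and of the
concatenation idea, as I understand them. It is a toy model of multiValued
matching, not Solr code. The key layout section-class-subclass-main-group-subgroup
and all helper names are my own assumptions for illustration.

```python
# Toy model of matching on parallel multiValued fields; this is not Solr code.
# The two CPC entries are taken from the example patent quoted below.
cpc_entries = [
    {"section": "A", "class": "61", "subclass": "K", "main-group": "45", "subgroup": "06"},
    {"section": "A", "class": "61", "subclass": "K", "main-group": "31", "subgroup": "506"},
]


def flatten_parallel(entries):
    """Index every attribute as its own multiValued field (the current schema)."""
    fields = {}
    for entry in entries:
        for name, value in entry.items():
            fields.setdefault("cpc." + name, set()).add(value)
    return fields


def matches_parallel(fields, query):
    """AND of per-field matches; which entry a value came from is lost."""
    return all(value in fields.get("cpc." + name, set()) for name, value in query.items())


def concat_key(entry):
    """Hypothetical combined key, e.g. 'A-61-K-45-06' (the layout is an assumption)."""
    order = ("section", "class", "subclass", "main-group", "subgroup")
    return "-".join(entry[name] for name in order)


def matches_concat(keys, prefix):
    """Prefix match on the combined key, one key per CPC entry."""
    return any(key.startswith(prefix) for key in keys)


parallel = flatten_parallel(cpc_entries)
keys = [concat_key(entry) for entry in cpc_entries]

# main-group 45 and subgroup 506 belong to two different CPC entries.
false_positive = matches_parallel(parallel, {"main-group": "45", "subgroup": "506"})
concat_cross = matches_concat(keys, "A-61-K-45-506")
concat_partial = matches_concat(keys, "A-61-K-45")

print(false_positive, concat_cross, concat_partial)  # True False True
```

The parallel fields match a combination that no single CPC entry contains,
while the combined key does not. The price is that a partial code can only be
a left-to-right prefix of the key, which I take to be the flexibility limit
you mentioned.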

Best regards,
Istvan



On Tue, Nov 24, 2015 at 7:48 PM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:

> Hello Istvan,
>
> - When flattening subdocs, you can concatenate the fields that are
> necessary for retrieval, e.g. "K-06-45". This solves retrieval, but isn't
> really flexible.
> - Term positions are not easier to implement. If you really prefer this way,
> I'd suggest looking at http://siren.solutions/siren/overview/ (I haven't
> tried it, but it sounds like they implemented this approach).
> - If you follow the recent blog post, you will see our favorite approach:
> http://blog.griddynamics.com/2013/09/solr-block-join-support.html
>
> Query-time join ({!join}) and field collapsing are also alternatives to
> consider.
>
>
> On Tue, Nov 24, 2015 at 12:39 PM, István <lecc...@gmail.com> wrote:
>
> > Hi all,
> >
> > I would like to find documents in a key-value store (Riak) with Solr and I
> > am running into a challenge. I have nested JSON documents with patent
> > information. Patents have one or many CPC (
> > http://www.cooperativepatentclassification.org/index.html) codes, something
> > like these:
> >
> > {
> >
> > // more data
> >
> > "cpc": [
> >     {
> >       "class": "61",
> >       "section": "A",
> >       "sequence": "1",
> >       "subclass": "K",
> >       "subgroup": "06",
> >       "main-group": "45",
> >       "classification-value": "I"
> >     },
> >     {
> >       "class": "61",
> >       "section": "A",
> >       "sequence": "2",
> >       "subclass": "K",
> >       "subgroup": "506",
> >       "main-group": "31",
> >       "classification-value": "I"
> >     }
> > ]
> >
> > }
> >
> > I would like to find the documents that match a certain CPC code,
> > sometimes with a partial code, sometimes with the full code. I used the
> > following schema to index the documents:
> >
> > <field name="cpc.class"                 type="int"    indexed="true" stored="true" multiValued="true" />
> > <field name="cpc.section"               type="string" indexed="true" stored="true" multiValued="true" />
> > <field name="cpc.sequence"              type="int"    indexed="true" stored="true" multiValued="true" />
> > <field name="cpc.subclass"              type="string" indexed="true" stored="true" multiValued="true" />
> > <field name="cpc.subgroup"              type="int"    indexed="true" stored="true" multiValued="true" />
> > <field name="cpc.main-group"            type="int"    indexed="true" stored="true" multiValued="true" />
> > <field name="cpc.classification-value"  type="string" indexed="true" stored="true" multiValued="true" />
> >
> >
> > The problem with this approach is that when we query a certain combination
> > of partial CPC codes, it returns documents that don't actually match that
> > combination.
> >
> > This behavior is described in this blog post:
> >
> > http://blog.griddynamics.com/2011/06/solr-experience-search-parent-child.html
> >
> > My understanding is that I need to apply termPositions="true" to the field
> > definition, and then Solr maintains the position information and will
> > return only the documents that actually match the combination of the
> > partial CPC codes. Am I on the right track with this, or is there a better
> > solution for querying nested documents with partial codes?
> >
> > Thank you in advance,
> > Istvan
> >
> > PS: I also posted this on Stack Overflow:
> >
> > http://stackoverflow.com/questions/33724556/how-to-index-an-array-of-hashes-with-solr
> >
> > --
> > the sun shines for all
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
> <http://www.griddynamics.com>
> <mkhlud...@griddynamics.com>
>



-- 
the sun shines for all
