Hi Jack,

Thank you very much, I am going to go for this as the primary solution.
Regards,
Istvan

On Tue, Nov 24, 2015 at 1:56 PM, Jack Krupansky <jack.krupan...@gmail.com> wrote:

> The primary recommendation is that you flatten nested documents.
>
> That means one Solr document per cpc, not multivalued.
>
> As always, queries should drive your data model, so please specify what a
> typical query might be like, in plain English.
>
> -- Jack Krupansky
>
> On Tue, Nov 24, 2015 at 4:39 AM, István <lecc...@gmail.com> wrote:
>
> > Hi all,
> >
> > I would like to find documents in a key-value store (Riak) with Solr and
> > I am running into a challenge. I have nested JSON documents with patent
> > information. Patents have one or many CPC
> > (http://www.cooperativepatentclassification.org/index.html) codes,
> > something like these:
> >
> > {
> >
> >   // more data
> >
> >   "cpc": [
> >     {
> >       "class": "61",
> >       "section": "A",
> >       "sequence": "1",
> >       "subclass": "K",
> >       "subgroup": "06",
> >       "main-group": "45",
> >       "classification-value": "I"
> >     },
> >     {
> >       "class": "61",
> >       "section": "A",
> >       "sequence": "2",
> >       "subclass": "K",
> >       "subgroup": "506",
> >       "main-group": "31",
> >       "classification-value": "I"
> >     }
> >   ]
> >
> > }
> >
> > I would like to find the documents that match a certain CPC code,
> > sometimes with a partial code, sometimes with the full code. I used the
> > following schema to index the documents:
> >
> > <field name="cpc.class" type="int" indexed="true" stored="true" multiValued="true" />
> > <field name="cpc.section" type="string" indexed="true" stored="true" multiValued="true" />
> > <field name="cpc.sequence" type="int" indexed="true" stored="true" multiValued="true" />
> > <field name="cpc.subclass" type="string" indexed="true" stored="true" multiValued="true" />
> > <field name="cpc.subgroup" type="int" indexed="true" stored="true" multiValued="true" />
> > <field name="cpc.main-group" type="int" indexed="true" stored="true" multiValued="true" />
> > <field name="cpc.classification-value" type="string" indexed="true" stored="true" multiValued="true" />
> >
> > The problem with this approach is that when we query a certain
> > combination of partial CPC codes, it returns documents that don't
> > actually match that combination.
> >
> > This behavior is described in this blog post:
> >
> > http://blog.griddynamics.com/2011/06/solr-experience-search-parent-child.html
> >
> > My understanding is that I need to apply termPositions="true" to the
> > field definition, and then Solr maintains the position information and
> > will return only the documents that actually match the combination of
> > the partial CPC codes. Am I on the right track with this, or is there a
> > better solution to query nested documents with partial codes?
> >
> > Thank you in advance,
> > Istvan
> >
> > PS: I also posted this on Stackoverflow:
> >
> > http://stackoverflow.com/questions/33724556/how-to-index-an-array-of-hashes-with-solr
> >
> > --
> > the sun shines for all
>
--
the sun shines for all
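
For reference, the flattening suggested in the thread (one Solr document per cpc entry instead of multivalued fields) could be sketched roughly as below. This is only an illustration in plain Python under several assumptions: the patent `id` key, the `patent_id` and `cpc_*` field names, and the helper functions are all hypothetical and are not part of Solr or Riak. The two `*_match` helpers merely simulate how independent multivalued fields and flattened documents behave for a combined query; they do not call Solr. Values are kept as strings, as in the original JSON.

```python
def field_name(key):
    # Hypothetical naming scheme for the flattened fields, e.g. "main-group" -> "cpc_main_group".
    return "cpc_" + key.replace("-", "_")


def flatten_cpc(patent, id_key="id"):
    """Return one flat document per CPC entry, each pointing back to its patent."""
    docs = []
    for entry in patent.get("cpc", []):
        doc = {
            "id": f"{patent[id_key]}_cpc{entry['sequence']}",  # assumed unique key per CPC entry
            "patent_id": patent[id_key],                        # assumed back-reference field
        }
        for key, value in entry.items():
            doc[field_name(key)] = value
        docs.append(doc)
    return docs


def multivalued_match(patent, criteria):
    # Simulates independent multivalued fields: each criterion may be
    # satisfied by a different CPC entry of the same patent.
    return all(
        any(entry.get(key) == value for entry in patent["cpc"])
        for key, value in criteria.items()
    )


def flattened_match(docs, criteria):
    # Simulates flattened documents: all criteria must hold within one CPC entry.
    return [
        doc for doc in docs
        if all(doc.get(field_name(key)) == value for key, value in criteria.items())
    ]


# Sample data taken from the original message; the "id" value is made up.
patent = {
    "id": "patent-1",
    "cpc": [
        {"class": "61", "section": "A", "sequence": "1", "subclass": "K",
         "subgroup": "06", "main-group": "45", "classification-value": "I"},
        {"class": "61", "section": "A", "sequence": "2", "subclass": "K",
         "subgroup": "506", "main-group": "31", "classification-value": "I"},
    ],
}

docs = flatten_cpc(patent)

# subgroup "06" belongs to the first entry and main-group "31" to the second,
# so no single CPC code has both.
mixed = {"subgroup": "06", "main-group": "31"}
print(multivalued_match(patent, mixed))  # cross-entry match in the multivalued model
print(flattened_match(docs, mixed))      # no flattened document satisfies both
```

With flattened documents, a partial code such as section A, class 61, main-group 45 becomes an ordinary conjunction over single-valued fields, and the patent can be recovered from the back-reference field (`patent_id` in this sketch).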