Hi all,
I would like to find documents in a key-value store (Riak) with Solr and I
am running into a challenge. I have nested JSON documents with patent
information. Each patent has one or more CPC (
http://www.cooperativepatentclassification.org/index.html) codes, something
like these:
{
  // more data
  "cpc": [
    {
      "class": "61",
      "section": "A",
      "sequence": "1",
      "subclass": "K",
      "subgroup": "06",
      "main-group": "45",
      "classification-value": "I"
    },
    {
      "class": "61",
      "section": "A",
      "sequence": "2",
      "subclass": "K",
      "subgroup": "506",
      "main-group": "31",
      "classification-value": "I"
    }
  ]
}
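As an illustration only, the two entries above correspond to the CPC symbols A61K 45/06 and A61K 31/506. A minimal Python sketch of how one entry could be joined into such a symbol follows; the helper name cpc_symbol is hypothetical and is not part of Riak or Solr.

```python
# The "cpc" array from the sample document above.
cpc_entries = [
    {"class": "61", "section": "A", "sequence": "1", "subclass": "K",
     "subgroup": "06", "main-group": "45", "classification-value": "I"},
    {"class": "61", "section": "A", "sequence": "2", "subclass": "K",
     "subgroup": "506", "main-group": "31", "classification-value": "I"},
]


def cpc_symbol(entry):
    """Join one entry into a single CPC symbol (hypothetical helper).

    Layout: section + class + subclass, a space, then main-group/subgroup.
    """
    return (entry["section"] + entry["class"] + entry["subclass"]
            + " " + entry["main-group"] + "/" + entry["subgroup"])


symbols = [cpc_symbol(entry) for entry in cpc_entries]
print(symbols)
```

A partial code in this notation is then simply a prefix of the full symbol, for example A61K.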
I would like to find the documents that match a certain CPC code,
sometimes with a partial code and sometimes with the full code. I used the
following schema to index the documents:
<field name="cpc.class" type="int" indexed="true"
       stored="true" multiValued="true" />
<field name="cpc.section" type="string" indexed="true"
       stored="true" multiValued="true" />
<field name="cpc.sequence" type="int" indexed="true"
       stored="true" multiValued="true" />
<field name="cpc.subclass" type="string" indexed="true"
       stored="true" multiValued="true" />
<field name="cpc.subgroup" type="int" indexed="true"
       stored="true" multiValued="true" />
<field name="cpc.main-group" type="int" indexed="true"
       stored="true" multiValued="true" />
<field name="cpc.classification-value" type="string" indexed="true"
       stored="true" multiValued="true" />
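The problem with this approach is that when we query a certain combination
of partial CPC codes, it returns documents that don't actually match that
combination, because values taken from different cpc entries can each
satisfy a different part of the query.
This behavior is described in this blog post: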
http://blog.griddynamics.com/2011/06/solr-experience-search-parent-child.html
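To make the false match concrete, here is a minimal Python sketch that only imitates the effect of flattening the array into independent multiValued fields. It does not use Solr, and the function names (flatten, matches_flattened, matches_per_entry) are hypothetical. The query asks for main-group 45 together with subgroup 506, a combination that no single entry in the sample document has.

```python
from collections import defaultdict

# The "cpc" array from the sample document (only the keys used here).
cpc_entries = [
    {"section": "A", "class": "61", "subclass": "K",
     "main-group": "45", "subgroup": "06"},
    {"section": "A", "class": "61", "subclass": "K",
     "main-group": "31", "subgroup": "506"},
]


def flatten(entries):
    """Imitate multiValued fields: one independent value list per key."""
    fields = defaultdict(list)
    for entry in entries:
        for key, value in entry.items():
            fields["cpc." + key].append(value)
    return dict(fields)


def matches_flattened(fields, query):
    """Check each clause against its own field, ignoring which entry
    a value originally came from."""
    return all(value in fields.get(name, [])
               for name, value in query.items())


def matches_per_entry(entries, query):
    """Require all clauses to hold within one and the same entry."""
    return any(all(entry.get(key) == value for key, value in query.items())
               for entry in entries)


# No single entry has main-group 45 AND subgroup 506.
query = {"main-group": "45", "subgroup": "506"}
flat_query = {"cpc." + key: value for key, value in query.items()}

flattened_hit = matches_flattened(flatten(cpc_entries), flat_query)
per_entry_hit = matches_per_entry(cpc_entries, query)
print(flattened_hit, per_entry_hit)
```

The flattened check reports a match because "45" comes from the first entry and "506" from the second, while the per-entry check correctly reports none.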
My understanding is that I need to add termPositions="true" to the field
definition so that Solr maintains the position information and returns
only the documents that actually match the combination of the partial CPC
codes. Am I on the right track with this, or is there a better solution
for querying nested documents with partial codes?
Thank you in advance,
Istvan
PS: I also posted this on Stack Overflow:
http://stackoverflow.com/questions/33724556/how-to-index-an-array-of-hashes-with-solr
--
the sun shines for all