On 4/16/2013 9:17 AM, Marko Asplund wrote:
Shawn Heisey wrote:
So, using a dynamic schema I'd flatten the following JSON object graph
{
  'id': 'xyz123',
  'obj1': {
    'child1': {
      'prop1': ['val1', 'val2', 'val3'],
      'prop2': 123
    },
    'prop3': 'val4'
  },
  'obj2': {
    'child2': {
      'prop3': true
    }
  }
}
to a Solr document something like this?
{
  'id': 'xyz123',
  'obj1/child1/prop1_ss': ['val1', 'val2', 'val3'],
  'obj1/child1/prop2_i': 123,
  'obj1/prop3_s': 'val4',
  'obj2/child2/prop3_b': true
}
How you flatten the data is up to you. You have to examine the data and
how you intend to use it in order to keep the number of fields at a
manageable level while retaining the flexibility you need. Side note: I
would not use anything in a field name other than ASCII alphanumeric and
underscore characters. Special characters (like the slash) have been
known to cause problems with some Solr features, and because Solr uses
HTTP, there are also potential URL escaping issues.
Within a single index, Solr uses a flat model, like a single database
table with no relational capability. With two indexes, there is the
limited join feature, but I am not familiar with how it works.
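With underscores instead of slashes, the flattened document above would
end up looking more like this:

{
  'id': 'xyz123',
  'obj1_child1_prop1_ss': ['val1', 'val2', 'val3'],
  'obj1_child1_prop2_i': 123,
  'obj1_prop3_s': 'val4',
  'obj2_child2_prop3_b': true
}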
I'm using Java, so I'd probably push docs for indexing to Solr and do the
searches using SolrJ, right?
That would be the most sensible approach. The SolrJ API is much more
advanced than the APIs for other languages. This is because it is
actually part of the Solr codebase and used by Solr internally.
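In case it helps, a bare-bones (untested) SolrJ sketch looks roughly like
this. The URL, core name and field names are placeholders for your own
setup, and HttpSolrServer is the client class in the current 4.x releases:

import java.io.IOException;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrInputDocument;

public class SolrJExample {
    public static void main(String[] args)
            throws SolrServerException, IOException {
        // Core name and URL are assumptions; adjust to your setup.
        HttpSolrServer solr =
                new HttpSolrServer("http://localhost:8983/solr/collection1");

        // Index one flattened document.
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "xyz123");
        doc.addField("obj1_prop3_s", "val4");
        solr.add(doc);
        solr.commit();

        // Query it back on one of the dynamic fields.
        QueryResponse response = solr.query(new SolrQuery("obj1_prop3_s:val4"));
        for (SolrDocument d : response.getResults()) {
            System.out.println(d.getFieldValue("id"));
        }

        solr.shutdown();
    }
}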
The data import handler is a Solr server-side feature and not a client-side
one? Does Solr or SolrJ have any support for doing transformations on the
client side? Doing the above transformation should be fairly
straightforward, so it could also be done by code on the client side.
With SolrJ, you can do anything, because you write the code. You can do
whatever you like to the data, then send it to Solr.
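For example, here is an untested sketch of that kind of client-side
transformation, assuming the JSON has already been parsed into nested
java.util.Map and java.util.List structures. The Flattener name and the
type-to-suffix mapping are only illustrations; the _s, _ss, _i and _b
suffixes assume the matching dynamic fields from the example schema:

import java.util.List;
import java.util.Map;

import org.apache.solr.common.SolrInputDocument;

public class Flattener {

    // Copies a nested map into a flat SolrInputDocument, joining path
    // segments with '_' and appending a dynamic-field suffix by type.
    public static SolrInputDocument flatten(Map<String, Object> source) {
        SolrInputDocument doc = new SolrInputDocument();
        addFields(doc, "", source);
        return doc;
    }

    private static void addFields(SolrInputDocument doc, String prefix,
            Map<String, Object> map) {
        for (Map.Entry<String, Object> entry : map.entrySet()) {
            String name = prefix.isEmpty()
                    ? entry.getKey() : prefix + "_" + entry.getKey();
            Object value = entry.getValue();
            if (value instanceof Map) {
                // recurse into nested objects
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) value;
                addFields(doc, name, child);
            } else if (value instanceof List) {
                // multi-valued: assumes a *_ss dynamic field in the schema
                for (Object item : (List<?>) value) {
                    doc.addField(name + "_ss", item);
                }
            } else if ("id".equals(name)) {
                doc.addField("id", value);          // unique key, no suffix
            } else if (value instanceof Integer) {
                doc.addField(name + "_i", value);   // *_i dynamic int field
            } else if (value instanceof Boolean) {
                doc.addField(name + "_b", value);   // *_b dynamic boolean field
            } else {
                doc.addField(name + "_s", value);   // *_s dynamic string field
            }
        }
    }
}

Sending the result is then just a matter of add() and commit(), as in the
SolrJ snippet above.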
The dataimport handler is indeed a server-side feature. It is a contrib
module included in the Solr distribution; you have to add a jar to Solr
to activate it.
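For reference, activation usually amounts to loading the contrib jar and
registering the handler in solrconfig.xml. The paths and the
data-config.xml file name below are only examples:

<!-- solrconfig.xml: load the dataimport contrib jar (path is an example) -->
<lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />

<!-- register the handler and point it at your DIH configuration file -->
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>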
Thanks,
Shawn