On 4/15/2013 8:40 AM, Marko Asplund wrote:
I'm implementing a backend service that stores data in JSON format and I'd
like to provide a search operation in the service.
The data model is dynamic and will contain arbitrarily complex object
graphs.

How do I index object graphs with Solr?
Does the data need to be flattened before indexing?

Solr does have some *very* limited capability for doing joins between indexes, but generally speaking, you need to flatten the data.
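
As a rough illustration of flattening on the service side, here is a minimal Python sketch. The function name, the "_" separator, and the sample document are all made up for this example; nothing here is a Solr API.

```python
def flatten(value, prefix="", sep="_"):
    """Flatten nested dicts/lists into a single-level dict of path -> leaf value."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}{sep}{key}" if prefix else str(key)
            flat.update(flatten(child, path, sep))
    elif isinstance(value, list):
        # List positions become part of the field name (one possible choice).
        for index, child in enumerate(value):
            path = f"{prefix}{sep}{index}" if prefix else str(index)
            flat.update(flatten(child, path, sep))
    else:
        # Leaf value: store it under the accumulated path.
        flat[prefix] = value
    return flat


# Hypothetical sample document, for illustration only.
doc = {"id": "42", "author": {"name": "Marko", "tags": ["solr", "json"]}}
flat_doc = flatten(doc)
```

Putting list positions into field names is only one option. For lists of simple values, a multiValued field may be a better fit, depending on how you need to search.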

Apparently the service needs to deliver new data and updates to Solr,
but which one should be responsible for converting the data model to adhere
to Solr schema? The service or Solr?
Should the service deliver data to Solr in a form that adheres to Solr
schema or should Solr be extended to digest data provided by the service?

Solr's ability to change your data after receiving it is fairly limited. The analysis chains in the schema can transform indexed values, but stored data is kept exactly as Solr receives it. If you will be using the dataimport handler, it can apply some transformations before the documents are indexed. As a rule of thumb, changing the data on the Solr side requires contrib or custom plugins, so it is usually easier to do the conversion in your service before Solr receives the data.

How does Solr handle dynamic data models?
Solr seems to support dynamic data models with the "dynamic fields" feature
in schemas.
How are data types inferred when using dynamic fields?

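Data types are not inferred. A dynamic field is declared with a wildcard name, like "i_*" or "*_int", and that definition includes the data type. Any incoming field whose name matches the pattern gets that type.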
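
Because the type comes from the field name, the service has to pick a name that matches the right pattern. A minimal Python sketch of that idea follows. It assumes your schema declares dynamic fields for the suffixes "_b", "_i", "_d", "_dt" and "_s"; those suffixes, the function names, and the "id" passthrough are assumptions for this example, so adjust them to whatever your schema.xml actually defines.

```python
from datetime import datetime

# Assumed suffix convention; each suffix must match a dynamicField
# pattern declared in your own schema. bool is listed before int
# because bool is a subclass of int in Python.
SUFFIX_BY_TYPE = [
    (bool, "_b"),
    (int, "_i"),
    (float, "_d"),
    (datetime, "_dt"),
    (str, "_s"),
]


def typed_field_name(name, value):
    """Append the suffix that selects the dynamic field for this value's type."""
    for py_type, suffix in SUFFIX_BY_TYPE:
        if isinstance(value, py_type):
            return name + suffix
    raise TypeError(f"no dynamic field suffix for {type(value).__name__}")


def to_solr_doc(flat_doc, passthrough=("id",)):
    """Rename every field except those in passthrough (hypothetical explicit fields)."""
    return {
        (name if name in passthrough else typed_field_name(name, value)): value
        for name, value in flat_doc.items()
    }


solr_doc = to_solr_doc(
    {"id": "42", "author_name": "Marko", "page_count": 7, "draft": False}
)
```

Python integers can exceed the range of a 32-bit int field, so a long-typed pattern may be the safer target if your values can get large.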

An alternative to using dynamic fields seems to be to change the schema
when the data model changes.
How easy is it to modify an existing schema?
Do I need to reindex all the data?
Can you do it online using an API?

Changing the schema is as simple as modifying schema.xml and then reloading the core or restarting Solr. An API for online schema changes is coming, but I don't know whether it will be ready in time for 4.3 or get pushed back to 4.4. No matter how you make the change, the following applies:

If you add fields, reindexing is not necessary, but existing documents will not have the new fields until you do. If you change the query analyzer chain, no reindex is required. If you change the index analyzer chain or options that affect indexing, reindexing IS required.

Thanks,
Shawn
