from:"Marko Asplund"

Dynamic data model design questions

2013-04-15 Thread Marko Asplund

I'm implementing a backend service that stores data in JSON format and I'd
like to provide a search operation in the service.
The data model is dynamic and will contain arbitrarily complex object
graphs.

How do I index object graphs with Solr?
Does the data need to be flattened before indexing?

Presumably the service needs to deliver new data and updates to Solr,
but which side should be responsible for converting the data model to adhere
to the Solr schema: the service or Solr?
Should the service deliver data to Solr in a form that adheres to the Solr
schema, or should Solr be extended to digest the data provided by the service?

How does Solr handle dynamic data models?
Solr seems to support dynamic data models with the "dynamic fields" feature
in schemas.
How are data types inferred when using dynamic fields?

An alternative to using dynamic fields seems to be to change the schema
when the data model changes.
How easy is it to modify an existing schema?
Do I need to reindex all the data?
Can you do it online using an API?

I'm planning on using Solr 4.2.


marko

Re: Dynamic data model design questions

2013-04-16 Thread Marko Asplund

Shawn Heisey wrote:

> Solr does have some *very* limited capability for doing joins between
> indexes, but generally speaking, you need to flatten the data.

thanks!

So, using a dynamic schema I'd flatten the following JSON object graph

{
  "id": "xyz123",
  "obj1": {
    "child1": {
      "prop1": ["val1", "val2", "val3"],
      "prop2": 123
    },
    "prop3": "val4"
  },
  "obj2": {
    "child2": {
      "prop3": true
    }
  }
}

to a Solr document something like this?

{
  "id": "xyz123",
  "obj1/child1/prop1_ss": ["val1", "val2", "val3"],
  "obj1/child1/prop2_i": 123,
  "obj1/prop3_s": "val4",
  "obj2/child2/prop3_b": true
}

I'm using Java, so I'd probably push docs for indexing to Solr and do the
searches using SolrJ, right?

> Solr's ability to change your data after receiving it is fairly limited.
> The schema has some ability in this regard for indexed values, but the
> stored data is 100% verbatim as Solr receives it. If you will be using the
> dataimport handler, it does have some transform capability before sending
> to Solr. Most of the time, the rule of thumb is that changing the data on
> the Solr side will require contrib/custom plugins, so it may be easier to
> do it before Solr receives it.

So the data import handler is a Solr server-side feature, not a client-side
one?
Does Solr or SolrJ have any support for doing transformations on the client
side?
Doing the above transformation should be fairly straightforward, so it
could also be done by code on the client side.
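
The flattening described above could be sketched on the client side roughly
as follows. This is only a minimal illustration, not a SolrJ feature: the
class name GraphFlattener is hypothetical, and the type suffixes (_s, _i, _b
and the multi-valued _ss, _is, ...) are assumed to match dynamicField
patterns declared in the schema. It handles nested maps and lists of scalar
values only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: flattens a nested map into Solr-style dynamic field names. */
final class GraphFlattener {

    private GraphFlattener() {}

    /** Joins nested keys with '/' and appends a type suffix to each leaf field. */
    static Map<String, Object> flatten(Map<String, Object> graph) {
        Map<String, Object> doc = new LinkedHashMap<>();
        flattenInto("", graph, doc);
        return doc;
    }

    private static void flattenInto(String prefix, Map<?, ?> node, Map<String, Object> doc) {
        for (Map.Entry<?, ?> e : node.entrySet()) {
            String path = prefix.isEmpty() ? String.valueOf(e.getKey()) : prefix + "/" + e.getKey();
            Object value = e.getValue();
            if (value instanceof Map) {
                // Nested object: recurse with the extended path.
                flattenInto(path, (Map<?, ?>) value, doc);
            } else if (value instanceof List) {
                // Multi-valued field: suffix is derived from the first element (assumption).
                List<?> list = (List<?>) value;
                if (!list.isEmpty()) {
                    doc.put(path + suffixFor(list.get(0)) + "s", new ArrayList<>(list));
                }
            } else if (value != null) {
                // The top-level "id" is kept as-is; every other leaf gets a type suffix.
                doc.put(path.equals("id") ? path : path + suffixFor(value), value);
            }
        }
    }

    /** Maps a Java value to an assumed dynamic field suffix. */
    private static String suffixFor(Object v) {
        if (v instanceof String) return "_s";
        if (v instanceof Boolean) return "_b";
        if (v instanceof Integer) return "_i";
        if (v instanceof Long) return "_l";
        if (v instanceof Float) return "_f";
        if (v instanceof Double) return "_d";
        throw new IllegalArgumentException("Unsupported value: " + v);
    }
}
```

The resulting map could then be copied field by field into a SolrJ
SolrInputDocument with addField(name, value) before indexing.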

marko

Re: Dynamic data model design questions

2013-04-20 Thread Marko Asplund

Jack Krupansky wrote:

> In general, Solr is much more friendly towards static data models. Yes, you
> can use dynamic fields, but use them in moderation. The more heavily you
> lean on them, the more likely that you will eventually become unhappy with
> Solr.

Can you give concrete examples of the kinds of issues I should expect to
face when using a data model with only dynamic fields?
We have requirements that quite explicitly direct us toward using dynamic
fields, and I'd like to understand what kinds of problems we might end
up having.

> How many fields are we talking about here?

The data model is designed to be dynamic, so the number is not fixed,
but I'm expecting there'll be perhaps about 20-40 fields.

> The trick with Solr is not to brute-force flatten your data model (as you
> appear to be doing), but to REDESIGN your data model so that it is more
> amenable to a flat data model, and takes advantage of Solr's features. You
> can use multiple collections for different types of data. And you can
> simulate joins across tables by doing a sequence of queries (although it
> would be nice to have a SolrJ client-side method to do that in one API
> call.)
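
The "sequence of queries" idea in the quoted reply might be sketched as
follows: run a first query, collect the join keys from its results, and build
a second query restricted to those keys. This is only an illustration under
assumptions: the class ClientSideJoin, the method secondQueryFor and the field
names in the usage note are hypothetical, and actually executing the queries
(for example with SolrJ) is left out.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper for simulating a join with two consecutive queries. */
final class ClientSideJoin {

    private ClientSideJoin() {}

    /**
     * Builds the query string for the second step from the results of the first.
     * Returns an empty Optional when the first step produced no join keys,
     * in which case the second query can be skipped.
     */
    static Optional<String> secondQueryFor(List<Map<String, Object>> firstResults,
                                           String fromField, String toField) {
        // LinkedHashSet removes duplicate keys while keeping a stable order.
        Set<String> keys = new LinkedHashSet<>();
        for (Map<String, Object> doc : firstResults) {
            Object key = doc.get(fromField);
            if (key != null) {
                keys.add(quote(key.toString()));
            }
        }
        if (keys.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(toField + ":(" + String.join(" OR ", keys) + ")");
    }

    /** Wraps a value in double quotes, escaping backslashes and quotes. */
    private static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

For example, if the first query returned documents carrying an author_id
field, secondQueryFor(results, "author_id", "id") would produce a query string
that could be sent to the collection holding the author documents.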

We're storing arbitrarily complex object graphs in a data store and
want to use Solr for implementing property field search.
It may be difficult to use a flatter data model, but I'll consider
this option as well.

thanks!

marko
