On 10/3/2013 4:36 PM, jimmy nguyen wrote:
I'd like to get something like this in Solr:
<doc>
<str name="id">1</str>
<arr name="person"> <str id="1">Jane Doe</str> <str id="2">John Doe</str> </arr>
<arr name="person_phoneNumber"> <str id="1">0123456789</str> <str id="1">1234567890</str> <str id="2">2345678901</str> </arr>
</doc>

This way it is easy to link the first two phone numbers to Jane Doe and the
last one to John Doe.

Attributes like that are not something that Solr will do out of the box. With some custom code, you might be able to create something like that, but I'm not sure that you really need to do that.

When dealing with a search engine, and Solr in particular, you need to think in terms of a flat data model, without relational features of any kind. Solr does have some limited join capability where it can use two indexes to return results, but if you want to jump right in and use that capability, there's a good chance that you need to back up and think about a redesign with a flat data model. Solr is not a relational database; it's a search engine. It does have features that let it fill a NoSQL database role, but that's not what it's designed to do.

The first decision is what a single Solr document will represent. Do you want phone numbers or people to be the basic unit in your index? Here's what I think you'd probably actually want for a typical document. In this example, phoneNumber is a multi-valued field:

<doc>
<str name="id">1</str>
<str name="firstName">Jane</str>
<str name="lastName">Doe</str>
<str name="phoneNumber">0123456789</str>
<str name="phoneNumber">1234567890</str>
</doc>
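
If your source data is relational, flattening it into documents of this shape could look roughly like the sketch below. This is only an illustration under assumptions: the people and phones row layouts and the flatten helper are hypothetical, not anything from your database or from Solr itself. Each person becomes one document, and all of that person's numbers go into the multi-valued phoneNumber field as a list.

```python
from collections import defaultdict

# Hypothetical relational rows (assumed layout, not from any real schema):
# people: (person_id, first_name, last_name); phones: (person_id, number)
people = [(1, "Jane", "Doe"), (2, "John", "Doe")]
phones = [(1, "0123456789"), (1, "1234567890"), (2, "2345678901")]


def flatten(people, phones):
    """Build one flat document per person, with phoneNumber as a list."""
    # Group phone numbers by the person they belong to.
    numbers = defaultdict(list)
    for person_id, number in phones:
        numbers[person_id].append(number)

    # One dict per person; the list holds the multi-valued field's values.
    return [
        {
            "id": str(person_id),
            "firstName": first,
            "lastName": last,
            "phoneNumber": numbers.get(person_id, []),
        }
        for person_id, first, last in people
    ]


docs = flatten(people, phones)
for doc in docs:
    print(doc)
```

With people as the basic unit, the link between a person and their phone numbers is simply that they live in the same document, so no id attributes are needed to tie them together.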

In case you were thinking of using Solr as your primary data store, don't. Unless you've got a flat data model, it wouldn't be very good for that role. Also, given that most schema changes and some config changes require a reindex, you'll want to have your data available elsewhere so you can accomplish that.

Thanks,
Shawn
