On 10/3/2013 4:36 PM, jimmy nguyen wrote:
I'd like to get something like this in Solr:
<doc>
<str name="id">1</str>
<arr name="person"> <str id="1">Jane Doe</str> <str id="2">John Doe</str> </arr>
<arr name="person_phoneNumber"> <str id="1">0123456789</str> <str id="1">1234567890</str> <str id="2">2345678901</str> </arr>
</doc>
This way it is easy to link the 2 first phone numbers to Jane Doe and the
last one to John Doe.
Attributes like that are not something that Solr will do out of the
box. With some custom code, you might be able to create something like
that, but I'm not sure that you really need to do that.
When dealing with a search engine, and Solr in particular, you need to
think in terms of a flat data model, without relational features of any
kind. Solr does have some limited join capability where it can use two
indexes to return results, but if you want to jump right in and use
that capability, there's a good chance that you need to back up and
think about a redesign with a flat data model. Solr is not a relational
database; it's a search engine. It does have features that let it fill
a NoSQL database role, but that's not what it's designed to do.
The first decision is what a single Solr document will represent. Do
you want phone numbers or people to be the basic unit in your index?
Here's what I think you'd probably actually want for a typical
document. In this example, phoneNumber is a multi-valued field:
<doc>
<str name="id">1</str>
<str name="firstName">Jane</str>
<str name="lastName">Doe</str>
<str name="phoneNumber">0123456789</str>
<str name="phoneNumber">1234567890</str>
</doc>
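If your source data currently lives in relational tables, the flattening
step might look roughly like the Python sketch below. The row layouts and
the flatten_people helper are hypothetical, made up for illustration, and
actually sending the documents to Solr is left out.

```python
from collections import defaultdict


def flatten_people(people, phone_numbers):
    """Build one flat document per person, with phoneNumber as a list.

    people:        iterable of (person_id, first_name, last_name) rows
    phone_numbers: iterable of (person_id, number) rows
    Both layouts are assumptions for this example, not a Solr API.
    """
    # Group every phone number under the person it belongs to.
    phones_by_person = defaultdict(list)
    for person_id, number in phone_numbers:
        phones_by_person[person_id].append(number)

    # Emit one document per person; phoneNumber carries all of that
    # person's numbers, matching a multi-valued field.
    docs = []
    for person_id, first_name, last_name in people:
        docs.append({
            "id": str(person_id),
            "firstName": first_name,
            "lastName": last_name,
            "phoneNumber": phones_by_person.get(person_id, []),
        })
    return docs


people = [(1, "Jane", "Doe"), (2, "John", "Doe")]
phone_numbers = [(1, "0123456789"), (1, "1234567890"), (2, "2345678901")]
docs = flatten_people(people, phone_numbers)
```

Each person ends up as a separate document, so the link between a person
and their numbers comes from the document itself and no id attributes are
needed.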
In case you were thinking of using Solr as your primary data store,
don't. Unless you've got a flat data model, it wouldn't be very good
for that role. Also, given that most schema changes and some config
changes require a reindex, you'll want to have your data available
elsewhere so you can accomplish that.
Thanks,
Shawn