While it's certainly possible to "join" the results of multiple indexes, I
would do so only when absolutely necessary -- in my experience, the only
time I've found it makes sense is when one aspect of the data changes
extremely rapidly compared to everything else, making a full reindex a
pain, but reindexing just the changed data in its own index a lot more
feasible.
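
For what it's worth, when I've done that kind of merge it's just been
client-side code keyed on the uniqueKey -- roughly the sketch below (the
core URLs and field names are hypothetical; I'm assuming the JSON response
writer and that "id" is the uniqueKey in both indexes)...

  import requests

  MAIN = "http://localhost:8983/solr/main/select"    # stable index (hypothetical URL)
  FAST = "http://localhost:8983/solr/prices/select"  # rapidly changing data lives here

  def search(q, rows=10):
      # one query against the stable index gets the page of matches
      docs = requests.get(MAIN, params={
          "q": q, "rows": rows, "wt": "json"}).json()["response"]["docs"]
      ids = [d["id"] for d in docs]
      if not ids:
          return docs

      # one batched lookup (not N queries) against the fast-changing index
      extra = requests.get(FAST, params={
          "q": "*:*", "fq": "id:(" + " OR ".join(ids) + ")",
          "rows": len(ids), "wt": "json"}).json()["response"]["docs"]
      by_id = {d["id"]: d for d in extra}

      # merge the volatile fields into the main documents
      for d in docs:
          d.update(by_id.get(d["id"], {}))
      return docs

The index that's expensive to rebuild never has to know the volatile data
exists.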

As a rule of thumb, when building "paginated" style search applications, I
would advise people to try to flatten their index as much as possible, so
that the application can do one "user query" (based on the user's input)
to get a single page of results, and then use the uniqueKeys from that
page of results to look up ancillary data from any other indexes (or
databases) that you need -- the key being that all the data you want to
search on, and all the data you need to sort on, is in the index, but
other data you need to return to the user can come from other sources.
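
Concretely, something along these lines (a rough sketch only: the Solr
URL, field names, and database table are all made up for illustration)...

  import sqlite3
  import requests

  SOLR = "http://localhost:8983/solr/select"    # hypothetical Solr URL
  db = sqlite3.connect("catalog.db")            # hypothetical ancillary database

  def search_page(user_query, page=0, rows=10):
      # the index holds every field we match or sort on, nothing more
      resp = requests.get(SOLR, params={
          "q": user_query,
          "start": page * rows,
          "rows": rows,
          "fl": "id",              # only the uniqueKey comes back
          "sort": "score desc",
          "wt": "json",
      }).json()
      ids = [d["id"] for d in resp["response"]["docs"]]
      if not ids:
          return []

      # ancillary display data comes from wherever it naturally lives
      marks = ",".join("?" * len(ids))
      by_id = {row[0]: row for row in db.execute(
          "SELECT id, title, thumbnail_url FROM items"
          " WHERE id IN (" + marks + ")", ids)}

      # preserve the ranking order the index gave us
      return [by_id[i] for i in ids if i in by_id]

One request to the index per page, and one batched lookup per page for
everything else you want to display.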

If you find yourself wanting to "join" two indexes for the purposes of
matching or sorting, the amount of work you wind up doing tends to be
prohibitive on really large indexes -- and if your indexes aren't that
large, it would probably just be easier to put everything in one index and
rebuild it frequently.

: I am trying to integrate Solr search results with results from an RDBMS
: query.  It's working OK, but it's fairly complicated due to the large
: size of the results from the database and the many different sort
: requirements.
:
: I know that solr/lucene was not designed to intelligently handle
: multiple document types in the same collection, i.e. provide join
: features, but I'm wondering if anyone on this list has any thoughts on
: how to do it in lucene, and how it might be integrated into a custom
: solr deployment.  I can't see going back to vanilla lucene after solr!
:
: My basic idea is to add an objType field that would be used to define a
: "table".  There would be one main objType, and any related objTypes would
: have a field pointing back to the main objs via id, like a foreign key.
:
: I'd run multiple parallel searches and merge the results based on
: foreign keys, either using a Filter or just using custom code.  I'm
: anticipating that iterating through the results to retrieve the foreign
: key values will be too slow.
:
: Our data is highly textual, temporal and spatial, which pretty much
: correspond to the 3 tables I would have.  I can de-normalize a lot of
: the data, but the combination of times, locations and textual
: representations would be way too large to fully flatten.
:
: I'm about to start experimenting with different strategies, and I would
: appreciate any insight anyone can provide.  Would the faceting code help
: here somehow?



-Hoss
