I am trying to integrate Solr search results with results from an RDBMS
query. It works, but it is fairly complicated because the result sets
from the database are large and there are many different sort
requirements.
I know that Solr/Lucene was not designed to intelligently handle
multiple document types in the same collection, i.e. to provide join
features, but I'm wondering if anyone on this list has any thoughts on
how to do it in Lucene, and how it might be integrated into a custom
Solr deployment. I can't see going back to vanilla Lucene after Solr!
My basic idea is to add an objType field that would be used to define a
"table". There would be one main objType; any related objTypes would
have a field pointing back to the main objects via id, like a foreign
key. I'd run multiple parallel searches and merge the results based on
foreign keys, either using a Filter or just using custom code. I expect
that iterating through the results to retrieve the foreign key values
will be too slow.
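
To make the merge step concrete, here is a minimal sketch in plain
Python. It makes no Solr or Lucene calls: the in-memory DOCS list stands
in for the index, and every name except objType (mainId, text, start,
place, search, merge_by_foreign_key) is made up for illustration. It is
one possible reading of the idea, not a tested design.

```python
# Hypothetical documents in one index; objType plays the role of a table name.
# All field names other than objType are assumptions for this sketch.
DOCS = [
    {"id": "m1", "objType": "main", "text": "flood report"},
    {"id": "m2", "objType": "main", "text": "fire report"},
    {"id": "t1", "objType": "time", "mainId": "m1", "start": 1999},
    {"id": "t2", "objType": "time", "mainId": "m2", "start": 2005},
    {"id": "p1", "objType": "place", "mainId": "m1", "place": "Boston"},
    {"id": "p2", "objType": "place", "mainId": "m1", "place": "Denver"},
]


def search(obj_type, predicate):
    """Stand-in for one search restricted to a single objType."""
    return [d for d in DOCS if d["objType"] == obj_type and predicate(d)]


def merge_by_foreign_key(main_hits, *related_hit_lists):
    """Keep main hits referenced by at least one hit in every related list."""
    surviving = {d["id"] for d in main_hits}
    for hits in related_hit_lists:
        # Collecting the foreign keys means touching every related hit.
        surviving &= {d["mainId"] for d in hits}
    # Preserve the original order of the main hits.
    return [d for d in main_hits if d["id"] in surviving]


main_hits = search("main", lambda d: "report" in d["text"])
time_hits = search("time", lambda d: d["start"] < 2000)
place_hits = search("place", lambda d: d["place"] == "Boston")

result = merge_by_foreign_key(main_hits, time_hits, place_hits)
print([d["id"] for d in result])
```

The set comprehension inside merge_by_foreign_key is the step in
question: it reads the foreign key from every related hit, so its cost
grows with the size of the related result sets rather than with the
number of main hits returned.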
Our data is highly textual, temporal and spatial, which pretty much
correspond to the 3 tables I would have. I can de-normalize a lot of
the data, but the combination of times, locations and textual
representations would be way too large to fully flatten.
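
The size concern can be illustrated with a little arithmetic. The
sketch below compares document counts for the objType layout against a
fully flattened layout. The function doc_counts and the numbers passed
to it are purely hypothetical, chosen only to show how the flattened
count multiplies; they are not measurements of the actual data.

```python
def doc_counts(n_main, times_per_main, places_per_main):
    """Return (normalized, flattened) document counts.

    normalized: one doc per main object plus one per related time/place.
    flattened: one doc per (main, time, place) combination.
    """
    normalized = n_main * (1 + times_per_main + places_per_main)
    flattened = n_main * times_per_main * places_per_main
    return normalized, flattened


# Hypothetical sizes, for illustration only.
normalized, flattened = doc_counts(100_000, 20, 20)
print(normalized, flattened)
```

The related documents add to the total in the first layout but multiply
in the second, and each flattened document would also repeat the main
object's text.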
I'm about to start experimenting with different strategies, and I would
appreciate any insight anyone can provide. Would the faceting code help
here somehow?
Thanks --Joachim
- relational design in solr? Joachim Martin