Hello,
I have two sources of data for the same "things" to search: books in a
library. First there is the usual bibliographic data (author, title...),
and then I have scanned and OCRed table-of-contents data for the same
books. The two sources are updated independently.
Now I don't know how to best index and search this data.
- One option would be to store the data in separate records. That would
make updates easy because I don't have to worry about the fields
from the other source. But searching would be more difficult: I would
have to do an additional search for every hit in the "contents" data
to get the bibliographic data.
- The other option would be to store everything in one record, but then
updates would be difficult. Before I can update a record I must first
check whether there is any data from the other source, merge it into
the record, and only then update it. This option sounds very
time-consuming for a complete reindex.
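
To make the second option concrete, here is a rough sketch of the
read-merge-write step in plain Python. It is not tied to any search
engine: the dict `index`, the function `update_from_source`, the field
prefixes "bib" and "toc", and the sample values are all made up for
illustration.

```python
# Hypothetical in-memory stand-in for the index: one merged record per book id.
index = {}

def update_from_source(book_id, source, fields):
    """Replace one source's fields while keeping the other source's fields."""
    record = index.get(book_id, {})  # read whatever is already stored
    # Keep only the fields that belong to the other source.
    merged = {k: v for k, v in record.items()
              if not k.startswith(source + "_")}
    # Add the fresh fields, prefixed with the name of their source.
    merged.update({f"{source}_{k}": v for k, v in fields.items()})
    index[book_id] = merged          # write the combined record back
    return merged

update_from_source("b1", "bib", {"author": "A. Author", "title": "First Edition"})
update_from_source("b1", "toc", {"text": "Chapter 1 Introduction"})
# A later bibliographic update must not lose the table-of-contents text.
update_from_source("b1", "bib", {"author": "A. Author", "title": "Second Edition"})
```

Every update costs one lookup plus one write, which is where the extra
time for a complete reindex would come from.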
The best solution would be some sort of join: keep two records in the
index but always return both in the result, no matter which one matched.
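
The behaviour I have in mind could be sketched like this in plain
Python. Again this is only an illustration and not any engine's API: the
shared key (the dict keys "b1", "b2"), the two dicts, the `search`
function, and the sample data are assumptions.

```python
# Hypothetical records from the two sources, keyed by a shared book id.
bib_records = {
    "b1": {"author": "A. Author", "title": "Search Basics"},
    "b2": {"author": "B. Writer", "title": "Library Catalogs"},
}
toc_records = {
    "b1": {"text": "1. Indexing 2. Querying"},
}

def search(term):
    """Match against either source, then return both records per book."""
    term = term.lower()
    hits = set()
    for records in (bib_records, toc_records):
        for book_id, record in records.items():
            # Naive substring match over all field values of the record.
            if any(term in str(value).lower() for value in record.values()):
                hits.add(book_id)
    # The "join": look up both sides by id, wherever the hit came from.
    return {
        book_id: {"bib": bib_records.get(book_id),
                  "toc": toc_records.get(book_id)}
        for book_id in sorted(hits)
    }

result = search("indexing")  # matches only the table-of-contents text of b1
```

Here the hit is in the "contents" data only, but the result for "b1"
carries the bibliographic record as well. Both sources could still be
updated separately.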
Any ideas on how to best organize this kind of data?
-Michael