Hello,

I have two sources of data describing the same items that I want to search: books in a library. The first is the usual bibliographic data (author, title, ...), and the second is scanned and OCRed table-of-contents data for the same books. The two sources are updated independently.
I don't know how best to index and search this data.
- One option would be to save the data in different records. That would
  make updates easy because I don't have to worry about the fields
  from the other source. But searching would be more difficult: I have
  to do an additional search for every hit in the "contents" data to
  get the bibliographic data.
- The other option would be to save everything in one record, but then
  updates would be difficult. Before I can update a record I must first
  check whether there is any data from the other source, merge it into
  the record, and only then update it. This option sounds very
  time-consuming for a complete reindex.
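
To make the two options concrete, here is a rough sketch in Python with plain dictionaries standing in for the index. The shared key `book_id`, the field names (`author`, `title`, `toc`) and all function names are made up for illustration; they are not the API of any particular search engine.

```python
# Option 1: separate records per source, linked by a shared book id (assumed).
biblio_index = {}    # book_id -> {"author": ..., "title": ...}
contents_index = {}  # book_id -> {"toc": ...}


def update_biblio(book_id, fields):
    # Each source overwrites only its own record; no merging needed.
    biblio_index[book_id] = dict(fields)


def update_contents(book_id, toc):
    contents_index[book_id] = {"toc": toc}


def search_contents(term):
    # Every hit in the contents data needs an extra lookup
    # to fetch the bibliographic data.
    hits = []
    for book_id, record in contents_index.items():
        if term.lower() in record["toc"].lower():
            biblio = biblio_index.get(book_id, {})
            hits.append({"book_id": book_id, **biblio, **record})
    return hits


# Option 2: one merged record per book.
merged_index = {}  # book_id -> fields from both sources


def update_merged(book_id, new_fields):
    # Read the existing record first so the other source's fields survive.
    record = dict(merged_index.get(book_id, {}))
    record.update(new_fields)
    merged_index[book_id] = record


def search_merged(term):
    # A single pass is enough; the record already holds everything.
    return [
        {"book_id": book_id, **record}
        for book_id, record in merged_index.items()
        if term.lower() in record.get("toc", "").lower()
    ]
```

In option 1 the cost shows up at search time (one extra lookup per hit); in option 2 it shows up at update time (one read before every write).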

The best solution would be some sort of join: keep two records per book in the index, but always return both in the result, no matter which one the hit was in.
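
What I have in mind behaves roughly like the following sketch. It is only an illustration of the desired result, not of how an index would implement it: the dictionaries, the shared `book_id` key and the function `join_search` are all hypothetical, and matching is a naive substring test.

```python
def join_search(term, biblio, contents):
    """Return both records for every book that matches in either source."""
    term = term.lower()
    matched = set()
    # Collect the ids of all books with a hit in either source.
    for source in (biblio, contents):
        for book_id, record in source.items():
            if any(term in str(value).lower() for value in record.values()):
                matched.add(book_id)
    # Always return both records; a missing record is reported as None.
    return {
        book_id: {
            "biblio": biblio.get(book_id),
            "contents": contents.get(book_id),
        }
        for book_id in sorted(matched)
    }
```

The two sources stay separate for updates, and the caller still gets one combined result per book.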
Any ideas on how to best organize this kind of data?

-Michael
