Searching document content and multi-valued fields

Shaun Campbell Wed, 01 Jul 2020 09:20:21 -0700

Hi

I've been using Solr on a project for a couple of years now and it is
working well. It's just a simple index of about 20-25 fields and 7,000
project records.


Now there's a requirement to be able to search the content of documents
(web pages, Word, PDF, etc.) related to those projects. My initial thought
was to create a new index to store the Tika'd content and search that on
its own. However, the requirement is to search both the project records
and the content records at the same time, and to list the main project
with perhaps some info on the matching content. I tried to explain that a
search may match a main project record but none of its content, and vice
versa.

The only solutions I can see are either to concatenate all the document
content into one field on the main project record and add that field to
my dismax search (with boosting etc.), or to use a multi-valued field to
store the content of each project document separately. I'm a bit
reluctant to do either, as the application is running well and I'm
nervous about changing the schema and the indexing process. I just
wondered what you think about adding a lot of content to an existing
schema (in a single or multi-valued field) that doesn't normally store
large amounts of data.
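
A minimal sketch of the multi-valued option, to make the idea concrete. The
field names (`title`, `doc_content`), the boost value, and the sample text
are all made up for illustration and are not taken from the real schema. It
only builds the JSON update payload and the dismax request parameters; it
does not talk to a Solr server. Highlighting on the content field is one
possible way to show "some info on the matching content", and it assumes
that field is stored.

```python
import json
from urllib.parse import urlencode


def build_project_doc(project_id, title, doc_texts):
    """Build one Solr document for a project.

    `doc_content` is a hypothetical multi-valued field: one value per
    related document (the text extracted by Tika).
    """
    return {
        "id": project_id,
        "title": title,
        "doc_content": list(doc_texts),
    }


def build_search_params(user_query):
    """Build dismax request parameters that search project and content fields."""
    return {
        "defType": "dismax",
        "q": user_query,
        # Hypothetical boost: prefer matches in the project title over content.
        "qf": "title^5 doc_content",
        "fl": "id,title",
        # Ask for snippets from the content field (requires a stored field).
        "hl": "true",
        "hl.fl": "doc_content",
    }


doc = build_project_doc(
    "proj-1",
    "Bridge repair",
    ["text extracted from report.pdf", "text extracted from page.html"],
)
payload = json.dumps([doc])  # body for a JSON update request
query_string = urlencode(build_search_params("bridge inspection"))
```

With this layout every hit is still a project record, so the result list
does not change shape; the cost is a larger index and re-indexing a project
whenever one of its documents changes.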

Or does anyone know of a way to join two searches like this together
across two separate indexes?
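
One candidate for the two-index approach is Solr's join query parser, which
accepts a `fromIndex` argument naming another core. The sketch below only
assembles the request parameters and has not been run against a live Solr.
The core name `project_docs` and the fields `project_id`, `content`,
`title` and `description` are hypothetical, and the user input is not
escaped. It assumes the content core lives on the same node as the project
core.

```python
from urllib.parse import urlencode, parse_qs


def build_joined_search(user_query, content_core="project_docs"):
    """Build params matching projects directly OR via their content records.

    Assumed layout (hypothetical): each record in `content_core` has a
    `project_id` field holding the `id` of its parent project.
    """
    # Match on the project's own fields using dismax.
    main = "{!dismax qf=$pqf v=$userq}"
    # Match content records in the other core, then map back to projects.
    join = "{!join from=project_id to=id fromIndex=%s v=$contentq}" % content_core
    return {
        "q": '_query_:"%s" OR _query_:"%s"' % (main, join),
        "userq": user_query,
        "pqf": "title description",
        # No escaping of user input here; a real client would need it.
        "contentq": "content:(%s)" % user_query,
        "fl": "id,title",
    }


params = build_joined_search("bridge")
query_string = urlencode(params)
```

Two limits to keep in mind: the join returns only project records, so
fields or snippets from the matching content records would need a second
query, and the join side does not contribute to relevance scoring.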

Thanks
Shaun
