Hi Shaun,

If project content is relatively static, you could use nested documents <https://lucene.apache.org/solr/guide/8_0/indexing-nested-documents.html>, or you could play with the join query parser <https://lucene.apache.org/solr/guide/7_3/other-parsers.html#join-query-parser>.
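For the join option, here is a rough sketch of what the request parameters might look like if the Tika'd content lives in a second core. The core name `content` and the field names `project_id`, `id`, `text` and `title` are assumptions for illustration, not taken from your schema, and a cross-core join like this assumes both cores sit on the same Solr node.

```python
from urllib.parse import urlencode


def join_query_params(content_query, from_field="project_id",
                      to_field="id", from_index="content"):
    """Build Solr params for a cross-core join (all names are hypothetical).

    The {!join} parser runs content_query against from_index, collects the
    from_field values of the matches, and returns documents in the core
    being queried whose to_field equals one of those values.
    """
    q = "{!join from=%s to=%s fromIndex=%s}%s" % (
        from_field, to_field, from_index, content_query)
    # fl lists the project fields to return; adjust to the real schema.
    return {"q": q, "fl": "id,title"}


# Send these to the /select handler of the projects core.
params = join_query_params("text:budget")
query_string = urlencode(params)
print(query_string)
```

Note that a join only returns the project documents; to show which content matched, you would likely need a second query against the content core, or the nested-documents approach.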

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/

> On 1 Jul 2020, at 18:19, Shaun Campbell <campbell.sh...@gmail.com> wrote:
>
> Hi
>
> Been using Solr on a project now for a couple of years and is working well.
> It's just a simple index of about 20 - 25 fields and 7,000 project records.
>
> Now there's a requirement to be able to search on the content of documents
> (web pages, Word, pdf etc) related to those projects. My initial thought
> was to just create a new index to store the Tika'd content and just search
> on that. However, the requirement is to somehow search through both the
> project records and the content records at the same time and list the main
> project with perhaps some info on the matching content data. I tried to
> explain that you may find matching main project records but no content, and
> vice versa.
>
> My only solution to this search problem is to either concatenate all the
> document content into one field on the main project record, and add that to
> my dismax search, and use boosting etc or to use a multi-valued field to
> store the content of each project document. I'm a bit reluctant to do this
> as the application is running well and I'm a bit nervous about a change to
> the schema and the indexing process. I just wondered what you thought
> about adding a lot of content to an existing schema (single or multivalued
> field) that doesn't normally store big amounts of data.
>
> Or does anyone know of any way, I can join two searches like this together
> and two separate indexes?
>
> Thanks
> Shaun