Re: Searching individual pages in solr

2020-03-24 Thread Erick Erickson
Well, given the structure of an inverted index, how would you have a clue what page the hit was on? You could conceivably index enough data with payloads and the like, but that’d cause a lot more bloat than just indexing each page. Using grouping would allow you to show, say, the top three pages
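As a rough illustration of the per-page approach with grouping, here is a minimal sketch. It assumes each page is indexed as its own Solr document, with a shared single-valued string field identifying the parent PDF. The field names (doc_id, page_number, page_text) and the helper functions are hypothetical and not taken from the thread. The group, group.field and group.limit request parameters are Solr's standard result grouping parameters.

```python
from urllib.parse import urlencode


def page_documents(doc_id, pages):
    """Build one Solr document per page of a PDF.

    `pages` is a list of already extracted page texts, in page order.
    All field names here are hypothetical; adapt them to your schema.
    """
    return [
        {
            "id": f"{doc_id}_p{number}",  # unique key per page
            "doc_id": doc_id,             # shared by all pages of one PDF
            "page_number": number,
            "page_text": text,
        }
        for number, text in enumerate(pages, start=1)
    ]


def grouped_query_params(user_query, pages_per_document=3):
    """Build query parameters that group page hits by parent document.

    group.limit caps how many page hits come back per PDF, for example
    the top three pages.
    """
    return {
        "q": user_query,
        "df": "page_text",
        "group": "true",
        "group.field": "doc_id",
        "group.limit": pages_per_document,
        "fl": "id,doc_id,page_number",
    }


if __name__ == "__main__":
    docs = page_documents("report-2020", ["first page text", "second page text"])
    print(docs)
    # Query string to append to a collection's /select handler.
    print(urlencode(grouped_query_params("inverted index")))
```

The page documents would be sent to the collection's update handler in the usual way, and the query string appended to the /select handler. Each group in the response then corresponds to one PDF, with its best-matching pages listed inside it.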

Searching individual pages in solr

2020-03-24 Thread Dustin Lebsock
Hi! I'm looking for some guidance on engineering a solution for searching individual pages of PDF documents. I currently have a SolrCloud setup that uses an external Tika server to extract text data from PDFs. I'd like to be able to search individual pages for search results and for the overall