It's then easy to store a map from "term position" to line number and page number along with each paragraph, isn't it?
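A rough sketch of what such a map could look like, in plain Python. This is only an illustration under stated assumptions: the names `build_position_map` and `locate` are made up for this example, the sample text and page/line numbers are invented, and tokens are split on whitespace, which will not necessarily match the term positions produced by a real Solr analyzer.

```python
from bisect import bisect_right


def build_position_map(lines):
    """Build a term-position map for one paragraph.

    `lines` is a list of (page, line_no, text) tuples in reading order.
    Returns (paragraph_text, starts, locations), where starts[i] is the
    position of the first token of the i-th non-empty line and
    locations[i] is that line's (page, line_no).
    Assumption: whitespace tokenization, not a real Solr analyzer.
    """
    tokens, starts, locations = [], [], []
    for page, line_no, text in lines:
        words = text.split()
        if not words:
            continue  # skip blank lines, they own no term positions
        starts.append(len(tokens))
        locations.append((page, line_no))
        tokens.extend(words)
    return " ".join(tokens), starts, locations


def locate(position, starts, locations):
    """Return the (page, line_no) of the line containing a term position."""
    if position < 0 or not starts:
        raise ValueError("position is outside the paragraph")
    # Find the last line whose first token is at or before `position`.
    return locations[bisect_right(starts, position) - 1]


# Invented sample paragraph: three lines that cross a page boundary.
paragraph = [
    (12, 301, "It was the best of times,"),
    (12, 302, "it was the worst of times,"),
    (13, 303, "it was the age of wisdom"),
]
text, starts, locations = build_position_map(paragraph)
print(starts)                          # [0, 6, 12]
print(locate(7, starts, locations))    # (12, 302)
print(locate(12, starts, locations))   # (13, 303)
```

The paragraph text would go into the searchable field, and the two small lists could be kept in a stored-only field next to it, so that a match position reported for the paragraph can be turned back into a page and line for display.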
Paul

On 24 Apr 2013, at 16:24, Timothy Potter wrote:

> Chapter seems too broad and line seems too narrow -- have you thought
> about paragraph level? Something like:
>
> docID, book fields (title, author, publisher, etc), chapter fields (#,
> title, pages, etc), section fields (title, #, etc), sub-sectionN
> fields, paragraph text, lines
>
> Seems like line #'s would only be useful for display so just store the
> lines the paragraph covers.
>
>
> On Tue, Apr 23, 2013 at 7:51 PM, Walter Underwood <wun...@wunderwood.org>
> wrote:
>> If you can represent your books in XML, then MarkLogic could do the job very
>> cleanly. It isn't free, but it is very good.
>>
>> wunder
>>
>> On Apr 23, 2013, at 6:47 PM, Jason Funk wrote:
>>
>>> Is there a better tool than Solr to use for my situation?
>>>
>>>
>>> On Apr 23, 2013, at 5:04 PM, Jack Krupansky <j...@basetechnology.com> wrote:
>>>
>>>> There is no simple, obvious, and direct approach, right out of the box.
>>>> Sure, you can highlight passages of raw text, right out of the box, but
>>>> that won't give you chapters, pages, and line numbers. To do all of that,
>>>> you would have to either:
>>>>
>>>> 1. Add chapter, page, and line number as part of the payload for each
>>>> word. And add some custom document transformers to access the information.
>>>> or
>>>> 2. Index each line as a separate Solr document, with fields for book,
>>>> chapter, page, and line number.
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> -----Original Message----- From: Jason Funk
>>>> Sent: Tuesday, April 23, 2013 5:02 PM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Book text with chapter line number
>>>>
>>>> Hello.
>>>>
>>>> I'm trying to figure out if Solr is going to work for a new project that I
>>>> am wanting to build. At its heart it's a book text searching application.
>>>> Each book is broken into chapters and each chapter is broken into lines. I
>>>> want to be able to search these books and return relevant sections of the
>>>> book and display the results with chapter and line number. I'm not sure
>>>> how I would structure my data so that it's efficient and functional. I
>>>> could simply treat each line of text as a document which would provide
>>>> some of the functionality but what if the search query spanned two lines?
>>>> Then it seems the passage the user was searching for wouldn't be returned.
>>>> I could treat each book as a document and use highlighting to find the
>>>> context but that seems to limit weighting/results for best matches as well
>>>> as difficulty in finding chapter/line numbers. What is the best way to do
>>>> this with Solr?
>>>>
>>>> Is there a better tool to use to solve my problem?
>>>
>>
>> --
>> Walter Underwood
>> wun...@wunderwood.org
>>
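
To make the "query spanned two lines" concern from the original question concrete, here is a small plain-Python sketch. It does not call Solr: `phrase_in` is a made-up stand-in for a phrase query, the sample text is invented, and the field names (`chapter`, `line`, `first_line`, `last_line`, `text`) are only assumptions about how such documents might be laid out.

```python
def phrase_in(text, phrase):
    """Naive phrase match on lowercased whitespace tokens.

    A hypothetical stand-in for a phrase query, not Solr's actual matching.
    """
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


# Layout A: one document per line (invented sample text).
line_docs = [
    {"chapter": 1, "line": 1, "text": "the quick brown fox jumps"},
    {"chapter": 1, "line": 2, "text": "over the lazy dog"},
]

# Layout B: one document per paragraph, storing the lines it covers.
paragraph_doc = {
    "chapter": 1,
    "first_line": line_docs[0]["line"],
    "last_line": line_docs[-1]["line"],
    "text": " ".join(d["text"] for d in line_docs),
}

query = "jumps over"  # this phrase straddles the line break

line_hits = [d["line"] for d in line_docs if phrase_in(d["text"], query)]
paragraph_hit = phrase_in(paragraph_doc["text"], query)

print(line_hits)      # [] : no single line contains the whole phrase
print(paragraph_hit)  # True : the paragraph-level document matches
```

With one document per line the phrase is never found, while the paragraph-level document matches and still carries the range of lines it covers for display.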