It would then be easy to store a map from "term position" to line number and 
page number along with each paragraph, wouldn't it?
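
A minimal sketch of that map, not tied to any Solr API. It assumes terms are 
produced by plain whitespace splitting (a real analyzer may assign positions 
differently), and the names ParagraphDoc, build_paragraph and locate are 
made up for illustration. Each paragraph keeps the term position at which 
every source line starts, so a hit position can be turned back into a 
(page, line) pair with a binary search.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class ParagraphDoc:
    """One paragraph, stored with a term-position -> (page, line) map."""
    book: str
    chapter: int
    text: str
    term_count: int = 0
    # Parallel lists: line_starts[i] is the position of the first term of a
    # source line, locations[i] is the (page, line_number) of that line.
    line_starts: list = field(default_factory=list)
    locations: list = field(default_factory=list)

    def locate(self, term_position):
        """Return the (page, line_number) containing the given term position."""
        if not 0 <= term_position < self.term_count:
            raise ValueError("term position outside this paragraph")
        # Last line whose first term is at or before term_position.
        i = bisect_right(self.line_starts, term_position) - 1
        return self.locations[i]


def build_paragraph(book, chapter, lines):
    """Build a ParagraphDoc from (page, line_number, text) tuples.

    Assumption: whitespace tokenization stands in for the real analyzer.
    """
    terms, starts, locs = [], [], []
    for page, line_no, text in lines:
        words = text.split()
        if not words:
            continue  # blank lines contribute no terms
        starts.append(len(terms))
        locs.append((page, line_no))
        terms.extend(words)
    return ParagraphDoc(book, chapter, " ".join(terms), len(terms), starts, locs)


# Hypothetical paragraph that spans a page break.
example = build_paragraph("Some Book", 1, [
    (12, 30, "It was the best of times,"),
    (12, 31, "it was the worst of times,"),
    (13, 1, "it was the age of wisdom"),
])
```

Because the paragraph text is indexed as one unit, a phrase that runs across 
two lines still matches, and locate() on the first and last matched 
positions gives the line range to display.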

Paul


On 24 Apr. 2013, at 16:24, Timothy Potter wrote:

> Chapter seems too broad and line seems too narrow -- have you thought
> about paragraph level? Something like:
> 
> docID, book fields (title, author, publisher, etc), chapter fields (#,
> title, pages, etc), section fields (title, #, etc), sub-sectionN
> fields, paragraph text, lines
> 
> Seems like line #'s would only be useful for display so just store the
> lines the paragraph covers.
> 
> 
> 
> On Tue, Apr 23, 2013 at 7:51 PM, Walter Underwood <wun...@wunderwood.org> wrote:
>> If you can represent your books in XML, then MarkLogic could do the job very 
>> cleanly. It isn't free, but it is very good.
>> 
>> wunder
>> 
>> On Apr 23, 2013, at 6:47 PM, Jason Funk wrote:
>> 
>>> Is there a better tool than Solr to use for my situation?
>>> 
>>> 
>>> On Apr 23, 2013, at 5:04 PM, Jack Krupansky <j...@basetechnology.com> wrote:
>>> 
>>>> There is no simple, obvious, and direct approach, right out of the box. 
>>>> Sure, you can highlight passages of raw text, right out of the box, but 
>>>> that won't give you chapters, pages, and line numbers. To do all of that, 
>>>> you would have to either:
>>>> 
>>>> 1. Add chapter, page, and line number as part of the payload for each 
>>>> word, and add some custom document transformers to access the information.
>>>> or
>>>> 2. Index each line as a separate Solr document, with fields for book, 
>>>> chapter, page, and line number.
>>>> 
>>>> -- Jack Krupansky
>>>> 
>>>> -----Original Message----- From: Jason Funk
>>>> Sent: Tuesday, April 23, 2013 5:02 PM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Book text with chapter line number
>>>> 
>>>> Hello.
>>>> 
>>>> I'm trying to figure out if Solr is going to work for a new project that I 
>>>> want to build. At its heart it's a book text searching application. 
>>>> Each book is broken into chapters and each chapter is broken into lines. I 
>>>> want to be able to search these books and return relevant sections of the 
>>>> book and display the results with chapter and line number. I'm not sure 
>>>> how I would structure my data so that it's efficient and functional. I 
>>>> could simply treat each line of text as a document which would provide 
>>>> some of the functionality but what if the search query spanned two lines? 
>>>> Then it seems the passage the user was searching for wouldn't be returned. 
>>>> I could treat each book as a document and use highlighting to find the 
>>>> context, but that seems to limit weighting/results for best matches and 
>>>> makes it difficult to find chapter/line numbers. What is the best way to 
>>>> do this with Solr?
>>>> 
>>>> Is there a better tool to use to solve my problem?
>>> 
>> 
>> --
>> Walter Underwood
>> wun...@wunderwood.org
>> 
>> 
>> 
