Indexing by chapter seems too broad and indexing by line seems too
narrow -- have you thought about indexing at the paragraph level, with
one Solr document per paragraph? Something like:

docID, book fields (title, author, publisher, etc), chapter fields (#,
title, pages, etc), section fields (title, #, etc), sub-sectionN
fields, paragraph text, lines

Line numbers seem useful only for display, so just store the range of
lines each paragraph covers.
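
To make the idea concrete, here is a rough sketch of flattening a book
into one document per paragraph. It is only an illustration: the field
names (book_title, line_start, line_end, etc.), the input layout, and
the "blank line ends a paragraph" rule are all my assumptions, not
anything Solr requires. Lines are numbered from 1 within each chapter.

```python
import json


def paragraph_docs(book):
    """Yield one flat document per paragraph, with the line range it covers.

    Assumes (hypothetically) that each chapter is a list of lines and that
    a blank line separates paragraphs.
    """
    for chapter in book["chapters"]:
        para, start = [], None
        # The trailing "" is a sentinel that flushes the last paragraph.
        for line_no, line in enumerate(chapter["lines"] + [""], start=1):
            if line.strip():
                if start is None:
                    start = line_no  # first line of a new paragraph
                para.append(line.strip())
            elif para:
                yield {
                    # Hypothetical unique key: book, chapter, first line.
                    "id": f'{book["id"]}-{chapter["number"]}-{start}',
                    "book_title": book["title"],
                    "author": book["author"],
                    "chapter_number": chapter["number"],
                    "chapter_title": chapter["title"],
                    "line_start": start,
                    "line_end": line_no - 1,
                    "text": " ".join(para),
                }
                para, start = [], None


# Made-up sample data, just to exercise the function.
sample_book = {
    "id": "book1",
    "title": "Example Book",
    "author": "A. Author",
    "chapters": [
        {
            "number": 1,
            "title": "Opening",
            "lines": [
                "The first paragraph starts here",
                "and continues on a second line.",
                "",
                "The second paragraph starts here",
                "and also runs for two lines.",
            ],
        }
    ],
}

docs = list(paragraph_docs(sample_book))
print(json.dumps(docs, indent=2))
```

Because the text of a paragraph is indexed as one field, a phrase that
wraps across two lines still matches, and line_start/line_end are there
purely for display. The resulting list of dicts could then be posted to
Solr as JSON by whatever client you use.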



On Tue, Apr 23, 2013 at 7:51 PM, Walter Underwood <wun...@wunderwood.org> wrote:
> If you can represent your books in XML, then MarkLogic could do the job very 
> cleanly. It isn't free, but it is very good.
>
> wunder
>
> On Apr 23, 2013, at 6:47 PM, Jason Funk wrote:
>
>> Is there a better tool than Solr to use for my situation?
>>
>>
>> On Apr 23, 2013, at 5:04 PM, Jack Krupansky <j...@basetechnology.com> wrote:
>>
>>> There is no simple, obvious, and direct approach, right out of the box. 
>>> Sure, you can highlight passages of raw text, right out of the box, but 
>>> that won't give you chapters, pages, and line numbers. To do all of that, 
>>> you would have to either:
>>>
>>> 1. Add chapter, page, and line number as part of the payload for each word. 
>>> And add some custom document transformers to access the information.
>>> or
>>> 2. Index each line as a separate Solr document, with fields for book, 
>>> chapter, page, and line number.
>>>
>>> -- Jack Krupansky
>>>
>>> -----Original Message----- From: Jason Funk
>>> Sent: Tuesday, April 23, 2013 5:02 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Book text with chapter line number
>>>
>>> Hello.
>>>
>>> I'm trying to figure out if Solr is going to work for a new project that I 
>>> am wanting to build. At its heart it's a book text searching application. 
>>> Each book is broken into chapters and each chapter is broken into lines. I 
>>> want to be able to search these books and return relevant sections of the 
>>> book and display the results with chapter and line number. I'm not sure how 
>>> I would structure my data so that it's efficient and functional. I could 
>>> simply treat each line of text as a document which would provide some of 
>>> the functionality but what if the search query spanned two lines? Then it 
>>> seems the passage the user was searching for wouldn't be returned. I could 
>>> treat each book as a document and use highlighting to find the context but 
>>> that seems to limit weighting/results for best matches and makes it 
>>> difficult to find chapter/line numbers. What is the best way to do 
>>> this with Solr?
>>>
>>> Is there a better tool to use to solve my problem?
>>
>
> --
> Walter Underwood
> wun...@wunderwood.org
>
>
>
