Re: How to use Solr in my project

Gora Mohanty Thu, 26 Dec 2013 02:01:52 -0800

On 26 December 2013 10:54, Fatima Issawi <issa...@qu.edu.qa> wrote:
> Hello,
>
> First off, I apologize if this was sent twice. I was having issues 
> subscribing to the list.
>
> I'm a complete noob in Solr (and indexing), so I'm hoping someone can help me 
> figure out how to implement Solr in my project. I have gone through some 
> tutorials online and I was able to import and query text in some Arabic PDF 
> documents.
>
> We have some scans of Historical Handwritten Arabic documents that will have 
> text extracted into a database (or PDF). We would like the user to be able to 
> search the document for text, then have the scanned image show up in a viewer 
> with the text highlighted.


This will not work for scanned images, which do not actually contain the
text. If you have the text of the documents, the best that you can do is
break the text into pages corresponding to the scanned images, and index
into Solr the text of each page together with a reference to the scanned
image for that page. For a user search, you will then need to show the
scanned image of the entire page: highlighting the search term in the
image itself is not possible without optical character recognition (OCR).

Similarly, if you are indexing from PDFs, you will need to ensure that they
contain text, and not just images.
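
To make the page-by-page approach above concrete, here is a rough
sketch in Python of building one Solr document per scanned page and
sending the batch to Solr's JSON update handler. The field names (id,
doc_id, page_number, text, image_path) and the helper names are my own
assumptions for illustration; they would have to match whatever you
define in your schema.

```python
import json
import urllib.request


def build_page_docs(doc_id, pages):
    """Build one Solr document per scanned page.

    `pages` is a list of (page_text, image_path) tuples in page order.
    All field names used here are hypothetical and must match the
    fields declared in your own Solr schema.
    """
    docs = []
    for number, (text, image_path) in enumerate(pages, start=1):
        docs.append({
            "id": f"{doc_id}_p{number}",   # unique key: one entry per page
            "doc_id": doc_id,              # groups the pages of one document
            "page_number": number,
            "text": text,                  # searchable text of this page
            "image_path": image_path,      # scan to show for a search hit
        })
    return docs


def post_to_solr(docs, update_url):
    """POST the documents as a JSON array to a Solr update handler.

    `update_url` is supplied by the caller; this function assumes
    nothing about the host or core name.
    """
    body = json.dumps(docs, ensure_ascii=False).encode("utf-8")
    request = urllib.request.Request(
        update_url,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

You would call build_page_docs() once per document and pass the result
to post_to_solr() with your own update URL, for instance something like
http://localhost:8983/solr/collection1/update?commit=true on a default
local install (adjust the host and core name to your setup). A search
hit then gives you back image_path and page_number, which is enough to
open the right scan in your viewer.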

Regards,
Gora
