RE: How to use Solr in my project

2013-12-30 Thread Fatima Issawi
y, December 30, 2013 11:46 AM
> To: solr-user@lucene.apache.org
> Subject: Re: How to use Solr in my project
>
> On 30 December 2013 11:27, Fatima Issawi wrote:
> > Hi again,
> >
> > We have another program that will be extracting the text, and it will be extr

Re: How to use Solr in my project

2013-12-30 Thread Gora Mohanty
On 30 December 2013 11:27, Fatima Issawi wrote:
> Hi again,
>
> We have another program that will be extracting the text, and it will be extracting the top right and bottom left corners of the words. You are right, I do expect to have a lot of data.
>
> When would solr start experiencing iss

RE: How to use Solr in my project

2013-12-29 Thread Fatima Issawi
; Sent: Sunday, December 29, 2013 2:48 PM
> To: solr-user@lucene.apache.org
> Subject: Re: How to use Solr in my project
>
> On 29 December 2013 11:10, Fatima Issawi wrote:
> [...]
> > We will have the full text stored, but we want to highlight the text in the original imag

Re: How to use Solr in my project

2013-12-29 Thread Gora Mohanty
On 29 December 2013 11:10, Fatima Issawi wrote:
[...]
> We will have the full text stored, but we want to highlight the text in the original image. I expect to process the image after retrieval. We do plan on storing the (x, y) coordinates of the words in a database - I suspected that it

RE: How to use Solr in my project

2013-12-28 Thread Fatima Issawi
Hello,
Our pages are images of handwritten text in Arabic so OCR'ing is not possible. We will be extracting the text during pre-processing and storing the words and (x, y) coordinates in a database. Would your process apply to our images?
> Step 1:
> For sending the extracted text content from

RE: How to use Solr in my project

2013-12-28 Thread Fatima Issawi
> What do you mean by "word location"? The number on the page? What purpose would this serve?
I mean the (x, y) coordinates of the word on the page. We want to be able to highlight the image of the word that was extracted from the text.
> I think that you might be confusing things:
> * If you

Re: How to use Solr in my project

2013-12-27 Thread Gopal Agarwal
Highlighting can be done as a three-step process:
Pre-requisite: Get the PDF with text after the OCR of the image PDF.
Step 1: For sending the extracted text content from the text PDF to Solr, use a low-level PDF converter such as poppler-utils (pdftotext or pdftohtml) to correctly get the coordinates

Re: How to use Solr in my project

2013-12-26 Thread Gora Mohanty
On 26 December 2013 15:44, Fatima Issawi wrote:
> Hi,
>
> I should clarify. We have another application extracting the text from the document. The full text from each document will be stored in a database either at the document level or page level (this hasn't been decided yet). We will a

RE: How to use Solr in my project

2013-12-26 Thread Fatima Issawi
make more sense?
Fatima
-----Original Message-----
From: Gora Mohanty [mailto:g...@mimirtech.com]
Sent: Thursday, December 26, 2013 1:00 PM
To: solr-user@lucene.apache.org
Subject: Re: How to use Solr in my project
On 26 December 2013 10:54, Fatima Issawi wrote:
> Hello,
>
> First off, I apolo

Re: How to use Solr in my project

2013-12-26 Thread Gora Mohanty
On 26 December 2013 10:54, Fatima Issawi wrote:
> Hello,
>
> First off, I apologize if this was sent twice. I was having issues subscribing to the list.
>
> I'm a complete noob in Solr (and indexing), so I'm hoping someone can help me figure out how to implement Solr in my project. I have go
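
Several messages in this thread mention storing each extracted word together with its (x, y) coordinates in a database, so that the word can be highlighted on the page image after a search. A minimal sketch of that lookup, assuming an SQLite table; the table name `word_box`, its columns, and the helper functions are hypothetical and not taken from the thread, and the box is stored as the bottom-left and top-right corners mentioned above:

```python
import sqlite3


def build_word_index(rows):
    """Create an in-memory table of word boxes (hypothetical schema).

    Each row is (page_id, word, left_x, bottom_y, right_x, top_y).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE word_box ("
        "page_id INTEGER, word TEXT, "
        "left_x INTEGER, bottom_y INTEGER, right_x INTEGER, top_y INTEGER)"
    )
    conn.executemany("INSERT INTO word_box VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def boxes_for(conn, page_id, word):
    """Return the boxes of every occurrence of `word` on one page."""
    cur = conn.execute(
        "SELECT left_x, bottom_y, right_x, top_y FROM word_box "
        "WHERE page_id = ? AND word = ? ORDER BY left_x, bottom_y",
        (page_id, word),
    )
    return cur.fetchall()


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        (1, "كتاب", 120, 40, 180, 60),
        (1, "قلم", 200, 40, 240, 60),
        (2, "كتاب", 50, 300, 110, 320),
    ]
    conn = build_word_index(sample)
    print(boxes_for(conn, 1, "كتاب"))
```

In this sketch the search engine would only need to return the matching page and the query term; the rectangles to draw on the image come from the database lookup.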