On 29 December 2013 11:10, Fatima Issawi <issa...@qu.edu.qa> wrote:
[...]
> We will have the full text stored, but we want to highlight the text in the 
> original image. I expect to process the image after retrieval. We do plan on 
> storing the (x, y) coordinates of the words in a database - I suspected that 
> it would be too expensive to store them in Solr. I guess I'm still confused 
> about how to use Solr to index the document, but then retrieve the (x, y) 
> coordinates of the search term from the database. Is this possible? If it 
> can, can you give an example how this can be done?

Storing and retrieving the coordinates from Solr will likely be
faster than from the database. However, I still think that you
should think more carefully about your use case of highlighting
the images. It can be done, but it is a significant amount of work
and will need both storage and computational resources.

1. For highlighting in the image, you will need to store two sets
    of coordinates per word (e.g., top right and bottom left
    corners), as you do not know the length of the word in the
    image. That is four numbers per word. Thus, say with 15 words
    per line, 50 lines per page, 100 pages per document, you will
    need to store:
      4 x 15 x 50 x 100 = 300,000 coordinates/document
2. Also, how are you going to get the coordinates in the first
    place?
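
To make the arithmetic in point 1 easy to redo with your own numbers,
here is a minimal Python sketch. The function name and the
bytes_per_value parameter are hypothetical, and the 4-byte size is
only an assumption about how a coordinate might be stored; the other
figures are the illustrative ones used above, not measurements.

```python
# Rough estimate of how many coordinate values one document needs.
# Figures are the illustrative ones from the message, not measurements.

def coordinates_per_document(words_per_line, lines_per_page,
                             pages_per_document, values_per_word=4):
    """Return the number of coordinate values for one document.

    values_per_word=4 assumes two corners per word, each an (x, y) pair.
    """
    return (values_per_word * words_per_line
            * lines_per_page * pages_per_document)


total = coordinates_per_document(15, 50, 100)
print(total)  # 300000

# Assumption: each value is stored as a 4-byte integer.
bytes_per_value = 4
print(total * bytes_per_value)  # 1200000 bytes of raw coordinate data
```

Multiplying the result by the number of documents you expect gives a
first idea of the raw volume, before any index or database overhead.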

Regards,
Gora
