Hi,
I want to use Solr to index some scanned documents. After setting up a
Solr schema with two fields, "content" and "filename", I tried to upload
the attached file, but the indexed content of the file is only "\n \n
\n....".
If I run tesseract from the command line on the same file, I get the
correct result.
The log entry when Solr receives my request:
-----------
INFO - 2015-04-23 03:49:25.941; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp=/solr path=/update/extract params={literal.groupid=2&json.nl=flat&resource.name=phplNiPrs&literal.id=4&commit=true&extractOnly=false&literal.historyid=4&omitHeader=true&literal.userid=3&literal.createddate=2015-04-22T15:00:00Z&fmap.content=content&wt=json&literal.filename=\\trunght\test\tesseract_3.png}
------------
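For reference, the request in the log above can be rebuilt roughly as in the following Python sketch. This is not the client code I actually use. The host and port (localhost:8983) and the helper name build_extract_url are assumptions for illustration only. The parameter names and values are copied from the log entry.

```python
from urllib.parse import urlencode

# Assumption: Solr listens on localhost:8983; the core name "collection1"
# and the handler path /update/extract are taken from the log entry.
BASE_URL = "http://localhost:8983/solr/collection1/update/extract"


def build_extract_url(base_url, extract_only=False):
    """Hypothetical helper: build the /update/extract URL seen in the log."""
    params = {
        "literal.id": "4",
        "literal.groupid": "2",
        "literal.historyid": "4",
        "literal.userid": "3",
        "literal.createddate": "2015-04-22T15:00:00Z",
        "literal.filename": r"\\trunght\test\tesseract_3.png",
        "fmap.content": "content",
        "resource.name": "phplNiPrs",
        "commit": "true",
        "extractOnly": "true" if extract_only else "false",
        "omitHeader": "true",
        "wt": "json",
        "json.nl": "flat",
    }
    # urlencode percent-encodes the backslashes and colons in the values.
    return base_url + "?" + urlencode(params)


url = build_extract_url(BASE_URL)
print(url)
```

Calling build_extract_url(BASE_URL, extract_only=True) sets extractOnly=true, which should make Solr return the extracted text in the response instead of indexing it, so the output of the extraction step can be inspected directly.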
The document as shown on the Solr admin page:
-------------
{
  "groupid": 2,
  "id": "4",
  "historyid": 4,
  "userid": 3,
  "createddate": "2015-04-22T15:00:00Z",
  "filename": "\\\\trunght\\test\\tesseract_3.png",
  "autocomplete_text": [ "\\\\trunght\\test\\tesseract_3.png" ],
  "content": " \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n ",
  "_version_": 1499213034586898400
}
-----------
Since I am a Solr newbie, I do not know where to look. Can anyone advise
me where to look for errors, or which settings to change to make it work?
Thanks in advance.
Trung.