Re: How to index PDF file stored in SQL Server 2008

2011-04-11 Thread Roy Liu
I changed data-config-sql.xml. There are no errors, but the indexed PDF is converted to numbers: 200 1 202 1 203 1 212 1 222 1 236 1 242 1 244 1 254 1 255 -- Best Regards, Roy Liu On Mon, Apr 11, 2011 at 2:02 PM, Roy Liu wrote: >

Re: How to index PDF file stored in SQL Server 2008

2011-04-10 Thread Roy Liu
Hi, I have copied \apache-solr-3.1.0\dist\apache-solr-dataimporthandler-extras-3.1.0.jar into \apache-tomcat-6.0.32\webapps\solr\WEB-INF\lib\. Other errors: Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Unclosed quotation mark after the character string 'B@3e574'. -- Best Regards,

Re: How to index PDF file stored in SQL Server 2008

2011-04-10 Thread Roy Liu
Hi all, thank you very much for your kind help. 1. I have upgraded from Solr 1.4 to Solr 3.1. 2. Changed data-config-sql.xml. 3. solrconfig.xml and schema.xml are NOT changed. However, when I

Re: How to index PDF file stored in SQL Server 2008

2011-04-10 Thread Lance Norskog
You have to upgrade completely to the Apache Solr 3.1 release. It is worth the effort. You cannot copy any jars between Solr releases. Also, you cannot copy over jars from newer Tika releases. On Fri, Apr 8, 2011 at 10:47 AM, Darx Oman wrote: > Hi again > what you are missing is field mapping >

Re: How to index PDF file stored in SQL Server 2008

2011-04-08 Thread Darx Oman
Hi again, what you are missing is field mapping. No need for TikaEntityProcessor, since you are not accessing PDF files

Re: How to index PDF file stored in SQL Server 2008

2011-04-08 Thread Darx Oman
Hi there, TikaEntityProcessor is available as part of DIH-extras*.jar in 3.x and 4.0

Re: How to index PDF file stored in SQL Server 2008

2011-04-07 Thread Roy Liu
Thanks Lance, I'm using Solr 1.4. If I want to use TikaEP, do I need to upgrade to Solr 3.1 or import jar files? Best Regards, Roy Liu On Fri, Apr 8, 2011 at 10:22 AM, Lance Norskog wrote: > You need the TikaEntityProcessor to unpack the PDF image. You are > sticking binary blobs into the index.

Re: How to index PDF file stored in SQL Server 2008

2011-04-07 Thread Lance Norskog
You need the TikaEntityProcessor to unpack the PDF image. You are sticking binary blobs into the index. Tika unpacks the text out of the file. TikaEP is not in Solr 1.4, but it is in the new Solr 3.1 release. On Thu, Apr 7, 2011 at 7:14 PM, Roy Liu wrote: > Hi, > > I have a table named *attachme
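
For readers following the thread, a DIH configuration along the lines Lance describes might look roughly like the sketch below. It is not the configuration from the original posts, whose XML did not survive in this archive. It assumes Solr 3.1 with the dataimporthandler-extras jar and its Tika dependencies on the classpath, the Microsoft JDBC driver, and a schema field named text. The host, database name, credentials, data source names, and entity names are all placeholders. The outer entity reads rows over JDBC; the nested entity reads the binary column through FieldStreamDataSource so that Tika extracts the text instead of the raw bytes being indexed.

```xml
<dataConfig>
  <!-- JDBC connection to SQL Server; URL, user and password are placeholders -->
  <dataSource name="db"
              type="JdbcDataSource"
              driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://localhost:1433;databaseName=mydb"
              user="solr_user"
              password="changeme"/>

  <!-- Exposes a binary column of the parent row as a stream for the nested entity -->
  <dataSource name="fieldStream" type="FieldStreamDataSource"/>

  <document>
    <!-- One Solr document per row of the attachment table -->
    <entity name="doc" dataSource="db"
            query="SELECT id, title, attachment FROM attachment">
      <field column="id" name="id"/>
      <field column="title" name="title"/>

      <!-- dataField is parentEntityName.columnName; Tika emits the body as column "text" -->
      <entity name="pdf" processor="TikaEntityProcessor"
              dataSource="fieldStream"
              dataField="doc.attachment"
              format="text">
        <!-- "text" as the target schema field is an assumption; map to whatever the schema defines -->
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The field names on the Solr side (id, title, text) must exist in schema.xml; adjust the name attributes if the schema uses different ones.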

How to index PDF file stored in SQL Server 2008

2011-04-07 Thread Roy Liu
Hi, I have a table named *attachment* in MS SQL Server 2008.

COLUMN      TYPE
----------  ------------
id          int
title       varchar(200)
attachment  image

I need to index the attachment column (which stores PDF files) from the database via DIH. After accessing this URL, it returns "Ind