Thanks Lance,

I'm using Solr 1.4. If I want to use TikaEP, do I need to upgrade to Solr 3.1, or can I just import the jar files?

Best Regards,
Roy Liu

On Fri, Apr 8, 2011 at 10:22 AM, Lance Norskog <goks...@gmail.com> wrote:
> You need the TikaEntityProcessor to unpack the PDF image. You are
> sticking binary blobs into the index. Tika unpacks the text out of the
> file.
>
> TikaEP is not in Solr 1.4, but it is in the new Solr 3.1 release.
>
> On Thu, Apr 7, 2011 at 7:14 PM, Roy Liu <liuchua...@gmail.com> wrote:
> > Hi,
> >
> > I have a table named *attachment* in MS SQL Server 2008.
> >
> > COLUMN         TYPE
> > -------------  ----------------
> > id             int
> > title          varchar(200)
> > attachment     image
> >
> > I need to index the attachment column (which stores PDF files) from the
> > database via DIH.
> >
> > After accessing this URL, it returns "Indexing completed. Added/Updated: 5
> > documents. Deleted 0 documents."
> > http://localhost:8080/solr/dataimport?command=full-import
> >
> > However, I cannot find anything when I search.
> >
> > Can anyone help me?
> >
> > Thanks.
> >
> >
> > --------------------
> > *data-config-sql.xml*
> > <dataConfig>
> >   <dataSource type="JdbcDataSource"
> >               driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
> >               url="jdbc:sqlserver://localhost:1433;databaseName=master"
> >               user="user"
> >               password="pw"/>
> >   <document>
> >     <entity name="doc"
> >             query="select id,title,attachment from attachment">
> >     </entity>
> >   </document>
> > </dataConfig>
> >
> > *schema.xml*
> > <field name="attachment" type="text" indexed="true" stored="true"/>
> >
> >
> >
> > Best Regards,
> > Roy Liu
> >
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>
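
For reference, here is a rough, untested sketch of how the data-config above might look on Solr 3.1 with TikaEntityProcessor, following Lance's point that Tika has to extract the text from the binary column. It assumes FieldStreamDataSource is available in the DIH build to read the blob out of the parent row, and that the DIH extras jar and the Tika jars are on Solr's classpath. The dataSource names "db" and "bin" and the inner entity name "tika" are made up for illustration.

```xml
<dataConfig>
  <!-- Same JDBC source as in the original config; "db" is an arbitrary name -->
  <dataSource name="db"
              type="JdbcDataSource"
              driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://localhost:1433;databaseName=master"
              user="user"
              password="pw"/>
  <!-- Assumed: exposes a binary column of the parent row as a stream -->
  <dataSource name="bin" type="FieldStreamDataSource"/>
  <document>
    <entity name="doc"
            dataSource="db"
            query="select id,title,attachment from attachment">
      <!-- Inner entity hands the blob to Tika; dataField is parentEntity.column.
           The url value is only a placeholder here and is assumed to be
           ignored when the data comes from a field. -->
      <entity name="tika"
              processor="TikaEntityProcessor"
              dataSource="bin"
              dataField="doc.attachment"
              url="attachment"
              format="text">
        <!-- Tika's extracted body text goes into the existing schema field -->
        <field column="text" name="attachment"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

With something like this, the attachment field in schema.xml would receive the extracted text rather than the raw bytes, so the existing field definition should not need to change.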