Yes, I want to use DIH, and I tried to set up my data-config file, but
something in it is wrong. This is my config:

<dataConfig>
    <dataSource type="JdbcDataSource"
                driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
                url="jdbc:sqlserver://ipAddress;databaseName=VTC_Edu"
                user="myuser" password="mypass" name="VTCEduDocument"/>

    <dataSource type="BinURLDataSource" name="dsurl"/>

    <document>
        <entity name="VTCEduDocument" pk="pk_document_id"
                query="select TOP 10 pk_document_id, s_path_origin from [VTC_Edu].[dbo].[tbl_Document]"
                transformer="vn.vtc.solr.transformer.ImageFilter,vn.vtc.solr.transformer.RemoveHTML,RegexTransformer,TemplateTransformer,vn.vtc.solr.transformer.vntransformer,vn.vtc.solr.correctUnicodeString.correctUnicodeString,vn.vtc.solr.unescapeHtmlString.UnescapeHtmlString,vn.vtc.solr.correctISOString.correctISOString">
            <field column="pk_document_id" name="pk_document_id"/>
            <field column="s_path_origin" name="s_path_origin"/>
        </entity>

        <entity processor="TikaEntityProcessor" dataSource="dsurl" format="text"
                url="http://media.gox.vn/edu/document/original/${VTCEduDocument.s_path_origin}">
            <field column="Author" name="author" meta="true"/>
            <field column="title" name="title" meta="true"/>
            <field column="text" name="text"/>
        </entity>
    </document>
</dataConfig>

And here is the error:
SEVERE: Full Import failed:org.apache.solr.handler.dataimport.DataImportHandlerException: Exception in invoking url null Processing Document # 1
        at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
        at org.apache.solr.handler.dataimport.BinURLDataSource.getData(BinURLDataSource.java:89)
        at org.apache.solr.handler.dataimport.BinURLDataSource.getData(BinURLDataSource.java:38)
        at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
        at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:591)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:267)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:186)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:353)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:411)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:392)
Caused by: java.net.MalformedURLException: no protocol: nullselect TOP 10 pk_document_id, s_path_origin from [VTC_Edu].[dbo].[tbl_Document]
        at java.net.URL.<init>(URL.java:567)
        at java.net.URL.<init>(URL.java:464)
        at java.net.URL.<init>(URL.java:413)
        at org.apache.solr.handler.dataimport.BinURLDataSource.getData(BinURLDataSource.java:81)
        ... 10 more
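
Reading the trace again, the SQL text is being handed to BinURLDataSource
(hence "no protocol: nullselect ..."), so it looks like the first entity is
not bound to the JDBC data source. Below is a sketch of what I would try
next, not a confirmed fix. It assumes that (a) naming the data source
explicitly on the SQL entity with dataSource="VTCEduDocument" makes DIH use
JdbcDataSource for it, and (b) the Tika entity has to be nested inside the
SQL entity so that ${VTCEduDocument.s_path_origin} is resolved for each row.
The entity name "tika" is made up for illustration, and my custom
transformers are left out to keep the sketch short.

```xml
<dataConfig>
    <dataSource type="JdbcDataSource"
                driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
                url="jdbc:sqlserver://ipAddress;databaseName=VTC_Edu"
                user="myuser" password="mypass" name="VTCEduDocument"/>
    <dataSource type="BinURLDataSource" name="dsurl"/>

    <document>
        <!-- Assumption: an explicit dataSource binds this entity to the JDBC source. -->
        <entity name="VTCEduDocument" dataSource="VTCEduDocument"
                pk="pk_document_id"
                query="select TOP 10 pk_document_id, s_path_origin from [VTC_Edu].[dbo].[tbl_Document]">
            <field column="pk_document_id" name="pk_document_id"/>
            <field column="s_path_origin" name="s_path_origin"/>

            <!-- Assumption: nesting lets the url see the parent row's s_path_origin.
                 The name "tika" is hypothetical. -->
            <entity name="tika" processor="TikaEntityProcessor"
                    dataSource="dsurl" format="text"
                    url="http://media.gox.vn/edu/document/original/${VTCEduDocument.s_path_origin}">
                <field column="Author" name="author" meta="true"/>
                <field column="title" name="title" meta="true"/>
                <field column="text" name="text"/>
            </entity>
        </entity>
    </document>
</dataConfig>
```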

What am I doing wrong?
Thanks
