I think you need ClobTransformer at some point in the processing: https://lucene.apache.org/solr/guide/7_4/uploading-structured-data-store-data-with-the-data-import-handler.html#clobtransformer
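
For reference, a minimal sketch of a ClobTransformer entity, loosely following the example in the Ref Guide section linked above. The data source name "db", the JDBC URL, the credentials, and the "id" field name are placeholders I made up; the table and column names are taken from your config. Untested against your schema, so treat it as a starting point only:

```xml
<dataConfig>
  <!-- Placeholder JDBC settings (assumed); replace with your own connection details -->
  <dataSource name="db" driver="oracle.jdbc.OracleDriver"
              url="jdbc:oracle:thin:@//HOST:1521/SERVICE"
              user="USER" password="PASSWORD"/>
  <document>
    <entity name="solution" dataSource="db" transformer="ClobTransformer"
            query="select SOLUTION_ID, SOLUTION_TXT from IT_SOLUTION">
      <field column="SOLUTION_ID" name="id"/>
      <!-- clob="true" tells ClobTransformer to read the Clob into a plain String -->
      <field column="SOLUTION_TXT" name="text_4" clob="true"/>
    </entity>
  </document>
</dataConfig>
```

Note that this only turns the Clob into a String, so the value is still RTF markup at that point; stripping it down to plain text would be a separate step.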
Regards,
   Alex.

On 10 August 2018 at 10:02, tfaltinat <tfalti...@iet-solutions.de> wrote:
> Hi,
>
> we have an Oracle database where we store Rtf content into a Clob column.
> Now we try to index those records but we just want to get the plain text,
> same as Tika does. I tried to use the TikaEntityProcessor but I’m getting
> the following error message:
>
> ClassCastException: java.io.StringReader cannot be cast to
> java.io.InputStream
>
> The configuration looks like this:
>
> <dataSource name="f1" type="FieldReaderDataSource"/>
>
> <entity name="SV_SOLVE_TXT" onError="continue" transformer="ClobTransformer"
> query="select SOLUTION_ID, SOLUTION_TXT SOLUTION_TXT from IT_SOLUTION where
> SOLUTION_ID = '${ts3_it_solution_text_search.SOLUTION_ID}'">
> <field name="text_4" column="SOLUTION_TXT" clob="true" />
> <entity name="tika_SOLUTION_TXT" onError="continue"
> processor="TikaEntityProcessor" url="${SV_SOLVE_TXT.text_4}"
> dataField="SV_SOLVE_TXT.text_4" dataSource="f1" >
> <field name="text_1" column="text"/>
> </entity>
> </entity>
>
> Thx & Regards,
> Torsten
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html