Hi all,

I need to import data from a text file that contains HTML and apply some formatting to it. I want the text inside each <p> tag, preceded by the value of one of that tag's attributes (myvar), in my output, as shown below.

Original Text
------------------------------------------------------------------------------------------
<div><p myvar="12" myvar1="xyz">Hello World!!</p><p myvar="14" myvar1="abc">Welcome to Solr.</p><p myvar="15" myvar1="def">Enjoy</p></div>

Needed Text After Formatting
------------------------------------------------------------------------------------------
12 : Hello World!!
14 : Welcome to Solr.
15 : Enjoy

I have applied a combination of PlainTextEntityProcessor, RegexTransformer and TemplateTransformer for this, as below, but I receive a ConfigurationError when I set it up.

<entity name="xx" onError="continue"
        processor="PlainTextEntityProcessor,TemplateTransformer,RegexTransformer"
        url="${URL.MyTxtFile}" dataSource="MDataSource">
  <field column="plainText" name="FullText" />
  <field column="FullText" template="${xx.FullText}"
         regex='<p (?:\s+[^>]+)? myvar="([^<"]*)" (?:\s+[^>]+)?>([^<]*)</p>'
         replaceWith="$2 : $4"/>
</entity>

I would like to add that I am able to do this using TemplateTransformer and a multivalued field, but I need the above format in a single-valued field, which I have not managed to achieve.

Can anybody help me get my desired result, or tell me what I am doing wrong in the above transformer setup?

Thanks,
Meghana

--
View this message in context: http://lucene.472066.n3.nabble.com/PlainTexttransformer-and-RegexTransformer-in-DataImport-Handler-tp3608415p3608415.html
Sent from the Solr - User mailing list archive at Nabble.com.
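
To make the target transformation concrete, here is a minimal sketch of it outside Solr, in Python. It is an illustration only, not a DataImportHandler configuration. The pattern is a simplified, hypothetical one rather than the regex from the post, the name format_paragraphs is made up for this example, and joining the results with newlines is an assumption, since the post does not say which separator the single-valued field should use. Note that `(?:...)` groups are non-capturing, so a pattern of this shape has only two capturing groups: group 1 (the myvar value) and group 2 (the paragraph text).

```python
import re

# Hypothetical, simplified pattern (not the one from the post):
# group 1 = value of the myvar attribute, group 2 = text inside the <p> tag.
P_TAG = re.compile(r'<p\b[^>]*?\bmyvar="([^"]*)"[^>]*>([^<]*)</p>')


def format_paragraphs(html: str, sep: str = "\n") -> str:
    """Return 'myvar : text' for every matching <p>, joined into one string."""
    return sep.join(f"{var} : {text}" for var, text in P_TAG.findall(html))


# Sample input taken from the post.
sample = (
    '<div><p myvar="12" myvar1="xyz">Hello World!!</p>'
    '<p myvar="14" myvar1="abc">Welcome to Solr.</p>'
    '<p myvar="15" myvar1="def">Enjoy</p></div>'
)

if __name__ == "__main__":
    print(format_paragraphs(sample))
```

Python writes the back-references implicitly here through findall; in Java regex replacement syntax the same two groups would be referenced as `$1 : $2`.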