Hello,
From several postings I gather that I can define variables
inside the invariants list of the DIH request handler in
solrconfig.xml:
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
  <lst name="invariants">
    <str name="finstalldir">/Volumes/spare/ts</str>
  </lst>
</requestHandler>
I can also reference these variables within data-config.xml. This
works: the Solr field "test" is populated correctly. However, how do
I use this variable within my RegexTransformer? Here is my
data-config.xml:
<dataConfig>
  <dataSource name="myfilereader" type="FileDataSource"/>
  <document>
    <entity name="jc"
            processor="FileListEntityProcessor"
            fileName="^.*\.xml$"
            newerThan="'NOW-1000DAYS'"
            recursive="true"
            rootEntity="false"
            dataSource="null"
            baseDir="/Volumes/spare/ts/fords/dtd/fordsxml/data">
      <entity name="x"
              dataSource="myfilereader"
              processor="XPathEntityProcessor"
              url="${jc.fileAbsolutePath}"
              stream="false"
              forEach="/record"
              transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer,HTMLStripTransformer">
        <field column="fileAbsolutePath" template="${jc.fileAbsolutePath}" />
        <field column="fileWebPath"
               regex="${dataimporter.request.finstalldir}(.*)" replaceWith="$1"
               sourceColName="fileAbsolutePath"/>
        <field column="test"
               template="${dataimporter.request.finstalldir}" />
        <field column="title" xpath="/record/title" />
        <field column="para" xpath="/record/sect1/para" stripHTML="true" />
        <field column="date"
               xpath="/record/metadata/da...@qualifier='Date']" dateTimeFormat="yyyyMMdd" />
      </entity>
    </entity>
  </document>
</dataConfig>
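To make the intent of the fileWebPath field concrete, here is a minimal standalone Java sketch of the substitution I am after, assuming the variable had already been expanded to the literal /Volumes/spare/ts. The class and method names (WebPathSketch, toWebPath) and the sample file name are hypothetical, the pattern is anchored for clarity, and this is plain java.util.regex rather than DIH code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WebPathSketch {
    // Strip a literal install-directory prefix from an absolute path.
    // Returns the path unchanged if it does not start with the prefix.
    public static String toWebPath(String installDir, String absolutePath) {
        // Pattern.quote makes the directory match literally, so any regex
        // metacharacters in the path cannot change the pattern's meaning.
        Pattern p = Pattern.compile("^" + Pattern.quote(installDir) + "(.*)$");
        Matcher m = p.matcher(absolutePath);
        return m.matches() ? m.group(1) : absolutePath;
    }

    public static void main(String[] args) {
        // Same value as the finstalldir invariant in solrconfig.xml.
        String installDir = "/Volumes/spare/ts";
        // Hypothetical file under the baseDir used in data-config.xml.
        String path = "/Volumes/spare/ts/fords/dtd/fordsxml/data/sample.xml";
        System.out.println(toWebPath(installDir, path));
    }
}
```

In other words, fileWebPath should end up holding the part of fileAbsolutePath that follows the install directory.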
When indexing my content I get the following error:
INFO: SolrDeletionPolicy.onInit: commits:num=2
commit{dir=/Volumes/spare/ts/solrnightlyjanes/data/index,segFN=segments_7,version=1233583868834,generation=7,filenames=[_7.frq, _4.fdt, _7.tii, _7.fnm, _4.fdx, _7.tis, segments_7, _7.nrm, _7.prx]
commit{dir=/Volumes/spare/ts/solrnightlyjanes/data/index,segFN=segments_8,version=1233583868835,generation=8,filenames=[segments_8]
Feb 2, 2009 5:00:50 PM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: last commit = 1233583868835
Feb 2, 2009 5:00:57 PM org.apache.solr.handler.dataimport.EntityProcessorBase applyTransformer
WARNING: transformer threw error
java.util.regex.PatternSyntaxException: Illegal repetition near index 0
${dataimporter.request.finstalldir}(.*)
^
at java.util.regex.Pattern.error(Pattern.java:1650)
at java.util.regex.Pattern.closure(Pattern.java:2706)
at java.util.regex.Pattern.sequence(Pattern.java:1798)
at java.util.regex.Pattern.expr(Pattern.java:1687)
at java.util.regex.Pattern.compile(Pattern.java:1397)
at java.util.regex.Pattern.<init>(Pattern.java:1124)
at java.util.regex.Pattern.compile(Pattern.java:817)
at org.apache.solr.handler.dataimport.RegexTransformer.getPattern(RegexTransformer.java:129)
at org.apache.solr.handler.dataimport.RegexTransformer.process(RegexTransformer.java:88)
at org.apache.solr.handler.dataimport.RegexTransformer.transformRow(RegexTransformer.java:74)
at org.apache.solr.handler.dataimport.RegexTransformer.transformRow(RegexTransformer.java:42)
at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:187)
at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:197)
at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:333)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:359)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:222)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:155)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:324)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:384)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:365)
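Judging by the pattern string printed in the exception, the regex attribute seems to reach Pattern.compile with the ${...} placeholder still unexpanded. The following minimal sketch, run outside Solr, feeds the same string to java.util.regex and compares it with a hand-expanded version. The class and method names (PlaceholderRegexCheck, compileError) are hypothetical, and the hand expansion is only my assumption of what the resolved regex would look like.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class PlaceholderRegexCheck {
    // Returns null if the regex compiles, otherwise the parser's
    // description of the syntax error.
    public static String compileError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    public static void main(String[] args) {
        // The regex attribute exactly as written in data-config.xml.
        String raw = "${dataimporter.request.finstalldir}(.*)";
        System.out.println("raw: " + compileError(raw));

        // Hand-expanded form (assumption): the finstalldir value quoted as a
        // literal, followed by the capture group.
        String expanded = Pattern.quote("/Volumes/spare/ts") + "(.*)";
        System.out.println("expanded: " + compileError(expanded));
    }
}
```

As far as I can tell, the raw string is rejected because the "{" directly after "$" is parsed as the start of a repetition quantifier, whereas the hand-expanded form compiles without complaint.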
Is there some simple escape or other syntax I should be using, or
would this need an enhancement?
Regards, Fergus.
--
===============================================================
Fergus McMenemie Email:[email protected]
Techmore Ltd Phone:(UK) 07721 376021
Unix/Mac/Intranets Analyst Programmer
===============================================================