Hello

From several earlier postings I gather that I can define variables
in the invariants list of the DIH request handler in
solrconfig.xml:

  <requestHandler name="/dataimport"
                  class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
    </lst>
    <lst name="invariants">
      <str name="finstalldir">/Volumes/spare/ts</str>
    </lst>
  </requestHandler>


I can also reference these variables within data-config.xml. This
works: the Solr field "test" is populated with the value of
finstalldir. However, how do I use this variable within the regex
attribute of the RegexTransformer? Here is my data-config.xml:

   <dataConfig>
   <dataSource name="myfilereader" type="FileDataSource"/>    
    <document>
       <entity name="jc"
               processor="FileListEntityProcessor"
               fileName="^.*\.xml$"
               newerThan="'NOW-1000DAYS'"
               recursive="true"
               rootEntity="false"
               dataSource="null"
               baseDir="/Volumes/spare/ts/fords/dtd/fordsxml/data">
          <entity name="x"
                  dataSource="myfilereader"
                  processor="XPathEntityProcessor"
                  url="${jc.fileAbsolutePath}"
                  stream="false"
                  forEach="/record"
                  
transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer,HTMLStripTransformer">

   <field column="fileAbsolutePath" template="${jc.fileAbsolutePath}" />
   <field column="fileWebPath"      
regex="${dataimporter.request.finstalldir}(.*)" replaceWith="$1" 
sourceColName="fileAbsolutePath"/>
   <field column="test"             
template="${dataimporter.request.finstalldir}" />
   <field column="title"            xpath="/record/title" />
   <field column="para"             xpath="/record/sect1/para" stripHTML="true" 
/>
   <field column="date"             
xpath="/record/metadata/da...@qualifier='Date']" dateTimeFormat="yyyyMMdd"   />
             </entity>
       </entity>
       </document>
    </dataConfig>

When indexing my content I get the following error:


INFO: SolrDeletionPolicy.onInit: commits:num=2
        commit{dir=/Volumes/spare/ts/solrnightlyjanes/data/index,segFN=segments_7,version=1233583868834,generation=7,filenames=[_7.frq, _4.fdt, _7.tii, _7.fnm, _4.fdx, _7.tis, segments_7, _7.nrm, _7.prx]
        commit{dir=/Volumes/spare/ts/solrnightlyjanes/data/index,segFN=segments_8,version=1233583868835,generation=8,filenames=[segments_8]
Feb 2, 2009 5:00:50 PM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: last commit = 1233583868835
Feb 2, 2009 5:00:57 PM org.apache.solr.handler.dataimport.EntityProcessorBase applyTransformer
WARNING: transformer threw error
java.util.regex.PatternSyntaxException: Illegal repetition near index 0
${dataimporter.request.finstalldir}(.*)
^
        at java.util.regex.Pattern.error(Pattern.java:1650)
        at java.util.regex.Pattern.closure(Pattern.java:2706)
        at java.util.regex.Pattern.sequence(Pattern.java:1798)
        at java.util.regex.Pattern.expr(Pattern.java:1687)
        at java.util.regex.Pattern.compile(Pattern.java:1397)
        at java.util.regex.Pattern.<init>(Pattern.java:1124)
        at java.util.regex.Pattern.compile(Pattern.java:817)
        at org.apache.solr.handler.dataimport.RegexTransformer.getPattern(RegexTransformer.java:129)
        at org.apache.solr.handler.dataimport.RegexTransformer.process(RegexTransformer.java:88)
        at org.apache.solr.handler.dataimport.RegexTransformer.transformRow(RegexTransformer.java:74)
        at org.apache.solr.handler.dataimport.RegexTransformer.transformRow(RegexTransformer.java:42)
        at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:187)
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:197)
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:333)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:359)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:222)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:155)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:324)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:384)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:365)
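
For what it is worth, the exception seems reproducible outside Solr,
which suggests the ${...} placeholder reaches Pattern.compile
unresolved. Below is a minimal Java sketch under that assumption. The
class RegexPlaceholderDemo and its helpers compiles and stripPrefix are
hypothetical names for illustration only and are not part of DIH. It
also shows that, once the directory is available as a plain string,
Pattern.quote would let it be matched as a literal prefix.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; not part of Solr or the DataImportHandler.
class RegexPlaceholderDemo {

    /** Returns true if the given string compiles as a java.util.regex pattern. */
    public static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // "$" is parsed as an anchor and "{d..." as a malformed repetition.
            return false;
        }
    }

    /**
     * Strips a literal directory prefix from a path.
     * Pattern.quote escapes any regex metacharacters in the prefix.
     * Returns the path unchanged if it does not start with the prefix.
     */
    public static String stripPrefix(String prefix, String path) {
        Pattern p = Pattern.compile(Pattern.quote(prefix) + "(.*)");
        Matcher m = p.matcher(path);
        return m.matches() ? m.group(1) : path;
    }

    public static void main(String[] args) {
        // The unresolved placeholder is not a valid pattern: prints false.
        System.out.println(compiles("${dataimporter.request.finstalldir}(.*)"));

        // With the value substituted first, the prefix is removed:
        // prints /fords/dtd/fordsxml/data/a.xml (a.xml is a made-up file name).
        System.out.println(stripPrefix("/Volumes/spare/ts",
                "/Volumes/spare/ts/fords/dtd/fordsxml/data/a.xml"));
    }
}
```

So the regex itself looks fine once the variable has been substituted;
the problem appears to be only that the substitution has not happened
by the time the pattern is compiled.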


Is there some simple escape or other syntax I should be using, or
would this need an enhancement?

Regards Fergus.
-- 

===============================================================
Fergus McMenemie               Email:fer...@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021

Unix/Mac/Intranets             Analyst Programmer
===============================================================
