Hi,
I am starting my Solr instance with the command
java -Dsolr.solr.home="./test1/solr/" -jar start.jar
where I have the following solr.xml file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<solr sharedLib="lib" persistent="true">
        <cores adminPath="/admin/cores">
                <core default="false" instanceDir="tester" name="tester"/>
        </cores>
</solr>

In the tester folder I have the following configuration files, adapted from the RSS example.

DataImporter.xml
<dataConfig>
 <dataSource name="myfilereader" type="FileDataSource"/>
   <document>
     <entity name="jc" rootEntity="false" dataSource="null"
             processor="FileListEntityProcessor"
             fileName="^.*\.xml$" recursive="true"
             baseDir="/projects/solrtest/transformedimport"
             >
       <entity name="x" rootEntity="true"
               dataSource="myfilereader"
               processor="XPathEntityProcessor"
               url="${jc.fileAbsolutePath}"
               stream="false" forEach="/ARTIKEL"
               transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer,LogTransformer"
               logTemplate="processing ${jc.fileAbsolutePath}"
               logLevel="info"
               >

        
         <field column="title"     xpath="/DOKTITEL/OVERSKRIFT1" />
         <field column="text"      xpath="/AKROP/TXT"  />
       </entity>
     </entity>
   </document>
  </dataConfig>
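
One variant I have not ruled out: as far as I can tell from the DIH examples, field xpaths are written as full paths that begin with the forEach path, not relative to it. A minimal sketch of the inner entity under that assumption follows. The /ARTIKEL prefix is taken from the sample document further down, and renaming the second column to txt (to match the field name in schema.xml) is also an assumption. This is untested against my setup.

```xml
<!-- Sketch only. Assumptions: xpaths must be absolute from the document
     root, and the column name must match the schema field name (txt). -->
<entity name="x" rootEntity="true"
        dataSource="myfilereader"
        processor="XPathEntityProcessor"
        url="${jc.fileAbsolutePath}"
        stream="false" forEach="/ARTIKEL"
        transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer,LogTransformer"
        logTemplate="processing ${jc.fileAbsolutePath}"
        logLevel="info">
  <field column="title" xpath="/ARTIKEL/DOKTITEL/OVERSKRIFT1" />
  <field column="txt"   xpath="/ARTIKEL/AKROP/TXT" />
</entity>
```

Since each document holds several TXT elements, the txt field in schema.xml would probably also need multiValued="true" for this variant to import cleanly.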

solrconfig.xml is the same as in the RSS example, except that I removed the elevate components.

schema.xml


 <fields>
        <field name="title" type="text" indexed="true" stored="true" />
        <field name="txt" type="text" indexed="true" stored="true" />
        <field name="all_text" type="text" indexed="true" stored="true" multiValued="true" />
        <copyField source="title" dest="all_text" />
        <copyField source="txt" dest="all_text" />
</fields>

I also removed the uniqueKey constraint.

When I go to http://localhost:8983/solr/tester/admin/
I get the admin page.
When I run http://localhost:8983/solr/tester/dataimport?command=full-import
it returns:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">16</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">dataimporter.xml</str>
    </lst>
  </lst>
  <str name="command">full-import</str>
  <str name="status">idle</str>
  <str name="importResponse"/>
  <lst name="statusMessages"/>
  <str name="WARNING">
This response format is experimental.  It is likely to change in the future.
  </str>
</response>

The log for that import contains many entries like this:

INFO: processing c:\projects\solrtest\transformed\1.xml
org.apache.solr.common.util.XMLErrorLogger report
WARNING: XmL parser reported xml declaration in "null", line 1, column
38: Inconsistent text encoding; declared as "utf-8" in xml
declaration, application had passed "Cp1252"
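
The warning suggests the files are being read with the platform default encoding (Cp1252) and not the UTF-8 they declare. FileDataSource accepts an encoding attribute, so a minimal sketch of the data source declaration with the encoding set explicitly would look like the following. Whether this has any bearing on the empty index is only a guess on my part.

```xml
<!-- Sketch only: same data source as above, with the file encoding set
     explicitly. Assumes every input file really is UTF-8. -->
<dataSource name="myfilereader" type="FileDataSource" encoding="UTF-8"/>
```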

Here is one of the processed documents:

<?xml version="1.0" encoding="utf-8" ?>
<ARTIKEL ID="MM2010ADMINISTRATIONSYDELSER">
  <DOKTITEL>
    <OVERSKRIFT1>Administrationsydelser (MomsManual)</OVERSKRIFT1>
  </DOKTITEL>
  <AKROP>
    <TXT>Administrationsydelser er momspligtige. Dette gælder også når
de faktureres koncerninternt, f.eks. fra et moderselskab
(holdingselskab) til et datterselskab.</TXT>
    <TXT>Der er fradragsret for moms vedrørende køb af
administrationsydelser i samme omfang, som virksomheden kan fratrække
momsen af øvrige fællesomkostninger.</TXT>
    <TXT>Hvis administrationsydelser faktureres på tværs af
landegrænserne, f.eks. indenfor internationale koncerner, kan der
gælde forskellige principper for momsberegningen i de enkelte
EU-lande. Hvis en administrationsydelse faktureres fra Danmark til et
datterselskab i et andet land, herunder også i andre EU-lande, er det
myndighedernes holdning, at der skal faktureres med dansk moms.</TXT>
    <TXT>Hvis en administrationsydelse faktureres mellem et selskab og
dets filial/-er, skal faktura altid udstedes uden moms. Handel med
ydelser mellem et selskab og dets filial/-er anses ikke for at udgøre
momspligtige transaktioner.</TXT>
    <TXTO>Regler</TXTO>
    <TXT>
      <LR IDREF="LBKG2005966.§15" CREATOR="autolink" TARGETTYPE="REL">ML § 15</LR>
    </TXT>
  </AKROP>
</ARTIKEL>

If I search for the text Administrationsydelser with
http://localhost:8983/solr/tester/select/?q=Administrationsydelser&version=2.2&start=0&rows=10&indent=on
I get:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">Administrationsydelser</str>
      <str name="version">2.2</str>
      <str name="rows">10</str>
    </lst>
  </lst>
  <result name="response" numFound="0" start="0"/>
</response>

My index directory contains a segments.gen and a segments_4 file but
nothing else. I tried inspecting it with Luke, but Luke does not seem
to be compatible with the newest versions of Lucene.

The Solr version is 3.1.0.

Thanks,
Bryan Rasmussen
