Hi Alex,
thanks for your answer.

Yes, my solrconfig.xml contains the add-unknown-fields-to-the-schema chain:

  <initParams path="/update/**">
    <lst name="defaults">
      <str name="update.chain">add-unknown-fields-to-the-schema</str>
    </lst>
  </initParams>

I created my core using this command:

curl 'http://192.168.99.100:8999/solr/admin/cores?action=CREATE&name=solrexchange&instanceDir=/opt/solr/server/solr/solrexchange&configSet=data_driven_schema_configs_custom'

I am using the example configset data_driven_schema_configs and I simply added:

  <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-dataimporthandler-.*\.jar" />
  <requestHandler name="/dataimport" class="solr.DataImportHandler">
      <lst name="defaults">
        <str name="config">data-config.xml</str>
      </lst>
  </requestHandler>

I thought schemaless mode was enabled by default, but I also tried adding
this config and got the same result:

  <schemaFactory class="ManagedIndexSchemaFactory">
    <bool name="mutable">true</bool>
    <str name="managedSchemaResourceName">managed-schema</str>
  </schemaFactory>

How can I copy the schemaless URP chain and add the parameter that calls it
to the DIH handler?
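
If I understand the suggestion correctly, it would amount to something like the following in my solrconfig.xml. This is an untested sketch: it assumes the DIH handler honours the update.chain parameter when it is set in the handler defaults, and that the chain name matches the one defined in the data_driven_schema_configs solrconfig.xml. My existing initParams block only covers path="/update/**", so it presumably does not apply to /dataimport.

```xml
<!-- Untested sketch: route DIH documents through the schemaless URP chain. -->
<requestHandler name="/dataimport" class="solr.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
    <!-- Assumption: DIH picks up update.chain from its handler defaults -->
    <str name="update.chain">add-unknown-fields-to-the-schema</str>
  </lst>
</requestHandler>
```

If that is right, passing update.chain=add-unknown-fields-to-the-schema as a request parameter on the full-import URL might be an alternative to changing the handler defaults.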


> On 10 Aug 2016, at 17:43, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
> 
> Do you have the actual fields defined? If not, then I am guessing that
> your 'post' test was against a different collection that had
> schemaless mode enabled and your DIH one is against one where
> schemaless mode is not enabled (look for
> 'add-unknown-fields-to-the-schema' in the solrconfig.xml to confirm).
> Solr examples for DIH do not have schemaless mode enabled.
> 
> I _believe_ you can copy the schemaless URP chain and add the
> parameter to call it to DIH handler and it _should_ work. But I am not
> betting on it without testing it, as DIH also has some magic code to
> ignore fields not defined in schema because it is designed to work
> with only extracting relevant fields from the database even with
> 'select *' statement.
> 
> 
> Regards,
>   Alex.
> ----
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
> 
> 
> On 10 August 2016 at 17:12, Pierre Caserta <pierre.case...@gmail.com> wrote:
>> Hi,
>> It seems that using the DataImportHandler with an XPathEntityProcessor config
>> and a managed-schema setup only imports the id and _version_ fields.
>> 
>> data-config.xml
>> 
>> <dataConfig>
>>    <dataSource type="FileDataSource" encoding="UTF-8" />
>>    <document>
>>        <entity name="post"
>>            processor="XPathEntityProcessor"
>>            stream="true"
>>            forEach="/posts/row/"
>>            url="${dataimporter.request.dataurl}"
>>            transformer="RegexTransformer,DateFormatTransformer,HTMLStripTransformer">
>>            <field column="id" xpath="/posts/row/@Id" />
>>            <field column="postTypeId" xpath="/posts/row/@PostTypeId" />
>>            <field column="acceptedAnswerId" xpath="/posts/row/@AcceptedAnswerId" />
>>            <field column="creationDate" xpath="/posts/row/@CreationDate" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss.SSS" />
>>            <field column="postScore" xpath="/posts/row/@Score" />
>>            <field column="viewCount" xpath="/posts/row/@ViewCount" />
>>            <field column="body" xpath="/posts/row/@Body" stripHTML="true" />
>>            <field column="ownerUserId" xpath="/posts/row/@OwnerUserId" />
>>            <field column="lastEditorUserId" xpath="/posts/row/@LastEditorUserId" />
>>            <field column="lastEditorDisplayName" xpath="/posts/row/@LastEditorDisplayName" />
>>            <field column="lastActivityDate" xpath="/posts/row/@LastActivityDate" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss.SSS" />
>>            <field column="title" xpath="/posts/row/@Title" />
>>            <field column="trimmedTags" xpath="/posts/row/@Tags" regex="&lt;(.*)&gt;" />
>>            <field column="tags" sourceColName="trimmedTags" splitBy="&gt;&lt;" />
>>            <field column="answerCount" xpath="/posts/row/@AnswerCount" />
>>            <field column="commentCount" xpath="/posts/row/@CommentCount" />
>>            <field column="favoriteCount" xpath="/posts/row/@FavoriteCount" />
>>            <field column="communityOwnedDate" xpath="/posts/row/@CommunityOwnedDate" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss.SSS" />
>>        </entity>
>>    </document>
>> </dataConfig>
>> 
>> 
>> http://192.168.99.100:8999/solr/solrexchange/select?indent=on&q=*:*&wt=json
>> {
>>  "responseHeader":{
>>    "status":0,
>>    "QTime":0,
>>    "params":{
>>      "q":"*:*",
>>      "indent":"on",
>>      "wt":"json",
>>      "_":"1470811193595"}},
>>  "response":{"numFound":8,"start":0,"docs":[
>>      {
>>        "id":"38822",
>>        "_version_":1542258196375142400},
>>      {
>>        "id":"38836",
>>        "_version_":1542258196387725312},
>>      {
>>        "id":"63896",
>>        "_version_":1542258196388773888},
>>      {
>>        "id":"65406",
>>        "_version_":1542258196391919616},
>>      {
>>        "id":"1357173",
>>        "_version_":1542258196391919617},
>>      {
>>        "id":"5339763",
>>        "_version_":1542258196392968192},
>>      {
>>        "id":"9932722",
>>        "_version_":1542258196392968193},
>>      {
>>        "id":"9217299",
>>        "_version_":1542258196392968194}]
>>  }}
>> 
>> data_search.xml (8 rows)
>> 
>> 
>> 
>> the url I am hitting (with custom dataurl parameter)
>> 
>> curl
>> 'http://192.168.99.100:8999/solr/solrexchange/dataimport?command=full-import&commit=true&dataurl=/code/solr/data/search/dih/data_search.xml'
>> 
>> I changed my data to the <add> <doc> <field> format and posted it with the
>> bin/post tool, and that works as expected.
>> Now I would like to make it work with the DataImportHandler.
>> How can I use the DataImportHandler to import my documents?
>> 
>> Thanks,
>> Pierre Caserta
>> 
>> 
