Hi Alex, thanks for your answer. Yes, my solrconfig.xml contains the add-unknown-fields-to-the-schema chain:
<initParams path="/update/**">
  <lst name="defaults">
    <str name="update.chain">add-unknown-fields-to-the-schema</str>
  </lst>
</initParams>

I created my core using this command:

curl 'http://192.168.99.100:8999/solr/admin/cores?action=CREATE&name=solrexchange&instanceDir=/opt/solr/server/solr/solrexchange&configSet=data_driven_schema_configs_custom'

I am using the example configset data_driven_schema_configs, to which I simply added:

<lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-dataimporthandler-.*\.jar" />

<requestHandler name="/dataimport" class="solr.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

I thought schemaless mode was enabled by default, but I also tried adding the following config and got the same result:

<schemaFactory class="ManagedIndexSchemaFactory">
  <bool name="mutable">true</bool>
  <str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>

How can I copy the schemaless URP chain and add the parameter so that the DIH handler calls it?

> On 10 Aug 2016, at 17:43, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
>
> Do you have the actual fields defined? If not, then I am guessing that
> your 'post' test was against a different collection that had
> schemaless mode enabled and your DIH one is against one where
> schemaless mode is not enabled (look for
> 'add-unknown-fields-to-the-schema' in the solrconfig.xml to confirm).
> Solr examples for DIH do not have schemaless mode enabled.
>
> I _believe_ you can copy the schemaless URP chain and add the
> parameter to call it to DIH handler and it _should_ work. But I am not
> betting on it without testing it, as DIH also has some magic code to
> ignore fields not defined in schema because it is designed to work
> with only extracting relevant fields from the database even with
> 'select *' statement.
>
>
> Regards,
> Alex.
> ----
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>
>
> On 10 August 2016 at 17:12, Pierre Caserta <pierre.case...@gmail.com> wrote:
>> Hi,
>> It seems that using the DataImportHandler with a XPathEntityProcessor config
>> with a managed-schema setup, only import the id and version field.
>>
>> data-config.xml
>>
>> <dataConfig>
>>   <dataSource type="FileDataSource" encoding="UTF-8" />
>>   <document>
>>     <entity name="post"
>>             processor="XPathEntityProcessor"
>>             stream="true"
>>             forEach="/posts/row/"
>>             url="${dataimporter.request.dataurl}"
>>             transformer="RegexTransformer,DateFormatTransformer,HTMLStripTransformer"
>>             >
>>       <field column="id" xpath="/posts/row/@Id" />
>>       <field column="postTypeId" xpath="/posts/row/@PostTypeId" />
>>       <field column="acceptedAnswerId" xpath="/posts/row/@AcceptedAnswerId" />
>>       <field column="creationDate" xpath="/posts/row/@CreationDate"
>>              dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss.SSS" />
>>       <field column="postScore" xpath="/posts/row/@Score" />
>>       <field column="viewCount" xpath="/posts/row/@ViewCount" />
>>       <field column="body" xpath="/posts/row/@Body" stripHTML="true" />
>>       <field column="ownerUserId" xpath="/posts/row/@OwnerUserId" />
>>       <field column="lastEditorUserId" xpath="/posts/row/@LastEditorUserId" />
>>       <field column="lastEditorDisplayName" xpath="/posts/row/@LastEditorDisplayName" />
>>       <field column="lastActivityDate" xpath="/posts/row/@LastActivityDate"
>>              dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss.SSS" />
>>       <field column="title" xpath="/posts/row/@Title" />
>>       <field column="trimmedTags" xpath="/posts/row/@Tags" regex="&lt;(.*)&gt;" />
>>       <field column="tags" sourceColName="trimmedTags" splitBy="&gt;&lt;" />
>>       <field column="answerCount" xpath="/posts/row/@AnswerCount" />
>>       <field column="commentCount" xpath="/posts/row/@CommentCount" />
>>       <field column="favoriteCount" xpath="/posts/row/@FavoriteCount" />
>>       <field column="communityOwnedDate" xpath="/posts/row/@CommunityOwnedDate"
>>              dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss.SSS" />
>>     </entity>
>>   </document>
>> </dataConfig>
>>
>>
>> http://192.168.99.100:8999/solr/solrexchange/select?indent=on&q=*:*&wt=json
>> {
>>   "responseHeader":{
>>     "status":0,
>>     "QTime":0,
>>     "params":{
>>       "q":"*:*",
>>       "indent":"on",
>>       "wt":"json",
>>       "_":"1470811193595"}},
>>   "response":{"numFound":8,"start":0,"docs":[
>>       {
>>         "id":"38822",
>>         "_version_":1542258196375142400},
>>       {
>>         "id":"38836",
>>         "_version_":1542258196387725312},
>>       {
>>         "id":"63896",
>>         "_version_":1542258196388773888},
>>       {
>>         "id":"65406",
>>         "_version_":1542258196391919616},
>>       {
>>         "id":"1357173",
>>         "_version_":1542258196391919617},
>>       {
>>         "id":"5339763",
>>         "_version_":1542258196392968192},
>>       {
>>         "id":"9932722",
>>         "_version_":1542258196392968193},
>>       {
>>         "id":"9217299",
>>         "_version_":1542258196392968194}]
>>   }}
>>
>> data_search.xml (8 rows)
>>
>>
>>
>> the url I am hitting (with custom dataurl parameter)
>>
>> curl 'http://192.168.99.100:8999/solr/solrexchange/dataimport?command=full-import&commit=true&dataurl=/code/solr/data/search/dih/data_search.xml'
>>
>> I changed my data to use <add> <doc> <field> and use the bin/post tool and
>> this is working as expected.
>> Now I am interested to make it work with the DataImportHandler.
>> How can I use the DataImportHandler to import my document ?
>>
>> Thanks,
>> Pierre Caserta
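
For reference, a minimal sketch of what Alex's suggestion might look like in solrconfig.xml. It assumes the DIH handler honors the update.chain parameter the same way the /update handlers do, and that the add-unknown-fields-to-the-schema chain from the data-driven configset is present unchanged. It has not been tested against this setup, and as Alex notes, DIH may still skip fields that are not yet defined in the schema.

```xml
<requestHandler name="/dataimport" class="solr.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
    <!-- Assumption: send DIH documents through the schemaless URP chain -->
    <str name="update.chain">add-unknown-fields-to-the-schema</str>
  </lst>
</requestHandler>
```

If that works, the same parameter could presumably also be supplied per request by appending update.chain=add-unknown-fields-to-the-schema to the dataimport URL instead of changing the handler defaults.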