Hey Shawn,

Thanks for the tip. This will work nicely.

I totally missed that request parameters can be referenced with ${dataimporter.request.*}

This way, I can maintain my own last_index_time timestamp outside of the DIH properties file.

Much appreciated.

Regards,

Michael



On 21/04/10 09:34, Shawn Heisey wrote:
Michael,

The SolrEntityProcessor looks very intriguing, but it won't work with
the released 1.4 version.  If that's OK with you and it looks like it'll
do what you want, feel free to ignore the rest of this.

I'm also using MySQL as an import source for Solr.  I was unable to use
the last_index_time because my database doesn't have a field I can match
against it.  I believe you can use something similar to the method that
I came up with.  The point of this post is to show you how to inject
values from outside Solr into a DIH request rather than have Solr
provide the milestone that indicates new content.

Here's a simplified version of my URL template and entity configuration
in data-config.xml.  The did field in my database is an autoincrement
BIGINT serving as my primary key, but something similar could likely be
cooked up with timestamps too:

http://HOST:PORT/solr/CORE/dataimport?command=COMMAND&dataTable=DATATABLE&minDid=MINDID&maxDid=MAXDID

----

<entity name="dataTable" pk="did"
  query="SELECT * FROM ${dataimporter.request.dataTable}
    WHERE did &gt; ${dataimporter.request.minDid}
    AND did &lt;= ${dataimporter.request.maxDid}"
  deltaQuery="SELECT MAX(did) FROM ${dataimporter.request.dataTable}"
  deltaImportQuery="SELECT * FROM ${dataimporter.request.dataTable}
    WHERE did &gt; ${dataimporter.request.minDid}
    AND did &lt;= ${dataimporter.request.maxDid}">
</entity>

----

If I am doing a full-import, I set minDid to zero and maxDid to the
highest value in the database.  For a delta-import, minDid comes from
the maxDid value stored after the last successful import.
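
A minimal sketch of how the caller side of this could look, assuming a
Python script drives the import. The function names, the state file, and
the example host, core, and table values are hypothetical and not part of
the original setup; only the URL layout and the minDid/maxDid parameters
come from the template above. Looking up the highest did in the database
and actually sending the request are left out.

```python
from pathlib import Path
from urllib.parse import urlencode


def build_dih_url(host, port, core, command, data_table, min_did, max_did):
    """Build a DIH request URL that follows the template shown above."""
    params = urlencode({
        "command": command,        # "full-import" or "delta-import"
        "dataTable": data_table,   # read as ${dataimporter.request.dataTable}
        "minDid": min_did,         # read as ${dataimporter.request.minDid}
        "maxDid": max_did,         # read as ${dataimporter.request.maxDid}
    })
    return f"http://{host}:{port}/solr/{core}/dataimport?{params}"


def read_last_did(state_file):
    """Return the maxDid saved after the last successful import, or 0."""
    path = Path(state_file)
    return int(path.read_text().strip()) if path.exists() else 0


def write_last_did(state_file, max_did):
    """Save maxDid; call this only after the import has succeeded."""
    Path(state_file).write_text(str(max_did))
```

For a delta-import, the script would pass read_last_did(...) as minDid and
the current highest did as maxDid, then call write_last_did(...) once Solr
reports success. For a full-import, minDid is simply 0.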

The deltaQuery is required, but in my case it is a throw-away query whose
only job is to tell Solr that the delta-import needs to run.  My query and
deltaImportQuery are identical, though yours may not be.

Good luck, no matter how you choose to approach this.

Shawn


On 4/18/2010 9:02 PM, Michael Tibben wrote:
I don't really understand how this will help. Can you elaborate?

Do you mean that the last_index_time can be imported from somewhere
outside Solr?  I need to be able to *set* the last_index_time that is
stored in dataimport.properties, not get properties from somewhere else.



On 18/04/10 10:02, Lance Norskog wrote:
The SolrEntityProcessor allows you to query a Solr instance and use
the results as DIH properties. You would have to create your own
regular query to do the delta-import instead of using the delta-import
feature.
