Alternatively, you could use the deltaQuery to retrieve the last indexed
id from the DB (you'd have to save it there on your previous import).
Your entity would look something like:
<entity name="my_entity"
        deltaQuery="SELECT MAX(id) AS last_id_value FROM last_id_table"
        deltaImportQuery="SELECT * FROM my_table WHERE id > ${dataimporter.delta.last_id_value}"
        ... >
        <field ... />
</entity>

You could implement your deltaImportQuery as a stored procedure which
would store the appropriate id in last_id_table (for the next
delta-import) in addition to returning the data from the query.
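
To make the bookkeeping concrete, here is a minimal sketch of what such a procedure would do, written in Python against an in-memory SQLite database rather than as a real stored procedure. The function name fetch_delta_and_advance, the body column, and the id column of last_id_table are made up for illustration; a real implementation would be written in your database's own procedure dialect.

```python
import sqlite3

def fetch_delta_and_advance(conn):
    """Return rows newer than the stored id, then record the new high-water mark."""
    cur = conn.cursor()
    # Same idea as the deltaQuery: read the last indexed id.
    last_id = cur.execute("SELECT MAX(id) FROM last_id_table").fetchone()[0]
    # Same idea as the deltaImportQuery: rows added since that id.
    rows = cur.execute(
        "SELECT id, body FROM my_table WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    if rows:
        # Save the highest id returned, ready for the next delta-import.
        cur.execute("INSERT INTO last_id_table (id) VALUES (?)", (rows[-1][0],))
        conn.commit()
    return rows

# Hypothetical schema and data, only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE last_id_table (id INTEGER);
    INSERT INTO last_id_table (id) VALUES (0);
    INSERT INTO my_table (id, body) VALUES (1, 'a'), (2, 'b'), (3, 'c');
""")
first = fetch_delta_and_advance(conn)   # rows 1..3
conn.execute("INSERT INTO my_table (id, body) VALUES (4, 'd')")
second = fetch_delta_and_advance(conn)  # only row 4
```

Note that in this sketch the saved id advances as soon as the rows are returned, so if the import fails afterwards those rows would not be picked up by the next delta.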

Ephraim Ofir


-----Original Message-----
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Friday, September 10, 2010 4:54 AM
To: solr-user@lucene.apache.org
Subject: Re: Delta Import with something other than Date

  On 9/9/2010 1:23 PM, Vladimir Sutskever wrote:
> Shawn,
>
> Can you provide a sample of passing the parameter via URL? And how
> using it would look in the data-config.xml
>

Here's the URL that I send to do a full build on my last shard:

http://idxst5-a:8983/solr/build/dataimport?command=full-import&optimize=true&commit=true&dataTable=ncdat&numShards=6&modVal=5&minDid=0&maxDid=242895591

If I want to do a delta, I just change the command to delta-import and 
give it a proper minDid value, rather than 0.
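
If you build these URLs from a script, something like the following Python sketch would do it. The helper name build_dataimport_url and the minDid value used for the delta are hypothetical; the host, path, and parameter names are the ones from my full-build URL.

```python
from urllib.parse import urlencode

def build_dataimport_url(base, command, min_did, max_did,
                         data_table="ncdat", num_shards=6, mod_val=5):
    """Assemble a dataimport request URL carrying the custom parameters."""
    params = {
        "command": command,
        "optimize": "true",
        "commit": "true",
        "dataTable": data_table,
        "numShards": num_shards,
        "modVal": mod_val,
        "minDid": min_did,
        "maxDid": max_did,
    }
    return base + "?" + urlencode(params)

BASE = "http://idxst5-a:8983/solr/build/dataimport"
# Full build: start from did 0.
full_url = build_dataimport_url(BASE, "full-import", 0, 242895591)
# Delta: same parameters, different command and a made-up starting did.
delta_url = build_dataimport_url(BASE, "delta-import", 242000000, 242895591)
```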

Below is the entity from my data-config.xml.  You have to have a 
deltaQuery defined for delta-import to work, but if you're going to use 
your own placeholders, just put something in that returns a single value 
very quickly.  In my case, my query and deltaImportQuery are actually 
identical.

<entity name="dataTable" pk="did"
       query="SELECT *,FROM_UNIXTIME(post_date) as pd
              FROM ${dataimporter.request.dataTable}
              WHERE did &gt; ${dataimporter.request.minDid}
              AND did &lt;= ${dataimporter.request.maxDid}
              AND (did % ${dataimporter.request.numShards}) IN (${dataimporter.request.modVal})"
       deltaQuery="SELECT MAX(did) FROM ${dataimporter.request.dataTable}"
       deltaImportQuery="SELECT *,FROM_UNIXTIME(post_date) as pd
              FROM ${dataimporter.request.dataTable}
              WHERE did &gt; ${dataimporter.request.minDid}
              AND did &lt;= ${dataimporter.request.maxDid}
              AND (did % ${dataimporter.request.numShards}) IN (${dataimporter.request.modVal})">
</entity>
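
To show which rows a given shard ends up with, here is a small Python sketch that mirrors the WHERE clause above. The function name belongs_to_shard and the sample did values are made up; the real filtering happens in the database.

```python
def belongs_to_shard(did, num_shards, mod_vals, min_did, max_did):
    """Mirror of: did > minDid AND did <= maxDid AND (did % numShards) IN (modVal)."""
    return min_did < did <= max_did and (did % num_shards) in mod_vals

# With numShards=6 and modVal=5, only dids whose remainder is 5 are selected.
selected = [d for d in range(1, 13) if belongs_to_shard(d, 6, {5}, 0, 12)]
```

Because modVal goes into an IN (...) list, a comma-separated value such as 0,1 would select more than one remainder for a shard.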

