Shawn,

Is this your custom implementation?

"For a delta-import, minDid comes from the maxDid value stored after the
last successful import."

Are you updating the dataTable after the import succeeds? How do you
handle this? I have a similar scenario, and your approach will work for my
use case as well.

Thanks

From: Shawn Heisey-4 [via Lucene]
[mailto:ml-node+738653-1765413222-124...@n3.nabble.com] 
Sent: Tuesday, April 20, 2010 4:35 PM
To: caman
Subject: Re: DIH dataimport.properties with

 

Michael, 

The SolrEntityProcessor looks very intriguing, but it won't work with 
the released 1.4 version.  If that's OK with you and it looks like it'll 
do what you want, feel free to ignore the rest of this. 

I'm also using MySQL as an import source for Solr.  I was unable to use 
the last_index_time because my database doesn't have a field I can match 
against it.  I believe you can use something similar to the method that 
I came up with.  The point of this post is to show you how to inject 
values from outside Solr into a DIH request rather than have Solr 
provide the milestone that indicates new content. 

Here's a simplified version of my URL template and entity configuration 
in data-config.xml.  The did field in my database is an autoincrement 
BIGINT serving as my primary key, but something similar could likely be 
cooked up with timestamps too: 

http://HOST:PORT/solr/CORE/dataimport?command=COMMAND&dataTable=DATATABLE&minDid=MINDID&maxDid=MAXDID

---- 

<entity name="dataTable" pk="did" 
query="SELECT * FROM ${dataimporter.request.dataTable} WHERE did > 
${dataimporter.request.minDid} AND did &lt;= 
${dataimporter.request.maxDid}" 
deltaQuery="SELECT MAX(did) FROM ${dataimporter.request.dataTable}" 
deltaImportQuery="SELECT * FROM ${dataimporter.request.dataTable} WHERE 
did > ${dataimporter.request.minDid} AND did &lt;= 
${dataimporter.request.maxDid}"> 
</entity> 

---- 
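
As an illustration of how the URL template above might be filled in by an external script, here is a minimal Python sketch. The function name build_dih_url and the host, port, core, and table values are hypothetical; only the parameter names (command, dataTable, minDid, maxDid) come from the template and the entity configuration.

```python
from urllib.parse import urlencode


def build_dih_url(host, port, core, command, data_table, min_did, max_did):
    """Fill in the DIH URL template with request parameters.

    The keys dataTable, minDid and maxDid match the
    ${dataimporter.request.*} placeholders used in the entity.
    """
    params = urlencode({
        "command": command,
        "dataTable": data_table,
        "minDid": min_did,
        "maxDid": max_did,
    })
    return f"http://{host}:{port}/solr/{core}/dataimport?{params}"


# Hypothetical values, for illustration only.
url = build_dih_url("localhost", 8983, "core0", "delta-import", "docs", 1000, 1500)
print(url)
```

Building the query string with urlencode keeps the values safely escaped if a table name or other value ever contains characters that are not URL-safe.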

If I am doing a full-import, I set minDid to zero and maxDid to the 
highest value in the database.  For a delta-import, minDid comes from 
the maxDid value stored after the last successful import. 
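
The bookkeeping described above could be sketched roughly as follows. This is only an assumption about one way to do it: the message does not say where the last maxDid is stored, so the JSON state file and the function names here are hypothetical. The current_max_did argument stands in for the result of a SELECT MAX(did) run against the database by the calling script.

```python
import json
import os


def load_last_max_did(state_path):
    """Return the maxDid saved after the last successful import, or 0."""
    if not os.path.exists(state_path):
        return 0
    with open(state_path) as f:
        return int(json.load(f)["maxDid"])


def plan_import(command, current_max_did, state_path):
    """Pick minDid/maxDid for a full-import or a delta-import."""
    if command == "full-import":
        min_did = 0  # start from the beginning of the table
    elif command == "delta-import":
        min_did = load_last_max_did(state_path)  # resume where we left off
    else:
        raise ValueError(f"unsupported command: {command}")
    return {"command": command, "minDid": min_did, "maxDid": current_max_did}


def record_success(state_path, max_did):
    """Persist maxDid; call only after Solr reports a successful import."""
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"maxDid": max_did}, f)
    os.replace(tmp_path, state_path)  # swap the new state file into place
```

Because the stored value is updated only after a successful import, a failed run leaves minDid unchanged and the next delta-import retries the same range.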

The deltaQuery is required, but in my case, is a throw-away query that 
just tells Solr the delta-import needs to be run.  My query and 
deltaImportQuery are identical, though yours may not be. 

Good luck, no matter how you choose to approach this. 

Shawn 


On 4/18/2010 9:02 PM, Michael Tibben wrote: 


> I don't really understand how this will help. Can you elaborate ? 
> 
> Do you mean that the last_index_time can be imported from somewhere 
> outside solr?  But I need to be able to *set* what last_index_time is 
> stored in dataimport.properties, not get properties from somewhere else 
> 
> 
> 
> On 18/04/10 10:02, Lance Norskog wrote: 
>> The SolrEntityProcessor allows you to query a Solr instance and use 
>> the results as DIH properties. You would have to create your own 
>> regular query to do the delta-import instead of using the delta-import 
>> feature. 





  _____  

View message @
http://n3.nabble.com/DIH-dataimport-properties-with-tp722924p738653.html 

 


-- 
View this message in context: 
http://n3.nabble.com/DIH-dataimport-properties-with-tp722924p738949.html
Sent from the Solr - User mailing list archive at Nabble.com.
