Hi Erik,
DataImportHandler is designed to achieve this using a Transformer.

I am assuming that your API gives delta "deleted/modified/added" documents.

Always run a full-import with clean=false. Depending on the values
returned by the API, your transformer can use $deleteDocById for
deletes, and so on.
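
To make that concrete, here is a rough sketch of such a transformer in Java. It assumes each row from the API carries a "status" field ("deleted", "modified" or "added") and an "id" field; both names are made up for illustration. The special commands used are the ones the DIH documentation calls $deleteDocById and $skipDoc. DIH can call any class that has a transformRow(Map) method, so the sketch needs no Solr imports.

```java
import java.util.Map;

// Sketch only: the "status" and "id" field names are assumptions about
// what the remote API returns, not part of DataImportHandler itself.
// To let DIH load it, this would likely need to be a public class in its
// own file; it is package-private here to keep the example in one piece.
class DeltaStatusTransformer {

    // DIH invokes this method once for every row the entity produces.
    public Object transformRow(Map<String, Object> row) {
        Object status = row.get("status");
        if ("deleted".equals(status)) {
            // Special command: delete the document with this uniqueKey value.
            row.put("$deleteDocById", row.get("id"));
            // Special command: do not index this row as a new document.
            row.put("$skipDoc", "true");
        }
        // Rows marked "added" or "modified" pass through unchanged and are
        // indexed as usual; with clean=false they overwrite documents that
        // have the same uniqueKey.
        return row;
    }
}
```

The class would then be named in the transformer attribute of the entity in data-config.xml.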

$nextUrl and $hasMore can also be used to fetch further pages of data.
Again, these variables can be generated and put into the row by the
Transformer.
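
A paging transformer could look roughly like the sketch below. It assumes every row carries "start", "rows" and "total" values copied from the response, and that the service lives at http://example.com/data.xml; all of these are hypothetical and only loosely modeled on the /data.xml?start=X&rows=Y example in your mail. A real delta run would also have to carry the since parameter over into the next URL.

```java
import java.util.Map;

// Sketch only: BASE_URL and the "start", "rows" and "total" fields are
// assumptions for illustration, not something DIH provides.
class PagingTransformer {

    private static final String BASE_URL = "http://example.com/data.xml";

    // DIH invokes this method once for every row the entity produces.
    public Object transformRow(Map<String, Object> row) {
        long start = toLong(row.get("start"));
        long rows = toLong(row.get("rows"));
        long total = toLong(row.get("total"));
        long next = start + rows;
        if (rows > 0 && next < total) {
            // Tell DIH that another page exists and where to fetch it.
            row.put("$hasMore", "true");
            row.put("$nextUrl", BASE_URL + "?start=" + next + "&rows=" + rows);
        }
        return row;
    }

    // Missing values count as 0, so a row without paging data ends the loop.
    private static long toLong(Object value) {
        return value == null ? 0L : Long.parseLong(value.toString().trim());
    }
}
```

On the last page neither variable is set, so the import stops there.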

We did it for one of our internal APIs for message boards using a
JavaScript transformer. You can do this with a Java transformer as
well.

On Thu, Jul 9, 2009 at 7:57 PM, Erik Hatcher<e...@ehatchersolutions.com> wrote:
> I'm exploring other ways of getting data into Solr via DataImportHandler
> than through a relational database, particularly the URLDataSource.
>
> I see the special commands for deleting by id and query as well as the
> $hasMore/$nextUrl techniques, but I'm unclear on exactly how one would go
> about designing a data source over HTTP that worked cleanly for full
> importing and also for delta indexing.
>
> For sake of argument, suppose I have /data.xml[?since=<some
> timestamp>][&start=X&rows=Y] and it could return documents in Solr XML (or
> really any basic format) since the last time it was updated (or all records
> if no since parameter is provided).  And the service could also return which
> records to remove since that timestamp too.  Can I get there from here using
> URLDataSource?
>
> Have folks been doing this?  If so, anyone care to share some basic
> tips/tricks/examples?
>
> Thanks,
>        Erik
>
>



-- 
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com
