I'm exploring ways of getting data into Solr via DataImportHandler other than from a relational database, particularly URLDataSource.

I see the special commands for deleting by id and by query, as well as the $hasMore/$nextUrl techniques, but I'm unclear on exactly how one would design a data source over HTTP that works cleanly for both full imports and delta indexing.

For the sake of argument, suppose I have /data.xml[?since=<some timestamp>][&start=X&rows=Y], which returns documents in Solr XML (or really any basic format) updated since the given timestamp, or all records if no since parameter is provided. The service could also report which records have been removed since that timestamp. Can I get there from here using URLDataSource?
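
To make the hypothetical concrete, here is a rough Python sketch of the URL scheme I have in mind. The function names, the example host, and the timestamp format are all made up for illustration; none of this is part of Solr or DIH, it only shows how the since/start/rows parameters would combine and how a service might compute the value it hands back for paging.

```python
from urllib.parse import urlencode


def build_feed_url(base, since=None, start=0, rows=100):
    """Build a URL for the hypothetical /data.xml feed.

    Leaving out `since` asks for every record (full import);
    passing it asks only for changes after that timestamp (delta).
    """
    params = {}
    if since is not None:
        params["since"] = since
    params["start"] = start
    params["rows"] = rows
    return base + "?" + urlencode(params)


def next_page_url(base, total, since=None, start=0, rows=100):
    """Return the URL of the following page, or None on the last page.

    `total` is the number of matching records the service knows about.
    """
    next_start = start + rows
    if next_start >= total:
        return None
    return build_feed_url(base, since, next_start, rows)


if __name__ == "__main__":
    base = "http://example.com/data.xml"  # hypothetical host
    print(build_feed_url(base))
    print(build_feed_url(base, since="2010-01-01T00:00:00Z"))
    print(next_page_url(base, total=250, start=100))
```

The idea would be that the service emits the result of next_page_url alongside each page of results, so the importer can keep following it until there is none.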

Have folks been doing this? If so, anyone care to share some basic tips/tricks/examples?

Thanks,
        Erik
