I'm exploring other ways of getting data into Solr via
DataImportHandler than through a relational database, particularly the
URLDataSource.
I see the special commands for deleting by id and query as well as the
$hasMore/$nextUrl techniques, but I'm unclear on exactly how one would
go about designing a data source over HTTP that worked cleanly for
full importing and also for delta indexing.
For the sake of argument, suppose I have /data.xml[?since=<some timestamp>]
[&start=X&rows=Y] that returns, in Solr XML (or really any basic format),
the documents updated since the given timestamp (or all records if no
since parameter is provided). The service could also return which
records have been removed since that timestamp. Can I get there
from here using URLDataSource?
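To make the hypothetical concrete, here is a rough Python sketch of the kind of response body I have in mind for /data.xml. Everything in it is an assumption for illustration: the in-memory RECORDS list, the build_response helper, and the response, doc, delete, hasMore and nextUrl element names are all made up, and this is not Solr XML proper. How, or whether, such a response maps onto DIH's $hasMore/$nextUrl and delete commands is exactly the part I'm unsure of.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory store. "updated" is a Unix timestamp and
# "deleted" marks a tombstone left behind for a removed record.
RECORDS = [
    {"id": "1", "title": "first", "updated": 100, "deleted": False},
    {"id": "2", "title": "second", "updated": 200, "deleted": False},
    {"id": "3", "title": "third", "updated": 300, "deleted": True},
    {"id": "4", "title": "fourth", "updated": 400, "deleted": False},
]


def build_response(records, since=None, start=0, rows=10):
    """Build the body for /data.xml[?since=...][&start=X&rows=Y]."""
    if since is None:
        # Full import: every live record, no tombstones.
        changed = [r for r in records if not r["deleted"]]
    else:
        # Delta: anything touched after the timestamp, including removals.
        changed = [r for r in records if r["updated"] > since]

    page = changed[start:start + rows]
    root = ET.Element("response")
    for r in page:
        if r["deleted"]:
            ET.SubElement(root, "delete", id=r["id"])
        else:
            doc = ET.SubElement(root, "doc")
            ET.SubElement(doc, "id").text = r["id"]
            ET.SubElement(doc, "title").text = r["title"]

    # Paging hints: whether another page exists and where to fetch it.
    has_more = start + rows < len(changed)
    ET.SubElement(root, "hasMore").text = "true" if has_more else "false"
    if has_more:
        next_url = "/data.xml?start=%d&rows=%d" % (start + rows, rows)
        if since is not None:
            next_url += "&since=%d" % since
        ET.SubElement(root, "nextUrl").text = next_url
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_response(RECORDS, rows=2))    # first page of a full import
    print(build_response(RECORDS, since=250)) # delta with one removal
```

A full import would page through the live records by following nextUrl until hasMore is false; a delta request would carry the since timestamp on every page and mix updated documents with delete markers.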
Have folks been doing this? If so, anyone care to share some basic
tips/tricks/examples?
Thanks,
Erik