I would like the URLDataSource to make RESTful calls to fetch content and to 
re-index only when that content changes.  This means using HTTP request headers 
to make the request and using the response headers to determine when to make 
the next request.  For example,

Request Headers:

Accept: application/xml
If-Modified-Since: timestamp


Response Headers:

Expires: timestamp
ETag: etag

In this case Solr would request the specified media type by adding it to the 
Accept header.  On every request after the first, it would also send an 
If-Modified-Since header containing the time the last indexing took place, so 
that content is indexed again only if it has changed since then.  The first 
time it is contacted, the RESTful service would return the content along with 
an Expires header, which tells Solr when it should next check for new content 
to index.  At that point the RESTful service could return either 304 Not 
Modified or new content.  If it returns new content, that content is indexed.  
Otherwise, Solr reads the new Expires header to see when it should make the 
next request.
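
A minimal sketch of the request/response handling described above, using only 
the JDK 11+ java.net.http and java.time classes.  This is not Solr or 
URLDataSource code; the class name ConditionalFetchSketch and the helpers 
httpDate, buildRequest, shouldReindex and nextCheck are hypothetical names 
chosen for illustration, and the fallback time is an assumption about what to 
do when the Expires header is missing or unparseable.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical helper class; not part of Solr.
public class ConditionalFetchSketch {

    // Formats timestamps as HTTP dates, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    private static final DateTimeFormatter HTTP_DATE =
        DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.ENGLISH)
                         .withZone(ZoneOffset.UTC);

    static String httpDate(Instant t) {
        return HTTP_DATE.format(t);
    }

    // Builds the GET request. lastIndexed is null on the very first request,
    // in which case no If-Modified-Since header is sent.
    static HttpRequest buildRequest(String url, Instant lastIndexed) {
        HttpRequest.Builder b = HttpRequest.newBuilder(URI.create(url))
            .header("Accept", "application/xml")
            .GET();
        if (lastIndexed != null) {
            b.header("If-Modified-Since", httpDate(lastIndexed));
        }
        return b.build();
    }

    // 200 means new content to index; 304 Not Modified means nothing to do.
    static boolean shouldReindex(int status) {
        return status == 200;
    }

    // Reads the Expires header value to decide when to check again.
    // Falls back to the given time if the header is absent or malformed
    // (the fallback policy is an assumption, not something from the post).
    static Instant nextCheck(String expires, Instant fallback) {
        if (expires == null) {
            return fallback;
        }
        try {
            return ZonedDateTime.parse(expires, DateTimeFormatter.RFC_1123_DATE_TIME)
                                .toInstant();
        } catch (DateTimeParseException e) {
            return fallback;
        }
    }
}
```

The request would then be sent with something like 
HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), 
passing response.statusCode() to shouldReindex and 
response.headers().firstValue("Expires").orElse(null) to nextCheck.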


My question is whether anything in Solr currently supports this, or whether I 
would have to implement it myself.  I wasn't able to find anything.  

thanks,

Jason

