On 23 Jan 2009, at 10:10, Noble Paul നോബിള് नोब्ळ् wrote:
If the response is not XML, then there is no EntityProcessor that can consume it. We may need to add one.
Well, even binary data such as Word documents (base64-encoded, for example) run the risk of appearing here. They sure need a pile of filters!
What bothers me with the HttpDataSource example is that, for now at least, it is configured to pull a single URL, while what is needed (and would provide delta ability) is really to index a list of URLs (for which one would regularly pull the list of recently updated URLs, or simply use GET with If-Modified-Since on all of them).
If-Modified-Since is not supported by HttpDataSource. However, you can write a transformer which pings the URL with an If-Modified-Since header and skips the document using the $skipDoc option.
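The transformer suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from DataImportHandler: the class name and the column names `url` and `lastIndexedMillis` are hypothetical, the use of a HEAD request is a choice made here, and the single-argument `transformRow(Map)` method is assumed to be picked up by DIH by convention (otherwise the class would extend DIH's Transformer). Only `$skipDoc` comes from the suggestion itself.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Map;

// Hypothetical sketch: flags a row with $skipDoc when the server answers 304.
class IfModifiedSinceTransformer {
    // Assumed column names; adapt them to the entity that lists the URLs.
    static final String URL_COLUMN = "url";
    static final String LAST_INDEXED_COLUMN = "lastIndexedMillis";

    // Entry point, assumed to be called once per row by the import.
    public Object transformRow(Map<String, Object> row) {
        Object url = row.get(URL_COLUMN);
        Object since = row.get(LAST_INDEXED_COLUMN);
        if (url == null || !(since instanceof Long)) {
            return row; // nothing to compare against, index as usual
        }
        try {
            return applyStatus(row, fetchStatus(url.toString(), (Long) since));
        } catch (IOException e) {
            return row; // on network errors, do not skip the document
        }
    }

    // Sends a conditional HEAD request and returns the HTTP status code.
    static int fetchStatus(String url, long sinceMillis) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            conn.setRequestMethod("HEAD");
            conn.setIfModifiedSince(sinceMillis); // sets the If-Modified-Since header
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    // Marks the row as skippable when the resource has not changed (304).
    static Map<String, Object> applyStatus(Map<String, Object> row, int status) {
        if (status == HttpURLConnection.HTTP_NOT_MODIFIED) {
            row.put("$skipDoc", "true");
        }
        return row;
    }
}
```

Keeping the status handling in `applyStatus` separate from the network call makes the skip logic easy to check without a server. Note that this only decides whether one row is skipped; it does not by itself answer how a list of URLs reaches the data source.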
I still don't understand how you give several documents to the HttpDataSource.
The configuration seems to allow only a single URL. Am I missing something?
paul
PS: Would it be worth chatting about that on irc.freenode.net#solr?