Re: URL-import field type?

Noble Paul നോബിള്‍ नोब्ळ् Fri, 23 Jan 2009 03:06:37 -0800

On Fri, Jan 23, 2009 at 2:55 PM, Paul Libbrecht <p...@activemath.org> wrote:
>
> On 23 Jan 2009, at 10:10, Noble Paul നോബിള്‍ नोब्ळ् wrote:
>>
>> If the response is not XML, then there is no EntityProcessor that can
>> consume it. We may need to add one.
>
> Well, even binary data such as Word documents (base64-encoded, for example)
> run the risk of appearing here. They sure need a pile of filters!
>
>>> What bothers me with the HttpDataSource example is that, for now at
>>> least, it is configured to pull a single URL, while what is needed (and
>>> would provide delta ability) is really to index a list of URLs (for which
>>> one would regularly pull the list of recently updated URLs or simply use
>>> GET with If-Modified-Since on all of them).
>>
>> If-Modified-Since is not supported by HttpDataSource. However, you
>> can write a transformer which pings the URL with an If-Modified-Since
>> header and skips the document using the $skipDoc option.
>
> I still don't understand how you give several documents to the
> HttpDataSource.
> The configuration seems only to allow a single URL.
> Am I missing something?
The DataSource is just a helper class. The only intelligent piece here
is the EntityProcessor.
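
For the transformer idea quoted above, here is a rough Java sketch of what I
have in mind. Treat it as an illustration under assumptions, not tested DIH
code: the class name, the "url" and "lastIndexed" column names, and the
StatusProbe interface are all made up for the example. It also assumes DIH
will accept a plain class exposing transformRow(Map); check that against your
Solr version, and make the class public in its own file before wiring it in.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Map;

// Hypothetical transformer sketch. The class name and the "url" and
// "lastIndexed" column names are assumptions, not something DIH defines.
class IfModifiedSinceTransformer {

    // Hypothetical seam so the HTTP check can be swapped out in tests.
    interface StatusProbe {
        int status(String url, long sinceMillis) throws IOException;
    }

    private final StatusProbe probe;

    // Default constructor: probe over real HTTP.
    IfModifiedSinceTransformer() {
        this(IfModifiedSinceTransformer::httpStatus);
    }

    IfModifiedSinceTransformer(StatusProbe probe) {
        this.probe = probe;
    }

    // Sends a HEAD request carrying an If-Modified-Since header and
    // returns the HTTP status code (304 means "not modified").
    static int httpStatus(String url, long sinceMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try {
            conn.setRequestMethod("HEAD");
            conn.setIfModifiedSince(sinceMillis);
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    // Row-level hook: flag unchanged documents so the import skips them.
    public Object transformRow(Map<String, Object> row) {
        Object url = row.get("url");
        Object since = row.get("lastIndexed");
        if (url == null || !(since instanceof Long)) {
            return row; // nothing to compare against, index as usual
        }
        try {
            int status = probe.status(url.toString(), (Long) since);
            if (status == HttpURLConnection.HTTP_NOT_MODIFIED) {
                row.put("$skipDoc", "true");
            }
        } catch (IOException e) {
            // On a network failure, leave the row alone and let the import proceed.
        }
        return row;
    }
}
```

The row is only flagged when the server answers 304; any other status, a
missing timestamp, or a network error leaves the row untouched so the document
is indexed as usual.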
>
> paul
>
> PS: would it be worth chatting about that on irc.freenode.net#solr ?




-- 
--Noble Paul
