Tags: moreinfo

> I don't know whether this behaviour is intended or not:  during recursive
> retrieval, when wget has to decide whether to enqueue or not a discovered
> url, it scans accept/reject lists with u->file, instead of u->url.
> as a result, it is often wrong about what is to be crawled.

Note that u->file contains just the "file" portion of the URL, so it should
generally contain the same thing as whatever comes after the final / in u->url.
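
To make the distinction concrete, here is a rough sketch (in Python, not
wget's actual C code) of how matching a pattern against only the file portion
differs from matching against the whole URL. The helper names file_part and
matches, the example URL, and the patterns are all made up for illustration,
and the sketch assumes plain shell-style wildcard matching.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def file_part(url):
    """Return whatever follows the final '/' in the URL's path (hypothetical helper)."""
    path = urlsplit(url).path
    return path.rsplit("/", 1)[-1]


def matches(patterns, text):
    """True if text matches any shell-style wildcard pattern in patterns."""
    return any(fnmatchcase(text, p) for p in patterns)


# Illustrative URL and patterns; none of these come from the original report.
url = "http://example.com/docs/manual/index.html"
suffix_patterns = ["*.html"]
directory_patterns = ["*/docs/*"]

file_only = file_part(url)

# A pattern about the file name behaves the same either way.
print(matches(suffix_patterns, file_only))
print(matches(suffix_patterns, url))

# A pattern that mentions a directory can only match the full URL,
# because the file portion no longer contains any '/'.
print(matches(directory_patterns, file_only))
print(matches(directory_patterns, url))
```

If the patterns in question only describe file names or suffixes, the two
approaches should agree; they can diverge once a pattern refers to a host or
directory component.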

What, specifically, is it "often wrong" about when deciding what to crawl?

-- 
Micah J. Cowan
GNU Wget Maintainer



