Re: webcrawl to cache dynamic pages

2005-05-09 Thread Richard Lyons
On Mon, May 09, 2005 at 10:31:08AM +0100, David Hugh-Jones wrote:
>
> On 09/05/05, Richard Lyons <[EMAIL PROTECTED]> wrote:
> > On Sun, May 08, 2005 at 09:48:07AM +0200, Nacho wrote:
> > > > On Mon, May 02, 2005 at 01:27:41PM +0100, Richard Lyons wrote:
> > > > > I am considering how to crawl a si

Re: webcrawl to cache dynamic pages

2005-05-09 Thread David Hugh-Jones
If you end up wanting to do something more complicated, you could look into WWW::Mechanize: http://search.cpan.org/perldoc?WWW%3A%3AMechanize

David

On 09/05/05, Richard Lyons <[EMAIL PROTECTED]> wrote:
> On Sun, May 08, 2005 at 09:48:07AM +0200, Nacho wrote:
> > > On Mon, May 02, 2005 at 01:27:4
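
In case it helps anyone digging this thread out of the archive later, a minimal sketch of the sort of WWW::Mechanize crawl David is pointing at might look like the following. The start URL, output directory and filename scheme are invented for illustration and query strings are ignored, so treat it as a starting point rather than the module's prescribed usage:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use WWW::Mechanize;
    use URI;
    use File::Path qw(mkpath);

    # Hypothetical starting URL and output directory, for illustration only.
    my $start  = 'http://www.example.org/index.php';
    my $outdir = 'static-copy';
    my $host   = URI->new($start)->host;

    my $mech = WWW::Mechanize->new( autocheck => 0 );
    my %seen;
    my @queue = ($start);

    while ( my $url = shift @queue ) {
        next if $seen{$url}++;
        $mech->get($url);
        next unless $mech->success && $mech->is_html;

        # Save the generated page under a filename derived from its path.
        my $path = URI->new($url)->path;
        $path = '/index.html' if $path eq '' || $path =~ m{/$};
        my $file = $outdir . $path;
        ( my $dir = $file ) =~ s{/[^/]+$}{};
        mkpath($dir);
        open my $fh, '>', $file or die "cannot write $file: $!";
        print {$fh} $mech->content;
        close $fh;

        # Queue further HTTP links on the same host.
        for my $link ( $mech->links ) {
            my $abs = $link->url_abs;
            next unless $abs->scheme =~ /^https?$/;
            push @queue, "$abs" if $abs->host eq $host;
        }
    }

A real run against a dynamic site would also have to fold query strings (?id=42 and the like) into the saved filenames and rewrite the links to match, which is precisely the part that makes caching generated pages awkward.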

Re: webcrawl to cache dynamic pages

2005-05-08 Thread Richard Lyons
On Sun, May 08, 2005 at 09:48:07AM +0200, Nacho wrote:
> > On Mon, May 02, 2005 at 01:27:41PM +0100, Richard Lyons wrote:
> > > I am considering how to crawl a site which is dynamically generated,
> > > and create a static version of all generated pages (or selected [...]
>
> Well, I don't know an

Re: webcrawl to cache dynamic pages

2005-05-08 Thread Nacho
> On Mon, May 02, 2005 at 01:27:41PM +0100, Richard Lyons wrote:
> > I am considering how to crawl a site which is dynamically generated,
> > and create a static version of all generated pages (or selected
> > generated pages). I guess it would be simplest to start with an
> > existing crawler, an

Re: webcrawl to cache dynamic pages

2005-05-06 Thread Caleb Walker
wget doesn't do what you want?

On 5/6/05, Richard Lyons <[EMAIL PROTECTED]> wrote:
> On Mon, May 02, 2005 at 01:27:41PM +0100, Richard Lyons wrote:
> > I am considering how to crawl a site which is dynamically generated,
> > and create a static version of all generated pages (or selected
> > genera
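
For reference, and assuming a reasonably recent wget (the flags below are from the 1.9/1.10 manuals, and the URL is a placeholder), the usual way to pull a browsable static copy of a site is roughly:

    wget --mirror --convert-links --html-extension --page-requisites \
         --no-parent http://www.example.org/

--html-extension is the interesting one for generated pages: a page served from something like page.php?id=3 gets saved under a name ending in .html, and --convert-links rewrites the internal references to match. Whether that is enough for this particular site is exactly the question Richard is raising.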

Re: webcrawl to cache dynamic pages

2005-05-06 Thread Richard Lyons
On Mon, May 02, 2005 at 01:27:41PM +0100, Richard Lyons wrote:
> I am considering how to crawl a site which is dynamically generated,
> and create a static version of all generated pages (or selected
> generated pages). I guess it would be simplest to start with an
> existing crawler, and bolt on