Package: harvestman
Severity: normal

How are you crawling and downloading websites, files, and images?
Do you need something better?
It's time for a change!
Download the beta version of the HarvestMan crawler today!

HarvestMan is a modular, extensible and flexible web crawler program
written in pure Python. HarvestMan can be used to download files from
websites according to a number of customized rules and constraints. It
can also be used to find information on websites by matching keywords or
regular expressions. The latest version of HarvestMan supports more
than 60 customization options.
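
To give a rough idea of what rule-based crawling means, here is a minimal
sketch of filtering URLs with regular expressions. It is not HarvestMan's
code or API: the names INCLUDE, EXCLUDE, allowed and filter_urls, and the
example.com URLs, are made up for illustration only.

```python
import re

# Hypothetical rules, for illustration only (not HarvestMan options).
INCLUDE = [re.compile(r"\.(?:png|jpe?g|gif)$", re.IGNORECASE)]  # image files
EXCLUDE = [re.compile(r"/private/")]                            # skip this path


def allowed(url):
    """Return True if url matches an include rule and no exclude rule."""
    if any(p.search(url) for p in EXCLUDE):
        return False
    return any(p.search(url) for p in INCLUDE)


def filter_urls(urls):
    """Keep only the URLs that pass the rules, preserving their order."""
    return [u for u in urls if allowed(u)]


if __name__ == "__main__":
    candidates = [
        "http://example.com/a.png",
        "http://example.com/private/b.jpg",
        "http://example.com/index.html",
    ]
    for url in filter_urls(candidates):
        print(url)
```

Exclude rules are checked first here, so a URL under /private/ is dropped
even when it looks like an image.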

Download the files here:
http://harvestman-crawler.googlecode.com/files/Harvestman-2.0.4beta.tar.gz

Unpack and install:
tar -xzvf Harvestman-2.0.4beta.tar.gz
cd Harvestman-2.0.4beta
python setup.py install

Run the self-test and create a config file:
harvestman --selftest
harvestman --genconfig
(This opens an easy web GUI. Add the site you want to crawl and all
the details, then save the config XML file.)

Run harvestman with the config file:
harvestman -C mycrawl.xml
or use harvestman directly from the command line:
harvestman -h
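
If you would rather start the crawl from a script, something like the
following might work. It is only a sketch using the Python 3 standard
library: build_command and run_crawl are hypothetical helper names, and the
only harvestman option used is the -C shown above.

```python
import shutil
import subprocess


def build_command(config_path):
    """Build the argument list for: harvestman -C <config_path>."""
    return ["harvestman", "-C", config_path]


def run_crawl(config_path):
    """Run the crawl if the harvestman executable is on PATH.

    Returns the process exit code, or None if harvestman is not found.
    """
    if shutil.which("harvestman") is None:
        return None
    return subprocess.run(build_command(config_path)).returncode


if __name__ == "__main__":
    code = run_crawl("mycrawl.xml")
    if code is None:
        print("harvestman is not installed or not on PATH")
    else:
        print("harvestman exited with code", code)
```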

Project website:
http://code.google.com/p/harvestman-crawler/


Forward this to anybody who might be interested!

Thank you,
Harvestman Team

-- System Information:
Debian Release: 4.0
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-4-486
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)


