Package: harvestman
Severity: normal
How are you crawling and downloading websites, files, and images? Do you need something better? It's time for a change! Download the beta version of the HarvestMan crawler today.

HarvestMan is a modular, extensible, and flexible web crawler written in pure Python. It can download files from websites according to a number of customizable rules and constraints, and it can find information on websites by matching keywords or regular expressions. The latest version of HarvestMan supports more than 60 customization options.

Download the files here:
http://harvestman-crawler.googlecode.com/files/Harvestman-2.0.4beta.tar.gz

Unzip and install:

    tar -xzvf Harvestman-2.0.4beta.tar.gz
    cd Harvestman-2.0.4beta
    python setup.py install

Create a config file and run harvestman:

    harvestman --selftest
    harvestman --genconfig

(Open the easy web GUI, add the site you want to crawl and all the details, then save the config XML file.)

Run harvestman:

    harvestman -C mycrawl.xml

or use harvestman from the command line:

    harvestman -h

Project website: http://code.google.com/p/harvestman-crawler/

Forward to anybody that might be interested!

Thank you,
Harvestman Team

-- System Information:
Debian Release: 4.0
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-4-486
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)