On Tue, 20 Jan 2015 17:47:58 +0000
Mike Martin <[email protected]> wrote:
> Take a load of Job Vacancy posts (xml files - loads of)
> Parse the Information, getting rid of as much garbage as possible
> Push a distinct list into a lookup hash
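If you're running Linux (or any POSIX system), see `man sort` and search
for /-u/.
sort(1) is compiled C, and GNU sort can spill to temporary files, so it
may beat a Perl hash on very long lists. Benchmark it, though: sort -u
is O(n log n), while inserting into a hash is roughly O(n).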
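For comparison, here is a minimal sketch of the hash approach in Perl.
The sample titles are made up; yours would come from the parsed XML. The
shell equivalent would be something like `sort -u titles.txt`, where
titles.txt is a hypothetical file with one title per line.

```perl
use strict;
use warnings;

# Hypothetical sample data; real input would come from the parsed XML.
my @titles = ('Senior Developer', 'Nurse', 'Senior Developer', 'Driver', 'Nurse');

# Keep only the first occurrence of each title.
# $seen{$_}++ is 0 (false) the first time a title appears, true afterwards.
my %seen;
my @unique = grep { !$seen{$_}++ } @titles;

print "$_\n" for @unique;
```

Unlike sort -u, this keeps the titles in first-seen order rather than
sorted order.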
> Do replace to this list against a long list of regexes
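This is a cross product: every title has to be tested against every
regex, so the work is (number of titles) x (number of regexes). I don't
know of a general way to reduce that when each regex has its own
replacement.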
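A rough sketch of what that loop looks like in Perl, assuming the rules
are kept as pattern/replacement pairs. The rules and titles below are
invented for illustration, not taken from your data. Compiling each
pattern once with qr// at least avoids recompiling it for every title.

```perl
use strict;
use warnings;

# Hypothetical cleanup rules: [ compiled pattern, replacement string ].
my @rules = (
    [ qr/\s*\(.*?\)/,       ''  ],   # drop parenthesised notes
    [ qr/\s*-\s*urgent\b/i, ''  ],   # drop a trailing "- urgent" tag
    [ qr/\s+/,              ' ' ],   # collapse runs of whitespace
);

# Hypothetical input titles.
my @titles = ('Senior  Developer (London)', 'Nurse - URGENT');

my $tests = 0;    # counts title/regex pairs tried
my @clean;
for my $title (@titles) {
    my $t = $title;
    for my $rule (@rules) {
        my ($re, $to) = @$rule;
        $t =~ s/$re/$to/g;    # every rule is tried on every title
        $tests++;
    }
    push @clean, $t;
}

print "$_\n" for @clean;
print "pairs tested: $tests\n";
```

With 2 titles and 3 rules that is 6 pairs. De-duplicating the titles
first is the cheap win, since it shrinks one side of the product.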
> Spit out nicely formatted Clean Job Titles
--
Don't stop where the ink does.
Shawn