Package: dailystrips
Version: 1.0.28-9
Severity: wishlist

Hi!

I'm well aware that dailystrips is unmaintained, but I wanted to add this to the bug tracker as a reminder to myself. Maybe I'll come up with a patch a bit later.

For several comics I have to use a dirty hack with curl to get to the comic URL. One example is my current definition for Order Of The Stick (the one in strips.def doesn't work any more):

    strip orderofthestick
        name Order Of The Stick
        homepage http://www.giantitp.com/
        type search
        searchpage <code:my $firstpage = `curl -s $homepage`;if ($firstpage =~ /<A href="(\/comics\/oots.+\.html)"/) {"$homepage$1"} else {print "$firstpage\n"; "BROKEN_CONFIG!"}>
        searchpattern <IMG.+?src.+?=.+?"(/comics/images/.+?.gif)">
        baseurl $homepage
        provides latest
    end

I bet the <code> snippet could be cleaned up by using a Perl function that dailystrips already includes, and the curl requirement could be dropped. However, I can think of a much more versatile solution. Something like this:

    strip orderofthestick
        name Order Of The Stick
        homepage http://www.giantitp.com/
        type search
        searchpage $homepage
        pagepattern <A href="(\/comics\/oots.+\.html)"
        searchpattern <IMG.+?src.+?=.+?"(/comics/images/.+?.gif)">
        baseurl $homepage
        provides latest
    end

The idea here is that the user can specify multiple consecutive instances of pagepattern. They go into an array and are applied in order: each $pagepattern is matched against the current $searchpage, and the match is used to calculate the next value of $searchpage. Finally, $searchpattern is matched against the last page to get the image URL, like this:

    strip pagepattern_example
        name Pagepattern Example
        homepage http://www.debian.org/
        type search
        searchpage $homepage
        pagepattern <a href="(newspage\d.html)">
        pagepattern <a href="(comicpage\d.html)">
        searchpattern (comic\d.gif)
        baseurl $homepage/comics/
        provides latest
    end

An even more flexible approach would allow specifying which part of the previous match should be used as searchpage, but I doubt that would be needed (unless content providers decide to make their pages even more convoluted).

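For illustration, the pagepattern chain described above can be sketched roughly as follows. This is Python rather than dailystrips' Perl, and it is not a patch. The function names `fetch_page` and `resolve_image_url`, the injectable `fetch` parameter, and the choice to resolve each captured link relative to the page it was found on are all assumptions made for the sketch, not existing dailystrips behaviour.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen


def fetch_page(url):
    """Default fetcher (hypothetical helper): download a page and return its text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def resolve_image_url(searchpage, pagepatterns, searchpattern, baseurl, fetch=fetch_page):
    """Follow each pagepattern in order, then apply searchpattern.

    Returns the image URL, or None as soon as any pattern fails to match.
    """
    for pattern in pagepatterns:
        match = re.search(pattern, fetch(searchpage))
        if match is None:
            return None
        # Assumption: the captured link is resolved against the current page URL.
        searchpage = urljoin(searchpage, match.group(1))

    match = re.search(searchpattern, fetch(searchpage))
    if match is None:
        return None
    # Assumption: the final match is resolved against baseurl.
    return urljoin(baseurl, match.group(1))


if __name__ == "__main__":
    # Fake pages standing in for the pagepattern_example definition, so the
    # demo runs without network access.
    fake_pages = {
        "http://www.debian.org/": '<a href="newspage1.html">',
        "http://www.debian.org/newspage1.html": '<a href="comicpage2.html">',
        "http://www.debian.org/comicpage2.html": '<img src="comic3.gif">',
    }
    print(resolve_image_url(
        "http://www.debian.org/",
        [r'<a href="(newspage\d.html)">', r'<a href="(comicpage\d.html)">'],
        r"(comic\d.gif)",
        "http://www.debian.org/comics/",
        fetch=fake_pages.__getitem__,
    ))
```

With an empty pagepattern list the function behaves like a plain "type search" definition, so existing definitions would not need to change under this scheme.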
(NB: If more cartoon pages used full-content RSS feeds *with* ads, as Dilbert does, I'd happily use them and end this content snarfing business once and for all.)

OK, don't hold your breath for patches. Maybe someone else will pick up this idea.

-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.30-1-amd64 (SMP w/1 CPU core)
Locale: LANG=de_DE.utf8, LC_CTYPE=de_DE.utf8 (charmap=UTF-8) (ignored: LC_ALL set to de_DE.utf8)
Shell: /bin/sh linked to /bin/bash

Versions of packages dailystrips depends on:
ii  debconf [debconf-2.0]  1.5.27     Debian configuration management sy
ii  libtimedate-perl       1.1600-9   Time and date functions for Perl
ii  libwww-perl            5.829-1    WWW client/server library for Perl
ii  perl                   5.10.0-24  Larry Wall's Practical Extraction

dailystrips recommends no packages.

dailystrips suggests no packages.

-- debconf information:
  dailystrips/warning-etcdefs: