Package: dailystrips
Version: 1.0.28-9
Severity: wishlist

Hi!

I'm well aware that dailystrips is unmaintained, but I wanted to add
this to the bug tracker to give myself a reminder. Maybe I'll come up
with a patch a bit later.

For several comics I have to use a dirty hack with curl to get to the
comic URL.

One example is my current definition for Order Of The Stick (as the one in 
strips.def doesn't work any more):

strip orderofthestick
      name Order Of The Stick
      homepage http://www.giantitp.com/
      type search
      searchpage <code:my $firstpage = `curl -s $homepage`;if ($firstpage =~ /<A href="(\/comics\/oots.+\.html)"/) {"$homepage$1"} else {print "$firstpage\n"; "BROKEN_CONFIG!"}>
      searchpattern <IMG.+?src.+?=.+?"(/comics/images/.+?.gif)">
      baseurl $homepage
      provides latest
end

I bet the <code> snippet could be cleaned up by using a Perl function
that dailystrips already includes, so that the curl requirement could
be dropped.
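As a rough illustration of the curl-free variant, here is a minimal
standalone Perl sketch. It assumes LWP::Simple from libwww-perl (which
dailystrips already depends on) for fetching; whether dailystrips exposes
a helper of its own for this is not checked here. The names fetch_page and
find_comic_page are hypothetical, and the regex is made non-greedy, which
differs slightly from the definition above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of dailystrips): fetch a page without curl.
# LWP::Simple ships with libwww-perl and is loaded lazily, so the
# link-extraction function below can be used without network access.
sub fetch_page {
    my ($url) = @_;
    require LWP::Simple;
    return LWP::Simple::get($url);    # undef if the request fails
}

# Hypothetical helper: find the first Order Of The Stick comic page link
# in the homepage HTML and turn it into an absolute URL.
sub find_comic_page {
    my ($homepage, $html) = @_;
    return unless defined $html;           # fetch failed
    (my $base = $homepage) =~ s{/+$}{};    # avoid a double slash when joining
    return "$base$1" if $html =~ m{<A href="(/comics/oots.+?\.html)"};
    return;                                # no link found
}
```

With these two pieces the lookup would be
find_comic_page($homepage, fetch_page($homepage)), and a missing link
shows up as undef instead of a "BROKEN_CONFIG!" string.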

However, I can think of a much more versatile solution. Something like this:

strip orderofthestick
      name Order Of The Stick
      homepage http://www.giantitp.com/
      type search
      searchpage $homepage
      pagepattern <A href="(\/comics\/oots.+\.html)"
      searchpattern <IMG.+?src.+?=.+?"(/comics/images/.+?.gif)">
      baseurl $homepage
      provides latest
end

The idea here is that the user can specify multiple consecutive
pagepattern lines, which are collected in an array. Each pagepattern
is applied in turn to the page at the current $searchpage, and its
match becomes the next value of $searchpage. Once all pagepatterns
have been used, $searchpattern is applied to the final page to get
the image URL, like this:


strip pagepattern_example
      name Pagepattern Example
      homepage http://www.debian.org/
      type search
      searchpage $homepage
      pagepattern <a href="(newspage\d.html)">
      pagepattern <a href="(comicpage\d.html)">
      searchpattern (comic\d.gif)
      baseurl $homepage/comics/
      provides latest
end
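The chained lookup proposed here could be sketched in Perl roughly as
follows. This is only an illustration under stated assumptions, not
dailystrips code: resolve_image_url and join_url are hypothetical names,
the fetcher is passed in as a code reference, relative links are resolved
with a simplified helper (real code would more likely use the URI module),
and the image URL is assumed to be baseurl followed by the searchpattern
match.

```perl
use strict;
use warnings;

# Simplified link resolution (assumption: only http/https URLs, no "../").
sub join_url {
    my ($base, $link) = @_;
    return $link if $link =~ m{^https?://}i;      # already absolute
    $base .= '/' if $base =~ m{^https?://[^/]+$}i; # bare host: add root path
    if ($link =~ m{^/}) {                          # host-relative link
        my ($root) = $base =~ m{^(https?://[^/]+)}i
            or die "unsupported base URL: $base\n";
        return $root . $link;
    }
    (my $dir = $base) =~ s{[^/]*$}{};              # strip the file part
    return $dir . $link;
}

# Follow each pagepattern in order, then apply searchpattern to the last page.
sub resolve_image_url {
    my (%def) = @_;
    my $fetch = $def{fetch};        # code ref: URL -> HTML (undef on failure)
    my $page  = $def{searchpage};

    for my $pattern (@{ $def{pagepattern} || [] }) {
        my $html = $fetch->($page);
        defined $html or die "could not fetch $page\n";
        my ($match) = $html =~ /$pattern/
            or die "pagepattern '$pattern' did not match on $page\n";
        $page = join_url($page, $match);           # next value of searchpage
    }

    my $html = $fetch->($page);
    defined $html or die "could not fetch $page\n";
    my ($image) = $html =~ /$def{searchpattern}/
        or die "searchpattern did not match on $page\n";
    return $def{baseurl} . $image;
}
```

Passing the fetcher as a parameter keeps the sketch testable without
network access; a definition with no pagepattern lines simply falls
through to the searchpattern step, so it would behave like a plain
"type search" strip.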

An even more flexible approach would let the user specify which part
of the previous match should be used as searchpage, but I doubt that
would be needed (unless content providers decide to make their pages
even more convoluted).

(NB: If more comic sites offered full-content RSS feeds *with* ads,
as Dilbert does, I'd happily use them and end this content snarfing
business once and for all.)

Ok, don't hold your breath for patches. Maybe someone else will pick
up this idea.

-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.30-1-amd64 (SMP w/1 CPU core)
Locale: LANG=de_DE.utf8, LC_CTYPE=de_DE.utf8 (charmap=UTF-8) (ignored: LC_ALL set to de_DE.utf8)
Shell: /bin/sh linked to /bin/bash

Versions of packages dailystrips depends on:
ii  debconf [debconf-2.0]         1.5.27     Debian configuration management sy
ii  libtimedate-perl              1.1600-9   Time and date functions for Perl
ii  libwww-perl                   5.829-1    WWW client/server library for Perl
ii  perl                          5.10.0-24  Larry Wall's Practical Extraction

dailystrips recommends no packages.

dailystrips suggests no packages.

-- debconf information:
  dailystrips/warning-etcdefs:


