On Wed 26 January 2005 03:47, Dirk Eddelbuettel wrote:
> In response to the mail by Pierre Habouzit dated 25 January 2005 at 14:10:
> | On Tuesday 25 January 2005 13:20, Dirk Eddelbuettel wrote:
> | > Package: akregator
> | > Version: 1.0-beta8-2
> | > Severity: important
> | >
> | > Salut Pierre,
> | >
> | > Thanks for looking akregator -- a great rss reader. Once in a
> | > while, and in particular on Debian Planet, someone puts a '&'
> | > into a story heading. Currently this can be seen on
> | > planet.debian.org in the story from Marga.
> | >
> | > Akregator then stops updating the feed. This is rather annoying.
> | > I discussed this with the authors of the planet code, and their
> | > take is that their Python toolset, in particular the rss part, is
> | > robust -- it is the readers that are at fault.
> |
> | they are wrong.
> |
> | I don't say that akregator shouldn't be more robust to badly
> | encoded entities, but the spec for RSS is xml, and in xml you have
> | to put things like that :
> |
> |     <tag>put an ampersand : &amp;</tag>
> |
> | OR
> |
> |     <tag><![CDATA[put an ampersand : &]]></tag>
> |
> | any other solution is NOT correct, so you can bug debianplanet too.
> | or the python module they use.
>
> Can't resist CCing Scott here as I pestered him (unsuccessfully)
> about this before.

and if he is not convinced, ask him to open debian planet in
mozilla/firefox or even the latest konqueror. the verdict is unanimous:
« xml error »

> | the bug is well known ([1] : funny, it is a debian planet problem
> | too), and is Qt's fault, since akregator uses the Qt xml API that
> | has a very strict (but also correct) parser.
>
> Well, is someone going to fix it?

the problem is: we cannot easily act on the Qt xml parser. and doing
some preg_replace(/&\b/, /&amp;/, feed) will not work, because of the
<![CDATA[ ]]> sections, where a bare & is legal and must stay as it is.
that's why nothing has been done.

the only correct solutions I can see would be either to write a parser
that only corrects the feed, but that's overkill and would moreover cost
too much time, or to use another XML parser/API. but that's a big, big
job, since, as you can imagine, the XML API has a major place in
akregator :)

I guess we (I mean the akregator coder team) should look at what
konqueror uses ... I guess there are KXML classes. but it would mean a
major rewrite.

> | anyway, the bug is certainly not important, at most normal, since
> | it only stop the fetch of the faulty feed, and not of the others.
>
> Don't disagree completely but reading Planet Debian is important to
> me, and I can't read it right now. As our lusers break the content,
> how about if both ends of the software strive to get better here?

I like to read DPlanet too. but ... btw, the other * Planets I read (KDE
Planet e.g.) do not suffer from the same problems.

And there is a computing saying that goes: « you must be very strict
wrt what you produce, and very tolerant wrt what you read », and I like
to add « but nobody can blame you for not being tolerant enough when the
input is faulty ».

that means the first sentence is quite a good practice, but it's not a
philosophy that you HAVE to follow.

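To make the CDATA objection concrete, here is a minimal sketch in Python of what a "correct the feed first" pass might look like. It is only an illustration under stated assumptions, not anything akregator or the planet code actually does: the function name escape_bare_ampersands and the sample feed are hypothetical, and the pass ignores comments, processing instructions and undeclared entities such as "&T;". It copies CDATA sections through untouched and escapes only a '&' that does not already start an entity or character reference.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical helper, for illustration only.
# Alternative 1 captures a whole CDATA section so it can be copied verbatim.
# Alternative 2 matches a '&' that is NOT the start of something that looks
# like an entity reference (&name;) or a character reference (&#38; / &#x26;).
_BARE_AMP = re.compile(
    r"(<!\[CDATA\[.*?\]\]>)"
    r"|&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)",
    re.DOTALL,
)


def escape_bare_ampersands(feed: str) -> str:
    """Escape stray '&' outside CDATA sections; leave CDATA as it is."""
    def fix(match: re.Match) -> str:
        cdata = match.group(1)
        # A CDATA section matched: return it unchanged.
        # Otherwise a bare ampersand matched: escape it.
        return cdata if cdata is not None else "&amp;"
    return _BARE_AMP.sub(fix, feed)


if __name__ == "__main__":
    # Made-up feed fragment with one bad '&', one CDATA '&', one good '&amp;'.
    broken = (
        "<rss><title>Tom & Jerry</title>"
        "<description><![CDATA[R&D news]]></description>"
        "<link>a?x=1&amp;y=2</link></rss>"
    )
    try:
        ET.fromstring(broken)
    except ET.ParseError as err:
        # A strict parser refuses the feed as delivered.
        print("strict parser rejects the feed:", err)

    root = ET.fromstring(escape_bare_ampersands(broken))
    print(root.find("title").text)
    print(root.find("description").text)
```

A blanket replacement of every '&' would also rewrite the '&' inside the CDATA section and turn the already correct '&amp;' into '&amp;amp;', which is why the sketch has to know where CDATA sections begin and end.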
and here, I'm sorry to confirm once more, DP *is* faulty. it's basic XML
knowledge: in xml you have two choices wrt &, < and >: escape them
(&amp; &lt; &gt;) OR put them unescaped in <![CDATA[ ]]> sections.

DPlanet has delivered bad feeds too many times, so I think they should
really improve their RSS output module.

-- 
·O·  Pierre Habouzit
··O
OOO                http://www.madism.org