On Sun, 2010-08-15 at 08:43 +0200, Peter Lind wrote:

> On 15 August 2010 06:14, Paul M Foster <pa...@quillandmouse.com> wrote:
> > On Sat, Aug 14, 2010 at 10:36:07PM +0200, Sebastian Ewert wrote:
> >
> >> Hi,
> >>
> >> Before I allow images to be uploaded, I read them and check for
> >> several HTML tags. If they exist, I don't allow the upload. Is there
> >> any need to check PDF files, too? At the moment I'm doing this, but
> >> the result is that many files are denied because of disallowed HTML
> >> tags.
> >
> > If I'm not mistaken, more recent versions of the PDF spec allow for
> > embedded JavaScript. If so, it might be worthwhile to check for
> > JavaScript in PDFs. (Whoever first thought of embedding *code* in
> > documents should be shot.)
> >
>
> I personally wouldn't bother: it is the responsibility of Adobe Reader,
> or whichever PDF reader a user is using, to make sure that nothing
> evil comes of viewing a PDF. There's very little chance you'll be able
> to properly check PDFs server-side for the various security exploits
> they may contain. The PDF reader would/should be much better equipped
> to do this (the fact that Adobe has failed miserably at it so far is
> another thing).
>
> Sebastian, I personally think the best check for validity is, taking
> images as an example, opening the image using Imagick or something
> like it. After opening, verify that the image has valid dimensions and
> type: a string of JavaScript or something like it simply won't
> validate as an image. I've typically used
> http://dk2.php.net/manual/en/function.getimagesize.php for this
> myself, as there isn't a lot of overhead with that function. I don't
> know if Imagick would be faster though, you'd have to check.
>
> Regards
> Peter
>
> --
> <hype>
> WWW: http://plphp.dk / http://plind.dk
> LinkedIn: http://www.linkedin.com/in/plind
> BeWelcome/Couchsurfing: Fake51
> Twitter: http://twitter.com/kafe15
> </hype>
>
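For what it's worth, the getimagesize() check Peter describes could look roughly like the sketch below. The function name looks_like_image() and the list of allowed types are my own assumptions, not anything from the thread. It only shows that a file has to parse as an image with sane dimensions, so treat it as a first filter rather than a full security check.

```php
<?php
// Hypothetical helper (name and defaults are assumptions for illustration).
// Returns true only if $path parses as an image of an allowed type
// with non-zero dimensions.
function looks_like_image(
    string $path,
    array $allowedTypes = [IMAGETYPE_GIF, IMAGETYPE_JPEG, IMAGETYPE_PNG]
): bool {
    // getimagesize() returns false when the file is not a readable image;
    // @ suppresses the notice it may raise for unreadable data.
    $info = @getimagesize($path);
    if ($info === false) {
        return false;
    }

    // Index 0 = width, 1 = height, 2 = one of the IMAGETYPE_* constants.
    [$width, $height, $type] = $info;

    return $width > 0
        && $height > 0
        && in_array($type, $allowedTypes, true);
}
```

In an upload handler you would call it on the temporary file, e.g. looks_like_image($_FILES['upload']['tmp_name']), where 'upload' is just a placeholder form field name.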
If you're that worried about PDFs, then maybe you could run them
through ClamAV via an exec() call. I believe a lot of the PDF holes
have been picked up by the antivirus groups out there, as Adobe does
seem to be a bit slow to plug them.

Thanks,
Ash
http://www.ashleysheridan.co.uk
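P.S. A rough sketch of the exec() idea above, assuming the clamscan command-line tool is installed and on the PATH. The function name scan_with_clamav() and its second parameter are hypothetical. It relies on clamscan's documented exit codes (0 = clean, 1 = virus found, 2 = error), and anything else is treated as a failed scan.

```php
<?php
// Hypothetical helper (name and parameters are assumptions for illustration).
// Returns true if clamscan reports the file as clean, false if it reports
// an infection, and throws if the scan itself could not be completed.
function scan_with_clamav(string $path, string $clamscan = 'clamscan'): bool
{
    // Escape both the command and the file name before handing them to a shell.
    $cmd = escapeshellcmd($clamscan)
         . ' --no-summary '
         . escapeshellarg($path)
         . ' 2>&1';

    $output = [];
    $exitCode = -1;
    exec($cmd, $output, $exitCode);

    if ($exitCode === 0) {
        return true;   // no virus found
    }
    if ($exitCode === 1) {
        return false;  // virus found
    }

    // Exit code 2 (scanner error) or anything else, e.g. command not found.
    throw new RuntimeException(
        "clamscan failed with exit code $exitCode: " . implode("\n", $output)
    );
}
```

Starting clamscan loads the whole signature database on every call, so this is likely to be slow for a busy upload form. It's only meant to show the shape of the call.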