Yeah, tin foil hats on... but SOMETHING weird is going on and I suspect this is the best place to find people who can tell me what. And after all my webserver does run Debian :-)
I've been noticing some very strange activity in my webserver logs over the last few days. It only seems to happen to people using American ISPs, mostly Comcast and Verizon.

What happens is I get a sequence of requests for all the images off the root page of my website http://pigeonsnest.co.uk - and only that site, it doesn't seem to be happening to any others. The requests come in the usual order corresponding to the order in which the HTML references them, and have the expected Referer: header of http://pigeonsnest.co.uk/ . But there is no request for the actual HTML page, only the images off it.

Sometimes the sequence of requests for the images is immediately preceded - as in so close in time that the process forked to serve it gets the immediately preceding process ID - by a request for the HTML page which comes from an IP owned by Google.

So it looks like Comcast, Verizon and some others are somehow proxying the requests for my HTML via some server owned by Google. And unlike a normal proxy, it caches the HTML for a long time but the images not at all.

And it's not people reading the cached copy of my site from the "Cached" link on a Google search. When people do that it is obvious from the Referer: headers.

And I can't see any reason why ISPs would proxy the requests for the HTML and not the images unless they're doing some kind of content filtering or censorship on the HTML. How do I know that what Comcast/Verizon/etc customers are seeing is what I published?

No doubt there will be several Comcast and Verizon customers reading this message so I hope some people will have some useful input.

-- 
Pigeon

Be kind to pigeons
Pigeon's Nest - http://pigeonsnest.co.uk/
Lucy Pinder Television - http://www.lucy-pinder.tv/
GPG key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x21C61F7F
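
P.S. For anyone who wants to look for the same pattern in their own logs, here is a rough sketch of one way to pick it out: list the client IPs that fetched images with the root page as Referer: but never asked for the root page itself. It assumes the Apache "combined" log format, which may not match every server. The function name, the list of image extensions and the /index.html alias are illustrative choices only.

```python
import re
from collections import defaultdict

# Assumption: Apache "combined" log format. Adjust the pattern for other formats.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

ROOT_REFERER = "http://pigeonsnest.co.uk/"
ROOT_PATHS = ("/", "/index.html")                # hypothetical aliases for the root page
IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".gif")   # illustrative list of image types


def find_orphan_image_clients(lines):
    """Map each IP that fetched images with the root page as Referer,
    but never requested the root page itself, to the image paths it fetched."""
    fetched_root = set()
    image_hits = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that are not in the assumed format
        ip, path = m["ip"], m["path"]
        if path in ROOT_PATHS:
            fetched_root.add(ip)
        elif m["referer"] == ROOT_REFERER and path.lower().endswith(IMAGE_EXTS):
            image_hits[ip].append(path)
    # Keep only clients with image requests and no request for the HTML.
    return {ip: paths for ip, paths in image_hits.items() if ip not in fetched_root}
```

Something like find_orphan_image_clients(open("access.log")) would run it over a log file (the filename is a placeholder). It only works per IP over the whole file, so a client that loaded the HTML before the log was rotated will show up as a false positive.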