On 02/05/2012 16:43, Hadley Wickham wrote:
I'm struggling to decompress a gzip'd raw vector in memory:
content <- readBin("http://httpbin.org/gzip", "raw", 1000)
memDecompress(content, type = "gzip")
# Error in memDecompress(content, type = "gzip") :
# internal error -3 in memDecompress(2)
I'm reasonably certain that the file is correctly compressed, because
if I save it out to a file, I can read the uncompressed data:
tmp <- tempfile()
writeBin(content, tmp)
readLines(tmp)
So that suggests I'm using memDecompress incorrectly. Any hints?
Headers.
Looking at http://tools.ietf.org/html/rfc1952:
* the first two bytes are id1 and id2, which are 1f 8b as expected
* the third byte is the compression method: deflate (as.integer(content[3]))
* the fourth byte is the flag
rawToBits(content[4])
[1] 00 00 00 00 00 00 00 00
which indicates that no optional header fields are present.
So the header looks ok to me (with my limited knowledge of gzip).
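The byte-by-byte checks above can be collected into one small function. This is only a sketch: gzip_header() is a hypothetical helper name, not part of base R, and the test data is generated locally with gzfile() rather than fetched from httpbin.org, so the field values will differ from the ones in the thread. The field layout follows the fixed 10-byte header in RFC 1952.

```r
# Build a small gzip stream locally so the sketch does not need the network.
tmp <- tempfile(fileext = ".gz")
con <- gzfile(tmp, "wb")
writeBin(charToRaw("hello gzip\n"), con)
close(con)
content <- readBin(tmp, "raw", file.size(tmp))

# Hypothetical helper: decode the fixed 10-byte gzip header (RFC 1952).
gzip_header <- function(x) {
  stopifnot(is.raw(x), length(x) >= 10)
  flg <- as.integer(x[4])
  list(
    magic_ok = identical(x[1:2], as.raw(c(0x1f, 0x8b))),  # ID1, ID2
    method   = as.integer(x[3]),                          # CM, 8 = deflate
    fextra   = bitwAnd(flg, 4L) != 0L,                    # FLG bit 2
    fname    = bitwAnd(flg, 8L) != 0L,                    # FLG bit 3
    fcomment = bitwAnd(flg, 16L) != 0L,                   # FLG bit 4
    mtime    = readBin(x[5:8], "integer", size = 4, endian = "little"),
    xfl      = as.integer(x[9]),                          # extra flags
    os       = as.integer(x[10])                          # OS code
  )
}

hdr <- gzip_header(content)
str(hdr)
```

In RFC 1952 an XFL value of 2 marks a stream written with maximum compression and 4 one written with the fastest setting, so that byte is worth printing alongside the flag byte.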
Stripping off the header doesn't seem to help either:
memDecompress(content[-(1:10)], type = "gzip")
# Error in memDecompress(content[-(1:10)], type = "gzip") :
# internal error -3 in memDecompress(2)
I've read the help for memDecompress but I don't see anything there to help me.
Any more hints?
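One possible explanation, offered as an assumption rather than something established in this thread: memCompress(type = "gzip") appears to produce zlib framing (RFC 1950) rather than gzip file framing (RFC 1952). If so, dropping the 10 header bytes would leave a raw deflate stream plus the gzip trailer, which is still not the zlib-wrapped input that memDecompress() was written for. The sketch below only compares the leading bytes of the two formats, using locally generated data; it does not call memDecompress() on gzip-file data, since that behaviour may differ between R versions.

```r
payload <- charToRaw("hello, hello, hello\n")

# Output of memCompress(type = "gzip"): expected to start with a zlib
# header byte (0x78), not the gzip magic bytes.
z <- memCompress(payload, type = "gzip")

# Output of gzfile(): a gzip file, starting with the magic bytes 1f 8b.
tmp <- tempfile(fileext = ".gz")
con <- gzfile(tmp, "wb")
writeBin(payload, con)
close(con)
g <- readBin(tmp, "raw", file.size(tmp))

z[1:2]  # leading bytes of the memCompress() result
g[1:2]  # leading bytes of the gzip file

# Round trip of memCompress() output, the case the help page covers.
roundtrip <- memDecompress(z, type = "gzip")
identical(roundtrip, payload)
```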
Well, it seems what you get there depends on the client, but I did
tystie% curl -o foo "http://httpbin.org/gzip"
tystie% file foo
foo: gzip compressed data, last modified: Wed May 2 17:06:24 2012, max compression
and the final part worried me: I do not know if memDecompress() knows
about that format. The help page does not claim it can do anything
other than decompress the results of memCompress() (although past
experience has shown that it can in some cases). gzfile() supports a
much wider range of formats.
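As a rough illustration of the gzfile() route, here is a sketch under stated assumptions. The data is generated locally instead of being downloaded, and the second route, gzcon() wrapped around rawConnection(), is an addition that is not mentioned in this thread; it is shown only as a way to avoid the temporary file.

```r
# Create gzip-format bytes in memory (a stand-in for the downloaded 'content').
payload <- c("line one", "line two")
tmp <- tempfile(fileext = ".gz")
con <- gzfile(tmp, "w")
writeLines(payload, con)
close(con)
content <- readBin(tmp, "raw", file.size(tmp))

# Route 1: write the raw vector to a temporary file and read it via gzfile().
tmp2 <- tempfile()
writeBin(content, tmp2)
con <- gzfile(tmp2, "r")
via_file <- readLines(con)
close(con)

# Route 2 (assumption, not from the thread): decompress in memory by
# wrapping a raw connection in gzcon().
con <- gzcon(rawConnection(content))
via_memory <- readLines(con)
close(con)

identical(via_file, via_memory)
```

Both routes read gzip file format, so neither depends on what memDecompress() accepts.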
--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595