Re: [compress] BZip2CompressorInputStream stops working without rhyme or reason ...

2020-10-14 Thread Albretch Mueller
I don't know what exactly there could be at byte offset 2848 in some buffer, but files reported to be fine by bzip2 --test can't be processed by BZip2CompressorInputStream: ~ $ _IFL="/home/lbrtchx/cmllpz/LklWb/org/wikimedia/dumps/enwiki/20200920/enwiki-20200920-pages-articles-multistrea

Re: [compress] BZip2CompressorInputStream stops working without rhyme or reason ...

2020-10-14 Thread Albretch Mueller
The files decompress fine using Linux bzip2: $ time bzip2 --decompress --verbose --keep "enwiki-20200920-pages-articles-multistream1.xml-p1p41242.bz2" enwiki-20200920-pages-articles-multistream1.xml-p1p41242.bz2: done real 2m22.089s user 2m6.664s sys 0m7.184s $ time bzip2 --decomp

Re: [compress] BZip2CompressorInputStream stops working without rhyme or reason ...

2020-10-13 Thread Albretch Mueller
user 128m4.964s sys 1m9.108s $ time bzip2 --decompress --verbose --keep "${_IFL}" enwiki-latest-pages-articles.xml.bz2: done real 147m59.737s user 124m31.476s sys 8m1.516s $ On 10/13/20, Albretch Mueller wrote: > As part of my corpora research work I have to work with su

[compress] BZip2CompressorInputStream stops working without rhyme or reason ...

2020-10-13 Thread Albretch Mueller
As part of my corpora research work I have to work with very large text files. Wikipedia dumps are bzip2-compressed, so I have been working with: commons/compress/compressors/bzip2/BZip2CompressorInputStream.html and I consistently notice that it just stops processing without an error of any kind. I che
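
One possible explanation, not confirmed anywhere in the excerpts above: the file names in this thread contain "multistream", and a multistream .bz2 file is several bzip2 streams concatenated back to back. BZip2CompressorInputStream has a two-argument constructor, BZip2CompressorInputStream(InputStream, boolean decompressConcatenated); the one-argument constructor reads only the first stream and then reports end of input without raising an error. The sketch below is a JDK-only diagnostic under that assumption. The class name Bzip2StreamCounter and its methods are hypothetical and not part of Commons Compress. It counts byte-aligned stream headers ("BZh", a level digit 1-9, then the block magic 0x314159265359), which is a heuristic: empty streams are not counted, and a chance match inside compressed data is possible, though unlikely for a 10-byte pattern.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical diagnostic helper: counts byte-aligned bzip2 stream headers. */
final class Bzip2StreamCounter {
    // Block-start magic that follows "BZh" + level digit in a non-empty stream.
    private static final long BLOCK_MAGIC = 0x314159265359L;

    private long hi;    // low 16 bits: the two bytes preceding the last eight
    private long lo;    // the last eight bytes seen
    private long count; // headers matched so far

    /** Feeds one byte into the 10-byte sliding window and checks for a header. */
    void accept(int b) {
        hi = (hi << 8) | (lo >>> 56);
        lo = (lo << 8) | (b & 0xFF);
        int level = (int) ((lo >>> 48) & 0xFF);
        if ((hi & 0xFFFF) == 0x425A                  // 'B' 'Z'
                && (lo >>> 56) == 0x68               // 'h'
                && level >= '1' && level <= '9'      // block-size digit
                && (lo & 0xFFFFFFFFFFFFL) == BLOCK_MAGIC) {
            count++;
        }
    }

    long count() {
        return count;
    }

    /** Counts headers in an in-memory buffer. */
    static long countStreams(byte[] data) {
        Bzip2StreamCounter counter = new Bzip2StreamCounter();
        for (byte b : data) {
            counter.accept(b);
        }
        return counter.count();
    }

    /** Counts headers in a file, reading it once from start to end. */
    static long countStreams(Path file) throws IOException {
        Bzip2StreamCounter counter = new Bzip2StreamCounter();
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            int b;
            while ((b = in.read()) != -1) {
                counter.accept(b);
            }
        }
        return counter.count();
    }
}
```

If countStreams reports more than one header for a dump file, constructing the reader as new BZip2CompressorInputStream(in, true) would be the first thing to try, since that flag asks the stream to continue past the end of the first bzip2 stream.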

commons.apache.org/math/stat/

2008-06-08 Thread Albretch Mueller
On Sun, Jun 8, 2008 at 10:18 AM, Phil Steitz <[EMAIL PROTECTED]> > It's probably best to take the discussion to the dev list. ~ Hi, ~ this thread started in [EMAIL PROTECTED] as "commons.apache.org/math/stat/" ~ Formal need for a way to keep incremental statistics as part of the package: ~ If yo
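
To make "incremental statistics" concrete, here is a minimal sketch of one common approach, Welford's online algorithm, which updates the mean and variance one value at a time without storing the data. The message above is truncated, so this is not necessarily what the poster had in mind, and the class name RunningStats is hypothetical rather than an API of the Commons Math package.

```java
/** Hypothetical illustration of incremental mean and variance (Welford's algorithm). */
final class RunningStats {
    private long n;      // number of values seen so far
    private double mean; // running mean
    private double m2;   // running sum of squared deviations from the mean

    /** Folds one new value into the running statistics in O(1) time and space. */
    void add(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean); // uses the already-updated mean
    }

    long count() {
        return n;
    }

    /** Mean of the values seen so far, or NaN if none were added. */
    double mean() {
        return n > 0 ? mean : Double.NaN;
    }

    /** Unbiased sample variance, or NaN with fewer than two values. */
    double sampleVariance() {
        return n > 1 ? m2 / (n - 1) : Double.NaN;
    }
}
```

Each call to add costs constant time, so the statistics can be read at any point while streaming through a large file.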