MM 49 49
2A 00 / 4D 4D 00 2A).
-----Original Message-----
From: Stefan Bodewig [mailto:bode...@apache.org]
Sent: Tuesday, February 27, 2018 3:46 PM
To: Stefan Bodewig
Cc: Allison, Timothy B.; Commons Developers List
Subject: Re: [COMPRESS] TIFF file identified as TAR
On 2018-02-27, S
COMPRESS colleagues,
On TIKA-2591[0], a user reports that a specific type of TIFF is being
identified as a TAR file. Is this something we should try to fix at the Tika
level, or is this something that would be better fixed in COMPRESS?
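For illustration, one way a caller could rule out TIFF before attempting TAR detection is to check the first four bytes against the two classic TIFF signatures (49 49 2A 00 for little-endian, 4D 4D 00 2A for big-endian). This is only a minimal sketch: the class and method names are hypothetical, and it is not the detection logic of either Tika or COMPRESS.

```java
// Hypothetical helper, not part of Tika or Commons Compress.
final class TiffMagic {
    private TiffMagic() {}

    /** Returns true if the buffer starts with a classic TIFF signature. */
    static boolean isTiff(byte[] header) {
        if (header == null || header.length < 4) {
            return false;
        }
        // "II" followed by 42 in little-endian byte order: 49 49 2A 00
        boolean littleEndian = header[0] == 0x49 && header[1] == 0x49
                && header[2] == 0x2A && header[3] == 0x00;
        // "MM" followed by 42 in big-endian byte order: 4D 4D 00 2A
        boolean bigEndian = header[0] == 0x4D && header[1] == 0x4D
                && header[2] == 0x00 && header[3] == 0x2A;
        return littleEndian || bigEndian;
    }
}
```

A caller would read the first few bytes of the stream, call `TiffMagic.isTiff(...)`, and only fall through to the TAR heuristics when it returns false.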
Thank you!
Best,
Tim
[0]
Compress colleagues,
Over on https://bz.apache.org/bugzilla/show_bug.cgi?id=61275, a user
submitted two .xlsx files generated with Apache POI, one by IBM's JVM and one
by Oracle's JVM. The file generated with Oracle's JVM opens without issue;
however, MSOffice complains but can fix the file
Fellow file-philes on [compress],
Sebastian Nagel has added file type identification via Apache Tika to Common
Crawl. While Tika is not 100% accurate, this means that we have far better
clarity on MIME type than we would by relying on the HTTP header plus the file
suffix. So, for testing purposes,
you (or we over on Tika)
>enum wouldn't work for formats added via ServiceLoader. LZO supports a couple
>of names of its own and you couldn't inject them into the enum.
Doh! Got it. New code base...Sorry.
>> If there is anything COMPRESS can do to detect and avoid the situation, then
>> please open an issue over here.
Done: COMPRESS-385, PR submitted
>> If we wanted to add such a method, what would the return value be? One of
>> the String constants contained inside the *Factory classes, likely.
On TIKA-1631 [1], users have observed that a corrupt Z file can cause an OOM at
Internal_.InternalLZWStream.initializeTable. Should we try to protect against
this at the Tika level, or should we open an issue on commons-compress's JIRA?
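As a rough illustration of one possible guard, a caller could sanity-check the three-byte .Z header before handing the stream to a decompressor. The sketch below assumes the classic compress(1) layout (magic bytes 1F 9D, then a flags byte whose low five bits give the maximum code width) and assumes that widths above 16 bits are not legitimate. The class and method names are hypothetical, and this is not how Tika or commons-compress actually handles the problem.

```java
// Hypothetical pre-check, not part of Tika or Commons Compress.
final class ZHeaderCheck {
    private static final int MAGIC_1 = 0x1F;
    private static final int MAGIC_2 = 0x9D;
    // Low five bits of the third byte: maximum code width in bits.
    private static final int MAX_CODE_BITS_MASK = 0x1F;
    // Assumption: the classic compress(1) tool never writes more than 16.
    private static final int MAX_ALLOWED_BITS = 16;

    private ZHeaderCheck() {}

    /** Returns true if the header looks like a .Z file with a sane code width. */
    static boolean hasPlausibleHeader(byte[] header) {
        if (header == null || header.length < 3) {
            return false;
        }
        if ((header[0] & 0xFF) != MAGIC_1 || (header[1] & 0xFF) != MAGIC_2) {
            return false;
        }
        int maxCodeBits = header[2] & MAX_CODE_BITS_MASK;
        // A corrupt width of up to 31 bits would imply a table of 2^31 entries.
        return maxCodeBits <= MAX_ALLOWED_BITS;
    }
}
```

Rejecting an implausible code width up front avoids sizing a code table from a corrupt value; it does not protect against every malformed stream.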
A second question, we're creating a stream with the Comp
here:
https://groups.google.com/forum/#!topic/common-crawl/Cv21VRQjGN0
I’ve tried to follow Commons’ vernacular, and I’ve added [COMPRESS] to the
Subject line. Please invite others who might have an interest in this work.
Best,
Tim
From: Allison, Timothy B.
Sent