[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085640#comment-16085640
]
Tim Allison edited comment on TIKA-2428 at 7/13/17 12:34 PM:
-------------------------------------------------------------
Thank you, [~lfcnassif], for reporting this and finding the cause.
From the Javadocs for FileInputStream:
bq. This method may skip more bytes than are remaining in the backing file.
This produces no exception and the number of bytes skipped may include some
number of bytes that were beyond the EOF of the backing file. Attempting to
read from the stream after skipping past the end will result in -1 indicating
the end of the file.
From the Javadocs for InputStream:
bq. The skip method may, for a variety of reasons, end up skipping over some
smaller number of bytes, possibly 0. This may result from any of a number of
conditions; reaching end of file before n bytes have been skipped is only one
possibility. The actual number of bytes skipped is returned.
If the number of bytes skipped is more than requested, we've hit EOF. If the
number of bytes skipped == 0, we can't tell whether we're at EOF without
testing with a read, according to
[guava|https://github.com/google/guava/blob/master/guava/src/com/google/common/io/ByteStreams.java#L779].
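A minimal sketch of that rule, to make the two cases concrete. This is not the actual EMFParser change; the class name SkipSketch and the method skipFully are hypothetical names chosen for illustration, and the handling of a negative return value is an added assumption.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only; not the actual EMFParser fix.
final class SkipSketch {

    private SkipSketch() {
    }

    /**
     * Skips exactly n bytes or throws EOFException. Each pass through the
     * loop either makes progress or throws, so a truncated stream cannot
     * keep the caller spinning.
     */
    static void skipFully(InputStream in, long n) throws IOException {
        long remaining = n;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped < 0 || skipped > remaining) {
                // More than requested means EOF; a negative count is treated
                // the same way here (assumption).
                throw new EOFException(
                        "skip returned " + skipped + " for a request of " + remaining);
            }
            if (skipped == 0) {
                // Zero is ambiguous, so probe with a single-byte read.
                if (in.read() == -1) {
                    throw new EOFException(remaining + " bytes left to skip at EOF");
                }
                skipped = 1; // the probe consumed one byte
            }
            remaining -= skipped;
        }
    }
}
```

Per the FileInputStream Javadoc quoted above, skip may report the full count even when it has moved past EOF, so a caller of a helper like this still has to treat a following read() of -1 as end of file.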
was (Author: [email protected]):
Thank you, [~lfcnassif], for reporting this and finding the cause.
From the Javadocs for FileInputStream:
{noformat}
This method may skip more bytes than are remaining in the backing file. This
produces no exception and the number of bytes skipped may include some number
of bytes that were beyond the EOF of the backing file. Attempting to read from
the stream after skipping past the end will result in -1 indicating the end of
the file.
{noformat}
From the Javadocs for InputStream:
{noformat}
The skip method may, for a variety of reasons, end up skipping over some
smaller number of bytes, possibly 0. This may result from any of a number of
conditions; reaching end of file before n bytes have been skipped is only one
possibility. The actual number of bytes skipped is returned.
{noformat}
If the number of bytes skipped is more than requested, we've hit EOF. If the
number of bytes skipped == 0, we need to test with a read, according to
[guava|https://github.com/google/guava/blob/master/guava/src/com/google/common/io/ByteStreams.java#L779].
> EMFParser loops forever with corrupted files
> --------------------------------------------
>
> Key: TIKA-2428
> URL: https://issues.apache.org/jira/browse/TIKA-2428
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.15, 1.16
> Reporter: Luis Filipe Nassif
> Attachments: Carved-1285676.emf, Carved-1296288.emf, Carved-912866.emf
>
>
> EMFParser hangs with the attached corrupted EMF files.
> Sorry [[email protected]]! Just now having time to test against our
> forensic test corpus...