[ https://issues.apache.org/jira/browse/TIKA-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138093#comment-17138093 ]
suchendra commented on TIKA-3097:
---------------------------------
"there are some dependencies that load the full file into memory and then we
emit the sax events": does this happen even if the file is read as a stream?
> Out of memory while parsing docx
> --------------------------------
>
> Key: TIKA-3097
> URL: https://issues.apache.org/jira/browse/TIKA-3097
> Project: Tika
> Issue Type: Bug
> Components: core, parser
> Affects Versions: 1.24
> Reporter: suchendra
> Priority: Major
> Attachments: Screenshot from 2020-05-07 08-14-25.png, samplefile.txt,
> test.docx
>
>
> I have written simple Scala code to extract the content from an uploaded
> docx file. The JVM runs out of memory (OOM) when Tika tries to parse the
> file. I configured the JVM heap to 1GB and then tried 2GB, and the same
> issue occurs. It happens both with the jar and in my code.
> The file is attached for reference.