This is an automated email from the ASF dual-hosted git repository.
tallison pushed a change to branch TIKA-4692-improve-ooxml-sax-parsers
in repository https://gitbox.apache.org/repos/asf/tika.git
from 3e4aa9f9c6 TIKA-4692 -- remove strict tag mode and dev testing unit tests
add 0f51fcc6da TIKA-4706 (#2732)
add fec0a997c8 decapsulate html from rtf within msgs...lol (#2713)
add b4eff27647 Merge branch 'main' into TIKA-4692-improve-ooxml-sax-parsers
No new revisions were added by this update.
Summary of changes:
.skills/dev.md                                    | 106 ++++++++++++
.skills/tika-eval-compare.md                      | 185 +++++++++++++++++++++
.../tika/parser/microsoft/OfficeParserConfig.java |  18 --
.../tika/parser/microsoft/OutlookExtractor.java   | 151 ++++++-----------
.../msg/RTFEncapsulatedHTMLExtractor.java         | 177 +++++++++++++++++---
.../tika/parser/microsoft/OutlookParserTest.java  |  48 +-----
.../msg/RTFEncapsulatedHTMLExtractorTest.java     | 139 ++++++++++++++++
7 files changed, 644 insertions(+), 180 deletions(-)
create mode 100644 .skills/dev.md
create mode 100644 .skills/tika-eval-compare.md