With a bit of customisation, PDFBox should be able to convert PDF to Markdown
<https://www.markdownguide.org/cheat-sheet/>. This probably involves a
text processor like PDFText2HTML.java
<https://svn.apache.org/repos/asf/pdfbox/branches/2.0/tools/src/main/java/org/apache/pdfbox/tools/PDFText2HTML.java>,
possibly just a modified copy of that class, but I'm open to advice.
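
To make the question concrete, here is a rough sketch of the mapping step I
imagine, not a working converter. The class name MarkdownRunFormatter, the
size thresholds (1.5x and 1.2x the body size), and the idea of guessing
bold/italic from the font name are all my own assumptions, not anything
PDFBox provides. It deliberately uses no PDFBox classes so it can be run on
its own.

```java
import java.util.Locale;

/**
 * Hypothetical helper (not part of PDFBox): turns one run of text plus
 * its font details into a Markdown fragment.
 */
final class MarkdownRunFormatter {

    private MarkdownRunFormatter() {
        // static helper only
    }

    /**
     * @param text       the text of the run
     * @param fontName   font name as reported by the PDF, may be null
     * @param fontSizePt font size of the run in points
     * @param bodySizePt assumed font size of ordinary body text in points
     */
    static String format(String text, String fontName,
                         float fontSizePt, float bodySizePt) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return "";
        }

        // Assumption: text much larger than body text is a heading.
        if (fontSizePt >= bodySizePt * 1.5f) {
            return "# " + trimmed;
        }
        if (fontSizePt >= bodySizePt * 1.2f) {
            return "## " + trimmed;
        }

        // Assumption: the font name hints at the style, e.g. "Helvetica-Bold".
        String name = fontName == null ? "" : fontName.toLowerCase(Locale.ROOT);
        boolean bold = name.contains("bold");
        boolean italic = name.contains("italic") || name.contains("oblique");

        if (bold && italic) {
            return "***" + trimmed + "***";
        }
        if (bold) {
            return "**" + trimmed + "**";
        }
        if (italic) {
            return "*" + trimmed + "*";
        }
        return trimmed;
    }
}
```

My guess is that this would be called from a subclass of PDFTextStripper
that overrides writeString(String, List<TextPosition>), as PDFText2HTML
does, passing in values such as textPositions.get(0).getFont().getName()
and textPositions.get(0).getFontSizeInPt(). Whether that is the right hook
is exactly what I am asking.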

I can find tutorials on how to program in Java, but I'd like to know the
approach (how to go about it) with PDFBox. A lot of coding is just matching
patterns in existing examples, which is how I manage to use leaflet.js
without knowing JavaScript.

Ideally, any code given in an answer would be explained clearly enough for
me to follow it.
