It depends on what you want to get. See the DrawPrintTextLocations.java
example which shows several strategies to get the bounding boxes of
individual glyphs and draw them on the screen (not in a PDF, so the Y
coordinate is different). You would have to adjust the
"Rectangle2D.Float" code to whatever you prefer, or adjust
DrawPrintTextLocations to collect words like the mkl code does.
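
A minimal sketch of the Y adjustment, under stated assumptions: the text extraction coordinates have their origin at the top left with Y growing downward, while a PDF content stream has its origin at the bottom left with Y growing upward. The class name, the method name toPdfSpace, the yIsBaseline flag and the A4 page height are hypothetical and only serve the illustration. Whether the Y of your box is the baseline or the top edge depends on how the rectangle was built, so check that in the code you copied. Rotated pages and a crop box that does not start at (0,0) are not handled here.

```java
import java.awt.geom.Rectangle2D;

public class TextBoxToPdfSpace {

    /**
     * Converts a box from top-left-origin text extraction space (Y grows downward)
     * to bottom-left-origin PDF space (Y grows upward).
     *
     * @param box         box as reported by the extraction code
     * @param pageHeight  height of the page in PDF units
     * @param yIsBaseline true if box.getY() is the text baseline (assumption: the box
     *                    was built from getYDirAdj()), false if it is the top edge
     */
    public static Rectangle2D.Float toPdfSpace(Rectangle2D box, float pageHeight,
                                               boolean yIsBaseline) {
        float x = (float) box.getX();
        float w = (float) box.getWidth();
        float h = (float) box.getHeight();
        // Top edge of the box, measured downward from the top of the page.
        float top = yIsBaseline ? (float) box.getY() - h : (float) box.getY();
        // Bottom edge, still measured downward from the top of the page.
        float bottom = top + h;
        // Flip: distance of the bottom edge from the bottom of the page.
        float lowerLeftY = pageHeight - bottom;
        return new Rectangle2D.Float(x, lowerLeftY, w, h);
    }

    public static void main(String[] args) {
        // Values taken from the println output in the question.
        Rectangle2D box = new Rectangle2D.Double(
                29.862407684326172, 383.78765869140625,
                50.3477668762207, 7.098414897918701);
        float pageHeight = 841.89f; // hypothetical: A4 portrait, not read from the real file
        Rectangle2D.Float r = toPdfSpace(box, pageHeight, true);
        System.out.println("x=" + r.x + " y=" + r.y + " w=" + r.width + " h=" + r.height);
    }
}
```

With PDFBox, the page height would come from page.getMediaBox().getHeight(), and the four resulting values would go to contentStream.addRect(x, y, width, height). Keep in mind that the height reported by the extraction code is a heuristic, so the rectangle may still be shorter than the visible glyphs.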
Tilman
On 12.02.2024 18:48, Frédéric Ravetier wrote:
Hello,
I'd like to find some specific words in a PDF and draw a rectangle over
these words.
I'm using PDFBox 3.0.1
I found this to locate the words:
https://github.com/mkl-public/testarea-pdfbox2/blob/master/src/test/java/mkl/testarea/pdfbox2/extract/ExtractWordCoordinates.java
As you can see, with this println:
System.out.println(builder.toString() + " [(X=" + boundingBox.getX() +
",Y=" + boundingBox.getY()
+ ") height=" + boundingBox.getHeight() + " width=" +
boundingBox.getWidth() + "]");
I get:
MYSTRING [(X=29.862407684326172,Y=383.78765869140625)
height=7.098414897918701 width=50.3477668762207 ]
In my prototype I print this information, then copy and paste x, y, height
and width into a hardcoded block of code:
PDPage page = document.getPage(0);
PDPageContentStream contentStream = new PDPageContentStream(document,
page, PDPageContentStream.AppendMode.APPEND, false);
contentStream.setNonStrokingColor(Color.RED);
contentStream.addRect(29.862407684326172f, 383.78765869140625f,
50.3477668762207f, 7.098414897918701f);
contentStream.fill();
contentStream.close();
document.save(new FileOutputStream(src_file_path.replace(".pdf", "-rect.pdf")));
But it does not match the text on the PDF.
I tried to replace the height by the font size but it was not really better.
Where is my mistake?
Best regards,
Fred