Hi,

If you only need to know whether the count is > 0, then you don't need the method at all: a count > 0 simply means that some text was extracted.
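
For that first case, a minimal sketch of the migrated check might look like the following. It assumes the text has already been extracted with stripper.getText(document). The names HasTextCheck and hasText are made up for illustration, and treating whitespace-only output as "no text" is an assumption, not something PDFBox prescribes.

```java
// Hypothetical helper, not part of PDFBox.
final class HasTextCheck {
    private HasTextCheck() {}

    // extractedText is assumed to be the result of PDFTextStripper.getText(document).
    static boolean hasText(boolean alreadyHasText, String extractedText) {
        if (alreadyHasText) {
            return true; // keep the flag once any earlier document had text
        }
        // Assumption: null, empty, or whitespace-only output counts as "no text".
        return extractedText != null && !extractedText.trim().isEmpty();
    }
}
```

The original line would then become something like `_hasText = HasTextCheck.hasText(_hasText, stripper.getText(document));`, where `document` stands for whatever PDDocument is being processed.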

If you want an actual count, then subclass the stripper and override showGlyph(). Here is how our "big sister project" Apache Tika does it:

    @Override
    protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code, String unicode, Vector displacement) throws IOException
    {
        super.showGlyph(textRenderingMatrix, font, code, unicode, displacement);
        if (unicode == null || unicode.isEmpty()) {
            unmappedUnicodeCharsPerPage++;
        }
        totalCharsPerPage++;
    }
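
The snippet above leaves out the surrounding class and the two counter fields. The bookkeeping itself can be sketched without PDFBox, as below. GlyphCounter and its method names are hypothetical, and treating "total minus unmapped" as the valid character count is only an assumption about what getValidCharCnt() used to report. In a stripper subclass, showGlyph() would call record(unicode) right after super.showGlyph(...).

```java
// Hypothetical helper, not part of PDFBox or Tika.
final class GlyphCounter {
    private int totalChars;
    private int unmappedChars;

    // Call once per glyph with the unicode string that showGlyph() receives.
    void record(String unicode) {
        if (unicode == null || unicode.isEmpty()) {
            unmappedChars++; // glyph has no Unicode mapping
        }
        totalChars++;
    }

    int getTotalChars() {
        return totalChars;
    }

    int getUnmappedChars() {
        return unmappedChars;
    }

    // Assumed analogue of the old valid-character count.
    int getValidChars() {
        return totalChars - unmappedChars;
    }

    // Reset between pages or documents if per-page numbers are wanted.
    void reset() {
        totalChars = 0;
        unmappedChars = 0;
    }
}
```

With such a counter, the check from the question could be written as `_hasText = _hasText || counter.getValidChars() > 0;`.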

Tilman

On 10.09.2020 at 14:12, Karen Madore wrote:
Hello,

We are migrating from PDFBox 1.8.16 to 2.0.21 and I am looking for the
equivalent of the getValidCharCnt() method. Previously this method was
accessible via the PDFTextStripper class, but the class no longer seems to
provide it and I am unable to find a suitable replacement. Below is the line
of code I am trying to migrate.

_hasText = (_hasText || stripper.getValidCharCnt() > 0) ? true : false;

Imports used are:
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.multipdf.Splitter;

I am new to PDFBox so this might be a newbie question.

Cheers,

Karen Madore




