Am 24.01.2016 um 19:50 schrieb Felix Schumacher:
Am 23.01.2016 um 17:05 schrieb Rainer Jung:
Am 23.01.2016 um 13:17 schrieb Rainer Jung:
Since the calls to filter() are in a hot code path, I wonder whether
using a more complex code instead of a single regexp could be better.
The code should be fast in the common case, which IMHO is the case when
the resource name neither starts with javax nor with org.apache. That
case could be handled by a simple prefix comparison without regexp.

Only when the resource name starts with javax or org.apache (or
org/apache) the various cases to check may make using a regexp be the
better decision. Of course such code is a bit harder to maintain, than a
single regexp.

I think I'll run a simple micro benchmark to get an idea.

The below code nees about 5% of the execution time of the regexp based
one. In some cases 15%. I found no case worse than 15%.

The regex can be sped up, when "[./]" will be used instead of "(\.|/)"
and by combining the conditions like

"^(?:javax[./](?el|security[./]auth[./]message|servlet|websocket)|org[./]apache[./](?:catalina|coyote|el|jasper|juli|naming|tomcat))[./]"


That one is a bit nearer to the hand written logic below and performs
better in my micro tests with jmh on the strings "nomatch" and
"org.apache.catalina.shouldmatch". (3535124,947 ops/s vs. 5800147,877
ops/s).

Yes, using that regexp (with ?el -> ?:el) and a similar one for the Permit pattern the numbers change slightly, but still in most cases, the hand written matcher needs 5-10% of the regexp executions, worst case 20%. At least after warmup.


    /**
     * Filter classes.
     *
     * @param name class name
     * @return <code>true</code> if the class should be filtered
     */
    protected boolean filter(String name) {

        if (name == null)
            return false;

        char sep;
        boolean javax_or_org_apache;
        boolean dot_or_slash;
I found the names of the variables ..._or_... a bit confusing, as I read
a boolean or into it and thought it would be true, if name started with
javax. or org.apache.

Yes, agreed. I ran out of ideas for a better name. I did another test and Enums are fast enough, so I would replace the two "_or_" booleans with a Separator and a Prefix enum.

Thanks for having a look.

Rainer

Regards,
  Felix


        if (name.startsWith("javax")) {
            /* 5 == length("javax") */
            sep = name.charAt(5);
            if (sep == '.') {
                javax_or_org_apache = true;
                dot_or_slash = true;
            } else if (sep == '/') {
                javax_or_org_apache = true;
                dot_or_slash = false;
            } else {
                return false;
            }
        } else if (name.startsWith("org")) {
            /* 3 == length("org") */
            sep = name.charAt(3);
            /* 4 == length("org.") */
            if (sep == '.' && name.startsWith("apache.", 4)) {
                javax_or_org_apache = false;
                dot_or_slash = true;
            /* 4 == length("org/") */
            } else if (sep == '/' && name.startsWith("apache/", 4)) {
                javax_or_org_apache = false;
                dot_or_slash = false;
            } else {
                return false;
            }
        } else {
            return false;
        }

        if (javax_or_org_apache) {
            if (dot_or_slash) {
                /* 6 == length("javax.") */
if (name.startsWith("servlet.jsp.jstl.", 6)) {
                    return false;
                }
                if (name.startsWith("el.", 6) ||
                    name.startsWith("servlet.", 6) ||
                    name.startsWith("websocket.", 6) ||
                    name.startsWith("security.auth.message.", 6)) {
                    return true;
                }
            } else {
                /* 6 == length("javax/") */
                if (name.startsWith("servlet/jsp/jstl/", 6)) {
                    return false;
                }
                if (name.startsWith("el/", 6) ||
                    name.startsWith("servlet/", 6) ||
                    name.startsWith("websocket/", 6) ||
                    name.startsWith("security/auth/message/", 6)) {
                    return true;
                }
            }
        } else {
            if (dot_or_slash) {
                /* 11 == length("org.apache.") */
                if (name.startsWith("tomcat.jdbc.", 11)) {
                    return false;
                }
                if (name.startsWith("el.", 11) ||
                    name.startsWith("catalina.", 11) ||
                    name.startsWith("jasper.", 11) ||
                    name.startsWith("juli.", 11) ||
                    name.startsWith("tomcat.", 11) ||
                    name.startsWith("naming.", 11) ||
                    name.startsWith("coyote.", 11)) {
                    return true;
                }
            } else {
                /* 11 == length("org/apache/") */
                if (name.startsWith("tomcat/jdbc/", 11)) {
                    return false;
                }
                if (name.startsWith("el/", 11) ||
                    name.startsWith("catalina/", 11) ||
                    name.startsWith("jasper/", 11) ||
                    name.startsWith("juli/", 11) ||
                    name.startsWith("tomcat/", 11) ||
                    name.startsWith("naming/", 11) ||
                    name.startsWith("coyote/", 11)) {
                    return true;
                }
            }
        }
        return false;
    }


Regards,

Rainer

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

Reply via email to