martin-g commented on a change in pull request #384:
URL: https://github.com/apache/tomcat/pull/384#discussion_r537495586
##########
File path: java/org/apache/catalina/valves/AbstractAccessLogValve.java
##########
@@ -1805,4 +1808,99 @@ protected AccessLogElement createAccessLogElement(char pattern) {
             return new StringElement("???" + pattern + "???");
         }
     }
+
+
+    /*
+     * This method is intended to mimic the escaping performed by httpd and
+     * mod_log_config. mod_log_config escapes more elements than indicated by the
+     * documentation. See:
+     * https://github.com/apache/httpd/blob/trunk/modules/loggers/mod_log_config.c
+     *
+     * The following escaped elements are not supported by Tomcat:
+     * - %C   cookie value (see %{}c below)
+     * - %e   environment variable
+     * - %f   filename
+     * - %l   remote logname (always logs "-")
+     * - %n   note
+     * - %R   handler
+     * - %ti  trailer request header
+     * - %to  trailer response header
+     * - %V   server name per UseCanonicalName setting
+     *
+     * The following escaped elements are not escaped in Tomcat because values
+     * that would require escaping are rejected before they reach the
+     * AccessLogValve:
+     * - %h   remote host
+     * - %H   request protocol
+     * - %m   request method
+     * - %q   query string
+     * - %r   request line
+     * - %U   request URI
+     * - %v   canonical server name
+     *
+     * The following escaped elements are supported by Tomcat:
+     * - %{}i request header
+     * - %{}o response header
+     * - %u   remote user
+     *
+     * The following additional Tomcat elements are escaped for consistency:
+     * - %{}c cookie value
+     * - %{}r request attribute
+     * - %{}s session attribute
+     *
+     * giving a total of 6 elements that are escaped in Tomcat.
+     *
+     * Quoting from the httpd docs:
+     * "...non-printable and other special characters in %r, %i and %o are
+     * escaped using \xhh sequences, where hh stands for the hexadecimal
+     * representation of the raw byte. Exceptions from this rule are " and \,
+     * which are escaped by prepending a backslash, and all whitespace
+     * characters, which are written in their C-style notation (\n, \t, etc)."
+     *
+     * Reviewing the httpd code, characters with the high bit set are escaped.
+     * The httpd is assuming a single byte encoding which may not be true for
+     * Tomcat so Tomcat uses the Java \\uXXXX encoding.
+     */
+    protected static void escapeAndAppend(String input, CharArrayWriter dest) {
+        if (input == null || input.isEmpty()) {
+            dest.append('-');
+            return;
+        }
+
+        for (char c : input.toCharArray()) {
+            switch (c) {
+                // " and \
+                case '\\':
+                    dest.append("\\\\");
+                    break;
+                case '\"':
+                    dest.append("\\\"");
+                    break;

Review comment:
I see there was a new commit that moved left everything but the problematic `break` :-)
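
For readers following the thread, the escaping rules described in the quoted comment can be sketched roughly as below. This is a minimal, self-contained illustration and not the code from the pull request: the class name `EscapeSketch`, the helper `escape`, the set of whitespace cases handled, and the exact printable range (0x20 to 0x7E) are all assumptions made for the example. The quoted hunk is cut off after the `"` case, so everything past that point here is a guess at one reasonable shape.

```java
import java.io.CharArrayWriter;

// Hypothetical sketch of the escaping rules described in the quoted comment.
// It is NOT the Tomcat implementation; names and ranges are assumptions.
final class EscapeSketch {

    static void escapeAndAppend(String input, CharArrayWriter dest) {
        // Null or empty values are logged as "-", as in the quoted hunk.
        if (input == null || input.isEmpty()) {
            dest.append('-');
            return;
        }

        for (char c : input.toCharArray()) {
            switch (c) {
                // " and \ are escaped by prepending a backslash.
                case '\\':
                    dest.append("\\\\");
                    break;
                case '"':
                    dest.append("\\\"");
                    break;
                // Whitespace is written in C-style notation (assumed subset).
                case '\n':
                    dest.append("\\n");
                    break;
                case '\t':
                    dest.append("\\t");
                    break;
                case '\r':
                    dest.append("\\r");
                    break;
                case '\f':
                    dest.append("\\f");
                    break;
                default:
                    if (c < 0x20 || c > 0x7e) {
                        // Control characters and anything outside printable
                        // ASCII: backslash, 'u', then four hex digits.
                        dest.append(String.format("\\u%04X", (int) c));
                    } else {
                        dest.append(c);
                    }
            }
        }
    }

    // Convenience wrapper (hypothetical) that returns the escaped string.
    static String escape(String input) {
        CharArrayWriter writer = new CharArrayWriter();
        escapeAndAppend(input, writer);
        return writer.toString();
    }
}
```

With this sketch, a null or empty value becomes `-`, a tab becomes the two characters `\t`, and a character such as U+00E9 becomes `\u00E9`, so the escaped value always stays within printable ASCII.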