On 08.11.2010 16:16, Rainer Jung wrote:
A typical annoyance when combining Apache web server and Tomcat is the
difference in access log timestamp. Apache logs the beginning of the
request, Tomcat logs the end of the request.

I added a feature to Apache trunk (will become 2.4) to make it
configurable for Apache, which time stamp to choose ([1]). Furthermore
the Apache web server allows to choose the format, in which the time
stamp is being logged as something different than just "Common Log
Format". This is all configured by an appropriate "format" in %{format}t.
...

[1] http://httpd.apache.org/docs/trunk/mod/mod_log_config.html#formats

A first draft patch is availabel for review at

http://people.apache.org/~rjung/patches/tomcat-trunk-accesslog-extended-timestamp.patch

Status
======

- docs are still missing
- tested a bit also under load
- not yet tested with multiple instances of the access log
- the aspect of async request handling is not clear to me.
  There is request.getCoyoteRequest().getStartTime() and
  request.setAttribute(t1Name, new Long(t1)) and I haven't
  checked the relation between the two at the time of logging.


What works
==========

- %t still loggs response end time in Common Log Format
- %t{FORMAT} logs in configurable format:
  - FORMAT could be either "begin" or "end" or have a prefix of
    "begin:" or "end:", which means "use the request start time"
    respectively "use the response end time".
  - stripping the optional prefix the remaining format can be either:
    - one of the special tokens "sec" (seconds since epoch), "msec"
      (milliseconds since epoch) or "msec_frac" (only sub second
      milliseconds)
    - any other remaining format string will be passed to
      SimpleDateFormat

Implementation Design
=====================

Common Log Format formats and SimpleDateFormat formats are cached (the other special formats are trivial to evaluate). Each different format specification occuring in the pattern has its own cache (and SimpleDateFormat).

The cache is organized in two levels. A ThreadLocal caches the format for 60 consecutive timestamps, a global synchronized cache for 300 consecutive timestamps. The design is more important than in the original implementation because request start times can jump more.

All data is be consistent, i.e. begin times, end times and duration fit together even if multiple %{FORMAT}t entries are used in the log pattern.

Finally if a general format string is used, which gets fed into SimpleDateFormat, I remove any "S" (milliseconds) and replace it with a marker string. This is necessary to allow the cache to be effective. Replacing the marker with the milliseconds is a trivial post-formatting tas and doesn't need complex calendar objects (I know there are leap seconds, but well ...).

Performance Implications
========================

I did a few quick stress tests on a two CPU Sparc Server with a trivial one byte plus no sessions JSP. There was no clear difference between the old and the new implementation (below 5%, which one was faster was not stable, around 7.500 requests per second for default config).

Observations
============

The addElement() in the interface isn't very helpful. The date and time arguments are very special for the DateAndTime and ElapsedTime formatter. I guess it would be better to have everything needed in request and response or pass a mre generic data object that contains additional meta data. I didn't change any of the interfaces though, because it might not be to unlikely that people have actually extended existing AccessLog implementations.

I can't work on it during the next two days, but there will be some time during the weekend to proceed.

Regards,

Rainer

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

Reply via email to