[ https://issues.apache.org/jira/browse/LOG4J2-2606?focusedWorklogId=247238&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247238 ]

ASF GitHub Bot logged work on LOG4J2-2606:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/May/19 02:25
            Start Date: 23/May/19 02:25
    Worklog Time Spent: 10m 
      Work Description: carterkozak commented on pull request #273: LOG4J2-2606: Substantially improve async logging performance under heavy load
URL: https://github.com/apache/logging-log4j2/pull/273#discussion_r286754343
 
 

 ##########
 File path: log4j-core/src/main/java/org/apache/logging/log4j/core/async/AsyncLoggerDisruptor.java
 ##########
@@ -202,32 +209,95 @@ private boolean hasLog4jBeenShutDown(final Disruptor<RingBufferLogEvent> aDisrup
         return false;
     }

-    public boolean tryPublish(final RingBufferLogEventTranslator translator) {
+    boolean tryPublish(final RingBufferLogEventTranslator translator) {
         try {
+            // Note: we deliberately access the volatile disruptor field afresh here.
+            // Avoiding this and using an older reference could result in adding a log event to the disruptor after it
+            // was shut down, which could cause the publishEvent method to hang and never return.
             return disruptor.getRingBuffer().tryPublishEvent(translator);
         } catch (final NullPointerException npe) {
             // LOG4J2-639: catch NPE if disruptor field was set to null in stop()
-            LOGGER.warn("[{}] Ignoring log event after log4j was shut down: {} [{}] {}", contextName,
-                    translator.level, translator.loggerName, translator.message.getFormattedMessage()
-                            + (translator.thrown == null ? "" : Throwables.toStringList(translator.thrown)));
+            logWarningOnNpeFromDisruptorPublish(translator);
             return false;
         }
     }

-    void enqueueLogMessageInfo(final RingBufferLogEventTranslator translator) {
+    void enqueueLogMessageQueueFullBlocking(final RingBufferLogEventTranslator translator) {
 
 Review comment:
   Good idea, thanks for the suggestion
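
As context for the volatile-read note in the hunk above, here is a minimal
standalone sketch of that pattern (hypothetical names, not the Log4j2 source):
the field is dereferenced afresh at the point of use, and a concurrent stop()
is handled the same way LOG4J2-639 handles it, by catching the resulting
NullPointerException.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    class PublisherSketch {
        // Volatile so that stop() clearing the field is visible to publishers.
        private volatile BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);

        boolean tryPublish(final String event) {
            try {
                // Re-read the volatile 'queue' field on every call. Caching an
                // older reference could enqueue an event after shutdown, and
                // that event would never be consumed.
                return queue.offer(event); // false when the queue is full
            } catch (final NullPointerException npe) {
                // stop() nulled the field concurrently; report "not published".
                return false;
            }
        }

        void stop() {
            queue = null; // shutdown becomes visible to all publishing threads
        }
    }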
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 247238)
    Time Spent: 2h 20m  (was: 2h 10m)

> Asynchronous logging when the queue is full results in heavy CPU load
> ---------------------------------------------------------------------
>
>                 Key: LOG4J2-2606
>                 URL: https://issues.apache.org/jira/browse/LOG4J2-2606
>             Project: Log4j 2
>          Issue Type: Bug
>    Affects Versions: 2.11.2
>            Reporter: Carter Kozak
>            Assignee: Carter Kozak
>            Priority: Major
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> I've encountered unexpected performance characteristics when the asynchronous
> logging queue is full, resulting in the application becoming unavailable for
> an extended period of time.
> I think EventRoute.ENQUEUE is using a spin lock (or similar construct). When
> many threads attempt to log at a higher rate than the queue can be drained,
> logging throughput drops significantly while CPU utilization skyrockets,
> causing instability across the system.
> I can reproduce this in a benchmark (I will clean it up and push when I have
> a moment) where I get the following results:
> Setup:
>  - Root logger has a RandomAccessFile appender with immediateFlush disabled
>  - AsyncLoggerContextSelector enabled for completely asynchronous logging
> Results:
>  - 1 thread logging, any queue-full policy: ~1,400k events/second, 150% CPU load
>  - 32 threads logging, default policy (EventRoute.ENQUEUE): 300k-400k events/second, 930% CPU load
>  - 32 threads logging, custom policy using EventRoute.SYNCHRONOUS: ~1,200k events/second, 350% CPU load
> Using the synchronous policy results in ~1/3 the CPU load and 3-4x the
> throughput when the system is loaded to the point that the logging queue is
> filled.
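
The "custom policy using EventRoute.SYNCHRONOUS" above is not shown in this
thread; a minimal sketch of such a policy, based on log4j-core's
AsyncQueueFullPolicy interface (class and package names here are
illustrative), could look like:

    import org.apache.logging.log4j.Level;
    import org.apache.logging.log4j.core.async.AsyncQueueFullPolicy;
    import org.apache.logging.log4j.core.async.EventRoute;

    public class SynchronousQueueFullPolicy implements AsyncQueueFullPolicy {
        @Override
        public EventRoute getRoute(final long backgroundThreadId, final Level level) {
            // When the ring buffer is full, log on the caller's thread instead
            // of spinning until a slot frees up: callers pay the appender I/O
            // cost directly rather than burning CPU waiting for capacity.
            return EventRoute.SYNCHRONOUS;
        }
    }

Such a policy is selected via the log4j2.AsyncQueueFullPolicy system property,
e.g. -Dlog4j2.AsyncQueueFullPolicy=com.example.SynchronousQueueFullPolicy.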



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
