> On 2 Sep 2017, at 4:38 am, YKdvd <[email protected]> wrote:
>
> I may have missed it, but is there any documentation as to how incoming
> requests are distributed to wsgi handler threads? So if you have something
> like "WSGIDaemonProcess myApp processes=3 threads=10", do incoming hits tend
> to get assigned to the 10 threads in Process1 first, then spill into Process2
> threads, or is there some sort of balancing attempted among the threads of
> the 3 processes? I think I saw somewhere that recently completed threads
> were preferentially reused where possible, but I can't find anything like
> that now.
Sorry, completely missed this message in my inbox.
Across processes, requests should be more or less balanced as far as which
process gets to accept the next request. This is because there is a cross
process mutex lock ensuring that only one process can be waiting to accept the
next request at any time. This is in part to avoid the thundering herd problem,
although newer operating systems don't suffer as badly from it. What it usually
results in is that after a process accepts a request, the next request will be
accepted by a different process, if one is ready.
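The accept serialisation can be sketched in plain Python. To be clear, mod_wsgi
does this in C with a cross process mutex inside the Apache worker processes;
here ordinary threads stand in for processes and the whole thing is
illustrative only:

```python
import socket
import threading

def worker(name, listener, accept_mutex, log):
    # Only the worker holding the mutex may block in accept(); once
    # a connection arrives the mutex is released, so a different
    # worker can be the next one waiting to accept.
    while True:
        with accept_mutex:
            conn, _ = listener.accept()
        log.append(name)                  # record which worker got it
        conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nok")
        conn.close()

def serve(num_workers=3):
    # One shared listening socket, plus a mutex serialising accept().
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    accept_mutex = threading.Lock()
    log = []
    for name in range(num_workers):
        threading.Thread(target=worker,
                         args=(name, listener, accept_mutex, log),
                         daemon=True).start()
    return listener.getsockname()[1], log
```

The request handling deliberately happens outside the mutex, so only the
accept() call itself is serialised.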
As to which thread within a single process gets used to handle a new request,
for that there is a LIFO stack. When a request finishes, its thread is pushed
onto the top of the stack and will be preferentially reused for subsequent
requests. It wouldn't be used for the very next request, as another thread
would already be waiting to accept that one.
So even in a low volume request scenario, if you defined 20 threads, requests
might just cycle between two of them. The other threads will sit there dormant
and unused. The LIFO stack for tracking threads is there to avoid cycling
through all the threads and wasting effort swapping thread stacks; by reusing
only the most recently active thread, its stack is hopefully still in a nearby
memory cache.
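The effect of the LIFO order is easy to model. The real bookkeeping lives in
mod_wsgi's C code; this is just the pop/push order with made up thread ids:

```python
# Model the idle-thread bookkeeping as a plain LIFO stack of thread
# ids. A request pops the top entry; on completion the same id is
# pushed back on top, so under low concurrency the ids near the
# bottom of the stack are never touched.
idle = [5, 4, 3, 2, 1]       # thread 1 is on top of the stack

used = set()
for _ in range(10):          # ten sequential requests
    thread_id = idle.pop()   # take the most recently used thread
    used.add(thread_id)
    idle.append(thread_id)   # finished: straight back on top

print(sorted(used))          # -> [1]; only one thread ever ran a request
```

With a FIFO queue instead, all five ids would have been touched, defeating the
cache friendliness described above.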
You can monitor how many of the threads in the LIFO stack have ever been
activated, and so use it as a measure of whether you are wasting memory by
over allocating the number of threads, with some never being used. Spikes can
see the activated number jump up, but other available metrics can be used to
work out proper ongoing thread utilisation as well.
For example:
mod_wsgi.maximum_processes: 1
mod_wsgi.threads_per_process: 5
mod_wsgi.process_metrics: {'cpu_system_time': 0.019999999552965164,
    'request_busy_time': 0.011465999999999999,
    'current_time': 1504559432.472619,
    'memory_max_rss': 12369920L,
    'memory_rss': 12369920L,
    'pid': 31720,
    'request_threads': 2,
    'restart_time': 1504559391.892375,
    'threads': [{'thread_id': 1, 'request_count': 12L},
                {'thread_id': 2, 'request_count': 11L}],
    'request_count': 22L,
    'active_requests': 1,
    'cpu_user_time': 0.03999999910593033,
    'running_time': 40L}
There are five threads per process, but 'threads' only shows the two that have
been activated, and how many requests each has handled.
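For instance, you can compare the number of activated threads against the
configured pool size. This uses a hardcoded snapshot in the same shape as the
mod_wsgi.process_metrics() output above, purely for illustration:

```python
# Snapshot in the same shape as mod_wsgi.process_metrics() output;
# values hardcoded here rather than taken from a live process.
metrics = {
    'request_threads': 2,
    'threads': [{'thread_id': 1, 'request_count': 12},
                {'thread_id': 2, 'request_count': 11}],
}
threads_per_process = 5      # mod_wsgi.threads_per_process

activated = metrics['request_threads']
dormant = threads_per_process - activated
print(dormant)               # -> 3 threads allocated but never used
```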
This is old code so I haven't checked that it still works, but the principle
should hold. Note that influxdb_client_2 is assumed to be an InfluxDB client
instance you have created elsewhere:

import threading
import time

import mod_wsgi

last_metrics = None

def monitor(*args):
    global last_metrics

    while True:
        current_metrics = mod_wsgi.process_metrics()

        if last_metrics is not None:
            # The time metrics are cumulative counters, so report
            # the deltas since the previous sample.
            cpu_user_time = (current_metrics['cpu_user_time'] -
                    last_metrics['cpu_user_time'])
            cpu_system_time = (current_metrics['cpu_system_time'] -
                    last_metrics['cpu_system_time'])
            request_busy_time = (current_metrics['request_busy_time'] -
                    last_metrics['request_busy_time'])
            request_threads = current_metrics['request_threads']

            # Report data. InfluxDB expects timestamps in nanoseconds.
            timestamp = int(current_metrics['current_time'] * 1000000000)

            samples = []

            item = {}
            item['time'] = timestamp
            item['measurement'] = 'process'

            fields = {}
            fields['cpu_user_time'] = cpu_user_time
            fields['cpu_system_time'] = cpu_system_time
            fields['request_busy_time'] = request_busy_time
            fields['request_busy_usage'] = (request_busy_time /
                    mod_wsgi.threads_per_process)
            fields['threads_per_process'] = mod_wsgi.threads_per_process
            fields['request_threads'] = request_threads

            item['fields'] = fields

            samples.append(item)

            # influxdb_client_2 is an InfluxDB client created elsewhere.
            influxdb_client_2.write_points(samples)

        last_metrics = current_metrics

        # Sleep so that samples land roughly one second apart.
        current_time = current_metrics['current_time']
        delay = max(0, (current_time + 1.0) - time.time())
        time.sleep(delay)

thread = threading.Thread(target=monitor)
thread.setDaemon(True)
thread.start()
The calculated 'request_busy_usage' value shows the utilisation of the threads
in the process's thread pool. If it only ever shows 10% even though you have
increased the number of threads, then you can see that you have over allocated
them.
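As a worked example of that calculation, with invented numbers: if over a one
second reporting interval a process with 5 threads accumulated 0.5 seconds of
request busy time, the pool was 10% utilised.

```python
# Hypothetical deltas over a 1 second reporting interval.
threads_per_process = 5      # configured pool size
request_busy_time = 0.5      # seconds of busy time accumulated

# Busy time divided by thread count gives the fractional
# utilisation of the whole pool for the interval.
request_busy_usage = request_busy_time / threads_per_process
print(request_busy_usage)    # -> 0.1, i.e. the pool is 10% utilised
```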
There are various process and per request CPU metrics you can extract as well,
which can help determine whether you are overloading things due to GIL
contention.
Graham