#36520: Performance Regression in parse_header_params
---------------------------------+------------------------------------
     Reporter:  David Smith      |                    Owner:  (none)
         Type:  Bug              |                   Status:  new
    Component:  HTTP handling    |                  Version:  dev
     Severity:  Release blocker  |               Resolution:
     Keywords:                   |             Triage Stage:  Accepted
    Has patch:  0                |      Needs documentation:  0
  Needs tests:  0                |  Patch needs improvement:  0
Easy pickings:  0                |                    UI/UX:  0
---------------------------------+------------------------------------
Comment (by Natalia Bidart):

 Replying to [comment:11 Jake Howard]:
 > Encoding a string into bytes is incredibly fast. Perhaps doing the
 short-circuit, converting the data to bytes and then parsing it with
 `email.message` is fast enough? It's still going to be a performance
 regression, but hopefully not by quite as much.

 Below are benchmark results from my testing code, with `line =
 line.encode("utf-8")` called inside the benchmarked function, just before
 invoking `Message.get_params()`. This approach could work: on Python 3.13
 and 3.14 it is roughly 2x faster than `cgi` for very large headers (which
 are often the source of security reports), though on 3.11 and 3.12 it is
 still about 1.3x slower. However, the average case (having a `charset` and
 potentially a `boundary`) is 3.3x to 3.7x slower than `cgi`, and a bare
 `text/plain` is about 5x to 7x slower. I don't think we could easily avoid
 that penalty.
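
 For readers who want to try this locally, a timing harness could look
 roughly like the sketch below. It is not the testing code behind these
 numbers: `parse_with_get_params`, `bench`, and `HEADERS` are hypothetical
 names, the elided `application/x-stuff; ...` value is deliberately left
 out, and only `str` input is covered because the exact bytes wiring is
 not shown in this comment.

```python
import timeit
from email.message import Message

# Header values copied from the benchmark tables. The elided
# "application/x-stuff; ..." case is intentionally not reconstructed.
HEADERS = [
    "text/plain",
    "text/html; charset=UTF-8; boundary=something",
]


def parse_with_get_params(line):
    """Hypothetical helper: parse a str header value via Message.get_params()."""
    m = Message()
    m["content-type"] = line
    params = m.get_params()
    # The first tuple holds the main value; the rest are (name, value) pairs.
    return params[0][0].lower(), dict(params[1:])


def bench(line, number=10_000):
    """Return the total seconds taken by `number` calls on `line`."""
    return timeit.timeit(lambda: parse_with_get_params(line), number=number)


if __name__ == "__main__":
    for header in HEADERS:
        print(f"{header!r}: {bench(header):.3f}s")
```

 Absolute timings depend on the machine and on `number`, so only the
 relative differences between header shapes are meaningful.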

 ||= Python 3.11 =||= cgi =||= get_params(str) =||= get_params(bytes) =||= ratio =||= get_params(bytes) is =||
 || `text/plain` || 0.337 || 1.635 || 1.672 || 0.20 || 5.0x slower ||
 || `text/html; charset=UTF-8; boundary=something` || 1.362 || 4.348 || 4.514 || 0.30 || 3.3x slower ||
 || `application/x-stuff; ...` || 1.955 || 8.675 || 2.494 || 0.78 || 1.3x slower ||

 ||= Python 3.12 =||= cgi =||= get_params(str) =||= get_params(bytes) =||= ratio =||= get_params(bytes) is =||
 || `text/plain` || 0.356 || 1.657 || 1.725 || 0.21 || 4.8x slower ||
 || `text/html; charset=UTF-8; boundary=something` || 1.407 || 4.582 || 4.697 || 0.30 || 3.3x slower ||
 || `application/x-stuff; ...` || 2.017 || 9.609 || 2.645 || 0.76 || 1.3x slower ||

 ||= Python 3.13 =||= cgi =||= get_params(str) =||= get_params(bytes) =||= ratio =||= get_params(bytes) is =||
 || `text/plain` || 0.325 || 1.613 || 1.717 || 0.19 || 5.3x slower ||
 || `text/html; charset=UTF-8; boundary=something` || 1.167 || 3.862 || 3.943 || 0.30 || 3.4x slower ||
 || `application/x-stuff; ...` || 4.263 || 9.445 || 2.252 || 1.89 || 1.9x faster ||

 ||= Python 3.14 =||= cgi =||= get_params(str) =||= get_params(bytes) =||= ratio =||= get_params(bytes) is =||
 || `text/plain` || 0.258 || 1.601 || 1.725 || 0.15 || 6.7x slower ||
 || `text/html; charset=UTF-8; boundary=something` || 1.037 || 3.773 || 3.870 || 0.27 || 3.7x slower ||
 || `application/x-stuff; ...` || 3.978 || 8.789 || 2.132 || 1.87 || 1.9x faster ||
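
 To make the last two columns easier to check, here is a small sketch of
 how they can be derived from the timings. It assumes `ratio` means the
 `cgi` time divided by the `get_params(bytes)` time, so values below 1
 mean `get_params(bytes)` is slower. `compare` is a hypothetical helper,
 not part of the testing code.

```python
def compare(cgi_time, bytes_time):
    """Return (ratio, label) for one table row.

    Assumption: ratio = cgi_time / bytes_time, so a ratio below 1 means
    get_params(bytes) is slower than cgi.
    """
    ratio = cgi_time / bytes_time
    if ratio >= 1:
        label = f"{ratio:.1f}x faster"
    else:
        label = f"{1 / ratio:.1f}x slower"
    return round(ratio, 2), label


if __name__ == "__main__":
    # Python 3.11, `text/plain` row.
    print(compare(0.337, 1.672))
    # Python 3.13, `application/x-stuff; ...` row.
    print(compare(4.263, 2.252))
```

 Under that assumption, the Python 3.11 `text/plain` row gives a ratio of
 0.20 ("5.0x slower") and the Python 3.13 large-header row gives 1.89
 ("1.9x faster").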
-- 
Ticket URL: <https://code.djangoproject.com/ticket/36520#comment:12>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
