On Tue, Aug 26, 2014 at 12:53 PM, Mark Thomas <ma...@apache.org> wrote:
> One of the aims of the proposed cookie changes [1] was to deal with the
> HTML 5 changes that mean UTF-8 can appear in cookie headers.
>
> This has some potentially large implications for Tomcat.

Since we are already in the 8.0.x release cycle, I, as an end user/system
administrator, would expect parsing to remain 100% backwards compatible
for versions 8.0.x+n (n=1...).

> Currently, Tomcat handles cookies as MessageBytes, processing everything
> in bytes and only converting to String when necessary. This is largely
> possible because of the assumption that everything is ASCII.
>
> Introduce UTF-8 and processing everything in bytes gets a whole lot
> harder. You essentially have to decode to UTF-8 to ensure that you have
> valid data - at which point why not just use Strings anyway?
>
> I am currently leaning towards removing a lot of the current cookie
> header caching recycling and doing something along the following lines:

All that caching/recycling is there to avoid GC cycles, and in the past it
was a crucial performance optimization. Back in those days, with the
hardware that was available in 06-07, we were pushing a single Tomcat
instance to 60k requests per second. Creating new objects was painfully
expensive at that rate.

> - Lazy parsing as currently (but unless cookie based session tracking is
> disabled this is going to run on every request)

But our own cookie, JSESSIONID, doesn't have to be UTF-8, does it? This
goes hand in hand with the SessionIdGenerator that Rainer just did; can
that return UTF-8 values? So the lazy part can apply to all other cookies,
meaning: don't parse them until the app requests them, just store the
bytes and move on.
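
The "store the bytes and move on" idea can be illustrated with a minimal
sketch. Everything here is an assumption made for illustration:
LazyCookies is a hypothetical class, not Tomcat's MessageBytes or any
existing Tomcat API, and the split on ';' and '=' is deliberately naive
rather than an RFC 6265 grammar.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder (not a Tomcat class): keeps the raw Cookie header
// bytes and decodes/parses them only when the app first asks for cookies.
final class LazyCookies {
    private final byte[] rawHeader;   // bytes as received on the wire
    private List<String[]> parsed;    // stays null until first access

    LazyCookies(byte[] rawHeader) {
        this.rawHeader = rawHeader.clone();
    }

    boolean isParsed() {
        return parsed != null;
    }

    // Returns {name, value} pairs; decoding and parsing happen at most once.
    List<String[]> getCookies() {
        if (parsed == null) {
            // Note: this constructor replaces malformed UTF-8 with U+FFFD
            // instead of rejecting it; strict validation would need a
            // CharsetDecoder configured to report errors.
            parsed = parse(new String(rawHeader, StandardCharsets.UTF_8));
        }
        return parsed;
    }

    // Naive split for illustration only; quoted values are not handled.
    private static List<String[]> parse(String header) {
        List<String[]> result = new ArrayList<>();
        for (String pair : header.split(";")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // no '=' in this segment, skip it
            }
            String name = pair.substring(0, eq).trim();
            if (name.isEmpty()) {
                continue; // nameless pair, skip it
            }
            result.add(new String[] { name, pair.substring(eq + 1).trim() });
        }
        return Collections.unmodifiableList(result);
    }
}
```

With this shape, a request that never asks for its cookies pays only for
the byte copy. The trade-off raised in the quoted text still applies: if
session tracking needs JSESSIONID on every request, that lookup would
have to work on the raw bytes for the laziness to help.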

> - Convert headers to UTF-8 strings
> - Parse them with a new parser along the lines of o.a.t.u.http.parser
> - Have that parser return an array of javax.servlet.http.Cookie objects
> - Pass those to the app if/when requested
>
> In terms of handling RFC6265 and RFC2109 my plan is to have two parsers,
> share as much code as possible and switch between them based on the
> cookie header with the expectation that 99.9% of cookies will be parsed
> by the RFC6265 parser. We could add some options to this switching to
> enable other parsers (e.g. a Netscape parser) to be used.

I like the idea of swappable parsers, with the default being the exact
behavior you see now. I can see changing the default after some
stabilization.

> I'd also like to keep the current cookie parsing implementation for now.
> Until we are happy with the new parsing, the current implementation will
> be the default. Once we are happy with the new parsing we can change the
> default. We can add an option to switch between the current and the new
> parsing.
>
> Thoughts?

Knock it out.

> Mark
>
> [1] https://wiki.apache.org/tomcat/Cookies
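
The parser switching discussed in this thread (two parsers sharing code,
selected from the cookie header itself) might look roughly like the
sketch below. All names are hypothetical and none of this is Tomcat API.
The selection rule assumes that an RFC 2109 client announces itself with
a leading $Version attribute, and the shared splitting code ignores
details such as ',' separators and ';' inside quoted values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical interface, named for illustration only.
interface CookieParser {
    Map<String, String> parse(String header);
}

final class CookieParsers {
    private CookieParsers() {}

    // Both parsers share one splitting routine and differ only in a flag.
    static final CookieParser RFC6265 = header -> splitPairs(header, false);
    static final CookieParser RFC2109 = header -> splitPairs(header, true);

    // Assumption: a header starting with $Version is treated as RFC 2109;
    // everything else goes to the RFC 6265 parser.
    static CookieParser select(String header) {
        return header.trim().startsWith("$Version") ? RFC2109 : RFC6265;
    }

    private static Map<String, String> splitPairs(String header, boolean legacy) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String pair : header.split(";")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // segment without '=', skip it
            }
            String name = pair.substring(0, eq).trim();
            String value = pair.substring(eq + 1).trim();
            if (name.isEmpty()) {
                continue;
            }
            if (legacy) {
                if (name.startsWith("$")) {
                    continue; // $Version, $Path, $Domain are attributes, not cookies
                }
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1); // drop surrounding quotes
                }
            }
            cookies.put(name, value);
        }
        return cookies;
    }
}
```

A configuration option that forces one parser, or keeps the existing
implementation as the default, would simply bypass select() and return a
fixed CookieParser.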