2012/1/10 Rainer Jung <rainer.j...@kippdata.de>: > I analyzed a fun problem a few weeks ago. Someone was using the shared > loader extensively (TC 5.5). They observed performance problems and thread > dumps showed, that often the class loader locking was the culprit. > > Code was inside loadClass(). Now it turned out, it wasn't about really > loading classes. Instead there was a lot of deserialization going on (data > received from a backend app server). Deserialization uses reflection > intensively and reflection leads to loadClass() calls to retrieve the > classes. We observed e.g. several hundreds loadClass() calls per second. > > Now loadClass() in the WebappClasLoader does: > > - check own class cache > - check super class cache > - try loading from system loader > - call Class.forName with parent loader (which calls loadClass() there) > [only if "delegated", which is *not* the default] > - try loading via findClass() > - call Class.forName with parent loader (which calls loadClass() there) > [only if not "delegated", which *is* the default] > > So if a class was previously loaded by the shared loader (or common or > server), then we will not find in in our own or the super cache, will then > *always* try to load it via system, will then (if default) always try to > load it ourselves and only finally will try to load it via parent and find > it there in the cache. > > This turned out to become a bottleneck. > > I implemented a quick hack which cached the classes loaded by system, parent > and shared positively and negatively (not found) in the WebappLoader using > the same method that was already used for its own cache. Thus the massive > calls to those loaders could be avoided and the bottleneck went away. > > I wonder whether we want to improve caching in the WebappLoader. Of course > most deployments no longer use shared or common to share many application > classes, but it is still a supported feature and for some classes like JDBC > it is standard. > > What we could do to keep the design simple is caching any positive result > from loadClass() in the WebappLoader, even it it was found via super, system > or parent. In addition we could also cache negative results for all those. > The biggest downsides I can see would be > > - less dynamics: if someone had a more dynamic loader unerneath ours, which > would change the result of loadClass() during runtime, we would shield the > app from it, because we now return classes from our cache. > > - increased memory use for the cache, i.e. the list of class names and > references to the classes. > > To stay completely compatible I think the feature should not be default, at > least until TC 7, maybe switch default for 8. > > What do you think? Does it make sense? Should I prepare a patch for trunk? >
Note that when looking for a class most time is wasted when looking up a "*.class" resource. That code is in findResourceInternal() and I think that that method should be considered as well, to speed up lookup of any resources, not only the classes. I had some old patch draft lying around. I submitted it in https://issues.apache.org/bugzilla/show_bug.cgi?id=52448 See that issue for details. Concerning the cache of classes that you are proposing: consider the following scenario: (1) WebappClassLoader is asked to load a class. It does not find it and then finds it in the parent classloader. (2) It is asked for this class again. During (2) should it look into its own resources first? If during the time between (1) and (2) one puts a class file into WEB-INF/classes should that class file be used? I think that during (2) call it must ignore the class in WEB-INF/classes and still use the class from parent classloader. That is for consistency with previous result. I think that instead of caching the class instance itself it would be safe to cache the fact that the class was loaded from the parent classloader. It separates responsibilities and solves the issue with dynamic classloading. The difference is small though. Actually WebappClassLoader#notFoundResources already serves similar purpose. Regarding class instances caching I see two concerns: a) Garbage-collecting unused classes. b) Hot-swapping classes during debugging. Regarding a) that is not a concern if parent classloader already has cache of those classes. Caching just names is safer, does not prevent gc of the classes and the names take less memory (and no PermGen memory). Regarding b) I suspect that hot-swapping changes bytecode but does not change the Class instance. Documentation is at [1], but I do not have much experience with this feature. [1] http://docs.oracle.com/javase/1.4.2/docs/guide/jpda/enhancements.html#hotswap Regarding "try loading from system loader" call: Isn't that call fast? In JRE 6 there are some indexes (lib/classlist, lib/meta-index) that should help system classloader to reject requests for classes that it cannot load. While we are talking about classloaders - there is one more patch in Bugzilla, https://issues.apache.org/bugzilla/show_bug.cgi?id=48903#c6 that tries to use some jdk7 feature to improve classloader locking. I do not plan to look at it in near future, as I do not see much benefit from jdk7 yet. Just reminding. Best regards, Konstantin Kolinko --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org For additional commands, e-mail: dev-h...@tomcat.apache.org