Thanks Tri, I really appreciate the response. When I get some free time shortly, I'll start giving some of these a try and report back.
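
In the meantime, here's roughly the shape of the loop I'm planning for that
SolrJ test app, in case anyone spots a problem with the approach before I
build it. Core names, URLs, and field names below are placeholders, and I
haven't run this yet:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.request.CoreAdminRequest;
    import org.apache.solr.common.SolrInputDocument;

    public class CoreChurnTest {
        public static void main(String[] args) throws Exception {
            // Admin endpoint for core create/unload, plus the static database core.
            HttpSolrServer admin = new HttpSolrServer("http://localhost:8983/solr");
            HttpSolrServer dbCore = new HttpSolrServer("http://localhost:8983/solr/dbcore");

            for (int run = 0; run < 1000; run++) {
                String coreName = "usercore_" + run;

                // Create the user core; its instanceDir must already hold a conf/ dir.
                CoreAdminRequest.createCore(coreName, coreName, admin);

                // Index a batch of record ids into the user core.
                HttpSolrServer userCore =
                    new HttpSolrServer("http://localhost:8983/solr/" + coreName);
                for (int i = 0; i < 180; i++) {
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", Integer.toString(i));
                    userCore.add(doc);
                }
                userCore.commit();

                // Query the database core, limited to the ids in the user core
                // via a cross-core join filter.
                SolrQuery q = new SolrQuery("*:*");
                q.addFilterQuery("{!join from=id to=id fromIndex=" + coreName + "}*:*");
                dbCore.query(q);

                // Unload the user core and delete its index from disk.
                CoreAdminRequest.unloadCore(coreName, true, admin);
                userCore.shutdown();
            }
        }
    }

Each pass creates, fills, queries against, and unloads a fresh core, so any
PermGen growth across passes should show up clearly in VisualVM.
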
On Mon, Mar 3, 2014 at 12:42 PM, Tri Cao <tm...@me.com> wrote:

> If it's really the interned strings, you could try upgrading the JDK, as
> the newer HotSpot JVM puts interned strings in the regular heap:
>
> http://www.oracle.com/technetwork/java/javase/jdk7-relnotes-418459.html
> (search for String.intern() in that release)
>
> I haven't had a chance to look into the new core auto-discovery code, so I
> don't know if it's implemented with reflection or not. Reflection and
> dynamic class loading are another source of PermGen exceptions, in my
> experience.
>
> I don't see anything wrong with your JVM config, which is very much
> standard.
>
> Hope this helps,
> Tri
>
> On Mar 03, 2014, at 08:52 AM, Josh <jwda...@gmail.com> wrote:
>
> In the user core there are two fields; the database core in question had
> 40, but in production environments the database core is dynamic. My time
> has been pretty crazy trying to get this out the door, so we haven't tried
> a standard Solr install yet, but it's on my plate for the test app. I
> don't know enough about Solr/Bitnami to know whether they've made any
> serious modifications to it.
>
> I had tried doing a dump from VisualVM previously, but it didn't seem to
> give me anything useful; then again, I didn't know how to look for
> interned strings. This is something I can take another look at in the
> coming weeks when I do my test case against a standard Solr install with
> SolrJ. The exception with user cores happens after 80-ish runs, so 640-ish
> user cores, with PermGen set to 64MB. The database core test hit the limit
> far sooner, in the 10-15 run range. As a note, once the PermGen limit is
> hit, if we simply restart the service with the same number of cores
> loaded, PermGen usage is minimal, even with the number of user cores being
> high in our production environment (500-600).
>
> If this does end up being the interning of strings, is there any way it
> can be mitigated? Our production environment for our heavier users would
> see on the order of 3200+ user cores created per day.
>
> Thanks for the help.
> Josh
>
> On Mon, Mar 3, 2014 at 11:24 AM, Tri Cao <tm...@me.com> wrote:
>
> > Hey Josh,
> >
> > I am not an expert in Java performance, but I would start with dumping
> > the heap and investigating with VisualVM (the free tool that comes with
> > the JDK).
> >
> > In my experience, the most common cause of PermGen exceptions is an app
> > creating too many interned strings. Solr (actually Lucene) interns the
> > field names, so if you have too many fields, that might be the cause.
> > How many fields in total, across cores, did you create before the
> > exception?
> >
> > Can you reproduce the problem with standard Solr? Is the Bitnami
> > distribution just Solr, or do they bundle some other libraries?
> >
> > Hope this helps,
> > Tri
> >
> > On Mar 03, 2014, at 07:28 AM, Josh <jwda...@gmail.com> wrote:
> >
> > It's a Windows installation using a Bitnami Solr installer. I
> > incorrectly put 64M into the configuration for this; I had copied the
> > test configuration I was using to recreate the PermGen issue we were
> > seeing on our production system (which is configured with 512M), as it
> > takes a while to recreate the issue with larger PermGen values. In the
> > test scenario there was a small, static, 180-document data core, plus 8
> > dynamic user cores that are used to index the unique document ids in
> > the user's view; these are then merged into a single user core.
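> >
> > The merge itself is just a CoreAdmin mergeindexes call; in SolrJ terms
> > it's roughly the following (core names made up):
> >
> >     // Merge the per-view user cores into one target core. The target
> >     // core must already exist; the sources are given as core names.
> >     CoreAdminRequest.mergeIndexes("merged_user_core",
> >         new String[0],                                  // no raw index dirs
> >         new String[] { "user_core_1", "user_core_2" },  // source cores
> >         new HttpSolrServer("http://localhost:8983/solr"));
> >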
> > The final user core contains the same number of document ids as the
> > data core, and the data core is queried with the ids in the final
> > merged user core as the limiter. The user cores are then unloaded and
> > deleted from the drive, and then the test is rerun with the user cores
> > re-created.
> >
> > We are also using core discovery mode to store/find our cores, and the
> > database data core uses dynamic fields with a mix of single-value and
> > multi-value fields. The user cores use a static configuration. The data
> > is indexed from SQL Server using jTDS for both the user and data cores.
> > As a note, we also reversed the test case I mention above, keeping the
> > user cores static and dynamically creating the database core, and this
> > produced the same issue, only it leaked faster. We assumed this was
> > because the configuration was larger and loaded more classes than the
> > simpler user core.
> >
> > When I get the time I'm going to put together a SolrJ test app to
> > recreate the issue outside of our environment, to see if others see the
> > same issue we're seeing and to rule out any kind of configuration
> > problem. Right now we're interacting with Solr from POCO via the
> > RESTful interface, and it's not very easy for us to spin this off into
> > something someone else could use. In the meantime we've made changes to
> > make the user cores more static, which has slowed the build-up of
> > PermGen to something that can be managed with a weekly reset.
> >
> > Sorry about the confusion in my initial email, and I appreciate the
> > response. If there's anything about my configuration that you think
> > might be useful, just let me know and I can provide it. We have a
> > workaround, but it really hampers our long-term goals for our Solr
> > implementation.
> >
> > Thanks
> > Josh
> >
> > On Mon, Mar 3, 2014 at 9:57 AM, Greg Walters <greg.walt...@answers.com> wrote:
> >
> > Josh,
> >
> > You've mentioned a couple of times that you've got PermGen set to 512M,
> > but then you say you're running with -XX:MaxPermSize=64M. These two
> > statements are contradictory, so are you *sure* you're running with
> > 512M of PermGen? Assuming you're on a *nix box, can you provide `ps`
> > output proving this?
> >
> > Thanks,
> > Greg
> >
> > On Feb 28, 2014, at 5:22 PM, Furkan KAMACI <furkankam...@gmail.com> wrote:
> >
> > > Hi;
> > >
> > > You can also check here:
> > >
> > > http://stackoverflow.com/questions/3717937/cmspermgensweepingenabled-vs-cmsclassunloadingenabled
> > >
> > > Thanks;
> > > Furkan KAMACI
> > >
> > > 2014-02-26 22:35 GMT+02:00 Josh <jwda...@gmail.com>:
> > >
> > >> Thanks Timothy,
> > >>
> > >> I gave these a try, and -XX:+CMSPermGenSweepingEnabled seemed to
> > >> cause the error to happen more quickly. With this option on, it
> > >> didn't seem to do the intermittent garbage collection that delayed
> > >> the issue with it off. I was already using a max of 512MB, and I can
> > >> reproduce the problem with it set this high or even higher. Right
> > >> now, because of how we have this implemented, increasing it to
> > >> something higher just delays the problem :/
> > >>
> > >> Anything else you could suggest would be really appreciated.
> > >>
> > >> On Wed, Feb 26, 2014 at 3:19 PM, Tim Potter <tim.pot...@lucidworks.com> wrote:
> > >>
> > >>> Hi Josh,
> > >>>
> > >>> Try adding -XX:+CMSPermGenSweepingEnabled, as I think for some VM
> > >>> versions, permgen collection was disabled by default.
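> > >>>
> > >>> That is, something along these lines in the JVM options (you
> > >>> already have the class unloading flag; the sweeping flag is the
> > >>> addition):
> > >>>
> > >>>     -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled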
> > >>>
> > >>> Also, I use -XX:MaxPermSize=512m -XX:PermSize=256m with Solr, so
> > >>> 64M may be too small.
> > >>>
> > >>> Timothy Potter
> > >>> Sr. Software Engineer, LucidWorks
> > >>> www.lucidworks.com
> > >>>
> > >>> ________________________________________
> > >>> From: Josh <jwda...@gmail.com>
> > >>> Sent: Wednesday, February 26, 2014 12:27 PM
> > >>> To: solr-user@lucene.apache.org
> > >>> Subject: Solr Permgen Exceptions when creating/removing cores
> > >>>
> > >>> We are using the Bitnami version of Solr 4.6.0-1 on a 64-bit
> > >>> Windows installation with 64-bit Java 1.7u51, and we are seeing
> > >>> consistent issues with PermGen exceptions. We have PermGen
> > >>> configured to be 512MB. Bitnami ships with a 32-bit version of Java
> > >>> for Windows, and we are replacing it with a 64-bit version.
> > >>>
> > >>> Passed-in Java options:
> > >>>
> > >>> -XX:MaxPermSize=64M
> > >>> -Xms3072M
> > >>> -Xmx6144M
> > >>> -XX:+UseParNewGC
> > >>> -XX:+UseConcMarkSweepGC
> > >>> -XX:CMSInitiatingOccupancyFraction=75
> > >>> -XX:+CMSClassUnloadingEnabled
> > >>> -XX:NewRatio=3
> > >>> -XX:MaxTenuringThreshold=8
> > >>>
> > >>> This is our use case:
> > >>>
> > >>> We have what we call a database core, which remains fairly static
> > >>> and contains the imported contents of a table from SQL Server. We
> > >>> then have user cores, which contain the record ids of results from
> > >>> a text search outside of Solr. We then query for the data we want
> > >>> from the database core and limit the results to the contents of the
> > >>> user core. This allows us to combine facet data from Solr with the
> > >>> search results from another engine. We are creating the user cores
> > >>> on demand and removing them when the user logs out.
> > >>>
> > >>> Our issue is that the constant creation and removal of user cores,
> > >>> combined with the constant importing, seems to push us over our
> > >>> PermGen limit. The user cores are removed at the end of every
> > >>> session, and as a test I made an application that would loop:
> > >>> creating the user core, importing a set of data to it, querying the
> > >>> database core using it as a limiter, and then removing the user
> > >>> core. My expectation was that in this scenario all the PermGen
> > >>> associated with that user core would be freed upon its unload,
> > >>> allowing that memory to be reclaimed during a garbage collection.
> > >>> This was not the case; usage would constantly go up until the
> > >>> application exhausted the memory.
> > >>>
> > >>> I also investigated whether there was a connection left behind
> > >>> between the two cores because I was joining them together in a
> > >>> query, but even unloading the database core after unloading all the
> > >>> user cores won't prevent the limit from being hit or any memory
> > >>> from being garbage collected from Solr.
> > >>>
> > >>> Is this a known issue with creating and unloading a large number of
> > >>> cores? Could it be configuration-based for the core? Is there
> > >>> something other than unloading that needs to happen to free the
> > >>> references?
> > >>>
> > >>> Thanks
> > >>>
> > >>> Notes: I've tried using tools such as Plumbr to determine if it's a
> > >>> leak within Solr, and my investigation turned up nothing.
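
P.S. If anyone wants to reproduce the interned-string failure mode Tri
describes in isolation, a loop along these lines (illustrative only, and
deliberately pathological) will exhaust PermGen on a pre-JDK7 HotSpot, where
interned strings still live in PermGen; per the release note Tri linked, on
JDK 7+ they land in the regular heap instead:

    import java.util.ArrayList;
    import java.util.List;

    public class InternChurn {
        // Hold references so the interned strings can't be collected,
        // similar to Lucene keeping interned field names reachable.
        private static final List<String> held = new ArrayList<String>();

        public static void main(String[] args) {
            long i = 0;
            while (true) {
                // Each iteration interns a distinct string; on JDK 6 this
                // eventually dies with "OutOfMemoryError: PermGen space".
                held.add(("field_" + i++).intern());
            }
        }
    }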