About your second point: try committing more often, with openSearcher set to false. There's a bit about it here: http://wiki.apache.org/solr/SolrConfigXml

<autoCommit>
  <maxDocs>10000</maxDocs> <!-- maximum uncommitted docs before autocommit triggered -->
  <maxTime>15000</maxTime> <!-- maximum time (in MS) after adding a doc before an autocommit is triggered -->
  <openSearcher>false</openSearcher> <!-- SOLR 4.0. Optionally don't open a searcher on hard commit. This is useful to minimize the size of transaction logs that keep track of uncommitted updates. -->
</autoCommit>

That should keep the size of the transaction log down to reasonable levels...

Best
Erick

On Sun, Oct 14, 2012 at 4:11 PM, Shawn Heisey <s...@elyograg.org> wrote:
> Please see my other thread called "Testing Solr4 - reference thread" for
> general information about my config layout. If more specific information is
> required, please let me know.
>
> So far I cannot get a solr.war built without slf4j bindings to work right.
> There does not seem to be any centrally configured directory I can use for
> the slf4j and log4j jars. I am hesitant to use a lib entry in
> solrconfig.xml, because I actually have three distinct solrconfig.xml files
> and each server has 16 cores that symlink to those files. I can have each
> instanceDir contain a symlink to a more central lib directory, but I don't
> want each core to have its own copy of those jars loaded into memory unless
> it's the only way to make it work. If anyone knows how to make this work
> properly, let me know. If the instanceDir symlink option is the only way, I
> will probably file an issue in Jira.
>
> If the updateLog is turned on (I did add _version_ to my schema), doing a
> full reindex (using DIH) leads to "out of memory" exceptions, and the
> transaction log takes up the same amount of disk space (in a single log
> file) as the partially built index. Based on the index progress before it
> died, performance is terrible -- about one third the pace of Solr 3.5.0,
> perhaps less.
>
> After I turned off updateLog, performance went way up and it was able to
> complete without error. I think it is actually faster than it was under
> 3.5.0 with the exact same DIH config, as long as updateLog is turned off. I
> haven't done enough testing to file an issue yet. Are there ways to split
> the transaction log into multiple files and control how much disk space the
> log uses? Can I do anything to increase performance?
>
> For relative paths, instanceDir is relative to solr.home, dataDir is
> relative to instanceDir, and if you are using symlinks for solrconfig.xml,
> xinclude directives are relative to the symlink location, not the real file
> location. These seem like reasonable defaults to me. Is this what I should
> expect for the future, or should I be filing an issue?
>
> Thanks,
> Shawn