All,

I just found a thread about this in the mailing list archives while troubleshooting the same problem. The kicker is that it doesn't take such large files to kill the StringBuilder. I have discovered the following:
With a text file of 3,443,464 bytes or less, I get no error.

At 3,443,465 bytes:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.lang.String.<init>(String.java:208)
    at java.lang.StringBuilder.toString(StringBuilder.java:431)
    at org.junit.Assert.format(Assert.java:321)
    at org.junit.ComparisonFailure$ComparisonCompactor.compact(ComparisonFailure.java:80)
    at org.junit.ComparisonFailure.getMessage(ComparisonFailure.java:37)
    at java.lang.Throwable.getLocalizedMessage(Throwable.java:267)
    at java.lang.Throwable.toString(Throwable.java:344)
    at java.lang.String.valueOf(String.java:2615)
    at java.io.PrintWriter.print(PrintWriter.java:546)
    at java.io.PrintWriter.println(PrintWriter.java:683)
    at java.lang.Throwable.printStackTrace(Throwable.java:510)
    at org.apache.tools.ant.util.StringUtils.getStackTrace(StringUtils.java:96)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.getFilteredTrace(JUnitTestRunner.java:856)
    at org.apache.tools.ant.taskdefs.optional.junit.XMLJUnitResultFormatter.formatError(XMLJUnitResultFormatter.java:280)
    at org.apache.tools.ant.taskdefs.optional.junit.XMLJUnitResultFormatter.addError(XMLJUnitResultFormatter.java:255)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner$4.addError(JUnitTestRunner.java:988)
    at junit.framework.TestResult.addError(TestResult.java:38)
    at junit.framework.JUnit4TestAdapterCache$1.testFailure(JUnit4TestAdapterCache.java:51)
    at org.junit.runner.notification.RunNotifier$4.notifyListener(RunNotifier.java:96)
    at org.junit.runner.notification.RunNotifier$SafeNotifier.run(RunNotifier.java:37)
    at org.junit.runner.notification.RunNotifier.fireTestFailure(RunNotifier.java:93)
    at org.junit.internal.runners.TestMethodRunner.addFailure(TestMethodRunner.java:104)
    at org.junit.internal.runners.TestMethodRunner.runUnprotected(TestMethodRunner.java:87)
    at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
    at org.junit.internal.runners.TestMethodRunner.runMethod(TestMethodRunner.java:75)
    at org.junit.internal.runners.TestMethodRunner.run(TestMethodRunner.java:45)
    at org.junit.internal.runners.TestClassMethodsRunner.invokeTestMethod(TestClassMethodsRunner.java:71)
    at org.junit.internal.runners.TestClassMethodsRunner.run(TestClassMethodsRunner.java:35)
    at org.junit.internal.runners.TestClassRunner$1.runUnprotected(TestClassRunner.java:42)
    at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
    at org.junit.internal.runners.TestClassRunner.run(TestClassRunner.java:52)
    at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:32)

At 3,443,466 bytes (or more):

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:393)
    at java.lang.StringBuilder.append(StringBuilder.java:120)
    at org.junit.Assert.format(Assert.java:321)
    at org.junit.ComparisonFailure$ComparisonCompactor.compact(ComparisonFailure.java:80)
    at org.junit.ComparisonFailure.getMessage(ComparisonFailure.java:37)
    at java.lang.Throwable.getLocalizedMessage(Throwable.java:267)
    at java.lang.Throwable.toString(Throwable.java:344)
    at java.lang.String.valueOf(String.java:2615)
    at java.io.PrintWriter.print(PrintWriter.java:546)
    at java.io.PrintWriter.println(PrintWriter.java:683)
    at java.lang.Throwable.printStackTrace(Throwable.java:510)
    at org.apache.tools.ant.util.StringUtils.getStackTrace(StringUtils.java:96)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.getFilteredTrace(JUnitTestRunner.java:856)
    at org.apache.tools.ant.taskdefs.optional.junit.XMLJUnitResultFormatter.formatError(XMLJUnitResultFormatter.java:280)
    at org.apache.tools.ant.taskdefs.optional.junit.XMLJUnitResultFormatter.addError(XMLJUnitResultFormatter.java:255)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner$4.addError(JUnitTestRunner.java:988)
    at junit.framework.TestResult.addError(TestResult.java:38)
    at junit.framework.JUnit4TestAdapterCache$1.testFailure(JUnit4TestAdapterCache.java:51)
    at org.junit.runner.notification.RunNotifier$4.notifyListener(RunNotifier.java:96)
    at org.junit.runner.notification.RunNotifier$SafeNotifier.run(RunNotifier.java:37)
    at org.junit.runner.notification.RunNotifier.fireTestFailure(RunNotifier.java:93)
    at org.junit.internal.runners.TestMethodRunner.addFailure(TestMethodRunner.java:104)
    at org.junit.internal.runners.TestMethodRunner.runUnprotected(TestMethodRunner.java:87)
    at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
    at org.junit.internal.runners.TestMethodRunner.runMethod(TestMethodRunner.java:75)
    at org.junit.internal.runners.TestMethodRunner.run(TestMethodRunner.java:45)
    at org.junit.internal.runners.TestClassMethodsRunner.invokeTestMethod(TestClassMethodsRunner.java:71)
    at org.junit.internal.runners.TestClassMethodsRunner.run(TestClassMethodsRunner.java:35)
    at org.junit.internal.runners.TestClassRunner$1.runUnprotected(TestClassRunner.java:42)
    at org.junit.internal.runners.BeforeAndAfterRunner.runProtected(BeforeAndAfterRunner.java:34)
    at org.junit.internal.runners.TestClassRunner.run(TestClassRunner.java:52)

I am writing a filesystem crawler, so I need to be able to crawl and index files of any size (within reason). A 3-4MB file is certainly within reason. I rewrote my code to store the file contents in a file and to read and write one line at a time. However, when I post the XML file to Solr using SimplePostTool, I get another OutOfMemoryError about the Java heap space (thrown from org.xmlpull... again).

In any case, does anyone have any ideas about this? Has anyone posted documents with contents larger than 3.5MB to Solr successfully? If so, how was it done? I'm using Solr v1.2.

Best,
Dave
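
P.S. Both traces pass through org.junit.Assert.format, called from ComparisonFailure.getMessage, so the heap appears to run out while JUnit builds the failure message for a failed string comparison rather than while the file is being read. Below is a minimal sketch of one way to sidestep that on the test side, assuming the test compares expected and actual content as whole strings. The class name LargeContentCompare and the method firstDifference are hypothetical names made up for illustration; they are not part of JUnit, Ant, or Solr, and this does nothing for the SimplePostTool error.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of JUnit, Ant, or Solr): compares two streams
// chunk by chunk so a failing test reports a short offset, not both contents.
public class LargeContentCompare {

    /**
     * Returns the offset of the first differing byte, or -1 if both streams
     * hold identical content. Only two 8 KB buffers are held in memory.
     */
    public static long firstDifference(InputStream expected, InputStream actual)
            throws IOException {
        byte[] a = new byte[8192];
        byte[] b = new byte[8192];
        long offset = 0;
        while (true) {
            int n = fill(expected, a);
            int m = fill(actual, b);
            int common = Math.min(n, m);
            for (int i = 0; i < common; i++) {
                if (a[i] != b[i]) {
                    return offset + i;
                }
            }
            if (n != m) {
                return offset + common; // one stream ended before the other
            }
            if (n == 0) {
                return -1; // both streams ended with no difference found
            }
            offset += n;
        }
    }

    /** Reads until the buffer is full or the stream ends; returns bytes read. */
    private static int fill(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int r = in.read(buf, total, buf.length - total);
            if (r < 0) {
                break;
            }
            total += r;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream x = new ByteArrayInputStream("abcdef".getBytes("UTF-8"));
        InputStream y = new ByteArrayInputStream("abcXef".getBytes("UTF-8"));
        System.out.println(firstDifference(x, y)); // prints 3
    }
}
```

In a test this could be used as assertEquals(-1L, LargeContentCompare.firstDifference(expectedStream, actualStream)), so that a failure message carries only two numbers and never the full file contents.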