A couple of things:

Consider indexing them with SolrJ; here's a place to get started:
http://searchhub.org/2012/02/14/indexing-with-solrj/. Especially if you
use a SAX-based parser, you have more control over memory consumption;
it's on the client, after all. And you can rack together as many clients,
all going to Solr, as you need.
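To make that concrete, here is a minimal, untested sketch of the idea:
stream an UpdateXmlMessages file with a SAX handler and push the docs to
Solr in batches through ConcurrentUpdateSolrServer (SolrJ 4.x API). The
URL, core name (collection1), batch size, and thread count are only
placeholder assumptions; adjust them for your setup.

import java.io.File;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class StreamingIndexer {

    public static void main(String[] args) throws Exception {
        // Placeholder URL and tuning values; adjust for your install.
        SolrServer solr = new ConcurrentUpdateSolrServer(
                "http://localhost:8983/solr/collection1", 1000, 4);

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        for (String path : args) {
            parser.parse(new File(path), new UpdateXmlHandler(solr));
        }
        solr.commit();
        solr.shutdown();
    }

    // Streams <doc>/<field> elements from an UpdateXmlMessages file and
    // sends them to Solr in batches, so the whole file is never held in
    // memory on the client.
    static class UpdateXmlHandler extends DefaultHandler {
        private final SolrServer solr;
        private final List<SolrInputDocument> batch = new ArrayList<>();
        private SolrInputDocument doc;
        private String fieldName;
        private StringBuilder text;

        UpdateXmlHandler(SolrServer solr) {
            this.solr = solr;
        }

        @Override
        public void startElement(String uri, String local, String qName,
                                 Attributes atts) {
            if ("doc".equals(qName)) {
                doc = new SolrInputDocument();
            } else if ("field".equals(qName) && doc != null) {
                fieldName = atts.getValue("name");
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if ("field".equals(qName) && doc != null) {
                doc.addField(fieldName, text.toString());
                text = null;
            } else if ("doc".equals(qName)) {
                batch.add(doc);
                doc = null;
                if (batch.size() >= 1000) {   // assumed batch size
                    flush();
                }
            }
        }

        @Override
        public void endDocument() {
            if (!batch.isEmpty()) {
                flush();
            }
        }

        private void flush() {
            try {
                // No explicit commit here; let autoCommit handle it.
                solr.add(batch);
                batch.clear();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }
}

Only one batch of documents is ever in memory on the client, and you can
run several of these clients against the same Solr instance in parallel.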
Here's a bunch of background information about tlogs and commits that
might be useful:
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Consider setting your <autoCommit> interval quite short (15 seconds) with
openSearcher set to false. That'll truncate your tlog, although how that
relates to your error is something of a mystery to me... (A sketch of
what that looks like in solrconfig.xml is appended after the quoted
messages below.)

Best,
Erick

On Sun, Jun 15, 2014 at 3:14 AM, Mikhail Khludnev
<mkhlud...@griddynamics.com> wrote:
> Hello Floyd,
>
> Did you consider disabling the tlog?
> Does each file contain many docs?
> Do you have SolrCloud? Do you use just sh/curl, or do you have a Java
> program?
> DIH is not really performant so far. Submitting roughly ten huge files
> in parallel is a good way to get decent throughput. Once again, nuke
> the tlog.
>
>
> On Sun, Jun 15, 2014 at 12:44 PM, Floyd Wu <floyd...@gmail.com> wrote:
>
>> Hi,
>> I have many XML message files formatted like this:
>> https://wiki.apache.org/solr/UpdateXmlMessages
>>
>> These files are generated by my index builder daily. Currently I am
>> sending these files to Solr via HTTP POST, but sometimes I hit an OOM
>> exception or end up with too many pending tlog entries.
>>
>> Is there a better way to "import" these files into Solr to build the
>> index?
>>
>> Thanks for the suggestions.
>>
>> Floyd
>>
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
> <http://www.griddynamics.com>
> <mkhlud...@griddynamics.com>
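For reference, here is roughly what Erick's commit suggestion looks like
in solrconfig.xml. The 15-second hard commit with openSearcher=false
comes from his note; the soft-commit interval shown is only an
illustrative assumption and should be tuned to how fresh your searches
need to be.

<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
  <autoCommit>
    <!-- hard commit every 15 seconds; flushes segments and truncates the tlog -->
    <maxTime>15000</maxTime>
    <!-- do not open a new searcher on hard commit -->
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <!-- soft commit controls document visibility; 60 seconds is an assumed value -->
    <maxTime>60000</maxTime>
  </autoSoftCommit>
</updateHandler>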