You should benefit from concurrent loading.  The text analysis will certainly
run concurrently; I'm not sure what else benefits, but I think there are other
things.  Ideally, make the number of concurrent loads configurable, try a few
values, and pick the one that gets the job done fastest.
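
A rough sketch of how that experiment could look, not a tested recipe. It
splits the file on line boundaries, posts each part with the same parameters
as the original request, and times the whole batch. The names SOLR_CSV_URL,
split_file, build_url, post and timed_load are made up for illustration, and
the host and port are a guess. It assumes one record per line, a header line
in the first row (repeated into every part), and part files that the Solr
server can read through stream.file.

```python
import time
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumed base URL; adjust host, port, and path for your install.
SOLR_CSV_URL = "http://localhost:8983/solr/update/csv"


def split_file(path, parts, has_header=True):
    """Split path into up to `parts` files on line boundaries; return their names."""
    # Reads the whole file into memory, which is fine for a sketch.
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()
    header = lines[:1] if has_header else []
    rows = lines[1:] if has_header else lines
    size = -(-len(rows) // parts)  # ceiling division
    names = []
    for i in range(parts):
        chunk = rows[i * size:(i + 1) * size]
        if not chunk:
            break
        name = "%s.part%d" % (path, i)
        with open(name, "w", encoding="utf-8") as out:
            # Repeat the header so each part can be loaded on its own.
            out.writelines(header + chunk)
        names.append(name)
    return names


def build_url(stream_file, base=SOLR_CSV_URL):
    """Build the update URL with the parameters from the original request."""
    params = {
        "commit": "false",
        "separator": "\t",   # encoded as %09
        "escape": "\\",      # encoded as %5C
        "stream.file": stream_file,
        "map": "NULL:",
        "keepEmpty": "false",
    }
    return base + "?" + urllib.parse.urlencode(params)


def post(url):
    """Send an empty-bodied POST; Solr reads the data from stream.file."""
    with urllib.request.urlopen(url, data=b"") as resp:
        return resp.status


def timed_load(files, workers, send=post):
    """Post all files using `workers` threads; return (seconds, results)."""
    urls = [build_url(f) for f in files]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(send, urls))
    return time.perf_counter() - start, results
```

To compare settings, call timed_load(split_file("workfile.txt", n), n) for
n = 1, 2, 4, 8 against a fresh index each time. Since the posts use
commit=false, a separate commit is still needed once they have all finished.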

~ David Smiley
________________________________________
From: Joe Calderon [calderon....@gmail.com]
Sent: Thursday, August 06, 2009 4:58 PM
To: solr-user@lucene.apache.org
Subject: concurrent csv loading

for first time loads i currently post to
/update/csv?commit=false&separator=%09&escape=\&stream.file=workfile.txt&map=NULL:&keepEmpty=false,
this works well and finishes in about 20 minutes for my workload.

this is mostly cpu bound, i have an 8 core box and it seems one takes
the brunt of the work.

if i wanted to optimize, would i see any benefit to splitting
workfile.txt in two and doing two posts?

i'm running lucid's build of solr 1.3.0 on jetty 6, io is not a
bottleneck as the data folder is on tmpfs

thx much
--joe
