A quick question, please:

I have directories with around 30 files of 40 MB each, with around 17,000 docs per file.

Is it better to index:
- file by file, with java -jar post.jar 1.xml, java -jar post.jar 2.xml, etc.
or
- all at the same time, with java -jar post.jar *.xml

All files are verified, so my question is only about speed.

Thanks for your comments,
Bruno
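
One way to settle this is to time both approaches on the same set of files. The sketch below is only an illustration: `elapsed_seconds` is a hypothetical helper, it assumes Solr is already running and that post.jar and the XML files sit in the current directory, and it relies on `date +%s`, which is widely available but not strictly POSIX. The file-by-file variant starts a new JVM for every file, so some per-file overhead is to be expected.

```shell
#!/bin/sh
# Hypothetical helper: run a command and print the wall-clock seconds it took.
# The command's own output is discarded so that only the timing is printed.
elapsed_seconds() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo $((end - start))
}

# Only attempt the comparison where post.jar is actually present.
if [ -f post.jar ]; then
  # Variant 1: one JVM launch per file.
  echo "file by file: $(elapsed_seconds sh -c 'for f in *.xml; do java -jar post.jar "$f"; done') s"
  # Variant 2: a single JVM launch for all files.
  echo "all at once:  $(elapsed_seconds java -jar post.jar *.xml) s"
fi
```

For a fair comparison, each variant should start from the same index state (for example, an empty index), since the second run would otherwise re-index documents that are already there.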


On 20/06/2012 05:44, Lance Norskog wrote:
M. Della Bitta is right: we're not talking about post.jar, but about starting Solr:

java -Xmx300m -jar start.jar

On Tue, Jun 19, 2012 at 10:05 AM, Erick Erickson <erickerick...@gmail.com> wrote:
Well, it _used_ to be defaulted in the code, but on looking at 3.6 it seems
like it defaults to Integer.MAX_VALUE, so you're fine....

And it's all deprecated in 4.x and will be gone.

Best
Erick

On Tue, Jun 19, 2012 at 7:07 AM, Bruno Mannina <bmann...@free.fr> wrote:
Currently -Xmx512m, and no effect.

Concerning maxFieldLength, no problem: it's commented out.

On 19/06/2012 13:02, Erick Erickson wrote:

Then try -Xmx600M
next try -Xmx900M


etc. The idea is to bump things on separate runs.
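
The staged runs could be written out as below. This is a sketch, not a tuning recipe: the 300m, 600m and 900m steps are taken from the sizes suggested in this thread, `solr_start_cmd` is a hypothetical helper name, and, as pointed out elsewhere in the thread, the heap option goes on the command that starts Solr (start.jar) rather than on post.jar. The loop only prints the commands; each one is meant to be run on its own, with the same file indexed and timed after each restart.

```shell
#!/bin/sh
# Hypothetical helper: build the Solr start command for a given heap size.
solr_start_cmd() {
  echo "java -Xmx$1 -jar start.jar"
}

# Print one start command per heap size to try, one run at a time.
for heap in 300m 600m 900m; do
  solr_start_cmd "$heap"
done
```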

But be a little cautious here. Look in your solrconfig.xml file; you'll
see a commented-out line:
<maxFieldLength>10000</maxFieldLength>

The default behavior for Solr/Lucene is to index the first 10,000 tokens
(not characters; think of tokens as words for now) in each
document and throw the rest on the floor. At the sizes you're talking
about, that's probably not a problem, but do be aware of it.

Best
Erick

On Tue, Jun 19, 2012 at 5:44 AM, Bruno Mannina <bmann...@free.fr> wrote:
Like this?

java -Xmx300m -jar post.jar myfile.xml



On 19/06/2012 11:11, Lance Norskog wrote:

Ah! Java memory size is a Java command-line option:


http://javahowto.blogspot.com/2006/06/6-common-errors-in-setting-java-heap.html

You could try increasing the memory size in stages up to maybe 300m.

On Tue, Jun 19, 2012 at 2:04 AM, Bruno Mannina <bmann...@free.fr> wrote:

On 19/06/2012 10:51, Lance Norskog wrote:

675 doc/s is respectable for that server. You might move the memory
allocated to Java up and down: there is a balance between the amount of
memory given to Java vs. the OS disk buffer.

How can I do that? Is there an option for my command line or in a
config file?
Sorry for this newbie question :(


And, of course, use the latest trunk.
Solr 3.6


On Tue, Jun 19, 2012 at 12:10 AM, Bruno Mannina <bmann...@free.fr> wrote:
Correction: file size is 40 MB!!!

On 19/06/2012 09:09, Bruno Mannina wrote:

Dear All,

I would like to know if my indexing speed is reasonable.

I have a 40 GB file with around 27,000 docs inside.
I index around 20 fields.

My (old) test server is a dual-core 3.06 GHz Intel Xeon with only 1 GB
of RAM.

The file takes 40 seconds with the command line:
java -jar post.jar myfile.xml

Could I increase this speed or reduce this time?

Thanks a lot,
PS: Newbie user
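
For reference, the 675 doc/s figure quoted in the replies follows directly from the numbers above, using the corrected file size of 40 MB. A small sketch of the arithmetic (`per_second` is a hypothetical helper; shell arithmetic is integer-only, which is exact for these values):

```shell
#!/bin/sh
# Hypothetical helper: integer rate = amount / seconds.
per_second() {
  echo $(( $1 / $2 ))
}

echo "docs per second: $(per_second 27000 40)"   # 27,000 docs in 40 s
echo "MB per second:   $(per_second 40 40)"      # 40 MB in 40 s
```

That works out to 675 docs per second, or about 1 MB of XML per second.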




