Hi, my process is:

I index 600,000 docs in the secondary core (each doc has 5 fields). No
problem with that. After this core is indexed (and optimized) it will
be used only for searches, during the indexing of the main core.
Currently, I am using a mergeFactor of 10 for the main core. I will try
with 2 to see if it makes a difference, and set useCompoundFile to
true. I guess I don't need to modify anything in the secondary core as
it is only used for searches.
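
For reference, this is roughly what I plan to change in the main core's
solrconfig.xml (just a sketch based on the stock Solr 1.x config; the
exact surrounding elements may differ in your version):

   <!-- options specific to the main on-disk lucene index -->
   <mainIndex>
     <!-- write each segment's files as one compound file:
          far fewer open file descriptors, slightly slower indexing -->
     <useCompoundFile>true</useCompoundFile>
     <!-- lower merge factor = fewer segments (and files) on disk,
          at the cost of slower bulk indexing -->
     <mergeFactor>2</mergeFactor>
   </mainIndex>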

Thanks for your answers,

Bruno

2009/7/14 Mark Miller <markrmil...@gmail.com>:
> What merge factor are you using now? The merge factor will influence the
> number of files that are created as the index grows. Lower = fewer file
> descriptors needed, but also slower bulk indexing.
> You could up the Max Open Files settings on your OS.
>
> You could also use
>    <!-- options specific to the main on-disk lucene index -->
>    <useCompoundFile>true</useCompoundFile>
>
> Which writes each segment's index files into one compound file and requires
> *way* fewer file handles (slightly slower indexing).
>
> It would normally be odd to hit something like that after only 50,000
> documents, but a doc with 300 fields is certainly not the norm ;) Anything
> else special about your setup?
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
> On Tue, Jul 14, 2009 at 12:49 PM, Bruno Aranda <brunoara...@gmail.com> wrote:
>
>> Hi,
>>
>> We are having a TooManyOpenFiles exception in our indexing process. We
>> are reading data from a database and indexing this data into one of
>> the two cores of our Solr instance. Each of the cores has a different
>> schema as they are used for different purposes. While we index in the
>> first core, we do many searches in the second core as it contains data
>> to "enrich" what we index (the second core is never modified - read
>> only). After indexing about 50,000 documents (about 300 fields each)
>> we get the exception. If we run the same process, but without the
>> "enrichment" (not doing queries in the second core), everything goes
>> all right.
>> We are using Spring Batch, and we only commit+optimize at the very
>> end, as we don't need to search anything in the data that is being
>> indexed.
>>
>> I have seen recommendations ranging from committing+optimizing more
>> often to lowering the merge factor. How does the merge factor affect
>> this scenario?
>>
>> Thanks,
>>
>> Bruno
>>
>
