Dear Sir/Madam,

I am Li Wei, from China, and I am writing to ask for your help. Here is the
problem I have encountered:


Our project has a scheduled task that runs at night and uses Lucene to build
an index for data from an Oracle database. It worked fine at the beginning.
However, as the index has grown, indexing has become slower, and when there is
a lot of data to index, the task can no longer finish overnight.


To solve this problem, we took the following measure: we store the index data
in different directories according to the time the data was inserted into the
database. This helps with the indexing problem to some extent. However, when
searching, the user has to specify the year in which the data was created so
that we can search the corresponding directory. It is a bad experience for the
users.
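
To make the layout concrete, here is a minimal sketch of the partitioning
scheme in plain Java, with in-memory lists standing in for the per-year index
directories. The names (YearPartitionedIndex, add, searchYear, searchAll) are
hypothetical and are not part of Lucene. The searchAll method shows a fan-out
over every partition, which is what would spare the user from choosing a year.
With real Lucene indexes, I assume something like MultiReader over one reader
per directory could play that role, but we have not tried it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in: one in-memory "index" per year, not a real Lucene directory. */
public class YearPartitionedIndex {
    // year -> documents stored in that year's partition, kept in year order
    private final Map<Integer, List<String>> partitions = new TreeMap<>();

    /** Route a record to the partition for the year it was inserted into the database. */
    public void add(int insertYear, String text) {
        partitions.computeIfAbsent(insertYear, y -> new ArrayList<>()).add(text);
    }

    /** Current behaviour: the user names the year, so only one partition is searched. */
    public List<String> searchYear(int year, String term) {
        List<String> hits = new ArrayList<>();
        List<String> docs = partitions.getOrDefault(year, Collections.<String>emptyList());
        for (String doc : docs) {
            if (doc.contains(term)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    /** Fan-out: search every partition in year order and merge the hits. */
    public List<String> searchAll(String term) {
        List<String> hits = new ArrayList<>();
        for (int year : partitions.keySet()) {
            hits.addAll(searchYear(year, term));
        }
        return hits;
    }
}
```

Each nightly run only touches the partition for the current year, which is why
indexing stays fast, while searchAll pays the cost of visiting every partition
at query time.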


Then we learned that Solr is good at indexing data from a database, so we
decided to adopt Solr in our project. But as the index grows, Solr also takes
more and more time to finish the indexing task. I know that SolrCloud can
solve the search problem when the index is large, but it is even slower at
indexing than Solr.


So I am writing to you for help. Is there any way for Solr to handle this kind
of problem? There are more than six hundred million records in the database
right now, and data is added to the database every day. Is it true that the
problem can be avoided if we don't set the uniqueKey property in the schema
file (schema.xml)? If so, that raises another problem: without the uniqueKey
property, index data can only be added and can't be updated. Could you please
suggest some solutions for these problems?
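
To illustrate the trade-off I am asking about, here is a small plain-Java
sketch. It is not Solr code, and the names (UniqueKeyDemo, KeyedIndex,
AppendOnlyIndex) are hypothetical. It assumes that, with a unique key, adding
a document whose key already exists replaces the old version, while without a
key every add is a plain append, so a changed record is left as a duplicate.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UniqueKeyDemo {
    /** With a unique key: an add for an existing key overwrites the old document. */
    public static class KeyedIndex {
        private final Map<String, String> docs = new LinkedHashMap<>();

        public void add(String id, String body) {
            docs.put(id, body); // lookup by key, then replace or insert
        }

        public int size() {
            return docs.size();
        }

        public String get(String id) {
            return docs.get(id);
        }
    }

    /** Without a unique key: every add appends, so an "update" leaves a stale copy behind. */
    public static class AppendOnlyIndex {
        private final List<String> docs = new ArrayList<>();

        public void add(String body) {
            docs.add(body); // no lookup, no way to find the old version
        }

        public int size() {
            return docs.size();
        }
    }

    public static void main(String[] args) {
        KeyedIndex keyed = new KeyedIndex();
        keyed.add("42", "first version");
        keyed.add("42", "second version");
        System.out.println("keyed: " + keyed.size() + " doc, body = " + keyed.get("42"));

        AppendOnlyIndex appendOnly = new AppendOnlyIndex();
        appendOnly.add("first version");
        appendOnly.add("second version");
        System.out.println("append-only: " + appendOnly.size() + " docs");
    }
}
```

The keyed version ends with one document holding the latest body, and the
append-only version ends with two. That is the behaviour we would lose by
dropping the uniqueKey.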


I look forward to your reply. Thank you very much for your time!


Best regards,
Li Wei
