Hi,

Each doc is 100K?  That's on the big side, yes, and the server seems on the
small side, yes.  Hence the "speed". :)
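
To put rough numbers on that, here is a back-of-envelope sketch, not a measurement. It assumes "100K" means about 100 kilobytes per document, and the 4-hour target is purely hypothetical; the class and method names are made up for illustration. With those assumptions, 6 million docs come to about 600 GB of raw input, so finishing in 4 hours would mean sustaining roughly 417 docs/sec (about 42 MB/sec) through collection, preprocessing and indexing combined.

```java
// Back-of-envelope sizing for the numbers mentioned in this thread.
// Assumption: "100K" means roughly 100 kilobytes per document.
final class IndexingEstimate {

    /** Total raw data volume in gigabytes (decimal units: 1 GB = 1,000,000 KB). */
    static double totalGigabytes(long docCount, long kilobytesPerDoc) {
        return docCount * (double) kilobytesPerDoc / 1_000_000.0;
    }

    /** Documents per second needed to finish within the given number of hours. */
    static double requiredDocsPerSecond(long docCount, double hours) {
        return docCount / (hours * 3600.0);
    }

    /** Megabytes per second the whole pipeline must sustain at that rate. */
    static double requiredMegabytesPerSecond(long docCount, long kilobytesPerDoc, double hours) {
        return requiredDocsPerSecond(docCount, hours) * kilobytesPerDoc / 1000.0;
    }

    public static void main(String[] args) {
        long docs = 6_000_000L;    // from the original question
        long kbPerDoc = 100L;      // assumed reading of "100K"
        double targetHours = 4.0;  // hypothetical target, not from the thread

        System.out.printf("total data:    %.0f GB%n", totalGigabytes(docs, kbPerDoc));
        System.out.printf("required rate: %.0f docs/sec%n", requiredDocsPerSecond(docs, targetHours));
        System.out.printf("required I/O:  %.1f MB/sec%n",
                requiredMegabytesPerSecond(docs, kbPerDoc, targetHours));
    }
}
```

Plug in your own target time and measured doc size; if the required rate looks high for a 2 CPU / 4G VM, that is the "speed" right there.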

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Wed, Mar 5, 2014 at 3:37 PM, Rallavagu <rallav...@gmail.com> wrote:

> Otis,
>
> Good points. I guess you are suggesting that it depends on the resources.
> Each document is 100K, and the preprocessing server is a 2-CPU VM with
> 4G RAM. So that could be a relatively "small" machine for processing this
> amount of data?
>
>
> On 3/5/14, 12:27 PM, Otis Gospodnetic wrote:
>
>> Hi,
>>
>> It depends.  Are docs huge or small? Server single core or 32 core?  Heap
>> big or small?  etc. etc.
>>
>> Otis
>>
>>
>> On Wed, Mar 5, 2014 at 3:02 PM, Rallavagu <rallav...@gmail.com> wrote:
>>
>>> It seems the latency is introduced by collecting the data from different
>>> sources and putting them together, then by the actual Solr indexing. I
>>> would say all these activities contribute equally. So, is it normal to
>>> expect indexing to run for long? Wondering what to expect in such cases.
>>> Thanks.
>>>
>>> On 3/5/14, 11:47 AM, Otis Gospodnetic wrote:
>>>
>>>  Hi,
>>>>
>>>> 6M is really not huge these days.  6B is big, though also still not huge
>>>> any more.  What seems to be the bottleneck?  Solr or DB or network or
>>>> something else?
>>>>
>>>> Otis
>>>>
>>>>
>>>> On Wed, Mar 5, 2014 at 2:37 PM, Rallavagu <rallav...@gmail.com> wrote:
>>>>
>>>>> All,
>>>>>
>>>>> Wondering about best practices/common practices to index/re-index a huge
>>>>> amount of data in Solr. The data is about 6 million entries in the db and
>>>>> other sources (the data is not located in one resource). Trying a
>>>>> solrj-based solution to collect data from different resources and index
>>>>> it into Solr. It takes hours to index into Solr.
>>>>>
>>>>> Thanks in advance
>>>>>
>>>>>
>>>>>
>>>>
>>
