Hi,

Each doc is 100K? That's on the big side, yes, and the server seems on the small side, yes. Hence the "speed". :)
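
To put rough numbers on that, here is a back-of-envelope sketch using the figures from this thread: about 6 million entries at roughly 100K each. Two things in it are assumptions, not facts from the thread: "100K" is read as 100,000 bytes per document, and the 4-hour run time is a placeholder, since the thread only says indexing takes "hours". The function names are made up for illustration.

```python
# Back-of-envelope estimate of the indexing workload discussed in this thread.
# Not a benchmark: it only multiplies the figures quoted in the messages.

DOC_COUNT = 6_000_000       # "about 6 million entries" (from the thread)
DOC_SIZE_BYTES = 100_000    # "100k each", read as 100,000 bytes (assumption)
ASSUMED_HOURS = 4           # hypothetical; the thread only says "hours"


def total_gigabytes(doc_count, doc_size_bytes):
    """Total raw data volume in decimal gigabytes."""
    return doc_count * doc_size_bytes / 1e9


def required_rate(doc_count, doc_size_bytes, hours):
    """Sustained (docs/sec, MB/sec) needed to finish within `hours`."""
    seconds = hours * 3600
    docs_per_sec = doc_count / seconds
    mb_per_sec = docs_per_sec * doc_size_bytes / 1e6
    return docs_per_sec, mb_per_sec


if __name__ == "__main__":
    gb = total_gigabytes(DOC_COUNT, DOC_SIZE_BYTES)
    docs, mb = required_rate(DOC_COUNT, DOC_SIZE_BYTES, ASSUMED_HOURS)
    print(f"total raw data: {gb:.0f} GB")
    print(f"rate for a {ASSUMED_HOURS}h run: {docs:.0f} docs/sec, {mb:.1f} MB/sec")
```

Under those assumptions the job moves several hundred GB of raw text, which is a lot of collecting and pre-processing for a 2 CPU VM with 4G RAM, whatever Solr itself is doing.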

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Wed, Mar 5, 2014 at 3:37 PM, Rallavagu <rallav...@gmail.com> wrote:

> Otis,
>
> Good points. I guess you are suggesting that it depends on the resources.
> The document is 100k each the pre processing server is a 2 cpu VM running
> with 4G RAM. So, that could be a "small" machine relatively to process such
> amount of data??
>
> On 3/5/14, 12:27 PM, Otis Gospodnetic wrote:
>
>> Hi,
>>
>> It depends. Are docs huge or small? Server single core or 32 core? Heap
>> big or small? etc. etc.
>>
>> Otis
>> --
>> Performance Monitoring * Log Analytics * Search Analytics
>> Solr & Elasticsearch Support * http://sematext.com/
>>
>> On Wed, Mar 5, 2014 at 3:02 PM, Rallavagu <rallav...@gmail.com> wrote:
>>
>>> It seems the latency is introduced by collecting the data from different
>>> sources and putting them together then actual Solr index. I would say all
>>> these activities are contributing equally though I would say So, is it
>>> normal to expect to run indexing to run for long? Wondering what to expect
>>> in such cases. Thanks.
>>>
>>> On 3/5/14, 11:47 AM, Otis Gospodnetic wrote:
>>>
>>>> Hi,
>>>>
>>>> 6M is really not huge these days. 6B is big, though also still not huge
>>>> any more. What seems to be the bottleneck? Solr or DB or network or
>>>> something else?
>>>>
>>>> Otis
>>>> --
>>>> Performance Monitoring * Log Analytics * Search Analytics
>>>> Solr & Elasticsearch Support * http://sematext.com/
>>>>
>>>> On Wed, Mar 5, 2014 at 2:37 PM, Rallavagu <rallav...@gmail.com> wrote:
>>>>
>>>>> All,
>>>>>
>>>>> Wondering about best practices/common practices to index/re-index huge
>>>>> amount of data in Solr. The data is about 6 million entries in the db and
>>>>> other source (data is not located in one resource). Trying with solrj based
>>>>> solution to collect data from difference resources to index into Solr. It
>>>>> takes hours to index Solr.
>>>>>
>>>>> Thanks in advance