What I'm suggesting is that you should aim for at most ~50GB of data per shard. How much is it currently? Each shard is a Lucene index, which carries a lot of overhead. If you can, try to have 20x-100x fewer shards than you currently do and you'll see lower heap requirements. I don't know about the static vs. dynamic schema memory issue, though.
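
As a rough sketch using the node A & B numbers from your message below (collection name and host are just placeholders): 3TB per node spread over 432 shard cores is roughly 7GB per shard, so aiming for ~50GB per shard means on the order of 60 shards per node, i.e. something like 1-2 shards per collection instead of 12. Re-creating a collection with fewer shards via the Collections API would look something like:

  curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=2"

and then reindexing the data into the new collection.
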
On Tue, Apr 11, 2017 at 6:09 PM, jpereira <jpereira...@gmail.com> wrote:

> Dorian Hoxha wrote
> > Isn't 18K lucene-indexes (1 for each shard, not counting the replicas) a
> > little too much for 3TB of data ?
> > Something like 0.167GB for each shard ?
> > Isn't that too much overhead (i've mostly worked with es but still lucene
> > underneath) ?
>
> I don't have only 3TB, I have 3TB in two tier2 machines; the whole cluster
> is 12 TB :) So what I was trying to explain was this:
>
> NODES A & B
> 3TB per machine, 36 collections * 12 shards (432 indexes), average heap
> footprint of 60GB
>
> NODES C & D - at first
> ~725GB per machine, 4 collections * 12 shards (48 indexes), average heap
> footprint of 12GB
>
> NODES C & D - after adding 220GB schemaless data
> ~1TB per machine, 46 collections * 12 shards (552 indexes), average heap
> footprint of 48GB
>
> So, what you are suggesting is that the culprit for the bump in heap
> footprint is the new collections?
>
>
> Dorian Hoxha wrote
> > Also you should change the heap 32GB->30GB so you're guaranteed to get
> > pointer compression. I think you should have no need to increase it more
> > than this, since most things have moved to out-of-heap stuff, like
> > docValues etc.
>
> I was forced to raise the heap size because the memory requirements
> dramatically raised, hence this post :)
>
> Thanks
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Dynamic-schema-memory-consumption-tp4329184p4329345.html
> Sent from the Solr - User mailing list archive at Nabble.com.