Thanks, Erick!
So let's say I have this config:

<filterCache
class="solr.FastLRUCache"
size="10000"
initialSize="10000"
autowarmCount="5000"/>

maxDoc = 1,000,000

So according to your formula, the filterCache could potentially consume
roughly this much RAM:
((1,000,000 / 8) + 128) * 10,000 = 1,251,280,000 bytes
1,251,280,000 bytes / 1,000 = 1,251,280 KB
1,251,280 KB / 1,000 = 1,251.28 MB
1,251.28 MB / 1,000 = 1.25 GB
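
To double-check the arithmetic above, here is a minimal Python sketch of the
worst-case estimate. The function name and parameter names are only
illustrative, the 128-byte overhead is the approximation for the fq text from
your formula, and it assumes every cache entry is stored as a full bitset of
maxDoc bits:

```python
def filter_cache_bytes(max_doc, cache_size, per_entry_overhead=128):
    """Worst-case filterCache estimate in bytes.

    Assumes each of the cache_size entries holds a bitset with one bit
    per document (max_doc / 8 bytes) plus a rough per-entry overhead.
    """
    bitset_bytes = max_doc / 8  # one bit per document, 8 bits per byte
    return (bitset_bytes + per_entry_overhead) * cache_size


# Values from the config above: size="10000", maxDoc = 1,000,000
estimate = filter_cache_bytes(max_doc=1_000_000, cache_size=10_000)
print(f"{estimate:,.0f} bytes = {estimate / 1000**3:.2f} GB")
```

This is an upper bound for a completely full cache. Entries for filters that
match only a few documents may take less space.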

Thanks,
Ben





On Wed, Jun 18, 2014 at 11:13 AM, Erick Erickson <erickerick...@gmail.com>
wrote:

> You pretty much have it. Actually, the number you want is the "maxDoc"
> figure from the admin UI screen. The formula will be maxDoc/8 bytes +
> (some overhead but not enough to matter), for EVERY entry.
>
> You'll never fit 100B docs on a single machine anyway. Lucene has a
> hard limit of 2B docs, and I've never heard of anyone fitting even 2B
> docs on a single machine in a performant manner. So under any
> circumstance this won't all be on one machine. You have to figure it
> locally for each shard. And at this size there's no doubt you'll be
> sharding!
>
> Also be very careful here. The "size" parameter in the cache
> definition is the number of _entries_, NOT the number of _bytes_.
>
> _Each_ entry is that size! So the cache requirements will be close to
> ((maxDoc/8) + 128) * (size_defined_in_the_config_file), where 128 is
> an approximation of the storage necessary for the text of the fq
> clause.
>
> Best,
> Erick
>
> On Wed, Jun 18, 2014 at 8:00 AM, Benjamin Wiens
> <benjamin.wi...@gmail.com> wrote:
> > Hi,
> > I'm looking for a formula to calculate the filterCache's size in RAM.
> >
> > The best estimate I can find is here:
> >
> > http://stackoverflow.com/questions/20999904/solr-filter-cache-fastlrucache-takes-too-much-memory-and-results-in-out-of-mem
> >
> > An index of 100.000.000.000 documents would thus take 12,5 GB in RAM
> > with this formula:
> >
> > 100.000.000.000 bit / 8 (to byte) / 1000 (to kb) / 1000 (to mb) / 1000
> > (to gb) = 12,5 GB
> >
> > Can anyone confirm this formula? I am aware that if the result of the
> > filter query is small, Solr can create something else that takes up less
> > memory.
> >
> > I know I can just start with a low filterCache size and kick it up in my
> > environment, but I'd like to come up with a scientific formula.
> >
> > Thanks,
> > Ben
>
