Under the covers, Lucene stores ints in a packed format, so I'd just count
on that for a first pass.
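
To make "packed" concrete, here is a rough Python sketch of the general idea. It is only an illustration, not Lucene's actual code: the function names are made up for this example, and Lucene's real encodings are more elaborate. The point is that values in the range 0-255 need 8 bits each, not 32.

```python
def bits_required(max_value: int) -> int:
    """Smallest number of bits that can hold every value in 0..max_value."""
    return max(1, max_value.bit_length())


def pack(values, bits):
    """Pack non-negative ints into one big integer, `bits` bits per value."""
    packed = 0
    for i, v in enumerate(values):
        if v < 0 or v >> bits:
            raise ValueError(f"{v} does not fit in {bits} bits")
        packed |= v << (i * bits)
    return packed


def unpack(packed, bits, count):
    """Recover `count` values of `bits` bits each from a packed integer."""
    mask = (1 << bits) - 1
    return [(packed >> (i * bits)) & mask for i in range(count)]


def packed_size_bytes(count, bits):
    """Bytes needed to store `count` values at `bits` bits per value."""
    return (count * bits + 7) // 8


if __name__ == "__main__":
    bits = bits_required(255)
    sample = [0, 255, 17, 128]
    assert unpack(pack(sample, bits), bits, len(sample)) == sample
    # Hypothetical scale: one billion values, packed versus plain 32-bit ints.
    print(packed_size_bytes(1_000_000_000, bits))
    print(packed_size_bytes(1_000_000_000, 32))
```

By this estimate a packed column of 0-255 values takes about a quarter of the space of plain 32-bit ints, before any other encoding tricks.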

What is "a lot of integer values"? Hundreds of millions? Billions? Trillions?

Unless you give us some indication of scale, it's hard to say anything
helpful. But unless you have some evidence that you're going to blow out
memory, I'd just ignore the "wasted" bits. Especially if you can use docValues:
that option keeps much of the underlying data in files memory-mapped through
MMapDirectory, so it lives in swappable OS memory rather than on the Java heap.

Best,
Erick

On Fri, Oct 16, 2015 at 1:53 AM, Robert Krüger <krue...@lesspain.de> wrote:
> Hi,
>
> I have a data model where I would store and index a lot of integer values
> with a very restricted range (e.g. 0-255), so theoretically the 32 bits of
> Solr's integer fields are complete overkill. I want to be able to do things
> like vector distance calculations on those fields. Should I worry about the
> "wasted" bits or will Solr compress/organize the index in a way that
> compensates for this if there are only 256 (or even fewer) distinct values?
>
> Any recommendations on how my fields should be defined to make things like
> numeric functions work as fast as technically possible?
>
> Thanks in advance,
>
> Robert
