Under the covers, Lucene stores ints in a packed format, so I'd just count on that for a first pass.
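
To make "packed format" concrete, here is a minimal Java sketch of the general bit-packing idea: store each value in only as many bits as the largest value needs. This is an illustration only, not Lucene's actual code. The class and method names (PackedIntsSketch, bitsRequired, pack, get) are hypothetical, and Lucene's real implementation (in its org.apache.lucene.util.packed package) is considerably more sophisticated.

```java
/** Illustrative bit-packing sketch; hypothetical names, NOT Lucene's implementation. */
class PackedIntsSketch {

    /** Smallest number of bits that can represent every value in [0, maxValue]. */
    static int bitsRequired(long maxValue) {
        if (maxValue < 0) {
            throw new IllegalArgumentException("maxValue must be >= 0");
        }
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    /** Packs non-negative values into 64-bit blocks, bitsPerValue bits each. */
    static long[] pack(int[] values, int bitsPerValue) {
        if (bitsPerValue < 1 || bitsPerValue > 32) {
            throw new IllegalArgumentException("bitsPerValue must be in 1..32");
        }
        long mask = (1L << bitsPerValue) - 1;
        long totalBits = (long) values.length * bitsPerValue;
        long[] blocks = new long[(int) ((totalBits + 63) / 64)];
        for (int i = 0; i < values.length; i++) {
            long v = values[i];
            if ((v & ~mask) != 0) {
                throw new IllegalArgumentException("value does not fit: " + values[i]);
            }
            long bitPos = (long) i * bitsPerValue;
            int block = (int) (bitPos >>> 6);   // which 64-bit block
            int offset = (int) (bitPos & 63);   // bit offset inside that block
            blocks[block] |= v << offset;
            int spill = offset + bitsPerValue - 64;
            if (spill > 0) {
                // The top 'spill' bits of the value continue in the next block.
                blocks[block + 1] |= v >>> (bitsPerValue - spill);
            }
        }
        return blocks;
    }

    /** Reads the value stored at the given index. */
    static int get(long[] blocks, int bitsPerValue, int index) {
        long mask = (1L << bitsPerValue) - 1;
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int offset = (int) (bitPos & 63);
        long v = blocks[block] >>> offset;
        int spill = offset + bitsPerValue - 64;
        if (spill > 0) {
            v |= blocks[block + 1] << (bitsPerValue - spill);
        }
        return (int) (v & mask);
    }
}
```

With values restricted to 0-255, bitsRequired(255) is 8, so 1,000 values pack into 125 longs (1,000 bytes) instead of the 4,000 bytes a plain int[] would take. If the real range is narrower still, the per-value cost in this sketch drops accordingly.
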
What is "a lot of integer values"? Hundreds of millions? Billions? Trillions? Unless you give us some indication of scale, it's hard to say anything helpful. But unless you have some evidence that you're going to blow out memory, I'd just ignore the "wasted" bits. That goes double if you can use docValues: that option holds much of the underlying data in MMapDirectory, which uses swappable OS memory rather than the Java heap.

Best,
Erick

On Fri, Oct 16, 2015 at 1:53 AM, Robert Krüger <krue...@lesspain.de> wrote:
> Hi,
>
> I have a data model where I would store and index a lot of integer values
> with a very restricted range (e.g. 0-255), so theoretically the 32 bits of
> Solr's integer fields are complete overkill. I want to be able to do things
> like vector distance calculations on those fields. Should I worry about the
> "wasted" bits or will Solr compress/organize the index in a way that
> compensates for this if there are only 256 (or even fewer) distinct values?
>
> Any recommendations on how my fields should be defined to make things like
> numeric functions work as fast as technically possible?
>
> Thanks in advance,
>
> Robert