You’re a bit off track. useDocValuesAsStored has no effect on the size on disk. It’s purely a query-time option that controls whether the data to return is pulled from the stored or the docValues part of the index. If you change the field definition (stored=true to stored=false) and reindex, you should see a significant difference in the size of your index, particularly in the “*.fdt/*.fdx” and “*.dvd/*.dvm” files, where stored and docValues data are kept respectively.
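If you want to compare before and after the reindex, something like the sketch below totals the file sizes in an index directory by those extensions. It's only an illustration: the function name and the grouping table are mine, not anything Solr ships, and if your segments use the compound file format (*.cfs) the individual files won't show up separately.

```python
from collections import defaultdict
from pathlib import Path

# Extension-to-group mapping taken from the file types named above.
# The group labels are made up for this sketch.
GROUPS = {
    ".fdt": "stored",
    ".fdx": "stored",
    ".dvd": "docValues",
    ".dvm": "docValues",
}


def index_size_by_group(index_dir):
    """Sum file sizes in bytes under index_dir, grouped as stored/docValues/other."""
    totals = defaultdict(int)
    for path in Path(index_dir).iterdir():
        if path.is_file():
            # Files with an unlisted (or no) extension fall into "other".
            totals[GROUPS.get(path.suffix, "other")] += path.stat().st_size
    return dict(totals)
```

Call index_size_by_group() on your core's index directory once with the old schema and once after reindexing with the new one, then compare the "stored" and "docValues" totals.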
However, it’s also apples and oranges. Specifically, using docValues as stored will _not_ necessarily return the fields the same way they were sent in the multiValued case. The docValues data is kept as a SORTED_SET, which means it’s both lexically sorted and deduplicated. So input like “a” “z” “h” “a” will return “a” “h” “z”.

Best,
Erick

> On Jul 15, 2020, at 1:35 PM, Gael Jourdan-Weil <gael.jourdan-w...@kelkoogroup.com> wrote:
>
> Hello,
>
> I was wondering if we can expect significant disk usage reduction (index size) if we move from fields defined as "docValues=true + stored=true" to "docValues=true + stored=false" (with useDocValuesAsStored=true as default in both cases)?
>
> Considering the use case we are targeting is only Streaming Expression with /export handler, I also understand that we might also set useDocValuesAsStored=false from what is described at https://lucene.apache.org/solr/guide/8_4/docvalues.html.
> If so, would setting useDocValuesAsStored=false help reduce the index size as well?
>
> We will obviously try it and see by ourselves the results but I was wondering if you already have an idea about it.
> Also if you have any good link to how data are physically stored depending on the fields options (indexed/stored/docValues), this could really be interesting.
>
> Thanks,
> Gaël
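
To make the SORTED_SET point in the reply concrete, here is a tiny sketch in plain Python. It is not Solr or Lucene code, and the function name is made up; it only mimics the "sorted and deduplicated" behaviour for a multiValued field.

```python
def as_sorted_set(values):
    """Mimic SORTED_SET semantics: drop duplicates, then sort lexically."""
    return sorted(set(values))


sent = ["a", "z", "h", "a"]      # values in the order they were indexed
returned = as_sorted_set(sent)   # what reading them back from docValues resembles
print(returned)                  # ['a', 'h', 'z']
```

So if the original order or the duplicates matter to you, keep stored=true for that field.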