Hi guys,

let's open a discussion.

*Use Case*: a set of fields that I use only for:
- exact search
- faceting
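To make the use case concrete, here is a minimal sketch of the request parameters such a field would typically receive: one exact-match filter plus one facet on the same field. It only builds a query string and does not contact a server. The `build_params` helper and the sample author value are illustrative assumptions, not part of any existing setup.

```python
from urllib.parse import urlencode


def build_params(author):
    """Return Solr request parameters for an exact match plus a facet on `author`.

    Hypothetical helper for illustration only.
    """
    return [
        ("q", "*:*"),
        # The term query parser matches the raw indexed value, with no analysis
        # or escaping, which suits an untokenized string field.
        ("fq", "{!term f=author}" + author),
        ("rows", "0"),                # only the facet counts are of interest here
        ("facet", "true"),
        ("facet.field", "author"),
        ("facet.mincount", "1"),      # skip values with zero matching documents
    ]


params = build_params("William Blake")   # sample value, made up for the example
query_string = urlencode(params)
print(query_string)
```

Neither operation needs norms, term frequencies, or positions, which is what the field configuration reflects.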

*Field Configuration*

  <field name="author" type="string" indexed="true" stored="true"
         docValues="true" omitTermFreqAndPositions="true" omitNorms="true" />

I don't need norms, term frequencies, or positions. I do need the index for exact search. I would like docValues because faceting on these fields is going to be heavy. I also want the values stored.

*Faceting approach*

*1)* Index the human-readable field value.

Facets are returned readable, out of the box. I cannot see any cons in this approach; I would say it is the standard one.
- When the docValues are built and flushed to disk, good compression algorithms are used.
- When facets are calculated, the ordinal of each term is used in memory, so we don't waste memory on the actual terms, nor time looking up the values until the very end of the process, after the counts are done.

*2)* Correlate each term to a custom ID outside the search system, and index the custom ID. After the facets are calculated, resolve the IDs and show the human-readable labels.

As far as I know, this overcomplicates the situation. We basically duplicate the lookup effort for the facet values: Lucene does it internally at the end of the faceting process (from ordinal to custom ID), and we do it again in the front end (from custom ID to value). The only apparent gain could be in terms of disk space, but even there I am not 100% sure that compressing a set of IDs brings much benefit over compressing the real values (which can contain repeated sequences of characters, for example).

What are your considerations? Any additional pros/cons?

Cheers

--
--------------------------
Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
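
P.S. To make the comparison between the two approaches concrete, here is a toy model in plain Python. It is not Lucene's actual implementation, only a sketch of the idea that counting happens on small integer ordinals and labels are resolved once at the end. The custom IDs (`a1`, `a2`, `a3`) and the sample authors are made up for the example.

```python
def facet_counts(doc_values):
    """Count facet values through ordinals (toy model, one value per document)."""
    terms = sorted(set(doc_values))                      # term dictionary: distinct values
    ordinal_of = {term: i for i, term in enumerate(terms)}
    per_doc_ords = [ordinal_of[v] for v in doc_values]   # each document holds only an integer

    counts = [0] * len(terms)
    for ordinal in per_doc_ords:                         # counting touches integers only
        counts[ordinal] += 1

    # Ordinals are resolved back to terms once, after the counts are done.
    return {terms[o]: c for o, c in enumerate(counts) if c}


authors = ["Blake", "Keats", "Blake", "Shelley", "Blake"]

# Approach 1: index the readable value; the result is directly usable.
readable = facet_counts(authors)

# Approach 2: index a custom ID, then resolve it again outside the search system.
label_of_id = {"a1": "Blake", "a2": "Keats", "a3": "Shelley"}   # hypothetical external mapping
id_of_label = {label: cid for cid, label in label_of_id.items()}
by_id = facet_counts([id_of_label[a] for a in authors])          # ordinal -> custom ID
resolved = {label_of_id[cid]: c for cid, c in by_id.items()}     # custom ID -> label (extra step)

print(readable)
print(resolved)
```

In this model both approaches produce the same counts; the second one just adds an external mapping to maintain and one more lookup per returned facet value.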