[ 
https://issues.apache.org/jira/browse/LUCENE-10188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17543989#comment-17543989
 ] 

Lu Xugang edited comment on LUCENE-10188 at 5/30/22 3:53 PM:
-------------------------------------------------------------

Hi [~jpountz], the implementation of SortedNumericDocValues#docValueCount() 
returns the number of values indexed per doc, and duplicate values are included 
in the count.

Does SortedSetDocValues#docValueCount() have the same semantics (currently they 
have the same javadoc description)? Or do you mean it should count only the 
number of unique values (ords)?
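
To make the two candidate semantics concrete, here is a minimal sketch in plain 
Java. It does not use the Lucene API: the class DocValueCountSketch and the 
methods countWithDuplicates and countUnique are hypothetical names that only 
model the two ways of counting, assuming a doc indexed with the values 3, 3, 7 
(numeric) or "a", "a", "b" (terms).

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class DocValueCountSketch {

    // Models the SORTED_NUMERIC reading: every indexed value counts,
    // so duplicates within one doc are included.
    static int countWithDuplicates(long[] valuesForDoc) {
        return valuesForDoc.length;
    }

    // Models the "unique ords" reading: values of one doc are
    // deduplicated first, and only the distinct ones are counted.
    static int countUnique(List<String> valuesForDoc) {
        return new TreeSet<>(valuesForDoc).size();
    }

    public static void main(String[] args) {
        long[] numeric = {3L, 3L, 7L};
        List<String> terms = Arrays.asList("a", "a", "b");

        System.out.println(countWithDuplicates(numeric)); // prints 3
        System.out.println(countUnique(terms));           // prints 2
    }
}
```

The question is which of these two counts SortedSetDocValues#docValueCount() 
is meant to return.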


was (Author: chrislu):
Hi [~jpountz], the implementation of SortedNumericDocValues#docValueCount() 
returns the number of values indexed per doc, and duplicate values are included 
in the count.

Does SortedSetDocValues#docValueCount() have the same semantics? Or do you 
mean it should count only the number of unique values (ords)?

> Give SortedSetDocValues a docValueCount()?
> ------------------------------------------
>
>                 Key: LUCENE-10188
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10188
>             Project: Lucene - Core
>          Issue Type: Wish
>            Reporter: Adrien Grand
>            Priority: Minor
>             Fix For: 10.0 (main), 9.2
>
>
> Theoretically, SortedSetDocValues gives codecs more options with regard to 
> how SORTED_SET doc values could store ords. However, in practice we currently 
> always store counts. Maybe giving SORTED_SET doc values an API that is closer 
> to the API of SORTED_NUMERIC doc values would be a better trade-off?


