[GitHub] [lucene] shubhamvishu opened a new pull request, #12548: Add API to compute vector similarity in DoubleValuesSource

2023-09-10 Thread via GitHub
shubhamvishu opened a new pull request, #12548: URL: https://github.com/apache/lucene/pull/12548 ### Description This PR addresses the issue #12394. It adds an API **`similarityToQueryVector`** to `DoubleValuesSource` to compute vector similarity scores between the query vector and t

[GitHub] [lucene] jpountz commented on pull request #12489: Add support for recursive graph bisection.

2023-09-10 Thread via GitHub
jpountz commented on PR #12489: URL: https://github.com/apache/lucene/pull/12489#issuecomment-1712928445 Regarding positions, the reproducibility paper noted that the algorithm helped term frequencies a bit, though not as much as docs. It doesn't say anything about positions, though I suspe

[GitHub] [lucene] jpountz commented on pull request #12489: Add support for recursive graph bisection.

2023-09-10 Thread via GitHub
jpountz commented on PR #12489: URL: https://github.com/apache/lucene/pull/12489#issuecomment-1712923358 > I wonder why stored fields index size wasn't really hurt nearly as much for wikibigall but was for wikimediumall? This is because wikimedium uses chunks of articles as documents,

[GitHub] [lucene] jpountz commented on pull request #12489: Add support for recursive graph bisection.

2023-09-10 Thread via GitHub
jpountz commented on PR #12489: URL: https://github.com/apache/lucene/pull/12489#issuecomment-1712779097 Wikibigall. Less space spent on doc values this time since I did not enable indexing of facets. There is a more significant size reduction of postings this time (-10.5%). This is not mis