date:20230912

[GitHub] [lucene] shubhamvishu commented on pull request #12183: Make some heavy query rewrites concurrent

2023-09-12 Thread via GitHub

shubhamvishu commented on PR #12183: URL: https://github.com/apache/lucene/pull/12183#issuecomment-1716957965 @jpountz I have made some changes to `TermStates#build` to unblock this PR and avoid the deadlock caused by the executor forking into itself, by checking if it's a `Thre

[GitHub] [lucene] Tony-X closed issue #12536: Remove `lastPosBlockOffset` from term metadata for Lucene90PostingsFormat

2023-09-12 Thread via GitHub

Tony-X closed issue #12536: Remove `lastPosBlockOffset` from term metadata for Lucene90PostingsFormat URL: https://github.com/apache/lucene/issues/12536 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [lucene] Tony-X commented on issue #12536: Remove `lastPosBlockOffset` from term metadata for Lucene90PostingsFormat

2023-09-12 Thread via GitHub

Tony-X commented on issue #12536: URL: https://github.com/apache/lucene/issues/12536#issuecomment-1716406470 https://github.com/apache/lucene/pull/12541 is merged and I'll close this one -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [lucene] Tony-X commented on a diff in pull request #12552: Make FSTPostingsFormat load FSTs off-heap

2023-09-12 Thread via GitHub

Tony-X commented on code in PR #12552: URL: https://github.com/apache/lucene/pull/12552#discussion_r1323531587 ## lucene/codecs/src/java/org/apache/lucene/codecs/memory/FSTTermsReader.java: ## @@ -191,7 +193,9 @@ final class TermsReader extends Terms { this.sumTotalTermFr

[GitHub] [lucene] msokolov commented on a diff in pull request #12552: Make FSTPostingsFormat load FSTs off-heap

2023-09-12 Thread via GitHub

msokolov commented on code in PR #12552: URL: https://github.com/apache/lucene/pull/12552#discussion_r1323494538 ## lucene/codecs/src/java/org/apache/lucene/codecs/memory/FSTTermsReader.java: ## @@ -191,7 +193,9 @@ final class TermsReader extends Terms { this.sumTotalTerm

[GitHub] [lucene] Tony-X opened a new pull request, #12552: Make FSTPostingsFormat load FSTs off-heap

2023-09-12 Thread via GitHub

Tony-X opened a new pull request, #12552: URL: https://github.com/apache/lucene/pull/12552 ### Description FSTs have supported off-heap loading for a while. As we were trying to use `FSTPostingsFormat` for some fields, we noticed that heap usage went up. Upon further investigation we reali

[GitHub] [lucene] jimczi opened a new pull request, #12551: Introduce dynamic segment efSearch to Knn{Byte|Float}VectorQuery

2023-09-12 Thread via GitHub

jimczi opened a new pull request, #12551: URL: https://github.com/apache/lucene/pull/12551 This PR introduces a new parameter known as 'efSearch' to the knn vector query. 'efSearch' governs the maximum size of the priority queue employed for nearest neighbor searches. As each segment may co

[GitHub] [lucene] jpountz merged pull request #12490: Reduce the overhead of ImpactsDISI.

2023-09-12 Thread via GitHub

jpountz merged PR #12490: URL: https://github.com/apache/lucene/pull/12490 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

[GitHub] [lucene] jimczi merged pull request #12529: Introduce a random vector scorer in HNSW builder/searcher

2023-09-12 Thread via GitHub

jimczi merged PR #12529: URL: https://github.com/apache/lucene/pull/12529 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apac

[GitHub] [lucene] mikemccand commented on pull request #12541: Document why we need `lastPosBlockOffset`

2023-09-12 Thread via GitHub

mikemccand commented on PR #12541: URL: https://github.com/apache/lucene/pull/12541#issuecomment-1715559983 I backported to 9.x as well ... annoying that GitHub doesn't state in summary that the above push was to 9.x (it's only reflected here because it referenced this PR). It does reflect

[GitHub] [lucene] mikemccand merged pull request #12541: Document why we need `lastPosBlockOffset`

2023-09-12 Thread via GitHub

mikemccand merged PR #12541: URL: https://github.com/apache/lucene/pull/12541 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.

[GitHub] [lucene] uschindler commented on pull request #12460: Allow reading binary doc values as a DataInput

2023-09-12 Thread via GitHub

uschindler commented on PR #12460: URL: https://github.com/apache/lucene/pull/12460#issuecomment-1715550666 To save more memory copies, the codec may use a slice from the underlying IndexInput directly to support both access apis. All file pointer checks would then be performed by the low l

[GitHub] [lucene] jpountz commented on a diff in pull request #12529: Introduce a random vector scorer in HNSW builder/searcher

2023-09-12 Thread via GitHub

jpountz commented on code in PR #12529: URL: https://github.com/apache/lucene/pull/12529#discussion_r1322897603 ## lucene/core/src/java/org/apache/lucene/util/hnsw/RandomVectorScorerProvider.java: ## @@ -0,0 +1,79 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [lucene] uschindler commented on pull request #12460: Allow reading binary doc values as a DataInput

2023-09-12 Thread via GitHub

uschindler commented on PR #12460: URL: https://github.com/apache/lucene/pull/12460#issuecomment-1715514900 > This has been a challenge so many times in the past, maybe it's time to add `seek()` support to `DataInput`? We have full random access (positional reads), if you extend the i

[GitHub] [lucene] stefanvodita commented on pull request #12337: Index arbitrary fields in taxonomy docs

2023-09-12 Thread via GitHub

stefanvodita commented on PR #12337: URL: https://github.com/apache/lucene/pull/12337#issuecomment-1715512722 Thank you for the review @mikemccand! I’ve integrated your feedback. Updatable doc values are definitely something to consider. For comparison, I coded up an [association facet fi

[GitHub] [lucene] stefanvodita commented on a diff in pull request #12337: Index arbitrary fields in taxonomy docs

2023-09-12 Thread via GitHub

stefanvodita commented on code in PR #12337: URL: https://github.com/apache/lucene/pull/12337#discussion_r1322872602 ## lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyIndexReader.java: ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software

[GitHub] [lucene] jimczi commented on pull request #12529: Introduce a random vector scorer in HNSW builder/searcher

2023-09-12 Thread via GitHub

jimczi commented on PR #12529: URL: https://github.com/apache/lucene/pull/12529#issuecomment-1715484871 Given that no further concerns have been raised, I am intending to merge this change soon. -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [lucene] jpountz commented on pull request #12490: Reduce the overhead of ImpactsDISI.

2023-09-12 Thread via GitHub

jpountz commented on PR #12490: URL: https://github.com/apache/lucene/pull/12490#issuecomment-1715453502 Another benchmark run on the last commit to make sure it still works as expected, and wikibigall this time instead of wikimedium10m: ``` TaskQPS base

[GitHub] [lucene] stefanvodita closed pull request #12550: [Demo] Per label association facet fields

2023-09-12 Thread via GitHub

stefanvodita closed pull request #12550: [Demo] Per label association facet fields URL: https://github.com/apache/lucene/pull/12550 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comme

[GitHub] [lucene] stefanvodita commented on pull request #12550: [Demo] Per label association facet fields

2023-09-12 Thread via GitHub

stefanvodita commented on PR #12550: URL: https://github.com/apache/lucene/pull/12550#issuecomment-1715245714 Cancelling right away, this is not meant to be merged. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

[GitHub] [lucene] stefanvodita opened a new pull request, #12550: [Demo] Per label association facet fields

2023-09-12 Thread via GitHub

stefanvodita opened a new pull request, #12550: URL: https://github.com/apache/lucene/pull/12550 ### Description A user could have data about facet labels. In the demo here, we record an author's popularity score, with authors being facet labels in an index of books. Today, use

[GitHub] [lucene] jpountz commented on pull request #12460: Allow reading binary doc values as a DataInput

2023-09-12 Thread via GitHub

jpountz commented on PR #12460: URL: https://github.com/apache/lucene/pull/12460#issuecomment-1715238722 > I think this approach defeats one of the main purposes for this change, that is to avoid allocating a byte array when reading doc values. I don't think we want BinaryDocValues to do tha

[GitHub] [lucene] iverase commented on pull request #12460: Allow reading binary doc values as a DataInput

2023-09-12 Thread via GitHub

iverase commented on PR #12460: URL: https://github.com/apache/lucene/pull/12460#issuecomment-1715224914 > I'm contemplating not introducing a new DataInputDocValues class, and instead have a dataInput() method on BinaryDocValues I think this approach defeats one of the main purposes f

[GitHub] [lucene] jpountz commented on a diff in pull request #12549: Run merge-on-full-flush even though no changes got flushed.

2023-09-12 Thread via GitHub

jpountz commented on code in PR #12549: URL: https://github.com/apache/lucene/pull/12549#discussion_r1322599113 ## lucene/core/src/test/org/apache/lucene/index/TestIndexWriterDelete.java: ## @@ -1315,7 +1315,8 @@ public void testTryDeleteDocument() throws Exception { w.addD

[GitHub] [lucene] jpountz commented on a diff in pull request #12549: Run merge-on-full-flush even though no changes got flushed.

2023-09-12 Thread via GitHub

jpountz commented on code in PR #12549: URL: https://github.com/apache/lucene/pull/12549#discussion_r1322592471 ## lucene/core/src/test/org/apache/lucene/index/TestIndexWriter.java: ## @@ -518,11 +518,10 @@ public void testFlushWithNoMerging() throws IOException { doc.add(n

[GitHub] [lucene] jpountz commented on pull request #12460: Allow reading binary doc values as a DataInput

2023-09-12 Thread via GitHub

jpountz commented on PR #12460: URL: https://github.com/apache/lucene/pull/12460#issuecomment-1715126194 The more I think of this change, the more I like it: most of the time, you would need to read data out of binary doc values, e.g. (variable-length) integers, strings, etc. and exposing b

26 matches
