[PR] Improve handling of NullPointerException in MMapDirectory's IndexInputs (check that the "closed" condition) [lucene]

2023-10-21 Thread via GitHub
uschindler opened a new pull request, #12705: URL: https://github.com/apache/lucene/pull/12705 See the dev thread by @msokolov @ https://lists.apache.org/thread/qts8wvrjs54gkgz04pk4p93fg0wjbq3o The handling of NPE is very special in ByteBufferIndexInput and also MemorySegmentIndexIn

Re: [PR] [DRAFT] Load vector data directly from the memory segment [lucene]

2023-10-21 Thread via GitHub
uschindler commented on PR #12703: URL: https://github.com/apache/lucene/pull/12703#issuecomment-1773761875 Hi, I was also thinking about this but came to a bit different setup. My problem here is that it is directly linking the code in the Java 20+ code to each other and adding instance

Re: [PR] Remove direct dependency of NodeHash to FST [lucene]

2023-10-21 Thread via GitHub
mikemccand commented on PR #12690: URL: https://github.com/apache/lucene/pull/12690#issuecomment-1773767253 Thanks @dungba88 -- looks great, I'll merge soon! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL abov

Re: [PR] Optimize computing number of levels in MultiLevelSkipListWriter#bufferSkip [lucene]

2023-10-21 Thread via GitHub
mikemccand merged PR #12653: URL: https://github.com/apache/lucene/pull/12653

Re: [PR] Optimize computing number of levels in MultiLevelSkipListWriter#bufferSkip [lucene]

2023-10-21 Thread via GitHub
mikemccand commented on PR #12653: URL: https://github.com/apache/lucene/pull/12653#issuecomment-1773768806 I merged to `main` and `9.x` (9.9)! Thanks @shubhamvishu.

Re: [PR] Remove direct dependency of NodeHash to FST [lucene]

2023-10-21 Thread via GitHub
mikemccand merged PR #12690: URL: https://github.com/apache/lucene/pull/12690

Re: [PR] Remove direct dependency of NodeHash to FST [lucene]

2023-10-21 Thread via GitHub
mikemccand commented on PR #12690: URL: https://github.com/apache/lucene/pull/12690#issuecomment-1773775196 Thanks @dungba88 -- I'll wait to backport this until after backporting https://github.com/apache/lucene/pull/12633

Re: [PR] Refactor ByteBlockPool so it is just a "shift/mask big array" [lucene]

2023-10-21 Thread via GitHub
stefanvodita commented on PR #12625: URL: https://github.com/apache/lucene/pull/12625#issuecomment-1773790152 I've rebased #12506. I like having a separate class for slice allocation, but if there's disagreement over that, I can put the code back in `TermsHashPerField`.

Re: [PR] Clean up ByteBlockPool [lucene]

2023-10-21 Thread via GitHub
stefanvodita commented on PR #12506: URL: https://github.com/apache/lucene/pull/12506#issuecomment-1773789994 The last commit is a large rebase + conflict resolution after #12625 got merged. What this PR does hasn't really changed.

Re: [PR] Clean up ByteBlockPool [lucene]

2023-10-21 Thread via GitHub
mikemccand commented on PR #12506: URL: https://github.com/apache/lucene/pull/12506#issuecomment-1773797296 Thanks @stefanvodita -- I'll try to have a look soon! And thank you for gracefully handling the "two people made very similar changes" situation :) This happens often in open s

Re: [PR] Refactor ByteBlockPool so it is just a "shift/mask big array" [lucene]

2023-10-21 Thread via GitHub
mikemccand commented on PR #12625: URL: https://github.com/apache/lucene/pull/12625#issuecomment-1773797762 Thanks @stefanvodita -- I'll try to have a look soon at your rebased PR #12506. And thank you for gracefully handling the "two people made very similar changes" situation :)

Re: [PR] [DRAFT] Load vector data directly from the memory segment [lucene]

2023-10-21 Thread via GitHub
ChrisHegarty commented on PR #12703: URL: https://github.com/apache/lucene/pull/12703#issuecomment-1773837768 > I am out of office the next week, I'd like to participate in the discussion; we should not rush anything. Take your time. Your input and ideas are very much welcome. We will

Re: [PR] Random access term dictionary [lucene]

2023-10-21 Thread via GitHub
bruno-roustant commented on PR #12688: URL: https://github.com/apache/lucene/pull/12688#issuecomment-1773923204 This is some code I wrote a long time ago. It has been tested and used, so I'm confident in the functional aspect, and it might benefit from a benchmark for perf. Le ve

Re: [I] Adding option to codec to disable patching in Lucene's PFOR encoding [lucene]

2023-10-21 Thread via GitHub
rmuir commented on issue #12696: URL: https://github.com/apache/lucene/issues/12696#issuecomment-1773935712 Should we just do more tests and start writing indexes without patching? Only a 4 percent disk savings? It is a lot of complexity, especially to vectorize. A runtime option is more ex

Re: [PR] Avoid object construction when linear searching arcs [lucene]

2023-10-21 Thread via GitHub
gf2121 commented on PR #12692: URL: https://github.com/apache/lucene/pull/12692#issuecomment-1773995253 Nightly benchmark shows fuzzy queries are a bit happy for this change: https://home.apache.org/~mikemccand/lucenebench/2023.10.19.18.03.18.html.