[PR] lint python scripts in dev-tools [lucene]

2025-03-01 Thread via GitHub
rmuir opened a new pull request, #14318: URL: https://github.com/apache/lucene/pull/14318 These scripts are important, but have zero static analysis, type-checking, or formatting. Add simple make-based build with CI hook for this. Format check is not yet enabled, we should "mak

Re: [PR] lint python scripts in dev-tools [lucene]

2025-03-01 Thread via GitHub
rmuir commented on PR #14318: URL: https://github.com/apache/lucene/pull/14318#issuecomment-2692256799 If we are ok with it, I would want to backport these changes and cleanups to 10.x branch, too, so that things start to become easier. -- This is an automated message from the Apache Git

[PR] Fix DirectIOIndexInput seek to not read when position is within buffer [lucene]

2025-03-01 Thread via GitHub
ChrisHegarty opened a new pull request, #14320: URL: https://github.com/apache/lucene/pull/14320 This commit changes DirectIOIndexInput::seek so that it repositions the indexInput's position without a read, when the new position is within the bounds of the current buffer. Prior to th
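The idea described in this PR, repositioning inside the current buffer instead of issuing a new read, can be illustrated with a minimal sketch. This is not Lucene's Java implementation: the class `SketchBufferedInput`, its fields, and the `reads` counter are hypothetical names chosen for illustration, and the sketch ignores the block-alignment requirements that real direct I/O imposes.

```python
import io


class SketchBufferedInput:
    """Hypothetical buffered reader; not Lucene's DirectIOIndexInput."""

    def __init__(self, raw, buffer_size=8):
        self.raw = raw                  # any seekable binary file-like object
        self.buffer_size = buffer_size
        self.buffer = b""               # bytes currently held in memory
        self.buffer_start = 0           # file offset of buffer[0]
        self.buffer_pos = 0             # read cursor inside the buffer
        self.reads = 0                  # number of reads issued to `raw`

    def _refill(self, file_pos):
        # Load a fresh buffer starting at file_pos; this is the costly path.
        self.raw.seek(file_pos)
        self.buffer = self.raw.read(self.buffer_size)
        self.buffer_start = file_pos
        self.buffer_pos = 0
        self.reads += 1

    def seek(self, pos):
        offset = pos - self.buffer_start
        if 0 <= offset < len(self.buffer):
            # Target is already buffered: just move the cursor, no read.
            self.buffer_pos = offset
        else:
            self._refill(pos)

    def read_byte(self):
        if self.buffer_pos >= len(self.buffer):
            # Buffer exhausted: continue with the next chunk of the file.
            self._refill(self.buffer_start + len(self.buffer))
            if not self.buffer:
                raise EOFError("read past end of input")
        value = self.buffer[self.buffer_pos]
        self.buffer_pos += 1
        return value


data = io.BytesIO(bytes(range(32)))
reader = SketchBufferedInput(data)
reader.seek(3)    # buffer is empty, so this reads from `data`
reader.seek(5)    # offset 5 is inside the buffered range: no read
reader.seek(20)   # outside the buffered range: reads again
```

In this sketch only the first and third seeks touch the underlying file; a version that refilled on every seek would have read three times.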

Re: [I] TestDirectIODirectory is slow on Ubuntu/Linux systems [lucene]

2025-03-01 Thread via GitHub
ChrisHegarty commented on issue #14315: URL: https://github.com/apache/lucene/issues/14315#issuecomment-2692428359 You found the bug @rmuir - when seeking, we're always reading from the file even when the new position is within the bounds of the current buffer. I raised this small PR #14320

Re: [I] TestDirectIODirectory is slow on Ubuntu/Linux systems [lucene]

2025-03-01 Thread via GitHub
rmuir commented on issue #14315: URL: https://github.com/apache/lucene/issues/14315#issuecomment-2691557207 When I run this seed on my system I see: ``` :lucene:misc:test (SUCCESS): 63 test(s), 2 skipped The slowest tests (exceeding 500 ms) during this run: 4.86s TestDirectIODi

Re: [PR] Binary vector format for flat and hnsw vectors [lucene]

2025-03-01 Thread via GitHub
gaoj0017 commented on PR #14078: URL: https://github.com/apache/lucene/pull/14078#issuecomment-2692580673 Thank you for acknowledging that our extended RaBitQ method proposes the idea of exploring different scalar quantization parameters on a per-vector basis for the first time and OSQ adop

Re: [PR] Optimize single value ranges for multivalue sortedset docvalues ranges. (#14276) [lucene]

2025-03-01 Thread via GitHub
mkhludnev merged PR #14317: URL: https://github.com/apache/lucene/pull/14317

Re: [I] TestDirectIODirectory is slow on Ubuntu/Linux systems [lucene]

2025-03-01 Thread via GitHub
rmuir commented on issue #14315: URL: https://github.com/apache/lucene/issues/14315#issuecomment-2692076156 From the strace output, it seems to me that we may not be buffering properly, so we keep re-reading the entire buffer over and over? Maybe we aren't cloning our buffer on slice() and

Re: [PR] Optimize single value ranges for multivalue sortedset docvalues ranges. [lucene]

2025-03-01 Thread via GitHub
mkhludnev merged PR #14276: URL: https://github.com/apache/lucene/pull/14276

Re: [I] Python release checker scripts need an update to use Java 23 [lucene]

2025-03-01 Thread via GitHub
dweiss commented on issue #14316: URL: https://github.com/apache/lucene/issues/14316#issuecomment-2692160562 It's in .github/workflows/run-nightly-smoketester.yml, but the python scripts may also require a manual review.

Re: [I] Python release checker scripts need an update to use Java 23 [lucene]

2025-03-01 Thread via GitHub
Brijeshthummar02 commented on issue #14316: URL: https://github.com/apache/lucene/issues/14316#issuecomment-2692137269 @dweiss I can fix it. Can you provide the path to the yml file?

Re: [I] Python release checker scripts need an update to use Java 23 [lucene]

2025-03-01 Thread via GitHub
dweiss closed issue #14316: Python release checker scripts need an update to use Java 23 URL: https://github.com/apache/lucene/issues/14316

Re: [I] develocity build scans fail to upload sometimes [lucene]

2025-03-01 Thread via GitHub
dweiss closed issue #14305: develocity build scans fail to upload sometimes URL: https://github.com/apache/lucene/issues/14305

Re: [PR] lint python scripts in dev-tools [lucene]

2025-03-01 Thread via GitHub
rmuir commented on PR #14318: URL: https://github.com/apache/lucene/pull/14318#issuecomment-2692327772 I will follow up with changes to iterate. I mainly want to detect common mistakes as a start; then it won't be so terrifying to change or review these files.

Re: [PR] lint python scripts in dev-tools [lucene]

2025-03-01 Thread via GitHub
rmuir merged PR #14318: URL: https://github.com/apache/lucene/pull/14318

Re: [I] TestDirectIODirectory is slow on Ubuntu/Linux systems [lucene]

2025-03-01 Thread via GitHub
rmuir commented on issue #14315: URL: https://github.com/apache/lucene/issues/14315#issuecomment-2691142322 I'd expect that BaseDirectoryTestCase is too intense for direct-io. This test has a lot of randomization like what you are seeing, and none of the IO will be cached on linux. I

[PR] python scripts: fix enough so that undefined variable analysis works [lucene]

2025-03-01 Thread via GitHub
rmuir opened a new pull request, #14319: URL: https://github.com/apache/lucene/pull/14319 Fix basic errors from linter and get undefined variable analysis working through the type-checker. This will detect common problems such as typos, instead of at runtime. The changes are straight
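To illustrate the class of problem this PR targets, here is a small sketch of a typo that surfaces only at runtime, on a rarely taken branch. The function `summarize`, the misspelled name `cout`, and the helper `unresolved_globals` are hypothetical and do not come from the dev-tools scripts. The helper is a deliberately crude stand-in for real undefined-variable analysis, not the linter or type-checker the PR uses: it only inspects the global names a compiled function refers to, and it would also pick up attribute names in code that uses them.

```python
import builtins


def summarize(paths, verbose=False):
    count = len(paths)
    if verbose:
        # Typo: 'cout' is never defined; this fails only when verbose=True.
        return f"{cout} files checked"
    return f"{count} files"


def unresolved_globals(func):
    """Return names the function references that are neither module globals
    nor builtins. Crude: co_names also contains attribute names."""
    known = func.__globals__
    return [
        name
        for name in func.__code__.co_names
        if name not in known and not hasattr(builtins, name)
    ]
```

Calling `summarize(["a", "b"])` works, so the typo can survive ordinary testing; `unresolved_globals(summarize)` reports `cout` without ever executing the broken branch, which is the benefit of catching such mistakes statically.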

Re: [I] TestDirectIODirectory is slow on Ubuntu/Linux systems [lucene]

2025-03-01 Thread via GitHub
rmuir commented on issue #14315: URL: https://github.com/apache/lucene/issues/14315#issuecomment-2691684579 easy to reproduce on linux if you want to dive in more: terminal 1: for monitoring whole system and locating PID ```console $ sudo perf top --sort overhead,overhead_us,over