[PR] Harden BaseDocValuesFormatTestCase [lucene]

2024-05-06 Thread via GitHub
dnhatn opened a new pull request, #13346: URL: https://github.com/apache/lucene/pull/13346 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-ma

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
rmuir commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2097001668 Thanks Uwe, maybe the correct solution is to simply add the api and implement with `madvise()` for MMapDirectory, for now? To me this is just another `madvise` being hooked in. I fe

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
uschindler commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096977864 https://bugs.openjdk.org/browse/JDK-8292771

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
rmuir commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096977066 ability to madvise/fadvise without resorting to native code would be awesome too. I don't know how it may translate to windows. but it seems like it does exactly what this PR wants to do:

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
uschindler commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096970275 > > We can't get the file handle in Java (still open issue). > > hmm, ok. I felt like we were able to get it somewhere thru the guts of nio/2 filesystem apis, maybe I am wrong?

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
rmuir commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096968924 > We can't get the file handle in Java (still open issue). hmm, ok. I felt like we were able to get it somewhere thru the guts of nio/2 filesystem apis, maybe I am wrong?

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
uschindler commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096965609 > If madvise does the trick for mmapdir, why not try POSIX_FADV_WILLNEED for the niofs case? We can't get the file handle in Java (still open issue).

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
rmuir commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096958396 If madvise does the trick for mmapdir, why not try POSIX_FADV_WILLNEED for the niofs case?

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
uschindler commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096948490 > It works on my Linux box and returns 4096. 🎉 We could now also fix my hack regarding smaller chunk sizes and just ensure the chunk size is greater than the page size to enable madvise

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
jpountz commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096944622 It works on my Linux box and returns 4096. :tada:

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
uschindler commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096921573 Yes, that was my idea. I also quickly implemented the page size problem. I haven't tested it (on Windows at the moment). If you like you could quickly check the return value on lin

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
jpountz commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096902937 > Can't we use the same filechannel and do a positional read in another thread (not async)? I gave it a try in the last commit, is this what you had in mind? The benchmark suggest

Re: [I] How to run tests with the Panama Vector implementation [lucene]

2024-05-06 Thread via GitHub
ChrisHegarty commented on issue #13344: URL: https://github.com/apache/lucene/issues/13344#issuecomment-2096743493 I really wanna spawn a new JVM with those options for a small set of tests, rather than have them infect all other running tests.

Re: [I] IndexWriter loses track of parent field when index is empty [lucene]

2024-05-06 Thread via GitHub
msokolov closed issue #13340: IndexWriter loses track of parent field when index is empty URL: https://github.com/apache/lucene/issues/13340

Re: [PR] gh-13340: Allow adding a parent field to an index with no fields [lucene]

2024-05-06 Thread via GitHub
msokolov merged PR #13341: URL: https://github.com/apache/lucene/pull/13341

Re: [PR] gh-13340: Allow adding a parent field to an index with no fields [lucene]

2024-05-06 Thread via GitHub
msokolov commented on PR #13341: URL: https://github.com/apache/lucene/pull/13341#issuecomment-2096490756 I'll push since there don't seem to be any concerns raised. If we later want to make the index metadata a first-class file on its own we can always do that.

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
jpountz commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096458975 Please go ahead.

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
uschindler commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096443560 Hi, > It looks like we need to have 2 options: > > * use try/catch around the madvise and do nothing if unaligned on 4k boundaries > * use the obsolete/deprecated `getpagesi

Re: [PR] Implement Weight#count for vector values in the FieldExistsQuery [lucene]

2024-05-06 Thread via GitHub
bugmakerr commented on PR #13322: URL: https://github.com/apache/lucene/pull/13322#issuecomment-2096390961 > I don't remember how vectors handle ghost fields, could this trigger a NPE if a field indexes vectors, then all docs that have vectors get merged away? The test shows that

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
uschindler commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096314678 > We can also use the HotSpot bean to get page size, but this fails on OpenJ9 or any 3rd party JVM. So we could try to get page size from HotSpot bean in Constants.java and save it in

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
uschindler commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096310679 > Thanks for taking a look Uwe, and suggesting approaches for the page size issue! By the way, feel free to push directly to the branch. > > > I also have some questions about t

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
jpountz commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096293398 Thanks for taking a look Uwe, and suggesting approaches for the page size issue! By the way, feel free to push directly to the branch. > I also have some questions about the NIOFS

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
uschindler commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096286224 We can also use the HotSpot bean to get page size, but this fails on OpenJ9 or any 3rd party JVM. So we could try to get page size from HotSpot bean in Constants.java and save it in Op

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
uschindler commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2096280645 Hi, give me some time to review. I got the concept! I also have some questions about the NIOFS one because I don't like to use twice as many file handles just for the prefetching.

Re: [I] How to run tests with the Panama Vector implementation [lucene]

2024-05-06 Thread via GitHub
uschindler commented on issue #13344: URL: https://github.com/apache/lucene/issues/13344#issuecomment-2096259809 No idea yet, still thinking about it!

Re: [I] How to run tests with the Panama Vector implementation [lucene]

2024-05-06 Thread via GitHub
uschindler commented on issue #13344: URL: https://github.com/apache/lucene/issues/13344#issuecomment-2096257331 > Those settings are really only useful when running the test which compares that the Panama and default implementations emit the same results. If we could somehow just limit set

Re: [PR] Deprecate COSINE VectorSimilarity function [lucene]

2024-05-06 Thread via GitHub
Pulkitg64 commented on PR #13308: URL: https://github.com/apache/lucene/pull/13308#issuecomment-2095782459 Thanks @benwtrent for all the feedback. I have raised another revision just to mark the COSINE function as deprecated. I didn't find any internal usages of this function. In a f

Re: [I] How to run tests with the Panama Vector implementation [lucene]

2024-05-06 Thread via GitHub
uschindler commented on issue #13344: URL: https://github.com/apache/lucene/issues/13344#issuecomment-2095762775 see https://github.com/apache/lucene/pull/12681

Re: [I] How to run tests with the Panama Vector implementation [lucene]

2024-05-06 Thread via GitHub
uschindler commented on issue #13344: URL: https://github.com/apache/lucene/issues/13344#issuecomment-2095759073 I think we must do some magic to not set those: https://github.com/apache/lucene/blob/40cae087f71a875478309abfc4b1ab1e7b027cab/gradle/testing/randomization.gradle#L108-L113

Re: [I] How to run tests with the Panama Vector implementation [lucene]

2024-05-06 Thread via GitHub
uschindler commented on issue #13344: URL: https://github.com/apache/lucene/issues/13344#issuecomment-2095734125 I think there should be the option to run this with defaults.

Re: [I] How to run tests with the Panama Vector implementation [lucene]

2024-05-06 Thread via GitHub
uschindler commented on issue #13344: URL: https://github.com/apache/lucene/issues/13344#issuecomment-2095612656 Could it be the ";" in your command line?

Re: [I] How to run tests with the Panama Vector implementation [lucene]

2024-05-06 Thread via GitHub
ChrisHegarty commented on issue #13344: URL: https://github.com/apache/lucene/issues/13344#issuecomment-2095546148 /cc @uschindler

Re: [PR] Make LRUQueryCache respect Accountable queries on eviction and consisten… [lucene]

2024-05-06 Thread via GitHub
justinmarygopal commented on PR #12614: URL: https://github.com/apache/lucene/pull/12614#issuecomment-2095502552 I am also facing the same issue after upgrading Elastic from 8.7 to 8.10. We then upgraded to 8.12 and are still seeing similar behaviour. Any suggestions please?

Re: [PR] Add a MemorySegment Vector scorer - for scoring without copying on-heap [lucene]

2024-05-06 Thread via GitHub
ChrisHegarty commented on code in PR #13339: URL: https://github.com/apache/lucene/pull/13339#discussion_r1590701005 ## lucene/core/src/java21/org/apache/lucene/internal/vectorization/MemorySegmentByteVectorScorerSupplier.java: ## @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-06 Thread via GitHub
jpountz commented on PR #13337: URL: https://github.com/apache/lucene/pull/13337#issuecomment-2095430556 @rmuir asked if we could add support for this on `MMapDirectory` via `madvise` + `POSIX_MADV_WILLNEED`. I pushed a new commit that does this (with several nocommits). This seems to perfo