Re: [PR] Change uses of withReadAdvice to use hints instead [lucene]
thecoop commented on code in PR #14510: URL: https://github.com/apache/lucene/pull/14510#discussion_r2071537608 ## lucene/test-framework/src/java/org/apache/lucene/tests/store/SerialIOCountingDirectory.java: ## @@ -72,7 +72,7 @@ public ChecksumIndexInput openChecksumInput(String name) throws IOException { @Override public IndexInput openInput(String name, IOContext context) throws IOException { -ReadAdvice readAdvice = context.readAdvice().orElse(Constants.DEFAULT_READADVICE); +ReadAdvice readAdvice = MMapDirectory.toReadAdvice(context); Review Comment: @jpountz Are these tests necessary as-is, or can they be removed & replaced with some other check? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Change uses of withReadAdvice to use hints instead [lucene]
thecoop commented on code in PR #14510: URL: https://github.com/apache/lucene/pull/14510#discussion_r2071728838 ## lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene101/Lucene101PostingsReader.java: ## @@ -149,9 +150,9 @@ public Lucene101PostingsReader(SegmentReadState state) throws IOException { IndexFileNames.segmentFileName( state.segmentInfo.name, state.segmentSuffix, Lucene101PostingsFormat.DOC_EXTENSION); try { - // Postings have a forward-only access pattern, so pass ReadAdvice.NORMAL to perform - // readahead. - docIn = state.directory.openInput(docName, state.context.withReadAdvice(ReadAdvice.NORMAL)); + docIn = + state.directory.openInput( + docName, state.context.withHints(FileTypeHint.DATA, FileDataHint.POSTINGS)); Review Comment: This doesn't set a DataAccessHint as it's not RANDOM or SEQUENTIAL, just NORMAL. I've tried to keep the initial hints as simple as possible here, without changing behaviour, so we can normalise and possibly modify behaviour in a subsequent PR -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Remove RANDOM_PRELOAD read advice, which is not actually used [lucene]
ChrisHegarty merged PR #14593: URL: https://github.com/apache/lucene/pull/14593 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] [Bug] Fix for stored fields force merge regression [lucene]
benwtrent commented on PR #14512: URL: https://github.com/apache/lucene/pull/14512#issuecomment-2847040963 I don't like the read-advice update, even for vectors (I am talking about the referenced PR). It seems we are trying to fix a self-inflicted bug. This has been an incredibly trappy API that has (it seems) provided nothing but woe. I have yet to see any prevailing evidence that changing read-advice for MMAP has provided any benefit at all :(. The actual fix seems to be fixing the underlying APIs: https://github.com/apache/lucene/pull/14510 I do not like the idea of updating a very general format API (stored fields) to fix an esoteric bug. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Change uses of withReadAdvice to use hints instead [lucene]
thecoop commented on code in PR #14510: URL: https://github.com/apache/lucene/pull/14510#discussion_r2071743905 ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -277,6 +277,32 @@ public final long getMaxChunkSize() { return 1L << chunkSizePower; } + public static ReadAdvice toReadAdvice(IOContext context) { +if (context.context() == IOContext.Context.MERGE +|| context.context() == IOContext.Context.FLUSH) { + return ReadAdvice.SEQUENTIAL; +} + +if (context.hints().contains(DataAccessHint.RANDOM)) { + return ReadAdvice.RANDOM; +} +if (context.hints().contains(DataAccessHint.SEQUENTIAL)) { + return ReadAdvice.SEQUENTIAL; +} + +if (context.hints().contains(FileTypeHint.DATA) +|| context.hints().contains(FileTypeHint.INDEX)) { + return ReadAdvice.NORMAL; +} +// Postings have a forward-only access pattern, so pass ReadAdvice.NORMAL to perform +// readahead. +if (context.hints().contains(FileDataHint.POSTINGS)) { + return ReadAdvice.NORMAL; +} Review Comment: Currently, `ReadAdvice.NORMAL` doesn't have a corresponding `DataAccessHint`, so you get that by not specifying the `DataAccessHint` but then as the default is now `ReadAdvice.RANDOM`, you need some way to explicitly specify 'I want the OS default behaviour here'. Maybe we do have a `DataAccessHint.DEFAULT`, but then doesn't `DataAccessHint` just mirror `ReadAdvice`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Change uses of withReadAdvice to use hints instead [lucene]
thecoop commented on code in PR #14510: URL: https://github.com/apache/lucene/pull/14510#discussion_r2071743905 ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -277,6 +277,32 @@ public final long getMaxChunkSize() { return 1L << chunkSizePower; } + public static ReadAdvice toReadAdvice(IOContext context) { +if (context.context() == IOContext.Context.MERGE +|| context.context() == IOContext.Context.FLUSH) { + return ReadAdvice.SEQUENTIAL; +} + +if (context.hints().contains(DataAccessHint.RANDOM)) { + return ReadAdvice.RANDOM; +} +if (context.hints().contains(DataAccessHint.SEQUENTIAL)) { + return ReadAdvice.SEQUENTIAL; +} + +if (context.hints().contains(FileTypeHint.DATA) +|| context.hints().contains(FileTypeHint.INDEX)) { + return ReadAdvice.NORMAL; +} +// Postings have a forward-only access pattern, so pass ReadAdvice.NORMAL to perform +// readahead. +if (context.hints().contains(FileDataHint.POSTINGS)) { + return ReadAdvice.NORMAL; +} Review Comment: Currently, `ReadAdvice.NORMAL` doesn't have a corresponding `DataAccessHint`, so you get that by not specifying the `DataAccessHint` but then as the default is now `ReadAdvice.RANDOM` (or configurable), you need some way to explicitly specify 'I want the OS default behaviour here'. Maybe we do have a `DataAccessHint.DEFAULT`, but then doesn't `DataAccessHint` just mirror `ReadAdvice`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Reduce NeighborArray heap memory [lucene]
benwtrent commented on PR #14527: URL: https://github.com/apache/lucene/pull/14527#issuecomment-2847706746 @weizijun It should be fixed. Having an estimation that is more than 2x off is pretty bad. this estimation is used to determine how often flushes should occur, etc. There are a couple of ways this can be fixed. A simple way could be providing an optional call-back to `NeighborArray` that accesses package-private method on `OnHeapHnswGraph` that allows for their individual estimation to be adjusted during array growth. `NeighborArray(OnHeapHnswGraph::updateEstimate...)` or something. Then the ram estimation in OnHeapHnswGraph becomes the accumulation of those estimates as the inner estimates evolve. We need to be cautious there with multi-threadedness as many node updates could be occuring at a time. So, likely this inner accumulator needs to be `LongAccumulator` and it should also assert that its always a positive number Please also adjust the inner arrays to enforce their maximal length. This way we never over-allocate. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] OptimisticKnnVectorQuery [lucene]
benwtrent commented on PR #14226: URL: https://github.com/apache/lucene/pull/14226#issuecomment-2848006782 I did a run over 4M vectors. One with bit quantization. In both cases, this optimistic pattern beats the baseline (sharing). Consistently, it gives better recall at better latency. I think that the vector ops count method I am using is "double counting" the vector ops... https://docs.google.com/spreadsheets/d/1B5L0Ob9Q_V7zwryYUJhk_T3d4zhbTLh17LezFWba08A/edit?usp=sharing -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] MultiRange query for SortedNumeric DocValues [lucene]
mkhludnev merged PR #14404: URL: https://github.com/apache/lucene/pull/14404 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[PR] MultiRange query for SortedNumeric DocValues (#14404) [lucene]
mkhludnev opened a new pull request, #14605: URL: https://github.com/apache/lucene/pull/14605 * Numeric Multi-Range DocValues query (cherry picked from commit 2eeb71877da4cbbc3033fe9a2c0a18f03550a070) ### Description -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[PR] Add a preload hint for preloading mmap data on specific open calls [lucene]
thecoop opened a new pull request, #14604: URL: https://github.com/apache/lucene/pull/14604 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] gradle-wrapper.jar will not be updated when its sha/version changes [lucene]
ChrisHegarty commented on issue #14598: URL: https://github.com/apache/lucene/issues/14598#issuecomment-2846847520 I noticed problems with the gradle wrapper also, e.g. ``` ./gradlew check no main manifest attribute, in /Users/chegar/git/lucene/gradle/wrapper/gradle-wrapper.jar ``` I just deleted and re-cloned and all is fine now. It downloaded a new wrapper, e.g. ``` Downloading gradle-wrapper.jar from https://raw.githubusercontent.com/gradle/gradle/v8.14.0/gradle/wrapper/gradle-wrapper.jar ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] MultiRange query for SortedNumeric DocValues (#14404) [lucene]
mkhludnev merged PR #14605: URL: https://github.com/apache/lucene/pull/14605 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Refactor doc values to expose a `DocIdSetIterator` instead of extending `DocIdSetIterator`. [lucene]
github-actions[bot] commented on PR #14475: URL: https://github.com/apache/lucene/pull/14475#issuecomment-2848311502 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contribution! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] [Bug] Fix for postings force merge regression [lucene]
github-actions[bot] commented on PR #14513: URL: https://github.com/apache/lucene/pull/14513#issuecomment-2848311449 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contribution! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Supporting both parentQuery and childQuery and adding childLimitPerParent in BlockJoin queries [lucene]
asimmahmood1 commented on issue #14565: URL: https://github.com/apache/lucene/issues/14565#issuecomment-2848314981 I think new query type is good idea, you're already mentioned new reasons like which score to use, and infact we used combination of parent and child score. For the short term you can limit how many children to match for each parent. Since ParentChildrenBlockJoinQuery returns all children, you could name new query ParentChildBlockJoinQuery, althought the name doesn't make to exactly obvious what it does. @msfroh Yes sounds like the same problem 😄 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Change uses of withReadAdvice to use hints instead [lucene]
jpountz commented on code in PR #14510: URL: https://github.com/apache/lucene/pull/14510#discussion_r2071603158 ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -277,6 +277,32 @@ public final long getMaxChunkSize() { return 1L << chunkSizePower; } + public static ReadAdvice toReadAdvice(IOContext context) { +if (context.context() == IOContext.Context.MERGE +|| context.context() == IOContext.Context.FLUSH) { + return ReadAdvice.SEQUENTIAL; +} + +if (context.hints().contains(DataAccessHint.RANDOM)) { + return ReadAdvice.RANDOM; +} +if (context.hints().contains(DataAccessHint.SEQUENTIAL)) { + return ReadAdvice.SEQUENTIAL; +} + +if (context.hints().contains(FileTypeHint.DATA) +|| context.hints().contains(FileTypeHint.INDEX)) { + return ReadAdvice.NORMAL; +} +// Postings have a forward-only access pattern, so pass ReadAdvice.NORMAL to perform +// readahead. +if (context.hints().contains(FileDataHint.POSTINGS)) { + return ReadAdvice.NORMAL; +} Review Comment: I feel like this default implementation should be dumb and trust the `DataAccessHint`, and the `FileDataHint` hint should only be used by users to override defaults when they know better about their data / access pattern? ## lucene/test-framework/src/java/org/apache/lucene/tests/store/SerialIOCountingDirectory.java: ## @@ -72,7 +72,7 @@ public ChecksumIndexInput openChecksumInput(String name) throws IOException { @Override public IndexInput openInput(String name, IOContext context) throws IOException { -ReadAdvice readAdvice = context.readAdvice().orElse(Constants.DEFAULT_READADVICE); +ReadAdvice readAdvice = MMapDirectory.toReadAdvice(context); Review Comment: I would expect this class to make the following assumptions - METADATA implies that the file is read once and its content is loaded into heap, - INDEX implies that the file should fit in the page cache, so count only one I/O (potentially per block) at open time - DATA implies that the file may be larger than the page cache, so it needs to increment the counter when moving to a different page. (Related to my other comment about the default implementation in `MMapDirectory` needing to be simple and not looking at the kind of data it's working on.) Currently, the test tries to differenciate between random-access data files and sequential-access data files, I'd be happy with either trusting the `DataAccessHint` or always assuming sequential access. ## lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene50/Lucene50CompoundReader.java: ## @@ -74,7 +74,7 @@ public Lucene50CompoundReader(Directory directory, SegmentInfo si) throws IOExce } expectedLength += CodecUtil.footerLength(); -handle = directory.openInput(dataFileName, IOContext.DEFAULT.withReadAdvice(ReadAdvice.NORMAL)); +handle = directory.openInput(dataFileName, IOContext.DEFAULT.withHints(FileTypeHint.DATA)); Review Comment: `FileTypeHint.DATA` feels confusing as this file has `METADATA` and `INDEX` sub files? ## lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene101/Lucene101PostingsReader.java: ## @@ -149,9 +150,9 @@ public Lucene101PostingsReader(SegmentReadState state) throws IOException { IndexFileNames.segmentFileName( state.segmentInfo.name, state.segmentSuffix, Lucene101PostingsFormat.DOC_EXTENSION); try { - // Postings have a forward-only access pattern, so pass ReadAdvice.NORMAL to perform - // readahead. - docIn = state.directory.openInput(docName, state.context.withReadAdvice(ReadAdvice.NORMAL)); + docIn = + state.directory.openInput( + docName, state.context.withHints(FileTypeHint.DATA, FileDataHint.POSTINGS)); Review Comment: Should this call also pass a `DataAccessHint`? The fact that it's sometimes provided and sometimes not is a bit confusing to me. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Change uses of withReadAdvice to use hints instead [lucene]
thecoop commented on code in PR #14510: URL: https://github.com/apache/lucene/pull/14510#discussion_r2071769318 ## lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/lucene50/Lucene50CompoundReader.java: ## @@ -74,7 +74,7 @@ public Lucene50CompoundReader(Directory directory, SegmentInfo si) throws IOExce } expectedLength += CodecUtil.footerLength(); -handle = directory.openInput(dataFileName, IOContext.DEFAULT.withReadAdvice(ReadAdvice.NORMAL)); +handle = directory.openInput(dataFileName, IOContext.DEFAULT.withHints(FileTypeHint.DATA)); Review Comment: Here, the handle isn't really used direct - its used to slice a new IndexInput, but then a replacement `IOContext` is specified there. So we don't actually need a hint/readadvice at all here? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Speed up flush of softdelete by intoBitset [lucene]
jpountz commented on code in PR #14552: URL: https://github.com/apache/lucene/pull/14552#discussion_r2071284659 ## lucene/core/src/java/org/apache/lucene/index/SingletonSortedNumericDocValues.java: ## @@ -57,6 +58,16 @@ public int advance(int target) throws IOException { return in.advance(target); } + @Override + public void intoBitSet(int upTo, FixedBitSet bitSet, int offset) throws IOException { +in.intoBitSet(upTo, bitSet, offset); + } + + @Override + public int docIDRunEnd() throws IOException { +return in.docIDRunEnd(); + } Review Comment: Maybe extract this change to a separate PR and cover all wrappers? ## lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java: ## @@ -526,6 +540,60 @@ public int nextDoc() throws IOException { } while (hasValue == false); return docIDOut; } + +@Override +public void intoBitSet(int upTo, FixedBitSet bitSet, int offset) throws IOException { + if (onDiskDocValues == null) { +super.intoBitSet(upTo, bitSet, offset); +return; + } + + // we need a scratch bitset because the param bitset doesn't allow bits to be cleared. + if (scratch == null || scratch.length() < upTo - offset) { Review Comment: Should we compare scratch.length() with bitSet.length() rather than `upTo - offset`? My concern is that there are multiple call sites that call intoBitSet with upTo=NO_MORE_DOCS, so this could create a giant bit set? ## lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java: ## @@ -526,6 +540,60 @@ public int nextDoc() throws IOException { } while (hasValue == false); return docIDOut; } + +@Override +public void intoBitSet(int upTo, FixedBitSet bitSet, int offset) throws IOException { + if (onDiskDocValues == null) { +super.intoBitSet(upTo, bitSet, offset); +return; + } + + // we need a scratch bitset because the param bitset doesn't allow bits to be cleared. + if (scratch == null || scratch.length() < upTo - offset) { +// intoBitSet is usually called with fixed window size so we do not do overSize here. +scratch = new FixedBitSet(upTo - offset); Review Comment: Maybe use FixedBitSet#ensureCapacity? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Change uses of withReadAdvice to use hints instead [lucene]
ChrisHegarty commented on code in PR #14510: URL: https://github.com/apache/lucene/pull/14510#discussion_r2071306868 ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -277,6 +277,32 @@ public final long getMaxChunkSize() { return 1L << chunkSizePower; } + public static ReadAdvice toReadAdvice(IOContext context) { Review Comment: Actually, this seems more like it's a default implementation of a static READ_ADVICE function, similar to other override-defaults pattern in this class. Otherwise, the user cannot override with empty, since the chaining will delegate. ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -277,6 +277,32 @@ public final long getMaxChunkSize() { return 1L << chunkSizePower; } + public static ReadAdvice toReadAdvice(IOContext context) { Review Comment: Can you please add some javadoc for the public method. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Change uses of withReadAdvice to use hints instead [lucene]
thecoop commented on code in PR #14510: URL: https://github.com/apache/lucene/pull/14510#discussion_r2071325241 ## lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java: ## @@ -277,6 +277,32 @@ public final long getMaxChunkSize() { return 1L << chunkSizePower; } + public static ReadAdvice toReadAdvice(IOContext context) { Review Comment: It's public so it can be accessed by `SerialIOCountingDirectory` - see comments there -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Supporting both parentQuery and childQuery in ToChildBlockJoinQuery and adding childLimitPerParent [lucene]
Jinny-Wang commented on issue #14565: URL: https://github.com/apache/lucene/issues/14565#issuecomment-2848301176 With more explorations in the code, I think it would be cleaner to just introduce a new query operator instead of trying to wrap both parentQuery and childQuery in ToChildBlockJoinQuery. - The current ToChildBlockJoinQuery uses [FilterWeight](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/FilterWeight.java) which only support one contained weight. If we start introducing another query, we will need 2 contained weights. Which means a custom `Weight` class would be required that does not extend the `FilterWeight`. This would make it harder to be 100% backward compatible with current code behavior. - The current score uses just the parentQuery's score to score. Introduction of childQuery will open up the option to use childQuery score / childQuery combined with parentQuery score etc. If we want to make the scoring behavior configurable, that's even more customization on top of exiting ToChildBlockJoinQuery. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org