Re: [PR] Initial upgrade to Gradle 8.14 (rc2) [lucene]
dweiss commented on PR #14533: URL: https://github.com/apache/lucene/pull/14533#issuecomment-2823279050 > I don't understand this one: their language server works fine with java 24 and i'm able to navigate the new JDK24 classfile api, etc. I guess I haven't used it extensively, due to the gradle issue, and they've got some bugfixes since the last release. I don't fully understand it either - the latest artifact for the batch ecj compiler (what we use) seems to be from march: https://repo1.maven.org/maven2/org/eclipse/jdt/ecj/3.41.0/ this doesn't support Java 24. I've just downloaded the latest Eclipse and it seems to contain the same batch compiler. Their website says this:  There are some marketplace extensions to support Java 24 but these seem to be external - so... I don't think Eclipse supports Java 24 yet? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] deps(java): bump org.apache.commons:commons-compress from 1.19 to 1.27.1 [lucene]
dweiss commented on PR #14540: URL: https://github.com/apache/lucene/pull/14540#issuecomment-2823322410 Ok, so here's how to work this out. First, show what's bringing in the conflicting dependencies, like the message shows: ``` ./gradlew :lucene:benchmark:dependencyInsight --dependency "commons-codec:commons-codec" --configuration "compileClasspath" ... commons-codec:commons-codec:1.17.1 \--- org.apache.commons:commons-compress:1.27.1 \--- compileClasspath ./gradlew :lucene:analysis:phonetic:dependencyInsight --dependency "commons-codec:commons-codec" --configuration "compileClasspath" ... commons-codec:commons-codec:1.17.2 \--- compileClasspath ``` The above means :lucene:benchmark pulls commons-codec as a transitive dependency and phonetic imports it directly. Previously, palantir's consistent-versions would try to make all dependencies "consistent" globally - unfortunately this isn't really working well with larger projects with more complicated configurations. This "dependency-locking" mechanism that I wrote and that we use in Lucene attempts to detect such inconsistencies but doesn't try to fix them. There are a few ways to fix this: * Add an explicit dependency on commons-codec or on :lucene:analysis:phonetic to :lucene:benchmark - then gradle will align the transitive dependency to the requested version. Here is what the dependency insight looks like after adding a dependency on phonetic: ``` ./gradlew :lucene:benchmark:dependencyInsight --dependency "commons-codec:commons-codec" --configuration "compileClasspath" ... \--- project :lucene:analysis:phonetic \--- compileClasspath commons-codec:commons-codec:1.17.1 -> 1.17.2 \--- org.apache.commons:commons-compress:1.27.1 \--- compileClasspath ``` note the auto-aligned dependency. * Add gradle's dependency constraint on the version of commons-codec to benchmarks. * remove :lucene:benchmark from the set of configurations included in the dependency lock checks (this would mean two different versions of commons-codec are used in different configurations). I went with the first option because this seems like the simplest one. Note that the latest version of commons-compress brings additional dependencies (commons-io and commons-lang3). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Tone down TestIndexWriterDelete.testDeleteAllRepeated (OOMs sometimes) [lucene]
dweiss closed issue #14508: Tone down TestIndexWriterDelete.testDeleteAllRepeated (OOMs sometimes) URL: https://github.com/apache/lucene/issues/14508 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Make TestIndexWriterDelete.testDeleteAllRepeated a monster test and force FSDirectory to prevent OOM. [lucene]
dweiss merged PR #14526: URL: https://github.com/apache/lucene/pull/14526 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
jpountz commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2056836360 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { +Map, List> hintClasses = + context.hints().stream().collect(Collectors.groupingBy(IOContext.FileOpenHint::getClass)); + +// there should only be one of FileType, FileData, DataAccess +List fileTypes = +hintClasses.getOrDefault(FileTypeHint.class, List.of()); +if (fileTypes.size() > 1) { + throw new IllegalArgumentException("Multiple file type hints specified: " + fileTypes); +} +List fileData = hintClasses.getOrDefault(FileDataHint.class, List.of()); +if (fileData.size() > 1) { + throw new IllegalArgumentException("Multiple file data hints specified: " + fileData); +} +List dataAccess = +hintClasses.getOrDefault(DataAccessHint.class, List.of()); +if (dataAccess.size() > 1) { + throw new IllegalArgumentException("Multiple data access hints specified: " + dataAccess); +} + } + + protected ReadAdvice toReadAdvice(IOContext context) { Review Comment: How would you customize read advices on a custom directory in the end? I would have assumed something like that if it was on IOContext: ```java Directory dir = new FilterDirectory(mmapDirectory) { @Override public IndexInput openInput(String name, IOContext context) { if (someCondition()) { // enforce specific read advice instead of relying on hints context = context.withReadAdvice(ReadAdvice.NORMAL); } return in.openInput(name, context); } } ``` I'm not sure how it can work with a method on `Directory` that would never be called by the wrapped directory? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Reduce the number of comparisons when lowerPoint is equal to upperPoint [lucene]
github-actions[bot] commented on PR #14267: URL: https://github.com/apache/lucene/pull/14267#issuecomment-2825838739 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contribution! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Integrating GPU based Vector Search using cuVS [lucene]
github-actions[bot] commented on PR #14131: URL: https://github.com/apache/lucene/pull/14131#issuecomment-2825838798 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contribution! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Run a workflow on dependabot's PRs (gradlew updateLicenses writeLocks) [lucene]
rmuir commented on issue #14506: URL: https://github.com/apache/lucene/issues/14506#issuecomment-2824924310 thanks for looking into this @dweiss ! I've got a PR to hopefully reduce the number of dependabot PRs for python and actions dependencies, since some of them release very often (multiple times a week): https://github.com/apache/lucene/pull/14544 I think it might help reduce the maintenance costs in a different way. I think most of the java dependencies don't update as frequently but we can revisit that one when we are "caught up" -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Impl intoBitset for IndexedDISI and Docvalues [lucene]
gf2121 commented on PR #14529: URL: https://github.com/apache/lucene/pull/14529#issuecomment-2824954402 I personally prefer not to eagerly load docs for advance/advanceExact, or maybe leave it to another issue/PR. I update code to one of the version during my local iteration, which reduces the complexity by using a single `orRange` instead of multiple round of `orRange`. I wonder if this will make sense to you. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Logic for collecting Histogram efficiently using Point Trees [lucene]
jpountz commented on code in PR #14439: URL: https://github.com/apache/lucene/pull/14439#discussion_r2056783050 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/plain/histograms/HistogramCollector.java: ## @@ -53,11 +63,31 @@ public LeafCollector getLeafCollector(LeafReaderContext context) throws IOExcept // The segment has no values, nothing to do. throw new CollectionTerminatedException(); } + +// We can use multi range traversal logic to collect the histogram on numeric +// field indexed as point for MATCH_ALL cases. In future, this can be extended +// for Point Range Query cases as well +if (context.reader().hasDeletions() == false +&& weight.count(context) == context.reader().maxDoc()) { Review Comment: I would check weight against `null`, applications may have collectors that don't set a weight. ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/plain/histograms/HistogramCollector.java: ## @@ -53,11 +63,31 @@ public LeafCollector getLeafCollector(LeafReaderContext context) throws IOExcept // The segment has no values, nothing to do. throw new CollectionTerminatedException(); } + +// We can use multi range traversal logic to collect the histogram on numeric +// field indexed as point for MATCH_ALL cases. In future, this can be extended +// for Point Range Query cases as well +if (context.reader().hasDeletions() == false Review Comment: You don't need this check, Weight#count already accounts for deletions. ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/plain/histograms/PointTreeBulkCollector.java: ## @@ -0,0 +1,227 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.sandbox.facet.plain.histograms; + +import static org.apache.lucene.search.DocIdSetIterator.NO_MORE_DOCS; + +import java.io.IOException; +import java.util.function.Function; +import org.apache.lucene.index.PointValues; +import org.apache.lucene.internal.hppc.LongIntHashMap; +import org.apache.lucene.search.DocIdSetIterator; +import org.apache.lucene.util.NumericUtils; + +class PointTreeBulkCollector { + private static Function bytesToLong(int numBytes) { +if (numBytes == Long.BYTES) { + // Used by LongPoint, DoublePoint + return a -> NumericUtils.sortableBytesToLong(a, 0); +} else if (numBytes == Integer.BYTES) { + // Used by IntPoint, FloatPoint, LatLonPoint, LatLonShape + return a -> (long) NumericUtils.sortableBytesToInt(a, 0); +} + +return null; + } + + static boolean canCollectEfficiently(final PointValues pointValues, final long bucketWidth) + throws IOException { +// TODO: Do we really need pointValues.getDocCount() == pointValues.size() for this optimization +if (pointValues == null +|| pointValues.getNumDimensions() != 1 +|| pointValues.getDocCount() != pointValues.size()) { + return false; +} + +final Function byteToLong = bytesToLong(pointValues.getBytesPerDimension()); +if (byteToLong == null) { + return false; +} + +long leafMinBucket = +Math.floorDiv(byteToLong.apply(pointValues.getMinPackedValue()), bucketWidth); +long leafMaxBucket = +Math.floorDiv(byteToLong.apply(pointValues.getMaxPackedValue()), bucketWidth); + +// We want that # leaf nodes is more than # buckets so that we can completely skip over +// some of the leaf nodes. Higher this ratio, more efficient it is than naive approach! +if ((pointValues.size() / 512) < (leafMaxBucket - leafMinBucket)) { Review Comment: Sounds good to me. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Move sloppySin into SloppyMath from GeoUtils [lucene]
jainankitk commented on code in PR #14516: URL: https://github.com/apache/lucene/pull/14516#discussion_r2056713597 ## lucene/core/src/java/org/apache/lucene/util/SloppyMath.java: ## @@ -178,6 +178,30 @@ public static double asin(double a) { } } + // some sloppyish stuff, do we really need this to be done in a sloppy way? + // unless it is performance sensitive, we should try to remove. + // This is performance sensitive, check SloppySinBenchmark and RectangleBenchmark + private static final double PIO2 = Math.PI / 2D; + + /** + * Returns the trigonometric sine of an angle converted as a cos operation. + * + * Note that this is not quite right... e.g. sin(0) != 0 + * + * Special cases: + * + * + * If the argument is {@code NaN} or an infinity, then the result is {@code NaN}. + * + * + * @param a an angle, in radians. + * @return the sine of the argument. + * @see Math#sin(double) + */ + public static double sin(double a) { Review Comment: > I'm not familiar with what the Rectangle code is doing here: I don't tend to think about sin() being associated with Rectanges... `sin()` is used for calculating and adjusting the longitude range of a circle's bounding Rectangle based on the latitude -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Logic for collecting Histogram efficiently using Point Trees [lucene]
jainankitk commented on code in PR #14439: URL: https://github.com/apache/lucene/pull/14439#discussion_r2056686537 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/plain/histograms/HistogramCollector.java: ## @@ -53,11 +65,30 @@ public LeafCollector getLeafCollector(LeafReaderContext context) throws IOExcept // The segment has no values, nothing to do. throw new CollectionTerminatedException(); } + +// We can use multi range traversal logic to collect the histogram on numeric +// field indexed as point for MATCH_ALL cases. In future, this can be extended +// for Point Range Query cases as well +if (query instanceof MatchAllDocsQuery && context.reader().hasDeletions() == false) { Review Comment: Oh yeah, that's a good point! Hadn't thought about queries on other fields. Changed -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Logic for collecting Histogram efficiently using Point Trees [lucene]
jainankitk commented on code in PR #14439: URL: https://github.com/apache/lucene/pull/14439#discussion_r2056834941 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/plain/histograms/HistogramCollector.java: ## @@ -53,11 +63,31 @@ public LeafCollector getLeafCollector(LeafReaderContext context) throws IOExcept // The segment has no values, nothing to do. throw new CollectionTerminatedException(); } + +// We can use multi range traversal logic to collect the histogram on numeric +// field indexed as point for MATCH_ALL cases. In future, this can be extended +// for Point Range Query cases as well +if (context.reader().hasDeletions() == false +&& weight.count(context) == context.reader().maxDoc()) { Review Comment: Added `null` check -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
thecoop commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055497713 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { +Map, List> hintClasses = + context.hints().stream().collect(Collectors.groupingBy(IOContext.FileOpenHint::getClass)); + +// there should only be one of FileType, FileData, DataAccess +List fileTypes = +hintClasses.getOrDefault(FileTypeHint.class, List.of()); +if (fileTypes.size() > 1) { + throw new IllegalArgumentException("Multiple file type hints specified: " + fileTypes); +} +List fileData = hintClasses.getOrDefault(FileDataHint.class, List.of()); +if (fileData.size() > 1) { + throw new IllegalArgumentException("Multiple file data hints specified: " + fileData); +} +List dataAccess = +hintClasses.getOrDefault(DataAccessHint.class, List.of()); +if (dataAccess.size() > 1) { + throw new IllegalArgumentException("Multiple data access hints specified: " + dataAccess); +} + } + + protected ReadAdvice toReadAdvice(IOContext context) { Review Comment: This becomes a lot more complicated in [subsequent PRs](https://github.com/apache/lucene/pull/14510/files#diff-a320c032354752cddf187b571d2832ca7a0dcfee5a1f3bda19578e2559338f8b) ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { +Map, List> hintClasses = + context.hints().stream().collect(Collectors.groupingBy(IOContext.FileOpenHint::getClass)); + +// there should only be one of FileType, FileData, DataAccess +List fileTypes = +hintClasses.getOrDefault(FileTypeHint.class, List.of()); +if (fileTypes.size() > 1) { + throw new IllegalArgumentException("Multiple file type hints specified: " + fileTypes); +} +List fileData = hintClasses.getOrDefault(FileDataHint.class, List.of()); +if (fileData.size() > 1) { + throw new IllegalArgumentException("Multiple file data hints specified: " + fileData); +} +List dataAccess = +hintClasses.getOrDefault(DataAccessHint.class, List.of()); +if (dataAccess.size() > 1) { + throw new IllegalArgumentException("Multiple data access hints specified: " + dataAccess); +} + } + + protected ReadAdvice toReadAdvice(IOContext context) { Review Comment: This becomes a lot more complicated in [subsequent PRs](https://github.com/apache/lucene/pull/14510/files#diff-a320c032354752cddf187b571d2832ca7a0dcfee5a1f3bda19578e2559338f8b). I'll add javadocs. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] deps(java): bump com.github.luben:zstd-jni from 1.5.5-11 to 1.5.7-2 [lucene]
dweiss merged PR #14505: URL: https://github.com/apache/lucene/pull/14505 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
thecoop commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055502471 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { +Map, List> hintClasses = + context.hints().stream().collect(Collectors.groupingBy(IOContext.FileOpenHint::getClass)); + +// there should only be one of FileType, FileData, DataAccess Review Comment: Because some uses can only specify some of the hint types - eg [here](https://github.com/apache/lucene/pull/14510/files#diff-6b2f0b03b74eed609dc835a669a38299f6bd24a3c2431cf7d58de27a869fc96e). I also want the ability to specify other hint types if necessary, eg [here](https://github.com/apache/lucene/pull/14509/commits/62fffb269edc6b7e981396a6a99f44b98d005c37) and [here](https://github.com/apache/lucene/pull/14510/files#diff-498e50abd4c94dc980f7bb096e63b64051a2ac77053f0a943643206699962bbc). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] deps(java): bump de.undercouch.download from 5.2.0 to 5.6.0 [lucene]
dweiss merged PR #14503: URL: https://github.com/apache/lucene/pull/14503 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Use a non-deprecated assertThat, and change several test assertions to use assertThat [lucene]
thecoop commented on code in PR #14518: URL: https://github.com/apache/lucene/pull/14518#discussion_r2055485408 ## lucene/core/src/test/org/apache/lucene/document/TestDocument.java: ## @@ -312,21 +324,21 @@ public void testFieldSetValue() throws Exception { // ensure that queries return expected results without DateFilter first ScoreDoc[] hits = searcher.search(query, 1000).scoreDocs; -assertEquals(3, hits.length); -int result = 0; +assertThat(hits, arrayWithSize(3)); +Set seen = new HashSet<>(); StoredFields storedFields = searcher.storedFields(); for (int i = 0; i < 3; i++) { Document doc2 = storedFields.document(hits[i].doc); Field f = (Field) doc2.getField("id"); - if (f.stringValue().equals("id1")) result |= 1; - else if (f.stringValue().equals("id2")) result |= 2; - else if (f.stringValue().equals("id3")) result |= 4; - else fail("unexpected id field"); + switch (f.stringValue()) { +case "id1", "id2", "id3" -> seen.add(f.stringValue()); +default -> fail("unexpected id field"); + } } writer.close(); reader.close(); dir.close(); -assertEquals("did not see all IDs", 7, result); +assertThat("did not see all IDs", seen, containsInAnyOrder("id1", "id2", "id3")); Review Comment: `containsInAnyOrder` fails if there are any extra items in the collection. More generally, `assertEquals(..., Set.of())` checks if the value is a `Set`, which is not actually important here. The collection could be a `List` and the test would still work. So checking for `Set` isn't required here for the test to pass, whereas `containsInAnyOrder` doesn't care about the type of collection used. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
thecoop commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055496607 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { Review Comment: A more useful implementation can be found [here](https://github.com/apache/lucene/pull/14510/files#diff-a320c032354752cddf187b571d2832ca7a0dcfee5a1f3bda19578e2559338f8b). There are a couple of reasons for putting it here: - I wanted it on `Directory` so that `IOContext` is just a data transfer object, it is not opinionated on what it holds - Overrides of `Directory` can override this to specify their own validation logic for any new hints they use themselves -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] ci: bump actions/stale from 5.2.0 to 9.1.0 [lucene]
stefanvodita commented on PR #14498: URL: https://github.com/apache/lucene/pull/14498#issuecomment-2823469274 Thanks David! I've enabled debug mode. I'll push and have a look at the logs, then disable debug mode again if nothing seems out of place. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] ci: bump actions/stale from 5.2.0 to 9.1.0 [lucene]
stefanvodita merged PR #14498: URL: https://github.com/apache/lucene/pull/14498 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
rmuir commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055620022 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { Review Comment: I still feel my question is unanswered. I think adding a new method to Directory is a Very Serious Thing and I dont understand why it absolutely must be here, as to me this doesn't make sense. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
thecoop commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055636449 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { Review Comment: I put it in `Directory` so it is alongside the `toReadAdvice` method, to keep `IOContext` as dumb as possible, and so it can be overridden by subclasses to add additional validation where needed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Use a non-deprecated assertThat, and change several test assertions to use assertThat [lucene]
rmuir commented on PR #14518: URL: https://github.com/apache/lucene/pull/14518#issuecomment-2823693691 my concern wasn't about verbosity: if you are concerned about this, steer clear of java! instead just keeping "barrier to entry" to understanding the tests low. This includes requiring developer to go and understand a third-party library in order to debug it. But if this is really more natural to most developers, then I'm ok. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
ChrisHegarty commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055674573 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { Review Comment: Adding the invariant that an IOContext can have only instance of a concrete hint type seems ok, but I wonder if we need the validation at all. Maybe the validation can rather be enforced in a test-specific AssertingXXX wrapper? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Move sloppySin into SloppyMath from GeoUtils [lucene]
rmuir commented on code in PR #14516: URL: https://github.com/apache/lucene/pull/14516#discussion_r2055723991 ## lucene/core/src/java/org/apache/lucene/util/SloppyMath.java: ## @@ -178,6 +178,30 @@ public static double asin(double a) { } } + // some sloppyish stuff, do we really need this to be done in a sloppy way? + // unless it is performance sensitive, we should try to remove. + // This is performance sensitive, check SloppySinBenchmark and RectangleBenchmark + private static final double PIO2 = Math.PI / 2D; + + /** + * Returns the trigonometric sine of an angle converted as a cos operation. + * + * Note that this is not quite right... e.g. sin(0) != 0 + * + * Special cases: + * + * + * If the argument is {@code NaN} or an infinity, then the result is {@code NaN}. + * + * + * @param a an angle, in radians. + * @return the sine of the argument. + * @see Math#sin(double) + */ + public static double sin(double a) { Review Comment: I agree it is better to have the benchmark. I'm not familiar with what the Rectangle code is doing here: I don't tend to think about sin() being associated with Rectanges... But the background is that the `cos()` and `asin()` are needed for this use-case: https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/document/LatLonPointDistanceComparator.java#L37-L43 We try hard to avoid calling these functions at all, but in some cases such as the sorting, many calls are needed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] TestForTooMuchCloning: too many calls to IndexInput.clone during TermRangeQuery: 7 [lucene]
rmuir commented on issue #14546: URL: https://github.com/apache/lucene/issues/14546#issuecomment-2823753882 This test is old, i think. The background IIRC is that for some Directory impls (e.g. NIOFSDirectory or others that use "buffered input"), `clone()` may cause at minimum a 1KB read/buffer refill. For MMapDirectory the clone() may be cheaper and so excessive cloning may go unnoticed, yet cause a performance issue for some. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Run a workflow on dependabot's PRs (gradlew updateLicenses writeLocks) [lucene]
dweiss commented on issue #14506: URL: https://github.com/apache/lucene/issues/14506#issuecomment-2823987442 So this would require some more extensive restructuring of workflows. A commit from within a workflow won't trigger the same set of checks (this is done to avoid endless loops in distributed jobs on gh side). So we'd need to extract any checks as a reusable action, then configure dependabot PRs to use a special workflow that updates locks, then checks if all other checks pass, then push the commit if they do. It is kind-of-doable but also a bit fragile. Let's live with manual update for now and see how pressing the issue is, then revisit. I attach the workflow I used for testing. [dependabot-update-shas.txt](https://github.com/user-attachments/files/19866430/dependabot-update-shas.txt) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Run a workflow on dependabot's PRs (gradlew updateLicenses writeLocks) [lucene]
dweiss closed issue #14506: Run a workflow on dependabot's PRs (gradlew updateLicenses writeLocks) URL: https://github.com/apache/lucene/issues/14506 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
thecoop commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2056374094 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { +Map, List> hintClasses = + context.hints().stream().collect(Collectors.groupingBy(IOContext.FileOpenHint::getClass)); + +// there should only be one of FileType, FileData, DataAccess +List fileTypes = +hintClasses.getOrDefault(FileTypeHint.class, List.of()); +if (fileTypes.size() > 1) { + throw new IllegalArgumentException("Multiple file type hints specified: " + fileTypes); +} +List fileData = hintClasses.getOrDefault(FileDataHint.class, List.of()); +if (fileData.size() > 1) { + throw new IllegalArgumentException("Multiple file data hints specified: " + fileData); +} +List dataAccess = +hintClasses.getOrDefault(DataAccessHint.class, List.of()); +if (dataAccess.size() > 1) { + throw new IllegalArgumentException("Multiple data access hints specified: " + dataAccess); +} + } + + protected ReadAdvice toReadAdvice(IOContext context) { Review Comment: Does the logic [here](https://github.com/apache/lucene/pull/14510/files#diff-a320c032354752cddf187b571d2832ca7a0dcfee5a1f3bda19578e2559338f8bR107) really belong in `IOContext`? It feels like `ReadAdvice` is now more an implementation detail of `Directory`, so the mapping from hints -> ReadAdvice belongs in `Directory`. IOContext is more a POJO that has little opinions on what hints it passes thorough. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
thecoop commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2056382795 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { +Map, List> hintClasses = + context.hints().stream().collect(Collectors.groupingBy(IOContext.FileOpenHint::getClass)); + +// there should only be one of FileType, FileData, DataAccess +List fileTypes = +hintClasses.getOrDefault(FileTypeHint.class, List.of()); +if (fileTypes.size() > 1) { + throw new IllegalArgumentException("Multiple file type hints specified: " + fileTypes); +} +List fileData = hintClasses.getOrDefault(FileDataHint.class, List.of()); +if (fileData.size() > 1) { + throw new IllegalArgumentException("Multiple file data hints specified: " + fileData); +} +List dataAccess = +hintClasses.getOrDefault(DataAccessHint.class, List.of()); +if (dataAccess.size() > 1) { + throw new IllegalArgumentException("Multiple data access hints specified: " + dataAccess); +} + } + + protected ReadAdvice toReadAdvice(IOContext context) { Review Comment: There's various options here allowing different ReadAdvice for compound directories - one is a subinterface of `IOContext` with an extra method to get file-specific hints that `CompoundDirectory` implementations would know how to access. This PR isn't the final implementation, there's going to be a few changes as the use of hints is expanded. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Compute the doc range more efficiently when flushing doc block [lucene]
stefanvodita commented on PR #14447: URL: https://github.com/apache/lucene/pull/14447#issuecomment-2823807598 Sorry, I don't know if I was clear enough. I was asking if we can also revert the change to `Lucene101PostingsWriter` and only change `Lucene103PostingsWriter` this time. Thanks for adding the changelog entry! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Compute the doc range more efficiently when flushing doc block [lucene]
bugmakerr commented on PR #14447: URL: https://github.com/apache/lucene/pull/14447#issuecomment-2823829947 > Sorry, I don't know if I was clear enough. I was asking if we can also revert the change to `Lucene101PostingsWriter` and only change `Lucene103PostingsWriter` this time. > > Thanks for adding the changelog entry! My fault, I forgot to revert. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] A specialized Trie for Block Tree Index [lucene]
gf2121 commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2824644842 Hi @stefanvodita We plan to allow CI chew on this change for a couple of weeks and backport it if everything goes well. See https://github.com/apache/lucene/pull/14333#issuecomment-2789601238. When backporting, i'll try to catch all changes made to `Lucene103Codec`, including https://github.com/apache/lucene/pull/14447. So for now, feel free to just merge it into main :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] A specialized Trie for Block Tree Index [lucene]
stefanvodita commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2824654704 You anticipated my concern with #14447 😄 Thank you for handling this! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Fix TestForTooMuchCloning [lucene]
gf2121 merged PR #14547: URL: https://github.com/apache/lucene/pull/14547 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
jpountz commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055940755 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { Review Comment: Could this be a compile-time invariant instead of a runtime invariant, e.g. ~~by tracking the 3 hints separately, or~~ by using something similar to Guava's ClassToInstanceMap instead of a Set? EDIT: lated read the explanation why it can't be 3 different hints -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
jpountz commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055940755 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { Review Comment: Could this be a compile-time invariant instead of a runtime invariant, e.g. by tracking the 3 hints separately, or by using something similar to Guava's ClassToInstanceMap instead of a Set? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] TestForTooMuchCloning: too many calls to IndexInput.clone during TermRangeQuery: 7 [lucene]
mikemccand commented on issue #14546: URL: https://github.com/apache/lucene/issues/14546#issuecomment-2824159527 +1 to bump the clone limit to 7 so the test passes again -- clone inflation!! `clone` is likely super cheap for `MMapDirectory`? Hmm it is the JDK version specific `MemorySegmentIndexInput` (with `SingleSegmentImpl` subclass to optimize the common case ) that is actually cloned, and it's making a new class instance, setting a few members, making a shallow copy of the actual mapped segments array, and (hmm) calling `.toString()` on the to-be-cloned `IndexInput` heh (this just returns the `resourceDescription` member). So yeah it looks quite cheap. Also, this clone cost (fixed, one-time cost per segment) is amortized away as the query + segment matches more and more hits. So crazy fast queries matching very few hits are a wee bit slower and still crazy fast, and slow queries matching many hits are not affected in any meaningful way ... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
thecoop commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055991981 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { Review Comment: OK, after looking at it a bit, I've just moved the hint checking into `IOContext` in ae1bc87. It does mean that no hint at all can have more than one instance as a hint though - but that might be ok? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Logic for collecting Histogram efficiently using Point Trees [lucene]
mikemccand commented on PR #14439: URL: https://github.com/apache/lucene/pull/14439#issuecomment-2824226140 This is a nice optimization, using points (if the user indexed them) to carefully optimize counting of ranges. > @stefanvodita - Thanks for a prompt review. Addressed most of the review comments. Adding JMH benchmark instead of the not so useful performance test added earlier. The benchark results demonstrate significant increase in throughput with increasing # documents and bucket width (lesser buckets mean less low level traversal in point range tree and more documents collected in bulk): Oooh those JMH benchy results are nice! Though, it's dangerous testing only on random data -- you can draw random conclusions/results. But it's better than no benchmark! Maybe we should add histogram faceting benchy to Lucene's nightly benchmarks? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Logic for collecting Histogram efficiently using Point Trees [lucene]
jpountz commented on code in PR #14439: URL: https://github.com/apache/lucene/pull/14439#discussion_r2056150570 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/plain/histograms/HistogramCollector.java: ## @@ -53,11 +65,30 @@ public LeafCollector getLeafCollector(LeafReaderContext context) throws IOExcept // The segment has no values, nothing to do. throw new CollectionTerminatedException(); } + +// We can use multi range traversal logic to collect the histogram on numeric +// field indexed as point for MATCH_ALL cases. In future, this can be extended +// for Point Range Query cases as well +if (query instanceof MatchAllDocsQuery && context.reader().hasDeletions() == false) { Review Comment: I'ml not sure if we're talking about the same case. You seem to be talking about the case when the histogram is computed on the same field as the range query. I was referring to the case when a query is not a `MatchAllDocsQuery` but still fully matches a segment, e.g. a `TermQuery` on a stop word like `a` or a wide `PointRangeQuery` (not necessarily on the same field as the histogram). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
thecoop commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2056382795 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { +Map, List> hintClasses = + context.hints().stream().collect(Collectors.groupingBy(IOContext.FileOpenHint::getClass)); + +// there should only be one of FileType, FileData, DataAccess +List fileTypes = +hintClasses.getOrDefault(FileTypeHint.class, List.of()); +if (fileTypes.size() > 1) { + throw new IllegalArgumentException("Multiple file type hints specified: " + fileTypes); +} +List fileData = hintClasses.getOrDefault(FileDataHint.class, List.of()); +if (fileData.size() > 1) { + throw new IllegalArgumentException("Multiple file data hints specified: " + fileData); +} +List dataAccess = +hintClasses.getOrDefault(DataAccessHint.class, List.of()); +if (dataAccess.size() > 1) { + throw new IllegalArgumentException("Multiple data access hints specified: " + dataAccess); +} + } + + protected ReadAdvice toReadAdvice(IOContext context) { Review Comment: There's various options here allowing different ReadAdvice for compound directories - one is a subinterface of `IOContext` with an extra method to get file-specific hints that `CompoundDirectory` implementations would know how to access. Other options are available of course. This PR isn't the final implementation, there's going to be a few changes as the use of hints is expanded. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] TestForTooMuchCloning: too many calls to IndexInput.clone during TermRangeQuery: 7 [lucene]
gf2121 commented on issue #14546: URL: https://github.com/apache/lucene/issues/14546#issuecomment-2823836350 Thanks @ChrisHegarty for git bisect and @rmuir for the context. (I allowed myself to edit your comment to point to the correct PR.) I knew the new trie will clone the indexInput twice instead of FST only once so I changed the limitation from 5 to 6 per segment, while i did not expect it will clone 7 times so much. I printed out all the clone traces: 1. at org.apache.lucene.codecs.lucene103.blocktree.FieldReader.newReader(FieldReader.java:92) 2. at org.apache.lucene.codecs.lucene103.blocktree.TrieReader.(TrieReader.java:79) 3. org.apache.lucene.codecs.lucene103.blocktree.IntersectTermsEnum.(IntersectTermsEnum.java:85) 4. org.apache.lucene.codecs.lucene103.blocktree.FieldReader.newReader(FieldReader.java:92) 5. org.apache.lucene.codecs.lucene103.blocktree.TrieReader.(TrieReader.java:79) 6. at org.apache.lucene.codecs.lucene103.blocktree.SegmentTermsEnum.initIndexInput(SegmentTermsEnum.java:83) 7. org.apache.lucene.codecs.lucene103.Lucene103PostingsReader$BlockPostingsEnum.reset(Lucene103PostingsReader.java:517) So the `RangeQuery` intersect with `IntersectTermsEnum` (3 times), then `seeExact` with `SegmentTermsEnum` (3 times), and get postings (1 times). The TrieReader is constructed twice so we cloned twice more. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] TestForTooMuchCloning: too many calls to IndexInput.clone during TermRangeQuery: 7 [lucene]
gf2121 commented on issue #14546: URL: https://github.com/apache/lucene/issues/14546#issuecomment-2824072414 As the the added clones are cloning index file `tip` of term dictionary, which is usually opened with mmap. I raised a fix https://github.com/apache/lucene/pull/14547 to simply increase the limit of clones per segment from 6 to 7. Feel free to beat me if we think it is too many. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
thecoop commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055991981 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { Review Comment: OK, after looking at it a bit, I've just moved the hint checking into `IOContext`. It does mean that no hint at all can have more than one instance as a hint though - but that might be ok? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Add AnytimeRankingSearcher for SLA-Aware Early Termination with Bin-Based Score Boosting [lucene]
atris commented on PR #14525: URL: https://github.com/apache/lucene/pull/14525#issuecomment-2824857619 @jpountz Thanks! Here is the paper: https://arxiv.org/abs/2104.08976 Note that the core inspiration of this PR's approach comes from the paper, but the implementation diverges in certain ways: The paper talks about using bins mainly for hard cutoffs and filtering. The PR, instead, uses bins to compute adaptive score boosts, and wire that directly using the new Collector. The PR also adds: • index-time graph-based binning (exact + approximate). This adds minimal indexing latency but gives significant improvements in search time. • bin-level boosting at segment level So while the high-level idea overlaps, the implementation is more ambitious and also opens doors for implementations like bin skipping and multiple fields support and then use graph intersection or fusion to identify documents that are strongly connected across multiple semantic dimensions. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
thecoop commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055991981 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { Review Comment: OK, after looking at it a bit, I've just moved the hint checking into `IOContext` in ae1bc87 in a type-independent way. It does mean that no hint at all can have more than one instance as a hint though - but that might be ok? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Compute the doc range more efficiently when flushing doc block [lucene]
stefanvodita merged PR #14447: URL: https://github.com/apache/lucene/pull/14447 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
thecoop commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055706972 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { Review Comment: Or maybe add the validation to `BaseDirectoryWrapper`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Initial upgrade to Gradle 8.14 (rc2) [lucene]
dweiss commented on PR #14533: URL: https://github.com/apache/lucene/pull/14533#issuecomment-2823770803 I've no idea either. All I know is that if I download eclipse-jdt, it doesn't have support for Java 24 yet. There is a milestone version which supposedly has it - but not an official stable version, as far as I can see. And the maven repo definitely doesn't have the batch compiler supporting java24. That eclipse-jdls has a fork of jdt... which is also weird. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
jpountz commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055959591 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { +Map, List> hintClasses = + context.hints().stream().collect(Collectors.groupingBy(IOContext.FileOpenHint::getClass)); + +// there should only be one of FileType, FileData, DataAccess +List fileTypes = +hintClasses.getOrDefault(FileTypeHint.class, List.of()); +if (fileTypes.size() > 1) { + throw new IllegalArgumentException("Multiple file type hints specified: " + fileTypes); +} +List fileData = hintClasses.getOrDefault(FileDataHint.class, List.of()); +if (fileData.size() > 1) { + throw new IllegalArgumentException("Multiple file data hints specified: " + fileData); +} +List dataAccess = +hintClasses.getOrDefault(DataAccessHint.class, List.of()); +if (dataAccess.size() > 1) { + throw new IllegalArgumentException("Multiple data access hints specified: " + dataAccess); +} + } + + protected ReadAdvice toReadAdvice(IOContext context) { Review Comment: This may also help with making things work on compound files, it looks like it is not possible to customize the read advice of sub-files of a compound file with the current approach? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
jpountz commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055961207 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { +Map, List> hintClasses = + context.hints().stream().collect(Collectors.groupingBy(IOContext.FileOpenHint::getClass)); + +// there should only be one of FileType, FileData, DataAccess Review Comment: Thanks, that makes sense. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] TestForTooMuchCloning: too many calls to IndexInput.clone during TermRangeQuery: 7 [lucene]
gf2121 closed issue #14546: TestForTooMuchCloning: too many calls to IndexInput.clone during TermRangeQuery: 7 URL: https://github.com/apache/lucene/issues/14546 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Initial upgrade to Gradle 8.14 (rc2) [lucene]
rmuir commented on PR #14533: URL: https://github.com/apache/lucene/pull/14533#issuecomment-2823807465 Yeah I think "big old eclipse" lags behind, whereas the jdtls is "the latest" and sees more usage (e.g. all VSCode users and other editors). There's even a plugin to allow use of jdtls in "big old eclipse": https://github.com/redhat-developer/eclipseide-jdtls/ very confusing ecosystem, but that's my read on it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Initial upgrade to Gradle 8.14 (rc2) [lucene]
dweiss commented on PR #14533: URL: https://github.com/apache/lucene/pull/14533#issuecomment-2823791230 Eclipse 4.36M1 seems to have Java 24 support in the batch compiler. This is marked as "stable" but artifacts for this release are not on maven central yet. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Initial upgrade to Gradle 8.14 (rc2) [lucene]
dweiss commented on PR #14533: URL: https://github.com/apache/lucene/pull/14533#issuecomment-2823856413 I believe they use p2 repositories for interim releases and only do a maven push once in a quarter for stable final releases. If this becomes a problem, we can pull ecj directly from p2 drops, like here: https://www.eclipse.org/downloads/download.php?file=/eclipse/downloads/drops4/S-4.36M1-202504031800/ecj-4.36M1.jar&r=1 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[PR] Fix TestForTooMuchCloning [lucene]
gf2121 opened a new pull request, #14547: URL: https://github.com/apache/lucene/pull/14547 closes #14546 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
jpountz commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055947695 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { +Map, List> hintClasses = + context.hints().stream().collect(Collectors.groupingBy(IOContext.FileOpenHint::getClass)); + +// there should only be one of FileType, FileData, DataAccess +List fileTypes = +hintClasses.getOrDefault(FileTypeHint.class, List.of()); +if (fileTypes.size() > 1) { + throw new IllegalArgumentException("Multiple file type hints specified: " + fileTypes); +} +List fileData = hintClasses.getOrDefault(FileDataHint.class, List.of()); +if (fileData.size() > 1) { + throw new IllegalArgumentException("Multiple file data hints specified: " + fileData); +} +List dataAccess = +hintClasses.getOrDefault(DataAccessHint.class, List.of()); +if (dataAccess.size() > 1) { + throw new IllegalArgumentException("Multiple data access hints specified: " + dataAccess); +} + } + + protected ReadAdvice toReadAdvice(IOContext context) { Review Comment: Then can we move it to IOContext instead of Directory? `FilterDirectory` could customize the read advice by wrapping the IOContext that is passed to createOutput/openInput. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
thecoop commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055950140 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { Review Comment: I like that idea - so theres no actual logic involved, but you're only allowed one instance of each type in `IOContext`, whilst not being restrictive on how many hints can be specified. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Fix leadCost calculation in BooleanScorerSupplier.requiredBulkScorer [lucene]
benwtrent commented on PR #14543: URL: https://github.com/apache/lucene/pull/14543#issuecomment-2824317132 🤔 seems like this is another candidate for a 10.2.1 bugfix release. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] A specialized Trie for Block Tree Index [lucene]
stefanvodita commented on PR #14333: URL: https://github.com/apache/lucene/pull/14333#issuecomment-2824596025 If we think the issue last week was with the tests, should we go ahead and back-port this change? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Impl intoBitset for IndexedDISI and Docvalues [lucene]
gf2121 commented on code in PR #14529: URL: https://github.com/apache/lucene/pull/14529#discussion_r2057617346 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/IndexedDISI.java: ## @@ -491,6 +492,14 @@ public int advance(int target) throws IOException { return doc; } + @Override + public void intoBitSet(int upTo, FixedBitSet bitSet, int offset) throws IOException { +assert doc >= offset; Review Comment: > (or maybe a better place for this would be asserting doc values +1 > It looks like some implementations of intoBitsetWithinBlock try to cover the case when the current doc doesn't exist, maybe they can be simplified? Done :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] ci: bump actions/stale from 5.2.0 to 9.1.0 [lucene]
dweiss commented on PR #14498: URL: https://github.com/apache/lucene/pull/14498#issuecomment-2823456560 @stefanvodita - sorry for missing your comment, please do! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] deps(java): bump flexmark from 0.61.24 to 0.64.8 [lucene]
dweiss commented on PR #14499: URL: https://github.com/apache/lucene/pull/14499#issuecomment-2823452992 Documentation markdown-to-html dependency. Tested by comparing the output manually (no difference). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] deps(java): bump flexmark from 0.61.24 to 0.64.8 [lucene]
dweiss merged PR #14499: URL: https://github.com/apache/lucene/pull/14499 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] deps(java): bump de.undercouch.download from 5.2.0 to 5.6.0 [lucene]
dweiss commented on PR #14503: URL: https://github.com/apache/lucene/pull/14503#issuecomment-2823442570 build script/ dataset download dependency. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[I] TestForTooMuchCloning: too many calls to IndexInput.clone during TermRangeQuery: 7 [lucene]
ChrisHegarty opened a new issue, #14546: URL: https://github.com/apache/lucene/issues/14546 ### Description ``` TestForTooMuchCloning > test FAILED java.lang.AssertionError: too many calls to IndexInput.clone during TermRangeQuery: 7 at __randomizedtesting.SeedInfo.seed([3B8D209D5C693E94:B3D91F47F295536C]:0) at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.assertTrue(Assert.java:42) at org.apache.lucene.index.TestForTooMuchCloning.test(TestForTooMuchCloning.java:88) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) at java.base/java.lang.reflect.Method.invoke(Method.java:580) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1763) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996) at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48) at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45) at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60) at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902) at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43) at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44) at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60) at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850) at java.base/java.lang.Thread.run(Thread.java:1575) ``` ### Gradle command to reproduce ./gradlew test --tests TestForTooMuchCloning.test -Dtests.seed=3B8D209D5C693E94 -Dtests.locale=kk-KZ -Dtests.timezone=America/Indiana/Marengo -Dtests.asserts=true -Dtests.file.encoding=UTF-8 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe
Re: [PR] Create file open hints on IOContext to replace ReadAdvice [lucene]
thecoop commented on code in PR #14482: URL: https://github.com/apache/lucene/pull/14482#discussion_r2055496607 ## lucene/core/src/java/org/apache/lucene/store/Directory.java: ## @@ -79,6 +83,31 @@ public abstract class Directory implements Closeable { */ public abstract long fileLength(String name) throws IOException; + protected void validateIOContext(IOContext context) { Review Comment: A more useful implementation can be found [here](https://github.com/apache/lucene/pull/14510/files#diff-a320c032354752cddf187b571d2832ca7a0dcfee5a1f3bda19578e2559338f8b). There are a couple of reasons for putting it here: - I wanted it on `Directory` so that `IOContext` is just a data transfer object, it is not opinionated on what it holds - `Directory` and subclasses contains all the logic for validating and using hints. - Overrides of `Directory` can override this to specify their own validation logic for any new hints they use themselves -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] ci: bump actions/stale from 5.2.0 to 9.1.0 [lucene]
stefanvodita commented on PR #14498: URL: https://github.com/apache/lucene/pull/14498#issuecomment-2823576204 Compare the [last run with v5](https://github.com/apache/lucene/actions/runs/14607185573/job/40978452884) with the [first run with v9](https://github.com/apache/lucene/actions/runs/14614033999/job/40998064219). The more recent run didn't encounter any new stale PRs, which is likely correct, and it is significantly more efficient than the older version, using less operations per PR and being able to get through all the PRs with budget to spare, while the older action could not. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] TestForTooMuchCloning: too many calls to IndexInput.clone during TermRangeQuery: 7 [lucene]
ChrisHegarty commented on issue #14546: URL: https://github.com/apache/lucene/issues/14546#issuecomment-2823593983 git bisect tells me that this commit is the culprit #14380. /cc @gf2121 ``` 878e23f77390058b6787e0a0537de9087d1019b3 is the first bad commit commit 878e23f77390058b6787e0a0537de9087d1019b3 Author: Guo Feng <52390227+gf2...@users.noreply.github.com> Date: Mon Apr 14 23:19:04 2025 +0800 A specialized Trie for Block Tree Index (#14380) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Initial upgrade to Gradle 8.14 (rc2) [lucene]
rmuir commented on PR #14533: URL: https://github.com/apache/lucene/pull/14533#issuecomment-2823604993 I'm using https://github.com/eclipse-jdtls/eclipse.jdt.ls which, is just this same eclipse stuff packaged in a different way?  It had the java 24 support like a week after the java 24 release. Sorry, I don't understand eclipse ecosystem. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Impl intoBitset for IndexedDISI and Docvalues [lucene]
gf2121 commented on code in PR #14529: URL: https://github.com/apache/lucene/pull/14529#discussion_r2057599583 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/IndexedDISI.java: ## @@ -625,8 +635,36 @@ boolean advanceExactWithinBlock(IndexedDISI disi, int target) throws IOException disi.exists = false; return false; } + + @Override + boolean intoBitsetWithinBlock(IndexedDISI disi, int upTo, FixedBitSet bitSet, int offset) + throws IOException { +if (disi.exists) { + if (disi.doc >= upTo) { +return true; + } + bitSet.set(disi.doc - offset); +} + +for (int i = disi.index, to = disi.nextBlockIndex; i < to; i++) { Review Comment: I changed this to make it clearer. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Impl intoBitset for IndexedDISI and Docvalues [lucene]
gf2121 commented on code in PR #14529: URL: https://github.com/apache/lucene/pull/14529#discussion_r2057680596 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/IndexedDISI.java: ## @@ -625,6 +634,29 @@ boolean advanceExactWithinBlock(IndexedDISI disi, int target) throws IOException disi.exists = false; return false; } + + @Override + boolean intoBitsetWithinBlock(IndexedDISI disi, int upTo, FixedBitSet bitSet, int offset) + throws IOException { +if (disi.doc >= upTo) { + return true; +} +bitSet.set(disi.doc - offset); + +for (int i = disi.index + 1, to = disi.nextBlockIndex; i <= to; i++) { Review Comment: In `advanceWithinBlock`, `disi.index++` is executed in the loop body before `if (doc >= targetInBlock)` so it is not actually excluded. `advanceWithinBlock` can return true when `disi.index` equals `disi.nextBlockIndex`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Impl intoBitset for IndexedDISI and Docvalues [lucene]
jpountz commented on code in PR #14529: URL: https://github.com/apache/lucene/pull/14529#discussion_r2057662695 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/IndexedDISI.java: ## @@ -491,6 +492,14 @@ public int advance(int target) throws IOException { return doc; } + @Override + public void intoBitSet(int upTo, FixedBitSet bitSet, int offset) throws IOException { +assert doc >= offset; +while (method.intoBitsetWithinBlock(this, upTo, bitSet, offset) == false) { + readBlockHeader(); Review Comment: I wonder if we should call something like below after reading the block header (similar to what `advance(target)` does): ``` boolean found = method.advanceWithinBlock(this, block); assert found; ``` This would guarantee that `disi.doc` is always a doc ID of the block when `intoBitsetWithinBlock` is called, which could help simplify some implementations of `intoBitsetWithinBlock`? (I'm looking at `DENSE` in particular) ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/IndexedDISI.java: ## @@ -625,6 +634,29 @@ boolean advanceExactWithinBlock(IndexedDISI disi, int target) throws IOException disi.exists = false; return false; } + + @Override + boolean intoBitsetWithinBlock(IndexedDISI disi, int upTo, FixedBitSet bitSet, int offset) + throws IOException { +if (disi.doc >= upTo) { + return true; +} +bitSet.set(disi.doc - offset); + +for (int i = disi.index + 1, to = disi.nextBlockIndex; i <= to; i++) { Review Comment: I'm still a bit confused, why is `disi.nextBlockIndex` included in the loop when it's excluded in `advanceWithinBlock`? ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/IndexedDISI.java: ## @@ -720,6 +805,15 @@ boolean advanceExactWithinBlock(IndexedDISI disi, int target) { * return whether this document exists. */ abstract boolean advanceExactWithinBlock(IndexedDISI disi, int target) throws IOException; + +/** + * Similar to {@link DocIdSetIterator#intoBitSet}, load docs in this block into a bitset. This + * mehtod return true if there are remaining docs (gte upTo) in the block, otherwise false. When + * false return, fp of disi#slice is at blockEnd and disi#index is correct but other status vars + * are undefined. Caller should decode the header of next block by {@link #readBlockHeader()}. + */ +abstract boolean intoBitsetWithinBlock( Review Comment: Nit: for consistency with how it's called on `DocIdSetIterator`. ```suggestion abstract boolean intoBitSetWithinBlock( ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Use a non-deprecated assertThat, and change several test assertions to use assertThat [lucene]
dweiss merged PR #14518: URL: https://github.com/apache/lucene/pull/14518 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Impl intoBitset for IndexedDISI and Docvalues [lucene]
gf2121 commented on code in PR #14529: URL: https://github.com/apache/lucene/pull/14529#discussion_r2057597668 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/IndexedDISI.java: ## @@ -535,6 +544,7 @@ private void readBlockHeader() throws IOException { method = Method.SPARSE; blockEnd = slice.getFilePointer() + (numValues << 1); nextExistDocInBlock = -1; + exists = false; Review Comment: This is to make `intoBitset` work with `advanceExact`, which is not really needed. I removed this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Logic for collecting Histogram efficiently using Point Trees [lucene]
jainankitk commented on PR #14439: URL: https://github.com/apache/lucene/pull/14439#issuecomment-2825127094 > Oooh those JMH benchy results are nice! Though, it's dangerous testing only on random data -- you can draw random conclusions/results. But it's better than no benchmark! Maybe we should add histogram faceting benchy to Lucene's nightly benchmarks? Thanks @mikemccand for the feedback. Having histogram faceting benchmark in Lucene's nightly will be great! Created issue https://github.com/mikemccand/luceneutil/issues/375 for following up on this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Avoid reload block when seeking backward in SegmentTermsEnum. [lucene]
github-actions[bot] commented on PR #13253: URL: https://github.com/apache/lucene/pull/13253#issuecomment-2825839442 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contribution! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Impl intoBitset for IndexedDISI and Docvalues [lucene]
jpountz commented on code in PR #14529: URL: https://github.com/apache/lucene/pull/14529#discussion_r2056179949 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/IndexedDISI.java: ## @@ -535,6 +544,7 @@ private void readBlockHeader() throws IOException { method = Method.SPARSE; blockEnd = slice.getFilePointer() + (numValues << 1); nextExistDocInBlock = -1; + exists = false; Review Comment: Is this a pre-existing bug? ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/IndexedDISI.java: ## @@ -625,8 +635,36 @@ boolean advanceExactWithinBlock(IndexedDISI disi, int target) throws IOException disi.exists = false; return false; } + + @Override + boolean intoBitsetWithinBlock(IndexedDISI disi, int upTo, FixedBitSet bitSet, int offset) + throws IOException { +if (disi.exists) { + if (disi.doc >= upTo) { +return true; + } + bitSet.set(disi.doc - offset); +} + +for (int i = disi.index, to = disi.nextBlockIndex; i < to; i++) { Review Comment: I am a bit surprised that this loop starts at `disi.index` instead of `disi.index + 1` since the doc at `disi.index` was already handled above? ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/IndexedDISI.java: ## @@ -491,6 +492,14 @@ public int advance(int target) throws IOException { return doc; } + @Override + public void intoBitSet(int upTo, FixedBitSet bitSet, int offset) throws IOException { +assert doc >= offset; Review Comment: Can you also assert something like `doc == NOT_MORE_DOCS || advanceExact(doc)`? It shouldn't be legal to call intoBitSet after advanceExact returns false (or maybe a better place for this would be asserting doc values). It looks like some implementations of `intoBitsetWithinBlock` try to cover the case when the current doc doesn't exist, maybe they can be simplified? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [I] Run a workflow on dependabot's PRs (gradlew updateLicenses writeLocks) [lucene]
dweiss commented on issue #14506: URL: https://github.com/apache/lucene/issues/14506#issuecomment-2825119807 Yes, there is no rush with this. I think the workflows do need some love but it's not really a high priority. I'm chipping at cleaning up gradle build to speed it up. It's a slow process. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
Re: [PR] Logic for collecting Histogram efficiently using Point Trees [lucene]
jainankitk commented on code in PR #14439: URL: https://github.com/apache/lucene/pull/14439#discussion_r2056830312 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/facet/plain/histograms/HistogramCollector.java: ## @@ -53,11 +63,31 @@ public LeafCollector getLeafCollector(LeafReaderContext context) throws IOExcept // The segment has no values, nothing to do. throw new CollectionTerminatedException(); } + +// We can use multi range traversal logic to collect the histogram on numeric +// field indexed as point for MATCH_ALL cases. In future, this can be extended +// for Point Range Query cases as well +if (context.reader().hasDeletions() == false Review Comment: Ah, nice! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org