[GitHub] [lucene] romseygeek commented on issue #11913: lucene-replicator PrimaryNode unsafely publishes reference during construction

2022-12-12 Thread GitBox


romseygeek commented on issue #11913:
URL: https://github.com/apache/lucene/issues/11913#issuecomment-1346186792

   This isn't an area of the code I know well, but I think part of the problem 
here is that the IndexWriter lifecycle is entirely under the management of the 
PrimaryNode (e.g. calling `close` on the PrimaryNode also closes the 
IndexWriter), but PrimaryNode takes the writer as a constructor parameter 
rather than building it itself, which means it's available to outside classes 
before construction is finished.  So maybe we should instead change the 
PrimaryNode constructor to take a Directory and IndexWriterConfig, and build 
the IndexWriter inside the constructor method? And then the IndexWriter can be 
exposed on a method on the PrimaryNode but it won't be available to outside 
consumers until PrimaryNode has finished building.
   
   cc @mikemccand who actually understands this thing :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] jpountz opened a new pull request, #12011: Tune the amount of memory that is allocated to sorting postings upon flushing.

2022-12-12 Thread GitBox


jpountz opened a new pull request, #12011:
URL: https://github.com/apache/lucene/pull/12011

   When flushing segments that have an index sort configured, postings lists 
get loaded into arrays and get reordered according to the index sort.
   
   This reordering is implemented with `TimSorter`, a variant of merge sort. 
Like merge sort, an important part of `TimSorter` consists of merging two 
contiguous sorted slices of the array into a combined sorted slice. This 
merging can be done either with external memory, which is the classical 
approach, or in-place, which still runs in linear time but with a much higher 
factor. Until now we were allocating a fixed budget of `maxDoc/64` for doing 
these merges with external memory. If this is not enough, sorted slices would 
be merged in place.
   
   I've been looking at some profiles recently for an index where a 
non-negligible chunk of the time was spent on in-place merges. So I would like 
to propose the following change:
- Increase the maximum RAM budget to `maxDoc / 8`. This should help avoid 
in-place merges for all postings up to `docFreq = maxDoc / 4`.
- Make this RAM budget lazily allocated, rather than eagerly like today. 
This would help not allocate memory in O(maxDoc) for fields like primary keys 
that only have a couple postings per term.
   
   So overall memory usage would never be more than 50% higher than what it is 
today, because `TimSorter` never needs more than X temporary slots if the 
postings list doesn't have at least 2*X entries, and these 2*X entries already 
get loaded into memory today. And for fields that have short postings, memory 
usage should actually be lower.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] benwtrent commented on pull request #12004: Move byte vector queries into new KnnByteVectorQuery

2022-12-12 Thread GitBox


benwtrent commented on PR #12004:
URL: https://github.com/apache/lucene/pull/12004#issuecomment-1346600559

   >  Since we're changing this stuff, I wonder if BytesRef is the right 
abstraction for a vector of bytes or if we should switch to a plain byte[]
   
   I adjusted the query API there and am still using `BytesRef` internally. The 
reason for using `BytesRef` still internally is because we are still 
overloading `binaryValue()` for getting the byte vector. I hope to change this 
in a following PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] agorlenko commented on pull request #11946: add similarity threshold for hnsw

2022-12-12 Thread GitBox


agorlenko commented on PR #11946:
URL: https://github.com/apache/lucene/pull/11946#issuecomment-1346767509

   I've rewritten this PR with post-filtering approach, sorry for the delay.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] stefanvodita commented on a diff in pull request #12010: Enable LongDoubleConversion error-prone check

2022-12-12 Thread GitBox


stefanvodita commented on code in PR #12010:
URL: https://github.com/apache/lucene/pull/12010#discussion_r1046013555


##
lucene/misc/src/test/org/apache/lucene/misc/search/TestDocValuesStatsCollector.java:
##
@@ -301,8 +301,8 @@ public void testDocsWithMultipleDoubleValues() throws 
IOException {
 if (stats.count() > 0) {
   DoubleSummaryStatistics sumStats =
   filterAndFlatValues(docValues, (v) -> v != 
null).summaryStatistics();
-  assertEquals(sumStats.getMax(), stats.max().longValue(), 0.1);
-  assertEquals(sumStats.getMin(), stats.min().longValue(), 0.1);
+  assertEquals(sumStats.getMax(), (double) stats.max().longValue(), 
0.1);

Review Comment:
   `stats.max()` is already a `double` here, we could discard `longValue()`.



##
lucene/facet/src/test/org/apache/lucene/facet/range/TestRangeFacetCounts.java:
##
@@ -594,7 +594,7 @@ public void testBasicDouble() throws Exception {
 DoubleDocValuesField field = new DoubleDocValuesField("field", 0.0);
 doc.add(field);
 for (long l = 0; l < 100; l++) {
-  field.setDoubleValue(l);
+  field.setDoubleValue((double) l);

Review Comment:
   Can `l` be an `int` instead and then there's no need for a cast since 
there's no loss of data? Same for lines 635-636.



##
lucene/core/src/test/org/apache/lucene/codecs/lucene90/TestLucene90PointsFormat.java:
##
@@ -98,7 +98,8 @@ public void testMergeStability() throws Exception {
 super.testMergeStability();
   }
 
-  @SuppressWarnings("NarrowCalculation")
+  // TODO: clean up the math/estimation here rather than suppress so many 
warnings
+  @SuppressWarnings({"NarrowCalculation", "LongDoubleConversion"})

Review Comment:
   I think the only warnings in this file are asking to cast the arguments we 
pass to `Math.pow`. Why should we suppress that?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] rmuir commented on a diff in pull request #12010: Enable LongDoubleConversion error-prone check

2022-12-12 Thread GitBox


rmuir commented on code in PR #12010:
URL: https://github.com/apache/lucene/pull/12010#discussion_r1046028135


##
lucene/core/src/test/org/apache/lucene/codecs/lucene90/TestLucene90PointsFormat.java:
##
@@ -98,7 +98,8 @@ public void testMergeStability() throws Exception {
 super.testMergeStability();
   }
 
-  @SuppressWarnings("NarrowCalculation")
+  // TODO: clean up the math/estimation here rather than suppress so many 
warnings
+  @SuppressWarnings({"NarrowCalculation", "LongDoubleConversion"})

Review Comment:
   i think in general this test needs some cleanup (I suppressed it for 
narrow-calculation, too). It is testing an "estimation" but in an exact manner, 
which is fine, but its also a randomized test and doing some sneaky stuff. 
   
   I'd rather not cause a bunch of intermittent test failures by mucking around 
with it in its current state, it would be better to get refactored by someone 
that understands what it is doing.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] rmuir commented on pull request #12010: Enable LongDoubleConversion error-prone check

2022-12-12 Thread GitBox


rmuir commented on PR #12010:
URL: https://github.com/apache/lucene/pull/12010#issuecomment-1346801373

   thanks @stefanvodita for the suggested cleanups here!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] mdmarshmallow commented on a diff in pull request #11901: Github#11869: Add RangeOnRangeFacetCounts

2022-12-12 Thread GitBox


mdmarshmallow commented on code in PR #11901:
URL: https://github.com/apache/lucene/pull/11901#discussion_r1046120980


##
lucene/facet/src/java/org/apache/lucene/facet/rangeonrange/RangeOnRangeFacetCounts.java:
##
@@ -0,0 +1,209 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.rangeonrange;
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.List;
+import org.apache.lucene.document.BinaryRangeDocValues;
+import org.apache.lucene.document.RangeFieldQuery;
+import org.apache.lucene.facet.FacetCountsWithFilterQuery;
+import org.apache.lucene.facet.FacetResult;
+import org.apache.lucene.facet.FacetsCollector;
+import org.apache.lucene.facet.LabelAndValue;
+import org.apache.lucene.index.DocValues;
+import org.apache.lucene.search.DocIdSetIterator;
+import org.apache.lucene.search.Query;
+import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.util.PriorityQueue;
+
+abstract class RangeOnRangeFacetCounts extends FacetCountsWithFilterQuery {
+
+  private final byte[][] encodedRanges;
+  private final String[] labels;
+  private final int numEncodedValueBytes;
+  private final int dims;
+
+  /** Counts, initialized in by subclass. */
+  protected final int[] counts;
+
+  /** Our field name. */
+  protected final String field;
+
+  /** Total number of hits. */
+  protected int totCount;
+
+  private final ArrayUtil.ByteArrayComparator comparator;
+
+  /** Type of "range overlap" we want to count. */
+  RangeFieldQuery.QueryType queryType;

Review Comment:
   No I don't think they do, changed in the next revision.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] mdmarshmallow commented on a diff in pull request #11901: Github#11869: Add RangeOnRangeFacetCounts

2022-12-12 Thread GitBox


mdmarshmallow commented on code in PR #11901:
URL: https://github.com/apache/lucene/pull/11901#discussion_r1046128095


##
lucene/facet/src/java/org/apache/lucene/facet/rangeonrange/RangeOnRangeFacetCounts.java:
##
@@ -0,0 +1,209 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.rangeonrange;
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.List;
+import org.apache.lucene.document.BinaryRangeDocValues;
+import org.apache.lucene.document.RangeFieldQuery;
+import org.apache.lucene.facet.FacetCountsWithFilterQuery;
+import org.apache.lucene.facet.FacetResult;
+import org.apache.lucene.facet.FacetsCollector;
+import org.apache.lucene.facet.LabelAndValue;
+import org.apache.lucene.index.DocValues;
+import org.apache.lucene.search.DocIdSetIterator;
+import org.apache.lucene.search.Query;
+import org.apache.lucene.util.ArrayUtil;
+import org.apache.lucene.util.PriorityQueue;
+
+abstract class RangeOnRangeFacetCounts extends FacetCountsWithFilterQuery {
+
+  private final byte[][] encodedRanges;
+  private final String[] labels;
+  private final int numEncodedValueBytes;
+  private final int dims;
+
+  /** Counts, initialized in by subclass. */
+  protected final int[] counts;
+
+  /** Our field name. */
+  protected final String field;
+
+  /** Total number of hits. */
+  protected int totCount;
+
+  private final ArrayUtil.ByteArrayComparator comparator;
+
+  /** Type of "range overlap" we want to count. */
+  RangeFieldQuery.QueryType queryType;
+
+  protected RangeOnRangeFacetCounts(
+  String field,
+  FacetsCollector hits,
+  RangeFieldQuery.QueryType queryType,
+  Query fastMatchQuery,
+  int numEncodedValueBytes,
+  byte[][] encodedRanges,
+  String[] labels)
+  throws IOException {
+super(fastMatchQuery);
+
+assert encodedRanges.length == labels.length;
+assert encodedRanges[0].length % (2 * numEncodedValueBytes) == 0;
+
+this.encodedRanges = encodedRanges;
+this.field = field;
+this.labels = labels;
+this.numEncodedValueBytes = numEncodedValueBytes;
+this.dims = encodedRanges[0].length / (2 * this.numEncodedValueBytes);
+this.queryType = queryType;
+this.comparator = 
ArrayUtil.getUnsignedComparator(this.numEncodedValueBytes);

Review Comment:
   Changed this in the next revision



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] mdmarshmallow commented on a diff in pull request #11901: Github#11869: Add RangeOnRangeFacetCounts

2022-12-12 Thread GitBox


mdmarshmallow commented on code in PR #11901:
URL: https://github.com/apache/lucene/pull/11901#discussion_r1046130284


##
lucene/facet/src/java/org/apache/lucene/facet/rangeonrange/LongRange.java:
##
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.rangeonrange;
+
+import java.util.Arrays;
+import java.util.Objects;
+
+/** Represents a long range for RangeOnRange faceting */
+public class LongRange extends Range {
+  /** Minimum (inclusive). */
+  public final long[] min;
+
+  /** Maximum (inclusive). */
+  public final long[] max;
+
+  /**
+   * Represents a single dimensional long range for RangeOnRange faceting
+   *
+   * @param label the name of the range
+   * @param minIn the minimum
+   * @param minInclusive if the minimum is inclusive
+   * @param maxIn the maximum
+   * @param maxInclusive if the maximum is inclusive
+   */
+  public LongRange(
+  String label, long minIn, boolean minInclusive, long maxIn, boolean 
maxInclusive) {
+super(label, 1);
+
+if (minInclusive == false) {
+  if (minIn != Long.MAX_VALUE) {
+minIn++;
+  } else {
+failNoMatch();
+  }
+}
+
+if (maxInclusive == false) {
+  if (maxIn != Long.MIN_VALUE) {
+maxIn--;
+  } else {
+failNoMatch();
+  }
+}
+
+if (minIn > maxIn) {
+  failNoMatch();
+}
+
+this.min = new long[] {minIn};
+this.max = new long[] {maxIn};
+  }
+
+  /**
+   * Represents a multidimensional long range for RangeOnRange faceting
+   *
+   * @param label the name of the range
+   * @param min the minimum, inclusive
+   * @param max the maximum, inclusive
+   */
+  public LongRange(String label, long[] min, long[] max) {
+super(label, min.length);
+checkArgs(min, max);
+this.min = min;
+this.max = max;
+  }
+
+  @Override
+  public String toString() {
+return "LongRange(label: "
++ label
++ ", min: "
++ Arrays.toString(min)
++ ", max: "
++ Arrays.toString(max)
++ ")";
+  }
+
+  @Override
+  public boolean equals(Object o) {
+if (this == o) return true;
+if (o == null || getClass() != o.getClass()) return false;
+LongRange longRange = (LongRange) o;
+return Arrays.equals(min, longRange.min) && Arrays.equals(max, 
longRange.max);
+  }
+
+  @Override
+  public int hashCode() {
+return Objects.hash(label, Arrays.hashCode(min), Arrays.hashCode(max));
+  }
+
+  private static void checkArgs(final long[] min, final long[] max) {
+if (min == null || max == null || min.length == 0 || max.length == 0) {
+  throw new IllegalArgumentException("min/max range values cannot be null 
or empty");
+}
+if (min.length != max.length) {
+  throw new IllegalArgumentException("min/max ranges must agree");
+}
+
+for (int i = 0; i < min.length; i++) {
+  if (min[i] > max[i]) {
+throw new IllegalArgumentException("min should be less than max");

Review Comment:
   Yeah I think that makes sense, changing in next revision.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] mdmarshmallow commented on a diff in pull request #11901: Github#11869: Add RangeOnRangeFacetCounts

2022-12-12 Thread GitBox


mdmarshmallow commented on code in PR #11901:
URL: https://github.com/apache/lucene/pull/11901#discussion_r1046133836


##
lucene/facet/src/java/org/apache/lucene/facet/rangeonrange/LongRange.java:
##
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.rangeonrange;
+
+import java.util.Arrays;
+import java.util.Objects;
+
+/** Represents a long range for RangeOnRange faceting */
+public class LongRange extends Range {
+  /** Minimum (inclusive). */
+  public final long[] min;
+
+  /** Maximum (inclusive). */
+  public final long[] max;
+
+  /**
+   * Represents a single dimensional long range for RangeOnRange faceting
+   *
+   * @param label the name of the range
+   * @param minIn the minimum
+   * @param minInclusive if the minimum is inclusive
+   * @param maxIn the maximum
+   * @param maxInclusive if the maximum is inclusive
+   */
+  public LongRange(
+  String label, long minIn, boolean minInclusive, long maxIn, boolean 
maxInclusive) {
+super(label, 1);
+
+if (minInclusive == false) {
+  if (minIn != Long.MAX_VALUE) {
+minIn++;
+  } else {
+failNoMatch();
+  }
+}
+
+if (maxInclusive == false) {
+  if (maxIn != Long.MIN_VALUE) {
+maxIn--;
+  } else {
+failNoMatch();
+  }
+}
+
+if (minIn > maxIn) {
+  failNoMatch();
+}
+
+this.min = new long[] {minIn};
+this.max = new long[] {maxIn};
+  }
+
+  /**
+   * Represents a multidimensional long range for RangeOnRange faceting
+   *
+   * @param label the name of the range
+   * @param min the minimum, inclusive
+   * @param max the maximum, inclusive
+   */
+  public LongRange(String label, long[] min, long[] max) {
+super(label, min.length);
+checkArgs(min, max);
+this.min = min;
+this.max = max;
+  }
+
+  @Override
+  public String toString() {
+return "LongRange(label: "
++ label
++ ", min: "
++ Arrays.toString(min)
++ ", max: "
++ Arrays.toString(max)
++ ")";
+  }
+
+  @Override
+  public boolean equals(Object o) {
+if (this == o) return true;
+if (o == null || getClass() != o.getClass()) return false;
+LongRange longRange = (LongRange) o;
+return Arrays.equals(min, longRange.min) && Arrays.equals(max, 
longRange.max);

Review Comment:
   Ah yeah, thanks for catching that. Also added `dims` check as well to both.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] mdmarshmallow commented on a diff in pull request #11901: Github#11869: Add RangeOnRangeFacetCounts

2022-12-12 Thread GitBox


mdmarshmallow commented on code in PR #11901:
URL: https://github.com/apache/lucene/pull/11901#discussion_r1046211359


##
lucene/facet/src/java/org/apache/lucene/facet/rangeonrange/LongRangeOnRangeFacetCounts.java:
##
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.rangeonrange;
+
+import static org.apache.lucene.document.LongRange.verifyAndEncode;
+
+import java.io.IOException;
+import org.apache.lucene.document.RangeFieldQuery;
+import org.apache.lucene.facet.FacetsCollector;
+import org.apache.lucene.search.Query;
+
+/** Represents counts for long range on range faceting */

Review Comment:
   Added some more descriptive java docs in the next revision, let me know if 
those are better!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] stevenschlansker commented on issue #11913: lucene-replicator PrimaryNode unsafely publishes reference during construction

2022-12-12 Thread GitBox


stevenschlansker commented on issue #11913:
URL: https://github.com/apache/lucene/issues/11913#issuecomment-1347017604

   Letting PrimaryNode initialize the writer would work for us. It's in fact 
pretty similar to the workaround we implemented to fix this in our application.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] rmuir opened a new issue, #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-12 Thread GitBox


rmuir opened a new issue, #12012:
URL: https://github.com/apache/lucene/issues/12012

   ### Description
   
   I've probably hit this one several times recently. Probably only impacts ppl 
not using IDEs :)
   
   But I think it is just an ordering issue with the `gradle check`? It seems 
to fire off `spotless` task before `javac task`?
   
   So if you make a typo (which I do all the time), you get an error like this. 
Yes it is still "usable" in the sense that if i scroll thru the massive 
exception I can find the line number, stare at it, and fix it. But I think 
javac would do a better job?
   
   At the same time, it'd be bad to slow down the build if that would be a side 
effect of forcing javac to run before spotlessJava...
   
   ```
   > Task :lucene:facet:spotlessJava FAILED
   Step 'google-java-format' found problem in 
'src/test/org/apache/lucene/facet/range/TestRangeFacetCounts.java':
   null
   java.lang.reflect.InvocationTargetException
   at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
   at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.base/java.lang.reflect.Method.invoke(Method.java:568)
   at 
com.diffplug.spotless.java.GoogleJavaFormatStep$State.lambda$createFormat$1(GoogleJavaFormatStep.java:176)
   at com.diffplug.spotless.FormatterFunc.apply(FormatterFunc.java:32)
   at 
com.diffplug.spotless.FormatterStepImpl$Standard.format(FormatterStepImpl.java:78)
   at 
com.diffplug.spotless.FormatterStep$Strict.format(FormatterStep.java:88)
   at com.diffplug.spotless.Formatter.compute(Formatter.java:230)
   at 
com.diffplug.spotless.PaddedCell.calculateDirtyState(PaddedCell.java:203)
   at 
com.diffplug.spotless.PaddedCell.calculateDirtyState(PaddedCell.java:190)
   at 
com.diffplug.gradle.spotless.SpotlessTaskImpl.processInputFile(SpotlessTaskImpl.java:92)
   at 
com.diffplug.gradle.spotless.SpotlessTaskImpl.performAction(SpotlessTaskImpl.java:78)
   at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
   at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.base/java.lang.reflect.Method.invoke(Method.java:568)
   at org.gradle.internal.reflect.JavaMethod.invoke(JavaMethod.java:125)
   at 
org.gradle.api.internal.project.taskfactory.IncrementalInputsTaskAction.doExecute(IncrementalInputsTaskAction.java:32)
   at 
org.gradle.api.internal.project.taskfactory.StandardTaskAction.execute(StandardTaskAction.java:51)
   at 
org.gradle.api.internal.project.taskfactory.AbstractIncrementalTaskAction.execute(AbstractIncrementalTaskAction.java:25)
   at 
org.gradle.api.internal.project.taskfactory.StandardTaskAction.execute(StandardTaskAction.java:29)
   at 
org.gradle.api.internal.tasks.execution.TaskExecution$3.run(TaskExecution.java:236)
   at 
org.gradle.internal.operations.DefaultBuildOperationRunner$1.execute(DefaultBuildOperationRunner.java:29)
   at 
org.gradle.internal.operations.DefaultBuildOperationRunner$1.execute(DefaultBuildOperationRunner.java:26)
   at 
org.gradle.internal.operations.DefaultBuildOperationRunner$2.execute(DefaultBuildOperationRunner.java:66)
   at 
org.gradle.internal.operations.DefaultBuildOperationRunner$2.execute(DefaultBuildOperationRunner.java:59)
   at 
org.gradle.internal.operations.DefaultBuildOperationRunner.execute(DefaultBuildOperationRunner.java:157)
   at 
org.gradle.internal.operations.DefaultBuildOperationRunner.execute(DefaultBuildOperationRunner.java:59)
   at 
org.gradle.internal.operations.DefaultBuildOperationRunner.run(DefaultBuildOperationRunner.java:47)
   at 
org.gradle.internal.operations.DefaultBuildOperationExecutor.run(DefaultBuildOperationExecutor.java:68)
   at 
org.gradle.api.internal.tasks.execution.TaskExecution.executeAction(TaskExecution.java:221)
   at 
org.gradle.api.internal.tasks.execution.TaskExecution.executeActions(TaskExecution.java:204)
   at 
org.gradle.api.internal.tasks.execution.TaskExecution.executeWithPreviousOutputFiles(TaskExecution.java:187)
   at 
org.gradle.api.internal.tasks.execution.TaskExecution.execute(TaskExecution.java:165)
   at 
org.gradle.internal.execution.steps.ExecuteStep.executeInternal(ExecuteStep.java:89)
   at 
org.gradle.internal.execution.steps.ExecuteStep.access$000(ExecuteStep.java:40)
   at 
org.gradle.internal.execution.steps.ExecuteStep$1.call(ExecuteStep.java:53)
   at 
org.gradle.i

[GitHub] [lucene] rmuir commented on pull request #12010: Enable LongDoubleConversion error-prone check

2022-12-12 Thread GitBox


rmuir commented on PR #12010:
URL: https://github.com/apache/lucene/pull/12010#issuecomment-1347152648

   Thank you for reviewing @stefanvodita 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] dweiss commented on issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-12 Thread GitBox


dweiss commented on issue #12012:
URL: https://github.com/apache/lucene/issues/12012#issuecomment-1347320745

   Enforcing the ordering here is quite trivial and makes sense to me. I'm not 
sure what the impact is going to be - perhaps not too much given how much stuff 
is going on elsewhere (in check)?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] dweiss commented on issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-12 Thread GitBox


dweiss commented on issue #12012:
URL: https://github.com/apache/lucene/issues/12012#issuecomment-1347348916

   This seems to do the trick (a bit coarse to schedule after all javac tasks 
but I'm not sure whether spotless cares about different source sets).
   
   
https://github.com/dweiss/lucene/commit/aa07deb8430e8504c81cac6fa5ad3c431654
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] rmuir commented on issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-12 Thread GitBox


rmuir commented on issue #12012:
URL: https://github.com/apache/lucene/issues/12012#issuecomment-1347351739

   i can test out timings of your patch on a fast mac m1: give me a few 
minutes. I know it has enough concurrency to basically be bottlenecked by the 
task dependencies... which I've looked into before, but haven't found an easy 
win yet due to similar concerns.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] gsmiller commented on a diff in pull request #11901: Github#11869: Add RangeOnRangeFacetCounts

2022-12-12 Thread GitBox


gsmiller commented on code in PR #11901:
URL: https://github.com/apache/lucene/pull/11901#discussion_r1046413930


##
lucene/facet/src/java/org/apache/lucene/facet/rangeonrange/DoubleRangeOnRangeFacetCounts.java:
##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.rangeonrange;
+
+import static org.apache.lucene.document.DoubleRange.verifyAndEncode;
+
+import java.io.IOException;
+import org.apache.lucene.document.RangeFieldQuery;
+import org.apache.lucene.facet.FacetsCollector;
+import org.apache.lucene.search.Query;
+
+/**
+ * Represents counts for double range on range faceting. To be more specific, 
this means that given
+ * a range (or list of ranges), this class will count all the documents in the 
index (or that match
+ * a fast match query) that contain ranges that "match" the provided ranges. 
These ranges are
+ * specified by the field parameter and expected to be of type {@link
+ * org.apache.lucene.document.DoubleRangeDocValuesField}. Matching is defined 
by the queryType
+ * param, you can see the type of matching supported by looking at {@link
+ * org.apache.lucene.document.RangeFieldQuery.QueryType}. In addition, this 
class supports
+ * multidimensional ranges. A multidimensional range will be counted as a 
match if every dimension
+ * matches the corresponding indexed range's dimension.
+ */
+public class DoubleRangeOnRangeFacetCounts extends RangeOnRangeFacetCounts {
+
+  /**
+   * Constructor without the fast match query, see other constructor 
description for more details.
+   */
+  public DoubleRangeOnRangeFacetCounts(
+  String field,
+  FacetsCollector hits,
+  RangeFieldQuery.QueryType queryType,
+  DoubleRange... ranges)
+  throws IOException {
+super(
+field,
+hits,
+queryType,
+null,
+Double.BYTES,
+getEncodedRanges(ranges),
+Range.getLabelsFromRanges(ranges));
+  }
+
+  /**
+   * Represents counts for long range on range faceting. See class javadoc for 
more details.
+   *
+   * @param field specifies a {@link 
org.apache.lucene.document.LongRangeDocValuesField} that will

Review Comment:
   Should be DoubleRangeDocValuesField right?



##
lucene/facet/src/java/org/apache/lucene/facet/rangeonrange/DoubleRangeOnRangeFacetCounts.java:
##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.facet.rangeonrange;
+
+import static org.apache.lucene.document.DoubleRange.verifyAndEncode;
+
+import java.io.IOException;
+import org.apache.lucene.document.RangeFieldQuery;
+import org.apache.lucene.facet.FacetsCollector;
+import org.apache.lucene.search.Query;
+
+/**
+ * Represents counts for double range on range faceting. To be more specific, 
this means that given
+ * a range (or list of ranges), this class will count all the documents in the 
index (or that match
+ * a fast match query) that contain ranges that "match" the provided ranges. 
These ranges are
+ * specified by the field parameter and expected to be of type {@link
+ * org.apache.lucene.document.DoubleRangeDocValuesField}. Matching is defined 
by the queryType
+ * param, you can see the type of matching supported by looking at {@link
+ * org.apache.lucene.document.RangeFieldQuery.QueryType}. In addition, this 
class supports
+ * multidimensional ranges. A multidimensional range will be counted as a 
match if every dim

[GitHub] [lucene] rmuir commented on issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-12 Thread GitBox


rmuir commented on issue #12012:
URL: https://github.com/apache/lucene/issues/12012#issuecomment-1347377866

   +1
   
   Any difference is in the noise there: 
   ```
   Before: BUILD SUCCESSFUL in 3m 34s
   After: BUILD SUCCESSFUL in 3m 24s
   ```
   
   I did notice all the `spotlessJavaCheck` and `spotlessCheck` tasks were 
ordered after compilation: so e.g. they are running alongside stuff like 
forbidden-apis tasks that also require compilation.
   
   Previously, spotless would run earlier, seemingly in parallel with things 
like compilation tasks. You could easily get unlucky and have `spotless` fail 
before `javac`, especially since lucene-core compilation takes a long time, and 
especially if you enable errorprone.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] rmuir commented on issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-12 Thread GitBox


rmuir commented on issue #12012:
URL: https://github.com/apache/lucene/issues/12012#issuecomment-1347394601

   Hmm, @dweiss I think it is still not quite right. I think we are not slowing 
down the `spotlessJava` task which is the one actually failing for me?
   
   Here's a simple reproducer patch to make compile fail in lucene/facet tests 
(which have dependencies on lucene core compile etc and are pretty slow). If 
you can't reproduce, add -Pvalidation.errorprone=true to slow down javac even 
more :)
   
   ```
   diff --git 
a/lucene/facet/src/test/org/apache/lucene/facet/SlowDirectory.java 
b/lucene/facet/src/test/org/apache/lucene/facet/SlowDirectory.java
   index a809ea1e711..d5d4f52bbb0 100644
   --- a/lucene/facet/src/test/org/apache/lucene/facet/SlowDirectory.java
   +++ b/lucene/facet/src/test/org/apache/lucene/facet/SlowDirectory.java
   @@ -27,7 +27,7 @@ import org.apache.lucene.util.ThreadInterruptedException;
   
/** Test utility - slow directory */
// TODO: move to test-framework and sometimes use in tests?
   -public class SlowDirectory extends FilterDirectory {
   +public class SlowDirectory extends FilterDirectory { X
   
  private static final int IO_SLEEP_THRESHOLD = 50;
   ```
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] rmuir commented on issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-12 Thread GitBox


rmuir commented on issue #12012:
URL: https://github.com/apache/lucene/issues/12012#issuecomment-1347398451

   Here's my patch which seems to work and do the right thing. I'll benchmark 
it. But also I don't know what i am doing with gradle:
   ```
   diff --git a/gradle/validation/spotless.gradle 
b/gradle/validation/spotless.gradle
   index 685bf0cec5b..9b6bc553fc7 100644
   --- a/gradle/validation/spotless.gradle
   +++ b/gradle/validation/spotless.gradle
   @@ -91,6 +91,7 @@ configure(project(":lucene").subprojects) { prj ->
  doFirst {
project.mkdir("${buildDir}/spotless/spotlessJava")
  }
   +  mustRunAfter tasks.withType(JavaCompile)
}
  }
   ```
   
   The error is greatly improved:
   
   ```
   > Task :lucene:facet:ecjLintTest FAILED
   --
   1. ERROR in 
/home/rmuir/workspace/lucene/lucene/facet/src/test/org/apache/lucene/facet/SlowDirectory.java
 (at line 30)
   public class SlowDirectory extends FilterDirectory { X
^
   Syntax error on token "X", delete this token
   --
   1 problem (1 error)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] vigyasharma opened a new pull request, #12013: Clear thread local values on UTF8TaxonomyWriterCache.close()

2022-12-12 Thread GitBox


vigyasharma opened a new pull request, #12013:
URL: https://github.com/apache/lucene/pull/12013

   `UTF8TaxonomyWriterCache` keeps a ThreadLocal `BytesRefBuilder`, the bytes 
for which, are not removed on thread close. This leads to memory leaks. 
   
   The change uses `CloseableThreadLocal` instead, and closes out the object on 
cache.close() 
   Addresses #12000 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] vigyasharma commented on issue #12000: Lucene-facet leaves ThreadLocal that creates a memory leak

2022-12-12 Thread GitBox


vigyasharma commented on issue #12000:
URL: https://github.com/apache/lucene/issues/12000#issuecomment-1347406654

   > But one possibility to fix this cache is to replace the ThreadLocal with 
CloseableThreadLocal, and close it out in close()
   
   `DirectoryTaxonomyWriter` does call `cache.close()` when it gets closed (via 
`closeResources()`), so I think this is a good idea. 
   
   I updated the cache to use CloseableThreadLocal. Don't have a good way to 
test for a memory leak, but I added a test to atleast verify that threadlocal 
memory is removed on close.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] rmuir commented on pull request #12013: Clear thread local values on UTF8TaxonomyWriterCache.close()

2022-12-12 Thread GitBox


rmuir commented on PR #12013:
URL: https://github.com/apache/lucene/pull/12013#issuecomment-1347422875

   Thanks, it looks better to use `CloseableThreadLocal` than `ThreadLocal`... 
but I don't understand what this "cache" is doing and why it actually documents 
that it never frees memory. Especially as a default!
   
   As a followup can we open an issue to move away from Threadlocal storage 
here by default? I don't understand why we have this "cache" with this 
description:
   ```
   /** A "cache" that never frees memory, and stores labels in a BytesRefHash 
(utf-8 encoding). */
   ```
   
   Also, I'm trying to remove ThreadLocal (and CloseableThreadLocal) usages 
elsewhere in the codebase (e.g. #11998). These threadlocals are problematic in 
java apps that have many threads (or high thread churn rate), just as 
CloseableThreadLocal documents and attempts to workaround.
   
   At a glance, seems to me that using a bounded LRU cache would be much more 
reasonable...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] rmuir commented on issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-12 Thread GitBox


rmuir commented on issue #12012:
URL: https://github.com/apache/lucene/issues/12012#issuecomment-1347423692

   I ran this patch on the mac m1 and also got `BUILD SUCCESSFUL in 3m 33s`. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] rmuir commented on pull request #12013: Clear thread local values on UTF8TaxonomyWriterCache.close()

2022-12-12 Thread GitBox


rmuir commented on PR #12013:
URL: https://github.com/apache/lucene/pull/12013#issuecomment-1347439475

   Also (summoning my inner @uschindler), curious what the perf impact is if we 
simply remove these caches completely. Maybe they are not relevant to the 
performance anymore due to faceting changes (e.g. storing labels in docvalues 
rather than payloads) ? Maybe they are even hurting performance?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on pull request #12013: Clear thread local values on UTF8TaxonomyWriterCache.close()

2022-12-12 Thread GitBox


uschindler commented on PR #12013:
URL: https://github.com/apache/lucene/pull/12013#issuecomment-1347501489

   Please remove the cache. Totally useless. It hurts more because the escape 
analysis can't work.
   When tiered compilation stepped in, this is just a useless extra
   
   My rule: Never ever cache object instances to prevent object creation and 
help GC. This was needed in in Java 6 and java 7, but not anymore. It has 
opposite problem: it actually creates those objects on heap and stresses GC.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] rmuir opened a new pull request, #12014: Ban use of Math.fma across the entire codebase

2022-12-12 Thread GitBox


rmuir opened a new pull request, #12014:
URL: https://github.com/apache/lucene/pull/12014

   When FMA is not supported by the hardware, these methods fall back to 
`BigDecimal` usage [1] which causes them to be 2500x slower [2].
   
   While most hardware in the last 10 years may have the support, out of box 
both VirtualBox and QEMU don't pass thru FMA support (for the latter at least 
you can tweak it with e.g. -cpu host or similar to fix this). 
   
   This creates a terrible undocumented performance trap, see [3] for an 
example of a 30x slowdown of an entire application. In my experience, 
developers are often too far detached from the production reality, and that 
reality is: we're not deploying to macbook pros in production, instead we are 
almost all using virtualization: we can't afford such performance traps.
   
   Practically it would be an issue too: e.g. Policeman jenkins instance that 
runs our tests currently uses virtualbox. It would be bad for vector-search 
tests to suddenly get 30x slower.
   
   We can't safely use this method anywhere, as we don't have access to check 
CPUID or anything to see if it will be insanely slow or not. Let's ban it 
completely: I'm concerned it will sneak into our codebase otherwise... it 
almost happened before: #10718
   
   [1] [Math.java source 
code](https://github.com/openjdk/jdk/blob/5d4ea9b9549b762b7c207e5c2ee65bc51591433b/src/java.base/share/classes/java/lang/Math.java#L2364-L2366)
   [2] [Comment on JIRA issue for x86 intrinsic mentioning 2500x 
speedup](https://bugs.openjdk.org/browse/JDK-8154122?focusedCommentId=13995171&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13995171)
   [3] [VirtualBox bug for lack of FMA 
support](https://www.virtualbox.org/ticket/15471)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] gsmiller commented on pull request #12014: Ban use of Math.fma across the entire codebase

2022-12-12 Thread GitBox


gsmiller commented on PR #12014:
URL: https://github.com/apache/lucene/pull/12014#issuecomment-1347617300

   +1, seems reasonable to me. We can always remove this ban in the future if 
there's a good reason, but seems reasonable to put this in place to prevent it 
sneaking in for now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] rmuir commented on pull request #12014: Ban use of Math.fma across the entire codebase

2022-12-12 Thread GitBox


rmuir commented on PR #12014:
URL: https://github.com/apache/lucene/pull/12014#issuecomment-1347620183

   Yeah, I think if the fallback java code was 2x, 4x, or 8x slower (like you 
would expect from these intrinsics), we wouldn't be having this conversation :) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] rmuir closed issue #12009: Find (and fix) places where we treat a long as a double without an explicit cast

2022-12-12 Thread GitBox


rmuir closed issue #12009: Find (and fix) places where we treat a long as a 
double without an explicit cast
URL: https://github.com/apache/lucene/issues/12009


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] rmuir merged pull request #12010: Enable LongDoubleConversion error-prone check

2022-12-12 Thread GitBox


rmuir merged PR #12010:
URL: https://github.com/apache/lucene/pull/12010


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] dweiss commented on issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-12 Thread GitBox


dweiss commented on issue #12012:
URL: https://github.com/apache/lucene/issues/12012#issuecomment-1347844975

   > I think we are not slowing down the spotlessJava task which is the one 
actually failing for me?
   
   Eh. I don't know what the relationships between those spotless tasks are. 
Let me dig.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] dweiss commented on issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-12 Thread GitBox


dweiss commented on issue #12012:
URL: https://github.com/apache/lucene/issues/12012#issuecomment-1347872711

   I filed a PR #12015 . Indeed the 'core' formatting task seems to be named 
'spotlessJava' (spotless + format name), the check and apply are just attached 
to it in a fancy manner via some caching service.
   
   I get the same error output as you do. I believe it's still prone to some 
fuzziness because ecj linter and javac can run in arbitrary order. You can see 
it (on the SlowDirectory.java error example) when you compare the output of:
   ```
   gradlew -p lucene/facet compileTestJava ecjLint --max-workers 1
   gradlew -p lucene/facet ecjLint compileTestJava --max-workers 1
   ```
   
   I don't think it's a big issue though since both of these emit reasonably 
looking error messages.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] dweiss merged pull request #12015: Run spotless after javac (#12012)

2022-12-12 Thread GitBox


dweiss merged PR #12015:
URL: https://github.com/apache/lucene/pull/12015


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] dweiss commented on issue #12012: Spotless runs before javac in `gradle check` which results in less-than-helpful errors on compilation problems

2022-12-12 Thread GitBox


dweiss commented on issue #12012:
URL: https://github.com/apache/lucene/issues/12012#issuecomment-1347874082

   I've applied the patch to 9x and main, btw.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org