date:20221109

[GitHub] [lucene] javanna commented on pull request #11793: Prevent PointValues from returning null for ghost fields

2022-11-09 Thread GitBox



javanna commented on PR #11793:
URL: https://github.com/apache/lucene/pull/11793#issuecomment-1308433716

   @jpountz would you have time to take another look at this please?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] mohamedniyaz1996 opened a new issue, #11908: Get scores along with Spellcheck suggestions

2022-11-09 Thread GitBox



mohamedniyaz1996 opened a new issue, #11908:
URL: https://github.com/apache/lucene/issues/11908

   ### Description
   
   As we know, lucene spellcheck 
[suggestsimilar()](https://lucene.apache.org/core/6_0_1/suggest/org/apache/lucene/search/spell/SpellChecker.html#suggestSimilar-java.lang.String-int-)
 method returns just list of suggestions, it would be good to have a way to get 
the scores of each suggestions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] benwtrent commented on pull request #11905: Fix integer overflow when seeking the vector index for connections

2022-11-09 Thread GitBox



benwtrent commented on PR #11905:
URL: https://github.com/apache/lucene/pull/11905#issuecomment-1308664186

   @rmuir i took your test, modified it slightly (changing number of vectors 
and the assertion). It ran for 3.5 hours and failed on the old code in the 
exactly correct spot (overflowing on graph location).
   
   I am running it again with my fix to verify the fix. Thank you for iterating 
on this with me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] rmuir commented on pull request #11905: Fix integer overflow when seeking the vector index for connections

2022-11-09 Thread GitBox



rmuir commented on PR #11905:
URL: https://github.com/apache/lucene/pull/11905#issuecomment-1308697281

   yeah its your search at the end that triggers the issue, because checkindex 
on vectors is currently too wimpy and doesn't ever call seek(). this is also an 
issue that we must address here.
   
   you can make the test much faster by sticking with just one dimension and 
passing some additional args. The one on the other PR runs in 1.5 hours in its 
current config. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] donnerpeter opened a new pull request, #11909: hunspell: introduce FragmentChecker to speed up ModifyingSuggester

2022-11-09 Thread GitBox



donnerpeter opened a new pull request, #11909:
URL: https://github.com/apache/lucene/pull/11909

   add NGramFragmentChecker to quickly check whether insertions/replacements 
produce strings that are even possible in the language


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] rmuir opened a new issue, #11910: improve error-prone configuration for int-overflow bugs

2022-11-09 Thread GitBox



rmuir opened a new issue, #11910:
URL: https://github.com/apache/lucene/issues/11910

   ### Description
   
   As mentioned by @dweiss and @benwtrent in #11905, some static analysis could 
help here. Maybe it can force us to do a pass  reviewing other sketchy possible 
bugs like this in the codebase, and help prevent new ones from popping up in 
the future.
   
   I already looked into ecj (including their latest mainline) and don't see 
any applicable checks. So, let's look into error-prone?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] jpountz commented on pull request #11875: Usability improvements for timeout support in IndexSearcher

2022-11-09 Thread GitBox



jpountz commented on PR #11875:
URL: https://github.com/apache/lucene/pull/11875#issuecomment-1309017859

   I'm somewhat familiar with this code, I wonder if it could be refactored in 
such a way that it would directly leverage Lucene's timeout support, e.g. can 
the logic that uses live docs to drive iteration be folded into the weight 
wrapper? Then it should be possible to use the existing `IndexSearcher#search` 
logic as-is? I wouldn't like to make more classes visible in Lucene than 
necessary for something that feels like it would be better addressed in 
Elasticsearch.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] jpountz commented on issue #11702: Multi-Value Support for Binary DocValues [LUCENE-10666]

2022-11-09 Thread GitBox



jpountz commented on issue #11702:
URL: https://github.com/apache/lucene/issues/11702#issuecomment-1309023675

   > e.g., the ESQL query language/engine proposed by Elastic.
   
   I don't think ESQL is going to be different from existing faceting support: 
it will still want to use ordinals when it makes sense such as grouping by 
term. It will still be up to users to configure their mappings correctly for 
the sort of aggregation that they plan to run: `SORTED(_SET)` for sorting, 
grouping, and unique counts, `BINARY` for operations that require actually 
looking at the data and can't work on ordinals.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] benwtrent commented on issue #11910: improve error-prone configuration for int-overflow bugs

2022-11-09 Thread GitBox



benwtrent commented on issue #11910:
URL: https://github.com/apache/lucene/issues/11910#issuecomment-1309035065

   I think static analysis will be a significant help.
   
   So, I did a quick check at turning on `IntLongMath` for `error-prone` and it 
is noisy. 
   
   Its flagging things like this as bad behavior:
   ```java
   long function(int v) {
  return v - 2;
   }
   ```
   or this
   ```java
   long foo() {
 return 2 * 2
   }
   ```
   
   As an aside, almost every single `long ramBytesUsed()` is flagged as 
auto-cast since it returns `long` but there are things like `2 * 
Integer.BYTE_VALUE`. 
   
   We may need something more nuanced or focused besides `error-prone`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] rmuir commented on issue #11910: improve error-prone configuration for int-overflow bugs

2022-11-09 Thread GitBox



rmuir commented on issue #11910:
URL: https://github.com/apache/lucene/issues/11910#issuecomment-1309038831

   maybe we could even turn it on, and dump its output here as a one-time 
thing? we can just take a pass through it, and filter out the noisy ones.
   
   I realize it isn't ideal and doesn't improve the build, but it still helps 
us try to look for similar bugs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] reta commented on pull request #11875: Usability improvements for timeout support in IndexSearcher

2022-11-09 Thread GitBox



reta commented on PR #11875:
URL: https://github.com/apache/lucene/pull/11875#issuecomment-1309053182

   > Then it should be possible to use the existing `IndexSearcher#search` 
logic as-is?
   
   Thanks @jpountz, I think that would be the best option for everyone (Lucene 
/ OpenSearch / Elasticsearch), but it would certainly fall into "high risk" 
change without reverse engineering and matching both implementations (I don't 
have full context sadly). I am closing this pull request (and related issue), 
we may get back to it at some point in the future, but for the time being it 
could not be used in OpenSearch. Thank you for your time.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] reta closed pull request #11875: Usability improvements for timeout support in IndexSearcher

2022-11-09 Thread GitBox



reta closed pull request #11875: Usability improvements for timeout support in 
IndexSearcher
URL: https://github.com/apache/lucene/pull/11875


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] reta commented on issue #11874: Usability improvements for timeout support in IndexSearcher

2022-11-09 Thread GitBox



reta commented on issue #11874:
URL: https://github.com/apache/lucene/issues/11874#issuecomment-1309053848

   Closing was "won't do"


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] reta closed issue #11874: Usability improvements for timeout support in IndexSearcher

2022-11-09 Thread GitBox



reta closed issue #11874: Usability improvements for timeout support in 
IndexSearcher
URL: https://github.com/apache/lucene/issues/11874


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] dweiss commented on issue #11910: improve error-prone configuration for int-overflow bugs

2022-11-09 Thread GitBox



dweiss commented on issue #11910:
URL: https://github.com/apache/lucene/issues/11910#issuecomment-1309095897

   I can understand where it signals a problem but can't determine the domain 
(the v - 2 can underflow if you pass v small enough)... The 2* 2 case is odd, 
I'm surprised it doesn't see a constant there. 
   
   I'm not a big fan of many of those tools - I agree they're noisy - but from 
time to time they do find real problems.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] dweiss commented on a diff in pull request #11909: hunspell: introduce FragmentChecker to speed up ModifyingSuggester

2022-11-09 Thread GitBox



dweiss commented on code in PR #11909:
URL: https://github.com/apache/lucene/pull/11909#discussion_r1018232665


##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/FragmentChecker.java:
##
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+/**
+ * An oracle for quickly checking that a specific part of a word can never be 
a valid word. This
+ * allows speeding up the "Modification" part of {@link Suggester} but 
avoiding expensive checks on
+ * impossible words. Implementations may use character case, n-grams or 
whatever they like.
+ *
+ * @see NGramFragmentChecker
+ */
+public interface FragmentChecker {
+  FragmentChecker EVERYTHING_POSSIBLE = (word, start, end) -> false;
+
+  /**
+   * Check if some substring of the word that intersects with the given 
offsets is impossible in the

Review Comment:
   ```suggestion
  * Check if the substring of the word between {@code start} (inclusive) 
and {@code end} (exclusive) is impossible in the
   ```
   I didn't check whether the inclusive/exclusive part is right but I'd use the 
same contract as in String substring/subsequence.



##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/FragmentChecker.java:
##
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+/**
+ * An oracle for quickly checking that a specific part of a word can never be 
a valid word. This
+ * allows speeding up the "Modification" part of {@link Suggester} but 
avoiding expensive checks on

Review Comment:
   ```suggestion
* allows speeding up the "Modification" part of {@link Suggester} by 
avoiding expensive checks on
   ```



##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/NGramFragmentChecker.java:
##
@@ -0,0 +1,181 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+import java.util.BitSet;
+import java.util.Collection;
+import java.util.Locale;
+import java.util.function.Consumer;
+
+/**
+ * A {@link FragmentChecker} based on all character n-grams possible in a 
certain language, keeping
+ * them in a relatively memory-efficient, but probabilistic data structure.
+ *
+ * @see #fromAllSimpleWords for enumerating the whole dictionary automatically
+ * @see #fromWords for creating an instance from a precomputed set of all word 
forms or n-grams
+ */
+public class NGramFragmentChecker implements FragmentChecker {
+  private final int n;
+  private final BitSet hashes;
+
+  private NGramFragmentChecker(int n, BitSet hashes) {
+if (n

[GitHub] [lucene] benwtrent commented on pull request #11905: Fix integer overflow when seeking the vector index for connections

2022-11-09 Thread GitBox



benwtrent commented on PR #11905:
URL: https://github.com/apache/lucene/pull/11905#issuecomment-1309129512

   Updated the monster test. 
   
   Ran about 2hrs on my laptop (but I was working, in virtual meetings, etc. 
during the entire run). 
   
   Confirmed it fails without this patch. This patch allows it to pass.
   
   @rmuir @jdconrad 
   
   Would be good to get this in a 9.4.2 patch.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] rmuir commented on pull request #11905: Fix integer overflow when seeking the vector index for connections

2022-11-09 Thread GitBox



rmuir commented on PR #11905:
URL: https://github.com/apache/lucene/pull/11905#issuecomment-1309198710

   nice. i'm fine with the changes. We can open another issue to fix the 
checkindex stuff. I do really think we should do that before releasing to look 
for more trouble. Also good if we can go through a candidate list of other 
potential problems (even if its noisy, its still less than entire codebase).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] rmuir commented on pull request #11905: Fix integer overflow when seeking the vector index for connections

2022-11-09 Thread GitBox



rmuir commented on PR #11905:
URL: https://github.com/apache/lucene/pull/11905#issuecomment-1309200544

   and thanks for help battling the testing. it will get better!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] rmuir opened a new issue, #11911: improve checkindex to be more thorough for vectors (e.g. test seeking)

2022-11-09 Thread GitBox



rmuir opened a new issue, #11911:
URL: https://github.com/apache/lucene/issues/11911

   ### Description
   
   Currently checkindex only `next()'s` through the vectors. We should do some 
seeking as well. It doesnt have to be intense, e.g. we could do 64 seeks (if 
there's 20,000,000 docs in the index, seek to 0, 312500, ..., something like 
that).
   
   This would have saved a lot of time with #11905 but will also improve 
testing in general. CheckIndex gets called on every index created by every 
test, so checks here are very powerful.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] rmuir commented on issue #11702: Multi-Value Support for Binary DocValues [LUCENE-10666]

2022-11-09 Thread GitBox



rmuir commented on issue #11702:
URL: https://github.com/apache/lucene/issues/11702#issuecomment-1309244025

   > Curious why you think a document can't have multiple locations? Why 
wouldn't the geo (wkt, json, wkb, protobuf) specification then not have Multi 
geometry types? The reason they do is because multi location can exist for a 
single document. It happens all the time, especially in data science 
applications where multiple observations are collected concurrently in a single 
document scan (RADAR, Multi Hypothesis Tracking). I should be able to have a 
multi value doc value for running facets, aggregations, spark jobs over the 
data stored in the lucene segment instead of trying to hack together a single 
encoding that stores all of the observations at once, and then post filter 
after decoding that entire binary value.
   > 
   > 
   
   because in the real world objects can only exist in one place a a time. 
That's an actual fact. And the way it works in the search engine, doing things 
like sorting by distance, really only makes sense with single valued fields.
   
   This is why i hate all multi-valued docvalues, because its always so 
ambiguous. If i have 3 locations for the doc and i'm sorting by distance, which 
one should i use? etc etc.
   
   If someone wants to encode multiple values into a binary docvalues, nothing 
is stopping them. they can encode integer/byte length up front, do a vint-like 
encoding, whatever they want. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] nknize commented on issue #11702: Multi-Value Support for Binary DocValues [LUCENE-10666]

2022-11-09 Thread GitBox



nknize commented on issue #11702:
URL: https://github.com/apache/lucene/issues/11702#issuecomment-1309269456

   > because in the real world objects can only exist in one place a a time.
   
   Except in geo search / analysis this depends on [spatial 
resolution](https://en.wikipedia.org/wiki/Spatial_resolution) of the source 
data; real world geo data is not precise and often ends up with multiple 
documents in the same location. Clearly (not through a hack) supporting these 
use cases  along with supporting coverage areas as a multi geometry shape only 
makes Lucene stronger. I don't think there's anything wrong w/ supporting 
standards like [RFC 7946](https://www.rfc-editor.org/rfc/rfc7946) in our 
encoding.
   
   > This is why i hate all multi-valued docvalues, because its always so 
ambiguous. If i have 3 locations for the doc and i'm sorting by distance, which 
one should i use? etc etc.
   
   It depends on spatial resolution. Besides, we support these use cases 
already, we don't support the multi-shape use case above without an 
unnecessarily bloated hackey encoding.
   
   > If someone wants to encode multiple values into a binary docvalues, 
nothing is stopping them. they can encode integer/byte length up front, do a 
vint-like encoding, whatever they want.
   
   It's software, yes. Nothing is also stopping anyone from encoding the bible 
and calling it a geo_shape. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] nknize commented on issue #11702: Multi-Value Support for Binary DocValues [LUCENE-10666]

2022-11-09 Thread GitBox



nknize commented on issue #11702:
URL: https://github.com/apache/lucene/issues/11702#issuecomment-1309274884

   > Therefore, we created a prototype that implements multi-valued binary 
docvalues which works well. However, having some support for this use case 
directly in Lucene is preferable, be it a new docvalues or some tooling as 
proposed by @jpountz .
   
   @scampi Would you be willing to contribute your multi-valued binary doc 
value implementation here? I think having multi-value parity with other doc 
value types is good to support multiple use cases like this. Per concerns 
already raised it would be good to slap warnings in the API doc that 
communicate potential trappy performance issues.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] donnerpeter commented on a diff in pull request #11909: hunspell: introduce FragmentChecker to speed up ModifyingSuggester

2022-11-09 Thread GitBox



donnerpeter commented on code in PR #11909:
URL: https://github.com/apache/lucene/pull/11909#discussion_r1018356474


##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/FragmentChecker.java:
##
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+/**
+ * An oracle for quickly checking that a specific part of a word can never be 
a valid word. This
+ * allows speeding up the "Modification" part of {@link Suggester} but 
avoiding expensive checks on
+ * impossible words. Implementations may use character case, n-grams or 
whatever they like.
+ *
+ * @see NGramFragmentChecker
+ */
+public interface FragmentChecker {
+  FragmentChecker EVERYTHING_POSSIBLE = (word, start, end) -> false;
+
+  /**
+   * Check if some substring of the word that intersects with the given 
offsets is impossible in the

Review Comment:
   Mostly it's a single-char interval there now. And the semantics is as 
described (albeit probably not very clearly): if there is any impossible 
substring (ngram) intersecting that 1-char interval, return true. Any ideas on 
the wording improvement?



##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/FragmentChecker.java:
##
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+/**
+ * An oracle for quickly checking that a specific part of a word can never be 
a valid word. This
+ * allows speeding up the "Modification" part of {@link Suggester} but 
avoiding expensive checks on
+ * impossible words. Implementations may use character case, n-grams or 
whatever they like.
+ *
+ * @see NGramFragmentChecker
+ */
+public interface FragmentChecker {
+  FragmentChecker EVERYTHING_POSSIBLE = (word, start, end) -> false;
+
+  /**
+   * Check if some substring of the word that intersects with the given 
offsets is impossible in the

Review Comment:
   Mostly it's a single-char interval there now, and those are all possible. 
And the semantics is as described (albeit probably not very clearly): if there 
is any impossible substring (ngram) intersecting that 1-char interval, return 
true. Any ideas on the wording improvement?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] rmuir opened a new issue, #11912: Can we improve MMapDir's exceptions for invalid offsets?

2022-11-09 Thread GitBox



rmuir opened a new issue, #11912:
URL: https://github.com/apache/lucene/issues/11912

   ### Description
   
   For #11905 bug, as an example, the user may get a generic exception "Seek 
Past EOF".
   
   Can we improve it to include the bogus position? For example, If you can see 
that offset is negative, it is faster to debug. I think the new MMapDir already 
has some improvements here. Better error messages would help not just people 
debugging real world problems, but also developers making new stuff and 
debugging tests.
   
   cc: @uschindler 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] donnerpeter commented on a diff in pull request #11909: hunspell: introduce FragmentChecker to speed up ModifyingSuggester

2022-11-09 Thread GitBox



donnerpeter commented on code in PR #11909:
URL: https://github.com/apache/lucene/pull/11909#discussion_r1018359280


##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/NGramFragmentChecker.java:
##
@@ -0,0 +1,181 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+import java.util.BitSet;
+import java.util.Collection;
+import java.util.Locale;
+import java.util.function.Consumer;
+
+/**
+ * A {@link FragmentChecker} based on all character n-grams possible in a 
certain language, keeping
+ * them in a relatively memory-efficient, but probabilistic data structure.
+ *
+ * @see #fromAllSimpleWords for enumerating the whole dictionary automatically
+ * @see #fromWords for creating an instance from a precomputed set of all word 
forms or n-grams
+ */
+public class NGramFragmentChecker implements FragmentChecker {
+  private final int n;
+  private final BitSet hashes;
+
+  private NGramFragmentChecker(int n, BitSet hashes) {
+if (n < 2) throw new IllegalArgumentException("N should be more >=2: " + 
n);
+
+this.n = n;
+this.hashes = hashes;
+
+if (hashes.cardinality() > hashes.size() * 2 / 3) {
+  throw new IllegalArgumentException(
+  "Too many collisions, please report this to d...@lucene.apache.org");
+}
+  }
+
+  int hashCount() {
+return hashes.cardinality();
+  }
+
+  @Override
+  public boolean hasImpossibleFragmentAround(CharSequence word, int start, int 
end) {
+if (word.length() < n) {
+  return false;
+}
+int firstIntersectingStart = Math.max(0, start - n + 1);
+int lastIntersectingStart = Math.min(end - 1, word.length() - n);
+for (int i = firstIntersectingStart; i <= lastIntersectingStart; i++) {
+  if (!hashes.get(Math.abs(lowCollisionHash(word, i, i + n) % 
hashes.size( {
+return true;
+  }
+}
+return false;
+  }
+
+  private static int lowCollisionHash(CharSequence chars, int offset, int end) 
{
+int result = 0;
+for (int i = offset; i < end; i++) {
+  result = 239 * result + chars.charAt(i);
+}
+return result;
+  }
+
+  /**
+   * Iterate the whole dictionary, derive all word forms (using {@link 
WordFormGenerator}), vary the
+   * case to get all words acceptable by the spellchecker, and create a 
fragment checker based on
+   * their {@code n}-grams. Note that this enumerates only words derivable by 
suffixes and prefixes.
+   * If the language has compounds, some n-grams possible via those compounds 
can be missed. In the
+   * latter case, consider using {@link #fromWords}.
+   *
+   * @param n the length of n-grams
+   * @param dictionary the dictionary to traverse
+   * @param checkCanceled an object that's periodically called, allowing to 
interrupt the traversal
+   * by throwing an exception
+   */
+  public static NGramFragmentChecker fromAllSimpleWords(
+  int n, Dictionary dictionary, Runnable checkCanceled) {
+BitSet hashes = new BitSet(1 << (7 + n * 3));

Review Comment:
   It's from the experiment. I've forgot to restrict it, thanks for reminding! 
And yes, it's currently a single-function Bloom filter, the implementation can 
be changed later.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] dweiss commented on a diff in pull request #11909: hunspell: introduce FragmentChecker to speed up ModifyingSuggester

2022-11-09 Thread GitBox



dweiss commented on code in PR #11909:
URL: https://github.com/apache/lucene/pull/11909#discussion_r1018368778


##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/FragmentChecker.java:
##
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+/**
+ * An oracle for quickly checking that a specific part of a word can never be 
a valid word. This
+ * allows speeding up the "Modification" part of {@link Suggester} but 
avoiding expensive checks on
+ * impossible words. Implementations may use character case, n-grams or 
whatever they like.
+ *
+ * @see NGramFragmentChecker
+ */
+public interface FragmentChecker {
+  FragmentChecker EVERYTHING_POSSIBLE = (word, start, end) -> false;
+
+  /**
+   * Check if some substring of the word that intersects with the given 
offsets is impossible in the

Review Comment:
   Ok, please disregard the suggestion. I get it now, although perhaps the 
wording could be improved indeed. Or maybe a concrete example based on the 
ngram fragment checker would clarify things more than a formal contract? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] donnerpeter commented on a diff in pull request #11909: hunspell: introduce FragmentChecker to speed up ModifyingSuggester

2022-11-09 Thread GitBox



donnerpeter commented on code in PR #11909:
URL: https://github.com/apache/lucene/pull/11909#discussion_r1018377196


##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/FragmentChecker.java:
##
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.analysis.hunspell;
+
+/**
+ * An oracle for quickly checking that a specific part of a word can never be 
a valid word. This
+ * allows speeding up the "Modification" part of {@link Suggester} but 
avoiding expensive checks on
+ * impossible words. Implementations may use character case, n-grams or 
whatever they like.
+ *
+ * @see NGramFragmentChecker
+ */
+public interface FragmentChecker {
+  FragmentChecker EVERYTHING_POSSIBLE = (word, start, end) -> false;
+
+  /**
+   * Check if some substring of the word that intersects with the given 
offsets is impossible in the

Review Comment:
   Expanded the documentation. Hopefully it's clearer now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] uschindler commented on issue #11912: Can we improve MMapDir's exceptions for invalid offsets?

2022-11-09 Thread GitBox



uschindler commented on issue #11912:
URL: https://github.com/apache/lucene/issues/11912#issuecomment-1309351422

   As discussed in chat an hour ago:
   - for MemorySegmentIndexInput it already has a reworked Exception code where 
also those suppress unused warnings are gone due to some trick: we just catch 
the exception, but pass them to a method that maps it to an user readable 
Lucene exception and regrow it  This reduced code duplication and removed most 
suppress warnings.
   
   The code in memory segments already contains a message to report negative 
positions. I can change it to show the actual position.
   
   I would backport the exception factory methods to the byte buffer index 
input, that's a bit mechanical work but makes the code also better readable.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] uschindler commented on issue #11912: Can we improve MMapDir's exceptions for invalid offsets?

2022-11-09 Thread GitBox



uschindler commented on issue #11912:
URL: https://github.com/apache/lucene/issues/11912#issuecomment-1309352460

   Maybe include that in 9.4.3 ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] uschindler commented on issue #11912: Can we improve MMapDir's exceptions for invalid offsets?

2022-11-09 Thread GitBox



uschindler commented on issue #11912:
URL: https://github.com/apache/lucene/issues/11912#issuecomment-1309354880

   Will do that tomorrow, PR will come. We can decide to add it to 9.4 branch 
to make later debugging.of broken code with recent vectors easier.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] stevenschlansker opened a new issue, #11913: lucene-replicator PrimaryNode unsafely publishes reference during construction

2022-11-09 Thread GitBox



stevenschlansker opened a new issue, #11913:
URL: https://github.com/apache/lucene/issues/11913

   ### Description
   
   In the lucene-replicator module, the PrimaryNode does some initialization 
work in the constructor. It starts with an IndexWriter provided by the 
application author. At line 92:
   
   ```
   writer.getConfig().setMergedSegmentWarmer(new 
PreCopyMergedSegmentWarmer(this));
   ```
   
   In this line, the IndexWriter is mutated to indirectly reference 
`PrimaryNode.this` before the constructor initialization has completed. At this 
point, important fields are not set yet (`mgr` for example is not set), and 
further initialization is not done carefully with regards to thread safety 
(`mgr` is not volatile, for example).
   
   Meanwhile, a background thread calls `IndexWriter.commit()`, not realizing 
the `PrimaryNode` is not yet initialized. Doing so causes a merge, which causes 
the `PreCopyMergedSegmentWarmer` to then invoke 
`PrimaryNode.preCopyMergedSegmentFiles`. Now, another thread is calling an 
instance method on an incompletely initialized PrimaryNode.
   
   In our case, this leads to a `NullPointerException` reading a field of our 
PrimaryNode subclass, despite it being `final` and initialized in the 
constructor. Essentially, this code can fail, but only in rare circumstances:
   
   ```
   public class MyNode extends PrimaryNode {
   private final ImportantField importantField;
   
   public MyNode(IndexWriter writer) {
   super(writer, ...); // leaks `this` into `writer`
   importantField = new ImportantField();
   }
   
   protected void preCopyMergedSegmentFiles(...) {
   // this can be called before constructor finishes! danger!
   importantField.getInformation(); // very surprising 
NullPointerException!
   }
   }
   ```
   
   I am not entirely sure what the fix here is. Initialization is clearly a 
tricky problem.
   Generally, it might be nice to move any code beyond setting fields out of 
the PrimaryNode constructor into a start() method, but this is likely not a 
compatible change.
   Leaking a `this` reference from an object constructor into shared state is 
an inherently risky thing to do.
   
   ### Version and environment details
   
   Lucene 9.4.1, Java 17+19, platform-independent


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] jtibshirani commented on issue #11911: improve checkindex to be more thorough for vectors (e.g. test seeking)

2022-11-09 Thread GitBox



jtibshirani commented on issue #11911:
URL: https://github.com/apache/lucene/issues/11911#issuecomment-1309460506

   This is a good idea! As a note -- in order to exercise all parts of the file 
format, we'll have to perform a kNN search too through 
`KnnVectorsReader#search`. If we only load `VectorValues` and iterating through 
them (even with skipping), then we won't exercise the HNSW graph part which 
lives in the `.vex` file.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] jpountz opened a new issue, #11914: Remove QueryTimeout#isTimeoutEnabled?

2022-11-09 Thread GitBox



jpountz opened a new issue, #11914:
URL: https://github.com/apache/lucene/issues/11914

   ### Description
   
   I don't understand well why `QueryTimeout` has a `isTimeoutEnabled` method. 
If the timeout is not enabled, why configure a timeout at all?
   
   ### Version and environment details
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] jpountz commented on issue #11868: Add a FilterIndexOutput

2022-11-09 Thread GitBox



jpountz commented on issue #11868:
URL: https://github.com/apache/lucene/issues/11868#issuecomment-1309896931

   Let's do `FilterIndexInput` at the same time?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene] javanna commented on pull request #11793: Prevent PointValues from returning null for ghost fields

[GitHub] [lucene] mohamedniyaz1996 opened a new issue, #11908: Get scores along with Spellcheck suggestions

[GitHub] [lucene] benwtrent commented on pull request #11905: Fix integer overflow when seeking the vector index for connections

[GitHub] [lucene] rmuir commented on pull request #11905: Fix integer overflow when seeking the vector index for connections

[GitHub] [lucene] donnerpeter opened a new pull request, #11909: hunspell: introduce FragmentChecker to speed up ModifyingSuggester

[GitHub] [lucene] rmuir opened a new issue, #11910: improve error-prone configuration for int-overflow bugs

[GitHub] [lucene] jpountz commented on pull request #11875: Usability improvements for timeout support in IndexSearcher

[GitHub] [lucene] jpountz commented on issue #11702: Multi-Value Support for Binary DocValues [LUCENE-10666]

[GitHub] [lucene] benwtrent commented on issue #11910: improve error-prone configuration for int-overflow bugs

[GitHub] [lucene] rmuir commented on issue #11910: improve error-prone configuration for int-overflow bugs

[GitHub] [lucene] reta commented on pull request #11875: Usability improvements for timeout support in IndexSearcher

[GitHub] [lucene] reta closed pull request #11875: Usability improvements for timeout support in IndexSearcher

[GitHub] [lucene] reta commented on issue #11874: Usability improvements for timeout support in IndexSearcher

[GitHub] [lucene] reta closed issue #11874: Usability improvements for timeout support in IndexSearcher

[GitHub] [lucene] dweiss commented on issue #11910: improve error-prone configuration for int-overflow bugs

[GitHub] [lucene] dweiss commented on a diff in pull request #11909: hunspell: introduce FragmentChecker to speed up ModifyingSuggester

[GitHub] [lucene] benwtrent commented on pull request #11905: Fix integer overflow when seeking the vector index for connections

[GitHub] [lucene] rmuir commented on pull request #11905: Fix integer overflow when seeking the vector index for connections

[GitHub] [lucene] rmuir commented on pull request #11905: Fix integer overflow when seeking the vector index for connections

[GitHub] [lucene] rmuir opened a new issue, #11911: improve checkindex to be more thorough for vectors (e.g. test seeking)

[GitHub] [lucene] rmuir commented on issue #11702: Multi-Value Support for Binary DocValues [LUCENE-10666]

[GitHub] [lucene] nknize commented on issue #11702: Multi-Value Support for Binary DocValues [LUCENE-10666]

[GitHub] [lucene] nknize commented on issue #11702: Multi-Value Support for Binary DocValues [LUCENE-10666]

[GitHub] [lucene] donnerpeter commented on a diff in pull request #11909: hunspell: introduce FragmentChecker to speed up ModifyingSuggester

[GitHub] [lucene] rmuir opened a new issue, #11912: Can we improve MMapDir's exceptions for invalid offsets?

[GitHub] [lucene] donnerpeter commented on a diff in pull request #11909: hunspell: introduce FragmentChecker to speed up ModifyingSuggester

[GitHub] [lucene] dweiss commented on a diff in pull request #11909: hunspell: introduce FragmentChecker to speed up ModifyingSuggester

[GitHub] [lucene] donnerpeter commented on a diff in pull request #11909: hunspell: introduce FragmentChecker to speed up ModifyingSuggester

[GitHub] [lucene] uschindler commented on issue #11912: Can we improve MMapDir's exceptions for invalid offsets?

[GitHub] [lucene] uschindler commented on issue #11912: Can we improve MMapDir's exceptions for invalid offsets?

[GitHub] [lucene] uschindler commented on issue #11912: Can we improve MMapDir's exceptions for invalid offsets?

[GitHub] [lucene] stevenschlansker opened a new issue, #11913: lucene-replicator PrimaryNode unsafely publishes reference during construction

[GitHub] [lucene] jtibshirani commented on issue #11911: improve checkindex to be more thorough for vectors (e.g. test seeking)

[GitHub] [lucene] jpountz opened a new issue, #11914: Remove QueryTimeout#isTimeoutEnabled?

[GitHub] [lucene] jpountz commented on issue #11868: Add a FilterIndexOutput

35 matches

Site Navigation

Mail list logo

Footer information