[GitHub] [lucene] thecoop opened a new pull request, #11847: Add a method allowing canonical strings to be returned from DataInput

2022-10-13 Thread GitBox


thecoop opened a new pull request, #11847:
URL: https://github.com/apache/lucene/pull/11847

   Use a shared buffer for decoding short strings


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] rmuir commented on pull request #11847: Add a method allowing canonical strings to be returned from DataInput

2022-10-13 Thread GitBox


rmuir commented on PR #11847:
URL: https://github.com/apache/lucene/pull/11847#issuecomment-1277440222

   I don't know what this string interning is here, but I am strongly opposed 
to it





[GitHub] [lucene] rmuir merged pull request #11844: Mark TestLongBitSet.testHugeCapacity @Monster as it requires a lot of memory

2022-10-13 Thread GitBox


rmuir merged PR #11844:
URL: https://github.com/apache/lucene/pull/11844





[GitHub] [lucene] rmuir closed issue #11842: TestLongBitSet.testHugeCapacity OOM

2022-10-13 Thread GitBox


rmuir closed issue #11842: TestLongBitSet.testHugeCapacity OOM
URL: https://github.com/apache/lucene/issues/11842





[GitHub] [lucene] rmuir merged pull request #11846: WrapperDownloader: add retries for network blips around connect(), too

2022-10-13 Thread GitBox


rmuir merged PR #11846:
URL: https://github.com/apache/lucene/pull/11846





[GitHub] [lucene] rmuir closed issue #11845: WrapperDownloader should retry on Layer3/Layer4 network errors

2022-10-13 Thread GitBox


rmuir closed issue #11845: WrapperDownloader should retry on Layer3/Layer4 
network errors
URL: https://github.com/apache/lucene/issues/11845





[GitHub] [lucene] jpountz commented on pull request #11843: Remove cancellation check on every vector

2022-10-13 Thread GitBox


jpountz commented on PR #11843:
URL: https://github.com/apache/lucene/pull/11843#issuecomment-1277592908

   > I wonder if we are running benchmarks with the cancellation/timeout 
checker?
   
   We recently introduced support for benchmarking the impact of timeouts in 
the benchmark suite, but it's not checked with nightly benchmarks.





[GitHub] [lucene] jpountz merged pull request #11841: GITHUB-11761 (part 2): Fix unit tests to cleany work with new TierMer…

2022-10-13 Thread GitBox


jpountz merged PR #11841:
URL: https://github.com/apache/lucene/pull/11841





[GitHub] [lucene] jpountz commented on issue #11761: Expand TieredMergePolicy deletePctAllowed limits

2022-10-13 Thread GitBox


jpountz commented on issue #11761:
URL: https://github.com/apache/lucene/issues/11761#issuecomment-1277704569

   Closing: https://github.com/apache/lucene/pull/11831 has been merged.





[GitHub] [lucene] jpountz closed issue #11761: Expand TieredMergePolicy deletePctAllowed limits

2022-10-13 Thread GitBox


jpountz closed issue #11761: Expand TieredMergePolicy deletePctAllowed limits
URL: https://github.com/apache/lucene/issues/11761





[GitHub] [lucene] rmuir opened a new issue, #11848: Fix ExitableDirectoryReader sampling constants to be power-of-2

2022-10-13 Thread GitBox


rmuir opened a new issue, #11848:
URL: https://github.com/apache/lucene/issues/11848

   ### Description
   
   When looking at #11843, I noticed code of the following in several places in 
ExitableDirectoryReader:
   
   ```
   if (calls++ % MAX_CALLS_XXX == 0) {
     checkAndThrow();
   }
   ```
   
   Unfortunately these `MAX_CALLS_XXX` constants are not powers of 2: we should 
avoid integer division.
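A minimal sketch of the idea, not the actual ExitableDirectoryReader code: when the sampling interval is a power of two, the "every Nth call" test can be done with a bit mask instead of a remainder. The class name, the `SAMPLING_INTERVAL` value and the `checks` counter are hypothetical and exist only for illustration.

```java
/** Illustrative only: names and the interval value are assumptions, not Lucene's. */
final class SampledCheck {
  // Must be a power of two so that (interval - 1) is a valid bit mask.
  static final int SAMPLING_INTERVAL = 1 << 10; // 1024
  static final int MASK = SAMPLING_INTERVAL - 1;

  private int calls;
  int checks; // how many times the expensive check actually ran

  /** Runs the expensive check once every SAMPLING_INTERVAL calls. */
  void onCall() {
    // Same outcome as (calls++ % SAMPLING_INTERVAL == 0), using only a bitwise AND.
    if ((calls++ & MASK) == 0) {
      checkAndThrow();
    }
  }

  private void checkAndThrow() {
    checks++; // stand-in for a timeout/cancellation check
  }

  public static void main(String[] args) {
    SampledCheck sampled = new SampledCheck();
    for (int i = 0; i < 4096; i++) {
      sampled.onCall();
    }
    System.out.println(sampled.checks); // the check fires on calls 0, 1024, 2048, 3072
  }
}
```

The mask form only works when the interval is a power of two, which is why the constants themselves would need to change.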





[GitHub] [lucene] benwtrent opened a new pull request, #11849: Fix failure to load larger data sets in KnnGraphTest

2022-10-13 Thread GitBox


benwtrent opened a new pull request, #11849:
URL: https://github.com/apache/lucene/pull/11849

   When running the `reindex` task with KnnGraphTest, exceptionally large 
datasets can be used. Since mmap is used to read the data, we need to know the 
buffer size. This size is limited to Integer.MAX_VALUE, which is inadequate for 
larger datasets.
   
   An example data set that the current behavior fails on is: 
http://sites.skoltech.ru/compvision/noimi/ (`deep-image-96-angular` in 
ann-benchmarks). 
   
   Specifically, the `deep-image-96-angular` dataset in mapped memory has a size of 
`9990000 * 96 * 4` (docNum * dim * byteSizeOf(Float32)). As an int, this rolls 
over to `-458807296`; as a long, it is `3836160000`.
   
   So, this commit adds back the iterative batching, taking batch sizes near 
`Integer.MAX_VALUE` that are a multiple of `dim * byteSize`. 
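The arithmetic can be sketched as follows. This is not the KnnGraphTest code: the class and method names are hypothetical, and the document count of 9,990,000 is an assumption chosen because it is the value consistent with the quoted int rollover of `-458807296`.

```java
/** Illustrative arithmetic only; class and method names are assumptions. */
final class MappedSizeOverflow {
  /** Total size computed in int arithmetic, which silently wraps around. */
  static int sizeAsInt(int docNum, int dim) {
    return docNum * dim * Float.BYTES;
  }

  /** Total size computed in long arithmetic by widening the first operand. */
  static long sizeAsLong(int docNum, int dim) {
    return (long) docNum * dim * Float.BYTES;
  }

  /** Largest byte count that fits in an int and holds a whole number of vectors. */
  static int batchBytes(int dim) {
    int vectorBytes = dim * Float.BYTES;
    return (Integer.MAX_VALUE / vectorBytes) * vectorBytes;
  }

  public static void main(String[] args) {
    int docNum = 9_990_000; // assumed document count
    int dim = 96;
    System.out.println(sizeAsInt(docNum, dim));  // wrapped, negative
    System.out.println(sizeAsLong(docNum, dim)); // true size in bytes
    System.out.println(batchBytes(dim));         // multiple of dim * 4 below 2^31
  }
}
```

Rounding the batch size down to a multiple of `dim * byteSize` keeps every vector entirely inside one batch, so no vector straddles a buffer boundary.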
   
   





[GitHub] [lucene] jtibshirani merged pull request #11843: Remove cancellation check on every vector

2022-10-13 Thread GitBox


jtibshirani merged PR #11843:
URL: https://github.com/apache/lucene/pull/11843





[GitHub] [lucene] benwtrent closed pull request #11849: Fix failure to load larger data sets in KnnGraphTest

2022-10-13 Thread GitBox


benwtrent closed pull request #11849: Fix failure to load larger data sets in 
KnnGraphTest
URL: https://github.com/apache/lucene/pull/11849





[GitHub] [lucene] benwtrent commented on pull request #11849: Fix failure to load larger data sets in KnnGraphTest

2022-10-13 Thread GitBox


benwtrent commented on PR #11849:
URL: https://github.com/apache/lucene/pull/11849#issuecomment-1277932553

   @jtibshirani or @msokolov care to review? The bug was introduced back in 
https://github.com/apache/lucene/pull/1054





[GitHub] [lucene] jtibshirani commented on pull request #11849: Fix failure to load larger data sets in KnnGraphTest

2022-10-13 Thread GitBox


jtibshirani commented on PR #11849:
URL: https://github.com/apache/lucene/pull/11849#issuecomment-1278034553

   Thanks for fixing this @benwtrent ! I wonder if we could take the simpler 
approach of just opening the file, and iterating through the vectors one by 
one. I don't think there's a clear performance benefit to mmapping sections of 
the file, since we are always just iterating through the vectors in order (to 
then index them, search them one-by-one, etc.)
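One way the simpler approach could look, as a rough sketch rather than a proposal for the actual KnnGraphTest change: open the file once and read one vector per call into a small reusable buffer. The class name, the `demo()` helper and the little-endian float layout are assumptions made for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustrative only: reads fixed-size float vectors one at a time, without mmap. */
final class SequentialVectorReader implements AutoCloseable {
  private final FileChannel channel;
  private final ByteBuffer buffer;
  private final int dim;

  SequentialVectorReader(Path file, int dim) throws IOException {
    this.channel = FileChannel.open(file, StandardOpenOption.READ);
    this.dim = dim;
    // Assumption: values are stored as little-endian 32-bit floats.
    this.buffer = ByteBuffer.allocate(dim * Float.BYTES).order(ByteOrder.LITTLE_ENDIAN);
  }

  /** Returns the next vector, or null once the end of the file is reached. */
  float[] next() throws IOException {
    buffer.clear();
    while (buffer.hasRemaining()) {
      if (channel.read(buffer) < 0) {
        if (buffer.position() == 0) {
          return null; // clean end of file
        }
        throw new IOException("file ends in the middle of a vector");
      }
    }
    buffer.flip();
    float[] vector = new float[dim];
    buffer.asFloatBuffer().get(vector);
    return vector;
  }

  @Override
  public void close() throws IOException {
    channel.close();
  }

  /** Writes two 2-dimensional vectors to a temp file and counts them back. */
  static int demo() {
    try {
      Path file = Files.createTempFile("vectors", ".bin");
      ByteBuffer out = ByteBuffer.allocate(4 * Float.BYTES).order(ByteOrder.LITTLE_ENDIAN);
      out.putFloat(1f).putFloat(2f).putFloat(3f).putFloat(4f);
      Files.write(file, out.array());
      int count = 0;
      try (SequentialVectorReader reader = new SequentialVectorReader(file, 2)) {
        while (reader.next() != null) {
          count++;
        }
      }
      Files.delete(file);
      return count;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Because the channel position is a long and only one vector is buffered at a time, the total file size never has to fit in an int.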





[GitHub] [lucene] benwtrent commented on pull request #11849: Fix failure to load larger data sets in KnnGraphTest

2022-10-13 Thread GitBox


benwtrent commented on PR #11849:
URL: https://github.com/apache/lucene/pull/11849#issuecomment-1278060063

   @jtibshirani My goal here was to fix the bug while keeping as much of the 
original design as possible. I didn't want to spend a bunch of time refactoring 
this code. 
   
   I am open to simply reading bytes from the file directly instead of using mmap. 
What say you @msokolov ?





[GitHub] [lucene] rmuir opened a new pull request, #11850: Fix ExitableDirectoryReader sampling constants to be power-of-2

2022-10-13 Thread GitBox


rmuir opened a new pull request, #11850:
URL: https://github.com/apache/lucene/pull/11850

   If it's performance sensitive enough that we should do sampling, then we 
should avoid integer division too.
   
   Closes #11848





[GitHub] [lucene] msokolov opened a new issue, #11851: Luke web interface

2022-10-13 Thread GitBox


msokolov opened a new issue, #11851:
URL: https://github.com/apache/lucene/issues/11851

   ### Description
   
   I threw together a demo for ApacheCon to show off vector search and I wanted 
a scrappy UI I could hack on. Luke seemed like a good place to start since it 
is already in the Lucene build and has a GUI. But I found two problems with it. 
First, the Swing UI made me feel like I had stepped into a car with Marty 
McFly; the fonts are not resizable, and there are other things about the app's 
design that have not aged well (the presentation of fields allows for their 
definition to vary by document, which they no longer can). Second, I wanted to 
run the demo on a remote machine (at least at first - I ended up not doing this 
in the end, but still, that would be a nice feature! and not everybody can 
tolerate an X Windows connection any more). So I ended up coding up a simple 
web UI that builds as part of Luke. It uses Luke's existing "models" but 
replaces the Swing UI with a web service based on the JDK's built-in HTTP 
server.
   
   I do like the idea of having some kind of minimal UI "baked in" to Lucene 
that can be used for debugging an index, maybe on a remote machine, or as a 
basis for some demo. I'd like to propose adding a Luke webapp. I have these 
principles in mind for it that are nonnegotiable in my mind:
   
   1. No (or very minimal, no more than current Luke) dependencies. (side note: 
I thought of using Marple, but it isn't up to date and when I tried to get it 
working I realized node.js was involved, so I backed out. I don't want to get 
caught up in Javascript frameworks.) 
   
   actually that's my only principle. Anyway I'll post what I have.





[GitHub] [lucene] msokolov opened a new pull request, #11852: Luke Webapp

2022-10-13 Thread GitBox


msokolov opened a new pull request, #11852:
URL: https://github.com/apache/lucene/pull/11852

   See  #11851 for an overview of what this is
   





[GitHub] [lucene] msokolov commented on pull request #11852: Luke Webapp

2022-10-13 Thread GitBox


msokolov commented on PR #11852:
URL: https://github.com/apache/lucene/pull/11852#issuecomment-1278129891

   So -- this is just a scrappy start I wanted to post to get an idea if people 
think this is worth including. The initial "overview" page is functionally 
equivalent to the Luke overview screen, but the search page has some vestiges 
relating to the specific demo index I was using, and that's all that's here. My 
intent is to fill in the rest of the Luke UI, perhaps updating it along the way 
to incorporate the latest state of Lucene. 





[GitHub] [lucene] rmuir commented on a diff in pull request #11852: Luke Webapp

2022-10-13 Thread GitBox


rmuir commented on code in PR #11852:
URL: https://github.com/apache/lucene/pull/11852#discussion_r995094470


##
gradle/testing/randomization/policies/luke-tests.policy:
##
@@ -0,0 +1,141 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// Policy file for :lucene:luke tests. Please keep minimal and avoid wildcards.
+// this differs from the standard lucene policy in that it must allow opening port 8080
+// and allow JDK logging configuration.

Review Comment:
   This won't work: port 8080 might be busy doing other things. Or in TIME_WAIT 
state from a previous test run. We can't bind to any specific port in tests.
   
   Please, make the port configurable... and in tests bind to port 0, so that 
you get a spare port in a race-free way. Call getAddress() afterwards to find 
out which port you were allocated, and connect to that in the tests.
   






[GitHub] [lucene] rmuir commented on a diff in pull request #11852: Luke Webapp

2022-10-13 Thread GitBox


rmuir commented on code in PR #11852:
URL: https://github.com/apache/lucene/pull/11852#discussion_r995097222


##
gradle/testing/randomization/policies/luke-tests.policy:
##
@@ -0,0 +1,141 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// Policy file for :lucene:luke tests. Please keep minimal and avoid wildcards.
+// this differs from the standard lucene policy in that it must allow opening port 8080
+// and allow JDK logging configuration.

Review Comment:
   to clarify more, not just the port, but also the address. so just take 
InetSocketAddress as parameter. Tests must use `127.0.0.1` and port `0`... no 
real ports and no wildcard addresses.
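The two review comments above can be sketched with the JDK's built-in `com.sun.net.httpserver.HttpServer`. This is a hedged illustration, not the Luke webapp code: the class and the `bindAndReportPort` helper are hypothetical, and the only point shown is that the caller supplies an `InetSocketAddress`, tests pass `127.0.0.1` with port `0`, and `getAddress()` reveals the port that was actually allocated.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;

/** Illustrative only: class and method names are assumptions. */
final class EphemeralPortExample {
  /** Binds to the requested address, reports the port actually allocated, then stops. */
  static int bindAndReportPort(InetSocketAddress requested) {
    try {
      // A backlog of 0 asks for the system default.
      HttpServer server = HttpServer.create(requested, 0);
      server.start();
      try {
        // With port 0 in the request, this is the free port chosen by the OS.
        return server.getAddress().getPort();
      } finally {
        server.stop(0);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Tests: loopback only, port 0; never a fixed port or a wildcard address.
    int port = bindAndReportPort(new InetSocketAddress("127.0.0.1", 0));
    System.out.println("allocated port: " + port);
  }
}
```

A real test would keep the server running and connect to the reported port instead of stopping immediately; production code would pass whatever address the user configured.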






[GitHub] [lucene] msokolov commented on a diff in pull request #11852: Luke Webapp

2022-10-13 Thread GitBox


msokolov commented on code in PR #11852:
URL: https://github.com/apache/lucene/pull/11852#discussion_r995097975


##
gradle/testing/randomization/policies/luke-tests.policy:
##
@@ -0,0 +1,141 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// Policy file for :lucene:luke tests. Please keep minimal and avoid wildcards.
+// this differs from the standard lucene policy in that it must allow opening port 8080
+// and allow JDK logging configuration.

Review Comment:
   +1 I had intended to do this (I already hit "port already in use" while 
testing) -- so many things to do here!






[GitHub] [lucene] dsmiley commented on issue #11851: Luke web interface

2022-10-13 Thread GitBox


dsmiley commented on issue #11851:
URL: https://github.com/apache/lucene/issues/11851#issuecomment-1278229377

   I believe @romseygeek worked on a HTTP based Luke-like thing and invested a 
lot of time into it.





[GitHub] [lucene] dsmiley commented on pull request #11847: Add a method allowing canonical strings to be returned from DataInput

2022-10-13 Thread GitBox


dsmiley commented on PR #11847:
URL: https://github.com/apache/lucene/pull/11847#issuecomment-1278243705

   This PR does not use `String.intern` which was the previous concern.  So 
what's wrong here?





[GitHub] [lucene] dsmiley commented on pull request #1069: [LUCENE-2587] Highlighter fragment bug

2022-10-13 Thread GitBox


dsmiley commented on PR #1069:
URL: https://github.com/apache/lucene/pull/1069#issuecomment-1278245086

   If only we renamed "Highlighter" to "OriginalHighlighter", maybe folks 
wouldn't continue using this thing.  Is the UnifiedHighlighter not 
satisfying you, and if so, why not?





[GitHub] [lucene] rmuir commented on pull request #11847: Add a method allowing canonical strings to be returned from DataInput

2022-10-13 Thread GitBox


rmuir commented on PR #11847:
URL: https://github.com/apache/lucene/pull/11847#issuecomment-1278380008

   It does essentially the same thing. Leaking memory on purpose into static 
finals.





[GitHub] [lucene] dsmiley commented on pull request #11847: Add a method allowing canonical strings to be returned from DataInput

2022-10-13 Thread GitBox


dsmiley commented on PR #11847:
URL: https://github.com/apache/lucene/pull/11847#issuecomment-1278448167

   The map isn't static.  
   
   Even if there was a static map, *if* it was expressly used for known static 
strings, then it wouldn't be a leak but just re-use of constants.
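To make the distinction concrete, here is a rough sketch of an instance-scoped canonicalizing map. It is not the code from the PR, whose implementation is not shown in this thread; the class and method names are hypothetical. The point is only that the map's lifetime is tied to the object that owns it, unlike `String.intern` or a static final map.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a per-instance string canonicalizer, not the PR's implementation. */
final class InstanceScopedCanonicalizer {
  // Not static: the map becomes garbage together with this instance.
  private final Map<String, String> seen = new HashMap<>();

  /** Returns the first instance seen that is equal to s. */
  String canonical(String s) {
    String previous = seen.putIfAbsent(s, s);
    return previous != null ? previous : s;
  }

  public static void main(String[] args) {
    InstanceScopedCanonicalizer canonicalizer = new InstanceScopedCanonicalizer();
    String a = canonicalizer.canonical(new String("field"));
    String b = canonicalizer.canonical(new String("field"));
    System.out.println(a == b); // same object returned for equal content
  }
}
```

Whether such a map amounts to a leak depends on how long its owner lives and how many distinct strings pass through it, which is the point under discussion.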


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] stefanvodita commented on pull request #11815: Support deletions in rearrange (#11814)

2022-10-13 Thread GitBox


stefanvodita commented on PR #11815:
URL: https://github.com/apache/lucene/pull/11815#issuecomment-1278524902

   The second revision comes with a lot more changes to support selecting 
deletes in the same fashion as segment content. I’ve reworked the tests to be 
more thorough, especially about deletes. I think the new tests cover all the 
cases that the old tests did, so I replaced the old ones.
   
   This change is technically not backwards compatible. Not just because of 
changes to the rearrange API, but also because now we no longer make the 
deletes disappear from the rearranged index. They become live instead.
   
   Once this PR goes through, I’ll start working on a change for luceneutil to 
use the new functionality.





[GitHub] [lucene] zhaih commented on a diff in pull request #11840: GITHUB-11838 Add api to allow concurrent query rewrite

2022-10-13 Thread GitBox


zhaih commented on code in PR #11840:
URL: https://github.com/apache/lucene/pull/11840#discussion_r995382563


##
lucene/classification/src/java/org/apache/lucene/classification/utils/NearestFuzzyQuery.java:
##
@@ -31,13 +31,7 @@
 import org.apache.lucene.index.TermStates;
 import org.apache.lucene.index.Terms;
 import org.apache.lucene.index.TermsEnum;
-import org.apache.lucene.search.BooleanClause;
-import org.apache.lucene.search.BooleanQuery;
-import org.apache.lucene.search.BoostQuery;
-import org.apache.lucene.search.FuzzyTermsEnum;
-import org.apache.lucene.search.Query;
-import org.apache.lucene.search.QueryVisitor;
-import org.apache.lucene.search.TermQuery;
+import org.apache.lucene.search.*;

Review Comment:
   Yeah sure, seems I need to change IDE settings a little bit for the future!






[GitHub] [lucene] zhaih commented on a diff in pull request #11840: GITHUB-11838 Add api to allow concurrent query rewrite

2022-10-13 Thread GitBox


zhaih commented on code in PR #11840:
URL: https://github.com/apache/lucene/pull/11840#discussion_r995390126


##
lucene/core/src/java/org/apache/lucene/document/FeatureQuery.java:
##
@@ -50,12 +49,12 @@ final class FeatureQuery extends Query {
   }
 
   @Override
-  public Query rewrite(IndexReader reader) throws IOException {
-    FeatureFunction rewritten = function.rewrite(reader);
+  public Query rewrite(IndexSearcher indexSearcher) throws IOException {
+    FeatureFunction rewritten = function.rewrite(indexSearcher.getIndexReader());

Review Comment:
   Sure, I'll also change the `MultiTermQuery.rewriteMethod`, was planning to 
do them in another PR, but seems not too bad to merge all into this one.


