date:20230724

[GitHub] [lucene] original-brownbear commented on a diff in pull request #12453: Faster bulk numeric reads from BufferedIndexInput

2023-07-24 Thread via GitHub

original-brownbear commented on code in PR #12453: URL: https://github.com/apache/lucene/pull/12453#discussion_r1271914546 ## lucene/core/src/java/org/apache/lucene/store/BufferedIndexInput.java: ## @@ -159,6 +159,63 @@ public final long readLong() throws IOException { }

[GitHub] [lucene] jpountz opened a new issue, #12456: Investigate slow fuzzy queries

2023-07-24 Thread via GitHub

jpountz opened a new issue, #12456: URL: https://github.com/apache/lucene/issues/12456 While disjunctive queries got a performance boost with https://github.com/apache/lucene/pull/12444 ([OrHighHigh](http://people.apache.org/~mikemccand/lucenebench/OrHighHigh.html), [OrHighMed](http://peop

[GitHub] [lucene] jpountz commented on a diff in pull request #12446: Enable rank-unsafe optimizations for MAXSCORE/WAND.

2023-07-24 Thread via GitHub

jpountz commented on code in PR #12446: URL: https://github.com/apache/lucene/pull/12446#discussion_r1271961619 ## lucene/core/src/java/org/apache/lucene/search/MaxScoreBulkScorer.java: ## @@ -168,7 +171,17 @@ private boolean partitionScorers() { if (maxScoreSumFloat >= m

[GitHub] [lucene] jpountz commented on a diff in pull request #12446: Enable rank-unsafe optimizations for MAXSCORE/WAND.

2023-07-24 Thread via GitHub

jpountz commented on code in PR #12446: URL: https://github.com/apache/lucene/pull/12446#discussion_r1271963742 ## lucene/core/src/java/org/apache/lucene/search/BulkScorer.java: ## @@ -90,4 +90,13 @@ public abstract int score(LeafCollector collector, Bits acceptDocs, int min, i

[GitHub] [lucene] jpountz commented on a diff in pull request #12453: Faster bulk numeric reads from BufferedIndexInput

2023-07-24 Thread via GitHub

jpountz commented on code in PR #12453: URL: https://github.com/apache/lucene/pull/12453#discussion_r1271956372 ## lucene/core/src/test/org/apache/lucene/store/TestBufferedIndexInput.java: ## @@ -209,6 +209,103 @@ public void testBackwardsLongReads() throws IOException {

[GitHub] [lucene] original-brownbear commented on a diff in pull request #12453: Faster bulk numeric reads from BufferedIndexInput

2023-07-24 Thread via GitHub

original-brownbear commented on code in PR #12453: URL: https://github.com/apache/lucene/pull/12453#discussion_r1272006226 ## lucene/core/src/test/org/apache/lucene/store/TestBufferedIndexInput.java: ## @@ -209,6 +209,103 @@ public void testBackwardsLongReads() throws IOExceptio

[GitHub] [lucene] jpountz opened a new pull request, #12457: Improve MaxScoreBulkScorer partitioning logic.

2023-07-24 Thread via GitHub

jpountz opened a new pull request, #12457: URL: https://github.com/apache/lucene/pull/12457 Partitioning scorers is an optimization problem: the optimal set of non-essential scorers is the subset of scorers whose sum of max window scores is less than the minimum competitive score that maxim
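
The preview above is cut off, so the quantity the optimal subset is meant to maximize is not visible here. The constraint it does state (the non-essential scorers' max window scores must sum to less than the minimum competitive score) can be illustrated with the classic greedy MAXSCORE-style split. This is a minimal sketch under that assumption, not the logic of the pull request; `partition_scorers` and its parameters are hypothetical names chosen for illustration.

```python
from typing import List, Tuple


def partition_scorers(max_window_scores: List[float],
                      min_competitive_score: float) -> Tuple[List[int], List[int]]:
    """Greedy split of scorer indices into (non_essential, essential).

    Hypothetical helper: it only enforces the constraint quoted above and
    does not attempt the optimization the pull request describes.
    """
    # Visit scorers from the smallest max window score to the largest.
    order = sorted(range(len(max_window_scores)),
                   key=lambda i: max_window_scores[i])
    non_essential: List[int] = []
    running_sum = 0.0
    for pos, i in enumerate(order):
        # Stop once adding this scorer would let the non-essential set
        # reach the minimum competitive score on its own.
        if running_sum + max_window_scores[i] >= min_competitive_score:
            return non_essential, order[pos:]
        running_sum += max_window_scores[i]
        non_essential.append(i)
    # Every scorer fits under the threshold, so none is essential.
    return non_essential, []
```

With max window scores `[3.0, 0.5, 1.0, 2.0]` and a minimum competitive score of `2.0`, the scorers at indices 1 and 2 (sum 1.5) end up non-essential and the other two stay essential. A greedy pass like this satisfies the constraint but is not guaranteed to pick the best subset under any particular cost measure.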

[GitHub] [lucene] jpountz commented on pull request #12457: Improve MaxScoreBulkScorer partitioning logic.

2023-07-24 Thread via GitHub

jpountz commented on PR #12457: URL: https://github.com/apache/lucene/pull/12457#issuecomment-1647837090 luceneutil doesn't show an improvement on wikimedium because all fuzzy queries only have low-frequency terms; only nightlies have the `titel~2` query in their tasks file. ```

[GitHub] [lucene] jpountz commented on a diff in pull request #12453: Faster bulk numeric reads from BufferedIndexInput

2023-07-24 Thread via GitHub

jpountz commented on code in PR #12453: URL: https://github.com/apache/lucene/pull/12453#discussion_r1272210731 ## lucene/CHANGES.txt: ## @@ -84,6 +84,8 @@ Optimizations * GITHUB#12372: Reduce allocation during HNSW construction (Jonathan Ellis) +* GITHUB#12453: Faster bulk

[GitHub] [lucene] jpountz commented on a diff in pull request #12453: Faster bulk numeric reads from BufferedIndexInput

2023-07-24 Thread via GitHub

jpountz commented on code in PR #12453: URL: https://github.com/apache/lucene/pull/12453#discussion_r1272352389 ## lucene/CHANGES.txt: ## @@ -84,6 +84,8 @@ Optimizations * GITHUB#12372: Reduce allocation during HNSW construction (Jonathan Ellis) +* GITHUB#12453: Faster bulk

[GitHub] [lucene] jpountz merged pull request #12442: Assert IdxOrDvQuery subqueries and document useful fields

2023-07-24 Thread via GitHub

jpountz merged PR #12442: URL: https://github.com/apache/lucene/pull/12442 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

[GitHub] [lucene] HoustonPutman commented on pull request #12430: Enable search for site javadocs

2023-07-24 Thread via GitHub

HoustonPutman commented on PR #12430: URL: https://github.com/apache/lucene/pull/12430#issuecomment-1648042240 This same fix worked for Solr: https://solr.apache.org/docs/9_3_0/core/index.html Will go ahead and merge! -- This is an automated message from the Apache Git Service. To

[GitHub] [lucene] HoustonPutman merged pull request #12430: Enable search for site javadocs

2023-07-24 Thread via GitHub

HoustonPutman merged PR #12430: URL: https://github.com/apache/lucene/pull/12430 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@luce

[GitHub] [lucene] original-brownbear commented on a diff in pull request #12453: Faster bulk numeric reads from BufferedIndexInput

2023-07-24 Thread via GitHub

original-brownbear commented on code in PR #12453: URL: https://github.com/apache/lucene/pull/12453#discussion_r1272393881 ## lucene/CHANGES.txt: ## @@ -84,6 +84,8 @@ Optimizations * GITHUB#12372: Reduce allocation during HNSW construction (Jonathan Ellis) +* GITHUB#12453:

[GitHub] [lucene] jpountz merged pull request #12453: Faster bulk numeric reads from BufferedIndexInput

2023-07-24 Thread via GitHub

jpountz merged PR #12453: URL: https://github.com/apache/lucene/pull/12453 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

[GitHub] [lucene] gsmiller commented on issue #12451: Interesting TestStringsToAutomaton failure

2023-07-24 Thread via GitHub

gsmiller commented on issue #12451: URL: https://github.com/apache/lucene/issues/12451#issuecomment-1648287605 Also, here's the compiled automaton resulting from the code-point automaton above: ![out](https://github.com/apache/lucene/assets/16479560/24b31791-22df-4e78--f0be01cce99c)

[GitHub] [lucene] jbellis commented on a diff in pull request #12421: Concurrent hnsw graph and builder, take two

2023-07-24 Thread via GitHub

jbellis commented on code in PR #12421: URL: https://github.com/apache/lucene/pull/12421#discussion_r1272560387 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java: ## @@ -311,7 +369,6 @@ void searchLevel( graphSeek(graph, level, topCandidateNode);

[GitHub] [lucene] jbellis commented on a diff in pull request #12421: Concurrent hnsw graph and builder, take two

2023-07-24 Thread via GitHub

jbellis commented on code in PR #12421: URL: https://github.com/apache/lucene/pull/12421#discussion_r1272561099 ## lucene/core/src/java/org/apache/lucene/util/hnsw/ConcurrentHnswGraphBuilder.java: ## @@ -0,0 +1,465 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] [lucene] benwtrent commented on a diff in pull request #12421: Concurrent hnsw graph and builder, take two

2023-07-24 Thread via GitHub

benwtrent commented on code in PR #12421: URL: https://github.com/apache/lucene/pull/12421#discussion_r1272563613 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java: ## @@ -311,7 +369,6 @@ void searchLevel( graphSeek(graph, level, topCandidateNode)

[GitHub] [lucene] gsmiller opened a new issue, #12458: UTF32toUTF8 can produce automata that produce invalid unicode

2023-07-24 Thread via GitHub

gsmiller opened a new issue, #12458: URL: https://github.com/apache/lucene/issues/12458 ### Description When converting a unicode (UTF32) automaton down to a UTF8 representation, UTF32toUTF8 can create an automaton that produces/accepts invalid UTF8. This happens when a transition in
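
The preview stops before it names the transition that triggers the problem, so the exact byte ranges involved are not shown here. As a general illustration of what "invalid UTF8" means, the sketch below checks byte sequences with Python's strict decoder; `is_valid_utf8` is a hypothetical helper and is unrelated to Lucene's `UTF32toUTF8` code.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` is a well-formed UTF-8 byte sequence."""
    try:
        # Strict decoding rejects overlong forms, encoded surrogates,
        # and code points above U+10FFFF.
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


# A well-formed two-byte sequence (U+00E9).
print(is_valid_utf8("é".encode("utf-8")))
# An overlong encoding of U+0000.
print(is_valid_utf8(b"\xc0\x80"))
# The surrogate U+D800 encoded directly, which UTF-8 forbids.
print(is_valid_utf8(b"\xed\xa0\x80"))
# A sequence that would decode to a value above U+10FFFF.
print(is_valid_utf8(b"\xf4\x90\x80\x80"))
```

An automaton that accepts any of the last three byte sequences would accept input that no conforming UTF-8 encoder can produce.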

[GitHub] [lucene] gsmiller commented on issue #12451: Interesting TestStringsToAutomaton failure

2023-07-24 Thread via GitHub

gsmiller commented on issue #12451: URL: https://github.com/apache/lucene/issues/12451#issuecomment-1648504863 This looks like a bug in the `UTF32toUTF8` conversion logic to me. It seems the resulting UTF8 automaton in this case is producing a large range of invalid UTF8. This appears to al

[GitHub] [lucene] benwtrent commented on a diff in pull request #12434: Add ParentJoin KNN support

2023-07-24 Thread via GitHub

benwtrent commented on code in PR #12434: URL: https://github.com/apache/lucene/pull/12434#discussion_r1272680226 ## lucene/core/src/java/org/apache/lucene/util/hnsw/KnnResults.java: ## @@ -0,0 +1,136 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] [lucene] benwtrent commented on a diff in pull request #12434: Add ParentJoin KNN support

2023-07-24 Thread via GitHub

benwtrent commented on code in PR #12434: URL: https://github.com/apache/lucene/pull/12434#discussion_r1272681200 ## lucene/core/src/java/org/apache/lucene/util/hnsw/KnnResultsProvider.java: ## @@ -0,0 +1,25 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

[GitHub] [lucene] benwtrent commented on a diff in pull request #12434: Add ParentJoin KNN support

2023-07-24 Thread via GitHub

benwtrent commented on code in PR #12434: URL: https://github.com/apache/lucene/pull/12434#discussion_r1272735873 ## lucene/core/src/java/org/apache/lucene/util/hnsw/KnnResultsProvider.java: ## @@ -0,0 +1,25 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one o

[GitHub] [lucene] iverase opened a new issue, #12459: Allow reading binary doc values as a DataInput

2023-07-24 Thread via GitHub

iverase opened a new issue, #12459: URL: https://github.com/apache/lucene/issues/12459 ### Description Binary doc values allow storing a variable number of bytes in a doc value. In order to read those bytes, we currently get a BytesRef from the API which contains the bytes on heap.

[GitHub] [lucene] iverase opened a new pull request, #12460: Allow reading binary doc values as a DataInput

2023-07-24 Thread via GitHub

iverase opened a new pull request, #12460: URL: https://github.com/apache/lucene/pull/12460 see https://github.com/apache/lucene/issues/12459 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the s

[GitHub] [lucene] iverase commented on issue #12459: Allow reading binary doc values as a DataInput

2023-07-24 Thread via GitHub

iverase commented on issue #12459: URL: https://github.com/apache/lucene/issues/12459#issuecomment-1648979938 I wrote a prototype for this change here: https://github.com/apache/lucene/pull/12460 -- This is an automated message from the Apache Git Service. To respond to the message, pleas

[GitHub] [lucene] jmazanec15 commented on issue #12342: Prevent VectorSimilarity.DOT_PRODUCT from returning negative scores

2023-07-24 Thread via GitHub

jmazanec15 commented on issue #12342: URL: https://github.com/apache/lucene/issues/12342#issuecomment-1649102549 > 🤦 yep! > Here is with the higher max conn. Sort of better. Right, I was thinking this might explain the recall discrepancy for the dotproduct score change (0.989 vs 0

[GitHub] [lucene] searchivarius commented on issue #12342: Prevent VectorSimilarity.DOT_PRODUCT from returning negative scores

2023-07-24 Thread via GitHub

searchivarius commented on issue #12342: URL: https://github.com/apache/lucene/issues/12342#issuecomment-1649112930 Hi @jmazanec15 and @benwtrent: thanks a lot for testing. For higher recalls (somewhat higher or lower than 0.8) the transformation seems to lead to a substantial increase in latency

29 matches
