ChristophKaser closed pull request #124: LUCENE-9951: Add InfoStream to
ReplicationService
URL: https://github.com/apache/lucene/pull/124
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific
ChristophKaser commented on PR #124:
URL: https://github.com/apache/lucene/pull/124#issuecomment-1815890261
@mikemccand Thank you for looking at the patch! However it is a bit hard to
refresh this PR - after all, the http servlet based replication mechanism has
been removed from lucene in P
dungba88 commented on code in PR #12624:
URL: https://github.com/apache/lucene/pull/12624#discussion_r1395333947
##
lucene/core/src/java/org/apache/lucene/util/fst/GrowableByteArrayDataOutput.java:
##
@@ -0,0 +1,104 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) und
dungba88 commented on code in PR #12624:
URL: https://github.com/apache/lucene/pull/12624#discussion_r1392197630
##
lucene/core/src/java/org/apache/lucene/util/fst/ByteBuffersFSTReader.java:
##
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one o
dungba88 opened a new issue, #12822:
URL: https://github.com/apache/lucene/issues/12822
### Description
After https://github.com/apache/lucene/pull/12758, we streamlined the FST
constructors and they eventually call the constructor with `FSTMetadata`. For
the old constructors with `D
shubhamvishu commented on PR #12716:
URL: https://github.com/apache/lucene/pull/12716#issuecomment-1815794510
> I'm surprised linear probing doesn't yield an improvement. Perhaps it's
not a significant factor because of other load? Hard to say. Anyway, no need to
make things more complicate
shubhamvishu closed pull request #12716: Improve hash mixing in FST's
double-barrel LRU hash
URL: https://github.com/apache/lucene/pull/12716
msokolov commented on issue #12627:
URL: https://github.com/apache/lucene/issues/12627#issuecomment-1815705050
yes, this is a promising avenue to explore! One note of caution: we should
avoid drawing strong inferences from a single dataset. I'm especially wary of
GloVe because I've noticed
SreehariG73 opened a new pull request, #12821:
URL: https://github.com/apache/lucene/pull/12821
### Description
Replaced IntPoint, LongPoint, FloatPoint, and DoublePoint with IntField,
LongField, FloatField, and DoubleField, which are easier-to-use field
subclasses.
SreehariG73 commented on issue #12725:
URL: https://github.com/apache/lucene/issues/12725#issuecomment-1815676094
Hello,
I am planning to work on this issue. Can this issue be assigned to me please?
Harshitha-g-06 commented on issue #12125:
URL: https://github.com/apache/lucene/issues/12125#issuecomment-1815650208
@rmuir Hi, may I work on this task?
dungba88 commented on PR #12715:
URL: https://github.com/apache/lucene/pull/12715#issuecomment-1815622585
Thanks @cavorite, I have incorporated this change into #12624. Removing the
constructor would also be great, as it means fewer things need to stay
backward compatible :)
dungba88 commented on issue #12760:
URL: https://github.com/apache/lucene/issues/12760#issuecomment-1815613094
The last TODO can be resolved with #12624
slow-J commented on issue #10094:
URL: https://github.com/apache/lucene/issues/10094#issuecomment-1815566763
No longer an issue, was fixed by https://github.com/apache/lucene/pull/249.
zacharymorn commented on PR #240:
URL: https://github.com/apache/lucene/pull/240#issuecomment-181613
Thanks @javanna for the feedback!
> One thing that I wonder is whether we are ok already deprecating
search(Query, Collector) given that we have a lot of usages still within
Lucen
slow-J commented on issue #10796:
URL: https://github.com/apache/lucene/issues/10796#issuecomment-1815540459
No longer an issue, this was fixed by
https://github.com/apache/lucene/pull/11812.
slow-J commented on issue #10898:
URL: https://github.com/apache/lucene/issues/10898#issuecomment-1815535373
This is no longer a problem as it was fixed by
https://github.com/apache/lucene/pull/537.
slow-J commented on code in PR #12816:
URL: https://github.com/apache/lucene/pull/12816#discussion_r1396510611
##
lucene/core/src/java/org/apache/lucene/search/HumanReadableQuery.java:
##
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
slow-J commented on PR #12816:
URL: https://github.com/apache/lucene/pull/12816#issuecomment-1815511866
> Should we move it to `lucene/misc` rather than `lucene/core`?
Yes, that sounds like a better place for it.
jpountz commented on PR #12820:
URL: https://github.com/apache/lucene/pull/12820#issuecomment-1815358559
This is an interesting idea. Ideally we would figure out up-front whether
it's best to use the graph or not, but I can also imagine that we can't always
make the right decision there, so
dweiss commented on PR #12716:
URL: https://github.com/apache/lucene/pull/12716#issuecomment-1815225881
The fact that hash key remixing doesn't improve the situation is not necessarily
a sign that it's somehow wrong - it means hash keys are distributed evenly
already (which is good). Remixing ad
jpountz commented on code in PR #12816:
URL: https://github.com/apache/lucene/pull/12816#discussion_r1396259219
##
lucene/core/src/java/org/apache/lucene/search/HumanReadableQuery.java:
##
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or mor
jpountz commented on PR #12816:
URL: https://github.com/apache/lucene/pull/12816#issuecomment-1815216336
Should we move it to `lucene/misc` rather than `lucene/core`?
mayya-sharipova commented on PR #12794:
URL: https://github.com/apache/lucene/pull/12794#issuecomment-1815203589
## Experiments
- Available processors: 10; thread pool size: 16
- luceneutil tool
Search:
- **baseline**: Lucene main branch
- **candidate1**: only global queue
slow-J commented on code in PR #12816:
URL: https://github.com/apache/lucene/pull/12816#discussion_r1396241261
##
lucene/core/src/java/org/apache/lucene/search/HumanReadableQuery.java:
##
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
benwtrent commented on code in PR #12816:
URL: https://github.com/apache/lucene/pull/12816#discussion_r1396239102
##
lucene/core/src/java/org/apache/lucene/search/HumanReadableQuery.java:
##
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or m
benwtrent commented on code in PR #12816:
URL: https://github.com/apache/lucene/pull/12816#discussion_r1396236967
##
lucene/core/src/java/org/apache/lucene/search/HumanReadableQuery.java:
##
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or m
slow-J commented on PR #12816:
URL: https://github.com/apache/lucene/pull/12816#issuecomment-1815175078
Created a `HumanReadableQuery` which wraps a Query and only changes the
.toString() behaviour. Please let me know if I misunderstood any part of the
suggestion.
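
The wrapper idea described in this thread can be illustrated with a minimal, self-contained sketch. It does not use Lucene's API: `SimpleQuery`, `VectorQuery`, and this `HumanReadableQuery` are hypothetical stand-ins, and the `description:query` output format is an assumption made for illustration, not the format chosen in the PR.

```java
import java.util.Objects;

public class HumanReadableQuerySketch {

    /** Stand-in for a query type; this is NOT Lucene's Query class. */
    interface SimpleQuery {
        SimpleQuery rewrite();
    }

    /** Hypothetical vector query; two instances on the same field print identically. */
    static final class VectorQuery implements SimpleQuery {
        private final String field;
        private final int k;

        VectorQuery(String field, int k) {
            this.field = field;
            this.k = k;
        }

        @Override
        public SimpleQuery rewrite() {
            return this;
        }

        @Override
        public String toString() {
            return "VectorQuery:" + field + "[k=" + k + "]";
        }
    }

    /** Wraps a query, adds a description to toString(), and rewrites to the wrapped query. */
    static final class HumanReadableQuery implements SimpleQuery {
        private final SimpleQuery in;
        private final String description;

        HumanReadableQuery(SimpleQuery in, String description) {
            this.in = Objects.requireNonNull(in);
            this.description = Objects.requireNonNull(description);
        }

        @Override
        public SimpleQuery rewrite() {
            // Execution is delegated entirely to the wrapped query.
            return in;
        }

        @Override
        public String toString() {
            // Assumed format: description, a colon, then the wrapped query's own string.
            return description + ":" + in;
        }
    }

    public static void main(String[] args) {
        SimpleQuery wrapped = new HumanReadableQuery(new VectorQuery("vec", 10), "query-7");
        System.out.println(wrapped);
        System.out.println(wrapped.rewrite());
    }
}
```

Because only `toString()` differs, a benchmark could tell two otherwise identical-looking queries apart by their descriptions while the wrapped query does all the actual work after rewriting.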
slow-J commented on PR #12816:
URL: https://github.com/apache/lucene/pull/12816#issuecomment-1815014722
Thanks for the suggestion @jpountz! I'll add a `HumanReadableQuery` and
revert the current changes. I think it would be quite similar to the
`AssertingQuery`.
kaivalnp opened a new pull request, #12820:
URL: https://github.com/apache/lucene/pull/12820
### Description
In KNN queries with a pre-filter, we first perform an approximate graph
search and then
[fallback](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/l
mayya-sharipova opened a new pull request, #12819:
URL: https://github.com/apache/lucene/pull/12819
The number of visited nodes during graph exploration is an important metric
for a knn query that is lost when the query is rewritten. This allows
optionally accessing it before the query is rewr
javanna commented on code in PR #12799:
URL: https://github.com/apache/lucene/pull/12799#discussion_r1396067925
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswConcurrentMergeBuilder.java:
##
@@ -77,42 +75,17 @@ public OnHeapHnswGraph build(int maxOrd) throws IOException
stefanvodita opened a new pull request, #12818:
URL: https://github.com/apache/lucene/pull/12818
We're only printing results for the `Author` dimension instead of printing
`Publish Year` too.
shubhamvishu commented on PR #12716:
URL: https://github.com/apache/lucene/pull/12716#issuecomment-1814848844
OK, so I ran the test with a rehash value of 17/24, which is between 2/3 and
3/4. Here are the results:
| Golden ratio Bit mixing | Rehash ratio (2/3) | Rehash ratio (17/24) |
stefanvodita opened a new pull request, #12817:
URL: https://github.com/apache/lucene/pull/12817
We don't have a demo for faceting using `KeywordField`,
`SortedDocValuesField`, or `StringValueFacetCounts`. This PR adds one, mostly
inspired by `SimpleSortedSetFacetsExample`.
jpountz commented on PR #12800:
URL: https://github.com/apache/lucene/pull/12800#issuecomment-1814761757
I like the idea, but this seems to come with greater heap requirements as
well?
benwtrent commented on PR #12816:
URL: https://github.com/apache/lucene/pull/12816#issuecomment-1814712957
I much prefer @jpountz's idea. This additional field is purely for debugging
purposes. A `DebugQuery` or `HumanReadableQuery` does seem like a good idea.
jpountz commented on PR #12816:
URL: https://github.com/apache/lucene/pull/12816#issuecomment-1814699360
I'd rather not touch these queries, and instead introduce a brand new query
that rewrites to a `Knn(Byte|Float)VectorQuery` and may add a description
string. Something like `HumanReadabl
slow-J opened a new pull request, #12816:
URL: https://github.com/apache/lucene/pull/12816
We use this only in KnnByte/FloatVectorQuery toString method so the
benchmarker can disambiguate between different
KnnFloatVectorQuery/KnnByteVectorQuery queries.
Closes #12487
benwtrent commented on issue #12627:
URL: https://github.com/apache/lucene/issues/12627#issuecomment-1814561267
@nitirajrathore very interesting findings.
This makes me wonder if the heuristic should take a middle ground and
instead of keeping all pruned connections, keep half.
nitirajrathore commented on issue #12627:
URL: https://github.com/apache/lucene/issues/12627#issuecomment-1814530858
> meaning that we can get same recall for a smaller max-conn value now.
I ran some tests with max-conn = 16 and max-conn = 8 and it seems like
with [my proposal](htt
shubhamvishu commented on PR #12791:
URL: https://github.com/apache/lucene/pull/12791#issuecomment-1814515539
> I assume this is because this is a DoubleDocValuesField which encodes the
double using NumericUtils.doubleToSortableLong
@mikemccand Is it possible to fix `NumericUtils.doub
sitepark-veltrup commented on issue #7074:
URL: https://github.com/apache/lucene/issues/7074#issuecomment-1814499494
We use Solr to search for pages in a website. We index the content of the
website and also the content of PDF documents into a field `content`
Based on this field we would
jpountz commented on PR #12815:
URL: https://github.com/apache/lucene/pull/12815#issuecomment-1814493098
Here are results on `wikibigall`, none of the p-values seem significant:
```
Task    QPS baseline    StdDev    QPS my_modified_version    StdDev
shubhamvishu commented on code in PR #12799:
URL: https://github.com/apache/lucene/pull/12799#discussion_r1395692713
##
lucene/core/src/java/org/apache/lucene/search/TaskExecutor.java:
##
@@ -53,7 +53,7 @@ public final class TaskExecutor {
private final Executor executor;
benwtrent commented on code in PR #12799:
URL: https://github.com/apache/lucene/pull/12799#discussion_r1395689207
##
lucene/core/src/java/org/apache/lucene/search/TaskExecutor.java:
##
@@ -53,7 +53,7 @@ public final class TaskExecutor {
private final Executor executor;
-
jpountz opened a new pull request, #12815:
URL: https://github.com/apache/lucene/pull/12815
I think that this optimization was introduced because `advanceShallow` may
advance skip lists and then never decode a block of postings. But actually
`IndexInput#seek` is cheap, including on `NIOFSDi
slow-J commented on issue #12487:
URL: https://github.com/apache/lucene/issues/12487#issuecomment-1814317117
I think that we could simply add a `resourceDescription` field to the
`AbstractKnnVectorQuery` and modify the toString in the implementations so that
the output would look something
shubhamvishu commented on PR #12799:
URL: https://github.com/apache/lucene/pull/12799#issuecomment-1814218891
@javanna I have added the CHANGES entry and addressed the comment. It seems the
precommit fails on the `:lucene:documentation:markdownToHtml` task, which looks
unrelated? Not sure.
shubhamvishu commented on code in PR #12799:
URL: https://github.com/apache/lucene/pull/12799#discussion_r1395505763
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswConcurrentMergeBuilder.java:
##
@@ -77,42 +75,21 @@ public OnHeapHnswGraph build(int maxOrd) throws IOExce
javanna commented on PR #240:
URL: https://github.com/apache/lucene/pull/240#issuecomment-1814140254
Thanks for reviving this PR, @zacharymorn! The changes look good to me,
having top score doc and top field collector managers sounds like a natural
next step, and removes code duplication. I
javanna commented on code in PR #12799:
URL: https://github.com/apache/lucene/pull/12799#discussion_r1395446929
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswConcurrentMergeBuilder.java:
##
@@ -77,42 +75,21 @@ public OnHeapHnswGraph build(int maxOrd) throws IOException
shubhamvishu commented on PR #12716:
URL: https://github.com/apache/lucene/pull/12716#issuecomment-1814055888
> @shubhamvishu can we close this one? Any other things to try?
Sure, @mikemccand! Maybe we could just try a rehash value between 2/3 and
3/4 as you mentioned earlier (how abo
nitirajrathore commented on issue #12627:
URL: https://github.com/apache/lucene/issues/12627#issuecomment-1814046690
> What if we added an "incoming connection" count for every node?
&
> I think this idea would prevent the isolated nodes, but not fix the other
case.
I w
shubhamvishu commented on issue #12675:
URL: https://github.com/apache/lucene/issues/12675#issuecomment-1814024776
@jpountz I think we can close this now?
dungba88 commented on code in PR #12624:
URL: https://github.com/apache/lucene/pull/12624#discussion_r1395069004
##
lucene/core/src/java/org/apache/lucene/util/fst/BytesStore.java:
##
@@ -337,11 +349,23 @@ public long size() {
return getPosition();
}
+ /** Similar to