rmuir commented on PR #12787:
URL: https://github.com/apache/lucene/pull/12787#issuecomment-1803343093
When I run `make PATCH_BRANCH=rmuir:microbenchmark_ec2`, we just see no
differences, but it demonstrates the idea (sorry: no speedups in this branch!).
It spins up/tears down `lucene-jm
gf2121 commented on code in PR #12748:
URL: https://github.com/apache/lucene/pull/12748#discussion_r1387636706
##
lucene/CHANGES.txt:
##
@@ -106,6 +106,8 @@ Optimizations
* GITHUB#12552: Make FSTPostingsFormat load FSTs off-heap. (Tony X)
+* GITHUB#12748: Specialize arc sto
dweiss commented on PR #12038:
URL: https://github.com/apache/lucene/pull/12038#issuecomment-1803391323
> If anyone is still using the legacy non-NRT mode, please let me know on
this issue and give me your IP address, so I can try to pop a shell.
Oh, I missed this bit somehow, @rmuir.
vsop-479 opened a new issue, #12788:
URL: https://github.com/apache/lucene/issues/12788
### Description
Is it worth unrolling or vectorizing the Math.max calls in
CompetitiveImpactAccumulator.addAll?
Maybe the scalar loop can be auto-vectorized by the JIT, but there is some speedup with
unrolle
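As a rough illustration of the question above, the sketch below compares a plain element-wise max loop with a manually unrolled one. The class and method names are hypothetical, and the assumption that `addAll` merges two `int[]` arrays with `Math.max` is for illustration only, not a copy of Lucene's code.

```java
// Hypothetical sketch: element-wise max of two int arrays, scalar vs. unrolled.
// Assumes src.length >= dst.length; neither method is Lucene API.
final class MaxMergeSketch {

  // Scalar baseline: the JIT may or may not auto-vectorize this loop.
  public static void maxScalar(int[] dst, int[] src) {
    for (int i = 0; i < dst.length; i++) {
      dst[i] = Math.max(dst[i], src[i]);
    }
  }

  // Manually unrolled by 4, followed by a scalar loop for the remaining tail.
  public static void maxUnrolled(int[] dst, int[] src) {
    int i = 0;
    int bound = dst.length & ~3; // largest multiple of 4 that fits
    for (; i < bound; i += 4) {
      dst[i] = Math.max(dst[i], src[i]);
      dst[i + 1] = Math.max(dst[i + 1], src[i + 1]);
      dst[i + 2] = Math.max(dst[i + 2], src[i + 2]);
      dst[i + 3] = Math.max(dst[i + 3], src[i + 3]);
    }
    for (; i < dst.length; i++) {
      dst[i] = Math.max(dst[i], src[i]);
    }
  }
}
```

Both variants produce the same result; whether the unrolled form is actually faster would have to be measured, for example with a JMH benchmark.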
vsop-479 commented on issue #12788:
URL: https://github.com/apache/lucene/issues/12788#issuecomment-1803464509
@jpountz Please take a look when you get a chance!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL
uschindler commented on issue #12788:
URL: https://github.com/apache/lucene/issues/12788#issuecomment-1803537830
Hi,
for correct vectorization please make use of the official Lucene framework
(add your implementation class' instance for the scalar and the vectorized
variant as a sepa
stefanvodita commented on PR #12454:
URL: https://github.com/apache/lucene/pull/12454#issuecomment-1803605667
Thanks Greg! I think the delay is partially my fault, I had mentioned a
different G. Miller in my message 😄
mikemccand commented on PR #12748:
URL: https://github.com/apache/lucene/pull/12748#issuecomment-1803615768
> I can help merge this in and backport if there is no objection in 48h.
Thanks @gf2121 -- we should backport all these recent exciting FST changes
in the right order as a batch
mikemccand merged PR #12748:
URL: https://github.com/apache/lucene/pull/12748
mikemccand commented on PR #12454:
URL: https://github.com/apache/lucene/pull/12454#issuecomment-1803628190
> Thanks Greg! I think the delay is partially my fault, I had mentioned a
different G. Miller in my message 😄
Seems to be a common mistake recently! See this [recent hilarious
e
easyice commented on PR #12748:
URL: https://github.com/apache/lucene/pull/12748#issuecomment-1803649666
@mikemccand @gf2121 Thanks for reviewing and merging it ;-)
uschindler commented on PR #12785:
URL: https://github.com/apache/lucene/pull/12785#issuecomment-1803669858
After some discussion with @mcimadamore we figured out that there are more
problems, so we need to rely on the exception message.
The following problem can occur and possibly hap
uschindler commented on PR #12785:
URL: https://github.com/apache/lucene/pull/12785#issuecomment-1803692612
I committed another change so that `IndexInput#close()` first tries to
close the segment and only then sets everything to null. In case of an ISE,
the IndexInput is not closed.
epotyom commented on code in PR #12769:
URL: https://github.com/apache/lucene/pull/12769#discussion_r1387902988
##
lucene/facet/src/test/org/apache/lucene/facet/taxonomy/directory/TestDirectoryTaxonomyReader.java:
##
@@ -476,6 +479,86 @@ public void testOpenIfChangedReplaceTaxon
s1monw commented on issue #12725:
URL: https://github.com/apache/lucene/issues/12725#issuecomment-1803718796
yeah I think we should check if it's memory and time efficient. I think in
theory we could iterate the terms in the automaton against the bloom filter to
take advantage of it inside
mikemccand commented on PR #12711:
URL: https://github.com/apache/lucene/pull/12711#issuecomment-1803774655
> Really, if we'd be implementing the feature today would we use a bitset or
maybe a sparse DV field recording the number of children for each block in the
index?
In fact, in o
mikemccand commented on PR #12769:
URL: https://github.com/apache/lucene/pull/12769#issuecomment-1803988678
> I can make sure that we have a task that calls this method (indirectly) in
the next step for this issue - adding bulk Facets#getSpecificValues, will that
be ok?
+1, thanks!
mikemccand commented on code in PR #12769:
URL: https://github.com/apache/lucene/pull/12769#discussion_r1388134130
##
lucene/facet/src/test/org/apache/lucene/facet/taxonomy/directory/TestDirectoryTaxonomyReader.java:
##
@@ -570,16 +654,20 @@ public void testAccountable() throws
mikemccand merged PR #12769:
URL: https://github.com/apache/lucene/pull/12769
benwtrent opened a new pull request, #12789:
URL: https://github.com/apache/lucene/pull/12789
While doing some performance testing and digging into flamegraphs, I noticed
for smaller vectors (96dim float32), we were losing a fair bit of time within
the `SparseFixedBitSet#getAndSet` method.
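For context, a `getAndSet` on a dense bit set can be as small as one array read and one write. The sketch below is a hypothetical stand-in (the class `DenseBitsSketch` is not Lucene's `FixedBitSet`), assuming a plain `long[]` backing array with 64 bits per word.

```java
// Hypothetical dense bit set: one long[] word per 64 bits, no sparse blocks.
final class DenseBitsSketch {
  private final long[] words;

  public DenseBitsSketch(int numBits) {
    // Round up to a whole number of 64-bit words.
    this.words = new long[(numBits + 63) >>> 6];
  }

  // Returns the previous value of the bit and sets it to true.
  public boolean getAndSet(int index) {
    int word = index >>> 6;  // which long holds the bit
    long mask = 1L << index; // Java uses only the low 6 bits of the shift count
    boolean wasSet = (words[word] & mask) != 0;
    words[word] |= mask;
    return wasSet;
  }
}
```

A sparse layout has to locate (and possibly allocate) a block before it can touch the bit, which is one plausible reason for the extra time in the flamegraph; the trade-off is the dense array's fixed memory cost.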
mikemccand commented on PR #12769:
URL: https://github.com/apache/lucene/pull/12769#issuecomment-1804048913
I think this is safe to backport to 9.x? I'll do that, and move the
`CHANGES.txt` entry down.
uschindler commented on PR #12785:
URL: https://github.com/apache/lucene/pull/12785#issuecomment-1804063179
I fixed the `close()` method to no longer throw `IllegalStateException`, as
this would violate the contract: on close, only `IOException` is allowed.
As half-open index inputs are
rmuir commented on PR #12787:
URL: https://github.com/apache/lucene/pull/12787#issuecomment-1804143223
I still struggle with the noise; it is even more than when you run the
benchmarks manually.
I inspected an instance under test and saw e.g. a scheduled job burning up CPU
rebuilding m
jpountz commented on PR #12789:
URL: https://github.com/apache/lucene/pull/12789#issuecomment-1804146598
I can believe that FixedBitSet is faster in some cases, but it's surprising
to me that the memory usage of SparseFixedBitSet can go up to 2x that of
FixedBitSet; this makes me wonder if
gokaai commented on code in PR #12530:
URL: https://github.com/apache/lucene/pull/12530#discussion_r1388271749
##
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##
@@ -610,6 +610,31 @@ public Status checkIndex(List onlySegments,
ExecutorService executorServ
jpountz commented on code in PR #12782:
URL: https://github.com/apache/lucene/pull/12782#discussion_r1388273135
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/GroupVintWriter.java:
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
jpountz commented on issue #12788:
URL: https://github.com/apache/lucene/issues/12788#issuecomment-1804189940
Oh, it's sad that this loop doesn't get auto-vectorized automatically. Out
of curiosity, are you seeing it show up in some benchmarks?
jpountz commented on issue #12763:
URL: https://github.com/apache/lucene/issues/12763#issuecomment-1804193016
I'm away from my main working computer this week. I suspect it's similar to
an issue that I saw elsewhere, where merges cascade. I'll look into it on Monday if
nobody beats me to it.
benwtrent commented on PR #12789:
URL: https://github.com/apache/lucene/pull/12789#issuecomment-1804203048
@jpountz I re-ran my tests and double-checked my numbers. I have some
corrections: I accidentally double-counted sparse sizes, so the previous numbers
are 2x too big.
GLOVE-100-100_
uschindler commented on PR #12785:
URL: https://github.com/apache/lucene/pull/12785#issuecomment-1804242819
I let `TestMmapDirectory.testAceWithThreads` run with `gradlew
:lucene:core:beast` with many iterations and high multiplier: JDK 19, 20, 21
showed no problems.
jimczi commented on PR #12729:
URL: https://github.com/apache/lucene/pull/12729#issuecomment-1804243772
Sorry for the late reply.
> Since this is a larger API discussion, do we think we can move forward
with the way it is now (quantization for HNSW and other vector indices) and
itera
uschindler merged PR #12785:
URL: https://github.com/apache/lucene/pull/12785
mikemccand commented on code in PR #12530:
URL: https://github.com/apache/lucene/pull/12530#discussion_r1388366133
##
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##
@@ -610,6 +610,31 @@ public Status checkIndex(List onlySegments,
ExecutorService executorServ
uschindler commented on issue #12180:
URL: https://github.com/apache/lucene/issues/12180#issuecomment-1804313295
Hi, the commit causes test failures like this from time to time:
```
org.apache.lucene.facet.taxonomy.directory.TestDirectoryTaxonomyReader >
testGetPathAndOrdinalsRandomMul
```
uschindler commented on PR #12769:
URL: https://github.com/apache/lucene/pull/12769#issuecomment-1804314417
Hi, the commit causes test failures like this from time to time:
```
org.apache.lucene.facet.taxonomy.directory.TestDirectoryTaxonomyReader >
testGetPathAndOrdinalsRandomMultithr
```
uschindler commented on PR #12769:
URL: https://github.com/apache/lucene/pull/12769#issuecomment-1804316973
Looks like the ordinals array sizes must be at least 1, so in general the
initial setup of the ordinal size must use `numOrdinals = random(limit) + 1;`
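The off-by-one behind this suggestion can be sketched as follows. The helper name and the use of `java.util.Random` are assumptions for illustration; the test's own `random(limit)` helper is not shown in the message.

```java
import java.util.Random;

// Hypothetical helper illustrating the "+ 1" fix: Random.nextInt(limit)
// returns a value in [0, limit), so it can be 0 and yield an empty array.
final class OrdinalCountSketch {

  // Returns a count in [1, limit], so the ordinals array always has at least one slot.
  public static int nextNumOrdinals(Random random, int limit) {
    return random.nextInt(limit) + 1;
  }
}
```

With `limit = 5` the result ranges over 1..5 instead of 0..4, which rules out the zero-length case.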
benwtrent commented on PR #12729:
URL: https://github.com/apache/lucene/pull/12729#issuecomment-1804353255
@jimczi updated the title.
mikemccand commented on PR #12769:
URL: https://github.com/apache/lucene/pull/12769#issuecomment-1804396963
Thanks Uwe and sorry! I think Egor is digging on this or I’ll revert soon.
Mike
On Thu, Nov 9, 2023 at 1:17 PM Uwe Schindler ***@***.***>
wrote:
> Assigned #127
epotyom opened a new pull request, #12790:
URL: https://github.com/apache/lucene/pull/12790
Fix bug from https://github.com/apache/lucene/pull/12769
epotyom commented on PR #12769:
URL: https://github.com/apache/lucene/pull/12769#issuecomment-1804432895
Hi all,
Sorry for the bug, this pull request should fix it:
https://github.com/apache/lucene/pull/12790
Kind regards,
Egor
On Thu, 9 Nov 2023 at 18:52, Michael M
mikemccand merged PR #12790:
URL: https://github.com/apache/lucene/pull/12790
mikemccand commented on issue #12180:
URL: https://github.com/apache/lucene/issues/12180#issuecomment-1804446870
OK fixed @uschindler -- sorry!
mikemccand commented on issue #12180:
URL: https://github.com/apache/lucene/issues/12180#issuecomment-1804447251
And thanks @epotyom!
gsmiller commented on issue #12180:
URL: https://github.com/apache/lucene/issues/12180#issuecomment-1804469669
Thanks @epotyom! Should we consider a follow-up PR that leverages this new
bulk lookup by adding something like `Facets#getSpecificValues` that gets facet
values for multiple paths
epotyom commented on PR #12790:
URL: https://github.com/apache/lucene/pull/12790#issuecomment-1804531938
I've re-run the tests multiple times just in case, there were no errors:
```
./gradlew -p lucene/facet test --tests "*TestDirectoryTaxonomyReader*" -Ptests.iters=1000
...
```
uschindler commented on PR #12790:
URL: https://github.com/apache/lucene/pull/12790#issuecomment-1804593054
I also need to merge this into the Java 22 mmap branch that Jenkins runs
on: #12706
epotyom commented on issue #12180:
URL: https://github.com/apache/lucene/issues/12180#issuecomment-1804594098
@gsmiller yes, I'll be working on that now, as well as adding a benchmark task
for getSpecificValues, as discussed with Mike in
https://github.com/apache/lucene/pull/12769#pullrequ
gsmiller commented on issue #12180:
URL: https://github.com/apache/lucene/issues/12180#issuecomment-1804667028
@epotyom got it, thanks! Didn't see that earlier conversation.
rmuir commented on code in PR #12782:
URL: https://github.com/apache/lucene/pull/12782#discussion_r1388595300
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/GroupVintReader.java:
##
@@ -0,0 +1,176 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one o
uschindler commented on PR #12790:
URL: https://github.com/apache/lucene/pull/12790#issuecomment-1804722810
OK, merged to the Java 22 branch. Tests pass.
rmuir commented on issue #12788:
URL: https://github.com/apache/lucene/issues/12788#issuecomment-1804722948
> Oh, it's sad that this loop doesn't get auto-vectorized automatically. Out
of curiosity, are you seeing it show up in some benchmarks?
I don't believe that, there is code to do
uschindler commented on code in PR #12787:
URL: https://github.com/apache/lucene/pull/12787#discussion_r1388633378
##
gradle/validation/rat-sources.gradle:
##
@@ -53,6 +53,9 @@ allprojects {
include "**/*.sh"
include "**/*.bat"
+// exclude
rmuir commented on code in PR #12787:
URL: https://github.com/apache/lucene/pull/12787#discussion_r1388637999
##
gradle/validation/rat-sources.gradle:
##
@@ -53,6 +53,9 @@ allprojects {
include "**/*.sh"
include "**/*.bat"
+// exclude anyt
rmuir commented on code in PR #12787:
URL: https://github.com/apache/lucene/pull/12787#discussion_r1388639032
##
gradle/validation/rat-sources.gradle:
##
@@ -53,6 +53,9 @@ allprojects {
include "**/*.sh"
include "**/*.bat"
+// exclude anyt
uschindler commented on code in PR #12787:
URL: https://github.com/apache/lucene/pull/12787#discussion_r1388639477
##
gradle/validation/rat-sources.gradle:
##
@@ -53,6 +53,9 @@ allprojects {
include "**/*.sh"
include "**/*.bat"
+// exclude
rmuir commented on code in PR #12787:
URL: https://github.com/apache/lucene/pull/12787#discussion_r1388642669
##
gradle/validation/rat-sources.gradle:
##
@@ -53,6 +53,9 @@ allprojects {
include "**/*.sh"
include "**/*.bat"
+// exclude anyt
uschindler commented on code in PR #12787:
URL: https://github.com/apache/lucene/pull/12787#discussion_r1388643017
##
gradle/validation/rat-sources.gradle:
##
@@ -53,6 +53,9 @@ allprojects {
include "**/*.sh"
include "**/*.bat"
+// exclude
rmuir commented on code in PR #12787:
URL: https://github.com/apache/lucene/pull/12787#discussion_r1388647325
##
gradle/validation/rat-sources.gradle:
##
@@ -53,6 +53,9 @@ allprojects {
include "**/*.sh"
include "**/*.bat"
+// exclude anyt
uschindler commented on code in PR #12787:
URL: https://github.com/apache/lucene/pull/12787#discussion_r1388642702
##
gradle/validation/rat-sources.gradle:
##
@@ -53,6 +53,9 @@ allprojects {
include "**/*.sh"
include "**/*.bat"
+// exclude
uschindler commented on code in PR #12787:
URL: https://github.com/apache/lucene/pull/12787#discussion_r1388648419
##
gradle/validation/rat-sources.gradle:
##
@@ -53,6 +53,9 @@ allprojects {
include "**/*.sh"
include "**/*.bat"
+// exclude
uschindler commented on code in PR #12787:
URL: https://github.com/apache/lucene/pull/12787#discussion_r1388668515
##
gradle/validation/rat-sources.gradle:
##
@@ -53,6 +53,9 @@ allprojects {
include "**/*.sh"
include "**/*.bat"
+// exclude
robertvanwinkle1138 commented on issue #12615:
URL: https://github.com/apache/lucene/issues/12615#issuecomment-1804858072
Perhaps much of the jvector performance improvement is simply from on-heap
caching.
https://github.com/jbellis/jvector/blob/main/jvector-base/src/main/java/io/git
vsop-479 commented on issue #12788:
URL: https://github.com/apache/lucene/issues/12788#issuecomment-1804999164
> To benchmark then use the benchmark-jmh Gradle module. This will enable
vectorization if all is sane.
Thanks for your explanation. I will try it.
> are you seeing it
easyice commented on code in PR #12782:
URL: https://github.com/apache/lucene/pull/12782#discussion_r1388843109
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/GroupVintWriter.java:
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
easyice commented on code in PR #12782:
URL: https://github.com/apache/lucene/pull/12782#discussion_r1388845597
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/GroupVintReader.java:
##
@@ -0,0 +1,176 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
easyice commented on PR #12782:
URL: https://github.com/apache/lucene/pull/12782#issuecomment-1805059427
@jpountz @rmuir Thanks for your suggestions, they're very helpful to me! I
will run the benchmark for recomputing length vs. table lookup.
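One possible reading of "recomputing length vs. table lookup" is sketched below. It assumes a group-varint flag byte that packs four 2-bit fields, each holding (byte length - 1) of one integer; that layout, the class name, and both methods are assumptions for illustration, not the code under review in the PR.

```java
// Hypothetical sketch: total encoded length of a 4-int group, derived from its flag byte.
final class GroupVintLengthSketch {

  // Precomputed total length for every possible flag byte.
  private static final int[] LENGTH_TABLE = new int[256];

  static {
    for (int flag = 0; flag < 256; flag++) {
      LENGTH_TABLE[flag] = computeLength(flag);
    }
  }

  // Recompute: each 2-bit field stores (length - 1), so add 1 per integer (4 in total).
  public static int computeLength(int flag) {
    return 4
        + (flag & 3)
        + ((flag >>> 2) & 3)
        + ((flag >>> 4) & 3)
        + ((flag >>> 6) & 3);
  }

  // Table lookup: one array read instead of the shifts and adds above.
  public static int lookupLength(int flag) {
    return LENGTH_TABLE[flag & 0xFF];
  }
}
```

The two methods return identical values for all 256 flags, so the choice between them is purely a performance question, which is what a benchmark would settle.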
dungba88 commented on PR #12624:
URL: https://github.com/apache/lucene/pull/12624#issuecomment-1805190081
@mikemccand I put out another revision. Basically the idea is to write
everything to a DataOutput (BytesStore is also a DataOutput). To support
write-then-read-immediately use case that
zhaih merged PR #12767:
URL: https://github.com/apache/lucene/pull/12767
jpountz commented on PR #12741:
URL: https://github.com/apache/lucene/pull/12741#issuecomment-1805220170
[Nightly benchmarks](https://home.apache.org/~mikemccand/lucenebench/) just
caught up with this change; it's not obvious that there is a speedup.
gf2121 commented on PR #12741:
URL: https://github.com/apache/lucene/pull/12741#issuecomment-1805253760
FYI this great
[view](https://home.apache.org/~mikemccand/lucenebench/2023.11.09.18.02.58.html)
makes it easier to see the impact of changes in a single day for all tasks. It
seems some co