jpountz commented on PR #12782:
URL: https://github.com/apache/lucene/pull/12782#issuecomment-1822293603
There seems to be a speedup on [prefix
queries](http://people.apache.org/~mikemccand/lucenebench/Prefix3.html) in
nightly benchmarks; I'll add an annotation.
--
This is an automated m
jpountz opened a new pull request, #12832:
URL: https://github.com/apache/lucene/pull/12832
When we moved to group-varint for tail postings, we stopped interleaving docs
and freqs and instead wrote all docs first, then all freqs. This means that we
can now skip decoding frequencies when they a
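A minimal sketch of the layout described in this message, under loud assumptions: `TailPostingsSketch` and all of its members are hypothetical names, and fixed-width ints stand in for group-varint to keep the example short. It only illustrates why writing all docs first and all freqs second lets a reader leave the freqs untouched until something asks for them; it is not Lucene's implementation.

```java
import java.nio.ByteBuffer;

/** Illustrative only: docs are stored first, freqs second, and freqs are decoded lazily. */
final class TailPostingsSketch {
  private final ByteBuffer block;
  private final int count;
  private int[] freqs; // stays null until a frequency is actually requested
  int freqDecodes = 0; // counts how many times the freq region was decoded

  TailPostingsSketch(ByteBuffer block, int count) {
    this.block = block;
    this.count = count;
  }

  /** Writes all docs, then all freqs (hypothetical fixed-width encoding). */
  static ByteBuffer encode(int[] docs, int[] freqs) {
    ByteBuffer buf = ByteBuffer.allocate(2 * docs.length * Integer.BYTES);
    for (int doc : docs) {
      buf.putInt(doc);
    }
    for (int freq : freqs) {
      buf.putInt(freq);
    }
    buf.flip();
    return buf;
  }

  /** Decodes only the doc region; the freq region is never read here. */
  int[] docs() {
    int[] docs = new int[count];
    for (int i = 0; i < count; i++) {
      docs[i] = block.getInt(i * Integer.BYTES);
    }
    return docs;
  }

  /** Decodes the freq region on first use, then serves from the cached array. */
  int freq(int i) {
    if (freqs == null) {
      freqs = new int[count];
      int base = count * Integer.BYTES; // freqs start right after the docs
      for (int j = 0; j < count; j++) {
        freqs[j] = block.getInt(base + j * Integer.BYTES);
      }
      freqDecodes++;
    }
    return freqs[i];
  }
}
```

With an interleaved layout (doc, freq, doc, freq, ...) the reader would have to step over every frequency to reach the next doc; with the split layout, a caller that never calls `freq` never pays for decoding that region.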
jpountz commented on PR #12782:
URL: https://github.com/apache/lucene/pull/12782#issuecomment-1822300926
Also, the
[size](http://people.apache.org/~mikemccand/lucenebench/indexing.html#FixedIndexSize)
increase is hardly noticeable.
easyice commented on PR #12832:
URL: https://github.com/apache/lucene/pull/12832#issuecomment-1822307766
Great idea :)
gf2121 commented on code in PR #12810:
URL: https://github.com/apache/lucene/pull/12810#discussion_r1401710042
##
lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/LegacyMultiLevelSkipListReader.java:
##
@@ -0,0 +1,263 @@
+/*
+ * Licensed to the Apache Software F
jpountz merged PR #12832:
URL: https://github.com/apache/lucene/pull/12832
jpountz commented on PR #12832:
URL: https://github.com/apache/lucene/pull/12832#issuecomment-1822375974
Thanks @easyice and @gf2121 for looking!
jpountz commented on code in PR #12810:
URL: https://github.com/apache/lucene/pull/12810#discussion_r1401733088
##
lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/LegacyMultiLevelSkipListReader.java:
##
@@ -0,0 +1,263 @@
+/*
+ * Licensed to the Apache Software
jpountz commented on PR #12782:
URL: https://github.com/apache/lucene/pull/12782#issuecomment-1822453957
For reference, I computed the most frequent `flag` values on wikibigall,
which are the values that might be worth optimizing for:
- 0x55 (four 2-byte ints): 29.6%
- 0xaa (four 3-byte
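For readers unfamiliar with the `flag` byte, here is a minimal sketch of how such a value maps to per-integer byte lengths. It assumes the usual group-varint layout of four 2-bit fields, each storing a byte length minus one, with the first integer in the two highest bits; the class and method names are hypothetical and this is not a copy of Lucene's decoder.

```java
import java.util.Arrays;

/** Illustrative only: interprets a group-varint flag byte under the assumed layout. */
final class GroupVarintFlagSketch {

  /** Returns the byte length of each of the four ints described by the flag. */
  static int[] byteLengths(int flag) {
    int[] lengths = new int[4];
    for (int i = 0; i < 4; i++) {
      // Assumption: the first int's field occupies the two highest bits.
      int shift = 6 - 2 * i;
      lengths[i] = ((flag >>> shift) & 0x03) + 1;
    }
    return lengths;
  }

  /** Total number of payload bytes that follow the flag byte. */
  static int payloadBytes(int flag) {
    return Arrays.stream(byteLengths(flag)).sum();
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(byteLengths(0x55))); // prints [2, 2, 2, 2]
    System.out.println(payloadBytes(0x55)); // prints 8
  }
}
```

Under that assumption, 0x55 (binary 01 01 01 01) describes four 2-byte integers, which matches the description given for that flag in the list above.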
jpountz opened a new pull request, #12833:
URL: https://github.com/apache/lucene/pull/12833
Instead of using a fixed number of bits per value, the group-varint
benchmark now tries to reproduce the distribution of the number of bits per
value that can be observed on tail postings of wikibig
easyice commented on PR #12782:
URL: https://github.com/apache/lucene/pull/12782#issuecomment-1822504456
It's very important as a reference! Thanks a lot!
jpountz commented on PR #12833:
URL: https://github.com/apache/lucene/pull/12833#issuecomment-1822504924
Here is the output of the benchmark on my machine:
```
Benchmark (size) Mode Cnt Score
Error Units
GroupVIntBenchmark.byteArrayRead
jpountz commented on PR #12782:
URL: https://github.com/apache/lucene/pull/12782#issuecomment-1822506995
I opened a PR to feed some of this data into the micro benchmark to make it
more realistic: https://github.com/apache/lucene/pull/12833.
gf2121 commented on code in PR #12833:
URL: https://github.com/apache/lucene/pull/12833#discussion_r1401871396
##
lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/GroupVIntBenchmark.java:
##
@@ -103,11 +127,16 @@ void initByteBufferInput(long[] docs) throws Exceptio
jpountz commented on code in PR #12833:
URL: https://github.com/apache/lucene/pull/12833#discussion_r1401912046
##
lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/GroupVIntBenchmark.java:
##
@@ -103,11 +127,16 @@ void initByteBufferInput(long[] docs) throws Excepti
donnerpeter opened a new pull request, #12834:
URL: https://github.com/apache/lucene/pull/12834
### Description
donnerpeter commented on code in PR #12834:
URL: https://github.com/apache/lucene/pull/12834#discussion_r1401913469
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/SortingStrategy.java:
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundatio
easyice commented on PR #12833:
URL: https://github.com/apache/lucene/pull/12833#issuecomment-1822611956
Looks good to me, thank you @jpountz. I'm a bit curious, though, why
`byteArrayReadGroupVInt` is so much faster than `byteBufferReadGroupVInt`.
mikemccand commented on code in PR #12699:
URL: https://github.com/apache/lucene/pull/12699#discussion_r1400933055
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -1190,4 +1176,63 @@ public void seekExact(long ord) {
public lon
mikemccand merged PR #12830:
URL: https://github.com/apache/lucene/pull/12830
mikemccand commented on code in PR #12624:
URL: https://github.com/apache/lucene/pull/12624#discussion_r1402017454
##
lucene/core/src/java/org/apache/lucene/util/fst/ByteBuffersFSTReader.java:
##
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
mikemccand commented on code in PR #12624:
URL: https://github.com/apache/lucene/pull/12624#discussion_r1402022525
##
lucene/core/src/java/org/apache/lucene/util/fst/FST.java:
##
@@ -435,6 +433,13 @@ public FST(FSTMetadata metadata, DataInput in,
Outputs outputs, FSTStore f
mikemccand closed issue #12822: Remove the FST constructors with DataInput for
metadata
URL: https://github.com/apache/lucene/issues/12822
mikemccand merged PR #12803:
URL: https://github.com/apache/lucene/pull/12803
mikemccand commented on PR #12803:
URL: https://github.com/apache/lucene/pull/12803#issuecomment-1822773144
Hmm, I tried to backport, but `FSTTermsReader.java` had conflicts, which I tried
to resolve, and then there were scary test failures, and now I have run out of time for the
moment! Will take it up again s
dungba88 commented on code in PR #12624:
URL: https://github.com/apache/lucene/pull/12624#discussion_r1402058230
##
lucene/core/src/java/org/apache/lucene/util/fst/FST.java:
##
@@ -435,6 +433,13 @@ public FST(FSTMetadata metadata, DataInput in,
Outputs outputs, FSTStore f
jpountz merged PR #12833:
URL: https://github.com/apache/lucene/pull/12833
gf2121 commented on code in PR #12699:
URL: https://github.com/apache/lucene/pull/12699#discussion_r1402112063
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -1190,4 +1176,63 @@ public void seekExact(long ord) {
public long or
gf2121 commented on code in PR #12699:
URL: https://github.com/apache/lucene/pull/12699#discussion_r1402119508
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -1190,4 +1176,63 @@ public void seekExact(long ord) {
public long or
easyice commented on issue #12826:
URL: https://github.com/apache/lucene/issues/12826#issuecomment-1822845516
Sorry for the late reply. I got lost in a wrong JMH loop for a while. Now
I have the correct result:
`memorySegmentReadGroupVInt` is faster than `byteBufferReadGroupVInt` in
jpountz commented on issue #12826:
URL: https://github.com/apache/lucene/issues/12826#issuecomment-1822851422
Cool!
gf2121 commented on code in PR #12699:
URL: https://github.com/apache/lucene/pull/12699#discussion_r1402131137
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnumFrame.java:
##
@@ -104,13 +104,9 @@ public SegmentTermsEnumFrame(SegmentTermsEnum st
gf2121 commented on code in PR #12699:
URL: https://github.com/apache/lucene/pull/12699#discussion_r1402136479
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnumFrame.java:
##
@@ -247,7 +243,7 @@ void rewind() {
nextEnt = -1;
hasTerms
gf2121 commented on code in PR #12699:
URL: https://github.com/apache/lucene/pull/12699#discussion_r1402138507
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/IntersectTermsEnumFrame.java:
##
@@ -142,12 +138,20 @@ public void setState(int state) {
}
v
gf2121 commented on PR #12699:
URL: https://github.com/apache/lucene/pull/12699#issuecomment-1822923162
Thanks for the review, @mikemccand!
> but some head scratching -- hard to remember how these two crazy iterators
work.
Agreed that this is a head-scratcher... I made a chart to try
jpountz commented on code in PR #12622:
URL: https://github.com/apache/lucene/pull/12622#discussion_r1402306694
##
lucene/core/src/java/org/apache/lucene/index/IndexWriter.java:
##
@@ -3475,6 +3475,8 @@ public void addIndexesReaderMerge(MergePolicy.OneMerge
merge) throws IOExce
msokolov merged PR #12817:
URL: https://github.com/apache/lucene/pull/12817
msokolov commented on PR #12817:
URL: https://github.com/apache/lucene/pull/12817#issuecomment-1823101661
It just occurred to me: perhaps we should add a CHANGES entry? And it
could be nice to backport to 9.x if you like.
jpountz commented on code in PR #12622:
URL: https://github.com/apache/lucene/pull/12622#discussion_r1402394564
##
lucene/core/src/java/org/apache/lucene/index/IndexWriter.java:
##
@@ -5160,20 +5177,74 @@ public int length() {
}
mergeReaders.add(wrappedReader);
jpountz commented on PR #12622:
URL: https://github.com/apache/lucene/pull/12622#issuecomment-1823130379
@s1monw I pushed a commit that should address your feedback
stefanvodita commented on PR #12817:
URL: https://github.com/apache/lucene/pull/12817#issuecomment-1823134763
Thank you @msokolov! I pushed a
[commit](https://github.com/stefanvodita/lucene/commit/0b7498fe1af9ccb7a71df79655b3e3dbff3b253f)
with a CHANGES entry. Do you need me to open a PR ag
jpountz commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1823167735
In general, I like the idea of making block joins more of a first-class
citizen. I have been thinking for a long time about changing how blocks are
identified from using bitsets to using
jpountz commented on code in PR #12810:
URL: https://github.com/apache/lucene/pull/12810#discussion_r1402438164
##
lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/LegacyMultiLevelSkipListReader.java:
##
@@ -0,0 +1,263 @@
+/*
+ * Licensed to the Apache Software
dweiss commented on code in PR #12834:
URL: https://github.com/apache/lucene/pull/12834#discussion_r1402449022
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/SortingStrategy.java:
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (AS
jpountz closed pull request #12823: Dry up DirectReader implementations
URL: https://github.com/apache/lucene/pull/12823
jpountz commented on PR #12823:
URL: https://github.com/apache/lucene/pull/12823#issuecomment-1823222857
It got no additional feedback in the last couple of days, so I'll default to
closing if you don't mind. Thanks for contributing!
donnerpeter commented on code in PR #12834:
URL: https://github.com/apache/lucene/pull/12834#discussion_r1402703007
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/SortingStrategy.java:
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundatio
stefanvodita commented on code in PR #12756:
URL: https://github.com/apache/lucene/pull/12756#discussion_r1402701132
##
lucene/core/src/test/org/apache/lucene/search/TestIndexSearcher.java:
##
@@ -293,4 +298,218 @@ public void testNullExecutorNonNullTaskExecutor() {
IndexSe
dweiss commented on code in PR #12834:
URL: https://github.com/apache/lucene/pull/12834#discussion_r1402711522
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/SortingStrategy.java:
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (AS
dweiss commented on code in PR #12834:
URL: https://github.com/apache/lucene/pull/12834#discussion_r1402712071
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/SortingStrategy.java:
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (AS
msokolov commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1823517625
I like @jpountz's idea to make the value of this field be the number of
children. It is simple and makes sense, and is pretty close to having the
degree of flexibility that the current
gf2121 commented on code in PR #12810:
URL: https://github.com/apache/lucene/pull/12810#discussion_r1402985278
##
lucene/core/src/java/org/apache/lucene/codecs/MultiLevelSkipListReader.java:
##
@@ -63,7 +63,7 @@ public abstract class MultiLevelSkipListReader implements
Closeabl
gf2121 opened a new pull request, #12835:
URL: https://github.com/apache/lucene/pull/12835
(no comment)