gf2121 opened a new pull request, #12835:
URL: https://github.com/apache/lucene/pull/12835
(no comment)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-ma
gf2121 commented on code in PR #12810:
URL: https://github.com/apache/lucene/pull/12810#discussion_r1402985278
##
lucene/core/src/java/org/apache/lucene/codecs/MultiLevelSkipListReader.java:
##
@@ -63,7 +63,7 @@ public abstract class MultiLevelSkipListReader implements
Closeabl
msokolov commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1823517625
I like @jpountz's idea to make the value of this field be the number of
children. It is simple and makes sense, and is pretty close to having the
degree of flexibility that the current
dweiss commented on code in PR #12834:
URL: https://github.com/apache/lucene/pull/12834#discussion_r1402712071
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/SortingStrategy.java:
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (AS
dweiss commented on code in PR #12834:
URL: https://github.com/apache/lucene/pull/12834#discussion_r1402711522
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/SortingStrategy.java:
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (AS
stefanvodita commented on code in PR #12756:
URL: https://github.com/apache/lucene/pull/12756#discussion_r1402701132
##
lucene/core/src/test/org/apache/lucene/search/TestIndexSearcher.java:
##
@@ -293,4 +298,218 @@ public void testNullExecutorNonNullTaskExecutor() {
IndexSe
donnerpeter commented on code in PR #12834:
URL: https://github.com/apache/lucene/pull/12834#discussion_r1402703007
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/SortingStrategy.java:
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundatio
jpountz closed pull request #12823: Dry up DirectReader implementations
URL: https://github.com/apache/lucene/pull/12823
jpountz commented on PR #12823:
URL: https://github.com/apache/lucene/pull/12823#issuecomment-1823222857
It got no additional feedback in the last couple days, so I'll default to
closing if you don't mind. Thanks for contributing!
dweiss commented on code in PR #12834:
URL: https://github.com/apache/lucene/pull/12834#discussion_r1402449022
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/SortingStrategy.java:
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (AS
jpountz commented on code in PR #12810:
URL: https://github.com/apache/lucene/pull/12810#discussion_r1402438164
##
lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/LegacyMultiLevelSkipListReader.java:
##
@@ -0,0 +1,263 @@
+/*
+ * Licensed to the Apache Software
jpountz commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1823167735
In general, I like the idea of making block joins more of a first-class
citizen. I have been thinking for a long time about changing how blocks are
identified from using bitsets to using
stefanvodita commented on PR #12817:
URL: https://github.com/apache/lucene/pull/12817#issuecomment-1823134763
Thank you @msokolov! I pushed a
[commit](https://github.com/stefanvodita/lucene/commit/0b7498fe1af9ccb7a71df79655b3e3dbff3b253f)
with a CHANGES entry. Do you need me to open a PR ag
jpountz commented on PR #12622:
URL: https://github.com/apache/lucene/pull/12622#issuecomment-1823130379
@s1monw I pushed a commit that should address your feedback
jpountz commented on code in PR #12622:
URL: https://github.com/apache/lucene/pull/12622#discussion_r1402394564
##
lucene/core/src/java/org/apache/lucene/index/IndexWriter.java:
##
@@ -5160,20 +5177,74 @@ public int length() {
}
mergeReaders.add(wrappedReader);
msokolov commented on PR #12817:
URL: https://github.com/apache/lucene/pull/12817#issuecomment-1823101661
It just occurred to me; perhaps we should add a CHANGE log entry? And it
could be nice to backport to 9.x if you like
msokolov merged PR #12817:
URL: https://github.com/apache/lucene/pull/12817
jpountz commented on code in PR #12622:
URL: https://github.com/apache/lucene/pull/12622#discussion_r1402306694
##
lucene/core/src/java/org/apache/lucene/index/IndexWriter.java:
##
@@ -3475,6 +3475,8 @@ public void addIndexesReaderMerge(MergePolicy.OneMerge
merge) throws IOExce
gf2121 commented on PR #12699:
URL: https://github.com/apache/lucene/pull/12699#issuecomment-1822923162
Thanks for the review, @mikemccand!
> but some head scratching -- hard to remember how these two crazy iterators
work.
Agreed that this is head-scratching... I made a chart to try
gf2121 commented on code in PR #12699:
URL: https://github.com/apache/lucene/pull/12699#discussion_r1402138507
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/IntersectTermsEnumFrame.java:
##
@@ -142,12 +138,20 @@ public void setState(int state) {
}
v
gf2121 commented on code in PR #12699:
URL: https://github.com/apache/lucene/pull/12699#discussion_r1402136479
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnumFrame.java:
##
@@ -247,7 +243,7 @@ void rewind() {
nextEnt = -1;
hasTerms
gf2121 commented on code in PR #12699:
URL: https://github.com/apache/lucene/pull/12699#discussion_r1402131137
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnumFrame.java:
##
@@ -104,13 +104,9 @@ public SegmentTermsEnumFrame(SegmentTermsEnum st
jpountz commented on issue #12826:
URL: https://github.com/apache/lucene/issues/12826#issuecomment-1822851422
Cool!
easyice commented on issue #12826:
URL: https://github.com/apache/lucene/issues/12826#issuecomment-1822845516
Sorry for the late reply. I got lost in a wrong JMH loop for a while. Now
I have the correct result:
`memorySegmentReadGroupVInt` is faster than `byteBufferReadGroupVInt` in
gf2121 commented on code in PR #12699:
URL: https://github.com/apache/lucene/pull/12699#discussion_r1402119508
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -1190,4 +1176,63 @@ public void seekExact(long ord) {
public long or
gf2121 commented on code in PR #12699:
URL: https://github.com/apache/lucene/pull/12699#discussion_r1402112063
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -1190,4 +1176,63 @@ public void seekExact(long ord) {
public long or
jpountz merged PR #12833:
URL: https://github.com/apache/lucene/pull/12833
dungba88 commented on code in PR #12624:
URL: https://github.com/apache/lucene/pull/12624#discussion_r1402058230
##
lucene/core/src/java/org/apache/lucene/util/fst/FST.java:
##
@@ -435,6 +433,13 @@ public FST(FSTMetadata metadata, DataInput in,
Outputs outputs, FSTStore f
mikemccand commented on PR #12803:
URL: https://github.com/apache/lucene/pull/12803#issuecomment-1822773144
Hmm, I tried to backport, but `FSTTermsReader.java` had conflicts, which I tried
to resolve; then I hit scary test failures, and now I have run out of time for the
moment! Will take it up again s
mikemccand merged PR #12803:
URL: https://github.com/apache/lucene/pull/12803
mikemccand closed issue #12822: Remove the FST constructors with DataInput for
metadata
URL: https://github.com/apache/lucene/issues/12822
mikemccand commented on code in PR #12624:
URL: https://github.com/apache/lucene/pull/12624#discussion_r1402022525
##
lucene/core/src/java/org/apache/lucene/util/fst/FST.java:
##
@@ -435,6 +433,13 @@ public FST(FSTMetadata metadata, DataInput in,
Outputs outputs, FSTStore f
mikemccand commented on code in PR #12624:
URL: https://github.com/apache/lucene/pull/12624#discussion_r1402017454
##
lucene/core/src/java/org/apache/lucene/util/fst/ByteBuffersFSTReader.java:
##
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
mikemccand merged PR #12830:
URL: https://github.com/apache/lucene/pull/12830
mikemccand commented on code in PR #12699:
URL: https://github.com/apache/lucene/pull/12699#discussion_r1400933055
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnum.java:
##
@@ -1190,4 +1176,63 @@ public void seekExact(long ord) {
public lon
easyice commented on PR #12833:
URL: https://github.com/apache/lucene/pull/12833#issuecomment-1822611956
Looks good to me, thank you @jpountz. That said, I'm a bit curious that
`byteArrayReadGroupVInt` is so much faster than `byteBufferReadGroupVInt`.
donnerpeter commented on code in PR #12834:
URL: https://github.com/apache/lucene/pull/12834#discussion_r1401913469
##
lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/SortingStrategy.java:
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundatio
donnerpeter opened a new pull request, #12834:
URL: https://github.com/apache/lucene/pull/12834
### Description
jpountz commented on code in PR #12833:
URL: https://github.com/apache/lucene/pull/12833#discussion_r1401912046
##
lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/GroupVIntBenchmark.java:
##
@@ -103,11 +127,16 @@ void initByteBufferInput(long[] docs) throws Excepti
gf2121 commented on code in PR #12833:
URL: https://github.com/apache/lucene/pull/12833#discussion_r1401871396
##
lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/GroupVIntBenchmark.java:
##
@@ -103,11 +127,16 @@ void initByteBufferInput(long[] docs) throws Exceptio
jpountz commented on PR #12782:
URL: https://github.com/apache/lucene/pull/12782#issuecomment-1822506995
I opened a PR to feed some of this data into the micro benchmark to make it
more realistic: https://github.com/apache/lucene/pull/12833.
jpountz commented on PR #12833:
URL: https://github.com/apache/lucene/pull/12833#issuecomment-1822504924
Here is the output of the benchmark on my machine:
```
Benchmark (size) Mode Cnt Score
Error Units
GroupVIntBenchmark.byteArrayRead
easyice commented on PR #12782:
URL: https://github.com/apache/lucene/pull/12782#issuecomment-1822504456
It's very important as a reference! Thanks a lot!
jpountz opened a new pull request, #12833:
URL: https://github.com/apache/lucene/pull/12833
Instead of using a fixed number of bits per value, the group-varint
benchmark now tries to reproduce the distribution of the number of bits per
value that can be observed on tail postings of wikibig
jpountz commented on PR #12782:
URL: https://github.com/apache/lucene/pull/12782#issuecomment-1822453957
For reference, I computed the most frequent `flag` values on wikibigall,
which are the values that might be worth optimizing for:
- 0x55 (4 2-byte ints): 29.6%
- 0xaa (4 3-byte
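For readers unfamiliar with the `flag` byte mentioned above, here is a minimal sketch of how a group-varint flag can be read as four 2-bit fields, each storing the byte length of one int minus one. The class name `GroupVarintFlagSketch`, its methods, and the choice of putting the first int in the two highest bits are assumptions for illustration, not Lucene's actual code; for symmetric flags such as 0x55 and 0xaa the field order does not matter.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: interprets a group-varint flag byte.
final class GroupVarintFlagSketch {

  // Returns the byte length (1..4) of each of the four ints described by the flag.
  // Assumption: the first int's length lives in the two highest bits.
  static int[] byteLengths(int flag) {
    int[] lengths = new int[4];
    for (int i = 0; i < 4; i++) {
      int shift = 6 - 2 * i;
      lengths[i] = ((flag >> shift) & 0x03) + 1;
    }
    return lengths;
  }

  // Total encoded size of one group: one flag byte plus the four payloads.
  static int groupSizeInBytes(int flag) {
    int size = 1;
    for (int length : byteLengths(flag)) {
      size += length;
    }
    return size;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(byteLengths(0x55))); // [2, 2, 2, 2]
    System.out.println(Arrays.toString(byteLengths(0xaa))); // [3, 3, 3, 3]
    System.out.println(groupSizeInBytes(0x55)); // 9
  }
}
```

Under this reading, a decoder that special-cases a frequent flag value can read a group with a fixed, known stride instead of branching on each length.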
gf2121 commented on code in PR #12810:
URL: https://github.com/apache/lucene/pull/12810#discussion_r1401710042
##
lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/LegacyMultiLevelSkipListReader.java:
##
@@ -0,0 +1,263 @@
+/*
+ * Licensed to the Apache Software F
jpountz commented on code in PR #12810:
URL: https://github.com/apache/lucene/pull/12810#discussion_r1401733088
##
lucene/backward-codecs/src/java/org/apache/lucene/backward_codecs/LegacyMultiLevelSkipListReader.java:
##
@@ -0,0 +1,263 @@
+/*
+ * Licensed to the Apache Software
jpountz commented on PR #12832:
URL: https://github.com/apache/lucene/pull/12832#issuecomment-1822375974
Thanks @easyice and @gf2121 for looking!
jpountz merged PR #12832:
URL: https://github.com/apache/lucene/pull/12832
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apa
easyice commented on PR #12832:
URL: https://github.com/apache/lucene/pull/12832#issuecomment-1822307766
Great idea :)
jpountz commented on PR #12782:
URL: https://github.com/apache/lucene/pull/12782#issuecomment-1822300926
Also the
[size](http://people.apache.org/~mikemccand/lucenebench/indexing.html#FixedIndexSize)
increase is hardly noticeable.
jpountz opened a new pull request, #12832:
URL: https://github.com/apache/lucene/pull/12832
When we moved to group-varint for tail postings, we stopped interleaving docs
and freqs and instead wrote all docs first, then all freqs. This means that we
can now skip decoding frequencies when they a
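To illustrate why a docs-first, freqs-second layout allows frequencies to be skipped, here is a rough sketch. It uses fixed-width ints in a `ByteBuffer` rather than group-varint to stay short, and the class `TailPostingsLayoutSketch` and all of its methods are hypothetical names, not taken from the PR.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration, not Lucene code: a block stored as
// [all doc deltas][all freqs] instead of interleaved (doc, freq) pairs.
final class TailPostingsLayoutSketch {

  // Writes every doc delta first, then every frequency.
  static ByteBuffer write(int[] docDeltas, int[] freqs) {
    ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES * (docDeltas.length + freqs.length));
    for (int delta : docDeltas) {
      buf.putInt(delta);
    }
    for (int freq : freqs) {
      buf.putInt(freq);
    }
    buf.flip();
    return buf;
  }

  // Reads `count` consecutive ints starting at the buffer's current position.
  static int[] readInts(ByteBuffer buf, int count) {
    int[] values = new int[count];
    for (int i = 0; i < count; i++) {
      values[i] = buf.getInt();
    }
    return values;
  }

  // Skips the frequency section without decoding it. With an interleaved
  // layout, each freq would have to be stepped over individually while
  // reading the docs.
  static void skipFreqs(ByteBuffer buf, int count) {
    buf.position(buf.position() + Integer.BYTES * count);
  }

  public static void main(String[] args) {
    ByteBuffer block = write(new int[] {3, 1, 7}, new int[] {2, 5, 1});
    int[] docs = readInts(block, 3);
    skipFreqs(block, 3); // the caller does not need frequencies
    System.out.println(docs.length + " docs read, " + block.remaining() + " bytes left");
  }
}
```

With a variable-width encoding the skip needs the encoded length of the frequency section (or a cheap way to compute it), which this fixed-width sketch sidesteps.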
jpountz commented on PR #12782:
URL: https://github.com/apache/lucene/pull/12782#issuecomment-1822293603
There seems to be a speedup on [prefix
queries](http://people.apache.org/~mikemccand/lucenebench/Prefix3.html) in
nightly benchmarks, I'll add an annotation.