vsop-479 commented on code in PR #12846:
URL: https://github.com/apache/lucene/pull/12846#discussion_r1405891395
##
lucene/core/src/java/org/apache/lucene/codecs/CompetitiveImpactAccumulator.java:
##
@@ -93,6 +93,21 @@ public void addAll(CompetitiveImpactAccumulator acc) {
mikemccand commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1406124683
##
lucene/core/src/java/org/apache/lucene/index/DocumentsWriterPerThread.java:
##
@@ -262,6 +277,73 @@ long updateDocuments(
}
}
+ private interface DocV
mikemccand commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1827825175
> using a doc-value field where only parents documents have a value for the
field, and the value must be the number of child documents that the parent has
This is a neat idea to
epugh commented on issue #11657:
URL: https://github.com/apache/lucene/issues/11657#issuecomment-1827887052
OpenNLP 2.3.1 was recently released, and it would be nice to have Lucene pick it
up.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log o
javanna commented on PR #12798:
URL: https://github.com/apache/lucene/pull/12798#issuecomment-1828210981
With the latest updates, I am not convinced about this change. I think it's
great to use TaskExecutor to execute parallel tasks, like you did in #12799,
but I am under the impression tha
epugh commented on PR #12674:
URL: https://github.com/apache/lucene/pull/12674#issuecomment-1828340906
FYI 2.3.1 was just released.
msokolov commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1828402628
> I don't think we give up any functionality. can you elaborate what
functionality you are referring to? I don't think we should have a list of
parent fields that IW requires, what woul
mikemccand commented on code in PR #12699:
URL: https://github.com/apache/lucene/pull/12699#discussion_r140662
##
lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/SegmentTermsEnumFrame.java:
##
@@ -104,13 +104,9 @@ public SegmentTermsEnumFrame(SegmentTermsEnu
msfroh commented on PR #12750:
URL: https://github.com/apache/lucene/pull/12750#issuecomment-1828469855
I was looking into this and the approach used for (Edge)NGramTokenizer back
in 2013:
https://github.com/apache/lucene/commit/a03e38d5d05008aaef969a200071c03a1d6cb991
The solution t
mikemccand commented on PR #12624:
URL: https://github.com/apache/lucene/pull/12624#issuecomment-1828548480
Hmm I'm running `Test2BFSTs` on this patch and noticed it seems to take very
much longer during the `TEST: now verify` step where it confirms the built FST
accepts all the inputs it j
mikemccand commented on PR #12624:
URL: https://github.com/apache/lucene/pull/12624#issuecomment-1828590265
Hmm, also the `FSTCompiler.ramBytesUsed()` seems to no longer return the
growing FST size:
```
1> 310: 560 bytes; 594876500 nodes
1> 320: 560 bytes; 614066389
```
mikemccand commented on PR #12624:
URL: https://github.com/apache/lucene/pull/12624#issuecomment-1828597325
Hmm, also oddly -- why does the number of nodes differ between `main` and 9.x?
This PR should not have altered how many nodes are created as a function of
FST inputs right? Or maybe h
dungba88 commented on PR #12624:
URL: https://github.com/apache/lucene/pull/12624#issuecomment-1828839806
Ah I think since we removed the finish(), getting the reverse bytes reader
is expectedly slower. We have to copy the bytes to a readonly buffer every
time. If this is a problem maybe le
zacharymorn merged PR #240:
URL: https://github.com/apache/lucene/pull/240
dungba88 commented on PR #12624:
URL: https://github.com/apache/lucene/pull/12624#issuecomment-1828936176
I checked some of the usages in the analysis module. SynonymGraphFilter caches
the `BytesReader` in its constructor, and I think TokenFilters are by default cached
per field? But lots of other
dungba88 opened a new pull request, #12847:
URL: https://github.com/apache/lucene/pull/12847
### Description
- Report the time it took for building the FST
- Report the FST actual size, as it can differ from the RAM bytes used once
the test is moved to off-heap
gf2121 commented on PR #12699:
URL: https://github.com/apache/lucene/pull/12699#issuecomment-1829112668
Thanks for the review and great suggestions, @mikemccand!
> you want to merge and backport to 9.x?
Yes. I'll merge and backport this.
gf2121 merged PR #12699:
URL: https://github.com/apache/lucene/pull/12699
dungba88 commented on PR #12624:
URL: https://github.com/apache/lucene/pull/12624#issuecomment-1829144978
Tested Test2BFST with `-Dtests.seed=D193E7FD4B9E68C4`
**mainline**
```
110: 432584968 RAM bytes used; 432367203 FST bytes; 211082699 nodes;
took 248 seconds
```
19 matches