gf2121 opened a new pull request #706:
URL: https://github.com/apache/lucene/pull/706
A SIMD optimization for the BKD `DocIdsWriter` was introduced in
https://github.com/apache/lucene/pull/652 in order to speed up decoding of
docIDs, but it leads to a regression in the nightly benchmark.
ht
[
https://issues.apache.org/jira/browse/LUCENE-10435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Adrien Grand resolved LUCENE-10435.
---
Fix Version/s: 9.1
Resolution: Fixed
> Break loop early while checking whether DocVa
gf2121 merged pull request #706:
URL: https://github.com/apache/lucene/pull/706
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...
[
https://issues.apache.org/jira/browse/LUCENE-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497260#comment-17497260
]
ASF subversion and git services commented on LUCENE-10315:
--
Co
[
https://issues.apache.org/jira/browse/LUCENE-10417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497259#comment-17497259
]
ASF subversion and git services commented on LUCENE-10417:
--
Co
[
https://issues.apache.org/jira/browse/LUCENE-10194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mayya Sharipova reassigned LUCENE-10194:
Assignee: Mayya Sharipova
> Should IndexWriter buffer KNN vectors on disk?
> ---
gf2121 merged pull request #707:
URL: https://github.com/apache/lucene/pull/707
[
https://issues.apache.org/jira/browse/LUCENE-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497268#comment-17497268
]
ASF subversion and git services commented on LUCENE-10315:
--
Co
[
https://issues.apache.org/jira/browse/LUCENE-10417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497267#comment-17497267
]
ASF subversion and git services commented on LUCENE-10417:
--
Co
LuXugang commented on pull request #705:
URL: https://github.com/apache/lucene/pull/705#issuecomment-1049639011
> We need two cases:
>
> * Checking whether all documents match and returning values.getDocCount().
This works when there are no deletions.
> * Actually
jpountz merged pull request #705:
URL: https://github.com/apache/lucene/pull/705
[
https://issues.apache.org/jira/browse/LUCENE-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497272#comment-17497272
]
ASF subversion and git services commented on LUCENE-10439:
--
Co
[
https://issues.apache.org/jira/browse/LUCENE-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497273#comment-17497273
]
ASF subversion and git services commented on LUCENE-10439:
--
Co
[
https://issues.apache.org/jira/browse/LUCENE-10417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497280#comment-17497280
]
Adrien Grand commented on LUCENE-10417:
---
FYI Elasticsearch was upgraded to a rece
jpountz opened a new pull request #708:
URL: https://github.com/apache/lucene/pull/708
Since doc IDs with a vector are loaded as an int[] in memory, this changes the
on-disk format of vectors to align with the in-memory representation by using
ints instead of vints to represent doc ID
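The trade-off behind this change can be illustrated with a minimal sketch. It is not the code from the pull request: the class and method names are hypothetical, and the byte order of the fixed-width variant is simply `ByteBuffer`'s default. It only shows that vints are usually smaller on disk, while fixed-width ints map directly onto an in-memory `int[]` without per-value branching.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical illustration class, not part of Lucene.
final class DocIdEncodingSketch {

  /** Variable-length encoding: 7 payload bits per byte, high bit set when more bytes follow. */
  static byte[] encodeVInts(int[] docIds) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (int docId : docIds) {
      int v = docId;
      while ((v & ~0x7F) != 0) {
        out.write((v & 0x7F) | 0x80);
        v >>>= 7;
      }
      out.write(v);
    }
    return out.toByteArray();
  }

  /** Decoding vints needs a data-dependent inner loop for every value. */
  static int[] decodeVInts(byte[] bytes, int count) {
    int[] docIds = new int[count];
    int pos = 0;
    for (int i = 0; i < count; i++) {
      byte b = bytes[pos++];
      int v = b & 0x7F;
      for (int shift = 7; (b & 0x80) != 0; shift += 7) {
        b = bytes[pos++];
        v |= (b & 0x7F) << shift;
      }
      docIds[i] = v;
    }
    return docIds;
  }

  /** Fixed-width encoding: always 4 bytes per doc ID. */
  static byte[] encodeInts(int[] docIds) {
    ByteBuffer buf = ByteBuffer.allocate(docIds.length * Integer.BYTES);
    for (int docId : docIds) {
      buf.putInt(docId);
    }
    return buf.array();
  }

  /** Decoding fixed-width ints is a straight copy into an int[]. */
  static int[] decodeInts(byte[] bytes) {
    int[] docIds = new int[bytes.length / Integer.BYTES];
    ByteBuffer.wrap(bytes).asIntBuffer().get(docIds);
    return docIds;
  }

  public static void main(String[] args) {
    int[] docIds = {1, 128, 300_000};
    System.out.println("vint bytes: " + encodeVInts(docIds).length);
    System.out.println("int bytes:  " + encodeInts(docIds).length);
  }
}
```

For small doc IDs the vint form is more compact, so the fixed-width form trades some disk space for a simpler decode path.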
jpountz merged pull request #708:
URL: https://github.com/apache/lucene/pull/708
[
https://issues.apache.org/jira/browse/LUCENE-10408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497367#comment-17497367
]
ASF subversion and git services commented on LUCENE-10408:
--
Co
jpountz merged pull request #702:
URL: https://github.com/apache/lucene/pull/702
[
https://issues.apache.org/jira/browse/LUCENE-10382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497369#comment-17497369
]
ASF subversion and git services commented on LUCENE-10382:
--
Co
[
https://issues.apache.org/jira/browse/LUCENE-10382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497373#comment-17497373
]
ASF subversion and git services commented on LUCENE-10382:
--
Co
[
https://issues.apache.org/jira/browse/LUCENE-10408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497372#comment-17497372
]
ASF subversion and git services commented on LUCENE-10408:
--
Co
[
https://issues.apache.org/jira/browse/LUCENE-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Adrien Grand resolved LUCENE-10439.
---
Fix Version/s: 9.1
Resolution: Fixed
> Support multi-valued and multiple dimensions
[
https://issues.apache.org/jira/browse/LUCENE-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Adrien Grand reopened LUCENE-10315:
---
> Speed up BKD leaf block ids codec by a 512 ints ForUtil
> ---
[
https://issues.apache.org/jira/browse/LUCENE-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Adrien Grand updated LUCENE-10315:
--
Fix Version/s: (was: 9.1)
> Speed up BKD leaf block ids codec by a 512 ints ForUtil
> ---
jpountz commented on pull request #692:
URL: https://github.com/apache/lucene/pull/692#issuecomment-1049857125
@rmuir We can remove the cost estimation, but it will not address the
problem. I'll try to explain the problem differently in case it helps.
DocIdSetBuilder takes doc IDs in
[
https://issues.apache.org/jira/browse/LUCENE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497394#comment-17497394
]
Adrien Grand commented on LUCENE-10438:
---
Solr indeed has a version of faceting th
rmuir commented on pull request #692:
URL: https://github.com/apache/lucene/pull/692#issuecomment-1049869937
> @rmuir We can remove the cost estimation, but it will not address the
problem. I'll try to explain the problem differently in case it helps.
I really think it will address t
[
https://issues.apache.org/jira/browse/LUCENE-10432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497399#comment-17497399
]
Adrien Grand commented on LUCENE-10432:
---
[~reta] I wonder if you have thought abo
[
https://issues.apache.org/jira/browse/LUCENE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497402#comment-17497402
]
Adrien Grand commented on LUCENE-10431:
---
I've been staring at the code and at th
iverase commented on pull request #692:
URL: https://github.com/apache/lucene/pull/692#issuecomment-1049887302
32 bits will need to be discarded anyway, the issue is where.
You either do it at the PointValues level by calling grow like:
```
visitor.grow((int) Math.min(getDo
[
https://issues.apache.org/jira/browse/LUCENE-10432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497405#comment-17497405
]
Andriy Redko commented on LUCENE-10432:
---
Thanks [~jpountz]
> I wonder if you ha
rmuir commented on pull request #692:
URL: https://github.com/apache/lucene/pull/692#issuecomment-1049894634
If this is literally all about a "style" issue then let's be open and honest
about that. I am fine with:
```
/** sugar: to just make code look pretty, nothing else */
public B
rmuir commented on pull request #692:
URL: https://github.com/apache/lucene/pull/692#issuecomment-1049905030
To try to be more helpful, here's what I'd propose. I can try to hack up a
draft PR later if we want, if it is helpful.
DocIdSetBuilder, remove complex cost estimation:
* r
[
https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497422#comment-17497422
]
Adrien Grand commented on LUCENE-10428:
---
Ouch this is bad.
Note that in your cod
rmuir opened a new pull request #709:
URL: https://github.com/apache/lucene/pull/709
Cost estimation drives the API complexity out of control; we don't need it.
Hopefully I've cleared up all the API damage from this explosive leak.
Instead, FixedBitSet.approximateCardinality() is use
rmuir commented on pull request #709:
URL: https://github.com/apache/lucene/pull/709#issuecomment-1049948027
Here's a first stab at what I proposed on
https://github.com/apache/lucene/pull/692
You can see how damaging the current cost() implementation is.
As followup commits w
rmuir commented on pull request #692:
URL: https://github.com/apache/lucene/pull/692#issuecomment-1049948208
prototype: https://github.com/apache/lucene/pull/709
jpountz commented on pull request #709:
URL: https://github.com/apache/lucene/pull/709#issuecomment-1049959940
That change makes sense to me. FWIW my recollection from profiling
DocIdSetBuilder is that the deduplication logic is cheap and most of the time
is spent in `LSBRadixSorter#reorde
[
https://issues.apache.org/jira/browse/LUCENE-10432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497468#comment-17497468
]
Adrien Grand commented on LUCENE-10432:
---
The bit I'm missing is how you would let
rmuir commented on pull request #709:
URL: https://github.com/apache/lucene/pull/709#issuecomment-1049967927
If we want to add the `grow(long)` sugar method that simply truncates to
`Integer.MAX_VALUE` and clean up all the points callsites, or write a cool
FixedBitSet.approximateCardinalit
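A `grow(long)` sugar method of the kind discussed in this thread could look roughly like the sketch below. The class is a hypothetical stand-in, not Lucene's `DocIdSetBuilder`, and the negative-value check is an added assumption; only the clamp to `Integer.MAX_VALUE` comes from the discussion.

```java
// Hypothetical stand-in for a builder with an int-based grow(); not Lucene code.
final class GrowSugarSketch {
  private int reserved;

  /** The existing int-based entry point: reserve room for up to numDocs more docs. */
  void grow(int numDocs) {
    reserved = Math.max(reserved, numDocs);
  }

  /** Sugar overload: clamp a long count to Integer.MAX_VALUE and delegate. */
  void grow(long numDocs) {
    if (numDocs < 0) {
      // Assumption for this sketch: negative counts are rejected up front,
      // since a plain (int) cast of a large negative long could wrap around.
      throw new IllegalArgumentException("numDocs must be >= 0, got " + numDocs);
    }
    grow((int) Math.min(numDocs, Integer.MAX_VALUE));
  }

  int reserved() {
    return reserved;
  }

  public static void main(String[] args) {
    GrowSugarSketch builder = new GrowSugarSketch();
    builder.grow(42L);
    System.out.println(builder.reserved());
    builder.grow(5_000_000_000L);
    System.out.println(builder.reserved());
  }
}
```

Keeping the clamp in one overload means call sites that hold a `long` count do not each need their own `(int) Math.min(...)` cast.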
iverase commented on a change in pull request #709:
URL: https://github.com/apache/lucene/pull/709#discussion_r813988648
##
File path: lucene/core/src/java/org/apache/lucene/util/DocIdSetBuilder.java
##
@@ -266,20 +224,12 @@ private void upgradeToBitSet() {
public DocIdSet b
iverase commented on a change in pull request #709:
URL: https://github.com/apache/lucene/pull/709#discussion_r813994000
##
File path: lucene/core/src/java/org/apache/lucene/util/DocIdSetBuilder.java
##
@@ -266,20 +224,12 @@ private void upgradeToBitSet() {
public DocIdSet b
iverase commented on pull request #709:
URL: https://github.com/apache/lucene/pull/709#issuecomment-104552
I don't think the grow(long) is necessary, we can always add it to the IntersectVisitor
instead. It may be worth adjusting how we call grow() in BKDReader#addAll
as it does not need the
iverase edited a comment on pull request #709:
URL: https://github.com/apache/lucene/pull/709#issuecomment-104552
I don't think the grow(long) is necessary, we can always add it to the
IntersectVisitor instead. It may be worth adjusting how we call grow() in
BKDReader#addAll as it
[
https://issues.apache.org/jira/browse/LUCENE-10432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497499#comment-17497499
]
Andriy Redko commented on LUCENE-10432:
---
Yeah, it may not 100% cover everything (
rmuir commented on a change in pull request #709:
URL: https://github.com/apache/lucene/pull/709#discussion_r814039139
##
File path: lucene/core/src/java/org/apache/lucene/util/DocIdSetBuilder.java
##
@@ -266,20 +224,12 @@ private void upgradeToBitSet() {
public DocIdSet bui
rmuir commented on a change in pull request #709:
URL: https://github.com/apache/lucene/pull/709#discussion_r814040808
##
File path: lucene/core/src/java/org/apache/lucene/util/DocIdSetBuilder.java
##
@@ -266,20 +224,12 @@ private void upgradeToBitSet() {
public DocIdSet bui
iverase commented on a change in pull request #709:
URL: https://github.com/apache/lucene/pull/709#discussion_r814045946
##
File path: lucene/core/src/java/org/apache/lucene/util/DocIdSetBuilder.java
##
@@ -266,20 +224,12 @@ private void upgradeToBitSet() {
public DocIdSet b
iverase commented on a change in pull request #709:
URL: https://github.com/apache/lucene/pull/709#discussion_r814047234
##
File path: lucene/core/src/java/org/apache/lucene/util/DocIdSetBuilder.java
##
@@ -266,20 +224,12 @@ private void upgradeToBitSet() {
public DocIdSet b
jpountz opened a new pull request #710:
URL: https://github.com/apache/lucene/pull/710
This computes a pop count on a sample of the longs that back the bitset.
Quick benchmarks suggest that this runs 5x-10x faster than
`FixedBitSet#cardinality` depending on the length of the bitset
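The idea of estimating cardinality from a sample of the backing longs can be sketched as follows. This is an illustration under stated assumptions, not the implementation from the pull request: the class name, the fixed `stride` parameter, and the fallback to an exact count for short arrays are all hypothetical choices.

```java
// Hypothetical illustration of sampled pop count; not Lucene's FixedBitSet.
final class ApproxCardinalitySketch {

  /** Exact count: pop count over every backing long. */
  static long exactCardinality(long[] bits) {
    long count = 0;
    for (long word : bits) {
      count += Long.bitCount(word);
    }
    return count;
  }

  /**
   * Estimate: pop count every stride-th long, then scale up to the full length.
   * Falls back to the exact count when the array is too short to sample.
   */
  static long approximateCardinality(long[] bits, int stride) {
    if (stride <= 1 || bits.length < stride) {
      return exactCardinality(bits);
    }
    long sampledPopCount = 0;
    int sampledWords = 0;
    for (int i = 0; i < bits.length; i += stride) {
      sampledPopCount += Long.bitCount(bits[i]);
      sampledWords++;
    }
    return sampledPopCount * bits.length / sampledWords;
  }

  public static void main(String[] args) {
    long[] bits = new long[1024];
    java.util.Arrays.fill(bits, 0x5555555555555555L); // every other bit set
    System.out.println("exact:       " + exactCardinality(bits));
    System.out.println("approximate: " + approximateCardinality(bits, 16));
  }
}
```

The estimate is only as good as the assumption that set bits are spread evenly across the backing array; clustered bits can make a strided sample over- or under-count.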
[
https://issues.apache.org/jira/browse/LUCENE-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497526#comment-17497526
]
Adrien Grand commented on LUCENE-10427:
---
I know that the Elasticsearch team is lo
rmuir commented on pull request #710:
URL: https://github.com/apache/lucene/pull/710#issuecomment-1050050397
Since we made the method `abstract`, let's just have it forward to
exact-cardinality for the `JavaUtilBitSet` used in the unit tests? It should
fix the test issues.
I agree w
[
https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497553#comment-17497553
]
Ankit Jain commented on LUCENE-10428:
-
{quote}By any chance, were you able to see w
[
https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497553#comment-17497553
]
Ankit Jain edited comment on LUCENE-10428 at 2/24/22, 5:00 PM:
--
[
https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497553#comment-17497553
]
Ankit Jain edited comment on LUCENE-10428 at 2/24/22, 5:01 PM:
--
[
https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497597#comment-17497597
]
Adrien Grand commented on LUCENE-10428:
---
This is interesting indeed since query e
rmuir commented on a change in pull request #710:
URL: https://github.com/apache/lucene/pull/710#discussion_r814137771
##
File path: lucene/core/src/java/org/apache/lucene/util/FixedBitSet.java
##
@@ -176,6 +176,30 @@ public int cardinality() {
return (int) BitUtil.pop_arr
[
https://issues.apache.org/jira/browse/LUCENE-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julie Tibshirani updated LUCENE-10391:
--
Attachment: Screen Shot 2022-02-24 at 10.18.42 AM.png
> Reuse data structures across
[
https://issues.apache.org/jira/browse/LUCENE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497616#comment-17497616
]
Greg Miller commented on LUCENE-10438:
--
I experimented with this a bit for taxo- a
[
https://issues.apache.org/jira/browse/LUCENE-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497648#comment-17497648
]
Julie Tibshirani commented on LUCENE-10391:
---
Now that the benchmarks are runn
rmuir commented on pull request #710:
URL: https://github.com/apache/lucene/pull/710#issuecomment-1050162660
Also, another random suggestion for another day. I think it would be fine to
have some logic like this at some point:
```
if (length < N) {
return cardinality(); // fo
[
https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497658#comment-17497658
]
Ankit Jain commented on LUCENE-10428:
-
{quote}I opened a pull request that doesn't
jtibshirani commented on pull request #686:
URL: https://github.com/apache/lucene/pull/686#issuecomment-1050175765
Thanks @rmuir ! Are you okay to merge this? I got confused recently over a
sometimes-reproducible test failure.
[
https://issues.apache.org/jira/browse/LUCENE-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Greg Miller reassigned LUCENE-10440:
Assignee: Greg Miller
> Reduce visibility of TaxonomyFacets and FloatTaxonomyFacets
> --
Greg Miller created LUCENE-10440:
Summary: Reduce visibility of TaxonomyFacets and
FloatTaxonomyFacets
Key: LUCENE-10440
URL: https://issues.apache.org/jira/browse/LUCENE-10440
Project: Lucene - Core
rmuir commented on pull request #709:
URL: https://github.com/apache/lucene/pull/709#issuecomment-1050188471
> I don't think the grow(long) is necessary, we can always add it to the
IntersectVisitor instead. It may be worth adjusting how we call grow() in
BKDReader#addAll as it does n
gsmiller opened a new pull request #712:
URL: https://github.com/apache/lucene/pull/712
# Description
These two classes are really implementation details, meant to hold common
logic for our faceting implementations, but they are `public` and could be
extended by users. It would be n
gsmiller opened a new pull request #713:
URL: https://github.com/apache/lucene/pull/713
This is a "backport" of #712, providing early `@Deprecation` notice.
[
https://issues.apache.org/jira/browse/LUCENE-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497692#comment-17497692
]
Greg Miller commented on LUCENE-10440:
--
PRs posted for this. The only point maybe
magibney commented on pull request #380:
URL: https://github.com/apache/lucene/pull/380#issuecomment-1050227228
This patch applies cleanly and all tests pass. I plan to commit this within
the next few days, because I think it does improve things (targeting 9.1
release).
But I want t
[
https://issues.apache.org/jira/browse/LUCENE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497716#comment-17497716
]
Uwe Schindler commented on LUCENE-10431:
What was the exact query? Is it the on
[
https://issues.apache.org/jira/browse/LUCENE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497718#comment-17497718
]
Uwe Schindler commented on LUCENE-10431:
Sorry, with the builder pattern you can't c
[
https://issues.apache.org/jira/browse/LUCENE-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497749#comment-17497749
]
ASF subversion and git services commented on LUCENE-9952:
-
Commi
[
https://issues.apache.org/jira/browse/LUCENE-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497757#comment-17497757
]
ASF subversion and git services commented on LUCENE-9952:
-
Commi
[
https://issues.apache.org/jira/browse/LUCENE-10394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497873#comment-17497873
]
Gautam Worah commented on LUCENE-10394:
---
I'll try to work on this soon. Looking i
[ https://issues.apache.org/jira/browse/LUCENE-10394 ]
Gautam Worah deleted comment on LUCENE-10394:
---
was (Author: gworah):
I'll try to work on this soon. Looking into the ByteBuffer API in the meantime.
> Explore moving ByteBuffer(sData|Inde
Peixin Li created LUCENE-10441:
--
Summary: ArrayIndexOutOfBoundsException during indexing
Key: LUCENE-10441
URL: https://issues.apache.org/jira/browse/LUCENE-10441
Project: Lucene - Core
Issue Ty
[
https://issues.apache.org/jira/browse/LUCENE-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peixin Li updated LUCENE-10441:
---
Description:
Hi experts! I am facing an ArrayIndexOutOfBoundsException during indexing and
committ
[
https://issues.apache.org/jira/browse/LUCENE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497915#comment-17497915
]
Michael Bien commented on LUCENE-10431:
---
Thanks for the tips. I might have found
LuXugang opened a new pull request #714:
URL: https://github.com/apache/lucene/pull/714
Update CHANGES.txt for
[LUCENE-10424](https://issues.apache.org/jira/browse/LUCENE-10424) and
[LUCENE-10439](https://issues.apache.org/jira/browse/LUCENE-10439).
rmuir merged pull request #686:
URL: https://github.com/apache/lucene/pull/686
[
https://issues.apache.org/jira/browse/LUCENE-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497920#comment-17497920
]
ASF subversion and git services commented on LUCENE-10421:
--
Co
[
https://issues.apache.org/jira/browse/LUCENE-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497932#comment-17497932
]
ASF subversion and git services commented on LUCENE-10421:
--
Co
[
https://issues.apache.org/jira/browse/LUCENE-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir resolved LUCENE-10421.
--
Fix Version/s: 9.1
Resolution: Fixed
> Non-deterministic results from KnnVectorQuery?