thecoop commented on PR #11847:
URL: https://github.com/apache/lucene/pull/11847#issuecomment-1307057810
Unfortunately that doesn't seem to have much of an effect - same number
after a GC, with the option turned on or off
thecoop commented on PR #11847:
URL: https://github.com/apache/lucene/pull/11847#issuecomment-1307070854
Unfortunately that doesn't seem to have much of an impact, from what I can
see here.
@rmuir Would you be against having a string cache specifically in the
relevant methods in Fiel
scampi commented on issue #11702:
URL: https://github.com/apache/lucene/issues/11702#issuecomment-1307096355
I was involved in a [previous
issue](https://issues.apache.org/jira/browse/LUCENE-10449) that is related to
this one. The problem was a drop in performance when scanning
`SortedSetD
rmuir commented on PR #11847:
URL: https://github.com/apache/lucene/pull/11847#issuecomment-1307115506
yes, because it would translate into a leak for many other use-cases/applications.
rmuir commented on PR #11906:
URL: https://github.com/apache/lucene/pull/11906#issuecomment-1307150684
I bumped the RAM and restarted the test. But it is really broken that I can flush out all the docs with a 512MB heap, but need many, many gigabytes to merge them together. And it's only 16 m
thecoop commented on PR #11847:
URL: https://github.com/apache/lucene/pull/11847#issuecomment-1307164489
To be clear, are you referring to the extra memory used by the deduplication hashmap for the duration of the deserialisation, which will then be eligible for GC after the method returns?
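An editor's sketch of the kind of short-lived deduplication map being discussed here (an assumption about the approach, not the code in #11847): the map hands back one canonical instance per distinct string, and the map itself becomes garbage as soon as the deserialisation pass returns.
```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: returns a canonical String instance for duplicates.
// The map lives only for the duration of one deserialisation pass, so the
// extra memory it uses is eligible for GC once that pass completes.
final class StringDeduplicator {
  private final Map<String, String> seen = new HashMap<>();

  String dedup(String s) {
    String existing = seen.putIfAbsent(s, s);
    return existing != null ? existing : s;
  }
}
```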
benwtrent commented on PR #11905:
URL: https://github.com/apache/lucene/pull/11905#issuecomment-1307208338
> We have to start building up tests for these cases because this seems like
deja vu as far as int overflows in this area.
I am right there with ya @rmuir. 100% feels like "whack
rmuir commented on PR #11852:
URL: https://github.com/apache/lucene/pull/11852#issuecomment-1307217205
> I'm late to the party. Do we really want to have/maintain a web application under Lucene? An HTTP server would not be sufficient to develop a stateful web app; you need to write an app
rmuir commented on PR #11852:
URL: https://github.com/apache/lucene/pull/11852#issuecomment-1307222420
> Re: JS frameworks - I recognize my position is from Ludd, and it might be
untenable. If it gets out of hand we can always add something like jQuery, but
we can never remove, so let's sta
dweiss commented on PR #11905:
URL: https://github.com/apache/lucene/pull/11905#issuecomment-1307230679
There's a whole bunch of automated checks you could go through, selectively,
and try to enable them for the future. This includes IntLongMath, which is
currently off.
https://gith
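For readers unfamiliar with the check dweiss mentions: Error Prone's IntLongMath flags int arithmetic whose result is stored into a long, because the multiplication has already wrapped before the widening happens. A hedged illustration (editor's addition, not Lucene code) of roughly the pattern such a check is designed to catch:
```java
// Illustration only: the kind of pattern an IntLongMath-style check reports.
class IntLongMathExample {
  long totalBytes(int numDocs, int bytesPerDoc) {
    // Flagged: the multiplication happens in int and may wrap before widening to long.
    long wrong = numDocs * bytesPerDoc;
    // Safe: widening one operand first makes the multiplication happen in long.
    long right = (long) numDocs * bytesPerDoc;
    return right;
  }
}
```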
benwtrent opened a new pull request, #11907:
URL: https://github.com/apache/lucene/pull/11907
This commit fixes a latent casting bug where int multiplication could roll over into negative values.
`new byte[Math.toIntExact(numSplits * config.bytesPerDim)];`
`toIntExact` does nothin
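A minimal sketch (editor's addition, not necessarily the exact change merged in #11907) of why `toIntExact` cannot save the original expression, and what the standard remedy looks like. The constants below are hypothetical stand-ins for `numSplits` and `config.bytesPerDim`:
```java
class SplitBufferSizing {
  // Hypothetical stand-ins for the two int values multiplied in the original line.
  static final int NUM_SPLITS = 100_000;
  static final int BYTES_PER_DIM = 30_000;

  static int brokenSize() {
    // The product wraps in int arithmetic *before* Math.toIntExact sees it, so
    // toIntExact cannot detect the overflow (here it passes through a negative
    // size; for other operand values it can just as easily be a wrong positive one).
    return Math.toIntExact(NUM_SPLITS * BYTES_PER_DIM);
  }

  static int fixedSize() {
    // Widening one operand to long makes the multiplication happen in long;
    // toIntExact then throws ArithmeticException instead of silently wrapping.
    return Math.toIntExact((long) NUM_SPLITS * BYTES_PER_DIM);
  }
}
```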
benwtrent commented on PR #11907:
URL: https://github.com/apache/lucene/pull/11907#issuecomment-1307246646
@iverase you might be interested in this.
iverase commented on PR #11907:
URL: https://github.com/apache/lucene/pull/11907#issuecomment-1307282609
Actually, I think there are more occurrences of this multiplication without a check; could we add one? For example:
https://github.com/apache/lucene/blob/3210a42f0958e395930d2259e155a7149fb
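One way to add the kind of check iverase is asking about, sketched with the JDK's own overflow-checking helper (editor's addition; whether the actual fix uses this or an explicit long cast is up to the PR):
```java
class CheckedSizing {
  // Math.multiplyExact throws ArithmeticException on overflow instead of
  // silently wrapping, so a bad size fails loudly at the call site.
  static int checkedBytes(int count, int bytesPerValue) {
    return Math.multiplyExact(count, bytesPerValue);
  }
}
```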
rmuir commented on PR #11905:
URL: https://github.com/apache/lucene/pull/11905#issuecomment-1307289709
> Yeah, we can probably trigger this overflow by using 16268815 byte vectors
of few dimensions. Something as small as 2 dimensions could work.
> One issue with HNSW is that completel
iverase merged PR #11907:
URL: https://github.com/apache/lucene/pull/11907
rmuir commented on PR #11906:
URL: https://github.com/apache/lucene/pull/11906#issuecomment-1307432615
I looked into why the test is taking an eternity to run: the super slow merge at the end is spending all its time clearing bitsets! Looks like the wrong data structure...
```
java.la
```
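An editor's illustration of the cost model behind the "wrong data structure" remark (assumptions: a dense fixed-width bitset is backed by a `long[]`, and clearing it touches every word regardless of how few bits are set):
```java
import java.util.Arrays;

class ClearCostSketch {
  // Dense clear: cost is proportional to the bitset's capacity,
  // even if only a handful of bits were ever set.
  static void clearDense(long[] words) {
    Arrays.fill(words, 0L);
  }

  // Targeted clear: cost is proportional to the number of bits actually set,
  // which is what a sparser structure effectively gives you.
  static void clearOnlySetBits(long[] words, int[] setBits) {
    for (int bit : setBits) {
      words[bit >>> 6] &= ~(1L << bit);
    }
  }
}
```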
jpountz commented on issue #11676:
URL: https://github.com/apache/lucene/issues/11676#issuecomment-1307486547
I wonder if the complexity introduced by the nanotime trick is worth the
benefits, but I'm happy to discuss it over a PR. In my opinion only exceeding
the configured allowed timeout
rmuir commented on issue #11676:
URL: https://github.com/apache/lucene/issues/11676#issuecomment-1307533642
It is worth it. Nobody wants to debug test failures that happen because NTP skewed the clock.
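A sketch of the monotonic-clock approach being argued for (hypothetical class, not Lucene's actual `QueryTimeout` implementation): `System.nanoTime()` is monotonic, so an NTP adjustment of the wall clock cannot make the deadline fire early or late.
```java
// Hypothetical deadline based on the monotonic clock.
final class NanoDeadline {
  private final long deadlineNanos;

  NanoDeadline(long timeoutMillis) {
    this.deadlineNanos = System.nanoTime() + timeoutMillis * 1_000_000L;
  }

  boolean shouldExit() {
    // Comparing via subtraction is the overflow-safe way to compare nanoTime values.
    return System.nanoTime() - deadlineNanos >= 0;
  }
}
```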
gsmiller commented on code in PR #11881:
URL: https://github.com/apache/lucene/pull/11881#discussion_r1016914939
##
lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysScorer.java:
##
@@ -166,89 +160,158 @@ public int score(LeafCollector collector, Bits
acceptDocs, int m
jpountz commented on issue #11676:
URL: https://github.com/apache/lucene/issues/11676#issuecomment-1307598183
Sorry for the confusion; I was thinking of not relying on any timing info **at all** besides the one that is already encapsulated by the `QueryTimeout` object. Just relying on the f
jpountz commented on code in PR #11900:
URL: https://github.com/apache/lucene/pull/11900#discussion_r1016950141
##
lucene/codecs/src/java/org/apache/lucene/codecs/bloom/FuzzySet.java:
##
@@ -46,7 +46,9 @@ public class FuzzySet implements Accountable {
public static final in
gsmiller merged PR #11881:
URL: https://github.com/apache/lucene/pull/11881
rmuir commented on PR #11905:
URL: https://github.com/apache/lucene/pull/11905#issuecomment-1307727467
> * In Lucene 9.2+, the bug appears when there are `16268814`
(Integer.MAX_VALUE/(M * 2 + 1)) or more vectors in a single segment.
If this is correct, we should just be able to create
benwtrent commented on PR #11905:
URL: https://github.com/apache/lucene/pull/11905#issuecomment-1307744298
@rmuir Thinking outside the box! I will try that. It would definitely cause the graph offset calculation to be completely blown out of proportion, which is the cause of this overflow.
rmuir commented on PR #11905:
URL: https://github.com/apache/lucene/pull/11905#issuecomment-1307756832
Yes, if such a test works it may at least prevent similar regressions. Another possible idea is to give every vector a value of 0, then zip up the index; it should be ~16MB of zeros w
rmuir commented on PR #11905:
URL: https://github.com/apache/lucene/pull/11905#issuecomment-1307760820
@jdconrad helped with some math that may explain why previous tests didn't fail:
```
jshell> int M = 16;
M ==> 16
jshell> long v1 = (1 + (M*2)) * 4 * 16268814;
v1 ==> 2147
```
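A worked version of the same arithmetic (editor's addition; the 20M count is only illustrative, borrowed from the follow-up comment below): evaluated in int the product wraps negative, evaluated in long it does not.
```java
public class HnswOffsetOverflow {
  public static void main(String[] args) {
    int M = 16;
    int numVectors = 20_000_000; // illustrative count, as in the 20M-doc test below

    // int arithmetic: 33 * 4 * 20_000_000 wraps past Integer.MAX_VALUE.
    int asInt = (1 + (M * 2)) * 4 * numVectors;
    // long arithmetic: widening before the final multiply keeps the true value.
    long asLong = (1 + (M * 2)) * 4L * numVectors;

    System.out.println(asInt);  // -1654967296
    System.out.println(asLong); // 2640000000
  }
}
```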
rmuir commented on PR #11905:
URL: https://github.com/apache/lucene/pull/11905#issuecomment-1307821765
With the 20M docs it still didn't fail. I have the index saved so I can play around; maybe checkindex doesn't trigger what is needed here (e.g. advance vs next).
It is a little crazy
benwtrent commented on PR #11905:
URL: https://github.com/apache/lucene/pull/11905#issuecomment-1307833883
> It is a little crazy that this index has a 2.5GB .vex file that, if I run zip, deflates 98% down to 75MB. Very wasteful.
Agreed :). Once this stuff is solved, I hope to further i
jmazanec15 commented on issue #11354:
URL: https://github.com/apache/lucene/issues/11354#issuecomment-1307862709
Hi @mayya-sharipova @jtibshirani @msokolov
I figured out the issue with the recall in the previous tests - I was not using the copy of the vectors when recomputing the dis
sebbASF opened a new pull request, #71:
URL: https://github.com/apache/lucene-site/pull/71
GitHub repo currently says "Apache Lucene and Solr web site"
jdconrad commented on PR #11906:
URL: https://github.com/apache/lucene/pull/11906#issuecomment-1308006136
Just as confirmation, I'm seeing `FixedBitSet.clear` taking up a lot of time as well when running this test.
```
"Lucene Merge Thread #0" #18 daemon prio=5 os_prio=0 cpu=347309.
```
uschindler merged PR #71:
URL: https://github.com/apache/lucene-site/pull/71
uschindler merged PR #70:
URL: https://github.com/apache/lucene-site/pull/70
sebbASF opened a new issue, #72:
URL: https://github.com/apache/lucene-site/issues/72
It would be helpful to have a link to this issue tracker from the website.
Perhaps under 'Editing this site'?
rmuir commented on PR #11906:
URL: https://github.com/apache/lucene/pull/11906#issuecomment-1308119525
Current test still doesn't fail. checkIndex just calls nextDoc() on low-level vectors, but we may need to invoke skipping to find the issue. That's my theory, at least.
One thing miss
donnerpeter merged PR #11893:
URL: https://github.com/apache/lucene/pull/11893