thecoop commented on PR #11847:
URL: https://github.com/apache/lucene/pull/11847#issuecomment-1301893972
This is ready to review. I've removed the threadlocal byte buffer.
With testing in elasticsearch, this reduced the duplicated strings in these
fields by 90%. For the linked tickets
thecoop commented on issue #11712:
URL: https://github.com/apache/lucene/issues/11712#issuecomment-1301896942
@dweiss Please see #11847 for a proposed solution
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL ab
dweiss commented on code in PR #11847:
URL: https://github.com/apache/lucene/pull/11847#discussion_r1012761533
##
lucene/core/src/java/org/apache/lucene/store/DataInput.java:
##
@@ -280,6 +281,18 @@ public String readString() throws IOException {
return new String(bytes, 0,
thecoop commented on PR #11847:
URL: https://github.com/apache/lucene/pull/11847#issuecomment-1301947013
Another option is to have the `FieldInfosFormat` classes handle string
pooling themselves, leaving `DataInput` unchanged.
--
This is an automated message from the Apache Git Service.
T
thecoop commented on code in PR #11847:
URL: https://github.com/apache/lucene/pull/11847#discussion_r1012767732
##
lucene/core/src/java/org/apache/lucene/store/DataInput.java:
##
@@ -280,6 +281,18 @@ public String readString() throws IOException {
return new String(bytes, 0
rmuir commented on PR #11847:
URL: https://github.com/apache/lucene/pull/11847#issuecomment-1301951432
I still have a problem with leaking memory on purpose. The issue here is
elastic's bad design, not lucene. Otherwise you wouldnt have thousands of
indexes X thousands of fields: that situa
rmuir commented on issue #11896:
URL: https://github.com/apache/lucene/issues/11896#issuecomment-1301971180
The build fails in CI with this issue (multiple times), so i captured a seed
to reproduce it.
i don't think intellij has anything to do with this.
If we aren't going to
iverase opened a new issue, #11898:
URL: https://github.com/apache/lucene/issues/11898
### Description
The current release wizard still contains instructions that needs to be
performed on JIRA. We are not using it any more so those instructions are
obsolete and should be removed / re
rmuir commented on PR #11860:
URL: https://github.com/apache/lucene/pull/11860#issuecomment-1302331407
> For this vector graph file, since we use the same bpv, we can easily find
a node's neighbor list just by offseting using `nodeOrdinal * 32 * bpv`. It's
important this is efficient becaus
dsmiley commented on PR #11847:
URL: https://github.com/apache/lucene/pull/11847#issuecomment-1302705957
There is somewhat a matter of taste at stake, so I can appreciate Robert's
point of view that DataInput is nice the way it is without adding this to it
for an edge (abuse?) case. I've s
rmuir commented on PR #11847:
URL: https://github.com/apache/lucene/pull/11847#issuecomment-1302748754
I think if you get yourself in this situation via oversharding/dynamic
fields/etc, then you can add `-XX:+UseStringDeduplication` to your commandline,
problem solved.
--
This is an aut
gsmiller commented on PR #11881:
URL: https://github.com/apache/lucene/pull/11881#issuecomment-1302761082
OK, this should be ready for review now.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to
dweiss commented on PR #11847:
URL: https://github.com/apache/lucene/pull/11847#issuecomment-1303040362
> -XX:+UseStringDeduplication to your commandline, problem solved.
This is actually a very elegant workaround... I wonder what the benchmarks
would show on those affected systems.
13 matches
Mail list logo