[GitHub] [lucene] jpountz edited a comment on pull request #180: LUCENE-9959: [WIP] Add non thread local based API for term vector reader usage

2021-06-15 Thread GitBox
jpountz edited a comment on pull request #180: URL: https://github.com/apache/lucene/pull/180#issuecomment-861482554 > +1 to fix the test I've come up with a simple fix in https://github.com/apache/lucene/pull/180/commits/5520c5d0aed30fdb3231f920881cd2c685153267. Please let me know

[GitHub] [lucene] gsmiller commented on pull request #181: LUCENE-10001: Make CollectionTerminatedException handling in MultiCollector configurable

2021-06-15 Thread GitBox
gsmiller commented on pull request #181: URL: https://github.com/apache/lucene/pull/181#issuecomment-861881544 @gautamworah96 thanks for the feedback! Adrien and I have been having some conversation around whether or not this change makes sense over in the Jira issue. As a result, I'm goin

[GitHub] [lucene] gsmiller commented on a change in pull request #181: LUCENE-10001: Make CollectionTerminatedException handling in MultiCollector configurable

2021-06-15 Thread GitBox
gsmiller commented on a change in pull request #181: URL: https://github.com/apache/lucene/pull/181#discussion_r652200243 ## File path: lucene/core/src/java/org/apache/lucene/search/MultiCollector.java ## @@ -25,13 +25,39 @@ /** * A {@link Collector} which allows running a s

[GitHub] [lucene] jpountz commented on pull request #92: Expunge big segment with oversize deletePct caused by continuously updating a batch of data

2021-06-15 Thread GitBox
jpountz commented on pull request #92: URL: https://github.com/apache/lucene/pull/92#issuecomment-861638216 If you look at `BaseMergePolicyTestCase` and `TestTieredMergePolicy`, we actually have tests that simulate merges in order to verify that things like the maximum percentage of delete

[GitHub] [lucene] jpountz commented on pull request #92: Expunge big segment with oversize deletePct caused by continuously updating a batch of data

2021-06-15 Thread GitBox
jpountz commented on pull request #92: URL: https://github.com/apache/lucene/pull/92#issuecomment-861634770 If I read this line correctly, it says that large segments (more than 50% of the maximum segment size) shouldn't be merged unless both the percentage of deletes of the segment and the p

[jira] [Resolved] (LUCENE-9998) The param 'fis' in StoredFieldsWriter.finish(FieldInfos fis, int numDocs) is never used

2021-06-15 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-9998. -- Fix Version/s: (was: 8.6.2) main (9.0) Resolution: Fixed > The p

[jira] [Commented] (LUCENE-9998) The param 'fis' in StoredFieldsWriter.finish(FieldInfos fis, int numDocs) is never used

2021-06-15 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363694#comment-17363694 ] ASF subversion and git services commented on LUCENE-9998: - Commi

[GitHub] [lucene] jpountz merged pull request #183: LUCENE-9998: delete useless param fis in StoredFieldsWriter.finish() and TermVectorsWriter.finish()

2021-06-15 Thread GitBox
jpountz merged pull request #183: URL: https://github.com/apache/lucene/pull/183 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please co

[GitHub] [lucene] jimczi commented on a change in pull request #185: LUCENE-9999: CombinedFieldQuery can fail with an exception when document is missing fields

2021-06-15 Thread GitBox
jimczi commented on a change in pull request #185: URL: https://github.com/apache/lucene/pull/185#discussion_r651810484 ## File path: lucene/sandbox/src/java/org/apache/lucene/sandbox/search/MultiNormsLeafSimScorer.java ## @@ -80,9 +80,7 @@ } private long getNormValue(

[jira] [Commented] (LUCENE-10003) Disallow C-style array declarations

2021-06-15 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363644#comment-17363644 ] Robert Muir commented on LUCENE-10003: -- My concern is really beyond when the tool
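
For readers unfamiliar with the distinction behind LUCENE-10003, here is a minimal sketch of the two declaration styles in plain Java. The class and method names are made up for illustration and do not come from the Lucene code base or from the tooling discussed in the issue.

```java
import java.util.Arrays;

/** Illustrative only: contrasts C-style and Java-style array declarations. */
final class ArrayDeclarationStyles {

  /** C-style: the brackets follow the variable name. */
  static int[] cStyle() {
    int values[] = {1, 2, 3};
    return values;
  }

  /** Java-style: the brackets follow the element type. */
  static int[] javaStyle() {
    int[] values = {1, 2, 3};
    return values;
  }

  /** Mixing both styles in one declaration: row is int[], grid is int[][]. */
  static String mixedType() {
    int[] row, grid[];
    row = new int[1];
    grid = new int[1][1];
    return row.getClass().getSimpleName() + " vs " + grid.getClass().getSimpleName();
  }

  public static void main(String[] args) {
    // Both styles produce the same type and the same contents.
    System.out.println(Arrays.equals(cStyle(), javaStyle()));
    System.out.println(mixedType());
  }
}
```

Both styles compile to the same type. The mixed declaration in `mixedType()` shows the kind of readability problem that a single enforced style avoids.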

[GitHub] [lucene] uschindler commented on pull request #177: Initial rewrite of MMapDirectory for JDK-17 preview (incubating) Panama APIs (>= JDK-17-ea-b25)

2021-06-15 Thread GitBox
uschindler commented on pull request #177: URL: https://github.com/apache/lucene/pull/177#issuecomment-861496169 In the next Panama iteration, there will also be a ready-to-use copy method, which has the same shape as the methods added for bulk copy: https://github.com/openjdk/panama-foreign/pull/

[jira] [Commented] (LUCENE-10001) Make CollectionTerminatedException handling in MultiCollector configurable

2021-06-15 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363632#comment-17363632 ] Greg Miller commented on LUCENE-10001: -- Thanks [~jpountz]! I'll clarify a bit but

[GitHub] [lucene] kkewwei commented on pull request #183: LUCENE-9998: delete useless param fis in StoredFieldsWriter.finish() and TermVectorsWriter.finish()

2021-06-15 Thread GitBox
kkewwei commented on pull request #183: URL: https://github.com/apache/lucene/pull/183#issuecomment-861494424 I have added the note, please check.

[GitHub] [lucene] uschindler edited a comment on pull request #177: Initial rewrite of MMapDirectory for JDK-17 preview (incubating) Panama APIs (>= JDK-17-ea-b25)

2021-06-15 Thread GitBox
uschindler edited a comment on pull request #177: URL: https://github.com/apache/lucene/pull/177#issuecomment-861492558 The issue is confirmed and for the readBytes() code there's already a workaround. Long term we will improve to have this for all array types. See discussion here: https:/

[GitHub] [lucene] uschindler commented on pull request #177: Initial rewrite of MMapDirectory for JDK-17 preview (incubating) Panama APIs (>= JDK-17-ea-b25)

2021-06-15 Thread GitBox
uschindler commented on pull request #177: URL: https://github.com/apache/lucene/pull/177#issuecomment-861492558 The issue is confirmed and for the readBytes() code there's already a workaround. Long term we will improve For the float and long options: copyMemory() has some overhead

[GitHub] [lucene] jpountz commented on pull request #180: LUCENE-9959: [WIP] Add non thread local based API for term vector reader usage

2021-06-15 Thread GitBox
jpountz commented on pull request #180: URL: https://github.com/apache/lucene/pull/180#issuecomment-861482554 +1 to fix the test

[GitHub] [lucene] jpountz commented on pull request #183: LUCENE-9998: delete useless param fis in StoredFieldsWriter.finish() and TermVectorsWriter.finish()

2021-06-15 Thread GitBox
jpountz commented on pull request #183: URL: https://github.com/apache/lucene/pull/183#issuecomment-861480777 Right above line 130 (`Improvements`) in `lucene/CHANGES.txt` on your working copy.

[GitHub] [lucene] jpountz commented on a change in pull request #185: LUCENE-9999: CombinedFieldQuery can fail with an exception when document is missing fields

2021-06-15 Thread GitBox
jpountz commented on a change in pull request #185: URL: https://github.com/apache/lucene/pull/185#discussion_r651759489 ## File path: lucene/sandbox/src/java/org/apache/lucene/sandbox/search/MultiNormsLeafSimScorer.java ## @@ -80,9 +80,7 @@ } private long getNormValue

[jira] [Commented] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

2021-06-15 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363613#comment-17363613 ] Adrien Grand commented on LUCENE-10004: --- Order is important because it implicitly

[jira] [Commented] (LUCENE-10003) Disallow C-style array declarations

2021-06-15 Thread Dawid Weiss (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363607#comment-17363607 ] Dawid Weiss commented on LUCENE-10003: -- You'd need to edit github workflow config

[jira] [Commented] (LUCENE-10003) Disallow C-style array declarations

2021-06-15 Thread David Smiley (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363603#comment-17363603 ] David Smiley commented on LUCENE-10003: --- I think it's enough for GitHub PRs to do

[jira] [Commented] (LUCENE-9935) Bulk merges for stored fields when index sorting is enabled

2021-06-15 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363578#comment-17363578 ] ASF subversion and git services commented on LUCENE-9935: - Commi

[GitHub] [lucene] dnhatn merged pull request #182: LUCENE-9935: Clone term vectors reader for merges

2021-06-15 Thread GitBox
dnhatn merged pull request #182: URL: https://github.com/apache/lucene/pull/182

[GitHub] [lucene] dnhatn commented on pull request #182: LUCENE-9935: Clone term vectors reader for merges

2021-06-15 Thread GitBox
dnhatn commented on pull request #182: URL: https://github.com/apache/lucene/pull/182#issuecomment-861408329 Thanks @jpountz.

[GitHub] [lucene] glawson0 commented on pull request #157: LUCENE-9963 Fix issue with FlattenGraphFilter throwing exceptions from holes

2021-06-15 Thread GitBox
glawson0 commented on pull request #157: URL: https://github.com/apache/lucene/pull/157#issuecomment-861386826 >Maybe another way to improve the checking for correctness in the randomized test (or maybe in a new randomized test) would be to randomly generate a set of strings from a limited

[jira] [Comment Edited] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

2021-06-15 Thread kkewwei (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363529#comment-17363529 ] kkewwei edited comment on LUCENE-10004 at 6/15/21, 10:31 AM:

[jira] [Commented] (LUCENE-9999) CombinedFieldQuery can fail when document is missing fields

2021-06-15 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363531#comment-17363531 ] Jim Ferenczi commented on LUCENE-9999: -- I opened https://github.com/apache/lucene/p

[GitHub] [lucene] jimczi opened a new pull request #185: CombinedFieldQuery can fail with an exception when document is missing fields

2021-06-15 Thread GitBox
jimczi opened a new pull request #185: URL: https://github.com/apache/lucene/pull/185 This change fixes a bug in `MultiNormsLeafSimScorer` that assumes that each field should have a norm for every term/document.
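
To illustrate the kind of bug described in this pull request, the following is a minimal, self-contained sketch of combining per-field norms while tolerating documents that lack a norm for some field. It is not the actual patch: `NormSource`, `MissingNormSketch` and `combinedNorm` are hypothetical names, and the `advanceExact`/`longValue` pair only loosely mirrors the shape of Lucene's `NumericDocValues`.

```java
import java.util.List;

/** Hypothetical stand-in for a per-field norms iterator; not a Lucene class. */
interface NormSource {
  /** Positions on doc; returns false when the document has no norm for this field. */
  boolean advanceExact(int doc);

  /** Value for the current document; only valid after advanceExact returned true. */
  long longValue();
}

final class MissingNormSketch {

  /** Sums weighted norms, skipping fields that have no value for the document. */
  static double combinedNorm(List<NormSource> fields, float[] weights, int doc) {
    double sum = 0;
    for (int i = 0; i < fields.size(); i++) {
      NormSource norms = fields.get(i);
      if (norms.advanceExact(doc)) { // read the value only when one exists
        sum += weights[i] * norms.longValue();
      }
    }
    return sum;
  }

  /** Builds an in-memory source from an array where null means "no norm". */
  static NormSource fromArray(Long[] values) {
    return new NormSource() {
      private int current = -1;

      @Override
      public boolean advanceExact(int doc) {
        current = doc;
        return values[doc] != null;
      }

      @Override
      public long longValue() {
        return values[current];
      }
    };
  }

  public static void main(String[] args) {
    NormSource title = fromArray(new Long[] {4L, null}); // doc 1 has no title norm
    NormSource body = fromArray(new Long[] {10L, 7L});
    float[] weights = {2f, 1f};
    System.out.println(combinedNorm(List.of(title, body), weights, 0));
    System.out.println(combinedNorm(List.of(title, body), weights, 1));
  }
}
```

The point of the sketch is the guard around `longValue()`: code that assumes every field has a norm for every document would read a value that does not exist.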

[jira] [Comment Edited] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

2021-06-15 Thread kkewwei (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363529#comment-17363529 ] kkewwei edited comment on LUCENE-10004 at 6/15/21, 10:16 AM:

[jira] [Commented] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

2021-06-15 Thread kkewwei (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363529#comment-17363529 ] kkewwei commented on LUCENE-10004: -- There seems to be no need to guarantee the order of stor

[GitHub] [lucene] kkewwei commented on pull request #183: LUCENE-9998: delete useless param fis in StoredFieldsWriter.finish() and TermVectorsWriter.finish()

2021-06-15 Thread GitBox
kkewwei commented on pull request #183: URL: https://github.com/apache/lucene/pull/183#issuecomment-861367299 Of course. Can you tell me where I can add the note? I can't find the place.

[GitHub] [lucene] uschindler commented on pull request #177: Initial rewrite of MMapDirectory for JDK-17 preview (incubating) Panama APIs (>= JDK-17-ea-b25)

2021-06-15 Thread GitBox
uschindler commented on pull request #177: URL: https://github.com/apache/lucene/pull/177#issuecomment-861362120 I opened https://bugs.openjdk.java.net/browse/JDK-8268743 about the object allocations.

[jira] [Commented] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

2021-06-15 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363483#comment-17363483 ] Adrien Grand commented on LUCENE-10004: --- I believe that it's actually important t

[jira] [Commented] (LUCENE-10001) Make CollectionTerminatedException handling in MultiCollector configurable

2021-06-15 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17363477#comment-17363477 ] Adrien Grand commented on LUCENE-10001: --- Can you help me understand the use-case

[GitHub] [lucene] jpountz edited a comment on pull request #184: LUCENE-9996: Reduce RAM usage of DWPT for a single document.

2021-06-15 Thread GitBox
jpountz edited a comment on pull request #184: URL: https://github.com/apache/lucene/pull/184#issuecomment-861287012 On this simple test, memory usage for a single doc in the DWPT goes from 9.6MB to 1.4MB. ```java import org.apache.lucene.document.Document; import org.apache.lu

[GitHub] [lucene] jpountz commented on pull request #184: LUCENE-9996: Reduce RAM usage of DWPT for a single document.

2021-06-15 Thread GitBox
jpountz commented on pull request #184: URL: https://github.com/apache/lucene/pull/184#issuecomment-861287012 On this very simple test, memory usage for a single doc in the DWPT goes from 9.6MB to 1.4MB. ```java import org.apache.lucene.document.Document; import org.apache.luce

[GitHub] [lucene] jpountz opened a new pull request #184: LUCENE-9996: Reduce RAM usage of DWPT for a single document.

2021-06-15 Thread GitBox
jpountz opened a new pull request #184: URL: https://github.com/apache/lucene/pull/184 With this change, doc-value terms dictionaries use a shared `ByteBlockPool` across all fields, and points, binary doc values and doc-value ordinals use slightly smaller page sizes.
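
As a rough illustration of why sharing one block pool across fields can lower memory usage when each field holds only a few bytes, the sketch below compares the allocated footprint of per-field pools with that of a single shared pool. It is a simplified model, not Lucene code: `BlockPoolFootprint` and its methods are hypothetical, the 32 KB block size is an assumption, and the model ignores details such as values not spanning block boundaries.

```java
/** Simplified model of fixed-size block allocation; not a Lucene class. */
final class BlockPoolFootprint {

  /** Number of fixed-size blocks needed to hold the given number of bytes. */
  static long blocksFor(long bytes, int blockSize) {
    return (bytes + blockSize - 1) / blockSize;
  }

  /** Allocated bytes when every field owns its own pool. */
  static long perFieldPools(int[] bytesPerField, int blockSize) {
    long allocated = 0;
    for (int bytes : bytesPerField) {
      allocated += blocksFor(bytes, blockSize) * blockSize;
    }
    return allocated;
  }

  /** Allocated bytes when all fields append to one shared pool. */
  static long sharedPool(int[] bytesPerField, int blockSize) {
    long total = 0;
    for (int bytes : bytesPerField) {
      total += bytes;
    }
    return blocksFor(total, blockSize) * blockSize;
  }

  public static void main(String[] args) {
    int blockSize = 1 << 15; // assumed 32 KB block size
    int[] bytesPerField = new int[100];
    java.util.Arrays.fill(bytesPerField, 10); // 100 fields, 10 bytes each
    System.out.println(perFieldPools(bytesPerField, blockSize));
    System.out.println(sharedPool(bytesPerField, blockSize));
  }
}
```

Under these assumptions, each per-field pool pays for a whole block regardless of how little it stores, while the shared pool pays for a block only once the combined data fills it.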

[GitHub] [lucene] kkewwei opened a new pull request #183: LUCENE-9998: delete useless param fis in StoredFieldsWriter.finish() and TermVectorsWriter.finish()

2021-06-15 Thread GitBox
kkewwei opened a new pull request #183: URL: https://github.com/apache/lucene/pull/183 # Description The parameter `fis` in StoredFieldsWriter.finish() and TermVectorsWriter.finish() is useless, and deleting it would help simplify the API. # Solution Deleting it from StoredFiel

[GitHub] [lucene] uschindler commented on pull request #177: Initial rewrite of MMapDirectory for JDK-17 preview (incubating) Panama APIs (>= JDK-17-ea-b25)

2021-06-15 Thread GitBox
uschindler commented on pull request #177: URL: https://github.com/apache/lucene/pull/177#issuecomment-861265227 I also ran the same with tiered compilation turned on and no `-Xbatch` (java defaults). The results are much better, but still the heap allocations are done. For long-runn

[jira] [Updated] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

2021-06-15 Thread kkewwei (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kkewwei updated LUCENE-10004: - Description: In CompressingStoredFieldsWriter.merge(): if the segment meets the following conditions: {