[GitHub] [lucene] iverase merged pull request #478: LUCENE-10264: Clone index input when creating a PointTree in SimpleTextBKDReader

2021-11-29 Thread GitBox


iverase merged pull request #478:
URL: https://github.com/apache/lucene/pull/478


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9820) Separate logic for reading the BKD index from logic to intersecting it.

2021-11-29 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450249#comment-17450249
 ] 

ASF subversion and git services commented on LUCENE-9820:
-

Commit 634c22c527ef72b1d400bb8284cff6b9971766c1 in lucene's branch 
refs/heads/main from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=634c22c ]

LUCENE-10264: Clone index input when creating a PointTree in 
SimpleTextBKDReader (#478)

Fixes a race condition introduced in LUCENE-9820.

> Separate logic for reading the BKD index from logic to intersecting it.
> ---
>
> Key: LUCENE-9820
> URL: https://issues.apache.org/jira/browse/LUCENE-9820
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Ignacio Vera
>Assignee: Ignacio Vera
>Priority: Major
> Fix For: 9.1
>
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> Currently the class BKDReader contains all the logic for traversing the KD 
> tree as well as the logic for reading the actual index. This makes it difficult 
> to develop new visiting strategies, for example LUCENE-9619, which proposes 
> moving Points from a visitor API to a cursor-style API.
> The first step is to isolate the logic that reads the index from the logic that 
> visits the tree. Another benefit is that it will help with evolving the index, 
> for example moving the current index format to a backwards codec without 
> moving the visiting logic.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)




[jira] [Commented] (LUCENE-10264) Test errors in SimpleTextBKDReader

2021-11-29 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450248#comment-17450248
 ] 

ASF subversion and git services commented on LUCENE-10264:
--

Commit 634c22c527ef72b1d400bb8284cff6b9971766c1 in lucene's branch 
refs/heads/main from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=634c22c ]

LUCENE-10264: Clone index input when creating a PointTree in 
SimpleTextBKDReader (#478)

Fixes a race condition introduced in LUCENE-9820.
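
For readers unfamiliar with this kind of race, here is a minimal sketch of why a shared, stateful input needs to be cloned per tree. ToyInput and ToyPointTree are hypothetical stand-ins invented for illustration; they are not the SimpleTextBKDReader code and not the actual change in #478. The only assumption carried over from Lucene is that an index input keeps a mutable read position and offers clone() to obtain a copy with its own position.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a stateful, non-thread-safe input; not a Lucene class. */
final class ToyInput implements Cloneable {
  private final byte[] data; // immutable backing bytes, safe to share
  private int position; // mutable read cursor, NOT safe to share

  ToyInput(byte[] data) {
    this.data = data;
  }

  void seek(int pos) {
    position = pos;
  }

  byte readByte() {
    return data[position++];
  }

  @Override
  public ToyInput clone() {
    // Share the bytes, but give the copy its own cursor.
    ToyInput copy = new ToyInput(data);
    copy.position = position;
    return copy;
  }
}

/** Hypothetical tree that reads through a private clone of the shared input. */
final class ToyPointTree {
  private final ToyInput in;

  ToyPointTree(ToyInput shared) {
    // Without clone(), two trees would move each other's cursor mid-read.
    this.in = shared.clone();
  }

  String readAt(int offset, int length) {
    in.seek(offset);
    byte[] out = new byte[length];
    for (int i = 0; i < length; i++) {
      out[i] = in.readByte();
    }
    return new String(out, StandardCharsets.US_ASCII);
  }
}

final class CloneDemo {
  public static void main(String[] args) throws InterruptedException {
    ToyInput shared = new ToyInput("0123456789".getBytes(StandardCharsets.US_ASCII));
    ToyPointTree search = new ToyPointTree(shared);
    ToyPointTree merge = new ToyPointTree(shared);
    // One thread plays the background merge, the main thread plays the search.
    Thread t = new Thread(() -> System.out.println("merge:  " + merge.readAt(5, 5)));
    t.start();
    System.out.println("search: " + search.readAt(0, 5));
    t.join();
  }
}
```

Because each tree owns its cursor, the two reads cannot corrupt each other regardless of how the threads interleave. If both trees held the same ToyInput, a seek() from one thread could land between the seek() and readByte() calls of the other.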

> Test errors in SimpleTextBKDReader
> --
>
> Key: LUCENE-10264
> URL: https://issues.apache.org/jira/browse/LUCENE-10264
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ignacio Vera
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I noticed a couple of errors in CI involving SimpleTextBKDReader that were 
> introduced by LUCENE-9820. I had a look, and the problem is that we are not 
> cloning the index input when creating a PointTree. Therefore, if two threads 
> access the same PointValues instance (e.g. a search request and a background 
> merge), they interfere with each other's reads.
> Reproduce with:
>  
> {noformat}
> ./gradlew test --tests TestSimpleTextPointsFormat.testWithExceptions 
> -Dtests.seed=56F6BF03D7871A6D -Dtests.multiplier=3 -Dtests.slow=true 
> -Dtests.locale=ses -Dtests.timezone=Asia/Ho_Chi_Minh -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
> {noformat}
>  
> Error:
> {noformat}
> Stack Trace:
> java.lang.AssertionError
>         at 
> __randomizedtesting.SeedInfo.seed([56F6BF03D7871A6D:F4A5237F58095597]:0)
>         at 
> org.apache.lucene.codecs.simpletext.SimpleTextBKDReader$SimpleTextPointTree.parseInt(SimpleTextBKDReader.java:387)
>         at 
> org.apache.lucene.codecs.simpletext.SimpleTextBKDReader$SimpleTextPointTree.readDocIDs(SimpleTextBKDReader.java:374)
>         at 
> org.apache.lucene.codecs.simpletext.SimpleTextBKDReader$SimpleTextPointTree.visitDocValues(SimpleTextBKDReader.java:345)
>         at 
> org.apache.lucene.codecs.simpletext.SimpleTextBKDReader$SimpleTextPointTree.visitDocValues(SimpleTextBKDReader.java:362)
>         at 
> org.apache.lucene.codecs.PointsWriter$1$1$1.visitDocValues(PointsWriter.java:142)
>         at 
> org.apache.lucene.codecs.simpletext.SimpleTextPointsWriter.writeField(SimpleTextPointsWriter.java:95)
>         at 
> org.apache.lucene.codecs.PointsWriter.mergeOneField(PointsWriter.java:57)
>         at org.apache.lucene.codecs.PointsWriter.merge(PointsWriter.java:231)
>         at 
> org.apache.lucene.index.SegmentMerger.mergePoints(SegmentMerger.java:184)
>         at 
> org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:291)
>         at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:144)
>         at 
> org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:3190)
>         at 
> org.apache.lucene.index.RandomIndexWriter.addIndexes(RandomIndexWriter.java:320)
>         at 
> org.apache.lucene.index.BasePointsFormatTestCase.switchIndex(BasePointsFormatTestCase.java:1118)
>         at 
> org.apache.lucene.index.BasePointsFormatTestCase.verify(BasePointsFormatTestCase.java:779)
>         at 
> org.apache.lucene.index.BasePointsFormatTestCase.testWithExceptions(BasePointsFormatTestCase.java:247)
>         at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>         at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1754)
>         at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:942)
>         at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:978)
>         at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:992)
>         at 
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:44)
>         at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>         at 
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
>         at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
>         at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
>         at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>         at 
> com.carrotsearch.randomiz

[jira] [Commented] (LUCENE-9820) Separate logic for reading the BKD index from logic to intersecting it.

2021-11-29 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450251#comment-17450251
 ] 

ASF subversion and git services commented on LUCENE-9820:
-

Commit 62084d7138808887783199b4256fc3eee794355e in lucene's branch 
refs/heads/branch_9x from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=62084d7 ]

LUCENE-10264: Clone index input when creating a PointTree in 
SimpleTextBKDReader (#478)

Fixes a race condition introduced in LUCENE-9820.







[jira] [Commented] (LUCENE-10264) Test errors in SimpleTextBKDReader

2021-11-29 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450250#comment-17450250
 ] 

ASF subversion and git services commented on LUCENE-10264:
--

Commit 62084d7138808887783199b4256fc3eee794355e in lucene's branch 
refs/heads/branch_9x from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=62084d7 ]

LUCENE-10264: Clone index input when creating a PointTree in 
SimpleTextBKDReader (#478)

Fixes a race condition introduced in LUCENE-9820.


[jira] [Commented] (LUCENE-10267) Gradle does not write module version attribute for modules with zero dependencies

2021-11-29 Thread Jerome Prinet (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450257#comment-17450257
 ] 

Jerome Prinet commented on LUCENE-10267:


Hi Dawid, and thanks for raising that!
I'll take it up internally and keep you posted.




> Gradle does not write module version attribute for modules with zero 
> dependencies
> -
>
> Key: LUCENE-10267
> URL: https://issues.apache.org/jira/browse/LUCENE-10267
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Dawid Weiss
>Priority: Minor
> Attachments: mod-version-repro.zip
>
>







[jira] [Commented] (LUCENE-10267) Gradle does not write module version attribute for modules with zero dependencies

2021-11-29 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450258#comment-17450258
 ] 

Dawid Weiss commented on LUCENE-10267:
--

Thanks [~JeromeP]!

> Gradle does not write module version attribute for modules with zero 
> dependencies
> -
>
> Key: LUCENE-10267
> URL: https://issues.apache.org/jira/browse/LUCENE-10267
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Dawid Weiss
>Priority: Minor
> Attachments: mod-version-repro.zip
>
>







[jira] [Resolved] (LUCENE-10264) Test errors in SimpleTextBKDReader

2021-11-29 Thread Ignacio Vera (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera resolved LUCENE-10264.
---
  Assignee: Ignacio Vera
Resolution: Fixed

I haven't added an entry in CHANGES.txt as it is an unreleased bug.

> Test errors in SimpleTextBKDReader
> --
>
> Key: LUCENE-10264
> URL: https://issues.apache.org/jira/browse/LUCENE-10264
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ignacio Vera
>Assignee: Ignacio Vera
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I noticed a couple of errors in CI involving SimpleTextBKDReader that were 
> introduced by LUCENE-9820. I had a look, and the problem is that we are not 
> cloning the index input when creating a PointTree. Therefore, if two threads 
> access the same PointValues instance (e.g. a search request and a background 
> merge), they interfere with each other's reads.
> Reproduce with:
>  
> {noformat}
> ./gradlew test --tests TestSimpleTextPointsFormat.testWithExceptions 
> -Dtests.seed=56F6BF03D7871A6D -Dtests.multiplier=3 -Dtests.slow=true 
> -Dtests.locale=ses -Dtests.timezone=Asia/Ho_Chi_Minh -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
> {noformat}
>  
> Error:
> {noformat}
> Stack Trace:
> java.lang.AssertionError
>         at 
> __randomizedtesting.SeedInfo.seed([56F6BF03D7871A6D:F4A5237F58095597]:0)
>         at 
> org.apache.lucene.codecs.simpletext.SimpleTextBKDReader$SimpleTextPointTree.parseInt(SimpleTextBKDReader.java:387)
>         at 
> org.apache.lucene.codecs.simpletext.SimpleTextBKDReader$SimpleTextPointTree.readDocIDs(SimpleTextBKDReader.java:374)
>         at 
> org.apache.lucene.codecs.simpletext.SimpleTextBKDReader$SimpleTextPointTree.visitDocValues(SimpleTextBKDReader.java:345)
>         at 
> org.apache.lucene.codecs.simpletext.SimpleTextBKDReader$SimpleTextPointTree.visitDocValues(SimpleTextBKDReader.java:362)
>         at 
> org.apache.lucene.codecs.PointsWriter$1$1$1.visitDocValues(PointsWriter.java:142)
>         at 
> org.apache.lucene.codecs.simpletext.SimpleTextPointsWriter.writeField(SimpleTextPointsWriter.java:95)
>         at 
> org.apache.lucene.codecs.PointsWriter.mergeOneField(PointsWriter.java:57)
>         at org.apache.lucene.codecs.PointsWriter.merge(PointsWriter.java:231)
>         at 
> org.apache.lucene.index.SegmentMerger.mergePoints(SegmentMerger.java:184)
>         at 
> org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:291)
>         at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:144)
>         at 
> org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:3190)
>         at 
> org.apache.lucene.index.RandomIndexWriter.addIndexes(RandomIndexWriter.java:320)
>         at 
> org.apache.lucene.index.BasePointsFormatTestCase.switchIndex(BasePointsFormatTestCase.java:1118)
>         at 
> org.apache.lucene.index.BasePointsFormatTestCase.verify(BasePointsFormatTestCase.java:779)
>         at 
> org.apache.lucene.index.BasePointsFormatTestCase.testWithExceptions(BasePointsFormatTestCase.java:247)
>         at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>         at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1754)
>         at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:942)
>         at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:978)
>         at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:992)
>         at 
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:44)
>         at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
>         at 
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
>         at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
>         at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
>         at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>         at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>         at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:370)
>         at 
> com.carrotsearch.randomizedtesting.ThreadLe

[jira] [Commented] (LUCENE-9619) Move Points from a visitor API to a cursor-style API?

2021-11-29 Thread Ignacio Vera (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450277#comment-17450277
 ] 

Ignacio Vera commented on LUCENE-9619:
--

In LUCENE-9820 we took the first step in moving the API, but the methods 
#visitDocIDs and #visitDocValues still take an IntersectVisitor as input. Here 
I propose introducing two functional interfaces, {{DocIdsVisitor}} and 
{{DocValuesVisitor}}, as the input for those methods, so the API would look like:

 
{code:java}
/**
 * Basic operations to read the KD-tree.
 *
 * @lucene.experimental
 */
public interface PointTree extends Cloneable {

  /** Clone, the current node becomes the root of the new tree. */
  PointTree clone();

  /**
   * Move to the first child node and return {@code true} upon success. Returns {@code false} for
   * leaf nodes and {@code true} otherwise.
   */
  boolean moveToChild() throws IOException;

  /**
   * Move to the next sibling node and return {@code true} upon success. Returns {@code false} if
   * the current node has no more siblings.
   */
  boolean moveToSibling() throws IOException;

  /**
   * Move to the parent node and return {@code true} upon success. Returns {@code false} for the
   * root node and {@code true} otherwise.
   */
  boolean moveToParent() throws IOException;

  /** Return the minimum packed value of the current node. */
  byte[] getMinPackedValue();

  /** Return the maximum packed value of the current node. */
  byte[] getMaxPackedValue();

  /** Return the number of points below the current node. */
  long size();

  /** Visit all the docs below the current node. */
  void visitDocIDs(DocIdsVisitor docIdsVisitor) throws IOException;

  /** Visit all the docs and values below the current node. */
  default void visitDocValues(DocValuesVisitor docValuesVisitor) throws IOException {
    visitDocValues((min, max) -> Relation.CELL_CROSSES_QUERY, docID -> {}, docValuesVisitor);
  }

  /**
   * Similar to {@link #visitDocValues(DocValuesVisitor)} but in this case it allows adding a
   * filter that works like {@link IntersectVisitor#compare(byte[], byte[])}.
   */
  void visitDocValues(
      BiFunction<byte[], byte[], Relation> compare,
      DocIdsVisitor docIdsVisitor,
      DocValuesVisitor docValuesVisitor)
      throws IOException;
}

/**
 * Collects all documents below a tree node by calling {@link
 * PointTree#visitDocIDs(DocIdsVisitor)}
 */
@FunctionalInterface
public interface DocIdsVisitor {
  /** Called for all documents below a tree node. */
  void visit(int docID) throws IOException;
}

/**
 * Collects all documents and values below a tree node by calling {@link
 * PointTree#visitDocValues(DocValuesVisitor)}
 */
@FunctionalInterface
public interface DocValuesVisitor {
  /** Called for all documents and values below a tree node. */
  void visit(int docID, byte[] packedValue) throws IOException;

  /**
   * Similar to {@link DocValuesVisitor#visit(int, byte[])} but in this case the packedValue can
   * have more than one docID associated to it. The provided iterator should not escape the scope
   * of this method so that implementations of PointValues are free to reuse it.
   */
  default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID, packedValue);
    }
  }
}

/**
 * We recurse the {@link PointTree}, using a provided instance of this to guide the recursion.
 *
 * @lucene.experimental
 */
public interface IntersectVisitor extends DocValuesVisitor, DocIdsVisitor {

  /**
   * Called for non-leaf cells to test how the cell relates to the query, to determine how to
   * further recurse down the tree.
   *
   * <ul>
   *   <li>{@link Relation#CELL_OUTSIDE_QUERY}: Stop recursing down the current branch of the
   *       tree.
   *   <li>{@link Relation#CELL_INSIDE_QUERY}: All nodes below the current node are visited using
   *       the underlying {@link DocIdsVisitor}. The consumer should generally blindly accept the
   *       docID.
   *   <li>{@link Relation#CELL_CROSSES_QUERY}: Keep recursing down the current branch of the
   *       tree. If the current node is a leaf, visit all docs and values using the underlying
   *       {@link DocValuesVisitor}. The consumer should scrutinize the packedValue to decide
   *       whether to accept it.
   * </ul>
   */
  Relation compare(byte[] minPackedValue, byte[] maxPackedValue);

  /** Notifies the caller that this many documents are about to be visited */
  default void grow(int count) {}
} {code}
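
As a rough illustration of what the functional interfaces buy callers, the sketch below passes plain lambdas where an IntersectVisitor implementation would otherwise be needed. The two interfaces are simplified copies of the ones proposed above; ToyLeaf and VisitorDemo are hypothetical helpers made up for this example and stand in for a single leaf of a PointTree.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Simplified copy of the interface proposed above. */
@FunctionalInterface
interface DocIdsVisitor {
  void visit(int docID) throws IOException;
}

/** Simplified copy of the interface proposed above (iterator overload omitted). */
@FunctionalInterface
interface DocValuesVisitor {
  void visit(int docID, byte[] packedValue) throws IOException;
}

/** Hypothetical single-leaf stand-in for a PointTree node; not a Lucene class. */
final class ToyLeaf {
  private final int[] docIDs;
  private final byte[][] packedValues; // parallel to docIDs

  ToyLeaf(int[] docIDs, byte[][] packedValues) {
    this.docIDs = docIDs;
    this.packedValues = packedValues;
  }

  void visitDocIDs(DocIdsVisitor visitor) throws IOException {
    for (int docID : docIDs) {
      visitor.visit(docID);
    }
  }

  void visitDocValues(DocValuesVisitor visitor) throws IOException {
    for (int i = 0; i < docIDs.length; i++) {
      visitor.visit(docIDs[i], packedValues[i]);
    }
  }
}

final class VisitorDemo {
  /** Collects every docID in the leaf; the visitor is a one-line lambda. */
  static List<Integer> collectDocIDs(ToyLeaf leaf) throws IOException {
    List<Integer> docs = new ArrayList<>();
    leaf.visitDocIDs(docID -> docs.add(docID));
    return docs;
  }

  /** Collects docIDs whose one-byte unsigned value is at least {@code min}. */
  static List<Integer> collectAtLeast(ToyLeaf leaf, int min) throws IOException {
    List<Integer> docs = new ArrayList<>();
    leaf.visitDocValues(
        (docID, packedValue) -> {
          if ((packedValue[0] & 0xFF) >= min) {
            docs.add(docID);
          }
        });
    return docs;
  }

  public static void main(String[] args) throws IOException {
    ToyLeaf leaf = new ToyLeaf(new int[] {3, 7, 9}, new byte[][] {{1}, {5}, {9}});
    System.out.println(collectDocIDs(leaf));
    System.out.println(collectAtLeast(leaf, 5));
  }
}
```

A caller that only needs doc IDs no longer has to supply a compare() implementation it does not care about.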
 

Any thoughts?

> Move Points from a visitor API to a cursor-style API?
> -
>
> Key: LUCENE-9619
> URL: https://issues.apache.org/jira/browse/LUCENE-9619
> Project: Lucene - Core
>  Issue Type: 

[jira] [Created] (LUCENE-10269) Add the ability to read KD trees from right to left

2021-11-29 Thread Ignacio Vera (Jira)
Ignacio Vera created LUCENE-10269:
-

 Summary: Add the ability to read KD trees from right to left
 Key: LUCENE-10269
 URL: https://issues.apache.org/jira/browse/LUCENE-10269
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Ignacio Vera


In LUCENE-9820 we exposed a programmatic API to navigate Lucene KD-trees. It is 
currently only possible to navigate those trees from left to right via the 
methods #moveToChild and #moveToSibling.

 

In LUCENE-10262 we improve the KD-tree so that it no longer has to be read 
strictly forward. This opens the possibility of introducing an API to read the 
tree from right to left. That would allow, for example, getting the maximum 
value for a dimension stored in a KD-tree that contains deleted documents.

 

The idea will be something like:

 

 
{code:java}
/**
 * Move to the first child node and return {@code true} upon success. Returns {@code false} for
 * leaf nodes and {@code true} otherwise.
 */
boolean moveToFirstChild() throws IOException;

/**
 * Move to the next sibling node and return {@code true} upon success. Returns {@code false} if
 * the current node is the last child.
 */
boolean moveToNextSibling() throws IOException;

/**
 * Move to the last child node and return {@code true} upon success. Returns {@code false} for
 * leaf nodes and {@code true} otherwise.
 */
boolean moveToLastChild() throws IOException;

/**
 * Move to the previous sibling node and return {@code true} upon success. Returns {@code false}
 * if the current node is the first child.
 */
boolean moveToPreviousSibling() throws IOException;

{code}
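
To make the "maximum value with deleted documents" use case concrete, here is a small sketch of a right-to-left walk. Everything in it is hypothetical: ToyNode and ToyCursor are in-memory stand-ins, the tree is one-dimensional, and leaves are assumed to hold values sorted in ascending order from left to right. Only the method names moveToLastChild and moveToPreviousSibling come from the proposal above.

```java
import java.util.Set;

/** Hypothetical in-memory node; a real KD-tree is read from the index. */
final class ToyNode {
  final ToyNode[] children; // empty for leaves
  final int[] docIDs; // leaf only, parallel to values
  final int[] values; // leaf only, sorted ascending
  ToyNode parent;
  int indexInParent;

  private ToyNode(ToyNode[] children, int[] docIDs, int[] values) {
    this.children = children;
    this.docIDs = docIDs;
    this.values = values;
  }

  static ToyNode leaf(int[] docIDs, int[] values) {
    return new ToyNode(new ToyNode[0], docIDs, values);
  }

  static ToyNode inner(ToyNode... children) {
    ToyNode node = new ToyNode(children, new int[0], new int[0]);
    for (int i = 0; i < children.length; i++) {
      children[i].parent = node;
      children[i].indexInParent = i;
    }
    return node;
  }
}

/** Hypothetical cursor exposing the right-to-left moves proposed above. */
final class ToyCursor {
  private ToyNode current;

  ToyCursor(ToyNode root) {
    current = root;
  }

  ToyNode node() {
    return current;
  }

  boolean moveToLastChild() {
    if (current.children.length == 0) {
      return false;
    }
    current = current.children[current.children.length - 1];
    return true;
  }

  boolean moveToPreviousSibling() {
    if (current.parent == null || current.indexInParent == 0) {
      return false;
    }
    current = current.parent.children[current.indexInParent - 1];
    return true;
  }

  boolean moveToParent() {
    if (current.parent == null) {
      return false;
    }
    current = current.parent;
    return true;
  }
}

final class RightToLeftDemo {
  /** Returns the largest value whose document is live, or null if all are deleted. */
  static Integer maxLiveValue(ToyCursor cursor, Set<Integer> deletedDocs) {
    ToyNode node = cursor.node();
    if (node.children.length == 0) {
      // Leaf: values are sorted, so scan from the largest down.
      for (int i = node.values.length - 1; i >= 0; i--) {
        if (!deletedDocs.contains(node.docIDs[i])) {
          return node.values[i];
        }
      }
      return null;
    }
    cursor.moveToLastChild();
    do {
      Integer max = maxLiveValue(cursor, deletedDocs);
      if (max != null) {
        return max; // the first hit from the right is the maximum
      }
    } while (cursor.moveToPreviousSibling());
    cursor.moveToParent(); // restore the cursor before reporting "nothing here"
    return null;
  }
}
```

The walk stops at the first live document it meets, so when few documents are deleted it touches only the rightmost leaf rather than the whole tree.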
 

 






[jira] [Commented] (LUCENE-10266) Move nearest-neighbor search on points to core?

2021-11-29 Thread Ignacio Vera (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450296#comment-17450296
 ] 

Ignacio Vera commented on LUCENE-10266:
---

+1

> Move nearest-neighbor search on points to core?
> ---
>
> Key: LUCENE-10266
> URL: https://issues.apache.org/jira/browse/LUCENE-10266
> Project: Lucene - Core
>  Issue Type: Task
>Reporter: Adrien Grand
>Priority: Minor
>
> Now that the Points' public API supports running nearest-neighbor search, 
> should we move it to core via helper methods on {{LatLonPoint}} and 
> {{XYPoint}}?





[GitHub] [lucene] rmuir opened a new pull request #485: LUCENE-10010: don't determinize in CompiledAutomaton/RunAutomaton

2021-11-29 Thread GitBox


rmuir opened a new pull request #485:
URL: https://github.com/apache/lucene/pull/485


   Instead, require that the incoming automaton be determinized by the caller, 
throwing an exception if it isn't.
   
   This paves the way for NFA execution in the future: if you pass an NFA to 
AutomatonQuery, we should use the NFA algorithm on it. No need for lots of 
booleans or enums.
   
   The idea is that we clean this one up and fold this into the main 
LUCENE-10010 PR, to keep the APIs simple. But we could also merge it 
independently first.
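
A minimal sketch of the "caller must determinize" contract, using toy classes rather than Lucene's own. ToyAutomaton and ToyRunAutomaton are hypothetical names, and the choice of IllegalArgumentException is an assumption; the description above only says that an exception is thrown.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical toy automaton over chars; not Lucene's Automaton class. */
final class ToyAutomaton {
  private final Set<String> sourceAndLabel = new HashSet<>();
  private boolean deterministic = true;

  /** Records a transition; the target state is not needed for the determinism check. */
  void addTransition(int from, char label, int to) {
    // A second transition leaving the same state on the same label makes this an NFA.
    if (!sourceAndLabel.add(from + ":" + label)) {
      deterministic = false;
    }
  }

  boolean isDeterministic() {
    return deterministic;
  }
}

/** Hypothetical runner that rejects an NFA instead of silently determinizing it. */
final class ToyRunAutomaton {
  final ToyAutomaton automaton;

  ToyRunAutomaton(ToyAutomaton automaton) {
    if (!automaton.isDeterministic()) {
      // Assumed exception type; the cost of determinizing is pushed to the caller.
      throw new IllegalArgumentException(
          "automaton must be deterministic; determinize it before passing it in");
    }
    this.automaton = automaton;
  }
}

final class DeterminizeContractDemo {
  public static void main(String[] args) {
    ToyAutomaton nfa = new ToyAutomaton();
    nfa.addTransition(0, 'a', 0);
    nfa.addTransition(0, 'a', 1); // same source and label: non-deterministic
    try {
      new ToyRunAutomaton(nfa);
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Failing fast keeps a potentially expensive determinization step visible at the call site instead of hiding it inside the runner's constructor.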
   





[GitHub] [lucene] rmuir commented on pull request #225: LUCENE-10010 Introduce NFARunAutomaton to run NFA directly

2021-11-29 Thread GitBox


rmuir commented on pull request #225:
URL: https://github.com/apache/lucene/pull/225#issuecomment-981447584


   I made a quick prototype of what I mean for the API: 
https://github.com/apache/lucene/pull/485
   
   The idea is that AutomatonQuery shouldn't be determinizing. Let's push this 
to the caller. If they pass it a DFA, it uses the DFA algorithm. If they pass it 
an NFA, it can use the NFA algorithm. (In my branch it currently throws an 
exception instead of slowly determinizing; that is the change.)
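
For context on what "the NFA algorithm" can mean, the sketch below runs a non-deterministic automaton directly by tracking the set of states it could be in, with no determinization step. ToyNfa is a hypothetical class written for this example; it has no epsilon transitions and always starts in state 0, and it is not the NFARunAutomaton from PR #225.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical toy NFA over chars, start state 0, no epsilon transitions. */
final class ToyNfa {
  // state -> label -> set of possible target states
  private final Map<Integer, Map<Character, Set<Integer>>> transitions = new HashMap<>();
  private final Set<Integer> acceptStates = new HashSet<>();

  void addTransition(int from, char label, int to) {
    transitions
        .computeIfAbsent(from, k -> new HashMap<>())
        .computeIfAbsent(label, k -> new HashSet<>())
        .add(to);
  }

  void setAccept(int state) {
    acceptStates.add(state);
  }

  /** Runs the input by advancing the whole set of currently possible states. */
  boolean run(String input) {
    Set<Integer> current = new HashSet<>();
    current.add(0);
    for (char c : input.toCharArray()) {
      Set<Integer> next = new HashSet<>();
      for (int state : current) {
        Map<Character, Set<Integer>> byLabel = transitions.get(state);
        if (byLabel == null) {
          continue;
        }
        Set<Integer> targets = byLabel.get(c);
        if (targets != null) {
          next.addAll(targets);
        }
      }
      if (next.isEmpty()) {
        return false; // no state can consume this character
      }
      current = next;
    }
    for (int state : current) {
      if (acceptStates.contains(state)) {
        return true;
      }
    }
    return false;
  }
}

final class NfaDemo {
  /** Builds an NFA accepting strings over {a, b} that end in "ab". */
  static ToyNfa endsWithAb() {
    ToyNfa nfa = new ToyNfa();
    nfa.addTransition(0, 'a', 0);
    nfa.addTransition(0, 'b', 0);
    nfa.addTransition(0, 'a', 1); // non-deterministic choice on 'a'
    nfa.addTransition(1, 'b', 2);
    nfa.setAccept(2);
    return nfa;
  }

  public static void main(String[] args) {
    System.out.println(endsWithAb().run("aab"));
    System.out.println(endsWithAb().run("aba"));
  }
}
```

Each input character costs work proportional to the number of live states, which is the usual trade-off against paying the determinization cost once up front.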





[GitHub] [lucene] iverase merged pull request #428: LUCENE-9538: Detect polygon self-intersections in the Tessellator

2021-11-29 Thread GitBox


iverase merged pull request #428:
URL: https://github.com/apache/lucene/pull/428


   





[jira] [Commented] (LUCENE-9538) Tessellator should provide a better error message for self-intersecting shapes

2021-11-29 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450310#comment-17450310
 ] 

ASF subversion and git services commented on LUCENE-9538:
-

Commit 78c8d7b7ea6aca2202c5eeffcc19e837279721c6 in lucene's branch 
refs/heads/main from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=78c8d7b ]

LUCENE-9538: Detect polygon self-intersections in the Tessellator (#428)

Detect self-intersections so it can provide a more meaningful error to the 
users.
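
To illustrate what "detecting the self-intersection position" can mean, here is a brute-force sketch that checks every pair of non-adjacent edges of a closed ring and returns the first crossing point. `SelfIntersectionCheck` is a hypothetical class, it ignores collinear overlaps, and it is not the algorithm used by the Tessellator in this commit.

```java
/** Hypothetical O(n^2) self-intersection check for a closed polygon ring. */
final class SelfIntersectionCheck {

  /**
   * The ring is given as vertex arrays without repeating the first vertex at the end.
   *
   * @return {x, y} of the first proper crossing between two non-adjacent edges, or null
   */
  static double[] findCrossing(double[] xs, double[] ys) {
    int n = xs.length;
    for (int i = 0; i < n; i++) {
      int i2 = (i + 1) % n;
      for (int j = i + 2; j < n; j++) {
        int j2 = (j + 1) % n;
        if (j2 == i) {
          continue; // the closing edge shares a vertex with edge i
        }
        double[] p = cross(xs[i], ys[i], xs[i2], ys[i2], xs[j], ys[j], xs[j2], ys[j2]);
        if (p != null) {
          return p;
        }
      }
    }
    return null;
  }

  /** Intersection point of segments a-b and c-d if they properly cross, else null. */
  private static double[] cross(
      double ax, double ay, double bx, double by,
      double cx, double cy, double dx, double dy) {
    double rX = bx - ax, rY = by - ay;
    double sX = dx - cx, sY = dy - cy;
    double denom = rX * sY - rY * sX;
    if (denom == 0) {
      return null; // parallel or collinear segments are ignored in this sketch
    }
    double t = ((cx - ax) * sY - (cy - ay) * sX) / denom;
    double u = ((cx - ax) * rY - (cy - ay) * rX) / denom;
    if (t <= 0 || t >= 1 || u <= 0 || u >= 1) {
      return null; // the lines cross outside the open segments
    }
    return new double[] {ax + t * rX, ay + t * rY};
  }
}
```

The returned coordinates are what an error message could report to the user instead of a generic failure.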

> Tessellator should provide a better error message for self-intersecting shapes
> --
>
> Key: LUCENE-9538
> URL: https://issues.apache.org/jira/browse/LUCENE-9538
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Ignacio Vera
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Self-intersecting shapes cannot be tessellated and currently throw a generic 
> error like:
>  
>  
> {code:java}
>   Unable to Tessellate shape...{code}
>  
> In the case of self-intersecting shapes we can do better and give a more 
> useful message by detecting the position of the self-intersection and providing 
> that information to the user. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9538) Tessellator should provide a better error message for self-intersecting shapes

2021-11-29 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450311#comment-17450311
 ] 

ASF subversion and git services commented on LUCENE-9538:
-

Commit 70243ea81151335183773944607164bb1c2b4ece in lucene's branch 
refs/heads/branch_9x from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=70243ea ]

LUCENE-9538: Detect polygon self-intersections in the Tessellator (#428)

Detect self-intersections so it can provide a more meaningful error to the 
users.




[jira] [Resolved] (LUCENE-9538) Tessellator should provide a better error message for self-intersecting shapes

2021-11-29 Thread Ignacio Vera (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera resolved LUCENE-9538.
--
Fix Version/s: 9.1
 Assignee: Ignacio Vera
   Resolution: Fixed

> Tessellator should provide a better error message for self-intersecting shapes
> --
>
> Key: LUCENE-9538
> URL: https://issues.apache.org/jira/browse/LUCENE-9538
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Ignacio Vera
>Assignee: Ignacio Vera
>Priority: Major
> Fix For: 9.1
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h



[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450327#comment-17450327
 ] 

Uwe Schindler commented on LUCENE-10255:


Hi,
I would like to bring in one more thing to investigate: the current Lucene 
modules as of 9.0 are named without full reverse domain names. We should 
check with other ASF projects whether there is a "standard" for naming modules. 
I don't like that the Maven group:artifact name is totally different from 
the module name. IMHO the Lucene modules should be named with an 
"org.apache.lucene." prefix instead of plain "lucene.X".

The log4j module uses this pattern already, and we should coordinate that. 
Maybe ASF has a standard already. I'd ask on the "Apache Commons" project to 
figure out how they plan to handle it.

Changing the current syntax of the module names is not a problem, because except for 
Luke we don't expose the modules in our documentation.

As said before, I am in favor of naming the modules "groupid.artifactid", 
based on the Maven coordinates (joined with a "." in between).

Thoughts?
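
A small sketch of what deriving a module name from Maven coordinates could look like. The normalization rule here (drop a redundant project prefix from the artifactId and turn hyphens into dots) is an assumption made for illustration, not an agreed convention; `ModuleNames` is a hypothetical helper. The legality check relies on the JDK's `ModuleDescriptor.newModule`, which rejects illegal module names.

```java
import java.lang.module.ModuleDescriptor;

/** Hypothetical helper: maps Maven coordinates to a Java module name. */
final class ModuleNames {

  /**
   * Assumed rule: groupId + "." + artifactId, after stripping a leading "project-" prefix from
   * the artifactId and replacing the remaining hyphens with dots.
   */
  static String fromCoordinates(String groupId, String artifactId) {
    String project = groupId.substring(groupId.lastIndexOf('.') + 1);
    String suffix =
        artifactId.startsWith(project + "-")
            ? artifactId.substring(project.length() + 1)
            : artifactId;
    return groupId + "." + suffix.replace('-', '.');
  }

  /** Returns true if the JDK accepts the name as a legal module name. */
  static boolean isLegalModuleName(String name) {
    try {
      ModuleDescriptor.newModule(name).build();
      return true;
    } catch (IllegalArgumentException e) {
      return false;
    }
  }
}
```

A literal "groupid.artifactid" join would give `org.apache.lucene.lucene-core`, which is not a legal module name because of the hyphen, so some normalization step is needed whichever prefix is chosen.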

> Fully embrace the java module system
> 
>
> Key: LUCENE-10255
> URL: https://issues.apache.org/jira/browse/LUCENE-10255
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Dawid Weiss
>Priority: Major
> Attachments: screenshot-1.png, screenshot-2.png
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> I've experimented a bit trying to move the code to the JMS. It is 
> _surprisingly difficult_... A PoC that almost passes all checks is here:
> https://github.com/dweiss/lucene/tree/jms
> Here are my conclusions so far:
> * The JMS and gradle add a lot of complexity (this applies to any 
> higher-level tooling, including IDEs, I think). For starters, modules have to 
> be JARs. The effect of this is that what was previously a set of directories 
> from dependencies now has to be a JAR. What was previously an incremental 
> update of a single .class file now ripples throughout the build recreating 
> module JARs (ZIPs!)... I didn't realize it at first, but it's a costly thing 
> to do. I'm not even sure how IDEs handle this issue.
> * A Java module contains metadata (such as the module version or main class) 
> that is completely detached from any source file. These things live in a 
> class bytecode of the compiled module-info; interestingly, there is no 
> source-level way to specify it - these class attributes are injected by the 
> 'jar' tool. Gradle has some fancy on-the-fly asm conversion filter that 
> injects it.
> * Dependencies between modules will effectively live in two places: in gradle 
> build files and in module-info files. And they can go out of sync, although 
> it's probably easy to catch (since javac would complain about missing classes 
> during compilation, even if they're in module path).
> * Probably the biggest challenge (not covered in the PoC) is with our custom 
> javadoc and ecj linter tasks - they see the module-info.java and can't cope 
> with it. At the same time, there is no easy way to exclude that one 
> particular file: ecj would have to accept a full set of sources (command 
> argument limit will be a problem), javac can accept a full set of java 
> sources (external file) but then it doesn't copy doc-files properly anymore 
> (this is probably easier to fix). 
> * There are differences at runtime that are hard to anticipate - for example 
> resource lookups via class loader no longer work (I fixed this in Luke).
> After poking a bit and trying it out I have to say I have mixed feelings 
> about moving to the JMS. On the one hand, many things are great - the module 
> path, module descriptors and access modes. On the other hand, the tooling 
> tricks required to make it all work make you shiver.
> If anybody wants to play/ improve things on that experimental branch (I 
> converted Luke to a full module - it works), please be my guest. I have to 
> sit on this and think whether it's something I really like or not.






[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450359#comment-17450359
 ] 

Dawid Weiss commented on LUCENE-10255:
--

I did that intentionally. I hate those long prefixes. They make life much more 
complicated and I don't think there's a risk of running into a conflict with 
anything existing... 

java -m org.apache.lucene.core sounds way less attractive than just java -m 
lucene.core.






[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450361#comment-17450361
 ] 

Dawid Weiss commented on LUCENE-10255:
--

Think of it this way: Java's internal modules don't have the domain prefix either - 
they rely on the uniqueness of the first part (jdk., java.). I think this is 
sufficient. No need to be paranoid.




[jira] [Commented] (LUCENE-10267) Gradle does not write module version attribute for modules with zero dependencies

2021-11-29 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450365#comment-17450365
 ] 

Dawid Weiss commented on LUCENE-10267:
--

I see this is a known issue - it was marked as a duplicate of:
https://github.com/gradle/gradle/issues/17484


> Gradle does not write module version attribute for modules with zero 
> dependencies
> -
>
> Key: LUCENE-10267
> URL: https://issues.apache.org/jira/browse/LUCENE-10267
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Dawid Weiss
>Priority: Minor
> Attachments: mod-version-repro.zip
>
>







[GitHub] [lucene] romseygeek commented on pull request #477: LUCENE-10263: Implement Weight.count() on NormsFieldExistsQuery

2021-11-29 Thread GitBox


romseygeek commented on pull request #477:
URL: https://github.com/apache/lucene/pull/477#issuecomment-981531806


   Have updated; the test is now docCount == maxDoc, which works even in the 
case that we have deleted docs.
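
   A sketch of why the `docCount == maxDoc` test is safe with deletions, assuming (as in Lucene's per-field statistics) that both `docCount` and `maxDoc` include deleted documents while `numDocs` counts only live ones. `FieldExistsCount` is a hypothetical helper, not the code in this PR.

```java
/** Hypothetical helper showing the count shortcut discussed above. */
final class FieldExistsCount {

  /**
   * @param docCount documents in the segment that have the field (deleted docs included)
   * @param maxDoc all documents in the segment (deleted docs included)
   * @param numDocs live documents in the segment
   * @return the exact number of live docs with the field, or -1 if it must be counted by iteration
   */
  static int count(int docCount, int maxDoc, int numDocs) {
    if (docCount == maxDoc) {
      // Every document has the field, so every live document has it too.
      return numDocs;
    }
    return -1;
  }
}
```

   When some documents lack the field, the statistics alone cannot tell how many of the deleted documents had it, so the sketch falls back to -1.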





[jira] [Commented] (LUCENE-10267) Gradle does not write module version attribute for modules with zero dependencies

2021-11-29 Thread Jerome Prinet (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450447#comment-17450447
 ] 

Jerome Prinet commented on LUCENE-10267:


Yes, it was already raised but not scoped yet. Your submission will help to 
give it more weight.

 




[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450472#comment-17450472
 ] 

Uwe Schindler commented on LUCENE-10255:


bq. java -m org.apache.lucene.core sounds way less attractive than just java -m 
lucene.core

This is IMHO no argument for shorter names: If your own project is using the 
module system then you have a module-info.java, too. Then you can start it 
without hassle and won't specify any extra options.

I would ask around if there's a standard already. I would really like to see 
consistent module names. The "java" and "jdk" prefixes are different, because Java never 
had any module names before, but Maven has/had package prefixes. There were 
discussions about this already on the JDK mailing list together with Maven people, 
but I have to find them. I think Maven uses the artifact coordinates also for 
the module name, but I am not 100% sure. Maybe [~rfscholte] has some more 
information on which community standards have evolved.

-1 to use "lucene" as module name prefix, +1 to use "org.apache.lucene" as 
prefix.




[GitHub] [lucene] xaviersanchez commented on pull request #461: LUCENE-10248: Spanish Plural Stemmer

2021-11-29 Thread GitBox


xaviersanchez commented on pull request #461:
URL: https://github.com/apache/lucene/pull/461#issuecomment-981735988


   > Hi @xaviersanchez, this contribution looks great.
   > 
   > I'll do another pass on review and give some time for others to review as 
well.
   > 
   > I did a little investigation at a glance, and I think it is confusing that 
the current `SpanishMinimalStemmer` is doing aggressive conversions such as `ñ 
-> n`. I think, as a followup issue, we should `@deprecate` the 
`SpanishMinimalStemmer` and point users to this one instead?
   > 
   > `SpanishMinimalStemmer` is not a typical "upstream" algorithm, with 
academic papers/study from snowball or savoy, and there doesn't seem to be any 
reason to keep it anymore, except for a legacy index. So we could keep it 
around for another major release or so but not forever, IMO.
   
   Thanks @rmuir for the comment! 
   
   Yes, I agree we could deprecate SpanishMinimalStemmer and point users to 
this implementation, since it can cover the same use cases. We implemented this 
a while ago, so before contributing our code we analyzed the different 
behaviors of the Spanish stemmers to check that we could provide 
some added value. From our analysis we see that SpanishMinimalStemmer has some 
issues and does some quite aggressive text normalization. 
   





[GitHub] [lucene] rmuir commented on pull request #477: LUCENE-10263: Implement Weight.count() on NormsFieldExistsQuery

2021-11-29 Thread GitBox


rmuir commented on pull request #477:
URL: https://github.com/apache/lucene/pull/477#issuecomment-981738439


   Let's fix the CHANGES now that it works with deleted documents.
   
   I'm sad the optimization couldn't work because of a crazy corner case, which 
raises the question: why does the user care about corner cases of norms? 
Shouldn't that be an implementation detail? E.g., should we deprecate this 
`NormsFieldExistsQuery` and create a `TokensExistsQuery` in its place that has both 
this optimization and the docCount-based optimization (when there are no deleted 
docs)? It would be faster, so I'd love to know the use case where the user 
actually cares about low-level stuff like norms.





[GitHub] [lucene] romseygeek commented on pull request #477: LUCENE-10263: Implement Weight.count() on NormsFieldExistsQuery

2021-11-29 Thread GitBox


romseygeek commented on pull request #477:
URL: https://github.com/apache/lucene/pull/477#issuecomment-981755138


   For a `TokensExistsQuery`, is the idea that the query part would work the 
same as norms, we just filter out docs with a norm of 0?





[GitHub] [lucene] rmuir commented on pull request #477: LUCENE-10263: Implement Weight.count() on NormsFieldExistsQuery

2021-11-29 Thread GitBox


rmuir commented on pull request #477:
URL: https://github.com/apache/lucene/pull/477#issuecomment-981761039


   > For a `TokensExistsQuery`, is the idea that the query part would work the 
same as norms, we just filter out docs with a norm of 0?
   
   yeah, at first at least. Sounds like we need a zero-check because apparently 
we put a norm in there when there are no tokens (which seems absolutely insane to 
me). Maybe we can fix it for a future index version and then remove the zero 
check.
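
   A rough sketch of the zero-check, assuming per-document norms where a missing entry means "no norm" and a value of 0 means a norm was written for a field that produced no tokens. `TokensExistSketch` and the `Long[]` representation are hypothetical; no `TokensExistsQuery` exists at the time of this thread.

```java
/** Hypothetical illustration of "norm exists and is non-zero" as the match condition. */
final class TokensExistSketch {

  /**
   * @param norms one entry per document; null means the document has no norm for the field
   * @return the number of documents whose field produced at least one token
   */
  static int countDocsWithTokens(Long[] norms) {
    int count = 0;
    for (Long norm : norms) {
      // The zero-check filters out documents whose field was indexed without any tokens.
      if (norm != null && norm != 0L) {
        count++;
      }
    }
    return count;
  }
}
```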





[GitHub] [lucene] rmuir commented on pull request #477: LUCENE-10263: Implement Weight.count() on NormsFieldExistsQuery

2021-11-29 Thread GitBox


rmuir commented on pull request #477:
URL: https://github.com/apache/lucene/pull/477#issuecomment-981765930


   personally, i really feel if someone wants "empty string" to be considered 
"indexed" for cases like this, they should use KeywordTokenizer/StringField, 
and actually index that empty string? We've certainly suffered lots of pain to 
support indexing that damn thing, might as well lean on it for such cases, and 
keep lucene fast.





[GitHub] [lucene] iverase opened a new pull request #486: LUCENE-9619: Remove IntersectVisitor from PointsTree API

2021-11-29 Thread GitBox


iverase opened a new pull request #486:
URL: https://github.com/apache/lucene/pull/486


   Introduces two functional interfaces, `DocValuesVisitor` and `DocIdsVisitor`, 
that are used in the PointTree API instead of the IntersectVisitor. The 
IntersectVisitor now extends those interfaces. 





[GitHub] [lucene] iverase commented on a change in pull request #486: LUCENE-9619: Remove IntersectVisitor from PointsTree API

2021-11-29 Thread GitBox


iverase commented on a change in pull request #486:
URL: https://github.com/apache/lucene/pull/486#discussion_r758504244



##
File path: lucene/core/src/java/org/apache/lucene/index/PointValues.java
##
@@ -323,10 +355,18 @@ default void grow(int count) {}
    */
   public final void intersect(IntersectVisitor visitor) throws IOException {
     final PointTree pointTree = getPointTree();
-    intersect(visitor, pointTree);
+    intersect(wrapIntersectVisitor(visitor), pointTree);
     assert pointTree.moveToParent() == false;
   }
 
+  /**
+   * Adds the possibility of wrapping a provided {@link IntersectVisitor} in {@link
+   * #intersect(IntersectVisitor)}.
+   */
+  protected IntersectVisitor wrapIntersectVisitor(IntersectVisitor visitor) throws IOException {
+    return visitor;
+  }

Review comment:
   I added this entry point in order to wrap the IntersectVisitor with an 
AssertingIntersectVisitor during testing. I don't really like it, but the only 
other option is to make the `intersect` method non-final, which I didn't like 
either.
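
   The trade-off can be sketched as a final entry point plus a protected hook. All names below (`PointValuesSketch`, `wrapVisitor`, `AssertingPointValuesSketch`) are hypothetical and much simpler than the real PointValues and asserting test classes.

```java
/** Hypothetical base class: intersect stays final, wrapping happens through a hook. */
abstract class PointValuesSketch {
  interface Visitor {
    void visit(int docID);
  }

  /** Final so subclasses cannot bypass the wrapping step. */
  public final void intersect(Visitor visitor) {
    Visitor wrapped = wrapVisitor(visitor);
    for (int docID : docIDs()) {
      wrapped.visit(docID);
    }
  }

  /** Default hook: no wrapping. */
  protected Visitor wrapVisitor(Visitor visitor) {
    return visitor;
  }

  protected abstract int[] docIDs();
}

/** Hypothetical test-only subclass that adds checks around every visit. */
final class AssertingPointValuesSketch extends PointValuesSketch {
  private final int[] docIDs;
  int visited = 0;

  AssertingPointValuesSketch(int[] docIDs) {
    this.docIDs = docIDs;
  }

  @Override
  protected Visitor wrapVisitor(Visitor visitor) {
    return docID -> {
      if (docID < 0) {
        throw new AssertionError("negative docID: " + docID);
      }
      visited++;
      visitor.visit(docID);
    };
  }

  @Override
  protected int[] docIDs() {
    return docIDs;
  }
}
```

   The hook widens the protected API slightly, but production callers still go through the same final method.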







[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450554#comment-17450554
 ] 

Dawid Weiss commented on LUCENE-10255:
--

Sorry but I remain unconvinced that typing a million times "org.apache." in 
various contexts wins you or me anything. Sure - maven coordinates are there as 
an example where this sort of makes sense (because all of the bazillion 
artifacts live under the same namespace tree). The module system is different, 
though: there will be no name conflicts there if you shorten the module name 
to just "lucene". I don't see any gain in prefixing it with anything. On the 
contrary, adding a prefix is a nuisance if the 'lucene' prefix is sufficiently 
unique to guarantee no conflicts with anything else. 

Even in the maven namespace some people opt for shorter prefixes (including 
various Apache commons libraries) [1].

[1] https://repo1.maven.org/maven2/commons-net/commons-net/



> Fully embrace the java module system
> 
>
> Key: LUCENE-10255
> URL: https://issues.apache.org/jira/browse/LUCENE-10255
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Dawid Weiss
>Priority: Major
> Attachments: screenshot-1.png, screenshot-2.png
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> I've experimented a bit trying to move the code to the JMS. It is 
> _surprisingly difficult_... A PoC that almost passes all checks is here:
> https://github.com/dweiss/lucene/tree/jms
> Here are my conclusions so far:
> * The JMS and gradle add a lot of complexity (this applies to any 
> higher-level tooling, including IDEs, I think). For starters, modules have to 
> be JARs. The effect of this is that what was previously a set of directories 
> from dependencies now has to be a JAR. What was previously an incremental 
> update of a single .class file now ripples throughout the build recreating 
> module JARs (ZIPs!)... I didn't realize it at first, but it's a costly thing 
> to do. I'm not even sure how IDEs handle this issue.
> * A Java module contains metadata (such as the module version or main class) 
> that is completely detached from any source file. These things live in a 
> class bytecode of the compiled module-info; interestingly, there is no 
> source-level way to specify it - these class attributes are injected by the 
> 'jar' tool. Gradle has some fancy on-the-fly asm conversion filter that 
> injects it.
> * Dependencies between modules will effectively live in two places: in gradle 
> build files and in module-info files. And they can go out of sync, although 
> it's probably easy to catch (since javac would complain about missing classes 
> during compilation, even if they're in module path).
> * Probably the biggest challenge (not covered in the PoC) is with our custom 
> javadoc and ecj linter tasks - they see the module-info.java and can't cope 
> with it. At the same time, there is no easy way to exclude that one 
> particular file: ecj would have to accept a full set of sources (command 
> argument limit will be a problem), javac can accept a full set of java 
> sources (external file) but then it doesn't copy doc-files properly anymore 
> (this is probably easier to fix). 
> * There are differences at runtime that are hard to anticipate - for example 
> resource lookups via class loader no longer work (I fixed this in Luke).
> After poking a bit and trying it out I have to say I have mixed feelings 
> about moving to the JMS. On the one hand, many things are great - the module 
> path, module descriptors and access modes. On the other hand, the tooling 
> tricks required to make it all work make you shiver.
> If anybody wants to play/ improve things on that experimental branch (I 
> converted Luke to a full module - it works), please be my guest. I have to 
> sit on this and think whether it's something I really like or not.
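
One of the runtime differences listed in the description above is resource lookup. As a rough sketch (not necessarily the fix that was applied in Luke), lookups can be anchored on a class or on its module rather than on a class loader. `Class.getResourceAsStream` and `Module.getResourceAsStream` are standard JDK methods; the helper names `visibleViaClass` and `visibleViaModule` are made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;

public class ResourceLookupSketch {

  /** Looks up a resource relative to the package of the anchor class. */
  public static boolean visibleViaClass(Class<?> anchor, String relativeName) throws IOException {
    try (InputStream in = anchor.getResourceAsStream(relativeName)) {
      return in != null;
    }
  }

  /** Looks up a resource by its full path inside the module that contains the anchor class. */
  public static boolean visibleViaModule(Class<?> anchor, String absoluteName) throws IOException {
    try (InputStream in = anchor.getModule().getResourceAsStream(absoluteName)) {
      return in != null;
    }
  }

  public static void main(String[] args) throws IOException {
    // java.lang.Object is used only because it exists on every JDK.
    System.out.println(visibleViaClass(Object.class, "Object.class"));
    System.out.println(visibleViaModule(Object.class, "java/lang/Object.class"));
  }
}
```

Both methods return null (here reported as false) when the resource does not exist or is encapsulated in a package that is not open to the caller, so code relying on them should handle that case explicitly.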



--
This message was sent by Atlassian Jira
(v8.20.1#820001)




[GitHub] [lucene] romseygeek commented on pull request #477: LUCENE-10263: Implement Weight.count() on NormsFieldExistsQuery

2021-11-29 Thread GitBox


romseygeek commented on pull request #477:
URL: https://github.com/apache/lucene/pull/477#issuecomment-981775748


   One disadvantage of renaming it is that it really does require norms to 
work; it might be a bit surprising to have a 'TokensExistsQuery' that you run 
against a field with norms disabled and it doesn't return anything.  Or maybe 
it could throw an exception if the field in question doesn't have norms.





[GitHub] [lucene] rmuir commented on pull request #477: LUCENE-10263: Implement Weight.count() on NormsFieldExistsQuery

2021-11-29 Thread GitBox


rmuir commented on pull request #477:
URL: https://github.com/apache/lucene/pull/477#issuecomment-981811821


   > One disadvantage of renaming it is that it really does require norms to 
work; it might be a bit surprising to have a 'TokensExistsQuery' that you run 
against a field with norms disabled and it doesn't return anything. Or maybe it 
could throw an exception if the field in question doesn't have norms.
   
   +1 to an exception and documenting the restriction. It is crazy that the 
existing NormsFieldExistsQuery doesn't throw an exception today when 
FieldInfo.omitNorms is set, instead silently returning `0`! This is clearly an 
error, like not indexing positions for a PhraseQuery.
   
   I personally think a new name would be more descriptive of what it does 
(clarifying the semantics to make it faster), and make more sense to users. We 
could even document that if you want to count empty strings, you should index 
empty strings as tokens. I suspect almost nobody cares about this previous 
empty string crap, seems overthought and now hurts our performance, due to the 
way the current query is named/defined.
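
The "throw instead of silently returning 0" behaviour suggested here could look roughly like the sketch below. It deliberately avoids the real Lucene API: `FieldMeta` and `countDocsWithField` are hypothetical names, and the exception type and message are assumptions, not what the query actually does.

```java
public class NormsCheckSketch {

  /** Hypothetical stand-in for per-field index metadata; not Lucene's FieldInfo. */
  public static final class FieldMeta {
    final String name;
    final boolean omitsNorms;
    final int docsWithNorms;

    public FieldMeta(String name, boolean omitsNorms, int docsWithNorms) {
      this.name = name;
      this.omitsNorms = omitsNorms;
      this.docsWithNorms = docsWithNorms;
    }
  }

  /** Counts documents that have the field, failing loudly if norms were never indexed. */
  public static int countDocsWithField(FieldMeta field) {
    if (field.omitsNorms) {
      // Without norms there is nothing to count, so 0 would be a misleading answer.
      throw new IllegalStateException(
          "field \"" + field.name + "\" was indexed without norms; cannot test for existence");
    }
    return field.docsWithNorms;
  }

  public static void main(String[] args) {
    System.out.println(countDocsWithField(new FieldMeta("body", false, 5)));
  }
}
```

Failing fast makes the misconfiguration visible at query time, in the same way a phrase query is expected to complain when positions were not indexed.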





[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450583#comment-17450583
 ] 

Uwe Schindler commented on LUCENE-10255:


Hi,
please have a look at Maven Central and the good work done by 
[~sormu...@gmx.de]:

This table has module names and their artifact names extracted by a script from 
Maven Central: 
https://github.com/sormuras/modules/blob/main/doc/Top1000-2020.txt.md (see also 
the repo: https://github.com/sormuras/modules)

When looking at the Top 1000, you will see that all module names that can be 
found on Maven Central use the package names / coordinate names. If you read 
the JLS, it recommends "simple names" for packages and modules only for small 
projects without a wide reach.

Here is the conclusion with what he recommends: 
https://sormuras.github.io/blog/2019-08-04-maven-coordinates-and-java-module-names.html

So I just repeat myself: Module names should really be unique. I don't care 
about 9.0, because it's not officially announced, but when we enable the module 
system we should use unique names.

Or should we also rename all packages in Lucene's source code and strip off 
"org.apache."?




[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450589#comment-17450589
 ] 

Dawid Weiss commented on LUCENE-10255:
--

You don't understand me, Uwe. I agree on Maven Central coordinates. I don't 
agree on full prefixes for module naming. I think "lucene." is unique enough. 
This is a subjective opinion and there's really no convincing me otherwise. If 
you want to push the full prefix, I'll live with it, but I don't agree that it 
is necessary or useful or that it solves anything.




[jira] [Comment Edited] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450583#comment-17450583
 ] 

Uwe Schindler edited comment on LUCENE-10255 at 11/29/21, 4:50 PM:
---

Hi,
please have a look at Maven Central and the good work done by 
[~sormu...@gmx.de]:

This table has module names and their artifact names extracted by a script from 
Maven Central: 
https://github.com/sormuras/modules/blob/main/doc/Top1000-2020.txt.md (see also 
the repo: https://github.com/sormuras/modules)

When looking at the Top 1000, you will see that all module names that can be 
found on Maven Central use the package names / coordinate names. If you read 
the JLS, it recommends "simple names" for packages and modules only for small 
projects without a wide reach.

Here is the conclusion with what he recommends: 
https://sormuras.github.io/blog/2019-08-04-maven-coordinates-and-java-module-names.html

So I just repeat myself: Module names should really be unique. I don't care 
about 9.0, because it's not officially announced, but when we enable the module 
system we should use unique names.

Or should we also rename all packages in Lucene's source code and strip off 
"org.apache."?



[jira] [Comment Edited] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450583#comment-17450583
 ] 

Uwe Schindler edited comment on LUCENE-10255 at 11/29/21, 4:59 PM:
---

Hi,
please have a look at Maven Central and the good work done by [~sor]:

This table has module names and their artifact names extracted by a script from 
Maven Central: 
https://github.com/sormuras/modules/blob/main/doc/Top1000-2020.txt.md (see also 
the repo: https://github.com/sormuras/modules)

When looking at the Top 1000, you will see that all module names that can be 
found on Maven Central use the package names / coordinate names. If you read 
the JLS, it recommends "simple names" for packages and modules only for small 
projects without a wide reach.

Here is the conclusion with what he recommends: 
https://sormuras.github.io/blog/2019-08-04-maven-coordinates-and-java-module-names.html

So I just repeat myself: Module names should really be unique. I don't care 
about 9.0, because it's not officially announced, but when we enable the module 
system we should use unique names.

Or should we also rename all packages in Lucene's source code and strip off 
"org.apache."?



[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450597#comment-17450597
 ] 

Uwe Schindler commented on LUCENE-10255:


We can disagree, for sure, but I'd like to get some more opinions. This MUST be 
a community decision. I gave my well-educated opinion and invite everybody to 
read this blog post: 
https://sormuras.github.io/blog/2019-08-04-maven-coordinates-and-java-module-names.html
[~sor] explains very well what a module name should look like.

The module names inside java/jdk are short, but the same is true of their 
package names. There is also the statement: The package names in every module 
*should* start with the module name (this is not always fully possible, but a 
good rule is that module name and package name should have a common prefix).
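
The "common prefix" rule of thumb can be checked mechanically. The sketch below builds in-memory descriptors with the JDK's `ModuleDescriptor` builder and compares each exported package with the module name. The module names `org.apache.lucene.core` and `lucene.core` are only illustrative candidates from this discussion, and `commonPrefix` and `allPackagesSharePrefix` are hypothetical helpers.

```java
import java.lang.module.ModuleDescriptor;

public class ModulePrefixSketch {

  /** Returns the longest shared run of leading dot-separated segments, or "" if none. */
  public static String commonPrefix(String moduleName, String packageName) {
    String[] m = moduleName.split("\\.");
    String[] p = packageName.split("\\.");
    StringBuilder prefix = new StringBuilder();
    for (int i = 0; i < Math.min(m.length, p.length) && m[i].equals(p[i]); i++) {
      if (i > 0) {
        prefix.append('.');
      }
      prefix.append(m[i]);
    }
    return prefix.toString();
  }

  /** Builds a descriptor that exports the given packages (no module-info.java needed). */
  public static ModuleDescriptor describe(String moduleName, String... packages) {
    ModuleDescriptor.Builder builder = ModuleDescriptor.newModule(moduleName);
    for (String pkg : packages) {
      builder.exports(pkg);
    }
    return builder.build();
  }

  /** True if every package of the module shares at least one leading segment with its name. */
  public static boolean allPackagesSharePrefix(ModuleDescriptor descriptor) {
    return descriptor.packages().stream()
        .noneMatch(pkg -> commonPrefix(descriptor.name(), pkg).isEmpty());
  }

  public static void main(String[] args) {
    String[] pkgs = {"org.apache.lucene.index", "org.apache.lucene.search"};
    System.out.println(allPackagesSharePrefix(describe("org.apache.lucene.core", pkgs)));
    System.out.println(allPackagesSharePrefix(describe("lucene.core", pkgs)));
  }
}
```

With unchanged `org.apache.lucene.*` packages, only the fully prefixed module name satisfies the rule; the short name would satisfy it only if the packages were renamed as well.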




[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450604#comment-17450604
 ] 

Dawid Weiss commented on LUCENE-10255:
--

Sure, Uwe. I think I expressed my personal opinion. :) Some of our current 
module names cannot be used as Java module names (anything with a dash). If you 
want consistency then the first step would be to rename those modules in the 
repo.
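
The point about dashes can be verified with the JDK itself: `ModuleDescriptor.newModule` rejects names that are not legal module names. In this sketch the project name `analysis-common` is only an assumed example of a dashed name, and replacing dashes with dots is one possible mapping, not a decision taken in this thread.

```java
import java.lang.module.ModuleDescriptor;

public class ModuleNameSketch {

  /** Returns true if the JDK accepts the string as a module name. */
  public static boolean isLegalModuleName(String name) {
    try {
      ModuleDescriptor.newModule(name);
      return true;
    } catch (IllegalArgumentException e) {
      return false;
    }
  }

  /** One possible mapping from a dashed project name to a module name segment. */
  public static String toModuleName(String projectName) {
    return projectName.replace('-', '.');
  }

  public static void main(String[] args) {
    System.out.println(isLegalModuleName("lucene.core"));
    System.out.println(isLegalModuleName("analysis-common"));
    System.out.println(isLegalModuleName(toModuleName("analysis-common")));
  }
}
```

A module name is a dot-separated sequence of Java identifiers, so any mapping also has to avoid segments that are reserved keywords.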




[GitHub] [lucene] rmuir commented on pull request #477: LUCENE-10263: Implement Weight.count() on NormsFieldExistsQuery

2021-11-29 Thread GitBox


rmuir commented on pull request #477:
URL: https://github.com/apache/lucene/pull/477#issuecomment-981837493


   and btw i'm not suggesting we do all this crap underneath this PR, the 
current PR looks fine to me (the optimization it uses is safe)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450607#comment-17450607
 ] 

Uwe Schindler commented on LUCENE-10255:


From the list posted before, there is also an example made by you: 
"com.carrotsearch.hppc" is the module name of the Maven artifact 
"com.carrotsearch:hppc". This was exactly my proposal for Lucene as well:

https://github.com/carrotsearch/hppc/blob/29ab369adac23a76acae1d08529654b2c2dc59e5/gradle/java/compiler.gradle#L24

> Fully embrace the java module system
> 
>
> Key: LUCENE-10255
> URL: https://issues.apache.org/jira/browse/LUCENE-10255
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Dawid Weiss
>Priority: Major
> Attachments: screenshot-1.png, screenshot-2.png
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> I've experimented a bit trying to move the code to the JMS. It is 
> _surprisingly difficult_... A PoC that almost passes all checks is here:
> https://github.com/dweiss/lucene/tree/jms
> Here are my conclusions so far:
> * The JMS and gradle add a lot of complexity (this applies to any 
> higher-level tooling, including IDEs, I think). For starters, modules have to 
> be JARs. The effect of this is that what was previously a set of directories 
> from dependencies now has to be a JAR. What was previously an incremental 
> update of a single .class file now ripples throughout the build recreating 
> module JARs (ZIPs!)... I didn't realize it at first, but it's a costly thing 
> to do. I'm not even sure how IDEs handle this issue.
> * A Java module contains metadata (such as the module version or main class) 
> that is completely detached from any source file. These things live in a 
> class bytecode of the compiled module-info; interestingly, there is no 
> source-level way to specify it - these class attributes are injected by the 
> 'jar' tool. Gradle has some fancy on-the-fly asm conversion filter that 
> injects it.
> * Dependencies between modules will effectively live in two places: in gradle 
> build files and in module-info files. And they can go out of sync, although 
> it's probably easy to catch (since javac would complain about missing classes 
> during compilation, even if they're in module path).
> * Probably the biggest challenge (not covered in the PoC) are with our custom 
> javadoc and ecj linter tasks - they see the module-info.java and can't cope 
> with it. At the same time, there is no easy way to exclude that one 
> particular file: ecj would have to accept a full set of sources (command 
> argument limit will be a problem), javac can accept a full set of java 
> sources (external file) but then it doesn't copy doc-files properly anymore 
> (this is probably easier to fix). 
> * There are differences at runtime that are hard to anticipate - for example 
> resource lookups via class loader no longer work (I fixed this in Luke).
> After poking a bit and trying it out I have to say I have mixed feelings 
> about moving to the JMS. On the one hand, many things are great - the module 
> path, module descriptors and access modes. On the other hand, the tooling 
> tricks required to make it all work make you shiver.
> If anybody wants to play/ improve things on that experimental branch (I 
> converted Luke to a full module - it works), please be my guest. I have to 
> sit on this and think whether it's something I really like or not.
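
The runtime difference mentioned in the last bullet of the description (resource lookups via class loader) can be sketched in Java. This is only an illustration, not code from the PoC branch or from Luke; the class and method names are made up. It relies on documented JDK behavior: for a named module, ClassLoader.getResource only finds non-class resources in packages that are opened unconditionally, while Class.getResourceAsStream called from inside the module resolves within that module.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ResourceLookupSketch {
  /** Looks a resource up relative to the given class, i.e. within that class's own module. */
  static boolean foundViaClass(Class<?> anchor, String name) {
    try (InputStream in = anchor.getResourceAsStream(name)) {
      return in != null;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Looks a resource up by its full path through a class loader. */
  static boolean foundViaClassLoader(ClassLoader loader, String path) {
    return loader.getResource(path) != null;
  }

  public static void main(String[] args) {
    // ".class" resources are never encapsulated, so both lookups succeed here.
    System.out.println(foundViaClass(Object.class, "Object.class"));
    System.out.println(
        foundViaClassLoader(ClassLoader.getSystemClassLoader(), "java/lang/Object.class"));
    // For a non-class resource inside a named module, prefer the Class-relative
    // lookup from code in the same module; the ClassLoader lookup may return null.
    System.out.println("named module: " + ResourceLookupSketch.class.getModule().isNamed());
  }
}
```

On the class path (unnamed module) both styles behave the same, which is why the difference is easy to miss until the code actually runs as a module.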






[jira] [Comment Edited] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450607#comment-17450607
 ] 

Uwe Schindler edited comment on LUCENE-10255 at 11/29/21, 5:14 PM:
---

From the list posted before there is also an example made by you: 
"com.carrotsearch.hppc" is module name of the maven artifact 
"com.carrotsearch:hppc". This was exactly also my proposal for Lucene:

https://github.com/carrotsearch/hppc/blob/29ab369adac23a76acae1d08529654b2c2dc59e5/gradle/java/compiler.gradle#L24


was (Author: thetaphi):
From the list posted before there is also an example made you:  
"com.carrotsearch.hppc" is module name of the maven artifact 
"com.carrotsearch:hppc". This was exactly also my proposal for Lucene:

https://github.com/carrotsearch/hppc/blob/29ab369adac23a76acae1d08529654b2c2dc59e5/gradle/java/compiler.gradle#L24




[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450613#comment-17450613
 ] 

Dawid Weiss commented on LUCENE-10255:
--

Mistakes of the youth... I remain unconvinced.




[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450620#comment-17450620
 ] 

Uwe Schindler commented on LUCENE-10255:


Apache TIKA also uses module names according to the spec: 
https://github.com/apache/tika/blob/9d29536228860860549d89a052673d47c2af75ca/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-xmp-commons/pom.xml#L48

I just figured out that you already added Automatic Module names to the 9.0 
release, which are not even hardcoded, but derived through regular 
expressions/search-replace. This was done completely without any announcement, 
so we have a Lucene release with broken names going out soon. I am glad that 
nobody cares about modules at the moment...




[jira] [Comment Edited] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450620#comment-17450620
 ] 

Uwe Schindler edited comment on LUCENE-10255 at 11/29/21, 5:41 PM:
---

Apache TIKA also uses module names according to the spec: 
https://github.com/apache/tika/blob/9d29536228860860549d89a052673d47c2af75ca/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-xmp-commons/pom.xml#L48

I just figured out that you already added Automatic Module names to the 9.0 
release, which are not even hardcoded, but derived through regular 
expressions/search-replace from the internal gradle project path. This was done 
completely without any announcement, so we have a Lucene release with broken 
names going out soon. I am glad that nobody cares about modules at the 
moment...
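
To make the concern concrete, here is a minimal Java sketch of a name derived from a Gradle project path. The derivation rule below is hypothetical and is not the actual logic in the Lucene build; it is only chosen so that ":lucene:luke" maps to "lucene.luke", the name used with "-m" in the scripts. The point is that any such rule ties the public module name to the internal project layout.

```java
public class DerivedModuleNameSketch {
  /**
   * Hypothetical rule: drop the leading ':' of the Gradle project path,
   * then turn the remaining ':' and '-' characters into dots.
   */
  static String deriveFromProjectPath(String gradleProjectPath) {
    return gradleProjectPath.replaceFirst("^:", "").replaceAll("[:\\-]", ".");
  }

  public static void main(String[] args) {
    // Current layout: the project lives under the ":lucene" top-level folder.
    System.out.println(deriveFromProjectPath(":lucene:luke"));
    // Same project after a layout change: the derived name silently differs.
    System.out.println(deriveFromProjectPath(":luke"));
  }
}
```

With a hardcoded name, moving the project would leave the published module name untouched.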


was (Author: thetaphi):
Apache TIKA also uses module names according to the spec: 
https://github.com/apache/tika/blob/9d29536228860860549d89a052673d47c2af75ca/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-xmp-commons/pom.xml#L48

I just figured out that you already added Automatic Module names to the 9.0 
release, which are not even hardcoded, but derived through regular 
expressions/search replace. This was done completely without any announcement, 
so we have a Lucene release with broken names going out soon. I am glad that 
nobody takes care about modules at the moment...




[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450624#comment-17450624
 ] 

Dawid Weiss commented on LUCENE-10255:
--

Uwe... This was done with an announcement on the pull request and the issue. 
And it's also literally everywhere in the scripts you've reviewed ("-m 
lucene.luke"). 

If you really care so much about it and wish to change it to a full prefix we 
can still do it - 9.0 is not out yet.




[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450625#comment-17450625
 ] 

Uwe Schindler commented on LUCENE-10255:


> I just figured out that you already added Automatic Module names to the 9.0 
> release, which are not even hardcoded, but derived through regular 
> expressions/search-replace from the internal gradle project path. 

This is even riskier: if we decide to remove the ":lucene" top-level Gradle 
folder, the module name changes and nobody will notice!

Everything that's relevant to source code of downstream users should be 
explicitly declared (either in module-info.java or in the manifest).
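
As a rough sketch of the manifest variant: the Java snippet below writes a JAR whose manifest carries an explicit Automatic-Module-Name entry and then asks the JDK which module name it sees. The class name, the JAR file name and the module name passed in are placeholders for illustration, not names proposed for Lucene. Only standard JDK APIs are used (java.util.jar and java.lang.module.ModuleFinder).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReference;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class ExplicitModuleNameSketch {
  /** Builds a manifest-only JAR declaring the given name and returns the name the JDK resolves. */
  static String declaredModuleName(String moduleName) {
    try {
      Path dir = Files.createTempDirectory("module-name-sketch");
      Path jar = dir.resolve("some-artifact-1.0.jar"); // placeholder file name
      dir.toFile().deleteOnExit();
      jar.toFile().deleteOnExit();

      Manifest manifest = new Manifest();
      Attributes main = manifest.getMainAttributes();
      main.put(Attributes.Name.MANIFEST_VERSION, "1.0"); // required, or nothing is written
      main.putValue("Automatic-Module-Name", moduleName);

      // The constructor writes META-INF/MANIFEST.MF; no other entries are needed here.
      try (OutputStream out = Files.newOutputStream(jar);
          JarOutputStream jarOut = new JarOutputStream(out, manifest)) {
        jarOut.flush();
      }

      ModuleReference ref = ModuleFinder.of(jar).findAll().iterator().next();
      return ref.descriptor().name();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // The explicit manifest entry wins over the name derived from the JAR file name.
    System.out.println(declaredModuleName("org.example.demo"));
  }
}
```

Without the manifest entry, the JDK would fall back to a name derived from the JAR file name, which is exactly the kind of implicit naming a downstream user cannot rely on.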




[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450627#comment-17450627
 ] 

Uwe Schindler commented on LUCENE-10255:


> This was done with an announcement on the pull request

These are such important changes that they should have been posted to the mailing list!




[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450628#comment-17450628
 ] 

Dawid Weiss commented on LUCENE-10255:
--

https://issues.apache.org/jira/browse/LUCENE-10234

> These are so important changes that it should have been a post on mailing 
> list!

Sure. It wasn't a change though - it was an introduction of what wasn't there 
before at all.

> Fully embrace the java module system
> 
>
> Key: LUCENE-10255
> URL: https://issues.apache.org/jira/browse/LUCENE-10255
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Dawid Weiss
>Priority: Major
> Attachments: screenshot-1.png, screenshot-2.png
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> I've experimented a bit trying to move the code to the JMS. It is 
> _surprisingly difficult_... A PoC that almost passes all checks is here:
> https://github.com/dweiss/lucene/tree/jms
> Here are my conclusions so far:
> * The JMS and gradle add a lot of complexity (this applies to any 
> higher-level tooling, including IDEs, I think). For starters, modules have to 
> be JARs. The effect of this is that what was previously a set of directories 
> from dependencies now has to be a JAR. What was previously an incremental 
> update of a single .class file now ripples throughout the build recreating 
> module JARs (ZIPs!)... I didn't realize it at first, but it's a costly thing 
> to do. I'm not even sure how IDEs handle this issue.
> * A Java module contains metadata (such as the module version or main class) 
> that is completely detached from any source file. These things live in a 
> class bytecode of the compiled module-info; interestingly, there is no 
> source-level way to specify it - these class attributes are injected by the 
> 'jar' tool. Gradle has some fancy on-the-fly asm conversion filter that 
> injects it.
> * Dependencies between modules will effectively live in two places: in gradle 
> build files and in module-info files. And they can go out of sync, although 
> it's probably easy to catch (since javac would complain about missing classes 
> during compilation, even if they're in module path).
> * Probably the biggest challenge (not covered in the PoC) are with our custom 
> javadoc and ecj linter tasks - they see the module-info.java and can't cope 
> with it. At the same time, there is no easy way to exclude that one 
> particular file: ecj would have to accept a full set of sources (command 
> argument limit will be a problem), javac can accept a full set of java 
> sources (external file) but then it doesn't copy doc-files properly anymore 
> (this is probably easier to fix). 
> * There are differences at runtime that are hard to anticipate - for example 
> resource lookups via class loader no longer work (I fixed this in Luke).
> After poking a bit and trying it out I have to say I have mixed feelings 
> about moving to the JMS. On the one hand, many things are great - the module 
> path, module descriptors and access modes. On the other hand, the tooling 
> tricks required to make it all work make you shiver.
> If anybody wants to play/ improve things on that experimental branch (I 
> converted Luke to a full module - it works), please be my guest. I have to 
> sit on this and think whether it's something I really like or not.
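
The point above about metadata living in the compiled descriptor rather than in source can be observed from the runtime side. This is a minimal sketch, assuming a Java 9+ runtime; the class name ModuleMetadata is made up for illustration, and it inspects java.base rather than any Lucene module.

```java
import java.lang.module.ModuleDescriptor;

final class ModuleMetadata {
    /** Formats the name, version and main class recorded in a module descriptor. */
    static String describe(ModuleDescriptor d) {
        return d.name()
            + " version=" + d.rawVersion().orElse("(none)")
            + " mainClass=" + d.mainClass().orElse("(none)");
    }

    public static void main(String[] args) {
        // java.lang.Object always lives in the named module java.base.
        ModuleDescriptor base = Object.class.getModule().getDescriptor();
        System.out.println(describe(base));
    }
}
```

Both the version and the main class come back as Optional values because they are attributes of the compiled descriptor, not declarations in module-info.java.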






[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450630#comment-17450630
 ] 

Robert Muir commented on LUCENE-10255:
--

{quote}
Sorry but I remain unconvinced that typing a million times "org.apache." in 
various contexts wins you or me anything.
{quote}

Sorry for the silly question, but I'm trying to understand why you'd need to 
type it a million times. I too dislike the verbosity of java, but it is my 
understanding that you might only add it to a module-info.java, like once?

I think as far as specifying stuff on the command line, it's not a problem, as 
Lucene isn't a command-line application but instead an API. The one app we 
really ship (Luke) has a sh/bat to make it easy.

But because it is an API, I do care that it's easy for users to consume it with 
the module system. And that also includes making it easy to consume things like 
analyzers via SPI providers if they are using the module system. I just don't 
know what that looks like yet (due to my unfamiliarity with the module system), 
but I'd love to visually see the tradeoffs between say 'lucene.analysis.common' 
and 'org.apache.lucene.analysis.common' from an "API user" perspective.
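
One way to see the two conventions side by side is to render what a consumer's module-info.java would contain under each. This is only a sketch: the consumer module com.example.search and the class ModuleNameComparison are hypothetical, and the descriptors are built in memory with the JDK's ModuleDescriptor builder instead of being compiled against real Lucene JARs.

```java
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleDescriptor.Requires;
import java.util.stream.Collectors;

final class ModuleNameComparison {
    /** Renders the descriptor roughly as it would read in module-info.java. */
    static String render(ModuleDescriptor d) {
        String requires = d.requires().stream()
            // Skip the implicit (mandated) dependency on java.base.
            .filter(r -> !r.modifiers().contains(Requires.Modifier.MANDATED))
            .map(Requires::name)
            .sorted()
            .map(n -> "  requires " + n + ";\n")
            .collect(Collectors.joining());
        return "module " + d.name() + " {\n" + requires + "}";
    }

    /** Builds a descriptor for the hypothetical consumer using the given module name prefix. */
    static ModuleDescriptor consumer(String prefix) {
        return ModuleDescriptor.newModule("com.example.search")
            .requires(prefix + ".core")
            .requires(prefix + ".analysis.common")
            .build();
    }

    public static void main(String[] args) {
        System.out.println(render(consumer("lucene")));
        System.out.println(render(consumer("org.apache.lucene")));
    }
}
```

In both cases the consumer's Java sources would still import from org.apache.lucene packages; only the requires lines differ.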




[jira] [Reopened] (LUCENE-10234) Add automatic module name to JAR manifests.

2021-11-29 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss reopened LUCENE-10234:
--

Uwe seems to be really devoted to changing the automatic module name to the 
full-prefix convention, so I'm reopening this issue. This will require changes 
to the build system and the scripts that launch Luke. 

[~jpountz] - please cancel the current release candidate, we will have to 
respin.

> Add automatic module name to JAR manifests.
> ---
>
> Key: LUCENE-10234
> URL: https://issues.apache.org/jira/browse/LUCENE-10234
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: 9.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This is the first step to make Lucene a proper fit for the java module 
> system. I chose a shorthand "lucene.[x]" module name convention, without the 
> "org.apache" prefix.






[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450635#comment-17450635
 ] 

Dawid Weiss commented on LUCENE-10255:
--

I already typed "-m lucene.luke" what seems like half a million times while 
debugging stuff around the jms and gradle bugs. So I'm almost there.

Listen... I really don't like the full prefix but I really couldn't care less 
about it if you all want to stick with the full domain name - let's just fix 
it, respin the release candidate and be done with it. I did announce the 
shorthand version on LUCENE-10234, perhaps I should have written an all-caps 
announcement but I didn't, sorry. Let's do it the way you like it, I really 
don't care THAT MUCH. I only care a little.




[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Robert Muir (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450639#comment-17450639
 ] 

Robert Muir commented on LUCENE-10255:
--

My comment was a genuine question, as I don't yet understand how annoying this 
name will be to API users. I don't yet have any opinion on the color of the 
bikeshed :)




[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450647#comment-17450647
 ] 

Dawid Weiss commented on LUCENE-10255:
--

It'll be a different prefix in module-info.java "requires xyz" statements and 
in command-line invocations of Luke. Also, it'll list Lucene module as 
"lucene.core@version" instead of "org.apache.lucene@version". I'll provide a PR 
to go back to the full-prefix - Uwe seems to be really determined that this is 
the right way (tm) of doing it.




[jira] [Comment Edited] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450647#comment-17450647
 ] 

Dawid Weiss edited comment on LUCENE-10255 at 11/29/21, 6:15 PM:
-

It'll be a different prefix in module-info.java "requires xyz" statements and 
in command-line invocations of Luke. Also, it'll list Lucene module as 
"lucene.core@version" instead of "org.apache.lucene.core@version". I'll provide 
a PR to go back to the full-prefix - Uwe seems to be really determined that 
this is the right way (tm) of doing it.


was (Author: dweiss):
It'll be a different prefix in module-info.java "requires xyz" statements and 
in command-line invocations of Luke. Also, it'll list Lucene module as 
"lucene.core@version" instead of "org.apache.lucene@version". I'll provide a PR 
to go back to the full-prefix - Uwe seems to be really determined that this is 
the right way (tm) of doing it.




[jira] [Updated] (LUCENE-10234) Add automatic module name to JAR manifests.

2021-11-29 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-10234:
-
Description: This is the first step to make Lucene a proper fit for the 
java module system. -I chose a shorthand "lucene.[x]" module name convention, 
without the "org.apache" prefix.-  (was: This is the first step to make Lucene 
a proper fit for the java module system. I chose a shorthand "lucene.[x]" 
module name convention, without the "org.apache" prefix.)

> Add automatic module name to JAR manifests.
> ---
>
> Key: LUCENE-10234
> URL: https://issues.apache.org/jira/browse/LUCENE-10234
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: 9.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This is the first step to make Lucene a proper fit for the java module 
> system. -I chose a shorthand "lucene.[x]" module name convention, without the 
> "org.apache" prefix.-






[GitHub] [lucene] uschindler commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


uschindler commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758637522



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} (${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} ${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   Can we do this like here, based on the Maven group?
   
https://github.com/apache/lucene/blob/main/gradle/maven/publications-maven.gradle#L59-L60
   
   Of course, we would need to strip the ":lucene" from project path.
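
To make the two naming expressions concrete, here is a rough Java rendering of the string manipulation. It is a sketch under assumptions: the class AutomaticModuleNames is hypothetical, project paths are assumed to look like ":lucene:analysis:common", and the Maven group is assumed to be "org.apache.lucene". The real logic stays in the Gradle script.

```java
final class AutomaticModuleNames {
    /** Mirrors the expression in the diff: drop the leading ':', then map ':' to '.' and '-' to '_'. */
    static String fromProjectPath(String projectPath) {
        return "org.apache."
            + projectPath.replaceFirst(":", "").replace(':', '.').replace("-", "_");
    }

    /** Variant from the review comment: start from the Maven group and strip the ":lucene" prefix. */
    static String fromGroup(String group, String projectPath) {
        String relative = projectPath;
        if (projectPath.equals(":lucene") || projectPath.startsWith(":lucene:")) {
            relative = projectPath.substring(":lucene".length());
        }
        return group + relative.replace(':', '.').replace("-", "_");
    }

    public static void main(String[] args) {
        String path = ":lucene:analysis:common"; // assumed project path
        System.out.println(fromProjectPath(path));
        System.out.println(fromGroup("org.apache.lucene", path));
    }
}
```

Under these assumptions both variants yield the same name; the group-based one simply avoids hard-coding "org.apache." in the manifest script.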







[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450680#comment-17450680
 ] 

Uwe Schindler commented on LUCENE-10255:


bq. Sorry for the silly question, but I'm trying to understand why you'd need 
to type it a million times. I too dislike the verbosity of java, but it is my 
understanding that you might only add it to a module-info.java, like once?

Exactly. And for our API users it is not understandable why you must write 
{{requires lucene.core}} in the module-info.java but {{import 
org.apache.lucene.xyz.*;}} in all java files. This is inconsistent. And there 
is the risk of clashes (although Lucene is very special, we will see other 
third-party modules then also name their modules like "lucene.foobar.xy", 
although they have nothing in common with Apache). We are an Apache project, 
so our package names, module names and maven artifact names should have the 
"org.apache.lucene" prefix.

This allows users to consume Lucene in the way everybody knows: in java files 
for imports, and when defining dependencies in Maven or the requires 
directives in Java modules.

bq. But because it is an API, I do care that it's easy for users to consume it 
with the module system. And that also includes making it easy to consume things 
like analyzers via SPI providers if they are using the module system.

Yes, and this will also work with the module system. I tested it after adding 
correct "uses SPIBaseClass" statements to lucene-core's module-info.java. 
Theoretically, in addition we can hide everything from analyzers-common except 
the SPI
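
The "uses" and "provides" relationship mentioned here can be sketched with in-memory descriptors. All names below (the com.example modules, the service TokenFilterFactory, the provider LowerCaseFilterFactory and the class SpiDescriptors) are hypothetical stand-ins, not taken from Lucene's actual module-info files.

```java
import java.lang.module.ModuleDescriptor;
import java.util.List;

final class SpiDescriptors {
    // Hypothetical service and provider names, for illustration only.
    static final String SERVICE = "com.example.analysis.TokenFilterFactory";
    static final String PROVIDER = "com.example.analysis.impl.LowerCaseFilterFactory";

    /** The API module exports the service type and declares that it looks up implementations. */
    static ModuleDescriptor apiModule() {
        return ModuleDescriptor.newModule("com.example.core")
            .exports("com.example.analysis")
            .uses(SERVICE)
            .build();
    }

    /** A provider module registers an implementation without exporting any package. */
    static ModuleDescriptor providerModule() {
        return ModuleDescriptor.newModule("com.example.analysis.common")
            .requires("com.example.core")
            .provides(SERVICE, List.of(PROVIDER))
            .build();
    }

    public static void main(String[] args) {
        System.out.println("uses: " + apiModule().uses());
        for (ModuleDescriptor.Provides p : providerModule().provides()) {
            System.out.println("provides " + p.service() + " with " + p.providers());
        }
        System.out.println("exports of provider: " + providerModule().exports());
    }
}
```

The provider module exports nothing, which is the kind of hiding described above: consumers would reach the implementation only through the service lookup.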

> Fully embrace the java module system
> 
>
> Key: LUCENE-10255
> URL: https://issues.apache.org/jira/browse/LUCENE-10255
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Dawid Weiss
>Priority: Major
> Attachments: screenshot-1.png, screenshot-2.png
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> I've experimented a bit trying to move the code to the JMS. It is 
> _surprisingly difficult_... A PoC that almost passes all checks is here:
> https://github.com/dweiss/lucene/tree/jms
> Here are my conclusions so far:
> * The JMS and gradle add a lot of complexity (this applies to any 
> higher-level tooling, including IDEs, I think). For starters, modules have to 
> be JARs. The effect of this is that what was previously a set of directories 
> from dependencies now has to be a JAR. What was previously an incremental 
> update of a single .class file now ripples throughout the build recreating 
> module JARs (ZIPs!)... I didn't realize it at first, but it's a costly thing 
> to do. I'm not even sure how IDEs handle this issue.
> * A Java module contains metadata (such as the module version or main class) 
> that is completely detached from any source file. These things live in a 
> class bytecode of the compiled module-info; interestingly, there is no 
> source-level way to specify it - these class attributes are injected by the 
> 'jar' tool. Gradle has some fancy on-the-fly asm conversion filter that 
> injects it.
> * Dependencies between modules will effectively live in two places: in gradle 
> build files and in module-info files. And they can go out of sync, although 
> it's probably easy to catch (since javac would complain about missing classes 
> during compilation, even if they're in module path).
> * Probably the biggest challenge (not covered in the PoC) are with our custom 
> javadoc and ecj linter tasks - they see the module-info.java and can't cope 
> with it. At the same time, there is no easy way to exclude that one 
> particular file: ecj would have to accept a full set of sources (command 
> argument limit will be a problem), javac can accept a full set of java 
> sources (external file) but then it doesn't copy doc-files properly anymore 
> (this is probably easier to fix). 
> * There are differences at runtime that are hard to anticipate - for example 
> resource lookups via class loader no longer work (I fixed this in Luke).
> After poking a bit and trying it out I have to say I have mixed feelings 
> about moving to the JMS. On the one hand, many things are great - the module 
> path, module descriptors and access modes. On the other hand, the tooling 
> tricks required to make it all work make you shiver.
> If anybody wants to play/ improve things on that experimental branch (I 
> converted Luke to a full module - it works), please be my guest. I have to 
> sit on this and think whether it's something I really like or not.
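
As a hedged illustration of the runtime difference mentioned in the description above (resource lookups), the sketch below resolves a resource relative to a class instead of going through a class loader. The names `ResourceLookupSketch`, `open` and `exists` are made up for this example; this is not the actual fix applied to Luke.

```java
import java.io.IOException;
import java.io.InputStream;

final class ResourceLookupSketch {
  /**
   * Opens a resource relative to the given anchor class. When the anchor lives in a
   * named module, Class#getResourceAsStream looks the resource up in that module.
   */
  static InputStream open(Class<?> anchor, String name) {
    return anchor.getResourceAsStream(name);
  }

  /** Returns true if the resource can be found relative to the anchor class. */
  static boolean exists(Class<?> anchor, String name) {
    // A null resource is allowed in try-with-resources; it is simply not closed.
    try (InputStream in = open(anchor, name)) {
      return in != null;
    } catch (IOException e) {
      return false;
    }
  }
}
```

Anchoring the lookup on a class from the module that owns the resource avoids depending on which class loader happens to see the module.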



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450680#comment-17450680
 ] 

Uwe Schindler edited comment on LUCENE-10255 at 11/29/21, 7:02 PM:
---

bq. Sorry for the silly question, but I'm trying to understand why you'd need 
to type it a million times. I too dislike the verbosity of java, but it is my 
understanding that you might only add it to a module-info.java, like once?

Exactly. And for our API users it is not understandable why you must write 
"{{requires lucene.core;}}" in the module-info.java but 
"{{import org.apache.lucene.xyz.*;}}" in all java files. This is inconsistent! 
There is also the risk of clashes: Lucene is very special, but we will see 
other third-party modules that also name themselves like "lucene.foobar.xy", 
although they have nothing in common with Apache. We are an Apache project, so 
our package names, module names and maven artifact names should have the 
"org.apache.lucene" prefix.

This allows consuming Lucene in the way everybody knows: in java files for 
imports, when defining your dependencies in Maven, and in the requires 
directives in Java modules.

bq. But because it is an API, I do care that it's easy for users to consume it 
with the module system. And that also includes making it easy to consume things 
like analyzers via SPI providers if they are using the module system.

Yes, and this will also work with the module system. I tested it after adding 
the correct "uses SPIBaseClass" statements to lucene-core's module-info.java. 
Theoretically, in addition, we can hide everything from analyzers-common except 
the SPI.
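
To make the SPI point concrete, here is a minimal sketch of a service lookup with `java.util.ServiceLoader`. The interface `AnalysisFactory` is a hypothetical stand-in for the "SPIBaseClass" mentioned above, not a Lucene type, and the module-info lines in the comments are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

final class SpiLookupSketch {
  /** Hypothetical SPI base type standing in for "SPIBaseClass". */
  public interface AnalysisFactory {
    String name();
  }

  // When the consumer is a named module, its module-info.java must declare the
  // lookup, for example:
  //   uses some.pkg.AnalysisFactory;
  // and each provider module declares:
  //   provides some.pkg.AnalysisFactory with some.pkg.impl.MyFactory;
  // On the plain class path no such declarations are needed.
  static List<String> providerNames() {
    List<String> names = new ArrayList<>();
    for (AnalysisFactory factory : ServiceLoader.load(AnalysisFactory.class)) {
      names.add(factory.name());
    }
    return names;
  }
}
```

With no provider registered the list is simply empty; the implementation packages can stay unexported because consumers only ever see the SPI type.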




[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450690#comment-17450690
 ] 

Dawid Weiss commented on LUCENE-10255:
--

I accept your arguments, even if I disagree with them, Uwe. I provided a PR to 
change it already.




[GitHub] [lucene] dweiss commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


dweiss commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758670507



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} (${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} ${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   The problem may be in the order of how all these properties are set (and 
when) - this is the dark pit I wouldn't want to go into... Manifest attributes 
should be resolved lazily but I've had mixed results with this. I'll see if it 
works.
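
The eager-versus-lazy concern can be sketched in plain Java with a `Supplier`, as a rough analogue of the lazy `${-> ...}` form in the diff. Everything here is an assumption made for illustration: the class name, the placeholder path, and the idea that the project path only becomes final after the attribute is declared.

```java
import java.util.function.Supplier;

final class LazyAttributeSketch {
  /** Mirrors the string transformation from the diff: ":a:b-c" becomes "a.b_c". */
  static String toModuleSuffix(String projectPath) {
    return projectPath.replaceFirst(":", "").replace(':', '.').replace("-", "_");
  }

  /** Returns {eagerValue, lazyValue} to show how evaluation order changes the result. */
  static String[] demo() {
    String[] path = {":placeholder"}; // assumed value before configuration finishes
    // Eager: the attribute value is computed immediately from the current state.
    String eager = "org.apache." + toModuleSuffix(path[0]);
    // Lazy: the attribute value is computed only when get() is called.
    Supplier<String> lazy = () -> "org.apache." + toModuleSuffix(path[0]);
    path[0] = ":lucene:core"; // configuration completes later
    return new String[] {eager, lazy.get()};
  }
}
```

The eager value is stuck with whatever was set at declaration time, while the lazy one sees the final path, which is why the order in which properties are set matters.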




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] jtibshirani commented on a change in pull request #416: LUCENE-10054 Make HnswGraph hierarchical

2021-11-29 Thread GitBox


jtibshirani commented on a change in pull request #416:
URL: https://github.com/apache/lucene/pull/416#discussion_r758678766



##
File path: lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraph.java
##
@@ -56,31 +59,50 @@
 public final class HnswGraph extends KnnGraphValues {
 
   private final int maxConn;
+  private int numLevels; // the current number of levels in the graph
+  private int entryNode; // the current graph entry node on the top level
 
-  // Each entry lists the top maxConn neighbors of a node. The nodes correspond to vectors added to
-  // HnswBuilder, and the
-  // node values are the ordinals of those vectors.
-  private final List<NeighborArray> graph;
+  // Nodes by level expressed as the level 0's nodes' ordinals.
+  // As level 0 contains all nodes, nodesByLevel.get(0) is null.
+  private final List<int[]> nodesByLevel;
+
+  // graph is a list of graph levels.
+  // Each level is represented as List<NeighborArray> – nodes' connections on this level.
+  // Each entry in the list has the top maxConn neighbors of a node. The nodes correspond to vectors
+  // added to HnswBuilder, and the node values are the ordinals of those vectors.
+  // Thus, on all levels, neighbors expressed as the level 0's nodes' ordinals.
+  private final List<List<NeighborArray>> graph;
 
   // KnnGraphValues iterator members
   private int upto;
   private NeighborArray cur;
 
-  HnswGraph(int maxConn) {
-    graph = new ArrayList<>();
-    // Typically with diversity criteria we see nodes not fully occupied; average fanout seems to be
-    // about 1/2 maxConn. There is some indexing time penalty for under-allocating, but saves RAM
-    graph.add(new NeighborArray(Math.max(32, maxConn / 4)));
+  HnswGraph(int maxConn, int levelOfFirstNode) {
     this.maxConn = maxConn;
+    this.numLevels = levelOfFirstNode + 1;
+    this.graph = new ArrayList<>(numLevels);
+    this.entryNode = 0;
+    for (int i = 0; i < numLevels; i++) {
+      graph.add(new ArrayList<>());
+      // Typically with diversity criteria we see nodes not fully occupied;
+      // average fanout seems to be about 1/2 maxConn.
+      // There is some indexing time penalty for under-allocating, but saves RAM
+      graph.get(i).add(new NeighborArray(Math.max(32, maxConn / 4)));
+    }
+
+    this.nodesByLevel = new ArrayList<>(numLevels);
+    nodesByLevel.add(null); // we don't need this for 0th level, as it contains all nodes
+    for (int l = 1; l < numLevels; l++) {
+      nodesByLevel.add(new int[] {0});
+    }
   }
 
   /**
-   * Searches for the nearest neighbors of a query vector.
+   * Searches HNSW graph for the nearest neighbors of a query vector.
    *
    * @param query search query vector
    * @param topK the number of nodes to be returned
-   * @param numSeed the size of the queue maintained while searching, and controls the number of
-   * random entry points to sample
+   * @param numSeed the size of the queue maintained while searching

Review comment:
   It works for me to have a separate discussion. Maybe at least in this PR 
we can rename this to `numCandidates`, since the 'seed' naming no longer makes 
sense?
   
   As context, I still think it makes sense to remove the `numCandidates` vs. 
`k` distinction in `HnswGraph`. The public signature `KnnVectorsReader#search` 
does not include a notion of "num candidates", so users have no way to even use 
this distinction. I'd be in favor of removing it from `HnswGraph`, then having 
a follow-up discussion about whether the vector search APIs should handle 
`numCandidates` vs `k`.
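
For readers following the `numCandidates` vs. `k` distinction, a simplified single-layer best-first search is sketched below. It is not Lucene's `HnswGraph` code: the class, method and parameter layout are hypothetical, the graph is a plain adjacency array, and the distance is squared Euclidean. It only shows that the bounded queue size (`numCandidates`) controls how much of the graph is explored, while `k` controls how many results are returned.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

final class BeamSearchSketch {
  static float squaredDistance(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  /** Explores the graph with a result queue of size numCandidates, then returns the k closest. */
  static List<Integer> search(float[] query, float[][] vectors, int[][] neighbors,
      int entryNode, int numCandidates, int k) {
    if (k > numCandidates) {
      throw new IllegalArgumentException("k must not exceed numCandidates");
    }
    float[] dist = new float[vectors.length]; // filled in as nodes are visited
    Set<Integer> visited = new HashSet<>();
    // frontier: closest unexplored node first; results: farthest kept node first.
    PriorityQueue<Integer> frontier =
        new PriorityQueue<>((a, b) -> Float.compare(dist[a], dist[b]));
    PriorityQueue<Integer> results =
        new PriorityQueue<>((a, b) -> Float.compare(dist[b], dist[a]));

    dist[entryNode] = squaredDistance(query, vectors[entryNode]);
    visited.add(entryNode);
    frontier.add(entryNode);
    results.add(entryNode);

    while (!frontier.isEmpty()) {
      int current = frontier.poll();
      // Stop once the best unexplored node is worse than the worst kept result.
      if (results.size() >= numCandidates && dist[current] > dist[results.peek()]) {
        break;
      }
      for (int n : neighbors[current]) {
        if (!visited.add(n)) {
          continue; // already seen
        }
        dist[n] = squaredDistance(query, vectors[n]);
        if (results.size() < numCandidates || dist[n] < dist[results.peek()]) {
          frontier.add(n);
          results.add(n);
          if (results.size() > numCandidates) {
            results.poll(); // drop the farthest kept node
          }
        }
      }
    }
    while (results.size() > k) {
      results.poll();
    }
    List<Integer> topK = new ArrayList<>(results);
    topK.sort((a, b) -> Float.compare(dist[a], dist[b]));
    return topK;
  }
}
```

If callers can only pass `k`, the two parameters collapse into one, which is the simplification proposed above.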







[GitHub] [lucene-solr] madrob merged pull request #2615: SOLR-14412 NPE in MetricsHistoryHandler

2021-11-29 Thread GitBox


madrob merged pull request #2615:
URL: https://github.com/apache/lucene-solr/pull/2615


   





[GitHub] [lucene] dweiss commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


dweiss commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758689212



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} (${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} ${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   Ok, take a look now. Try: gradlew showModuleNames







[GitHub] [lucene] dweiss commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


dweiss commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758691210



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} (${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} ${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   There is a functional change piggybacked in the last commit: javadoc 
and source jars no longer receive an automatic module name. I consider it a fix 
for something that wasn't right (these JARs are not modules).







[GitHub] [lucene] uschindler commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


uschindler commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758697734



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} (${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} ${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   Will check soon. I am not yet sure what the local name should look 
like.







[GitHub] [lucene] dweiss commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


dweiss commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758700461



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} (${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} ${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   Sure. Let me know (or commit the changes to this PR).







[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Christian Stein (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450707#comment-17450707
 ] 

Christian Stein commented on LUCENE-10255:
--

FWIW, I agree with Uwe on the naming topic and want to add "prior art" samples 
from other `org.apache.*` projects already shipping as Java modules whose 
module names start with `org.apache.`: Derby, Felix, POI, Tomcat, and Wicket.

https://github.com/sormuras/modules/blob/be524907f29f60c7895b3cde62850a1937969ad7/com.github.sormuras.modules/com/github/sormuras/modules/modules.properties#L2480-L2551




[GitHub] [lucene] uschindler commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


uschindler commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758715684



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} (${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} ${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   Hi, so it looks like this:
   
   ```
   > Task :showModuleNames
   lucene-benchmark-10.0.0-SNAPSHOT.jar           -> org.apache.lucene.benchmark
   lucene-backward-codecs-10.0.0-SNAPSHOT.jar     -> org.apache.lucene.backward_codecs
   lucene-classification-10.0.0-SNAPSHOT.jar      -> org.apache.lucene.classification
   lucene-codecs-10.0.0-SNAPSHOT.jar              -> org.apache.lucene.codecs
   lucene-core-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.core
   lucene-demo-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.demo
   lucene-expressions-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.expressions
   lucene-facet-10.0.0-SNAPSHOT.jar               -> org.apache.lucene.facet
   lucene-grouping-10.0.0-SNAPSHOT.jar            -> org.apache.lucene.grouping
   lucene-highlighter-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.highlighter
   lucene-join-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.join
   lucene-luke-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.luke
   lucene-memory-10.0.0-SNAPSHOT.jar              -> org.apache.lucene.memory
   lucene-misc-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.misc
   lucene-monitor-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.monitor
   lucene-queries-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.queries
   lucene-queryparser-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.queryparser
   lucene-replicator-10.0.0-SNAPSHOT.jar          -> org.apache.lucene.replicator
   lucene-sandbox-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.sandbox
   lucene-spatial-extras-10.0.0-SNAPSHOT.jar      -> org.apache.lucene.spatial_extras
   lucene-spatial3d-10.0.0-SNAPSHOT.jar           -> org.apache.lucene.spatial3d
   lucene-suggest-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.suggest
   lucene-test-framework-10.0.0-SNAPSHOT.jar      -> org.apache.lucene.test_framework
   lucene-analysis-common-10.0.0-SNAPSHOT.jar     -> org.apache.lucene.common
   lucene-analysis-icu-10.0.0-SNAPSHOT.jar        -> org.apache.lucene.icu
   lucene-analysis-kuromoji-10.0.0-SNAPSHOT.jar   -> org.apache.lucene.kuromoji
   lucene-analysis-morfologik-10.0.0-SNAPSHOT.jar -> org.apache.lucene.morfologik
   lucene-analysis-nori-10.0.0-SNAPSHOT.jar       -> org.apache.lucene.nori
   lucene-analysis-opennlp-10.0.0-SNAPSHOT.jar    -> org.apache.lucene.opennlp
   lucene-analysis-phonetic-10.0.0-SNAPSHOT.jar   -> org.apache.lucene.phonetic
   lucene-analysis-smartcn-10.0.0-SNAPSHOT.jar    -> org.apache.lucene.smartcn
   lucene-analysis-stempel-10.0.0-SNAPSHOT.jar    -> org.apache.lucene.stempel
   ```
   
   I liked the previous names more, because now "analysis" is missing. @rmuir, 
do you have any suggestion? Maybe we should do something like this:
   
   ```
   "${-> project.group.toString() + "." + project.path.replaceFirst(":lucene:", "").replace(':', '.').replace("-", "_")}"
   ```
   
   So maybe ask others on mailing list.
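
As a rough, non-authoritative sketch, the two name expressions quoted in this thread can be reproduced in plain Java. The project paths (`:lucene:backward-codecs`, `:lucene:analysis:common`) and the group value `org.apache.lucene` are assumptions for illustration, and the class and method names are made up; the actual Gradle evaluation may differ.

```java
final class ModuleNameSketch {
  /** Expression from the diff: a fixed "org.apache." prefix plus the full project path. */
  static String fromDiff(String projectPath) {
    return "org.apache."
        + projectPath.replaceFirst(":", "").replace(':', '.').replace("-", "_");
  }

  /** Suggested variant: the project group plus the path below ":lucene:". */
  static String fromGroup(String group, String projectPath) {
    return group + "."
        + projectPath.replaceFirst(":lucene:", "").replace(':', '.').replace("-", "_");
  }

  public static void main(String[] args) {
    System.out.println(fromDiff(":lucene:backward-codecs"));
    System.out.println(fromGroup("org.apache.lucene", ":lucene:analysis:common"));
  }
}
```

Under these assumptions both expressions keep the intermediate "analysis" segment, since it is part of the project path; they differ only in where the prefix comes from.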







[GitHub] [lucene] uschindler commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


uschindler commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758715684



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} 
(${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} 
${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", 
"").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> 
project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   Hi, so it looks like this:
   
   ```
   > Task :showModuleNames
   lucene-benchmark-10.0.0-SNAPSHOT.jar           -> org.apache.lucene.benchmark
   lucene-backward-codecs-10.0.0-SNAPSHOT.jar     -> org.apache.lucene.backward_codecs
   lucene-classification-10.0.0-SNAPSHOT.jar      -> org.apache.lucene.classification
   lucene-codecs-10.0.0-SNAPSHOT.jar              -> org.apache.lucene.codecs
   lucene-core-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.core
   lucene-demo-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.demo
   lucene-expressions-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.expressions
   lucene-facet-10.0.0-SNAPSHOT.jar               -> org.apache.lucene.facet
   lucene-grouping-10.0.0-SNAPSHOT.jar            -> org.apache.lucene.grouping
   lucene-highlighter-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.highlighter
   lucene-join-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.join
   lucene-luke-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.luke
   lucene-memory-10.0.0-SNAPSHOT.jar              -> org.apache.lucene.memory
   lucene-misc-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.misc
   lucene-monitor-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.monitor
   lucene-queries-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.queries
   lucene-queryparser-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.queryparser
   lucene-replicator-10.0.0-SNAPSHOT.jar          -> org.apache.lucene.replicator
   lucene-sandbox-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.sandbox
   lucene-spatial-extras-10.0.0-SNAPSHOT.jar      -> org.apache.lucene.spatial_extras
   lucene-spatial3d-10.0.0-SNAPSHOT.jar           -> org.apache.lucene.spatial3d
   lucene-suggest-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.suggest
   lucene-test-framework-10.0.0-SNAPSHOT.jar      -> org.apache.lucene.test_framework
   lucene-analysis-common-10.0.0-SNAPSHOT.jar     -> org.apache.lucene.common
   lucene-analysis-icu-10.0.0-SNAPSHOT.jar        -> org.apache.lucene.icu
   lucene-analysis-kuromoji-10.0.0-SNAPSHOT.jar   -> org.apache.lucene.kuromoji
   lucene-analysis-morfologik-10.0.0-SNAPSHOT.jar -> org.apache.lucene.morfologik
   lucene-analysis-nori-10.0.0-SNAPSHOT.jar       -> org.apache.lucene.nori
   lucene-analysis-opennlp-10.0.0-SNAPSHOT.jar    -> org.apache.lucene.opennlp
   lucene-analysis-phonetic-10.0.0-SNAPSHOT.jar   -> org.apache.lucene.phonetic
   lucene-analysis-smartcn-10.0.0-SNAPSHOT.jar    -> org.apache.lucene.smartcn
   lucene-analysis-stempel-10.0.0-SNAPSHOT.jar    -> org.apache.lucene.stempel
   ```
   
   I liked the previous names more, because now "analysis" is missing, @rmuir 
do you have any suggestion? Maybe we should do something like this:
   
   ```
   "${-> project.group.toString() + "." + project.path.replaceFirst(":lucene:", 
"").replace(':', '.').replace("-", "_")}"
   ```
   
   So maybe we should ask others on the mailing list.
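
The effect of this suggestion on nested project paths can be sketched in plain Java. This is only an illustration, assuming `project.group` is `org.apache.lucene` and that the analysis projects have paths such as `:lucene:analysis:common`; the class and method names are hypothetical and not part of the build.

```java
// Hypothetical stand-alone illustration; not part of the Lucene build.
final class GroupPrefixedModuleName {
  // Mirrors the suggested Groovy expression:
  //   project.group + "." + project.path.replaceFirst(":lucene:", "")
  //       .replace(':', '.').replace("-", "_")
  static String moduleName(String group, String projectPath) {
    return group + "."
        + projectPath
            .replaceFirst(":lucene:", "") // drop the leading ":lucene:" segment
            .replace(':', '.')            // nested projects become dotted segments
            .replace("-", "_");           // hyphens are not allowed in module names
  }

  public static void main(String[] args) {
    String group = "org.apache.lucene"; // assumption: the value of project.group
    String[] paths = {":lucene:core", ":lucene:backward-codecs", ":lucene:analysis:common"};
    for (String path : paths) {
      System.out.println(path + " -> " + moduleName(group, path));
    }
  }
}
```

Under these assumptions a nested path contributes a dot, and only a hyphen inside a project name turns into an underscore.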




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] uschindler commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


uschindler commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758717236



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} 
(${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} 
${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", 
"").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> 
project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   Anyways, thanks for fixing the bug with the wrong manifest on 
non-library JARs (javadocs).
   
   One thing to keep in mind: When we add module-info.java, we need to remove 
the attribute, otherwise we have a duplicate.
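
To find out whether a JAR still carries the attribute before a `module-info.java` is added, one could read it from the manifest. The following is a minimal sketch using `java.util.jar.Manifest`; the class name, helper names, and the in-memory manifest text are hypothetical and only serve the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.jar.Manifest;

// Hypothetical helper; not part of the Lucene build.
final class AutomaticModuleNameCheck {
  // Returns the Automatic-Module-Name main attribute, if the manifest declares one.
  static Optional<String> automaticModuleName(Manifest manifest) {
    return Optional.ofNullable(
        manifest.getMainAttributes().getValue("Automatic-Module-Name"));
  }

  // Parses manifest text held in memory; a real check would read META-INF/MANIFEST.MF from the JAR.
  static Manifest parse(String manifestText) {
    try {
      return new Manifest(
          new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Manifest withName =
        parse("Manifest-Version: 1.0\nAutomatic-Module-Name: org.apache.lucene.core\n");
    Manifest withoutName = parse("Manifest-Version: 1.0\n");
    System.out.println(automaticModuleName(withName).orElse("<none>"));
    System.out.println(automaticModuleName(withoutName).orElse("<none>"));
  }
}
```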







[GitHub] [lucene] uschindler commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


uschindler commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758718326



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} 
(${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} 
${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", 
"").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> 
project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   With my change it would look like this:
   
   ```
   > Task :showModuleNames
   lucene-benchmark-10.0.0-SNAPSHOT.jar           -> org.apache.lucene.benchmark
   lucene-backward-codecs-10.0.0-SNAPSHOT.jar     -> org.apache.lucene.backward_codecs
   lucene-classification-10.0.0-SNAPSHOT.jar      -> org.apache.lucene.classification
   lucene-codecs-10.0.0-SNAPSHOT.jar              -> org.apache.lucene.codecs
   lucene-core-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.core
   lucene-demo-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.demo
   lucene-expressions-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.expressions
   lucene-facet-10.0.0-SNAPSHOT.jar               -> org.apache.lucene.facet
   lucene-grouping-10.0.0-SNAPSHOT.jar            -> org.apache.lucene.grouping
   lucene-highlighter-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.highlighter
   lucene-join-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.join
   lucene-luke-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.luke
   lucene-memory-10.0.0-SNAPSHOT.jar              -> org.apache.lucene.memory
   lucene-misc-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.misc
   lucene-monitor-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.monitor
   lucene-queries-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.queries
   lucene-queryparser-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.queryparser
   lucene-replicator-10.0.0-SNAPSHOT.jar          -> org.apache.lucene.replicator
   lucene-sandbox-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.sandbox
   lucene-spatial-extras-10.0.0-SNAPSHOT.jar      -> org.apache.lucene.spatial_extras
   lucene-spatial3d-10.0.0-SNAPSHOT.jar           -> org.apache.lucene.spatial3d
   lucene-suggest-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.suggest
   lucene-test-framework-10.0.0-SNAPSHOT.jar      -> org.apache.lucene.test_framework
   lucene-analysis-common-10.0.0-SNAPSHOT.jar     -> org.apache.lucene.analysis.common
   lucene-analysis-icu-10.0.0-SNAPSHOT.jar        -> org.apache.lucene.analysis.icu
   lucene-analysis-kuromoji-10.0.0-SNAPSHOT.jar   -> org.apache.lucene.analysis.kuromoji
   lucene-analysis-morfologik-10.0.0-SNAPSHOT.jar -> org.apache.lucene.analysis.morfologik
   lucene-analysis-nori-10.0.0-SNAPSHOT.jar       -> org.apache.lucene.analysis.nori
   lucene-analysis-opennlp-10.0.0-SNAPSHOT.jar    -> org.apache.lucene.analysis.opennlp
   lucene-analysis-phonetic-10.0.0-SNAPSHOT.jar   -> org.apache.lucene.analysis.phonetic
   lucene-analysis-smartcn-10.0.0-SNAPSHOT.jar    -> org.apache.lucene.analysis.smartcn
   lucene-analysis-stempel-10.0.0-SNAPSHOT.jar    -> org.apache.lucene.analysis.stempel
   ```







[GitHub] [lucene] rmuir commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


rmuir commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758721225



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} 
(${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} 
${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", 
"").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> 
project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   hmm, for sure I'm unhappy about the "analysis" missing, because 
`org.apache.lucene.common` seems pretty ambiguous. But I'd be happy with 
`org.apache.lucene.analysis_common` which is what I think your suggestion would 
create?
   
   But +1 to iterate here a little bit more (can we get "analysis" in the name 
some way or another), and then ping the mailing list with a printout just like 
what you showed here. If anyone has a strong opinion, they will have had a chance 
to voice it.







[GitHub] [lucene] rmuir commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


rmuir commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758723885



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} 
(${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} 
${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", 
"").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> 
project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   That's also great to me (in a way, preferred over the underscores for 
the analysis ones, but I didn't want to suggest special-casing it in the gradle 
build).







[GitHub] [lucene] uschindler commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


uschindler commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758723981



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} 
(${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} 
${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", 
"").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> 
project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   We might have overlapped: I copy-pasted my idea. Thanks @dweiss for the 
printout possibility. This also helps when migrating to module-info.java.







[GitHub] [lucene] rmuir commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


rmuir commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758728664



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} 
(${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} 
${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", 
"").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> 
project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   By the way (sorry, it's off-topic), I just realized that in 9.0 we changed 
the Maven names of all the analyzers from `lucene-analyzers-xxx` to 
`lucene-analysis-xxx`. Do we call this out prominently somewhere, and can we add 
a note here if we don't? I would anticipate lots of questions if anyone upgrades, 
because they may not expect to have to adjust this in their build. Sorry, I only 
just thought of it; I'm trying to reduce respins :)
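
For anyone adjusting a build, the rename described here amounts to a prefix swap on the artifact ID. The sketch below illustrates that mapping; it assumes every affected artifact follows the `lucene-analyzers-xxx` to `lucene-analysis-xxx` pattern stated above, and the class and method names are hypothetical.

```java
// Hypothetical helper illustrating the artifact rename; not part of Lucene.
final class AnalysisArtifactRename {
  private static final String OLD_PREFIX = "lucene-analyzers-";
  private static final String NEW_PREFIX = "lucene-analysis-";

  // Maps a pre-9.0 artifact ID to its 9.0 name; other artifact IDs are returned unchanged.
  static String renamedArtifactId(String artifactId) {
    return artifactId.startsWith(OLD_PREFIX)
        ? NEW_PREFIX + artifactId.substring(OLD_PREFIX.length())
        : artifactId;
  }

  public static void main(String[] args) {
    for (String id : new String[] {"lucene-analyzers-common", "lucene-analyzers-icu", "lucene-core"}) {
      System.out.println(id + " -> " + renamedArtifactId(id));
    }
  }
}
```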







[GitHub] [lucene] uschindler commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


uschindler commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758729874



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} 
(${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} 
${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", 
"").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> 
project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   Yes, it was changed to conform to the source directory names. The 
"analyzers" name was wrong.
   
   But I agree, we should add a note to the MIGRATE.md file! Thanks @rmuir !







[GitHub] [lucene] dweiss commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


dweiss commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758732884



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} 
(${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} 
${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", 
"").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> 
project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   Artifact rename is mentioned in migration:
   ```
   ## Rename of binary artifacts from '**-analyzers-**' to '**-analysis-**' 
(LUCENE-9562)
   ```
   
   So... what should I do about module names then?... I'm not sure what the 
outcome of the discussion is. Perhaps we should do this - tweak the names in a 
way you like it and commit (or provide a change suggestion)?
   







[GitHub] [lucene] uschindler commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


uschindler commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758733113



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} 
(${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} 
${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", 
"").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> 
project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   I rewrote it a bit. If nobody objects, I'd replace the line in Dawid's 
code with:
   
   ```
   manifestAttrs["Automatic-Module-Name"] = "${-> project.path.replaceFirst(/^:lucene/, project.group as String).replace(':', '.').replace('-', '_')}"
   ```
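
The rewritten expression replaces the leading `:lucene` path segment with the project group instead of prepending a fixed prefix. A rough Java equivalent is sketched below, assuming `project.group` is `org.apache.lucene`; the class and method names are hypothetical, and `Matcher.quoteReplacement` is a Java-specific precaution that the Groovy line does not contain.

```java
import java.util.regex.Matcher;

// Hypothetical stand-alone illustration; not part of the Lucene build.
final class AnchoredModuleName {
  // Mirrors: project.path.replaceFirst(/^:lucene/, project.group as String)
  //              .replace(':', '.').replace('-', '_')
  static String moduleName(String group, String projectPath) {
    return projectPath
        // "^" anchors the match, so only a leading ":lucene" is swapped for the group.
        .replaceFirst("^:lucene", Matcher.quoteReplacement(group))
        .replace(':', '.')  // remaining path separators become dots
        .replace('-', '_'); // hyphens become underscores
  }

  public static void main(String[] args) {
    String group = "org.apache.lucene"; // assumption: the value of project.group
    String[] paths = {":lucene:test-framework", ":lucene:spatial3d", ":lucene:analysis:icu"};
    for (String path : paths) {
      System.out.println(path + " -> " + moduleName(group, path));
    }
  }
}
```

Note that the final two replacements run over the whole string, so a group containing a hyphen would be rewritten as well; with the assumed group this makes no difference.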







[GitHub] [lucene] rmuir commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


rmuir commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758737615



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} 
(${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} 
${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", 
"").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> 
project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   > Artifact rename is mentioned in migration:
   >
   > ## Rename of binary artifacts from '**-analyzers-**' to '**-analysis-**' 
(LUCENE-9562)
   
   OK I will make a separate PR with my suggestions. Sorry for creating noise 
on the issue, but Uwe's questions here had me snooping around maven doing 
inspections, and that's why I noticed it. I'd like to move this up "higher" in 
the file and just add a simple list of old/new maven coordinates for each 
affected jar. I think it would be a bit more verbose but useful to almost 
anyone upgrading the library.







[GitHub] [lucene] dweiss commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


dweiss commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758742625



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} 
(${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} 
${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", 
"").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> 
project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   Sure, please do. It must have been at the top at some point... The 
structure of this file was never too clear to me - whether it's prioritized or 
just listed chronologically.







[GitHub] [lucene] uschindler commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


uschindler commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758744591



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} 
(${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} 
${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", 
"").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> 
project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   I committed my change. I will post the output on the mailing list to keep 
the others informed.
   
   Nevertheless, we should still not make the module system public for the 9.0 
release, as this may lead to too many questions. Once we have real module-info 
files and have tested everything, we can make it public.
   
   With my complaint I just wanted to make sure that at least the module names 
follow community standards and the suggestions by Oracle. I know @dweiss 
does not agree, but let's present this to the committers on the ML.







[GitHub] [lucene] dweiss commented on a change in pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


dweiss commented on a change in pull request #487:
URL: https://github.com/apache/lucene/pull/487#discussion_r758746111



##
File path: gradle/java/jar-manifest.gradle
##
@@ -66,7 +66,7 @@ subprojects {
   "X-Build-JDK"   : "${System.properties['java.version']} 
(${System.properties['java.vendor']} ${System.properties['java.vm.version']})",
   "X-Build-OS": "${System.properties['os.name']} 
${System.properties['os.arch']} ${System.properties['os.version']}",
 
-  "Automatic-Module-Name" : "${-> project.path.replaceFirst(":", 
"").replace(':', '.').replace("-", "_")}"
+  "Automatic-Module-Name" : "org.apache.${-> 
project.path.replaceFirst(":", "").replace(':', '.').replace("-", "_")}"

Review comment:
   That's fine, Uwe - I'll live with it. As for the module system - I think 
it can be mentioned that it is preliminary support but at least module names 
will remain the same. It's better than nothing...







[GitHub] [lucene] rmuir opened a new pull request #488: Improve MIGRATE.md around analyzers artifacts.

2021-11-29 Thread GitBox


rmuir opened a new pull request #488:
URL: https://github.com/apache/lucene/pull/488


   Move this to the very top of MIGRATE, the user needs to first be able to 
pull in the artifacts, before doing anything else like trying to compile, deal 
with renamed classes, etc.
   
   Add a table of each package that got moved, with explicit old and new names. 
Hopefully it helps search engines and users.
   
   @jpountz I'd like to backport this to 9.0 if possible, since we are 
respinning for module names anyway. It is low risk.
   
   
   
   





[GitHub] [lucene] uschindler commented on pull request #488: Improve MIGRATE.md around analyzers artifacts.

2021-11-29 Thread GitBox


uschindler commented on pull request #488:
URL: https://github.com/apache/lucene/pull/488#issuecomment-982046022


   A separate note: the new name also conforms better to what the modules 
really do. They are not only "analyzers"; they are components for "analysis" of 
text while indexing/searching with Lucene. So the new name is better. Maybe add 
this to the explanation.





[GitHub] [lucene] uschindler commented on pull request #488: Improve MIGRATE.md around analyzers artifacts.

2021-11-29 Thread GitBox


uschindler commented on pull request #488:
URL: https://github.com/apache/lucene/pull/488#issuecomment-982048976


   Table looks much better now. Thanks!





[GitHub] [lucene] uschindler commented on pull request #487: LUCENE-10234: Change module prefix to org.apache.*

2021-11-29 Thread GitBox


uschindler commented on pull request #487:
URL: https://github.com/apache/lucene/pull/487#issuecomment-982055622


   I added a change to the CHANGES.txt file to explain that the automatic 
module names are a preparation for full module system support. This should be 
added, because we have not fully tested that everything works well with 
automatic module names.
   
   Also I added a note that the module names should not be considered "stable". 
OK?





[GitHub] [lucene] rmuir commented on pull request #488: Improve MIGRATE.md around analyzers artifacts.

2021-11-29 Thread GitBox


rmuir commented on pull request #488:
URL: https://github.com/apache/lucene/pull/488#issuecomment-982055841


   Sorry for all the commits, I wanted to try to make this easy to read and 
prominent, and linked from the README too for more visibility. 
   
   I realize the `MIGRATE.md` is an unstructured list, but there's an advantage 
to listing some stuff at the top (esp. if it is likely to impact most users 
that upgrade). I don't want to hold up the 9.0 release, but maybe for the next 
one we can improve it to be better, some ideas:
   * avoid usage of abbreviations in our MIGRATE notes such as 
`o.a.l.a.util.TokenizerFactory`
   * avoid lists of from-to stuff and use tables (like this one as an example)
   * structure the high-impacting stuff such as package and class renames at 
the top.





[GitHub] [lucene] uschindler commented on pull request #488: Improve MIGRATE.md around analyzers artifacts.

2021-11-29 Thread GitBox


uschindler commented on pull request #488:
URL: https://github.com/apache/lucene/pull/488#issuecomment-982058390


   Originally I made the MIGRATE.md a markdown file to have all formatting 
possibilities. The unordered list was just a "quick conversion" from the old 
format introduced in Lucene 4.0.
   
   The Markdown converter accepts all markdown in the `gradlew documentation` 
output and also expands LUCENE/SOLR issue numbers (and makes them clickable): 
https://github.com/apache/lucene/blob/main/gradle/documentation/markdown.gradle





[GitHub] [lucene] rmuir commented on pull request #488: Improve MIGRATE.md around analyzers artifacts.

2021-11-29 Thread GitBox


rmuir commented on pull request #488:
URL: https://github.com/apache/lucene/pull/488#issuecomment-982061761


   > A separate note: The new name also conforms better to what the modules 
really do: They are not only "analyzers"; they are components for "analysis" of 
text while indexing/searching with Lucene. So the new name is better. Maybe add 
this to the explanation.
   
   I'd rather not mix in justification/reasoning for any changes in this file, 
I think it adds noise. Most users will be annoyed with us regardless :)  I 
think this file should just be simple hints of how to fix your code? We list 
the JIRA issue for each change already, in case an interested party wants to 
drill down to background discussion of why changes were done, or get any more 
detailed information.





[GitHub] [lucene] uschindler commented on pull request #488: Improve MIGRATE.md around analyzers artifacts.

2021-11-29 Thread GitBox


uschindler commented on pull request #488:
URL: https://github.com/apache/lucene/pull/488#issuecomment-982064591


   This was just meant as a replacement for this text: "and are now consistent 
with repository module 'analysis'". This does not sound like an acceptable 
explanation to an annoyed user, so my idea was to just say: "better name 
because it does more than providing analyzers".
   
   But all fine, was just my 2ct.





[GitHub] [lucene] rmuir merged pull request #488: Improve MIGRATE.md around analyzers artifacts.

2021-11-29 Thread GitBox


rmuir merged pull request #488:
URL: https://github.com/apache/lucene/pull/488


   





[GitHub] [lucene] uschindler commented on pull request #488: Improve MIGRATE.md around analyzers artifacts.

2021-11-29 Thread GitBox


uschindler commented on pull request #488:
URL: https://github.com/apache/lucene/pull/488#issuecomment-982072123


   Hi Robert,
   I built the documentation with "gradlew documentation" and noticed that 
tables were not enabled in the markdown converter. It does not look good now.
   
   If you don't mind I will add a change to fix the converter:
   
![image](https://user-images.githubusercontent.com/1005388/143951064-994b4563-018d-4cf3-9750-09c092d7c32e.png)
   
   So please wait a bit with merging!
   





[GitHub] [lucene] rmuir opened a new pull request #489: support tables in generated html documentation

2021-11-29 Thread GitBox


rmuir opened a new pull request #489:
URL: https://github.com/apache/lucene/pull/489


   https://github.com/apache/lucene/pull/488 added a table of analyzer artifact 
changes.
   
   Unfortunately it looks like crap in generated HTML unless we bring in the 
tables extension.
   
   Before:
   
   ![Screen_Shot_2021-11-29_at_17 20 15](https://user-images.githubusercontent.com/504194/143952027-9744fbc9-0f8d-46cd-8575-604e54fded56.png)
   
   After:
   
   ![Screen_Shot_2021-11-29_at_17 20 26](https://user-images.githubusercontent.com/504194/143952047-3cd0b6fe-5a21-4f1f-bd38-bf43515b0829.png)
   





[GitHub] [lucene] uschindler commented on pull request #489: support tables in generated html documentation

2021-11-29 Thread GitBox


uschindler commented on pull request #489:
URL: https://github.com/apache/lucene/pull/489#issuecomment-982080248


   I created the same PR but did not yet commit it.
   
   Looks identical here, I just reformatted the long line with the extensions.
   
   +1 to commit.





[GitHub] [lucene] uschindler commented on pull request #489: support tables in generated html documentation

2021-11-29 Thread GitBox


uschindler commented on pull request #489:
URL: https://github.com/apache/lucene/pull/489#issuecomment-982083351


   +1





[GitHub] [lucene] rmuir merged pull request #489: support tables in generated html documentation

2021-11-29 Thread GitBox


rmuir merged pull request #489:
URL: https://github.com/apache/lucene/pull/489


   





[jira] [Commented] (LUCENE-10255) Fully embrace the java module system

2021-11-29 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450762#comment-17450762
 ] 

Uwe Schindler commented on LUCENE-10255:


Thanks [~sor] for confirmation.

We now have the following names generated from the gradle build:
{noformat}
> Task :showModuleNames
lucene-benchmark-10.0.0-SNAPSHOT.jar           -> org.apache.lucene.benchmark
lucene-backward-codecs-10.0.0-SNAPSHOT.jar     -> org.apache.lucene.backward_codecs
lucene-classification-10.0.0-SNAPSHOT.jar      -> org.apache.lucene.classification
lucene-codecs-10.0.0-SNAPSHOT.jar              -> org.apache.lucene.codecs
lucene-core-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.core
lucene-demo-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.demo
lucene-expressions-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.expressions
lucene-facet-10.0.0-SNAPSHOT.jar               -> org.apache.lucene.facet
lucene-grouping-10.0.0-SNAPSHOT.jar            -> org.apache.lucene.grouping
lucene-highlighter-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.highlighter
lucene-join-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.join
lucene-luke-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.luke
lucene-memory-10.0.0-SNAPSHOT.jar              -> org.apache.lucene.memory
lucene-misc-10.0.0-SNAPSHOT.jar                -> org.apache.lucene.misc
lucene-monitor-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.monitor
lucene-queries-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.queries
lucene-queryparser-10.0.0-SNAPSHOT.jar         -> org.apache.lucene.queryparser
lucene-replicator-10.0.0-SNAPSHOT.jar          -> org.apache.lucene.replicator
lucene-sandbox-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.sandbox
lucene-spatial-extras-10.0.0-SNAPSHOT.jar      -> org.apache.lucene.spatial_extras
lucene-spatial3d-10.0.0-SNAPSHOT.jar           -> org.apache.lucene.spatial3d
lucene-suggest-10.0.0-SNAPSHOT.jar             -> org.apache.lucene.suggest
lucene-test-framework-10.0.0-SNAPSHOT.jar      -> org.apache.lucene.test_framework
lucene-analysis-common-10.0.0-SNAPSHOT.jar     -> org.apache.lucene.analysis.common
lucene-analysis-icu-10.0.0-SNAPSHOT.jar        -> org.apache.lucene.analysis.icu
lucene-analysis-kuromoji-10.0.0-SNAPSHOT.jar   -> org.apache.lucene.analysis.kuromoji
lucene-analysis-morfologik-10.0.0-SNAPSHOT.jar -> org.apache.lucene.analysis.morfologik
lucene-analysis-nori-10.0.0-SNAPSHOT.jar       -> org.apache.lucene.analysis.nori
lucene-analysis-opennlp-10.0.0-SNAPSHOT.jar    -> org.apache.lucene.analysis.opennlp
lucene-analysis-phonetic-10.0.0-SNAPSHOT.jar   -> org.apache.lucene.analysis.phonetic
lucene-analysis-smartcn-10.0.0-SNAPSHOT.jar    -> org.apache.lucene.analysis.smartcn
lucene-analysis-stempel-10.0.0-SNAPSHOT.jar    -> org.apache.lucene.analysis.stempel
{noformat}

(see https://github.com/apache/lucene/pull/487)

At the moment these are automatic module names, but this issue is about fully 
modularizing.
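
The naming pattern visible in the listing above can be reproduced with a small Java sketch. This is only an inference from the printed names, not the logic of the gradle build or of {{showModuleNames}}; the class {{ModuleNameSketch}} and the rule it applies (strip the {{lucene-}} prefix and the version, turn {{analysis-}} into a package segment, map remaining hyphens to underscores) are assumptions.

```java
// Hypothetical helper that mirrors the jar -> module name pairs listed above.
class ModuleNameSketch {
    static String moduleNameFor(String jarFileName) {
        String base = jarFileName;
        // Drop the ".jar" suffix if present.
        if (base.endsWith(".jar")) {
            base = base.substring(0, base.length() - ".jar".length());
        }
        // Drop the version: everything from the first hyphen followed by a digit.
        base = base.replaceFirst("-\\d.*$", "");
        // Drop the "lucene-" artifact prefix.
        if (base.startsWith("lucene-")) {
            base = base.substring("lucene-".length());
        }
        String name;
        if (base.startsWith("analysis-")) {
            // Assumed rule: analysis modules become "analysis.<name>".
            name = "analysis." + base.substring("analysis-".length()).replace('-', '_');
        } else {
            // Assumed rule: remaining hyphens become underscores.
            name = base.replace('-', '_');
        }
        return "org.apache.lucene." + name;
    }

    public static void main(String[] args) {
        System.out.println(moduleNameFor("lucene-backward-codecs-10.0.0-SNAPSHOT.jar"));
        System.out.println(moduleNameFor("lucene-analysis-icu-10.0.0-SNAPSHOT.jar"));
    }
}
```

Hyphens are not legal in Java module names, which is presumably why names such as {{backward_codecs}} use an underscore.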

> Fully embrace the java module system
> 
>
> Key: LUCENE-10255
> URL: https://issues.apache.org/jira/browse/LUCENE-10255
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Dawid Weiss
>Priority: Major
> Attachments: screenshot-1.png, screenshot-2.png
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> I've experimented a bit trying to move the code to the JMS. It is 
> _surprisingly difficult_... A PoC that almost passes all checks is here:
> https://github.com/dweiss/lucene/tree/jms
> Here are my conclusions so far:
> * The JMS and gradle add a lot of complexity (this applies to any 
> higher-level tooling, including IDEs, I think). For starters, modules have to 
> be JARs. The effect of this is that what was previously a set of directories 
> from dependencies now has to be a JAR. What was previously an incremental 
> update of a single .class file now ripples throughout the build recreating 
> module JARs (ZIPs!)... I didn't realize it at first, but it's a costly thing 
> to do. I'm not even sure how IDEs handle this issue.
> * A Java module contains metadata (such as the module version or main class) 
> that is completely detached from any source file. These things live in a 
> class bytecode of the compiled module-info; interestingly, there is no 
> source-level way to specify it - these class attributes are injected by the 
> 'jar' tool. Gradle has some fancy on-the-fly asm conversion filter that 
> injects it.
> * Dependencies between modules will effectively live in two places: in gradle 
> build files and in module-info files. And they can go out of sync, although 
> it's probably easy to

[GitHub] [lucene] zhaih commented on pull request #225: LUCENE-10010 Introduce NFARunAutomaton to run NFA directly

2021-11-29 Thread GitBox


zhaih commented on pull request #225:
URL: https://github.com/apache/lucene/pull/225#issuecomment-982125853


   Thanks @rmuir, I'll run a benchmark to ensure this PR has not introduced a 
regression recently.
   
   I like the approach you proposed in #485; it would be nice if we could get rid 
of `determinizeWorkLimit` in some of the classes where it previously appeared 
everywhere. One reason for carrying an enum and the `determinizeWorkLimit` 
together is that we might want to use that `determinizeWorkLimit` to limit the 
number of states that the NFA can cache as well. But that feature is not 
implemented yet and could be done in some other way.
   
   I think we can try to get that pushed and then I can rebase this one after.




