[GitHub] [lucene-solr] iverase commented on pull request #2059: LUCENE-9595: Make Component2D#withinPoint implementations consistent with ShapeQuery logic
iverase commented on pull request #2059: URL: https://github.com/apache/lucene-solr/pull/2059#issuecomment-725913962 @nknize Do you have an opinion about this change? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9569) Temporarily disable sort optimization on _doc for 8.7 release.
[ https://issues.apache.org/jira/browse/LUCENE-9569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230413#comment-17230413 ]
Adrien Grand commented on LUCENE-9569:
[~mayyas] Should we add it back now?
> Temporarily disable sort optimization on _doc for 8.7 release.
>
> Key: LUCENE-9569
> URL: https://issues.apache.org/jira/browse/LUCENE-9569
> Project: Lucene - Core
> Issue Type: Task
> Affects Versions: 8.7
> Reporter: Mayya Sharipova
> Priority: Minor
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Sort optimization on _doc was introduced in LUCENE-9449, but it looks unstable and led to some recent test failures.
> As the release of 8.7 is very soon, we need to temporarily disable this sort optimization for _doc for this release, with a plan to stabilize it for later releases.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2068: LUCENE-8982: Separate out native code to another module to allow cpp build with gradle
dweiss commented on a change in pull request #2068: URL: https://github.com/apache/lucene-solr/pull/2068#discussion_r521911417
## File path: .gitignore ##
@@ -8,6 +8,10 @@ build/
 /.idea/
 #IntelliJ creates this folder, ignore.
 /dev-tools/missing-doclet/out/
+*.iml
Review comment: Right... gradlew idea is a plugin - it does generate those files. IntelliJ has native gradle support nowadays. See help/IDEs.txt; perhaps it should be clarified there.
[jira] [Resolved] (LUCENE-9511) Include StoredFieldsWriter in DWPT accounting
[ https://issues.apache.org/jira/browse/LUCENE-9511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adrien Grand resolved LUCENE-9511.
Fix Version/s: 8.7
Resolution: Fixed
> Include StoredFieldsWriter in DWPT accounting
>
> Key: LUCENE-9511
> URL: https://issues.apache.org/jira/browse/LUCENE-9511
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Simon Willnauer
> Priority: Major
> Fix For: 8.7
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> StoredFieldsWriter might consume some heap space memory that can have a significant impact on decisions made in the IW if writers should be stalled or DWPTs should be flushed if memory settings are small in IWC and flushes are frequent. We should add some accounting to the StoredFieldsWriter since it's part of the DWPT lifecycle and not just present during flush.
> Our nightly builds ran into some OOMs due to the large chunk size used in the CompressedStoredFieldsFormat. The reason are very frequent flushes due to small maxBufferedDocs which causes 300+ DWPTs to be blocked for flush causing ultimately an OOM exception.
> {noformat}
> NOTE: reproduce with: ant test -Dtestcase=TestIndexingSequenceNumbers -Dtests.method=testStressConcurrentCommit -Dtests.seed=A04943A98C8E2954 -Dtests.nightly=true -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=vo-001 -Dtests.timezone=Africa/Ouagadougou -Dtests.asserts=true -Dtests.file.encoding=UTF8
> [junit4] ERROR 107s J3 | TestIndexingSequenceNumbers.testStressConcurrentCommit <<<
> [junit4]> Throwable #1: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
> [junit4]> at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:876)
> [junit4]> at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:890)
> [junit4]> at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3727)
> [junit4]> at org.apache.lucene.index.TestIndexingSequenceNumbers.testStressConcurrentCommit(TestIndexingSequenceNumbers.java:228)
> [junit4]> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit4]> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit4]> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit4]> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> [junit4]> at java.base/java.lang.Thread.run(Thread.java:834)
> [junit4]> Caused by: java.lang.OutOfMemoryError: Java heap space
> [junit4]> at __randomizedtesting.SeedInfo.seed([A04943A98C8E2954]:0)
> [junit4]> at org.apache.lucene.store.GrowableByteArrayDataOutput.<init>(GrowableByteArrayDataOutput.java:46)
> [junit4]> at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:111)
> [junit4]> at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:130)
> [junit4]> at org.apache.lucene.codecs.lucene87.Lucene87StoredFieldsFormat.fieldsWriter(Lucene87StoredFieldsFormat.java:141)
> [junit4]> at org.apache.lucene.codecs.asserting.AssertingStoredFieldsFormat.fieldsWriter(AssertingStoredFieldsFormat.java:48)
> [junit4]> at org.apache.lucene.index.StoredFieldsConsumer.initStoredFieldsWriter(StoredFieldsConsumer.java:39)
> [junit4]> at org.apache.lucene.index.StoredFieldsConsumer.startDocument(StoredFieldsConsumer.java:46)
> [junit4]> at org.apache.lucene.index.DefaultIndexingChain.startStoredFields(DefaultIndexingChain.java:426)
> [junit4]> at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:462)
> [junit4]> at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:233)
> [junit4]> at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:419)
> [junit4]> at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1470)
> [junit4]> at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1463)
> [junit4]> at org.apache.lucene.index.TestIndexin
[jira] [Closed] (LUCENE-9511) Include StoredFieldsWriter in DWPT accounting
[ https://issues.apache.org/jira/browse/LUCENE-9511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adrien Grand closed LUCENE-9511.
> Include StoredFieldsWriter in DWPT accounting
>
> Key: LUCENE-9511
> URL: https://issues.apache.org/jira/browse/LUCENE-9511
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Simon Willnauer
> Priority: Major
> Fix For: 8.7
> Time Spent: 1h 10m
> Remaining Estimate: 0h
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2068: LUCENE-8982: Separate out native code to another module to allow cpp build with gradle
dweiss commented on a change in pull request #2068: URL: https://github.com/apache/lucene-solr/pull/2068#discussion_r521912140
## File path: lucene/misc/native/src/main/posix/NativePosixUtil.cpp ##
@@ -38,12 +38,12 @@
 #ifdef LINUX
 /*
- * Class: org_apache_lucene_store_NativePosixUtil
+ * Class: org_apache_lucene_misc_store_NativePosixUtil
  * Method: posix_fadvise
  * Signature: (Ljava/io/FileDescriptor;JJI)V
  */
 extern "C"
-JNIEXPORT jint JNICALL Java_org_apache_lucene_store_NativePosixUtil_posix_1fadvise(JNIEnv *env, jclass _ignore, jobject fileDescriptor, jlong offset, jlong len, jint advice)
+JNIEXPORT jint JNICALL Java_org_apache_lucene_misc_store_NativePosixUtil_posix_1fadvise(JNIEnv *env, jclass _ignore, jobject fileDescriptor, jlong offset, jlong len, jint advice)
Review comment: Yeah. We really should try to add a test that tries to run with these libs. I don't know how to handle this yet - can be a follow-up issue.
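The symbol rename in the diff above follows from how JNI derives a native function name from the fully qualified class name and the method name. A minimal sketch of that rule, assuming ASCII-only identifiers and a non-overloaded method; the class `JniNameSketch` and its methods are hypothetical helpers written for this illustration and are not part of Lucene or the JDK:

```java
// Sketch of JNI short-name mangling for ASCII identifiers only.
// JniNameSketch is an illustrative name, not an existing API.
final class JniNameSketch {

    // Escapes one name: '.' or '/' (package separators) become '_',
    // a literal '_' becomes "_1", ASCII letters and digits pass through.
    // Other characters need further escapes that this sketch does not cover.
    static String mangle(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c == '.' || c == '/') {
                sb.append('_');
            } else if (c == '_') {
                sb.append("_1");
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
                sb.append(c);
            } else {
                throw new IllegalArgumentException("character not handled in this sketch: " + c);
            }
        }
        return sb.toString();
    }

    // Builds the symbol a native library would have to export for the method.
    static String nativeSymbol(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    public static void main(String[] args) {
        System.out.println(nativeSymbol("org.apache.lucene.misc.store.NativePosixUtil", "posix_fadvise"));
    }
}
```

Moving the class to another package changes the expected symbol, so a library still exporting the old name would presumably fail to link at the first native call. That is the kind of breakage a test that loads the built library could catch.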
[GitHub] [lucene-solr] dweiss commented on pull request #2068: LUCENE-8982: Separate out native code to another module to allow cpp build with gradle
dweiss commented on pull request #2068: URL: https://github.com/apache/lucene-solr/pull/2068#issuecomment-725917917
> On the other hand, I think these tests will break if run from IDEs. Do we need to support that in this PR?
Oh, thanks! I'll take a look after I come back from work. I think the PR should be clean in that it doesn't break other people's workflow, so yes - if you added tests they should run or be quietly ignored if they're not supported. I'll take a look.
[GitHub] [lucene-solr] dweiss commented on pull request #2077: LUCENE-9605: update snowball to d8cf01ddf37a, adds Yiddish
dweiss commented on pull request #2077: URL: https://github.com/apache/lucene-solr/pull/2077#issuecomment-725922377
LGTM. Just scratching my head about one thing - the generator has that extra field now yet the patch doesn't contain diffs for other languages - only Serbian and Yiddish?
[jira] [Commented] (SOLR-14975) Optimize CoreContainer.getAllCoreNames and getLoadedCoreNames
[ https://issues.apache.org/jira/browse/SOLR-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230421#comment-17230421 ]
ASF subversion and git services commented on SOLR-14975:
Commit c53f0630169c535f24534a1b1333cbebdfc7ea2f in lucene-solr's branch refs/heads/branch_8x from Bruno Roustant [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c53f063 ]
SOLR-14975: Optimize CoreContainer.getAllCoreNames and getLoadedCoreNames. Also optimize getCoreDescriptors. Closes #2066
> Optimize CoreContainer.getAllCoreNames and getLoadedCoreNames
>
> Key: SOLR-14975
> URL: https://issues.apache.org/jira/browse/SOLR-14975
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: David Smiley
> Priority: Major
> Time Spent: 4.5h
> Remaining Estimate: 0h
>
> The methods CoreContainer.getAllCoreNames and getLoadedCoreNames hold a lock while they grab core names to put into a TreeSet. When there are *many* cores, this delay is noticeable. Holding this lock effectively blocks queries since queries lookup a core; so it's critically important that these methods are *fast*. The tragedy here is that some callers merely want to know if a particular name is in the set, or what the aggregated size is. Some callers want to iterate the names but don't really care what the iteration order is.
> I propose that some callers of these two methods find suitable alternatives, like getCoreDescriptor to check for null. And I propose that these methods return a HashSet -- no order. If the caller wants it sorted, it can do so itself.
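The trade-off described in the issue can be illustrated with plain JDK collections. The sketch below is a simplified stand-in and not Solr's CoreContainer: the class `CoreRegistrySketch`, its lock, and all of its method names are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical registry that guards a name-to-core map with a single lock.
final class CoreRegistrySketch {
    private final Object lock = new Object();
    private final Map<String, Object> cores = new HashMap<>();

    void register(String name) {
        synchronized (lock) {
            cores.put(name, new Object());
        }
    }

    // Shape of the "before" behavior: a sorted copy is built while the lock is held,
    // which costs O(n log n) comparisons inside the critical section.
    Set<String> namesSorted() {
        synchronized (lock) {
            return new TreeSet<>(cores.keySet());
        }
    }

    // Shape of the proposal: an unordered copy, so the lock is held only for a linear copy.
    // A caller that needs ordering can sort its own copy outside the lock.
    Set<String> namesUnordered() {
        synchronized (lock) {
            return new HashSet<>(cores.keySet());
        }
    }

    // Alternative for callers that only need a membership check: no copy at all.
    boolean isRegistered(String name) {
        synchronized (lock) {
            return cores.containsKey(name);
        }
    }
}
```

A caller that wants sorted output would wrap the result itself, for example `new TreeSet<>(registry.namesUnordered())`, which moves the sorting cost out of the critical section.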
[GitHub] [lucene-solr] bruno-roustant closed pull request #2066: SOLR-14975: Optimize CoreContainer.getAllCoreNames and getLoadedCoreNames.
bruno-roustant closed pull request #2066: URL: https://github.com/apache/lucene-solr/pull/2066
[jira] [Resolved] (SOLR-14975) Optimize CoreContainer.getAllCoreNames and getLoadedCoreNames
[ https://issues.apache.org/jira/browse/SOLR-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bruno Roustant resolved SOLR-14975.
Resolution: Fixed
Thanks Erick and David for the review!
> Optimize CoreContainer.getAllCoreNames and getLoadedCoreNames
>
> Key: SOLR-14975
> URL: https://issues.apache.org/jira/browse/SOLR-14975
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: David Smiley
> Priority: Major
> Time Spent: 4h 40m
> Remaining Estimate: 0h
[GitHub] [lucene-solr] uschindler commented on pull request #2077: LUCENE-9605: update snowball to d8cf01ddf37a, adds Yiddish
uschindler commented on pull request #2077: URL: https://github.com/apache/lucene-solr/pull/2077#issuecomment-725941942
+1
[jira] [Created] (LUCENE-9606) Wrap boolean queries generated by shape fields with a Constant score query
Ignacio Vera created LUCENE-9606:
Summary: Wrap boolean queries generated by shape fields with a Constant score query
Key: LUCENE-9606
URL: https://issues.apache.org/jira/browse/LUCENE-9606
Project: Lucene - Core
Issue Type: Bug
Reporter: Ignacio Vera
When querying a shape field with a Geometry collection and a CONTAINS spatial relationship, the query is rewritten as a boolean query. We should wrap the resulting query with a ConstantScoreQuery.
[jira] [Commented] (LUCENE-9606) Wrap boolean queries generated by shape fields with a Constant score query
[ https://issues.apache.org/jira/browse/LUCENE-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230490#comment-17230490 ]
Adrien Grand commented on LUCENE-9606:
+1
> Wrap boolean queries generated by shape fields with a Constant score query
>
> Key: LUCENE-9606
> URL: https://issues.apache.org/jira/browse/LUCENE-9606
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Ignacio Vera
> Priority: Major
[jira] [Commented] (LUCENE-9508) DocumentsWriter doesn't check for BlockedFlushes in stall mode``
[ https://issues.apache.org/jira/browse/LUCENE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230497#comment-17230497 ]
Simon Willnauer commented on LUCENE-9508:
Hey Zach, thanks for opening this. Let me ask some questions and clarify what is going on here first:
{quote} 2) Should the fullFlush thread wait indefinitely for the lock on ThreadStates ? Since single blocking writing thread can block the full flush here. {quote}
Yes, we have to block on the ThreadStates here, since this is the contract of full flush in order to atomically commit changes and establish a happens-before relationship.
{quote} 1) Should *preUpdate* look into the blocked flushes information as well instead of just flush queue ? {quote}
I am not sure what it would do with the information in blocked flushes. Can you elaborate on this? We can't let blocked flushes go unless the full flush is over, otherwise we will have inconsistent commits.
Can you share your IndexWriter config and how you configured the 10% heap? Can you also share what thread holds the ThreadState that the full flush is waiting for? I wonder what causes this situation.
> DocumentsWriter doesn't check for BlockedFlushes in stall mode``
>
> Key: LUCENE-9508
> URL: https://issues.apache.org/jira/browse/LUCENE-9508
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/index
> Affects Versions: 8.5.1
> Reporter: Sorabh Hamirwasia
> Priority: Major
> Labels: IndexWriter
>
> Hi,
> I was investigating an issue where the memory usage by a single Lucene IndexWriter went up to ~23GB. Lucene has a concept of stalling in case the memory used by each index breaches the 2 X ramBuffer limit (10% of JVM heap, in this case ~3GB). So ideally memory usage should not go above that limit. I looked into the heap dump and found that when the fullFlush thread enters the *markForFullFlush* method, it tries to take the lock on the ThreadStates of all the DWPT threads sequentially. If the lock on one of the ThreadStates is blocked then it will block indefinitely. This is what happened in my case, where one of the DWPT threads was stuck in the indexing process. Due to this the fullFlush thread was unable to populate the flush queue even though the stall mode was detected. This caused a new indexing request which came in on an indexing thread to sleep for a second and then continue with indexing. The **preUpdate()** method looks for the stalled case and checks whether there are any pending flushes (based on the flush queue); if not, it sleeps and continues.
> Question:
> 1) Should **preUpdate** look into the blocked flushes information as well instead of just flush queue ?
> 2) Should the fullFlush thread wait indefinitely for the lock on ThreadStates ? Since single blocking writing thread can block the full flush here.
[GitHub] [lucene-solr] s1monw commented on a change in pull request #2022: LUCENE-9004: KNN vector search using NSW graphs
s1monw commented on a change in pull request #2022: URL: https://github.com/apache/lucene-solr/pull/2022#discussion_r521992111
## File path: lucene/core/src/java/org/apache/lucene/index/VectorValues.java ##
@@ -74,6 +74,18 @@ public BytesRef binaryValue() throws IOException {
 throw new UnsupportedOperationException();
 }
+ /**
+ * Return the k nearest neighbor documents as determined by comparison of their vector values
+ * for this field, to the given vector, by the field's search strategy. If the search strategy is
+ * reversed, lower values indicate nearer vectors, otherwise higher scores indicate nearer
+ * vectors. Unlike relevance scores, vector scores may be negative.
+ * @param target the vector-valued query
+ * @param k the number of docs to return
+ * @param fanout control the accuracy/speed tradeoff - larger values give better recall at higher cost
Review comment: @mikemccand it was pushed but removed again in https://issues.apache.org/jira/browse/LUCENE-9257 a little while ago. I get why it's removed but it seems useful maybe we add it back?
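The scoring contract in the javadoc quoted above (a reversed strategy means lower values are nearer, otherwise higher scores are nearer, and scores may be negative) can be shown with an exact brute-force search. This is only a baseline sketch under stated assumptions and not the NSW graph search from the PR: the class `KnnScoringSketch`, the choice of squared Euclidean distance for the reversed case and dot product for the non-reversed case, and the absence of any `fanout` parameter are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Exact brute-force k-nearest-neighbor baseline; all names here are hypothetical.
final class KnnScoringSketch {

    static float dot(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    static float squaredDistance(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            float d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Returns the indices of the k nearest vectors, nearest first.
    // reversed == true: squared Euclidean distance, lower value means nearer.
    // reversed == false: dot product, higher score means nearer (and may be negative).
    static List<Integer> nearest(float[][] docs, float[] target, int k, boolean reversed) {
        float[] scores = new float[docs.length];
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < docs.length; i++) {
            ids.add(i);
            scores[i] = reversed ? squaredDistance(docs[i], target) : dot(docs[i], target);
        }
        Comparator<Integer> ascending = Comparator.comparingDouble(i -> scores[i]);
        ids.sort(reversed ? ascending : ascending.reversed());
        return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
    }
}
```

With the vectors (2, 0), (0.9, 0) and (-1, 0) and the target (1, 0), the dot-product ordering ranks the first vector nearest, while the distance ordering ranks the second one nearest, so the same data can yield different neighbors depending on the strategy.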
[jira] [Commented] (LUCENE-9590) Add javadoc for Lucene86PointsFormat class
[ https://issues.apache.org/jira/browse/LUCENE-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230509#comment-17230509 ]
Lu Xugang commented on LUCENE-9590:
Or as [~dsmiley] said in mail, We could link to it from javadocs and host it at the Confluence based wiki here: [https://cwiki.apache.org/confluence/display/LUCENE/Home] ?
> Add javadoc for Lucene86PointsFormat class
>
> Key: LUCENE-9590
> URL: https://issues.apache.org/jira/browse/LUCENE-9590
> Project: Lucene - Core
> Issue Type: Wish
> Components: core/codecs
> Reporter: Lu Xugang
> Priority: Minor
> Attachments: 1.png
>
> I would like to add javadoc for Lucene86PointsFormat class, it is really helpful for source reader to understand the data structure with point value, is anyone doing this or plan?
> The attachment list part of the data structure (filled with color means it has sub data structure)
[GitHub] [lucene-solr] sigram commented on a change in pull request #2065: SOLR-14977 : ContainerPlugins should be configurable
sigram commented on a change in pull request #2065: URL: https://github.com/apache/lucene-solr/pull/2065#discussion_r522034988
## File path: solr/core/src/test/org/apache/solr/handler/TestContainerPlugin.java ##
@@ -366,7 +381,7 @@ public void m2(SolrQueryRequest req, SolrQueryResponse rsp) {
 }
- public static class CConfig extends PluginMeta {
+ public static class CConfig implements ReflectMapWriter {
Review comment: Then we have to explicitly say so in `ConfigurablePlugin`.
[GitHub] [lucene-solr] sigram commented on a change in pull request #2065: SOLR-14977 : ContainerPlugins should be configurable
sigram commented on a change in pull request #2065: URL: https://github.com/apache/lucene-solr/pull/2065#discussion_r522039722
## File path: solr/core/src/java/org/apache/solr/api/ContainerPluginsRegistry.java ##
@@ -114,6 +118,16 @@ public synchronized ApiInfo getPlugin(String name) {
 return currentPlugins.get(name);
 }
+ static class PluginMetaHolder {
+private final Map original;
+private final PluginMeta meta;
Review comment: I know that I can ignore it - my point was that this property is a relic of the time when we allowed only Api handlers as plugins. Now when non-Api plugins are first-class citizens half of the time this property doesn't make sense because it's specific only to Api plugins - so it should not be exposed as a standard property for all plugins.
[jira] [Commented] (SOLR-14991) tag and remove obsolete branches
[ https://issues.apache.org/jira/browse/SOLR-14991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230611#comment-17230611 ]
Erick Erickson commented on SOLR-14991:
I'll tag these and remove the branch this weekend absent feedback.
[~noble.paul] remotes/origin/jira-14151-revert remotes/origin/jira/V2Request
[~caomanhdat] remotes/origin/jira/http2
[~danmuzi] or maybe [~rmuir] remotes/origin/revert-776-remove_icu_dependency Maybe this is LUCENE-8912 pull 776? In which case the JIRA's closed and I'll tag/remove.
> tag and remove obsolete branches
>
> Key: SOLR-14991
> URL: https://issues.apache.org/jira/browse/SOLR-14991
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Priority: Major
>
> I'm going to gradually work through the branches, tagging and removing
> 1> anything with a Jira name that's fixed
> 2> anything that I'm certain will never be fixed (e.g. the various gradle build branches)
> So the changes will still be available, they just won't pollute the branch list.
> I'll list the branches here, all the tags will be history/branches/lucene-solr/
>
> This specifically will _not_ include
> 1> any release, e.g. branch_8_4
> 2> anything I'm unsure about. People who've created branches should expect some pings about this.
[jira] [Commented] (SOLR-14983) Score returned in search request is original score and not reranked score
[ https://issues.apache.org/jira/browse/SOLR-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230620#comment-17230620 ]
ASF subversion and git services commented on SOLR-14983:
Commit 2f02040a4c45e4dfdb1f569ae05637c86f0f001b in lucene-solr's branch refs/heads/master from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2f02040 ]
SOLR-14983: Fix response returning original score instead of reranked score due to query and filter combining. (Krishan Goyal, Jason Baik, Christine Poerschke)
> Score returned in search request is original score and not reranked score
>
> Key: SOLR-14983
> URL: https://issues.apache.org/jira/browse/SOLR-14983
> Project: Solr
> Issue Type: Bug
> Affects Versions: 8.0
> Reporter: Krishan Goyal
> Assignee: Christine Poerschke
> Priority: Major
> Attachments: 0001-LUCENE-9542-Unit-test-to-reproduce-bug.patch, SOLR-14983.patch, SOLR-14983.patch
>
> Score returned in search request is original score and not reranked score post the changes in https://issues.apache.org/jira/browse/LUCENE-8412.
> Commit - [https://github.com/apache/lucene-solr/commit/55bfadbce115a825a75686fe0bfe71406bc3ee44#diff-4e354f104ed52bd7f620b0c05ae8467d]
> Specifically -
> if (cmd.getSort() != null && query instanceof RankQuery == false && (cmd.getFlags() & GET_SCORES) != 0) {
>   TopFieldCollector.populateScores(topDocs.scoreDocs, this, query);
> }
> in SolrIndexSearcher.java recomputes the score but outputs only the original score and not the reranked score.
>
> The issue is cmd.getQuery() is a type of RankQuery but the "query" variable is a boolean query and probably replacing query with cmd.getQuery() should be the right fix for this so that the score is not overridden for rerank queries
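The reporter's diagnosis can be reproduced in isolation: an `instanceof` test on a wrapper does not see the type of the query it wraps. The sketch below only mimics the shape of the condition quoted in the issue. The types `Query`, `RankQuery` and `CombinedQuery` are hypothetical stand-ins defined here, not the Solr or Lucene classes, and the sketch does not claim to show what the committed fix actually changed.

```java
// Minimal model of the check described in the issue; every type here is a stand-in.
final class RerankCheckSketch {

    interface Query {}

    // Stands in for a rerank query whose scores must not be recomputed.
    static final class RankQuery implements Query {}

    // Stands in for the query produced by combining the main query with filters.
    static final class CombinedQuery implements Query {
        final Query main;

        CombinedQuery(Query main) {
            this.main = main;
        }
    }

    // Mirrors the quoted condition: scores are repopulated when sorting, when scores
    // are requested, and when the checked query is not a RankQuery.
    static boolean repopulatesScores(Query checked, boolean hasSort, boolean wantsScores) {
        return hasSort && !(checked instanceof RankQuery) && wantsScores;
    }

    public static void main(String[] args) {
        Query original = new RankQuery();
        Query combined = new CombinedQuery(original);
        // Checking the combined query misses the wrapped RankQuery.
        System.out.println(repopulatesScores(combined, true, true));
        // Checking the original query, as the reporter suggests, detects it.
        System.out.println(repopulatesScores(original, true, true));
    }
}
```

In this model the first call returns true (scores would be overwritten) and the second returns false, which matches the behavior the reporter describes for `query` versus `cmd.getQuery()`.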
[jira] [Commented] (SOLR-14983) Score returned in search request is original score and not reranked score
[ https://issues.apache.org/jira/browse/SOLR-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230622#comment-17230622 ] ASF subversion and git services commented on SOLR-14983: Commit ad27dd7c56e4f9e1d1828206e7595b557ed070cb in lucene-solr's branch refs/heads/branch_8x from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ad27dd7 ] SOLR-14983: Fix response returning original score instead of reranked score due to query and filter combining. (Krishan Goyal, Jason Baik, Christine Poerschke) > Score returned in search request is original score and not reranked score > - > > Key: SOLR-14983 > URL: https://issues.apache.org/jira/browse/SOLR-14983 > Project: Solr > Issue Type: Bug >Affects Versions: 8.0 >Reporter: Krishan Goyal >Assignee: Christine Poerschke >Priority: Major > Attachments: 0001-LUCENE-9542-Unit-test-to-reproduce-bug.patch, > SOLR-14983.patch, SOLR-14983.patch > > > Score returned in search request is original score and not reranked score > post the changes in https://issues.apache.org/jira/browse/LUCENE-8412. > Commit - > [https://github.com/apache/lucene-solr/commit/55bfadbce115a825a75686fe0bfe71406bc3ee44#diff-4e354f104ed52bd7f620b0c05ae8467d] > Specifically - > if (cmd.getSort() != null && query instanceof RankQuery == false && > (cmd.getFlags() & GET_SCORES) != 0) { > TopFieldCollector.populateScores(topDocs.scoreDocs, this, query); > } > in SolrIndexSearcher.java recomputes the score but outputs only the original > score and not the reranked score. > > The issue is that cmd.getQuery() is a RankQuery but the "query" variable > is a boolean query, so replacing query with cmd.getQuery() is probably > the right fix, so that the score is not overridden for rerank queries.
[jira] [Updated] (SOLR-14983) Score returned in search request is original score and not reranked score
[ https://issues.apache.org/jira/browse/SOLR-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated SOLR-14983: --- Fix Version/s: 8.8 master (9.0) Resolution: Fixed Status: Resolved (was: Patch Available) Thanks everyone! > Score returned in search request is original score and not reranked score > - > > Key: SOLR-14983 > URL: https://issues.apache.org/jira/browse/SOLR-14983 > Project: Solr > Issue Type: Bug >Affects Versions: 8.0 >Reporter: Krishan Goyal >Assignee: Christine Poerschke >Priority: Major > Fix For: master (9.0), 8.8 > > Attachments: 0001-LUCENE-9542-Unit-test-to-reproduce-bug.patch, > SOLR-14983.patch, SOLR-14983.patch > > > Score returned in search request is original score and not reranked score > post the changes in https://issues.apache.org/jira/browse/LUCENE-8412. > Commit - > [https://github.com/apache/lucene-solr/commit/55bfadbce115a825a75686fe0bfe71406bc3ee44#diff-4e354f104ed52bd7f620b0c05ae8467d] > Specifically - > if (cmd.getSort() != null && query instanceof RankQuery == false && > (cmd.getFlags() & GET_SCORES) != 0) { > TopFieldCollector.populateScores(topDocs.scoreDocs, this, query); > } > in SolrIndexSearcher.java recomputes the score but outputs only the original > score and not the reranked score. > > The issue is that cmd.getQuery() is a RankQuery but the "query" variable > is a boolean query, so replacing query with cmd.getQuery() is probably > the right fix, so that the score is not overridden for rerank queries.
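The type check described in the SOLR-14983 report can be illustrated with a minimal, self-contained sketch. Everything below is an assumption made for illustration: `Query`, `RankQuery` and `CombinedBooleanQuery` are tiny stand-in types that only borrow names from the issue text, not Solr's real classes, and `recomputesScores` is a hypothetical helper that mirrors the quoted condition. It shows why the check passes when it is applied to the combined query and fails when it is applied to the original query, as the reporter suggested.

```java
final class RerankScoreCheckSketch {
  /** Stand-in marker type; NOT Lucene's Query class. */
  interface Query {}

  /** Stand-in for a rerank query; only the name is taken from the issue text. */
  static final class RankQuery implements Query {}

  /** Stand-in for the boolean query built by combining the main query with filters. */
  static final class CombinedBooleanQuery implements Query {
    final Query main;

    CombinedBooleanQuery(Query main) {
      this.main = main;
    }
  }

  /**
   * Hypothetical helper mirroring the condition quoted in the issue:
   * scores are recomputed when a sort is set, scores are requested,
   * and the inspected query is not a RankQuery.
   */
  static boolean recomputesScores(Query inspected, boolean hasSort, boolean wantsScores) {
    return hasSort && !(inspected instanceof RankQuery) && wantsScores;
  }

  public static void main(String[] args) {
    Query original = new RankQuery();                    // plays the role of cmd.getQuery()
    Query combined = new CombinedBooleanQuery(original); // plays the role of the "query" variable

    // Inspecting the combined query hides the RankQuery, so scores get recomputed.
    System.out.println(recomputesScores(combined, true, true));
    // Inspecting the original query sees the RankQuery, so reranked scores are left alone.
    System.out.println(recomputesScores(original, true, true));
  }
}
```

The sketch says nothing about how the committed patch is implemented; it only reproduces the reasoning in the issue description.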
[jira] [Updated] (SOLR-14986) Add warning to ref guide that using "properties.name" is an expert option
[ https://issues.apache.org/jira/browse/SOLR-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-14986: -- Summary: Add warning to ref guide that using "properties.name" is an expert option (was: Restrict the properties possible to define with "property.name=value" when creating a collection) > Add warning to ref guide that using "properties.name" is an expert option > - > > Key: SOLR-14986 > URL: https://issues.apache.org/jira/browse/SOLR-14986 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > This came to light when I was looking at two user-list questions where people > try to manually define core.properties to define _replicas_ in SolrCloud. > There are two related issues: > 1> You can do things like "action=CREATE&name=eoe&property.collection=blivet" > which results in an opaque error about "could not create replica." I > propose we return a better error here like "property.collection should not be > specified when creating a collection". What do people think about the rest of > the auto-created properties on collection creation? > coreNodeName > collection.configName > name > numShards > shard > collection > replicaType > "name" seems to be OK to change, although I don't see any place where anyone can > actually see it afterwards > 2> Change the ref guide to steer people away from attempting to manually > create a core.properties file to define cores/replicas in SolrCloud. There's > no warning on the "defining-core-properties.adoc" page, for instance. Additionally > there should be some kind of message on the collections API documentation > about not trying to set the properties in <1> on the CREATE command. > <2> used to actually work (apparently) with legacyCloud...
[jira] [Commented] (SOLR-14991) tag and remove obsolete branches
[ https://issues.apache.org/jira/browse/SOLR-14991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230628#comment-17230628 ] Cao Manh Dat commented on SOLR-14991: - Thank you Erick, I am OK with removing that! > tag and remove obsolete branches > > > Key: SOLR-14991 > URL: https://issues.apache.org/jira/browse/SOLR-14991 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I'm going to gradually work through the branches, tagging and removing > 1> anything with a Jira name that's fixed > 2> anything that I'm certain will never be fixed (e.g. the various gradle > build branches) > So the changes will still be available, they just won't pollute the branch list. > I'll list the branches here, all the tags will be > history/branches/lucene-solr/ > > This specifically will _not_ include > 1> any release, e.g. branch_8_4 > 2> anything I'm unsure about. People who've created branches should expect > some pings about this.
[GitHub] [lucene-solr] rmuir commented on pull request #2077: LUCENE-9605: update snowball to d8cf01ddf37a, adds Yiddish
rmuir commented on pull request #2077: URL: https://github.com/apache/lucene-solr/pull/2077#issuecomment-726089629 > Just scratching my head about one thing - the generator has that extra field now yet the patch doesn't contain diffs for other languages - only Serbian and Yiddish? @dweiss Which extra field? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand merged pull request #2076: LUCENE-9603: Remove redundant fieldType.stored() check
mikemccand merged pull request #2076: URL: https://github.com/apache/lucene-solr/pull/2076 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on pull request #2077: LUCENE-9605: update snowball to d8cf01ddf37a, adds Yiddish
dweiss commented on pull request #2077: URL: https://github.com/apache/lucene-solr/pull/2077#issuecomment-726091990 This one? https://github.com/apache/lucene-solr/pull/2077/files#diff-455f29a3b76e17c21dded0a5f1b853145bfa7e6e5f1f36c52012a5f124e14ac2R589 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] rmuir commented on pull request #2077: LUCENE-9605: update snowball to d8cf01ddf37a, adds Yiddish
rmuir commented on pull request #2077: URL: https://github.com/apache/lucene-solr/pull/2077#issuecomment-726092971 There aren't any changes here. It's confusing because it's showing a diff of a patch file, and GitHub is highlighting something in green that isn't a change. The only things that actually changed in the patch file are the line numbers of the patch chunks.
[GitHub] [lucene-solr] rmuir commented on pull request #2077: LUCENE-9605: update snowball to d8cf01ddf37a, adds Yiddish
rmuir commented on pull request #2077: URL: https://github.com/apache/lucene-solr/pull/2077#issuecomment-726094165 It's not your fault; this is why it's good to try to work down this patch file. It is confusing.
[GitHub] [lucene-solr] dweiss commented on pull request #2077: LUCENE-9605: update snowball to d8cf01ddf37a, adds Yiddish
dweiss commented on pull request #2077: URL: https://github.com/apache/lucene-solr/pull/2077#issuecomment-726093808 Duh. I see it now... darn, sorry for the noise. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] ErickErickson opened a new pull request #2078: SOLR-14986: Add warning to ref guide that using properties.name is an…
ErickErickson opened a new pull request #2078: URL: https://github.com/apache/lucene-solr/pull/2078 Changed both CREATE and ADDREPLICA to just add the warning to the docs. The JIRA has a long explanation about why fixing it in the code is too risky/expensive. gw buildsite succeeds. I'll commit this over the weekend absent objections.
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #2022: LUCENE-9004: KNN vector search using NSW graphs
mikemccand commented on a change in pull request #2022: URL: https://github.com/apache/lucene-solr/pull/2022#discussion_r522164296 ## File path: lucene/core/src/java/org/apache/lucene/index/VectorValues.java ## @@ -74,6 +74,18 @@ public BytesRef binaryValue() throws IOException { throw new UnsupportedOperationException(); } + /** + * Return the k nearest neighbor documents as determined by comparison of their vector values + * for this field, to the given vector, by the field's search strategy. If the search strategy is + * reversed, lower values indicate nearer vectors, otherwise higher scores indicate nearer + * vectors. Unlike relevance scores, vector scores may be negative. + * @param target the vector-valued query + * @param k the number of docs to return + * @param fanout control the accuracy/speed tradeoff - larger values give better recall at higher cost Review comment: Ahh OK thanks for the context @s1monw. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
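The score-ordering contract in the javadoc above (a "reversed" strategy means lower values are nearer, otherwise higher scores are nearer, and scores may be negative) can be made concrete with a small sketch. This is deliberately an exhaustive brute-force search, not the NSW graph search the pull request implements, so it has no `fanout` parameter. The `Strategy` enum, its two members and the `search` signature are assumptions invented for this illustration, not Lucene's API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class BruteForceKnnSketch {
  /** Hypothetical stand-in for the field's search strategy mentioned in the javadoc. */
  enum Strategy {
    DOT_PRODUCT(false), // not reversed: higher score means nearer
    EUCLIDEAN(true);    // reversed: lower score (squared distance) means nearer

    final boolean reversed;

    Strategy(boolean reversed) {
      this.reversed = reversed;
    }

    /** Dot product, or squared Euclidean distance, between two equal-length vectors. */
    float score(float[] a, float[] b) {
      float sum = 0f;
      for (int i = 0; i < a.length; i++) {
        if (this == DOT_PRODUCT) {
          sum += a[i] * b[i];
        } else {
          float d = a[i] - b[i];
          sum += d * d;
        }
      }
      return sum;
    }
  }

  /** Returns the indices of the k nearest vectors in docs, nearest first. */
  static int[] search(float[][] docs, float[] target, int k, Strategy strategy) {
    List<Integer> ids = new ArrayList<>();
    for (int i = 0; i < docs.length; i++) {
      ids.add(i);
    }
    Comparator<Integer> byScore =
        Comparator.comparingDouble(id -> strategy.score(docs[id], target));
    // Reversed strategies sort ascending (small distance first); others sort descending.
    ids.sort(strategy.reversed ? byScore : byScore.reversed());
    int n = Math.min(k, ids.size());
    int[] result = new int[n];
    for (int i = 0; i < n; i++) {
      result[i] = ids.get(i);
    }
    return result;
  }

  public static void main(String[] args) {
    float[][] docs = {{2f, 0f}, {1f, 0f}, {-1f, 0f}};
    float[] target = {1f, 0f};
    // Dot-product scores are 2, 1 and -1; note that the last one is negative.
    System.out.println(java.util.Arrays.toString(search(docs, target, 2, Strategy.DOT_PRODUCT)));
    // Squared distances are 1, 0 and 4, so the ordering of the top two flips.
    System.out.println(java.util.Arrays.toString(search(docs, target, 2, Strategy.EUCLIDEAN)));
  }
}
```

An approximate graph search trades some of this exactness for speed, which is the accuracy/speed tradeoff the `fanout` parameter in the javadoc refers to.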
[jira] [Commented] (SOLR-14986) Add warning to ref guide that using "properties.name" is an expert option
[ https://issues.apache.org/jira/browse/SOLR-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230681#comment-17230681 ] Erick Erickson commented on SOLR-14986: --- [~ctargett] [~mdrob] Any comments? It's just a couple of warnings in the docs now. > Add warning to ref guide that using "properties.name" is an expert option > - > > Key: SOLR-14986 > URL: https://issues.apache.org/jira/browse/SOLR-14986 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > This came to light when I was looking at two user-list questions where people > try to manually define core.properties to define _replicas_ in SolrCloud. > There are two related issues: > 1> You can do things like "action=CREATE&name=eoe&property.collection=blivet" > which results in an opaque error about "could not create replica." I > propose we return a better error here like "property.collection should not be > specified when creating a collection". What do people think about the rest of > the auto-created properties on collection creation? > coreNodeName > collection.configName > name > numShards > shard > collection > replicaType > "name" seems to be OK to change, although I don't see any place where anyone can > actually see it afterwards > 2> Change the ref guide to steer people away from attempting to manually > create a core.properties file to define cores/replicas in SolrCloud. There's > no warning on the "defining-core-properties.adoc" page, for instance. Additionally > there should be some kind of message on the collections API documentation > about not trying to set the properties in <1> on the CREATE command. > <2> used to actually work (apparently) with legacyCloud...
[GitHub] [lucene-solr] jpountz merged pull request #2069: LUCENE-9378: Make it possible to configure how to trade speed for compression on doc values.
jpountz merged pull request #2069: URL: https://github.com/apache/lucene-solr/pull/2069 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9378) Configurable compression for BinaryDocValues
[ https://issues.apache.org/jira/browse/LUCENE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230700#comment-17230700 ] ASF subversion and git services commented on LUCENE-9378: - Commit 06877b2c6e47bc481a79d7bedd8ea4fb099f1b4c in lucene-solr's branch refs/heads/master from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=06877b2 ] LUCENE-9378: Make it possible to configure how to trade speed for compression on doc values. (#2069) This adds a switch to `Lucene80DocValuesFormat` which allows configuring whether to prioritize retrieval speed over compression ratio or the other way around. When prioritizing retrieval speed, binary doc values are written using the exact same format as before more aggressive compression was introduced. > Configurable compression for BinaryDocValues > > > Key: LUCENE-9378 > URL: https://issues.apache.org/jira/browse/LUCENE-9378 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Viral Gandhi >Priority: Major > Attachments: hotspots-v76x.png, hotspots-v76x.png, hotspots-v76x.png, > hotspots-v76x.png, hotspots-v76x.png, hotspots-v77x.png, hotspots-v77x.png, > hotspots-v77x.png, hotspots-v77x.png, image-2020-06-12-22-17-30-339.png, > image-2020-06-12-22-17-53-961.png, image-2020-06-12-22-18-24-527.png, > image-2020-06-12-22-18-48-919.png, snapshot-v77x.nps, snapshot-v77x.nps, > snapshot-v77x.nps, snapshots-v76x.nps, snapshots-v76x.nps, snapshots-v76x.nps > > Time Spent: 5h 40m > Remaining Estimate: 0h > > Lucene 8.5.1 includes a change to always [compress > BinaryDocValues|https://issues.apache.org/jira/browse/LUCENE-9211]. This > caused a (~30%) reduction in our red-line QPS (throughput). > We think users should be given some way to opt in to this compression > feature instead of it always being enabled, which can have a substantial query > time cost as we saw during our upgrade. [~mikemccand] suggested one possible > approach by introducing a *mode* in Lucene80DocValuesFormat (COMPRESSED and > UNCOMPRESSED) and allowing users to create a custom Codec subclassing the > default Codec and pick the format they want. > The idea is similar to Lucene50StoredFieldsFormat which has two modes, > Mode.BEST_SPEED and Mode.BEST_COMPRESSION. > Here's a related issue for adding a benchmark covering BINARY doc values > query-time performance - [https://github.com/mikemccand/luceneutil/issues/61]
[GitHub] [lucene-solr] HoustonPutman commented on pull request #2078: SOLR-14986: Add warning to ref guide that using properties.name is an…
HoustonPutman commented on pull request #2078: URL: https://github.com/apache/lucene-solr/pull/2078#issuecomment-726206070 I believe you can actually make a warning box in asciidoc via:
```asciidoc
[WARNING]
```
I also prefer "overwriting" or "overriding" to "conflicting", I feel that more accurately describes what the user would be doing.
[jira] [Commented] (LUCENE-9378) Configurable compression for BinaryDocValues
[ https://issues.apache.org/jira/browse/LUCENE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230801#comment-17230801 ] Nico Tonozzi commented on LUCENE-9378: -- Thank you for the help here folks, and especially [~jpountz]! > Configurable compression for BinaryDocValues > > > Key: LUCENE-9378 > URL: https://issues.apache.org/jira/browse/LUCENE-9378 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Viral Gandhi >Priority: Major > Attachments: hotspots-v76x.png, hotspots-v76x.png, hotspots-v76x.png, > hotspots-v76x.png, hotspots-v76x.png, hotspots-v77x.png, hotspots-v77x.png, > hotspots-v77x.png, hotspots-v77x.png, image-2020-06-12-22-17-30-339.png, > image-2020-06-12-22-17-53-961.png, image-2020-06-12-22-18-24-527.png, > image-2020-06-12-22-18-48-919.png, snapshot-v77x.nps, snapshot-v77x.nps, > snapshot-v77x.nps, snapshots-v76x.nps, snapshots-v76x.nps, snapshots-v76x.nps > > Time Spent: 5h 40m > Remaining Estimate: 0h > > Lucene 8.5.1 includes a change to always [compress > BinaryDocValues|https://issues.apache.org/jira/browse/LUCENE-9211]. This > caused a (~30%) reduction in our red-line QPS (throughput). > We think users should be given some way to opt in to this compression > feature instead of it always being enabled, which can have a substantial query > time cost as we saw during our upgrade. [~mikemccand] suggested one possible > approach by introducing a *mode* in Lucene80DocValuesFormat (COMPRESSED and > UNCOMPRESSED) and allowing users to create a custom Codec subclassing the > default Codec and pick the format they want. > The idea is similar to Lucene50StoredFieldsFormat which has two modes, > Mode.BEST_SPEED and Mode.BEST_COMPRESSION. > Here's a related issue for adding a benchmark covering BINARY doc values > query-time performance - [https://github.com/mikemccand/luceneutil/issues/61]
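The mode idea discussed in LUCENE-9378 (one switch that picks between retrieval speed and compression ratio when values are written) can be sketched without any Lucene dependency. The class below is purely illustrative: `DocValuesModeSketch`, its `Mode` enum and the `encode`/`decode` methods are hypothetical names, the mode names are borrowed from the `Lucene50StoredFieldsFormat` comparison in the issue text, and `java.util.zip` DEFLATE is used only as a convenient standard-library stand-in for whatever compression the real format applies.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class DocValuesModeSketch {
  /** Hypothetical switch; names borrowed from the stored-fields modes cited in the issue. */
  enum Mode { BEST_SPEED, BEST_COMPRESSION }

  private final Mode mode;

  DocValuesModeSketch(Mode mode) {
    this.mode = mode;
  }

  /** BEST_SPEED stores the bytes as-is; BEST_COMPRESSION deflates them. */
  byte[] encode(byte[] value) {
    if (mode == Mode.BEST_SPEED) {
      return value.clone();
    }
    Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
    deflater.setInput(value);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[256];
    while (!deflater.finished()) {
      int n = deflater.deflate(buffer);
      out.write(buffer, 0, n);
    }
    deflater.end();
    return out.toByteArray();
  }

  /** Reverses encode(); the caller supplies the original length, as an index would. */
  byte[] decode(byte[] stored, int originalLength) {
    if (mode == Mode.BEST_SPEED) {
      return stored.clone();
    }
    Inflater inflater = new Inflater();
    inflater.setInput(stored);
    byte[] result = new byte[originalLength];
    try {
      int offset = 0;
      while (offset < originalLength && !inflater.finished()) {
        int n = inflater.inflate(result, offset, originalLength - offset);
        if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
          break; // truncated or unexpected input; stop rather than spin
        }
        offset += n;
      }
    } catch (DataFormatException e) {
      throw new IllegalStateException("stored bytes are not valid DEFLATE data", e);
    } finally {
      inflater.end();
    }
    return result;
  }

  public static void main(String[] args) {
    byte[] value = new byte[1000]; // highly compressible: all zero bytes
    for (Mode mode : Mode.values()) {
      DocValuesModeSketch format = new DocValuesModeSketch(mode);
      System.out.println(mode + " stores " + format.encode(value).length + " bytes");
    }
  }
}
```

The point of the sketch is the shape of the tradeoff: the speed-oriented mode does no work on read, while the compression-oriented mode pays a decode cost on every retrieval, which is the query-time cost the reporter measured.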
[jira] [Reopened] (LUCENE-9499) Clean up package name conflicts between modules (split packages)
[ https://issues.apache.org/jira/browse/LUCENE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reopened LUCENE-9499: --- Hi, since this commit, the Windows builds fail reproducibly: FAILURE: Build failed with an exception. * Where: Script 'C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\gradle\validation\check-broken-links.gradle' line: 63 * What went wrong: Execution failed for task ':lucene:documentation:checkBrokenLinks'. > Broken links check failed. Command output at: > C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\lucene\documentation\build\tmp\checkBrokenLinks\check-broken-links-output.txt Currently, a build is running on Jenkins, so the mentioned temp file is not yet there: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Windows/ws/lucene/documentation/build/ Will post once visible. > Clean up package name conflicts between modules (split packages) > > > Key: LUCENE-9499 > URL: https://issues.apache.org/jira/browse/LUCENE-9499 > Project: Lucene - Core > Issue Type: Improvement >Affects Versions: master (9.0) >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Major > Fix For: master (9.0) > > Time Spent: 20m > Remaining Estimate: 0h > > We have lots of package name conflicts (shared package names) between modules > in the source tree. It is not only annoying for devs/users but also indeed > bad practice since Java 9 (according to my understanding), and we already > have some problems with Javadocs due to these split packages, as some of us > would know. Also split packages make migrating to the Java 9 module system > impossible. > This is the placeholder to fix all package name conflicts in Lucene. > See the dev list thread for more background. > > [https://lists.apache.org/thread.html/r6496963e89a5e0615e53206429b6843cc5d3e923a2045cc7b7a1eb03%40%3Cdev.lucene.apache.org%3E] > Modules that need to be fixed / cleaned up: > - analyzers-common (LUCENE-9317) > - analyzers-icu (LUCENE-9558) > - backward-codecs (LUCENE-9318) > - sandbox (LUCENE-9319) > - misc (LUCENE-9600) > - (test-framework: this can be excluded for the moment) > Also lucene-core will be heavily affected (some classes have to be moved into > {{core}}, or the visibility of some classes and methods in {{core}} has to be > relaxed). > Probably most work would be done in a parallel manner, but conflicts can > happen. If anyone wants to help out, please open an issue before working and > share your thoughts with me and others. > I set "Fix version" to 9.0, which means that once we make a commit here, this will > be a blocker for release 9.0.0. (I don't think the changes should be > delivered across two major releases; all changes have to be out at once in a > major release.) If there are any objections or concerns, please leave > comments. For now I have no idea about the total volume of changes or > technical obstacles that have to be handled.
[jira] [Comment Edited] (LUCENE-9499) Clean up package name conflicts between modules (split packages)
[ https://issues.apache.org/jira/browse/LUCENE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230851#comment-17230851 ] Uwe Schindler edited comment on LUCENE-9499 at 11/12/20, 6:39 PM: -- Hi, since this commit (#2072), the Windows builds fail reproducible: FAILURE: Build failed with an exception. * Where: Script 'C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\gradle\validation\check-broken-links.gradle' line: 63 * What went wrong: Execution failed for task ':lucene:documentation:checkBrokenLinks'. > Broken links check failed. Command output at: > C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\lucene\documentation\build\tmp\checkBrokenLinks\check-broken-links-output.txt Currently, a build is running on Jenkins, so the mentioned temp file is not yet there: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Windows/ws/lucene/documentation/build/ Will post once visible. was (Author: thetaphi): Hi, since this commit, the Windows builds fail reproducible: FAILURE: Build failed with an exception. * Where: Script 'C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\gradle\validation\check-broken-links.gradle' line: 63 * What went wrong: Execution failed for task ':lucene:documentation:checkBrokenLinks'. > Broken links check failed. Command output at: > C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\lucene\documentation\build\tmp\checkBrokenLinks\check-broken-links-output.txt Currently, a build is running on Jenkins, so the mentioned temp file is not yet there: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Windows/ws/lucene/documentation/build/ Will post once visible. 
> Clean up package name conflicts between modules (split packages)
>
> Key: LUCENE-9499
> URL: https://issues.apache.org/jira/browse/LUCENE-9499
> Project: Lucene - Core
> Issue Type: Improvement
> Affects Versions: master (9.0)
> Reporter: Tomoko Uchida
> Assignee: Tomoko Uchida
> Priority: Major
> Fix For: master (9.0)
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> We have lots of package name conflicts (shared package names) between modules in the source tree. It is not only annoying for devs/users but also indeed bad practice since Java 9 (according to my understanding), and we already have some problems with Javadocs due to these split packages, as some of us would know. Also, split packages make migrating to the Java 9 module system impossible.
> This is the placeholder to fix all package name conflicts in Lucene.
> See the dev list thread for more background.
>
> [https://lists.apache.org/thread.html/r6496963e89a5e0615e53206429b6843cc5d3e923a2045cc7b7a1eb03%40%3Cdev.lucene.apache.org%3E]
> Modules that need to be fixed / cleaned up:
> - analyzers-common (LUCENE-9317)
> - analyzers-icu (LUCENE-9558)
> - backward-codecs (LUCENE-9318)
> - sandbox (LUCENE-9319)
> - misc (LUCENE-9600)
> - (test-framework: this can be excluded for the moment)
> Also lucene-core will be heavily affected (some classes have to be moved into {{core}}, or the visibility of some classes and methods in {{core}} has to be relaxed).
> Probably most work would be done in a parallel manner, but conflicts can happen. If someone wants to help out, please open an issue before working and share your thoughts with me and others.
> I set "Fix version" to 9.0, which means that once we make a commit here, this will be a blocker for release 9.0.0. (I don't think the changes should be delivered across two major releases; all changes have to be out at once in a major release.) If there are any objections or concerns, please leave comments. For now I have no idea about the total volume of changes or technical obstacles that have to be handled.
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
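To make the term concrete: a package is "split" when two or more modules each contain classes in it. Below is a minimal sketch of a detector, assuming a hypothetical map from module name to the package names it contains. The class name, method name, and sample data are illustrative only and are not taken from the Lucene build.

```java
import java.util.*;

public class SplitPackageCheck {
    /**
     * Returns package name -> modules containing it, restricted to packages
     * that appear in more than one module (i.e. split packages).
     */
    public static Map<String, Set<String>> findSplitPackages(Map<String, Set<String>> packagesByModule) {
        Map<String, Set<String>> owners = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : packagesByModule.entrySet()) {
            for (String pkg : e.getValue()) {
                owners.computeIfAbsent(pkg, k -> new TreeSet<>()).add(e.getKey());
            }
        }
        // Keep only packages owned by at least two modules.
        owners.values().removeIf(mods -> mods.size() < 2);
        return owners;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, loosely modelled on the core/misc overlap.
        Map<String, Set<String>> modules = new HashMap<>();
        modules.put("core", Set.of("org.apache.lucene.document", "org.apache.lucene.index"));
        modules.put("misc", Set.of("org.apache.lucene.document", "org.apache.lucene.misc"));
        System.out.println(findSplitPackages(modules));
    }
}
```

An empty result is what the Java module system requires: each package may belong to at most one named module.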
[jira] [Commented] (LUCENE-9499) Clean up package name conflicts between modules (split packages)
[ https://issues.apache.org/jira/browse/LUCENE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230855#comment-17230855 ] Uwe Schindler commented on LUCENE-9499: --- And this may be removed, as we have no split packages anymore: https://gitbox.apache.org/repos/asf?p=lucene-solr.git;a=blob;f=gradle/documentation/render-javadoc.gradle;h=bbd1b5e603a0c9f513c836452a18b9ce9caa83e7;hb=426a9c2#l277
[jira] [Commented] (LUCENE-9499) Clean up package name conflicts between modules (split packages)
[ https://issues.apache.org/jira/browse/LUCENE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230907#comment-17230907 ] Uwe Schindler commented on LUCENE-9499: --- The problem is the following:
{noformat}
Crawl/parse...
Verify...
file:///C%3A/Users/jenkins/workspace/Lucene-Solr-master-Windows/lucene/documentation/build/site/core/org/apache/lucene/index/PointValues.html
BROKEN LINK: file:///C%3A/Users/jenkins/workspace/Lucene-Solr-master-Windows/lucene/documentation/build/site/misc/org/apache/lucene/document/InetAddressPoint.html
Broken javadocs links were found! Common root causes:
* A typo of some sort for manually created links.
* Public methods referencing non-public classes in their signature.
{noformat}
It looks like a relative link is missing, because all those links should be relative. The link checker on Unix does not detect this because it also accepts absolute links, but because the link is completely wrongly escaped here, the error was caught. I am not sure whether it really came from this commit, but I think this is the correct place to fix it.
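As an illustration of how a relative href on the PointValues page can end up pointing into the misc tree, here is a minimal sketch using `java.net.URI`. The page URI is shortened from the log above, and the href is a hypothetical hand-written link, not a copy of the actual source line; the class and method names are also made up for this example.

```java
import java.net.URI;

public class LinkResolveSketch {
    /** Resolves href against the page URI and returns the decoded path of the target. */
    public static String targetPath(String pageUri, String href) {
        return URI.create(pageUri).resolve(href).getPath();
    }

    public static void main(String[] args) {
        // Page URI shaped like the one in the log (drive-letter colon percent-encoded as %3A).
        String page = "file:///C%3A/ws/site/core/org/apache/lucene/index/PointValues.html";
        // Hypothetical href that climbs out of core/ and into misc/.
        String href = "../../../../../misc/org/apache/lucene/document/InetAddressPoint.html";
        System.out.println(targetPath(page, href));
    }
}
```

`getPath()` returns the decoded form, so `%3A` comes back as `:`; `getRawPath()` would keep the escape. A checker that compares raw strings from one side with decoded strings from the other would see a mismatch on Windows drive letters only.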
[jira] [Commented] (LUCENE-9499) Clean up package name conflicts between modules (split packages)
[ https://issues.apache.org/jira/browse/LUCENE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230908#comment-17230908 ] Uwe Schindler commented on LUCENE-9499: --- I think that's caused by LUCENE-9600, will reopen that.
[jira] [Reopened] (LUCENE-9600) Clean up package name conflicts for misc module
[ https://issues.apache.org/jira/browse/LUCENE-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reopened LUCENE-9600: --- I think this caused the bug reported in LUCENE-9499. > Clean up package name conflicts for misc module > --- > > Key: LUCENE-9600 > URL: https://issues.apache.org/jira/browse/LUCENE-9600 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/misc >Affects Versions: master (9.0) >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Minor > Fix For: master (9.0) > > Time Spent: 1h 50m > Remaining Estimate: 0h > > misc module shares the package names o.a.l.document, o.a.l.index, > o.a.l.search, o.a.l.store, and o.a.l.util with lucene-core. Those should be > moved under o.a.l.misc (or some classes should be moved to core?). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9378) Configurable compression for BinaryDocValues
[ https://issues.apache.org/jira/browse/LUCENE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230912#comment-17230912 ] ASF subversion and git services commented on LUCENE-9378: - Commit a48dc123b3e65928d5e59a76735cf6c88099915a in lucene-solr's branch refs/heads/branch_8x from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a48dc12 ] LUCENE-9378: Make it possible to configure how to trade speed for compression on doc values. (#2069) This adds a switch to `Lucene80DocValuesFormat` which allows to configure whether to prioritize retrieval speed over compression ratio or the other way around. When prioritizing retrieval speed, binary doc values are written using the exact same format as before more aggressive compression got introduced. > Configurable compression for BinaryDocValues > > > Key: LUCENE-9378 > URL: https://issues.apache.org/jira/browse/LUCENE-9378 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Viral Gandhi >Priority: Major > Attachments: hotspots-v76x.png, hotspots-v76x.png, hotspots-v76x.png, > hotspots-v76x.png, hotspots-v76x.png, hotspots-v77x.png, hotspots-v77x.png, > hotspots-v77x.png, hotspots-v77x.png, image-2020-06-12-22-17-30-339.png, > image-2020-06-12-22-17-53-961.png, image-2020-06-12-22-18-24-527.png, > image-2020-06-12-22-18-48-919.png, snapshot-v77x.nps, snapshot-v77x.nps, > snapshot-v77x.nps, snapshots-v76x.nps, snapshots-v76x.nps, snapshots-v76x.nps > > Time Spent: 5h 40m > Remaining Estimate: 0h > > Lucene 8.5.1 includes a change to always [compress > BinaryDocValues|https://issues.apache.org/jira/browse/LUCENE-9211]. This > caused (~30%) reduction in our red-line QPS (throughput). > We think users should be given some way to opt-in for this compression > feature instead of always being enabled which can have a substantial query > time cost as we saw during our upgrade. 
[~mikemccand] suggested one possible > approach by introducing a *mode* in Lucene80DocValuesFormat (COMPRESSED and > UNCOMPRESSED) and allowing users to create a custom Codec subclassing the > default Codec and pick the format they want. > The idea is similar to Lucene50StoredFieldsFormat, which has two modes, > Mode.BEST_SPEED and Mode.BEST_COMPRESSION. > Here's a related issue for adding a benchmark covering BINARY doc values > query-time performance - [https://github.com/mikemccand/luceneutil/issues/61] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
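The speed-versus-size switch discussed in this issue can be sketched in plain Java. This is not Lucene's implementation and does not use Lucene's classes: the `Mode` enum below is a hypothetical stand-in named after the two modes mentioned above, and `java.util.zip` DEFLATE is swapped in only to keep the example free of dependencies.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class BinaryValueModeSketch {
    /** Hypothetical switch: favour retrieval speed or favour stored size. */
    public enum Mode { BEST_SPEED, BEST_COMPRESSION }

    /** Stores raw bytes in BEST_SPEED, DEFLATE-compressed bytes in BEST_COMPRESSION. */
    public static byte[] encode(byte[] value, Mode mode) {
        if (mode == Mode.BEST_SPEED) {
            return value.clone();
        }
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(value);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Reads back a value written by encode() with the same mode. */
    public static byte[] decode(byte[] stored, Mode mode) {
        if (mode == Mode.BEST_SPEED) {
            return stored.clone();
        }
        Inflater inflater = new Inflater();
        inflater.setInput(stored);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt input", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] value = "lucene ".repeat(100).getBytes(StandardCharsets.UTF_8);
        for (Mode mode : Mode.values()) {
            System.out.println(mode + ": " + encode(value, mode).length + " bytes stored");
        }
    }
}
```

The trade-off is the one the reporter describes: the compressed mode stores fewer bytes but pays a decompression cost on every read, which is where a query-time throughput drop would come from.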
[jira] [Commented] (LUCENE-9600) Clean up package name conflicts for misc module
[ https://issues.apache.org/jira/browse/LUCENE-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230918#comment-17230918 ] Uwe Schindler commented on LUCENE-9600: --- The problem is caused by this link: https://github.com/apache/lucene-solr/blob/32bf7bad4bb59a630942e72a7fe5d1d2cc47cb56/lucene/core/src/java/org/apache/lucene/index/PointValues.java#L51 It also affects Linux, as I just noticed.
[jira] [Comment Edited] (LUCENE-9600) Clean up package name conflicts for misc module
[ https://issues.apache.org/jira/browse/LUCENE-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230918#comment-17230918 ] Uwe Schindler edited comment on LUCENE-9600 at 11/12/20, 8:33 PM: -- The problem is caused by this link: https://github.com/apache/lucene-solr/blob/32bf7bad4bb59a630942e72a7fe5d1d2cc47cb56/lucene/core/src/java/org/apache/lucene/index/PointValues.java#L51 As InetAddressPoint was moved to core, we can make a standard {{@link}} out of it and import the class. We should also check the link in the line following this one. It also affects Linux, as I just noticed. was (Author: thetaphi): The problem is caused by this link: https://github.com/apache/lucene-solr/blob/32bf7bad4bb59a630942e72a7fe5d1d2cc47cb56/lucene/core/src/java/org/apache/lucene/index/PointValues.java#L51 It also affects Linux, as I just noticed.
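The difference between a hand-written href and a standard `{@link}` can be sketched as follows. This is an illustration only: `java.net.InetAddress` stands in for the Lucene class so the example compiles without dependencies, and the href in the comment is a made-up path, not the one from PointValues.java.

```java
import java.net.InetAddress;

/**
 * Fragile form: a hand-written relative href that silently breaks when the
 * target class moves, e.g. <a href="../../some/other/module/Target.html">Target</a>.
 *
 * <p>Robust form: {@link InetAddress} is resolved by the javadoc tool through
 * the import above, so the generated link follows the class wherever it lives,
 * and a stale import fails at compile time instead of at link-check time.
 */
public class JavadocLinkSketch {
    /** Returns the simple name of the type referenced by the {@code @link} tag above. */
    public static String linkedTypeName() {
        return InetAddress.class.getSimpleName();
    }
}
```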
[jira] [Resolved] (LUCENE-9378) Configurable compression for BinaryDocValues
[ https://issues.apache.org/jira/browse/LUCENE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-9378. -- Fix Version/s: 8.8 Resolution: Fixed
[jira] [Resolved] (LUCENE-9569) Temporarily disable sort optimization on _doc for 8.7 release.
[ https://issues.apache.org/jira/browse/LUCENE-9569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova resolved LUCENE-9569. - Fix Version/s: 8.7 Resolution: Fixed > Temporarily disable sort optimization on _doc for 8.7 release. > -- > > Key: LUCENE-9569 > URL: https://issues.apache.org/jira/browse/LUCENE-9569 > Project: Lucene - Core > Issue Type: Task >Affects Versions: 8.7 >Reporter: Mayya Sharipova >Priority: Minor > Fix For: 8.7 > > Time Spent: 20m > Remaining Estimate: 0h > > Sort optimization on _doc was introduced in LUCENE-9449, but it looks > unstable and led to some recent test failures. > As the release of 8.7 is very soon, we need to temporarily disable this sort > optimization for _doc for this release with a plan to stabilize it for later > releases. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9569) Temporarily disable sort optimization on _doc for 8.7 release.
[ https://issues.apache.org/jira/browse/LUCENE-9569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230994#comment-17230994 ] Mayya Sharipova commented on LUCENE-9569: - [~jpountz] Thank you for the reminder. I have reverted this commit, so sort optimization on _doc should again be enabled on branch_8x.
[jira] [Commented] (LUCENE-9569) Temporarily disable sort optimization on _doc for 8.7 release.
[ https://issues.apache.org/jira/browse/LUCENE-9569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230997#comment-17230997 ] ASF subversion and git services commented on LUCENE-9569: - Commit e4038142cc33cfcdf58fa2fdde9d8e66251ae4bf in lucene-solr's branch refs/heads/branch_8x from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e403814 ] Revert "LUCENE-9569 Disalbe sort opt on _doc (#1959)" Re-enabled sort optimization on _doc This reverts commit 1c0f07ac03f0235adaf5c150f1c6656336e4282f.
[jira] [Commented] (LUCENE-9450) Taxonomy index should use DocValues not StoredFields
[ https://issues.apache.org/jira/browse/LUCENE-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17231006#comment-17231006 ] ASF subversion and git services commented on LUCENE-9450: - Commit 3f8f84f9b063277e9017221bfc5e80fb901fc1ce in lucene-solr's branch refs/heads/master from Gautam Worah [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=3f8f84f ] LUCENE-9450 Switch to BinaryDocValues instead of stored fields in Lucene's facet implementation, yielding ~4-5% red-line QPS gain in pure faceting benchmarks (#1733) > Taxonomy index should use DocValues not StoredFields > > > Key: LUCENE-9450 > URL: https://issues.apache.org/jira/browse/LUCENE-9450 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/facet >Affects Versions: 8.5.2 >Reporter: Gautam Worah >Priority: Minor > Labels: performance > Attachments: LUCENE-9450-localrun.py-v1, wip_taxonomy_patch > > Time Spent: 3h 50m > Remaining Estimate: 0h > > The taxonomy index that maps binning labels to ordinals was created before > Lucene added BinaryDocValues. > I've attached a WIP patch (does not pass tests currently) > Issue suggested by [~mikemccand] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand merged pull request #1733: LUCENE-9450 Use BinaryDocValues in the taxonomy writer
mikemccand merged pull request #1733: URL: https://github.com/apache/lucene-solr/pull/1733 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9600) Clean up package name conflicts for misc module
[ https://issues.apache.org/jira/browse/LUCENE-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17231029#comment-17231029 ] ASF subversion and git services commented on LUCENE-9600: - Commit af47cb7bcdd4eb10263a0586474c6e255307 in lucene-solr's branch refs/heads/master from Uwe Schindler [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=af47cb7 ] LUCENE-9600: Fix wrong link
[jira] [Resolved] (LUCENE-9499) Clean up package name conflicts between modules (split packages)
[ https://issues.apache.org/jira/browse/LUCENE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-9499. --- Resolution: Fixed Fixed by LUCENE-9600.
[jira] [Comment Edited] (LUCENE-9499) Clean up package name conflicts between modules (split packages)
[ https://issues.apache.org/jira/browse/LUCENE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17231031#comment-17231031 ] Uwe Schindler edited comment on LUCENE-9499 at 11/12/20, 11:29 PM: --- Fixed by recent commit described in LUCENE-9600. was (Author: thetaphi): Fixed by LUCENE-9600.
[jira] [Resolved] (LUCENE-9600) Clean up package name conflicts for misc module
[ https://issues.apache.org/jira/browse/LUCENE-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-9600. --- Resolution: Fixed This should be fixed now. > Clean up package name conflicts for misc module > --- > > Key: LUCENE-9600 > URL: https://issues.apache.org/jira/browse/LUCENE-9600 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/misc >Affects Versions: master (9.0) >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Minor > Fix For: master (9.0) > > Time Spent: 1h 50m > Remaining Estimate: 0h > > misc module shares the package names o.a.l.document, o.a.l.index, > o.a.l.search, o.a.l.store, and o.a.l.util with lucene-core. Those should be > moved under o.a.l.misc (or some classes should be moved to core?).
[GitHub] [lucene-solr] cpoerschke commented on pull request #1571: SOLR-14560: Interleaving for Learning To Rank
cpoerschke commented on pull request #1571: URL: https://github.com/apache/lucene-solr/pull/1571#issuecomment-726414214 Hi @alessandrobenedetti, I returned to this pull request today, both the changes above and the tests (which I hadn't looked at before). Very comprehensive test coverage, thank you. Have pushed all my remaining insights to the https://github.com/cpoerschke/lucene-solr/commits/feature/SOLR-14560-cpoerschke-2 branch -- for the https://github.com/cpoerschke/lucene-solr/commit/4912daccd596435f5c61ac1a3cf86eaebb039118 and https://github.com/cpoerschke/lucene-solr/commit/3a61287a0e4fb5a77f080a92e7129582b234cbd7 commits which are perhaps a bit subtle I've added annotations on the pull request here -- the other commits are hopefully relatively self-explanatory. Let me know what you think, I agree the commit phase is fast approaching here :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] cpoerschke commented on a change in pull request #1571: SOLR-14560: Interleaving for Learning To Rank
cpoerschke commented on a change in pull request #1571: URL: https://github.com/apache/lucene-solr/pull/1571#discussion_r522486838 ## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/response/transform/LTRFeatureLoggerTransformerFactory.java ## @@ -208,55 +216,116 @@ public void setContext(ResultContext context) { if (threadManager != null) { threadManager.setExecutor(context.getRequest().getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor()); } - - // Setup LTRScoringQuery - scoringQuery = SolrQueryRequestContextUtils.getScoringQuery(req); - docsWereNotReranked = (scoringQuery == null); - String featureStoreName = SolrQueryRequestContextUtils.getFvStoreName(req); - if (docsWereNotReranked || (featureStoreName != null && (!featureStoreName.equals(scoringQuery.getScoringModel().getFeatureStoreName() { -// if store is set in the transformer we should overwrite the logger -final ManagedFeatureStore fr = ManagedFeatureStore.getManagedFeatureStore(req.getCore()); + LTRScoringQuery[] rerankingQueriesFromContext = SolrQueryRequestContextUtils.getScoringQueries(req); + docsWereNotReranked = (rerankingQueriesFromContext == null || rerankingQueriesFromContext.length == 0); + String transformerFeatureStore = SolrQueryRequestContextUtils.getFvStoreName(req); + Map transformerExternalFeatureInfo = LTRQParserPlugin.extractEFIParams(localparams); -final FeatureStore store = fr.getFeatureStore(featureStoreName); -featureStoreName = store.getName(); // if featureStoreName was null before this gets actual name - -try { - final LoggingModel lm = new LoggingModel(loggingModelName, - featureStoreName, store.getFeatures()); + initLoggingModel(transformerFeatureStore); Review comment: LTRFeatureLoggerTransformerFactory.2 - `loggingModel` being a member of the transformer factory gives it `SolrCore` lifetime/scope but here it's initialised based on per-request parameters. 
If multiple threads use the same transformer factory object concurrently then they might trample upon each other. https://github.com/cpoerschke/lucene-solr/commit/4912daccd596435f5c61ac1a3cf86eaebb039118 proposes to not have the logging model as a member of the transformer factory. ## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/response/transform/LTRFeatureLoggerTransformerFactory.java ## @@ -79,6 +81,7 @@ private char csvFeatureSeparator = CSVFeatureLogger.DEFAULT_FEATURE_SEPARATOR; private LTRThreadModule threadManager = null; + private LoggingModel loggingModel = null; Review comment: LTRFeatureLoggerTransformerFactory.1 - `loggingModel` is a member of the transformer factory here. ## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/response/transform/LTRFeatureLoggerTransformerFactory.java ## @@ -208,55 +216,116 @@ public void setContext(ResultContext context) { if (threadManager != null) { threadManager.setExecutor(context.getRequest().getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor()); } Review comment: LTRFeatureLoggerTransformerFactory.3 - I noted that `threadManager` here is an existing member of the transformer factory and it is initialised as part of request processing. Since there's no locking or anything there could be a chance that multiple threads concurrently call `threadManager.setExecutor()` but the argument to the set call is not specific to the request i.e. all requests would set the same thing (whereas for the logging model different requests could supply a different feature store name via the `fl=[feature store=...]` parameter). 
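To illustrate the scoping concern raised in the three comments above, here is a minimal sketch, assuming simplified stand-in classes. `SketchLoggingModel`, `SketchTransformerFactory` and `SketchTransformer` are hypothetical names and are not the Solr LTR classes; the point is only that state derived from request parameters lives in the per-request object rather than in the shared factory.

```java
// Hypothetical stand-in for a logging model; only the feature store name matters here.
final class SketchLoggingModel {
  final String featureStoreName;

  SketchLoggingModel(String featureStoreName) {
    this.featureStoreName = featureStoreName;
  }
}

// Hypothetical per-request object: it owns the request-specific logging model.
final class SketchTransformer {
  private final SketchLoggingModel loggingModel;

  SketchTransformer(SketchLoggingModel loggingModel) {
    this.loggingModel = loggingModel;
  }

  String featureStoreName() {
    return loggingModel.featureStoreName;
  }
}

// Hypothetical factory shared by all requests: it keeps no per-request fields,
// so concurrent requests cannot overwrite each other's logging model.
final class SketchTransformerFactory {
  SketchTransformer create(String requestedFeatureStore) {
    // The model is a local variable handed to the new transformer, not a factory member.
    SketchLoggingModel model = new SketchLoggingModel(requestedFeatureStore);
    return new SketchTransformer(model);
  }
}
```

With this shape, two requests that name different feature stores each get their own model, whatever order the threads run in.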
## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/response/transform/LTRFeatureLoggerTransformerFactory.java ## @@ -208,55 +216,116 @@ public void setContext(ResultContext context) { if (threadManager != null) { threadManager.setExecutor(context.getRequest().getCore().getCoreContainer().getUpdateShardHandler().getUpdateExecutor()); } - - // Setup LTRScoringQuery - scoringQuery = SolrQueryRequestContextUtils.getScoringQuery(req); - docsWereNotReranked = (scoringQuery == null); - String featureStoreName = SolrQueryRequestContextUtils.getFvStoreName(req); - if (docsWereNotReranked || (featureStoreName != null && (!featureStoreName.equals(scoringQuery.getScoringModel().getFeatureStoreName() { -// if store is set in the transformer we should overwrite the logger -final ManagedFeatureStore fr = ManagedFeatureStore.getManagedFeatureStore(req.getCore()); + LTRScoringQuery[] rerankingQueriesFromContext = SolrQueryRequestContextUtils.getScoringQueries(req); + docsWereNotReranked = (rerankingQueriesFromContext == null || rerankingQueriesFromContext.le
[GitHub] [lucene-solr] cpoerschke commented on a change in pull request #1571: SOLR-14560: Interleaving for Learning To Rank
cpoerschke commented on a change in pull request #1571: URL: https://github.com/apache/lucene-solr/pull/1571#discussion_r522515743 ## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/interleaving/TeamDraftInterleaving.java ## @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.ltr.interleaving; + +import java.util.ArrayList; +import java.util.HashSet; +import java.util.LinkedHashSet; +import java.util.Random; +import java.util.Set; + +import org.apache.lucene.search.ScoreDoc; + +/** + * Interleaving was introduced the first time by Joachims in [1, 2]. + * Team Draft Interleaving is among the most successful and used interleaving approaches[3]. + * Here the authors implement a method similar to the way in which captains select their players in team-matches. + * Team Draft Interleaving produces a fair distribution of ranking models’ elements in the final interleaved list. + * It has also proved to overcome an issue of the previous implemented approach, Balanced interleaving, in determining the winning model[4]. + * + * [1] T. Joachims. Optimizing search engines using clickthrough data. 
KDD (2002) + * [2] T. Joachims. Evaluating retrieval performance using clickthrough data. In J. Franke, G. Nakhaeizadeh, and I. Renz, editors, + * Text Mining, pages 79–96. Physica/Springer (2003) + * [3] F. Radlinski, M. Kurup, and T. Joachims. How does clickthrough data reflect + * retrieval quality? In CIKM, pages 43–52. ACM Press (2008) + * [4] O. Chapelle, T. Joachims, F. Radlinski, and Y. Yue. + * Large-scale validation and analysis of interleaved search evaluation. ACM TOIS, 30(1):1–41, Feb. (2012) + */ +public class TeamDraftInterleaving implements Interleaving{ + public static Random RANDOM; + + static { +// We try to make things reproducible in the context of our tests by initializing the random instance +// based on the current seed +String seed = System.getProperty("tests.seed"); +if (seed == null) { + RANDOM = new Random(); +} else { + RANDOM = new Random(seed.hashCode()); +} + } + + /** + * Team Draft Interleaving considers two ranking models: modelA and modelB. + * For a given query, each model returns its ranked list of documents La = (a1,a2,...) and Lb = (b1, b2, ...). + * The algorithm creates a unique ranked list I = (i1, i2, ...). + * This list is created by interleaving elements from the two lists la and lb as described by Chapelle et al.[1]. + * Each element Ij is labelled TeamA if it is selected from La and TeamB if it is selected from Lb. + * + * [1] O. Chapelle, T. Joachims, F. Radlinski, and Y. Yue. + * Large-scale validation and analysis of interleaved search evaluation. ACM TOIS, 30(1):1–41, Feb. (2012) + * + * Assumptions: + * - rerankedA and rerankedB has the same length. + * They contains the same search results, ranked differently by two ranking models + * - each reranked list can not contain the same search result more than once. 
+ * + * @param rerankedA a ranked list of search results produced by a ranking model A + * @param rerankedB a ranked list of search results produced by a ranking model B + * @return the interleaved ranking list + */ + public InterleavingResult interleave(ScoreDoc[] rerankedA, ScoreDoc[] rerankedB) { +LinkedHashSet interleavedResults = new LinkedHashSet<>(); +ScoreDoc[] interleavedResultArray = new ScoreDoc[rerankedA.length]; +ArrayList> interleavingPicks = new ArrayList<>(2); +Set teamA = new HashSet<>(); +Set teamB = new HashSet<>(); +int topN = rerankedA.length; +int indexA = 0, indexB = 0; + +while (interleavedResults.size() < topN && indexA < rerankedA.length && indexB < rerankedB.length) { + if(teamA.size() interleaved, int index, ScoreDoc[] reranked) { +boolean foundElementToAdd = false; +while (index < reranked.length && !foundElementToAdd) { + ScoreDoc elementToCheck = reranked[index]; + if (interleaved.contains(elementToCheck)) { Review comment: > ... currently Interleaving doesn't support sharding ... Let's include that in the documentation somehow, e.g. https://github.com/cp
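The Javadoc quoted above describes Team Draft Interleaving in words, and the method body in this message is cut off. The following is a minimal sketch of the technique as the Javadoc describes it, not the code from the pull request. It assumes plain integer document ids instead of `ScoreDoc`, and `TeamDraftSketch` with its fields is a hypothetical name. The team with fewer picks chooses next, a coin flip breaks ties, and each pick is the highest-ranked document not yet in the interleaved list.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical result holder and algorithm sketch; document ids are plain ints here.
final class TeamDraftSketch {
  final LinkedHashSet<Integer> interleaved = new LinkedHashSet<>(); // final ranking, in pick order
  final Set<Integer> teamA = new HashSet<>(); // documents picked from model A's list
  final Set<Integer> teamB = new HashSet<>(); // documents picked from model B's list

  static TeamDraftSketch interleave(int[] rankedA, int[] rankedB, Random random) {
    TeamDraftSketch result = new TeamDraftSketch();
    int topN = rankedA.length;
    int indexA = 0;
    int indexB = 0;
    while (result.interleaved.size() < topN) {
      // Skip documents that the other team has already drafted.
      while (indexA < rankedA.length && result.interleaved.contains(rankedA[indexA])) indexA++;
      while (indexB < rankedB.length && result.interleaved.contains(rankedB[indexB])) indexB++;
      boolean aHasNext = indexA < rankedA.length;
      boolean bHasNext = indexB < rankedB.length;
      if (!aHasNext && !bHasNext) break;

      boolean aPicks;
      if (!bHasNext) {
        aPicks = true;
      } else if (!aHasNext) {
        aPicks = false;
      } else if (result.teamA.size() != result.teamB.size()) {
        aPicks = result.teamA.size() < result.teamB.size(); // smaller team picks next
      } else {
        aPicks = random.nextBoolean(); // coin flip when the teams are level
      }

      if (aPicks) {
        int doc = rankedA[indexA++];
        result.interleaved.add(doc);
        result.teamA.add(doc);
      } else {
        int doc = rankedB[indexB++];
        result.interleaved.add(doc);
        result.teamB.add(doc);
      }
    }
    return result;
  }
}
```

Under the stated assumptions (both lists hold the same documents, without duplicates), an even-length input ends with teams of equal size, which is the fairness property the Javadoc refers to.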
[GitHub] [lucene-solr] cpoerschke commented on a change in pull request #1571: SOLR-14560: Interleaving for Learning To Rank
cpoerschke commented on a change in pull request #1571: URL: https://github.com/apache/lucene-solr/pull/1571#discussion_r522516056 ## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/response/transform/LTRFeatureLoggerTransformerFactory.java ## @@ -210,50 +216,59 @@ public void setContext(ResultContext context) { } // Setup LTRScoringQuery - scoringQuery = SolrQueryRequestContextUtils.getScoringQuery(req); - docsWereNotReranked = (scoringQuery == null); - String featureStoreName = SolrQueryRequestContextUtils.getFvStoreName(req); - if (docsWereNotReranked || (featureStoreName != null && (!featureStoreName.equals(scoringQuery.getScoringModel().getFeatureStoreName() { -// if store is set in the transformer we should overwrite the logger - -final ManagedFeatureStore fr = ManagedFeatureStore.getManagedFeatureStore(req.getCore()); - -final FeatureStore store = fr.getFeatureStore(featureStoreName); -featureStoreName = store.getName(); // if featureStoreName was null before this gets actual name - -try { - final LoggingModel lm = new LoggingModel(loggingModelName, - featureStoreName, store.getFeatures()); - - scoringQuery = new LTRScoringQuery(lm, - LTRQParserPlugin.extractEFIParams(localparams), - true, - threadManager); // request feature weights to be created for all features - -}catch (final Exception e) { - throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, - "retrieving the feature store "+featureStoreName, e); -} - } + rerankingQueries = SolrQueryRequestContextUtils.getScoringQueries(req); - if (scoringQuery.getOriginalQuery() == null) { -scoringQuery.setOriginalQuery(context.getQuery()); + docsWereNotReranked = (rerankingQueries == null || rerankingQueries.length == 0); + if (docsWereNotReranked) { +rerankingQueries = new LTRScoringQuery[]{null}; } - if (scoringQuery.getFeatureLogger() == null){ -scoringQuery.setFeatureLogger( SolrQueryRequestContextUtils.getFeatureLogger(req) ); - } - scoringQuery.setRequest(req); - - featureLogger = 
scoringQuery.getFeatureLogger(); + modelWeights = new LTRScoringQuery.ModelWeight[rerankingQueries.length]; + String featureStoreName = SolrQueryRequestContextUtils.getFvStoreName(req); + for (int i = 0; i < rerankingQueries.length; i++) { +LTRScoringQuery scoringQuery = rerankingQueries[i]; +if ((scoringQuery == null || !(scoringQuery instanceof OriginalRankingLTRScoringQuery)) && (docsWereNotReranked || (featureStoreName != null && !featureStoreName.equals(scoringQuery.getScoringModel().getFeatureStoreName() { Review comment: > ... I believe the code is much more readable now ... now that part is extremely clear. Yes, I agree, very nice. > ... another consideration sparkled: ... a separate Jira for that ... Interesting points, will need to think about them a bit (next week). I agree it's unrelated i.e. not a blocker for this pull request here. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14928) Remove Overseer ClusterStateUpdater
[ https://issues.apache.org/jira/browse/SOLR-14928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17231069#comment-17231069 ] Ilan Ginzburg commented on SOLR-14928: -- Just force-pushed a [new commit|https://github.com/murblanc/lucene-solr/commit/e77e0ec784f024438788c6b7eb5ca12785cac5d2] to the same branch. Nothing is tested yet, but the latest drop is somewhat more complete and hopefully I'll be able to debug then run it for collection creation (the code supports the {{CreateCollectionCmd}} cluster state changes only, not the nodes advertising that the replicas are up) and get a feel for how it behaves. > Remove Overseer ClusterStateUpdater > --- > > Key: SOLR-14928 > URL: https://issues.apache.org/jira/browse/SOLR-14928 > Project: Solr > Issue Type: Sub-task > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud >Reporter: Ilan Ginzburg >Assignee: Ilan Ginzburg >Priority: Major > Labels: cluster, collection-api, overseer > > Remove the Overseer {{ClusterStateUpdater}} thread and associated Zookeeper > queue at {{<_chroot_>/overseer/queue}}. > Change cluster state updates so that each (Collection API) command execution > does the update directly in Zookeeper using optimistic locking (Compare and > Swap on the {{state.json}} Zookeeper files). > Following this change cluster state updates would still be happening only > from the Overseer node (that's where Collection API commands are executing), > but the code will be ready for distribution once such commands can be > executed by any node (other work done in the context of parent task > SOLR-14927). > See the [Cluster State > Updater|https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/edit#heading=h.ymtfm3p518c] > section in the Removing Overseer doc. 
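The optimistic locking mentioned in the issue description (Compare and Swap on {{state.json}}) can be sketched with an in-memory stand-in. This is a minimal sketch under stated assumptions: `VersionedNode` and `OptimisticUpdater` are hypothetical names, the node only imitates a versioned Zookeeper file, and no Zookeeper client API is used.

```java
import java.util.function.UnaryOperator;

// Hypothetical in-memory stand-in for a versioned file such as state.json.
final class VersionedNode {
  static final class Snapshot {
    final String data;
    final int version;

    Snapshot(String data, int version) {
      this.data = data;
      this.version = version;
    }
  }

  private String data;
  private int version;

  VersionedNode(String initialData) {
    this.data = initialData;
  }

  synchronized Snapshot read() {
    return new Snapshot(data, version);
  }

  // Writes only if nobody else has written since expectedVersion was read.
  synchronized boolean compareAndSet(int expectedVersion, String newData) {
    if (version != expectedVersion) {
      return false;
    }
    data = newData;
    version++;
    return true;
  }
}

final class OptimisticUpdater {
  // Re-reads and re-applies the change until the conditional write succeeds.
  // Returns the number of attempts that were needed.
  static int update(VersionedNode node, UnaryOperator<String> change) {
    int attempts = 0;
    while (true) {
      attempts++;
      VersionedNode.Snapshot snapshot = node.read();
      String updated = change.apply(snapshot.data);
      if (node.compareAndSet(snapshot.version, updated)) {
        return attempts;
      }
    }
  }
}
```

The retry loop replaces the single writer thread: a losing writer simply recomputes its change against the newer state, so no queue is needed to serialise updates.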
[GitHub] [lucene-solr] zacharymorn commented on pull request #2068: LUCENE-8982: Separate out native code to another module to allow cpp build with gradle
zacharymorn commented on pull request #2068: URL: https://github.com/apache/lucene-solr/pull/2068#issuecomment-726467539 > > On the other hand, I think these tests will break if run from IDEs. Do we need to support that in this PR? > > Oh, thanks! I'll take a look after I come back from work. I think the PR should be clean in that it doesn't break other people's workflow, so yes - if you added tests they should run or be quietly ignored if they're not supported. I'll take a look. Agreed. I just looked around as well, and it seems the @Nightly annotation might be a possible workaround here to disable running these tests from an IDE. Granted, it's a bit of a hack, as it also disables running those tests in a developer's local gradle build by default, but at least these tests will run in the nightly build and bugs can be caught (albeit a bit late). Another solution is perhaps to create a new annotation that disables tests when they run from an IDE, if such a thing can be detected?
[jira] [Commented] (LUCENE-9508) DocumentsWriter doesn't check for BlockedFlushes in stall mode``
[ https://issues.apache.org/jira/browse/LUCENE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17231109#comment-17231109 ] Zach Chen commented on LUCENE-9508: --- Hi [~simonw], just to clarify, I wasn't the one who originally reported the issue. I just poked around in the code and ran some tests to see if I can help here. > DocumentsWriter doesn't check for BlockedFlushes in stall mode`` > > > Key: LUCENE-9508 > URL: https://issues.apache.org/jira/browse/LUCENE-9508 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: 8.5.1 >Reporter: Sorabh Hamirwasia >Priority: Major > Labels: IndexWriter > > Hi, > I was investigating an issue where the memory usage of a single Lucene > IndexWriter went up to ~23GB. Lucene has a concept of stalling in case the > memory used by each index breaches the 2 X ramBuffer limit (10% of the JVM heap, > in this case ~3GB). So ideally memory usage should not go above that limit. I > looked into the heap dump and found that when the fullFlush thread enters the > *markForFullFlush* method, it tries to take the lock on the ThreadStates of all > the DWPT threads sequentially. If the lock on one of the ThreadStates is held by another thread, > then it will block indefinitely. This is what happened in my case, where one > of the DWPT threads was stuck in the indexing process. Because of this, the fullFlush > thread was unable to populate the flush queue even though the stall mode was > detected. As a result, a new indexing request that arrived on an indexing thread > slept for a second and then continued with indexing. In the > **preUpdate()** method it checks for the stalled case and whether there are any > pending flushes (based on the flush queue); if not, it sleeps and continues. > Questions: > 1) Should **preUpdate** look into the blocked flushes information as well > instead of just the flush queue? 
> 2) Should the fullFlush thread wait indefinitely for the lock on ThreadStates? Since a single blocked writing thread can block the full flush here. >
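Question 2 above asks whether the full flush should wait indefinitely for each lock. One alternative is a bounded wait, sketched here with `ReentrantLock.tryLock(timeout, unit)`. This is a minimal sketch under stated assumptions and not Lucene's implementation: `BoundedLockSketch` and `markAll` are hypothetical names, and the locks merely stand in for the per-thread states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

final class BoundedLockSketch {
  // Visits every per-thread lock in order, waiting at most timeoutMillis for each.
  // Returns the indexes of the locks that could not be acquired in time, so that the
  // caller can record them as blocked instead of hanging on the first busy one.
  static List<Integer> markAll(List<ReentrantLock> states, long timeoutMillis) {
    List<Integer> blocked = new ArrayList<>();
    for (int i = 0; i < states.size(); i++) {
      ReentrantLock lock = states.get(i);
      boolean acquired = false;
      try {
        acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      }
      if (acquired) {
        try {
          // A real implementation would mark this state as flush-pending here.
        } finally {
          lock.unlock();
        }
      } else {
        blocked.add(i);
      }
    }
    return blocked;
  }
}
```

A caller could retry the returned indexes later or count them as pending work when deciding whether to keep indexing threads stalled; whether that fits the flush control logic is a design question for the issue, not something this sketch settles.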
[jira] [Commented] (SOLR-14991) tag and remove obsolete branches
[ https://issues.apache.org/jira/browse/SOLR-14991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17231113#comment-17231113 ] Noble Paul commented on SOLR-14991: --- done [~erickerickson] > tag and remove obsolete branches > > > Key: SOLR-14991 > URL: https://issues.apache.org/jira/browse/SOLR-14991 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > I'm going to gradually work through the branches, tagging and removing > 1> anything with a Jira name that's fixed > 2> anything that I'm certain will never be fixed (e.g. the various gradle > build branches) > So the changes will still be available, they just won't pollute the branch list. > I'll list the branches here, all the tags will be > history/branches/lucene-solr/ > > This specifically will _not_ include > 1> any release, e.g. branch_8_4 > 2> anything I'm unsure about. People who've created branches should expect > some pings about this.