[jira] [Comment Edited] (SOLR-14161) System.ArgumentNullException: Value cannot be null. Parameter name: fieldNameTranslator
[ https://issues.apache.org/jira/browse/SOLR-14161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008603#comment-17008603 ] Mohammed edited comment on SOLR-14161 at 1/6/20 7:59 AM: - Thx Erick, For now I use the following workaround: I stopped the Solr service and start it again, then I can delete an item from sitecore. was (Author: lazar): Thx Erick, For now I use the following workaround: I stopped the Solr service and start it again, then I can delete item from sitecore. > System.ArgumentNullException: Value cannot be null. Parameter name: > fieldNameTranslator > --- > > Key: SOLR-14161 > URL: https://issues.apache.org/jira/browse/SOLR-14161 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: website >Affects Versions: 6.6.2 > Environment: version: > Solr 6.6.2 > Sitecore: 9.1 >Reporter: Mohammed >Priority: Major > > While deleting an item in Sitecore 9, the following error occurs. > Could someone please help resolve this issue. > > [ArgumentNullException: Value cannot be null. 
Parameter name: > fieldNameTranslator] > Sitecore.ContentSearch.Linq.Solr.SolrIndexParameters..ctor(IIndexValueFormatter > valueFormatter, IFieldQueryTranslatorMap`1 fieldQueryTranslators, > FieldNameTranslator fieldNameTranslator, IExecutionContext[] > executionContexts, IFieldMapReaders fieldMap, Boolean convertQueryDatesToUtc) > +328 > Sitecore.ContentSearch.SolrProvider.LinqToSolrIndex`1..ctor(SolrSearchContext > context, IExecutionContext[] executionContexts) +200 > Sitecore.ContentSearch.SolrProvider.SolrSearchContext.GetQueryable(IExecutionContext[] > executionContexts) +271 > Sitecore.ContentTesting.ContentSearch.TestingSearch.GetRunningTestsInAllLanguages(Item > hostItem) +1064 > Sitecore.ContentTesting.Pipelines.DeleteItems.DeleteTestDefinitionItems.GetConfirmMessage(Item[] > contentItems) +53 > Sitecore.ContentTesting.Pipelines.DeleteItems.DeleteTestDefinitionItems.CheckActiveTests(ClientPipelineArgs > args) +140 > > [TargetInvocationException: Exception has been thrown by the target of an > invocation.] 
System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] > arguments, Signature sig, Boolean constructor) +0 > System.Reflection.RuntimeMethodInfo.UnsafeInvokeInternal(Object obj, Object[] > parameters, Object[] arguments) +132 > System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags > invokeAttr, Binder binder, Object[] parameters, CultureInfo culture) +146 > Sitecore.Reflection.ReflectionUtil.InvokeMethod(MethodInfo method, Object[] > parameters, Object obj) +89 > Sitecore.Nexus.Pipelines.NexusPipelineApi.Resume(PipelineArgs args, Pipeline > pipeline) +313 Sitecore.Web.UI.Sheer.ClientPage.ResumePipeline() +215 > Sitecore.Web.UI.Sheer.ClientPage.OnPreRender(EventArgs e) +806 > Sitecore.Shell.Applications.ContentManager.ContentEditorPage.OnPreRender(EventArgs > e) +24 System.Web.UI.Control.PreRenderRecursiveInternal() +132 > System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, > Boolean includeStagesAfterAsyncPoint) +4005 > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz commented on issue #1136: LUCENE-9113: Speed up merging doc values' terms dictionaries.
jpountz commented on issue #1136: LUCENE-9113: Speed up merging doc values' terms dictionaries. URL: https://github.com/apache/lucene-solr/pull/1136#issuecomment-571043073 Thanks @rmuir ! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz merged pull request #1136: LUCENE-9113: Speed up merging doc values' terms dictionaries.
jpountz merged pull request #1136: LUCENE-9113: Speed up merging doc values' terms dictionaries. URL: https://github.com/apache/lucene-solr/pull/1136
[jira] [Commented] (LUCENE-9113) Speed up merging doc values terms dictionaries
[ https://issues.apache.org/jira/browse/LUCENE-9113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008604#comment-17008604 ] ASF subversion and git services commented on LUCENE-9113: - Commit dcc01fdaa6841a94613f68b419799523a157fe4a in lucene-solr's branch refs/heads/master from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=dcc01fd ] LUCENE-9113: Speed up merging doc values' terms dictionaries. (#1136) > Speed up merging doc values terms dictionaries > -- > > Key: LUCENE-9113 > URL: https://issues.apache.org/jira/browse/LUCENE-9113 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > > The default {{DocValuesConsumer#mergeSortedField}} and > {{DocValuesConsumer#mergeSortedSetField}} implementations create a merged > view of the doc values producers to merge. Unfortunately, it doesn't override > {{termsEnum()}}, whose default implementation of {{next()}} increments the > ordinal and calls {{lookupOrd()}} to retrieve the term. Currently, > {{lookupOrd()}} doesn't take advantage of its current position, and would > seek to the block start and then call {{next()}} up to 16 times to go to the > desired term. While there are discussions to optimize lookups to take > advantage of the current ord (LUCENE-8836), it shouldn't be required for > merging to be efficient and we should instead make {{next()}} call {{next()}} > on its sub enums.
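The cost difference described in the issue can be illustrated with a toy cost model. All names below are hypothetical; the real logic lives in the default merged-view TermsEnum and the codec's term dictionary. Terms are stored in 16-term blocks, so lookupOrd(ord) seeks to the block start and scans forward, decoding up to 16 terms per step, while a stateful next() decodes each term exactly once.

```java
// Toy cost model (hypothetical; not Lucene code). Compares the number of
// term decodes needed to iterate an entire block-encoded terms dictionary
// via repeated lookupOrd() versus a stateful next().
public class MergeCostSketch {
    static final int BLOCK_SIZE = 16;

    // Decodes needed to visit ords 0..numTerms-1 via lookupOrd(ord), as the
    // default merged termsEnum() effectively does: each call seeks to the
    // start of the 16-term block and scans forward to the requested ord.
    static long costViaLookupOrd(int numTerms) {
        long decodes = 0;
        for (int ord = 0; ord < numTerms; ord++) {
            int blockStart = ord & ~(BLOCK_SIZE - 1); // seek to block start
            decodes += (ord - blockStart) + 1;        // scan forward to ord
        }
        return decodes;
    }

    // Decodes needed when the merged enum delegates next() to its sub enums:
    // one decode per term.
    static long costViaNext(int numTerms) {
        return numTerms;
    }

    public static void main(String[] args) {
        int n = 1 << 20;
        // Averages 8.5 decodes per term via lookupOrd vs exactly 1 via next().
        System.out.println("lookupOrd decodes: " + costViaLookupOrd(n));
        System.out.println("next() decodes: " + costViaNext(n));
    }
}
```

With 16-term blocks the lookupOrd path averages 8.5 decodes per term, which is why the fix delegates next() instead.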
[GitHub] [lucene-solr] jpountz merged pull request #1074: BlockTreeTermsWriter should compute prefix lengths using Arrays#mismatch.
jpountz merged pull request #1074: BlockTreeTermsWriter should compute prefix lengths using Arrays#mismatch. URL: https://github.com/apache/lucene-solr/pull/1074
[GitHub] [lucene-solr] jpountz merged pull request #1047: MINOR: Fix Incorrect Constant Name in Codec Docs
jpountz merged pull request #1047: MINOR: Fix Incorrect Constant Name in Codec Docs URL: https://github.com/apache/lucene-solr/pull/1047
[GitHub] [lucene-site] janhoy commented on issue #8: Simple build script
janhoy commented on issue #8: Simple build script URL: https://github.com/apache/lucene-site/pull/8#issuecomment-571044089 So any more feedback on this script? Should we try to do this "the pelican way" or just a simple script? Publishing of the site will happen through merging to staging/production branch so no need for fancy scripts there...
[GitHub] [lucene-solr] jpountz commented on a change in pull request #964: LUCENE-9023: GlobalOrdinalsWithScore should not compute occurrences when the provided min is 1
jpountz commented on a change in pull request #964: LUCENE-9023: GlobalOrdinalsWithScore should not compute occurrences when the provided min is 1 URL: https://github.com/apache/lucene-solr/pull/964#discussion_r363191004 ## File path: lucene/CHANGES.txt ## @@ -52,6 +52,8 @@ Improvements * LUCENE-8937: Avoid agressive stemming on numbers in the FrenchMinimalStemmer. (Adrien Gallou via Tomoko Uchida) + +* LUCENE-9023: GlobalOrdinalsWithScore should not compute occurrences when the provided min is 1 Review comment: ```suggestion * LUCENE-9023: GlobalOrdinalsWithScore should not compute occurrences when the provided min is 1. (Jim Ferenczi) ```
[GitHub] [lucene-solr] jpountz merged pull request #1125: LUCENE-9096: Implementation of CompressingTermVectorsWriter.flushOffsets can be simpler
jpountz merged pull request #1125: LUCENE-9096: Implementation of CompressingTermVectorsWriter.flushOffsets can be simpler URL: https://github.com/apache/lucene-solr/pull/1125
[jira] [Commented] (LUCENE-9096) Implementation of CompressingTermVectorsWriter.flushOffsets can be simpler
[ https://issues.apache.org/jira/browse/LUCENE-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008613#comment-17008613 ] ASF subversion and git services commented on LUCENE-9096: - Commit 2db4c909ca10c0d7edda0c94622fa1369833 in lucene-solr's branch refs/heads/master from kkewwei [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2db4c90 ] LUCENE-9096:Simplify CompressingTermVectorsWriter#flushOffsets. (#1125) > Implementation of CompressingTermVectorsWriter.flushOffsets can be simpler > -- > > Key: LUCENE-9096 > URL: https://issues.apache.org/jira/browse/LUCENE-9096 > Project: Lucene - Core > Issue Type: Improvement > Components: core/codecs >Affects Versions: 8.2 >Reporter: kkewwei >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > In CompressingTermVectorsWriter.flushOffsets, sumPos and > sumOffsets are computed as follows: > {code:java} > for (int i = 0; i < fd.numTerms; ++i) { > int previousPos = 0; > int previousOff = 0; > for (int j = 0; j < fd.freqs[i]; ++j) { > final int position = positionsBuf[fd.posStart + pos]; > final int startOffset = startOffsetsBuf[fd.offStart + pos]; > sumPos[fieldNumOff] += position - previousPos; > sumOffsets[fieldNumOff] += startOffset - previousOff; > previousPos = position; > previousOff = startOffset; > ++pos; > } > } > {code} > Each iteration adds position - previousPos, so the sum telescopes: > {code:java} > (position5-position4)+(position4-position3)+(position3-position2)+(position2-position1){code} > and can be simplified to: position5-position1
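The simplification in the quoted description is just a telescoping sum; a minimal standalone sketch (hypothetical method names, not the Lucene code itself):

```java
// Demonstrates the telescoping identity behind the simplification above:
// (p2-p1)+(p3-p2)+...+(pn-p(n-1)) collapses to pn-p1, so the per-term
// delta-accumulation loop can be replaced by a single subtraction.
public class TelescopingSum {

    // The existing pattern: accumulate each consecutive position delta.
    static int sumOfDeltas(int[] positions) {
        int previous = positions[0], sum = 0;
        for (int j = 1; j < positions.length; j++) {
            sum += positions[j] - previous;
            previous = positions[j];
        }
        return sum;
    }

    // The simplified form: all intermediate terms cancel out.
    static int telescoped(int[] positions) {
        return positions[positions.length - 1] - positions[0];
    }

    public static void main(String[] args) {
        int[] positions = {3, 7, 12, 20, 31};
        System.out.println(sumOfDeltas(positions)); // prints 28
        System.out.println(telescoped(positions));  // prints 28
    }
}
```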
[jira] [Commented] (LUCENE-9096) Implementation of CompressingTermVectorsWriter.flushOffsets can be simpler
[ https://issues.apache.org/jira/browse/LUCENE-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008618#comment-17008618 ] ASF subversion and git services commented on LUCENE-9096: - Commit 6bb1f6cbbe8accefbfd30b8ee74924ad43ddc356 in lucene-solr's branch refs/heads/master from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6bb1f6c ] LUCENE-9096: CHANGES entry.
[jira] [Resolved] (LUCENE-9096) Implementation of CompressingTermVectorsWriter.flushOffsets can be simpler
[ https://issues.apache.org/jira/browse/LUCENE-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-9096. -- Fix Version/s: 8.5 Resolution: Fixed Thanks [~kkewwei].
[jira] [Resolved] (LUCENE-9113) Speed up merging doc values terms dictionaries
[ https://issues.apache.org/jira/browse/LUCENE-9113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-9113. -- Fix Version/s: 8.5 Resolution: Fixed
[jira] [Commented] (LUCENE-9096) Implementation of CompressingTermVectorsWriter.flushOffsets can be simpler
[ https://issues.apache.org/jira/browse/LUCENE-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008628#comment-17008628 ] ASF subversion and git services commented on LUCENE-9096: - Commit 7d6067000cdfcece70c15ce74a5727e56729fdc4 in lucene-solr's branch refs/heads/branch_8x from kkewwei [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7d60670 ] LUCENE-9096:Simplify CompressingTermVectorsWriter#flushOffsets. (#1125)
[jira] [Commented] (LUCENE-9113) Speed up merging doc values terms dictionaries
[ https://issues.apache.org/jira/browse/LUCENE-9113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008627#comment-17008627 ] ASF subversion and git services commented on LUCENE-9113: - Commit f6c2cb21379044b04f201567d5017ca81624821c in lucene-solr's branch refs/heads/branch_8x from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=f6c2cb2 ] LUCENE-9113: Speed up merging doc values' terms dictionaries. (#1136)
[jira] [Commented] (LUCENE-9096) Implementation of CompressingTermVectorsWriter.flushOffsets can be simpler
[ https://issues.apache.org/jira/browse/LUCENE-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008629#comment-17008629 ] ASF subversion and git services commented on LUCENE-9096: - Commit e2b39bd0ff8241c13296c7388924cb3f4e7ad9b8 in lucene-solr's branch refs/heads/branch_8x from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e2b39bd ] LUCENE-9096: CHANGES entry.
[jira] [Commented] (SOLR-13089) bin/solr's use of lsof has some issues
[ https://issues.apache.org/jira/browse/SOLR-13089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008684#comment-17008684 ] Martijn Koster commented on SOLR-13089: --- LGTM > bin/solr's use of lsof has some issues > -- > > Key: SOLR-13089 > URL: https://issues.apache.org/jira/browse/SOLR-13089 > Project: Solr > Issue Type: Bug > Components: SolrCLI >Reporter: Martijn Koster >Assignee: Jan Høydahl >Priority: Minor > Attachments: 0001-SOLR-13089-lsof-fixes.patch, SOLR-13089.patch > > > The {{bin/solr}} script uses this {{lsof}} invocation to check if the Solr > port is being listened on: > {noformat} > running=`lsof -PniTCP:$SOLR_PORT -sTCP:LISTEN` > if [ -z "$running" ]; then > {noformat} > code is at > [here|https://github.com/apache/lucene-solr/blob/master/solr/bin/solr#L2147]. > There are a few issues with this. > h2. 1. False negatives when port is occupied by different user > When {{lsof}} runs as non-root, it only shows sockets for processes with your > effective uid. > For example: > {noformat} > $ id -u && nc -l 7788 & > [1] 26576 > 1000 > works: nc ran as my user > $ lsof -PniTCP:7788 -sTCP:LISTEN > COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME > nc 26580 mak3u IPv4 2818104 0t0 TCP *:7788 (LISTEN) > fails: ssh is running as root > $ lsof -PniTCP:22 -sTCP:LISTEN > works if we are root > $ sudo lsof -PniTCP:22 -sTCP:LISTEN > COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME > sshd2524 root3u IPv4 18426 0t0 TCP *:22 (LISTEN) > sshd2524 root4u IPv6 18428 0t0 TCP *:22 (LISTEN) > {noformat} > Solr runs as non-root. > So if some other process owned by a different user occupies that port, you > will get a false negative (it will say Solr is not running even though it is) > I can't think of a good way to fix or work around that (short of not using > {{lsof}} in the first place). > Perhaps an uncommon scenario we need not worry too much about. > h2. 2. 
lsof can complain about lack of /etc/password entries > If {{lsof}} runs without the current effective user having an entry in > {{/etc/passwd}}, > it produces a warning on stderr: > {noformat} > $ docker run -d -u 0 solr:7.6.0 bash -c "chown -R /opt/; gosu > solr-foreground" > 4397c3f51d4a1cfca7e5815e5b047f75fb144265d4582745a584f0dba51480c6 > $ docker exec -it -u > 4397c3f51d4a1cfca7e5815e5b047f75fb144265d4582745a584f0dba51480c6 bash > I have no name!@4397c3f51d4a:/opt/solr$ lsof -PniTCP:8983 -sTCP:LISTEN > lsof: no pwd entry for UID > COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME > lsof: no pwd entry for UID > java 9 115u IPv4 2813503 0t0 TCP *:8983 (LISTEN) > I have no name!@4397c3f51d4a:/opt/solr$ lsof -PniTCP:8983 > -sTCP:LISTEN>/dev/null > lsof: no pwd entry for UID > lsof: no pwd entry for UID > {noformat} > You can avoid this by using the {{-t}} tag, which specifies that lsof should > produce terse output with process identifiers only and no header: > {noformat} > I have no name!@4397c3f51d4a:/opt/solr$ lsof -t -PniTCP:8983 -sTCP:LISTEN > 9 > {noformat} > This is a rare circumstance, but one I encountered and worked around. > h2. 3. On Alpine, lsof is implemented by busybox, but with incompatible > arguments > On Alpine, {{busybox}} implements {{lsof}}, but does not support the > arguments, so you get: > {noformat} > $ docker run -it alpine sh > / # lsof -t -PniTCP:8983 -sTCP:LISTEN > 1 /bin/busybox/dev/pts/0 > 1 /bin/busybox/dev/pts/0 > 1 /bin/busybox/dev/pts/0 > 1 /bin/busybox/dev/tty > {noformat} > so if you ran Solr, in the background, and it failed to start, this code > would produce a false positive. > For example: > {noformat} > docker volume create mysol > docker run -v mysol:/mysol bash bash -c "chown 8983:8983 /mysol" > docker run -it -v mysol:/mysol -w /mysol -v > $HOME/Downloads/solr-7.6.0.tgz:/solr-7.6.0.tgz openjdk:8-alpine sh > apk add procps bash > tar xvzf /solr-7.6.0.tgz > chown -R 8983:8983 . 
> {noformat} > then in a separate terminal: > {noformat} > $ docker exec -it -u 8983 serene_saha sh > /mysol $ SOLR_OPTS=--invalid ./solr-7.6.0/bin/solr start > whoami: unknown uid 8983 > Waiting up to 180 seconds to see Solr running on port 8983 [|] > Started Solr server on port 8983 (pid=101). Happy searching! > /mysol $ > {noformat} > and in another separate terminal: > {noformat} > $ docker exec -it thirsty_liskov bash > bash-4.4$ cat server/logs/solr-8983-console.log > Unrecognized option: --invalid > Error: Could not create the Java Virtual Machine. > Error: A fatal exception has occurred. Program will exit. > {noformat} > so it is saying Solr is running, when it isn't. > Now, all this can be avoided by j
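As an aside to the lsof issues catalogued above, a listen check that avoids lsof entirely can be done with a plain TCP connect. The sketch below is in Java purely as an illustration of the idea (bin/solr itself is a bash script): unlike lsof, a connect attempt works regardless of which user owns the listening process and needs no /etc/passwd entry, though it can only tell that *something* accepts connections, not which process.

```java
// Sketch of an lsof-free "is the Solr port listened on?" check: attempt a
// TCP connection to the port and report whether it is accepted.
public class PortCheck {

    // True if something accepts TCP connections on 127.0.0.1:port.
    static boolean isListening(int port) {
        try (java.net.Socket s = new java.net.Socket()) {
            s.connect(new java.net.InetSocketAddress("127.0.0.1", port), 500); // 500 ms timeout
            return true;
        } catch (java.io.IOException e) {
            return false; // connection refused or timed out: nothing listening
        }
    }

    public static void main(String[] args) {
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8983; // Solr's default port
        System.out.println(port + (isListening(port) ? " is" : " is not") + " being listened on");
    }
}
```

The trade-off mirrors the discussion in the issue: lsof answers "which process owns this port" but is permission- and platform-sensitive, whereas a connect probe is portable but anonymous.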
[jira] [Commented] (LUCENE-8673) Use radix partitioning when merging dimensional points
[ https://issues.apache.org/jira/browse/LUCENE-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008703#comment-17008703 ] Adrien Grand commented on LUCENE-8673: -- The test already uses a FSDirectory for large numbers of documents. The seed doesn't reproduce for me but I suspect that this is related to the fact that the test framework randomly wraps with NRTCachingDirectory. If I add some logging I'm seeing about 50MB spent on the RAMDirectory even though it's configured with a max size of 500kB. > Use radix partitioning when merging dimensional points > -- > > Key: LUCENE-8673 > URL: https://issues.apache.org/jira/browse/LUCENE-8673 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ignacio Vera >Assignee: Ignacio Vera >Priority: Major > Fix For: 8.x, master (9.0) > > Attachments: Geo3D.png, Geo3D.png, Geo3D.png, LatLonPoint.png, > LatLonPoint.png, LatLonPoint.png, LatLonShape.png, LatLonShape.png, > LatLonShape.png > > Time Spent: 5h 40m > Remaining Estimate: 0h > > Following the advise of [~jpountz] in LUCENE-8623I have investigated using > radix selection when merging segments instead of sorting the data at the > beginning. The results are pretty promising when running Lucene geo > benchmarks: > > ||Approach||Index time (sec): Dev||Index Time (sec): Base||Index Time: > Diff||Force merge time (sec): Dev||Force Merge time (sec): Base||Force Merge > Time: Diff||Index size (GB): Dev||Index size (GB): Base||Index Size: > Diff||Reader heap (MB): Dev||Reader heap (MB): Base||Reader heap: Diff > |points|241.5s|235.0s| 3%|157.2s|157.9s|-0%|0.55|0.55| 0%|1.57|1.57| 0%| > |shapes|416.1s|650.1s|-36%|306.1s|603.2s|-49%|1.29|1.29| 0%|1.61|1.61| 0%| > |geo3d|261.0s|360.1s|-28%|170.2s|279.9s|-39%|0.75|0.75| 0%|1.58|1.58| 0%| > > edited: table formatting to be a jira table > > In 2D the index throughput is more or less equal but for higher dimensions > the impact is quite big. 
In all cases the merging process requires much less > disk space, I am attaching plots showing the different behaviour and I am > opening a pull request.
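The radix-selection idea behind LUCENE-8673 can be sketched in memory (a deliberate simplification; Lucene's implementation works on packed byte[] point values and offline buffers): instead of fully sorting the data, count how many values fall into each of 256 buckets for the current byte, then recurse only into the single bucket that contains the target rank.

```java
import java.util.Arrays;

// In-memory sketch of radix selection (hypothetical simplification, not the
// Lucene code): find the k-th smallest value by partitioning on one byte per
// level and recursing only into the bucket that contains rank k.
public class RadixSelect {

    // Returns the k-th smallest (0-based) of the given non-negative ints;
    // shift starts at 24 to examine the most significant byte first.
    static int select(int[] values, int k, int shift) {
        if (values.length == 1) return values[0];
        if (shift < 0) { // all bytes consumed: remaining values share a prefix
            int[] copy = values.clone();
            Arrays.sort(copy);
            return copy[k];
        }
        int[] counts = new int[256];
        for (int v : values) counts[(v >>> shift) & 0xFF]++;
        int start = 0;
        for (int b = 0; b < 256; b++) {
            if (k < start + counts[b]) {
                // Gather only this bucket's values and recurse on the next byte.
                int[] bucket = new int[counts[b]];
                int idx = 0;
                for (int v : values) {
                    if (((v >>> shift) & 0xFF) == b) bucket[idx++] = v;
                }
                return select(bucket, k - start, shift - 8);
            }
            start += counts[b];
        }
        throw new AssertionError("k out of range");
    }

    public static void main(String[] args) {
        int[] points = {42, 7, 99, 7, 13, 0, 56};
        System.out.println(select(points, points.length / 2, 24)); // median: 13
    }
}
```

Each level touches only the values in one bucket, which is why the merge in the benchmarks above needs far less temporary disk space than sort-based merging.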
[jira] [Updated] (SOLR-14158) package manager to read keys from packagestore and not ZK
[ https://issues.apache.org/jira/browse/SOLR-14158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-14158: -- Priority: Blocker (was: Major) > package manager to read keys from packagestore and not ZK > -- > > Key: SOLR-14158 > URL: https://issues.apache.org/jira/browse/SOLR-14158 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: packages >Reporter: Noble Paul >Assignee: Noble Paul >Priority: Blocker > Labels: packagemanager > > The security of the package system relies on securing ZK. It's much easier > for users to secure the file system than securing ZK. > We provide an option to read public keys from the file store. > This will > * Have a special directory called {{_trusted_}} . Direct writes are forbidden > to that directory over http > * The CLI directly writes the keys to the > {{/filestore/_trusted_/keys/}} directory. Other nodes are asked to > fetch the public key files from that node > * Package artifacts will continue to be uploaded over http
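The filesystem-trust approach described in the issue enables a conventional detached-signature check. The sketch below is generic (hypothetical file layout and key format, NOT Solr's actual package-store API): load an X.509-encoded RSA public key from a local "trusted" directory and verify a base64-encoded signature over an artifact's bytes.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.KeyFactory;
import java.security.PublicKey;
import java.security.Signature;
import java.security.spec.X509EncodedKeySpec;
import java.util.Base64;

// Generic sketch of the kind of check a filesystem trust store enables
// (hypothetical paths and formats -- not Solr's PackageStoreAPI): verify a
// detached SHA256withRSA signature against a locally stored public key.
public class VerifyArtifact {

    static boolean verify(Path publicKeyDer, Path artifact, Path sigBase64) throws Exception {
        PublicKey key = KeyFactory.getInstance("RSA")
                .generatePublic(new X509EncodedKeySpec(Files.readAllBytes(publicKeyDer)));
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initVerify(key);
        sig.update(Files.readAllBytes(artifact));           // hash the artifact bytes
        return sig.verify(Base64.getMimeDecoder()
                .decode(Files.readAllBytes(sigBase64)));    // check the detached signature
    }

    public static void main(String[] args) throws Exception {
        // Usage: java VerifyArtifact <trusted-key.der> <artifact> <artifact.sig>
        System.out.println(verify(Path.of(args[0]), Path.of(args[1]), Path.of(args[2]))
                ? "signature OK" : "signature MISMATCH");
    }
}
```

The point of the issue is where the key lives: reading it from a local directory that is never writable over HTTP means an attacker who can reach the HTTP API still cannot plant a key.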
[jira] [Updated] (SOLR-14158) package manager to read keys from packagestore and not ZK
[ https://issues.apache.org/jira/browse/SOLR-14158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-14158: -- Fix Version/s: 8.4.1
[jira] [Updated] (SOLR-14158) package manager to read keys from packagestore and not ZK
[ https://issues.apache.org/jira/browse/SOLR-14158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-14158: -- Affects Version/s: 8.4
[jira] [Commented] (SOLR-14158) package manager to read keys from packagestore and not ZK
[ https://issues.apache.org/jira/browse/SOLR-14158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008755#comment-17008755 ] Jan Høydahl commented on SOLR-14158: This should go in 8.5 and not be a blocker. It has ALWAYS been the case that a production Solr cluster needs a secure Zookeeper one way or another. Nothing has changed here.
[jira] [Commented] (SOLR-14158) package manager to read keys from packagestore and not ZK
[ https://issues.apache.org/jira/browse/SOLR-14158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008760#comment-17008760 ] Noble Paul commented on SOLR-14158: --- The problem is anyone who uses this new feature will have a backward incompatible system that's insecure by nature. The threat levels are much higher in this case. An attacker can run malicious code if ZK is compromised. We should not leave this hole open.
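The proposed layout can be sketched as follows. Only the {{/filestore/_trusted_/keys/}} path comes from the issue description above; the file-store root and key file name here are hypothetical stand-ins, and the snippet illustrates the directory structure only, not Solr's actual CLI or file-store code.

```shell
#!/bin/sh
set -eu
# Hypothetical stand-in for a node's file-store root (the real location
# is managed by Solr itself).
root=$(mktemp -d)
# Public keys live under _trusted_; direct HTTP writes to this area are
# rejected, so key files are placed on the node's local disk (e.g. by the
# CLI) instead of going through ZK.
mkdir -p "$root/filestore/_trusted_/keys"
printf 'placeholder public key' > "$root/filestore/_trusted_/keys/pub1.der"
ls "$root/filestore/_trusted_/keys"
```

Other nodes would then fetch a key file like {{pub1.der}} from this node rather than reading it from ZK, which is the point of the change.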
[jira] [Created] (SOLR-14169) Fix 20 Resource Leak warnings in apache/solr/common
Andras Salamon created SOLR-14169: - Summary: Fix 20 Resource Leak warnings in apache/solr/common Key: SOLR-14169 URL: https://issues.apache.org/jira/browse/SOLR-14169 Project: Solr Issue Type: Sub-task Reporter: Andras Salamon

There are 20 resource leak warnings in {{apache/solr/common}}
{noformat}
[ecj-lint] 5. WARNING in /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/java/org/apache/solr/common/cloud/ZkNodeProps.java (at line 98)
[ecj-lint]   props = (Map) new JavaBinCodec().unmarshal(bytes);
[ecj-lint]   Resource leak: '<unassigned Closeable value>' is never closed
[ecj-lint] 6. WARNING in /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/java/org/apache/solr/common/util/Utils.java (at line 206)
[ecj-lint]   new SolrJSONWriter(writer)
[ecj-lint]   Resource leak: '<unassigned Closeable value>' is never closed
[ecj-lint] 2. WARNING in /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/test/org/apache/solr/common/util/ContentStreamTest.java (at line 50)
[ecj-lint]   try (InputStream is = new SolrResourceLoader().openResource("solrj/README");
[ecj-lint]   Resource leak: '<unassigned Closeable value>' is never closed
[ecj-lint] 3. WARNING in /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/test/org/apache/solr/common/util/ContentStreamTest.java (at line 73)
[ecj-lint]   try (InputStream is = new SolrResourceLoader().openResource("solrj/README");
[ecj-lint]   Resource leak: '<unassigned Closeable value>' is never closed
[ecj-lint] 4. WARNING in /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/test/org/apache/solr/common/util/ContentStreamTest.java (at line 98)
[ecj-lint]   try (InputStream is = new SolrResourceLoader().openResource("solrj/README");
[ecj-lint]   Resource leak: '<unassigned Closeable value>' is never closed
[ecj-lint] 5. WARNING in /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/test/org/apache/solr/common/util/ContentStreamTest.java (at line 127)
[ecj-lint]   try (InputStream is = new SolrResourceLoader().openResource("solrj/README");
[ecj-lint]   Resource leak: '<unassigned Closeable value>' is never closed
[ecj-lint] 6. WARNING in /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/test/org/apache/solr/common/util/ContentStreamTest.java (at line 152)
[ecj-lint]   try (InputStream is = new SolrResourceLoader().openResource("solrj/README");
[ecj-lint]   Resource leak: '<unassigned Closeable value>' is never closed
[ecj-lint] 7. WARNING in /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/test/org/apache/solr/common/util/ContentStreamTest.java (at line 177)
[ecj-lint]   try (InputStream is = new SolrResourceLoader().openResource("solrj/README");
[ecj-lint]   Resource leak: '<unassigned Closeable value>' is never closed
[ecj-lint] 8. WARNING in /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/test/org/apache/solr/common/util/TestFastJavabinDecoder.java (at line 48)
[ecj-lint]   JavaBinCodec codec = new JavaBinCodec(faos, null);
[ecj-lint]   Resource leak: 'codec' is never closed
[ecj-lint] 9. WARNING in /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/test/org/apache/solr/common/util/TestFastJavabinDecoder.java (at line 58)
[ecj-lint]   FastJavaBinDecoder.StreamCodec scodec = new FastJavaBinDecoder.StreamCodec(fis);
[ecj-lint]   Resource leak: 'scodec' is never closed
[ecj-lint] 10. WARNING in /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/test/org/apache/solr/common/util/TestFastJavabinDecoder.java (at line 81)
[ecj-lint]   new JavaBinCodec().marshal(m, baos);
[ecj-lint]   Resource leak: '<unassigned Closeable value>' is never closed
[ecj-lint] 11. WARNING in /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/test/org/apache/solr/common/util/TestFastJavabinDecoder.java (at line 83)
[ecj-lint]   Map m2 = (Map) new JavaBinCodec().unmarshal(new FastInputStream(null, baos.getbuf(), 0, baos.size()));
[ecj-lint]   Resource leak: '<unassigned Closeable value>' is never closed
[ecj-lint] 12. WARNING in /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/test/org/apache/solr/common/util/TestFastJavabinDecoder.java (at line 124)
[ecj-lint]   SimpleOrderedMap o = (SimpleOrderedMap) new JavaBinCodec().unmarshal(baos.toByteArray());
[ecj-lint]   Resource lea
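The usual fix for warnings like these is to scope the codec in try-with-resources so it is closed on every exit path. A minimal sketch of the pattern, using a hypothetical stand-in {{Codec}} class rather than solrj's real {{JavaBinCodec}}:

```java
// Stand-in for an AutoCloseable codec such as solrj's JavaBinCodec;
// the decode logic is a placeholder, only the close pattern matters here.
class Codec implements AutoCloseable {
    Object unmarshal(byte[] bytes) { return bytes.length; } // placeholder decode
    @Override public void close() { /* release buffers, streams, etc. */ }
}

public class ResourceLeakFix {
    // Before: `props = (Map) new JavaBinCodec().unmarshal(bytes);` leaks the
    // codec instance (ecj cannot prove it is ever closed).
    // After: try-with-resources closes the instance deterministically.
    static Object decode(byte[] bytes) throws Exception {
        try (Codec codec = new Codec()) {
            return codec.unmarshal(bytes);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(decode(new byte[] {1, 2, 3})); // prints 3
    }
}
```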
[jira] [Updated] (SOLR-14169) Fix 20 Resource Leak warnings in apache/solr/common
[ https://issues.apache.org/jira/browse/SOLR-14169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andras Salamon updated SOLR-14169: -- Attachment: SOLR-14169-01.patch Status: Open (was: Open)
[jira] [Updated] (SOLR-14169) Fix 20 Resource Leak warnings in apache/solr/common
[ https://issues.apache.org/jira/browse/SOLR-14169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andras Salamon updated SOLR-14169: -- Status: Patch Available (was: Open)
[jira] [Commented] (SOLR-13089) bin/solr's use of lsof has some issues
[ https://issues.apache.org/jira/browse/SOLR-13089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008796#comment-17008796 ] ASF subversion and git services commented on SOLR-13089: Commit ac777a5352224b2c8f46836f0e078809308fc2d8 in lucene-solr's branch refs/heads/master from Martijn Koster [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ac777a5 ] SOLR-13089: Fix lsof edge cases in the solr CLI script > bin/solr's use of lsof has some issues > -- > > Key: SOLR-13089 > URL: https://issues.apache.org/jira/browse/SOLR-13089 > Project: Solr > Issue Type: Bug > Components: SolrCLI > Reporter: Martijn Koster > Assignee: Jan Høydahl > Priority: Minor > Attachments: 0001-SOLR-13089-lsof-fixes.patch, SOLR-13089.patch > > > The {{bin/solr}} script uses this {{lsof}} invocation to check whether the Solr port is being listened on: > {noformat} > running=`lsof -PniTCP:$SOLR_PORT -sTCP:LISTEN` > if [ -z "$running" ]; then > {noformat} > The code is [here|https://github.com/apache/lucene-solr/blob/master/solr/bin/solr#L2147]. > There are a few issues with this. > h2. 1. False negatives when the port is occupied by a different user > When {{lsof}} runs as non-root, it only shows sockets for processes with your effective uid. > For example: > {noformat} > $ id -u && nc -l 7788 & > [1] 26576 > 1000 > works: nc ran as my user > $ lsof -PniTCP:7788 -sTCP:LISTEN > COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME > nc 26580 mak 3u IPv4 2818104 0t0 TCP *:7788 (LISTEN) > fails: ssh is running as root > $ lsof -PniTCP:22 -sTCP:LISTEN > works if we are root > $ sudo lsof -PniTCP:22 -sTCP:LISTEN > COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME > sshd 2524 root 3u IPv4 18426 0t0 TCP *:22 (LISTEN) > sshd 2524 root 4u IPv6 18428 0t0 TCP *:22 (LISTEN) > {noformat} > Solr runs as non-root. 
> So if some other process owned by a different user occupies that port, you will get a false negative (it will say Solr is not running even though it is). > I can't think of a good way to fix or work around that (short of not using {{lsof}} in the first place). > Perhaps an uncommon scenario we need not worry too much about. > h2. 2. lsof can complain about lack of /etc/passwd entries > If {{lsof}} runs without the current effective user having an entry in {{/etc/passwd}}, it produces a warning on stderr: > {noformat} > $ docker run -d -u 0 solr:7.6.0 bash -c "chown -R /opt/; gosu > solr-foreground" > 4397c3f51d4a1cfca7e5815e5b047f75fb144265d4582745a584f0dba51480c6 > $ docker exec -it -u > 4397c3f51d4a1cfca7e5815e5b047f75fb144265d4582745a584f0dba51480c6 bash > I have no name!@4397c3f51d4a:/opt/solr$ lsof -PniTCP:8983 -sTCP:LISTEN > lsof: no pwd entry for UID > COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME > lsof: no pwd entry for UID > java 9 115u IPv4 2813503 0t0 TCP *:8983 (LISTEN) > I have no name!@4397c3f51d4a:/opt/solr$ lsof -PniTCP:8983 -sTCP:LISTEN>/dev/null > lsof: no pwd entry for UID > lsof: no pwd entry for UID > {noformat} > You can avoid this by using the {{-t}} flag, which specifies that lsof should produce terse output with process identifiers only and no header: > {noformat} > I have no name!@4397c3f51d4a:/opt/solr$ lsof -t -PniTCP:8983 -sTCP:LISTEN > 9 > {noformat} > This is a rare circumstance, but one I encountered and worked around. > h2. 3. On Alpine, lsof is implemented by busybox, but with incompatible arguments > On Alpine, {{busybox}} implements {{lsof}}, but does not support the arguments, so you get: > {noformat} > $ docker run -it alpine sh > / # lsof -t -PniTCP:8983 -sTCP:LISTEN > 1 /bin/busybox/dev/pts/0 > 1 /bin/busybox/dev/pts/0 > 1 /bin/busybox/dev/pts/0 > 1 /bin/busybox/dev/tty > {noformat} > so if you ran Solr in the background and it failed to start, this code would produce a false positive. 
> For example: > {noformat} > docker volume create mysol > docker run -v mysol:/mysol bash bash -c "chown 8983:8983 /mysol" > docker run -it -v mysol:/mysol -w /mysol -v $HOME/Downloads/solr-7.6.0.tgz:/solr-7.6.0.tgz openjdk:8-alpine sh > apk add procps bash > tar xvzf /solr-7.6.0.tgz > chown -R 8983:8983 . > {noformat} > then in a separate terminal: > {noformat} > $ docker exec -it -u 8983 serene_saha sh > /mysol $ SOLR_OPTS=--invalid ./solr-7.6.0/bin/solr start > whoami: unknown uid 8983 > Waiting up to 180 seconds to see Solr running on port 8983 [|] > Started Solr server on port 8983 (pid=101). Happy searching! > /mysol $ > {noformat} > and in another separate terminal: > {noformat} > $ docker exec -it thirsty_liskov bash > bash-4.4$ cat server/logs/solr-8983-console.log > Unrecognized option: --invalid > Error: Could not create the Java Virtual Machine. > Error: A fatal exception has occurred. Program will exit. > {noformat} > so it is saying Solr is running, when it isn't. > Now, all this can be avoided by just installing th
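The {{-t}} idea from point 2 above can be combined with stderr suppression. A sketch of that check (an illustration only, not the committed patch, and it still inherits the non-root visibility and busybox limitations described above):

```shell
#!/bin/sh
# Quieter check for a listener on the Solr port: -t prints only PIDs (no
# header), and 2>/dev/null keeps "lsof: no pwd entry for UID" warnings out
# of the captured result. "|| true" keeps set -e scripts alive when lsof
# finds nothing (it exits non-zero in that case).
port=8983
pids=$(lsof -t -PniTCP:"$port" -sTCP:LISTEN 2>/dev/null || true)
if [ -n "$pids" ]; then
  echo "port $port has a listener (pid(s): $pids)"
else
  echo "port $port has no visible listener"
fi
```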
[jira] [Commented] (SOLR-13089) bin/solr's use of lsof has some issues
[ https://issues.apache.org/jira/browse/SOLR-13089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008798#comment-17008798 ] ASF subversion and git services commented on SOLR-13089: Commit 2aa739ae873b8b1c9dac4a42daa9e790ebdf700e in lucene-solr's branch refs/heads/branch_8x from Martijn Koster [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2aa739a ] SOLR-13089: Fix lsof edge cases in the solr CLI script (cherry picked from commit ac777a5352224b2c8f46836f0e078809308fc2d8)
[jira] [Updated] (SOLR-13089) bin/solr's use of lsof has some issues
[ https://issues.apache.org/jira/browse/SOLR-13089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-13089: --- Fix Version/s: 8.5
[jira] [Resolved] (SOLR-13089) bin/solr's use of lsof has some issues
[ https://issues.apache.org/jira/browse/SOLR-13089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl resolved SOLR-13089. Resolution: Fixed > bin/solr's use of lsof has some issues > -- > > Key: SOLR-13089 > URL: https://issues.apache.org/jira/browse/SOLR-13089 > Project: Solr > Issue Type: Bug > Components: SolrCLI >Reporter: Martijn Koster >Assignee: Jan Høydahl >Priority: Minor > Fix For: 8.5 > > Attachments: 0001-SOLR-13089-lsof-fixes.patch, SOLR-13089.patch > > > The {{bin/solr}} script uses this {{lsof}} invocation to check if the Solr > port is being listened on: > {noformat} > running=`lsof -PniTCP:$SOLR_PORT -sTCP:LISTEN` > if [ -z "$running" ]; then > {noformat} > code is at > [here|https://github.com/apache/lucene-solr/blob/master/solr/bin/solr#L2147]. > There are a few issues with this. > h2. 1. False negatives when port is occupied by different user > When {{lsof}} runs as non-root, it only shows sockets for processes with your > effective uid. > For example: > {noformat} > $ id -u && nc -l 7788 & > [1] 26576 > 1000 > works: nc ran as my user > $ lsof -PniTCP:7788 -sTCP:LISTEN > COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME > nc 26580 mak3u IPv4 2818104 0t0 TCP *:7788 (LISTEN) > fails: ssh is running as root > $ lsof -PniTCP:22 -sTCP:LISTEN > works if we are root > $ sudo lsof -PniTCP:22 -sTCP:LISTEN > COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME > sshd2524 root3u IPv4 18426 0t0 TCP *:22 (LISTEN) > sshd2524 root4u IPv6 18428 0t0 TCP *:22 (LISTEN) > {noformat} > Solr runs as non-root. > So if some other process owned by a different user occupies that port, you > will get a false negative (it will say Solr is not running even though it is) > I can't think of a good way to fix or work around that (short of not using > {{lsof}} in the first place). > Perhaps an uncommon scenario we need not worry too much about. > h2. 2. 
lsof can complain about lack of /etc/passwd entries > If {{lsof}} runs without the current effective user having an entry in > {{/etc/passwd}}, > it produces a warning on stderr: > {noformat} > $ docker run -d -u 0 solr:7.6.0 bash -c "chown -R /opt/; gosu > solr-foreground" > 4397c3f51d4a1cfca7e5815e5b047f75fb144265d4582745a584f0dba51480c6 > $ docker exec -it -u > 4397c3f51d4a1cfca7e5815e5b047f75fb144265d4582745a584f0dba51480c6 bash > I have no name!@4397c3f51d4a:/opt/solr$ lsof -PniTCP:8983 -sTCP:LISTEN > lsof: no pwd entry for UID > COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME > lsof: no pwd entry for UID > java 9 115u IPv4 2813503 0t0 TCP *:8983 (LISTEN) > I have no name!@4397c3f51d4a:/opt/solr$ lsof -PniTCP:8983 > -sTCP:LISTEN>/dev/null > lsof: no pwd entry for UID > lsof: no pwd entry for UID > {noformat} > You can avoid this by using the {{-t}} flag, which specifies that lsof should > produce terse output with process identifiers only and no header: > {noformat} > I have no name!@4397c3f51d4a:/opt/solr$ lsof -t -PniTCP:8983 -sTCP:LISTEN > 9 > {noformat} > This is a rare circumstance, but one I encountered and worked around. > h2. 3. On Alpine, lsof is implemented by busybox, but with incompatible > arguments > On Alpine, {{busybox}} implements {{lsof}}, but does not support the > arguments, so you get: > {noformat} > $ docker run -it alpine sh > / # lsof -t -PniTCP:8983 -sTCP:LISTEN > 1 /bin/busybox /dev/pts/0 > 1 /bin/busybox /dev/pts/0 > 1 /bin/busybox /dev/pts/0 > 1 /bin/busybox /dev/tty > {noformat} > so if you ran Solr in the background, and it failed to start, this code > would produce a false positive. > For example: > {noformat} > docker volume create mysol > docker run -v mysol:/mysol bash bash -c "chown 8983:8983 /mysol" > docker run -it -v mysol:/mysol -w /mysol -v > $HOME/Downloads/solr-7.6.0.tgz:/solr-7.6.0.tgz openjdk:8-alpine sh > apk add procps bash > tar xvzf /solr-7.6.0.tgz > chown -R 8983:8983 . 
> {noformat} > then in a separate terminal: > {noformat} > $ docker exec -it -u 8983 serene_saha sh > /mysol $ SOLR_OPTS=--invalid ./solr-7.6.0/bin/solr start > whoami: unknown uid 8983 > Waiting up to 180 seconds to see Solr running on port 8983 [|] > Started Solr server on port 8983 (pid=101). Happy searching! > /mysol $ > {noformat} > and in another separate terminal: > {noformat} > $ docker exec -it thirsty_liskov bash > bash-4.4$ cat server/logs/solr-8983-console.log > Unrecognized option: --invalid > Error: Could not create the Java Virtual Machine. > Error: A fatal exception has occurred. Program will exit. > {noformat} > so it is saying Solr is running, when it isn't. > Now, all this can be avoided by just installing t
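All three pitfalls above come from asking {{lsof}} who owns the listening socket. A check that merely probes whether anything accepts connections on the port sidesteps them, at the cost of not knowing that the listener is actually Solr. A hedged, stdlib-only sketch of that idea — this is illustrative, not what bin/solr does, and the PortCheck/portInUse names are invented:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical alternative to the lsof check: just try to connect. This works
// regardless of which user owns the listening process (pitfall 1), emits no
// "no pwd entry" noise on stderr (pitfall 2), and does not depend on a
// non-busybox lsof being installed (pitfall 3). It cannot tell *what* is
// listening, only that something is.
public class PortCheck {
    public static boolean portInUse(int port) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress("127.0.0.1", port), 250);
            return true;   // something accepted the connection
        } catch (IOException e) {
            return false;  // connection refused or timed out: nothing listening
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo: open a listener on an ephemeral port and probe it.
        try (ServerSocket ss = new ServerSocket(0)) {
            System.out.println("in use: " + portInUse(ss.getLocalPort()));
        }
    }
}
```

The same connect-probe idea could of course be expressed in the shell script itself (e.g. via `nc -z` where available); the Java form is only used here so the sketch is self-contained.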
[jira] [Commented] (SOLR-14169) Fix 20 Resource Leak warnings in apache/solr/common
[ https://issues.apache.org/jira/browse/SOLR-14169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008815#comment-17008815 ] Lucene/Solr QA commented on SOLR-14169: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 30s{color} | {color:green} solrj in the patch passed. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 9m 10s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | SOLR-14169 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12990001/SOLR-14169-01.patch | | Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns | | uname | Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | ant | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh | | git revision | master / ac777a53522 | | ant | version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019 | | Default Java | LTS | | Test Results | https://builds.apache.org/job/PreCommit-SOLR-Build/646/testReport/ | | modules | C: solr/solrj U: solr/solrj | | Console output | https://builds.apache.org/job/PreCommit-SOLR-Build/646/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. > Fix 20 Resource Leak warnings in apache/solr/common > --- > > Key: SOLR-14169 > URL: https://issues.apache.org/jira/browse/SOLR-14169 > Project: Solr > Issue Type: Sub-task >Reporter: Andras Salamon >Priority: Minor > Attachments: SOLR-14169-01.patch > > > There are 20 resource leak warnings in {{apache/solr/common}} > {noformat} > [ecj-lint] 5. WARNING in > /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/java/org/apache/solr/common/cloud/ZkNodeProps.java > (at line 98) [ecj-lint] props = (Map<String, Object>) new > JavaBinCodec().unmarshal(bytes); [ecj-lint] > ^^ [ecj-lint] Resource leak: '<unassigned Closeable value>' > is never closed-- [ecj-lint] 6. 
WARNING in > /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/java/org/apache/solr/common/util/Utils.java > (at line 206) [ecj-lint] new SolrJSONWriter(writer) [ecj-lint] > ^^ [ecj-lint] Resource leak: '<unassigned Closeable value>' is never closed-- [ecj-lint] 2. WARNING in > /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/test/org/apache/solr/common/util/ContentStreamTest.java > (at line 50) [ecj-lint] try (InputStream is = new > SolrResourceLoader().openResource("solrj/README"); [ecj-lint] > [ecj-lint] Resource leak: '<unassigned Closeable value>' is never closed-- [ecj-lint] 3. WARNING in > /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/test/org/apache/solr/common/util/ContentStreamTest.java > (at line 73) [ecj-lint] try (InputStream is = new > SolrResourceLoader().openResource("solrj/README"); [ecj-lint] > [ecj-lint] Resource leak: '<unassigned Closeable value>' is never closed-- [ecj-lint] 4. WARNING in > /Users/andrassalamon/src/lucene-solr-upstream/solr/solrj/src/test/org/apache/solr/common/util/ContentStreamTest.java > (at line 98)
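The warnings above all follow one pattern: a Closeable such as JavaBinCodec or SolrJSONWriter is constructed inline, so no reference ever exists for anyone to close. The stdlib-only sketch below reproduces the leak and the try-with-resources fix; TrackedResource is a hypothetical stand-in for JavaBinCodec, not a Solr class:

```java
import java.io.Closeable;

// Minimal model of the ecj-lint "Resource leak: '<unassigned Closeable value>'
// is never closed" warning. TrackedResource counts open instances so the leak
// is observable.
public class LeakDemo {
    static int openCount = 0;

    static class TrackedResource implements Closeable {
        TrackedResource() { openCount++; }
        int unmarshal(byte[] bytes) { return bytes.length; }
        @Override public void close() { openCount--; }  // narrowed: throws nothing
    }

    // The flagged shape: the anonymous instance can never be closed.
    static int leaky(byte[] bytes) {
        return new TrackedResource().unmarshal(bytes);
    }

    // The fix: assign the resource inside try-with-resources so close() runs.
    static int safe(byte[] bytes) {
        try (TrackedResource r = new TrackedResource()) {
            return r.unmarshal(bytes);
        }
    }

    public static void main(String[] args) {
        leaky(new byte[4]);
        System.out.println("after leaky(): open resources = " + openCount);
        safe(new byte[4]);
        System.out.println("after safe():  open resources = " + openCount);
    }
}
```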
[GitHub] [lucene-solr] kaynewu opened a new pull request #1143: HdfsDirectory support createTempOutput
kaynewu opened a new pull request #1143: HdfsDirectory support createTempOutput URL: https://github.com/apache/lucene-solr/pull/1143 HdfsDirectory support createTempOutput This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1142: SOLR-14166: fq cache=false should use TwoPhaseIterator
dsmiley commented on a change in pull request #1142: SOLR-14166: fq cache=false should use TwoPhaseIterator URL: https://github.com/apache/lucene-solr/pull/1142#discussion_r363290923 ## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/feature/SolrFeature.java ## @@ -237,21 +237,25 @@ public FeatureScorer scorer(LeafReaderContext context) throws IOException { * @return DocIdSetIterator to traverse documents that matched all filter * criteria */ +// TODO it's not optimal to call getProcessedFilter per-segment! Save the results into one Query +// TODO rename to "FromFilterQueries" suffix to at least suggest this uses the filter cache private DocIdSetIterator getDocIdSetIteratorFromQueries(List queries, LeafReaderContext context) throws IOException { final SolrIndexSearcher.ProcessedFilter pf = ((SolrIndexSearcher) searcher) .getProcessedFilter(null, queries); - final Bits liveDocs = context.reader().getLiveDocs(); - - DocIdSetIterator idIter = null; - if (pf.filter != null) { -final DocIdSet idSet = pf.filter.getDocIdSet(context, liveDocs); -if (idSet != null) { - idIter = idSet.iterator(); -} + if (pf.postFilter != null) { +throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, +"PostFilter queries are not supported"); } - - return idIter; + Query q = pf.filter; + if (q == null) { +q = new MatchAllDocsQuery(); // usually never happens? + } + Scorer scorer = q.createWeight(searcher, ScoreMode.COMPLETE_NO_SCORES, 1f).scorer(context); Review comment: Ah; good catch! No it's not. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1142: SOLR-14166: fq cache=false should use TwoPhaseIterator
dsmiley commented on a change in pull request #1142: SOLR-14166: fq cache=false should use TwoPhaseIterator URL: https://github.com/apache/lucene-solr/pull/1142#discussion_r363293567 ## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/feature/SolrFeature.java ## @@ -237,21 +237,25 @@ public FeatureScorer scorer(LeafReaderContext context) throws IOException { * @return DocIdSetIterator to traverse documents that matched all filter * criteria */ +// TODO it's not optimal to call getProcessedFilter per-segment! Save the results into one Query +// TODO rename to "FromFilterQueries" suffix to at least suggest this uses the filter cache private DocIdSetIterator getDocIdSetIteratorFromQueries(List queries, LeafReaderContext context) throws IOException { final SolrIndexSearcher.ProcessedFilter pf = ((SolrIndexSearcher) searcher) .getProcessedFilter(null, queries); - final Bits liveDocs = context.reader().getLiveDocs(); - - DocIdSetIterator idIter = null; - if (pf.filter != null) { -final DocIdSet idSet = pf.filter.getDocIdSet(context, liveDocs); -if (idSet != null) { - idIter = idSet.iterator(); -} + if (pf.postFilter != null) { +throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, +"PostFilter queries are not supported"); } - - return idIter; + Query q = pf.filter; + if (q == null) { +q = new MatchAllDocsQuery(); // usually never happens? + } + Scorer scorer = q.createWeight(searcher, ScoreMode.COMPLETE_NO_SCORES, 1f).scorer(context); + if (scorer != null) { +return scorer.iterator(); Review comment: This particular method on this class for LTR wants to return a DocIdSetIterator. You are correct that this method will not completely benefit from TwoPhaseIterator as-designed. It will benefit from the cost ordering aspect though. I like your suggestion of returning a Scorer, thus enabling the caller to _potentially_ use it better (it does not today). 
But I don't want to scope creep this PR into the LTR module more than necessary to accomplish the primary goal of the PR. If what you propose is pretty simple then it can be done now but I don't see it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14158) package manager to read keys from packagestore and not ZK
[ https://issues.apache.org/jira/browse/SOLR-14158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008846#comment-17008846 ] David Smiley commented on SOLR-14158: - This is perhaps a bigger issue that needs discussion on the dev list. It gets at Solr's security posture and what assumptions we have about securing Solr. I'm not for or against what's happening in the issue; I just want more eye-balls on it. > package manager to read keys from packagestore and not ZK > -- > > Key: SOLR-14158 > URL: https://issues.apache.org/jira/browse/SOLR-14158 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: packages >Affects Versions: 8.4 >Reporter: Noble Paul >Assignee: Noble Paul >Priority: Blocker > Labels: packagemanager > Fix For: 8.4.1 > > > The security of the package system relies on securing ZK. It's much easier > for users to secure the file system than securing ZK. > We provide an option to read public keys from file store. > This will > * Have a special directory called {{_trusted_}} . Direct writes are forbidden > to that directory over http > * The CLI directly writes the keys to the > {{/filestore/_trusted_/keys/}} directory. Other nodes are asked to > fetch the public key files from that node > * Package artifacts will continue to be uploaded over http -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-14158) package manager to read keys from packagestore and not ZK
[ https://issues.apache.org/jira/browse/SOLR-14158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008846#comment-17008846 ] David Smiley edited comment on SOLR-14158 at 1/6/20 1:36 PM: - This is perhaps a bigger issue that needs discussion on the dev list. It gets at Solr's security posture and what assumptions we have about securing Solr. I'm not for or against what's happening in the issue; I just want more eye-balls on it. was (Author: dsmiley): This is perhaps a bigger issue that needs discussion on the dev list. It gets at Solr's security posture and what assumptions we have about securing Solr. I'm for/against what's happening in the issue but just want more eye-balls on it. > package manager to read keys from packagestore and not ZK > -- > > Key: SOLR-14158 > URL: https://issues.apache.org/jira/browse/SOLR-14158 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: packages >Affects Versions: 8.4 >Reporter: Noble Paul >Assignee: Noble Paul >Priority: Blocker > Labels: packagemanager > Fix For: 8.4.1 > > > The security of the package system relies on securing ZK. It's much easier > for users to secure the file system than securing ZK. > We provide an option to read public keys from file store. > This will > * Have a special directory called {{_trusted_}} . Direct writes are forbidden > to that directory over http > * The CLI directly writes to the keys to > {{/filestore/_trusted_/keys/}} directory. Other nodes are asked to > fetch the public key files from that node > * Package artifacts will continue to be uploaded over http -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9115) NRTCachingDirectory may put large files in the cache
Adrien Grand created LUCENE-9115: Summary: NRTCachingDirectory may put large files in the cache Key: LUCENE-9115 URL: https://issues.apache.org/jira/browse/LUCENE-9115 Project: Lucene - Core Issue Type: Bug Reporter: Adrien Grand NRTCachingDirectory assumes that the length of a file to write is 0 if there is no merge info or flush info. This is not correct as there are situations when Lucene might write very large files that have neither of them, for instance: - Stored fields are written on the fly with IOContext.DEFAULT (which doesn't have flush or merge info) and without taking any of the IndexWriter buffer, so gigabytes could be written before a flush happens. - BKD trees are merged with IOContext.DEFAULT. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
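The faulty assumption above can be sketched in a few lines. Everything below is an illustration of the sizing decision the issue describes, not Lucene's actual NRTCachingDirectory API; the class, field, and method names are invented:

```java
// Hedged sketch of the LUCENE-9115 bug: when an IOContext carries neither
// FlushInfo nor MergeInfo (e.g. IOContext.DEFAULT, used for stored fields
// and BKD-tree merges), the expected file size falls back to 0, so any file
// -- even a multi-gigabyte one -- passes the "small enough to cache" test.
public class CacheSizingSketch {
    static final long MAX_CACHED_BYTES = 60L * 1024 * 1024;  // illustrative limit

    // flushSizeBytes / mergeSizeBytes are null when the IOContext has no
    // FlushInfo / MergeInfo.
    static boolean wouldCache(Long flushSizeBytes, Long mergeSizeBytes) {
        long expected = 0;  // the problematic default for unknown sizes
        if (flushSizeBytes != null) {
            expected = flushSizeBytes;
        } else if (mergeSizeBytes != null) {
            expected = mergeSizeBytes;
        }
        return expected <= MAX_CACHED_BYTES;
    }

    public static void main(String[] args) {
        // A file of unknown size is always judged cacheable.
        System.out.println("unknown size cached? " + wouldCache(null, null));
    }
}
```

The fix direction implied by the issue title is to refuse to cache when the size is unknown rather than assume 0.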
[GitHub] [lucene-solr] zsgyulavari opened a new pull request #1144: SOLR-13756 updated restlet mvn repository url.
zsgyulavari opened a new pull request #1144: SOLR-13756 updated restlet mvn repository url. URL: https://github.com/apache/lucene-solr/pull/1144 # Description Updated the old repository URL for the restlet framework to the current official one, stated at: https://restlet.talend.com/downloads/current/ # Solution The old repository URL redirects to the new one, but Ivy fails to follow the redirect on some platforms. The redirect also points to the updated URL. # Tests It could be compiled even after deleting the local ivy cache using `rm -rf ~/.ivy2/cache` . # Checklist Please review the following and check all that apply: - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability. - [x] I have created a Jira issue and added the issue ID to my pull request title. - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [x] I have developed this patch against the `master` branch. - [x] I have run `ant precommit` and the appropriate test suite. - [n/a] I have added tests for my changes. - [n/a] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8673) Use radix partitioning when merging dimensional points
[ https://issues.apache.org/jira/browse/LUCENE-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008854#comment-17008854 ] ASF subversion and git services commented on LUCENE-8673: - Commit b6f31835ad18da0f7a22064481b0d0e167f9f30c in lucene-solr's branch refs/heads/master from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b6f3183 ] LUCENE-8673: Avoid OOMEs because of IOContext randomization. > Use radix partitioning when merging dimensional points > -- > > Key: LUCENE-8673 > URL: https://issues.apache.org/jira/browse/LUCENE-8673 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ignacio Vera >Assignee: Ignacio Vera >Priority: Major > Fix For: 8.x, master (9.0) > > Attachments: Geo3D.png, Geo3D.png, Geo3D.png, LatLonPoint.png, > LatLonPoint.png, LatLonPoint.png, LatLonShape.png, LatLonShape.png, > LatLonShape.png > > Time Spent: 5h 40m > Remaining Estimate: 0h > > Following the advice of [~jpountz] in LUCENE-8623 I have investigated using > radix selection when merging segments instead of sorting the data at the > beginning. The results are pretty promising when running Lucene geo > benchmarks: > > ||Approach||Index time (sec): Dev||Index Time (sec): Base||Index Time: > Diff||Force merge time (sec): Dev||Force Merge time (sec): Base||Force Merge > Time: Diff||Index size (GB): Dev||Index size (GB): Base||Index Size: > Diff||Reader heap (MB): Dev||Reader heap (MB): Base||Reader heap: Diff > |points|241.5s|235.0s| 3%|157.2s|157.9s|-0%|0.55|0.55| 0%|1.57|1.57| 0%| > |shapes|416.1s|650.1s|-36%|306.1s|603.2s|-49%|1.29|1.29| 0%|1.61|1.61| 0%| > |geo3d|261.0s|360.1s|-28%|170.2s|279.9s|-39%|0.75|0.75| 0%|1.58|1.58| 0%| > > edited: table formatting to be a jira table > > In 2D the index throughput is more or less equal but for higher dimensions > the impact is quite big. 
In all cases the merging process requires much less > disk space; I am attaching plots showing the different behaviour and I am > opening a pull request. > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8673) Use radix partitioning when merging dimensional points
[ https://issues.apache.org/jira/browse/LUCENE-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008853#comment-17008853 ] ASF subversion and git services commented on LUCENE-8673: - Commit 83999401ae9d3b23d14fe880adeb4fc57358bc2a in lucene-solr's branch refs/heads/branch_8x from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8399940 ] LUCENE-8673: Avoid OOMEs because of IOContext randomization. > Use radix partitioning when merging dimensional points > -- > > Key: LUCENE-8673 > URL: https://issues.apache.org/jira/browse/LUCENE-8673 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ignacio Vera >Assignee: Ignacio Vera >Priority: Major > Fix For: 8.x, master (9.0) > > Attachments: Geo3D.png, Geo3D.png, Geo3D.png, LatLonPoint.png, > LatLonPoint.png, LatLonPoint.png, LatLonShape.png, LatLonShape.png, > LatLonShape.png > > Time Spent: 5h 40m > Remaining Estimate: 0h > > Following the advice of [~jpountz] in LUCENE-8623 I have investigated using > radix selection when merging segments instead of sorting the data at the > beginning. The results are pretty promising when running Lucene geo > benchmarks: > > ||Approach||Index time (sec): Dev||Index Time (sec): Base||Index Time: > Diff||Force merge time (sec): Dev||Force Merge time (sec): Base||Force Merge > Time: Diff||Index size (GB): Dev||Index size (GB): Base||Index Size: > Diff||Reader heap (MB): Dev||Reader heap (MB): Base||Reader heap: Diff > |points|241.5s|235.0s| 3%|157.2s|157.9s|-0%|0.55|0.55| 0%|1.57|1.57| 0%| > |shapes|416.1s|650.1s|-36%|306.1s|603.2s|-49%|1.29|1.29| 0%|1.61|1.61| 0%| > |geo3d|261.0s|360.1s|-28%|170.2s|279.9s|-39%|0.75|0.75| 0%|1.58|1.58| 0%| > > edited: table formatting to be a jira table > > In 2D the index throughput is more or less equal but for higher dimensions > the impact is quite big. 
In all cases the merging process requires much less > disk space; I am attaching plots showing the different behaviour and I am > opening a pull request. > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13756) ivy cannot download org.restlet.ext.servlet jar
[ https://issues.apache.org/jira/browse/SOLR-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008855#comment-17008855 ] Zsolt Gyulavari commented on SOLR-13756: Created a PR with the change mentioned above; please review: [https://github.com/apache/lucene-solr/pull/1144] > ivy cannot download org.restlet.ext.servlet jar > --- > > Key: SOLR-13756 > URL: https://issues.apache.org/jira/browse/SOLR-13756 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chongchen Chen >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > I checked out the project and ran `ant idea`, which tries to download jars. But > https://repo1.maven.org/maven2/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar > will return 404 now. > [ivy:retrieve] public: tried > [ivy:retrieve] > https://repo1.maven.org/maven2/org/restlet/jee/org.restlet.ext.servlet/2.3.0/org.restlet.ext.servlet-2.3.0.jar > [ivy:retrieve]:: > [ivy:retrieve]:: FAILED DOWNLOADS:: > [ivy:retrieve]:: ^ see resolution messages for details ^ :: > [ivy:retrieve]:: > [ivy:retrieve]:: > org.restlet.jee#org.restlet;2.3.0!org.restlet.jar > [ivy:retrieve]:: > org.restlet.jee#org.restlet.ext.servlet;2.3.0!org.restlet.ext.servlet.jar > [ivy:retrieve]:: -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14170) Tag package feature as experimental
Jan Høydahl created SOLR-14170: -- Summary: Tag package feature as experimental Key: SOLR-14170 URL: https://issues.apache.org/jira/browse/SOLR-14170 Project: Solr Issue Type: Test Security Level: Public (Default Security Level. Issues are Public) Components: documentation Reporter: Jan Høydahl The new package store and package installation feature introduced in 8.4 was supposed to be tagged as lucene.experimental with a clear warning in ref-guide "Not yet recommended for production use" Let's add that for 8.5 so there is no doubt that if you use the feature you know the risks. Once the APIs have stabilized and there are a number of packages available "in the wild", we can decide to release it as a "GA" feature, but not yet! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14170) Tag package feature as experimental
[ https://issues.apache.org/jira/browse/SOLR-14170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-14170: --- Fix Version/s: 8.5 > Tag package feature as experimental > --- > > Key: SOLR-14170 > URL: https://issues.apache.org/jira/browse/SOLR-14170 > Project: Solr > Issue Type: Test > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation >Reporter: Jan Høydahl >Priority: Major > Fix For: 8.5 > > > The new package store and package installation feature introduced in 8.4 was > supposed to be tagged as lucene.experimental with a clear warning in > ref-guide "Not yet recommended for production use" > Let's add that for 8.5 so there is no doubt that if you use the feature you > know the risks. Once the APIs have stabilized and there are a number of > packages available "in the wild", we can decide to release it as a "GA" > feature, but not yet! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz opened a new pull request #1145: LUCENE-9115: NRTCachingDirectory shouldn't cache files of unknown size.
jpountz opened a new pull request #1145: LUCENE-9115: NRTCachingDirectory shouldn't cache files of unknown size. URL: https://github.com/apache/lucene-solr/pull/1145 See https://issues.apache.org/jira/browse/LUCENE-9115. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8673) Use radix partitioning when merging dimensional points
[ https://issues.apache.org/jira/browse/LUCENE-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008872#comment-17008872 ] Adrien Grand commented on LUCENE-8673: -- I dug a bit more. This failure is mostly due to IOContext randomization that can make NRTCachingDirectory put large files in the cache. I fixed the test framework to not use NRTCachingDirectory when the test requests a FSDirectory in order to avoid this issue. Separately I found an issue with NRTCachingDirectory, but it is not the root cause of these failures: LUCENE-9115. > Use radix partitioning when merging dimensional points > -- > > Key: LUCENE-8673 > URL: https://issues.apache.org/jira/browse/LUCENE-8673 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ignacio Vera >Assignee: Ignacio Vera >Priority: Major > Fix For: 8.x, master (9.0) > > Attachments: Geo3D.png, Geo3D.png, Geo3D.png, LatLonPoint.png, > LatLonPoint.png, LatLonPoint.png, LatLonShape.png, LatLonShape.png, > LatLonShape.png > > Time Spent: 5h 40m > Remaining Estimate: 0h > > Following the advice of [~jpountz] in LUCENE-8623 I have investigated using > radix selection when merging segments instead of sorting the data at the > beginning. The results are pretty promising when running Lucene geo > benchmarks: > > ||Approach||Index time (sec): Dev||Index Time (sec): Base||Index Time: > Diff||Force merge time (sec): Dev||Force Merge time (sec): Base||Force Merge > Time: Diff||Index size (GB): Dev||Index size (GB): Base||Index Size: > Diff||Reader heap (MB): Dev||Reader heap (MB): Base||Reader heap: Diff > |points|241.5s|235.0s| 3%|157.2s|157.9s|-0%|0.55|0.55| 0%|1.57|1.57| 0%| > |shapes|416.1s|650.1s|-36%|306.1s|603.2s|-49%|1.29|1.29| 0%|1.61|1.61| 0%| > |geo3d|261.0s|360.1s|-28%|170.2s|279.9s|-39%|0.75|0.75| 0%|1.58|1.58| 0%| > > edited: table formatting to be a jira table > > In 2D the index throughput is more or less equal but for higher dimensions > the impact is quite big. 
In all cases the merging process requires much less > disk space; I am attaching plots showing the different behaviour and I am > opening a pull request. > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008871#comment-17008871 ] Markus Jelsma commented on LUCENE-9112: --- SegmentingTokenizerBase works fine on texts smaller than 1024. Any term that occupies the 1024th position is split due to this bug. Ideally, the class should refill the buffer and move on for each full sentence it takes; there are hardly any sentences over 1024 characters. But judging from the println output I see, it does not do that, or does so incorrectly. I am going to work around the problem for now by splitting my text into paragraphs using newlines. However, paragraphs larger than 1024 will be a problem. I have checked my text sources on paragraph length and they usually do not exceed it, but paragraphs longer than 1024 are common enough, so I'll attach the simplest patch that 'fixes' that part for my case. > OpenNLP tokenizer is fooled by text containing spurious punctuation > --- > > Key: LUCENE-9112 > URL: https://issues.apache.org/jira/browse/LUCENE-9112 > Project: Lucene - Core > Issue Type: Bug > Components: modules/analysis >Affects Versions: master (9.0) >Reporter: Markus Jelsma >Priority: Major > Labels: opennlp > Fix For: master (9.0) > > Attachments: LUCENE-9112-unittest.patch, LUCENE-9112-unittest.patch, > en-sent.bin, en-token.bin > > > The OpenNLP tokenizer shows weird behaviour when text contains spurious > punctuation such as having triple dots trailing a sentence... > # the first dot becomes part of the token, so 'sentence.' becomes the > token > # much further down the text, a seemingly unrelated token is then suddenly > split up, in my example (see attached unit test) the name 'Baron' is split > into 'Baro' and 'n', this is the real problem > The problems never seem to occur when using small texts in unit tests but > they certainly do in real-world examples. 
Depending on how many 'spurious' dots, > a completely different term can become split, or the same term in just a > different location. > I am not too sure if this is actually a problem in the Lucene code, but it is > a problem and i have a Lucene unit test proving the problem. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
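The buffer behavior Markus describes can be illustrated without Lucene at all. The sketch below is a hypothetical stdlib-only stand-in (not the actual SegmentingTokenizerBase code): a segmenter that hard-cuts the input into fixed 1024-character windows splits any token straddling the window edge, while one that backs off to the last sentence boundary inside the window keeps tokens whole, which is the refill behavior the comment argues for.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the 1024-char buffer issue discussed above.
class BufferSplitSketch {
    static final int BUFFER_SIZE = 1024;

    // Naive windowing: hard cut every BUFFER_SIZE chars.
    // A token straddling position 1024 gets split in two.
    static List<String> hardWindows(String text) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < text.length(); i += BUFFER_SIZE) {
            out.add(text.substring(i, Math.min(text.length(), i + BUFFER_SIZE)));
        }
        return out;
    }

    // Sentence-aware windowing: back off to the last '.' inside the window,
    // so the next refill starts at a sentence start and tokens stay whole.
    static List<String> sentenceWindows(String text) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = Math.min(text.length(), i + BUFFER_SIZE);
            if (end < text.length()) {
                int dot = text.lastIndexOf('.', end - 1);
                if (dot > i) end = dot + 1; // cut just after the sentence
            }
            out.add(text.substring(i, end));
            i = end;
        }
        return out;
    }
}
```

With a name placed so it straddles position 1024, the hard-cut variant emits it in two pieces (the 'Baro' / 'n' symptom), while the sentence-aware variant keeps it intact.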
[jira] [Assigned] (SOLR-14170) Tag package feature as experimental
[ https://issues.apache.org/jira/browse/SOLR-14170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chattopadhyaya reassigned SOLR-14170: --- Assignee: Ishan Chattopadhyaya > Tag package feature as experimental > --- > > Key: SOLR-14170 > URL: https://issues.apache.org/jira/browse/SOLR-14170 > Project: Solr > Issue Type: Test > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation >Reporter: Jan Høydahl >Assignee: Ishan Chattopadhyaya >Priority: Major > Fix For: 8.5 > > > The new package store and package installation feature introduced in 8.4 was > supposed to be tagged as lucene.experimental with a clear warning in > ref-guide "Not yet recommended for production use" > Let's add that for 8.5 so there is no doubt that if you use the feature you > know the risks. Once the APIs have stabilized and there are a number of > packages available "in the wild", we can decide to release it as a "GA" > feature, but not yet! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] bruno-roustant opened a new pull request #1146: SOLR-6613: TextField.analyzeMultiTerm does not throw an exception…
bruno-roustant opened a new pull request #1146: SOLR-6613: TextField.analyzeMultiTerm does not throw an exception… URL: https://github.com/apache/lucene-solr/pull/1146 when Analyzer returns no term. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-6613) TextField.analyzeMultiTerm should not throw exception when analyzer returns no term
[ https://issues.apache.org/jira/browse/SOLR-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008876#comment-17008876 ] Bruno Roustant commented on SOLR-6613: -- PR added > TextField.analyzeMultiTerm should not throw exception when analyzer returns > no term > --- > > Key: SOLR-6613 > URL: https://issues.apache.org/jira/browse/SOLR-6613 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 4.3.1, 4.10.2, 6.0 >Reporter: Bruno Roustant >Assignee: Bruno Roustant >Priority: Major > Attachments: TestTextField.java > > Time Spent: 10m > Remaining Estimate: 0h > > In TextField.analyzeMultiTerm() > at line > try { > if (!source.incrementToken()) > throw new SolrException(); > The method should not throw an exception if there is no token, because having > no token is legitimate: all tokens may be filtered out (e.g. by a > blocking Filter such as StopFilter). > In this case it should simply return null (as it already does in some > cases; see the first line of the method). However, SolrQueryParserBase also needs to > be fixed to correctly handle null returned by TextField.analyzeMultiTerm(). > See the attached TestTextField for the corresponding new test class. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
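The contract the issue argues for can be sketched with plain stdlib code. This is a hypothetical stand-in, not the real TextField/SolrQueryParserBase code: the analyze step returns null when every token was filtered out (instead of throwing), and the caller tolerates the null by skipping the clause.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the null-instead-of-throw contract described above.
class AnalyzeSketch {
    static final Set<String> STOP_WORDS = Set.of("the", "a", "an");

    // Stand-in for TextField.analyzeMultiTerm: returns the first surviving
    // term, or null when the (stop-)filter removed every token.
    static String analyzeMultiTerm(List<String> tokens) {
        Iterator<String> it = tokens.stream()
                .filter(t -> !STOP_WORDS.contains(t))
                .iterator();
        if (!it.hasNext()) {
            return null; // legitimate: all tokens were filtered out
        }
        return it.next();
    }

    // Stand-in for the SolrQueryParserBase caller, which must handle null
    // by skipping the clause rather than failing the whole query.
    static boolean buildsClause(List<String> tokens) {
        String term = analyzeMultiTerm(tokens);
        return term != null;
    }
}
```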
[jira] [Updated] (LUCENE-9112) SegmentingTokenizerBase splits terms that occupy 1024th positions in text
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated LUCENE-9112: -- Summary: SegmentingTokenizerBase splits terms that occupy 1024th positions in text (was: OpenNLP tokenizer is fooled by text containing spurious punctuation) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on issue #1055: SOLR-13932 Review directory locking and Blob interactions
murblanc commented on issue #1055: SOLR-13932 Review directory locking and Blob interactions URL: https://github.com/apache/lucene-solr/pull/1055#issuecomment-571155842 @mbwaheed @yonik I have rebased the changes after Bilal approved the review. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] risdenk opened a new pull request #1147: SOLR-14163: SOLR_SSL_CLIENT_HOSTNAME_VERIFICATION needs to work with Jetty server/client SSL contexts
risdenk opened a new pull request #1147: SOLR-14163: SOLR_SSL_CLIENT_HOSTNAME_VERIFICATION needs to work with Jetty server/client SSL contexts URL: https://github.com/apache/lucene-solr/pull/1147 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Assigned] (SOLR-14075) Investigate performance degradation of /export from Solr 6 to Solr 7
[ https://issues.apache.org/jira/browse/SOLR-14075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein reassigned SOLR-14075: - Assignee: Joel Bernstein > Investigate performance degradation of /export from Solr 6 to Solr 7 > > > Key: SOLR-14075 > URL: https://issues.apache.org/jira/browse/SOLR-14075 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein >Priority: Major > > There have been customer reports, not on the user or dev list, of performance > degradation of the /export handler from Solr 6 to Solr 7. Originally it was > thought that SOLR-13013 would resolve this issue but, this has turned out not > to be the case. This ticket will determine if there is a performance > degradation in /export between Solr 6 and 7 and pin-point the reason. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gerlowski updated SOLR-13890: --- Attachment: SOLR-13890.patch toplevel-tpi-perf-comparison.png Status: Open (was: Open) Given the recent performance results proving that the main differentiator is top-level vs per-segment, I took a stab at a "top-level" DVTQ TPI implementation. It still needs some cleanup, and I could use some feedback on if/how we want to expose this to users: should Solr try to pick intelligently between the per-segment and top-level TPI implementations? Should users be able to override this if desired? (Right now I've added a switch over to using "top-level" at 500 terms, with a "subMethod" param to let users override this if desired.) So there are some loose ends here, but the performance numbers for the new TPI implementation are promising. Roughly equivalent to the postfilter implementation we've been going off of. !toplevel-tpi-perf-comparison.png! Thoughts? > Add postfilter support to {!terms} queries > -- > > Key: SOLR-13890 > URL: https://issues.apache.org/jira/browse/SOLR-13890 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: master (9.0) >Reporter: Jason Gerlowski >Assignee: Jason Gerlowski >Priority: Major > Attachments: SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch, > SOLR-13890.patch, SOLR-13890.patch, Screen Shot 2020-01-02 at 2.25.12 PM.png, > post_optimize_performance.png, toplevel-tpi-perf-comparison.png > > > There are some use-cases where it'd be nice if the "terms" qparser created a > query that could be run as a postfilter. Particularly, when users are > checking for hundreds or thousands of terms, a postfilter implementation can > be more performant than the standard processing. > With this issue, I'd like to propose a post-filter implementation for the > {{docValuesTermsFilter}} "method".
Postfilter creation can use a > SortedSetDocValues object to populate a DV bitset with the "terms" being > checked for. Each document run through the post-filter can look at their > doc-values for the field in question and check them efficiently against the > constructed bitset. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
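The postfilter idea in the description can be sketched with stdlib types only. This is a hypothetical stand-in, not the actual Solr SortedSetDocValues/FixedBitSet code: the query terms are resolved once, up front, into a bitset of term-dictionary ordinals, so each collected document then costs only a bitset lookup per doc-value ordinal instead of a term-by-term comparison.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the ordinal-bitset postfilter described above.
class TermsPostFilterSketch {
    private final Map<String, Integer> termToOrd = new HashMap<>();
    private final BitSet queryOrds = new BitSet();

    // indexTerms plays the role of the field's term dictionary; each term
    // gets a sequential ordinal, as SortedSetDocValues ordinals would.
    TermsPostFilterSketch(List<String> indexTerms, List<String> queryTerms) {
        for (String t : indexTerms) {
            termToOrd.putIfAbsent(t, termToOrd.size());
        }
        // Resolve the (possibly thousands of) query terms to ordinals once.
        for (String q : queryTerms) {
            Integer ord = termToOrd.get(q);
            if (ord != null) queryOrds.set(ord);
        }
    }

    // Per-document check: a bitset lookup per doc-value of the document.
    boolean matches(List<String> docValues) {
        for (String v : docValues) {
            Integer ord = termToOrd.get(v);
            if (ord != null && queryOrds.get(ord)) return true;
        }
        return false;
    }
}
```

Query terms absent from the dictionary simply set no bit, so they cost nothing at collection time.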
[jira] [Commented] (SOLR-14130) Add postlogs command line tool for indexing Solr logs
[ https://issues.apache.org/jira/browse/SOLR-14130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008973#comment-17008973 ] Erick Erickson commented on SOLR-14130: --- [~jbernste] I took a quick look at the patch and it looks great. There are two things I might suggest: 1> A short note on how to set up the collection you index to, mostly just the configset you should use (_default I assume, one shard no replicas?). 2> When I was playing around with this concept I found it useful to have the option of a "batch" parameter to allow me to group runs, in this case maybe default to the directory specified "/user/foo/logs". I can easily see wanting to restrict searches to "baseline", "change1" etc., not to mention deleting all the logs indexed for a particular batch when no longer relevant while keeping those that are. Or maybe just use the directory specified and encourage users to put different runs (or whatever) in different dirs. There are a number of refinements that I was playing around with for experimenting, one in particular (that can wait for later) is being able to facet on the first recognizable line in the exceptions. "recognizable" might be a line that mentions "org.apache.solr" or "org.apache.lucene". Then when I facet on it and see 2,500 exceptions generated by the query parser, I can skip them easily... But that's for later and only really useful when there's a UI around it. Now if we just had a UI around this for arbitrary searches ;) > Add postlogs command line tool for indexing Solr logs > - > > Key: SOLR-14130 > URL: https://issues.apache.org/jira/browse/SOLR-14130 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. 
Issues are Public) >Reporter: Joel Bernstein >Assignee: Joel Bernstein >Priority: Major > Attachments: SOLR-14130.patch, SOLR-14130.patch, SOLR-14130.patch, > SOLR-14130.patch, SOLR-14130.patch, SOLR-14130.patch, SOLR-14130.patch, > Screen Shot 2019-12-19 at 2.04.41 PM.png, Screen Shot 2019-12-19 at 2.16.01 > PM.png, Screen Shot 2019-12-19 at 2.35.41 PM.png, Screen Shot 2019-12-21 at > 8.46.51 AM.png > > > This ticket adds a simple command line tool for posting Solr logs to a solr > index. The tool works with the out of the box Solr log format. Still a work > in progress but currently indexes: > * queries > * updates > * commits > * new searchers > * errors - including stack traces > Attached are some sample visualizations using Solr Streaming Expressions and > Math Expressions after the data has been loaded. The visualizations show: > time series, scatter plots, histograms and quantile plots, but really this is > just scratching the surface of the visualizations that can be done with the > Solr logs. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
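Erick's refinement of faceting on the first "recognizable" line of an exception can be sketched in a few lines. This is a hypothetical helper (the name facetKey is made up, not part of the patch): scan the stack trace and take the first line mentioning org.apache.solr or org.apache.lucene as the grouping key.

```java
import java.util.Arrays;
import java.util.Optional;

// Hypothetical sketch of grouping exceptions by their first recognizable
// stack frame, as floated in the comment above.
class ExceptionFacetSketch {
    static Optional<String> facetKey(String stackTrace) {
        return Arrays.stream(stackTrace.split("\n"))
                .map(String::trim)
                .filter(l -> l.contains("org.apache.solr")
                          || l.contains("org.apache.lucene"))
                .findFirst();
    }
}
```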
[jira] [Assigned] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Houston Putman reassigned SOLR-11746: - Assignee: Houston Putman > numeric fields need better error handling for prefix/wildcard syntax -- > consider uniform support for "foo:* == foo:[* TO *]" > > > Key: SOLR-11746 > URL: https://issues.apache.org/jira/browse/SOLR-11746 > Project: Solr > Issue Type: Improvement >Reporter: Chris M. Hostetter >Assignee: Houston Putman >Priority: Major > Attachments: SOLR-11746.patch, SOLR-11746.patch, SOLR-11746.patch > > > On the solr-user mailing list, Torsten Krah pointed out that with Trie > numeric fields, query syntax such as {{foo_d:\*}} has been functionally > equivalent to {{foo_d:\[\* TO \*]}} and asked why this was not also supported > for Point based numeric fields. > The fact that this type of syntax works (for {{indexed="true"}} Trie fields) > appears to have been an (untested, undocumented) fluke of Trie fields, given > that they use indexed terms for the (encoded) numeric terms and inherit the > default implementation of {{FieldType.getPrefixQuery}}, which produces a > prefix query against the {{""}} (empty string) term. > (Note that this syntax has apparently _*never*_ worked for Trie fields with > {{indexed="false" docValues="true"}}.) > In general, we should assess the behavior when users attempt a prefix/wildcard > syntax query against numeric fields, as currently the behavior is largely > nonsensical: prefix/wildcard syntax queries frequently match no docs without any sort > of error, and the aforementioned {{numeric_field:*}} behaves inconsistently > between points/trie fields and between indexed/docValued trie fields. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
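The "fluke" in the SOLR-11746 description is easy to demonstrate in isolation. The sketch below is a hypothetical stdlib stand-in (not the FieldType.getPrefixQuery code): the default prefix-query path boils down to a startsWith check against the field's indexed terms, and every string starts with the empty string, so foo:* happens to match everything for an indexed Trie field, while a docValues-only field has no indexed terms at all and the same query matches nothing.

```java
import java.util.List;

// Hypothetical illustration of why an empty-string prefix query "works"
// only for fields that actually have indexed terms.
class EmptyPrefixSketch {
    // Stand-in for a prefix query: count indexed terms matching the prefix.
    static long prefixMatches(List<String> indexedTerms, String prefix) {
        return indexedTerms.stream()
                .filter(t -> t.startsWith(prefix))
                .count();
    }
}
```

The inconsistency the issue wants to fix: the empty prefix matches all terms when terms exist, but a docValues-only field contributes an empty term list, so the identical query silently matches no documents.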
[jira] [Updated] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Houston Putman updated SOLR-11746: -- Attachment: SOLR-11746.patch -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008988#comment-17008988 ] Houston Putman commented on SOLR-11746: --- [~tflobbe] added a test for that. All tests passed before I added the test, so I'm going to do a final precommit check, then commit to 8 and master. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Houston Putman updated SOLR-11746: -- Affects Version/s: 7.0 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Houston Putman updated SOLR-11746: -- Attachment: SOLR-11746.patch -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009000#comment-17009000 ] Houston Putman commented on SOLR-11746: --- Updated the ref-guide to reduce confusion. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] ErickErickson opened a new pull request #1148: LUCENE-9080: Upgrade ICU4j to 62.2 and regenerate
ErickErickson opened a new pull request #1148: LUCENE-9080: Upgrade ICU4j to 62.2 and regenerate URL: https://github.com/apache/lucene-solr/pull/1148 # Description See comments on the JIRA This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009011#comment-17009011 ] ASF subversion and git services commented on SOLR-11746: Commit f5ab3ca688b3127bece252ffd87cc8bfa9f285ff in lucene-solr's branch refs/heads/master from Houston Putman [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=f5ab3ca ] SOLR-11746: Existence query support for numeric point fields -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9080) Upgrade ICU4j to 62.2 and regenerate
[ https://issues.apache.org/jira/browse/LUCENE-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009012#comment-17009012 ] Erick Erickson commented on LUCENE-9080: I think I have it working. Current state: * regenerate works * precommit passes * full test suite passes * I updated to ICU 64.2, see above. * I ran an absolutely minimal test of Solr, just "bin/solr start -e techproducts" and did a search. The fact that some of the binary files got regenerated makes me a little nervous, but all tests pass and I can fire up Solr. [~jpountz] : I'm flying a little blind here, WDYT about the changes to util/packed? I pulled out what I hope are the correct bits from the Packed directory, and changed the build file: see the changes in lucene/core/build.xml. I deleted gen_PackedThreeBlocks.py and gen_Direct.py as well. I'm trusting that the tests would barf if they were necessary. As a test I deleted everything in util/packed that had the "DO NOT EDIT" tag before I regenerated, to see if anything broke. Since files like BulkOperation*.java regenerated, I feel more confident. The files I manually changed were: lucene/core/build.xml utils/packed/gen_Packed64SingleBlock.py gen_Direct.py (deleted) gen_PackedThreeBlocks.py (deleted) generateUTR30DataFiles.java ivy-versions.properties The rest of the changes are a result of running the regenerate. I'll push it to master in the next day or so absent objections. [~dawid.weiss] [~jpountz] [~rcmuir] [~mikemccand] [~uschindler] et al., what do you think about merging this back to 8x? Nobody's apparently run regenerate in ages, and my motivation is to have a working baseline for the Gradle build, which won't be 8x anyway. Maybe raise another Jira that points back here for posterity, in case someone else needs to do this? Oh, and I'm finally trying to get all modern and use PRs, pardon me if I screw it up. 
> Upgrade ICU4j to 62.2 and regenerate > > > Key: LUCENE-9080 > URL: https://issues.apache.org/jira/browse/LUCENE-9080 > Project: Lucene - Core > Issue Type: Bug >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Attachments: after_regen.patch, before_regen.patch, status.res > > Time Spent: 10m > Remaining Estimate: 0h > > The root cause is that RamUsageEstimator.NUM_BYTES_INT has been removed and > the python scripts still reference it in the generated scripts. That part's > easy to fix. > Last time I looked, though, the regenerate produces some differences in the > generated files that should be looked at to insure they're benign. > Not really sure whether this should be a Lucene or Solr JIRA. Putting it in > Lucene since one of the failed files is: > lucene/core/src/java/org/apache/lucene/util/packed/Packed8ThreeBlocks.java > I do know that one of the Solr jflex-produced file has an unexplained > difference so it may bleed over. > "ant regenerate" needs about 24G on my machine FWIW. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009048#comment-17009048 ] Tomas Eduardo Fernandez Lobbe commented on SOLR-11746: -- Thanks for updating the docs. One nit: {code} -** `field:[* TO *]` matches all documents with the field +** `field:*` or `field:[* TO *]` matches all documents where the field exists * Pure negative queries (all clauses prohibited) are allowed (only as a top-level clause) ** `-inStock:false` finds all field values where inStock is not false ** `-field:[* TO *]` finds all documents without a value for field {code} You updated the positive but not the negative, i.e. {{`-field:*` or `-field:[* TO *]`...}} > numeric fields need better error handling for prefix/wildcard syntax -- > consider uniform support for "foo:* == foo:[* TO *]" > > > Key: SOLR-11746 > URL: https://issues.apache.org/jira/browse/SOLR-11746 > Project: Solr > Issue Type: Improvement >Affects Versions: 7.0 >Reporter: Chris M. Hostetter >Assignee: Houston Putman >Priority: Major > Attachments: SOLR-11746.patch, SOLR-11746.patch, SOLR-11746.patch, > SOLR-11746.patch, SOLR-11746.patch > > > On the solr-user mailing list, Torsten Krah pointed out that with Trie > numeric fields, query syntax such as {{foo_d:\*}} has been functionally > equivalent to {{foo_d:\[\* TO \*]}} and asked why this was not also supported > for Point based numeric fields. > The fact that this type of syntax works (for {{indexed="true"}} Trie fields) > appears to have been an (untested, undocumented) fluke of Trie fields given > that they use indexed terms for the (encoded) numeric terms and inherit the > default implementation of {{FieldType.getPrefixQuery}} which produces a > prefix query against the {{""}} (empty string) term. 
> (Note that this syntax has apparently _*never*_ worked for Trie fields with > {{indexed="false" docValues="true"}} ) > In general, we should assess the behavior when users attempt a prefix/wildcard > syntax query against numeric fields, as currently the behavior is largely > nonsensical: prefix/wildcard syntax frequently matches no docs w/o any sort > of error, and the aforementioned {{numeric_field:*}} behaves inconsistently > between points/trie fields and between indexed/docValued trie fields. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
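The equivalence being documented above — `field:*` behaves like `field:[* TO *]`, and the negated forms match documents lacking a value — can be sketched over plain dicts (field and document contents are hypothetical, not Solr's internals):

```python
# Sketch of the documented semantics: `field:*` / `field:[* TO *]` match
# documents where the field has any value; `-field:*` / `-field:[* TO *]`
# match documents with no value for the field.

def exists(doc, field):
    """field:* or field:[* TO *] -- true when the field has any value."""
    return doc.get(field) is not None

def not_exists(doc, field):
    """-field:* or -field:[* TO *] -- true when the field has no value."""
    return not exists(doc, field)

docs = [
    {"id": "1", "inStock": False},  # has a value (even though it's False)
    {"id": "2", "inStock": True},
    {"id": "3"},                    # no inStock value at all
]

with_field = [d["id"] for d in docs if exists(d, "inStock")]
without_field = [d["id"] for d in docs if not_exists(d, "inStock")]
```

Note that `-inStock:false` (value is not false) is a different query from `-inStock:*` (no value at all) — document "1" above matches the existence query but not `inStock:true`.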
[jira] [Commented] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009050#comment-17009050 ] Houston Putman commented on SOLR-11746: --- Good call, I'll add that in when I push to 8x and commit the small change to master. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] andyvuong commented on issue #1131: SOLR-14134: Add lazy and time-based eviction of shared core concurrency metada…
andyvuong commented on issue #1131: SOLR-14134: Add lazy and time-based eviction of shared core concurrency metada… URL: https://github.com/apache/lucene-solr/pull/1131#issuecomment-571242869 > And for regular eviction we do not need this tracking either, we can simply do that in SolrCore#close. The basic assumption is: if core container can hold so many SolrCore instances we can very easily hold this simple metadata. The problem of many cores is planned to be addressed by transient cores or zero replica design. With that I don't think this cache need to worry about its own size as long as it ties itself with SolrCore instances. @mbwaheed we can evict lazily on SolrCore instance creation in addition to SolrCore#close. To clarify, you're also saying we can scope this item smaller and stick with a simpler cache (a simple map, no size/time-based eviction as done here) and let that future item handle whatever is needed? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009068#comment-17009068 ] ASF subversion and git services commented on SOLR-11746: Commit 1f1b719478e298b5ada064197a7fa919b608d24c in lucene-solr's branch refs/heads/branch_8x from Houston Putman [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1f1b719 ] SOLR-11746: Existence query support for numeric point fields -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009069#comment-17009069 ] ASF subversion and git services commented on SOLR-11746: Commit 9edb143efdc6616906972ae6c629860c91a5a2e7 in lucene-solr's branch refs/heads/master from Houston Putman [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9edb143 ] SOLR-11746: Adding docs for negative existence queries. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9116) Simplify postings API by removing long[] metadata
Adrien Grand created LUCENE-9116: Summary: Simplify postings API by removing long[] metadata Key: LUCENE-9116 URL: https://issues.apache.org/jira/browse/LUCENE-9116 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand The postings API allows storing metadata about a term either in a long[] or in a byte[]. This is unnecessary as all information could be encoded in the byte[], which is what most codecs do in practice. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
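The simplification proposed here rests on the observation that anything held in the long[] metadata can be appended to the byte[] stream instead. A sketch of that idea using a variable-length encoding in the spirit of Lucene's vLong (illustrative Python, not the actual codec code):

```python
# Sketch: long values (e.g. per-term file pointers) encoded into a byte
# stream, 7 bits per byte, high bit set on every byte except the last.
# This shows why a separate long[] channel is unnecessary: the same
# values round-trip through the byte[].

def write_vlong(buf: bytearray, value: int) -> None:
    """Append one non-negative integer as a variable-length byte sequence."""
    assert value >= 0
    while value > 0x7F:
        buf.append((value & 0x7F) | 0x80)
        value >>= 7
    buf.append(value)

def read_vlong(buf: bytes, pos: int):
    """Decode one value starting at pos; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        if b < 0x80:
            return value, pos
        shift += 7

# Metadata formerly split across a long[] goes into the same byte stream:
meta = bytearray()
for v in (0, 127, 128, 1_000_000):
    write_vlong(meta, v)

pos, decoded = 0, []
for _ in range(4):
    v, pos = read_vlong(meta, pos)
    decoded.append(v)
```

Small values cost one byte and large ones grow as needed, which is the same trade-off a codec makes when it writes term metadata to a DataOutput.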
[GitHub] [lucene-solr] jpountz opened a new pull request #1149: LUCENE-9116: Remove long[] from `PostingsWriterBase#encodeTerm`.
jpountz opened a new pull request #1149: LUCENE-9116: Remove long[] from `PostingsWriterBase#encodeTerm`. URL: https://github.com/apache/lucene-solr/pull/1149 All the metadata can be directly encoded in the `DataOutput`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Houston Putman resolved SOLR-11746. --- Fix Version/s: 8.5, master (9.0) Resolution: Fixed 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Houston Putman updated SOLR-11746: -- Attachment: SOLR-11746.patch -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Houston Putman updated SOLR-11746: -- Issue Type: Bug (was: Improvement) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9116) Simplify postings API by removing long[] metadata
[ https://issues.apache.org/jira/browse/LUCENE-9116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009073#comment-17009073 ] Adrien Grand commented on LUCENE-9116: -- I want to draw attention to the fact that the attached pull request removes the FST and FSTOrd postings formats, which were harder to migrate, and that it breaks compatibility for some postings formats, but not Lucene84 and Lucene50, which we need to support. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mbwaheed commented on issue #1131: SOLR-14134: Add lazy and time-based eviction of shared core concurrency metada…
mbwaheed commented on issue #1131: SOLR-14134: Add lazy and time-based eviction of shared core concurrency metada… URL: https://github.com/apache/lucene-solr/pull/1131#issuecomment-571256880 > To clarify you're also saying we can scope this item smaller and stick with a simpler cache (simple map, no size/time based eviction as done here) and let that future item handle whatever is needed ? @andyvuong Yes. Having many SolrCore instances is a bigger problem that needs to be handled either by transient cores or zero replica (with autoscaling). For this cache, growing and shrinking with the number of SolrCore instances is good enough. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
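The lifecycle-scoped design the reviewers converge on above — a plain map whose entries live exactly as long as their cores, created on first use and dropped in the close hook, with no size- or time-based eviction — can be sketched like this (class and method names are hypothetical, not Solr's actual types):

```python
# Sketch of a core-lifecycle-scoped cache: eviction happens only when a
# core closes, so the cache can never outgrow the set of live cores.

class SharedCoreMetadataCache:
    def __init__(self):
        self._metadata = {}  # core name -> per-core concurrency metadata

    def get_or_create(self, core_name):
        """Create the entry lazily on first access (e.g. core open)."""
        return self._metadata.setdefault(core_name, {"version": 0})

    def on_core_close(self, core_name):
        """The only eviction path: tied to the core's close, not to
        cache size or entry age."""
        self._metadata.pop(core_name, None)

    def size(self):
        return len(self._metadata)

cache = SharedCoreMetadataCache()
cache.get_or_create("core_a")
cache.get_or_create("core_b")
cache.on_core_close("core_a")  # only core_b's entry remains
```

The design bet is that whatever bounds the number of live SolrCore instances (transient cores, zero-replica design) also bounds this map, so no separate eviction policy is needed.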
[jira] [Created] (SOLR-14171) allTermsRequired does not work when using context filter query
Jonathan J Senchyna created SOLR-14171: -- Summary: allTermsRequired does not work when using context filter query Key: SOLR-14171 URL: https://issues.apache.org/jira/browse/SOLR-14171 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: Suggester Affects Versions: 8.4, 8.1.1 Reporter: Jonathan J Senchyna When using the suggester context filtering query param {{suggest.contextFilterQuery}} introduced in SOLR-7888, the suggester configuration {{allTermsRequired}} is ignored and all terms become required. In my test configuration, I am not specifying {{allTermsRequired}}, so it defaults to {{false}}. If I send a request without {{cfq}} specified in my query params, I get back results for partial matches, as expected. As soon as I specify a {{cfq}} in my requests, I only get back results where all terms match. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009132#comment-17009132 ] David Smiley commented on SOLR-11746: - I'm really glad to finally see this get in :). Thanks everyone! 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009170#comment-17009170 ] Jason Gerlowski commented on SOLR-13890: Latest patch ties up some of the loose ends I mentioned in my last comment. Pending review from you guys, I'm pretty happy pulling the trigger on what we've got right now. We get the good performance I was after without introducing another postfilter. Pending more feedback I'll aim to merge this on Wednesday. > Add postfilter support to {!terms} queries > -- > > Key: SOLR-13890 > URL: https://issues.apache.org/jira/browse/SOLR-13890 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: master (9.0) >Reporter: Jason Gerlowski >Assignee: Jason Gerlowski >Priority: Major > Attachments: SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch, > SOLR-13890.patch, SOLR-13890.patch, Screen Shot 2020-01-02 at 2.25.12 PM.png, > post_optimize_performance.png, toplevel-tpi-perf-comparison.png > > > There are some use-cases where it'd be nice if the "terms" qparser created a > query that could be run as a postfilter. Particularly, when users are > checking for hundreds or thousands of terms, a postfilter implementation can > be more performant than the standard processing. > With this issue, I'd like to propose a post-filter implementation for the > {{docValuesTermsFilter}} "method". Postfilter creation can use a > SortedSetDocValues object to populate a DV bitset with the "terms" being > checked for. Each document run through the post-filter can look at its > doc-values for the field in question and check them efficiently against the > constructed bitset. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
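The post-filter idea described in the issue — resolve the query's terms to ordinals in the field's sorted dictionary once, then test each candidate document's doc-values ordinals against that set — can be sketched as follows (data and helper names are illustrative, not Solr's actual classes):

```python
# Sketch of the docValuesTermsFilter post-filter: one-time setup builds
# a set of term ordinals ("the DV bitset"); the per-document check is
# then a cheap ordinal-membership test instead of term-by-term matching.
import bisect

sorted_terms = ["apple", "banana", "cherry", "date"]  # per-field dictionary

def ordinal(term):
    """Look up a term's ordinal in the sorted dictionary, or -1 if absent."""
    i = bisect.bisect_left(sorted_terms, term)
    return i if i < len(sorted_terms) and sorted_terms[i] == term else -1

def build_filter(query_terms):
    """Setup cost paid once per query, however many terms there are."""
    return {o for o in map(ordinal, query_terms) if o >= 0}

def matches(doc_ordinals, wanted):
    """Per-document check run only on docs that survived earlier clauses."""
    return any(o in wanted for o in doc_ordinals)

wanted = build_filter(["banana", "date", "zzz-missing"])
docs = {"d1": [0], "d2": [1, 2], "d3": [3]}  # doc -> its DV ordinals
hits = [d for d, ords in docs.items() if matches(ords, wanted)]
```

This is why the approach pays off for hundreds or thousands of terms: the expensive term-to-ordinal resolution happens once, and each document costs only set lookups.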
[jira] [Comment Edited] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009170#comment-17009170 ] Jason Gerlowski edited comment on SOLR-13890 at 1/6/20 9:04 PM: Latest patch ties up some of the loose ends I mentioned in my last comment. Pending review from you guys, I'm pretty happy pulling the trigger on what we've got right now. We get the improved performance I was after without introducing another postfilter. Things will be even better with SOLR-14166, but that doesn't need to block this effort. Pending more feedback I'll aim to merge this on Wednesday. was (Author: gerlowskija): Latest patch ties up some of the loose ends I mentioned in my last comment. Pending review from you guys, I'm pretty happy pulling the trigger on what we've got right now. We get the improved performance I was after without introducing another postfilter. Pending more feedback I'll aim to merge this on Wednesday. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009170#comment-17009170 ] Jason Gerlowski edited comment on SOLR-13890 at 1/6/20 9:04 PM: Latest patch ties up some of the loose ends I mentioned in my last comment. Pending review from you guys, I'm pretty happy pulling the trigger on what we've got right now. We get the improved performance I was after without introducing another postfilter. Pending more feedback I'll aim to merge this on Wednesday. was (Author: gerlowskija): Latest patch ties up some of the loose ends I mentioned in my last comment. Pending review from you guys, I'm pretty happy pulling the trigger on what we've got right now. We get the good performance I was after without introducing another postfilter. Pending more feedback I'll aim to merge this on Wednesday. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gerlowski updated SOLR-13890: --- Attachment: SOLR-13890.patch -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9080) Upgrade ICU4j to 62.2 and regenerate
[ https://issues.apache.org/jira/browse/LUCENE-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated LUCENE-9080: --- Status: Patch Available (was: Open) > Upgrade ICU4j to 62.2 and regenerate > > > Key: LUCENE-9080 > URL: https://issues.apache.org/jira/browse/LUCENE-9080 > Project: Lucene - Core > Issue Type: Bug >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Attachments: after_regen.patch, before_regen.patch, status.res > > Time Spent: 10m > Remaining Estimate: 0h > > The root cause is that RamUsageEstimator.NUM_BYTES_INT has been removed and > the Python scripts still reference it in the generated scripts. That part is > easy to fix. > Last time I looked, though, regeneration produces some differences in the > generated files that should be examined to ensure they're benign. > Not really sure whether this should be a Lucene or Solr JIRA. Putting it in > Lucene since one of the failed files is: > lucene/core/src/java/org/apache/lucene/util/packed/Packed8ThreeBlocks.java > I do know that one of the Solr jflex-produced files has an unexplained > difference, so it may bleed over. > "ant regenerate" needs about 24G on my machine FWIW.
[jira] [Commented] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009194#comment-17009194 ] Mikhail Khludnev commented on SOLR-13890: - regarding {{PerSegmentViewDocIdSetIterator}}: I don't follow. Lucene's {{DocIdSetIterator}} is strictly per-segment, using it for top-level iteration is something that never happen. fwiw, usually toplevel Solr docsets converted to Lucene's DocIdSets via {{DocSet.getTopFilter()}}. adding argument to method {{QueryMethod.makeFilter(String fname, BytesRef[] bytesRefs, SolrParams localParams)}} is not something which is backward compatible, and might frustrate other devs. Note: {{TopLevelDocValuesTermsQuery}} uses {{OrdinalMap}} via {{getSlowAtomicReader()}}. It might be clearer to iterate persegment, and then access global ordinals via MultiSortedDocValues.mapping.getGlobalOrds() > Add postfilter support to {!terms} queries > -- > > Key: SOLR-13890 > URL: https://issues.apache.org/jira/browse/SOLR-13890 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: master (9.0) >Reporter: Jason Gerlowski >Assignee: Jason Gerlowski >Priority: Major > Attachments: SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch, > SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch, Screen Shot 2020-01-02 > at 2.25.12 PM.png, post_optimize_performance.png, > toplevel-tpi-perf-comparison.png > > > There are some use-cases where it'd be nice if the "terms" qparser created a > query that could be run as a postfilter. Particularly, when users are > checking for hundreds or thousands of terms, a postfilter implementation can > be more performant than the standard processing. > WIth this issue, I'd like to propose a post-filter implementation for the > {{docValuesTermsFilter}} "method". 
Postfilter creation can use a > SortedSetDocValues object to populate a DV bitset with the "terms" being > checked for. Each document run through the post-filter can look at their > doc-values for the field in question and check them efficiently against the > constructed bitset.
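Mikhail's suggestion above — iterate per segment, then translate segment-local ordinals into global ones — can be illustrated with a self-contained toy. This is not Lucene's OrdinalMap or MultiSortedDocValues API; the class name, constructor shape, and data layout are invented for the sketch.

```java
import java.util.*;

// Toy analogue of an ordinal map: each segment has its own sorted term
// dictionary with segment-local ordinals; this map translates a local
// ordinal into an ordinal over the sorted union of all segments' terms.
public class ToyOrdinalMap {
    private final int[][] localToGlobal;

    public ToyOrdinalMap(List<List<String>> segmentTerms) {
        // Global dictionary: sorted union of every segment's terms.
        TreeSet<String> global = new TreeSet<>();
        for (List<String> seg : segmentTerms) {
            global.addAll(seg);
        }
        Map<String, Integer> globalOrd = new HashMap<>();
        int ord = 0;
        for (String term : global) {
            globalOrd.put(term, ord++);
        }
        // Per-segment translation table: local ordinal -> global ordinal.
        localToGlobal = new int[segmentTerms.size()][];
        for (int s = 0; s < segmentTerms.size(); s++) {
            List<String> seg = segmentTerms.get(s);
            localToGlobal[s] = new int[seg.size()];
            for (int i = 0; i < seg.size(); i++) {
                localToGlobal[s][i] = globalOrd.get(seg.get(i));
            }
        }
    }

    public int getGlobalOrd(int segment, int localOrd) {
        return localToGlobal[segment][localOrd];
    }
}
```

This shows why per-segment iteration plus an ordinal map avoids the slow top-level reader: each segment is walked with its own cheap local ordinals, and only the final membership test needs the global view.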
[jira] [Comment Edited] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009194#comment-17009194 ] Mikhail Khludnev edited comment on SOLR-13890 at 1/6/20 10:08 PM: -- regarding {{PerSegmentViewDocIdSetIterator}}: I don't follow. Lucene's {{DocIdSetIterator}} is strictly per-segment, using it for top-level iteration is something that never happen. fwiw, usually toplevel Solr docsets converted to Lucene's DocIdSets via {{DocSet.getTopFilter()}}. adding argument to method {{QueryMethod.makeFilter(String fname, BytesRef[] bytesRefs, SolrParams localParams)}} is not something which is backward compatible, and might frustrate other devs. Note: {{TopLevelDocValuesTermsQuery}} uses {{OrdinalMap}} via {{getSlowAtomicReader()}}. It might be clearer to iterate persegment, and then access global ordinals via MultiSortedDocValues.mapping.getGlobalOrds() Also, this query relies on SolrIndexSearcher, but iirc even in Solr queries sometimes invoked with Lucene's Searcher. There's some issues with such cast. was (Author: mkhludnev): regarding {{PerSegmentViewDocIdSetIterator}}: I don't follow. Lucene's {{DocIdSetIterator}} is strictly per-segment, using it for top-level iteration is something that never happen. fwiw, usually toplevel Solr docsets converted to Lucene's DocIdSets via {{DocSet.getTopFilter()}}. adding argument to method {{QueryMethod.makeFilter(String fname, BytesRef[] bytesRefs, SolrParams localParams)}} is not something which is backward compatible, and might frustrate other devs. Note: {{TopLevelDocValuesTermsQuery}} uses {{OrdinalMap}} via {{getSlowAtomicReader()}}. 
It might be clearer to iterate persegment, and then access global ordinals via MultiSortedDocValues.mapping.getGlobalOrds() > Add postfilter support to {!terms} queries > -- > > Key: SOLR-13890 > URL: https://issues.apache.org/jira/browse/SOLR-13890 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: master (9.0) >Reporter: Jason Gerlowski >Assignee: Jason Gerlowski >Priority: Major > Attachments: SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch, > SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch, Screen Shot 2020-01-02 > at 2.25.12 PM.png, post_optimize_performance.png, > toplevel-tpi-perf-comparison.png > > > There are some use-cases where it'd be nice if the "terms" qparser created a > query that could be run as a postfilter. Particularly, when users are > checking for hundreds or thousands of terms, a postfilter implementation can > be more performant than the standard processing. > WIth this issue, I'd like to propose a post-filter implementation for the > {{docValuesTermsFilter}} "method". Postfilter creation can use a > SortedSetDocValues object to populate a DV bitset with the "terms" being > checked for. Each document run through the post-filter can look at their > doc-values for the field in question and check them efficiently against the > constructed bitset. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-13890) Add postfilter support to {!terms} queries
[ https://issues.apache.org/jira/browse/SOLR-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009194#comment-17009194 ] Mikhail Khludnev edited comment on SOLR-13890 at 1/6/20 10:17 PM: -- regarding {{PerSegmentViewDocIdSetIterator}}: I don't follow. Lucene's {{DocIdSetIterator}} is strictly per-segment, using it for top-level iteration is something that never happen. fwiw, usually toplevel Solr docsets converted to Lucene's DocIdSets via {{DocSet.getTopFilter()}}. adding argument to method {{QueryMethod.makeFilter(String fname, BytesRef[] bytesRefs, SolrParams localParams)}} is not something which is backward compatible, and might frustrate other devs. Note: {{TopLevelDocValuesTermsQuery}} uses {{OrdinalMap}} via {{getSlowAtomicReader()}}. It might be clearer to iterate persegment, and then access global ordinals via MultiSortedDocValues.mapping.getGlobalOrds() Also, this query relies on SolrIndexSearcher, but iirc even in Solr queries sometimes invoked with Lucene's Searcher. There's some issues with such cast SOLR-6357. was (Author: mkhludnev): regarding {{PerSegmentViewDocIdSetIterator}}: I don't follow. Lucene's {{DocIdSetIterator}} is strictly per-segment, using it for top-level iteration is something that never happen. fwiw, usually toplevel Solr docsets converted to Lucene's DocIdSets via {{DocSet.getTopFilter()}}. adding argument to method {{QueryMethod.makeFilter(String fname, BytesRef[] bytesRefs, SolrParams localParams)}} is not something which is backward compatible, and might frustrate other devs. Note: {{TopLevelDocValuesTermsQuery}} uses {{OrdinalMap}} via {{getSlowAtomicReader()}}. It might be clearer to iterate persegment, and then access global ordinals via MultiSortedDocValues.mapping.getGlobalOrds() Also, this query relies on SolrIndexSearcher, but iirc even in Solr queries sometimes invoked with Lucene's Searcher. There's some issues with such cast. 
> Add postfilter support to {!terms} queries > -- > > Key: SOLR-13890 > URL: https://issues.apache.org/jira/browse/SOLR-13890 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: master (9.0) >Reporter: Jason Gerlowski >Assignee: Jason Gerlowski >Priority: Major > Attachments: SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch, > SOLR-13890.patch, SOLR-13890.patch, SOLR-13890.patch, Screen Shot 2020-01-02 > at 2.25.12 PM.png, post_optimize_performance.png, > toplevel-tpi-perf-comparison.png > > > There are some use-cases where it'd be nice if the "terms" qparser created a > query that could be run as a postfilter. Particularly, when users are > checking for hundreds or thousands of terms, a postfilter implementation can > be more performant than the standard processing. > WIth this issue, I'd like to propose a post-filter implementation for the > {{docValuesTermsFilter}} "method". Postfilter creation can use a > SortedSetDocValues object to populate a DV bitset with the "terms" being > checked for. Each document run through the post-filter can look at their > doc-values for the field in question and check them efficiently against the > constructed bitset. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] yonik merged pull request #1055: SOLR-13932 Review directory locking and Blob interactions
yonik merged pull request #1055: SOLR-13932 Review directory locking and Blob interactions URL: https://github.com/apache/lucene-solr/pull/1055 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13932) Review directory locking and Blob interactions
[ https://issues.apache.org/jira/browse/SOLR-13932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009202#comment-17009202 ] ASF subversion and git services commented on SOLR-13932: Commit 7d728d9d3a552dda75272bf339f17cae9d6b3734 in lucene-solr's branch refs/heads/jira/SOLR-13101 from murblanc [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7d728d9 ] SOLR-13932 Review directory locking and Blob interactions (#1055) * Initial minor changes for SOLR-13932 * Use all files in index directory when doing resolution against Blob to switch local index to new dir in case of conflicts * Do push from the index directory directly without first making a local copy of the index files * misspelling * update after comments from mbwaheed > Review directory locking and Blob interactions > -- > > Key: SOLR-13932 > URL: https://issues.apache.org/jira/browse/SOLR-13932 > Project: Solr > Issue Type: Sub-task >Reporter: Ilan Ginzburg >Priority: Major > Time Spent: 3h 10m > Remaining Estimate: 0h > > Review resolution of local index directory content vs Blob copy. > There has been wrong understanding of following line acquiring a lock on > index directory. > {{solrCore.getDirectoryFactory().get(indexDirPath, > DirectoryFactory.DirContext.DEFAULT, > solrCore.getSolrConfig().indexConfig.lockType);}} > From Yonik: > _A couple things about Directory locking the locks were only ever to > prevent more than one IndexWriter from trying to modify the same index. The > IndexWriter grabs a write lock once when it is created and does not release > it until it is closed._ > _Directories are not locked on acquisition of the Directory from the > DirectoryFactory. See the IndexWriter constructor, where the lock is > explicitly grabbed._ > Review CorePushPull#pullUpdateFromBlob, ServerSideMetadata and other classes > as relevant. 
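Yonik's point above — the write lock is grabbed once when the IndexWriter is constructed and only released when it is closed, not on every Directory acquisition — can be sketched with a generic file lock. This is an illustration using java.nio, with invented names; it is not Lucene's Lock or IndexWriter code.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch of the constructor-acquired, close-released locking model: the
// lock protects against a second concurrent writer, and mere acquisition
// of the underlying resource (the Directory analogue) takes no lock.
public class LockedWriter implements AutoCloseable {
    private final FileChannel channel;
    private final FileLock lock;

    public LockedWriter(Path lockFile) throws IOException {
        channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock acquired;
        try {
            acquired = channel.tryLock(); // grabbed once, at construction
        } catch (OverlappingFileLockException e) {
            acquired = null; // a writer in this JVM already holds it
        }
        if (acquired == null) {
            channel.close();
            throw new IOException("another writer holds the lock: " + lockFile);
        }
        lock = acquired;
    }

    @Override
    public void close() throws IOException {
        lock.release(); // only released when the writer is closed
        channel.close();
    }
}
```

Under this model, code that merely opens the directory for reading or metadata resolution never contends for the lock, which is exactly the misunderstanding the issue corrects.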
[jira] [Commented] (SOLR-7964) suggest.highlight=true does not work when using context filter query
[ https://issues.apache.org/jira/browse/SOLR-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009211#comment-17009211 ] Lucene/Solr QA commented on SOLR-7964: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 59s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 4s{color} | {color:green} suggest in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 62m 57s{color} | {color:green} core in the patch passed. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 73m 20s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | SOLR-7964 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12906397/SOLR-7964.patch | | Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns | | uname | Linux lucene2-us-west.apache.org 4.4.0-170-generic #199-Ubuntu SMP Thu Nov 14 01:45:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | ant | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh | | git revision | master / 9edb143 | | ant | version: Apache Ant(TM) version 1.9.6 compiled on July 20 2018 | | Default Java | LTS | | Test Results | https://builds.apache.org/job/PreCommit-SOLR-Build/647/testReport/ | | modules | C: lucene/suggest solr/core U: . | | Console output | https://builds.apache.org/job/PreCommit-SOLR-Build/647/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. > suggest.highlight=true does not work when using context filter query > > > Key: SOLR-7964 > URL: https://issues.apache.org/jira/browse/SOLR-7964 > Project: Solr > Issue Type: Improvement > Components: Suggester >Affects Versions: 5.4 >Reporter: Arcadius Ahouansou >Assignee: David Smiley >Priority: Minor > Labels: suggester > Attachments: SOLR-7964.patch, SOLR_7964.patch, SOLR_7964.patch > > > When using the new suggester context filtering query param > {{suggest.contextFilterQuery}} introduced in SOLR-7888, the param > {{suggest.highlight=true}} has no effect. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] andyvuong commented on issue #1131: SOLR-14134: Add lazy and time-based eviction of shared core concurrency metada…
andyvuong commented on issue #1131: SOLR-14134: Add lazy and time-based eviction of shared core concurrency metada… URL: https://github.com/apache/lucene-solr/pull/1131#issuecomment-571356887 cc @mbwaheed - I moved the eviction-on-creation into registerCore, a layer above the actual ZooKeeper registration it previously sat at, after checking that all of the new SolrCore instances get created and go through this code path. I also switched to a simple cache until the future work item.
[GitHub] [lucene-solr] madrob commented on issue #1138: LUCENE-9077 Print repro line for failed tests
madrob commented on issue #1138: LUCENE-9077 Print repro line for failed tests URL: https://github.com/apache/lucene-solr/pull/1138#issuecomment-571362867 I have a rudimentary in-memory version now and can look at the disk-spilling version later. I ended up fighting with Gradle more than I expected just to get this far, but the disk-spilling version should be straightforward from here. Let me know what you think.
[GitHub] [lucene-solr] ErickErickson commented on issue #1131: SOLR-14134: Add lazy and time-based eviction of shared core concurrency metada…
ErickErickson commented on issue #1131: SOLR-14134: Add lazy and time-based eviction of shared core concurrency metada… URL: https://github.com/apache/lucene-solr/pull/1131#issuecomment-571368556 I was just skimming to see if this is related to transient cores (it doesn't appear to be) and noticed that the tests set up a new cluster in each test. That's quite a bit of work; why not set up the cluster in BeforeClass and dispose of it in AfterClass? Since each test creates its own collection, there shouldn't be any confusion. AddReplicaTest shows one way to do this.
[GitHub] [lucene-solr] MarcusSorealheis commented on a change in pull request #1141: SOLR-14147 change the Security manager to default to true.
MarcusSorealheis commented on a change in pull request #1141: SOLR-14147 change the Security manager to default to true. URL: https://github.com/apache/lucene-solr/pull/1141#discussion_r363549534 ## File path: solr/bin/solr ## @@ -2084,14 +2084,14 @@ else REMOTE_JMX_OPTS=() fi -# Enable java security manager (limiting filesystem access and other things) -if [ "$SOLR_SECURITY_MANAGER_ENABLED" == "true" ]; then +# Disable java security manager (allowing filesystem access and other things) +if [ "$SOLR_SECURITY_MANAGER_ENABLED" == "false" ]; then Review comment: should be resolved?
[jira] [Commented] (LUCENE-9096) Implementation of CompressingTermVectorsWriter.flushOffsets can be simpler
[ https://issues.apache.org/jira/browse/LUCENE-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009440#comment-17009440 ] ASF subversion and git services commented on LUCENE-9096: - Commit 6bb1f6cbbe8accefbfd30b8ee74924ad43ddc356 in lucene-solr's branch refs/heads/gradle-master from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6bb1f6c ] LUCENE-9096: CHANGES entry. > Implementation of CompressingTermVectorsWriter.flushOffsets can be simpler > -- > > Key: LUCENE-9096 > URL: https://issues.apache.org/jira/browse/LUCENE-9096 > Project: Lucene - Core > Issue Type: Improvement > Components: core/codecs >Affects Versions: 8.2 >Reporter: kkewwei >Priority: Major > Fix For: 8.5 > > Time Spent: 40m > Remaining Estimate: 0h > > In CompressingTermVectorsWriter.flushOffsets, we compute > sumPos and sumOffsets as follows: > {code:java} > for (int i = 0; i < fd.numTerms; ++i) { > int previousPos = 0; > int previousOff = 0; > for (int j = 0; j < fd.freqs[i]; ++j) { > final int position = positionsBuf[fd.posStart + pos]; > final int startOffset = startOffsetsBuf[fd.offStart + pos]; > sumPos[fieldNumOff] += position - previousPos; > sumOffsets[fieldNumOff] += startOffset - previousOff; > previousPos = position; > previousOff = startOffset; > ++pos; > } > } > {code} > We always accumulate position - previousPos, so the sum telescopes: > {code:java} > (position5-position4)+(position4-position3)+(position3-position2)+(position2-position1){code} > This means it can be simplified to: position5-position1
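The telescoping argument in the issue can be checked directly: summing successive deltas of a sequence, with the previous value starting at 0, reduces to the last element. A minimal sketch (class and method names are invented for the illustration):

```java
// Demonstrates the simplification proposed for flushOffsets: the inner loop
// accumulates (p1-0)+(p2-p1)+...+(pn-p(n-1)), which telescopes to pn, the
// last position, so the intermediate deltas never need to be computed.
public class TelescopingSum {
    // Mirrors the existing loop: accumulate position - previousPos.
    static long deltaSum(int[] positions) {
        long sum = 0;
        int previous = 0;
        for (int p : positions) {
            sum += p - previous;
            previous = p;
        }
        return sum;
    }

    // The simplified form: just the last position (0 for an empty run).
    static long simplified(int[] positions) {
        return positions.length == 0 ? 0 : positions[positions.length - 1];
    }
}
```

Both methods agree for any monotonically tracked sequence, which is why the per-delta accumulation in the writer is redundant work.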
[jira] [Commented] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009444#comment-17009444 ] ASF subversion and git services commented on SOLR-11746: Commit 9edb143efdc6616906972ae6c629860c91a5a2e7 in lucene-solr's branch refs/heads/gradle-master from Houston Putman [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9edb143 ] SOLR-11746: Adding docs for negative existence queries. > numeric fields need better error handling for prefix/wildcard syntax -- > consider uniform support for "foo:* == foo:[* TO *]" > > > Key: SOLR-11746 > URL: https://issues.apache.org/jira/browse/SOLR-11746 > Project: Solr > Issue Type: Bug >Affects Versions: 7.0 >Reporter: Chris M. Hostetter >Assignee: Houston Putman >Priority: Major > Fix For: master (9.0), 8.5 > > Attachments: SOLR-11746.patch, SOLR-11746.patch, SOLR-11746.patch, > SOLR-11746.patch, SOLR-11746.patch, SOLR-11746.patch > > > On the solr-user mailing list, Torsten Krah pointed out that with Trie > numeric fields, query syntax such as {{foo_d:\*}} has been functionality > equivilent to {{foo_d:\[\* TO \*]}} and asked why this was not also supported > for Point based numeric fields. > The fact that this type of syntax works (for {{indexed="true"}} Trie fields) > appears to have been an (untested, undocumented) fluke of Trie fields given > that they use indexed terms for the (encoded) numeric terms and inherit the > default implementation of {{FieldType.getPrefixQuery}} which produces a > prefix query against the {{""}} (empty string) term. 
> (Note that this syntax has apparently _*never*_ worked for Trie fields with > {{indexed="false" docValues="true"}} ) > In general, we should assess the behavior when users attempt a prefix/wildcard > syntax query against numeric fields, as currently the behavior is largely > nonsensical: prefix/wildcard syntax frequently matches no docs without any sort > of error, and the aforementioned {{numeric_field:*}} behaves inconsistently > between points/trie fields and between indexed/docValued trie fields.
[jira] [Commented] (LUCENE-9096) Implementation of CompressingTermVectorsWriter.flushOffsets can be simpler
[ https://issues.apache.org/jira/browse/LUCENE-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009439#comment-17009439 ] ASF subversion and git services commented on LUCENE-9096: - Commit 2db4c909ca10c0d7edda0c94622fa1369833 in lucene-solr's branch refs/heads/gradle-master from kkewwei [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2db4c90 ] LUCENE-9096:Simplify CompressingTermVectorsWriter#flushOffsets. (#1125) > Implementation of CompressingTermVectorsWriter.flushOffsets can be simpler > -- > > Key: LUCENE-9096 > URL: https://issues.apache.org/jira/browse/LUCENE-9096 > Project: Lucene - Core > Issue Type: Improvement > Components: core/codecs >Affects Versions: 8.2 >Reporter: kkewwei >Priority: Major > Fix For: 8.5 > > Time Spent: 40m > Remaining Estimate: 0h > > In CompressingTermVectorsWriter.flushOffsets, we count > sumPos and sumOffsets by the way > {code:java} > for (int i = 0; i < fd.numTerms; ++i) { > int previousPos = 0; > int previousOff = 0; > for (int j = 0; j < fd.freqs[i]; ++j) { > final int position = positionsBuf[fd.posStart + pos]; > final int startOffset = startOffsetsBuf[fd.offStart + pos]; > sumPos[fieldNumOff] += position - previousPos; > sumOffsets[fieldNumOff] += startOffset - previousOff; > previousPos = position; > previousOff = startOffset; > ++pos; > } > } > {code} > we always use the position - previousPos, it can be summarized like this: > {code:java} > (position5-position4)+(position4-position3)+(position3-position2)+(position2-position1){code} > If we should simplify it: position5-position1 > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13089) bin/solr's use of lsof has some issues
[ https://issues.apache.org/jira/browse/SOLR-13089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009441#comment-17009441 ] ASF subversion and git services commented on SOLR-13089: Commit ac777a5352224b2c8f46836f0e078809308fc2d8 in lucene-solr's branch refs/heads/gradle-master from Martijn Koster [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ac777a5 ] SOLR-13089: Fix lsof edge cases in the solr CLI script > bin/solr's use of lsof has some issues > -- > > Key: SOLR-13089 > URL: https://issues.apache.org/jira/browse/SOLR-13089 > Project: Solr > Issue Type: Bug > Components: SolrCLI >Reporter: Martijn Koster >Assignee: Jan Høydahl >Priority: Minor > Fix For: 8.5 > > Attachments: 0001-SOLR-13089-lsof-fixes.patch, SOLR-13089.patch > > > The {{bin/solr}} script uses this {{lsof}} invocation to check if the Solr > port is being listened on: > {noformat} > running=`lsof -PniTCP:$SOLR_PORT -sTCP:LISTEN` > if [ -z "$running" ]; then > {noformat} > code is at > [here|https://github.com/apache/lucene-solr/blob/master/solr/bin/solr#L2147]. > There are a few issues with this. > h2. 1. False negatives when port is occupied by different user > When {{lsof}} runs as non-root, it only shows sockets for processes with your > effective uid. > For example: > {noformat} > $ id -u && nc -l 7788 & > [1] 26576 > 1000 > works: nc ran as my user > $ lsof -PniTCP:7788 -sTCP:LISTEN > COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME > nc 26580 mak3u IPv4 2818104 0t0 TCP *:7788 (LISTEN) > fails: ssh is running as root > $ lsof -PniTCP:22 -sTCP:LISTEN > works if we are root > $ sudo lsof -PniTCP:22 -sTCP:LISTEN > COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME > sshd2524 root3u IPv4 18426 0t0 TCP *:22 (LISTEN) > sshd2524 root4u IPv6 18428 0t0 TCP *:22 (LISTEN) > {noformat} > Solr runs as non-root. 
> So if some other process owned by a different user occupies that port, you > will get a false negative (it will say Solr is not running even though it is) > I can't think of a good way to fix or work around that (short of not using > {{lsof}} in the first place). > Perhaps an uncommon scenario we need not worry too much about. > h2. 2. lsof can complain about lack of /etc/password entries > If {{lsof}} runs without the current effective user having an entry in > {{/etc/passwd}}, > it produces a warning on stderr: > {noformat} > $ docker run -d -u 0 solr:7.6.0 bash -c "chown -R /opt/; gosu > solr-foreground" > 4397c3f51d4a1cfca7e5815e5b047f75fb144265d4582745a584f0dba51480c6 > $ docker exec -it -u > 4397c3f51d4a1cfca7e5815e5b047f75fb144265d4582745a584f0dba51480c6 bash > I have no name!@4397c3f51d4a:/opt/solr$ lsof -PniTCP:8983 -sTCP:LISTEN > lsof: no pwd entry for UID > COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME > lsof: no pwd entry for UID > java 9 115u IPv4 2813503 0t0 TCP *:8983 (LISTEN) > I have no name!@4397c3f51d4a:/opt/solr$ lsof -PniTCP:8983 > -sTCP:LISTEN>/dev/null > lsof: no pwd entry for UID > lsof: no pwd entry for UID > {noformat} > You can avoid this by using the {{-t}} tag, which specifies that lsof should > produce terse output with process identifiers only and no header: > {noformat} > I have no name!@4397c3f51d4a:/opt/solr$ lsof -t -PniTCP:8983 -sTCP:LISTEN > 9 > {noformat} > This is a rare circumstance, but one I encountered and worked around. > h2. 3. On Alpine, lsof is implemented by busybox, but with incompatible > arguments > On Alpine, {{busybox}} implements {{lsof}}, but does not support the > arguments, so you get: > {noformat} > $ docker run -it alpine sh > / # lsof -t -PniTCP:8983 -sTCP:LISTEN > 1 /bin/busybox/dev/pts/0 > 1 /bin/busybox/dev/pts/0 > 1 /bin/busybox/dev/pts/0 > 1 /bin/busybox/dev/tty > {noformat} > so if you ran Solr, in the background, and it failed to start, this code > would produce a false positive. 
> For example:
> {noformat}
> docker volume create mysol
> docker run -v mysol:/mysol bash bash -c "chown 8983:8983 /mysol"
> docker run -it -v mysol:/mysol -w /mysol -v $HOME/Downloads/solr-7.6.0.tgz:/solr-7.6.0.tgz openjdk:8-alpine sh
> apk add procps bash
> tar xvzf /solr-7.6.0.tgz
> chown -R 8983:8983 .
> {noformat}
> Then in a separate terminal:
> {noformat}
> $ docker exec -it -u 8983 serene_saha sh
> /mysol $ SOLR_OPTS=--invalid ./solr-7.6.0/bin/solr start
> whoami: unknown uid 8983
> Waiting up to 180 seconds to see Solr running on port 8983 [|]
> Started Solr server on port 8983 (pid=101). Happy searching!
> /mysol $
> {noformat}
> And in another separate terminal:
> {noformat}
> $ docker exec -it thirsty_lisko
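All three lsof pitfalls described in the issue (listeners owned by other uids are invisible, "no pwd entry" warnings, busybox's incompatible implementation) stem from parsing lsof output. As a minimal sketch of the alternative the reporter alludes to ("not using lsof in the first place"), a TCP connect probe avoids them entirely. This is illustrative Python, not the attached patch:

```python
# Sketch: detect whether anything is listening on a port by attempting a
# TCP connection instead of parsing lsof output. A connect attempt sees
# listeners owned by any uid, emits no passwd-related warnings, and does
# not depend on which lsof implementation (if any) is installed.
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) == 0

# Demonstration: occupy an ephemeral port, probe it, release it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0: the kernel picks a free port
listener.listen(1)
port = listener.getsockname()[1]
assert port_in_use(port)          # visible while the listener is up
listener.close()
assert not port_in_use(port)      # gone once the listener closes
```

The trade-off is the same one the issue notes for lsof in point 1: a positive result only tells you the port is taken, not that the process holding it is Solr.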
[jira] [Commented] (LUCENE-8673) Use radix partitioning when merging dimensional points
[ https://issues.apache.org/jira/browse/LUCENE-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009442#comment-17009442 ]

ASF subversion and git services commented on LUCENE-8673:

Commit b6f31835ad18da0f7a22064481b0d0e167f9f30c in lucene-solr's branch refs/heads/gradle-master from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b6f3183 ]
LUCENE-8673: Avoid OOMEs because of IOContext randomization.

> Use radix partitioning when merging dimensional points
> --
>
> Key: LUCENE-8673
> URL: https://issues.apache.org/jira/browse/LUCENE-8673
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Ignacio Vera
> Assignee: Ignacio Vera
> Priority: Major
> Fix For: 8.x, master (9.0)
>
> Attachments: Geo3D.png, Geo3D.png, Geo3D.png, LatLonPoint.png,
> LatLonPoint.png, LatLonPoint.png, LatLonShape.png, LatLonShape.png,
> LatLonShape.png
>
> Time Spent: 5h 40m
> Remaining Estimate: 0h
>
> Following the advice of [~jpountz] in LUCENE-8623, I have investigated using
> radix selection when merging segments instead of sorting the data at the
> beginning. The results are pretty promising when running the Lucene geo
> benchmarks:
>
> ||Approach||Index time (sec): Dev||Index time (sec): Base||Index time: Diff||Force merge time (sec): Dev||Force merge time (sec): Base||Force merge time: Diff||Index size (GB): Dev||Index size (GB): Base||Index size: Diff||Reader heap (MB): Dev||Reader heap (MB): Base||Reader heap: Diff||
> |points|241.5s|235.0s|3%|157.2s|157.9s|-0%|0.55|0.55|0%|1.57|1.57|0%|
> |shapes|416.1s|650.1s|-36%|306.1s|603.2s|-49%|1.29|1.29|0%|1.61|1.61|0%|
> |geo3d|261.0s|360.1s|-28%|170.2s|279.9s|-39%|0.75|0.75|0%|1.58|1.58|0%|
>
> (edited: table formatting to be a jira table)
>
> In 2D the index throughput is more or less equal, but for higher dimensions
> the impact is quite big.
> In all cases the merging process requires much less disk space. I am
> attaching plots showing the different behaviour, and I am opening a pull
> request.

-- This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
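The speedup reported in the table comes from selecting split points with a radix (byte-histogram) approach instead of fully sorting the points up front. A toy sketch of radix selection, purely to illustrate the idea; the actual Lucene implementation is in Java and operates on byte-encoded point values, and the names below are made up:

```python
# Toy radix selection: find the k-th smallest value by bucketing on the
# most significant byte first, recursing only into the bucket containing
# rank k. Unlike a full sort, most of the data is touched only once per
# byte level, which is why it is attractive for selecting BKD split points.
import random

def radix_select(values, k, byte_index=0, width=4):
    """Return the k-th smallest of `values` (unsigned ints < 2**(8*width))."""
    if len(values) <= 1 or byte_index >= width:
        return sorted(values)[k]          # tiny base case: just sort
    shift = 8 * (width - 1 - byte_index)
    buckets = [[] for _ in range(256)]
    for v in values:
        buckets[(v >> shift) & 0xFF].append(v)
    for b in buckets:                     # walk buckets in ascending order
        if k < len(b):
            return radix_select(b, k, byte_index + 1, width)
        k -= len(b)

data = [random.randrange(2**32) for _ in range(1000)]
assert radix_select(data, 500) == sorted(data)[500]
```

The recursion narrows to a single bucket at each byte level, so only the partition containing the target rank is ever refined further.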
[jira] [Commented] (SOLR-11746) numeric fields need better error handling for prefix/wildcard syntax -- consider uniform support for "foo:* == foo:[* TO *]"
[ https://issues.apache.org/jira/browse/SOLR-11746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009443#comment-17009443 ]

ASF subversion and git services commented on SOLR-11746:

Commit f5ab3ca688b3127bece252ffd87cc8bfa9f285ff in lucene-solr's branch refs/heads/gradle-master from Houston Putman
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=f5ab3ca ]
SOLR-11746: Existence query support for numeric point fields

> numeric fields need better error handling for prefix/wildcard syntax --
> consider uniform support for "foo:* == foo:[* TO *]"
>
> Key: SOLR-11746
> URL: https://issues.apache.org/jira/browse/SOLR-11746
> Project: Solr
> Issue Type: Bug
> Affects Versions: 7.0
> Reporter: Chris M. Hostetter
> Assignee: Houston Putman
> Priority: Major
> Fix For: master (9.0), 8.5
>
> Attachments: SOLR-11746.patch, SOLR-11746.patch, SOLR-11746.patch,
> SOLR-11746.patch, SOLR-11746.patch, SOLR-11746.patch
>
> On the solr-user mailing list, Torsten Krah pointed out that with Trie
> numeric fields, query syntax such as {{foo_d:\*}} has been functionally
> equivalent to {{foo_d:\[\* TO \*]}} and asked why this was not also supported
> for Point based numeric fields.
> The fact that this type of syntax works (for {{indexed="true"}} Trie fields)
> appears to have been an (untested, undocumented) fluke of Trie fields, given
> that they use indexed terms for the (encoded) numeric values and inherit the
> default implementation of {{FieldType.getPrefixQuery}}, which produces a
> prefix query against the {{""}} (empty string) term.
> (Note that this syntax has apparently _*never*_ worked for Trie fields with
> {{indexed="false" docValues="true"}}.)
> In general, we should assess the behavior when users attempt a
> prefix/wildcard syntax query against numeric fields, as currently the
> behavior is largely nonsensical: prefix/wildcard syntax frequently matches
> no docs without any sort of error, and the aforementioned
> {{numeric_field:*}} behaves inconsistently between points/trie fields and
> between indexed/docValued trie fields.
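The uniform semantics the issue summary asks for, {{foo:*}} matching exactly the documents matched by {{foo:[* TO *]}}, can be stated with a toy document model. This is pure Python to pin down the intended behavior, not Solr code, and the field names are made up:

```python
# Toy model of existence-query semantics: for any field type, foo:* should
# match exactly the documents where foo has a value, i.e. the same doc set
# as the open-ended range foo:[* TO *]. Field names are illustrative.
docs = [
    {"id": 1, "price": 9.99},
    {"id": 2},                    # price field absent
    {"id": 3, "price": 0.0},      # present even though the value is zero
]

def exists_query(field):          # models foo:*
    return [d["id"] for d in docs if field in d]

def open_range_query(field):      # models foo:[* TO *]
    return [d["id"] for d in docs
            if field in d]        # every present value falls in [* TO *]

# The desired invariant: both forms select the same documents.
assert exists_query("price") == open_range_query("price") == [1, 3]
assert exists_query("missing") == []
```

The point of the invariant is that it should hold regardless of whether the field is Trie or Points based, indexed or docValues-only, which is exactly where the pre-patch behavior diverged.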