[GitHub] [lucene-solr] s1monw commented on pull request #1918: LUCENE-9535: Commit DWPT bytes used before locking indexing
s1monw commented on pull request #1918:
URL: https://github.com/apache/lucene-solr/pull/1918#issuecomment-698163487

@jpountz I had to change some stuff to make it work. Down the road I want to clean this up more so we don't need the extra step, but I want to do this after we cut 8.7.
[jira] [Commented] (SOLR-14613) Provide a clean API for pluggable replica assignment implementations
[ https://issues.apache.org/jira/browse/SOLR-14613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201326#comment-17201326 ]

ASF subversion and git services commented on SOLR-14613:

Commit cafa449769fb131830ede129910287670625aa0d in lucene-solr's branch refs/heads/master from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cafa449 ]

SOLR-14613: Avoid multiple ZK write

> Provide a clean API for pluggable replica assignment implementations
>
> Key: SOLR-14613
> URL: https://issues.apache.org/jira/browse/SOLR-14613
> Project: Solr
> Issue Type: Improvement
> Components: AutoScaling
> Reporter: Andrzej Bialecki
> Assignee: Ilan Ginzburg
> Priority: Major
> Time Spent: 41h
> Remaining Estimate: 0h
>
> As described in SIP-8 the current autoscaling Policy implementation has several limitations that make it difficult to use for very large clusters and very large collections. SIP-8 also mentions the possible migration path by providing alternative implementations of the placement strategies that are less complex but more efficient in these very large environments.
> We should review the existing APIs that the current autoscaling engine uses ({{SolrCloudManager}}, {{AssignStrategy}}, {{Suggester}} and related interfaces) to see if they provide a sufficient and minimal API for plugging in alternative autoscaling placement strategies, and if necessary refactor the existing APIs.
> Since these APIs are internal it should be possible to do this without breaking back-compat.
[jira] [Commented] (LUCENE-9535) Investigate recent indexing slowdown for wikimedium documents
[ https://issues.apache.org/jira/browse/LUCENE-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201335#comment-17201335 ]

ASF subversion and git services commented on LUCENE-9535:

Commit c258905bd01f458df4924e361b2395f06e387b88 in lucene-solr's branch refs/heads/master from Simon Willnauer
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c258905 ]

LUCENE-9535: Commit DWPT bytes used before locking indexing (#1918)

Currently we calculate the ramBytesUsed by the DWPT under the flushControl lock. We can do this calculation safely outside of the lock without any downside. The FlushControl lock should be used with care since it's a central part of indexing and might block all indexing.

> Investigate recent indexing slowdown for wikimedium documents
>
> Key: LUCENE-9535
> URL: https://issues.apache.org/jira/browse/LUCENE-9535
> Project: Lucene - Core
> Issue Type: Task
> Reporter: Adrien Grand
> Priority: Minor
> Attachments: cpu_profile.svg
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Nightly benchmarks report a ~10% slowdown for 1kB documents as of September 9th: http://people.apache.org/~mikemccand/lucenebench/indexing.html
> On that day, we added stored fields to DWPT accounting (LUCENE-9511), so I first thought this could be due to smaller flushed segments and more merging, but I still wonder whether there's something else. The benchmark runs with 8GB of heap, 2GB of RAM buffer and 36 indexing threads, so it's about 2GB/36 = 57MB of RAM buffer per thread in the worst-case scenario that all DWPTs get full at the same time. Stored fields account for about 0.7MB of memory, or 1% of the indexing buffer size. How can a 1% reduction in buffering capacity explain a 10% indexing slowdown? I looked into this further by running indexing benchmarks locally with 8 indexing threads and 128MB of indexing buffer memory, which would make this issue even more apparent if the smaller RAM buffer were the cause, but I'm not seeing a regression, and I'm actually seeing a similar number of flushes when I disable memory accounting for stored fields.
> I ran indexing under a profiler to see whether something else could cause this slowdown, e.g. slow implementations of ramBytesUsed on stored fields writers, but nothing surprising showed up and the profile looked just like I would have expected.
> Another question I have is why the 4kB benchmark is not affected at all.
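As a rough illustration of the pattern this commit message describes — compute the potentially expensive RAM estimate before entering the shared critical section, and hold the lock only for the cheap bookkeeping — here is a minimal sketch with made-up names (this is not the actual DocumentsWriterFlushControl code):

```java
// Illustrative sketch only; class, field and method names are hypothetical.
class FlushControlSketch {
  private final Object flushLock = new Object();
  private long activeBytes;

  interface Writer {
    long ramBytesUsed(); // potentially expensive to compute
  }

  void afterDocument(Writer dwpt) {
    // Compute the estimate outside the lock...
    long bytesUsed = dwpt.ramBytesUsed();
    // ...so the central lock is held only for the cheap counter update.
    synchronized (flushLock) {
      activeBytes += bytesUsed;
    }
  }
}
```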
[GitHub] [lucene-solr] s1monw merged pull request #1918: LUCENE-9535: Commit DWPT bytes used before locking indexing
s1monw merged pull request #1918: URL: https://github.com/apache/lucene-solr/pull/1918
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #1906: SOLR-13528: Implement API Based Config For Rate Limiters
noblepaul commented on a change in pull request #1906:
URL: https://github.com/apache/lucene-solr/pull/1906#discussion_r494114574

File path: solr/core/src/java/org/apache/solr/handler/ClusterAPI.java

@@ -206,7 +209,7 @@ public void setObjProperty(PayloadObj obj) {
   public void setProperty(PayloadObj> obj) throws Exception {
     Map m = obj.getDataMap();
     m.put("action", CLUSTERPROP.toString());
-    collectionsHandler.handleRequestBody(wrapParams(obj.getRequest(),m ), obj.getResponse());
+    collectionsHandler.handleRequestBody(wrapParams(obj.getRequest(), m), obj.getResponse());

Review comment: why is there a change here?
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1905: LUCENE-9488 Release with Gradle Part 2
dweiss commented on a change in pull request #1905:
URL: https://github.com/apache/lucene-solr/pull/1905#discussion_r494116068

File path: lucene/build.gradle

@@ -15,8 +15,56 @@
  * limitations under the License.
  */

+// Should we do this as :lucene:packaging similar to how Solr does it?
+// Or is this fine here?
+
+plugins {
+  id 'distribution'
+}
+
 description = 'Parent project for Apache Lucene Core'

 subprojects {
   group "org.apache.lucene"
-}
\ No newline at end of file
+}
+
+distributions {
+  main {
+    // This is empirically wrong, but it is mostly a copy from `ant package-zip`

Review comment: Ok, fair enough. But I wouldn't want to add things to gradle that are not right - these are hard to get rid of later on. I'm really busy this week but I will correct those assembly bits and commit them to this PR. Please give me some time to work on this, thank you.
[jira] [Commented] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml
[ https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201355#comment-17201355 ]

Dawid Weiss commented on SOLR-14889:

Hi Chris. Uwe explained the particular problem; I wanted to follow up with a generic explanation. The order of evaluation is, I think, the trickiest bit in gradle builds, as it involves groovy's syntactic sugar, hooks in gradle itself, and the different "stages" of a gradle build. This is a good read: https://docs.gradle.org/current/userguide/build_lifecycle.html

In short, the three phases are:
- initialization - bootstrap, reading the setup properties, etc. (we don't care much about this one);
- evaluation - this is actually execution too - all build scripts (groovy) are run against an empty build graph; they add and configure tasks, prepare properties, declare dependencies (but shouldn't resolve them yet), etc. This is when you "set up" the build;
- the execution phase follows, when gradle decides which tasks to actually execute (based on user-provided task names, dependencies, configurations to resolve, and cache states).

The "doFirst" and "doLast" clauses in tasks add an anonymous closure to the list of things to run in the execution phase. So even if a script looks linear, the closures in doFirst and doLast run at a completely different time than what's outside of them. The simplest way to see what gets executed and when is to add debug printlns... It really helps sometimes.

> improve templated variable escaping in ref-guide _config.yml
>
> Key: SOLR-14889
> URL: https://issues.apache.org/jira/browse/SOLR-14889
> Project: Solr
> Issue Type: Task
> Security Level: Public (Default Security Level. Issues are Public)
> Components: documentation
> Reporter: Chris M. Hostetter
> Assignee: Chris M. Hostetter
> Priority: Major
> Attachments: SOLR-14889.patch, SOLR-14889.patch, SOLR-14889.patch
>
> SOLR-14824 ran into windows failures when we switched from using a hardcoded "relative" path to the solrRootPath to using groovy/project variables to get the path. The reason for the failures was that the path is used as a variable templated into {{_config.yml.template}} to build the {{_config.yml}} file, but on windows the path separator of '\' was being parsed by jekyll/YAML as a string escape character.
> (This wasn't a problem we ran into before, even on windows, prior to the SOLR-14824 changes, because the hardcoded relative path only used '/' delimiters, which (j)ruby was happy to work with, even on windows.)
> As Uwe pointed out when hotfixing this...
> {quote}Problem was that backslashes are used to escape strings, but windows paths also have those. Fix was to add StringEscapeUtils, but I don't like this too much. Maybe we find a better solution to make special characters in those properties escaped correctly when used in strings inside templates.
> {quote}
> ...the current fix of using {{StringEscapeUtils.escapeJava}} -- only for this one variable -- doesn't really protect other variables that might have special characters in them down the road, and while "escapeJava" works ok for the "\" issue, it isn't necessarily consistent with all YAML escapes, which could lead to even weirder bugs/confusion down the road.
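A small, generic Gradle/Groovy illustration of the phases described above (not taken from the Solr build): the top-level body of a task configuration block runs during evaluation on every build invocation, while the doFirst/doLast closures run only if and when the task actually executes.

```groovy
// build.gradle -- illustrative only
task demo {
    // Evaluation (configuration) phase: runs for every build invocation,
    // even if the 'demo' task is never executed.
    println "configuring demo"

    doFirst {
        // Execution phase: runs just before the task's main action.
        println "about to run demo"
    }

    doLast {
        // Execution phase: runs after the task's main action.
        println "finished demo"
    }
}
```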
[GitHub] [lucene-solr] dweiss commented on pull request #1836: LUCENE-9317: Clean up split package in analyzers-common
dweiss commented on pull request #1836:
URL: https://github.com/apache/lucene-solr/pull/1836#issuecomment-698189266

If there is test infrastructure that needs to be shared then I'd suggest creating a project that is a test-configuration dependency of the other subprojects (rather than cloning those classes). This is a simple and cheap thing to do with gradle.
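A rough sketch of that suggestion, with hypothetical project names: the shared test classes live in their own subproject, and other modules pull them in only on their test classpath.

```groovy
// settings.gradle (hypothetical module name)
include ':lucene:analysis-test-fixtures'

// build.gradle of a module that needs the shared test infrastructure
dependencies {
    // Visible only on this module's test compile/runtime classpath.
    testImplementation project(':lucene:analysis-test-fixtures')
}
```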
[jira] [Commented] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml
[ https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201360#comment-17201360 ]

Uwe Schindler commented on SOLR-14889:

Hi [~dweiss]: Do you think my patch is fine? It works for me, although I don't like passing an empty map to the expand() configuration of the SyncTask. I was looking for a method like expand() that takes a closure to expand properties, but the only one provided by the sync task is the one taking a Map, so I don't see any better alternative! I was also looking into using project.provider(...), but that is also not accepted by expand().

IMHO, I would remove the extra task "populateLazyProps" and move this into the copy task (at the place where I added the doFirst()). Then it works linearly (because doFirst() is run explicitly before the main task method of SyncTask). The lazy props have no effect on inputs, so there's no need to have them separated (inputs are also evaluated before). When it depends on the configuration, it's re-executed anyway. I can provide a patch simplifying this.
[jira] [Commented] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml
[ https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201364#comment-17201364 ]

Dawid Weiss commented on SOLR-14889:

Darn, I didn't even look at your patch, Uwe - I assumed you'd solved it. :) I really can't do anything in the next ~7 hours or so. Will try to take a look after that, though.
[GitHub] [lucene-solr] sigram commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.
sigram commented on a change in pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r494131970

File path: solr/core/src/java/org/apache/solr/core/CoreContainer.java

@@ -889,7 +896,37 @@ public void load() {
     ContainerPluginsApi containerPluginsApi = new ContainerPluginsApi(this);
     containerHandlers.getApiBag().registerObject(containerPluginsApi.readAPI);
     containerHandlers.getApiBag().registerObject(containerPluginsApi.editAPI);
+
+    // create the ClusterEventProducer
+    CustomContainerPlugins.ApiInfo clusterEventProducerInfo = customContainerPlugins.getPlugin(ClusterEventProducer.PLUGIN_NAME);
+    if (clusterEventProducerInfo != null) {
+      clusterEventProducer = (ClusterEventProducer) clusterEventProducerInfo.getInstance();
+    } else {
+      clusterEventProducer = new ClusterEventProducerImpl(this);
+    }
+    // init ClusterSingleton-s
+    Map singletons = new ConcurrentHashMap<>();
+    if (clusterEventProducer instanceof ClusterSingleton) {
+      singletons.put(ClusterEventProducer.PLUGIN_NAME, (ClusterSingleton) clusterEventProducer);
+    }
+
+    // register ClusterSingleton handlers
+    // XXX register also other ClusterSingleton-s from packages - how?
+    containerHandlers.keySet().forEach(handlerName -> {

Review comment: The purpose of this code is to build a registry of existing `ClusterSingleton` implementations (perhaps this should go into a dedicated registry class). We don't have a dependency injection framework, so we need to perform the discovery and registration ourselves somewhere. And we need a registry in order to manage the `ClusterSingleton` lifecycle together with the Overseer leader lifecycle.
[GitHub] [lucene-solr] sigram commented on a change in pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.
sigram commented on a change in pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#discussion_r494134314

File path: solr/core/src/java/org/apache/solr/handler/admin/ContainerPluginsApi.java

@@ -64,15 +64,15 @@ public ContainerPluginsApi(CoreContainer coreContainer) {
   public class Read {
     @EndPoint(method = METHOD.GET,
-        path = "/cluster/plugin",
+        path = "/cluster/plugins",

Review comment: Right, but this is for 9.0, so we can break back-compat if it's justified - and I think it is, because the singular name here doesn't make sense: it is a location where multiple plugin configurations are defined. In any case, we can provide a back-compat shim for 9.0 to also accept the singular `plugin`.

File path: solr/core/src/java/org/apache/solr/handler/admin/ContainerPluginsApi.java

@@ -64,15 +64,15 @@ public ContainerPluginsApi(CoreContainer coreContainer) {
   public class Read {
     @EndPoint(method = METHOD.GET,
-        path = "/cluster/plugin",
+        path = "/cluster/plugins",
         permission = PermissionNameProvider.Name.COLL_READ_PERM)
     public void list(SolrQueryRequest req, SolrQueryResponse rsp) throws IOException {
-      rsp.add(PLUGIN, plugins(zkClientSupplier));
+      rsp.add(PLUGINS, plugins(zkClientSupplier));

Review comment: See above.
[jira] [Updated] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml
[ https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated SOLR-14889:
Attachment: SOLR-14889.patch
[GitHub] [lucene-solr] sigram commented on pull request #1758: SOLR-14749: Provide a clean API for cluster-level event processing, Initial draft.
sigram commented on pull request #1758:
URL: https://github.com/apache/lucene-solr/pull/1758#issuecomment-698204377

> an example of how you register some type of plugin

Still open for suggestions. IMHO `ClusterEventListener`-s make sense only if they are also `ClusterSingleton`-s. If that's the case, then there's that messy section in `CoreContainer` that already registers `ClusterSingleton`-s, and we can add a section that additionally registers any instance that implements `ClusterEventListener` with the `ClusterEventProducer`.

> how it looks in some JSON in ZK

I'm reusing the plugin configs from `CustomContainerPlugins`, so it will look like any other plugin config.

> If a plugin can be registered using a public API, is there a testcase for the same?

Not yet :)
[jira] [Commented] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml
[ https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201368#comment-17201368 ]

Uwe Schindler commented on SOLR-14889:

Hi [~dweiss], hi [~hossman],
I updated the patch, here's my new one: [^SOLR-14889.patch]
I removed the prepareLazyProps task and moved the stuff into doFirst. This has several goodies:
- It's an antipattern to modify global properties during a task execution, because depending on the order of tasks (or parallelism) this may lead to strange results. Because of this, the doFirst clones the input map and then starts to add stuff.
- Up-to-date checking now works as expected, because the properties used as input don't suddenly change. For me the task was re-executed every time, even when running it several times; now this is solved. prepareSources only runs if the input files change (sync task) or the modifiable templateProps input changes. It is also re-executed if the configuration changes, so I also added "dependsOn configurations.depVer".
- The properties are printed out.
- After printing them out they are assigned to a FINAL local variable, which is passed to expand(). It's declared final to make sure the map reference stays the same.

[~hossman]: you can now change the nocommits back to their original state, looks like it works as expected. I am happy with it now.
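A condensed sketch of the approach described above (illustrative only, not the actual SOLR-14889 patch): a single map instance is created at configuration time and registered with expand(), and the lazily computed values are added to that same instance inside doFirst at execution time.

```groovy
// Illustrative only; property names, paths and the 'depVer' configuration are placeholders.
final Map templateProps = [:]          // one instance, created at configuration time

task prepareSources(type: Sync) {
    dependsOn configurations.depVer    // re-run when the dependency configuration changes

    from 'src/template'
    into "$buildDir/content"

    doFirst {
        // Execution phase: resolve the lazy values into the same map instance
        // that was handed to expand() below.
        templateProps.putAll([
            solrRootPath: project(':solr').projectDir.toString()
        ])
        logger.lifecycle("Expanding template properties: {}", templateProps)
    }

    expand(templateProps)
}
```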
[jira] [Comment Edited] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml
[ https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201368#comment-17201368 ]

Uwe Schindler edited comment on SOLR-14889 at 9/24/20, 8:43 AM:

Hi [~dweiss], hi [~hossman],
I updated the patch, here's my new one: [^SOLR-14889.patch]
I removed the prepareLazyProps task and moved the stuff into doFirst. This has several goodies:
- It's an antipattern to modify global properties during a task execution, because depending on the order of tasks (or parallelism) this may lead to strange results. Because of this, the doFirst clones the input map and then starts to add stuff.
- Up-to-date checking now works as expected, because the properties used as input don't suddenly change. For me the task was re-executed every time, even when running it several times; now this is solved. prepareSources only runs if the input files change (sync task) or the modifiable templateProps input changes. It is also re-executed if the configuration changes, so I also added "dependsOn configurations.depVer".
- The properties are printed out.
- After printing them out they are added to the previously created (in the configuration phase) FINAL local Map variable, which is passed to expand(). It's declared final to make sure the map reference stays the same, so expand() always sees only one instance.

[~hossman]: you can now change the nocommits back to their original state, looks like it works as expected. I am happy with it now.
[jira] [Commented] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml
[ https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201372#comment-17201372 ]

Dawid Weiss commented on SOLR-14889:

I glanced at it and I think it's good, will verify in the afternoon.
[jira] [Commented] (SOLR-14613) Provide a clean API for pluggable replica assignment implementations
[ https://issues.apache.org/jira/browse/SOLR-14613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201378#comment-17201378 ]

Ilan Ginzburg commented on SOLR-14613:

[~noble.paul] could we change {{setClusterProperty(String propertyName, Object propertyValue)}} in {{ClusterProperties}} to replace the existing property value with the passed one, or do we really need this "partial" update behavior (where the existing Json in {{clusterprops.json}} for {{propertyName}} is merged rather than replaced with the new Json of {{propertyValue}})?

If we can't change the existing method, I suggest the added one have a consistent signature, i.e. define {{update(String propertyName, Object propertyValue)}} rather than {{update(MapWriter obj, String... path)}}. This will make {{ClusterAPI}} easier to read and move the configuration management implementation details to {{ClusterProperties}}, where they belong.
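For illustration, the two signatures being compared would look roughly like this (a hedged sketch of the shapes under discussion, not the real ClusterProperties source):

```java
import java.io.IOException;

// Hypothetical sketch; not the actual Solr ClusterProperties class.
public interface ClusterPropertiesSketch {

  // Existing style: one property name, one value.
  void setClusterProperty(String propertyName, Object propertyValue) throws IOException;

  // Suggested shape for the added method, consistent with the one above,
  // instead of update(MapWriter obj, String... path).
  void update(String propertyName, Object propertyValue) throws IOException;
}
```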
[jira] [Commented] (SOLR-14843) Define strongly-typed cluster configuration API
[ https://issues.apache.org/jira/browse/SOLR-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201427#comment-17201427 ]

Ilan Ginzburg commented on SOLR-14843:

Would an initial step in this Jira be the ability to define a way to "prime" the Zookeeper-based configuration with something coming from a file?

Motivation: the plugin-based replica placement code is in master, but the default placement strategy is {{LEGACY}}. To use the new placement strategy a specific config must be set on a running SolrCloud cluster using a {{curl}} command to the config API. To make plugin placement the default strategy now or later (now would be good so it gets some baking time...) yet be able to switch a cluster back to {{LEGACY}} if needed (by removing that configuration), a configuration needs to somehow be automatically pushed to {{/clusterprops.json}}, but there's no support for doing that.

I believe priming a ZK config with content from a file is not easy (handle new configs or changes to default configs coming with a new release, deal with existing configuration in ZK so as not to overwrite it, etc.) and is not my preferred way of dealing with this type of config, but we do need something.

> Define strongly-typed cluster configuration API
>
> Key: SOLR-14843
> URL: https://issues.apache.org/jira/browse/SOLR-14843
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Andrzej Bialecki
> Priority: Major
> Labels: clean-api
> Fix For: master (9.0)
>
> Current cluster-level configuration uses a hodgepodge of traditional Solr config sources (solr.xml, system properties) and the new somewhat arbitrary config files kept in ZK ({{/clusterprops.json, /security.json, /packages.json, /autoscaling.json}} etc...). There's no uniform strongly-typed API to access and manage these configs - currently each config source has its own CRUD, often relying on direct access to Zookeeper. There's also no uniform method for monitoring changes to these config sources.
> This issue proposes a uniform config API facade with the following characteristics:
> * Using a single hierarchical (or at least key-based) facade for accessing any global config.
> * Using strongly-typed sub-system configs instead of opaque Map-s: components would no longer deal with JSON parsing/writing, instead they would use properly annotated Java objects for config CRUD. Config objects would include versioning information (eg. lastModified timestamp).
> * Isolating access to the underlying config persistence layer: components would no longer directly interact with Zookeeper or files. Most likely the default implementation would continue using different ZK files per-subsystem in order to limit the complexity of file formats and to reduce the cost of notifications for unmodified parts of the configs.
> * Providing a uniform way to register listeners for monitoring changes in specific configs: components would no longer need to interact with ZK watches, they would instead be notified about modified configs that they are interested in.
[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async
[ https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201437#comment-17201437 ]

Cao Manh Dat commented on SOLR-14354:

[~ichattopadhyaya] sorry, but I gave solr-bench a shot and it seems that with the default config-local.json it finishes too quickly - even when I tried to increase the size of queryLog to 56k queries, the total time seemed unchanged. With a total time of 500ms for the query benchmark, the result can't say anything at all.

Again, I'm ok with reverting, but I'm kinda sad when good things are unable to reach users soon.

[~sarkaramr...@gmail.com] I heard that you have a toolbox for benchmarking a heavy scenario. If possible, can you do that and post the result here? It would be a big help for us and our users.

> HttpShardHandler send requests in async
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
> Issue Type: Improvement
> Reporter: Cao Manh Dat
> Assignee: Cao Manh Dat
> Priority: Blocker
> Fix For: master (9.0), 8.7
> Attachments: image-2020-03-23-10-04-08-399.png, image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
> Time Spent: 4h
> Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request will submit n requests (n equals the number of shards) to an executor, so each request corresponds to a thread. After sending a request, that thread basically does nothing but wait for the response from the other side. That thread will be swapped out and the CPU will try to handle another thread (this is called a context switch; the CPU saves the context of the current thread and switches to another one). When some data (not all) comes back, that thread will be called to parse that data, then it will wait until more data comes back. So there will be lots of context switching in the CPU, which is quite an inefficient use of threads. Basically we want fewer threads, with most of them busy all the time, because threads are not free, and neither is context switching. That is the main idea behind everything, like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers an async API like this:
> {code:java}
> httpClient.newRequest("http://domain.com/path")
>     // Add request hooks
>     .onRequestQueued(request -> { ... })
>     .onRequestBegin(request -> { ... })
>     // Add response hooks
>     .onResponseBegin(response -> { ... })
>     .onResponseHeaders(response -> { ... })
>     .onResponseContent((response, buffer) -> { ... })
>     .send(result -> { ... });
> {code}
> Therefore, after calling {{send()}} the thread returns immediately without blocking. When the client receives the headers from the other side, it calls the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not the whole response), it calls the {{onContent(buffer)}} listeners. When everything is finished, it calls the {{onComplete}} listeners. One main thing to notice here is that all listeners should finish quickly; if a listener blocks, no further data for that request is handled until the listener finishes.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
> Jetty HttpClient already provides several listeners, one of them being InputStreamResponseListener. This is how it gets used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
>     // Read the response content
>   }
> }
> {code}
> In this case, there will be two threads:
> * one thread trying to read the response content from the InputStream
> * one thread (a short-lived task) feeding content to the above InputStream whenever some byte[] is available. Note that if this thread is unable to feed data into the InputStream, it will wait.
> By using this, the model of HttpShardHandler can be written into something like this:
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() -> {
>     try (is) {
>       // Read the content from InputStream
>     }
>   });
> });
> {code}
> The first diagram will be changed into this:
> !image-2020-03-23-10-09-10-221.png!
> Notice that although "sending req to shard1" is wide, it won't take a long time since sending a req is a very quick
[GitHub] [lucene-solr] mocobeta commented on pull request #1836: LUCENE-9317: Clean up split package in analyzers-common
mocobeta commented on pull request #1836:
URL: https://github.com/apache/lucene-solr/pull/1836#issuecomment-698267536

> If you have any questions, please ask me on slack, there I can respond faster.

We already have Jira and Github; I'd rather not disperse discussions onto multiple platforms any more... Are Jira/Github mentions insufficient for us?
[GitHub] [lucene-solr] mocobeta edited a comment on pull request #1836: LUCENE-9317: Clean up split package in analyzers-common
mocobeta edited a comment on pull request #1836:
URL: https://github.com/apache/lucene-solr/pull/1836#issuecomment-698267536

> If you have any questions, please ask me on slack, there I can respond faster.

We already have Jira and Github; I'd rather not disperse discussions onto multiple platforms any more... Is a Jira/Github mention insufficient for us?
[GitHub] [lucene-solr] mikemccand commented on pull request #1912: LUCENE-9535: Try to do larger flushes.
mikemccand commented on pull request #1912:
URL: https://github.com/apache/lucene-solr/pull/1912#issuecomment-698332632

Oh my! We are talking about assigning a DWPT to an incoming indexing thread, right? (And not which DWPT to pick for flushing because the RAM buffer is full, which I think is currently "the biggest one/s"?)

I think this is a good idea. It will tend to make the biggest DWPTs even bigger, especially when there is high variance in how many threads are indexing at once over time, until a flush is triggered.

I do not remember why we switched to "last DWPT". Long ago we did have thread affinity, so the same indexing thread would try to get the same DWPT, but we moved away from that quite a while ago.
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1917: LUCENE-9535: Make ByteBuffersDataOutput#ramBytesUsed run in constant-time.
dweiss commented on a change in pull request #1917:
URL: https://github.com/apache/lucene-solr/pull/1917#discussion_r494311008

File path: lucene/core/src/java/org/apache/lucene/store/ByteBuffersDataOutput.java

@@ -400,8 +400,13 @@ public void writeSetOfStrings(Set set) {
   public long ramBytesUsed() {
     // Return a rough estimation for allocated blocks. Note that we do not make
     // any special distinction for direct memory buffers.
-    return RamUsageEstimator.NUM_BYTES_OBJECT_REF * blocks.size() +
-        blocks.stream().mapToLong(buf -> buf.capacity()).sum();
+    ByteBuffer first = blocks.peek();
+    if (first == null) {
+      return 0L;
+    } else {
+      // All blocks have the same capacity.

Review comment: Hmmm... Do they? I don't think this is the case, in general, since you can plug in an arbitrary block provider/recycler?
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1917: LUCENE-9535: Make ByteBuffersDataOutput#ramBytesUsed run in constant-time.
dweiss commented on a change in pull request #1917:
URL: https://github.com/apache/lucene-solr/pull/1917#discussion_r494314029

File path: lucene/core/src/java/org/apache/lucene/store/ByteBuffersDataOutput.java

Review comment: I'm not sure I like this assumption (that all blocks are equal). Maybe we could maintain a separate ram usage counter on block addition/removal and thus make the sum constant-time but also accurate?
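A minimal sketch of that alternative, with hypothetical names (not the actual ByteBuffersDataOutput): keep a running total that is updated whenever a block is added or the blocks are cleared, so ramBytesUsed() stays constant-time without assuming all blocks have equal capacity.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Illustrative sketch only; the real class tracks more state than this.
class BlockAccountingSketch {
  // Stand-in for RamUsageEstimator.NUM_BYTES_OBJECT_REF.
  private static final long BYTES_PER_OBJECT_REF = 8;

  private final ArrayDeque<ByteBuffer> blocks = new ArrayDeque<>();
  private long blockCapacityBytes; // running sum of block capacities

  void addBlock(ByteBuffer block) {
    blocks.add(block);
    blockCapacityBytes += block.capacity();
  }

  void reset() {
    blocks.clear();
    blockCapacityBytes = 0;
  }

  long ramBytesUsed() {
    // Constant-time and exact even when blocks have different capacities.
    return BYTES_PER_OBJECT_REF * blocks.size() + blockCapacityBytes;
  }
}
```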
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1917: LUCENE-9535: Make ByteBuffersDataOutput#ramBytesUsed run in constant-time.
jpountz commented on a change in pull request #1917:
URL: https://github.com/apache/lucene-solr/pull/1917#discussion_r494317183

File path: lucene/core/src/java/org/apache/lucene/store/ByteBuffersDataOutput.java

Review comment: Oops, thanks for catching this. I thought they did because we make the assumption that all buffers have the same length in toDataInput, which is different from capacity.
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1917: LUCENE-9535: Make ByteBuffersDataOutput#ramBytesUsed run in constant-time.
jpountz commented on a change in pull request #1917:
URL: https://github.com/apache/lucene-solr/pull/1917#discussion_r494317441

File path: lucene/core/src/java/org/apache/lucene/store/ByteBuffersDataOutput.java

Review comment: Yep, I'll do that.
[jira] [Commented] (SOLR-14891) Upgrade Jetty to 9.4.28+ to fix Startup Warning
[ https://issues.apache.org/jira/browse/SOLR-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201531#comment-17201531 ]

Cassandra Targett commented on SOLR-14891:

I think this is a duplicate of SOLR-14835.

> Upgrade Jetty to 9.4.28+ to fix Startup Warning
>
> Key: SOLR-14891
> URL: https://issues.apache.org/jira/browse/SOLR-14891
> Project: Solr
> Issue Type: Wish
> Security Level: Public (Default Security Level. Issues are Public)
> Affects Versions: 8.6.2
> Reporter: Bernd Wahlen
> Priority: Minor
>
> Solr is currently using Jetty 9.4.27, which displays a strange warning at startup. I think it is fixed in 9.4.28: https://github.com/eclipse/jetty.project/issues/4631
> 2020-09-23 09:57:57.346 WARN (main) [ ] o.e.j.x.XmlConfiguration Ignored arg: class="com.codahale.metrics.jetty9.InstrumentedQueuedThreadPool"> name="registry"> class="com.codahale.metrics.SharedMetricRegistries">solr.jetty
[jira] [Resolved] (SOLR-14891) Upgrade Jetty to 9.4.28+ to fix Startup Warning
[ https://issues.apache.org/jira/browse/SOLR-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bernd Wahlen resolved SOLR-14891.
Resolution: Duplicate
[jira] [Commented] (SOLR-14835) Solr 8.6.x log starts with "XmlConfiguration Ignored arg" warning from Jetty
[ https://issues.apache.org/jira/browse/SOLR-14835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201533#comment-17201533 ] Bernd Wahlen commented on SOLR-14835: - Solr is currently using Jetty 9.4.27, which displays a strange warning at startup. I think it is fixed in 9.4.28: https://github.com/eclipse/jetty.project/issues/4631 > Solr 8.6.x log starts with "XmlConfiguration Ignored arg" warning from Jetty > > > Key: SOLR-14835 > URL: https://issues.apache.org/jira/browse/SOLR-14835 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.6.2 >Reporter: Colvin Cowie >Assignee: Andrzej Bialecki >Priority: Trivial > > After moving to 8.6.2 the first lines of the solr.log are > {noformat} > 2020-09-06 18:19:09.164 INFO (main) [ ] o.e.j.u.log Logging initialized > @1197ms to org.eclipse.jetty.util.log.Slf4jLog > 2020-09-06 18:19:09.226 WARN (main) [ ] o.e.j.u.l.o.e.j.x.XmlConfiguration > Ignored arg: > class="com.codahale.metrics.jetty9.InstrumentedQueuedThreadPool"> name="registry"> > class="com.codahale.metrics.SharedMetricRegistries">solr.jetty > > > {noformat} > This config is declared here: > https://github.com/apache/lucene-solr/blob/5154b6008f54c9d096f5efe9ae347492c23dd780/solr/server/etc/jetty.xml#L33 > and has been there for a long time, so I assume it's the bump in Jetty > version that's causing it now. > I'm seeing this in 8.6.2, but I've not gone back to check other versions -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14891) Upgrade Jetty to 9.4.28+ to fix Startup Warning
[ https://issues.apache.org/jira/browse/SOLR-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201535#comment-17201535 ] Bernd Wahlen commented on SOLR-14891: - thanks => close/duplicate > Upgrade Jetty to 9.4.28+ to fix Startup Warning > --- > > Key: SOLR-14891 > URL: https://issues.apache.org/jira/browse/SOLR-14891 > Project: Solr > Issue Type: Wish > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.6.2 >Reporter: Bernd Wahlen >Priority: Minor > > Solr currently using Jetty 9.4.27 which displays strange Warning at startup. > I think it is fixed in 9.4.28 > https://github.com/eclipse/jetty.project/issues/4631 > 2020-09-23 09:57:57.346 WARN (main) [ ] o.e.j.x.XmlConfiguration Ignored > arg: > class="com.codahale.metrics.jetty9.InstrumentedQueuedThreadPool"> name="registry"> > class="com.codahale.metrics.SharedMetricRegistries">solr.jetty > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml
[ https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201550#comment-17201550 ] Dawid Weiss commented on SOLR-14889: Yep, Uwe's version looks good to me. Added a few additional unrelated cleanups (imports, comments, warning from jekyll). Seems good to go. > improve templated variable escaping in ref-guide _config.yml > > > Key: SOLR-14889 > URL: https://issues.apache.org/jira/browse/SOLR-14889 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-14889.patch, SOLR-14889.patch, SOLR-14889.patch, > SOLR-14889.patch, SOLR-14889.patch > > > SOLR-14824 ran into windows failures when we switching from using a hardcoded > "relative" path to the solrRootPath to using groovy/project variables to get > the path. the reason for the failures was that the path us used as a > variable tempted into {{_config.yml.template}} to build the {{_config.yml}} > file, but on windows the path seperater of '\' was being parsed by > jekyll/YAML as a string escape character. > (This wasn't a problem we ran into before, even on windows, prior to the > SOLR-14824 changes, because the hardcoded relative path only used '/' > delimiters, which (j)ruby was happy to work with, even on windows. > As Uwe pointed out when hotfixing this... > {quote}Problem was that backslashes are used to escape strings, but windows > paths also have those. Fix was to add StringEscapeUtils, but I don't like > this too much. Maybe we find a better solution to make special characters in > those properties escaped correctly when used in strings inside templates. > {quote} > ...the current fix of using {{StringEscapeUtils.escapeJava}} - only for this > one variable -- doesn't really protect other variables that might have > special charactes in them down the road, and while "escapeJava" work ok for > the "\" issue, it isn't neccessarily consistent with all YAML escapse, which > could lead to even weird bugs/cofusion down the road. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
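For illustration of the hotfix being discussed — not the build code itself — escaping a Windows path before it is templated into _config.yml could look like the sketch below; the path is made up, and StringEscapeUtils is the commons-lang3 class named in the issue:
{code:java}
import org.apache.commons.lang3.StringEscapeUtils;

public class EscapeDemo {
  public static void main(String[] args) {
    // Hypothetical Windows-style solrRootPath; the single backslashes are what
    // trip up YAML/Jekyll when the value is substituted into _config.yml.template.
    String solrRootPath = "C:\\Users\\jenkins\\ws\\lucene-solr\\solr";
    // escapeJava doubles the backslashes, which YAML double-quoted strings also
    // happen to accept, but it is not a general-purpose YAML escaper.
    System.out.println(StringEscapeUtils.escapeJava(solrRootPath));
    // prints: C:\\Users\\jenkins\\ws\\lucene-solr\\solr
  }
}
{code}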
[jira] [Updated] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml
[ https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated SOLR-14889: --- Attachment: SOLR-14889.patch > improve templated variable escaping in ref-guide _config.yml > > > Key: SOLR-14889 > URL: https://issues.apache.org/jira/browse/SOLR-14889 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-14889.patch, SOLR-14889.patch, SOLR-14889.patch, > SOLR-14889.patch, SOLR-14889.patch > > > SOLR-14824 ran into windows failures when we switching from using a hardcoded > "relative" path to the solrRootPath to using groovy/project variables to get > the path. the reason for the failures was that the path us used as a > variable tempted into {{_config.yml.template}} to build the {{_config.yml}} > file, but on windows the path seperater of '\' was being parsed by > jekyll/YAML as a string escape character. > (This wasn't a problem we ran into before, even on windows, prior to the > SOLR-14824 changes, because the hardcoded relative path only used '/' > delimiters, which (j)ruby was happy to work with, even on windows. > As Uwe pointed out when hotfixing this... > {quote}Problem was that backslashes are used to escape strings, but windows > paths also have those. Fix was to add StringEscapeUtils, but I don't like > this too much. Maybe we find a better solution to make special characters in > those properties escaped correctly when used in strings inside templates. > {quote} > ...the current fix of using {{StringEscapeUtils.escapeJava}} - only for this > one variable -- doesn't really protect other variables that might have > special charactes in them down the road, and while "escapeJava" work ok for > the "\" issue, it isn't neccessarily consistent with all YAML escapse, which > could lead to even weird bugs/cofusion down the road. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1917: LUCENE-9535: Make ByteBuffersDataOutput#ramBytesUsed run in constant-time.
dweiss commented on a change in pull request #1917: URL: https://github.com/apache/lucene-solr/pull/1917#discussion_r494352956 ## File path: lucene/core/src/java/org/apache/lucene/store/ByteBuffersDataOutput.java ## @@ -400,8 +400,13 @@ public void writeSetOfStrings(Set set) { public long ramBytesUsed() { // Return a rough estimation for allocated blocks. Note that we do not make // any special distinction for direct memory buffers. -return RamUsageEstimator.NUM_BYTES_OBJECT_REF * blocks.size() + - blocks.stream().mapToLong(buf -> buf.capacity()).sum(); +ByteBuffer first = blocks.peek(); +if (first == null) { + return 0L; +} else { + // All blocks have the same capacity. Review comment: Thanks Adrien! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14549) Listing of Files in a Directory on Solr Admin is Broken
[ https://issues.apache.org/jira/browse/SOLR-14549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201565#comment-17201565 ] David Eric Pugh commented on SOLR-14549: Happy to QA your changes [~krisden], it definitely stumped me. > Listing of Files in a Directory on Solr Admin is Broken > --- > > Key: SOLR-14549 > URL: https://issues.apache.org/jira/browse/SOLR-14549 > Project: Solr > Issue Type: Bug > Components: Admin UI >Affects Versions: master (9.0), 8.5.1, 8.5.2 >Reporter: David Eric Pugh >Assignee: Kevin Risden >Priority: Major > Attachments: Screenshot at Jun 09 07-40-06.png > > > The Admin interface for showing files only lets you see the top level files, > no nested files in a directory: > http://localhost:8983/solr/#/gettingstarted/files?file=lang%2F > Choosing a nested directory doesn't generate any console errors, but the tree > doesn't open. > I believe this was introduced during SOLR-14209 upgrade in Jquery. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz opened a new pull request #1919: Compute RAM usage ByteBuffersDataOutput on the fly.
jpountz opened a new pull request #1919: URL: https://github.com/apache/lucene-solr/pull/1919 This helps remove the assumption that all blocks have the same size. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz commented on pull request #1919: Compute RAM usage ByteBuffersDataOutput on the fly.
jpountz commented on pull request #1919: URL: https://github.com/apache/lucene-solr/pull/1919#issuecomment-698392573 @dweiss FYI I could not find a way to have blocks of different capacities as we have an assertion that the allocator creates blocks of the expected capacity, not larger. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14888) Echo directory to run Solr from for the "assemble" and "dev" targets in the Gradle build
[ https://issues.apache.org/jira/browse/SOLR-14888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201574#comment-17201574 ] Erick Erickson commented on SOLR-14888: --- One other thing that'd be good to do at the same time. If you execute "gradlew tasks", this line comes out: {code} dev - Assemble Solr distribution into 'development' folder at /Users/Erick/apache/solrJiras/master/solr/packaging/build/dev {code} It'd be helpful for that same "folder at..." to come out for the assemble target too. > Echo directory to run Solr from for the "assemble" and "dev" targets in the > Gradle build > > > Key: SOLR-14888 > URL: https://issues.apache.org/jira/browse/SOLR-14888 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Priority: Major > > This used to happen. As per [~mdrob] opening a JIRA. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…
mikemccand commented on a change in pull request #1893: URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494395610 ## File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java ## @@ -0,0 +1,184 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.facet.taxonomy; + +import org.apache.lucene.facet.FacetsConfig; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.util.IntsRef; + +import java.io.IOException; + +import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL; +import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL; + +/** + * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values, + * reducing your index size. + * + * @lucene.experimental + **/ +public class TaxonomyFacetLabels { + + /** + * Index field name provided to the constructor + */ + private final String indexFieldName; + + /** + * {@code TaxonomyReader} provided to the constructor + */ + private final TaxonomyReader taxoReader; + + /** + * {@code FacetsConfig} provided to the constructor + */ + private final FacetsConfig config; + + /** + * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field + */ + private final OrdinalsReader ordsReader; + + /** + * Sole constructor. Do not close the provided {@link TaxonomyReader} while still using this instance! + */ + public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException { +this.taxoReader = taxoReader; +this.config = config; +this.indexFieldName = indexFieldName; +this.ordsReader = new DocValuesOrdinalsReader(indexFieldName); + } + + /** + * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for + * multiple documents and (optionally) for a specific dimension. You must create this per-segment, + * and then step through all hits, in order, for that segment. + * + * NOTE: This class is not thread-safe, so you must use a new instance of this + * class for each thread. + * + * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field + * @return an instance of {@link FacetLabelReader} + * @throws IOException when a low-level IO issue occurs + */ + public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException { +return new FacetLabelReader(ordsReader, readerContext); + } + + /** + * Utility class to retrieve facet labels for multiple documents. 
+ * + * @lucene.experimental + */ + public class FacetLabelReader { +private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader; +private final IntsRef decodedOrds = new IntsRef(); +private int currentDocId = -1; +private int currentPos = -1; + +// Lazily set when nextFacetLabel(int docId, String facetDimension) is first called +private int[] parents; + +/** + * Sole constructor. + */ +public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException { + ordinalsSegmentReader = ordsReader.getReader(readerContext); +} + +/** + * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more. + * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the + * next {@link FacetLabel} for that document. Otherwise, it advances to the new {@code docId} and provides the + * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets. Each + * new {@code docId} must be in strictly monotonic (increasing) order. + * + * @param docId input docId provided in monotonic (non-decreasing) order + * @return the first or next {@link FacetLabel}, or {@code null} if there are no more + * @throws IOException when a low-level IO issue occurs + */ +public FacetLabel nextFacetLabel(int docId) throws IOException
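The patch is cut off in this archive. Purely as a hedged usage sketch of the API it introduces — a FacetLabelReader obtained per LeafReaderContext via getFacetLabelReader, with docIds visited in increasing order — collecting all labels for one segment-local hit might look like:
{code:java}
// Sketch only; assumes the TaxonomyFacetLabels/FacetLabelReader API shown in the patch above.
static List<FacetLabel> labelsForDoc(TaxonomyFacetLabels.FacetLabelReader labelReader,
                                     int segmentDocId) throws IOException {
  List<FacetLabel> labels = new ArrayList<>();
  // nextFacetLabel is stateful: repeated calls with the same docId return the next
  // label for that document until null signals there are no more.
  for (FacetLabel label = labelReader.nextFacetLabel(segmentDocId);
       label != null;
       label = labelReader.nextFacetLabel(segmentDocId)) {
    labels.add(label);
  }
  return labels;
}
{code}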
[GitHub] [lucene-solr] dweiss commented on pull request #1919: Compute RAM usage ByteBuffersDataOutput on the fly.
dweiss commented on pull request #1919: URL: https://github.com/apache/lucene-solr/pull/1919#issuecomment-698412271 I will take another look. I can't remember forcing block capacity but it's been a while! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on pull request #1919: Compute RAM usage ByteBuffersDataOutput on the fly.
uschindler commented on pull request #1919: URL: https://github.com/apache/lucene-solr/pull/1919#issuecomment-698416370 I like this approach more: just sum up the size when new blocks are allocated and added to the Deque. +1 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
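A minimal sketch of the approach being endorsed here (illustrative names, not the actual ByteBuffersDataOutput code): keep a running total that is updated whenever a block is added, so ramBytesUsed() is O(1) and no longer relies on all blocks being the same size:
{code:java}
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

class BlockAccountingSketch {
  private static final int OBJECT_REF_BYTES = 8; // stand-in for RamUsageEstimator.NUM_BYTES_OBJECT_REF
  private final ArrayDeque<ByteBuffer> blocks = new ArrayDeque<>();
  private long ramBytesUsed;

  void addBlock(ByteBuffer block) {
    blocks.add(block);
    ramBytesUsed += block.capacity() + OBJECT_REF_BYTES; // account at allocation time
  }

  long ramBytesUsed() {
    return ramBytesUsed; // constant time, and valid even if blocks have different capacities
  }
}
{code}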
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…
mikemccand commented on a change in pull request #1893: URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494398560 ## File path: lucene/facet/src/test/org/apache/lucene/facet/FacetTestCase.java ## @@ -56,6 +60,28 @@ public Facets getTaxonomyFacetCounts(TaxonomyReader taxoReader, FacetsConfig con return facets; } + public List> getTaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, FacetsCollector fc) throws IOException { Review comment: Thank you for adding this utility method so tests can easily use the new utility class! Can we rename this to `getAllTaxonomyFacetLabels`, and add javadoc explaining that the outer list is one entry per matched hit, and the inner list is one entry per `FacetLabel` belonging to that hit? ## File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java ## @@ -726,6 +743,39 @@ public void testRandom() throws Exception { IOUtils.close(tw, searcher.getIndexReader(), tr, indexDir, taxoDir); } + private static List> sortedFacetLabels(List> allfacetLabels) { +for (List facetLabels : allfacetLabels) { + Collections.sort(facetLabels); +} + +Collections.sort(allfacetLabels, (o1, o2) -> { + if (o1 == null) { Review comment: Hmm why are these `null` checks necessary? Are we really seeing `null` in the argument? Oh, I guess this legitimately happens when the hit had no facets? Maybe add a comment? Hmm, actually, looking at how actual and expected are populated, neither of them seems to add `null`? One of them filters out empty list but the other does not? ## File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java ## @@ -711,6 +723,11 @@ public void testRandom() throws Exception { } } + // Test facet labels for each matching test doc + List> actualLabels = getTaxonomyFacetLabels(tr, config, fc); + assertEquals(expectedLabels.size(), actualLabels.size()); Review comment: Hmm I think `expectedLabels` filters out empty `List` but `actualLabels` does not, so this might false trip? ## File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java ## @@ -726,6 +743,39 @@ public void testRandom() throws Exception { IOUtils.close(tw, searcher.getIndexReader(), tr, indexDir, taxoDir); } + private static List> sortedFacetLabels(List> allfacetLabels) { +for (List facetLabels : allfacetLabels) { + Collections.sort(facetLabels); +} + +Collections.sort(allfacetLabels, (o1, o2) -> { Review comment: I'm confused why we are sorting the top list? Isn't the top list in order of the hits? And we want to confirm, for a given `docId` hit, that expected and actual labels match? OK, I think I understand: this test does not index anything allowing you to track which original doc mapped to which `FacetLabel`, so then you cannot know, per segment, which docs ended up where :) Given that, I think it's OK to do the top-level sort of all `List` across all hits. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
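For reference, the normalization this review discusses — sort each per-hit label list, then sort the outer list lexicographically so expected and actual can be compared regardless of hit order — could be written as the sketch below; it assumes no null entries, which is exactly the point the reviewer questions:
{code:java}
static void sortLabelLists(List<List<FacetLabel>> allLabels) {
  for (List<FacetLabel> labels : allLabels) {
    Collections.sort(labels); // FacetLabel is Comparable
  }
  allLabels.sort((a, b) -> {
    int n = Math.min(a.size(), b.size());
    for (int i = 0; i < n; i++) {
      int cmp = a.get(i).compareTo(b.get(i));
      if (cmp != 0) {
        return cmp;
      }
    }
    return Integer.compare(a.size(), b.size()); // on a tie, the shorter list sorts first
  });
}
{code}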
[GitHub] [lucene-solr] cpoerschke commented on pull request #1890: Rename ConfigSetsAPITest to TestConfigSetsAPISolrCloud
cpoerschke commented on pull request #1890: URL: https://github.com/apache/lucene-solr/pull/1890#issuecomment-698439045 > ... if you could give me a couple days ... #1892 Sure, no problem at all. Now that `TestConfigSetsAPI` extends `SolrCloudTestCase` too, I wonder * whether `TestConfigSetsAPISolrCloud` would still be a good replacement for `ConfigSetsAPITest`, or * whether adding the `ConfigSetsAPITest` functionality to `TestConfigSetsAPI` might be better? (Haven't looked at the details yet.) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] cpoerschke merged pull request #1825: SOLR-14828: reduce 'error' logging noise in BaseCloudSolrClient.requestWithRetryOnStaleState
cpoerschke merged pull request #1825: URL: https://github.com/apache/lucene-solr/pull/1825 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14828) reduce 'error' logging noise in BaseCloudSolrClient.requestWithRetryOnStaleState
[ https://issues.apache.org/jira/browse/SOLR-14828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201612#comment-17201612 ] ASF subversion and git services commented on SOLR-14828: Commit 876de8be41a837b83ef7ea6b82b322ed829b0595 in lucene-solr's branch refs/heads/master from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=876de8b ] SOLR-14828: reduce 'error' logging noise in BaseCloudSolrClient.requestWithRetryOnStaleState (#1825) > reduce 'error' logging noise in > BaseCloudSolrClient.requestWithRetryOnStaleState > > > Key: SOLR-14828 > URL: https://issues.apache.org/jira/browse/SOLR-14828 > Project: Solr > Issue Type: Task > Components: SolrJ >Reporter: Christine Poerschke >Assignee: Christine Poerschke >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > Currently -- e.g. > https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.6.2/solr/solrj/src/java/org/apache/solr/client/solrj/impl/BaseCloudSolrClient.java#L960-L961 > -- an error is logged even if request retrying will happen (and hopefully > succeed). > This task proposes to 'info' or 'warn' rather than 'error' log if the request > will be retried. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
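A sketch of the logging policy this change describes (variable names are illustrative, not the actual BaseCloudSolrClient code): log at ERROR only when the request will not be retried, and demote to WARN otherwise:
{code:java}
if (willRetry) {
  log.warn("Request to collection {} failed (attempt {}/{}); retrying", collection, attempt, maxRetries, cause);
} else {
  log.error("Request to collection {} failed after {} attempts; giving up", collection, attempt, cause);
}
{code}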
[GitHub] [lucene-solr] cpoerschke merged pull request #1913: SOLR-11167: Avoid $SOLR_STOP_WAIT use during 'bin/solr start' if $SOLR_START_WAIT is supplied.
cpoerschke merged pull request #1913: URL: https://github.com/apache/lucene-solr/pull/1913 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-11167) bin/solr uses $SOLR_STOP_WAIT during start
[ https://issues.apache.org/jira/browse/SOLR-11167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201615#comment-17201615 ] ASF subversion and git services commented on SOLR-11167: Commit ea77d242377d942912525f76c307de568c2b3d90 in lucene-solr's branch refs/heads/master from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ea77d24 ] SOLR-11167: Avoid $SOLR_STOP_WAIT use during 'bin/solr start' if $SOLR_START_WAIT is supplied. (#1913) > bin/solr uses $SOLR_STOP_WAIT during start > -- > > Key: SOLR-11167 > URL: https://issues.apache.org/jira/browse/SOLR-11167 > Project: Solr > Issue Type: Improvement > Components: scripts and tools >Reporter: Christine Poerschke >Assignee: Christine Poerschke >Priority: Minor > Fix For: master (9.0), 8.7 > > Attachments: SOLR-11167.patch > > Time Spent: 10m > Remaining Estimate: 0h > > bin/solr using $SOLR_STOP_WAIT during start is unexpected, I think it would > be clearer to have a separate $SOLR_START_WAIT variable. > related minor thing: SOLR_STOP_WAIT is mentioned in solr.in.sh but not in > solr.in.cmd equivalent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] munendrasn merged pull request #1914: Move 9x upgrade notes out of changes.txt
munendrasn merged pull request #1914: URL: https://github.com/apache/lucene-solr/pull/1914 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-11167) bin/solr uses $SOLR_STOP_WAIT during start
[ https://issues.apache.org/jira/browse/SOLR-11167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201622#comment-17201622 ] ASF subversion and git services commented on SOLR-11167: Commit 38ab92da8b6649e4232e4d7fa6833b5cebdff993 in lucene-solr's branch refs/heads/branch_8x from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=38ab92d ] SOLR-11167: Avoid $SOLR_STOP_WAIT use during 'bin/solr start' if $SOLR_START_WAIT is supplied. (#1913) Resolved Conflicts: solr/CHANGES.txt > bin/solr uses $SOLR_STOP_WAIT during start > -- > > Key: SOLR-11167 > URL: https://issues.apache.org/jira/browse/SOLR-11167 > Project: Solr > Issue Type: Improvement > Components: scripts and tools >Reporter: Christine Poerschke >Assignee: Christine Poerschke >Priority: Minor > Fix For: master (9.0), 8.7 > > Attachments: SOLR-11167.patch > > Time Spent: 20m > Remaining Estimate: 0h > > bin/solr using $SOLR_STOP_WAIT during start is unexpected, I think it would > be clearer to have a separate $SOLR_START_WAIT variable. > related minor thing: SOLR_STOP_WAIT is mentioned in solr.in.sh but not in > solr.in.cmd equivalent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14828) reduce 'error' logging noise in BaseCloudSolrClient.requestWithRetryOnStaleState
[ https://issues.apache.org/jira/browse/SOLR-14828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201621#comment-17201621 ] ASF subversion and git services commented on SOLR-14828: Commit eca8aa81718d025e215b21c9e81b4b4620ec8f1e in lucene-solr's branch refs/heads/branch_8x from Christine Poerschke [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=eca8aa8 ] SOLR-14828: reduce 'error' logging noise in BaseCloudSolrClient.requestWithRetryOnStaleState (#1825) > reduce 'error' logging noise in > BaseCloudSolrClient.requestWithRetryOnStaleState > > > Key: SOLR-14828 > URL: https://issues.apache.org/jira/browse/SOLR-14828 > Project: Solr > Issue Type: Task > Components: SolrJ >Reporter: Christine Poerschke >Assignee: Christine Poerschke >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > Currently -- e.g. > https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.6.2/solr/solrj/src/java/org/apache/solr/client/solrj/impl/BaseCloudSolrClient.java#L960-L961 > -- an error is logged even if request retrying will happen (and hopefully > succeed). > This task proposes to 'info' or 'warn' rather than 'error' log if the request > will be retried. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] munendrasn commented on pull request #1371: SOLR-14333: print readable version of CollapsedPostFilter query
munendrasn commented on pull request #1371: URL: https://github.com/apache/lucene-solr/pull/1371#issuecomment-698447210 Fixed the failing tests and updated changes to include deprecation This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] cpoerschke opened a new pull request #1920: branch_8x: add two missing(?) solr/CHANGES.txt entries
cpoerschke opened a new pull request #1920: URL: https://github.com/apache/lucene-solr/pull/1920 Encountered a cherry-pick merge conflict and it seems that these two entries are present in the master branch's solr/CHANGES.txt 8.7 section but (unintentionally?) missing in the branch_8x solr/CHANGES.txt 8.7 section. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-14828) reduce 'error' logging noise in BaseCloudSolrClient.requestWithRetryOnStaleState
[ https://issues.apache.org/jira/browse/SOLR-14828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke resolved SOLR-14828. Fix Version/s: 8.7 master (9.0) Resolution: Fixed > reduce 'error' logging noise in > BaseCloudSolrClient.requestWithRetryOnStaleState > > > Key: SOLR-14828 > URL: https://issues.apache.org/jira/browse/SOLR-14828 > Project: Solr > Issue Type: Task > Components: SolrJ >Reporter: Christine Poerschke >Assignee: Christine Poerschke >Priority: Minor > Fix For: master (9.0), 8.7 > > Time Spent: 50m > Remaining Estimate: 0h > > Currently -- e.g. > https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.6.2/solr/solrj/src/java/org/apache/solr/client/solrj/impl/BaseCloudSolrClient.java#L960-L961 > -- an error is logged even if request retrying will happen (and hopefully > succeed). > This task proposes to 'info' or 'warn' rather than 'error' log if the request > will be retried. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-11167) bin/solr uses $SOLR_STOP_WAIT during start
[ https://issues.apache.org/jira/browse/SOLR-11167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke resolved SOLR-11167. Resolution: Fixed > bin/solr uses $SOLR_STOP_WAIT during start > -- > > Key: SOLR-11167 > URL: https://issues.apache.org/jira/browse/SOLR-11167 > Project: Solr > Issue Type: Improvement > Components: scripts and tools >Reporter: Christine Poerschke >Assignee: Christine Poerschke >Priority: Minor > Fix For: master (9.0), 8.7 > > Attachments: SOLR-11167.patch > > Time Spent: 20m > Remaining Estimate: 0h > > bin/solr using $SOLR_STOP_WAIT during start is unexpected, I think it would > be clearer to have a separate $SOLR_START_WAIT variable. > related minor thing: SOLR_STOP_WAIT is mentioned in solr.in.sh but not in > solr.in.cmd equivalent. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] tflobbe commented on pull request #1920: branch_8x: add two missing(?) solr/CHANGES.txt entries
tflobbe commented on pull request #1920: URL: https://github.com/apache/lucene-solr/pull/1920#issuecomment-698457823 ugh, looks like I forgot to backport the CHANGES entry in my change. Thanks Christine. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] munendrasn merged pull request #1900: SOLR-14036: Remove explicit distrib=false from /terms handler
munendrasn merged pull request #1900: URL: https://github.com/apache/lucene-solr/pull/1900 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14036) TermsComponent distributed search (shards) doesn't work with SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-14036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201648#comment-17201648 ] ASF subversion and git services commented on SOLR-14036: Commit ac5847231017f12d6a51348e1cdbd50c9732a224 in lucene-solr's branch refs/heads/master from Munendra S N [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ac58472 ] SOLR-14036: Remove explicit distrib=false from /terms handler (#1900) * Remove distrib=false from /terms handler so that terms are returned from across all shards instead of a single local shard. * cleanup shards parameter handling in TermsComponent. This is handled in HttpShardHandler * Remove redundant tests for shard whitelist * remove redundant terms params from ScoreNodeStream > TermsComponent distributed search (shards) doesn't work with SolrCloud > -- > > Key: SOLR-14036 > URL: https://issues.apache.org/jira/browse/SOLR-14036 > Project: Solr > Issue Type: Improvement > Components: SearchComponents - other >Reporter: David Smiley >Assignee: Munendra S N >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > My colleagues [~bruno.roustant] and [~antogruz] attempted to use the > {{TermsComponent}} in SolrCloud on a collection with multiple shards. The > results were inconsistent depending on which shard the client was talking > with. Looking at the prepare() method, I can see this component reads the > "shards" param. It should not have been coded that way; the SearchHandler or > related machinery is responsible for parsing/processing that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14036) TermsComponent distributed search (shards) doesn't work with SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-14036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Munendra S N updated SOLR-14036: Fix Version/s: master (9.0) Resolution: Fixed (was: Invalid) Status: Resolved (was: Patch Available) > TermsComponent distributed search (shards) doesn't work with SolrCloud > -- > > Key: SOLR-14036 > URL: https://issues.apache.org/jira/browse/SOLR-14036 > Project: Solr > Issue Type: Improvement > Components: SearchComponents - other >Reporter: David Smiley >Assignee: Munendra S N >Priority: Major > Fix For: master (9.0) > > Time Spent: 2h > Remaining Estimate: 0h > > My colleagues [~bruno.roustant] and [~antogruz] attempted to use the > {{TermsComponent}} in SolrCloud on a collection with multiple shards. The > results were inconsistent depending on which shard the client was talking > with. Looking at the prepare() method, I can see this component reads the > "shards" param. It should not have been coded that way; the SearchHandler or > related machinery is responsible for parsing/processing that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml
[ https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201659#comment-17201659 ] Chris M. Hostetter commented on SOLR-14889: --- thanks guys ... this is very enlightening. {quote}The "doFirst" and "doLast" clauses in tasks add an anonymous closure to the list of things to run in execution phase. So even if a script looks linear, the closure in doFirst and doLast runs at a completely different time than what's outside of it. {quote} doFirst/doLast are actually the most straight foward aspect of the gradle lifecycle as far as i can tell : ) ... what's blowing my mind is what Uwe pointed out, about how during "configuration" the task execution code evidently ... "captures" (for lack of a better word) ... references to the variables/input used that code, so that _re-assigning_ to those variables in doFirst had no effect, but modifying the objects those variables pointed to did. ...that's wild... ... and really nothing i'd seen in any of the gradle tutotrials/lifecycle discussion really prepared me for that. I have a few lingering questions, mostly about some of the ancillary stuff, in the latest patch (i didn't dig into who changed what) ... * can we make `templateProps` final now? ... i didn't realize groovy supported final as a keyword, and seems like a good idea for as many things as possible to be final ** should we use `asImmutable()` which is apparently a groovy add-on for on maps/collections? * The "TODO 2" comment is stale and makes sense to remove, but shouldn't the "TODO 1" and "TODO 3" comments stick around? ... those are still applicable aren't they? * why is buildSiteJekyll now hooked into the "assemble" task? ** my understanding is that "assemble" is for building the lucene/solr artifacts/distribution – but the ref-guide shouldn't be included in that, we don't "release" it officially ** even if it does make sense to hook into "assemble" why is it hooking directly to buildSiteJekyll and not buildSite ? *** if that was to avoid the "check" style validation `buildSite` does ok, but as discussed in SOLR-14870 the way forward there seemed to be to pull it out of buildSite into it's own "checkSite" task .. "buildSite" is the main task people should know about/run ... buildSiteJekyll (as a task name) is an implementation detail that should really go away > improve templated variable escaping in ref-guide _config.yml > > > Key: SOLR-14889 > URL: https://issues.apache.org/jira/browse/SOLR-14889 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-14889.patch, SOLR-14889.patch, SOLR-14889.patch, > SOLR-14889.patch, SOLR-14889.patch > > > SOLR-14824 ran into windows failures when we switching from using a hardcoded > "relative" path to the solrRootPath to using groovy/project variables to get > the path. the reason for the failures was that the path us used as a > variable tempted into {{_config.yml.template}} to build the {{_config.yml}} > file, but on windows the path seperater of '\' was being parsed by > jekyll/YAML as a string escape character. > (This wasn't a problem we ran into before, even on windows, prior to the > SOLR-14824 changes, because the hardcoded relative path only used '/' > delimiters, which (j)ruby was happy to work with, even on windows. > As Uwe pointed out when hotfixing this... 
> {quote}Problem was that backslashes are used to escape strings, but windows > paths also have those. Fix was to add StringEscapeUtils, but I don't like > this too much. Maybe we find a better solution to make special characters in > those properties escaped correctly when used in strings inside templates. > {quote} > ...the current fix of using {{StringEscapeUtils.escapeJava}} - only for this > one variable -- doesn't really protect other variables that might have > special charactes in them down the road, and while "escapeJava" work ok for > the "\" issue, it isn't neccessarily consistent with all YAML escapse, which > could lead to even weird bugs/cofusion down the road. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
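The "capture" behaviour being puzzled over above can be shown in plain Java, outside of Gradle/Groovy (all names illustrative): a method called at configuration time stores the reference it is handed, so re-assigning the local variable later changes nothing for the task, while mutating the referenced object is visible:
{code:java}
import java.util.ArrayList;
import java.util.List;

public class CaptureDemo {
  static class Task {
    private List<String> props;
    void setProps(List<String> props) { this.props = props; } // stores the reference
    void execute() { System.out.println(props); }             // think: doFirst/doLast
  }

  public static void main(String[] args) {
    List<String> templateProps = new ArrayList<>(List.of("a"));
    Task task = new Task();
    task.setProps(templateProps);      // "configuration phase": reference handed to the task

    templateProps.add("b");            // mutating the original object: the task sees it
    templateProps = List.of("x");      // re-assignment: the task still holds the old list

    task.execute();                    // prints [a, b]
  }
}
{code}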
[jira] [Commented] (SOLR-14843) Define strongly-typed cluster configuration API
[ https://issues.apache.org/jira/browse/SOLR-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201668#comment-17201668 ] Tomas Eduardo Fernandez Lobbe commented on SOLR-14843: -- I don’t know what Andrzej was thinking when he created this Jira, but what thought when I saw it was something like: The “consumer” side (our code, components, etc) could look something like: {code:java} int myInt = Config.getInteger(“some.configurable.thing”, default: 30); String myStr = Config.getString(“some.configurable.string”, default: “foo”); MyObject myStr = Config.get(“some.configurable.obj”, new SomeSortOfFactory()); {code} Maybe even be able to support attach an onChange event, like {code:java} int myInt = Config.getInteger(“some.configurable.thing”, default: 30, onChange: (v) -> { setMyInt(v); refresh()}); {code} or something. Then, this {{Config}} class could load the configuration from a predictable hierarchy, something like: {noformat} system props > env > cluster props > node props {noformat} (don’t know if that’s the right order, and again, there could be more than one hierarchy), so that a property can be set in the node configuration, but could be overriden by collection level properties, etc. One extra nice thing of an approach like this is that we could have an API to show exactly the current configuration and where each config is coming from, something like: {code} some.configurable.string: { value: “bar”, source: “collection property” } some.configurable.thing: { value: 30, source: “default” } {code} Maybe even a timestamp of the change or something. > Define strongly-typed cluster configuration API > --- > > Key: SOLR-14843 > URL: https://issues.apache.org/jira/browse/SOLR-14843 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Andrzej Bialecki >Priority: Major > Labels: clean-api > Fix For: master (9.0) > > > Current cluster-level configuration uses a hodgepodge of traditional Solr > config sources (solr.xml, system properties) and the new somewhat arbitrary > config files kept in ZK ({{/clusterprops.json, /security.json, > /packages.json, /autoscaling.json}} etc...). There's no uniform > strongly-typed API to access and manage these configs - currently each config > source has its own CRUD, often relying on direct access to Zookeeper. There's > also no uniform method for monitoring changes to these config sources. > This issue proposes a uniform config API facade with the following > characteristics: > * Using a single hierarchical (or at least key-based) facade for accessing > any global config. > * Using strongly-typed sub-system configs instead of opaque Map-s: > components would no longer deal with JSON parsing/writing, instead they would > use properly annotated Java objects for config CRUD. Config objects would > include versioning information (eg. lastModified timestamp). > * Isolating access to the underlying config persistence layer: components > would no longer directly interact with Zookeeper or files. Most likely the > default implementation would continue using different ZK files per-subsystem > in order to limit the complexity of file formats and to reduce the cost of > notifications for unmodified parts of the configs. > * Providing uniform way to register listeners for monitoring changes in > specific configs: components would no longer need to interact with ZK > watches, they would instead be notified about modified configs that they are > interested in. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
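To make the idea above slightly more concrete, here is a minimal sketch of such a facade (illustrative only, not a proposed Solr API): typed getters with defaults, resolved against an ordered list of sources, which also makes "where did this value come from" easy to answer:
{code:java}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

class ConfigFacadeSketch {
  // Highest precedence first, e.g. system props > env > cluster props > node props.
  private final Map<String, Map<String, String>> sources = new LinkedHashMap<>();

  void addSource(String name, Map<String, String> values) { sources.put(name, values); }

  int getInteger(String key, int defaultValue) {
    return lookup(key).map(Integer::parseInt).orElse(defaultValue);
  }

  String getString(String key, String defaultValue) {
    return lookup(key).orElse(defaultValue);
  }

  /** Returns the name of the source that supplies {@code key}, or "default". */
  String sourceOf(String key) {
    for (Map.Entry<String, Map<String, String>> e : sources.entrySet()) {
      if (e.getValue().containsKey(key)) {
        return e.getKey();
      }
    }
    return "default";
  }

  private Optional<String> lookup(String key) {
    for (Map<String, String> source : sources.values()) {
      String v = source.get(key);
      if (v != null) {
        return Optional.of(v);
      }
    }
    return Optional.empty();
  }
}
{code}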
[jira] [Commented] (SOLR-14503) Solr does not respect waitForZk (SOLR_WAIT_FOR_ZK) property
[ https://issues.apache.org/jira/browse/SOLR-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201682#comment-17201682 ] ASF subversion and git services commented on SOLR-14503: Commit ddd10725b00649edc80726c59f9fdf0442adb6c2 in lucene-solr's branch refs/heads/master from Munendra S N [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=ddd1072 ] SOLR-14503: use specified waitForZk val as conn timeout for zk * Also, consume SOLR_WAIT_FOR_ZK in bin/solr.cmd > Solr does not respect waitForZk (SOLR_WAIT_FOR_ZK) property > --- > > Key: SOLR-14503 > URL: https://issues.apache.org/jira/browse/SOLR-14503 > Project: Solr > Issue Type: Bug >Affects Versions: 7.1, 7.2, 7.2.1, 7.3, 7.3.1, 7.4, 7.5, 7.6, 7.7, 7.7.1, > 7.7.2, 8.0, 8.1, 8.2, 7.7.3, 8.1.1, 8.3, 8.4, 8.3.1, 8.5, 8.4.1, 8.5.1 >Reporter: Colvin Cowie >Assignee: Munendra S N >Priority: Minor > Attachments: SOLR-14503.patch, SOLR-14503.patch > > > When starting Solr in cloud mode, if zookeeper is not available within 30 > seconds, then core container intialization fails and the node will not > recover when zookeeper is available. > > I believe SOLR-5129 should have addressed this issue, however it doesn't > quite do so for two reasons: > # > [https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L297] > it calls {{SolrZkClient(String zkServerAddress, int zkClientTimeout)}} > rather than {{SolrZkClient(String zkServerAddress, int zkClientTimeout, int > zkClientConnectTimeout)}} so the DEFAULT_CLIENT_CONNECT_TIMEOUT of 30 seconds > is used even when you specify a different waitForZk value > # bin/solr contains script to set -DwaitForZk from the SOLR_WAIT_FOR_ZK > environment property > [https://github.com/apache/lucene-solr/blob/master/solr/bin/solr#L2148] but > there is no corresponding assignment in bin/solr.cmd, while SOLR_WAIT_FOR_ZK > appears in the solr.in.cmd as an example. > > I will attach a patch that fixes the above. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
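The gist of the fix, as a hedged sketch (variable names and the seconds-to-millis conversion are illustrative, not the exact patch): pass the configured wait as the explicit connect timeout via the three-argument constructor the issue mentions, instead of letting the 30-second default apply:
{code:java}
// Sketch only: waitForZk is read from the system property set by bin/solr / bin/solr.cmd.
int waitForZkSeconds = Integer.getInteger("waitForZk", 30);
SolrZkClient zkClient = new SolrZkClient(zkServerAddress, zkClientTimeout,
    (int) TimeUnit.SECONDS.toMillis(waitForZkSeconds));
{code}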
[jira] [Commented] (SOLR-14503) Solr does not respect waitForZk (SOLR_WAIT_FOR_ZK) property
[ https://issues.apache.org/jira/browse/SOLR-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201683#comment-17201683 ] Munendra S N commented on SOLR-14503: - Beasted the test before committing, will shortly backport to 8x {code:java} ./gradlew -p solr/core beast -Ptests.dups=10 --tests ZkFailoverTest {code} > Solr does not respect waitForZk (SOLR_WAIT_FOR_ZK) property > --- > > Key: SOLR-14503 > URL: https://issues.apache.org/jira/browse/SOLR-14503 > Project: Solr > Issue Type: Bug >Affects Versions: 7.1, 7.2, 7.2.1, 7.3, 7.3.1, 7.4, 7.5, 7.6, 7.7, 7.7.1, > 7.7.2, 8.0, 8.1, 8.2, 7.7.3, 8.1.1, 8.3, 8.4, 8.3.1, 8.5, 8.4.1, 8.5.1 >Reporter: Colvin Cowie >Assignee: Munendra S N >Priority: Minor > Attachments: SOLR-14503.patch, SOLR-14503.patch > > > When starting Solr in cloud mode, if zookeeper is not available within 30 > seconds, then core container intialization fails and the node will not > recover when zookeeper is available. > > I believe SOLR-5129 should have addressed this issue, however it doesn't > quite do so for two reasons: > # > [https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L297] > it calls {{SolrZkClient(String zkServerAddress, int zkClientTimeout)}} > rather than {{SolrZkClient(String zkServerAddress, int zkClientTimeout, int > zkClientConnectTimeout)}} so the DEFAULT_CLIENT_CONNECT_TIMEOUT of 30 seconds > is used even when you specify a different waitForZk value > # bin/solr contains script to set -DwaitForZk from the SOLR_WAIT_FOR_ZK > environment property > [https://github.com/apache/lucene-solr/blob/master/solr/bin/solr#L2148] but > there is no corresponding assignment in bin/solr.cmd, while SOLR_WAIT_FOR_ZK > appears in the solr.in.cmd as an example. > > I will attach a patch that fixes the above. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14503) Solr does not respect waitForZk (SOLR_WAIT_FOR_ZK) property
[ https://issues.apache.org/jira/browse/SOLR-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201692#comment-17201692 ] ASF subversion and git services commented on SOLR-14503: Commit 894f91100d3bc1eab8332c9066222d99572393a3 in lucene-solr's branch refs/heads/branch_8x from Munendra S N [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=894f911 ] SOLR-14503: use specified waitForZk val as conn timeout for zk * Also, consume SOLR_WAIT_FOR_ZK in bin/solr.cmd > Solr does not respect waitForZk (SOLR_WAIT_FOR_ZK) property > --- > > Key: SOLR-14503 > URL: https://issues.apache.org/jira/browse/SOLR-14503 > Project: Solr > Issue Type: Bug >Affects Versions: 7.1, 7.2, 7.2.1, 7.3, 7.3.1, 7.4, 7.5, 7.6, 7.7, 7.7.1, > 7.7.2, 8.0, 8.1, 8.2, 7.7.3, 8.1.1, 8.3, 8.4, 8.3.1, 8.5, 8.4.1, 8.5.1 >Reporter: Colvin Cowie >Assignee: Munendra S N >Priority: Minor > Attachments: SOLR-14503.patch, SOLR-14503.patch > > > When starting Solr in cloud mode, if zookeeper is not available within 30 > seconds, then core container intialization fails and the node will not > recover when zookeeper is available. > > I believe SOLR-5129 should have addressed this issue, however it doesn't > quite do so for two reasons: > # > [https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L297] > it calls {{SolrZkClient(String zkServerAddress, int zkClientTimeout)}} > rather than {{SolrZkClient(String zkServerAddress, int zkClientTimeout, int > zkClientConnectTimeout)}} so the DEFAULT_CLIENT_CONNECT_TIMEOUT of 30 seconds > is used even when you specify a different waitForZk value > # bin/solr contains script to set -DwaitForZk from the SOLR_WAIT_FOR_ZK > environment property > [https://github.com/apache/lucene-solr/blob/master/solr/bin/solr#L2148] but > there is no corresponding assignment in bin/solr.cmd, while SOLR_WAIT_FOR_ZK > appears in the solr.in.cmd as an example. > > I will attach a patch that fixes the above. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14503) Solr does not respect waitForZk (SOLR_WAIT_FOR_ZK) property
[ https://issues.apache.org/jira/browse/SOLR-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Munendra S N updated SOLR-14503: Fix Version/s: 8.7 Resolution: Fixed Status: Resolved (was: Patch Available) Thanks [~cjcowie] > Solr does not respect waitForZk (SOLR_WAIT_FOR_ZK) property > --- > > Key: SOLR-14503 > URL: https://issues.apache.org/jira/browse/SOLR-14503 > Project: Solr > Issue Type: Bug >Affects Versions: 7.1, 7.2, 7.2.1, 7.3, 7.3.1, 7.4, 7.5, 7.6, 7.7, 7.7.1, > 7.7.2, 8.0, 8.1, 8.2, 7.7.3, 8.1.1, 8.3, 8.4, 8.3.1, 8.5, 8.4.1, 8.5.1 >Reporter: Colvin Cowie >Assignee: Munendra S N >Priority: Minor > Fix For: 8.7 > > Attachments: SOLR-14503.patch, SOLR-14503.patch > > > When starting Solr in cloud mode, if zookeeper is not available within 30 > seconds, then core container intialization fails and the node will not > recover when zookeeper is available. > > I believe SOLR-5129 should have addressed this issue, however it doesn't > quite do so for two reasons: > # > [https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java#L297] > it calls {{SolrZkClient(String zkServerAddress, int zkClientTimeout)}} > rather than {{SolrZkClient(String zkServerAddress, int zkClientTimeout, int > zkClientConnectTimeout)}} so the DEFAULT_CLIENT_CONNECT_TIMEOUT of 30 seconds > is used even when you specify a different waitForZk value > # bin/solr contains script to set -DwaitForZk from the SOLR_WAIT_FOR_ZK > environment property > [https://github.com/apache/lucene-solr/blob/master/solr/bin/solr#L2148] but > there is no corresponding assignment in bin/solr.cmd, while SOLR_WAIT_FOR_ZK > appears in the solr.in.cmd as an example. > > I will attach a patch that fixes the above. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14843) Define strongly-typed cluster configuration API
[ https://issues.apache.org/jira/browse/SOLR-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201732#comment-17201732 ] Ilan Ginzburg commented on SOLR-14843: -- I think we need to think the hierarchy carefully... Node props being overridden by cluster props might work, but I can easily think of use cases where the opposite makes sense as well, for example configuring a one off node in an otherwise homogeneous cluster, but then system properties and environment variables (which really are node props) override all the rest... I believe a proposal needs to also have another dimension of where these configurations come from. For system properties and environment variables it's pretty simple, but cluster and node props can be in some central place (ZK) or can be defined within the Solr distribution (file) and as such can end up being different on each node (nothing prevents deploying slightly different images on the nodes, or changing the node config after deploy). What I really need short term is a way to do what {{solr.xml}} allows me doing (define default config, let the user change them before deploy if he so wishes). We do not currently have a replacement for this. > Define strongly-typed cluster configuration API > --- > > Key: SOLR-14843 > URL: https://issues.apache.org/jira/browse/SOLR-14843 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Andrzej Bialecki >Priority: Major > Labels: clean-api > Fix For: master (9.0) > > > Current cluster-level configuration uses a hodgepodge of traditional Solr > config sources (solr.xml, system properties) and the new somewhat arbitrary > config files kept in ZK ({{/clusterprops.json, /security.json, > /packages.json, /autoscaling.json}} etc...). There's no uniform > strongly-typed API to access and manage these configs - currently each config > source has its own CRUD, often relying on direct access to Zookeeper. There's > also no uniform method for monitoring changes to these config sources. > This issue proposes a uniform config API facade with the following > characteristics: > * Using a single hierarchical (or at least key-based) facade for accessing > any global config. > * Using strongly-typed sub-system configs instead of opaque Map-s: > components would no longer deal with JSON parsing/writing, instead they would > use properly annotated Java objects for config CRUD. Config objects would > include versioning information (eg. lastModified timestamp). > * Isolating access to the underlying config persistence layer: components > would no longer directly interact with Zookeeper or files. Most likely the > default implementation would continue using different ZK files per-subsystem > in order to limit the complexity of file formats and to reduce the cost of > notifications for unmodified parts of the configs. > * Providing uniform way to register listeners for monitoring changes in > specific configs: components would no longer need to interact with ZK > watches, they would instead be notified about modified configs that they are > interested in. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml
[ https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201749#comment-17201749 ] Dawid Weiss commented on SOLR-14889: bq. references to the variables/input used that code, so that re-assigning to those variables in doFirst had no effect, but modifying the objects those variables pointed to did. I think what makes it more complex is that some of these "assignments" are actually syntactic sugar for method calls with parameters. So you're calling a method with a reference to a local variable - this method stores that reference somewhere. When you assign a different object to that variable later, nothing really happens to the reference previously published. I don't think there's much magic involved there, really - it's the syntactic sugar that makes some of these calls not obvious. If we added brackets around method calls it'd probably be clearer. bq. but shouldn't the "TODO 1" and "TODO 3" comments stick around? ... those are still applicable aren't they? Yes, but it's probably better to file a jira issue than leave them as comments. bq. why is buildSiteJekyll now hooked into the "assemble" task? assemble is a base plugin's convention task for "assembling the outcomes" of a project. It's not related to lucene/solr distribution - we can *use* whatever is assembled in the packaging project, but we don't have to. When you see an unknown gradle project you'd typically run 'gradlew assemble' to build stuff. Much like 'mvn package' works. The Maven analogy works with "check" too (mvn validate). Please change the task names according to your expertise - it seemed to me that buildSite runs the assemble and validation (check) - it was an arbitrary choice on my behalf to just hook it up this way. > improve templated variable escaping in ref-guide _config.yml > > > Key: SOLR-14889 > URL: https://issues.apache.org/jira/browse/SOLR-14889 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-14889.patch, SOLR-14889.patch, SOLR-14889.patch, > SOLR-14889.patch, SOLR-14889.patch > > > SOLR-14824 ran into windows failures when we switched from using a hardcoded > "relative" path to the solrRootPath to using groovy/project variables to get > the path. the reason for the failures was that the path is used as a > variable templated into {{_config.yml.template}} to build the {{_config.yml}} > file, but on windows the path separator of '\' was being parsed by > jekyll/YAML as a string escape character. > (This wasn't a problem we ran into before, even on windows, prior to the > SOLR-14824 changes, because the hardcoded relative path only used '/' > delimiters, which (j)ruby was happy to work with, even on windows.) > As Uwe pointed out when hotfixing this... > {quote}Problem was that backslashes are used to escape strings, but windows > paths also have those. Fix was to add StringEscapeUtils, but I don't like > this too much. Maybe we find a better solution to make special characters in > those properties escaped correctly when used in strings inside templates.
> {quote} > ...the current fix of using {{StringEscapeUtils.escapeJava}} - only for this > one variable -- doesn't really protect other variables that might have > special characters in them down the road, and while "escapeJava" works ok for > the "\" issue, it isn't necessarily consistent with all YAML escapes, which > could lead to even weird bugs/confusion down the road. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
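The published-reference behaviour Dawid describes is plain JVM semantics, independent of the Groovy sugar. A tiny Java analogue (illustrative only, not the actual Gradle build code): once a reference has been handed to a method that stores it, re-assigning the local variable has no effect on what was stored, while mutating the referenced object does:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal Java analogue of the behaviour discussed above; in the real case the
// "assignment" is Groovy sugar for a method call that stores the reference.
class Publisher {
  private List<String> stored;

  void publish(List<String> value) {   // stores the reference it was given
    this.stored = value;
  }

  List<String> stored() { return stored; }
}

class ReferenceDemo {
  public static void main(String[] args) {
    Publisher publisher = new Publisher();

    List<String> props = new ArrayList<>(List.of("a"));
    publisher.publish(props);                 // a reference to this list is now published

    props.add("b");                           // mutating the referenced object IS visible
    props = new ArrayList<>(List.of("x"));    // re-assigning the local variable is NOT

    System.out.println(publisher.stored());   // prints [a, b], not [x]
  }
}
```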
[GitHub] [lucene-solr] dweiss commented on pull request #1919: Compute RAM usage ByteBuffersDataOutput on the fly.
dweiss commented on pull request #1919: URL: https://github.com/apache/lucene-solr/pull/1919#issuecomment-698564766 I like the explicit field too, actually - even if you're right about the capacity of internal blocks, Adrien. This assertion there may actually be my mistake and the capacity of a new block should actually be its limit (remaining free space)... Or maybe I did have capacity in mind (can't remember, to be honest).
```
currentBlock = blockAllocate.apply(requiredBlockSize);
assert currentBlock.capacity() == requiredBlockSize;
```
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
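For context, a simplified sketch of the trade-off under discussion: an explicit `ramBytesUsed` field updated at block-allocation time versus recomputing the value by walking the allocated blocks. This is illustrative only, not Lucene's actual `ByteBuffersDataOutput` code:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Simplified sketch (not ByteBuffersDataOutput): contrasts an explicit running
// counter with recomputing RAM usage on the fly by iterating the allocated blocks.
class BlockBuffer {
  private final List<ByteBuffer> blocks = new ArrayList<>();
  private long ramBytesUsed;                 // explicit field, updated on allocation

  void allocateBlock(int requiredBlockSize) {
    ByteBuffer currentBlock = ByteBuffer.allocate(requiredBlockSize);
    assert currentBlock.capacity() == requiredBlockSize;
    blocks.add(currentBlock);
    ramBytesUsed += currentBlock.capacity(); // O(1) bookkeeping per allocation
  }

  long ramBytesUsed() {                      // cheap: just return the counter
    return ramBytesUsed;
  }

  long ramBytesUsedComputed() {              // "on the fly" alternative: O(#blocks)
    long sum = 0;
    for (ByteBuffer block : blocks) {
      sum += block.capacity();
    }
    return sum;
  }
}
```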
[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…
goankur commented on a change in pull request #1893: URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494637269 ## File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java ## @@ -711,6 +723,11 @@ public void testRandom() throws Exception { } } + // Test facet labels for each matching test doc + List<List<FacetLabel>> actualLabels = getTaxonomyFacetLabels(tr, config, fc); + assertEquals(expectedLabels.size(), actualLabels.size()); Review comment: Nice catch, thanks. I fixed `actualLabels` generation in the `FacetTestCase.getAllTaxonomyFacetLabels()` method to filter out empty lists. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14889) improve templated variable escaping in ref-guide _config.yml
[ https://issues.apache.org/jira/browse/SOLR-14889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201802#comment-17201802 ] Uwe Schindler commented on SOLR-14889: -- bq. can we make `templateProps` final now? ... i didn't realize groovy supported final as a keyword, and seems like a good idea for as many things as possible to be final No, that's not possible. You can only make local variables or real class members final in Groovy. The code I added was just a local variable that was accessed from the closure. The "ext" properties are properties and can be modified at any time. I was just saying: they should only be used during the configuration phase; during task execution, nothing should change them. bq. why is buildSiteJekyll now hooked into the "assemble" task? As Dawid said, assemble is per project. And the site is already built in "check", so as it's already there, you can also assemble it. I am planning to also make the global javadocs/documentation a separate project with assemble. FYI, I commented in the assemble dependency. > improve templated variable escaping in ref-guide _config.yml > > > Key: SOLR-14889 > URL: https://issues.apache.org/jira/browse/SOLR-14889 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-14889.patch, SOLR-14889.patch, SOLR-14889.patch, > SOLR-14889.patch, SOLR-14889.patch > > > SOLR-14824 ran into windows failures when we switched from using a hardcoded > "relative" path to the solrRootPath to using groovy/project variables to get > the path. the reason for the failures was that the path is used as a > variable templated into {{_config.yml.template}} to build the {{_config.yml}} > file, but on windows the path separator of '\' was being parsed by > jekyll/YAML as a string escape character. > (This wasn't a problem we ran into before, even on windows, prior to the > SOLR-14824 changes, because the hardcoded relative path only used '/' > delimiters, which (j)ruby was happy to work with, even on windows.) > As Uwe pointed out when hotfixing this... > {quote}Problem was that backslashes are used to escape strings, but windows > paths also have those. Fix was to add StringEscapeUtils, but I don't like > this too much. Maybe we find a better solution to make special characters in > those properties escaped correctly when used in strings inside templates. > {quote} > ...the current fix of using {{StringEscapeUtils.escapeJava}} - only for this > one variable -- doesn't really protect other variables that might have > special characters in them down the road, and while "escapeJava" works ok for > the "\" issue, it isn't necessarily consistent with all YAML escapes, which > could lead to even weird bugs/confusion down the road. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…
goankur commented on a change in pull request #1893: URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494654559 ## File path: lucene/facet/src/test/org/apache/lucene/facet/FacetTestCase.java ## @@ -56,6 +60,28 @@ public Facets getTaxonomyFacetCounts(TaxonomyReader taxoReader, FacetsConfig con return facets; } + public List<List<FacetLabel>> getTaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, FacetsCollector fc) throws IOException { Review comment: Done in the next revision. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…
goankur commented on a change in pull request #1893: URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494655549 ## File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java ## @@ -726,6 +743,39 @@ public void testRandom() throws Exception { IOUtils.close(tw, searcher.getIndexReader(), tr, indexDir, taxoDir); } + private static List<List<FacetLabel>> sortedFacetLabels(List<List<FacetLabel>> allfacetLabels) { +for (List<FacetLabel> facetLabels : allfacetLabels) { + Collections.sort(facetLabels); +} + +Collections.sort(allfacetLabels, (o1, o2) -> { + if (o1 == null) { Review comment: Thanks for catching this @mikemccand. I fixed `actualLabels` to exclude empty lists. The null checks were just me being extra cautious. I realized they were unnecessary and removed them :-) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
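A small sketch of the normalization idea being discussed here, using `List<List<String>>` as a stand-in for the per-document facet-label lists (illustrative only, not the actual test code): drop empty per-document lists, sort each inner list, then sort the outer list so expected and actual results can be compared without depending on docId order:

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Sketch of the normalization approach (stand-in types, not the real test):
// remove empty per-document lists, sort each inner list, then sort the outer list
// so the comparison no longer depends on how docIds landed in a random segment.
class LabelNormalizer {
  static List<List<String>> normalize(List<List<String>> allLabels) {
    return allLabels.stream()
        .filter(labels -> !labels.isEmpty())                       // drop docs with no facet labels
        .map(labels -> labels.stream().sorted().collect(Collectors.toList()))
        .sorted(Comparator.comparing(labels -> labels.toString())) // deterministic outer order
        .collect(Collectors.toList());
  }
}

// Usage: assertEquals(LabelNormalizer.normalize(expected), LabelNormalizer.normalize(actual));
```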
[jira] [Assigned] (SOLR-14829) Default components are missing facet_module and terms in documentation
[ https://issues.apache.org/jira/browse/SOLR-14829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexandre Rafalovitch reassigned SOLR-14829: Assignee: Alexandre Rafalovitch (was: Ishan Chattopadhyaya) > Default components are missing facet_module and terms in documentation > -- > > Key: SOLR-14829 > URL: https://issues.apache.org/jira/browse/SOLR-14829 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation, examples >Affects Versions: 8.6.2 >Reporter: Johannes Baiter >Assignee: Alexandre Rafalovitch >Priority: Minor > Attachments: SOLR-14829.patch > > > In the reference guide, the list of search components that are enabled by > default is missing the {{facet_module}} and {{terms}} components. The terms > component is instead listed under "other useful components", while the > {{FacetModule}} is never listed anywhere in the documentation, despite it > being neccessary for the JSON Facet API to work. > This is also how I stumbled upon this, I spent hours trying to figure out why > JSON-based faceting was not working with my setup, after taking a glance at > the {{SearchHandler}} source code based on a hunch, it became clear that my > custom list of search components (created based on the list in the reference > guide) was to blame. > A patch for the documentation gap is attached, but I think there are some > other issues with the naming/documentation around the two faceting APIs that > may be worth discussing: > * The names {{facet_module}} / {{FacetModule}} are very misleading, since > the documentation is always talking about the "JSON Facet API", but the term > "JSON" does not appear in the name of the component nor does the component > have any documentation attached that mentions this > * Why is the {{FacetModule}} class located in the {{search.facet}} package > while every single other search component included in the core is located in > the {{handler.component}} package? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…
goankur commented on a change in pull request #1893: URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494665188 ## File path: lucene/facet/src/test/org/apache/lucene/facet/taxonomy/TestTaxonomyFacetCounts.java ## @@ -726,6 +743,39 @@ public void testRandom() throws Exception { IOUtils.close(tw, searcher.getIndexReader(), tr, indexDir, taxoDir); } + private static List<List<FacetLabel>> sortedFacetLabels(List<List<FacetLabel>> allfacetLabels) { +for (List<FacetLabel> facetLabels : allfacetLabels) { + Collections.sort(facetLabels); +} + +Collections.sort(allfacetLabels, (o1, o2) -> { Review comment: Yes, a document at the `N`th position in the input sequence might end up with the `K`th docId in a random segment, making it harder to compare actual and expected labels. Thanks for confirming that the approach is acceptable. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14829) Default components are missing facet_module and terms in documentation
[ https://issues.apache.org/jira/browse/SOLR-14829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201815#comment-17201815 ] Alexandre Rafalovitch commented on SOLR-14829: -- The patch no longer applies cleanly because of some of the other documentation changes I made. But I also did some related research and want to clean up the default components information in RefGuide and solrconfig files. So, I took the issue over from Ishan and will make it a bit more generic. > Default components are missing facet_module and terms in documentation > -- > > Key: SOLR-14829 > URL: https://issues.apache.org/jira/browse/SOLR-14829 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation, examples >Affects Versions: 8.6.2 >Reporter: Johannes Baiter >Assignee: Alexandre Rafalovitch >Priority: Minor > Attachments: SOLR-14829.patch > > > In the reference guide, the list of search components that are enabled by > default is missing the {{facet_module}} and {{terms}} components. The terms > component is instead listed under "other useful components", while the > {{FacetModule}} is never listed anywhere in the documentation, despite it > being neccessary for the JSON Facet API to work. > This is also how I stumbled upon this, I spent hours trying to figure out why > JSON-based faceting was not working with my setup, after taking a glance at > the {{SearchHandler}} source code based on a hunch, it became clear that my > custom list of search components (created based on the list in the reference > guide) was to blame. > A patch for the documentation gap is attached, but I think there are some > other issues with the naming/documentation around the two faceting APIs that > may be worth discussing: > * The names {{facet_module}} / {{FacetModule}} are very misleading, since > the documentation is always talking about the "JSON Facet API", but the term > "JSON" does not appear in the name of the component nor does the component > have any documentation attached that mentions this > * Why is the {{FacetModule}} class located in the {{search.facet}} package > while every single other search component included in the core is located in > the {{handler.component}} package? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14829) Update default components information in Reference Guide and solrconfig.xml files
[ https://issues.apache.org/jira/browse/SOLR-14829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexandre Rafalovitch updated SOLR-14829: - Summary: Update default components information in Reference Guide and solrconfig.xml files (was: Default components are missing facet_module and terms in documentation) > Update default components information in Reference Guide and solrconfig.xml > files > - > > Key: SOLR-14829 > URL: https://issues.apache.org/jira/browse/SOLR-14829 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation, examples >Affects Versions: 8.6.2 >Reporter: Johannes Baiter >Assignee: Alexandre Rafalovitch >Priority: Minor > Attachments: SOLR-14829.patch > > > In the reference guide, the list of search components that are enabled by > default is missing the {{facet_module}} and {{terms}} components. The terms > component is instead listed under "other useful components", while the > {{FacetModule}} is never listed anywhere in the documentation, despite it > being neccessary for the JSON Facet API to work. > This is also how I stumbled upon this, I spent hours trying to figure out why > JSON-based faceting was not working with my setup, after taking a glance at > the {{SearchHandler}} source code based on a hunch, it became clear that my > custom list of search components (created based on the list in the reference > guide) was to blame. > A patch for the documentation gap is attached, but I think there are some > other issues with the naming/documentation around the two faceting APIs that > may be worth discussing: > * The names {{facet_module}} / {{FacetModule}} are very misleading, since > the documentation is always talking about the "JSON Facet API", but the term > "JSON" does not appear in the name of the component nor does the component > have any documentation attached that mentions this > * Why is the {{FacetModule}} class located in the {{search.facet}} package > while every single other search component included in the core is located in > the {{handler.component}} package? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] goankur commented on a change in pull request #1893: LUCENE-9444 Utility class to get facet labels from taxonomy for a fac…
goankur commented on a change in pull request #1893: URL: https://github.com/apache/lucene-solr/pull/1893#discussion_r494675000 ## File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/TaxonomyFacetLabels.java ## @@ -0,0 +1,184 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.facet.taxonomy; + +import org.apache.lucene.facet.FacetsConfig; +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.util.IntsRef; + +import java.io.IOException; + +import static org.apache.lucene.facet.taxonomy.TaxonomyReader.INVALID_ORDINAL; +import static org.apache.lucene.facet.taxonomy.TaxonomyReader.ROOT_ORDINAL; + +/** + * Utility class to easily retrieve previously indexed facet labels, allowing you to skip also adding stored fields for these values, + * reducing your index size. + * + * @lucene.experimental + **/ +public class TaxonomyFacetLabels { + + /** + * Index field name provided to the constructor + */ + private final String indexFieldName; + + /** + * {@code TaxonomyReader} provided to the constructor + */ + private final TaxonomyReader taxoReader; + + /** + * {@code FacetsConfig} provided to the constructor + */ + private final FacetsConfig config; + + /** + * {@code OrdinalsReader} to decode ordinals previously indexed into the {@code BinaryDocValues} facet field + */ + private final OrdinalsReader ordsReader; + + /** + * Sole constructor. Do not close the provided {@link TaxonomyReader} while still using this instance! + */ + public TaxonomyFacetLabels(TaxonomyReader taxoReader, FacetsConfig config, String indexFieldName) throws IOException { +this.taxoReader = taxoReader; +this.config = config; +this.indexFieldName = indexFieldName; +this.ordsReader = new DocValuesOrdinalsReader(indexFieldName); + } + + /** + * Create and return an instance of {@link FacetLabelReader} to retrieve facet labels for + * multiple documents and (optionally) for a specific dimension. You must create this per-segment, + * and then step through all hits, in order, for that segment. + * + * NOTE: This class is not thread-safe, so you must use a new instance of this + * class for each thread. + * + * @param readerContext LeafReaderContext used to access the {@code BinaryDocValues} facet field + * @return an instance of {@link FacetLabelReader} + * @throws IOException when a low-level IO issue occurs + */ + public FacetLabelReader getFacetLabelReader(LeafReaderContext readerContext) throws IOException { +return new FacetLabelReader(ordsReader, readerContext); + } + + /** + * Utility class to retrieve facet labels for multiple documents. 
+ * + * @lucene.experimental + */ + public class FacetLabelReader { +private final OrdinalsReader.OrdinalsSegmentReader ordinalsSegmentReader; +private final IntsRef decodedOrds = new IntsRef(); +private int currentDocId = -1; +private int currentPos = -1; + +// Lazily set when nextFacetLabel(int docId, String facetDimension) is first called +private int[] parents; + +/** + * Sole constructor. + */ +public FacetLabelReader(OrdinalsReader ordsReader, LeafReaderContext readerContext) throws IOException { + ordinalsSegmentReader = ordsReader.getReader(readerContext); +} + +/** + * Retrieves the next {@link FacetLabel} for the specified {@code docId}, or {@code null} if there are no more. + * This method has state: if the provided {@code docId} is the same as the previous invocation, it returns the + * next {@link FacetLabel} for that document. Otherwise, it advances to the new {@code docId} and provides the + * first {@link FacetLabel} for that document, or {@code null} if that document has no indexed facets. Each + * new {@code docId} must be in strictly monotonic (increasing) order. + * + * @param docId input docId provided in monotonic (non-decreasing) order + * @return the first or next {@link FacetLabel}, or {@code null} if there are no more + * @throws IOException when a low-level IO issue occurs + */ +public FacetLabel nextFacetLabel(int docId) throws IOException {
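For context, a rough usage sketch of the API shown in this hunk (illustrative only, not the actual test code). It assumes facet ordinals were indexed into FacetsConfig's default `$facets` field; adjust the field name if a custom one was used:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.facet.FacetsConfig;
import org.apache.lucene.facet.taxonomy.FacetLabel;
import org.apache.lucene.facet.taxonomy.TaxonomyFacetLabels;
import org.apache.lucene.facet.taxonomy.TaxonomyReader;
import org.apache.lucene.index.LeafReaderContext;

// Rough usage sketch of the API in the hunk above: create TaxonomyFacetLabels once,
// obtain a FacetLabelReader per segment, and call nextFacetLabel(docId) with
// segment-local docIds in increasing order until it returns null.
class FacetLabelUsage {
  static List<FacetLabel> labelsForDoc(TaxonomyReader taxoReader, FacetsConfig config,
                                       LeafReaderContext segment, int segmentDocId) throws IOException {
    // "$facets" is the default facet index field name; this is an assumption for the example.
    TaxonomyFacetLabels taxoLabels = new TaxonomyFacetLabels(taxoReader, config, "$facets");
    TaxonomyFacetLabels.FacetLabelReader labelReader = taxoLabels.getFacetLabelReader(segment);

    List<FacetLabel> labels = new ArrayList<>();
    FacetLabel label;
    while ((label = labelReader.nextFacetLabel(segmentDocId)) != null) {
      labels.add(label);
    }
    return labels;
  }
}
```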
[jira] [Commented] (SOLR-14829) Update default components information in Reference Guide and solrconfig.xml files
[ https://issues.apache.org/jira/browse/SOLR-14829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201841#comment-17201841 ] Alexandre Rafalovitch commented on SOLR-14829: -- Ok, the [Reference Guide page|https://lucene.apache.org/solr/guide/8_6/requesthandlers-and-searchcomponents-in-solrconfig.html] is rather a mess. Here are a couple of things that need cleaning up: * defaults, appends, and invariants are documented as part of SearchHandler, but should cover all handlers * we mention initParams (in general section, good) but not useParams * it is very hard to notice that SearchComponents references (components, first-components, last-components) are actually a section *inside* SearchHandlers only, partially because they are so far apart * we don't actually explain how to declare a custom search component explicitly (some definitions are available on linked pages) * we don't have an example of UpdateRequestHandlers either in the doc or in solrconfig.xml (because they all became implicit) * we don't mention UpdateRequestProcessors, which could be viewed as a parallel pipeline to SearchComponents. I am going to try refactoring that page. > Update default components information in Reference Guide and solrconfig.xml > files > - > > Key: SOLR-14829 > URL: https://issues.apache.org/jira/browse/SOLR-14829 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation, examples >Affects Versions: 8.6.2 >Reporter: Johannes Baiter >Assignee: Alexandre Rafalovitch >Priority: Minor > Attachments: SOLR-14829.patch > > > In the reference guide, the list of search components that are enabled by > default is missing the {{facet_module}} and {{terms}} components. The terms > component is instead listed under "other useful components", while the > {{FacetModule}} is never listed anywhere in the documentation, despite it > being neccessary for the JSON Facet API to work. > This is also how I stumbled upon this, I spent hours trying to figure out why > JSON-based faceting was not working with my setup, after taking a glance at > the {{SearchHandler}} source code based on a hunch, it became clear that my > custom list of search components (created based on the list in the reference > guide) was to blame. > A patch for the documentation gap is attached, but I think there are some > other issues with the naming/documentation around the two faceting APIs that > may be worth discussing: > * The names {{facet_module}} / {{FacetModule}} are very misleading, since > the documentation is always talking about the "JSON Facet API", but the term > "JSON" does not appear in the name of the component nor does the component > have any documentation attached that mentions this > * Why is the {{FacetModule}} class located in the {{search.facet}} package > while every single other search component included in the core is located in > the {{handler.component}} package? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14613) Provide a clean API for pluggable replica assignment implementations
[ https://issues.apache.org/jira/browse/SOLR-14613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201915#comment-17201915 ] Noble Paul commented on SOLR-14613: --- The 2nd parameter is a varargs parameter, so you can set a deeply nested value. My other concern is: why have we committed a large amount of code without a single test case? > Provide a clean API for pluggable replica assignment implementations > > > Key: SOLR-14613 > URL: https://issues.apache.org/jira/browse/SOLR-14613 > Project: Solr > Issue Type: Improvement > Components: AutoScaling >Reporter: Andrzej Bialecki >Assignee: Ilan Ginzburg >Priority: Major > Time Spent: 41h > Remaining Estimate: 0h > > As described in SIP-8 the current autoscaling Policy implementation has > several limitations that make it difficult to use for very large clusters and > very large collections. SIP-8 also mentions the possible migration path by > providing alternative implementations of the placement strategies that are > less complex but more efficient in these very large environments. > We should review the existing APIs that the current autoscaling engine uses > ({{SolrCloudManager}} , {{AssignStrategy}} , {{Suggester}} and related > interfaces) to see if they provide a sufficient and minimal API for plugging > in alternative autoscaling placement strategies, and if necessary refactor > the existing APIs. > Since these APIs are internal it should be possible to do this without > breaking back-compat. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] noblepaul commented on pull request #1863: SOLR-14701: GuessSchemaFields URP to replace AddSchemaFields URP in schemaless mode
noblepaul commented on pull request #1863: URL: https://github.com/apache/lucene-solr/pull/1863#issuecomment-698728576 I recommend a new request handler such as `/update/guess-schema`. This way we do not need to add any new functionality, nor do we need to pass any extra params:
```
curl -F 'data=@datafile.json' http://localhost:8983/gettingstarted/update/guess-schema
```
The response can be:
```
curl -X POST -H 'Content-type: application/json' -d '{"add-field":[ { "name":"id", "type":"string", "stored":true }, { "name":"desc", "type":"text", "stored":true } ]}' http://localhost:8983/solr/gettingstarted/schema
```
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9535) Investigate recent indexing slowdown for wikimedium documents
[ https://issues.apache.org/jira/browse/LUCENE-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201924#comment-17201924 ] Adrien Grand commented on LUCENE-9535: -- Indexing throughput looks like it's back to where it was before including stored fields in DWPT accounting: https://home.apache.org/~mikemccand/lucenebench/indexing.html > Investigate recent indexing slowdown for wikimedium documents > - > > Key: LUCENE-9535 > URL: https://issues.apache.org/jira/browse/LUCENE-9535 > Project: Lucene - Core > Issue Type: Task >Reporter: Adrien Grand >Priority: Minor > Attachments: cpu_profile.svg > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Nightly benchmarks report a ~10% slowdown for 1kB documents as of September > 9th: [http://people.apache.org/~mikemccand/lucenebench/indexing.html]. > On that day, we added stored fields in DWPT accounting (LUCENE-9511), so I > first thought this could be due to smaller flushed segments and more merging, > but I still wonder whether there's something else. The benchmark runs with > 8GB of heap, 2GB of RAM buffer and 36 indexing threads. So it's about 2GB/36 > = 57MB of RAM buffer per thread in the worst-case scenario that all DWPTs get > full at the same time. Stored fields account for about 0.7MB of memory, or 1% > of the indexing buffer size. How can a 1% reduction of buffering capacity > explain a 10% indexing slowdown? I looked into this further by running > indexing benchmarks locally with 8 indexing threads and 128MB of indexing > buffer memory, which would make this issue even more apparent if the smaller > RAM buffer was the cause, but I'm not seeing a regression and actually I'm > seeing similar number of flushes when I disabled memory accounting for stored > fields. > I ran indexing under a profiler to see whether something else could cause > this slowdown, e.g. slow implementations of ramBytesUsed on stored fields > writers, but nothing surprising showed up and the profile looked just like I > would have expected. > Another question I have is why the 4kB benchmark is not affected at all. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org