[jira] [Commented] (LUCENE-9623) Add module descriptor (module-info.java) to lucene jars
[ https://issues.apache.org/jira/browse/LUCENE-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240543#comment-17240543 ]

Dawid Weiss commented on LUCENE-9623:

I'm really not sure how it should work. :) I'm really excited to try the module system, finally, but I can't find the time to finish the things I still have in the backlog so I'll try to clean up that first!

> Add module descriptor (module-info.java) to lucene jars
>
> Key: LUCENE-9623
> URL: https://issues.apache.org/jira/browse/LUCENE-9623
> Project: Lucene - Core
> Issue Type: Improvement
> Components: general/build
> Affects Versions: master (9.0)
> Reporter: Tomoko Uchida
> Priority: Major
> Attachments: generate-all-module-info.sh
>
> For a starter, module descriptors can be automatically generated by jdeps utility.
> There are two choices.
> 1. generate "open" modules which allows reflective accesses with --generate-open-module option
> 2. generate non-open modules with --generate-module-info option
> Which is the better - not fully sure, but maybe 2 (non-open modules)?
> Also, we need to choose proper module names - just using gradle project path for it is OK?

--
This message was sent by Atlassian Jira (v8.3.4#803005)
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9623) Add module descriptor (module-info.java) to lucene jars
[ https://issues.apache.org/jira/browse/LUCENE-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240604#comment-17240604 ]

Uwe Schindler commented on LUCENE-9623:

We also need some testing of the artifacts! Our standard test environment can't test the module system. This needs some "integration" tests: a project using the JAR files on the module path, with no classpath. And here it must be JAR files; the non-packaged class files won't work as far as I remember.

When doing this, you will find that the SPI classes (codecs, analyzers) won't work on the module path. We not only need to open the packages in our modules; the contents of META-INF/services also need to be added as "native services" to the module info. Every module-info file must explicitly list all class names that are services. The META-INF/services files are not read in module mode. See this blog post: [https://blog.frankel.ch/migrating-serviceloader-java-9-module-system/]

I am not sure if JDEPS figures this out automatically!
[jira] [Comment Edited] (LUCENE-9623) Add module descriptor (module-info.java) to lucene jars
[ https://issues.apache.org/jira/browse/LUCENE-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240604#comment-17240604 ]

Uwe Schindler edited comment on LUCENE-9623 at 11/30/20, 9:38 AM:

We also need some testing of the artifacts! Our standard test environment can't test the module system. This needs some "integration" tests: a project using the JAR files on the module path, with no classpath. And here it must be JAR files; the non-packaged class files won't work as far as I remember.

When doing this, you will find that the SPI classes (codecs, analyzers) won't work on the module path. We not only need to open the packages in our modules; the contents of META-INF/services also need to be added as "native services" to the module info (using "provides"). Every module-info file must explicitly list all class names that are services. The META-INF/services files are not read in module mode. See this blog post: [https://blog.frankel.ch/migrating-serviceloader-java-9-module-system/]

I am not sure if JDEPS figures this out automatically!
[jira] [Commented] (LUCENE-9623) Add module descriptor (module-info.java) to lucene jars
[ https://issues.apache.org/jira/browse/LUCENE-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240609#comment-17240609 ]

Uwe Schindler commented on LUCENE-9623:

By the way, this is why the following issue was a requirement for enabling the module system: LUCENE-9281 (our old SPIClassIterator was not able to use the module system). Now this all fits nicely together. We just need BOTH: the META-INF/services files for classpath applications, and the new "provides" statements in module-info for applications using the module system.
[jira] [Commented] (LUCENE-9623) Add module descriptor (module-info.java) to lucene jars
[ https://issues.apache.org/jira/browse/LUCENE-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240610#comment-17240610 ]

Uwe Schindler commented on LUCENE-9623:

Ah I have seen in the examples above that it looks like JDEPS creates the services correctly.
[jira] [Comment Edited] (LUCENE-9623) Add module descriptor (module-info.java) to lucene jars
[ https://issues.apache.org/jira/browse/LUCENE-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240609#comment-17240609 ]

Uwe Schindler edited comment on LUCENE-9623 at 11/30/20, 9:55 AM:

By the way, this is why the following issue was a requirement for enabling the module system: LUCENE-9281 (our old SPIClassIterator was not able to use the module system). Now this all fits nicely together. We just need BOTH: the META-INF/services files for classpath applications, and the new "provides" statements in module-info for applications using the module system.

We should find a way to maybe have some templating system for the module-info.java:
- services are automatically included based on META-INF/services (so that they are not maintained in duplicate).
- exports are done manually
- imports maybe based on Gradle dependencies?
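The templating idea above (deriving the "provides" statements from the existing META-INF/services files so that they are not maintained twice) could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not part of the Lucene build: the class `ProvidesClauseSketch`, its method names, and the `org.example.*` provider names are all hypothetical, and the input is the text of one services file whose name is the service interface.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: turns one META-INF/services file into a module-info "provides" clause. */
public class ProvidesClauseSketch {

    /** Parses the text of a services file: '#' starts a comment, blank lines are ignored. */
    public static List<String> parseProviders(String servicesFileContent) {
        List<String> providers = new ArrayList<>();
        for (String line : servicesFileContent.split("\\R")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // strip trailing comment
            }
            line = line.trim();
            if (!line.isEmpty() && !providers.contains(line)) {
                providers.add(line); // keep declaration order, skip duplicates
            }
        }
        return providers;
    }

    /** Builds "provides <service> with <impl>, <impl>;" or an empty string if nothing is listed. */
    public static String providesClause(String serviceInterface, String servicesFileContent) {
        List<String> providers = parseProviders(servicesFileContent);
        if (providers.isEmpty()) {
            return "";
        }
        return "provides " + serviceInterface + " with\n    "
            + String.join(",\n    ", providers) + ";";
    }

    public static void main(String[] args) {
        // Hypothetical file content; real provider names would come from the jar.
        String content = "# codecs shipped by this jar\n"
            + "org.example.FooCodec\n"
            + "\n"
            + "org.example.BarCodec # trailing comment\n";
        System.out.println(providesClause("org.apache.lucene.codecs.Codec", content));
    }
}
```

A build step could call something like this once per services file and paste the result into a module-info.java template, leaving the exports to be written by hand as suggested above.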
[jira] [Commented] (LUCENE-9623) Add module descriptor (module-info.java) to lucene jars
[ https://issues.apache.org/jira/browse/LUCENE-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240619#comment-17240619 ]

Tomoko Uchida commented on LUCENE-9623:

bq. I'm really excited to try the module system, finally, but I can't find the time to finish the things I still have in the backlog so I'll try to clean up that first!

Yes, of course, I'm not in a hurry to resolve this at all. Until you feel ready to deal with modules (in our build ecosystem), I'd be happy to play around or try to make a patch if I can find time for it.
[jira] [Commented] (SOLR-15017) The core's lib/ folder content is not loaded in the classloader anymore when the core's configuration does not define any element
[ https://issues.apache.org/jira/browse/SOLR-15017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240630#comment-17240630 ]

Thomas Mortagne commented on SOLR-15017:

bq. basically don't exit early from this method just because there is no in the XML. WDYT? Can you file a PR?

That was my first thought, but I was not sure whether the cleaner fix would be to move it back to a different area. Anyway, I'm doing a PR in that direction if you think it's OK.

> The core's lib/ folder content is not loaded in the classloader anymore when the core's configuration does not define any element
>
> Key: SOLR-15017
> URL: https://issues.apache.org/jira/browse/SOLR-15017
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: 8.7
> Reporter: Thomas Mortagne
> Assignee: David Smiley
> Priority: Blocker
> Labels: regression
>
> I just upgraded solr-core (I'm using Solr in embedded mode) from 8.5.1 to 8.7.0 and it seems that the lib subfolder inside a core folder is not taken into account anymore.
> Works fine when I move the lib/ folder one level up (in the Solr home folder) but when the lib folder with the plugins is located in the core folder it cannot find any of the classes.
> I debugged it a little and I think the regression was caused by the refactoring done in https://github.com/apache/lucene-solr/commit/732348ec7f9c6b6f7bf9d539a40e50d16198#diff-473fbcdab103c08461ad1b3c3bb1c6d56f1bcd16d6ce341d80855db2cb20a427R749 : the handling of the lib/ core's sub folder was moved to SolrConfig#initLibs() but unfortunately the check to make sure there is at least one {{}} element in the configuration file was not removed which means that if you don't have any of those then the content of the lib/ folder is totally ignored.
> That debugging was easy enough but I don't know Solr internals enough to propose something clean to fix the issue in a pull request.
> The workaround is to make sure there is at least one {{}} element (for example ) in the core's solrconfig.xml file.
[jira] [Commented] (LUCENE-9623) Add module descriptor (module-info.java) to lucene jars
[ https://issues.apache.org/jira/browse/LUCENE-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240633#comment-17240633 ]

Tomoko Uchida commented on LUCENE-9623:

bq. We also need some testing of the artifacts! Our standard test environment can't do testing of module system. This needs some "integration" tests: A project using the JAR files on module path - no classpath.

I was also wondering how we should test whether the modules are correctly generated... do we need a test fixture or framework for it?

bq. Ah I have seen in the examples above that it looks like JDEPS creates the services correctly.

Yes, I think so. I copy-pasted the auto-generated module-info for only one analysis module to show the "provides" descriptor, but the codecs also look fine to me (though I have not yet looked closely at all modules).
[jira] [Commented] (LUCENE-9623) Add module descriptor (module-info.java) to lucene jars
[ https://issues.apache.org/jira/browse/LUCENE-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240636#comment-17240636 ]

Uwe Schindler commented on LUCENE-9623:

bq. Yes, I think so. I copy-pasted the auto generated module-info for only one analysis module to show the "provides" descriptor, but codecs look also just fine to me (though I have not yet closely looked at all modules).

We have to test codecs and analyzers under real conditions: a sample app with e.g. lucene-core, lucene-backward-codecs and lucene-analyzers-common on the module PATH, and then check which codecs and which analyzers you can see (using Codec.availableCodecs(), TokenizerFactory.availableTokenizers(), ...). Maybe also spawn an app. Maybe Luke is a good example.
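The kind of "what can you see" check described above can be approximated with the JDK's own `java.util.ServiceLoader`. The following is a minimal sketch, not a Lucene test: the class `VisibleProvidersSketch` and its method are hypothetical, and to stay self-contained it queries a JDK service type (`java.nio.charset.spi.CharsetProvider`) instead of a Lucene one. With the Lucene jars on the module path, a sample app would instead compare the output of the Lucene lookups named in the comment against the expected names. Note that code in a named module also needs a matching `uses` directive in its own module-info.java before `ServiceLoader.load` will work for that service type.

```java
import java.util.List;
import java.util.ServiceLoader;
import java.util.stream.Collectors;

/** Hypothetical smoke check: lists the provider classes visible for a service type. */
public class VisibleProvidersSketch {

    /** Returns the sorted class names of all providers found, without instantiating them. */
    public static <S> List<String> providerNames(Class<S> service) {
        return ServiceLoader.load(service)
            .stream()                          // Java 9+: lazy stream of ServiceLoader.Provider
            .map(p -> p.type().getName())      // provider class name only, no instantiation
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // JDK service type used only to keep the sketch runnable without Lucene.
        List<String> names = providerNames(java.nio.charset.spi.CharsetProvider.class);
        System.out.println("visible providers: " + names);
    }
}
```

Running the same check once with the jars on the classpath and once on the module path would show whether the "provides" declarations and the META-INF/services entries expose the same set of implementations.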
[jira] [Updated] (LUCENE-9623) Add module descriptor (module-info.java) to lucene jars
[ https://issues.apache.org/jira/browse/LUCENE-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-9623:

Status: Patch Available (was: Open)
[jira] [Commented] (LUCENE-9623) Add module descriptor (module-info.java) to lucene jars
[ https://issues.apache.org/jira/browse/LUCENE-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240637#comment-17240637 ]

Uwe Schindler commented on LUCENE-9623:

When developing LUCENE-9281 I was not able to test this easily :-)
[jira] [Updated] (LUCENE-9623) Add module descriptor (module-info.java) to lucene jars
[ https://issues.apache.org/jira/browse/LUCENE-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-9623:

Status: Open (was: Patch Available)
[jira] [Created] (SOLR-15022) RefGuide documentation for /cluster/plugin API
Andrzej Bialecki created SOLR-15022:

Summary: RefGuide documentation for /cluster/plugin API
Key: SOLR-15022
URL: https://issues.apache.org/jira/browse/SOLR-15022
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Components: Plugin system
Reporter: Andrzej Bialecki
Assignee: Andrzej Bialecki

The {{/cluster/plugin}} API needs user-level documentation.
[jira] [Commented] (SOLR-15022) RefGuide documentation for /cluster/plugin API
[ https://issues.apache.org/jira/browse/SOLR-15022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240682#comment-17240682 ]

Andrzej Bialecki commented on SOLR-15022:

This documentation depends on some of the changes in the linked issues.
[GitHub] [lucene-solr] iverase commented on pull request #2094: LUCENE-9047: Move the Directory APIs to be little endian
iverase commented on pull request #2094:
URL: https://github.com/apache/lucene-solr/pull/2094#issuecomment-735732072

> SegmentInfos certainly cannot know the endianness of the file up-front. But for other file formats, we could know this on a per-file-format basis? E.g. Lucene86PointsFormat always uses big endian but Lucene90PointsFormat will always use little endian?

@jpountz That is true, but as I understand it, the files read by, for example, `CompressingStoredFieldsReader` hold versioning in the header?

> So I'd like to see benchmark results before anything is committed.

@rmuir I have created [JMH benchmarks](https://github.com/iverase/endianness_benchmark) that read longs using ByteBuffer and LongBuffer with different endianness. Results are here:

```
Benchmark                          (byteOrder)   Mode  Cnt   Score   Error   Units
ReadLongBenchmark.readBytesBuffer           LE  thrpt   25   9.015 ± 0.012  ops/us
ReadLongBenchmark.readBytesBuffer           BE  thrpt   25   8.333 ± 0.040  ops/us
ReadLongBenchmark.readLongsBuffer           LE  thrpt   25  24.510 ± 0.191  ops/us
ReadLongBenchmark.readLongsBuffer           BE  thrpt   25   9.981 ± 0.034  ops/us
```

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
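For readers unfamiliar with the `byteOrder` parameter in the table above, here is a minimal sketch of what reading a long through a `ByteBuffer` or a `LongBuffer` view in a given byte order means. It is not the linked JMH benchmark and measures nothing; the class `EndiannessSketch` and its method names are hypothetical. One plausible reading of the results, stated here only as an assumption, is that the little-endian `LongBuffer` case matches the native byte order of the benchmark machine and so avoids a byte swap per value.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;

/** Hypothetical illustration of reading the same 8 bytes as a long in either byte order. */
public class EndiannessSketch {

    /** Reads a long directly from the byte buffer using the given byte order. */
    public static long readLong(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getLong(0);
    }

    /** Reads a long through a LongBuffer view, which inherits the byte order set on the ByteBuffer. */
    public static long readLongViaView(byte[] bytes, ByteOrder order) {
        LongBuffer view = ByteBuffer.wrap(bytes).order(order).asLongBuffer();
        return view.get(0);
    }

    public static void main(String[] args) {
        byte[] bytes = new byte[Long.BYTES];
        // Store one value in little-endian layout.
        ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).putLong(0, 0x0102030405060708L);

        long le = readLong(bytes, ByteOrder.LITTLE_ENDIAN);
        long be = readLong(bytes, ByteOrder.BIG_ENDIAN);

        System.out.println("little endian read: 0x" + Long.toHexString(le));
        System.out.println("big endian read:    0x" + Long.toHexString(be));
        // Reading with the wrong order yields the byte-reversed value.
        System.out.println("be == reverseBytes(le): " + (be == Long.reverseBytes(le)));
        System.out.println("native order of this machine: " + ByteOrder.nativeOrder());
    }
}
```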
[jira] [Created] (LUCENE-9624) fix duplicate compute on maxUnpatchedValue
Feng Guo created LUCENE-9624:

Summary: fix duplicate compute on maxUnpatchedValue
Key: LUCENE-9624
URL: https://issues.apache.org/jira/browse/LUCENE-9624
Project: Lucene - Core
Issue Type: Improvement
Components: core/codecs
Reporter: Feng Guo

In [LUCENE-9027|https://github.com/apache/lucene-solr/pull/973], Lucene introduced SIMD to decode postings, which left a very small problem. I hope I can fix this as my first PR on this amazing project :).

Detail: maxUnpatchedValue has already been computed:
{code:java}
// apache.lucene.codecs.lucene84.PForUtil#encode, line 64
final long maxUnpatchedValue = (1L << patchedBitsRequired) - 1;{code}
[jira] [Updated] (LUCENE-9624) fix duplicate compute on maxUnpatchedValue
[ https://issues.apache.org/jira/browse/LUCENE-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Feng Guo updated LUCENE-9624:

Description:
In [LUCENE-9027|https://github.com/apache/lucene-solr/pull/973], Lucene introduced SIMD to decode postings, which left a very small problem. I hope I can fix this as my first PR on this amazing project :).

Detail:
{code:java}
// maxUnpatchedValue has already been computed in
// apache.lucene.codecs.lucene84.PForUtil#encode, line 64:
final long maxUnpatchedValue = (1L << patchedBitsRequired) - 1;
// but it is computed again later, in
// apache.lucene.codecs.lucene84.PForUtil#encode, line 74:
if (longs[i] > (1L << patchedBitsRequired) - 1)...{code}
[GitHub] [lucene-solr] gf2121 opened a new pull request #2106: LUCENE-9624: fix duplicate compute on maxUnpatchedValue
gf2121 opened a new pull request #2106:
URL: https://github.com/apache/lucene-solr/pull/2106

# Description

In [LUCENE-9027](https://github.com/apache/lucene-solr/pull/973), Lucene introduced SIMD to decode postings, which left a very small problem. Since this is a hot path during indexing and searching, fixing it may make some sense. I hope I can fix this problem as my first PR on this amazing project :)

Detail:

```
// maxUnpatchedValue has already been computed in
// apache.lucene.codecs.lucene84.PForUtil#encode, line 64:
final long maxUnpatchedValue = (1L << patchedBitsRequired) - 1;
// but it is computed again later, in
// apache.lucene.codecs.lucene84.PForUtil#encode, line 74:
if (longs[i] > (1L << patchedBitsRequired) - 1) ...
```

# Solution

```
// In apache.lucene.codecs.lucene84.PForUtil#encode, line 74, convert
if (longs[i] > (1L << patchedBitsRequired) - 1) ...
// to
if (longs[i] > maxUnpatchedValue) ...
```

# Tests

none

# Checklist

Please review the following and check all that apply:

- [ ] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
- [ ] I have created a Jira issue and added the issue ID to my pull request title.
- [ ] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [ ] I have developed this patch against the `master` branch.
- [ ] I have run `./gradlew check`.
- [ ] I have added tests for my changes.
- [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
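The change described in this pull request amounts to reusing a bound that is already held in a local variable instead of recomputing it on every loop iteration. The sketch below illustrates only that pattern; it is not the actual `PForUtil#encode` code. The class `MaxUnpatchedValueSketch`, both method names, and the exception-counting loop body are hypothetical stand-ins, while `maxUnpatchedValue`, `patchedBitsRequired` and `longs` are the names quoted in the description.

```java
/** Hypothetical illustration of reusing a precomputed bound inside a loop. */
public class MaxUnpatchedValueSketch {

    /** Before: the bound is recomputed in the loop condition on every iteration. */
    public static int countExceptionsRecomputed(long[] longs, int patchedBitsRequired) {
        int numExceptions = 0;
        for (int i = 0; i < longs.length; ++i) {
            if (longs[i] > (1L << patchedBitsRequired) - 1) {
                numExceptions++;
            }
        }
        return numExceptions;
    }

    /** After: the bound is computed once and reused. */
    public static int countExceptionsReused(long[] longs, int patchedBitsRequired) {
        final long maxUnpatchedValue = (1L << patchedBitsRequired) - 1;
        int numExceptions = 0;
        for (int i = 0; i < longs.length; ++i) {
            if (longs[i] > maxUnpatchedValue) {
                numExceptions++;
            }
        }
        return numExceptions;
    }

    public static void main(String[] args) {
        long[] longs = {1, 7, 8, 255, 3};
        // With 3 bits the largest unpatched value is 7, so 8 and 255 are exceptions.
        System.out.println(countExceptionsRecomputed(longs, 3));
        System.out.println(countExceptionsReused(longs, 3));
    }
}
```

Both methods return the same count for any input, so the change is behavior-preserving; whether it is measurably faster depends on what the JIT already hoists, which this sketch does not attempt to show.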
[GitHub] [lucene-solr] jpountz commented on pull request #2106: LUCENE-9624: fix duplicate compute on maxUnpatchedValue
jpountz commented on pull request #2106: URL: https://github.com/apache/lucene-solr/pull/2106#issuecomment-735760219 Thank you!
[GitHub] [lucene-solr] jpountz merged pull request #2106: LUCENE-9624: fix duplicate compute on maxUnpatchedValue
jpountz merged pull request #2106: URL: https://github.com/apache/lucene-solr/pull/2106
[jira] [Resolved] (LUCENE-9624) fix duplicate compute on maxUnpatchedValue
[ https://issues.apache.org/jira/browse/LUCENE-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-9624. -- Fix Version/s: 8.8 Resolution: Fixed > fix duplicate compute on maxUnpatchedValue > -- > > Key: LUCENE-9624 > URL: https://issues.apache.org/jira/browse/LUCENE-9624 > Project: Lucene - Core > Issue Type: Improvement > Components: core/codecs >Reporter: Feng Guo >Priority: Major > Fix For: 8.8 > > Time Spent: 0.5h > Remaining Estimate: 0h > > in [LUCENE-9027|[https://github.com/apache/lucene-solr/pull/973]] lucene > introduced SIMD to decode postings, which leaves a very small problem. i hope > i can fix this as my first PR on this amazing project :). > detail: > {code:java} > maxUnpatchedValue has already computed > apache.lucene.codecs.lucene84.PForUtil#encode line 64 > final long maxUnpatchedValue = (1L << patchedBitsRequired) - 1; > apache.lucene.codecs.lucene84.PForUtil#encode line 74 > but computed later again: > if (longs[i] > (1L << patchedBitsRequired) - 1)...{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9624) fix duplicate compute on maxUnpatchedValue
[ https://issues.apache.org/jira/browse/LUCENE-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240740#comment-17240740 ] ASF subversion and git services commented on LUCENE-9624: Commit e6af255e67bde552a62f266a028e2b652cedc518 in lucene-solr's branch refs/heads/branch_8x from gf2121 [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e6af255 ] LUCENE-9624: fix duplicate compute on maxUnpatchedValue (#2106) Co-authored-by: 郭峰 > fix duplicate compute on maxUnpatchedValue > -- > > Key: LUCENE-9624 > URL: https://issues.apache.org/jira/browse/LUCENE-9624 > Project: Lucene - Core > Issue Type: Improvement > Components: core/codecs >Reporter: Feng Guo >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h
[GitHub] [lucene-solr] jpountz commented on pull request #2094: LUCENE-9047: Move the Directory APIs to be little endian
jpountz commented on pull request #2094: URL: https://github.com/apache/lucene-solr/pull/2094#issuecomment-735763595 > what I understand is the files read by for example CompressingStoredFieldsReader, they hold versioning in the header? When we want to make changes to a file format we have two options: 1. Either we create a new one and use it in a new codec, the old one only being used for bw compat. 2. Or we handle this internally by incrementing the internal version of the file format. In general we lean towards 1 for the bigger changes and 2 for the smaller changes. For this change of endianness, we could decide to use option 1 across all file formats so that a given file format always knows up-front which endianness it is supposed to use. So in the example you mentioned, we could create a new Lucene90StoredFieldsFormat that doesn't share any logic with the current stored fields format and always writes and reads data in little endian order.
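To make the trade-off concrete, here is a minimal sketch of what option 2 tends to look like on the read path: the reader carries the internal version around and branches on it for every value. The class `VersionedIntReader` and the two version constants are hypothetical names for illustration only; they are not Lucene APIs.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch of "option 2": one reader, byte order chosen by the internal version.
final class VersionedIntReader {
  static final int VERSION_START = 0;          // assumed: original big-endian layout
  static final int VERSION_LITTLE_ENDIAN = 1;  // assumed: bumped version, little-endian layout

  /** Reads a 4-byte int at offset, using the byte order implied by the file's version. */
  static int readInt(byte[] bytes, int offset, int version) {
    if (version < VERSION_START || version > VERSION_LITTLE_ENDIAN) {
      throw new IllegalArgumentException("unsupported version: " + version);
    }
    ByteOrder order =
        version >= VERSION_LITTLE_ENDIAN ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
    return ByteBuffer.wrap(bytes, offset, Integer.BYTES).order(order).getInt();
  }
}
```

With option 1, a new format class would hard-code a single byte order, so no per-read version check like the one above would be needed.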
[jira] [Created] (LUCENE-9625) Benchmark KNN search with ann-benchmarks
Michael Sokolov created LUCENE-9625: --- Summary: Benchmark KNN search with ann-benchmarks Key: LUCENE-9625 URL: https://issues.apache.org/jira/browse/LUCENE-9625 Project: Lucene - Core Issue Type: New Feature Reporter: Michael Sokolov In addition to benchmarking with luceneutil, it would be good to be able to make use of ann-benchmarks, which is publishing results from many approximate knn algorithms, including the hnsw implementation from its authors. We don't expect to challenge the performance of these native code libraries, however it would be good to know just how far off we are. I started looking into this and posted a fork of ann-benchmarks that uses KnnGraphTester class to run these: https://github.com/msokolov/ann-benchmarks. It's still a WIP; you have to manually copy jars and the KnnGraphTester.class to the test host machine rather than downloading from a distribution. KnnGraphTester needs some modifications in order to support this process - this issue is mostly about that. One thing I noticed is that some of the index builds with higher fanout (efConstruction) settings time out at 2h (on an AWS c5 instance), so this is concerning and I'll open a separate issue for trying to improve that.
[GitHub] [lucene-solr] tmortagne opened a new pull request #2107: SOLR-15017: The core's lib/ folder content is not loaded in the classloader anymore when the core's configuration does not define any
tmortagne opened a new pull request #2107: URL: https://github.com/apache/lucene-solr/pull/2107 Fixes https://issues.apache.org/jira/browse/SOLR-15017
[jira] [Commented] (SOLR-15017) The core's lib/ folder content is not loaded in the classloader anymore when the core's configuration does not define any element
[ https://issues.apache.org/jira/browse/SOLR-15017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240780#comment-17240780 ] Thomas Mortagne commented on SOLR-15017: Here it is: https://github.com/apache/lucene-solr/pull/2107 > The core's lib/ folder content is not loaded in the classloader anymore when > the core's configuration does not define any element > --- > > Key: SOLR-15017 > URL: https://issues.apache.org/jira/browse/SOLR-15017 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.7 >Reporter: Thomas Mortagne >Assignee: David Smiley >Priority: Blocker > Labels: regression > Time Spent: 10m > Remaining Estimate: 0h > > I just upgraded solr-core (I'm using Solr in embedded mode) from 8.5.1 to > 8.7.0 and it seems that the lib subfolder inside a core folder is not taken > into account anymore. > Works fine when I move the lib/ folder one level up (in the Solr home folder) > but when the lib folder with the plugins is located in the core folder it > cannot find any of the classes. > I debugged it a little and I think the regression was caused by the > refactoring done in > https://github.com/apache/lucene-solr/commit/732348ec7f9c6b6f7bf9d539a40e50d16198#diff-473fbcdab103c08461ad1b3c3bb1c6d56f1bcd16d6ce341d80855db2cb20a427R749 > : the handling of the lib/ core's sub folder was moved to > SolrConfig#initLibs() but unfortunately the check to make sure there is at > least one {{}} element in the configuration file was not removed which > means that if you don't have any of those then the content of the lib/ folder > is totally ignored. > That debugging was easy enough but I don't know Solr internals enough to > propose something clean to fix the issue in a pull request. > The workaround is to make sure there is at least one {{}} element (for > example ) in the core's > solrconfig.xml file. 
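The behaviour the report asks for can be sketched as below: the core's own lib/ directory is always a candidate for the classloader, whether or not any extra library directories are configured. This is a hypothetical illustration only; `LibDirsSketch` and `libDirs` are invented names and do not mirror the real `SolrConfig#initLibs()` code.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual SolrConfig#initLibs() implementation.
final class LibDirsSketch {

  /** Returns the directories whose jars should be added to the core's classloader. */
  static List<Path> libDirs(Path instanceDir, List<Path> configuredLibDirs) {
    List<Path> dirs = new ArrayList<>();
    // Always consider <instanceDir>/lib. The regression described above amounts to
    // returning early when configuredLibDirs is empty, before this line is reached.
    dirs.add(instanceDir.resolve("lib"));
    dirs.addAll(configuredLibDirs);
    return dirs;
  }
}
```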
[GitHub] [lucene-solr] dsmiley commented on pull request #2107: SOLR-15017: The core's lib/ folder content is not loaded in the classloader anymore when the core's configuration does not define any
dsmiley commented on pull request #2107: URL: https://github.com/apache/lucene-solr/pull/2107#issuecomment-735813730 +1 LGTM. I assume you will test this manually? (good enough). LMK when done; I'll merge this.
[GitHub] [lucene-solr] tmortagne commented on pull request #2107: SOLR-15017: The core's lib/ folder content is not loaded in the classloader anymore when the core's configuration does not define any
tmortagne commented on pull request #2107: URL: https://github.com/apache/lucene-solr/pull/2107#issuecomment-735817609 I was actually wondering where the best place to add unit tests would be. I could not find a SolrConfigTest, but maybe you know of a place that is already dedicated to SolrConfig validation.
[jira] [Created] (LUCENE-9626) Represent HNSW neighbors with primitive arrays instead of Neighbor Objects
Michael Sokolov created LUCENE-9626: --- Summary: Represent HNSW neighbors with primitive arrays instead of Neighbor Objects Key: LUCENE-9626 URL: https://issues.apache.org/jira/browse/LUCENE-9626 Project: Lucene - Core Issue Type: Improvement Reporter: Michael Sokolov
[jira] [Updated] (LUCENE-9626) Represent HNSW neighbors with primitive arrays instead of Neighbor Objects
[ https://issues.apache.org/jira/browse/LUCENE-9626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Sokolov updated LUCENE-9626: Description: I ran some KNN tests constructing an index under the profiler.

||function||percent CPU||
|dotProduct|28%|
|PriorityQueue.insertWithOverflow|13% + 4%|
|PriorityQueue.lessThan|10%|
|TreeSet.add|4% + 4%|
|HashSet.add|7% (visited list?) + 2%|
|BoundedVectorValues.vectorValue|6%|
|HnswGraph.getNeighbors|6%|
|HashSet.init|3%|

The main cost, as we'd expect, is computing dot products, but we also spend a lot of time in the various collections. We do not need a {TreeSet} (used to keep a candidate list); a heap is enough for that. We should also be able to improve the {PriorityQueue} times by switching to a native int heap ({lessThan} will be faster, at least). And I also noticed in the profiler that we do a lot of autoboxing of Integers today, which we can start to reduce to save on garbage.

The idea of this issue is that instead of maintaining a priority queue of Neighbor objects (node id, score) for each node in the graph, we maintain two parallel arrays: one for node ids and one for scores. These can be pre-allocated to max-connections, or perhaps to half of that and then grown, since we see that on average fanout is about half of max-connections.

Then we can reimplement {Neighbors}, which is currently a {PriorityQueue}, as an integer heap, encoding both the score (as a half-width float sortable bits), and the index into the parallel arrays of the node (as a short) in the same integer value, using the score as the high bits so that priority queue sorting is correct.

Future issues can tackle replacing the visited {HashSet} with some more efficient data structure - perhaps a {SparseBitSet} or native int hash set of some sort.
> Represent HNSW neighbors with primitive arrays instead of Neighbor Objects > -- > > Key: LUCENE-9626 > URL: https://issues.apache.org/jira/browse/LUCENE-9626 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Michael Sokolov >Priority: Major
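One possible reading of the packed-int encoding proposed in LUCENE-9626 is sketched below. It keeps the top 16 bits of the sortable float representation (sign, exponent and 7 mantissa bits) as the high half and stores the index into the parallel arrays in the low half, so plain signed int comparison orders entries by (truncated) score. The class and method names are hypothetical, and the sortable-bits transform is written out locally rather than calling any Lucene utility.

```java
// Hypothetical sketch of packing (score, array index) into one int for an int heap.
final class NeighborEncodingSketch {

  /** Maps a float to an int whose signed order matches the float order. */
  static int floatToSortableInt(float value) {
    int bits = Float.floatToIntBits(value);
    // Flip the lower 31 bits of negative values so they sort correctly as ints.
    return bits ^ ((bits >> 31) & 0x7fffffff);
  }

  /** Packs the top 16 sortable score bits (high half) with a 16-bit array index (low half). */
  static int encode(float score, int arrayIndex) {
    if (arrayIndex < 0 || arrayIndex > 0xFFFF) {
      throw new IllegalArgumentException("index must fit in 16 bits: " + arrayIndex);
    }
    return (floatToSortableInt(score) & 0xFFFF0000) | arrayIndex;
  }

  /** Recovers the index into the parallel node-id and score arrays. */
  static int index(int encoded) {
    return encoded & 0xFFFF;
  }

  /** Recovers the score, truncated to the precision kept in the high 16 bits. */
  static float approximateScore(int encoded) {
    int sortable = encoded & 0xFFFF0000;
    // The sortable transform is its own inverse.
    return Float.intBitsToFloat(sortable ^ ((sortable >> 31) & 0x7fffffff));
  }
}
```

Scores that differ only in the dropped low mantissa bits compare by array index in this scheme, so the full-precision score would still have to come from the parallel score array when it matters.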
[jira] [Commented] (SOLR-14958) zkHost sys prop requirement prevents sane/safe cloud test usage
[ https://issues.apache.org/jira/browse/SOLR-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240915#comment-17240915 ] ASF subversion and git services commented on SOLR-14958: Commit 37a61635e1c348bcdad9f73eea212b20305115c1 in lucene-solr's branch refs/heads/master from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=37a6163 ] SOLR-14958: Refactor zkHost config logic to make testing easier and reduce risk of incorrect value being used > zkHost sys prop requirement prevents sane/safe cloud test usage > --- > > Key: SOLR-14958 > URL: https://issues.apache.org/jira/browse/SOLR-14958 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-14958.patch > > > (This is somewhat analogous to SOLR-14934, but AFAICT only affects tests) > MiniSolrCloudCluster - and/or any test that wants to run "cloud" nodes that > pull solr.xml from ZooKeeper - currently *only* works because it calls > {{System.setProperty("zkHost",...)}} - there is no other mechanism to > communicate a 'zkHost' connection information to a Solr node (w/o hardcoding > the value in a {{solr.xml}} file already on disk), making it unsafe to have > multiple "solr clusters" running in a single JVM. > SolrDispatchFilter already supports the ability to read properties from > "context" attributes (which is currently leveraged by our test > infrastructure) which are used to specify the "node properties" for the core > container, and allow per-node overrides of system properties with the same > name when parsing variables in solr.xml. But! ... SolrDispatchFilter does > not consult these node properties when deciding where to try and load > solr.xml from. > Even if we "fix" SolrDispatchFilter to look for 'zkHost' in the node > properties, SolrXmlConfig supports a {{}} option in the > {{}} section. 
If that option is missing, then > {{System.getProperty("zkHost")}} is used as a default - *IGNORING ANY zkHost > IN THE NODE PROPERTIES*. > I think we should try to fix this discrepancy, and make it possible to run a > {{MiniSolrCloud}} cluster w/o relying on setting 'zkHost' sys prop.
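The lookup order the issue argues for could be sketched like this: an explicitly configured value wins, then the per-node properties, and only then the JVM-wide system property. `ZkHostResolver` and `resolveZkHost` are hypothetical names, and the precedence shown is an assumption drawn from the description above rather than the committed implementation.

```java
import java.util.Properties;

// Hypothetical sketch of a zkHost lookup that consults node properties before system properties.
final class ZkHostResolver {

  /** Returns the explicit value if set, else the node property, else the system property (may be null). */
  static String resolveZkHost(String explicitZkHost, Properties nodeProperties) {
    if (explicitZkHost != null && !explicitZkHost.isEmpty()) {
      return explicitZkHost;
    }
    String fromNode = nodeProperties == null ? null : nodeProperties.getProperty("zkHost");
    if (fromNode != null && !fromNode.isEmpty()) {
      return fromNode;
    }
    // Last resort: the JVM-wide value, which is shared by every node in the same JVM.
    return System.getProperty("zkHost");
  }
}
```

Because the system property is only a fallback here, two clusters in one JVM could each pass their own node properties without overwriting each other's connection string.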
[GitHub] [lucene-solr] dsmiley commented on pull request #2107: SOLR-15017: The core's lib/ folder content is not loaded in the classloader anymore when the core's configuration does not define any
dsmiley commented on pull request #2107: URL: https://github.com/apache/lucene-solr/pull/2107#issuecomment-735945736 See `org.apache.solr.core.TestCoreContainer#testSharedLib` which looks interesting. You might do something similar and add to TestConfig, and ignore the built-in core in TestConfig since that one probably is using . You could use `solr/configsets/minimal.conf`
[jira] [Commented] (SOLR-15009) DirectoryFactory exists masks IOExceptions
[ https://issues.apache.org/jira/browse/SOLR-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240960#comment-17240960 ] ASF subversion and git services commented on SOLR-15009: Commit 388d66573881175cf38a4f879e3b73fe25699086 in lucene-solr's branch refs/heads/branch_8x from Mike Drob [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=388d665 ] SOLR-15009 Propogate IOException from DF.exists > DirectoryFactory exists masks IOExceptions > -- > > Key: SOLR-15009 > URL: https://issues.apache.org/jira/browse/SOLR-15009 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.4.1 >Reporter: Mike Drob >Priority: Major > Labels: NPE > Time Spent: 40m > Remaining Estimate: 0h > > This NPE was seen on a cluster running 8.4.1 - > {noformat} > o.a.s.h.RequestHandlerBase java.lang.NullPointerException > at > org.apache.solr.core.StandardDirectoryFactory.exists(StandardDirectoryFactory.java:92) > at org.apache.solr.core.SolrCore.getIndexSize(SolrCore.java:462) > at > org.apache.solr.core.SolrCore.lambda$initializeMetrics$8(SolrCore.java:1200) > at > org.apache.solr.metrics.SolrMetricManager$GaugeWrapper.getValue(SolrMetricManager.java:711) > at > org.apache.solr.util.stats.MetricUtils.convertGauge(MetricUtils.java:488) > at > org.apache.solr.util.stats.MetricUtils.convertMetric(MetricUtils.java:274) > at > org.apache.solr.util.stats.MetricUtils.lambda$toMaps$4(MetricUtils.java:213) > {noformat} > The problem comes from an IOException from list, which gets turned into a > null return. In this case, we do want the IOException to bubble so that we > can catch and log in getIndexSize. > Aside from that, if all we really care about is whether there are files, then > we could do a much more efficient check than requiring the whole directory to > be listed. This is especially problematic when the directory is large or > remote. 
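Both points in the description, letting the IOException surface and avoiding a full directory listing, can be illustrated with `java.nio.file`. This is a sketch under the assumption that "exists" means "is a directory with at least one entry"; `DirectoryExistsSketch` is a hypothetical name and this is not the code that was committed for SOLR-15009.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; not the committed StandardDirectoryFactory change.
final class DirectoryExistsSketch {

  /** True if path is a directory with at least one entry; I/O failures propagate to the caller. */
  static boolean existsAndNonEmpty(Path path) throws IOException {
    if (!Files.isDirectory(path)) {
      return false;
    }
    // The stream is lazy, so only the first entry has to be fetched.
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(path)) {
      return entries.iterator().hasNext();
    } catch (DirectoryIteratorException e) {
      // Iteration wraps I/O errors in an unchecked exception; unwrap it for the caller.
      throw e.getCause();
    }
  }
}
```

A caller such as the index-size gauge mentioned in the stack trace could then catch the IOException and log it instead of hitting a NullPointerException.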
[GitHub] [lucene-solr] madrob commented on pull request #2092: SOLR-15009 Propogate IOException from DF.exists
madrob commented on pull request #2092: URL: https://github.com/apache/lucene-solr/pull/2092#issuecomment-735961986 Committed in cb5ba42bd7d9777eaf76705f17a3407fd2897b10
[jira] [Commented] (SOLR-15009) DirectoryFactory exists masks IOExceptions
[ https://issues.apache.org/jira/browse/SOLR-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240959#comment-17240959 ] ASF subversion and git services commented on SOLR-15009: Commit cb5ba42bd7d9777eaf76705f17a3407fd2897b10 in lucene-solr's branch refs/heads/master from Mike Drob [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cb5ba42 ] SOLR-15009 Propogate IOException from DF.exists > DirectoryFactory exists masks IOExceptions > -- > > Key: SOLR-15009 > URL: https://issues.apache.org/jira/browse/SOLR-15009 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.4.1 >Reporter: Mike Drob >Priority: Major > Labels: NPE > Time Spent: 40m > Remaining Estimate: 0h
[jira] [Resolved] (SOLR-15009) DirectoryFactory exists masks IOExceptions
[ https://issues.apache.org/jira/browse/SOLR-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob resolved SOLR-15009. -- Fix Version/s: master (9.0) 8.8 Assignee: Mike Drob Resolution: Fixed > DirectoryFactory exists masks IOExceptions > -- > > Key: SOLR-15009 > URL: https://issues.apache.org/jira/browse/SOLR-15009 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.4.1 >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Labels: NPE > Fix For: 8.8, master (9.0) > > Time Spent: 1h > Remaining Estimate: 0h
[GitHub] [lucene-solr] thelabdude commented on pull request #2010: SOLR-12182: Don't persist base_url in ZK as the scheme is variable, compute from node_name instead
thelabdude commented on pull request #2010: URL: https://github.com/apache/lucene-solr/pull/2010#issuecomment-735976873 thank you for the review @noblepaul and @anshumg ... going to merge to master now.
[GitHub] [lucene-solr] thelabdude merged pull request #2010: SOLR-12182: Don't persist base_url in ZK as the scheme is variable, compute from node_name instead
thelabdude merged pull request #2010: URL: https://github.com/apache/lucene-solr/pull/2010
[jira] [Commented] (SOLR-12182) Can not switch urlScheme in 7x if there are any cores in the cluster
[ https://issues.apache.org/jira/browse/SOLR-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17240984#comment-17240984 ] ASF subversion and git services commented on SOLR-12182: Commit a0492840ee8690ddf48369665c744d16c7dd30cb in lucene-solr's branch refs/heads/master from Timothy Potter [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a049284 ] SOLR-12182: Don't persist base_url in ZK as the scheme is variable, compute from node_name instead (#2010) > Can not switch urlScheme in 7x if there are any cores in the cluster > > > Key: SOLR-12182 > URL: https://issues.apache.org/jira/browse/SOLR-12182 > Project: Solr > Issue Type: Bug >Affects Versions: 7.0, 7.1, 7.2 >Reporter: Anshum Gupta >Assignee: Timothy Potter >Priority: Major > Fix For: master (9.0) > > Attachments: SOLR-12182.patch, SOLR-12182_20200423.patch > > Time Spent: 5h > Remaining Estimate: 0h > > I was trying to enable TLS on a cluster that was already in use i.e. had > existing collections and ended up with down cores, that wouldn't come up and > the following core init errors in the logs: > *org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: > replica with coreNodeName core_node4 exists but with a different name or > base_url.* > What is happening here is that the core/replica is defined in the > clusterstate with the urlScheme as part of it's base URL e.g. > *"base_url":"http:hostname:port/solr"*. > Switching the urlScheme in Solr breaks this convention as the host now uses > HTTPS instead. > Actually, I ran into this with an older version because I was running with > *legacyCloud=false* and then realized that we switched that to the default > behavior only in 7x i.e while most users did not hit this issue with older > versions, unless they overrode the legacyCloud value explicitly, users > running 7x are bound to run into this more often. 
> Switching the value of legacyCloud to true, bouncing the cluster so that the > clusterstate gets flushed, and then setting it back to false is a workaround, > but a somewhat risky one if you don't know whether you have any old cores lying around. > Ideally, I think we shouldn't prepend the urlScheme to the base_url value and > should instead use the urlScheme on the fly to construct it.
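The idea in the commit title, computing base_url from node_name rather than persisting it, can be sketched as follows. The sketch assumes a node_name of the form `host:port_context` (for example `127.0.0.1:8983_solr`); `BaseUrlSketch` and `baseUrlForNodeName` are hypothetical names, and any URL-decoding of the context part is left out.

```java
// Hypothetical sketch: derive base_url from node_name plus the cluster's current urlScheme.
final class BaseUrlSketch {

  /** Builds scheme://host:port/context from a node_name such as "127.0.0.1:8983_solr". */
  static String baseUrlForNodeName(String nodeName, String urlScheme) {
    int sep = nodeName.indexOf('_');
    if (sep < 0) {
      throw new IllegalArgumentException("node_name must contain '_': " + nodeName);
    }
    String hostAndPort = nodeName.substring(0, sep);
    String context = nodeName.substring(sep + 1);
    String url = urlScheme + "://" + hostAndPort;
    // An empty context means the node is served from the root path.
    return context.isEmpty() ? url : url + "/" + context;
  }
}
```

Since the scheme is supplied at call time, switching the cluster from http to https changes every derived URL without rewriting any stored replica state.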
[jira] [Commented] (SOLR-14289) Solr may attempt to check Chroot after already having connected once
[ https://issues.apache.org/jira/browse/SOLR-14289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241013#comment-17241013 ] ASF subversion and git services commented on SOLR-14289: Commit d4ea99f32690929ba9b4f81ee8d25107fda0e045 in lucene-solr's branch refs/heads/branch_8x from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d4ea99f ] SOLR-14958: Refactor zkHost config logic to make testing easier and reduce risk of incorrect value being used (cherry picked from commit 37a61635e1c348bcdad9f73eea212b20305115c1) Cherry pick was modified to resolve conflicts due to SOLR-14289 only existing on master. > Solr may attempt to check Chroot after already having connected once > > > Key: SOLR-14289 > URL: https://issues.apache.org/jira/browse/SOLR-14289 > Project: Solr > Issue Type: Task > Components: Server >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Fix For: master (9.0) > > Attachments: Screen Shot 2020-02-26 at 2.56.14 PM.png > > Time Spent: 1h > Remaining Estimate: 0h > > On server startup, we will attempt to load the solr.xml from zookeeper if we > have the right properties set, and then later when starting up the core > container will take time to verify (and create) the chroot even if it is the > same string that we already used before. We can likely skip the second > short-lived zookeeper connection to speed up our startup sequence a little > bit. > > See this attached image from thread profiling during startup. > !Screen Shot 2020-02-26 at 2.56.14 PM.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14958) zkHost sys prop requirement prevents sane/safe cloud test usage
[ https://issues.apache.org/jira/browse/SOLR-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241012#comment-17241012 ] ASF subversion and git services commented on SOLR-14958: Commit d4ea99f32690929ba9b4f81ee8d25107fda0e045 in lucene-solr's branch refs/heads/branch_8x from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d4ea99f ] SOLR-14958: Refactor zkHost config logic to make testing easier and reduce risk of incorrect value being used (cherry picked from commit 37a61635e1c348bcdad9f73eea212b20305115c1) Cherry pick was modified to resolve conflicts due to SOLR-14289 only existing on master. > zkHost sys prop requirement prevents sane/safe cloud test usage > --- > > Key: SOLR-14958 > URL: https://issues.apache.org/jira/browse/SOLR-14958 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-14958.patch > > > (This is somewhat analogous to SOLR-14934, but AFAICT only affects tests) > MiniSolrCloudCluster - and/or any test that wants to run "cloud" nodes that > pull solr.xml from ZooKeeper - currently *only* works because it calls > {{System.setProperty("zkHost",...)}} - there is no other mechanism to > communicate a 'zkHost' connection information to a Solr node (w/o hardcoding > the value in a {{solr.xml}} file already on disk), making it unsafe to have > multiple "solr clusters" running in a single JVM. > SolrDispatchFilter already supports the ability to read properties from > "context" attributes (which is currently leveraged by our test > infrastructure) which are used to specify the "node properties" for the core > container, and allow per-node overrides of system properties with the same > name when parsing variables in solr.xml. But! ... 
SolrDispatchFilter does > not consult these node properties when deciding where to try and load > solr.xml from. > Even if we "fix" SolrDispatchFilter to look for 'zkHost' in the node > properties, SolrXmlConfig supports a {{}} option in the > {{}} section. If that option is missing, then > {{System.getProperty("zkHost")}} is used as a default - *IGNORING ANY zkHost > IN THE NODE PROPERTIES*. > I think we should try to fix this discrepancy, and make it possible to run a > {{MiniSolrCloud}} cluster w/o relying on setting 'zkHost' sys prop.
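To illustrate the lookup order the issue asks for, here is a minimal sketch in which per-node properties win over the JVM-wide system property. The class name, the `resolve` helper, and the example zkHost strings are all hypothetical; this is not Solr's actual code, only an illustration of why a per-node lookup lets two "clusters" coexist in one JVM.

```java
import java.util.Properties;

public class ZkHostResolutionSketch {

  /**
   * Hypothetical helper: look in the per-node properties first and fall back
   * to the JVM-wide system property only when the node does not define the key.
   */
  static String resolve(String key, Properties nodeProperties) {
    String value = nodeProperties.getProperty(key);
    return value != null ? value : System.getProperty(key);
  }

  public static void main(String[] args) {
    // Two nodes in the same JVM, each pointing at a different (made-up) cluster.
    Properties nodeA = new Properties();
    nodeA.setProperty("zkHost", "localhost:2181/clusterA");
    Properties nodeB = new Properties();
    nodeB.setProperty("zkHost", "localhost:2181/clusterB");

    System.out.println(resolve("zkHost", nodeA));
    System.out.println(resolve("zkHost", nodeB));
  }
}
```

With a single `System.setProperty("zkHost", ...)` call both nodes would necessarily see the same value, which is the safety problem described above.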
[GitHub] [lucene-solr] madrob closed pull request #2092: SOLR-15009 Propogate IOException from DF.exists
madrob closed pull request #2092: URL: https://github.com/apache/lucene-solr/pull/2092 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] cammiemw commented on a change in pull request #2097: LUCENE-9537
cammiemw commented on a change in pull request #2097: URL: https://github.com/apache/lucene-solr/pull/2097#discussion_r532905011 ## File path: lucene/core/src/java/org/apache/lucene/search/IndriAndQuery.java ## @@ -0,0 +1,22 @@ +package org.apache.lucene.search; + +import java.io.IOException; +import java.util.List; + +/** A Query that matches documents matching combinations of + * {@link TermQuery}s or other IndriAndQuerys. + */ +public class IndriAndQuery extends IndriQuery { Review comment: I agree the naming is confusing. I took the naming scheme as well as the logic from the original Indri search engine implementation. The issue with renaming it is that there is already an IndriOrQuery, which I have created and hope to be able to add at a future time. I will continue to think about whether there is a better name for the IndriAndQuery though.
[GitHub] [lucene-solr] cammiemw commented on a change in pull request #2097: LUCENE-9537
cammiemw commented on a change in pull request #2097: URL: https://github.com/apache/lucene-solr/pull/2097#discussion_r532905253 ## File path: lucene/core/src/java/org/apache/lucene/search/IndriAndQuery.java ## @@ -0,0 +1,22 @@ +package org.apache.lucene.search; Review comment: Done :-)
[GitHub] [lucene-solr] cammiemw commented on a change in pull request #2097: LUCENE-9537
cammiemw commented on a change in pull request #2097: URL: https://github.com/apache/lucene-solr/pull/2097#discussion_r532905390 ## File path: lucene/core/src/java/org/apache/lucene/search/IndriAndScorer.java ## @@ -0,0 +1,59 @@ +package org.apache.lucene.search; + +import java.io.IOException; +import java.util.List; + +/** Combines scores of subscorers. If a subscorer does not contain + * the docId, a smoothing score is calculated for that + * document/subscorer combination. + */ +public class IndriAndScorer extends IndriDisjunctionScorer { + + protected IndriAndScorer(Weight weight, List subScorers, + ScoreMode scoreMode, float boost) throws IOException { +super(weight, subScorers, scoreMode, boost); + } + + @Override + public float score(List subScorers) throws IOException { +int docId = this.docID(); +return scoreDoc(subScorers, docId); + } + + @Override + public float smoothingScore(List subScorers, int docId) + throws IOException { +return scoreDoc(subScorers, docId); + } + + private float scoreDoc(List subScorers, int docId) + throws IOException { +double score = 0; +double boostSum = 0.0; +for (Scorer scorer : subScorers) { + if (scorer instanceof IndriScorer) { +IndriScorer indriScorer = (IndriScorer) scorer; +int scorerDocId = indriScorer.docID(); +//If the query exists in the document, score the document +//Otherwise, compute a smoothing score, which acts like an idf +//for subqueries/terms +if (docId == scorerDocId) { Review comment: Done :-) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] cammiemw commented on a change in pull request #2097: LUCENE-9537
cammiemw commented on a change in pull request #2097: URL: https://github.com/apache/lucene-solr/pull/2097#discussion_r532907950 ## File path: lucene/core/src/java/org/apache/lucene/search/Scorable.java ## @@ -30,6 +30,13 @@ * Returns the score of the current document matching the query. */ public abstract float score() throws IOException; + + /** + * Returns the smoothing score of the current document matching the query. + * This score is used when the query/term does not appear in the document. + * This can return 0 or a smoothing score. Review comment: I have added more detail and a paper that describes the motivation for the smoothing score. The description of how the smoothing score is used is at the bottom of page 11 in the paper. It is important to note that most of the explanation has to do with the case where the score is a product. Even though the IndriAndScorer does not use a product, the smoothing score is still helpful for acting like an idf. Additionally, there are many more Indri operators that I would like to add that do use a product.
[GitHub] [lucene-solr] cammiemw commented on a change in pull request #2097: LUCENE-9537
cammiemw commented on a change in pull request #2097: URL: https://github.com/apache/lucene-solr/pull/2097#discussion_r532908661 ## File path: lucene/core/src/java/org/apache/lucene/search/similarities/IndriDirichletSimilarity.java ## @@ -0,0 +1,108 @@ +/* + * === + * Copyright (c) 2019 Carnegie Mellon University and University of Massachusetts. All Rights + * Reserved. + * + * Use of the Lemur Toolkit for Language Modeling and Information Retrieval is subject to the terms + * of the software license set forth in the LICENSE file included with this software, and also + * available at http://www.lemurproject.org/license.html + * + * + */ +package org.apache.lucene.search.similarities; + +import java.util.List; +import java.util.Locale; + +import org.apache.lucene.search.Explanation; +import org.apache.lucene.search.similarities.BasicStats; +import org.apache.lucene.search.similarities.LMSimilarity; + +/** + * Bayesian smoothing using Dirichlet priors as implemented in the Indri Search + * engine (http://www.lemurproject.org/indri.php). Indri Dirichelet Smoothing! + * P(t|E) = (tf_E + mu*P(t|D)) / (documentLength + documentMu) + * where P(t|D) = (mu*P(t|C) + tf_D) / (doclen + mu) + */ Review comment: I tried adding more formatting and a description of mu. Let me know if you would like to see anything different or more. Thanks!
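For readers unfamiliar with the formula in the quoted javadoc, a minimal sketch of the document-level part, P(t|D) = (tf_D + mu*P(t|C)) / (doclen + mu), may help. The class and method names are made up for illustration, the numbers are arbitrary, and this is not the code from the pull request.

```java
public class DirichletSmoothingSketch {

  /**
   * Dirichlet-smoothed probability of a term in a document:
   * (tf + mu * P(t|C)) / (docLen + mu).
   *
   * @param tf             term frequency in the document (0 if the term is absent)
   * @param docLen         document length in tokens
   * @param collectionProb P(t|C), the term's probability in the whole collection
   * @param mu             smoothing parameter; larger values trust the collection more
   */
  static double smoothedProbability(double tf, double docLen, double collectionProb, double mu) {
    return (tf + mu * collectionProb) / (docLen + mu);
  }

  public static void main(String[] args) {
    double mu = 2000; // arbitrary example value
    double present = smoothedProbability(3, 100, 0.001, mu);
    double absent = smoothedProbability(0, 100, 0.001, mu);
    System.out.println("term present: " + present);
    System.out.println("term absent:  " + absent);
  }
}
```

Note that the probability stays above zero when tf is 0. That non-zero value is what lets a "smoothing score" be computed for a document that does not contain the term.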
[jira] [Resolved] (SOLR-14958) zkHost sys prop requirement prevents sane/safe cloud test usage
[ https://issues.apache.org/jira/browse/SOLR-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris M. Hostetter resolved SOLR-14958. --- Fix Version/s: master (9.0) 8.8 Resolution: Fixed > zkHost sys prop requirement prevents sane/safe cloud test usage > --- > > Key: SOLR-14958 > URL: https://issues.apache.org/jira/browse/SOLR-14958 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Fix For: 8.8, master (9.0) > > Attachments: SOLR-14958.patch > > > (This is somewhat analogous to SOLR-14934, but AFAICT only affects tests) > MiniSolrCloudCluster - and/or any test that wants to run "cloud" nodes that > pull solr.xml from ZooKeeper - currently *only* works because it calls > {{System.setProperty("zkHost",...)}} - there is no other mechanism to > communicate a 'zkHost' connection information to a Solr node (w/o hardcoding > the value in a {{solr.xml}} file already on disk), making it unsafe to have > multiple "solr clusters" running in a single JVM. > SolrDispatchFilter already supports the ability to read properties from > "context" attributes (which is currently leveraged by our test > infrastructure) which are used to specify the "node properties" for the core > container, and allow per-node overrides of system properties with the same > name when parsing variables in solr.xml. But! ... SolrDispatchFilter does > not consult these node properties when deciding where to try and load > solr.xml from. > Even if we "fix" SolrDispatchFilter to look for 'zkHost' in the node > properties, SolrXmlConfig supports a {{}} option in the > {{}} section. if that option is missing, then > {{System.getProperty("zkHost")}} is used as a default - *IGNORING ANY zkHost > IN THE NODE PROPERTIES*. 
> I think we should try to fix this discrepancy, and make it possible to run a > {{MiniSolrCloud}} cluster w/o relying on setting 'zkHost' sys prop.
[GitHub] [lucene-solr] cammiemw commented on a change in pull request #2097: LUCENE-9537
cammiemw commented on a change in pull request #2097: URL: https://github.com/apache/lucene-solr/pull/2097#discussion_r532927287 ## File path: lucene/core/src/java/org/apache/lucene/search/IndriAndScorer.java ## @@ -0,0 +1,59 @@ +package org.apache.lucene.search; + +import java.io.IOException; +import java.util.List; + +/** Combines scores of subscorers. If a subscorer does not contain + * the docId, a smoothing score is calculated for that + * document/subscorer combination. + */ +public class IndriAndScorer extends IndriDisjunctionScorer { Review comment: Yes, I do hope to be able to add additional Indri query types that will extend the IndriDisjunctionScorer.
[GitHub] [lucene-solr] cammiemw commented on a change in pull request #2097: LUCENE-9537
cammiemw commented on a change in pull request #2097: URL: https://github.com/apache/lucene-solr/pull/2097#discussion_r532933758 ## File path: lucene/core/src/java/org/apache/lucene/search/IndriScorer.java ## @@ -0,0 +1,37 @@ +package org.apache.lucene.search; + +import java.io.IOException; + +/** + * The Indri parent scorer that stores the boost so that + * IndriScorers can use the boost outside of the term. + * + */ +abstract public class IndriScorer extends Scorer { + + private float boost; Review comment: I did this because I apply the boost in the scorer rather than in the similarity (such as in LMDirichletSimilarity), and I divide by the sum of the boosts. I originally did this to exactly match the Indri scores; however, this is not a huge priority. I don't have much issue with dropping this if it doesn't fit in the Lucene workflow well. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
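The combination described in this comment (apply each sub-scorer's boost in the scorer, then divide by the sum of the boosts) can be sketched as follows. The quoted diff is truncated, so this is an assumption about its overall shape rather than the pull request's code. `SubScore` is a hypothetical stand-in for a sub-scorer, and the scores are arbitrary log-domain style values.

```java
import java.util.Arrays;
import java.util.List;

public class BoostWeightedScoreSketch {

  /** Hypothetical stand-in for a sub-scorer positioned on some document. */
  static final class SubScore {
    final int docId;            // document the sub-scorer is currently on
    final float boost;          // weight of this sub-query
    final float score;          // score if the sub-query matches the document
    final float smoothingScore; // fallback score if it does not match

    SubScore(int docId, float boost, float score, float smoothingScore) {
      this.docId = docId;
      this.boost = boost;
      this.score = score;
      this.smoothingScore = smoothingScore;
    }
  }

  /** Boost-weighted average; non-matching sub-scorers contribute their smoothing score. */
  static float combine(List<SubScore> subs, int docId) {
    double sum = 0;
    double boostSum = 0;
    for (SubScore s : subs) {
      double contribution = (s.docId == docId) ? s.score : s.smoothingScore;
      sum += s.boost * contribution;
      boostSum += s.boost;
    }
    return boostSum == 0 ? 0f : (float) (sum / boostSum);
  }

  public static void main(String[] args) {
    List<SubScore> subs = Arrays.asList(
        new SubScore(5, 2f, -1f, -4f),  // matches doc 5
        new SubScore(7, 1f, -2f, -5f)); // does not match doc 5
    System.out.println(combine(subs, 5));
  }
}
```

Dividing by the boost sum keeps the result on the same scale regardless of how many sub-queries there are or how large their boosts are.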
[GitHub] [lucene-solr] msokolov opened a new pull request #2108: LUCENE-9626 represent HNSW graph neighbors using primitive arrays
msokolov opened a new pull request #2108: URL: https://github.com/apache/lucene-solr/pull/2108 The subject line is the main thrust, but there are a few subsidiary changes that were needed in order to achieve that (see below), and I made a few incidental improvements to HNSW-related classes. 1. Add a primitive int-valued heap, IntHeap, based on the existing PriorityQueue 2. Convert scores to 16-bit precision in order to pack them into an int along with a neighbor ordinal The result was about a 16% improvement in indexing times, and fewer, smaller GC pauses noted in the profiler.
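The message does not spell out the bit layout, so the following is only a sketch of one way a reduced-precision score and an ordinal can share a single int. It assumes non-negative scores and, purely for illustration, an ordinal that fits in 16 bits; the pull request's actual encoding may differ. All names are hypothetical.

```java
public class PackedNeighborSketch {

  /**
   * Packs the upper 16 bits of a non-negative float score into the high half
   * of an int and a 16-bit ordinal into the low half. Because the sign bit is
   * zero, comparing packed ints orders by (truncated) score, then by ordinal.
   */
  static int pack(float score, int ordinal) {
    if (!(score >= 0f) || ordinal < 0 || ordinal > 0xFFFF) {
      throw new IllegalArgumentException("score must be >= 0 and ordinal must fit in 16 bits");
    }
    int scoreBits = Float.floatToIntBits(score) >>> 16; // sign, exponent, top 7 mantissa bits
    return (scoreBits << 16) | ordinal;
  }

  /** Recovers the truncated score; the low mantissa bits were dropped when packing. */
  static float score(int packed) {
    return Float.intBitsToFloat(packed & 0xFFFF0000);
  }

  static int ordinal(int packed) {
    return packed & 0xFFFF;
  }

  public static void main(String[] args) {
    int packed = pack(0.75f, 42);
    System.out.println(score(packed) + " " + ordinal(packed));
  }
}
```

Storing neighbors as plain ints in this style avoids one object per neighbor, which is consistent with the reduced GC pressure reported above.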
[jira] [Commented] (SOLR-15009) DirectoryFactory exists masks IOExceptions
[ https://issues.apache.org/jira/browse/SOLR-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241170#comment-17241170 ] ASF subversion and git services commented on SOLR-15009: Commit bd4deb2c90be9a2f9568155efc9344b9f1db4fe4 in lucene-solr's branch refs/heads/branch_8x from Mike Drob [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=bd4deb2 ] SOLR-15009 Use older Path APIs > DirectoryFactory exists masks IOExceptions > -- > > Key: SOLR-15009 > URL: https://issues.apache.org/jira/browse/SOLR-15009 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.4.1 >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Labels: NPE > Fix For: 8.8, master (9.0) > > Time Spent: 1h > Remaining Estimate: 0h > > This NPE was seen on a cluster running 8.4.1 - > {noformat} > o.a.s.h.RequestHandlerBase java.lang.NullPointerException > at > org.apache.solr.core.StandardDirectoryFactory.exists(StandardDirectoryFactory.java:92) > at org.apache.solr.core.SolrCore.getIndexSize(SolrCore.java:462) > at > org.apache.solr.core.SolrCore.lambda$initializeMetrics$8(SolrCore.java:1200) > at > org.apache.solr.metrics.SolrMetricManager$GaugeWrapper.getValue(SolrMetricManager.java:711) > at > org.apache.solr.util.stats.MetricUtils.convertGauge(MetricUtils.java:488) > at > org.apache.solr.util.stats.MetricUtils.convertMetric(MetricUtils.java:274) > at > org.apache.solr.util.stats.MetricUtils.lambda$toMaps$4(MetricUtils.java:213) > {noformat} > The problem comes from an IOException from list, which gets turned into a > null return. In this case, we do want the IOException to bubble so that we > can catch and log in getIndexSize. > Aside from that, if all we really care about is whether there are files, then > we could do a much more efficient check than requiring the whole directory to > be listed. This is especially problematic when the directory is large or > remote. 
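As a sketch of the "more efficient check" mentioned in the description, the snippet below tests whether a directory has any entry without listing all of them, and lets an IOException propagate instead of turning it into null. It uses the standard `java.nio.file` API; the class and method names are made up, and this is not the change that was committed (the commit above mentions using older Path APIs on branch_8x).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class DirectoryHasEntriesSketch {

  /**
   * Returns true if dir is a directory containing at least one entry.
   * Only the first entry is requested from the stream, and any IOException
   * raised while opening the directory bubbles up to the caller.
   */
  static boolean hasEntries(Path dir) throws IOException {
    if (!Files.isDirectory(dir)) {
      return false;
    }
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
      return stream.iterator().hasNext();
    }
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempDirectory("has-entries-sketch");
    System.out.println(hasEntries(tmp)); // directory was just created and is empty
    Files.createFile(tmp.resolve("segments_1"));
    System.out.println(hasEntries(tmp)); // now contains one file
  }
}
```

A caller such as getIndexSize can then catch and log the IOException itself, which is the behavior the description asks for.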
[GitHub] [lucene-solr] zacharymorn commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on a change in pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r533044192 ## File path: lucene/test-framework/src/java/org/apache/lucene/store/MockDirectoryWrapper.java ## @@ -745,7 +745,7 @@ public synchronized IndexInput openInput(String name, IOContext context) throws maybeThrowDeterministicException(); } if (!LuceneTestCase.slowFileExists(in, name)) { - throw randomState.nextBoolean() ? new FileNotFoundException(name + " in dir=" + in) : new NoSuchFileException(name + " in dir=" + in); + throw new NoSuchFileException(name + " in dir=" + in); Review comment: Sorry Dawid for the late reply, I took a week long vacation for Thanksgiving. Ah this is a great tip! Was waiting for about half an hour before for each `./gradlew check`. Thanks for the suggestion! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9627) Small refactor of codec classes
Ignacio Vera created LUCENE-9627: Summary: Small refactor of codec classes Key: LUCENE-9627 URL: https://issues.apache.org/jira/browse/LUCENE-9627 Project: Lucene - Core Issue Type: Wish Reporter: Ignacio Vera While working on LUCENE-9047, I had to refactor some classes in order to separate the code that opens a file and reads the header/footer from the code that reads the actual content of the file. Regardless of that issue, I think the refactor is a good thing. In addition, it seems Lucene50FieldInfosFormat is not used anywhere in the code, so I propose to remove it.
[GitHub] [lucene-solr] iverase opened a new pull request #2109: LUCENE-9627: Small refactor of codec classes
iverase opened a new pull request #2109: URL: https://github.com/apache/lucene-solr/pull/2109 Refactor of codec classes to separate reading header/footer from reading content of the file. In addition the `Lucene50FieldInfosFormat` class is removed as it is not used in the codebase. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org