[jira] [Commented] (SOLR-14768) Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
[ https://issues.apache.org/jira/browse/SOLR-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191220#comment-17191220 ]

Ishan Chattopadhyaya commented on SOLR-14768:

Thanks Joe & Markus! Patches welcome.

> Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
>
> Key: SOLR-14768
> URL: https://issues.apache.org/jira/browse/SOLR-14768
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: contrib - Solr Cell (Tika extraction)
> Affects Versions: 8.6, 8.6.1
> Reporter: Markus Kalkbrenner
> Priority: Major
> Attachments: Solr v8.6.x fails with multipart MIME in commands.eml
>
> See [https://www.mail-archive.com/solr-user@lucene.apache.org/msg152182.html]
> The integration tests of the solarium PHP library and the integration tests of the Search API Solr Drupal module both fail on PDF extraction if executed on Solr 8.6.
> They still work on Solr 8.5.1 and earlier versions.
> {quote}2020-08-20 12:30:35.279 INFO (qtp855700733-19) [ x:5f3e6ce2810ef] o.a.s.u.p.LogUpdateProcessorFactory [5f3e6ce2810ef] webapp=/solr path=/update/extract params=\{json.nl=flat&commitWithin=0&omitHeader=false&resource.name=testpdf.pdf&literal.id=extract-test&commit=true&extractOnly=false&uprefix=attr_&wt=json}{add=[extract-test (1675547519474466816)],commit=} 0 957
> solr8_1 | 2020-08-20 12:30:35.280 WARN (qtp855700733-19) [ ] o.e.j.s.HttpChannel /solr/5f3e6ce2810ef/update/extract => java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
> solr8_1 | at org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624)
> solr8_1 | java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
> solr8_1 | at org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) ~[?:?]
> solr8_1 | at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:443) ~[?:?]
> solr8_1 | at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) ~[?:?]
> solr8_1 | at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596) ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545) ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590) ~[jetty-security-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485) ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1215) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.InetAccessHandler.handle(
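A NoClassDefFoundError like the one in the trace above generally means a class that was present at compile time could not be loaded at runtime. As a rough diagnostic sketch only (not Solr code): the class name `ClasspathProbe` and the helper `isLoadable` are hypothetical, and the probe only reports what the class loader of the probe itself can see, which may differ from what Solr's web application class loader sees.

```java
public class ClasspathProbe {

    /** Returns true if the named class can be located by this class's own class loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: locate the class without running its static initializers
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised while resolving the class
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace; pass another name as the first argument
        String name = args.length > 0 ? args[0] : "org.eclipse.jetty.server.MultiParts";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

Running this with the same classpath as the failing server would be one way to confirm whether the Jetty class is actually missing from the jars on that classpath.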
[jira] [Commented] (SOLR-14829) Default components are missing facet_module and terms in documentation
[ https://issues.apache.org/jira/browse/SOLR-14829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191223#comment-17191223 ]

Johannes Baiter commented on SOLR-14829:

{quote}
Is it enough to merely define your search component with the name "highlight" in solrconfig.xml, thus overriding the built-in one? Then you don't even need to make any component changes to your handler definition.
{quote}
Unfortunately, no. The [custom highlighter|https://dbmdz.github.io/solr-ocrhighlighting/] is supposed to run alongside the regular highlighter, since it only takes care of highlighting a specific subset of fields in the response; all other fields go through the default highlighter.

> Default components are missing facet_module and terms in documentation
>
> Key: SOLR-14829
> URL: https://issues.apache.org/jira/browse/SOLR-14829
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: documentation, examples
> Affects Versions: 8.6.2
> Reporter: Johannes Baiter
> Assignee: Ishan Chattopadhyaya
> Priority: Minor
> Attachments: SOLR-14829.patch
>
> In the reference guide, the list of search components that are enabled by default is missing the {{facet_module}} and {{terms}} components. The terms component is instead listed under "other useful components", while the {{FacetModule}} is never listed anywhere in the documentation, despite it being necessary for the JSON Facet API to work.
> This is also how I stumbled upon this: I spent hours trying to figure out why JSON-based faceting was not working with my setup. After taking a glance at the {{SearchHandler}} source code based on a hunch, it became clear that my custom list of search components (created based on the list in the reference guide) was to blame.
> A patch for the documentation gap is attached, but I think there are some other issues with the naming/documentation around the two faceting APIs that may be worth discussing:
> * The names {{facet_module}} / {{FacetModule}} are very misleading, since the documentation always talks about the "JSON Facet API", but the term "JSON" does not appear in the name of the component, nor does the component have any documentation attached that mentions this
> * Why is the {{FacetModule}} class located in the {{search.facet}} package while every single other search component included in the core is located in the {{handler.component}} package?

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
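The failure mode described in the issue (a hand-written component list that silently drops defaults) can be illustrated with a small stand-alone check. This is only a sketch and does not use any Solr API: `ComponentListCheck` and `missingFrom` are hypothetical names, the two checked names are the ones the issue reports as undocumented, and the sample custom list is an assumed example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ComponentListCheck {

    // The two default component names the issue reports as missing from the documented list
    static final List<String> UNDOCUMENTED_DEFAULTS = Arrays.asList("facet_module", "terms");

    /** Returns the undocumented default components that a custom list leaves out. */
    static List<String> missingFrom(List<String> customComponents) {
        List<String> missing = new ArrayList<>();
        for (String name : UNDOCUMENTED_DEFAULTS) {
            if (!customComponents.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed example of a component list written by hand from the reference guide
        List<String> custom = Arrays.asList("query", "facet", "mlt", "highlight", "stats", "debug");
        System.out.println("Missing: " + missingFrom(custom)); // prints "Missing: [facet_module, terms]"
    }
}
```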
[jira] [Comment Edited] (SOLR-14829) Default components are missing facet_module and terms in documentation
[ https://issues.apache.org/jira/browse/SOLR-14829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191223#comment-17191223 ] Johannes Baiter edited comment on SOLR-14829 at 9/6/20, 7:38 AM: - {quote}Is it enough to merely define your search component with the name "highlight" in solrconfig.xml, thus overriding the built-in one? Then you don't even need to make any component changes to your handler definition. {quote} Unfortunately, no. The [custom highlighter|https://dbmdz.github.io/solr-ocrhighlighting/] is supposed to run alongside the regular highlighter, since it only takes care of highlighting a specific subset of fields in the response, all other fields go through the default highlighter. Maybe it would be a good idea to add a warning to the section about defining the {{components}} from scratch that this is a last resort (because of mistakes like mine, and also for future-proofing in case other new default components are added) and the other alternatives should be preferred? was (Author: jbaiter): {quote}Is it enough to merely define your search component with the name "highlight" in solrconfig.xml, thus overriding the built-in one? Then you don't even need to make any component changes to your handler definition. {quote} Unfortunately, no. The [custom highlighter|https://dbmdz.github.io/solr-ocrhighlighting/] is supposed to run alongside the regular highlighter, since it only takes care of highlighting a specific subset of fields in the response, all other fields go through the default highlighter. But maybe it would be a good idea to add a warning to the section about defining the {{components}} from scratch that this is a last resort (because of mistakes like mine, and also for future-proofing in case other new default components are added) and the other alternatives should be preferred? 
> Default components are missing facet_module and terms in documentation > -- > > Key: SOLR-14829 > URL: https://issues.apache.org/jira/browse/SOLR-14829 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation, examples >Affects Versions: 8.6.2 >Reporter: Johannes Baiter >Assignee: Ishan Chattopadhyaya >Priority: Minor > Attachments: SOLR-14829.patch > > > In the reference guide, the list of search components that are enabled by > default is missing the {{facet_module}} and {{terms}} components. The terms > component is instead listed under "other useful components", while the > {{FacetModule}} is never listed anywhere in the documentation, despite it > being neccessary for the JSON Facet API to work. > This is also how I stumbled upon this, I spent hours trying to figure out why > JSON-based faceting was not working with my setup, after taking a glance at > the {{SearchHandler}} source code based on a hunch, it became clear that my > custom list of search components (created based on the list in the reference > guide) was to blame. > A patch for the documentation gap is attached, but I think there are some > other issues with the naming/documentation around the two faceting APIs that > may be worth discussing: > * The names {{facet_module}} / {{FacetModule}} are very misleading, since > the documentation is always talking about the "JSON Facet API", but the term > "JSON" does not appear in the name of the component nor does the component > have any documentation attached that mentions this > * Why is the {{FacetModule}} class located in the {{search.facet}} package > while every single other search component included in the core is located in > the {{handler.component}} package? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-14829) Default components are missing facet_module and terms in documentation
[ https://issues.apache.org/jira/browse/SOLR-14829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191223#comment-17191223 ] Johannes Baiter edited comment on SOLR-14829 at 9/6/20, 7:38 AM: - {quote}Is it enough to merely define your search component with the name "highlight" in solrconfig.xml, thus overriding the built-in one? Then you don't even need to make any component changes to your handler definition. {quote} Unfortunately, no. The [custom highlighter|https://dbmdz.github.io/solr-ocrhighlighting/] is supposed to run alongside the regular highlighter, since it only takes care of highlighting a specific subset of fields in the response, all other fields go through the default highlighter. But maybe it would be a good idea to add a warning to the section about defining the {{components}} from scratch that this is a last resort (because of mistakes like mine, and also for future-proofing in case other new default components are added) and the other alternatives should be preferred? was (Author: jbaiter): {quote} Is it enough to merely define your search component with the name "highlight" in solrconfig.xml, thus overriding the built-in one? Then you don't even need to make any component changes to your handler definition. {quote} Unfortunately, no. The [custom highlighter|https://dbmdz.github.io/solr-ocrhighlighting/] is supposed to run alongside the regular highlighter, since it only takes care of highlighting a specific subset of fields in the response, all other fields go through the default highlighter. > Default components are missing facet_module and terms in documentation > -- > > Key: SOLR-14829 > URL: https://issues.apache.org/jira/browse/SOLR-14829 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. 
Issues are Public) > Components: documentation, examples >Affects Versions: 8.6.2 >Reporter: Johannes Baiter >Assignee: Ishan Chattopadhyaya >Priority: Minor > Attachments: SOLR-14829.patch > > > In the reference guide, the list of search components that are enabled by > default is missing the {{facet_module}} and {{terms}} components. The terms > component is instead listed under "other useful components", while the > {{FacetModule}} is never listed anywhere in the documentation, despite it > being neccessary for the JSON Facet API to work. > This is also how I stumbled upon this, I spent hours trying to figure out why > JSON-based faceting was not working with my setup, after taking a glance at > the {{SearchHandler}} source code based on a hunch, it became clear that my > custom list of search components (created based on the list in the reference > guide) was to blame. > A patch for the documentation gap is attached, but I think there are some > other issues with the naming/documentation around the two faceting APIs that > may be worth discussing: > * The names {{facet_module}} / {{FacetModule}} are very misleading, since > the documentation is always talking about the "JSON Facet API", but the term > "JSON" does not appear in the name of the component nor does the component > have any documentation attached that mentions this > * Why is the {{FacetModule}} class located in the {{search.facet}} package > while every single other search component included in the core is located in > the {{handler.component}} package? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9509) Refine lucene/BUILD.md (move Solr related contents to top-level README)
Tomoko Uchida created LUCENE-9509:

Summary: Refine lucene/BUILD.md (move Solr related contents to top-level README)
Key: LUCENE-9509
URL: https://issues.apache.org/jira/browse/LUCENE-9509
Project: Lucene - Core
Issue Type: Task
Reporter: Tomoko Uchida
Assignee: Tomoko Uchida

The current lucene/BUILD.md mixes in information for Solr developers, which is the wrong place for it. Solr-related information and tips should be moved to the top-level README (or somewhere else). The content could also be elaborated for new developers.
[GitHub] [lucene-solr] mocobeta opened a new pull request #1835: LUCENE-9509: Refine lucene/BUILD.md and top-level README (for newdevs)
mocobeta opened a new pull request #1835:
URL: https://github.com/apache/lucene-solr/pull/1835

The current lucene/BUILD.md mixes in information for Solr developers, which is the wrong place for it. Solr-related information and tips should be moved to the top-level README (or somewhere else). The content could also be elaborated for new developers (required JDK version, and so on).

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [lucene-solr] mocobeta commented on pull request #1835: LUCENE-9509: Refine lucene/BUILD.md and top-level README (for newdevs)
mocobeta commented on pull request #1835:
URL: https://github.com/apache/lucene-solr/pull/1835#issuecomment-687765986

You can see what the modified README and BUILD files look like on GitHub here: https://github.com/mocobeta/lucene-solr/tree/jira/LUCENE-9509
[jira] [Updated] (SOLR-14768) Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
[ https://issues.apache.org/jira/browse/SOLR-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Doupnik updated SOLR-14768: --- Attachment: Solr v8.6.x fails with multipart MIME in commands.eml Ishan and Markus, Patches. Alas, given the large untamed dense tropical jungle of Solr/Lucene's Java material that task is more than I can undertake myself. That isn't to say I have not explored and tried, but the details are far too diffuse and many to decode matters. As I write this I have yet another test build in the making, this with the problem file of SolrRequestParsers.java, section cleanupMultipartFiles(), replaced by a simple return in an attempt to avoid the not-found effect. This build will take quite a while to finish, assuming I gave the correct sequence of ant commands, and then test. Even with that the problem is not solved, just its use location has been verified. The Jetty material on the web, for example, discusses abandoning MultiParts in v9 to use a new, presumably faster, rendition in v10. What is apparent is presently there is a missing Java file, or equivalently a needed reference to it, to compose a working Solr. Finding such missing references is for those who are deeply immersed in those details. For easy reference I have attached the email version of my original report, which has details and screen captures. A suggestion. It is all very nice to have nearly a thousand minor Java "test" utilities, but the facilities end results must be tested for correct operation, and it is apparent that such overall testing is lacking presently. To help with that part my suggestion is get my crawler program (which exhibits the problem encountered by Markus and myself), strip it down to just basic core operations (create, delete, add a few files to) to obtain experimental results which are just the kind used in the field. 
Snippets of the PHP code are in my attached message, and the full material is available from https://netlab1.net, go to section "Presentations of long term utility", thence to "Solr/Lucene Search Service." Grab the offered SearchService2.1.tar.gz file which has PHP code and documentation, and put the crawler to work as a test tool. Shrinking down the crawler to just needed test essentials is easy, valuable and free, and can be part of the pre-release test and validation process. A second free suggestion is offer folks the _complete_ source tar ball, not just parts which go out on the web to fetch lots of other source files. Pretend the Internet does not exist. Those "other files" keep changing and hence are not thoroughly tested within the full Solr unit. We need the complete set which has been both tested and proven successful, not mixing the latest bits off someone's machine. Within reason, if not proven correct then not shipped. A last freebee. I see some individual in the project wants to abandon Tika and friends and instead just play with Lucene. That is an unacceptably narrow and regressive approach to the material, and it strongly works directly against users of Solr in the field. The entire package is what we need, not end users be players assembling and shuffling Java parts about willy nilly. Markus has commented on the large number of users of his Drupal interface material for Solr. The build process seems as if more hours need elapse to finish the "test" building component, so I will stop here. Oh dear, "ant package" failed, saying file lucene/common-build.xml, line 2331 failed (that line wants $(git.exe) for gosh sakes). I am building on SUSE Leap 15.2 Linux, not Windows. I appreciate that we are dealing with a complicated assemblage of material with constantly changing responsible people, and with limits on project resources. 
Thus both Markus and I are willing to assist as we can, but we cannot be expected to become deeply immersed in this dense material. That chore is for those responsible people. Can we therefore escalate matters to those folks and see if we can rectify the problem?
Thanks,
Joe D.

> Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
>
> Key: SOLR-14768
> URL: https://issues.apache.org/jira/browse/SOLR-14768
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: contrib - Solr Cell (Tika extraction)
> Affects Versions: 8.6, 8.6.1
> Reporter: Markus Kalkbrenner
> Priority: Major
> Attachments: Solr v8.6.x fails with multipart MIME in commands.eml, Solr v8.6.x fails with multipart MIME in commands.eml
>
> See [https://www.mail-archive.com/solr-user@lucene.apache.org/msg152182.html]
>
[GitHub] [lucene-solr] mocobeta opened a new pull request #1836: LUCENE-9317: Clean up split package in analyzers-common
mocobeta opened a new pull request #1836:
URL: https://github.com/apache/lucene-solr/pull/1836

This is a draft pull request for review that tries to clean up package name conflicts between `analyzers-common` and `core`. I also tried to keep the necessary changes as small as possible.

See https://issues.apache.org/jira/browse/LUCENE-9317 for more background.

The main changes are:
- Move analysis base classes to lucene-core (o.a.l.a) from analyzers-common (o.a.l.a.util)
- Rename all service provider files (META-INF/services/...).
- Move o.a.l.a.standard.StandardTokenizer to lucene-core
- Split o.a.l.a.standard in analyzers-common into o.a.l.a.classic and o.a.l.a.email

With the above changes, there are no package name conflicts:
- `o.a.l.a.util` and the newly created `o.a.l.a.classic` and `o.a.l.a.email` only exist in `analyzers-common`
- `o.a.l.a.standard` only exists in `lucene-core`
- other packages are not touched.

All Lucene/Solr main classes compile fine, thanks to the IDE's refactoring feature.

Tasks to be done:
- Create fake factory base classes in o.a.l.a.util for backward compatibility (?)
- Fix tests
- Fix gradle scripts (?)
[GitHub] [lucene-solr] mocobeta commented on pull request #1836: LUCENE-9317: Clean up split package in analyzers-common
mocobeta commented on pull request #1836:
URL: https://github.com/apache/lucene-solr/pull/1836#issuecomment-687784685

A note on the volume of the changes here: most of them were made automatically by the IDE.
[jira] [Commented] (LUCENE-9317) Resolve package name conflicts for StandardAnalyzer to allow Java module system support
[ https://issues.apache.org/jira/browse/LUCENE-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191272#comment-17191272 ] Tomoko Uchida commented on LUCENE-9317: --- I opened a PR for an early stage review: [https://github.com/apache/lucene-solr/pull/1836] The main part was done but there remain many follow-up tasks remain there (fixing tests etc.). I believe we already has reached a common ground, but do not want to proceed without reviews/permissions. [~uschindler] (or someone else who takes care of the issue) would you take a look at it when you have some time? > Resolve package name conflicts for StandardAnalyzer to allow Java module > system support > --- > > Key: LUCENE-9317 > URL: https://issues.apache.org/jira/browse/LUCENE-9317 > Project: Lucene - Core > Issue Type: Improvement > Components: core/other >Affects Versions: master (9.0) >Reporter: David Ryan >Priority: Major > Labels: build, features > Time Spent: 20m > Remaining Estimate: 0h > > > To allow Lucene to be modularised there are a few preparatory tasks to be > completed prior to this being possible. The Java module system requires that > jars do not use the same package name in different jars. The lucene-core and > lucene-analyzers-common both share the package > org.apache.lucene.analysis.standard. > Possible resolutions to this issue are discussed by Uwe on the mailing list > here: > > [http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3CCAM21Rt8FHOq_JeUSELhsQJH0uN0eKBgduBQX4fQKxbs49TLqzA%40mail.gmail.com%3E] > {quote}About StandardAnalyzer: Unfortunately I aggressively complained a > while back when Mike McCandless wanted to move standard analyzer out of the > analysis package into core (“for convenience”). This was a bad step, and IMHO > we should revert that or completely rename the packages and everything. 
The > problem here is: As the analysis services are only part of lucene-analyzers, > we had to leave the factory classes there, but move the implementation > classes in core. The package has to be the same. The only way around that is > to move the analysis factory framework also to core (I would not be against > that). This would include all factory base classes and the service loading > stuff. Then we can move standard analyzer and some of the filters/tokenizers > including their factories to core an that problem would be solved. > {quote} > There are two options here, either move factory framework into core or revert > StandardAnalyzer back to lucene-analyzers. In the email, the solution lands > on reverting back as per the task list: > {quote}Add some preparatory issues to cleanup class hierarchy: Move Analysis > SPI to core / remove StandardAnalyzer and related classes out of core back to > anaysis > {quote} > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
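The constraint described in the issue (no package may appear in more than one jar) can be checked mechanically once the package list of each jar is known. The following is a minimal sketch under stated assumptions: `SplitPackageCheck` and `findSplitPackages` are hypothetical names, and the package sets in `main` are a simplified illustration rather than the real contents of the Lucene jars.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class SplitPackageCheck {

    /** Maps each package that occurs in two or more jars to the sorted names of those jars. */
    static Map<String, Set<String>> findSplitPackages(Map<String, Set<String>> packagesByJar) {
        Map<String, Set<String>> jarsByPackage = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : packagesByJar.entrySet()) {
            for (String pkg : entry.getValue()) {
                jarsByPackage.computeIfAbsent(pkg, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        // Keep only packages owned by more than one jar
        jarsByPackage.values().removeIf(jars -> jars.size() < 2);
        return jarsByPackage;
    }

    public static void main(String[] args) {
        // Simplified, illustrative package lists (not the full jar contents)
        Map<String, Set<String>> jars = new HashMap<>();
        jars.put("lucene-core", new HashSet<>(Arrays.asList(
                "org.apache.lucene.analysis",
                "org.apache.lucene.analysis.standard")));
        jars.put("lucene-analyzers-common", new HashSet<>(Arrays.asList(
                "org.apache.lucene.analysis.standard",
                "org.apache.lucene.analysis.util")));

        // prints {org.apache.lucene.analysis.standard=[lucene-analyzers-common, lucene-core]}
        System.out.println(findSplitPackages(jars));
    }
}
```

In a real build the per-jar package sets would have to be collected from the jar entries themselves; that step is left out here to keep the sketch short.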
[jira] [Comment Edited] (LUCENE-9317) Resolve package name conflicts for StandardAnalyzer to allow Java module system support
[ https://issues.apache.org/jira/browse/LUCENE-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191272#comment-17191272 ] Tomoko Uchida edited comment on LUCENE-9317 at 9/6/20, 1:35 PM: I opened a PR for an early stage review: [https://github.com/apache/lucene-solr/pull/1836] The main part was almost done but many follow-up tasks remain there (fixing tests etc.). I believe we already has reached a common ground, but do not want to proceed without reviews/permissions. [~uschindler] (or someone else who takes care of the issue) would you take a look at it when you have some time? was (Author: tomoko uchida): I opened a PR for an early stage review: [https://github.com/apache/lucene-solr/pull/1836] The main part was done but there remain many follow-up tasks remain there (fixing tests etc.). I believe we already has reached a common ground, but do not want to proceed without reviews/permissions. [~uschindler] (or someone else who takes care of the issue) would you take a look at it when you have some time? > Resolve package name conflicts for StandardAnalyzer to allow Java module > system support > --- > > Key: LUCENE-9317 > URL: https://issues.apache.org/jira/browse/LUCENE-9317 > Project: Lucene - Core > Issue Type: Improvement > Components: core/other >Affects Versions: master (9.0) >Reporter: David Ryan >Priority: Major > Labels: build, features > Time Spent: 20m > Remaining Estimate: 0h > > > To allow Lucene to be modularised there are a few preparatory tasks to be > completed prior to this being possible. The Java module system requires that > jars do not use the same package name in different jars. The lucene-core and > lucene-analyzers-common both share the package > org.apache.lucene.analysis.standard. 
> Possible resolutions to this issue are discussed by Uwe on the mailing list > here: > > [http://mail-archives.apache.org/mod_mbox/lucene-dev/202004.mbox/%3CCAM21Rt8FHOq_JeUSELhsQJH0uN0eKBgduBQX4fQKxbs49TLqzA%40mail.gmail.com%3E] > {quote}About StandardAnalyzer: Unfortunately I aggressively complained a > while back when Mike McCandless wanted to move standard analyzer out of the > analysis package into core (“for convenience”). This was a bad step, and IMHO > we should revert that or completely rename the packages and everything. The > problem here is: As the analysis services are only part of lucene-analyzers, > we had to leave the factory classes there, but move the implementation > classes in core. The package has to be the same. The only way around that is > to move the analysis factory framework also to core (I would not be against > that). This would include all factory base classes and the service loading > stuff. Then we can move standard analyzer and some of the filters/tokenizers > including their factories to core an that problem would be solved. > {quote} > There are two options here, either move factory framework into core or revert > StandardAnalyzer back to lucene-analyzers. In the email, the solution lands > on reverting back as per the task list: > {quote}Add some preparatory issues to cleanup class hierarchy: Move Analysis > SPI to core / remove StandardAnalyzer and related classes out of core back to > anaysis > {quote} > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dnhatn merged pull request #1791: LUCENE-9482: Fix deletion count error message
dnhatn merged pull request #1791:
URL: https://github.com/apache/lucene-solr/pull/1791
[jira] [Commented] (LUCENE-9482) There is a problem with the description of "invalid deletion count" exception.
[ https://issues.apache.org/jira/browse/LUCENE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191297#comment-17191297 ] ASF subversion and git services commented on LUCENE-9482: - Commit 1606a7618712a4c5f9f7e92ce44e489e9d07993e in lucene-solr's branch refs/heads/master from LWY [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1606a76 ] LUCENE-9482: Fix deletion count error message > There is a problem with the description of "invalid deletion count" exception. > -- > > Key: LUCENE-9482 > URL: https://issues.apache.org/jira/browse/LUCENE-9482 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Affects Versions: 7.7.3 >Reporter: wenyao >Priority: Minor > Labels: ready-to-commit > Time Spent: 1h 10m > Remaining Estimate: 0h > > class: org.apache.lucene.index.SegmentInfos#376 > throw new CorruptIndexException("invalid deletion count: " + softDelCount + > delCount + " vs maxDoc=" + info.maxDoc(), input); > It needs to be changed to: > throw new CorruptIndexException("invalid deletion count: " + (softDelCount + > delCount) + " vs maxDoc=" + info.maxDoc(), input); // todo softDelCount + > delCount should merge -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
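For readers skimming the thread, the bug reported above comes down to Java's left-to-right evaluation of `+`: once the left operand is a String, each int is appended as text rather than added. Below is a minimal sketch of the difference. The class and method names are hypothetical stand-ins, not Lucene's SegmentInfos code, and the `input` argument of CorruptIndexException is left out.

```java
// Hypothetical stand-in for illustration only; this is not Lucene's SegmentInfos.
class DeletionCountMessage {

    // Form reported in the issue: '+' is left-associative, so the string absorbs
    // softDelCount first and then delCount, printing the two numbers side by side.
    static String concatenated(int softDelCount, int delCount, int maxDoc) {
        return "invalid deletion count: " + softDelCount + delCount + " vs maxDoc=" + maxDoc;
    }

    // Form proposed in the issue: the parentheses force integer addition
    // before the result is appended to the string.
    static String summed(int softDelCount, int delCount, int maxDoc) {
        return "invalid deletion count: " + (softDelCount + delCount) + " vs maxDoc=" + maxDoc;
    }

    public static void main(String[] args) {
        System.out.println(concatenated(3, 4, 5)); // invalid deletion count: 34 vs maxDoc=5
        System.out.println(summed(3, 4, 5));       // invalid deletion count: 7 vs maxDoc=5
    }
}
```

With soft deletes 3 and hard deletes 4, the unparenthesized message reports "34" where the intended total is 7, which is why the message was misleading even though the underlying check was not affected.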
[jira] [Commented] (LUCENE-9482) There is a problem with the description of "invalid deletion count" exception.
[ https://issues.apache.org/jira/browse/LUCENE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191310#comment-17191310 ] ASF subversion and git services commented on LUCENE-9482: - Commit bbacec07bf8c2c01c2c6a8cc97daefe4672ee0eb in lucene-solr's branch refs/heads/branch_8x from LWY [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=bbacec0 ] LUCENE-9482: Fix deletion count error message
[jira] [Commented] (LUCENE-9482) There is a problem with the description of "invalid deletion count" exception.
[ https://issues.apache.org/jira/browse/LUCENE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191311#comment-17191311 ] ASF subversion and git services commented on LUCENE-9482: - Commit 5df8e574b00962ca4f0b9ca509f91cecd0989b1b in lucene-solr's branch refs/heads/branch_8_6 from LWY [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5df8e57 ] LUCENE-9482: Fix deletion count error message
[jira] [Updated] (LUCENE-9482) There is a problem with the description of "invalid deletion count" exception.
[ https://issues.apache.org/jira/browse/LUCENE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nhat Nguyen updated LUCENE-9482: Fix Version/s: 8.6.3 8.7 master (9.0) Affects Version/s: (was: 7.7.3) Issue Type: Bug (was: Improvement) Labels: (was: ready-to-commit) Priority: Trivial (was: Minor) > There is a problem with the description of "invalid deletion count" exception. > -- > > Key: LUCENE-9482 > URL: https://issues.apache.org/jira/browse/LUCENE-9482 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Reporter: wenyao >Priority: Trivial > Fix For: master (9.0), 8.7, 8.6.3 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > class: org.apache.lucene.index.SegmentInfos#376 > throw new CorruptIndexException("invalid deletion count: " + softDelCount + > delCount + " vs maxDoc=" + info.maxDoc(), input); > It needs to be changed to: > throw new CorruptIndexException("invalid deletion count: " + (softDelCount + > delCount) + " vs maxDoc=" + info.maxDoc(), input); // todo softDelCount + > delCount should merge -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-9482) There is a problem with the description of "invalid deletion count" exception.
[ https://issues.apache.org/jira/browse/LUCENE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nhat Nguyen resolved LUCENE-9482. - Resolution: Fixed
[GitHub] [lucene-solr] mocobeta edited a comment on pull request #1836: LUCENE-9317: Clean up split package in analyzers-common
mocobeta edited a comment on pull request #1836: URL: https://github.com/apache/lucene-solr/pull/1836#issuecomment-687784685 A note on the volume of the changes here: most of them were made automatically by the IDE. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] murblanc commented on a change in pull request #1813: SOLR-14613: No new APIs. use the existing APIs
murblanc commented on a change in pull request #1813: URL: https://github.com/apache/lucene-solr/pull/1813#discussion_r484094742 ## File path: solr/core/src/java/org/apache/solr/cloud/api/collections/Assign.java ## @@ -551,11 +555,29 @@ public AssignStrategyFactory(SolrCloudManager solrCloudManager) { this.solrCloudManager = solrCloudManager; } +@SuppressWarnings("unchecked") public AssignStrategy create(ClusterState clusterState, CloudConfig cloudConfig, DocCollection collection) throws IOException, InterruptedException { @SuppressWarnings({"unchecked", "rawtypes"}) List ruleMaps = (List) collection.get("rule"); @SuppressWarnings({"rawtypes"}) List snitches = (List) collection.get(SNITCH); + Map props = solrCloudManager.getClusterStateProvider().getClusterProperties(); + Map assignInfo = (Map) props.get("assign-strategy"); Review comment: Do you have code (and curl) examples of how the `assign-strategy` property or similar ones are stored in `clusterprops.json`? As I'm switching my PR to not use `solr.xml` but instead use `clusterprops.json`, I not only need to retrieve the plugin class name if any, but also associated configuration values whose names/keys and types are unknown since they are plugin specific. If I copy the code as written here, I assume an iteration over `assignInfo` would reveal all configured values, want to make sure this makes sense. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
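The iteration asked about in the review comment above can be sketched with plain java.util maps. This is only an illustration under stated assumptions: it takes the cluster properties to be an already-parsed nested map, as the `Map props` line in the diff suggests, and it uses no Solr API. The class name, the `class` key, the `maxReplicasPerNode` key and the plugin class name are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; it does not use or mirror any Solr class.
class ClusterPropsSketch {

    // Returns a copy of the plugin-specific entries stored under "assign-strategy",
    // or an empty map when the section is absent or is not a map.
    @SuppressWarnings("unchecked")
    static Map<String, Object> pluginConfig(Map<String, Object> clusterProps) {
        Object section = clusterProps.get("assign-strategy");
        if (!(section instanceof Map)) {
            return new LinkedHashMap<>();
        }
        return new LinkedHashMap<>((Map<String, Object>) section);
    }

    public static void main(String[] args) {
        // Stand-in for what a parsed cluster-properties document might contain (assumed shape).
        Map<String, Object> assignInfo = new LinkedHashMap<>();
        assignInfo.put("class", "com.example.MyAssignStrategy"); // hypothetical plugin class
        assignInfo.put("maxReplicasPerNode", 3);                  // hypothetical plugin setting

        Map<String, Object> props = new LinkedHashMap<>();
        props.put("assign-strategy", assignInfo);

        // Keys and value types are unknown to the caller, so iterate and inspect each entry.
        for (Map.Entry<String, Object> e : pluginConfig(props).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue()
                + " (" + e.getValue().getClass().getSimpleName() + ")");
        }
    }
}
```

The `instanceof` guard is a design choice for the sketch: a plugin that finds no section, or a section of an unexpected type, gets an empty configuration instead of a ClassCastException.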
[jira] [Commented] (SOLR-14768) Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
[ https://issues.apache.org/jira/browse/SOLR-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191345#comment-17191345 ] Colvin Cowie commented on SOLR-14768: - Just to add this to the mix, while having look at this while running 8.6.2 from Eclipse with the Eclipse Jetty plugin, I hit a ClassCastException. I don't know what request prompted this to happen, but it looks like there's a couple of issues [here|https://github.com/apache/lucene-solr/commit/41b4bec51b6b2b083c5fb2170057e69693b2ff77#diff-c682d1eb01de9a7ab98f75ead0af9dd3R624], not just the java.lang.NoClassDefFoundError, [~dsmiley] {noformat} 2020-09-06 17:23:18.044:WARN:oejs.HttpChannel:qtp94345706-152: /solr/main_index/update java.lang.ClassCastException: org.eclipse.jetty.server.MultiParts$MultiPartsUtilParser cannot be cast to org.eclipse.jetty.server.MultiParts at org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:443) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188) at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1215) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.eclipse.jetty.server.Server.handle(Server.java:500) at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383) at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.lang.Thread.run(Thread.java:748){noformat} > Error 500 on PDF extraction: java.lang.NoClassDefFoundError: > org/eclipse/jetty/server/MultiParts > > > Key: SOLR-14768 > URL: https://issues.apache.org/jira/browse/SOLR-14768 > Project: Solr > Issue Type: Bug > Security 
Level: Public(Default Security Level. Issues are Public) > Components: contrib - Solr Cell (Tika extraction) >Affects Versions: 8.6, 8.6.1 >Reporter: Markus Kalkbrenner >Priority: Major > Attachments: Solr v8.6.x fails with multipart MIME in commands.eml, > Solr v8.6.x fails with multipart MIME in commands.eml > > > See [https://www.mail-archive.com/solr-user@lucene.apache.org/msg152182.html] > The integration tests of the solarium PHP l
[jira] [Updated] (SOLR-14768) Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
[ https://issues.apache.org/jira/browse/SOLR-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Doupnik updated SOLR-14768: --- Attachment: Solr v8.6.x fails with multipart MIME in commands.eml Colvin, Misery loves company dept. Today Ishan Chattopadhyaya (j...@apache.org) finally noticed our Jira entry and asked me for patches. I wrote back a long reply indicating that supplying a patch was beyond my means, though I have tried to solve the puzzle, and was in fact a chore for the code maintainers rather than end users. There are more comments in the Jetty mail archives about deprecating MultiParts and replacing the whole thing with new code in the v10 Jetty code (which is not out yet). You found even more of this. It does appear to me that someone either forgot to include a MultiParts file in Solr v8.6.x or messed up a ClassPath, or similar. Thus three editions of Solr have shipped with the same rather serious error. I have returned to Solr 8.5, which works correctly here. I recommended considering the use of my crawler program (https://netlab1.net, find Solr/Lucene Search Service) as one part of product pre-release testing, because it performs many common end-user-level tasks such as creating a core, deleting it, and adding files to it through the REST interface. Such tests exercise the whole program, not just small components as the thousand internal tests do. Clearly, such whole-product testing had not been successful. Just FYI, attached is my initial list message together with the screen capture graphics. Thanks, Joe D. > Error 500 on PDF extraction: java.lang.NoClassDefFoundError: > org/eclipse/jetty/server/MultiParts > > > Key: SOLR-14768 > URL: https://issues.apache.org/jira/browse/SOLR-14768 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. 
Issues are Public) > Components: contrib - Solr Cell (Tika extraction) >Affects Versions: 8.6, 8.6.1 >Reporter: Markus Kalkbrenner >Priority: Major > Attachments: Solr v8.6.x fails with multipart MIME in commands.eml, > Solr v8.6.x fails with multipart MIME in commands.eml, Solr v8.6.x fails with > multipart MIME in commands.eml > > > See [https://www.mail-archive.com/solr-user@lucene.apache.org/msg152182.html] > The integration tests of the solarium PHP library and the integration tests > of the Search API Solr Drupal module both fail on PDF extraction if executed > on Solr 8.6. > They still work on Solr 8.5.1 an earlier versions. > {quote}2020-08-20 12:30:35.279 INFO (qtp855700733-19) [ x:5f3e6ce2810ef] > o.a.s.u.p.LogUpdateProcessorFactory [5f3e6ce2810ef] webapp=/solr > path=/update/extract > params=\{json.nl=flat&commitWithin=0&omitHeader=false&resource.name=testpdf.pdf&literal.id=extract-test&commit=true&extractOnly=false&uprefix=attr_&wt=json}{add=[extract-test > (1675547519474466816)],commit=} 0 957 > solr8_1 | 2020-08-20 12:30:35.280 WARN (qtp855700733-19) [ ] > o.e.j.s.HttpChannel /solr/5f3e6ce2810ef/update/extract => > java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > solr8_1 | java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > ~[?:?] > solr8_1 | at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:443) > ~[?:?] > solr8_1 | at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) > ~[?:?] 
> solr8_1 | at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596) > ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545) > ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590) > ~[jetty-security-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.Scope
[jira] [Created] (SOLR-14835) Solr 8.6.x log starts with "XmlConfiguration Ignored arg" warning from Jetty
Colvin Cowie created SOLR-14835: --- Summary: Solr 8.6.x log starts with "XmlConfiguration Ignored arg" warning from Jetty Key: SOLR-14835 URL: https://issues.apache.org/jira/browse/SOLR-14835 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Affects Versions: 8.6.2 Reporter: Colvin Cowie After moving to 8.6.2 the first lines of the solr.log are {noformat} 2020-09-06 18:19:09.164 INFO (main) [ ] o.e.j.u.log Logging initialized @1197ms to org.eclipse.jetty.util.log.Slf4jLog 2020-09-06 18:19:09.226 WARN (main) [ ] o.e.j.u.l.o.e.j.x.XmlConfiguration Ignored arg: solr.jetty {noformat} This config is declared here: https://github.com/apache/lucene-solr/blob/5154b6008f54c9d096f5efe9ae347492c23dd780/solr/server/etc/jetty.xml#L33 and has been there for a long time, so I assume it's the bump in Jetty version that's causing it now. I'm seeing this in 8.6.2, but I've not gone back to check other versions -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9507) Custom order for leaves in DirectoryReader, IndexWriter and searcher
[ https://issues.apache.org/jira/browse/LUCENE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191363#comment-17191363 ] Michael Sokolov commented on LUCENE-9507: - Jim, this sounds like it would be very useful. I'm curious to see how you will propose to balance sorting with other merge policy considerations. Are you thinking that the sorting criterion would influence the choice of segments in some way, or just that it would be used to sort whichever segments are chosen when merging? > Custom order for leaves in DirectoryReader, IndexWriter and searcher > > > Key: LUCENE-9507 > URL: https://issues.apache.org/jira/browse/LUCENE-9507 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Jim Ferenczi >Priority: Minor > > Now that we're able [to skip documents efficiently when sorting by a numeric > field|https://issues.apache.org/jira/browse/LUCENE-9280], I was wondering if > we could optimize sorted queries further by also sorting the leaf readers > based on the primary sort. > For time-based indices in Elasticsearch, we've implemented an optimization > that does that at query time. If the query is sorted by a numeric docvalue > field, prior to search, we sort the leaves according to the query sort. When > sorting by timestamp this small optimization can have a big impact since > early termination can be reached much faster if the sort values in the > segments don't overlap too much. Applying this optimization at query time is > challenging: it has the benefit of working on any numeric field sort and order, > but it requires using a multi-reader that will reorganize the segments. It > can also be surprising that after a force merge to 1 segment, sorted queries > may be slower since there is nothing to sort anymore. > So, another option that I am looking at is to add the ability to provide a leaf > order directly in the IndexWriter and DirectoryReader. 
That could be similar > to an index sort or even complementary to it since sorting segments based on > the index sort could also help at query time. For time-based indices that > cannot afford index sorting but have lots of sorted queries on timestamp, > forcing the order of segments could speed up sorted queries significantly. > The advantage of forcing a single leaf sort in the writer/reader is that we > can also use it to influence the merges by putting the segments with the > highest value first. That would help with the case of indices that are merged > to a single segment but would like to keep the sorted queries fast but also > for the multi-segments case since big segments would have more chance to have > highest values first too. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
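To make the leaf-ordering idea in the description above concrete, here is a toy model in plain Java. It is a sketch under stated assumptions, not Lucene code: `Leaf` is a hypothetical stand-in for a segment that only holds timestamps, and the "query" collects the top k newest timestamps, skipping any leaf whose maximum cannot beat the current k-th best hit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Toy model for illustration; none of these types exist in Lucene.
class LeafOrderSketch {

    // Hypothetical stand-in for a segment (leaf): its timestamps and their maximum.
    static final class Leaf {
        final long[] timestamps;
        final long max;

        Leaf(long... timestamps) {
            this.timestamps = timestamps;
            long m = Long.MIN_VALUE;
            for (long t : timestamps) {
                m = Math.max(m, t);
            }
            this.max = m;
        }
    }

    // Collects the k newest timestamps, visiting leaves in the given order, and
    // returns how many leaves actually had to be scanned.
    static int leavesVisited(List<Leaf> leaves, int k) {
        // Min-heap: the head is the weakest of the current top-k hits.
        PriorityQueue<Long> topK = new PriorityQueue<>();
        int visited = 0;
        for (Leaf leaf : leaves) {
            // Once k hits are collected, a leaf whose newest value is not newer
            // than the weakest hit cannot contribute and is skipped entirely.
            if (topK.size() == k && leaf.max <= topK.peek()) {
                continue;
            }
            visited++;
            for (long t : leaf.timestamps) {
                if (topK.size() < k) {
                    topK.add(t);
                } else if (t > topK.peek()) {
                    topK.poll();
                    topK.add(t);
                }
            }
        }
        return visited;
    }

    // Returns a copy of the leaves ordered by their maximum timestamp, newest first.
    static List<Leaf> sortedByMaxDescending(List<Leaf> leaves) {
        List<Leaf> copy = new ArrayList<>(leaves);
        copy.sort(Comparator.comparingLong((Leaf l) -> l.max).reversed());
        return copy;
    }

    public static void main(String[] args) {
        List<Leaf> leaves = Arrays.asList(
            new Leaf(1L, 2L, 3L), new Leaf(4L, 5L, 6L), new Leaf(7L, 8L, 9L));
        System.out.println(leavesVisited(leaves, 2));                        // 3
        System.out.println(leavesVisited(sortedByMaxDescending(leaves), 2)); // 1
    }
}
```

With non-overlapping leaves, visiting the newest leaf first lets the other two be skipped, which is the effect described for time-based indices. When the leaves' value ranges overlap heavily, the skip condition rarely fires and the ordering helps much less.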
[jira] [Created] (SOLR-14836) Exclude README.committers.txt from distribution
Dawid Weiss created SOLR-14836: -- Summary: Exclude README.committers.txt from distribution Key: SOLR-14836 URL: https://issues.apache.org/jira/browse/SOLR-14836 Project: Solr Issue Type: Task Security Level: Public (Default Security Level. Issues are Public) Reporter: Dawid Weiss -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14836) Exclude README.committers.txt from distribution
[ https://issues.apache.org/jira/browse/SOLR-14836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191371#comment-17191371 ] ASF subversion and git services commented on SOLR-14836: Commit 8a1644779b50c4e21f2531a2a7a0cd01b3fb77d8 in lucene-solr's branch refs/heads/master from Dawid Weiss [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8a16447 ] SOLR-14836: Exclude README.committers.txt from distribution > Exclude README.committers.txt from distribution > --- > > Key: SOLR-14836 > URL: https://issues.apache.org/jira/browse/SOLR-14836 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Dawid Weiss >Priority: Trivial > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-14836) Exclude README.committers.txt from distribution
[ https://issues.apache.org/jira/browse/SOLR-14836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved SOLR-14836. Resolution: Fixed
[jira] [Commented] (LUCENE-9475) Enhance the Gradle build as necessary after removing Ant support
[ https://issues.apache.org/jira/browse/LUCENE-9475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191372#comment-17191372 ] ASF subversion and git services commented on LUCENE-9475: - Commit e3437a467ecac9d10ef7cbd4ec9774e6043a1055 in lucene-solr's branch refs/heads/master from Dawid Weiss [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e3437a4 ] LUCENE-9475: remove obsolete ant-only jar sha's from Solr. > Enhance the Gradle build as necessary after removing Ant support > > > Key: LUCENE-9475 > URL: https://issues.apache.org/jira/browse/LUCENE-9475 > Project: Lucene - Core > Issue Type: Improvement > Components: general/build >Affects Versions: master (9.0) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Time Spent: 2.5h > Remaining Estimate: 0h > > Once the bulk of the Ant build system is removed, stuff will come bubbling up > out of the cracks, especially as we try the first 9.0 release which will be > Gradle only. Here we list some of the areas we'll have to be aware of. Please > add as you see fit. Assigning to myself to track, but I certainly don't want to > hog all the fun. > * Remove Maven support and replace with "The Gradle Way" of doing Maven. See > LUCENE-9077 (Dawid) > ** Remove all of dev-tools/maven? > ** Other dev-tools files no longer used, check if any Gradle build file > references and remove if not. > * -Move Jenkins over to use Gradle only- > * -Verify reference guide build works under Gradle- > * Smoke tester > * Remove anything having to do with Clover (obsolete as of Java 11) > * -Remove all of {{lucene/tools}} (Ivy, forbiddenapis,...)- > * Remove obsolete files in root dirs of lucene and solr (like > version.properties, now integrated into gradle) > * Remove Maven build files (obsolete with Gradle) > * Hoss's test rollups? > * Enable javadocs after ant stops being used (LUCENE-9441) > * fix some relative links in javadocs which contain ant module names (?) 
> * dev-tools/scripts/* There are a lot of mentions of ant in the *.py* files, > and some in the README.md. This one should probably be its own JIRA since > it'll require quite a bit of verification... > * -Make "the best damn beasting script in the world" work with the Gradle > build.- (see LUCENE-9465, LUCENE-9472 for alternatives) > * Update the release documentation to reflect Gradle (LUCENE-9488) > * Clean up anything in lucene/tools > * Clean up Confluence, in particular any page that mentions IDEs. The "How > to Contribute" page has several links to various bits and pieces of how to > use IDEs, and some mention ant. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9510) SortingStoredFieldsConsumer should use a format that has better random-access
Adrien Grand created LUCENE-9510: Summary: SortingStoredFieldsConsumer should use a format that has better random-access Key: LUCENE-9510 URL: https://issues.apache.org/jira/browse/LUCENE-9510 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand We noticed some indexing rate regressions in Elasticsearch after upgrading to a new Lucene snapshot. This is due to the fact that SortingStoredFieldsConsumer is using the default codec to write stored fields on flush. Compression doesn't matter much for this case since these are temporary files that get removed on flush after the segment is sorted anyway so we could switch to a format that has faster random access. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9509) Refine lucene/BUILD.md (move Solr related contents to top-level README)
[ https://issues.apache.org/jira/browse/LUCENE-9509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191377#comment-17191377 ] Erick Erickson commented on LUCENE-9509: These look good to me. As chance would have it, I'm modifying solr/README.md, but I don't think that conflicts at all with this work. > Refine lucene/BUILD.md (move Solr related contents to top-level README) > --- > > Key: LUCENE-9509 > URL: https://issues.apache.org/jira/browse/LUCENE-9509 > Project: Lucene - Core > Issue Type: Task >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > > The current lucene/BUILD.md mixes in information for Solr > developers; this is the wrong place for it. Solr-related information/tips > should be moved to the top-level README (or somewhere else). > Also, the content could be elaborated for new developers. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14768) Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
[ https://issues.apache.org/jira/browse/SOLR-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191378#comment-17191378 ] Ishan Chattopadhyaya commented on SOLR-14768: - bq. Today Ishan Chattopadhyaya (j...@apache.org) finally noticed our Jira entry and asked me for patches. I didn't exactly ask you, but expressed the general sentiment that this is an area where contributions will be appreciated. bq. A second free suggestion bq. A last freebee Thanks for your "free" suggestions to improve a free (and open source) project. bq. in fact a chore for the code maintainers rather than end users. There bq. we cannot be expected to become deeply immersed in this dense material. That chore for those responsible people. Thus can we escalate matters to those folks and see if we can rectify the problem. Please keep in mind that Apache Solr is a search engine, and all committers for the project are volunteers (like you) working as hard as possible to improve the project. Part of the reason why we wish to deprecate Tika/Solr Cell/Extraction is that this is one of the non-essential functionalities for our project which has not received the level and quality of support that we strive hard to provide for the rest of the project. Thank you for your suggestions, they make sense. We realize the importance of this functionality and also wish to continue supporting this functionality via non-core packages (either official Solr packages or community supported packages) going forward. Having said that, we also wish to help resolve all outstanding issues (like this) that have been caused inadvertently. FYI David, [~tallison]. > Error 500 on PDF extraction: java.lang.NoClassDefFoundError: > org/eclipse/jetty/server/MultiParts > > > Key: SOLR-14768 > URL: https://issues.apache.org/jira/browse/SOLR-14768 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. 
Issues are Public) > Components: contrib - Solr Cell (Tika extraction) >Affects Versions: 8.6, 8.6.1 >Reporter: Markus Kalkbrenner >Priority: Major > Attachments: Solr v8.6.x fails with multipart MIME in commands.eml, > Solr v8.6.x fails with multipart MIME in commands.eml, Solr v8.6.x fails with > multipart MIME in commands.eml > > > See [https://www.mail-archive.com/solr-user@lucene.apache.org/msg152182.html] > The integration tests of the solarium PHP library and the integration tests > of the Search API Solr Drupal module both fail on PDF extraction if executed > on Solr 8.6. > They still work on Solr 8.5.1 an earlier versions. > {quote}2020-08-20 12:30:35.279 INFO (qtp855700733-19) [ x:5f3e6ce2810ef] > o.a.s.u.p.LogUpdateProcessorFactory [5f3e6ce2810ef] webapp=/solr > path=/update/extract > params=\{json.nl=flat&commitWithin=0&omitHeader=false&resource.name=testpdf.pdf&literal.id=extract-test&commit=true&extractOnly=false&uprefix=attr_&wt=json}{add=[extract-test > (1675547519474466816)],commit=} 0 957 > solr8_1 | 2020-08-20 12:30:35.280 WARN (qtp855700733-19) [ ] > o.e.j.s.HttpChannel /solr/5f3e6ce2810ef/update/extract => > java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > solr8_1 | java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > ~[?:?] > solr8_1 | at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:443) > ~[?:?] > solr8_1 | at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) > ~[?:?] 
> solr8_1 | at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596) > ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545) > ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590) > ~[jetty-security-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.jav
[jira] [Commented] (LUCENE-9475) Enhance the Gradle build as necessary after removing Ant support
[ https://issues.apache.org/jira/browse/LUCENE-9475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191379#comment-17191379 ]

ASF subversion and git services commented on LUCENE-9475: -

Commit 8c5ce090dd08df4a196ff67811fea13a1fe6691c in lucene-solr's branch refs/heads/master from Erick Erickson [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8c5ce090 ]

LUCENE-9475: Enhance the Gradle build as necessary after removing Ant support, some minor text changes to solr/README.md and help.gradle

> Enhance the Gradle build as necessary after removing Ant support
>
> Key: LUCENE-9475
> URL: https://issues.apache.org/jira/browse/LUCENE-9475
> Project: Lucene - Core
> Issue Type: Improvement
> Components: general/build
> Affects Versions: master (9.0)
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Priority: Major
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> Once the bulk of the Ant build system is removed, stuff will come bubbling up out of the cracks, especially as we try the first 9.0 release, which will be Gradle only. Here we list some of the areas we'll have to be aware of. Please add as you see fit. Assigning to myself to track, but I certainly don't want to hog all the fun.
> * Remove Maven support and replace with "The Gradle Way" of doing Maven. See LUCENE-9077 (Dawid)
> ** Remove all of dev-tools/maven?
> ** Other dev-tools files no longer used: check whether any Gradle build file references them and remove them if not.
> * -Move Jenkins over to use Gradle only-
> * -Verify reference guide build works under Gradle-
> * Smoke tester
> * Remove anything having to do with Clover (obsolete as of Java 11)
> * -Remove all of {{lucene/tools}} (Ivy, forbiddenapis, ...)-
> * Remove obsolete files in root dirs of lucene and solr (like version.properties, now integrated into gradle)
> * Remove Maven build files (obsolete with Gradle)
> * Hoss's test rollups?
> * Enable javadocs after ant stops being used (LUCENE-9441)
> * Fix some relative links in javadocs which contain ant module names (?)
> * dev-tools/scripts/*: There are a lot of mentions of ant in the *.py* files, and some in the README.md. This one should probably be its own JIRA since it'll require quite a bit of verification...
> * -Make "the best damn beasting script in the world" work with the Gradle build.- (see LUCENE-9465, LUCENE-9472 for alternatives)
> * Update the release documentation to reflect Gradle (LUCENE-9488)
> * Clean up anything in lucene/tools
> * Clean up Confluence, in particular any page that mentions IDEs. The "How to Contribute" page has several links to various bits and pieces of how to use IDEs, and some mention ant.

-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] uschindler opened a new pull request #1837: LUCENE-7882: First idea of using Java 15 hidden anonymous classes for Lucene expressions
uschindler opened a new pull request #1837: URL: https://github.com/apache/lucene-solr/pull/1837

This PR is a first (but already working) idea of using Java 15 hidden classes (see https://openjdk.java.net/jeps/371) to implement the Lucene expressions.

The big advantages:
- No classloader is needed for every expression, because the class is completely anonymous and has no strong reference to a classloader. The class does have a classloader for looking up any other classes it references, but it is NOT loaded by any classloader.
- The class can easily be unloaded.
- The performance of loading that class is better, as no locks can occur (classloaders have locks when looking up classes), see https://issues.apache.org/jira/browse/LUCENE-7882

Downsides:
- The class and its methods do not appear in any stack trace. This also makes one test in the test suite fail. When using this for Lucene expressions, we have to think about a better way to make the source code and method calls visible in stack traces. Like lambda frames, the hidden class is not visible (unless the JVM is configured to show hidden frames).
- It currently does not work if you pass a different classloader than Lucene's to the expressions module. To allow this we need to change APIs a bit (Classloader -> Lookup).

@mikemccand can you test this with JDK 15 (release candidate) and your test? You should not see any locks anymore, speed should be higher, and the created anonymous classes should be unloaded very fast.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
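For readers unfamiliar with JEP 371, here is a minimal sketch of what defining a hidden class looks like on JDK 15 or later. It is not the code from the PR: the names `HiddenClassSketch` and `Template` are made up for illustration, and instead of generating bytecode with ASM (as the expressions compiler does) it simply reuses the bytes of an already compiled nested class as a stand-in. It assumes the class files are available as resources, i.e. the code was compiled with javac first.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup;
import java.lang.invoke.MethodType;

public class HiddenClassSketch {

  /** Hypothetical stand-in for generated bytecode; mirrors the expression from LUCENE-7882. */
  public static class Template {
    public static double evaluate(double score, double unitSales) {
      return score + Math.log(1000 + unitSales);
    }
  }

  /** Reads Template's class file and defines a fresh hidden class from those bytes (JDK 15+). */
  public static Lookup defineHidden() {
    String resource = Template.class.getName().replace('.', '/') + ".class";
    try (InputStream in = Template.class.getClassLoader().getResourceAsStream(resource)) {
      byte[] bytecode = in.readAllBytes();
      // The lookup must have full privilege access and share the package of the class in the bytes.
      return MethodHandles.lookup().defineHiddenClass(bytecode, true);
    } catch (IOException | IllegalAccessException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Calls the static evaluate method of the hidden class through the Lookup returned above. */
  public static double evaluate(Lookup hidden, double score, double unitSales) {
    try {
      MethodHandle mh = hidden.findStatic(hidden.lookupClass(), "evaluate",
          MethodType.methodType(double.class, double.class, double.class));
      return (double) mh.invokeExact(score, unitSales);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }

  public static void main(String[] args) {
    Lookup hidden = defineHidden();
    System.out.println("hidden: " + hidden.lookupClass().isHidden());
    System.out.println("value:  " + evaluate(hidden, 1.0, 0.0));
  }
}
```

Each call to `defineHidden()` yields a distinct class that no classloader can find by name, which is the property the PR relies on for unloading.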
[jira] [Assigned] (LUCENE-7882) Maybe expression compiler should cache recently compiled expressions?
[ https://issues.apache.org/jira/browse/LUCENE-7882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-7882: - Assignee: Uwe Schindler > Maybe expression compiler should cache recently compiled expressions? > - > > Key: LUCENE-7882 > URL: https://issues.apache.org/jira/browse/LUCENE-7882 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/expressions >Reporter: Michael McCandless >Assignee: Uwe Schindler >Priority: Major > Attachments: demo.patch > > Time Spent: 10m > Remaining Estimate: 0h > > I've been running search performance tests using a simple expression > ({{_score + ln(1000+unit_sales)}}) for sorting and hit this odd bottleneck: > {noformat} > "pool-1-thread-30" #70 prio=5 os_prio=0 tid=0x7eea7000a000 nid=0x1ea8a > waiting for monitor entry [0x7eea867dd000] >java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.lucene.expressions.js.JavascriptCompiler$CompiledExpression.evaluate(_score > + ln(1000+unit_sales)) > at > org.apache.lucene.expressions.ExpressionFunctionValues.doubleValue(ExpressionFunctionValues.java:49) > at > com.amazon.lucene.OrderedVELeafCollector.collectInternal(OrderedVELeafCollector.java:123) > at > com.amazon.lucene.OrderedVELeafCollector.collect(OrderedVELeafCollector.java:108) > at > org.apache.lucene.search.MultiCollectorManager$Collectors$LeafCollectors.collect(MultiCollectorManager.java:102) > at > org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:241) > at > org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:184) > at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39) > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:658) > at org.apache.lucene.search.IndexSearcher$5.call(IndexSearcher.java:600) > at org.apache.lucene.search.IndexSearcher$5.call(IndexSearcher.java:597) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > I couldn't see any {{synchronized}} in the sources here, so I'm not sure > which object monitor it's blocked on. > I was accidentally compiling a new expression for every query, and that > bottleneck would cause overall QPS to slow down drastically (~4X slower after > ~1 hour of redline tests), as if the JVM is getting slower and slower to > evaluate each expression the more expressions I had compiled. > I tested JDK 9-ea and it also kept slowing down over time as the performance > test ran. > Maybe we should put a small cache in front of the expressions compiler to > make it less trappy? Or maybe we can get to the root cause of why the JVM > slows down more and more, the more expressions you compile? > I won't have time to work on this in the near future so if anyone else feels > the itch, please scratch it! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
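The "small cache in front of the expressions compiler" suggested in the description could look roughly like the following. This is only a sketch under stated assumptions: `CompiledExpressionCache` is a hypothetical name, the compiler is passed in as a plain `Function<String, V>` rather than calling Lucene's `JavascriptCompiler`, and a synchronized access-ordered `LinkedHashMap` is just one simple way to get LRU eviction.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical LRU cache keyed by expression source text. */
public final class CompiledExpressionCache<V> {
  private final Function<String, V> compiler;
  private final Map<String, V> cache;

  public CompiledExpressionCache(int maxEntries, Function<String, V> compiler) {
    this.compiler = compiler;
    // accessOrder=true makes iteration order least-recently-used first.
    this.cache = new LinkedHashMap<String, V>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        return size() > maxEntries;
      }
    };
  }

  /** Returns the cached value for this source, compiling it only on a miss. */
  public synchronized V get(String source) {
    V compiled = cache.get(source);
    if (compiled == null) {
      compiled = compiler.apply(source);
      cache.put(source, compiled);
    }
    return compiled;
  }

  public synchronized int size() {
    return cache.size();
  }
}
```

With such a wrapper, a caller that accidentally asks for the same expression on every query would trigger only one compilation per distinct source string while it stays in the cache.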
[jira] [Commented] (LUCENE-7882) Maybe expression compiler should cache recently compiled expressions?
[ https://issues.apache.org/jira/browse/LUCENE-7882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191381#comment-17191381 ] Uwe Schindler commented on LUCENE-7882: --- Hi Mike, Java 15 now has the "solutions" for the problems you are seeing. We can now use "hidden classes" ([JEP-371|https://openjdk.java.net/jeps/371]) for this. Performance should be much better and pressure on GC lower. Here is a first PR: https://github.com/apache/lucene-solr/pull/1837 Backside: We don't see the expression in stack frames anymore. I have no idea how to solve this, we may need to make the source code available in a different way on errors. The attached PR does not pass all tests because of missing stack frames, but the expressions module works. Could you test the attached PR with JDK 15 and your test code that had the problems? > Maybe expression compiler should cache recently compiled expressions? > - > > Key: LUCENE-7882 > URL: https://issues.apache.org/jira/browse/LUCENE-7882 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/expressions >Reporter: Michael McCandless >Assignee: Uwe Schindler >Priority: Major > Attachments: demo.patch > > Time Spent: 10m > Remaining Estimate: 0h > > I've been running search performance tests using a simple expression > ({{_score + ln(1000+unit_sales)}}) for sorting and hit this odd bottleneck: > {noformat} > "pool-1-thread-30" #70 prio=5 os_prio=0 tid=0x7eea7000a000 nid=0x1ea8a > waiting for monitor entry [0x7eea867dd000] >java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.lucene.expressions.js.JavascriptCompiler$CompiledExpression.evaluate(_score > + ln(1000+unit_sales)) > at > org.apache.lucene.expressions.ExpressionFunctionValues.doubleValue(ExpressionFunctionValues.java:49) > at > com.amazon.lucene.OrderedVELeafCollector.collectInternal(OrderedVELeafCollector.java:123) > at > com.amazon.lucene.OrderedVELeafCollector.collect(OrderedVELeafCollector.java:108) > 
at > org.apache.lucene.search.MultiCollectorManager$Collectors$LeafCollectors.collect(MultiCollectorManager.java:102) > at > org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:241) > at > org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:184) > at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39) > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:658) > at org.apache.lucene.search.IndexSearcher$5.call(IndexSearcher.java:600) > at org.apache.lucene.search.IndexSearcher$5.call(IndexSearcher.java:597) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > I couldn't see any {{synchronized}} in the sources here, so I'm not sure > which object monitor it's blocked on. > I was accidentally compiling a new expression for every query, and that > bottleneck would cause overall QPS to slow down drastically (~4X slower after > ~1 hour of redline tests), as if the JVM is getting slower and slower to > evaluate each expression the more expressions I had compiled. > I tested JDK 9-ea and it also kept slowing down over time as the performance > test ran. > Maybe we should put a small cache in front of the expressions compiler to > make it less trappy? Or maybe we can get to the root cause of why the JVM > slows down more and more, the more expressions you compile? > I won't have time to work on this in the near future so if anyone else feels > the itch, please scratch it! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] uschindler commented on a change in pull request #1837: LUCENE-7882: First idea of using Java 15 hidden anonymous classes for Lucene expressions
uschindler commented on a change in pull request #1837: URL: https://github.com/apache/lucene-solr/pull/1837#discussion_r484117905

## File path: lucene/expressions/src/java/org/apache/lucene/expressions/js/JavascriptCompiler.java ##

@@ -85,7 +88,33 @@
     }
   }

-  private static final int CLASSFILE_VERSION = Opcodes.V1_8;
+  /** Method handle to invoke Java 15's way to define hidden classes.
+   * The method handle uses a private lookup of {@link JavascriptCompiler} to define
+   * classes (this ensures we can create classes in our package without extra permissions).
+   * The classes are initialized and have no strong relationship to our classloader.
+   * This ensures they can be unloaded.
+   * Signature of MH: {@code static Lookup defineHiddenClass(byte[] bc)}
+   * The MH is {@code null} if an earlier JDK is used.
+   * @see "https://openjdk.java.net/jeps/371"
+   */
+  private static final MethodHandle MH_defineHiddenClass;
+  static {
+    final Lookup publicLookup = MethodHandles.publicLookup();
+    MethodHandle mh;
+    try {
+      final Object emptyOptions = Array.newInstance(
+          publicLookup.findClass(Lookup.class.getName().concat("$ClassOption")), 0);
+      mh = publicLookup.findVirtual(Lookup.class, "defineHiddenClass",
+          MethodType.methodType(Lookup.class, byte[].class, boolean.class, emptyOptions.getClass()));
+      mh = mh.bindTo(MethodHandles.lookup()); // private lookup of JavascriptCompiler!
+      mh = MethodHandles.insertArguments(mh.asFixedArity(), 1, true, emptyOptions);
+    } catch (ReflectiveOperationException e) {
+      mh = null;
+    }
+    MH_defineHiddenClass = mh;
+  }
+
+  private static final int CLASSFILE_VERSION = Opcodes.V11;

Review comment: Actually this should have been done on master already! We are on Java 11, so classfile format should be Java 11, too.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
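The static initializer in the diff above follows a common pattern: look a method up by name at class-load time, fall back to `null` when the running JDK does not have it, and pre-bind the receiver. A minimal, self-contained sketch of that pattern is shown below. The class `OptionalHandleSketch` and its methods are hypothetical, and `String.concat` is used only because it exists on every JDK; it stands in for `Lookup.defineHiddenClass`.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public final class OptionalHandleSketch {

  /** Returns a handle to owner.name(type), or null if this JDK has no such accessible method. */
  public static MethodHandle findOptionalVirtual(Class<?> owner, String name, MethodType type) {
    try {
      return MethodHandles.publicLookup().findVirtual(owner, name, type);
    } catch (ReflectiveOperationException e) {
      // NoSuchMethodException and IllegalAccessException both land here.
      return null;
    }
  }

  /** Uses the optional handle when present, and a plain fallback otherwise. */
  public static String greet(String name) {
    MethodHandle concat = findOptionalVirtual(
        String.class, "concat", MethodType.methodType(String.class, String.class));
    if (concat == null) {
      return "Hello, " + name;
    }
    // bindTo fixes the receiver, turning (String, String)String into (String)String.
    MethodHandle hello = concat.bindTo("Hello, ");
    try {
      return (String) hello.invokeExact(name);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }

  public static void main(String[] args) {
    System.out.println(greet("Mike"));
  }
}
```

Resolving the handle once and checking it for `null` keeps the code loadable on older JDKs, which is the reason the PR avoids a direct call to the Java 15 API.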
[jira] [Comment Edited] (LUCENE-7882) Maybe expression compiler should cache recently compiled expressions?
[ https://issues.apache.org/jira/browse/LUCENE-7882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191381#comment-17191381 ] Uwe Schindler edited comment on LUCENE-7882 at 9/6/20, 9:57 PM: Hi Mike, Java 15 now has the "solutions" for the problems you are seeing. We can now use "hidden classes" ([JEP-371|https://openjdk.java.net/jeps/371]) for this. Performance should be much better and pressure on GC lower. Here is a first PR: https://github.com/apache/lucene-solr/pull/1837 Backside: We don't see the expression in stack frames anymore. I have no idea how to solve this, we may need to make the source code available in a different way on errors (see my question on openjdk mailing lists: https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-September/068542.html). The attached PR does not pass all tests because of missing stack frames, but the expressions module works. Could you test the attached PR with JDK 15 and your test code that had the problems? was (Author: thetaphi): Hi Mike, Java 15 now has the "solutions" for the problems you are seeing. We can now use "hidden classes" ([JEP-371|https://openjdk.java.net/jeps/371]) for this. Performance should be much better and pressure on GC lower. Here is a first PR: https://github.com/apache/lucene-solr/pull/1837 Backside: We don't see the expression in stack frames anymore. I have no idea how to solve this, we may need to make the source code available in a different way on errors. The attached PR does not pass all tests because of missing stack frames, but the expressions module works. Could you test the attached PR with JDK 15 and your test code that had the problems? > Maybe expression compiler should cache recently compiled expressions? 
> - > > Key: LUCENE-7882 > URL: https://issues.apache.org/jira/browse/LUCENE-7882 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/expressions >Reporter: Michael McCandless >Assignee: Uwe Schindler >Priority: Major > Attachments: demo.patch > > Time Spent: 20m > Remaining Estimate: 0h > > I've been running search performance tests using a simple expression > ({{_score + ln(1000+unit_sales)}}) for sorting and hit this odd bottleneck: > {noformat} > "pool-1-thread-30" #70 prio=5 os_prio=0 tid=0x7eea7000a000 nid=0x1ea8a > waiting for monitor entry [0x7eea867dd000] >java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.lucene.expressions.js.JavascriptCompiler$CompiledExpression.evaluate(_score > + ln(1000+unit_sales)) > at > org.apache.lucene.expressions.ExpressionFunctionValues.doubleValue(ExpressionFunctionValues.java:49) > at > com.amazon.lucene.OrderedVELeafCollector.collectInternal(OrderedVELeafCollector.java:123) > at > com.amazon.lucene.OrderedVELeafCollector.collect(OrderedVELeafCollector.java:108) > at > org.apache.lucene.search.MultiCollectorManager$Collectors$LeafCollectors.collect(MultiCollectorManager.java:102) > at > org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:241) > at > org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:184) > at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39) > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:658) > at org.apache.lucene.search.IndexSearcher$5.call(IndexSearcher.java:600) > at org.apache.lucene.search.IndexSearcher$5.call(IndexSearcher.java:597) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > I couldn't see any {{synchronized}} in the sources here, so I'm not sure 
> which object monitor it's blocked on. > I was accidentally compiling a new expression for every query, and that > bottleneck would cause overall QPS to slow down drastically (~4X slower after > ~1 hour of redline tests), as if the JVM is getting slower and slower to > evaluate each expression the more expressions I had compiled. > I tested JDK 9-ea and it also kept slowing down over time as the performance > test ran. > Maybe we should put a small cache in front of the expressions compiler to > make it less trappy? Or maybe we can get to the root cause of why the JVM > slows down more and more, the more expressions you compile? > I won't have time to work on this in the near future so if anyone else feels > the itch, please scratch it! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (SOLR-14768) Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
[ https://issues.apache.org/jira/browse/SOLR-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191384#comment-17191384 ]

Ishan Chattopadhyaya commented on SOLR-14768: -

{quote}Oh dear, "ant package" failed, saying file lucene/common-build.xml, line 2331 failed (that line wants $(git.exe) for gosh sakes). I am building on SUSE Leap 15.2 Linux, not Windows.{quote}

Try:
{code:java}
# zypper in git
{code}

> Error 500 on PDF extraction: java.lang.NoClassDefFoundError: > org/eclipse/jetty/server/MultiParts > > > Key: SOLR-14768 > URL: https://issues.apache.org/jira/browse/SOLR-14768 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - Solr Cell (Tika extraction) >Affects Versions: 8.6, 8.6.1 >Reporter: Markus Kalkbrenner >Priority: Major > Attachments: Solr v8.6.x fails with multipart MIME in commands.eml, > Solr v8.6.x fails with multipart MIME in commands.eml, Solr v8.6.x fails with > multipart MIME in commands.eml > > > See [https://www.mail-archive.com/solr-user@lucene.apache.org/msg152182.html] > The integration tests of the solarium PHP library and the integration tests > of the Search API Solr Drupal module both fail on PDF extraction if executed > on Solr 8.6. > They still work on Solr 8.5.1 and earlier versions. 
> {quote}2020-08-20 12:30:35.279 INFO (qtp855700733-19) [ x:5f3e6ce2810ef] > o.a.s.u.p.LogUpdateProcessorFactory [5f3e6ce2810ef] webapp=/solr > path=/update/extract > params=\{json.nl=flat&commitWithin=0&omitHeader=false&resource.name=testpdf.pdf&literal.id=extract-test&commit=true&extractOnly=false&uprefix=attr_&wt=json}{add=[extract-test > (1675547519474466816)],commit=} 0 957 > solr8_1 | 2020-08-20 12:30:35.280 WARN (qtp855700733-19) [ ] > o.e.j.s.HttpChannel /solr/5f3e6ce2810ef/update/extract => > java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > solr8_1 | java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > ~[?:?] > solr8_1 | at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:443) > ~[?:?] > solr8_1 | at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) > ~[?:?] 
> solr8_1 | at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596) > ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545) > ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590) > ~[jetty-security-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485) > ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] 
> solr8_1 | at > org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1215) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.j
[jira] [Commented] (SOLR-14768) Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
[ https://issues.apache.org/jira/browse/SOLR-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191399#comment-17191399 ]

David Smiley commented on SOLR-14768: -

I worked on the refactoring that led to this. Seems like some difficult classpath issue. Unit tests (indirectly SolrExampleXMLTest and others which randomly configure the http client to use multi-part) did not trip this and generally don't reveal classpath problems. H... I'm investigating...

> Error 500 on PDF extraction: java.lang.NoClassDefFoundError: > org/eclipse/jetty/server/MultiParts > > > Key: SOLR-14768 > URL: https://issues.apache.org/jira/browse/SOLR-14768 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - Solr Cell (Tika extraction) >Affects Versions: 8.6, 8.6.1 >Reporter: Markus Kalkbrenner >Priority: Major > Attachments: Solr v8.6.x fails with multipart MIME in commands.eml, > Solr v8.6.x fails with multipart MIME in commands.eml, Solr v8.6.x fails with > multipart MIME in commands.eml > > > See [https://www.mail-archive.com/solr-user@lucene.apache.org/msg152182.html] > The integration tests of the solarium PHP library and the integration tests > of the Search API Solr Drupal module both fail on PDF extraction if executed > on Solr 8.6. > They still work on Solr 8.5.1 and earlier versions. 
[jira] [Assigned] (SOLR-14768) Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
[ https://issues.apache.org/jira/browse/SOLR-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley reassigned SOLR-14768: --- Assignee: David Smiley > Error 500 on PDF extraction: java.lang.NoClassDefFoundError: > org/eclipse/jetty/server/MultiParts > > > Key: SOLR-14768 > URL: https://issues.apache.org/jira/browse/SOLR-14768 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - Solr Cell (Tika extraction) >Affects Versions: 8.6, 8.6.1 >Reporter: Markus Kalkbrenner >Assignee: David Smiley >Priority: Major > Attachments: Solr v8.6.x fails with multipart MIME in commands.eml, > Solr v8.6.x fails with multipart MIME in commands.eml, Solr v8.6.x fails with > multipart MIME in commands.eml > > > See [https://www.mail-archive.com/solr-user@lucene.apache.org/msg152182.html] > The integration tests of the solarium PHP library and the integration tests > of the Search API Solr Drupal module both fail on PDF extraction if executed > on Solr 8.6. > They still work on Solr 8.5.1 an earlier versions. > {quote}2020-08-20 12:30:35.279 INFO (qtp855700733-19) [ x:5f3e6ce2810ef] > o.a.s.u.p.LogUpdateProcessorFactory [5f3e6ce2810ef] webapp=/solr > path=/update/extract > params=\{json.nl=flat&commitWithin=0&omitHeader=false&resource.name=testpdf.pdf&literal.id=extract-test&commit=true&extractOnly=false&uprefix=attr_&wt=json}{add=[extract-test > (1675547519474466816)],commit=} 0 957 > solr8_1 | 2020-08-20 12:30:35.280 WARN (qtp855700733-19) [ ] > o.e.j.s.HttpChannel /solr/5f3e6ce2810ef/update/extract => > java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > solr8_1 | java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > ~[?:?] 
> solr8_1 | at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:443) ~[?:?]
> solr8_1 | at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) ~[?:?]
> solr8_1 | at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596) ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545) ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590) ~[jetty-security-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485) ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1215) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
[GitHub] [lucene-solr] dsmiley opened a new pull request #1838: SOLR-14768: Fix multipart POST to Solr.
dsmiley opened a new pull request #1838:
URL: https://github.com/apache/lucene-solr/pull/1838

https://issues.apache.org/jira/browse/SOLR-14768

Regression from 8.6. Multipart POST would fail due to a NoClassDefFoundError for Jetty's MultiParts class. Solr cannot access many Jetty classes, which is not noticeable in our tests.

I tested this manually: I started the "techproducts" schema via `bin/solr start -e techproducts`, then I did this:

    curl http://localhost:8983/solr/techproducts/update/extract \
      -F literal.id=doc2 -F stream.body=@example/exampledocs/sample.html

It should work. It used to dump a stacktrace from Solr.

It would be awesome if we could have an automated test for this given the test infrastructure we have, but I don't see anything suitable. A suitable test would need to run bin/solr. If we had a SolrExampleTests subclass that required a URL input, then we could run tests against a Solr instance that is being released by the smoketester and also by a future Docker release process. I'll raise this particular matter on the dev list for discussion.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
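For anyone who wants to script the manual check above, here is a rough Java sketch of the same multipart POST. It is only an illustration under assumptions: the class `MultipartExtractRequest` and its methods are hypothetical names, not part of Solr or of the PR, and the part names simply mirror the two `-F` options of the curl command. It relies on `java.net.http` from Java 11.

```java
import java.io.ByteArrayOutputStream;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Solr or of the PR.
final class MultipartExtractRequest {

    /** Encodes one text field and one file part as a multipart/form-data body. */
    static byte[] buildBody(String boundary, String id, String fileName,
                            String fileContentType, byte[] fileContent) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Text part, mirroring: -F literal.id=doc2
        writeUtf8(out, "--" + boundary + "\r\n");
        writeUtf8(out, "Content-Disposition: form-data; name=\"literal.id\"\r\n\r\n");
        writeUtf8(out, id + "\r\n");
        // File part, mirroring: -F stream.body=@example/exampledocs/sample.html
        writeUtf8(out, "--" + boundary + "\r\n");
        writeUtf8(out, "Content-Disposition: form-data; name=\"stream.body\"; filename=\""
            + fileName + "\"\r\n");
        writeUtf8(out, "Content-Type: " + fileContentType + "\r\n\r\n");
        out.write(fileContent, 0, fileContent.length);
        // Closing delimiter ends the multipart body.
        writeUtf8(out, "\r\n--" + boundary + "--\r\n");
        return out.toByteArray();
    }

    /** Wraps the body in a POST request carrying the matching boundary header. */
    static HttpRequest buildRequest(URI extractUrl, String boundary, byte[] body) {
        return HttpRequest.newBuilder(extractUrl)
            .header("Content-Type", "multipart/form-data; boundary=" + boundary)
            .POST(HttpRequest.BodyPublishers.ofByteArray(body))
            .build();
    }

    private static void writeUtf8(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }
}
```

A URL-driven test of the kind discussed above could send this request with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` against a running instance and check the status code; that step is left out here because it needs a live Solr.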
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1837: LUCENE-7882: First idea of using Java 15 hidden anonymous classes for Lucene expressions
dweiss commented on a change in pull request #1837:
URL: https://github.com/apache/lucene-solr/pull/1837#discussion_r484194383

## File path: lucene/expressions/src/test/org/apache/lucene/expressions/js/TestCustomFunctions.java

@@ -263,7 +263,7 @@ public void testThrowingException() throws Exception {
     PrintWriter pw = new PrintWriter(sw);

Review comment:
Perhaps we should import assertj for tests. These assertions are so much cleaner with assertj. I don't know whether a hamcrest equivalent exists (maybe it does).

## File path: lucene/expressions/src/java/org/apache/lucene/expressions/js/JavascriptCompiler.java

@@ -206,6 +247,12 @@ private Expression compileExpression(ClassLoader parent) throws ParseException {
       throw new IllegalStateException("An internal error occurred attempting to compile the expression (" + sourceText + ").", exception);
     }
   }
+

Review comment:
I wouldn't do this, to be honest. Rethrow Error and RuntimeException subclasses as they were, and wrap anything else in a runtime exception with an appropriate caused-by delegate. That makes it easier to reason about the code should something happen.

## File path: lucene/expressions/src/java/org/apache/lucene/expressions/js/JavascriptCompiler.java

@@ -191,9 +220,21 @@ private Expression compileExpression(ClassLoader parent) throws ParseException {
     try {
       generateClass(getAntlrParseTree(), classWriter, externalsMap);
-
-      final Class evaluatorClass = new Loader(parent)
-.define(COMPILED_EXPRESSION_CLASS, classWriter.toByteArray());
+
+      final Class evaluatorClass;

Review comment:
Just a suggestion. My personal preference is to set up a supplier (or a function) statically, depending on conditions, then just use that later forever. In this case you could have something like:

    static Function> classLoader;
    static {
      if (MH_...) {
        classLoader = (bytes) -> {};
      } else {
        classLoader = (bytes) -> {};
      }
    }

which would be defined once and then used forever, without any knowledge of its internal implementation details.
Just a thought.
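The two suggestions in this review can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the pull request: the class `ReviewSuggestions`, the field `CLASS_DEFINER`, and the method `callUnchecked` are hypothetical names, the generic parameters are a guess because the archive dropped them from the quoted snippet, and both branches return a placeholder string instead of actually defining a class. Only the selection pattern and the rethrow rule are the point.

```java
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Method;
import java.util.concurrent.Callable;
import java.util.function.Function;

// Hypothetical illustration of the review suggestions; not the PR's code.
final class ReviewSuggestions {

    /** Chosen once at class initialization; callers never see which branch won. */
    static final Function<byte[], String> CLASS_DEFINER;

    static {
        if (hiddenClassesAvailable()) {
            // Placeholder for a strategy based on hidden classes (Java 15+).
            CLASS_DEFINER = bytes -> "hidden:" + bytes.length;
        } else {
            // Placeholder for a strategy based on a custom class loader.
            CLASS_DEFINER = bytes -> "loader:" + bytes.length;
        }
    }

    /** Feature check by reflection, so the sketch also compiles on older JDKs. */
    static boolean hiddenClassesAvailable() {
        for (Method m : MethodHandles.Lookup.class.getMethods()) {
            if (m.getName().equals("defineHiddenClass")) {
                return true;
            }
        }
        return false;
    }

    /**
     * Rethrows Error and RuntimeException unchanged and wraps anything else
     * in an unchecked exception that keeps the original as its cause.
     */
    static <T> T callUnchecked(Callable<T> body) {
        try {
            return body.call();
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new IllegalStateException("Unexpected checked failure", t);
        }
    }
}
```

Code that needs a class defined would only ever call `CLASS_DEFINER.apply(bytes)`, so the version check runs once and the rest of the code stays free of branching.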
[jira] [Commented] (SOLR-14768) Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
[ https://issues.apache.org/jira/browse/SOLR-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191454#comment-17191454 ]

David Smiley commented on SOLR-14768:

I've got a fix at the PR with further info. I'm pretty happy with the fix; it uses Jetty APIs less and thus insulates us more from changes there and, of course, from the classpath issue. I feel guilty I didn't test the multi-part stuff manually _originally_, and now we've got this nasty regression :-( This problem is arguably worthy of a bugfix release.

> Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
>
> Key: SOLR-14768
> URL: https://issues.apache.org/jira/browse/SOLR-14768
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: contrib - Solr Cell (Tika extraction)
> Affects Versions: 8.6, 8.6.1
> Reporter: Markus Kalkbrenner
> Assignee: David Smiley
> Priority: Major
> Attachments: Solr v8.6.x fails with multipart MIME in commands.eml, Solr v8.6.x fails with multipart MIME in commands.eml, Solr v8.6.x fails with multipart MIME in commands.eml
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> See [https://www.mail-archive.com/solr-user@lucene.apache.org/msg152182.html]
> The integration tests of the solarium PHP library and the integration tests of the Search API Solr Drupal module both fail on PDF extraction if executed on Solr 8.6.
> They still work on Solr 8.5.1 and earlier versions.
> {quote}2020-08-20 12:30:35.279 INFO (qtp855700733-19) [ x:5f3e6ce2810ef] o.a.s.u.p.LogUpdateProcessorFactory [5f3e6ce2810ef] webapp=/solr path=/update/extract params=\{json.nl=flat&commitWithin=0&omitHeader=false&resource.name=testpdf.pdf&literal.id=extract-test&commit=true&extractOnly=false&uprefix=attr_&wt=json}{add=[extract-test (1675547519474466816)],commit=} 0 957
> solr8_1 | 2020-08-20 12:30:35.280 WARN (qtp855700733-19) [ ] o.e.j.s.HttpChannel /solr/5f3e6ce2810ef/update/extract => java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
> solr8_1 | at org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624)
> solr8_1 | java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
> solr8_1 | at org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) ~[?:?]
> solr8_1 | at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:443) ~[?:?]
> solr8_1 | at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) ~[?:?]
> solr8_1 | at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596) ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545) ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590) ~[jetty-security-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485) ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186) ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
> solr8_1 | at org.eclipse.jetty.server.handler.Cont