Classes in solr_home /lib cannot import from solr/dist
I've got an extension jar that contains a class which extends org.apache.solr.handler.dataimport.DataSource. It only works if it's within the solr/dist folder. When it's stored in the lib/ folder within Solr home instead, Solr cannot find the parent class when it tries to load mine:

    Exception in thread "Thread-69" java.lang.NoClassDefFoundError: org/apache/solr/handler/dataimport/DataSource
            at org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:374)
            at org.apache.solr.handler.dataimport.ContextImpl.getDataSource(ContextImpl.java:102)
    Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.dataimport.DataSource

The classes in the lib folder don't have access to the classes in the dist folder on their classpath when they are loaded.

I'd like to keep my Solr install separate from my configs/plugins/indexes, so I want to avoid putting the jar into the dist folder unless I absolutely have to.

Is this by design? Is there some kind of configuration somewhere I can tweak to get this to work?

Cheers,

Callum L.
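P.S. For anyone unfamiliar with the extension point: a minimal sketch of the kind of subclass involved, assuming the DIH API as of Solr 5.x (the package and class names here are made up):

    package com.example.dih;

    import java.util.Iterator;
    import java.util.Map;
    import java.util.Properties;

    import org.apache.solr.handler.dataimport.Context;
    import org.apache.solr.handler.dataimport.DataSource;

    // This is the shape of the class that fails to load: its parent,
    // DataSource, lives in the solr-dataimporthandler jar under solr/dist.
    public class MyDataSource extends DataSource<Iterator<Map<String, Object>>> {

        @Override
        public void init(Context context, Properties initProps) {
            // read connection settings from initProps here
        }

        @Override
        public Iterator<Map<String, Object>> getData(String query) {
            // fetch rows for the given DIH query
            return null;
        }

        @Override
        public void close() {
            // release any resources held by this source
        }
    }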
Re: Classes in solr_home /lib cannot import from solr/dist
That's what I did. My solrconfig.xml has <lib> directives for both jars (I've hardcoded the version numbers for now to get regexes out of the picture); a sketch of their general shape is below the quoted reply. There are no warnings whatsoever about not finding the jars, and the jars themselves are in the right order (the second depends on the first).

If I move the data import handler jar to the ${solr.solr.home}/lib/ folder then everything works. This implies that the solr-dataimporthandler jar isn't being included properly, but I've checked so many times that it's correct. I can use a full absolute path, without solr.install.dir or solr.solr.home, and it still does not work. The permissions and ownership on the two jar files are identical; if Solr can load one, it should be able to load the other.

On Thu, Jan 14, 2016 at 2:19 PM, sara hajili wrote:
> hi Callum.
> You can create a directory for your jar file anywhere, and you must set the
> jar file location in a <lib> tag in solrConfig.xml.
> Be careful to add your lib location after the default <lib> tags in the
> solr config, because sometimes your jar needs classes that Solr must load
> first; load your jar after Solr's own classes so you don't face a
> ClassNotFoundException.
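For reference, the directives in play are of this general shape; the paths and version numbers here are illustrative, not my actual config:

    <!-- the stock dataimporthandler jar from dist/ must load first... -->
    <lib path="${solr.install.dir}/dist/solr-dataimporthandler-5.3.1.jar"/>
    <!-- ...followed by the extension jar that depends on it -->
    <lib path="${solr.solr.home}/plugins/my-datasource-1.0.jar"/>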
Re: Classes in solr_home /lib cannot import from solr/dist
Good to know Solr already loads them; that removed a bunch of lines from my solrconfig.xml. Having to copy the required jars from dist/ to lib/ isn't ideal, but if that's the only solution then at least I can stop searching and figure out how best to deal with this limitation.

I assume the reason for this is that the libs in solr_home/lib are loaded at runtime? I don't know much about how this works in Java, but I'm guessing Solr can access the classes in those jars but not the other way around?

Thanks for your help guys.

On Thu, Jan 14, 2016 at 5:03 PM, Shawn Heisey wrote:
> If you're going to put jars in $SOLR_HOME/lib, then you should *only*
> put jars in that directory, and NOT load jars explicitly. The <lib>
> directives should not be used in solrconfig.xml when jars are loaded
> from this directory, because Solr will automatically load jars from this
> location and make them available to all cores.
>
> If moving all your extra jars (including things like the dataimport jar)
> to $SOLR_HOME/lib and taking out jar loading in solrconfig.xml doesn't
> help, then depending on the Solr version, you *might* be running into
> SOLR-6188.
>
> https://issues.apache.org/jira/browse/SOLR-6188
>
> You'll want to be sure that you don't load the same jar more than once.
> This is the root of the specific problem that SOLR-6188 solves. Loading
> the same jar more than once can also happen if the jar is in the lib
> directory AND mentioned on a <lib> config element.
>
> Thanks,
> Shawn
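For anyone finding this later, Shawn's suggestion in concrete terms (paths are illustrative, assuming a stock install layout):

    # put every extra jar in $SOLR_HOME/lib and let Solr pick them up
    mkdir -p $SOLR_HOME/lib
    cp $SOLR_INSTALL/dist/solr-dataimporthandler-*.jar $SOLR_HOME/lib/
    cp my-datasource-1.0.jar $SOLR_HOME/lib/
    # ...then remove the matching <lib .../> directives from solrconfig.xml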
Nested grouping or equivalent.
We have a horrible Solr query that groups by one field and then sorts by another. My understanding is that for this to happen Solr has to sort by the grouping field, group, and then sort the resulting result set. It's not a fast query.

Unfortunately our documents now need to be grouped as well (product variants into items), and that grouping query needs to work on top of that grouping instead. As far as I'm aware you can't do nested grouping in Solr.

In summary, we want product variants that get grouped into items, which then get grouped by one field and sorted by another.

The solution doesn't need to be fast; it's a rarely used legacy part of our application and we just need it to work. Our dataset isn't huge, so it doesn't matter if Solr has to scan the entire index (I think the query does this at the moment anyway). But downloading the entire document set and doing the operations in ETL isn't something we really want to dedicate time to unless it's impossible to represent this in Solr queries.

Any ideas?

Cheers,

Callum.
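P.S. For concreteness, the existing query is of this general shape (field names are made up):

    /select?q=*:*&group=true&group.field=category&sort=date_published desc

What we now need is an extra level underneath: first collapse variants into items, then apply the grouping and sort above.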
Re: Nested grouping or equivalent.
Thank you so much Erick. The CollapsingQParserPlugin was exactly what I needed. I'm able to group by the field and then do the collapse from products into items and get the correct answer. The collapsing is also more appropriate for the general grouping we need to do all the time now, so we'll probably use that instead.

There was one part that took me a while to figure out though, and that was getting the unique counts of a field with facets from BEFORE the collapse kicked in. Just for anyone else who wants to know how to do this: I use a tag on the collapse fq like this:

    fq={!collapse field=int_item_id tag=collapse}

And then by expressing my facet count query with excludeTags like this I'm able to get the pre-collapse count:

    json.facet={'pre_collapse_count':{'type':'query','domain':{'excludeTags':'collapse'}}}

Which works perfectly. You need Solr 5.4 for this to work with JSON facets, according to http://yonik.com/multi-select-faceting/. But I think you can also do it with old-style facets by putting {!ex=collapse}int_item_id in the facet.field parameter.

On Thu, May 12, 2016 at 5:10 AM, Erick Erickson wrote:
> A couple of ideas. If this is 5x consider Streaming Aggregation.
> The idea here is that you stream the docs back to a SolrJ client and
> slice and dice them there. SA is designed to export 400K docs/sec,
> but the returned values must be DocValues (i.e. no text types, strings
> are OK).
>
> Have you seen the CollapsingQParserPlugin? That might help.
>
> Or push back at the product manager and say "why are we wasting
> time supporting something nobody uses?" ;)
>
> Best,
> Erick
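Putting those pieces together, the full request looks something like this (a sketch: URL-encoding is omitted for readability, and the query facet is given an explicit 'q' here):

    /select?q=*:*
      &fq={!collapse field=int_item_id tag=collapse}
      &json.facet={'pre_collapse_count':{'type':'query','q':'*:*',
                   'domain':{'excludeTags':'collapse'}}}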
Date range warming.
We want to warm some fq's. The main ones we want to do are date presets like "last 6 months", "last year", etc.

The queries for the last 6 months get generated by the site to look like this (it's really 6 months minus 1 day):

    date_published:([2016-01-02T00:00:00.000Z TO 2016-07-01T23:59:59.999Z])

But since I have to represent this in the firstSearcher section of solrconfig.xml, I need to use the date math features. (Is there another way? There doesn't seem to be a JVM system property with the date in it, and I don't want to have to restart Solr every day to update a Solr env variable.) So I have this:

    date_published:([NOW/DAY-6MONTH+1DAY TO NOW/DAY+1DAY-1SECOND])

Which should resolve to the same thing. Is there some way I can check this for sure? I get the same results when I run them.

I have a couple of questions though:

1. Is Solr smart enough to see that the explicit queries that come through are the same as my date math queries, and re-use the fq in this case? Is there a way to confirm this? I can go and change them to be the same as well, not much of an issue; more curious than anything.

2. Can Solr re-use fq's with NOW in them at all? Since NOW is changing all the time, I'm worried there's some kind of checker that just sets cache=false on all queries containing NOW, or worse, expands them to the current time and caches that, so none of the fq's will ever match (assuming Solr just does a strcmp for fq's).

Cheers,

Callum.
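P.S. The warming entry in solrconfig.xml looks something like this (a sketch; the q is illustrative):

    <listener event="firstSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
          <str name="q">*:*</str>
          <str name="fq">date_published:([NOW/DAY-6MONTH+1DAY TO NOW/DAY+1DAY-1SECOND])</str>
        </lst>
      </arr>
    </listener>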
Re: Date range warming.
Whoops, just realised it's meant to be:

    date_published:([NOW/DAY-6MONTH+1DAY TO NOW/DAY+1DAY-1MILLISECOND])

instead.
Re: Date range warming.
Okay, I figured it out. Answer here in case anyone ever stumbles across this in future.

With debugQuery on, you can see that the filter_queries actually get processed into what's shown in parsed_filter_queries, and it's those parsed forms that get cached. In this case Solr converts the date math into a [unix_timestamp TO unix_timestamp] range, and it's that form that gets cached. So you can tell whether two fq's will share a cache entry by comparing their parsed forms.
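Illustratively, the relevant debug sections look something like this (the timestamp values here are made up):

    "filter_queries": ["date_published:([NOW/DAY-6MONTH+1DAY TO NOW/DAY+1DAY-1MILLISECOND])"],
    "parsed_filter_queries": ["date_published:[1451692800000 TO 1467417599999]"]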
Re: Multilevel grouping?
Look at the collapse module: https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results. It can do the same thing as group. If you want to get counts/facets from before the collapse, tag the collapse statement and use excludeTags in your JSON facets (there's an equivalent for non-JSON facets). I think the default nullPolicy is different from grouping too, but you can change it to be the same.

I've not been able to get 2 collapses to work on my version of Solr, but collapse + group works and gets you 2 levels. Not being able to do multiple collapses appears to be a bug (it sort of works); I recall there being a JIRA case somewhere stating it was fixed in some version. So you may be able to do as many levels as you like if you upgrade or already run a very recent version of Solr.

On Thu, Jul 14, 2016 at 3:52 PM, Aditya Sundaram wrote:
> Thanks Yonik, was looking for exactly that, is there any workaround to
> achieve that currently?
>
> On Tue, Jul 12, 2016 at 5:07 PM, Yonik Seeley wrote:
>
> > I started this a while ago, but haven't found the time to finish:
> > https://issues.apache.org/jira/browse/SOLR-7830
> >
> > -Yonik
> >
> > On Tue, Jul 12, 2016 at 7:29 AM, Aditya Sundaram wrote:
> > > Does solr support multilevel grouping? I want to group upto 2/3 levels
> > > based on different fields, i.e. 1st group on field one, within which I
> > > group by field 2, etc.
> > > I am aware of facet.pivot which does the same but retrieves only the
> > > count. Is there any way to get the documents as well along with the
> > > count in facet.pivot?
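A sketch of the collapse + group combination described above (field names are hypothetical):

    /select?q=*:*
      &fq={!collapse field=variant_group_id nullPolicy=expand}
      &group=true&group.field=category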
Should we still optimize?
We have a cronjob that runs every week at a quiet time to run the optimize command on our Solr collections. Even when it's quiet, it's still an extremely heavy operation.

One of the things I keep seeing on Stack Overflow is that optimizing is now essentially deprecated: Lucene (we're on Solr 5.5.2) will now keep the number of segments at a reasonable level on its own, and the performance impact of having deleted docs is now much less.

One of our cores doesn't get optimized and it's currently sitting at 5.5 million documents with 1.9 million deleted docs, which seems pretty high to me.

How true is this claim? Is optimizing still a good idea for the general case?
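For context, the weekly job issues the standard optimize update command, along these lines (host and collection names are illustrative):

    curl 'http://localhost:8983/solr/products/update?optimize=true&maxSegments=1'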
Re: Should we still optimize?
Yeah, I figured that was too many deleted docs. It could just be that our max segments is set too high though.

The reason I asked is that our optimize requests have started failing. Or at least, they appear to fail because the optimize request returns a non-200; the optimize seems to go ahead successfully regardless. Before trying to find out whether I can asynchronously request and poll for success (doesn't appear to be possible yet), or find a better way of determining success, I thought I'd check whether the whole thing was necessary to begin with. Hopefully it doesn't involve polling the core status until deleted docs goes below a certain level :/ (a rough sketch of that fallback is below Shawn's reply).

Cheers for the info.

On Mon, Aug 8, 2016 at 2:58 PM, Shawn Heisey wrote:
> On 8/8/2016 3:10 AM, Callum Lamb wrote:
> > How true is this claim? Is optimizing still a good idea for the
> > general case?
>
> For the general case, optimizing is not recommended. If there are a
> very large number of deleted documents, which does describe your
> situation, then there is definitely a benefit.
>
> In cases where there are a lot of deleted documents, scoring can be
> affected by the presence of the deleted documents, and the drop in index
> size after an optimize can result in a large performance boost. For the
> general case where there are not many deletes, there *is* a performance
> benefit to optimizing down to a single segment, but it is nowhere near
> as dramatic as it was in the 1.x/3.x days.
>
> The problem with optimizes in the general case is this: the performance
> hit that the optimize operation itself causes may not be worth the small
> performance improvement.
>
> If you have a time when your index is quiet enough that the optimize
> itself won't be disruptive, then you should certainly take advantage of
> that time and do the optimize, even if there aren't many deletes.
>
> There is another benefit to optimizes that doesn't get mentioned often:
> it can make subsequent normal merging operations during indexing faster,
> because there will not be as many large segments.
>
> Thanks,
> Shawn
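The fallback polling check would look roughly like this; it assumes (unverified) that deletedDocs is reported in the CoreAdmin STATUS response, and the core name is made up:

    # poll until the number of deleted docs drops below a threshold
    curl -s 'http://localhost:8983/solr/admin/cores?action=STATUS&core=products_shard1_replica1&wt=json' \
      | grep -o '"deletedDocs":[0-9]*'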
Handling ampersands in searches.
I'm having an issue where searches that contain ampersands aren't being handled correctly. I need them to be dropped at index time *AND* query time.

When documents come in and are indexed, the ampersands are successfully dropped when they go into my stemmed field (when I facet on the stemmed field they aren't in the list), but when I actually search with a term containing an ampersand, I get no results. E.g. if I search for the string "light fit" or "light and fit" then I get results, but when I search for "light & fit" I get none, even though the SnowballPorterFilterFactory should be dropping the "&" at query time like it does the "and", and all 3 queries *should* be equivalent.

I've tried adding a synonym to my _schema_analysis_synonyms_default.json (I only have one default file), in both this form and its inverse:

    "and": ["&", "and"],

I've also tried adding the StopWord filter to my field type with & in the stopwords (though this shouldn't be necessary, because the SnowballPorter should be dropping it anyway) and it still doesn't work.

Is there some kind of special handling I need for ampersands? I'm thinking that Solr must be interpreting it as some kind of operator, and I need to tell Solr that it's actually literal text so the SnowballPorter knows to drop it. Using backslashes or URL encoding instead doesn't work though.

Does anyone have any ideas? I can obviously just remove any ampersands from the q before I submit the query to Solr and get the correct results, so this is not a game-breaking problem, but I'm more curious as to *why* this is happening and how to fix it correctly.

Cheers,

Callum.

Extra info: I'm using Solr 5.5.2 in cloud mode. The q in the queries is specified and parsed the following way:

    "rawquerystring": "stemmed_description:light & fit",
    "querystring": "stemmed_description:light & fit",
    "parsedquery": "(+(+stemmed_description:light +DisjunctionMaxQuery((stemmed_description:&)) +DisjunctionMaxQuery((stemmed_description:fit))))/no_coord",
    "parsedquery_toString": "+(+stemmed_description:light +(stemmed_description:&) +(stemmed_description:fit))",

I have a stemmed field defined in my schema (schema version 1.5), with a matching field type; an illustrative sketch of both follows.
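The field and type definitions are of this general shape; this is a hypothetical sketch, not my actual schema (the tokenizer choice in particular is a guess):

    <field name="stemmed_description" type="text_stemmed" indexed="true" stored="true"/>

    <fieldType name="text_stemmed" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English"/>
      </analyzer>
    </fieldType>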
Stopping a node from receiving any requests temporarily.
We have a Solr cluster that still takes queries that join between cores (I know, bad). We can't change that any time soon, but I was hoping there was a band-aid I could use in the meantime to make deployments of new nodes cleaner.

When we add a new node to the cluster there's a brief moment in time when one of the cores in the join is present but the other isn't.

My understanding is that even if you stop requests from reaching the new Solr node with haproxy, Solr can route requests between nodes on its own behind haproxy. We've also noticed that this internal Solr routing is not aware of the join in the query, and will route a request to a core that joins to another core even if the latter is not present yet (causing the query to fail).

Until we eliminate all the joins, we want to be able to have a node we can do things to, but *guarantee* it won't receive any requests until we decide it's ready. Is there an easy way to do this? We could try stopping the Solr nodes from talking to each other at the network level, but that seems iffy to me and might cause something weird to happen.

Any ideas?
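P.S. The joins in question are cross-core joins of this general shape (core and field names are hypothetical):

    q={!join from=item_id to=item_id fromIndex=items}date_published:[NOW-1YEAR TO NOW]

Such a query fails on a node that has the querying core but not yet the "items" core.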
Re: Stopping a node from receiving any requests temporarily.
Forgot to mention: we're using Solr 5.5.2 in SolrCloud mode. Everything is single-sharded at the moment as the collections are still quite small.
Re: Stopping a node from receiving any requests temporarily.
We can do that in most cases, and that's what we've been doing up until now to prevent failed requests. All the more incentive to get rid of those joins then, I guess!

Thanks.

On Wed, Apr 12, 2017 at 4:16 PM, Erick Erickson wrote:
> No good ideas here with current Solr. I just raised SOLR-10484 for the
> generic ability to take a replica out of action (including the
> ADDREPLICA operation).
>
> Your understanding is correct, Solr will route requests to active
> replicas. Is it possible that you can load the "from" core first
> _then_ add the replica that references it? Or do they switch roles?
>
> Best,
> Erick