Stop filter changes in Solr >= 4.4
While attempting to upgrade from Solr 4.3.0 to Solr 4.4.0 I ran into this exception:

java.lang.IllegalArgumentException: enablePositionIncrements=false is not supported anymore as of Lucene 4.4 as it can create broken token streams

which led me to https://issues.apache.org/jira/browse/LUCENE-4963. I need to be able to match queries irrespective of intervening stopwords (which used to work with enablePositionIncrements="true"). For instance: "foo of the bar" would find documents matching "foo bar", "foo of bar", and "foo of the bar". With this option deprecated in 4.4.0 I'm not clear on how to maintain the same functionality.

The package javadoc adds:

If the selected analyzer filters the stop words "is" and "the", then for a document containing the string "blue is the sky", only the tokens "blue", "sky" are indexed, with position("sky") = 3 + position("blue"). Now, a phrase query "blue is the sky" would find that document, because the same analyzer filters the same stop words from that query. But the phrase query "blue sky" would not find that document because the position increment between "blue" and "sky" is only 1.

If this behavior does not fit the application needs, the query parser needs to be configured to not take position increments into account when generating phrase queries.

But there's no mention of how to actually configure the query parser to do this. Does anyone know how to deal with this issue as Solr moves toward 5.0?

Crossposted from stackoverflow:
http://stackoverflow.com/questions/18668376/solr-4-4-stopfilterfactory-and-enablepositionincrements
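To make the position arithmetic in the javadoc excerpt concrete, here is a small stand-alone Java sketch. It does not use Lucene or Solr: StopwordPositions, analyze, and phraseMatches are hypothetical names, and the keepIncrements flag only imitates the effect of keeping or collapsing the gaps left by removed stop words, assuming an exact phrase match on relative positions (no slop).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this is not Lucene's implementation.
final class StopwordPositions {

    /** A surviving term together with its position in the token stream. */
    static final class Token {
        final String term;
        final int position;

        Token(String term, int position) {
            this.term = term;
            this.position = position;
        }
    }

    /**
     * Splits on whitespace and drops stop words. When keepIncrements is true,
     * a removed stop word still advances the position counter (leaving a gap);
     * when false, the surviving terms are numbered consecutively.
     */
    static List<Token> analyze(String text, Set<String> stopwords, boolean keepIncrements) {
        List<Token> out = new ArrayList<>();
        int position = -1;
        for (String word : text.toLowerCase().trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            boolean isStop = stopwords.contains(word);
            if (!isStop || keepIncrements) {
                position++;
            }
            if (!isStop) {
                out.add(new Token(word, position));
            }
        }
        return out;
    }

    /** Exact phrase match: every query term must sit at the same relative offset in the document. */
    static boolean phraseMatches(List<Token> doc, List<Token> query) {
        if (query.isEmpty()) {
            return false;
        }
        Token first = query.get(0);
        for (Token start : doc) {
            if (!start.term.equals(first.term)) {
                continue;
            }
            boolean all = true;
            for (Token q : query) {
                int wanted = start.position + (q.position - first.position);
                if (!containsAt(doc, q.term, wanted)) {
                    all = false;
                    break;
                }
            }
            if (all) {
                return true;
            }
        }
        return false;
    }

    private static boolean containsAt(List<Token> doc, String term, int position) {
        for (Token t : doc) {
            if (t.position == position && t.term.equals(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> stops = new HashSet<>(Arrays.asList("is", "the"));
        for (boolean keep : new boolean[] {true, false}) {
            List<Token> doc = analyze("blue is the sky", stops, keep);
            System.out.println("keepIncrements=" + keep
                    + " \"blue sky\" matches: "
                    + phraseMatches(doc, analyze("blue sky", stops, keep))
                    + ", \"blue is the sky\" matches: "
                    + phraseMatches(doc, analyze("blue is the sky", stops, keep)));
        }
    }
}
```

With gaps kept, "sky" lands three positions after "blue" in the document but only one position after it in the query "blue sky", which is the mismatch the javadoc describes. With gaps collapsed, both queries line up with the document.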
Re: Stop filter changes in Solr >= 4.4
processAdd(UpdateRequestProcessor.java:51)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:582)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:435)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
    at org.apache.solr.update.processor.AbstractDefaultValueUpdateProcessorFactory$DefaultValueUpdateProcessor.processAdd(AbstractDefaultValueUpdateProcessorFactory.java:94)
    at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:150)
    ... 32 more

On Thu, Sep 12, 2013 at 10:07 PM, Shalin Shekhar Mangar wrote:
> Can we see a full stack trace for that IllegalArgumentException?
> AFAIK, enablePositionIncrements=false is deprecated in 4.x but not
> removed. It will be removed in 5.0 though.
>
> On Fri, Sep 13, 2013 at 3:34 AM, Christopher Condit wrote:
>> While attempting to upgrade from Solr 4.3.0 to Solr 4.4.0 I ran into
>> this exception:
>>
>> java.lang.IllegalArgumentException: enablePositionIncrements=false is
>> not supported anymore as of Lucene 4.4 as it can create broken token
>> streams
>>
>> which led me to https://issues.apache.org/jira/browse/LUCENE-4963. I
>> need to be able to match queries irrespective of intervening stopwords
>> (which used to work with enablePositionIncrements="true"). For
>> instance: "foo of the bar" would find documents matching "foo bar",
>> "foo of bar", and "foo of the bar". With this option deprecated in
>> 4.4.0 I'm not clear on how to maintain the same functionality.
>>
>> The package javadoc adds:
>>
>> If the selected analyzer filters the stop words "is" and "the", then
>> for a document containing the string "blue is the sky", only the
>> tokens "blue", "sky" are indexed, with position("sky") = 3 +
>> position("blue"). Now, a phrase query "blue is the sky" would find
>> that document, because the same analyzer filters the same stop words
>> from that query. But the phrase query "blue sky" would not find that
>> document because the position increment between "blue" and "sky" is
>> only 1.
>>
>> If this behavior does not fit the application needs, the query parser
>> needs to be configured to not take position increments into account
>> when generating phrase queries.
>>
>> But there's no mention of how to actually configure the query parser
>> to do this. Does anyone know how to deal with this issue as Solr moves
>> toward 5.0?
>>
>> Crossposted from stackoverflow:
>> http://stackoverflow.com/questions/18668376/solr-4-4-stopfilterfactory-and-enablepositionincrements
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
solrj with multiple cores
I have a Solr 3.4 installation with many cores. Is there a way to use a single SolrServer instance across my requests or is the best practice to create a separate SolrServer instance for each core?

It looks as though the option to specify a core for a SolrServer was removed:
https://issues.apache.org/jira/browse/SOLR-467

Thanks,
-Chris
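One possible shape for the "separate instance per core" option is sketched below in plain Java, assuming each core is reachable at its own URL under a common base (for example http://localhost:8983/solr/core1). CoreClientRegistry and its methods are hypothetical names, and the client type is left generic: in a SolrJ 3.x setup the factory would presumably construct an HTTP client such as CommonsHttpSolrServer for the given URL, which is an assumption about the environment and is not shown here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper: caches one client object per core name so that the
// same instance is reused across requests instead of being rebuilt each time.
final class CoreClientRegistry<C> {
    private final String baseUrl;
    private final Function<String, C> clientFactory;
    private final Map<String, C> clients = new ConcurrentHashMap<>();

    /**
     * @param baseUrl       URL of the Solr webapp without a core name (assumed layout)
     * @param clientFactory builds a client for a full core URL; supplied by the caller
     */
    CoreClientRegistry(String baseUrl, Function<String, C> clientFactory) {
        // Normalize so that coreUrl() never produces a double slash.
        this.baseUrl = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        this.clientFactory = clientFactory;
    }

    /** Full URL for one core, e.g. base "http://host/solr" and core "core1". */
    String coreUrl(String core) {
        return baseUrl + "/" + core;
    }

    /** Returns the cached client for the core, creating it on first use. */
    C forCore(String core) {
        return clients.computeIfAbsent(core, name -> clientFactory.apply(coreUrl(name)));
    }

    public static void main(String[] args) {
        // A String stands in for a real client type to keep the sketch self-contained.
        CoreClientRegistry<String> registry =
                new CoreClientRegistry<>("http://localhost:8983/solr/", url -> "client for " + url);
        System.out.println(registry.forCore("core1"));
        System.out.println(registry.forCore("core2"));
    }
}
```

The registry creates each client lazily and hands back the same object on later calls for that core, so the per-core cost is paid once rather than per request.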
DIH and splitBy
I'm having an issue getting the splitBy construct from the regex transformer to work in a very basic case (with either Solr 3.6 or 4.1).

I have a field defined like this:

The entity is defined like this:

Here's a POM:
http://pastie.org/5992725

A JUnit test case showing the problem:
http://pastie.org/5993437

And a stackoverflow question with the same information:
http://stackoverflow.com/questions/14512055/splitting-database-column-into-multivalued-solr-field

Anyone have any ideas?

Thanks,
Chris
Re: DIH and splitBy
Sorry about that - even if I switch the splitBy to "," it still doesn't work. Here's the corrected unit test:
http://pastie.org/5995399

On Thu, Jan 31, 2013 at 12:30 PM, Dyer, James wrote:
> In your unit test, you have:
>
> "" +
>
> And also:
>
> runner.update("INSERT INTO test VALUES 1, 'foo,bar,baz'");
>
> So you need to decide if you want to delimit with a pipe or a comma.
>
> James Dyer
> Ingram Content Group
> (615) 213-4311
>
>
> -----Original Message-----
> From: Christopher Condit [mailto:con...@sdsc.edu]
> Sent: Thursday, January 31, 2013 2:03 PM
> To: solr-user@lucene.apache.org
> Subject: DIH and splitBy
>
> I'm having an issue getting the splitBy construct from the regex
> transformer to work in a very basic case (with either Solr 3.6 or
> 4.1).
>
> I have a field defined like this:
>
>
> The entity is defined like this:
>
>
>
>
> Here's a POM:
> http://pastie.org/5992725
>
> A JUnit test case showing the problem:
> http://pastie.org/5993437
>
> And a stackoverflow question with the same information:
> http://stackoverflow.com/questions/14512055/splitting-database-column-into-multivalued-solr-field
>
> Anyone have any ideas?
>
> Thanks,
> Chris
>
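On the pipe-versus-comma point raised in this thread: splitBy is treated as a regular expression, so the two delimiters behave differently when written literally. The stand-alone Java sketch below uses only java.util.regex to show the difference; SplitByDemo and its splitBy method are hypothetical names, it does not call the DataImportHandler, and it does not claim to explain why the posted test case fails.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for a regex-based split; not DataImportHandler code.
final class SplitByDemo {

    /** Splits a column value on the given regular expression. */
    static List<String> splitBy(String value, String regex) {
        return Arrays.asList(Pattern.compile(regex).split(value));
    }

    public static void main(String[] args) {
        // A comma has no special meaning in a regex, so it splits as expected.
        System.out.println(splitBy("foo,bar,baz", ","));

        // A bare pipe is the regex alternation operator: "|" matches the empty
        // string, so the value is broken apart between characters instead.
        System.out.println(splitBy("foo|bar|baz", "|"));

        // Escaping the pipe (or quoting it) makes it a literal delimiter.
        System.out.println(splitBy("foo|bar|baz", "\\|"));
        System.out.println(splitBy("foo|bar|baz", Pattern.quote("|")));
    }
}
```

If the data is pipe-delimited, the pattern in the configuration would therefore need an escaped pipe such as \| rather than a bare one, while a comma can be used as-is.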