Spellchecking Escaped Queries
I'm having an issue performing a spellcheck on some information, and a search of the archive isn't helping. I'm indexing the word "p!nk" (yes, that's a bang in there) and have a replacement filter set up so that the ! becomes i. Looking at the analyzer, the right thing is happening, with both the indexer and the query mapping to "pink".

When I switch on spelling suggestions I get a suggestion of "p!pink", which just seems odd. When I make a request for something like "rink", I get the correct suggestion of "pink", but asking for "r!nk", I get a suggestion of "r! pink". It seems like the spellcheck component isn't quite doing the right thing somewhere.

I'm running 1.4.1 with the https://issues.apache.org/jira/browse/SOLR-1553 patch applied for the edismax query parser.

Thanks,
Colin

--
Colin Vipurs
Server Team Lead
Shazam Entertainment Ltd
26-28 Hammersmith Grove, London W6 7HA
m: +44 (0) 000 000 t: +44 (0) 20 8742 6820
w: www.shazam.com

Please consider the environment before printing this document

This e-mail and its contents are strictly private and confidential. It must not be disclosed, distributed or copied without our prior consent. If you have received this transmission in error, please notify Shazam Entertainment immediately on: +44 (0) 020 8742 6820 and then delete it from your system. Please note that the information contained herein shall additionally constitute Confidential Information for the purposes of any NDA between the recipient/s and Shazam Entertainment. Shazam Entertainment Limited is incorporated in England and Wales under company number 3998831 and its registered office is at 26-28 Hammersmith Grove, London W6 7HA.

This email has been scanned by the MessageLabs Email Security System. For more information please visit http://www.messagelabs.com/email
Re: Spellchecking Escaped Queries
Thanks Chris,

The field used for indexing and spellcheck is the same and is configured like this: ...

I use the pattern replace filter to swap all instances of "!" within a word to "i". I know this part is working correctly as performing a search works correctly.

The spellcheck is initialized like this:

title
default
searchfield
./spellchecker
false

This is attached as a component to my search handler and spellchecking is done inline with the queries.

Thanks,
Colin

> : I'm having an issue performing a spellcheck on some information and
> : search of the archive isn't helping.
>
> For this type of question, there's not much feedback anyone can offer w/o
> knowing exactly what analyzers you have configured for the various
> fieldtypes (both the field you index/search and the fieldtype used for
> spellchecking)
>
> it's also fairly critical to know how you have the spellcheck component
> configured.
>
> off the cuff: i'd guess that maybe WordDelimiterFilter is being used in a
> wonky way given your usecase -- but like i said: would need to see the
> configs to make a guess.
>
> -Hoss
Re: Spellchecking Escaped Queries
Apologies for the duplicate post. I'm having Evolution problems.

> Thanks Chris,
>
> The field used for indexing and spellcheck is the same and is
> configured like this:
>
> [...] class="solr.TextField" [...]
> [...] ignoreCase="true" expand="true"/>
> [...] pattern="^([^!]+)\!([^!]+)$"
>       replacement="$1i$2"
>       replace="all"/>
> [...] generateWordParts="1" generateNumberParts="1" catenateWords="1"
>       catenateNumbers="0" catenateAll="1" splitOnCaseChange="1"
>       preserveOriginal="1"/>
>
> I use the pattern replace filter to swap all instances of "!" within a
> word to "i". I know this part is working correctly as performing a
> search works correctly.
>
> The spellcheck is initialized like this:
>
> title
> default
> searchfield
> ./spellchecker
> false
>
> And is attached as a component to my search handler.
>
> Thanks,
>
> Colin
>
> > : I'm having an issue performing a spellcheck on some information and
> > : search of the archive isn't helping.
> >
> > For this type of question, there's not much feedback anyone can offer w/o
> > knowing exactly what analyzers you have configured for the various
> > fieldtypes (both the field you index/search and the fieldtype used for
> > spellchecking)
> >
> > it's also fairly critical to know how you have the spellcheck component
> > configured.
> >
> > off the cuff: i'd guess that maybe WordDelimiterFilter is being used in a
> > wonky way given your usecase -- but like i said: would need to see the
> > configs to make a guess.
> >
> > -Hoss
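For anyone following the thread, here is a minimal sketch of what the quoted pattern and replacement do to individual tokens. It assumes the filter applies the regex to each token with ordinary java.util.regex semantics; the class and method names (BangReplaceSketch, rewrite) are made up for illustration and are not part of Solr.

```java
import java.util.regex.Pattern;

final class BangReplaceSketch {
    // Pattern and replacement copied from the quoted filter configuration.
    private static final Pattern BANG = Pattern.compile("^([^!]+)\\!([^!]+)$");

    /** Applies the replacement to one token (hypothetical helper, not a Solr API). */
    static String rewrite(String token) {
        return BANG.matcher(token).replaceAll("$1i$2");
    }

    public static void main(String[] args) {
        // Tokens with exactly one interior "!" are rewritten; the rest pass through.
        for (String token : new String[] {"p!nk", "r!nk", "pink", "p!!nk", "a!b!c", "!nk"}) {
            System.out.println(token + " -> " + rewrite(token));
        }
    }
}
```

Because the pattern is anchored at both ends and both groups exclude "!", it only rewrites tokens with exactly one "!" that is neither first nor last. "p!nk" and "r!nk" become "pink" and "rink", while "p!!nk", "a!b!c" and "!nk" are left unchanged. This says nothing about how the spellcheck component itself analyzes the query, which is the open question in this thread.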
Re: Search Regression Testing
Hi Mark,

We use a set of acceptance tests driven by JBehave. We run them in a clean-room environment, clearing out the indexes before each test run and inserting only the data we're interested in. As well as tests to ensure things "just work", we have a number of tests that insert data and check that it comes out in the order we expect, so unexpected changes to boosts etc. can be caught early.

While this doesn't tell us what a certain query will return with our live data set, it does confirm our assertions about the abstract case. You could use a similar technique to insert a bunch of data and then check your critical queries.

--
Colin Vipurs

> Hey guys,
>
> I'm wondering how people are managing regression testing, in particular with
> things like text based search.
>
> I.e. if you change how fields are indexed or change boosts in dismax,
> ensuring that doesn't mean that critical queries are showing bad data.
>
> The obvious answer to me was using unit tests. These may be brittle as some
> index data can change over time, but I couldn't think of a better way.
>
> How is everyone else solving this problem?
>
> Cheers,
>
> Mark
>
> --
> E: mark.man...@gmail.com
> T: http://www.twitter.com/neurotic
> W: www.compoundtheory.com
>
> cf.Objective(ANZ) - Nov 17, 18 - Melbourne Australia
> http://www.cfobjective.com.au
>
> Hands-on ColdFusion ORM Training
> www.ColdFusionOrmTraining.com
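The clean-room approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: ToyIndex is a hypothetical in-memory stand-in with a naive boost-sum score, not a Solr or JBehave API, and a real suite would clear and populate a test Solr instance instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OrderingRegressionSketch {

    /** Hypothetical in-memory stand-in for the index under test (not a Solr API). */
    static final class ToyIndex {
        private final Map<String, Map<String, String>> docs = new LinkedHashMap<>();
        private final Map<String, Double> boosts = new HashMap<>();

        void clear() { docs.clear(); }

        void setBoost(String field, double boost) { boosts.put(field, boost); }

        void add(String id, String title, String artist) {
            Map<String, String> fields = new HashMap<>();
            fields.put("title", title);
            fields.put("artist", artist);
            docs.put(id, fields);
        }

        /** Returns ids of matching documents, highest score first. */
        List<String> search(String term) {
            final Map<String, Double> scores = new LinkedHashMap<>();
            for (Map.Entry<String, Map<String, String>> doc : docs.entrySet()) {
                double score = 0.0;
                for (Map.Entry<String, String> field : doc.getValue().entrySet()) {
                    // Naive scoring: add the field's boost when the field contains the term.
                    if (field.getValue().toLowerCase().contains(term.toLowerCase())) {
                        score += boosts.getOrDefault(field.getKey(), 1.0);
                    }
                }
                if (score > 0.0) {
                    scores.put(doc.getKey(), score);
                }
            }
            List<String> ids = new ArrayList<>(scores.keySet());
            ids.sort(Comparator.comparingDouble((String id) -> scores.get(id)).reversed());
            return ids;
        }
    }

    /** Clean room: clear the index, insert known data, query, return the observed order. */
    static List<String> titleMatchesFirst() {
        ToyIndex index = new ToyIndex();
        index.clear();
        index.setBoost("title", 2.0);
        index.setBoost("artist", 1.0);
        // Inserted in the "wrong" order so the check depends on the boosts, not on insertion.
        index.add("artist-match", "So What", "Pink");
        index.add("title-match", "Pink Moon", "Nick Drake");
        return index.search("pink");
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("title-match", "artist-match");
        List<String> actual = titleMatchesFirst();
        if (!actual.equals(expected)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
        System.out.println("order as expected: " + actual);
    }
}
```

If someone later lowers the title boost below the artist boost, the expected order flips and the check fails, which is the kind of early warning the tests described above are meant to give.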
Re: How to test Solr Integration - how to get EmbeddedSolrServer?
I use the following:

    <dependency>
      <groupId>org.apache.solr</groupId>
      <artifactId>solr-core</artifactId>
      <version>3.1.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.solr</groupId>
      <artifactId>solr-solrj</artifactId>
      <version>3.1.0</version>
    </dependency>

--
Colin Vipurs

> Hello,
>
> I'm starting to write tests of my Solr integration, and have unfortunately
> spent a lot of time chasing updated documentation.
>
> Follows a test I found here
> <http://blog.synyx.de/2011/01/integration-tests-for-your-solr-config/>
> which uses an EmbeddedSolrServer to communicate with the server and run some
> queries.
>
>     @Test
>     public void testThatNoResultsAreReturned() throws SolrServerException {
>         SolrParams params = new SolrQuery("text that is not found");
>         assertQ(TEST_SEED, null, tests);
>
>         QueryResponse response = req(params);
>         assertEquals(0L, response.getResults().getNumFound());
>     }
>
> The issue is that I cannot add a dependency on Solr-3.2-SNAPSHOT since it's
> packaged as a war. I've tried to attach the sources and make the dependency
> of type classes but it still won't work.
>
>     org.apache.maven.plugins
>     maven-war-plugin
>
>     web
>     web/WEB-INF/web.xml
>     * true*
>
> How could you use EmbeddedSolrServer outside of Solr Webapp?
>
> I've seen that org.apache.solr.client.solrj.embedded.TestSolrProperties does
> that in Solr Core, but not through a dependency on Solr Webapp (and I'm not
> figuring out where it comes from).
>
> --
> Regards,
> K. Gabriele
>
> --- unchanged since 20/9/10 ---
> P.S. If the subject contains "[LON]" or the addressee acknowledges the
> receipt within 48 hours then I don't resend the email.
> subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x)
> < Now + 48h) ⇒ ¬resend(I, this).
>
> If an email is sent by a sender that is not a trusted contact or the email
> does not contain a valid code then the email is not received. A valid code
> starts with a hyphen and ends with "X".
> ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
> L(-[a-z]+[0-9]X)).