hash uniqueKey generation?
Hi, I just finished reading on the wiki about deduplication and the solr.UUIDField type. What I'd like to do is generate an ID for a document by hashing a subset of its fields. One route I thought would be to do this ahead of time to CSV data, but I would think sticking something into the UpdateRequest chain would be more elegant. Has anyone had any success in this area? Cheers, Dan http://twitter.com/danklynn
Re: hash uniqueKey generation?
Thanks for the feedback, guys! On 11/15/2010 10:14 AM, Dan Lynn wrote: Hi, I just finished reading on the wiki about deduplication and the solr.UUIDField type. What I'd like to do is generate an ID for a document by hashing a subset of its fields. One route I thought would be to do this ahead of time to CSV data, but I would think sticking something into the UpdateRequest chain would be more elegant. Has anyone had any success in this area? Cheers, Dan http://twitter.com/danklynn
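The "ahead of time to CSV data" route from Dan's question can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names `title` and `author`, the `id` column name, and the choice of SHA-1 are all hypothetical and do not come from the thread. The Deduplication wiki page Dan mentions also describes a SignatureUpdateProcessorFactory that computes a signature from chosen fields inside the update chain; whether it suits a uniqueKey in his setup is not confirmed here.

```python
import csv
import hashlib
import io

# Hypothetical subset of columns that defines a document's identity.
KEY_FIELDS = ("title", "author")


def hash_id(row, fields=KEY_FIELDS):
    """Return a hex SHA-1 digest computed over the selected fields only."""
    # A separator that is unlikely to occur in the data keeps
    # ("ab", "c") and ("a", "bc") from hashing to the same value.
    joined = "\x1f".join(row[f] for f in fields)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


def add_ids(src, dst, id_column="id"):
    """Copy CSV rows from src to dst, prepending a hashed id column."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=[id_column] + list(reader.fieldnames))
    writer.writeheader()
    for row in reader:
        row[id_column] = hash_id(row)
        writer.writerow(row)


if __name__ == "__main__":
    source = io.StringIO("title,author,body\nSolr tips,Dan,first draft\n")
    target = io.StringIO()
    add_ids(source, target)
    print(target.getvalue())
```

Two rows that agree on the key fields get the same id even if other columns differ, so re-posting an edited document would overwrite the earlier copy rather than add a second one.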
Re: Spell Checker
I had to deal with spellchecking a bit today. Make sure you are performing the analysis step at index time, as such:

schema.xml: [...] multiValued="true"/>

From http://wiki.apache.org/solr/SpellCheckingAnalysis: Use a copyField to divert your main text fields to the spell field and then configure your spell checker to use the "spell" field to derive the spelling index.

After this, you'll need to query a spellcheck-enabled handler with spellcheck.build=true or enable spellchecker index builds during optimize.

Hope this helps, Dan Lynn http://twitter.com/danklynn

On 11/16/2010 05:45 PM, Eric Martin wrote: Hi (again) I am looking at the spell checker options: http://wiki.apache.org/solr/SpellCheckerRequestHandler#Term_Source_Configuration http://wiki.apache.org/solr/SpellCheckComponent#Use_in_the_Solr_Example I am looking in my solrconfig.xml and I see one is already in use. I am kind of confused by this because the recommended spell checker is not default in my Solr 1.4.1. I have read the documentation but am still fuzzy on what I should do. My site uses legal terms and as you can see, some terms don't jive with the default spell checker so I was hoping to map the spell checker to the body for referencing dictionary words. I am unclear what approach I should take and how to start the quest. Can someone clarify what I should be doing here? Am I on the right track? Eric
Re: Spell Checker
See interjected responses below.

On 11/16/2010 06:14 PM, Eric Martin wrote: Thanks Dan! Few questions:

"Use a copyField to divert your main text fields to the spell field and then configure your spell checker to use the "spell" field to derive the spelling index."

Right. A copyField just copies data from one field to another during the indexing process. You can copy one field to n other fields without affecting the original.

This will still keep my current copyfield for the same data, right? I don't need to rebuild, just reindex.

"After this, you'll need to query a spellcheck-enabled handler with spellcheck.build=true or enable spellchecker index builds during optimize."

If you are using the default solrconfig.xml, a request handler should already be set up for you (but you shouldn't need a dedicated one for production: you can just embed the spell checker component in your default handler). Just query the example like this:

http://localhost:8983/solr/spell?q=ANYTHINGHERE&spellcheck=true&spellcheck.collate=true&spellcheck.build=true

Note the "spellcheck.build=true" parameter.

Cheers, Dan http://twitter.com/danklynn

Totally lost on that. I will buy a book here shortly.

-----Original Message----- From: Dan Lynn [mailto:d...@danlynn.com] Sent: Tuesday, November 16, 2010 5:01 PM To: solr-user@lucene.apache.org Subject: Re: Spell Checker

I had to deal with spellchecking today a bit. Make sure you are performing the analysis step at index-time as such: schema.xml: [...] From http://wiki.apache.org/solr/SpellCheckingAnalysis: Use a copyField to divert your main text fields to the spell field and then configure your spell checker to use the "spell" field to derive the spelling index. After this, you'll need to query a spellcheck-enabled handler with spellcheck.build=true or enable spellchecker index builds during optimize.
Hope this helps, Dan Lynn http://twitter.com/danklynn

On 11/16/2010 05:45 PM, Eric Martin wrote: Hi (again) I am looking at the spell checker options: http://wiki.apache.org/solr/SpellCheckerRequestHandler#Term_Source_Configuration http://wiki.apache.org/solr/SpellCheckComponent#Use_in_the_Solr_Example I am looking in my solrconfig.xml and I see one is already in use. I am kind of confused by this because the recommended spell checker is not default in my Solr 1.4.1. I have read the documentation but am still fuzzy on what I should do. My site uses legal terms and as you can see, some terms don't jive with the default spell checker so I was hoping to map the spell checker to the body for referencing dictionary words. I am unclear what approach I should take and how to start the quest. Can someone clarify what I should be doing here? Am I on the right track? Eric
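For readers who, like Eric, were lost at the query step: the request Dan describes is an ordinary HTTP GET with a few extra parameters. The sketch below only assembles that URL. The host, port, and `/solr/spell` handler path are the example values from Dan's reply, and the helper name `spellcheck_url` is made up for illustration.

```python
from urllib.parse import urlencode


def spellcheck_url(query, build=False, base="http://localhost:8983/solr/spell"):
    """Build a query URL for a spellcheck-enabled request handler.

    `base` is the example handler from the thread; adjust it for your setup.
    """
    params = [
        ("q", query),
        ("spellcheck", "true"),
        ("spellcheck.collate", "true"),
    ]
    if build:
        # Asks Solr to (re)build the spelling index as part of this request.
        params.append(("spellcheck.build", "true"))
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # One-off request after reindexing, to build the spelling index.
    print(spellcheck_url("ANYTHINGHERE", build=True))
    # Normal user query: no rebuild.
    print(spellcheck_url("habeas corpus"))
```

Keeping `build` off by default reflects the intent of Dan's note: the build parameter is something you send after (re)indexing, not on every search.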
Re: Need Middleware between search client and solr?
You might be able to skip a front end to Solr by making extensive use of XSL to format the results, but there are several other arguments for putting code in front of Solr (e.g. saved searches, custom sorting, result-level embedded actions, etc.). Cheers, Dan

On 11/19/2010 01:58 PM, cyang2010 wrote: Hi, I am new to Lucene/Solr. I have a very general question and hope to hear your recommendation. Do you need a middleware/module between your search client and the Solr server? The response message is very Solr-specific. Do you need to translate it to an application object model and return it to the search client? In that case, I am thinking of having a search module in a middleware server. It will route/decorate the search request to the Solr server and, after getting the Solr response, package it in an application object list to return to the search client. Does that make sense? My concern is whether it unnecessarily adds a network layer and slows down the search. But from an application point of view, I see it as necessary. What do you think? Thanks, cy
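The translation step cy describes, turning a Solr response into application objects, can be quite thin. Below is a minimal sketch that assumes the middleware asks Solr for JSON output (`wt=json`) and that documents carry single-valued `id` and `title` fields. The field names and the `SearchHit` / `SearchPage` classes are hypothetical, not something from the thread.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class SearchHit:
    doc_id: str
    title: str


@dataclass
class SearchPage:
    total: int            # total matches reported by Solr (numFound)
    hits: List[SearchHit]  # only the documents on this page


def parse_solr_response(raw: str) -> SearchPage:
    """Map a Solr JSON response body onto plain application objects."""
    body = json.loads(raw)["response"]
    hits = [
        SearchHit(doc_id=str(doc["id"]), title=doc.get("title", ""))
        for doc in body["docs"]
    ]
    return SearchPage(total=body["numFound"], hits=hits)


if __name__ == "__main__":
    sample = json.dumps({
        "responseHeader": {"status": 0},
        "response": {
            "numFound": 2,
            "start": 0,
            "docs": [{"id": "a1", "title": "First"}, {"id": "b2"}],
        },
    })
    print(parse_solr_response(sample))
```

A layer like this keeps the search client independent of Solr's response format, at the cost of the extra network hop cy is worried about.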