hash uniqueKey generation?

2010-11-15 Thread Dan Lynn

Hi,

I just finished reading on the wiki about deduplication and the 
solr.UUIDField type. What I'd like to do is generate an ID for a 
document by hashing a subset of its fields. One route I thought would be 
to do this ahead of time to CSV data, but I would think sticking 
something into the UpdateRequest chain would be more elegant.


Has anyone had any success in this area?

Cheers,
Dan
http://twitter.com/danklynn
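
A minimal sketch of the ahead-of-time CSV route described above, not a Solr feature: it hashes a chosen subset of columns and prepends the digest as an ID column. The field names (`title`, `author`), the `id` column name, and the helper names are all hypothetical; SHA-1 is just one reasonable digest choice.

```python
import csv
import hashlib
import io

# Hypothetical field names: substitute the columns that define a duplicate.
KEY_FIELDS = ("title", "author")


def hashed_id(row, key_fields=KEY_FIELDS):
    """Return a hex SHA-1 digest computed over the chosen fields of one row."""
    # Join with the ASCII unit separator so that ("ab", "c") and ("a", "bc")
    # do not collapse into the same input string.
    joined = "\x1f".join(row[name] for name in key_fields)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


def add_ids(csv_text, id_column="id"):
    """Read CSV text and return CSV text with a hashed ID column prepended."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=[id_column] + list(reader.fieldnames),
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        row[id_column] = hashed_id(row)
        writer.writerow(row)
    return out.getvalue()
```

Rows that agree on every key field get the same ID, so re-indexing them overwrites rather than duplicates, provided that column is the schema's uniqueKey.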


Re: hash uniqueKey generation?

2010-11-16 Thread Dan Lynn

Thanks for the feedback, guys!





Re: Spell Checker

2010-11-16 Thread Dan Lynn
I had to deal with spellchecking today a bit. Make sure you are 
performing the analysis step at index-time as such:


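schema.xml:

   [The fieldType and field definitions were stripped from the archived
   message; only the end of the last field element survives:]

   ... multiValued="true"/>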



From http://wiki.apache.org/solr/SpellCheckingAnalysis:

   Use a <copyField/> to divert your main text fields to the spell field and then 
configure your spell checker to use the "spell" field to derive the spelling index.


After this, you'll need to query a spellcheck-enabled handler with 
spellcheck.build=true or enable spellchecker index builds during optimize.


Hope this helps,

Dan Lynn
http://twitter.com/danklynn


On 11/16/2010 05:45 PM, Eric Martin wrote:

Hi (again)



I am looking at the spell checker options:



http://wiki.apache.org/solr/SpellCheckerRequestHandler#Term_Source_Configuration



http://wiki.apache.org/solr/SpellCheckComponent#Use_in_the_Solr_Example



I am looking in my solrconfig.xml and I see one is already in use. I am kind
of confused by this because the recommended spell checker is not default in
my Solr 1.4.1. I have read the documentation but am still fuzzy on what I
should do.



My site uses legal terms and, as you can see, some terms don't jibe with the
default spell checker, so I was hoping to map the spell checker to the body
field as the source of dictionary words. I am unclear what approach I should
take and how to start the quest.



Can someone clarify what I should be doing here? Am I on the right track?



Eric






Re: Spell Checker

2010-11-16 Thread Dan Lynn

See interjected responses below

On 11/16/2010 06:14 PM, Eric Martin wrote:

Thanks Dan! Few questions:

Use a <copyField/> to divert your main text fields to the spell field and
then configure your spell checker to use the "spell" field to derive the
spelling index.
Right. A copyField just copies data from one field to another during the 
indexing process. You can copy one field to n other fields without 
affecting the original.

This will still keep my current copyfield for the same data, right?

I don't need to rebuild, just reindex.

"After this, you'll need to query a spellcheck-enabled handler with
spellcheck.build=true or enable spellchecker index builds during optimize."
If you are using the default solrconfig.xml, a request handler should 
already be set up for you (but you shouldn't need a dedicated one for 
production: you can just embed the spell checker component in your 
default handler). Just query the example like this:


http://localhost:8983/solr/spell?q=ANYTHINGHERE&spellcheck=true&spellcheck.collate=true&spellcheck.build=true

Note the "spellcheck.build=true" parameter.
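
A small sketch of building that request programmatically, assuming the example `/spell` handler at the URL above. The function name is hypothetical, and only the parameters already shown in the URL are used.

```python
from urllib.parse import urlencode


def spellcheck_build_url(query, base="http://localhost:8983/solr/spell"):
    """Build a query URL that also asks Solr to (re)build the spelling index."""
    params = [
        ("q", query),
        ("spellcheck", "true"),
        ("spellcheck.collate", "true"),
        # Triggers the index build; drop this on ordinary queries.
        ("spellcheck.build", "true"),
    ]
    # urlencode escapes the query text (spaces become "+").
    return base + "?" + urlencode(params)
```

Building the spelling index can be expensive, so it is usually sent once after indexing rather than on every search.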

Cheers,
Dan
http://twitter.com/danklynn



Totally lost on that.

I will buy a book here shortly.


Re: Need Middleware between search client and solr?

2010-11-19 Thread Dan Lynn
You might be able to skip a front end to Solr by making extensive use 
of XSL to format the results, but there are several arguments for 
putting code in front of Solr (e.g. saved searches, custom sorting, 
result-level embedded actions, etc.).


Cheers,
Dan

On 11/19/2010 01:58 PM, cyang2010 wrote:

Hi,

I am new to Lucene/Solr. I have a very general question and hope to
hear your recommendations.

Do you need a middleware module between your search client and the Solr server?
The response message is very Solr-specific. Do you need to translate it into an
application object model before returning it to the search client? In that case, I
am thinking of putting a search module on a middleware server. It would
route/decorate the search request to the Solr server and, after getting the Solr
response, package it as a list of application objects to return to the search
client. Does that make sense?
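
A rough sketch of the translation step described above, assuming the middleware requests Solr's JSON response format (`wt=json`) and that documents carry `id` and `title` fields. The class names, function name, and field choices are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class SearchHit:
    """Hypothetical application-side object; the fields are placeholders."""
    doc_id: str
    title: str


@dataclass
class SearchPage:
    total: int
    hits: List[SearchHit]


def to_search_page(solr_json_text):
    """Translate a Solr JSON response body into application objects."""
    body = json.loads(solr_json_text)
    response = body["response"]
    hits = [
        SearchHit(doc_id=str(doc.get("id", "")), title=str(doc.get("title", "")))
        for doc in response["docs"]
    ]
    # numFound is the total match count, not the size of this page.
    return SearchPage(total=response["numFound"], hits=hits)
```

The search client then depends only on `SearchPage` and `SearchHit`, so the Solr response shape can change without touching client code.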

My concern is whether this unnecessarily adds a network layer and slows down
the search. But from the application's point of view, I think it is
necessary. What do you think?

Thanks,


cy