spell check component

2008-01-19 Thread anuvenk

Is it possible to add a spell check component so I don't have to issue a
separate request to Solr to do the spell checking? Sorry if this question is
naive; I am just learning to use Solr.



and add it to the search handler like this


  spellcheck


What would the name of the spell check component be?

-- 
View this message in context: 
http://www.nabble.com/spell-check-component-tp14973651p14973651.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: spell check component

2008-01-19 Thread Ryan McKinley
There is not yet a spell checking component...  It would be great to 
have though!



anuvenk wrote:

> Is it possible to add a spell check component so I don't have to issue a
> separate request to Solr to do the spell checking? Sorry if this question is
> naive; I am just learning to use Solr.
>
> class="org.apache.solr.handler.component.spellcheckComponent" />
>
> and add it to the search handler like this
>
>   spellcheck
>
> What would the name of the spell check component be?





Re: Solr feasibility with terabyte-scale data

2008-01-19 Thread Ryan McKinley


> We are considering Solr 1.2 to index and search a terabyte-scale dataset
> of OCR.  Initially our requirements are simple: basic tokenizing, score
> sorting only, no faceting.   The schema is simple too.  A document
> consists of a numeric id, stored and indexed, and a large text field,
> indexed but not stored, containing the OCR, typically ~1.4Mb.  Some limited
> faceting or additional metadata fields may be added later.


I have not done anything on this scale, but with
https://issues.apache.org/jira/browse/SOLR-303 it will be possible to
split a large index into many smaller indices and return the union of
all results.  This may or may not be necessary depending on what the
data actually looks like (if your text uses only 100 distinct words, your
index may not be that big).
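
To illustrate the "union of all results" idea only: below is a minimal Python sketch of merging ranked hits from several smaller indices into one top-k list. It is not the SOLR-303 implementation. The function name, the shard lists, and the document ids are all hypothetical, and it assumes that scores from different indices are directly comparable, which is generally not true when each index computes its own term statistics.

```python
import heapq
from itertools import islice


def merge_shard_results(shard_results, k):
    """Return the k best (score, doc_id) hits across all shards.

    Assumption: each shard's list is already sorted by descending score,
    as a single index would return its own top hits.
    """
    # heapq.merge performs a lazy k-way merge of the sorted inputs;
    # reverse=True keeps the merged stream in descending score order.
    merged = heapq.merge(*shard_results, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, k))


# Hypothetical hits from three small indices: (score, doc_id)
shard_a = [(0.92, "a17"), (0.40, "a03")]
shard_b = [(0.88, "b02"), (0.75, "b11"), (0.10, "b09")]
shard_c = [(0.95, "c05")]

top3 = merge_shard_results([shard_a, shard_b, shard_c], k=3)
print(top3)
```

Each index only has to return its own top k hits, so the merge step stays cheap however large the individual indices are.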


How many documents are you talking about?



> Should we expect Solr indexing time to slow significantly as we scale
> up?  What kind of query performance could we expect?  Is it totally
> naive even to consider Solr at this kind of scale?




You may want to check out the Lucene benchmark stuff:
http://lucene.apache.org/java/docs/benchmarks.html

http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/benchmark/byTask/package-summary.html


ryan




Re: spellcheckhandler

2008-01-19 Thread anuvenk

I was going to do this:
create a new field (the termSourceField) called 'spell'

of type 'spell'

copy my 'name' and 'body' fields to this 'spell' field at index time

   

But as you mentioned, the tutorial says it has to be used on a field
that is not tokenized. So how can I use my tokenized fields 'body' and 'name'
to build my spell index?

How to use it effectively for spell checking on multi-word queries?
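
One possible approach to the multi-word case is to split the query on whitespace and correct each word separately. The Python sketch below shows that idea only. It uses the standard library's difflib for fuzzy matching instead of the Lucene spell checker, and the function name, the cutoff value, and the tiny vocabulary are all assumptions for illustration. In a real setup the vocabulary would be the terms held in the 'spell' field.

```python
import difflib


def suggest_query(query, vocabulary, cutoff=0.7):
    """Correct a multi-word query one token at a time.

    Tokens that are already in the vocabulary, or that are plain numbers,
    are kept as-is; other tokens are replaced by their closest vocabulary
    entry, if any entry scores at or above the similarity cutoff.
    """
    corrected = []
    for token in query.split():
        if token in vocabulary or token.isdigit():
            corrected.append(token)
            continue
        # get_close_matches returns the best matches, most similar first
        matches = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return " ".join(corrected)


# Hypothetical vocabulary standing in for the indexed terms
vocabulary = ["chapter", "bankruptcy", "filing", "court"]

print(suggest_query("chater 13 bakrupcy", vocabulary))
```

Correcting each word independently ignores context, so when a misspelling is equally close to two dictionary terms the choice between them is arbitrary.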


anuvenk wrote:
> 
> Is it possible to implement something like this with the spellcheckhandler,
> the way Google does?
> 
> Say I search for 'chater 13 bakrupcy'.
> 
> It should be able to display this:
> 
> Did you search for 'chapter 13 bankruptcy'?
> 
> Has someone been able to do this?
> 

-- 
View this message in context: 
http://www.nabble.com/spellcheckhandler-tp14627712p14977717.html