Re: Help optimizing

Otis Gospodnetic Tue, 06 May 2008 19:57:50 -0700

Daniel,

The main difference is that string-type fields are not tokenized, while 
text-type fields are.
Example:
input text: milk with honey is goooood
String fields will end up with a single token: "milk with honey is goooood"
Text fields will end up with 5 tokens (assuming no stop word filtering): 
"milk", "with", "honey", "is", "goooood"

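To make the example above concrete, here is a minimal Python sketch of the two behaviours. It is not Solr code: the function names (index_as_string, index_as_text, matches) are hypothetical, and plain whitespace splitting stands in for a real text analyzer, which, depending on the schema, may also lowercase, drop stop words, or stem.

```python
def index_as_string(value):
    # A string-type field is not analyzed: the whole value is kept as one token.
    return [value]


def index_as_text(value):
    # Simplified stand-in for a text-type analyzer: split on whitespace only.
    return value.split()


def matches(tokens, term):
    # A single-term lookup succeeds only if the term equals an indexed token.
    return term in tokens


sample = "milk with honey is goooood"

string_tokens = index_as_string(sample)
text_tokens = index_as_text(sample)

print(string_tokens)
print(text_tokens)

# Looking up "honey" finds the text field but not the string field.
print(matches(text_tokens, "honey"))
print(matches(string_tokens, "honey"))

# The string field matches only on the complete, exact value.
print(matches(string_tokens, sample))
```

Under these assumptions, a search for "honey" finds the document through the text field only, while the string field matches nothing short of the full original value. That is why string tends to suit IDs, codes and other values looked up exactly, and text suits anything searched by word.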

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Daniel Andersson <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, May 6, 2008 5:43:44 PM
> Subject: Re: Help optimizing
> 
> Thanks Otis!
> 
> 
> On May 4, 2008, at 4:32 AM, Otis Gospodnetic wrote:
> 
> > You have a lot of fields of type text, but a number of fields sound  
> > like they really need not be tokenized and should thus be of type  
> > string.
> 
> I've changed quite a few of them over to string. Still not sure about  
> the difference between 'string' and 'text' :-/
> 
> 
> > Do you really need 6 warming searchers?
> 
> That I have no idea about. Currently it's a very small site, well,  
> visitor-wise anyway.
> 
> 
> > I think "date" type is pretty granular.  Do you really need that  
> > type of precision?
> 
> Probably not, have changed it to sint and will index the date in this  
> format 20070310, which should do the trick.
> 
> 
> > I don't have shell handy here to check, but is that 'M' in -Xmx...  
> > recognized, or should it be lowercase 'm'?
> 
> "Append the letter k  or K to indicate kilobytes or the letter m or M  
> to indicate megabytes.", so yeah, should recognize it.
> 
> 
> > Have you noticed anything weird while looking at the Solr Java  
> > process with jConsole?
> 
> I'm not very familiar with Java, so no idea what jConsole is :-/
> 
> 
> Will be re-indexing tomorrow with the date->sint and text->string  
> changes, will report back after it's done.
> 
> Cheers,
> Daniel
>
