I'm not sure there is much benefit in storing the literal values in that 
external file rather than directly in the index.  There are a number of things 
one should look at first as far as performance is concerned: JVM settings, 
cache sizes, analysis, etc.
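
One way to tell whether any such change helps is to time a fixed set of queries before and after it, which is also what Erick suggests in the quoted message below. Here is a minimal sketch of such a harness. The function names, the `rows=10` parameter value, and the idea of passing the query runner in as a callable are all assumptions made for illustration, not anything prescribed by Solr.

```python
import statistics
import time
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterable


def time_queries(run_query: Callable[[str], object],
                 queries: Iterable[str],
                 repeats: int = 5) -> Dict[str, float]:
    """Return the median wall-clock latency in seconds for each query."""
    results = {}
    for q in queries:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(q)  # the result is discarded; only the timing matters
            samples.append(time.perf_counter() - start)
        results[q] = statistics.median(samples)
    return results


def solr_runner(base_url: str) -> Callable[[str], bytes]:
    """Build a query runner that calls a Solr /select handler over HTTP.

    base_url is a hypothetical core URL supplied by the caller.
    """
    def run(q: str) -> bytes:
        params = urllib.parse.urlencode({"q": q, "rows": 10})
        with urllib.request.urlopen(f"{base_url}/select?{params}") as resp:
            return resp.read()
    return run
```

Running `time_queries(solr_runner(url_a), queries)` and `time_queries(solr_runner(url_b), queries)` against two builds of the same index gives per-query medians to compare. Repeating each query matters because the first run usually also pays for warming the caches.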

For example, I have one index here that is 9 times the size of the original 
data because of how its fields are analyzed.  I can change one analysis-level 
setting and make that ratio go down to 2.  So I'd look at other, more 
straightforward things first.  There is a page on either the Solr or the 
Lucene wiki dedicated to various search performance tricks.
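
The index-to-data ratio mentioned above can be measured directly from the file system. This is a rough sketch only: it assumes the raw data and the index each live in their own directory, and the function names are made up for illustration.

```python
import os


def dir_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under path, recursively."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def index_to_data_ratio(index_dir: str, data_dir: str) -> float:
    """Return the index size divided by the raw data size."""
    data_size = dir_size_bytes(data_dir)
    if data_size == 0:
        raise ValueError("source data directory is empty")
    return dir_size_bytes(index_dir) / data_size
```

Computing this ratio before and after an analysis change (and a full reindex) shows how much that change alone contributes to index size.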

 Otis
--
Sematext is hiring: http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Jibo John <jiboj...@mac.com>
> To: solr-user@lucene.apache.org
> Sent: Thursday, July 23, 2009 12:08:26 PM
> Subject: Re: Storing string field in solr.ExternalFieldFile type
> 
> Thanks for the response, Eric.
> 
> We have seen that the size of the index has a direct impact on search speed, 
> especially when the index size is in GBs, so we are trying all possible ways 
> to keep the index as small as we can.
> 
> We thought the solr.ExternalFileField type would help keep the index size low 
> by storing a text field outside of the index.
> 
> Here's what we were planning: initially, all the fields except the 
> solr.ExternalFileField type field will be queried and displayed to the end 
> user. Subsequent calls from the UI will then pull the solr.ExternalFileField 
> field, which will be loaded lazily.
> 
> However, we realized that solr.ExternalFileField only supports the float type, 
> whereas the data that we're planning to keep as an external field is a string.
> 
> Thanks,
> -Jibo
> 
> 
> 
> On Jul 22, 2009, at 1:46 PM, Erick Erickson wrote:
> 
> > Hoping the experts chime in if I'm wrong, but....
> > As far as I know, while storing a field increases the size of an index,
> > it doesn't have much impact on the search speed. Which you could
> > pretty easily test by creating the index both ways and firing off some
> > timing queries and comparing..... Although it would be time consuming...
> > 
> > I believe there's some info on the Lucene Wiki about this, but my memory
> > isn't what it used to be.
> > 
> > Erick
> > 
> > 
> > On Tue, Jul 21, 2009 at 2:42 PM, Jibo John wrote:
> > 
> >> We're in the process of building a log searcher application.
> >> 
> >> In order to reduce the index size to improve the query performance, we're
> >> exploring the possibility of having:
> >> 
> >> 1. One field for each log line with 'indexed=true & stored=false' that
> >> will be used for searching
> >> 2. Another field for each log line of type solr.ExternalFileField that
> >> will be used only for display purposes.
> >> 
> >> We realized that currently solr.ExternalFileField supports only float type.
> >> 
> >> Is there a way we can override this to support string type? Any issues with
> >> this approach?
> >> 
> >> Any ideas are welcome.
> >> 
> >> 
> >> Thanks,
> >> -Jibo
> >> 
> >> 
> >> 