I'm not sure there is much benefit to storing the literal values in that external file rather than directly in the index. As far as performance is concerned, there are a number of things one should look at first: JVM settings, cache sizes, analysis, etc.
For example, I have one index here that is 9 times the size of the original data because of how its fields are analyzed. I can change one analysis-level setting and make that ratio go down to 2. So I'd look at other, more straightforward things first. There is a page on either the Solr or the Lucene Wiki dedicated to various search performance tricks.

Otis
--
Sematext is hiring: http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR

----- Original Message ----
> From: Jibo John <jiboj...@mac.com>
> To: solr-user@lucene.apache.org
> Sent: Thursday, July 23, 2009 12:08:26 PM
> Subject: Re: Storing string field in solr.ExternalFieldFile type
>
> Thanks for the response, Eric.
>
> We have seen that size of the index has a direct impact on the search speed, especially when the index size is in GBs, so trying all possible ways to keep the index size as low as we can.
>
> We thought solr.ExternalFileField type would help to keep the index size low by storing a text field outside of the index.
>
> Here's what we were planning: initially, all the fields except the solr.ExternalFileField type field will be queried and will be displayed to the end user. There will be subsequent calls from the UI to pull the solr.ExternalFileField field that will be loaded in a lazy manner.
>
> However, realized that solr.ExternalFileField only supports float type, however, the data that we're planning to keep as an external field is a string type.
>
> Thanks,
> -Jibo
>
> On Jul 22, 2009, at 1:46 PM, Erick Erickson wrote:
>
> > Hoping the experts chime in if I'm wrong, but....
> > As far as I know, while storing a field increases the size of an index, it doesn't have much impact on the search speed. Which you could pretty easily test by creating the index both ways and firing off some timing queries and comparing..... Although it would be time consuming...
> >
> > I believe there's some info on the Lucene Wiki about this, but my memory isn't what it used to be.
> >
> > Erick
> >
> > On Tue, Jul 21, 2009 at 2:42 PM, Jibo John wrote:
> >
> >> We're in the process of building a log searcher application.
> >>
> >> In order to reduce the index size to improve the query performance, we're exploring the possibility of having:
> >>
> >> 1. One field for each log line with 'indexed=true & stored=false' that will be used for searching
> >> 2. Another field for each log line of type solr.ExternalFileField that will be used only for display purpose.
> >>
> >> We realized that currently solr.ExternalFileField supports only float type.
> >>
> >> Is there a way we can override this to support string type? Any issues with this approach?
> >>
> >> Any ideas are welcome.
> >>
> >> Thanks,
> >> -Jibo
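
P.S. In case it helps, here is a rough sketch of the "build the index both ways and fire off some timing queries" comparison mentioned in the quoted thread. It is only an illustration under assumptions: the two core URLs and the sample queries are made up, and it assumes each index variant is reachable through a Solr select handler that returns JSON (wt=json) with the usual responseHeader.QTime value. Repeated identical queries will mostly be answered from Solr's caches, so a realistic run would need a varied query list.

```python
import json
import statistics
import time
import urllib.parse
import urllib.request


def fetch_qtime(base_url, query):
    """Run one query against a select handler; return (QTime ms, wall-clock ms)."""
    params = urllib.parse.urlencode({"q": query, "rows": 10, "wt": "json"})
    start = time.perf_counter()
    with urllib.request.urlopen(f"{base_url}/select?{params}") as resp:
        body = json.load(resp)
    wall_ms = (time.perf_counter() - start) * 1000.0
    return body["responseHeader"]["QTime"], wall_ms


def compare(base_urls, queries, rounds=5, fetch=fetch_qtime):
    """Return {label: (median QTime ms, median wall-clock ms)} per index variant.

    `fetch` is injectable so the timing loop can be exercised without a server.
    """
    results = {}
    for label, url in base_urls.items():
        qtimes, walls = [], []
        for _ in range(rounds):
            for q in queries:
                qtime, wall = fetch(url, q)
                qtimes.append(qtime)
                walls.append(wall)
        results[label] = (statistics.median(qtimes), statistics.median(walls))
    return results


if __name__ == "__main__":
    # Hypothetical core names: one index with the log line stored,
    # one with the same field indexed only (stored=false).
    indexes = {
        "stored": "http://localhost:8983/solr/logs_stored",
        "search_only": "http://localhost:8983/solr/logs_search_only",
    }
    # Hypothetical sample queries; use real ones from your logs.
    sample_queries = ["error", "timeout", "connection refused"]
    for label, (qtime, wall) in compare(indexes, sample_queries).items():
        print(f"{label}: median QTime={qtime} ms, median wall-clock={wall:.1f} ms")
```

QTime covers only the time Solr spends on the search itself, while the wall-clock figure also includes writing and transferring the response, so looking at both should show whether stored fields affect searching or only retrieval.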