Hello, 300K is a pretty small index. I wouldn't worry about the number of synonyms unless you are turning a single term into dozens of ORed terms.
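
To illustrate the point about ORed terms, here is a minimal Python sketch. It is not how Solr's synonym filter is implemented; the SYNONYMS table and the expand and to_or_clause helpers are hypothetical names made up for this example. It only shows that the extra work per query depends on how many alternatives a single term expands into, not on how many entries the synonyms file holds.

```python
# Hypothetical synonym table (illustration only, not Solr's file format):
# each key expands to itself plus the listed alternatives.
SYNONYMS = {
    "san diego": ["california"],
    "carlsbad": ["california"],
    "san francisco": ["california"],
}


def expand(phrase, synonyms=SYNONYMS):
    """Return the alternatives a phrase would be ORed into at query time."""
    key = phrase.lower()
    return [key] + synonyms.get(key, [])


def to_or_clause(phrase, synonyms=SYNONYMS):
    """Render the expansion as a parenthesised OR clause."""
    alternatives = expand(phrase, synonyms)
    # Quote multi-word alternatives so they read as phrases.
    quoted = ['"%s"' % a if " " in a else a for a in alternatives]
    return "(" + " OR ".join(quoted) + ")"


print(to_or_clause("san francisco"))  # ("san francisco" OR california)
print(len(SYNONYMS), "table entries,", len(expand("san francisco")), "clauses for this term")
```

Adding more cities grows the table but leaves each city at two clauses. The case to watch is a single term that expands into dozens of alternatives.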

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: anuvenk <anuvenkat...@hotmail.com>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 2, 2009 11:28:43 PM
> Subject: Re: Is there Downside to a huge synonyms file?
>
>
> I'm using query time synonyms. I have more fields in my index though. This is
> just an example or sample of data from my index. Yes, we don't have millions
> of documents. Could be around 300,000 and might increase in future. The
> reason I'm using query time synonyms is because of the nature of my data. I
> can't re-index the data every time I add or remove a synonym. But for this
> particular requirement, is it best to have index time synonyms because of the
> multi-word synonym nature? Again, if I add more cities to the synonym
> file, I can't be re-indexing all the data over and over again.
>
>
>
> anuvenk wrote:
> >
> > In my index I have legal faqs, forms, legal videos etc. with a state field
> > for each resource.
> > Now if I search for real estate san diego, I want to be able to return
> > other 'california' results, i.e. results from san francisco.
> > I have the following fields in the index
> >
> > title                              state       description...
> > real estate san diego example 1    california  some description
> > real estate carlsbad example 2     california  some desc
> >
> > so when I search for real estate san francisco, since there is no match, I
> > want to be able to return the other real estate results in california
> > instead of returning none. Because sometimes they might be searching for a
> > real estate form and city probably doesn't matter.
> >
> > I have two things in mind.
> > One is adding a synonym mapping
> >
> > san diego, california
> > carlsbad, california
> > san francisco, california
> >
> > (which probably isn't the best way)
> > hoping that a search for san francisco real estate would map san francisco
> > to california and hence return the other two california results
> >
> > OR
> >
> > adding the mapping of city to state in the index itself, like:
> >
> > title                       state       city                                description...
> > real estate san diego eg 1  california  carlsbad, san francisco, san diego  some description
> > real estate carlsbad eg 2   california  carlsbad, san francisco, san diego  some description
> >
> > Which of the above two is better? Does a huge synonym file affect
> > performance? Or is there an even better way? I'm sure there is but I can't
> > put my finger on it yet & I'm not familiar with java either.
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/Is-there-Downside-to-a-huge-synonyms-file--tp23842527p23844761.html
> Sent from the Solr - User mailing list archive at Nabble.com.