Hi Lee,

Sorry, I think Erick and I both thought the issue was converting the synonyms, not removing the other words.
To keep only a set of words that match a list, use the
KeepWordFilterFactory, with your list of synonyms.

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.KeepWordFilterFactory

I'd put the synonym filter first in your configuration for the field,
then the keep words filter factory.

Tom

On Tue, Dec 7, 2010 at 12:06 PM, lee carroll
<lee.a.carr...@googlemail.com> wrote:
> ok thanks for your response
>
> To summarise the solution then:
>
> To only index synonyms you must only send words that will match the synonym
> list. If words without synonym matches are in the field to be indexed these
> words will be indexed. No way to avoid this by using schema.xml config.
>
> thanks lee c
>
> On 7 December 2010 13:21, Erick Erickson <erickerick...@gmail.com> wrote:
>
>> OK, the light finally dawns....
>>
>> *If* you have a defined list of words to remove, you can put them in
>> with your stopwords and add a stopword filter to the field in
>> schema.xml.
>>
>> Otherwise, you'll have to do some pre-processing and only send to
>> solr words you want. I'm assuming you have a list of valid words
>> (i.e. the words in your synonyms file) and could pre-filter the input
>> to remove everything else. In that case you don't need a synonyms
>> filter since you're controlling the whole process anyway....
>>
>> Best
>> Erick
>>
>> On Tue, Dec 7, 2010 at 6:07 AM, lee carroll
>> <lee.a.carr...@googlemail.com> wrote:
>>
>> > Hi tom
>> >
>> > This seems to place in the index
>> > This is a scenic line of words
>> > I just want scenic and words in the index
>> >
>> > I'm not at a terminal at the moment but will try again to make sure. I'm
>> > sure I'm missing the obvious
>> >
>> > Cheers lee
>> > On 7 Dec 2010 07:40, "Tom Hill" <solr-l...@worldware.com> wrote:
>> > > Hi Lee,
>> > >
>> > >
>> > > On Mon, Dec 6, 2010 at 10:56 PM, lee carroll
>> > > <lee.a.carr...@googlemail.com> wrote:
>> > >> Hi Erik
>> > >
>> > > Nope, Erik is the other one. :-)
>> > >
>> > >> thanks for the reply. I only want the synonyms to be in the index
>> > >> how can I achieve that ? Sorry probably missing something obvious in the
>> > >> docs
>> > >
>> > > Exactly what he said, use the => syntax. You've already got it. Add the
>> > > lines
>> > >
>> > > pretty => scenic
>> > > text => words
>> > >
>> > > to synonyms.txt, and it will do what you want.
>> > >
>> > > Tom
>> > >
>> > >> On 7 Dec 2010 01:28, "Erick Erickson" <erickerick...@gmail.com> wrote:
>> > >>> See:
>> > >>>
>> > >>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
>> > >>>
>> > >>> with the => syntax, I think that's what you're looking for
>> > >>>
>> > >>> Best
>> > >>> Erick
>> > >>>
>> > >>> On Mon, Dec 6, 2010 at 6:34 PM, lee carroll
>> > >>> <lee.a.carr...@googlemail.com> wrote:
>> > >>>
>> > >>>> Hi Can the following usecase be achieved.
>> > >>>>
>> > >>>> value to be analysed at index time "this is a pretty line of text"
>> > >>>>
>> > >>>> synonym list is pretty => scenic , text => words
>> > >>>>
>> > >>>> value placed in the index is "scenic words"
>> > >>>>
>> > >>>> That is to say only the matching synonyms. Basically i want to produce a
>> > >>>> normalised set of phrases for faceting.
>> > >>>>
>> > >>>> Cheers Lee C
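
P.S. For reference, here is a minimal sketch of what the suggestion at the top of this message (synonym filter first, then the keep words filter) might look like in schema.xml. The field type name text_synonyms_only, the choice of whitespace tokenizer, and the file name keepwords.txt are assumptions for illustration and do not come from the thread. keepwords.txt would list the synonym targets (scenic, words), one per line.

```xml
<!-- Hypothetical field type: the name, tokenizer, and keepwords.txt are illustrative assumptions -->
<fieldType name="text_synonyms_only" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Step 1: map source words to their normalised form, e.g. pretty => scenic -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="false"/>
    <!-- Step 2: discard every token that is not listed in keepwords.txt -->
    <filter class="solr.KeepWordFilterFactory" words="keepwords.txt"
            ignoreCase="true"/>
  </analyzer>
</fieldType>
```

With synonyms.txt holding "pretty => scenic" and "text => words", the input "this is a pretty line of text" should come out of this chain as just the tokens scenic and words. It is worth confirming that on the analysis page of the Solr admin interface before relying on it for faceting.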