That's ace, Tom.
Will give it a go, but it sounds spot on.
On 7 Dec 2010 20:49, "Tom Hill" <solr-l...@worldware.com> wrote:
> Hi Lee,
>
> Sorry, I think Erick and I both thought the issue was converting the
> synonyms, not removing the other words.
>
> To keep only a set of words that match a list, use the
> KeepWordFilterFactory, with your list of synonyms.
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.KeepWordFilterFactory
>
> I'd put the synonym filter first in your configuration for the field,
> then the keep words filter factory.
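>
> As a rough sketch only, that ordering might look like the following in
> schema.xml. The field type name (text_facet_terms), the whitespace
> tokenizer, and the keepwords.txt file name are assumptions for
> illustration; keepwords.txt would hold the right-hand sides of your
> synonym rules (scenic, words), one per line.
>
> ```xml
> <!-- Hypothetical field type: map synonyms first, then keep only listed words. -->
> <fieldType name="text_facet_terms" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <!-- Assumed tokenizer: split on whitespace only. -->
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <!-- synonyms.txt holds the rules: pretty => scenic, text => words -->
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="false"/>
>     <!-- keepwords.txt (assumed name) lists: scenic, words -->
>     <filter class="solr.KeepWordFilterFactory" words="keepwords.txt"
>             ignoreCase="true"/>
>   </analyzer>
> </fieldType>
> ```
>
> With a chain like this, "this is a pretty line of text" should be
> reduced to the tokens "scenic" and "words" at index time.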
>
> Tom
>
>
>
>
> On Tue, Dec 7, 2010 at 12:06 PM, lee carroll
> <lee.a.carr...@googlemail.com> wrote:
>> ok thanks for your response
>>
>> To summarise the solution then:
>>
>> To only index synonyms you must only send words that will match the
>> synonym list. If words without synonym matches are in the field to be
>> indexed, these words will be indexed. No way to avoid this by using
>> schema.xml config.
>>
>> thanks lee c
>>
>> On 7 December 2010 13:21, Erick Erickson <erickerick...@gmail.com> wrote:
>>
>>> OK, the light finally dawns....
>>>
>>> *If* you have a defined list of words to remove, you can put them in
>>> with your stopwords and add a stopword filter to the field in
>>> schema.xml.
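>>>
>>> A minimal sketch of what that could look like, assuming a hypothetical
>>> field type name (text_stopped), a whitespace tokenizer, and a
>>> stopwords.txt file containing the words to drop, one per line:
>>>
>>> ```xml
>>> <!-- Hypothetical field type: drop every token listed in stopwords.txt. -->
>>> <fieldType name="text_stopped" class="solr.TextField">
>>>   <analyzer type="index">
>>>     <!-- Assumed tokenizer: split on whitespace only. -->
>>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>     <filter class="solr.StopFilterFactory" words="stopwords.txt"
>>>             ignoreCase="true"/>
>>>   </analyzer>
>>> </fieldType>
>>> ```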
>>>
>>> Otherwise, you'll have to do some pre-processing and only send to
>>> solr words you want. I'm assuming you have a list of valid words
>>> (i.e. the words in your synonyms file) and could pre-filter the input
>>> to remove everything else. In that case you don't need a synonyms
>>> filter since you're controlling the whole process anyway....
>>>
>>> Best
>>> Erick
>>>
>>> On Tue, Dec 7, 2010 at 6:07 AM, lee carroll <lee.a.carr...@googlemail.com> wrote:
>>>
>>> > Hi tom
>>> >
>>> > This seems to place "this is a scenic line of words" in the index.
>>> > I just want "scenic" and "words" in the index.
>>> >
>>> > I'm not at a terminal at the moment but will try again to make sure.
>>> > I'm sure I'm missing the obvious.
>>> >
>>> > Cheers lee
>>> > On 7 Dec 2010 07:40, "Tom Hill" <solr-l...@worldware.com> wrote:
>>> > > Hi Lee,
>>> > >
>>> > >
>>> > > On Mon, Dec 6, 2010 at 10:56 PM, lee carroll
>>> > > <lee.a.carr...@googlemail.com> wrote:
>>> > >> Hi Erik
>>> > >
>>> > > Nope, Erik is the other one. :-)
>>> > >
>>> > >> Thanks for the reply. I only want the synonyms to be in the index.
>>> > >> How can I achieve that? Sorry, probably missing something obvious
>>> > >> in the docs.
>>> > >
>>> > > Exactly what he said, use the => syntax. You've already got it. Add
>>> > > the lines
>>> > >
>>> > > pretty => scenic
>>> > > text => words
>>> > >
>>> > > to synonyms.txt, and it will do what you want.
>>> > >
>>> > > Tom
>>> > >
>>> > >> On 7 Dec 2010 01:28, "Erick Erickson" <erickerick...@gmail.com> wrote:
>>> > >>> See:
>>> > >>>
>>> > >>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
>>> > >>>
>>> > >>> with the => syntax, I think that's what you're looking for
>>> > >>>
>>> > >>> Best
>>> > >>> Erick
>>> > >>>
>>> > >>> On Mon, Dec 6, 2010 at 6:34 PM, lee carroll <lee.a.carr...@googlemail.com> wrote:
>>> > >>>
>>> > >>>> Hi, can the following use case be achieved?
>>> > >>>>
>>> > >>>> value to be analysed at index time "this is a pretty line of text"
>>> > >>>>
>>> > >>>> synonym list is pretty => scenic , text => words
>>> > >>>>
>>> > >>>> value placed in the index is "scenic words"
>>> > >>>>
>>> > >>>> That is to say only the matching synonyms. Basically I want to
>>> > >>>> produce a normalised set of phrases for faceting.
>>> > >>>>
>>> > >>>> Cheers Lee C
>>> > >>>>