On 12 Apr 2012, at 17:46, Michael Ludwig wrote:
>> Some compounds probably should not be decompounded, like "Fahrrad"
>> (fahren/Rad). With a dictionary-based stemmer, you might decide to
>> avoid decompounding for words in the dictionary.
> 
> Good point.

More or less: Fahrrad is generally shortened to Rad
(even though Rad can mean both wheel and bike).

>> Note that highlighting gets pretty weird when you are matching only
>> part of a word.
> 
> Guess it'll be weird when you get it wrong, like "Noten" in
> "Notentriegelung".

This decomposition should not happen, because in Noten-triegelung the second
part, "triegelung", is not a valid word.

>> The Basis Technology linguistic analyzers aren't cheap or small, but
>> they work well.
> 
> We will consider our needs and options. Thanks for your thoughts.

My question remains: which domain does it aim to cover?
We had such a need for mathematical texts... I would be pleasantly surprised if,
for example, Differenzen-quotient were decompounded.

paul
