Hi Joe,
WordDelimiterFilter splits the input on various delimiters (dropping the
delimiters themselves) and creates several tokens from it. It can also
catenate the parts and add the result as an additional token to the
stream. However, it catenates without spaces. But maybe you can tweak it
to your needs?
You could also use two different fields: one producing the concatenated
version with spaces, and the other producing the catenated tokens (both
using WordDelimiter and/or RegexPattern filters, etc.).
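To make the difference concrete, here is a rough sketch in plain Python.
It is not Solr or Lucene code: the regex and the function names
(analyze, catenate, concatenate_with_spaces) are made up for
illustration and only imitate what an analysis chain might do with your
example input.

```python
import re

def analyze(text):
    """Hypothetical stand-in for an analysis chain:
    lowercase, then split on runs of non-alphanumeric characters."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]

def catenate(tokens):
    """Join tokens with no separator (catenation without spaces)."""
    return "".join(tokens)

def concatenate_with_spaces(tokens):
    """Join tokens with a single space, the result asked for."""
    return " ".join(tokens)

tokens = analyze("Sprocket (widget) - Blue")
print(tokens)
print(catenate(tokens))
print(concatenate_with_spaces(tokens))
```

A custom filter would have to do the equivalent of
concatenate_with_spaces inside the token stream: consume all incoming
tokens, buffer their text, and emit one joined token at the end.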
Cheers,
Chantal
Joe Calderon wrote:
Hello *, I'm using a combination of tokenizers and filters that gives me
the desired tokens. However, for a particular field I want to
concatenate these tokens back into a single string. Is there a filter to
do that? If not, what are the steps needed to make my own filter to
concatenate tokens?
For example, I start with "Sprocket (widget) - Blue". The analyzers
churn out the tokens [sprocket,widget,blue], and I want to end up with
the string "sprocket widget blue". This is a simple example; in the
general case, lowercasing and punctuation removal do not work, which is
why I'm looking to concatenate tokens.
--joe