Yes, the LimitTokenCountFilterFactory will do the trick.
I have some examples in the book showing, for a given input string, what the
output tokens will be.
Otherwise, the Solr Javadoc does give one generic example, but without
showing how it actually works:
http://lucene.apache.org/core/4_3_1/analyzers-common/org/apache/lucene/analysis/miscellaneous/LimitTokenCountFilterFactory.html
The new Apache Solr Reference? No mention of the filter.
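For the copyField approach described in the question, a minimal schema.xml
sketch might look like the following. The field names (body,
body_first_words), the type name text_first_n, the tokenizer choice, and the
limit of 10 are placeholders for illustration, not anything taken from a real
schema; text_general is assumed to be an existing type in your schema.

```xml
<!-- Hypothetical field type: tokenizes normally, then keeps only the first 10 tokens. -->
<fieldType name="text_first_n" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Placed last, so the limit applies to the tokens produced by the chain above. -->
    <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10"/>
  </analyzer>
</fieldType>

<!-- Hypothetical fields: the full text, plus a truncated copy used only for scoring. -->
<field name="body" type="text_general" indexed="true" stored="true"/>
<field name="body_first_words" type="text_first_n" indexed="true" stored="false"/>

<!-- copyField passes the raw source text; truncation happens in the destination's analyzer. -->
<copyField source="body" dest="body_first_words"/>
```

With maxTokenCount="3" instead of 10, an input of "The quick brown fox jumps"
would be indexed in the destination field as just the, quick, brown. The
truncated field could then be weighted separately at query time, for example
with a dismax/edismax qf such as "body body_first_words^3" (the boost value
is arbitrary here).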
-- Jack Krupansky
-----Original Message-----
From: Daniel Collins
Sent: Wednesday, June 26, 2013 3:38 AM
To: solr-user@lucene.apache.org
Subject: How to truncate a particular field, LimitTokenCountAnalyzer or
LimitTokenCountFilter?
We have a requirement to grab the first N words in a particular field and
weight them differently for scoring purposes. So I thought to use a
<copyField> and have some extra filter on the destination to truncate it
down (post tokenization).
Did a quick search and found both a LimitTokenCountAnalyzer
and a LimitTokenCountFilter mentioned. If I read the wiki right, the Filter
is the correct approach for Solr, since we have the schema-able analyzer chain
and so don't need to code anything, right?
The Analyzer version would be more useful if we were explicitly coding up a
set of operations in Java, so that's what people using Lucene directly would
tend to use.
Just in search of confirmation really.