The org.apache.solr.analysis.RemoveDuplicatesTokenFilter, as per its
description, "Filters out any tokens which are at the same logical position in
the tokenstream as a previous token with the same text."
A very useful filter would be one which filters out duplicate tokens throughout
the field,
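
To illustrate the behaviour being asked for, here is a minimal sketch in plain Java. It is not a Lucene `TokenFilter` and is not part of Solr; the class name `FieldWideDedup` and the method `dedupe` are hypothetical. It assumes a field's tokens are already available as a list of strings, and it ignores token positions entirely, keeping only the first occurrence of each token text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper, not a Solr or Lucene class.
final class FieldWideDedup {

    // Returns the tokens with every repeated text removed, keeping the
    // first occurrence of each and preserving the original order.
    // Assumption: position information is not needed by the caller.
    static List<String> dedupe(List<String> tokens) {
        // LinkedHashSet drops duplicates while remembering insertion order.
        return new ArrayList<>(new LinkedHashSet<>(tokens));
    }

    public static void main(String[] args) {
        List<String> field = Arrays.asList("solr", "lucene", "solr", "index", "lucene");
        System.out.println(dedupe(field)); // prints [solr, lucene, index]
    }
}
```

A real filter would have to do the same bookkeeping inside the analysis chain, tracking the token texts already seen for the current field and deciding how to handle the position increments of the dropped tokens.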

Hey Shawn,
> The config with the old policy used to be the literal name
> "mergeFactor". With TieredMergePolicy, there are now three settings
> that must be changed in order to actually be the same as what
> mergeFactor used to do. The following config snippet is the equivalent
> config to a mergeF

Hi Remi,
I read your post and, like you, I have also found that running Solr
4.6.0 in cloud mode results in higher response times, which has something to
do with the merging of documents from the various shards.
Looking at the source code, we couldn't understand why it would take so much
time for m

I am using Solr 4.6.0 in cloud mode. The setup has 4 shards, one on each
machine, with a ZooKeeper quorum running on 3 other machines. The index size
on each shard is about 15GB. I noticed that the number of segments in the
second shard was 42, while in the remaining shards it was between 25 and 30.
I am basical