The ultimate answer is that you need to test your configuration with your
expected workflow.
However, what should mitigate the remote IO cost is that Solr's HDFS
support includes a block cache which, when tuned correctly, keeps in RAM
the blocks your Solr process needs the most.
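
As a rough aid for the tuning part, here is a small sketch of the sizing
arithmetic. It assumes the block cache is sized in slabs via
solr.hdfs.blockcache.slab.count, with 8 KB blocks and
solr.hdfs.blockcache.blocksperbank left at 16384 (so about 128 MB per
slab), as I understand the Solr reference guide; please check the
defaults for your version. The function names are made up for
illustration.

```python
# Sketch of Solr HDFS block cache sizing arithmetic.
# Assumptions (verify against your Solr version's reference guide):
#   - each cached block is 8 KB
#   - solr.hdfs.blockcache.blocksperbank defaults to 16384
#   - total cache size = slab count * blocks per bank * block size

BLOCK_SIZE_BYTES = 8 * 1024          # assumed 8 KB block
DEFAULT_BLOCKS_PER_BANK = 16384      # assumed default blocks per slab


def block_cache_bytes(slab_count, blocks_per_bank=DEFAULT_BLOCKS_PER_BANK):
    """Memory used by the block cache for a given slab count."""
    return slab_count * blocks_per_bank * BLOCK_SIZE_BYTES


def slabs_for_budget(budget_bytes, blocks_per_bank=DEFAULT_BLOCKS_PER_BANK):
    """Largest whole number of slabs that fits in a memory budget."""
    slab_bytes = blocks_per_bank * BLOCK_SIZE_BYTES
    return budget_bytes // slab_bytes


if __name__ == "__main__":
    budget = 4 * 1024 ** 3  # hypothetical 4 GiB set aside for the cache
    slabs = slabs_for_budget(budget)
    print("slab count:", slabs)
    print("cache bytes:", block_cache_bytes(slabs))
```

If the cache uses direct (off-heap) memory, the JVM's
-XX:MaxDirectMemorySize has to be at least as large as the result, and
the only real check is still to watch the cache hit rate under your own
queries.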
Hi All - does it make sense to run a Solr shard on a node within a
Hadoop cluster that is not a data node? In that case all the data that
node processes would need to come over the network, but you get the
benefit of more CPU for things like faceting.
Thank you!
-Joe