Hi Erick,
In our architecture we use Apache ManifoldCF: jobs are scheduled
from Manifold-web, and the Manifold-agent takes the PDF files
from the filesystem and sends them to the Solr instances. Is it possible
to redirect the ManifoldCF scheduling to the SolrJ instance for specific schedules?
Tha
I'm assuming you're using the ExtractingRequestHandler. Offloading
the entire work onto your Solr box that is also serving queries
and indexing is not going to scale well.
Consider using Tika/SolrJ (Tika is what the ERH uses anyway) to
offload the PDF parsing amongst as many clients as you can aff