I am SOLR fant and had implemented it in our company over 10 years ago.
I moved away from that role and the new search team in the meanwhile
implemented a proprietary (and expensive) nosql style search engine. That
the project did not go well, and now I am back to project and reviewing the
technology stack.

Some of the team think that ElasticSearch could be a good option,
especially since we can easily get hosted versions with AWS where we have
all the contractual stuff sorted out.

Whle SOLR definitely seems more advanced  (LTR, streaming expressions,
graph, and all the knobs and dials for relevancy tuning), Elastic may be
sufficient for our needs. It does not seem to have LTR out of the box but
the relevancy tuning knobs and dials seem to be similar to what SOLR has.

The corpus size is not a challenge  - we have about one million document,
of which about 1/2 have full text, while the test are simpler (i.e. company
directory etc.).
The query volumes are also quite low (max 5/second at peak).
We have implemented the content ingestion and processing pipelines already
in python and SPARK, so most of the data will be pushed in using APIs.

I would really appreciate any guidance from the community !!

Reply via email to