On 15/01/2020 11:42, Dc Tech wrote:
Thank you Jan and Charlie.

I should say that in terms of posting to the community regarding Elastic vs 
Solr - this is probably the most civil and helpful community that I have been a 
part of - and your answers have only reinforced that notion!!

Thank you for your responses. I am glad to hear that both can do most of it, 
which was my gut feeling as well.

Charlie, to your point - the team probably feels that Elastic is easier to get 
started with, hence the preference, as well as the hosting options (with the 
caveats you noted). I agree with you completely that tech is not the real issue.

Jan, I agree with the points you made on team skills. On our previous 
proprietary engine, that was in fact the biggest issue - the engine was 
powerful enough and had good references. However, we were not able to exploit 
it to good effect.

Hi again,

The dirty secret that few will voice is that...most search engines are basically the same. Once you've worked on a search project you can apply those skills to any future search engine. This is why I'm currently focused on supporting the search team, not the search tech - how do you learn and improve those relevance tuning skills, considering it's really, really hard to recruit people with existing high-level search skills (and if you can find them you probably can't afford them).

Cheers

Charlie


Thank you again.

On Jan 15, 2020, at 5:10 AM, Jan Høydahl <jan....@cominvent.com> wrote:

Hi,

Choosing the Solr community mailing list to ask whether to choose ES - you 
already know what to expect, no?
More often than not the choice comes down to policy, standardization, what 
skills you have in-house, etc., rather than ticking off feature checkboxes.
Sometimes company values may also drive a choice, e.g. Solr is 100% Apache and 
not open core, which may matter if you plan to get involved in the community 
and contribute features or patches.

However, if I were in your shoes as the architect evaluating the tech stack, and 
there was no clear choice based on the above, I'd do what projects normally do 
and ask what you really need from the engine. Maybe you have some features in 
your requirements list that make one a much better choice than the other. Or 
maybe after that exercise you are still wondering what to choose, in which case 
you just follow your gut feeling and make a choice :)

Jan

On 15 Jan 2020, at 10:07, Charlie Hull <char...@flax.co.uk> wrote:

On 15/01/2020 04:02, Dc Tech wrote:
I am a SOLR fan and implemented it in our company over 10 years ago.
I moved away from that role, and in the meantime the new search team
implemented a proprietary (and expensive) NoSQL-style search engine. That
project did not go well, and now I am back on the project and reviewing the
technology stack.

Some of the team think that ElasticSearch could be a good option,
especially since we can easily get hosted versions with AWS where we have
all the contractual stuff sorted out.
You can, but you should be aware that:
1. Amazon's hosted Elasticsearch isn't great, often lags behind the current 
version, doesn't allow plugins etc.
2. Amazon and Elastic are currently engaged in legal battles over who is the 
most open sourcey, who allegedly copied code that was 'open' but commercially 
licensed, who would like to capture the hosted search market...not sure how 
this will pan out (Google for details)
3. You can also buy fully hosted Solr from several places.
While SOLR definitely seems more advanced (LTR, streaming expressions,
graph, and all the knobs and dials for relevancy tuning), Elastic may be
sufficient for our needs. It does not seem to have LTR out of the box but
the relevancy tuning knobs and dials seem to be similar to what SOLR has.
Yes, they're basically the same under the hood (unsurprising as they're both 
based on Lucene). If you need LTR there's an ES plugin for that (disclaimer, my 
new employer built and maintains it: 
https://github.com/o19s/elasticsearch-learning-to-rank). I've lost track of the 
number of times I've been asked 'Elasticsearch or Solr, which should I choose?' 
and my current thoughts are:

1. Don't switch from one to the other for the sake of it.  Switching search 
engines rarely addresses underlying issues (content quality, team skills, 
relevance tuning methodology)
2. Elasticsearch is easier to get started with, but at some point you'll need 
to learn how it all works
3. Solr is harder to get started with, but you'll know more about how it all 
works earlier
4. Both can be used for most search projects, most features are the same, both 
can scale.
5. Lots of Elasticsearch projects (and developers) are focused on logs, which 
is often not really a 'search' project.

The corpus size is not a challenge - we have about one million documents,
of which about half have full text, while the rest are simpler (i.e. company
directory etc.).
The query volumes are also quite low (max 5/second at peak).
We have implemented the content ingestion and processing pipelines already
in Python and Spark, so most of the data will be pushed in using APIs.

I would really appreciate any guidance from the community !!

Sounds like a pretty small setup to be honest, but as ever the devil is in the 
details.

Cheers

Charlie

--
Charlie Hull
Flax - Open Source Enterprise Search (now part of OpenSourceConnections)

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.o19s.com


