Hi Matt,

Are you looking for a good, general-purpose schema and config for Solr? Well, there's the problem: you need to define what you mean by general purpose. Every search application has its own requirements, and they'll differ slightly from those of every other application, although there will be some commonalities too. I guess by "as a human would expect one to behave" you mean "a bit like how Google works", but unfortunately Google is a poor example: you won't have Google's money, staff or platform in your company, nor are you likely to be building a massive-scale web search engine, so at best you can take inspiration from it, not replicate it.

In practice, what a lot of people do is start with an example setup (perhaps one of the examples supplied with Solr, e.g. 'techproducts') and adapt it, or they start with the Solr configset provided by another framework, e.g. Drupal (yay! Pink Ponies!). Unfortunately the standard example configsets are littered with comments that say things like 'Here is how you *could* do XYZ but please don't actually attempt it this way', along with other config sections that may just get you into further trouble if you un-comment them. These examples have grown over time rather than been designed, and to my mind there's a good argument for starting with an absolutely minimal Solr configset and only adding things as you need them and understand them (see https://lucene.472066.n3.nabble.com/minimal-solrconfig-example-td4322977.html for some background and a great presentation from Alex Rafalovitch on the examples).

You're also going to need some background on *why* each of these features should be used, and for that I'd recommend my colleague Doug's book Relevant Search (https://www.manning.com/books/relevant-search), or maybe our training (quick plug: we're running some online training in a couple of weeks, see https://opensourceconnections.com/blog/2020/05/05/tlre-solr-remote/ ).

Hope this helps,

Cheers

Charlie

On 20/04/2020 23:43, matthew sporleder wrote:
Is there a comprehensive/big set of tips for making solr into a
search-engine as a human would expect one to behave?  I poked around
in the nutch github for a minute and found this:
https://github.com/apache/nutch/blob/9e5ae7366f7dd51eaa76e77bee6eb69f812bd29b/src/plugin/indexer-solr/schema.xml
  but I was wondering if I was missing a very obvious document
somewhere.

I guess I'm looking for things like:
use suggester here, use spelling there, use DocValues around here, DIY
pagerank, etc

Thanks,
Matt


--
Charlie Hull
OpenSource Connections, previously Flax

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.o19s.com
