Re: solr as a general search engine

2020-04-21 Thread Jan Høydahl
To follow up on Charlie’s points: it looks like your primary source is a web or site crawl with Nutch. Once you are in the territory of unstructured text mixed with PDF/Word docs, spread across multiple subdomains, and perhaps lots of old «garbage» content, then you are looking at a very different s

Re: solr as a general search engine

2020-04-21 Thread Charlie Hull
Hi Matt, On 21/04/2020 13:41, matthew sporleder wrote: Sorry for the vague question and I appreciate the book recommendations -- I actually think I am mostly confused about suggest vs spellcheck vs morelikethis as they relate to what I referred to as "expected" behavior (like from a typed-in sea

Re: solr as a general search engine

2020-04-21 Thread matthew sporleder
Sorry for the vague question and I appreciate the book recommendations -- I actually think I am mostly confused about suggest vs spellcheck vs morelikethis as they relate to what I referred to as "expected" behavior (like from a typed-in search bar). For reference we have been using solr as search

Re: solr as a general search engine

2020-04-21 Thread Charlie Hull
Hi Matt, Are you looking for a good, general purpose schema and config for Solr? Well, there's the problem: you need to define what you mean by general purpose. Every search application will have its own requirements and they'll be slightly different to every other application. Yes, there wil

solr as a general search engine

2020-04-20 Thread matthew sporleder
Is there a comprehensive/big set of tips for making Solr into a search engine that behaves as a human would expect? I poked around in the Nutch GitHub for a minute and found this: https://github.com/apache/nutch/blob/9e5ae7366f7dd51eaa76e77bee6eb69f812bd29b/src/plugin/indexer-solr/schema.xml but