> My first decision was to divide SOLR into two cores, since I am already
> using SOLR as my search server. One core would be for the main search of
> the site and one for the geocoding.

Correct. And you can even use that location index/collection to extract
locations from unstructured documents - i.e. if you don't have a separate
field with geographical names in your corpus (or the location data is just
not good enough compared to what can be mined from the documents).
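For illustration, indexing into the separate geocoding core could look
something like this (a minimal sketch in Python; the "locations" core name,
the field names and the coordinates are my assumptions, not your schema,
and on older Solr versions the JSON update endpoint is /update/json rather
than /update):

    import json
    import requests

    SOLR = "http://localhost:8983/solr"  # assumed local Solr instance

    # One document per suggestable place name; each core has its own
    # update/select endpoints, so nothing here touches the main core.
    docs = [
        {"id": "1", "name": "London, England",
         "coords": "51.5074,-0.1278"},   # approximate lat,lon
        {"id": "2", "name": "England"},
        {"id": "3", "name": "Swindon, Wiltshire, England",
         "coords": "51.5558,-1.7797"},
    ]

    requests.post(
        SOLR + "/locations/update?commit=true",
        data=json.dumps(docs),
        headers={"Content-Type": "application/json"},
    ).raise_for_status()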
> My second decision is to store the name data in a normalised state, some
> examples are shown below:
>
> London, England
> England
> Swindon, Wiltshire, England

Yes, and you can add postcodes/outcodes there as well. I would also add an
additional "type" field: region/county/town/postcode/outcode.

> The third decision was to return "autosuggest" results, for example when
> the user types "Lond" I would like to suggest "London, England". For this
> to work I think it makes sense to return up to 5 results via JSON based
> on relevancy and have these displayed under the search box.

Yeah, you might want to boost cities more than towns (I'm sure there are
plenty of ambiguous terms), use some kind of geoip service, and add other
scoring factors.

> My fourth decision is that when the user actually hits the "search"
> button on the location field, SOLR is again queried and returns the most
> relevant result, including the co-ordinates which are stored.

You can also have special logic to decide whether to use spatial search or
whether a simple textual match would be better. I.e. you have "England" in
your example - it doesn't sound practical to return coordinates and run a
spatial search for that use case, right?

HTH,
Alexey
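P.S. A rough sketch of how the suggest call and the search-button logic
could fit together (the "locations" core, the field names and the boost
value are assumptions, and this also assumes the "name" field is analysed
for prefix matching, e.g. with EdgeNGram, so that "Lond" matches
"London, England"):

    import requests

    SOLR = "http://localhost:8983/solr"  # assumed local Solr instance

    def suggest(prefix, rows=5):
        # Top-N suggestions by relevancy; bq boosts one location type
        # over the others as a crude example of an extra scoring factor.
        params = {
            "q": prefix,
            "defType": "edismax",
            "qf": "name",
            "bq": "type:town^2",
            "rows": rows,
            "wt": "json",
        }
        resp = requests.get(SOLR + "/locations/select", params=params)
        resp.raise_for_status()
        return resp.json()["response"]["docs"]

    def resolve(query):
        # On the actual "search" click: take the best match, then decide
        # between spatial search and a plain textual filter.
        hits = suggest(query, rows=1)
        if not hits:
            return None
        best = hits[0]
        if best.get("type") == "region" or "coords" not in best:
            # e.g. "England": a radius search around a single point makes
            # little sense here, so fall back to a textual match
            return {"mode": "text", "value": best["name"]}
        return {"mode": "spatial", "point": best["coords"]}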