Apparently this mail thread is duplicated; anyway, I will copy and paste my previous comment here as well:
Hi Christian,

This has been quite easy to set up since 2011, and you can keep it as
simple or customise it as much as you want. Take a look at:

https://cwiki.apache.org/confluence/display/solr/UIMA+Integration
https://wiki.apache.org/solr/SolrUIMA

This is a good, painless starting point. From there you can extend the
scenario as much as you want by developing your own updateProcessor.
This is a simple customisation, and you can decide to use the best
location NER available (among the open source ones, I would suggest you
explore http://nlp.stanford.edu/software/corenlp.shtml). Apache OpenNLP
could be a good choice as well.

Let us know if this is what you wanted.

Cheers

On 4 November 2015 at 20:10, Doug Turnbull <dturnb...@opensourceconnections.com> wrote:

> David Smiley had a place name and general tagging engine that for the life
> of me I can't find.
>
> It didn't do NER for you (I'm not sure you want to do this in the search
> engine) but it helps you tag entities in a search engine based on a
> predefined list. At least that's what I remember.
>
> On Wed, Nov 4, 2015 at 3:05 PM, <liviuchrist...@yahoo.com.invalid> wrote:
>
> > Hi everyone,
> >
> > I need to install a plugin to extract Location (Country/State/City) from
> > free text documents - any professional advice?!? Does OpenNLP really do
> > the job? Is it English only? US only? Or does it cover worldwide place
> > names?
> > Could someone help me with this job - installation, configuration,
> > model-training etc?
> >
> > Please help,
> > Kind regards,
> > Christian
> > Christian Fotache Tel: 0728.297.207 Fax: 0351.411.570
> >
> >
> > From: Upayavira <u...@odoko.co.uk>
> > To: solr-user@lucene.apache.org
> > Sent: Tuesday, November 3, 2015 12:13 PM
> > Subject: Re: language plugin
> >
> > Looking at the code, this is not going to work without modifications to
> > Solr (or at least a custom component).
> >
> > The atomic update code is closely embedded into the Solr
> > DistributedUpdateProcessor, which expands the atomic update into a full
> > document and then posts it to the shards.
> >
> > You need to do the update expansion before your lang detect processor,
> > but there is no gap between them.
> >
> > From my reading of the code, you could create an AtomicUpdateProcessor
> > that simply expands updates, and insert that before the
> > LangDetectUpdateProcessor.
> >
> > Upayavira
> >
> > On Tue, Nov 3, 2015, at 06:38 AM, Chaushu, Shani wrote:
> > > Hi
> > > When I make an atomic update - set field - on the content field and
> > > also on another field, the language field becomes generic. Meaning, it
> > > doesn’t work on a set-field update, only on the first insert. Even if
> > > the language was detected the first time, it just becomes generic
> > > after the update.
> > > Any idea?
> > >
> > > The chain is
> > >
> > > <updateRequestProcessorChain name="aa_chain">
> > >   <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
> > >     <str name="langid.fl">title,content,text</str>
> > >     <str name="langid.langField">language_t</str>
> > >     <str name="langid.langsField">language_all_t</str>
> > >     <str name="langid.fallback">generic</str>
> > >     <str name="langid.overwrite">false</str>
> > >     <str name="langid.threshold">0.8</str>
> > >   </processor>
> > >   <processor class="solr.LogUpdateProcessorFactory" />
> > >   <processor class="solr.RunUpdateProcessorFactory" />
> > > </updateRequestProcessorChain>
> > >
> > >
> > > Thanks,
> > > Shani
> > >
> > >
> > > -----Original Message-----
> > > From: Jack Krupansky [mailto:jack.krupan...@gmail.com]
> > > Sent: Thursday, October 29, 2015 17:04
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: language plugin
> > >
> > > Are you trying to do an atomic update without the content field? If so,
> > > it sounds like Solr needs an enhancement (bug fix?) so that language
> > > detection would be skipped if the input field is not present. Or maybe
> > > that could be an option.
> > >
> > >
> > > -- Jack Krupansky
> > >
> > > On Thu, Oct 29, 2015 at 3:25 AM, Chaushu, Shani <shani.chau...@intel.com> wrote:
> > >
> > > > Hi,
> > > > I'm using the Solr language detection plugin on the field named
> > > > "content" (Solr 4.10, plugin
> > > > LangDetectLanguageIdentifierUpdateProcessorFactory).
> > > > When I'm indexing for the first time it works fine, but if I want to
> > > > set one field again (regardless of whether it's the content or not)
> > > > it goes to its default language. If I'm setting another field I would
> > > > like the language to stay the way it was before, and I don't want to
> > > > insert all the content again. Is there an option to set the plugin so
> > > > that it won't calculate the language again? (Setting langid.overwrite
> > > > to false didn't work.)
> > > >
> > > > Thanks,
> > > > Shani
> > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > Intel Electronics Ltd.
> > > >
> > > > This e-mail and any attachments may contain confidential material for
> > > > the sole use of the intended recipient(s). Any review or distribution
> > > > by others is strictly prohibited. If you are not the intended
> > > > recipient, please contact the sender and delete all copies.
> > > >
>
> --
> *Doug Turnbull* | Search Relevance Consultant | OpenSource Connections
> <http://opensourceconnections.com>, LLC | 240.476.9983
> Author: Relevant Search <http://manning.com/turnbull>
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless
> of whether attachments are marked as such.

--
--------------------------
Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England