Hi Christian,

This has been fairly easy to set up since 2011, and you can customise it, or make it as elaborate, as much as you want.
Take a look at:
https://cwiki.apache.org/confluence/display/solr/UIMA+Integration
https://wiki.apache.org/solr/SolrUIMA

This is a good, painless starting point. From there you can make the setup as elaborate as you like by developing your own update processor. That is a simple customisation, and it lets you plug in the best location NER available (among the open source ones, I would suggest exploring http://nlp.stanford.edu/software/corenlp.shtml). Apache OpenNLP could be a good choice as well.

Let us know if this is what you wanted.

Cheers

On 3 November 2015 at 12:04, <liviuchrist...@yahoo.com.invalid> wrote:

> Hi everyone,
>
> I need to install a plugin to extract Location (Country/State/City) from
> free text documents - any professional advice?!? Does OpenNLP really do
> the job? Is it English only? US only? Or does it cover worldwide place
> names?
> Could someone help me with this job - installation, configuration,
> model-training etc?
>
> Please help,
> Kind regards,
> Christian
> Christian Fotache Tel: 0728.297.207 Fax: 0351.411.570
>
> From: Upayavira <u...@odoko.co.uk>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, November 3, 2015 12:13 PM
> Subject: Re: language plugin
>
> Looking at the code, this is not going to work without modifications to
> Solr (or at least a custom component).
>
> The atomic update code is closely embedded into the Solr
> DistributedUpdateProcessor, which expands the atomic update into a full
> document and then posts it to the shards.
>
> You need to do the update expansion before your lang detect processor,
> but there is no gap between them.
>
> From my reading of the code, you could create an AtomicUpdateProcessor
> that simply expands updates, and insert that before the
> LangDetectUpdateProcessor.
>
> Upayavira
>
> On Tue, Nov 3, 2015, at 06:38 AM, Chaushu, Shani wrote:
> > Hi
> > When I make an atomic update (set field), on the content field and also
> > on another field, the language field became generic.
> > Meaning, it doesn’t work on the set field, only on the first insert.
> > Even if the language was detected the first time, it just became
> > generic after the update.
> > Any idea?
> >
> > The chain is
> >
> > <updateRequestProcessorChain name="aa_chain">
> >   <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
> >     <str name="langid.fl">title,content,text</str>
> >     <str name="langid.langField">language_t</str>
> >     <str name="langid.langsField">language_all_t</str>
> >     <str name="langid.fallback">generic</str>
> >     <str name="langid.overwrite">false</str>
> >     <str name="langid.threshold">0.8</str>
> >   </processor>
> >   <processor class="solr.LogUpdateProcessorFactory" />
> >   <processor class="solr.RunUpdateProcessorFactory" />
> > </updateRequestProcessorChain>
> >
> > Thanks,
> > Shani
> >
> > -----Original Message-----
> > From: Jack Krupansky [mailto:jack.krupan...@gmail.com]
> > Sent: Thursday, October 29, 2015 17:04
> > To: solr-user@lucene.apache.org
> > Subject: Re: language plugin
> >
> > Are you trying to do an atomic update without the content field? If so,
> > it sounds like Solr needs an enhancement (bug fix?) so that language
> > detection would be skipped if the input field is not present. Or maybe
> > that could be an option.
> >
> > -- Jack Krupansky
> >
> > On Thu, Oct 29, 2015 at 3:25 AM, Chaushu, Shani <shani.chau...@intel.com>
> > wrote:
> >
> > > Hi,
> > > I'm using the Solr language detection plugin on the field named "content"
> > > (Solr 4.10, plugin LangDetectLanguageIdentifierUpdateProcessorFactory).
> > > When I index for the first time it works fine, but if I want to
> > > set one field again (regardless of whether it's the content or not) it goes to
> > > its default language. If I'm setting another field I would like the
> > > language to stay the way it was before, and I don't want to insert all
> > > the content again.
> > > Is there an option to stop the plugin from recalculating
> > > the language? (Setting langid.overwrite to false didn't
> > > work.)
> > >
> > > Thanks,
> > > Shani
> > >
> > > ---------------------------------------------------------------------
> > > Intel Electronics Ltd.
> > >
> > > This e-mail and any attachments may contain confidential material for
> > > the sole use of the intended recipient(s). Any review or distribution
> > > by others is strictly prohibited. If you are not the intended
> > > recipient, please contact the sender and delete all copies.

--
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England