Thanks a lot Steve!

On Wed, Jul 11, 2018 at 10:24 AM Steve Rowe <sar...@gmail.com> wrote:
> Hi Jerome,
>
> I was able to set up a configset to perform OpenNLP NER, loading the model files from local storage.
>
> There is a trick though[1]: the model files must be located *in a jar* or *in a subdirectory* under ${solr.solr.home}/lib/ or under a directory specified via a solrconfig.xml <lib> directive.
>
> I tested with the bin/solr cloud example, and put model files under the two solr home directories, at example/cloud/node1/solr/lib/opennlp/ and example/cloud/node2/solr/lib/opennlp/. The “opennlp/” subdirectory is required, though its name can be anything else you choose.
>
> [1] As you noted, ZkSolrResourceLoader delegates to its parent classloader when it can’t find resources in a configset, and the parent classloader is set up to load from subdirectories and jar files under ${solr.solr.home}/lib/ or under a directory specified via a solrconfig.xml <lib> directive. These directories themselves are not included in the set of directories from which resources are loaded; only their children are.
>
> --
> Steve
> www.lucidworks.com
>
> > On Jul 9, 2018, at 10:10 PM, Jerome Yang <jey...@pivotal.io> wrote:
> >
> > Hi Steve,
> >
> > Putting models under "${solr.solr.home}/lib/" is not working.
> > I checked "ZkSolrResourceLoader"; it seems it first tries to find models in the config set.
> > If it does not find them there, it uses the class loader to load them from resources.
> >
> > Regards,
> > Jerome
> >
> > On Tue, Jul 10, 2018 at 9:58 AM Jerome Yang <jey...@pivotal.io> wrote:
> >
> >> Thanks Steve!
> >>
> >> On Tue, Jul 10, 2018 at 5:20 AM Steve Rowe <sar...@gmail.com> wrote:
> >>
> >>> Hi Jerome,
> >>>
> >>> See the ref guide[1] for a writeup of how to enable uploading files larger than 1MB into ZooKeeper.
> >>>
> >>> Local storage should also work - have you tried placing OpenNLP model files in ${solr.solr.home}/lib/ ? - make sure you do the same on each node.
> >>>
> >>> [1] https://lucene.apache.org/solr/guide/7_4/setting-up-an-external-zookeeper-ensemble.html#increasing-the-file-size-limit
> >>>
> >>> --
> >>> Steve
> >>> www.lucidworks.com
> >>>
> >>>> On Jul 9, 2018, at 12:50 AM, Jerome Yang <jey...@pivotal.io> wrote:
> >>>>
> >>>> Hi guys,
> >>>>
> >>>> In SolrCloud mode, where should the OpenNLP models go?
> >>>> Upload to ZooKeeper?
> >>>> As I tested on Solr 7.3.1, it seems an absolute path on the local host is not working.
> >>>> And I cannot upload into ZooKeeper if the model size exceeds 1M.
> >>>>
> >>>> Regards,
> >>>> Jerome
> >>>>
> >>>> On Wed, Apr 18, 2018 at 9:54 AM Steve Rowe <sar...@gmail.com> wrote:
> >>>>
> >>>>> Hi Alexey,
> >>>>>
> >>>>> First, thanks for moving the conversation to the mailing list. Discussion of usage problems should take place here rather than in JIRA.
> >>>>>
> >>>>> I locally set up Solr 7.3 similarly to you and was able to get things to work.
> >>>>>
> >>>>> Problems with your setup:
> >>>>>
> >>>>> 1. Your update chain is missing the Log and Run update processors at the end (I see these are missing from the example in the javadocs for the OpenNLP NER update processor; I’ll fix that):
> >>>>>
> >>>>>     <processor class="solr.LogUpdateProcessorFactory" />
> >>>>>     <processor class="solr.RunUpdateProcessorFactory" />
> >>>>>
> >>>>> The Log update processor isn’t strictly necessary, but, from <https://lucene.apache.org/solr/guide/7_3/update-request-processors.html#custom-update-request-processor-chain>:
> >>>>>
> >>>>>     Do not forget to add RunUpdateProcessorFactory at the end of any chains you define in solrconfig.xml. Otherwise update requests processed by that chain will not actually affect the indexed data.
> >>>>>
> >>>>> 2. Your example document is missing an “id” field.
> >>>>>
> >>>>> 3. For whatever reason, the pre-trained model "en-ner-person.bin" doesn’t extract anything from the text “This is Steve Jobs 2”. It will, though, extract “Steve Jobs” from e.g. the text “This is Steve Jobs in white”.
> >>>>>
> >>>>> 4. (Not a problem necessarily) You may want to use a multi-valued “string” field for the “dest” field in your update chain, e.g. “people_str” (“*_str” in the default configset is so configured).
> >>>>>
> >>>>> --
> >>>>> Steve
> >>>>> www.lucidworks.com
> >>>>>
> >>>>>> On Apr 17, 2018, at 8:23 AM, Alexey Ponomarenko <alex1989s...@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi, once more I am trying to implement named entity extraction using this manual:
> >>>>>> https://lucene.apache.org/solr/7_3_0//solr-analysis-extras/org/apache/solr/update/processor/OpenNLPExtractNamedEntitiesUpdateProcessorFactory.html
> >>>>>>
> >>>>>> I modified solrconfig.xml like this:
> >>>>>>
> >>>>>> <updateRequestProcessorChain name="multiple-extract">
> >>>>>>   <processor class="solr.OpenNLPExtractNamedEntitiesUpdateProcessorFactory">
> >>>>>>     <str name="modelFile">opennlp/en-ner-person.bin</str>
> >>>>>>     <str name="analyzerFieldType">text_opennlp</str>
> >>>>>>     <str name="source">description_en</str>
> >>>>>>     <str name="dest">content</str>
> >>>>>>   </processor>
> >>>>>> </updateRequestProcessorChain>
> >>>>>>
> >>>>>> But when I tried to add data using:
> >>>>>>
> >>>>>> *request:*
> >>>>>>
> >>>>>> POST http://localhost:8983/solr/numberplate/update?version=2.2&wt=xml&update.chain=multiple-extract
> >>>>>>
> >>>>>> <add><doc>
> >>>>>>   <field name="description_en">This is Steve Jobs 2</field>
> >>>>>>   <field name="content_pos">This is text 2</field>
> >>>>>>   <field name="content">This is text for content 2</field>
> >>>>>> </doc></add>
> >>>>>>
> >>>>>> *response*
> >>>>>>
> >>>>>> <?xml version="1.0" encoding="UTF-8"?>
> >>>>>> <response>
> >>>>>>   <lst name="responseHeader">
> >>>>>>     <int name="status">0</int>
> >>>>>>     <int name="QTime">3</int>
> >>>>>>   </lst>
> >>>>>> </response>
> >>>>>>
> >>>>>> But I don't see any data inserted into the *content* field or into any other field.
> >>>>>>
> >>>>>> *If you need some additional data I can provide it.*
> >>>>>>
> >>>>>> Can you help me? What have I done wrong?
> >>>>
> >>>> --
> >>>> Pivotal Greenplum | Pivotal Software, Inc. <https://pivotal.io/>
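
For reference, here is a rough sketch of what Alexey's update chain might look like with Steve's points 1 and 4 from the April message applied. The chain name, model path, analyzer field type and source field are copied unchanged from Alexey's config; "people_str" is the destination Steve suggested and assumes the default configset's "*_str" dynamic field is available. The modelFile value may need adjusting depending on where the model actually lives (configset vs. local lib directory), which this sketch does not try to settle.

```xml
<updateRequestProcessorChain name="multiple-extract">
  <processor class="solr.OpenNLPExtractNamedEntitiesUpdateProcessorFactory">
    <!-- values below are taken from the original config in this thread -->
    <str name="modelFile">opennlp/en-ner-person.bin</str>
    <str name="analyzerFieldType">text_opennlp</str>
    <str name="source">description_en</str>
    <!-- point 4: a multi-valued string field instead of "content" -->
    <str name="dest">people_str</str>
  </processor>
  <!-- point 1: not strictly necessary, logs update requests -->
  <processor class="solr.LogUpdateProcessorFactory" />
  <!-- point 1: needed at the end, otherwise updates do not reach the index -->
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

A test document for that chain would also need an "id" field (point 2) and text the model recognizes (point 3). The id value here is made up for illustration:

```xml
<add><doc>
  <!-- hypothetical id; any unique value should do -->
  <field name="id">doc1</field>
  <field name="description_en">This is Steve Jobs in white</field>
</doc></add>
```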