Thanks a lot Steve!
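
For anyone finding this thread in the archive: below is a minimal sketch of the update chain from the April messages quoted further down, with the Log and Run update processors appended and "dest" pointed at a multi-valued string field, as Steve suggested there. The "people_str" destination assumes the default configset's "*_str" dynamic field, and "text_opennlp" is assumed to be a field type already defined in the schema; neither is confirmed for the "numberplate" collection.

```xml
<updateRequestProcessorChain name="multiple-extract">
  <processor class="solr.OpenNLPExtractNamedEntitiesUpdateProcessorFactory">
    <!-- Model path as given in the original question -->
    <str name="modelFile">opennlp/en-ner-person.bin</str>
    <!-- Assumption: a field type with this name exists in the schema -->
    <str name="analyzerFieldType">text_opennlp</str>
    <str name="source">description_en</str>
    <!-- Assumption: "*_str" is a multi-valued string dynamic field, as in the default configset -->
    <str name="dest">people_str</str>
  </processor>
  <!-- Optional: logs each update request -->
  <processor class="solr.LogUpdateProcessorFactory" />
  <!-- Must come last: without it, updates processed by this chain do not affect the index -->
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

Documents posted through this chain still need an "id" field. Per Steve's note, the pre-trained person model extracts "Steve Jobs" from "This is Steve Jobs in white" but nothing from "This is Steve Jobs 2".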

On Wed, Jul 11, 2018 at 10:24 AM Steve Rowe <sar...@gmail.com> wrote:

> Hi Jerome,
>
> I was able to set up a configset to perform OpenNLP NER, loading the model
> files from local storage.
>
> There is a trick though[1]: the model files must be located *in a jar* or
> *in a subdirectory* under ${solr.solr.home}/lib/ or under a directory
> specified via a solrconfig.xml <lib> directive.
>
> I tested with the bin/solr cloud example, and put model files under the
> two solr home directories, at example/cloud/node1/solr/lib/opennlp/ and
> example/cloud/node2/solr/lib/opennlp/.  The “opennlp/” subdirectory is
> required, though its name can be anything you choose.
>
> [1] As you noted, ZkSolrResourceLoader delegates to its parent classloader
> when it can’t find resources in a configset, and the parent classloader is
> set up to load from subdirectories and jar files under
> ${solr.solr.home}/lib/ or under a directory specified via a solrconfig.xml
> <lib> directive.  These directories themselves are not included in the set
> of directories from which resources are loaded; only their children are.
>
> --
> Steve
> www.lucidworks.com
>
> > On Jul 9, 2018, at 10:10 PM, Jerome Yang <jey...@pivotal.io> wrote:
> >
> > Hi Steve,
> >
> > Putting the models under "${solr.solr.home}/lib/" is not working.
> > I checked ZkSolrResourceLoader: it seems to first try to find the models
> > in the configset.
> > If it does not find them there, it uses the class loader to load them
> > from resources.
> >
> > Regards,
> > Jerome
> >
> > On Tue, Jul 10, 2018 at 9:58 AM Jerome Yang <jey...@pivotal.io> wrote:
> >
> >> Thanks Steve!
> >>
> >>
> >> On Tue, Jul 10, 2018 at 5:20 AM Steve Rowe <sar...@gmail.com> wrote:
> >>
> >>> Hi Jerome,
> >>>
> >>> See the ref guide[1] for a writeup of how to enable uploading files
> >>> larger than 1MB into ZooKeeper.
> >>>
> >>> Local storage should also work - have you tried placing OpenNLP model
> >>> files in ${solr.solr.home}/lib/ ? - make sure you do the same on each
> >>> node.
> >>>
> >>> [1]
> >>> https://lucene.apache.org/solr/guide/7_4/setting-up-an-external-zookeeper-ensemble.html#increasing-the-file-size-limit
> >>>
> >>> --
> >>> Steve
> >>> www.lucidworks.com
> >>>
> >>>> On Jul 9, 2018, at 12:50 AM, Jerome Yang <jey...@pivotal.io> wrote:
> >>>>
> >>>> Hi guys,
> >>>>
> >>>> In SolrCloud mode, where should the OpenNLP models be placed?
> >>>> Should they be uploaded to ZooKeeper?
> >>>> In my tests on Solr 7.3.1, an absolute path on the local host does not
> >>>> seem to work, and a model cannot be uploaded to ZooKeeper if its size
> >>>> exceeds 1 MB.
> >>>>
> >>>> Regards,
> >>>> Jerome
> >>>>
> >>>> On Wed, Apr 18, 2018 at 9:54 AM Steve Rowe <sar...@gmail.com> wrote:
> >>>>
> >>>>> Hi Alexey,
> >>>>>
> >>>>> First, thanks for moving the conversation to the mailing list.
> >>>>> Discussion of usage problems should take place here rather than in JIRA.
> >>>>>
> >>>>> I locally set up Solr 7.3 similarly to you and was able to get things
> >>>>> to work.
> >>>>>
> >>>>> Problems with your setup:
> >>>>>
> >>>>> 1. Your update chain is missing the Log and Run update processors at
> >>>>> the end (I see these are missing from the example in the javadocs for
> >>>>> the OpenNLP NER update processor; I’ll fix that):
> >>>>>
> >>>>>    <processor class="solr.LogUpdateProcessorFactory" />
> >>>>>    <processor class="solr.RunUpdateProcessorFactory" />
> >>>>>
> >>>>>  The Log update processor isn’t strictly necessary, but, from
> >>>>>  <https://lucene.apache.org/solr/guide/7_3/update-request-processors.html#custom-update-request-processor-chain>:
> >>>>>
> >>>>>      Do not forget to add RunUpdateProcessorFactory at the end of any
> >>>>>      chains you define in solrconfig.xml. Otherwise update requests
> >>>>>      processed by that chain will not actually affect the indexed data.
> >>>>>
> >>>>> 2. Your example document is missing an “id” field.
> >>>>>
> >>>>> 3. For whatever reason, the pre-trained model "en-ner-person.bin"
> >>>>> doesn’t extract anything from the text “This is Steve Jobs 2”.  It
> >>>>> will, though, extract “Steve Jobs” from e.g. the text “This is Steve
> >>>>> Jobs in white”.
> >>>>>
> >>>>> 4. (Not a problem necessarily) You may want to use a multi-valued
> >>>>> “string” field for the “dest” field in your update chain, e.g.
> >>>>> “people_str” (“*_str” in the default configset is so configured).
> >>>>>
> >>>>> --
> >>>>> Steve
> >>>>> www.lucidworks.com
> >>>>>
> >>>>>> On Apr 17, 2018, at 8:23 AM, Alexey Ponomarenko <alex1989s...@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi, once more: I am trying to implement named entity extraction using
> >>>>>> this manual:
> >>>>>>
> >>>>>> https://lucene.apache.org/solr/7_3_0//solr-analysis-extras/org/apache/solr/update/processor/OpenNLPExtractNamedEntitiesUpdateProcessorFactory.html
> >>>>>>
> >>>>>> I modified solrconfig.xml like this:
> >>>>>>
> >>>>>> <updateRequestProcessorChain name="multiple-extract">
> >>>>>>   <processor class="solr.OpenNLPExtractNamedEntitiesUpdateProcessorFactory">
> >>>>>>     <str name="modelFile">opennlp/en-ner-person.bin</str>
> >>>>>>     <str name="analyzerFieldType">text_opennlp</str>
> >>>>>>     <str name="source">description_en</str>
> >>>>>>     <str name="dest">content</str>
> >>>>>>   </processor>
> >>>>>> </updateRequestProcessorChain>
> >>>>>>
> >>>>>> But when I tried to add data using:
> >>>>>>
> >>>>>> *request:*
> >>>>>>
> >>>>>> POST
> >>>>>> http://localhost:8983/solr/numberplate/update?version=2.2&wt=xml&update.chain=multiple-extract
> >>>>>>
> >>>>>> <add><doc><field name="description_en">This is Steve Jobs 2
> >>>>>> </field><field name="content_pos">This is text 2</field><field
> >>>>>> name="content">This is text for content 2</field></doc></add>
> >>>>>>
> >>>>>> *response*
> >>>>>>
> >>>>>> <?xml version="1.0" encoding="UTF-8"?>
> >>>>>> <response>
> >>>>>>  <lst name="responseHeader">
> >>>>>>      <int name="status">0</int>
> >>>>>>      <int name="QTime">3</int>
> >>>>>>  </lst>
> >>>>>> </response>
> >>>>>>
> >>>>>> But I don't see any data inserted into the *content* field or into
> >>>>>> any other field.
> >>>>>>
> >>>>>> *If you need some additional data I can provide it.*
> >>>>>>
> >>>>>> Can you help me? What have I done wrong?
> >>>>>
> >>>>>
> >>>>
> >>>> --
> >>>> Pivotal Greenplum | Pivotal Software, Inc. <https://pivotal.io/>
> >>>
> >>>
> >>
> >> --
> >> Pivotal Greenplum | Pivotal Software, Inc. <https://pivotal.io/>
> >>
> >>
> >
> > --
> > Pivotal Greenplum | Pivotal Software, Inc. <https://pivotal.io/>
>
>

-- 
 Pivotal Greenplum | Pivotal Software, Inc. <https://pivotal.io/>
