Actually, we dropped the idea of integrating NLP directly with Solr and went
with two different approaches instead:

* We're using NLP separately, outside Solr.
* We're using UIMA for Solr. It's more advanced.
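
To illustrate the first point, here is a minimal sketch of enriching a
document outside Solr. It is not our actual code: EntityFinder,
dictionaryFinder, and the field names "person" and "place" are hypothetical
stand-ins. A real setup would plug an OpenNLP name finder in behind that
interface and send the resulting fields to Solr with SolrJ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class EnrichmentSketch {

    /** Hypothetical stand-in for an NER component (e.g. a wrapper around an OpenNLP model). */
    interface EntityFinder {
        List<String> find(String[] tokens);
    }

    /** Toy finder for illustration: reports tokens that appear in a fixed set of known names. */
    static EntityFinder dictionaryFinder(Set<String> knownNames) {
        return tokens -> {
            List<String> hits = new ArrayList<>();
            for (String token : tokens) {
                if (knownNames.contains(token)) {
                    hits.add(token);
                }
            }
            return hits;
        };
    }

    /** Builds the field map for one document: the original content plus one list field per finder. */
    static Map<String, Object> enrich(String content, Map<String, EntityFinder> findersByField) {
        // Naive split on whitespace and punctuation; a real pipeline would use a proper tokenizer.
        String[] tokens = content.split("[\\s\\p{Punct}]+");
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("content", content);
        for (Map.Entry<String, EntityFinder> entry : findersByField.entrySet()) {
            doc.put(entry.getKey(), entry.getValue().find(tokens));
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, EntityFinder> finders = new LinkedHashMap<>();
        finders.put("person", dictionaryFinder(Set.of("Alice", "Bob")));
        finders.put("place", dictionaryFinder(Set.of("Paris")));
        // The resulting map is what would be copied into a Solr input document before indexing.
        System.out.println(enrich("Alice met Bob in Paris.", finders));
    }
}
```

Keeping the enrichment in plain Java like this means no Solr plugin has to be
compiled against a particular Solr version; only the indexing call depends on
Solr.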

If you have a specific question, feel free to ask. I'll answer if I know.

-Vivek

On Wed, Sep 10, 2014 at 3:46 PM, Aman Tandon <amantandon...@gmail.com>
wrote:

> Hi,
>
> What is the progress on integrating NLP with Solr? If you have got this
> integration working, please share the technique with us.
>
> With Regards
> Aman Tandon
>
> On Tue, Jun 10, 2014 at 11:04 AM, Vivekanand Ittigi <vi...@biginfolabs.com>
> wrote:
>
> > Hi Aman,
> >
> > Yeah, we are thinking the same. Using UIMA is better. Thanks to
> > everyone. You guys really showed us the way (UIMA).
> >
> > We'll work on it.
> >
> > Thanks,
> > Vivek
> >
> >
> > On Fri, Jun 6, 2014 at 5:54 PM, Aman Tandon <amantandon...@gmail.com>
> > wrote:
> >
> > > Hi Vivek,
> > >
> > > As everybody on the mailing list suggested UIMA, you should go for it.
> > > OpenNLP issues are not being tracked properly, which could leave your
> > > development stuck if a problem comes up, so it's better to start
> > > investigating UIMA.
> > >
> > >
> > > With Regards
> > > Aman Tandon
> > >
> > >
> > > On Fri, Jun 6, 2014 at 11:00 AM, Vivekanand Ittigi <vi...@biginfolabs.com>
> > > wrote:
> > >
> > > > Can anyone please reply?
> > > >
> > > > Thanks,
> > > > Vivek
> > > >
> > > > ---------- Forwarded message ----------
> > > > From: Vivekanand Ittigi <vi...@biginfolabs.com>
> > > > Date: Wed, Jun 4, 2014 at 4:38 PM
> > > > Subject: Re: Integrate solr with openNLP
> > > > To: Tommaso Teofili <tommaso.teof...@gmail.com>
> > > > Cc: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>,
> > > >     Ahmet Arslan <iori...@yahoo.com>
> > > >
> > > >
> > > > Hi Tommaso,
> > > >
> > > > Yes, you are right. Version 4.4 works; I'm able to compile now. I'm
> > > > trying to apply named entity recognition (person names) to tokens, but I'm
> > > > not seeing any change. My schema.xml looks like this:
> > > >
> > > > <field name="text" type="text_opennlp_pos_ner" indexed="true"
> > > >   stored="true" multiValued="true"/>
> > > >
> > > > <fieldType name="text_opennlp_pos_ner" class="solr.TextField"
> > > > positionIncrementGap="100">
> > > >       <analyzer>
> > > >         <tokenizer class="solr.OpenNLPTokenizerFactory"
> > > >           tokenizerModel="opennlp/en-token.bin"
> > > >         />
> > > >         <filter class="solr.OpenNLPFilterFactory"
> > > >           nerTaggerModels="opennlp/en-ner-person.bin"
> > > >         />
> > > >         <filter class="solr.LowerCaseFilterFactory"/>
> > > >       </analyzer>
> > > >
> > > >     </fieldType>
> > > >
> > > > Could you please guide me?
> > > >
> > > > Thanks,
> > > > Vivek
> > > >
> > > >
> > > > On Wed, Jun 4, 2014 at 1:27 PM, Tommaso Teofili <tommaso.teof...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Ahmet was suggesting eventually using the UIMA integration because OpenNLP
> > > > > already has an integration with Apache UIMA, so you would just have to
> > > > > use that [1].
> > > > > That's one of the main reasons the UIMA integration was done: it's a
> > > > > framework that you can easily hook into in order to plug in your NLP
> > > > > algorithm.
> > > > >
> > > > > If you want to use just OpenNLP, then it's up to you whether to write your
> > > > > own UpdateRequestProcessor plugin [2] to add metadata extracted by OpenNLP
> > > > > to your documents, or to write a dedicated analyzer / tokenizer /
> > > > > token filter.
> > > > >
> > > > > For the OpenNLP integration (LUCENE-2899), the patch is not up to date
> > > > > with the latest APIs in trunk. However, you should be able to apply it
> > > > > (if I recall correctly) to version 4.4 or so, and adapting it to the
> > > > > latest API shouldn't be too hard.
> > > > >
> > > > > Regards,
> > > > > Tommaso
> > > > >
> > > > > [1] : http://opennlp.apache.org/documentation/1.5.3/manual/opennlp.html#org.apche.opennlp.uima
> > > > > [2] : http://wiki.apache.org/solr/UpdateRequestProcessor
> > > > >
> > > > >
> > > > >
> > > > > 2014-06-03 15:34 GMT+02:00 Ahmet Arslan <iori...@yahoo.com.invalid>:
> > > > >
> > > > > >> Can you extract names, locations, etc. using OpenNLP in a plain Java
> > > > > >> program?
> > > > > >>
> > > > > >> If yes, here are two separate options:
> > > > >>
> > > > > >> 1) Use http://searchhub.org/2012/02/14/indexing-with-solrj/ as an
> > > > > >> example to integrate your NER code into it and write your own indexing
> > > > > >> code. You have the full power here. No solr-plugins are involved.
> > > > >>
> > > > >> 2) Use 'Implementing a conditional copyField' given here :
> > > > >> http://wiki.apache.org/solr/UpdateRequestProcessor
> > > > >> as an example and integrate your NER code into it.
> > > > >>
> > > > >>
> > > > >> Please note that these are separate ways to enrich your incoming
> > > > >> documents, choose either (1) or (2).
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Tuesday, June 3, 2014 3:30 PM, Vivekanand Ittigi <
> > > > >> vi...@biginfolabs.com> wrote:
> > > > > >> Okay, but I didn't understand what you said. Can you please elaborate?
> > > > >>
> > > > >> Thanks,
> > > > >> Vivek
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > > >> On Tue, Jun 3, 2014 at 5:36 PM, Ahmet Arslan <iori...@yahoo.com> wrote:
> > > > >>
> > > > >> > Hi Vivekanand,
> > > > >> >
> > > > > >> > I have never used UIMA+Solr before.
> > > > > >> >
> > > > > >> > Personally, I think it takes more time to learn how to configure and use
> > > > > >> > this UIMA stuff.
> > > > > >> >
> > > > > >> >
> > > > > >> > If you are familiar with Java, write a class that extends
> > > > > >> > UpdateRequestProcessor(Factory). Use OpenNLP for NER and add the new
> > > > > >> > fields (organisation, city, person name, etc.) to your document. This
> > > > > >> > phase is usually called 'enrichment'.
> > > > > >> >
> > > > > >> > Does that make sense?
> > > > >> >
> > > > >> >
> > > > >> >
> > > > > >> > On Tuesday, June 3, 2014 2:57 PM, Vivekanand Ittigi <vi...@biginfolabs.com>
> > > > > >> > wrote:
> > > > >> > Hi Ahmet,
> > > > >> >
> > > > > >> > I followed what you suggested:
> > > > > >> > https://cwiki.apache.org/confluence/display/solr/UIMA+Integration
> > > > > >> > But how can I achieve my goal? I mean extracting only the names of
> > > > > >> > organizations or persons from the content field.
> > > > > >> >
> > > > > >> > I guess I'm almost there but something is missing. Please guide me.
> > > > >> > Thanks,
> > > > >> > Vivek
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> >
> > > > > >> > On Tue, Jun 3, 2014 at 2:50 PM, Vivekanand Ittigi <vi...@biginfolabs.com>
> > > > > >> > wrote:
> > > > >> >
> > > > > >> > > I can't describe the entire goal, but one of the tasks looks like this:
> > > > > >> > > we have a big document (it can be a website, a PDF, etc.) indexed into Solr.
> > > > > >> > > Let's say <field name=content> will store the contents of the document.
> > > > > >> > > All I want to do is pick out the names of persons and places from it using
> > > > > >> > > OpenNLP or some other means.
> > > > >> > >
> > > > >> > > Those names should be reflected in solr itself.
> > > > >> > >
> > > > >> > > Thanks,
> > > > >> > > Vivek
> > > > >> > >
> > > > >> > >
> > > > > >> > > On Tue, Jun 3, 2014 at 1:33 PM, Ahmet Arslan <iori...@yahoo.com> wrote:
> > > > >> > >
> > > > >> > >> Hi,
> > > > >> > >>
> > > > > >> > >> Please tell us what you are trying to do, in a new thread: your
> > > > > >> > >> high-level goal. There may be other ways/tools besides OpenNLP, such as
> > > > > >> > >> ( https://stanbol.apache.org ).
> > > > >> > >>
> > > > >> > >>
> > > > >> > >>
> > > > >> > >> On Tuesday, June 3, 2014 8:31 AM, Vivekanand Ittigi <
> > > > >> > >> vi...@biginfolabs.com> wrote:
> > > > >> > >>
> > > > >> > >>
> > > > >> > >>
> > > > >> > >> We'll surely look into UIMA integration.
> > > > >> > >>
> > > > > >> > >> But before moving on, is this ( https://wiki.apache.org/solr/OpenNLP ) the
> > > > > >> > >> only page available for the integration? Isn't there any other article or
> > > > > >> > >> link that may help us fix this problem?
> > > > >> > >>
> > > > >> > >> Thanks,
> > > > >> > >> Vivek
> > > > >> > >>
> > > > >> > >>
> > > > >> > >>
> > > > >> > >>
> > > > > >> > >> On Tue, Jun 3, 2014 at 2:50 AM, Ahmet Arslan <iori...@yahoo.com> wrote:
> > > > >> > >>
> > > > > >> > >> >Hi,
> > > > > >> > >> >
> > > > > >> > >> >I believe I answered it. Let me retry.
> > > > > >> > >> >
> > > > > >> > >> >There is no committed code for OpenNLP. There is an open ticket with
> > > > > >> > >> >patches. They may not work with the current trunk.
> > > > > >> > >> >
> > > > > >> > >> >Confluence is the official documentation. The wiki is maintained by the
> > > > > >> > >> >community, meaning the wiki can describe uncommitted features,
> > > > > >> > >> >like this one: https://wiki.apache.org/solr/OpenNLP
> > > > > >> > >> >
> > > > > >> > >> >What I am suggesting is: have a look at
> > > > > >> > >> >https://cwiki.apache.org/confluence/display/solr/UIMA+Integration
> > > > > >> > >> >
> > > > > >> > >> >and search for how to use OpenNLP inside UIMA. Maybe LUCENE-2899 is
> > > > > >> > >> >already doable with solr-uima. I am adding Tommaso (sorry for this, but
> > > > > >> > >> >we need an authoritative answer here) to clarify this.
> > > > > >> > >> >
> > > > > >> > >> >Also consider indexing with SolrJ and doing the OpenNLP enrichment
> > > > > >> > >> >outside Solr. Use OpenNLP with plain Java, enrich your documents, and
> > > > > >> > >> >index them with SolrJ. You don't have to do everything inside Solr as
> > > > > >> > >> >Solr plugins.
> > > > > >> > >> >
> > > > > >> > >> >Hope this helps,
> > > > > >> > >> >
> > > > > >> > >> >Ahmet
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> >
> > > > > >> > >> >On Monday, June 2, 2014 11:15 PM, Vivekanand Ittigi <vi...@biginfolabs.com> wrote:
> > > > > >> > >> >Thanks, I will check the JIRA, but you didn't answer my first
> > > > > >> > >> >question. Is there no way to integrate Solr with OpenNLP? Or is there
> > > > > >> > >> >any committed code that I can go ahead with?
> > > > >> > >> >
> > > > >> > >> >Thanks,
> > > > >> > >> >Vivek
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> >
> > > > > >> > >> >On Mon, Jun 2, 2014 at 10:30 PM, Ahmet Arslan <iori...@yahoo.com> wrote:
> > > > >> > >> >
> > > > >> > >> >> Hi,
> > > > >> > >> >>
> > > > > >> > >> >> Here is the JIRA issue: https://issues.apache.org/jira/browse/LUCENE-2899
> > > > >> > >> >>
> > > > >> > >> >>
> > > > >> > >> >> Anyone can create an account.
> > > > >> > >> >>
> > > > > >> > >> >> I haven't used UIMA myself and I have little knowledge about it, but I
> > > > > >> > >> >> believe it is possible to use OpenNLP inside UIMA.
> > > > > >> > >> >> You need to dig into the UIMA documentation.
> > > > > >> > >> >>
> > > > > >> > >> >> Solr UIMA integration already exists; that's why I asked whether your
> > > > > >> > >> >> requirement can be met with UIMA or not. I don't know the answer myself.
> > > > >> > >> >>
> > > > >> > >> >> Ahmet
> > > > >> > >> >>
> > > > >> > >> >>
> > > > >> > >> >>
> > > > > >> > >> >> On Monday, June 2, 2014 7:42 PM, Vivekanand Ittigi <vi...@biginfolabs.com>
> > > > > >> > >> >> wrote:
> > > > >> > >> >> Hi Arslan,
> > > > >> > >> >>
> > > > > >> > >> >> If not the uncommitted code, then which code should be used to integrate?
> > > > > >> > >> >>
> > > > > >> > >> >> If I have to comment on my problems, which JIRA issue should I use, and
> > > > > >> > >> >> how do I post there?
> > > > > >> > >> >>
> > > > > >> > >> >> And why are you suggesting UIMA integration? My requirement is
> > > > > >> > >> >> integrating with OpenNLP. Do you mean we can do all the activities
> > > > > >> > >> >> through UIMA that we do using OpenNLP, like the name finder, location
> > > > > >> > >> >> finder, etc.?
> > > > >> > >> >>
> > > > >> > >> >> Thanks,
> > > > >> > >> >> Vivek
> > > > >> > >> >>
> > > > >> > >> >>
> > > > >> > >> >>
> > > > >> > >> >>
> > > > >> > >> >>
> > > > > >> > >> >> On Mon, Jun 2, 2014 at 8:40 PM, Ahmet Arslan <iori...@yahoo.com.invalid>
> > > > > >> > >> >> wrote:
> > > > >> > >> >>
> > > > >> > >> >> > Hi,
> > > > >> > >> >> >
> > > > > >> > >> >> > Uncommitted code can have these kinds of problems. It is not guaranteed
> > > > > >> > >> >> > to work with the latest trunk.
> > > > > >> > >> >> >
> > > > > >> > >> >> > You could comment on the JIRA ticket about the problem you are facing.
> > > > > >> > >> >> >
> > > > > >> > >> >> > By the way, maybe what you are after is doable with the already
> > > > > >> > >> >> > committed UIMA stuff?
> > > > > >> > >> >> >
> > > > > >> > >> >> > https://cwiki.apache.org/confluence/display/solr/UIMA+Integration
> > > > >> > >> >> >
> > > > >> > >> >> > Ahmet
> > > > >> > >> >> >
> > > > >> > >> >> >
> > > > >> > >> >> >
> > > > > >> > >> >> > On Monday, June 2, 2014 5:07 PM, Vivekanand Ittigi <vi...@biginfolabs.com>
> > > > > >> > >> >> > wrote:
> > > > > >> > >> >> > I followed this page to integrate: https://wiki.apache.org/solr/OpenNLP
> > > > >> > >> >> >
> > > > >> > >> >> > Installation
> > > > >> > >> >> >
> > > > >> > >> >> > For English language testing: Until LUCENE-2899 is
> > > committed:
> > > > >> > >> >> >
> > > > >> > >> >> >     1.pull the latest trunk or 4.0 branch
> > > > >> > >> >> >
> > > > >> > >> >> >     2.apply the latest LUCENE-2899 patch
> > > > >> > >> >> >     3.do 'ant compile'
> > > > >> > >> >> >     cd solr/contrib/opennlp/src/test-files/training
> > > > >> > >> >> >     .
> > > > >> > >> >> >     .
> > > > >> > >> >> >     .
> > > > > >> > >> >> > I followed the first two steps but got the following error while
> > > > > >> > >> >> > executing the third step:
> > > > >> > >> >> >
> > > > > >> > >> >> > common.compile-core:
> > > > > >> > >> >> >     [javac] Compiling 10 source files to /home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/build/analysis/opennlp/classes/java
> > > > > >> > >> >> >     [javac] warning: [path] bad path element "/home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/analysis/opennlp/lib/jwnl-1.3.3.jar": no such file or directory
> > > > > >> > >> >> >     [javac] /home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/FilterPayloadsFilter.java:43: error: cannot find symbol
> > > > > >> > >> >> >     [javac]     super(Version.LUCENE_44, input);
> > > > > >> > >> >> >     [javac]                  ^
> > > > > >> > >> >> >     [javac]   symbol:   variable LUCENE_44
> > > > > >> > >> >> >     [javac]   location: class Version
> > > > > >> > >> >> >     [javac] /home/biginfolabs/solrtest/solr-lucene-trunk3/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/OpenNLPTokenizer.java:56: error: no suitable constructor found for Tokenizer(Reader)
> > > > > >> > >> >> >     [javac]     super(input);
> > > > > >> > >> >> >     [javac]     ^
> > > > > >> > >> >> >     [javac]     constructor Tokenizer.Tokenizer(AttributeFactory) is not applicable
> > > > > >> > >> >> >     [javac]       (actual argument Reader cannot be converted to AttributeFactory by method invocation conversion)
> > > > > >> > >> >> >     [javac]     constructor Tokenizer.Tokenizer() is not applicable
> > > > > >> > >> >> >     [javac]       (actual and formal argument lists differ in length)
> > > > > >> > >> >> >     [javac] 2 errors
> > > > > >> > >> >> >     [javac] 1 warning
> > > > >> > >> >> >
> > > > > >> > >> >> > I'm really stuck on how to get past this step. I have spent a long time
> > > > > >> > >> >> > trying to fix this but couldn't make any progress. Could someone please
> > > > > >> > >> >> > help me?
> > > > >> > >> >> >
> > > > >> > >> >> > Thanks,
> > > > >> > >> >> > Vivek
> > > > >> > >> >> >
> > > > >> > >> >> >
> > > > >> > >> >>
> > > > >> > >> >>
> > > > >> > >> >
> > > > >> > >>
> > > > >> > >
> > > > >> > >
> > > > >> >
> > > > >> >
> > > > >>
> > > > >>
> > > > >
> > > >
> > >
> >
>
