Lucene Solr custom tokenizer - How to include delimiter special characters as tokens?

2019-01-23 Thread rina joseph
Hello Solr users, I have a need to write a tokenizer for source code files in Solr, but don't have the option of including custom JARs. So for ex: Input: foo.bar Tokens: 'foo', '.', 'bar' How can I have a custom tokenizer or filter in schema.xml that can spl
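A minimal sketch of the tokenization asked for above, in plain Java rather than as a Solr plugin. The class name `DelimiterTokens` and the regular expression are assumptions made for illustration; they do not come from the thread. The idea is to match tokens instead of splitting on delimiters, so each delimiter character survives as a token of its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr or Lucene.
class DelimiterTokens {
    // Assumed rule: a run of word characters (letters, digits, underscore)
    // is one token; any other non-whitespace character is a token by itself.
    private static final Pattern TOKEN = Pattern.compile("\\w+|[^\\w\\s]");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group()); // the whole match becomes the token
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("foo.bar")); // prints [foo, ., bar]
    }
}
```

Without custom JARs, a pattern of this kind could possibly be handed to the stock `solr.PatternTokenizerFactory` in schema.xml, with `group="0"` so that matches rather than gaps become tokens. Whether that meets the poster's full requirements is not confirmed by the thread.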

Re: Calling rest API from Solr custom tokenizer plugin

2017-12-06 Thread Sreenivas.T
ise content from content sources like sharepoint, > file > > share etc.. before content is getting indexed to Solr, I need to call our > > internal AI platform to get additional metadata like classification tags > > etc.. > > > > I'm planning to leverage manifold

Re: Calling rest API from Solr custom tokenizer plugin

2017-12-06 Thread Doug Turnbull
ent sources like sharepoint, file > share etc.. before content is getting indexed to Solr, I need to call our > internal AI platform to get additional metadata like classification tags > etc.. > > I'm planning to leverage manifold cf for getting the content from sources > a

Calling rest API from Solr custom tokenizer plugin

2017-12-06 Thread Sreenivas.T
etc.. I'm planning to leverage manifold cf for getting the content from sources and planning to write a custom tokenizer plugin to send the content to the AI platform, which in turn returns additional tags. I'll index the additional tags dynamically through plugin code. Is it a feasible solutio

Re: Developing custom tokenizer/filter in solr 5.4.1

2017-11-08 Thread kumar gaurav
> > at org.apache.solr.util.plugin.AbstractPluginLoader.load( > >> AbstractPluginLoader.java:152) > >> > ... 16 more > >> > Caused by: java.lang.ClassCastException: class com.skyrim. > >> ReverseFilterFactory > &

Re: Developing custom tokenizer/filter in solr 5.4.1

2017-11-08 Thread Erick Erickson
ava:3404) >> > at org.apache.solr.core.SolrResourceLoader.findClass( >> SolrResourceLoader.java:475) >> > at org.apache.solr.core.SolrResourceLoader.newInstance( >> SolrResourceLoader.java:560) >> > at org.apache.solr.schema.FieldTypePluginLoader$3. >> cre

Re: Developing custom tokenizer/filter in solr 5.4.1

2017-11-08 Thread kumar gaurav
ourceLoader.java:560) > > at org.apache.solr.schema.FieldTypePluginLoader$3. > create(FieldTypePluginLoader.java:383) > > at org.apache.solr.schema.FieldTypePluginLoader$3. > create(FieldTypePluginLoader.java:377) > > at org.apache.solr.util.plugin.A

Re: Developing custom tokenizer/filter in solr 5.4.1

2017-11-08 Thread Erick Erickson
Loader.java:377) > at > org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:152) > > > > Please help me its very urgent to build a custom tokenizer like > StandardTokenizerFactory where i will write my own rules for indexing. > > >

Re: Developing custom tokenizer/filter in solr 5.4.1

2017-11-07 Thread kumar gaurav
) Please help me its very urgent to build a custom tokenizer like StandardTokenizerFactory where i will write my own rules for indexing. On Wed, Nov 8, 2017 at 4:30 AM, Erick Erickson wrote: > Looks to me like you're compiling against the jars from one version of > Solr and execut

Re: Developing custom tokenizer/filter in solr 5.4.1

2017-11-07 Thread Erick Erickson
Looks to me like you're compiling against the jars from one version of Solr and executing against another. /root/solr-5.2.1/server/solr/#/conf/managed-schema yet you claim to be using 5.4.1 On Tue, Nov 7, 2017 at 12:00 PM, kumar gaurav wrote: > Hi > > I am developing my own custom filter in

Developing custom tokenizer/filter in solr 5.4.1

2017-11-07 Thread kumar gaurav
Hi, I am developing my own custom filter in Solr 5.4.1. I have created a jar of a filter class that extends the TokenizerFactory class. When I loaded it into the Solr config and added my filter to managed-schema, I got the following error - org.apache.solr.common.SolrException: Could not load conf for core

Solr custom Tokenizer Factory works randomly

2014-06-26 Thread Gotz SE
I am new to Solr and I have to build a filter that lemmatizes text when indexing documents and also lemmatizes queries. I created a custom Tokenizer Factory that lemmatizes the text before passing it to the Standard Tokenizer. Tests in the Solr analysis section work fairly well (on index ok, but on

Solr with custom tokenizer

2013-08-14 Thread Алексей Курган
There is a problem with a custom tokenizer for Solr. We have developed our own tokenizer for Solr that extracts phone numbers from the text and adds them as additional tokens to the token stream. Unfortunately, these additional tokens are not indexed by Solr. For example, the text "Hello (111) 222-33-4

issue with custom tokenizer

2013-08-13 Thread dhaivat dave
Hello All, I am trying to develop a custom tokeniser (please find code below) and found an issue while adding multiple documents one after another. It works fine when I add the first document, but when I add another document it does not call the "create" method from SampleTokeniserFactory.java but it calls

Re: developing custom tokenizer

2013-08-13 Thread dhaivat dave
Hi Alex, Thanks for your reply. I looked into the core analysers and created a custom tokeniser based on them. I have shared the code below. When I looked at the Solr analysis page, the analyser works fine, but when I tried to submit 100 docs together I found in the logs (with custom message printing)

Re: developing custom tokenizer

2013-08-12 Thread Alexandre Rafalovitch
Have you tried looking at the source code itself? Between simple tokenizers like keyword and complex language ones, you should be able to get an idea. Then ask specific follow up questions. Regards, Alex On 12 Aug 2013 09:29, "dhaivat dave" wrote: > Hello All, > > I want to create custom tokenis

developing custom tokenizer

2013-08-12 Thread dhaivat dave
Hello All, I want to create a custom tokeniser in Solr 4.4. It would be very helpful if someone could share any tutorials or information on this. Many Thanks, Dhaivat Dave

Re: custom tokenizer error

2013-05-06 Thread Sarita Nair
baseTokenizer is reset in the #reset method. Sarita From: Jack Krupansky To: solr-user@lucene.apache.org Sent: Sunday, May 5, 2013 1:37 PM Subject: Re: custom tokenizer error I didn't notice any call to the "reset" method for your base

Re: custom tokenizer error

2013-05-05 Thread Jack Krupansky
:43 PM To: solr-user@lucene.apache.org Subject: custom tokenizer error I am using a custom Tokenizer, as part of analysis chain, for a Solr (4.2.1) field. On trying to index, Solr throws a NullPointerException. The unit tests for the custom tokenizer work fine. Any ideas as to what is it that I am missi

custom tokenizer error

2013-05-03 Thread Sarita Nair
I am using a custom Tokenizer, as part of analysis chain, for a Solr (4.2.1) field. On trying to index, Solr throws a NullPointerException.  The unit tests for the custom tokenizer work fine. Any ideas as to what is it that I am missing/doing incorrectly will be appreciated. Here is the

Re: Solr - WordDelimiterFactory with Custom Tokenizer to split only on Boundires

2013-04-30 Thread meghana
re whether/how those > patterns could be combined. > > Also, that doesn't allow the case of a single ".", "&", or "_" as a word - > but you didn't specify how that case should be handled. > > > > -- Jack Krupansky > -Original Mes

Re: Solr - WordDelimiterFactory with Custom Tokenizer to split only on Boundires

2013-04-24 Thread Jack Krupansky
The WDF "types" will treat a character the same regardless of where it appears. For something conditional, like dot between letters vs. dot not preceded and followed by a letter, you either have to have a custom tokenizer or a character filter. Interesting that although th
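A rough sketch of the conditional treatment described above, written as a plain Java regular expression. The class name `ConditionalDot` and the choice to replace boundary dots with a space are assumptions for illustration only; the thread's exact requirement is truncated here. The pattern distinguishes a dot that sits between two letters from any other dot, which a per-character type table cannot do.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr or Lucene.
class ConditionalDot {
    // Matches a dot unless it is BOTH preceded and followed by a letter:
    // either no letter follows it, or no letter precedes it.
    private static final Pattern BOUNDARY_DOT =
            Pattern.compile("\\.(?!\\p{L})|(?<!\\p{L})\\.");

    // Assumed behaviour: boundary dots become a space, inner dots are kept.
    static String stripBoundaryDots(String text) {
        return BOUNDARY_DOT.matcher(text).replaceAll(" ");
    }

    public static void main(String[] args) {
        System.out.println(stripBoundaryDots("foo.bar")); // dot between letters is kept
        System.out.println(stripBoundaryDots("end."));    // trailing dot is replaced
    }
}
```

In a Solr analysis chain, a pattern like this would more plausibly live in a character filter such as `solr.PatternReplaceCharFilterFactory` ahead of the tokenizer; that placement is a suggestion, not something the thread confirms.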

Solr - WordDelimiterFactory with Custom Tokenizer to split only on Boundires

2013-04-24 Thread meghana
can i set configuration for worddelimiter factory to fulfill my requirement. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-WordDelimiterFactory-with-Custom-Tokenizer-to-split-only-on-Boundires-tp4058557.html Sent from the Solr - User mailing list archive at Nabble.com.

RE: Custom Tokenizer

2007-02-12 Thread Chris Hostetter
: After reading the docs, I put it in example/solr/lib, but didn't remove : it from example/ext. Whoops. : : Long story short, putting the custom.jar into example/solr/lib worked : with java 5 and 6, so long as it wasn't also in example/ext. That's excellent news Devon, thanks for the followup. (

RE: Custom Tokenizer

2007-02-12 Thread Smith,Devon
] Sent: Friday, February 09, 2007 3:34 AM To: solr-user@lucene.apache.org Subject: RE: Custom Tokenizer Sorry, one other thing to verify... did you see an INFO message like this logged at some point... Adding 'custom.jar' to Solr classloader also be on the lookout for "

RE: Custom Tokenizer

2007-02-09 Thread Chris Hostetter
: Chris Hostetter <[EMAIL PROTECTED]> : Reply-To: solr-user@lucene.apache.org : To: solr-user@lucene.apache.org : Subject: RE: Custom Tokenizer : : : : Yes, this is with the Jetty that comes with Solr. Right now I'm just : : familiarizing myself with everything. : : I meant to follow up on

RE: Custom Tokenizer

2007-02-09 Thread Chris Hostetter
m: Chris Hostetter [mailto:[EMAIL PROTECTED] : Sent: Monday, February 05, 2007 12:52 AM : To: solr-user@lucene.apache.org : Subject: Re: Custom Tokenizer : : : : to develop and build the factory and tokenizer. However, when I start : : solr up, I get a stack trace, that says : "java.lang.No

RE: Custom Tokenizer

2007-02-05 Thread Smith,Devon
pache.org Subject: Re: Custom Tokenizer : to develop and build the factory and tokenizer. However, when I start : solr up, I get a stack trace, that says "java.lang.NoClassDefFoundError: : org/apache/solr/analysis/BaseTokenizerFactory" That's really confusing. : : Any thoughts on

RE: Custom Tokenizer

2007-02-05 Thread Smith,Devon
AM To: solr-user@lucene.apache.org Subject: Re: Custom Tokenizer Hmmm, classloader hell... I assume you are putting your analyzer in solr/lib? Perhaps try to explode the solr webapp and put your custom analyzer directly in WEB-INF/lib/ -Yonik On 2/2/07, Smith,Devon <[EMAIL PROTECTED]>

Re: Custom Tokenizer

2007-02-04 Thread Chris Hostetter
: to develop and build the factory and tokenizer. However, when I start : solr up, I get a stack trace, that says "java.lang.NoClassDefFoundError: : org/apache/solr/analysis/BaseTokenizerFactory" That's really confusing. : : Any thoughts on what I'm missing/doing wrong? based on your stack trace,

Re: Custom Tokenizer

2007-02-03 Thread Erik Hatcher
On Feb 3, 2007, at 11:18 AM, Yonik Seeley wrote: Hmmm, classloader hell... Yeah, I had a bad feeling about that external lib thing. It's a holy grail to allow dynamic pluggability in Java, but it's much more difficult than it perhaps should be. I assume you are putting your analyzer in

Re: Custom Tokenizer

2007-02-03 Thread Yonik Seeley
Hmmm, classloader hell... I assume you are putting your analyzer in solr/lib? Perhaps try to explode the solr webapp and put your custom analyzer directly in WEB-INF/lib/ -Yonik On 2/2/07, Smith,Devon <[EMAIL PROTECTED]> wrote: Hi, I'm trying to get a custom tokenizer working, but

Custom Tokenizer

2007-02-02 Thread Smith,Devon
Hi, I'm trying to get a custom tokenizer working, but I'm having some problems. Per the instructions on various pages [1][2], I've been able to develop and build the factory and tokenizer. However, when I start solr up, I get a stack trace, that says "java.lang.NoClassDefFo