RE: Possible issue with Stemming and nouns ended with suffix 'ion'

2020-05-01 Thread Jhonny Lopez
jhonny.lo...@prodigious.com www.prodigious.com -Original Message- From: Walter Underwood Sent: Friday, May 1, 2020 11:24 a.m. To: solr-user@lucene.apache.org Subject: Re: Possible issue with Stemming and nouns ended with suffix 'ion' This email has been sent fro

Re: Possible issue with Stemming and nouns ended with suffix 'ion'

2020-05-01 Thread Walter Underwood
...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On May 1, 2020, at 8:45 AM, Mike Drob wrote: > > This is how things get stemmed *now*, but I believe there is an open > question as to whether that is how they *should* be stemmed. Specifically, > the case appears to be -ify words

Re: Possible issue with Stemming and nouns ended with suffix 'ion'

2020-05-01 Thread Mike Drob
This is how things get stemmed *now*, but I believe there is an open question as to whether that is how they *should* be stemmed. Specifically, the case appears to be -ify words not stemming to the same as -ification - this applies to much more than identify/identification. Also, justify, fortify

RE: Possible issue with Stemming and nouns ended with suffix 'ion'

2020-05-01 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
From: Mike Drob Sent: Thursday, April 30, 2020 5:30 p.m. To: solr-user@lucene.apache.org Subject: Re: Possible issue with Stemming and nouns ended with suffix 'ion' This email has been sent from a source external to Publicis Groupe. Please use caution when

Re: Possible issue with Stemming and nouns ended with suffix 'ion'

2020-05-01 Thread Mike Drob
5:37 PM Jhonny Lopez wrote: > Yes, sounds like worth it. > > Thanks guys! > > -Original Message- > From: Mike Drob > Sent: Thursday, April 30, 2020 5:30 p.m. > To: solr-user@lucene.apache.org > Subject: Re: Possible issue with Stemming and nouns ended with

RE: Possible issue with Stemming and nouns ended with suffix 'ion'

2020-04-30 Thread Jhonny Lopez
Yes, sounds like it's worth it. Thanks guys! -Original Message- From: Mike Drob Sent: Thursday, April 30, 2020 5:30 p.m. To: solr-user@lucene.apache.org Subject: Re: Possible issue with Stemming and nouns ended with suffix 'ion' This email has been sent from a source e

Re: Possible issue with Stemming and nouns ended with suffix 'ion'

2020-04-30 Thread Mike Drob
Is this worth filing a bug/suggestion to the folks over at snowballstem.org? On Thu, Apr 30, 2020 at 4:08 PM Audrey Lorberfeld - audrey.lorberf...@ibm.com wrote: > I agree with Erick. I think that's just how the cookie crumbles when > stemming. If you have some time on your han

RE: Possible issue with Stemming and nouns ended with suffix 'ion'

2020-04-30 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
I agree with Erick. I think that's just how the cookie crumbles when stemming. If you have some time on your hands, you can integrate OpenNLP with your Solr instance and start using the lemmas of tokens instead of the stems. In this case, I believe if you were to lemmatize both "ide

Re: Possible issue with Stemming and nouns ended with suffix 'ion'

2020-04-30 Thread matthew sporleder
If you use the stemmer in your query analysis it should act the same, right? On Thu, Apr 30, 2020 at 3:54 PM Erick Erickson wrote: > > They are being stemmed to two different tokens, “identif” and “identifi”. > Stemming is algorithmic and imperfect and in this case you’re getting bit

Re: Possible issue with Stemming and nouns ended with suffix 'ion'

2020-04-30 Thread Erick Erickson
They are being stemmed to two different tokens, “identif” and “identifi”. Stemming is algorithmic and imperfect and in this case you’re getting bitten by that algorithm. It looks like you’re using PorterStemFilter, if you want you can look up the exact algorithm, but I don’t think it’s a bug

RE: Possible issue with Stemming and nouns ended with suffix 'ion'

2020-04-30 Thread Jhonny Lopez
Sure, rewriting the message with links for images: We’re facing an issue with stemming in Solr. Most cases work correctly; for example, if we search for bidding, Solr returns results for bidding, bid, bids, etc. However, for nouns ending with the ‘ion’ suffix, stemming is not working

Re: Possible issue with Stemming and nouns ended with suffix 'ion'

2020-04-30 Thread Erick Erickson
The mail server is pretty aggressive about stripping links, so we can’t see the images. Could you put them somewhere and paste a link? Best, Erick > On Apr 30, 2020, at 2:40 PM, Jhonny Lopez > wrote: > > We’re facing an issue with stemming in solr. Most of the cases are working

Possible issue with Stemming and nouns ended with suffix 'ion'

2020-04-30 Thread Jhonny Lopez
We're facing an issue with stemming in Solr. Most cases work correctly; for example, if we search for bidding, Solr returns results for bidding, bid, bids, etc. However, for nouns ending with the 'ion' suffix, stemming is not working. Even when the analyzers seem to have c

Re: Is it possible to add stemming in a text_exact field

2020-02-25 Thread Paras Lehana
Hi Dhanesh, Use KeywordRepeatFilterFactory <https://lucene.apache.org/solr/guide/8_4/language-analysis.html#keywordrepeatfilterfactory>. It emits each token twice and marks one of the copies as a KEYWORD, so stemming is not applied to that copy. Use RemoveDuplicatesTokenFilterFactory to remove the duplicates
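
As a rough illustration of the KeywordRepeat plus RemoveDuplicates combination suggested in this reply, the following Python sketch simulates the token stream outside Solr. It is not Solr code: `toy_stem` is a hypothetical stand-in for a real stemmer, and the token dictionaries are an assumption made only for this example.

```python
# Toy simulation of a KeywordRepeat -> stemmer -> RemoveDuplicates chain.
# Not Solr code: toy_stem is a hypothetical stand-in for a real stemmer.

def toy_stem(word):
    """Hypothetical stemmer: strips a trailing plural 's' only."""
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word

def keyword_repeat(tokens):
    """Emit each token twice at the same position: one keyword copy, one normal copy."""
    stream = []
    for pos, text in enumerate(tokens):
        stream.append({"text": text, "pos": pos, "keyword": True})
        stream.append({"text": text, "pos": pos, "keyword": False})
    return stream

def stem_filter(stream):
    """Stem only the tokens that are not flagged as keywords."""
    return [t if t["keyword"] else {**t, "text": toy_stem(t["text"])} for t in stream]

def remove_duplicates(stream):
    """Drop a token if the same text was already emitted at the same position."""
    seen = set()
    kept = []
    for t in stream:
        key = (t["pos"], t["text"])
        if key not in seen:
            seen.add(key)
            kept.append(t)
    return kept

def analyze(text):
    """Run the whole toy chain and return (position, text) pairs."""
    stream = remove_duplicates(stem_filter(keyword_repeat(text.lower().split())))
    return [(t["pos"], t["text"]) for t in stream]

if __name__ == "__main__":
    print(analyze("restaurants dubai"))
```

In this sketch "restaurants" survives both in its original and its stemmed form at the same position, while "dubai" is kept only once because its two copies are identical after stemming.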

Re: Is it possible to add stemming in a text_exact field

2020-01-24 Thread Lucky Sharma
_exact to text_general. Then, you can use edismax parser to search > both > > fields, but giving text_exact a higher boost (qf=text_exact^5 > > text_general). In this case, both fields should be indexed, but only one > > needs to be stored. > > > > Edward > >

Re: Is it possible to add stemming in a text_exact field

2020-01-23 Thread Alessandro Benedetti
> > On Wed, Jan 22, 2020 at 10:34 AM Dhanesh Radhakrishnan > > wrote: > > > Hello, > > I'm facing an issue with stemming. > > My search query is "restaurant dubai" and returns results. > > If I search "restaurants dubai" it returns no data. >

Re: Is it possible to add stemming in a text_exact field

2020-01-22 Thread Edward Ribeiro
giving text_exact a higher boost (qf=text_exact^5 text_general). In this case, both fields should be indexed, but only one needs to be stored. Edward On Wed, Jan 22, 2020 at 10:34 AM Dhanesh Radhakrishnan wrote: > Hello, > I'm facing an issue with stemming. > My search query is "
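
The boost described in this reply can be expressed as request parameters. The sketch below only builds the parameter string and sends nothing; the field names and the boost of 5 come from the message, while the function names are made up for illustration and `defType=edismax` is assumed to be how the parser is selected.

```python
from urllib.parse import urlencode, parse_qs

def build_edismax_params(user_query, exact_field="text_exact",
                         stemmed_field="text_general", exact_boost=5):
    """Return edismax parameters that search both fields and boost the exact one."""
    return {
        "defType": "edismax",
        "q": user_query,
        "qf": f"{exact_field}^{exact_boost} {stemmed_field}",
    }

def build_query_string(user_query):
    """URL-encode the parameters so they can be appended to a select URL."""
    return urlencode(build_edismax_params(user_query))

if __name__ == "__main__":
    qs = build_query_string("restaurants dubai")
    print(qs)
    # Decoding again shows the qf value as Solr would receive it.
    print(parse_qs(qs)["qf"][0])
```

Both fields are searched, so a stemmed match in text_general still returns the document, and a literal match in text_exact contributes more to the score.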

Is it possible to add stemming in a text_exact field

2020-01-22 Thread Dhanesh Radhakrishnan
Hello, I'm facing an issue with stemming. My search query is "restaurant dubai" and returns results. If I search "restaurants dubai" it returns no data. How to stem this keyword "restaurant dubai" with "restaurants dubai" ? I'm using a

Re: Query on stemming

2019-11-03 Thread Shubham Goswami
being not available. Please confirm >if it's available in analysis directory. >- Also, what Solr version are you using? *EnglishPorterFilterFactory* is >already *deprecated* and I suggest you to use >*SnowballPorterFilterFactory* with *language="English" *in

Re: Query on stemming

2019-11-01 Thread Paras Lehana
ish" *instead. You need to add stemming to the query-time analysis chain too; otherwise, a query such as "bags" will not match the indexed "bag". Hope this helps. On Fri, 1 Nov 2019 at 15:42, Shubham Goswami wrote: > Hi Jorn > > Thanks for your response. >
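
To make the point about query-time analysis concrete, here is a small Python sketch of an inverted index whose terms are stemmed at index time. It is a toy model, not Solr: `toy_stem` is a hypothetical plural stripper and the documents are invented for the example.

```python
# Toy inverted index showing why the query needs the same stemming as the index.

def toy_stem(word):
    """Hypothetical stemmer: strips a trailing plural 's' only."""
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word

def analyze(text, stem):
    """Lowercase and split; optionally stem every token."""
    tokens = text.lower().split()
    return [toy_stem(t) for t in tokens] if stem else tokens

def build_index(docs, stem):
    """Map each analyzed term to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text, stem):
            index.setdefault(term, set()).add(doc_id)
    return index

def search(index, query, stem):
    """Return ids of documents that contain every analyzed query term."""
    hits = None
    for term in analyze(query, stem):
        ids = index.get(term, set())
        hits = ids if hits is None else hits & ids
    return sorted(hits or [])

if __name__ == "__main__":
    docs = {1: "red bag", 2: "travel bags"}  # invented sample documents
    index = build_index(docs, stem=True)     # index-time stemming
    print(search(index, "bags", stem=False)) # query not stemmed
    print(search(index, "bags", stem=True))  # query stemmed like the index
```

With stemming on the index side only, the raw query term "bags" has no entry in the index; once the query goes through the same stemmer it becomes "bag" and finds both documents.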

Re: Query on stemming

2019-11-01 Thread Paras Lehana
"synonyms.txt"/> > >>> > >>> > >>> > >>> > >>>> On Fri, Nov 1, 2019 at 2:10 PM Jörn Franke > >> wrote: > >>>> > >>>> How did you define the field type? Probably you have synt

Re: Query on stemming

2019-11-01 Thread Jörn Franke
>>> I recommend to use the schema rest api instead of schema xml as it will >>>> give you better feedback on what is wrong and it allows you also better >>>> versioning of the schema in a source code repository. >>>> >>>> htt

Re: Query on stemming

2019-11-01 Thread Paras Lehana
ad of schema xml as it will > >> give you better feedback on what is wrong and it allows you also better > >> versioning of the schema in a source code repository. > >> > >> https://lucene.apache.org/solr/guide/8_2/schema-api.html > >> > >> >

Re: Query on stemming

2019-11-01 Thread Jörn Franke
nd to use the schema rest api instead of schema xml as it will >> give you better feedback on what is wrong and it allows you also better >> versioning of the schema in a source code repository. >> >> https://lucene.apache.org/solr/guide/8_2/schema-api.html >>

Re: Query on stemming

2019-11-01 Thread Shubham Goswami
y. > > https://lucene.apache.org/solr/guide/8_2/schema-api.html > > > > On 01.11.2019 at 06:41, Shubham Goswami < > shubham.gosw...@hotwax.co> wrote: > > > > Hello Community > > > > I am using a filter class EnglishPorterFilterFactory for stemming filter >

Re: Query on stemming

2019-11-01 Thread Jörn Franke
://lucene.apache.org/solr/guide/8_2/schema-api.html > On 01.11.2019 at 06:41, Shubham Goswami wrote: > > Hello Community > > I am using a filter class EnglishPorterFilterFactory for stemming filter > but because of usage of this class, my solr is not able reload the schema. > > Ca

Query on stemming

2019-10-31 Thread Shubham Goswami
Hello Community, I am using the filter class EnglishPorterFilterFactory as a stemming filter, but because of this class my Solr is not able to reload the schema. Can somebody please let me know what exactly this class does and how I can implement stemming? Any help will be appreciated. Thanks

ManagedFilter for stemming

2019-07-09 Thread Johannes Siegert
Hi, we are using the SnowballPorterFilter to stem our tokens for several languages. Now we want to update the list of protected words through the Solr API. As far as I can see, there are only solutions for the SynonymFilter and the StopwordFilter, with ManagedSynonymFilter and ManagedStopFilter. Do you know

RE: KeywordRepeat, stemming, (single term) synonyms and minimum should match (edismax)

2018-11-29 Thread Markus Jelsma
used, and then get replaced by Solr code. Many thanks, Markus -Original message- > From:Markus Jelsma > Sent: Thursday 22nd November 2018 15:39 > To: solr-user@lucene.apache.org; solr-user > Subject: RE: KeywordRepeat, stemming, (single term) synonyms and minimum >

RE: KeywordRepeat, stemming, (single term) synonyms and minimum should match (edismax)

2018-11-22 Thread Markus Jelsma
kus Jelsma > Sent: Sunday 18th November 2018 23:21 > To: solr-user@lucene.apache.org; solr-user > Subject: RE: KeywordRepeat, stemming, (single term) synonyms and minimum > should match (edismax) > > Hello, > > Apologies for bothering you all again, but i really need

RE: KeywordRepeat, stemming, (single term) synonyms and minimum should match (edismax)

2018-11-18 Thread Markus Jelsma
arkus -Original message- > From:Markus Jelsma > Sent: Tuesday 13th November 2018 9:52 > To: solr-user > Subject: KeywordRepeat, stemming, (single term) synonyms and minimum should > match (edismax) > > Hello, apologies for this long winded e-mail. > > O

KeywordRepeat, stemming, (single term) synonyms and minimum should match (edismax)

2018-11-13 Thread Markus Jelsma
m? How can I get the same stable results for both queries? Does the odd position increment have anything to do with it (it seems Lucene's QueryBuilder does something with it). What do I need to do? Many thanks, Markus ps. this is on Solr 7.2.1 and 7.5.0. [1] http://lucene.472066.n3.nabble.

RE: Grammatical tenses Stemming in SOLR

2018-09-21 Thread Markus Jelsma
to be hard-coded either within the algorithm, which it is not, or outside by for example a StemmerOverrideFilter. Regards, Markus -Original message- > From:aishwarya > Sent: Friday 21st September 2018 10:38 > To: solr-user@lucene.apache.org > Subject: Grammatical tenses Stem

Grammatical tenses Stemming in SOLR

2018-09-21 Thread aishwarya
I want to know which stemming filter factory can be used to match all the possible tenses of a stem word. Example: if "run" is the search word, it has to fetch results for all files involving run, running, runs, ran. Also vice versa: whichever

RE: Multiple languages, boosting and, stemming and KeywordRepeat

2018-05-18 Thread Markus Jelsma
part again, i clearly missed it the last time. Thanks, Markus -Original message- > From:Alessandro Benedetti > Sent: Friday 18th May 2018 12:54 > To: solr-user@lucene.apache.org > Subject: Re: Multiple languages, boosting and, stemming and KeywordRepeat > > Hi Mark

Re: Multiple languages, boosting and, stemming and KeywordRepeat

2018-05-18 Thread Alessandro Benedetti
t find > anything in the Lucene or Solr Javadocs, or the reference manual. > > Many thanks, again, > Markus > > > > -Original message- > > From:Markus Jelsma > > Sent: Wednesday 9th May 2018 17:39 > > To: solr-user > > Subject: Multipl

RE: Multiple languages, boosting and, stemming and KeywordRepeat

2018-05-17 Thread Markus Jelsma
- > From:Markus Jelsma > Sent: Wednesday 9th May 2018 17:39 > To: solr-user > Subject: Multiple languages, boosting and, stemming and KeywordRepeat > > Hello, > > First, apologies for the weird subject line. > > We index many languages and search over al

Multiple languages, boosting and, stemming and KeywordRepeat

2018-05-09 Thread Markus Jelsma
Hello, First, apologies for the weird subject line. We index many languages and search over all those languages at once, but boost the language of the user's preference. To differentiate between stemmed tokens and unstemmed tokens we use KeywordRepeat and RemoveDuplicates, this works very well

Re: Advice on Stemming in Solr

2017-11-04 Thread Zheng Lin Edwin Yeo
s how these flags will control spell checking. > > Probably we can control it from those files in HunspellStemFilterFactory? > > > > Regards, > > Edwin > > > > > > On 2 November 2017 at 17:46, Emir Arnautović < > emir.arnauto...@sematext.com> >

Re: Advice on Stemming in Solr

2017-11-03 Thread Emir Arnautović
d an affix file > that specifies how these flags will control spell checking. > Probably we can control it from those files in HunspellStemFilterFactory? > > Regards, > Edwin > > > On 2 November 2017 at 17:46, Emir Arnautović > wrote: > >> Hi Edwin, >>

Re: Advice on Stemming in Solr

2017-11-02 Thread Zheng Lin Edwin Yeo
On 2 November 2017 at 17:46, Emir Arnautović wrote: > Hi Edwin, > It seems that it would be best if you do not apply *ing stemming rule at > all. The first idea is to trick stemmer and replace any word that ends with > ing to some nonexisting char combination e.g. ‘wqx’.

Re: Advice on Stemming in Solr

2017-11-02 Thread Emir Arnautović
Hi Edwin, It seems that it would be best if you did not apply the *ing stemming rule at all. The first idea is to trick the stemmer by replacing the "ing" ending of any such word with some nonexistent character combination, e.g. ‘wqx’. You can use solr.PatternReplaceFilterFactory to do that. You can switch it back
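
The placeholder trick from this message can be sketched with plain regular expressions. This is a toy model rather than a Solr configuration: `toy_stem` is a hypothetical stemmer that only strips "ing", and the two helper functions stand in for what the message proposes doing with pattern-replace filters before and after the stemmer.

```python
import re

PLACEHOLDER = "wqx"  # placeholder suggested in the message

def toy_stem(word):
    """Hypothetical stemmer: strips a trailing 'ing' from longer words."""
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]
    return word

def protect_ing(word):
    """Replace a trailing 'ing' with the placeholder so the stemmer ignores it."""
    return re.sub(r"ing$", PLACEHOLDER, word)

def restore_ing(word):
    """Turn the placeholder back into 'ing' after stemming."""
    return re.sub(PLACEHOLDER + r"$", "ing", word)

def analyze(word):
    """Protect, stem, then restore."""
    return restore_ing(toy_stem(protect_ing(word)))

if __name__ == "__main__":
    print(toy_stem("ximenting"))  # unprotected: the ending is stripped
    print(analyze("ximenting"))   # protected: the word comes back unchanged
```

Note the trade-off in this approach: every word ending in "ing" is shielded, English ones included, and a real word that happens to end in the placeholder would be altered by the restore step.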

Re: Advice on Stemming in Solr

2017-11-01 Thread Zheng Lin Edwin Yeo
Hi Emir, We do have quite a lot of words that should not be stemmed. Currently, the KStemFilterFactory is stemming all the non-English words that end with "ing" as well. There are quite a lot of places and names which end in "ing", and all these are being stemmed as wel

Re: Advice on Stemming in Solr

2017-11-01 Thread Emir Arnautović
. If you want to find documents that contain only “walking” with search term “walk”, then you have to stem at index time. Cases where you use stemming at query time only are rare and specific. If you want to prefer exact matches over stemmed matches, you have to index the same content with and without

Advice on Stemming in Solr

2017-11-01 Thread Zheng Lin Edwin Yeo
Hi, We are currently using KStemFilterFactory in Solr, but we found that it is actually doing stemming on non-English words like "ximenting", which it stems to "ximent". This is not what we want. Another option is to use the HunspellStemFilterFactory, but there are som

Re: solrnet + stemming issue

2017-08-08 Thread Erick Erickson
k" but don't show result for those document which have word "saks". I assume that this sak is not actually singular word for saks and saks is also not plural for sak, this is only one sentence like "saks fifth avenue" like this in document. So I get this issue due to stemmi

solrnet + stemming issue

2017-08-08 Thread KG S
"sak" but don't show result for those document which have word "saks". I assume that this sak is not actually singular word for saks and saks is also not plural for sak, this is only one sentence like "saks fifth avenue" like this in document. So I get this issu

Re: Stemming and accents

2017-02-11 Thread Dominique Bejean
Thank you both for your answers. I tried to find some French homophone words (tache / tâche, bouche / bouché, ...) with different stems (with snowball, minimal and light stemmers), but without success. So putting the ASCIIFolding filter before the stemmer is not a big issue (in French) for precision.

Re: Stemming and accents

2017-02-10 Thread Ahmet Arslan
Hi, I have experimented before, and found that Snowball is sensitive to accents/diacritics. Please see for more details: http://www.sciencedirect.com/science/article/pii/S0306457315001053 Ahmet On Friday, February 10, 2017 11:27 AM, Dominique Bejean wrote: Hi, Is the SnowballPorterFilter

Re: Stemming and accents

2017-02-10 Thread Erick Erickson
The easiest way to answer that is to define two different fieldTypes, one with Snowball first and one with ASCIIFolding first, fire up the admin/analysis page and give it some input. That'll show you _exactly_ what transformations take place at each step. Best, Erick On Fri, Feb 10, 2017 at 12:26
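
The comparison Erick describes can be mimicked outside Solr to see why filter order might matter. The sketch below is only an approximation: `fold_accents` strips combining marks, which covers far less than ASCIIFoldingFilter does, and `toy_stem` applies an invented accent-sensitive rule that does not correspond to any real French stemmer. The thread's later follow-up reports that no such difference was found with the actual French stemmers.

```python
import unicodedata

def fold_accents(word):
    """Approximate accent folding: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def toy_stem(word):
    """Hypothetical accent-sensitive rule: strip a trailing e-acute."""
    word = unicodedata.normalize("NFC", word)
    return word[:-1] if word.endswith("\u00e9") else word

def fold_then_stem(word):
    """Order 1: folding filter first, stemmer second."""
    return toy_stem(fold_accents(word))

def stem_then_fold(word):
    """Order 2: stemmer first, folding filter second."""
    return fold_accents(toy_stem(word))

if __name__ == "__main__":
    for w in ("bouche", "bouch\u00e9"):
        print(w, fold_then_stem(w), stem_then_fold(w))
```

With this invented rule, folding first hides the accent from the stemmer, so both words end up as the same token, whereas stemming first keeps them apart. The admin analysis page remains the authoritative way to check real filters.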

Stemming and accents

2017-02-10 Thread Dominique Bejean
Hi, Is the SnowballPorterFilter sensitive to the accents for French for instance ? If I use both SnowballPorterFilter and ASCIIFoldingFilter, do I have to configure ASCIIFoldingFilter after SnowballPorterFilter ? Regards. Dominique -- Dominique Béjean 06 08 46 12 43

Re: Stemming with SOLR

2016-12-18 Thread Lasitha Wattaladeniya
Thank you all for the replies. I am considering the suggestions On 17 Dec 2016 01:50, "Susheel Kumar" wrote: > To handle irregular nouns ( > http://www.ef.com/english-resources/english-grammar/ > singular-and-plural-nouns/), > the simplest way is to handle them using StemmerOverrideFilterFactory. The lis

Re: Stemming with SOLR

2016-12-16 Thread Susheel Kumar
To handle irregular nouns ( http://www.ef.com/english-resources/english-grammar/singular-and-plural-nouns/), the simplest way is to handle them using StemmerOverrideFilterFactory. The list is not so long. Otherwise, go for commercial solutions like basistech etc. as Alex suggested, or you can customize Hun
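
The override idea in this message amounts to a lookup that runs before the algorithmic stemmer. The Python sketch below is a toy model, not Solr's filter: the tab-separated sample entries are hypothetical, `toy_stem` is an invented plural stripper, and the assumption is that overridden words skip the stemmer entirely.

```python
# Hypothetical override list, one "surface<TAB>stem" pair per line.
SAMPLE_OVERRIDES = "# irregular forms\nran\trun\ncaught\tcatch\nmice\tmouse\n"

def parse_overrides(text):
    """Parse tab-separated pairs into a dict, skipping blank and comment lines."""
    overrides = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        surface, stem = line.split("\t", 1)
        overrides[surface] = stem
    return overrides

def toy_stem(word):
    """Hypothetical stemmer: strips a trailing plural 's' only."""
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word

def analyze(text, overrides):
    """Use the override when one exists; otherwise fall back to the stemmer."""
    result = []
    for token in text.lower().split():
        if token in overrides:
            result.append(overrides[token])  # overridden tokens skip the stemmer
        else:
            result.append(toy_stem(token))
    return result

if __name__ == "__main__":
    overrides = parse_overrides(SAMPLE_OVERRIDES)
    print(analyze("the mice ran after cats", overrides))
```

Irregular forms such as "ran" or "mice" are handled by the list, and regular plurals still go through the rule-based stemmer, which is why the list can stay short.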

Re: Stemming with SOLR

2016-12-15 Thread Alexandre Rafalovitch
If you need the full fidelity solution taking care of multiple edge-cases, it could be worth looking at commercial solutions. http://www.basistech.com/ has one, including a free-level SAAS plan. Regards, Alex. http://www.solr-start.com/ - Resources for Solr users, new and experienced O

Re: Stemming with SOLR

2016-12-15 Thread Lasitha Wattaladeniya
Hi all, Thanks for the replies. @eric, ahmet: since those stemmers are logical stemmers, they won't work on words such as caught, ran and so on, so in our case they won't work. @susheel: Yes, I thought about it, but the problem we have is that the documents we index are somewhat large texts, so copy fielding

Re: Stemming with SOLR

2016-12-15 Thread Susheel Kumar
We did extensive comparison in the past for Snowball, KStem and Hunspell and there are cases where one of them works better but not other or vice-versa. You may utilise all three of them by having 3 different fields (fieldTypes) and during query, search in all of them. For some of the cases where

Re: Stemming with SOLR

2016-12-15 Thread Ahmet Arslan
Hi, KStemFilter returns legitimate English words, please use it. Ahmet On Thursday, December 15, 2016 6:17 PM, Lasitha Wattaladeniya wrote: Hello devs, I'm trying to develop this indexing and querying flow where it converts the words to its original form (lemmatization). I was doing bit of

Re: Stemming with SOLR

2016-12-15 Thread Erick Erickson
What about things like PorterStemFilterFactory, EnglishMinimalStemFilterFactory and the like? Best, Erick On Thu, Dec 15, 2016 at 7:16 AM, Lasitha Wattaladeniya wrote: > Hello devs, > > I'm trying to develop this indexing and querying flow where it converts the > words to its original form (lemm

Stemming with SOLR

2016-12-15 Thread Lasitha Wattaladeniya
Hello devs, I'm trying to develop an indexing and querying flow that converts words to their original form (lemmatization). I have been doing a bit of research lately, but the information on the internet is very limited. I tried using hunspellfactory but it doesn't convert the word to its original

Re: [E] Re: Stemming

2016-06-16 Thread Aurélien MAZOYER
Thanks again so much =) Sas -Original Message- From: Aurélien MAZOYER [mailto:aurelien.mazo...@francelabs.com] Sent: Thursday, June 16, 2016 4:36 PM To: solr-user@lucene.apache.org Subject: Re: [E] Re: Stemming Hi, I was just wondering if you are sure that you query only that field (or

RE: [E] Re: Stemming

2016-06-16 Thread Jamal, Sarfaraz
[mailto:aurelien.mazo...@francelabs.com] Sent: Thursday, June 16, 2016 4:36 PM To: solr-user@lucene.apache.org Subject: Re: [E] Re: Stemming Hi, I was just wondering if you are sure that you query only that field (or fields that use your text_stem analyzer) and not other fields (in your qf for

RE: [E] Re: Stemming

2016-06-16 Thread Jamal, Sarfaraz
Sas -Original Message- From: Aurélien MAZOYER [mailto:aurelien.mazo...@francelabs.com] Sent: Thursday, June 16, 2016 4:20 PM To: solr-user@lucene.apache.org Subject: [E] Re: Stemming Hi, Yes you should have the same resultset. Are you sure that you reindex all the data after changing your schema

Re: [E] Re: Stemming

2016-06-16 Thread Aurélien MAZOYER
ginal Message- From: Aurélien MAZOYER [mailto:aurelien.mazo...@francelabs.com] Sent: Thursday, June 16, 2016 4:20 PM To: solr-user@lucene.apache.org Subject: [E] Re: Stemming Hi, Yes you should have the same resultset. Are you sure that you reindex all the data after changing your schema? Ar

Re: Stemming

2016-06-16 Thread Aurélien MAZOYER
: Hi Guys, I have enabled stemming: In the Admin Analysis, I type in running or runs and they both break down to run. However when I search for run, runs, or running with an actual query - It brings back three different

RE: [E] Re: Stemming

2016-06-16 Thread Jamal, Sarfaraz
They both produced three different sets of results -Original Message- From: Ahmet Arslan [mailto:iori...@yahoo.com.INVALID] Sent: Thursday, June 16, 2016 3:37 PM To: solr-user@lucene.apache.org Subject: [E] Re: Stemming Hi Jamal, Snowball

Re: Stemming

2016-06-16 Thread Ahmet Arslan
ve enabled stemming: In the Admin Analysis, I type in running or runs and they both break down to run. However when I search for run, runs, or running with an actual query - It brings back three different sets of results. Is that correct? I would imagine that

Stemming

2016-06-16 Thread Jamal, Sarfaraz
Hi Guys, I have enabled stemming: In the Admin Analysis, I type in running or runs and they both break down to run. However when I search for run, runs, or running with an actual query - It brings back three different

Re: Stemming Help

2016-06-05 Thread Doug Turnbull
> > > > I am following this tutorial: > > > > > http://thinknook.com/keyword-stemming-and-lemmatisation-with-apache-solr-2013-08-02/ > > > > My (Managed) Schema file looks like this: (in the appropr

Re: Stemming Help

2016-06-05 Thread Georg Sorst
on Fri, June 3, 2016 at 20:12: > Hi Guys, > > I am following this tutorial: > > http://thinknook.com/keyword-stemming-and-lemmatisation-with-apache-solr-2013-08-02/ > > My (Managed) Schema file looks like this: (in the appropriate places) >

Re: [E] Re: Stemming and Managed Schema

2016-06-04 Thread Alexandre Rafalovitch
Oops, ignore. Shawn answered that with a link earlier. I was reading this on the phone Newsletter and resources for Solr beginners and intermediates: http://www.solr-start.com/ On 5 June 2016 at 14:32, Alexandre Rafalovitch wrote: > Isn't just reloading the core via Admin interface suff

Re: [E] Re: Stemming and Managed Schema

2016-06-04 Thread Alexandre Rafalovitch
Isn't just reloading the core via the Admin interface sufficient? I thought that all Solr-driven changes are written out to managed-schema at once, so as long as the core is reloaded right after manual changes, it should be OK. Regards, Alex On 4 Jun 2016 10:12 am, "Erick Erickson" wrote: > Actually

Re: [E] Re: Stemming and Managed Schema

2016-06-03 Thread Erick Erickson
Actually, I prefer to do it the other way: 1> shut down Solr 2> edit managed-schema 3> start Solr. That eliminates any possibility of inadvertently overwriting your changes by issuing a managed schema call. That's a nit though; either will work. FWIW, Erick On Fri, Jun 3, 2016 at 11:02 AM, Sh

Stemming Help

2016-06-03 Thread Jamal, Sarfaraz
Hi Guys, I am following this tutorial: http://thinknook.com/keyword-stemming-and-lemmatisation-with-apache-solr-2013-08-02/ My (Managed) Schema file looks like this: (in the appropriate places) - - - - I have re-indexed everything - It is

Re: [E] Re: Stemming and Managed Schema

2016-06-03 Thread Shawn Heisey
On 6/3/2016 9:22 AM, Jamal, Sarfaraz wrote: > I would edit the managed-schema, make my changes, shutdown solr? And > start it back up and verify it is still there? That's the sledgehammer approach. Simple and effective, but Solr does go offline for a short time. > Or is there another way to rel

RE: [E] Re: Stemming and Managed Schema

2016-06-03 Thread Jamal, Sarfaraz
...@elyograg.org] Sent: Friday, June 3, 2016 11:17 AM To: solr-user@lucene.apache.org Subject: [E] Re: Stemming and Managed Schema On 6/3/2016 9:07 AM, Jamal, Sarfaraz wrote: > I found the following article: > http://thinknook.com/keyword-stemming-and-lemmatisation-with-apache-so > lr-2

Re: Stemming and Managed Schema

2016-06-03 Thread Shawn Heisey
On 6/3/2016 9:07 AM, Jamal, Sarfaraz wrote: > I found the following article: > http://thinknook.com/keyword-stemming-and-lemmatisation-with-apache-solr-2013-08-02/ > > And I want to do stemming on one of our fields. > > However, I am using a Managed Schema and I am unsure ho

Re: Stemming and Managed Schema

2016-06-03 Thread Andrea Gazzarini
Sure, this is the API reference [1], where you can see that you can add types and fields. Andrea [1] https://cwiki.apache.org/confluence/display/solr/Schema+API On 03/06/16 17:07, Jamal, Sarfaraz wrote: Hi Guys, I found the following article: http://thinknook.com/keyword-stemming-and

Stemming and Managed Schema

2016-06-03 Thread Jamal, Sarfaraz
Hi Guys, I found the following article: http://thinknook.com/keyword-stemming-and-lemmatisation-with-apache-solr-2013-08-02/ And I want to do stemming on one of our fields. However, I am using a Managed Schema and I am unsure how to add these two blocks to it - I know there is an API for

Re: Stemming nouns ending in 'y'

2016-05-19 Thread Erick Erickson
> > -Original message- >> From:Mark Vega >> Sent: Thursday 19th May 2016 19:55 >> To: solr-user@lucene.apache.org >> Subject: Stemming nouns ending in 'y' >> >> I am using Apache Nutch v1.10 and SOLR v.5.2.1 to index and search a medical >>

RE: Stemming nouns ending in 'y'

2016-05-19 Thread Markus Jelsma
Hello - try the KStem filter. It is better suited for English and doesn't show this behaviour. Markus -Original message- > From:Mark Vega > Sent: Thursday 19th May 2016 19:55 > To: solr-user@lucene.apache.org > Subject: Stemming nouns ending in 'y' > >

Stemming nouns ending in 'y'

2016-05-19 Thread Mark Vega
I am using Apache Nutch v1.10 and SOLR v.5.2.1 to index and search a medical website and am trying to find out why every stemmer I've tried on certain nouns in medical terminology ending in 'y' merely replaces the ending 'y' with an 'i'. As an example, the term 'osteopathy' stemmed with the Porter

Re: Issue with stemming and lemmatizing

2016-01-15 Thread Jack Krupansky
wrote: > I want to write my own text tokenizer. > My question is how Solr handles stemming or lemmatizing. > Does Solr store both the lemmatized token and the original token together? > I mean, suppose that at index time Solr lemmatizes "creation" to "create", > and at query time the user wants to searc

Issue with stemming and lemmatizing

2016-01-15 Thread sara hajili
I want to write my own text tokenizer. My question is how Solr handles stemming or lemmatizing. Does Solr store both the lemmatized token and the original token together? I mean, suppose that at index time Solr lemmatizes "creation" to "create", and at query time the user wants to search for exactly "creation", not

Re: Stemming words Using Solr

2015-09-04 Thread Ritesh Sinha
String objF = jsonarray.getString(i); > > stemmedWords.add(objF); > > } > > > > String lastStemmedGroup = stemmedWords.get(stemmedWords.size() - > > 1).toString(); > > > > JSONArray finalStemmer = new JSONArray(lastStemmedGroup); > > > >

Re: Stemming words Using Solr

2015-09-04 Thread Upayavira
ct(i); > String stemmedWord = jsonobject.getString("text"); > System.out.println(stemmedWord); > } > > } > > } > > > Here, args[0] is core. > args[1] is the word i'll be sending for stemming. > args[2] is the Ana

Re: Stemming words Using Solr

2015-09-04 Thread Ritesh Sinha
temmer.length(); i++) { JSONObject jsonobject = finalStemmer.getJSONObject(i); String stemmedWord = jsonobject.getString("text"); System.out.println(stemmedWord); } } } Here, args[0] is core. args[1] is the word i'll be sending for stemming. args[

Re: Stemming words Using Solr

2015-09-04 Thread Upayavira
> [quoted Solr Admin UI HTML, markup lost: Dashboard, Logging]

Re: Stemming words Using Solr

2015-09-03 Thread Ritesh Sinha
> [quoted Solr Admin UI HTML, markup lost: Tree, Graph, Graph (Radial), Dump]

Re: Stemming words Using Solr

2015-09-03 Thread Jack Krupansky
> [quoted Solr Admin UI HTML, markup lost: Documentation link to http://lucene.apache.org/solr/ and a link to http://issues.apache.org/jira/browse/SOLR]

Re: Stemming words Using Solr

2015-09-03 Thread Upayavira
> [quoted Solr Admin UI HTML, markup lost: Dump, Core Admin, Java Properties, Thread Dump]

Re: Stemming words Using Solr

2015-09-03 Thread Ritesh Sinha
0' }; On Thu, Sep 3, 2015 at 4:12 PM, Upayavira wrote: > > > On Thu, Sep 3, 2015, at 11:19 AM, Ritesh Sinha wrote: > > I am learning solr and want to use solr for stemming words.I'll be > > passing > > the word to the solr and it should send the stemmed

Re: Stemming words Using Solr

2015-09-03 Thread Upayavira
On Thu, Sep 3, 2015, at 11:19 AM, Ritesh Sinha wrote: > I am learning Solr and want to use it for stemming words. I'll be > passing > a word to Solr and it should send the stemmed word back. I know how > to > configure a Solr core for different stemming patterns and also

Stemming words Using Solr

2015-09-03 Thread Ritesh Sinha
I am learning Solr and want to use it for stemming words. I'll be passing a word to Solr and it should send the stemmed word back. I know how to configure a Solr core for different stemming patterns, and I am also able to view the stemmed words in the analyzer (Solr admin UI), but I a
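
One way to get the stemmed form back programmatically is to ask Solr's field analysis handler, which is what the admin analysis screen relies on. The sketch below only builds the request URL and extracts token texts from an already decoded response fragment; it sends nothing. The host, core name, and field type are placeholders, and the assumed response shape (a list alternating stage names and token lists, each token carrying a "text" key) should be checked against the actual output of your Solr version.

```python
from urllib.parse import urlencode

def build_analysis_url(base_url, core, field_type, word):
    """Build a field-analysis request URL (no request is sent here)."""
    params = urlencode({
        "analysis.fieldtype": field_type,
        "analysis.fieldvalue": word,
        "wt": "json",
    })
    return f"{base_url}/{core}/analysis/field?{params}"

def last_stage_tokens(stages):
    """Return the token texts produced by the last stage of the chain.

    `stages` is assumed to alternate stage names and token lists, for example
    ["SomeTokenizer", [{"text": "running"}], "SomeFilter", [{"text": "run"}]].
    """
    token_lists = [s for s in stages if isinstance(s, list)]
    if not token_lists:
        return []
    return [tok["text"] for tok in token_lists[-1]]

if __name__ == "__main__":
    # Placeholder host, core and field type; adjust to your installation.
    print(build_analysis_url("http://localhost:8983/solr", "mycore", "text_en", "running"))
    # Hand-written sample in the assumed shape, not real Solr output.
    sample = ["HypotheticalTokenizer", [{"text": "running"}],
              "HypotheticalStemFilter", [{"text": "run"}]]
    print(last_stage_tokens(sample))
```

Taking the tokens of the last stage mirrors the approach in the Java fragments quoted elsewhere in this thread, which read the "text" value from the final group of the analysis output.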

Re: Stemming Issue

2015-07-28 Thread Alessandro Benedetti
As the documentation says, the KStemFilter is a light (not very aggressive) English stemmer. As Ahmet rightly asked, are you preceding that filter with a lowercase one? What exactly is the stemming you get that doesn't convince you? Cheers 2015-07-28 0:16 GMT+

Re: Stemming Issue

2015-07-27 Thread Ahmet Arslan
hema for a custom field type. When I use the Solr Analysis interface on the words, I get strange behavior. E.g. if I add the keyword "Supplies" I am not getting anything like "Supply". Is this behavior because of the KStem? Is there any other stemming algorithm that can fix this issue? Thanks Ravi

Stemming Issue

2015-07-27 Thread EXTERNAL Taminidi Ravi (ETI, AA-AS/PAS-PTS)
use of the KStem? Is there any other stemming algorithm that can fix this issue? Thanks Ravi

Re: term frequency with stemming

2015-07-27 Thread Aki Balogh
Hi Alessandro, I'm counting word frequencies on a site. All I want to do is, I want to count "running" and "run" as the same topic. It's not really fuzzy matching I believe -- i.e. I wouldn't want to match "running" and "sprinting". I thi
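
Counting "running" and "run" as the same topic comes down to counting stems instead of surface forms. The Python sketch below shows that idea on a plain string; it is independent of Solr, and `toy_stem` is a hypothetical two-rule stemmer used only so the example runs on its own.

```python
import re
from collections import Counter

def toy_stem(word):
    """Hypothetical stemmer: handles a trailing 'ing' or plural 's' only."""
    if word.endswith("ing") and len(word) > 5:
        word = word[:-3]
        if len(word) > 2 and word[-1] == word[-2]:
            word = word[:-1]  # undouble the final consonant: "runn" -> "run"
    elif word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]
    return word

def stem_frequencies(text):
    """Count how often each stem occurs in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(toy_stem(t) for t in tokens)

if __name__ == "__main__":
    text = "Running is fun. She runs daily and I run too; sprinting is different."
    counts = stem_frequencies(text)
    print(counts["run"], counts["sprint"])
```

The three surface forms of "run" are merged into one count, while "sprinting" is counted separately, which matches the distinction drawn in the message between stemming and fuzzy matching.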

Re: term frequency with stemming

2015-07-27 Thread Alessandro Benedetti
an run fuzzy queries with the edit distance (by default calculated with a Levenshtein automaton). This will allow you to run your fuzzy query and leave your index terms as you want (without affecting the term frequency in this way). Can you give us more details about your use of stemming?

Re: term frequency with stemming

2015-07-25 Thread Aki Balogh
"experience" -> "experi") before being passed, i.e. termfreq(body, "end-us experi"). From what I can tell, FunctionQuery / termfreq doesn't have a way to apply stemming. Akos (Aki) Balogh Co-Founder, MarketMuse https://www.MarketMuse.com <https://www.mark

Re: term frequency with stemming

2015-07-24 Thread Darin Amos
Hi Dale, I would think the coffee shop is better, I have in-laws visiting at home. Thanks Darin > On Jul 24, 2015, at 12:04 PM, Aki Balogh wrote: > > Hi All, > > I'm using TermVectorComponent and stemming (Porter) in order to get term > frequencies with fuzzy ma
