Thanks, Jack.

So, if I understand all the email feedback thus far correctly:

- Upgrading to a newer version (5.5 - 6.0) is vital.

- EnglishMinimalStemFilter: upgrading to v5.5 - 6.0 will NOT help with the
stemming issues, as that code has not changed since 3.x.

- PorterStemFilter: handles the word forms in question correctly in v5.5 - 6.0.

- Or, perhaps we just need a stemmer that is more dictionary-based (Hunspell?),
or inflectional (any suggestions?)
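
For anyone following along, here is a rough sketch of why a "minimal" plural stemmer leaves some words alone. It assumes the rules of Harman's S-stemmer, which the Lucene documentation cites as the basis for the English minimal stemmer. It is not the Lucene code, the function name s_stem is made up for illustration, and the sample words are my own rather than the ones from the original question.

```python
def s_stem(word: str) -> str:
    """Illustrative Harman-style S-stemmer: plural suffixes only.

    Hypothetical helper, not Lucene's implementation. Only the first
    matching suffix rule is considered; if its exception list blocks
    it, the word is returned unchanged.
    """
    w = word.lower()
    if len(w) < 4 or not w.endswith("s"):
        return w
    if w.endswith("ies"):
        # "ies" -> "y", except after "a" or "e"
        if w.endswith(("aies", "eies")):
            return w
        return w[:-3] + "y"
    if w.endswith("es"):
        # "es" -> "e", except "aes", "ees", "oes" (left untouched)
        if w.endswith(("aes", "ees", "oes")):
            return w
        return w[:-1]
    if w.endswith(("us", "ss")):
        # not plurals: "virus", "glass"
        return w
    return w[:-1]


for sample in ["ponies", "horses", "cats", "employees", "glass"]:
    print(sample, "->", s_stem(sample))
# ponies -> pony
# horses -> horse
# cats -> cat
# employees -> employees
# glass -> glass
```

Under these assumed rules a word ending in "ees" is deliberately left unstemmed, which would be consistent with the behavior being intentional rather than a bug.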

Thanks again all, for your patience and time!
Sara

> On Apr 14, 2016, at 3:51 PM, Jack Krupansky <jack.krupan...@gmail.com> wrote:
> 
> BTW, I did check and that stemmer code is the same today as it was in 3.x, so 
> there should be no change in stemmer behavior there.
> 
> -- Jack Krupansky
> 
> On Thu, Apr 14, 2016 at 3:47 PM, Sara Woodmansee <swood...@gmail.com> wrote:
> 
>> Hi Shawn,
>> 
>> Thanks so much for the feedback. And for the heads-up regarding (the bad form
>> of) starting a new discussion from an existing one. Thought removing all
>> content wouldn’t track to original. (Sigh). This is what you get when you
>> have photographers posting to high-end forums.
>> 
>> Thanks Erick, regarding upgrading to v5.  We actually just removed all
>> test data from the site, so we can now upload all the true, final files and
>> metadata. In some ways this could be a perfect time to upgrade to v5 (if I
>> can talk the developer into it) since all metadata has to be re-ingested
>> anyway.
>> 
>> All best,
>> Sara
>> 
>> 
>>> On Apr 14, 2016, at 3:31 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>>> 
>>> re: upgrading to 5.x... Solr 5.x is NOT guaranteed to read 3.x indexes;
>>> you'd have to go through 4.x to do that.
>>> 
>>> If you can re-index from scratch that would be best.
>>> 
>>> Best,
>>> Erick
>>> 
>>> 
>>>> On Apr 14, 2016, at 3:29 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>>>> 
>>>> On 4/14/2016 11:17 AM, Sara Woodmansee wrote:
>>>>> I posted yesterday, however I never received my own post, so worried
>>>>> it did not go through (?)
>>>> 
>>>> I *did* see your previous message, but couldn't immediately think of
>>>> anything constructive to say.  I've had a little bit of time on my lunch
>>>> break today to look deeper.
>>>> 
>>>> EnglishMinimalStemFilter is designed to *not* aggressively stem
>>>> everything it sees.  It appears that the behavior you are seeing is
>>>> probably intentional with that filter.
>>>> 
>>>> In 5.5.0 and 6.0.0, PorterStemFilter will handle words of the form you
>>>> mentioned correctly.  In the screenshot below, PSF means
>>>> "PorterStemFilter".  I did not check any earlier versions.  I already
>>>> had these versions on my system.
>>>> 
>>>> https://www.dropbox.com/s/ss48vinrtbgifce/stemmer-ee-es-6.0.0.png?dl=0
>>>> 
>>>> https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#Stemming
>>>> 
>>>> That version of Solr is over four years old.  Bugs in 3.x will *not* be
>>>> fixed.  Bugs in 4.x will also not be fixed.  On 5.x, only extremely
>>>> major bugs are likely to get any attention, and this does not qualify as
>>>> a major bug.
>>>> 
>>>> ------------
>>>> 
>>>> On another matter:
>>>> 
>>>> http://people.apache.org/~hossman/#threadhijack
>>>> 
>>>> You replied to a message with the subject "Solr Support for BM25F" ...
>>>> so your message is showing up within that thread.
>>>> 
>>>> 
>>>> https://www.dropbox.com/s/xi0o8z6smhd2n5d/woodmansee-thread-hijack.png?dl=0
>>>> 
>>>> Thanks,
>>>> Shawn
>>>> 
>>> 
>> 
