Thanks Jack. So - if I understand (all email feedback thus far) correctly:
- Upgrading to a newer version (5.5 - 6.0) is vital.
- EnglishMinimalStemFilter: upgrading to v5.5 - 6.0 will NOT help with the
  stemming issues, as the code has not been updated.
- PorterStemFilter: has been updated to work better in v5.5 - 6.0.
- Or, perhaps we just need a stemmer that is more dictionary-based
  (Hunspell?), or inflectional (any suggestions?)

Thanks again all, for your patience and time!

Sara

> On Apr 14, 2016, at 3:51 PM, Jack Krupansky <jack.krupan...@gmail.com> wrote:
>
> BTW, I did check and that stemmer code is the same today as it was in 3.x, so
> there should be no change in stemmer behavior there.
>
> -- Jack Krupansky
>
> On Thu, Apr 14, 2016 at 3:47 PM, Sara Woodmansee <swood...@gmail.com> wrote:
>
>> Hi Shawn,
>>
>> Thanks so much for the feedback. And for the heads-up regarding (the bad form
>> of) starting a new discussion from an existing one. Thought removing all
>> content wouldn’t track to original. (Sigh). This is what you get when you
>> have photographers posting to high-end forums.
>>
>> Thanks Erick, regarding upgrading to v5. We actually just removed all
>> test data from the site, so we can now upload all the true, final files and
>> metadata. In some ways this could be a perfect time to upgrade to v5 (if I
>> can talk the developer into it) since all metadata has to be re-ingested
>> anyway.
>>
>> All best,
>> Sara
>>
>>
>>> On Apr 14, 2016, at 3:31 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>>>
>>> re: upgrading to 5.x... 5.x Solr versions are NOT guaranteed to read 3.x
>>> indexes; you'd have to go through 4.x to do that.
>>>
>>> If you can re-index from scratch that would be best.
>>>
>>> Best,
>>> Erick
>>>
>>>
>>>> On Apr 14, 2016, at 3:29 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>>>>
>>>> On 4/14/2016 11:17 AM, Sara Woodmansee wrote:
>>>>> I posted yesterday, however I never received my own post, so worried
>>>>> it did not go through (?)
>>>>
>>>> I *did* see your previous message, but couldn't immediately think of
>>>> anything constructive to say. I've had a little bit of time on my lunch
>>>> break today to look deeper.
>>>>
>>>> EnglishMinimalStemFilter is designed to *not* aggressively stem
>>>> everything it sees. It appears that the behavior you are seeing is
>>>> probably intentional with that filter.
>>>>
>>>> In 5.5.0 and 6.0.0, PorterStemFilter will handle words of the form you
>>>> mentioned correctly. In the screenshot below, PSF means
>>>> "PorterStemFilter". I did not check any earlier versions. I already
>>>> had these versions on my system.
>>>>
>>>> https://www.dropbox.com/s/ss48vinrtbgifce/stemmer-ee-es-6.0.0.png?dl=0
>>>>
>>>> https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#Stemming
>>>>
>>>> That version of Solr is over four years old. Bugs in 3.x will *not* be
>>>> fixed. Bugs in 4.x will also not be fixed. On 5.x, only extremely
>>>> major bugs are likely to get any attention, and this does not qualify as
>>>> a major bug.
>>>>
>>>> ------------
>>>>
>>>> On another matter:
>>>>
>>>> http://people.apache.org/~hossman/#threadhijack
>>>>
>>>> You replied to a message with the subject "Solr Support for BM25F" ...
>>>> so your message is showing up within that thread.
>>>>
>>>> https://www.dropbox.com/s/xi0o8z6smhd2n5d/woodmansee-thread-hijack.png?dl=0
>>>>
>>>> Thanks,
>>>> Shawn