Re: Filter Factory question

2017-09-29 Thread Emir Arnautović
It is still on master: https://github.com/apache/lucene-solr/blob/master/lucene/analysis/common/src/java/org/apache/lucene/analysis/pattern/PatternCaptureGroupTokenFilter.java

Re: Filter Factory question

2017-09-28 Thread Erick Erickson
PatternCaptureGroupTokenFilter has been around since 2013 (at least that's the earliest revision in Git). I found it even in 5.x, so it should be there under ...lucene/analysis/common/src/java/org/apache/lucene/analysis/pattern Best, Erick On Thu, Sep 28, 2017 at 7:45 AM, Webster Homer wrote: > It'

Re: Filter Factory question

2017-09-28 Thread Webster Homer
It's still buggy, so not ready to share. I keep a copy of the Solr source which I use for this type of development. I don't see PatternCaptureGroupTokenFilterFactory in the Solr 6.2 code base at all. I was thinking of seeing how it treated the positions, etc. My code now looks reasonable in the Anal

Re: Filter Factory question

2017-09-27 Thread Stefan Matheis
> In any case I figured out my problem. I was overthinking it. Care to share? -Stefan On Sep 27, 2017 4:34 PM, "Webster Homer" wrote: > There is a need for a special filter since the input has to be normalized. > That is the main requirement, splitting into pieces is optional. As far as > I k

Re: Filter Factory question

2017-09-27 Thread Webster Homer
There is a need for a special filter since the input has to be normalized. That is the main requirement; splitting into pieces is optional. As far as I know there is nothing in Solr that knows about molecular formulas. In any case I figured out my problem. I was overthinking it. On Wed, Sep 27,

Re: Filter Factory question

2017-09-27 Thread Emir Arnautović
Hi Homer, There is no need for a special filter; there is one that is, for some reason, not part of the documentation (I will ask why, so follow that thread if you decide to go this way): You can use something like: This will capture all atom counts as separate tokens. HTH, Emir > On 26 Sep 2017, at 23:1
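
The configuration Emir posted did not survive in this archive, so the following is only a rough illustration of the idea of emitting each "atom plus count" as its own token. It is a plain Python sketch, not Lucene or Solr code. The regular expression, the function name capture_group_tokens and the preserve_original flag are all assumptions made for this example, and the sketch ignores token positions and offsets entirely.

```python
import re

# Assumed pattern (not from the thread): one element symbol such as C or Na,
# followed by an optional count.
ATOM_WITH_COUNT = re.compile(r"([A-Z][a-z]?\d*)")


def capture_group_tokens(token, pattern=ATOM_WITH_COUNT, preserve_original=True):
    """Return the tokens a capture-group style filter might emit for one input token.

    Hypothetical helper for illustration only; it does not reproduce the
    behaviour of any Lucene class.
    """
    tokens = [token] if preserve_original else []
    for match in pattern.finditer(token):
        for group in match.groups():
            if not group:
                continue  # skip empty captures
            if preserve_original and group == token:
                continue  # the whole token was already emitted once
            tokens.append(group)
    return tokens


if __name__ == "__main__":
    for formula in ("C6H12O6", "NaCl", "H2O"):
        print(formula, capture_group_tokens(formula))
```

With this assumed pattern, a formula such as C6H12O6 yields the original token plus C6, H12 and O6. Normalizing the input (the main requirement Webster mentions) would still have to happen before this step.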