nice tip. i appreciate it!

--
*John Blythe*
Product Manager & Lead Developer

251.605.3071 | j...@curvolabs.com
www.curvolabs.com

58 Adams Ave
Evansville, IN 47713

On Mon, Feb 1, 2016 at 4:55 PM, Erik Hatcher <erik.hatc...@gmail.com> wrote:

> And if you want to have the “kept” words stored, consider the trick used
> in example/files for url/e-mail extraction mentioned here (note the related
> fix in the patch in the JIRA issue mentioned):
>
> https://lucidworks.com/blog/2016/01/27/example_files/
>
> > On Feb 1, 2016, at 3:23 PM, John Blythe <j...@curvolabs.com> wrote:
> >
> > i immediately realized after sending that i'd had stored="true" in the
> > field's config and that it was storing the original data, not the
> > processed data. silly me, thanks anyway!
> >
> > --
> > *John Blythe*
> > Product Manager & Lead Developer
> >
> > 251.605.3071 | j...@curvolabs.com
> > www.curvolabs.com
> >
> > 58 Adams Ave
> > Evansville, IN 47713
> >
> > On Mon, Feb 1, 2016 at 3:18 PM, John Blythe <j...@curvolabs.com> wrote:
> >
> >> hi all,
> >>
> >> i'm having trouble with what would seem to be a pretty straightforward
> >> filter.
> >>
> >> i'm trying to 'tag' documents based off of a list of relevant words
> >> that a description field may contain. if the data contains any of the
> >> words then this field is populated with it and acts as a quick
> >> reference for relevant/bucketed documents.
> >>
> >> i receive no errors when reloading the core or indexing the data. each
> >> document, however, has its description listed in this tag field *even if
> >> none of the targeted words are in it.*
> >>
> >> here's the analyzer, tokenizer, and filter:
> >>
> >> <analyzer>
> >>   <tokenizer class="solr.StandardTokenizerFactory" />
> >>   <filter class="solr.KeepWordFilterFactory" words="tags.txt"
> >>           ignoreCase="true"/>
> >> </analyzer>
> >>
> >> to add to the confusion, when i run test data through both of the
> >> appropriate FieldName/FieldType in the Analysis UI I get the expected
> >> results: the non-targeted words are left out of processing.
> >>
> >> thanks for any info/help-
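
A minimal sketch of the stored-versus-indexed behaviour described in this thread, not the poster's actual schema. The analyzer is the one quoted above; the names `tag_words`, `tags`, and `description`, and the use of `copyField` to populate the tag field, are assumptions made only for illustration. In Solr, a stored field returns the original input text, while the analysis chain (including `KeepWordFilterFactory`) affects only the indexed terms, which would explain why the Analysis UI looked right while the returned field still showed the full description.

```xml
<!-- Hypothetical field type name; the analyzer is the one from the thread. -->
<fieldType name="tag_words" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Only tokens listed in tags.txt survive into the index. -->
    <filter class="solr.KeepWordFilterFactory" words="tags.txt"
            ignoreCase="true"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field name. With stored="true" the raw description text
     would come back in results, since stored values bypass the analyzer. -->
<field name="tags" type="tag_words" indexed="true" stored="false"/>

<!-- Assumption: the tag field is fed from the description field. -->
<copyField source="description" dest="tags"/>
```

With `stored="false"` the field no longer appears in returned documents, but queries against it match only the kept words. Getting the kept words back as stored values needs a different approach, such as the example/files trick linked above.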