It looks like the fact that this duplicate token is generated by
WordDelimiter after StopFilter means that it's not filtered out.
In any case, a search on "david david" against this field does find
documents with values like "David's" as well as "David, David,
David..."
Michael Della Bitta
-
Yes, that had occurred to me too, but I wasn't exposed to the original
query from the developer who was having the trouble, just the text and
strange analysis. I'll confer with him to make sure there's actually
something to work on here.
Michael Della Bitta
---
I agree that it would make more sense for the catenated word ("johnsons") to
be at the same position as the leading word ("johnson").
But, what are some example queries that would "fail" given this behavior?
"johnson and johnson" would not falsely match since you have position
increment enable
> I have UPPER12-lower and would like
> to be able to find it with queries
> "UPPER" or "lower". What should break this up for the
> index? A
> tokenizer or a filter such as WordDelimiterFilterFactory?
If that's all you want, LowerCaseTokenizer alone will be enough.
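To illustrate why that works: LowerCaseTokenizer splits text at non-letters and lowercases each piece. The snippet below is only a rough Python stand-in for that behavior, not Lucene code; the function name `lowercase_tokenize` is made up for this example.

```python
import re


def lowercase_tokenize(text: str) -> list[str]:
    """Rough stand-in for LowerCaseTokenizer (hypothetical helper, not Lucene).

    Splits the input at every non-letter character and lowercases each piece.
    """
    # [^\W\d_]+ matches runs of letters only: word characters minus
    # digits and the underscore.
    return [piece.lower() for piece in re.findall(r"[^\W\d_]+", text)]


tokens = lowercase_tokenize("UPPER12-lower")
print(tokens)  # ['upper', 'lower']
```

With tokens like these in the index, a query for "UPPER" or "lower" matches as long as the query side is lowercased the same way. Note that the digits are dropped entirely, so "12" would not be searchable.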
On Mon, Jan 19, 2009 at 9:42 PM, David Shettler wrote:
> Thank you Shalin, I'm in the process of implementing your suggestion,
> and it works marvelously. Had to upgrade to Solr 1.3, and had to hack
> up acts_as_solr to function correctly.
>
> Is there a way to receive a search for a given field
Thank you Shalin, I'm in the process of implementing your suggestion,
and it works marvelously. Had to upgrade to Solr 1.3, and had to hack
up acts_as_solr to function correctly.
Is there a way to receive a search for a given field, and have Solr
know to automatically check the two fields? I sup
Hi Dave,
A quick experiment found that the following field types work with
your queries. Add one as a copyField to the other and search on both:
I added the following test to
Sorry I typed without thinking too much. Please disregard my previous mail.
I'll run a few tests and let you know.
On Sat, Jan 17, 2009 at 2:46 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:
> Hi Dave,
>
> There is an attribute on the WordDelimiterFilterFactory preserveOriginal="true"
> wh
Hi Dave,
There is an attribute on WordDelimiterFilterFactory, preserveOriginal="true",
which should keep the original string. I think if you put LowerCaseFilter
before WordDelimiterFilterFactory with the preserveOriginal setting, it should do
what you have outlined.
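As a rough picture of the suggested chain, here is a simplified Python emulation: lowercase first, then split on delimiters and letter/digit boundaries while keeping the original token. It is only a sketch under stated assumptions. The function `word_delimiter_sketch` is hypothetical, handles ASCII only, and ignores positions and the many other WordDelimiterFilterFactory options.

```python
import re


def word_delimiter_sketch(token: str, preserve_original: bool = True) -> list[str]:
    """Simplified, hypothetical model of LowerCaseFilter followed by a
    word-delimiter step with preserveOriginal (ASCII only, no positions)."""
    lowered = token.lower()  # LowerCaseFilter runs first in the suggested chain

    # Split on anything that is not a letter or digit, and also at
    # letter/digit boundaries, by collecting runs of letters or digits.
    parts = re.findall(r"[a-z]+|[0-9]+", lowered)

    out = []
    # Emit the unsplit token too, but only when splitting changed something.
    if preserve_original and parts != [lowered]:
        out.append(lowered)
    out.extend(parts)
    return out


print(word_delimiter_sketch("UPPER12-lower"))
# ['upper12-lower', 'upper', '12', 'lower']
```

One consequence of lowercasing first is that case changes inside a token are no longer available as split points, which does not matter for an input like "UPPER12-lower" but would for something like "PowerShot".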
On Sat, Jan 17, 2009 at 8:57 AM, David