From: Jack Krupansky
To: solr-user@lucene.apache.org; Mike L.
Sent: Sunday, April 5, 2015 8:23 AM
Subject: Re: WordDelimiterFilterFactory - tokenizer question
You have to tell the filter what types of tokens to generate - words,
numbers. You told it to generate... nothing. You did tell it to preserve
the original, unfiltered token though, which is fine.
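
For illustration, here is a minimal Python sketch of the point above. It is not Solr's implementation: the function name and its flags are hypothetical stand-ins for the filter's generateWordParts, generateNumberParts and preserveOriginal attributes, and it only splits on letter/digit boundaries and non-alphanumeric characters.

```python
import re

def word_delimiter_sketch(token, generate_word_parts=False,
                          generate_number_parts=False,
                          preserve_original=False):
    """Simplified stand-in for a word-delimiter filter (hypothetical, not Solr code)."""
    # The original token is only kept when asked for.
    tokens = [token] if preserve_original else []
    # Split into runs of letters and runs of digits; anything else is a delimiter.
    for part in re.findall(r"[A-Za-z]+|[0-9]+", token):
        wanted = generate_number_parts if part.isdigit() else generate_word_parts
        # Avoid emitting the original token twice when nothing was split.
        if wanted and not (preserve_original and part == token):
            tokens.append(part)
    return tokens

# Only "preserve original" enabled: nothing else is generated.
print(word_delimiter_sketch("US100A", preserve_original=True))
# Word and number parts enabled as well.
print(word_delimiter_sketch("US100A", True, True, True))
```

With every "generate" flag off, the sketch emits only the preserved original, which mirrors the "You told it to generate... nothing" situation described above.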
-- Jack Krupansky
On Sun, Apr 5, 2015 at 3:39 AM, Mike L. wrote:
Solr User Group,
I have a non-multivalued field which contains stored values similar to this:
US100AUS100BUS100CUS100-DUS100BBA
My assumption is - If I tokenized with the below fieldType definition,
specifically the WDF -splitOnNumbers and the LowerCaseFilterFactory would have
provided
AM, Upayavira wrote:
> >
> > > Have you tried a WhitespaceTokenizerFactory followed by the
> > > WordDelimiterFilterFactory? The latter is perhaps more configurable at
> > > what it does. Alternatively, you could use a RegexFilterFactory to
> > > remove extran
> > On Sat, Dec 7, 2013, at 06:15 PM, Vulcanoid Developer wrote:
Hi,
I am new to solr and I guess this is a basic tokenizer question so please
bear with me.
I am trying to use SOLR to index a few (Indian) legal judgments in text
form and search against them. One of the key points with these documents is
that the sections/provisions of law usually have
To: solr-user@lucene.apache.org
Subject: Tokenizer question
I could not find a solution to that in the documentation or the mailing
list, so here's my question.
I have files following the pattern: firstname_lastname_employeenumber.jpg
I'm able to search for the single terms firstnam
View this message in context:
http://lucene.472066.n3.nabble.com/Tokenizer-question-tp4016932.html
Sent from the Solr - User mailing list archive at Nabble.com.
Thanks, I'll try that now, I'm assuming I need to add the position
increment and offset attributes?
On Wed, Jul 20, 2011 at 3:44 PM, Chris Hostetter wrote:
When the QueryParser gives hunks of text to an analyzer, and that analyzer
produces multiple terms, the query parser has to decide how to build a
query out of it.
If the terms have identical position information, then it always builds an
"OR" query (this is the typical synonym situation). If
My use case really isn't names, I just used that as a simplification.
I did look at the Synonym filter to see if I could implement a similar
filter (if that was a more appropriate place to do so) but even after
doing that I ended up with the same result.
On Wed, Jul 20, 2011 at 12:07 PM, Kyle Lee wrote:
I'm not sure how to accomplish what you're asking, but have you considered
using a synonyms file? This would also allow you to catch ostensibly
unrelated name substitutes such as Robert -> Bob and Richard -> Dick.
On Wed, Jul 20, 2011 at 10:57 AM, Jamie Johnson wrote:
I have a query which starts out with something like name:"john", I
need to expand this to something like name:("john" "johnny"). I've
implemented a custom tokenizer which gets close, but isn't quite right:
it outputs name:"john johnny". Is there a simple example of doing
what I'm attempting?
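
As a rough illustration of the behavior Chris Hostetter describes in this thread, the Python sketch below models how a query might be assembled from analyzer output. It is an assumption-laden toy, not the Lucene QueryParser API: build_query_sketch and the (term, position_increment) tuples are hypothetical names chosen for the example.

```python
def build_query_sketch(field, tokens):
    """Model of the OR-vs-phrase decision (hypothetical helper, not Lucene code).

    tokens is a list of (term, position_increment) pairs, in the order an
    analyzer might emit them.
    """
    if not tokens:
        return ""
    terms = [term for term, _ in tokens]
    if len(terms) == 1:
        return f"{field}:{terms[0]}"
    # An increment of 0 means "same position as the previous token".
    same_position = all(inc == 0 for _, inc in tokens[1:])
    if same_position:
        # All terms stacked on one position: optional (SHOULD) clauses, i.e. OR.
        return f"{field}:(" + " ".join(terms) + ")"
    # Terms at different positions: treated as a phrase.
    return f'{field}:"' + " ".join(terms) + '"'

# "johnny" stacked on the same position as "john" (increment 0).
print(build_query_sketch("name", [("john", 1), ("johnny", 0)]))
# "johnny" at the next position (increment 1).
print(build_query_sketch("name", [("john", 1), ("johnny", 1)]))
```

Under these assumptions, a custom tokenizer that leaves the default position increment on the extra term would end up with the phrase form, which matches the name:"john johnny" output reported above.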
Crystal clear. Thanks for your response & time!
--
View this message in context:
http://old.nabble.com/Tokenizer-question-tp27099119p27123281.html
Sent from the Solr - User mailing list archive at Nabble.com.
>
> If the analyzer produces multiple Tokens, but they all have the same
> position then the QueryParser produces a BooleanQuery with all SHOULD
> clauses. -- This is what allows simple synonyms to work.
>
You rock Hoss!!! This is exactly the explanation I was looking for .. it is
as simple as it
: q=PostCode:(1078 pw)+AND+HouseNumber:(39-43)
:
: the resulting parsed query contains a phrase query:
:
: +(PostCode:1078 PostCode:pw) +PhraseQuery(HouseNumber:"39 43")
This stems from some fairly fundamental behavior in the QueryParser ...
each "chunk" of input that isn't deemed "markup (ie:
>> 1. WordDelimiterFilterFactory with generateNumberParts=1 but this results in
>> a phrase query
>> 2. PatternTokenizerFactory that splits on (\s+|-).
>>
>> But both options don't work.
>>
>> Any suggestions on how to get rid of the phrase query?
>
> Thanks,
>
> Richard
> --
> View this message in context:
> http://old.nabble.com/Tokenizer-question-tp27099119p27099119.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
--
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem using Solr/Lucene:
http://www.lucidimagination.com/search
: as far as I know solr.StrField is not analyzed but it is indexed as is
: (verbatim).
correct ... but there is definitely a bug here if the analysis.jsp
is implying that an analyzer is being used...
https://issues.apache.org/jira/browse/SOLR-1086
-Hoss
--
View this message in context:
http://www.nabble.com/Field-tokenizer-question-tp22594575p22653356.html
Sent from the Solr - User mailing list archive at Nabble.com.
Ashish P wrote:
I have created a field,
Set class="solr.TextField" instead of class="solr.StrField" in your
fieldType definition.
Then reindex and commit.
Koji
committed.
Am I missing something here?
--
View this message in context:
http://www.nabble.com/Field-tokenizer-question-tp22594575p22594575.html
Sent from the Solr - User mailing list archive at Nabble.com.