With Solr 4.10.3 I was advised to set luceneMatchVersion to "4.3" to make
hit highlighting work with NGram/EdgeNGram fields, like this:
In Solr 5 and 5.1 this seems to not work any more.
The complete word is highlighted, not just the part that matches the
search term.
In Solr admi
I am interested in a new filter type, one that would combine edgengram
and ngram. The idea is that it would create all ngrams specified by the
min/max size, but the ngrams that happen to be edgengrams (specifically
the left side) would get an index-time boost. Optionally the boost
would be
Try using KeywordTokenizerFactory instead of StandardTokenizerFactory to get
the results you want.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Problem-using-EdgeNGram-tp3355132p3355211.html
Sent from the Solr - User mailing list archive at Nabble.com.
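To illustrate the KeywordTokenizerFactory suggestion above, here is a minimal plain-Python sketch. It is not Solr code: whitespace splitting is only a crude stand-in for StandardTokenizerFactory, and the sample string, the helper names and the gram sizes are assumptions made for this example.

```python
def edge_ngrams(token, min_gram=1, max_gram=25):
    """Front edge n-grams of one token: 'app' -> ['a', 'ap', 'app']."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]

def index_terms(text, keyword_tokenizer):
    """Return the set of terms that would be indexed for `text`.

    keyword_tokenizer=True  -> the whole input is a single token
    keyword_tokenizer=False -> split on whitespace (rough stand-in for a
                               word tokenizer; an assumption of this sketch)
    """
    text = text.lower()
    tokens = [text] if keyword_tokenizer else text.split()
    terms = set()
    for tok in tokens:
        terms.update(edge_ngrams(tok))
    return terms

doc = "apple milk shake"  # hypothetical sample value
per_word = index_terms(doc, keyword_tokenizer=False)
whole = index_terms(doc, keyword_tokenizer=True)

print("mi" in per_word)    # True: "milk" contributes its own prefixes
print("mi" in whole)       # False: only prefixes of the whole string exist
print("apple m" in whole)  # True
```

With a word-level tokenizer every word contributes prefixes, so a query for "mi" hits a record that merely contains "milk". With a single token, only prefixes of the entire value are indexed, which is the "starts with" behaviour the original poster expected.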
Hi,
I am using Solr 3.3 with SolrJ. I am trying to use EdgeNGram to power the
auto-suggest feature in my application. My understanding is that using
EdgeNGram means results are returned only for records starting with the
search criteria, but this is not happening for me.
For example if
> >> > this way, at query time "abcdefg" won't be turned to "a ab abc abcd abcde
> >> > abcdef abcdefg". At index time it will.
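The point quoted above (expand into edge n-grams at index time only, not at query time) can be sketched in plain Python. This is an illustration under assumptions, not Solr's analysis chain: minGramSize=1 is assumed, maxGramSize=25 is taken from the quoted field definition, the function names are made up for the sketch, and a hit is assumed to need only one shared term (OR semantics).

```python
def edge_ngrams(token, min_gram=1, max_gram=25):
    """Front edge n-grams: 'abc' -> ['a', 'ab', 'abc']."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]

def index_analyzer(value):
    # Index side: always expand the value into its edge n-grams.
    return set(edge_ngrams(value.lower()))

def query_analyzer(value, gram_at_query_time):
    # Query side: either expand as well, or keep the input as a single term.
    value = value.lower()
    return set(edge_ngrams(value)) if gram_at_query_time else {value}

def matches(query, indexed_terms, gram_at_query_time):
    # Assumed OR semantics: one shared term is enough for a hit.
    return bool(query_analyzer(query, gram_at_query_time) & indexed_terms)

indexed = index_analyzer("abcdefg")
print(sorted(indexed, key=len))
# ['a', 'ab', 'abc', 'abcd', 'abcde', 'abcdef', 'abcdefg']

print(matches("abcxyz", indexed, gram_at_query_time=True))   # True: shares 'a', 'ab', 'abc'
print(matches("abcxyz", indexed, gram_at_query_time=False))  # False
print(matches("abc", indexed, gram_at_query_time=False))     # True: a real prefix
```

When the query is expanded too, "abcxyz" matches "abcdefg" through the short shared grams. Keeping the query as one term restricts hits to values that actually start with what was typed.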
> Tomás
>> >
>> >
>> > On Tue, May 31, 2011 at 1:07 PM, Brian Lamb <
>> brian.l...@journalexperts.com
>> > > wrote:
>> >
>> >> > >> positionIncrementGap="1000">
>> >>
>> >>
> >> >> maxGramSize="25" side="front" />
> >>
> >>
> >>
> >> I believe I used that link when I initially set up the field and it
> worked
> >> great (and I'm still using it in other places). In this particular
side="front" />
>
>
>
> I believe I used that link when I initially set up the field and it worked
> great (and I'm still using it in other places). In this particular example
> however it does not appear to be practical for me. I mentioned that I have
>
f and in the case of an edgengram,
it returns 1 * length of the search string.
Thanks,
Brian Lamb
On Tue, May 31, 2011 at 11:34 AM, bmdakshinamur...@gmail.com <
bmdakshinamur...@gmail.com> wrote:
> Can you specify the analyzer you are using for your queries?
>
> May be you could u
For this, I ended up just changing it to string and using "abcdefg*" to
match. That seems to work so far.
Thanks,
Brian Lamb
On Wed, May 25, 2011 at 4:53 PM, Brian Lamb
wrote:
> Hi all,
>
> I'm running into some confusion with the way edgengram works. I ha
Hi all,
I'm running into some confusion with the way edgengram works. I have the
field set up as:
I've also set up my own similarity class that returns 1 as the idf score.
What I've found this does is if I match a string "abcdefg" a
I have the following field defined in my schema:
I have the default field set to "person" and have indexed the
following document:
The following queries return the result as expec
Hi Erick,
If you have time, can you please take a look and provide your comments or
suggestions for this problem?
Please let me know if you need any more information.
Thanks,
Johnny
--
View this message in context:
http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore
x it is
mentioned in WIKI.
http://wiki.apache.org/solr/TermsComponent
Am I going wrong anywhere?
Please let me know if you need any more info.
Thanks,
Johnny
--
View this message in context:
http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2399330.html
Sent f
Right now our configuration says multiValued="true", but that need not be
true in our case. I will set it to false, try again, and update this thread
with more details.
--
View this message in context:
http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2334627
The index contains around 1.5 million documents. As this is used for the
autosuggest feature, performance is an important factor.
So it looks like it is difficult to achieve the following using EdgeNGram:
the result should return only those terms where the search letter matches
the first word only. For example, when we type "M", it should ret
Oh, I should perhaps mention that EdgeNGrams will yield results a lot quicker
than wildcards, at the cost of a larger index. You should, of course, use
EdgeNGrams if you worry about performance and have a huge index and many
queries per second.
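The trade-off described above can be sketched in plain Python. This is only an illustration: Lucene's term dictionary and wildcard expansion work differently in detail, and the sample values, `prefix_scan` and `build_gram_index` are names invented for this sketch.

```python
from bisect import bisect_left

# Hypothetical suggestion strings, kept sorted like a term dictionary.
terms = sorted(["apple milk shake", "apple pie", "banana split", "milk with apple"])

def prefix_scan(sorted_terms, prefix):
    """Wildcard-style lookup: seek to the prefix, then walk matching terms."""
    i = bisect_left(sorted_terms, prefix)
    out = []
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        out.append(sorted_terms[i])
        i += 1
    return out

def build_gram_index(values):
    """EdgeNGram-style lookup: every prefix is stored and maps to its values."""
    index = {}
    for v in values:
        for n in range(1, len(v) + 1):
            index.setdefault(v[:n], []).append(v)
    return index

gram_index = build_gram_index(terms)

print(prefix_scan(terms, "app"))   # ['apple milk shake', 'apple pie']
print(gram_index["app"])           # ['apple milk shake', 'apple pie']
print(len(terms), len(gram_index)) # the gram index holds far more keys
```

Both lookups return the same suggestions. The wildcard-style scan does its work at query time, while the gram index pays up front by storing one key per prefix, which is where the larger index comes from.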
Then you don't need NGrams at all. A wildcard will suffice or you can use the
TermsComponent.
If these strings are indexed as single tokens (KeywordTokenizer with
LowercaseFilter) you can simply do field:app* to retrieve the "apple milk
shake". You can also use the string field type but then yo
I haven't figured out any way to achieve that AT ALL without making a
separate Solr index just to serve autosuggest queries. At least when you
want to auto-suggest on a multi-value field. Someone posted a crazy
tricky way to do it with a single-valued field a while ago. If you
can/are willing
tter "apple" which I typed in. It
should not bring others and if I type "milk" it should return only "milk
with apple"
I want output similar to Google's auto-suggest.
Is there a way to achieve this without enclosing the query in double quotes?
Thanks,
Johnny
--
match, that's the expected behavior.
So, we need a clear problem definition of what you're trying to do, along with
example queries (please post the results of adding &debugQuery=on).
Best
Erick
On Tue, Jan 25, 2011 at 8:29 AM, johnnyisrael wrote:
>
> Hi Eric,
>
> Y
Hi Eric,
You are right, there is a copy field to EdgeNGram. I tried the configuration,
but it is not working as expected.
Configuration I tried:
edgy_user_query
==
When I search for
See below.
On Mon, Jan 24, 2011 at 1:51 PM, johnnyisrael wrote:
>
> Hi,
>
> I am trying out the auto suggest using EdgeNgram.
>
> Using the following tutorial as a reference.
>
>
> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-u
Hi,
I am trying out the auto suggest using EdgeNgram.
Using the following tutorial as a reference.
http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
In the above tutorial, the two lines below are clearly stated:
"Note that
l for some queries QTime is more than 8 secs. It is a 'Blocker' for us.
Could you please suggest any way to reduce the QTime to 1 sec?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Query-performance-issue-while-using-EdgeNGram-tp2097056p2130751.html
Sent from the Solr
>
> class="org.apache.solr.handler.component.SearchHandler">
>
> explicit
>
>
> class="solr.SpellingQueryConverter"/>
> class="org.apache.solr.handler.component.QueryElevationComponent" >
>string
>elevate.xml
>
> class="org.apache.solr.handler.component.SearchHandler" startup="lazy">
>
> explicit
>
>
> elevator
>
>
> >
> >
> startup="lazy" />
> class="org.apache.solr.handler.admin.AdminHandlers" />
>
>
> standard
> solrpingquery
> all
>
>
>
>default="true">
>
> 100
>
>
>class="org.apache.solr.highlight.RegexFragmenter">
>
> 70
> 0.5
> [-\w ,/\n\"']{20,200}
>
>
>default="true">
>
>
>
>
>
>
> class="org.apache.solr.request.XMLResponseWriter" default="true"/>
> class="org.apache.solr.request.JSONResponseWriter"/>
> class="org.apache.solr.request.PythonResponseWriter"/>
> class="org.apache.solr.request.RubyResponseWriter"/>
> class="org.apache.solr.request.PHPResponseWriter"/>
> class="org.apache.solr.request.PHPSerializedResponseWriter"/>
> class="org.apache.solr.request.XSLTResponseWriter">
>5
>
>
>
> schema.xml
>
>
>
>
>
>
>
> pattern="([^a-z0-9])"
> replacement="" replace="all" />
> maxGramSize="100"
> minGramSize="1" />
>
>
>
>
>
> pattern="([^a-z0-9])"
> replacement="" replace="all" />
> pattern="^(.{20})(.*)?" replacement="$1" replace="all" />
>
>
>
>
>
>
>
>stored="false"/>
>
> id
> autosuggest
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Query-performance-issue-while-using-EdgeNGram-tp2097056p2097056.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
It seems adding the '+' (required) operator to each term in a multi-term query
does the trick:
http://lucene.apache.org/java/2_4_0/queryparsersyntax.html#+
i.e.: edgytext2:(+Martin +Sco)
-robert
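The effect of marking every term as required can be sketched in plain Python. This is a toy model, not the Lucene query parser: it assumes whitespace tokenization, lowercasing and per-token edge n-grams at index time, and "Martin Sheen" is a hypothetical second document added for contrast.

```python
def edge_ngrams(token):
    """All front prefixes of a token, as a set."""
    return {token[:n] for n in range(1, len(token) + 1)}

def index_terms(text):
    # Assumed analysis: lowercase, split on whitespace, edge n-gram each token.
    terms = set()
    for tok in text.lower().split():
        terms |= edge_ngrams(tok)
    return terms

def matches(doc_terms, query, all_required):
    # all_required=True models '+term +term'; False models plain OR.
    hits = [t in doc_terms for t in query.lower().split()]
    return all(hits) if all_required else any(hits)

docs = ["Martin Scorsese", "Martin Sheen"]  # second entry is hypothetical
index = {d: index_terms(d) for d in docs}

optional = [d for d in docs if matches(index[d], "Martin Sco", all_required=False)]
required = [d for d in docs if matches(index[d], "Martin Sco", all_required=True)]

print(optional)  # ['Martin Scorsese', 'Martin Sheen']
print(required)  # ['Martin Scorsese']
```

With optional terms, "Martin" alone is enough for a hit. Once both terms are required, only the document that also has a token starting with "sco" remains.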
On Nov 16, 2010, at 8:52 PM, Robert Gründler wrote:
> thanks for the explanation.
>
> the result
Thanks for the explanation.
The results for the autocompletion are pretty good now, but we still have a
small problem.
When there are hits in the "edgytext2" field, results which only have hits in
the "edgytext" field should not be returned at all.
Example:
Query: "Martin Sco"
Current Resu
Without the parens, the "edgytext:" only applied to "Mr"; the default
field still applied to "Scorsese".
The double quotes are necessary in the second case (rather than parens)
because, on a non-tokenized field, the standard query parser will
"pre-tokenize" on whitespace before sending
>
> Did you run your query without using () and "" operators? If yes can you try
> this?
> &q=edgytext:(Mr Scorsese) OR edgytext2:"Mr Scorsese"^2.0
I didn't use () and "" in my query before. Using the query with those operators
works now; stopwords are thrown out as they should be, thanks.
However,
Ah I see. Thanks for the explanation.
Could you set the defaultOperator to "AND"? That way both "Bill" and "Cl" must
be a match and that would exclude "Clyde Phillips".
--- On Thu, 11/11/10, Robert Gründler wrote:
> From: Robert Gründler
> Su
According to the fieldtype I posted previously, I think it's because of:
1. WhitespaceTokenizer splits the string "Clyde Phillips" into 2 tokens:
"Clyde" and "Phillips"
2. EdgeNGramFilter gets the 2 tokens, and creates an EdgeNGram for each token:
"
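The two steps listed above can be traced with a small plain-Python sketch that shows which query term is responsible for the unexpected hit. It assumes lowercasing and that the query is only split on whitespace (not n-grammed); "Bill Clinton" is a hypothetical document added for comparison, and the helper names are invented for the sketch.

```python
def edge_ngrams(token):
    """All front prefixes of a token, as a set."""
    return {token[:n] for n in range(1, len(token) + 1)}

def index_terms(text):
    # Step 1: split on whitespace. Step 2: edge n-gram every token.
    return set().union(*(edge_ngrams(t) for t in text.lower().split()))

def shared_terms(query, doc):
    # Query terms that are also present among the document's indexed grams.
    return sorted(set(query.lower().split()) & index_terms(doc))

print(shared_terms("Bill Cl", "Clyde Phillips"))  # ['cl']
print(shared_terms("Bill Cl", "Bill Clinton"))    # ['bill', 'cl']
```

"Clyde" produces the gram "cl", so under OR semantics the query term "Cl" alone is enough to pull "Clyde Phillips" into the results, even though nothing in it matches "Bill".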
Could anyone help me understand why "Clyde Phillips" appears in the
results for "Bill Cl"?
"Clyde Phillips" doesn't produce any EdgeNGram that would match "Bill Cl", so
why is it even in the results?
Thanks.
--- On Thu, 11/11/10, Ahmet Arsl
On 12 Nov 2010, at 01:46, Ahmet Arslan wrote:
>> This setup now makes troubles regarding StopWords, here's
>> an example:
>>
>> Let's say the index contains 2 Strings: "Mr Martin
>> Scorsese" and "Martin Scorsese". "Mr" is in the stopword
>> list.
>>
>> Query: edgytext:Mr Scorsese OR edgytext2
> This setup now makes troubles regarding StopWords, here's
> an example:
>
> Let's say the index contains 2 Strings: "Mr Martin
> Scorsese" and "Martin Scorsese". "Mr" is in the stopword
> list.
>
> Query: edgytext:Mr Scorsese OR edgytext2:Mr Scorsese^2.0
>
> This way, the only result i get is
l"
>
> You can even apply boost so that begins with matches comes first.
>
> --- On Thu, 11/11/10, Robert Gründler wrote:
>
>> From: Robert Gründler
>> Subject: EdgeNGram relevancy
>> To: solr-user@lucene.apache.org
>> Date: Thursday, November 11
10, Robert Gründler wrote:
> From: Robert Gründler
> Subject: EdgeNGram relevancy
> To: solr-user@lucene.apache.org
> Date: Thursday, November 11, 2010, 5:51 PM
> Hi,
>
> consider the following fieldtype (used for
> autocompletion):
>
> positio
Hi,
consider the following fieldtype (used for autocompletion):
This works fine as long as the query string is a single word. For multiple
words, the ranking is weird though.
Example:
Que