Solr 5: hit highlight with NGram/EdgeNgram-fields

2015-04-20 Thread Bjørn Hjelle
With Solr 4.10.3 I was advised to set luceneMatchVersion to "4.3" to make hit highlighting work with NGram/EdgeNGram fields, like this: In Solr 5 and 5.1 this no longer seems to work. The complete word is highlighted, not just the part that matches the search term. In Solr admi

Combination of edgengram and ngram

2011-12-13 Thread Shawn Heisey
I am interested in a new filter type, one that would combine edgengram and ngram. The idea is that it would create all ngrams specified by the min/max size, but the ngrams that happen to be edgengrams (specifically the left side) would get an index-time boost. Optionally the boost would be
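
The idea described above can be sketched outside of Solr. The following is only an illustration of the proposal, not an existing Solr filter: the function name `ngrams_with_edge_boost` and the boost value `2.0` are hypothetical, and a real implementation would have to live inside a Lucene TokenFilter.

```python
def ngrams_with_edge_boost(token, min_size, max_size, edge_boost=2.0):
    """Return (gram, boost) pairs for every n-gram of `token`.

    Grams that start at position 0 (left-side edge n-grams) receive
    `edge_boost`; every other gram receives a neutral boost of 1.0.
    The boost value is a hypothetical placeholder.
    """
    result = []
    for size in range(min_size, max_size + 1):
        # Slide a window of `size` characters across the token.
        for start in range(0, len(token) - size + 1):
            gram = token[start:start + size]
            boost = edge_boost if start == 0 else 1.0
            result.append((gram, boost))
    return result


if __name__ == "__main__":
    for gram, boost in ngrams_with_edge_boost("solr", 2, 3):
        print(gram, boost)
```

With `min_size=2` and `max_size=3`, the token "solr" yields the grams so, ol, lr, sol and olr, of which only "so" and "sol" carry the higher boost.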

Re: Problem using EdgeNGram

2011-09-21 Thread O. Klein
Try using KeywordTokenizerFactory instead of StandardTokenizerFactory to get the results you want. -- View this message in context: http://lucene.472066.n3.nabble.com/Problem-using-EdgeNGram-tp3355132p3355211.html Sent from the Solr - User mailing list archive at Nabble.com.

Problem using EdgeNGram

2011-09-21 Thread Kissue Kissue
Hi, I am using solr 3.3 with SolrJ. I am trying to use EdgeNgram to power auto suggest feature in my application. My understanding is that using EdgeNgram would mean that results will only be returned for records starting with the search criteria but this is not happening for me. For example if

Re: Edgengram

2011-06-01 Thread Brian Lamb
> this way, at query time "abcdefg" won't be turned to "a ab abc abcd abcde abcdef abcdefg". At index time it will.
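
A minimal sketch of the behaviour described in the quote, assuming a whitespace tokenizer and front-side edge n-grams applied at index time only. The function names are hypothetical, `max_size=25` mirrors the `maxGramSize="25"` quoted elsewhere in this thread, and `min_size=1` is an assumption based on the "a ab abc ..." example.

```python
def edge_ngrams(token, min_size=1, max_size=25):
    """Front-side edge n-grams of a single token."""
    upper = min(max_size, len(token))
    return [token[:n] for n in range(min_size, upper + 1)]


def analyze(text, at_index_time):
    """Split on whitespace; expand to edge n-grams only when indexing."""
    tokens = text.split()
    if not at_index_time:
        # Query-time analysis leaves each token untouched.
        return tokens
    return [gram for token in tokens for gram in edge_ngrams(token)]


if __name__ == "__main__":
    print(analyze("abcdefg", at_index_time=True))
    print(analyze("abcdefg", at_index_time=False))
```

Under these assumptions the indexed terms are a, ab, abc, abcd, abcde, abcdef and abcdefg, while the query stays the single term abcdefg, so a query matches at most one indexed term per token.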

Re: Edgengram

2011-06-01 Thread Erick Erickson
> Tomás > > On Tue, May 31, 2011 at 1:07 PM, Brian Lamb <brian.l...@journalexperts.com> wrote: > >> positionIncrementGap="1000">

Re: Edgengram

2011-06-01 Thread Brian Lamb
>> maxGramSize="25" side="front" /> >> >> I believe I used that link when I initially set up the field and it worked great (and I'm still using it in other places). In this particular

Re: Edgengram

2011-05-31 Thread Tomás Fernández Löbbe
"1000"> >> maxGramSize="25" side="front" /> >> >> I believe I used that link when I initially set up the field and it worked great (and I'm still using it in other places

Re: Edgengram

2011-05-31 Thread Tomás Fernández Löbbe
> side="front" /> > > I believe I used that link when I initially set up the field and it worked great (and I'm still using it in other places). In this particular example however it does not appear to be practical for me. I mentioned that I have

Re: Edgengram

2011-05-31 Thread Brian Lamb
f and in the case of an edgengram, it returns 1 * length of the search string. Thanks, Brian Lamb On Tue, May 31, 2011 at 11:34 AM, bmdakshinamur...@gmail.com <bmdakshinamur...@gmail.com> wrote: > Can you specify the analyzer you are using for your queries? > > Maybe you could u

Re: Edgengram

2011-05-31 Thread bmdakshinamur...@gmail.com
t 9:17 AM, Brian Lamb wrote: > > For this, I ended up just changing it to string and using "abcdefg*" to match. That seems to work so far. > > Thanks, > > Brian Lamb > > On We

Re: Edgengram

2011-05-31 Thread Brian Lamb
so far. > > Thanks, > > Brian Lamb > > On Wed, May 25, 2011 at 4:53 PM, Brian Lamb wrote: > >> Hi all, >> >> I'm running into some confusion with the way edgengram works. I have the fi

Re: Edgengram

2011-05-31 Thread Erick Erickson
this, I ended up just changing it to string and using "abcdefg*" to match. That seems to work so far. > > Thanks, > > Brian Lamb > > On Wed, May 25, 2011 at 4:53 PM, Brian Lamb wrote: > >> Hi all, >> >> I'm running into s

Re: Edgengram

2011-05-27 Thread Brian Lamb
For this, I ended up just changing it to string and using "abcdefg*" to match. That seems to work so far. Thanks, Brian Lamb On Wed, May 25, 2011 at 4:53 PM, Brian Lamb wrote: > Hi all, > > I'm running into some confusion with the way edgengram works. I ha

Edgengram

2011-05-25 Thread Brian Lamb
Hi all, I'm running into some confusion with the way edgengram works. I have the field set up as: I've also set up my own similarity class that returns 1 as the idf score. What I've found this does is if I match a string "abcdefg" a

LetterTokenizer + EdgeNGram + apostrophe in query = invalid result

2011-02-25 Thread Matt Weber
I have the following field defined in my schema: I have the default field set to "person" and have indexed the following document: The following queries return the result as expec

Re: EdgeNgram Auto suggest - doubles ignore

2011-02-08 Thread Erick Erickson
> Hi Erick, > > If you have time, can you please take a look and provide your comments or suggestions for this problem? > > Please let me know if you need any more information. > > Thanks, > > Johnny > -- > View this message in context: > http://lucene.4720

Re: EdgeNgram Auto suggest - doubles ignore

2011-02-08 Thread johnnyisrael
Hi Erick, If you have time, can you please take a look and provide your comments or suggestions for this problem? Please let me know if you need any more information. Thanks, Johnny -- View this message in context: http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore

Re: EdgeNgram Auto suggest - doubles ignore

2011-02-01 Thread johnnyisrael
x it is mentioned in WIKI. http://wiki.apache.org/solr/TermsComponent Am I going wrong anywhere? Please let me know if you need any more info. Thanks, Johnny -- View this message in context: http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2399330.html Sent f

Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread Erick Erickson
lk > with apple" > > I want an output Similar like a Google auto suggest. > > Is there a way to achieve this without encapsulating with double quotes. > > Thanks, > > Johnny > -- > View this message in context: > http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2333602.html > Sent from the Solr - User mailing list archive at Nabble.com. >

Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread mesenthil
Right now our configuration says multiValued="true", but that does not need to be true in our case. I will set it to false, try again, and update this thread with more details. -- View this message in context: http://lucene.472066.n3.nabble.com/EdgeNgram-Auto-suggest-doubles-ignore-tp2321919p2334627

Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread Jonathan Rochkind
is used for autosuggest feature, performance is an important factor. So it looks like it is difficult to achieve the following using edgeNgram: results should return only those terms where the search letter matches the first word only. For example, when we type "M", it should ret

Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread mesenthil
The index contains around 1.5 million documents. As this is used for the autosuggest feature, performance is an important factor. So it looks like it is difficult to achieve the following using edgeNgram: results should return only those terms where the search letter matches the first

Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread Markus Jelsma
Oh, I should perhaps mention that EdgeNGrams will yield results a lot quicker than using wildcards, at the cost of a larger index. You should, of course, use EdgeNGrams if you worry about performance and have a huge index and a number of queries per second. > Then you don't need NGrams at all. A

Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread Markus Jelsma
Then you don't need NGrams at all. A wildcard will suffice or you can use the TermsComponent. If these strings are indexed as single tokens (KeywordTokenizer with LowercaseFilter) you can simply do field:app* to retrieve the "apple milk shake". You can also use the string field type but then yo
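
The suggestion above can be illustrated with a small sketch that mimics the effect of KeywordTokenizer plus LowercaseFilter (the whole value becomes one lowercased term) and a prefix query such as `field:app*`. This is a plain-Python simulation, not Solr code; the helper names are hypothetical, and lowercasing the prefix on the client side is an assumption.

```python
def keyword_lowercase(value):
    """Treat the whole field value as a single lowercased term."""
    return value.lower()


def prefix_query(indexed_terms, prefix):
    """Return every indexed term that starts with the given prefix."""
    normalized = prefix.lower()
    return [term for term in indexed_terms if term.startswith(normalized)]


if __name__ == "__main__":
    terms = [keyword_lowercase(v) for v in ["Apple milk shake", "Milk with apple"]]
    print(prefix_query(terms, "app"))
```

Because each value is a single term, "app" only matches values that begin with "app"; "milk with apple" is not returned even though it contains the word "apple".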

Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread Jonathan Rochkind
I haven't figured out any way to achieve that AT ALL without making a separate Solr index just to serve autosuggest queries. At least when you want to auto-suggest on a multi-value field. Someone posted a crazy tricky way to do it with a single-valued field a while ago. If you can/are willing

Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread johnnyisrael
tter "apple" which I typed in. It should not bring others, and if I type "milk" it should return only "milk with apple". I want output similar to Google's auto-suggest. Is there a way to achieve this without enclosing the query in double quotes? Thanks, Johnny --

Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread Erick Erickson
match, that's the expected behavior. So, we need a clear problem definition of what you're trying to do, along with example queries (please post the results of adding &debugQuery=on). Best Erick On Tue, Jan 25, 2011 at 8:29 AM, johnnyisrael wrote: > > Hi Eric, > > Y

Re: EdgeNgram Auto suggest - doubles ignore

2011-01-25 Thread johnnyisrael
Hi Eric, You are right, there is a copy field to EdgeNgram. I tried the configuration but it is not working as expected. Configuration I tried: edgy_user_query == When I search for

Re: EdgeNgram Auto suggest - doubles ignore

2011-01-24 Thread Erick Erickson
See below. On Mon, Jan 24, 2011 at 1:51 PM, johnnyisrael wrote: > > Hi, > > I am trying out the auto suggest using EdgeNgram. > > Using the following tutorial as a reference. > > > http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-u

EdgeNgram Auto suggest - doubles ignore

2011-01-24 Thread johnnyisrael
Hi, I am trying out auto suggest using EdgeNgram, with the following tutorial as a reference: http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/ In the above tutorial, the two lines below are clearly stated: "Note that

Re: Query performance issue while using EdgeNGram

2010-12-22 Thread Erick Erickson
ce the QTime to 1 secs. > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Query-performance-issue-while-using-EdgeNGram-tp2097056p2130751.html > Sent from the Solr - User mailing list archive at Nabble.com. >

Re: Query performance issue while using EdgeNGram

2010-12-22 Thread Shanmugavel SRD
l for some queries QTime is more than 8 secs. It is a 'Blocker' for us. Could you please suggest any way to reduce the QTime to 1 sec? -- View this message in context: http://lucene.472066.n3.nabble.com/Query-performance-issue-while-using-EdgeNGram-tp2097056p2130751.html Sent from the Solr

Re: Query performance issue while using EdgeNGram

2010-12-16 Thread Erick Erickson
> class="org.apache.solr.handler.component.SearchHandler">
> explicit
> class="solr.SpellingQueryConverter"/>
> class="org.apache.solr.handler.component.QueryElevationComponent"
> string
> elevate.xml
> class="org.apache.solr.handler.component.SearchHandler" startup="lazy">
> explicit
> elevator
> startup="lazy" />
> class="org.apache.solr.handler.admin.AdminHandlers" />
> standard
> solrpingquery
> all
> default="true">
> 100
> class="org.apache.solr.highlight.RegexFragmenter">
> 70
> 0.5
> [-\w ,/\n\"']{20,200}
> default="true">
> class="org.apache.solr.request.XMLResponseWriter" default="true"/>
> class="org.apache.solr.request.JSONResponseWriter"/>
> class="org.apache.solr.request.PythonResponseWriter"/>
> class="org.apache.solr.request.RubyResponseWriter"/>
> class="org.apache.solr.request.PHPResponseWriter"/>
> class="org.apache.solr.request.PHPSerializedResponseWriter"/>
> class="org.apache.solr.request.XSLTResponseWriter">
> 5
>
> schema.xml
>
> pattern="([^a-z0-9])"
> replacement="" replace="all" />
> maxGramSize="100"
> minGramSize="1" />
>
> pattern="([^a-z0-9])"
> replacement="" replace="all" />
> pattern="^(.{20})(.*)?" replacement="$1" replace="all" />
>
> stored="false"/>
> id
> autosuggest
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Query-performance-issue-while-using-EdgeNGram-tp2097056p2097056.html
> Sent from the Solr - User mailing list archive at Nabble.com.

Re: EdgeNGram relevancy

2010-11-16 Thread Robert Gründler
it seems adding the '+' (required) operator to each term in a multi-term query does the trick: http://lucene.apache.org/java/2_4_0/queryparsersyntax.html#+ ie: edgytext2:(+Martin +Sco) -robert On Nov 16, 2010, at 8:52 PM, Robert Gründler wrote: > thanks for the explanation. > > the result
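
A small sketch of how a client might build such a query string from user input. The helper name `require_all_terms` is hypothetical, and the sketch assumes the input contains no Lucene query syntax characters; real input would need escaping first.

```python
def require_all_terms(field, user_input):
    """Prefix every whitespace-separated term with '+' and scope it to `field`."""
    terms = user_input.split()
    required = " ".join("+" + term for term in terms)
    return "{}:({})".format(field, required)


if __name__ == "__main__":
    print(require_all_terms("edgytext2", "Martin Sco"))
```

For the input "Martin Sco" this produces the query quoted above, `edgytext2:(+Martin +Sco)`, so a document has to match both terms to be returned.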

Re: EdgeNGram relevancy

2010-11-16 Thread Robert Gründler
thanks for the explanation. the results for the autocompletion are pretty good now, but we still have a small problem. When there are hits in the "edgytext2" fields, results which only have hits in the "edgytext" field should not be returned at all. Example: Query: "Martin Sco" Current Resu

Re: EdgeNGram relevancy

2010-11-11 Thread Jonathan Rochkind
Without the parens, the "edgytext:" only applied to "Mr"; the default field still applied to "Scorsese". The double quotes are necessary in the second case (rather than parens) because, on a non-tokenized field, the standard query parser will "pre-tokenize" on whitespace before sending

Re: EdgeNGram relevancy

2010-11-11 Thread Robert Gründler
> > Did you run your query without using () and "" operators? If yes can you try > this? > &q=edgytext:(Mr Scorsese) OR edgytext2:"Mr Scorsese"^2.0 I didn't use () and "" in my query before. Using the query with those operators works now, stopwords are thrown out as they should, thanks. However,

Re: EdgeNGram relevancy

2010-11-11 Thread Andy
Ah I see. Thanks for the explanation. Could you set the defaultOperator to "AND"? That way both "Bill" and "Cl" must be a match and that would exclude "Clyde Phillips". --- On Thu, 11/11/10, Robert Gründler wrote: > From: Robert Gründler > Su

Re: EdgeNGram relevancy

2010-11-11 Thread Robert Gründler
according to the fieldtype i posted previously, i think it's because of: 1. WhitespaceTokenizer splits the String "Clyde Phillips" into 2 tokens: "Clyde" and "Phillips" 2. EdgeNGramFilter gets the 2 tokens, and creates an EdgeNGram for each token:
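
The two steps above can be reproduced in a short simulation. It is only a sketch under several assumptions: tokens are lowercased, edge n-grams run from 1 to 25 characters, the query is split on whitespace without n-gramming, and the default operator is OR. The function names and the document "Bill Clark" are hypothetical.

```python
def edge_ngrams(token, min_size=1, max_size=25):
    """Front-side edge n-grams of a single token."""
    upper = min(max_size, len(token))
    return [token[:n] for n in range(min_size, upper + 1)]


def index_terms(text):
    """Whitespace-tokenize, lowercase, and expand each token to edge n-grams."""
    return {gram for token in text.lower().split() for gram in edge_ngrams(token)}


def matches(document, query, operator="OR"):
    """Check whether the query terms hit the document's indexed terms."""
    terms = index_terms(document)
    hits = [q in terms for q in query.lower().split()]
    if not hits:
        return False
    return any(hits) if operator == "OR" else all(hits)


if __name__ == "__main__":
    print(matches("Clyde Phillips", "Bill Cl"))         # OR semantics
    print(matches("Clyde Phillips", "Bill Cl", "AND"))  # AND semantics
```

With OR semantics "Clyde Phillips" matches "Bill Cl" because "cl" is one of the edge n-grams of "clyde", even though "bill" matches nothing. Requiring every term (AND) excludes it, while a document such as the hypothetical "Bill Clark" would still match.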

Re: EdgeNGram relevancy

2010-11-11 Thread Andy
Could anyone help me understand why "Clyde Phillips" appears in the results for "Bill Cl"? "Clyde Phillips" doesn't produce any EdgeNGram that would match "Bill Cl", so why is it even in the results? Thanks. --- On Thu, 11/11/10, Ahmet Arsl

Re: EdgeNGram relevancy

2010-11-11 Thread Nick Martin
On 12 Nov 2010, at 01:46, Ahmet Arslan wrote: >> This setup now makes troubles regarding StopWords, here's >> an example: >> >> Let's say the index contains 2 Strings: "Mr Martin >> Scorsese" and "Martin Scorsese". "Mr" is in the stopword >> list. >> >> Query: edgytext:Mr Scorsese OR edgytext2

Re: EdgeNGram relevancy

2010-11-11 Thread Ahmet Arslan
> This setup now makes troubles regarding StopWords, here's > an example: > > Let's say the index contains 2 Strings: "Mr Martin > Scorsese" and "Martin Scorsese". "Mr" is in the stopword > list. > > Query: edgytext:Mr Scorsese OR edgytext2:Mr Scorsese^2.0 > > This way, the only result i get is

Re: EdgeNGram relevancy

2010-11-11 Thread Robert Gründler
l" > > You can even apply a boost so that begins-with matches come first. > > --- On Thu, 11/11/10, Robert Gründler wrote: > >> From: Robert Gründler >> Subject: EdgeNGram relevancy >> To: solr-user@lucene.apache.org >> Date: Thursday, November 11

Re: EdgeNGram relevancy

2010-11-11 Thread Ahmet Arslan
10, Robert Gründler wrote: > From: Robert Gründler > Subject: EdgeNGram relevancy > To: solr-user@lucene.apache.org > Date: Thursday, November 11, 2010, 5:51 PM > Hi, > > consider the following fieldtype (used for > autocompletion): > >   positio

EdgeNGram relevancy

2010-11-11 Thread Robert Gründler
Hi, consider the following fieldtype (used for autocompletion): This works fine as long as the query string is a single word. For multiple words, the ranking is weird though. Example: Que