Re: Highlighting in stemmed or n-grammed fields possible?
Hi Koji et al., You say https://issues.apache.org/jira/browse/SOLR-1268 is an open issue for the ngram highlighting problem, but it seems to refer to something unrelated. Can you or anyone else confirm that it is not possible to use highlighting with an ngram tokenizer/filter? Thanks, Aodh.
Re: Highlighting in stemmed or n-grammed fields possible?
But it would seem that Lucene has always supported highlighting on NGram fields, as shown by the example here: https://issues.apache.org/jira/browse/LUCENE-1489

When I try to use highlighting with NGramming, none of the text is highlighted, and instead I get a long string in the highlighting field.

In any case, can anything be done to support highlighting and NGrams in Solr? This functionality is imperative to my application :(

Thanks for your help, Aodh.

On Tue, Sep 29, 2009 at 1:16 AM, Koji Sekiguchi wrote:
> I think I need a further explanation for that.
> The Lucene's FastVectorHighlighter which is pointed in SOLR-1268 is
> a highlighter that supports n-gram field. Please see the description
> for the features etc:
>
> http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc/contrib-fast-vector-highlighter/org/apache/lucene/search/vectorhighlight/package-summary.html
>
> Thanks,
>
> Koji
NGramTokenFilter behaviour
If I index the following text: "I live in Dublin Ireland where Guinness is brewed" and then search for: duvlin, should Solr return a match? In the admin interface, under the analysis section, Solr highlights some NGram matches. But when I enter the following query string into my browser address bar, I get 0 results: http://localhost:8983/solr/select/?q=duvlin&debugQuery=true

Nor do I get results for dub, dubli, ublin, or dublin (du does return a result). I also notice that when I use debugQuery=true, the parsed query is a PhraseQuery. This doesn't make sense to me, as surely the point of the NGram is to use a Boolean OR between each gram?

However, if I don't use an NGramFilterFactory at query time, I can get results for dub, ublin, and du, but not duvlin.

Can someone please clarify what the purpose of the NGramFilter/tokenizer is, if not to allow for misspellings/morphological variation, and also what the correct configuration is in terms of use at index/query time?

Any help appreciated! Aodh.

Solr 1.3, JDK 1.6
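To make the expectation concrete, here is a minimal Python sketch of the gram overlap between the indexed word and the misspelled query. It is a simplified model, not Lucene's or Solr's code: the `ngrams` helper and the gram sizes 2 to 3 are assumptions chosen for illustration, and "all grams present" is only a rough stand-in for a PhraseQuery (which is at least that strict, since it also requires positions to line up).

```python
def ngrams(term, min_n=2, max_n=3):
    """Return every substring of length min_n..max_n (hypothetical helper,
    a simplified model of what an n-gram filter emits for one token)."""
    term = term.lower()
    return [term[i:i + n]
            for n in range(min_n, max_n + 1)
            for i in range(len(term) - n + 1)]

# Grams stored for the indexed word, and grams produced from the query.
indexed = set(ngrams("Dublin"))
query = ngrams("duvlin")

# Grams the misspelled query has in common with the indexed word.
shared = [g for g in query if g in indexed]

# Boolean OR between grams: one shared gram is enough to match.
or_match = len(shared) > 0

# Requiring every query gram (a phrase query needs at least this): fails,
# because grams such as "uv" and "vl" never occur in "dublin".
all_match = all(g in indexed for g in query)

print(shared, or_match, all_match)
```

Under these assumptions the query shares several grams with the indexed word, so an OR of grams would match, while any query that needs every gram to be present cannot.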
Re: n-Gram, only works with queries of 2 letters
Has this issue been fixed yet? Can anyone shed some light on what's going on here, please? NGramming is critical to my app. I will have to look to something other than Solr if it's not possible to do :(
TermVector term frequencies for tag cloud
Hello, I'm trying to create a tag cloud from a term vector, but the array returned (using the JSON wt) is quite complex and takes an inordinate amount of time to process. Is there a better way to retrieve terms and their document TF? The TermVectorComponent allows retrieval of both tf and df, though I'm only interested in TF. I know the TermsComponent gives you DF, but I need TF! Any suggestions? Thanks, Aodh.
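For reference, this is roughly the client-side processing in question, as a minimal Python sketch. The shape of `sample` is an assumption: a hand-written fragment of nested, flat key/value arrays, not captured Solr output, and the field name `text` and the helpers `pairs` and `total_tf` are hypothetical. It sums each term's tf across documents, which is one possible reading of "TF for a tag cloud".

```python
from collections import Counter

def pairs(flat):
    """Turn a flat [key1, value1, key2, value2, ...] list into (key, value) pairs."""
    return zip(flat[0::2], flat[1::2])

def total_tf(term_vectors, field):
    """Sum each term's tf for `field` across every document entry."""
    totals = Counter()
    for _doc_key, doc_entry in pairs(term_vectors):
        if not isinstance(doc_entry, list):
            continue  # skip scalar entries that are not per-document lists
        for name, value in pairs(doc_entry):
            if name != field or not isinstance(value, list):
                continue  # skip other fields and scalar metadata
            for term, stats in pairs(value):
                totals[term] += dict(pairs(stats)).get("tf", 0)
    return totals

# Hypothetical response fragment (assumed shape, for illustration only).
sample = [
    "doc-1", ["uniqueKey", "1",
              "text", ["dublin", ["tf", 2, "df", 5],
                       "guinness", ["tf", 1, "df", 3]]],
    "doc-2", ["uniqueKey", "2",
              "text", ["dublin", ["tf", 1, "df", 5]]],
]

cloud = total_tf(sample, "text")
print(cloud.most_common())
```

The df values are simply ignored here; if the real response nests differently, only `total_tf` would need to change.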