Re: Highlighting in stemmed or n-grammed fields possible?

2009-09-28 Thread aodhol
Hi Koji et al.,

You say https://issues.apache.org/jira/browse/SOLR-1268 is an open
issue for the ngram highlighting problem, but it seems to refer to
something unrelated.

Can you/anyone confirm that it is not possible to use highlighting
with an ngram tokenizer/filter?

Thanks,

Aodh.


Re: Highlighting in stemmed or n-grammed fields possible?

2009-09-28 Thread aodhol
But it would seem that Lucene has always supported highlighting on
NGram fields, as shown by the example here:

https://issues.apache.org/jira/browse/LUCENE-1489

When I try to use highlighting with NGramming, none of the text is
highlighted, and instead I get a long string in the highlighting
field...

In any case, can anything be done to support highlighting with NGrams
in Solr? This functionality is essential to my application :(
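To make the expected behaviour concrete, here is a minimal sketch of offset-based n-gram highlighting in plain Python. It is not Solr or Lucene code and does not use the FastVectorHighlighter API; the function names, the gram size of 3 and the <em> tags are all assumptions chosen for illustration. It only shows the general idea: record the character offsets of each gram, keep those that also occur in the query, and merge overlapping spans into one fragment.

```python
def ngram_offsets(text, n):
    """Return (gram, start, end) for every character n-gram in text."""
    lowered = text.lower()  # assumes lowercasing keeps the length (true for ASCII)
    return [(lowered[i:i + n], i, i + n) for i in range(len(lowered) - n + 1)]


def highlight(text, query, n=3, pre="<em>", post="</em>"):
    """Wrap every region of text that is covered by an n-gram of the query."""
    query_grams = {gram for gram, _, _ in ngram_offsets(query, n)}
    spans = [(s, e) for gram, s, e in ngram_offsets(text, n) if gram in query_grams]

    # Merge overlapping or touching spans so adjacent grams form one fragment.
    merged = []
    for start, end in spans:  # spans are already ordered by start offset
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))

    # Rebuild the text with the merged spans wrapped in the pre/post tags.
    out, cursor = [], 0
    for start, end in merged:
        out.append(text[cursor:start])
        out.append(pre + text[start:end] + post)
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


print(highlight("I live in Dublin Ireland", "dublin"))
```

With a gram size of 3 this marks only "Dublin"; with a gram size of 2 it would also mark short fragments such as the "in" of "live in", which is one reason very small grams make for noisy highlighting.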

Thanks for your help,

Aodh.

On Tue, Sep 29, 2009 at 1:16 AM, Koji Sekiguchi wrote:
> I think that needs further explanation.
> Lucene's FastVectorHighlighter, which SOLR-1268 points to, is
> a highlighter that supports n-gram fields. Please see the description
> of its features etc.:
>
> http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc/contrib-fast-vector-highlighter/org/apache/lucene/search/vectorhighlight/package-summary.html
>
> Thanks,
>
> Koji
>
>
> aod...@gmail.com wrote:
>>
>> Hi Koji et al.,
>>
>> You say https://issues.apache.org/jira/browse/SOLR-1268 is an open
>> issue for the ngram highlighting problem, but it seems to refer to
>> something unrelated.
>>
>> Can you/anyone confirm that it is not possible to use highlighting
>> with an ngram tokenizer/filter?
>>
>> Thanks,
>>
>> Aodh.
>>
>>
>
>


NGramTokenFilter behaviour

2009-09-30 Thread aodhol
If I index the following text: "I live in Dublin Ireland where
Guinness is brewed"

Then search for: duvlin

Should Solr return a match?

In the admin interface under the analysis section, Solr highlights
some NGram matches.

However, when I enter the following query string into my browser address
bar, I get 0 results:

http://localhost:8983/solr/select/?q=duvlin&debugQuery=true

Nor do I get results for dub, dubli, ublin, dublin (du does return a result).

I also notice that when I use debugQuery=true, the parsed query is a
PhraseQuery. This doesn't make sense to me, as surely the point of the
NGram is to use a Boolean OR between the grams?

However, if I don't use an NGramFilterFactory at query time, I can get
results for: dub, ublin, du, but not duvlin.
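To illustrate the difference between these cases, here is a minimal sketch in plain Python. It is not Solr code: the gram sizes 2 to 6 and all function names are assumptions, and it ignores everything except the single word "Dublin". It compares three ways a query could be checked against the indexed grams: as one unsplit term, as grams combined with OR, and as grams that must all be present (a necessary condition for a phrase match, which additionally requires the grams at consecutive positions).

```python
def ngrams(term, min_n=2, max_n=6):
    """All character n-grams of term with lengths min_n..max_n (assumed sizes)."""
    term = term.lower()
    return [term[i:i + n]
            for n in range(min_n, max_n + 1)
            for i in range(len(term) - n + 1)]


# Grams that an index-time n-gram filter would store for the word "Dublin".
indexed = set(ngrams("Dublin"))


def matches_as_single_term(query):
    """Query is not n-grammed: the whole query must equal one indexed gram."""
    return query.lower() in indexed


def matches_any_gram(query):
    """Query is n-grammed and the grams are ORed: one shared gram is enough."""
    return any(gram in indexed for gram in ngrams(query))


def matches_all_grams(query):
    """Query is n-grammed and every gram is required.

    A phrase query is stricter still: it also needs the grams to sit at
    consecutive positions, which this sketch does not model.
    """
    return all(gram in indexed for gram in ngrams(query))


for q in ("du", "dub", "ublin", "duvlin"):
    print(q, matches_as_single_term(q), matches_any_gram(q), matches_all_grams(q))
```

Under these assumptions "dub", "ublin" and "du" match as unsplit terms because each is itself an indexed gram, while "duvlin" only matches when its grams are ORed, since it shares grams such as "du" and "lin" with "dublin" but also contains grams like "uv" that were never indexed.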


  



  


Can someone please clarify what the purpose of the
NGramFilter/tokenizer is, if not to allow for
misspellings/morphological variation? Also, what is the correct
configuration in terms of use at index/query time?

Any help appreciated!

Aodh.

Solr 1.3, JDK 1.6


Re: n-Gram, only works with queries of 2 letters

2009-09-30 Thread aodhol
Has this issue been fixed yet? Can anyone shed some light on what's
going on here, please? NGramming is critical to my app. I will have to
look to something other than Solr if it's not possible to do :(


TermVector term frequencies for tag cloud

2009-10-02 Thread aodhol
Hello,

I'm trying to create a tag cloud from a term vector, but the array
returned (using JSON wt) is quite complex and takes an inordinate
amount of time to process. Is there a better way to retrieve terms and
their document TF? The TermVectorComponent allows for retrieval of tf
and df though I'm only interested in TF. I know the TermsComponent
gives you DF, but I need TF!
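For reference, a minimal sketch of the kind of processing involved, in plain Python. The sample response fragment is hypothetical: it assumes the term vectors arrive as flat [name, value, name, value, ...] arrays, and the field name "text", the document keys and the helper names are made up for illustration. Real output may be shaped differently.

```python
import json
from collections import Counter


def pairs(flat):
    """Turn a flat [name1, value1, name2, value2, ...] list into (name, value) pairs."""
    return zip(flat[0::2], flat[1::2])


def term_frequencies(term_vectors, field):
    """Sum the per-document tf of each term in `field` across all returned docs."""
    totals = Counter()
    for _doc_key, doc_entry in pairs(term_vectors):
        if not isinstance(doc_entry, list):
            continue  # skip scalar entries that are not per-document lists
        for name, value in pairs(doc_entry):
            if name != field or not isinstance(value, list):
                continue  # skip the unique key and any other fields
            for term, stats in pairs(value):
                totals[term] += dict(pairs(stats)).get("tf", 0)
    return totals


# Hypothetical response fragment (an assumption, not captured Solr output).
sample = json.loads("""
["doc-1", ["uniqueKey", "1",
           "text", ["dublin", ["tf", 2, "df", 5], "guinness", ["tf", 1, "df", 3]]],
 "doc-2", ["uniqueKey", "2",
           "text", ["dublin", ["tf", 1, "df", 5]]]]
""")

cloud = term_frequencies(sample, "text")
print(cloud.most_common())
```

The resulting Counter maps each term to its summed tf, so most_common() gives the terms in the order a tag cloud would weight them. A single pass like this stays linear in the size of the response.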

Any suggestions?

Thanks,

Aodh.