I am about to implement a custom query that is sort of a mash-up of Facets,
Highlighting, and SpanQuery - but thought I'd see if anyone has done
anything similar.
In simple words, I need to facet on the next word given a target word.
For example, if my index only had the following 5 documents (co
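To make the "facet on the next word" idea concrete, here is a minimal sketch in plain Python rather than Solr. It only illustrates the counting that is being asked for; the function name and the sample documents are hypothetical and do not come from the original message.

```python
from collections import Counter

def next_word_facets(documents, target):
    """Count the words that immediately follow `target` across all documents."""
    target = target.lower()
    counts = Counter()
    for text in documents:
        words = text.lower().split()
        # Pair every word with its successor and keep the successors of `target`.
        for current, following in zip(words, words[1:]):
            if current == target:
                counts[following] += 1
    return counts

# Hypothetical sample documents, not taken from the thread.
docs = ["quick brown fox", "quick red fox", "quick brown dog"]
facets = next_word_facets(docs, "quick")
```

In Solr itself the same counts would have to come from positional information (spans or term vectors), which is why the question touches SpanQuery.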
Hello,
I am trying to get a list of highly unusual terms or phrases (for example a
TF of 1 or 2) within an entire index (essentially this would be the inverse
of how Luke gives 'top terms' on the 'Overview' tab).
I see how I can do this within a specific query using the Term Vector
Componen
Re: Listing Terms by Ascending IDF value . . ?
On Tue, Jan 5, 2010 at 9:15 AM, Christopher Ball <
christopher.b...@metaheuristica.com> wrote:
> Hello,
>
> I am trying to get a list of highly unusual terms or phrases (for example a
> TF of 1 or 2) within an entire index (essenti
Hoss,
Thanks for your reply.
As you pointed out the Terms Component alone with the terms.maxcount did the
trick for single terms.
And ShingleFilter did the trick for phrases.
I have not ventured into Hadoop just yet - any examples you could point me
to of simple map/reduce jobs?
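For anyone following along, the Terms Component request described above might be built roughly like this. This is only a sketch: the host, the /terms handler path, and the field name "text" are assumptions, not values from this thread. Note also that the counts the Terms Component reports are document frequencies rather than within-document term frequencies.

```python
from urllib.parse import urlencode

def rare_terms_url(base_url, field, max_count=2, limit=100):
    """Build a TermsComponent URL that lists terms found in at most `max_count` docs."""
    params = {
        "terms": "true",
        "terms.fl": field,             # field whose terms are listed
        "terms.maxcount": max_count,   # upper bound on document frequency
        "terms.limit": limit,          # maximum number of terms returned
        "terms.sort": "index",         # index (lexicographic) order instead of count order
    }
    return base_url + "?" + urlencode(params)

# Hypothetical host, handler path, and field name.
url = rare_terms_url("http://localhost:8983/solr/terms", "text")
```

Pointing the same request at a shingled field would list rare phrases instead of rare single terms.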
I am about to attempt implementing SpanQuery in Solr 1.4.
I noticed there is a JIRA to add it in 1.5:
* https://issues.apache.org/jira/browse/SOLR-1337
I also noticed a couple of email threads from Grant and Yonik about trying
to implement it such as:
* http://
query must be span queries, and most query parsers generate
non-span queries. I think there is code in the highlighter that uses
spans that can do this conversion.
-Yonik
http://www.lucidimagination.com
On Wed, Jan 27, 2010 at 12:24 PM, Christopher Ball
wrote:
> I am about to at
I am curious how I can query for multi-term phrases using the
TermsComponent?
The field I am searching has been shingled so it contains 2 and 3 word
phrases.
For example in the sample results below I want to only get back multi-word
phrases such as "table of contents" and "under the" but no
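One possible workaround, sketched here on the client side rather than inside Solr, is to fetch the terms and keep only those containing more than one word. This assumes the shingles are joined with a single space (the ShingleFilter default separator); the counts in the sample are made up for illustration.

```python
def multiword_terms(term_counts, min_words=2):
    """Keep only terms made of at least `min_words` whitespace-separated words."""
    return {
        term: count
        for term, count in term_counts.items()
        if len(term.split()) >= min_words
    }

# Terms as they might come back from a shingled field; counts are hypothetical.
sample = {"table": 5, "table of contents": 3, "under the": 4, "under": 7}
phrases = multiword_terms(sample)
```

The drawback is that single-word terms are still transferred and count against terms.limit, so a server-side filter would be preferable where one is available.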
I am perplexed by the behavior I am seeing of the Solr Analyzer and Filters
with regard to Underscores.
1) I am trying to get rid of them when shingling, but seem unable to do so
with a Stopwords Filter.
And yet they are being removed, when I am not even trying to, by the
WordDelimiter Filter
I am perplexed by the behavior I am seeing of the Solr Analyzer and Filters
with regard to Underscores.
I am trying to get rid of underscores('_') when shingling, but seem unable
to do so with a Stopwords Filter.
And yet underscores are being removed when I am not even trying to by the
WordDelimi
Unfortunately, the underscore is being quite resilient =(
I tried the solr.MappingCharFilterFactory and know the mapping is working as
I am changing "c" => "q" just fine. But the underscore refuses to go!
I am baffled . . .
-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.co
I think I am making some progress - the key suggestion was to look at the
analysis.jsp which I foolishly had forgotten =(.
I think it is actually a bug in the ShingleFilterFactory when it is used
subsequent to another filter which removes tokens, e.g. StopFilterFactory or
WordDelimiterFilterFactory.
How can I index an entire phrase and not its constituent parts?
I want to index collations as a single term in the index, and not as the
multiple terms that comprise the phrase, for example, I want to index: "as
much as" but not the independent parts: "as", "much", "as".
Any guidance appr
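The desired behavior can be sketched outside Solr as a tokenizer that emits listed phrases as single tokens and drops everything else. This is a plain-Python illustration under the assumption that the phrases come from a simple list; it is not an existing Solr filter, and the function name is hypothetical.

```python
def phrase_only_tokens(text, phrases):
    """Emit each listed phrase found in `text` as one token; drop all other words."""
    words = text.lower().split()
    # Try longer phrases first so they win over shorter overlapping ones.
    phrase_words = sorted(
        (p.lower().split() for p in phrases if p.split()),
        key=len,
        reverse=True,
    )
    tokens = []
    i = 0
    while i < len(words):
        for pw in phrase_words:
            if words[i:i + len(pw)] == pw:
                tokens.append(" ".join(pw))
                i += len(pw)
                break
        else:
            i += 1  # word is not the start of a listed phrase, so it is not indexed
    return tokens

tokens = phrase_only_tokens("I like it as much as you do", ["as much as"])
```

Here "as much as" comes out as a single token, and the standalone "as" and "much" are never emitted.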
How can I count the total number of a specific term's occurrences?
How can you get the total number of occurrences of a term across all
documents (e.g. Sum of the number of occurrences of a specific term in each
doc)?
For example, I have 3 documents, document #1 has "The green bird is flyin
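To pin down the number being asked for, here is a small plain-Python sketch that sums a term's occurrences over every document, next to the document frequency for contrast. The sample documents are hypothetical and are not the ones from the truncated example.

```python
from collections import Counter

def total_occurrences(documents, term):
    """Sum the number of times `term` occurs in each document."""
    term = term.lower()
    return sum(Counter(doc.lower().split())[term] for doc in documents)

def document_frequency(documents, term):
    """Count the documents that contain `term` at least once."""
    term = term.lower()
    return sum(1 for doc in documents if term in doc.lower().split())

# Hypothetical documents for illustration.
docs = ["the green bird is green", "a green leaf", "no match here"]
total = total_occurrences(docs, "green")
df = document_frequency(docs, "green")
```

With these documents the total is 3 while the document frequency is 2, which is the difference the question is getting at.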
nizersTokenFilters
HTH
<http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters>
Erick
On Thu, Mar 4, 2010 at 2:31 PM, Christopher Ball <
christopher.b...@metaheuristica.com> wrote:
> How can I Index an entire Phrases and not it's constituent parts?
>
>
>
"
> without giving more details about the "X" so that we can understand the
> full issue. Perhaps the best solution doesn't involve "Y" at all?
>
> See Also: http://www.perlmonks.org/index.pl?node_id=542341
>
> Erick
>
>
> On Tue, Mar 9, 2010 at 6:
Thank you for the idea Mitch, but it just doesn't seem right that I should
have to revert to Scoring when what I really need seems so fundamental.
Logically, what I want is a "phrase filter factory" that would match on
phrases listed in a file, like stopwords, but in this case index the match
and