I am about to implement a custom query that is sort of a mash-up of Facets,
Highlighting, and SpanQuery - but thought I'd see if anyone has done
anything similar.
In simple words, I need to facet on the next word given a target word.
For example, if my index only had the following 5 documents (co
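To make the "facet on the next word" idea concrete, here is a minimal sketch in plain Python, outside Solr. It assumes whitespace tokenization and lowercasing; the function name `next_word_facets` and the sample documents are hypothetical and not taken from the original message.

```python
from collections import Counter

def next_word_facets(docs, target):
    """Count the words that immediately follow `target` across all docs."""
    counts = Counter()
    target = target.lower()
    for doc in docs:
        tokens = doc.lower().split()  # assumption: simple whitespace tokenization
        for current, following in zip(tokens, tokens[1:]):
            if current == target:
                counts[following] += 1
    return counts

# Hypothetical sample documents, for illustration only.
docs = ["the green bird", "the red bird", "the green tree"]
facets = next_word_facets(docs, "the")
```

Each key of `facets` is a candidate facet value and each count is how often that word directly followed the target.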
Thank you for the idea Mitch, but it just doesn't seem right that I should
have to revert to Scoring when what I really need seems so fundamental.
Logically, what I want is a "phrase filter factory" that would match on
phrases listed in a file, like stopwords, but in this case index the match
and
> without giving more details about the "X" so that we can understand the
> full issue. Perhaps the best solution doesn't involve "Y" at all?
>
> See Also: http://www.perlmonks.org/index.pl?node_id=542341
>
> Erick
>
>
> On Tue, Mar 9, 2010 at 6:
nizersTokenFilters
HTH
<http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters>
Erick
On Thu, Mar 4, 2010 at 2:31 PM, Christopher Ball <
christopher.b...@metaheuristica.com> wrote:
> How can I index entire phrases and not their constituent parts?
>
>
>
How can I count the total number of a specific term's occurrences?
How can you get the total number of occurrences of a term across all
documents (i.e. the sum of the number of occurrences of a specific term in
each doc)?
For example, I have 3 documents, document #1 has "The green bird is flyin
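The quantity being asked for can be sketched in plain Python as below. This only illustrates the arithmetic and is not a Solr API; the function name `total_term_occurrences` and the sample documents are hypothetical. Note that it differs from document frequency, which counts each matching document once.

```python
def total_term_occurrences(docs, term):
    """Sum the per-document occurrence counts of `term` over all docs."""
    term = term.lower()
    # assumption: simple whitespace tokenization and lowercasing
    return sum(doc.lower().split().count(term) for doc in docs)

def document_frequency(docs, term):
    """Count the documents that contain `term` at least once."""
    term = term.lower()
    return sum(1 for doc in docs if term in doc.lower().split())

# Hypothetical sample documents, for illustration only.
docs = ["the green bird is flying", "green green grass", "a red bird"]
total = total_term_occurrences(docs, "green")
doc_freq = document_frequency(docs, "green")
```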
How can I index entire phrases and not their constituent parts?
I want to index collocations as a single term in the index, and not as the
multiple terms that comprise the phrase. For example, I want to index "as
much as" but not the independent parts "as", "much", "as".
Any guidance appr
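One way to picture the desired behavior is a token filter that merges known phrases into single tokens before indexing. The sketch below is plain Python, not a Solr filter; `merge_phrases` and the phrase list are hypothetical, and it assumes a greedy longest-match strategy.

```python
def merge_phrases(tokens, phrases):
    """Replace each known phrase (a tuple of words) with one joined token.

    Longer phrases are tried first; words outside any phrase pass through.
    """
    max_len = max((len(p) for p in phrases), default=1)
    out = []
    i = 0
    while i < len(tokens):
        for n in range(max_len, 1, -1):
            candidate = tuple(tokens[i:i + n])
            if len(candidate) == n and candidate in phrases:
                out.append(" ".join(candidate))  # emit the phrase as one term
                i += n
                break
        else:
            out.append(tokens[i])  # no phrase starts here; keep the word
            i += 1
    return out

# Hypothetical phrase list, for illustration only.
phrases = {("as", "much", "as")}
merged = merge_phrases("twice as much as before".split(), phrases)
```

Here `merged` contains "as much as" as a single entry, and the individual words "as" and "much" inside the phrase are not emitted.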
I think I am making some progress - the key suggestion was to look at the
analysis.jsp which I foolishly had forgotten =(.
I think it is actually a bug in the ShingleFilterFactory when it is used
after another filter that removes tokens, e.g. StopFilterFactory or
WordDelimiterFilterFactory.
Unfortunately, the underscore is being quite resilient =(
I tried the solr.MappingCharFilterFactory and know the mapping is working as
I am changing "c" => "q" just fine. But the underscore refuses to go!
I am baffled . . .
-----Original Message-----
From: Ahmet Arslan [mailto:iori...@yahoo.co
I am perplexed by the behavior I am seeing of the Solr Analyzer and Filters
with regard to Underscores.
I am trying to get rid of underscores ('_') when shingling, but seem unable
to do so with a Stopwords Filter.
And yet underscores are being removed, when I am not even trying to remove
them, by the
WordDelimi
I am perplexed by the behavior I am seeing of the Solr Analyzer and Filters
with regard to Underscores.
1) I am trying to get rid of them when shingling, but seem unable to do so
with a Stopwords Filter.
And yet they are being removed, when I am not even trying to remove them, by
the WordDelimiter Filter
I am curious how I can query for multi-term phrases using the
TermsComponent?
The field I am searching has been shingled so it contains 2 and 3 word
phrases.
For example in the sample results below I want to only get back multi-word
phrases such as "table of contents" and "under the" but no
query must be span queries, and most query parsers generate
non-span queries. I think there is code in the highlighter that uses
spans that can do this conversion.
-Yonik
http://www.lucidimagination.com
On Wed, Jan 27, 2010 at 12:24 PM, Christopher Ball
wrote:
> I am about to at
I am about to attempt implementing the SpanQuery in Solr 1.4.
I noticed there is a JIRA to add it in 1.5:
* https://issues.apache.org/jira/browse/SOLR-1337
I also noticed a couple of email threads from Grant and Yonik about trying
to implement it such as:
* http://
Hoss,
Thanks for your reply.
As you pointed out, the Terms Component along with terms.maxcount did the
trick for single terms.
And ShingleFilter did the trick for phrases.
I have not ventured into Hadoop just yet - any examples you could point me
to of simple map/reduce jobs?
: Re: Listing Terms by Ascending IDF value . . ?
On Tue, Jan 5, 2010 at 9:15 AM, Christopher Ball <
christopher.b...@metaheuristica.com> wrote:
> Hello,
>
> I am trying to get a list of highly unusual terms or phrases (for example a
> TF of 1 or 2) within an entire index (essenti
Hello,
I am trying to get a list of highly unusual terms or phrases (for example a
TF of 1 or 2) within an entire index (essentially this would be the inverse
of how Luke gives 'top terms' on the 'Overview' tab).
I see how I can do this within a specific query using the Term Vector
Componen
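As a rough illustration of "the inverse of top terms", the sketch below counts every term over a whole collection and keeps only those at or below a threshold, rarest first. It is plain Python and not a Solr or Luke feature; `rare_terms`, `max_count`, and the sample documents are hypothetical.

```python
from collections import Counter

def rare_terms(docs, max_count=2):
    """Return terms occurring at most `max_count` times, rarest first."""
    # assumption: simple whitespace tokenization and lowercasing
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    rare = (term for term, count in counts.items() if count <= max_count)
    # sort by count, then alphabetically so the order is deterministic
    return sorted(rare, key=lambda term: (counts[term], term))

# Hypothetical sample documents, for illustration only.
docs = ["the green bird", "the red bird", "the bird sings"]
unusual = rare_terms(docs, max_count=2)
```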