Hi,
I've combined the WordDelimiterFilter with the PositionFilter to prevent the
creation of expensive Phrase and MultiPhraseQueries. But if I now parse an
escaped string consisting of two terms, the analyser returns a BooleanQuery.
That's not what I would expect. If a string is escaped, I would
Hi Robert,
> On Fri, Sep 24, 2010 at 3:54 AM, Mathias Walter wrote:
>
> > Hi,
> >
> > I've combined the WordDelimiterFilter with the PositionFilter to prevent the
> > creation of expensive Phrase and MultiPhraseQueries. But
> > if I now parse an es
Hi Max,
Why don't you use WordDelimiterFilterFactory directly? I'm doing the same
stuff inside my own analyzer:
final Map<String, String> args = new HashMap<String, String>();
args.put("generateWordParts", "1");
args.put("generateNumberParts", "1");
args.put("catenateWords", "0");
args.put("catenateNumbers", "0");
args.put("ca
Hi,
Does a field that should be cached need to be indexed?
I have a binary field which is just stored. Retrieving it via
FieldCache.DEFAULT.getTerms returns empty BytesRefs.
Then I found the following post:
http://www.mail-archive.com/d...@lucene.apache.org/msg05403.html
How can I use the Fi
ving
> it is usually a rare enough operation that caching is irrelevant.
>
> This smells like an XY problem, see:
> http://people.apache.org/~hossman/#xyproblem
>
> If this seems like gibberish, could you explain your problem
> a little more?
>
> Best
> Erick
>
>
Hi,
> On Mon, Oct 25, 2010 at 3:41 AM, Mathias Walter wrote:
> > I indexed about 90 million sentences and the PAS (predicate argument
> > structures) they consist of (which are about 500 million). Then
> > I try to do NER (named entity recognition) by searching about 5 mi
Hi,
> > [...] I tried to use IndexableBinaryStringTools to re-encode my 11 byte
> > array. The size was increased to 7 characters (= 14 bytes)
> > which is still a gain of more than 50 percent compared to the UTF8
> > encoding. BTW: I found no sample how to use the
> > IndexableBinaryStringTools c
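
The size figures quoted above (11 bytes in, 7 characters or 14 bytes out) can be checked with a small standalone sketch. It does not call Lucene. It only reproduces the length arithmetic, assuming the encoder packs 15 payload bits into each char and appends one trailing char, which is my reading of how IndexableBinaryStringTools computes its encoded length. The class and method names (EncodedLengthSketch, encodedChars, encodedBytes) are hypothetical and are not part of any library.

```java
// Sketch only: reproduces the length arithmetic, not the encoder itself.
final class EncodedLengthSketch {

    // Assumption: 15 payload bits per char, plus one trailing char.
    // Long arithmetic avoids overflow for very large inputs.
    static int encodedChars(int inputBytes) {
        return (int) ((8L * inputBytes + 14L) / 15L) + 1;
    }

    // Java chars are 16-bit, so each one takes two bytes in memory.
    static int encodedBytes(int inputBytes) {
        return 2 * encodedChars(inputBytes);
    }

    public static void main(String[] args) {
        int input = 11; // the 11-byte array from the quoted message
        System.out.println(encodedChars(input) + " chars, "
                + encodedBytes(input) + " bytes");
    }
}
```

For the 11-byte input this gives 7 chars and 14 bytes, matching the numbers in the quoted message. How many bytes those chars take on disk depends on how the index serializes them, which this sketch does not model.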