Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?

2010-10-03 Thread Gert Brinkmann
On 03.10.2010 09:20, Andy wrote: > NGramFilterFactory would then take that one token ("electric guitar") and generate N-grams out of it. One of the ngrams would be "guit" because "guit" is a substring of "electric guitar". AFAIK it only produces prefix-strings like gui guit guita guitar etc. So
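For illustration, here is a small plain-Python sketch (not Solr code) of the two behaviours being discussed: generating every substring of a token versus generating only its prefixes. The function names and gram sizes are made up for this example. As far as I understand the Solr filters, the first corresponds to NGramFilterFactory and the second to EdgeNGramFilterFactory, but treat that mapping as an assumption and check it against the analysis page of your own Solr version.

```python
def ngrams(text, min_size, max_size):
    """Return every substring of length min_size..max_size (hypothetical helper)."""
    grams = []
    for size in range(min_size, max_size + 1):
        for start in range(len(text) - size + 1):
            grams.append(text[start:start + size])
    return grams


def edge_ngrams(text, min_size, max_size):
    """Return only substrings anchored at the start of the text (hypothetical helper)."""
    longest = min(max_size, len(text))
    return [text[:size] for size in range(min_size, longest + 1)]


token = "electric guitar"

# "guit" occurs in the middle of the token, so only the full n-gram list contains it.
print("guit" in ngrams(token, 3, 6))
print("guit" in edge_ngrams(token, 3, 6))

# Prefix-only grams of a single word, as in the "gui guit guita guitar" example.
print(edge_ngrams("guitar", 3, 6))
```

If "electric guitar" is indexed as one token, prefix-only grams all start with "ele...", which is why the distinction matters for matching the middle of a tag.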

Re: Prefix-Search with Stopwords - no results?

2010-05-30 Thread Gert Brinkmann
On 28.05.2010 22:06, Chris Hostetter wrote: > and one "text_prefix" defined similarly but with an additional EdgeNGramTokenFilter used when indexing to generate "prefix" tokens. then search those fields using dismax... To be sure that I understand this right: Am I right that I should not stopwor

Re: Prefix-Search with Stopwords - no results?

2010-05-29 Thread Gert Brinkmann
Thank you, Chris and Erick, for the answers, it was new to me that "the*" is expanded to all known the* words in the index. Good to know. And yes, the AND operation between the query terms is certainly the problem. (I would like to switch to OR instead. The result set will grow the more wo
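A rough sketch of what "expanded to all known the* words in the index" means, written in plain Python rather than taken from Lucene or Solr: the prefix is looked up in the sorted list of indexed terms and every term starting with it is collected. The term list and the helper name `expand_prefix` are invented for this example.

```python
from bisect import bisect_left


def expand_prefix(sorted_terms, prefix):
    """Collect every term in a sorted term list that starts with prefix (hypothetical helper)."""
    start = bisect_left(sorted_terms, prefix)  # first position where prefix could occur
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # terms are sorted, so no later term can match
        matches.append(term)
    return matches


# Made-up index vocabulary, kept sorted like a term dictionary.
index_terms = sorted(["than", "the", "theater", "their", "them", "then", "this"])

expanded = expand_prefix(index_terms, "the")
print(expanded)
```

Only terms that actually exist in the index can come out of such an expansion, so a term that was dropped at index time cannot be matched by the prefix query.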

Prefix-Search with Stopwords - no results?

2010-05-28 Thread Gert Brinkmann
Hello, I am having some problems with solr 1.4. I am indexing and querying data using the following fieldType: The ap

Re: utf 8 issue

2009-02-18 Thread Gert Brinkmann
revathy arun wrote: > Is there any way to check the encoding of a text/pdf document or convert > them to utf -8 encoding? If you are using pdftotext you could set the enc parameter: pdftotext -enc UTF-8 filename How can you convert PDFs to text via xpdf programmatically? Greetings, Gert
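One possible way to run the pdftotext call above from a program is to start it as a subprocess. This is only a sketch, assuming the pdftotext binary is on the PATH and that passing "-" as the output file sends the text to stdout; the helper names are invented for this example.

```python
import subprocess


def pdftotext_command(pdf_path, encoding="UTF-8"):
    """Build the pdftotext argument list (hypothetical helper).

    "-" as the output file is assumed to make pdftotext write to stdout.
    """
    return ["pdftotext", "-enc", encoding, pdf_path, "-"]


def pdf_to_text(pdf_path, encoding="UTF-8"):
    """Run pdftotext and return the extracted text as a Python string.

    Assumes the -enc name is also a valid Python codec name, which holds for UTF-8.
    """
    result = subprocess.run(
        pdftotext_command(pdf_path, encoding),
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    return result.stdout.decode(encoding)


# Example call (needs pdftotext installed and an existing file):
# text = pdf_to_text("filename.pdf")
```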

Re: commit very long ?? solr

2009-02-05 Thread Gert Brinkmann
sunnyfr wrote: > Yes the average is 12 docs seconde updated. In our case with indexing normal web-pages on a normal workstation we have about 10 docs per second (updating + committing). This feels quite long. But if this is normal... ok. > I actually reduce warmup and cache, it works fine now, I

Highlighting on Prefix-Search Bug/Workaround (Re: query with stemming, prefix and fuzzy?)

2009-02-04 Thread Gert Brinkmann
Mark Miller wrote: >> Currently I think about dropping the stemming and only use >> prefix-search. But as highlighting does not work with a prefix "house*" >> this is a problem for me. The hint to use "house?*" instead does not >> work here. >> > Thats because wildcard queries are also not high

Re: query with stemming, prefix and fuzzy?

2009-01-30 Thread Gert Brinkmann
Mark Miller wrote: > Try hitting /solr/admin/luke and see what it says. Oh, interesting. I think I have to check the stopword list. Is there a way to filter single characters like the "h"? text_de_de ITS-- ITS-- 2340 57971 1454 1016 1008 980 927 924 895 843 730 730

Re: query with stemming, prefix and fuzzy?

2009-01-30 Thread Gert Brinkmann
Mark Miller wrote: > Yeah, sounds small. Its odd you would see such slow performance. It > depends though. You may still have a *lot* of unique terms in there. Is there a way to retrieve the list of terms in the index? Gert

Re: query with stemming, prefix and fuzzy?

2009-01-30 Thread Gert Brinkmann
Thanks, Mark, for your answer, Mark Miller wrote: > Truncation queries and stemming are difficult partners. You likely have > to accept compromise. You can try using multiple fields like you are, I already have multiple fields, one per language, to be able to use different stemmers. Wouldn't bec

Re: query with stemming, prefix and fuzzy?

2009-01-29 Thread Gert Brinkmann
Gert Brinkmann wrote: >> A) fuzzy search >> >> What can I do to speed up the fuzzy query? Setting ramBufferSizeMB to a higher value seems to speed up the query slightly. I have to continue with tuning though. >> B) combine stemming, prefix and fuzzy search >> >

Re: query with stemming, prefix and fuzzy?

2009-01-29 Thread Gert Brinkmann
Shalin Shekhar Mangar wrote: > Quite the opposite, you are actually working with some advanced stuff :) Thank you for the response. > Please have some patience, someone is Ok, I will have (what else could I do? ;) ). Meanwhile I will try some things and continue to search the web. Greetings

Re: query with stemming, prefix and fuzzy?

2009-01-28 Thread Gert Brinkmann
Hello again, is there nobody who could help me with this? Or is it an FAQ and my questions are dumb somehow? Maybe I should try to shorten the questions: ;) > A) fuzzy search > > What can I do to speed up the fuzzy query? > B) combine stemming, prefix and fuzzy search > > Is there a way to

query with stemming, prefix and fuzzy?

2009-01-27 Thread Gert Brinkmann
Hello, I am trying to get Solr to work properly. I have set up a Solr test server (using jetty as mentioned in the tutorial). Also I had to modify the schema.xml so that I have different fields for different languages (with their own stemmers) that occur in the content management system that I am