Re: Wildcard queries and custom char filter

2013-12-18 Thread michallos
Oh, I can see that when the query contains wildcards, KeywordTokenizerFactory is used instead of StandardTokenizerFactory. I created a custom wildcard-removing char filter for a few specific cases (so I cannot use any of the regex replace filters), but even with that, KeywordTokenizerFactory is used. I though

Re: Wildcard queries and custom char filter

2013-12-18 Thread michallos
It works! Thanks. Last question: how can I invoke the charFilter before the tokenizer? I can see that with StandardTokenizerFactory and no wildcards, the text "123-abc" is broken into two tokens, "123" and "abc", but the text "*123-abc*" remains unchanged as "*123-abc*". Is it possible to use a charFilter before
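
To make the ordering in question concrete, here is a minimal sketch in plain Java. It does not use the Lucene/Solr API: `stripWildcards`, `tokenize`, and `analyze` are hypothetical names, the wildcard set (`*` and `?`) is an assumption, and the splitter only approximates a tokenizer (the real StandardTokenizer follows Unicode segmentation rules and may behave differently). It only illustrates the effect of removing wildcard characters before tokenizing.

```java
import java.util.ArrayList;
import java.util.List;

public class WildcardStripSketch {

    // Hypothetical stand-in for a custom char filter: drops '*' and '?'.
    static String stripWildcards(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            if (c != '*' && c != '?') {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Rough stand-in for a tokenizer: splits on runs of characters
    // that are neither letters nor digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Char filter first, tokenizer second.
    static List<String> analyze(String input) {
        return tokenize(stripWildcards(input));
    }

    public static void main(String[] args) {
        System.out.println(analyze("123-abc"));
        System.out.println(analyze("*123-abc*"));
    }
}
```

With the filter applied first, both inputs reach the splitter as "123-abc" and yield the same two tokens.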

Wildcard queries and custom char filter

2013-12-18 Thread michallos
Hello, I have a problem configuring a custom char filter. When there are no wildcards in the query, my filter is invoked. When there are wildcards, my filter is not invoked. Is it possible to configure a charFilter to be used with wildcard queries? I can see that with wildcards, TokenizerChain.c

Re: Constantly increasing time of full data import

2013-12-12 Thread michallos
One more stack trace which is active during indexing. This task is also executed on the same single-threaded executor that registers the new searcher: "searcherExecutor-48-thread-1" prio=10 tid=0x7f24c0715000 nid=0x3de6 runnable [0x7f24b096d000] java.lang.Thread.State: RUNNABLE

Re: Constantly increasing time of full data import

2013-12-11 Thread michallos
I took a few thread dumps and here are the results: - the services which are indexing are stuck on this stack trace: "cmdDistribExecutor-3-thread-17669" prio=10 tid=0x7f1aae4a6800 nid=0x44a9 runnable [0x7f1a6c0f6000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRe

Re: Constantly increasing time of full data import

2013-12-09 Thread michallos
"on production" - no I can't profile it (because of huge overhead) ... Maybe with dynamic tracing but we can't do it right now. After server restart, delta time reset to 15-20 seconds so it is not caused by the mergeFactor. We have SSD and 70GB RAM (it is enough for us). -- View this message in

Re: Constantly increasing time of full data import

2013-12-03 Thread michallos
This occurs only in the production environment, so I can't profile it :-) Any clues? DirectUpdateHandler2 config: autoCommit maxTime 15000, openSearcher false; updateLog dir ${solr.ulog.dir:} -- View this message in context: http://lucene.472066.n3.nabble.com/Constantly-increasing-time-of-full-data-import-tp4103873p4104722.htm

Re: Constantly increasing time of full data import

2013-12-02 Thread michallos
Update: I can see that the times increase when the search load is higher. During nights and weekends the full load times don't increase. So it is not caused by the number of documents being loaded (during weekends we have the same number of new documents) but by the number of queries per minute. Anyone observe

Re: Constantly increasing time of full data import

2013-11-29 Thread michallos
One more piece of information that may be important: this index is divided into 64 logical shards (replication factor 4). -- View this message in context: http://lucene.472066.n3.nabble.com/Constantly-increasing-time-of-full-data-import-tp4103873p4103874.html Sent from the Solr - User mailing list archive at Nabbl

Constantly increasing time of full data import

2013-11-29 Thread michallos
Hi, in our SolrCloud-based application we run a full data import (used as a delta import) every minute (http://HOST:PORT/solr/collection1/dataimport/?command=full-import&commit=true&optimize=false&clean=false&synchronous=true). SolrCloud is deployed on 4 nodes. When Solr starts, this full import takes

Re: Lowercase field

2013-11-29 Thread michallos
Unfortunately, indexing takes more than 3 days (hundreds of millions of documents), so it is impossible to do that right now. Any other ideas for doing a simple .toLowerCase() on one field? -- View this message in context: http://lucene.472066.n3.nabble.com/Lowercase-field-tp4103848p4103866.html Sent f

Lowercase field

2013-11-29 Thread michallos
Hi, how can I make one of the fields in a query lower case? This field is of type StrField and, very importantly, I can't change the schema (for example to a TextField with a LowerCase filter). I also can't change the query that is passed over HTTP. Is it possible to do that in configuration? Example: