Re: Endless 100% CPU usage on searcherExecutor thread

2014-12-24 Thread heaven
In general we do not have overly complex filters, but I decreased the filterCache autowarm count to 256 and will see how it performs over a month or so before making any further changes. It also seems that adding more shards could improve the situation. We have 16 CPU cores and SSD RAID 10, so I think it

Re: SolrCloud & Paging on large indexes

2014-12-24 Thread heaven
Would be cool to have ability to get not only the next page cursor, but next page cursors, or a set of cursors for a given window, so we can draw page numbers. Not sure about the last page though. -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Paging-on-large-inde

Re: Endless 100% CPU usage on searcherExecutor thread

2014-12-23 Thread heaven
We do not use dates here, at least not too often. Usually it's something like type:Profile (we use it from the Rails application, so type describes model names), opted_in:true, etc. Solr wasn't running for long though, so this might not show the real state. Currently for the filter cache it shows

Re: Endless 100% CPU usage on searcherExecutor thread

2014-12-22 Thread heaven
It is getting better now with smaller caches like this: filterCache class:org.apache.solr.search.FastLRUCache version:1.0 description:Concurrent LRU Cache(maxSize=4096, initialSize=512, minSize=3686, acceptableSize=3891, cleanupThread=false, autowarmCount=256, regenerator=org.apache.solr.search.Sol

Re: SolrCloud & Paging on large indexes

2014-12-22 Thread heaven
I have had a very bad experience with pagination on collections larger than a few million documents. Pagination becomes very, very slow. Just tried to switch to page 76662 and it took almost 30 seconds. Solr now supports cursors which work fast and are useful for exports and some data processin
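A minimal sketch of the cursor-based paging mentioned above, assuming Solr's documented `cursorMark` / `nextCursorMark` parameters. The `fetch` callable, the function name, and the `id` sort field are hypothetical: `fetch` stands in for whatever HTTP client sends the parameters to `/select` and returns the decoded JSON response, and the sort must include the collection's uniqueKey field.

```python
from typing import Callable, Dict, Iterator, List


def iterate_with_cursor(
    fetch: Callable[[Dict[str, str]], dict],
    query: str,
    sort: str = "id asc",  # assumption: "id" is the uniqueKey field
    rows: int = 100,
) -> Iterator[List[dict]]:
    """Yield pages of documents until the cursor stops advancing."""
    cursor = "*"  # starting value for the first request
    while True:
        params = {
            "q": query,
            "sort": sort,
            "rows": str(rows),
            "cursorMark": cursor,
        }
        response = fetch(params)
        docs = response["response"]["docs"]
        if docs:
            yield docs
        next_cursor = response["nextCursorMark"]
        # Solr returns the same mark again once there is nothing left to read.
        if next_cursor == cursor:
            break
        cursor = next_cursor
```

Each request costs roughly the same regardless of how deep into the result set it is, which is why this avoids the slowdown seen with large `start` offsets. The trade-off is that cursors only move forward, so jumping straight to an arbitrary page number is not possible this way.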

Re: Endless 100% CPU usage on searcherExecutor thread

2014-12-19 Thread heaven
Okay, thanks for the suggestion, will try to decrease the caches gradually. Each node has near 50 000 000 docs, perhaps we need more shards... We had smaller caches before but that was leading to bad feedback from our users. Besides our application users we also use Solr internally for data analyz

Re: Endless 100% CPU usage on searcherExecutor thread

2014-12-19 Thread heaven
Thanks, halved the caches, increased the heap size to 16G, configured Huge Pages and added these options: -XX:+UseConcMarkSweepGC -XX:+UseLargePages -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts -XX:CMSInitiatingOccupancyFraction=75 Be

Re: Endless 100% CPU usage on searcherExecutor thread

2014-12-19 Thread heaven
I have the following settings in my solrconfig.xml: What is the best way to calculate the optimal cache/heap sizes? I understand there's no common formula and all docs have different sizes, but -Xmx is already 12G. Thanks, Alex -- View this message in context: http://lucene.472066.n3.nabble.

Endless 100% CPU usage on searcherExecutor thread

2014-12-18 Thread heaven
Hi, We have 2 shards, each one has 2 replicas and each Solr instance has a single thread that constantly uses 100% of CPU: After restart it is running normally for some time (approximately until Solr comes close to Xmx limit),

Re: Help with a slow filter query

2014-10-07 Thread heaven
The syntax for frange query parser is weird: {!frange cache=false cost=200 l=2001-01-01T00:00:00Z u=2013-12-01T23:59:59Z}nominated_at_d or simply {!frange cache=false cost=200 u=2013-12-01T23:59:59Z}nominated_at_d And I don't see any docs in Solr wiki explaining this syntax, these examples above I
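As a sketch of the local-params layout in the two filters quoted above: the helper below (a hypothetical name, not part of Solr or any client library) only assembles the string. It assumes the `l`/`u` bounds and the `cache`/`cost` options behave as in the post and does not validate them.

```python
from typing import Optional


def frange_filter(
    field: str,
    lower: Optional[str] = None,
    upper: Optional[str] = None,
    cache: bool = False,
    cost: int = 200,
) -> str:
    """Build an {!frange ...} filter string; omitted bounds are left out."""
    local_params = [f"cache={'true' if cache else 'false'}", f"cost={cost}"]
    if lower is not None:
        local_params.append(f"l={lower}")
    if upper is not None:
        local_params.append(f"u={upper}")
    # Local params go inside {! ... }; the field (or function) follows the brace.
    return "{!frange " + " ".join(local_params) + "}" + field


print(frange_filter("nominated_at_d", upper="2013-12-01T23:59:59Z"))
```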

Re: Help with a slow filter query

2014-09-11 Thread heaven
Fields: Schema: http://pastie.org/pastes/9544427/text?key=974npslxxalvewhlyc4mg Query debug output: http://pastie.org/pastes/9544433/text?key=v9favzj2ulcaq

Help with a slow filter query

2014-09-11 Thread heaven
Hi, please help me figure out what's wrong with this query: http://pastie.org/pastes/9544433/text?key=v9favzj2ulcaq0qvorda Without cache=false and cost it takes more than 15 seconds. What's weird is that type:"Award::Nomination" takes a few milliseconds while created_at_d:[* TO 2014-09-08T23:59:59Z] tak

Re: SolrCloud : node recovery fails with "No registered leader was found"

2014-09-07 Thread heaven
Seeing the same thing after a crash of one ZK node (from 5): {code} org.apache.solr.common.SolrException: No registered leader was found after waiting for 4000ms , collection: crm-prod slice: shard1 at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:545)

Re: Help with StopFilterFactory

2014-09-02 Thread heaven
Jira issue: https://issues.apache.org/jira/browse/SOLR-6468 -- View this message in context: http://lucene.472066.n3.nabble.com/Help-with-StopFilterFactory-tp4153839p4156373.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Help with StopFilterFactory

2014-08-28 Thread heaven
Hello, Any thoughts on this? Should I open a Jira ticket? Or how can we get at least one of the Solr devs to look at this issue? Best, Alex

Re: Help with StopFilterFactory

2014-08-26 Thread heaven
They did: >> If this behavior does not fit the application needs, the query parser >> needs to be configured to not take position increments into account when >> generating phrase queries. Another one would be to write your own search engine maybe :)

Re: Help with StopFilterFactory

2014-08-26 Thread heaven
There is: admit that removing enablePositionIncrements was a bad idea and restore it. Why remove an option that has no alternatives because of those who get wrong results with it? I really don't understand this approach. And what should we do now, after spending lots of money on the integration

Re: Help with StopFilterFactory

2014-08-26 Thread heaven
So it sounds like a bug to me, doesn't it? The Internet is full of complaints about this issue, and why should we all suffer because of someone who didn't know when and how to use this feature and as a result got wrong data indexed? Who cares about it??? And why remove the option that is so useful for

Re: Help with StopFilterFactory

2014-08-26 Thread heaven
Hi, just tried your suggestion but get this error: And then I found the next: http://stackoverflow.com/questions/18668376/solr-4-4-stopfilterfactory-and-enablepositionincrements. I don't really know why they did so, the reason that "it can create broken token streams" doesn't fit in my mind. Per

Re: Help with StopFilterFactory

2014-08-25 Thread heaven
A valid search: http://pastie.org/pastes/9500661/text?key=rgqj5ivlgsbk1jxsudx9za An invalid search: http://pastie.org/pastes/9500662/text?key=b4zlh2oaxtikd8jvo5xaww What I found weird is that the valid query has: "parsedquery_toString": "+(url_words_ngram:\"twitter com zer0sleep\")" And the invali

Re: Help with StopFilterFactory

2014-08-24 Thread heaven
The problem is in #4: >> 4. if I index twitter.com/testuser and search for >> https://twitter.com/testuser I am getting 0 matches even though "https" >> should be filtered out by the StopFilterFactory. When I said that the stop filter factory "doesn't work" I mentioned that blacklisted words still

Re: Help with StopFilterFactory

2014-08-24 Thread heaven
Just a guess, but it seems that auto phrase generation and the stop filter factory don't know about each other. Here's the current field configuration: {code} {code}

Re: Help with StopFilterFactory

2014-08-24 Thread heaven
I don't see any confusions, the problem is clearly explained in the first post. The one confusion I had was with the autoGeneratePhraseQueries and my schema version, I didn't know about that attribute and that its behavior could differ per schema version. I think we now figured that out and I am us

Re: Help with StopFilterFactory

2014-08-24 Thread heaven
Unfortunately I can't change the operator, and a phrase query for "https://twitter.com/testuser" doesn't work either. It does work for "twitter.com/testuser" but that makes no sense since I then can simply use the old schema version or autoGeneratePhraseQueries=true and ask users to remove http/www fr

Re: Help with StopFilterFactory

2014-08-21 Thread heaven
With the 1.5 schema it works, but not as expected. I am indexing twitter.com/testuser and only need to get exact matches, not those that match "twitter" or "com", so my search results should contain just one record: * http://twitter.com/testuser but what I see with the 1.5 schema is: * http://twitter.

Re: Help with StopFilterFactory

2014-08-21 Thread heaven
Any ideas? Doesn't that seem like a bug?

Re: Help with StopFilterFactory

2014-08-20 Thread heaven
Hello, Yes, with schema version 1.5 all those examples that didn't work do work now. But results also include records that match by "com", "twitter", etc, which is not desirable. It seems we do need autoGeneratePhraseQueries="true" but also need to ignore blacklisted words. Is that somehow possi

Re: Help with StopFilterFactory

2014-08-20 Thread heaven
From this page: http://wiki.apache.org/solr/SchemaXml >> autoGeneratePhraseQueries=true|false (in schema version 1.4 and later >> this now defaults to false) Just checked, I've so this may be true by default?

Re: Help with StopFilterFactory

2014-08-20 Thread heaven
> What release of Solr? 4.8.1. > Do you have autoGeneratePhraseQueries="true" on the field? No, the config I've provided is the exact. > And when you said "But any of these does", did you mean "But NONE of these does"? Whoops, yes, fixed that. -- View this message in context: http://lucene.

Help with StopFilterFactory

2014-08-19 Thread heaven
Hi, I have the following text field: url_stopwords.txt looks like: http https ftp www So very simple. In the index I have: * twitter.com/testuser All these queries do match: * twitter.com/testuser * com/testuser * testuser But none of these do: * https://twitter.com/testuser * h

Re: Search results inconsistency when using joins

2014-07-29 Thread heaven
Yup, that's known, added it for future Solr releases. But it seems this couldn't be the reason for such a discrepancy in results.

Re: Search results inconsistency when using joins

2014-07-29 Thread heaven
Just tried to remove joins and it worked as expected: q: ( _query_:"{!edismax qf='name_small_ngram' mm='1'}-foundation -association -organization -hospital -charity -news -info" AND ( _query_:"{!edismax qf='name_small_ngram emails_words_ngram sites_words_ngram rss_categories_texts twitter_

Search results inconsistency when using joins

2014-07-29 Thread heaven
I was thinking these 2 queries should yield same results: q: ( _query_:"{!edismax qf='name_small_ngram' mm='1'}-foundation -association -organization -hospital -charity -news -info" AND ( _query_:"{!edismax qf='name_small_ngram emails_words_ngram sites_words_ngram rss_categories_texts twitt

Re: java.net.SocketException: Connection reset

2014-07-07 Thread heaven
Yeah, the heap is huge, need to optimize the caches. It was 8Gb previously, had to increase it because there were out of memory errors. Using ConcMarkSweepGC, which is supposed to not lock the world. Had to disable optimize (previously we did so by a cron task) because the index is big and optimize h

Re: java.net.SocketException: Connection reset

2014-07-04 Thread heaven
Today this happened again, plus this one: null:java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113) at java.net.SocketOutputStream.write(SocketOutputStream.jav

Re: java.net.SocketException: Connection reset

2014-07-03 Thread heaven
Hello, usually the loading is not high at all: We're using bundled jetty and writing in batches by 50-100 documents and only using soft and auto commits. About clients, we have 4 processes and each could run up to 5 threads. At

java.net.SocketException: Connection reset

2014-07-03 Thread heaven
Hi, trying DigitalOcean for Solr, everything seems well, except sometimes I see these errors: java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:196) at java.net.SocketInputStream.read(SocketInputStream.java:122) at org.apach

Re: SolrCloud copy the index to another cluster.

2014-07-03 Thread heaven
Hi, sorry for the delay. Yes, we thought to simply copy the index over but this sounds risky and time consuming. Our index is too big to copy it over the internet quickly. We decided to re-index our data and then switch and re-index again. It's a pity there's no way to do this like with mysql :)

Re: SolrCloud copy the index to another cluster.

2014-06-24 Thread heaven
Zero read would be enough, we can safely stop index updates for a while. But have some API endpoints, where read downtime is very undesirable. Best, Alex

Re: SolrCloud copy the index to another cluster.

2014-06-24 Thread heaven
I've just realized that the old and new clusters use different installations, configs and lib paths. So the nodes from the new cluster will probably simply refuse to start using configs from the old ZooKeeper. Only if there is a way to run them with their own ZooKeeper and then manually add as replic

SolrCloud copy the index to another cluster.

2014-06-24 Thread heaven
Hello, We do have a running SolrCloud cluster, a simple set up of 4 nodes — 2 shards and 2 replicas and ≈ 140GB index. And now we have to move to another server and need to somehow copy existing index without downtime (if applicable). New config is exactly the same, same 4 nodes, same collections

Re: Solr to return the list of matched fields

2014-03-13 Thread heaven
Hi, thank you. While it is good for visual review, it is hard to work with this data. What I need is to build something like this: | Name | Twitter Profile | Topics | Site Title | Site Description | Site content | | John Doe | Yes| No | Yes | No |

Solr to return the list of matched fields

2014-03-10 Thread heaven
Hi, I have a few text fields indexed and when searching I need to know which field matched. For example I have fields: {code} full_name, site_source, tweets, rss_entries, etc {code} When searching I need to show results and show scores for each field. So a user can see exactly what content match th

RE: SOLR USING 100% percent CPU and not responding after a while

2014-01-28 Thread heaven
I have the same problem, please look at the image: And this is on idle. Index size is about 90Gb. Solr 4.4.0. Memory is not an issue, there's a lot. RAID 10 (15000RPM rapid hdd). -- View this message in context: http://luce

Re: Query time join with conditions

2014-01-16 Thread heaven
Nvm, figured it out. To match profiles that have "test entry" in own attributes or in related rss entries it is possible to use ({!join from=profile_ids_im to=id_i v=$rssQuery}Test entry) OR Test entry in "q" parameter, not in "fq". Thanks again for the help, Alex -- View this message in conte

Re: Query time join with conditions

2014-01-16 Thread heaven
Hi, thanks for the response. Seems almost figured things out. Since both Profiles and RssEntries are in the same index (same core), it is possible to either use `v=` param or specify `type:RssEntry` right after the closing `}`. Both will work: {!join from=profile_ids_im to=id_i}type:RssEntry or {!
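For illustration, the first join form quoted above can be assembled as below. This is only a string-building sketch: `join_filter` is a hypothetical helper, and the field names are the ones from the post.

```python
def join_filter(from_field: str, to_field: str, inner_query: str) -> str:
    """Build a query-time join: match inner_query, then map from_field -> to_field."""
    # The inner query is evaluated first (here against RssEntry documents);
    # values of from_field in those matches select documents by to_field.
    return "{!join from=%s to=%s}%s" % (from_field, to_field, inner_query)


# Profiles that have at least one related RssEntry document.
print(join_filter("profile_ids_im", "id_i", "type:RssEntry"))
```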

Re: Query time join with conditions

2014-01-14 Thread heaven
Can someone shed some light on this?

Query time join with conditions

2013-12-27 Thread heaven
Hello, I have one physical Solr collection and multiple logical collections in it. The separation is done by using the "type" field (Ruby on Rails application). So I have 2 logical collections: Profile and RssEntry and would not want to add RssEntries content to Profiles index. When I want to sear

Re: Help to figure out why query does not match

2013-10-10 Thread heaven
Hi Erick, I finally got back to this issue. Here is the wish I've created: https://issues.apache.org/jira/browse/SOLR-5332 Best, Alex

Help to figure out why query does not match

2013-08-28 Thread heaven
Hi, please help me figure out what's going on. I have the next field type: And the next string indexed: http://plus.google.com/111950520904110959061/profile Here is what the analyzer shows: http://img607.imageshack.us/img607/5074/fn1.png Then I d

Re: Edismax vs Dismax

2013-08-12 Thread heaven
Awesome, thank you. Would be great to see anything about this difference in wiki. For anyone else suffering from this, originally I started with: fulltext(query.gsub(/(\$|&|\+|,|\/|:|;|=|\?|@)/) { "\\#{$1}" }) do minimum_match(1) end But be careful with the "+" sign, since it could be used in
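The Ruby `gsub` above prefixes each of a fixed set of characters with a backslash. A rough Python equivalent, as a sketch only: the function name is hypothetical, and the character set is copied from the post rather than from Solr's full list of query-syntax characters.

```python
import re

# Same characters as in the Ruby pattern: $ & + , / : ; = ? @
_RESERVED = re.compile(r"([$&+,/:;=?@])")


def escape_reserved(query: str) -> str:
    """Prefix each listed character with a backslash."""
    # In the replacement template, \\ is a literal backslash and \1 is the match.
    return _RESERVED.sub(r"\\\1", query)


print(escape_reserved("facebook.com/profile.php?id=123456789"))
```

As the post warns, escaping `+` unconditionally also disables its use as the "required term" operator, so whether to keep it in the set depends on what users are expected to type.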

Edismax vs Dismax

2013-08-11 Thread heaven
Hi, the application I am working on switched to edismax parser and I found some weird behavior. I have this field: The string that is indexed is: facebook.com/profile.php?id=123456789 When I do use

Re: Overlapping onDeckSearchers=2

2013-05-27 Thread heaven
I am on 4.2.1 @Yonik Seeley I do understand the cost and run it once per 24 hours and perhaps later this interval will be increased up to a few days. In general I am optimizing not to merge the fragments but to remove deleted docs. My index refreshes quickly and number of deleted docs could reach

Re: Overlapping onDeckSearchers=2

2013-05-27 Thread heaven
Hi, thanks for the response. Seems like this is the case because there are no other applications that could fire commit/optimize calls. All commits are triggered by Solr and the optimize is triggered by a cron task. Because of all that it looks like a bug in Solr. It probably should not run co

Re: Overlapping onDeckSearchers=2

2013-05-25 Thread heaven
Hi, I am getting this warning with auto commits. My application does not send commits. I am optimizing the index once per day and it seems this warning appears every time the optimization process is launched. Best, Alex

Re: SolrCloud: IOException occured when talking to server at

2013-05-14 Thread heaven
Hi, thanks for the links and for your help. The server is now running third day in a row with no issues. What is done: 1. Applied these GC tuning options: -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80 2. Optimized the schema and index size (decreased at least 8 times). 3. Updated th

Re: SolrCloud: IOException occured when talking to server at

2013-05-10 Thread heaven
Again, just finished reindexing, server utilization was about 5-10%, and I started index optimization. As a result I have now lost (again) the entire index and got a lot of errors; they appear so fast and contain no useful information.

Re: Re: SolrCloud: IOException occured when talking to server at

2013-05-10 Thread heaven
be reduced at 30-50% after optimization. Best, Alex Friday 10 May 2013, you wrote: On 5/10/2013 2:00 AM, heaven wrote: > UPD: > Forget to confirm, we're using Solr 4.2.1 and will wait for Solr 4.3.1 or > for 4.4 as you advised. Solr 4.2.1 should be pretty stable. You ment

Re: SolrCloud: IOException occured when talking to server at

2013-05-10 Thread heaven
UPD: Forgot to confirm, we're using Solr 4.2.1 and will wait for Solr 4.3.1 or for 4.4 as you advised.

Re: SolrCloud: IOException occured when talking to server at

2013-05-10 Thread heaven
Hi Shawn, thank you for the reply and for your advice, will try all of it today. Some of it is already applied, i.e. "Stop other software" and "zkClientTimeout". Timeout set to 60 seconds, also reduced autowarm count and increased autoCommit interval to 5 minutes. Situation improved now and

Re: ColrCloud: IOException occured when talking to server at

2013-05-09 Thread heaven
Can confirm this led to data loss. I have 1217427 records in the database and only 1217216 indexed. Which means that Solr gave a successful response and then did not add some documents to the index. Seems like SolrCloud is not a production-ready solution, would be good if there was a warning in

Re: ColrCloud: IOException occured when talking to server at

2013-05-09 Thread heaven
Zookeeper log: 1 *2013-05-09 03:03:07,177* [myid:3] - WARN [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2183:Follower@118] - Got zxid 0x20001 expected 0x1 2 *2013-05-09 03:36:52,918* [myid:3] - ERROR [CommitProcessor:3:NIOServerCnxn@180] - Unexpected Exception: 3 java.nio.channels.Ca

Re: ColrCloud: IOException occured when talking to server at

2013-05-09 Thread heaven
Forgot to mention: Solr is 4.2 and ZooKeeper is 3.4.5. I do not do manual commits and prefer a softCommit each second and an autoCommit every 3 minutes. The problem happened again, lots of errors in logs and no description. Cluster state changed, on the shard 2 replica became a leader, former leader get in

ColrCloud: IOException occured when talking to server at

2013-05-09 Thread heaven
Hi, observing lots of these errors with SolrCloud Here is the instruction I am using to run services: zookeeper: 1: cd /opt/zookeeper/ 2: sudo bin/zkServer.sh start zoo1.cfg 3: sudo bin/zkServer.sh start zoo2.cfg 4: sudo bin/zkServer.sh start zoo3.cfg shards: 1: cd /opt/solr-cluster/sha

Re: Re: Re: Re: Shard update error when using DIH

2013-05-09 Thread heaven
Thank you all, guys. Your advice works great and I don't see any errors in Solr logs anymore. Best, Alex Monday 29 April 2013, you wrote: On 29 April 2013 14:55, heaven <[hidden email][1]> wrote: > Got these errors after switching the field type to long: >

Re: Re: Re: Re: Shard update error when using DIH

2013-04-29 Thread heaven
Whoops, yes, that works. Will check if that helped to fix the original error now. Monday 29 April 2013, you wrote: On 29 April 2013 14:55, heaven <[hidden email][1]> wrote: > Got these errors after switching the field type to long: >

Re: Re: Re: Shard update error when using DIH

2013-04-29 Thread heaven
Got these errors after switching the field type to long: * *crm-test:* org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Unknown fieldtype 'long' specified on field _version_ * *crm-prod:* org.apache.solr.common.SolrException:org.apache.solr.common.SolrExcep

Re: Re: Shard update error when using DIH

2013-04-29 Thread heaven
Yes, here is the full schema: http://pastebin.com/pFPbD749[1] On Mon, Apr 29, 2013 at 10:01 AM, heaven <[hidden email][2]> wrote: *If you reply to this email, your message will be added to the discussion below:* http://lucene.472066.n3.nabble.com/Shard-update

Re: Shard update error when using DIH

2013-04-29 Thread heaven
Hi, seems like I have exactly the same error: Apr 28, 2013 11:41:57 PM org.apache.solr.common.SolrException log SEVERE: null:java.lang.UnsupportedOperationException at org.apache.lucene.queries.function.FunctionValues.longVal(FunctionValues.java:46) at org.apache.solr.update.Versio