Clarity on Stable Release

2020-01-29 Thread Jeff
TL;DR: I am having difficulty deciding on a release that is stable to use and would like this to be easier. Recently it has been rather difficult to figure out which release to use based on its stability. This is probably in part because of the rapid release cadence and also the versioning being

Re: Clarity on Stable Release

2020-01-29 Thread Jeff
Thanks Shawn! Your answer is very helpful. Especially your note about keeping up to date with the latest major version after a number of releases. On Wed, Jan 29, 2020 at 6:35 PM Shawn Heisey wrote: > On 1/29/2020 11:24 AM, Jeff wrote: > > Now, we are considering 8.2.0, 8.3.1, or 8.4

Re: How to check when a search exceeds the threshold of timeAllowed parameter

2015-12-23 Thread Jeff Wartes
Looks like it’ll set partialResults=true on your results if you hit the timeout. https://issues.apache.org/jira/browse/SOLR-502 https://issues.apache.org/jira/browse/SOLR-5986 On 12/22/15, 5:43 PM, "Vincenzo D'Amore" wrote: >Well... I can write everything, but really all this just to un
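A minimal sketch of checking for that flag on the client side, assuming the JSON response has already been parsed into a dict. The helper name `hit_time_allowed` and the sample payloads are hypothetical, not part of any Solr client.

```python
def hit_time_allowed(response: dict) -> bool:
    """Return True if the response header marks the results as partial."""
    header = response.get("responseHeader", {})
    return bool(header.get("partialResults", False))


# Made-up payloads shaped like a Solr JSON response.
complete = {"responseHeader": {"status": 0, "QTime": 12},
            "response": {"numFound": 3}}
partial = {"responseHeader": {"status": 0, "QTime": 1003, "partialResults": True},
           "response": {"numFound": 1}}

for name, payload in (("complete", complete), ("partial", partial)):
    print(name, hit_time_allowed(payload))
```

A caller could log or retry with a larger timeAllowed whenever the helper returns True.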

Error importing data - java.util.concurrent.RejectedExecutionException

2015-12-30 Thread Jeff Chastain
verything appears to line up. I am at a loss here ... can anybody offer a pointer? Thanks, -- Jeff

Re: SolrCloud: Setting/finding node names for deleting replicas

2016-01-08 Thread Jeff Wartes
I’m pretty sure you could change the name when you ADDREPLICA using a core.name property. I don’t know if you can when you initially create the collection though. The CLUSTERSTATUS command will tell you the core names: https://cwiki.apache.org/confluence/display/solr/Collections+API#Collectio
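As a rough illustration of pulling core names out of a CLUSTERSTATUS response, here is a sketch that assumes the JSON layout cluster -> collections -> shards -> replicas, with a `core` entry per replica. The function name `core_names` and the sample data are hypothetical.

```python
def core_names(cluster_status: dict, collection: str) -> dict:
    """Map each replica (core_node) name to its core name for one collection."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    names = {}
    for shard in shards.values():
        for replica_name, replica in shard["replicas"].items():
            names[replica_name] = replica["core"]
    return names


# Made-up response fragment in the assumed CLUSTERSTATUS shape.
sample = {
    "cluster": {
        "collections": {
            "collection1": {
                "shards": {
                    "shard1": {
                        "state": "active",
                        "replicas": {
                            "core_node1": {"core": "collection1_shard1_replica1",
                                           "state": "active"},
                            "core_node2": {"core": "collection1_shard1_replica2",
                                           "state": "active"},
                        },
                    }
                }
            }
        }
    }
}

print(core_names(sample, "collection1"))
```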

Re: SolrCloud: Setting/finding node names for deleting replicas

2016-01-08 Thread Jeff Wartes
sed is the default names of the slices, so it’s a mixed bag. See here: https://github.com/whitepages/solrcloud_manager#terminology On 1/8/16, 2:34 PM, "Robert Brown" wrote: >Thanks for the pointer Jeff, > >For SolrCloud it turned out to be... > >&property.coreN

Re: collection configuration stored in Zoo Keeper with solrCloud

2016-01-11 Thread Jeff Courtade
Yes, it's stored in the directories configured in zoo.cfg. Jeff Courtade M: 240.507.6116 On Jan 11, 2016 1:16 PM, "Jim Shi" wrote: > Hi, I have question regarding collection configurations stored Zoo Keeper > with solrCloud. > All collection configurations are stored at Zoo K

Re: SolrCloud replicas out of sync

2016-01-26 Thread Jeff Wartes
My understanding is that the "version" represents the timestamp the searcher was opened, so it doesn’t really offer any assurances about your data. Although you could probably bounce a node and get your document counts back in sync (by provoking a check), it’s interesting that you’re in this si

Re: SolrCloud replicas out of sync

2016-01-26 Thread Jeff Wartes
like a bug to me. That said, another general recommendation (of mine) is that you not use Solr as your primary data source, so you can rebuild your index from scratch if you really need to. On 1/26/16, 1:10 PM, "David Smith" wrote: >Thanks Jeff! A few comments > >

Re: SolrCloud replicas out of sync

2016-01-27 Thread Jeff Wartes
On 1/27/16, 8:28 AM, "Shawn Heisey" wrote: > >I don't think any documentation states this, but it seems like a good >idea to me use an alias from day one, so that you always have the option >of swapping the "real" collection that you are using without needing to >change anything else. I'll

Re: SolrCloud replicas out of sync

2016-01-27 Thread Jeff Wartes
of Solr’s assumptions. On 1/27/16, 7:59 AM, "David Smith" wrote: >Jeff, again, very much appreciate your feedback. > >It is interesting — the article you linked to by Shalin is exactly why we >picked SolrCloud over ES, because (eventual) consistency is critical for ou

Re: collection aliasing

2016-01-28 Thread Jeff Wartes
I enjoy using collection aliases in all client references, because that allows me to change the collection all clients use without updating the clients. I just move the alias. This is particularly useful if I’m doing a full index rebuild and want an atomic, zero-downtime switchover. On 1/2

Re: Restoring backups of solrcores

2016-02-01 Thread Jeff Wartes
Aliases work when indexing too. Create collection: collection1 Create alias: this_week -> collection1 Index to: this_week Next week... Create collection: collection2 Create (Move) alias: this_week -> collection2 Index to: this_week On 2/1/16, 2:14 AM, "vidya" wrote: >Hi > >How can that b
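The weekly alias move described above could be sketched as follows. This only builds the Collections API request URLs (CREATEALIAS with `name` and `collections`) and sends nothing; the base URL and the helper name `create_alias_url` are assumptions for illustration.

```python
from urllib.parse import urlencode


def create_alias_url(base_url: str, alias: str, collection: str) -> str:
    """Build a CREATEALIAS request URL; re-issuing it for an existing alias moves it."""
    params = {"action": "CREATEALIAS", "name": alias, "collections": collection}
    return f"{base_url}/admin/collections?{urlencode(params)}"


BASE = "http://localhost:8983/solr"  # assumed local Solr base URL

# Week 1: point the alias at collection1. Week 2: move it to collection2.
week1 = create_alias_url(BASE, "this_week", "collection1")
week2 = create_alias_url(BASE, "this_week", "collection2")
print(week1)
print(week2)
```

Clients keep indexing to `this_week` throughout; only the alias target changes.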

Re: Shard allocation across nodes

2016-02-01 Thread Jeff Wartes
You could write your own snitch: https://cwiki.apache.org/confluence/display/solr/Rule-based+Replica+Placement Or, it would be more annoying, but you can always add/remove replicas manually and juggle things yourself after you create the initial collection. On 2/1/16, 8:42 AM, "Tom Evans"

Re: Adding nodes

2016-02-17 Thread Jeff Wartes
Solrcloud does not come with any autoscaling functionality. If you want such a thing, you’ll need to write it yourself. https://github.com/whitepages/solrcloud_manager might be a useful head start though, particularly the “fill” and “cleancollection” commands. I don’t do *auto* scaling, but I d

Re: very slow frequent updates

2016-02-23 Thread Jeff Wartes
My suggestion would be to split your problem domain. Use Solr exclusively for search - index the id and only those fields you need to search on. Then use some other data store for retrieval. Get the id’s from the solr results, and look them up in the data store to get the rest of your fields. T
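A minimal sketch of that split, with a plain dict standing in for the other data store. The names `fetch_full_records`, `solr_docs` and `store` are hypothetical; a real setup would use a database or key-value store client instead.

```python
def fetch_full_records(solr_docs, store):
    """Look up each id returned by Solr in the primary store, keeping result order."""
    records = []
    for doc in solr_docs:
        record = store.get(doc["id"])
        if record is not None:  # skip ids the store no longer has
            records.append(record)
    return records


# Solr returns only ids (and perhaps scores); everything else lives in the store.
solr_docs = [{"id": "b2"}, {"id": "a1"}, {"id": "zz"}]
store = {
    "a1": {"id": "a1", "title": "First", "price": 10},
    "b2": {"id": "b2", "title": "Second", "price": 25},
}

print(fetch_full_records(solr_docs, store))
```

With this layout, frequently changing fields such as price are updated only in the store, and Solr is reindexed only when a searchable field changes.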

Re: very slow frequent updates

2016-02-24 Thread Jeff Wartes
at features you do actually need, might be worth a look >> on "External File Fields" Roland? >> >> -Stefan >> >> On Wed, Feb 24, 2016 at 12:24 PM, Szűcs Roland >> wrote: >> > Thanks Jeff your help, >> > >> > Can it wo

Re: Shard State vs Replica State

2016-02-26 Thread Jeff Wartes
I believe the shard state is a reflection of whether that shard is still in use by the collection, and has nothing to do with the state of the replicas. I think doing a split-shard operation would create two new shards, and mark the old one as inactive, for example. On 2/26/16, 8:50 AM, "De

Re: Prevent the SSL Keystore and Truststore password from showing up in the Solr Admin and Linux processes (Solr 5.2.1)

2016-02-29 Thread Jeff Wu
Hi Katherine, we had exactly the same issue: we need to protect our passwords. Anyone who can access the Solr server can run "ps -elf|grep java" to see the Solr command line, which has all the passwords in plain text. The /bin/solr shell script sets 10 related system properties: SOLR_SSL_OPTS=" -Dsolr.jett

Re: SolrCloud - Strategy for recovering cluster states

2016-03-01 Thread Jeff Wartes
I’ve been running SolrCloud clusters in various versions for a few years here, and I can only think of two or three cases that the ZK-stored cluster state was broken in a way that I had to manually intervene by hand-editing the contents of ZK. I think I’ve seen Solr fixes go by for those cases,

Re: SolrCloud - Strategy for recovering cluster states

2016-03-02 Thread Jeff Wartes
think it should be local disk for non-SolrCloud, and ZK for SolrCloud. On 3/2/16, 12:13 AM, "danny teichthal" wrote: >Thanks Jeff, >I understand your philosophy and it sounds correct. >Since we had many problems with zookeeper when switching to Solr Cloud. we >coul

Re: XX:ParGCCardsPerStrideChunk

2016-03-03 Thread Jeff Wartes
I've experimented with that a bit, and Shawn added my comments in IRC to his Solr/GC page here: https://wiki.apache.org/solr/ShawnHeisey The relevant bit: "With values of 4096 and 32768, the IRC user was able to achieve 15% and 19% reductions in average pause time, respectively, with the maximu

Re: Separating cores from Solr home

2016-03-03 Thread Jeff Wartes
It’s a bit backwards feeling, but I’ve had luck setting the install dir and solr home, instead of the data dir. Something like: -Dsolr.solr.home=/data/solr -Dsolr.install.dir=/opt/solr So all of the Solr files are in /opt/solr and all of the index/core related files end up in /data/solr.

Reset JMX counters for monitoring without restarting

2016-04-02 Thread Jeff Courtade
1240.416114498 75thPcRequestTime.value 1614.2324915 95thPcRequestTime.value 3048.37888109 99thPcRequestTime.value 5930.183086690001 -- Thanks, Jeff Courtade M: 240.507.6116

Re: Reset JMX counters for monitoring without restarting

2016-04-02 Thread Jeff Courtade
Thanks, I was hoping there was a way without a core reload. Do you know what is different with cloud? I need to do this in both. Jeff Courtade M: 240.507.6116 On Apr 2, 2016 1:37 PM, "Shawn Heisey" wrote: > On 4/2/2016 11:06 AM, Jeff Courtade wrote: > > I am putting toge

Re: Reset JMX counters for monitoring without restarting

2016-04-02 Thread Jeff Courtade
Thanks very much. Jeff Courtade M: 240.507.6116 On Apr 2, 2016 3:03 PM, "Otis Gospodnetić" wrote: > Hi Jeff, > > With info that Solr provides in JMX you have to keep track of things > yourself, do subtractions and counting yourself. > If you don't feel like
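The "do subtractions yourself" approach can be sketched like this: sample the cumulative counters periodically and report the difference between two samples instead of resetting anything. The counter names and values are made up, and the helper names are hypothetical.

```python
def counter_deltas(previous: dict, current: dict) -> dict:
    """Difference between two samples of cumulative counters seen in both."""
    return {name: current[name] - previous[name]
            for name in current if name in previous}


def rate_per_second(previous: dict, current: dict, name: str,
                    elapsed_seconds: float) -> float:
    """Average per-second rate of one counter over the sampling interval."""
    return (current[name] - previous[name]) / elapsed_seconds


# Two made-up samples taken 60 seconds apart.
sample_t0 = {"requests": 1000, "errors": 4}
sample_t1 = {"requests": 1600, "errors": 5, "timeouts": 1}

print(counter_deltas(sample_t0, sample_t1))
print(rate_per_second(sample_t0, sample_t1, "requests", 60.0))
```

This works for monotonically increasing counts; percentile values such as 95thPcRequestTime cannot be recovered by subtraction.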

Re: SolrCloud no leader for collection

2016-04-05 Thread Jeff Wartes
I recall I had some luck fixing a leader-less shard (after a ZK quorum failure) by forcibly removing the records for the down-state replicas from the leader election list, and then forcing an election. The ZK path looks like collections//leader_elect/shardX/election. Usually you’ll find the dow

Re: SolrCloud backup/restore

2016-04-05 Thread Jeff Wartes
There is some automation around this process in the backup commands here: https://github.com/whitepages/solrcloud_manager It’s been tested with 5.4, and will restore arbitrary replication factors. Ever assuming the shared filesystem for backups, of course. On 4/5/16, 3:18 AM, "Reth RM" wrot

DIH with Nested Documents - Configuration Issue

2016-04-14 Thread Jeff Chastain
ohn", "lastName": "Doe" }, { "languagesSpoken_id": 243, "languagesSpoken_abbreviation": "en", "languagesSpoken_name": "English" }, { "languagesSpoken_id": 442, "languagesSpoken_abbreviation": "fr", "languagesSpoken_name": "French" } I have spent several days now trying to figure out what is going on here to no avail. Can anybody provide me with a pointer as to what I am missing here? Thanks, -- Jeff

Re: HTTP Client Only

2016-04-14 Thread Jeff Wartes
If you’re already using java, just use the CloudSolrClient. If you’re using the default router, (CompositeId) it’ll figure out the leaders and send documents to the right place for you. If you’re not using java, then I’d still look there for hints on how to duplicate the functionality. On

Re: Adding replica on solr - 5.50

2016-04-14 Thread Jeff Wartes
I’m all for finding another way to make something work, but I feel like this is the wrong advice. There are two options: 1) You are doing something wrong. In which case, you should probably invest in figuring out what. 2) Solr is doing something wrong. In which case, you should probably invest

Re: Indexing 700 docs per second

2016-04-19 Thread Jeff Wartes
I have no numbers to back this up, but I’d expect Atomic Updates to be slightly slower than a full update, since the atomic approach has to retrieve the fields you didn't specify before it can write the new (updated) document. On 4/19/16, 11:54 AM, "Tim Robertson" wrote: >Hi Mark, > >We we

Re: Replicas for same shard not in sync

2016-04-26 Thread Jeff Wartes
At the risk of thread hijacking, this is an area where I don’t know I fully understand, so I want to make sure. I understand the case where a node is marked “down” in the clusterstate, but what if it’s down for less than the ZK heartbeat? That’s not unreasonable, I’ve seen some recommendations

Re: Replicas for same shard not in sync

2016-04-27 Thread Jeff Wartes
ome retry logic in the code that distributes the updates from >the leader as well. > >Best, >Erick > >On Tue, Apr 26, 2016 at 12:51 PM, Jeff Wartes wrote: >> >> At the risk of thread hijacking, this is an area where I don’t know I fully >> understand, so I want to ma

Re: Solr 5.2.1 on Java 8 GC

2016-04-28 Thread Jeff Wartes
Shawn Heisey’s page is the usual reference guide for GC settings: https://wiki.apache.org/solr/ShawnHeisey Most of the learnings from that are in the Solr 5.x startup scripts already, but your heap is bigger, so your mileage may vary. Some tools I’ve used while doing GC tuning: * VisualVM - Co

Re: Passing Ids in query takes more time

2016-05-05 Thread Jeff Wartes
An ID lookup is a very simple and fast query, for one ID. Or’ing a lookup for 80k ids though is basically 80k searches as far as Solr is concerned, so it’s not altogether surprising that it takes a while. Your complaint seems to be that the query planner doesn’t know in advance that should be

SOLR cloud node has a 4k index directory

2015-08-17 Thread Jeff Courtade
.20150815151640598 ps04 shard 2 replica 61G /opt/solr/solr-4.7.2/example/solr/collection1/data/index.20140820212651780 39G /opt/solr/solr-4.7.2/example/solr/collection1/data/index.20150815170546642 what can i do to remedy this? -- Thanks, Jeff Courtade M: 240.507.6116

Re: SOLR cloud node has a 4k index directory

2015-08-17 Thread Jeff Courtade
console. once it is green check the version number on ps01 and ps03 they should be the same now. Repeat this for shard2 and you are done. -- Thanks, Jeff Courtade M: 240.507.6116 On Mon, Aug 17, 2015 at 10:57 AM, Jeff Courtade wrote: > Hi, > > I have SOLR cloud running on SOLR 4.7.2 &

Solr leader and replica version mismatch 4.7.2

2015-08-19 Thread Jeff Courtade
InBytes matches on Leader and replica curl http://ps01:8983/solr/admin/cores?action=STATUS |sed 's/>\n1439974815928 2015-08-19T09:00:15.928Z 43691759309 40.69 GB if that number date and size match on the leader and the replicas I believe we are in sync. Can anyone verify this? -

Re: Solr leader and replica version mismatch 4.7.2

2015-08-19 Thread Jeff Courtade
number is what I was interested in. Should the version number be different in SOLR Cloud then as it is deprecated? -- Thanks, Jeff Courtade M: 240.507.6116 On Wed, Aug 19, 2015 at 10:08 AM, Shawn Heisey wrote: > On 8/19/2015 7:52 AM, Jeff Courtade wrote: > > We are running S

splitting shards on 4.7.2 with custom plugins

2015-08-25 Thread Jeff Courtade
I am getting failures when trying to split shards on Solr 4.7.2 with custom plugins. It fails regularly: it cannot find the jar files for plugins when creating the new cores/shards. Ideas? -- Thanks, Jeff Courtade M: 240.507.6116

Re: splitting shards on 4.7.2 with custom plugins

2015-08-26 Thread Jeff Courtade
DME.txt 4.0Ksolr.xml 4.0Kzoo.cfg [root@dj01 solr]# du -sh /opt/solr/solr-4.7.2/solr04/solr/collection1_shard1_1_replica2 16G /opt/solr/solr-4.7.2/solr04/solr/collection1_shard1_1_replica2 [root@dj01 solr]# du -sh /opt/solr/solr-4.7.2/solr03/solr/collection1_shard1_0_replica2 18G /opt/solr/solr-4.7.2/

Re: splitting shards on 4.7.2 with custom plugins

2015-08-26 Thread Jeff Courtade
1_shard1_0_replica2", "node_name":"10.135.2.153:8983_solr"}}}, "shard2_0":{ "range":"0-3fff", "state":"active", "replicas":{ "core_node13":{ &

Cached fq decreases performance

2015-09-03 Thread Jeff Wartes
I have a query like: q=&fq=enabled:true For purposes of this conversation, "fq=enabled:true" is set for every query, I never open a new searcher, and this is the only fq I ever use, so the filter cache size is 1, and the hit ratio is 1. The fq=enabled:true clause matches about 15% of my docume

Re: Cached fq decreases performance

2015-09-03 Thread Jeff Wartes
and even a newsletter: >http://www.solr-start.com/ > > >On 3 September 2015 at 16:45, Jeff Wartes wrote: >> >> I have a query like: >> >> q=&fq=enabled:true >> >> For purposes of this conversation, "fq=enabled:true" is set for every >

Re: Cached fq decreases performance

2015-09-04 Thread Jeff Wartes
On 9/4/15, 7:06 AM, "Yonik Seeley" wrote: > >Lucene seems to always be changing it's execution model, so it can be >difficult to keep up. What version of Solr are you using? >Lucene also changed how filters work, so now, a filter is >incorporated with the query like so: > >query = new BooleanQ

solr4.7: leader core does not elected to other active core after sorl OS shutdown, known issue?

2015-09-21 Thread Jeff Wu
Our environment still runs Solr 4.7. Recently we noticed the following in a test: when we stopped one Solr server (solr02, via an OS shutdown), all the cores of solr02 were shown as "down", but a few of them still remained leaders. After that, we quickly saw that all other servers were still sending requests to t

Solr4.7: tlog replay has a major delay before start recovering transaction replay

2015-09-21 Thread Jeff Wu
Our environment runs Solr 4.7. We recently hit a core recovery failure, after which it retried recovery from the tlog. We noticed that after the log said "Recovery failed" at 20:05:22, the Solr server waited a long time before it started tlog replay. During that time, we had about 32 cores doing such tlog replay. The service

Re: solr4.7: leader core does not elected to other active core after sorl OS shutdown, known issue?

2015-09-21 Thread Jeff Wu
es, and they are all active, and all leader cores. Shut down the Linux OS. Monitor clusterstate.json in ZK until well past the ZK session timeout. We noticed that leader election happened for some cores, but some down cores still remained leaders. 2015-09-21 9:15 GMT-04:00 Shalin Shekhar Mangar :

Re: Solr4.7: tlog replay has a major delay before start recovering transaction replay

2015-09-21 Thread Jeff Wu
ore it tells us "tlog replay" 2015-09-21 9:07 GMT-04:00 Shalin Shekhar Mangar : > Hi Jeff, > > Comments inline: > > On Mon, Sep 21, 2015 at 6:06 PM, Jeff Wu wrote: > > Our environment ran in Solr4.7. Recently hit a core recovery failure and > > then it retries to r

Re: solr4.7: leader core does not elected to other active core after sorl OS shutdown, known issue?

2015-09-21 Thread Jeff Wu
y releases the lease, so that other cores may claim it. > > Perhaps that explains the confusion? > > Shai > > On Mon, Sep 21, 2015 at 4:36 PM, Jeff Wu wrote: > > > Hi Shalin, thank you for the response. > > > > We waited longer enough than the ZK session timeout t

Autowarm and filtercache invalidation

2015-09-24 Thread Jeff Wartes
If I configure my filterCache like this: and I have <= 10 distinct filter queries I ever use, does that mean I’ve effectively disabled cache invalidation? So my cached filter query results will never change? (short of JVM restart) I’m unclear on whether autowarm simply copies the value into the

Re: Autowarm and filtercache invalidation

2015-09-24 Thread Jeff Wartes
whether it was populated via autowarm. On 9/24/15, 11:28 AM, "Jeff Wartes" wrote: > >If I configure my filterCache like this: >autowarmCount="10"/> > >and I have <= 10 distinct filter queries I ever use, does that mean I’ve >effectively disabled cache inv

Re: How to know index file in OS Cache

2015-09-25 Thread Jeff Wartes
I’ve been relying on this: https://code.google.com/archive/p/linux-ftools/ fincore will tell you what percentage of a given file is in cache, and fadvise can suggest to the OS that a file be cached. All of the solr start scripts at my company first call fadvise (FADV_WILLNEED) on all the files
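For illustration only, the same kind of hint can be issued from Python with `os.posix_fadvise`. This is a stand-in for the linux-ftools `fadvise` binary mentioned above, not what the start scripts actually run, and the helper name `advise_willneed` is hypothetical. The call is advisory: the OS may ignore it.

```python
import os


def advise_willneed(path: str) -> bool:
    """Hint the OS to read `path` into the page cache.

    Returns False on platforms without posix_fadvise (it is Unix-only).
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0, length=0 means "from the start to the end of the file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
    finally:
        os.close(fd)
    return True
```

A start script could call this for every file under an index directory before launching Solr.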

Re: Cost of having multiple search handlers?

2015-09-28 Thread Jeff Wartes
One would hope that https://issues.apache.org/jira/browse/SOLR-4735 will be done by then. On 9/28/15, 11:39 AM, "Walter Underwood" wrote: >We did the same thing, but reporting performance metrics to Graphite. > >But we won’t be able to add servlet filters in 6.x, because it won’t be a >webapp

Re: Cost of having multiple search handlers?

2015-09-29 Thread Jeff Wartes
production for a year, >but the config is pretty manual. > >wunder >Walter Underwood >wun...@wunderwood.org >http://observer.wunderwood.org/ (my blog) > > >> On Sep 28, 2015, at 4:41 PM, Jeff Wartes wrote: >> >> >> One would hope that h

Facet queries blow out the filterCache

2015-10-01 Thread Jeff Wartes
I’m doing some fairly simple facet queries in a two-shard 5.3 SolrCloud index on fields like this:

Re: Facet queries blow out the filterCache

2015-10-01 Thread Jeff Wartes
f you set f.city.facet.limit=-1 ? > >On Thu, Oct 1, 2015 at 7:43 PM, Jeff Wartes >wrote: > >> >> I’m doing some fairly simple facet queries in a two-shard 5.3 SolrCloud >> index on fields like this: >> >> > docValues="true”/> >

Re: Facet queries blow out the filterCache

2015-10-01 Thread Jeff Wartes
here >https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-Over-Re >questParameters >eg does it happen if you run with distrib=false? > >On Fri, Oct 2, 2015 at 12:27 AM, Jeff Wartes >wrote: > >> >> No change, still shows an insert per-reques

Re: Facet queries blow out the filterCache

2015-10-02 Thread Jeff Wartes
gh, because issuing new distinct queries causes a reported insert, but not a lookup, so the cache hit ratio is always exactly 1. On 10/2/15, 4:18 AM, "Toke Eskildsen" wrote: >On Thu, 2015-10-01 at 22:31 +, Jeff Wartes wrote: >> It still inserts if I address the core dire

Re: Facet queries blow out the filterCache

2015-10-06 Thread Jeff Wartes
I dug far enough yesterday to find the GET_DOCSET, but not far enough to find why. Thanks, a little context is really helpful sometimes. So, starting with an empty filterCache... http://localhost:8983/solr/techproducts/select?q=name:foo&rows=1&facet=true &facet.field=popularity New values:

Re: are there any SolrCloud supervisors?

2015-10-14 Thread Jeff Wartes
I’m aware of two public administration tools: This was announced to the list just recently: https://github.com/bloomreach/solrcloud-haft And I’ve been working in this: https://github.com/whitepages/solrcloud_manager Both of these hook the Solrcloud client’s ZK access to inspect the cluster state

solr4.7: truncated log output in grouping.CommandHandler?

2015-10-19 Thread Jeff Wu
Our Solr 4.7 server recently reported the WARN message below, followed by a long GC pause. Sometimes this forces the Solr server to disconnect from the ZK server. Solr 4.7.0, got this warning message: WARN - 2015-10-19 02:23:24.503; org.apache.solr.search.grouping.CommandHandler; Query: +(+owner

Anyone users IBM J9 JVM with 32G max heap ? Tuning recommendations?

2015-10-19 Thread Jeff Wu
Hi all, we are using solr4.7 on top of IBM JVM J9 Java7, max heap to 32G, system RAM 64G. JVM parameters: -Xgcpolicy:balanced -verbose:gc -Xms12228m -Xmx32768m -XX:PermSize=128m -XX:MaxPermSize=512m We faced one issue here: we set zkClient timeout value to 30 seconds. By using the balanced GC po

Re: DevOps question : auto deployment/setup of Solr & Zookeeper on medium-large clusters

2015-10-20 Thread Jeff Wartes
If you’re using AWS, there’s this: https://github.com/LucidWorks/solr-scale-tk If you’re using chef, there’s this: https://github.com/vkhatri/chef-solrcloud (There are several other chef cookbooks for Solr out there, but this is the only one I’m aware of that supports Solr 5.3.) For ZK, I’m less

Re: copy data between collection

2015-10-26 Thread Jeff Wartes
The “copy” command in this tool automatically does what Upayavira describes, including bringing the replicas up to date. (if any) https://github.com/whitepages/solrcloud_manager I’ve been using it as a mechanism for copying a collection into a new cluster (different ZK), but it should work withi

Re: replica recovery

2015-10-27 Thread Jeff Wartes
On the face of it, your scenario seems plausible. I can offer two pieces of info that may or may not help you: 1. A write request to Solr will not be acknowledged until an attempt has been made to write to all relevant replicas. So, B won’t ever be missing updates that were applied to A, unless c

Re: Facet queries blow out the filterCache

2015-10-28 Thread Jeff Wartes
FWIW, since it seemed like there was at least one bug here (and possibly more), I filed https://issues.apache.org/jira/browse/SOLR-8171 On 10/6/15, 3:58 PM, "Jeff Wartes" wrote: > >I dug far enough yesterday to find the GET_DOCSET, but not far enough to >find why. Thanks,

Re: Data Import Handler / Backup indexes

2015-11-17 Thread Jeff Wartes
https://github.com/whitepages/solrcloud_manager supports 5.x, and I added some backup/restore functionality similar to SOLR-5750 in the last release. Like SOLR-5750, this backup strategy requires a shared filesystem, but note that unlike SOLR-5750, I haven’t yet added any backup functionality for

Re: replica recovery

2015-11-19 Thread Jeff Wartes
er but it isn't clear to me how high it should be or if >raising the limit will cause new problems. > >Any advice you could provide in this situation would be awesome! > >Cheers, >Brian > > > >> On Oct 27, 2015, at 20:50, Jeff Wartes wrote: >> >>

Re: Data Import Handler / Backup indexes

2015-11-23 Thread Jeff Wartes
be run >because the database is unavailable. > >Our collection is simple: 2 nodes - 1 collection - 2 shards with 2 >replicas >each > >So a simple copy (cp command) for both the nodes/shards might work for us? >How do I restore the data back? > > > >On

Method to fix issue when you get KeeperErrorCode = NoAuth when Zookeeper ACL enabled

2015-12-02 Thread Jeff Wu
We have been following this wiki page to enable ZooKeeper ACL control https://cwiki.apache.org/confluence/display/solr/ZooKeeper+Access+Control#ZooKeeperAccessControl-AboutZooKeeperACLs It works fine for the Solr service itself, but when you try to use scripts/cloud-scripts/zkcli.sh to put a zNode, it thr

Re: Solr 5: Schema.xml vs. Managed Schema - which is advisable?

2015-12-03 Thread Jeff Wartes
I’ve never used the managed schema, so I’m probably biased, but I’ve never seen much of a point to the Schema API. I need to make changes sometimes to solrconfig.xml, in addition to schema.xml and other config files, and there’s no API for those, so my process has been like: 1. Put the entire con

Re: How to list all collections in solr-4.7.2

2015-12-03 Thread Jeff Wartes
Looks like LIST was added in 4.8, so I guess you’re stuck looking at ZK, or finding some tool that looks in ZK for you. The zkCli.sh that ships with zookeeper would probably suffice for a one-off manual inspection: https://zookeeper.apache.org/doc/trunk/zookeeperStarted.html#sc_ConnectingT oZooKee

Re: Solrcloud: 1 server, 1 configset, multiple collections, multiple schemas

2015-12-04 Thread Jeff Wartes
If you want two different collections to have two different schemas, those collections need to reference two different configsets. So you need another copy of your config available using a different name, and to reference that other name when you create the second collection. On 12/4/15, 6:26 AM

Re: Fully automated replica creation in AWS

2015-12-09 Thread Jeff Wartes
It’s a pretty common misperception that since solr scales, you can just spin up new nodes and be done. Amazon ElasticSearch and older solrcloud getting-started docs encourage this misperception, as does the HDFS-only autoAddReplicas flag. I agree that auto-scaling should be approached carefully,

Re: Moving to SolrCloud, specifying dataDir correctly

2015-12-14 Thread Jeff Wartes
Don’t set solr.data.dir. Instead, set the install dir. Something like: -Dsolr.solr.home=/data/solr -Dsolr.install.dir=/opt/solr I have many solrcloud collections, and separate data/install dirs, and I’ve never had to do anything with manual per-collection or per-replica data dirs. That said, it’

state.json being downloaded every 10 seconds

2016-05-16 Thread Jeff Wartes
I have a solr 5.4 cluster with three collections, A, B, C. Nodes either host replicas for collection A, or B and C. Collections B and C are not currently used - no inserts or queries. Collection A is getting significant query traffic, but no insert traffic, and queries are only directed to node

Re: state.json being downloaded every 10 seconds

2016-05-16 Thread Jeff Wartes
>What the "something" is that sends requests I'm not quite sure, but >that's a place >to start. > >Best, >Erick > >On Mon, May 16, 2016 at 11:08 AM, Jeff Wartes wrote: >> >> I have a solr 5.4 cluster with three collections, A, B, C. >&g

Re: SolrCloud replicas consistently out of sync

2016-05-19 Thread Jeff Wartes
That case related to consistency after a ZK outage or network connectivity issue. Your case is standard operation, so I’m not sure that’s really the same thing. I’m aware of a few issues that can happen if ZK connectivity goes wonky, that I hope are fixed in SOLR-8697. This one might be a close

Re: How to stop searches to solr while full data import is going in SOLR

2016-05-23 Thread Jeff Wartes
The PingRequestHandler contains support for a file check, which allows you to control whether the ping request succeeds based on the presence/absence of a file on disk on the node. http://lucene.apache.org/solr/6_0_0/solr-core/org/apache/solr/handler/PingRequestHandler.html I suppose you could

Re: SolrCloud increase replication factor

2016-05-23 Thread Jeff Wartes
https://github.com/whitepages/solrcloud_manager was designed to provide some easier operations for common kinds of cluster operation. It hasn’t been tested with 6.0 though, so if you try it, please let me know your experience. On 5/23/16, 6:28 AM, "Tom Evans" wrote: >On Mon, May 23, 2016 at

Re: Solr cloud with Grouping query gives inconsistent results

2016-05-23 Thread Jeff Wartes
My first thought is that you haven’t indexed such that all values of the field you’re grouping on are found in the same cores. See the end of the article here: (Distributed Result Grouping Caveats) https://cwiki.apache.org/confluence/display/solr/Result+Grouping And the “Document Routing” sectio

Re: What if adding 3rd node exceeds replication Factor? [scottchu]

2016-05-25 Thread Jeff Wartes
SolrCloud never creates replicas automatically, unless perhaps you’re using the HDFS-only autoAddReplicas option. Start the new node using the same ZK, and then use the Collections API (https://cwiki.apache.org/confluence/display/solr/Collections+API) to ADDREPLICA. The replicationFactor you s
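A small sketch of the ADDREPLICA step: it only builds the Collections API request URL and sends nothing. The base URL, node name and the helper name `add_replica_url` are assumptions for illustration.

```python
from urllib.parse import urlencode


def add_replica_url(base_url: str, collection: str, shard: str, node: str = "") -> str:
    """Build an ADDREPLICA request URL, optionally pinning the replica to a node."""
    params = {"action": "ADDREPLICA", "collection": collection, "shard": shard}
    if node:
        params["node"] = node  # node name as shown in the cluster state
    return f"{base_url}/admin/collections?{urlencode(params)}"


BASE = "http://localhost:8983/solr"  # assumed local Solr base URL

url = add_replica_url(BASE, "collection1", "shard1", "10.0.0.3:8983_solr")
print(url)
```

Running this once per shard against the newly started node would place one replica of each shard there.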

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-26 Thread Jeff Wartes
Oh, interesting. I’ve certainly encountered issues with multi-word synonyms, but I hadn’t come across this. If you end up using it with a recent solr version, I’d be glad to hear your experience. I haven’t used it, but I am aware of one other project in this vein that you might be interested in

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-05-31 Thread Jeff Wartes
2016, at 2:21 PM, John Bickerstaff < >> > > > j...@johnbickerstaff.com> >> > > > >> wrote: >> > > > >> > >> > > > >> > Thanks Chris -- >> > > > >> > >> > > > >> > The

Re: Solr Cloud and Multi-word Synonyms :: synonym_edismax parser

2016-06-01 Thread Jeff Wartes
wrote: >Thanks Jeff, > >I believe I tried that, and it still refused to load.. But I'd sure love >it to work since the other process is a bit convoluted - although I see >it's value in a large Solr installation. > >When I "locate" the jar on the linux comman

Re: Solr off-heap FieldCache & HelioSearch

2016-06-03 Thread Jeff Wartes
For what it’s worth, I’d suggest you go into a conversation with Azul with a more explicit “I’m looking to buy” approach. I reached out to them with a more “I’m exploring my options” attitude, and never even got a trial. I get the impression their business model involves a fairly expensive (to

Re: Multiple calls across the distributed nodes for a query

2016-06-15 Thread Jeff Wartes
Any distributed query falls into the two-phase process. Actually, I think some components may require a third phase. (faceting?) However, there are also cases where only a single pass is required. A fl=id,score will only be a single pass, for example, since it doesn’t need to get the field valu

Re: Long STW GCs with Solr Cloud

2016-06-16 Thread Jeff Wartes
Check your gc log for CMS “concurrent mode failure” messages. If a concurrent CMS collection fails, it does a stop-the-world pause while it cleans up using a *single thread*. This means the stop-the-world CMS collection in the failure case is typically several times slower than a concurrent CMS

Re: Long STW GCs with Solr Cloud

2016-06-17 Thread Jeff Wartes
I gather due to large heap, interestingly enough >while the scenario Jeff talked about is remarkably similar (we use field >collapsing), including the performance aspects of it, we are getting >concurrent mode failures both due to new space allocation failures and due >to promotion failu

Re: SolrCloud: Adding a very large collection to a pre-existing cluster

2016-06-21 Thread Jeff Wartes
There’s no official way of doing #1, but there are some less official ways: 1. The Backup/Restore API provides some hooks into loading pre-existing data dirs into an existing collection. Lots of caveats. 2. If you don’t have many shards, there’s always rsync/reload. 3. There are some third-party

Re: Help with recovering shard range after zookeeper disaster

2016-06-28 Thread Jeff Wartes
This might come a little late to be helpful, but I had a similar situation with Solr 5.4 once. We ended up finding a ZK snapshot we could restore, but we did also get the cluster back up for most of the interim by taking the now-empty ZK cluster, re-uploading the configs that the collections us

Re: Full re-index without downtime

2016-07-06 Thread Jeff Wartes
A variation on #1 here - Use the same cluster, create a new collection, but use the createNodeSet option to logically partition your cluster so no node has both the old and new collection. If your clients all reference a collection alias, instead of a collection name, then all you need to do w

Re: solrcloud consumes more time than solr when write index

2016-07-12 Thread Jeff Wartes
Well, two thoughts: 1. If you’re not using solrcloud, presumably you don’t have any replicas. If you are, presumably you do. This makes for a biased comparison, because SolrCloud won’t acknowledge a write until it’s been safely written to all replicas. In short, solrcloud write time is max(per

Re: solrcloud consumes more time than solr when write index

2016-07-13 Thread Jeff Wartes
Kent > >2016-07-12 23:02 GMT+08:00 Jeff Wartes : > >> Well, two thoughts: >> >> >> 1. If you’re not using solrcloud, presumably you don’t have any replicas. >> If you are, presumably you do. This makes for a biased comparison, because >> SolrCloud won’t

Re: Node not recovering, leader elections not occuring

2016-07-19 Thread Jeff Wartes
It sounds like the node-local version of the ZK clusterstate has diverged from the ZK cluster state. You should check the contents of zookeeper and verify the state there looks sane. I’ve had issues (v5.4) on a few occasions where leader election got screwed up to the point where I had to delete

Effects of insert order on query performance

2016-08-11 Thread Jeff Wartes
This isn’t really a question, although some validation would be nice. It’s more of a warning. Tldr is that the insert order of documents in my collection appears to have had a huge effect on my query speed. I have a very large (sharded) SolrCloud 5.4 index. One aspect of this index is a mult

Re: Effects of insert order on query performance

2016-08-12 Thread Jeff Wartes
shards in that case. That’s fine from a SolrCloud query perspective of course, but it makes for more difficult resource provisioning. On 8/12/16, 1:39 AM, "Emir Arnautovic" wrote: Hi Jeff, I will not comment on your theory (will let that to guys more familiar with L

Re: Result Grouping vs. Collapsing Query Parser -- Can one be deprecated?

2016-10-20 Thread Jeff Wartes
I’ll also mention the choice to improve processing speed by allocating more memory, which increases the importance of GC tuning. This bit me when I tried using it on a larger index. https://issues.apache.org/jira/browse/SOLR-9125 I don’t know if the result grouping feature shares the same issue
