AutoAddReplicas

2016-07-20 Thread Joe Obernberger
Hi - I'm not sure how to enable autoAddReplicas to be true for collections. According to here: https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS it is specified in solr.xml, but I tried adding: true and that results in an error. What am I doing wrong? Thanks! -Joe

Re: AutoAddReplicas

2016-07-20 Thread Joe Obernberger
in the collections API here: https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api1 Best, Erick On Wed, Jul 20, 2016 at 2:46 PM, Joe Obernberger wrote: Hi - I'm not sure how to enable autoAddReplicas to be true for collections. According to here:
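Per Erick's pointer to the Collections API, autoAddReplicas in Solr 6.x is a CREATE-time collection parameter (and only functions on a shared filesystem such as HDFS), not a bare value dropped into solr.xml. A hedged sketch of the call — host, port, collection name, and shard counts are all placeholders:

```shell
# Build the Collections API CREATE call; autoAddReplicas=true is the
# parameter the thread is after. Host and names are illustrative.
URL="http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=2&autoAddReplicas=true"
echo "curl '$URL'"
```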

Re: solr shutdown

2016-10-21 Thread Joe Obernberger
Thanks Shawn - We've had to increase this to 300 seconds when using a large cache size with HDFS, and a fairly heavily loaded index routine (3 million docs per day). I don't know if that's why it takes a long time to shutdown, but it can take a while for solr cloud to shutdown gracefully. If

Re: SolrCloud - Sharing zookeeper ensemble with Kafka

2017-07-11 Thread Joe Obernberger
Vincenzo - we do this in our environment. Zookeeper handles, HDFS, HBase, Kafka, and Solr Cloud. -Joe On 7/11/2017 4:18 AM, Vincenzo D'Amore wrote: Hi All, in my test environment I've two Zookeeper instances one for SolrCloud (6.6.0) and another for a Kafka server (2.11-0.10.1.0). My task

Auto commit Error - Solr Cloud 6.6.0 with HDFS

2017-07-12 Thread Joe Obernberger
Started up a 6.6.0 solr cloud instance running on 45 machines yesterday using HDFS (managed schema in zookeeper) and began indexing. This error occurred on several of the nodes: auto commit error...:org.apache.solr.common.SolrException: Error opening new searcher at org.apache.solr.core.

Re: Auto commit Error - Solr Cloud 6.6.0 with HDFS

2017-07-12 Thread Joe Obernberger
was getting the log ready for you, but it was overwritten in the interim. If it happens again, I'll get the log file ready. -Joe On 7/12/2017 9:25 AM, Shawn Heisey wrote: On 7/12/2017 7:14 AM, Joe Obernberger wrote: Started up a 6.6.0 solr cloud instance running on 45 machines yest

NullPointerException on openStreams

2017-07-13 Thread Joe Obernberger
Hi All - trying to call ClouderSolrStream.open(), but I'm getting this error: java.io.IOException: java.lang.NullPointerException at org.apache.solr.client.solrj.io.stream.CloudSolrStream.constructStreams(CloudSolrStream.java:408) at org.apache.solr.client.solrj.io.stream.CloudSolrSt

Re: NullPointerException on openStreams

2017-07-13 Thread Joe Obernberger
ler does this for an example: https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/StreamHandler.java#L339 Joel Bernstein http://joelsolr.blogspot.com/ On Thu, Jul 13, 2017 at 2:06 PM, Joe Obernberger < joseph.obernber...@gmail.com> wrote:

Re: NullPointerException on openStreams

2017-07-13 Thread Joe Obernberger
class) .withFunctionName("facet", FacetStream.class) .withFunctionName("sum", SumMetric.class) .withFunctionName("unique", UniqueStream.class) .withFunctionName("uniq", UniqueMetric.class) .withFunctionName("

Solr 6.6.0 - Deleting Collections - HDFS

2017-07-14 Thread Joe Obernberger
When I delete a collection, it is gone from the GUI, but the directory is not removed from HDFS. The directory is empty, but the entry is still there. Is this expected? As shown below all the MODEL1007_* collections have been deleted. hadoop fs -du -s -h /solr6.6.0/* 3.3 G 22.7 G /solr6.6

Re: NullPointerException on openStreams

2017-07-14 Thread Joe Obernberger
ator.class) .withFunctionName("count", CountMetric.class) .withFunctionName("facet", FacetStream.class) .withFunctionName("sum", SumMetric.class) .withFunctionName("unique", UniqueStream.class) .withFunctionName("uniq", Un

Re: Auto commit Error - Solr Cloud 6.6.0 with HDFS

2017-07-14 Thread Joe Obernberger
) at java.lang.Thread.run(Thread.java:748) The whole log can be found here: http://lovehorsepower.com/solr.log the GC log is here: http://lovehorsepower.com/solr_gc.log.3.current -Joe On 7/12/2017 9:25 AM, Shawn Heisey wrote: On 7/12/2017 7:14 AM, Joe Obernberger wrote: Started up a 6.6.0

Solr 6.6.0 - Indexing errors

2017-07-17 Thread Joe Obernberger
We've been indexing data on a 45 node cluster with 100 shards and 3 replicas, but our indexing processes have been stopping due to errors. On the server side the error is "Error logging add". Stack trace: 2017-07-17 12:29:24.057 INFO (qtp985934102-5161548) [c:UNCLASS s:shard58 r:core_node290

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Joe Obernberger
t for SolrCloud? Thank you! -Joe On 7/17/2017 8:36 AM, Joe Obernberger wrote: We've been indexing data on a 45 node cluster with 100 shards and 3 replicas, but our indexing processes have been stopping due to errors. On the server side the error is "Error logging add". Stack tr

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Joe Obernberger
est for 2 shards and 2 replica. This would confirm if there is basic issue with indexing / cluster setup. On Mon, Jul 17, 2017 at 9:04 AM, Joe Obernberger < joseph.obernber...@gmail.com> wrote: Some more info: When I stop all the indexers, in about 5-10 minutes the cluster goes all g

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Joe Obernberger
's actually having the problem. Unfortunately you'll just have to look at one Solr log from each shard to see whether this is an issue. Best, Erick On Mon, Jul 17, 2017 at 7:23 AM, Joe Obernberger wrote: So far we've indexed about 46 million documents, but over the weekend, these errors started c

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Joe Obernberger
you'll just have to look at one Solr log from each shard to see whether this is an issue. Best, Erick On Mon, Jul 17, 2017 at 7:23 AM, Joe Obernberger wrote: So far we've indexed about 46 million documents, but over the weekend, these errors started coming up. I would expect that if there

Short Circuit Reads -

2017-07-18 Thread Joe Obernberger
Hi All - does SolrCloud support using Short Circuit Reads when using HDFS? Thanks! -Joe
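Solr's HdfsDirectoryFactory goes through the standard HDFS client, so short-circuit reads are governed by the HDFS client configuration rather than by Solr itself. A hedged sketch of the standard hdfs-site.xml settings (the socket path is illustrative and must match the DataNode side; the native libhadoop library also needs to be available):

```xml
<!-- hdfs-site.xml (client side): standard HDFS short-circuit read settings. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/run/hdfs-sockets/dn</value>
</property>
```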

Re: Solr 6.6.0 - Indexing errors

2017-07-18 Thread Joe Obernberger
ception ex) { System.out.println("Error writting: "+ex); } } } Then I copied the files to the 45 servers and restarted solr 6.6.0 on each. It came back up OK, and it has been indexing all night long. -Joe On 7/17/2017 3:15 PM,

Classify stream expression questions

2017-08-14 Thread Joe Obernberger
Hi All - I'm using the classify stream expression and the results returned are always limited to 1,000. Where do I specify the number to return? The stream expression that I'm using looks like: classify(model(models,id="MODEL1014",cacheMillis=5000),search(COL,df="FULL_DOCUMENT",q="Collection:

Re: Classify stream expression questions

2017-08-14 Thread Joe Obernberger
this. If you have a large number of records to process I would recommend batch processing. This blog explains the parallel batch framework: http://joelsolr.blogspot.com/2016/10/solr-63-batch-jobs-para llel-etl-and.html Joel Bernstein http://joelsolr.blogspot.com/ On Mon, Aug 14, 2017 at 7:53

Solr 6.6.0 - High CPU during indexing

2017-08-18 Thread Joe Obernberger
Indexing about 15 million documents per day across 100 shards on 45 servers. Up until about 350 million documents, each of the solr instances was taking up about 1 core (100% CPU). Recently, they all jumped to 700%. Is this normal? Anything that I can check for? I don't see anything unusua

Re: Solr 6.6.0 - High CPU during indexing

2017-08-18 Thread Joe Obernberger
spent? It can be very helpful for debugging this sort of problem. On Fri, Aug 18, 2017 at 12:37 PM, Joe Obernberger < joseph.obernber...@gmail.com> wrote: Indexing about 15 million documents per day across 100 shards on 45 servers. Up until about 350 million documents, each of the solr ins

Re: Solr 6.6.0 - High CPU during indexing

2017-08-18 Thread Joe Obernberger
. On Fri, Aug 18, 2017 at 12:37 PM, Joe Obernberger < joseph.obernber...@gmail.com> wrote: Indexing about 15 million documents per day across 100 shards on 45 servers. Up until about 350 million documents, each of the solr instances was taking up about 1 core (100% CPU). Recently, they all

Re: Solr 6.6.0 - High CPU during indexing

2017-08-18 Thread Joe Obernberger
usage stayed low for a while, but then eventually comes up to ~800% where it will stay. Please let me know if there is other information that I can provide, or what I should be looking for in the GC logs. Thanks! -Joe On 8/18/2017 2:25 PM, Shawn Heisey wrote: On 8/18/2017 10:37 AM, Joe

Re: Solr 6.6.0 - High CPU during indexing

2017-08-18 Thread Joe Obernberger
wrote: I see a server with 100Gb of memory and processes (java and jsvc) using 203Gb of virtual memory. Hmm. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) On Aug 18, 2017, at 12:05 PM, Joe Obernberger wrote: Thank you Shawn. Please

Re: Solr 6.6.0 - High CPU during indexing

2017-08-18 Thread Joe Obernberger
f the app - just using it to see the monitoring isn't half as useful. On Fri, Aug 18, 2017 at 3:31 PM, Joe Obernberger <joseph.obernber...@gmail.com> wrote: Hi Walter - I see what you are saying, but the machine is not actively swapping (that would be the conce

Re: Solr 6.6.0 - High CPU during indexing

2017-08-18 Thread Joe Obernberger
0.3 0.3 0:00.93 java Note that the OS didn't actually give PID 29566 80G of memory, it actually gave it 275m. Right? Thanks again! -Joe On 8/18/2017 4:15 PM, Shawn Heisey wrote: On 8/18/2017 1:05 PM, Joe Obernberger wrote: Thank you Shawn. Please see: http://www.lovehorsepower.com/V

Machine Learning for search

2017-08-22 Thread Joe Obernberger
Hi All - One of the really neat features of solr 6 is the ability to create machine learning models (information gain) and then use those models as a query.  If I want a user to be able to execute a query for the text Hawaii and use a machine learning model related to weather data, how can I co

Re: Machine Learning for search

2017-08-23 Thread Joe Obernberger
will be too slow to classify the whole result set. In this scenario the search engine ranking will already be returning relevant candidate documents and the model is only used to get a better ordering of the top docs. Joel Bernstein http://joelsolr.blogspot.com/ On Tue, Aug 22, 2017 at 12:32

Re: Solr nodes crashing (OOM) after 6.6 upgrade

2017-09-18 Thread Joe Obernberger
Very nice article - thank you!  Is there a similar article available when the index is on HDFS?  Sorry to hijack!  I'm very interested in how we can improve cache/general performance when running with HDFS. -Joe On 9/18/2017 11:35 AM, Erick Erickson wrote: This is suspicious too. Each entr

Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
Hi All - we have a system with 45 physical boxes running solr 6.6.1 using HDFS as the index.  The current index size is about 31TBytes.  With 3x replication that takes up 93TBytes of disk. Our main collection is split across 100 shards with 3 replicas each.  The issue that we're running into is

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
could also try increasing the timeouts, and the HDFS directory factory has some parameters to tweak that are a mystery to me... All in all, this is behavior that I find mystifying. Best, Erick On Tue, Nov 21, 2017 at 5:07 AM, Joe Obernberger wrote: Hi All - we have a system with 45 physi

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
added some logic to my Solr start up script which scans the log files in HDFS and compares that with the state in ZooKeeper and then delete all lock files that belong to the node that I'm starting. regards, Hendrik On 21.11.2017 14:07, Joe Obernberger wrote: Hi All - we have a system with 45

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
state in ZooKeeper and then delete all lock files that belong to the node that I'm starting. regards, Hendrik On 21.11.2017 14:07, Joe Obernberger wrote: Hi All - we have a system with 45 physical boxes running solr 6.6.1 using HDFS as the index. The current index size is about 31TBytes.

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
e with a gap of like 5min. That normally recovers pretty well. regards, Hendrik On 21.11.2017 20:12, Joe Obernberger wrote: We set the hard commit time long because we were having performance issues with HDFS, and thought that since the block size is 128M, having a longer hard commit made sense.  Tha

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
delete and recreate collections/cores and it sometimes happens that the data was not cleaned up in HDFS and then causes a conflict. Hendrik On 21.11.2017 21:07, Joe Obernberger wrote: We've never run an index this size in anything but HDFS, so I have no comparison. What we've been doing

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
te collections/cores and it sometimes happens that the data was not cleaned up in HDFS and then causes a conflict. Hendrik On 21.11.2017 21:07, Joe Obernberger wrote: We've never run an index this size in anything but HDFS, so I have no comparison.  What we've been doing is keeping tw

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Joe Obernberger
e had so far was those lock files and if you delete and recreate collections/cores and it sometimes happens that the data was not cleaned up in HDFS and then causes a conflict. Hendrik On 21.11.2017 21:07, Joe Obernberger wrote: We've never run an index this size in anything but HDFS, so

FORCELEADER not working - solr 6.6.1

2017-11-21 Thread Joe Obernberger
Hi All - sorry for the repeat, but I'm at a complete loss on this.  I have a collection with 100 shards and 3 replicas each.  6 of the shard will not elect a leader.  I've tried the FORCELEADER command, but nothing changes. The log shows 'Force leader attempt 1.  Waiting 5 secs for an active

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-22 Thread Joe Obernberger
would try looking into the code and figure out what the problem is here and maybe compare the state in HDFS and ZK with a shard that works. regards, Hendrik On 21.11.2017 23:57, Joe Obernberger wrote: Hi Hendrick - the shards in question have three replicas.  I tried restarting each one (on

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-22 Thread Joe Obernberger
data quite a bit. So I would try looking into the code and figure out what the problem is here and maybe compare the state in HDFS and ZK with a shard that works. regards, Hendrik On 21.11.2017 23:57, Joe Obernberger wrote: Hi Hendrick - the shards in question have three replicas. I tried restarti

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-22 Thread Joe Obernberger
44 AM, Shawn Heisey wrote: On 11/22/2017 6:44 AM, Joe Obernberger wrote: Right now, we have a relatively small block cache due to the requirements that the servers run other software.  We tried to find the best balance between block cache size, and RAM for programs, while still giving enough for

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-27 Thread Joe Obernberger
". The smoking gun here is that there are no errors on the follower, just the notification that the leader put it into recovery. There are other variations on the theme, it all boils down to when communications fall apart replicas go into recovery. Best, Erick On Wed, Nov 22, 2017 at 11:02

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-27 Thread Joe Obernberger
led, retry loop.  Anyone else run into this? Thanks. -Joe On 11/27/2017 11:28 AM, Joe Obernberger wrote: Thank you Erick.  Right now, we have our autoCommit time set to 180 (30 minutes), and our autoSoftCommit set to 12.  The thought was that with HDFS we want less frequent, but lar
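The thread weighs commit frequency against HDFS block size: a 30-minute hard commit with a much shorter soft commit. A hedged sketch of where those knobs live in solrconfig.xml — the values below are illustrative, chosen to match the 30-minute hard commit mentioned, and should be tuned for your own load:

```xml
<!-- solrconfig.xml: illustrative commit settings (times are in ms).
     openSearcher=false keeps hard commits cheap; the soft commit
     controls visibility of new documents. -->
<autoCommit>
  <maxTime>1800000</maxTime>   <!-- 30 minutes -->
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>120000</maxTime>    <!-- 2 minutes -->
</autoSoftCommit>
```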

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-12-04 Thread Joe Obernberger
iated Recovery". The smoking gun here is that there are no errors on the follower, just the notification that the leader put it into recovery. There are other variations on the theme, it all boils down to when communications fall apart replicas go into recovery. Best, Erick On Wed, Nov 22

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-12-08 Thread Joe Obernberger
Anyone have any thoughts on this?  Will TLOG replicas use less network bandwidth? -Joe On 12/4/2017 12:54 PM, Joe Obernberger wrote: Hi All - this same problem happened again, and I think I partially understand what is going on.  The part I don't know is what caused any of the replic

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-12-11 Thread Joe Obernberger
leadership if the leader goes down so they must have an up-to-date-after-last-index-sync set of tlogs. At least that's my current understanding... Best, Erick On Fri, Dec 8, 2017 at 12:01 PM, Joe Obernberger wrote: Anyone have any thoughts on this? Will TLOG replicas use less network

Re: Frequency of Full reindex on SolrCloud

2018-01-02 Thread Joe Obernberger
Almost never.  I would only run a re-index for newer versions (such as 6.5.2 to 7.2) that have a required feature or schema changes such as changing the type of an existing field (int to string for example).  Not sure what you mean by 'every delta', but I would assume you just mean new data?  I

stream, features and train

2016-11-23 Thread Joe Obernberger
Hi - I'm trying to experiment with the new train, features, model, classify capabilities of Solr 6.3.0. I'm following along on: https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions#StreamingExpressions-StreamSources When I execute: features(UNCLASS, q="*:*", featureSet

Solr 6.3.0 SQL question

2016-11-28 Thread Joe Obernberger
I'm running this query: curl --data-urlencode 'stmt=SELECT avg(TextSize) from UNCLASS' http://cordelia:9100/solr/UNCLASS/sql?aggregationMode=map_reduce The error that I get back is: {"result-set":{"docs":[ {"EXCEPTION":"org.apache.solr.common.SolrException: Collection not found: unclass","EO

Re: stream, features and train

2016-11-28 Thread Joe Obernberger
n the training set. Joel Bernstein http://joelsolr.blogspot.com/ On Wed, Nov 23, 2016 at 3:21 PM, Joe Obernberger < joseph.obernber...@gmail.com> wrote: Hi - I'm trying to experiment with the new train, features, model, classify capabilities of Solr 6.3.0. I'm following along on: htt

Re: Solr 6.3.0 SQL question

2016-11-29 Thread Joe Obernberger
re a longer error/stack trace in your Solr server logs? I wonder if the real error is being masked. Kevin Risden On Mon, Nov 28, 2016 at 3:24 PM, Joe Obernberger < joseph.obernber...@gmail.com> wrote: I'm running this query: curl --data-urlencode 'stmt=SELECT avg(TextSize) from

Re: Solr 6.3.0 SQL question

2016-11-29 Thread Joe Obernberger
rt collection alias' which is fixed in 6.4 is a work around. On 29 November 2016 at 08:29, Kevin Risden wrote: Is there a longer error/stack trace in your Solr server logs? I wonder if the real error is being masked. Kevin Risden On Mon, Nov 28, 2016 at 3:24 PM, Joe Obernberger < j
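The reply notes the SQL layer lower-cases the unquoted collection name (fixed in 6.4) and that a collection alias is the workaround. A hedged sketch of creating a lower-case alias pointing at the real collection via the Collections API — host and names are placeholders:

```shell
# CREATEALIAS maps a lower-case name onto the mixed-case collection so
# the lower-cased SQL identifier resolves. Host and names are illustrative.
URL="http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=unclass&collections=UNCLASS"
echo "curl '$URL'"
```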

Random Streaming Function not there? SolrCloud 6.3.0

2017-01-03 Thread Joe Obernberger
I'm getting an error: {"result-set":{"docs":[ {"EXCEPTION":"Invalid stream expression random(MAIN,q=\"FULL_DOCUMENT:obamacare\",rows=100,fl=DocumentId) - function 'random' is unknown (not mapped to a valid TupleStream)","EOF":true}]}} When trying to use the streaming random function. I'm us

Re: Random Streaming Function not there? SolrCloud 6.3.0

2017-01-03 Thread Joe Obernberger
x, so the /stream handler registration always gets tested. I'm fairly sure I remember testing random at scale through the /stream handler so I'm not sure how this missed getting committed. I will fix this for Solr 6.4. Joel Bernstein http://joelsolr.blogspot.com/ On Tue, Jan 3, 2017

Large index recommendation

2017-01-13 Thread Joe Obernberger
Hi All - we've been experimenting with Solr Cloud 5.5.0 with a 27 shard (no replication - each shard runs on a physical host) cluster on top of HDFS. It currently just crossed 3 billion documents indexed with an index size of 16.1TBytes. In HDFS with 3x replication this takes up 48.2TBytes.

Re: Large index recommendation

2017-01-16 Thread Joe Obernberger
, going with multiple shards per machines sounds like the way to go here. I do have a test instance, and can do some bench-marking there. Thanks again! -Joe On 1/13/2017 4:16 PM, Toke Eskildsen wrote: Joe Obernberger wrote: [3 billion docs / 16TB / 27 shards on HDFS times 3 for replication

Can't create collection

2017-01-16 Thread Joe Obernberger
Hi All - trying to create a collection in Solr 6.3.0 on a 5 node cluster and getting a timeout error. The cluster has a collection already, but I'm trying to make a new one. The UI reports 'Connection to Solr lost - Please check the Solr instance.'. It is still running, but the log reports:

Re: Can't create collection

2017-01-16 Thread Joe Obernberger
://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-AsynchronousCalls Thanks Hrishikesh On Mon, Jan 16, 2017 at 11:36 AM, Joe Obernberger < joseph.obernber...@gmail.com> wrote: Hi All - trying to create a collection in Solr 6.3.0 on a 5 node cluster and getting a timeout error

indexing error - 6.3.0

2017-01-17 Thread Joe Obernberger
While indexing a large number of records in Solr Cloud 6.3.0 with a 5 node configuration, I received an error. I'm using java code / solrj to perform the indexing by creating a list of SolrInputDocuments, 1000 at a time, and then calling CloudSolrClient.add(list). The records are small - abou

Re: indexing error - 6.3.0

2017-01-18 Thread Joe Obernberger
ound 27-50 million records, I get this error. This is on a newly created collection (that I've been dropping and recreating after each test). Is there anything I can try that may help debug? Perhaps my method of indexing is incorrect? Thanks for any ideas! -Joe On 1/17/2017 10:13 AM, Joe Ob

Re: indexing error - 6.3.0

2017-01-19 Thread Joe Obernberger
ot; or 2> you edit and push when all your Solr nodes are shut down. Best, Erick On Wed, Jan 18, 2017 at 9:11 PM, Joe Obernberger wrote: Hi All - I've been trying to debug this, but it keeps occurring. Even if I do 100 at a time, or 50 at a time, eventually I get the below stack trace. I

Re: indexing error - 6.3.0

2017-01-19 Thread Joe Obernberger
's gotten a lot further along so far. -Joe On 1/19/2017 12:59 PM, Joe Obernberger wrote: Thank you Erick! For this scenario, I was defining the schema manually (editing managed_schema and pushing to zookeeper), but didn't realize that I had left the field guessing block in the

Solr 6.3.0 - recovery failed

2017-02-01 Thread Joe Obernberger
Hi All - I had one node in a 45 shard cluster (9 physical machines) run out of memory. I stopped all the nodes in the cluster and removed any lingering write.lock files from the OOM in HDFS. All the nodes recovered except one replica of one shard that happens to be on the node that ran out of

Re: Solr 6.3.0 - recovery failed

2017-02-01 Thread Joe Obernberger
Thank you for the response. There are no virtual machines in the configuration. The collection has 45 shards with 3 replicas each spread across the 9 physical boxes; each box is running one copy of solr. I've tried to restart just the one node after the other 8 (and all their shards/replicas)

Re: Solr 6.3.0 - recovery failed

2017-02-01 Thread Joe Obernberger
e are failing to recover? Cheers On 1 Feb 2017 6:07 p.m., "Joe Obernberger" wrote: Thank you for the response. There are no virtual machines in the configuration. The collection has 45 shards with 3 replicas each spread across the 9 physical boxes; each box is running one copy of so

Re: Solr 6.3.0 - recovery failed

2017-02-01 Thread Joe Obernberger
ogs, directly ( not from the ui), is there any " caused by" associated to the recovery failure exception? Cheers On 1 Feb 2017 6:28 p.m., "Joe Obernberger" wrote: In HDFS when a node fails it will leave behind write.lock files in HDFS. These files have to be manually re

Re: Solr 6.3.0 - recovery failed

2017-02-01 Thread Joe Obernberger
ot; associated to the recovery failure exception? Cheers On 1 Feb 2017 6:28 p.m., "Joe Obernberger" wrote: In HDFS when a node fails it will leave behind write.lock files in HDFS. These files have to be manually removed; otherwise the shards/replicas that have write.lock files l

Re: Announcing Marple, a RESTful API & GUI for inspecting Lucene indexes

2017-02-27 Thread Joe Obernberger
Hi Charlie - will this work with an index stored in HDFS as written by Solr Cloud? -Joe On 2/24/2017 12:24 PM, Charlie Hull wrote: Hi all, Very pleased to announce the first release of Marple, an open source tool for inspecting Lucene indexes. We've blogged about it here: http://www.flax.co

model building

2017-03-20 Thread Joe Obernberger
I'm trying to build a model using tweets. I've manually tagged 30 tweets as threatening, and 50 random tweets as non-threatening. When I build the mode with: update(models2, batchSize="50", train(UNCLASS, features(UNCLASS,

Re: model building

2017-03-20 Thread Joe Obernberger
If I put the training data into its own collection and use q="*:*", then it works correctly. Is that a requirement? Thank you. -Joe On 3/20/2017 3:47 PM, Joe Obernberger wrote: I'm trying to build a model using tweets. I've manually tagged 30 tweets as threatening, and

Re: model building

2017-03-22 Thread Joe Obernberger
in the same collection. I suspect your training set is too small to get a reliable model from. The training sets we tested with were considerably larger. All the idfs_ds values being the same seems odd though. The idfs_ds in particular were designed to be accurate when there are multiple t

SolrJ and Streaming

2017-04-18 Thread Joe Obernberger
Hi All - any examples of using solrJ and streaming expressions available? Like calling UpdateStream from solrJ? Thank you! -Joe

Re: SolrJ and Streaming

2017-04-18 Thread Joe Obernberger
k; } } } finally { stream.close(); } Joel Bernstein http://joelsolr.blogspot.com/ On Tue, Apr 18, 2017 at 2:33 PM, Joe Obernberger < joseph.obernber...@gmail.com> wrote: Hi All - any examples of using solrJ and streaming expressions available? Like calling UpdateStream from solrJ? Thank you! -Joe

Solr 7.7.0 - Garbage Collection issue

2019-02-12 Thread Joe Obernberger
Yesterday, we upgraded our 40 node cluster from solr 7.6.0 to solr 7.7.0.  This morning, all the nodes are using 1200+% of CPU. It looks like it's in garbage collection.  We did reduce our HDFS cache size from 11G to 6G, but other than that, no other parameters were changes. Top shows: top -

Re: Solr 7.7.0 - Garbage Collection issue

2019-02-12 Thread Joe Obernberger
ent Thank you For the gceasy.io site - that is very slick!  I'll use that in the future.  I can try using the standard settings, but again - at this point it doesn't look GC related to me? -Joe On 2/12/2019 11:35 AM, Shawn Heisey wrote: On 2/12/2019 7:35 AM, Joe Obernberger

Re: Solr 7.7.0 - Garbage Collection issue

2019-02-12 Thread Joe Obernberger
Reverted back to 7.6.0 - same settings, but now I do not encounter the large CPU usage. -Joe On 2/12/2019 12:37 PM, Joe Obernberger wrote: Thank you Shawn.  Yes, I used the settings off of your site. I've restarted the cluster and the CPU usage is back up again. Looking at it now, it do

Re: High CPU usage with Solr 7.7.0

2019-02-27 Thread Joe Obernberger
Just to add to this.  We upgraded to 7.7.0 and saw very large CPU usage on multi core boxes - sustained in the 1200% range.  We then switched to 7.6.0 (no other configuration changes) and the problem went away. We have a 40 node cluster and all 40 nodes had high CPU usage with 3 indexes stored

Solr 7.1.0 - concurrent.ExecutionException building model

2018-04-02 Thread Joe Obernberger
concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)     ... 1 more Last Check: 4/2/2018, 3:47:15 PM Thank you! -Joe Obernberger

Re: Solr 7.1.0 - concurrent.ExecutionException building model

2018-04-02 Thread Joe Obernberger
n. Are the logs from http://vesta:9100/solr/MODEL1024_1522696624083_shard20_replica_n75 reporting any issues? When you go to that url is it back up and running? Joel Bernstein http://joelsolr.blogspot.com/ On Mon, Apr 2, 2018 at 3:55 PM, Joe Obernberger < joseph.obernber...@gmail.com> w

Re: Solr 7.1.0 - concurrent.ExecutionException building model

2018-04-05 Thread Joe Obernberger
InternalHttpClient.java:185)     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)     at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMeth

Re: Largest number of indexed documents used by Solr

2018-04-05 Thread Joe Obernberger
50 billion per day?  Wow!  How large are these documents? We have a cluster with one large collection that contains 2.4 billion documents spread across 40 machines using HDFS for the index.  We store our data inside of HBase, and in order to re-index data we pull from HBase and index with solr

Re: Solr 7.1.0 - concurrent.ExecutionException building model

2018-04-05 Thread Joe Obernberger
, Joe Obernberger wrote: Thank you Shawn - sorry so long to respond, been playing around with this a good bit.  It is an amazing capability.  It looks like it could be related to certain nodes in the cluster not responding quickly enough.  In one case, I got the concurrent.ExecutionException, but it

Re: Solr 7.1.0 - concurrent.ExecutionException building model

2018-04-05 Thread Joe Obernberger
features you can make this matrix much smaller. Its fairly easy to make the train function work on a random sample of the training set on each iteration rather then the entire training set, but currently this is not how its implemented. Feel free to create a ticket requesting this the sampling approa

Solr7.1.0 - deleting collections when using HDFS

2018-04-10 Thread Joe Obernberger
Hi All - I've noticed that if I delete a collection that is stored in HDFS, the files/directory in HDFS remain.  If I then try to recreate the collection with the same name, I get an error about unable to open searcher.  If I then remove the directory from HDFS, the error remains due to files s

Re: Solr OOM Crashes / JVM tuning advice

2018-04-11 Thread Joe Obernberger
Just as a side note, when Solr goes OOM and kills itself, and if you're running HDFS, you are guaranteed to have write.lock files left over.  If you're running lots of shards/replicas, you may have many files that you need to go into HDFS and delete before restarting. -Joe On 4/11/2018 10:46
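A hedged sketch of the manual cleanup described here: locating leftover write.lock files under the Solr HDFS root after an OOM kill, before restarting nodes. All paths are placeholders, and each lock should be verified to belong to a node that is genuinely down before removal:

```shell
# List and remove stale lock files (commands echoed as a sketch; paths
# and collection/core names are illustrative, not from the thread).
SOLR_HDFS_ROOT="/solr6.6.0"
echo "hadoop fs -ls -R $SOLR_HDFS_ROOT | grep write.lock"
echo "hadoop fs -rm $SOLR_HDFS_ROOT/COLLECTION/core_node1/data/index/write.lock"
```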

Solr 7 + HDFS issue

2018-06-11 Thread Joe Obernberger
We are seeing an issue on our Solr Cloud 7.3.1 cluster where replication starts and pegs network interfaces so aggressively that other tasks cannot talk.  We will see it peg a bonded 2GB interfaces.  In some cases the replication fails over and over until it finally succeeds and the replica com

Re: Solr 7 + HDFS issue

2018-06-12 Thread Joe Obernberger
"_4jq2_4.liv", "_4jqm.cfe", "_4jqm.cfs", "_4jqm.si", "_4jqm_1.liv", "_4jqn.cfe", "_4jqn.cfs", "_4jqn.si", "_4jqn_2.liv",

Exception writing to index; possible analysis error - 7.3.1 - HDFS

2018-06-22 Thread Joe Obernberger
I'm getting an error on some of the nodes in my solr cloud cluster under heavy indexing load.  Once the error happens, that node, just repeatedly gets this error over and over and will no longer index documents until a restart.  I believe the root cause of the error is: File /solr7.1.0/UNCLASS/c

Can't recover - HDFS

2018-07-02 Thread Joe Obernberger
Hi All - having this same problem again with a large index in HDFS.  A replica needs to recover, and it just spins retrying over and over again.  Any ideas?  Is there an adjustable timeout? Screenshot: http://lovehorsepower.com/images/SolrShot1.jpg Thank you! -Joe Obernberger

Solr 7.1.0 - NoNode for /collections

2018-07-02 Thread Joe Obernberger
Hi - On startup, I'm getting the following error.  The shard had 3 replicas, but none are selected as the leader.  I deleted one, and adding a new one back, but that had no effect, and at times the calls would timeout.  I was having the same issue with another shard on the same collection and d

Re: Solr 7.1.0 - NoNode for /collections

2018-07-02 Thread Joe Obernberger
Just to add to this - looks like the only valid replica that is remaining is a TLOG type, and I suspect that is why it no longer has a leader.  Poop. -Joe On 7/2/2018 7:54 PM, Joe Obernberger wrote: Hi - On startup, I'm getting the following error.  The shard had 3 replicas, but non

Re: Can't recover - HDFS

2018-07-03 Thread Joe Obernberger
a:624)     at java.lang.Thread.run(Thread.java:748) Thank you very much for the help! -Joe On 7/2/2018 8:32 PM, Shawn Heisey wrote: On 7/2/2018 1:40 PM, Joe Obernberger wrote: Hi All - having this same problem again with a large index in HDFS.  A replica needs to recover, and it just spins retrying

CloudSolrClient URL Too Long

2018-07-12 Thread Joe Obernberger
Hi - I'm using SolrCloud 7.3.1 and calling a search from Java using: org.apache.solr.client.solrj.response.QueryResponse response = CloudSolrClient.query(ModifiableSolrParams) If the ModifiableSolrParams are long, I get an error: Bad Message 414reason: URI Too Long I have the maximum number o

Re: CloudSolrClient URL Too Long

2018-07-13 Thread Joe Obernberger
Shawn - thank you!  That works great.  Stupid huge searches here I come! -Joe On 7/12/2018 4:46 PM, Shawn Heisey wrote: On 7/12/2018 12:48 PM, Joe Obernberger wrote: Hi - I'm using SolrCloud 7.3.1 and calling a search from Java using: org.apache.solr.client.solrj.response.QueryRes
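The fix Joe is thanking Shawn for is sending the parameters in a POST body instead of the URL, which sidesteps the container's 414 URI-length limit; in SolrJ this is done by passing `SolrRequest.METHOD.POST` to `query(...)`. A curl-flavored sketch of the same idea, with placeholder host, collection, and parameters:

```shell
# A long parameter set goes in the request body rather than the URI.
# Host, collection, and query are illustrative.
PARAMS="q=*:*&rows=10"
echo "curl -XPOST --data '$PARAMS' 'http://localhost:8983/solr/mycollection/select'"
```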

Solr 7.1 nodes shutting down

2018-08-10 Thread Joe Obernberger
Hi All - having an issue that seems to be related to the machine being under a high CPU load.  Occasionally a node will fall out of the solr cloud cluster.  It will be using 200% CPU and show the following exception: 2018-08-10 15:36:43.416 INFO  (qtp1908316405-203450) [c:models s:shard3 r:cor

NoClassDefFoundError - Faceting on 8.2.0

2020-02-05 Thread Joe Obernberger
! -Joe Obernberger

Split Shard - HDFS Index - Solr 7.6.0

2020-02-10 Thread Joe Obernberger
1755371094",     "exception": {         "msg": "not enough free disk space to perform index split on node sys-hadoop-1:9100_solr, required: 306.76734546013176, available: 16.772361755371094",         "rspCode": 500     },     "status": {         "state": "failed",         "msg": "found [] in failed tasks"     } } -Joe Obernberger

Solr 8.2.0 - Schema issue

2020-02-26 Thread Joe Obernberger
schema), I get an error that the field doesn't exist. If I restart the cluster, this problem goes away and I can add a document with the new field to any solr collection that has the schema.  Any work-arounds that don't involve a restart? Thank you! -Joe Obernberger
