Re: SolrJ appears to have problems with Docker Toolbox

2017-04-11 Thread Mike Thomsen
ss your VM (the public network > interface where the docker instance is mapped). > > Hope you understand... :) > > Cheers, > Vincenzo > > > On Sun, Apr 9, 2017 at 2:42 AM, Mike Thomsen > wrote: > > > I'm running two nodes of SolrCloud in Docker on Window

Re: SolrJ appears to have problems with Docker Toolbox

2017-04-08 Thread Mike Thomsen
on would get better help in a Docker forum. > Cheers -- Rick > > On April 8, 2017 8:42:13 PM EDT, Mike Thomsen > wrote: > >I'm running two nodes of SolrCloud in Docker on Windows using Docker > >Toolbox. The problem I am having is that Docker Toolbox runs inside of >

SolrJ appears to have problems with Docker Toolbox

2017-04-08 Thread Mike Thomsen
I'm running two nodes of SolrCloud in Docker on Windows using Docker Toolbox. The problem I am having is that Docker Toolbox runs inside of a VM and so it has an internal network inside the VM that is not accessible to the Docker Toolbox VM's host OS. If I go to the VM's IP which is 192.168.99.100

Re: Data Import

2017-03-17 Thread Mike Thomsen
If Solr is down, then adding through SolrJ would fail as well. Kafka's new API has some great features for this sort of thing. The new client API is designed to be run in a long-running loop where you poll for new messages with a certain amount of defined timeout (ex: consumer.poll(1000) for 1s) So

Re: SOLR Data Locality

2017-03-17 Thread Mike Thomsen
I've only ever used the HDFS support with Cloudera's build, but that experience turned me off using HDFS. I'd much rather use the native file system over HDFS. On Tue, Mar 14, 2017 at 10:19 AM, Muhammad Imad Qureshi < imadgr...@yahoo.com.invalid> wrote: > We have a 30 node Hadoop cluster and each

How to expose new Lucene field type to Solr

2017-03-02 Thread Mike Thomsen
Found this project and I'd like to know what would be involved with exposing its RestrictedField type through Solr for indexing and querying as a Solr field type. https://github.com/roshanp/lucure-core Thanks, Mike

Re: solr warning - filling logs

2017-02-27 Thread Mike Thomsen
It's a brittle ZK configuration. A typical ZK quorum is three nodes for most production systems. One is fine, though, for development provided the system it's on is not overloaded. On Mon, Feb 27, 2017 at 6:43 PM, Rick Leir wrote: > Hi Mike > We are using a single ZK node, I think. What problems

Re: Index Segments not Merging

2017-02-27 Thread Mike Thomsen
Just barely skimmed the documentation, but it looks like the tool generates its own shards and pushes them into the collection by manipulating the configuration of the cluster. https://www.cloudera.com/documentation/enterprise/5-8-x/topics/search_mapreduceindexertool.html If that reading is corre

Re: solr warning - filling logs

2017-02-27 Thread Mike Thomsen
When you transition to an external zookeeper, you'll need at least 3 ZK nodes. One is insufficient outside of a development environment. That's a general requirement for any system that uses ZK. On Sun, Feb 26, 2017 at 7:14 PM, Satya Marivada wrote: > May I ask about the port scanner running? Ca
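
The sizing advice above follows from ZooKeeper needing a strict majority of the ensemble to stay available. A minimal sketch of that arithmetic (the function names are made up for illustration and are not part of any ZooKeeper or Solr API):

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be lost while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Print ensemble size, quorum size, and tolerated failures side by side.
    for n in (1, 2, 3, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

By this arithmetic one or two nodes tolerate no failures at all, while three nodes tolerate one, which is why three is the usual minimum outside development.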

Re: Fwd: Solr dynamic field blowing up the index size

2017-02-21 Thread Mike Thomsen
Correct me if I'm wrong, but heavy use of doc values should actually blow up the size of your index considerably if they are in fields that get sent a lot of data. On Tue, Feb 21, 2017 at 10:50 AM, Pratik Patel wrote: > Thanks for the reply. I can see that in solr 6, more than 50% of the index >

Re: Solr partial update

2017-02-09 Thread Mike Thomsen
Set the fl parameter equal to the fields you want and then query for id:(SOME_ID OR SOME_ID OR SOME_ID) On Thu, Feb 9, 2017 at 5:37 AM, Midas A wrote: > Hi, > > i want solr doc partially if unique id exist else we donot want to do any > thing . > > how can i achieve this . > > Regards, > Midas >
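
The suggestion above can be sketched as a small URL builder. This is only an illustration: the base URL, the field names and the helper name are assumptions, and the IDs are assumed not to contain double quotes.

```python
from urllib.parse import urlencode


def build_id_lookup_url(base_url, ids, fields):
    """Build a /select URL that fetches only `fields` for the given ids.

    Assumption: ids contain no double quotes, so simple quoting is enough.
    """
    id_clause = " OR ".join(f'"{doc_id}"' for doc_id in ids)
    params = {
        "q": f"id:({id_clause})",   # id:("A" OR "B" OR "C")
        "fl": ",".join(fields),     # restrict the returned fields
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical collection URL and field names.
    print(build_id_lookup_url(
        "http://localhost:8983/solr/default-collection",
        ["SOME_ID_1", "SOME_ID_2"],
        ["id", "title"],
    ))
```

IDs that do not exist simply produce no document in the response, so the caller can decide per ID whether there is anything to update.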

Re: Solr Kafka DIH

2017-01-31 Thread Mike Thomsen
Probably not, but writing your own little Java process to do it would be trivial with Kafka 0.9.X or 0.10.X. You can also look at the Confluent Platform as they have tons of connectors for Kafka to directly feed into other systems. On Mon, Jan 30, 2017 at 3:05 AM, Mahmoud Almokadem wrote: > Hell

Re: Is it possible to rewrite part of the solr response?

2017-01-18 Thread Mike Thomsen
if there's any way you can build this into tokens in the > doc and use a standard fq clause it's usually much easier. That may > take some creative work at indexing time if it's even possible. > > Best, > Erick > > On Wed, Dec 21, 2016 at 5:56 PM, Mike Thomsen >

Re: Solr ACL Plugin Windows

2017-01-04 Thread Mike Thomsen
I didn't see a real Java project there, but the directions to compile on Linux are almost always applicable to Windows with Java. If you find a project that says it uses Ant or Maven, all you need to do is download Ant or Maven, the Java Development Kit and put both of them on the Windows PATH. The

Re: HDFS support maturity

2017-01-03 Thread Mike Thomsen
Cloudera defaults their Hadoop installation to use HDFS w/ their bundle of Solr (4.10.3) if that is any indication. On Tue, Jan 3, 2017 at 7:40 AM, Hendrik Haddorp wrote: > Hi, > > is the HDFS support in Solr 6.3 considered production ready? > Any idea how many setups might be using this? > > th

Re: Is it possible to rewrite part of the solr response?

2016-12-21 Thread Mike Thomsen
iness logic is such that > you can calculate them all "fast enough", you're golden. > > All that said, if there's any way you can build this into tokens in the > doc and use a standard fq clause it's usually much easier. That may > take some creative work

Is it possible to rewrite part of the solr response?

2016-12-21 Thread Mike Thomsen
We're trying out some ideas on locking down solr and would like to know if there is a public API that allows you to grab the response before it is sent and inspect it. What we're trying to do is something for which a filter query is not a good option to really get where we want to be. Basically, it

Replica document counts out of sync

2016-11-30 Thread Mike Thomsen
In one of our environments, we have an issue where one shard has two replicas with smaller document counts than the third one. This is on Solr 4.10.3 (Cloudera's build). We've found that shutting down the smaller replicas, deleting their data folders and restarting one by one will do the trick of f

Detecting schema errors while adding documents

2016-11-16 Thread Mike Thomsen
We're stuck on Solr 4.10.3 (Cloudera bundle). Is there any way to detect with SolrJ when a document added to the index violated the schema? All we see when we look at the stacktrace for the SolrException that comes back is that it contains messages about an IOException when talking to the solr node

Re: Rolling backups of a collection

2016-11-09 Thread Mike Thomsen
ide of Solr. If you post that > > script, may be we can even ship it as part of Solr itself (for the > benefit > > of the community). > > > > Thanks > > Hrishikesh > > > > > > > > On Wed, Nov 9, 2016 at 9:17 AM, Mike Thomsen > > wrote:

Rolling backups of a collection

2016-11-09 Thread Mike Thomsen
I read over the docs ( https://cwiki.apache.org/confluence/display/solr/Making+and+Restoring+Backups) and am not quite sure what route to take. My team is looking for a way to back up the entire index of a SolrCloud collection with regular rotation similar to the backup option available in a single

Backup to HDFS while running cluster on local disk

2016-11-08 Thread Mike Thomsen
We have SolrCloud running on bare metal but want the nightly snapshots to be written to HDFS. Can someone give me some help on configuring the HdfsBackupRepository? ${solr.hdfs.default.backup.path} ${solr.hdfs.home:} ${solr.hdfs.confdir:} Not sure how to procee

Best way to generate multivalue fields from streaming API

2016-09-16 Thread Mike Thomsen
Read this article and thought it could be interesting as a way to do ingestion: https://dzone.com/articles/solr-streaming-expressions-for-collection-auto-upd-1 Example from the article: daemon(id="12345", runInterval="6", update(users, batchSize=10, jdbc(connection="jdbc:mysql://loca

Update command not working

2016-02-26 Thread Mike Thomsen
I posted this to http://localhost:8983/solr/default-collection/update and it treated it like I was adding a whole document, not a partial update: { "id": "0be0daa1-a6ee-46d0-ba05-717a9c6ae283", "tags": { "add": [ "news article" ] } } In the logs, I found this: 2016-02-26 14:0
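
For reference, the Solr reference guide's atomic-update examples send the documents as a JSON array rather than a bare object. The sketch below only builds a payload in that shape for the document quoted above. The helper name is made up, and the message is cut off before the log output, so whether the array form changes the behavior described here is not established.

```python
import json


def atomic_add(doc_id, field, values):
    """Return an atomic-update payload that appends `values` to `field`.

    The single document is wrapped in a list, matching the array form
    used in the reference guide's atomic-update examples.
    """
    return [{"id": doc_id, field: {"add": list(values)}}]


if __name__ == "__main__":
    payload = atomic_add(
        "0be0daa1-a6ee-46d0-ba05-717a9c6ae283", "tags", ["news article"]
    )
    # This string would be sent as the POST body with
    # Content-Type: application/json.
    print(json.dumps(payload))
```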

Re: /select changes between 4 and 5

2016-02-24 Thread Mike Thomsen
content types? > > -Yonik > > > On Wed, Feb 24, 2016 at 8:48 AM, Mike Thomsen > wrote: > > With 4.10, we used to post JSON like this example (part of it is Python) > to > > /select: > > > > { > > "q": "LONG_QUERY_HERE",

/select changes between 4 and 5

2016-02-24 Thread Mike Thomsen
With 4.10, we used to post JSON like this example (part of it is Python) to /select: { "q": "LONG_QUERY_HERE", "fq": fq, "fl": ["id", "title", "date_of_information", "link", "search_text"], "rows": 100, "wt": "json", "indent": "true", "_": int(time.time()) } We just up

Leader election issues after upgrade from 4.10.4 to 5.4.1

2016-02-08 Thread Mike Thomsen
We get this error on one of our nodes: Caused by: org.apache.solr.common.SolrException: There is conflicting information about the leader of shard: shard2 our state says: http://server01:8983/solr/collection/ but zookeeper says: http://server02:8983/collection/ Then I noticed this in the log: ]

zkCli.sh not in solr 5.4?

2016-01-19 Thread Mike Thomsen
I downloaded a build of 5.4.0 to install in some VMs and noticed that zkCli.sh is not there. I need it in order to upload a configuration set to ZooKeeper before I create the collection. What's the preferred way of doing that? Specifically, I need to specify a configuration like this because it's

Phrase query not matching exact tokens in some cases

2015-07-14 Thread Mike Thomsen
For the query "police office" our users are getting back highlighted results for "police office*r*" (and "police office*rs*") I get why a search for police officers would include just "office" since the stemmer would cause that behavior. However I don't understand why "office" is matching "officer"

Re: Exact phrase search on very large text

2015-06-26 Thread Mike Thomsen
ene, the underlying search engine library, imposes this 32K limit for > individual terms. Use tokenized text instead. > > -- Jack Krupansky > > On Thu, Jun 25, 2015 at 8:36 PM, Mike Thomsen > wrote: > > > I need to be able to do exact phrase searching on some documents that > ar

Exact phrase search on very large text

2015-06-25 Thread Mike Thomsen
I need to be able to do exact phrase searching on some documents that are a few hundred kb when treated as a single block of text. I'm on 4.10.4 and it complains when I try to put something larger than 32kb in using a textfield with the keyword tokenizer as the tokenizer. Is there any way I can ind

ManagedStopFilterFactory not accepting ignoreCase

2015-06-17 Thread Mike Thomsen
We're running Solr 4.10.4 and getting this... Caused by: java.lang.IllegalArgumentException: Unknown parameters: {ignoreCase=true} at org.apache.solr.rest.schema.analysis.BaseManagedTokenFilterFactory.<init>(BaseManagedTokenFilterFactory.java:46) at org.apache.solr.rest.schema.analysis.M

Exact phrase search not working

2015-06-11 Thread Mike Thomsen
This is my field definition: Then I query for this exact phrase (which I can see in various documents) and get no results... my_field: "baltimore po

Re: Shard still around after calling splitshard

2015-06-04 Thread Mike Thomsen
ending upon the version of Solr you're > using). > > > On Thu, Jun 4, 2015 at 10:35 AM, Mike Thomsen > wrote: > > > I thought splitshard was supposed to get rid of the original shard, > > shard1, in this case. Am I missing something? I was expecting the only >

Shard still around after calling splitshard

2015-06-04 Thread Mike Thomsen
I thought splitshard was supposed to get rid of the original shard, shard1, in this case. Am I missing something? I was expecting the only two remaining shards to be shard1_0 and shard1_1. The REST call I used was /admin/collections?collection=default-collection&shard=shard1&action=SPLITSHARD if t
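
The reply in this thread is truncated, so the sketch below relies on general Collections API behavior rather than on what was answered: SPLITSHARD leaves the parent shard in place but inactive, and DELETESHARD can remove an inactive shard. The helper name and base URL are assumptions.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, **params):
    """Build a Collections API URL for `action` with the given parameters."""
    query = urlencode({"action": action, **params})
    return f"{base_url}/admin/collections?{query}"


if __name__ == "__main__":
    base = "http://localhost:8983/solr"  # hypothetical Solr base URL
    # The split call from the message above.
    print(collections_api_url(base, "SPLITSHARD",
                              collection="default-collection", shard="shard1"))
    # A follow-up call that removes the now-inactive parent shard.
    print(collections_api_url(base, "DELETESHARD",
                              collection="default-collection", shard="shard1"))
```

It is worth checking in the cluster state that shard1 is marked inactive and that shard1_0 and shard1_1 are active before issuing the delete.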

Managed synonyms and Solr Java API

2015-04-29 Thread Mike Thomsen
Is there a way to manage synonyms through Solr's Java API? Google doesn't turn up any good results, and I didn't see anything in the javadocs that looked promising. Thanks, Mike

Can't find result of autophrase filter

2015-04-20 Thread Mike Thomsen
This is the content of my autophrases.txt file: al qaeda in the arabian peninsula seat belt I've attached a screenshot showing the analysis view of the index. When I query for al_qaeda_in_the_arabian_peninsula or alqaedainthearabianpeninsula, nothing comes back even though at least the latter app

Re: Using synonyms API

2015-04-15 Thread Mike Thomsen
"initializedOn":"2015-04-14T19:39:55.157Z", > "managedMap":{ > "GB":["GiB", > "Gigabyte"], > "TV":["Television"], > "happy":["glad", > &q

Re: Using synonyms API

2015-04-15 Thread Mike Thomsen
"Gigabyte"], > "TV":["Television"], > "happy":["glad", > "joyful"]}}} > > > Verify that your URL has the correct port number (your example below > doesn't), and that "default-co

Using synonyms API

2015-04-15 Thread Mike Thomsen
We recently upgraded from 4.5.0 to 4.10.4. I tried getting a list of our synonyms like this: http://localhost/solr/default-collection/schema/analysis/synonyms/english I got a not found error. I found this page on new features in 4.8 http://yonik.com/solr-4-8-features/ Do we have to do something

Re: Using the collections API to create a new collection

2015-03-15 Thread Mike Thomsen
Now whenever one of the replicas for that collection starts up, it contact > ZK and reads the config files and starts up. The replica does _not_ > copy the files locally. > > HTH, > Erick > > On Sun, Mar 15, 2015 at 6:16 AM, Mike Thomsen > wrote: > > I tried that w

Re: Using the collections API to create a new collection

2015-03-15 Thread Mike Thomsen
the collection. This would link the > config set in zk with your collection. > > I think it would make a lot of sense for you to go through the getting > started with SolrCloud section in the Solr Reference Guide for 4.5. > > On Sat, Mar 14, 2015 at 12:02 PM, Mike Thomsen

Re: Using the collections API to create a new collection

2015-03-14 Thread Mike Thomsen
ce an update into zookeeper? Or should I just purge the zookeeper data? On Sat, Mar 14, 2015 at 3:02 PM, Mike Thomsen wrote: > I looked in the tree view and I have only a node called "configs." Nothing > called "configsets." That's a serious problem, right? So if I

Re: Using the collections API to create a new collection

2015-03-14 Thread Mike Thomsen
at > you specified for someExistingCollection. It's not the _collection_ > that should be > existing, it should be the configset. > > It's often a bit confusing because if the configName is not specified, > the default > is to look for a config set of the same name as

Using the collections API to create a new collection

2015-03-14 Thread Mike Thomsen
We're running SolrCloud 4.5.0. It's just a standard version of SolrCloud deployed in Tomcat, not something like the Cloudera distribution (I note that because I can't seem to find solrctl and other things referenced in the Cloudera tutorials). I'm trying to create a new Solr collection like this:

Re: Solr cannot find solr.xml even though it's there

2014-12-20 Thread Mike Thomsen
c 20, 2014 at 3:40 PM, Shawn Heisey wrote: > On 12/20/2014 12:27 PM, Mike Thomsen wrote: > > at java.lang.Thread.run(Thread.java:745) > > /solr.xml cannot start Solrcommon.SolrException: solr.xml does not > > exist in /opt/solr/solr-shard1 > > at org

Solr cannot find solr.xml even though it's there

2014-12-20 Thread Mike Thomsen
I'm getting the following stacktrace with Solr 4.5.0 SEVERE: null:org.apache.solr.common.SolrException: Could not load SOLR configuration at org.apache.solr.core.ConfigSolr.fromFile(ConfigSolr.java:71) at org.apache.solr.core.ConfigSolr.fromSolrHome(ConfigSolr.java:98) at

Need some help with solr not restarting

2014-08-11 Thread Mike Thomsen
I'm very new to SolrCloud. When I tried restarting our tomcat server running SolrCloud, I started getting this in our logs: SEVERE: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /configs/configuration1/default-collection/data/index/_3ts3_Lucene4

Re: Is Solr right for our project?

2010-09-28 Thread Mike Thomsen
eature descriptions > Coming to a trunk near you - see > https://issues.apache.org/jira/browse/SOLR-1873 > > -- > Jan Høydahl, search solution architect > Cominvent AS - www.cominvent.com > > On 27. sep. 2010, at 17.44, Mike Thomsen wrote: > >> (I apologize in advance if I

Is Solr right for our project?

2010-09-27 Thread Mike Thomsen
(I apologize in advance if I missed something in your documentation, but I've read through the Wiki on the subject of distributed searches and didn't find anything conclusive) We are currently evaluating Solr and Autonomy. Solr is attractive due to its open source background, following and price.

Newbie question about search behavior

2010-08-16 Thread Mike Thomsen
Is it possible to set up Lucene to treat a keyword search such as title:News implicitly like title:News* so that any title that begins with News will be returned without the user having to throw in a wildcard? Also, are there any common filters and such that are generally considered a good pra
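
One way to get the behavior asked about above without touching the Lucene configuration is to rewrite the user's input on the client side before it reaches the query parser. A minimal sketch, assuming a single-term input and a hypothetical helper name. It escapes the query parser's special characters and then appends the wildcard:

```python
import re

# Characters and operators the classic Lucene query syntax treats specially.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/]|&&|\|\|)')


def prefix_query(field, user_text):
    """Turn user input such as News into a prefix query such as title:News*.

    Assumption: user_text is a single term without whitespace.
    """
    escaped = _SPECIAL.sub(r"\\\1", user_text.strip())
    return f"{field}:{escaped}*"


if __name__ == "__main__":
    print(prefix_query("title", "News"))
```

Because wildcards typed by the user are escaped, they are matched literally, and the only wildcard in effect is the one appended by the helper.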