My guess is that it's a documentation gap.
I ran a test turning off CDCR via action=stop while continuously sending
documents to the source cluster. The tlog files kept growing; after each hard
commit, a new tlog file was created and the old files stayed there forever. As
soon as
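For reference, a sketch of the stop call used in a test like this (the host and collection name are placeholders, not from the thread):

  curl "http://source-host:8983/solr/<collection>/cdcr?action=STOP"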
Not sure if I can answer the question; we previously used the manual command to
clean up the logs and scheduled it with a Linux daemon. On Windows there should
be a corresponding tool to do so.
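A minimal sketch of scheduling such a cleanup on Windows with the built-in Task Scheduler (the task name, time, and batch file path are placeholders):

  schtasks /create /sc DAILY /st 02:00 /tn "SolrTlogCleanup" /tr "C:\solr\scripts\tlog-cleanup.bat"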
We currently use Netflix Exhibitor to manage the ZooKeeper instances, and
it works pretty well.
S
Yeah, it just seems weird that you would need to disable the buffer on the
source cluster, though.
The docs say "Replicas do not need to buffer updates, and it is recommended
to disable buffer on the target SolrCloud", which implies the source should
have it enabled.
But the fact that it's working for
Yes, documents are being sent to the target. Monitoring the output from
"action=queues", depending on your settings, you will see the document
replication progress.
On the other hand, if you enable the buffer, lastProcessedVersion always
returns -1. Reading the source code, the CdcrUpdateLogSync
After disabling the buffer, are you still seeing documents being replicated
to the target cluster(s)?
On Mon, Jul 10, 2017 at 1:07 PM, Xie, Sean wrote:
> After several experiments and observations, I finally made it work.
> The key point is that you also have to disable the buffer (action=DISABLEBUFFER) on the source cluster. I don’t
Hello Shawn.
Thank you very much for the comment.
On 24 June 2017 at 16:14, Shawn Heisey wrote:
> On 6/24/2017 2:14 AM, Arcadius Ahouansou wrote:
> > Interpretation 1:
>
> ZooKeeper doesn't *need* an odd number of servers, but there's no
> benefit to an even number. If you have 5 servers, two
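The quorum arithmetic behind that: ZooKeeper needs floor(n/2) + 1 servers up to
keep a quorum, so 5 servers tolerate 2 failures (quorum of 3), while 6 servers
also tolerate only 2 (quorum of 4); the extra even-numbered server adds no
fault tolerance.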
Use SolrJ if you end up developing an indexer in Java to send documents to
Solr. It's been a long time since I have used DIH, but you can give it a try
first; otherwise, as Walter suggested, developing an external indexer is best.
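A minimal SolrJ sketch of that approach, assuming a Solr 6.x solr-solrj dependency; the ZooKeeper address, collection, and field names are placeholders:

  import org.apache.solr.client.solrj.impl.CloudSolrClient;
  import org.apache.solr.common.SolrInputDocument;

  public class SimpleIndexer {
      public static void main(String[] args) throws Exception {
          // Connect via ZooKeeper so SolrJ routes each document to the right shard leader.
          try (CloudSolrClient client = new CloudSolrClient.Builder()
                  .withZkHost("localhost:2181").build()) {
              client.setDefaultCollection("mycollection"); // placeholder collection
              SolrInputDocument doc = new SolrInputDocument();
              doc.addField("id", "doc-1");      // placeholder fields
              doc.addField("title", "Example");
              client.add(doc);
              client.commit();                  // for bulk loads, prefer autoCommit
          }
      }
  }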
On Sun, Jul 9, 2017 at 6:46 PM, Walter Underwood
wrote:
> 4. Write an external progra
Thank you guys for your advice!
I would rather take as much advantage as possible of the existing
handlers/processors.
I just realised that nested entities in DIH are extremely slow; I fixed that
with a view on the DB (which does a join between the two tables).
The other thing I have to do is chain th
How are you specifying multiple fields? Use the qf parameter to specify
multiple fields, e.g.:
http://localhost:8983/solr/techproducts/select?indent=on&q=Samsung%20Maxtor%20hard&wt=json&defType=edismax&qf=name%20manu&debugQuery=on&mm=1
On Mon, Jul 10, 2017 at 4:51 PM, Michael Joyner wrote:
> Hello a
Hello all,
How does setting mm = 1 for edismax impact multi-field searches?
We set mm to 1 and get zero results back when specifying multiple fields
to search across.
Is there a way to set mm = 1 for each field, but to OR the individual
field searches together?
-Mike/NewsRx
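One possible approach (a hedged sketch, not from this thread): issue a separate edismax subquery per field through the lucene parser's _query_ hook and OR them together, with $uq dereferencing the user's terms:

  q=_query_:"{!edismax qf=name mm=1 v=$uq}" OR _query_:"{!edismax qf=manu mm=1 v=$uq}"&uq=Samsung Maxtor hard

Each subquery then applies mm=1 within a single field, and the outer OR combines the per-field matches.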
I did this at Netflix with Solr 1.3, read stuff out of various databases and
sent it all to Solr. I’m not sure DIH even existed then.
At Chegg, we have a slightly more elaborate system because we have so many
collections and data sources. Each content owner writes an “extractor” that
makes a JSON
After several experiments and observations, I finally made it work.
The key point is that you also have to disable the buffer (action=DISABLEBUFFER)
on the source cluster. I don’t know why the wiki doesn’t mention it, but I
figured this out through the source code.
Once the buffer is disabled on the source cluster, the lastProcessedVersion
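For reference, a sketch of the disable-buffer call against the source cluster (host and collection are placeholders):

  curl "http://source-host:8983/solr/<collection>/cdcr?action=DISABLEBUFFER"

The buffer state can then be verified with action=STATUS.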
Well, one issue is that Paddle* Arm* has an implicit OR between the terms. Try
+Paddle* +Arm*
That'll reduce the documents found, although it would find "Paddle
robotic armature" (no such thing, just sayin').
Although another possibility is that you're really sending
some_field:Paddle* Arm*
w
>4. Write an external program that fetches the file, fetches the metadata,
>combines them, and send them to Solr.
I've done this with some custom crawls. Thanks to Erick Erickson, this is a
snap:
https://lucidworks.com/2012/02/14/indexing-with-solrj/
With the caveat that Tika should really be i
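The pattern from that article, as a hedged sketch (assumes tika-parsers and solr-solrj on the classpath; the URL, core, path, and field names are placeholders):

  import java.io.FileInputStream;
  import java.io.InputStream;
  import org.apache.solr.client.solrj.impl.HttpSolrClient;
  import org.apache.solr.common.SolrInputDocument;
  import org.apache.tika.metadata.Metadata;
  import org.apache.tika.parser.AutoDetectParser;
  import org.apache.tika.sax.BodyContentHandler;

  public class TikaIndexer {
      public static void main(String[] args) throws Exception {
          // Run Tika client-side so a parser crash cannot take down the Solr JVM.
          BodyContentHandler handler = new BodyContentHandler(-1); // -1 = no size limit
          Metadata metadata = new Metadata();
          try (InputStream in = new FileInputStream("/path/to/file.pdf")) {
              new AutoDetectParser().parse(in, handler, metadata);
          }
          try (HttpSolrClient client =
                   new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {
              SolrInputDocument doc = new SolrInputDocument();
              doc.addField("id", "file-1");
              doc.addField("content", handler.toString());
              client.add(doc);
              client.commit();
          }
      }
  }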
I forgot to mention that I am using Solr 6.5.1 and I am indexing XML files. My
Solr server is running on a Linux OS.
~~~
William Kevin Miller
ECS Federal, Inc.
USPS/MTSC
(405) 573-2158
From: Miller, William K - Norman, OK - Contractor
[mailto:william.k.mil...@u
I am trying to return results when using a multi-word term. I am using "Paddle
Arm" as my search term (including the quotes). I know that the field that I am
querying against has these words together. If I run the query using Paddle*
Arm* I get the following results, but I want to get only the
In your command, you are missing the "zk" part of the command. Try:
bin/solr zk cp file:local/file/path/to/solr.xml zk:/solr.xml -z localhost:2181
I see this is wrong in the documentation; I will fix it for the next
release of the Ref Guide.
I'm not sure about how to refer to it - I don't think
Please consider this issue closed, as we are looking at moving our XML files to
the Solr server for now.
~~~
William Kevin Miller
ECS Federal, Inc.
USPS/MTSC
(405) 573-2158
-Original Message-
From: Miller, William K - Norman, OK - Contractor
Sent: Monday, June 12,
Yes, the hashJoin will read the entire "hashed" query into memory; the
documentation explains this.
In general, the streaming joins were designed for OLAP-type workloads.
Unless you have a large cluster powering streaming joins, you are going to
have problems with high-QPS workloads.
Joel Bernstein
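For illustration, the general shape of a hashJoin expression (collections, fields, and queries are placeholders):

  hashJoin(
    search(orders, q="*:*", fl="orderId,custId", sort="custId asc", qt="/export"),
    hashed=search(customers, q="*:*", fl="custId,name", sort="custId asc", qt="/export"),
    on="custId"
  )

The stream passed as hashed= is the one read fully into memory, so it should be the smaller side of the join.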
I recommend first understanding the Solr API, and the parameters you need to
add the capabilities with just the /select API. Once you are familiar with
that, you can then learn what’s needed and apply that to the HTML and
JavaScript. While the /browse UI is fairly straightforward, there’s a
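A bare-bones /select query to start experimenting with (core and field names are placeholders):

  http://localhost:8983/solr/mycore/select?q=title:ipod&fl=id,title&rows=10&wt=json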
I did some source-code reading, and it looks like when lastProcessedVersion == -1,
it will do nothing:
https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/CdcrUpdateLogSynchronizer.java
// if we received -1, it means that the log reader on the leade
Hello,
My name is Clare Lee and I'm working with Apache Solr 6.6.0 (Solritas) right
now, and I'm not able to do something I want to do. Could you help me with
this?
I want to be able to search Solr with multiple fields. With the basic
configurations (I'm using the core techproducts and just changing t
On 7/10/2017 2:57 AM, Antonio De Miguel wrote:
> I keep digging into this problem... the high write rates continue.
>
> Searching the logs I see this:
>
> 2017-07-10 08:46:18.888 INFO (commitScheduler-11-thread-1) [c:ads s:shard2
> r:core_node47 x:ads_shard2_replica3] o.a.s.u.LoggingInfoStrea
We have been experiencing this same issue for months now, with version 6.2. No
solution to date.
-Original Message-
From: Xie, Sean [mailto:sean@finra.org]
Sent: Sunday, July 09, 2017 9:41 PM
To: solr-user@lucene.apache.org
Subject: [EXTERNAL] Re: CDCR - how to deal with the transact
Hi,
Is there any resource (article or book) that sheds light on the Solr
design and architecture (interaction between client and server modules in Solr,
interaction between Solr modules (Java source files))?
Thanks,
Ranganath B. N.
I did use this class via a batch file (on a Windows server), but it still does
not remove anything. I set the number of snapshots to keep to 3, but I have more
in my folder.
-Original Message-
From: Xie, Sean [mailto:sean@finra.org]
Sent: Sunday, July 9, 2017 7:33 PM
To: solr-user@lucen
Hi!
I keep digging into this problem... the high write rates continue.
Searching the logs I see this:
2017-07-10 08:46:18.888 INFO (commitScheduler-11-thread-1) [c:ads s:shard2
r:core_node47 x:ads_shard2_replica3] o.a.s.u.LoggingInfoStream
[DWPT][commitScheduler-11-thread-1]: flushed: segm
I think Thaer’s answer clarifies how they do it.
So at the time they assemble the full Solr doc to index, there may be a new
field name not known in advance,
but to my understanding the RDF source contains information on the type
(otherwise they could not do the mapping to a dynamic field either) and so a
Hi Rick,
yes, the RDF structure has subject, predicate and object. The object data
type is not only text; it can be integer or double as well, or other data
types. The structure of our Solr document doesn't only contain these three
fields. We compose one document per subject and we use all found ob
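A hedged sketch of what such a document could look like with the default managed-schema dynamic fields (*_s string, *_i int, *_d double; the subject URI and predicate names are made up):

  {
    "id": "http://example.org/resource/42",
    "label_s": "some literal value",
    "population_i": 120000,
    "height_d": 3.5
  }

Each predicate becomes a field name suffixed by the object's data type, so new predicates need no schema change.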