My guess is that it's a documentation gap.
I ran a test turning off CDCR via action=stop while continuously sending
documents to the source cluster. The tlog files kept growing; after each hard
commit, a new tlog file was created and the old files stayed there forever. As
soon as
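For reference, a sketch of the stop call used in a test like this (the host and collection name are placeholders, not from the thread):

  curl "http://source-host:8983/solr/<collection>/cdcr?action=STOP"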
Not sure if I can answer the question; we previously used the manual command to
clean up the logs and scheduled it with a Linux daemon. On Windows there should
be a corresponding tool to do so.
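A minimal sketch of scheduling such a cleanup on Windows with the built-in Task Scheduler (the task name, time, and batch file path are placeholders):

  schtasks /create /sc DAILY /st 02:00 /tn "SolrTlogCleanup" /tr "C:\solr\scripts\tlog-cleanup.bat"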
We currently use Netflix Exhibitor to manage the ZooKeeper instances, and
it works pretty well.
S
Yeah, it just seems weird that you would need to disable the buffer on the
source cluster, though.
The docs say "Replicas do not need to buffer updates, and it is recommended
to disable buffer on the target SolrCloud", which implies the source should
have it enabled.
But the fact that it's working for
Yes, documents are being sent to the target. Monitoring the output from
"action=queues", depending on your settings, you will see the document
replication progress.
On the other hand, if you enable the buffer, lastProcessedVersion always
returns -1. Reading the source code, the CdcrUpdateLogSync
After disabling the buffer, are you still seeing documents being replicated
to the target cluster(s)?
On Mon, Jul 10, 2017 at 1:07 PM, Xie, Sean wrote:
> After several experiments and observations, I finally made it work.
> The key point is that you also have to disable the buffer (action=DISABLEBUFFER) on the source cluster. I don’t
Hello Shawn.
Thank you very much for the comment.
On 24 June 2017 at 16:14, Shawn Heisey wrote:
> On 6/24/2017 2:14 AM, Arcadius Ahouansou wrote:
> > Interpretation 1:
>
> ZooKeeper doesn't *need* an odd number of servers, but there's no
> benefit to an even number. If you have 5 servers, two
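The quorum arithmetic behind that: ZooKeeper needs floor(n/2) + 1 servers up to
keep a quorum, so 5 servers tolerate 2 failures (quorum of 3), while 6 servers
also tolerate only 2 (quorum of 4); the extra even-numbered server adds no
fault tolerance.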
Use SolrJ if you end up developing an indexer in Java to send documents to
Solr. It's been a long time since I have used DIH, but you can give it a try
first; otherwise, as Walter suggested, developing an external indexer is best.
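A minimal SolrJ sketch of that approach, assuming a Solr 6.x solr-solrj dependency; the ZooKeeper address, collection, and field names are placeholders:

  import org.apache.solr.client.solrj.impl.CloudSolrClient;
  import org.apache.solr.common.SolrInputDocument;

  public class SimpleIndexer {
      public static void main(String[] args) throws Exception {
          // Connect via ZooKeeper so SolrJ routes each document to the right shard leader.
          try (CloudSolrClient client = new CloudSolrClient.Builder()
                  .withZkHost("localhost:2181").build()) {
              client.setDefaultCollection("mycollection"); // placeholder collection
              SolrInputDocument doc = new SolrInputDocument();
              doc.addField("id", "doc-1");      // placeholder fields
              doc.addField("title", "Example");
              client.add(doc);
              client.commit();                  // for bulk loads, prefer autoCommit
          }
      }
  }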
On Sun, Jul 9, 2017 at 6:46 PM, Walter Underwood
wrote:
> 4. Write an external progra
Thank you guys for your advice!
I would rather take as much advantage as possible of the existing
handlers/processors.
I just realised that nested entities in DIH are extremely slow; I fixed that
with a view on the DB (which does a join between the two tables).
The other thing I have to do is chain th
How are you specifying multiple fields? Use the qf parameter to specify
multiple fields, e.g.:
http://localhost:8983/solr/techproducts/select?indent=on&q=Samsung%20Maxtor%20hard&wt=json&defType=edismax&qf=name%20manu&debugQuery=on&mm=1
On Mon, Jul 10, 2017 at 4:51 PM, Michael Joyner wrote:
> Hello a
Hello all,
How does setting mm = 1 for edismax impact multi-field searches?
We set mm to 1 and get zero results back when specifying multiple fields
to search across.
Is there a way to set mm = 1 for each field, but to OR the individual
field searches together?
-Mike/NewsRx
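One possible approach (a hedged sketch, not from this thread): issue a separate edismax subquery per field through the lucene parser's _query_ hook and OR them together, with $uq dereferencing the user's terms:

  q=_query_:"{!edismax qf=name mm=1 v=$uq}" OR _query_:"{!edismax qf=manu mm=1 v=$uq}"&uq=Samsung Maxtor hard

Each subquery then applies mm=1 within a single field, and the outer OR combines the per-field matches.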
I did this at Netflix with Solr 1.3, read stuff out of various databases and
sent it all to Solr. I’m not sure DIH even existed then.
At Chegg, we have a slightly more elaborate system because we have so many
collections and data sources. Each content owner writes an “extractor” that
makes a JSON
After several experiments and observations, I finally made it work.
The key point is that you also have to disable the buffer (action=DISABLEBUFFER)
on the source cluster. I don’t know why the wiki doesn’t mention it, but I
figured this out through the source code.
Once the buffer is disabled on the source cluster, the lastProcessedVersion
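For reference, a sketch of the disable-buffer call against the source cluster (host and collection are placeholders):

  curl "http://source-host:8983/solr/<collection>/cdcr?action=DISABLEBUFFER"

The buffer state can then be verified with action=STATUS.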
Well, one issue is that Paddle* Arm* has an implicit OR between the terms. Try
+Paddle* +Arm*
That'll reduce the documents found, although it would find "Paddle
robotic armature" (no such thing, just sayin').
Although another possibility is that you're really sending
some_field:Paddle* Arm*
w
>4. Write an external program that fetches the file, fetches the metadata,
>combines them, and send them to Solr.
I've done this with some custom crawls. Thanks to Erick Erickson, this is a
snap:
https://lucidworks.com/2012/02/14/indexing-with-solrj/
With the caveat that Tika should really be i
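The pattern from that article, as a hedged sketch (assumes tika-parsers and solr-solrj on the classpath; the URL, core, path, and field names are placeholders):

  import java.io.FileInputStream;
  import java.io.InputStream;
  import org.apache.solr.client.solrj.impl.HttpSolrClient;
  import org.apache.solr.common.SolrInputDocument;
  import org.apache.tika.metadata.Metadata;
  import org.apache.tika.parser.AutoDetectParser;
  import org.apache.tika.sax.BodyContentHandler;

  public class TikaIndexer {
      public static void main(String[] args) throws Exception {
          // Run Tika client-side so a parser crash cannot take down the Solr JVM.
          BodyContentHandler handler = new BodyContentHandler(-1); // -1 = no size limit
          Metadata metadata = new Metadata();
          try (InputStream in = new FileInputStream("/path/to/file.pdf")) {
              new AutoDetectParser().parse(in, handler, metadata);
          }
          try (HttpSolrClient client =
                   new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {
              SolrInputDocument doc = new SolrInputDocument();
              doc.addField("id", "file-1");
              doc.addField("content", handler.toString());
              client.add(doc);
              client.commit();
          }
      }
  }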
I forgot to mention that I am using Solr 6.5.1 and I am indexing XML files. My
Solr server is running on a Linux OS.
~~~
William Kevin Miller
ECS Federal, Inc.
USPS/MTSC
(405) 573-2158
From: Miller, William K - Norman, OK - Contractor
[mailto:william.k.mil...@u
I am trying to return results when using a multi-word term. I am using "Paddle
Arm" as my search term (including the quotes). I know that the field that I am
querying against has these words together. If I run the query using Paddle*
Arm* I get the following results, but I want to get only the
In your command, you are missing the "zk" part of the command. Try:
bin/solr zk cp file:local/file/path/to/solr.xml zk:/solr.xml -z localhost:2181
I see this is wrong in the documentation; I will fix it for the next
release of the Ref Guide.
I'm not sure about how to refer to it - I don't think
Please consider this issue closed, as we are looking at moving our XML files to
the Solr server for now.
~~~
William Kevin Miller
ECS Federal, Inc.
USPS/MTSC
(405) 573-2158
-Original Message-
From: Miller, William K - Norman, OK - Contractor
Sent: Monday, June 12,
Yes, the hashJoin will read the entire "hashed" query into memory; the
documentation explains this.
In general, the streaming joins were designed for OLAP-type workloads.
Unless you have a large cluster powering streaming joins, you are going to
have problems with high-QPS workloads.
Joel Bernstein
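For illustration, the general shape of a hashJoin expression (collections, fields, and queries are placeholders):

  hashJoin(
    search(orders, q="*:*", fl="orderId,custId", sort="custId asc", qt="/export"),
    hashed=search(customers, q="*:*", fl="custId,name", sort="custId asc", qt="/export"),
    on="custId"
  )

The stream passed as hashed= is the one read fully into memory, so it should be the smaller side of the join.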
I recommend first understanding the Solr API, and the parameters you need to
add the capabilities with just the /select API. Once you are familiar with
that, you can then learn what’s needed and apply that to the HTML and
JavaScript. While the /browse UI is fairly straightforward, there’s a
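A bare-bones /select query to start experimenting with (core and field names are placeholders):

  http://localhost:8983/solr/mycore/select?q=title:ipod&fl=id,title&rows=10&wt=json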
I did some source-code reading, and it looks like when lastProcessedVersion == -1,
it will do nothing:
https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/CdcrUpdateLogSynchronizer.java
// if we received -1, it means that the log reader on the leade
Hello,
My name is Clare Lee and I'm working with Apache Solr 6.6.0 (Solritas) right
now, and I'm not able to do something I want to do. Could you help me with
this?
I want to be able to search Solr with multiple fields. With the basic
configurations (I'm using the core techproducts and just changing t
On 7/10/2017 2:57 AM, Antonio De Miguel wrote:
> I keep digging into this problem... the high write rates continue.
>
> Searching the logs I see this:
>
> 2017-07-10 08:46:18.888 INFO (commitScheduler-11-thread-1) [c:ads s:shard2
> r:core_node47 x:ads_shard2_replica3] o.a.s.u.LoggingInfoStrea
We have been experiencing this same issue for months now, with version 6.2. No
solution to date.
-Original Message-
From: Xie, Sean [mailto:sean@finra.org]
Sent: Sunday, July 09, 2017 9:41 PM
To: solr-user@lucene.apache.org
Subject: [EXTERNAL] Re: CDCR - how to deal with the transact
Hi,
Is there any resource (article or book) that sheds light on the Solr
design and architecture (interaction between client and server modules in Solr,
interaction between Solr modules (Java source files))?
Thanks,
Ranganath B. N.
I did use this class via a batch file (on a Windows server), but it still does
not remove anything. I set the number of snapshots to keep to 3, but I have more
in my folder.
-Original Message-
From: Xie, Sean [mailto:sean@finra.org]
Sent: Sunday, July 9, 2017 7:33 PM
To: solr-user@lucen
Hi!
I keep digging into this problem... the high write rates continue.
Searching the logs I see this:
2017-07-10 08:46:18.888 INFO (commitScheduler-11-thread-1) [c:ads s:shard2
r:core_node47 x:ads_shard2_replica3] o.a.s.u.LoggingInfoStream
[DWPT][commitScheduler-11-thread-1]: flushed: segm
I think Thaer’s answer clarifies how they do it.
So at the time they assemble the full Solr doc to index, there may be a new
field name not known in advance,
but to my understanding the RDF source contains information on the type
(otherwise they could not do the mapping to a dynamic field either) and so a
Hi Rick,
yes, the RDF structure has subject, predicate and object. The object data
type is not only text; it can be integer or double as well, or other data
types. The structure of our Solr document doesn't only contain these three
fields. We compose one document per subject and we use all found ob
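A hedged sketch of what such a document could look like with the default managed-schema dynamic fields (*_s string, *_i int, *_d double; the subject URI and predicate names are made up):

  {
    "id": "http://example.org/resource/42",
    "label_s": "some literal value",
    "population_i": 120000,
    "height_d": 3.5
  }

Each predicate becomes a field name suffixed by the object's data type, so new predicates need no schema change.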