That's OK if I'm using it locally, but I'm doing it in production, based on
the page below:
https://cwiki.apache.org/confluence/display/solr/Taking+Solr+to+Production
On Thu, Feb 11, 2016 at 12:58 PM, Binoy Dalal
wrote:
> Why don't you directly run solr from the script provided in {SOLR_DIST}\
Thanks Erick. How do people handle this scenario? Right now the only option
I can think of is to replay the entire batch by doing an add for every single
doc, but then I will get errors for all the docs which were already added from
the batch.
On Tue, Feb 9, 2016 at 10:57 PM, Erick Erickson
wrote:
> This h
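A minimal sketch of that fallback, assuming SolrJ 5.x and an "id" field (both assumptions on my part): send the batch once, and only when it fails re-add the docs one at a time so just the offending ones are skipped and logged.

import java.util.List;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BatchWithFallback {
    // Send the whole batch first; if anything in it fails, fall back to adding
    // the docs individually so only the bad ones are skipped and reported.
    static void indexBatch(SolrClient client, List<SolrInputDocument> batch) {
        try {
            client.add(batch);
        } catch (Exception batchFailure) {
            for (SolrInputDocument doc : batch) {
                try {
                    // Re-adding a doc that was already indexed normally just
                    // overwrites it (unless optimistic concurrency is in use).
                    client.add(doc);
                } catch (Exception docFailure) {
                    System.err.println("Skipping doc " + doc.getFieldValue("id")
                            + ": " + docFailure.getMessage());
                }
            }
        }
    }
}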
Why don't you directly run solr from the script provided in {SOLR_DIST}\bin
./solr start -p 8984
On Thu, 11 Feb 2016, 12:56 Jeyaprakash Singarayar
wrote:
> Hi,
>
> I'm trying to install solr 5.4.1 on CentOS. I know that while installing
> Solr as a service in the Linux we can pass -p to shift t
Hi,
I'm trying to install Solr 5.4.1 on CentOS. I know that while installing
Solr as a service on Linux we can pass -p to make the
app listen on that port:
./install_solr_service.sh solr-5.4.1.tgz -p 8984 -f
but it still shows as hosted on 8983 and not on 8984. Any idea?
Waiting u
Hi,
I have migrated to Solr 5.2 and the size of the logs is high.
Can anyone help me out with how to control this?
@Jack
Currently we have around 5,500,000 docs.
It's not about load on one node; we have load on different nodes at different
times, as our traffic is huge: around 60k users at any given point in time.
We want the hits on the Solr servers to be distributed, so we are planning to
move to SolrCloud as it would
Hi,
What if the master node fails? What should our failover strategy be?
On Wed, Feb 10, 2016 at 9:12 PM, Jack Krupansky
wrote:
> What exactly is your motivation? I mean, the primary benefit of SolrCloud
> is better support for sharding, and you have only a single shard. If you
> have no need for s
How do I delete the overseer queues in ZooKeeper?
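A sketch of one way to do that programmatically with SolrJ's SolrZkClient (my own illustration, not an official recipe; the ZooKeeper connect string is a placeholder, and clearing the queue on a live cluster can discard pending collection work):

import java.util.List;
import org.apache.solr.common.cloud.SolrZkClient;

public class ClearOverseerQueue {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper ensemble address; add your chroot if you use one.
        SolrZkClient zkClient = new SolrZkClient("zk1:2181,zk2:2181,zk3:2181", 30000);
        try {
            // Delete every queued entry under /overseer/queue.
            List<String> entries = zkClient.getChildren("/overseer/queue", null, true);
            for (String entry : entries) {
                zkClient.delete("/overseer/queue/" + entry, -1, true);
            }
        } finally {
            zkClient.close();
        }
    }
}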
Thanks everyone for your suggestions.
Based on them I am planning to have one doc per event, with the sessionId
common across them. So in this case, hopefully indexing each doc as and when
it comes would be okay? Or do we still need to batch before indexing to Solr?
Also, with 4M sessions a day with about 6000 docs (events
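For illustration, a minimal batching sketch assuming SolrJ 5.x; the field names, batch size and commitWithin value are made-up placeholders to tune against your own event rate:

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.common.SolrInputDocument;

public class EventIndexer {
    private static final int BATCH_SIZE = 1000;        // illustrative value
    private static final int COMMIT_WITHIN_MS = 30000; // let Solr schedule the commit

    private final SolrClient client;
    private final List<SolrInputDocument> buffer = new ArrayList<>();

    EventIndexer(SolrClient client) {
        this.client = client;
    }

    // Called once per incoming event; only talks to Solr when a full batch is ready.
    void onEvent(String sessionId, String eventType, long timestamp) throws Exception {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("sessionId", sessionId);
        doc.addField("eventType", eventType);
        doc.addField("timestamp", timestamp);
        buffer.add(doc);
        if (buffer.size() >= BATCH_SIZE) {
            flush();
        }
    }

    void flush() throws Exception {
        if (!buffer.isEmpty()) {
            client.add(buffer, COMMIT_WITHIN_MS);
            buffer.clear();
        }
    }
}

With commitWithin, the indexing client does not need to issue explicit commits.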
Thanks everyone for your suggestions.
Based on them I am planning to have a doc per event.
On Wed, Feb 10, 2016 at 3:38 AM, Emir Arnautovic <
emir.arnauto...@sematext.com> wrote:
> Hi Mark,
> Appending session actions just to be able to return more than one session
> without retrieving large numb
You do not tell us much about how Solr is set up. I found your stackoverflow
question too at
http://stackoverflow.com/questions/35220443/tesseract-command-line-ocr-engine-has-stopped-working
with a screenshot.
That suggests that you have set up Tika with OCR for images, and emails with
images are
Is it possible for the Data Import Handler to bring in a maximum number of
records depending on available resources? If so, how should it be
configured?
Thanks,
Timothy's points are absolutely spot-on. In production scenarios, if
you use the simple
"run Tika in a SolrJ program" approach you _must_ abort the program on
OOM errors
and the like and figure out what's going on with the offending
document(s). Or record the
name somewhere and skip it next time '
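A stripped-down sketch of that "run Tika in a SolrJ program" pattern, assuming the Tika facade class and SolrJ 5.x (the core URL and field names are placeholders); a failing document is logged and skipped, while an OutOfMemoryError is not caught and would still abort the run, as advised:

import java.io.File;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;
import org.apache.tika.Tika;

public class TikaToSolr {
    public static void main(String[] args) throws Exception {
        Tika tika = new Tika();
        // Placeholder core/collection URL.
        try (SolrClient solr = new HttpSolrClient("http://localhost:8983/solr/docs")) {
            for (String path : args) {
                File file = new File(path);
                try {
                    // Extract the text client-side instead of inside Solr.
                    String text = tika.parseToString(file);
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", file.getAbsolutePath());
                    doc.addField("text", text);
                    solr.add(doc);
                } catch (Exception e) {
                    // Record the offender and move on to the next document.
                    System.err.println("Skipping " + path + ": " + e);
                }
            }
            solr.commit();
        }
    }
}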
We're trying to fine-tune our query and ingestion performance and would like to
get more metrics out of Solr around this. We are capturing the standard logs
as well as the jetty request logs. The standard logs get us QTime, which is
not a good indication of how long the actual request took to
Well, what do we have here. I just saw a different docCount in the same result
set for the same field. These two are the explains for the top two documents
in the same result set:
1: 70.77082 = sum of:
70.77082 = max plus 0.65 times others of:
70.77082 = weight(title_nl:contactformulier i
Generally, creating a collection may also include uploading a zookeeper
configuration:
import org.apache.solr.common.cloud.SolrZkClient;
import org.apache.solr.common.cloud.ZkConfigManager;
import org.apache.solr.common.cloud.ZkStateReader;
/* ... much later ... */
SolrZkClient zkClient = s
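Presumably the truncated line above goes on to open the client and push a local config directory; a completed sketch under that assumption (the connect string, timeout, local path and config name are placeholders) could look like this:

import java.nio.file.Paths;
import org.apache.solr.common.cloud.SolrZkClient;
import org.apache.solr.common.cloud.ZkConfigManager;

public class UploadConfig {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper connect string and client timeout.
        SolrZkClient zkClient = new SolrZkClient("zk1:2181,zk2:2181/solr", 30000);
        try {
            ZkConfigManager configManager = new ZkConfigManager(zkClient);
            // Push a local conf/ directory to ZooKeeper under the given config name.
            configManager.uploadConfigDir(Paths.get("/path/to/myconf/conf"), "myconf");
        } finally {
            zkClient.close();
        }
    }
}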
Hi - I've noticed ExactStatsCache is not very exact on consecutive calls; see
the following explains for the number one result:
70.76961 = sum of:
70.76961 = max plus 0.65 times others of:
70.76961 = weight(title_nl:contactformulier in 210879) [], result of:
70.76961 = score(doc=21087
Is this some kind of typo in your slave configuration? 'cause it's
kinda weird. The error
mentioning collection1 indicates I think that the masterUrl is not
parseable (and somehow
doesn't throw a parsing error on startup) and the old default was "collection1".
This URL should point to a single cor
On 2/10/2016 6:55 AM, vidya wrote:
> I want to connect to a SolrCloud server from a Java program using the
> zookeeperHost variable. I know that data can be indexed and searched from a
> collection using a Java program, but can I create a collection
> initially from a Java program?
Yes. Use the Clo
Since you're using SolrJ anyway, just use
CollectionAdminRequest. You can see
examples of its use in the test cases; take a look
at CollectionsApiSolrJTests.
Best,
Erick
On Wed, Feb 10, 2016 at 5:55 AM, vidya wrote:
> Hi
>
> I want to connect to solrCloud server from java program using
>
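For reference, a sketch of what that could look like with SolrJ 5.x; the ZooKeeper connect string (the "zookeeperHost" from the question), collection name, config name and shard/replica counts are all placeholders:

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;
import org.apache.solr.client.solrj.response.CollectionAdminResponse;

public class CreateCollection {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181/solr")) {
            CollectionAdminRequest.Create create = new CollectionAdminRequest.Create();
            create.setCollectionName("mycollection");
            create.setConfigName("myconf");   // a config set already uploaded to ZooKeeper
            create.setNumShards(1);
            create.setReplicationFactor(2);
            CollectionAdminResponse response = create.process(client);
            System.out.println("Created? " + response.isSuccess());
        }
    }
}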
All,
I have solr 4.7 installed in a Windows 7 environment. My solrconfig.xml on the
master is:
${master.replication.enabled:true}
commit
startup
optimize
optimize
commit
On 2/10/2016 8:02 AM, tedsolr wrote:
> I have my head wrapped around sending index requests in parallel, but in a
> later post you mentioned how you separately track the most recent update and
> are able to sync from that point if needed. That I don't get. Is it an index
> version you are tracking?
What exactly is your motivation? I mean, the primary benefit of SolrCloud
is better support for sharding, and you have only a single shard. If you
have no need for sharding and your master-slave replicated Solr has been
working fine, then stick with it. If only one machine is having a load
problem,
Hi,
I need to boost documents at runtime according to a set of roles and
related ids. For instance I would have the fields:
ceo:1234-abcd-5678-poiu
tl:-abcd-5678-abc
and a set of boosts to apply at runtime, for instance
ceo = 10
tl = 5
I don't want to do any complex operation with the weights
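One simple way to express that, sketched here as an assumption rather than a recommendation, is additive boost queries at query time, e.g. with the edismax parser via SolrJ; the core URL is a placeholder and the field values are the ones from the question:

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class RoleBoostQuery {
    public static void main(String[] args) throws Exception {
        try (SolrClient solr = new HttpSolrClient("http://localhost:8983/solr/mycore")) {
            SolrQuery q = new SolrQuery("the user query");
            q.set("defType", "edismax");
            // One boost query per role; values are quoted because they contain '-'.
            q.add("bq", "ceo:\"1234-abcd-5678-poiu\"^10");
            q.add("bq", "tl:\"-abcd-5678-abc\"^5");
            QueryResponse rsp = solr.query(q);
            System.out.println(rsp.getResults().getNumFound() + " hits");
        }
    }
}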
Arcadius,
Thanks for sharing your multi data center design. My requirements are
different (hot site - warm site) but nevertheless your posts are very
interesting. It helps to know that in many cases someone else has already
cut their teeth on the problem you're trying to solve.
Ted
Cross data center replication sounds like a great feature. I read Yonik's
post on it. I'll keep my ear to the ground. In the meantime it's good to
know there's nothing built in to handle this, so it will involve some design
effort.
I have my head wrapped around sending index requests in parallel,
Hi
I want to connect to a SolrCloud server from a Java program using the
zookeeperHost variable. I know that data can be indexed and searched from a
collection using a Java program, but can I create a collection
initially from a Java program?
My problem is that I cannot access the Solr web page, I'm
Ha. Spoke too soon about this thread not getting swamped.
Will add the dropwizard-tika-server to our wiki page. Thank you for the link!
As a side note, I'll submit a pull request to update the AbstractTikaResource
to avoid a potential NPE if the mime type can't be parsed...we just fixed this
I completely agree on the impulse, and for the vast majority of the time
(regular catchable exceptions), that'll work. And, by vast majority, aside
from oom on very large files, we aren't seeing these problems any more in our 3
million doc corpus (y, I know, small by today's standards) from gov
Hi Tom - thanks. But judging from the article and SOLR-6348, faceting stats over
ranges is not yet supported. More specifically, SOLR-6352 is what we would need.
[1]: https://issues.apache.org/jira/browse/SOLR-6348
[2]: https://issues.apache.org/jira/browse/SOLR-6352
Thanks anyway, at least we fo
On Wed, Feb 10, 2016 at 10:21 AM, Markus Jelsma
wrote:
> Hi - if we assume the following simple documents:
>
>   doc 1: 2015-01-01T00:00:00Z, value 2
>   doc 2: 2015-01-01T00:00:00Z, value 4
>   doc 3: 2015-01-02T00:00:00Z, value 3
>   doc 4: 2015-01-02T00:00:00Z, value 7
>
> Can i get a daily average for the fie
Hi all. I have an MLT query,
and I want to categorise the query results based on a particular field,
so I want to use Solr's grouping feature,
but Solr MLT does not support grouping.
I used group=true,
group.field=field1,
group.limit=3,
but I didn't get grouped results.
** I tested the Solr grouping feature with the select handl
Hi - if we assume the following simple documents:
  doc 1: 2015-01-01T00:00:00Z, value 2
  doc 2: 2015-01-01T00:00:00Z, value 4
  doc 3: 2015-01-02T00:00:00Z, value 3
  doc 4: 2015-01-02T00:00:00Z, value 7
Can I get a daily average for the field 'value' by day? E.g.
  2015-01-01: 3.0
  2015-01-02: 5.0
Reading the documentation, I don't think I can, or I a
What is the size of your index, hardware specs, average query load, rate of
indexing?
On Wed, 10 Feb 2016, 14:14 kshitij tyagi
wrote:
> Hi,
>
> We are currently using solr 5.2 and I need to move on solr cloud
> architecture.
>
> As of now we are using 5 machines :
>
> 1. I am using 1 master wher
On 09/02/2016 22:49, Alexandre Rafalovitch wrote:
Solr uses Tika directly. And not in the most efficient way. It is
there mostly for convenience rather than performance.
So, for performance, the Solr recommendation is to run Tika
separately and only send Solr the processed documents.
Absolute
Hi,
We are currently using Solr 5.2 and I need to move to a SolrCloud
architecture.
As of now we are using 5 machines:
1. I am using 1 master where we are indexing our data.
2. I replicate my data onto the other machines.
One machine or another keeps showing high load, so I am planning to
move to s
Hi Mark,
Appending session actions just to be able to return more than one
session without retrieving a large number of results is not a good tradeoff.
Like Upayavira suggested, you should consider storing one action per doc
and aggregating at read time, or pushing to Solr once the session ends and
aggregat