I side with Toke on this. Enterprise bare-metal machines often have
hundreds of gigabytes of memory and tens of CPU cores -- you would have to
fit multiple instances on a machine to make use of them while circumventing
huge heaps.
If this is not a common case now, it could well be in the future, the way
hardware is going.
I should add to Erick's point that the test framework allows you to test
HTTP APIs through an embedded Jetty instance, so you should be able to do
anything you do with a remote Solr instance from code.
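For illustration, a minimal sketch of that pattern; the solr home path, core
name and the exact JettySolrRunner/HttpSolrClient signatures are assumptions
and vary a bit between 5.x versions:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.embedded.JettySolrRunner;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class EmbeddedJettySketch {
  public static void main(String[] args) throws Exception {
    // Start an embedded Jetty serving a Solr home directory (path is hypothetical);
    // port 0 asks Jetty to pick a free port.
    JettySolrRunner jetty = new JettySolrRunner("/path/to/solr/home", "/solr", 0);
    jetty.start();

    // Talk to it over HTTP exactly as you would talk to a remote Solr instance.
    HttpSolrClient client = new HttpSolrClient(
        "http://localhost:" + jetty.getLocalPort() + "/solr/collection1");
    System.out.println(client.query(new SolrQuery("*:*")).getResults().getNumFound());

    client.close();
    jetty.stop();
  }
}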
On 12 Jan 2016 18:24, "Erick Erickson" wrote:
> And a neater way to debug stuff rather th
M is the number of ids you want for each group, specified by group.limit.
It's unrelated to the number of rows requested.
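As an illustration of how N and M map onto request parameters in SolrJ (the
grouping field and the limits below are made up):

import org.apache.solr.client.solrj.SolrQuery;

public class GroupingParamsSketch {
  public static SolrQuery topGroupsQuery() {
    // rows controls N, the number of top groups returned;
    // group.limit controls M, the number of ids returned within each group.
    SolrQuery q = new SolrQuery("*:*");
    q.set("group", true);
    q.set("group.field", "category"); // hypothetical grouping field
    q.setRows(10);                    // N = 10 groups
    q.set("group.limit", 5);          // M = 5 ids per group
    return q;
  }
}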
On 21 Aug 2015 19:54, "SolrUser1543" wrote:
> Ramkumar R. Aiyengar wrote
> > Grouping does need 3 phases.. The phases are:
> >
> >
>
Grouping does need three phases. The phases are:
(1) Each shard is asked for the top N groups (instead of ids), with the
sort value. The federator then sorts the groups from all shards and chooses
the top N groups.
(2) For the N groups, each shard is asked for the top M ids (M is
configurable per request)
Custom authentication support was added in 5.x, and the imminent (in the
next few days) 5.3 release has a lot of features in this regard, including
a basic authentication module; I would suggest upgrading to it. 5.x versions
(including 5.3) do support Java 7, so I don't see an issue here?
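Roughly, the 5.3 basic authentication setup is driven by a security.json
uploaded to ZooKeeper; a sketch along these lines (the user name, role and
the hash placeholder are purely illustrative):

{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "credentials": {
      "solr": "<base64 sha256 hash> <base64 salt>"
    }
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "user-role": { "solr": "admin" },
    "permissions": [ { "name": "security-edit", "role": "admin" } ]
  }
}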
On 20 Aug 201
Please open a JIRA with details of what the issues are; we should try to
support this.
On 18 Jun 2015 15:07, "Bence Vass" wrote:
> Hello,
>
> Is there any documentation on how to start Solr 5.2.1 on Solaris (Solaris
> 10)? The script (solr start) doesn't work out of the box, is anyone running
>
I started with an empty Solr instance and Firefox 38 on Linux. This is the
trunk source.
There's a 'No cores available. Go and create one' button in both the old
and the new UI. In the old UI, clicking it goes to the core admin page and
pops open the Add Core dialog. The new UI only goes to
This shouldn't happen, but if it does, there's currently no good way for
Solr to fix it automatically. There are a couple of issues being worked on
to do that. But until then, your best bet is to restart the node
which you expect to be the leader (you can look at ZK to see who is at the
he
...ddress the issue. Those directories should be removed over time. At times
there will have to be a couple around at the same time and others may take
a while to clean up.
- Mark
On Tue, Apr 28, 2015 at 3:27 AM Ramkumar R. Aiyengar <andyetitmo...@gmail.com> wrote:
> SolrCloud does need
SolrCloud does need up to twice as much disk space as your usual index size
during replication. Amongst other things, this ensures you have a full copy
of the index at any point. There's no way around this; I would suggest you
provision the additional disk space needed.
On 20 Apr 2015 23:21,
It shouldn't be any different without the patch, or with the patch and
(100,10) as parameters. That's why I wanted you to check with 100,10. If
you see the same issue with that, then the patch is probably not the issue;
maybe the problem is with the patched build in general.
On 30 Mar 2015 13:01, "fores
I doubt this has anything to do with the patch. Do you observe the same
behaviour if you reduce the config values to the defaults (100, 10)?
On 30 Mar 2015 09:51, "forest_soup" wrote:
> https://issues.apache.org/jira/browse/SOLR-6359
>
> I also posted the questions to the JIRA ticket.
>
> We
Not a direct answer, but Anshum just created this:
https://issues.apache.org/jira/browse/SOLR-7275
On 20 Mar 2015 23:21, "Furkan KAMACI" wrote:
> Is there anyway to use ConcurrentUpdateSolrServer for secured Solr as like
> CloudSolrServer:
>
> HttpClientUtil.setBasicAuth(cloudSolrServer.getLbS
Is your concern that you want to be able to modify source code just on your
machine, or that you can't for some reason install svn?
If it's the former: even if you check out using svn, you can't modify
anything outside your machine, as changes can be checked in only by the
committers of the project.
Yes, and doing so is painful and takes lots of people and hardware
resources to get there for large amounts of data and queries :)
As Erick says, work backwards from 60s and first establish how high the
commit interval can be while still satisfying your use case.
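For example, starting from a 60-second target, the knobs involved look
roughly like this in solrconfig.xml (the values are illustrative, not a
recommendation):

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes to disk but does not open a new searcher -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit: controls how quickly new documents become visible to searches -->
  <autoSoftCommit>
    <maxTime>60000</maxTime>
  </autoSoftCommit>
</updateHandler>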
On 16 Mar 2015 16:04, "Erick Erickson" wrote
Yes, Solr 5.0 uses Jetty 8.
FYI, the upcoming release 5.1 will move to Jetty 9.
Also, just in case it matters -- as noted in the 5.0 release notes, the use
of Jetty is now an implementation detail and we might move away from it in
the future -- so you shouldn't be depending on Solr using Jetty or
The update log replay issue looks like
https://issues.apache.org/jira/browse/SOLR-6583
On 9 Mar 2015 01:41, "Mark Miller" wrote:
> Interesting bug.
>
> First there is the already closed transaction log. That by itself deserves
> a look. I'm not even positive we should be replaying the log we
> re
I don't have formal benchmarks, but we did get significant performance
gains by switching from a RAMDirectory to an MMapDirectory on tmpfs,
especially under parallel queries. Locking seemed to pull down the former.
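A sketch of what that switch looks like in solrconfig.xml, assuming the data
directory sits on a tmpfs mount (the mount path is hypothetical):

<!-- Memory-map index files instead of holding them on the Java heap -->
<directoryFactory name="DirectoryFactory" class="solr.MMapDirectoryFactory"/>

<!-- Point the data directory at a tmpfs mount so the mapped files live in RAM -->
<dataDir>/mnt/tmpfs/solr/data</dataDir>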
On 23 Jan 2015 06:35, "deniz" wrote:
> Would it boost any performance in case the
https://issues.apache.org/jira/browse/SOLR-6359 has a patch which allows
this to be configured; it has not gone in yet.
Note that the current design of the UpdateLog makes it less efficient if
the number is bumped up too much, but it is certainly worth experimenting
with.
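With that patch applied, the configuration would look something like the
following; the parameter names follow the patch attached to the issue and
could still change before it is committed (the defaults it replaces are 100
records and 10 logs):

<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <!-- Keep more records per log so peer sync can catch up without a full replication -->
  <int name="numRecordsToKeep">500</int>
  <int name="maxNumLogsToKeep">20</int>
</updateLog>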
On 22 Jan 2015 02:47,
That's correct. Even though it should still be possible to embed Jetty,
that could change in the future, which is why support for pluggable
containers is being taken away.
If you need to deal with the index at a lower level, you can always use
Lucene as a library instead of Solr.
But I
Versions 4.10.3 and beyond already use 'server' rather than 'example',
which is still referenced in the script purely for back-compat. A major
release, 5.0, is coming soon; perhaps the back-compat can be removed for that.
On 6 Jan 2015 09:30, "Dominique Bejean" wrote:
> Hi,
>
> In release 4.10.3, t
As Eric mentions, his change to have a state where indexing happens but
querying doesn't certainly helps in this case.
But these are still boolean send vs. don't-send decisions. In general, it
would be nice to abstract the routing policy so that it is pluggable. You
could then do stuff like have a
Do keep one thing in mind though. If you are already doing the work of
figuring out the right shard leader (through SolrJ or otherwise), using
that location with just the collection name might be suboptimal if there
are multiple shard leaders present in the same instance -- the collection
name just
On 30 Oct 2014 14:49, "Shawn Heisey" wrote:
> In order to see a gain in performance from multiple shards per server,
> the server must have a lot of CPUs and the query rate must be fairly
> low. If the query rate is high, then all the CPUs will be busy just
> handling simultaneous queries, so pu
On 30 Oct 2014 23:46, "Erick Erickson" wrote:
>
> This configuration deals with all
> the replication, NRT processing, self-repair when nodes go up and
> down and all that, but since there's no second trip to get the docs
> from shards your query performance won't be affected.
More or less. Vagu
As far as the second option goes, unless you are using a large amount of
memory and you reach a point where a JVM can't sensibly deal with a GC
load, having multiple JVMs wouldn't buy you much. With a 26GB index, you
probably haven't reached that point. There are also other shared resources
at an i
https://issues.apache.org/jira/plugins/servlet/mobile#issue/LUCENE-2878
provides a Lucene API for what you are trying to do; it's not in yet,
though. There's a fork which has the change:
https://github.com/flaxsearch/lucene-solr-intervals
On 12 Sep 2014 21:24, "Craig Longman" wrote:
> In order to take
On 31 Aug 2014 13:24, "Mark Miller" wrote:
>
> > On Aug 31, 2014, at 4:04 AM, Christoph Schmidt <christoph.schm...@moresophy.de> wrote:
> >
> > we see at least two problems when scaling to large number of collections. I would like to ask the community, if they are known and maybe already addres
ZK has the list of live nodes available as a set of ephemeral nodes. You
can use /zookeeper on Solr or talk to ZK directly to get that list.
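A sketch of reading that list through SolrJ (the ZooKeeper address is a
placeholder; this assumes the 4.x CloudSolrServer API):

import java.util.Set;
import org.apache.solr.client.solrj.impl.CloudSolrServer;

public class LiveNodesSketch {
  public static void main(String[] args) throws Exception {
    // Connect via the same ZooKeeper ensemble the cluster uses (address is hypothetical)
    CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181/solr");
    server.connect();

    // /live_nodes is a set of ephemeral znodes, one per live Solr node
    Set<String> liveNodes = server.getZkStateReader().getClusterState().getLiveNodes();
    System.out.println("Live nodes: " + liveNodes);

    server.shutdown();
  }
}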
On 24 Aug 2014 03:08, "Nathan Neulinger" wrote:
> Is there a way to query the 'live node' state without sending a query to
> every node myself? i.e. to get
(1) sounds a lot like SOLR-6261, which I mention above. There are possibly
other improvements since 4.6.1, as Mark mentions; I would certainly suggest
you test with the latest version with the issue above patched (or use the
current stable branch in svn, branch_4x) to see if that makes a difference.
I didn't realise you could even disable the tlog when running SolrCloud,
but as Anshum says, it's a bad idea. In all probability, even if it worked,
removing transaction logs is likely to make your restarts slower: SolrCloud
would always be forced to do a full recovery because it cannot now use
tlogs to
I agree with Erick that the gain you are looking at might not be worth it,
so do measure and see if there's a difference.
Also, the next release of Solr is set to have some significant improvements
when it comes to CPU usage under heavy indexing load, and we have had at
least one anecdote so far where t
Not an exact answer: OpenGrok uses Lucene, but not Solr.
On 2 Jun 2014 07:48, "Alexandre Rafalovitch" wrote:
> Hello,
>
> Anybody knows of a recent projects that index SVN repos for Solr
> search? With or without UI.
>
> I know of similar efforts for other VCS, but the only thing I found
> for S
I agree with Eric that this is premature unless you can show that it makes
a difference.
Firstly, why are you splitting the data into multiple time tiers (one
recent, and one with all data) and then waiting to merge results from all
of them? Time tiering is useful when you can do the search separately on both
fsets.
>
> Or am I missing something?
>
> Regards,
> Alex
> On 16/04/2014 10:59 pm, "Ramkumar R. Aiyengar"
> wrote:
>
> > Logically if you tokenize and put the results in a multivalued field, you
> > should be able to get all values in sequence?
>
Logically, if you tokenize and put the results in a multivalued field, you
should be able to get all the values back in sequence?
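A sketch of that idea in SolrJ; the field names are made up, and the
multivalued field must be declared stored and multiValued in the schema:

import org.apache.solr.common.SolrInputDocument;

public class TokenFieldSketch {
  public static SolrInputDocument tokenized(String id, String text) {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", id);
    // Split with the same simple rule (whitespace, no filters) and add each
    // token as a separate value of a multivalued, stored field.
    for (String token : text.split("\\s+")) {
      doc.addField("body_tokens", token);
    }
    // Reading the stored field back returns the values in insertion order,
    // which lets you reconstruct the original token sequence.
    return doc;
  }
}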
On 16 Apr 2014 16:51, "Alexandre Rafalovitch" wrote:
> Hello,
>
> If I use very basic tokenizers, e.g. space based and no filters, can I
> reconstruct the text from the tokenize
ant compile / ant -f solr dist / ant test certainly work; I use them with a
git working copy. Are you trying something else?
On 14 Apr 2014 19:36, "Jeff Wartes" wrote:
> I vastly prefer git, but last I checked, (admittedly, some time ago) you
> couldn't build the project from the git clone. Some of t
Start with http://wiki.apache.org/solr/SolrPerformanceProblems. It has a
section on GC tuning and a link to some example settings.
On 16 Feb 2014 21:19, "lboutros" wrote:
> Thanks a lot for your answer.
>
> Is there a web page, on the wiki for instance, where we could find some JVM
> settings or r
If only availability is your concern, you can always keep a list of servers
to which your C++ clients will send requests, and round-robin amongst them.
If one of the servers goes down, you will either not be able to reach it or
will get a 500+ error in the HTTP response, and you can take it out of circulation
Ludovic, recent Solr changes won't do much to prevent ZK session expiry;
you might want to enable GC logging on Solr and ZooKeeper to check for
pauses and tune appropriately.
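For instance, GC logging on a Java 7/8 JVM can be turned on with flags along
these lines (the log path is just a placeholder):

-verbose:gc -Xloggc:/var/log/solr/gc.log \
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime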
The patch below fixes a situation under which the cloud can get to a bad
state during the recovery after session expiry. Th
We have had success with starting up Jolokia in the same servlet container
as Solr, and then using its REST/bulk API to get at JMX from the application
of choice.
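For example, once the Jolokia agent is running in the container, a plain
HTTP GET can pull a JMX attribute (the host, port and MBean below are just
an illustration):

curl http://localhost:8983/jolokia/read/java.lang:type=Memory/HeapMemoryUsage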
On 4 Feb 2014 17:16, "Walter Underwood" wrote:
> I agree that sorting and filtering stats in Solr is not a good idea. There
> is certainly so
There's already an issue for this:
https://issues.apache.org/jira/browse/SOLR-5209. We were once bitten by the
same issue when we were trying to relocate a shard. As Mark mentions, the
idea was to do this in "ZK as truth" mode; the link also references where
that work is being done.
On 31 Jan 2014 23:1
pretty much want to examine my assumptions
> and see if they're correct, perhaps start to trim my requirements
> etc.
>
> FWIW,
> Erick
>
> On Tue, Jul 9, 2013 at 4:07 AM, Ramkumar R. Aiyengar
> wrote:
> >> 5. No more than 32 nodes in your SolrCloud cluster.
> &
> 5. No more than 32 nodes in your SolrCloud cluster.
I hope this isn't too OT, but what tradeoffs is this based on? I would have
thought it easy to hit this number for a big index and high load (hence the
view that both the number of shards and the number of replicas scale
horizontally).
> 6. Don't return
In general, just increasing the cache sizes to make everything fit in
memory might not always give you the best results. Do keep in mind that the
caches are in Java heap memory, and that incurs the penalty of garbage
collection and other housekeeping that Java's memory management might have
to do.
Reasonably rec