Also curious why such a large heap is required... If it's due to field
caches being loaded, I'd highly recommend MMapDirectory (if you're not
using it already) and turning on DocValues for all fields you plan to
sort, facet, or run analytics on.
steve
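A minimal sketch of that recommendation (field and type names are hypothetical, not from this thread): with docValues enabled, sort/facet data lives in column-oriented on-disk structures that get memory-mapped, instead of an un-inverted FieldCache on the heap.

```xml
<!-- Hypothetical schema.xml fragment: docValues="true" moves sort/facet/analytics
     data off-heap (memory-mapped files) instead of the heap-resident FieldCache -->
<field name="price"    type="tlong"  indexed="true" stored="true" docValues="true"/>
<field name="category" type="string" indexed="true" stored="true" docValues="true"/>
```

Note that on 64-bit JVMs Solr's default directory factory already delegates to MMapDirectory, so the bigger win here is usually the docValues change.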
On Wed, Dec 21, 2016 at 9:25 AM Pushkar Raste
wrote:
>
That sounds like some SAN vendor BS if you ask me. Breaking up 300GB into
smaller chunks would only be relevant if they were caching entire files, not
blocks, and I find that hard to believe. Would be interested to know more
about the specifics of the problem as the vendor sees it.
As Shawn said loc
There are two options as I see it..
1. Do something like you describe and create a secondary index, index into
it, then switch... I personally would create a completely separate SolrCloud
alongside my existing one vs a new core in the same cloud, as you might
see some negative impacts on GC caused b
Looking deeper into ZooKeeper-as-truth mode, I was wrong about existing
replicas being recreated once storage is gone.. Seems there is intent to
support that type of behavior based upon existing tickets.. We'll look at
creating a patch for this too..
Steve
On Tue, Jul 5, 2016 at 6:00 PM Tomás Fernández Löb
You shouldn't "need" to move the storage as SolrCloud will replicate all
data to the new node and anything in the transaction log will already be
distributed through the rest of the machines..
One option to keep all your data attached to nodes might be to use Amazon
EFS (pretty new) to store your
The ticket in question is https://issues.apache.org/jira/browse/SOLR-9265
We are working on a patch now... will update when we have a working patch /
tests..
Shawn is correct that when adding a new node to a SolrCloud cluster it will
not automatically add replicas/etc..
The idea behind this patc
>
> Upayavira
>
> On Mon, 4 Jul 2016, at 08:53 PM, Steven Bower wrote:
> > My main issue is having to make any solr collection api calls during a
> > transition.. It makes integrating with orchestration engines way more
> > complex..
> > On Mon, Jul 4, 2016 at 3:40
My main issue is having to make any solr collection api calls during a
transition.. It makes integrating with orchestration engines way more
complex..
On Mon, Jul 4, 2016 at 3:40 PM Upayavira wrote:
> Are you using Solrcloud? With Solrcloud this stuff is easy. You just add
> a new replica for a c
We have been working on some changes that should help with this.. 1st
challenge is having the node name remain static regardless of where the
node runs (right now it uses host and port, so this won't work unless you
are using some sort of tunneled or dynamic networking).. We have a patch we
are wor
commenting so this ends up in Dennis' inbox..
On Tue, Oct 13, 2015 at 7:17 PM Yonik Seeley wrote:
> On Wed, Oct 7, 2015 at 9:42 AM, Ryan Josal wrote:
> > I developed a join transformer plugin that did that (although it didn't
> > flatten the results like that). The one thing that was painful a
Is it possible to run in "cloud" mode with zookeeper managing
collections/state/etc.. but to read all config files (solrconfig, schema,
etc..) from local disk?
Obviously this implies that you'd have to keep them in sync..
My thought here is of running Solr in a docker container, but instead of
ha
> Good question Steve,
>
> You'll have to re-index right off.
>
> ~ David
> p.s. Sorry I didn't reply sooner; I just switched jobs and reconfigured my
> mailing list subscriptions
>
>
>
> Steven Bower wrote
> > If I am only indexing point shapes and I w
If I am only indexing point shapes and I want to change the maxDistErr from
0.09 (1m res) to 0.00045, will this "break" (as in searches stop working),
or will search work but any performance gain won't be seen until all docs
are reindexed? Or will I have to reindex right off?
thanks,
steve
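For context, maxDistErr lives on the spatial field type, and it controls the depth of the prefix tree whose cells are baked into the indexed tokens, which is why changing it can require reindexing. A hypothetical Solr 4.x declaration using the values from this thread (other attributes illustrative):

```xml
<!-- Hypothetical 4.x field type: lowering maxDistErr from 0.09 to 0.00045 deepens
     the prefix tree, so tokens indexed under the old setting no longer line up
     with the new grid until documents are reindexed -->
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
           geo="true" distErrPct="0.025" maxDistErr="0.00045" units="degrees"/>
```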
My problem is that maxDoc() and docCount() both report documents that
have been deleted in their values. Because of merging/etc.. those numbers
can be different per replica (or at least that is what I'm seeing). I need
a value that is consistent across replicas... I see in the comment it makes
I am noticing the maxDocs between replicas is consistently different, and
that it is used in the idf calculation, which causes idf scores for the same
query/doc between replicas to be different. Obviously an optimize can
normalize the maxDocs scores, but that is only temporary.. is there a way
to hav
also put this option in your query
> after the end of the last parenthesis, as in this example from the wiki:
>
> fq=geo:"IsWithin(POLYGON((-10 30, -40 40, -10 -20, 40 20, 0 0, -10 30)))
> distErrPct=0"
>
> ~ David
>
>
> Steven Bower wrote
> > Only poi
>
> It's a fairly different story if you are indexing non-point shapes.
>
> ~ David
>
> From: Steven Bower <smb-apa...@alcyon.net>
> Reply-To: "solr-user@lucene.apache.org"
.8988
42.1284,42.2141
40.0919,47.8482
30.4169,47.5783
26.9892,43.6459
27.2095,41.5676
29.0454,41.2198
On Mon, Mar 10, 2014 at 4:23 PM, Steven Bower wrote:
> Minor edit to the KML to adjust color of polygon
>
>
> On Mon
Minor edit to the KML to adjust color of polygon
On Mon, Mar 10, 2014 at 4:21 PM, Steven Bower wrote:
> I am seeing an "error" when doing a spatial search where a particular point
> is showing up within a polygon, but by all methods I've tried that point is
> not within
I am seeing an "error" when doing a spatial search where a particular point
is showing up within a polygon, but by all methods I've tried that point is
not within the polygon..
First the point is: 41.2299,29.1345 (lat/lon)
The polygon is:
31.2719,32.283
31.2179,32.3681
31.1333,32.3407
30.9356,32.
> >>> I think solr.xml is the correct place for it, and you can then set up
> substitution variables to allow it to be set by environment variables, etc.
> But let's discuss on the JIRA ticket.
> >>>
> >>> Alan Woodward
> >>> www.flax.co.uk
> >>
he place, passed in and recorded in constructors etc,
> > as well as being possibly unique for each core. There's been some talk
> > of sharing a single config object, and there's also talk about using
> > "config sets" that might address some of those concerns, but ne
Are there any plans/tickets to allow for pluggable SolrConf and
CoreLocator? In my use case my solr.xml is totally static, I have a
separate dataDir, and my core.properties are derived from a separate
configuration (living in ZK) but totally outside of SolrCloud..
I'd like to be able to not hav
I was looking at the code for getIndexSize() on the ReplicationHandler to
get at the size of the index on disk. From what I can tell, because this
does directory.listAll() to get all the files in the directory, the size on
disk includes not only what is searchable at the moment but potentially
also
And the close hook will basically only be fired once during shutdown?
On Fri, Nov 15, 2013 at 1:07 PM, Chris Hostetter
wrote:
>
> : So for a given instance of a handler it will only be called once during
> the
> : lifetime of that handler?
>
> correct (unless there is a bug somewhere)
>
> : Also
>>> it should be called only once during hte lifetime of a given plugin,
>>> usually not long after construction -- but it could be called many, many
>>> times in the lifetime of the solr process.
So for a given instance of a handler it will only be called once during the
lifetime of that handler?
ed when the handler is created, either at SolrCore construction
> time (solr startup or core reload) or the first time the handler is
> requested if it's a lazy-loading handler.
>
> Alan Woodward
> www.flax.co.uk
>
>
> On 15 Nov 2013, at 15:40, Steven Bower wrote:
>
Under what circumstances will a handler that implements SolrCoreAware have
its inform() method called?
thanks,
steve
Check out: https://issues.apache.org/jira/browse/SOLR-5302 it supports
median value
On Wed, Jul 3, 2013 at 12:11 PM, William Bell wrote:
> If you are a programmer, you can modify it and attach a patch in Jira...
>
>
>
>
> On Tue, Jun 4, 2013 at 4:25 AM, Marcin Rzewucki
> wrote:
>
> > Hi there,
Check out: https://issues.apache.org/jira/browse/SOLR-5302 can do this
using query facets
On Fri, Jul 12, 2013 at 11:35 AM, Jack Krupansky wrote:
> sum(x, y, z) = x + y + z (sums those specific fields values for the
> current document)
>
> sum(x, y) = x + y (sum of those two specific field value
lues in a field.
>
> See https://cwiki.apache.org/confluence/display/solr/The+Stats+Component
>
> On Fri, Sep 6, 2013 at 12:28 AM, Steven Bower
> wrote:
> > Is there a way to get the count of buckets (ie unique values) for a field
> > facet? the rudimentary approach of co
Is there a way to get the count of buckets (i.e. unique values) for a field
facet? The rudimentary approach of course is to get back all buckets, but
in some cases this is a huge amount of data.
thanks,
steve
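For what it's worth, later Solr releases (5.x and up) expose this directly through the JSON Facet API's unique() aggregation, so you can get the distinct-value count without paging back every bucket; a sketch with a hypothetical field name:

```text
json.facet={ "distinctCategories" : "unique(category)" }
```

On very high-cardinality fields, the related hll() aggregation trades exactness for bounded memory.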
https://issues.apache.org/jira/browse/SOLR-5163
On Thu, Aug 15, 2013 at 6:04 PM, Steven Bower wrote:
> @Yonik that was exactly the issue... I'll file a ticket... there def
> should be an exception thrown for something like this..
>
> It would seem to me that eating any sort
rrors).
>
> -Yonik
> http://lucidworks.com
>
>
> On Thu, Aug 15, 2013 at 5:19 PM, Steven Bower
> wrote:
>
> > I have query like:
> >
> > q=foo AND bar
> > defType=edismax
> > qf=field1
> > qf=field2
> > qf=fi
I have a query like:
q=foo AND bar
defType=edismax
qf=field1
qf=field2
qf=field3
with debug on I see it parsing to this:
(+(DisjunctionMaxQuery((field1:foo | field2:foo | field3:foo))
DisjunctionMaxQuery((field1:and | field2:and | field3:and))
DisjunctionMaxQuery((field1:bar | field2:bar | field3:
Is there an easy way in code or on the command line to lint a Solr config
(or even just a Solr schema)?
Steve
nvert) that are
used for sorting.. I'm suspecting that docvalues will greatly help this
load performance?
thanks,
steve
On Wed, Jul 31, 2013 at 4:32 PM, Steven Bower wrote:
> the list of IDs does change relatively frequently, but this doesn't seem
> to have very much impact on t
> On Wed, Jul 31, 2013 at 1:10 AM, Steven Bower wrote:
>
> >
> > not sure what you mean by good hit ratio?
> >
>
> I mean such queries are really expensive (even on cache hit), so if the
> list of ids changes every time, it never hit cache and hence executes these
steve
On Tue, Jul 30, 2013 at 5:10 PM, Steven Bower wrote:
> Very good read... Already using MMap... verified using pmap and vsz from
> top..
>
> not sure what you mean by good hit ratio?
>
> Here are the stacks...
>
>
, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:
> On Tue, Jul 30, 2013 at 12:45 AM, Steven Bower wrote:
>
> >
> > - Most of my time (98%) is being spent in
> > java.nio.Bits.copyToByteArray(long,Object,long,long) which is being
>
>
> Steven, pl
I am curious why the field:* walks the entire terms list.. could this be
discovered from a field cache / docvalues?
steve
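The workaround that comes up later in this thread (the "boolean thing") is to index a dedicated flag field rather than querying pp:*, which has to enumerate every term in the field; a hypothetical sketch:

```xml
<!-- Hypothetical schema field: set has_pp=true at index time when pp is
     populated, then filter with fq=has_pp:true, a single cached term
     lookup, instead of the term-enumerating +pp:* -->
<field name="has_pp" type="boolean" indexed="true" stored="false"/>
```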
On Tue, Jul 30, 2013 at 2:00 PM, Steven Bower wrote:
> Until I get the data refed I there was another field (a date field) that
> was there and not when the geo fie
'll be down in that sub 100ms range..
steve
On Tue, Jul 30, 2013 at 12:02 PM, Steven Bower wrote:
> Will give the boolean thing a shot... makes sense...
>
>
> On Tue, Jul 30, 2013 at 11:53 AM, Smiley, David W. wrote:
>
>> I see the problem - it's +pp:*. It may loo
led SpatialPointVectorFieldType that could be modified
> trivially but it doesn't support it now.
>
> ~ David
>
> On 7/30/13 11:32 AM, "Steven Bower" wrote:
>
> >#1 Here is my query:
> >
> >sort=vid asc
> >start=0
> >rows=1000
> >de
r to max-levels.
> (4) Do all of your searches find less than a million points, considering
> all
> filters? If so then it's worth comparing the results with LatLonType.
>
> ~ David Smiley
>
>
> Steven Bower wrote
> > @Erick it is a lot of hw, but basically trying
hat
> really shouldn't be a big issue, your index files
> should be MMaped.
>
> Let's try the crude thing first and give the JVM
> more memory.
>
> FWIW
> Erick
>
> On Mon, Jul 29, 2013 at 4:45 PM, Steven Bower
> wrote:
> > I've been doing some perfo
I've been doing some performance analysis of a spatial search use case I'm
implementing in Solr 4.3.0. Basically I'm seeing search times a lot higher
than I'd like them to be and I'm hoping people may have some suggestions
for how to optimize further.
Here are the specs of what I'm doing now:
Mach
Created https://issues.apache.org/jira/browse/SOLR-4831 to capture this
issue
On Thu, May 16, 2013 at 10:10 AM, Steven Bower wrote:
> Looking at the timestamps on the tlog files they seem to have all been
> created around the same time (04:55).. starting around this time I start
> s
;I may be the new leader - try and sync");
>
> How reproducible is this bug for you? It would be great to know if
> the patch in the issue fixes things.
>
> -Yonik
> http://lucidworks.com
>
>
> On Wed, May 15, 2013 at 6:04 PM, Steven Bower wrote:
> > They are
They are visible to ls...
On Wed, May 15, 2013 at 5:49 PM, Yonik Seeley wrote:
> On Wed, May 15, 2013 at 5:20 PM, Steven Bower wrote:
> > when the TransactionLog objects are dereferenced
> > their RandomAccessFile object is not closed..
>
> Have the files been delet
> A similar setting for Solr might map commit requests to hard commit
> (default), soft commit, or none.
>
> wunder
>
> On May 15, 2013, at 2:20 PM, Steven Bower wrote:
>
> > Most definitely understand the don't commit after each record...
> > unfortunately the data is be
.
>
> I'll see if I can create a test case to reproduce this.
>
> Separately, you'll get a lot better performance if you don't commit
> per update of course (or at least use something like commitWithin).
>
> -Yonik
> http://lucidworks.com
>
> On Wed, Ma
We have a system in which a client is sending 1 record at a time (via REST)
followed by a commit. This has produced ~65k tlog files and the JVM has run
out of file descriptors... I grabbed a heap dump from the JVM and I can see
~52k "unreachable" FileDescriptors... This leads me to believe that the
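As Yonik notes in this thread, the usual mitigation is to stop hard-committing per record and let the server batch commits; a hypothetical solrconfig.xml fragment:

```xml
<!-- Hypothetical solrconfig.xml fragment: a server-side commit policy replaces
     per-record client commits, bounding open tlog files and file descriptors -->
<autoCommit>
  <maxTime>15000</maxTime>        <!-- hard commit (rolls/closes tlogs) at most every 15s -->
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>1000</maxTime>         <!-- search visibility within 1s, no hard commit -->
</autoSoftCommit>
```

Alternatively, the client can pass commitWithin on each update request instead of issuing an explicit commit.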
> Solr & ElasticSearch Support
> http://sematext.com/
> On May 9, 2013 1:43 AM, "Steven Bower" wrote:
>
> > Is it currently possible to have per-shard replication factor?
> >
> > A bit of background on the use case...
> >
> > If you are hashi
Is it currently possible to have per-shard replication factor?
A bit of background on the use case...
If you are hashing content to shards by a known factor (let's say date
ranges, 12 shards, 1 per month), it might be the case that most of your
search traffic would be directed to one particular sha
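As far as I know there is still no per-shard replicationFactor setting, but later releases let you approximate one by adding replicas to just the hot shard via the Collections API; a sketch with hypothetical collection/shard names:

```text
# Hypothetical: 12 month-based shards; give the current month's shard extra
# replicas while cold shards keep a single copy
/admin/collections?action=ADDREPLICA&collection=logs&shard=shard12
```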