Re: Very long young generation stop the world GC pause

2016-12-21 Thread Steven Bower
Also curious why such a large heap is required... If it's due to field caches being loaded I'd highly recommend MMapDirectory (if not using already) and turning on DocValues for all fields you plan to perform sort/facet/analytics on. steve On Wed, Dec 21, 2016 at 9:25 AM Pushkar Raste wrote: >

Re: SOLR Disk Access Latency Problem

2016-09-21 Thread Steven Bower
That sounds like some SAN vendor BS if you ask me. Breaking up 300 GB into smaller chunks would only be relevant if they were caching entire files, not blocks, and I find that hard to believe. I would be interested to know more about the specifics of the problem as the vendor sees it. As Shawn said loc

Re: Full re-index without downtime

2016-07-06 Thread Steven Bower
There are two options as I see it.. 1. Do something like you describe and create a secondary index, index into it, then switch... I personally would create a completely separate solr cloud alongside my existing one vs new core in the same cloud as you might see some negative impacts on GC caused b

Re: deploy solr on cloud providers

2016-07-05 Thread Steven Bower
Looking deeper into zookeeper as truth mode I was wrong about existing replicas being recreated once storage is gone.. Seems there is intent for the type of behavior based upon existing tickets.. We'll look at creating a patch for this too.. Steve On Tue, Jul 5, 2016 at 6:00 PM Tomás Fernández Löb

Re: deploy solr on cloud providers

2016-07-05 Thread Steven Bower
You shouldn't "need" to move the storage as SolrCloud will replicate all data to the new node and anything in the transaction log will already be distributed through the rest of the machines.. One option to keep all your data attached to nodes might be to use Amazon EFS (pretty new) to store your

Re: stateless solr ?

2016-07-05 Thread Steven Bower
The ticket in question is https://issues.apache.org/jira/browse/SOLR-9265 We are working on a patch now... will update when we have a working patch / tests.. Shawn is correct that when adding a new node to a SolrCloud cluster it will not automatically add replicas/etc.. The idea behind this patc

Re: stateless solr ?

2016-07-04 Thread Steven Bower
> > Upayavira > > On Mon, 4 Jul 2016, at 08:53 PM, Steven Bower wrote: > > My main issue is having to make any solr collection api calls during a > > transition.. It makes integrating with orchestration engines way more > > complex.. > > On Mon, Jul 4, 2016 at 3:40

Re: stateless solr ?

2016-07-04 Thread Steven Bower
My main issue is having to make any solr collection api calls during a transition.. It makes integrating with orchestration engines way more complex.. On Mon, Jul 4, 2016 at 3:40 PM Upayavira wrote: > Are you using Solrcloud? With Solrcloud this stuff is easy. You just add > a new replica for a c

Re: stateless solr ?

2016-07-04 Thread Steven Bower
We have been working on some changes that should help with this.. 1st challenge is having the node name remain static regardless of where the node runs (right now it uses host and port, so this won't work unless you are using some sort of tunneled or dynamic networking).. We have a patch we are wor

Re: Solr cross core join special condition

2015-11-11 Thread Steven Bower
commenting so this ends up in Dennis' inbox.. On Tue, Oct 13, 2015 at 7:17 PM Yonik Seeley wrote: > On Wed, Oct 7, 2015 at 9:42 AM, Ryan Josal wrote: > > I developed a join transformer plugin that did that (although it didn't > > flatten the results like that). The one thing that was painful a

SolrCloud with local configs

2015-05-21 Thread Steven Bower
Is it possible to run in "cloud" mode with zookeeper managing collections/state/etc.. but to read all config files (solrconfig, schema, etc..) from local disk? Obviously this implies that you'd have to keep them in sync.. My thought here is of running Solr in a docker container, but instead of ha

Re: Spatial maxDistErr changes

2014-04-03 Thread Steven Bower
> Good question Steve, > > You'll have to re-index right off. > > ~ David > p.s. Sorry I didn't reply sooner; I just switched jobs and reconfigured my > mailing list subscriptions > > > > Steven Bower wrote > > If I am only indexing point shapes and I w

Spatial maxDistErr changes

2014-03-17 Thread Steven Bower
If I am only indexing point shapes and I want to change the maxDistErr from 0.09 (1m res) to 0.00045, will this "break", as in searches stop working, or will search work but any performance gain won't be seen until all docs are reindexed? Or will I have to reindex right off? thanks, steve

Re: IDF maxDocs / numDocs

2014-03-12 Thread Steven Bower
My problem is that maxDoc() and docCount() both report documents that have been deleted in their values. Because of merging/etc.. those numbers can be different per replica (or at least that is what I'm seeing). I need a value that is consistent across replicas... I see in the comment it makes

IDF maxDocs / numDocs

2014-03-12 Thread Steven Bower
I am noticing that maxDocs is consistently different between replicas, and because it is used in the idf calculation, idf scores for the same query/doc differ between replicas. Obviously an optimize can normalize the maxDocs scores, but that is only temporary.. is there a way to hav
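To make the effect concrete, here is a minimal sketch. It assumes the classic Lucene TF-IDF formula, idf = 1 + ln(maxDoc / (docFreq + 1)), which is what DefaultSimilarity used in the Solr 4.x line; the replica counts are made-up numbers, and docFreq is held fixed purely for simplicity.

```python
import math


def classic_idf(doc_freq: int, max_doc: int) -> float:
    """Classic Lucene TF-IDF idf: 1 + ln(max_doc / (doc_freq + 1))."""
    return 1.0 + math.log(max_doc / (doc_freq + 1))


# Hypothetical replicas of the same shard: identical live documents, but
# replica B still counts more deleted documents in maxDoc (not merged away yet).
replica_a = classic_idf(doc_freq=100, max_doc=1_000_000)
replica_b = classic_idf(doc_freq=100, max_doc=1_050_000)

# The gap depends only on the ratio of the two maxDoc values.
idf_gap = replica_b - replica_a
```

Under this assumption the same term gets a slightly higher idf on the replica with more not-yet-merged deletes, which is enough to reorder closely scored documents.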

Re: Issue with spatial search

2014-03-11 Thread Steven Bower
also put this option in your query > after the end of the last parenthesis, as in this example from the wiki: > > fq=geo:"IsWithin(POLYGON((-10 30, -40 40, -10 -20, 40 20, 0 0, -10 30))) > distErrPct=0" > > ~ David > > > Steven Bower wrote > > Only poi
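The filter syntax quoted above can be assembled programmatically. This is only a sketch: the helper name `is_within_fq` and the `geo` field name are illustrative, and the coordinates are the ones from the wiki example, given as x y pairs.

```python
def is_within_fq(field, ring, dist_err_pct=0):
    """Build an fq value of the form geo:"IsWithin(POLYGON((x y, ...))) distErrPct=0"."""
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # a WKT polygon ring must end where it starts
    coords = ", ".join(f"{x:g} {y:g}" for x, y in ring)
    return f'{field}:"IsWithin(POLYGON(({coords}))) distErrPct={dist_err_pct}"'


# Ring taken from the wiki example quoted in the message above.
wiki_ring = [(-10, 30), (-40, 40), (-10, -20), (40, 20), (0, 0), (-10, 30)]
fq = is_within_fq("geo", wiki_ring)
```

Note that distErrPct sits inside the quoted field value, after the last closing parenthesis of the shape, as the message describes.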

Re: Issue with spatial search

2014-03-10 Thread Steven Bower
> > It's a fairly different story if you are indexing non-point shapes. > > ~ David > > From: Steven Bower smb-apa...@alcyon.net >> > Reply-To: "solr-user@lucene.apache.org solr-user@lucene.apache.org >" > > <mailto:solr-user@lucene.apache.o

Re: Issue with spatial search

2014-03-10 Thread Steven Bower
.8988 42.1284,42.2141 40.0919,47.8482 30.4169,47.5783 26.9892,43.6459 27.2095,41.5676 29.0454,41.2198 On Mon, Mar 10, 2014 at 4:23 PM, Steven Bower wrote: > Minor edit to the KML to adjust color of polygon > > > On Mon

Re: Issue with spatial search

2014-03-10 Thread Steven Bower
Minor edit to the KML to adjust color of polygon On Mon, Mar 10, 2014 at 4:21 PM, Steven Bower wrote: > I am seeing an "error" when doing a spatial search where a particular point > is showing up within a polygon, but by all methods I've tried that point is > not within

Issue with spatial search

2014-03-10 Thread Steven Bower
I am seeing an "error" when doing a spatial search where a particular point is showing up within a polygon, but by all methods I've tried that point is not within the polygon.. First the point is: 41.2299,29.1345 (lat/lon) The polygon is: 31.2719,32.283 31.2179,32.3681 31.1333,32.3407 30.9356,32.
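One way to cross-check a result like this outside Solr is a plain even-odd ray-casting test. The sketch below is not what Solr does internally: it treats coordinates as planar x/y and ignores grid approximation and geodesic effects. The square is a hypothetical stand-in, since the polygon in the message is truncated.

```python
def point_in_polygon(x, y, ring):
    """Even-odd ray casting on planar coordinates; ring is a list of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # the ray to the right crosses this edge
    return inside


# Hypothetical test shape, not the polygon from the thread.
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
center_inside = point_in_polygon(5, 5, square)
right_outside = point_in_polygon(15, 5, square)
```

When feeding real data into a check like this, the vertex order (x then y, meaning lon then lat) has to match between the point and the polygon.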

Re: core.properties and solr.xml

2014-01-23 Thread Steven Bower
> >>> I think solr.xml is the correct place for it, and you can then set up > substitution variables to allow it to be set by environment variables, etc. > But let's discuss on the JIRA ticket. > >>> > >>> Alan Woodward > >>> www.flax.co.uk > >>

Re: core.properties and solr.xml

2014-01-15 Thread Steven Bower
he place, passed in and recorded in constructors etc, > > as well as being possibly unique for each core. There's been some talk > > of sharing a single config object, and there's also talk about using > > "config sets" that might address some of those concerns, but ne

core.properties and solr.xml

2014-01-14 Thread Steven Bower
Are there any plans/tickets to allow for pluggable SolrConf and CoreLocator? In my use case my solr.xml is totally static, I have a separate dataDir and my core.properties are derived from a separate configuration (living in ZK) but totally outside of the SolrCloud.. I'd like to be able to not hav

Index Sizes

2014-01-07 Thread Steven Bower
I was looking at the code for getIndexSize() on the ReplicationHandler to get at the size of the index on disk. From what I can tell, because this does directory.listAll() to get all the files in the directory, the size on disk includes not only what is searchable at the moment but potentially also

Re: SolrCoreAware

2013-11-15 Thread Steven Bower
And the close hook will basically only be fired once during shutdown? On Fri, Nov 15, 2013 at 1:07 PM, Chris Hostetter wrote: > > : So for a given instance of a handler it will only be called once during > the > : lifetime of that handler? > > correct (unless there is a bug somewhere) > > : Also

Re: SolrCoreAware

2013-11-15 Thread Steven Bower
>>> it should be called only once during the lifetime of a given plugin, >>> usually not long after construction -- but it could be called many, many >>> times in the lifetime of the solr process. So for a given instance of a handler it will only be called once during the lifetime of that handler?

Re: SolrCoreAware

2013-11-15 Thread Steven Bower
ed when the handler is created, either at SolrCore construction > time (solr startup or core reload) or the first time the handler is > requested if it's a lazy-loading handler. > > Alan Woodward > www.flax.co.uk > > > On 15 Nov 2013, at 15:40, Steven Bower wrote: >

SolrCoreAware

2013-11-15 Thread Steven Bower
Under what circumstances will a handler that implements SolrCoreAware have its inform() method called? thanks, steve

Re: StatsComponent with median

2013-10-04 Thread Steven Bower
Check out: https://issues.apache.org/jira/browse/SOLR-5302 it supports median value On Wed, Jul 3, 2013 at 12:11 PM, William Bell wrote: > If you are a programmer, you can modify it and attach a patch in Jira... > > > > > On Tue, Jun 4, 2013 at 4:25 AM, Marcin Rzewucki > wrote: > > > Hi there,

Re: How to set a condition over stats result

2013-10-04 Thread Steven Bower
Check out: https://issues.apache.org/jira/browse/SOLR-5302 can do this using query facets On Fri, Jul 12, 2013 at 11:35 AM, Jack Krupansky wrote: > sum(x, y, z) = x + y + z (sums those specific fields values for the > current document) > > sum(x, y) = x + y (sum of those two specific field value

Re: bucket count for facets

2013-09-06 Thread Steven Bower
lues in a field. > > See https://cwiki.apache.org/confluence/display/solr/The+Stats+Component > > On Fri, Sep 6, 2013 at 12:28 AM, Steven Bower > wrote: > > Is there a way to get the count of buckets (ie unique values) for a field > > facet? the rudimentary approach of co

bucket count for facets

2013-09-05 Thread Steven Bower
Is there a way to get the count of buckets (i.e. unique values) for a field facet? The rudimentary approach of course is to get back all buckets, but in some cases this is a huge amount of data. thanks, steve

Re: AND not working

2013-08-15 Thread Steven Bower
https://issues.apache.org/jira/browse/SOLR-5163 On Thu, Aug 15, 2013 at 6:04 PM, Steven Bower wrote: > @Yonik that was exactly the issue... I'll file a ticket... there def > should be an exception thrown for something like this.. > > It would seem to me that eating any sort

Re: AND not working

2013-08-15 Thread Steven Bower
rrors). > > -Yonik > http://lucidworks.com > > > On Thu, Aug 15, 2013 at 5:19 PM, Steven Bower > wrote: > > > I have query like: > > > > q=foo AND bar > > defType=edismax > > qf=field1 > > qf=field2 > > qf=fi

AND not working

2013-08-15 Thread Steven Bower
I have a query like: q=foo AND bar defType=edismax qf=field1 qf=field2 qf=field3 With debug on I see it parsing to this: (+(DisjunctionMaxQuery((field1:foo | field2:foo | field3:foo)) DisjunctionMaxQuery((field1:and | field2:and | field3:and)) DisjunctionMaxQuery((field1:bar | field2:bar | field3:

Schema Lint

2013-08-06 Thread Steven Bower
Is there an easy way in code / command line to lint a solr config (or even just a solr schema)? Steve

Re: Performance question on Spatial Search

2013-08-05 Thread Steven Bower
nvert) that are used for sorting.. I'm suspecting that docvalues will greatly help this load performance? thanks, steve On Wed, Jul 31, 2013 at 4:32 PM, Steven Bower wrote: > the list of IDs does change relatively frequently, but this doesn't seem > to have very much impact on t

Re: Performance question on Spatial Search

2013-07-31 Thread Steven Bower
> On Wed, Jul 31, 2013 at 1:10 AM, Steven Bower wrote: > > > > > not sure what you mean by good hit ratio? > > > > I mean such queries are really expensive (even on cache hit), so if the > list of ids changes every time, it never hits the cache and hence executes these

Re: Performance question on Spatial Search

2013-07-30 Thread Steven Bower
steve On Tue, Jul 30, 2013 at 5:10 PM, Steven Bower wrote: > Very good read... Already using MMap... verified using pmap and vsz from > top.. > > not sure what you mean by good hit ratio? > > Here are the stacks... > >

Re: Performance question on Spatial Search

2013-07-30 Thread Steven Bower
, Mikhail Khludnev < mkhlud...@griddynamics.com> wrote: > On Tue, Jul 30, 2013 at 12:45 AM, Steven Bower >wrote: > > > > > - Most of my time (98%) is being spent in > > java.nio.Bits.copyToByteArray(long,Object,long,long) which is being > > > Steven, pl

Re: Performance question on Spatial Search

2013-07-30 Thread Steven Bower
I am curious why the field:* walks the entire terms list.. could this be discovered from a field cache / docvalues? steve On Tue, Jul 30, 2013 at 2:00 PM, Steven Bower wrote: > Until I get the data refed I there was another field (a date field) that > was there and not when the geo fie

Re: Performance question on Spatial Search

2013-07-30 Thread Steven Bower
'll be down in that sub 100ms range.. steve On Tue, Jul 30, 2013 at 12:02 PM, Steven Bower wrote: > Will give the boolean thing a shot... makes sense... > > > On Tue, Jul 30, 2013 at 11:53 AM, Smiley, David W. wrote: > >> I see the problem: it's +pp:*. It may loo

Re: Performance question on Spatial Search

2013-07-30 Thread Steven Bower
led SpatialPointVectorFieldType that could be modified > trivially but it doesn't support it now. > > ~ David > > On 7/30/13 11:32 AM, "Steven Bower" wrote: > > >#1 Here is my query: > > > >sort=vid asc > >start=0 > >rows=1000 > >de

Re: Performance question on Spatial Search

2013-07-30 Thread Steven Bower
r to max-levels. > (4) Do all of your searches find less than a million points, considering > all > filters? If so then it's worth comparing the results with LatLonType. > > ~ David Smiley > > > Steven Bower wrote > > @Erick it is a lot of hw, but basically trying

Re: Performance question on Spatial Search

2013-07-29 Thread Steven Bower
hat > really shouldn't be a big issue, your index files > should be MMaped. > > Let's try the crude thing first and give the JVM > more memory. > > FWIW > Erick > > On Mon, Jul 29, 2013 at 4:45 PM, Steven Bower > wrote: > > I've been doing some perfo

Performance question on Spatial Search

2013-07-29 Thread Steven Bower
I've been doing some performance analysis of a spatial search use case I'm implementing in Solr 4.3.0. Basically I'm seeing search times a lot higher than I'd like them to be and I'm hoping people may have some suggestions for how to optimize further. Here are the specs of what I'm doing now: Mach

Re: Transaction Logs Leaking FileDescriptors

2013-05-16 Thread Steven Bower
Created https://issues.apache.org/jira/browse/SOLR-4831 to capture this issue On Thu, May 16, 2013 at 10:10 AM, Steven Bower wrote: > Looking at the timestamps on the tlog files they seem to have all been > created around the same time (04:55).. starting around this time I start > s

Re: Transaction Logs Leaking FileDescriptors

2013-05-16 Thread Steven Bower
"I may be the new leader - try and sync"); > > How reproducible is this bug for you? It would be great to know if > the patch in the issue fixes things. > > -Yonik > http://lucidworks.com > > > On Wed, May 15, 2013 at 6:04 PM, Steven Bower wrote: > > They are

Re: Transaction Logs Leaking FileDescriptors

2013-05-15 Thread Steven Bower
They are visible to ls... On Wed, May 15, 2013 at 5:49 PM, Yonik Seeley wrote: > On Wed, May 15, 2013 at 5:20 PM, Steven Bower wrote: > > when the TransactionLog objects are dereferenced > > their RandomAccessFile object is not closed.. > > Have the files been delet

Re: Transaction Logs Leaking FileDescriptors

2013-05-15 Thread Steven Bower
> A similar setting for Solr might map commit requests to hard commit > (default), soft commit, or none. > > wunder > > On May 15, 2013, at 2:20 PM, Steven Bower wrote: > > > Most definitely understand the don't commit after each record... > > unfortunately the data is be

Re: Transaction Logs Leaking FileDescriptors

2013-05-15 Thread Steven Bower
. > > I'll see if I can create a test case to reproduce this. > > Separately, you'll get a lot better performance if you don't commit > per update of course (or at least use something like commitWithin). > > -Yonik > http://lucidworks.com > > On Wed, Ma
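As a rough illustration of the commitWithin suggestion: instead of following every record with an explicit commit, the client can ask Solr to commit within some time window. The sketch below only builds the HTTP request and does not send it; the host, core name, document, and the 10-second window are all assumptions for the example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_update_request(base_url, doc, commit_within_ms=10000):
    """Build (but do not send) a JSON update that relies on commitWithin, not an explicit commit."""
    query = urlencode({"commitWithin": commit_within_ms})
    return Request(
        url=f"{base_url}/update?{query}",
        data=json.dumps([doc]).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and document, for illustration only.
req = build_update_request("http://localhost:8983/solr/collection1", {"id": "1"})
```

With this approach many single-record updates that arrive inside the same window can share one commit rather than each forcing its own.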

Transaction Logs Leaking FileDescriptors

2013-05-15 Thread Steven Bower
We have a system in which a client is sending 1 record at a time (via REST) followed by a commit. This has produced ~65k tlog files and the JVM has run out of file descriptors... I grabbed a heap dump from the JVM and I can see ~52k "unreachable" FileDescriptors... This leads me to believe that the

Re: Per Shard Replication Factor

2013-05-10 Thread Steven Bower
> Solr & ElasticSearch Support > http://sematext.com/ > On May 9, 2013 1:43 AM, "Steven Bower" wrote: > > > Is it currently possible to have per-shard replication factor? > > > > A bit of background on the use case... > > > > If you are hashi

Per Shard Replication Factor

2013-05-08 Thread Steven Bower
Is it currently possible to have a per-shard replication factor? A bit of background on the use case... If you are hashing content to shards by a known factor (let's say date ranges, 12 shards, 1 per month) it might be the case that most of your search traffic would be directed to one particular sha
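The use case can be pictured with a small sketch. Everything here is hypothetical: the shard naming, the choice of which month is "hot", and the replica counts are invented to show the shape of the idea, not an existing SolrCloud feature.

```python
from datetime import date


def shard_for(doc_date: date) -> str:
    """Route a document to one of 12 month-based shards (hypothetical naming)."""
    return f"shard{doc_date.month:02d}"


def desired_replicas(shard: str, hot_shard: str, hot: int = 4, default: int = 2) -> int:
    """Give the shard that receives most search traffic more replicas than the rest."""
    return hot if shard == hot_shard else default


# Suppose most searches target the current month.
current = shard_for(date(2013, 5, 8))
plan = {f"shard{m:02d}": desired_replicas(f"shard{m:02d}", current) for m in range(1, 13)}
```

A uniform replication factor would force every shard up to the hot shard's replica count, which is the overhead the question is trying to avoid.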