Re: ramBufferSizeMB and maxIndexingThreads

2016-01-19 Thread Shalin Shekhar Mangar
ramBufferSizeMB is independent of the maxIndexingThreads. If you set it to 100MB then any lucene segment (or part of a segment) exceeding 100MB will be flushed to disk. On Wed, Jan 20, 2016 at 3:50 AM, Angel Todorov wrote: > hi guys, > > quick question - is the ramBufferSizeMB the maximum value n

Re: Faceting and multiValued field type

2016-01-19 Thread Erick Erickson
bq: which of those date values will be used for faceting when I use range-search faceting on this field? All of them. Which values match in a multiValued field, range query or not, have no bearing on the facet counts. Faceting essentially says "take all the docs that match the query and, for each

Re: Scaling SolrCloud

2016-01-19 Thread Yago Riveiro
What I can say is: * SSD (crucial for performance if the index doesn't fit in memory, and it will not fit) * Divide and conquer: for that volume of docs you will need more than 6 nodes. * DocValues to avoid stressing the Java heap. * Will you aggregate data? If yes, what is your max

Re: Solr trying to auto-update schema.xml

2016-01-19 Thread Chris Hostetter
: Thanks, very helpful. I think I'm on the right track now, but when I do a : post now and my UpdateRequestProcessor extension tries to add a field to a : document, I get: : : RequestHandlerBase org.apache.solr.common.SolrException: ERROR: [doc=1] : Error adding field 'myField'='2234543' : : Th

Re: Solr UninvertingReader getNumericDocValues doesn't seem to work for fields that are not stored or indexed

2016-01-19 Thread Joel Bernstein
Try converting the long to a float using: Float.intBitsToFloat((int)val). It should come back with the correct float. Joel Bernstein http://joelsolr.blogspot.com/ On Tue, Jan 19, 2016 at 5:35 PM, plbarrios wrote: > I have a defined field in my schema.xml that is a simple float that is not > s
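The conversion suggested above can be illustrated in plain Java. This is only a minimal sketch: it assumes the long returned for the field carries the float's raw IEEE 754 bits in its low 32 bits, which is what the suggestion implies. The class and method names are made up for illustration and are not part of Solr or Lucene.

```java
public class FloatBitsRoundTrip {
    // Assumption: the stored numeric value is the float's 32-bit pattern widened to a long.
    static long encode(float value) {
        return Float.floatToIntBits(value);
    }

    // Narrow the long back to an int and reinterpret those bits as a float.
    static float decode(long stored) {
        return Float.intBitsToFloat((int) stored);
    }

    public static void main(String[] args) {
        float boost = 1.5f;
        long stored = encode(boost);
        // Casting the long straight to float would yield the bit pattern as a number,
        // not the original value; decode() recovers the original.
        System.out.println("stored=" + stored + " decoded=" + decode(stored));
    }
}
```

The point of the cast to int is that a plain `(float) val` converts the bit pattern numerically instead of reinterpreting it.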

Re: Scaling SolrCloud

2016-01-19 Thread Shawn Heisey
On 1/19/2016 1:30 PM, Troy Edwards wrote: We are currently "beta testing" a SolrCloud with 2 nodes and 2 shards with 2 replicas each. The number of documents is about 125000. We now want to scale this to about 10 billion documents. What are the steps to prototyping, hardware estimation and stre

Solr UninvertingReader getNumericDocValues doesn't seem to work for fields that are not stored or indexed

2016-01-19 Thread plbarrios
I have a defined field in my schema.xml that is a simple float that is not stored nor indexed as follows: ** In Solr 4 I was able to get the float value the following way: *FieldCache.Floats docBoosts = FieldCache.DEFAULT.getFloats(context.reader(), FieldName.Boost, false); float boost = docBoos

Re: zkCli.sh not in solr 5.4?

2016-01-19 Thread Shawn Heisey
On 1/19/2016 3:00 PM, Mike Thomsen wrote: I downloaded a build of 5.4.0 to install in some VMs and noticed that zkCli.sh is not there. I need it in order to upload a configuration set to ZooKeeper before I create the collection. What's the preferred way of doing that? Nitpicky point -- zkCli.sh

ramBufferSizeMB and maxIndexingThreads

2016-01-19 Thread Angel Todorov
hi guys, quick question - is the ramBufferSizeMB the maximum value no matter how many maxIndexingThreads I have, or is it multiplied by the number of indexing threads? So, if I have ramBufferSizeMB set to 100 MB, and 8 indexing threads, does this mean the total ram buffer will be 100 MB or 800 MB ? T

Re: Rolling upgrade to 5.4 from 5.0 - "bug" caused by leader changes - is there a workaround?

2016-01-19 Thread Anshum Gupta
If you can wait, I'd suggest to be on the bug fix release. It should be out around the weekend. On Tue, Jan 19, 2016 at 1:48 PM, Michael Joyner wrote: > ok, > > I just found the 5.4.1 RC2 download, it seems to work ok for a rolling > upgrade. > > I will see about downgrading back to 5.4.0 afterw

zkCli.sh not in solr 5.4?

2016-01-19 Thread Mike Thomsen
I downloaded a build of 5.4.0 to install in some VMs and noticed that zkCli.sh is not there. I need it in order to upload a configuration set to ZooKeeper before I create the collection. What's the preferred way of doing that? Specifically, I need to specify a configuration like this because it's

Re: Rolling upgrade to 5.4 from 5.0 - "bug" caused by leader changes - is there a workaround?

2016-01-19 Thread Michael Joyner
ok, I just found the 5.4.1 RC2 download, it seems to work ok for a rolling upgrade. I will see about downgrading back to 5.4.0 afterwards to be on an official release ... On 01/19/2016 04:27 PM, Michael Joyner wrote: Hello all, I downloaded 5.4 and started doing a rolling upgrade from a

Re: Rolling upgrade to 5.4 from 5.0 - "bug" caused by leader changes - is there a workaround?

2016-01-19 Thread Anshum Gupta
Hi Mike, This is a known issue and would be fixed with the upcoming 5.4.1. Here's a link to the issue: https://issues.apache.org/jira/browse/SOLR-8561 On Tue, Jan 19, 2016 at 1:27 PM, Michael Joyner wrote: > Hello all, > > I downloaded 5.4 and started doing a rolling upgrade from a 5.0 solrclou

Rolling upgrade to 5.4 from 5.0 - "bug" caused by leader changes - is there a workaround?

2016-01-19 Thread Michael Joyner
Hello all, I downloaded 5.4 and started doing a rolling upgrade from a 5.0 solrcloud cluster and discovered that there seems to be a compatibility issue where a rolling upgrade from pre-5.4 causes the 5.4 nodes to fail with "unable to determine leader" errors. Is there a workaround that

RE: solrconfig.xml - configuration scope

2016-01-19 Thread Cario, Elaine
I know this is an old post, but I've seen the same behavior as well, specifically in setting invariants. We had 2 cores with different schema/solrconfig, but they both had a (differently configured) request handler named /search. The invariants set in one core leached into requests for the ot

Re: Using "join" with 2+ cores

2016-01-19 Thread Mikhail Khludnev
Perhaps shards=... https://cwiki.apache.org/confluence/display/solr/Distributed+Search+with+Index+Sharding Usually it scales well, as it searches across shards in parallel. On Tue, Jan 19, 2016 at 11:38 AM, Steven White wrote: > I should rephrase my question, this isn't really about "join", it's

Scaling SolrCloud

2016-01-19 Thread Troy Edwards
We are currently "beta testing" a SolrCloud with 2 nodes and 2 shards with 2 replicas each. The number of documents is about 125000. We now want to scale this to about 10 billion documents. What are the steps to prototyping, hardware estimation and stress testing? Thanks

Re: Using "join" with 2+ cores

2016-01-19 Thread Binoy Dalal
You can try distributed search ( https://wiki.apache.org/solr/DistributedSearch) although I'm not sure if it'll work against two completely different cores. On Wed, 20 Jan 2016, 01:08 Steven White wrote: > I should rephrase my question, this isn't really about "join", it's more > about how do I

Re: Using "join" with 2+ cores

2016-01-19 Thread Steven White
I should rephrase my question, this isn't really about "join", it's more about how do I search across multiple cores as if I have 1 core. What is the Solr URL syntax I need to send to Solr to search across N cores as if I'm searching against 1 core? I'm on Solr 5.2.1 in a non-cloud setup. Steve
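One possible shape for such a URL, following the `shards` suggestion made in the replies, is sketched below. The host, port, core names (`core1`, `core2`) and the `title` field are hypothetical. As one reply notes, it is not certain that distributed search behaves well across cores with completely different schemas, so treat this as a starting point rather than a confirmed answer.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.List;

public class MultiCoreSearchUrl {
    // Builds a select URL against one core and lists every core to search in "shards".
    static String buildUrl(String baseCoreUrl, List<String> shardCores, String query) {
        String encodedQuery;
        try {
            encodedQuery = URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
        // Each shard entry is host:port/path without the scheme, comma-separated.
        return baseCoreUrl + "/select?q=" + encodedQuery
                + "&shards=" + String.join(",", shardCores);
    }

    public static void main(String[] args) {
        // Hypothetical cores on a single local Solr instance.
        List<String> cores = Arrays.asList(
                "localhost:8983/solr/core1",
                "localhost:8983/solr/core2");
        System.out.println(buildUrl("http://localhost:8983/solr/core1", cores, "title:solr"));
    }
}
```

The request is sent to any one core, which then fans the query out to every core named in `shards` and merges the results.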

Re: Faceting and multiValued field type

2016-01-19 Thread Steven White
My apology for not being clear -- I left out the keyword "range search" with facet. Let me try again. Using DateRangeField field type, if this field is multiValued and I have 3 date values stored for one record, 5 for another, etc., which of those date values will be used for faceting when I use

Re: Solr trying to auto-update schema.xml

2016-01-19 Thread Bob Lawson
PS: I followed the advice given here to update my solrconfig.xml file to get rid of add-unknown-fields-to-the-schema On Tue, Jan 19, 2016 at 1:45 PM, Bob Lawson wrote: > Thank

Re: Solr trying to auto-update schema.xml

2016-01-19 Thread Bob Lawson
Thanks, very helpful. I think I'm on the right track now, but when I do a post now and my UpdateRequestProcessor extension tries to add a field to a document, I get: RequestHandlerBase org.apache.solr.common.SolrException: ERROR: [doc=1] Error adding field 'myField'='2234543' The call I'm making

Re: Solr trying to auto-update schema.xml

2016-01-19 Thread Erick Erickson
You _are_ using the mutating update processor. You have this: . . . and "add-unknown-field-to-the-schema" is in your update section here: add-unknown-fields-to-the-schema As to what field is in the doc but not in the schema, the solr log will probably help... Best, Erick On Tue,

Re: Solr trying to auto-update schema.xml

2016-01-19 Thread Bob Lawson
5.4.0 ${solr.data.dir:} ${solr.lock.type:native}

Re: Solr node 'Gone' status

2016-01-19 Thread Erick Erickson
The other possibility is that the "live_nodes" znode somehow lost the registration of that node. This shouldn't be happening, I've only seen it when Zookeeper had problems (full disk in this case). A node is only considered _really_ available if its state in the state.json (or clusterstate.json in

Re: Using "join" with 2+ cores

2016-01-19 Thread Mikhail Khludnev
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-JoinQueryParser fromIndex= score= I don't think there is a difference between joining within a single core and joining across two, but I'm not sure what you mean by 5 to 10 cores. On Tue, Jan 19, 2016 at 8:43 AM, Steve

Re: Faceting and multiValued field type

2016-01-19 Thread Erick Erickson
Yes. What do you mean "how does it work"? The low-level details or what? Basically, faceting just... facets. I.e. for each unique value in the field specified it counts the number of docs in the result set that have that value. So if you have a doc with two dates and facet on that field, say 1/1
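The counting rule described above can be sketched in a few lines of Java. This is an illustration of the idea only, not how Solr implements faceting; the class name and the date strings are invented for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MultiValuedFacetSketch {
    // Each inner list holds the values of one multiValued field for one matching doc.
    static Map<String, Integer> facetCounts(List<List<String>> matchingDocs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> values : matchingDocs) {
            // A doc contributes once to every distinct value it holds.
            for (String value : new HashSet<>(values)) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("2016-01-01", "2016-02-01"), // doc with two dates
                Arrays.asList("2016-01-01"));              // doc with one date
        System.out.println(facetCounts(docs));
    }
}
```

A doc with two values therefore shows up in two facet buckets, so the bucket counts can add up to more than the number of matching docs.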

Re: Solr trying to auto-update schema.xml

2016-01-19 Thread Erick Erickson
Let's see your solrconfig.xml, because it does look like you're somehow using managed schema. And is the error above from the Solr log or your client? Because the Solr log often has more complete information. Best, Erick On Tue, Jan 19, 2016 at 9:54 AM, Bob Lawson wrote: > I added some fields t

Re: Boost query vs function query in edismax query

2016-01-19 Thread Chris Hostetter
: Boost Query (bq) accepts lucene queries. E.g. bq=price:[50 TO 100]^100 : boost and bf parameters accept Function queries, e.g. boost=log(popularity) while these statements are both true, they don't tell the full story. for example you can also specify a function as a query using the appropri

Solr trying to auto-update schema.xml

2016-01-19 Thread Bob Lawson
I added some fields to schema.xml. I then use a custom extension of UpdateRequestProcessor to add those fields to a document being indexed. When I run the post command, I am getting an error because apparently Solr is trying to add more fields over and above what I manually added. Here is the err

Re: Solr QTime explanation

2016-01-19 Thread Toke Eskildsen
Damien Picard wrote: > Currently we have 4 Solr nodes, with 12Gb memory (heap) ; the collections > are replicated (4 shards, 1 replica). > This query mostly returns a QTime=4 and it takes around 20ms on the client > side to get the result. > We have to handle around 200 simultaneous connections.

Re: Boost query vs function query in edismax query

2016-01-19 Thread Ahmet Arslan
Hi Sara, Boost Query (bq) accepts lucene queries. E.g. bq=price:[50 TO 100]^100 boost and bf parameters accept Function queries, e.g. boost=log(popularity) Ahmet On Tuesday, January 19, 2016 12:41 PM, Binoy Dalal wrote: You use boost functions when you wish to perform certain operations, lik

Faceting and multiValued field type

2016-01-19 Thread Steven White
Hi everyone, Can I use facet on a field type of multiValued? If so, how does facet work with field type of "date" set as multiValued? Thanks Steve

Using "join" with 2+ cores

2016-01-19 Thread Steven White
Hi everyone, Does Solr's join support 2+ cores? Is there an example? Is there a performance impact when I have 5 to 10 cores vs. a single core (all my data in 1 core)? Is the relevancy score impacted with multiple cores vs. a single core? Thanks Steve

Using Solr's spatial functionality for astronomical catalog

2016-01-19 Thread Colin Freas
Greetings! I have recently stood up an instance of Solr, indexing a catalog of about 100M records representing points on the celestial sphere. All of the fields are strings, floats, and non-spatial types. I’d like to convert the positional data to an appropriate spatial point data type suppo

Re: Solr 5 with java 7 or java 8

2016-01-19 Thread Daniel Collins
Solr 5.x is compiled for Java7 bytecode, so it will definitely work on a Java 7 VM. It is Solr 6 which will be Java 8 only. That said, I would advocate using Java 8. Especially as Java 7 is already publicly End of Life'd, starting a new project on Java 7 would seem short-sighted. On 19 January 2

Re: Solr QTime explanation

2016-01-19 Thread Damien Picard
Thank you for your advice. Currently we have 4 Solr nodes, with 12Gb memory (heap) ; the collections are replicated (4 shards, 1 replica). This query mostly returns a QTime=4 and it takes around 20ms on the client side to get the result. We have to handle around 200 simultaneous connections. Cu

Solr 5 with java 7 or java 8

2016-01-19 Thread Novin Novin
Hi Guys, Is Java 8 highly recommended with Solr 5.x because the Solr 5.x code is compiled with Java 8, or would Java 7 be OK but possibly cause performance issues? Thanks Novin

Re: Solr QTime explanation

2016-01-19 Thread Shawn Heisey
On 1/19/2016 3:43 AM, Damien Picard wrote: > I'm currently testing Solr query execution performance (over http and > SolrJ), and, using HTTP with JMeter, I see that the response time increases > with the number of concurrent request (100 simultaneous request in my case). > > To understand where Sol

Solr QTime explanation

2016-01-19 Thread Damien Picard
Hi, I'm currently testing Solr query execution performance (over http and SolrJ), and, using HTTP with JMeter, I see that the response time increases with the number of concurrent requests (100 simultaneous requests in my case). To understand where Solr takes more time, I use the debug=timing param

Re: Boost query vs function query in edismax query

2016-01-19 Thread Binoy Dalal
You use boost functions when you wish to perform certain operations, like mathematical (reciprocal, sq. root etc.) on the actual field value during boosting. In boost queries, all you can do is specify a field optionally along with a particular value to boost on. On Tue, 19 Jan 2016, 14:22 shruti

Re: Solr node 'Gone' status

2016-01-19 Thread danny teichthal
From my short experience, it indicates that the particular node lost connection with zookeeper. Like Binoy said, It may be because the process/host is down, but also could be a result of a network problem. On Tue, Jan 19, 2016 at 12:20 PM, Binoy Dalal wrote: > In my experience 'Gone' indicates

Re: Solr node 'Gone' status

2016-01-19 Thread Binoy Dalal
In my experience 'Gone' indicates that that particular solr instance itself is down. To bring it back up, simply restart solr. On Tue, 19 Jan 2016, 11:16 davidphilip cherian wrote: > Hi, > > Solr-admin cloud view page has got another new radio button indicating > status of node : 'Gone' status.

Re: Boost query vs function query in edismax query

2016-01-19 Thread shruti suri
Hi, Both queries affect Solr scoring, but for relevancy we use the boost query, as we can provide additional weights to a boost query but not to a function query. - Regards Shruti -- View this message in context: http://lucene.472066.n3.nabble.com/Boost-query-vs-function-query-in-edismax-query