Re: Learning to Rank (LTR) with grouping

2018-05-01 Thread ilayaraja
"Top K shouldn't start from the "start" parameter, if it does, it is a bug." 1. I clearly see that LTR does re-rank based on the start parameter. 2. When reRankDocs=24 and pageSize=24, I still get the second page of results re-ranked by the LTR plugin when I query with start=24. Alessandro Benedett
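
For readers following this thread, here is a minimal sketch of how the request in point 2 might be assembled. It only builds the query parameters; it does not contact Solr or demonstrate the re-ranking behaviour under discussion. The model name `myModel`, the query text, and the helper name are assumptions for illustration and do not come from the thread.

```python
from urllib.parse import urlencode

def ltr_query_params(user_query, model, rerank_docs, page_size, page):
    """Build Solr query parameters for an LTR re-rank request.

    `page` is zero-based, so page 1 is the second page of results.
    The model name passed in is hypothetical.
    """
    start = page * page_size
    return {
        "q": user_query,
        # Re-rank query using the LTR query parser.
        "rq": "{!ltr model=%s reRankDocs=%d}" % (model, rerank_docs),
        "start": start,
        "rows": page_size,
    }

# Second page with reRankDocs=24 and a page size of 24, as in the post.
params = ltr_query_params("test", "myModel", rerank_docs=24, page_size=24, page=1)
query_string = urlencode(params)
```

With these values `start` is 24, which is the case the poster reports as still being re-ranked.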

Re: 7.3 appears to leak

2018-05-01 Thread Đạt Cao Mạnh
Thanks, Markus. So I will go ahead with the 7.3.1 release. On Tue, May 1, 2018 at 9:41 PM Markus Jelsma wrote: > Mạnh, Shalin, > > I tried to reproduce it locally but i failed, it is not just a stream of > queries and frequent updates/commits. We will temporarily abuse a > production machine to run 7

Solr Heap usage

2018-05-01 Thread Greenhorn Techie
Hi, I am wondering what considerations to be aware of when arriving at an optimal heap size for the Solr JVM. Though I did discuss this on the IRC, I am still unclear on how Solr uses the JVM heap space. Are there any pointers to understand this aspect better? Given that Solr requires an optimally config

Median Date

2018-05-01 Thread Jim Freeby
All, We have a dateImported field in our schema. I'd like to generate a statistic showing the median dateImported (actually we want median age of the documents, based on the dateImported value). I have other stats that calculate the median value of numbers (like price). This was achieved with some

Query Regarding Solr Garbage Collection

2018-05-01 Thread Greenhorn Techie
Hi, Following the https://wiki.apache.org/solr/SolrPerformanceFactors article, I understand that Garbage Collection might be triggered due to significant increase in JVM heap usage unless a commit is performed. Given this background, I am curious to understand the reasons / factors that contribute

Re: Error when indexing against a specific dynamic field type

2018-05-01 Thread Erick Erickson
Steve's comment is much more germane. KeywordTokenizer, used in alphaOnlySort last I knew is not appropriate at all. Do you really want single tokens that consist of the entire document for sorting purposes? Wouldn't the first 1K be enough? It looks like this was put in in 4.0, so I'm guessing you

Re: SolrCloud Heterogenous Hardware setup

2018-05-01 Thread Deepak Goel
I had a similar problem some time back. It might not be the best way, but I used cron to move data from a high-end-spec to a lower-end-spec. It worked beautifully. Deepak "The greatness of a nation can be judged by the way its animals are treated. Please stop cruelty to Animals, become a

Re: Error when indexing against a specific dynamic field type

2018-05-01 Thread THADC
Erick, thanks for the response. I have a number of documents in our database where solr is throwing the same exception against *_tsing types. However, when I index against the same document with our solr 4.7, it is successfully indexed. So, I assume something is different between 4.7 and 7.3. I wa

Re: SolrCloud Heterogenous Hardware setup

2018-05-01 Thread Greenhorn Techie
Thanks Erick. This information is very helpful. Will explore further on the node placement rules within Collections API. Many Thanks On 1 May 2018 at 16:26:34, Erick Erickson (erickerick...@gmail.com) wrote: "Is it possible to configure a collection such that the collection data is only stored
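
Erick's reply in this thread mentions creating a collection with a createNodeSet that names the nodes the replicas are placed on. As a rough sketch of what such a Collections API request could look like, the snippet below only builds the URL and does not send it. The base URL, collection name, shard and replica counts, and node names are all assumptions for illustration.

```python
from urllib.parse import urlencode, urlparse, parse_qs

def create_collection_url(base_url, name, num_shards, replication_factor, nodes):
    """Build a Collections API CREATE URL restricted to the given nodes.

    `nodes` is a list of Solr node names (hypothetical here), joined
    with commas for the createNodeSet parameter.
    """
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "createNodeSet": ",".join(nodes),
    }
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)

# Hypothetical monthly collection placed on two named nodes.
url = create_collection_url(
    "http://localhost:8983/solr",
    "logs_2018_05",
    num_shards=2,
    replication_factor=2,
    nodes=["host1:8983_solr", "host2:8983_solr"],
)
# Decode the query string again to inspect the parameters that were set.
decoded = parse_qs(urlparse(url).query)
```

For the one-collection-per-month layout described in the original question, a helper like this could be called once per month with a different node list.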

Re: Error when indexing against a specific dynamic field type

2018-05-01 Thread Shawn Heisey
On 5/1/2018 8:40 AM, THADC wrote: > I get the following exception: > > *Exception writing document id FULL_36265 to the index; possible analysis > error: Document contains at least one immense term in > field="gridFacts_tsing" (whose UTF8 encoding is longer than the max length > 32766), all of whic
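
The exception quoted above concerns a single term whose UTF8 encoding is longer than 32766 bytes. A minimal client-side check, sketched below, could flag such values before they are sent for indexing. It assumes the whole field value becomes one term, as the thread describes for a KeywordTokenizer-based type; the function names are hypothetical.

```python
MAX_TERM_BYTES = 32766  # limit quoted in the error message

def utf8_length(value: str) -> int:
    """Return the length of the value's UTF-8 encoding in bytes."""
    return len(value.encode("utf-8"))

def is_immense_term(value: str) -> bool:
    """True if the value would exceed the limit as a single term."""
    return utf8_length(value) > MAX_TERM_BYTES

# The limit is in bytes, not characters: a two-byte character such as
# "é" reaches it with half as many characters as ASCII text.
ascii_at_limit = "a" * 32766
ascii_over_limit = "a" * 32767
accented_over_limit = "é" * 16384
```

Because the limit applies to encoded bytes, checking `len(value)` alone would miss non-ASCII text that is under 32766 characters but over 32766 bytes.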

Re: Error when indexing against a specific dynamic field type

2018-05-01 Thread Steve Rowe
The input in the error message starts “lorem ipsum”, so it contains spaces, but the alphaOnlySort field type (in Solr’s example schemas anyway) uses KeywordTokenizer, which tokenizes the entire input as a single token. As Erick implied, you maybe should not be doing that with this kind of data -

Re: Load Balancing between Two Cloud Clusters

2018-05-01 Thread Erick Erickson
Glad to help. Yeah, I thought you might have been making it harder than it needed to be ;). In SolrCloud you're constantly running up against "it's just magic until it's not", knowing when magic applies and when it doesn't can be tricky, very tricky. Basically when using LBs, people just thro

User queries end up in filterCache if facetting is enabled

2018-05-01 Thread Markus Jelsma
Hello, We noticed that the filterCache held more entries than we expected. Using showItems="1024", something unexpected was listed among the filterCache entries: the complete Query.toString() of our user queries, massive entries, and a lot of them. We also spotted all entries of fiel

Re: Error when indexing against a specific dynamic field type

2018-05-01 Thread Erick Erickson
You're sending it a huge term. My guess is you're sending something like base64-encoded data or perhaps just a single unbroken string in your field. Examine your document, it should jump out at you. Best, Erick On Tue, May 1, 2018 at 7:40 AM, THADC wrote: > Hello, > > We are migrating from solr

Re: SolrCloud Heterogenous Hardware setup

2018-05-01 Thread Erick Erickson
"Is it possible to configure a collection such that the collection data is only stored on few nodes in the SolrCloud setup?" Yes. There are "node placement rules", but also you can create a collection with a createNodeSet that specifies the nodes that the replicas are placed on. " If this is poss

SolrCloud Heterogenous Hardware setup

2018-05-01 Thread Greenhorn Techie
Hi, We are building a SolrCloud setup, which will index time-series data. Being time-series data with write-once semantics, we are planning to have multiple collections i.e. one collection per month. As per our use case, end users should be able to query across last 12 months worth of data, which

RE: 7.3 appears to leak

2018-05-01 Thread Markus Jelsma
Mạnh, Shalin, I tried to reproduce it locally but I failed; it is not just a stream of queries and frequent updates/commits. We will temporarily abuse a production machine to run 7.3 and a control machine on 7.2 to rule some things out. We have plenty of custom plugins, so when I can reproduce it

Error when indexing against a specific dynamic field type

2018-05-01 Thread THADC
Hello, We are migrating from solr 4.7 to 7.3. When I encounter a data item that matches a custom dynamic field from our 4.7 schema: ** , I get the following exception: *Exception writing document id FULL_36265 to the index; possible analysis error: Document contains at least one immense term in

Re: Load Balancing between Two Cloud Clusters

2018-05-01 Thread Monica Skidmore
Thank you, Erick. This is exactly the information I needed but hadn't correctly parsed as a new Solr cloud user. You've just made setting up our new configuration much easier!! Monica Skidmore Senior Software Engineer On 4/30/18, 7:29 PM, "Erick Erickson" wrote: "We need a way to d

ApacheCon North America 2018 schedule is now live.

2018-05-01 Thread Rich Bowen
Dear Apache Enthusiast, We are pleased to announce our schedule for ApacheCon North America 2018. ApacheCon will be held September 23-27 at the Montreal Marriott Chateau Champlain in Montreal, Canada. Registration is open! The early bird rate of $575 lasts until July 21, at which time it goe