Numerous problems with SolrCloud

2015-12-21 Thread John Smith
This is my first experience with SolrCloud, so please bear with me. I've inherited a setup with 5 servers, 2 of which are ZooKeeper only and the other 3 SolrCloud + ZooKeeper. Versions are 5.4.0 and 3.4.7 respectively. There's around 80 GB of index; some collections are rather big (20 GB) and some v

Re: Numerous problems with SolrCloud

2015-12-21 Thread John Smith
or Solutions Architect > http://www.lucidworks.com <http://www.lucidworks.com/> > > > >> On Dec 21, 2015, at 10:37 AM, John Smith wrote: >> >> This is my first experience with SolrCloud, so please bear with me. >> >> I've inherited a setup with

Re: Numerous problems with SolrCloud

2015-12-21 Thread John Smith
'd look first for any help in understanding > the root cause of nodes going into recovery. > > Best, > Erick > > On Mon, Dec 21, 2015 at 8:04 AM, John Smith wrote: >> Thanks, I'll have a try. Can the load on the Solr servers impair the zk >> response time in t

Re: Numerous problems with SolrCloud

2015-12-22 Thread John Smith
over time and outgrew its host, but that's a guess. > > And you get to deal with it over the holidays too ;) > > Best, > Erick > > On Mon, Dec 21, 2015 at 8:33 AM, John Smith wrote: >> OK, great. I've eliminated OOM errors after increasing the memory >> allocat

More problems (now jetty errors) with SolrCloud

2016-01-22 Thread John Smith
Hi, This morning one of the 2 nodes of our SolrCloud went down. I've tried many ways to recover it but to no avail. I've tried to unload all cores on the failed node and reload it after emptying the data directory, hoping it would sync from scratch. The core is still marked as down and no data is

Re: Boosts for relevancy (shopping products)

2016-03-19 Thread John Smith
Hi, For once I might be of some help: I've had a similar configuration (large set of products from various sources). It's very difficult to find the right balance between all parameters and requires a lot of tweaking, most often in the dark unfortunately. What I've found is that omitNorms=true is

Actual (specific) RT Search?

2016-03-20 Thread John Smith
Hi, The purpose of the project is an actual RT Search, not NRT, but with a specific condition: when an updated document meets a fixed criterion, it should be filtered out from future results (no reuse of the document). This criterion is present in the search query but of course doesn't work for unco

Unexpected delayed document deletion with atomic updates

2015-10-07 Thread John Smith
Hi, I've run into the following problem with update XML messages. The idea is to record the number of clicks for a document: each time, a message such as this one is sent to .../update: abc 1 1.05 (Clicks is an int field; Boost is a float field that is updated to reflect the change in popular
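
For illustration, an atomic update message of the kind described in this post might be built as in the sketch below. The markup of the original message did not survive in this archive, so this is not the author's actual message: the unique-key field name `Id`, the helper name `build_click_update`, and the choice of the `inc` and `set` modifiers are assumptions. Only the values abc, 1 and 1.05 and the field names Clicks and Boost come from the post.

```python
import xml.etree.ElementTree as ET


def build_click_update(doc_id, clicks_inc, new_boost, id_field="Id"):
    """Build a Solr atomic-update XML message (hypothetical field layout)."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    # Unique key of the document to update; the field name "Id" is an assumption.
    ET.SubElement(doc, "field", name=id_field).text = str(doc_id)
    # "inc" asks Solr to add the given amount to the stored numeric value
    # (assumed modifier for the int field Clicks).
    ET.SubElement(doc, "field", name="Clicks", update="inc").text = str(clicks_inc)
    # "set" asks Solr to replace the stored value
    # (assumed modifier for the float field Boost).
    ET.SubElement(doc, "field", name="Boost", update="set").text = str(new_boost)
    return ET.tostring(add, encoding="unicode")


# Values taken from the post: document "abc", one click, boost 1.05.
message = build_click_update("abc", 1, 1.05)
print(message)
```

The resulting string would be sent as the body of the request to .../update; how and when it is committed is left out of the sketch.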

Re: Unexpected delayed document deletion with atomic updates

2015-10-07 Thread John Smith
>> >> Perhaps looking at the solr log would help too... >> >> Best, >> Erick >> >> On Wed, Oct 7, 2015 at 6:32 AM, John Smith >> wrote: >>> Hi, >>> >>> I'm bumping on the following problem with update X

Re: Unexpected delayed document deletion with atomic updates

2015-10-07 Thread John Smith
ll; Closing out SolrRequest: {wt=json&commit=true&update.chain=dedupe} The update.chain parameter wasn't part of the original request, and "dedupe" looks suspicious to me. Perhaps I should investigate further there? Thanks, John. On 08/10/15 08:25, John Smith wrote: >

Re: Unexpected delayed document deletion with atomic updates

2015-10-08 Thread John Smith
RC in the techproducts sample > configs. > > Perhaps you uncommented it to use your own update processors, but didn't > remove that component? > > On Thu, Oct 8, 2015, at 07:38 AM, John Smith wrote: >> Oh, I forgot Erick's mention of the logs: there's nothing unusua

Re: Unexpected delayed document deletion with atomic updates

2015-10-08 Thread John Smith
logging) in the data import handler. Is there an easy way to do this? Conceptually, shouldn't the update chain be callable from the data import process - maybe it is? John On 08/10/15 09:43, Upayavira wrote: > Yay! > > On Thu, Oct 8, 2015, at 08:38 AM, John Smith wrote: >> Yes ind

Re: Unexpected delayed document deletion with atomic updates

2015-10-08 Thread John Smith
update.chain value. > > I have no idea how you would then reference that in the DIH - I've never > really used it. > > Upayavira > > On Thu, Oct 8, 2015, at 09:25 AM, John Smith wrote: >> After some further investigation, for those interested: the >> SignatureUpd

Re: Unexpected delayed document deletion with atomic updates

2015-10-11 Thread John Smith
ue for that field to > "1" . > So a document with 1 click will be considered equal to one with 1000 clicks. > My 2 cents > > Cheers > > On 8 October 2015 at 14:10, John Smith wrote: > >> Well, every day we update a lot of documents (usually several millions)

Best way to backup and restore an index for a cloud setup in 4.6.1?

2015-05-08 Thread John Smith
All, With a cloud setup for a collection in 4.6.1, what is the most elegant way to back up and restore an index? We are specifically looking at the case of a full reindex, with the idea of building an index on one set of servers, backing up the index, and then restoring that ba

Creating a collection with 1 shard gives a weird range

2016-05-17 Thread John Smith
I'm trying to create a collection starting with only one shard (numShards=1) using a compositeID router. The purpose is to start small and begin splitting shards when the index grows larger. The shard created gets a weird range value: 8000-7fff, which doesn't look effective. Indeed, if a tr

Re: Creating a collection with 1 shard gives a weird range

2016-05-18 Thread John Smith
On 17/05/16 11:56, Tom Evans wrote: > On Tue, May 17, 2016 at 9:40 AM, John Smith wrote: >> I'm trying to create a collection starting with only one shard >> (numShards=1) using a compositeID router. The purpose is to start small >> and begin splitting shards when t

parent/child rows in solr

2018-09-07 Thread John Smith
Hi, I have a document structure like this (this is a made up schema, my data has nothing to do with departments and employees, but the structure holds true to my real data): department 1 employee 11 employee 12 employee 13 room 11 room 12 room 13 department 2 employee

Re: parent/child rows in solr

2018-09-07 Thread John Smith
umns. On Fri, Sep 7, 2018 at 9:32 PM Shawn Heisey wrote: > On 9/7/2018 3:06 PM, John Smith wrote: > > Hi, I have a document structure like this (this is a made up schema, my > > data has nothing to do with departments and employees, but the structure > > holds true to my rea

Re: parent/child rows in solr

2018-09-11 Thread John Smith
> > On 9/7/2018 7:44 PM, John Smith wrote: > > Thanks Shawn, for your comments. The reason why I don't want to go flat > > file structure, is due to all the wasted/duplicated data. If a department > > has 100 employees, then it's very wasteful in terms of disk spa

Re: parent/child rows in solr

2018-09-11 Thread John Smith
On Tue, Sep 11, 2018 at 9:32 PM Shawn Heisey wrote: > On 9/11/2018 7:07 PM, John Smith wrote: > > header: 223,580 > > > > child1: 124,978 > > child2: 254,045 > > child3: 127,917 > > child4:1,009,030 > > child5:

Re: parent/child rows in solr

2018-09-11 Thread John Smith
On Tue, Sep 11, 2018 at 11:00 PM Shawn Heisey wrote: > On 9/11/2018 8:35 PM, John Smith wrote: > > The problem is that the math isn't a simple case of adding up all the row > > counts. These are "left outer join"s. In sql, it would be this query: > > I think w

Re: parent/child rows in solr

2018-09-11 Thread John Smith
On Tue, Sep 11, 2018 at 11:05 PM Walter Underwood wrote: > Have you tried modeling it with multivalued fields? > > That's an interesting idea, but I don't think that would work. We would lose the concept of "rows". So let's say child1 has col "a" and col "b", both are turned into multi-value fiel

statistics in hitlist

2018-02-23 Thread John Smith
I'm using solr, and enabling stats as per this page: https://lucene.apache.org/solr/guide/6_6/the-stats-component.html I want to get more stat values though. Specifically I'm looking for r-squared (coefficient of determination). This value is not present in solr, however some of the pieces used to

Re: statistics in hitlist

2018-02-23 Thread John Smith
Joel Bernstein wrote: > Typically SSE is the sum of the squared errors of the prediction in a > regression analysis. The stats component doesn't perform regression, > although it might be a nice feature. > > > > Joel Bernstein > http://joelsolr.blogspot.com/ > > On F

Re: statistics in hitlist

2018-03-01 Thread John Smith
values in fieldA are stored in the variable "b". The > values in fieldB are stored in variable "c". > > Then the regress function performs a simple linear regression on arrays > stored in variables "b" and "c". > > The output of the regress functio

Re: statistics in hitlist

2018-03-05 Thread John Smith
> > > > I'll check to see if the default schema used with solr start -c has this > > field, if not I'll add it. Thanks for pointing this out. > > > > I checked and right now the random expression is only accepting one fq, > > but I consider thi

Re: statistics in hitlist

2018-03-15 Thread John Smith
o="true", > > a=random(tx_prod_production, q="*:*", fq="isParent:true", rows="15", > > fl="oil_first_90_days_production,oil_last_30_days_production"), > > b=col(a, oil_first_90_days_production)) > > > > >

Re: statistics in hitlist

2018-03-16 Thread John Smith
the let expression which variables to output. > > > > > > > > Joel Bernstein > > http://joelsolr.blogspot.com/ > > > > On Thu, Mar 15, 2018 at 3:13 PM, Erick Erickson > > > wrote: > > > >> What does the fq clause look like? > >>