Re: Missed update on replica

2016-04-29 Thread Mike Wartes
I should add that this is on Solr 5.1.0. On Thu, Apr 28, 2016 at 2:42 PM, Mike Wartes wrote: > I have a three node, one shard SolrCloud cluster. > > Last week one of the nodes went out of sync with the other two and I'm > trying to understand why that happened. > > After poking through my logs a

Error - Too many close [count:-1]

2016-04-29 Thread Vipul Gupta
Solr team - Any pointers on fixing this issue? [10:29:08] ERROR 0-thread-7 o.a.s.c.SolrCore <> Too many close [count:-1] on org.apache.solr.core.SolrCore@3d6f8ad3. Please report this exception to solr-user@lucene.apache.org

Re: deactivate coord scoring factor in pf2 pf3

2016-04-29 Thread Doug Turnbull
I was wrong, Elisabeth. I thought you could disable coord at query time in Solr, but it turns out you can't (I was thinking of Lucene's BooleanQuery disableCoord param). https://issues.apache.org/jira/browse/SOLR-3931 I definitely know you can disable coord with a custom Similarity and just return 1.0 fo
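
To illustrate what returning 1.0 from coord changes, here is a minimal Python sketch of the arithmetic only. It is a simplified model, not Lucene code: the function names and the flat sum of clause scores are assumptions made for illustration, and the real Similarity is a Java class. The classic coord factor is the fraction of optional clauses that matched.

```python
def coord(overlap: int, max_overlap: int) -> float:
    """Classic coord factor: matched clauses divided by total clauses."""
    return overlap / max_overlap


def no_coord(overlap: int, max_overlap: int) -> float:
    """What a custom similarity would return to neutralise coord."""
    return 1.0


def score(clause_scores, coord_fn):
    """Toy score: sum of matching clause scores, scaled by the coord factor.

    A clause score of 0 stands for a clause that did not match.
    """
    matched = [s for s in clause_scores if s > 0]
    return sum(matched) * coord_fn(len(matched), len(clause_scores))


if __name__ == "__main__":
    clauses = [1.0, 1.0, 0.0, 0.0]  # two of four clauses matched
    print(score(clauses, coord))     # scaled down by 2/4
    print(score(clauses, no_coord))  # unscaled
```

With coord enabled, a document matching half of the clauses has its score halved; with the factor fixed at 1.0, only the matching clauses' own scores count.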

Re: Idle timeout expired: 50000/50000 ms

2016-04-29 Thread Robert Brown
Thanks Shawn, I'm definitely not looking to just up the timeout; like you say, there's a bigger issue to be resolved. My indexes are between 1m and 60m docs (30m per shard, ~70GB on disk each). All of these collections get completely refreshed at least once a day, data may not ac

Re: Decide on facets from results

2016-04-29 Thread Mark Robinson
Thanks for the suggestion Joe. I will check on it. Thanks! Mark. On Fri, Apr 29, 2016 at 11:56 AM, Joel Bernstein wrote: > Check out the new docs for the gatherNodes streaming expression. It allows > you to aggregate and then use those aggregates as input for another > expression. You can even

Schema API

2016-04-29 Thread Hendrik Haddorp
Hi, I have a Solr Cloud 6 setup with a managed schema. It seems like when I create multiple collections from the same config set that they still share the same schema. That was rather unexpected, as in the REST and SolrJ API I do specify a collection when doing the schema change. Looking into what

Re: Idle timeout expired: 50000/50000 ms

2016-04-29 Thread Shawn Heisey
On 4/28/2016 3:13 PM, Robert Brown wrote: > I operate several collections (about 7-8) all using the same 5-node > ZooKeeper cluster. They've been in production for 3 months, with only > 2 previous issues where a Solr node went down. > > Tonight, during several updates to the various collections, a

Re: issues doing a spatial query

2016-04-29 Thread Erick Erickson
Where is the doc that's "somewhere online"? One of the issues I face constantly is knowing what _current_ information is. Lots of posts out there are perfectly correct at the time they were written, but haven't been updated. Best, Erick On Fri, Apr 29, 2016 at 6:16 AM, GW wrote: > I realise the

Re: Set router.field in unit tests

2016-04-29 Thread Alan Woodward
It's almost certainly worth using SolrCloudTestBase rather than AbstractDistribZkTestBase as well - normally makes the test five or six times faster. Alan Woodward www.flax.co.uk On 29 Apr 2016, at 17:11, Erick Erickson wrote: > I'm pretty sure you can just create a collection after the distr

Re: Questions on SolrCloud core state, when will Solr recover a "DOWN" core to "ACTIVE" core.

2016-04-29 Thread Don Bosco Durai
Hi Li, I got into a situation very similar to yours. The GC was taking much longer than the configured ZooKeeper timeout. I had 3 nodes in the SolrCloud cluster and very often I would have my entire cluster totally messed up. Increasing the ZooKeeper timeout eventually helped. But before that, I was able

Re: Set router.field in unit tests

2016-04-29 Thread Erick Erickson
I'm pretty sure you can just create a collection after the distributed stuff is set up. Take a look at: CollectionsAPIDistributedZkTest.testNodesUsedByCreate to see creating a collection in your test just by a request (you can set any params you want there, including router.field). Or Collection
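
For readers outside the Java test framework, the request-based approach can be sketched as a plain Collections API URL. This is a minimal sketch: the base URL, collection name, and field name below are hypothetical placeholders, and only the action, name, numShards, and router.field parameters are used.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, router_field):
    """Build a Collections API CREATE request that sets router.field."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "router.field": router_field,  # field used for document routing
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host, collection, and field names.
    print(create_collection_url(
        "http://localhost:8983/solr", "test_collection", 2, "route_key"))
```

The sketch only builds the URL; sending it, and supplying a config set, depends on the test setup.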

Re: Decide on facets from results

2016-04-29 Thread Joel Bernstein
Check out the new docs for the gatherNodes streaming expression. It allows you to aggregate and then use those aggregates as input for another expression. You can even do this across collections. https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=62693238 This is slated for Solr 6.1

Re: Facet ignoring repeated word

2016-04-29 Thread Erick Erickson
That's the way faceting is designed to work. It counts the _documents_ that a term appears in that satisfy your query; if a word appears multiple times in a doc, it'll only count it once. For the general use-case it'd be unsettling for a user to see a facet count of 500, then click on it and disco
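
The difference between counting documents and counting occurrences can be shown with a small Python sketch. It is a toy model with whitespace tokenisation, not how Solr computes facets internally, and the function names are made up for illustration.

```python
from collections import Counter


def facet_counts(docs):
    """Count, per term, the number of documents that contain it."""
    counts = Counter()
    for text in docs:
        # set() collapses repeated words, so each doc counts once per term
        counts.update(set(text.lower().split()))
    return counts


def occurrence_counts(docs):
    """Count, per term, every occurrence across all documents."""
    counts = Counter()
    for text in docs:
        counts.update(text.lower().split())
    return counts


if __name__ == "__main__":
    docs = ["solr solr search", "solr index"]
    print(facet_counts(docs)["solr"])       # document count
    print(occurrence_counts(docs)["solr"])  # total occurrences
```

A word cloud that needs total occurrences corresponds to the second function, which is why facet counts alone do not give that number.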

RE: dataimport db-data-config.xml

2016-04-29 Thread Davis, Daniel (NIH/NLM) [C]
Kishor, Data Import Handler doesn't know how to randomly access rows from the CSV to "JOIN" them to rows from the MySQL table at indexing time. However, both MySQL and Solr know how to JOIN rows/documents from multiple tables/collections/cores. Data Import Handler could read the CSV first, and
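
One alternative to doing the join inside Data Import Handler is to merge the two sources on the client before indexing. The Python sketch below shows that swapped-in technique, not the approach described in the message above: the CSV is loaded into a dictionary keyed by id, then each database row is enriched from it. The column names are hypothetical, and the database rows are plain dictionaries standing in for a MySQL cursor.

```python
import csv
import io


def index_csv_by_id(csv_text, id_column="id"):
    """Load CSV rows into a dict keyed by id for random access."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[id_column]: row for row in reader}


def merge_rows(db_rows, csv_by_id, id_column="id"):
    """Add CSV columns to each DB row whose id appears in the CSV."""
    merged = []
    for row in db_rows:
        doc = dict(row)
        extra = csv_by_id.get(str(row[id_column]))  # CSV ids are strings
        if extra:
            for key, value in extra.items():
                doc.setdefault(key, value)  # DB values win on conflicts
        merged.append(doc)
    return merged


if __name__ == "__main__":
    csv_text = "id,color\n1,red\n2,blue\n"
    db_rows = [{"id": 1, "name": "a"}, {"id": 3, "name": "c"}]
    for doc in merge_rows(db_rows, index_csv_by_id(csv_text)):
        print(doc)
```

The merged dictionaries could then be sent to Solr as documents by whatever client is in use; that step is left out here.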

Phrases and edismax

2016-04-29 Thread Mark Robinson
Hi, q=productType:(two piece bathtub white) &defType=edismax&pf=productType^20.0&qf=productType^15.0 In the debug section this is what I see:- (+(productType:two productType:piec productType:bathtub productType:white) DisjunctionMaxQuery((productType:"piec bathtub white"^20.0)))/no_coord My qu

Re: Questions on SolrCloud core state, when will Solr recover a "DOWN" core to "ACTIVE" core.

2016-04-29 Thread Erick Erickson
Well, there have been lots of improvements since 4.6. You're right, logically when things come back up and are all reachable, it seems like it is theoretically possible to bring a node back up. There have been situations where that doesn't happen, and various fixes have been implemented to fix them

Re: Solr 5.2.1 on Java 8 GC

2016-04-29 Thread Nick Vasilyev
Not sure if it helps anyone, but I am seeing decent results with the following. It was mostly a result of trial and error, I am not familiar with Java GC or even Java itself. I added my interpretation of what was happening, but I am not sure if it is right, take it for what it's worth. It'd be nic

Re: Tuning solr for large index with rapid writes

2016-04-29 Thread Erick Erickson
Good luck! You have one huge advantage when doing prototyping, you can mine your current logs for real user queries. It's actually surprisingly difficult to generate, say, 10,000 "realistic" queries. And IMO you need something approaching that number to ensure that your queries don't hit the cac

Re: Decide on facets from results

2016-04-29 Thread Mark Robinson
Thanks much everyone! Appreciate your responses. Best, Mark On Thu, Apr 28, 2016 at 10:52 AM, Jay Potharaju wrote: > On the same lines as Erik suggested but using facet stats instead. you can > get stats on your facet fields in the first pass and then include the > facets that you need in the s

Re: Facet ignoring repeated word

2016-04-29 Thread Ahmet Arslan
Hi, Depending on your requirements, StatsComponent, TermsComponent, LukeRequestHandler can also be used. https://cwiki.apache.org/confluence/display/solr/The+Terms+Component https://wiki.apache.org/solr/LukeRequestHandler https://cwiki.apache.org/confluence/display/solr/The+Stats+Component Ahme

Re: Many to Many Mapping with Solr

2016-04-29 Thread Joel Bernstein
We really still need to know more about your use case. In particular what types of questions will you be asking of the data? It's useful to do this in plain English without mapping to any specific implementation. Joel Bernstein http://joelsolr.blogspot.com/ On Fri, Apr 29, 2016 at 9:43 AM, Alexa

Re: Many to Many Mapping with Solr

2016-04-29 Thread Alexandre Rafalovitch
You do not structure Solr to represent your database. You structure it to represent what you will search. In your case, it sounds like you want to return 'user-records', in which case you will index the related information all together. Yes, you will possibly need to recreate the multiple document

Re: issues doing a spatial query

2016-04-29 Thread GW
I realise the world wrap thing, but it is correct ~ they are coordinates taken from Google Maps. It does not really matter though. I switched the query to use geofilt and everything is fine. Here's the kicker. There is a post somewhere online that says you cannot use geofilt with multivalued loc
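
For reference, a geofilt filter takes a spatial field (sfield), a centre point (pt, as lat,lon), and a distance in kilometres (d). The Python sketch below only assembles the request parameters; the field name and coordinates are hypothetical, and nothing here is taken from the poster's actual query.

```python
from urllib.parse import urlencode


def geofilt_params(field, lat, lon, distance_km, query="*:*"):
    """Build q/fq parameters for a geofilt spatial filter."""
    fq = f"{{!geofilt sfield={field} pt={lat},{lon} d={distance_km}}}"
    return {"q": query, "fq": fq}


if __name__ == "__main__":
    # Hypothetical field name and coordinates.
    params = geofilt_params("location", 45.15, -93.85, 5)
    print(params["fq"])
    print(urlencode(params))  # URL-encoded form for a select request
```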

Re: Set router.field in unit tests

2016-04-29 Thread GW
Not exactly sure what you mean but I think you are wanting to change your schema.xml to restart solr On 29 April 2016 at 06:04, Markus Jelsma wrote: > Hi - any hints to share? > > Thanks! > Markus > > > > -Original message- > > From:Markus Jelsma > > Sent: Thursday 28th April 20

Many to Many Mapping with Solr

2016-04-29 Thread Sandeep Mestry
Hi All, Hope the day is going well for you. This question has been asked before, but I couldn't find an answer to my specific request. I have a many-to-many relationship and the mapping table has additional columns. What's the best way I can model this into a Solr entity? For example: a user has many

RE: Set router.field in unit tests

2016-04-29 Thread Markus Jelsma
Hi - any hints to share? Thanks! Markus -Original message- > From:Markus Jelsma > Sent: Thursday 28th April 2016 13:30 > To: solr-user > Subject: Set router.field in unit tests > > Hi - I'm working on a unit test that requires the cluster's router.field to > be set to a field diff

dataimport db-data-config.xml

2016-04-29 Thread kishor
I want to import data from a MySQL table and a CSV file at the same time because some data are in MySQL tables and some are in a CSV file. I want to match a specific id from the MySQL table in the CSV file and then add the data to Solr. What I think or want to do

Facet ignoring repeated word

2016-04-29 Thread G, Rajesh
Hi, I am trying to implement word cloud

Re: solr | backup and restoration

2016-04-29 Thread Jan Verweij - Reeleez
Hi Prateek, To me it feels like the backup/restore is still an open item that should be higher on the agenda. Yes, there are work-arounds like copying data from/into the index folder but this doesn't seem very stable. I'm using the following approach in SolrCloud since I ran into an issue with restor

relaxed vs. improved validation in solr.TrieDateField

2016-04-29 Thread Uwe Reh
Hi, doing some migration tests (4.10 to 6.0) I noticed an improved validation of TrieDateField. Syntactically correct but impossible days are rejected now. (stack trace at the end of the mail) Examples: - '1997-02-29T00:00:00Z' - '2006-06-31T00:00:00Z' - '2000-00-00T00:00:00Z' The first two d
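
Values like these can be screened before indexing with a strict calendar check. The Python sketch below is a minimal example, assuming only the simple second-resolution form shown in the examples; it does not cover the rest of Solr's date syntax (fractional seconds, date math), and the function name is made up for illustration.

```python
from datetime import datetime


def is_valid_solr_date(value: str) -> bool:
    """Return True if value is a real calendar date in YYYY-MM-DDThh:mm:ssZ form."""
    try:
        datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
        return True
    except ValueError:
        # Raised for impossible days (Feb 29 in a non-leap year, June 31)
        # and for out-of-range fields such as month 00 or day 00.
        return False


if __name__ == "__main__":
    for candidate in ("1997-02-29T00:00:00Z",
                      "2006-06-31T00:00:00Z",
                      "2000-00-00T00:00:00Z",
                      "1996-02-29T00:00:00Z"):
        print(candidate, is_valid_solr_date(candidate))
```

Running such a check over the source data before migrating would list the records that the stricter validation will reject.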