Re: Learning to Rank (LTR) with grouping

2018-04-19 Thread Nitin Kumar
Can anybody please share in which Solr version LTR and grouping work together? On Wed, Apr 18, 2018 at 3:47 PM, Diego Ceccarelli (BLOOMBERG/ LONDON) < dceccarel...@bloomberg.net> wrote: > I just updated the PR to upstream - I still have to fix some things in > distributed mode, but unit tests

SolrCloud design question

2018-04-19 Thread Bernd Fehling
How would you set up a SolrCloud, and why? [ASCII diagram flattened in the archive: shard1, shard2, shard3 with replica cells r1, r2

Re: SolrCloud design question

2018-04-19 Thread Shawn Heisey
On 4/19/2018 6:28 AM, Bernd Fehling wrote: How would you set up a SolrCloud, and why? [ASCII diagram flattened in the archive: shard1, shard2, shard3 with replica cells r1

Re: Howto change log level with Solr Admin UI ?

2018-04-19 Thread Shawn Heisey
On 4/19/2018 12:32 AM, Bernd Fehling wrote: Would be cool if that would be possible, to change the log level for solr.log from the Admin UI. Imagine, a running system with problems, you can change log level and get more logging info into solr.log without restarting the system and overloading the

Re: SolrCloud design question

2018-04-19 Thread Chris Ulicny
Beyond failure/maintenance concerns, the first setup is not necessarily a good distribution of the hosts' resources. Depending on the use case, it could be much more prone to hot-spots in the cluster, especially if routing or manual sharding is involved. If for some reason, there are documents on

Re: need help on search on last name + middile initial

2018-04-19 Thread Wendy2
Hi Shawn, Thank you very much for your reply! Per your suggestion, I re-indexed the data after removing the stopword filter. It looks like Solr parsed the data correctly but didn't return any results. Is there anything else I could try? Thank you again! ===debugQuery Output= { "responseHe

Re: SolrCloud design question

2018-04-19 Thread Bernd Fehling
Hi Shawn, OK, got that. Would shuffling or shifting the replicas bring any benefit, or is it just wasted time? [ASCII diagram flattened in the archive: shard1 with replica cells r1, r2, r3

Re: How to protect middile initials during search

2018-04-19 Thread Wendy2
Hi Jay, Thank you very much for your reply! I re-indexed the data after removing the stopword filter. It looks like Solr parsed the data correctly but didn't return any results. Is there anything else I could try? Thank you again! ===debugQuery Output= { "responseHeader":{ "status":0,

docValue vs. analyzer

2018-04-19 Thread Uwe Reh
Hi, I'm stuck in a dead end. My task is to map individual ids, to group them. So far, so simple: * copyfield 'id' -> 'groupId' * use a SynonymFilter on 'groupId' Now, I had the idea to improve the performance of grouping with 'docValues'. Unfortunately, this leads to a contradiction: * docVal

Re: Specialized Solr Application

2018-04-19 Thread Terry Steichen
Thanks, Tim.  A couple of quick comments and a couple of questions: 1) the toughest pdfs to identify are those that are partly searchable (text) and partly not (image-based text).  However, I've found that such documents tend to exist in clusters. 2) email documents (.eml) are no

duplicate doc of uniqueKey

2018-04-19 Thread Novin Novin
Hi Guys, I ended up with duplicate docs in SolrCloud. I don't know how to debug it, so I'm looking for help here please. Details are below: Solr 6.6.2 zookeeper 3.4.10 Below is an example of a duplicate record in JSON: { "responseHeader":{ "zkConnected":true, "status":0, "QTime":0, "para

Re: duplicate doc of uniqueKey

2018-04-19 Thread Erick Erickson
Also ask for the _version_ field in your fl list. The _version_ field is used for optimistic locking. This is mostly a curiosity question. The only time I've ever seen something like this is if you, for instance, use MergeIndexes or MapReduceIndexerTool (which does a MergeIndexes under the cove
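
A minimal sketch of the suggestion above: build a /select request that asks for the uniqueKey and _version_ together, so that two copies of the "same" document can be compared. The host, the collection name "mycollection", the key value "doc-1" and the helper name build_version_query are illustrative assumptions, not taken from the thread.

```python
from urllib.parse import urlencode


def build_version_query(base_url, collection, unique_key_value, key_field="id"):
    """Return a /select URL that lists the uniqueKey and _version_ of matching docs.

    base_url, collection and key_field are placeholders for this sketch;
    substitute the values of your own installation.
    """
    params = {
        "q": f'{key_field}:"{unique_key_value}"',  # match one uniqueKey value
        "fl": f"{key_field},_version_",            # return only the key and its version
        "wt": "json",
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"


url = build_version_query("http://localhost:8983/solr", "mycollection", "doc-1")
print(url)
```

If the same uniqueKey comes back twice, the _version_ values returned for each copy give a way to tell the two documents apart.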

Re: SolrCloud cluster does not accept new documents for indexing

2018-04-19 Thread Erick Erickson
Have you changed any of the merge policy parameters? I doubt it but just asking. My guess: your I/O is your bottleneck. There are a limited number of threads (tunable) that are used for background merging. When they're all busy, incoming updates are queued up. This squares with your statement that

Re: docValue vs. analyzer

2018-04-19 Thread Erick Erickson
I haven't poked into the details, but (recently, very recently, 7.3) there's a SortableTextField that may be useful in this situation. Otherwise you could use a FieldMutatingUpdateProcessorFactory or perhaps a ScriptUpdateProcessor to manipulate the fields on the way in. Not quite sure how you could
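
As an illustration of "manipulating the fields on the way in", here is a hedged sketch that resolves the group id on the client before the document is sent, rather than in an update processor. This is a substitute technique, not the one named in the reply. The field names "id" and "groupId" come from the original question; the mapping contents and the helper name add_group_id are hypothetical.

```python
def add_group_id(doc, id_to_group, id_field="id", group_field="groupId"):
    """Return a copy of doc with group_field resolved from a lookup table.

    Ids that have no entry in id_to_group form their own group, which mirrors
    a synonym mapping that leaves unknown terms unchanged.
    """
    out = dict(doc)  # do not mutate the caller's document
    out[group_field] = id_to_group.get(doc[id_field], doc[id_field])
    return out


# Hypothetical mapping: ids "a1" and "a2" belong to group "A".
mapping = {"a1": "A", "a2": "A"}
docs = [add_group_id({"id": i}, mapping) for i in ("a1", "a2", "b7")]
print(docs)
```

Because the group id is already final when the document arrives, the target field needs no analysis chain, which is the precondition for the docValues idea raised in the question.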

Re: SolrCloud cluster does not accept new documents for indexing

2018-04-19 Thread Denis Demichev
Erick, Thank you for your quick response. I/O bottleneck: Please see another screenshot attached; as you can see, disk r/w operations are pretty low or not significant. iostat== Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svc

Re: duplicate doc of uniqueKey

2018-04-19 Thread Novin Novin
Hi Erick, I haven't merged any indexes with MergeIndexes or MapReduceIndexerTool. Actually, I found that one of the docs does not have a child doc; I am using Solr parent/child docs for block join queries. As far as I know, it is a known issue for parent/child docs that if you send only parent

Re: PF, PF2, PF3 clauses missing in solr7 with query-time synonyms?

2018-04-19 Thread Elizabeth Haubert
An update on this: The problem occurs on phrase queries, using edismax, where the term in the nested query contains a multi-word synonym. In the example above, dog has a multiterm synonym "canis familiaris", and aspirin has "acetylsalicylic acid". Creating a JIRA ticket. Thank you, Elizabeth

Re: duplicate doc of uniqueKey

2018-04-19 Thread Karthik Ramachandran
Novin, Was your system time moved to a future time and then reset to the current time? Solr will add the new document and will send a delete for the old document, but there will be no document matching the criteria. On Thu, Apr 19, 2018 at 1:10 PM, Novin Novin wrote: > Hi Erick, > > I haven't done an

Re: SolrCloud cluster does not accept new documents for indexing

2018-04-19 Thread Mikhail Khludnev
Threads are hanging on merge I/O throttling at org.apache.lucene.index.MergePolicy$OneMergeProgress.pauseNanos(MergePolicy.java:150) at org.apache.lucene.index.MergeRateLimiter.maybePause(MergeRateLimiter.java:148) at org.apache.lucene.index.MergeRateLimiter.pause(MergeRa

Re: custom response writer which extends RawResponseWriter fails when shards > 1

2018-04-19 Thread Lee Carroll
Hi, I rewrote all of my tests to use SolrCloudTestCase rather than SolrTestCaseJ4 and was able to replicate the responsewriter issue and debug with a sharded collection. It turned out the issue was not with my response writer really but rather my config. content In clo

Re: SolrCloud cluster does not accept new documents for indexing

2018-04-19 Thread Denis Demichev
Mikhail, I see what you're saying. Thank you for the clarification. Yes, there's no single line in the client code that contains a commit statement. The only thing I do: solr.add(collectionName, dataToSend); where solr is a SolrClient. Autocommits are set up on the server side for 2 minutes and th

Re: custom response writer which extends RawResponseWriter fails when shards > 1

2018-04-19 Thread Mikhail Khludnev
what if you put it into "defaults"? On Thu, Apr 19, 2018 at 8:42 PM, Lee Carroll wrote: > Hi, > > I rewrote all of my tests to use SolrCloudTestCase rather than > SolrTestCaseJ4 > and was able to replicate the responsewriter issue and debug with a sharded > collection. It turned out the issue wa

Re: duplicate doc of uniqueKey

2018-04-19 Thread Novin Novin
Hi Karthik, *Was your system time moved to future time and then was reset to current* *time?* Nothing like this happened, as far as I know. Thanks in advance Novin On Thu, 19 Apr 2018 at 18:26 Karthik Ramachandran wrote: > Novin, > > Was your system time moved to future time and then was reset

Re: duplicate doc of uniqueKey

2018-04-19 Thread Erick Erickson
Right, parent/child docs _must_ be treated as a block. By that I mean you cannot add/delete individual child docs and/or parent docs. That's one of the limitations of parent/child blocks and I don't know of any plans to change that. Best, Erick On Thu, Apr 19, 2018 at 11:14 AM, Novin Novin wrot
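
A minimal sketch of what "treated as a block" means on the client side: whenever a parent changes, the parent and all of its children are sent again together in one update. The sketch only builds the JSON body. The "_childDocuments_" key is the one Solr's JSON update format uses for nested documents; the field names, the document values, and the helper names build_block and update_payload are illustrative assumptions.

```python
import json


def build_block(parent, children):
    """Return one nested document: the parent with all its children attached."""
    block = dict(parent)  # copy so the caller's dict is not modified
    block["_childDocuments_"] = [dict(child) for child in children]
    return block


def update_payload(parent, children):
    """Serialize a whole parent/child block as a JSON update body."""
    return json.dumps([build_block(parent, children)])


# Hypothetical documents; always resend every child along with the parent.
body = update_payload(
    {"id": "p1", "type": "parent"},
    [{"id": "c1", "type": "child"}, {"id": "c2", "type": "child"}],
)
print(body)
```

Sending the parent alone, as described earlier in this thread, is exactly the case this sketch avoids.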

Re: custom response writer which extends RawResponseWriter fails when shards > 1

2018-04-19 Thread Lee Carroll
Default works. However, in the case (which is maybe my case) where the request handler implies a response type and really should be locked down to prevent abuse or error, you could argue an invariant is needed. I guess it's also not very elegant having an arbitrary rule, no wt as invariant in cloud mode etc.

Re: SolrCloud cluster does not accept new documents for indexing

2018-04-19 Thread Erick Erickson
When all indexing threads are occupied merging, incoming updates block until at least one thread frees up IIUC. The fact that you're not opening searchers doesn't matter as far as merging is concerned, that happens regardless on hard commits. Bumping your ram buffer up to 2G is usually unnecessar

Re: custom response writer which extends RawResponseWriter fails when shards > 1

2018-04-19 Thread Mikhail Khludnev
You can introduce your own searchHandler with a wt invariant. When the aggregator queries the slaves, it will use the regular /select with default wt=javabin (it's controlled by shards.qt, btw). Providing such comprehensive application logic on top of neat solrconfig.xml is not the best idea, though. On Thu, Apr 19, 20

Re: Performance & CPU Usage of 6.2.1 vs 6.5.1 & above

2018-04-19 Thread Deepak Goel
Deepak On Thu, Apr 19, 2018 at 9:23 AM, mganeshs wrote: > Hello Deepak, > > We are not querying

Re: duplicate doc of uniqueKey

2018-04-19 Thread Novin Novin
Thanks Erick and Karthik for your help. On Thu, 19 Apr 2018 at 19:53 Erick Erickson wrote: > Right, parent/child docs _must_ be treated as a block. By that I mean > you cannot add/delete individual child docs and/or parent docs. > That's one of the limitations of parent/child blocks and I don't