Re: Solr Slave failed to initialize collection

2018-05-04 Thread Shawn Heisey
On 5/4/2018 4:29 AM, Aji Viswanadhan wrote: > Caused by: org.apache.lucene.index.IndexNotFoundException: no segments* file > found in > LockValidatingDirectoryWrapper(NRTCachingDirectory(MMapDirectory@D:\Solr8.2\solr-5.5.4\solr-5.5.4\server\solr\collection_web_index\data\index.20180428184955635 > l

Re: Learning to Rank (LTR) with grouping

2018-05-04 Thread ilayaraja
Also, I would like to understand the ways to optimize search-time performance with LTR. Queries with terms (that fetch more results) lead to very high latency with the re-rank query, even for reRankDocs=24. Are there best practices to reduce the latency? Can the fv cache help? Shou

Re: Regarding Solr Admin "LoadTermInfo" section

2018-05-04 Thread Alexandre Rafalovitch
The underlying Lucene segments are only written once. This affects a lot of things with commit strategies, document update limitations, etc. So, when documents are deleted, they are still present in the Lucene structures and can affect some low-level statistics. If you manage to delete the whole s

Backup collections using SolrJ

2018-05-04 Thread Olivier Tavard
Hi, I have a question regarding the backup of a Solr collection using SolrJ. I use Solr 7. I want to build a JAR for that and launch it from a cron job. So far, no problem with the request: I use CollectionAdminRequest.backupCollection and then the processAsync method. The command is well transmitt

Re: Autocomplete returning shingles

2018-05-04 Thread Alessandro Benedetti
Yes, faceting will work; you can use an old approach used for autocompletion [1]. Be sure you add the shingle filter to the appropriate index-time analysis for the field you want. Facet values are extracted from the indexed terms, so calculating faceting and filtering by prefix should do the trick.
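
A minimal sketch of the facet-by-prefix idea described above, assuming a hypothetical field name (`title_shingles`) and a hypothetical core URL. The request parameters (`facet`, `facet.field`, `facet.prefix`, `facet.limit`, `facet.mincount`, `rows`) are standard Solr parameters; everything else is illustrative and is not taken from the thread.

```python
from urllib.parse import urlencode


def autocomplete_params(prefix, field="title_shingles", limit=10):
    """Build Solr parameters that return indexed terms starting with `prefix`.

    `field` is a hypothetical field whose index-time analysis includes a
    shingle filter; the facet values are the indexed shingles themselves.
    """
    return {
        "q": "*:*",              # match everything; only facet counts matter
        "rows": 0,               # no documents needed, just the facet values
        "facet": "true",
        "facet.field": field,
        "facet.prefix": prefix,  # keep only terms that start with the typed text
        "facet.limit": limit,
        "facet.mincount": 1,     # skip terms that no longer match any document
    }


def autocomplete_url(base_url, prefix, **kwargs):
    """Return the full /select URL for the autocomplete request."""
    return f"{base_url}/select?{urlencode(autocomplete_params(prefix, **kwargs))}"


# Hypothetical core URL, for illustration only.
print(autocomplete_url("http://localhost:8983/solr/mycore", "new y"))
```

Because `facet.prefix` is compared against the indexed terms, the typed text may need the same normalization (for example lowercasing) that the field's index-time analysis applies.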

Re: the number of docs in each group depends on rows

2018-05-04 Thread Webster Homer
We do group queries with SolrCloud all the time. You must set up your collection so that all values for the field you are grouping on are in the same shard. This can easily be done with the composite router. Basically you do this by creating a unique field that contains the field to group on, with
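
The preview above is cut off, so the exact ID layout the author used is not shown. As a rough sketch, Solr's compositeId router accepts IDs of the form `prefix!suffix` and hashes the prefix to choose the shard, so using the grouping value as the prefix keeps each group on one shard. The field names (`product_family`, `sku`) and the helper below are hypothetical.

```python
def composite_id(group_value, doc_id):
    """Return a 'group!id' style unique key for the compositeId router.

    The part before '!' decides the shard, so all documents sharing
    `group_value` are routed to the same shard.
    """
    group_value = str(group_value)
    if "!" in group_value:
        # '!' is the router's separator; a stray one would change the routing.
        raise ValueError("group value must not contain '!'")
    return f"{group_value}!{doc_id}"


# Hypothetical documents grouped on `product_family`.
docs = [
    {"sku": "A100", "product_family": "resistors"},
    {"sku": "A101", "product_family": "resistors"},
    {"sku": "B200", "product_family": "capacitors"},
]
for doc in docs:
    doc["id"] = composite_id(doc["product_family"], doc["sku"])

print([doc["id"] for doc in docs])
```

One trade-off to keep in mind with this layout: a few very large groups can leave the shards unevenly loaded, since every document in a group lands on the same shard.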

Re: Autocomplete returning shingles

2018-05-04 Thread O. Klein
Yes, splitting into more documents would probably work. Don't think I can do this easily with Solr. Looking into using facets now. -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: solrj (admin) requests

2018-05-04 Thread Arturas Mazeika
Hi Erick, Shawn, et al, Thanks a lot for the piece of wisdom, especially for the unit tests, findRecursive, and the ideas on how to get to the information one is looking for. Good job indeed. Cheers, Arturas On Fri, May 4, 2018 at 12:22 AM, Shawn Heisey wrote: > On 5/3/2018 9:07 AM, Arturas Mazeika wrote:

Re: the number of docs in each group depends on rows

2018-05-04 Thread Diego Ceccarelli (BLOOMBERG/ LONDON)
Hello, I'm not 100% sure, but I think that if you have multiple shards the number of docs matched in each group is *not* guaranteed to be exact. Increasing the rows will increase the amount of partial information that each shard sends to the federator and make the number more precise. For exact

Solr Slave failed to initialize collection

2018-05-04 Thread Aji Viswanadhan
Hi Team, We have a Solr setup in our project with a Solr master and Solr slaves. The setup is working fine and replication also happens properly. This week we had an issue: one slave was not able to initialize one collection, and it seems the index was corrupted on the slave. Here is the

Re: Regarding LTR feature

2018-05-04 Thread Alessandro Benedetti
Hi Preteek, I would assume you have that feature at training time as well; can't you use the training set to establish the parameters for the normalizer at query time? In the end, being a normalization, it doesn't have to be that accurate to the query-time state, but it must reflect the relations th

Howto disable PrintGCTimeStamps in Solr

2018-05-04 Thread Bernd Fehling
Hi list, this sounds simple, but I can't disable PrintGCTimeStamps in solr_gc logging. I tried with GC_LOG_OPTS in the start scripts and --verbose reporting during start to make sure it is not in the Solr start scripts. But if Solr is up and running there are always TimeStamps in solr_gc.log and the file r