Re: Corrupt Index error on Target cluster
Thanks. I have 6.6.2. Do you remember the exact minor version you ran into corruptIndex with? I did fix it using CheckIndex.

On Sat, Sep 8, 2018 at 2:00 AM Stephen Bianamara wrote:

> Hmm, when this occurred for me I was also on 6.6, between minor releases. So
> unclear if it's connected to 6.6 specifically.
>
> If you want to resolve the problem, you should be able to use the
> Collection API to delete that node from the collection, and then re-add it,
> which will trigger a resync.
>
> On Fri, Sep 7, 2018, 10:35 AM Susheel Kumar wrote:
>
> > No. The solr I have is 6.6.
> >
> > On Fri, Sep 7, 2018 at 10:51 AM Stephen Bianamara <sdl1tinsold...@gmail.com> wrote:
> >
> > > I've gotten incorrect checksums when upgrading solr versions across the
> > > cluster. Or in other words, when indexing into a mixed version cluster.
> > > Are you running mixed versions by chance?
> > >
> > > On Fri, Sep 7, 2018, 6:07 AM Susheel Kumar wrote:
> > >
> > > > Has anyone faced / have insight into the above errors?
> > > >
> > > > On Thu, Sep 6, 2018 at 12:04 PM Susheel Kumar wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > We had a running cluster with CDCR, and there were some issues with
> > > > > indexing on the Source cluster which got resolved after restarting
> > > > > the nodes (in my absence...), and now I see the below errors on a
> > > > > shard at the Target cluster. Any suggestions / ideas on what could
> > > > > have caused this and what's the best way to recover?
> > > > >
> > > > > Thnx
> > > > >
> > > > > Caused by: org.apache.solr.common.SolrException: Error opening new searcher
> > > > >     at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2069)
> > > > >     at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2189)
> > > > >     at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1926)
> > > > >     at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1826)
> > > > >     at org.apache.solr.request.SolrQueryRequestBase.getSearcher(SolrQueryRequestBase.java:127)
> > > > >     at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:310)
> > > > >     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:296)
> > > > >     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
> > > > >     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> > > > >     at org.apache.solr.handler.PingRequestHandler.handlePing(PingRequestHandler.java:267)
> > > > >     ... 34 more
> > > > > Caused by: org.apache.lucene.index.CorruptIndexException: Corrupted bitsPerDocBase: 6033
> > > > > (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/app/solr/data/COLL_shard8_replica1/data/index.20180903220548447/_9nsy.tvx")))
> > > > >     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsIndexReader.<init>(CompressingStoredFieldsIndexReader.java:89)
> > > > >     at org.apache.lucene.codecs.compressing.CompressingTermVectorsReader.<init>(CompressingTermVectorsReader.java:126)
> > > > >     at org.apache.lucene.codecs.compressing.CompressingTermVectorsFormat.vectorsReader(CompressingTermVectorsFormat.java:91)
> > > > >     at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:128)
> > > > >     at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:74)
> > > > >     at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:145)
> > > > >     at org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:197)
> > > > >     at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:103)
> > > > >     at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:467)
> > > > >     at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:103)
> > > > >     at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:79)
> > > > >     at org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:39)
> > > > >     at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2033)
> > > > >     ... 43 more
> > > > >     Suppressed: org.apache.lucene.index.CorruptIndexException:
> > > > >     checksum failed (hardware problem?): expected=e5bf0d15 actual=21722825
> > > > >     (resource=BufferedChecksumInde
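For reference, a sketch of the two recovery paths mentioned in this thread: running Lucene's CheckIndex against the affected core, or dropping the bad replica and re-adding it so it resyncs from the leader. The jar path, index path, and replica/node names below are illustrative guesses, not taken from the thread; verify them against your install before running anything.

```
# Stop the Solr node first, then run CheckIndex in read-only mode
# against the affected core's index directory (paths illustrative).
java -cp server/solr-webapp/webapp/WEB-INF/lib/lucene-core-6.6.2.jar \
  org.apache.lucene.index.CheckIndex \
  /app/solr/data/COLL_shard8_replica1/data/index.20180903220548447

# -exorcise rewrites the index WITHOUT the corrupt segments. Documents in
# those segments are lost, so only use it if no healthy replica exists.
java -cp server/solr-webapp/webapp/WEB-INF/lib/lucene-core-6.6.2.jar \
  org.apache.lucene.index.CheckIndex -exorcise \
  /app/solr/data/COLL_shard8_replica1/data/index.20180903220548447

# Alternative from the thread: delete the bad replica via the Collections
# API and re-add it to trigger a full resync (names illustrative).
curl 'http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=COLL&shard=shard8&replica=core_node5'
curl 'http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=COLL&shard=shard8&node=host1:8983_solr'
```

If a healthy replica exists, the delete/re-add route is preferable since it loses no data; CheckIndex -exorcise is a last resort.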
504 timeout
hi all. we just migrated to cloud on friday night (woohoo!). everything is looking good (great!) overall. we did, however, just run into a hiccup.

running a query like this got us a 504 gateway time-out error:

**some* *foo* *bar* *query**

it was about 6 partials with encapsulating wildcards that someone was running that gave the error. doing 4 or 5 of them worked fine, but upon adding the last one or two it went kaput. all operations have been zippier since the migration than before, aside from some of those wildcard queries, which take time (if they work at all). is this related directly to our server configuration, or is there some solr/cloud config'ing we could work on that would allow better response to these sorts of queries (though it'd be at a cost, i'd imagine!)?

thanks for any insight!

best,

--
John Blythe
Re: 504 timeout
First of all, wildcards are evil. Be sure that the reason people are using wildcards wouldn't be better served by proper tokenizing, perhaps something like stemming etc.

Assuming that wildcards must be handled though, there are two main strategies:

1> If you want to use leading wildcards, look at ReversedWildcardFilterFactory. For something like abc* (trailing wildcard), conceptually Lucene has to construct a big OR query of every term that starts with "abc". That's not hard and is also pretty fast: just jump to the first term that starts with "abc" and gather all of them (they're sorted lexically) until you get to the first term starting with "abd". _Leading_ wildcards are a whole 'nother story. *abc means that each and every distinct term in the field must be enumerated: the first term could be abc and the last term in the field zzzabc, and there's no way to tell without checking every one. ReversedWildcardFilterFactory handles indexing the term, well, reversed, so in the above example not only would the term abc be indexed, but also cba. Now both leading and trailing wildcards are automagically made into trailing wildcards.

2> If you must allow leading and trailing wildcards on the same term (*abc*), consider ngramming; bigrams are usually sufficient. So aaabcde is indexed as aa, aa, ab, bc, cd, de, and searching for *abc* becomes searching for "ab bc".

Both of these make the index larger, but usually by surprisingly little. People will also index these variants in separate fields upon occasion; it depends on the use cases that need supporting. Ngramming, for instance, would find "ab" in the above (no wildcards).

Best,
Erick
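The two strategies Erick describes can be sketched as schema.xml fieldTypes. The type names, tokenizer choice, and attribute values below are illustrative assumptions, not from the thread; adjust them to your analysis chain.

```xml
<!-- 1) Leading wildcards: also index each term reversed, so a query like
     *abc can be rewritten into a fast trailing-wildcard scan over cba*. -->
<fieldType name="text_rev" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
            maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- 2) Leading AND trailing wildcards (*abc*): bigram the terms at index
     time so *abc* becomes the ordinary phrase search "ab bc". -->
<fieldType name="text_bigram" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="2"/>
  </analyzer>
</fieldType>
```

Note the reversed-wildcard filter goes only in the index-time analyzer; the query parser detects it on the field and rewrites leading-wildcard queries itself.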
Solr Index Issues
Hi Team,

We are using Nutch 1.15 and Solr 6.6.3. We tried crawling one of the URLs and noticed issues while indexing data to Solr. Below is the capture from the logs:

Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr/nutch: Expected mime type application/octet-stream but got text/html.

Here in the log I see the collection name is nutch, but the actual collection name I created is Nutch1.15_Test. Given below is the command used for crawling:

bin/nutch solrindex http://10.150.17.32:8983/solr/Nutch1.15_Test crawl/crawldb -linkdb crawl/linkdb crawl/segments/*

Please suggest any workarounds if available.

Thank you

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
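Worth noting: the URL in the error (http://localhost:8983/solr/nutch) looks like a configured default rather than the URL passed on the command line, which suggests the Solr index writer is reading its target from Nutch's configuration instead. A sketch of the relevant entry in Nutch 1.15's conf/index-writers.xml follows; the writer id, class, and parameter names here are from memory, so verify them against the file shipped with your install.

```xml
<!-- conf/index-writers.xml: point the Solr writer at the actual
     collection instead of the default localhost "nutch" core. -->
<writer id="indexer_solr"
        class="org.apache.nutch.indexwriter.solr.SolrIndexWriter">
  <parameters>
    <param name="type" value="http"/>
    <!-- default is http://localhost:8983/solr/nutch, matching the error -->
    <param name="url" value="http://10.150.17.32:8983/solr/Nutch1.15_Test"/>
  </parameters>
  <!-- mapping section unchanged -->
</writer>
```

The text/html response itself is typically Solr returning an HTML 404 page because the requested core does not exist, which is consistent with the writer pointing at the wrong collection.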