Many thanks for your answers, Erick. Indeed, I had read in several threads that the migration path is not guaranteed. What is strange is that there is no formal documentation of this limitation, because clearly we cannot migrate to v8 if the indexes are not "pure" v7 indexes. I understand the reason (y = f(x)), but at least a simple note in the documentation stating that Lucene 6 segments cannot be upgraded to Lucene 8 would be appreciated.
Moreover, the check tool only reports a v7.7.3 index, with no mention of the "real" segment version, which is v6! Refusing to open v7 Lucene indexes that were upgraded from v6 is quite brutal, and the rule that we can migrate from the previous major version is not completely true :-(

I'll stay on v7.7.3.

Thanks again,
Jean-Louis

-----Original Message-----
From: Erick Erickson <erickerick...@gmail.com>
Sent: Tuesday, May 19, 2020 15:00
To: solr-user@lucene.apache.org
Subject: Re: Upgrade 5.5.5 to 8.5.1 / Segment stucked in lucene v6

This will not work. Lucene has never promised this upgrade path would work; the “one major version back-compat” means that Lucene X has special handling for X-1, but for X-2, all bets are off.

Starting with Solr 6, a marker is written into the segments recording the version of Lucene the segment was written with. That marker is preserved through all merges/upgrades/whatever. Starting with Lucene 8, if any segment has a marker for Lucene 6 (or no marker at all for earlier versions), then Lucene will refuse to open the index.

IndexUpgraderTool and the like simply cannot synthesize the new index format. The most succinct explanation I’ve seen is from Robert Muir:

“I think the key issue here is Lucene is an index not a database. Because it is a lossy index and does not retain all of the user's data, its not possible to safely migrate some things automagically. In the norms case IndexWriter needs to re-analyze the text ("re-index") and compute stats to get back the value, so it can be re-encoded. The function is y = f(x) and if x is not available its not possible, so lucene can't do it.”

So you’ll have to re-index your corpus with Solr 8, I’m afraid.

Best,
Erick

> On May 19, 2020, at 4:19 AM, VILA Jean-Louis <jean-louis.v...@sword-group.com> wrote:
>
> Dear all,
>
> We are starting to upgrade a huge SolrCloud cluster from 5.4.1 to the latest version, 8.5.1.
> Context:
> . Ubuntu 16.04, 64-bit, Oracle JVM 8u101 and now OpenJDK 8u252
> . We can't reindex the documents because the old ones no longer exist, so we have no choice other than upgrading the indexes.
>
> Our upgrade strategy is based on the IndexUpgrader tool.
> 5.4.1 -> 5.5.5 : OK
> 5.5.5 -> 6.6.6 : OK
> 6.6.6 -> 7.7.3 : OK
> 7.7.3 -> 8.5.1 : fails. Here is the error from the 8.5.1 IndexUpgrader:
>
> Exception in thread "main" org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource BufferedChecksumIndexInput(MMapIndexInput(path="/data2/solr/nodes/node1/solr/insight_dw_shard3_replica_n69/data/index/segments_2nz0"))): This index was initially created with Lucene 6.x while the current version is 8.5.1 and Lucene only supports reading the current and previous major versions.. This version of Lucene only supports indexes created with release 7.0 and later.
>     at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:318)
>     at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:289)
>     at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:432)
>     at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:429)
>     at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:680)
>     at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:632)
>     at org.apache.lucene.index.SegmentInfos.readLatestCommit(SegmentInfos.java:434)
>     at org.apache.lucene.index.DirectoryReader.listCommits(DirectoryReader.java:285)
>     at org.apache.lucene.index.IndexUpgrader.upgrade(IndexUpgrader.java:158)
>     at org.apache.lucene.index.IndexUpgrader.main(IndexUpgrader.java:78)
>
> But when I check the index with the 7.7.3 check tool, the segment appears to be 7.7.3!
> 0.00% total deletions; 50756501 documents; 0 deleteions
> Segments file=segments_2nz0 numSegments=1 version=7.7.3 id=ay2stfke7hwy9gippl8k77tdd userData={commitTimeMSec=1589314850951}
>   1 of 1: name=_2rr9t maxDoc=50756501
>     version=7.7.3
>     id=9pubpiwgt38rzyxr7litvgcu5
>     codec=Lucene70
>     compound=false
>     numFiles=10
>     size (MB)=338,143.905
>     diagnostics = {os=Linux, java.vendor=Oracle Corporation, java.version=1.8.0_101, java.vm.version=25.101-b13, lucene.version=7.7.3, mergeMaxNumSegments=1, os.arch=amd64, java.runtime.version=1.8.0_101-b13, source=merge, mergeFactor=2, os.version=3.13.0-147-generic, timestamp=1589484981711}
>     no deletions
>     test: open reader.........OK [took 2.779 sec]
>
> From what I have read in other threads, some people say that once a segment is "marked as a v6 Lucene index", the mark persists across upgrades, so we are stuck on version 7.7.3.
>
> What are my options?
>
> Many thanks for your help,
> Jean-Louis
>
> Jean-Louis Vila, PhD
> Technical Director
> Sword SAS
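
The version rule Erick describes in this thread can be sketched as a small toy model. This is not Lucene's code: the class and method names (IndexVersionRule, oldestOpenableCreatedMajor, canOpen) are hypothetical, and the model assumes the only input that matters is the "created with" major version that, per the thread, survives every merge and IndexUpgrader run. In Lucene 7 and later that recorded value should be readable through SegmentInfos.getIndexCreatedVersionMajor(), which may help because the check tool output in this thread shows only the segment version.

```java
// Toy model of the "current and previous major version" rule from this thread.
// NOT Lucene's implementation; every name here is hypothetical.
final class IndexVersionRule {
    private IndexVersionRule() {}

    // Oldest "created with" major version that a given Lucene major will open.
    static int oldestOpenableCreatedMajor(int currentMajor) {
        return currentMajor - 1;
    }

    // createdMajor models the marker recorded when the index was first created.
    // Upgrading rewrites the segment format but, per the thread, keeps the marker,
    // so the segment version reported by the check tool plays no part here.
    static boolean canOpen(int currentMajor, int createdMajor) {
        return createdMajor >= oldestOpenableCreatedMajor(currentMajor)
                && createdMajor <= currentMajor;
    }

    public static void main(String[] args) {
        int createdMajor = 6; // value reported by the exception in this thread
        System.out.println("Lucene 7 opens it: " + canOpen(7, createdMajor));
        System.out.println("Lucene 8 opens it: " + canOpen(8, createdMajor));
    }
}
```

In this model an index whose marker is 6 stays readable by 7.x regardless of how many times it is upgraded, and only an index re-created under 7.x (marker 7) becomes readable by 8.x, which matches the advice to re-index.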