Thanks Walter, but I don't think that will work: if it could, the IndexUpgrader 
would work too, and it does not ☹
Because of the format, a v6 index can't be rewritten, whatever process you 
use (adding a replica, optimize, etc.).
The only option I have left is a full reindex! With 260,000,000 docs / 3 TB of 
indexes and a specific preprocessing step, it will take a very, very long time...
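
The reasoning above (a merge rewrites the segment files but not the record of 
which Lucene version first created the index) can be illustrated with a toy 
model. This is only a sketch of the rule as described in this thread, not 
Lucene's actual code; the names Segment, merge and can_open are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    written_major: int  # Lucene major version that wrote the segment files
    created_major: int  # Lucene major version recorded as the index creator


def merge(segments, current_major):
    """Toy model of a forced merge or an IndexUpgrader run.

    The merged segment is written in the current format, but the
    creation marker is carried over: the oldest marker wins.
    """
    return Segment(
        written_major=current_major,
        created_major=min(s.created_major for s in segments),
    )


def can_open(segment, current_major):
    """Toy model of the open check: the creation marker may be at most
    one major version behind the running Lucene version."""
    return segment.created_major >= current_major - 1


# An index first marked under Lucene 6, then upgraded with 7.7.3.
upgraded = merge([Segment(written_major=6, created_major=6)], current_major=7)

# Same index after the "padding documents + optimize" idea, still under 7.
padded = merge([upgraded, Segment(written_major=7, created_major=6)], current_major=7)

# An index rebuilt from the source documents directly under Lucene 8.
reindexed = Segment(written_major=8, created_major=8)
```

In this model, can_open(upgraded, 7) is true, while can_open(upgraded, 8) and 
can_open(padded, 8) are both false; only the reindexed segment passes under 8.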


-----Original Message-----
From: Walter Underwood <wun...@wunderwood.org> 
Sent: mardi 19 mai 2020 17:43
To: solr-user@lucene.apache.org
Subject: Re: Upgrade 5.5.5 to 8.5.1 / Segment stucked in lucene v6

Hmm, might be able to hack this with optimize (forced merge).

First, you would have to add enough extra documents to force a rewrite of all 
segments. That might be as many documents as are already in the index. You 
could set a “fake:true” field and filter them out with an fq. Or make sure they 
have no searchable text.

After adding all those, run optimize. This should rewrite all the segments in 
the new format.

Finally, delete all the extra documents. Might want to do another optimize 
after that.

No guarantee that this desperate hack will work.
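
For illustration only, the three steps above could be sketched as the JSON 
update commands and query parameters one might send to a collection's /update 
and /select handlers. The field name "fake" comes from the suggestion above; 
the document ids and helper names are hypothetical, nothing is actually sent 
to Solr here, and the sketch says nothing about whether the hack succeeds.

```python
from urllib.parse import urlencode

FAKE_FIELD = "fake"  # marker field from the suggestion above (assumed boolean)


def padding_docs(count):
    """Step 1: padding documents carrying the marker and no searchable text."""
    return [{"id": f"padding-{i}", FAKE_FIELD: True} for i in range(count)]


def optimize_command():
    """Step 2: JSON update command asking for an optimize (forced merge)."""
    return {"optimize": {}}


def delete_padding_command():
    """Step 3: delete-by-query command removing every padding document."""
    return {"delete": {"query": f"{FAKE_FIELD}:true"}}


def search_params(user_query):
    """Query string that filters padding documents out with an fq."""
    return urlencode({"q": user_query, "fq": f"-{FAKE_FIELD}:true"})
```

Each dictionary would be serialized to JSON and POSTed to the update handler, 
with a second optimize_command() after the delete if desired.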

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On May 19, 2020, at 6:21 AM, VILA Jean-Louis 
> <jean-louis.v...@sword-group.com> wrote:
> 
> Many thanks for your answers, Erick. 
> 
> Indeed, I've read in many different threads that the migration 
> path is not guaranteed. What's strange is that there's no formal 
> documentation of this limitation, because clearly we can't migrate to v8 if 
> the indexes are not "pure" v7 indexes. I understand the reason (y = f(x)), 
> but at least a simple note documenting that Lucene 6 segments can't be 
> upgraded to Lucene 8 would be appreciated.
> 
> Moreover, the check tool just shows a v7.7.3 index, with no mention 
> of the "real" segment version, which is v6! So refusing to open v7 Lucene 
> indexes that were upgraded from v6 is quite brutal, and the rule that we 
> can migrate only from the previous major version is not completely true 
> :-( I'll stay on v7.7.3.
> 
> Thanks again,
> Jean-Louis
> 
> -----Original Message-----
> From: Erick Erickson <erickerick...@gmail.com>
> Sent: mardi 19 mai 2020 15:00
> To: solr-user@lucene.apache.org
> Subject: Re: Upgrade 5.5.5 to 8.5.1 / Segment stucked in lucene v6
> 
> This will not work. Lucene has never promised this upgrade path would work. 
> The “one major version back-compat” rule means that Lucene X has special 
> handling for X-1, but for X-2, all bets are off. Starting with Solr 6, a 
> marker is written into the segments recording the version of Lucene the 
> segment was written with. That marker is preserved through all 
> merges/upgrades/whatever.
> 
> Starting with Lucene 8, if any segment has a marker for Lucene 6 (or no 
> marker at all for earlier versions), then Lucene will refuse to open the 
> index.
> 
> IndexUpgraderTool and the like simply cannot synthesize the new index format. 
> The most succinct explanation I’ve seen is from Robert Muir:
> 
> “I think the key issue here is Lucene is an index not a database. Because it 
> is a lossy index and does not retain all of the user's data, its not possible 
> to safely migrate some things automagically. In the norms case IndexWriter 
> needs to re-analyze the text ("re-index") and compute stats to get back the 
> value, so it can be re-encoded. The function is y = f(x) and if x is not 
> available its not possible, so lucene can't do it.”
> 
> So you’ll have to re-index your corpus with Solr 8 I’m afraid.
> 
> Best,
> Erick
> 
> 
>> On May 19, 2020, at 4:19 AM, VILA Jean-Louis 
>> <jean-louis.v...@sword-group.com> wrote:
>> 
>> Dear all,
>> 
>> We have started upgrading a huge SolrCloud cluster from 5.4.1 to the latest 
>> version, 8.5.1.
>>               Context:
>> . Ubuntu 16.04, 64-bit, Oracle JVM 8u101 and now OpenJDK 8u252
>> . We can't reindex the documents because the originals no longer exist, so 
>> we have no other choice than upgrading the indexes.
>> 
>> Our upgrade strategy is based on the IndexUpgrader tool.
>>               5.4.1 -> 5.5.5 : OK
>>               5.5.5 -> 6.6.6 : OK
>>               6.6.6 -> 7.7.3 : OK
>>               7.7.3 -> 8.5.1 : fails. Here is the error from the 8.5.1 
>> IndexUpgrader:
>> 
>> Exception in thread "main" org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource BufferedChecksumIndexInput(MMapIndexInput(path="/data2/solr/nodes/node1/solr/insight_dw_shard3_replica_n69/data/index/segments_2nz0"))): This index was initially created with Lucene 6.x while the current version is 8.5.1 and Lucene only supports reading the current and previous major versions.. This version of Lucene only supports indexes created with release 7.0 and later.
>>       at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:318)
>>       at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:289)
>>       at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:432)
>>       at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:429)
>>       at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:680)
>>       at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:632)
>>       at org.apache.lucene.index.SegmentInfos.readLatestCommit(SegmentInfos.java:434)
>>       at org.apache.lucene.index.DirectoryReader.listCommits(DirectoryReader.java:285)
>>       at org.apache.lucene.index.IndexUpgrader.upgrade(IndexUpgrader.java:158)
>>       at org.apache.lucene.index.IndexUpgrader.main(IndexUpgrader.java:78)
>> 
>> But when I check the index with the 7.7.3 tool, the segment appears to be 7.7.3!
>> 0.00% total deletions; 50756501 documents; 0 deleteions
>> Segments file=segments_2nz0 numSegments=1 version=7.7.3 id=ay2stfke7hwy9gippl8k77tdd userData={commitTimeMSec=1589314850951}
>> 1 of 1: name=_2rr9t maxDoc=50756501
>>   version=7.7.3
>>   id=9pubpiwgt38rzyxr7litvgcu5
>>   codec=Lucene70
>>   compound=false
>>   numFiles=10
>>   size (MB)=338,143.905
>>   diagnostics = {os=Linux, java.vendor=Oracle Corporation, 
>> java.version=1.8.0_101, java.vm.version=25.101-b13, lucene.version=7.7.3, 
>> mergeMaxNumSegments=1, os.arch=amd64, java.runtime.version=1.8.0_101-b13, 
>> source=merge, mergeFactor=2, os.version=3.13.0-147-generic, 
>> timestamp=1589484981711}
>>   no deletions
>>   test: open reader.........OK [took 2.779 sec]
>> 
>> From the various threads I have read, some people say that once a segment is 
>> "marked as a Lucene v6 index", this mark survives every upgrade, so we are 
>> stuck on version 7.7.3.
>> 
>> What are my options?
>> 
>> Many many thanks for your help,
>> Jean-Louis
>> 
>> 
>> 
>> Jean-Louis Vila, PhD
>> Technical Director
>> Sword SAS
>> 
>> d         +33 4 72 85 37 60
>> m        +33 6 17 81 14 69
>> t          +33 4 72 85 37 40
>> e         
>> jean-louis.v...@sword-group.com<mailto:jean-louis.v...@sword-group.com>
>> 
>> 9 avenue Charles de Gaulle
>> 69771, Saint Didier au Mont d'Or
>> France
>> http://www.sword-group.com/<http://www.sword-group.com/>
>> 
> 
