Hi Jason,
I tried to upgrade from 6.6.2 to 7.1.0 and got the following exception:
org.apache.lucene.index.IndexFormatTooNewException: Format version is not
supported (resource BufferedChecksumIndexInput(segments_2)): 7 (needs to be
between 4 and 6)
It looks like the fix is not sufficient.
What I see (from
RollingUpgradeQueryReturnsCorrectResultsAfterServersRollOverOnPartitionRegion.java)
is that when a locator is upgraded, it is shut down and restarted on the
newer version. The problem is that server2 then becomes the lead and cannot read
the Lucene index in the newer format (the Lucene index format changed between
versions 6 and 7).
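To make the two exception messages in this thread easier to compare, here is a
minimal Java sketch of the kind of inclusive range check they report. It does
not use Lucene; the class and method names are hypothetical, and the ranges
(4 to 6 for the old reader, 7 to 9 for the new one) are taken only from the
messages quoted in this thread.

```java
// Illustrative only: a hypothetical stand-in for a reader's format check, not Lucene code.
class FormatRangeSketch {

    enum Result { OK, TOO_OLD, TOO_NEW }

    // Compares an on-disk format number against the inclusive range a reader accepts.
    static Result check(int minSupported, int maxSupported, int onDiskFormat) {
        if (onDiskFormat > maxSupported) {
            return Result.TOO_NEW;
        }
        if (onDiskFormat < minSupported) {
            return Result.TOO_OLD;
        }
        return Result.OK;
    }

    public static void main(String[] args) {
        // Old member (range 4..6 per the first message) sees format 7.
        System.out.println(check(4, 6, 7));
        // New member (range 7..9 per the second message) sees format 6.
        System.out.println(check(7, 9, 6));
    }
}
```

With the numbers from the quoted messages, both directions are rejected: the
old reader treats format 7 as too new, and the new reader treats format 6 as
too old.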
Another problem occurs after the rolling upgrade of the locator and server1,
when verifying the region size on the VMs. For example:
expectedRegionSize += 5;
putSerializableObjectAndVerifyLuceneQueryResult(server1, regionName,
expectedRegionSize, 5,
15, server2, server3);
First it checks whether the region has the expected size on the VMs, and that
check passes (15 entries). The problem is that while executing
verifyLuceneQueryResults, VM1 (server2) has only 13 entries and the assertion fails.
From the logs it can be seen that two batches were not dispatched successfully:
[vm0] [warn 2019/12/06 08:31:39.956 CET <Event Processor for
GatewaySender_AsyncEventQueue_index#_aRegion_0> tid=0x42] During normal
processing, unsuccessfully dispatched 1 events (batch #0)
[vm0] [warn 2019/12/06 08:31:40.103 CET <Event Processor for
GatewaySender_AsyncEventQueue_index#_aRegion_2> tid=0x46] During normal
processing, unsuccessfully dispatched 1 events (batch #0)
VM0 (server1) and VM2 (server3) each have 14 entries; one event was not
dispatched successfully.
I don't know why some events are dispatched successfully and some are not.
Do you have any idea?
BR,
Mario
________________________________
From: Jason Huynh <[email protected]>
Sent: December 2, 2019 18:32
To: geode <[email protected]>
Subject: Re: Re: Lucene upgrade
Hi Mario,
Sorry, I reread the original email and see that the exception points to a
different problem. I think your fix addresses an old version seeing an
unknown new Lucene format, which looks good. The following exception looks
like the new Lucene library is not able to read the older files
(just a guess from the message):
Caused by: org.apache.lucene.index.IndexFormatTooOldException: Format
version is not supported (resource
BufferedChecksumIndexInput(segments_1)): 6 (needs to be between 7 and
9). This version of Lucene only supports indexes created with release
6.0 and later.
The upgrade is from 6.6.2 -> 8.x though, so I am not sure if the message is
incorrect (stating needs to be release 6.0 and later) or if it requires an
intermediate upgrade between 6.6.2 -> 7.x -> 8.
On Mon, Dec 2, 2019 at 2:00 AM Mario Kevo <[email protected]> wrote:
>
> I started with the implementation of Option-1.
> As I understand it, the idea is to block all puts (put them in the queue) until
> all members are upgraded. After that, all queued events will be processed.
>
> I tried Dan's proposal to check at the start of
> LuceneEventListener.process() whether all members are upgraded, and also
> changed the test to verify the Lucene indexes only after all members are
> upgraded, but got the same error caused by incompatibilities between the
> Lucene versions. The changes are visible at https://github.com/apache/geode/pull/4198.
>
> Please add comments and suggestions.
>
> BR,
> Mario
>
>
> ________________________________
> From: Xiaojian Zhou <[email protected]>
> Sent: November 7, 2019 18:27
> To: geode <[email protected]>
> Subject: Re: Lucene upgrade
>
> Oh, I misunderstood option-1 and option-2. What I vote for is Jason's option-1.
>
> On Thu, Nov 7, 2019 at 9:19 AM Jason Huynh <[email protected]> wrote:
>
> > Gester, I don't think we need to write in the old format, we just need the
> > new format not to be written while old members can potentially read the
> > lucene files. Option 1 can be very similar to Dan's snippet of code.
> >
> > I think Option 2 is going to leave a lot of people unhappy when they get
> > stuck with what Mario is experiencing right now and all we can say is "you
> > should have read the doc". Not to say Option 2 isn't valid and it's
> > definitely the least amount of work to do, I still vote option 1.
> >
> > On Wed, Nov 6, 2019 at 5:16 PM Xiaojian Zhou <[email protected]> wrote:
> >
> > > Usually, re-creating the region and index is expensive and customers are
> > > reluctant to do it, as far as I remember.
> > >
> > > We do have offline reindex scripts or steps (written by Barry?). If that
> > > could be an option, they can try that offline tool.
> > >
> > > I saw from Mario's email, he said: "I didn't found a way to write lucene in
> > > older format. They only support reading old format indexes with newer
> > > version by using lucene-backward-codec."
> > >
> > > That's why I think option-1 is not feasible.
> > >
> > > Option-2 will cause the queue to fill up. But usually customers will hold
> > > off, pause, or reduce their business throughput when doing a rolling
> > > upgrade. I wonder if that's a reasonable assumption.
> > >
> > > Overall, after comparing all 3 options, I still think option-2 is the
> > > best bet.
> > >
> > > Regards
> > > Gester
> > >
> > >
> > > On Wed, Nov 6, 2019 at 3:38 PM Jacob Barrett <[email protected]> wrote:
> > >
> > > >
> > > >
> > > > > On Nov 6, 2019, at 3:36 PM, Jason Huynh <[email protected]> wrote:
> > > > >
> > > > > Jake - there is a side effect to this in that the user would have to
> > > > > reimport all their data into the user defined region too. Client apps
> > > > > would also have to know which of the regions to put into. Also, I may be
> > > > > misunderstanding this suggestion completely. In either case, I'll support
> > > > > whoever implements the changes :-P
> > > >
> > > > Ah… there isn’t a way to re-index the existing data. Eh… just a thought.
> > > >
> > > > -Jake
> > > >
> > > >
> > >
> >
>
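
For reference, the Option-1 idea described earlier in this thread (hold index
updates in a queue until all members are upgraded, then process them) could be
sketched in plain Java as follows. This is only an illustration under
assumptions: DeferredIndexingSketch, its methods, and the allMembersUpgraded
check are hypothetical names, not Geode or Lucene APIs, and this is not the
code in the pull request.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: defers index updates until every member runs the new version.
class DeferredIndexingSketch {
    private final Queue<String> pending = new ArrayDeque<>();
    private final List<String> indexed = new ArrayList<>();
    private final BooleanSupplier allMembersUpgraded; // assumed cluster-wide version check

    DeferredIndexingSketch(BooleanSupplier allMembersUpgraded) {
        this.allMembersUpgraded = allMembersUpgraded;
    }

    // Returns true if the batch (and anything queued before it) was indexed,
    // false if the batch was only queued because old members may still exist.
    boolean process(List<String> batch) {
        pending.addAll(batch);
        if (!allMembersUpgraded.getAsBoolean()) {
            return false;
        }
        while (!pending.isEmpty()) {
            indexed.add(pending.poll()); // stand-in for writing to the index
        }
        return true;
    }

    List<String> indexed() {
        return indexed;
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        AtomicBoolean upgraded = new AtomicBoolean(false);
        DeferredIndexingSketch sketch = new DeferredIndexingSketch(upgraded::get);
        sketch.process(Arrays.asList("put-1", "put-2")); // queued, nothing written yet
        upgraded.set(true);                               // last member finished upgrading
        sketch.process(Arrays.asList("put-3"));           // drains the queue in order
        System.out.println(sketch.indexed());
    }
}
```

The point of the sketch is only that nothing is written in the new format while
an old member could still try to read it; queued events are applied in order
once the check passes.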