[DISCUSS] Upgrading to Lucene 7.1.0

2021-09-27 Thread nabarun nag
Hi dev team,

Recently, a commit was pushed to develop which upgraded the Lucene
version used in Apache Geode to 7.1.0. Indexes written by the new
Lucene version are not compatible with the previous versions, so the
upgrade breaks the rolling upgrade contract. We are no longer able to
execute Lucene queries when there are servers of mixed versions in the
cluster. One solution that was provided was to not allow Lucene queries
to be executed when servers of mixed versions are present in the cluster.

After a discussion, it was put forward that this was not an optimal
user experience, and that this change of default behavior had not been
discussed on the dev list.

Two alternative solutions were put forward. One is that Geode allows
the user to pick the version of Lucene to be used. The other is that,
since this is a major upgrade to Lucene with some breaking API changes,
we sync this upgrade with the release of Apache Geode 2.0.

We would love to hear your thoughts or alternative solutions to this issue.

Regards
Nabarun.


Re: [DISCUSS] Upgrading to Lucene 7.1.0

2021-09-27 Thread Jens Deppe
Would it be possible to allow an operator to choose which version of Lucene to 
use? That way, if they are prepared for the issues you describe, they could 
still go ahead and upgrade to 7.1.0. Or are there breaking API changes which 
would make that hard/impossible to accommodate in our code base?

--Jens



Re: [DISCUSS] Upgrading to Lucene 7.1.0

2021-09-27 Thread Dan Smith
Does anyone have more context on why Lucene queries won't work during the 
rolling upgrade? I can see that the change added a line to the documentation 
and changed the tests not to run queries, but I'm not sure why we needed to do 
that.

-Dan




Re: [DISCUSS] Upgrading to Lucene 7.1.0

2021-09-27 Thread Jacob Barrett



> On Sep 27, 2021, at 11:48 AM, nabarun nag  wrote:
> 
> Recently, a commit was pushed to develop which upgraded the Lucene
> version used in Apache Geode to 7.1.0. These new Lucene indexes are
> not compatible with the previous versions hence it breaks the rolling
> upgrade contract. We are no longer able to execute Lucene queries when
> there are servers of mixed versions in the cluster.


Can you describe the problem in a little more detail? Does this mean that 
while there is a mix of versions the execution throws an exception on all 
servers, or is there a subset for which it works? If there is a subset for 
which it works, are those instances sufficient to provide accurate results if 
the instances that fail are ignored?

-Jake



Re: [DISCUSS] Upgrading to Lucene 7.1.0

2021-09-27 Thread Nabarun Nag
In simple words, if Lucene indexes are created by the new version (7.1.0) and 
then replicated to members that are still on the older version, those members 
won't understand the index, and the event processors start throwing exceptions.

This can be seen by re-enabling query execution in the DUnit tests and 
commenting out the following check blocks 
[develop SHA: 68629356f561a932f5dfbace70b01d9971a42473]:

In LuceneEventListener:
if (cache.hasMemberOlderThan(KnownVersion.GEODE_1_15_0)) {
  logger.info("Some members are older than " + KnownVersion.GEODE_1_15_0.getName());
  return false;
}

In IndexRepositoryFactory:
if (userRegion.getCache() != null
    && userRegion.getCache().hasMemberOlderThan(KnownVersion.GEODE_1_15_0)) {
  logger.info("Some members are older than " + KnownVersion.GEODE_1_15_0.getName());
  return null;
}
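
The two checks above gate Lucene work on the oldest member version in the
cluster. A minimal, self-contained sketch of that idea follows. The names
MixedVersionGuard, Version and luceneQueriesAllowed are hypothetical stand-ins
for illustration; they are not Geode's KnownVersion or cache API.

```java
import java.util.List;

// Hypothetical sketch of a mixed-version guard. None of these types are
// Geode classes; they only model "is any member older than a threshold?".
final class MixedVersionGuard {

  /** Minimal comparable version of the form major.minor.patch. */
  static final class Version implements Comparable<Version> {
    final int major;
    final int minor;
    final int patch;

    Version(int major, int minor, int patch) {
      this.major = major;
      this.minor = minor;
      this.patch = patch;
    }

    @Override
    public int compareTo(Version other) {
      if (major != other.major) {
        return Integer.compare(major, other.major);
      }
      if (minor != other.minor) {
        return Integer.compare(minor, other.minor);
      }
      return Integer.compare(patch, other.patch);
    }
  }

  /** Returns true if at least one member runs a version below the threshold. */
  static boolean hasMemberOlderThan(List<Version> members, Version threshold) {
    for (Version member : members) {
      if (member.compareTo(threshold) < 0) {
        return true;
      }
    }
    return false;
  }

  /** Lucene queries are allowed only once no member is older than the threshold. */
  static boolean luceneQueriesAllowed(List<Version> members, Version threshold) {
    return !hasMemberOlderThan(members, threshold);
  }
}
```

With a threshold of 1.15.0, a cluster that still contains a 1.2.0 member would
be reported as not ready for Lucene queries, while a fully upgraded cluster
would be.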


This is the exception that will be encountered:

[Exception]

[vm2_v1.2.0] [warn 2021/09/27 14:24:42.251 PDT tid=102] An Exception occurred. The dispatcher will continue.
[vm2_v1.2.0] org.apache.geode.InternalGemFireError: Unable to create index repository
[vm2_v1.2.0] at org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.lambda$computeRepository$0(AbstractPartitionedRepositoryManager.java:118)
[vm2_v1.2.0] at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
[vm2_v1.2.0] at org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.computeRepository(AbstractPartitionedRepositoryManager.java:108)
[vm2_v1.2.0] at org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.getRepository(AbstractPartitionedRepositoryManager.java:137)
[vm2_v1.2.0] at org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.getRepository(AbstractPartitionedRepositoryManager.java:76)
[vm2_v1.2.0] at org.apache.geode.cache.lucene.internal.LuceneEventListener.process(LuceneEventListener.java:87)
[vm2_v1.2.0] at org.apache.geode.cache.lucene.internal.LuceneEventListener.processEvents(LuceneEventListener.java:64)
[vm2_v1.2.0] at org.apache.geode.internal.cache.wan.GatewaySenderEventCallbackDispatcher.dispatchBatch(GatewaySenderEventCallbackDispatcher.java:154)
[vm2_v1.2.0] at org.apache.geode.internal.cache.wan.GatewaySenderEventCallbackDispatcher.dispatchBatch(GatewaySenderEventCallbackDispatcher.java:80)
[vm2_v1.2.0] at org.apache.geode.internal.cache.wan.AbstractGatewaySenderEventProcessor.processQueue(AbstractGatewaySenderEventProcessor.java:609)
[vm2_v1.2.0] at org.apache.geode.internal.cache.wan.AbstractGatewaySenderEventProcessor.run(AbstractGatewaySenderEventProcessor.java:1051)
[vm2_v1.2.0] Caused by: org.apache.lucene.index.IndexFormatTooNewException: Format version is not supported (resource BufferedChecksumIndexInput(segments_2)): 7 (needs to be between 4 and 6)
[vm2_v1.2.0] at org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:216)
[vm2_v1.2.0] at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:302)
[vm2_v1.2.0] at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:286)
[vm2_v1.2.0] at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:938)
[vm2_v1.2.0] at org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.computeIndexRepository(IndexRepositoryFactory.java:84)
[vm2_v1.2.0] at org.apache.geode.cache.lucene.internal.PartitionedRepositoryManager.computeRepository(PartitionedRepositoryManager.java:42)
[vm2_v1.2.0] at org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.lambda$computeRepository$0(AbstractPartitionedRepositoryManager.java:116)
[vm2_v1.2.0] ... 10 more
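
The "Caused by" line above says the older member read format version 7 from
the segments file while it only accepts versions 4 through 6. The sketch below
models that kind of range check in isolation. It is an illustration under
assumed names (FormatVersionCheck, checkVersion, FormatTooNewException), not
Lucene's actual implementation.

```java
// Hypothetical sketch of a format-version range check. It mirrors the shape
// of the message in the trace above but is not Lucene's CodecUtil code.
final class FormatVersionCheck {

  /** Raised when the stored format is newer than this reader supports. */
  static final class FormatTooNewException extends RuntimeException {
    FormatTooNewException(int actual, int min, int max) {
      super("Format version is not supported: " + actual
          + " (needs to be between " + min + " and " + max + ")");
    }
  }

  /** Raised when the stored format is older than this reader supports. */
  static final class FormatTooOldException extends RuntimeException {
    FormatTooOldException(int actual, int min, int max) {
      super("Format version is not supported: " + actual
          + " (needs to be between " + min + " and " + max + ")");
    }
  }

  /** Returns the version if it lies in [minSupported, maxSupported], else throws. */
  static int checkVersion(int actual, int minSupported, int maxSupported) {
    if (actual < minSupported) {
      throw new FormatTooOldException(actual, minSupported, maxSupported);
    }
    if (actual > maxSupported) {
      throw new FormatTooNewException(actual, minSupported, maxSupported);
    }
    return actual;
  }

  public static void main(String[] args) {
    // A reader supporting formats 4..6 accepts an index written as format 6.
    System.out.println(checkVersion(6, 4, 6));
    // The same reader rejects format 7, as written by an upgraded member.
    try {
      checkVersion(7, 4, 6);
    } catch (FormatTooNewException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The point of the sketch is that the check is one-directional in practice: a
newer reader can widen its accepted range to include old formats, but an old
reader has no way to accept a format that did not exist when it was built.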



Also:
[vm2_v1.2.0] [warn 2021/09/27 14:24:42.134 PDT tid=106] An Exception occurred. The dispatcher will continue.
[vm2_v1.2.0] org.apache.geode.InternalGemFireError: Unable to create index repository
[vm2_v1.2.0] at org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.lambda$computeRepository$0(AbstractPartitionedRepositoryManager.java:118)
[vm2_v1.2.0] at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
[vm2_v1.2.0] at org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.computeRepository(AbstractPartitionedRepositoryManager.java:108)
[vm2_v1.2.0] at org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.getRepository(AbstractPartitionedRepositoryManager.java:137)
[vm2_v1.2.0] at org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.getRepository(AbstractPartitionedRepositoryManager.java:76)
[vm2_v1.2.0] at org.apache.geode.cache.lucene.internal.LuceneEventListener.process(LuceneEventListener.java:87)
[vm2_v1.2.0] at org.apache.geode.cache.lucene.internal.LuceneEventListener.processEvents(LuceneEventListener.java:64)
[vm2_v1.2.0] at org.apache.geode.internal.cache.wan.GatewaySenderEventCallbackDispatcher.dispatchBatch(GatewaySenderEventCallbackDispatcher.java:154)
[vm2_v1

Re: [DISCUSS] Upgrading to Lucene 7.1.0

2021-09-27 Thread Nabarun Nag
The solution of preventing query execution in mixed-version mode also caused 
some problems: the query function executions get executed repeatedly, and 
that results in a stack overflow.
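
The thread does not show the code path that overflows. Purely as an
illustration, and assuming the overflow comes from an execution that re-invokes
itself each time it is turned away, an iterative retry with a fixed attempt
limit avoids growing the stack. BoundedRetry and its execute method are
hypothetical names, not part of Geode.

```java
import java.util.function.Supplier;

// Hypothetical sketch: retry in a loop with an upper bound instead of
// re-invoking recursively. A null result stands for "not ready yet".
final class BoundedRetry {

  /**
   * Calls the attempt up to maxAttempts times and returns the first
   * non-null result. Throws if every attempt returns null.
   */
  static <T> T execute(Supplier<T> attempt, int maxAttempts) {
    for (int i = 0; i < maxAttempts; i++) {
      T result = attempt.get();
      if (result != null) {
        return result;
      }
    }
    throw new IllegalStateException("No result after " + maxAttempts + " attempts");
  }
}
```

Because the loop holds a single stack frame regardless of how many attempts are
made, a cluster that stays in mixed-version mode for a long time would end in a
clear failure rather than a stack overflow.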




Re: [DISCUSS] Upgrading to Lucene 7.1.0

2021-09-27 Thread Udo Kohlmeyer
Might I propose something here.

There is currently a significant amount of work going into completing 
GEODE-8705, which is the Classloader isolation. We are currently targeting 
getting this released in Geode 1.16.

My proposal is that we use the capability that Patrick demo’d at the Community 
meeting (on this topic), where one can unload / load extensions (like our 
integration with Lucene) at runtime. This means that one could possibly do a 
rolling upgrade on the existing system and keep the version of the Lucene 
integration stable.

Once the whole system has been upgraded, the existing Lucene extension 
component is unloaded and the newer version of the extension component is 
loaded. This means that, at runtime, there will be a period of time where 
Lucene queries are not available, and that the “load” lifecycle of the 
extension needs an initialization step which initializes the extension 
component safely.

Once initialized, Lucene queries can then become available again, etc.
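
The unload / load / initialize sequence described above can be sketched as a
small state machine. This is a sketch under assumptions only: the class
ExtensionLifecycle, its states and its method names are hypothetical and do not
correspond to any existing Geode or classloader-isolation API.

```java
// Hypothetical sketch of an extension lifecycle in which queries are only
// served while the extension is fully initialized.
final class ExtensionLifecycle {

  enum State { AVAILABLE, UNLOADED, INITIALIZING }

  private State state = State.AVAILABLE;
  private String version;

  ExtensionLifecycle(String initialVersion) {
    this.version = initialVersion;
  }

  /** Unloads the running extension; queries become unavailable. */
  synchronized void unload() {
    requireState(State.AVAILABLE);
    state = State.UNLOADED;
  }

  /** Loads a new version; it still has to be initialized before use. */
  synchronized void load(String newVersion) {
    requireState(State.UNLOADED);
    version = newVersion;
    state = State.INITIALIZING;
  }

  /** Marks initialization as complete; queries become available again. */
  synchronized void initialized() {
    requireState(State.INITIALIZING);
    state = State.AVAILABLE;
  }

  synchronized boolean canQuery() {
    return state == State.AVAILABLE;
  }

  synchronized String version() {
    return version;
  }

  private void requireState(State expected) {
    if (state != expected) {
      throw new IllegalStateException("Expected " + expected + " but was " + state);
    }
  }
}
```

In this model the window without Lucene queries is exactly the time between
unload() and initialized(), which is the downtime the proposal tries to keep
short.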

This of course requires some work around the lifecycles of extension 
components and making sure that one can add the extension at runtime and 
safely initialize it.

I think this approach allows for a more seamless (lower downtime) upgrade of 
the system and extension components.

Thoughts?

--Udo
