Darrel, to better understand what you are asking for:
I understand that you want the following:
byte[] -> PdxInstance
and not
byte[] -> PdxInstance -> Pojo
The `read-serialized=true` flag is a single flag that "rules them all",
which no longer makes sense and is seemingly the underlying problem.
The proposal seems to want a type that does not get deserialized into a
POJO, regardless of the `read-serialized` setting.
I don't believe we need another type that denotes the behavior of "don't
deserialize into POJO".
Creating a type JUST for a behavior is not helpful in any manner. The
opposite would actually be true: you would now have to push that logic
into the code:
```
if (object instanceof StablePdxInstance) {
  return object;
} else {
  return SerializationFramework.deserialize(object);
}
```
If you want the system NOT to deserialize the PdxInstance to a POJO,
then that behavior should come from the serializer.
Adding serialization configuration on a per-region basis is not that
crazy and will solve the problem you are trying to address.
It might be a little more effort, but it is something that can be more
easily maintained. The end goal really should be to have all
serialization logic live only in the serialization framework and nowhere
else in the code. So if you want to have a region key of type
PdxInstance that does NOT get deserialized into a POJO (or at least does
not try), then have the serializer know that. So on a per-region basis,
tell the serializer that it does not need to deserialize this key into a
POJO. I would hate to see more code that says "if (object instanceof
StablePdxInstance) ...".
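To make the per-region idea concrete, here is a minimal sketch of what such configuration might look like. Every name in it (`ReadPolicy`, `RegionSerializationConfig`, `SerializationConfigRegistry`) is hypothetical and not an existing Geode API; it only illustrates keeping the decision in one place that the serializer consults.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical: how a stored key or value should be handed back to the caller.
enum ReadPolicy { DESERIALIZE_TO_POJO, KEEP_SERIALIZED }

// Hypothetical per-region settings: one policy for keys, one for values.
final class RegionSerializationConfig {
    final ReadPolicy keyPolicy;
    final ReadPolicy valuePolicy;

    RegionSerializationConfig(ReadPolicy keyPolicy, ReadPolicy valuePolicy) {
        this.keyPolicy = keyPolicy;
        this.valuePolicy = valuePolicy;
    }
}

// Hypothetical registry that only the serialization layer would consult.
final class SerializationConfigRegistry {
    // Fallback used for regions that were never configured explicitly.
    private static final RegionSerializationConfig DEFAULT =
        new RegionSerializationConfig(ReadPolicy.DESERIALIZE_TO_POJO,
                                      ReadPolicy.DESERIALIZE_TO_POJO);

    private final Map<String, RegionSerializationConfig> byRegion =
        new ConcurrentHashMap<>();

    void configure(String regionName, RegionSerializationConfig config) {
        byRegion.put(regionName, config);
    }

    RegionSerializationConfig forRegion(String regionName) {
        return byRegion.getOrDefault(regionName, DEFAULT);
    }
}
```

With something along these lines, a region whose keys must stay in serialized form would be registered with `KEEP_SERIALIZED` for its keys, and calling code would never need to branch on the instance type.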
--Udo
On 1/15/19 13:21, Dan Smith wrote:
If I understand this right, you are talking about a way to create a
PdxInstance that has no corresponding java class. How about just a
RegionService.createPdxInstanceFactory() method that doesn't take a
classname, and therefore has no corresponding java class? It seems a
PdxInstance without a class is a more fundamental PdxInstance. A
PdxInstance with a java classname on it is just an extension of the
classless version.
I agree that what Udo is talking about - giving the user better control of
*when* their value is deserialized to a java object - is also valuable, but
it is a separate feature.
-Dan
On Tue, Jan 15, 2019 at 1:09 PM Darrel Schneider <dschnei...@pivotal.io>
wrote:
Even before the JSON pdx support we had internal support for a PdxInstance
that deserializes as a PdxInstance.
This is just adding an external api for that already existing internal
feature. So it is pretty simple to do if we can figure out how to name it.
On Tue, Jan 15, 2019 at 11:18 AM Galen O'Sullivan <gosulli...@pivotal.io>
wrote:
I suspect Udo is remembering something we both had to deal with, which is
that the lack of ways to get/put PdxInstances on Regions makes some
patterns difficult. In internal code, we have to set some thread-locals to
get serialized values out, and in general, I think that setting
pdx-read-serialized is a violation of the contract you'd expect from the
type signature of get, put, etc. Having a separate API for serialized
objects, and possibly region-level configuration, makes a lot more sense.
You could even have the non-PDX get fail on regions that are set to only
use PDX-serialized objects for everything.
We already have something like a PdxInstance that always deserializes to a
PdxInstance -- have a look at the __GEMFIRE_JSON mess that we use for JSON.
However you end up doing the new PdxInstance stuff, I strongly suggest
using the new solution for JSON objects.
-Galen
On Tue, Jan 15, 2019 at 10:49 AM Darrel Schneider <dschnei...@pivotal.io>
wrote:
I like the idea of adding support to the region configuration that lets
users control how it stores the data. But even if we did that, and you are
correct that it would be much more work, I don't think it would address
this issue or remove the value of a PdxInstance that always deserializes to
a PdxInstance. So I'd like this proposal to stay focused on PdxInstance and
not get sidetracked. PdxInstances can be used outside of regions (for
example, as arguments to functions).
I'd like to see a separate proposal about being able to configure how a
region stores its data. I could be wrong, but I think that proposal would
focus on the values, not the keys. Storing keys as serialized data is
tricky because you need to come up with an equals and hashCode, and if those
are going to be based on a sequence of serialized bytes then you
really need to understand your serialization code and make sure that
"equal" objects always have the same serialized form.
On Tue, Jan 15, 2019 at 10:38 AM Udo Kohlmeyer <u...@apache.org> wrote:
Darrel, thank you for this.
I would like to make a counter-proposal.
Instead of introducing another PdxInstance type, why don't we improve
the serialization framework itself? I know my proposal is most likely
going to take a little more effort than adding a new type, but I believe
it is less of a workaround.
My proposal is to make the PDX serialization configuration a little
more explicit, in the sense that a user can define serialization details
down to the Region.Key or Region.Value level.
Why would we possibly have a "one size fits all" approach? Could one
have a setup where serialization configuration is stored on a per-region
basis? Maybe in some cases we want to deserialize the key and in some
cases we don't want to. In some regions we want to leave the value in
serialized form and in others we don't. The point is, why limit ourselves
to a single flag?
--Udo
On 1/15/19 10:17, Darrel Schneider wrote:
As part of GEODE-6272 we realized we need a way to use a PdxInstance as
the key for a Region entry. The problem with the current PdxInstance
behavior is that in some members the key may be seen as a PdxInstance and
in others seen as an instance of a domain class. This inconsistency can
lead to problems, in particular with partitioned regions, because the
key's hash code is used to determine the bucket. You can read more about
this here:
https://geode.apache.org/docs/guide/17/developing/data_serialization/using_pdx_region_entry_keys.html
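A rough sketch of why the two forms can disagree about the bucket. The routing rule here (hash code modulo the bucket count) is a simplification assumed for illustration, and `CustomerId`, `bucketFor`, and `serializedForm` are hypothetical names; a field-name-to-value map stands in for the serialized form.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical domain class; its hashCode is whatever the application author chose.
final class CustomerId {
    final int id;

    CustomerId(int id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        return o instanceof CustomerId && ((CustomerId) o).id == id;
    }

    @Override
    public int hashCode() { return Integer.hashCode(id); }
}

final class BucketRoutingDemo {
    // Simplified routing rule assumed for this sketch: hash modulo bucket count.
    static int bucketFor(Object key, int totalBuckets) {
        return Math.floorMod(key.hashCode(), totalBuckets);
    }

    // Stand-in for the serialized form: a field map with its own hashCode.
    static Map<String, Object> serializedForm(CustomerId key) {
        return Collections.<String, Object>singletonMap("id", key.id);
    }

    public static void main(String[] args) {
        CustomerId key = new CustomerId(7);
        int totalBuckets = 113;
        System.out.println("domain object bucket:   "
            + bucketFor(key, totalBuckets));
        System.out.println("serialized form bucket: "
            + bucketFor(serializedForm(key), totalBuckets));
    }
}
```

The same logical key lands in different buckets depending on which form a member happens to see, because the two forms compute their hash codes differently.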
What we want is a new type of PdxInstance that will never deserialize to a
domain class. It will always be a PdxInstance. This can safely be used as a
Region key since PdxInstance implements equals and hashCode. It can also be
used in other contexts when you just want some structured data with
well-defined fields but never need to deserialize that data to a domain
class.
We are trying to figure out what to call this new type of PdxInstance.
Currently the pull request for GEODE-6272 has them named as "stable"
because they do not change form; they are always a PdxInstance. Another
suggestion was not to name them but to add a boolean parameter named
"forcePDXEveryWhere" to the method that creates a PdxInstanceFactory.
Internally we have some code that has a boolean named "noDomainClass". I'd
prefer we come up with a name instead of using boolean parameters. In the
Java world you label fields that can't change "final" and in the object
world you call objects that can't change "immutable". Would either of these
be better than "stable"? Any other ideas for what we could call this new
type of PdxInstance?