On Fri, 18 Aug 2023 at 04:32, Michael A. Smith <[email protected]> wrote:
> I found I'm still a little confused at how using aliases to correct
> invalid names should work. Maybe you can define an alias that is an
> invalid name, but having done so, can you use it? I tried this schema
> in both the Python and Java implementations.
>

Correcting names -- and other projections -- happens during schema resolution when reading Avro data. Such a process requires that the write schema is parsed without any validation. An exception when parsing the write schema means the data becomes unreadable.

When reading the data, the read schema is first resolved against the write schema. One of the things that happen during schema resolution is that the names in the write schema are matched against the names and aliases in the read schema. This means you won't be using the aliases directly.

You can test this theory by encoding data as Avro bytes without any header. You'll find you can decode the bytes using a different schema that is identical except for its names. This works, as these schemata yield the same sequence of bytes.

Names and aliases in the schemas let the resolution process do other nice things:
- skip written fields that were removed from the read schema
- fill in default values for new fields added to the read schema
- match different type orders in unions
- do some conversions, like reading an int as a long

Kind regards,
Oscar

-- 
✉️ Oscar Westra van Holthe - Kind <[email protected]>
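The header-less encoding experiment described above can be sketched roughly as follows. This is a hand-rolled illustration of the Avro binary encoding for a record with a long and a string field, not the Avro library's API and not full schema resolution. The schemas, field names, and helper functions are all hypothetical and chosen only to show that names never appear in the encoded bytes.

```python
import io

# Hypothetical write schema with names that a validating parser would reject.
WRITE_SCHEMA = {"type": "record", "name": "bad-name", "fields": [
    {"name": "user-id", "type": "long"},
    {"name": "display name", "type": "string"},
]}
# Hypothetical read schema: identical except for its names.
READ_SCHEMA = {"type": "record", "name": "User", "fields": [
    {"name": "user_id", "type": "long"},
    {"name": "display_name", "type": "string"},
]}

def encode_long(n):
    """Avro long: zigzag-encode, then emit 7-bit groups, low group first."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)

def decode_long(buf):
    """Inverse of encode_long, reading from a BytesIO."""
    z, shift = 0, 0
    while True:
        b = buf.read(1)[0]
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1)

def encode_record(schema, datum):
    """Fields are written in schema order; no names are written."""
    out = bytearray()
    for field in schema["fields"]:
        value = datum[field["name"]]
        if field["type"] == "long":
            out += encode_long(value)
        elif field["type"] == "string":
            raw = value.encode("utf-8")
            out += encode_long(len(raw)) + raw
        else:
            raise ValueError("type not covered by this sketch")
    return bytes(out)

def decode_record(schema, payload):
    """Values are read in schema order and labelled with this schema's names."""
    buf = io.BytesIO(payload)
    result = {}
    for field in schema["fields"]:
        if field["type"] == "long":
            result[field["name"]] = decode_long(buf)
        elif field["type"] == "string":
            result[field["name"]] = buf.read(decode_long(buf)).decode("utf-8")
        else:
            raise ValueError("type not covered by this sketch")
    return result

payload = encode_record(WRITE_SCHEMA, {"user-id": 42, "display name": "Oscar"})
decoded = decode_record(READ_SCHEMA, payload)
print(payload)   # the bytes contain values only, no field or record names
print(decoded)   # the same values, now under the read schema's names
```

Because the payload carries only the values in field order, decoding succeeds with either schema. A real reader would go through schema resolution, which is where names and aliases are matched.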
