Re: [DBD::Pg] Another stab at the UTF-8 system, this time simplified as much as possible. See the pod for pg_enable_utf8 for an explanation. Note that this commit will probably be picked out later, as we want to release a new minor version before releasing such a big change

David E. Wheeler Wed, 03 Jul 2013 02:57:04 -0700

On Jul 3, 2013, at 3:26 AM, Greg Sabino Mullane <[email protected]> wrote:


>> What happens if the client encoding is *not* UTF8? 
> 
> If not UTF8, we don't do anything. I think it is sufficient that we simply 
> require people to use UTF8 as their client_encoding if they want DBD::Pg 
> to do the right thing. It's very common, and more importantly, is the only 
> encoding guaranteed to auto convert from any server encoding.

Okay, maybe add that point to the docs?

>> Does it also recognize "UTF-8", "utf-8", and "unicode"?
> 
> (PQclientEncoding isn't old enough to use yet). As far as I can tell, 
> Postgres always returns the canonical string 'UTF8', even when you set 
> it to 'Unicode', or "utf-8", or even "U Tf----8"

Great.

>> What if it's set to 2, or 42?
> 
> No effect. We don't want people to be clever. You really want to force 
> the flag on? Set it to 1. I suppose we should throw an error if not -1, 0, or 
> 1

+1
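
For what it's worth, here is a rough sketch in C of the check being discussed. The function names are hypothetical, and reading -1 as "decide from client_encoding" is my assumption for illustration; this is not the committed code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the pg_enable_utf8 value is one of
 * the three settings discussed above (-1, 0, 1), else 0 so that the
 * caller can raise an error. */
int utf8_setting_is_valid(int setting)
{
    return setting == -1 || setting == 0 || setting == 1;
}

/* Hypothetical helper: decide whether the utf8 flag gets turned on.
 * Assumption: -1 means "follow client_encoding", 0 means "never",
 * 1 means "always". client_encoding is expected to be the canonical
 * name reported by the server, e.g. "UTF8". */
int should_set_utf8_flag(int setting, const char *client_encoding)
{
    assert(client_encoding != NULL);
    assert(utf8_setting_is_valid(setting));

    if (setting == 1)
        return 1;   /* forced on, whatever the encoding */
    if (setting == 0)
        return 0;   /* forced off */

    /* -1: only when the client encoding is the canonical UTF8 */
    return strcmp(client_encoding, "UTF8") == 0;
}
```

A caller would reject 2 or 42 via utf8_setting_is_valid() before ever asking should_set_utf8_flag(), which keeps the error path separate from the flag decision.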

>> Will it turn on the flag for all data without regard to type?
> 
> Yes.

So that’s different from before. I think that’s okay, frankly, but we may want
to mention it somewhere, in Changes if nothing else.

>> Isn't bytea returned in hex? If so, turning on UTF8 should do nothing. 
>> Maybe DBD::Pg should decode bytea to binary, though?
> 
> It's returned either hex or escape mode. Yes, it would not hurt to turn 
> it on, but it doesn't buy us anything either.

Well, it buys us not having to check the type every bloody time.

> See the routines in quote.c - 
> we are decoding them to strings anyway, so the chars returned from the 
> database are never directly exposed back to the user anyway.
> 
> A lot of this is not going to have any perfect answers, especially as far 
> as backwards compatibility goes, and forward compatibility with DBI 
> support. But we need to get moving, and I think this is a pretty good 
> first effort. I'll tweak the allowed ranges for the pg_enable_utf8 setting.

I agree entirely. Would be nice to auto-convert bytea, but otherwise I have no 
complaints about this at all.
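
To make the bytea point concrete, here is a minimal sketch in C of decoding the hex output format ("\x" followed by pairs of hex digits) into raw bytes. The function names and the error convention are hypothetical, and this is not taken from quote.c; it handles only the hex format, not escape mode.

```c
#include <assert.h>
#include <stddef.h>

/* Map one hex digit to its value, or -1 if it is not a hex digit. */
int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: decode a NUL-terminated bytea hex string such as
 * "\x4142" into out. Returns the number of bytes written, or -1 if the
 * input is malformed or out is too small. */
long decode_bytea_hex(const char *in, unsigned char *out, size_t out_size)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);

    /* The hex format always starts with a backslash and an 'x'. */
    if (in[0] != '\\' || in[1] != 'x')
        return -1;
    in += 2;

    while (*in != '\0') {
        int hi = hex_digit_value(in[0]);
        /* Only look at the second digit once the first one is valid,
         * so we never read past the terminating NUL. */
        int lo = (hi < 0) ? -1 : hex_digit_value(in[1]);

        if (hi < 0 || lo < 0 || n >= out_size)
            return -1;

        out[n++] = (unsigned char)((hi << 4) | lo);
        in += 2;
    }
    return (long)n;
}
```

The result is a plain byte buffer with an explicit length, so nothing in it should ever be treated as UTF-8 text.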

Many thanks!

David
