Fwd: Camel case identifiers and folding

2019-03-16 Thread Steve Haresnape
My apologies, I sent this reply to David only by mistake.

-- Forwarded message -
From: Steve Haresnape 
Date: Fri, 15 Mar 2019 at 13:09
Subject: Re: Camel case identifiers and folding
To: David G. Johnston 


As I said, I don't want to quote my identifiers. I know what that does. I
want to specify them in a certain way, see them in that same way, but refer
to them in any old way.

You can call it normalize or fold or whatever. It's a bad design choice,
and not even a completely compliant choice.

Is a cure contemplated? I know it's not just me that dislikes this.

On Fri, 15 Mar 2019 at 12:21, David G. Johnston 
wrote:

> On Thu, Mar 14, 2019 at 4:07 PM Steve Haresnape
>  wrote:
> >
> > I'm porting a sql server database to postgresql 9.6. My camelCase
> identifiers are having their humps removed. This is disconcerting and sad.
> >
> > Is there a cure for this?
>
> No
>
> >I don't want to quote my identifiers unless I have to.
>
> PostgreSQL made the choice long ago to normalize unquoted identifiers
> to lower case.  Quoting them will preserve whatever you type,
> including case.
>
> David J.
>


-- 
Steve Haresnape
60 Kauri Road, Awhitu, RD 4 Waiuku 2684
s.haresn...@creativeintegrity.co.nz
Phone: (09) 235 1698
Mobile: 021 514 666




Re: Fwd: Camel case identifiers and folding

2019-03-16 Thread Adrian Klaver

On 3/16/19 1:53 AM, Steve Haresnape wrote:

> My apologies I sent this reply to David only by mistake.
>
> -- Forwarded message -
> From: Steve Haresnape
> Date: Fri, 15 Mar 2019 at 13:09
> Subject: Re: Camel case identifiers and folding
> To: David G. Johnston
>
> As I said, I don't want to quote my identifiers. I know what that does.
> I want to specify them in a certain way, see them in that same way, but
> refer to them in any old way.
>
> You can call it normalize or fold or whatever. It's a bad design choice,
> and not even a completely compliant choice.


It is SQL standard (sort of), so I would not call it a bad choice. It 
deviates from the standard in that it folds down, not up, but for your 
case that would not matter.
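
To make the folding rule concrete, here is a rough Python model of it. This is only a sketch, not PostgreSQL's actual code: `fold_identifier` and its `fold` parameter are names invented for illustration, and it assumes plain ASCII names while ignoring length limits and Unicode escapes.

```python
def fold_identifier(token: str, fold=str.lower) -> str:
    """Return the name that would be stored for one identifier token.

    Hypothetical helper: it models only the quoting rule discussed here.
    """
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        # Quoted: keep the text exactly as typed; "" is an escaped quote.
        return token[1:-1].replace('""', '"')
    # Unquoted: fold case. PostgreSQL folds down; the SQL standard folds up.
    return fold(token)


print(fold_identifier('orderLineItem'))             # orderlineitem
print(fold_identifier('"orderLineItem"'))           # orderLineItem
print(fold_identifier('orderLineItem', str.upper))  # ORDERLINEITEM
print(fold_identifier('ORDERLINEITEM') == fold_identifier('orderLineItem'))  # True
```

Either folding direction lets unquoted names be referred to "in any old way"; what neither does is hand the original spelling back.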




> Is a cure contemplated? I know it's not just me that dislikes this.


I would say no. Those who dislike it are probably matched by folks who 
would dislike having it changed.




> On Fri, 15 Mar 2019 at 12:21, David G. Johnston
> <david.g.johns...@gmail.com> wrote:
>
> > On Thu, Mar 14, 2019 at 4:07 PM Steve Haresnape
> > <s.haresn...@creativeintegrity.co.nz> wrote:
> > >
> > > I'm porting a sql server database to postgresql 9.6. My camelCase
> > > identifiers are having their humps removed. This is disconcerting
> > > and sad.
> > >
> > > Is there a cure for this?
> >
> > No
> >
> > > I don't want to quote my identifiers unless I have to.
> >
> > PostgreSQL made the choice long ago to normalize unquoted identifiers
> > to lower case.  Quoting them will preserve whatever you type,
> > including case.
> >
> > David J.



> --
> Steve Haresnape
> 60 Kauri Road, Awhitu, RD 4 Waiuku 2684
> s.haresn...@creativeintegrity.co.nz
> Phone: (09) 235 1698
> Mobile: 021 514 666




--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Fwd: Camel case identifiers and folding

2019-03-16 Thread Tom Lane
Steve Haresnape  writes:
> As I said, I don't want to quote my identifiers. I know what that does. I
> want to specify them in a certain way, see them in that same way, but refer
> to them in any old way.
> You can call it normalize or fold or whatever. It's a bad design choice,
> and not even a completely compliant choice.

> Is a cure contemplated? I know it's not just me that dislikes this.

No.

There have been previous discussions of allowing variant case-folding
rules, and the conclusion has always been that it would break so much
stuff as to be entirely not worth the trouble.

The big problem with making significant semantics changes like this
be optional is that authors of general-purpose tools then have to be
prepared to cope with all the possibilities.  That's a pretty enormous
cost to load onto other people.  If it *only* affected the core code,
maybe you could find somebody to do the work and call it done, but
actually the implications would reverberate across the entire Postgres
ecosystem.  That's a tough call to make for a change that can't even
be painted as meeting a widely-favored goal like better SQL spec
compliance.

Now, in the spirit of full disclosure, I should say that the only form
of this idea that people have really spent significant effort looking
at is exactly the fully-SQL-spec-compliant case-folding rule, ie just
like Postgres normally does it except unquoted identifiers fold to
all-upper-case not all-lower.  Perhaps there's some reason why what
you want would be less painful than that turns out to be ... but I'm
not seeing such a reason offhand.  In fact I suspect your preference
is actually worse: it'd require behavior changes in more places.
As an example, I believe your request would require case-insensitive
uniqueness enforcement in the system catalogs' unique indexes on names.
You have no idea how large a can of worms that opens (but I'll just
mention that "which characters are letters" doesn't even have a well
defined universal answer).
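
As a rough illustration of that last point, here is a toy Python sketch of a case-preserving, case-insensitive name catalog. It is not how the system catalogs work; `NameCatalog` and `conflicts` are hypothetical names, and Python's `str.lower` and `str.casefold` merely stand in for two candidate folding rules.

```python
class NameCatalog:
    """Toy catalog: keeps the spelling given at creation, matches by a folded key."""

    def __init__(self, fold=str.lower):
        self._fold = fold
        self._names = {}  # folded key -> spelling as first created

    def create(self, name: str) -> None:
        key = self._fold(name)
        if key in self._names:
            raise ValueError(f"{name!r} conflicts with {self._names[key]!r}")
        self._names[key] = name

    def lookup(self, name: str) -> str:
        return self._names[self._fold(name)]


def conflicts(fold, first: str, second: str) -> bool:
    """True if creating `second` after `first` is rejected under `fold`."""
    catalog = NameCatalog(fold)
    catalog.create(first)
    try:
        catalog.create(second)
    except ValueError:
        return True
    return False


catalog = NameCatalog()
catalog.create("orderLineItem")
print(catalog.lookup("ORDERLINEITEM"))                 # orderLineItem

# "\ufb01" is the single-character "fi" ligature. Whether these two names
# are "the same" depends entirely on which folding rule is chosen.
print(conflicts(str.lower, "\ufb01le", "FILE"))        # False
print(conflicts(str.casefold, "\ufb01le", "FILE"))     # True
```

The uniqueness check is only as well defined as the folding function behind it, and every tool that reads the catalog would have to agree on that function.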

regards, tom lane



Re: Camel case identifiers and folding

2019-03-16 Thread Peter J. Holzer
On 2019-03-15 17:09:49 -0600, Rob Sargent wrote:
> On Mar 15, 2019, at 4:43 PM, Morris de Oryx wrote:
> 
> > The original question has already been answered really well, but it
> > reminds me to mention that Postgres text/varchar values are
> > case-sensitive. Here's a list of the times when I would like a
> > case-sensitive text field:
> >
> >    Never
> >
> > Now here's the list of times I would like a case-blind text field:
> >
> >    Everywhere else.
> 
[...]
> What sort of content is in your field of type text?  Certainly, in English
> prose, “rob” is different than “Rob”

I disagree. While the grammar for written English has rules when to
write "rob" and when to write "Rob", that distinction usually carries no
semantic difference. Consider:

"How to Rob the Hump of a Camel"

"the go programming language was invented by rob pike, ken thompson and
robert griesemer"

Here "Rob" is a verb and "rob" is a first name, the opposite of what you
probably intended. Yet the first sentence is grammatically correct
if it is a title and while the second isn't correct, few people will
have difficulties understanding it (many probably won't even notice that
it is all lower case).

Spoken English of course doesn't even have a case distinction.

> and if the content is for a web page (or in my experience, the content
> of medical reference books) these differences are critical.

A web page? Rarely, at least for the human readable parts. Medicine? I
don't know. There may be names for different substances which differ
only in case. But those are parts of a formal language, and as
programmers we already know about case-sensitive formal languages.

hp

-- 
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | h...@hjp.at        | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson




Re: Camel case identifiers and folding

2019-03-16 Thread Rob Sargent


>> What sort of content is in your field of type text?  Certainly, in English
>> prose, “rob” is different than “Rob”
> 
> I disagree. While the grammar for written English has rules when to
> write "rob" and when to write "Rob", that distinction usually carries no
> semantic difference. Consider:
> 
> "How to Rob the Hump of a Camel"
> 
> "the go programming language was invented by rob pike, ken thompson and
> robert griesemer"
> 
> Here "Rob" is a verb and "rob" is a first name, the opposite of what you
> probably intended. Yet the first sentence is grammatically correct
> if it is a title and while the second isn't correct, few people will
> have difficulties understanding it (many probably won't even notice that
> it is all lower case).
> 
> Spoken English of course doesn't even have a case distinction.
> 
>> and if the content is for a web page (or in my experience, the content
>> of medical reference books) these differences are critical.
> 
> A web page? Rarely, at least for the human readable parts. Medicine? I
> don't know. There may be names for different substances which differ
> only in case. But those are parts of a formal language, and as
> programmers we already know about case-sensitive formal languages.
> 
I don’t think it’s solely about the semantics.  One might be contractually 
obligated to always spell a name in some exact way, including its 
capitalization. For instance, if referring to “Rob Sargent” as a quote or 
accreditation, then it’s not okay to let a typo “rob Sargent” go through.



Re: Camel case identifiers and folding

2019-03-16 Thread Andrew Gierth
> "Morris" == Morris de Oryx  writes:

 Morris> UUIDs as a type are an interesting case in Postgres. They're
 Morris> stored as a large numeric for efficiency (good!), but are
 Morris> presented by default in the 36-byte format with the dashes.
 Morris> However, you can also search using the dashless 32-character
 Morris> format and it all works. Case-insensitively.

That works because UUIDs have a convenient canonical form (the raw
bytes) which all input is converted to before comparison.
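
A small Python sketch of the same idea, using the standard `uuid` module as a stand-in for the Postgres type (the example value is arbitrary): every accepted spelling parses to the same 16 raw bytes, and the comparison happens on those bytes.

```python
import uuid

spellings = [
    "a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11",    # 36 characters, with dashes
    "A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11",    # upper case
    "a0eebc999c0b4ef8bb6d6bb9bd380a11",        # 32 characters, no dashes
    "{a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11}",  # wrapped in braces
]

# Parse each spelling and keep only the canonical raw bytes.
canonical = {uuid.UUID(s).bytes for s in spellings}

print(len(canonical))                        # 1
print(len(uuid.UUID(spellings[0]).bytes))    # 16
print(str(uuid.UUID(spellings[1])))          # a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11
```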

Text is ... not like this.

Even citext is really only a hack - it assumes that comparisons can be
done by conversion to lowercase, which may work well enough for English
but I'm pretty sure it does not correctly handle the edge cases in, for
example, German (consider 'SS', 'ss', 'ß') or Greek (final sigma). Doing
it better would require proper application of case-folding rules, and
even that would require handling of edge cases (the Unicode case folding
algorithm is designed to be language-independent, which means that it
breaks for Turkish without special-case exceptions).
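
These edge cases can be reproduced in a few lines of Python. This is only a sketch: `str.lower()` and `str.casefold()` stand in for "compare by lowercasing" and "compare by Unicode default case folding", and citext's real behaviour depends on the database locale, which the sketch does not model. The helper names are made up for the example.

```python
def same_lower(a: str, b: str) -> bool:
    """Compare the way a lowercase-based comparison would."""
    return a.lower() == b.lower()


def same_casefold(a: str, b: str) -> bool:
    """Compare using Unicode default (language-independent) case folding."""
    return a.casefold() == b.casefold()


# German: 'ß' uppercases to 'SS', but lowercasing 'SS' never gives 'ß' back.
print("ß".upper())                   # SS
print(same_lower("SS", "ß"))         # False
print(same_casefold("SS", "ß"))      # True

# Greek: final sigma 'ς' and medial sigma 'σ' share the capital 'Σ'.
print("ς".upper() == "σ".upper())    # True
print(same_lower("ς", "σ"))          # False
print(same_casefold("ς", "σ"))       # True

# Turkish: the capital of dotless 'ı' is 'I', yet default folding sends
# 'I' to dotted 'i', so the pair does not match without a Turkish-specific rule.
print("ı".upper())                   # I
print(same_casefold("ı", "I"))       # False
```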

-- 
Andrew (irc:RhodiumToad)



Re: Camel case identifiers and folding

2019-03-16 Thread Peter J. Holzer
On 2019-03-16 14:00:34 -0600, Rob Sargent wrote:
> > > What sort of content is in your field of type text?  Certainly, in
> > > English prose, “rob” is different than “Rob”
> >
> > I disagree. While the grammar for written English has rules when to
> > write "rob" and when to write "Rob", that distinction usually carries no
> > semantic difference. Consider:
[...]
> I don’t think it’s solely about the semantics.  One might be contractually
> obligated to always spell a name in some exact way, including its
> capitalization. For instance, if referring to “Rob Sargent” as a quote or
> accreditation, then it’s not okay to let a typo “rob Sargent” go through.

1) Such contracts might exist, but they are only binding to the signing
parties, they don't affect what is commonly understood as "the English
language". Everybody else will see it as an obvious typo and won't
assume that this refers to some "rob Sargent" who is a different person
than "Rob Sargent".

2) I don't think the OP was talking about spell-checking. And in any
case spell-checking is more complicated than simply comparing strings
byte by byte.

hp


-- 
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | h...@hjp.at        | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson

