Re: [Cython] New function (pointer) syntax.

2014-11-06 Thread C Blake
I think you should just use the C declarator syntax.  Cython already
allows you to say "cdef int *foo[10]".  Declarators aren't bad - just
poorly taught, though I can see some saying those are the same thing.
More below.  I absolutely like the declarator one the most, and the
lambda one second most.  Declarator style makes it far easier to take
C code/.h files using function pointers over to Cython.  So, this
discussion also depends on whether you view Cython as a bridge to
C libs or its own island/bias toward pure Py3.

One other proposal that might appease Stefan's "Python lexical cues"
where he was missing "def" would be to take the Python3 function
definition header syntax and strip just variable names.  I.e., keep the
":"s
def foo(x: type1, y: type2, z: type3) -> type0: pass
goes to
(: type1, : type2, : type3) -> type0 [ident]

I don't think that looks like anything else that might be valid in C or
Python.  It does just what C does - strip variable names from a function
header, but the ":"s maybe key your brain into a different syntax mode
since they are arguably more rare in C.  (Besides stripping names, C has
some extra parens for precedence/associativity - which admittedly are
tuned to make use-expressions simpler than type-expressions.)  Anyway,
I don't really like my own proposal above.  Just mentioning it for
completeness in case the ":"s help anyone.
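
A minimal sketch of that name-stripping idea, in plain Python rather than
Cython.  The helper type_only_signature and the example function foo are
hypothetical, and int/float/str/bool merely stand in for type1..type0; this
only shows what the stripped header would look like, not how Cython would
parse it.

```python
import inspect


def type_only_signature(func):
    """Render func's annotated header with the parameter names stripped."""
    sig = inspect.signature(func)
    params = ", ".join(
        ": " + p.annotation.__name__ for p in sig.parameters.values()
    )
    return "(" + params + ") -> " + sig.return_annotation.__name__


# Hypothetical example; the builtin types stand in for type1, type2, type3, type0.
def foo(x: int, y: float, z: str) -> bool:
    pass


print(type_only_signature(foo))  # (: int, : float, : str) -> bool
```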


Robert wrote:
>I really hope, in the long run, people won't have to manually do these
>declarations.

I agree they'll always be there somehow and also with Stefan's comments
about the entry bar.  So, most people not needing them most of the time
doesn't remove the question.


>I am curious, when you read "cdef int * p" do you parse this as "cdef
>(int*) p" or "cdef int (*p)" 'cause for me no matter how well I know
>it's the latter, I think the former (i.e. I think "I'm declaring a
>variable of type p that's of int pointer type.")
>[..]
>Essentially everyone thinks "cdef type var" even though that's not
>currently the true grammar.
>[..]
>The reason ctypedefs help, and are so commonly used for with function
>pointers, is because the existing syntax is just so horrendously bad.
>If there's a clear way to declare a function taking a single float and
>returning a single float, no typedef needed.

No, no, no.  Look, generations of C/C++ programmers have been done a
monumental disservice by textbooks/code/style guides that suggest
"int*  p" is *any less* confusing than spacing "2+3/x" as "2+3 / x".
Early on in my C exposure someone pointed this out and I've never been
confused since.  It's a syntax-semantics confusion.  Concrete syntax
has always been right associative dereference *.  In this syntax family,
the moment any operators []/*/() are introduced, you have to start
formatting it/thinking of it as a real expression, and that formatting
should track the syntax not semantics like in your head "pointer
to"/indirection speak or whatever.  Spacing it as if * were left
associative to the type name is misleading at best.

If you can only think of type decls in general as (type, var) pairs
syntactically and semantically then *of course* you find typedefs more
clear.  They make pairing more explicit & shrink the expression tree to
be more trivial.  You should *still* space the typedef itself in a way
suggestive of the actual concrete syntax -- "typedef int *p" (or
"ctypedef") just like you shouldn't write "2+3 / x".  You should still
not essentially think of "ctypedef type var" *either*, but rather
"typedef basetype expr".  In short, "essentially everyone" *should* think
and be taught and have it reinforced and "gel" by spacing that "basetype
expr" is the syntax to create type-var bindings semantically, and only
perceive "type var" as just one simple case.  Taking significance of
space in Python/Cython one step further, "int* p" could even be a hard
syntax error, but I'm not seriously proposing that change.  I really
do not think it is "essentially everyone".  You know better as you
said anyway, but are in conflict with yourself, I think syntax-semantics
wise.

Semantically, pointer indirection loads from an address before using,
and sure that can be confusing to new programmers in its own right.
Trying to unravel that confusion with anti-syntax spacing/thought
cascades the wrong way out of the whole situation and contributes
to your blocked assimilation.  *If* the space guides you or, barring
space, parens guide you, you quickly get to never forgetting that types
are inside-out/inverse/what-I-get-if specifications.  Note that this
is just as it would be if +,/ had some tricky concepts somehow "fixable"
by writing "2+3 / x" all the time.  Arithmetic isn't a binding..so
the analogy is hard to complete, but my point is ad nauseam at this
stage (or even sooner! ;-).  Undermine syntax with contrary whitespace
and of course it will seem bad/be harder.  It might even lock you in
to thought patterns that make it really hard to think about it how
you know you "oug

Re: [Cython] New function (pointer) syntax.

2014-11-07 Thread C Blake
Robert Bradshaw robertwb at gmail.com wrote: 
>Quick: is that a pointer to an array or 10 pointers to ints? Yes, I
>know what it is, but the thing is without knowing C (well) it's not
>immediately obvious what the precedence should be.

If you're gonna pick something e.g. like that, it should not be something
people see all the time like int main(int argc, char *argv[]).  ;-)
*That* I recognized correctly in far less time than it took me to read
your question text.
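
For anyone who wants to check the two readings mechanically, here is a
small sketch using the standard ctypes module.  The names IntPtr,
ArrayOfPtrs and PtrToArray are made up for illustration; the C declarations
in the comments are the ones being compared.

```python
import ctypes

IntPtr = ctypes.POINTER(ctypes.c_int)

# C: int *foo[10];    array of 10 pointers to int ([] binds tighter than *)
ArrayOfPtrs = IntPtr * 10

# C: int (*bar)[10];  one pointer to an array of 10 ints
PtrToArray = ctypes.POINTER(ctypes.c_int * 10)

ptr_size = ctypes.sizeof(ctypes.c_void_p)
print(ctypes.sizeof(ArrayOfPtrs) == 10 * ptr_size)  # ten pointers wide
print(ctypes.sizeof(PtrToArray) == ptr_size)        # a single pointer

# The use-expression *a[i] mirrors the declaration: index first, then deref.
a = ArrayOfPtrs()
x = ctypes.c_int(7)
a[3] = ctypes.pointer(x)
print(a[3].contents.value)  # 7
```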

Here's a counter: printf("%s", *a[i]) - what does it do?  I submit that
if *a[i] is hard or not "real quick" then, eh, the real problem
here may be that you've got enough syntaxes floating in your brain that
you haven't internalized all the operator rules of this one.  In that
case, nothing "operator-oriented" is going to make you truly happy in
the long-run.

I think this whole anti-declarator campaign is misguided.  I agree with
Greg that experienced C programmers, such as C library writers, know it.
In trying to "fix it", you are just squeezing complexity jello maybe to
oil the squeakiest wheel of casual users.  You can eke out a level or
maybe two of simplicity, but you double up syntax.  You can make ()s
better and make []s harder..middle-scale complexity better/full scale
a little worse.  That kind of thing.  The net result of these attempts
doesn't strike me as better or worse except that different is worse.

The closer to exactly C types you can get the more coherent the overall
system is unless/until Python land has a commonly used type setup that
is adequate.  The ctypes module has a way to do this and Numba can use
that, though I have little experience doing so.  The context is different
in that in Cython you have more opportunity to create new syntax, but
should you do so?

That brings up a totally different note: could you maybe compile-time eval()
ctypes stuff if you really hate C decls and want to be more pythonic?
If the answer to ctypes is "Err..  not terse/compact/part of the syntax
enough", well, declarators are a fine answer to terseness, that's for
sure.  Mostly operator text. ;-)
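
As a rough illustration of what the ctypes spelling looks like for the
"function taking a single float and returning a single float" case, here is
a sketch in ordinary run-time Python.  It says nothing about how a
compile-time eval() in Cython would work; UnaryDouble, halve and halve_ptr
are hypothetical names.

```python
import ctypes

# C: double (*)(double)   a function taking one double and returning a double
UnaryDouble = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)


def halve(x):
    return x / 2.0


# Wrap a Python callable so it can be passed where C expects that pointer type.
halve_ptr = UnaryDouble(halve)
result = halve_ptr(3.0)
print(result)  # 1.5
```

Compared with the declarator "double (*f)(double)", the return type comes
first and the argument types follow as a flat list, so the ctypes form is
longer but has no precedence rules to learn.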


>Cython's target audience includes lots of people who don't know C well.

One leans differently based on what you think Cython "mostly" is..Bridge
to libs, Its own thing, Compiled python, etc. (I know it's all the above).


>If they were good, they would be easy to learn, no expert teaching required.

All syntax requires learning..e.g, Stefan expressed a harder time than you
visually unpacking some of the "->" exprs.  All teaching can be messed up.
Teaching methods can fall into bad ruts.  In this case some teaching/practice
actively blocks assimilation..possibly in a long-term sense like mishearing
the lyrics of song or a person's name and then the actual case never sounding
quite right for a long time.  Or people with strong accents of your spoken
past dragging you back into accented speech yourself.  (Mis-)reinforced
language is *tough* and can transcend good teaching.  Part of this thread
was you & Stefan both having an implied question of "why do I read it one
way when I darn well know it's the other".  I was trying to help answer
that question.  Maybe I'm wrong about your individual case(s).  In my
experience, people's trouble is not just "tokens being on
both sides".  Most people are used to [] and () being to the right.  It's
active mis-reinforcement stuff like spacing/thinking const char instead of
char const,.. that makes it hard.  Unless you can avoid declarator style
100%, it's better to make people confront it sooner than do half-measures.


> syntax != semantics => baddness

It's only "misperceived/taught/assimilated semantics"-syntax divergence.
I do consider the misperception unfortunate and unnecessary at the outset.
The "pairing semantics" that "feel" like they diverge from the syntax for
you are 'weak/vague' and *should* have lower priority in your head.
It's only really 1-type, 1-var pairs in function sigs.  Even you like
"type varlist" in var or struct decls.  I mean, c'mon: "int i, j" rocks. :)
So, it's not always 1-1.  Sometimes it's 1-1, sometimes 1-many aka 1 to
a non-trivial expr.  If you go with a full expression instead of just a
list, you get stuff in return as a bonus..You get stuff and lose stuff.
Declarators are not some hateful, hateful thing to always avoid - there
are pros and cons like anything, not "zero pros except for history" as
you seem to say.


Anyway, "close but different" is the order of the day and is subjective.
Keeping in mind how all three function type cases look - the def/cdef,
call site, type cast/type spec - should be ever present in all of your
syntax evaluations, and so should less trivial cases like
returning a function pointer from a lookup.  *Dropping/swapping* parts
is arguably closer than changing how operators work.  *New* operators
are better than making old ones multi-personality.  I don't like that
cdef char *(..) approach on those grounds.  -1 is my $.02.

The 

Re: [Cython] New function (pointer) syntax.

2014-11-08 Thread C Blake
>But I admit it's hard to come up with an objective measure for how
>good a syntax is...if it's natural to you than that's great.

I think those queries you mention will mostly be biased by the squeakier
wheels being more beginning people and that's not a very good argument
or metric.  I agree an objective measure of "goodness" or "understanding"
is hard, but I happen to run Gentoo and keep my sources around.  So, I
did a quick grep over .c and .h files in 600 packages on my system..
pretty diverse: no one style guide or style or maintainer..Not even any
very common domains..utilities, libraries, all sorts of stuff.

$ grep '[a-zA-Z0-9_][a-zA-Z0-9_]\*\*[^ ]' `find -type f -name '*.[ch]'` |
grep -v '/\*\*' | grep -v '\*\*/' | wc -l
3468

$ grep '[a-zA-Z0-9_][a-zA-Z0-9_]  *\*\*[^ ]' `find -type f -name '*.[ch]'` |
grep -v '/\*\*' | grep -v '\*\*/' | wc -l
68900

In other words, over 95% of the instances spaced the '**' as if they knew
it bound to the token on its right.  ('**' is easier than '*' since the
latter could be multiplies but '**' almost never is).
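
For anyone who would rather rerun this kind of count without shell quoting,
here is a rough Python equivalent of the two greps.  The regexes are direct
translations of the patterns above; count_styles and the sample lines are
made up purely to exercise them, and the last two lines just redo the
percentage from the two counts reported.

```python
import re

# Direct translations of the two grep patterns.
HUGS_TYPE = re.compile(r"[a-zA-Z0-9_][a-zA-Z0-9_]\*\*[^ ]")    # e.g. (char**)a
HUGS_NAME = re.compile(r"[a-zA-Z0-9_][a-zA-Z0-9_] +\*\*[^ ]")  # e.g. char **argv


def count_styles(lines):
    """Count matching lines per style, skipping /** and **/ comment markers."""
    hugs_type = hugs_name = 0
    for line in lines:
        if "/**" in line or "**/" in line:
            continue
        if HUGS_TYPE.search(line):
            hugs_type += 1
        if HUGS_NAME.search(line):
            hugs_name += 1
    return hugs_type, hugs_name


# Made-up sample lines, not taken from the corpus.
sample = [
    "int main(int argc, char **argv)",
    "static void walk(node **head);",
    "x = (char**)a;",
    "/** doc comment **/",
]
print(count_styles(sample))  # (1, 2)

# Share of the right-binding style from the two reported counts.
share = 68900 / (68900 + 3468)
print(round(100 * share, 1))  # 95.2
```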

Yes, greps are way approximate.  Yes, some real parser would be better,
but that just took me only a few minutes.  I visually inspected what they
were catching by |less instead of |wc and both cases seemed to mostly be
catching decl/type uses as intended..less than a few percent error on
that.  If anything, the most glaring question was that 3468 "type**" cases
were highly polluted with near 50% questionable "no whitespace at all"
instances like (char**)a.  Maintainers who know better might accept
patches and lazily not fix confusing formatting.  So, in a couple ways
that 5% confused is an upper bound in this corpus (under a spacing = no
confusion hypothesis).  And, sure, confused people might format
non-confusingly.  And maybe '**' itself is slightly selecting for
less confused people.

Even so, 95..97.5% to me == "essentially no one" to you by some, let's
say "not totally bonkers" measure suggests that we are just thinking of
highly different populations of people.  Even if you think my methods way
hokey, it's probably at least suggestive "essentially no one" is a far
bigger set than you thought before.  So, I agree/disagree with some other
things you said.  Initializers are an (awfully convenient) aberration,
but your example of teaching is just an example of bad teaching - so what?
A list of vars of one type is easily achieved even thinking as you want
with a typedef, and having both declarators and typedefs gets you
everything you want.  Still, disagreements aside, I give up trying to
convince you of anything beyond that you *just might* have a very
skewed perception of how confused about declarators people coming to
Cython from a C/C++ background, or people who write C/C++ in general, are.
But it seems in this arc anyway you aren't trying to target them or
C code integration coherence or such...Period!  As per...

>I'm hoping we can avoid it 100% :-) for anyone who doesn't have to
>actually interact with C.

So, you're leaning hard on the Cython as a Python compiler direction.
I think Cython in general should probably either be A) as C-like as
possible or B) as Python-like as possible.  Given your (I still think
misguided) hatred of C function pointer syntax/scenario A), there's
your probable answer - be as Py-like as possible.  Given that, for
just function types, that seems to mean either:
A) the "lambda type1, type2: type3" proposal,
B) what mypy does which is roughly Function[ [type1, type2], type3 ],
or possibly C) what Numba does if that really catches on.
or maybe the (type, type) -> rtype though that seems unpopular here,
but almost surely not that "char*(..)" thing.

In a few years the mypy approach may well be a PEP approved lint/typing
approach and people coming from Python will at least already have maybe
seen it.  In dozens of emails 2..3 months ago Guido was really strongly
promoting mypy, but I think it is in some kind of a-PEP-needs-to-be-
written limbo.  Here is a link to the relevant sub-part for those who
haven't looked at it:

http://www.mypy-lang.org/tutorial.html#callables
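
A small sketch of that nested-list shape, using Callable from the standard
typing module, which spells the same structure as
Callable[[type1, type2], type3].  This is ordinary Python, not a proposal
for the Cython grammar; UnaryFloat and lookup are hypothetical names.

```python
from typing import Callable, get_args

# Type of a function taking one float and returning a float.
UnaryFloat = Callable[[float], float]


def lookup(name: str) -> UnaryFloat:
    """Hypothetical lookup that hands back a function of that type."""
    table = {"double": lambda x: 2.0 * x, "negate": lambda x: -x}
    return table[name]


# The argument types and the return type stay separately inspectable.
print(get_args(UnaryFloat) == ([float], float))  # True
print(lookup("double")(1.5))                     # 3.0
```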

I actually like A), the lambda form, better, but not so much better it
should override
what the parent to one of the two Cython syntax communities goes with.
A) is really easy to describe - "just take the function value structure
but use types instead of variables/expression value".

There are some other styles like pytypedecl or obiwan and such that
might also be worth looking into before you decide.  I haven't looked
at them, but thought I should mention them.
___
cython-devel mailing list
cython-devel@python.org
https://mail.python.org/mailman/listinfo/cython-devel