Re: [Cython] New function (pointer) syntax.
I think you should just use the C declarator syntax. Cython already allows you to say "cdef int *foo[10]". Declarators aren't bad - just poorly taught, though I can see some saying those are the same thing. More below.

I absolutely like the declarator one the most, and the lambda one second most. Declarator style makes it far easier to take C code/.h files using function pointers over to Cython. So, this discussion also depends on whether you view Cython as a bridge to C libs or its own island/bias toward pure Py3.

One other proposal that might appease Stefan's "Python lexical cues" where he was missing "def" would be to take the Python3 function definition header syntax and strip just variable names. I.e., keep the ":"s

    def foo(x: type1, y: type2, z: type3) -> type0: pass

goes to

    (: type1, : type2, : type3) -> type0 [ident]

I don't think that looks like anything else that might be valid in C or Python. It does just what C does - strip variable names from a function header, but the ":"s maybe key your brain into a different syntax mode since they are arguably more rare in C. (Besides stripping names, C has some extra parens for precedence/associativity - which admittedly are tuned to make use-expressions simpler than type-expressions.)

Anyway, I don't really like my own proposal above. Just mentioning it for completeness in case the ":"s help anyone.

Robert wrote:
>I really hope, in the long run, people won't have to manually do these
>declarations.

I agree they'll always be there somehow and also with Stefan's comments about the entry bar. So, most people not needing them most of the time doesn't remove the question.

>I am curious, when you read "cdef int * p" do you parse this as "cdef
>(int*) p" or "cdef int (*p)" 'cause for me no matter how well I know
>it's the latter, I think the former (i.e. I think "I'm declaring a
>variable of type p that's of int pointer type.")
>[..]
>Essentially everyone thinks "cdef type var" even though that's not
>currently the true grammar.
>[..]
>The reason ctypedefs help, and are so commonly used for with function
>pointers, is because the existing syntax is just so horrendously bad.
>If there's a clear way to declare a function taking a single float and
>returning a single float, no typedef needed.

No, no, no. Look, generations of C/C++ programmers have been done a monumental disservice by textbooks/code/style guides that suggest "int* p" is *any less* confusing than spacing "2+3/x" as "2+3 / x". Early on in my C exposure someone pointed this out and I've never been confused since.

It's a syntax-semantics confusion. In the concrete syntax, the dereference * has always been right associative. In this syntax family, the moment any operators []/*/() are introduced, you have to start formatting it/thinking of it as a real expression, and that formatting should track the syntax, not semantics like the "pointer to"/indirection speak in your head or whatever. Spacing it as if * were left associative to the type name is misleading at best.

If you can only think of type decls in general as (type, var) pairs syntactically and semantically then *of course* you find typedefs more clear. They make pairing more explicit & shrink the expression tree to be more trivial. You should *still* space the typedef itself in a way suggestive of the actual concrete syntax -- "typedef int *p" (or "ctypedef") just like you shouldn't write "2+3 / x". You should still not essentially think of "ctypedef type var" *either*, but rather "typedef basetype expr".

In short, "essentially everyone" *should* think and be taught and have it reinforced and "gel" by spacing that "basetype expr" is the syntax to create type-var bindings semantically, and only perceive "type var" as just one simple case. Taking significance of space in Python/Cython one step further, "int* p" could even be a hard syntax error, but I'm not seriously proposing that change.
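
To make the "basetype expr" reading concrete, here is a rough sketch of a toy reader for C declaration lists. It is only an illustration, not anything Cython or a C compiler actually does: the `describe` helper is hypothetical, and it assumes a one-word base type, leading `*` operators and trailing `[N]` suffixes only (no parentheses, no function declarators).

```python
import re

# Hypothetical helper: reads "basetype expr, expr, ..." the way C's grammar
# does. Assumes a single-word base type, prefix '*' and suffix '[N]' only.
DECLARATOR = re.compile(r"(\**)([A-Za-z_]\w*)((?:\[\d+\])*)")


def describe(declaration):
    """Return {name: description} for each declarator in the list."""
    basetype, _, rest = declaration.strip().rstrip(";").partition(" ")
    # A '*' glued to the base type ("int* p") still belongs to the first
    # declarator only, so move it back where the grammar puts it.
    while basetype.endswith("*"):
        basetype, rest = basetype[:-1], "*" + rest
    result = {}
    for expr in rest.split(","):
        match = DECLARATOR.fullmatch(expr.replace(" ", ""))
        if match is None:
            raise ValueError("unsupported declarator: %r" % expr)
        stars, name, arrays = match.groups()
        # Suffix [] binds tighter than prefix *, so arrays are read first.
        words = ["array of " + n for n in re.findall(r"\[(\d+)\]", arrays)]
        words += ["pointer to"] * len(stars)
        result[name] = " ".join(words + [basetype])
    return result


print(describe("int* p, q;"))    # the '*' applies to p only
print(describe("int *foo[10]"))  # [] is applied before *
```

Under those assumptions, "int* p, q;" comes out with p as a pointer and q as a plain int, which is the case where the "type var" reading goes wrong.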
I really do not think it is "essentially everyone". You know better, as you said anyway, but are in conflict with yourself, I think, syntax-semantics wise. Semantically, pointer indirection loads from an address before using, and sure that can be confusing to new programmers in its own right. Trying to unravel that confusion with anti-syntax spacing/thought cascades the wrong way out of the whole situation and contributes to your blocked assimilation. *If* the space guides you or, barring space, parens guide you, you quickly get to never forgetting that types are inside-out/inverse/what-I-get-if specifications.

Note that this is as it would be if +,/ had tricky concepts somehow "fixable" by writing "2+3 / x" all the time. Arithmetic isn't a binding..so the analogy is hard to complete, but my point is ad nauseam at this stage (or even sooner! ;-). Undermine syntax with contrary whitespace and of course it will seem bad/be harder. It might even lock you in to thought patterns that make it really hard to think about it how you know you "oug
Re: [Cython] New function (pointer) syntax.
Robert Bradshaw robertwb at gmail.com wrote:
>Quick: is that a pointer to an array or 10 pointers to ints? Yes, I
>know what it is, but the thing is without knowing C (well) it's not
>immediately obvious what the precedence should be.

If you're gonna pick something e.g. like that, it should not be something people see all the time like int main(int argc, char *argv[]). ;-) *That* I recognized correctly in far less time than it took me to read your question text. Here's a counter: printf("%s", *a[i]) - what does it do?

I submit that if *a[i] is hard or not "real quick" then, eh, the real problem here may be that you've got enough syntaxes floating in your brain that you haven't internalized all the operator rules of this one. In that case, nothing "operator-oriented" is going to make you truly happy in the long run.

I think this whole anti-declarator campaign is misguided. I agree with Greg that experienced C programmers, such as C library writers, know it. In trying to "fix it", you are just squeezing complexity jello, maybe to oil the squeakiest wheel of casual users. You can eke out a level or maybe two of simplicity, but you double up syntax. You can make ()s better and make []s harder..middle-scale complexity better/full scale a little worse. That kind of thing. The net result of these attempts doesn't strike me as better or worse except that different is worse.

The closer to exactly C types you can get, the more coherent the overall system is unless/until Python land has a commonly used type setup that is adequate. The ctypes module has a way to do this and Numba can use that, though I have little experience doing so. The context is different in that in Cython you have more opportunity to create new syntax, but should you do so?

That brings up a totally other note: could you maybe compile-time eval() ctypes stuff if you really hate C decls and want to be more pythonic? If the answer to ctypes is "Err..
not terse/compact/part of the syntax enough", well, declarators are a fine answer to terseness, that's for sure. Mostly operator text. ;-)

>Cython's target audience includes lots of people who don't know C well.

One leans differently based on what you think Cython "mostly" is..Bridge to libs, Its own thing, Compiled python, etc. (I know it's all the above).

>If they were good, they would be easy to learn, no expert teaching required.

All syntax requires learning..e.g., Stefan expressed a harder time than you visually unpacking some of the "->" exprs. All teaching can be messed up. Teaching methods can fall into bad ruts. In this case some teaching/practice actively blocks assimilation..possibly in a long-term sense like mishearing the lyrics of a song or a person's name and then the actual case never sounding quite right for a long time. Or people with strong accents of your spoken past dragging you back into accented speech yourself. (Mis-)reinforced language is *tough* and can transcend good teaching.

Part of this thread was you & Stefan both having an implied question of "why do I read it one way when I darn well know it's the other". I was trying to help answer that question. Maybe I'm wrong about your individual case(s). In my experience, people's trouble is not just "tokens being on both sides". Most people are used to [] and () being to the right. It's active mis-reinforcement stuff like spacing/thinking const char instead of char const,.. that makes it hard. Unless you can avoid declarator style 100%, it's better to make people confront it sooner than do half-measures.

> syntax != semantics => baddness

It's only "misperceived/taught/assimilated semantics"-syntax divergence. I do consider the misperception unfortunate and unnecessary at the outset. The "pairing semantics" that "feel" like they diverge from the syntax for you are 'weak/vague' and *should* have lower priority in your head. It's only really 1-type, 1-var pairs in function sigs.
Even you like "type varlist" in var or struct decls. I mean, c'mon: "int i, j" rocks. :) So, it's not always 1-1. Sometimes it's 1-1, sometimes 1-many aka 1 to a non-trivial expr. If you go with a full expression instead of just a list, you get stuff in return as a bonus..You get stuff and lose stuff. Declarators are not some hateful, hateful thing to always avoid - there are pros and cons like anything, not "zero pros except for history" as you seem to say.

Anyway, "close but different" is the order of the day and is subjective. Keeping in mind how all three function type cases look - the def/cdef, call site, type cast/type spec - should be ever present in the syntax evaluations of all you guys, and so should less trivial cases like returning a function pointer from a lookup. *Dropping/swapping* parts is arguably closer than changing how operators work. *New* operators are better than making old ones multi-personality. I don't like that cdef char *(..) approach on those grounds. -1 is my $.02.

The
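
For the ctypes route mentioned in this message, here is a minimal sketch of how the two readings of the "Quick:" question, and a float-to-float function pointer type, can be spelled with the standard ctypes module. The names `ArrayOfPointers`, `PointerToArray`, `FloatToFloat` and `half` are made up for illustration; this shows ctypes itself, not anything Cython supports at compile time.

```python
import ctypes

# C: int *a[10];    array of 10 pointers to int ([] binds before *)
ArrayOfPointers = ctypes.POINTER(ctypes.c_int) * 10

# C: int (*a)[10];  pointer to an array of 10 ints
PointerToArray = ctypes.POINTER(ctypes.c_int * 10)

# C: float (*f)(float);  pointer to a function taking and returning float.
# CFUNCTYPE takes the return type first, then the argument types.
FloatToFloat = ctypes.CFUNCTYPE(ctypes.c_float, ctypes.c_float)


def half(x):
    """Plain Python function wrapped below as a C-callable callback."""
    return x / 2.0


callback = FloatToFloat(half)

# "Declaration mirrors use": with int *a[10], the expression *a[i] is an int.
values = [ctypes.c_int(i * i) for i in range(10)]
a = ArrayOfPointers(*[ctypes.pointer(v) for v in values])
third = a[3].contents.value  # the ctypes spelling of *a[3]

print(ctypes.sizeof(ArrayOfPointers), ctypes.sizeof(PointerToArray))
print(callback(3.0), third)
```

The sizes make the difference visible: the first type holds ten pointers, the second is a single pointer.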
Re: [Cython] New function (pointer) syntax.
>But I admit it's hard to come up with an objective measure for how
>good a syntax is...if it's natural to you than that's great.

I think those queries you mention will mostly be biased by the squeakier wheels being more beginning people and that's not a very good argument or metric.

I agree an objective measure of "goodness" or "understanding" is hard, but I happen to run Gentoo and keep my sources around. So, I did a quick grep over .c and .h files in 600 packages on my system.. pretty diverse: no one style guide or style or maintainer..Not even any very common domains..utilities, libraries, all sorts of stuff.

    $ grep '[a-zA-Z0-9_][a-zA-Z0-9_]\*\*[^ ]' `find -type f -name '*.[ch]'` | grep -v '/\*\*' | grep -v '\*\*/' | wc -l
    3468
    $ grep '[a-zA-Z0-9_][a-zA-Z0-9_] *\*\*[^ ]' `find -type f -name '*.[ch]'` | grep -v '/\*\*' | grep -v '\*\*/' | wc -l
    68900

In other words, over 95% of the instances spaced the '**' as if they knew it bound to the token on its right. ('**' is easier than '*' since the latter could be multiplies but '**' almost never is).

Yes, greps are way approximate. Yes, some real parser would be better, but that took me only a few minutes. I visually inspected what they were catching by |less instead of |wc and both cases seemed to mostly be catching decl/type uses as intended..less than a few percent error on that. If anything, the most glaring question was that the 3468 "type**" cases were highly polluted with near 50% questionable "no whitespace at all" instances like (char**)a.

Maintainers who know better might accept patches and lazily not fix confusing formatting. So, in a couple of ways that 5% confused is an upper bound in this corpus (under a spacing = no confusion hypothesis). And, sure, confused people might format non-confusingly. And maybe '**' itself is slightly selecting for less confused people.
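
As a rough check on the arithmetic, and for anyone who wants to repeat the count without grep, here is a small sketch. The `classify`, `tally` and `spaced_share` helpers are made-up names, and the "spaced" pattern assumes the second grep was meant to require at least one space before the '**'.

```python
import re

# Mirrors the first grep: two identifier characters glued to '**',
# followed by a non-space.
GLUED = re.compile(r"[a-zA-Z0-9_][a-zA-Z0-9_]\*\*[^ ]")
# Assumed intent of the second grep: at least one space before '**'.
SPACED = re.compile(r"[a-zA-Z0-9_][a-zA-Z0-9_] +\*\*[^ ]")


def classify(line):
    """Return 'glued', 'spaced', or None for one line of C source."""
    if "/**" in line or "**/" in line:  # same filter as the grep -v steps
        return None
    if GLUED.search(line):
        return "glued"
    if SPACED.search(line):
        return "spaced"
    return None


def tally(lines):
    """Count glued and spaced '**' lines, like the two `wc -l` results."""
    counts = {"glued": 0, "spaced": 0}
    for line in lines:
        kind = classify(line)
        if kind is not None:
            counts[kind] += 1
    return counts


def spaced_share(glued, spaced):
    """Fraction of counted lines that space '**' toward its right operand."""
    return spaced / (glued + spaced)


# The two counts quoted above: 68900 / (3468 + 68900)
print(round(spaced_share(3468, 68900), 3))
```

With the two counts from the greps this gives about 0.952, i.e. the "over 95%" figure.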
Even so, 95..97.5% to me == "essentially no one" to you by some, let's say "not totally bonkers" measure suggests that we are just thinking of highly different populations of people. Even if you think my methods way hokey, it's probably at least suggestive that "essentially no one" is a far bigger set than you thought before.

So, I agree/disagree with some other things you said. Initializers are an (awfully convenient) aberration, but your example of teaching is just an example of bad teaching - so what? A list of vars of one type is easily achieved even thinking as you want with a typedef, and having both declarators and typedefs gets you everything you want.

Still, disagreements aside, I give up trying to convince you of anything beyond that you *just might* have a very skewed perception of how confused about declarators people are who come to Cython from a C/C++ background, or people who write C/C++ in general. But it seems in this arc anyway you aren't trying to target them or C code integration coherence or such...Period! As per...

>I'm hoping we can avoid it 100% :-) for anyone who doesn't have to
>actually interact with C.

So, you're leaning hard on the Cython as a Python compiler direction. I think Cython in general should probably either be A) as C-like as possible or B) as Python-like as possible. Given your (I still think misguided) hatred of C function pointer syntax/scenario A), there's your probable answer - be as Py-like as possible.

Given that, for just function types, that seems to mean either:

  A) the "lambda type1, type2: type3" proposal,
  B) what mypy does, which is roughly Function[ [type1, type2], type3 ], or possibly
  C) what Numba does if that really catches on,

or maybe the (type, type) -> rtype though that seems unpopular here, but almost surely not that "char*(..)" thing.

In a few years the mypy approach may well be a PEP approved lint/typing approach and people coming from Python will at least already have maybe seen it.
In dozens of emails 2..3 months ago Guido was really strongly promoting mypy, but I think it is in some kind of a-PEP-needs-to-be-written limbo. Here is a link to the relevant sub-part for those who haven't looked at it:

    http://www.mypy-lang.org/tutorial.html#callables

I actually like A) better, but not so much better it should override what the parent to one of the two Cython syntax communities goes with. A) is really easy to describe - "just take the function value structure but use types instead of variables/expression value".

There are some other styles like pytypedecl or obiwan and such that might also be worth looking into before you decide. I haven't looked at them, but thought I should mention them.
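
For reference, here is a small sketch of option B) using the standard-library typing module, whose `Callable[[type1, type2], type3]` has the same nested-list shape as the Function[ [type1, type2], type3 ] form above. The `callable_type` helper and the `foo` example are made up for illustration; the helper applies the "strip the variable names from the def header" idea to an annotated Python function.

```python
from typing import Callable, get_type_hints


def foo(x: int, y: float, z: str) -> bool:
    """Example function with a fully annotated header."""
    return bool(x) and y > 0 and bool(z)


def callable_type(func):
    """Build the function's type by dropping names and keeping annotations.

    Hypothetical helper: assumes every parameter and the return value
    carry an annotation.
    """
    hints = get_type_hints(func)  # parameters in order, then 'return'
    return_type = hints.pop("return")
    return Callable[list(hints.values()), return_type]


FooType = callable_type(foo)
print(FooType)


# The same type written by hand, as one would in an annotation.
def apply(f: Callable[[int, float, str], bool]) -> bool:
    return f(1, 2.0, "three")


print(apply(foo))
```

The derived type and the handwritten one compare equal, so either spelling can be used in annotations.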