GNU C extension: Function Error vs. Success
Hi,

First, let me say that I'm not subscribed to the mailing list, so
please CC myself when responding.

This post is to discuss a possible extension to the GNU C language.
Note that this is still an idea and not refined.

Background
==========

In C, the following code structure is ubiquitous:

    return_value = function_call(arguments);
    if (return_value == ERROR_VALUE)
        goto exit_fail;

You can take a look at goto usages in the Linux kernel just for
examples (https://github.com/torvalds/linux/search?q=goto).

However, this method has one particular drawback, besides verbosity:
each function has to designate (at least) one special value as
ERROR_VALUE. Trivial as it may seem, this has by itself resulted in
many inconsistencies and problems. For example, `malloc` signals
failure by returning `NULL`, `strtod` may return 0, `HUGE_VAL*` etc,
`fread` returns 0 which is not necessarily an error case either,
`fgetc` returns `EOF`, `remove` returns nonzero if failed, `clock`
returns -1 and so on.

Sometimes such a special value may not even be possible, in which
case a workaround is required (pass a pointer argument to receive the
result and return the state of success).

The following suggestion allows clearer and shorter error handling.

The Extension (Basic)
=====================

First, let's introduce a new syntax (note again, this is just an
example. I don't suggest these particular symbols):

    float inverse(int x)
    {
        if (x == 0)
            fail;
        return 1.0f / x;
    }

    ...
    y = inverse(x) !! goto exit_inverse_failed;

The semantics of this syntax would be as follows. The function
`inverse` can choose to `fail` instead of `return`, in which case it
doesn't actually return anything. At the call site, this failure is
signaled (speculations on details below), `y` is not assigned and a
`goto exit_inverse_failed` is executed.

The observed behavior would be equivalent to:

    int inverse(int x, float *y)
    {
        if (x == 0)
            return -1;
        *y = 1.0f / x;
        return 0;
    }

    ...
    if (inverse(x, &y))
        goto exit_inverse_failed;

The Extension (Advanced)
========================

Sometimes, error handling is done not just by a single `goto`
(although they can all be reduced to this). For example:

    return_value = function_call(arguments);
    if (return_value == ERROR_VALUE)
    {
        /* a small piece of code, such as printing an error */
        goto exit_fail;
    }

This could be shortened as:

    return_value = function_call(arguments) !!
    {
        /* a small piece of code, such as printing an error */
        goto exit_fail;
    }

A generic syntax could therefore be used:

    return_value = function_call(arguments) !! goto exit_fail;
    return_value = function_call(arguments) !! fail;
    return_value = function_call(arguments) !! return 0;
    return_value = function_call(arguments) !! { /* more than one statement */ }

Another necessity is the error code. While `errno` is usable, it's
not the best solution in the world. Extending the syntax further, the
following could be used (again, syntax is just for the sake of
example, I'm not suggesting these particular symbols):

    float inverse(int x)
    {
        if (x == 0)
            fail EDOM;
        return 1.0f / x;
    }

    ...
    y = inverse(x) !!= error_code !! goto exit_inverse_failed;

By this, the function `inverse` can `fail` with an error code (again,
speculations on details below), which can be stored in a variable
(`error_code`) at the call site.

Some Details
============

The state of failure and success as well as the failure code can be
kept in registers, to keep the ABI backward-compatible.

If backward compatibility is required, a `fail`able function must
still provide a fail value (simply to keep older code intact), which
could have a syntax as follows (for example):

    float inverse(int x) !! 0
    {
        if (x == 0)
            fail EDOM;
        return 1.0f / x;
    }

    ...
    y = inverse(x);

In this example, the caller doesn't check for failure and would
receive the fail value indicated by the function signature. If no
such fail value is given, the caller must check for failure. This
allows older code, such as the standard library, to be used the way it
always has been (by providing a fail value) or with this extension,
while allowing cleaner and more robust code to be written (by not
providing a fail value).

Examples
========

Here are some examples. Opening a file and reading a number (normal
C):

    int n;
    FILE *fin = fopen("filename", "r");
    if (fin == NULL)
        goto exit_no_file;
    if (fscanf(fin, "%d", &n) != 1)
    {
        if (ferror(fin))
            goto exit_io_error;
        else
        {
            /* complain about format */
        }
    }
    fclose(fin);
    return 0;
    exit_io_error:
        /* print error: I/O error */
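For comparison, here is a minimal sketch of how the `fail EDOM` and `!!= error_code !! goto ...` forms from this post can be spelled in standard C today, using the out-parameter workaround the post itself describes. The helper `print_inverse` and the choice to return the error code directly are assumptions made for illustration only; nothing here is a proposed implementation of the extension.

```c
#include <errno.h>
#include <stdio.h>

/* Standard-C stand-in for the proposed `fail EDOM`: the result travels
 * through an out-parameter, and the return value is 0 on success or an
 * errno-style code on failure. */
static int inverse(int x, float *result)
{
    if (x == 0)
        return EDOM;        /* proposed spelling: fail EDOM; */
    *result = 1.0f / x;     /* proposed spelling: return 1.0f / x; */
    return 0;
}

/* Hypothetical caller in the goto-cleanup style from the Background
 * section. Returns 0 on success, or the code reported by inverse(). */
static int print_inverse(int x)
{
    float y;
    int error_code;

    /* proposed spelling:
     *     y = inverse(x) !!= error_code !! goto exit_inverse_failed; */
    error_code = inverse(x, &y);
    if (error_code != 0)
        goto exit_inverse_failed;

    printf("1/%d = %f\n", x, y);
    return 0;

exit_inverse_failed:
    fprintf(stderr, "inverse(%d) failed with code %d\n", x, error_code);
    return error_code;
}
```

Calling `print_inverse(4)` takes the success path, while `print_inverse(0)` jumps to the label and hands `EDOM` back to its own caller. The cost the proposal wants to remove is visible here: the real result has been pushed out of the return slot to make room for the status.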
Re: GNU C extension: Function Error vs. Success
Hi Julian,

Thanks for the feedback.

Regarding C++ exceptions: exceptions are not really nice. They can
just make your function return without you even knowing it (forgetting
a `try/catch` or not knowing it may be needed, which is C++'s fault
and probably could have been done better). Also, they require
complicated operations. You can read a small complaint about it here:
http://stackoverflow.com/a/1746368/912144 and I'm sure there are many
others on the internet.

Regarding OCaml option types: in fact I did think of this separation
after learning about monads (through Haskell though). However,
personally I don't see how monads could be introduced in C without
majorly affecting syntax and ABI.

Regarding the syntax: You are absolutely right. I don't claim that
this particular syntax is ideal. I'm sure the many minds in this
mailing list are able to find a more beautiful syntax, if they are
interested in the idea. Nevertheless, the following code:

    x = foo() + bar();

doesn't do any error checking. I.e. it assumes `foo` and `bar` are
unfailable. If that is the case, there is no need for a
`!! goto fail_label` at all. I personally have never seen such an
expression followed by e.g.

    if (x )
        goto foo_or_bar_failed;

On the other hand, something like this is common:

    while (func(...) == 0)

which, if turned to e.g.:

    while (func(...) !! break)

or

    fail_code=0;
    while (func(...) !!= fail_code, fail_code == 0)

could seem awkward at best.

My hope is that through this discussion, we would be able to figure
out a way to separate success and failure of functions with minimal
change to the language. My syntax is based on keeping the return value
intact while returning the success-failure state and error code in
registers, both for speed and compatibility, and letting the compiler
generate the repetitive/ugly error-checking code. Other than that, I
personally don't have any attachment to the particular way it's
embedded in the grammar of GNU C.
On Mon, Mar 10, 2014 at 3:50 PM, Julian Brown wrote:
> On Mon, 10 Mar 2014 15:27:06 +0100
> Shahbaz Youssefi wrote:
>
>> Feedback
>> ========
>>
>> Please let me know what you think. In particular, what would be the
>> limitations of such a syntax? Would you be interested in seeing this
>> extension to the GNU C language? What alternative symbols do you think
>> would better show the intention/simplify parsing/look more beautiful?
>
> I suggest you think about how this is better than C++ exceptions, and
> also consider alternatives like OCaml's option types that can be used
> to achieve similar ends.
>
> For your suggested syntax at function call sites, consider that
> functions can be called in more complicated ways than simply as "bar =
> foo();" statements, and the part following the "!!" in your examples
> appears to be a statement itself: in more complicated expressions, that
> interleaving of expressions and statements is going to get very ugly
> very quickly. E.g.:
>
>     x = foo() + bar();
>
> would need to become something like:
>
>     x = (foo() !! goto label1) + (bar () !! goto label2);
>
> And there are all sorts of issues with that.
>
> Anyway, I quite like the idea of rationalising error-code returns in C
> code, but I don't think this is the right way of going about it.
>
> HTH,
>
> Julian
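To make the `x = foo() + bar();` case from this exchange concrete, here is a rough sketch of what a per-operand `!! goto` would have to expand to in standard C. The function `half` and the wrapper `sum_of_halves` are hypothetical names invented for this sketch, standing in for `foo` and `bar`. One issue it makes visible is that C leaves the evaluation order of the operands of `+` unspecified, so the expansion has to pick an order and introduce temporaries.

```c
#include <stdio.h>

/* Hypothetical fallible function, invented for this sketch: halves n,
 * failing (nonzero return) when n is odd. */
static int half(int n, int *out)
{
    if (n % 2 != 0)
        return -1;
    *out = n / 2;
    return 0;
}

/* Standard-C spelling of the hypothetical
 *     x = (half(a) !! goto first_failed) + (half(b) !! goto second_failed);
 * Returns 0 on success, 1 if the first call failed, 2 if the second did. */
static int sum_of_halves(int a, int b, int *x)
{
    int lhs, rhs;   /* temporaries that the one-line form hides */

    /* This sketch evaluates left to right; the one-line form would not
     * guarantee any particular order. */
    if (half(a, &lhs) != 0)
        goto first_failed;
    if (half(b, &rhs) != 0)
        goto second_failed;

    *x = lhs + rhs;
    return 0;

second_failed:
    fprintf(stderr, "second operand failed\n");
    return 2;
first_failed:
    fprintf(stderr, "first operand failed\n");
    return 1;
}
```

Note that `*x` is left untouched on both failure paths, which matches the "`y` is not assigned" semantics described in the original post.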
Re: GNU C extension: Function Error vs. Success
Thanks for the hint. I'll try to learn how to do that and experiment
with the idea if/when I get the time.

I can imagine why the community isn't interested in new syntax in
general. Still, you never know if an idea will be attractive enough
to generate some attention! :)

On Mon, Mar 10, 2014 at 4:26 PM, Basile Starynkevitch wrote:
> On Mon, Mar 10, 2014 at 03:27:06PM +0100, Shahbaz Youssefi wrote:
>> Hi,
>>
>> First, let me say that I'm not subscribed to the mailing list, so
>> please CC myself when responding.
>>
>> This post is to discuss a possible extension to the GNU C language.
>> Note that this is still an idea and not refined.
> [...]
>>
>> The Extension (Basic)
>> =====================
>>
>> First, let's introduce a new syntax (note again, this is just an
>> example. I don't suggest these particular symbols):
>>
>>     float inverse(int x)
>>     {
>>         if (x == 0)
>>             fail;
>>         return 1.0f / x;
>>     }
>>
>>     ...
>>     y = inverse(x) !! goto exit_inverse_failed;
>>
>
> Syntax is not that important. To experiment with your idea,
> I would suggest using a mixture of pragmas and builtins;
> you could perhaps have a new builtin_shahbaz_fail() and a pragma
> #pragma SHAHBAZ and then your temporary syntax would be
>
>     float inverse(int x)
>     {
>         if (x == 0) builtin_shahbaz_fail();
>         return 1.0f / x;
>     }
>
>     #pragma SHAHBAZ on_error_goto(exit_inverse_failed)
>     { y = inverse(x); }
>
> Then, you don't need to dig into GCC parser to add these builtin and pragma.
> You could add them with a GCC plugin (in C++) or using MELT
> http://gcc-melt.org/
>
> Once you added a GCC pass to support your builtin and pragma
> (which is difficult, and means understanding the details of internals of GCC)
> you could convince other people.
>
> Notice that the GCC community is not friendly these days to new syntactic
> constructs.
>
> BTW, once you have implemented a builtin and a pragma you could use
> preprocessor macros
> to make these look more like your syntax.
>
> I would believe that MELT is very well suited for such experiments.
>
> Regards.
>
> PS. Plugins cannot extend the C syntax (except thru attributes, builtins,
> pragmas).
>
> --
> Basile STARYNKEVITCH http://starynkevitch.net/Basile/
> email: basilestarynkevitchnet mobile: +33 6 8501 2359
> 8, rue de la Faiencerie, 92340 Bourg La Reine, France
> *** opinions {are only mines, sont seulement les miennes} ***
Re: GNU C extension: Function Error vs. Success
I'm mostly interested in C. Nevertheless, you can of course also do
the same in C:

    struct option_float
    {
        float value;
        int error_code;
        bool succeeded;
    };

    struct option_float inverse(int x)
    {
        if (x == 0)
            return (struct option_float){ .succeeded = false, .error_code = EDOM };
        return (struct option_float){ .value = 1.0f / x, .succeeded = true };
    }

you get the idea. The difference is that it's hard to optimize the
non-error execution path if the compiler is not aware of the
semantics. Also, with exceptions, this can happen:

    float inverse(int x)
    {
        if (x == 0)
            throw overflow;
        return 1.0f / x;
    }

    y = inverse(x);

Which means control is taken from the function calling inverse without
it explicitly allowing it, which is not in the spirit of C.

P.S. programming in a lot of languages is _mere syntax_ with respect
to some others. Still, some syntaxes are good and some not. If we can
improve GNU C's syntax to be shorter, but without loss of
expressiveness or clarity, then why not!

On Mon, Mar 10, 2014 at 6:18 PM, Andrew Haley wrote:
> On 03/10/2014 03:09 PM, Shahbaz Youssefi wrote:
>> Regarding C++ exceptions: exceptions are not really nice. They can
>> just make your function return without you even knowing it (forgetting
>> a `try/catch` or not knowing it may be needed, which is C++'s fault
>> and probably could have been done better). Also, they require
>> complicated operations. You can read a small complaint about it here:
>> http://stackoverflow.com/a/1746368/912144 and I'm sure there are many
>> others on the internet.
>
> A few quibbles here.
>
> Firstly, C++ exceptions do not require complicated operations: an
> implementation may well do complicated things, but that's not the
> same at all. In GCC we use DWARF exception handling, which is
> designed to be near-zero-cost for exceptions that are not thrown,
> but is more expensive when they are.
>
> There is no inherent reason why
>
>     float inverse(int x)
>     {
>         if (x == 0)
>             fail;
>         return 1.0f / x;
>     }
>
>     y = inverse(x) !! goto exit_inverse_failed;
>
> should not generate the same code as
>
>     float inverse(int x)
>     {
>         if (x == 0)
>             throw overflow;
>         return 1.0f / x;
>     }
>
>     try {
>         y = inverse(x);
>     } catch (IntegerOverflow e) {
>         goto exit_inverse_failed;
>     }
>
> This assumes, of course, a knowledgeable optimizing compiler.
>
> Also, consider that C++ can already do almost what you want.
> Here we have a function that returns a float wrapped with a
> status:
>
>     option<float> inverse(float x) {
>         if (x == 0)
>             return option<float>(); // No value...
>         return 1.0f / x;
>     }
>
>     float poo(float x) {
>         option<float> res = inverse(x);
>         if (res.none())
>             return 0;
>         return res;
>     }
>
> GCC generates, quite nicely:
>
>     poo(float):
>             xorps   %xmm1, %xmm1
>             ucomiss %xmm1, %xmm0
>             jp      .L12
>             jne     .L12
>             movaps  %xmm1, %xmm0
>             ret
>     .L12:
>             movss   .LC1(%rip), %xmm1
>             divss   %xmm0, %xmm1
>             movaps  %xmm1, %xmm0
>             ret
>
> The difference between
>
>     y = inverse(x) !! goto exit_inverse_failed;
>
> and
>
>     option<float> y = inverse(x);
>     if (y.none())
>         goto exit_inverse_failed;
>
> is, I suggest to you, mere syntax. The latter is more explicit.
>
> Andrew.
>
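For readers who want to try the C side of this comparison, here is a compilable C99 sketch of the `struct option_float` version from the reply, together with a C counterpart of the quoted `poo` function. The name `inverse_or_zero` is made up for this sketch, and the code makes no claim about what assembly a given compiler produces for it.

```c
#include <errno.h>
#include <stdbool.h>

/* Result wrapper from the reply above: a value plus an explicit status. */
struct option_float {
    float value;
    int error_code;
    bool succeeded;
};

static struct option_float inverse(int x)
{
    if (x == 0)
        return (struct option_float){ .succeeded = false, .error_code = EDOM };
    return (struct option_float){ .value = 1.0f / x, .succeeded = true };
}

/* Hypothetical C counterpart of the quoted C++ function: yields 0 when
 * inverse() reports failure, otherwise the wrapped value. */
static float inverse_or_zero(int x)
{
    struct option_float res = inverse(x);
    if (!res.succeeded)
        return 0.0f;
    return res.value;
}
```

The caller has to remember to look at `succeeded` before touching `value`; nothing in the type system enforces it, which is the gap the proposed `!!` syntax is meant to close.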
Re: GNU C extension: Function Error vs. Success
On Tue, Mar 11, 2014 at 1:26 PM, David Brown wrote:
> On 10/03/14 18:26, Shahbaz Youssefi wrote:
>
> You can tell the compiler about the likely paths:
>
>     struct option_float inverse(int x) {
>         if (__builtin_expect(x != 0, 1)) {
>             return (struct option_float){ .value = 1.0f / x,
>                                           .succeeded = true };
>         } else {
>             return (struct option_float){ .succeeded = false,
>                                           .error_code = EDOM };
>         }
>     }

True, but I was actually referring to the fact that like this, you
have to write the status to the stack, where the return value resides,
while with a built-in method you could get away with returning it in a
register. This is not just for performance, but also to be compatible
with the previous ABI.

> I am not sure that it would be possible to get the sort of effect you
> are looking for without disrupting the syntax too much for a gcc extension.
>
> Speaking as an embedded developer who often wants to get the smallest
> and fastest code on small processors, it would be very nice to have
> the ability to return an extra flag along with the main return value of
> a function. Typically that would be a flag to indicate success or
> failure, but it might have other purposes - and it could be the only
> return value of an otherwise void function. Key to the implementation
> would be a calling convention to use a processor condition code flag
> here - that would let you generate optimal code for the "if (error)
> goto" part.

I too am an embedded developer (with some kernel module programming
too) and what you say is another reason why I'd personally like to see
this happen.

Thanks for the feedback.