GNU C extension: Function Error vs. Success

2014-03-10 Thread Shahbaz Youssefi
Hi,

First, let me say that I'm not subscribed to the mailing list, so
please CC me when responding.

This post is to discuss a possible extension to the GNU C language.
Note that this is still an idea and not refined.

Background
==========

In C, the following code structure is ubiquitous:

    return_value = function_call(arguments);
    if (return_value == ERROR_VALUE)
        goto exit_fail;

You can take a look at goto usages in the Linux kernel just for
examples (https://github.com/torvalds/linux/search?q=goto).

Besides being verbose, this pattern has one particular drawback: each
function has to designate (at least) one special value as ERROR_VALUE.
Trivial as it may seem, this alone has led to many inconsistencies and
problems. For example, `malloc` signals failure by returning `NULL`;
`strtod` may return 0, `HUGE_VAL*`, etc.; `fread` returns 0, which is
not necessarily an error either; `fgetc` returns `EOF`; `remove`
returns nonzero on failure; `clock` returns -1; and so on.

Sometimes such a special value may not even be possible, in which case
a workaround is required (put the return value as a pointer argument
and return the state of success).
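To make the `strtod` case concrete, here is a minimal sketch of the
workaround described above: the parsed value goes out through a pointer
argument and the return value carries only success or failure. The
helper name `parse_double` is hypothetical and not part of any library;
it also accepts trailing characters after the number, which real code
may want to reject.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 0 on success and -1 on failure.
 * The parsed value is written to *out only on success. */
int parse_double(const char *text, double *out)
{
    char *end;
    double value;

    assert(text != NULL && out != NULL);

    errno = 0;
    value = strtod(text, &end);
    if (end == text)        /* nothing was converted */
        return -1;
    if (errno == ERANGE)    /* result out of range */
        return -1;

    *out = value;
    return 0;
}
```

Note that neither 0 nor `HUGE_VAL` has to be treated as a special value
by the caller; both checks are hidden inside the helper.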

The following suggestion allows clearer and shorter error handling.

The Extension (Basic)
=====================

First, let's introduce a new syntax (note again, this is just an
example. I don't suggest these particular symbols):

    float inverse(int x)
    {
        if (x == 0)
            fail;
        return 1.0f / x;
    }

    ...
    y = inverse(x) !! goto exit_inverse_failed;

The semantics of this syntax would be as follows. The function
`inverse` can choose to `fail` instead of `return`, in which case it
doesn't actually return anything. At the call site, this failure is
signaled (speculations on details below), `y` is not assigned and a
`goto exit_inverse_failed` is executed. The observed behavior would be
equivalent to:

    int inverse(int x, float *y)
    {
        if (x == 0)
            return -1;
        *y = 1.0f / x;
        return 0;
    }

    ...
    if (inverse(x, &y))
        goto exit_inverse_failed;

The Extension (Advanced)
========================

Sometimes, error handling is done not just by a single `goto`
(although they can all be reduced to this). For example:

    return_value = function_call(arguments);
    if (return_value == ERROR_VALUE)
    {
        /* a small piece of code, such as printing an error */
        goto exit_fail;
    }

This could be shortened as:

    return_value = function_call(arguments) !! {
        /* a small piece of code, such as printing an error */
        goto exit_fail;
    }

A generic syntax could therefore be used:

    return_value = function_call(arguments) !! goto exit_fail;
    return_value = function_call(arguments) !! fail;
    return_value = function_call(arguments) !! return 0;
    return_value = function_call(arguments) !! {
        /* more than one statement */
    }

An error code is also needed. While `errno` is usable, it is far from
ideal. Extending the syntax further, the following could be used
(again, the syntax is just for the sake of example; I'm not suggesting
these particular symbols):

    float inverse(int x)
    {
        if (x == 0)
            fail EDOM;
        return 1.0f / x;
    }

    ...
    y = inverse(x) !!= error_code !! goto exit_inverse_failed;

This way, the function `inverse` can `fail` with an error code (again,
speculations on details below), which can be stored in a variable
(`error_code`) at the call site.
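For comparison, the observable behavior of `fail EDOM` together with
`!!= error_code` can be approximated in standard C today. The sketch
below is only an emulation under the assumption that the status is
returned as an `int` and the result through a pointer; the wrapper
`use_inverse` is a hypothetical caller added for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Returns 0 on success or an errno-style code on failure.
 * *y is written only on success. */
int inverse(int x, float *y)
{
    assert(y != NULL);
    if (x == 0)
        return EDOM;            /* stands in for "fail EDOM;" */
    *y = 1.0f / x;
    return 0;
}

/* Hypothetical caller: emulates
 *     y = inverse(x) !!= error_code !! goto exit_inverse_failed; */
int use_inverse(int x, float *y, int *error_code)
{
    assert(y != NULL && error_code != NULL);
    *error_code = inverse(x, y);
    if (*error_code != 0)
        goto exit_inverse_failed;
    return 0;

exit_inverse_failed:
    return -1;
}
```

The proposed syntax would let the compiler generate this bookkeeping
and keep `y` as the natural return value.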

Some Details
============

The state of failure and success as well as the failure code can be
kept in registers, to keep the ABI backward-compatible.

If backward compatibility is required, a `fail`able function must
still provide a fail value (simply to keep older code intact), which
could have a syntax as follows (for example):

    float inverse(int x) !! 0
    {
        if (x == 0)
            fail EDOM;
        return 1.0f / x;
    }

    ...
    y = inverse(x);

In this example, the caller doesn't check for failure and would
receive the fail value indicated by the function signature. If no such
fail value is given, the caller must check for failure. This lets
older code, such as the standard library, keep working the way it
always has (by providing a fail value) or be used with this extension,
while allowing cleaner and more robust code to be written (by not
providing a fail value).
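One way to picture the fail-value rule in plain C is as two entry
points to the same function. This is only a sketch of the intended
semantics, not of how a compiler would implement it; the names
`inverse_checked` and `inverse_legacy` are made up for this
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* What a checking caller would see: status plus result. */
int inverse_checked(int x, float *y)
{
    assert(y != NULL);
    if (x == 0)
        return EDOM;
    *y = 1.0f / x;
    return 0;
}

/* What an unchecked caller would see with "!! 0" in the signature:
 * on failure the declared fail value (0) is returned instead. */
float inverse_legacy(int x)
{
    float y;
    if (inverse_checked(x, &y) != 0)
        return 0.0f;
    return y;
}
```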

Examples
========

Here are some examples.

Opening a file and reading a number (normal C):

    int n;
    FILE *fin = fopen("filename", "r");
    if (fin == NULL)
        goto exit_no_file;

    if (fscanf(fin, "%d", &n) != 1) {
        if (ferror(fin))
            goto exit_io_error;
        else
            { /* complain about format */ }
    }

    fclose(fin);
    return 0;

    exit_io_error:
    /* print error: I/O error */
   

Re: GNU C extension: Function Error vs. Success

2014-03-10 Thread Shahbaz Youssefi
Hi Julian,

Thanks for the feedback.

Regarding C++ exceptions: exceptions are not really nice. They can
just make your function return without you even knowing it (forgetting
a `try/catch` or not knowing it may be needed, which is C++'s fault
and probably could have been done better). Also, they require
complicated operations. You can read a small complaint about it here:
http://stackoverflow.com/a/1746368/912144 and I'm sure there are many
others on the internet.

Regarding OCaml option types: in fact I did think of this separation
after learning about monads (through Haskell though). However,
personally I don't see how monads could be introduced in C without
majorly affecting syntax and ABI.

Regarding the syntax: You are absolutely right. I don't claim that
this particular syntax is ideal. I'm sure the many minds in this
mailing list are able to find a more beautiful syntax, if they are
interested in the idea. Nevertheless, the following code:

x = foo() + bar();

doesn't do any error checking. I.e. it assumes `foo` and `bar` are
unfailable. If that is the case, there is no need for a `!! goto
fail_label` at all. I personally have never seen such an expression
followed by e.g.

if (x )
goto foo_or_bar_failed;

On the other hand, something like this is common:

while (func(...) == 0)

which, if turned to e.g.:

while (func(...) !! break)

or

fail_code=0;
while (func(...) !!= fail_code, fail_code == 0)

could seem awkward at best.
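For reference, this is roughly what such a loop looks like today when
success is reported separately from the value. It is a minimal sketch;
the `counter` type and its functions are hypothetical and exist only to
give `func(...)` a concrete shape.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical source of values: yields next, next+1, ..., limit-1. */
struct counter {
    int next;
    int limit;
};

/* Returns 0 and stores a value in *out, or -1 when exhausted. */
int counter_next(struct counter *c, int *out)
{
    assert(c != NULL && out != NULL);
    if (c->next >= c->limit)
        return -1;
    *out = c->next++;
    return 0;
}

/* The loop condition is exactly the "func(...) == 0" idiom. */
int sum_all(struct counter *c)
{
    int sum = 0;
    int value;

    while (counter_next(c, &value) == 0)
        sum += value;
    return sum;
}
```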

My hope is that through this discussion, we would be able to figure
out a way to separate success and failure of functions with minimal
change to the language. My syntax is based on having the return value
intact while returning the success-failure and error-code in registers
both for speed and compatibility and let the compiler generate the
repetitive/ugly error-checking code. Other than that, I personally
don't have any attachments to the particular way it's embedded in the
grammar of GNU C.

On Mon, Mar 10, 2014 at 3:50 PM, Julian Brown wrote:
> On Mon, 10 Mar 2014 15:27:06 +0100
> Shahbaz Youssefi wrote:
>
>> Feedback
>> ========
>>
>> Please let me know what you think. In particular, what would be the
>> limitations of such a syntax? Would you be interested in seeing this
>> extension to the GNU C language? What alternative symbols do you think
>> would better show the intention/simplify parsing/look more beautiful?
>
> I suggest you think about how this is better than C++ exceptions, and
> also consider alternatives like OCaml's option types that can be used
> to achieve similar ends.
>
> For your suggested syntax at function call sites, consider that
> functions can be called in more complicated ways than simply as "bar =
> foo();" statements, and the part following the "!!" in your examples
> appears to be a statement itself: in more complicated expressions, that
> interleaving of expressions and statements is going to get very ugly
> very quickly. E.g.:
>
> x = foo() + bar();
>
> would need to become something like:
>
> x = (foo() !! goto label1) + (bar () !! goto label2);
>
> And there are all sorts of issues with that.
>
> Anyway, I quite like the idea of rationalising error-code returns in C
> code, but I don't think this is the right way of going about it.
>
> HTH,
>
> Julian


Re: GNU C extension: Function Error vs. Success

2014-03-10 Thread Shahbaz Youssefi
Thanks for the hint. I would try to learn how to do that and
experiment on the idea if/when I get the time. I could imagine why the
community isn't interested in new syntax in general. Still, you may
never know if an idea would be attractive enough to generate some
attention! :)

On Mon, Mar 10, 2014 at 4:26 PM, Basile Starynkevitch wrote:
> On Mon, Mar 10, 2014 at 03:27:06PM +0100, Shahbaz Youssefi wrote:
>> Hi,
>>
>> First, let me say that I'm not subscribed to the mailing list, so
>> please CC myself when responding.
>>
>> This post is to discuss a possible extension to the GNU C language.
>> Note that this is still an idea and not refined.
> [...]
>>
>> The Extension (Basic)
>> =====================
>>
>> First, let's introduce a new syntax (note again, this is just an
>> example. I don't suggest these particular symbols):
>>
>>     float inverse(int x)
>>     {
>>         if (x == 0)
>>             fail;
>>         return 1.0f / x;
>>     }
>>
>>     ...
>>     y = inverse(x) !! goto exit_inverse_failed;
>>
>
>
> Syntax is not that important. To experiment with your idea,
> I would suggest using a mixture of pragmas and builtins;
> you could perhaps have a new builtin_shahbaz_fail() and a pragma
> #pragma SHAHBAZ and then your temporary syntax would be
>
>   float inverse(int x)
>   {
>   if (x == 0) builtin_shahbaz_fail();
>   return 1.0f / x;
>   }
>
>   #pragma SHAHBAZ on_error_goto(exit_inverse_failed)
>   { y = inverse(x); }
>
>
> Then, you don't need to dig into the GCC parser to add this builtin
> and pragma. You could add them with a GCC plugin (in C++) or using MELT
> http://gcc-melt.org/
>
> Once you added a GCC pass to support your builtin and pragma
> (which is difficult, and means understanding the details of internals of GCC)
> you could convince other people.
>
> Notice that the GCC community is not friendly these days to new
> syntactic constructs.
>
> BTW, once you have implemented a builtin and a pragma you could use
> preprocessor macros to make these look more like your syntax.
>
>
> I would believe that MELT is very well suited for such experiments.
>
> Regards.
>
> PS. Plugins cannot extend the C syntax (except thru attributes,
> builtins, pragmas).
>
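The remark about preprocessor macros can be illustrated without any
compiler support at all, if one accepts the status-returning form of
`inverse` shown earlier in the thread. The sketch below is an
assumption-laden stand-in: `OR_GOTO` and `half_inverse` are
hypothetical names, and no builtin or pragma is involved.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical macro: run a status-returning call and jump to
 * `label` if it reports failure (nonzero). */
#define OR_GOTO(call, label) \
    do {                     \
        if ((call) != 0)     \
            goto label;      \
    } while (0)

/* Status-returning form: 0 on success, EDOM on failure. */
int inverse(int x, float *y)
{
    assert(y != NULL);
    if (x == 0)
        return EDOM;
    *y = 1.0f / x;
    return 0;
}

/* Reads close to: y = inverse(x) !! goto exit_inverse_failed; */
int half_inverse(int x, float *out)
{
    float y;

    assert(out != NULL);
    OR_GOTO(inverse(x, &y), exit_inverse_failed);
    *out = y / 2.0f;
    return 0;

exit_inverse_failed:
    return -1;
}
```

This only hides the `if`; it does not give the ABI or optimization
properties discussed in the proposal.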


Re: GNU C extension: Function Error vs. Success

2014-03-10 Thread Shahbaz Youssefi
I'm mostly interested in C. Nevertheless, you can of course also do
the same in C:

    struct option_float
    {
        float value;
        int error_code;
        bool succeeded;
    };

    struct option_float inverse(int x) {
        if (x == 0)
            return (struct option_float){ .succeeded = false, .error_code = EDOM };
        return (struct option_float){ .value = 1.0f / x, .succeeded = true };
    }

You get the idea. The difference is that it's hard to optimize the
non-error execution path if the compiler is not aware of the
semantics. Also, with exceptions, this can happen:

    float inverse(int x)
    {
        if (x == 0)
            throw overflow;
        return 1.0f / x;
    }

    y = inverse(x);

Which means control is taken from the function calling inverse without
it explicitly allowing it, which is not in the spirit of C.
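Returning to the `option_float` struct, here is a self-contained sketch
of how a caller would have to consume it in C. The caller
`scaled_inverse` is hypothetical; the point is that the check on
`succeeded` is explicit at the call site, unlike the exception case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct option_float {
    float value;
    int error_code;
    bool succeeded;
};

struct option_float inverse(int x)
{
    if (x == 0)
        return (struct option_float){ .succeeded = false, .error_code = EDOM };
    return (struct option_float){ .value = 1.0f / x, .succeeded = true };
}

/* Hypothetical caller: returns 0 on success or the error code. */
int scaled_inverse(int x, float scale, float *out)
{
    struct option_float r;

    assert(out != NULL);
    r = inverse(x);
    if (!r.succeeded)
        return r.error_code;    /* failure is handled explicitly here */
    *out = scale * r.value;
    return 0;
}
```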

P.S. programming in a lot of languages is _mere syntax_ with respect
to some others. Still, some syntaxes are good and some not. If we can
improve GNU C's syntax to be shorter, but without loss of
expressiveness or clarity, then why not!

On Mon, Mar 10, 2014 at 6:18 PM, Andrew Haley wrote:
> On 03/10/2014 03:09 PM, Shahbaz Youssefi wrote:
>> Regarding C++ exceptions: exceptions are not really nice. They can
>> just make your function return without you even knowing it (forgetting
>> a `try/catch` or not knowing it may be needed, which is C++'s fault
>> and probably could have been done better). Also, they require
>> complicated operations. You can read a small complaint about it here:
>> http://stackoverflow.com/a/1746368/912144 and I'm sure there are many
>> others on the internet.
>
> A few quibbles here.
>
> Firstly, C++ exceptions do not require complicated operations: an
> implementation may well do complicated things, but that's not the
> same at all.  In GCC we use DWARF exception handling, which is
> designed to be near-zero-cost for exceptions that are not thrown,
> but is more expensive when they are.
>
> There is no inherent reason why
>
> float inverse(int x)
> {
>     if (x == 0)
>         fail;
>     return 1.0f / x;
> }
>
> y = inverse(x) !! goto exit_inverse_failed;
>
> should not generate the same code as
>
> float inverse(int x)
> {
>     if (x == 0)
>         throw overflow;
>     return 1.0f / x;
> }
>
> try {
>     y = inverse(x);
> } catch (IntegerOverflow e) {
>     goto exit_inverse_failed;
> }
>
> This assumes, of course, a knowledgeable optimizing compiler.
>
> Also, consider that C++ can already do almost what you want.
> Here we have a function that returns a float wrapped with a
> status:
>
> option<float> inverse(float x) {
>   if (x == 0)
>     return option<float>();  // No value...
>   return 1.0f / x;
> }
>
> float poo(float x) {
>   option<float> res = inverse(x);
>   if (res.none())
>     return 0;
>   return res;
> }
>
> GCC generates, quite nicely:
>
> poo(float):
> xorps   %xmm1, %xmm1
> ucomiss %xmm1, %xmm0
> jp  .L12
> jne .L12
> movaps  %xmm1, %xmm0
> ret
> .L12:
> movss   .LC1(%rip), %xmm1
> divss   %xmm0, %xmm1
> movaps  %xmm1, %xmm0
> ret
>
> The difference between
>
> y = inverse(x) !! goto exit_inverse_failed;
>
> and
>
>   option<float> y = inverse(x);
>   if (y.none())
>     goto exit_inverse_failed;
>
> is, I suggest to you, mere syntax.  The latter is more explicit.
>
> Andrew.
>


Re: GNU C extension: Function Error vs. Success

2014-03-11 Thread Shahbaz Youssefi
On Tue, Mar 11, 2014 at 1:26 PM, David Brown wrote:
> On 10/03/14 18:26, Shahbaz Youssefi wrote:
> You can tell the compiler about the likely paths:
>
> struct option_float inverse(int x) {
>     if (__builtin_expect(x != 0, 1)) {
>         return (struct option_float){ .value = 1.0f / x, .succeeded = true };
>     } else {
>         return (struct option_float){ .succeeded = false, .error_code = EDOM };
>     }
> }

True, but I was actually referring to the fact that, done like this,
you have to write the status to the stack, where the return value
resides, while with a built-in method you could get away with
returning it in a register. This is not just for performance, but also
to stay compatible with the previous ABI.

> I am not sure that it would be possible to get the sort of effect you
> are looking for without disrupting the syntax too much for a gcc extension.
>
> Speaking as an embedded developer who often wants to get the smallest
> and fastest code on small processors, it would be very nice to have
> the ability to return an extra flag along with the main return value of
> a function.  Typically that would be a flag to indicate success or
> failure, but it might have other purposes - and it could be the only
> return value of an otherwise void function.  Key to the implementation
> would be a calling convention to use a processor condition code flag
> here - that would let you generate optimal code for the "if (error)
> goto" part.

I too am an embedded developer (with some kernel module programming
too) and what you say is another reason why I'd personally like to see
this happen. Thanks for the feedback.