[RFC] timers, pointers to functions and type safety

2006-12-01 Thread Al Viro
There's a bunch of related issues, some kernel, some gcc,
thus the Cc from hell on that one.

First of all, in theory the timers in kernel are done that way:
* they have callback of type void (*)(unsigned long)
* they have data to be passed to it - of type unsigned long
* callback is called by the code that even in theory has no
chance whatsoever of inlining the call.
* one of the constraints on the targets we can port the kernel
on is that unsigned long must be uintptr_t.

The last one means that we can pass any pointers to these suckers; just
cast to unsigned long and cast back in the callback.

While that is safe (modulo the portability constraint that affects much
more code than just timers), it ends up very inconvenient and leads to
lousy type safety.

The thing is, absolute majority of callbacks really want a pointer to
some object.  There is a handful of cases where we really want a genuine
number - not a pointer cast to unsigned long, not an index in array, etc.
They certainly can be dealt with.  Nearly a thousand of other instances
definitely want pointers.

Another observation is that quite a few places are playing fast and
loose with function pointers.  Some are just too lazy and cast
void (*)(void) to void (*)(unsigned long).  These, IMO, should stop
wanking and grow an unused argument.  Not worth the ugliness...
However, there are other cases, including very interesting
timer->function = (void (*)(unsigned long))func;
timer->data = (unsigned long)p;
with func actually being void (void *) and p being void *.

Now, _that_ is definitely not a valid C.  Worse, it doesn't take much
to come up with realistic architecture that would have different calling
conventions for those.  Just assume that
* there are two groups of registers (A and D)
* base address for memory access must be in some A register 
* both A and D registers can be used for arithmetics
* ABI is such that functions with few arguments have them passed
via A and D registers, with pointers going via A and numbers via D.
Realistic enough?  I wouldn't be surprised if such beasts actually existed -
embedded processors influenced by m68k are not particulary rare and picking
such ABI would make sense for them.

Note that this kind of casts is not just in some obscure code; e.g.
rpc_init_task() does just that.


And that's where it gets interesting.  It would be very nice to get to
the following situation:
* callbacks are void (*)(void *)
* data is void *
* instances can take void * or pointer to object type
* a macro SETUP_TIMER(timer, func, data) sets callback and data
and checks if func(data) would be valid.

It would be remove a lot of cruft and definitely improve the type safety
of the entire thing.  It's not hard to do; all it takes [warning: non
portable C ahead] is
typeof(*data) *p = data;
timer->function = (void (*)(void *))func;
timer->data = (void *)p;
(void)(0 && (func(p),0));

Again, that's not a portable C, even leaving aside the use of typeof.
Casts between the incompatible function types are undefined behaviour;
rationale is that we might have different calling conventions for those.
However, here we are at much safer ground; calling conventions are not
likely to change if you replace a pointer to object with void *.  It's
still possible in theory, but let's face it, we would have far worse
problems if it ever came to porting to such targets.

Note that here we have absolutely no possibility of eventual call
ever being inlined, no matter what kind of analysis compiler might
be doing.  Call happens when kernel/timer.c gets to structure while
trawling the lists and it simply has no way to tell which callback
might be there (as the matter of fact, callback can and often does
come from a module).

IOW, "gcc would ICE if it ever inlined it" kind of arguments (i.e. what
had triggered gcc refusing to do direct call via such cast) doesn't apply
here.  Question to gcc folks: can we expect no problems with the approach
above, provided that calling conventions are the same for passing void *
and pointers to objects?  No versions (including current 4.3 snapshots)
create any problems here...

BTW, similar trick would be very useful for workqueues - there we already
have void * as argument, but extra type safety would be nice to have.


Now, there's another question: how do we get there?  Or, at least, from
current void (*)(unsigned long) to void (*)(void *)...

"A fscking huge patch flipping everything at once" is obviously not an
answer; too much PITA merging and impossible to review.  There is a tempting
possibility to do that gradually, with all intermediates still in working
state, provided that on all platforms currently supported by the kernel
unsigned long and void * are indeed passed the same way (again, only for
current kernel targets and existing gcc versions).  Namely, we could
  

Re: [RFC] timers, pointers to functions and type safety

2006-12-02 Thread Al Viro
On Sat, Dec 02, 2006 at 01:29:32AM -0500, Daniel Berlin wrote:
> >While that is safe (modulo the portability constraint that affects much
> >more code than just timers), it ends up very inconvenient and leads to
> >lousy type safety.
> 
> Understandable.  I assume you are trying to get more  type safety more
> for error checking than optimization, being that the kernel still
> defaults to -fno-strict-aliasing.

Indeed.

> >Again, that's not a portable C, even leaving aside the use of typeof.
> >Casts between the incompatible function types are undefined behaviour;
> >rationale is that we might have different calling conventions for those.
> >However, here we are at much safer ground; calling conventions are not
> >likely to change if you replace a pointer to object with void *.
^^
> Is this true of the ports you guys support  even if the object is a
> function pointer or a function?
> (Though the first case may be insane.  I can't think of a *good*
> reason you'd pass a pointer to a function pointer to a timer
> callback,).

Pointer to pointer to function should be OK - it's a pointer to normal
object type and I would be very surprised if it differed from void *,
modulo very creative reading of 6.3.2.3(1).  I certainly wouldn't expect
it to differ from e.g. pointer to pointer to struct, even on targets
far odder than anything we might hope to port to.

Pointer to function might be worse, but that's excluded by the above -
it's not a pointer to object type.

In any case, we don't do either anywhere and if the need ever arises
we can simply put that pointer into whatever structure our timer_list
is embedded into and pass the address of that structure (which is what
we do in a lot of cases anyway).

IOW, it's worth checking in SETUP_TIMER, but it's not going to be a
realistic PITA in any instance.

> >IOW, "gcc would ICE if it ever inlined it" kind of arguments (i.e. what
> >had triggered gcc refusing to do direct call via such cast) doesn't apply
> >here.  Question to gcc folks: can we expect no problems with the approach
> >above, provided that calling conventions are the same for passing void *
> >and pointers to objects?  No versions (including current 4.3 snapshots)
> >create any problems here...
> 
> Personally, as someone who works on the pointer and alias analysis,
> I'm much more worried about your casting of void (*)(void) to void
> (*)(unsigned long) breaking in the future than I am about the above.

That one will be gone; it's pure laziness and IMO ~10 instances mostly in
weird ancient cdrom drivers are not worth worrying about, especially since
getting rid of such crap is a matter of minutes.

> I can't really see the above code breaking any more than what you have
> now, and it should in fact, break a lot less.  However, the removal of
> the argument cast (IE void (*) (void) to void (*) unsigned long) maybe
> break eventually.  As we get into intermodule analysis and figure out
> the conservative set of called functions for a random function
> pointer,  the typical thing to do is to type filter the set so only
> those address taken functions with "compatible"  signatures (for some
> very loose definition of compatible) are said to be callable by a
> given function pointer.  I can't imagine the definition of compatible
> would include address taken functions with less arguments than the
> function pointer you are calling through.  In this specific case, as
> you point out, it's pretty unlikely we'll ever be able to come up with
> a set without something telling us.  But I figured i'd mention it in
> case this type of casting is prevalent elsewhere.

Umm...  That type of casting worries me too, and not for timer-related
reasons.  We do it a lot in one place: fs/bad_inode.c.  A bunch of methods
with different arguments,
static int return_EIO(void)
{
return -EIO;
}
and (void *)return_EIO used to initialize them.  There's a couple of
other places where minor turds like that are done, but that one is
the most massive.  That's bloody annoying when doing any global call
graph analysis (sparse-based in my case, but I wouldn't expect gcc
folks being any happier with it)...

And no, it certainly isn't good C for many reasons (any target where
callee removes arguments from stack before returning control to caller
and that one's toast, for starters).
 
> IOW, unless someone else can see a good reason, I see no reason for
> you guys not to switch.


Re: [RFC] timers, pointers to functions and type safety

2006-12-02 Thread Al Viro
On Sat, Dec 02, 2006 at 04:23:32AM -0500, Kyle Moffett wrote:
> On Dec 01, 2006, at 12:21:49, Al Viro wrote:
> >And that's where it gets interesting.  It would be very nice to get to
> >the following situation:
> > * callbacks are void (*)(void *)
> > * data is void *
> > * instances can take void * or pointer to object type
> > * a macro SETUP_TIMER(timer, func, data) sets callback and data  
> >and checks if func(data) would be valid.
> 
> This is where a very limited form of C++ templates would do nicely;  
> you could define something like the following:
> 
> template 
> static inline void setup_timer(struct timer_list *timer,
>   void (*function)(T *), T *data)
> {
>   timer->function = (void (*)(void *))function;
>   timer->data = (void *)data;
>   init_timer(timer);
> }
> 
> Any attempts to call it with mismatched "function" and "data"  
> arguments would display an "Unable to find matching template" error  
> from the compiler.
> 
> Unfortunately you can't get simple templated functions without  
> opening the whole barrel of monkeys involved with C++ support;  

Fortunately, you can get all checks done by gcc without involving C++ (or
related flamewars).  See original post for a way to do it in a macro
and for fsck sake, leave gcc@gcc.gnu.org out of it.

Folks, please trim the Cc.  My apologies for cross-posting in the first place,
should've double-posted instead and manually forwarded relevant followups...