gcc project

2006-03-27 Thread Nic Volanschi
Hi,

During the last two years, I developed a gcc pass called tree-check
that performs user-defined checks on the tree (GIMPLE) form of a
function.

The checks are specified using a new option --tree-check, and are
powerful enough to express user-defined memory leaks, null pointer
dereferences, unreleased locks, etc., but also more basic checks such
as using an unsafe construct.

The tree-check pass prints a warning for each example found in the
code, but compiles the program as usual.

I think that incorporating this option in a future gcc version would
improve gcc by shortening the debug cycle, since it helps avoid introducing
bugs that can be easily specified and automatically detected.

Modifications to the source of gcc are quite small: 250 lines
modified and 1000 lines added, in two new modules. Is this considered
a major change, or a minor one?

Also, to which branch should I port these modifications (4.2, 4.1, ...)
to submit a patch? I originally developed the whole stuff on a
tree-ssa branch dating January 2004, and then ported it to
4.0.1. Porting this to any recent branch should be easy, but which one
to choose?

Regards,
Nic.




Re: gcc project

2006-03-29 Thread Nic Volanschi
On Tue, 2006-03-28 at 17:23, Diego Novillo wrote:
> Oh, excellent.  Coincidentally, we have been thinking about developing
> some kind of plugin/extension framework to allow these classes of
> analyses.  One of the goals is to provide an extensibility mechanism
> that will not require rebuilding GCC nor adding code that may not be
> often used.  Some of these checks are only interesting in very specific
> cases, so they may not be something that we want to add to the compiler
> itself.
> 
> So, the idea is to have a generic .so plugin mechanism with a relatively
> stable API that lets you hook into the compilation pipeline and do your
> analysis/transformation.  This would have the double advantage of
> allowing people to write ad-hoc passes and/or analyses that may not be
> suitable for the core compiler.  It would also allow users to extend GCC
> without having to get into the source code itself (assuming we get the
> abstractions and exports right, of course).

I see. The plugin approach you're sketching is very general: anyone
could write virtually any check with it, provided that they learn
(1) the AST structure and (2) the API to manipulate it.

I took a quite different approach, much more lightweight, which lets
really anyone write a restricted class of checks without knowing the
AST or any API.

Nevertheless, this light approach could be combined with the API-based
approach, by complementing the (declarative) code patterns with
(executable) predicates using the API, and loaded as dynamic libraries. 

Thus, the two approaches can complement each other, leaving
inexperienced users to write simple checks, and advanced users to write
more complex checks.

The more general idea is that compiling and (simple) checking can be
fused together with a lot of advantages. I also wrote some papers on the
subject (that I would have loved to submit to the gcc summit but I
didn't learn soon enough about it, that's too bad :(( ).

> I'd be very interested in taking a look at what you've done.  Perhaps
> the best approach for you now is to get this code into a branch.  We
> already are in a "no new features" stage for 4.2.
> 

Ok, I'll take a few days to install subversion, figure out how to use it,
port my stuff to the current 4.2 mainline, etc, and I'll get back. Note
that I'm doing everything in my spare time, so it's not that predictable
:°).

Regards,
Nic.




REQUEST: SEND FORM FOR PAST AND FUTURE CHANGES

2006-04-04 Thread Nic Volanschi
Hi,

I've sent this request for assignment last week to [EMAIL PROTECTED]
Will I receive (only) a snail mail answer? Did I submit the right version
of the form?

Thanks for any help,
Nic.

--
REQUEST: SEND FORM FOR PAST AND FUTURE CHANGES

[What is the name of the program or package you're contributing to?]
gcc

[Did you copy any files or text written by someone else in these
changes?
Even if that material is free software, we need to know about it.]
No.

[Do you have an employer who might have a basis to claim to own
your changes?  Do you attend a school which might make such a claim?]
No.

[For the copyright registration, what country are you a citizen of?]
Romania.

[What year were you born?]
1968.

[Please write your email address here.]
[EMAIL PROTECTED]

[Please write your postal address here.]
4 rue Villebois Mareuil
78110 Le Vesinet
FRANCE


[Which files have you changed so far, and which new files have you
written so far?]
1. Files changed:
common.opt
diagnostic.h
dlcheck.c
flags.h
Makefile.in
opts.c
pretty-print.c
pretty-print.h
timevar.def
toplev.c
tree.c
tree.h
tree-optimize.c
tree-pass.h
tree-pretty-print.c

2. New files:
tree-pattern.h (133 lines)
tree-check.c (409 lines)
tree-match.c (478 lines)





preview of the tree-check pass (Re: gcc project)

2006-04-04 Thread Nic Volanschi

OK, I have put a preview of the tree-check pass (performing lightweight
user-defined checks) on:
http://mygcc.free.fr. 
Any comments are welcome.

Nic.

On Tue, 2006-03-28 at 17:23, Diego Novillo wrote:
> On 03/27/06 16:35, Nic Volanschi wrote:
> 
> > The checks are specified using a new option --tree-check, and are
> > powerful enough to express user-defined memory leaks, null pointer
> > dereferences, unreleased locks, etc., but also more basic checks such
> > as using an unsafe construct.
[...]
> I'd be very interested in taking a look at what you've done.  Perhaps
> the best approach for you now is to get this code into a branch.  We
> already are in a "no new features" stage for 4.2.
> 



Re: preview of the tree-check pass (Re: gcc project)

2006-04-06 Thread Nic Volanschi
On Wed, 2006-04-05 at 09:12, Zack Weinberg wrote:
> It's an interesting system.  I wonder if it's powerful enough to express
> the rather complicated constraints on objects of type va_list.  Warnings
> for violations of those constraints would be valuable - there are common
> portability errors that could be caught - but it's never been important
> enough to write a custom check pass for it.  If your system can handle
> it, we could consider (assuming it's included) providing the check file
> that defines them with GCC as a worked example, maybe even including it
> in -Wall (which would ensure that your pass got exercised and therefore
> didn't break through disuse).  
> 
> I describe the constraints in
> http://gcc.gnu.org/ml/gcc-patches/2002-06/msg01293.html
> and can explain further if they don't make sense.  (I don't swear that
> I have them perfectly accurate, but I'm pretty sure.)

Very interesting problem! The short answer is: 
1. you can solve 1/3 of your problem with the current tree-check pass
2. by hacking the pass a bit, we could easily solve a second third
3. the third third :°) is a bit trickier: you cannot express it
in the framework without extending it more seriously.

Here are the details.
Looking at your specification of the va_list check, it is formulated as
two automata: one for the caller function, and one for the called
("mangling") function. There are three error cases (corresponding to the
3 thirds above):
1. (in the caller:) exiting the function after a va_start() without a
va_end().
This can be expressed directly today as:

  from "va_start(%X,%_)"
  to "return" or "return(%_)"
  avoid "va_end(%X)"

Actually, because the check is performed on the preprocessed form, the
names of the calls have to be a bit different:

  from "__builtin_va_start(%X,%_,%_)"
  to "return" or "return(%_)"
  avoid "__builtin_va_end(%X)"


2. (in the mangler:) calling va_start() or va_end() on a va_list
parameter.
In principle, this can be expressed simply as:

  from "va_list %X;"
  to "return" or "return(%_)"

The problem is that currently the pass scans only statements in the CFG,
not declarations, so the "from" expression will never match. Moreover,
the declaration of %X should be a formal parameter, not a local
variable.
The first problem could be handled by simply scanning declarations as an
entry block for the CFG. 
The second problem could be handled by complementing the "from" pattern
with a call to the gcc API (as Diego intended to define one), e.g.:

  from "va_list %X;" | PARM_DECL_P(X)
  to "return" or "return(%_)"


3. (in the caller:) exiting the function after a va_start() then a call
to the mangler without a va_end().
This one involves more than a from/to/avoid; it is of the form
from/then/to/avoid. In other words, the corresponding automaton has more
than three states, which is the limit of my current framework. The
reason why I chose this limitation in the first place is to ensure that
checking is linear in time and space. I'm not sure this should be
reconsidered.
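
To make the from/to/avoid idea concrete, here is a minimal Python sketch
of such a check on a toy control-flow graph. It is not the tree-check
implementation: the function name has_violation, the dictionary-based
CFG, and the string predicates are all assumptions made for
illustration, and pattern variables such as %X are not modelled. Each
node is queued at most once, which is what keeps this kind of check
linear in the size of the CFG.

```python
from collections import deque


def has_violation(cfg, stmts, is_from, is_to, is_avoid):
    """Return True if some path goes from a `from` statement to a `to`
    statement without passing through an `avoid` statement.

    cfg   : dict mapping node id -> list of successor node ids (hypothetical layout)
    stmts : dict mapping node id -> statement text
    """
    seen = set()
    queue = deque()

    # Seed the search with the successors of every node matching `from`.
    for node, stmt in stmts.items():
        if is_from(stmt):
            for succ in cfg.get(node, ()):
                if succ not in seen:
                    seen.add(succ)
                    queue.append(succ)

    # Plain breadth-first search; every node is queued at most once.
    while queue:
        node = queue.popleft()
        stmt = stmts[node]
        if is_avoid(stmt):
            continue  # this path is safe, do not explore past it
        if is_to(stmt):
            return True  # reached `to` without meeting `avoid`
        for succ in cfg.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return False


# Illustrative predicates standing in for the quoted patterns.
def is_start(stmt):
    return stmt.startswith("va_start(")


def is_end(stmt):
    return stmt.startswith("va_end(")


def is_return(stmt):
    return stmt.startswith("return")


# Toy CFG: 0 -> 1 -> {2, 3} -> 4
CFG = {0: [1], 1: [2, 3], 2: [4], 3: [4], 4: []}

# Only one branch calls va_end(): the other one leaks.
LEAKY = {0: "va_start(ap, fmt)", 1: "if (n > 0)",
         2: "va_end(ap)", 3: "log()", 4: "return"}

# Both branches call va_end().
FIXED = dict(LEAKY)
FIXED[3] = "va_end(ap)"

print(has_violation(CFG, LEAKY, is_start, is_return, is_end))
print(has_violation(CFG, FIXED, is_start, is_return, is_end))
```

With the LEAKY statements, the path through node 3 reaches the return
without a va_end(), so a warning would be issued; in the FIXED variant
both branches pass through va_end() and nothing is reported.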

Cheers,
Nic.



Re: preview of the tree-check pass (Re: gcc project)

2006-04-07 Thread Nic Volanschi
On Fri, 2006-04-07 at 01:43, Joe Buck wrote:
> On Thu, Apr 06, 2006 at 11:58:20PM +0200, Nic Volanschi wrote:
> > 3. (in the caller:) exiting the function after a va_start() then a call
> > to the mangler without a va_end().
> > This one involves more than a from/to/avoid; it is of the form
> > from/then/to/avoid. In other words, the corresponding automaton has more
> > than three states, which is the limit of my current framework. The
> > reason why I chose this limitation in the first place is to ensure that
> > checking is linear in time and space. I'm not sure this should be
> > re-considered.
> 
> The limitation is, I think, too strict.  Other path-based tools (for
> example, Coverity's) handle cases like this, and limit the combinatorial
> explosion by cutting off after exploring a fixed number of paths.

Yes! Coverity is the natural alternative (except it's not free, hum :(
). But more importantly, Coverity is (as far as I know, I didn't buy
one) a checker, not a checking compiler. Historically, it started as an
extended version of gcc, but it never tried to *remain* a compiler.
That's essential! I mean, I do not propose to transform gcc into a
standalone checking tool, but rather to add user-defined checks
*alongside* normal compiling.

This is, first of all, a severe constraint: it requires the checks
to take a reasonable time compared to compilation proper (read as:
less, or even much less, time than compiling).

In turn, keeping checks efficient opens up a whole new perspective:
continuous checking. No compiler I know of is able to do this. What's
the advantage? Well, if you check your code only from time to time, you
may re-introduce bugs, or let them survive from one release to another.
To prove this point, take a look at the Linux experiment (on
mygcc.free.fr). This is the story of an old Linux (v2.4.1) that was
tested precisely with Coverity's ancestor, called MC (or xgcc, before).
Anyways, this (impressive) tool found more than 500 bugs in the OS, all
reported to the Linux developers community. Have they all disappeared? I
had the curiosity to apply mygcc to check this in a recent kernel
(2.6.13), and I found 4% surviving bugs, three years later, rephrased in
slightly different contexts. OK, 4% is not much, but if you can avoid
them, why not? And I also found 3 new bugs!

BTW, among the 12 classes of bugs found by the MC tool, 11 can be
expressed partially or completely in the much more limited setting of
mygcc (= using automata of at most 3 states). That's encouraging, isn't
it?

Well, I'm not saying that we should not extend the framework because of
this experiment. We certainly should, at least for builtin checks. (gcc
passes are not all linear, are they?) However, when putting power in
users' hands, one should be very careful about two aspects:
- integrity: user checks should not hang the compiler
- performance: checks should be reasonably fast
If not, users' perception of gcc's quality may be negatively impacted,
which is quite important, I think. The direct analogy is user-space
hooks in an OS: they should not hang the whole system.

But this can be discussed further, and I encourage everyone interested
to express their feelings :°).

Nic.




Re: MyGCC and whole program static analysis

2006-09-03 Thread Nic Volanschi
On Wednesday 30 August 2006 18:52, Basile STARYNKEVITCH wrote:
> On Wed, Aug 30, 2006 at 06:36:19PM +0200, basile wrote:
> > Maybe some of you are aware of MyGCC http://mygcc.free.fr/ which
> > seems to be an extended GCC to add some kind of static analysis.
> >
> > I'm quite surprised that the mygcc page gives x86/linux binaries, but
> > no source tarball of their compiler (this seems to me against the
> > spirit of the GPL licence, but I am not a lawyer).
>
> My public apologies to MyGCC. There is a patch on
> http://gcc.gnu.org/ml/gcc-patches/2006-05/msg01437.html but sadly the
> http://mygcc.free.fr does not provide any link to it.
>
> It is sad to have to google to find their patch, it would be simpler
> if they linked it (or even gave full source tarball).

Basile, 

I apologize for the inconvenience. I agree with everybody that the mygcc
sources, although publicly available, were not as easy to get as they
should have been. It's just that in my view, the site was only
distributing a pre-compiled snapshot of something going on in a (public)
gcc development branch.

So, I fixed all that today: 
- there is a link on the site to the proposed gcc patch
- the full source archive is also available in the download page.

Thank you for your interest, and for your announcement of the very
exciting, newly started GGCC project.

Nic.


Re: [gnu.org #283065] Confirming the mailing of your assignment

2006-04-08 Thread Nic Volanschi via RT
Hi Jonas,

Thanks for this confirmation. To speed up my processing of these
papers, would it be possible to get an electronic version of them by
e-mail already? Even a generic version would be ok. This way, I can sign
and return the paper copy to you as soon as I get it.

Cheers,
Nic.

On Thu, 2006-04-06 at 23:08, [EMAIL PROTECTED] via RT wrote:
> Hello Nic, 
> 
> Thank you for contributing to GNU software. We'll send you the appropriate 
> papers through the post. Please sign and return the original in the envelope 
> provided. Once the FSF has signed it, we will send you a digital copy in pdf 
> format for your records.
> 
> If your employment status changes, please remember to notify us.  A change in 
> employers, or a change in your employment status, can affect your continuing 
> assignment.
> 
> Thank you for your contribution!
> 
> All the best,
> Jonas Jacobson
> Assignment Administrator
> Free Software Foundation
> 51 Franklin Street, Fifth Floor
> Boston, MA 02110
> Phone +1-617-542-5942 
> Fax +1-617-542-2652
> 
> 
>