[Python-Dev] Static checker for common Python programming errors

2014-11-17 Thread Stefan Bucur
I'm developing a Python static analysis tool that flags common programming
errors in Python programs. The tool is meant to complement other tools like
Pylint (which perform checks at lexical and syntactic level) by going
deeper with the code analysis and keeping track of the possible control
flow paths in the program (path-sensitive analysis).

For instance, a path-sensitive analysis detects that the following snippet
of code would raise an AttributeError exception:

if object is None: # If the True branch is taken, we know the object is None
  object.doSomething() # ... so this statement would always fail
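
As a rough illustration only (this is not the tool described in this message), the pattern above can be approximated with the standard `ast` module on a current Python 3. The function name `find_none_derefs` is hypothetical, and the sketch only handles the exact shape `if <name> is None:` with no real path tracking:

```python
import ast


def find_none_derefs(source):
    """Return (line, name) pairs where `name.attr` is used inside the
    body of `if name is None:` before `name` is rebound."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.If):
            continue
        test = node.test
        # Match only the exact shape `<name> is None`.
        if not (isinstance(test, ast.Compare)
                and isinstance(test.left, ast.Name)
                and len(test.ops) == 1
                and isinstance(test.ops[0], ast.Is)
                and isinstance(test.comparators[0], ast.Constant)
                and test.comparators[0].value is None):
            continue
        name = test.left.id
        for stmt in node.body:
            # Give up at the first statement that rebinds the name; a
            # path-sensitive checker would track this per path instead.
            if any(isinstance(n, ast.Name) and n.id == name
                   and isinstance(n.ctx, ast.Store)
                   for n in ast.walk(stmt)):
                break
            for n in ast.walk(stmt):
                if (isinstance(n, ast.Attribute)
                        and isinstance(n.value, ast.Name)
                        and n.value.id == name):
                    findings.append((n.lineno, name))
    return findings
```

Calling `find_none_derefs` on the two-line snippet above reports the attribute access on its second line.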

I'm writing first to the Python developers themselves to ask, in their
experience, what common pitfalls in the language & its standard library
such a static checker should look for. For instance, here [1] is a list of
static checks for the C++ language, as part of the Clang static analyzer
project.

My preliminary list of Python checks is quite rudimentary, but maybe could
serve as a discussion starter:

* Proper Unicode handling (for 2.x)
  - encode() is not called on str object
  - decode() is not called on unicode object
* Check for integer division by zero
* Check for None object dereferences

Thanks a lot,
Stefan Bucur

[1] http://clang-analyzer.llvm.org/available_checks.html
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Static checker for common Python programming errors

2014-11-17 Thread Mark Shannon

Hi,

I think this might be a bit off-topic for this mailing list;
code-qual...@python.org is the place for discussing static analysis tools.

Although if anyone does have any comments on any particular checks
they would like, I would be interested as well.

Cheers,
Mark.


On 17/11/14 14:49, Stefan Bucur wrote:

> I'm developing a Python static analysis tool that flags common
> programming errors in Python programs. The tool is meant to complement
> other tools like Pylint (which perform checks at lexical and syntactic
> level) by going deeper with the code analysis and keeping track of the
> possible control flow paths in the program (path-sensitive analysis).
>
> For instance, a path-sensitive analysis detects that the following
> snippet of code would raise an AttributeError exception:
>
> if object is None: # If the True branch is taken, we know the object is None
>    object.doSomething() # ... so this statement would always fail
>
> I'm writing first to the Python developers themselves to ask, in their
> experience, what common pitfalls in the language & its standard library
> such a static checker should look for. For instance, here [1] is a list
> of static checks for the C++ language, as part of the Clang static
> analyzer project.
>
> My preliminary list of Python checks is quite rudimentary, but maybe
> could serve as a discussion starter:
>
> * Proper Unicode handling (for 2.x)
>    - encode() is not called on str object
>    - decode() is not called on unicode object
> * Check for integer division by zero
> * Check for None object dereferences
>
> Thanks a lot,
> Stefan Bucur
>
> [1] http://clang-analyzer.llvm.org/available_checks.html





Re: [Python-Dev] Static checker for common Python programming errors

2014-11-17 Thread Stefan Bucur
Mark, thank you for the pointer! I will re-send my message there. Should I
include both mailing lists in a single thread if I end up receiving replies
from both?

Cheers,
Stefan

On Mon Nov 17 2014 at 4:04:45 PM Mark Shannon wrote:

> Hi,
>
> I think this might be a bit off-topic for this mailing list,
> code-qual...@python.org is the place for discussing static analysis tools.
>
> Although if anyone does have any comments on any particular checks
> they would like, I would be interested as well.
>
> Cheers,
> Mark.
>
>
> On 17/11/14 14:49, Stefan Bucur wrote:
> > I'm developing a Python static analysis tool that flags common
> > programming errors in Python programs. The tool is meant to complement
> > other tools like Pylint (which perform checks at lexical and syntactic
> > level) by going deeper with the code analysis and keeping track of the
> > possible control flow paths in the program (path-sensitive analysis).
> >
> > For instance, a path-sensitive analysis detects that the following
> > snippet of code would raise an AttributeError exception:
> >
> > if object is None: # If the True branch is taken, we know the object is None
> >    object.doSomething() # ... so this statement would always fail
> >
> > I'm writing first to the Python developers themselves to ask, in their
> > experience, what common pitfalls in the language & its standard library
> > such a static checker should look for. For instance, here [1] is a list
> > of static checks for the C++ language, as part of the Clang static
> > analyzer project.
> >
> > My preliminary list of Python checks is quite rudimentary, but maybe
> > could serve as a discussion starter:
> >
> > * Proper Unicode handling (for 2.x)
> >    - encode() is not called on str object
> >    - decode() is not called on unicode object
> > * Check for integer division by zero
> > * Check for None object dereferences
> >
> > Thanks a lot,
> > Stefan Bucur
> >
> > [1] http://clang-analyzer.llvm.org/available_checks.html
> >
> >
> >


Re: [Python-Dev] Static checker for common Python programming errors

2014-11-17 Thread Guido van Rossum
Also, I should mention mypy (mypy-lang.org), which is a much more ambitious
project that uses type annotations. I am trying to find time to work on a
PEP that standardizes type annotations to match mypy's syntax (with
probably some improvements and caveats). It's too early to post the PEP
draft but if you're designing a type checker or IDE that could use help
from type annotations, email me.

On Mon, Nov 17, 2014 at 6:49 AM, Stefan Bucur wrote:

> I'm developing a Python static analysis tool that flags common programming
> errors in Python programs. The tool is meant to complement other tools like
> Pylint (which perform checks at lexical and syntactic level) by going
> deeper with the code analysis and keeping track of the possible control
> flow paths in the program (path-sensitive analysis).
>
> For instance, a path-sensitive analysis detects that the following snippet
> of code would raise an AttributeError exception:
>
> if object is None: # If the True branch is taken, we know the object is None
>   object.doSomething() # ... so this statement would always fail
>
> I'm writing first to the Python developers themselves to ask, in their
> experience, what common pitfalls in the language & its standard library
> such a static checker should look for. For instance, here [1] is a list of
> static checks for the C++ language, as part of the Clang static analyzer
> project.
>
> My preliminary list of Python checks is quite rudimentary, but maybe could
> serve as a discussion starter:
>
> * Proper Unicode handling (for 2.x)
>   - encode() is not called on str object
>   - decode() is not called on unicode object
> * Check for integer division by zero
> * Check for None object dereferences
>
> Thanks a lot,
> Stefan Bucur
>
> [1] http://clang-analyzer.llvm.org/available_checks.html
>
>


-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Static checker for common Python programming errors

2014-11-17 Thread Brett Cannon
On Mon Nov 17 2014 at 12:06:15 PM Stefan Bucur wrote:

> Mark, thank you for the pointer! I will re-send my message there. Should I
> include both mailing lists in a single thread if I end up receiving replies
> from both?


No, as cross-posting becomes a nightmare of moderation when someone is
not on both lists; please post to only one mailing list.

-Brett


>
> Cheers,
> Stefan
>
>
> On Mon Nov 17 2014 at 4:04:45 PM Mark Shannon  wrote:
>
>> Hi,
>>
>> I think this might be a bit off-topic for this mailing list,
>> code-qual...@python.org is the place for discussing static analysis
>> tools.
>>
>> Although if anyone does have any comments on any particular checks
>> they would like, I would be interested as well.
>>
>> Cheers,
>> Mark.
>>
>>
>> On 17/11/14 14:49, Stefan Bucur wrote:
>> > I'm developing a Python static analysis tool that flags common
>> > programming errors in Python programs. The tool is meant to complement
>> > other tools like Pylint (which perform checks at lexical and syntactic
>> > level) by going deeper with the code analysis and keeping track of the
>> > possible control flow paths in the program (path-sensitive analysis).
>> >
>> > For instance, a path-sensitive analysis detects that the following
>> > snippet of code would raise an AttributeError exception:
>> >
>> > if object is None: # If the True branch is taken, we know the object is None
>> >    object.doSomething() # ... so this statement would always fail
>> >
>> > I'm writing first to the Python developers themselves to ask, in their
>> > experience, what common pitfalls in the language & its standard library
>> > such a static checker should look for. For instance, here [1] is a list
>> > of static checks for the C++ language, as part of the Clang static
>> > analyzer project.
>> >
>> > My preliminary list of Python checks is quite rudimentary, but maybe
>> > could serve as a discussion starter:
>> >
>> > * Proper Unicode handling (for 2.x)
>> >    - encode() is not called on str object
>> >    - decode() is not called on unicode object
>> > * Check for integer division by zero
>> > * Check for None object dereferences
>> >
>> > Thanks a lot,
>> > Stefan Bucur
>> >
>> > [1] http://clang-analyzer.llvm.org/available_checks.html
>> >
>> >
>> >


Re: [Python-Dev] OneGet provider for Python

2014-11-17 Thread Paul Moore
On 15 November 2014 15:40, Paul Moore wrote:
> On 15 November 2014 15:17, Benjamin Peterson wrote:
>> On Sat, Nov 15, 2014, at 05:54, Nathaniel Smith wrote:
>>> On 15 Nov 2014 10:10, "Paul Moore" wrote:
>>> >
>>> > > Incidentally, it would be really useful if python.org provided stable
>>> > > url's that always redirected to the latest .msi installers, for
>>> > > bootstrapping purposes. I'd prefer to not rely on chocolatey (or on
>>> > > scraping the web site) for this.
>>> >
>>> > https://www.python.org/ftp/python/$ver/python-$ver.msi
>>> > https://www.python.org/ftp/python/$ver/python-$ver.amd64.msi
>>>
>>> Right, but what's the URL for "the latest 2.7.x release" or "the latest
>>> 3.x.x release"?
>>
>> The website has an API you know.
>
> Um, no. Where can I find out about it?

I don't know if this got lost in the other messages in this thread,
but *is* there a stable URL for "the latest Python 3.4 MSI for Windows
amd64" (or similar)?

Paul


Re: [Python-Dev] OneGet provider for Python

2014-11-17 Thread Ned Deily
Paul Moore wrote:
> I don't know if this got lost in the other messages in this thread,
> but *is* there a stable URL for "the latest Python 3.4 MSI for Windows
> amd64" (or similar)?

AFAIK, no, there is no such stable URL that directly downloads the 
latest installer(s) for a platform; the closest is probably 
https://www.python.org/downloads/windows/ which would require scraping. 
I'm not sure we would want to encourage such a thing; we want 
downloaders to read the web page information for each release and make 
an informed choice.  And the number of installer variants may change 
from release to release for a platform, as was recently the case with 
the OS X installers.  For testing purposes, scraping the web pages or 
using the (undocumented, see the code base on github) website JSON API 
are probably the best options now.  You could open an issue on the 
website github issue tracker.

-- 
 Ned Deily,
 n...@acm.org



Re: [Python-Dev] Static checker for common Python programming errors

2014-11-17 Thread Francis Giraldeau
If I may, there is prior work on JavaScript that may be worth
investigating. Formal verification of dynamically typed software is a
challenging endeavour, but it is very valuable for avoiding errors at runtime,
providing the benefits of a strongly typed language without the rigidity.

http://cs.au.dk/~amoeller/papers/tajs/

Good luck!

Francis

2014-11-17 9:49 GMT-05:00 Stefan Bucur:

> I'm developing a Python static analysis tool that flags common programming
> errors in Python programs. The tool is meant to complement other tools like
> Pylint (which perform checks at lexical and syntactic level) by going
> deeper with the code analysis and keeping track of the possible control
> flow paths in the program (path-sensitive analysis).
>
> For instance, a path-sensitive analysis detects that the following snippet
> of code would raise an AttributeError exception:
>
> if object is None: # If the True branch is taken, we know the object is None
>   object.doSomething() # ... so this statement would always fail
>
> I'm writing first to the Python developers themselves to ask, in their
> experience, what common pitfalls in the language & its standard library
> such a static checker should look for. For instance, here [1] is a list of
> static checks for the C++ language, as part of the Clang static analyzer
> project.
>
> My preliminary list of Python checks is quite rudimentary, but maybe could
> serve as a discussion starter:
>
> * Proper Unicode handling (for 2.x)
>   - encode() is not called on str object
>   - decode() is not called on unicode object
> * Check for integer division by zero
> * Check for None object dereferences
>
> Thanks a lot,
> Stefan Bucur
>
> [1] http://clang-analyzer.llvm.org/available_checks.html
>
>


Re: [Python-Dev] OneGet provider for Python

2014-11-17 Thread Paul Moore
On 17 November 2014 19:23, Ned Deily wrote:
> Paul Moore wrote:
>> I don't know if this got lost in the other messages in this thread,
>> but *is* there a stable URL for "the latest Python 3.4 MSI for Windows
>> amd64" (or similar)?
>
> AFAIK, no, there is no such stable URL that directly downloads the
> latest installer(s) for a platform; the closest is probably
> https://www.python.org/downloads/windows/ which would require scraping.
> I'm not sure we would want to encourage such a thing;

I'm happy enough with just the direct links to the exact versions
(3.4.1 etc). I have to update my automatic build script whenever a new
minor version comes out, which is a bit of a pain but as you say,
having to deliberately decide to upgrade the version it installs is
not a bad thing.
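
For what it's worth, a pinned-version script along these lines could build the direct link from the URL pattern quoted earlier in this thread. This is only a sketch: the helper name `msi_url` is hypothetical, and it assumes the pattern holds for the release being requested.

```python
def msi_url(version, amd64=False):
    """Build the direct python.org MSI link for an exact version such as '3.4.1'."""
    suffix = ".amd64.msi" if amd64 else ".msi"
    return "https://www.python.org/ftp/python/{0}/python-{0}{1}".format(
        version, suffix)
```

The version string stays an explicit argument, so moving to a new release remains a deliberate edit.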

Paul


[Python-Dev] Support for Linux perf

2014-11-17 Thread Francis Giraldeau
Hi,

PEP 418 is about performance counters, but it does not mention
performance monitoring unit (PMU) counters, such as cache misses and
instruction counts.

The Linux perf tool aims at recording these samples at the system level. I
ran Linux perf on CPython for profiling. The resulting callstack is inside
libpython.so, mostly recursive calls to PyEval_EvalFrameEx(), because the
tool works at the ELF level. Here is an example with a dummy program
(linux-tools on Ubuntu 14.04):

$ perf record python crunch.py
$ perf report --stdio
# Overhead  Command  Shared Object  Symbol
# ........  .......  .............  ......................
#
    32.37%  python   python2.7      [.] PyEval_EvalFrameEx
    13.70%  python   libm-2.19.so   [.] __sin_avx
     5.25%  python   python2.7      [.] binary_op1.5010
     4.82%  python   python2.7      [.] PyObject_GetAttr

While this may be insightful for the interpreter developers, it is not so
for the average Python developer. The report should display Python code
instead. It seems obvious, yet I haven't found a feature for that.

When a performance counter reaches a given value, a sample is recorded. The
most basic sample records only a timestamp, the thread ID and the program
counter (%rip). In addition, all executable memory maps of libraries are
recorded. For the callstack, frame pointers are traversed, but most of the
time they are optimized away on x86, so perf falls back to unwinding, which
requires saving register values and a chunk of the stack. The memory space
of the process is reconstructed offline.

CPython seems to allocate code and frames on mmap() pages. If the data is
more than about 1k away from the top of the stack, it is not available
offline in the trace. We need some way to reconstitute this memory space of
the interpreter to resolve the symbols, probably by dumping the data to disk.

In Java, there is a small HotSpot agent that spits out the symbols of JIT
code:

https://github.com/jrudolph/perf-map-agent

The problem is that CPython does not JIT code, and executed code is the ELF
library itself. The executed frames are parameters of functions of the
interpreter. I don't think the same approach can be used (maybe this can be
applied to PyPy?).

I looked at how Python frames are handled in GDB
(file cpython/Tools/gdb/libpython.py). A python frame is detected in
Frame(gdbframe).is_evalframeex() by a C call to PyEval_EvalFrameEx().
However, the traceback accesses PyFrameObject on the heap (at least for
f->f_back = 0xa57460), which is possible in GDB when the program is paused
and the whole memory space is available, but is not recorded for offline
use in perf. Here is an example of callstack from GDB:

#0  PyEval_EvalFrameEx (f=Frame 0x77f1b060, for file crunch.py, line 7,
in bar (num=466829),
throwflag=0) at ../Python/ceval.c:1039
#1  0x00527877 in fast_function (func=,
pp_stack=0x7fffd280, n=1, na=1, nk=0) at ../Python/ceval.c:4106
#2  0x00527582 in call_function (pp_stack=0x7fffd280, oparg=1)
at ../Python/ceval.c:4041


We could add a kernel module that "knows" how to sample CPython, but that
would make Python structures a sort of ABI, and kernel devs won't
allow a Python interpreter in kernel mode ;-).

What we really want is f_code data and related objects:

(gdb) print (void *)(f->f_code)
$8 = (void *) 0x77e370f0
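
For reference, the same per-frame data can be read from inside a running interpreter. The sketch below is only an in-process illustration of which fields a Python-level callstack needs (f_code.co_filename, f_code.co_name, f_lineno, f_back); it is not something perf can do offline. The helper name `python_callstack` and the function `foo` are hypothetical, and `sys._getframe` is a CPython implementation detail:

```python
import sys


def python_callstack():
    """Return (filename, function, line) per Python frame, innermost first."""
    stack = []
    frame = sys._getframe(1)  # start at the caller, skipping this helper
    while frame is not None:
        code = frame.f_code
        stack.append((code.co_filename, code.co_name, frame.f_lineno))
        frame = frame.f_back  # same link GDB follows through f->f_back
    return stack


def bar(num):
    return python_callstack()


def foo(num):
    return bar(num)
```

Calling `foo(1)` returns a list whose first two entries name `bar` and then `foo`, which is the kind of output a Python-aware report would need to show.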

Maybe we could save these pages every time some code is loaded from the
interpreter? (the memory range is about 1.7MB, but )

Anyway, I think we must change CPython to support tools such as perf. Any
thoughts?

Cheers,

Francis


Re: [Python-Dev] Static checker for common Python programming errors

2014-11-17 Thread Terry Reedy

On 11/17/2014 9:49 AM, Stefan Bucur wrote:

> I'm developing a Python static analysis tool that flags common
> programming errors in Python programs. The tool is meant to complement
> other tools like Pylint (which perform checks at lexical and syntactic
> level) by going deeper with the code analysis and keeping track of the
> possible control flow paths in the program (path-sensitive analysis).
>
> For instance, a path-sensitive analysis detects that the following
> snippet of code would raise an AttributeError exception:
>
> if object is None: # If the True branch is taken, we know the object is None
>    object.doSomething() # ... so this statement would always fail
>
> I'm writing first to the Python developers themselves to ask, in their
> experience, what common pitfalls in the language & its standard library
> such a static checker should look for. For instance, here [1] is a list
> of static checks for the C++ language, as part of the Clang static
> analyzer project.


You could also a) ask on python-list (new thread), or b) scan Python
questions on StackOverflow.  Today's example: "Why does my function
return None?"  Because there is no return statement.  Perhaps current
checkers can note that, but what if some branches have a return
and others do not?  That is a likely bug.
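
A minimal sketch of that last check, assuming a current Python 3 and the standard `ast` module. The function names are hypothetical, and the fall-through test only understands `if`/`else`, `return` and `raise`, so loops, `try` and `with` blocks can cause false reports:

```python
import ast

_NESTED_SCOPES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda, ast.ClassDef)


def _always_exits(stmts):
    """True if every path through `stmts` ends in return or raise."""
    for stmt in stmts:
        if isinstance(stmt, (ast.Return, ast.Raise)):
            return True
        if (isinstance(stmt, ast.If) and stmt.orelse
                and _always_exits(stmt.body) and _always_exits(stmt.orelse)):
            return True
    return False


def _own_returns(func):
    """Yield the Return nodes of `func`, skipping nested definitions."""
    stack = list(func.body)
    while stack:
        node = stack.pop()
        if isinstance(node, _NESTED_SCOPES):
            continue
        if isinstance(node, ast.Return):
            yield node
        stack.extend(ast.iter_child_nodes(node))


def find_inconsistent_returns(source):
    """Names of functions that return a value on some paths but can also
    fall off the end (and so implicitly return None)."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        returns_value = any(r.value is not None for r in _own_returns(node))
        if returns_value and not _always_exits(node.body):
            findings.append(node.name)
    return findings
```

A function with no return statement at all is deliberately not reported here, since that is the simpler case current checkers may already cover.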


--
Terry Jan Reedy



Re: [Python-Dev] Static checker for common Python programming errors

2014-11-17 Thread MRAB

On 2014-11-18 01:21, Terry Reedy wrote:

> On 11/17/2014 9:49 AM, Stefan Bucur wrote:
>
>> I'm developing a Python static analysis tool that flags common
>> programming errors in Python programs. The tool is meant to complement
>> other tools like Pylint (which perform checks at lexical and syntactic
>> level) by going deeper with the code analysis and keeping track of the
>> possible control flow paths in the program (path-sensitive analysis).
>>
>> For instance, a path-sensitive analysis detects that the following
>> snippet of code would raise an AttributeError exception:
>>
>> if object is None: # If the True branch is taken, we know the object is None
>>    object.doSomething() # ... so this statement would always fail
>>
>> I'm writing first to the Python developers themselves to ask, in their
>> experience, what common pitfalls in the language & its standard library
>> such a static checker should look for. For instance, here [1] is a list
>> of static checks for the C++ language, as part of the Clang static
>> analyzer project.


> You could also a) ask on python-list (new thread), or b) scan Python
> questions on StackOverflow.  Today's example: "Why does my function
> return None?"  Because there is no return statement.  Perhaps current
> checkers can note that, but what if some branches have a return
> and others do not?  That is a likely bug.


Mutable default parameters come up occasionally.
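
A rough sketch of such a check, assuming a current Python 3 and the standard `ast` module. The function name `find_mutable_defaults` is hypothetical, and only literal lists, dicts, sets and comprehensions are recognised (a call such as `list()` is not):

```python
import ast

_MUTABLE_NODES = (ast.List, ast.Dict, ast.Set,
                  ast.ListComp, ast.DictComp, ast.SetComp)


def find_mutable_defaults(source):
    """Return (function name, line) for each mutable literal used as a default."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda)):
            continue
        # kw_defaults holds None for keyword-only arguments without a default.
        defaults = node.args.defaults + [
            d for d in node.args.kw_defaults if d is not None]
        for default in defaults:
            if isinstance(default, _MUTABLE_NODES):
                name = getattr(node, "name", "<lambda>")
                findings.append((name, default.lineno))
    return findings
```

For example, `def f(a, b=[])` is reported, while `def h(a=(), b=0)` is not.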