I think it is a good idea improve it by the GSoC 2014 Edward! This can speed development and put in conform static analyzer with NIST nomenclature.
2014-03-11 8:47 GMT-03:00 Edoardo P. <[email protected]>: > Hi Ted, > > Thank you for your answer. > > If I got it right, the parameter variables of a certain function, > cannot be constrained until inside the TU (I hope we can manage to > analyze together two or more TU) there's a call to such function > somewhere else and the arguments are constrained. > If we get a function call and we don't have the declaration of such > function or it's inside another TU, we can't tell anything regarding > the parameters or the global variables (is it possible for a function > declared in another TU access to a static variable from another TU?), > which might change, but for the stl functions, we know how they work, > hence we can choose between: > 1)"inject" inside the TU their "generic" declarations (I wonder which > functions are OS-dependent...) and make it parse, > 2)hardcode their behavior inside clang ... > > Obviously, we should add something to the model. > > Anyways, since we got in the Juliet Test Suite argument, is it > possible to open a new bug report with a status of the static analyzer > with it? It would keep track and could be improved by the GSoC 2014 > student(s) who wants to work on it. > Also, I suggest to include to the todo things for the GSoC the idea of > conforming the new Unique Defect ID to the CWE ID. This would be > helpful when you want to test clang analyzer with the Static Analysis > Tool Exposition (SATE): http://samate.nist.gov/SATE.html > > > 2014-03-10 2:32 GMT+01:00 Ted Kremenek <[email protected]>: > > Hi Edoardo, > > > > What you are observing is something that is potentially easy to "fix", > but > > essentially requires additional domain-specific modeling in the static > > analyzer. > > > > To understand what is going on, consider the following example: > > > > void foo(int x) { > > 1 / x; > > } > > > > In this example we have a potential division-by-zero. If 'foo' is > analyzed > > in isolation, we don't know anything about the value of 'x'. We can do > one > > of two things: > > > > (1) Warn that 'x' could potentially be zero. > > > > (2) Do not emit a warning, but register there was a division and that 'x' > > cannot be zero after the division, and keep analyzing. > > > > In both cases the value of 'x' is unconstrained, so we can look at #1 > and #2 > > as two potential default behaviors. In the analyzer we do #2. > Essentially, > > although the value of 'x' is unconstrained, we don't warn here. The > > rationale is that the analyzer would emit a huge number of false > positives > > for integers that are possibly perfectly safe. We take the same tactic > for > > pointers, who all could hypothetically could be null. > > > > Instead, we take the strategy in the analyzer to flag bugs like null > > pointers and division-by-zero if we have evidence belief there may be a > > problem. For example, if a pointer is checked against null, we have > > evidence to believe that the pointer could be null. Similarly, there > may be > > some APIs that we know could return a null pointer, and thus we can > possibly > > track that information and see if a pointer coming from one of these > sources > > is dereferenced. > > > > The same tactic can be taken with integers and divide-by-zero. Let's > look > > at one of your examples: > > > > void f_fscanf() > > { > > int data; > > fscanf(stdin, "%d", &data); > > /* POTENTIAL FLAW: Possibly divide by zero */ > > printIntLine(100 / data); > > } > > > > After the call to 'fscanf' the analyzer tracks that 'data' binds to some > > symbolic integer value. That value is considered to be underconstrained, > > just like in my example above. In this example however, we could track > some > > additional information that it came from an unsafe API like fscanf. This > > information is something *additional* that we would need to model, as we > > probably would not want to do this for any arbitrary API call. Modeling > > this might not be too hard. For example, the symbolic integer value does > > have some information associated with it to tell us where it originated; > we > > could possibly use this to consult an API list to determine if the > integer > > value should be treated as unsafe. We'd then modify the divide-by-zero > > logic in the analyzer to emit a warning if an under constrained integer > came > > from one of these sources. We'd probably could conjure up something > fairly > > ad hoc at first for a few specific cases, and then generalize it. A > while > > ago we experimented with generalized taint analysis in the analyzer, > which > > is something that would also possibly be useful here. > > > > Does that answer your question? Essentially we need a little additional > > modeling in the analyzer for these APIs. > > > > Ted > > > > > > On Mar 9, 2014, at 9:17 AM, Edoardo P. <[email protected]> wrote: > > > > Ping? > > > > > > 2014-02-20 22:25 GMT+01:00 Edoardo P. <[email protected]>: > >> > >> I forgot to add more info regarding the not detected int divisions: > >> > >> the only correctly warned parts are for very simple cases, like this: > >> > >> int x = 0; ... int y = 100 / x; > >> > >> and the modulo variant. The other ones, which are not detected, assign x > >> with: > >> > >> - atoi (used when converting a string returned by socket functions or > >> fgets) > >> - fscanf > >> - rand > >> > >> And all the variants of such functions. The examples for fscanf and rand > >> are simple: > >> > >> fscanf variant: > >> > >> void f_fscanf() > >> { > >> int data; > >> fscanf(stdin, "%d", &data); > >> /* POTENTIAL FLAW: Possibly divide by zero */ > >> printIntLine(100 / data); > >> } > >> > >> rand variant: > >> > >> void f_rand() > >> { > >> int data; > >> data = rand(); > >> /* POTENTIAL FLAW: Possibly divide by zero */ > >> printIntLine(100 / data); > >> } > >> > >> in the test suite, rand is replaced with a macro, RAND_32(), defined as: > >> > >> from testcasesupport/std_testcase.h: > >> /* rand only returns 15 bits, so we xor 3 calls together to get the full > >> result (13 bits overflow, but that is okay) */ > >> #define RAND32() ((rand()<<30) ^ (rand()<<15) ^ rand()) > >> > >> (I'd truncate the first rand() to the first 2 bits before shifting it to > >> 30th bit position, just to avoid overflow ...) > >> > >> The 'itoa' case is more complex. Imho, the analyzer, before reporting > the > >> warning, should check that there can be a possibility that the string > >> argument will be converted to 0. I hope it's possible with the current > >> analyzer architecture... > >> > >> Cheers, > >> Edward-san > >> > >> 2014-02-20 15:26 GMT+01:00 Edoardo P. <[email protected]>: > >> > >>> Hi Lucas: > >>> > >>>> > >>>> I am doing today a script to run all test case in Juliet with Clang > and > >>>> generate a report (CSV file), when a finished this i will send you the > >>>> results. > >>> > >>> > >>> Please do. > >>> > >>>> > >>>> But a run manually and clang can find only 54 weaknesses in a > >>>> total of 1476 files (into testcases/CWE369_Divide_by_zero, including > s01 > >>>> and s02 directories). > >>>> > >>> > >>> I can confirm that. In IRC I discussed with some llvm devs about the > need > >>> to report the float division by zero (Standard C/C++ says it's > undefined > >>> behavior, but Annex F implicitly allows it, thanks to IEEE 754). I > have an > >>> idea: add a new static analyzer option which allows or disallows > optional > >>> features, including the optional Annexes. That would help a lot imho. > >>> > >>> Cheers, > >>> Edward-san > >>> > >>>> > >>>> On Tue, 2014-02-18 at 16:19 +0100, Edoardo P. wrote: > >>>> > Hi, Lucas, Jordan: > >>>> > > >>>> > > >>>> > About the division by zero checking, I run this: > >>>> > > >>>> > > >>>> > scan-build --use-analyzer /usr/bin/clang -o buildres/ clang -c -I > >>>> > testcasesupport -DOMITGOOD > >>>> > > >>>> > > testcases/CWE369_Divide_by_Zero/s02/CWE369_Divide_by_Zero__int_zero_divide_01.c > >>>> > -o /dev/null > >>>> > > >>>> > > >>>> > and I get the warnng: > >>>> > > >>>> > > >>>> > > testcases/CWE369_Divide_by_Zero/s02/CWE369_Divide_by_Zero__int_zero_divide_01.c:30:22: > >>>> > warning: Division by zero > >>>> > printIntLine(100 / data); > >>>> > ~~~~^~~~~~ > >>>> > > >>>> > > >>>> > So, Lucas, which file was failing for you? > >>>> > > >>>> > > >>>> > Regarding the experience, here it is what I gathered till now: > >>>> > > >>>> > I created a very huge file_list.txt, containing the source files to > >>>> > compile (I used 'find -name *.c*' in the juliet directory), then > >>>> > filtered away the 'main\.c', 'main_linux\.c' and 'testcasesupport' > >>>> > files (grep -v), which have nothing to check, then I sorted the list > >>>> > by CWE number (I had to do manual sorting because I couldn't manage > to > >>>> > sort, for example, CWE15_* and CWE114_* correctly). > >>>> > > >>>> > Since I can't check for win32-only tests (I'm using linux), I > filtered > >>>> > them via 'grep -v w32' and 'grep -v wchar_t' (some tests require a > >>>> > 'fopen'-like function with wchar_t string, which seems to be > exclusive > >>>> > to win32). > >>>> > > >>>> > > >>>> > Regarding the per-translation-unit analysis, some files are, indeed, > >>>> > separated sources for a program, so I didn't hesitate to filter them > >>>> > with these patterns, according to the manual: "[abcdeBG]\.c" and > >>>> > "good1" (last was associated with a 'bad' file, which was already > >>>> > filtered). > >>>> > > >>>> > > >>>> > With this file_lists.txt, I run the static analyzer only for the > false > >>>> > positives, with this command: > >>>> > > >>>> > < file_list.txt xargs -n 1 scan-build --use-analyzer /usr/bin/clang > >>>> > -disable-checker deadcode.DeadStores -o buildres/ clang -c -I > >>>> > testcasesupport -DOMITBAD -o /dev/null > /dev/null 2> warns.txt > >>>> > > >>>> > > >>>> > so, it checks all the files in the file list, saves the results in > >>>> > buildres and reports the warnings in warns.txt file, ignoring the > >>>> > DeadStores warns because they're reported a lot often. > >>>> > > >>>> > > >>>> > Well, there are tons of false positives, caused by the flow variants > >>>> > which involve global and static variables, shadow variables usage, > >>>> > etc. > >>>> > > >>>> > > >>>> > To the devs, I'd like to know which CWE are you interested, from the > >>>> > list I attached on that email: > >>>> > > http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-February/035113.html . > >>>> > > >>>> > > >>>> > About the results, if I have more time, I'll post some of them. > >>>> > > >>>> > > >>>> > > >>>> > 2014-02-18 2:05 GMT+01:00 Lucas Kanashiro > >>>> > <[email protected]>: > >>>> > Thanks Jordan! > >>>> > > >>>> > Could you leave me updated on the matter? I am so interested > >>>> > in this, > >>>> > and if it is necessary and possible i want to help to solve > >>>> > the > >>>> > potential issue. > >>>> > > >>>> > Edward, can you tell us your experience with Clang and > Juliet > >>>> > Test > >>>> > Suite? > >>>> > > >>>> > > >>>> > On Mon, 2014-02-17 at 09:43 -0800, Jordan Rose wrote: > >>>> > > Hi, Lucas. The analyzer currently runs a > >>>> > per-translation-unit analysis, so it misses some bugs that > >>>> > whole-program analysis may be able to catch. I'm guessing > >>>> > that's the reason it's unable to catch this particular > issue. > >>>> > > > >>>> > > In general, the analyzer is set for reasonably fast > >>>> > turnaround (depending on the size of the project, of > course), > >>>> > so it also might not do a fully precise interprocedural > >>>> > analysis if the state space gets too big. I'd have to see > the > >>>> > particular test case to tell what's going on here. > >>>> > > > >>>> > > I did see that Edward (CC'd) wanted to try bringing in the > >>>> > Juliet Test Suite for the analyzer, but neither I nor Ted > (the > >>>> > lead on the analyzer) have gotten the chance to sit down and > >>>> > look at what this would actually entail. It's possible he's > >>>> > encountered similar issues, however. > >>>> > > > >>>> > > Jordan > >>>> > > > >>>> > > > >>>> > > On Feb 15, 2014, at 5:58 , Lucas Kanashiro > >>>> > <[email protected]> wrote: > >>>> > > > >>>> > > > I am trying to running 'scan-build' in Juliet suite > >>>> > testcase v1.2 (NIST > >>>> > > > indication) to catch some bugs of 'Division by zero' > (CWE > >>>> > 369) and I > >>>> > > > can't do it, the scan-build can't show me the existing > >>>> > bugs. Did someone > >>>> > > > try to do it yet? > >>>> > > > > >>>> > > > I have a doubt that scan-build can identify a bug of > >>>> > division by zero in > >>>> > > > this case (when parameter denominator is zero): > >>>> > > > > >>>> > > > int divide (int denominator) { > >>>> > > > return 10/denominator; > >>>> > > > } > >>>> > > > > >>>> > > > Can someone help me? Is this a deficiency of scan-build? > >>>> > Can scan-build > >>>> > > > identify the bugs in Juliet suite? > >>>> > > > > >>>> > > > Thanks in advance! > >>>> > > > > >>>> > > > -- > >>>> > > > Lucas Kanashiro Duarte > >>>> > > > Engenharia de Software - FGA/UnB > >>>> > > > [email protected] > >>>> > > > > >>>> > > > _______________________________________________ > >>>> > > > cfe-users mailing list > >>>> > > > [email protected] > >>>> > > > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-users > >>>> > > > >>>> > > >>>> > -- > >>>> > Lucas Kanashiro Duarte > >>>> > Engenharia de Software - FGA/UnB > >>>> > [email protected] > >>>> > > >>>> > > >>>> > > >>>> > > >>>> > > >>>> > -- > >>>> > Mathematics is the language with which God has written the universe. > >>>> > (Galilei) > >>>> > >>>> -- > >>>> Lucas Kanashiro Duarte > >>>> Engenharia de Software - FGA/UnB > >>>> [email protected] > >>>> > >>> > >>> > >>> > >>> -- > >>> Mathematics is the language with which God has written the universe. > >>> (Galilei) > >> > >> > >> > >> > >> -- > >> Mathematics is the language with which God has written the universe. > >> (Galilei) > > > > > > > > > > -- > > Mathematics is the language with which God has written the universe. > > (Galilei) > > > > > > > > -- > Mathematics is the language with which God has written the universe. > (Galilei) >
_______________________________________________ cfe-users mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-users
