llvmbot wrote: @llvm/pr-subscribers-clang-static-analyzer-1
<details> <summary>Changes</summary> strlen(..) call should not propagate taintedness, because it brings in many false positive findings. It is a common pattern to copy user provided input to another buffer. In these cases we always get warnings about tainted data used as the malloc parameter: buf = malloc(strlen(tainted_txt) + 1); // false warning This pattern can lead to a denial of service attack only, when the attacker can directly specify the size of the allocated area as an arbitrary large number (e.g. the value is converted from a user provided string). Later, we could reintroduce strlen() as a taint propagating function with the consideration not to emit warnings when the tainted value cannot be "arbitrarily large" (such as the size of an already allocated memory block). The change has been evaluated on the following open source projects: - memcached: [1 lost false positive](https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?run=memcached_1.6.8_ednikru_taint_nostrlen_baseline&newcheck=memcached_1.6.8_ednikru_taint_nostrlen_new&is-unique=on&diff-type=Resolved) - tmux: 0 lost reports - twin [3 lost false positives](https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?run=twin_v0.8.1_ednikru_taint_nostrlen_baseline&newcheck=twin_v0.8.1_ednikru_taint_nostrlen_new&is-unique=on&diff-type=Resolved) - vim [1 lost false positive](https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?run=vim_v8.2.1920_ednikru_taint_nostrlen_baseline&newcheck=vim_v8.2.1920_ednikru_taint_nostrlen_new&is-unique=on&diff-type=Resolved) - openssl 0 lost reports - sqliste [2 lost false positives](https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?run=sqlite_version-3.33.0_ednikru_taint_nostrlen_baseline&newcheck=sqlite_version-3.33.0_ednikru_taint_nostrlen_new&is-unique=on&diff-type=Resolved) - ffmpeg 0 lost repots - postgresql [3 lost false positives](https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?run=postgres_REL_13_0_ednikru_taint_nostrlen_baseline&newcheck=postgres_REL_13_0_ednikru_taint_nostrlen_new&is-unique=on&diff-type=Resolved) - tinyxml 0 lost reports - libwebm 0 lost reports - xerces 0 lost reports In all cases the lost reports are originating from copying untrusted environment variables into another buffer. There are 2 types of lost false positive reports: 1) [Where the warning is emitted at the malloc call by the TaintPropagation Checker ](https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=memcached_1.6.8_ednikru_taint_nostrlen_baseline&newcheck=memcached_1.6.8_ednikru_taint_nostrlen_new&is-unique=on&diff-type=Resolved&report-id=2648506&report-hash=2079221954026f17e1ecb614f5f054db&report-filepath=%2amemcached.c) ` len = strlen(portnumber_filename)+4+1; temp_portnumber_filename = malloc(len); ` 2) When pointers are set based on the length of the tainted string by the ArrayOutofBoundsv2 checker. For example [this ](https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=vim_v8.2.1920_ednikru_taint_nostrlen_baseline&newcheck=vim_v8.2.1920_ednikru_taint_nostrlen_new&is-unique=on&diff-type=Resolved&report-id=2649310&report-hash=79dc8522d2cd34ca8e1b2dc2db64b2df&report-filepath=%2aos_unix.c)case. -- Full diff: https://github.com/llvm/llvm-project/pull/66086.diff 4 Files Affected: - (modified) clang/docs/analyzer/checkers.rst (+2-2) - (modified) clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp (-2) - (modified) clang/test/Analysis/taint-diagnostic-visitor.c (+7-6) - (modified) clang/test/Analysis/taint-generic.c (-18) <pre> diff --git a/clang/docs/analyzer/checkers.rst b/clang/docs/analyzer/checkers.rst index 54ea49e7426cc86..dbd6d7787823530 100644 --- a/clang/docs/analyzer/checkers.rst +++ b/clang/docs/analyzer/checkers.rst @@ -2599,8 +2599,8 @@ Default propagations rules: ``memcpy``, ``memmem``, ``memmove``, ``mbtowc``, ``pread``, ``qsort``, ``qsort_r``, ``rawmemchr``, ``read``, ``recv``, ``recvfrom``, ``rindex``, ``strcasestr``, ``strchr``, ``strchrnul``, ``strcasecmp``, ``strcmp``, - ``strcspn``, ``strlen``, ``strncasecmp``, ``strncmp``, ``strndup``, - ``strndupa``, ``strnlen``, ``strpbrk``, ``strrchr``, ``strsep``, ``strspn``, + ``strcspn``, ``strncasecmp``, ``strncmp``, ``strndup``, + ``strndupa``, ``strpbrk``, ``strrchr``, ``strsep``, ``strspn``, ``strstr``, ``strtol``, ``strtoll``, ``strtoul``, ``strtoull``, ``tolower``, ``toupper``, ``ttyname``, ``ttyname_r``, ``wctomb``, ``wcwidth`` diff --git a/clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp index 3dcb45c0b110383..9df69c1ad1b525e 100644 --- a/clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp +++ b/clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp @@ -694,8 +694,6 @@ void GenericTaintChecker::initTaintRules(CheckerContext &C) const { {{{"strpbrk"}}, TR::Prop({{0}}, {{ReturnValueIndex}})}, {{{"strndup"}}, TR::Prop({{0}}, {{ReturnValueIndex}})}, {{{"strndupa"}}, TR::Prop({{0}}, {{ReturnValueIndex}})}, - {{{"strlen"}}, TR::Prop({{0}}, {{ReturnValueIndex}})}, - {{{"strnlen"}}, TR::Prop({{0}}, {{ReturnValueIndex}})}, {{{"strtol"}}, TR::Prop({{0}}, {{1, ReturnValueIndex}})}, {{{"strtoll"}}, TR::Prop({{0}}, {{1, ReturnValueIndex}})}, {{{"strtoul"}}, TR::Prop({{0}}, {{1, ReturnValueIndex}})}, diff --git a/clang/test/Analysis/taint-diagnostic-visitor.c b/clang/test/Analysis/taint-diagnostic-visitor.c index 663836836d3db67..1eb926f25f9a778 100644 --- a/clang/test/Analysis/taint-diagnostic-visitor.c +++ b/clang/test/Analysis/taint-diagnostic-visitor.c @@ -10,6 +10,7 @@ int scanf(const char *restrict format, ...); int system(const char *command); char* getenv( const char* env_var ); size_t strlen( const char* str ); +int atoi( const char* str ); void *malloc(size_t size ); void free( void *ptr ); char *fgets(char *str, int n, FILE *stream); @@ -54,11 +55,11 @@ void taintDiagnosticVLA(void) { // propagating through variables and expressions char *taintDiagnosticPropagation(){ char *pathbuf; - char *pathlist=getenv("PATH"); // expected-note {{Taint originated here}} + char *size=getenv("SIZE"); // expected-note {{Taint originated here}} // expected-note@-1 {{Taint propagated to the return value}} - if (pathlist){ // expected-note {{Assuming 'pathlist' is non-null}} + if (size){ // expected-note {{Assuming 'size' is non-null}} // expected-note@-1 {{Taking true branch}} - pathbuf=(char*) malloc(strlen(pathlist)+1); // expected-warning{{Untrusted data is used to specify the buffer size}} + pathbuf=(char*) malloc(atoi(size)); // expected-warning{{Untrusted data is used to specify the buffer size}} // expected-note@-1{{Untrusted data is used to specify the buffer size}} // expected-note@-2 {{Taint propagated to the return value}} return pathbuf; @@ -71,12 +72,12 @@ char *taintDiagnosticPropagation(){ char *taintDiagnosticPropagation2(){ char *pathbuf; char *user_env2=getenv("USER_ENV_VAR2");//unrelated taint source - char *pathlist=getenv("PATH"); // expected-note {{Taint originated here}} + char *size=getenv("SIZE"); // expected-note {{Taint originated here}} // expected-note@-1 {{Taint propagated to the return value}} char *user_env=getenv("USER_ENV_VAR");//unrelated taint source - if (pathlist){ // expected-note {{Assuming 'pathlist' is non-null}} + if (size){ // expected-note {{Assuming 'size' is non-null}} // expected-note@-1 {{Taking true branch}} - pathbuf=(char*) malloc(strlen(pathlist)+1); // expected-warning{{Untrusted data is used to specify the buffer size}} + pathbuf=(char*) malloc(atoi(size)+1); // expected-warning{{Untrusted data is used to specify the buffer size}} // expected-note@-1{{Untrusted data is used to specify the buffer size}} // expected-note@-2 {{Taint propagated to the return value}} return pathbuf; diff --git a/clang/test/Analysis/taint-generic.c b/clang/test/Analysis/taint-generic.c index b7906d201e4fad3..a614453c63af671 100644 --- a/clang/test/Analysis/taint-generic.c +++ b/clang/test/Analysis/taint-generic.c @@ -915,24 +915,6 @@ void testStrndupa(size_t n) { clang_analyzer_isTainted_charp(result); // expected-warning {{YES}} } -size_t strlen(const char *s); -void testStrlen() { - char s[10]; - scanf("%9s", s); - - size_t result = strlen(s); - clang_analyzer_isTainted_int(result); // expected-warning {{YES}} -} - -size_t strnlen(const char *s, size_t maxlen); -void testStrnlen(size_t maxlen) { - char s[10]; - scanf("%9s", s); - - size_t result = strnlen(s, maxlen); - clang_analyzer_isTainted_int(result); // expected-warning {{YES}} -} - long strtol(const char *restrict nptr, char **restrict endptr, int base); long long strtoll(const char *restrict nptr, char **restrict endptr, int base); unsigned long int strtoul(const char *nptr, char **endptr, int base); </pre> </details> https://github.com/llvm/llvm-project/pull/66086 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits