On Thu, 26 Jul 2018 at 13:14, Erik Pilkington via cfe-commits < cfe-commits@lists.llvm.org> wrote:
> > > On 7/26/18 12:52 PM, Richard Smith wrote: > > Other than the (fixed) ffmpeg false positive, I see this firing in three > places. > > One of them is in the libc tests for memset and bzero, where the 0 > argument is intentional. > > > I don't find this case very convincing, if you're literally the one person > that has to test memset then you should probably just compile with > -Wno-suspicious-memaccess. > I think you're missing the point. In our tests, the false positive rate for this was extremely high. If the warning should be turned off in two-thirds of the places where it appears, then it should not be on by default. That's why I was asking for evidence that it's actually got a much higher true-positive rate than we saw. > One of them is here: > https://github.com/apache/xerces-c/blob/trunk/src/xercesc/util/XMLUTF16Transcoder.cpp#L114 > (note that this code is correct). > And one of them was a real bug where the second and third arguments of > memset were transposed. > > That's an extremely low false positive rate, much lower than what we'd > expect for an enabled-by-default warning. I find this extremely surprising; > I would have expected the ratio of true-- to false-positives to be much > much higher. But ultimately data wins out over expectations. > How does our experience compare to other people's experiences? Are our > findings abnormal? (They may well be; our corpus is very C++-heavy.) If > this is indeed typical, we should reconsider whether to have these warnings > enabled by default, as they do not meet our bar for false positives. > > > I tested this internally, and it found ~50 true-positives and only one > false-positive (similar to apache bug above, where someone was memsetting > to a sizeof exression). Given those numbers, my position is to keep it > on-by-default, but I'm open to hear more. We might be able to make this > warning even more conservative in this case by checking if the sizeof > expression is actually computing a size that corresponds to the type of the > third argument. I think that would be a good tradeoff if it meant we could > keep this on by default. > 50:1 sounds a lot more reasonable than what I was seeing (1:3). In the past we've said >99% true-positive was a good threshold for on-by-default warnings; your data is a plausible sample set from such a Bernoulli distribution (whereas mine is ... not so much -- but I think this is at least evidence that in my corpus, something else had already caught most of these bugs already; do we have a pre-existing ClangTidy check for this or similar? Does GCC warn on it?). If we can improve the true-positive rate further, that'd be a nice-to-have, but given the sizes of our respective sample sets, I think we lack the data to know if we've really helped the true-positive rate very much. > On Mon, 23 Jul 2018 at 09:28, Erik Pilkington via cfe-commits < > cfe-commits@lists.llvm.org> wrote: > >> Sure, that seems pretty reasonable. r337706. Thanks for the suggestion! >> >> On 7/21/18 1:40 PM, Arthur O'Dwyer wrote: >> >> In this case Clang is complaining about >> >> memset(complicated-expr, 0, 0); >> >> which means that transposing the last two arguments would not "fix" >> anything. Clang ought to not complain in this case. >> It doesn't have anything to do with coming from a macro; it's just that >> *both* arguments are zero. >> (I would also expect that `memset(complicated-expr, (unsigned char)fill, >> (size_t)0)` shouldn't warn, either. But we had some disagreement that casts >> to `(unsigned char)` resp. `(size_t)` should be the right way to shut up >> the warning in the first place...) >> Basically Clang should only be suggesting a swap (and thus complaining at >> all) in the case that swapping the arguments would actually improve the >> situation. >> >> –Arthur >> >> On Sat, Jul 21, 2018 at 1:33 PM, Nico Weber via cfe-commits < >> cfe-commits@lists.llvm.org> wrote: >> >>> This has a false positive on ffmpeg: >>> >>> ../../third_party/ffmpeg/libavcodec/options.c:273:64: error: 'size' >>> argument to memset is '0'; did you mean to transpose the last two >>> arguments? [-Werror,-Wmemset-transposed-args] >>> alloc_and_copy_or_fail(intra_matrix, 64 * sizeof(int16_t), 0); >>> >>> With: >>> >>> #define alloc_and_copy_or_fail(obj, size, pad) \ >>> if (src->obj && size > 0) { \ >>> dest->obj = av_malloc(size + pad); \ >>> if (!dest->obj) \ >>> goto fail; \ >>> memcpy(dest->obj, src->obj, size); \ >>> if (pad) \ >>> memset(((uint8_t *) dest->obj) + size, 0, pad); \ >>> } >>> >>> ( >>> https://cs.chromium.org/chromium/src/third_party/ffmpeg/libavcodec/options.c?q=alloc_and_copy_or_fail&sq=package:chromium&g=0&l=261 >>> ; https://bugs.chromium.org/p/chromium/issues/detail?id=866202) >>> >>> Maybe the warning shouldn't fire if the memset is from a macro? >>> >>> On Thu, Jul 19, 2018 at 12:51 PM Erik Pilkington via cfe-commits < >>> cfe-commits@lists.llvm.org> wrote: >>> >>>> Author: epilk >>>> Date: Thu Jul 19 09:46:15 2018 >>>> New Revision: 337470 >>>> >>>> URL: http://llvm.org/viewvc/llvm-project?rev=337470&view=rev >>>> Log: >>>> [Sema] Add a new warning, -Wmemset-transposed-args >>>> >>>> This diagnoses calls to memset that have the second and third arguments >>>> transposed, for example: >>>> >>>> memset(buf, sizeof(buf), 0); >>>> >>>> This is done by checking if the third argument is a literal 0, or if >>>> the second >>>> is a sizeof expression (and the third isn't). The first check is also >>>> done for >>>> calls to bzero. >>>> >>>> Differential revision: https://reviews.llvm.org/D49112 >>>> >>>> Added: >>>> cfe/trunk/test/Sema/transpose-memset.c >>>> Modified: >>>> cfe/trunk/include/clang/Basic/DiagnosticGroups.td >>>> cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td >>>> cfe/trunk/lib/Sema/SemaChecking.cpp >>>> >>>> Modified: cfe/trunk/include/clang/Basic/DiagnosticGroups.td >>>> URL: >>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticGroups.td?rev=337470&r1=337469&r2=337470&view=diff >>>> >>>> ============================================================================== >>>> --- cfe/trunk/include/clang/Basic/DiagnosticGroups.td (original) >>>> +++ cfe/trunk/include/clang/Basic/DiagnosticGroups.td Thu Jul 19 >>>> 09:46:15 2018 >>>> @@ -443,6 +443,13 @@ def : DiagGroup<"synth">; >>>> def SizeofArrayArgument : DiagGroup<"sizeof-array-argument">; >>>> def SizeofArrayDecay : DiagGroup<"sizeof-array-decay">; >>>> def SizeofPointerMemaccess : DiagGroup<"sizeof-pointer-memaccess">; >>>> +def MemsetTransposedArgs : DiagGroup<"memset-transposed-args">; >>>> +def DynamicClassMemaccess : DiagGroup<"dynamic-class-memaccess">; >>>> +def NonTrivialMemaccess : DiagGroup<"nontrivial-memaccess">; >>>> +def SuspiciousBzero : DiagGroup<"suspicious-bzero">; >>>> +def SuspiciousMemaccess : DiagGroup<"suspicious-memaccess", >>>> + [SizeofPointerMemaccess, DynamicClassMemaccess, >>>> + NonTrivialMemaccess, MemsetTransposedArgs, SuspiciousBzero]>; >>>> def StaticInInline : DiagGroup<"static-in-inline">; >>>> def StaticLocalInInline : DiagGroup<"static-local-in-inline">; >>>> def GNUStaticFloatInit : DiagGroup<"gnu-static-float-init">; >>>> >>>> Modified: cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td >>>> URL: >>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td?rev=337470&r1=337469&r2=337470&view=diff >>>> >>>> ============================================================================== >>>> --- cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td (original) >>>> +++ cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td Thu Jul 19 >>>> 09:46:15 2018 >>>> @@ -619,14 +619,14 @@ def warn_cstruct_memaccess : Warning< >>>> "%select{destination for|source of|first operand of|second operand >>>> of}0 this " >>>> "%1 call is a pointer to record %2 that is not trivial to " >>>> "%select{primitive-default-initialize|primitive-copy}3">, >>>> - InGroup<DiagGroup<"nontrivial-memaccess">>; >>>> + InGroup<NonTrivialMemaccess>; >>>> def note_nontrivial_field : Note< >>>> "field is non-trivial to %select{copy|default-initialize}0">; >>>> def warn_dyn_class_memaccess : Warning< >>>> "%select{destination for|source of|first operand of|second operand >>>> of}0 this " >>>> "%1 call is a pointer to %select{|class containing a }2dynamic class >>>> %3; " >>>> "vtable pointer will be >>>> %select{overwritten|copied|moved|compared}4">, >>>> - InGroup<DiagGroup<"dynamic-class-memaccess">>; >>>> + InGroup<DynamicClassMemaccess>; >>>> def note_bad_memaccess_silence : Note< >>>> "explicitly cast the pointer to silence this warning">; >>>> def warn_sizeof_pointer_expr_memaccess : Warning< >>>> @@ -655,7 +655,19 @@ def note_memsize_comparison_paren : Note >>>> "did you mean to compare the result of %0 instead?">; >>>> def note_memsize_comparison_cast_silence : Note< >>>> "explicitly cast the argument to size_t to silence this warning">; >>>> - >>>> +def warn_suspicious_sizeof_memset : Warning< >>>> + "%select{'size' argument to memset is '0'|" >>>> + "setting buffer to a 'sizeof' expression}0" >>>> + "; did you mean to transpose the last two arguments?">, >>>> + InGroup<MemsetTransposedArgs>; >>>> +def note_suspicious_sizeof_memset_silence : Note< >>>> + "%select{parenthesize the third argument|" >>>> + "cast the second argument to 'int'}0 to silence">; >>>> +def warn_suspicious_bzero_size : Warning<"'size' argument to bzero is >>>> '0'">, >>>> + InGroup<SuspiciousBzero>; >>>> +def note_suspicious_bzero_size_silence : Note< >>>> + "parenthesize the second argument to silence">; >>>> + >>>> def warn_strncat_large_size : Warning< >>>> "the value of the size argument in 'strncat' is too large, might >>>> lead to a " >>>> "buffer overflow">, InGroup<StrncatSize>; >>>> >>>> Modified: cfe/trunk/lib/Sema/SemaChecking.cpp >>>> URL: >>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaChecking.cpp?rev=337470&r1=337469&r2=337470&view=diff >>>> >>>> ============================================================================== >>>> --- cfe/trunk/lib/Sema/SemaChecking.cpp (original) >>>> +++ cfe/trunk/lib/Sema/SemaChecking.cpp Thu Jul 19 09:46:15 2018 >>>> @@ -8629,24 +8629,26 @@ static const CXXRecordDecl *getContained >>>> return nullptr; >>>> } >>>> >>>> +static const UnaryExprOrTypeTraitExpr *getAsSizeOfExpr(const Expr *E) { >>>> + if (const auto *Unary = dyn_cast<UnaryExprOrTypeTraitExpr>(E)) >>>> + if (Unary->getKind() == UETT_SizeOf) >>>> + return Unary; >>>> + return nullptr; >>>> +} >>>> + >>>> /// If E is a sizeof expression, returns its argument expression, >>>> /// otherwise returns NULL. >>>> static const Expr *getSizeOfExprArg(const Expr *E) { >>>> - if (const UnaryExprOrTypeTraitExpr *SizeOf = >>>> - dyn_cast<UnaryExprOrTypeTraitExpr>(E)) >>>> - if (SizeOf->getKind() == UETT_SizeOf && !SizeOf->isArgumentType()) >>>> + if (const UnaryExprOrTypeTraitExpr *SizeOf = getAsSizeOfExpr(E)) >>>> + if (!SizeOf->isArgumentType()) >>>> return SizeOf->getArgumentExpr()->IgnoreParenImpCasts(); >>>> - >>>> return nullptr; >>>> } >>>> >>>> /// If E is a sizeof expression, returns its argument type. >>>> static QualType getSizeOfArgType(const Expr *E) { >>>> - if (const UnaryExprOrTypeTraitExpr *SizeOf = >>>> - dyn_cast<UnaryExprOrTypeTraitExpr>(E)) >>>> - if (SizeOf->getKind() == UETT_SizeOf) >>>> - return SizeOf->getTypeOfArgument(); >>>> - >>>> + if (const UnaryExprOrTypeTraitExpr *SizeOf = getAsSizeOfExpr(E)) >>>> + return SizeOf->getTypeOfArgument(); >>>> return QualType(); >>>> } >>>> >>>> @@ -8742,6 +8744,86 @@ struct SearchNonTrivialToCopyField >>>> >>>> } >>>> >>>> +/// Detect if \c SizeofExpr is likely to calculate the sizeof an >>>> object. >>>> +static bool doesExprLikelyComputeSize(const Expr *SizeofExpr) { >>>> + SizeofExpr = SizeofExpr->IgnoreParenImpCasts(); >>>> + >>>> + if (const auto *BO = dyn_cast<BinaryOperator>(SizeofExpr)) { >>>> + if (BO->getOpcode() != BO_Mul && BO->getOpcode() != BO_Add) >>>> + return false; >>>> + >>>> + return doesExprLikelyComputeSize(BO->getLHS()) || >>>> + doesExprLikelyComputeSize(BO->getRHS()); >>>> + } >>>> + >>>> + return getAsSizeOfExpr(SizeofExpr) != nullptr; >>>> +} >>>> + >>>> +/// Check if the ArgLoc originated from a macro passed to the call at >>>> CallLoc. >>>> +/// >>>> +/// \code >>>> +/// #define MACRO 0 >>>> +/// foo(MACRO); >>>> +/// foo(0); >>>> +/// \endcode >>>> +/// >>>> +/// This should return true for the first call to foo, but not for the >>>> second >>>> +/// (regardless of whether foo is a macro or function). >>>> +static bool isArgumentExpandedFromMacro(SourceManager &SM, >>>> + SourceLocation CallLoc, >>>> + SourceLocation ArgLoc) { >>>> + if (!CallLoc.isMacroID()) >>>> + return SM.getFileID(CallLoc) != SM.getFileID(ArgLoc); >>>> + >>>> + return SM.getFileID(SM.getImmediateMacroCallerLoc(CallLoc)) != >>>> + SM.getFileID(SM.getImmediateMacroCallerLoc(ArgLoc)); >>>> +} >>>> + >>>> +/// Diagnose cases like 'memset(buf, sizeof(buf), 0)', which should >>>> have the >>>> +/// last two arguments transposed. >>>> +static void CheckMemaccessSize(Sema &S, unsigned BId, const CallExpr >>>> *Call) { >>>> + if (BId != Builtin::BImemset && BId != Builtin::BIbzero) >>>> + return; >>>> + >>>> + const Expr *SizeArg = >>>> + Call->getArg(BId == Builtin::BImemset ? 2 : 1)->IgnoreImpCasts(); >>>> + >>>> + // If we're memsetting or bzeroing 0 bytes, then this is likely an >>>> error. >>>> + SourceLocation CallLoc = Call->getRParenLoc(); >>>> + SourceManager &SM = S.getSourceManager(); >>>> + if (isa<IntegerLiteral>(SizeArg) && >>>> + cast<IntegerLiteral>(SizeArg)->getValue() == 0 && >>>> + !isArgumentExpandedFromMacro(SM, CallLoc, >>>> SizeArg->getExprLoc())) { >>>> + >>>> + SourceLocation DiagLoc = SizeArg->getExprLoc(); >>>> + >>>> + // Some platforms #define bzero to __builtin_memset. See if this >>>> is the >>>> + // case, and if so, emit a better diagnostic. >>>> + if (BId == Builtin::BIbzero || >>>> + (CallLoc.isMacroID() && Lexer::getImmediateMacroName( >>>> + CallLoc, SM, S.getLangOpts()) == >>>> "bzero")) { >>>> + S.Diag(DiagLoc, diag::warn_suspicious_bzero_size); >>>> + S.Diag(DiagLoc, diag::note_suspicious_bzero_size_silence); >>>> + } else { >>>> + S.Diag(DiagLoc, diag::warn_suspicious_sizeof_memset) << 0; >>>> + S.Diag(DiagLoc, diag::note_suspicious_sizeof_memset_silence) << >>>> 0; >>>> + } >>>> + return; >>>> + } >>>> + >>>> + // If the second argument to a memset is a sizeof expression and the >>>> third >>>> + // isn't, this is also likely an error. This should catch >>>> + // 'memset(buf, sizeof(buf), 0xff)'. >>>> + if (BId == Builtin::BImemset && >>>> + doesExprLikelyComputeSize(Call->getArg(1)) && >>>> + !doesExprLikelyComputeSize(Call->getArg(2))) { >>>> + SourceLocation DiagLoc = Call->getArg(1)->getExprLoc(); >>>> + S.Diag(DiagLoc, diag::warn_suspicious_sizeof_memset) << 1; >>>> + S.Diag(DiagLoc, diag::note_suspicious_sizeof_memset_silence) << 1; >>>> + return; >>>> + } >>>> +} >>>> + >>>> /// Check for dangerous or invalid arguments to memset(). >>>> /// >>>> /// This issues warnings on known problematic, dangerous or unspecified >>>> @@ -8771,6 +8853,9 @@ void Sema::CheckMemaccessArguments(const >>>> Call->getLocStart(), >>>> Call->getRParenLoc())) >>>> return; >>>> >>>> + // Catch cases like 'memset(buf, sizeof(buf), 0)'. >>>> + CheckMemaccessSize(*this, BId, Call); >>>> + >>>> // We have special checking when the length is a sizeof expression. >>>> QualType SizeOfArgTy = getSizeOfArgType(LenExpr); >>>> const Expr *SizeOfArg = getSizeOfExprArg(LenExpr); >>>> >>>> Added: cfe/trunk/test/Sema/transpose-memset.c >>>> URL: >>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Sema/transpose-memset.c?rev=337470&view=auto >>>> >>>> ============================================================================== >>>> --- cfe/trunk/test/Sema/transpose-memset.c (added) >>>> +++ cfe/trunk/test/Sema/transpose-memset.c Thu Jul 19 09:46:15 2018 >>>> @@ -0,0 +1,60 @@ >>>> +// RUN: %clang_cc1 -Wmemset-transposed-args -verify %s >>>> +// RUN: %clang_cc1 -xc++ -Wmemset-transposed-args -verify %s >>>> + >>>> +#define memset(...) __builtin_memset(__VA_ARGS__) >>>> +#define bzero(x,y) __builtin_memset(x, 0, y) >>>> +#define real_bzero(x,y) __builtin_bzero(x,y) >>>> + >>>> +int array[10]; >>>> +int *ptr; >>>> + >>>> +int main() { >>>> + memset(array, sizeof(array), 0); // expected-warning{{'size' >>>> argument to memset is '0'; did you mean to transpose the last two >>>> arguments?}} expected-note{{parenthesize the third argument to silence}} >>>> + memset(array, sizeof(array), 0xff); // expected-warning{{setting >>>> buffer to a 'sizeof' expression; did you mean to transpose the last two >>>> arguments?}} expected-note{{cast the second argument to 'int' to silence}} >>>> + memset(ptr, sizeof(ptr), 0); // expected-warning{{'size' argument to >>>> memset is '0'; did you mean to transpose the last two arguments?}} >>>> expected-note{{parenthesize the third argument to silence}} >>>> + memset(ptr, sizeof(*ptr) * 10, 1); // expected-warning{{setting >>>> buffer to a 'sizeof' expression; did you mean to transpose the last two >>>> arguments?}} expected-note{{cast the second argument to 'int' to silence}} >>>> + memset(ptr, 10 * sizeof(int *), 1); // expected-warning{{setting >>>> buffer to a 'sizeof' expression; did you mean to transpose the last two >>>> arguments?}} expected-note{{cast the second argument to 'int' to silence}} >>>> + memset(ptr, 10 * sizeof(int *) + 10, 0xff); // >>>> expected-warning{{setting buffer to a 'sizeof' expression; did you mean to >>>> transpose the last two arguments?}} expected-note{{cast the second argument >>>> to 'int' to silence}} >>>> + memset(ptr, sizeof(char) * sizeof(int *), 0xff); // >>>> expected-warning{{setting buffer to a 'sizeof' expression; did you mean to >>>> transpose the last two arguments?}} expected-note{{cast the second argument >>>> to 'int' to silence}} >>>> + memset(array, sizeof(array), sizeof(array)); // Uh... fine I guess. >>>> + memset(array, 0, sizeof(array)); >>>> + memset(ptr, 0, sizeof(int *) * 10); >>>> + memset(array, (int)sizeof(array), (0)); // no warning >>>> + memset(array, (int)sizeof(array), 32); // no warning >>>> + memset(array, 32, (0)); // no warning >>>> + >>>> + bzero(ptr, 0); // expected-warning{{'size' argument to bzero is >>>> '0'}} expected-note{{parenthesize the second argument to silence}} >>>> + real_bzero(ptr, 0); // expected-warning{{'size' argument to bzero is >>>> '0'}} expected-note{{parenthesize the second argument to silence}} >>>> +} >>>> + >>>> +void macros() { >>>> +#define ZERO 0 >>>> + int array[10]; >>>> + memset(array, 0xff, ZERO); // no warning >>>> + // Still emit a diagnostic for memsetting a sizeof expression: >>>> + memset(array, sizeof(array), ZERO); // expected-warning{{'sizeof'}} >>>> expected-note{{cast}} >>>> + bzero(array, ZERO); // no warning >>>> + real_bzero(array, ZERO); // no warning >>>> +#define NESTED_DONT_DIAG \ >>>> + memset(array, 0xff, ZERO); \ >>>> + real_bzero(array, ZERO); >>>> + >>>> + NESTED_DONT_DIAG; >>>> + >>>> +#define NESTED_DO_DIAG \ >>>> + memset(array, 0xff, 0); \ >>>> + real_bzero(array, 0) >>>> + >>>> + NESTED_DO_DIAG; // expected-warning{{'size' argument to memset}} >>>> expected-warning{{'size' argument to bzero}} expected-note2{{parenthesize}} >>>> + >>>> +#define FN_MACRO(p) \ >>>> + memset(array, 0xff, p) >>>> + >>>> + FN_MACRO(ZERO); >>>> + FN_MACRO(0); // FIXME: should we diagnose this? >>>> + >>>> + __builtin_memset(array, 0, ZERO); // no warning >>>> + __builtin_bzero(array, ZERO); >>>> + __builtin_memset(array, 0, 0); // expected-warning{{'size' argument >>>> to memset}} // expected-note{{parenthesize}} >>>> + __builtin_bzero(array, 0); // expected-warning{{'size' argument to >>>> bzero}} // expected-note{{parenthesize}} >>>> +} >>>> >>>> >>>> _______________________________________________ >>>> cfe-commits mailing list >>>> cfe-commits@lists.llvm.org >>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits >>>> >>> >>> _______________________________________________ >>> cfe-commits mailing list >>> cfe-commits@lists.llvm.org >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits >>> >>> >> >> _______________________________________________ >> cfe-commits mailing list >> cfe-commits@lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits >> > > _______________________________________________ > cfe-commits mailing list > cfe-commits@lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits >
_______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits