On 7/26/18 1:38 PM, Richard Smith wrote:
On Thu, 26 Jul 2018 at 13:14, Erik Pilkington via cfe-commits <cfe-commits@lists.llvm.org <mailto:cfe-commits@lists.llvm.org>> wrote:



    On 7/26/18 12:52 PM, Richard Smith wrote:
    Other than the (fixed) ffmpeg false positive, I see this firing
    in three places.

    One of them is in the libc tests for memset and bzero, where the
    0 argument is intentional.

    I don't find this case very convincing, if you're literally the
    one person that has to test memset then you should probably just
    compile with -Wno-suspicious-memaccess.


I think you're missing the point. In our tests, the false positive rate for this was extremely high. If the warning should be turned off in two-thirds of the places where it appears, then it should not be on by default. That's why I was asking for evidence that it's actually got a much higher true-positive rate than we saw.

    One of them is here:
    
https://github.com/apache/xerces-c/blob/trunk/src/xercesc/util/XMLUTF16Transcoder.cpp#L114
    (note that this code is correct).
    And one of them was a real bug where the second and third
    arguments of memset were transposed.

    That's an extremely low false positive rate, much lower than what
    we'd expect for an enabled-by-default warning. I find this
    extremely surprising; I would have expected the ratio of true--
    to false-positives to be much much higher. But ultimately data
    wins out over expectations.
    How does our experience compare to other people's experiences?
    Are our findings abnormal? (They may well be; our corpus is very
    C++-heavy.) If this is indeed typical, we should reconsider
    whether to have these warnings enabled by default, as they do not
    meet our bar for false positives.

    I tested this internally, and it found ~50 true-positives and only
    one false-positive (similar to apache bug above, where someone was
    memsetting to a sizeof exression). Given those numbers, my
    position is to keep it on-by-default, but I'm open to hear more.
    We might be able to make this warning even more conservative in
    this case by checking if the sizeof expression is actually
    computing a size that corresponds to the type of the third
    argument. I think that would be a good tradeoff if it meant we
    could keep this on by default.


50:1 sounds a lot more reasonable than what I was seeing (1:3). In the past we've said >99% true-positive was a good threshold for on-by-default warnings; your data is a plausible sample set from such a Bernoulli distribution (whereas mine is ... not so much -- but I think this is at least evidence that in my corpus, something else had already caught most of these bugs already; do we have a pre-existing ClangTidy check for this or similar? Does GCC warn on it?).

Yep, GCC has -Wmemset-transposed-args already, and there is also a clang-tidy check. If your codebase already has to pass through those tools then I agree thats probably the reason your seeing those numbers.

If we can improve the true-positive rate further, that'd be a nice-to-have, but given the sizes of our respective sample sets, I think we lack the data to know if we've really helped the true-positive rate very much.

    On Mon, 23 Jul 2018 at 09:28, Erik Pilkington via cfe-commits
    <cfe-commits@lists.llvm.org <mailto:cfe-commits@lists.llvm.org>>
    wrote:

        Sure, that seems pretty reasonable. r337706. Thanks for the
        suggestion!


        On 7/21/18 1:40 PM, Arthur O'Dwyer wrote:
        In this case Clang is complaining about

            memset(complicated-expr, 0, 0);

        which means that transposing the last two arguments would
        not "fix" anything. Clang ought to not complain in this case.
        It doesn't have anything to do with coming from a macro;
        it's just that /both/ arguments are zero.
        (I would also expect that `memset(complicated-expr,
        (unsigned char)fill, (size_t)0)` shouldn't warn, either. But
        we had some disagreement that casts to `(unsigned char)`
        resp. `(size_t)` should be the right way to shut up the
        warning in the first place...)
        Basically Clang should only be suggesting a swap (and thus
        complaining at all) in the case that swapping the arguments
        would actually improve the situation.

        –Arthur

        On Sat, Jul 21, 2018 at 1:33 PM, Nico Weber via cfe-commits
        <cfe-commits@lists.llvm.org
        <mailto:cfe-commits@lists.llvm.org>> wrote:

            This has a false positive on ffmpeg:

            ../../third_party/ffmpeg/libavcodec/options.c:273:64:
            error: 'size' argument to memset is '0'; did you mean to
            transpose the last two arguments?
            [-Werror,-Wmemset-transposed-args]
            alloc_and_copy_or_fail(intra_matrix, 64 *
            sizeof(int16_t), 0);

            With:

            #define alloc_and_copy_or_fail(obj, size, pad) \
                if (src->obj && size > 0) { \
                    dest->obj = av_malloc(size + pad); \
                    if (!dest->obj) \
                        goto fail; \
                    memcpy(dest->obj, src->obj, size); \
                    if (pad) \
                        memset(((uint8_t *) dest->obj) + size, 0,
            pad); \
                }

            
(https://cs.chromium.org/chromium/src/third_party/ffmpeg/libavcodec/options.c?q=alloc_and_copy_or_fail&sq=package:chromium&g=0&l=261
            ;
            https://bugs.chromium.org/p/chromium/issues/detail?id=866202)

            Maybe the warning shouldn't fire if the memset is from a
            macro?

            On Thu, Jul 19, 2018 at 12:51 PM Erik Pilkington via
            cfe-commits <cfe-commits@lists.llvm.org
            <mailto:cfe-commits@lists.llvm.org>> wrote:

                Author: epilk
                Date: Thu Jul 19 09:46:15 2018
                New Revision: 337470

                URL:
                http://llvm.org/viewvc/llvm-project?rev=337470&view=rev
                Log:
                [Sema] Add a new warning, -Wmemset-transposed-args

                This diagnoses calls to memset that have the second
                and third arguments
                transposed, for example:

                  memset(buf, sizeof(buf), 0);

                This is done by checking if the third argument is a
                literal 0, or if the second
                is a sizeof expression (and the third isn't). The
                first check is also done for
                calls to bzero.

                Differential revision: https://reviews.llvm.org/D49112

                Added:
                cfe/trunk/test/Sema/transpose-memset.c
                Modified:
                cfe/trunk/include/clang/Basic/DiagnosticGroups.td
                cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
                cfe/trunk/lib/Sema/SemaChecking.cpp

                Modified:
                cfe/trunk/include/clang/Basic/DiagnosticGroups.td
                URL:
                
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticGroups.td?rev=337470&r1=337469&r2=337470&view=diff
                
==============================================================================
                ---
                cfe/trunk/include/clang/Basic/DiagnosticGroups.td
                (original)
                +++
                cfe/trunk/include/clang/Basic/DiagnosticGroups.td
                Thu Jul 19 09:46:15 2018
                @@ -443,6 +443,13 @@ def : DiagGroup<"synth">;
                 def SizeofArrayArgument :
                DiagGroup<"sizeof-array-argument">;
                 def SizeofArrayDecay : DiagGroup<"sizeof-array-decay">;
                 def SizeofPointerMemaccess :
                DiagGroup<"sizeof-pointer-memaccess">;
                +def MemsetTransposedArgs :
                DiagGroup<"memset-transposed-args">;
                +def DynamicClassMemaccess :
                DiagGroup<"dynamic-class-memaccess">;
                +def NonTrivialMemaccess :
                DiagGroup<"nontrivial-memaccess">;
                +def SuspiciousBzero : DiagGroup<"suspicious-bzero">;
                +def SuspiciousMemaccess :
                DiagGroup<"suspicious-memaccess",
                +  [SizeofPointerMemaccess, DynamicClassMemaccess,
                +   NonTrivialMemaccess, MemsetTransposedArgs,
                SuspiciousBzero]>;
                 def StaticInInline : DiagGroup<"static-in-inline">;
                 def StaticLocalInInline :
                DiagGroup<"static-local-in-inline">;
                 def GNUStaticFloatInit :
                DiagGroup<"gnu-static-float-init">;

                Modified:
                cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
                URL:
                
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td?rev=337470&r1=337469&r2=337470&view=diff
                
==============================================================================
                ---
                cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
                (original)
                +++
                cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
                Thu Jul 19 09:46:15 2018
                @@ -619,14 +619,14 @@ def warn_cstruct_memaccess :
                Warning<
                   "%select{destination for|source of|first operand
                of|second operand of}0 this "
                   "%1 call is a pointer to record %2 that is not
                trivial to "
                 "%select{primitive-default-initialize|primitive-copy}3">,
                - InGroup<DiagGroup<"nontrivial-memaccess">>;
                + InGroup<NonTrivialMemaccess>;
                 def note_nontrivial_field : Note<
                   "field is non-trivial to
                %select{copy|default-initialize}0">;
                 def warn_dyn_class_memaccess : Warning<
                   "%select{destination for|source of|first operand
                of|second operand of}0 this "
                   "%1 call is a pointer to %select{|class
                containing a }2dynamic class %3; "
                   "vtable pointer will be
                %select{overwritten|copied|moved|compared}4">,
                - InGroup<DiagGroup<"dynamic-class-memaccess">>;
                + InGroup<DynamicClassMemaccess>;
                 def note_bad_memaccess_silence : Note<
                   "explicitly cast the pointer to silence this
                warning">;
                 def warn_sizeof_pointer_expr_memaccess : Warning<
                @@ -655,7 +655,19 @@ def
                note_memsize_comparison_paren : Note
                   "did you mean to compare the result of %0 instead?">;
                 def note_memsize_comparison_cast_silence : Note<
                   "explicitly cast the argument to size_t to
                silence this warning">;
                -
                +def warn_suspicious_sizeof_memset : Warning<
                +  "%select{'size' argument to memset is '0'|"
                +  "setting buffer to a 'sizeof' expression}0"
                +  "; did you mean to transpose the last two
                arguments?">,
                + InGroup<MemsetTransposedArgs>;
                +def note_suspicious_sizeof_memset_silence : Note<
                +  "%select{parenthesize the third argument|"
                +  "cast the second argument to 'int'}0 to silence">;
                +def warn_suspicious_bzero_size : Warning<"'size'
                argument to bzero is '0'">,
                +  InGroup<SuspiciousBzero>;
                +def note_suspicious_bzero_size_silence : Note<
                +  "parenthesize the second argument to silence">;
                +
                 def warn_strncat_large_size : Warning<
                   "the value of the size argument in 'strncat' is
                too large, might lead to a "
                   "buffer overflow">, InGroup<StrncatSize>;

                Modified: cfe/trunk/lib/Sema/SemaChecking.cpp
                URL:
                
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaChecking.cpp?rev=337470&r1=337469&r2=337470&view=diff
                
==============================================================================
                --- cfe/trunk/lib/Sema/SemaChecking.cpp (original)
                +++ cfe/trunk/lib/Sema/SemaChecking.cpp Thu Jul 19
                09:46:15 2018
                @@ -8629,24 +8629,26 @@ static const CXXRecordDecl
                *getContained
                   return nullptr;
                 }

                +static const UnaryExprOrTypeTraitExpr
                *getAsSizeOfExpr(const Expr *E) {
                +  if (const auto *Unary =
                dyn_cast<UnaryExprOrTypeTraitExpr>(E))
                +    if (Unary->getKind() == UETT_SizeOf)
                +      return Unary;
                +  return nullptr;
                +}
                +
                 /// If E is a sizeof expression, returns its
                argument expression,
                 /// otherwise returns NULL.
                 static const Expr *getSizeOfExprArg(const Expr *E) {
                -  if (const UnaryExprOrTypeTraitExpr *SizeOf =
                - dyn_cast<UnaryExprOrTypeTraitExpr>(E))
                -    if (SizeOf->getKind() == UETT_SizeOf &&
                !SizeOf->isArgumentType())
                +  if (const UnaryExprOrTypeTraitExpr *SizeOf =
                getAsSizeOfExpr(E))
                +    if (!SizeOf->isArgumentType())
                       return
                SizeOf->getArgumentExpr()->IgnoreParenImpCasts();
                -
                   return nullptr;
                 }

                 /// If E is a sizeof expression, returns its
                argument type.
                 static QualType getSizeOfArgType(const Expr *E) {
                -  if (const UnaryExprOrTypeTraitExpr *SizeOf =
                - dyn_cast<UnaryExprOrTypeTraitExpr>(E))
                -    if (SizeOf->getKind() == UETT_SizeOf)
                -      return SizeOf->getTypeOfArgument();
                -
                +  if (const UnaryExprOrTypeTraitExpr *SizeOf =
                getAsSizeOfExpr(E))
                +    return SizeOf->getTypeOfArgument();
                   return QualType();
                 }

                @@ -8742,6 +8744,86 @@ struct
                SearchNonTrivialToCopyField

                 }

                +/// Detect if \c SizeofExpr is likely to calculate
                the sizeof an object.
                +static bool doesExprLikelyComputeSize(const Expr
                *SizeofExpr) {
                +  SizeofExpr = SizeofExpr->IgnoreParenImpCasts();
                +
                +  if (const auto *BO =
                dyn_cast<BinaryOperator>(SizeofExpr)) {
                +    if (BO->getOpcode() != BO_Mul &&
                BO->getOpcode() != BO_Add)
                +      return false;
                +
                +    return doesExprLikelyComputeSize(BO->getLHS()) ||
                +  doesExprLikelyComputeSize(BO->getRHS());
                +  }
                +
                +  return getAsSizeOfExpr(SizeofExpr) != nullptr;
                +}
                +
                +/// Check if the ArgLoc originated from a macro
                passed to the call at CallLoc.
                +///
                +/// \code
                +///   #define MACRO 0
                +///   foo(MACRO);
                +///   foo(0);
                +/// \endcode
                +///
                +/// This should return true for the first call to
                foo, but not for the second
                +/// (regardless of whether foo is a macro or function).
                +static bool
                isArgumentExpandedFromMacro(SourceManager &SM,
                +     SourceLocation CallLoc,
                +     SourceLocation ArgLoc) {
                +  if (!CallLoc.isMacroID())
                +    return SM.getFileID(CallLoc) !=
                SM.getFileID(ArgLoc);
                +
                +  return
                SM.getFileID(SM.getImmediateMacroCallerLoc(CallLoc)) !=
                +  SM.getFileID(SM.getImmediateMacroCallerLoc(ArgLoc));
                +}
                +
                +/// Diagnose cases like 'memset(buf, sizeof(buf),
                0)', which should have the
                +/// last two arguments transposed.
                +static void CheckMemaccessSize(Sema &S, unsigned
                BId, const CallExpr *Call) {
                +  if (BId != Builtin::BImemset && BId !=
                Builtin::BIbzero)
                +    return;
                +
                +  const Expr *SizeArg =
                +    Call->getArg(BId == Builtin::BImemset ? 2 :
                1)->IgnoreImpCasts();
                +
                +  // If we're memsetting or bzeroing 0 bytes, then
                this is likely an error.
                +  SourceLocation CallLoc = Call->getRParenLoc();
                +  SourceManager &SM = S.getSourceManager();
                +  if (isa<IntegerLiteral>(SizeArg) &&
                + cast<IntegerLiteral>(SizeArg)->getValue() == 0 &&
                + !isArgumentExpandedFromMacro(SM, CallLoc,
                SizeArg->getExprLoc())) {
                +
                +    SourceLocation DiagLoc = SizeArg->getExprLoc();
                +
                +    // Some platforms #define bzero to
                __builtin_memset. See if this is the
                +    // case, and if so, emit a better diagnostic.
                +    if (BId == Builtin::BIbzero ||
                +        (CallLoc.isMacroID() &&
                Lexer::getImmediateMacroName(
                + CallLoc, SM, S.getLangOpts()) == "bzero")) {
                +      S.Diag(DiagLoc,
                diag::warn_suspicious_bzero_size);
                +      S.Diag(DiagLoc,
                diag::note_suspicious_bzero_size_silence);
                +    } else {
                +      S.Diag(DiagLoc,
                diag::warn_suspicious_sizeof_memset) << 0;
                +      S.Diag(DiagLoc,
                diag::note_suspicious_sizeof_memset_silence) << 0;
                +    }
                +    return;
                +  }
                +
                +  // If the second argument to a memset is a sizeof
                expression and the third
                +  // isn't, this is also likely an error. This
                should catch
                +  // 'memset(buf, sizeof(buf), 0xff)'.
                +  if (BId == Builtin::BImemset &&
                + doesExprLikelyComputeSize(Call->getArg(1)) &&
                + !doesExprLikelyComputeSize(Call->getArg(2))) {
                +    SourceLocation DiagLoc =
                Call->getArg(1)->getExprLoc();
                +    S.Diag(DiagLoc,
                diag::warn_suspicious_sizeof_memset) << 1;
                +    S.Diag(DiagLoc,
                diag::note_suspicious_sizeof_memset_silence) << 1;
                +    return;
                +  }
                +}
                +
                 /// Check for dangerous or invalid arguments to
                memset().
                 ///
                 /// This issues warnings on known problematic,
                dangerous or unspecified
                @@ -8771,6 +8853,9 @@ void
                Sema::CheckMemaccessArguments(const
                  Call->getLocStart(), Call->getRParenLoc()))
                     return;

                +  // Catch cases like 'memset(buf, sizeof(buf), 0)'.
                +  CheckMemaccessSize(*this, BId, Call);
                +
                   // We have special checking when the length is a
                sizeof expression.
                   QualType SizeOfArgTy = getSizeOfArgType(LenExpr);
                   const Expr *SizeOfArg = getSizeOfExprArg(LenExpr);

                Added: cfe/trunk/test/Sema/transpose-memset.c
                URL:
                
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Sema/transpose-memset.c?rev=337470&view=auto
                
==============================================================================
                --- cfe/trunk/test/Sema/transpose-memset.c (added)
                +++ cfe/trunk/test/Sema/transpose-memset.c Thu Jul
                19 09:46:15 2018
                @@ -0,0 +1,60 @@
                +// RUN: %clang_cc1  -Wmemset-transposed-args -verify %s
                +// RUN: %clang_cc1 -xc++ -Wmemset-transposed-args
                -verify %s
                +
                +#define memset(...) __builtin_memset(__VA_ARGS__)
                +#define bzero(x,y) __builtin_memset(x, 0, y)
                +#define real_bzero(x,y) __builtin_bzero(x,y)
                +
                +int array[10];
                +int *ptr;
                +
                +int main() {
                +  memset(array, sizeof(array), 0); //
                expected-warning{{'size' argument to memset is '0';
                did you mean to transpose the last two arguments?}}
                expected-note{{parenthesize the third argument to
                silence}}
                +  memset(array, sizeof(array), 0xff); //
                expected-warning{{setting buffer to a 'sizeof'
                expression; did you mean to transpose the last two
                arguments?}} expected-note{{cast the second argument
                to 'int' to silence}}
                +  memset(ptr, sizeof(ptr), 0); //
                expected-warning{{'size' argument to memset is '0';
                did you mean to transpose the last two arguments?}}
                expected-note{{parenthesize the third argument to
                silence}}
                +  memset(ptr, sizeof(*ptr) * 10, 1); //
                expected-warning{{setting buffer to a 'sizeof'
                expression; did you mean to transpose the last two
                arguments?}} expected-note{{cast the second argument
                to 'int' to silence}}
                +  memset(ptr, 10 * sizeof(int *), 1); //
                expected-warning{{setting buffer to a 'sizeof'
                expression; did you mean to transpose the last two
                arguments?}} expected-note{{cast the second argument
                to 'int' to silence}}
                +  memset(ptr, 10 * sizeof(int *) + 10, 0xff); //
                expected-warning{{setting buffer to a 'sizeof'
                expression; did you mean to transpose the last two
                arguments?}} expected-note{{cast the second argument
                to 'int' to silence}}
                +  memset(ptr, sizeof(char) * sizeof(int *), 0xff);
                // expected-warning{{setting buffer to a 'sizeof'
                expression; did you mean to transpose the last two
                arguments?}} expected-note{{cast the second argument
                to 'int' to silence}}
                +  memset(array, sizeof(array), sizeof(array)); //
                Uh... fine I guess.
                +  memset(array, 0, sizeof(array));
                +  memset(ptr, 0, sizeof(int *) * 10);
                +  memset(array, (int)sizeof(array), (0)); // no warning
                +  memset(array, (int)sizeof(array), 32); // no warning
                +  memset(array, 32, (0)); // no warning
                +
                +  bzero(ptr, 0); // expected-warning{{'size'
                argument to bzero is '0'}}
                expected-note{{parenthesize the second argument to
                silence}}
                +  real_bzero(ptr, 0); // expected-warning{{'size'
                argument to bzero is '0'}}
                expected-note{{parenthesize the second argument to
                silence}}
                +}
                +
                +void macros() {
                +#define ZERO 0
                +  int array[10];
                +  memset(array, 0xff, ZERO); // no warning
                +  // Still emit a diagnostic for memsetting a
                sizeof expression:
                +  memset(array, sizeof(array), ZERO); //
                expected-warning{{'sizeof'}} expected-note{{cast}}
                +  bzero(array, ZERO); // no warning
                +  real_bzero(array, ZERO); // no warning
                +#define NESTED_DONT_DIAG             \
                +  memset(array, 0xff, ZERO);             \
                +  real_bzero(array, ZERO);
                +
                +  NESTED_DONT_DIAG;
                +
                +#define NESTED_DO_DIAG             \
                +  memset(array, 0xff, 0);              \
                +  real_bzero(array, 0)
                +
                +  NESTED_DO_DIAG; // expected-warning{{'size'
                argument to memset}} expected-warning{{'size'
                argument to bzero}} expected-note2{{parenthesize}}
                +
                +#define FN_MACRO(p)              \
                +  memset(array, 0xff, p)
                +
                +  FN_MACRO(ZERO);
                +  FN_MACRO(0); // FIXME: should we diagnose this?
                +
                +  __builtin_memset(array, 0, ZERO); // no warning
                +  __builtin_bzero(array, ZERO);
                +  __builtin_memset(array, 0, 0); //
                expected-warning{{'size' argument to memset}} //
                expected-note{{parenthesize}}
                +  __builtin_bzero(array, 0); //
                expected-warning{{'size' argument to bzero}} //
                expected-note{{parenthesize}}
                +}


                _______________________________________________
                cfe-commits mailing list
                cfe-commits@lists.llvm.org
                <mailto:cfe-commits@lists.llvm.org>
                http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


            _______________________________________________
            cfe-commits mailing list
            cfe-commits@lists.llvm.org
            <mailto:cfe-commits@lists.llvm.org>
            http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits



        _______________________________________________
        cfe-commits mailing list
        cfe-commits@lists.llvm.org <mailto:cfe-commits@lists.llvm.org>
        http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


    _______________________________________________
    cfe-commits mailing list
    cfe-commits@lists.llvm.org <mailto:cfe-commits@lists.llvm.org>
    http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to