Re: Add support to trace comparison instructions and switch statements

2017-07-11 Thread Wish Wu
Hi

I wrote a test for "-fsanitize-coverage=trace-cmp".

Could anybody tell me whether this code could be merged into GCC?

Index: gcc/testsuite/gcc.dg/sancov/basic3.c
===
--- gcc/testsuite/gcc.dg/sancov/basic3.c (nonexistent)
+++ gcc/testsuite/gcc.dg/sancov/basic3.c (working copy)
@@ -0,0 +1,42 @@
+/* Basic test on number of inserted callbacks.  */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize-coverage=trace-cmp -fdump-tree-optimized" } */
+
+void foo(char *a, short *b, int *c, long long *d, float *e, double *f)
+{
+  if (*a)
+    *a += 1;
+  if (*b)
+    *b = *a;
+  if (*c)
+    *c += 1;
+  if(*d)
+    *d = *c;
+  if(*e == *c)
+    *e = *c;
+  if(*f == *e)
+    *f = *e;
+  switch(*a)
+    {
+    case 2:
+      *b += 2;
+      break;
+    default:
+      break;
+    }
+  switch(*d)
+    {
+    case 3:
+      *d += 3;
+    case -4:
+      *d -= 4;
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin___sanitizer_cov_trace_cmp1 \\(" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin___sanitizer_cov_trace_cmp2 \\(" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin___sanitizer_cov_trace_cmp4 \\(" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin___sanitizer_cov_trace_cmp8 \\(" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin___sanitizer_cov_trace_cmpf \\(" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin___sanitizer_cov_trace_cmpd \\(" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin___sanitizer_cov_trace_switch \\(" 2 "optimized" } } */


With Regards
Wish Wu
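
For readers who want to link and run code built with this option, here is a minimal sketch of the runtime side, assuming the callback signatures implied by the function types in the patch quoted below (two same-width integer arguments, two floats, two doubles, and a uint64 plus a pointer for switch). The counter variables are hypothetical and are not part of the patch. The layout of the switch table (element 0 is the number of case constants, element 1 the bit width of the value, the constants follow) is taken from the Clang SanitizerCoverage documentation and is assumed to carry over to GCC.

```c
#include <stdint.h>

/* Hypothetical counters, one slot per callback kind:
   cmp1, cmp2, cmp4, cmp8, cmpf, cmpd.  */
static unsigned long cmp_calls[6];
static unsigned long switch_calls;
static unsigned long switch_case_total;

void __sanitizer_cov_trace_cmp1 (uint8_t arg1, uint8_t arg2)
{
  (void) arg1; (void) arg2;
  cmp_calls[0]++;
}

void __sanitizer_cov_trace_cmp2 (uint16_t arg1, uint16_t arg2)
{
  (void) arg1; (void) arg2;
  cmp_calls[1]++;
}

void __sanitizer_cov_trace_cmp4 (uint32_t arg1, uint32_t arg2)
{
  (void) arg1; (void) arg2;
  cmp_calls[2]++;
}

void __sanitizer_cov_trace_cmp8 (uint64_t arg1, uint64_t arg2)
{
  (void) arg1; (void) arg2;
  cmp_calls[3]++;
}

void __sanitizer_cov_trace_cmpf (float arg1, float arg2)
{
  (void) arg1; (void) arg2;
  cmp_calls[4]++;
}

void __sanitizer_cov_trace_cmpd (double arg1, double arg2)
{
  (void) arg1; (void) arg2;
  cmp_calls[5]++;
}

/* ARG is assumed to point to a uint64_t table:
   cases[0] = number of case constants, cases[1] = bit width of VAL,
   cases[2..] = the case constants themselves.  */
void __sanitizer_cov_trace_switch (uint64_t val, void *arg)
{
  uint64_t *cases = arg;
  (void) val;
  switch_calls++;
  switch_case_total += cases[0];
}
```

A real fuzzer runtime would record the operand values instead of only counting calls; the counters just make it easy to check that the instrumentation fires.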

On Mon, Jul 10, 2017 at 8:07 PM, 吴潍浠(此彼)  wrote:
> Hi
>
> I wrote some code to make GCC support comparison-guided fuzzing.
> It is very similar to
> http://clang.llvm.org/docs/SanitizerCoverage.html#tracing-data-flow .
> With -fsanitize-coverage=trace-cmp the compiler will insert extra
> instrumentation around comparison instructions and switch statements.
> I think it is useful for fuzzing.  :D
>
> The patch is below; I may supply test cases later.
>
> With Regards
> Wish Wu
>
> Index: gcc/asan.c
> ===
> --- gcc/asan.c  (revision 250082)
> +++ gcc/asan.c  (working copy)
> @@ -2705,6 +2705,29 @@ initialize_sanitizer_builtins (void)
>    tree BT_FN_SIZE_CONST_PTR_INT
>      = build_function_type_list (size_type_node, const_ptr_type_node,
>                                  integer_type_node, NULL_TREE);
> +
> +  tree BT_FN_VOID_UINT8_UINT8
> +    = build_function_type_list (void_type_node, unsigned_char_type_node,
> +                                unsigned_char_type_node, NULL_TREE);
> +  tree BT_FN_VOID_UINT16_UINT16
> +    = build_function_type_list (void_type_node, uint16_type_node,
> +                                uint16_type_node, NULL_TREE);
> +  tree BT_FN_VOID_UINT32_UINT32
> +    = build_function_type_list (void_type_node, uint32_type_node,
> +                                uint32_type_node, NULL_TREE);
> +  tree BT_FN_VOID_UINT64_UINT64
> +    = build_function_type_list (void_type_node, uint64_type_node,
> +                                uint64_type_node, NULL_TREE);
> +  tree BT_FN_VOID_FLOAT_FLOAT
> +    = build_function_type_list (void_type_node, float_type_node,
> +                                float_type_node, NULL_TREE);
> +  tree BT_FN_VOID_DOUBLE_DOUBLE
> +    = build_function_type_list (void_type_node, double_type_node,
> +                                double_type_node, NULL_TREE);
> +  tree BT_FN_VOID_UINT64_PTR
> +    = build_function_type_list (void_type_node, uint64_type_node,
> +                                ptr_type_node, NULL_TREE);
> +
>    tree BT_FN_BOOL_VPTR_PTR_IX_INT_INT[5];
>    tree BT_FN_IX_CONST_VPTR_INT[5];
>    tree BT_FN_IX_VPTR_IX_INT[5];
> Index: gcc/builtin-types.def
> ===
> --- gcc/builtin-types.def   (revision 250082)
> +++ gcc/builtin-types.def   (working copy)
> @@ -338,8 +338,20 @@ DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTRMODE_PTR,
>                        BT_VOID, BT_PTRMODE, BT_PTR)
>  DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTR_PTRMODE,
>                        BT_VOID, BT_PTR, BT_PTRMODE)
> +DEF_FUNCTION_TYPE_2 (BT_FN_VOID_UINT8_UINT8,
> +                     BT_VOID, BT_UINT8, BT_UINT8)
> +DEF_FUNCTION_TYPE_2 (BT_FN_VOID_UINT16_UINT16,
> +                     BT_VOID, BT_UINT16, BT_UINT16)
> +DEF_FUNCTION_TYPE_2 (BT_FN_VOID_UINT32_UINT32,
> +                     BT_VOID, BT_UINT32, BT_UINT32)
>  DEF_FUNCTION_TYPE_2 (BT_FN_VOID_UINT64_UINT64,
> 

Re: Add support to trace comparison instructions and switch statements

2017-07-13 Thread Wish Wu
Hi

From my perspective:

1. Do we need to assign a unique ID to every comparison?
Yes. I suggest implementing it like -fsanitize-coverage=trace-pc-guard,
because some fuzzing targets may call dlopen()-like functions to
load libraries (modules) after fork(), and these libraries may be
compiled with trace-cmp as well.
With ASLR enabled by the linker and/or kernel, the return address can't
serve as a unique ID for every comparison.

2. Should we merge cmp1(), cmp2(), cmp4(), cmp8(), cmpf(), cmpd() into one cmp()?
No. It may reduce fuzzing performance, and it may waste
registers. But "switch" statements are far less common than "if"
statements, so I can forgive the waste in the "switch" callback.

3. Should we record operators (<, >, ==, <=, ...)?
Probably not. As comparisons, "<", "==" and ">" are all
meaningful, because programmers must have some reason to use them. In
practice, "==" is more meaningful than the others.

4. Should we record comparisons for counting loop checks?
Not sure.

With Regards
Wish Wu of Ant-financial Light-Year Security Lab

On Thu, Jul 13, 2017 at 4:09 PM, Dmitry Vyukov  wrote:
> On Tue, Jul 11, 2017 at 1:59 PM, Wish Wu  wrote:
>> Hi
>>
>> I wrote a test for "-fsanitize-coverage=trace-cmp" .
>>
>> Is there anybody tells me if these codes could be merged into gcc ?
>
>
> Nice!
>
> We are currently working on Linux kernel fuzzing that uses
> comparison tracing. We use clang at the moment, but having this
> support in gcc would be great for kernel land.
>
> One concern I have: do we want to do some final refinements to the API
> before we implement this in both compilers?
>
> 2 things we considered from our perspective:
>  - communicating to the runtime which operands are constants
>  - communicating to the runtime which comparisons are counting loop checks
>
> First is useful if you do "find one operand in input and replace with
> the other one" thing. Second is useful because counting loop checks
> are usually not useful (at least all but one).
> In the original Go implementation I also conveyed signedness of
> operands, exact comparison operation (<, >, etc):
> https://github.com/dvyukov/go-fuzz/blob/master/go-fuzz-defs/defs.go#L13
> But I did not find any use for that.
> I also gave all comparisons unique IDs:
> https://github.com/dvyukov/go-fuzz/blob/master/go-fuzz-dep/sonar.go#L24
> That turned out to be useful. And there are chances we will want this
> for C/C++ as well.
>
> Kostya, did anything like this pop up in your work on libfuzzer?
> Can we still change the clang API? At least add an additional argument
> to the callbacks?
>
> At the very least I would suggest that we add an additional arg that
> contains some flags (1/2 arg is a const, this is counting loop check,
> etc). If we do that we can also have just 1 callback that accepts
> uint64's for args because we can pass operand size in the flags:
>
> void __sanitizer_cov_trace_cmp(uint64 arg1, uint64 arg2, uint64 flags);
>
> But I wonder if 3 uint64 args will be too inefficient for 32 bit archs?...
>
> If we create a global per comparison then we could put the flags into
> the global:
>
> void __sanitizer_cov_trace_cmp(uint64 arg1, uint64 arg2, something_t *global);
>
> Thoughts?
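
To make the single-callback idea above concrete, here is a minimal sketch of how a flags word could carry the operand size together with the "operand is a constant" and "counting loop check" hints. The bit layout and every name in it (CMP_SIZE_MASK, cmp_make_flags, cmp_size_bytes and so on) are assumptions made up for illustration; the thread does not settle on any encoding.

```c
#include <stdint.h>

/* Hypothetical flag layout; not an agreed API.  */
enum
{
  CMP_SIZE_MASK  = 0x3,      /* log2 of the operand size in bytes (0..3) */
  CMP_ARG1_CONST = 1u << 2,  /* first operand is a compile-time constant */
  CMP_ARG2_CONST = 1u << 3,  /* second operand is a compile-time constant */
  CMP_LOOP_CHECK = 1u << 4   /* comparison is a counting loop check */
};

/* Pack the properties of one comparison into a flags word.
   SIZE_BYTES is expected to be 1, 2, 4 or 8.  */
static inline uint64_t
cmp_make_flags (unsigned size_bytes, int arg1_const, int arg2_const,
                int loop_check)
{
  uint64_t flags = 0;

  while ((1u << flags) < size_bytes)
    flags++;                 /* flags now holds log2 (size_bytes) */
  if (arg1_const)
    flags |= CMP_ARG1_CONST;
  if (arg2_const)
    flags |= CMP_ARG2_CONST;
  if (loop_check)
    flags |= CMP_LOOP_CHECK;
  return flags;
}

/* Recover the operand size in bytes from a flags word.  */
static inline unsigned
cmp_size_bytes (uint64_t flags)
{
  return 1u << (flags & CMP_SIZE_MASK);
}
```

With such a layout the compiler would emit the flags as an immediate, and the runtime could skip loop checks or truncate the uint64 operands to the real width with one mask and one shift. Whether that is cheaper than six separate callbacks on 32-bit targets is exactly the open question raised above.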

Re: Add support to trace comparison instructions and switch statements

2017-07-13 Thread Wish Wu
Hi

In fact, on Linux we can derive a unique ID for every comparison from
the return address and the file "/proc/self/maps".
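
A minimal sketch of that idea, assuming the usual Linux "/proc/self/maps" line format (start-end, permissions, file offset, ...): find the mapping that contains the address and turn it into an offset within the mapped file, which does not change with ASLR. The helper name stable_id_for_address is hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Store in *ID the file offset that corresponds to ADDR, computed from
   the mapping in /proc/self/maps that contains ADDR.  Returns 0 on
   success and -1 if the file cannot be read or no mapping matches.  */
static int
stable_id_for_address (const void *addr, uint64_t *id)
{
  uint64_t pc = (uint64_t) (uintptr_t) addr;
  char line[512];
  int result = -1;
  FILE *maps = fopen ("/proc/self/maps", "r");

  if (!maps)
    return -1;
  while (fgets (line, sizeof line, maps))
    {
      uint64_t start, end, offset;

      /* Each line starts with "start-end perms offset"; lines that do
         not parse (for example the tail of an over-long line) are
         skipped.  */
      if (sscanf (line, "%" SCNx64 "-%" SCNx64 " %*4s %" SCNx64,
                  &start, &end, &offset) != 3)
        continue;
      if (pc >= start && pc < end)
        {
          *id = (pc - start) + offset;
          result = 0;
          break;
        }
    }
  fclose (maps);
  return result;
}
```

Inside a callback the address would come from __builtin_return_address (0). The offset alone is only unique within one module, so a real runtime would also have to mix in something that identifies the module (for example its path), and it would cache the parsed mappings instead of rereading the file on every call.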

For fuzzing, we could keep 3 bits per comparison as markers of whether
"<", "==" or ">" has been observed. :D

With Regards
Wish Wu of Ant-financial Light-Year Security Lab
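
The 3-bit marker could look roughly like this. It is a sketch under stated assumptions: the table size, the names (cmp_seen, record_relation) and the choice to compare the operands as unsigned values are all hypothetical, and the ID is assumed to come from some per-comparison numbering such as the one discussed above.

```c
#include <stdint.h>

/* One bit per relation that has been observed at a comparison.  */
enum
{
  SEEN_LT = 1u << 0,
  SEEN_EQ = 1u << 1,
  SEEN_GT = 1u << 2
};

/* Hypothetical fixed-size table indexed by comparison ID.  */
#define CMP_TABLE_SIZE 4096u
static uint8_t cmp_seen[CMP_TABLE_SIZE];

/* Record which relation holds between ARG1 and ARG2 at comparison ID.
   Returns 1 if this relation was not seen before at that comparison,
   0 otherwise.  Operands are compared as unsigned values.  */
static int
record_relation (uint64_t id, uint64_t arg1, uint64_t arg2)
{
  uint8_t bit = arg1 < arg2 ? SEEN_LT : arg1 == arg2 ? SEEN_EQ : SEEN_GT;
  uint8_t *slot = &cmp_seen[id % CMP_TABLE_SIZE];
  int is_new = (*slot & bit) == 0;

  *slot |= bit;
  return is_new;
}
```

A fuzzer could treat a return value of 1 as new coverage and keep the input that produced it, which rewards inputs that flip a comparison from "<" to "==" or ">".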
