rsundahl added a comment.

Adding dialog to comments made by reviewers.



================
Comment at: clang/lib/CodeGen/ItaniumCXXABI.cpp:2443
+  // Handle poisoning the array cookie in asan
+  if (CGM.getLangOpts().Sanitize.has(SanitizerKind::Address) && AS == 0 &&
+      (expr->getOperatorNew()->isReplaceableGlobalAllocationFunction() ||
----------------
delcypher wrote:
> Why is there a restriction on the address space?
It's only there because it's also there in 
ItaniumCXXABI::InitializeArrayCookie() method and I have no reason to think 
that the ARMCXXABI version would be different. That said, this is the revision 
that introduced the notion originally: https://reviews.llvm.org/D5111


================
Comment at: clang/lib/CodeGen/ItaniumCXXABI.cpp:2478
+  // run-time deal with it: if the shadow is properly poisoned return the
+  // cookie, otherwise return 0 to avoid an infinite loop calling DTORs.
+  // We can't simply ignore this load using nosanitize metadata because
----------------
delcypher wrote:
> This comment is confusing. What's returning `0`? 
> `__asan_load_cxx_array_cookie` doesn't do that and AFAICT neither does this 
> code.
(Same answer) It's only there because it's also there in 
ItaniumCXXABI::InitializeArrayCookie() method and I have no reason to think 
that the ARMCXXABI version would be different. That said, this is the revision 
that introduced the notion originally: https://reviews.llvm.org/D5111


================
Comment at: clang/lib/CodeGen/ItaniumCXXABI.cpp:2479
+  // cookie, otherwise return 0 to avoid an infinite loop calling DTORs.
+  // We can't simply ignore this load using nosanitize metadata because
+  // the metadata may be lost.
----------------
delcypher wrote:
> I also don't understand what you mean by the comment. Could you elaborate?
(Same answer) It's only there because it's also there in 
ItaniumCXXABI::InitializeArrayCookie() method and I have no reason to think 
that the ARMCXXABI version would be different. That said, this is the revision 
that introduced the notion originally: https://reviews.llvm.org/D5111


================
Comment at: clang/lib/CodeGen/ItaniumCXXABI.cpp:2485
+      CGM.CreateRuntimeFunction(FTy, "__asan_load_cxx_array_cookie");
+  return CGF.Builder.CreateCall(F, numElementsPtr.getRawPointer(CGF));
 }
----------------
yln wrote:
> Same here: extract common helper for the ASan stuff to be shared by ARM and 
> ItaniumCXXABI.
Was limiting my impact but yes, I agree with you and thanks for expecting it. 
FYI and FWIW: the same concern that @delcypher has in the preceding comments 
will still exist because the factored code will come from the existing Itanium 
methods and those concerns come from that existing code that was introduced by: 
https://reviews.llvm.org/D5111.


================
Comment at: compiler-rt/lib/asan/asan_poisoning.cpp:262
   *reinterpret_cast<u8*>(s) = kAsanArrayCookieMagic;
+  // The ARM64 cookie has a second "elementSize" entry so poison it as well
+  #if SANITIZER_ARM64
----------------
yln wrote:
> yln wrote:
> > yln wrote:
> > > Nitpicking extreme:
> > > * I think the code inside `#if` shouldn't have extra indent (because 
> > > preprocessor sections don't establish code).  If in doubt, let 
> > > `clang-format` do it for you.
> > > * Let's move the comment inside the #if, just above before the line of 
> > > code.  If you ever read the pre-processed source-code, then the comment 
> > > "lives and dies" with the line of code it relates too, i.e, on x86, 
> > > currently there would be a comment without the code.
> > 
> I find this a bit confusing
> * x86_64: cookie is 1 word and passed `p` points to it
> * arm64: cookie is 2 words and passed `p` points to second half of it
> 
> Would it be worth to take the extra care in CodeGen to always pass the 
> "beginning of the cookie" to `__asan_poison_cxx_array_cookie()` and then have 
> something like that:
> ```
> size_t shadow_cookie_size = SANITIZER_ARM64 ? 2 : 1:
> internal_memset(s, kAsanArrayCookieMagic, shadow_cookie_size);
> ```
I don't think so for a collection of reasons. When the ItaniumCXXABI defines 
the "array cookie" it states: "The cookie will have size sizeof(size_t)" and it 
then describes how there may be padding preceding the cookie but makes no 
mention of a second element anywhere. 
(https://itanium-cxx-abi.github.io/cxx-abi/abi.html#array-cookies). Now, while 
the ARMCXXABI(32) specification defines the cookie as two 4-byte (int) values 
(sizeof(element),numElements), the ARMCXXABI(64) 
(https://github.com/ARM-software/abi-aa/blob/320a56971fdcba282b7001cf4b84abb4fd993131/cppabi64/cppabi64.rst#the-generic-c-abi)
 does no such similar thing and defers to the Itanium specification of a 
single, type size_t (8 bytes) cookie which contains only (numElements). The 
initial commit creating the ARMCXXABI(64) mimicking the ARMCXXABI(32) "pair" of 
values occurred here: https://marc.info/?l=cfe-commits&m=135915714232578&w=2 
and is then (unfortunately) this interpretation was reiterated here: 
https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms
which (imo incorrectly) states:
"The ABI provides a fixed layout of two size_t words for array cookies, with no 
extra alignment requirements. This behavior matches the ARM 32-bit C++ ABI." 
However... there is no supporting documentation for this two-element cookie in 
the ARMCXXABI(64). Furthermore, throughout the llvm source code, including 
sanitizer, the term "array cookie" ALWAYS means the (size_t) bytes immediately 
preceding the array pointer that is returned by the allocator. For these 
reasons, I have chosen to treat the "array cookie" uniformly as a size_t 
singleton and the additional sizeof(element) as the padding that it is, but 
with the additional information. (There is no use in llvm of the second element 
either in the Itanium nor the ARM ABIs including the 32 bit ARM ABI. I'm 
confident that we should support only the defined array cookie as a single 
value but also fill the padding with sizeof(element) and deprecate the whole 
idea eventually.)


================
Comment at: compiler-rt/lib/asan/asan_poisoning.cpp:262-265
+  // The ARM64 cookie has a second "elementSize" entry so poison it as well
+  #if SANITIZER_ARM64
+    *(reinterpret_cast<u8*>(s)-1) = kAsanArrayCookieMagic;
+  #endif
----------------
rsundahl wrote:
> yln wrote:
> > yln wrote:
> > > yln wrote:
> > > > Nitpicking extreme:
> > > > * I think the code inside `#if` shouldn't have extra indent (because 
> > > > preprocessor sections don't establish code).  If in doubt, let 
> > > > `clang-format` do it for you.
> > > > * Let's move the comment inside the #if, just above before the line of 
> > > > code.  If you ever read the pre-processed source-code, then the comment 
> > > > "lives and dies" with the line of code it relates too, i.e, on x86, 
> > > > currently there would be a comment without the code.
> > > 
> > I find this a bit confusing
> > * x86_64: cookie is 1 word and passed `p` points to it
> > * arm64: cookie is 2 words and passed `p` points to second half of it
> > 
> > Would it be worth to take the extra care in CodeGen to always pass the 
> > "beginning of the cookie" to `__asan_poison_cxx_array_cookie()` and then 
> > have something like that:
> > ```
> > size_t shadow_cookie_size = SANITIZER_ARM64 ? 2 : 1:
> > internal_memset(s, kAsanArrayCookieMagic, shadow_cookie_size);
> > ```
> I don't think so for a collection of reasons. When the ItaniumCXXABI defines 
> the "array cookie" it states: "The cookie will have size sizeof(size_t)" and 
> it then describes how there may be padding preceding the cookie but makes no 
> mention of a second element anywhere. 
> (https://itanium-cxx-abi.github.io/cxx-abi/abi.html#array-cookies). Now, 
> while the ARMCXXABI(32) specification defines the cookie as two 4-byte (int) 
> values (sizeof(element),numElements), the ARMCXXABI(64) 
> (https://github.com/ARM-software/abi-aa/blob/320a56971fdcba282b7001cf4b84abb4fd993131/cppabi64/cppabi64.rst#the-generic-c-abi)
>  does no such similar thing and defers to the Itanium specification of a 
> single, type size_t (8 bytes) cookie which contains only (numElements). The 
> initial commit creating the ARMCXXABI(64) mimicking the ARMCXXABI(32) "pair" 
> of values occurred here: 
> https://marc.info/?l=cfe-commits&m=135915714232578&w=2 and is then 
> (unfortunately) this interpretation was reiterated here: 
> https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms
> which (imo incorrectly) states:
> "The ABI provides a fixed layout of two size_t words for array cookies, with 
> no extra alignment requirements. This behavior matches the ARM 32-bit C++ 
> ABI." However... there is no supporting documentation for this two-element 
> cookie in the ARMCXXABI(64). Furthermore, throughout the llvm source code, 
> including sanitizer, the term "array cookie" ALWAYS means the (size_t) bytes 
> immediately preceding the array pointer that is returned by the allocator. 
> For these reasons, I have chosen to treat the "array cookie" uniformly as a 
> size_t singleton and the additional sizeof(element) as the padding that it 
> is, but with the additional information. (There is no use in llvm of the 
> second element either in the Itanium nor the ARM ABIs including the 32 bit 
> ARM ABI. I'm confident that we should support only the defined array cookie 
> as a single value but also fill the padding with sizeof(element) and 
> deprecate the whole idea eventually.)
Good stuff. Please nit-pick all you like. Will update.


================
Comment at: compiler-rt/test/asan/TestCases/Posix/new_array_cookie_test.cpp:3
 // RUN: %clangxx_asan -O3 %s -o %t
-// RUN:                                    not %run %t 2>&1  | FileCheck %s
+// RUN:                                      not %run %t 2>&1  | FileCheck %s
 // RUN: %env_asan_opts=poison_array_cookie=1 not %run %t 2>&1  | FileCheck %s
----------------
yln wrote:
> ❤  The boy scouts rule:
> > Always leave the campground cleaner than you found it.
I pick up litter on my walks too. ;-) 


================
Comment at: compiler-rt/test/asan/TestCases/Posix/new_array_cookie_test.cpp:19
   C *buffer = new C[argc];
-  buffer[-2].x = 10;
+  buffer[-1].x = 10;
 // CHECK: AddressSanitizer: heap-buffer-overflow
----------------
yln wrote:
> Can we try to simulate a 1-byte buffer-underflow (and leave the definition of 
> `struct C` as-is)?  This would show that ASan still reports the smallest 
> possible infraction.
That's a good idea, but it's not about leaving the struct the same because it's 
definitely wrong in that type "int" has nothing to do with the cookie and it 
only happens to be true that  (buffer[-2].x) gets to the beginning of the 
cookie since two "int"s happen to fit inside of a "size_t". (This may not 
always be true as in most 64-on-32 type arrangements) It's better to treat the 
cookie as [-1] of an array of size_t elements so that when indexing [-1] you 
get all of the cookie at once (see: 
https://itanium-cxx-abi.github.io/cxx-abi/abi.html#array-cookies). The reason I 
write to the *whole* cookie at once is because of endian-ness. The existing 
code which only writes to the first half of the cookie would reach a different 
half of the cookie on a big-endian machine so this covers both. But you have a 
good point, we should see if the minimum underflow is caught so I'll think of a 
way to do that in an endian-agnostic way... 


================
Comment at: 
compiler-rt/test/asan/TestCases/Posix/new_array_cookie_uaf_test.cpp:20
 __attribute__((no_sanitize_address)) void Write42ToCookie(C *c) {
-  long *p = reinterpret_cast<long*>(c);
+  size_t *p = reinterpret_cast<size_t*>(c);
   p[-1] = 42;
----------------
yln wrote:
> Can we try to simulate a 1-byte buffer-underflow (and leave the definition of 
> struct C as-is)? This would show that ASan still reports the smallest 
> possible infraction.
(Same answer as above)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125195/new/

https://reviews.llvm.org/D125195

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to