[Lldb-commits] [PATCH] D107140: [lldb] Show memory tags in memory read output

David Spickett via Phabricator via lldb-commits Fri, 29 Oct 2021 08:40:11 -0700

DavidSpickett updated this revision to Diff 383374.
DavidSpickett added a comment.
Herald added a subscriber: dang.


Added a command option for showing tags (off by default).

Split the patch into three (see parent revisions).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107140/new/

https://reviews.llvm.org/D107140

Files:
  lldb/include/lldb/Core/DumpDataExtractor.h
  lldb/source/Commands/CommandObjectMemory.cpp
  lldb/source/Commands/Options.td
  lldb/source/Core/DumpDataExtractor.cpp
  
lldb/test/API/linux/aarch64/mte_tag_access/TestAArch64LinuxMTEMemoryTagAccess.py

Index: lldb/test/API/linux/aarch64/mte_tag_access/TestAArch64LinuxMTEMemoryTagAccess.py
===================================================================
--- lldb/test/API/linux/aarch64/mte_tag_access/TestAArch64LinuxMTEMemoryTagAccess.py
+++ lldb/test/API/linux/aarch64/mte_tag_access/TestAArch64LinuxMTEMemoryTagAccess.py
@@ -280,3 +280,128 @@
                           "\[0x[0-9A-Fa-f]+10, 0x[0-9A-Fa-f]+20\): 0x3 \(mismatch\)\n"
                           "\[0x[0-9A-Fa-f]+20, 0x[0-9A-Fa-f]+30\): 0x3 \(mismatch\)\n"
                           "\[0x[0-9A-Fa-f]+30, 0x[0-9A-Fa-f]+40\): 0x0$"])
+
+    @skipUnlessArch("aarch64")
+    @skipUnlessPlatform(["linux"])
+    @skipUnlessAArch64MTELinuxCompiler
+    def test_mte_memory_read_tag_display(self):
+        self.setup_mte_test()
+
+        # Reading from an untagged range should not be any different
+        self.expect("memory read non_mte_buf non_mte_buf+16",
+                substrs=["tag"], matching=False)
+
+        # show-tags option is required
+        self.expect("memory read mte_buf mte_buf+32 -f \"x\" -l 1 -s 16",
+                patterns=["tag"], matching=False)
+
+        # Reading 16 bytes per line means 1 granule and so 1 tag per line
+        self.expect("memory read mte_buf mte_buf+32 -f \"x\" -l 1 -s 16 --show-tags",
+                patterns=[
+                    "0x[0-9A-fa-f]+00: 0x0+ \(tag: 0x0\)\n"
+                    "0x[0-9A-fa-f]+10: 0x0+ \(tag: 0x1\)"
+                    ])
+
+        # If bytes per line is > granule size then you get multiple tags
+        # per line.
+        self.expect("memory read mte_buf mte_buf+32 -f \"x\" -l 1 -s 32 --show-tags",
+                patterns=[
+                    "0x[0-9A-fa-f]+00: 0x0+ \(tags: 0x0 0x1\)\n"
+                    ])
+
+        # Reading half a granule still shows you the tag for that granule
+        self.expect("memory read mte_buf mte_buf+8 -f \"x\" -l 1 -s 8 --show-tags",
+                patterns=[
+                    "0x[0-9A-fa-f]+00: 0x0+ \(tag: 0x0\)\n"
+                    ])
+
+        # We can read a whole number of granules but split them over more lines
+        # than there are granules. Tags are shown repeated for each applicable line.
+        self.expect("memory read mte_buf+32 mte_buf+64 -f \"x\" -l 1 -s 8 --show-tags",
+                patterns=[
+                    "0x[0-9A-fa-f]+20: 0x0+ \(tag: 0x2\)\n"
+                    "0x[0-9A-fa-f]+28: 0x0+ \(tag: 0x2\)\n"
+                    "0x[0-9A-fa-f]+30: 0x0+ \(tag: 0x3\)\n"
+                    "0x[0-9A-fa-f]+38: 0x0+ \(tag: 0x3\)"
+                    ])
+
+        # Also works if we misalign the start address. Note the first tag is shown
+        # only once here and we have a new tag on the last line.
+        # (bytes per line == the misalignment here)
+        self.expect("memory read mte_buf+32+8 mte_buf+64+8 -f \"x\" -l 1 -s 8 --show-tags",
+                patterns=[
+                    "0x[0-9A-fa-f]+28: 0x0+ \(tag: 0x2\)\n"
+                    "0x[0-9A-fa-f]+30: 0x0+ \(tag: 0x3\)\n"
+                    "0x[0-9A-fa-f]+38: 0x0+ \(tag: 0x3\)\n"
+                    "0x[0-9A-fa-f]+40: 0x0+ \(tag: 0x4\)"
+                    ])
+
+        # We can do the same thing but where the misaligment isn't equal to
+        # bytes per line. This time, some lines cover multiple granules and
+        # so show multiple tags.
+        self.expect("memory read mte_buf+32+4 mte_buf+64+4 -f \"x\" -l 1 -s 8 --show-tags",
+                patterns=[
+                    "0x[0-9A-fa-f]+24: 0x0+ \(tag: 0x2\)\n"
+                    "0x[0-9A-fa-f]+2c: 0x0+ \(tags: 0x2 0x3\)\n"
+                    "0x[0-9A-fa-f]+34: 0x0+ \(tag: 0x3\)\n"
+                    "0x[0-9A-fa-f]+3c: 0x0+ \(tags: 0x3 0x4\)"
+                    ])
+
+        # If you read a range that includes non tagged areas those areas
+        # simply aren't annotated.
+
+        # Initial part of range is untagged
+        self.expect("memory read mte_buf-16 mte_buf+32 -f \"x\" -l 1 -s 16 --show-tags",
+                patterns=[
+                    "0x[0-9A-fa-f]+f0: 0x0+\n"
+                    "0x[0-9A-fa-f]+00: 0x0+ \(tag: 0x0\)\n"
+                    "0x[0-9A-fa-f]+10: 0x0+ \(tag: 0x1\)"
+                    ])
+
+        # End of range is untagged
+        self.expect("memory read mte_buf+page_size-16 mte_buf+page_size+16 -f \"x\" -l 1 -s 16 --show-tags",
+                patterns=[
+                    "0x[0-9A-fa-f]+f0: 0x0+ \(tag: 0xf\)\n"
+                    "0x[0-9A-fa-f]+00: 0x0+"
+                    ])
+
+        # The smallest MTE range we can get is a single page so we just check
+        # parts of this result. Where we read from before the tagged page to after it.
+        # Add --force here because we're reading just over 4k.
+        self.expect(
+                "memory read mte_read_only-16 mte_read_only+page_size+16 -f \"x\" -l 1 -s 16 --force --show-tags",
+                patterns=[
+                    "0x[0-9A-fa-f]+f0: 0x0+\n"
+                    "0x[0-9A-fa-f]+00: 0x0+ \(tag: 0x0\)\n",
+                    "0x[0-9A-fa-f]+f0: 0x0+ \(tag: 0x0\)\n"
+                    "0x[0-9A-fa-f]+00: 0x0+"
+                    ])
+
+        # Some parts of a line might be tagged and others untagged.
+        # <no tag> is shown in where the tag would be, to keep the order intact.
+        self.expect("memory read mte_buf-16 mte_buf+32 -f \"x\" -l 1 -s 32 --show-tags",
+                patterns=["0x[0-9A-fa-f]+f0: 0x0+ \(tags: <no tag> 0x0\)"])
+        self.expect(
+                "memory read mte_read_only+page_size-16 mte_read_only+page_size+16 -f \"x\" -l 1 -s 32  --show-tags",
+                patterns=["0x[0-9A-fa-f]+f0: 0x0+ \(tags: 0x0 <no tag>\)"])
+
+        # Here the start address is unaligned so we cover 3 granules instead of 2
+        self.expect("memory read mte_buf-16+4 mte_buf+32+4 -f \"x\" -l 1 -s 32 --show-tags",
+                patterns=["0x[0-9A-fa-f]+f4: 0x0+ \(tags: <no tag> 0x0 0x1\)"])
+        self.expect(
+                "memory read mte_read_only+page_size-16+4 mte_read_only+page_size+16+4 -f \"x\" -l 1 -s 32 --show-tags",
+                patterns=["0x[0-9A-fa-f]+f4: 0x0+ \(tags: 0x0 <no tag> <no tag>\)"])
+
+        # Some formats call DumpDataExtractor multiple times,
+        # check that those print tags only once per line.
+        self.expect("memory read mte_buf mte_buf+32 -f \"x\" --show-tags",
+                patterns=["0x[0-9A-fa-f]+00: 0x0+ 0x0+ 0x0+ 0x0+ \(tag: 0x0\)\n",
+                          "0x[0-9A-fa-f]+10: 0x0+ 0x0+ 0x0+ 0x0+ \(tag: 0x1\)"])
+
+        self.expect("memory read mte_buf mte_buf+32 -f \"bytes with ASCII\" --show-tags",
+                patterns=["0x[0-9A-fa-f]+00: (00 ){16} \.{16} \(tag: 0x0\)\n",
+                          "0x[0-9A-fa-f]+10: (00 ){16} \.{16} \(tag: 0x1\)"])
+
+        self.expect("memory read mte_buf mte_buf+32 -f \"uint8_t[]\" -s 16 -l 1 --show-tags",
+                patterns=["0x[0-9A-Fa-f]+00: \{(0x00 ){15}0x00\} \(tag: 0x0\)\n"
+                          "0x[0-9A-Fa-f]+10: \{(0x00 ){15}0x00\} \(tag: 0x1\)"])
Index: lldb/source/Core/DumpDataExtractor.cpp
===================================================================
--- lldb/source/Core/DumpDataExtractor.cpp
+++ lldb/source/Core/DumpDataExtractor.cpp
@@ -17,11 +17,14 @@
 #include "lldb/Target/ABI.h"
 #include "lldb/Target/ExecutionContext.h"
 #include "lldb/Target/ExecutionContextScope.h"
+#include "lldb/Target/MemoryRegionInfo.h"
+#include "lldb/Target/MemoryTagManager.h"
 #include "lldb/Target/Process.h"
 #include "lldb/Target/SectionLoadList.h"
 #include "lldb/Target/Target.h"
 #include "lldb/Utility/DataExtractor.h"
 #include "lldb/Utility/Log.h"
+#include "lldb/Utility/MemoryTagMap.h"
 #include "lldb/Utility/Stream.h"
 
 #include "llvm/ADT/APFloat.h"
@@ -253,6 +256,85 @@
   ss << f;
 }
 
+static llvm::Optional<MemoryTagMap>
+GetMemoryTags(lldb::addr_t addr, size_t length,
+              ExecutionContextScope *exe_scope) {
+  assert(addr != LLDB_INVALID_ADDRESS);
+
+  if (!exe_scope)
+    return llvm::None;
+
+  TargetSP target_sp = exe_scope->CalculateTarget();
+  if (!target_sp)
+    return llvm::None;
+
+  ProcessSP process_sp = target_sp->CalculateProcess();
+  if (!process_sp)
+    return llvm::None;
+
+  llvm::Expected<const MemoryTagManager *> tag_manager_or_err =
+      process_sp->GetMemoryTagManager();
+  if (!tag_manager_or_err) {
+    llvm::consumeError(tag_manager_or_err.takeError());
+    return llvm::None;
+  }
+
+  MemoryRegionInfos memory_regions;
+  // Don't check return status, list will be just empty if an error happened.
+  process_sp->GetMemoryRegions(memory_regions);
+
+  llvm::Expected<std::vector<MemoryTagManager::TagRange>> tagged_ranges_or_err =
+      (*tag_manager_or_err)
+          ->MakeTaggedRanges(addr, addr + length, memory_regions);
+  // Here we know that our range will not be inverted but we must still check
+  // for an error.
+  if (!tagged_ranges_or_err) {
+    llvm::consumeError(tagged_ranges_or_err.takeError());
+    return llvm::None;
+  }
+  if (tagged_ranges_or_err->empty())
+    return llvm::None;
+
+  MemoryTagMap memory_tag_map(*tag_manager_or_err);
+  for (const MemoryTagManager::TagRange &range : *tagged_ranges_or_err) {
+    llvm::Expected<std::vector<lldb::addr_t>> tags_or_err =
+        process_sp->ReadMemoryTags(range.GetRangeBase(), range.GetByteSize());
+
+    if (tags_or_err)
+      memory_tag_map.InsertTags(range.GetRangeBase(), *tags_or_err);
+    else
+      llvm::consumeError(tags_or_err.takeError());
+  }
+
+  if (memory_tag_map.empty())
+    return llvm::None;
+
+  return memory_tag_map;
+}
+
+static void
+printMemoryTags(const DataExtractor &DE, Stream *s, lldb::addr_t addr,
+                size_t len,
+                const llvm::Optional<MemoryTagMap> &memory_tag_map) {
+  std::vector<llvm::Optional<lldb::addr_t>> tags =
+      memory_tag_map->GetTags(addr, len);
+
+  // Only print if there is at least one tag for this line
+  if (tags.empty())
+    return;
+
+  s->Printf(" (tag%s:", tags.size() > 1 ? "s" : "");
+  // Some granules may not be tagged but print something for them
+  // so that the ordering remains intact.
+  for (auto tag : tags) {
+    if (tag)
+      s->Printf(" 0x%" PRIx64, *tag);
+    else
+      s->PutCString(" <no tag>");
+  }
+  s->PutCString(")");
+}
+
 lldb::offset_t lldb_private::DumpDataExtractor(
     const DataExtractor &DE, Stream *s, offset_t start_offset,
     lldb::Format item_format, size_t item_byte_size, size_t item_count,
@@ -261,7 +343,7 @@
                               // non-zero, the value is a bitfield
     uint32_t item_bit_offset, // If "item_bit_size" is non-zero, this is the
                               // shift amount to apply to a bitfield
-    ExecutionContextScope *exe_scope) {
+    ExecutionContextScope *exe_scope, bool show_memory_tags) {
   if (s == nullptr)
     return start_offset;
 
@@ -272,6 +354,11 @@
 
   offset_t offset = start_offset;
 
+  llvm::Optional<MemoryTagMap> memory_tag_map = llvm::None;
+  if (show_memory_tags && base_addr != LLDB_INVALID_ADDRESS)
+    memory_tag_map =
+        GetMemoryTags(base_addr, DE.GetByteSize() - offset, exe_scope);
+
   if (item_format == eFormatInstruction)
     return DumpInstructions(DE, s, exe_scope, start_offset, base_addr,
                             item_count);
@@ -283,7 +370,10 @@
   lldb::offset_t line_start_offset = start_offset;
   for (uint32_t count = 0; DE.ValidOffset(offset) && count < item_count;
        ++count) {
+    // If we are at the beginning or end of a line
+    // Note that the last line is handled outside this for loop.
     if ((count % num_per_line) == 0) {
+      // If we are at the end of a line
       if (count > 0) {
         if (item_format == eFormatBytesWithASCII &&
             offset > line_start_offset) {
@@ -295,6 +385,15 @@
                             offset - line_start_offset, SIZE_MAX,
                             LLDB_INVALID_ADDRESS, 0, 0);
         }
+
+        if (base_addr != LLDB_INVALID_ADDRESS && memory_tag_map) {
+          size_t line_len = offset - line_start_offset;
+          lldb::addr_t line_base =
+              base_addr +
+              (offset - start_offset - line_len) / DE.getTargetByteSize();
+          printMemoryTags(DE, s, line_base, line_len, memory_tag_map);
+        }
+
         s->EOL();
       }
       if (base_addr != LLDB_INVALID_ADDRESS)
@@ -796,14 +895,28 @@
     }
   }
 
-  if (item_format == eFormatBytesWithASCII && offset > line_start_offset) {
-    s->Printf("%*s", static_cast<int>(
-                         (num_per_line - (offset - line_start_offset)) * 3 + 2),
-              "");
-    DumpDataExtractor(DE, s, line_start_offset, eFormatCharPrintable, 1,
-                      offset - line_start_offset, SIZE_MAX,
-                      LLDB_INVALID_ADDRESS, 0, 0);
+  // If anything was printed we want to catch the end of the last line.
+  // Since we will exit the for loop above before we get a chance to append to
+  // it normally.
+  if (offset > line_start_offset) {
+    if (item_format == eFormatBytesWithASCII) {
+      s->Printf("%*s",
+                static_cast<int>(
+                    (num_per_line - (offset - line_start_offset)) * 3 + 2),
+                "");
+      DumpDataExtractor(DE, s, line_start_offset, eFormatCharPrintable, 1,
+                        offset - line_start_offset, SIZE_MAX,
+                        LLDB_INVALID_ADDRESS, 0, 0);
+    }
+
+    if (base_addr != LLDB_INVALID_ADDRESS && memory_tag_map) {
+      size_t line_len = offset - line_start_offset;
+      lldb::addr_t line_base = base_addr + (offset - start_offset - line_len) /
+                                               DE.getTargetByteSize();
+      printMemoryTags(DE, s, line_base, line_len, memory_tag_map);
+    }
   }
+
   return offset; // Return the offset at which we ended up
 }
 
Index: lldb/source/Commands/Options.td
===================================================================
--- lldb/source/Commands/Options.td
+++ lldb/source/Commands/Options.td
@@ -489,6 +489,8 @@
     "display data.">;
   def memory_read_force : Option<"force", "r">, Groups<[1,2,3]>,
     Desc<"Necessary if reading over target.max-memory-read-size bytes.">;
+  def memory_read_show_tags : Option<"show-tags", "\\x01">, Group<1>,
+    Desc<"Include memory tags in output (does not apply to binary output).">;
 }
 
 let Command = "memory find" in {
Index: lldb/source/Commands/CommandObjectMemory.cpp
===================================================================
--- lldb/source/Commands/CommandObjectMemory.cpp
+++ lldb/source/Commands/CommandObjectMemory.cpp
@@ -90,6 +90,10 @@
       error = m_offset.SetValueFromString(option_value);
       break;
 
+    case '\x01':
+      m_show_tags = true;
+      break;
+
     default:
       llvm_unreachable("Unimplemented option");
     }
@@ -103,6 +107,7 @@
     m_force = false;
     m_offset.Clear();
     m_language_for_type.Clear();
+    m_show_tags = false;
   }
 
   Status FinalizeSettings(Target *target, OptionGroupFormat &format_options) {
@@ -276,6 +281,7 @@
   bool m_force;
   OptionValueUInt64 m_offset;
   OptionValueLanguage m_language_for_type;
+  bool m_show_tags = false;
 };
 
 // Read memory from the inferior process
@@ -854,7 +860,7 @@
     size_t bytes_dumped = DumpDataExtractor(
         data, output_stream_p, 0, format, item_byte_size, item_count,
         num_per_line / target->GetArchitecture().GetDataByteSize(), addr, 0, 0,
-        exe_scope);
+        exe_scope, m_memory_options.m_show_tags);
     m_next_addr = addr + bytes_dumped;
     output_stream_p->EOL();
     return true;
Index: lldb/include/lldb/Core/DumpDataExtractor.h
===================================================================
--- lldb/include/lldb/Core/DumpDataExtractor.h
+++ lldb/include/lldb/Core/DumpDataExtractor.h
@@ -76,6 +76,15 @@
 ///     same integer value. If the items being displayed are not
 ///     bitfields, this value should be zero.
 ///
+/// \param[in] exe_scope
+///     If provided, this will be used to lookup language specific
+///     information, address information and memory tags.
+///     (if they are requested by the other options)
+///
+/// \param[in] show_memory_tags
+///     If exe_scope and base_addr are valid, include memory tags
+///     in the output. This does not apply to certain formats.
+///
 /// \return
 ///     The offset at which dumping ended.
 lldb::offset_t
@@ -83,7 +92,8 @@
                   lldb::Format item_format, size_t item_byte_size,
                   size_t item_count, size_t num_per_line, uint64_t base_addr,
                   uint32_t item_bit_size, uint32_t item_bit_offset,
-                  ExecutionContextScope *exe_scope = nullptr);
+                  ExecutionContextScope *exe_scope = nullptr,
+                  bool show_memory_tags = false);
 
 void DumpHexBytes(Stream *s, const void *src, size_t src_len,
                   uint32_t bytes_per_line, lldb::addr_t base_addr);

_______________________________________________
lldb-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

[Lldb-commits] [PATCH] D107140: [lldb] Show memory tags in memory read output

Reply via email to