On Wed, Mar 11, 2026 at 09:46:57PM -0700, Randy Dunlap wrote:
> 
> 
> On 3/10/26 1:15 PM, Mukesh Ojha wrote:
> > diff --git a/Documentation/dev-tools/meminspect.rst b/Documentation/dev-tools/meminspect.rst
> > new file mode 100644
> > index 000000000000..d0c7222bdcd7
> > --- /dev/null
> > +++ b/Documentation/dev-tools/meminspect.rst
> > @@ -0,0 +1,144 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +==========
> > +meminspect
> > +==========
> > +
> > +This document provides information about the meminspect feature.
> > +
> > +Overview
> > +========
> > +
> > +meminspect is a mechanism that allows the kernel to register a chunk of
> > +memory into a table, to be used at a later time for a specific
> > +inspection purpose like debugging, memory dumping or statistics.
> > +
> > +meminspect allows drivers to traverse the inspection table on demand,
> > +or to register a notifier to be called whenever a new entry is being added
> 
>   preferably...                                                is added
> 
> > +or removed.
> > +
> > +The reasoning for meminspect is also to minimize the required information
> > +in case of a kernel problem. For example a traditional debug method involves
> > +dumping the whole kernel memory and then inspecting it. Meminspect allows the
> > +users to select which memory is of interest, in order to help this specific
> > +use case in production, where memory and connectivity are limited.
> > +
> > +Although the kernel has multiple internal mechanisms, meminspect fits
> > +a particular model which is not covered by the others.
> > +
> > +meminspect Internals
> > +====================
> > +
> > +API
> > +---
> > +
> > +Static memory can be registered at compile time, by instructing the compiler
> > +to create a separate section with annotation info.
> > +For each such annotated memory (variables usually), a dedicated struct
> > +is being created with the required information.
> 
>    is created
> 
> > +To achieve this goal, some basic APIs are available:
> > +
> > +* MEMINSPECT_ENTRY(idx, sym, sz)
> > +  is the basic macro that takes an ID, the symbol, and a size.
> > +
> > +To make it easier, some wrappers are also defined
> > +
> > +* MEMINSPECT_SIMPLE_ENTRY(sym)
> > +  will use the dedicated MEMINSPECT_ID_##sym with a size equal to sizeof(sym)
> 
>      uses the dedicated
> 
> > +
> > +* MEMINSPECT_NAMED_ENTRY(name, sym)
> > +  will be a simple entry that has an id that cannot be derived from the sym,
> 
>      is a simple entry that
> 
> > +  so a name has to be provided
> > +
> > +* MEMINSPECT_AREA_ENTRY(sym, sz)
> > +  this will register sym, but with the size given as sz, useful for e.g.
> 
>      registers sym, but with
> 
> > +  arrays which do not have a fixed size at compile time.
> > +
> > +For dynamically allocated memory, or for other cases, the following APIs
> > +are being defined::
> 
>    are defined::
> 
> > +
> > +  meminspect_register_id_pa(enum meminspect_uid id, phys_addr_t zone,
> > +                            size_t size, unsigned int type);
> > +
> > +which takes the ID and the physical address.
> > +
> > +Similarly there are variations:
> > +
> > + * meminspect_register_pa() omits the ID
> > + * meminspect_register_id_va() requires the ID but takes a virtual address
> > + * meminspect_register_va() omits the ID and requires a virtual address
> > +
> > +If the ID is not given, the next avialable dynamic ID is allocated.
> 
>                                     available
> 
> > +
> > +To unregister a dynamic entry, some APIs are being defined:
> 
>                                             are defined:
> 
> > + * meminspect_unregister_pa(phys_addr_t zone, size_t size);
> > + * meminspect_unregister_id(enum meminspect_uid id);
> > + * meminspect_unregister_va(va, size);
> > +
> > +All of the above have a lock variant that ensures the lock on the table
> > +is taken.
> > +
> > +
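
For readers following along, the ID handling described above can be modelled
in userspace roughly as follows. This is only a sketch under assumptions: the
model_* names, the ID layout and the table capacity are hypothetical and are
not taken from the patch, a single address stands in for both the physical
and virtual variants, and no locking is shown. It also assumes that a freed
dynamic ID may be handed out again, which the text does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ID layout: fixed IDs first, then a dynamic range. */
#define MODEL_ID_BANNER   0   /* stands in for a fixed, named ID */
#define MODEL_ID_DYNAMIC  2   /* first dynamically allocated ID */
#define MODEL_ID_MAX      5   /* table capacity */

struct model_entry {
	uintptr_t addr;   /* start of the region */
	size_t size;      /* 0 marks a free slot */
};

static struct model_entry model_table[MODEL_ID_MAX];

/* Return the slot for an ID the caller has already range-checked. */
static struct model_entry *model_slot(int id)
{
	assert(id >= 0 && id < MODEL_ID_MAX);
	return &model_table[id];
}

/* Register a region under a caller-chosen ID. Returns 0, or -1 on error. */
static int model_register_id(int id, uintptr_t addr, size_t size)
{
	if (id < 0 || id >= MODEL_ID_MAX || size == 0)
		return -1;
	if (model_slot(id)->size != 0)
		return -1;	/* ID already in use */
	model_slot(id)->addr = addr;
	model_slot(id)->size = size;
	return 0;
}

/* Register without an ID: take the next free dynamic ID. Returns it, or -1. */
static int model_register(uintptr_t addr, size_t size)
{
	for (int id = MODEL_ID_DYNAMIC; id < MODEL_ID_MAX; id++) {
		if (model_slot(id)->size == 0)
			return model_register_id(id, addr, size) == 0 ? id : -1;
	}
	return -1;	/* dynamic range exhausted */
}

/* Drop the entry stored under an ID. Returns 0, or -1 if it was not in use. */
static int model_unregister_id(int id)
{
	if (id < 0 || id >= MODEL_ID_MAX || model_slot(id)->size == 0)
		return -1;
	model_slot(id)->size = 0;
	return 0;
}
```

In this model a caller with a well-known ID uses model_register_id(), while
a caller without one uses model_register() and keeps the returned ID so it
can unregister later.
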
> > +meminspect drivers
> > +------------------
> > +
> > +Drivers are free to traverse the table by using a dedicated function::
> > +
> > + meminspect_traverse(void *priv, MEMINSPECT_ITERATOR_CB cb)
> > +
> > +The callback will be called for each entry in the table.
> 
> maybe           is called
> 
> > +
> > +Drivers can also register a notifier with meminspect_notifier_register()
> > +and unregister with meminspect_notifier_unregister() to be called when a new
> > +entry is being added or removed.
> 
>          is added or removed.
> 
> > +
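
The two driver-side patterns described above, walking the table on demand and
being notified on changes, can be sketched in userspace as below. Everything
named sketch_* is hypothetical: the callback signatures, the single-notifier
slot and the entry layout are assumptions made for illustration, and the real
MEMINSPECT_ITERATOR_CB and notifier interfaces may well look different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ENTRIES 4

struct sketch_entry {
	int id;
	uintptr_t addr;
	size_t size;      /* 0 marks a free slot */
};

/* Assumed callback shapes; not taken from the patch. */
typedef void (*sketch_iter_cb)(void *priv, const struct sketch_entry *e);
typedef void (*sketch_notify_cb)(void *priv, const struct sketch_entry *e,
				 int added);

static struct sketch_entry sketch_table[SKETCH_MAX_ENTRIES];
static sketch_notify_cb sketch_notifier;   /* one notifier, for brevity */
static void *sketch_notifier_priv;

static void sketch_notifier_register(sketch_notify_cb cb, void *priv)
{
	sketch_notifier = cb;
	sketch_notifier_priv = priv;
}

static void sketch_notifier_unregister(void)
{
	sketch_notifier = NULL;
	sketch_notifier_priv = NULL;
}

/* Add an entry and tell the notifier, if any. Returns 0, or -1 on error. */
static int sketch_add(int id, uintptr_t addr, size_t size)
{
	if (id < 0 || id >= SKETCH_MAX_ENTRIES || size == 0 ||
	    sketch_table[id].size != 0)
		return -1;
	sketch_table[id] = (struct sketch_entry){ .id = id, .addr = addr,
						  .size = size };
	if (sketch_notifier)
		sketch_notifier(sketch_notifier_priv, &sketch_table[id], 1);
	return 0;
}

/* Remove an entry, notifying before the slot is cleared. */
static int sketch_remove(int id)
{
	if (id < 0 || id >= SKETCH_MAX_ENTRIES || sketch_table[id].size == 0)
		return -1;
	if (sketch_notifier)
		sketch_notifier(sketch_notifier_priv, &sketch_table[id], 0);
	sketch_table[id].size = 0;
	return 0;
}

/* Call cb once for every entry currently in the table. */
static void sketch_traverse(void *priv, sketch_iter_cb cb)
{
	assert(cb != NULL);
	for (int i = 0; i < SKETCH_MAX_ENTRIES; i++) {
		if (sketch_table[i].size != 0)
			cb(priv, &sketch_table[i]);
	}
}

/* Example iterator: accumulate the total registered size into *priv. */
static void sketch_sum_sizes(void *priv, const struct sketch_entry *e)
{
	*(size_t *)priv += e->size;
}

/* Example notifier: count add/remove events in *priv. */
static void sketch_count_events(void *priv, const struct sketch_entry *e,
				int added)
{
	(void)e;
	(void)added;
	++*(int *)priv;
}
```

The priv pointer is what lets a driver carry its own state (here a running
total or an event counter) into the callback without globals.
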
> > +Data structures
> > +---------------
> > +
> > +The regions are being stored in a simple fixed size array. It avoids
> 
>                are stored
> 
> > +memory allocation overhead. This is not performance critical nor does
> > +allocating a few hundred entries create a memory consumption problem.
> > +
> > +The static variables registered into meminspect are being annotated into
> 
>                                                    are annotated into
> 
> > +a dedicated .inspect_table memory section. This is then walked by meminspect
> > +at a later time and each variable is then copied to the whole inspect table.
> > +
> > +meminspect Initialization
> > +-------------------------
> > +
> > +At any time, meminspect will be ready to accept region registration
> 
>                 meminspect is ready
> 
> > +from any part of the kernel. The table does not require any initialization.
> > +In case CONFIG_CRASH_DUMP is enabled, meminspect will create an ELF header
> 
>                                          meminspect creates an ELF header
> 
> > +corresponding to a core dump image, in which each region is added as a
> > +program header. In this scenario, the first region is this ELF header, and
> > +the second region is the vmcoreinfo ELF note.
> > +By using this mechanism, all the meminspect table, if dumped, can be
> > +concatenated to obtain a core image that is loadable with the `crash` tool.
> > +
> > +meminspect example
> > +==================
> > +
> > +A simple scenario for meminspect is the following:
> > +The kernel registers the linux_banner variable into meminspect with
> > +a simple annotation like::
> > +
> > +  MEMINSPECT_SIMPLE_ENTRY(linux_banner);
> > +
> > +The meminspect late initcall will parse the compilation time created table
> 
> maybe...                                       compile-time
> 
> > +and copy the entry information into the inspection table.
> > +At a later point, any interested driver can call the traverse function to
> > +find out all entries in the table.
> > +A specific driver will then note into a specific table the address of the
> > +banner and the size of it.
> > +The specific table is then written to a shared memory area that can be
> > +read by upper level firmware.
> > +When the kernel freezes (hypothetically), the kernel will no longer feed
> > +the watchdog. The watchdog will trigger a higher exception level interrupt
> > +which will be handled by the upper level firmware. This firmware will then
> > +read the shared memory table and find an entry with the start and size of
> > +the banner. It will then copy it for debugging purpose. The upper level
> > +firmware will then be able to provide useful debugging information,
> > +like in this example, the banner.
> > +
> > +As seen here, meminspect facilitates the interaction between the kernel
> > +and a specific firmware.

Thanks for your time and review. I have applied the changes to both the
documentation and the Kconfig for the next version.

> 
> 
> -- 
> ~Randy
> 

-- 
-Mukesh Ojha
