================
@@ -0,0 +1,436 @@
+Data Formatters
+===============
+
+This page is an introduction to the design of the LLDB data formatters
+subsystem. The intended target audience is people interested in understanding
+or modifying the formatters themselves rather than writing a specific data
+formatter. For this latter purpose, the user documentation about formatters is
+the main relevant document which one should refer to.
+
+This page also highlights some open areas for improvement to the general
+subsystem, and more evolutions not anticipated here are certainly possible.
+
+Overview
+--------
+
+The LLDB data formatters subsystem is used to allow the debugger as well as the
+end-users to customize the way their variables look upon inspection in the user
+interface (be it the command line tool, or one of the several GUIs that are
+backed by LLDB).
+
+To this aim, they are hooked into the ValueObjects model, in order to provide
+entry points through which such customization questions can be answered. For
+example, what format should this number be printed as? How many child elements
+does this ``std::vector`` have?
+
+The architecture of the subsystem is layered, with the highest level layer
+being the user visible interaction features (e.g. the ``type ***`` commands,
+the SB classes, ...). Other layers of interest that will be analyzed in this
+document include
+
+* Classes implementing individual data formatter types
+* Classes implementing formatters navigation, discovery and categorization
+* The ``FormatManager`` layer
+* The ``DataVisualization`` layer
+* The SWIG <> LLDB communication layer
+
+Data Formatter Types
+--------------------
+
+As described in the user documentation, there are four types of formatters:
+
+* Formats
+* Summaries
+* Filters
+* Synthetic children
+
+Formatters have descriptor classes, ``Type*Impl``, which contain at least a
+"Flags" nested object, which contains both rules to be used by the matching
+algorithm (e.g. should the formatter for type Foo apply to a Foo*?) and rules to
+be used by the formatter itself (e.g. is this summary a oneliner?).
+
+Individual formatter descriptor classes then also contain data items useful to
+them for performing their functionality. For instance ``TypeFormatImpl``
+(backing formats) contains an ``lldb::Format`` that is the format to then be
+applied were this formatter to be selected. Upon issuing a ``type format add``
+a new ``TypeFormatImpl`` is created that wraps the user-specified format, and
+matching options:
+
+::
+
+   entry.reset(new TypeFormatImpl(
+       format, TypeFormatImpl::Flags()
+                   .SetCascades(m_command_options.m_cascade)
+                   .SetSkipPointers(m_command_options.m_skip_pointers)
+                   .SetSkipReferences(m_command_options.m_skip_references)));
+
+
+While formats are fairly simple and only implemented by one class, the other
+formatter types are backed by a class hierarchy.
+
+Summaries, for instance, can exist in one of three "flavors":
+
+* Summary strings
+* Python script
+* Native C++
+
+The base class for summaries, ``TypeSummaryImpl``, is a pure virtual class that
+wraps, again, the Flags, and exports among others:
+
+::
+
+   virtual bool FormatObject (ValueObject *valobj, std::string& dest) = 0;
+
+
+This is the core entry point, which allows subclasses to specify their mode of
+operation.
+
+``StringSummaryFormat``, which is the class that implements summary strings,
+does a check as to whether the summary is a one-liner, and if not, then uses
+its stored summary string to call into ``Debugger::FormatPrompt``, and obtain a
+string back, which it returns in ``dest`` as the resulting summary.
+
+For a Python summary, implemented in ``ScriptSummaryFormat``,
+``FormatObject()`` calls into the ``ScriptInterpreter`` which is supposed to
+hold the knowledge on how to bridge back and forth with the scripting language
+(Python in the case of LLDB) in order to produce a valid string. Implementors
+of new ``ScriptInterpreters`` for other languages are expected to provide a
+``GetScriptedSummary()`` entrypoint for this purpose, if they desire to allow
+users to provide formatters in the new language.
+
+Lastly, C++ summaries (``CXXFunctionSummaryFormat``), wrap a function pointer
+and call into it to execute their duty. It should be noted that there are no
+facilities for users to interact with C++ formatters, and as such they are
+extremely opaque, effectively being a thin wrapper between plain function
+pointers and the LLDB formatters subsystem.
+
+Also, dynamic loading of C++ formatters in LLDB is currently not implemented,
+and as such it is safe and reasonable for these formatters to deal with
+internal ``ValueObjects`` instances instead of public ``SBValue`` objects.
+
+An interesting data point is that summaries are expected to be stateless. While
+at the Python layer they are handed an ``SBValue`` (since nothing else could be
+visible for scripts), it is not expected that the ``SBValue`` should be cached
+and reused - any and all caching occurs on the LLDB side, completely
+transparent to the formatter itself.
+
+The design of synthetic children is somewhat more intricate, due to them being
+stateful objects. The core idea of the design is that synthetic children act
+like a two-tier model, in which there is a backend dataset (the underlying
+unformatted ``ValueObject``), and a higher level view (frontend) which vends
+the computed representation.
+
+To implement a new type of synthetic children one would implement a subclass of
+``SyntheticChildren``, which, akin to ``TypeFormatImpl``, contains Flags for
+matching, and data items to be used for formatting. For instance,
+``TypeFilterImpl`` (which implements filters), stores the list of expression
+paths of the children to be displayed.
+
+Filters are themselves synthetic children. Since all they do is provide child
+values for a ``ValueObject``, it does not truly matter whether these come from the
+real set of children or are crafted through some intricate algorithm. As such,
+they perfectly fit within the realm of synthetic children and are only shown as
+separate entities for user friendliness (to a user, picking a subset of
+elements to be shown with relative ease is a valuable task, and they should not
+be concerned with writing scripts to do so).
+
+Once the descriptor of the synthetic children has been coded, in order to hook
+it up, one has to implement a subclass of ``SyntheticChildrenFrontEnd``. For a
+given type of synthetic children, there is a deep coupling with the matching
+front-end class, given that the front-end usually needs data stored in the
+descriptor (e.g. a filter needs the list of child elements).
+
+The front-end answers the interesting questions that are the true raison d'être
+of synthetic children:
+
+::
+
+   virtual size_t CalculateNumChildren () = 0;
+   virtual lldb::ValueObjectSP GetChildAtIndex (size_t idx) = 0;
+   virtual size_t GetIndexOfChildWithName (const ConstString &name) = 0;
+   virtual bool Update () = 0;
+   virtual bool MightHaveChildren () = 0;
+
+Synthetic children providers (their front-ends) will be queried by LLDB for a
+number of children, and then for each of them as necessary, they should be
+prepared to return a ``ValueObject`` describing the child. They might also be
+asked to provide a name-to-index mapping (e.g. to allow LLDB to resolve queries
+like ``myFoo.myChild``).
+
+``Update()`` and ``MightHaveChildren()`` are described in the user
+documentation, and they mostly serve bookkeeping purposes.
+
+LLDB provides three kinds of synthetic children: filters, scripted synthetics,
+and the native C++ providers. Filters are implemented by
+``TypeFilterImpl::FrontEnd``.
+
+Scripted synthetics are implemented by ``ScriptedSyntheticChildren::FrontEnd``,
+plus a set of callbacks provided by the ``ScriptInterpreter`` infrastructure to
+allow LLDB to pass the front-end queries down to the scripting languages.
+
+As for C++ native synthetics, there is a ``CXXSyntheticChildren``, but no
+corresponding ``FrontEnd`` class. The reason for this design is that
+``CXXSyntheticChildren`` stores a callback to a creator function, which is
+responsible for providing a ``FrontEnd``. Each individual formatter (e.g.
+``LibstdcppMapIteratorSyntheticFrontEnd``) is a standalone frontend, and once
+created retains to relation to its underlying ``SyntheticChildren`` object
----------------
walter-erquinigo wrote:

```suggestion
created retains to relation to its underlying ``SyntheticChildren`` object.
```

https://github.com/llvm/llvm-project/pull/66527
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits