Hi,

https://api.libreoffice.org/docs/tools.html says:
"'unoidl-write' is the new UNOIDL compiler, replacing the former idlc and regmerge tools."
So are idlc and regmerge still relevant?

Best,

Régis Perdreau

On Mon, 16 Jun 2025 at 13:38, Regis Perdreau <[email protected]> wrote:

> Hi all,
>
> Thanks for this work.
>
> Best,
>
> Régis Perdreau
>
> On Mon, 16 Jun 2025 at 13:21, Devansh Varshney <[email protected]> wrote:
>
>> Hi everyone,
>>
>> This week has been one of the best learning experiences for me,
>> especially digging into the "behind-the-scenes" of LibreOffice's UNO
>> APIs.
>>
>> My initial work (Gerrit 185362
>> <https://gerrit.libreoffice.org/c/core/+/185362>) was a first step, but
>> feedback from my mentors in our meetings provided a crucial directive:
>> first, figure out how to get the data. Before we can build a great
>> auto-completion system, we need a deep, proven understanding of where
>> all the information (for BASIC, UNO, ScriptForge, etc.) lives and how to
>> access it programmatically.
>>
>> This led to a fascinating dive into the UNO data pipeline.
>>
>> *Understanding the UNO Data Pipeline: From IDL to Runtime*
>>
>> For anyone curious about how UNO works under the hood, here's a
>> breakdown of what I've learned. It's a pipeline that turns
>> human-readable API definitions into an efficient system the application
>> uses at runtime.
>>
>> *IDL (Interface Definition Language):* This is the source of truth for
>> all UNO APIs. These .idl text files define every service, interface,
>> method, property, struct, and enum.
>> Locations: *udkapi/* (core types) and *offapi/* (office-specific types).
>>
>> *idlc & regmerge:* During the build, idlc (the IDL Compiler) compiles
>> .idl files into intermediate binary .urd files. Then, regmerge combines
>> these into .rdb (Registry Database) files.
>>
>> *.rdb Files:* These are the optimized binary databases that LibreOffice
>> loads at startup. Key files include types.rdb (from udkapi.rdb etc.),
>> services.rdb, and offapi.rdb.
>> An .rdb file is an installation artifact, not a source file, which
>> clarified my initial search!
>>
>> *theCoreReflection:* At runtime, this powerful UNO service provides
>> live, programmatic access to all the type information that was loaded
>> from the .rdb files.
>>
>> *regview Tool:* A command-line tool (registry/tools/regview.cxx)
>> designed to dump the contents of an .rdb file. My initial attempts to
>> use it were unsuccessful, which, along with mentor guidance, led us to
>> pivot our strategy.
>>
>> *SbUnoObject & XIntrospectionAccess:* The bridge in BASIC for
>> interacting with live UNO objects, using dynamic introspection to
>> discover their capabilities.
>>
>> *A simplified flow of this pipeline looks like this:*
>>
>> *.idl Files* --(idlc)--> *.urd Files* --(regmerge)--> *.rdb Files*
>> (Source of Truth)        (Binary intermediate)        (Loaded by LO Runtime)
>>                                                             |
>>                                                             v
>>                                                 <LO Runtime Type System>
>>                                            (Accessible via theCoreReflection)
>>
>> *regview Tool* --(Reads .rdb)--> <Textual Dump>
>>
>> *Understanding ScriptForge (wizards/source/scriptforge/)*
>>
>> I also looked into ScriptForge, which is crucial for modern BASIC
>> scripting.
>> https://gerrit.libreoffice.org/c/core/+/164867
>>
>> - *.xlb files* are XML manifests listing the modules of a library.
>> - *.xba files* are XML files, each holding the BASIC source of one
>>   module.
>> - *.pyi file* is a Python stub that provides type hints to Python IDEs
>>   for auto-completion. As Rafael Lima mentioned, this might be manually
>>   created, making it a great model for the kind of rich API definition
>>   we want to achieve for BASIC.
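
To make the .xlb point concrete, here is a minimal Python sketch that lists the module names found in a library manifest. It assumes the manifest uses the http://openoffice.org/2000/library namespace with library:element entries, as the script.xlb files in the source tree appear to. The embedded sample is an illustrative stand-in, not a copy of the real ScriptForge file, and list_modules is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of .xlb manifests (check against a real script.xlb).
LIBRARY_NS = "http://openoffice.org/2000/library"

# Illustrative stand-in for a script.xlb manifest, not the real file.
SAMPLE_XLB = """<library:library xmlns:library="http://openoffice.org/2000/library"
                 library:name="ScriptForge" library:readonly="false">
 <library:element library:name="SF_Array"/>
 <library:element library:name="SF_String"/>
</library:library>
"""


def list_modules(xlb_text):
    """Return (library name, [module names]) from manifest text."""
    root = ET.fromstring(xlb_text)
    name_attr = f"{{{LIBRARY_NS}}}name"
    element_tag = f"{{{LIBRARY_NS}}}element"
    modules = [el.get(name_attr) for el in root.iter(element_tag)]
    return root.get(name_attr), modules


print(list_modules(SAMPLE_XLB))
# prints ('ScriptForge', ['SF_Array', 'SF_String'])
```

Each listed name would then correspond to one .xba file next to the manifest, which is where the BASIC source itself would be read from.
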
>>
>> *How its information becomes available:*
>>
>> *BASIC source (inside .xba files listed in .xlb)*
>>     |
>>     v (Loaded by BasicManager/StarBASIC)
>> *<SbModule objects with source code>*
>>     |
>>     v (Compiled by SbiParser)
>> *<SbMethod, SbxVariable symbols within the SbModule>*
>>
>> *--- Parallel path for Python tooling ---*
>> *.pyi file (wizards/source/scriptforge/python/scriptforge.pyi)*
>>     |
>>     v (Read by Python IDEs)
>> *<Type hints for Python auto-completion>*
>>
>> *From Static File Parsing to C++ PoCs*
>>
>> Given the complexities of parsing static RDB/IDL files directly, and
>> the clear guidance from Meeting 3, our immediate focus has shifted. The
>> new priority is to write C++ Proof-of-Concept (PoC) code to
>> programmatically gather data and get this code onto Gerrit for review.
>>
>> I'm very excited to share that the first two PoCs are complete.
>> Gerrit Patch: https://gerrit.libreoffice.org/c/core/+/186475
>> This patch contains the CppUnit tests for these experiments.
>>
>> *UNO Services and Memes - Why Context Comes First*
>>
>> For example, I’ve seen this happen a lot on social media. There’s a
>> meme going around, people are laughing, sharing it, reacting to it… and
>> then there’s always someone in the comments asking:
>> "What’s the context behind this?"
>>
>> I mean, I’ve done it too. Sometimes you just miss the reference; maybe
>> it’s from a movie, or some political moment, or even a viral soundbite.
>> Without the context, it’s just a picture or a clip. You don’t get why
>> it’s funny, why it hits.
>>
>> *And then someone replies and goes:*
>> "Oh, this is from Interstellar, that scene where Cooper watches years of
>> messages after time dilation."
>>
>> Now it starts to click. *That context sets the stage*.
>>
>> *Then maybe another reply adds:*
>> "Yeah, and the reason it’s funny here is because someone compared it to
>> missing one lecture and coming back to find the whole syllabus changed."
>>
>> So first you got the context, then someone gave the reference point
>> (say, the movie), and then you dove into the details: the exact scene,
>> the emotion, the punchline. That’s what makes it all land.
>>
>> And honestly, that’s how I see working with UNO services too.
>>
>> In our PoC, we first had to get the component context; otherwise we’re
>> just floating, not grounded in the current state of the app. Once we had
>> that, we could ask for something like
>> com.sun.star.reflection.CoreReflection, and only then could we start
>> introspecting the real details: interfaces, methods, enums, all the
>> building blocks.
>>
>> *It’s kind of beautiful how that maps:*
>> *Context* → *“Where am I?”*
>> *Service* → *“What am I working with?”*
>> *Introspection* → *“What can this thing do?”*
>>
>> And just like in memes, without context, the rest doesn’t mean much.
>> Funny enough, this whole idea of “context” is even a thing in frameworks
>> like React or Java. So maybe context is more universal than we think.
>>
>> *Summary of C++ Proof-of-Concepts (PoCs)*
>>
>> Here's a breakdown of the PoCs I've implemented in the Gerrit patch:
>>
>> *PoC 1: Listing All Available UNO Service Names*
>> *Concept:* Queries the *XMultiComponentFactory* (Service Manager) to get
>> all creatable UNO service names.
>> *Source:* comphelper/processfactory.hxx (getProcessServiceManager()).
>> *Task:*
>> - Get XComponentContext.
>> - Get XMultiComponentFactory.
>> - Call getAvailableServiceNames().
>> - Log each service name.
>> *Result:* Successfully dumped service names.
>>
>> *PoC 2: Introspecting Specific UNO Definitions via theCoreReflection*
>> *Concept:* *theCoreReflection* provides access to the complete in-memory
>> type information that LibreOffice loaded from its RDBs.
>> *Source:* com.sun.star.reflection.theCoreReflection, XIdlClass, etc.
>> (implementation in stoc/source/
>> <https://git.libreoffice.org/core/+/refs/heads/master/stoc>).
>> *Task:*
>> - Get theCoreReflection instance.
>> - For a list of key type names (XModel, XSpreadsheet, PropertyValue,
>>   etc.):
>>   - Call forName(sTypeName) to get its XIdlClass blueprint.
>>   - Dump all details: superclasses, methods (with full parameter info),
>>     properties, struct fields, and enum members.
>> *Result:* Extracted rich, detailed API definitions. This proves we can
>> get the data needed for Parameter Info and accurate dot-completion.
>>
>> https://gerrit.libreoffice.org/c/core/+/186475/4/basic/uno_available_services_cpp_dump.txt
>>
>> *Next Steps: Diving into BASIC Internals*
>>
>> With the UNO data access path validated, the next focus is on BASIC
>> itself.
>>
>> *PoC 3 (In Progress): The MsgBox Deep Dive*
>> My current task is to trace *MsgBox* from its user-facing documentation
>> (both LO and MSO) down to its C++ implementation
>> (*SbRtl_MsgBox in basic/source/runtime/methods.cxx*). This will help us
>> understand how to handle built-in functions and their often-implicit
>> parameter signatures.
>>
>> *Future PoC: Parser Symbol Extraction*
>> After MsgBox, the plan is to write a C++ PoC that interacts with the
>> SbiParser to extract its internal symbol tables (SbiSymPool) for
>> user-defined code.
>>
>> A mentor's comment, *"We have a cppumaker, etc., and why not a
>> basicmaker?"*, really resonated with me. It highlights that our ultimate
>> goal is to create a powerful "analyzer" for BASIC that provides the same
>> level of rich, structured information for our IDE tools as other
>> "makers" do for their respective languages. And yes, I have to speed
>> things up.
>>
>> Thanks for following this.
>>
>> --
>> *Regards,*
>> *Devansh*
>>
>
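
For anyone who wants to try the PoC 2 steps without building the patch, the same sequence (context, then theCoreReflection, then forName and a dump of the methods) can be sketched through the Python-UNO bridge. This is not the C++ from the Gerrit change. It is a rough equivalent that assumes a Python shipping the uno module, for example the one bundled with LibreOffice. method_signatures and main are hypothetical names; forName, getMethods, getParameterInfos and getReturnType are the reflection calls from the com.sun.star.reflection API.

```python
def method_signatures(reflection, type_name):
    """Return one 'name(params) -> return type' string per method of a type."""
    idl_class = reflection.forName(type_name)  # XIdlClass, or None if unknown
    if idl_class is None:
        return []
    signatures = []
    for method in idl_class.getMethods():
        params = ", ".join(
            f"{p.aName}: {p.aType.getName()}"
            for p in method.getParameterInfos()
        )
        return_type = method.getReturnType().getName()
        signatures.append(f"{method.getName()}({params}) -> {return_type}")
    return signatures


def main():
    try:
        import uno  # assumption: only present in a LibreOffice-aware Python
    except ImportError:
        print("uno module not available; run this with LibreOffice's Python")
        return
    # Context first, then the reflection singleton, then the type details.
    ctx = uno.getComponentContext()
    reflection = ctx.getValueByName(
        "/singletons/com.sun.star.reflection.theCoreReflection"
    )
    for line in method_signatures(reflection, "com.sun.star.frame.XModel"):
        print(line)


if __name__ == "__main__":
    main()
```

Keeping method_signatures free of any uno import means it can be exercised against simple stand-in objects, which is handy when checking the output format before wiring it to a real office instance.
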
