jhuber6 added a comment. In D122069#3397373 <https://reviews.llvm.org/D122069#3397373>, @tra wrote:
>> I decided to use a custom format rather than using an existing format (ELF, >> JSON, etc) because of the specialty use-case of this. > > Will we ever need to process the files with tools built with a different LLVM > version? E.g clang and lld may be built separately from different LLVM trees. > If so, then maintaining compatibility of the binary format will become more > painful over time, compared to using json. As it stands now, this is only used by the linker-wrapper with is all clang. LLD is only called as far of the host so it's impossible to mix versions. That being said, eventually I want this functionality to be performed by the linker itself without needing the extra wrapper so it'll be possible in the future. I added the "Version" field specifically to emit a warning or error in the future if we change this. Even if we used JSON or some other form it would be a similar story if these get de-synced because we'd have different entries and keys. We could add to the struct without breaking the ABI since everything uses absolute offsets. > Considering that we already have json parsing and serialization in LLVM, I > don't see much benefit growing yet another on-disk format. AFAICT, JSON > should do the job encoding relevant data just fine and would need less > maintenance long term. I was definitely thinking about a lot of alternatives to making yet another on-disk binary format. I had a few discussions with @JonChesterfield on the subject. There were two things that put me off of using JSON primarily, correct me if I have any misconceptions. 1. This is fundamentally a very thin metadata wrapper around a binary image. AFAIK if you want to stream binary data to a JSON you need to use a base64 or similar encoding, which makes the files bigger and adds some extra work. It's not a deal-breaker, but it's somewhat of a turn-off. 2. There could be multiple of these contained in a single section, I primarily wanted something with a known size and easy to identify magic bytes to find where these live in the section. I wasn't sure how nicely this would interact with the JSON parser. It could also be done, but I wasn't sure how much utility there was. ================ Comment at: clang/include/clang/Basic/Offloading.h:87 +private: + struct Header { + uint8_t Magic[4] = {0x10, 0xFF, 0x10, 0xAD}; ---------------- jhuber6 wrote: > tra wrote: > > tra wrote: > > > Given that it's an on-disk format I think these should be explicitly > > > packed and carry a comment that it's an on-disk format which needs extra > > > caution during future changes. > > > > > > > > On-disk format could use more comments. > What do you mean by explicitly packed? And I added the "Version" field in the > header so we can warn on old versions. Noted, will add more if we settle on this format. ================ Comment at: clang/include/clang/Basic/Offloading.h:87-104 + struct Header { + uint8_t Magic[4] = {0x10, 0xFF, 0x10, 0xAD}; + uint32_t Version; + uint64_t Size; + uint64_t EntryOffset; + }; + ---------------- tra wrote: > tra wrote: > > Given that it's an on-disk format I think these should be explicitly packed > > and carry a comment that it's an on-disk format which needs extra caution > > during future changes. > > > > > On-disk format could use more comments. What do you mean by explicitly packed? And I added the "Version" field in the header so we can warn on old versions. ================ Comment at: clang/include/clang/Basic/Offloading.h:90 + uint32_t Version; + uint64_t Size; + uint64_t EntryOffset; ---------------- tra wrote: > What does `Size` cover and what units it uses -- bytes, number of Entries, > something else? Size of the entire file, so you can do `Data[Header->Size]` and potentially get the next one in memory. ================ Comment at: clang/include/clang/Basic/Offloading.h:95-96 + struct Entry { + uint16_t ImageKind; + uint16_t OffloadKind; + uint32_t Flags; ---------------- tra wrote: > Should we use the matching enum types here? > We're using 16-bit enums, so it would save us on casts when we use those > fields and would make it harder to set a wrong value by mistake. Yeah, the casts are annoying but I elected to be explicit with the size constraints in the header itself. Wasn't really sure which was better. ================ Comment at: clang/include/clang/Basic/Offloading.h:97 + uint16_t OffloadKind; + uint32_t Flags; + uint64_t TripleOffset; ---------------- tra wrote: > Do these flags have defined meaning? They're unused for now, I was planning on making it a bitfield so we could pass things like whether or not this file contains debug information or is optimized, kind of like fatbinary. ================ Comment at: clang/include/clang/Basic/Offloading.h:98 + uint32_t Flags; + uint64_t TripleOffset; + uint64_t ArchOffset; ---------------- tra wrote: > What are the offsets relative to? Absolute from the start of the file, can comment. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D122069/new/ https://reviews.llvm.org/D122069 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits