jhuber6 added a comment.

In D122069#3397373 <https://reviews.llvm.org/D122069#3397373>, @tra wrote:

>> I decided to use a custom format rather than using an existing format (ELF, 
>> JSON, etc) because of the specialty use-case of this.
>
> Will we ever need to process the files with tools built with a different LLVM 
> version? E.g clang and lld may be built separately from different LLVM trees.
> If so, then maintaining compatibility of the binary format will become more 
> painful over time, compared to using json.

As it stands now, this is only used by the linker-wrapper with is all clang. 
LLD is only called as far of the host so it's impossible to mix versions. That 
being said, eventually I want this functionality to be performed by the linker 
itself without needing the extra wrapper so it'll be possible in the future. I 
added the "Version" field specifically to emit a warning or error in the future 
if we change this. Even if we used JSON or some other form it would be a 
similar story if these get de-synced because we'd have different entries and 
keys. We could add to the struct without breaking the ABI since everything uses 
absolute offsets.

> Considering that we already have json parsing and serialization in LLVM, I 
> don't see much benefit growing yet another on-disk format. AFAICT, JSON 
> should do the job encoding relevant data just fine and would need less 
> maintenance long term.

I was definitely thinking about a lot of alternatives to making yet another 
on-disk binary format. I had a few discussions with @JonChesterfield on the 
subject. There were two things that put me off of using JSON primarily, correct 
me if I have any misconceptions.

1. This is fundamentally a very thin metadata wrapper around a binary image. 
AFAIK if you want to stream binary data to a JSON you need to use a base64 or 
similar encoding, which makes the files bigger and adds some extra work. It's 
not a deal-breaker, but it's somewhat of a turn-off.
2. There could be multiple of these contained in a single section, I primarily 
wanted something with a known size and easy to identify magic bytes to find 
where these live in the section. I wasn't sure how nicely this would interact 
with the JSON parser. It could also be done, but I wasn't sure how much utility 
there was.



================
Comment at: clang/include/clang/Basic/Offloading.h:87
+private:
+  struct Header {
+    uint8_t Magic[4] = {0x10, 0xFF, 0x10, 0xAD};
----------------
jhuber6 wrote:
> tra wrote:
> > tra wrote:
> > > Given that it's an on-disk format I think these should be explicitly 
> > > packed and carry a comment that it's an on-disk format which needs extra 
> > > caution during future changes.
> > > 
> > > 
> > On-disk format could use more comments.
> What do you mean by explicitly packed? And I added the "Version" field in the 
> header so we can warn on old versions.
Noted, will add more if we settle on this format.


================
Comment at: clang/include/clang/Basic/Offloading.h:87-104
+  struct Header {
+    uint8_t Magic[4] = {0x10, 0xFF, 0x10, 0xAD};
+    uint32_t Version;
+    uint64_t Size;
+    uint64_t EntryOffset;
+  };
+
----------------
tra wrote:
> tra wrote:
> > Given that it's an on-disk format I think these should be explicitly packed 
> > and carry a comment that it's an on-disk format which needs extra caution 
> > during future changes.
> > 
> > 
> On-disk format could use more comments.
What do you mean by explicitly packed? And I added the "Version" field in the 
header so we can warn on old versions.


================
Comment at: clang/include/clang/Basic/Offloading.h:90
+    uint32_t Version;
+    uint64_t Size;
+    uint64_t EntryOffset;
----------------
tra wrote:
> What does `Size` cover and what units it uses -- bytes, number of Entries, 
> something else?
Size of the entire file, so you can do `Data[Header->Size]` and potentially get 
the next one in memory.


================
Comment at: clang/include/clang/Basic/Offloading.h:95-96
+  struct Entry {
+    uint16_t ImageKind;
+    uint16_t OffloadKind;
+    uint32_t Flags;
----------------
tra wrote:
> Should we use the matching enum types here? 
> We're using 16-bit enums, so it would save us on casts when we use those 
> fields and would make it harder to set a wrong value by mistake.
Yeah, the casts are annoying but I elected to be explicit with the size 
constraints in the header itself. Wasn't really sure which was better.


================
Comment at: clang/include/clang/Basic/Offloading.h:97
+    uint16_t OffloadKind;
+    uint32_t Flags;
+    uint64_t TripleOffset;
----------------
tra wrote:
> Do these flags have defined meaning?
They're unused for now, I was planning on making it a bitfield so we could pass 
things like whether or not this file contains debug information or is 
optimized, kind of like fatbinary.


================
Comment at: clang/include/clang/Basic/Offloading.h:98
+    uint32_t Flags;
+    uint64_t TripleOffset;
+    uint64_t ArchOffset;
----------------
tra wrote:
> What are the offsets relative to?
Absolute from the start of the file, can comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122069/new/

https://reviews.llvm.org/D122069

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to