jansvoboda11 added a comment.

In D114095#3160103 <https://reviews.llvm.org/D114095#3160103>, @vsapsai wrote:

> I've mentioned it in D112915 <https://reviews.llvm.org/D112915> as we've
> discussed the stored data format there. But my concern was that bitvector
> packing might not be the most space-efficient encoding. I haven't done proper
> testing, just an off-the-cuff comparison, and it looks like for most of the
> frameworks in the iOS SDK storing included headers per submodule takes less
> space than encoding them as a bitvector. I have an idea why that might be
> happening but I haven't checked it in a debugger, so I'll keep it to myself
> to avoid derailing the discussion.

Let's bring the conversation over here.

I ran the same UIKit test you did and compared the following:

- current trunk
- current trunk with this patch
- current trunk with this patch, with the bitvector replaced by a vector of
  IDs (32-bit integers).

The following table shows the sizes of .pcm files in bytes and their delta
compared to trunk (the last row is the total):

+----------+-----------------+-----------------+
| trunk    | bit vector      | ID vector       |
+----------+-----------------+-----------------+
|   281932 |   281944    +12 |   281988    +56 |
|   989840 |   989784    -56 |   989968   +128 |
|   837116 |   837084    -32 |   837212    +96 |
|   899924 |   899912    -12 |   900004    +80 |
|   710296 |   710296     +0 |   710376    +80 |
|   273140 |   273144     +4 |   273196    +56 |
|  3649856 |  3649024   -832 |  3650804   +948 |
|   207676 |   207692    +16 |   207740    +64 |
|   342792 |   342804    +12 |   342860    +68 |
|  4137660 |  4137460   -200 |  4137940   +280 |
|   173536 |   173564    +28 |   173580    +44 |
|   787120 |   787144    +24 |   787180    +60 |
|  1260652 |  1260596    -56 |  1260804   +152 |
|   255072 |   255092    +20 |   255128    +56 |
|   973204 |   973228    +24 |   973268    +64 |
|   398952 |   398940    -12 |   399036    +84 |
|   631516 |   631516     +0 |   631588    +72 |
|  5252932 |  5252348   -584 |  5253612   +680 |
|   230160 |   230168     +8 |   230228    +68 |
|    24460 |    24500    +40 |    24500    +40 |
|    53244 |    53280    +36 |    53288    +44 |
|    75932 |    75952    +20 |    75972    +40 |
|    32840 |    32876    +36 |    32884    +44 |
+----------+-----------------+-----------------+
| 22479852 | 22478348  -1504 | 22483156  +3304 |
+----------+-----------------+-----------------+

Used command:

  echo '#import <UIKit/UIKit.h>' | ./bin/clang -fsyntax-only \
    -isysroot "$(xcrun --sdk iphoneos --show-sdk-path)" \
    -target arm64-apple-ios -fmodules \
    -fmodules-cache-path=modules.noindex -x objective-c -

Patch that I applied on top of the one under review to get the vector of IDs:
F20979504: bitvector-to-id-vector.diff <https://reviews.llvm.org/F20979504>

I see how the bitvector could explode for large fine-grained modules. They
have lots of input files (-> large bitvectors in each submodule), but each
submodule only includes a handful of files (-> bitvectors are sparse). It
seems like this doesn't actually happen, at least in our SDK.

@vsapsai Do you think this warrants a more thorough investigation?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114095/new/

https://reviews.llvm.org/D114095

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
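
For reference, the sparse-bitvector concern discussed above can be put into a back-of-the-envelope model. This is only a rough sketch under stated assumptions, not how the patch or the .pcm bitstream actually encodes the data: it assumes one bit per input file rounded up to 32-bit words for the bitvector, a flat 4 bytes per included header for the ID vector, and it ignores any abbreviation or variable-width encoding the serializer may apply. All function names are hypothetical.

```python
# Idealized per-submodule size model (assumption: raw, uncompressed storage).

def bitvector_bytes(num_input_files: int, word_bits: int = 32) -> int:
    """Bytes for one bit per input file, rounded up to whole words."""
    words = (num_input_files + word_bits - 1) // word_bits
    return words * (word_bits // 8)


def id_vector_bytes(num_included: int, id_bytes: int = 4) -> int:
    """Bytes for one fixed-width ID per header the submodule includes."""
    return num_included * id_bytes


def cheaper_encoding(num_input_files: int, num_included: int) -> str:
    """Name the smaller encoding under this model; ties go to the bitvector."""
    bv = bitvector_bytes(num_input_files)
    ids = id_vector_bytes(num_included)
    return "bit vector" if bv <= ids else "ID vector"


if __name__ == "__main__":
    # Hypothetical shapes: a large, sparse module and a small, dense one.
    for files, included in [(1000, 5), (100, 40)]:
        print(files, included,
              bitvector_bytes(files), id_vector_bytes(included),
              cheaper_encoding(files, included))
```

Under this model the ID vector only wins when a submodule includes fewer than roughly 1/32 of the module's input files, which matches the "large, fine-grained module" scenario described above. The measurements in the table are the real data point; the model just shows where the break-even would sit if nothing else affected the size.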