Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming
On Wed, Aug 24, 2022 at 4:36 PM Alejandro Colomar wrote: > > I'm trying to be nice, and ask for review to make sure I'm not making > some big mistake by accident, and I get disrespect? No thanks. You've been told multiple times that the kernel doesn't use the "standard" names, and *cannot* use them for namespace reasons, and you ignore all the feedback, and then you claim you are asking for review? That's not "asking for review". That's "I think I know the answer, and when people tell me otherwise I ignore them". The fact is, kernel UAPI header files MUST NOT use the so-called standard names. We cannot provide said names, because they are only provided by the standard header files. And since kernel header files cannot provide them, then kernel UAPI header files cannot _use_ them. End result: any kernel UAPI header file will continue to use __u32 etc naming that doesn't have any namespace pollution issues. Nothing else is even remotely acceptable. Stop trying to make this something other than it is. And if you cannot accept these simple technical reasons, why do you expect respect? Why are you so special that you think you can change the rules for kernel uapi files over the *repeated* objections from maintainers who know better? Linus
Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming
On Wed, Aug 24, 2022 at 11:41 PM Florian Weimer wrote: > > The justifications brought forward are just regurgitating previous > misinformation. If you do that, it's hard to take you seriously. Pot, meet kettle. > There is actually a good reason for using __u64: it's always based on > long long, so the format strings are no longer architecture-specific, [..] That's a small detail that yes, we've tried to avoid the absolute humongous mess that the C standard library has with their horrendous 'PRId*' mess, but honestly, it's just a tiny detail. The real issue is that we want to be able to control our own types, and our own names, and in the process have sometimes been able to standardize on types that makes it easier to just not have to deal with "oh, somebody picked 'int' on this architecture, and 'long' on this other, and they are both 32-bit types". We still have to deal with that for '[s]size_t', but that's such a standard legacy type that thankfully we have the whole '%zu/%zd' thing for that. And yes, sometimes we screw up even *though* we were the ones that picked the types, and we've had pointless differences where '__u64' could be 'unsigned long' on a 64-bit architecture, and 'unsigned long long' on a 32-bit one, and then we were able to fix our own little broken type system exactly because it was *OUR* little type system. So you are correct that then in the specific case of '__u64' we have been able to simply just standardize on 'unsigned long long' and make printf strings simpler. But you are wrong to think that that is somehow a special thing. It's not. It's very much all the same thing: we have types *we* control, and thanks to that we can do them the way *we* need them done, and can fix them when we made a silly mistake. In other words, it's the whole *point* of not ever using 'stdint.h' at all for those things. (It's also about avoiding the kinds of unholy things that happen in system header files. Have you ever *looked* at them? Christ. The amount of absolute crap you get from including in user space is scary) > You cannot avoid using certain ISO C names with current GCC or Clang, > however hard you try. You are now the one who is regurgitating complete mis-information. You do it so prettily, and with such weasel-wording, that I know you must be knowingly threading that fine line between "actively misleading" but trying to avoid "outright lying".. You say "certain ISO C names" to try to make it sound as if this was at all relevant to things like "uint32_t" and friends. But deep down, you know you're basically lying by omission. Because it's true that we have to know and care about things like 'size_t', which comes up for all the basic string.h functions. So yes, we have a very small set of types that we make sure matches the compiler notion of said types, and we carefully use things like typedef __kernel_ulong_t __kernel_size_t; and then we have our own 'stdarg.h which uses typedef __builtin_va_list va_list; that is explicitly the one that the compiler exposes with those double underscores exactly because even the compiler can't expose the "standard" name due to namespace issues. And no, NONE OF THOSE ARE USABLE IN THE UAPI HEADERS! And equally importantly, none of those have *anything* to do with the 'uint32_t' kind of names. The fact that yes, we care about what the compiler thinks "size_t" is (because we do want the compiler to do memset() for us) has absolutely *NOTHING* to do with uint32_t and . And I'm pretty sure you knew that, but you tried to make it sound like they were somehow all in the same boat. And yes, some drivers tend to actually use 'uint32_t' in the kernel, and we allow it, but they cannot be used by user interfaces. So a uapi file really *really* shouldn't ever use them. And no, we don't use "-ffreestanding" and friends - we actually have occasionally wanted and tried to do so just to make the boundary lines clearer, but then that will make gcc no longer do sane things for 'memcpy()'' and friends, so it's kind of a balancing act. > , , are compiler-provided headers that > are designed to be safe to use for bare-metal contexts (like in > kernels). Avoiding them is not necessary per se. We explicitly avoid them all. We historically used stdarg.h and stddef.h (but never stdint.h - there's absolutely _zero_ upside), but it was always a slight pain. So we simply bake our own, exactly because it's simply less painful than having to deal with possible system-provided ones. People do odd compiler things with host compilers, bad or odd installations of cross-build environments, it's just not worth the pain to deal with the "system header files" when they just don't provide any real value. Linus
Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming
On Thu, Aug 25, 2022 at 12:20 AM Alejandro Colomar wrote: > > This patch is not about kernel, but about the section 2 and 3 manual > pages, which are directed towards user-space readers most of the time. They are about the types to the kernel interfaces. Those types that the kernel defines and exposes. And the kernel type in question looks like this: struct { /* anonymous struct used by BPF_PROG_LOAD command */ __u32 prog_type; /* one of enum bpf_prog_type */ __u32 insn_cnt; __aligned_u64 insns; __aligned_u64 license; because the kernel UAPI header wants to (a) work whether or not was included (b) doesn't want to include so as to not pollute the namespace (c) actually wants to use our special types I quoted a few more fields for that (c) reason: we've had a long history of getting the user space API wrong due to strange alignment issues, where 32-bit and 64-bit targets had different alignment for types. So then we ended up with various compat structures to translate from one to the other because they had all the same fields, but different padding. This happened a few times with the DRM people, and they finally got tired of it, and started using that "__aligned_u64" type, which is just the same old __u64, but explicitly aligned to its natural alignment. So these are the members that the interface actually uses. If you document those members, wouldn't it be good to have that documentation be actually accurate? Honestly, I don't think it makes a *huge* amount of difference, but documentation that doesn't actually match the source of the documentation will just confuse somebody in the end. Somebody will go "that's not right", and maybe even change the structure definitions to match the documentation. Which would be wrong. Now, you don't have to take the kernel uapi headers. We *try* to make those usable as-is, but hey, there has been problems in the past, and it's not necessarily wrong to take the kernel header and then munge it to be "appropriate" for user space. So I guess you can just make your own structures with the names and syntax you want, and say "these are *my* header files, and *my* documentation for them". That's fine. But I'm not surprised if the kernel maintainer then goes "no, that's not the interface I agreed to" if only because it's a pain to verify that you got it all right. Maybe it was all trivial and automated and there can be no errors, but it's still a "why is there a different copy of this complicated interface". If you really want to describe things to people, wouldn't it be nicer to just say "there's a 32-bit unsigned 'prog_type' member" and leave it at that? Do you really want to enforce your opinion of what is prettier on the maintainer that wrote that thing, and then argue with him when he doesn't agree? Linus
Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming
On Thu, Aug 25, 2022 at 7:38 AM Joseph Myers wrote: > > I've not yet implemented it for glibc or for GCC format checking, but C23 > adds 'wN' format length modifiers so you will be able to e.g. use "%w64d" > with printf to print an int64_t and won't need those PRI macros any more. Yeah, that's going to help user space. We don't typically have huge issues with it (any more) in the kernel exactly because we refused to do the syntactically horrendous PRIxyz thing. So in the kernel, we still do have some format string issues, but they tend to be about "different architectures and configurations do different things for this type", and those different things are sadly not necessarily about a fixed width. IOW, we used to have horrors like "sector_t can be 32-bit or 64-bit depending on config options" (because small machines didn't want the overhead of having to pass 64-bit things around - from back when 32-bit was a primary target). We got rid of *that* thing a few years ago because it just wasn't worth supporting any more, but some similar issues remain. So we still have a number of cases of "if you really need to print this out, you need to use '%llui' and cast the value to 'unsigned long long'". But it's happily not as common as it used to be. Linus
Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre
On Sun, Jan 14, 2018 at 1:02 PM, David Woodhouse wrote: > Review on the GCC patches has led to a request that the thunk symbols > be changed from e.g. __x86_indirect_thunk_rax to > __x86_indirect_thunk_ax without the 'r'. Ok. I think that just makes it easier for us, since then the names are independent of 32-vs/64, and we don't need to use the _ASM_XY names. What about r8-r15? I'm assuming 'r' there is used? Mind sending me a tested patch? I'll was indeed planning on generating rc8, but I might as well go grocery shopping now instead, and do rc8 later in the evening. Linus
Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre
On Sun, Jan 14, 2018 at 2:39 PM, David Woodhouse wrote: > > I'm not convinced we want to do this, but I'll defer to Linus. Well, I guess we have no choice, if gcc ends up using the stupid names. And yes, apparently this just made our macros worse instead of cleaning anything up. Oh well. I do have one (possible) solution: just export both names. So you'd export __x86_indirect_thunk_ax __x86_indirect_thunk_rax .. __x86_indirect_thunk_8 __x86_indirect_thunk_r8 as symbols (same code, obviously), and then (a) the macros would be simpler (b) it just happens to work with even the old gcc patch But at this point I don't really care. Linus
Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre
On Sun, Jan 14, 2018 at 3:09 PM, David Woodhouse wrote: > > I think we should stick with what we have now, with the names of the > thunks actually being the *full* name of the register (rax, eax, etc.) > that they use. It that works for the gcc people, then yes, I agree. The mixed "sometimes full, sometimes not" approach just seems broken. Linus
Re: [PATCH] x86/retpoline: Switch thunk names to match final GCC patches
On Sun, Jan 14, 2018 at 3:23 PM, David Woodhouse wrote: > I think we *shouldn't* do this. Uros said we could look at it and make > a decision, and GCC would implement what we decide. Up to Linus. Regardless of whether we end up having to do this, I'm not doing rc8 with it, and let's hope we can just skip it entirely. It seems silly to have the 'r' for the r8-r15 case, but not the legacy registers. Linus
Re: [PATCH v2 6/7] Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]
On Mon, 6 Jan 2025 at 16:13, Jeff Law wrote: > > But in the case of concurrent accesses, shouldn't these objects be > declared as atomic? No. They aren't concurrent accesses to the same variable. They are concurrent accesses to *different* memory locations, and the compiler is not allowed to mess them up. IOW, if you have struct myvar { pthread_mutex_t buffer_lock; pthread_mutex_t another_var_lock; char buffer[7]; char another_var; }; and "buffer_lock" serializes accesses to "buffer[]", and "another_var_lock" serializes accesses to "another_var", then the compiler IS NOT ALLOWED TO TOUCH "another_var" when the code touches "buffer[]". So if a compiler turns "memset(var->buffer, 0, 7)" into "load 8 bytes, clear 7 of the bytes, store 8 bytes", then the compiler is buggy. Because that messes up another thread that accesses "another_var", and the 8-byte write may write back an old value that is no longer valid. There is absolutely no gray area here. It was always buggy, and the alpha architecture was always completely and fundamentally mis-designed. C11 made it explicitly clear: "Different threads of execution are always allowed to access (read and modify) different memory locations concurrently, with no interference and no synchronization requirements" but honestly, that was just codifying something that should have been blindingly obvious even before. Linus
Re: [PATCH v2 6/7] Alpha: Add option to avoid data races for sub-longword memory stores [PR117759]
On Mon, 6 Jan 2025 at 16:59, Linus Torvalds wrote: > > There is absolutely no gray area here. It was always buggy, and the > alpha architecture was always completely and fundamentally > mis-designed. Note that I really do want to re-emphasize that while I think it's kind of interesting that Maciej is trying to make gcc DTRT on alpha, the non-BWX machines really are a completely broken architecture and almost entirely unfixable. Yeah, yeah, Maciej also has patches to avoid all the ldq_u/stq_u sequences for regular byte accesses into actually using ldl_l / stl_c sequences, but those instructions take hundreds of cycles and go out on the bus outside the CPU. So using actual thread-safe byte ops with ldl_l / stl_c turns those non-BWX alpha CPU's into something very sad and pointless. You might as well go full retro and use a 6502 or something. And even the newer alphas that *had* BWX were designed to still do byte operations with quadword ops and masking. Yes, byte ops existed, but they were still very much designed as a "only when you really can't use the masked word model". For example, the standard "memset()" library routine for EV6 literally does exactly the thing that I said would be a compiler bug to do. Because that's literally the core design of the architecture: it's buggy. So while I applaud Maciej's efforts, I'm not convinced they are all that productive. Even with a fixed compiler, it's all broken anyway. Of course, most of the time, you won't ever see the breakage. It's there, but hitting it in practice is almost impossible. The Linux kernel uses the known-broken memcpy and memset library code. All user space does the same. They are hopelessly and fundamentally broken, but they work in practice as long as you don't have concurrent accesses "near-by" in space-time. Linus