Re: Importance of transformations that turn data dependencies into control dependencies?
On Thu, Feb 25, 2016 at 6:33 PM, Torvald Riegel wrote:
> On Wed, 2016-02-24 at 13:14 +0100, Richard Biener wrote:
>> On Tue, Feb 23, 2016 at 8:38 PM, Torvald Riegel wrote:
>> > I'd like to know, based on the GCC experience, how important we consider
>> > optimizations that may turn data dependencies of pointers into control
>> > dependencies. I'm thinking about all optimizations or transformations
>> > that guess that a pointer might have a specific value, and then create
>> > (specialized) code that assumes this value that is only executed if the
>> > pointer actually has this value. For example:
>> >
>> > int d[2] = {23, compute_something()};
>> >
>> > int compute(int v) {
>> >   if (likely(v == 23)) return 23;
>> >   else ;
>> > }
>> >
>> > int bar() {
>> >   int *p = ptr.load(memory_order_consume);
>> >   size_t reveal_that_p_is_in_d = p - d[0];
>> >   return compute(*p);
>> > }
>> >
>> > Could be transformed to (after inlining compute(), and specializing for
>> > the likely path):
>> >
>> > int bar() {
>> >   int *p = ptr.load(memory_order_consume);
>> >   if (p == d) return 23;
>> >   else ;
>> > }
>>
>> Note that if a user writes
>>
>> if (p == d)
>>   {
>>     ... do lots of stuff via p ...
>>   }
>>
>> GCC might rewrite accesses to p as accesses to d and thus expose
>> those opportunities. Is that a transform that isn't valid then or is
>> the code written by the user (establishing the equivalency) to blame?
>
> In the context of this memory_order_consume proposal, this transform
> would be valid because the program has already "revealed" what value p
> has after the branch has been taken.
>
>> There's a PR where this kind of equivalencies lead to unexpected (wrong?)
>> points-to results for example.
>>
>> > Other potential examples that come to mind are de-virtualization, or
>> > feedback-directed optimizations that have observed at runtime that a
>> > certain pointer is likely to be always equal to some other pointer (eg.,
>> > if p is almost always d[0], and specializing for that).
>>
>> Those are the cases that are quite important in practice.
>
> Could you quantify this somehow, even if it's a very rough estimate?
> I'm asking because if it's significant and widely used, then this would
> require users or compiler implementors to make a difficult trade-off
> (ie, do you want mo_consume performance or performance through those
> other optimizations?).

Probably resolved by your followup on how the transform is safe anyway.

>> > Also, it would be interesting to me to know how often we may turn data
>> > dependencies into control dependencies in cases where this doesn't
>> > affect performance significantly.
>>
>> I suppose we try to avoid that but can we ever know for sure? Like
>> speculative devirtualization does this (with the intent that it _does_
>> matter, of course).
>>
>> I suppose establishing non-dependence isn't an issue, like with the
>> vectorizer adding runtime dependence checks and applying versioning
>> to get a vectorized and a not vectorized loop (in case there are
>> dependences)?
>
> I'm not sure I understand you correctly. Do you have a brief example,
> perhaps? For mo_consume and its data dependencies, if there might be a
> dependence, the compiler would have to preserve it; but I guess that
> both a vectorized loop and one that accesses each element separately
> would preserve dependences because both are doing those accesses, and they
> depend on the input data.
> OTOH, perhaps HW vector instructions don't get the ordering guarantees
> from data dependences -- Paul, do you know of any such cases?
A brief example would be for

void foo (int *a, int *b, int n)
{
  for (int i = 0; i < n; ++i)
    a[i] = b[i];
}

which we can vectorize like

if (a + n < b || b + n < a)
  {
    vectorized loop
  }
else
  {
    not vectorized loop
  }

note how we're not establishing equivalences between pointers but
non-dependence vs. possible dependence.

>> > The background for this question is Paul McKenney's recently updated
>> > proposal for a different memory_order_consume specification:
>> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0190r0.pdf
>> >
>> > In a nutshell, this requires a compiler to either prove that a pointer
>> > value is not carrying a dependency (simplified, its value somehow
>> > originates from a memory_order_consume load), or it has to
>> > conservatively assume that it does; if it does, the compiler must not
>> > turn data dependencies into control dependencies in generated code.
>> > (The data dependencies, in contrast to control dependencies, enforce
>> > memory ordering on archs such as Power and ARM; these orderings then
>> > allow for not having to use an acquire HW barrier in the generated
>> > code.)
>> >
>> > Given that such a proof will likely be hard for a compiler (dependency
>> > chains propagate through assignments to variables on the heap and stack,
>> > chains are not marked in the code, and points-to analysis can
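Richard's versioning example can be written out as a compilable sketch (the function name is mine, purely illustrative). The runtime check establishes non-dependence, so only the first branch may be vectorized; the strict `<` test from the message is conservative, since exactly-adjacent arrays also don't overlap:

```cpp
#include <cstddef>

// Sketch of the loop versioning the vectorizer performs: a runtime
// overlap check separates the provably independent case (safe to
// vectorize) from the possibly dependent case (kept scalar).
void foo_versioned(int *a, int *b, int n) {
  if (a + n < b || b + n < a) {
    // No overlap: iterations are independent; the compiler may use
    // vector loads/stores for this copy.
    for (int i = 0; i < n; ++i)
      a[i] = b[i];
  } else {
    // Possible overlap: the scalar loop preserves any dependences,
    // because each element is read and written in order.
    for (int i = 0; i < n; ++i)
      a[i] = b[i];
  }
}
```

Note that both branches compute the same result; the versioning only changes which code the compiler is allowed to reorder internally.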
Validity of SUBREG+AND-imm transformations
Hi all,

I'm looking at a case where some RTL passes create an RTL expression of the form:

(subreg:QI (and:SI (reg:SI x1) (const_int 31)) 0)

which I'd like to simplify to:

(and:QI (subreg:QI (reg:SI x1) 0) (const_int 31))

Because (const_int 31) clears all bits above the low five, we should be able to safely perform the operation in QImode. It's easy enough to express in RTL but I'm trying to convince myself of its validity. I know there are some subtle points in this area. combine_simplify_rtx in combine.c has a comment:

/* Note that we cannot do any narrowing for non-constants since
   we might have been counting on using the fact that some bits were
   zero. We now do this in the SET. */

and if I try to implement this transformation in simplify_subreg from simplify-rtx.c I get some cases where combine goes into an infinite recursion in simplify_comparison because it tries to do:

/* If this is (and:M1 (subreg:M1 X:M2 0) (const_int C1)) where C1
   fits in both M1 and M2 and the SUBREG is either paradoxical or
   represents the low part, permute the SUBREG and the AND and try
   again. */

I think the transformation is valid in general because in the original case we care only about the bits within QImode, which are well defined by the wider inner operation, and in the transformed case the same bits are also well-defined because of the narrow bitmask. Performing this transformation would help a lot with recognition of some patterns that I'm working on, so would it be acceptable to teach combine or simplify-rtx to do this?

Thanks,
Kyrill
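A C-level analogue makes the claimed equivalence easy to check exhaustively (function names are mine, purely illustrative):

```cpp
#include <cstdint>

// C-level analogue of the two RTL forms in the message above:
//   (subreg:QI (and:SI (reg:SI x1) (const_int 31)) 0)  ~ mask in SImode, then take the low byte
//   (and:QI (subreg:QI (reg:SI x1) 0) (const_int 31))  ~ take the low byte, then mask in QImode
// Since 31 has no bits set above bit 4, every bit of the QImode result
// is determined identically in both forms, for every 32-bit input.
uint8_t si_and_then_subreg(uint32_t x) {
  return static_cast<uint8_t>(x & 31u);
}

uint8_t subreg_then_qi_and(uint32_t x) {
  return static_cast<uint8_t>(x) & 31u;
}
```

This only shows the value-level equivalence, of course; the RTL-level subtleties the thread discusses are about which bits are *defined*, not which values result.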
Re: [WWWDocs] Deprecate support for non-thumb ARM devices
On 24/02/16 13:59, Richard Earnshaw (lists) wrote: > After discussion with the ARM port maintainers we have decided that now > is probably the right time to deprecate support for versions of the ARM > Architecture prior to ARMv4t. This will allow us to clean up some of > the code base going forwards by being able to assume: > - Presence of half-word data accesses > - Presence of Thumb and therefore of interworking instructions. > > This patch records the status change in the GCC-6 release notes. > > I propose to commit this patch later this week. > Now done. R. > R. > > > deprecate.patch > > > Index: htdocs/gcc-6/changes.html > === > RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v > retrieving revision 1.61 > diff -u -r1.61 changes.html > --- htdocs/gcc-6/changes.html 19 Feb 2016 05:00:54 - 1.61 > +++ htdocs/gcc-6/changes.html 19 Feb 2016 14:47:31 - > @@ -340,7 +340,14 @@ > ARM > > > - The arm port now supports target attributes and pragmas. Please > + Support for revisions of the ARM architecture prior to ARMv4t has > + been deprecated and will be removed in a future GCC release. > + This affects ARM6, ARM7 (but not ARM7TDMI), ARM8, StrongARM, and > + Faraday fa526 and fa626 devices, which do not have support for > + the Thumb execution state. > + > + > + The ARM port now supports target attributes and pragmas. Please > refer to the href="https://gcc.gnu.org/onlinedocs/gcc/ARM-Function-Attributes.html#ARM-Function-Attributes";> > documentation for details of available attributes and > pragmas as well as usage instructions. >
Re: Importance of transformations that turn data dependencies into control dependencies?
On Fri, Feb 26, 2016 at 11:49:29AM +0100, Richard Biener wrote:
> On Thu, Feb 25, 2016 at 6:33 PM, Torvald Riegel wrote:
> > On Wed, 2016-02-24 at 13:14 +0100, Richard Biener wrote:
> >> On Tue, Feb 23, 2016 at 8:38 PM, Torvald Riegel wrote:
> >> > I'd like to know, based on the GCC experience, how important we consider
> >> > optimizations that may turn data dependencies of pointers into control
> >> > dependencies. I'm thinking about all optimizations or transformations
> >> > that guess that a pointer might have a specific value, and then create
> >> > (specialized) code that assumes this value that is only executed if the
> >> > pointer actually has this value. For example:
> >> >
> >> > int d[2] = {23, compute_something()};
> >> >
> >> > int compute(int v) {
> >> >   if (likely(v == 23)) return 23;
> >> >   else ;
> >> > }
> >> >
> >> > int bar() {
> >> >   int *p = ptr.load(memory_order_consume);
> >> >   size_t reveal_that_p_is_in_d = p - d[0];
> >> >   return compute(*p);
> >> > }
> >> >
> >> > Could be transformed to (after inlining compute(), and specializing for
> >> > the likely path):
> >> >
> >> > int bar() {
> >> >   int *p = ptr.load(memory_order_consume);
> >> >   if (p == d) return 23;
> >> >   else ;
> >> > }
> >>
> >> Note that if a user writes
> >>
> >> if (p == d)
> >>   {
> >>     ... do lots of stuff via p ...
> >>   }
> >>
> >> GCC might rewrite accesses to p as accesses to d and thus expose
> >> those opportunities. Is that a transform that isn't valid then or is
> >> the code written by the user (establishing the equivalency) to blame?
> >
> > In the context of this memory_order_consume proposal, this transform
> > would be valid because the program has already "revealed" what value p
> > has after the branch has been taken.
> >
> >> There's a PR where this kind of equivalencies lead to unexpected (wrong?)
> >> points-to results for example.
> >>
> >> > Other potential examples that come to mind are de-virtualization, or
> >> > feedback-directed optimizations that have observed at runtime that a
> >> > certain pointer is likely to be always equal to some other pointer (eg.,
> >> > if p is almost always d[0], and specializing for that).
> >>
> >> Those are the cases that are quite important in practice.
> >
> > Could you quantify this somehow, even if it's a very rough estimate?
> > I'm asking because if it's significant and widely used, then this would
> > require users or compiler implementors to make a difficult trade-off
> > (ie, do you want mo_consume performance or performance through those
> > other optimizations?).
>
> Probably resolved by your followup on how the transform is safe anyway.
>
> >> > Also, it would be interesting to me to know how often we may turn data
> >> > dependencies into control dependencies in cases where this doesn't
> >> > affect performance significantly.
> >>
> >> I suppose we try to avoid that but can we ever know for sure? Like
> >> speculative devirtualization does this (with the intent that it _does_
> >> matter, of course).
> >>
> >> I suppose establishing non-dependence isn't an issue, like with the
> >> vectorizer adding runtime dependence checks and applying versioning
> >> to get a vectorized and a not vectorized loop (in case there are
> >> dependences)?
> >
> > I'm not sure I understand you correctly. Do you have a brief example,
> > perhaps? For mo_consume and its data dependencies, if there might be a
> > dependence, the compiler would have to preserve it; but I guess that
> > both a vectorized loop and one that accesses each element separately
> > would preserve dependences because both are doing those accesses, and they
> > depend on the input data.
> > OTOH, perhaps HW vector instructions don't get the ordering guarantees
> > from data dependences -- Paul, do you know of any such cases?
> A brief example would be for
>
> void foo (int *a, int *b, int n)
> {
>   for (int i = 0; i < n; ++i)
>     a[i] = b[i];
> }
>
> which we can vectorize like
>
> if (a + n < b || b + n < a)
>   {
>     vectorized loop
>   }
> else
>   {
>     not vectorized loop
>   }
>
> note how we're not establishing equivalences between pointers but
> non-dependence vs. possible dependence.

Thank you for the clarification, I will check. If I am not too confused,
such a loop might want to use x86 non-temporal stores, which require
special handling even with acquire and release, but that is presumably
already handled.

							Thanx, Paul

> >> > The background for this question is Paul McKenney's recently updated
> >> > proposal for a different memory_order_consume specification:
> >> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0190r0.pdf
> >> >
> >> > In a nutshell, this requires a compiler to either prove that a pointer
> >> > value is not carrying a dependency (simplified, its value somehow
> >> > originates from a memory_order_consume load), or it has to
Re: RFC: Update Intel386, x86-64 and IA MCU psABIs for passing/returning empty struct
On Tue, Feb 23, 2016 at 5:14 PM, Richard Smith wrote: > On Tue, Feb 23, 2016 at 8:28 AM, H.J. Lu wrote: >> On Tue, Feb 23, 2016 at 8:15 AM, Michael Matz wrote: >>> Hi, >>> >>> On Tue, 23 Feb 2016, H.J. Lu wrote: >>> I thought --- An empty type is a type where it and all of its subobjects (recursively) are of class, structure, union, or array type. --- excluded struct empty { empty () = default; }; >>> >>> >>> Why would that be excluded? There are no subobjects, hence all of them >>> are of class, structure, union or array type, hence this is an empty type. >>> (And that's good, it indeed looks quite empty to me). Even if you would >>> add a non-trivial copy ctor, making this thing not trivially copyable >>> anymore, it would still be empty. Hence, given your proposed language in >>> the psABI, without reference to any other ABI (in particular not to the >>> Itanium C++ ABI), you would then need to pass it without registers. That >>> can't be done, and that's exactly why I find that wording incomplete. It >>> needs implicit references to other languages ABIs to work. >>> >> >> It is clear to me now. Let's go with >> >> --- >> An empty type is a type where it and all of its subobjects (recursively) >> are of class, structure, union, or array type. No memory slot nor >> register should be used to pass or return an object of empty type that's >> trivially copyable. >> --- >> >> Any comments? > > Yes. "trivially copyable" is the wrong restriction. See > http://mentorembedded.github.io/cxx-abi/abi.html#normal-call for the > actual Itanium C++ ABI rule. I looked it up. " trivially copyable" is covered by C++ ABI. > It's also completely nonsensical to mention this as a special case in > relation to empty types. The special case applies to all function > parameters, irrespective of whether they're empty -- this rule applies > *long* before you consider whether the type is empty. 
> For instance, in the x86-64 psABI, this should go right at the start
> of section 2.2.3 ("Parameter Passing and Returning Values"). But
> please don't add it there -- it's completely redundant, as section 5.1
> already says that the Itanium C++ ABI is used, so it's not necessary
> to duplicate rules from there.

Here is the final wording:

An empty type is a type where it and all of its subobjects (recursively)
are of class, structure, union, or array type. No memory slot nor
register should be used to pass or return an object of empty type.

Footnote: An array of empty type can only be passed by reference in C
and C++.

Michael, can you put it in the x86-64 psABI? I will update the i386 and
IA MCU psABIs.

Thanks.

-- H.J.
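To make the quoted wording concrete, here is a small C++ sketch (type names are mine). Note that the psABI notion of "empty type" is broader than the library trait std::is_empty, which is already false for a class with any non-static data member, even one of empty class type:

```cpp
#include <type_traits>

// Illustrating the definition above: an "empty type" is one where the
// object and all of its subobjects, recursively, are of class,
// structure, union, or array type -- i.e. no scalar leaf exists.
struct Empty {};                            // empty: no subobjects at all
struct AlsoEmpty { Empty e; Empty a[3]; };  // empty per the psABI wording:
                                            // every subobject is an empty class
struct NotEmpty { int x; };                 // not empty: has a scalar subobject

static_assert(std::is_empty<Empty>::value,
              "no members, so also empty in the std::is_empty sense");
static_assert(!std::is_empty<AlsoEmpty>::value,
              "std::is_empty is narrower: it rejects any data member");
static_assert(!std::is_empty<NotEmpty>::value,
              "a scalar member disqualifies the type");
```

Under the proposed rule, objects of Empty and AlsoEmpty would consume no register or memory slot when passed or returned, even though sizeof() is still at least 1 in C++.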
Committing via git
Hi Is there something special needed to commit via git? I got an odd error pushing some minor RTEMS patches and wondered what the proper procedure was. I am using the same commands and process I use with newlib so was wondering. The website has svn instructions so maybe I am just confused after being up too long. --joel
Re: Committing via git
On Fri, Feb 26, 2016 at 9:25 AM, Joel Sherrill wrote: > Hi > > Is there something special needed to commit via git? I got an odd error > pushing some minor RTEMS patches and wondered what the proper procedure was. What is "odd error"? H.J.
Re: Committing via git
On 26 February 2016 at 17:25, Joel Sherrill wrote: > Hi > > Is there something special needed to commit via git? I got an odd error > pushing some minor RTEMS patches and wondered what the proper procedure was. > > I am using the same commands and process I use with newlib so was wondering. > > The website has svn instructions so maybe I am just confused after being up > too long. GCC sources are still subversion. The trunk (aka master) and gcc-*-branch branches in Git are read-only, you can only push to personal branches.
Re: Committing via git
On 2/26/2016 11:50 AM, Jonathan Wakely wrote: On 26 February 2016 at 17:25, Joel Sherrill wrote: Hi Is there something special needed to commit via git? I got an odd error pushing some minor RTEMS patches and wondered what the proper procedure was. I am using the same commands and process I use with newlib so was wondering. The website has svn instructions so maybe I am just confused after being up too long. GCC sources are still subversion. The trunk (aka master) and gcc-*-branch branches in Git are read-only, you can only push to personal branches. Well that would certainly explain why a git push to master didn't work. :) Sorry for the stupidity. Thanks. --joel
Re: Committing via git
On 02/26/2016 12:18 PM, Joel Sherrill wrote: On 2/26/2016 11:50 AM, Jonathan Wakely wrote: On 26 February 2016 at 17:25, Joel Sherrill wrote: Hi Is there something special needed to commit via git? I got an odd error pushing some minor RTEMS patches and wondered what the proper procedure was. I am using the same commands and process I use with newlib so was wondering. The website has svn instructions so maybe I am just confused after being up too long. GCC sources are still subversion. The trunk (aka master) and gcc-*-branch branches in Git are read-only, you can only push to personal branches. Well that would certainly explain why a git push to master didn't work. :) Yup. Many folks are successfully using git-svn. There are instructions somewhere on the gcc.gnu.org site for setting that up. jeff
Re: Importance of transformations that turn data dependencies into control dependencies?
On 02/24/2016 05:14 AM, Richard Biener wrote:

Note that if a user writes

if (p == d)
  {
    ... do lots of stuff via p ...
  }

GCC might rewrite accesses to p as accesses to d and thus expose those opportunities. Is that a transform that isn't valid then or is the code written by the user (establishing the equivalency) to blame?

I think Torvald later clarified this wasn't a problem. If it was, then it wouldn't be hard to disable this for pointers and I doubt the impact would be measurable.

There's a PR where this kind of equivalencies lead to unexpected (wrong?) points-to results for example.

Yea, I've watched a couple discussions around this issue. I don't recall them reaching any conclusion about the validity of the testcases. If the final determination is that such equivalence propagation is unsafe for the points-to aliasing system, we can just disable those pointer equivalence tracking bits.

Other potential examples that come to mind are de-virtualization, or feedback-directed optimizations that have observed at runtime that a certain pointer is likely to be always equal to some other pointer (eg., if p is almost always d[0], and specializing for that).

Those are the cases that are quite important in practice.

I was most concerned about de-virt and feedback stuff that specializes paths based on expected values.

Jeff
Re: Validity of SUBREG+AND-imm transformations
On 02/26/2016 06:40 AM, Kyrill Tkachov wrote:

Hi all, I'm looking at a case where some RTL passes create an RTL expression of the form:

(subreg:QI (and:SI (reg:SI x1) (const_int 31)) 0)

which I'd like to simplify to:

(and:QI (subreg:QI (reg:SI x1) 0) (const_int 31))

I can think of cases where the first is better and other cases where the second is better -- a lot depends on context. I don't have a good sense for which is better in general.

Note that as-written these don't trigger the subtle issues in what happens with upper bits. That's more for extensions:

(subreg:SI (whatever:QI))
vs ({zero,sign}_extend:SI (whatever:QI))
vs (and:SI (subreg:SI (whatever:QI) 0) (const_int 0xff))

The first leaves the bits beyond QI as "undefined" and sometimes (but I doubt all that often in practice) the compiler will use the undefined nature of those bits to enable optimizations. The second and third variants crisply define the upper bits.

It's easy enough to express in RTL but I'm trying to convince myself of its validity. I know there are some subtle points in this area. combine_simplify_rtx in combine.c has a comment:

/* Note that we cannot do any narrowing for non-constants since
   we might have been counting on using the fact that some bits were
   zero. We now do this in the SET. */

That comment makes no sense. Unfortunately it goes back to a change from Kenner in 1994 -- which predates having patch discussions here and consistently adding tests to the testsuite. The code used to do this:

-  if (GET_MODE_CLASS (mode) == MODE_INT
-      && GET_MODE_CLASS (GET_MODE (SUBREG_REG (x))) == MODE_INT
-      && GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (SUBREG_REG (x)))
-      && subreg_lowpart_p (x))
-    return force_to_mode (SUBREG_REG (x), mode, GET_MODE_MASK (mode),
-                          NULL_RTX, 0);

Which appears to check that we've got a narrowing subreg expression, and if we do, try to force the SUBREG_REG into the right mode using force_to_mode.
But if we had a narrowing SUBREG_REG, then I can't see how anything would have been depending on the upper bits being zero.

and if I try to implement this transformation in simplify_subreg from simplify-rtx.c I get some cases where combine goes into an infinite recursion in simplify_comparison because it tries to do:

/* If this is (and:M1 (subreg:M1 X:M2 0) (const_int C1)) where C1
   fits in both M1 and M2 and the SUBREG is either paradoxical
   or represents the low part, permute the SUBREG and the AND
   and try again. */

Right. I think you just end up ping-ponging between the two equivalent representations. Which may indeed argue that the existing representation is preferred and we should look deeper into why the existing representation isn't being handled as well as it should be.

Performing this transformation would help a lot with recognition of some patterns that I'm working on, so would it be acceptable to teach combine or simplify-rtx to do this?

How does it help recognition? What kinds of patterns are you trying to recognize?

jeff
Warning for converting (possibly) negative float/double to unsigned int
Perhaps this question is appropriate for the gcc mail list.

Converting a float/double to unsigned int is undefined if the truncated value cannot be represented in the unsigned type -- in particular, if it is negative. x86-64 and arm treat this condition differently---x86-64 returns a value whose bit pattern is the same as the bit pattern for converting to signed int, and arm returns zero. So it would be nice to have a warning that this will (or could) happen. I couldn't find such a warning in the GCC manual or in the GCC code base.

Looking through the code, it seemed it might go in this code in force_operand() in expr.c on mainline:

  if (UNARY_P (value))
    {
      if (!target)
        target = gen_reg_rtx (GET_MODE (value));
      op1 = force_operand (XEXP (value, 0), NULL_RTX);
      switch (code)
        {
        case ZERO_EXTEND:
        case SIGN_EXTEND:
        case TRUNCATE:
        case FLOAT_EXTEND:
        case FLOAT_TRUNCATE:
          convert_move (target, op1, code == ZERO_EXTEND);
          return target;

        case FIX:
        case UNSIGNED_FIX:
          expand_fix (target, op1, code == UNSIGNED_FIX);
          return target;

        case FLOAT:
        case UNSIGNED_FLOAT:
          expand_float (target, op1, code == UNSIGNED_FLOAT);
          return target;

        default:
          return expand_simple_unop (GET_MODE (value), code, op1, target, 0);
        }
    }

But maybe not. Any advice on how to proceed? I'd be willing to write and test the few lines of code myself if I knew where to put them.

Brad
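Until such a warning exists, callers have to range-check before converting. A minimal sketch of a portable conversion (the helper name and the saturating/clamping policy are my own choices, not anything GCC provides):

```cpp
#include <cmath>
#include <cstdint>

// (uint32_t)d is undefined behavior when the truncated value of d is
// outside [0, 2^32 - 1]: x86-64 happens to produce the signed-conversion
// bit pattern and ARM happens to produce 0, but neither is guaranteed.
// This helper checks the range first and saturates out-of-range inputs.
uint32_t to_u32_saturating(double d) {
  if (std::isnan(d) || d <= -1.0)   // truncated value would be negative
    return 0;
  if (d >= 4294967296.0)            // 2^32: too large for uint32_t
    return UINT32_MAX;
  return static_cast<uint32_t>(d);  // now in range, so well defined
}
```

Values in (-1, 0) are allowed through because truncation toward zero yields 0, which is representable; only at -1.0 and below does the conversion itself become undefined.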
Re: [isocpp-parallel] Proposal for new memory_order_consume definition
On 2/25/16, Hans Boehm wrote: > If carries_dependency affects semantics, then it should not be an > attribute. > > The original design, or at least my understanding of it, was that it not > have semantics; it was only a suggestion to the compiler that it should > preserve dependencies instead of inserting a fence at the call site. > Dependency-based ordering would be preserved in either case. Yes, but there is a performance penalty, though I do not know how severe. When do the pragmatics become sufficiently severe that they become semantics? > But I think we're moving away from that view towards something that doesn't > quietly add fences. > > I do not think we can quite get away with defining a dependency in a way > that is unconditionally preserved by existing compilers, and thus I think > that we do probably need annotations along the dependency path. I just > don't see a way to otherwise deal with the case in which a compiler infers > an equivalent pointer and dereferences that instead of the original. This > can happen under so many (unlikely but) hard-to-define conditions that it > seems undefinable in an implementation-independent manner. "If the > implementation is able then " is, in my opinion, not > acceptable standards text. > > Thus I see no way to both avoid adding syntax to functions that preserve > dependencies and continue to allow existing transformations that remove > dependencies we care about, e.g. due to equality comparisons. We can > hopefully ensure that without annotations compilers break things with very > low probability, so that there is a reasonable path forward for existing > code relying on dependency ordering (which currently also breaks with very > low probability unless you understand what the compiler is doing). But I > don't see a way for the standard to guarantee correctness without the added > syntax (or added optimization constraints that effectively assume all > functions were annotated). 
> > On Sat, Feb 20, 2016 at 11:53 AM, Paul E. McKenney < > paul...@linux.vnet.ibm.com> wrote: > >> On Fri, Feb 19, 2016 at 09:15:16PM -0500, Tony V E wrote: >> > There's at least one easy answer in there: >> > >> > > If implementations must support annotation, what form should that >> > annotation take? P0190R0 recommends the [[carries_dependency]] >> > attribute, but I am not picky as long as it can be (1) applied >> > to all relevant pointer-like objects and (2) used in C as well >> > as C++. ;-) >> > >> > If an implementation must support it, then it is not an annotation but >> > a >> keyword. So no [[]] >> >> I would be good with that approach, especially if the WG14 continues >> to stay away from annotations. >> >> For whatever it is worth, the introduction of intrinsics for comparisons >> that avoid breaking dependencies enables the annotation to remain >> optional. >> >> Thanx, Paul >> >> > Sent from my BlackBerry portable Babbage Device >> > Original Message >> > From: Paul E. McKenney >> > Sent: Thursday, February 18, 2016 4:58 AM >> > To: paral...@lists.isocpp.org; linux-ker...@vger.kernel.org; >> linux-a...@vger.kernel.org; gcc@gcc.gnu.org; llvm-...@lists.llvm.org >> > Reply To: paral...@lists.isocpp.org >> > Cc: pet...@infradead.org; j.algl...@ucl.ac.uk; will.dea...@arm.com; >> dhowe...@redhat.com; ramana.radhakrish...@arm.com; luc.maran...@inria.fr; >> a...@linux-foundation.org; peter.sew...@cl.cam.ac.uk; >> torva...@linux-foundation.org; mi...@kernel.org >> > Subject: [isocpp-parallel] Proposal for new memory_order_consume >> definition >> > >> > Hello! >> > >> > A proposal (quaintly identified as P0190R0) for a new >> memory_order_consume >> > definition may be found here: >> > >> > http://www2.rdrop.com/users/paulmck/submission/consume.2016.02.10b.pdf >> > >> > As requested at the October C++ Standards Committee meeting, this >> > is a follow-on to P0098R1 that picks one alternative and describes >> > it in detail. 
This approach focuses on existing practice, with the >> > goal of supporting existing code with existing compilers. In the last >> > clang/LLVM patch I saw for basic support of this change, you could >> > count >> > the changed lines and still have lots of fingers and toes left over. >> > Those who have been following this story will recognize that this is >> > a very happy contrast to work that would be required to implement the >> > definition in the current standard. >> > >> > I expect that P0190R0 will be discussed at the upcoming C++ Standards >> > Committee meeting taking place the week of February 29th. Points of >> > discussion are likely to include: >> > >> > o May memory_order_consume dependency ordering be used in >> > unannotated code? I believe that this must be the case, >> > especially given that this is our experience base. P0190R0 >> > therefore recommends this approach. >> > >> > o If memory_order_consume dependency ordering can be used in >