[PATCH] D116203: [clang] adds unary type transformations as compiler built-ins

Richard Smith - zygoloid via Phabricator via cfe-commits Tue, 14 Jun 2022 13:58:33 -0700

rsmith added inline comments.


================
Comment at: clang/lib/AST/ASTContext.cpp:5863
   // FIXME: derive from "Target" ?
-  return WCharTy;
+  return IntTy;
 }
----------------
This change seems surprising. Can you explain what's going on here?


================
Comment at: clang/lib/AST/ASTContext.cpp:10712-10713
 QualType ASTContext::getCorrespondingUnsignedType(QualType T) const {
-  assert((T->hasSignedIntegerRepresentation() || T->isSignedFixedPointType()) 
&&
+  assert((T->hasSignedIntegerRepresentation() || T->isIntegerType() ||
+          T->isEnumeralType() || T->isSignedFixedPointType()) &&
          "Unexpected type");
----------------
If we now allow unsigned types to be passed in here, can we do so consistently?

This seems to do the wrong thing for a vector of scoped enumeration type. Is 
that a problem for the builtins?


================
Comment at: clang/lib/AST/ASTContext.cpp:10731-10759
+  case BuiltinType::Char_U:
   case BuiltinType::Char_S:
   case BuiltinType::SChar:
+  case BuiltinType::UChar:
+  case BuiltinType::Char8:
     return UnsignedCharTy;
   case BuiltinType::Short:
----------------
(Continuation of suggested edits in comment a few lines below.)


================
Comment at: clang/lib/AST/ASTContext.cpp:10785-10786
     return SatUnsignedLongFractTy;
   default:
-    llvm_unreachable("Unexpected signed integer or fixed point type");
+    llvm_unreachable("Unexpected integer or signed fixed point type");
   }
----------------
Perhaps we can handle all the plain unsigned types (other than `wchar_t` and 
`char`, which are special) here? This would also cover the fixed-point types.


================
Comment at: clang/lib/AST/ASTContext.cpp:10791
 QualType ASTContext::getCorrespondingSignedType(QualType T) const {
-  assert((T->hasUnsignedIntegerRepresentation() ||
-          T->isUnsignedFixedPointType()) &&
+  assert((T->hasUnsignedIntegerRepresentation() || T->isIntegerType() ||
+          T->isEnumeralType() || T->isUnsignedFixedPointType()) &&
----------------
Similar comments for this function as previous one.


================
Comment at: clang/lib/AST/TypePrinter.cpp:1158
                                            raw_ostream &OS) {
   IncludeStrongLifetimeRAII Strong(Policy);
 }
----------------
Remove this line too. No point building an RAII scope with nothing in it.


================
Comment at: clang/lib/Parse/ParseExpr.cpp:1750
+#undef TRANSFORM_TYPE_TRAIT_DEF
+    if (NextToken().is(tok::less)) {
+      Tok.setKind(tok::identifier);
----------------
cjdb wrote:
> rsmith wrote:
> > Here you're checking for `trait<` and treating it as an identifier; 
> > elsewhere you're checking for `trait(` and treating it as a builtin. Is 
> > there a reason to treat `trait` followed by a token other than `(` or `<` 
> > inconsistently?
> The parser needs to treat `trait<` as an identifier to accommodate 
> libstdc++'s usage of a few of these as alias templates. I've added a comment 
> to explain why in the code.
I agree we need to treat `trait<` as an identifier and `trait(` as the builtin. 
My question is, why do we have inconsistent treatment of the remaining cases 
(`trait` followed by any other token)? For example, 
`MaybeParseTypeTransformTypeSpecifier` treats `trait + 1` as an identifier. But 
this code treats it as the builtin name.

I think there are two choices that make the most sense, if they work:

1) `trait(` is the builtin and any other utterance is an identifier, or
2) `trait<` is an identifier, `using trait =` is an identifier, and any other 
utterance is the builtin.

I think I prefer option (2), given that it's defining the narrower special 
case. But all I'm really looking for here is a consistent story one way or the 
other, if it's feasible to have one.


================
Comment at: clang/lib/Sema/SemaType.cpp:9160-9164
+  Qualifiers Quals = Pointee.getQualifiers();
+  Quals.removeAddressSpace();
+  return Context.getUnaryTransformType(
+      BaseType,
+      QualType(Pointee.getSplitUnqualifiedType().Ty, Quals.getAsOpaqueValue()),
----------------
You're splitting the type into unqualified type and quailfiers twice here, once 
in the call to `getQualifiers` and again in `getSplitUnqualifiedType`. It'd be 
better to only split it once.


================
Comment at: clang/lib/Sema/SemaType.cpp:9161
+  Qualifiers Quals = Pointee.getQualifiers();
+  Quals.removeAddressSpace();
+  return Context.getUnaryTransformType(
----------------
Do we really want to remove the address space here? The libc++ and libstdc++ 
implementations of the trait don't do that (https://godbolt.org/z/jqafYP6er) 
and it makes the behavior of `__remove_pointer` inconsistent with the behavior 
of `__add_pointer`. Can we just produce the pointee type intact, including all 
of its qualifiers?


================
Comment at: clang/lib/Sema/SemaType.cpp:9164
+      BaseType,
+      QualType(Pointee.getSplitUnqualifiedType().Ty, Quals.getAsOpaqueValue()),
+      UKind);
----------------
It's not correct to treat a `Qualifiers` opaque value as if it has meaning for 
anything other than converting back from an opaque value. The second argument 
here is a cvr mask, which is not the same thing.

Use `Context.getQualifiedType` to combine a type with some qualifiers.


================
Comment at: clang/lib/Sema/SemaType.cpp:9175-9183
+  Qualifiers Quals = Underlying.getQualifiers();
+  // std::decay is supposed to produce 'std::remove_cv', but since 'restrict' 
is
+  // in the same group of qualifiers as 'const' and 'volatile', we're extending
+  // '__decay(T)' so that it also removes '__restrict' in C++.
+  Quals.removeCVRQualifiers();
+  return Context.getUnaryTransformType(
+      BaseType,
----------------
As above, you can't use `getAsOpaqueValue` like this, and should avoid 
splitting the type twice.

Note that the pattern used here is only correct because we know that 
`Underlying` cannot be an array type.


================
Comment at: clang/lib/Sema/SemaType.cpp:9178
+  // in the same group of qualifiers as 'const' and 'volatile', we're extending
+  // '__decay(T)' so that it also removes '__restrict' in C++.
+  Quals.removeCVRQualifiers();
----------------
Why "in C++"? It looks like it does this in C too, which is probably 
appropriate. However, this is a divergence from what `std::decay_t` does in 
libc++ and libstdc++, where it does not remove `__restrict`.


================
Comment at: clang/lib/Sema/SemaType.cpp:9225-9236
+  assert(LangOpts.CPlusPlus);
+  QualType Underlying = BaseType.getNonReferenceType();
+  Qualifiers Quals = Underlying.getQualifiers();
+  if (UKind == UnaryTransformType::RemoveCVRef) {
+    Quals.removeConst();
+    Quals.removeVolatile();
+  }
----------------
As before, the way you were splitting and reconstructing types wasn't correct.


================
Comment at: clang/lib/Sema/SemaType.cpp:9241-9245
+  Qualifiers Quals;
+  QualType Unqual = Context.getUnqualifiedArrayType(BaseType, Quals);
+  if ((BaseType->isReferenceType() && UKind != UTTKind::RemoveRestrict) ||
+      BaseType->isFunctionType())
+    return Context.getUnaryTransformType(BaseType, BaseType, UKind);
----------------
LLVM style is to restrict the scopes of variables as much as possible, and 
there's no point splitting the type if we're not going to use the result.


================
Comment at: clang/lib/Sema/SemaType.cpp:9267-9274
+  QualType Underlying =
+      BaseType->isBitIntType()
+          ? Context.getBitIntType(!IsMakeSigned, Context.getIntWidth(BaseType))
+      : BaseType->isEnumeralType()
+          ? Context.getIntTypeForBitwidth(Context.getIntWidth(BaseType),
+                                          IsMakeSigned)
+      : IsMakeSigned ? Context.getCorrespondingSignedType(BaseType)
----------------
With the changes suggested for `getCorresponding*Type`, I think we can remove 
the `BitInt` special case here.


================
Comment at: clang/lib/Sema/SemaType.cpp:9196-9205
+  if (!BaseType->isArrayType())
+    return Context.getUnaryTransformType(BaseType, BaseType, UKind);
+  const ArrayType *ArrayType = BaseType->getAsArrayTypeUnsafe();
+  QualType ElementType =
+      UKind == UnaryTransformType::RemoveAllExtents
+          ? QualType(ArrayType->getBaseElementTypeUnsafe(), {})
+          : ArrayType->getElementType();
----------------
rsmith wrote:
> This looks like it'll lose all qualifiers other than CVR qualifiers. 
> Presumably that's not the intent. (This will also unnecessarily remove type 
> sugar in some cases, but that's not super important here.) Please also don't 
> use copy-list-initialization to form a `QualType`; see 
> https://llvm.org/docs/CodingStandards.html#do-not-use-braced-initializer-lists-to-call-a-constructor
> 
> See the suggested edits for an alternative, simpler formulation (that 
> preserves as much type sugar as possible).
> 
Looks like you forgot to remove the old code when adding the new code?


================
Comment at: clang/lib/Sema/SemaType.cpp:9251
+      BaseType->isBooleanType()) {
+    Diag(Loc, diag::err_make_signed_integral_only) << IsMakeSigned << BaseType;
+    return QualType();
----------------
cjdb wrote:
> cjdb wrote:
> > rsmith wrote:
> > > Should we support vector types here?
> > Is it a conforming extension for `make_signed_t<int 
> > __attribute__((ext_vector_type(2)))>` to work?
> Gentle ping.
Yes, it's a conforming extension compared to C++20; the standard places no 
restrictions on what a program that uses a vendor extension type such as a 
vector type does.

On reflection, I'm not sure it would be a conforming extension to OpenCL (in 
which such vector types exist as part of the language standard -- in 
particular, `ext_vector_type` is intended to follow the OpenCL rules). So let's 
not support such things until / unless we get some direction from the OpenCL 
folks.


================
Comment at: clang/lib/Sema/SemaType.cpp:9265-9266
+          ? Context.getBitIntType(!IsMakeSigned, Context.getIntWidth(BaseType))
+          : Context.getIntTypeForBitwidth(Context.getIntWidth(BaseType),
+                                          IsMakeSigned);
+  Qualifiers Quals = Underlying.getQualifiers();
----------------
cjdb wrote:
> rsmith wrote:
> > This is wrong: if, say, `int` and `long` are the same bit-width, this will 
> > give the same type for `__make_unsigned(int)` and `__make_unsigned(long)`, 
> > where they are required to produce `unsigned int` and `unsigned long` 
> > respectively.
> > 
> > Please look at `Context.getCorrespondingSignedType` and 
> > `Context.getCorrespondingUnsignedType` to see how to do this properly; you 
> > may be able to call those functions directly in some cases, but be careful 
> > about the cases where we're required to return the lowest-rank int type of 
> > the given size. Note that `getIntTypeForBitwidth` does *not* do that; 
> > rather, it produces the *preferred* type of the given width, and for 
> > WebAssembly and AVR it produces something other than the lowest-rank type 
> > in practice in some cases.
> This makes me very happy.
The comment on this producing the wrong underlying type for enums in some cases 
doesn't seem to be addressed. You need to produce the lowest-rank type of the 
given size, which is not what `getIntTypeForBitwidth` does.


================
Comment at: clang/lib/Sema/SemaType.cpp:6023
     void VisitUnaryTransformTypeLoc(UnaryTransformTypeLoc TL) {
-      // FIXME: This holds only because we only have one unary transform.
-      assert(DS.getTypeSpecType() == DeclSpec::TST_underlyingType);
+      // Make sure it is a unary transform type
+      assert(DS.getTypeSpecType() >= DeclSpec::TST_underlyingType &&
----------------
aaron.ballman wrote:
> 
I don't think this comment pulls its weight any more, now that you're calling a 
named function for this check. Maybe just remove it?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116203/new/

https://reviews.llvm.org/D116203

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D116203: [clang] adds unary type transformations as compiler built-ins

Reply via email to