Re: [PATCH] D21162: [CUDA] Implement __shfl* intrinsics in clang headers.

2016-06-09 Thread Justin Holewinski via cfe-commits
jholewinski added a comment.

Looks reasonable to me.


http://reviews.llvm.org/D21162



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [PATCH] D20389: NVPTX: Add supported CL features

2016-06-17 Thread Justin Holewinski via cfe-commits
jholewinski accepted this revision.
jholewinski added a comment.
This revision is now accepted and ready to land.

Looks good to me


Repository:
  rL LLVM

http://reviews.llvm.org/D20389





[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-07-29 Thread Justin Holewinski via cfe-commits


@@ -26,24 +27,38 @@ static cl::opt<bool>
 NoF16Math("nvptx-no-f16-math", cl::Hidden,
   cl::desc("NVPTX Specific: Disable generation of f16 math ops."),
   cl::init(false));
+static cl::opt
+NextSM("nvptx-next-sm", cl::Hidden,
+   cl::desc("NVPTX Specific: Override SM ID for sm_next."),
+   cl::init(90));
+static cl::opt
+NextPTX("nvptx-next-ptx", cl::Hidden,
+cl::desc("NVPTX Specific: Override PTX version for sm_next."),
+cl::init(85));
+
 // Pin the vtable to this file.
 void NVPTXSubtarget::anchor() {}
 
 NVPTXSubtarget &NVPTXSubtarget::initializeSubtargetDependencies(StringRef CPU,
 StringRef FS) {
-// Provide the default CPU if we don't have one.
-TargetName = std::string(CPU.empty() ? "sm_30" : CPU);
+  // Provide the default CPU if we don't have one.
+  TargetName = std::string(CPU.empty() ? "sm_30" : CPU);
 
-ParseSubtargetFeatures(TargetName, /*TuneCPU*/ TargetName, FS);
+  ParseSubtargetFeatures(TargetName, /*TuneCPU*/ TargetName, FS);
+  if (TargetName == "sm_next") {
+    TargetName = "sm_" + itostr(NextSM);
+    FullSmVersion = NextSM * 10;

jholewinski wrote:

It would be good to support architecture conditional targets, e.g. `sm_90a`, 
with this feature.
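
As a rough illustration of what handling the suffix could involve, here is a minimal standalone sketch that splits a name such as `sm_90a` into a numeric SM version and an architecture-conditional flag. The `ParsedSmTarget` type and `parseSmTarget` function are hypothetical names made up for this example; they are not part of the NVPTX backend or of this PR, and the sketch assumes C++17.

```cpp
#include <cctype>
#include <optional>
#include <string_view>

// Hypothetical result type for illustration; not an NVPTX backend type.
struct ParsedSmTarget {
  unsigned SmVersion;   // e.g. 90 for both "sm_90" and "sm_90a"
  bool ArchConditional; // true when the name carries the trailing 'a'
};

// Parses "sm_<digits>" with an optional trailing 'a'.
// Returns std::nullopt for any other spelling (including "sm_next").
inline std::optional<ParsedSmTarget> parseSmTarget(std::string_view Name) {
  constexpr std::string_view Prefix = "sm_";
  if (Name.substr(0, Prefix.size()) != Prefix)
    return std::nullopt;
  Name.remove_prefix(Prefix.size());

  bool ArchConditional = false;
  if (!Name.empty() && Name.back() == 'a') {
    ArchConditional = true;
    Name.remove_suffix(1);
  }

  // Require at least one digit and keep the value small enough not to overflow.
  if (Name.empty() || Name.size() > 4)
    return std::nullopt;

  unsigned Version = 0;
  for (char C : Name) {
    if (!std::isdigit(static_cast<unsigned char>(C)))
      return std::nullopt;
    Version = Version * 10 + static_cast<unsigned>(C - '0');
  }
  return ParsedSmTarget{Version, ArchConditional};
}
```

With a helper along these lines, an override mechanism could carry the `a` suffix through to the final target name instead of only accepting a bare number.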

https://github.com/llvm/llvm-project/pull/100247


[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-07-29 Thread Justin Holewinski via cfe-commits

jholewinski wrote:

I'm wondering if this feature would be better named `sm_custom` or similar. The 
`sm_next` moniker implies that the target is for a _future_ architecture 
target, but this feature can be used to inject any custom SM/PTX combination. 
Especially if this is extended to support the architecture conditional suffix, 
e.g. `sm_90a`, which is more a variant on an existing target rather than a 
"next" target.

https://github.com/llvm/llvm-project/pull/100247


[clang] [lld] [llvm] [mlir] [IR] Introduce `T` to `DataLayout` to represent flat address space if a target supports it (PR #108786)

2024-09-16 Thread Justin Holewinski via cfe-commits

jholewinski wrote:

>> If you specify a flat address space, does that mean that all other address 
>> spaces are not flat, and thus cannot alias with other address spaces?

> Yes, all other address spaces are not flat. A flat address space pointer can 
> still point to the same place as a non-flat address space pointer, so it 
> doesn't guarantee no alias.

Just to clarify, does this mean any two non-flat address space pointers 
_cannot_ alias?

> I think conceptually we should describe AS hierarchy explicitly, and avoid 
> the assumptions on their number or layout.
> E.g. T0:1,2,3,4,5 may mean AS 0 is a flat superset of AS 1,2,3,4,5.

This definitely feels more expressive, though I'm still concerned about the 
(no-)aliasing guarantees. It's useful to have two non-flat address spaces that 
_can_ alias, for example two address spaces that may touch the same underlying 
memory using two vastly different hardware paths. For GPUs, this could be 
global and texture memory. It may technically be the same memory and can alias, 
even though the instructions and hardware paths used to access it are very 
different.
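
To make the concern concrete, here is a small sketch of an explicit, symmetric may-alias relation between address spaces, kept separate from the flat/non-flat distinction. Everything in it is an assumption for illustration: the `AddrSpace` enumerators, the `AliasTable` class, and the choice of which pairs overlap are invented for this example and do not reflect any existing LLVM API or the proposed `DataLayout` syntax.

```cpp
#include <cstddef>

// Hypothetical address spaces, numbered contiguously for the example only.
enum class AddrSpace : std::size_t { Flat, Global, Shared, Texture, Count };

// Symmetric may-alias relation. Every address space may alias itself;
// all other pairs are assumed disjoint until declared otherwise.
class AliasTable {
public:
  AliasTable() {
    for (std::size_t I = 0; I < N; ++I)
      for (std::size_t J = 0; J < N; ++J)
        Table[I][J] = (I == J);
  }

  // Records that pointers in A and B may refer to the same memory.
  void setMayAlias(AddrSpace A, AddrSpace B) {
    Table[idx(A)][idx(B)] = true;
    Table[idx(B)][idx(A)] = true;
  }

  bool mayAlias(AddrSpace A, AddrSpace B) const {
    return Table[idx(A)][idx(B)];
  }

private:
  static constexpr std::size_t N = static_cast<std::size_t>(AddrSpace::Count);
  static std::size_t idx(AddrSpace A) { return static_cast<std::size_t>(A); }
  bool Table[N][N];
};

// Example configuration: the flat space overlaps every other space, and
// global and texture memory overlap each other even though neither is flat.
inline AliasTable makeExampleTable() {
  AliasTable T;
  T.setMayAlias(AddrSpace::Flat, AddrSpace::Global);
  T.setMayAlias(AddrSpace::Flat, AddrSpace::Shared);
  T.setMayAlias(AddrSpace::Flat, AddrSpace::Texture);
  T.setMayAlias(AddrSpace::Global, AddrSpace::Texture);
  return T;
}
```

In this sketch, global and texture may alias while global and shared may not, which is the kind of relation a single "flat" marker alone cannot express.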

https://github.com/llvm/llvm-project/pull/108786


[clang] [llvm] [mlir] [NVPTX] Convert scalar function nvvm.annotations to attributes (PR #125908)

2025-02-10 Thread Justin Holewinski via cfe-commits

https://github.com/jholewinski approved this pull request.


https://github.com/llvm/llvm-project/pull/125908


[clang] [llvm] [mlir] [NVPTX] Convert scalar function nvvm.annotations to attributes (PR #125908)

2025-02-10 Thread Justin Holewinski via cfe-commits


@@ -375,11 +375,8 @@ void CodeGenModule::handleCUDALaunchBoundsAttr(llvm::Function *F,
 if (MinBlocks > 0) {
   if (MinBlocksVal)
 *MinBlocksVal = MinBlocks.getExtValue();
-  if (F) {
-// Create !{<func-ref>, metadata !"minctasm", i32 <val>} node
-NVPTXTargetCodeGenInfo::addNVVMMetadata(F, "minctasm",
-MinBlocks.getExtValue());
-  }
+  if (F)
+F->addFnAttr("nvvm.minctasm", llvm::utostr(MinBlocks.getExtValue()));

jholewinski wrote:

We should eventually create a list of strings for valid `nvvm.` function 
attributes and use them here instead of hard-coding strings. It would serve as 
a single-source-of-truth for the set of valid attributes. Not necessary for 
this PR, but something to consider for the future.
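
One possible shape for such a list, sketched as a standalone C++17 header. The `nvvm_attr` namespace, the constant names, and `isValid` are hypothetical and do not exist in LLVM; only `nvvm.minctasm` is taken from the diff above, and the second entry is an assumed example included to show how the list would grow.

```cpp
#include <algorithm>
#include <array>
#include <string_view>

// Hypothetical central definition of nvvm.* function attribute names.
namespace nvvm_attr {

// Taken from the diff above.
inline constexpr std::string_view MinCTASm = "nvvm.minctasm";
// Assumed example entry, included only to illustrate a second attribute.
inline constexpr std::string_view MaxNReg = "nvvm.maxnreg";

// The complete set of names this sketch treats as valid.
inline constexpr std::array<std::string_view, 2> All = {MinCTASm, MaxNReg};

// Returns true if Name is one of the attribute names listed in All.
inline bool isValid(std::string_view Name) {
  return std::find(All.begin(), All.end(), Name) != All.end();
}

} // namespace nvvm_attr
```

Call sites would then refer to a named constant such as `nvvm_attr::MinCTASm` rather than repeating the string literal, and a verifier could use `isValid` to reject misspelled names.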

https://github.com/llvm/llvm-project/pull/125908