gemini-code-assist[bot] commented on code in PR #18528:
URL: https://github.com/apache/tvm/pull/18528#discussion_r2573015008
##########
python/tvm/tir/tensor_intrin/cuda.py:
##########
@@ -1465,7 +1465,7 @@ def get_index_C(elem_offset, stride):
stride_b = stride // 8
bi = i // 8
bj = j // 8
- return (bi // 2) * 2 * stride_b + bi % 2 + bj * 2
+ return ((bi // 2) * 2 * stride_b + bi % 2 + bj * 2) * 2
Review Comment:

This change correctly fixes the offset calculation. For better
maintainability, please consider adding a comment explaining the multiplication
by 2, e.g., how it relates to the register layout of the C fragment for
`mma.sync`, where each element may occupy two addressable units in the
`ptx_mma` intrinsic.
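As an illustration only, here is a standalone Python sketch of the corrected index function from the diff. The interpretation of the trailing `* 2` (each element spanning two addressable units) is an assumption based on this review; the chosen `stride`, `i`, `j` values are hypothetical:

```python
def get_index_C(i, j, stride):
    """Flattened offset of element (i, j) in the C fragment.

    Mirrors the corrected expression from the diff: indices are grouped
    into 8x8 blocks, and the final offset is scaled by 2 on the
    assumption that each element occupies two addressable units.
    """
    stride_b = stride // 8   # stride measured in 8-wide blocks
    bi = i // 8              # block row index
    bj = j // 8              # block column index
    # Without the trailing "* 2", every computed offset would be half of
    # the intended address, collapsing adjacent elements onto each other.
    return ((bi // 2) * 2 * stride_b + bi % 2 + bj * 2) * 2

# With stride = 16 (two 8-wide blocks per row):
print(get_index_C(0, 0, 16))   # 0
print(get_index_C(8, 0, 16))   # 2
print(get_index_C(0, 8, 16))   # 4
print(get_index_C(16, 0, 16))  # 8
```

Printing the offsets for a few block corners makes it easy to eyeball that the scaled addresses no longer overlap.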
##########
src/tir/transforms/memhammer_intermediate_stage.cc:
##########
@@ -263,8 +263,12 @@ std::pair<Stmt, SeqStmt> InsertCacheStage(Stmt stmt, bool is_write_cache, ffi::S
for (const For& loop : outer_loops) {
if (loop->kind == ForKind::kThreadBinding) {
const ffi::String& thread_tag = loop->thread_binding.value()->thread_tag;
-      if (CanRelaxStorageUnderThread(runtime::StorageScope::Create(storage_scope),
-                                     runtime::ThreadScope::Create(thread_tag))) {
+      auto thread_scope = runtime::ThreadScope::Create(thread_tag);
+      if (CanRelaxStorageUnderThread(runtime::StorageScope::Create(storage_scope), thread_scope)) {
+ if (is_write_cache && thread_scope.dim_index == 0) {
+ // writing C_reindex_m16n8k8_matrixC_shared_dyn is warp execution
+ continue;
Review Comment:

The logic to skip `threadIdx.x` for write caches is correct. The added
comment is helpful, but could be slightly more descriptive to aid future
maintenance. Consider expanding it to explain *why* warp execution implies
skipping this dimension, to prevent overestimation of shared memory.
```c
// For write caches that are part of a warp-level execution (e.g., storing MMA results),
// the threadIdx.x dimension should not contribute to the cache size calculation, as all
// threads in the warp access the same memory region. Skipping it prevents overestimation
// of shared memory.
continue;
```
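To illustrate why skipping `threadIdx.x` matters for the size estimate, here is a simplified Python sketch. The function name, the `(dim_index, extent)` pair representation, and the concrete extents are all hypothetical; the only assumption carried over from the C++ code above is that `dim_index == 0` denotes `threadIdx.x` and that warp-executed write caches should not multiply the cache size by its extent:

```python
def relaxed_cache_size(base_size, thread_loops, is_write_cache):
    """Scale a cache size by the extents of relaxable thread-bound loops.

    thread_loops: list of (dim_index, extent) pairs for thread-bound
    loops, where dim_index == 0 stands for threadIdx.x. For warp-executed
    write caches, threadIdx.x is skipped so the lane count does not
    inflate the shared-memory estimate.
    """
    size = base_size
    for dim_index, extent in thread_loops:
        if is_write_cache and dim_index == 0:
            # All lanes of the warp cooperate on the same region, so the
            # warp width must not multiply the cache footprint.
            continue
        size *= extent
    return size

loops = [(0, 32), (1, 4)]  # threadIdx.x extent 32, threadIdx.y extent 4
print(relaxed_cache_size(256, loops, is_write_cache=True))   # 1024
print(relaxed_cache_size(256, loops, is_write_cache=False))  # 32768
```

With the skip in place, the write-cache estimate stays at 256 * 4 elements instead of being inflated by the 32-lane warp width.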
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]