> Since Option2 suggests the transform is global, shall we consider 
> BufferTransform being part of function attribute?

I had initially placed `BufferTransform` as a statement so that it could be 
possible to extended it to have a transformation defined by references to 
variables within the function body.  This would allow other transforms that 
require rewriting memory layouts to add a `BufferTransform` defining the 
layout, relying on a later pass to implement the transformation itself.  For 
example, `tir.transform.InjectDoubleBuffer` could be implemented by adding a 
buffer transform `lambda *indices: [loop_iter%2, *indices]`, and the load/store 
rewrites of `tir.transform.VectorizeLoop` could be implemented by adding a 
buffer transform `lambda *indices: [*indices, loop_iter]`.

On further thought, this potential future extension would still be possible 
with the definition in the `PrimFunc`, so long as visitors that modify the 
`loop_iter` also modify the transformation, so I support having the 
transformations as a global definition, and will update the RFC.

Having a single global definition of the buffer transformations would also make 
the order clearer in cases of multiple transformations.  A transformation may 
be more easily expressed multiple sequential transformations, such as an 
initial transformation of a matrix into tiles for transfer to a GPU, then 
another transformation for ordering the elements into groups of size 
`warp_size`.  If these are in different portions of the TIR call graph, the 
order of these transformations would depend on the traversal order.

> This implies separation between buffer transform and the underlying physical 
> layout requirement (e.g. 1-D by default in most cases).

That would minimize the number of changes to the TIR, though it would increase 
the number of distinct changes that are made while lowering.  I had been 
thinking of the flattening to the underlying physical layout requirement as a 
special case of a transformation, which happens to define a row-major traversal.

Good point, though, on how the current flattening depends on the parameters 
outside of the buffer itself.  As such, the transformation representing the 
flattening couldn't be generated initially.  The concepts could be combined at 
some point in the future, if the flattening is made to depend only on the 
buffer, but that should be a separate change.

Between those two points, I think that would be that `PrimFuncAttrs::attrs` 
should include a map from buffers to a list of layout transformations, and a 
map from buffers to a list of axis separator locations.  The layout 
transformations would be consumed by the pass that rearranges the layout, and 
the list of axis separator locations would be consumed by either 
`StorageFlatten` or `FlattenBuffer`.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/39#issuecomment-960254384

Reply via email to