+1
One small question: Under "Tracking Issue", the tracking issue is closed "when
the RFC is either completed or abandoned." Is "abandoned" the same as
"postponed" in this context, or is there a distinction between them?
> @Lunderberg @junrushao1994 good catch, would changing "abandoned" to
> "postponed" resolve the ambiguity sufficiently?
Thank you! That resolves my question.
This RFC introduces a hard boundary between the “logical layout” of a
mathematical tensor and the “physical layout” of a buffer in memory, along with
a specification for defining the conversion between the two.
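As a concrete (non-TVM) illustration of the boundary being drawn: the logical layout is the N-d index used by the compute definition, while the physical layout is the flat offset used to address memory. The row-major convention below is only one possible conversion.

```python
# Plain-Python sketch, not TVM API: one possible logical-to-physical
# conversion for a [16, 64] tensor stored row-major in a flat buffer.
def physical_offset(i, j, ncols=64):
    """Map the logical index (i, j) to the 1-d offset in the backing buffer."""
    return i * ncols + j

assert physical_offset(2, 5) == 2 * 64 + 5  # element (2, 5) lives at offset 133
```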
This RFC changes `Array<PrimExpr> AllocateNode::extents` to `PrimExpr
AllocateNode::extent`, giving the 1-d size of the buffer to be allocated. This
is part of the separation between logical layout and physical layout, proposed
in [RFC#0039](https://github.com/apache/tvm-rfcs/pull/0039).
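A rough sketch of the change; the proposed single-extent form is shown only as arithmetic, since the new constructor is hypothetical at this point.

```python
import tvm
from tvm import tir

# Status quo: the Allocate node carries one extent per logical axis.
ptr = tir.Var("A", tvm.ir.PointerType(tvm.ir.PrimType("float32")))
alloc_nd = tir.Allocate(ptr, "float32", [16, 64],
                        tvm.tir.const(1, "bool"),
                        tir.Evaluate(tvm.tir.const(0, "int32")))

# Under the proposed change, the node would instead hold a single 1-d extent,
# i.e. the size of the already-flattened buffer:
flat_extent = 16 * 64  # 1024 elements
```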
TODO:
Thank you for the comments, @vinx13.
> For example, another way is to use a mapping function: (n, c, h, w) -> (n,
> tir.floordiv(c, 4), h, w, tir.floormod(c, 4)). This would allow arbitrary
> mapping (we can add more restrictions like requiring affine mapping though,
> to make analysis easier).
True, and that would allow both user-defined mappings and standard layouts to
be specified. I have a bit of concern with using the `scope` parameter to also
describe layouts, since in my mind "scope" refers to where the entire buffer is
located, while "layout" refers to how individual elements are arranged within
it.
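For reference, the quoted mapping written out explicitly with the TIR ops (the variable names are just placeholders):

```python
from tvm import tir

n, c, h, w = [tir.Var(name, "int32") for name in ("n", "c", "h", "w")]

# NCHW logical indices mapped to an NCHW4c-style physical layout, as quoted above.
mapping = [n, tir.floordiv(c, 4), h, w, tir.floormod(c, 4)]
```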
> How will vectorization work? If there is a `vectorize` directive spanning a
> logical extent, will the vectorization pass create multidimensional `ramp`s?
> How will vector loads and stores be represented?
While in principle a vectorized load/store could fall back to a non-vectorized
load/store
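For context, here is a minimal sketch of how a flat vectorized access is represented today; the open question above is how this generalizes once the indices stay multidimensional.

```python
from tvm import tir

# A 1-d buffer indexed by a Ramp expression covering 4 contiguous lanes,
# i.e. a vectorized load of A[4*i : 4*i + 4].
A = tir.decl_buffer((64,), "float32", name="A")
i = tir.Var("i", "int32")
vec_load = tir.BufferLoad(A, [tir.Ramp(i * 4, 1, 4)])
```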
Following a video chat with @csullivan, documenting some of the key points of
the conversation.
* Setting the physical layout in a TE-based schedule has two roles. One is to
  rewrite the buffer itself, and the other is to define the order of iteration
when writing to the buffer. In the latter
This is the first in a series of proposed changes, and this one on its own
won't be able to support the `PHYSICAL_AXIS_SEPARATOR`. In the [Impacted TIR
Nodes](https://github.com/Lunderberg/tvm-rfcs/blob/data_layout/rfcs/0039-buffer-physical-layout.md#impacted-tir-nodes)
section of RF
Following discussion with @tqchen , this RFC has had significant updates made.
The major change is that instead of extending the capabilities of `Store` and
`Load` nodes to support N-d indices, they would instead be removed in favor of
keeping `BufferStore` and `BufferLoad` nodes throughout the
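In other words (sketch only), an element access keeps its N-d indices as a `BufferLoad` through lowering, rather than being rewritten into a flat `Load` with a precomputed 1-d offset:

```python
from tvm import tir

A = tir.decl_buffer((16, 64), "float32", name="A")
i = tir.Var("i", "int32")
j = tir.Var("j", "int32")

# The access A[i, j] stays multidimensional; flattening to a single offset
# (i * 64 + j) happens only when the physical layout is applied.
load_2d = tir.BufferLoad(A, [i, j])
```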
Closing this RFC, as it is no longer applicable after significant changes made
to #39.
Closed #40.
> I'd suggest adding the BufferTransform data structure here which will be very
> helpful to other audience.
Sounds good, and I've added a description of it along with a possible data
structure for it.
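Purely as an illustration of the shape such a structure might take (this is not the RFC's actual definition), it would pair the buffer being rewritten with the mapping from old indices to new ones:

```python
from dataclasses import dataclass
from typing import Callable, List

import tvm

@dataclass
class BufferTransformSketch:
    """Hypothetical stand-in for the proposed BufferTransform node."""
    buffer: tvm.tir.Buffer            # the buffer whose layout is rewritten
    index_map: Callable[..., List]    # e.g. lambda n, c, h, w: [n, c // 4, h, w, c % 4]
```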
> Usage of te.AXIS_SEPARATOR: It seems this is only used in the API side but
> not in BufferTransform, would be good to get some clarification.
That's correct, the `te.AXIS_SEPARATOR` only appears in the API for the TE
schedules, and not in the TIR graph generated from the TE schedule. I've
up
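A sketch of the TE-side usage, assuming the interface from the implementation PR: the separator only tells `StorageFlatten` where to split the physical axes, and never shows up in the resulting TIR.

```python
from tvm import te

A = te.placeholder((16, 64), name="A")
B = te.compute(A.shape, lambda i, j: A[i, j] + 1.0, name="B")
s = te.create_schedule(B.op)

# te.AXIS_SEPARATOR marks where the transformed axes are grouped when the
# buffer is flattened, producing a 2-d physical buffer here.
s[B].transform_layout(lambda i, j: [i, te.AXIS_SEPARATOR, j // 4, j % 4])
```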
> Since Option2 suggests the transform is global, shall we consider
> BufferTransform being part of function attribute?
I had initially placed `BufferTransform` as a statement so that it could be
possible to extend it to have a transformation defined by references to
variables within the func
Following a video chat discussion with @vinx13 , we touched on a number of
points, summarized below. Also, we are adding @vinx13 as a co-author on this
RFC.
- Are there cases where the flattening in `StorageFlatten`/`FlattenBuffer`
should be inferred from buffer properties, rather than explici
Sounds good! I've updated with examples of scheduling with the returned new
axes, which work in the implementation posted in
[PR#9727](https://github.com/apache/tvm/pull/9727).
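Roughly, the pattern looks like the following (a sketch assuming the TE interface from PR#9727):

```python
from tvm import te

A = te.placeholder((16, 64), name="A")
B = te.compute(A.shape, lambda i, j: A[i, j] * 2.0, name="B")
s = te.create_schedule(B.op)

# transform_layout returns the new iteration axes over the transformed
# buffer, which can then be scheduled like any other axes.
i_new, j_outer, j_inner = s[B].transform_layout(lambda i, j: [i, j // 4, j % 4])
s[B].vectorize(j_inner)
```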
+1
> Based on my previous re-review of LLVM, thanks to @tqchen, it might help to
> use my_target.features.dsp rather than my_target.arch.has_dsp and clarifying
> these are features available to the Target? What do you think?
I like that, and the renaming makes it clear which are boolean parameters
This RFC introduces a method to specify padding to be applied as part of a
buffer layout transformation, for use when the desired layout does not evenly
tile the buffer being transformed, along with simplifications that can be
performed based on these padded buffers.
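As a rough sketch of the intended usage (assuming the `pad_value` argument to `transform_layout` in the TensorIR schedule; exact spellings may differ across TVM versions):

```python
from tvm import tir
from tvm.script import tir as T

@T.prim_func
def copy(A: T.Buffer((14,), "float32"), B: T.Buffer((14,), "float32")):
    for i in T.serial(14):
        with T.block("copy"):
            vi = T.axis.spatial(14, i)
            B[vi] = A[vi]

sch = tir.Schedule(copy)
block = sch.get_block("copy")
# Rewrite B into a [4, 4] layout.  The 14 logical elements do not fill the
# 16-element physical buffer, so pad_value states what the padding holds.
sch.transform_layout(block, ("write", 0),
                     index_map=lambda i: [i // 4, i % 4], pad_value=0.0)
```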
The motivating examples are pri
> Introducing changes to TIR would need some additional thought that deserves
> some extra consideration. Due to the N*M complexity (where N is the TIR
> possibilities and M is the number of primitives to be supported) that needs
> to be handled in implementation (by backend implementers and p
> Indeed it is important to avoid having a separate compute definition for each
> workload on a new target. In this particular case, all computation definition
> would start with the original layout. Then there is a "schedule
> transformation" like transform layout which will generate the new st
> It doesn't add additional semantic, the computation semantic stays the same,
> it is a hint to the graph compiler.
My apologies, I had meant the semantics of a node from the perspective of a TIR
transformation, not the semantics from the perspective of the computation being
described. For a
Writing out some of my thoughts, to see if there's a way to express the
constraints while only using existing TIR features. The main goals would be as
follows.
1. Allow simplification of expressions based on the values present in the
padding.
2. Allow local simplifications to take advantage of
> For example, we may introduce explicit cache stage to add the padding, and
> mark this block for later processing.
Wouldn't that require a "remove entirely" annotation, which was advised against
[here](https://github.com/apache/tvm-rfcs/pull/77#issuecomment-1163019805)? I
could see how we co
> Talking about “constraints”, it is also useful to talk about categories of
> them, roughly we can divide them into three categories.
I like this breakdown, and agree. In this categorization, what I've been
calling "constraints" would be "assumptions". Double-checking in `builtin.h`,
it look
> Our design principle at the TIR level is that ideally we start with one
> instance of the possibilities, then use the probabilistic space of
> meta-schedule to represent multiple choices.
For this, would the layout re-flowing occur periodically during optimization?
Otherwise, including transformations in the perf
These make sense, and I agree that the TIR->global feedback is important for
enabling the layout reflow. Going back through the discussion, I think we're
converging on agreement about which features are required, and the main
remaining question is how best to provide annotations for non-local
information.
> In general it is helpful to first keep schedule decision local, e.g.
> introducing a caching stage (AC, BC in the example), then compose with another
> reflowing pass to bring the decision to consumer/producers.
My goal with the latest update wasn't to require global decisions, but to make
loc
Thank you very much for the comments, suggestions, and discussion, and I'm quite
happy with how the design evolved over the course of the discussions!
+1
[Rendered link](https://github.com/gromero/tvm-rfcs/blob/cmg/rfcs/0088-commit-message-guideline.md)
I very much like the proposed improvements, especially the use cases for
inner-block and inter-block analysis. While I have done some development [for
similar
applications](https://github.com/apache/tvm/blob/main/src/tir/analysis/control_flow_graph.h),
the additional formalism and reliability
+1
> What I'm aiming at is to be able to lower the TIR to a generic CPU, that is
> to an architecture that does not support SVE. The TIR will need to have some
> default lowering in CodeGenLLVM/CodeGenCPU, so being able to do that is
> important.
Could it instead be in a target-dependent lowering
Agreeing with @kparzysz-quic, changes that update the `DLDataType` would need
to be approached very cautiously. I usually lean toward allowing short-term
breakages if they lead to better long-term code health, but updating the
`DLDataType` would be very wide-reaching, even for my tastes.
One
+1
+1
Prior to this commit, allocations performed by `ncclCommInitRank` had no
corresponding call to `ncclCommDestroy`. While `ncclCommDestroy` does occur in
the `CCLThreadLocalContext::Clear` method, there are no calls into this method.
On worker processes, the failure to call `ncclCommDestroy` typ
Adding notes from a few video chats, so that there is a record of the
discussion.
From @tkonolige, confirmed that the current implementation of
`@tvm.testing.parametrize_targets` shows skipped targets if they are explicitly
listed in the decorator, but not if they come from `TVM_TEST_TARGETS`.
Correct, the different cases are intended to show the entire contents of a test
file. The names in this example are chosen so that it can be run with minimal
interaction between cases.
For the `fixture(scope="module")`, this indicates when pytest should clean up a
fixture, but it is only ava
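For reference, a minimal pytest illustration of the `scope="module"` clean-up behavior described above:

```python
import pytest

@pytest.fixture(scope="module")
def shared_resource():
    # Set up once for all tests in this file...
    resource = {"handle": object()}
    yield resource
    # ...and tear down only after the last test in the module has run,
    # which is the clean-up point that scope="module" controls.
    resource.clear()

def test_uses_resource(shared_resource):
    assert "handle" in shared_resource
```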
I think I'd lean towards markdown for consistency with other services, but
that's only if all other features were equal. Markdown would be nicer for
reviewing, since it can be viewed from GitHub in the browser, but I think
cross-references are the more important feature.
Would either md
@masahi Tagging following comments on
https://github.com/apache/tvm/pull/8528#pullrequestreview-718506978