Prior to this commit, allocations performed by `ncclCommInitRank` had no
corresponding call to `ncclCommDestroy`. While `ncclCommDestroy` does occur in
the `CCLThreadLocalContext::Clear` method, there are no calls into this method.
On worker processes, the failure to call `ncclCommDestroy` typ
+1
https://github.com/apache/tvm/issues/16912#issuecomment-2080793568
+1
https://github.com/apache/tvm/issues/16368#issuecomment-1887308146
Agreeing with @kparzysz-quic, changes that update the `DLDataType` would need
to be approached very cautiously. I usually lean toward allowing short-term
breakages if they lead to better long-term code health, but updating the
`DLDataType` would be very wide-reaching, even for my tastes.
One
> What I'm aiming at is to be able to lower the TIR to a generic CPU, that is
> to an architecture that does not support SVE. The TIR will need to have some
> default lowering in CodeGenLLVM/CodeGenCPU, so being able to do that is
> important.
Could it instead be in a target-dependent lowering
+1
https://github.com/apache/tvm/issues/15521#issuecomment-1675087792
I very much like the proposed improvements, especially the use cases for
inner-block and inter-block analysis. While I have done some development [for
similar
applications](https://github.com/apache/tvm/blob/main/src/tir/analysis/control_flow_graph.h),
the additional formalism and reliability
[Rendered
link](https://github.com/gromero/tvm-rfcs/blob/cmg/rfcs/0088-commit-message-guideline.md)
https://github.com/apache/tvm-rfcs/pull/88#issuecomment-1233170752
+1
https://github.com/apache/tvm/issues/12583#issuecomment-1232123344
Thank you very much for the comments, suggestions, and discussion; I'm quite
happy with how the design evolved over the course of the discussions!
https://github.com/apache/tvm-rfcs/pull/77#issuecomment-1182157349
> In general it is helpful to first keep schedule decisions local, e.g.
> introducing a caching stage (AC, BC in the example), then compose with another
> reflowing pass to bring the decision to consumers/producers.
My goal with the latest update wasn't to require global decisions, but to make
loc
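To make the "caching stage" idea concrete, here is a minimal sketch (my own
illustration, not code from the PR) of introducing a cache stage with
`tir.Schedule.cache_read`; the workload, block name, and shapes are made up for
the example, roughly in the spirit of the AC/BC stages mentioned in the quote.

```python
import tvm
from tvm.script import tir as T

@T.prim_func
def elementwise(A: T.Buffer((128,), "float32"), B: T.Buffer((128,), "float32")):
    for i in T.serial(128):
        with T.block("B"):
            vi = T.axis.spatial(128, i)
            B[vi] = A[vi] * 2.0

sch = tvm.tir.Schedule(elementwise)
# Introduce a cache stage for the input of block "B".  The staging/layout
# decision stays local to this block until a later reflowing pass propagates
# it to producers/consumers.
sch.cache_read(sch.get_block("B"), 0, "global")
print(sch.mod.script())
```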
These make sense, and agreed that the TIR->global feedback is important for
enabling the layout reflow. Going back through the discussion, I think we're
converging on agreement on what features are required, and the main question
remaining is how best to provide annotations for non-local inform
> Our design principle at the TIR level is ideally to start with one instance of
> a possibility, then use the probabilistic space of meta-schedule to represent
> multiple choices.
For this, would the layout re-flowing occur periodically during optimization?
Otherwise, including transformations in the perf
> Talking about “constraints”, it is also useful to talk about categories of
> them; roughly, we can divide them into three categories.
I like this breakdown, and agree. In this categorization, what I've been
calling "constraints" would be "assumptions". Double-checking in `builtin.h`,
it look
> For example, we may introduce explicit cache stage to add the padding, and
> mark this block for later processing.
Wouldn't that require a "remove entirely" annotation that was suggested against
[here](https://github.com/apache/tvm-rfcs/pull/77#issuecomment-1163019805)? I
could see how we co
Writing out some of my thoughts to see if there's a way to express the
constraints while only using existing TIR features; a rough sketch follows the
list below. The main goals would be as follows.
1. Allow simplification of expressions based on the values present in the
padding.
2. Allow local simplifications to take advantage of
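As a rough illustration of goal (1), this is the sort of thing I have in mind:
state the known value of the padding as an explicit assumption that later
simplifications may rely on. The `T.assume` spelling below is illustrative
only, not a claim about an existing guarantee, and the shapes are made up.

```python
from tvm.script import tir as T

@T.prim_func
def padded_compute(A: T.Buffer((4, 4), "int32"), B: T.Buffer((4, 4), "int32")):
    # The original buffer had 14 elements; the last two slots of the (4, 4)
    # layout are padding, and their value is stated as an assumption.
    for i, j in T.grid(4, 4):
        with T.block("A_padding_assumption"):
            vi, vj = T.axis.remap("SS", [i, j])
            T.evaluate(T.assume(vi * 4 + vj < 14 or A[vi, vj] == 0))
    for i, j in T.grid(4, 4):
        with T.block("B"):
            vi, vj = T.axis.remap("SS", [i, j])
            B[vi, vj] = A[vi, vj] * 2
```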
> It doesn't add additional semantics; the computation semantics stay the same,
> it is a hint to the graph compiler.
My apologies, I had meant the semantics of a node from the perspective of a TIR
transformation, not the semantics from the perspective of the computation being
described. For a
> Indeed it is important to avoid having a separate compute definition for each
> workload on a new target. In this particular case, all computation definitions
> would start with the original layout. Then there is a "schedule
> transformation" like transform layout which will generate the new st
> Introducing changes to TIR would need some additional thought that deserves
> some extra consideration. Due to the N*M complexity (where N is the TIR
> possibilities and M is the number of primitives to be supported) that needs
> to be handled in implementation (by backend implementers and p
This RFC introduces a method to specify padding to be applied as part of a
buffer layout transformation (for use when the desired layout does not evenly
tile the buffer being transformed), along with the simplifications that can be
performed based on these padded buffers.
The motivating examples are pri
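For concreteness, a minimal usage sketch of the kind of transformation the RFC
describes (the argument names follow `tir.Schedule.transform_layout` as I
understand it; treat this as illustrative rather than normative): a 14-element
buffer is rewritten into a (4, 4) layout, and the two trailing slots that do
not map back to the original buffer are filled with the pad value.

```python
import tvm
from tvm.script import tir as T

@T.prim_func
def before(A: T.Buffer((14,), "int32"), B: T.Buffer((14,), "int32")):
    for i in T.serial(14):
        with T.block("B"):
            vi = T.axis.spatial(14, i)
            B[vi] = A[vi] * 2

sch = tvm.tir.Schedule(before)
# Rewrite the output buffer into a (4, 4) layout; indices 14 and 15 have no
# corresponding element in the original buffer and take the pad value.
sch.transform_layout(
    sch.get_block("B"),
    buffer=("write", 0),
    index_map=lambda i: [i // 4, i % 4],
    pad_value=0,
)
print(sch.mod.script())
```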
> Based on my previous re-review of LLVM, thanks to @tqchen, it might help to
> use my_target.features.dsp rather than my_target.arch.has_dsp and clarifying
> these are features available to the Target? What do you think?
I like that, and the renaming makes it clear which are boolean parameters
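As a small, purely hypothetical sketch of what that naming could look like from
the Python side (the target string and the `has_dsp` feature name here are
placeholders, not a claim about the final API):

```python
import tvm

# Hypothetical: boolean capabilities hang off a `features` view of the Target,
# which makes it clear they describe what the target supports.
target = tvm.target.Target("c -mcpu=cortex-m55")
if target.features.has_dsp:
    print("DSP extension available; DSP-specific schedules can be enabled")
```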
+1
https://github.com/apache/tvm/issues/10471#issuecomment-1060835861
@masahi Tagging you, following the comments on
https://github.com/apache/tvm/pull/8528#pullrequestreview-718506978
https://discuss.tvm.apache.org/t/pre-rfc-vectorized-tir-buffers/10615/5
I think I'd lean towards markdown for consistency with other services, but
that's only if all other features were equal. Markdown would be nicer for
reviewing, since it can be viewed on GitHub in the browser, but I think
cross-references are the more important feature.
Would either md
Correct, the different cases are intended to show the entire contents of a test
file. The names in this example are chosen so that it can be run with minimal
interaction between cases.
Regarding `fixture(scope="module")`: this indicates when pytest should clean up
a fixture, but it is only ava
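To illustrate the scoping behavior in question (the names here are just for the
example): a module-scoped fixture is set up once for the whole test file, and
its teardown runs only after the last test in the module has finished.

```python
import pytest

@pytest.fixture(scope="module")
def shared_resource():
    # Set up once for the entire test file, not once per test case.
    resource = {"handle": "expensive-to-create object"}
    yield resource
    # Teardown: runs after the last test in this module has completed.
    resource.clear()

def test_case_one(shared_resource):
    assert "handle" in shared_resource

def test_case_two(shared_resource):
    # Receives the same instance as test_case_one because of the module scope.
    assert "handle" in shared_resource
```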
Adding notes from a few video chats, so that there is a record of the
discussion.
From @tkonolige, confirmed that the current implementation of
`@tvm.testing.parametrize_targets` shows skipped targets if they are
explicitly listed in the decorator, but not if they come from
`TVM_TEST_TARG
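For reference, a minimal usage sketch of the decorator being discussed (the
test body is illustrative): targets listed explicitly like this are the ones
that, per the note above, are reported as skipped when unavailable.

```python
import numpy as np
import tvm
import tvm.testing

# Targets listed explicitly in the decorator; if one is unavailable, the
# corresponding parametrization shows up as skipped rather than disappearing.
@tvm.testing.parametrize_targets("llvm", "cuda")
def test_runs_on_each_target(target, dev):
    arr = tvm.nd.array(np.arange(4, dtype="float32"), device=dev)
    assert arr.numpy().shape == (4,)
```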