Thanks @mbs-octoml for this detailed explanation. Being a Collage-supported backend is definitely something we want to achieve for UMA-integrated backends.
> The registration of patterns will need to support the existing triple of
> (pattern name, pattern, predicate) since the predicates are necessary to
> control support based on dtypes, shapes, backend version, etc.

No big deal. We will add this to the pattern registration.

> I'm assuming those triples will continue to end up in either the global
> pattern table registry, or can be otherwise retrieved by a system like
> Collage which wishes to bypass the 'eager' UMA partitioning with it's own
> search. But again no big deal, just need to know where to look.

They are registered in the global pattern table registry during backend registration, but they can also be accessed directly via the backend object if necessary.

> Though not significant to Collage, I assume the order of application of the
> partitioning patterns matches the registration order?

Correct.

> Collage requires external codegen compiler names to be 1:1 with already
> registered target kinds with the same kind name. It also requires instances
> of those targets to be provided in the build targets list, even if those
> instances are nothing other than Target("my_backend") with no extra
> attributes. But the target kinds may also support additional attributes, and
> the various transitions into external codegen code have been changed to
> ensure the matching Target instance has been pushed as the Target.current()
> so that codegen can retrieve and extract any attributes to guide compilation.
> I think that matches some of the conversation above, except that the
> attributes can be fetched by Target.current().get_attr("foo"), but I might
> have missed the point in that sub-thread.

I think this works well. After the backend registration (e.g., `UMABackend.register()`), the target kind, which matches the required codegen compiler name, is available. From there, a target can be created (with or without attributes) and passed to the build target list.
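To make the registration shape concrete, here is a minimal, self-contained Python sketch of a pattern table holding (pattern name, pattern, predicate) triples that are applied in registration order. All names here (`PatternTable`, `PatternEntry`, `supported`) are hypothetical stand-ins for illustration only and do not mirror the actual TVM/UMA classes:

```python
# Toy model of a pattern table registry holding (name, pattern, predicate)
# triples. Illustrative only -- not the real TVM/UMA API.
from typing import Callable, List, NamedTuple


class PatternEntry(NamedTuple):
    name: str        # e.g. "my_backend.conv2d"
    pattern: str     # stand-in for a Relay dataflow pattern object
    predicate: Callable[[dict], bool]  # gates support by dtype/shape/etc.


class PatternTable:
    """Registration order is preserved, matching the partitioning order."""

    def __init__(self) -> None:
        self._entries: List[PatternEntry] = []

    def register(self, name: str, pattern: str,
                 predicate: Callable[[dict], bool]) -> None:
        self._entries.append(PatternEntry(name, pattern, predicate))

    def supported(self, attrs: dict) -> List[str]:
        # A system like Collage could fetch the triples and evaluate the
        # predicates itself instead of relying on eager partitioning.
        return [e.name for e in self._entries if e.predicate(attrs)]


table = PatternTable()
table.register("my_backend.conv2d", "conv2d-pattern",
               lambda attrs: attrs.get("dtype") == "int8")
table.register("my_backend.dense", "dense-pattern",
               lambda attrs: True)

print(table.supported({"dtype": "float32"}))  # ['my_backend.dense']
```

The point of the predicate is exactly the one quoted above: a pattern can be structurally matched yet still rejected for a given dtype, shape, or backend version.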
> Collage assumes a regular build of an IRModule will respect any existing
> "Compiler" attributed functions already in the module. I think all that means
> is that the UMA partitioner should respect existing partitions, but otherwise
> trigger the appropriate custom downstream compilation, and given the
> partitioner uses the existing passes I think that should all Just Work.

I agree.

> Collage assumes it can do it's partitioning before any other backend-specific
> passes. I'm assuming however that some of the Relay pass phases mentioned can
> be before partitioning. If so I'm guessing we'd need to first apply those
> pre-partitioning phases in deterministic order in the hope that they sensibly
> compose, then partition using Collage, then run the post-partitioning phases
> as usual.

Yes, we are planning to include a pre-partitioning pass phase. Passes within one pass phase are always executed in the order of their registration.

> Collage uses the list of available Targets to guide it's search, but if I
> understand correctly UMA uses the registration of backends to enforce a fixed
> partitioning order. Perhaps this suggests the Collage partitioner should be
> integrated as a user-controlled alternative to the default 'eager' partitoner
> supplied by UMA (presumably as a loop of the usual Relay
> MergeComposite/AnnotateTarget/MergeCompilerRegions?/PartitionGraph passes for
> each backend). That way the user can use the same
> construct-and-register-backends-of-interest API.

Currently, a user needs to explicitly call `partition()` on the registered backend to perform the usual MergeComposite/AnnotateTarget/MergeCompilerRegions?/PartitionGraph passes plus the relevant Relay pass phases (e.g., pre-partitioning).
```python
backendA = MyUMABackendA()
backendB = MyUMABackendB()

backendA.register()
backendB.register()

mod = backendA.partition(mod)
mod = backendB.partition(mod)
```

As you described, this would eagerly partition the graph depending on the call order of `.partition()`. This would actually give the user the opportunity to skip this partitioning and go directly for the Collage approach. I am not sure if this is the best solution, though.

> I'm surprised by the emphasis on going via TIR. Are we explicitly saying any
> BYOC integrations which don't need/want to go via TIR don't fall under the
> UMA integration API? If so that will make Collage/UMA integration harder
> since Collage would have to account for both UMA-style and original-style
> integrations.

As it is now, they would not fall under the UMA integration API. With UMA we wanted to wrap one specific BYOC integration into an easy-to-use interface, and we decided to go with the target hooks via TIR (`relay_to_tir`, `tir_to_runtime`). However, if there is enough motivation, we could think about adding `relay_to_runtime` as a second path. This would require greater changes to the current architecture, so I don't see it as part of UMA v1, but we can take it into account for future development.

> One more collage/uma overlap aspect: Collage distinguishes 'registered'
> backends (ie just TargetKinds) from 'activated' backends (ie Target objects
> in the provided build targets). I think though the proposal here is the act
> of registration is also activation? I need help understanding how this will
> look from the user's pov in combination with targets.

There are three steps required to make use of UMA as a user:

1. Create and instantiate a UMA backend: `backend = MyUMABackend()`
2. Register the backend: `backend.register()`
3. Apply the standard partitioning (might not be necessary with Collage): `mod = backend.partition(mod)`

`backend.register()` registers the target kind, a pattern table, and the global functions required by the UMA lowering.
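The three user steps and the call-order dependence of eager partitioning can be sketched with a toy model. Everything here (`MockUMABackend`, the dict-based "module") is a hypothetical stand-in for illustration, not a real TVM object:

```python
# Toy model of the eager partitioning flow: each backend claims every
# still-unassigned op it supports, so call order decides ownership of
# ops that multiple backends could handle. Illustrative only.
class MockUMABackend:
    def __init__(self, name, supported_ops):
        self.name = name
        self.supported_ops = set(supported_ops)
        self.registered = False

    def register(self):
        # In UMA this registers the target kind, pattern table, and
        # lowering hooks; here it just flips a flag.
        self.registered = True

    def partition(self, mod):
        # Respect existing partitions (op already has an owner), claim
        # the rest that this backend supports.
        assert self.registered, "backend must be registered first"
        return {op: owner if owner is not None
                else (self.name if op in self.supported_ops else None)
                for op, owner in mod.items()}


mod = {"conv2d": None, "dense": None, "add": None}
backendA = MockUMABackend("A", ["conv2d", "dense"])
backendB = MockUMABackend("B", ["dense", "add"])
backendA.register()
backendB.register()

mod = backendA.partition(mod)  # A claims conv2d and dense
mod = backendB.partition(mod)  # B only gets add; dense is already taken
print(mod)  # {'conv2d': 'A', 'dense': 'A', 'add': 'B'}
```

Swapping the two `partition()` calls would hand `dense` to backend B instead, which is exactly the ordering sensitivity a Collage-style search would sidestep.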
I think this is more or less equivalent to the Collage 'registration'. The backend is only 'activated' once the partitioning annotates a subgraph for it.

--
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/60#issuecomment-1134482031