On Mon, Nov 22, 2021 at 9:23 AM Liang, Prike <[email protected]> wrote:
>
> [Public]
>
> > -----Original Message-----
> > From: Alex Deucher <[email protected]>
> > Sent: Friday, November 19, 2021 12:18 AM
> > To: Lazar, Lijo <[email protected]>
> > Cc: Deucher, Alexander <[email protected]>; Christian König
> > <[email protected]>; Liang, Prike <[email protected]>;
> > Huang, Ray <[email protected]>; [email protected]
> > Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend
> > aborted
> >
> > On Thu, Nov 18, 2021 at 10:01 AM Lazar, Lijo <[email protected]> wrote:
> > >
> > > [Public]
> > >
> > >
> > > BTW, I'm not sure if 'reset always' on resume is a good idea for GPUs in
> > > a hive (assuming those systems also get suspended and get hiccups). At
> > > this point the hive isn't reinitialized.
> >
> > Yeah, we should probably not reset if we are part of a hive.
> >
> > Alex
> >
> Does the GPU hive reset need special treatment in this suspend-abort case
> because the hive has to handle each node's reset dependencies and reset the
> nodes synchronously? If so, can we skip the hive reset case and only do the
> GPU reset when adev->gmc.xgmi.num_physical_nodes == 0?

Yes, exactly.  For the aborted suspend reset, we can check the value
before doing a reset.  I think you want to check if
adev->gmc.xgmi.num_physical_nodes <= 1.
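
Something along these lines, as a rough sketch only. The struct types
below are stand-ins that mirror just the adev->gmc.xgmi.num_physical_nodes
field path, not the real amdgpu definitions, and the helper name is
hypothetical:

```c
#include <stdbool.h>

/* Stand-in types for illustration; not the real amdgpu definitions. */
struct xgmi_info {
	unsigned int num_physical_nodes;
};

struct gmc_info {
	struct xgmi_info xgmi;
};

struct device_sketch {
	struct gmc_info gmc;
};

/*
 * Hypothetical helper: allow the reset after an aborted suspend only
 * when the node count is 0 or 1, i.e. the device is not part of a
 * multi-node hive. The aborted-suspend reset is skipped otherwise.
 */
static bool should_reset_after_suspend_abort(const struct device_sketch *adev)
{
	return adev->gmc.xgmi.num_physical_nodes <= 1;
}
```

Using <= 1 rather than == 0 also covers a device that reports itself as
the only node.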

Alex

>
> > >
> > > Thanks,
> > > Lijo
