Good point. If a GL shader uses the atomic extension and we enable atomics in L3, then switching back to the GL shader's process with 0 DC allocation may trigger a hang.
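For reference, here is a minimal sketch of how the enable sequence from the patch and a matching "disable" sequence could be built. The register offsets come from the patch. The bit positions (bit 27 in SCRATCH1, bit 6 in ROW_CHICKEN3), the masked-write convention for ROW_CHICKEN3 (upper 16 bits select which lower bits are written), and the LRI header encoding are my assumptions and should be checked against the spec and the KMD. `emit_lri` and `emit_l3_atomics` are hypothetical helper names, not Beignet functions.

```c
#include <stdint.h>
#include <stddef.h>

/* Register offsets taken from the patch. */
#define HSW_SCRATCH1_OFFSET          0xB038u
#define HSW_ROW_CHICKEN3_HDC_OFFSET  0xE49Cu

/* ASSUMPTION: positions of the "L3 atomics disable" bits. */
#define SCRATCH1_L3_ATOMICS_DISABLE      (1u << 27)
#define ROW_CHICKEN3_L3_ATOMICS_DISABLE  (1u << 6)

/* ASSUMPTION: stand-in for CMD_LOAD_REGISTER_IMM | 1 (length - 2). */
#define LRI_HEADER ((0x22u << 23) | 1u)

/* Masked register write: the upper 16 bits are the write mask. */
static uint32_t masked_bit_set(uint32_t bit)   { return (bit << 16) | bit; }
static uint32_t masked_bit_clear(uint32_t bit) { return bit << 16; }

/* Append one LOAD_REGISTER_IMM (3 dwords) to buf and return the new length. */
static size_t emit_lri(uint32_t *buf, size_t n, uint32_t reg, uint32_t val)
{
  buf[n++] = LRI_HEADER;
  buf[n++] = reg;
  buf[n++] = val;
  return n;
}

/* enable != 0: clear both disable bits, as the patch does.
 * enable == 0: set both disable bits again (the state to restore).
 * Note: like the patch, the SCRATCH1 write overwrites the whole register. */
size_t emit_l3_atomics(uint32_t *buf, size_t n, int enable)
{
  uint32_t scratch1 = enable ? 0u : SCRATCH1_L3_ATOMICS_DISABLE;
  uint32_t chicken3 = enable
      ? masked_bit_clear(ROW_CHICKEN3_L3_ATOMICS_DISABLE)
      : masked_bit_set(ROW_CHICKEN3_L3_ATOMICS_DISABLE);

  n = emit_lri(buf, n, HSW_SCRATCH1_OFFSET, scratch1);
  n = emit_lri(buf, n, HSW_ROW_CHICKEN3_HDC_OFFSET, chicken3);
  return n;
}
```

With enable set, this produces the same six dwords the patch adds to the batch (hence 9 + 6 = 15 in BEGIN_BATCH). With enable cleared, it produces the values a restore step would write, if the assumptions above hold.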
I will modify the post action to restore it to the disabled state. Any other comments?

On Mon, Jan 05, 2015 at 02:42:00AM +0000, Yang, Rong R wrote:
> Do the KDM values need restore after finish Beignet command?
>
> > -----Original Message-----
> > From: Beignet [mailto:[email protected]] On Behalf Of
> > Zhigang Gong
> > Sent: Wednesday, December 31, 2014 10:03
> > To: [email protected]
> > Cc: Gong, Zhigang
> > Subject: [Beignet] [PATCH] CL/Driver: enable atomics in L3 for HSW.
> >
> > This could get more than 10x boost for some atomic stress workloads.
> >
> > Signed-off-by: Zhigang Gong <[email protected]>
> > ---
> >  src/intel/intel_defines.h |  4 ++++
> >  src/intel/intel_gpgpu.c   | 11 ++++++++++-
> >  2 files changed, 14 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/intel/intel_defines.h b/src/intel/intel_defines.h
> > index e983718..a120f41 100644
> > --- a/src/intel/intel_defines.h
> > +++ b/src/intel/intel_defines.h
> > @@ -304,6 +304,10 @@
> >  #define URB_SIZE(intel) (IS_IGDNG(intel->device_id) ? 1024 : \
> >                           IS_G4X(intel->device_id) ? 384 : 256)
> >
> > +// HSW
> > +#define HSW_SCRATCH1_OFFSET (0xB038)
> > +#define HSW_ROW_CHICKEN3_HDC_OFFSET (0xE49C)
> > +
> >  // L3 cache stuff
> >  #define GEN7_L3_SQC_REG1_ADDRESS_OFFSET (0XB010)
> >  #define GEN7_L3_CNTL_REG2_ADDRESS_OFFSET (0xB020)
> > diff --git a/src/intel/intel_gpgpu.c b/src/intel/intel_gpgpu.c
> > index 9e442c0..eee495e 100644
> > --- a/src/intel/intel_gpgpu.c
> > +++ b/src/intel/intel_gpgpu.c
> > @@ -632,7 +632,16 @@ static void
> >  intel_gpgpu_set_L3_gen75(intel_gpgpu_t *gpgpu, uint32_t use_slm)
> >  {
> >    /* still set L3 in batch buffer for fulsim. */
> > -  BEGIN_BATCH(gpgpu->batch, 9);
> > +  BEGIN_BATCH(gpgpu->batch, 15);
> > +  OUT_BATCH(gpgpu->batch, CMD_LOAD_REGISTER_IMM | 1); /* length - 2 */
> > +  /* FIXME: KMD always disable the atomic in L3 for some reason.
> > +     I checked the spec, and don't think we need that workaround now.
> > +     Before I send a patch to kernel, let's just enable it here. */
> > +  OUT_BATCH(gpgpu->batch, HSW_SCRATCH1_OFFSET);
> > +  OUT_BATCH(gpgpu->batch, 0); /* enable atomic in L3 */
> > +  OUT_BATCH(gpgpu->batch, CMD_LOAD_REGISTER_IMM | 1); /* length - 2 */
> > +  OUT_BATCH(gpgpu->batch, HSW_ROW_CHICKEN3_HDC_OFFSET);
> > +  OUT_BATCH(gpgpu->batch, (1 << 6ul) << 16); /* enable atomic in L3 */
> >    OUT_BATCH(gpgpu->batch, CMD_LOAD_REGISTER_IMM | 1); /* length - 2 */
> >    OUT_BATCH(gpgpu->batch, GEN7_L3_SQC_REG1_ADDRESS_OFFSET);
> >    OUT_BATCH(gpgpu->batch, 0x00800000);
> > --
> > 1.8.3.2
> >
> > _______________________________________________
> > Beignet mailing list
> > [email protected]
> > http://lists.freedesktop.org/mailman/listinfo/beignet
