Hello again,

I have been trying to verify whether assumptions of >=32-wide execution may be hiding in the kernels (in particular in code that uses local memory for prefetching or reduction), and I tried dropping in mem fences to test a few things. At several points, however, I managed to trigger the aforementioned error:

drm_intel_gem_bo_context_exec() failed: Input/output error
Is this a known issue? There have been reports of it, but perhaps it is just the manifestation of multiple possible issues?

Secondly, I do not see why all enqueue operations block (I don't get this on NVIDIA or AMD). Are there any peculiarities I should be aware of?

Cheers,
--
Szilárd

On Mon, Mar 28, 2016 at 1:49 AM, Szilárd Páll <[email protected]> wrote:

> Hi Xiuli,
>
> Thanks for the quick reply!
>
> On Fri, Mar 25, 2016 at 4:06 AM, Pan, Xiuli <[email protected]> wrote:
>
>> Hi Szilárd,
>>
>> What do you mean about quoted includes?
>
> I mean -I"/path/to/headers" does not work, but -I/path/to/headers does.
>
>> If you mean the include in kernels, I think we may have some problem with
>> that. The *.cl we used for clang actually was a copied tmp version that is
>> not stored where it used to be. So I think if you just put what needs to be
>> included in the old place, clang could not find it. You could try a
>> workaround to pass "-I where/your/header/is" as a build option to
>> clBuildProgram.
>>
>> Then if you have some double types used on Haswell it may have some
>> problem. The hardware for HSW does not support double very well as we have
>> refined our double support to hardware then, so HSW may have some issues
>> with double type. If it is not the problem with double float, you can send
>> your kernel as an attachment or report a bug on our Bugzilla
>> (https://bugs.freedesktop.org) and we will try to fix it.
>
> No double precision in the kernels.
>
> For now I'll post here; I feel like a bug report may be overkill,
> especially as I can't provide a full repro case that does not involve
> building the entire application.
>
> I've attached a minimum set of source files needed to compile.
> We have pretty heavy preprocessor use that generates kernels for the
> different inputs / outputs / computation combinations, so one particular
> flavor that's known to produce incorrect results is generated by compiling
> nbnxn_ocl_kernels.cl with the following flags:
>
> -D_WARPLESS_SOURCE_ -DGMX_OCL_FASTGEN -DEL_RF -DEELNAME=_ElecRF
> -DLJ_COMB_GEOM -DVDWNAME=_VdwLJCombGeom -DCENTRAL=22
> -DNBNXN_GPU_NCLUSTER_PER_SUPERCLUSTER=8 -DNBNXN_GPU_CLUSTER_SIZE=8
> -DNBNXN_GPU_JGROUP_SIZE=4 -DNBNXN_AVOID_SING_R2_INC=1.0e-12f
>
> Additionally, I had a closer look and so far I have observed three issues
> (in addition to the minor include issue mentioned before):
>
> 1. If I do a manual prefetch into local memory followed by a mem fence
> (see nbnxn_ocl_kernel_nowarp.clh line 339), I get the following error:
> drm_intel_gem_bo_context_exec() failed: Input/output error
> The next kernel call then fails with CL_OUT_OF_RESOURCES.
> Without the manual prefetch it works better, but...
>
> 2. The results produced by the kernel are still somewhat off. It could be
> that I missed a subtle detail and the kernels still do not conform to the
> hardware's execution model. I'm very familiar with Intel's hardware and
> these kernels were originally designed for 32/64 wide execution.
>
> 3. All task enqueue calls seem to be blocking.
>
> Thanks & Cheers,
> --
> Szilárd
>
>> Thanks
>>
>> Xiuli
>>
>> *From:* Beignet [mailto:[email protected]] *On Behalf Of *Szilárd Páll
>> *Sent:* Friday, March 25, 2016 7:16 AM
>> *To:* [email protected]
>> *Subject:* [Beignet] GROMACS on beignet
>>
>> Hi,
>>
>> I am a developer of the GROMACS (www.gromacs.org) molecular dynamics
>> simulation package. We have OpenCL offload for some of the
>> compute-intensive kernels, which works very well on AMD. I wanted to
>> assess how feasible it is to use an Intel iGPU in GROMACS and, after
>> jumping through some hoops, I got a 4.2 kernel and beignet master installed.
>>
>> Then I ran into the first minor issue: it seems that beignet does not
>> accept quoted includes, although AFAIK double-quoted include paths
>> should be accepted. No big deal, it doesn't work with Apple's OpenCL
>> either, but I thought I'd ask.
>>
>> However, the bigger issue is that running on Haswell (HD 4600, I think)
>> the kernel produces results that are very off (while the very same source
>> gives correct results on other platforms). I've not much time to dig
>> deeper, but I thought I'd drop a mail; maybe somebody is interested in
>> helping out with tips or even tracking down where the issue is.
>>
>> Suggestions would be welcome!
>>
>> Cheers,
>> --
>> Szilárd
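For reference, the include workaround discussed in this thread (passing an unquoted -I path as a build option to clBuildProgram) could be sketched roughly as below. This is only a minimal illustration under assumptions: make_build_options is a hypothetical helper, not a GROMACS or beignet function, and the path and defines are placeholders taken from the messages above. Because the path is left unquoted, the sketch simply rejects paths containing whitespace.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of GROMACS or beignet): writes
 * "-I<include_dir> <defines>" into buf, leaving the include path unquoted.
 * Returns 0 on success, -1 if the path contains whitespace (it would need
 * quoting) or if the result does not fit into buf. */
static int make_build_options(char *buf, size_t size,
                              const char *include_dir, const char *defines)
{
    assert(buf != NULL && include_dir != NULL && defines != NULL);

    /* An unquoted path cannot safely contain spaces or tabs. */
    if (strpbrk(include_dir, " \t") != NULL) {
        return -1;
    }

    /* snprintf returns the length it wanted to write; a value >= size
     * means the output was truncated. */
    int n = snprintf(buf, size, "-I%s %s", include_dir, defines);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The resulting string would then be passed as the options argument of clBuildProgram; whether a given OpenCL implementation accepts a quoted form of the path appears to vary, as noted above.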
_______________________________________________ Beignet mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/beignet
