Re: [Beignet] GROMACS on beignet

Szilárd Páll Wed, 30 Mar 2016 12:15:29 -0700

Hello again,

I have been trying to verify whether there may be assumptions of >=32-wide
execution hiding in the kernels (in particular in code that's using local
memory for prefetching or reduction) and tried dropping in mem fences to
test a few things, but at several points I managed to trigger the
aforementioned error:
drm_intel_gem_bo_context_exec() failed: Input/output error


Is this a known issue? There have been reports of it, but perhaps it is
just the manifestation of multiple possible issues?

Secondly, I do not see the reason why I get blocking behavior of all
enqueue operations (and I don't get this on NVIDIA or AMD). Are there any
peculiarities I should be aware of?
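
To illustrate the kind of >=32-wide assumption meant above, here is a
minimal sketch in plain C (host-side only, not the actual GROMACS kernel
code). It models a local-memory tree reduction over 32 work-items that are
executed by hardware threads of a given SIMD width. The names GROUP_SIZE,
reduce_step, reduce_with_barrier and reduce_without_barrier are made up for
this illustration, and the "no barrier" schedule (each hardware thread runs
all steps before the next one starts) is just one possible ordering that
hardware would be allowed to pick.

```c
#include <assert.h>

#define GROUP_SIZE 32 /* work-items per work-group (assumed for this sketch) */

/* One reduction step for lanes [first, first + width), executed in SIMD
 * lockstep: every lane loads its operands before any lane stores. */
static void reduce_step(float *buf, int first, int width, int stride)
{
    float tmp[GROUP_SIZE] = {0};
    for (int lane = first; lane < first + width; lane++)
        if (lane < stride)
            tmp[lane] = buf[lane] + buf[lane + stride];
    for (int lane = first; lane < first + width; lane++)
        if (lane < stride)
            buf[lane] = tmp[lane];
}

/* Input values 1, 2, ..., 32; their sum is 528. */
static void fill_input(float *buf)
{
    for (int i = 0; i < GROUP_SIZE; i++)
        buf[i] = (float)(i + 1);
}

/* Barrier after every step: all hardware threads finish a step before
 * any of them starts the next one. */
float reduce_with_barrier(int simd_width)
{
    assert(simd_width > 0 && GROUP_SIZE % simd_width == 0);
    float buf[GROUP_SIZE];
    fill_input(buf);
    for (int stride = GROUP_SIZE / 2; stride > 0; stride /= 2)
        for (int first = 0; first < GROUP_SIZE; first += simd_width)
            reduce_step(buf, first, simd_width, stride);
    return buf[0];
}

/* No barrier: modeled as each hardware thread running all reduction
 * steps to completion before the next hardware thread starts. */
float reduce_without_barrier(int simd_width)
{
    assert(simd_width > 0 && GROUP_SIZE % simd_width == 0);
    float buf[GROUP_SIZE];
    fill_input(buf);
    for (int first = 0; first < GROUP_SIZE; first += simd_width)
        for (int stride = GROUP_SIZE / 2; stride > 0; stride /= 2)
            reduce_step(buf, first, simd_width, stride);
    return buf[0];
}
```

In this model reduce_without_barrier(32) returns the correct sum of 528,
because a single 32-wide thread covers the whole group, while
reduce_without_barrier(8) returns 300 and reduce_with_barrier(8) returns 528
again. In OpenCL C the synchronizing call would be
barrier(CLK_LOCAL_MEM_FENCE); mem_fence() only orders the calling
work-item's own memory accesses and does not make work-items wait for each
other.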

Cheers,

--
Szilárd

On Mon, Mar 28, 2016 at 1:49 AM, Szilárd Páll <[email protected]> wrote:

> Hi Xiuli,
>
> Thanks for the quick reply!
>
> On Fri, Mar 25, 2016 at 4:06 AM, Pan, Xiuli <[email protected]> wrote:
>
>> Hi Szilárd,
>>
>>
>>
>> What do you mean about quoted includes?
>>
>
> I mean -I"/path/to/headers" does not work, but  -I/path/to/headers does.
>
>
>> If you mean the include in kernels, I think we may have some problem with
>> that. The *.cl file we pass to clang is actually a temporary copy that is
>> not stored where the original was. So I think if you just put the files
>> to be included in the original location, clang cannot find them. As a
>> workaround, you could try passing “-I where/your/header/is” as a build
>> option to clBuildProgram.
>>
>>
>>
>> Then if you have some double types used on Haswell it may have some
>> problem. The hardware for HSW does not support double very well as we have
>> refined our double support to hardware then, so HSW may have some issues
>> with double type. If it is not the problem with double float, you can send
>> your kernel as an attachment or report a bug on our Bugzilla(
>> https://bugs.freedesktop.org) and we will try to fix it.
>>
>
> No double precision in the kernels.
>
> For now I'll post here; I feel like a bug report may be overkill,
> especially as I can't provide a full repro case that does not involve
> building the entire application.
>
> I've attached a minimal set of source files needed to compile. We
> have pretty heavy preprocessor use that generates kernels for the different
> inputs / outputs / computation combinations, so one particular flavor
> that's known to produce incorrect results is generated by compiling
> nbnxn_ocl_kernels.cl with the following flags:
>
> -D_WARPLESS_SOURCE_ -DGMX_OCL_FASTGEN -DEL_RF -DEELNAME=_ElecRF
> -DLJ_COMB_GEOM -DVDWNAME=_VdwLJCombGeom -DCENTRAL=22
> -DNBNXN_GPU_NCLUSTER_PER_SUPERCLUSTER=8 -DNBNXN_GPU_CLUSTER_SIZE=8
> -DNBNXN_GPU_JGROUP_SIZE=4 -DNBNXN_AVOID_SING_R2_INC=1.0e-12f
>
>
> Additionally, I had a closer look and so far I have observed three issues
> (in addition to the minor include issue mentioned before):
>
> 1. If I do a manual prefetch into local memory followed by a mem fence
> (see nbnxn_ocl_kernel_nowarp.clh line 339), I get the following error:
> drm_intel_gem_bo_context_exec() failed: Input/output error
> The next kernel call then fails with CL_OUT_OF_RESOURCES.
> Without the manual prefetch it works better, but...
>
> 2. The results produced by the kernel are still somewhat off. It could be
> that I missed a subtle detail and the kernels still do not conform to the
> hardware's execution model. I'm very familiar with Intel's hardware and
> these kernels were originally designed for 32/64 wide execution.
>
> 3. All task enqueue calls seem to be blocking.
>
>
> Thanks & Cheers,
> --
> Szilárd
>
>
>>
>> Thanks
>>
>> Xiuli
>>
>>
>>
>> *From:* Beignet [mailto:[email protected]] *On
>> Behalf Of *Szilárd Páll
>> *Sent:* Friday, March 25, 2016 7:16 AM
>> *To:* [email protected]
>> *Subject:* [Beignet] GROMACS on beignet
>>
>>
>>
>> Hi,
>>
>>
>>
>> I am a developer of the GROMACS (www.gromacs.org) molecular dynamics
>> simulation package. We have OpenCL offload for some of the
>> compute-intensive kernels, which works very well on AMD. I wanted to
>> assess how feasible it is to use an Intel iGPU in GROMACS and after jumping
>> through some hoops I got a 4.2 kernel and beignet master installed.
>>
>>
>>
>> Then I ran into the first minor issue: it seems that beignet does not
>> accept quoted includes. AFAIK double-quoted include paths should be
>> accepted, but that did not work. No big deal, it doesn't work
>> with Apple's OpenCL either, but I thought I'd ask.
>>
>>
>>
>> However, the bigger issue is that running on Haswell (HD 4600, I think)
>> the kernel produces results that are very off (while the very same source
>> gives correct results on other platforms). I've not much time to dig
>> deeper, but I thought I'd drop a mail; maybe somebody is interested in
>> helping out with tips or even tracking down where the issue is.
>>
>>
>>
>> Suggestions would be welcome!
>>
>>
>>
>> Cheers,
>>
>> --
>> Szilárd
>>
>
>

_______________________________________________
Beignet mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/beignet
