Thanks, I have successfully run bwd_activation without error.
------------------ Original ------------------
From: "gem5 users mailing list" <[email protected]>;
Date: Feb 13, 2022 (Sunday), 6:30
To: "gem5 users mailing list"<[email protected]>;
Cc: "1575883782"<[email protected]>;"Kyle Roarty"<[email protected]>;"Matt Sinclair"<[email protected]>;
Subject: [gem5-users] Re: Gem5 GCN3 DNNMark benchmark error (fwd_softmax is
ok, but others are not)
Thanks, this is helpful. Kyle and I went through the error and we haven't
run on a machine with enough memory to run batch size 100 (which is what
bwd_activation assumes by default). However, we have gotten it to run
with up to batch size 50.
We think the failure you were seeing was essentially happening because we
weren't testing bwd_activation in the nightly/weekly regressions, and thus
missed that the file we use to generate the MIOpen cachefiles for the DNNMark
kernels did not have the appropriate kernel for bwd_activation. Kyle
created a patch to fix this problem:
https://gem5-review.googlesource.com/c/public/gem5-resources/+/56789.
You will need to pull this patch and rerun generate_cachefiles before trying to
run again. Moreover, since we only know it works up to batch size 50, you
may consider changing the batch size here:
https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/gpu/DNNMark/config_example/activation_config.dnnmark#6,
to something <= 50, since N represents the batch size. Alternatively, if
you need a batch size > 50, you can try running again on the larger machine
you mentioned before, but since we haven't run it on such a large machine yet,
we don't know exactly what will happen.
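For the batch-size change, here is a minimal Python sketch. It assumes the config stores the batch size as a line of the form `n=100`; the helper name `set_batch_size` and the sample text are hypothetical and not part of DNNMark, so check the actual activation_config.dnnmark before relying on it.

```python
import re

def set_batch_size(config_text: str, batch_size: int) -> str:
    """Return config_text with the first `n=<int>` line set to batch_size.

    Assumption: the batch size is written as `n=<int>` on its own line.
    """
    if batch_size <= 0:
        raise ValueError("batch size must be positive")
    # Match a whole line such as "n=100", allowing spaces around "=".
    pattern = re.compile(r"^([ \t]*n[ \t]*=[ \t]*)\d+[ \t]*$", flags=re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{batch_size}", config_text, count=1
    )
    if count == 0:
        raise ValueError("no 'n=<value>' line found in config text")
    return new_text

# Hypothetical sample, not the real file contents.
sample = "[Activation]\nn=100\nc=1000\nh=1\nw=1\n"
updated = set_batch_size(sample, 50)
print(updated)
```

To apply it to a real file, read the file into a string, call the helper, and write the result back. Only the first matching line is changed, so other parameters stay untouched.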
Hope this helps,
Matt
On Fri, Feb 11, 2022 at 12:11 PM 1575883782 via gem5-users
<[email protected]> wrote:
Yes, I am running DNNMark inside docker, and the version is v21-2. I run the
commands through the Remote - Containers plugin of VS Code.
---Original---
From: "Matt Sinclair via gem5-users"<[email protected]>
Date: Sat, Feb 12, 2022 01:41 AM
To: "gem5 users mailing list"<[email protected]>;
Cc: "1575883782"<[email protected]>;"Kyle Roarty"<[email protected]>;"Matt
Sinclair"<[email protected]>;
Subject: [gem5-users] Re: Gem5 GCN3 DNNMark benchmark error (fwd_softmax is ok,
but others are not)
One more question for you, original poster: are you running DNNMark inside the
docker resources we provided: http://resources.gem5.org/resources/dnn-mark?
Or are you trying to get this running on your machine directly?
Matt
On Fri, Feb 11, 2022 at 11:37 AM Matt Sinclair
<[email protected]> wrote:
Kyle, can you please help with this? I don't recall when we last tested
bwd_act.
Matt
On Fri, Feb 11, 2022 at 2:18 AM 1575883782 via gem5-users
<[email protected]> wrote:
Hi, I was trying to run the DNNMark benchmark with the GCN3 GPU model, following the
instructions at http://resources.gem5.org/resources/dnn-mark. I succeeded in running
fwd_softmax, but I ran into problems when running other layers, for example
"bwd_activation".
I tried to run the gem5 DNNMark bwd_activation benchmark on two computers.
The first computer has 32G of memory. gem5 ran fwd_softmax successfully, but
was always killed while running bwd_activation. The only error message was "Killed"
plus the process ID. I guess this is because the computer does not have
enough memory to run it.
The second computer has 256G of memory. gem5 ran fwd_softmax successfully, but
several problems occurred while running bwd_activation. I solved some, but have
not solved all. The error messages are:
> I0909 01:46:50.680040 100 dnn_wrapper.h:341] enter dnnmarkActivationBackward func
> build/GCN3_X86/sim/mem_pool.cc:110: warn: Reached m5ops MMIO region
> build/GCN3_X86/sim/mem_pool.cc:110: warn: Reached m5ops MMIO region
> build/GCN3_X86/sim/mem_pool.cc:110: warn: Reached m5ops MMIO region
> build/GCN3_X86/sim/mem_pool.cc:110: warn: Reached m5ops MMIO region
> build/GCN3_X86/arch/x86/faults.cc:170: panic: Tried to read unmapped address 0.
> PC: 0x7fffeef84b80, Instr: FMUL2_M : ldfp87 %ufp1, DS:[rdx]
> Memory Usage: 46436124 KBytes
> Program aborted at tick 10680071080500
Sometimes the error is instead:
> panic: Tried to write unmapped address 0x2b95d881.
or
> panic: Tried to write unmapped address 0x3.
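As a rough check on the "Killed" failure on the 32G machine, the "Memory Usage" figure reported on the second computer can be converted to GiB. This is only a sketch: it assumes "KBytes" means units of 1024 bytes, and that the first computer would need a similar amount of host memory, neither of which is confirmed here.

```python
KIB_PER_GIB = 1024 * 1024

def kib_to_gib(kib: int) -> float:
    """Convert a size in KiB to GiB."""
    return kib / KIB_PER_GIB

reported_kib = 46436124  # "Memory Usage" value from the log above
host_ram_gib = 32        # memory size of the first computer

usage_gib = kib_to_gib(reported_kib)
print(f"{usage_gib:.1f} GiB used vs {host_ram_gib} GiB available")
print("exceeds host RAM:", usage_gib > host_ram_gib)
```

Under those assumptions the simulation used about 44 GiB by the time it aborted, which is more than the first computer has. That would be consistent with the process being killed for lack of memory, but it does not explain the unmapped-address panics.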
According to my log, the problem happened in the "dnnmarkActivationBackward" function:
> LOG(INFO) << "enter dnnmarkActivationBackward func";
> #ifdef AMD_MIOPEN
> MIOPEN_CALL(miopenActivationBackward(
>     mode == COMPOSED ?
>     handle.GetMIOpen(idx) : handle.GetMIOpen(),
>     activation_desc.Get(),
>     alpha,
>     top_desc.Get(), y,
>     top_desc.Get(), dy,
>     bottom_desc.Get(), x,
>     beta,
>     bottom_desc.Get(), dx));
> #endif
> LOG(INFO) << "exit dnnmarkActivationBackward func";
It seems to be a MIOpen interface function. I don't know how to solve this.
Could someone help me?
PS: my gem5 version is v21-2, and the docker image is v21-2. My run command is:
build/GCN3_X86/gem5.opt --outdir=$outdir configs/example/apu_se.py -n 10 \
    --mem-size=8GB --benchmark-root=$BenchmarkRoot/test_bwd_activation \
    -c dnnmark_test_bwd_activation \
    --options="-config "$ConfigRoot"/activation_config.dnnmark -mmap "$MMAPFile" -debuginfo 1"
Neither computer has an AMD GPU.
_______________________________________________
gem5-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
%(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s