Hi,
My mistake, DMA accesses are indeed handled coherently by the MESI_Two_Level
protocol. I did encounter such an issue with non-coherent DMA accesses in our
own protocol and wrongly assumed the same was happening here. I also overlooked
the percent_uncacheable parameter, which happens to be overridden in the
upstream ruby_mem_test.py. Your issue looked so similar to one I had that I did
not double-check these points.
So, DMA accesses should not fail in any case with the MESI_Two_Level protocol.
Let’s start over again.
In MESI_Two_Level-dir.sm, a DMA_WRITE event causes the
qw_queueMemoryWBRequest_partial action to be executed, which sends a MEMORY_WB
request to memory. The request is supposed to carry:
1. the raw data block, which is assembled by the DMASequencer
* The DMASequencer aligns the data block on a cacheline, as expected by the
ruby infrastructure (DMASequencer.cc:153)
2. the length of the access (always 1 byte for the memory tester)
3. the physical address of the access
* The problem might be there, as the variable *address* on line
MESI_Two_Level-dir.sm:368 is set to the physical address of the access
**aligned to a cacheline** (MESI_Two_Level-dir.sm:207)
The actual request that is then sent to the memory controller is produced by
AbstractController::serviceMemoryQueue(). In that function, the data is
retrieved from the block assuming the base address of the block is
cacheline-aligned (AbstractController.cc:281). The result is that you always
access the base address of a block when performing a DMA access with the
MESI_Two_Level protocol.
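To make the failure mode concrete, here is a minimal Python sketch (not gem5
code; it only mimics what makeLineAddress does, assuming the default 64-byte
line size) of why forwarding the line-aligned address drops the byte offset of
the DMA access:

```python
LINE_SIZE = 64  # assumed cacheline size

def make_line_address(addr):
    # Mimics Ruby's makeLineAddress(): clear the offset bits.
    return addr & ~(LINE_SIZE - 1)

def directory_wb_addr(paddr):
    # Buggy behaviour described above: the directory puts the
    # *line-aligned* address in the MEMORY_WB message instead of
    # the original physical address of the access.
    return make_line_address(paddr)

def write_offset(msg_addr):
    # The memory side derives the byte offset within the data block
    # from the request address it receives.
    return msg_addr - make_line_address(msg_addr)

paddr = 0x1234  # a 1-byte DMA write targeting offset 0x34 in its line
assert write_offset(directory_wb_addr(paddr)) == 0     # offset lost: always hits the line base
assert write_offset(paddr) == 0x34                     # what the patch preserves
```

The attached patch fixes this by sending tbe.PhysicalAddress instead of the
line-aligned address in the DMA-related actions.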
You can try the patch I’ve attached. It applies against v22.0.0.2. It will make
MESI_Two_Level work as long as DMA accesses target a different memory region
than CPU accesses (percent_uncacheable = 100).
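As a side note on the memtest.cc hunk in the patch: assuming
random_mt.random(a, b) is inclusive on both ends, the old check
`random(0, 100) < percentUncacheable` can never reach a true 100%, because the
draw 100 always produces a cacheable access. A small Python sketch of the same
off-by-one, enumerating every possible draw value:

```python
def uncacheable_old(percent, draw):
    # Old check: draw comes from random(0, 100), i.e. 101 possible values.
    return draw < percent

def uncacheable_new(percent, draw):
    # Patched check: draw comes from random(1, 100), i.e. 100 values.
    return draw <= percent

# percent_uncacheable = 100 should mean "always uncacheable", but the
# old check still yields a cacheable access when draw == 100:
assert not all(uncacheable_old(100, d) for d in range(0, 101))
assert all(uncacheable_new(100, d) for d in range(1, 101))
# percent_uncacheable = 0 correctly means "never uncacheable":
assert not any(uncacheable_new(0, d) for d in range(1, 101))
```

Without that fix, roughly 1 in 101 DMA accesses would still land in the
cacheable region even with percent_uncacheable = 100.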
Now, if you revert percent_uncacheable to 0, the problems are back, and this
time for real: the directory model is not wired up to handle every possible
data-sharing state in combination with DMA accesses. Now it’s up to you to
decide whether you want to take the red pill or the blue pill ;) I’ll stick to
the blue one and avoid diving into the rabbit hole that fixing a Ruby protocol
often is. If it works for you and you luckily avoid the unsupported scenarios,
then you should be good to go.
Regarding your comment
> This test was for MESI_Two_Level for which DMA should be working as it boots
> up in Full System mode
It could very well be that every single DMA access performed by Linux happens
to be cacheline-aligned, for various good reasons. Booting Linux is not all
that demanding on the coherency protocol. It’s more about getting a few atomic
accesses right and providing support for all the machinery surrounding the
bare CPU core and memory (interrupt controller(s), timers, MMU, file system
storage back end, etc.). Memory-mapped register accesses and I/Os are not very
demanding in terms of coherency, and it is very likely that performing
non-coherent, AXI-like accesses would work just fine.
ruby_mem_test.py, on the other hand, is a proper torture test for the coherence
protocol that surpasses anything an actual program would ever ask for.
Best,
Gabriel
diff --git a/configs/example/ruby_mem_test.py b/configs/example/ruby_mem_test.py
index b16b295f0f..442cb136df 100644
--- a/configs/example/ruby_mem_test.py
+++ b/configs/example/ruby_mem_test.py
@@ -99,7 +99,7 @@ system = System(cpu = cpus,
if args.num_dmas > 0:
dmas = [ MemTest(max_loads = args.maxloads,
percent_functional = 0,
- percent_uncacheable = 0,
+ percent_uncacheable = 100,
progress_interval = args.progress,
suppress_func_errors =
not args.suppress_func_errors) \
@@ -110,7 +110,7 @@ else:
dma_ports = []
for (i, dma) in enumerate(dmas):
- dma_ports.append(dma.test)
+ dma_ports.append(dma.port)
Ruby.create_system(args, False, system, dma_ports = dma_ports)
# Create a top-level voltage domain and clock domain
diff --git a/src/cpu/testers/memtest/memtest.cc b/src/cpu/testers/memtest/memtest.cc
index 7c256d8642..5fefc7f899 100644
--- a/src/cpu/testers/memtest/memtest.cc
+++ b/src/cpu/testers/memtest/memtest.cc
@@ -220,7 +220,7 @@ MemTest::tick()
// create a new request
unsigned cmd = random_mt.random(0, 100);
uint8_t data = random_mt.random<uint8_t>();
- bool uncacheable = random_mt.random(0, 100) < percentUncacheable;
+ bool uncacheable = random_mt.random(1, 100) <= percentUncacheable;
unsigned base = random_mt.random(0, 1);
Request::Flags flags;
Addr paddr;
diff --git a/src/mem/ruby/protocol/MESI_Two_Level-dir.sm b/src/mem/ruby/protocol/MESI_Two_Level-dir.sm
index 9d6975570c..53672b280f 100644
--- a/src/mem/ruby/protocol/MESI_Two_Level-dir.sm
+++ b/src/mem/ruby/protocol/MESI_Two_Level-dir.sm
@@ -234,10 +234,11 @@ machine(MachineType:Directory, "MESI Two Level directory protocol")
in_port(memQueue_in, MemoryMsg, responseFromMemory, rank = 2) {
if (memQueue_in.isReady(clockEdge())) {
peek(memQueue_in, MemoryMsg) {
+ Addr lineAddr := makeLineAddress(in_msg.addr);
if (in_msg.Type == MemoryRequestType:MEMORY_READ) {
- trigger(Event:Memory_Data, in_msg.addr, TBEs[in_msg.addr]);
+ trigger(Event:Memory_Data, lineAddr, TBEs[lineAddr]);
} else if (in_msg.Type == MemoryRequestType:MEMORY_WB) {
- trigger(Event:Memory_Ack, in_msg.addr, TBEs[in_msg.addr]);
+ trigger(Event:Memory_Ack, lineAddr, TBEs[lineAddr]);
} else {
DPRINTF(RubySlicc, "%s\n", in_msg.Type);
error("Invalid message");
@@ -352,7 +353,7 @@ machine(MachineType:Directory, "MESI Two Level directory protocol")
peek(memQueue_in, MemoryMsg) {
enqueue(responseNetwork_out, ResponseMsg, to_mem_ctrl_latency) {
assert(is_valid(tbe));
- out_msg.addr := address;
+ out_msg.addr := tbe.PhysicalAddress;
out_msg.Type := CoherenceResponseType:DATA;
out_msg.DataBlk := in_msg.DataBlk; // we send the entire data block and rely on the dma controller to split it up if need be
out_msg.Destination.add(tbe.Requestor);
@@ -365,7 +366,7 @@ machine(MachineType:Directory, "MESI Two Level directory protocol")
desc="Queue off-chip writeback request") {
peek(requestNetwork_in, RequestMsg) {
enqueue(memQueue_out, MemoryMsg, to_mem_ctrl_latency) {
- out_msg.addr := address;
+ out_msg.addr := tbe.PhysicalAddress;
out_msg.Type := MemoryRequestType:MEMORY_WB;
out_msg.Sender := machineID;
out_msg.MessageSize := MessageSizeType:Writeback_Data;
@@ -378,7 +379,7 @@ machine(MachineType:Directory, "MESI Two Level directory protocol")
action(da_sendDMAAck, "da", desc="Send Ack to DMA controller") {
enqueue(responseNetwork_out, ResponseMsg, to_mem_ctrl_latency) {
assert(is_valid(tbe));
- out_msg.addr := address;
+ out_msg.addr := tbe.PhysicalAddress;
out_msg.Type := CoherenceResponseType:ACK;
out_msg.Destination.add(tbe.Requestor);
out_msg.MessageSize := MessageSizeType:Writeback_Control;
@@ -410,7 +411,7 @@ machine(MachineType:Directory, "MESI Two Level directory protocol")
peek(responseNetwork_in, ResponseMsg) {
enqueue(responseNetwork_out, ResponseMsg, to_mem_ctrl_latency) {
assert(is_valid(tbe));
- out_msg.addr := address;
+ out_msg.addr := tbe.PhysicalAddress;
out_msg.Type := CoherenceResponseType:DATA;
out_msg.DataBlk := in_msg.DataBlk; // we send the entire data block and rely on the dma controller to split it up if need be
out_msg.Destination.add(tbe.Requestor);
_______________________________________________
gem5-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]