If TARGET_GPU_ISA is not set, even if the GPU ISA namespace isn't used
by anything, the logic which figures out what to set it to will fail.
This checks for that condition and sets it to something invalid, but
doesn't crash. If that namespace is actually used, then the build will
still fail.
Change-Id: Iec44255cccbafa4aceaa68bdd8b6a835dc0637a0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56895
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The Request::NO_ACCESS flag instructs the cpu model to not issue
the request to the memory port.
While Atomic and Timing CPU models properly implement it [1], [2],
* MinorCPU is not looking at the flag
* O3CPU is looking at the flag only in case of a nested transaction
start/commit
This patch is extending NO_ACCESS support to all memory instructions.
This is achieved by using the localAccess callback in the Request object.
Handling of nested hardware transactions in the O3 LSQUnit is moved within
the local accessor callback
[1]: https://github.com/gem5/gem5/blob/v21.1.0.2/\
src/cpu/simple/timing.cc#L318
[2]: https://github.com/gem5/gem5/blob/v21.1.0.2/\
src/cpu/simple/atomic.cc#L396
Change-Id: Ifd5b388c53ead4fe358aa35d2197c12f1c5bb4f2
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56591
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: ZHENGRONG WANG <seanyukigeek@gmail.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
The MutableSet class used to be part of the collections module directly,
but in 3.3 was moved to collections.abc. Apparently there was still a
version in collections, since we had been importing it from that old
location and it had been working up until now. After a recent update,
this stopped working for me, and may be tied to an update to the local
version of python on my machine.
This change imports MutableSet from collections.abc instead of
collections directly. I found only one place that this class was used in
src or ext, so I don't think it needs to be fixed anywhere else.
Change-Id: I8b2e82160fd433d57af4a7008ec282ee8ad8a422
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56849
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabeblack@google.com>
This patch updates the canAllocate function to account both for
the number of regions of registers that need to be allocated,
and for the fact that the registers aren't one continuous chunk.
The patch also consolidates the registers as much as possible when
a register chunk is freed. This prevents fragmentation from making
it impossible to allocate enough registers
Change-Id: Ic95cfe614d247add475f7139d3703991042f8149
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56909
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
If the size of the address range is smaller than the maximum number
of outstanding requests allowed downstream, the tester will get stuck
trying to find a unique address. This patch adds a check for this
condition and forces the tester to wait for responses before
trying to generate another request.
Change-Id: Ie894a074cc4f8c7ad3d875dc21e8eb4f04562d72
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56811
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The tester currently assumes that one token per lane is needed when
checking if an action is ready to be issued. When actually issuing
requests, it is possible that a memory location is not valid for various
reasons. This was not being considered when checking for tokens causing
the tester to acquire more tokens than requests sent. Since tokens are
returned when requests are coalesced, this was causing some tokens never
to be returned, eventually depleting the token pool and causing a
deadlock.
Add a new method which determines the number of tokens needed for an
action. This is called by both the ready check method and the method to
issue the action to ensure they are aligned.
Change-Id: Ic1af72085c3b077539eb3fd129331e776ebdffbc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56450
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
This protocol is using an old style where read/writes to memory were
being done by writing to a DataBlock in a DirectoryMemory entry. This
results in having multiple copies of memory, leads to stale copies in at
least one memory (usually DRAM), and require --access-backing-store in
most cases to work properly. This changeset removes all references to
getDirectoryEntry(...).DataBlk and instead forwards those reads and
writes to DRAM always.
This results in new transient states BL_WM, BDW_WM, and B_WM which are
blocked states waiting on memory acks indicating a write request is
complete. The appropriate transitions are updates to move to these new
states and stall states are updated to include them. DMA write ACK is
also moved to when the request is sent to memory, rather than when the
request is received.
Change-Id: Ic5bd6a8a8881d7df782e0f7eed8be9d873610e04
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56446
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
The directory has an assert that this is at least one destination for a
probe when sending an invalidation or shared probe to coherence end
points in the protocol (TCC, LLC). This is not necessarily request and
for certain configurations there will be no probes required and none
will be sent. One such configuration is the GPU protocol tester which
would not require a probe to the CPU if it does not exist.
To fix this we first collect the probe destinations. Then we check if
any destinations exist. If so, we send the probe message. Otherwise we
immediately enqueue a probe complete message to the trigger queue. This
reorganization prevents messages with no destinations from being
enqueued, meeting the criteria for the assertion.
Change-Id: If016f457cb8c9e0277a910ac2c3f315c25b50ce8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55543
Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>
Maintainer: Matt Sinclair <mattdsinclair@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
JIRA: https://gem5.atlassian.net/browse/GEM5-1185
Fixed an issue in which a CleanUnique responder would incorrectly
deallocate the cache block when handling an stale CU when the state
is UD_RU or UC_RU (thus incorrectly transitioning to RU).
The fix is to handle stale CUs similarly to stale WBs where we
override the dataValid TBE field to prevent the wrong state
transition.
This patch moves the stale code path to a separate transition
(similarly to stale WBs/Evicts) and moves the dataValid override to
Initiate_Request_Stale so it applies to all stale request types.
Notice now the stale field is also set on stale Comp_UC responses.
Additional minor change: CheckUpgrade_FromRU is the same as
CheckUpgrade_FromStore so it was removed.
Change-Id: I0a2cedcfde1dc30d67aa2c16d71b7470369c2b6e
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56810
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Meatboy 106 <garbage2collector@gmail.com>
The recent change to add an "exclude" pattern to Glob in SCons also
seems to have triggered a bug where SCons has decided directories that
don't exist are files, and then gets upset later when we try to treat
them as directories.
To avoid that bug, and to also make recursive searching for isa parser
.py files work, we can replace the call to Glob with a loop based on
os.walk.
Also, tell the microcode assembler not to generate the parsetab.py file
in the first place. This comes with a minor performance overhead, but
shouldn't matter for us since there are *much* bigger overheads when
processing ISA descriptions.
Change-Id: Ia84e97dab72723ad3f4350798ad70178e231144c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56749
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
When instructions perform accesses, they embed the segment being used
into the request flags. When the CPU creates a request instead, for
instance when fetching an instruction, it doesn't know to do that.
This change adds a check in the TLB when makes sure CS is used when
checking a fetch, even if the flags didn't say to.
Change-Id: Ie9da3afc0b10eeb96247353150c64f1829cea41b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55247
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When using fast forwarding, createThreads() is not
called upon FutureCPUs. This causes segment fault
as the decoder is initialized in createThreads() and
needed when instantiating CPUs.
This commit basically fixes this by invoking
createThreads() on FutureCPUs after they are created.
Change-Id: I812d18f06878f9fc3fa2183a2c8a64d316413398
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56812
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Luming Wang <wlm199558@126.com>
Tested-by: kokoro <noreply+kokoro@google.com>
When dealing with segmentation in x86, it is *usually* illegal to
attempt to access a segment which has a null selector when in protected
mode and not in 64 bit mode. While this is *almost* true, it is not
actually technically true.
What actually *is* true is that if you *set up* a segment using a null
selector in those circumstances, that segment becomes unusable, and then
tryint to use it causes a fault.
When in real mode, it is perfectly legal to use a null selector to
access memory, since that is just a selector with numerical value 0.
When you then transition into protected mode, the selector would still
be 0 (a null selector), but the segment itself would still be set up
properly and usuable using the base value, limit, and other attributes
it carried over from real mode.
Rather than check if a segment has a null selector while handling
segmentation, it's more correct for us to keep track of whether the
segment is currently usable and check that in the TLB.
Change-Id: Ic2c09e1cfa05afcb03900213b72733545c8f0f4c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55245
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
These were loading the immediate into a temporary microcode register
which would then be used to calculate the address to actually send to
the memory system. Unfortunately this was using a data size equal to the
address size, which would mean that the immediate would be merged into
that temporary, leaving previously set bits intact. The data size
*should* have been set to 8, and was already in other similar
instructions. That forces the limm microop to overwrite the temporary
entirely.
Change-Id: I87c82b4677db768ccb6401a3dbda61317c014152
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55286
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Because we don't have a good way to actually figure out what python
files we depend on, we have to use Glob and wildcard matching to depend
on all potential python files. Unfortunately that will pick up the
parsetab.py file that ply generates, which is a cached intermediate file
and not an input.
Change-Id: Id3dc0083e97c145deca04939182157623d6b780f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56341
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
According to KVM Docs [1]:
"When KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 is supported, the target vcpu is
identified as (256 * vcpu2_index + vcpu_index). Otherwise, vcpu2_index
must be zero."
The vcpu parameter from the setIntState method is populated with
the gem5 context identifier (ContextID) of a specific PE.
It is not contrained by the 256 vcpu limit, so it can already specify
more than 256 vcpus. We therefore just need to translate/unpack the
value in two indices (vcpu and vcpu2) which will be forwarded to KVM
when raising an IRQ from userspace.
We guard the vcpu2 retrieval with a hash define as this is a late
addition and some older kernels do not define this capability (4.15 as
an example).
[1]: https://www.kernel.org/doc/html/latest/virt/kvm/api.html
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Change-Id: If0c475dc4a573337edd053020920e9b109d13991
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55964
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Whether to honor the USE_KVM setting can be handled in a more targeted
way down at individual consumers. They will know what ISA the host needs
to be for their particular model to make sense, where the SConstruct has
to enable or disable it for the entire build of gem5.
Change-Id: I9e82e232203834f6fe56d0ce5cf696809a0970bc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56188
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
During the page table walking,
MMU will perform the PMP check for each page table page.
However, in the current implementation,
the param:mode used by pmp_Check() is equal to the MMU mode,
which means the page table page has an executable mode
if the target page is executable (during pmp_Check).
As the page table page will never be executable,
the mode for the page table page is either read or write.
Jira Issue: https://gem5.atlassian.net/browse/GEM5-1143
Change-Id: I105f52ef58fe1fbbf7d84c6563e8a8c22cea9ccb
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/55063
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Ayaz Akram <yazakram@ucdavis.edu>