Commit Graph

14368 Commits

Author SHA1 Message Date
Giacomo Travaglini
f7d6dadc10 mem-ruby: Allow trivial integer operations with Addr type
At the moment an address value can only be used in the slicc code
to do TBE lookups but there is no way to add/subtract/divide/multiply
two addresses nor an address and an integer value.

This hinders the development of protocol specific code and
forces developers to place such code in shared
C++ structures

Change-Id: Ia184e793b6cd38f951f475a7cdf284f529972ccb
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-08 08:38:13 +01:00
Giacomo Travaglini
ddb6749b62 mem-ruby: Add static_cast by value in SLICC
At the moment it is possible to static_cast by pointer/reference only:

static_cast(type, "pointer", val) -> static_cast<type*>(val);
static_cast(type, "reference", val) -> static_cast<type&>(val);

With this patch it will also be possible to do something like

static_cast(type, "value", val) -> static_cast<type>(val);

Which is important when wishing to convert integer types into
custom onces and viceversa.

This patch is also deferring static_cast type check to C++

At the moment it is difficult to use the static_cast utility in slicc as
it tries to handle type checking in the language itself. This would
force us to explicitly define compatible types (like an Addr and an int
as an example). Rather than pushing the burden on us, we should always
allow a developer to use a static_cast in slicc and let the C++ compiler
complain if the generated code is not compatible

Change-Id: I0586b9224b1e41751a07d15e2d48a435061c2582
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-09-08 08:38:13 +01:00
Bobby R. Bruce
eb5ae35341 resources,stdlib: Add workload to resource specialization and deprecate workload.py (#212) 2023-09-07 12:45:45 -07:00
Johnny
105839ae2b sim: add bypass_on_change to the set() of a signal
When reset a port, we don't want to trigger a onChange().
Offer an option to bypass it and update state only.

Change-Id: Ia53b7a76d2a320ea67101096cdbfe2eafaf440d2
2023-09-07 11:54:56 +08:00
Nicholas Mosier
3dfdd48211 misc: Fix buggy special path comparisons
This patch fixes the buggy special path comparisons in
src/kern/linux/linux.cc Linux::openSpecialFile(), which only checked
for equality of path prefixes, but not equality of the paths
themselves. This patch replaces those buggy comparisons with
regular std::string::operator== string equality comparisons.

GitHub issue: https://github.com/gem5/gem5/issues/269

Change-Id: I216ff8019b9a6a3e87e364c2e197d9b991959ec1
2023-09-05 13:44:10 -07:00
Matthew Poremba
2da54d5a4f mem-ruby: Reorder SLC atomic and response actions
Currently the MOESI_AMD_Base-directory transition for system level
atomics sends the response message before the atomic is performed. This
was likely done because atomics are supposed to return the value of the
data *before* the atomic is performed and by simply ordering the actions
this way that was taken care of.

With the new atomic log feature, the atomic values are pulled from the
log by the coalescer on the return path. Therefore, these actions can be
reordered. However, it is now necessary that the atomics be performed
before sending the response so that the log is populated and copied by
the response action. This should fix #253 .

Change-Id: Ie7e178f93990975367de2cc3e89e5ef9c9069241
2023-09-01 10:36:54 -05:00
Bobby R. Bruce
8d47cda8b6 arch-x86: Fix wrong x86 assembly (#251)
The RM field of ModRM was printed as Reg field for several instructions.

For reference, this change fixes typos introduced by [1].

[1] https://gem5-review.googlesource.com/c/public/gem5/+/40339
2023-09-01 00:26:00 -07:00
Hoa Nguyen
4ff1f160ec arch-x86: Fix wrong x86 assembly
The RM field of ModRM was printed as Reg field for several instructions.

For reference, this change fixes typos introduced by [1].

[1] https://gem5-review.googlesource.com/c/public/gem5/+/40339

Change-Id: I41eb58e6a70845c4ddd6774ccba81b8069888be5
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-09-01 00:26:51 +00:00
Harshil Patel
7042d74ac2 stdlib, resources: Moved parsing params.
- moved parsing from WorkloadResource __init__ to obtain_resource

Change-Id: I9ed2aebb23af7f359bc1e5fff8ebe616a0da1374
2023-08-31 09:19:24 -07:00
Matthew Poremba
cfa833a97d gpu-compute: Set LDS/scratch aperture base register
Starting with gfx900 (Vega) the LDS and scratch apertures can be queried
using a new s_getreg_b32 instruction. If the instruction is called with
the SH_MEM_BASES argument it returns the upper 16 bits of a 64 bit
address for the LDS and scratch apertures. The current addresses cannot
be encoded in this register, so that addresses are changed to have the
lower 48 bits be all zeros in addition to writing the bases register.

Change-Id: If20f262b2685d248afe31aa3ebb274e4f0fc0772
2023-08-31 11:01:32 -05:00
Bobby R. Bruce
0e323bc409 mem: Atomic ops to same address (#200)
Augmenting the DataBlock class with a change log structure to record the
effects of atomic operations on a data block and service these changes
if the atomic operations require return values.

Although the operations are atomic, the coalescer need not send unique
memory requests for each operation. Atomic operations within a wavefront
to the same address are now coalesced into a single memory request. The
response of this request carries all the necessary information to
provide the requesting lanes unique values as a result of their
individual atomic operations. This helps reduce contention for request
and response queues in simulation.

Previously, only the final value of the datablock after all atomic ops
to the same address was visible to the requesting waves. This change
corrects this behavior by allowing each wave to see the effect of this
individual atomic op is a return value is necessary.
2023-08-30 23:53:35 -07:00
Bobby R. Bruce
c156df620d resources, stdlib: Add support for local files in obtain_resource (#204)
This patch allows a local JSON file to specify a local path in the JSON
object of a Resource, through the "url" field.

Local paths can be entered with the prefix "file:" in the "url" field.

If the local path exists, then the Resource from there is copied into
the resource directory defined in the
function earlier.

This behavior is the same as using specific Resource classes (ex.
BinaryResource) and passing a local_path into the function.

But, the above class does not allow simultaneous creation of local
Resources and Workloads of those local Resources.

With this patch, someone can use a local JSON, specify the location of
local Resources and create a Workload from those Resources and test both
together.
2023-08-29 20:35:40 -07:00
KUNAL PAI
d52c7ce87f resources, stdlib: Add support for local files in obtain_resource
This patch allows a local JSON file to specify a local path
in the JSON object of a Resource, through the "url" field.

Local paths can be entered with the prefix "file:" in "url".
All File URI scheme formats are supported.

This behavior is the same as using specific Resource classes
(ex. BinaryResource) and passing a local_path into the function.

But, the above infrastructure does not allow simultaneous
creation of Resources and Workloads of those Resources.

With this patch, someone can use a local JSON, specify the location
of local Resources and create a Workload from those Resources and
test both together.

Also, this patch adds pyunit tests to check the functionality
of the function used to convert the "url" field into a path.

Change-Id: I1fa3ce33a9870528efd7751d7ca24c27baf36ad4
2023-08-29 09:47:03 -07:00
Harshil Patel
7da0aeee7d stdlib, resources: Updated warn messages
- Updated warn messages to be more clear.

Change-Id: I2a8922e96b5b544f2f229b01b3c51fc5e79995b4
2023-08-29 09:08:44 -07:00
Bobby R. Bruce
68a48a2dfa mem-ruby: fix CHI sending the wrong snoop response (#219)
Do not respond with SnpRespData_I when the line is still present
upstream.
2023-08-28 16:21:25 -07:00
Bobby R. Bruce
737c611e72 mem-ruby: fix assert on CHI ReadUnique (#218)
DCT must be disabled when handling a ReadUnique where the copy need to
be upgraded.

Previously we were just asserting as it was assumed DCT is only enabled
for HNFs (which can "auto-upgrade"). However DCT may also be enabled for
intermediated levels of distributed shared caches above the HNFs.
2023-08-28 16:06:09 -07:00
Bobby R. Bruce
4bd3d2f864 mem-ruby: Improve Ruby/CHI stats for in/out trans (#220)
Currently we generate these stats for all defined Events in the
protocol, which may generate too many stats that are never used. Though
these don't appear in the stats.txt file, they unnecessarily increases
simulation startup time and memory footprint.

This patch limits those stats to events with the "in_trans" and/or
"out_trans" properties. SLICC compiler then checks which combinations of
event+state are possible when generating the stats.

Also the possible level of detail for inTransLatHist was reduced.
Only the number of transactions for each event+initial+final state
combinations is now accounted. Latency histograms are only defined per
event type (similarly to outTransLatHist). This significantly reduces
the final file size for generated stats.
2023-08-28 15:06:39 -07:00
Matthew Poremba
60f071d09a gpu-compute,arch-vega: Implement flat scratch insts
Flat scratch instructions (aka private) are the 3rd and final segment of
flat instructions in gfx9 (Vega) and beyond. These are used for things
like spills/fills and thread local storage. This commit enables two
forms of flat scratch instructions: (1) flat_load/flat_store
instructions where the memory address resolves to private memory and (2)
the new scratch_load/scratch_store instructions in Vega. The first are
similar to older generation ISAs where the aperture is unknown until
address translation. The second are instructions guaranteed to go to
private memory.

Since these are very similar to flat global instructions there are
minimal changes needed:

- Ensure a flat instruction is either regular flat, global, XOR scratch
- Rename the global op_encoding methods to GlobalScratch to indicate
  they are for both and are intentionally used.
- Flat instructions in segment 1 output scratch_ in the disassembly
- Flat instruction executed as private use similar mem helpers as global
- Flat scratch cannot be an atomic

This was tested using a modified version of the 'square' application:

template <typename T>
__global__ void
scratch_square(T *C_d, T *A_d, size_t N)
{
    size_t offset = (blockIdx.x * blockDim.x + threadIdx.x);
    size_t stride = blockDim.x * gridDim.x ;

    volatile int foo; // Volatile ensures scratch / unoptimized code

    for (size_t i=offset; i<N; i+=stride) {
        foo = A_d[i];
        C_d[i] = foo * foo;
    }
}

Change-Id: Icc91a7f67836fa3e759fefe7c1c3f6851528ae7d
2023-08-26 13:40:12 -05:00
Matthew Poremba
4506188e00 gpu-compute: Fix private offset/size register indexes
According to the ABI documentation from LLVM, the *low* register of flat
scratch (maxSGPR - 4) is the offset and the high register (maxSGPR - 3)
is size. These are currently backwards, resulting in some gnarly
addresses being generated leading to page fault and/or incorrect data.

This commit fixes this by setting the order correctly.

Change-Id: I0b1d077c49c0ee2a4e59b0f6d85cdb8f17f9be61
2023-08-26 13:40:12 -05:00
Matthew Poremba
e0379f4526 gpu-compute: Fix flat scratch resource counters
Flat instructions may access memory locations in LDS (scratchpad) and
global (VRAM/framebuffer) and therefore increment both counters when
dispatched. Once the aperture is known, we decrement the counters of the
aperture that was *not* used. This is done incorrectly for scratch /
private flat instruction. Private memory is global and therefore local
memory counters should be decremented.

This commit fixes the counters by changing the global decrements to
local decrements.

Change-Id: I25890446908df72e5469e9dbaba6c984955196cf
2023-08-26 13:40:12 -05:00
Matthew Poremba
57b3d2897c gpu-compute: Use timing DMAs for GPUFS HSA signals
The functional HSA signal read was a hack left in the gpu-compute code.
In full system, this functional read is causing problems occasionally
with the translation not yet being in the page table. The error message
output by gem5 was a fatal message on the readBlob method in port proxy.
Changing this to a timing DMA fixes this problem.

This commit adds the various timing DMA functions to send and receive
response and clean up. A helper method "sendCompletionSignal" is added
to the GPUCommandProcessor because the indentation level was getting too
deep. This change applies only to FS mode. Code for SE mode is
equivalent to what it was before this commit.

Change-Id: I1bfcaa0a52731cdf9532a7fd0eb06ab2f0e09d48
2023-08-25 13:10:51 -05:00
Matthew Poremba
fcbed2bd8a dev-amdgpu: Tell OS about PCIe atomic support (#224)
configs,dev-amdgpu: Add PCI express capability info

The ROCm stack requires PCI express atomics. Currently the first PCI
CapabilityPtr does not point to anything, which signals to the OS
(Linux) that this is an early generation PCI device. As PCI express
atomics were introduced later, the CapabilityPtr needs to point to at
least a PCI express capability structure. This capability is defined as
0x10 in Linux. We additionally set the PCI atomic based bits and
implement device specific PCI configuration space reads and writes to
the amdgpu device.

The second commit, output of simulation when loading the amdgpu
driver no longer outputs "PCIE atomics not supported". Further, an
application which uses PCIe atomics (PyTorch with a reduce_sum kernel)
now makes further progress.

First commit is a minor typo fix changing PCI capability struct to
union.
2023-08-24 11:19:30 -07:00
Bobby R. Bruce
7aa896fe8f cpu-minor: Separate the reg_index of VecClassReg and VecElemReg (#225)
In the RISC-V system, we need to VecClassReg to run RISC-V vector
instruction, and VecElemReg is not applicable because the element length
of vector can be resizable via vset\*vl\* instruction.

The change will seperate the reg_index for VecReg and VecElemReg to
ensure that have the space for VecReg when VecElemReg is not applicable.
2023-08-24 10:13:21 -07:00
Giacomo Travaglini
56a8ab3f3c sim: provide a signal constructor with an init_state (#210)
The current SignalSinkPort and SignalSourcePort have no ways to assign
the init value of the state. Add a new constructor for them with the
param init_state

Bug: 293410800
Test: boot to linux
Change-Id: Idde0a12aa0ddd0c9c599ef47059674fb12aa5d68
Reviewed-on:
https://soc-sim-external-review.googlesource.com/c/gem5/gem5/+/13159
Gem5-Virtual-Platform-Presubmit-Ready: Johnny Ko <johnnyko@google.com>
Reviewed-by: Yu-hsin Wang <yuhsingw@google.com>
Perf-Presubmit-Ready: Johnny Ko <johnnyko@google.com>
Gem5-Virtual-Platform-Verified: kokoro <noreply+kokoro@google.com>
Perf-Verified: kokoro <noreply+kokoro@google.com>
2023-08-24 18:06:21 +01:00
Bobby R. Bruce
e77666d9e8 mem-ruby: fix CHI Evict race condition (#217)
When an Evict request is received from upstream for a shared line and
the line is no longer cached locally (or on any other upstream cache),
we need to also send an Evict downstream. In this case we need to wait
until our outgoing Evict completes before completing the Evict from
upstream in order be able to resolve race conditions with incoming
snoops. E.g.: while our outgoing Evict is pending we may receive a snoop
requesting data, but we won't be able to complete this snoop if we have
already completed all upstream Evicts and we no longer have the line.
2023-08-24 10:04:28 -07:00
Matthew Poremba
addba01d29 configs,dev-amdgpu: Add PCI express capability info
The ROCm stack requires PCI express atomics. Currently the first PCI
CapabilityPtr does not point to anything, which signals to the OS
(Linux) that this is an early generation PCI device. As PCI express
atomics were introduced later, the CapabilityPtr needs to point to at
least a PCI express capability structure. This capability is defined as
0x10 in Linux. We additionally set the PCI atomic based bits and
implement device specific PCI configuration space reads and writes to
the amdgpu device.

With this commit, the output of simulation when loading the amdgpu
driver no longer outputs "PCIE atomics not supported". Further, an
application which uses PCIe atomics (PyTorch with a reduce_sum kernel)
now makes further progress.

Change-Id: I5e3866979659a2657f558941106ef65c2f4d9988
2023-08-24 09:10:35 -05:00
Roger Chang
5c28113a06 cpu-minor: Separate the reg_index of VecClassReg and VecElemReg
In the RISC-V system, we need to VecClassReg to run RISC-V vector
instruction, and VecElemReg is not applicable because the element
length of vector can be resizable via vset*vl* instruction.

The change will seperate the reg_index for VecReg and VecElemReg to
ensure that have the space for VecReg when VecElemReg is not
applicable.

Change-Id: I99a82dec273baeee31df89a0ee0f5e87f3ff187c
2023-08-24 13:27:27 +08:00
Matthew Poremba
8b4c38302f dev: PCI: Fix PCI express capability union
The capabilities for PCI express is a struct, instead of a union, like
the other capability unions. A union is used here to provide access to
the ordinal data values when reading/writing an offset while
simultaneously providing human readable field values that can be set
when writing the code.

This commit changes it to union which is likely should be. Nothing
appears to be using this union yet so it is likely an oversight.

Change-Id: I85fe7cc62914525c70fd7a5946d725ed308f8775
2023-08-23 19:32:38 -05:00
Matthew Poremba
90a518e885 gpu-compute,arch-vega: Fix ALU-only LDS counters
There are a few LDS instructions that perform local ALU operations and
writeback which are marked as loads. These are marked as loads because
they fit in the pipeline logic better, according to a several year old
comment. In the VEGA ISA these instructions (swizzle, permute, bpermute)
are not decrementing the LDS load counter. As a result, the counter will
gradually increase over time. Since wavefront slots are persistent, this
can cause applications with a few thousand kernels to eventually hang
thinking there are not enough resources.

This changeset fixes this by decrementing the LDS load counter for these
instructions. This fix was already integrated in the GCN3 ISA in the
exact same way. This changeset moves it near a similar comment about
scheduling register file writes.

Change-Id: Ife5237a2cae7213948c32ef266f4f8f22917351c
2023-08-23 19:30:24 -05:00
Tiago Mück
9584d2efa9 mem-ruby: add in_trans/out_trans to CHI events
Marks which events signal the beginning of incoming and outgoing
transactions for generating inTransLatHist and outTransLatHist stats.

Change-Id: I90594a27fa01ef9cfface309971354b281308d22
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-08-23 17:25:50 -05:00
Tiago Mück
3360a87d5a mem-ruby: optimize in/outTransLatHist stats
Generating these stats for all defined Events may generate too many
stats that are never used, which unnecessarily increases simulation
startup time and memory consumption.

This patch limits those stats to events with the "in_trans" and/or
"out_trans" properties. SLICC compiler then checks which combinations
of event+state are possible when generating the stats.

Also the possible level of detail for inTransLatHist was reduced.
Only the number of transactions for each event+initial+final state
combinations is now accounted. Latency histograms are only defined
per event type (similarly to outTransLatHist). This significantly
reduces the final file size for generated stats.

Change-Id: I29aaeb771436cc3f0ce7547a223d58e71d9cedcc
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-08-23 17:25:38 -05:00
Tiago Mück
a5fd6edea1 mem-ruby: fix CHI sending the wrong snoop response
Do not respond with SnpRespData_I when the line is still present
upstream.

Change-Id: I2592e5c6637cfc0e83042169a245837648276e61
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-08-23 17:04:09 -05:00
Tiago Mück
49f5ec16d1 mem-ruby: fix assert on CHI ReadUnique
DCT must be disabled when handling a ReadUnique where the copy
need to be upgraded.

Previously we were just asserting as it was assumed DCT is only enabled
for HNFs (which can "auto-upgrade"). However DCT may also be enabled
for intermediated levels of distributed shared caches above the HNFs.

Change-Id: I9e29142a8d2f59ea61c1d90cda6b00c19435d6b7
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-08-23 16:58:25 -05:00
Harshil Patel
328d140c70 stdlib, resources: Added warn msgs and commets.
- Added deprecated warnings to Workload and Abstract workload.

- Added comments to the classes changed.

Change-Id: I671daacf5ef455ea65103bd96aa442486142a486
2023-08-23 13:50:08 -07:00
Reiley Jeyapaul
c9ff54677f mem-ruby: fix CHI Evict race condition
When an Evict request is received from upstream for a shared line
and the line is no longer cached locally (or on any other upstream
cache), we need to also send an Evict downstream. In this case we need
to wait until our outgoing Evict completes before completing the Evict
from upstream in order be able to resolve race conditions with incoming
snoops. E.g.: while our outgoing Evict is pending we may receive a
snoop requesting data, but we won't be able to complete this snoop if
we have already completed all upstream Evicts and we no longer have the
line.

Change-Id: I23ac4f0a9c4ddd81e2425376c8d1e1c7fb66d107
Signed-off-by: Tiago Mück <tiago.muck@arm.com>
2023-08-23 15:49:51 -05:00
Ranganath (Bujji) Selagamsetty
f6a453362f mem: Atomic ops to same address
Augmenting the DataBlock class with a change log structure to
record the effects of atomic operations on a data block and
service these changes if the atomic operations require return
values.

Although the operations are atomic, the coalescer need not
send unique memory requests for each operation. Atomic
operations within a wavefront to the same address are now
coalesced into a single memory request. The response of this
request carries all the necessary information to provide the
requesting lanes unique values as a result of their individual
atomic operations. This helps reduce contention for request
and response queues in simulation.

Previously, only the final value of the datablock after all
atomic ops to the same address was visible to the requesting
waves. This change corrects this behavior by allowing each wave
to see the effect of this individual atomic op is a return value
is necessary.

Change-Id: I639bea943afd317e45f8fa3bff7689f6b8df9395
2023-08-23 14:45:25 -05:00
Johnny
76fe71ebd0 sim: provide a signal constructor with an init_state
Add more description to the code

Change-Id: Iff8fb20762baa0c9d0b7e5f24fb8769d7e198b5c
2023-08-23 10:49:15 +08:00
Johnny
6acb687975 sim: provide a signal constructor with an init_state
1. The current SignalSinkPort and SignalSourcePort have no
   ways to assign the init value of the state. Add a new constructor
   for them with the param init_state
2. After the source and sink are bound, the state at both side should
   be the same. Set the the state of sink to the state of source in the
   bind() function.

Change-Id: Idde0a12aa0ddd0c9c599ef47059674fb12aa5d68
2023-08-23 10:12:41 +08:00
Jason Lowe-Power
e3414c7098 base: Make 'findLsbSetFallback' constexpr to fix gcc-8 comp (#203)
Compilation bug found on:
https://github.com/gem5/gem5/actions/runs/5899831222/job/16002984553

In gcc Version 8 and below the following error is received:

```
src/base/bitfield.hh: In function ‘constexpr int gem5::findLsbSet(uint64_t)’:
src/base/bitfield.hh:365:34: error: call to non-‘constexpr’ function ‘int gem5::{anonymous}::findLsbSetFallback(uint64_t)’
         return findLsbSetFallback(val);
                ~~~~~~~~~~~~~~~~~~^~~~~
scons: *** [build/ALL/kern/linux/events.o] Error 1
```

`findLsbSet` cannot be `constexr` as it calls non-constexpr function
`findLsbSetFallback`. `findLsbSetFallback`. The problematic function is
the `count` on the std::bitset.

This patch changes this to a constexpr.
2023-08-22 11:23:59 -07:00
Bobby R. Bruce
709f632730 base: Make 'findLsbSetFallback' constexpr to fix gcc-8 comp
Compilation bug found on:
https://github.com/gem5/gem5/actions/runs/5899831222/job/16002984553

In gcc Version 8 and below the following error is received:

```
src/base/bitfield.hh: In function ‘constexpr int gem5::findLsbSet(uint64_t)’:
src/base/bitfield.hh:365:34: error: call to non-‘constexpr’ function ‘int gem5::{anonymous}::findLsbSetFallback(uint64_t)’
         return findLsbSetFallback(val);
                ~~~~~~~~~~~~~~~~~~^~~~~
scons: *** [build/ALL/kern/linux/events.o] Error 1
```

`findLsbSet` cannot be `constexr` as it calls non-constexpr function
`findLsbSetFallback`. `findLsbSetFallback`. The problematic function is
the `count` on the std::bitset.

This patch changes this to a constexpr.

Change-Id: I48bd15d03e4615148be6c4d926a3c9c2f777dc3c
2023-08-21 14:04:36 -07:00
Hoa Nguyen
9e007e5bd7 mem-cache: fix wrong function call
Change-Id: I924ede89f373ec21557faf25c96b36f4bc8430dd
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-08-19 22:56:55 +00:00
Hoa Nguyen
f442846d9d mem-cache: Fix another typo
Change-Id: Ib2051f9bda6e6d9002d3be1dbf0b890299098201
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-08-19 22:50:53 +00:00
Hoa Nguyen
7b897a30fa mem-cache: Fix syntax error
Change-Id: I1360879c13d377661e9eeeddf345b785c01efeb6
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-08-19 21:27:53 +00:00
Hoa Nguyen
98daec7d99 mem-cache: Allow clflush's uncacheable requests on classic cache
When a linux kernel changes a page property, it flushes the related cache
lines. The kernel might change the page property before flushing the
cache lines. This results in the clflush might occur in an uncacheable region.

Currently, an uncacheable request must be a read or a write. However,
clflush request is neither of them.

This change aims to allow clflush requests to work on uncacheable regions.
Since there is no straightforward way to check if a packet is from a clflush
instruction, this change permits all Clean Invalidate Requests, which is
the type of request produced by clflush, to work on uncacheable regions.

Change-Id: Ib3ec01d9281d3dfe565a0ced773ed912edb32b8f
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-08-19 18:20:16 +00:00
Harshil Patel
a18b4b17ed stdlib, resources: depricated workload
- Added WrokloadResource in resource.py.

- depricated Workload and CustomWorkload.

- changed iscvmatched-fs.py with obtain resource for workload to test.

Change-Id: I2267c44249b96ca37da3890bf630e0d15c7335ed
Note: change example files back to original
2023-08-18 13:56:12 -07:00
Bobby R. Bruce
30ab2c19b1 stdlib: Allow passing of func as Exit Event generator (#195)
In this case the function is turned into a generator with the "yield" of
the generator the return the function's execution.

Translation of this stale Gerrit Change:
https://gem5-review.googlesource.com/c/public/gem5/+/62872
2023-08-18 10:55:50 -07:00
Harshil Patel
ea5951a467 stdlib, resources: skeleton for workload resouce
Change-Id: I5017ac479ad61c767ede36fae195105e0519304f
2023-08-18 08:57:31 -07:00
Bobby R. Bruce
c0216dbe48 stdlib: Allow passing of func as Exit Event generator
In this case the function is turned into a generator with the
"yield" of the generator the return the function's execution.

Change-Id: I4b06d64c5479638712a11e3c1a2f7bd30f60d188
2023-08-17 16:48:33 -07:00
Jason Lowe-Power
22c52f4fba Fix reporting traps (faults) to GDB in SE mode (#166)
This addresses #123
2023-08-17 16:08:49 -07:00
Jan Vrany
3564348eec arch-riscv: Report traps to GDB in SE mode
This commit add code to report illegal instruction and breakpoint traps
to GDB (if connected). This merely follows what POWER does.
2023-08-17 15:55:04 +01:00