Currently we drop the insertion of a whole symbol table if the name of
one symbol already exists in the base table. Having similar symbols
across different binaries is common.
This change adds a warning and recommends a fix instead of silently
dropping the table.
This PR utilizes GitHub Action's matrix's to automatically distribute
the CI testlib gem5 build and test jobs across available GitHub Action
Runners.
The CI tests (the `quick` testlib tests, i.e. those run with `./main.py
run`) are distributed across the runners on a per directory basis ---
all directories under "tests/gem5" are run as their own jobs.
The necessary gem5 builds for each workflow are now automatically
inferred via the introduction of `./main.py list`'s `--build-targets`
flag which returns the gem5 build target for a given test or collection
of tests. E.g., `./main.py list --build-targets` will return the build
targets for all the `quick` testlib tests and `./main.py list
--build-target --uid=<id>` will return the build targets the test suite
`<id>` requires.
Moving from monolithic jobs to fine-grained ones will make the locaiton
of test failures more obvious. Each job has it's own artifact containing
"test/testing-results" for the tests run in that job. In addition,
maintenance of these files should become less burdensome due to less
hardcoding.
As discussed here, [1], O3CPU counts getWritableRegOperand() as a reg
read, while SimpleCPU variants count getWriableRegOperand() as a reg
write.
This patch fixes this inconsistency. Here, I assume that if
getWritableRegOperand() is used, setReg() will not be used again to
write to the destination register.
[1] https://github.com/gem5/gem5/pull/341
The auxv platform string was not copied to the same location that was
pointed to by the value of AT_PLATFORM; instead, it was copied over the
auxv random buffer. This patch fixes this by copying the auxv platform
string to the right offset in the initial program stack.
GitHub issue: https://github.com/gem5/gem5/issues/346
The popx87 micro-op did not in fact pop the st(0) floating-point
register off the stack; it acted as a no-op. This patch fixes the bug by
passing the spm=1 argument to PopX87's superclass to indicate the
floating-point stack pointer should be incremented.
GitHub issue: https://github.com/gem5/gem5/issues/344
- Added mulitline string for print message
- Added get_category_name method instead of having category as variable
Change-Id: I51e0e14a70e802453c21070711b200bc47994ba3
This introduces the changes necessary for clang-15 and clang-16
to run within gem5, and adds them to the compiler tests.
Change-Id: If809eae1bd8c366b4d62476891feff0625bdf210
There are two overloaded-virtual issues reported by g++13.
1. Copy assignment and move assignment overload is hidden in the derived
class
[ CXX] src/mem/cache/replacement_policies/weighted_lru_rp.cc -> ALL/mem/cache/replacement_policies/weighted_lru_rp.o
In file included from src/mem/cache/base.hh:61,
from src/mem/cache/base.cc:46:
src/mem/cache/cache_blk.hh:172:5: error: ‘virtual gem5::CacheBlk& gem5::CacheBlk::operator=(gem5::CacheBlk&&)’ was hidden [-Werror=overloaded-virtual=]
172 | operator=(CacheBlk&& other)
| ^~~~~~~~
src/mem/cache/cache_blk.hh:518:19: note: by ‘gem5::TempCacheBlk& gem5::TempCacheBlk::operator=(const gem5::TempCacheBlk&)’
518 | TempCacheBlk& operator=(const TempCacheBlk&) = delete;
| ^~~~~~~~
In this case, we can exiplict using parent operator= to keep the
function overload.
2. Intended overload hidden in SystemC is reported as error.
In file included from src/systemc/ext/tlm_utils/simple_initiator_socket.h:24,
from src/systemc/tlm_bridge/gem5_to_tlm.hh:72,
from build/ALL/python/_m5/param_Gem5ToTlmBridge256.cc:17:
src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh: In instantiation of ‘class tlm::tlm_base_initiator_socket<256, tlm::tlm_fw_transport_if<>, tlm::tlm_bw_transport_if<>, 1, sc_core::SC_ONE_OR_MORE_BOUND>’:
src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh:185:7: required from ‘class tlm::tlm_initiator_socket<256, tlm::tlm_base_protocol_types, 1, sc_core::SC_ONE_OR_MORE_BOUND>’
src/systemc/ext/tlm_utils/simple_initiator_socket.h:37:7: required from ‘class tlm_utils::simple_initiator_socket_b<sc_gem5::Gem5ToTlmBridge<256>, 256, tlm::tlm_base_protocol_types, sc_core::SC_ONE_OR_MORE_BOUND>’
src/systemc/ext/tlm_utils/simple_initiator_socket.h:156:7: required from ‘class tlm_utils::simple_initiator_socket<sc_gem5::Gem5ToTlmBridge<256>, 256, tlm::tlm_base_protocol_types>’
src/systemc/tlm_bridge/gem5_to_tlm.hh:147:46: required from ‘class sc_gem5::Gem5ToTlmBridge<256>’
/usr/include/c++/13/type_traits:1411:38: required from ‘struct std::is_base_of<sc_gem5::Gem5ToTlmBridgeBase, sc_gem5::Gem5ToTlmBridge<256> >’
ext/pybind11/include/pybind11/detail/../detail/common.h:880:59: required from ‘struct pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>, sc_gem5::Gem5ToTlmBridgeBase, std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >::is_valid_class_option<sc_gem5::Gem5ToTlmBridgeBase>’
ext/pybind11/include/pybind11/detail/../detail/common.h:719:35: required by substitution of ‘template<class ... Ts> using pybind11::detail::all_of = pybind11::detail::bool_constant<(Ts::value && ...)> [with Ts = {pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>, sc_gem5::Gem5ToTlmBridgeBase, std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >::is_valid_class_option<sc_gem5::Gem5ToTlmBridgeBase>, pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>, sc_gem5::Gem5ToTlmBridgeBase, std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >::is_valid_class_option<std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >}]’
ext/pybind11/include/pybind11/pybind11.h:1506:70: required from ‘class pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>, sc_gem5::Gem5ToTlmBridgeBase, std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >’
build/ALL/python/_m5/param_Gem5ToTlmBridge256.cc:34:179: required from here
src/systemc/ext/tlm_utils/../core/sc_port.hh:125:18: error: ‘void sc_core::sc_port_b<IF>::bind(sc_core::sc_port_b<IF>&) [with IF = tlm::tlm_fw_transport_if<>]’ was hidden [-Werror=overloaded-virtual=]
125 | virtual void bind(sc_port_b<IF> &p) { sc_port_base::bind(p); }
| ^~~~
In file included from src/systemc/ext/tlm_utils/simple_initiator_socket.h:27:
src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh:133:18: note: by ‘tlm::tlm_base_initiator_socket<256, tlm::tlm_fw_transport_if<>, tlm::tlm_bw_transport_if<>, 1, sc_core::SC_ONE_OR_MORE_BOUND>::bind’
133 | virtual void bind(bw_interface_type &ifs) { (get_base_export())(ifs); }
| ^~~~
src/systemc/ext/tlm_utils/../core/sc_port.hh:124:18: error: ‘void sc_core::sc_port_b<IF>::bind(IF&) [with IF = tlm::tlm_fw_transport_if<>]’ was hidden [-Werror=overloaded-virtual=]
124 | virtual void bind(IF &i) { sc_port_base::bind(i); }
| ^~~~
src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh:133:18: note: by ‘tlm::tlm_base_initiator_socket<256, tlm::tlm_fw_transport_if<>, tlm::tlm_bw_transport_if<>, 1, sc_core::SC_ONE_OR_MORE_BOUND>::bind’
133 | virtual void bind(bw_interface_type &ifs) { (get_base_export())(ifs); }
| ^~~~
From the code comment, it's intended in SystemC header.
// The overloaded virtual is intended in SystemC, so we'll disable the warning.
// Please check section 9.3 of SystemC 2.3.1 release note for more details.
The issue is we should move the skip to the base class.
Change-Id: I6683919e594ffe1fb3b87ccca1602bffdb788e7d
Adds a new probe listener template which can be used to instantiate with
a lambda function that is called by notify(). It is similar to
ProbeListenerArg with class but provides more flexibility. I.e. the can
be another object than the one instantiating the lambda which allows to
listen to any object. Furthermore additional parameters can be passed in
easily.
Change-Id: Iba451357182caf25097b9ae201cd5c647aff3a4f
With this PR our CHI implementation starts making use of the txnid and
DBID identifiers.
Note: we were already making use of the txnId for DVM messages to convey
the DVM address. This is still the case.
In the future we should realign the DVM logic so that the txnId is
solely used as a transaction identifier.
Current we drop the insertion of a whole symbol table if the name
of one symbol already exists in the base table. Having similar
symbols across different binaries is very common.
This change adds a warning and recommends a fix instead of silently
dropping the table. This is useful for debugging when there are two
or more workloads, e.g. bootloader + kernel, are added separately.
Change-Id: I9e4cf06037cd70926fb5cee3c4dab464daf0912e
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
As discussed here, [1], O3CPU counts getWritableRegOperand() as a
reg read, while SimpleCPU variants count getWriableRegOperand()
as a reg write.
This patch fixes this inconsistency. Here, I assume that if
getWritableRegOperand() is used, setReg() will not be used again
to write to the destination register.
[1] https://github.com/gem5/gem5/pull/341
Change-Id: If00049eb598f6722285e9e09419aef98ceed759f
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
Adds a new probe listener template which can be used
to instantiate with a lambda function that is called by
notify(). It is similar to ProbeListenerArg with class but
provides more flexibility. I.e. the can be another object
than the one instantiating the lambda which allows to listen
to any object. Furthermore additional parameters can be
passed in easily.
Change-Id: Iba451357182caf25097b9ae201cd5c647aff3a4f
Signed-off-by: David Schall <david.schall@ed.ac.uk>
At the moment the instruction is disassembled as an integer
operation:
msrNZCV x547, x0
Instead of
msr nzcv x0
Change-Id: I3f6576dccbe86db401c73747750ca3cfdf4055d5
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
The change will allow developers to implement and decode their
non-standard instructions to the CPU models
Bug: 289467440
Test: None
Change-Id: I67f4abc71596f819c1265e325784f51c8e9bb359
By default, the --stderr-file and --stdout-file arguments were
directing the simulator output to files named "simerr" and
"simout" respectively if an output redirect was requested.
A small annoyance is these files lack an extension meaning programs
refuse to open them, or to do so without some additional effort. On
many systems they are assumed to scripts.
This patch adds the .txt extension to both, thus clearly indicating
to other programs these are text files and can be opened to be read
as such.
Change-Id: Iff5af4a9e6966b4467d005a029dbf401099fbd35
This allows us to generate stubs for the modules in gem5. The output
will be a "typings" directory which can be used by Pylance (Python
IntelliSense) to infer typings in Visual Studio Code.
Note: A "typings" directory in the root of the workspace is the default
location for Pylance to look for typings. This can be changed via
`python.analysis.stubPath` in "settings.json".
Usage
=====
```
pip3 install -r requirements.txt
scons build/ALL/gem5.opt -j$(nproc)
./build/ALL/gem5.opt util/gem5-stubgen.py
```
The auxv platform string was not copied to the same location that was
pointed to by the value of AT_PLATFORM; instead, it was copied over
the auxv random buffer. This patch fixes this by copying the auxv
platform string to the right offset in the initial program stack.
GitHub issue: https://github.com/gem5/gem5/issues/346
Change-Id: Ied4b660d5fc444a94acb97b799be0a3722438b5e
The popx87 micro-op did not in fact pop the st(0) floating-point
register off the stack; it acted as a no-op. This patch fixes the bug
by passing the spm=1 argument to PopX87's superclass to indicate the
floating-point stack pointer should be incremented.
GitHub issue: https://github.com/gem5/gem5/issues/344
Change-Id: I6e731882b6bcf8f0e06ebd2f66f673bf9da80717
The jal and jalr share the same instruction format JumpConstructor,
which sets the IsCall and IsReturn flags by the register ID. However, it
may cause wrong instruction flags set for jal because the section
"handle the 'Jalr' instruction" misses the opcode checking. The PR fix
the issue to ensure the IsReturn can be only set in Jalr.
It is possible to execute a GPU atomic instruction using a memory
address that is in the host memory space (e.g, HMM, __managed__,
hipHostMalloc'd address). Since these are in host memory they are passed
to the SystemHub DmaDevice. However, this currently executes as a write
packet without modifying data. This leads to hangs in applications that
use atomics for forward progress (e.g., HeteroSync).
It is not clear where these are handled on a real GPU, but they are
certainly not handled by the software stack nor driver, so they must be
handled in hardware and therefore implemented in gem5. Handling for
atomics in the SystemHub makes the most sense.
To make atomics work a few extra changes need to be made to the
SystemHub. (1) The atomic is implemented as a host memory read, followed
by calling the AtomicOpFunctor, followed by a write. This requires a
second event to handle read response, performing atomic, and issuing a
write. (2) Atomics must be serialized otherwise two atomics might return
the same value which is incorrect. This patch adds serialization logic
for all request types to the same address to handle this. (3) With the
added complexity of the SystemHub, a new debug flag explicitly for
SystemHub is added.
Testing done: The heterosync application with input "sleepMutex 10 16 4"
previously hung before this patch. It passes with the patch applied.
This application tests both (1) and (2) above, as it allocates locks
with hipHostMalloc and has multiple workgroups sending an atomic request
in the same Tick, verifying the serialization mechanism.
It is possible to execute a GPU atomic instruction using a memory
address that is in the host memory space (e.g, HMM, __managed__,
hipHostMalloc'd address). Since these are in host memory they are passed
to the SystemHub DmaDevice. However, this currently executes as a write
packet without modifying data. This leads to hangs in applications that
use atomics for forward progress (e.g., HeteroSync).
It is not clear where these are handled on a real GPU, but they are
certianly not handled by the software stack nor driver, so they must be
handled in hardware and therefore implemented in gem5. Handling for
atomics in the SystemHub makes the most sense.
To make atomics work a few extra changes need to be made to the
SystemHub. (1) The atomic is implemented as a host memory read, followed
by calling the AtomicOpFunctor, followed by a write. This requires a
second event to handle read response, performing atomic, and issuing a
write. (2) Atomics must be serialized otherwise two atomics might return
the same value which is incorrect. This patch adds serialization logic
for all request types to the same address to handle this. (3) With the
added complexity of the SystemHub, a new debug flag explicitly for
SystemHub is added.
Testing done: The heterosync application with input "sleepMutex 10 16 4"
previously hung before this patch. It passes with the patch applied.
This application tests both (1) and (2) above, as it allocates locks
with hipHostMalloc and has multiple workgroups sending an atomic request
in the same Tick, verifying the serialization mechanism.
Change-Id: Ife84b30037d1447dd384340cfeb06fdfd472fff9
The jal and jalr share the same instruction format JumpConstructor,
which sets the IsCall and IsReturn flags by the register ID.
However, it may cause wrong instruction flags set for jal because
the section "handle the 'Jalr' instruction" misses the opcode
checking. The PR fix the issue to ensure the IsReturn can be only
set in Jalr.
Change-Id: I9ad867a389256f9253988552e6567d2b505a6901
The implementation of the x86 PACK micro-op had a logical bug that
caused the `PACKSSWB` and `PACKSSDW` instructions to produce
incorrect results. Specifically, due to a signedness error, the
overflow check for negative integers being packed always evaluated
to true, resulting in all negative integers being packed as -1 in
the output.
This patch fixes the signedness error that causes the bug.
GitHub issue: https://github.com/gem5/gem5/issues/331
Change-Id: I44b7328a8ce31742a3c0dfaebd747f81751e8851
While there was code present in "serialize.cc" to create the checkpoint
directory, it did not do recursively. This patch ensures all the
directories are created in a path to the checkpoint directory.
Change-Id: Ibcf7f800358fd89946f550b8cfb0cef8b51fceac
While it makes sense to define the cache_line_size as a 32-bit unsigned int,
the use of cache_line_size is way out of its original scope.
cache_line_size has been used to produce an address mask, which masking out
the offset bits from an address. For example, [1], [2], [3], and [4].
However, since the cache_line_size is an "unsigned int", the type of the
value is not guaranteed to be 64-bit long. Subsequently, the
bit twiddling hacks in [1], [2], [3], and [4] produce 32-bit mask,
i.e., 0x00000000FFFFFFC0.
This behavior at least caused a problem in LLSC in RISC-V [5], where the
load reservation (LR) relies on the mask to produce the cache block address.
Two distinct 64-bit addresses can be mapped to the same cache block using
the above mask.
This patch explicitly defines cache_line_size as a 64-bit unsigned int so
the cache block mask can be produced correctly for 64-bit addresses.
[1] 3bdcfd6f7a/src/cpu/simple/atomic.hh (L147)
[2] 3bdcfd6f7a/src/cpu/simple/timing.hh (L224)
[3] 3bdcfd6f7a/src/cpu/o3/lsq_unit.cc (L241)
[4] 3bdcfd6f7a/src/cpu/minor/lsq.cc (L1425)
[5] 3bdcfd6f7a/src/arch/riscv/isa.cc (L787)
Change-Id: I29abc7aaab266a37326846bbf7a82219071c4ffe
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
When there is race between FwdGetX
and PUTX on owner. Owner in this case hands off
ownership to GetX requestor and PUTX still goes
through. But since owner has changed, state should go back to M and PUTX
is essentially trashed.
An Unblock to the Directory in this case will give an undefined
transition. I have added transitions which indicate that when an Unblock
is served to the Directory, it means that some kind of ownership
transfer has happened while a PUTX/PUTO was in progress.
- Implemented a __str__ for AbstractResource
__str__ prints resource category, id and version.
link to resources website is also printed.
Change-Id: Iad5825ff7d8d505ceb236e00dc49bb56055fc8f0
This allows for a user to specify the exact path they want a resource to
be downloaded to. This differs from 'resource_direcctory' in that a user
may specify the file/directory name of the resource (using just the
'resource_directory' will have the resource as its ID in that directory.
Change-Id: I887be6216c7607c22e49cf38226a5e4600f39057
This change, https://github.com/gem5/gem5/pull/205, mistakenly allocates
write buffer for clflush instruction when there's a cache miss. However,
clflush in gem5 is not a write instruction. Thus, the cache should
allocate miss buffer in this case.
When there is race between FwdGetX
and PUTX on owner. Owner in this case hands off
ownership to GetX requestor and PUTX still goes
through. But since owner has changed, state should
go back to M and PUTX is essentially trashed.
An Unblock to the Directory in this case will give an undefined
transition. I have added transitions which indicate that when
an Unblock is served to the Directory, it means that some kind
of ownership transfer has happened while a PUTX/PUTO was in
progress.
Change-Id: I37439b5a363417096030a0875a51c605bd34c127
Links to #293
After calling m5_dump_reset_stats(0,0) in a test program, some
statistics like
l1_controllers.L1Dcache.m_demand_hits,
l1_controllers.L1Dcache.m_demand_misses,
l1_controllers.L1Dcache.m_demand_accesses
were not getting reset in the newer stat dumps.
This one line patch fixes that. Changes were tested with calling two
m5_dump_reset_stats(0,0) in a row for a system with 1 core, tested on
both SE and FS.
Credits: @MeatBoy106
The x87 FPU tag word (FTW) was not explicitly initialized in
{X86_64,i386}Process::initState(), resulting in holding an initial value
of zero, resulting in an invalid x87 FPU state. This commit initializes
FTW to 0xFFFF, indicating the FPU is empty at program start during
syscall emulation.
The 16-bit FTW register was also incorrectly masked down to 8-bits in
X86ISA::ISA::setMiscRegNoEffect(), leading to an invalid X87 FPU state
that later caused crashes in the X86KvmCPU. This commit corrects the
bitwidth of the mask to 16.
GitHub issue: https://github.com/gem5/gem5/issues/303
After calling m5_dump_reset_stats(0,0) in a test program,
some statistics like
l1_controllers.L1Dcache.m_demand_hits,
l1_controllers.L1Dcache.m_demand_misses,
l1_controllers.L1Dcache.m_demand_accesses
were not getting reset in the newer stat dumps.
This one line patch fixes that. Changes were tested with
calling two m5_dump_reset_stats(0,0) in a row for a system
with 1 core, tested on both SE and FS.
Credits to Gabriel Busnot for finding the fix.
Change-Id: I19d75996fa53d31ef20f7b206024fd38dbeac643