Commit Graph

20721 Commits

Author SHA1 Message Date
Bobby R. Bruce
d1f9f98747 util: Make all runners the same type
Having two types of GitHub Action Runners has not yielded much benefit
and caused confusion and inefficiencies. This change simplifies things
to having just one runner with 8-cores and 16GB of memory. It is
sufficient to build gem5 and run most simulations.

Change-Id: Ic49ae5e98b02086f153f4ae2a4eedd8a535786c8
2023-10-06 15:53:57 -07:00
Nicholas Mosier
7a0e84d853 cpu-kvm, arch-x86: flush TLB after syscalls
Modified the x86 KVM-in-SE syscall handler to flush the TLB following
each syscall, in case the page table has been modified. This is done
by reloading the value in %cr3. Doing this requires an intermediate
GPR, which we store in a new scratch buffer following the syscall code
at address `syscallDataBuf`.

GitHub issue: https://github.com/gem5/gem5/issues/409

Change-Id: Ibc20018c97ebb1794fa31a0c71e0857d661c7c9d
2023-10-06 20:41:59 +00:00
Nicholas Mosier
0dcf0fb829 sim-se: unmap reclaimed heap pages in brk syscall emulation
gem5::MemState::updateBrkRegion(), which is called during the syscall
emulation of brk, did not unmap deallocated heap pages when the brk
region is receding. Instead, it kept it mapped for simplicity. This
introduced a bug where subequent expansions of the brk region reused
prior heap page mappings that were not zero-filled. This violates
the assumptions of glibc malloc, resulting in heap corruption and
crashes.

This patch fixes the bug by always unmapping pages that are deallocated
during a call to brk() that reduces the heap size. This makes the
gem5::MemState::_endBrkPoint field obsolete, so this patch removes it.

GitHub issue: https://github.com/gem5/gem5/issues/342

Change-Id: Ib2244e1aa4d2a26666ad60d231fdde2c22d2df35
2023-10-06 20:39:57 +00:00
Matthew Poremba
75a7f30dfb dev-amdgpu: Implement GPU clock MMIOs
The ROCr runtime uses a combination of HSA signal timestamps and
hardware MMIOs to calculate profiling times. At the beginning of an
application a timestamp is read from the GPU using MMIOs. The clock
MMIOs reside in the GFX MMIO region, so a new AMDGPUGfx class is added
to handle these MMIOs.

The timestamp value is expected to be in nanoseconds, so we simply use
the gem5 tick converted to ns.

Change-Id: I7d1cba40d5042a7f7a81fd4d132402dc11b71bd4
2023-10-06 13:21:40 -05:00
Matthew Poremba
6a4b2bb096 dev-hsa,gpu-compute: Add timestamps to AMD HSA signals
The AMD specific HSA signal contains start/end timestamps for dispatch
packet completion signals. These are current always zero. These
timestamp values are used for profiling in the ROCr runtime.
Unfortunately, the GpuAgent::TranslateTime method in ROCr does not check
for zero values before dividing, causing applications that use profiling
to crash with SIGFPE. Profiling is used via hipEvents in the HACC
application, so these should be supported in gem5.

In order to handle writing the timestamp values, we need to DMA the
values to memory before writing the completion signal. This changes the
flow of the async completion signal write to be (1) read mailbox pointer
(2) if valid, write the mailbox data, other skip to 4 (3) write mailbox
data if pointer is valid (4) write timestamp values (5) write completion
signal. The application will process the timestamp data as soon as the
completion signal is received, so we need to ordering to ensure the DMA
for timestamps was completed.

HACC now runs to completion on GPUFS and has the same output was
hardware.

Change-Id: I09877cdff901d1402140f2c3bafea7605fa6554e
2023-10-06 13:21:40 -05:00
Giacomo Travaglini
00748c7901 mem-ruby: Fix CHI fromSequencer helper function
This has been broken by #177

Change-Id: I52feff4b5ab2faf0aa91edd6572e3e767c88e257
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2023-10-06 14:51:11 +01:00
Giacomo Travaglini
ae104cc431 mem-ruby: Add new feature far atomics in CHI (#177)
Added a new feature to CHI protocol (in collaboration with @tiagormk).
Here is the Jira Ticket
[https://gem5.atlassian.net/browse/GEM5-1326](https://gem5.atlassian.net/browse/GEM5-1326
). As described in CHI specs, far atomic transactions enable remote
execution of Atomic Memory Operations. This pull request incorporates
several changes:

* Fix Arm ISA definition of Swap instructions. These instructions should
return an operand, so their ISA definition should be Return Operation.
* Enable AMOs in Ruby Mem Test to verify that AMOs work
* Enable near and far AMO in the Cache Controler of CHI

Three configuration parameters have been used to tune this behavior:
* policy_type: sets the atomic policy to one of the described in [our
paper](https://dl.acm.org/doi/10.1145/3579371.3589065)
* atomic_op_latency: simulates the AMO ALU operation latency
* comp_anr: configures the Atomic No return transaction to split
CompDBIDResp into two different messages DBIDResp and Comp
2023-10-06 10:09:58 +01:00
Vishnu Ramadas
a19667427a mem-ruby: Add BUILD_GPU guard to ruby cooldown and warmup phases
Ruby was recently updated to support flushes and warmup for GPUs. Since
this support uses the GPUCoalescer, non-GPU builds face a compile time
issue. This is because GPU code is not built for non-GPU builds. This
commit addes "#if BUILD_GPU" guards around the GPU-related code in
common files like AbstractController.hh, CacheRecorder.*, RubySystem.cc,
GPUCoalescer.hh, and VIPERCoalescer.hh. This support allows GPU builds
to use flushing while non-GPU builds compile without problems

Change-Id: If8ee4ff881fe154553289e8c00881ee1b6e3f113
2023-10-05 18:59:54 -05:00
Matt Sinclair
85340973bf configs: Add configurable GPU L1,L2 num banks and L2 latencies (#389)
Previously, the L1, L2 number of banks and L2 latencies were not
configurable through command line arguments. This commit adds support to
configure them through the arguments '--tcp-num-banks' for number of
banks in L1, '--tcc-num-banks' for number of banks in L2, and
'--tcc-tag-access-latency', and '--tcc-data-access-latency'

Change-Id: Ie3b713ead16865fd7120e2d809ebfa56b69bc4a1
2023-10-05 15:54:24 -05:00
Bobby R. Bruce
4db748a507 resources, stdlib: Adding 'suite' category to gem5 (#191) 2023-10-05 13:26:58 -07:00
Bobby R. Bruce
761f6b73a0 arch-arm: Implement FEAT_FGT (#334)
This PR implements FEAT_FGT (Fine Grain Traps)
2023-10-05 10:44:26 -07:00
Bobby R. Bruce
f5c7ea01ef gpu-compute: Fix dynamic scratch size test (#391)
ROCm supports dynamically allocating scratch space, which resides in
framebuffer memory, to reduce the amount of memory allocated for kernels
that have not yet launched. The size of the scratch space allocated is
located in task->amdQueue.compute_tmpring_size_wavesize. This size is in
kilobytes. The AQL task contains the number of bytes requested *per work
item*, however we currently check if there is enough tmpring space by
comparing a single work item. This should instead check the size *per
wavefront*.

This causes problems in applications where multiple kernels use dynamic
scratch allocation and a later kernel requires more space than the
earlier kernel. The only application being tested that does this is
LULESH. This was resulting in the scratch space being too small,
resulting in workgroups clobbering each other's private memory leading
to some nasty bugs. It is fixed by this patch as task->amdQueue will be
re-read from the host and will contain the updated tmpring size. After
this there is enough scratch space and LULESH makes forward progress.
2023-10-05 10:38:13 -07:00
Bobby R. Bruce
ee8c569513 arch-riscv: Implement Zcb instructions (#399)
Added the following instructions:
c.lbu
c.lh
c.lhu
c.sb
c.sh
c.zext.b
c.sext.b
c.zext.h
c.sext.h
c.zext.w
c.not
c.mul

Reference: https://github.com/riscv/riscv-code-size-reduction
2023-10-05 10:36:02 -07:00
Bobby R. Bruce
f75c0fca8a stdlib: Del comment stating SE mode limited to single thread
This comment was left in the codebase in error. The
`set_se_binary_workload` function works fine with multi-threaded
applications. This hasn't been a restriction for some time.

Change-Id: I1b1d27c86f8d9284659f62ae27d752bf5325e31b
2023-10-05 10:20:55 -07:00
Bobby R. Bruce
06bbc43b46 ext: Remove std::binary_function from DramPower
`std::binary_function` was deprecated in C++11 and officially
removed in CPP-17.

This caused a compilation error on some systems. Fortunately it can be
safely removed. It was unecessary. The commandItemSorter was compliant
witih the `sort` regardless.

Change-Id: I0d910e50c51cce2545dd89f618c99aef0fe8ab79
2023-10-05 10:18:19 -07:00
Bobby R. Bruce
39c7e7d1ed arch: Adding missing override to PCState.set
As highlighed in this failing compiler test:
https://github.com/gem5/gem5/actions/runs/6348223508/job/17389057995

Clang was failing when compiling "build/ALL/gem5.opt" due missing
overrides in `PCState`'s "set" function.

This was observed in Clang-14 and, stangely, Clang-8.

Change-Id: I240c1087e8875fd07630e467e7452c62a5d14d5b
2023-10-05 10:18:19 -07:00
Roger Chang
ea3ee880aa arch-riscv: Implement Zcb instructions
Added the following instructions:
c.lbu
c.lh
c.lhu
c.sb
c.sh
c.zext.b
c.sext.b
c.zext.h
c.sext.h
c.zext.w
c.not
c.mul

Reference: https://github.com/riscv/riscv-code-size-reduction
Change-Id: Ib04820bf5591b365a3bfbbd8b90655a8a1d844cf
2023-10-05 18:46:35 +08:00
Leo Redivo
98a6cd6ee2 misc: changed call get_default_disk_device to get_disk_device
Change-Id: I240da78a658208211ede6648547dfa4c971074a1
2023-10-04 13:32:35 -07:00
Víctor Soria
6411b2255c mem-ruby,configs: Add CHI far atomics support
Introduce far atomic operations in CHI protocol.
Three configuration parameters have been used to tune this behavior:

  policy_type:       sets the atomic policy to one of the described in our paper
  atomic_op_latency: simulates the AMO ALU operation latency
  comp_anr:          configures the Atomic No return transaction to split
                     CompDBIDResp into two different messages DBIDResp and Comp

Change-Id: I087afad9ad9fcb9df42d72893c9e32ad5a5eb478
2023-10-04 19:19:08 +02:00
Víctor Soria
12dada2dc5 arch-arm: Correct return operand in swap instructions
Swap instructions are configured as non returning AMO operations. This is wrong because they
return the previous value stored in the target memory position

Change-Id: I84d75a571a8eaeaee0dbfac344f7b34c72b47d53
2023-10-04 19:11:01 +02:00
Víctor Soria
4fd9d66c53 tests,mem-ruby: Enhance ruby false sharing test with Atomics
New ruby mem test includes a percentages of AMOs that will be executed randomly in ruby mem test

Change-Id: Ie95ed78e59ea773ce6b59060eaece3701fe4478c
2023-10-04 19:11:01 +02:00
Jason Lowe-Power
6f5d877b1a misc: Update gem5 to use clang-15 and clang-16 (#365)
This introduces the changes necessary for clang-15 and clang-16 to run
within gem5, and adds them to the compiler tests.

This also updates the dockerfiles for ubuntu 22.04 to include the steps
necessary to compile clang-15 and clang-16.
2023-10-04 09:51:12 -07:00
Matthew Poremba
2b97f17fe1 gpu-compute: Fix dynamic scratch size test
ROCm supports dynamically allocating scratch space, which resides in
framebuffer memory, to reduce the amount of memory allocated for kernels
that have not yet launched. The size of the scratch space allocated is
located in task->amdQueue.compute_tmpring_size_wavesize. This size is in
kilobytes. The AQL task contains the number of bytes requested *per work
item*, however we currently check if there is enough tmpring space by
comparing a single work item. This should instead check the size *per
wavefront*.

This causes problems in applications where multiple kernels use dynamic
scratch allocation and a later kernel requires more space than the
earlier kernel. The only application being tested that does this is
LULESH. This was resulting in the scratch space being too small,
resulting in workgroups clobbering each other's private memory leading
to some nasty bugs. It is fixed by this patch as task->amdQueue will be
re-read from the host and will contain the updated tmpring size. After
this there is enough scratch space and LULESH makes forward progress.

Change-Id: Ie9e0f92bb98fd3c3d6c2da3db9ee65352f9ae070
2023-10-04 09:38:31 -05:00
Andreas Sandberg
7806eaad51 arch: Add instruction size and PC set methods (#357)
Add the instruction size of a static instruction. x86 and arm decoders
add now the instruction size to the macro instruction. However, microops
are still handled by the fetch stage which is not nice.
Furthermore, we add a set method to the PC state. It allows setting a PC
state to acertain address.
Both methods are required for the decoupled front-end.

Change-Id: I311fe3f637e867c42dee7781f5373ea2e69e2072
2023-10-04 10:49:30 +01:00
Bobby R. Bruce
57e0c7d006 arch-riscv: FS bits -> DIRTY for more floating point loads (#381)
The affected instructions are,
- c.flw
- c.flwsp
- flh
- flw

This change is related to [1] [2], which also aim to change the FS bits
to DIRTY when the state of any floating point register might change.

[1] https://gem5-review.googlesource.com/c/public/gem5/+/65272
[2] https://github.com/gem5/gem5/pull/370
2023-10-03 11:51:47 -07:00
Vishnu Ramadas
d3637a489d configs: Add option to disable AVX in GPUFS
GPUFS+KVM simulations automatically enable AVX. This commit adds a
command line option to disable AVX if its not needed for a GPUFS
simulation.

Change-Id: Ic22592767dbdca86f3718eca9c837a8e29b6b781
2023-10-03 12:10:42 -05:00
Vishnu Ramadas
53627cc39c configs: Add configurable GPU L1,L2 num banks and L2 latencies
Previously, the L1, L2 number of banks and L2 latencies were not
configurable through command line arguments. This commit adds support to
configure them through the arguments '--tcp-num-banks' for number of
banks in L1, '--tcc-num-banks' for number of banks in L2, and
'--tcc-tag-access-latency', and '--tcc-data-access-latency'

Change-Id: Ie3b713ead16865fd7120e2d809ebfa56b69bc4a1
2023-10-03 11:51:28 -05:00
Harshil Patel
3af3c1121b stdlib, resources: Addressed requested changes
Change-Id: I22abdc3bdcdde52301ed10cb3113e8925159c245
Co-authored-by: Kunal Pai <kunpai@users.noreply.github.com>
2023-10-02 23:27:32 -07:00
Vishnu Ramadas
f69191a31d dev-amdgpu: Remove duplicate writes to PM4 queue pointers
During checkpoint restoration, the unserialize() function writes rptr,
wptr, and indirect buffer rptr, wptr to PM4 queue's rptr, wptr fields.
This commit updates this to write only the relevant pointers to the
queue structure. If indirect buffers are used, then it writes only the
indirect buffer pointers to the queue. If they are not used, then it
writes rptr, wptr values to the queue.

Change-Id: Iedb25a726112e1af99cc1e7bc012de51c4ebfd45
2023-10-02 19:37:46 -05:00
Vishnu Ramadas
ae5a51994c mem-ruby: Update cache recorder to use GPUCoalescer port for GPUs
Previously, the cache recorder used the Sequencer to issue flush
requests and cache warmup requests. The GPU however uses GPUCoalescer to
access the cache, and not the Sequencer. This commit adds a GPUCoalescer
map to the cache recorder and uses it to send flushes and cache warmup
requests to any GPU caches in the system

Change-Id: I10490cf5e561c8559a98d4eb0550c62eefe769c9
2023-10-02 19:05:10 -05:00
Vishnu Ramadas
085789d00c mem-ruby: Add flush support to GPU_VIPER protocol
This commit adds flush support to the GPU VIPER coherence protocol. The
L1 cache will now initiate a flush request if the packet it receives
is of type RubyRequestType_FLUSH. During the flush process, the L1 cache
will a request to L2 if its in either V or I state. L2 will issue a
flush request to the directory if its cache line is in the valid
state before invalidating its copy. The directory, on receiving this
request, writes data to memory and sends an ack back to the L2. L2
forwards this ack back to the L1, which then ends the flush by calling
the write callback

Change-Id: I9dfc0c7b71a1e9f6d5e9e6ed4977c1e6a3b5ba46
2023-10-02 19:05:10 -05:00
Vishnu Ramadas
61e39d5b26 mem-ruby: Add cache cooldown and warmup support to GPUCoalescer
The GPU Coalescer does not contain cache cooldown and warmup support.
This commit updates the coalsecer to support cache cooldown during flush
and warmup during checkpoint restore.

Change-Id: I5459471dec20ff304fd5954af1079a7486ee860a
2023-10-02 19:05:04 -05:00
Vishnu Ramadas
a50ead5907 mem-ruby: Add Flush as a supported memory type in VIPERCoalescer
This commit adds flush as a recognized memory type in VIPERCoalescer.

Change-Id: I0f1b6f4518548e8e893ef681955b12a49293d8b4
2023-10-02 19:02:55 -05:00
Vishnu Ramadas
107e05266d dev-amdgpu: Add aql, hsa queue information to checkpoint-restore
GPUFS uses aql information from PM4 queues to initialize doorbells. This
commit adds aql information to the checkpoint so that it can be used
during restoration to correctly initialize all doorbells. Additionally,
this commit also sets the hsa queue correctly during checkpoint-restoration

Change-Id: Ief3ef6dc973f70f27255234872a12c396df05d89
2023-10-02 19:02:50 -05:00
Harshil Patel
7301d4bd19 python: Add importer to standalone gem5py_m5 (#369)
I believe the point of this binary was to allow people to use the m5
objects without the entire gem5 binary. However, without adding the
importer call, this did not work. Unfortunately, with the importer call
there is a circular dependence on the original gem5py.cc file.
Therefore, this change creates a new file that has the importer call.

Now, with the `gem5py_m5` binary you can run python code that references
modules in `src/python`. Note that `_m5` is not available, so anything
that depends on the gem5 SimObjects' implementation will not work.
However, this can still be useful for things like getting Resources,
processing stats, etc.
2023-10-02 14:28:45 -07:00
David Schall
7d2e1ee789 arch: Add instruction size and PC set methods
Adds the instruction size to all static instruction. x86, arm
and RISC-V decoders add the instruction size to every decoded
macro instruction. As microops should reflect the size of the
their parent macroop the set method is overwritten to pass the
size to all microops.
Furthermore, we add a set method to the PC state. It allows
setting a PC state to a certain address.
Both methods are required for the decoupled front-end.

Change-Id: I311fe3f637e867c42dee7781f5373ea2e69e2072
Signed-off-by: David Schall <david.schall@ed.ac.uk>
2023-10-02 20:10:57 +00:00
Hoa Nguyen
da72590c19 arch-riscv: FS bits -> DIRTY for more floating point loads
The affected instructions are,
- c.flw
- c.flwsp
- flh
- flw

This change is related to [1] [2], which also aim to change the
FS bits to DIRTY when the state of any floating point register
might change.

[1] https://gem5-review.googlesource.com/c/public/gem5/+/65272
[2] https://github.com/gem5/gem5/pull/370

Change-Id: I098e1b1812fb352bd5d3614ff5d3547e58903b65
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-10-01 23:12:25 -07:00
Bobby R. Bruce
e211674625 util-docker: Fix/Improve ubuntu-22.04_clang-16
* Removes `+` symbol accidently left in (this broke building).
* Removes `ARG` thus making the Docker exclusively for clang-16.
* Adds "llvm.sh" to the repo. This stops us being dependent on the url
  download. The script is under the apache license therefore compatible.
* Merges several `apt install` commands into one.

Change-Id: Iaf411656aac83f67f5395b20efd96ecc1eabb263
2023-09-29 13:12:19 -07:00
Harshil Patel
f9781af6e5 mem: fix bug in 3-level cache (#265)
The L3 cache did not work due to argument type mismatch in the call to
the constructor `DMAController`. The second argument is expecting a
`RubySystem` type but the code passes in a `cache_line_size` variable.
After I change the second argument to `self.ruby_system` everything
works.
2023-09-29 10:59:18 -07:00
Bobby R. Bruce
2b791ff556 misc: fix g++13 overloaded-virtual warning (#363)
There are two overloaded-virtual issues reported by g++13.

1. Copy assignment and move assignment overload is hidden in the derived
class

[ CXX] src/mem/cache/replacement_policies/weighted_lru_rp.cc ->
ALL/mem/cache/replacement_policies/weighted_lru_rp.o
In file included from src/mem/cache/base.hh:61,
                 from src/mem/cache/base.cc:46:
src/mem/cache/cache_blk.hh:172:5: error: ‘virtual gem5::CacheBlk&
gem5::CacheBlk::operator=(gem5::CacheBlk&&)’ was hidden
[-Werror=overloaded-virtual=]
  172 |     operator=(CacheBlk&& other)
      |     ^~~~~~~~
src/mem/cache/cache_blk.hh:518:19: note: by ‘gem5::TempCacheBlk&
gem5::TempCacheBlk::operator=(const gem5::TempCacheBlk&)’
  518 |     TempCacheBlk& operator=(const TempCacheBlk&) = delete;
      |                   ^~~~~~~~

In this case, we can exiplict using parent operator= to keep the
function overload.

2. Intended overload hidden in SystemC is reported as error.

In file included from
src/systemc/ext/tlm_utils/simple_initiator_socket.h:24,
                 from src/systemc/tlm_bridge/gem5_to_tlm.hh:72,
from build/ALL/python/_m5/param_Gem5ToTlmBridge256.cc:17:
src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh: In
instantiation of ‘class tlm::tlm_base_initiator_socket<256,
tlm::tlm_fw_transport_if<>, tlm::tlm_bw_transport_if<>, 1,
sc_core::SC_ONE_OR_MORE_BOUND>’:

src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh:185:7:
required from ‘class tlm::tlm_initiator_socket<256,
tlm::tlm_base_protocol_types, 1, sc_core::SC_ONE_OR_MORE_BOUND>’
src/systemc/ext/tlm_utils/simple_initiator_socket.h:37:7: required from
‘class
tlm_utils::simple_initiator_socket_b<sc_gem5::Gem5ToTlmBridge<256>, 256,
tlm::tlm_base_protocol_types, sc_core::SC_ONE_OR_MORE_BOUND>’
src/systemc/ext/tlm_utils/simple_initiator_socket.h:156:7: required from
‘class tlm_utils::simple_initiator_socket<sc_gem5::Gem5ToTlmBridge<256>,
256, tlm::tlm_base_protocol_types>’
src/systemc/tlm_bridge/gem5_to_tlm.hh:147:46: required from ‘class
sc_gem5::Gem5ToTlmBridge<256>’
/usr/include/c++/13/type_traits:1411:38: required from ‘struct
std::is_base_of<sc_gem5::Gem5ToTlmBridgeBase,
sc_gem5::Gem5ToTlmBridge<256> >’
ext/pybind11/include/pybind11/detail/../detail/common.h:880:59: required
from ‘struct pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>,
sc_gem5::Gem5ToTlmBridgeBase,
std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete>
>::is_valid_class_option<sc_gem5::Gem5ToTlmBridgeBase>’
ext/pybind11/include/pybind11/detail/../detail/common.h:719:35: required
by substitution of ‘template<class ... Ts> using
pybind11::detail::all_of = pybind11::detail::bool_constant<(Ts::value &&
...)> [with Ts = {pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>,
sc_gem5::Gem5ToTlmBridgeBase,
std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete>
>::is_valid_class_option<sc_gem5::Gem5ToTlmBridgeBase>,
pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>,
sc_gem5::Gem5ToTlmBridgeBase,
std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete>
>::is_valid_class_option<std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>,
pybind11::nodelete> >}]’
ext/pybind11/include/pybind11/pybind11.h:1506:70: required from ‘class
pybind11::class_<sc_gem5::Gem5ToTlmBridge<256>,
sc_gem5::Gem5ToTlmBridgeBase,
std::unique_ptr<sc_gem5::Gem5ToTlmBridge<256>, pybind11::nodelete> >’
build/ALL/python/_m5/param_Gem5ToTlmBridge256.cc:34:179: required from
here
src/systemc/ext/tlm_utils/../core/sc_port.hh:125:18: error: ‘void
sc_core::sc_port_b<IF>::bind(sc_core::sc_port_b<IF>&) [with IF =
tlm::tlm_fw_transport_if<>]’ was hidden [-Werror=overloaded-virtual=]
125 | virtual void bind(sc_port_b<IF> &p) { sc_port_base::bind(p); }
      |                  ^~~~
In file included from
src/systemc/ext/tlm_utils/simple_initiator_socket.h:27:

src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh:133:18:
note: by ‘tlm::tlm_base_initiator_socket<256,
tlm::tlm_fw_transport_if<>, tlm::tlm_bw_transport_if<>, 1,
sc_core::SC_ONE_OR_MORE_BOUND>::bind’
133 | virtual void bind(bw_interface_type &ifs) {
(get_base_export())(ifs); }
      |                  ^~~~
src/systemc/ext/tlm_utils/../core/sc_port.hh:124:18: error: ‘void
sc_core::sc_port_b<IF>::bind(IF&) [with IF =
tlm::tlm_fw_transport_if<>]’ was hidden [-Werror=overloaded-virtual=]
  124 |     virtual void bind(IF &i) { sc_port_base::bind(i); }
      |                  ^~~~

src/systemc/ext/tlm_utils/../tlm_core/2/sockets/initiator_socket.hh:133:18:
note: by ‘tlm::tlm_base_initiator_socket<256,
tlm::tlm_fw_transport_if<>, tlm::tlm_bw_transport_if<>, 1,
sc_core::SC_ONE_OR_MORE_BOUND>::bind’
133 | virtual void bind(bw_interface_type &ifs) {
(get_base_export())(ifs); }
      |                  ^~~~

From the code comment, it's intended in SystemC header.

// The overloaded virtual is intended in SystemC, so we'll disable the
warning. // Please check section 9.3 of SystemC 2.3.1 release note for
more details.

The issue is we should move the skip to the base class.
2023-09-29 10:53:52 -07:00
Harshil Patel
8182f8084b stdlib, resources, tests: Introduce Suite of Workloads
This patch introduces a new category called "suite".
A suite is a collection of workloads.
Each workload in a SuiteResource has a tag that can be narrowed down
through the function with_input_group.
Also, the set of input groups can be seen through list_input_groups.
Added unit tests to test all functions of SuiteResource class.

Change-Id: Iddda5c898b32b7cd874987dbe694ac09aa231f08

Co-authored-by: Kunal Pai <kunpai@ucdavis.edu>
2023-09-29 10:50:09 -07:00
Bobby R. Bruce
3a35bdf57a arch-riscv: Update FS bits when doing floating point loads (#370)
This problem is similar to the problem described in [1]. This problem
produces symptoms as described in [2].

In short, the Linux kernel relies on the CSR_STATUS's FS bits to decide
whether to save the floating point registers. If the FS bits are set to
DIRTY, the floating point registers will be saved during context
switching / task switching.

Currently, with the patch in [1], we only change the FS bits upon every
floating arithmetic instruction. However, since floating load
instructions also mutate the state of floating point registers, the FS
bits should be updated to DIRTY.

The problem in [2] arose when the program populates the content of one
floating register to an array by repeatedly using `fld fa5, EA`. A
context switch occured upon a page fault, and while handling that page
fault, the kernel might have to handle an interrupt. This caused the
kernel to task switch between handling page fault and handling
interrupt. This caused __switch_to() to be called, which will save the
floating point registers only if the SD (indirectly set by FS) bits are
set to DIRTY, while restoring the floating point registers to the
switch-to task [3]. This caused the floating point registers to be
zeroed out when it was restored as it was never saved before.

[1] https://gem5-review.googlesource.com/c/public/gem5/+/65272
[2] https://github.com/gem5/gem5/issues/349
[3]
https://github.com/torvalds/linux/blob/v6.5/arch/riscv/include/asm/switch_to.h#L56
2023-09-29 10:47:05 -07:00
Melissa Jost
a79dc3f23c util: Add steps to compile clang-15 and clang-16
This updates the dockerfiles for ubuntu 22.04 to include the
steps necessary to compile clang-15 and clang-16.

Change-Id: I2bba6393ab93a6ce05a2c3ce31f3bbc71bcdca7c
2023-09-29 08:32:01 -07:00
Hoa Nguyen
6640447c1e arch-riscv: Update FS bits when doing floating point loads
This problem is similar to the problem described in [1].
This problem produces symptoms as described in [2].

In short, the Linux kernel relies on the CSR_STATUS's FS bits
to decide whether to save the floating point registers. If
the FS bits are set to DIRTY, the floating point registers will
be saved during context switching / task switching.

Currently, with the patch in [1], we only change the FS bits
upon every floating arithmetic instruction. However, since
floating load instructions also mutate the state of floating
point registers, the FS bits should be updated to DIRTY.

The problem in [2] arose when the program populates the content
of one floating register to an array by repeatedly using
`fld fa5, EA`. A context switch occured upon a page fault, and
while handling that page fault, the kernel might have to handle
an interrupt. This caused the kernel to task switch between
handling page fault and handling interrupt. This caused
__switch_to() to be called, which will save the floating point
registers only if the SD (indirectly set by FS) bits are set to
DIRTY, while restoring the floating point registers to the
switch-to task [3]. This caused the floating point registers to
be zeroed out when it was restored as it was never saved before.

[1] https://gem5-review.googlesource.com/c/public/gem5/+/65272
[2] https://github.com/gem5/gem5/issues/349
[3] https://github.com/torvalds/linux/blob/v6.5/arch/riscv/include/asm/switch_to.h#L56

Change-Id: Ia5656da5a589a8e29fb699d2ee12885b8f3fa2d2
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-09-28 19:14:29 -07:00
Jason Lowe-Power
aaad79cf51 python: Add importer to standalone gem5py_m5
I believe the point of this binary was to allow people to use the m5
objects without the entire gem5 binary. However, without adding the
importer call, this did not work. Unfortunately, with the importer call
there is a circular dependence on the original gem5py.cc file.
Therefore, this change creates a new file that has the importer call.

Now, with the `gem5py_m5` binary you can run python code that references
modules in `src/python`. Note that `_m5` is not available, so anything
that depends on the gem5 SimObjects' implementation will not work.
However, thic can still be useful for things like getting Resources,
processing stats, etc.

Change-Id: I5c0e5d1a669fe5ce491458df916f2049c81292eb
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2023-09-28 11:22:33 -07:00
Bobby R. Bruce
62d34ef374 misc: 'sim{out/err}' -> 'sim{out/err}.txt' (#250)
By default, the `--stderr-file` and `--stdout-file` arguments were
directing the simulator to output files named "simerr" and "simout"
respectively if an output redirect was requested.

A small annoyance is these files lack an extension meaning programs
refuse to open them, or don't do so withou additional effort. On many
systems they are assumed to scripts.

This patch adds the `.txt` extension to both, thus clearly indicating to
other programs these are text files and can be opened and read as such.
2023-09-27 17:36:03 -07:00
Bobby R. Bruce
5d254ffb02 stdlib, resources: Added pretty printing resource (#323)
- Implemented a __str__ for AbstractResource __str__ prints resource
category, id and version.
link to resources website is also printed.
2023-09-27 17:32:35 -07:00
Bobby R. Bruce
14b928f77c base: Add a warning when failing to insert a whole symbol table (#361)
Currently we drop the insertion of a whole symbol table if the name of
one symbol already exists in the base table. Having similar symbols
across different binaries is common.

This change adds a warning and recommends a fix instead of silently
dropping the table.
2023-09-27 17:26:03 -07:00
Bobby R. Bruce
074fa4c604 misc,ext,tests: Automatically split CI TestLib tests across GitHub Action jobs (#263)
This PR utilizes GitHub Action's matrix's to automatically distribute
the CI testlib gem5 build and test jobs across available GitHub Action
Runners.

The CI tests (the `quick` testlib tests, i.e. those run with `./main.py
run`) are distributed across the runners on a per directory basis ---
all directories under "tests/gem5" are run as their own jobs.

The necessary gem5 builds for each workflow are now automatically
inferred via the introduction of `./main.py list`'s `--build-targets`
flag which returns the gem5 build target for a given test or collection
of tests. E.g., `./main.py list --build-targets` will return the build
targets for all the `quick` testlib tests and `./main.py list
--build-target --uid=<id>` will return the build targets the test suite
`<id>` requires.

Moving from monolithic jobs to fine-grained ones will make the locaiton
of test failures more obvious. Each job has it's own artifact containing
"test/testing-results" for the tests run in that job. In addition,
maintenance of these files should become less burdensome due to less
hardcoding.
2023-09-27 14:32:16 -07:00
Bobby R. Bruce
3a0f4598b9 cpu-o3: Mark getWritableRegOperand() in O3CPU as a regwrite (#360)
As discussed here, [1], O3CPU counts getWritableRegOperand() as a reg
read, while SimpleCPU variants count getWriableRegOperand() as a reg
write.

This patch fixes this inconsistency. Here, I assume that if
getWritableRegOperand() is used, setReg() will not be used again to
write to the destination register.

[1] https://github.com/gem5/gem5/pull/341
2023-09-27 14:31:38 -07:00