Commit Graph

14305 Commits

Author SHA1 Message Date
Yan Lee
b01590fdf4 mem: port: add getTraceInString() method
Return the whole port trace of the packet as a string.

Change-Id: I7b1b1fef628a47a6ce147cb5fb75af81948c1d89
2023-08-15 00:40:29 -07:00
Yan Lee
5edb760414 mem: port: add address value in the port trace
Add the address value from the packet with the request port name.

Change-Id: I3d4c75f48ca6fbdbd5656e594d5f85f9e5626be8
2023-08-15 00:38:29 -07:00
Bobby R. Bruce
77e63b6a6c cpu-o3: bugfix of rename squash when SMT (#172)
In an SMT CPU, upon a squash, the mis-predicted(squashing) instructions
can still be executing at IEW and own phys registers. If these registers
are added back to the rename freelist on this Tick, the registers may be
renamed to be used by other SMT thread(s). This causes register
ownership hazards, which may eventually freeze the CPU. This problem
seems to date back to 2014
(https://www.mail-archive.com/gem5-users@gem5.org/msg10180.html).

This patch delays the freelist update to avoid the hazard.

I tested that this patch does not cause any performance impact for my
set of benchmarks on default non-SMT O3CPU.
2023-08-10 15:29:51 -07:00
He, Wenjian
03c2b4692c cpu-o3: bugfix of rename squash when SMT
In an SMT CPU, upon a squash, the phys regs used by
mispredicted instructions can still be owned by executing
instructions in IEW. If the regs are added back to freelist
on this tick, the reg may be renamed to be used by another
SMT thread. This causes reg ownership hazard, which may
eventually freeze the CPU.

This patch delays the freelist update to avoid the hazard.

Change-Id: I993b3c7d357269f01146db61fc8a7b83a989ea45
2023-08-10 21:43:09 +08:00
Roger Chang
f54777419d cpu: Fix ?: error due to different type
Change-Id: I35c50fbba047fe05cc0cc29c631002a9b68795fd
2023-08-10 14:36:26 +08:00
rogerchang23424
81e3bfcdc3 cpu: Update src/cpu/pred/bpred_unit.cc
Change-Id: I0cf177676d0f9fb9db4b127d5507ba66904739c4
Co-authored-by: Jason Lowe-Power <jason@lowepower.com>
2023-08-10 14:36:12 +08:00
Roger Chang
97e55fc173 cpu: Fix segment fault when using debug flags Branch
Change-Id: I36624b93f53aa101a57d51f3b917696cb2809136
2023-08-10 07:36:50 +08:00
Roger Chang
42c2ed6c2d arch-riscv: Add condition for setting misa and mstatus CSR
Change-Id: I7e03b60d0de32fe8169dd79ded485d560aca64aa
2023-08-09 19:32:04 +08:00
Roger Chang
43adc5309a arch-riscv: Add Illegal Instruction Fault Condition for RVV Config
Check the status.vs and misa.rvv CSR registers before executing
RVV instructions

Change-Id: I0355b94ea8ee4018be11a75aab8c19b10cb36126
2023-08-09 19:31:58 +08:00
Roger Chang
85549842c7 arch-riscv: Add Illegal Instruction Fault Condition for Mem RVV
Check the status.vs and misa.rvv CSR registers before executing
RVV instructions

Change-Id: If1f6a440713612b9a044de4f320997e99722c06c
2023-08-09 19:22:32 +08:00
Roger Chang
c18e43a0ab arch-riscv: Add Illegal Instruction Fault Condition for Arith RVV
Check the status.vs and misa.rvv CSR registers before executing
RVV instructions

Change-Id: Idc143e1ba90320254926de9fa7a7b343bb96ba88
2023-08-09 19:20:53 +08:00
zmckevitt
14c25a383c arch-riscv: Implemented zicbom/zicboz extensions for RISC V
Change-Id: I79d0e6059a2dbb5a0057c4f7489b999f9e803684
2023-08-04 10:05:15 +08:00
Jason Lowe-Power
0ff485f7d0 stdlib, resources: fixed style issue in isa.hh (#149)
Changed "rv_type" to "rvType".

Change-Id: I7432a87d7a37324777385707854aefba2475b98c
2023-08-03 16:52:52 -07:00
Bobby R. Bruce
2bef8efb94 stdlib, resources: Fixed keyerror: 'is_zipped' bug (#153)
Change-Id: I68fffd880983ebc225ec6fc8c7f8d509759b581d
2023-08-03 16:01:07 -07:00
Harshil Patel
23f5535ef5 Merge branch 'develop' into riscv-fix-style 2023-08-03 13:32:53 -07:00
Harshil Patel
5cfac2cc94 stdlib: Fixed stype issue pcstate.hh
- Changed _rv_type to _rvType.
- Changed rv_type to rvType.

Change-Id: I27bdf342b038f5ebae78b104a29892684265584a
2023-08-03 13:04:17 -07:00
Harshil Patel
a25ca04851 stdlib, resources: Fixed keyerror: 'is_zipped' bug
Change-Id: I68fffd880983ebc225ec6fc8c7f8d509759b581d
2023-08-03 10:59:11 -07:00
Jason Lowe-Power
5eda9fe2ca arch-riscv: Relation chain on RVV support (#83)
This merges initial support for RVV. Currently, only the simple CPUs are supported.
The decoder stalls for every vsetvl instruction.

In the future, we will implement vsetvl as a control instruction as described in #144
2023-08-03 07:31:08 -07:00
Harshil Patel
51d492487e stdlib: stlye fix rv_type to _rvType in isa.hh and isa.cc
Change-Id: I68e2b1be9150e6528693e68fb73470d158838885
2023-08-02 14:06:30 -07:00
Adrià Armejach
884d62b33a arch-riscv: Make vset*vl* instructions serialize
Current implementation of vset*vl* instructions serialize pipeline and
are non-speculative.

Change-Id: Ibf93b60133fb3340690b126db12827e36e2c202d
2023-08-02 14:46:36 +02:00
Jason Lowe-Power
98d68a7307 arch-riscv: Improve style
Minor style fixes in vector code

Change-Id: If0de45a2dbfb5d5aaa65ed3b5d91d9bee9bcc960
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2023-08-02 14:46:36 +02:00
Jason Lowe-Power
af1b2ec2d5 arch-riscv: Add fatal if RVV used with o3 or minor
Since the O3 and Minor CPU models do not support RVV right now as the
implementation stalls the decode until vsetvl instructions are exectued,
this change calls `fatal` if RVV is not explicitly enabled.

It is possible to override this if you explicitly enable RVV in the
config file.

Change-Id: Ia801911141bb2fb2bedcff3e139bf41ba8936085
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2023-08-02 14:46:36 +02:00
Xuan Hu
a9f9c4d6d3 arch-riscv: Add risc-v vector ext v1.0 arith insts support
TODOs:
  + vcompress.vm

Change-Id: I86eceae66e90380416fd3be2c10ad616512b5eba
Co-authored-by: Yang Liu <numbksco@gmail.com>
Co-authored-by: Fan Yang <1209202421@qq.com>
Co-authored-by: Jerin Joy <joy@rivosinc.com>

arch-riscv: Add LICENCE to template files

Change-Id: I825e72bffb84cce559d2e4c1fc2246c3b05a1243
2023-08-02 14:46:36 +02:00
Xuan Hu
91b1d50f59 arch-riscv: Add risc-v vector ext v1.0 mem insts support
* TODOs:
  + Vector Segment Load/Store
  + Vector Fault-only-first Load

Change-Id: I2815c76404e62babab7e9466e4ea33ea87e66e75
Co-authored-by: Yang Liu <numbksco@gmail.com>
Co-authored-by: Fan Yang <1209202421@qq.com>
Co-authored-by: Jerin Joy <joy@rivosinc.com>
2023-08-02 14:46:35 +02:00
Xuan Hu
e14e066fde arch-riscv: Add risc-v vector ext v1.0 vset insts support
Change-Id: I84363164ca327151101e8a1c3d8441a66338c909
Co-authored-by: Yang Liu <numbksco@gmail.com>
Co-authored-by: Fan Yang <1209202421@qq.com>

arch-riscv: Add a todo to fix vsetvl stall on decode

Change-Id: Iafb129648fba89009345f0c0ad3710f773379bf6
2023-08-02 14:46:35 +02:00
Xuan Hu
73892c9b47 arch-riscv: Add risc-v vector regs and configs
This commit add regs and configs for vector extension

* Add 32 vector arch regs as spec defined and 8 internal regs for
  uop-based vector implementation.
* Add default vector configs(VLEN = 256, ELEN = 64). These cannot
  be changed yet, since the vector implementation has only be tested
  with such configs.
* Add disassamble register name v0~v31 and vtmp0~vtmp7.
* Add CSR registers defined in RISCV Vector Spec v1.0.
* Add vector bitfields.
* Add vector operand_types and operands.

Change-Id: I7bbab1ee9e0aa804d6f15ef7b77fac22d4f7212a
Co-authored-by: Yang Liu <numbksco@gmail.com>
Co-authored-by: Fan Yang <1209202421@qq.com>
Co-authored-by: Jerin Joy <joy@rivosinc.com>

arch-riscv: enable rvv flags only for RV64

Change-Id: I6586e322dfd562b598f63a18964d17326c14d4cf
2023-08-02 14:46:35 +02:00
Harshil Patel
32b7ffc454 stdlib: fixed warning message
Change-Id: I04ef23529d7afc5d46fbba7558279ec08acd629a
Co-authored-by: paikunal <kunpai@ucdavis.edu>
2023-08-01 17:22:35 -07:00
Harshil Patel
d96df40253 stdlib: Added support for JSON via env variables.
Change-Id: I5791e6d51b3b9f68eb212a46c4cd0add23668340
Co-authored-by: Kunal Pai <kunpai@ucdavis.edu>
2023-08-01 16:22:44 -07:00
Bobby R. Bruce
dceabe5fda dev-amdgpu: Support for ROCm 5.4+ and MI200 (#141) 2023-07-31 10:24:46 -07:00
Jason Lowe-Power
4ee6dbc330 mem: Minor typo fix in packet.hh (#143)
Change-Id: I07c31b7a62d83fe3250b48141951aec3c2f280df
2023-07-31 10:01:50 -07:00
Matthew Poremba
3589a4c11f arch-vega: Implement translate further
Starting with ROCm 5.4+, MI100 and MI200 make use of the translate
further bit in the page table. This bit enables mixing 4kiB and 2MiB
pages and is functionally equivalent to mixing page sizes using the
PDE.P bit for which gem5 currently has support.

With PDE.P bit set, we stop walking and the page size is equal to the
level in the page table we stopped at. For example, stopping at level
2 would be a 1GiB page, stopping at level 3 would be a 2MiB page.
This assumes most pages are 4kiB.

When the F bit is used, it is assumed most pages are 2MiB and we will
stop walking at the 3rd level of the page table unless the F bit is set.
When the F bit is set, the 2nd level PDE contains a block fragment size
representing the page size of the next PDE in the form of 2^(12+size).
If the next page has the F bit set we continue walking to the 4th level.
The block fragment size is hardcoded to 9 in the driver therefore we
assert that the block fragment size must be 0 or 9.

This enables MI200 with ROCm 5.4+ in gem5. This functionality was
determine by examining the driver source code in Linux and there is no
public documentation about this feature or why the change is made in or
around ROCm 5.4.

Change-Id: I603c0208cd9e821f7ad6eeb1d94ae15eaa146fb9
2023-07-30 13:17:05 -05:00
Matthew Poremba
3b35e73eb8 dev-amdgpu: Implement SDMA constant fill
This SDMA packet is much more common starting around ROCm 5.4.
Previously this was mostly used to clear page tables after an
application ended and was therefore left unimplemented. It is
now used for basic operation like device memsets.

This patch implements constant fill as it is now necessary.

Change-Id: I9b2cf076ec17f5ed07c20bb820e7db0c082bbfbc
2023-07-30 13:17:05 -05:00
Matthew Poremba
618b2a60de arch-vega, dev-amdgpu: Fix for memory leaks (#129)
When using the new operator, delete should be called
on any allocated memory after it's use is complete.

Change-Id: Id5fcfb264b6ddc252c0a9dcafc2d3b020f7b5019
2023-07-30 10:48:17 -07:00
Matthew Poremba
b35c2ba8c5 arch-vega: Fix vop2Helper scalar support (#142)
A previous change added a vop2Helper to remove 100s of lines of common
code from VOP2 instructions related to processing SDWA and DPP support.
That change inadvertently changed the type of operand source 0 from
const to non-const. The vector container operator[] does not allow
reading a scalar value such as a constant, a dword literal, etc. The
error shows up in the form of: assert(!scalar) in operand.hh.

Since the SDWA and DPP cases need to modify the source vector and
non-SDWA/DPP cases might require const, we make a non-const copy of the
const source 0 vector and place it in a temporary non-const vector. This
non-const vector is passed to the lambda function implementation of the
instruction. This prevents needing a const and non-const version of the
lambda and avoids needing to propagate the template parameters through
the various SDWA/DPP helper methods which seems like it will not work
anyways as they need to modify the vector.

As a result of this, as more VOP2 instructions are implemented using
this helper, they will need to specify the const and non-const template
parameters of the vector container needed for the instruction.

Change-Id: Ia0b3c550d7de32b830040007a110f4821e3385aa
2023-07-30 10:47:36 -07:00
Ranganath (Bujji) Selagamsetty
ede4d89a83 arch-vega, dev-amdgpu: Fix for memory leaks
When using the new operator, delete should be called
on any allocated memory after it's use is complete.

Change-Id: Id5fcfb264b6ddc252c0a9dcafc2d3b020f7b5019
2023-07-28 19:14:46 -05:00
Jason Lowe-Power
81cc57b828 gpu-compute: "<random>" -> "base/random.hh" in testers/gpu... (#140)
In "src/cpu/testers/gpu_ruby_test" a random number generator was used.
This was using the CPP "<random>" library. This patch changes it to the
gem5 random class (that declared in "base/random.hh").

In addition to this, undeterministic behavior has been removed. Via
"protocol_tester.cc" the RNG is either seeded with a seed specified by
the user, or goes with the gem5 default seed. This ensures reproducable
runs. Prior to this patch the RNG was seeded with `time(NULL)`. This
made finding faults difficult.

This, at least partially, addresses Issue #138

Change-Id: Ia8e9f7b87e91323f828e0b7f6c3906c0c5793b2c
2023-07-28 16:54:24 -07:00
Ranganath (Bujji) Selagamsetty
3f2899a7a8 mem: Minor typo fix in packet.hh
Change-Id: I07c31b7a62d83fe3250b48141951aec3c2f280df
2023-07-28 17:28:10 -05:00
Bobby R. Bruce
08a3762a14 gpu-compute: Add warn for random_seed == 0 case
Addresses:
https://github.com/gem5/gem5/pull/140#pullrequestreview-1552383650

Change-Id: Ia09a2bc74f35d3d6cb066efaf9d113db6caf4557
2023-07-28 12:55:18 -07:00
Bobby R. Bruce
48ac1ea38d gpu-compute: "<random>" -> "base/random.hh" in testers/gpu...
In "src/cpu/testers/gpu_ruby_test" a random number generator was used.
This was using the CPP "<random>" library. This patch changes it to the
gem5 random class (that declared in "base/random.hh").

In addition to this, undeterministic behavior has been removed. Via
"protocol_tester.cc" the RNG is either seeded with a seed specified by
the user, or goes with the gem5 default seed. This ensures reproducable
runs. Prior to this patch the RNG was seeded with `time(NULL)`. This
made finding faults difficult.

Change-Id: Ia8e9f7b87e91323f828e0b7f6c3906c0c5793b2c
2023-07-28 12:55:03 -07:00
Matthew Poremba
c722b0c73d arch-vega: Fix vop2Helper scalar support
A previous change added a vop2Helper to remove 100s of lines of common
code from VOP2 instructions related to processing SDWA and DPP support.
That change inadvertently changed the type of operand source 0 from
const to non-const. The vector container operator[] does not allow
reading a scalar value such as a constant, a dword literal, etc. The
error shows up in the form of: assert(!scalar) in operand.hh.

Since the SDWA and DPP cases need to modify the source vector and
non-SDWA/DPP cases might require const, we make a non-const copy of the
const source 0 vector and place it in a tempoary non-const vector. This
non-const vector is passed to the lambda function implementation of the
instruction. This prevents needing a const and non-const version of the
lambda and avoids needing to propagate the template parameters through
the various SDWA/DPP helper methods which seems like it will not work
anyways as they need to modify the vector.

As a result of this, as more VOP2 instructions are implemented using
this helper,they will need to specify the const and non-const template
parameters of the vector container needed for the instruction.

Change-Id: Ia0b3c550d7de32b830040007a110f4821e3385aa
2023-07-28 13:47:55 -05:00
Matthew Poremba
7c3c2b05f3 arch-x86: Add extended state CPUID function
The extended state CPUID function is used to set the values of the XCR0
register as well as specify the size of storage for context switching
storage for x87 and AVX+. This function is iterative and therefore
requires (1) marking it as such in the hsaSignificantIndex function (2)
setting multiple sets of 4-tuples for the default CPUID values where the
last 4-tuple ends with all zeros.

Change-Id: Ib6a43925afb1cae75f61d8acff52a3cc26ce17c8
2023-07-28 11:34:04 -05:00
Matthew Poremba
3584c3126c arch-x86: Expose CR4.osxsave bit
Related to the recent changes with moving CPUID values to python, this
value is needed to enable AVX and needs a way to be exposed to python as
well in order to set the bit and the corresponding CPUID values at the
same time.

Change-Id: I3cadb0fe61ff4ebf6de903018a8d8a411bfdb4e0
2023-07-28 11:34:04 -05:00
Matthew Poremba
3946f7ba2c arch-x86: Support CPUID functions with indexes
Various CPUID functions will return different values depending on the
value of ECX when executing the CPUID instruction. Add support for this
in the X86 KVM CPU. A subsequent patch will add a CPUID function which
requires iterating through multiple ECX values.

Change-Id: Ib44a52be52ea632d5e2cee3fb2ca390b60a7202a
2023-07-28 11:34:04 -05:00
Matthew Poremba
63d98018ea arch-x86: Move CPUID values to python
CPUID values for X86 are currently hard-coded in the C++ source file.
This makes it difficult to configure the bits if needed. Move these to
python instead. This will provide a few benefits:

1. We can enable features for certain configurations, for example AVX
can be enabled when the KVM CPU is used, but otherwise should not be
enabled as gem5 does not have full AVX support.
2. We can more accurately communicate things like cache/TLB sizes based
on the actual gem5 configuration. The CPUID values are can be used by
some libraries, e.g., MPI, to query system topology.
3. Enabling some bits breaks things in certain configurations and this
can be prevented by configuring in python. For example, enabling AVX
seems to currently be breaking SMP, meaning gem5 can only boot one CPU
in that configuration.

Change-Id: Ib3866f39c86d61374b9451e60b119a3155575884
2023-07-28 11:34:04 -05:00
Bobby R. Bruce
5aa955212f learning-gem5: Add a missing override (#135) 2023-07-27 09:37:52 -07:00
Hoa Nguyen
bd82e6f1a7 learning-gem5: Add a missing override
Change-Id: I9acebe6f3096b499fa2c69b6d757373431f63c71
Signed-off-by: Hoa Nguyen <hn@hnpl.org>
2023-07-26 20:01:37 -07:00
Matthew Poremba
ff7e67ee93 cpu: Set SLC bit for GPU tester
This fixes #131 by reverting to the old behavior of performing all
atomics at the system level. To do this the SLC bit needs to be set for
all atomic requests.

Change-Id: I63f4e449be1b02c933832d09700237f8c8026f4c
2023-07-26 21:18:26 -05:00
Atri Bhattacharyya
256729a40c mem: Make functional request a response when satisfied by queue
In the memory controller, MemCtrl::MemoryPort::recvFunctional,
when the functional request is satisfied by the ctrl-response queue,
correctly make the packet a response.

This change mirrors AbstractMemory::functionalAccess, which uses
Packet::makeResponse() after satisfying the request.

Change-Id: I47917062d3270915a97eed2c9fade66ba17019eb
2023-07-26 17:34:24 +02:00
Bobby R. Bruce
7601fcfba6 cpu-minor: Check pc valid before printing (#107)
In https://gem5-review.googlesource.com/c/public/gem5/+/52047 inst.pc
was changed from an object to a pointer. It is possible that this
pointer is null (e.g., if there is an interrupt and there is a bubble).
Make sure to check that it's not null before printing.

I believe that other places this pointer is dereferenced without an
explicit null check are safe, but I'm not certain.

Should fix #97 

Change-Id: Idbe246cfdb62d4d75416d41b451fb3c076233bbc
2023-07-25 17:14:38 -07:00
Melissa Jost
556c9154dd base: Add maybe_unused to findLsbSetFallback (#109)
When compiling with clang-14 I received the following error:

```
src/base/bitfield.hh:328:1: error: function 'findLsbSetFallback' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
```

This function was introduced in PR #76.
This fixes this compiler warning/error by using `[[maybe_unused]]`.

Change-Id: I0b99eab0a9e42ee1687e7a0594a5a7bf9588b422
2023-07-25 10:41:59 -07:00