Commit Graph

88 Commits

Author SHA1 Message Date
Cui Jin
19bf5c4f33 cpu-o3: Resolve circular buffer issue for LSQ
--since int is only 4 bytes, while ssize_t is 8 bytes in 64bit
  system. so 0x80000000 is regarded as negative value.

Jira Issue:: https://gem5.atlassian.net/browse/GEM5-1203

Change-Id: I74b3785b29751f777f5e154692fa60bf62b37b9f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/58649
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-04-08 15:40:30 +00:00
Samuel Stark
139f635bde cpu: Allow TLB shootdown requests in the o3 cpu
JIRA: https://gem5.atlassian.net/browse/GEM5-1097

Change-Id: Ie698efd583f592e5564af01c2150fbec969f56a2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/56600
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
2022-03-09 10:47:16 +00:00
Gabe Black
c498d8bced cpu: Specialize CPUs for an ISA at the leaves, not BaseCPU.
The BaseCPU type had been specializing itself based on the value of
TARGET_ISA, which is not compatible with building more than one ISA at a
time.

This change refactors the CPU models so that the BaseCPU is more
general, and the ISA specific components are added to the CPU when the
CPU types are fully specialized. For instance, The AtomicSimpleCPU has a
version called X86AtomicSimpleCPU which installs the X86 specific
aspects of the CPU.

This specialization is done in three ways.

1. The mmu parameter is assigned an instance of the architecture
specific MMU type. This provides a reasonable default, but also avoids
having having to use the ISA specific type when the parameter is
created.

2. The ISA specific types are made available as class attributes, and
the utility functions (including __init__!) in the BaseCPU class can
refer to them to get the types they need to set up the CPU at run time.

Because SimObjects have strange, unhelpful semantics as far as assigning
to their attributes, these types need to be set up in a non-SimObject
class, which is then brought in as a base of the actual SimObject type.
Because the metaclass of this other type is just "type", things work
like you would expect. The SimObject doesn't do any special processing
of base classes if they aren't also SimObjects, so these attributes
survive and are accessible using normal lookup in the BaseCPU class.

3. There are some methods like addCheckerCPU and properties like
needsTSO which have ISA specific values or behaviors. These are set in
the ISA specific subclass, where they are inherently specific to an ISA
and don't need to check TARGET_ISA.

Also, the DummyChecker which was set up for the BaseSimpleCPU which
doesn't actually do anything in either C++ or python was not carried
forward. The CPU type still exists, but it isn't installed in the
simple CPUs.

To provide backward compatibility, each ISA implements a .py file which
matches the original .py for a CPU, and the original is renamed with a
Base prefix. The ISA specific version creates an alias with the old CPU
name which maps to the ISA specific type. This way, old scripts which
refer to, for example, AtomicSimpleCPU, will get the X86AtomicSimpleCPU
if the x86 version was compiled in, the ArmAtomicSimpleCPU on arm, etc.

Unfortunately, because of how tags on PySource and by extension SimObjects
are implemented right now, if you set the tags on two SimObjects or
PySources which have the same module path, the later will overwrite the
former whether or not they both would be included. There are some
changes in review which would revamp this and make it work like you
would expect, without this central bookkeeping which has the conflict.
Since I can't use that here, I fell back to checking TARGET_ISA to
decide whether to tell SCons about those files at all.

In the long term, this mechanism should be revamped so that these
compatibility types are only available if there is exactly one ISA
compiled into gem5. After the configs have been updated and no longer
assume they can use AtomicSimpleCPU in all cases, then these types can
be deleted.

Also, because ISAs can now either provide subclasses for a CPU or not,
the CPU_MODELS variable has been removed, meaning the non-ISA
specialized versions of those CPU models will always be included in
gem5, except when building the NULL ISA.

In the future, a more granular config mechanism will hopefully be
implemented for *all* of gem5 and not just the CPUs, and these can be
conditional again in case you only need certain models, and want to
reduce build time or binary size by excluding the others.

Change-Id: I02fc3f645c551678ede46268bbea9f66c3f6c74b
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/52490
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2022-01-12 15:59:27 +00:00
Tom Rollet
13e3521a00 cpu-o3: remove useless 'using'-s
Change-Id: Ifa8ef516d0deabb4308bdf3c4b61b88ece149d0e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51347
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-11 08:19:29 +00:00
Tom Rollet
8a535eac48 cpu-o3: Naming cleanup for LSQRequest and Request
'LSQRequest' are now referred as 'request'
'Request' are now referred as 'req'

It makes the code easier to read.
Also it makes the naming of Request consistent with the cache.

Change-Id: I8ba75b75bd8408e411300d522cc2c8582c334cf5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/51067
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Gabe Black <gabe.black@gmail.com>
2021-10-11 08:19:29 +00:00
Tom Rollet
de0d6f4116 cpu-o3: remove LSQSenderState
The LSQSenderState that was attached to Request was not useful.
All the fields were either a duplicate of information in the
LSQRequest or totally unused.

The LSQRequest class now inherits from Packet::SenderState and is
attached to the Packet that are sent to memory. We do not need
anymore the indirection Packet->SenderState->LSQRequest.

This helps making the code clearer as it was sometimes hard to
follow the difference between what the LSQRequest and
LSQSenserState was doing
(ex: number of outstanding requests in the memory).

Change-Id: I5b21e007e6d183c6aa79c27c1787ca56dcbc3fb0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50733
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-10-11 08:19:29 +00:00
Samuel Stark
2c457d2a9f cpu: Fix TME for dyn_o3_cpu
Commit c417b76 changed the behaviour of addRequest(),
but did not update documentation or the HTM-related logic that used it.

Updates documentation for addRequest() in light of c417b76,
refactors request class to be idiomatic and use assigned byteEnable,
made HTM cmds pass in a correct byteEnable.

Change-Id: I7aa8c127df896e81caf915fbfea93e7b4bcc53b7
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/50147
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-09-10 08:13:59 +00:00
Giacomo Travaglini
d1cdcb311b misc: Move Mode and Translation from BaseTLB to BaseMMU
This is a step towards moving most of the TLB logic to the
MMU class.

Change-Id: Id6b1fb30aa89960705f165f9738f5b50aa1e6bdb
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46779
Tested-by: kokoro <noreply+kokoro@google.com>
2021-07-07 08:44:13 +00:00
Daniel R. Carvalho
974a47dfb9 misc: Adopt the gem5 namespace
Apply the gem5 namespace to the codebase.

Some anonymous namespaces could theoretically be removed,
but since this change's main goal was to keep conflicts
at a minimum, it was decided not to modify much the
general shape of the files.

A few missing comments of the form "// namespace X" that
occurred before the newly added "} // namespace gem5"
have been added for consistency.

std out should not be included in the gem5 namespace, so
they weren't.

ProtoMessage has not been included in the gem5 namespace,
since I'm not familiar with how proto works.

Regarding the SystemC files, although they belong to gem5,
they actually perform integration between gem5 and SystemC;
therefore, it deserved its own separate namespace.

Files that are automatically generated have been included
in the gem5 namespace.

The .isa files currently are limited to a single namespace.
This limitation should be later removed to make it easier
to accomodate a better API.

Regarding the files in util, gem5:: was prepended where
suitable. Notice that this patch was tested as much as
possible given that most of these were already not
previously compiling.

Change-Id: Ia53d404ec79c46edaa98f654e23bc3b0e179fe2d
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/46323
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-07-01 19:08:24 +00:00
Gabe Black
9909ea8a40 cpu: Create an O3 namespace and simplify O3 names.
DefaultFoo => Foo
O3Foo => Foo
FullO3CPU => CPU

DerivO3CPU => O3CPU (python)

DerivO3 => o3::CPU

Change-Id: I04551214442633c79c33e9d86b067ff3ec0d1a8d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42120
Maintainer: Gabe Black <gabe.black@gmail.com>
Reviewed-by: Nathanael Premillieu <nathanael.premillieu@huawei.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-05-25 19:29:14 +00:00
Gabe Black
f30d15a29e cpu: Delete the now unused cpu/o3/impl.hh.
Change-Id: I99b6ec745066c154079c3f44086d2e8721c0ed82
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42119
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
2021-05-21 01:19:37 +00:00
Gabe Black
fda2e46a9e cpu: De-templatize the FullO3CPU class.
Change-Id: Ib7f1e40447a2f5a49e0c9a3af8579d075d5d3625
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42117
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nathanael Premillieu <nathanael.premillieu@huawei.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
2021-05-21 01:19:37 +00:00
Gabe Black
03a843cf77 cpu: De-templatize the O3 DefaultIEW.
Change-Id: Ieb7b23250573a3bc7e7ff296ff6bf8811a865802
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42112
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
2021-05-21 01:19:37 +00:00
Gabe Black
9722ce0075 cpu: De-templatize the O3 LSQ.
Change-Id: I7821fe971c0c38b77e730b4c40fb9fb204c6e7fd
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42111
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Gabe Black <gabe.black@gmail.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
2021-05-21 01:19:37 +00:00
Gabe Black
0f667aff1f cpu: De-templatize O3's LSQUnit.
Change-Id: Id426950b4fec9b98855b3f9f95e63fc0d9b6e64f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42107
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nathanael Premillieu <nathanael.premillieu@huawei.com>
2021-05-21 01:19:37 +00:00
Gabe Black
fc51c87329 cpu: Remove the O3CPU type from the O3CPUImpl.
Change-Id: I4dca10ea3ae1c9bb0f2cb55c7d303f1fd8d25283
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42102
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nathanael Premillieu <nathanael.premillieu@huawei.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
2021-05-20 20:07:29 +00:00
Gabe Black
2db8b308e0 cpu: Drop the DynInstPtr types from O3CPUImpl.
Aside from basic code editting, this also moves some methods from the
.hh files to the _impl.hh files. It also changes the Checker CPU
template to take the DynInstPtr type directly instead of through Impl
since that was the only type it used anyway. Finally it sets up a header
file which predeclares the O3DynInstPtr and O3DynInstConstPtr types so
they can be used without having to also include the BaseO3DynInst class
definition to break circular dependencies.

Change-Id: I5ca6af38ec13e6e820abcdb3748412e4f7fc1c78
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42101
Reviewed-by: Nathanael Premillieu <nathanael.premillieu@huawei.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-05-20 20:07:09 +00:00
Gabe Black
7036e2174f cpu: Pull all remaining non-comm types out of SimpleCPUPolicy.
Change-Id: I79c56533cf6a9d1c982cea3ca9bedc83e6afda49
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42099
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
2021-04-29 04:23:25 +00:00
Gabe Black
86059e7a0b cpu: Extract stage classes from O3's SimpleCPUPolicy.
Use the target types directly without that layer of indirection. This
also narrows the scope of some includes.

Change-Id: I152f2ce0684781a9b61bd9d5a38620c39a4c60e8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/42098
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabe.black@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-04-29 04:23:01 +00:00
Daniel R. Carvalho
3a8df68388 misc: Fix some includes
Fix some missing and extra includes around the codebase.

Change-Id: Ibf314b43a966943a8096958f68382e1e245f29e3
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38738
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2021-01-15 23:15:30 +00:00
Gabe Black
91d83cc8a1 misc: Standardize the way create() constructs SimObjects.
The create() method on Params structs usually instantiate SimObjects
using a constructor which takes the Params struct as a parameter
somehow. There has been a lot of needless variation in how that was
done, making it annoying to pass Params down to base classes. Some of
the different forms were:

const Params &
Params &
Params *
const Params *
Params const*

This change goes through and fixes up every constructor and every
create() method to use the const Params & form. We use a reference
because the Params struct should never be null. We use const because
neither the create method nor the consuming object should modify the
record of the parameters as they came in from the config. That would
make consuming them not idempotent, and make it impossible to tell what
the actual simulation configuration was since it would change from any
user visible form (config script, config.ini, dot pdf output).

Change-Id: I77453cba52fdcfd5f4eec92dfb0bddb5a9945f31
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35938
Reviewed-by: Gabe Black <gabeblack@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-10-14 12:06:44 +00:00
Giacomo Travaglini
c417b76bad cpu: Never use a empty byteEnable
The byteEnable variable is used for masking bytes in a memory request.
The default behaviour is to provide from the ExecContext to the CPU
(and then to the LSQ) an empty vector, which is the same as providing
a vector where every element is true.
Such vectors basically mean: do not mask any byte in the memory request.

This behaviour adds more complexity to the downstream LSQs, which now
have to distinguish between an empty and non-empty byteEnable.

This patch is simplifying things by transforming an empty vector into
a all true one, making sure the CPUs are always receiving a non empty
byteEnable.

JIRA: https://gem5.atlassian.net/browse/GEM5-196

Change-Id: I1d1cecd86ed64c53a314ed700f28810d76c195c3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/23285
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-09-30 14:16:31 +00:00
Shivani Parekh
392c1ced53 misc: Replaced master/slave terminology
Change-Id: I4df2557c71e38cc4e3a485b0e590e85eb45de8b6
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33553
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-09-10 23:02:28 +00:00
Emily Brickey
0df96ee6bb cpu-o3: convert lsq_unit to new style stats
Removes unused stats: invAddrLoads, invAddrSwpfs, lsqBlockedLoads

Change-Id: Icd7fc6d8a040f4a1f9b190409b7cdb0a57fd68cf
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33394
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-09-09 14:37:37 +00:00
Timothy Hayes
46d7fdf1b6 cpu: HTM Implementation for O3CPU
JIRA: https://gem5.atlassian.net/browse/GEM5-587

Change-Id: I83787f4594963a15d856b81ad283b4f032d1c007
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/30328
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-09-08 09:13:30 +00:00
Emily Brickey
1447017039 cpu: update port terminology
Change-Id: I891e7a74683c1775c75a62454fcfdecb7511b7e9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/32312
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu>
2020-08-26 16:48:13 +00:00
Gabe Black
4dd00b0153 arch,cpu,gpu-compute,mem: Remove asid from Request objects.
This is passed around a lot and set all over the place (usually to 0),
but it's never actually used for anything.

Change-Id: I38ca08387beabeaf9e339b4915ec7eba9e19eecb
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/26232
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>
2020-03-07 00:40:41 +00:00
Gabe Black
ebd62eff3c arch,cpu,mem: Replace the mmmapped IPR mechanism with local accesses.
The new local access mechanism installs a callback in the request which
implements what the mmapped IPR was doing. That avoids having to have
stubs in ISAs that don't have mmapped IPRs, avoids having to encode
what to do to communicate from the TLB and the mmapped IPR functions,
and gets rid of another global ISA interface function and header files.

Jira Issue: https://gem5.atlassian.net/browse/GEM5-187

Change-Id: I772c2ae2ca3830a4486919ce9804560c0f2d596a
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/23188
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2020-03-04 04:09:19 +00:00
Gabe Black
6687265fe2 cpu: Delete authors lists from the cpu directory.
Change-Id: Icfba8e23b5f6820a6ddefe1a50abbe5f8825b7b5
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/25444
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
2020-02-17 21:51:23 +00:00
Giacomo Travaglini
c3bd8eb121 cpu: Fix coding style (byteEnable->byte_enable)
Change-Id: I2206559c6c2a6e6a0452e9c7d9964792afa9f358
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/23282
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-12-11 15:07:52 +00:00
Giacomo Gabrielli
eef524d9ec cpu-o3: Fix handling of some mem. order violations
This patch fixes the handling of memory order violations due to snoops
targeting out-of-order loads: the re-execution triggered in these cases
is achieved by raising a ReExec fault, but such a fault was not handled
correctly after the code changes introduced in changeset 46da8fb.

Change-Id: I2abe161a90468412f56cb28dcc92729326cba1cd
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21819
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Timothy Hayes <timothy.hayes@arm.com>
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-10-30 09:05:34 +00:00
Jordi Vaquero
e5a82da26e cpu, mem: Changing AtomicOpFunctor* for unique_ptr<AtomicOpFunctor>
This change is based on modify the way we move the AtomicOpFunctor*
through gem5 in order to mantain proper ownership of the object and
ensuring its destruction when it is no longer used.

Doing that we fix at the same time a memory leak in Request.hh
where we were assigning a new AtomicOpFunctor* without destroying the
previous one.

This change creates a new type AtomicOpFunctor_ptr as a
std::unique_ptr<AtomicOpFunctor> and move its ownership as needed. Except
for its only usage when AtomicOpFunc() is called.

Change-Id: Ic516f9d8217cb1ae1f0a19500e5da0336da9fd4f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20919
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-09-23 12:32:08 +00:00
Gabe Black
b4e3e2f4a4 cpu: Move O3's data port into the LSQ.
That's where it's used, and putting it there avoids having to pass
around the port using the top level getDataPort function.

Change-Id: I0dea25d0c5f4bb3f58a6574a8f2b2d242784caf2
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/20238
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>
2019-08-28 02:14:29 +00:00
Jordi Vaquero
7cb1010bde cpu-o3: added _amo_op parameter in o3 LSQ
Fix bug with AMO (or RMW) instructions where the amo_op variable
is not being propagated to the LSQ request.

Change-Id: I60c59641d9b497051376f638e27f3c4cc361f615
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19814
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
2019-08-07 17:19:22 +00:00
Gabor Dozsa
5a9fb5a2bf cpu-o3: Fix too strict assert condition in writeback()
The assert() in the LSQ writeback() only allowed ReExec faults.
However, a SplitRequest which completed the translation in
PartialFault state (i.e. any but the very first cacheline
translation failed) may end up here. The assert() condition is
extended accordingly.

The patch also removes the superfluous/unused Complete/Squashed
states from the LSQ request. (The completion of the request is
recorded in the flags still.)

Change-Id: Ie575f4d3b4d5295585828ad8c7d3f4c7c1fe15d0
Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com>
Reviewed-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19174
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-07-28 16:28:43 +00:00
Gabor Dozsa
46da8fb805 cpu: Add first-/non-faulting load support to Minor and O3
Some architectures allow masking faults of memory load instructions in
some specific circumstances (e.g. first-faulting and non-faulting
loads in Arm SVE). This patch adds support for such loads in the Minor
and O3 CPU models.

Change-Id: I264a81a078f049127779aa834e89f0e693ba0bea
Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19178
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-07-27 20:51:31 +00:00
Giacomo Gabrielli
c58cb8c9db cpu,mem: Add support for partial loads/stores and wide mem. accesses
This changeset adds support for partial (or masked) loads/stores, i.e.
loads/stores that can disable accesses to individual bytes within the
target address range.  In addition, this changeset extends the code to
crack memory accesses across most CPU models (TimingSimpleCPU still
TBD), so that arbitrarily wide memory accesses are supported.  These
changes are required for supporting ISAs with wide vectors.

Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>
- Tiago Muck <tiago.muck@arm.com>

Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-05-11 12:48:58 +00:00
Gabor Dozsa
397d322b99 cpu-o3: Add cache read ports limit to LSQ
This change introduces cache read ports to limit the number of
per-cycle loads. Previously only the number of per-cycle stores
could be limited.

Change-Id: I39bbd984056c5a696725ee2db462a55b2079e2d4
Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com>
Reviewed-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13517
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2019-02-22 12:16:20 +00:00
Tuan Ta
25dc765889 cpu: support atomic memory request type with AtomicOpFunctor
This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU,
MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory
system.

Atomic memory instruction is treated as a special store instruction in
all CPU models.

In simple CPUs, an AMO request with an associated AtomicOpFunctor is
simply sent to L1 dcache.

In MinorCPU, an AMO request bypasses store buffer and waits for any
conflicting store request(s) currently in the store buffer to retire
before the AMO request is sent to the cache. AMO requests are not buffered
in the store buffer, so their effects appear immediately in the cache.

In DerivO3CPU, an AMO request is inserted in the store buffer so that it
is delivered to the cache only after all previous stores are issued to
the cache. Data forwarding between between an outstanding AMO in the
store buffer and a subsequent load is not allowed since the AMO request
does not hold valid data until it's executed in the cache.

This implementation assumes that a target ISA implementation must insert
enough memory fences as micro-ops around an atomic instruction to
enforce a correct order of memory instructions with respect to its
memory consistency model. Without extra memory fences, this implementation
can allow AMOs and other memory instructions that do not conflict
(i.e., not target the same address) to reorder.

This implementation also assumes that atomic instructions execute within
a cache line boundary since the cache for now is not able to execute an
operation on two different cache lines in one single step. Therefore,
ISAs like x86 that require multi-cache-line atomic instructions need to
either use a pair of locking load and unlocking store or change the
cache implementation to guarantee the atomicity of an atomic
instruction.

Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a
Reviewed-on: https://gem5-review.googlesource.com/c/8188
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-02-08 15:27:04 +00:00
Andrea Mondelli
1989ce9905 misc: added missing override specifier
Added missing specifier for various virtual functions.

Change-Id: I4783e92d78789a9ae182fad79aadceafb00b2458
Reviewed-on: https://gem5-review.googlesource.com/c/16103
Reviewed-by: Hoa Nguyen <hoanguyen@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-02-05 23:27:57 +00:00
Rekai Gonzalez-Alberquilla
51becd2475 cpu-o3: O3 LSQ Generalisation
This patch does a large modification of the LSQ in the O3 model. The
main goal of the patch is to remove the 'an operation can be served with
one or two memory requests' assumption that is present in the LSQ
and the instruction with the req, reqLow, reqHigh triplet, and
generalising it to operations that can be addressed with one request,
and operations that require many requests, embodied in the
SingleDataRequest and the SplitDataRequest.

This modification has been done mimicking the minor model to an extent,
shifting the responsibilities of dealing with VtoP translation and
tracking the status and resources from the DynInst to the LSQ via the
LSQRequest. The LSQRequest models the information concerning the
operation, handles the creation of fragments for translation and request
as well as assembling/splitting the data accordingly.

With this modifications, the implementation of vector ISAs, particularly
on the memory side, become more rich, as the new model permits a
dissociation of the ISA characteristics as vector length, from the
microarchitectural characteristics that govern how contiguous loads are
executing, allowing exploration of different LSQ to DL1 bus widths to
understand the tradeoffs in complexity and performance.

Part of the complexities introduced stem from the fact that gem5 keeps a
large amount of metadata regarding, in particular, memory operations,
thus, when an instruction is squashed while some operation as TLB lookup
or cache access is ongoing, when the relevant structure communicates to
the LSQ that the operation is over, it tries to access some pieces of
data that should have died when the instruction is squashed, leading to
asserts, panics, or memory corruption. To ensure the correct behaviour,
the LSQRequest rely on assesing who is their owner, and self-destroying
if they detect their owner is done with the request, and there will be
no subsequent action. For example, in the case of an instruction
squashed whal the TLB is doing a walk to serve the translation, when the
translation is served by the TLB, the LSQRequest detects that the
instruction was squashed, and as the translation is done, no one else
expect to access its information, and therefore, it self-destructs.
Having destroyed the LSQRequest earlier, would lead to wrong behaviour
as the TLB walk may access some fields of it.

Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>

Change-Id: I9578a1a3f6b899c390cdd886856a24db68ff7d0c
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13516
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
2019-01-24 09:46:34 +00:00
Nikos Nikoleris
7b4e441c35 cpu-o3: Make the smtLSQPolicy a Param.ScopedEnum
The smtLSQPolicy is a parameter in the o3 cpu that can have 3
different values. Previously this setting was done through a string
and a parser function would turn it into a c++ enum value. This
changeset turns the string into a python Param.ScopedEnum.

Change-Id: I82041b88bd914c5dc660058d9e3998e3114e7c35
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/15397
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-01-17 11:09:08 +00:00
Rekai Gonzalez-Alberquilla
9af1214ffe cpu: Change raw pointers to STL Containers
This patch changes two members from being raw pointers to being STL
containers. The reason behind, other than cleanlyness and arguable OO
best practices is that containers have more intronspections capabilities
than naked pointers do, as the size is known.

Using STL containers adds little overhead and eases the automation of
process during debugging (gdb).

Change-Id: I4d9d3eedafa8b5e50ac512ea93b458a4200229f2
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13126
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2018-12-03 14:23:56 +00:00
Rekai Gonzalez-Alberquilla
0c50a0b4fe cpu: Fix the usage of const DynInstPtr
Summary: Usage of const DynInstPtr& when possible and introduction of
move operators to RefCountingPtr.

In many places, scoped references to dynamic instructions do a copy of
the DynInstPtr when a reference would do. This is detrimental to
performance. On top of that, in case there is a need for reference
tracking for debugging, the redundant copies make the process much more
painful than it already is.

Also, from the theoretical point of view, a function/method that
defines a convenience name to access an instruction should not be
considered an owner of the data, i.e., doing a copy and not a reference
is not justified.

On a related topic, C++11 introduces move semantics, and those are
useful when, for example, there is a class modelling a HW structure that
contains a list, and has a getHeadOfList function, to prevent doing a
copy to an internal variable -> update pointer, remove from the list ->
update pointer, return value making a copy to the assined variable ->
update pointer, destroy the returned value -> update pointer.

Change-Id: I3bb46c20ef23b6873b469fd22befb251ac44d2f6
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/13105
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2018-11-16 10:39:03 +00:00
Giacomo Travaglini
f54020eb81 misc: Using smart pointers for memory Requests
This patch is changing the underlying type for RequestPtr from Request*
to shared_ptr<Request>. Having memory requests being managed by smart
pointers will simplify the code; it will also prevent memory leakage and
dangling pointers.

Change-Id: I7749af38a11ac8eb4d53d8df1252951e0890fde3
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/10996
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2018-06-11 16:55:30 +00:00
Mitch Hayenga
c75ff71139 mem: Remove threadId from memory request class
In general, the ThreadID parameter is unnecessary in the memory system
as the ContextID is what is used for the purposes of locks/wakeups.
Since we allocate sequential ContextIDs for each thread on MT-enabled
CPUs, ThreadID is unnecessary as the CPUs can identify the requesting
thread through sideband info (SenderState / LSQ entries) or ContextID
offset from the base ContextID for a cpu.

This is a re-spin of 20264eb after the revert (bd1c6789) and includes
some fixes of that commit.
2016-04-07 09:30:20 -05:00
Andreas Sandberg
be28d96510 Revert power patch sets with unexpected interactions
The following patches had unexpected interactions with the current
upstream code and have been reverted for now:

e07fd01651f3: power: Add support for power models
831c7f2f9e39: power: Low-power idle power state for idle CPUs
4f749e00b667: power: Add power states to ClockedObject

Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>

--HG--
extra : amend_source : 0b6fb073c6bbc24be533ec431eb51fbf1b269508
2016-04-06 19:43:31 +01:00
Mitch Hayenga
8615b27174 mem: Remove threadId from memory request class
In general, the ThreadID parameter is unnecessary in the memory system
as the ContextID is what is used for the purposes of locks/wakeups.
Since we allocate sequential ContextIDs for each thread on MT-enabled
CPUs, ThreadID is unnecessary as the CPUs can identify the requesting
thread through sideband info (SenderState / LSQ entries) or ContextID
offset from the base ContextID for a cpu.
2016-04-05 12:39:21 -05:00
Steve Reinhardt
707275265f cpu: remove unnecessary data ptr from O3 internal read() funcs
The read() function merely initiates a memory read operation; the
data doesn't arrive until the access completes and a response packet
is received from the memory system.  Thus there's no need to provide
a data pointer; its existence is historical.

Getting this pointer out of this internal o3 interface sets the
stage for similar cleanup in the ExecContext interface.  Also
found that we were pointlessly setting the contents at this pointer
on a store forward (the useful memcpy happens just a few lines
below the deleted one).
2016-01-17 18:27:46 -08:00
Andreas Hansson
f26a289295 mem: Split port retry for all different packet classes
This patch fixes a long-standing isue with the port flow
control. Before this patch the retry mechanism was shared between all
different packet classes. As a result, a snoop response could get
stuck behind a request waiting for a retry, even if the send/recv
functions were split. This caused message-dependent deadlocks in
stress-test scenarios.

The patch splits the retry into one per packet (message) class. Thus,
sendTimingReq has a corresponding recvReqRetry, sendTimingResp has
recvRespRetry etc. Most of the changes to the code involve simply
clarifying what type of request a specific object was accepting.

The biggest change in functionality is in the cache downstream packet
queue, facing the memory. This queue was shared by requests and snoop
responses, and it is now split into two queues, each with their own
flow control, but the same physical MasterPort. These changes fixes
the previously seen deadlocks.
2015-03-02 04:00:35 -05:00