Commit Graph

9331 Commits

Author SHA1 Message Date
Ciro Santilli
279501816a arch-arm: implement VMINNM and VMAXNM scalar version
ARMv8.2 16-bit versions have not yet been implemented, but placeholders
were created for them.

Refactor the nearby decoding tree to closely match the ARM spec A32 decode
table.

That piece of the tree can also be called from thumb which decodes it in
the same way, although the thumb decode table uses different terminology.

The old code matched neither A32 nor T32 terminology, so it is
better to at least match one of them to help verify correctness.

Change-Id: Iabbbca2932557cf6c98ce36690c385c3ddf39ed8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18690
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-17 10:02:40 +00:00
Ciro Santilli
0dee5c3d1b arch-arm: implement VMINNM and VMAXNM SIMD version
This instruction is backported from aarch64.

In order to use the existing fplibMinNum backend, we first move
VMIN and VPMIN to use fplib. Adding VMINNM is then trivial.

Change-Id: I404daabeb6079f60e51a648a06d5b3e54f1c24a9
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18689
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-17 10:02:40 +00:00
Ciro Santilli
396a07e34d arch-arm: rename operands to match spec in isa/formats/fp.isa
Matches ARM DDI 0487D.a decoding tables.

Change-Id: I48338ef956a04308d55d1022229ebe0962a8fe5d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18688
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-17 10:02:40 +00:00
Tiago Muck
83d5730c48 mem-ruby: MOESI_CMP_dir cleanup
Removed unused states and actions

Change-Id: I3dc684c78d4b92d219e71522ddb706a13f9874d1
Signed-off-by: Tiago Muck <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18415
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: John Alsop <johnathan.alsop@amd.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-05-14 22:01:12 +00:00
Tiago Muck
36e49e2b5b mem-ruby: Cache latencies for MOESI_CMP_dir
Modified both L1 and L2 controllers to take into account the cache
latency parameters. Default values in the configuration script updated
as well.

Change-Id: I72bb8dd29ee0b02da06e1addf13b266fe4d1e979
Signed-off-by: Tiago Muck <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18414
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-05-14 22:01:12 +00:00
Tiago Muck
496d5ed3e1 mem-ruby: Hit latencies defined by the controllers
Removed the icache/dcache hit latency parameters from the Sequencer.
They were replaced by the mandatory queue enqueue latency that is now
defined by the top-level cache controller. By default, the latency is
defined by the mandatory_queue_latency parameter. When the latency
depends on specific protocol states or on the request type, the protocol
may override the mandatoryQueueLatency function.

Change-Id: I72e57a7ea49501ef81dc7f591bef14134274647c
Signed-off-by: Tiago Muck <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18413
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-05-14 22:01:12 +00:00
Tiago Muck
42e55cdafd mem-ruby: Do not change blocked msg enqueue info
Updating the message counter and enqueue times when adding blocked
messages back to the queue does not make a lot of sense since these
messages are not new arrivals.
More importantly, this may lead to starvation. See the scenario below:

1) Request A for a blocked line X arrives
2) A is handled; X is blocked so A is stalled
3) Request B for X arrives; Response for X arrives
4) Response is handled; X unblocked; A added back to the request queue
5) B is handled ahead of A (since A's arrival was updated);
   X may become blocked again

If new requests keep coming for X, A may be stalled forever.
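
The scenario above can be sketched in Python as a queue ordered by enqueue
time. This is only an illustration: the function name, the timestamps and the
heap-based queue are assumptions, not the Ruby MessageBuffer code.

```python
import heapq


def service_order(restamp_on_wakeup):
    """Replay steps 1-5 and return the order in which A and B are served."""
    queue = []                       # min-heap of (enqueue_time, request)
    heapq.heappush(queue, (1, "A"))  # 1) A arrives at t=1
    original_time, stalled = heapq.heappop(queue)  # 2) X is blocked, A stalls
    heapq.heappush(queue, (3, "B"))  # 3) B arrives at t=3
    # 4) The response unblocks X at t=4 and A goes back to the queue.
    wake_time = 4 if restamp_on_wakeup else original_time
    heapq.heappush(queue, (wake_time, stalled))
    order = []
    while queue:                     # 5) serve in enqueue-time order
        order.append(heapq.heappop(queue)[1])
    return order
```

With restamp_on_wakeup=True, B is served ahead of A; keeping the original
enqueue time preserves the arrival order.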

Change-Id: Icad79f3f716a870e91cb3455437b8b3c35f130ac
Signed-off-by: Tiago Muck <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18412
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-05-14 22:01:12 +00:00
Tiago Muck
575ac7a14a mem-ruby: Unique ranks for MOESI_CMP_dir in ports
Setting different values for the rank parameter for all input ports.
If left unset, it defaults to 0. This may cause issues since the rank is
used as an index in the controller's list of stalled buffers.

Change-Id: Ie8ff660b7450df959292311040aebf802657efcf
Signed-off-by: Tiago Muck <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18411
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-05-14 22:01:12 +00:00
Tiago Muck
cbf74a79e6 mem-ruby: Change MOESI_CMP_Dir L2 addressing
L1 controller selects the L2 to message based on the assigned address
ranges instead of explicitly interleaving bits in the L1 controller. This
simplifies the L1 controller implementation a bit and allows for more
flexibility when changing the address->controller mapping.
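
A rough Python sketch of range-based selection, as opposed to picking
interleaving bits by hand. The range tuples and controller names are
hypothetical; the real change is in the protocol's SLICC code.

```python
def select_l2(addr, l2_ranges):
    """Return the L2 controller whose assigned range contains addr.

    l2_ranges is a list of (start, end, controller) with end exclusive.
    """
    for start, end, controller in l2_ranges:
        if start <= addr < end:
            return controller
    raise LookupError(f"no L2 controller covers address {addr:#x}")


# Hypothetical mapping: two L2 banks, each owning half of a 4 KiB space.
EXAMPLE_RANGES = [(0x000, 0x800, "l2_0"), (0x800, 0x1000, "l2_1")]
```

Changing the address->controller mapping then only means changing the range
list, not the L1 controller.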

Change-Id: Ie67999bb977566939432a5045f65dbd2da81816a
Signed-off-by: Tiago Muck <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18410
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-05-14 22:01:12 +00:00
Tiago Muck
7b84e3ba58 mem-ruby: Fix MOESI_CMP_dir debug msg
Change-Id: I3fd32bd2e81dbf9a8ea49a43727564b8a9d64767
Signed-off-by: Tiago Muck <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18409
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-05-14 22:01:12 +00:00
Tiago Muck
b98b648797 mem-ruby: Prevent response stalls on MOESI_CMP_directory
When a message triggers a transition that has actions which allocate
TBEs, the generated code automatically includes a check for the TBETable
size before executing any action. If the table is full, the transition
returns TransitionResult_ResourceStall and no more messages from the
buffer are handled (until the next cycle).

This behavior may lead to deadlocks in the MOESI_CMP_directory protocol
since events triggered by the response queue may allocate TBEs (e.g.
L2 replacements triggered by the response queue). If the table is full,
the queue is stalled preventing other responses from freeing TBEs.

This patch fixes this by handling WRITEBACK_DIRTY_DATA/CLEAN_DATA messages
as requests and WB_ACK/WB_NACK as responses. All controllers are changed
to work with the new types. With this fix, responses are always
handled first in all controllers, and no response triggers TBE
allocations.

Change-Id: I377c0ec4f06d528e9f0541daf3dcc621184f2524
Signed-off-by: Tiago Muck <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18408
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: John Alsop <johnathan.alsop@amd.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-05-14 22:01:12 +00:00
Javier Bueno
abd33d6fd2 arch-arm: Do not check MustBeOne flag for TLB requests from the prefetcher
Allow TLB requests generated from prefetchers to override the
MustBeOne arch flag. This allows the prefetchers to issue requests
without having to know architecture-specific flags.

Change-Id: Id83e0c93f3d1a614da11c4f344ab4dc594423672
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18768
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-14 14:56:34 +00:00
Giacomo Travaglini
27378ecbe4 Revert "cpu: fix how a thread starts up in MinorCPU"
This reverts commit 02dafc5498.
The commit was part of a patchset which broke MinorCPU regressions
(switcheroo)

Change-Id: I0a8098fc71abe5838014e587dbe372b258d8aa9f
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18604
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-14 08:44:37 +00:00
Giacomo Travaglini
9a1eb7a3d2 Revert "cpu: stop scheduling suspended threads in MinorCPU"
This reverts commit 6a6668bbc4.
The commit was part of a patchset which broke MinorCPU regressions
(switcheroo)

Change-Id: I3c16a6478ba44b9d27cdd3d64a710a356999df05
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18603
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-14 08:44:37 +00:00
Giacomo Travaglini
9852c5d96b Revert "cpu: fix branching when thread is suspended in MinorCPU"
This reverts commit e437086341.
The commit was part of a patchset which broke MinorCPU regressions
(switcheroo)

Change-Id: Ib8482034c2402008ccfa552325a8eb31e731b619
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18602
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-14 08:44:37 +00:00
Daniel
c1bd27907d mem-cache: Use SatCounter for prefetchers
Many prefetchers re-implement saturating counters with ints. Make
them use SatCounters instead.

Added the missing operators and constructors to SatCounter to make
that possible, along with their respective tests.

Change-Id: I36f10c89c27c9b3d1bf461e9ea546920f6ebb888
Signed-off-by: Daniel <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17995
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Javier Bueno Hedo <javier.bueno@metempsy.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-05-14 07:55:06 +00:00
Daniel
50a533f101 base: Add operators to SatCounter
Add shift, add and subtract assignment operators, as well as
copy and move constructor and assignments to SatCounter, so
that they can be used by the prefetchers.

Also add extra useful functions to calculate saturation
percentile so that the instantiator does not need to be aware
of the counter's maximum value.
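
For illustration, a saturating counter with these kinds of operators could
look like the Python sketch below. It is not the gem5 C++ class; the method
and attribute names are assumptions.

```python
class SatCounter:
    """Unsigned counter that clamps to [0, 2**bits - 1] (illustrative only)."""

    def __init__(self, bits, initial=0):
        if bits <= 0:
            raise ValueError("bits must be positive")
        self.max_value = (1 << bits) - 1
        if not 0 <= initial <= self.max_value:
            raise ValueError("initial value does not fit in the counter")
        self.value = initial

    def _clamp(self, value):
        return max(0, min(self.max_value, value))

    def __iadd__(self, amount):      # add assignment saturates at max_value
        self.value = self._clamp(self.value + amount)
        return self

    def __isub__(self, amount):      # subtract assignment saturates at 0
        self.value = self._clamp(self.value - amount)
        return self

    def __ilshift__(self, shift):    # left shift saturates at max_value
        self.value = self._clamp(self.value << shift)
        return self

    def __irshift__(self, shift):    # right shift cannot leave the range
        self.value >>= shift
        return self

    def saturation(self):
        """Fraction of the maximum value, so callers need not know it."""
        return self.value / self.max_value
```

A caller can then write `if counter.saturation() > 0.5:` without knowing how
many bits the counter has.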

Change-Id: I61d0cb28c8375b9d2774a39011e4a0aa6fe9ccb7
Signed-off-by: Daniel <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17996
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-05-14 07:55:06 +00:00
Daniel
5b3f91daf6 base: Add GTest to SatCounter
Add a GTest to the SatCounter class.

Change-Id: Iaf1b18db9fe8d7fe32e0e40c7947dcd1fd6cc33b
Signed-off-by: Daniel <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17994
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
2019-05-14 07:55:06 +00:00
Daniel
578dd474b9 base: Move SatCounter to base directory
Saturating counters are used by many objects, not only
the cpu predictors. Therefore, move the class to the
base folder so that it can be more easily used.

Change-Id: I26f799324bdd8720ab8834c72a2002149cee777c
Signed-off-by: Daniel <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17993
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
2019-05-14 07:55:06 +00:00
Daniel
0def916836 cpu: Revamp saturating counters
Revamp the SatCounter class, improving comments, implementing
increment, decrement and read operators to solve an old todo,
and adding missing error checking.

Change-Id: Ia057c423c90652ebd966b6b91a3471b17800f933
Signed-off-by: Daniel <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17992
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-14 07:55:06 +00:00
Jairo Balart
0473f8f65f cpu: Make the indirect predictor into a SimObject
Change-Id: Ice6549773def7d3e944fae450d4a079bc351e2ba
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/15319
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-13 11:43:50 +00:00
Daniel R. Carvalho
129101524a mem-ruby: Replace string parameter in MultiBitSelBloomFilter
Replace the string parameter of MultiBitSelBloomFilter's constructor
with its tokenized counterparts.

Change-Id: I2e3db109dc4814fa0e9c13259f1136a6c4083092
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18728
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-12 17:22:01 +00:00
Giacomo Gabrielli
8ddec45de4 arch-arm: Add initial support for SVE contiguous loads/stores
Thanks to Pau Cabre and Adria Armejach Sanosa for their contribution
of bugfixes.

Change-Id: If8983cf85d95cddb187c90967a94ddfe2414bc46
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13519
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Giacomo Travaglini <giacomo.travaglini@arm.com>
2019-05-11 12:49:15 +00:00
Giacomo Gabrielli
c58cb8c9db cpu,mem: Add support for partial loads/stores and wide mem. accesses
This changeset adds support for partial (or masked) loads/stores, i.e.
loads/stores that can disable accesses to individual bytes within the
target address range.  In addition, this changeset extends the code to
crack memory accesses across most CPU models (TimingSimpleCPU still
TBD), so that arbitrarily wide memory accesses are supported.  These
changes are required for supporting ISAs with wide vectors.
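
A small Python sketch of the two ideas, assuming a per-byte enable mask and
splitting at 64-byte line boundaries. Both function names and the line size
are assumptions made for illustration, not the CPU model code.

```python
def masked_store(memory, addr, data, byte_enable):
    """Write only the bytes of data whose enable flag is set."""
    if len(data) != len(byte_enable):
        raise ValueError("data and byte_enable must have the same length")
    for i, (byte, enabled) in enumerate(zip(data, byte_enable)):
        if enabled:
            memory[addr + i] = byte


def split_access(addr, size, line_size=64):
    """Crack a wide access into (addr, size) pieces that stay within a line."""
    pieces = []
    end = addr + size
    while addr < end:
        line_end = (addr // line_size + 1) * line_size
        piece_end = min(end, line_end)
        pieces.append((addr, piece_end - addr))
        addr = piece_end
    return pieces
```

For example, a 16-byte access at address 60 is cracked into a 4-byte piece and
a 12-byte piece.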

Additional authors:
- Gabor Dozsa <gabor.dozsa@arm.com>
- Tiago Muck <tiago.muck@arm.com>

Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-05-11 12:48:58 +00:00
Giacomo Gabrielli
d0e4cdc9c3 cpu: Add a memory access predicate
This changeset introduces a new predicate to guard memory accesses.
The most immediate use for this is to allow proper handling of
predicated-false vector contiguous loads and predicated-false
micro-ops of vector gather loads (added in separate changesets).

Change-Id: Ice6894fe150faec2f2f7ab796a00c99ac843810a
Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17991
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bradley Wang <radwang@ucdavis.edu>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-05-11 09:34:27 +00:00
Tiago Muck
8c34a1a677 mem-ruby: Fix MOESI_CMP_directory blocked line handling
Using recycle in the L2 controllers to put messages back into the buffer
may lead to starvation when there are many L1 requests for the same line.
This can easily trigger the deadlock detection mechanism in configurations
with many cores (16+). Replacing recycle by stall_and_wait for L1
requests avoids this issue. wakeUpBuffers calls were added to all
transitions from transient to stable states.

Change-Id: I28b8aeacc48919ccf38e69653cd9205a4153514b
Signed-off-by: Tiago Muck <tiago.muck@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17568
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Matthew Poremba <matthew.poremba@amd.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-09 15:39:27 +00:00
Daniel R. Carvalho
bf0a722acd mem-cache: Remove writebacks packet list
Previously all atomic writebacks concerned a single block,
therefore, when a block was evicted, no other block would be
pending eviction. With sector tags (and compression),
however, a single replacement can generate many evictions.

This can cause problems, since a writeback that evicts a block
may evict blocks in the lower cache. If one of these conflicts
with one of the blocks pending eviction in the higher level, the
snoop must inform the lower level of it. Since atomic mode does
not have a writebuffer, this kind of conflict wouldn't be noticed.

Therefore, instead of evicting multiple blocks at once, we
do it one by one.

Change-Id: I2fc2f9eb0f26248ddf91adbe987d158f5a2e592b
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18209
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-05-08 17:41:09 +00:00
Daniel R. Carvalho
e54c7a68f8 mem-cache: Handle data expansion
When a block in compressed form is overwritten, it may change
its size. If the new compressed size is bigger, and the total
size becomes bigger than the block size, one or more blocks
will have to be evicted. This is called data expansion, or
fat writes.

This change assumes that a first level cache cannot have a
compressor, since otherwise data expansion should have been
handled for atomic operations and writes. As such, data
expansions should only be seen on writebacks. As writebacks
are forwarded to the next level when failed, there should
be no data expansions when servicing misses either.

This patch adds the functionality to handle data expansions
by evicting the co-allocated blocks to make room for an
expanded block.
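
As a rough Python sketch of the decision: the function name, the 512-bit
block size and the choice to store an incompressible block at full size are
assumptions for illustration only.

```python
def evictions_for_expansion(other_sizes_bits, new_size_bits,
                            block_size_bits=512):
    """Return indices of co-allocated blocks to evict after an overwrite.

    other_sizes_bits holds the compressed sizes of the blocks that share
    the data entry with the block being overwritten.
    """
    # Assumed: a block that fails to compress occupies a full block.
    new_size_bits = min(new_size_bits, block_size_bits)
    if new_size_bits + sum(other_sizes_bits) <= block_size_bits:
        return []  # still fits, no data expansion
    # Data expansion: evict the co-allocated blocks to make room.
    return list(range(len(other_sizes_bits)))
```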

Change-Id: I0bd77bf6446bfae336889940b2f75d6f0c87e533
Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/12087
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-08 17:41:09 +00:00
Daniel R. Carvalho
273aacfe48 mem-cache: Add co-allocation function to compressed tags
Implement a co-allocation function in compressed tags, so
that compressed blocks can be co-allocated in a superblock.
Co-allocation is possible when compression ratio (CR) blocks
that share a superblock tag can be compressed to up to (100/CR)%
of their size.
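
The condition can be written as a one-line check. The sketch below is Python
with hypothetical names, not the compressed tags code.

```python
def can_co_allocate(compressed_size_bits, block_size_bits, compression_ratio):
    """True if the block compresses to at most (100/CR)% of its size."""
    # Multiply instead of dividing to stay in integer arithmetic.
    return compressed_size_bits * compression_ratio <= block_size_bits
```

With a 512-bit block and CR = 2, a block compressed to 256 bits can be
co-allocated, while one compressed to 257 bits cannot.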

Change-Id: I937cc1fcbb488e70309cb5478c12db65f1b4b23f
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11411
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-05-08 17:41:09 +00:00
Daniel R. Carvalho
a39af1f0ac mem-cache: Add compression and decompression calls
Add a compressor to the base cache class and compress within
block allocation and decompress on writebacks.

This change does not implement data expansion (fat writes) yet,
nor does it add the compression latency to the block write time.

Change-Id: Ie36db65f7487c9b05ec4aedebc2c7651b4cb4821
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11410
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-05-08 17:41:09 +00:00
Daniel R. Carvalho
77a49860f9 mem-cache: Create BDI Compressor
Implement Base-Delta-Immediate compression, as described in
'Base-Delta-Immediate Compression: Practical Data Compression
for On-Chip Caches'
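
A heavily simplified Python sketch of the base+delta idea: one base, one word
size and one delta size. The scheme in the paper also tries several word and
delta sizes and handles immediates; the names and defaults here are
assumptions.

```python
def base_delta_compress(line, word_bytes=8, delta_bytes=1):
    """Return (base, deltas) if every word is close to the first word."""
    if not line or len(line) % word_bytes:
        raise ValueError("line must be a non-empty multiple of word_bytes")
    words = [int.from_bytes(line[i:i + word_bytes], "little")
             for i in range(0, len(line), word_bytes)]
    base = words[0]
    limit = 1 << (8 * delta_bytes - 1)   # signed delta range
    deltas = [word - base for word in words]
    if all(-limit <= delta < limit for delta in deltas):
        return base, deltas
    return None                          # not compressible with this setup


def base_delta_decompress(base, deltas, word_bytes=8):
    """Rebuild the original line from the base and the deltas."""
    return b"".join((base + delta).to_bytes(word_bytes, "little")
                    for delta in deltas)
```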

Change-Id: I7980c340ab53a086b748f4b2108de4adc775fac8
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11412
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-05-08 17:41:09 +00:00
Daniel R. Carvalho
0e276f6512 mem-cache: Add compression stats
Add compression statistics to the compressors. It tracks
the number of blocks that can fit into a certain power
of two size, and the number of decompressions.

For example, if a block is compressed to 100 bits, it will
belong to the 128-bits compression size. Although it could
also fit bigger sizes, they are not taken into account for
the stats (i.e., the 100-bit compression will fit only the
128-bits size, not 256 or higher).

We save stats for compressions that fail (i.e., compressed
size is bigger than original cache line size).
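
The bucketing can be sketched in Python as rounding up to the next power of
two. The function names are hypothetical and not taken from the compressor
code.

```python
def compression_size_bucket(size_bits):
    """Smallest power-of-two size (in bits) that holds size_bits."""
    if size_bits <= 0:
        raise ValueError("size must be positive")
    return 1 << (size_bits - 1).bit_length()


def compression_failed(size_bits, line_size_bits):
    """A compression fails when the result is bigger than the line."""
    return size_bits > line_size_bits
```

A 100-bit result lands only in the 128-bit bucket, matching the example
above.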

Change-Id: Idab71a40a660e33259908ccd880e42a880b5ee06
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11103
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-05-08 17:41:09 +00:00
Daniel R. Carvalho
f21f4a049e mem-cache: Create cache compressor
Create basic template for cache compressors. A basic compressor
must implement a compression and a decompression method.

Change-Id: I83dc4d2b8d2bc5ed9f760c938edfa4ebdd6b8583
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11100
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-05-08 17:41:09 +00:00
Daniel R. Carvalho
4b6b068aa0 mem-cache: Add block size to findVictim
Add block size to findVictim. For standard caches it
will not be used. Compressed caches, however, need to
know the size of the compressed block to decide whether
a block is co-allocatable or not.

Change-Id: Id07f79763687b29f75d707c080fa9bd978a408aa
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11198
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Mohammad Seyedzadeh <sm.seyedzade@gmail.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-05-08 17:41:09 +00:00
Daniel R. Carvalho
784642b431 mem-cache: Add compression data to CompressionBlk
Add a compression bit, decompression latency and compressed
block size and their respective getters and setters.

Change-Id: Ia9d8656552d60e8d4e85fe5379dd75fc5adb0abe
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11102
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-05-08 17:41:09 +00:00
Daniel R. Carvalho
bba32e6df8 mem-cache: Create CacheComp debug flag
Create a debug flag for cache compression.

Change-Id: Id4b8e86d658d3aa550906ee0f8da3b54f4cdab7d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/11104
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-05-08 17:41:09 +00:00
Daniel R. Carvalho
e22a6c9180 mem-cache: Stub compression framework
Create a stub of a compression framework where we can have
multiple data blocks per tag entry. Only consecutive blocks
can share a tag as of now.

For each tag entry there can be multiple data blocks. We have
the same number of tags a conventional cache would have, but
we instantiate the maximum number of data blocks (according to
the compression ratio) per tag, to virtually implement
compression without increasing the complexity of the simulator.
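
A Python sketch of how consecutive blocks could map to a shared tag entry.
The function name, the 64-byte block size and the ratio of 2 are assumptions
for illustration.

```python
def superblock_location(addr, block_size=64, blocks_per_tag=2):
    """Return (tag entry id, data block index within that entry)."""
    block_number = addr // block_size
    return block_number // blocks_per_tag, block_number % blocks_per_tag
```

With these defaults, addresses 0 and 64 share tag entry 0, and address 128
starts tag entry 1.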

Change-Id: I549940c7afb2f744ab293ff8bb283967e7551a11
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/10763
Tested-by: kokoro <noreply+kokoro@google.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-05-08 17:41:09 +00:00
Gabor Dozsa
6bf8508fdc x86: Mark translation as delayed in case of a hw page table walk
This information is used by the LSQ in the O3 cpu (since commit
"51becd2... cpu-o3: O3 LSQ Generalisation")

Change-Id: I35fe7e2f8428641d863af0e79e28b0b259fb0b00
Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18508
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-07 09:42:45 +00:00
Andrea Mondelli
7a00e9d186 sim-se: correct statfs inclusion on !linux host
- Added missing header
- Fixed typo on __linux__ macro conditional
- s/ifdef/if defined/g for consistency

Change-Id: I83b69856e5ec8b23b707642c0e14216cf62db31e
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18668
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-06 18:20:44 +00:00
Alec Roelke
53e74695ac arch-riscv: Implement MHARTID CSR
This patch implements the MHARTID CSR by intercepting attempts to access
it, similar to the way accesses to the performance counters are
intercepted, to return the thread's context ID.

Change-Id: Ie14a31036fbe0e49fb3347ac0c3c508d9427a10d
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16988
Reviewed-by: Alec Roelke <alec.roelke@gmail.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Alec Roelke <alec.roelke@gmail.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-04 04:37:19 +00:00
Joe Gross
f75351acd7 sim-se: fix a few bugs/warns from GCC 6
Change-Id: Ib2ad860324fd234b23262d141be3e82628ff61f0
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/12126
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Brandon Potter <Brandon.Potter@amd.com>
Maintainer: Brandon Potter <Brandon.Potter@amd.com>
2019-05-03 19:16:34 +00:00
Brandon Potter
d692552e90 sim-se: add eventfd system call
Change-Id: I7aeb4fe808d0c8f2fb8041e3662d330d8458f09c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/12125
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Ciro Santilli <ciro.santilli@arm.com>
Maintainer: Brandon Potter <Brandon.Potter@amd.com>
2019-05-03 15:55:16 +00:00
Nikos Nikoleris
64687eee01 mem-cache: Mark block as dirty after a SWPrefetchEXResp
This is a workaround for a bug introduced from the change:
59e3585a8 arch-arm: We add PRFM PST instruction for arm
which can cause deadlocks in the memory system.

The design of the classic memory system in gem5 makes the following two
assumptions:
* A cache that fetches a block with an intention to modify it, becomes
  the point of ordering and therefore commits to respond to any snoop
  requests [1].
* A cache that fetches an exclusive copy of the block, does so with
  the intention to modify it [2]. Immediately after it receives the
  block, it will write to it and mark it as dirty. As the point of
  ordering, it responds to any outstanding snoops.

The current implementation of prefetch exclusive request breaks the
second assumption. A cache can fetch an exclusive block without an
immediate intention to modify it. If the block is not modified, it
will not be marked as dirty. However, the cache has committed to
respond to outstanding snoops, and if the block is clean it
won't. This can result in deadlocks where a snoop gets stuck waiting
for responses.

One solution (implemented by this patch) is to unconditionally mark
the block dirty when filling due to a prefetch exclusive request.
This makes the PrefetchExReq behave like a WriteReq. However, as it
may mark as dirty a clean block, it creates the requirement for an
unnecessary WritebackDirty in the future. In practice, this shouldn't be
a big problem unless the application is unnecessarily using prefetch
exclusive instructions.

Other solutions would require deeper changes to the design of the
memory system to handle this properly.

[1]: When a cache commits to respond, it "informs" the xbar/PoC (point
of coherence) and the other caches of its intention to respond. As a
result the request will not be sent to the main memory.
[2]: In fact the assumption is that in the needsWritable MSHR there is
at least one WriteReq before any snoops from other caches.

Change-Id: I378d3c0dadf25fc52e430b67102347b44d2f18ea
Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17729
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-03 14:52:53 +00:00
Avishai Tvila
b4b487e1ad arch-riscv,isa: Fix for compressed jump (c_j) imm
c_j(al) has a special format, called CJ.
The jump offset format is instbits[12:2] --> offset[11|4|9:8|10|6|7|3:1|5]
Currently in decoder.isa, the c_j format is JOp, so the imm and branchTarget are incorrect.
In the execute section (decoder.isa:228), the imm field is ignored and the offset is calculated correctly.
As a result, we get a decoder flush for each c_j instance.
I've added a CJOp format in compressed.isa and use it in the execute section.
In addition, c_j is mapped to jal zero, cj_imm, and is actually neither an indirect control nor a function call.
I fixed the flags accordingly.
I'll fix all IsRet, IsCall and IsIndirectControl flags for the rest of (c_)jal(r) in my next commit.
I ran coremark -O0 before my fix and got a 37.7% branch miss rate; after the fix the branch miss rate is <13%.
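
For reference, the CJ offset mapping given above can be decoded as in the
Python sketch below. The function name is hypothetical; the actual fix is the
CJOp format in compressed.isa.

```python
def cj_offset(inst):
    """Decode the signed jump offset of a 16-bit CJ-format instruction."""
    def bit(n):
        return (inst >> n) & 1

    # instbits[12:2] --> offset[11|4|9:8|10|6|7|3:1|5]
    offset = (
        (bit(12) << 11)
        | (bit(11) << 4)
        | (((inst >> 9) & 0x3) << 8)
        | (bit(8) << 10)
        | (bit(7) << 6)
        | (bit(6) << 7)
        | (((inst >> 3) & 0x7) << 1)
        | (bit(2) << 5)
    )
    if offset & (1 << 11):   # sign-extend from offset bit 11
        offset -= 1 << 12
    return offset
```

For example, 0xa001 decodes to an offset of 0 and 0xa009 to an offset of 2.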

Change-Id: I608d5894a78a1ebefe36f21e21aaea68b42bccfc
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17808
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Alec Roelke <alec.roelke@gmail.com>
2019-05-03 12:53:53 +00:00
Giacomo Travaglini
b6d60e82dd dev: StreamID generation in DMA device
This patch is adding a StreamID tag to any DMA Packet. StreamIDs are
tags which are used by IOMMUs to distinguish between different
devices/functions.

For PCI devices for example, the RID (Pci Bus number, Pci Device
number, Pci Function number) could be stored in the Packet streamID
field.

For the DmaDevice base class, a simple pair of (Sub)StreamIDs has been
provided.  This is basically attaching a fixed (decided at python config
time) streamID per device.  If a derived device wants to implement a
more elaborate packet tagger (for example if it wants to have more than
one streamID), it needs to pass a different StreamID and SubstreamID to
the DmaPort interface (like dmaAction).

Change-Id: Ia17cf00437f7d3eb79211c1374134b174f90de59
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16749
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-03 08:38:12 +00:00
Giacomo Travaglini
49a71ca1d0 dev-arm: Store a PhysProxy port in Gicv3Redist
This spares us from retrieving the TC pointer every time we want to
write/read to memory (LPIs)

Change-Id: Iad76b5e69188fa0ac5c6777a3b2664b0fc66b12f
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18600
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-02 14:42:42 +00:00
Giacomo Travaglini
3762721456 dev-arm: Add named variable for GICD_TYPER.IDBits
This could be used by other GICv3 components to query the maximum
number of implemented interrupt identifiers

Change-Id: I132e50de331aea22523260bcefba7e961b53eccd
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18599
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-02 14:42:42 +00:00
Giacomo Travaglini
5c891178b9 dev-arm: Read correct version of ICC_BPR register
Some methods like groupPriorityMask check the value of the binary point
registers. Those registers have a minimum value. Writes to those
registers take this into account, but a problem with the minimum
value arises when the value is checked before software writes to them.
In this case the minimum value won't be considered if the read is
directly forwarded to the ISA class.

Change-Id: Id432a37f1634b02bc478d65c52ffb88323d4bb77
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18598
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-02 14:42:42 +00:00
Giacomo Travaglini
5f29ec8a5e dev-arm: Get a Gicv3Redistributor ptr from phys address
The patch is adding the following method to Gicv3:

* Gicv3::getRedistributorByAddr
This will be needed by the ITS when trying to select the target
redistributor after decoding the collection table entry (RDBase).

Change-Id: I40e2c155f2fdc8ca6d3c20ff7a27702e02499f20
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18597
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-02 14:42:42 +00:00
Giacomo Travaglini
68f2f1c5f5 dev-arm: Add several LPI methods in Gicv3Redistributor
Refactoring the existing code into smaller methods will be crucial when
adding the ITS module, which is a client for the redistributor class and
which will require it to take different actions depending on the command
it receives from software.

List of methods:

* read/writeEntryLPI
Reading/Writing a byte from the LPI pending table

* isPendingLPI
Checks if the pINTID LPI is set. Knowing if an LPI is set is needed by
the MOVI command, which transfers the pending state from one
redistributor to the other only if the LPI is pending.

Change-Id: If14b1c28ff7f2aa20b12dcd822bf6a490cbe0270
Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18596
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-02 14:42:42 +00:00