derek/gem5 - gem5 - Gitea: Git with a cup of tea

derek/gem5

Author	SHA1	Message	Date
Gabe Black	55a3e35388	cpu,mem: Add or remove parenthesis to make the compiler happy. Remove extraneous parenthesis in an if condition, and add some parenthesis where an assignment was being used as a condition in a while loop. Change-Id: Ie12c74ac681ef042138e3b41f257ea1bb2ce4267 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/40956 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-02-19 04:43:54 +00:00
Hoa Nguyen	65bbd5fa2a	cpu: Add Units to cpu stats Change-Id: I387b2e9f6ecf62757242056f732bd443c457ebea Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/39095 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Bobby R. Bruce <bbruce@ucdavis.edu> Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu>	2021-02-10 09:03:09 +00:00
Gabe Black	344ea0330a	cpu: Replace fixed sized arrays in the O3 inst with variable arrays. The only way to allocate fixed sized arrays which will definitely be big enough for all source/destination registers for a given instruction is to track the maximum number of each at compile time, and then size the arrays appropriately. That creates a point of centralization which prevents breaking up decoder and instruction definitions into more modular pieces, and if multiple ISAs are ever built at once, would require coordination between all ISAs, and wasting memory for most of them. The dynamic allocation overhead is minimized by allocating the storage for all variable arrays in one chunk, and then placing the arrays there using placement new. There is still some overhead, although less than it might be otherwise. Change-Id: Id2c42869cba944deb97da01ca9e0e70186e22532 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/38384 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2021-02-05 03:04:56 +00:00
Hoa Nguyen	81c2978e6c	cpu,stats: Update stats style for base.hh and base.cc Change-Id: Ib34dcb294370ea66e3526ab35660d8b50668bebe Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36297 Reviewed-by: Gabe Black <gabe.black@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-11-19 22:46:48 +00:00
Hoa Nguyen	84fa2b46d9	cpu-o3,stats: Update stats style for mem_dep_unit.hh Change-Id: I9bd8e9bc331f5d57c1b6320a87b14e9b94465148 Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36215 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Bobby R. Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2020-11-19 22:46:48 +00:00
Hoa Nguyen	d9673c88d7	cpu-o3,stats: Update stats style of inst_queue & inst_queue_impl Change-Id: I95c2e194e757437fb8c3b3f530bce363e24f9a8e Signed-off-by: Hoa Nguyen <hoanguyen@ucdavis.edu> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/36176 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-11-19 22:46:48 +00:00
Gabe Black	91d83cc8a1	misc: Standardize the way create() constructs SimObjects. The create() method on Params structs usually instantiate SimObjects using a constructor which takes the Params struct as a parameter somehow. There has been a lot of needless variation in how that was done, making it annoying to pass Params down to base classes. Some of the different forms were: const Params & Params & Params * const Params * Params const* This change goes through and fixes up every constructor and every create() method to use the const Params & form. We use a reference because the Params struct should never be null. We use const because neither the create method nor the consuming object should modify the record of the parameters as they came in from the config. That would make consuming them not idempotent, and make it impossible to tell what the actual simulation configuration was since it would change from any user visible form (config script, config.ini, dot pdf output). Change-Id: I77453cba52fdcfd5f4eec92dfb0bddb5a9945f31 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/35938 Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-10-14 12:06:44 +00:00
Gabe Black	faf0af7a35	arch,cpu: Rearrange StaticInst flags for memory barriers. There were three different StaticInst flags for memory barriers, IsMemBarrier, IsReadBarrier, and IsWriteBarrier. IsReadBarrier was never used, and IsMemBarrier was for both loads and stores, so a composite of IsReadBarrier and IsWriteBarrier. This change gets rid of IsMemBarrier and replaces by setting IsReadBarrier and IsWriteBarrier at the same time. An isMemBarrier accessor is left, but is now implemented by checking if both of the other flags are set, and renamed to isFullMemBarrier to make it clear that it's checking both for both types of barrier, not one or the other. Change-Id: I702633a047f4777be4b180b42d62438ca69f52ea Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/33743 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-09-16 08:29:17 +00:00
Tiago Mück	8c47c0dd63	cpu-o3: fix IQ missing mem barriers After commit `e2a5063e5f` some memory references now tracked as barriers were not having their completion properly notified to the MemDepUnit. This patch fixes InstructionQueue and changes MemDepUnit's completeBarrier to completeInst, which now should be called for both memory references and barrier instructions. Change-Id: I28b5f112b45778f6272e71bb3766b364c3d2e7db Signed-off-by: Tiago Mück <tiago.muck@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/29654 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2020-06-02 18:39:14 +00:00
Gabe Black	6687265fe2	cpu: Delete authors lists from the cpu directory. Change-Id: Icfba8e23b5f6820a6ddefe1a50abbe5f8825b7b5 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/25444 Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>	2020-02-17 21:51:23 +00:00
Jordi Vaquero	fb9038ed23	cpu-o3: fix atomic instructions non-speculative Fix problem with O3 and AMO instructions. At initial stages amo instruction is considered a type of non-speculative store. After the instruction has been commited and during the squash step, acquire_release version of the AMO operation is considered speculative, that differents results in an assert fault. This fix ensures that AMO instructions are always considered non-speculative, during early stages and during squas/removal of the instruction. Change-Id: Ia0c5fbb9dc44a9991337b57eb759b1ed08e4149e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19815 Maintainer: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com>	2019-08-07 17:39:51 +00:00
Giacomo Gabrielli	fc61172dbe	cpu-o3: Add support for pinned writes This patch adds support for pinning registers for a certain number of consecutive writes. This is only relevant for timing CPU models (functional-only models are unaffected), and it is primarily needed to provide a realistic execution model for micro-coded operations whose microops can write to non-overlapping portions of a destination register, e.g. vector gather loads. In those cases, this mechanism can disable renaming for a sequence of consecutive writes, thus making the resulting execution more efficient: allocating a new physical register for each microop would introduce a read-modify-write chain of dependencies, while with these modifications the microops can write back in parallel. Please note that this new feature is only leveraged by O3CPU for the time being. Additional authors: - Gabor Dozsa <gabor.dozsa@arm.com> Change-Id: I07eb5fdbd1fa0b748c9bdc1174d9f330fda34f81 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13520 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>	2019-05-30 15:55:59 +00:00
Gabe Black	972c38b1cc	arch, base, cpu, dev, mem, sim: Remove #if 0-ed out code. This code will be preserved through version control, but otherwise creates clutter and will rot in place since it's never compiled. Change-Id: Id265f6deac445116843956ea5cf1210d8127274e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18608 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com>	2019-05-18 10:20:20 +00:00
Andrea Mondelli	e13d6dc9c0	misc: Removed inconsistency in O3* debug msgs Added consistency in the DEBUG message form, to allow a better parsing. Fixed sn/tid type parameter. Removed some annoying newlines Change-Id: I4761c49fc12b874a7d8b46779475b606865cad4b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17248 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>	2019-04-03 16:50:22 +00:00
Tuan Ta	25dc765889	cpu: support atomic memory request type with AtomicOpFunctor This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU, MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory system. Atomic memory instruction is treated as a special store instruction in all CPU models. In simple CPUs, an AMO request with an associated AtomicOpFunctor is simply sent to L1 dcache. In MinorCPU, an AMO request bypasses store buffer and waits for any conflicting store request(s) currently in the store buffer to retire before the AMO request is sent to the cache. AMO requests are not buffered in the store buffer, so their effects appear immediately in the cache. In DerivO3CPU, an AMO request is inserted in the store buffer so that it is delivered to the cache only after all previous stores are issued to the cache. Data forwarding between between an outstanding AMO in the store buffer and a subsequent load is not allowed since the AMO request does not hold valid data until it's executed in the cache. This implementation assumes that a target ISA implementation must insert enough memory fences as micro-ops around an atomic instruction to enforce a correct order of memory instructions with respect to its memory consistency model. Without extra memory fences, this implementation can allow AMOs and other memory instructions that do not conflict (i.e., not target the same address) to reorder. This implementation also assumes that atomic instructions execute within a cache line boundary since the cache for now is not able to execute an operation on two different cache lines in one single step. Therefore, ISAs like x86 that require multi-cache-line atomic instructions need to either use a pair of locking load and unlocking store or change the cache implementation to guarantee the atomicity of an atomic instruction. Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a Reviewed-on: https://gem5-review.googlesource.com/c/8188 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>	2019-02-08 15:27:04 +00:00
Giacomo Gabrielli	25474167e5	arch,cpu: Add vector predicate registers Latest-gen. vector/SIMD extensions, including the Arm Scalable Vector Extension (SVE), introduce the notion of a predicate register file. This changeset adds this feature across architectures and CPU models. Change-Id: Iebcadbad89c0a582ff8b1b70de353305db603946 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13715 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>	2019-01-30 16:57:54 +00:00
Rekai Gonzalez-Alberquilla	51becd2475	cpu-o3: O3 LSQ Generalisation This patch does a large modification of the LSQ in the O3 model. The main goal of the patch is to remove the 'an operation can be served with one or two memory requests' assumption that is present in the LSQ and the instruction with the req, reqLow, reqHigh triplet, and generalising it to operations that can be addressed with one request, and operations that require many requests, embodied in the SingleDataRequest and the SplitDataRequest. This modification has been done mimicking the minor model to an extent, shifting the responsibilities of dealing with VtoP translation and tracking the status and resources from the DynInst to the LSQ via the LSQRequest. The LSQRequest models the information concerning the operation, handles the creation of fragments for translation and request as well as assembling/splitting the data accordingly. With this modifications, the implementation of vector ISAs, particularly on the memory side, become more rich, as the new model permits a dissociation of the ISA characteristics as vector length, from the microarchitectural characteristics that govern how contiguous loads are executing, allowing exploration of different LSQ to DL1 bus widths to understand the tradeoffs in complexity and performance. Part of the complexities introduced stem from the fact that gem5 keeps a large amount of metadata regarding, in particular, memory operations, thus, when an instruction is squashed while some operation as TLB lookup or cache access is ongoing, when the relevant structure communicates to the LSQ that the operation is over, it tries to access some pieces of data that should have died when the instruction is squashed, leading to asserts, panics, or memory corruption. To ensure the correct behaviour, the LSQRequest rely on assesing who is their owner, and self-destroying if they detect their owner is done with the request, and there will be no subsequent action. For example, in the case of an instruction squashed whal the TLB is doing a walk to serve the translation, when the translation is served by the TLB, the LSQRequest detects that the instruction was squashed, and as the translation is done, no one else expect to access its information, and therefore, it self-destructs. Having destroyed the LSQRequest earlier, would lead to wrong behaviour as the TLB walk may access some fields of it. Additional authors: - Gabor Dozsa <gabor.dozsa@arm.com> Change-Id: I9578a1a3f6b899c390cdd886856a24db68ff7d0c Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13516 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>	2019-01-24 09:46:34 +00:00
Nikos Nikoleris	6825ef1941	cpu-o3: Make the smtIQPolicy a Param.ScopedEnum The smtIQPolicy is a parameter in the o3 cpu that can have 3 different values. Previously this setting was done through a string and a parser function would turn it into a c++ enum value. This changeset turns the string into a python Param.ScopedEnum. Change-Id: Ieecf0a19427dd250b0d5ae3d531ab46a37326ae5 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15398 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>	2019-01-17 11:09:08 +00:00
Rekai Gonzalez-Alberquilla	3bb49cb2b0	cpu,arch-arm: Initialise data members The value that is not initialized has a bogus value that manifests when using some debug-flags what makes the usage of tracediff a bit more challenging. In addition, while debugging with other techniques, it introduces the problem of understanding if the value of a field is 'intended' or just an effect of the lack of initialisation. Change-Id: Ied88caa77479c6f1d5166d80d1a1a057503cb106 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13125 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>	2018-11-28 14:12:35 +00:00
Gabe Black	12311c5540	arch, base, cpu, gpu, mem: Replace assert(0 or false with panic. Neither assert(0) nor assert(false) give any hint as to why control getting to them is bad, and their more descriptive versions, assert(0 && "description") and assert(false && "description"), jury rig assert to add an error message when the utility function panic() already does that directly with better formatting options. This change replaces that flavor of call to assert with panic, except in the actual code which processes the formatting that panic uses (to avoid infinitely recurring error handling), and in some *.sm files since I don't know what rules those have to follow and don't want to accidentaly break them. Change-Id: I8addfbfaf77eaed94ec8191f2ae4efb477cefdd0 Reviewed-on: https://gem5-review.googlesource.com/c/14636 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>	2018-11-27 21:58:24 +00:00
Rekai Gonzalez-Alberquilla	0c50a0b4fe	cpu: Fix the usage of const DynInstPtr Summary: Usage of const DynInstPtr& when possible and introduction of move operators to RefCountingPtr. In many places, scoped references to dynamic instructions do a copy of the DynInstPtr when a reference would do. This is detrimental to performance. On top of that, in case there is a need for reference tracking for debugging, the redundant copies make the process much more painful than it already is. Also, from the theoretical point of view, a function/method that defines a convenience name to access an instruction should not be considered an owner of the data, i.e., doing a copy and not a reference is not justified. On a related topic, C++11 introduces move semantics, and those are useful when, for example, there is a class modelling a HW structure that contains a list, and has a getHeadOfList function, to prevent doing a copy to an internal variable -> update pointer, remove from the list -> update pointer, return value making a copy to the assined variable -> update pointer, destroy the returned value -> update pointer. Change-Id: I3bb46c20ef23b6873b469fd22befb251ac44d2f6 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13105 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>	2018-11-16 10:39:03 +00:00
Hanhwi Jang	cfc3a3a628	cpu-o3: Missing freeing the heads of DepGraph in IQ squashing Free the squahsed instructions' heads of DepGraph in IQ squashing In a system with large register file (ex.2048), the number of DynInst hits the hardcoded limit (1500). This is caused by missing freeing the heads of DepGraph in IQ. IQ only clears out the heads when instructions reach writeback stage. If a instruction is squashed before writeback stage, its head of dependency graph, which holds the instruction's DynInstPtr, would not be cleared out. This prevents freeing the DynInst of the squahsed instruction even after it is committed. Change-Id: I05b3db93cb6ad8960183d7ae765149c7f292e5b3 Reviewed-on: https://gem5-review.googlesource.com/7481 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>	2018-07-24 00:52:51 +00:00
Andreas Sandberg	d3ecb5d406	cpu-o3: Add missing vector stat initializers All of the O3 vector stats added by 'arch: ISA parser additions of vector registers' are currently missing their stat initializers. Add the missing stat initialization to InstructionQueue::regStats. Change-Id: Idc4b8e2824120a2542d8a604340a1b41bde6aa28 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/6101 Reviewed-by: Gabe Black <gabeblack@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>	2017-11-28 14:15:36 +00:00
Rekai Gonzalez-Alberquilla	166da650a3	arch: ISA parser additions of vector registers Reiley's update :) of the isa parser definitions. My addition of the vector element operand concept for the ISA parser. Nathanael's modification creating a hierarchy between vector registers and its constituencies to the isa parser. Some fixes/updates on top to consider instructions as vectors instead of floating when they use the VectorRF. Some counters added to all the models to keep faithful counts. Change-Id: Id8f162a525240dfd7ba884c5a4d9fa69f4050101 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/2706 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>	2017-07-05 14:43:49 +00:00
Rekai Gonzalez-Alberquilla	00da089029	cpu: Added interface for vector reg file This patch adds some more functionality to the cpu model and the arch to interface with the vector register file. This change consists mainly of augmenting ThreadContexts and ExecContexts with calls to get/set full vectors, underlying microarchitectural elements or lanes. Those are meant to interface with the vector register file. All classes that implement this interface also get an appropriate implementation. This requires implementing the vector register file for the different models using the VecRegContainer class. This change set also updates the Result abstraction to contemplate the possibility of having a vector as result. The changes also affect how the remote_gdb connection works. There are some (nasty) side effects, such as the need to define dummy numPhysVecRegs parameter values for architectures that do not implement vector extensions. Nathanael Premillieu's work with an increasing number of fixes and improvements of mine. Change-Id: Iee65f4e8b03abfe1e94e6940a51b68d0977fd5bb Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> [ Fix RISCV build issues and CC reg free list initialisation ] Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/2705	2017-07-05 14:43:49 +00:00
Rekai Gonzalez-Alberquilla	a473b5a6eb	cpu: Simplify the rename interface and use RegId With the hierarchical RegId there are a lot of functions that are redundant now. The idea behind the simplification is that instead of having the regId, telling which kind of register read/write/rename/lookup/etc. and then the function panic_if'ing if the regId is not of the appropriate type, we provide an interface that decides what kind of register to read depending on the register type of the given regId. Change-Id: I7d52e9e21fc01205ae365d86921a4ceb67a57178 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> [ Fix RISCV build issues ] Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/2702	2017-07-05 14:43:49 +00:00
Nathanael Premillieu	43d833246f	cpu: Physical register structural + flat indexing Mimic the changes done on the architectural register indexes on the physical register indexes. This is specific to the O3 model. The structure, called PhysRegId, contains a register class, a register index and a flat register index. The flat register index is kept because it is useful in some cases where the type of register is not important (dependency graph and scoreboard for example). Instead of directly using the structure, most of the code is working with a const PhysRegId* (typedef to PhysRegIdPtr). The actual PhysRegId objects are stored in the regFile. Change-Id: Ic879a3cc608aa2f34e2168280faac1846de77667 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/2701 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>	2017-07-05 14:43:49 +00:00
Rekai Gonzalez Alberquilla	21f8242430	cpu: Change literal integer constants to meaningful labels fu_pool and inst_queue were using -1 for "no such FU" and -2 for "all those FUs are busy at the moment" when requesting for a FU and replying. This patch introduces new constants NoCapableFU and NoFreeFU respectively. In addition, the condition (idx == -2 \|\| idx != -1) is equivalent to (idx != -1), so this patch also simplifies that. --HG-- extra : rebase_source : 4833717b9d1e09d7594d1f34f882e13fc4b86846	2015-05-05 16:47:24 +01:00
Steve Reinhardt	5592798865	style: fix missing spaces in control statements Result of running 'hg m5style --skip-all --fix-control -a'.	2016-02-06 17:21:19 -08:00
Nilay Vaish	aafa5c3f86	revert 5af8f40d8f2c	2015-07-28 01:58:04 -05:00
Nilay Vaish	608641e23c	cpu: implements vector registers This adds a vector register type. The type is defined as a std::array of a fixed number of uint64_ts. The isa_parser.py has been modified to parse vector register operands and generate the required code. Different cpus have vector register files now.	2015-07-26 10:21:20 -05:00
Nilay Vaish	4333549575	cpu: o3: replace issueLatency with bool pipelined Currently, each op class has a parameter issueLat that denotes the cycles after which another op of the same class can be issued. As of now, this latency can either be one cycle (fully pipelined) or same as execution latency of the op (not at all pipelined). The fact that issueLat is a parameter of type Cycles makes one believe that it can be set to any value. To avoid the confusion, the parameter is being renamed as 'pipelined' with type boolean. If set to true, the op would execute in a fully pipelined fashion. Otherwise, it would execute in an unpipelined fashion.	2015-04-29 22:35:22 -05:00
Brandon Potter	a70a83155b	cpu: remove conditional check (count > 0) on o3 IQ squashes The o3 cpu instruction queue model uses the count variable to track the number of unissued instructions in the queue. Previously, the squash method used this variable to avoid executing the doSquash method when there were no unissued instructions in the pipeline. A corner case problem exists when only issued instructions exist in the pipeline and a squash occurs; the doSquash code is not invoked and subsequently does not clean up state properly.	2015-04-22 07:52:03 -07:00
Mitch Hayenga	5bfa521c46	cpu: Add writeback modeling for drain functionality It is possible for the O3 CPU to consider itself drained and later have a squashed instruction perform a writeback. This patch re-adds tracking of in-flight instructions to prevent falsely signaling a drained event.	2014-10-29 23:18:27 -05:00
Mitch Hayenga	6847bbf7ce	cpu: Add drain check functionality to IEW IEW did not check the instQueue and memDepUnit to ensure they were drained. This caused issues when drainSanityCheck() did check those structures after asserting IEW was drained.	2014-10-29 23:18:26 -05:00
Mitch Hayenga	4f13f676aa	cpu: Fix cache blocked load behavior in o3 cpu This patch fixes the load blocked/replay mechanism in the o3 cpu. Rather than flushing the entire pipeline, this patch replays loads once the cache becomes unblocked. Additionally, deferred memory instructions (loads which had conflicting stores), when replayed would not respect the number of functional units (only respected issue width). This patch also corrects that. Improvements over 20% have been observed on a microbenchmark designed to exercise this behavior.	2014-09-03 07:42:39 -04:00
Mitch Hayenga	976f27487b	cpu: Change writeback modeling for outstanding instructions As highlighed on the mailing list gem5's writeback modeling can impact performance. This patch removes the limitation on maximum outstanding issued instructions, however the number that can writeback in a single cycle is still respected in instToCommit().	2014-09-03 07:42:33 -04:00
Steve Reinhardt	0be64ffe2f	style: eliminate equality tests with true and false Using '== true' in a boolean expression is totally redundant, and using '== false' is pretty verbose (and arguably less readable in most cases) compared to '!'. It's somewhat of a pet peeve, perhaps, but I had some time waiting for some tests to run and decided to clean these up. Unfortunately, SLICC appears not to have the '!' operator, so I had to leave the '== false' tests in the SLICC code.	2014-05-31 18:00:23 -07:00
Giacomo Gabrielli	3436de0c2a	cpu: Add support for Memory+Barrier instruction types in O3 cpu.	2014-01-24 15:29:30 -06:00
Andreas Hansson	7db542c0dd	cpu: Relax check on squashed non-speculative instructions This patch relaxes the check performed when squashing non-speculative instructions, as it caused problems with loads that were marked ready, and then stalled on a blocked cache. The assertion is now allowing memory references to be non-faulting.	2014-01-24 15:29:29 -06:00
Matt Horsnell	6decd70bfb	cpu: add consistent guarding to *_impl.hh files.	2013-10-17 10:20:45 -05:00
Yasuko Eckert	2c293823aa	cpu: add a condition-code register class Add a third register class for condition codes, in parallel with the integer and FP classes. No ISAs use the CC class at this point though.	2013-10-15 14:22:44 -04:00
Andreas Sandberg	1814a85a05	cpu: Rewrite O3 draining to avoid stopping in microcode Previously, the O3 CPU could stop in the middle of a microcode sequence. This patch makes sure that the pipeline stops when it has committed a normal instruction or exited from a microcode sequence. Additionally, it makes sure that the pipeline has no instructions in flight when it is drained, which should make draining more robust. Draining is controlled in the commit stage, which checks if the next PC after a committed instruction is in microcode. If this isn't the case, it requests a squash of all instructions after that the instruction that just committed and immediately signals a drain stall to the fetch stage. The CPU then continues to execute until the pipeline and all associated buffers are empty.	2013-01-07 13:05:46 -05:00
Andreas Hansson	287ea1a081	Param: Transition to Cycles for relevant parameters This patch is a first step to using Cycles as a parameter type. The main affected modules are the CPUs and the Ruby caches. There are definitely plenty more places that are affected, but this patch serves as a starting point to making the transition. An important part of this patch is to actually enable parameters to be specified as Param.Cycles which involves some changes to params.py.	2012-09-07 12:34:38 -04:00
Andreas Hansson	0cacf7e817	Clock: Add a Cycles wrapper class and use where applicable This patch addresses the comments and feedback on the preceding patch that reworks the clocks and now more clearly shows where cycles (relative cycle counts) are used to express time. Instead of bumping the existing patch I chose to make this a separate patch, merely to try and focus the discussion around a smaller set of changes. The two patches will be pushed together though. This changes done as part of this patch are mostly following directly from the introduction of the wrapper class, and change enough code to make things compile and run again. There are definitely more places where int/uint/Tick is still used to represent cycles, and it will take some time to chase them all down. Similarly, a lot of parameters should be changed from Param.Tick and Param.Unsigned to Param.Cycles. In addition, the use of curTick is questionable as there should not be an absolute cycle. Potential solutions can be built on top of this patch. There is a similar situation in the o3 CPU where lastRunningCycle is currently counting in Cycles, and is still an absolute time. More discussion to be had in other words. An additional change that would be appropriate in the future is to perform a similar wrapping of Tick and probably also introduce a Ticks class along with suitable operators for all these classes.	2012-08-28 14:30:33 -04:00
Andreas Hansson	d53d04473e	Clock: Rework clocks to avoid tick-to-cycle transformations This patch introduces the notion of a clock update function that aims to avoid costly divisions when turning the current tick into a cycle. Each clocked object advances a private (hidden) cycle member and a tick member and uses these to implement functions for getting the tick of the next cycle, or the tick of a cycle some time in the future. In the different modules using the clocks, changes are made to avoid counting in ticks only to later translate to cycles. There are a few oddities in how the O3 and inorder CPU count idle cycles, as seen by a few locations where a cycle is subtracted in the calculation. This is done such that the regression does not change any stats, but should be revisited in a future patch. Another, much needed, change that is not done as part of this patch is to introduce a new typedef uint64_t Cycle to be able to at least hint at the unit of the variables counting Ticks vs Cycles. This will be done as a follow-up patch. As an additional follow up, the thread context still uses ticks for the book keeping of last activate and last suspend and this should probably also be changed into cycles as well.	2012-08-28 14:30:31 -04:00
Ali Saidi	6df196b71e	O3: Clean up the O3 structures and try to pack them a bit better. DynInst is extremely large the hope is that this re-organization will put the most used members close to each other.	2012-06-05 01:23:09 -04:00
Koan-Sin Tan	7d4f187700	clang: Enable compiling gem5 using clang 2.9 and 3.0 This patch adds the necessary flags to the SConstruct and SConscript files for compiling using clang 2.9 and later (on Ubuntu et al and OSX XCode 4.2), and also cleans up a bunch of compiler warnings found by clang. Most of the warnings are related to hidden virtual functions, comparisons with unsigneds >= 0, and if-statements with empty bodies. A number of mismatches between struct and class are also fixed. clang 2.8 is not working as it has problems with class names that occur in multiple namespaces (e.g. Statistics in kernel_stats.hh). clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which causes confusion between the container std::set and the function Packet::set, and this is currently addressed by not including the entire namespace std, but rather selecting e.g. "using std::vector" in the appropriate places.	2012-01-31 12:05:52 -05:00
Steve Reinhardt	84f0a1bd91	event: minor cleanup Initialize flags via the Event constructor instead of calling setFlags() in the body of the derived class's constructor. I forget exactly why, but this made life easier when implementing multi-queue support. Also rename Event::getFlags() to isFlagSet() to better match common usage, and get rid of some unused Event methods.	2011-09-22 18:59:55 -07:00
Gabe Black	a9b7931156	O3: Let squashed and deferred instructions issue. Let squahsed and deferred instructions issue so they don't accumulate and clog up the CPU.	2011-08-07 15:41:07 -07:00

1 2

93 Commits