derek/gem5

Author	SHA1	Message	Date
Bobby R. Bruce	7137b73ca0	cpu: Fix `std::min` type mismatch in reg_class.hh (#1266 ) Introduced in #1234, this caused compilation to fail on Apple Silicon systems. This bug is the same as #582 where a more detailed explanation is provided.	2024-06-20 13:02:08 -07:00
Mahyar Samani	7ff1e381c9	cpu,stdlib: Fix Access Trace for Accessing Indices in SpatterGen (#1258 ) This change fixes the way indices are generated in a multi generator setup. It changes it from all cores generating the same trace of indices for accessing the index array to each core generating an interleaved subset of indices. For an example look below for traces (indices to index array) in a 2 core setup. Before: core_0: 0, 1, 2, 3, 4, 5, 6, 7, ... core_1: 0, 1, 2, 3, 4, 5, 6, 7, ... After: core_0: 0, 1, 2, 3, 8, 9, 10, 11, ... core_1: 4, 5, 6, 7, 12, 13, 14, 15, ... Additionally, this change fixes the SpatterKernel class in the standard library to comply with the change in the SpatterGen source code.	2024-06-20 11:24:44 -07:00
Bobby R. Bruce	36f73f671d	cpu,stdlib: Adding Spatter (#1136 ) This PR adds source code for C++ implementation of SpatterGen as well as SpatterKernel. SpatterGen uses a PyBindMethod to add kernels to the backend code. This way the process of processing json files could be offloaded to python. In addition it adds standard library components for SpatterGenCore and SpatterGen. These two components follow the same structure as AbstractCore and AbstractProcessor. In addition spatter_kernel.py adds a definition for SpatterKernel in python to make adding kernels to C++ easier. Also it adds utility functions for parsing dictionaries read from json as well as partitioning traces for multicore setups.	2024-06-17 15:28:45 -07:00
Hoa Nguyen	15e0236a8b	arch,cpu,sim: Add mechanism to partially print vector regs (#1234 ) Currently, gem5's inst tracer prints the whole vector register container by default. The size of vector register containers in gem5 is the maximum size allowed by the ISA. For vector-length agnostic (VLA) vector registers, this means ARM SVE vector container is 2048 bits long, and RISC-V vector container is 65535 bits long. Note that VLA implementation in gem5 allows the vector length to be varied within the limit specified by the ISAs. However, in most use cases of gem5, the vector length is much less than 65535 bits. This causes two issues: (1) the vector container requires allocating and moving around a large amount of unused data while only a fraction of it is used, and (2) printing the execution trace of a vector register results in a wall of text with a small amount of useful data. This change addresses problem (2) by providing a mechanism to limit the amount of data printed by the instruction tracer. This is done by adding a function printing the first X bits of a vector register container, where X is the vector length determined at runtime, as opposed to the vector container size, which is determined at compilation time. Change-Id: I815fa5aa738373510afcfb0d544a5b19c40dc0c7 --------- Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2024-06-17 14:05:47 -07:00
Giacomo Travaglini	2804311f7b	cpu-o3: Revert "Do not set Executed on load instruction to be replayed" (#1251 ) Reverts gem5/gem5#1182 This is breaking O3 execution. Investigating the matter	2024-06-17 12:24:43 -07:00
Mahyar Samani	6695e5ef70	cpu: Adding SpatterGen This change adds source code for SpatterGen ClockedObject. The set of source code pushed includes code for SpatterKernel that tracks whether information is being gathered or scattered as well as the list of indices to be accessed. This model has PyBindMethod to add SpatterKernels from python. This way all the preparations for kernels can be done in python. SpatterGen has a few parameters that model limits on a few hardware resources in the backend of a processor, e.g. number of functional units to calculate effective address, the latency of calculating effective address, number of integer registers. Change-Id: I451ffb385180a914e884cab220928c5f1944b2e3	2024-06-14 10:45:09 -07:00
Minje Jun	b8e21a2d32	cpu-o3: Do not set Executed on load instruction to be replayed (#1182 ) A load instruction can be replayed when 1) it's strictly ordered or 2) it falls into load-store forwarding mismatch. Case 1 was considered in executeLoad function but the case 2 wasn't. It causes the case-2 replayed load instruction to violate the assertion condition "assert(!load_inst->isExecuted())" in LSQUnit::read. This commit fixes the problem by adding consideration of the case 2 in LSQUnit::executeLoad. Co-authored-by: Minje Jun <minje.jun@samsung.com>	2024-06-14 10:12:26 -07:00
Jason Lowe-Power	21ffd91529	cpu,arch: Add IsInvalid flag to Unknown insts (#1071 ) The IsInvalid flag indicates that the static instruction is not part of the executing ISA and not part of m5's pseudo-instructions. This flag provides a way to recognize an illegal instruction at the decode stage.	2024-06-13 16:26:35 -07:00
Harshil Patel	74afea471d	cpu: Revert "Don't change to suspend if the thread status is halted" (#1225 ) Reverts gem5/gem5#1039	2024-06-12 00:20:06 -07:00
Hoa Nguyen	369029d2be	cpu: Add IsInvalid flag to StaticInstFlags The IsInvalid flag indicates that the static instruction is not part of the executing ISA and not part of m5's pseudo-instructions. This flag provides a way to recognize an illegal instruction at the decode stage. Change-Id: I2779c6edcd8c5e6a77ea11cad3ff73bacb79d800 Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2024-06-11 18:48:29 +00:00
Ivana Mitrovic	a764b9be1c	Revert "arch-x86: Fix TLB Assertion Error on CFLUSH" (#1196 ) Reverts gem5/gem5#1080 as it is not a good fix.	2024-06-04 10:26:53 -07:00
Lukas Zenick	dad5c7b6f7	arch-x86: Fix TLB Assertion Error on CFLUSH (#1080 ) Fixed the assertion statement in the cpu's translation.hh file so that it doesn't fail the assertion if the cache is clean. I compile this c code to `test` ```c #include <stdio.h> static inline void clflush(volatile void *p) { __asm__ volatile ("clflush (%0)" : : "r"(p) : "memory"); } int main() { int data = 42; // Example variable printf("Value before clflush: %d\n", data); clflush(&data); printf("Value after clflush: %d\n", data); return 0; } ``` And run it with this script `./build/X86/gem5.opt configs/learning_gem5/part1/two_level.py ./test` In order to verify that it no longer fails the assertion check. GitHub Issue: #862 Change-Id: I6004662e7c99f637ba0ddb07d205d1657708e99f	2024-06-03 10:17:10 -07:00
Harshil Patel	0824d7f2cd	Revert "cpu-kvm: Support perf counters on hybrid host architectures" (#1127 ) Reverts gem5/gem5#1065 Reverting this change because this PR breaks X86 kvm as mentioned in the issue #1126.	2024-05-21 08:14:10 -07:00
Yu-Cheng Chang	321bd07163	cpu: Don't change to suspend if the thread status is halted (#1039 ) In our gem5 model, four states represent a thread context: Active, Suspend, Halting and Halted `5641c5e464/src/cpu/thread_context.hh (L99-L117)` When the gem5 instance is initialized, all of the thread contexts are set to Halted. A thread context does not become active until the Workload starts up, except with the StubWorkload. So if the user uses the StubWorkload and the CPU is connected to the model_reset port, the thread context of the CPU may be activated. The following steps activate the thread context of the CPU without Workload[1] initialization or lowering the model_reset port[2]. 1. Raise the model_reset port (Change the state from Halted to Suspend) `5641c5e464/src/cpu/base.cc (L671-L673)` 2. Post the interrupt to CPU (Change the state from Suspend to Active) `5641c5e464/src/cpu/base.cc (L231-L239)` Implementation of wakeup SimpleCPU: `5641c5e464/src/cpu/simple/base.cc (L251-L259)` MinorCPU: `5641c5e464/src/cpu/minor/cpu.cc (L143-L151)` O3CPU: `5641c5e464/src/cpu/o3/cpu.cc (L1337-L1346)` This CL fixes the issue that occurs when raising the model_reset port to the CPU (letting the CPU sleep) if the CPU has not been activated by the workload. If the CPU status is Halted, it should not change to Suspend, to avoid a wakeup. Reference: The model_reset is introduced in the CL: https://gem5-review.googlesource.com/c/public/gem5/+/67574/4 [1] Activate by workload (ARM example): `5641c5e464/src/arch/arm/fs_workload.cc (L101-L114)` [2] Lower the model_reset: `5641c5e464/src/cpu/base.cc (L191-L192)` `5641c5e464/src/cpu/base.cc (L674-L685)` Change-Id: I5bfc0b7491d14369fff77b98b71c0ac763fb7c42	2024-05-16 10:02:53 -07:00
OdnetninI (Eduardo José Gómez Hernández)	17cbbd84ae	cpu: Indirect predictor track conditional indirect (#1077 ) As discussed in https://github.com/orgs/gem5/discussions/954: In the refactor made by commit `f65df9b959` conditional indirect branches are no longer updated in the indirect predictor. This kind of branch does not exist in x86 or Arm, but it is present in PowerPC. This patch enables the indirect predictor to track this kind of branch.	2024-04-29 11:38:22 +01:00
Nicholas Mosier	c679c9c127	cpu-o3: prioritize exiting threads when committing (#1056 ) Fix #1055. Prioritize committing from exiting threads before we consider other threads using the specified SMT commit policy. All instructions in the ROB for exiting threads should already have been squashed. Thus, this ensures that the ROB instruction queues for all exiting threads will be empty at the end of the current cycle, avoiding the assertion failure encountered in #1055. Change-Id: Ib0178a1aa6e94bce2b6c49dd87750e82776639dc	2024-04-25 11:15:14 -07:00
Nicholas Mosier	51d546cb06	cpu-o3: Clear current macro-op in fetch if squashing after last micro-op (#1047 ) Fix #1042. Clear the current fetch macro-op if the instruction initiating the squash is the last micro-op in its macro-op. Change-Id: I77f60334771277e47f19573d4067b3a7bc5488b2	2024-04-25 11:14:58 -07:00
Nicholas Mosier	cf5ec880c9	cpu-kvm: Support overflows when migrating across hybrid cores Add support for event overflows when the host thread migrates across different types of cores on a hybrid host architecture. This patch achieves this by simply halving the sample period for each performance counter. Since there are two types of cores, this guarantees that an overflow event will trigger before N events occur, where N is the requested period (e.g., number of instructions to simulate). This may result in many early triggers (up to log2(N)) before the requested period is reached. However, gem5's existing bookkeeping logic already handles this case properly: if fewer events than requested occurred, it will set a new period (N - observed) and resume execution. This loop will exit once N events have actually occurred. Change-Id: Iff85237da1ae1aa25bc2045fbf9091726291fe36	2024-04-24 09:47:46 -07:00
Nicholas Mosier	30ea15009f	cpu-kvm: Support perf counters on hybrid host architectures Fix #1064 by adding support for hardware performance counters on hybrid architectures like Intel Alder Lake. Hybrid architectures have multiple types of cores, each of which requires the instantiation of a separate performance counter. The KVM CPU's PerfKvmCounter class was not aware of this, and only instantiated a single performance counter, implicitly bound to the P-core only. This meant that if gem5 ever ran on an E-core, the various hardware performance counters would not get updated properly, in some cases always zero (e.g., for the number of instructions executed). This patch adds support for hybrid host architectures as follows. First, we convert PerfKvmCounter into an abstract class, which has two concrete implementations: SimplePerfKvmCounter and HybridPerfKvmCounter. The former is used for non-hybrid architectures or for non-hardware performance counters and is functionally equivalent to the prior implementation of PerfKvmCounter. The latter is used for instantiating hardware performance counters (i.e., of type PERF_TYPE_HARDWARE) on hybrid host architectures. It does so by internally instantiating two SimplePerfKvmCounters, one for a P-core and one for an E-core. Upon read, it sums the results of reading the two internal counters. Change-Id: If64fcb0e2fcc1b3a6a37d77455c2b21e1fc81150	2024-04-24 09:47:46 -07:00
Jason Lowe-Power	c13aa7727d	cpu: Fix Ruby/x86 pio port connections (#1035 ) Fixes #1033 In the BaseCPU object _uncached_interrupt_response_ports is a class variable, not an instance variable. #1004 changed the explicit self._uncached_interrupt_response_ports to use extend. This caused the list of ports to be extended for all cores, which caused problems when using a system with more than 1 core. This reverts the `extend` part of the change, but keeps the rest. Change-Id: I6dc7d6da6763048d82960229d34933a3a2ac36e0 Signed-off-by: Jason Lowe-Power <jason@lowepower.com>	2024-04-17 08:20:04 -07:00
Yu-Cheng Chang	ebb70dea99	cpu: Fix KVM false negative warning after Kconfig transition (#1013 ) When we start to build gem5, we read and process all of the SConsopts files, and run the after_sconsopts_callbacks after all of the SConsopts files have been read. The KVM_ISA env setting can be set in different files; take x86 and arm as examples: KVM_ISA default value: `bc39283451/src/cpu/kvm/SConsopts` x86 KVM_ISA: `bc39283451/src/arch/x86/kvm/SConsopts (L39-L45)` arm KVM_ISA: `bc39283451/src/arch/arm/kvm/SConsopts (L35-L36)` We should move the KVM warning to after all of the SConsopts env settings have been read. Issue: https://github.com/gem5/gem5/issues/686 Change-Id: I096c6bebaaec18f9b2af93191d0dd23c65084eda	2024-04-12 09:23:56 -07:00
Nicholas Mosier	bc39283451	cpu-o3, arch-x86: initialize interrupts for all SMT threads (#1007 ) Fix issue #1004. When enabling SMT with the O3 cpu, only the first interrupts object was getting initialized properly. This patch initializes all interrupts objects, one per SMT thread. Change-Id: I300782b645bd8ea3ef2497278fb73125ab4bf495	2024-04-11 11:17:24 -07:00
Ivan Fernandez	c91d1253de	cpu: This commit updates cpu FUs according to new Simd types This commit updates cpu by removing VectorXXX types and updates FUs according to the newer SimdXXX ones. This is part of the homogenization of RISCV Vector instruction types, which moved from VectorXXX to SimdXXX. Change-Id: I84baccd099b73a11cf26dd714487a9f272671d3d	2024-03-25 19:01:47 +01:00
Ivan Fernandez	1e743fd85a	arch-riscv: adding vector unit-stride segment stores to RISC-V (#913 ) This commit adds support for vector unit-stride segment store operations for RISC-V (vssegXeXX). This implementation is based on two types of microops: - VsSegIntrlv microops that properly interleave source registers into structs. - VsSeg microops that store data in memory as contiguous structs of several fields. Change-Id: Id80dd4e781743a60eb76c18b6a28061f8e9f723d Gem5 issue: https://github.com/gem5/gem5/issues/382	2024-03-22 15:45:58 -07:00
Ivan Fernandez	f6c61836b3	arch-riscv: adding vector unit-stride segment loads to RISC-V (#851 ) This commit adds support for vector unit-stride segment load operations for RISC-V (vlseg<NF>e<X>). This implementation is based on two types of microops: - VlSeg microops that load data as it is organized in memory in structs of several fields. - VectorDeIntrlv microops that properly deinterleave structs into destination registers. Gem5 issue: https://github.com/gem5/gem5/issues/382	2024-03-06 11:27:06 -08:00
Giacomo Travaglini	8759131df3	cpu-o3, arch: Fix SMT bug arising from v23.0 and make gem5 more robust with SMT (#828 ) This PR is fixing https://github.com/gem5/gem5/issues/668. It fixes it for all ISAs other than Arm with the first commit, which is setting the number of architectural Matrix registers to 0 for those ISAs which are not using them. It then partly fixes it for Arm as well with the 2nd commit: by removing RenameMap::numFreeEntries we don't stall renaming unless a matrix instruction is encountered... This means most binaries will run with SMT as long as they don't use FEAT_SME instructions. Please note: this is not simply an SMT fix, it will generally address a shortcoming in the way we were renaming instructions. If an Arm binary wants to use SMT with FEAT_SME, the 4th commit will make sure the lack of physical registers is notified explicitly at the beginning of simulation, rather than silently blocking renaming	2024-02-19 08:52:31 +00:00
Arnabjyoti Kalita	b826d96f40	cpu-o3: add PerThreadUnifiedThreadMap to O3 CPU (#842 ) Github issue: https://github.com/gem5/gem5/issues/373 Change-Id: I1c8aba9bc5ea4e45faa6c174780904b8bd618604	2024-02-12 09:26:31 -08:00
Giacomo Travaglini	4eb0cd44fc	cpu-o3: Restrict constraint on number of physical registers Having the number of physical registers matching exactly the number of architectural ones does not guarantee a proper execution as it means the freeList would have 0 registers available for renaming. In this case the worst would happen: renaming would silently stall execution indefinitely. With this change we report the issue to the user and fail execution Change-Id: I1eb968802f1a1a5115012f44b541542a682f887d Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-02-02 21:18:32 +00:00
Giacomo Travaglini	1fb7c1ad7e	cpu-o3: Rename numFreeEntries into minFreeEntries Change-Id: I89faeb001ebdcbc90ea88508f8d231ec6e7fe197 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-02-02 18:11:47 +00:00
Giacomo Travaglini	86158de220	cpu-o3: Stop using RenameMap::numFreeEntries The method is extracting the minimum number of [1] non-zero free registers/entries across all register classes. This means that if we have saturated all register storage for a particular class, renaming will stop as a whole. I believe it does make sense to keep renaming and only block renaming in case an instruction requiring the particular register type is encountered. This would happen with the Rename::renameInsts method [1]: https://github.com/gem5/gem5/blob/stable/src/cpu/o3/rename_map.hh#L269 [2]: https://github.com/gem5/gem5/blob/stable/src/cpu/o3/rename.cc#L662 Change-Id: I932826a77a5c0b2e05d8fdcab0e6ca13cf0e3d23 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-02-02 18:11:47 +00:00
Mahyar Samani	b79fe82e5c	cpu,stdlib: Updating strided generator (#762 ) This change improves the functionality of strided generator to create trace with better flexibility. It allows the user to manually set offset and stride size instead of calculating it based on a "gen_id". This way different patterns could be created with the same SimObject. In addition, this change adds stdlib components for strided generator.	2024-02-01 09:08:42 -08:00
Matthew Poremba	63caa780c2	misc: Remove all references to GCN3 Replace instances of "GCN3" with Vega. Remove gfx801 and gfx803. Rename FIJI to Vega and Carrizo to Raven. Using misc since there is not enough room to fit all the tags. Change-Id: Ibafc939d49a69be9068107a906e878408c7a5891	2024-01-17 11:11:06 -06:00
Bobby R. Bruce	213d0b0bfe	cpu: 'suppressFuncErrors' -> 'pkt->suppressFuncError()' fix Change-Id: If4aa71e9f6332df2a3daa51b69eaad97f6603f6b	2023-12-20 09:15:15 -08:00
Hoa Nguyen	7a5052b3a0	arch-arm: Only build ArmCapstoneDisassembler when ISA is arm (#553 ) Currently, if the Capstone header file is found in the host system, scons will try to build the ArmCapstoneDisassembler regardless of the gem5 target ISA. This causes a problem when the host has Capstone, but the gem5 target ISA is not arm. Compiling gem5 in this case will cause errors, e.g., ArmISA and ArmSystem are not found. This change aims to prevent building the ArmCapstoneDisassembler when the gem5 target ISA is not arm. Ref: [1] The Arm Capstone PR https://github.com/gem5/gem5/pull/494 Change-Id: I1e714d34aec8fe2a2af8cd351536951053a4d8a5	2023-12-03 13:22:11 -08:00
Richard Cooper	2fbbdad618	base: Add encapsulation to the loader::Symbol class This commit converts `gem5::loader::Symbol` to a full class with private members, enforcing encapsulation. Until now client code has been able to (and does) access members directly. This change will enable class invariants to be enforced via accessor methods. Change-Id: Ia0b5b080d4f656637a211808e13dce1ddca74541 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>	2023-12-01 22:00:26 +00:00
Andreas Sandberg	dcdebec0f6	misc,python: Add `isort` hook to pre-commit (#431 )	2023-11-30 09:54:12 +00:00
Bobby R. Bruce	d11c40dcac	misc: Run `pre-commit run --all-files` This ensures `isort` is applied to all files in the repo. Change-Id: Ib7ced1c924ef1639542bf0d1a01c5737f6ba43e9	2023-11-29 22:06:41 -08:00
Adrià Armejach	eb13b32314	cpu-o3: Fix discarded requests str-ld forwarding (#614 ) With the use of large RVV vectors (i.e., 8K or 16K bits) and a limited number of cacheLoadPorts, some loads take multiple cycles to execute. This triggered certain conditions when store-to-load forwarding happens in the middle of the execution of a load that already has outstanding packets. First, after store-to-load forwarding the request is marked as discarded and the load is immediately written back, which triggers a writebackDone that tries to delete the request, triggering an assert as it still has outstanding packets. This patch avoids deleting the request, leaving it self-owned; it will be deleted when the last packet arrives in packetReplied. Second, this patch avoids checking snoops on discarded requests by checking if the request exists. Change-Id: Icea0add0327929d3a6af7e6dd0af9945cb0d0970 Co-authored-by: Adrià Armejach <adria.armejach@bsc.es>	2023-11-29 08:45:03 -08:00
Andreas Sandberg	0c30353c59	cpu: Require BTB hit to detect branches. (#493 ) In a high performance CPU there is no other way than a BTB hit to know about a branch instruction and its type. For low-end CPUs, pre-decoding might sit in front of the BPU to provide this information. Currently, the BPU models only low-end behavior and updates the RAS and the indirect branch prediction even without a BTB hit. This patch adds three things to model the correct behavior for high-end CPUs. 1. A check before the RAS and indirect predictor whether there was a BTB hit or not. Only for BTB hits the BPU will consolidate RAS, and indirect predictor. 2. Since this check requires a BTB hit for indirect branches, they must also be installed into the BTB. For returns this was already done. 3. Finally, the BTB update previously happened at squash (decode or commit). Since this can be out-of-order that means branches from the false path can get installed without ever being retired.	2023-11-28 09:39:14 +00:00
Roger Chang	4d632cb73f	scons: Add new config option HAVE_CAPSTONE to Kconfig The config option HAVE_CAPSTONE was added in the previous PR [1] and the Kconfig options should be kept in sync with it. [1] https://github.com/gem5/gem5/pull/494 Change-Id: Id83718bc825f53d87d37d6ac930b96371209bdb3	2023-11-23 08:26:11 +08:00
Roger Chang	d758df4b5c	scons: Update the Kconfig build options The CL updates the Kconfig: 1. Replace the USE_NULL_ISA with BUILD_ISA 2. The USE_XXX_ISAs depend on BUILD_ISA 3. If the BUILD_ISA is set, at least one of USE_XXX_ISAs must be set 4. Refactor the USE_KVM option Change-Id: I2a600dea9fb671263b0191c46c5790ebbe91a7b8	2023-11-23 08:26:11 +08:00
Gabe Black	db3a6e8e84	scons: Use Kconfig to configure gem5. These are not yet consumed by anything, but convert all the settings from SCons variables to Kconfig variables. If you have existing SConsopts files which need to be converted, you should take a look at KCONFIG.md to learn about how kconfig is used in gem5. You should decide if any variables need to be available to C++ or kconfig itself, and whether those are options which should be detected automatically, or should be up to the user. Options which should be measured automatically should still be in SConsopts files, while user facing options should be added to new or existing Kconfig files. Generally, make sure you're storing c++/kconfig visible options in env['CONF'][...]. Also remove references to sticky_vars since persistent options should now be handled with kconfig, and export_vars since everything in env['CONF'] is now exported automatically. Switch SCons/gem5 to use Kconfig for configuration, except EXTRAS which is still a sticky SCons variable. This is necessary because EXTRAS also controls what config options exist. If it came from Kconfig itself, then there would be a circular dependency. This dependency could theoretically be handled by reparsing the Kconfig when EXTRAS directories were added or removed, but that would be complicated, and isn't supported by kconfiglib. It wouldn't be worth the significant effort it would take to add it, just to use Kconfig more purely. Change-Id: I29ab1940b2d7b0e6635a490452d05befe5b4a2c9	2023-11-23 08:26:10 +08:00
David Schall	94879c2410	cpu: Require BTB hit to detect branches. In a high performance CPU there is no other way than a BTB hit to know about a branch instruction and its type. For low-end CPUs, pre-decoding might sit in front of the BPU to provide this information. Currently, the BPU models only low-end behavior and updates the RAS and the indirect branch prediction even without a BTB hit. This patch adds two things to model the correct behavior for high-end CPUs. 1. A check before the RAS and indirect predictor whether there was a BTB hit or not. Only for BTB hits the BPU will consolidate RAS, and indirect predictor. 2. Since this check requires a BTB hit for indirect branches, they must also be installed into the BTB. For returns this was already done. Change-Id: Ibef9aa890f180efe547c82f41fc71f457c988a89 Signed-off-by: David Schall <david.schall@ed.ac.uk>	2023-11-16 12:35:10 +00:00
Daniel Kouchekinia	dde3d10aea	cpu: Remove SLC bit restraint for GPU tester (#552 ) This reverts gem5#133, the temporary work-around for gem5#131, allowing both SLC and GLC atomic requests to be made in the GPU tester. The underlying issues behind gem5#131 have been resolved by gem5#367 and gem5#397.	2023-11-14 03:47:34 -08:00
Andreas Sandberg	60290c7c2f	cpu: Branch Predictor Refactoring (#455 ) Major refactoring of the branch predictor unit. - Clearer control flow of the main branch predictor - Remove `uncondBranch` and `btbUpdate` functions in favor of a common `historyUpdate` function. There is now only one lookup function for conditional branches and the new `historyUpdate` for speculative history update. - Added a new target provider class. - More expressive statistics depending on the different branch types. - Cleanup the branch history management	2023-10-26 09:15:11 +01:00
David Schall	ccbb85c67f	cpu: Branch Predictor Refactoring Major refactoring of the branch predictor unit. - Clearer control flow of the main branch predictor - Remove `uncondBranch` and `btbUpdate` functions in favour of a common `historyUpdate` function. There is now only one lookup function for conditional branches and the new `historyUpdate` for speculative history update. - Added a new target provider class. - More expressive statistics depending on the different branch types. - Cleanup the branch history management Change-Id: I21fa555b5663e4abad7c836fc1d41a9c8b205263 Signed-off-by: David Schall <david.schall@ed.ac.uk>	2023-10-24 18:53:20 +00:00
Giacomo Travaglini	82675648c8	cpu: Implement a CapstoneDisassembler Capstone is an open source disassembler [1] already used by other projects (like QEMU). gem5 is already capable of disassembling instructions. Every StaticInst is supposed to define a generateDisassembly method which returns the instruction mnemonic (opcode + operand list) as a string. This "distributed" implementation of a disassembler relies on the developer to properly populate the metadata fields of the base instruction class. The growing complexity of the ISA code and the massive reuse of base classes beyond their intended use has led to a disassembling logic which contains several bugs. By allowing a tracer to rely on a third party disassembler, we fill the instruction trace with a more trustworthy instruction stream. This will help any trace parsing tool work better and it will also allow us to spot/fix our own bugs by comparing instruction traces with native vs custom disassembler [1]: http://www.capstone-engine.org/ Change-Id: I3c4db5072c03d2731265d0398d3863c101dcb180 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-10-20 09:27:51 +01:00
Giacomo Travaglini	237bbf0e42	cpu: Disassemble through the InstDisassembler in the ExeTracer Change-Id: I4a0c585b9b8824a0694066bef0ee004f68407111 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-10-20 09:27:50 +01:00
Giacomo Travaglini	952c4f5eea	cpu: Pass a reference of the parent tracer to the ExeTracerRecord Change-Id: I3576df2b7bee1289db60bb6072bd9c90038ca8ce Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2023-10-20 09:27:50 +01:00
Harshil Patel	7bd0b99635	tests: Changed percent atomics to 0 in memtest to fix daily test (#477 )	2023-10-18 10:09:45 -07:00
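
The interleaved index traces described in commit 7ff1e381c9 (SpatterGen, #1258) can be illustrated with a minimal Python sketch. This is not the SpatterGen or SpatterKernel source: the function name `interleaved_indices` and its parameters are hypothetical, and the block size of 4 is only inferred from the 2-core example trace in that commit message.

```python
def interleaved_indices(core_id, num_cores, block_size, count):
    """Return the first `count` index-array positions for one core.

    Hypothetical helper: each core takes `block_size` consecutive
    positions, then skips the blocks that belong to the other cores.
    """
    stride = num_cores * block_size  # distance between a core's blocks
    base = core_id * block_size      # start of this core's first block
    out = []
    while len(out) < count:
        for offset in range(block_size):
            if len(out) == count:
                break
            out.append(base + offset)
        base += stride
    return out


if __name__ == "__main__":
    # Assumed setup: 2 cores, blocks of 4 indices, first 8 accesses each.
    for core in range(2):
        print(f"core_{core}:", interleaved_indices(core, 2, 4, 8))
```

With these assumed parameters, core 0 gets 0, 1, 2, 3, 8, 9, 10, 11 and core 1 gets 4, 5, 6, 7, 12, 13, 14, 15, matching the "After" traces quoted in the commit message.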

2625 Commits