derek/gem5

Author	SHA1	Message	Date
studyztp	3a2cfb2dee	cpu: fix looppoint analysis param python string spacing Change-Id: I98fe434f1066f12b975425e49baca6e6a6087dab	2024-12-02 08:33:14 -08:00
studyztp	0f0a6a7851	cpu: fix pc count pair helper function return type Change the helper function's return type from int to uint64_t Change-Id: I34b6b563a6333bbf8516a16d2ad4b76b7c16bfe4	2024-12-02 08:33:14 -08:00
studyztp	4ce0f20436	cpu: make PcCountPair use 64 bit unsigned int for count In PcCountPair param, change the type for "count" from 32 bit int to 64 bit unsigned int. Change-Id: I2dc1bb2692914f06eaaae9bd5bbfb061bcbbfb8b	2024-12-02 08:33:14 -08:00
studyztp	6a9db637ae	cpu: add function to get inst map of each basic block Change-Id: I147d8c90cdfc7bf795d1c6a6daf96e11fa1c0858	2024-12-02 08:33:14 -08:00
studyztp	7ffa3646bd	cpu: fix the incorrect debug message Change-Id: I062e359e8c9205a9a993a33865434922c1f540b8	2024-12-02 08:33:14 -08:00
studyztp	1410c29147	cpu: modified after review feedback src/cpu/simple/probes/LooppointAnalysis.py: - remove default values for bb_valid_addr_range and marker_valid_addr_range - add more comments to explain parameter behaviors - add citation to the LoopPoint paper src/cpu/simple/probes/looppoint_analysis.cc: - fix the incorrect styles - remove updateBackwardBranch() function call - match the style of checking if listeners vector is empty - change the way of stopListening() to remove the listeners through the manager instead of through the ProbeListener object's destructor. src/cpu/simple/probes/looppoint_analysis.hh: - removed backwardBranchPC and use the backwardBranchCounter to replace its functionality. Therefore, also removed updateBackwardBranch function. Change-Id: Id2430e2f04e61f72d5c4f1aad5cfd4d24a0fbc45	2024-12-02 08:33:14 -08:00
studyztp	89717eca3c	cpu: add more debug flags Change-Id: I4edd8f383294f76d3e76895d3a631cba21a45f90	2024-12-02 08:33:14 -08:00
studyztp	753d9971d2	cpu: add more comments to looppoint_analysis.cc Change-Id: I027db66ffed0cd5957bae2a9a36286ca1c73c313	2024-12-02 08:33:14 -08:00
studyztp	a1072357c1	cpu: fix an issue Change-Id: Iab621e294c84c7f5c704882b0c681f950ad08f9c	2024-12-02 08:33:13 -08:00
studyztp	abc8a4a483	cpu: fix a wrong file path Change-Id: I93343f4053c7a6d1bd4b6972a1e7c3dbc073c979	2024-12-02 08:33:13 -08:00
studyztp	cd29b199ce	cpu: add the python class Add the python classes for the LooppointAnalysis and the LooppointAnalysis Manager. Change-Id: I0a882bc1a9ef03b7b482e871a7160e7c33f9ac08	2024-12-02 08:33:13 -08:00
studyztp	e10fff4876	cpu: add looppoint_analysis.cc content Add LooppointAnalysis and LooppointAnalysisManager function definitions Change-Id: I1c05072ebf1b744ee102a82f8de2b93bab4a056f	2024-12-02 08:33:13 -08:00
studyztp	fff6c895fe	cpu: add comments and improve naming in looppoint_analysis.hh Add comments to most variables and functions. Change the naming of some variables and functions to improve clarity. Change-Id: Idb557ec84698b4344ed4683f5de87b1a3c2fd66d	2024-12-02 08:33:13 -08:00
studyztp	3c7c7b8b54	cpu: add looppoint_analysis.hh content and licenses In looppoint_analysis.hh, added LooppointAnalysis and LooppointAnalysisManager classes. Added all functions and variables for the classes. Comments needed. Change-Id: Ia7425b672ef092a68c99b702136850bfa1fcf0a2	2024-12-02 08:33:13 -08:00
studyztp	157d89e255	cpu: add basic files for LoopPoint analysis Because the LoopPoint analysis will be done with the ATOMIC CPU, all files related to the LoopPoint analysis object will be under /src/cpu/simple/probes. Change-Id: Icbdb0742b712a23dc8f6a19f4c1c827a1f5bf288	2024-12-02 08:33:13 -08:00
studyztp	0d16c92341	cpu: add comments and change input type to list Change inst_threshold param to inst_thresholds, which now expects a list of thresholds instead of one threshold. Add more getter and setter functions: addThreshold: it is for adding new thresholds getCounter: it is for getting the current counter getThresholds: it returns the list of targeted thresholds resetThresholds: it clears all the targeted thresholds Change-Id: I48d022effe7b315112ac150e6a4eaf5aab41c514	2024-11-18 11:24:26 -08:00
studyztp	627734e830	cpu: clear listeners list Change-Id: Ie9d664df1b29a0ba62174046a7ab1fda6753bef4	2024-11-18 11:24:19 -08:00
studyztp	0b0a8431dc	cpu: delete listener ptrs after removal The listener pointer does not get deleted with the removeListener() function call, so we need to make sure it is deleted in the ProbeListenerObject. Change-Id: I370f34651b889c8c00a378743e9c1c09fa1d775e	2024-11-18 11:24:11 -08:00
studyztp	9fede07f44	cpu: modified with review feedback x86-global-inst-tracker.py: - change the incorrect use of comment style - add more comments about the usage of the script and the purpose of the script src/cpu/probes/inst_tracker.cc: - change the way of stopListening to use the manager function to remove listeners. If in the future, the ProbeListener object does not call the manager to remove itself in the destruction, then we should call it here. - fix styling src/cpu/probes/inst_tracker.hh: - fix styling Change-Id: I6f3d745e15883a8a702593f72f984e0d4cc4c526	2024-11-18 11:24:04 -08:00
studyztp	ddb29819ee	cpu: reorder functions, add more debug flags and comments Change-Id: I94bd4771130441a8e2e449a7527e87ba5c355236	2024-11-18 11:23:50 -08:00
studyztp	66d3f7c038	cpu: add GlobalInstTracker and LocalInstTracker The GlobalInstTracker manages the global instruction counter and is responsible for triggering an exit event when the global instruction counter reaches the defined threshold. The LocalInstTracker listens to one core's retiredInsts probe point and updates the GlobalInstTracker every time there is an instruction committed. The purpose of this instruction tracker is to raise an instruction executed exit event with multi-core simulation. Related discussion can be found: https://github.com/gem5/gem5/issues/1087 Change-Id: Iab6fec57f14f28e590b035506282130ba8662706	2024-11-18 11:23:34 -08:00
aperais	b82ab5ac89	misc: Do not share the random number generator across components (#1534 ) Components that require randomness should not share their randomness source with other components to avoid simulation noise. For instance, the branch predictor of one core should not impact the random cache replacement policy of the cache of another core. This currently happens as all components share a single random number generator. This PR provides their own generators to relevant components, although a couple components still use rand(). Change-Id: I3fb7226111c9194ee457af0f0f2b83f8c7b69d1e Co-authored-by: Arthur Perais <arthur.perais@univ-grenoble-alpes.fr>	2024-11-18 01:37:12 -08:00
Yu-Cheng Chang	8b1075b792	arch, cpu: Add generic getValidAddr to correct exetrace symbol table (#1758 ) The getValidAddr method gets the virtual address with only its valid bits. It is useful for looking up the correct symbol table entry via the valid virtual address. For ARM, we have `purifyTaggedAddr` to get the right virtual address. For RISC-V, we only take the lower 32 bits in RV32 mode to get the right symbol table. Change-Id: I33ad7bec6e7ea4ec82cb1b3a7f521432c6d735b6	2024-11-08 13:33:53 +00:00
Pranith	50f652a2ee	Implement BTB using the cache library (#1537 ) This enables the BTB to be associative and use various replacement policies.	2024-10-10 17:05:22 +01:00
Erin (Jianghua) Le	feeb3b2d67	cpu: fix simInsts and simOps not resetting (#1615 ) This PR fixes the bug where simInsts and simOps don't reset when m5.stats.reset() is called. The stats hostInstRate and hostOpRate are affected by this change as well, as they depend on simInsts and simOps respectively. This is related to issue 1443 linked [here](https://github.com/gem5/gem5/issues/1443).	2024-10-09 19:49:43 -07:00
Yu-Cheng Chang	402a030ce1	cpu,arch,arch-riscv: Check wake up signal when post interrupt (#1641 ) RISC-V does not specify how to handle wake-up from an interrupt signal. In the SiFive U74 core, the hart will wake up if there is any enabled pending interrupt. [1] Section 14.3.1 https://sifive.cdn.prismic.io/sifive/ad5577a0-9a00-45c9-a5d0-424a3d586060_u74_core_complex_manual_21G3.pdf	2024-10-08 08:51:38 -07:00
Matthew Poremba	4f7b3ed827	mem-ruby: Remove static methods from RubySystem (#1453 ) There are several parts to this PR to work towards #1349 . (1) Make RubySystem::getBlockSizeBytes non-static by providing ways to access the block size or passing the block size explicitly to classes. The main changes are: - DataBlocks must be explicitly allocated. A default ctor still exists to avoid needing to heavily modify SLICC. The size can be set using a realloc function, operator=, or copy ctor. This is handled completely transparently meaning no protocol or config changes are required. - WriteMask now requires block size to be set. This is also handled transparently by modifying the SLICC parser to identify WriteMask types and call setBlockSize(). - AbstractCacheEntry and TBE classes now require block size to be set. This is handled transparently by modifying the SLICC parser to identify these classes and call initBlockSize() which calls setBlockSize() for any DataBlock or WriteMask. - All AbstractControllers now have a pointer to RubySystem. This is assigned in SLICC generated code and requires no changes to protocol or configs. - The Ruby Message class now requires block size in all constructors. This is added to the argument list automatically by the SLICC parser. (2) Relax dependence on common functions in src/mem/ruby/common/Address.hh so that RubySystem::getBlockSizeBits is no longer static. Many classes already have a way to get block size from the previous commit, so they simply multiply by 8 to get the number of bits. For handling SLICC and reducing the number of changes, define makeCacheLine, getOffset, etc. in RubyPort and AbstractController. The only protocol changes required are to change any "RubySystem::foo()" calls with "m_ruby_system->foo()". For classes which do not have a way to get access to block size but still used makeLineAddress, getOffset, etc., the block size must be passed to that class. 
This requires some changes to the SimObject interface for two commonly used classes: DirectoryMemory and RubyPrefetcher, resulting in user-facing API changes. User-facing API changes: - DirectoryMemory and RubyPrefetcher now require the cache line size as a non-optional argument. - RubySequencer SimObjects now require RubySystem as a non-optional argument. - TesterThread in the GPU ruby tester now requires the cache line size as a non-optional argument. (3) Removes static member variables in RubySystem which control randomization, cooldown, and warmup. These are mostly used by the Ruby Network. The network classes are modified to take these former static variables as parameters which are passed to the corresponding method (e.g., enqueue, delayHead, etc.) rather than needing a RubySystem object at all. Change-Id: Ia63c2ad5cf0bf9d1cbdffba5d3a679bb4d3b1220 (4) There are two major SLICC generated static methods: getNumControllers() on each cache controller which returns the number of controllers created by the configs at run time and the functions which access this method, which are MachineType_base_count and MachineType_base_number. These need to be removed to create multiple RubySystem objects otherwise NetDest, version value, and other objects are incorrect. To remove the static requirement, MachineType_base_count and MachineType_base_number are moved to RubySystem. Any class which needs to call these methods must now have a pointer to a RubySystem. To enable that, several changes are made: - RubyRequest and Message now require a RubySystem pointer in the constructor. The pointer is passed to fields in the Message class which require a RubySystem pointer (e.g., NetDest). SLICC is modified to do this automatically. - SLICC structures may now optionally take an "implicit constructor" which can be used to call a non-default constructor for locally defined variables (e.g., temporary variables within SLICC actions). 
A statement such as "NetDest bcast_dest;" in SLICC will implicitly append a call to the NetDest constructor taking RubySystem, for example. - RubySystem gets passed to Ruby network objects (Network, Topology).	2024-10-08 08:14:50 -07:00
Giacomo Travaglini	4a3e2633d2	cpu-o3: Add Matrix OpDesc to the O3 Default FU (#1640 ) There was a bug exposed by a recent PR [1] where until recently the O3 CPU was executing an instruction even if it did not have the required functional unit in the FU pool. We are adding the matrix descriptors to the Default FU pool in the O3 cpu so that no panic is encountered upon execution of a matrix instruction [1]: https://github.com/gem5/gem5/pull/1516 Change-Id: I04250255a2cbb2ee6f3ef204b62bc2c1ee2d4d2c Reviewed-by: Richard Cooper <richard.cooper@arm.com> Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-10-08 10:23:14 +01:00
Giacomo Travaglini	440999e447	cpu-o3: Add Crypto OpDesc to the O3 Default FU (#1639 ) There was a bug exposed by a recent PR [1] where until recently the O3 CPU was executing an instruction even if it did not have the required functional unit in the FU pool. We are adding the crypto descriptors to the Default FU pool in the O3 cpu so that no panic is encountered upon execution of a crypto instruction [1]: https://github.com/gem5/gem5/pull/1516 Change-Id: Ifaf2f8e4780dfb8ba825a99a02dd587f011dbd23 Reviewed-by: Richard Cooper <richard.cooper@arm.com> Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com>	2024-10-08 10:22:25 +01:00
aperais	e970acb9d2	cpu-o3: Replace integral constants by named constants in FU pool (#1556 ) This replaces hardcoded integral values with more explicit constant names in the code allocating functional units to instructions. This commit follows `ba5886aee7` which should have read: "If an instruction requires a functional unit that is not present in the model (e.g., because it is not present in the configuration), O3CPU treats it as a 1-cycle operation. This commit changes the behavior to make the cpu panic when this happens. The cpu panics only if the instruction reaches the head of the ROB, meaning it is ok to have unsupported instructions on the wrong path. Thanks to Chandana S. Deshpande (deshpande.s.chandana@gmail.com) for finding the issue." Change-Id: I5e0a37e5fb8404cb5496bd2cb0a9a5baeae3b895 Co-authored-by: Arthur perais <arthur.perais@univ-grenoble-alpes.fr>	2024-09-12 14:04:34 +01:00
aperais	ba5886aee7	cpu-o3: Panic if no FU exists for an instruction needing to issue (#1516 ) At present, if an instruction requires a functional unit that is not present in the O3CPU config, O3CPU treats it as a 1-cycle operation that does not consume an FU. This seems like a silent failure : if I forgot to add a FU for a new operation type I added, then I don't want it to silently work "for free". The problem is that the code treats the FU allocator returning `NoCapableFU` for a given DynInst as equivalent to the case where the DynInst obtained an FU, with default latency of 1. This is because there is a single if statement that checks whether the FU allocator returned `NoFreeFU` or not, and `NoCapableFU` happens to be different. The change is to introduce `NoNeedFU` and to panic if the FU allocator returns `NoCapableFU` An improvement would be to use a strongly typed enum rather than integer constants. Thoughts ? In addition to unit tests, I have tested this with `main.py run` and get panics if I remove support for `IntMul` type in `O3CPU.py` in: ``` ./SuiteUID-asm-riscv-rv32um-ps-mul-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv32um-ps-mul-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-asm-riscv-rv32um-ps-mulh-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv32um-ps-mulh-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-asm-riscv-rv32um-ps-mulhsu-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv32um-ps-mulhsu-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-asm-riscv-rv32um-ps-mulhu-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv32um-ps-mulhu-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-asm-riscv-rv64um-ps-mul-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv64um-ps-mul-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot 
execute opclass:2 ./SuiteUID-asm-riscv-rv64um-ps-mulh-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv64um-ps-mulh-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-asm-riscv-rv64um-ps-mulhsu-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv64um-ps-mulhsu-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-asm-riscv-rv64um-ps-mulhu-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv64um-ps-mulhu-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-asm-riscv-rv64um-ps-mulw-o3-ALL-x86_64-opt/TestUID-asm-riscv-rv64um-ps-mulw-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-BaseCPUProcessor-arm-hello-ALL-x86_64-opt/TestUID-BaseCPUProcessor-arm-hello-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-cpu_test_ArmDerivO3CPU_Bubblesort-ALL-x86_64-opt/TestUID-cpu_test_ArmDerivO3CPU_Bubblesort-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-cpu_test_ArmDerivO3CPU_FloatMM-ALL-x86_64-opt/TestUID-cpu_test_ArmDerivO3CPU_FloatMM-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-cpu_test_RiscvDerivO3CPU_Bubblesort-ALL-x86_64-opt/TestUID-cpu_test_RiscvDerivO3CPU_Bubblesort-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-cpu_test_RiscvDerivO3CPU_FloatMM-ALL-x86_64-opt/TestUID-cpu_test_RiscvDerivO3CPU_FloatMM-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-o3-cpu_1-cores_classic_DualChannelDDR3_1600_arm_boot_test_to-tick-ALL-x86_64-opt/TestUID-o3-cpu_1-cores_classic_DualChannelDDR3_1600_arm_boot_test_to-tick-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor 
cannot execute opclass:2 ./SuiteUID-o3-cpu_1-cores_classic_DualChannelDDR3_1600_riscv-boot-test_to-tick-ALL-x86_64-opt/TestUID-o3-cpu_1-cores_classic_DualChannelDDR3_1600_riscv-boot-test_to-tick-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-test-arm-hello32-static-o3-ALL-x86_64-opt/TestUID-test-arm-hello32-static-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-test-arm-hello64-static-o3-ALL-x86_64-opt/TestUID-test-arm-hello64-static-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-test-mips-hello-o3-ALL-x86_64-opt/TestUID-test-mips-hello-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-test-riscv-hello-o3-ALL-x86_64-opt/TestUID-test-riscv-hello-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ./SuiteUID-test-riscv-print-this-o3-ALL-x86_64-opt/TestUID-test-riscv-print-this-o3-ALL-x86_64-opt/simerr.txt:src/cpu/o3/inst_queue.cc:905: panic: Processor cannot execute opclass:2 ``` Co-authored-by: Arthur perais <arthur.perais@univ-grenoble-alpes.fr>	2024-09-11 16:43:31 +01:00
MMysore2	33e3bc4ff1	Updating Traffic Generators (#1416 ) Added documentation for `strided_generator.py` and `strided_generator_core.py`. Updated clarity of documentation for `linear_generator.py`, `linear_generator_core.py`, `random_generator.py`, and `random_generator_core.py`. Made `max_addr` exclusive instead of inclusive for strided and linear traffic generation in `strided_gen.cc` and `linear_gen.cc`.	2024-08-08 12:46:10 -07:00
Saili Karkare	bd228af5cf	Updating hex addr printing (#1385 ) This change changes the addresses that are printed when TrafficGen DebugFlag is enabled. Previously, hex strings were printed without a preceding 0x. This change fixes that to distinguish between decimal and hex.	2024-08-07 02:31:21 -07:00
Yu-Cheng Chang	c13f895af0	arch,cpu: Implement generic reset method for MMU (#1342 ) Implementing a generic reset method for the MMU allows each ISA to implement its own reset method. The default MMU reset method flushes all TLB entries. For example, RISC-V needs to do a PMP reset when it receives the reset signal, but the TLBs do not need to be flushed. Change-Id: I158261570fb6e5216ec105fbdc53460f83f88d15	2024-07-30 09:47:55 +01:00
Yu-Cheng Chang	ce8db85867	cpu: Add cpuIdlePins to indicate the threadContext of CPU is idle (#1285 ) If the threadContext of the CPU enters suspend mode, raise the cpu_idle_pins entry for that threadContext's threadID, sending a high signal to the target. If the threadContext of the CPU enters active mode, lower the cpu_idle_pins entry for that threadContext's threadID, sending a low signal to the target.	2024-07-10 10:36:37 +01:00
Bobby R. Bruce	7137b73ca0	cpu: Fix `std::min` type mismatch in reg_class.hh (#1266 ) Introduced in #1234, this caused compilation to fail on Apple Silicon systems. This bug is the same as #582 where a more detailed explanation is provided.	2024-06-20 13:02:08 -07:00
Mahyar Samani	7ff1e381c9	cpu,stdlib: Fix Access Trace for Accessing Indices in SpatterGen (#1258 ) This change fixes the way indices are generated in a multi generator setup. It changes it from all cores generating the same trace of indices for accessing the index array to each core generating an interleaved subset of indices. For an example look below for traces (indices to index array) in a 2 core setup. Before: core_0: 0, 1, 2, 3, 4, 5, 6, 7, ... core_1: 0, 1, 2, 3, 4, 5, 6, 7, ... After: core_0: 0, 1, 2, 3, 8, 9, 10, 11, ... core_1: 4, 5, 6, 7, 12, 13, 14, 15, ... Additionally, this change fixes the SpatterKernel class in the standard library to comply with the change in the SpatterGen source code.	2024-06-20 11:24:44 -07:00
Bobby R. Bruce	36f73f671d	cpu,stdlib: Adding Spatter (#1136 ) This PR adds source code for C++ implementation of SpatterGen as well as SpatterKernel. SpatterGen uses a PyBindMethod to add kernels to the backend code. This way the process of processing json files could be offloaded to python. In addition it adds standard library components for SpatterGenCore and SpatterGen. These two components follow the same structure as AbstractCore and AbstractProcessor. In addition spatter_kernel.py adds a definition for SpatterKernel in python to make adding kernels to C++ easier. Also it adds utility functions for parsing dictionaries read from json as well as partitioning traces for multicore setups.	2024-06-17 15:28:45 -07:00
Hoa Nguyen	15e0236a8b	arch,cpu,sim: Add mechanism to partially print vector regs (#1234 ) Currently, gem5's inst tracer prints the whole vector register container by default. The size of vector register containers in gem5 is the maximum size allowed by the ISA. For vector-length agnostic (VLA) vector registers, this means ARM SVE vector container is 2048 bits long, and RISC-V vector container is 65535 bits long. Note that VLA implementation in gem5 allows the vector length to be varied within the limit specified by the ISAs. However, in most use cases of gem5, the vector length is much less than 65535 bits. This causes two issues: (1) the vector container requires allocating and moving around a large amount of unused data while only a fraction of it is used, and (2) printing the execution trace of a vector register results in a wall of text with a small amount of useful data. This change addresses problem (2) by providing a mechanism to limit the amount of data printed by the instruction tracer. This is done by adding a function printing the first X bits of a vector register container, where X is the vector length determined at runtime, as opposed to the vector container size, which is determined at compilation time. Change-Id: I815fa5aa738373510afcfb0d544a5b19c40dc0c7 --------- Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2024-06-17 14:05:47 -07:00
Giacomo Travaglini	2804311f7b	cpu-o3: Revert "Do not set Executed on load instruction to be replayed" (#1251 ) Reverts gem5/gem5#1182 This is breaking O3 execution. Investigating the matter	2024-06-17 12:24:43 -07:00
Mahyar Samani	6695e5ef70	cpu: Adding SpatterGen This change adds source code for SpatterGen ClockedObject. The set of source code pushed includes code for SpatterKernel that tracks whether information is being gathered or scattered as well as the list of indices to be accessed. This model has PyBindMethod to add SpatterKernels from python. This way all the preparations for kernels can be done in python. SpatterGen has a few parameters that model limits on a few hardware resources in the backend of a processor, e.g. number of functional units to calculate effective address, the latency of calculating effective address, number of integer registers. Change-Id: I451ffb385180a914e884cab220928c5f1944b2e3	2024-06-14 10:45:09 -07:00
Minje Jun	b8e21a2d32	cpu-o3: Do not set Executed on load instruction to be replayed (#1182 ) A load instruction can be replayed when 1) it's strictly ordered or 2) it falls into load-store forwarding mismatch. Case 1 was considered in the executeLoad function but case 2 wasn't. This causes a case-2 replayed load instruction to violate the assertion condition "assert(!load_inst->isExecuted())" in LSQUnit::read. This commit fixes the problem by also considering case 2 in LSQUnit::executeLoad. Co-authored-by: Minje Jun <minje.jun@samsung.com>	2024-06-14 10:12:26 -07:00
Jason Lowe-Power	21ffd91529	cpu,arch: Add IsInvalid flag to Unknown insts (#1071 ) The IsInvalid flag indicates that the static instruction is not part of the executing ISA and not part of m5's pseudo-instructions. This flag provides a way to recognize an illegal instruction at the decode stage.	2024-06-13 16:26:35 -07:00
Harshil Patel	74afea471d	cpu: Revert "Don't change to suspend if the thread status is halted" (#1225 ) Reverts gem5/gem5#1039	2024-06-12 00:20:06 -07:00
Hoa Nguyen	369029d2be	cpu: Add IsInvalid flag to StaticInstFlags The IsInvalid flag indicates that the static instruction is not part of the executing ISA and not part of m5's pseudo-instructions. This flag provides a way to recognize an illegal instruction at the decode stage. Change-Id: I2779c6edcd8c5e6a77ea11cad3ff73bacb79d800 Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2024-06-11 18:48:29 +00:00
Ivana Mitrovic	a764b9be1c	Revert "arch-x86: Fix TLB Assertion Error on CFLUSH" (#1196 ) Reverts gem5/gem5#1080 as it is not a good fix.	2024-06-04 10:26:53 -07:00
Lukas Zenick	dad5c7b6f7	arch-x86: Fix TLB Assertion Error on CFLUSH (#1080 ) Fixed the assertion statement in the cpu's translation.hh file so that it doesn't fail the assertion if the cache is clean. I compile this c code to `test` ```c #include <stdio.h> static inline void clflush(volatile void *p) { __asm__ volatile ("clflush (%0)" : : "r"(p) : "memory"); } int main() { int data = 42; // Example variable printf("Value before clflush: %d\n", data); clflush(&data); printf("Value after clflush: %d\n", data); return 0; } ``` And run it with this script `./build/X86/gem5.opt configs/learning_gem5/part1/two_level.py ./test` In order to verify that it no longer fails the assertion check. GitHub Issue: #862 Change-Id: I6004662e7c99f637ba0ddb07d205d1657708e99f	2024-06-03 10:17:10 -07:00
Harshil Patel	0824d7f2cd	Revert "cpu-kvm: Support perf counters on hybrid host architectures" (#1127 ) Reverts gem5/gem5#1065 Reverting this change because this PR breaks X86 kvm as mentioned in the issue #1126.	2024-05-21 08:14:10 -07:00
Yu-Cheng Chang	321bd07163	cpu: Don't change to suspend if the thread status is halted (#1039 ) In our gem5 model, four states represent a thread context: Active, Suspend, Halting and Halted `5641c5e464/src/cpu/thread_context.hh (L99-L117)` When initializing the gem5 instance, all of the thread contexts are set to Halted. The status of a thread context will not be Active until the Workload initializes at startup, except with the StubWorkload. So if the user uses the StubWorkload and the CPU is connected to the model_reset port, the thread context of the CPU may be activated. The following are the steps that activate the thread context of the CPU without Workload[1] initialization or lowering the model_reset port[2]. 1. Raise the model_reset port (Change the state from Halted to Suspend) `5641c5e464/src/cpu/base.cc (L671-L673)` 2. Post the interrupt to CPU (Change the state from Suspend to Active) `5641c5e464/src/cpu/base.cc (L231-L239)` Implementation of wakeup SimpleCPU: `5641c5e464/src/cpu/simple/base.cc (L251-L259)` MinorCPU: `5641c5e464/src/cpu/minor/cpu.cc (L143-L151)` O3CPU: `5641c5e464/src/cpu/o3/cpu.cc (L1337-L1346)` This CL fixes the issue that occurs when raising the model_reset port to the CPU (letting the CPU sleep) if the CPU has not been activated by the workload. If the CPU status is Halted, it should not change to Suspend, to avoid a wake-up. Reference: The model_reset was introduced in the CL: https://gem5-review.googlesource.com/c/public/gem5/+/67574/4 [1] Activate by workload (ARM example): `5641c5e464/src/arch/arm/fs_workload.cc (L101-L114)` [2] Lower the model_reset: `5641c5e464/src/cpu/base.cc (L191-L192)` `5641c5e464/src/cpu/base.cc (L674-L685)` Change-Id: I5bfc0b7491d14369fff77b98b71c0ac763fb7c42	2024-05-16 10:02:53 -07:00
OdnetninI (Eduardo José Gómez Hernández)	17cbbd84ae	cpu: Indirect predictor track conditional indirect (#1077 ) As discussed in https://github.com/orgs/gem5/discussions/954: In the refactor made by commit `f65df9b959`, conditional indirect branches are no longer updated in the indirect predictor. These kinds of branches do not exist in x86 or Arm, but they are present in PowerPC. This patch enables the indirect predictor to track these kinds of branches.	2024-04-29 11:38:22 +01:00
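
Note on commit 7ff1e381c9 (SpatterGen index traces): the Before/After traces in that message are consistent with each core taking fixed-size blocks of consecutive indices in round-robin order. The sketch below reproduces the "After" traces under that reading. The function name `interleaved_indices` and the block size of 4 are assumptions inferred from the example traces; this is not the SpatterGen or SpatterKernel code.

```python
def interleaved_indices(core_id, num_cores, block_size, count):
    """Return the first `count` indices assigned to `core_id`.

    Hypothetical helper: indices are dealt out to cores in blocks of
    `block_size` consecutive values, round-robin across `num_cores`.
    """
    indices = []
    block = 0  # how many blocks this core has already taken
    while len(indices) < count:
        # Each round hands out num_cores blocks; this core owns the
        # block at position `core_id` within the round.
        start = (block * num_cores + core_id) * block_size
        for i in range(start, start + block_size):
            if len(indices) == count:
                break
            indices.append(i)
        block += 1
    return indices


if __name__ == "__main__":
    for core in range(2):
        print(f"core_{core}:", interleaved_indices(core, 2, 4, 8))
```

With two cores and a block size of 4, this yields 0, 1, 2, 3, 8, 9, 10, 11 for core_0 and 4, 5, 6, 7, 12, 13, 14, 15 for core_1, matching the "After" traces quoted in the commit message.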

2660 Commits