derek/gem5

Author	SHA1	Message	Date
Bobby R. Bruce	a4757bef47	tests: Add 'requires' tests for ALL/gem5 Change-Id: I58012b092dc1ec027474e2e45ad3e9809b31578b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63433 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>	2022-09-22 18:19:56 +00:00
Bobby R. Bruce	b94a6a50a5	tests: Update presubmit.sh to compile ALL/gem5.fast This part of the Kokoro presubmit tests was designed to ensure gem5 still compiled successfully with Clang and to the '.fast' variant. ARM was chosen arbitrarily. Now that ALL exists, it makes more sense to use it for this test. Change-Id: Ia3593f7dd506205da13802a69094f4dd7019ab90 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63371 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-22 18:19:56 +00:00
Bobby R. Bruce	3b0cb574f5	tests: Update tests to use ALL/gem5.opt compilation Where possible the gem5 tests have been updated to use the build/ALL/gem5.opt compilation. If a quick test required a specific ISA/protocol compilation, it was moved to the long/nightly set. This means all the quick/kokoro tests are run with the build/ALL/gem5.opt compilation. The learning_gem5 tests have been updated to use ALL/gem5.opt. The equivalent examples on the website have been updated via: https://gem5-review.googlesource.com/c/public/gem5-website/+/63336 Change-Id: I533689ad6848233867bdba9e9a43bb5840ed65c7 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63374 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu>	2022-09-22 18:19:56 +00:00
kunpai	2429a6dd58	stdlib: Added RiscvMatched prebuilt board Modeled after the HiFive Unmatched. For the cache, we inherited from AbstractClassicCacheHierarchy and AbstractTwoLevelCacheHierarchy to make a PrivateL1PrivateL2 hierarchy with the same associativity and sizes as the board. However, the L2 Size is kept as a parameter that can be set by the user. The core is in-order, therefore we inherited from RISC-V MinorCPU and used the same pipeline parameters as the ARM HPI CPU, except the decodeToExecuteForwardDelay, which was increased to 2 to avoid a PMC access fault. For the processor, we initialized the core with an ID so that we can return 4 cores in FS mode, which is the same as the real system, and 1 in SE mode. For the memory, we just have a single channel DDR4_2400 with a size of 16 GB and starting at 0x80000000. For the board, we declared a Boolean variable as a parameter to assign whether it is in FS mode or not. We inherited from KernelDiskWorkload and SEBinaryWorkload and initialized the components of the board according to the Boolean. The other parameters are the clock frequency and the L2 cache size. Jira Issue: https://gem5.atlassian.net/browse/GEM5-1257 Change-Id: Ic2c066bb3a41dc2f42566ce971f9a665542f9771 Co-authored-by: Jasjeet Rangi <jasrangi@ucdavis.edu> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63411 Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-22 14:44:40 +00:00
Gabe Black	16690bc289	scons: Fix the default KVM_ISA setting. The KVM_ISA setting was moved into a CONF dict, but the code which ensured it had a default if there was no possible KVM hosting ISA was still setting that variable in the base environment dict. This moves the setting into the CONF dict instead. Change-Id: I067c969dd761b2cdb098bcba6cd6a4b643d2d427 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63752 Reviewed-by: Earl Ou <shunhsingou@google.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Gabe Black <gabeblack@google.com>	2022-09-22 07:56:47 +00:00
Daecheol You	e8ff8817e3	mem-ruby: bug fix for stale WriteBack Finish_CopyBack_Stale is scheduled only when the requestor is the last sharer. This prevents the cache from evicting a line that was already evicted while the stale WriteBack transaction was stalled. Wrong condition check in Finish_CopyBack_Stale for eviction is also removed. Change-Id: Ib66acc1b9e4a6f7cea373e1fb37375427897d48d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63611 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-19 01:57:23 +00:00
Bobby R. Bruce	db8641fd7b	stdlib: Add additional warns when `get_runtime_isa` used While the `runtime` module's `get_runtime_isa` function throws a warning to remind users that the function is deprecated, this was not always helpful to a user when setting a processor without a target ISA. This change adds additional warnings to the SimpleSwitchableProcessor and the SimpleProcessor. These warnings explain that not explicitly setting the ISA via the processor's constructor is deprecated behavior. Change-Id: I994ad8355e0d1c3f07374bebe2b59106fb04d212 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63331 Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-16 03:16:36 +00:00
Bobby R. Bruce	3b832fceb9	scons: Update 'ALL' compilation to use MESI_Two_Level The MI_Example coherence protocol is a poor default. Change-Id: I2baec29ef18a9cb9f0d5751155935cae4621af5d Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63372 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>	2022-09-15 19:06:45 +00:00
Bobby R. Bruce	8c1c60d43f	tests: Update RISCV asmtests to use 'simple_binary_run.py' The tests have been modified to be functionally equivalent but utilize the standard library via the 'simple_binary_run.py' script. Change-Id: Ib8b7a442a478d0fb152339ff5ba039412d0fef48 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63373 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-15 19:06:45 +00:00
Bobby R. Bruce	bcb9181db3	tests: Remove 'test_build' from the testlib These gem5 builds are compiled automatically if required by a test. Additionally, they are redundant given the existence of the compiler tests, run daily. Change-Id: I71141f82a86538a77384e684b9d261794e103b99 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63334 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Bobby Bruce <bbruce@ucdavis.edu>	2022-09-15 19:06:45 +00:00
Bobby R. Bruce	cc3940b69e	tests: Add 'ALL' build to the compiler tests Change-Id: I7f31a23999173e7cd06f3ee87e5e4f0ae42c54ea Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63333 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-15 19:06:45 +00:00
Bobby R. Bruce	f9e1b308c1	tests: Enable build/ALL/gem5 supported-isas-check tests These tests were disabled until the build/ALL/gem5 compilation was complete. These test the `gem5.runtime.get_supported_isas` function. Change-Id: Ieac5676e9fed121f3cfe35e38f9748431824cbc0 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63332 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-15 19:06:45 +00:00
Giacomo Travaglini	f448706dd5	arch-arm: Properly implement last level TLBIs Prior to gem5 v21.2, partial translation entries were not cached within the TLB, therefore Last Level (only) TLBI instructions were invalidating every entry. Now that we store translations from several lookup levels we are currently over-invalidating partial translations. This patch is adding a boolean flag to TLBIMVAA and TLBIMVA, allowing to discard a match if the TLBI is targeting complete translations only and the entry holds a partial translation Change-Id: I86fa39c962355d9c566ee8aa29bebcd9967c8c57 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62453 Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-15 17:24:06 +00:00
Luming Wang	fe6fc29b07	cpu: add `BTBUpdates` for BPredUnitStats Current BPredUnitStats only contains BTBLookups. However, the number of BTB updates is also needed to evaluate power consumption via McPAT. Thus, this patch adds BTBUpdates to BPredUnitStats. Change-Id: I4c079b53f6585b5452022fe3fb516563c7d07f4e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63651 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2022-09-15 02:07:20 +00:00
Jui-Min Lee	e1ba253438	arch-riscv: Add flag for misaligned access check Misaligned access is an optional feature of the RISC-V ISA, but it is always enabled in the current gem5 model. We add a flag to the ISA class to turn off the feature. Note that this CL only considers load/store instructions, not the instruction fetch itself. To support instruction address faults, we'll need to modify the RISC-V decoder. Change-Id: Iec4cba0e4fdcb96ce400deb00cff47e56c6d1e93 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63211 Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Gabe Black <gabeblack@google.com>	2022-09-14 02:22:47 +00:00
Bobby R. Bruce	17a46091fa	tests: Remove '--ignore-style' from m5 weekly test build The '--ignore-style' flag was added to the `scons build/x86/out/m5` command in the weekly tests in this commit: https://gem5-review.googlesource.com/c/public/gem5/+/63012 The m5 compilation does not attempt to run the style check hooks and, as such, this command failed as the '--ignore-style' flag is not recognized. This caused the weekly tests to fail: https://jenkins.gem5.org/job/weekly/76 This commit removes this flag. Change-Id: Ic0320609ac5234be978743377f13dd1cf7f1e782 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63553 Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-13 17:05:54 +00:00
Giacomo Travaglini	d9ed84902d	sim: Fix serialize_handlers.test.cc on Arm platforms The C and C++ standards allow the character type char to be signed or unsigned, depending on the platform and compiler. Most systems, including x86 GNU/Linux and Microsoft Windows, use signed char, but those based on PowerPC and ARM processors typically use unsigned char. This means testing for: EXPECT_FALSE(parser.parse("255", value)); is not portable as Arm platforms are able to convert 255 into an unsigned character. We are fixing this portability issue by performing different checks depending on the platform. Maybe a better solution would be to explicitly set the sign of the char (signed char in this case). Change-Id: I44dd84378ea62ae21a6b03e1f35119bf85f8c799 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63539 Maintainer: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>	2022-09-13 08:46:56 +00:00
Mahyar Samani	8ba46bafb0	stdlib: Improving synthetic traffic generation. This change adds a new traffic generator module to the standard library that can read a .cfg file describing the traffic pattern as a state machine. It wraps around the TrafficGen SimObject. In addition it adds a method to ComplexGenerator to set the traffic from outside using python generators like the example found in configs/dram/sweep.py. Change-Id: I5989bb900d26091e6e0e85ea63c741441b72069c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62473 Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-13 07:14:51 +00:00
Bobby R. Bruce	26ea5a1c72	configs: Fix "gem5.resource" typo in riscv-ubuntu-run.py This was causing the Nightly tests to fail: https://jenkins.gem5.org/job/nightly/348/ The import should be "gem5.resources.workload". Change-Id: I0ecd181a3c1120c44ebd0683e2a62bdc602a75bd Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63391 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>	2022-09-12 18:44:28 +00:00
Tiago Muck	f6b2793b91	Revert "mem-ruby: bug fix for Finish_CopyBack_Stale" This reverts commit `f7cf47bc31`. Reason for revert: introduces an issue when handling a stale WriteBack Change-Id: I4bd370911cb003c0c99e5fd14866b8c98afa80e2 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63412 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>	2022-09-12 14:52:38 +00:00
Jason Lowe-Power	170c998b8f	python: Enable -c in gem5 to mimic python Adds a -c parameter to gem5 that works like python's -c to execute commands from a string. This is to set up getting multiprocessing and spawn to work in a later changeset. Change-Id: I11a1dedb481fbe88898abc1e525d781ec3f66494 Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63131 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-10 15:36:01 +00:00
Luming Wang	f43e722238	cpu: Fix RAS behaviour when both isReturn and isCall are set. As discussed in [1], current BP cannot handle the instruction with both isReturn and isCall on RAS. This hurts the performance of coroutine-based programs. This patch adjusts the behaviour of RAS. When the isReturn flag is set, it will pop a RAS. Then, if the isCall flag is set, it will push a RAS. Previous implementation only pop a RAS when both isReturn and isCall are set. This behaviour follows the RISC-V Spec [2]. Since other ISAs do not have instructions that set both isCall and isReturn, this patch has no impact on other ISAs. [1] https://gem5-review.googlesource.com/c/public/gem5/+/58209 [2] https://github.com/riscv/riscv-isa-manual/releases/download/Ratified-IMAFDQC/riscv-spec-20191213.pdf Change-Id: I52c01bbea41347711edff9ce9a03076e46aadc92 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63311 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-10 08:02:06 +00:00
Jasjeet Rangi	6807f70b81	cpu-minor: Add control instruction statistics to MinorCPU Add control/branch instructions committed stat to Minor CPU. The stats can be found in board.processor.cores.core.committedControl_0 in stats.txt. The stats counted are IsControl, IsDirectControl, IsIndirectControl, IsCondControl, IsUncondControl, IsCall, and IsReturn. IsControl tracks the total control/branch instructions committed. Use inst->staticInst->isControl() flag to determine if an instruction is a control or not, and then using other flags in the StaticInstFlags to determine the type of control instruction and tracking the committed ones. Jira Issue: https://gem5.atlassian.net/browse/GEM5-1283 Change-Id: Iee1010fdf0fa4078ebe1c56b437295abdb5f4469 Co-authored-by: Kunal Pai <kunpai@ucdavis.edu> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63358 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: ZHENGRONG WANG <seanyukigeek@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2022-09-10 00:08:56 +00:00
Daecheol You	f7cf47bc31	mem-ruby: bug fix for Finish_CopyBack_Stale I made a mistake in the change below: https://gem5-review.googlesource.com/c/public/gem5/+/58413 Checking the requestor in the sharer list for eviction should be removed now. If the sharer count is zero, the requestor can't be in the sharer list. Change-Id: I304d2dd7df1aff4907801664a260c35c490a2136 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62991 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>	2022-09-09 20:38:20 +00:00
Jarvis Jia	b86088008a	mem-ruby: Fix replacement policy updates with stores in MI_example The current MI_example protocol's L1 caches updates the MRU information twice per store requests that miss -- once when the request reaches Ruby and once when the store miss is returned from another level of the memory hierarchy. Although this approach does not cause any correctness bugs for replacement policies like LRU since this request is the LRU in both cases, it does not work correctly for other policies like SecondChance and LFU, where updating the information twice (for misses) causes them to devolve to LRU. Note that this was not directly a problem with Ruby previously, because it only supported LRU-based policies that were unaffected by this. However, with the integration of 20879 Ruby now uses the same replacement policies as Classic (which has additional, non-LRU based replacement policies). This patch resolves this problem by not updating the MRU information a second time for the misses. It has been tested and validated with the replacement policy tests in 20880, and it modifies the store instead of the load in 62232. Change-Id: I8436e3e537da0ee5841c59a94fa5e5c30105529f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63191 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 15:19:54 +00:00
Matthew Poremba	b623d26543	dev-amdgpu: Fix interrupt call for release mem Both the client id and source id are incorrect for the release mem CP packet. This changeset sets both to the correct value and adds asserts that the value is declared in the client ID and source ID enums. Change-Id: I4cc6c3a5f2a482e8f7dcd2a529c4a69bf71742c0 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63177 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	9f5c0f2822	gpu-compute: dprint instruction requesting translation When debugging strange addresses, it is extremely useful to know what instruction calculated that address. This make it much easier to follow assembly code backwards to find the source of an incorrect address. This change adds a DPRINTF for GPUTLB that by default prints the disassembly when a virtual address translation is sent to the TLB. Change-Id: I5066c064a48c5c48696863eeccd8d011245ef7b2 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63176 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	b919d9c5c9	arch-vega: Improve disasm for GLOBAL insts with scalar offset The previous print statement was not clear that a scalar offset was being used when printing disassembly, which made it slightly more difficult to track down bugs related to this (relatively) rare usage of global load/store instructions. This change improves the disassembly to closer match the output of hipcc's assembly code output. Change-Id: I8514aedacb5b1db93d0586c408c4cf1ce77a7db3 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63175 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	6c935657fd	dev-amdgpu: Implement SDMA atomic packet SDMA atomic packets are used in conjunction with RLC queues in SDMA for synchronization similar to how HSA signals are used with BLIT kernels when SDMA is disabled. Implement a skeleton of the SDMA atomic packet methods as well as the atomic add64 operation. The atomic add operation appears to be the only operation used in ROCm, so this implementation is fairly complete. See: https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/ rocm-4.2.x/src/core/runtime/amd_blit_sdma.cpp#L880 Change-Id: I62cc337f2ffe590bdb947b48053760ee8b3a6f32 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63174 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	9ea28bd782	dev-amdgpu: Implement SDMA RLC queue unmapping The unmap queues packet specifies all non-static queues should be unmapped which includes RLC queues in the SDMA. This functionality did not exist before and is added in this changeset. Fixes bug with rodinia_3.0/hip/bfs. Change-Id: I80ca8cf8d89559625b5870745889b0a27916635e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63173 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	af4251f6ae	dev-amdgpu: Rework SDMA RLC queue data structure There can only ever be two RLC queues maximum. Use this information for a simpler data structure to store doorbell information. The patch changes the std::unordered_map previously used to std::array. This will also be useful in avoiding erase-while-iterating issues needed to unregister all queues at once. Change-Id: I95600e40de51cb1a992a20bcebaf7580ea4d0be8 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63172 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	12ec5f9172	dev-amdgpu: Rework framebuffer reads Previously framebuffer reads would try reading from MMIO trace, special addresses, and then anything previously written to a special address range. This does not handle direct large BAR reads, causing incorrect results in some applications that were doing this. Rework the readFramebuffer method to do the following. Remove the MMIO trace read altogether, as there were not any framebuffer reads from the trace to begin with. Read special addresses first to avoid overwriting by previous writes. Next read previous writes to special ranges. The special range is the GART table. These are required for functional translations. Lastly read from the device memory directly. This does a functional read required by the PCIDevice read method which is non-timing. Reading from device memory is preferred over the map type used for GART to avoid duplication of a potentially huge amount of data. With this changeset all but one of the HIP samples and HIP examples applications now run and pass verification of results. Change-Id: Id3b788bfc5eaf17cfa1897f25d26f3725d4db321 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63171 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	f91abb9770	arch-vega: Allow unaligned large host pages The virtual and physical address for device memory are typically aligned to the page size. On the host (x86), however, the physical address may not be aligned to page size for large page sizes when mixed with 4kB pages. As a result, the physical address calculation must add, rather than bitwise-OR, the virtual page offset to the physical page number. The virtual page offset on the GPU continues to use the variable page bytes for masking and shifting. Change-Id: I6563a1eb43d9b59577d32268b8645a7436304bcb Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63034 Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	4b35693bd2	dev-amdgpu: Forward RLC queue doorbells Forward user queue doorbells to the SDMA. This is the final step needed to enable RLC (user) queues to replace BLIT kernels. Change-Id: I0c2ef70bb5414b82785ef437dd65d6c57798d24f Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63033 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	a5dfb0718d	dev-amdgpu: Add user-mode TranslationGen to SDMA RLC queue do translation using user mode addresses. To support this, add the final aperture translation needed to the SDMA engine. Change-Id: I25841e240e3b44f66d26d503ab52b54379daa49a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63032 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	9ed39afe62	dev-amdgpu: Place all user-mode translations in MMHUB The memory management hub ("mmhub") is an aperture that aliases the GPU device memory. MMHUB addresses functionally map to the same device address, with the exception that it is guaranteed not to overlap with host memory. This is useful in gem5 for APIs with Addr type as it prevents sending e.g., DMAs to the wrong place. Change-Id: Ia296809a8dc2c5fbdeba6d70cd53215f9ab36c93 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63031 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	e0e2806fc4	dev-amdgpu: Add SDMA device translation helper Adding a helper function to remove duplicate code in the copy packet methods. Adds more comments on that code to explain what it is doing. This could in theory also be used in other packets in the future. Change-Id: Id0ed50c87260a2f12f53cb14e927f8c49bb99072 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62718 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	f20d12656a	configs: Stop disabling SDMA in GPUFS config Support has been added for SDMA RLC queues which are used for host to device and device to host "memcpy" calls. Previously the SDMA engine was disabled which caused GPU BLIT kernels to be called. This removes the environment variable disabling SDMAs which has two main benefits: - It will be much easier to debug host/device transfer by using SDMA debug flag. - Simulation time is improved since we no longer need detailed GPU simulation to copy data and instead are doing a simple large DMA Change-Id: I7524245731d301b5c26394318f2156ed6b4c983a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62717 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 04:13:49 +00:00
Matthew Poremba	58e072f8bf	dev-amdgpu: Remove default callback in mem manager API In almost all cases reading/writing using the GPU memory manager will want to wait until that read or write is complete. Therefore, change the API to not default to no callback so that the user must explicitly specify nullptr indicating they do not want to wait for completion. Updates a write call which cannot use a callback due to being atomic in the base gpu device code. Change-Id: Id19145d49c7cafc97e2e178819682cb97270a16a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62716 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-09 04:13:49 +00:00
Bobby R. Bruce	700f64c1c1	scons: Ensure style_hooks check exits if hook cannot install If the pre-commit could not be installed the compilation would continue as the exit code from running the pre-commit install script was not read or processed. This commit adds a check. If the install is unsuccessful the user is asked whether they want to continue the compilation or not. This check can be ignored with the '--ignore-style' flag. The tests have been updated to include this flag in all cases we compile gem5 to ensure tests remain automated and uninterrupted on Kokoro/Jenkins. Change-Id: Iaf4db71300883b828b00d77784c9bb46b2698f89 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63012 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com>	2022-09-08 18:31:08 +00:00
Bobby R. Bruce	92ab557947	configs: Use "arm64-ubuntu-20.04-boot" workload for example The ARM Ubuntu Boot example was using 18.04. This commit updates this example script to use the "arm64-ubuntu-20.04-boot" workload, added here: https://gem5-review.googlesource.com/c/public/gem5-resources/+/62662 Change-Id: I9cee16f739a5fa9281041fde242b5cd37e5be20b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62665 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-08 17:40:11 +00:00
Bobby R. Bruce	bb60998aa9	configs,tests: Update tests/configs for RISCV boot workload As of this commit: https://gem5-review.googlesource.com/c/public/gem5-resources/+/62659 we have a RISCV Ubuntu 20.04 boot workload. This patch applies it to test scripts and example scripts where appropriate. Change-Id: Ibf9bed1a978b6d2e456b528f64cf3a9d6dc0e568 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62664 Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-08 17:40:11 +00:00
Bobby R. Bruce	af4fd2f2c6	tests,configs: Update x86 boot tests/examples with Workload As of this commit: https://gem5-review.googlesource.com/c/public/gem5-resources/+/62658 there is an x86-ubuntu-18.04-boot workload. Where appropriate tests and example scripts have been updated to use this workload. Change-Id: I7c9dc8e0e53b1d3f4c365f0382b5f5d4224436f7 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62663 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>	2022-09-08 17:40:11 +00:00
Giacomo Travaglini	bfcf5f0b91	arch-arm, kvm: Fix KVM_ARM_IRQ_VCPU2_SHIFT compilation error After the following patch: https://gem5-review.googlesource.com/c/public/gem5/+/59310 gem5 doesn't compile on Arm machines that don't define the KVM_ARM_IRQ_VCPU2_SHIFT macro as the latter is not guarded anymore. This patch fixes the problem by amending capIRQLineLayout2 to rely on KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 only (which makes sense) and moves back the KVM_ARM_IRQ_VCPU2_SHIFT guard back to its original place Change-Id: Ib6b6ef4014c2a54580cb3e5b0167d4ee1f7139ed Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63111 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-07 08:13:15 +00:00
Zhantong Qiu	07b693a186	stdlib, configs: stdlib SimPoints support and example scripts simpoints-se-checkpoint.py & simpoints-se-restore.py: These are two example scripts to show how to use SimPoints functions with the stdlib. se_binary_workload.py: Allow se_binary_workload to take in SimPoint Class item and schedule SimPoint exit events. exit_event.py: Added SIMPOINT_BEGIN and MAX_INSTS exit events. simulator.py: Added SIMPOINT_BEGIN and MAX_INSTS exit event scheduling functions. They can schedule exit events before or during the simulation. Jira Issue: https://gem5.atlassian.net/browse/GEM5-1259 Change-Id: Iaa07a83de9dddc293b9f1a230aba8e35d4f5af6c Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63154 Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu>	2022-09-07 02:20:08 +00:00
Zhantong Qiu	f08a4d2dc5	stdlib: cpu support for SimPoint and MAX_INSTS exit events BaseCPU.py: Linked "scheduleSimpointsInstStop" and "scheduleInstStopAnyThread" to python base.cc & base.hh: Added scheduling functions for SimPoint and MAX_INSTS exit event. abstract_core.py & base_cpu_core.py: Added scheduling functions for SimPoint and MAX_INSTS exit event for stdlib processor to access. Jira Issue: https://gem5.atlassian.net/browse/GEM5-1259 Change-Id: I98a0f93b46a220fdb3f350d8da359c24b4d66a58 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63153 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>	2022-09-06 18:48:51 +00:00
Zhantong Qiu	8fa5a8a668	stdlib: added SimPoint Class to stdlib Added SimPoint Class to store the SimPoints information a workload needs. It stores SimPoints starting instructions, SimPoints interval, SimPoints weight, and warmup length for each SimPoint. Jira Issue: https://gem5.atlassian.net/browse/GEM5-1259 Change-Id: I47e4dc0c98801d42acef9b7ccbb629401c61ca40 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63132 Maintainer: Bobby Bruce <bbruce@ucdavis.edu> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-06 18:48:20 +00:00
Zhantong Qiu	c16b717a60	stdlib: added three exit event generators In exit_event_generators.py, added a dump/reset exit generator, a save checkpoint generator, and a default generator for SimPoints. Jira Issue: https://gem5.atlassian.net/browse/GEM5-1259 Change-Id: Ie36e853a5ef992d6d293917ef2df2a3a8b8c68b9 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/63152 Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-06 18:46:36 +00:00
Matthew Poremba	3465ff1e7d	dev-amdgpu: Add callbacks for all SDMA GPUMemMgr reqs SDMA write, copy, and ptePde use GPUMemMgr to write to device memory and were dangerously not waiting for write completion which could result in data not being completely written to memory, the data buffer being freed and potentially reused in the simulator, or advancing to the next SDMA packet before the previous one is complete. This changeset adds callbacks for the corresponding "done" methods similar to what the dmaVirt methods call when reading or writing to host memory to fix this issue. Change-Id: I44ce14c13f812ea2a7a76438e12a6ed7c6e0bff0 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62715 Maintainer: Matt Sinclair <mattdsinclair@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com>	2022-09-03 16:05:58 +00:00
Matthew Poremba	404aa34855	dev-amdgpu: Track outstanding chunks in mem manager Requests sent using the GPU memory manager are not guaranteed to be ordered. As a result, the last chunk created by the chunk generator could complete before all of the previous chunks are done. This will trigger the final callback and may cause an SDMA/PM4/etc. packet that is waiting for its completion to resume before the data is ready. This is likely a fix for verification failures in many applications. Currently this is tested on MatrixTranspose from the HIP cookbook which now passes its verification step. It could also potentially fix other race conditions between reads/writes from/to memory such as using a PTE or PDE before it is written, etc. Change-Id: Id6fb342d899db6bd0b86c80056ecf91eeb3026f5 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/62714 Reviewed-by: Matt Sinclair <mattdsinclair@gmail.com> Reviewed-by: Jason Lowe-Power <power.jg@gmail.com> Maintainer: Jason Lowe-Power <power.jg@gmail.com> Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Matt Sinclair <mattdsinclair@gmail.com>	2022-09-03 16:05:58 +00:00

19431 Commits