derek/gem5

Author	SHA1	Message	Date
Johnny	105839ae2b	sim: add bypass_on_change to the set() of a signal When resetting a port, we don't want to trigger onChange(). Offer an option to bypass it and update the state only. Change-Id: Ia53b7a76d2a320ea67101096cdbfe2eafaf440d2	2023-09-07 11:54:56 +08:00
Nicholas Mosier	3dfdd48211	misc: Fix buggy special path comparisons This patch fixes the buggy special path comparisons in src/kern/linux/linux.cc Linux::openSpecialFile(), which only checked for equality of path prefixes, but not equality of the paths themselves. This patch replaces those buggy comparisons with regular std::string::operator== string equality comparisons. GitHub issue: https://github.com/gem5/gem5/issues/269 Change-Id: I216ff8019b9a6a3e87e364c2e197d9b991959ec1	2023-09-05 13:44:10 -07:00
Matthew Poremba	2da54d5a4f	mem-ruby: Reorder SLC atomic and response actions Currently the MOESI_AMD_Base-directory transition for system level atomics sends the response message before the atomic is performed. This was likely done because atomics are supposed to return the value of the data before the atomic is performed and by simply ordering the actions this way that was taken care of. With the new atomic log feature, the atomic values are pulled from the log by the coalescer on the return path. Therefore, these actions can be reordered. However, it is now necessary that the atomics be performed before sending the response so that the log is populated and copied by the response action. This should fix #253 . Change-Id: Ie7e178f93990975367de2cc3e89e5ef9c9069241	2023-09-01 10:36:54 -05:00
Bobby R. Bruce	8d47cda8b6	arch-x86: Fix wrong x86 assembly (#251 ) The RM field of ModRM was printed as Reg field for several instructions. For reference, this change fixes typos introduced by [1]. [1] https://gem5-review.googlesource.com/c/public/gem5/+/40339	2023-09-01 00:26:00 -07:00
Hoa Nguyen	4ff1f160ec	arch-x86: Fix wrong x86 assembly The RM field of ModRM was printed as Reg field for several instructions. For reference, this change fixes typos introduced by [1]. [1] https://gem5-review.googlesource.com/c/public/gem5/+/40339 Change-Id: I41eb58e6a70845c4ddd6774ccba81b8069888be5 Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2023-09-01 00:26:51 +00:00
Matthew Poremba	cfa833a97d	gpu-compute: Set LDS/scratch aperture base register Starting with gfx900 (Vega) the LDS and scratch apertures can be queried using a new s_getreg_b32 instruction. If the instruction is called with the SH_MEM_BASES argument it returns the upper 16 bits of a 64 bit address for the LDS and scratch apertures. The current addresses cannot be encoded in this register, so the addresses are changed to have the lower 48 bits be all zeros in addition to writing the bases register. Change-Id: If20f262b2685d248afe31aa3ebb274e4f0fc0772	2023-08-31 11:01:32 -05:00
Bobby R. Bruce	0e323bc409	mem: Atomic ops to same address (#200 ) Augmenting the DataBlock class with a change log structure to record the effects of atomic operations on a data block and service these changes if the atomic operations require return values. Although the operations are atomic, the coalescer need not send unique memory requests for each operation. Atomic operations within a wavefront to the same address are now coalesced into a single memory request. The response of this request carries all the necessary information to provide the requesting lanes unique values as a result of their individual atomic operations. This helps reduce contention for request and response queues in simulation. Previously, only the final value of the datablock after all atomic ops to the same address was visible to the requesting waves. This change corrects this behavior by allowing each wave to see the effect of its individual atomic op if a return value is necessary.	2023-08-30 23:53:35 -07:00
Bobby R. Bruce	c156df620d	resources, stdlib: Add support for local files in obtain_resource (#204 ) This patch allows a local JSON file to specify a local path in the JSON object of a Resource, through the "url" field. Local paths can be entered with the prefix "file:" in the "url" field. If the local path exists, then the Resource from there is copied into the resource directory defined in the function earlier. This behavior is the same as using specific Resource classes (ex. BinaryResource) and passing a local_path into the function. But, the above class does not allow simultaneous creation of local Resources and Workloads of those local Resources. With this patch, someone can use a local JSON, specify the location of local Resources and create a Workload from those Resources and test both together.	2023-08-29 20:35:40 -07:00
KUNAL PAI	d52c7ce87f	resources, stdlib: Add support for local files in obtain_resource This patch allows a local JSON file to specify a local path in the JSON object of a Resource, through the "url" field. Local paths can be entered with the prefix "file:" in "url". All File URI scheme formats are supported. This behavior is the same as using specific Resource classes (ex. BinaryResource) and passing a local_path into the function. But, the above infrastructure does not allow simultaneous creation of Resources and Workloads of those Resources. With this patch, someone can use a local JSON, specify the location of local Resources and create a Workload from those Resources and test both together. Also, this patch adds pyunit tests to check the functionality of the function used to convert the "url" field into a path. Change-Id: I1fa3ce33a9870528efd7751d7ca24c27baf36ad4	2023-08-29 09:47:03 -07:00
Bobby R. Bruce	68a48a2dfa	mem-ruby: fix CHI sending the wrong snoop response (#219 ) Do not respond with SnpRespData_I when the line is still present upstream.	2023-08-28 16:21:25 -07:00
Bobby R. Bruce	737c611e72	mem-ruby: fix assert on CHI ReadUnique (#218 ) DCT must be disabled when handling a ReadUnique where the copy needs to be upgraded. Previously we were just asserting as it was assumed DCT is only enabled for HNFs (which can "auto-upgrade"). However DCT may also be enabled for intermediate levels of distributed shared caches above the HNFs.	2023-08-28 16:06:09 -07:00
Bobby R. Bruce	4bd3d2f864	mem-ruby: Improve Ruby/CHI stats for in/out trans (#220 ) Currently we generate these stats for all defined Events in the protocol, which may generate too many stats that are never used. Though these don't appear in the stats.txt file, they unnecessarily increase simulation startup time and memory footprint. This patch limits those stats to events with the "in_trans" and/or "out_trans" properties. SLICC compiler then checks which combinations of event+state are possible when generating the stats. Also the possible level of detail for inTransLatHist was reduced. Only the number of transactions for each event+initial+final state combinations is now accounted. Latency histograms are only defined per event type (similarly to outTransLatHist). This significantly reduces the final file size for generated stats.	2023-08-28 15:06:39 -07:00
Matthew Poremba	60f071d09a	gpu-compute,arch-vega: Implement flat scratch insts Flat scratch instructions (aka private) are the 3rd and final segment of flat instructions in gfx9 (Vega) and beyond. These are used for things like spills/fills and thread local storage. This commit enables two forms of flat scratch instructions: (1) flat_load/flat_store instructions where the memory address resolves to private memory and (2) the new scratch_load/scratch_store instructions in Vega. The first are similar to older generation ISAs where the aperture is unknown until address translation. The second are instructions guaranteed to go to private memory. Since these are very similar to flat global instructions there are minimal changes needed: - Ensure a flat instruction is either regular flat, global, XOR scratch - Rename the global op_encoding methods to GlobalScratch to indicate they are for both and are intentionally used. - Flat instructions in segment 1 output scratch_ in the disassembly - Flat instruction executed as private use similar mem helpers as global - Flat scratch cannot be an atomic This was tested using a modified version of the 'square' application: template <typename T> __global__ void scratch_square(T C_d, T A_d, size_t N) { size_t offset = (blockIdx.x * blockDim.x + threadIdx.x); size_t stride = blockDim.x * gridDim.x ; volatile int foo; // Volatile ensures scratch / unoptimized code for (size_t i=offset; i<N; i+=stride) { foo = A_d[i]; C_d[i] = foo * foo; } } Change-Id: Icc91a7f67836fa3e759fefe7c1c3f6851528ae7d	2023-08-26 13:40:12 -05:00
Matthew Poremba	4506188e00	gpu-compute: Fix private offset/size register indexes According to the ABI documentation from LLVM, the low register of flat scratch (maxSGPR - 4) is the offset and the high register (maxSGPR - 3) is size. These are currently backwards, resulting in some gnarly addresses being generated leading to page fault and/or incorrect data. This commit fixes this by setting the order correctly. Change-Id: I0b1d077c49c0ee2a4e59b0f6d85cdb8f17f9be61	2023-08-26 13:40:12 -05:00
Matthew Poremba	e0379f4526	gpu-compute: Fix flat scratch resource counters Flat instructions may access memory locations in LDS (scratchpad) and global (VRAM/framebuffer) and therefore increment both counters when dispatched. Once the aperture is known, we decrement the counters of the aperture that was not used. This is done incorrectly for scratch / private flat instruction. Private memory is global and therefore local memory counters should be decremented. This commit fixes the counters by changing the global decrements to local decrements. Change-Id: I25890446908df72e5469e9dbaba6c984955196cf	2023-08-26 13:40:12 -05:00
Matthew Poremba	57b3d2897c	gpu-compute: Use timing DMAs for GPUFS HSA signals The functional HSA signal read was a hack left in the gpu-compute code. In full system, this functional read is causing problems occasionally with the translation not yet being in the page table. The error message output by gem5 was a fatal message on the readBlob method in port proxy. Changing this to a timing DMA fixes this problem. This commit adds the various timing DMA functions to send and receive response and clean up. A helper method "sendCompletionSignal" is added to the GPUCommandProcessor because the indentation level was getting too deep. This change applies only to FS mode. Code for SE mode is equivalent to what it was before this commit. Change-Id: I1bfcaa0a52731cdf9532a7fd0eb06ab2f0e09d48	2023-08-25 13:10:51 -05:00
Matthew Poremba	fcbed2bd8a	dev-amdgpu: Tell OS about PCIe atomic support (#224 ) configs,dev-amdgpu: Add PCI express capability info The ROCm stack requires PCI express atomics. Currently the first PCI CapabilityPtr does not point to anything, which signals to the OS (Linux) that this is an early generation PCI device. As PCI express atomics were introduced later, the CapabilityPtr needs to point to at least a PCI express capability structure. This capability is defined as 0x10 in Linux. We additionally set the PCI atomic based bits and implement device specific PCI configuration space reads and writes to the amdgpu device. With the second commit, the output of simulation when loading the amdgpu driver no longer outputs "PCIE atomics not supported". Further, an application which uses PCIe atomics (PyTorch with a reduce_sum kernel) now makes further progress. The first commit is a minor fix changing the PCI express capability struct to a union.	2023-08-24 11:19:30 -07:00
Bobby R. Bruce	7aa896fe8f	cpu-minor: Separate the reg_index of VecClassReg and VecElemReg (#225 ) In the RISC-V system, we need VecClassReg to run RISC-V vector instructions, and VecElemReg is not applicable because the element length of a vector can be resized via the vsetvl instruction. The change will separate the reg_index for VecReg and VecElemReg to ensure there is space for VecReg when VecElemReg is not applicable.	2023-08-24 10:13:21 -07:00
Giacomo Travaglini	56a8ab3f3c	sim: provide a signal constructor with an init_state (#210 ) The current SignalSinkPort and SignalSourcePort have no way to assign the init value of the state. Add a new constructor for them with the param init_state. Bug: 293410800 Test: boot to linux Change-Id: Idde0a12aa0ddd0c9c599ef47059674fb12aa5d68 Reviewed-on: https://soc-sim-external-review.googlesource.com/c/gem5/gem5/+/13159 Gem5-Virtual-Platform-Presubmit-Ready: Johnny Ko <johnnyko@google.com> Reviewed-by: Yu-hsin Wang <yuhsingw@google.com> Perf-Presubmit-Ready: Johnny Ko <johnnyko@google.com> Gem5-Virtual-Platform-Verified: kokoro <noreply+kokoro@google.com> Perf-Verified: kokoro <noreply+kokoro@google.com>	2023-08-24 18:06:21 +01:00
Bobby R. Bruce	e77666d9e8	mem-ruby: fix CHI Evict race condition (#217 ) When an Evict request is received from upstream for a shared line and the line is no longer cached locally (or on any other upstream cache), we need to also send an Evict downstream. In this case we need to wait until our outgoing Evict completes before completing the Evict from upstream in order to be able to resolve race conditions with incoming snoops. E.g.: while our outgoing Evict is pending we may receive a snoop requesting data, but we won't be able to complete this snoop if we have already completed all upstream Evicts and we no longer have the line.	2023-08-24 10:04:28 -07:00
Matthew Poremba	addba01d29	configs,dev-amdgpu: Add PCI express capability info The ROCm stack requires PCI express atomics. Currently the first PCI CapabilityPtr does not point to anything, which signals to the OS (Linux) that this is an early generation PCI device. As PCI express atomics were introduced later, the CapabilityPtr needs to point to at least a PCI express capability structure. This capability is defined as 0x10 in Linux. We additionally set the PCI atomic based bits and implement device specific PCI configuration space reads and writes to the amdgpu device. With this commit, the output of simulation when loading the amdgpu driver no longer outputs "PCIE atomics not supported". Further, an application which uses PCIe atomics (PyTorch with a reduce_sum kernel) now makes further progress. Change-Id: I5e3866979659a2657f558941106ef65c2f4d9988	2023-08-24 09:10:35 -05:00
Roger Chang	5c28113a06	cpu-minor: Separate the reg_index of VecClassReg and VecElemReg In the RISC-V system, we need VecClassReg to run RISC-V vector instructions, and VecElemReg is not applicable because the element length of a vector can be resized via the vsetvl instruction. The change will separate the reg_index for VecReg and VecElemReg to ensure there is space for VecReg when VecElemReg is not applicable. Change-Id: I99a82dec273baeee31df89a0ee0f5e87f3ff187c	2023-08-24 13:27:27 +08:00
Matthew Poremba	8b4c38302f	dev: PCI: Fix PCI express capability union The capability structure for PCI express is a struct, instead of a union like the other capability unions. A union is used here to provide access to the ordinal data values when reading/writing an offset while simultaneously providing human readable field values that can be set when writing the code. This commit changes it to a union, which is likely what it should be. Nothing appears to be using this union yet so it is likely an oversight. Change-Id: I85fe7cc62914525c70fd7a5946d725ed308f8775	2023-08-23 19:32:38 -05:00
Matthew Poremba	90a518e885	gpu-compute,arch-vega: Fix ALU-only LDS counters There are a few LDS instructions that perform local ALU operations and writeback which are marked as loads. These are marked as loads because they fit in the pipeline logic better, according to a several year old comment. In the VEGA ISA these instructions (swizzle, permute, bpermute) are not decrementing the LDS load counter. As a result, the counter will gradually increase over time. Since wavefront slots are persistent, this can cause applications with a few thousand kernels to eventually hang thinking there are not enough resources. This changeset fixes this by decrementing the LDS load counter for these instructions. This fix was already integrated in the GCN3 ISA in the exact same way. This changeset moves it near a similar comment about scheduling register file writes. Change-Id: Ife5237a2cae7213948c32ef266f4f8f22917351c	2023-08-23 19:30:24 -05:00
Tiago Mück	9584d2efa9	mem-ruby: add in_trans/out_trans to CHI events Marks which events signal the beginning of incoming and outgoing transactions for generating inTransLatHist and outTransLatHist stats. Change-Id: I90594a27fa01ef9cfface309971354b281308d22 Signed-off-by: Tiago Mück <tiago.muck@arm.com>	2023-08-23 17:25:50 -05:00
Tiago Mück	3360a87d5a	mem-ruby: optimize in/outTransLatHist stats Generating these stats for all defined Events may generate too many stats that are never used, which unnecessarily increases simulation startup time and memory consumption. This patch limits those stats to events with the "in_trans" and/or "out_trans" properties. SLICC compiler then checks which combinations of event+state are possible when generating the stats. Also the possible level of detail for inTransLatHist was reduced. Only the number of transactions for each event+initial+final state combinations is now accounted. Latency histograms are only defined per event type (similarly to outTransLatHist). This significantly reduces the final file size for generated stats. Change-Id: I29aaeb771436cc3f0ce7547a223d58e71d9cedcc Signed-off-by: Tiago Mück <tiago.muck@arm.com>	2023-08-23 17:25:38 -05:00
Tiago Mück	a5fd6edea1	mem-ruby: fix CHI sending the wrong snoop response Do not respond with SnpRespData_I when the line is still present upstream. Change-Id: I2592e5c6637cfc0e83042169a245837648276e61 Signed-off-by: Tiago Mück <tiago.muck@arm.com>	2023-08-23 17:04:09 -05:00
Tiago Mück	49f5ec16d1	mem-ruby: fix assert on CHI ReadUnique DCT must be disabled when handling a ReadUnique where the copy needs to be upgraded. Previously we were just asserting as it was assumed DCT is only enabled for HNFs (which can "auto-upgrade"). However DCT may also be enabled for intermediate levels of distributed shared caches above the HNFs. Change-Id: I9e29142a8d2f59ea61c1d90cda6b00c19435d6b7 Signed-off-by: Tiago Mück <tiago.muck@arm.com>	2023-08-23 16:58:25 -05:00
Reiley Jeyapaul	c9ff54677f	mem-ruby: fix CHI Evict race condition When an Evict request is received from upstream for a shared line and the line is no longer cached locally (or on any other upstream cache), we need to also send an Evict downstream. In this case we need to wait until our outgoing Evict completes before completing the Evict from upstream in order to be able to resolve race conditions with incoming snoops. E.g.: while our outgoing Evict is pending we may receive a snoop requesting data, but we won't be able to complete this snoop if we have already completed all upstream Evicts and we no longer have the line. Change-Id: I23ac4f0a9c4ddd81e2425376c8d1e1c7fb66d107 Signed-off-by: Tiago Mück <tiago.muck@arm.com>	2023-08-23 15:49:51 -05:00
Ranganath (Bujji) Selagamsetty	f6a453362f	mem: Atomic ops to same address Augmenting the DataBlock class with a change log structure to record the effects of atomic operations on a data block and service these changes if the atomic operations require return values. Although the operations are atomic, the coalescer need not send unique memory requests for each operation. Atomic operations within a wavefront to the same address are now coalesced into a single memory request. The response of this request carries all the necessary information to provide the requesting lanes unique values as a result of their individual atomic operations. This helps reduce contention for request and response queues in simulation. Previously, only the final value of the datablock after all atomic ops to the same address was visible to the requesting waves. This change corrects this behavior by allowing each wave to see the effect of its individual atomic op if a return value is necessary. Change-Id: I639bea943afd317e45f8fa3bff7689f6b8df9395	2023-08-23 14:45:25 -05:00
Johnny	76fe71ebd0	sim: provide a signal constructor with an init_state Add more description to the code Change-Id: Iff8fb20762baa0c9d0b7e5f24fb8769d7e198b5c	2023-08-23 10:49:15 +08:00
Johnny	6acb687975	sim: provide a signal constructor with an init_state 1. The current SignalSinkPort and SignalSourcePort have no way to assign the init value of the state. Add a new constructor for them with the param init_state. 2. After the source and sink are bound, the state on both sides should be the same. Set the state of the sink to the state of the source in the bind() function. Change-Id: Idde0a12aa0ddd0c9c599ef47059674fb12aa5d68	2023-08-23 10:12:41 +08:00
Jason Lowe-Power	e3414c7098	base: Make 'findLsbSetFallback' constexpr to fix gcc-8 comp (#203 ) Compilation bug found on: https://github.com/gem5/gem5/actions/runs/5899831222/job/16002984553 In gcc Version 8 and below the following error is received: ``` src/base/bitfield.hh: In function ‘constexpr int gem5::findLsbSet(uint64_t)’: src/base/bitfield.hh:365:34: error: call to non-‘constexpr’ function ‘int gem5::{anonymous}::findLsbSetFallback(uint64_t)’ return findLsbSetFallback(val); ~~~~~~~~~~~~~~~~~~^~~~~ scons: *** [build/ALL/kern/linux/events.o] Error 1 ``` `findLsbSet` cannot be `constexpr` as it calls the non-constexpr function `findLsbSetFallback`. The problematic function is the `count` on the std::bitset. This patch changes this to a constexpr.	2023-08-22 11:23:59 -07:00
Bobby R. Bruce	709f632730	base: Make 'findLsbSetFallback' constexpr to fix gcc-8 comp Compilation bug found on: https://github.com/gem5/gem5/actions/runs/5899831222/job/16002984553 In gcc Version 8 and below the following error is received: ``` src/base/bitfield.hh: In function ‘constexpr int gem5::findLsbSet(uint64_t)’: src/base/bitfield.hh:365:34: error: call to non-‘constexpr’ function ‘int gem5::{anonymous}::findLsbSetFallback(uint64_t)’ return findLsbSetFallback(val); ~~~~~~~~~~~~~~~~~~^~~~~ scons: *** [build/ALL/kern/linux/events.o] Error 1 ``` `findLsbSet` cannot be `constexpr` as it calls the non-constexpr function `findLsbSetFallback`. The problematic function is the `count` on the std::bitset. This patch changes this to a constexpr. Change-Id: I48bd15d03e4615148be6c4d926a3c9c2f777dc3c	2023-08-21 14:04:36 -07:00
Hoa Nguyen	9e007e5bd7	mem-cache: fix wrong function call Change-Id: I924ede89f373ec21557faf25c96b36f4bc8430dd Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2023-08-19 22:56:55 +00:00
Hoa Nguyen	f442846d9d	mem-cache: Fix another typo Change-Id: Ib2051f9bda6e6d9002d3be1dbf0b890299098201 Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2023-08-19 22:50:53 +00:00
Hoa Nguyen	7b897a30fa	mem-cache: Fix syntax error Change-Id: I1360879c13d377661e9eeeddf345b785c01efeb6 Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2023-08-19 21:27:53 +00:00
Hoa Nguyen	98daec7d99	mem-cache: Allow clflush's uncacheable requests on classic cache When the Linux kernel changes a page property, it flushes the related cache lines. The kernel might change the page property before flushing the cache lines. As a result, the clflush might occur in an uncacheable region. Currently, an uncacheable request must be a read or a write. However, a clflush request is neither of them. This change aims to allow clflush requests to work on uncacheable regions. Since there is no straightforward way to check if a packet is from a clflush instruction, this change permits all Clean Invalidate Requests, which is the type of request produced by clflush, to work on uncacheable regions. Change-Id: Ib3ec01d9281d3dfe565a0ced773ed912edb32b8f Signed-off-by: Hoa Nguyen <hn@hnpl.org>	2023-08-19 18:20:16 +00:00
Bobby R. Bruce	30ab2c19b1	stdlib: Allow passing of func as Exit Event generator (#195 ) In this case the function is turned into a generator, with the "yield" of the generator being the return value of the function's execution. Translation of this stale Gerrit Change: https://gem5-review.googlesource.com/c/public/gem5/+/62872	2023-08-18 10:55:50 -07:00
Bobby R. Bruce	c0216dbe48	stdlib: Allow passing of func as Exit Event generator In this case the function is turned into a generator, with the "yield" of the generator being the return value of the function's execution. Change-Id: I4b06d64c5479638712a11e3c1a2f7bd30f60d188	2023-08-17 16:48:33 -07:00
Jason Lowe-Power	22c52f4fba	Fix reporting traps (faults) to GDB in SE mode (#166 ) This addresses #123	2023-08-17 16:08:49 -07:00
Jan Vrany	3564348eec	arch-riscv: Report traps to GDB in SE mode This commit adds code to report illegal instruction and breakpoint traps to GDB (if connected). This merely follows what POWER does.	2023-08-17 15:55:04 +01:00
Jan Vrany	546b3eac7d	arch-riscv: Do not advance PC when handling faults in SE mode On RISC-V, when a trap occurs the PC register contains the address of the instruction that caused the trap (as opposed to the address of the instruction following it in the instruction stream). Therefore this commit does not advance the PC before reporting a trap in SE mode. Change-Id: I83f3766cff276312cefcf1b4ac6e78a6569846b9	2023-08-17 15:55:04 +01:00
Jan Vrany	fde58a4365	arch-power: Fix reporting traps to GDB Due to inverted logic in POWER fault handlers, unimplemented opcode and trap faults did not report trap to GDB (if connected). This commit fixes the problem. While at it, I opted to use `if (! ...) { panic(...) }` rather than `panic_if(...)`. I find it easier to understand in this case. Change-Id: I6cd5dfd5f6546b8541d685e877afef21540d6824	2023-08-17 15:55:04 +01:00
Roger Chang	fe142f485a	arch-riscv: Add missing vector required check for vmem instructions The mem instructions are usually executed from initiateAcc. We also need to check vector required in those instructions. Change-Id: I97b4fec7fada432abb55ca58050615e12e00d1ca	2023-08-17 09:53:30 +08:00
Roger Chang	35a6fe6f3d	arch-riscv: Check vill in vector mem instructions Any vector instruction using vtype should check whether the vill flag is set. Change-Id: Ia9a2695f3005a176422da78e6f413cc789116faa	2023-08-17 09:53:30 +08:00
Bobby R. Bruce	3ff6fe0e90	arch-x86,cpu-kvm: Fix gem5.fast due to unused variable (#189 ) Detected via this failing workload: https://github.com/gem5/gem5/actions/runs/5861958237 It caused the following compilation error to be thrown: ``` src/arch/x86/kvm/x86_cpu.cc:1462:22: error: unused variable ‘rv’ [-Werror=unused-variable] 1462 | bool rv = isa->cpuid->doCpuid(tc, function, idx, cpuid); | ^~ ``` `rv` is unused in the .fast compilation as it's only used in the `assert` statement immediately after. To fix this, the `[[maybe_unused]]` annotation is used.	2023-08-16 12:52:44 -07:00
Bobby R. Bruce	c835c9faa3	arch-x86,cpu-kvm: Fix gem5.fast due to unused variable Detected via this failing workload: https://github.com/gem5/gem5/actions/runs/5861958237 It caused the following compilation error to be thrown: ``` src/arch/x86/kvm/x86_cpu.cc:1462:22: error: unused variable ‘rv’ [-Werror=unused-variable] 1462 | bool rv = isa->cpuid->doCpuid(tc, function, idx, cpuid); | ^~ ``` `rv` is unused in the .fast compilation as it's only used in the `assert` statement immediately after. To fix this, the `[[maybe_unused]]` annotation is used. Change-Id: Ib98dd859c62f171c8eeefae93502f92a8f133776	2023-08-16 10:06:39 -07:00
Matthew Poremba	bc9bbc10f0	gpu-compute: Change kernel-based exit location (#184 ) The previous exit event occurs when the dispatcher sends a completion signal for a kernel, but gem5 does some kernel-based stats updates after the signal is sent. Therefore, if these exit events are used as a way to dump per-kernel stats, some of the stats for the kernel that just ended will be in the next kernel's stat dump which is misleading. This patch moves the exit event to where the stats are updated and only exits if the dispatcher has requested a stat dump to prevent situations where stats are updated mid-kernel. Change-Id: I74dc1cad5fc90382a2a80564764b3e7c9fb65521	2023-08-16 07:38:12 -07:00
Andreas Sandberg	f6d44ac7b3	fastmodel: Add option to retry licence server connection. (#183 ) We're seeing some occasional connection timeouts in CI, possibly when we aggressively hit the license server, so let's add a parameter to retry the connection a few times. Also, print the time required to connect to the server to help debug issues.	2023-08-16 10:08:59 +01:00
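
Commit 3dfdd48211 above describes special-path checks that compared only path prefixes and were replaced with std::string::operator== equality. The following is a minimal sketch of that class of bug, not the actual code from src/kern/linux/linux.cc: the helper names matchesPrefixOnly and matchesExactly are hypothetical, and the prefix check is assumed to resemble a bounded std::string::compare call.

```cpp
#include <string>

// Hypothetical helpers for illustration only; these are not gem5 functions.

// Assumed shape of the buggy check: only the first special.size() characters
// of `path` are compared, so any path that merely starts with the special
// name is accepted.
inline bool
matchesPrefixOnly(const std::string &path, const std::string &special)
{
    return path.compare(0, special.size(), special) == 0;
}

// The kind of fix the commit message describes: plain whole-string equality.
inline bool
matchesExactly(const std::string &path, const std::string &special)
{
    return path == special;
}
```

Under these assumptions, a path such as "/proc/meminfo_extra" passes the prefix-only check against "/proc/meminfo" but fails the equality check, which is the difference the commit message points at.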

14360 Commits