Commit Graph

3040 Commits

Author SHA1 Message Date
Alec Roelke
1229b3b623 riscv: [Patch 3/5] Added RISCV floating point extensions RV64FD
Third of five patches adding RISC-V to GEM5. This patch adds the RV64FD
extensions, which include single- and double-precision floating point
instructions.

Patch 1 introduced RISC-V and implemented the base instruction set, RV64I
and patch 2 implemented the integer multiply extension, RV64M.

Patch 4 will implement the atomic memory instructions, RV64A, and patch
5 will add support for timing, minor, and detailed CPU models that is
missing from the first four patches.

[Fixed exception handling in floating-point instructions to conform better
to IEEE-754 2008 standard and behavior of the Chisel-generated RISC-V
simulator.]
[Fixed style errors in decoder.isa.]
[Fixed some fuzz caused by modifying a previous patch.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Alec Roelke
070da98493 riscv: [Patch 2/5] Added RISC-V multiply extension RV64M
Second of five patches adding RISC-V to GEM5.  This patch adds the
RV64M extension, which includes integer multiply and divide instructions.

Patch 1 introduced RISC-V and implemented the base instruction set, RV64I.

Patch 3 will implement the floating point extensions, RV64FD; patch 4 will
implement the atomic memory instructions, RV64A; and patch 5 will add
support for timing, minor, and detailed CPU models that is missing from
the first four patches.

[Added mulw instruction that was missed when dividing changes among
patches.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Alec Roelke
e76bfc8764 arch: [Patch 1/5] Added RISC-V base instruction set RV64I
First of five patches adding RISC-V to GEM5. This patch introduces the
base 64-bit ISA (RV64I) in src/arch/riscv for use with syscall emulation.
The multiply, floating point, and atomic memory instructions will be added
in additional patches, as well as support for more detailed CPU models.
The loader is also modified to be able to parse RISC-V ELF files, and a
"Hello world\!" example for RISC-V is added to test-progs.

Patch 2 will implement the multiply extension, RV64M; patch 3 will implement
the floating point (single- and double-precision) extensions, RV64FD;
patch 4 will implement the atomic memory instructions, RV64A, and patch 5
will add support for timing, minor, and detailed CPU models that is missing
from the first four patches (such as handling locked memory).

[Removed several unused parameters and imports from RiscvInterrupts.py,
RiscvISA.py, and RiscvSystem.py.]
[Fixed copyright information in RISC-V files copied from elsewhere that had
ARM licenses attached.]
[Reorganized instruction definitions in decoder.isa so that they are sorted
by opcode in preparation for the addition of ISA extensions M, A, F, D.]
[Fixed formatting of several files, removed some variables and
instructions that were missed when moving them to other patches, fixed
RISC-V Foundation copyright attribution, and fixed history of files
copied from other architectures using hg copy.]
[Fixed indentation of switch cases in isa.cc.]
[Reorganized syscall descriptions in linux/process.cc to remove large
number of repeated unimplemented system calls and added implmementations
to functions that have received them since it process.cc was first
created.]
[Fixed spacing for some copyright attributions.]
[Replaced the rest of the file copies using hg copy.]
[Fixed style check errors and corrected unaligned memory accesses.]
[Fix some minor formatting mistakes.]
Signed-off by: Alec Roelke

Signed-off by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 17:10:28 -05:00
Tony Gutierrez
0799600686 x86: fix issue with casting in Cvtf2i
UBSAN flags this operation because it detects that arg is being cast directly
to an unsigned type, argBits. this patch fixes this by first casting the
value to a signed int type, then reintrepreting the raw bits of the signed
int into argBits.
2016-11-21 15:35:56 -05:00
Tony Gutierrez
ae55cba281 x86: fix loading/storing of Float80 types 2016-11-19 12:35:14 -05:00
Andreas Hansson
6ed567d600 alpha: Remove ALPHA tru64 support and associated tests
No one appears to be using it, and it is causing build issues
and increases the development and maintenance effort.
2016-11-17 04:54:14 -05:00
Tony Gutierrez
74249f80df hsail,gpu-compute: fixes to appease clang++
fixes to appease clang++. tested on:

Ubuntu clang version 3.5.0-4ubuntu2~trusty2
(tags/RELEASE_350/final) (based on LLVM 3.5.0)

Ubuntu clang version 3.6.0-2ubuntu1~trusty1
(tags/RELEASE_360/final) (based on LLVM 3.6.0)

the fixes address the following five issues:

1) the exec continuations in gpu_static_inst.hh were marked
   as protected when they should be public. here we mark
   them as public

2) the Abs instruction uses std::abs() in its execute method.
   because Abs is templated, it can also operate on U32 and U64,
   types, which cause Abs::execute() to pass uint32_t and uint64_t
   types to std::abs() respectively. this triggers a warning
   because std::abs() has no effect in this case. to rememdy this
   we add template specialization for the execute() method of Abs
   when its template paramter is U32 or U64.

3) Some potocols that utilize the code in cprintf.hh were missing
   includes to BoolVec.hh, which defines operator<< for the BoolVec
   type. This would cause issues when the generated code would try
   to pass a BoolVec type to a method in cprintf.hh that used
   operator<< on an instance of a BoolVec.

4) Surprise, clang doesn't like it when you clobber all the bits
   in a newly allocated object. I.e., this code:

   tlb = new GpuTlbEntry\[size\];
   std::memset(tlb, 0, sizeof(GpuTlbEntry) \* size);

   Let's use std::vector to track the TLB entries in the GpuTlb now...

5) There were a few variables used only in DPRINTFs, so we mark them
   with M5_VAR_USED.
2016-10-26 22:48:45 -04:00
Michael LeBeane
dc16c1ceb8 dev: Add m5 op to toggle synchronization for dist-gem5.
This patch adds the ability for an application to request dist-gem5 to begin/
end synchronization using an m5 op. When toggling on sync, all nodes agree
on the next sync point based on the maximum of all nodes' ticks. CPUs are
suspended until the sync point to avoid sending network messages until sync has
been enabled. Toggling off sync acts like a global execution barrier, where
all CPUs are disabled until every node reaches the toggle off point. This
avoids tricky situations such as one node hitting a toggle off followed by a
toggle on before the other nodes hit the first toggle off.
2016-10-26 22:48:40 -04:00
Tony Gutierrez
de72e36619 gpu-compute: support in-order data delivery in GM pipe
this patch adds an ordered response buffer to the GM pipeline
to ensure in-order data delivery. the buffer is implemented as
a stl ordered map, which sorts the request in program order by
using their sequence ID. when requests return to the GM pipeline
they are marked as done. only the oldest request may be serviced
from the ordered buffer, and only if is marked as done.

the FIFO response buffers are kept and used in OoO delivery mode
2016-10-26 22:48:28 -04:00
Tony Gutierrez
b63eb1302b gpu-compute, hsail: pass GPUDynInstPtr to getRegisterIndex()
for HSAIL an operand's indices into the register files may be calculated
trivially, because the operands are always read from a register file, or are
an immediate.

for machine ISA, however, an op selector may specify special registers, or
may specify special SGPRs with an alias op selector value. the location of
some of the special registers values are dependent on the size of the RF
in some cases. here we add a way for the underlying getRegisterIndex()
method to know about the size of the RFs, so that it may find the relative
positions of the special register values.
2016-10-26 22:47:49 -04:00
Tony Gutierrez
844fb845a5 gpu-compute, hsail: make the PC a byte address, not an instruction index
currently the PC is incremented on an instruction granularity, and not as an
instruction's byte address. machine ISA instructions assume the PC is a byte
address, and is incremented accordingly. here we make the GPU model, and the
HSAIL instructions treat the PC as a byte address as well.
2016-10-26 22:47:43 -04:00
Tony Gutierrez
d327cdba07 gpu-compute: add gpu_isa.hh to switch hdrs, add GPUISA to WF
the GPUISA class is meant to encapsulate any ISA-specific behavior - special
register accesses, isa-specific WF/kernel state, etc. - in a generic enough
way so that it may be used in ISA-agnostic code.

gpu-compute: use the GPUISA object to advance the PC

the GPU model treats the PC as a pointer to individual instruction objects -
which are store in a contiguous array - and not a byte address to be fetched
from the real memory system. this is ok for HSAIL because all instructions
are considered by the model to be the same size.

in machine ISA, however, instructions may be 32b or 64b, and branches are
calculated by advancing the PC by the number of words (4 byte chunks) it
needs to advance in the real instruction stream. because of this there is
a mismatch between the PC we use to index into the instruction array, and
the actual byte address PC the ISA expects. here we move the PC advance
calculation to the ISA so that differences in the instrucion sizes may be
accounted for in generic way.
2016-10-26 22:47:38 -04:00
Tony Gutierrez
c7a79c9a42 gpu-compute, hsail: call discardFetch() from the WF
because every taken branch causes fetch to be discarded, we move the call
to the WF to avoid to have to call it from each and every branch instruction
type.
2016-10-26 22:47:27 -04:00
Tony Gutierrez
00a6346c91 hsail, gpu-compute: remove doGm/SmReturn add completeAcc
we are removing doGmReturn from the GM pipe, and adding completeAcc()
implementations for the HSAIL mem ops. the behavior in doGmReturn is
dependent on HSAIL and HSAIL mem ops, however the completion phase
of memory ops in machine ISA can be very different, even amongst individual
machine ISA mem ops. so we remove this functionality from the pipeline and
allow it to be implemented by the individual instructions.
2016-10-26 22:47:19 -04:00
Tony Gutierrez
7ac38849ab gpu-compute: remove inst enums and use bit flag for attributes
this patch removes the GPUStaticInst enums that were defined in GPU.py.
instead, a simple set of attribute flags that can be set in the base
instruction class are used. this will help unify the attributes of HSAIL
and machine ISA instructions within the model itself.

because the static instrution now carries the attributes, a GPUDynInst
must carry a pointer to a valid GPUStaticInst so a new static kernel launch
instruction is added, which carries the attributes needed to perform a
the kernel launch.
2016-10-26 22:47:11 -04:00
Tony Gutierrez
e1ad8035a3 gpu-compute: move disassemle() implementation to GPUStaticInst 2016-10-26 22:47:05 -04:00
Tony Gutierrez
0a6cdff176 gpu-compute, arch: add some methods to the base inst classes for ISA support 2016-10-26 22:47:01 -04:00
Fernando Endo
6c72c35519 cpu, arm: Distinguish Float* and SimdFloat*, create FloatMem* opClass
Modify the opClass assigned to AArch64 FP instructions from SimdFloat* to
Float*. Also create the FloatMemRead and FloatMemWrite opClasses, which
distinguishes writes to the INT and FP register banks.
Change the latency of (Simd)FloatMultAcc to 5, based on the Cortex-A72,
where the "latency" of FMADD is 3 if the next instruction is a FMADD and
has only the augend to destination dependency, otherwise it's 7 cycles.

Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-10-15 14:58:45 -05:00
Mitch Hayenga
bd0c2d5b0b isa,arm: Add missing AArch32 FP instructions
This commit adds missing non-predicated, scalar floating point
instructions.  Specifically VRINT* floating point integer rounding
instructions and VSEL* floating point conditional selects.

Change-Id: I23cbd1389f151389ac8beb28a7d18d5f93d000e7
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nathanael Premillieu <nathanael.premillieu@arm.com>
2016-10-13 19:22:10 +01:00
Alexandru Dutu
3f0118876f kvm: Adding details to kvm page fault in x86
Adding details, e.g. rip, rsp etc. to the kvm pagefault exit when in SE mode.
2016-10-04 13:06:05 -04:00
Alexandru Dutu
68127ca3da hsail: Fix disassembly of load instruction with 3 destination operands 2016-09-16 12:36:20 -04:00
Alexandru Dutu
e9b14d5111 gpu-compute: Refactoring Wavefront::dynWaveId 2016-09-16 12:31:46 -04:00
Alexandru Dutu
589e13a23b gpu-compute: Wavefront refactoring
Renaming members of the Wavefront class in accordance with the style guide.
2016-09-16 12:26:52 -04:00
Ricardo Alves
e5c1488cb6 arm: Add m5_fail support for aarch64
Change-Id: Id2acbc09772be310a0eb9e33295afab07e08a4fa
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-09-15 18:21:24 +01:00
Michael LeBeane
2c43a21687 x86: Force strict ordering for memory mapped m5ops
Normal MMAPPED_IPR requests are allowed to execute speculatively under the
assumption that they have no side effects.  The special case of m5ops that are
treated like MMAPPED_IPR should not be allowed to execute speculatively, since
they can have side-effects.  Adding the STRICT_ORDER flag to these requests
blocks execution until the associated instruction hits the ROB head.
2016-09-13 23:18:34 -04:00
Nikos Nikoleris
698767e538 cpu, arch: fix the type used for the request flags
Change-Id: I183b9942929c873c3272ce6d1abd4ebc472c7132
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-08-15 12:00:35 +01:00
Tony Gutierrez
fa5e64987e sim: fix issues with pwrite(); don't enable fstatfs
this patch fixes issues with changeset 11593

use the host's pwrite() syscall for pwrite64Func(),
as opposed to pwrite64(), because pwrite64() does
not work well on all distros.

undo the enabling of fstatfs, as we will add this
in a separate pate.
2016-08-05 17:15:19 -04:00
Tony Gutierrez
0b68475b10 x86, sim: add some syscalls to X86
this patch adds an implementation for the pwrite64 syscall and
enables it for x86_64, and enables fstatfs for x86_64.
2016-08-04 12:32:21 -04:00
Curtis Dunham
8b3434a4f2 arm: refactor page table walking
Introduce and use a lookup table.

Using fetchDescriptor() rather than DMA cleanly handles nested paging.

Change-Id: I69ec762f176bd752ba1040890e731826b58d15a6
2016-08-02 10:38:03 +01:00
Dylan Johnson
09218ed397 arm: warn not fail on use of missing miscreg CNTHCTL_EL2
During host bootup, KVM reads/writes to CNTHCTL_EL2. Because this
miscreg has not been implemented, the simulation would end there. This
patch causes the simulation to warn about the read/write instead of fail.

Change-Id: If034bfd0818a9a5e50c5fe86609e945258c96fa3
2016-08-02 10:38:03 +01:00
Dylan Johnson
c15711725d arm: Check TLB stage 2 permissions in AArch64
This fixes a bug where stage 2 lookups used the AArch32
permissions rules even if we were executing in AArch64 mode.

Change-Id: Ia40758f0599667ca7ca15268bd3bf051342c24c1
2016-08-02 10:38:03 +01:00
Dylan Johnson
bce923c189 arm: correctly assign faulting IPA's to HPFAR_EL2
This patch corrects IPA reporting if the translation faults in a
stage 2 lookup.

Change-Id: I0b914527f8a9f98a5e980a131cf9d03e5584b4e9
2016-08-02 10:38:03 +01:00
Dylan Johnson
4d5d47c173 arm: Add TLBI instruction for stage 2 IPA's
This patch adds support for stage 2 TLBI instructions
such as TLBI IPAS2E1_Xt.

Change-Id: I0cd5e8055b0c1003e03439aa5183252f50ea0a88
2016-08-02 10:38:03 +01:00
Dylan Johnson
89511856fe arm: Fix stage 2 memory attribute checking in AArch64
Change-Id: I14c93a5460550051a12129e792a9a9bd522a145c
2016-08-02 10:38:03 +01:00
Dylan Johnson
02fcca9b6f arm: Fix trapping to Hypervisor during MSR/MRS read/write
This patch restricts trapping to hypervisor only if we are in the
correct exception level for the trap to happen.

Change-Id: I0a382b6a572ef835ea36d2702b8a81b633bd3df0
2016-08-02 10:38:03 +01:00
Dylan Johnson
c2271e301d arm: Fix secure state checking in various places
Faults that could potentially be routed to the hypervisor checked
whether or not they were in a secure state without checking if security
was enabled or not. This caused faults not to be routed correctly. This
patch causes secure state checking to first ask if security is enabled.

Change-Id: I179e9b181b27f552734c9bab2b18d05ac579a119
2016-08-02 10:38:02 +01:00
Dylan Johnson
996c1ed33c arm: Fix stage 2 determination in table walker
We recompute if we are doing a stage 2 walk inside of the table walker
but we have already figured it out in the tlb. Pass the information in
to the walk instead of recomputing it.

Change-Id: I39637ce99309b2ddbc30344d45ac9ebf6a203401
2016-08-02 10:38:02 +01:00
Dylan Johnson
eac27759e7 arm: Refactor aarch64 table walk logic to remove redundancy
The functional case is already handled within the fetchDescriptor()
function. We can thus use that function for both atomic and functional
mode when we start the table walk.

Change-Id: Iacaed28cd9024d259fd37a58150efd00ff94d86e
2016-08-02 10:38:02 +01:00
Dylan Johnson
f9a6f68e0b arm: Add check to fault routing for hypervisor/virtualization
This patch adds the option for faults to be routed to the hypervisor
using the pre-existing routeToHyp() functions that are present in each
fault type.

Change-Id: I9735512c094457636b9870456a5be5432288e004
2016-08-02 10:38:02 +01:00
Dylan Johnson
fc6879097b arm: Fix EL perceived at TLB for address translation instructions
During address translation instructions (such as AT S1E1R_Xt) the exception
level can be different than the current exception level. This patch fixes
how the TLB determines what EL to use during these instructions.

Change-Id: Ia9ce229404de9e284bc1f7479fd2c580efd55f8f
2016-08-02 10:38:02 +01:00
Dylan Johnson
2950a95672 arm: Add AArch64 hypervisor call instruction 'hvc'
This patch adds the AArch64 instruction hvc which raises an exception
from EL1 into EL2. The host OS uses this instruction to world switch
into the guest.

Change-Id: I930ee43f4f0abd4b35a68eb2a72e44e3ea6570be
2016-08-02 10:38:02 +01:00
Dylan Johnson
c53a57f74f arm: add stage2 translation support
Change-Id: I8f7c09c7ec3a97149ebebf4b21471b244e6cecc1
2016-08-02 10:38:02 +01:00
Curtis Dunham
49538a7118 arm: enable EL2 support
Change-Id: I59fa4fae98c33d9e5c2185382e1411911d27d341
2016-08-02 10:38:01 +01:00
Dylan Johnson
4fbf40daab arm: invalidate TLB miscreg cache on modification of HSCTLR
Change-Id: I5212c91c56435fe008950ed99feacc6921609226
2016-08-02 10:38:01 +01:00
Dylan Johnson
e727a0eeaa arm: change instruction classes to catch hyp traps
Change-Id: I122918d0e3dfd01ae1a4ca4f19240a069115c8b7
2016-08-02 10:38:01 +01:00
Mitch Hayenga
8a476d387c isa: Modify get/check interrupt routines
Make it so that getInterrupt *always* returns an interrupt if
checkInterrupts() returns true.  This fixes/simplifies handling
of interrupts on the SMT FS CPUs (currently minor).
2016-07-21 17:19:15 +01:00
Andreas Sandberg
f471cc22f0 arm: Don't consult the TLB test iface for functional translations
Don't consult the TLB test interface for PA's returned by functional
translations by the AT instruction. We implement this by chaning the
ISA code to synthesize 0-length functional reads for the TLB lookup.
The TLB then bypasses the final PA check in the tester if the size is
zero.

Change-Id: I2487b7f829cea88c37e229e9fc7a4543aced961b
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2016-07-11 10:39:56 +01:00
Nikos Nikoleris
1fac3a292a arm: Mark uninitialized new TLB entries as not valid
Previously when we initialized the TLB we would allocate a number of
TLB entries which would be marked as valid. As a result the TLB
contained an entry which would be considered a valid entry for the 0
page.

Change-Id: I23ace86426a171a4f6200ebeb29ad57c21647036
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-06-20 15:51:31 +01:00
Andreas Sandberg
37bb0d0fb3 kern, arm: Dump dmesg on kernel panic/oops
Add helper functions to dump the guest kernel's dmesg buffer to a text
file in m5out. This functionality is split into two parts. First, a
dmesg dump function that can be used in other places:

void Linux::dumpDmesg(ThreadContext *, std::ostream &)

This function is used to implement two PCEvents: DmesgDumpEvent and
KernelPanic event. The only difference between the two is that the
latter produces a gem5 panic instead of a warning in addition to
dumping the kernel log.

Change-Id: I6d2af1d666ace57124089648ea906f6c787ac63c
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
2016-06-20 14:39:49 +01:00
Tuan Ta
bb9033c26b gpu-compute: Fixed a bug in decoding Atomic ST
There is a mismatch between DataType and SrcDataType in constructing
Atomic ST instruction. The mismatch causes atomic_store and
atomic_store_explicit function to store incorrect value in memory.
2016-06-18 13:02:13 -04:00