diff --git a/LICENSE b/LICENSE
index 88190f1f55..78875df735 100644
--- a/LICENSE
+++ b/LICENSE
@@ -1,4 +1,4 @@
-Copyright (c) 2000-2008 The Regents of The University of Michigan
+Copyright (c) 2000-2011 The Regents of The University of Michigan
 All rights reserved.
 
 Redistribution and use in source and binary forms, with or without
diff --git a/README b/README
index f8eef7417d..3b6a3f6bd9 100644
--- a/README
+++ b/README
@@ -1,4 +1,4 @@
-This is release 2.0_beta6 of the M5 simulator.
+This is the M5 simulator.
 
 For detailed information about building the simulator and getting
 started please refer to http://www.m5sim.org.
@@ -9,13 +9,16 @@ http://www.m5sim.org/wiki/index.php/Running_M5
 
 Short version:
 
-1. If you don't have SCons version 0.96.91 or newer, get it from
+1. If you don't have SCons version 0.98.1 or newer, get it from
 http://wwww.scons.org.
 
-2. If you don't have SWIG version 1.3.28 or newer, get it from
+2. If you don't have SWIG version 1.3.31 or newer, get it from
 http://wwww.swig.org.
 
-3. In this directory, type 'scons build/ALPHA_SE/tests/debug/quick'.  This
+3. Make sure you also have gcc version 3.4.6 or newer, Python 2.4 or newer
+(the dev version with header files), zlib, and the m4 preprocessor.
+
+4. In this directory, type 'scons build/ALPHA_SE/tests/debug/quick'.  This
 will build the debug version of the m5 binary (m5.debug) for the Alpha
 syscall emulation target, and run the quick regression tests on it.
 
@@ -25,18 +28,21 @@ WHAT'S INCLUDED (AND NOT)
 -------------------------
 
 The basic source release includes these subdirectories:
- - m5: 
-   - src: source code of the m5 simulator
-   - tests: regression tests
+ - m5:
+   - configs: simulation configuration scripts
    - ext: less-common external packages needed to build m5
+   - src: source code of the m5 simulator
+   - system: source for some optional system software for simulated systems
+   - tests: regression tests
+   - util: useful utility programs and files
 
-To run full-system simulations, you will need compiled console,
-PALcode, and kernel binaries and one or more disk images.  These files
-are collected in a separate archive, m5_system.tar.bz2.  This file
-can he downloaded separately.
+To run full-system simulations, you will need compiled system firmware
+(console and PALcode for Alpha), kernel binaries, and one or more disk images.
+These files for Alpha are collected in a separate archive, m5_system.tar.bz2.
+This file can be downloaded separately.
 
-M5 supports Linux 2.4/2.6, FreeBSD, and the proprietary Compaq/HP
-Tru64 version of Unix. We are able to distribute Linux and FreeBSD
-bootdisks, but we are unable to distribute bootable disk images of
+Depending on the ISA used, M5 may support Linux 2.4/2.6, FreeBSD, and the
+proprietary Compaq/HP Tru64 version of Unix. We are able to distribute Linux
+and FreeBSD bootdisks, but we are unable to distribute bootable disk images of
 Tru64 Unix. If you have a Tru64 license and are interested in
 obtaining disk images, contact us at m5-users@m5sim.org
diff --git a/RELEASE_NOTES b/RELEASE_NOTES
deleted file mode 100644
index f10ffddae6..0000000000
--- a/RELEASE_NOTES
+++ /dev/null
@@ -1,149 +0,0 @@
-October 6, 2008: m5_2.0_beta6
---------------------
-New Features
-1. Support for gcc 4.3
-2. Core m5 code in libm5 for integration with other simulators
-3. Preliminary support for X86 SE mode
-4. Additional system calls emulated
-5. m5term updated to work on OS X
-6. Ability to disable listen sockets
-7. Event queue performance improvements and rewrite
-8. Better errors for unconnected memory ports
-
-Bug fixes
-1. ALPHA_SE O3 perlbmk benchmark
-2. Translation bug where O3 could fetch from uncachable memory
-3. Many minor bugs
-
-Outstanding issues for 2.0 release:
---------------------
-1. Statistics cleanup
-2. Improve regression system
-3. Testing
-4. Validation
-
-March 1, 2008: m5_2.0_beta5
---------------------
-New Features
-1. Rick Strong's Simpoints config changes
-2. Support for FSU ARM port
-3. EXTRAS= option allow architectures to be specified
-
-Bug fixes
-1. Bus timing more realistic
-2. Cache writeback, LL/SC fixes
-3. Minor IGbE NIC fixes
-4. O3 op latency fix
-5. SPARC TLB demap fixes
-6. SPARC SE memory layout fixes
-7. Variety of MIPS fixes
-
-Nov 4, 2007: m5_2.0_beta4
---------------------
-New Features
-1. New cache model
-2. Use of a I/O cache between devices and memory 
-3. Ability to include compiled code with EXTRAS=
-4. Python creation of params structures for initialization
-5. Ability to remotely debug in SE 
-
-Bug fixes:
-1. Fix SE serialization
-2. SPARC_FS booting with TimingSimpleCPU
-3. Rename cycles() to ticks()
-4. Various SPARC ISA fixes
-5. Draining code for checkpointing
-6. Various performance improvements
-
-Possible Incompatibilities:
-1. Real TLBs are now used in SE mode. This is more accurate however it could
-   cause some problems if you've modified the way page handling is done in
-   SE mode.
-2. There have been many changes to the way the SCons files work. SimObjects,
-   sources files, and trace flags are all specified in the SConscript files.
-   To see how to add your sources take a look at one of them.
-3. Python is now used to created the parameter structs that were created
-   manually before. The parameters listed in a py file are turned into 
-   a header file with the same name (e.g. BadDevice.py -> BadDevice.hh). 
-   With this change the structs can be populated automatically and the 
-   ugly macros to define and create SimObjects at the bottem of source
-   files are gone. The parameter structs also automatically inherit 
-   parameters from their parents. 
-
-May 16, 2007: m5_2.0_beta3
---------------------
-New Features
-1. Some support for SPARC full-system simulation
-2. Reworking of trace facitities (parameter names changed, variadic macros
-   removed)
-3. Scons script cleanups
-4. Some support for compiling with Intel CC
-
-Bug fixes since beta 2:
-1. Many SPARC linux syscall emulation support fixes
-2. Multiprocessor linux boot using the detailed O3 CPU module
-3. Workaround for DMA bug (final solution to be released with 2.0f)
-4. Simulator performance and memory leak fixes
-5. Fixed issue where console could stop printing in ALPHA_FS
-6. Fix issues with remote debugging
-7. Several compile fixes, including gcc 4.1
-8. Many other minor fixes and enhancements 
-		
-Nov. 28, 2006: m5_2.0_beta2
---------------------
-Bug fixes since beta 1:
-1. Many cache issues resolved
-2. Uni-coherence fixes in full-system
-3. LL/SC Support
-4. Draining/Switchover
-5. Functional Accesses
-6. Bus now has real timing
-7. Single config file for all SpecCPU2000 benchmarks
-8. Several other minor bug fixes and enhancements
-
-Aug. 25, 2006: m5_2.0_beta patch 1
---------------------
-Handful of minor bug fixes for m5_2.0_beta,
-along with a few new regression tests.
-
-Aug. 15, 2006: m5_2.0_beta
---------------------
-Major update to M5 including:
-- New CPU model
-- New memory system
-- More extensive python integration
-- Preliminary syscall emulation support for MIPS and SPARC
-This is a *beta* release, meaning that some features are not complete,
-and some features from M5 1.X aren't currently supported (e.g., MP
-coherence).  We are working to address these limitations and hope to
-have a complete 2.0 release soon.
-
-Oct. 8, 2005: m5_1.1
---------------------
-Update release for IOSCA workshop mini-tutorial.  New features include:
-- Preliminary FreeBSD support
-- Integration of regression tests into scons build framework
-- Several bug fixes and better compatibility for Cygwin hosts
-- Major cleanup of Alpha system code (console, PAL, etc.) to make
-  it easier for others to build/modify
-- Fixes to enable compilation under g++ 4.0 
-- Numerous minor bug fixes
-
-June 10, 2005: m5_1.0_web
--------------------------
-The 1.0 release posted on Sourceforge after the ISCA tutorial contains
-just a few very minor fixes relative to the CD.
-
-June 5, 2005: m5_1.0_tutorial
------------------------------
-First non-beta release.  This release was on the CD distributed at the
-ISCA tutorial.  Major enhancements relative to beta releases include
-Linux support and Python-based configuration language.
-
-June 17, 2004: m5_1.0_beta2
----------------------------
-Stealth-mode beta bug-fix update, not widely advertised.
-
-Oct. 17, 2003: m5_1.0_beta1
----------------------------
-Early beta release.
diff --git a/configs/common/FSConfig.py b/configs/common/FSConfig.py
index 9e5fd3a0b1..f58fd3d2e9 100644
--- a/configs/common/FSConfig.py
+++ b/configs/common/FSConfig.py
@@ -238,6 +238,7 @@ def makeLinuxArmSystem(mem_mode, mdesc = None, bare_metal=False,
 
     self.intrctrl = IntrControl()
     self.terminal = Terminal()
+    self.vncserver = VncServer()
     self.kernel = binary('vmlinux.arm')
     self.boot_osflags = 'earlyprintk mem=128MB console=ttyAMA0 lpj=19988480' + \
                         ' norandmaps slram=slram0,0x8000000,+0x8000000' +      \
diff --git a/src/SConscript b/src/SConscript
index cad0736c54..0ee1447476 100755
--- a/src/SConscript
+++ b/src/SConscript
@@ -446,7 +446,7 @@ def makeInfoPyFile(target, source, env):
 
 # Generate a file that wraps the basic top level files
 env.Command('python/m5/info.py',
-            [ '#/AUTHORS', '#/LICENSE', '#/README', '#/RELEASE_NOTES' ],
+            [ '#/AUTHORS', '#/LICENSE', '#/README' ],
             MakeAction(makeInfoPyFile, Transform("INFO")))
 PySource('m5', 'python/m5/info.py')
 
diff --git a/src/arch/arm/table_walker.cc b/src/arch/arm/table_walker.cc
index 6b2113639a..e6dd728dda 100644
--- a/src/arch/arm/table_walker.cc
+++ b/src/arch/arm/table_walker.cc
@@ -208,19 +208,20 @@ TableWalker::processWalk()
         return f;
     }
 
+    Request::Flags flag = 0;
+    if (currState->sctlr.c == 0) {
+        flag = Request::UNCACHEABLE;
+    }
+
     if (currState->timing) {
         port->dmaAction(MemCmd::ReadReq, l1desc_addr, sizeof(uint32_t),
                 &doL1DescEvent, (uint8_t*)&currState->l1Desc.data,
-                currState->tc->getCpuPtr()->ticks(1));
+                currState->tc->getCpuPtr()->ticks(1), flag);
         DPRINTF(TLBVerbose, "Adding to walker fifo: queue size before adding: %d\n",
                 stateQueueL1.size());
         stateQueueL1.push_back(currState);
         currState = NULL;
     } else {
-        Request::Flags flag = 0;
-        if (currState->sctlr.c == 0){
-           flag = Request::UNCACHEABLE;
-        }
         port->dmaAction(MemCmd::ReadReq, l1desc_addr, sizeof(uint32_t),
                 NULL, (uint8_t*)&currState->l1Desc.data,
                 currState->tc->getCpuPtr()->ticks(1), flag);
@@ -472,7 +473,7 @@ TableWalker::doL1Descriptor()
     switch (currState->l1Desc.type()) {
       case L1Descriptor::Ignore:
       case L1Descriptor::Reserved:
-        if (!currState->delayed) {
+        if (!currState->timing) {
             currState->tc = NULL;
             currState->req = NULL;
         }
@@ -577,7 +578,7 @@ TableWalker::doL2Descriptor()
 
     if (currState->l2Desc.invalid()) {
         DPRINTF(TLB, "L2 descriptor invalid, causing fault\n");
-        if (!currState->delayed) {
+        if (!currState->timing) {
             currState->tc = NULL;
             currState->req = NULL;
         }
@@ -622,7 +623,7 @@ TableWalker::doL2Descriptor()
     memAttrs(currState->tc, te, currState->sctlr, currState->l2Desc.texcb(),
              currState->l2Desc.shareable());
 
-    if (!currState->delayed) {
+    if (!currState->timing) {
         currState->tc = NULL;
         currState->req = NULL;
     }
diff --git a/src/arch/arm/table_walker.hh b/src/arch/arm/table_walker.hh
index 267a7ad260..96a39cc619 100644
--- a/src/arch/arm/table_walker.hh
+++ b/src/arch/arm/table_walker.hh
@@ -93,14 +93,14 @@ class TableWalker : public MemObject
         {
             if (supersection())
                 panic("Super sections not implemented\n");
-            return mbits(data, 31,20);
+            return mbits(data, 31, 20);
         }
         /** Return the physcal address of the entry, bits in position*/
         Addr paddr(Addr va) const
         {
             if (supersection())
                 panic("Super sections not implemented\n");
-            return mbits(data, 31,20) | mbits(va, 20, 0);
+            return mbits(data, 31, 20) | mbits(va, 19, 0);
         }
 
 
@@ -109,7 +109,7 @@ class TableWalker : public MemObject
         {
             if (supersection())
                 panic("Super sections not implemented\n");
-            return bits(data, 31,20);
+            return bits(data, 31, 20);
         }
 
         /** Is the translation global (no asid used)? */
@@ -127,19 +127,19 @@ class TableWalker : public MemObject
         /** Three bit access protection flags */
         uint8_t ap() const
         {
-            return (bits(data, 15) << 2) | bits(data,11,10);
+            return (bits(data, 15) << 2) | bits(data, 11, 10);
         }
 
         /** Domain Client/Manager: ARM DDI 0406B: B3-31 */
         uint8_t domain() const
         {
-            return bits(data,8,5);
+            return bits(data, 8, 5);
         }
 
         /** Address of L2 descriptor if it exists */
         Addr l2Addr() const
         {
-            return mbits(data, 31,10);
+            return mbits(data, 31, 10);
         }
 
         /** Memory region attributes: ARM DDI 0406B: B3-32.
@@ -149,7 +149,7 @@ class TableWalker : public MemObject
          */
         uint8_t texcb() const
         {
-            return bits(data, 2) | bits(data,3) << 1 | bits(data, 14, 12) << 2;
+            return bits(data, 2) | bits(data, 3) << 1 | bits(data, 14, 12) << 2;
         }
 
         /** If the section is shareable. See texcb() comment. */
@@ -187,7 +187,7 @@ class TableWalker : public MemObject
         /** Is the entry invalid */
         bool invalid() const
         {
-            return bits(data, 1,0) == 0;;
+            return bits(data, 1, 0) == 0;
         }
 
         /** What is the size of the mapping? */
@@ -218,8 +218,8 @@ class TableWalker : public MemObject
         uint8_t texcb() const
         {
             return large() ?
-                (bits(data, 2) | (bits(data,3) << 1) | (bits(data, 14, 12) << 2)) :
-                (bits(data, 2) | (bits(data,3) << 1) | (bits(data, 8, 6) << 2));
+                (bits(data, 2) | (bits(data, 3) << 1) | (bits(data, 14, 12) << 2)) :
+                (bits(data, 2) | (bits(data, 3) << 1) | (bits(data, 8, 6) << 2));
         }
 
         /** Return the physical frame, bits shifted right */
diff --git a/src/arch/arm/tlb.cc b/src/arch/arm/tlb.cc
index e5f5b36f63..230c562001 100644
--- a/src/arch/arm/tlb.cc
+++ b/src/arch/arm/tlb.cc
@@ -696,6 +696,8 @@ TLB::translateTiming(RequestPtr req, ThreadContext *tc,
 #endif
     if (!delay)
         translation->finish(fault, req, tc, mode);
+    else
+        translation->markDelayed();
     return fault;
 }
 
diff --git a/src/arch/generic/debugfaults.hh b/src/arch/generic/debugfaults.hh
new file mode 100644
index 0000000000..acffadc343
--- /dev/null
+++ b/src/arch/generic/debugfaults.hh
@@ -0,0 +1,111 @@
+/*
+ * Copyright (c) 2010 Advanced Micro Devices
+ * All rights reserved.
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: Gabe Black
+ */
+
+#ifndef __ARCH_GENERIC_DEBUGFAULTS_HH__
+#define __ARCH_GENERIC_DEBUGFAULTS_HH__
+
+#include "base/misc.hh"
+#include "sim/faults.hh"
+
+#include <string>
+
+namespace GenericISA
+{
+class M5DebugFault : public FaultBase
+{
+  public:
+    enum DebugFunc
+    {
+        PanicFunc,
+        FatalFunc,
+        WarnFunc,
+        WarnOnceFunc
+    };
+
+  protected:
+    std::string message;
+    DebugFunc func;
+
+  public:
+    M5DebugFault(DebugFunc _func, std::string _message) :
+        message(_message), func(_func)
+    {}
+
+    FaultName
+    name() const
+    {
+        switch (func) {
+          case PanicFunc:
+            return "panic fault";
+          case FatalFunc:
+            return "fatal fault";
+          case WarnFunc:
+            return "warn fault";
+          case WarnOnceFunc:
+            return "warn_once fault";
+          default:
+            panic("unrecognized debug function number\n");
+        }
+    }
+
+    void
+    invoke(ThreadContext *tc,
+            StaticInstPtr inst = StaticInst::nullStaticInstPtr)
+    {
+        switch (func) {
+          case PanicFunc:
+            panic(message);
+            break;
+          case FatalFunc:
+            fatal(message);
+            break;
+          case WarnFunc:
+            warn(message);
+            break;
+          case WarnOnceFunc:
+            warn_once(message);
+            break;
+          default:
+            panic("unrecognized debug function number\n");
+        }
+    }
+};
+} // namespace GenericISA
+
+#endif // __ARCH_GENERIC_DEBUGFAULTS_HH__
diff --git a/src/arch/mips/isa/decoder.isa b/src/arch/mips/isa/decoder.isa
index 173fa89dfc..d97a141de4 100644
--- a/src/arch/mips/isa/decoder.isa
+++ b/src/arch/mips/isa/decoder.isa
@@ -367,21 +367,7 @@ decode OPCODE_HI default Unknown::unknown() {
             }});
             0x1: addiu({{ Rt.sw = Rs.sw + imm; }});
             0x2: slti({{ Rt.sw = (Rs.sw < imm) ? 1 : 0 }});
-
-            //Edited to include MIPS AVP Pass/Fail instructions and
-            //default to the sltiu instruction
-            0x3: decode RS_RT_INTIMM {
-                0xabc1: BasicOp::fail({{
-                    exitSimLoop("AVP/SRVP Test Failed");
-                }});
-                0xabc2: BasicOp::pass({{
-                    exitSimLoop("AVP/SRVP Test Passed");
-                }});
-                default: sltiu({{
-                    Rt.uw = (Rs.uw < (uint32_t)sextImm) ? 1 : 0;
-                }});
-            }
-
+            0x3: sltiu({{ Rt.uw = (Rs.uw < (uint32_t)sextImm) ? 1 : 0;}});
             0x4: andi({{ Rt.sw = Rs.sw & zextImm; }});
             0x5: ori({{ Rt.sw = Rs.sw | zextImm; }});
             0x6: xori({{ Rt.sw = Rs.sw ^ zextImm; }});
diff --git a/src/arch/x86/SConscript b/src/arch/x86/SConscript
index 27de9da11a..9cb7746475 100644
--- a/src/arch/x86/SConscript
+++ b/src/arch/x86/SConscript
@@ -46,6 +46,7 @@ if env['TARGET_ISA'] == 'x86':
     Source('cpuid.cc')
     Source('emulenv.cc')
     Source('faults.cc')
+    Source('insts/badmicroop.cc')
     Source('insts/microfpop.cc')
     Source('insts/microldstop.cc')
     Source('insts/micromediaop.cc')
diff --git a/src/arch/x86/insts/badmicroop.cc b/src/arch/x86/insts/badmicroop.cc
new file mode 100644
index 0000000000..ef493f250d
--- /dev/null
+++ b/src/arch/x86/insts/badmicroop.cc
@@ -0,0 +1,55 @@
+/*
+ * Copyright (c) 2011 Advanced Micro Devices
+ * All rights reserved.
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: Gabe Black
+ */
+
+#include "arch/x86/insts/badmicroop.hh"
+#include "arch/x86/isa_traits.hh"
+#include "arch/x86/decoder.hh"
+
+namespace X86ISA
+{
+
+// This microop needs to be allocated on the heap even though it could
+// theoretically be statically allocated. The reference counted pointer would
+// try to delete the static memory when it was destructed.
+const StaticInstPtr badMicroop =
+    new X86ISAInst::MicroPanic(NoopMachInst, "BAD",
+        StaticInst::IsMicroop | StaticInst::IsLastMicroop,
+        "Invalid microop!", 0);
+
+} // namespace X86ISA
diff --git a/src/arch/x86/insts/badmicroop.hh b/src/arch/x86/insts/badmicroop.hh
new file mode 100644
index 0000000000..57fe242c49
--- /dev/null
+++ b/src/arch/x86/insts/badmicroop.hh
@@ -0,0 +1,52 @@
+/*
+ * Copyright (c) 2011 Advanced Micro Devices
+ * All rights reserved.
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: Gabe Black
+ */
+
+#ifndef __ARCH_X86_INSTS_BADMICROOP_HH__
+#define __ARCH_X86_INSTS_BADMICROOP_HH__
+
+class StaticInstPtr;
+
+namespace X86ISA
+{
+
+extern const StaticInstPtr badMicroop;
+
+} // namespace X86ISA
+
+#endif // __ARCH_X86_INSTS_BADMICROOP_HH__
diff --git a/src/arch/x86/insts/macroop.hh b/src/arch/x86/insts/macroop.hh
index fcf051a371..4f4176b770 100644
--- a/src/arch/x86/insts/macroop.hh
+++ b/src/arch/x86/insts/macroop.hh
@@ -41,6 +41,7 @@
 #define __ARCH_X86_INSTS_MACROOP_HH__
 
 #include "arch/x86/emulenv.hh"
+#include "arch/x86/insts/badmicroop.hh"
 #include "arch/x86/types.hh"
 #include "arch/x86/insts/static_inst.hh"
 
@@ -76,8 +77,10 @@ class MacroopBase : public X86StaticInst
     StaticInstPtr
     fetchMicroop(MicroPC microPC) const
     {
-        assert(microPC < numMicroops);
-        return microops[microPC];
+        if (microPC >= numMicroops)
+            return badMicroop;
+        else
+            return microops[microPC];
     }
 
     std::string
diff --git a/src/arch/x86/insts/microregop.cc b/src/arch/x86/insts/microregop.cc
index 6aee874493..dedea0f3d6 100644
--- a/src/arch/x86/insts/microregop.cc
+++ b/src/arch/x86/insts/microregop.cc
@@ -50,9 +50,6 @@ namespace X86ISA
             bool subtract) const
     {
         DPRINTF(X86, "flagMask = %#x\n", flagMask);
-        if (_destRegIdx[0] & IntFoldBit) {
-            _dest >>= 8;
-        }
         uint64_t flags = oldFlags & ~flagMask;
         if(flagMask & (ECFBit | CFBit))
         {
diff --git a/src/arch/x86/isa/includes.isa b/src/arch/x86/isa/includes.isa
index 58b1fbc626..674e69e982 100644
--- a/src/arch/x86/isa/includes.isa
+++ b/src/arch/x86/isa/includes.isa
@@ -53,6 +53,7 @@ output header {{
 #include <sstream>
 #include <iostream>
 
+#include "arch/generic/debugfaults.hh"
 #include "arch/x86/emulenv.hh"
 #include "arch/x86/insts/macroop.hh"
 #include "arch/x86/insts/microfpop.hh"
@@ -113,6 +114,7 @@ output exec {{
 #include "arch/x86/regs/misc.hh"
 #include "arch/x86/tlb.hh"
 #include "base/bigint.hh"
+#include "base/compiler.hh"
 #include "base/condcodes.hh"
 #include "cpu/base.hh"
 #include "cpu/exetrace.hh"
diff --git a/src/arch/x86/isa/microops/debug.isa b/src/arch/x86/isa/microops/debug.isa
index 4b2ecdd5aa..220c1af972 100644
--- a/src/arch/x86/isa/microops/debug.isa
+++ b/src/arch/x86/isa/microops/debug.isa
@@ -45,16 +45,29 @@ output header {{
     class MicroDebugBase : public X86ISA::X86MicroopBase
     {
       protected:
+        typedef GenericISA::M5DebugFault::DebugFunc DebugFunc;
+        DebugFunc func;
         std::string message;
         uint8_t cc;
 
       public:
-        MicroDebugBase(ExtMachInst _machInst, const char * mnem,
+        MicroDebugBase(ExtMachInst machInst, const char * mnem,
                 const char * instMnem, uint64_t setFlags,
-                std::string _message, uint8_t _cc);
+                DebugFunc _func, std::string _message, uint8_t _cc) :
+            X86MicroopBase(machInst, mnem, instMnem, setFlags, No_OpClass),
+                    func(_func), message(_message), cc(_cc)
+        {}
 
-        std::string generateDisassembly(Addr pc,
-                const SymbolTable *symtab) const;
+        std::string
+        generateDisassembly(Addr pc, const SymbolTable *symtab) const
+        {
+            std::stringstream response;
+
+            printMnemonic(response, instMnem, mnemonic);
+            response << "\"" << message << "\"";
+
+            return response.str();
+        }
     };
 }};
 
@@ -70,53 +83,31 @@ def template MicroDebugDeclare {{
 }};
 
 def template MicroDebugExecute {{
-        Fault %(class_name)s::execute(%(CPU_exec_context)s *xc,
+        Fault
+        %(class_name)s::execute(%(CPU_exec_context)s *xc,
                 Trace::InstRecord *traceData) const
         {
             %(op_decl)s
             %(op_rd)s
             if (%(cond_test)s) {
-                %(func)s("%s\n", message);
+                return new GenericISA::M5DebugFault(func, message);
+            } else {
+                return NoFault;
             }
-            return NoFault;
         }
 }};
 
-output decoder {{
-    inline MicroDebugBase::MicroDebugBase(
-            ExtMachInst machInst, const char * mnem, const char * instMnem,
-            uint64_t setFlags, std::string _message, uint8_t _cc) :
-        X86MicroopBase(machInst, mnem, instMnem,
-                setFlags, No_OpClass),
-                message(_message), cc(_cc)
-    {
-    }
-}};
-
 def template MicroDebugConstructor {{
-    inline %(class_name)s::%(class_name)s(
+    %(class_name)s::%(class_name)s(
             ExtMachInst machInst, const char * instMnem, uint64_t setFlags,
             std::string _message, uint8_t _cc) :
         %(base_class)s(machInst, "%(func)s", instMnem,
-                setFlags, _message, _cc)
+                setFlags, %(func_num)s, _message, _cc)
     {
         %(constructor)s;
     }
 }};
 
-output decoder {{
-    std::string MicroDebugBase::generateDisassembly(Addr pc,
-            const SymbolTable *symtab) const
-    {
-        std::stringstream response;
-
-        printMnemonic(response, instMnem, mnemonic);
-        response << "\"" << message << "\"";
-
-        return response.str();
-    }
-}};
-
 let {{
     class MicroDebug(X86Microop):
         def __init__(self, message, flags=None):
@@ -142,13 +133,14 @@ let {{
     header_output = ""
     decoder_output = ""
 
-    def buildDebugMicro(func):
+    def buildDebugMicro(func, func_num):
         global exec_output, header_output, decoder_output
 
         iop = InstObjParams(func, "Micro%sFlags" % func.capitalize(),
                 "MicroDebugBase",
                 {"code": "",
                  "func": func,
+                 "func_num": "GenericISA::M5DebugFault::%s" % func_num,
                  "cond_test": "checkCondition(ccFlagBits, cc)"})
         exec_output += MicroDebugExecute.subst(iop)
         header_output += MicroDebugDeclare.subst(iop)
@@ -158,6 +150,7 @@ let {{
                 "MicroDebugBase",
                 {"code": "",
                  "func": func,
+                 "func_num": "GenericISA::M5DebugFault::%s" % func_num,
                  "cond_test": "true"})
         exec_output += MicroDebugExecute.subst(iop)
         header_output += MicroDebugDeclare.subst(iop)
@@ -169,8 +162,8 @@ let {{
         global microopClasses
         microopClasses[func] = MicroDebugChild
 
-    buildDebugMicro("panic")
-    buildDebugMicro("fatal")
-    buildDebugMicro("warn")
-    buildDebugMicro("warn_once")
+    buildDebugMicro("panic", "PanicFunc")
+    buildDebugMicro("fatal", "FatalFunc")
+    buildDebugMicro("warn", "WarnFunc")
+    buildDebugMicro("warn_once", "WarnOnceFunc")
 }};
diff --git a/src/arch/x86/isa/microops/ldstop.isa b/src/arch/x86/isa/microops/ldstop.isa
index 216a74c6cb..cd649d6447 100644
--- a/src/arch/x86/isa/microops/ldstop.isa
+++ b/src/arch/x86/isa/microops/ldstop.isa
@@ -301,6 +301,46 @@ let {{
                 "dataSize" : self.dataSize, "addressSize" : self.addressSize,
                 "memFlags" : self.memFlags}
             return allocator
+
+    class BigLdStOp(X86Microop):
+        def __init__(self, data, segment, addr, disp,
+                dataSize, addressSize, baseFlags, atCPL0, prefetch):
+            self.data = data
+            [self.scale, self.index, self.base] = addr
+            self.disp = disp
+            self.segment = segment
+            self.dataSize = dataSize
+            self.addressSize = addressSize
+            self.memFlags = baseFlags
+            if atCPL0:
+                self.memFlags += " | (CPL0FlagBit << FlagShift)"
+            if prefetch:
+                self.memFlags += " | Request::PREFETCH"
+            self.memFlags += " | (machInst.legacy.addr ? " + \
+                             "(AddrSizeFlagBit << FlagShift) : 0)"
+
+        def getAllocator(self, microFlags):
+            allocString = '''
+                (%(dataSize)s >= 4) ?
+                    (StaticInstPtr)(new %(class_name)sBig(machInst,
+                        macrocodeBlock, %(flags)s, %(scale)s, %(index)s,
+                        %(base)s, %(disp)s, %(segment)s, %(data)s,
+                        %(dataSize)s, %(addressSize)s, %(memFlags)s)) :
+                    (StaticInstPtr)(new %(class_name)s(machInst,
+                        macrocodeBlock, %(flags)s, %(scale)s, %(index)s,
+                        %(base)s, %(disp)s, %(segment)s, %(data)s,
+                        %(dataSize)s, %(addressSize)s, %(memFlags)s))
+            '''
+            allocator = allocString % {
+                "class_name" : self.className,
+                "flags" : self.microFlagsText(microFlags),
+                "scale" : self.scale, "index" : self.index,
+                "base" : self.base,
+                "disp" : self.disp,
+                "segment" : self.segment, "data" : self.data,
+                "dataSize" : self.dataSize, "addressSize" : self.addressSize,
+                "memFlags" : self.memFlags}
+            return allocator
 }};
 
 let {{
@@ -315,7 +355,8 @@ let {{
     EA = bits(SegBase + scale * Index + Base + disp, addressSize * 8 - 1, 0);
     '''
 
-    def defineMicroLoadOp(mnemonic, code, mem_flags="0"):
+    def defineMicroLoadOp(mnemonic, code, bigCode='',
+                          mem_flags="0", big=True):
         global header_output
         global decoder_output
         global exec_output
@@ -324,16 +365,22 @@ let {{
         name = mnemonic.lower()
 
         # Build up the all register version of this micro op
-        iop = InstObjParams(name, Name, 'X86ISA::LdStOp',
-                {"code": code,
-                 "ea_code": calculateEA})
-        header_output += MicroLdStOpDeclare.subst(iop)
-        decoder_output += MicroLdStOpConstructor.subst(iop)
-        exec_output += MicroLoadExecute.subst(iop)
-        exec_output += MicroLoadInitiateAcc.subst(iop)
-        exec_output += MicroLoadCompleteAcc.subst(iop)
+        iops = [InstObjParams(name, Name, 'X86ISA::LdStOp',
+                {"code": code, "ea_code": calculateEA})]
+        if big:
+            iops += [InstObjParams(name, Name + "Big", 'X86ISA::LdStOp',
+                     {"code": bigCode, "ea_code": calculateEA})]
+        for iop in iops:
+            header_output += MicroLdStOpDeclare.subst(iop)
+            decoder_output += MicroLdStOpConstructor.subst(iop)
+            exec_output += MicroLoadExecute.subst(iop)
+            exec_output += MicroLoadInitiateAcc.subst(iop)
+            exec_output += MicroLoadCompleteAcc.subst(iop)
 
-        class LoadOp(LdStOp):
+        base = LdStOp
+        if big:
+            base = BigLdStOp
+        class LoadOp(base):
             def __init__(self, data, segment, addr, disp = 0,
                     dataSize="env.dataSize",
                     addressSize="env.addressSize",
@@ -346,12 +393,15 @@ let {{
 
         microopClasses[name] = LoadOp
 
-    defineMicroLoadOp('Ld', 'Data = merge(Data, Mem, dataSize);')
+    defineMicroLoadOp('Ld', 'Data = merge(Data, Mem, dataSize);',
+                            'Data = Mem & mask(dataSize * 8);')
     defineMicroLoadOp('Ldst', 'Data = merge(Data, Mem, dataSize);',
-            '(StoreCheck << FlagShift)')
+                              'Data = Mem & mask(dataSize * 8);',
+                      '(StoreCheck << FlagShift)')
     defineMicroLoadOp('Ldstl', 'Data = merge(Data, Mem, dataSize);',
-            '(StoreCheck << FlagShift) | Request::LOCKED')
-    defineMicroLoadOp('Ldfp', 'FpData.uqw = Mem;')
+                               'Data = Mem & mask(dataSize * 8);',
+                      '(StoreCheck << FlagShift) | Request::LOCKED')
+    defineMicroLoadOp('Ldfp', 'FpData.uqw = Mem;', big = False)
 
     def defineMicroStoreOp(mnemonic, code, \
             postCode="", completeCode="", mem_flags="0"):
diff --git a/src/arch/x86/isa/microops/limmop.isa b/src/arch/x86/isa/microops/limmop.isa
index 2871d5a892..ac78b090dd 100644
--- a/src/arch/x86/isa/microops/limmop.isa
+++ b/src/arch/x86/isa/microops/limmop.isa
@@ -114,8 +114,16 @@ let {{
             self.dataSize = dataSize
 
         def getAllocator(self, microFlags):
-            allocator = '''new %(class_name)s(machInst, macrocodeBlock,
-                    %(flags)s, %(dest)s, %(imm)s, %(dataSize)s)''' % {
+            allocString = '''
+                (%(dataSize)s >= 4) ?
+                    (StaticInstPtr)(new %(class_name)sBig(machInst,
+                        macrocodeBlock, %(flags)s, %(dest)s, %(imm)s,
+                        %(dataSize)s)) :
+                    (StaticInstPtr)(new %(class_name)s(machInst,
+                        macrocodeBlock, %(flags)s, %(dest)s, %(imm)s,
+                        %(dataSize)s))
+            '''
+            allocator = allocString % {
                 "class_name" : self.className,
                 "mnemonic" : self.mnemonic,
                 "flags" : self.microFlagsText(microFlags),
@@ -152,12 +160,15 @@ let {{
 
 let {{
     # Build up the all register version of this micro op
-    iop = InstObjParams("limm", "Limm", 'X86MicroopBase',
-            {"code" : "DestReg = merge(DestReg, imm, dataSize);"})
-    header_output += MicroLimmOpDeclare.subst(iop)
-    decoder_output += MicroLimmOpConstructor.subst(iop)
-    decoder_output += MicroLimmOpDisassembly.subst(iop)
-    exec_output += MicroLimmOpExecute.subst(iop)
+    iops = [InstObjParams("limm", "Limm", 'X86MicroopBase',
+            {"code" : "DestReg = merge(DestReg, imm, dataSize);"}),
+            InstObjParams("limm", "LimmBig", 'X86MicroopBase',
+            {"code" : "DestReg = imm & mask(dataSize * 8);"})]
+    for iop in iops:
+        header_output += MicroLimmOpDeclare.subst(iop)
+        decoder_output += MicroLimmOpConstructor.subst(iop)
+        decoder_output += MicroLimmOpDisassembly.subst(iop)
+        exec_output += MicroLimmOpExecute.subst(iop)
 
     iop = InstObjParams("lfpimm", "Lfpimm", 'X86MicroopBase',
             {"code" : "FpDestReg.uqw = imm"})
diff --git a/src/arch/x86/isa/microops/regop.isa b/src/arch/x86/isa/microops/regop.isa
index ccfcb3a69a..e2a51c1271 100644
--- a/src/arch/x86/isa/microops/regop.isa
+++ b/src/arch/x86/isa/microops/regop.isa
@@ -51,6 +51,8 @@ def template MicroRegOpExecute {{
             %(op_decl)s;
             %(op_rd)s;
 
+            IntReg result M5_VAR_USED;
+
             if(%(cond_check)s)
             {
                 %(code)s;
@@ -79,6 +81,8 @@ def template MicroRegOpImmExecute {{
             %(op_decl)s;
             %(op_rd)s;
 
+            IntReg result M5_VAR_USED;
+
             if(%(cond_check)s)
             {
                 %(code)s;
@@ -224,8 +228,8 @@ let {{
             MicroRegOpExecute)
 
     class RegOpMeta(type):
-        def buildCppClasses(self, name, Name, suffix, \
-                code, flag_code, cond_check, else_code, cond_control_flag_init):
+        def buildCppClasses(self, name, Name, suffix, code, big_code, \
+                flag_code, cond_check, else_code, cond_control_flag_init):
 
             # Globals to stick the output in
             global header_output
@@ -235,11 +239,13 @@ let {{
             # Stick all the code together so it can be searched at once
             allCode = "|".join((code, flag_code, cond_check, else_code, 
                                 cond_control_flag_init))
+            allBigCode = "|".join((big_code, flag_code, cond_check, else_code,
+                                   cond_control_flag_init))
 
             # If op2 is used anywhere, make register and immediate versions
             # of this code.
             matcher = re.compile("(?<!\\w)(?P<prefix>s?)op2(?P<typeQual>\\.\\w+)?")
-            match = matcher.search(allCode)
+            match = matcher.search(allCode + allBigCode)
             if match:
                 typeQual = ""
                 if match.group("typeQual"):
@@ -247,6 +253,7 @@ let {{
                 src2_name = "%spsrc2%s" % (match.group("prefix"), typeQual)
                 self.buildCppClasses(name, Name, suffix,
                         matcher.sub(src2_name, code),
+                        matcher.sub(src2_name, big_code),
                         matcher.sub(src2_name, flag_code),
                         matcher.sub(src2_name, cond_check),
                         matcher.sub(src2_name, else_code),
@@ -254,6 +261,7 @@ let {{
                 imm_name = "%simm8" % match.group("prefix")
                 self.buildCppClasses(name + "i", Name, suffix + "Imm",
                         matcher.sub(imm_name, code),
+                        matcher.sub(imm_name, big_code),
                         matcher.sub(imm_name, flag_code),
                         matcher.sub(imm_name, cond_check),
                         matcher.sub(imm_name, else_code),
@@ -264,27 +272,32 @@ let {{
             # a version without it and fix up this version to use it.
             if flag_code != "" or cond_check != "true":
                 self.buildCppClasses(name, Name, suffix,
-                        code, "", "true", else_code, "")
+                        code, big_code, "", "true", else_code, "")
                 suffix = "Flags" + suffix
 
             # If psrc1 or psrc2 is used, we need to actually insert code to
             # compute it.
-            matcher = re.compile("(?<!\w)psrc1(?!\w)")
-            if matcher.search(allCode):
-                code = "uint64_t psrc1 = pick(SrcReg1, 0, dataSize);" + code
-            matcher = re.compile("(?<!\w)psrc2(?!\w)")
-            if matcher.search(allCode):
-                code = "uint64_t psrc2 = pick(SrcReg2, 1, dataSize);" + code
-            # Also make available versions which do sign extension
-            matcher = re.compile("(?<!\w)spsrc1(?!\w)")
-            if matcher.search(allCode):
-                code = "int64_t spsrc1 = signedPick(SrcReg1, 0, dataSize);" + code
-            matcher = re.compile("(?<!\w)spsrc2(?!\w)")
-            if matcher.search(allCode):
-                code = "int64_t spsrc2 = signedPick(SrcReg2, 1, dataSize);" + code
-            matcher = re.compile("(?<!\w)simm8(?!\w)")
-            if matcher.search(allCode):
-                code = "int8_t simm8 = imm8;" + code
+            for (big, all) in ((False, allCode), (True, allBigCode)):
+                prefix = ""
+                for (rex, decl) in (
+                        ("(?<!\w)psrc1(?!\w)",
+                         "uint64_t psrc1 = pick(SrcReg1, 0, dataSize);"),
+                        ("(?<!\w)psrc2(?!\w)",
+                         "uint64_t psrc2 = pick(SrcReg2, 1, dataSize);"),
+                        ("(?<!\w)spsrc1(?!\w)",
+                         "int64_t spsrc1 = signedPick(SrcReg1, 0, dataSize);"),
+                        ("(?<!\w)spsrc2(?!\w)",
+                         "int64_t spsrc2 = signedPick(SrcReg2, 1, dataSize);"),
+                        ("(?<!\w)simm8(?!\w)",
+                         "int8_t simm8 = imm8;")):
+                    matcher = re.compile(rex)
+                    if matcher.search(all):
+                        prefix += decl + "\n"
+                if big:
+                    if big_code != "":
+                        big_code = prefix + big_code
+                else:
+                    code = prefix + code
 
             base = "X86ISA::RegOp"
 
@@ -297,17 +310,26 @@ let {{
                 templates = immTemplates
 
             # Get everything ready for the substitution
-            iop = InstObjParams(name, Name + suffix, base,
+            iops = [InstObjParams(name, Name + suffix, base,
                     {"code" : code,
                      "flag_code" : flag_code,
                      "cond_check" : cond_check,
                      "else_code" : else_code,
-                     "cond_control_flag_init": cond_control_flag_init})
+                     "cond_control_flag_init" : cond_control_flag_init})]
+            if big_code != "":
+                iops += [InstObjParams(name, Name + suffix + "Big", base,
+                         {"code" : big_code,
+                          "flag_code" : flag_code,
+                          "cond_check" : cond_check,
+                          "else_code" : else_code,
+                          "cond_control_flag_init" :
+                              cond_control_flag_init})]
 
             # Generate the actual code (finally!)
-            header_output += templates[0].subst(iop)
-            decoder_output += templates[1].subst(iop)
-            exec_output += templates[2].subst(iop)
+            for iop in iops:
+                header_output += templates[0].subst(iop)
+                decoder_output += templates[1].subst(iop)
+                exec_output += templates[2].subst(iop)
 
 
         def __new__(mcls, Name, bases, dict):
@@ -322,14 +344,16 @@ let {{
                 cls.className = Name
                 cls.base_mnemonic = name
                 code = cls.code
+                big_code = cls.big_code
                 flag_code = cls.flag_code
                 cond_check = cls.cond_check
                 else_code = cls.else_code
                 cond_control_flag_init = cls.cond_control_flag_init
 
                 # Set up the C++ classes
-                mcls.buildCppClasses(cls, name, Name, "", code, flag_code,
-                        cond_check, else_code, cond_control_flag_init)
+                mcls.buildCppClasses(cls, name, Name, "", code, big_code,
+                        flag_code, cond_check, else_code,
+                        cond_control_flag_init)
 
                 # Hook into the microassembler dict
                 global microopClasses
@@ -352,6 +376,7 @@ let {{
         abstract = True
 
         # Default template parameter values
+        big_code = ""
         flag_code = ""
         cond_check = "true"
         else_code = ";"
@@ -372,26 +397,48 @@ let {{
                 self.className += "Flags"
 
         def getAllocator(self, microFlags):
-            className = self.className
-            if self.mnemonic == self.base_mnemonic + 'i':
-                className += "Imm"
-            allocator = '''new %(class_name)s(machInst, macrocodeBlock,
-                    %(flags)s, %(src1)s, %(op2)s, %(dest)s,
-                    %(dataSize)s, %(ext)s)''' % {
-                "class_name" : className,
-                "flags" : self.microFlagsText(microFlags),
-                "src1" : self.src1, "op2" : self.op2,
-                "dest" : self.dest,
-                "dataSize" : self.dataSize,
-                "ext" : self.ext}
-            return allocator
+            if self.big_code != "":
+                className = self.className
+                if self.mnemonic == self.base_mnemonic + 'i':
+                    className += "Imm"
+                allocString = '''
+                    (%(dataSize)s >= 4) ?
+                        (StaticInstPtr)(new %(class_name)sBig(machInst,
+                            macrocodeBlock, %(flags)s, %(src1)s, %(op2)s,
+                            %(dest)s, %(dataSize)s, %(ext)s)) :
+                        (StaticInstPtr)(new %(class_name)s(machInst,
+                            macrocodeBlock, %(flags)s, %(src1)s, %(op2)s,
+                            %(dest)s, %(dataSize)s, %(ext)s))
+                    '''
+                allocator = allocString % {
+                    "class_name" : className,
+                    "flags" : self.microFlagsText(microFlags),
+                    "src1" : self.src1, "op2" : self.op2,
+                    "dest" : self.dest,
+                    "dataSize" : self.dataSize,
+                    "ext" : self.ext}
+                return allocator
+            else:
+                className = self.className
+                if self.mnemonic == self.base_mnemonic + 'i':
+                    className += "Imm"
+                allocator = '''new %(class_name)s(machInst, macrocodeBlock,
+                        %(flags)s, %(src1)s, %(op2)s, %(dest)s,
+                        %(dataSize)s, %(ext)s)''' % {
+                    "class_name" : className,
+                    "flags" : self.microFlagsText(microFlags),
+                    "src1" : self.src1, "op2" : self.op2,
+                    "dest" : self.dest,
+                    "dataSize" : self.dataSize,
+                    "ext" : self.ext}
+                return allocator
 
     class LogicRegOp(RegOp):
         abstract = True
         flag_code = '''
             //Don't have genFlags handle the OF or CF bits
             uint64_t mask = CFBit | ECFBit | OFBit;
-            ccFlagBits = genFlags(ccFlagBits, ext & ~mask, DestReg, psrc1, op2);
+            ccFlagBits = genFlags(ccFlagBits, ext & ~mask, result, psrc1, op2);
             //If a logic microop wants to set these, it wants to set them to 0.
             ccFlagBits &= ~(CFBit & ext);
             ccFlagBits &= ~(ECFBit & ext);
@@ -401,12 +448,12 @@ let {{
     class FlagRegOp(RegOp):
         abstract = True
         flag_code = \
-            "ccFlagBits = genFlags(ccFlagBits, ext, DestReg, psrc1, op2);"
+            "ccFlagBits = genFlags(ccFlagBits, ext, result, psrc1, op2);"
 
     class SubRegOp(RegOp):
         abstract = True
         flag_code = \
-            "ccFlagBits = genFlags(ccFlagBits, ext, DestReg, psrc1, ~op2, true);"
+            "ccFlagBits = genFlags(ccFlagBits, ext, result, psrc1, ~op2, true);"
 
     class CondRegOp(RegOp):
         abstract = True
@@ -428,31 +475,44 @@ let {{
                     src1, src2, flags, dataSize)
 
     class Add(FlagRegOp):
-        code = 'DestReg = merge(DestReg, psrc1 + op2, dataSize);'
+        code = 'DestReg = merge(DestReg, result = (psrc1 + op2), dataSize);'
+        big_code = 'DestReg = result = (psrc1 + op2) & mask(dataSize * 8);'
 
     class Or(LogicRegOp):
-        code = 'DestReg = merge(DestReg, psrc1 | op2, dataSize);'
+        code = 'DestReg = merge(DestReg, result = (psrc1 | op2), dataSize);'
+        big_code = 'DestReg = result = (psrc1 | op2) & mask(dataSize * 8);'
 
     class Adc(FlagRegOp):
         code = '''
             CCFlagBits flags = ccFlagBits;
-            DestReg = merge(DestReg, psrc1 + op2 + flags.cf, dataSize);
+            DestReg = merge(DestReg, result = (psrc1 + op2 + flags.cf), dataSize);
+            '''
+        big_code = '''
+            CCFlagBits flags = ccFlagBits;
+            DestReg = result = (psrc1 + op2 + flags.cf) & mask(dataSize * 8);
             '''
 
     class Sbb(SubRegOp):
         code = '''
             CCFlagBits flags = ccFlagBits;
-            DestReg = merge(DestReg, psrc1 - op2 - flags.cf, dataSize);
+            DestReg = merge(DestReg, result = (psrc1 - op2 - flags.cf), dataSize);
+            '''
+        big_code = '''
+            CCFlagBits flags = ccFlagBits;
+            DestReg = result = (psrc1 - op2 - flags.cf) & mask(dataSize * 8);
             '''
 
     class And(LogicRegOp):
-        code = 'DestReg = merge(DestReg, psrc1 & op2, dataSize)'
+        code = 'DestReg = merge(DestReg, result = (psrc1 & op2), dataSize)'
+        big_code = 'DestReg = result = (psrc1 & op2) & mask(dataSize * 8)'
 
     class Sub(SubRegOp):
-        code = 'DestReg = merge(DestReg, psrc1 - op2, dataSize)'
+        code = 'DestReg = merge(DestReg, result = (psrc1 - op2), dataSize)'
+        big_code = 'DestReg = result = (psrc1 - op2) & mask(dataSize * 8)'
 
     class Xor(LogicRegOp):
-        code = 'DestReg = merge(DestReg, psrc1 ^ op2, dataSize)'
+        code = 'DestReg = merge(DestReg, result = (psrc1 ^ op2), dataSize)'
+        big_code = 'DestReg = result = (psrc1 ^ op2) & mask(dataSize * 8)'
 
     class Mul1s(WrRegOp):
         code = '''
@@ -505,6 +565,7 @@ let {{
 
     class Mulel(RdRegOp):
         code = 'DestReg = merge(SrcReg1, ProdLow, dataSize);'
+        big_code = 'DestReg = ProdLow & mask(dataSize * 8);'
 
     class Muleh(RdRegOp):
         def __init__(self, dest, src1=None, flags=None, dataSize="env.dataSize"):
@@ -513,6 +574,7 @@ let {{
             super(RdRegOp, self).__init__(dest, src1, \
                     "InstRegIndex(NUM_INTREGS)", flags, dataSize)
         code = 'DestReg = merge(SrcReg1, ProdHi, dataSize);'
+        big_code = 'DestReg = ProdHi & mask(dataSize * 8);'
 
     # One or two bit divide
     class Div1(WrRegOp):
@@ -540,7 +602,7 @@ let {{
 
     # Step divide
     class Div2(RegOp):
-        code = '''
+        divCode = '''
             uint64_t dividend = Remainder;
             uint64_t divisor = Divisor;
             uint64_t quotient = Quotient;
@@ -587,11 +649,13 @@ let {{
                 }
             }
             //Keep track of how many bits there are still to pull in.
-            DestReg = merge(DestReg, remaining, dataSize);
+            %s
             //Record the final results
             Remainder = remainder;
             Quotient = quotient;
         '''
+        code = divCode % "DestReg = merge(DestReg, remaining, dataSize);"
+        big_code = divCode % "DestReg = remaining & mask(dataSize * 8);"
         flag_code = '''
             if (remaining == 0)
                 ccFlagBits = ccFlagBits | (ext & EZFBit);
@@ -601,9 +665,11 @@ let {{
 
     class Divq(RdRegOp):
         code = 'DestReg = merge(SrcReg1, Quotient, dataSize);'
+        big_code = 'DestReg = Quotient & mask(dataSize * 8);'
 
     class Divr(RdRegOp):
         code = 'DestReg = merge(SrcReg1, Remainder, dataSize);'
+        big_code = 'DestReg = Remainder & mask(dataSize * 8);'
 
     class Mov(CondRegOp):
         code = 'DestReg = merge(SrcReg1, op2, dataSize)'
@@ -616,6 +682,10 @@ let {{
             uint8_t shiftAmt = (op2 & ((dataSize == 8) ? mask(6) : mask(5)));
             DestReg = merge(DestReg, psrc1 << shiftAmt, dataSize);
             '''
+        big_code = '''
+            uint8_t shiftAmt = (op2 & ((dataSize == 8) ? mask(6) : mask(5)));
+            DestReg = (psrc1 << shiftAmt) & mask(dataSize * 8);
+            '''
         flag_code = '''
             // If the shift amount is zero, no flags should be modified.
             if (shiftAmt) {
@@ -641,14 +711,19 @@ let {{
         '''
 
     class Srl(RegOp):
+        # Because what happens to the bits shifted -in- on a right shift
+        # is not defined in the C/C++ standard, we have to mask them out
+        # to be sure they're zero.
         code = '''
             uint8_t shiftAmt = (op2 & ((dataSize == 8) ? mask(6) : mask(5)));
-            // Because what happens to the bits shift -in- on a right shift
-            // is not defined in the C/C++ standard, we have to mask them out
-            // to be sure they're zero.
             uint64_t logicalMask = mask(dataSize * 8 - shiftAmt);
             DestReg = merge(DestReg, (psrc1 >> shiftAmt) & logicalMask, dataSize);
             '''
+        big_code = '''
+            uint8_t shiftAmt = (op2 & ((dataSize == 8) ? mask(6) : mask(5)));
+            uint64_t logicalMask = mask(dataSize * 8 - shiftAmt);
+            DestReg = (psrc1 >> shiftAmt) & logicalMask;
+            '''
         flag_code = '''
             // If the shift amount is zero, no flags should be modified.
             if (shiftAmt) {
@@ -671,15 +746,21 @@ let {{
         '''
 
     class Sra(RegOp):
+        # Because what happens to the bits shifted -in- on a right shift
+        # is not defined in the C/C++ standard, we have to sign extend
+        # them manually to be sure.
         code = '''
             uint8_t shiftAmt = (op2 & ((dataSize == 8) ? mask(6) : mask(5)));
-            // Because what happens to the bits shift -in- on a right shift
-            // is not defined in the C/C++ standard, we have to sign extend
-            // them manually to be sure.
             uint64_t arithMask = (shiftAmt == 0) ? 0 :
                 -bits(psrc1, dataSize * 8 - 1) << (dataSize * 8 - shiftAmt);
             DestReg = merge(DestReg, (psrc1 >> shiftAmt) | arithMask, dataSize);
             '''
+        big_code = '''
+            uint8_t shiftAmt = (op2 & ((dataSize == 8) ? mask(6) : mask(5)));
+            uint64_t arithMask = (shiftAmt == 0) ? 0 :
+                -bits(psrc1, dataSize * 8 - 1) << (dataSize * 8 - shiftAmt);
+            DestReg = ((psrc1 >> shiftAmt) | arithMask) & mask(dataSize * 8);
+            '''
         flag_code = '''
             // If the shift amount is zero, no flags should be modified.
             if (shiftAmt) {
@@ -704,13 +785,11 @@ let {{
             uint8_t shiftAmt =
                 (op2 & ((dataSize == 8) ? mask(6) : mask(5)));
             uint8_t realShiftAmt = shiftAmt % (dataSize * 8);
-            if(realShiftAmt)
-            {
+            if (realShiftAmt) {
                 uint64_t top = psrc1 << (dataSize * 8 - realShiftAmt);
                 uint64_t bottom = bits(psrc1, dataSize * 8, realShiftAmt);
                 DestReg = merge(DestReg, top | bottom, dataSize);
-            }
-            else
+            } else
                 DestReg = merge(DestReg, DestReg, dataSize);
             '''
         flag_code = '''
@@ -739,16 +818,14 @@ let {{
             uint8_t shiftAmt =
                 (op2 & ((dataSize == 8) ? mask(6) : mask(5)));
             uint8_t realShiftAmt = shiftAmt % (dataSize * 8 + 1);
-            if(realShiftAmt)
-            {
+            if (realShiftAmt) {
                 CCFlagBits flags = ccFlagBits;
                 uint64_t top = flags.cf << (dataSize * 8 - realShiftAmt);
                 if (realShiftAmt > 1)
                     top |= psrc1 << (dataSize * 8 - realShiftAmt + 1);
                 uint64_t bottom = bits(psrc1, dataSize * 8 - 1, realShiftAmt);
                 DestReg = merge(DestReg, top | bottom, dataSize);
-            }
-            else
+            } else
                 DestReg = merge(DestReg, DestReg, dataSize);
             '''
         flag_code = '''
@@ -780,14 +857,12 @@ let {{
             uint8_t shiftAmt =
                 (op2 & ((dataSize == 8) ? mask(6) : mask(5)));
             uint8_t realShiftAmt = shiftAmt % (dataSize * 8);
-            if(realShiftAmt)
-            {
+            if (realShiftAmt) {
                 uint64_t top = psrc1 << realShiftAmt;
                 uint64_t bottom =
                     bits(psrc1, dataSize * 8 - 1, dataSize * 8 - realShiftAmt);
                 DestReg = merge(DestReg, top | bottom, dataSize);
-            }
-            else
+            } else
                 DestReg = merge(DestReg, DestReg, dataSize);
             '''
         flag_code = '''
@@ -816,8 +891,7 @@ let {{
             uint8_t shiftAmt =
                 (op2 & ((dataSize == 8) ? mask(6) : mask(5)));
             uint8_t realShiftAmt = shiftAmt % (dataSize * 8 + 1);
-            if(realShiftAmt)
-            {
+            if (realShiftAmt) {
                 CCFlagBits flags = ccFlagBits;
                 uint64_t top = psrc1 << realShiftAmt;
                 uint64_t bottom = flags.cf << (realShiftAmt - 1);
@@ -826,8 +900,7 @@ let {{
                         bits(psrc1, dataSize * 8 - 1,
                                    dataSize * 8 - realShiftAmt + 1);
                 DestReg = merge(DestReg, top | bottom, dataSize);
-            }
-            else
+            } else
                 DestReg = merge(DestReg, DestReg, dataSize);
             '''
         flag_code = '''
@@ -853,10 +926,10 @@ let {{
         '''
 
     class Sld(RegOp):
-        code = '''
+        sldCode = '''
             uint8_t shiftAmt = (op2 & ((dataSize == 8) ? mask(6) : mask(5)));
             uint8_t dataBits = dataSize * 8;
-            uint8_t realShiftAmt = shiftAmt % (2 * dataBits);
+            uint8_t realShiftAmt = shiftAmt %% (2 * dataBits);
             uint64_t result;
             if (realShiftAmt == 0) {
                 result = psrc1;
@@ -867,8 +940,10 @@ let {{
                 result = (DoubleBits << (realShiftAmt - dataBits)) |
                          (psrc1 >> (2 * dataBits - realShiftAmt));
             }
-            DestReg = merge(DestReg, result, dataSize);
+            %s
             '''
+        code = sldCode % "DestReg = merge(DestReg, result, dataSize);"
+        big_code = sldCode % "DestReg = result & mask(dataSize * 8);"
         flag_code = '''
             // If the shift amount is zero, no flags should be modified.
             if (shiftAmt) {
@@ -899,10 +974,10 @@ let {{
         '''
 
     class Srd(RegOp):
-        code = '''
+        srdCode = '''
             uint8_t shiftAmt = (op2 & ((dataSize == 8) ? mask(6) : mask(5)));
             uint8_t dataBits = dataSize * 8;
-            uint8_t realShiftAmt = shiftAmt % (2 * dataBits);
+            uint8_t realShiftAmt = shiftAmt %% (2 * dataBits);
             uint64_t result;
             if (realShiftAmt == 0) {
                 result = psrc1;
@@ -919,8 +994,10 @@ let {{
                           logicalMask) |
                          (psrc1 << (2 * dataBits - realShiftAmt));
             }
-            DestReg = merge(DestReg, result, dataSize);
+            %s
             '''
+        code = srdCode % "DestReg = merge(DestReg, result, dataSize);"
+        big_code = srdCode % "DestReg = result & mask(dataSize * 8);"
         flag_code = '''
             // If the shift amount is zero, no flags should be modified.
             if (shiftAmt) {
@@ -986,6 +1063,12 @@ let {{
             ccFlagBits = (flag == 0) ? (ccFlagBits | EZFBit) :
                                        (ccFlagBits & ~EZFBit);
             '''
+        big_code = '''
+            int flag = bits(ccFlagBits, imm8);
+            DestReg = flag & mask(dataSize * 8);
+            ccFlagBits = (flag == 0) ? (ccFlagBits | EZFBit) :
+                                       (ccFlagBits & ~EZFBit);
+            '''
         def __init__(self, dest, imm, flags=None, \
                 dataSize="env.dataSize"):
             super(Ruflag, self).__init__(dest, \
@@ -1000,6 +1083,14 @@ let {{
             ccFlagBits = (flag == 0) ? (ccFlagBits | EZFBit) :
                                        (ccFlagBits & ~EZFBit);
             '''
+        big_code = '''
+            MiscReg flagMask = 0x3F7FDD5;
+            MiscReg flags = (nccFlagBits | ccFlagBits) & flagMask;
+            int flag = bits(flags, imm8);
+            DestReg = flag & mask(dataSize * 8);
+            ccFlagBits = (flag == 0) ? (ccFlagBits | EZFBit) :
+                                       (ccFlagBits & ~EZFBit);
+            '''
         def __init__(self, dest, imm, flags=None, \
                 dataSize="env.dataSize"):
             super(Rflag, self).__init__(dest, \
@@ -1015,6 +1106,15 @@ let {{
             val = sign_bit ? (val | ~maskVal) : (val & maskVal);
             DestReg = merge(DestReg, val, dataSize);
             '''
+        big_code = '''
+            IntReg val = psrc1;
+            // Mask the bit position so that it wraps.
+            int bitPos = op2 & (dataSize * 8 - 1);
+            int sign_bit = bits(val, bitPos, bitPos);
+            uint64_t maskVal = mask(bitPos+1);
+            val = sign_bit ? (val | ~maskVal) : (val & maskVal);
+            DestReg = val & mask(dataSize * 8);
+            '''
         flag_code = '''
             if (!sign_bit)
                 ccFlagBits = ccFlagBits &
@@ -1026,12 +1126,13 @@ let {{
 
     class Zext(RegOp):
         code = 'DestReg = merge(DestReg, bits(psrc1, op2, 0), dataSize);'
+        big_code = 'DestReg = bits(psrc1, op2, 0) & mask(dataSize * 8);'
 
     class Rddr(RegOp):
         def __init__(self, dest, src1, flags=None, dataSize="env.dataSize"):
             super(Rddr, self).__init__(dest, \
                     src1, "InstRegIndex(NUM_INTREGS)", flags, dataSize)
-        code = '''
+        rdrCode = '''
             CR4 cr4 = CR4Op;
             DR7 dr7 = DR7Op;
             if ((cr4.de == 1 && (src1 == 4 || src1 == 5)) || src1 >= 8) {
@@ -1039,9 +1140,11 @@ let {{
             } else if (dr7.gd) {
                 fault = new DebugException();
             } else {
-                DestReg = merge(DestReg, DebugSrc1, dataSize);
+                %s
             }
         '''
+        code = rdrCode % "DestReg = merge(DestReg, DebugSrc1, dataSize);"
+        big_code = rdrCode % "DestReg = DebugSrc1 & mask(dataSize * 8);"
 
     class Wrdr(RegOp):
         def __init__(self, dest, src1, flags=None, dataSize="env.dataSize"):
@@ -1066,13 +1169,15 @@ let {{
         def __init__(self, dest, src1, flags=None, dataSize="env.dataSize"):
             super(Rdcr, self).__init__(dest, \
                     src1, "InstRegIndex(NUM_INTREGS)", flags, dataSize)
-        code = '''
+        rdcrCode = '''
             if (src1 == 1 || (src1 > 4 && src1 < 8) || (src1 > 8)) {
                 fault = new InvalidOpcode();
             } else {
-                DestReg = merge(DestReg, ControlSrc1, dataSize);
+                %s
             }
         '''
+        code = rdcrCode % "DestReg = merge(DestReg, ControlSrc1, dataSize);"
+        big_code = rdcrCode % "DestReg = ControlSrc1 & mask(dataSize * 8);"
 
     class Wrcr(RegOp):
         def __init__(self, dest, src1, flags=None, dataSize="env.dataSize"):
@@ -1154,24 +1259,20 @@ let {{
         '''
 
     class Rdbase(SegOp):
-        code = '''
-            DestReg = merge(DestReg, SegBaseSrc1, dataSize);
-        '''
+        code = 'DestReg = merge(DestReg, SegBaseSrc1, dataSize);'
+        big_code = 'DestReg = SegBaseSrc1 & mask(dataSize * 8);'
 
     class Rdlimit(SegOp):
-        code = '''
-            DestReg = merge(DestReg, SegLimitSrc1, dataSize);
-        '''
+        code = 'DestReg = merge(DestReg, SegLimitSrc1, dataSize);'
+        big_code = 'DestReg = SegLimitSrc1 & mask(dataSize * 8);'
 
     class RdAttr(SegOp):
-        code = '''
-            DestReg = merge(DestReg, SegAttrSrc1, dataSize);
-        '''
+        code = 'DestReg = merge(DestReg, SegAttrSrc1, dataSize);'
+        big_code = 'DestReg = SegAttrSrc1 & mask(dataSize * 8);'
 
     class Rdsel(SegOp):
-        code = '''
-            DestReg = merge(DestReg, SegSelSrc1, dataSize);
-        '''
+        code = 'DestReg = merge(DestReg, SegSelSrc1, dataSize);'
+        big_code = 'DestReg = SegSelSrc1 & mask(dataSize * 8);'
 
     class Rdval(RegOp):
         def __init__(self, dest, src1, flags=None, dataSize="env.dataSize"):
diff --git a/src/arch/x86/microcode_rom.hh b/src/arch/x86/microcode_rom.hh
index f8ad410ce4..84c503bb9f 100644
--- a/src/arch/x86/microcode_rom.hh
+++ b/src/arch/x86/microcode_rom.hh
@@ -32,6 +32,7 @@
 #define __ARCH_X86_MICROCODE_ROM_HH__
 
 #include "arch/x86/emulenv.hh"
+#include "arch/x86/insts/badmicroop.hh"
 #include "cpu/static_inst.hh"
 
 namespace X86ISAInst
@@ -60,8 +61,10 @@ namespace X86ISAInst
         fetchMicroop(MicroPC microPC, StaticInstPtr curMacroop)
         {
             microPC = normalMicroPC(microPC);
-            assert(microPC < numMicroops);
-            return genFuncs[microPC](curMacroop);
+            if (microPC >= numMicroops)
+                return X86ISA::badMicroop;
+            else
+                return genFuncs[microPC](curMacroop);
         }
     };
 }
diff --git a/src/arch/x86/predecoder.hh b/src/arch/x86/predecoder.hh
index c06ec18bcc..5c67e28e19 100644
--- a/src/arch/x86/predecoder.hh
+++ b/src/arch/x86/predecoder.hh
@@ -225,7 +225,11 @@ namespace X86ISA
         {
             assert(emiIsReady);
             emiIsReady = false;
-            nextPC.npc(nextPC.pc() + getInstSize());
+            if (!nextPC.size()) {
+                Addr size = getInstSize();
+                nextPC.size(size);
+                nextPC.npc(nextPC.pc() + size);
+            }
             return emi;
         }
     };
diff --git a/src/arch/x86/types.hh b/src/arch/x86/types.hh
index 5a208446a5..4641141d35 100644
--- a/src/arch/x86/types.hh
+++ b/src/arch/x86/types.hh
@@ -222,7 +222,61 @@ namespace X86ISA
         return true;
     }
 
-    typedef GenericISA::UPCState<MachInst> PCState;
+    class PCState : public GenericISA::UPCState<MachInst>
+    {
+      protected:
+        typedef GenericISA::UPCState<MachInst> Base;
+
+        uint8_t _size;
+
+      public:
+        void
+        set(Addr val)
+        {
+            Base::set(val);
+            _size = 0;
+        }
+
+        PCState() {}
+        PCState(Addr val) { set(val); }
+
+        uint8_t size() const { return _size; }
+        void size(uint8_t newSize) { _size = newSize; }
+
+        bool
+        branching() const
+        {
+            return this->npc() != this->pc() + size();
+        }
+
+        void
+        advance()
+        {
+            Base::advance();
+            _size = 0;
+        }
+
+        void
+        uEnd()
+        {
+            Base::uEnd();
+            _size = 0;
+        }
+
+        void
+        serialize(std::ostream &os)
+        {
+            Base::serialize(os);
+            SERIALIZE_SCALAR(_size);
+        }
+
+        void
+        unserialize(Checkpoint *cp, const std::string &section)
+        {
+            Base::unserialize(cp, section);
+            UNSERIALIZE_SCALAR(_size);
+        }
+    };
 
     struct CoreSpecific {
         int core_type;
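Note on the PCState hunk above: the new `_size` field lets `branching()` compare `npc` against `pc + size` instead of a fixed instruction width, and the predecoder hunk only fills in `size` and `npc` when `size` is still zero. The following is a minimal standalone sketch of that idea, not the M5 class itself; `SketchPC` and `recordSize` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the PCState added above; not the M5 class.
struct SketchPC
{
    uint64_t pc = 0;
    uint64_t npc = 0;
    uint8_t size = 0;   // 0 means "size not recorded yet"

    // Mirrors the guarded update in the predecoder hunk: size and npc
    // are only filled in while no size has been recorded.
    void
    recordSize(uint8_t instSize)
    {
        if (!size) {
            size = instSize;
            npc = pc + instSize;
        }
    }

    // A branch is taken when npc is not the fall-through address.
    bool branching() const { return npc != pc + size; }
};
```

With `pc = 0x1000` and a 3-byte instruction, `recordSize(3)` sets `npc` to `0x1003` and `branching()` is false; redirecting `npc` elsewhere makes it true, and a second `recordSize` call leaves the recorded size alone.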
diff --git a/src/base/SConscript b/src/base/SConscript
index 2bb6b13ab8..3f069bf9ee 100644
--- a/src/base/SConscript
+++ b/src/base/SConscript
@@ -35,6 +35,7 @@ if env['CP_ANNOTATE']:
     Source('cp_annotate.cc')
 Source('atomicio.cc')
 Source('bigint.cc')
+Source('bitmap.cc')
 Source('callback.cc')
 Source('circlebuf.cc')
 Source('cprintf.cc')
diff --git a/src/base/bitmap.cc b/src/base/bitmap.cc
new file mode 100644
index 0000000000..0d2a9302ba
--- /dev/null
+++ b/src/base/bitmap.cc
@@ -0,0 +1,82 @@
+/*
+ * Copyright (c) 2010 ARM Limited
+ * All rights reserved
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: William Wang
+ *          Ali Saidi
+ */
+
+#include <cassert>
+
+#include "base/bitmap.hh"
+#include "base/misc.hh"
+
+// bitmap class ctor
+Bitmap::Bitmap(VideoConvert::Mode _mode, uint16_t w, uint16_t h, uint8_t *d)
+    : mode(_mode), height(h), width(w), data(d),
+    vc(mode, VideoConvert::rgb8888, width, height)
+{
+}
+
+void
+Bitmap::write(std::ostream *bmp)
+{
+    assert(data);
+
+    // For further information see: http://en.wikipedia.org/wiki/BMP_file_format
+    Magic  magic = {{'B','M'}};
+    Header header = {sizeof(VideoConvert::Rgb8888) * width * height , 0, 0, 54};
+    Info   info = {sizeof(Info), width, height, 1,
+                   sizeof(VideoConvert::Rgb8888) * 8, 0,
+                   sizeof(VideoConvert::Rgb8888) * width * height, 1, 1, 0, 0};
+
+    bmp->write(reinterpret_cast<char*>(&magic),  sizeof(magic));
+    bmp->write(reinterpret_cast<char*>(&header), sizeof(header));
+    bmp->write(reinterpret_cast<char*>(&info),   sizeof(info));
+
+    uint8_t *tmp = vc.convert(data);
+    uint32_t *tmp32 = (uint32_t*)tmp;
+
+    // BMP stores pixel rows left to right, starting with the bottom row,
+    // so the rows are written out in reverse order.
+    for (int i = height - 1; i >= 0; i--)
+        for (int j = 0; j < width; j++)
+            bmp->write((char*)&tmp32[i * width + j], sizeof(uint32_t));
+
+    bmp->flush();
+
+    delete [] tmp;
+}
+
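Note on Bitmap::write() above: the loop emits the last row of the converted frame buffer first because BMP stores rows bottom-up, and the literal 54 in the header is the combined size of the magic bytes plus the two header structs. Below is a small sketch of both points under stated assumptions; `SketchHeader`, `SketchInfo`, and `flipRows` are hypothetical names, and the size check assumes a common ABI where these structs need no padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical copies of the header layouts from bitmap.hh, used only to
// check where the literal 54 comes from: 2 (magic) + 12 + 40 bytes.
struct SketchHeader { uint32_t size; uint16_t r1, r2; uint32_t offset; };
struct SketchInfo
{
    uint32_t size, width, height;
    uint16_t planes, bitCount;
    uint32_t compression, sizeImage, xPels, yPels, clrUsed, clrImportant;
};
static_assert(2 + sizeof(SketchHeader) + sizeof(SketchInfo) == 54,
              "BMP headers should total 54 bytes");

// Hypothetical helper, not part of the patch: reorder a top-down buffer
// of 32-bit pixels into the bottom-up row order used by BMP.
std::vector<uint32_t>
flipRows(const std::vector<uint32_t> &px, int width, int height)
{
    assert(px.size() == static_cast<size_t>(width) * height);
    std::vector<uint32_t> out;
    out.reserve(px.size());
    for (int i = height - 1; i >= 0; i--)   // bottom row first
        for (int j = 0; j < width; j++)     // left to right within a row
            out.push_back(px[i * width + j]);
    return out;
}
```

The patch writes each pixel straight to the stream instead of building a second buffer; the sketch returns a vector only so the ordering is easy to inspect.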
diff --git a/src/base/bitmap.hh b/src/base/bitmap.hh
new file mode 100644
index 0000000000..9dfaa87a1c
--- /dev/null
+++ b/src/base/bitmap.hh
@@ -0,0 +1,114 @@
+/*
+ * Copyright (c) 2010 ARM Limited
+ * All rights reserved
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: William Wang
+ *          Ali Saidi
+ */
+#ifndef __BASE_BITMAP_HH__
+#define __BASE_BITMAP_HH__
+
+#include <fstream>
+
+#include "base/vnc/convert.hh"
+
+/**
+ * @file Declaration of a class that writes a frame buffer to a bitmap
+ */
+
+
+// write frame buffer into a bitmap picture
+class Bitmap
+{
+  public:
+    /** Create a Bitmap creator that takes data in the given mode & size
+     * and outputs to an fstream
+     * @param mode the type of data that is being provided
+     * @param h the height of the image
+     * @param w the width of the image
+     * @param d the data for the image in mode
+     */
+    Bitmap(VideoConvert::Mode mode, uint16_t w, uint16_t h, uint8_t *d);
+
+    /** Provide the converter with the data that should be output. It will be
+     * converted into rgb8888 and written out when write() is called.
+     * @param d the data
+     */
+    void rawData(uint8_t* d) { data = d; }
+
+    /** Write the provided data to the given stream
+     * @param bmp stream to write to
+     */
+    void write(std::ostream *bmp);
+
+  private:
+    VideoConvert::Mode mode;
+    uint16_t height;
+    uint16_t width;
+    uint8_t *data;
+
+    VideoConvert vc;
+
+    struct Magic
+    {
+        unsigned char magic_number[2];
+    };
+
+    struct Header
+    {
+        uint32_t size;
+        uint16_t reserved1;
+        uint16_t reserved2;
+        uint32_t offset;
+    };
+
+    struct Info
+    {
+        uint32_t Size;
+        uint32_t Width;
+        uint32_t Height;
+        uint16_t Planes;
+        uint16_t BitCount;
+        uint32_t Compression;
+        uint32_t SizeImage;
+        uint32_t XPelsPerMeter;
+        uint32_t YPelsPerMeter;
+        uint32_t ClrUsed;
+        uint32_t ClrImportant;
+    };
+};
+
+#endif // __BASE_BITMAP_HH__
+
diff --git a/src/base/compiler.hh b/src/base/compiler.hh
index 2c655af608..3315fb2f74 100644
--- a/src/base/compiler.hh
+++ b/src/base/compiler.hh
@@ -41,6 +41,7 @@
 #define M5_PRAGMA_NORETURN(x)
 #define M5_DUMMY_RETURN
 #define M5_VAR_USED __attribute__((unused))
+#define M5_ATTR_PACKED __attribute__ ((__packed__))
 #elif defined(__SUNPRO_CC)
 // this doesn't do anything with sun cc, but why not
 #define M5_ATTR_NORETURN  __sun_attr__((__noreturn__))
@@ -48,6 +49,7 @@
 #define DO_PRAGMA(x) _Pragma(#x)
 #define M5_VAR_USED
 #define M5_PRAGMA_NORETURN(x) DO_PRAGMA(does_not_return(x))
+#define M5_ATTR_PACKED __attribute__ ((__packed__))
 #else
 #error "Need to define compiler options in base/compiler.hh"
 #endif
diff --git a/src/base/vnc/SConscript b/src/base/vnc/SConscript
new file mode 100644
index 0000000000..c926765556
--- /dev/null
+++ b/src/base/vnc/SConscript
@@ -0,0 +1,48 @@
+# -*- mode:python -*-
+
+# Copyright (c) 2010 ARM Limited
+# All rights reserved.
+#
+# The license below extends only to copyright in the software and shall
+# not be construed as granting a license to any other intellectual
+# property including but not limited to intellectual property relating
+# to a hardware implementation of the functionality of the software
+# licensed hereunder.  You may use the software subject to the license
+# terms below provided that you ensure that this notice is replicated
+# unmodified and in its entirety in all distributions of the software,
+# modified or unmodified, in source code or in binary form.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are
+# met: redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer;
+# redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in the
+# documentation and/or other materials provided with the distribution;
+# neither the name of the copyright holders nor the names of its
+# contributors may be used to endorse or promote products derived from
+# this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+# Authors: William Wang
+
+Import('*')
+
+if env['FULL_SYSTEM']:
+    SimObject('VncServer.py')
+    Source('vncserver.cc')
+    TraceFlag('VNC')
+
+Source('convert.cc')
+
diff --git a/src/base/vnc/VncServer.py b/src/base/vnc/VncServer.py
new file mode 100644
index 0000000000..21eb3ed28a
--- /dev/null
+++ b/src/base/vnc/VncServer.py
@@ -0,0 +1,45 @@
+# Copyright (c) 2010 ARM Limited
+# All rights reserved.
+#
+# The license below extends only to copyright in the software and shall
+# not be construed as granting a license to any other intellectual
+# property including but not limited to intellectual property relating
+# to a hardware implementation of the functionality of the software
+# licensed hereunder.  You may use the software subject to the license
+# terms below provided that you ensure that this notice is replicated
+# unmodified and in its entirety in all distributions of the software,
+# modified or unmodified, in source code or in binary form.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are
+# met: redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer;
+# redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in the
+# documentation and/or other materials provided with the distribution;
+# neither the name of the copyright holders nor the names of its
+# contributors may be used to endorse or promote products derived from
+# this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+# Authors: William Wang
+
+from m5.SimObject import SimObject
+from m5.params import *
+from m5.proxy import *
+
+class VncServer(SimObject):
+    type = 'VncServer'
+    port = Param.TcpPort(5900, "listen port")
+    number = Param.Int(0, "vnc client number")
diff --git a/src/base/vnc/convert.cc b/src/base/vnc/convert.cc
new file mode 100644
index 0000000000..ea7a9b1c59
--- /dev/null
+++ b/src/base/vnc/convert.cc
@@ -0,0 +1,139 @@
+/*
+ * Copyright (c) 2011 ARM Limited
+ * All rights reserved
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: Ali Saidi
+ *          William Wang
+ */
+
+#include <cassert>
+
+#include "base/misc.hh"
+#include "base/vnc/convert.hh"
+
+/** @file
+ * This file provides conversion functions for a variety of video modes
+ */
+
+VideoConvert::VideoConvert(Mode input_mode, Mode output_mode, int _width,
+        int _height)
+    : inputMode(input_mode), outputMode(output_mode), width(_width),
+    height(_height)
+{
+    if (inputMode != bgr565 && inputMode != rgb565 && inputMode != bgr8888)
+        fatal("Only support converting from bgr565, rgb565, and bgr8888\n");
+
+    if (outputMode != rgb8888)
+        fatal("Only support converting to rgb8888\n");
+
+    assert(0 < height && height < 4000);
+    assert(0 < width && width < 4000);
+}
+
+VideoConvert::~VideoConvert()
+{
+}
+
+uint8_t*
+VideoConvert::convert(uint8_t *fb)
+{
+    switch (inputMode) {
+      case bgr565:
+        return m565rgb8888(fb, true);
+      case rgb565:
+        return m565rgb8888(fb, false);
+      case bgr8888:
+        return bgr8888rgb8888(fb);
+      default:
+        panic("Unimplemented Mode\n");
+    }
+}
+
+uint8_t*
+VideoConvert::m565rgb8888(uint8_t *fb, bool bgr)
+{
+    uint8_t *out = new uint8_t[area() * sizeof(uint32_t)];
+    uint32_t *out32 = (uint32_t*)out;
+
+    uint16_t *in16 = (uint16_t*)fb;
+
+    for (int x = 0; x < area(); x++) {
+        Bgr565 inpx;
+        Rgb8888 outpx = 0;
+
+        inpx = in16[x];
+
+        if (bgr) {
+            outpx.red = inpx.blue << 3;
+            outpx.green = inpx.green << 2;
+            outpx.blue = inpx.red << 3;
+        } else {
+            outpx.blue = inpx.blue << 3;
+            outpx.green = inpx.green << 2;
+            outpx.red = inpx.red << 3;
+        }
+
+        out32[x] = outpx;
+    }
+
+    return out;
+}
+
+
+uint8_t*
+VideoConvert::bgr8888rgb8888(uint8_t *fb)
+{
+    uint8_t *out = new uint8_t[area() * sizeof(uint32_t)];
+    uint32_t *out32 = (uint32_t*)out;
+
+    uint32_t *in32 = (uint32_t*)fb;
+
+    for (int x = 0; x < area(); x++) {
+        Rgb8888 outpx = 0;
+        Bgr8888 inpx;
+
+
+        inpx = in32[x];
+
+        outpx.red = inpx.blue;
+        outpx.green = inpx.green;
+        outpx.blue = inpx.red;
+
+        out32[x] = outpx;
+    }
+
+    return out;
+}
+
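Note on m565rgb8888() above: each 16-bit pixel is split into 5-, 6-, and 5-bit channels, each channel is widened to 8 bits with a left shift, and the `bgr` flag decides which 5-bit field ends up as red. The sketch below restates that per-pixel step with plain shifts and masks instead of the BitUnion types; `expand565` is a hypothetical free function, and the bit positions are taken from the Bgr565 and Rgb8888 declarations in convert.hh in this patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical free function; not part of the patch.
uint32_t
expand565(uint16_t in, bool bgr)
{
    uint32_t lo5 = in & 0x1f;           // bits 4:0   ("red" in Bgr565)
    uint32_t mid6 = (in >> 5) & 0x3f;   // bits 10:5  (green)
    uint32_t hi5 = (in >> 11) & 0x1f;   // bits 15:11 ("blue" in Bgr565)

    // Widen to 8 bits by shifting; the low bits of each channel stay zero.
    uint32_t a = lo5 << 3;
    uint32_t g = mid6 << 2;
    uint32_t b = hi5 << 3;

    // Rgb8888 layout: blue 7:0, green 15:8, red 23:16, alpha 31:24 (zero).
    uint32_t outRed = bgr ? b : a;
    uint32_t outBlue = bgr ? a : b;
    return (outRed << 16) | (g << 8) | outBlue;
}
```

Because the widening is a plain shift, a fully saturated input such as 0xFFFF maps to 0x00F8FCF8 rather than 0x00FFFFFF.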
diff --git a/src/base/vnc/convert.hh b/src/base/vnc/convert.hh
new file mode 100644
index 0000000000..68a21d677c
--- /dev/null
+++ b/src/base/vnc/convert.hh
@@ -0,0 +1,141 @@
+/*
+ * Copyright (c) 2011 ARM Limited
+ * All rights reserved
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: Ali Saidi
+ */
+
+/** @file
+ * This file provides conversion functions for a variety of video modes
+ */
+
+#ifndef __BASE_VNC_CONVERT_HH__
+#define __BASE_VNC_CONVERT_HH__
+
+#include "base/bitunion.hh"
+
+class VideoConvert
+{
+  public:
+    enum Mode {
+        UnknownMode,
+        bgr565,
+        rgb565,
+        bgr8888,
+        rgb8888,
+        rgb888,
+        bgr888,
+        bgr444,
+        bgr4444,
+        rgb444,
+        rgb4444,
+    };
+
+    // supports bpp32 RGB (bmp) and bpp16 5:6:5 mode BGR (linux)
+    BitUnion32(Rgb8888)
+        Bitfield<7,0> blue;
+        Bitfield<15,8> green;
+        Bitfield<23,16> red;
+        Bitfield<31,24> alpha;
+    EndBitUnion(Rgb8888)
+
+    BitUnion32(Bgr8888)
+        Bitfield<7,0> red;
+        Bitfield<15,8> green;
+        Bitfield<23,16> blue;
+        Bitfield<31,24> alpha;
+    EndBitUnion(Bgr8888)
+
+    BitUnion16(Bgr565)
+        Bitfield<4,0> red;
+        Bitfield<10,5> green;
+        Bitfield<15,11> blue;
+    EndBitUnion(Bgr565)
+
+    BitUnion16(Rgb565)
+        Bitfield<4,0> red;
+        Bitfield<10,5> green;
+        Bitfield<15,11> blue;
+    EndBitUnion(Rgb565)
+
+    /** Setup the converter with the given parameters
+     * @param input_mode type of data that will be provided
+     * @param output_mode type of data that should be output
+     * @param _width width of the frame buffer
+     * @param _height height of the frame buffer
+     */
+    VideoConvert(Mode input_mode, Mode output_mode, int _width, int _height);
+
+    /** Destructor
+     */
+    ~VideoConvert();
+
+    /** Convert the provided frame buffer data into the format specified in the
+     * constructor.
+     * @param fb the frame buffer to convert
+     * @return the converted data (caller must delete[] it)
+     */
+    uint8_t* convert(uint8_t *fb);
+
+    /** Return the number of pixels that this buffer specifies
+     * @return number of pixels
+     */
+    int area() { return width * height; }
+
+  private:
+
+    /**
+     * Convert a bgr8888 input to rgb8888.
+     * @param fb the data to convert
+     * @return converted data
+     */
+    uint8_t* bgr8888rgb8888(uint8_t *fb);
+
+    /**
+     * Convert a bgr565 or rgb565 input to rgb8888.
+     * @param fb the data to convert
+     * @param bgr true if the input data is bgr565
+     * @return converted data
+     */
+    uint8_t* m565rgb8888(uint8_t *fb, bool bgr);
+
+    Mode inputMode;
+    Mode outputMode;
+    int width;
+    int height;
+};
+
+#endif // __BASE_VNC_CONVERT_HH__
+
diff --git a/src/base/vnc/vncserver.cc b/src/base/vnc/vncserver.cc
new file mode 100644
index 0000000000..8936fa67b5
--- /dev/null
+++ b/src/base/vnc/vncserver.cc
@@ -0,0 +1,703 @@
+/*
+ * Copyright (c) 2010 ARM Limited
+ * All rights reserved
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: Ali Saidi
+ *          William Wang
+ */
+
+/** @file
+ * Implementation of a VNC server
+ */
+
+#include <cstdio>
+
+#include <sys/ioctl.h>
+#include <sys/termios.h>
+#include <errno.h>
+#include <poll.h>
+#include <unistd.h>
+
+#include "base/atomicio.hh"
+#include "base/misc.hh"
+#include "base/socket.hh"
+#include "base/trace.hh"
+#include "base/vnc/vncserver.hh"
+#include "sim/byteswap.hh"
+
+using namespace std;
+
+/**
+ * Poll event for the listen socket
+ */
+VncServer::ListenEvent::ListenEvent(VncServer *vs, int fd, int e)
+    : PollEvent(fd, e), vncserver(vs)
+{
+}
+
+void
+VncServer::ListenEvent::process(int revent)
+{
+    vncserver->accept();
+}
+
+/**
+ * Poll event for the data socket
+ */
+VncServer::DataEvent::DataEvent(VncServer *vs, int fd, int e)
+    : PollEvent(fd, e), vncserver(vs)
+{
+}
+
+void
+VncServer::DataEvent::process(int revent)
+{
+    if (revent & POLLIN)
+        vncserver->data();
+    else if (revent & POLLNVAL)
+        vncserver->detach();
+}
+
+/**
+ * VncServer
+ */
+VncServer::VncServer(const Params *p)
+    : SimObject(p), listenEvent(NULL), dataEvent(NULL), number(p->number),
+      dataFd(-1), _videoWidth(1), _videoHeight(1), clientRfb(0), keyboard(NULL),
+      mouse(NULL), sendUpdate(false), videoMode(VideoConvert::UnknownMode),
+      vc(NULL)
+{
+    if (p->port)
+        listen(p->port);
+
+    curState = WaitForProtocolVersion;
+
+
+    // currently we only support this one pixel format
+    // unpacked 32bit rgb (rgb888 + 8 bits of nothing/alpha)
+    // keep it around for telling the client and making
+    // sure the client cooperates
+    pixelFormat.bpp = 32;
+    pixelFormat.depth = 24;
+    pixelFormat.bigendian = 0;
+    pixelFormat.truecolor = 1;
+    pixelFormat.redmax = 0xff;
+    pixelFormat.greenmax = 0xff;
+    pixelFormat.bluemax = 0xff;
+    pixelFormat.redshift = 16;
+    pixelFormat.greenshift = 8;
+    pixelFormat.blueshift = 0;
+
+
+    DPRINTF(VNC, "Vnc server created at port %d\n", p->port);
+}
+
+VncServer::~VncServer()
+{
+    if (dataFd != -1)
+        ::close(dataFd);
+
+    if (listenEvent)
+        delete listenEvent;
+
+    if (dataEvent)
+        delete dataEvent;
+}
+
+
+//socket creation and vnc client attach
+void
+VncServer::listen(int port)
+{
+    if (ListenSocket::allDisabled()) {
+        warn_once("Sockets disabled, not accepting vnc client connections");
+        return;
+    }
+
+    while (!listener.listen(port, true)) {
+        DPRINTF(VNC,
+                "can't bind address vnc server port %d in use PID %d\n",
+                port, getpid());
+        port++;
+    }
+
+    int p1, p2;
+    p2 = name().rfind('.') - 1;
+    p1 = name().rfind('.', p2);
+    ccprintf(cerr, "Listening for %s connection on port %d\n",
+             name().substr(p1 + 1, p2 - p1), port);
+
+    listenEvent = new ListenEvent(this, listener.getfd(), POLLIN);
+    pollQueue.schedule(listenEvent);
+}
+
+// attach a vnc client
+void
+VncServer::accept()
+{
+    if (!listener.islistening())
+        panic("%s: cannot accept a connection if not listening!", name());
+
+    int fd = listener.accept(true);
+    if (dataFd != -1) {
+        char message[] = "vnc server already attached!\n";
+        atomic_write(fd, message, sizeof(message));
+        ::close(fd);
+        return;
+    }
+
+    dataFd = fd;
+
+    // Send our version number to the client
+    write((uint8_t*)vncVersion(), strlen(vncVersion()));
+
+    // read the client response
+    dataEvent = new DataEvent(this, dataFd, POLLIN);
+    pollQueue.schedule(dataEvent);
+
+    inform("VNC client attached\n");
+}
+
+// data called by data event
+void
+VncServer::data()
+{
+    // We have new data, see if we can handle it
+    size_t len;
+    DPRINTF(VNC, "Vnc client message received\n");
+
+    switch (curState) {
+      case WaitForProtocolVersion:
+        checkProtocolVersion();
+        break;
+      case WaitForSecurityResponse:
+        checkSecurity();
+        break;
+      case WaitForClientInit:
+        // Don't care about shared, just need to read it out of the socket
+        uint8_t shared;
+        len = read(&shared);
+        assert(len == 1);
+
+        // Send our idea of the frame buffer
+        sendServerInit();
+
+        break;
+      case NormalPhase:
+        uint8_t message_type;
+        len = read(&message_type);
+        if (!len) {
+            detach();
+            return;
+        }
+        assert(len == 1);
+
+        switch (message_type) {
+          case ClientSetPixelFormat:
+            setPixelFormat();
+            break;
+          case ClientSetEncodings:
+            setEncodings();
+            break;
+          case ClientFrameBufferUpdate:
+             requestFbUpdate();
+             break;
+          case ClientKeyEvent:
+             recvKeyboardInput();
+             break;
+          case ClientPointerEvent:
+             recvPointerInput();
+             break;
+          case ClientCutText:
+             recvCutText();
+             break;
+          default:
+             panic("Unimplemented message type recv from client: %d\n",
+                     message_type);
+             break;
+        }
+        break;
+      default:
+        panic("Unknown vnc server state\n");
+    }
+}
+
+
+// read from socket
+size_t
+VncServer::read(uint8_t *buf, size_t len)
+{
+    if (dataFd < 0)
+        panic("vnc not properly attached.\n");
+
+    ssize_t ret;
+    do {
+        ret = ::read(dataFd, buf, len);
+    } while (ret == -1 && errno == EINTR);
+
+
+    if (ret <= 0){
+        DPRINTF(VNC, "Read failed.\n");
+        detach();
+        return 0;
+    }
+
+    return ret;
+}
+
+size_t
+VncServer::read1(uint8_t *buf, size_t len)
+{
+    size_t read_len M5_VAR_USED;
+    read_len = read(buf + 1, len - 1);
+    assert(read_len == len - 1);
+    return read_len;
+}
+
+
+template<typename T>
+size_t
+VncServer::read(T* val)
+{
+    return read((uint8_t*)val, sizeof(T));
+}
+
+// write to socket
+size_t
+VncServer::write(const uint8_t *buf, size_t len)
+{
+    if (dataFd < 0)
+        panic("Vnc client not properly attached.\n");
+
+    ssize_t ret;
+    ret = atomic_write(dataFd, buf, len);
+
+    if (ret < 0 || (size_t)ret < len)
+        detach();
+
+    return ret;
+}
+
+template<typename T>
+size_t
+VncServer::write(T* val)
+{
+    return write((uint8_t*)val, sizeof(T));
+}
+
+size_t
+VncServer::write(const char* str)
+{
+    return write((uint8_t*)str, strlen(str));
+}
+
+// detach a vnc client
+void
+VncServer::detach()
+{
+    if (dataFd != -1) {
+        ::close(dataFd);
+        dataFd = -1;
+    }
+
+    if (!dataEvent || !dataEvent->queued())
+        return;
+
+    pollQueue.remove(dataEvent);
+    delete dataEvent;
+    dataEvent = NULL;
+    curState = WaitForProtocolVersion;
+
+    inform("VNC client detached\n");
+    DPRINTF(VNC, "detach vnc client %d\n", number);
+}
+
+void
+VncServer::sendError(const char* error_msg)
+{
+   uint32_t len = strlen(error_msg);
+   write(&len);
+   write(error_msg);
+}
+
+void
+VncServer::checkProtocolVersion()
+{
+    assert(curState == WaitForProtocolVersion);
+
+    size_t len M5_VAR_USED;
+    char version_string[13];
+
+    // Null terminate the message so it's easier to work with
+    version_string[12] = 0;
+
+    len = read((uint8_t*)version_string, 12);
+    assert(len == 12);
+
+    uint32_t major, minor;
+
+    // Figure out the major/minor numbers
+    if (sscanf(version_string, "RFB %03d.%03d\n", &major, &minor) != 2) {
+        warn(" Malformed protocol version %s\n", version_string);
+        sendError("Malformed protocol version\n");
+        detach();
+        return;
+    }
+    DPRINTF(VNC, "Client request protocol version %d.%d\n", major, minor);
+    // If it's not 3.X we don't support it
+    if (major != 3 || minor < 2) {
+        warn("Unsupported VNC client version... disconnecting\n");
+        uint8_t err = AuthInvalid;
+        write(&err);
+        detach();
+        return;
+    }
+    // Auth is different based on version number
+    if (minor < 7) {
+        uint32_t sec_type = htobe((uint32_t)AuthNone);
+        write(&sec_type);
+    } else {
+        uint8_t sec_cnt = 1;
+        uint8_t sec_type = htobe((uint8_t)AuthNone);
+        write(&sec_cnt);
+        write(&sec_type);
+    }
+
+    // Wait for client to respond
+    curState = WaitForSecurityResponse;
+}
+
+void
+VncServer::checkSecurity()
+{
+    assert(curState == WaitForSecurityResponse);
+
+    uint8_t security_type;
+    size_t len M5_VAR_USED = read(&security_type);
+
+    assert(len == 1);
+
+    if (security_type != AuthNone) {
+        warn("Unknown VNC security type\n");
+        sendError("Unknown security type\n");
+    }
+
+    DPRINTF(VNC, "Sending security auth OK\n");
+
+    uint32_t success = htobe(VncOK);
+    write(&success);
+    curState = WaitForClientInit;
+}
+
+void
+VncServer::sendServerInit()
+{
+    ServerInitMsg msg;
+
+    DPRINTF(VNC, "Sending server init message to client\n");
+
+    msg.fbWidth = htobe(videoWidth());
+    msg.fbHeight = htobe(videoHeight());
+
+    msg.px.bpp = htobe(pixelFormat.bpp);
+    msg.px.depth = htobe(pixelFormat.depth);
+    msg.px.bigendian = htobe(pixelFormat.bigendian);
+    msg.px.truecolor = htobe(pixelFormat.truecolor);
+    msg.px.redmax = htobe(pixelFormat.redmax);
+    msg.px.greenmax = htobe(pixelFormat.greenmax);
+    msg.px.bluemax = htobe(pixelFormat.bluemax);
+    msg.px.redshift = htobe(pixelFormat.redshift);
+    msg.px.greenshift = htobe(pixelFormat.greenshift);
+    msg.px.blueshift = htobe(pixelFormat.blueshift);
+    memset(msg.px.padding, 0, 3);
+    msg.namelen = 2;
+    msg.namelen = htobe(msg.namelen);
+    memcpy(msg.name, "M5", 2);
+
+    write(&msg);
+    curState = NormalPhase;
+}
+
+
+void
+VncServer::setPixelFormat()
+{
+    DPRINTF(VNC, "Received pixel format from client message\n");
+
+    PixelFormatMessage pfm;
+    read1((uint8_t*)&pfm, sizeof(PixelFormatMessage));
+
+    DPRINTF(VNC, " -- bpp = %d; depth = %d; be = %d\n", pfm.px.bpp,
+            pfm.px.depth, pfm.px.bigendian);
+    DPRINTF(VNC, " -- true color = %d red,green,blue max = %d,%d,%d\n",
+            pfm.px.truecolor, betoh(pfm.px.redmax), betoh(pfm.px.greenmax),
+                betoh(pfm.px.bluemax));
+    DPRINTF(VNC, " -- red,green,blue shift = %d,%d,%d\n", pfm.px.redshift,
+            pfm.px.greenshift, pfm.px.blueshift);
+
+    if (betoh(pfm.px.bpp) != pixelFormat.bpp ||
+        betoh(pfm.px.depth) != pixelFormat.depth ||
+        betoh(pfm.px.bigendian) != pixelFormat.bigendian ||
+        betoh(pfm.px.truecolor) != pixelFormat.truecolor ||
+        betoh(pfm.px.redmax) != pixelFormat.redmax ||
+        betoh(pfm.px.greenmax) != pixelFormat.greenmax ||
+        betoh(pfm.px.bluemax) != pixelFormat.bluemax ||
+        betoh(pfm.px.redshift) != pixelFormat.redshift ||
+        betoh(pfm.px.greenshift) != pixelFormat.greenshift ||
+        betoh(pfm.px.blueshift) != pixelFormat.blueshift)
+        fatal("VNC client doesn't support true color raw encoding\n");
+}
+
+void
+VncServer::setEncodings()
+{
+    DPRINTF(VNC, "Received supported encodings from client\n");
+
+    PixelEncodingsMessage pem;
+    read1((uint8_t*)&pem, sizeof(PixelEncodingsMessage));
+
+    pem.num_encodings = betoh(pem.num_encodings);
+
+    DPRINTF(VNC, " -- %d encoding present\n", pem.num_encodings);
+    supportsRawEnc = supportsResizeEnc = false;
+
+    for (int x = 0; x < pem.num_encodings; x++) {
+        int32_t encoding;
+        size_t len M5_VAR_USED;
+        len = read(&encoding);
+        assert(len == sizeof(encoding));
+        DPRINTF(VNC, " -- supports %d\n", betoh(encoding));
+
+        switch (betoh(encoding)) {
+          case EncodingRaw:
+            supportsRawEnc = true;
+            break;
+          case EncodingDesktopSize:
+            supportsResizeEnc = true;
+            break;
+        }
+    }
+
+    if (!supportsRawEnc)
+        fatal("VNC clients must always support raw encoding\n");
+}
+
+void
+VncServer::requestFbUpdate()
+{
+    DPRINTF(VNC, "Received frame buffer update request from client\n");
+
+    FrameBufferUpdateReq fbr;
+    read1((uint8_t*)&fbr, sizeof(FrameBufferUpdateReq));
+
+    fbr.x = betoh(fbr.x);
+    fbr.y = betoh(fbr.y);
+    fbr.width = betoh(fbr.width);
+    fbr.height = betoh(fbr.height);
+
+    DPRINTF(VNC, " -- x = %d y = %d w = %d h = %d\n", fbr.x, fbr.y, fbr.width,
+            fbr.height);
+
+    sendFrameBufferUpdate();
+}
+
+void
+VncServer::recvKeyboardInput()
+{
+    DPRINTF(VNC, "Received keyboard input from client\n");
+    KeyEventMessage kem;
+    read1((uint8_t*)&kem, sizeof(KeyEventMessage));
+
+    kem.key = betoh(kem.key);
+    DPRINTF(VNC, " -- received key code %d (%s)\n", kem.key, kem.down_flag ?
+            "down" : "up");
+
+    if (keyboard)
+        keyboard->keyPress(kem.key, kem.down_flag);
+}
+
+void
+VncServer::recvPointerInput()
+{
+    DPRINTF(VNC, "Received pointer input from client\n");
+    PointerEventMessage pem;
+
+    read1((uint8_t*)&pem, sizeof(PointerEventMessage));
+
+    pem.x = betoh(pem.x);
+    pem.y = betoh(pem.y);
+    DPRINTF(VNC, " -- pointer at x = %d y = %d buttons = %#x\n", pem.x, pem.y,
+            pem.button_mask);
+
+    if (mouse)
+        mouse->mouseAt(pem.x, pem.y, pem.button_mask);
+}
+
+void
+VncServer::recvCutText()
+{
+    DPRINTF(VNC, "Received client copy buffer message\n");
+
+    ClientCutTextMessage cct;
+    read1((uint8_t*)&cct, sizeof(ClientCutTextMessage));
+
+    char str[1025];
+    size_t data_len = betoh(cct.length);
+    DPRINTF(VNC, "String length %d\n", data_len);
+    while (data_len > 0) {
+        size_t len;
+        size_t bytes_to_read = data_len > 1024 ? 1024 : data_len;
+        len = read((uint8_t*)&str, bytes_to_read);
+        str[len] = 0;
+        data_len -= len;
+        assert(data_len >= 0);
+        DPRINTF(VNC, "Buffer: %s\n", str);
+    }
+
+}
+
+
+void
+VncServer::sendFrameBufferUpdate()
+{
+
+    if (!clientRfb || dataFd <= 0 || curState != NormalPhase || !sendUpdate) {
+        DPRINTF(VNC, "NOT sending framebuffer update\n");
+        return;
+    }
+
+    assert(vc);
+
+    // The client will request data constantly, unless we throttle it
+    sendUpdate = false;
+
+    DPRINTF(VNC, "Sending framebuffer update\n");
+
+    FrameBufferUpdate fbu;
+    FrameBufferRect fbr;
+
+    fbu.type = ServerFrameBufferUpdate;
+    fbu.num_rects = 1;
+    fbr.x = 0;
+    fbr.y = 0;
+    fbr.width = videoWidth();
+    fbr.height = videoHeight();
+    fbr.encoding = EncodingRaw;
+
+    // fix up endian
+    fbu.num_rects = htobe(fbu.num_rects);
+    fbr.x = htobe(fbr.x);
+    fbr.y = htobe(fbr.y);
+    fbr.width = htobe(fbr.width);
+    fbr.height = htobe(fbr.height);
+    fbr.encoding = htobe(fbr.encoding);
+
+    // send headers to client
+    write(&fbu);
+    write(&fbr);
+
+    assert(clientRfb);
+
+    uint8_t *tmp = vc->convert(clientRfb);
+    write(tmp, videoWidth() * videoHeight() * sizeof(uint32_t));
+    delete [] tmp;
+
+}
+
+void
+VncServer::sendFrameBufferResized()
+{
+    assert(clientRfb && dataFd > 0 && curState == NormalPhase);
+    DPRINTF(VNC, "Sending framebuffer resize\n");
+
+    FrameBufferUpdate fbu;
+    FrameBufferRect fbr;
+
+    fbu.type = ServerFrameBufferUpdate;
+    fbu.num_rects = 1;
+    fbr.x = 0;
+    fbr.y = 0;
+    fbr.width = videoWidth();
+    fbr.height = videoHeight();
+    fbr.encoding = EncodingDesktopSize;
+
+    // fix up endian
+    fbu.num_rects = htobe(fbu.num_rects);
+    fbr.x = htobe(fbr.x);
+    fbr.y = htobe(fbr.y);
+    fbr.width = htobe(fbr.width);
+    fbr.height = htobe(fbr.height);
+    fbr.encoding = htobe(fbr.encoding);
+
+    // send headers to client
+    write(&fbu);
+    write(&fbr);
+
+    // No actual data is sent in this message
+}
+
+void
+VncServer::setFrameBufferParams(VideoConvert::Mode mode, int width, int height)
+{
+    DPRINTF(VNC, "Updating video params: mode: %d width: %d height: %d\n", mode,
+            width, height);
+
+    if (mode != videoMode || width != videoWidth() || height != videoHeight()) {
+        videoMode = mode;
+        _videoWidth = width;
+        _videoHeight = height;
+
+        if (vc)
+            delete vc;
+
+        vc = new VideoConvert(mode, VideoConvert::rgb8888, videoWidth(),
+                videoHeight());
+
+        if (dataFd > 0 && clientRfb && curState == NormalPhase) {
+            if (supportsResizeEnc)
+                sendFrameBufferResized();
+            else
+                // The frame buffer changed size and we can't update the client
+                detach();
+        }
+    }
+}
+
+// create the VNC server object
+VncServer *
+VncServerParams::create()
+{
+    return new VncServer(this);
+}
diff --git a/src/base/vnc/vncserver.hh b/src/base/vnc/vncserver.hh
new file mode 100644
index 0000000000..23b097b111
--- /dev/null
+++ b/src/base/vnc/vncserver.hh
@@ -0,0 +1,475 @@
+/*
+ * Copyright (c) 2010 ARM Limited
+ * All rights reserved
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: Ali Saidi
+ *          William Wang
+ */
+
+/** @file
+ * Declaration of a VNC server
+ */
+
+#ifndef __DEV_VNC_SERVER_HH__
+#define __DEV_VNC_SERVER_HH__
+
+#include <iostream>
+
+#include "base/circlebuf.hh"
+#include "base/pollevent.hh"
+#include "base/socket.hh"
+#include "base/vnc/convert.hh"
+#include "cpu/intr_control.hh"
+#include "sim/sim_object.hh"
+#include "params/VncServer.hh"
+
+/**
+ * A device that expects to receive input from the vnc server should derive
+ * (through multiple inheritance if necessary) from VncKeyboard or VncMouse
+ * and call setKeyboard() or setMouse() respectively on the vnc server.
+ */
+class VncKeyboard
+{
+  public:
+    /**
+     * Called when the vnc server receives a key press event from the
+     * client.
+     * @param key the key passed is an x11 keysym
+     * @param down is the key now down or up?
+     */
+    virtual void keyPress(uint32_t key, bool down) = 0;
+};
+
+class VncMouse
+{
+  public:
+    /**
+     * called whenever the mouse moves or its button state changes
+     * buttons is a simple mask with each button (0-7) corresponding to
+     * a bit position in the byte with 1 being down and 0 being up
+     * @param x the x position of the mouse
+     * @param y the y position of the mouse
+     * @param buttons the button state as described above
+     */
+    virtual void mouseAt(uint16_t x, uint16_t y, uint8_t buttons) = 0;
+};
+
+class VncServer : public SimObject
+{
+  public:
+
+    /**
+     * \defgroup VncConstants A set of constants and structs from the VNC spec
+     * @{
+     */
+    /** Authentication modes */
+    const static uint32_t AuthInvalid = 0;
+    const static uint32_t AuthNone    = 1;
+
+    /** Error conditions */
+    const static uint32_t VncOK   = 0;
+
+    /** Client -> Server message IDs */
+    enum ClientMessages {
+        ClientSetPixelFormat    = 0,
+        ClientSetEncodings      = 2,
+        ClientFrameBufferUpdate = 3,
+        ClientKeyEvent          = 4,
+        ClientPointerEvent      = 5,
+        ClientCutText           = 6
+    };
+
+    /** Server -> Client message IDs */
+    enum ServerMessages {
+        ServerFrameBufferUpdate     = 0,
+        ServerSetColorMapEntries    = 1,
+        ServerBell                  = 2,
+        ServerCutText               = 3
+    };
+
+    /** Encoding types */
+    enum EncodingTypes {
+        EncodingRaw         = 0,
+        EncodingCopyRect    = 1,
+        EncodingHextile     = 5,
+        EncodingDesktopSize = -223
+    };
+
+    /** keyboard/mouse support */
+    enum MouseEvents {
+        MouseLeftButton     = 0x1,
+        MouseRightButton    = 0x2,
+        MouseMiddleButton   = 0x4
+    };
+
+    const char* vncVersion() const
+    {
+        return "RFB 003.008\n";
+    }
+
+    enum ConnectionState {
+        WaitForProtocolVersion,
+        WaitForSecurityResponse,
+        WaitForClientInit,
+        InitializationPhase,
+        NormalPhase
+    };
+
+    struct PixelFormat {
+        uint8_t bpp;
+        uint8_t depth;
+        uint8_t bigendian;
+        uint8_t truecolor;
+        uint16_t redmax;
+        uint16_t greenmax;
+        uint16_t bluemax;
+        uint8_t redshift;
+        uint8_t greenshift;
+        uint8_t blueshift;
+        uint8_t padding[3];
+    } M5_ATTR_PACKED;
+
+    struct ServerInitMsg {
+        uint16_t fbWidth;
+        uint16_t fbHeight;
+        PixelFormat px;
+        uint32_t namelen;
+        char name[2]; // just to put M5 in here
+    } M5_ATTR_PACKED;
+
+    struct PixelFormatMessage {
+        uint8_t type;
+        uint8_t padding[3];
+        PixelFormat px;
+    } M5_ATTR_PACKED;
+
+    struct PixelEncodingsMessage {
+        uint8_t type;
+        uint8_t padding;
+        uint16_t num_encodings;
+    } M5_ATTR_PACKED;
+
+    struct FrameBufferUpdateReq {
+        uint8_t type;
+        uint8_t incremental;
+        uint16_t x;
+        uint16_t y;
+        uint16_t width;
+        uint16_t height;
+    } M5_ATTR_PACKED;
+
+    struct KeyEventMessage {
+        uint8_t type;
+        uint8_t down_flag;
+        uint8_t padding[2];
+        uint32_t key;
+    } M5_ATTR_PACKED;
+
+    struct PointerEventMessage {
+        uint8_t type;
+        uint8_t button_mask;
+        uint16_t x;
+        uint16_t y;
+    } M5_ATTR_PACKED;
+
+    struct ClientCutTextMessage {
+        uint8_t type;
+        uint8_t padding[3];
+        uint32_t length;
+    } M5_ATTR_PACKED;
+
+    struct FrameBufferUpdate {
+        uint8_t type;
+        uint8_t padding;
+        uint16_t num_rects;
+    } M5_ATTR_PACKED;
+
+    struct FrameBufferRect {
+        uint16_t x;
+        uint16_t y;
+        uint16_t width;
+        uint16_t height;
+        int32_t encoding;
+    } M5_ATTR_PACKED;
+
+    struct ServerCutText {
+        uint8_t type;
+        uint8_t padding[3];
+        uint32_t length;
+    } M5_ATTR_PACKED;
+
+    /** @} */
+
+  protected:
+    /** ListenEvent to accept a vnc client connection */
+    class ListenEvent: public PollEvent
+    {
+      protected:
+        VncServer *vncserver;
+
+      public:
+        ListenEvent(VncServer *vs, int fd, int e);
+        void process(int revent);
+    };
+
+    friend class ListenEvent;
+    ListenEvent *listenEvent;
+
+    /** DataEvent to read data from vnc */
+    class DataEvent: public PollEvent
+    {
+      protected:
+        VncServer *vncserver;
+
+      public:
+        DataEvent(VncServer *vs, int fd, int e);
+        void process(int revent);
+    };
+
+    friend class DataEvent;
+    DataEvent *dataEvent;
+
+    int number;
+    int dataFd; // data stream file descriptor
+
+    ListenSocket listener;
+
+    void listen(int port);
+    void accept();
+    void data();
+    void detach();
+
+  public:
+    typedef VncServerParams Params;
+    VncServer(const Params *p);
+    ~VncServer();
+
+    // RFB
+  protected:
+
+    /** The rfb protocol state the connection is in */
+    ConnectionState curState;
+
+    /** the width of the frame buffer we are sending to the client */
+    uint16_t _videoWidth;
+
+    /** the height of the frame buffer we are sending to the client */
+    uint16_t _videoHeight;
+
+    /** pointer to the actual data that is stored in the frame buffer device */
+    uint8_t* clientRfb;
+
+    /** The device to notify when we get key events */
+    VncKeyboard *keyboard;
+
+    /** The device to notify when we get mouse events */
+    VncMouse *mouse;
+
+    /** An update needs to be sent to the client. Without doing this the
+     * client will constantly request data that is pointless */
+    bool sendUpdate;
+
+    /** The one and only pixel format we support */
+    PixelFormat pixelFormat;
+
+    /** If the vnc client supports receiving raw data. It always should */
+    bool supportsRawEnc;
+
+    /** If the vnc client supports the desktop resize command */
+    bool supportsResizeEnc;
+
+    /** The mode of data we're getting frame buffer in */
+    VideoConvert::Mode videoMode;
+
+    /** The video converter that transforms data for us */
+    VideoConvert *vc;
+
+  protected:
+    /**
+     * vnc client Interface
+     */
+
+    /** Send an error message to the client
+     * @param error_msg text to send describing the error
+     */
+    void sendError(const char* error_msg);
+
+    /** Read some data from the client
+     * @param buf the buffer to read data into
+     * @param len the amount of data to read
+     * @return length read
+     */
+    size_t read(uint8_t *buf, size_t len);
+
+    /** Read len -1 bytes from the client into the buffer provided + 1
+     * assert that we read enough bytes. This function exists to handle
+     * reading all of the protocol structs above when we've already read
+     * the first byte which describes which one we're reading
+     * @param buf the address of the buffer to add one to and read data into
+     * @param len the amount of data + 1 to read
+     * @return length read
+     */
+    size_t read1(uint8_t *buf, size_t len);
+
+
+    /** Templated version of the read function above to
+     * read simple data from the client
+     * @param val data to recv from the client
+     */
+    template <typename T> size_t read(T* val);
+
+
+    /** Write a buffer to the client.
+     * @param buf buffer to send
+     * @param len length of the buffer
+     * @return number of bytes sent
+     */
+    size_t write(const uint8_t *buf, size_t len);
+
+    /** Templated version of the write function above to
+     * write simple data to the client
+     * @param val data to send to the client
+     */
+    template <typename T> size_t write(T* val);
+
+    /** Send a string to the client
+     * @param str string to transmit
+     */
+    size_t write(const char* str);
+
+    /** Check the client's protocol version for compatibility and send
+     * the security types we support
+     */
+    void checkProtocolVersion();
+
+    /** Check that the security exchange was successful
+     */
+    void checkSecurity();
+
+    /** Send client our idea about what the frame buffer looks like */
+    void sendServerInit();
+
+    /** Send an error message to the client when something goes wrong
+     * @param error_msg error to send
+     */
+    void sendError(std::string error_msg);
+
+    /** Send an updated frame buffer to the client.
+     * @todo this doesn't do anything smart and just sends the entire image
+     */
+    void sendFrameBufferUpdate();
+
+    /** Receive pixel format message from client and process it. */
+    void setPixelFormat();
+
+    /** Receive encodings message from client and process it. */
+    void setEncodings();
+
+    /** Receive message from client asking for updated frame buffer */
+    void requestFbUpdate();
+
+    /** Receive message from client providing new keyboard input */
+    void recvKeyboardInput();
+
+    /** Recv message from client providing new mouse movement or button click */
+    void recvPointerInput();
+
+    /** Receive message from client that there is text in its paste buffer.
+     * This is a no-op at the moment, but perhaps we would want to be able to
+     * paste it at some point.
+     */
+    void recvCutText();
+
+    /** Tell the client that the frame buffer resized. This happens when the
+     * simulated system changes video modes (E.g. X11 starts).
+     */
+    void sendFrameBufferResized();
+
+  public:
+    /** Set the address of the frame buffer we are going to show.
+     * To avoid copying, just have the display controller
+     * tell us where the data is instead of constantly copying it around
+     * @param rfb frame buffer that we're going to use
+     */
+    void
+    setFramebufferAddr(uint8_t* rfb)
+    {
+        clientRfb = rfb;
+    }
+
+    /** Set up the device that would like to receive notifications when keys are
+     * pressed in the vnc client keyboard
+     * @param _keyboard an object that derives from VncKeyboard
+     */
+    void setKeyboard(VncKeyboard *_keyboard) { keyboard = _keyboard; }
+
+    /** Set up the device that would like to receive notifications when mouse
+     * movements or button presses are received from the vnc client.
+     * @param _mouse an object that derives from VncMouse
+     */
+    void setMouse(VncMouse *_mouse) { mouse = _mouse; }
+
+    /** The frame buffer uses this call to notify the vnc server that
+     * the frame buffer has been updated and a new image needs to be sent to the
+     * client
+     */
+    void
+    setDirty()
+    {
+        sendUpdate = true;
+        sendFrameBufferUpdate();
+    }
+
+    /** What is the width of the screen we're displaying.
+     * This is used for pointer/tablet devices that need to know to calculate
+     * the correct value to send to the device driver.
+     * @return the width of the simulated screen
+     */
+    uint16_t videoWidth() { return _videoWidth; }
+
+    /** What is the height of the screen we're displaying.
+     * This is used for pointer/tablet devices that need to know to calculate
+     * the correct value to send to the device driver.
+     * @return the height of the simulated screen
+     */
+    uint16_t videoHeight() { return _videoHeight; }
+
+    /** Set the mode of the data the frame buffer will be sending us
+     * @param mode the mode
+     */
+    void setFrameBufferParams(VideoConvert::Mode mode, int width, int height);
+};
+
+#endif
diff --git a/src/cpu/base_dyn_inst.hh b/src/cpu/base_dyn_inst.hh
index 0c566ec656..8b6662d700 100644
--- a/src/cpu/base_dyn_inst.hh
+++ b/src/cpu/base_dyn_inst.hh
@@ -1,4 +1,16 @@
 /*
+ * Copyright (c) 2011 ARM Limited
+ * All rights reserved.
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
  * Copyright (c) 2004-2006 The Regents of The University of Michigan
  * Copyright (c) 2009 The University of Edinburgh
  * All rights reserved.
@@ -150,6 +162,29 @@ class BaseDynInst : public FastAlloc, public RefCounted
     /** Finish a DTB address translation. */
     void finishTranslation(WholeTranslationState *state);
 
+    /** True if the DTB address translation has started. */
+    bool translationStarted;
+
+    /** True if the DTB address translation has completed. */
+    bool translationCompleted;
+
+    /**
+     * Returns true if the DTB address translation is being delayed due to a hw
+     * page table walk.
+     */
+    bool isTranslationDelayed() const
+    {
+        return (translationStarted && !translationCompleted);
+    }
+
+    /**
+     * Saved memory requests (needed when the DTB address translation is
+     * delayed due to a hw page table walk).
+     */
+    RequestPtr savedReq;
+    RequestPtr savedSreqLow;
+    RequestPtr savedSreqHigh;
+
     /** @todo: Consider making this private. */
   public:
     /** The sequence number of the instruction. */
@@ -835,33 +870,42 @@ BaseDynInst<Impl>::readBytes(Addr addr, uint8_t *data,
                              unsigned size, unsigned flags)
 {
     reqMade = true;
-    Request *req = new Request(asid, addr, size, flags, this->pc.instAddr(),
-                               thread->contextId(), threadNumber);
-
+    Request *req = NULL;
     Request *sreqLow = NULL;
     Request *sreqHigh = NULL;
 
-    // Only split the request if the ISA supports unaligned accesses.
-    if (TheISA::HasUnalignedMemAcc) {
-        splitRequest(req, sreqLow, sreqHigh);
-    }
-    initiateTranslation(req, sreqLow, sreqHigh, NULL, BaseTLB::Read);
-
-    if (fault == NoFault) {
-        effAddr = req->getVaddr();
-        effAddrValid = true;
-        fault = cpu->read(req, sreqLow, sreqHigh, data, lqIdx);
+    if (reqMade && translationStarted) {
+        req = savedReq;
+        sreqLow = savedSreqLow;
+        sreqHigh = savedSreqHigh;
     } else {
-        // Commit will have to clean up whatever happened.  Set this
-        // instruction as executed.
-        this->setExecuted();
+        req = new Request(asid, addr, size, flags, this->pc.instAddr(),
+                          thread->contextId(), threadNumber);
+
+        // Only split the request if the ISA supports unaligned accesses.
+        if (TheISA::HasUnalignedMemAcc) {
+            splitRequest(req, sreqLow, sreqHigh);
+        }
+        initiateTranslation(req, sreqLow, sreqHigh, NULL, BaseTLB::Read);
     }
 
-    if (fault != NoFault) {
-        // Return a fixed value to keep simulation deterministic even
-        // along misspeculated paths.
-        if (data)
-            bzero(data, size);
+    if (translationCompleted) {
+        if (fault == NoFault) {
+            effAddr = req->getVaddr();
+            effAddrValid = true;
+            fault = cpu->read(req, sreqLow, sreqHigh, data, lqIdx);
+        } else {
+            // Commit will have to clean up whatever happened.  Set this
+            // instruction as executed.
+            this->setExecuted();
+        }
+
+        if (fault != NoFault) {
+            // Return a fixed value to keep simulation deterministic even
+            // along misspeculated paths.
+            if (data)
+                bzero(data, size);
+        }
     }
 
     if (traceData) {
@@ -897,19 +941,26 @@ BaseDynInst<Impl>::writeBytes(uint8_t *data, unsigned size,
     }
 
     reqMade = true;
-    Request *req = new Request(asid, addr, size, flags, this->pc.instAddr(),
-                               thread->contextId(), threadNumber);
-
+    Request *req = NULL;
     Request *sreqLow = NULL;
     Request *sreqHigh = NULL;
 
-    // Only split the request if the ISA supports unaligned accesses.
-    if (TheISA::HasUnalignedMemAcc) {
-        splitRequest(req, sreqLow, sreqHigh);
-    }
-    initiateTranslation(req, sreqLow, sreqHigh, res, BaseTLB::Write);
+    if (reqMade && translationStarted) {
+        req = savedReq;
+        sreqLow = savedSreqLow;
+        sreqHigh = savedSreqHigh;
+    } else {
+        req = new Request(asid, addr, size, flags, this->pc.instAddr(),
+                          thread->contextId(), threadNumber);
 
-    if (fault == NoFault) {
+        // Only split the request if the ISA supports unaligned accesses.
+        if (TheISA::HasUnalignedMemAcc) {
+            splitRequest(req, sreqLow, sreqHigh);
+        }
+        initiateTranslation(req, sreqLow, sreqHigh, res, BaseTLB::Write);
+    }
+
+    if (fault == NoFault && translationCompleted) {
         effAddr = req->getVaddr();
         effAddrValid = true;
         fault = cpu->write(req, sreqLow, sreqHigh, data, sqIdx);
@@ -953,6 +1004,8 @@ BaseDynInst<Impl>::initiateTranslation(RequestPtr req, RequestPtr sreqLow,
                                        RequestPtr sreqHigh, uint64_t *res,
                                        BaseTLB::Mode mode)
 {
+    translationStarted = true;
+
     if (!TheISA::HasUnalignedMemAcc || sreqLow == NULL) {
         WholeTranslationState *state =
             new WholeTranslationState(req, NULL, res, mode);
@@ -961,6 +1014,12 @@ BaseDynInst<Impl>::initiateTranslation(RequestPtr req, RequestPtr sreqLow,
         DataTranslation<BaseDynInst<Impl> > *trans =
             new DataTranslation<BaseDynInst<Impl> >(this, state);
         cpu->dtb->translateTiming(req, thread->getTC(), trans, mode);
+        if (!translationCompleted) {
+            // Save memory requests.
+            savedReq = state->mainReq;
+            savedSreqLow = state->sreqLow;
+            savedSreqHigh = state->sreqHigh;
+        }
     } else {
         WholeTranslationState *state =
             new WholeTranslationState(req, sreqLow, sreqHigh, NULL, res, mode);
@@ -973,6 +1032,12 @@ BaseDynInst<Impl>::initiateTranslation(RequestPtr req, RequestPtr sreqLow,
 
         cpu->dtb->translateTiming(sreqLow, thread->getTC(), stransLow, mode);
         cpu->dtb->translateTiming(sreqHigh, thread->getTC(), stransHigh, mode);
+        if (!translationCompleted) {
+            // Save memory requests.
+            savedReq = state->mainReq;
+            savedSreqLow = state->sreqLow;
+            savedSreqHigh = state->sreqHigh;
+        }
     }
 }
 
@@ -998,6 +1063,8 @@ BaseDynInst<Impl>::finishTranslation(WholeTranslationState *state)
         state->deleteReqs();
     }
     delete state;
+
+    translationCompleted = true;
 }
 
 #endif // __CPU_BASE_DYN_INST_HH__
diff --git a/src/cpu/base_dyn_inst_impl.hh b/src/cpu/base_dyn_inst_impl.hh
index 74f199d5f6..7e4d25322b 100644
--- a/src/cpu/base_dyn_inst_impl.hh
+++ b/src/cpu/base_dyn_inst_impl.hh
@@ -1,4 +1,16 @@
 /*
+ * Copyright (c) 2011 ARM Limited
+ * All rights reserved.
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
  * Copyright (c) 2004-2006 The Regents of The University of Michigan
  * All rights reserved.
  *
@@ -107,6 +119,9 @@ BaseDynInst<Impl>::initVars()
     effAddrValid = false;
     physEffAddr = 0;
 
+    translationStarted = false;
+    translationCompleted = false;
+
     isUncacheable = false;
     reqMade = false;
     readyRegs = 0;
diff --git a/src/cpu/inorder/SConscript b/src/cpu/inorder/SConscript
index ae5ec02571..b9c526763b 100644
--- a/src/cpu/inorder/SConscript
+++ b/src/cpu/inorder/SConscript
@@ -55,7 +55,7 @@ if 'InOrderCPU' in env['CPU_MODELS']:
         TraceFlag('ThreadModel')
         TraceFlag('RefCount')
         TraceFlag('AddrDep')
-
+        TraceFlag('SkedCache')
 
         CompoundFlag('InOrderCPUAll', [ 'InOrderStage', 'InOrderStall', 'InOrderCPU',
                'InOrderMDU', 'InOrderAGEN', 'InOrderFetchSeq', 'InOrderTLB', 'InOrderBPred',
@@ -63,7 +63,6 @@ if 'InOrderCPU' in env['CPU_MODELS']:
                'InOrderGraduation', 'InOrderCachePort', 'RegDepMap', 'Resource',
                'ThreadModel', 'AddrDep'])
 
-        Source('pipeline_traits.cc')        
         Source('inorder_dyn_inst.cc')
         Source('inorder_cpu_builder.cc')
         Source('inorder_trace.cc')
diff --git a/src/cpu/inorder/cpu.cc b/src/cpu/inorder/cpu.cc
index ffdcae7df8..0ec4c98619 100644
--- a/src/cpu/inorder/cpu.cc
+++ b/src/cpu/inorder/cpu.cc
@@ -324,19 +324,19 @@ InOrderCPU::InOrderCPU(Params *params)
                                             tid, 
                                             asid[tid]);
 
-        dummyReq[tid] = new ResourceRequest(resPool->getResource(0), 
-                                            dummyInst[tid], 
-                                            0, 
-                                            0, 
-                                            0, 
-                                            0);        
+        dummyReq[tid] = new ResourceRequest(resPool->getResource(0));
     }
 
     dummyReqInst = new InOrderDynInst(this, NULL, 0, 0, 0);
     dummyReqInst->setSquashed();
+    dummyReqInst->resetInstCount();
 
     dummyBufferInst = new InOrderDynInst(this, NULL, 0, 0, 0);
     dummyBufferInst->setSquashed();
+    dummyBufferInst->resetInstCount();
+
+    endOfSkedIt = skedCache.end();
+    frontEndSked = createFrontEndSked();
     
     lastRunningCycle = curTick();
 
@@ -348,7 +348,6 @@ InOrderCPU::InOrderCPU(Params *params)
     reset();
 #endif
 
-    dummyBufferInst->resetInstCount();
     
     // Schedule First Tick Event, CPU will reschedule itself from here on out.
     scheduleTickEvent(0);
@@ -357,8 +356,131 @@ InOrderCPU::InOrderCPU(Params *params)
 InOrderCPU::~InOrderCPU()
 {
     delete resPool;
+
+    std::map<SkedID, ThePipeline::RSkedPtr>::iterator sked_it =
+        skedCache.begin();
+    std::map<SkedID, ThePipeline::RSkedPtr>::iterator sked_end =
+        skedCache.end();
+
+    while (sked_it != sked_end) {
+        delete (*sked_it).second;
+        sked_it++;
+    }
+    skedCache.clear();
 }
 
+std::map<InOrderCPU::SkedID, ThePipeline::RSkedPtr> InOrderCPU::skedCache;
+
+RSkedPtr
+InOrderCPU::createFrontEndSked()
+{
+    RSkedPtr res_sked = new ResourceSked();
+    int stage_num = 0;
+    StageScheduler F(res_sked, stage_num++);
+    StageScheduler D(res_sked, stage_num++);
+
+    // FETCH
+    F.needs(FetchSeq, FetchSeqUnit::AssignNextPC);
+    F.needs(ICache, FetchUnit::InitiateFetch);
+
+    // DECODE
+    D.needs(ICache, FetchUnit::CompleteFetch);
+    D.needs(Decode, DecodeUnit::DecodeInst);
+    D.needs(BPred, BranchPredictor::PredictBranch);
+    D.needs(FetchSeq, FetchSeqUnit::UpdateTargetPC);
+
+
+    DPRINTF(SkedCache, "Resource Sked created for instruction \"front_end\"\n");
+
+    return res_sked;
+}
+
+RSkedPtr
+InOrderCPU::createBackEndSked(DynInstPtr inst)
+{
+    if (!inst->staticInst) {
+        warn_once("Static Instruction Object Not Set. Can't Create"
+                  " Back End Schedule");
+        return NULL;
+    }
+
+    RSkedPtr res_sked = lookupSked(inst);
+    if (res_sked != NULL) {
+        DPRINTF(SkedCache, "Found %s in sked cache.\n",
+                inst->instName());
+        return res_sked;
+    } else {
+        res_sked = new ResourceSked();
+    }
+
+    int stage_num = ThePipeline::BackEndStartStage;
+    StageScheduler X(res_sked, stage_num++);
+    StageScheduler M(res_sked, stage_num++);
+    StageScheduler W(res_sked, stage_num++);
+
+    // EXECUTE
+    for (int idx=0; idx < inst->numSrcRegs(); idx++) {
+        if (!idx || !inst->isStore()) {
+            X.needs(RegManager, UseDefUnit::ReadSrcReg, idx);
+        }
+    }
+
+    if ( inst->isNonSpeculative() ) {
+        // skip execution of non speculative insts until later
+    } else if ( inst->isMemRef() ) {
+        if ( inst->isLoad() ) {
+            X.needs(AGEN, AGENUnit::GenerateAddr);
+        }
+    } else if (inst->opClass() == IntMultOp || inst->opClass() == IntDivOp) {
+        X.needs(MDU, MultDivUnit::StartMultDiv);
+    } else {
+        X.needs(ExecUnit, ExecutionUnit::ExecuteInst);
+    }
+
+    if (inst->opClass() == IntMultOp || inst->opClass() == IntDivOp) {
+        X.needs(MDU, MultDivUnit::EndMultDiv);
+    }
+
+    // MEMORY
+    if ( inst->isLoad() ) {
+        M.needs(DCache, CacheUnit::InitiateReadData);
+    } else if ( inst->isStore() ) {
+        if ( inst->numSrcRegs() >= 2 ) {
+            M.needs(RegManager, UseDefUnit::ReadSrcReg, 1);
+        }
+        M.needs(AGEN, AGENUnit::GenerateAddr);
+        M.needs(DCache, CacheUnit::InitiateWriteData);
+    }
+
+
+    // WRITEBACK
+    if ( inst->isLoad() ) {
+        W.needs(DCache, CacheUnit::CompleteReadData);
+    } else if ( inst->isStore() ) {
+        W.needs(DCache, CacheUnit::CompleteWriteData);
+    }
+
+    if ( inst->isNonSpeculative() ) {
+        if ( inst->isMemRef() ) fatal("Non-Speculative Memory Instruction");
+        W.needs(ExecUnit, ExecutionUnit::ExecuteInst);
+    }
+
+    W.needs(Grad, GraduationUnit::GraduateInst);
+
+    for (int idx=0; idx < inst->numDestRegs(); idx++) {
+        W.needs(RegManager, UseDefUnit::WriteDestReg, idx);
+    }
+
+    // Insert Back Schedule into our cache of
+    // resource schedules
+    addToSkedCache(inst, res_sked);
+
+    DPRINTF(SkedCache, "Back End Sked Created for instruction: %s (%08p)\n",
+            inst->instName(), inst->getMachInst());
+    res_sked->print();
+
+    return res_sked;
+}
 
 void
 InOrderCPU::regStats()
@@ -520,8 +642,7 @@ InOrderCPU::tick()
     }
     activityRec.advance();
    
-    // Any squashed requests, events, or insts then remove them now
-    cleanUpRemovedReqs();
+    // Remove any squashed events or insts now
     cleanUpRemovedEvents();
     cleanUpRemovedInsts();
 
@@ -1299,14 +1420,6 @@ InOrderCPU::cleanUpRemovedInsts()
         DynInstPtr inst = *removeList.front();
         ThreadID tid = inst->threadNumber;
 
-        // Make Sure Resource Schedule Is Emptied Out
-        ThePipeline::ResSchedule *inst_sched = &inst->resSched;
-        while (!inst_sched->empty()) {
-            ScheduleEntry* sch_entry = inst_sched->top();
-            inst_sched->pop();
-            delete sch_entry;
-        }
-
         // Remove From Register Dependency Map, If Necessary
         archRegDepMap[(*removeList.front())->threadNumber].
             remove((*removeList.front()));
@@ -1314,8 +1427,8 @@ InOrderCPU::cleanUpRemovedInsts()
 
         // Clear if Non-Speculative
         if (inst->staticInst &&
-              inst->seqNum == nonSpecSeqNum[tid] &&
-                nonSpecInstActive[tid] == true) {
+            inst->seqNum == nonSpecSeqNum[tid] &&
+            nonSpecInstActive[tid] == true) {
             nonSpecInstActive[tid] = false;
         }
 
@@ -1327,28 +1440,6 @@ InOrderCPU::cleanUpRemovedInsts()
     removeInstsThisCycle = false;
 }
 
-void
-InOrderCPU::cleanUpRemovedReqs()
-{
-    while (!reqRemoveList.empty()) {
-        ResourceRequest *res_req = reqRemoveList.front();
-
-        DPRINTF(RefCount, "[tid:%i] [sn:%lli]: Removing Request "
-                "[stage_num:%i] [res:%s] [slot:%i] [completed:%i].\n",
-                res_req->inst->threadNumber,
-                res_req->inst->seqNum,
-                res_req->getStageNum(),
-                res_req->res->name(),
-                (res_req->isCompleted()) ?
-                res_req->getComplSlot() : res_req->getSlot(),
-                res_req->isCompleted());
-
-        reqRemoveList.pop();
-
-        delete res_req;
-    }
-}
-
 void
 InOrderCPU::cleanUpRemovedEvents()
 {
diff --git a/src/cpu/inorder/cpu.hh b/src/cpu/inorder/cpu.hh
index 9ff0f12ce5..2fa6bdc593 100644
--- a/src/cpu/inorder/cpu.hh
+++ b/src/cpu/inorder/cpu.hh
@@ -296,6 +296,92 @@ class InOrderCPU : public BaseCPU
     TheISA::TLB *getITBPtr();
     TheISA::TLB *getDTBPtr();
 
+    /** Accessor Type for the SkedCache */
+    typedef uint32_t SkedID;
+
+    /** Cache of instruction schedules, keyed by SkedID (see genSkedID()) */
+    static std::map<SkedID, ThePipeline::RSkedPtr> skedCache;
+
+    typedef std::map<SkedID, ThePipeline::RSkedPtr>::iterator SkedCacheIt;
+
+    /** Initialized to the map's end iterator, signifying an invalid entry
+        on map searches
+    */
+    SkedCacheIt endOfSkedIt;
+
+    ThePipeline::RSkedPtr frontEndSked;
+
+    /** Add a new instruction schedule to the schedule cache */
+    void addToSkedCache(DynInstPtr inst, ThePipeline::RSkedPtr inst_sked)
+    {
+        SkedID sked_id = genSkedID(inst);
+        assert(skedCache.find(sked_id) == skedCache.end());
+        skedCache[sked_id] = inst_sked;
+    }
+
+
+    /** Find an instruction schedule */
+    ThePipeline::RSkedPtr lookupSked(DynInstPtr inst)
+    {
+        SkedID sked_id = genSkedID(inst);
+        SkedCacheIt lookup_it = skedCache.find(sked_id);
+
+        if (lookup_it != endOfSkedIt) {
+            return (*lookup_it).second;
+        } else {
+            return NULL;
+        }
+    }
+
+    static const uint8_t INST_OPCLASS                       = 26;
+    static const uint8_t INST_LOAD                          = 25;
+    static const uint8_t INST_STORE                         = 24;
+    static const uint8_t INST_CONTROL                       = 23;
+    static const uint8_t INST_NONSPEC                       = 22;
+    static const uint8_t INST_DEST_REGS                     = 18;
+    static const uint8_t INST_SRC_REGS                      = 14;
+
+    inline SkedID genSkedID(DynInstPtr inst)
+    {
+        SkedID id = 0;
+        id = (inst->opClass() << INST_OPCLASS) |
+            (inst->isLoad() << INST_LOAD) |
+            (inst->isStore() << INST_STORE) |
+            (inst->isControl() << INST_CONTROL) |
+            (inst->isNonSpeculative() << INST_NONSPEC) |
+            (inst->numDestRegs() << INST_DEST_REGS) |
+            (inst->numSrcRegs() << INST_SRC_REGS);
+        return id;
+    }
+
+    ThePipeline::RSkedPtr createFrontEndSked();
+    ThePipeline::RSkedPtr createBackEndSked(DynInstPtr inst);
+
+    class StageScheduler {
+      private:
+        ThePipeline::RSkedPtr rsked;
+        int stageNum;
+        int nextTaskPriority;
+
+      public:
+        StageScheduler(ThePipeline::RSkedPtr _rsked, int stage_num)
+            : rsked(_rsked), stageNum(stage_num),
+              nextTaskPriority(0)
+        { }
+
+        void needs(int unit, int request) {
+            rsked->push(new ScheduleEntry(
+                            stageNum, nextTaskPriority++, unit, request
+                            ));
+        }
+
+        void needs(int unit, int request, int param) {
+            rsked->push(new ScheduleEntry(
+                            stageNum, nextTaskPriority++, unit, request, param
+                            ));
+        }
+    };
+
   public:
 
     /** Registers statistics. */
@@ -508,10 +594,7 @@ class InOrderCPU : public BaseCPU
     /** Cleans up all instructions on the instruction remove list. */
     void cleanUpRemovedInsts();
 
-    /** Cleans up all instructions on the request remove list. */
-    void cleanUpRemovedReqs();
-
-    /** Cleans up all instructions on the CPU event remove list. */
+    /** Cleans up all events on the CPU event remove list. */
     void cleanUpRemovedEvents();
 
     /** Debug function to print all instructions on the list. */
@@ -541,11 +624,6 @@ class InOrderCPU : public BaseCPU
      */
     std::queue<ListIt> removeList;
 
-    /** List of all the resource requests that will be removed at the end 
-     *  of this cycle.
-     */
-    std::queue<ResourceRequest*> reqRemoveList;
-
     /** List of all the cpu event requests that will be removed at the end of
      *  the current cycle.
      */
diff --git a/src/cpu/inorder/first_stage.cc b/src/cpu/inorder/first_stage.cc
index 71c6ec3e05..b656ca1c75 100644
--- a/src/cpu/inorder/first_stage.cc
+++ b/src/cpu/inorder/first_stage.cc
@@ -181,7 +181,7 @@ FirstStage::processInsts(ThreadID tid)
             inst->setInstListIt(cpu->addInst(inst));
 
             // Create Front-End Resource Schedule For Instruction
-            ThePipeline::createFrontEndSchedule(inst);
+            inst->setFrontSked(cpu->frontEndSked);
         }
 
         int reqs_processed = 0;            
diff --git a/src/cpu/inorder/inorder_dyn_inst.cc b/src/cpu/inorder/inorder_dyn_inst.cc
index 6afe35862f..e9deb76255 100644
--- a/src/cpu/inorder/inorder_dyn_inst.cc
+++ b/src/cpu/inorder/inorder_dyn_inst.cc
@@ -51,7 +51,7 @@ InOrderDynInst::InOrderDynInst(TheISA::ExtMachInst machInst,
                                const TheISA::PCState &instPC,
                                const TheISA::PCState &_predPC,
                                InstSeqNum seq_num, InOrderCPU *cpu)
-  : staticInst(machInst, instPC.instAddr()), traceData(NULL), cpu(cpu)
+    : staticInst(machInst, instPC.instAddr()), traceData(NULL), cpu(cpu)
 {
     seqNum = seq_num;
 
@@ -108,6 +108,8 @@ InOrderDynInst::setMachInst(ExtMachInst machInst)
 void
 InOrderDynInst::initVars()
 {
+    inFrontEnd = true;
+
     fetchMemReq = NULL;
     dataMemReq = NULL;
     splitMemData = NULL;
@@ -123,7 +125,6 @@ InOrderDynInst::initVars()
     readyRegs = 0;
 
     nextStage = 0;
-    nextInstStageNum = 0;
 
     for(int i = 0; i < MaxInstDestRegs; i++)
         instResult[i].val.integer = 0;
@@ -206,8 +207,6 @@ InOrderDynInst::~InOrderDynInst()
 
     --instcount;
 
-    deleteStages();
-
     DPRINTF(InOrderDynInst, "DynInst: [tid:%i] [sn:%lli] Instruction destroyed"
             " (active insts: %i)\n", threadNumber, seqNum, instcount);
 }
@@ -282,29 +281,6 @@ InOrderDynInst::completeAcc(Packet *pkt)
     return this->fault;
 }
 
-InstStage *InOrderDynInst::addStage()
-{
-    this->currentInstStage = new InstStage(this, nextInstStageNum++);
-    instStageList.push_back( this->currentInstStage );
-    return this->currentInstStage;
-}
-
-InstStage *InOrderDynInst::addStage(int stage_num)
-{
-    nextInstStageNum = stage_num;
-    return InOrderDynInst::addStage();
-}
-
-void InOrderDynInst::deleteStages() {
-    std::list<InstStage*>::iterator list_it = instStageList.begin();
-    std::list<InstStage*>::iterator list_end = instStageList.end();
-
-    while(list_it != list_end) {
-        delete *list_it;
-        list_it++;
-    }
-}
-
 Fault
 InOrderDynInst::memAccess()
 {
diff --git a/src/cpu/inorder/inorder_dyn_inst.hh b/src/cpu/inorder/inorder_dyn_inst.hh
index 1c0ee43844..0e6be3da22 100644
--- a/src/cpu/inorder/inorder_dyn_inst.hh
+++ b/src/cpu/inorder/inorder_dyn_inst.hh
@@ -210,9 +210,6 @@ class InOrderDynInst : public FastAlloc, public RefCounted
     /**  Data used for a store for operation. */
     uint64_t storeData;
 
-    /** The resource schedule for this inst */
-    ThePipeline::ResSchedule resSched;
-
     /** List of active resource requests for this instruction */
     std::list<ResourceRequest*> reqList;
 
@@ -304,11 +301,6 @@ class InOrderDynInst : public FastAlloc, public RefCounted
 
     int nextStage;
 
-    /* vars to keep track of InstStage's - used for resource sched defn */
-    int nextInstStageNum;
-    ThePipeline::InstStage *currentInstStage;
-    std::list<ThePipeline::InstStage*> instStageList;
-
   private:
     /** Function to initialize variables in the constructors. */
     void initVars();
@@ -337,9 +329,10 @@ class InOrderDynInst : public FastAlloc, public RefCounted
     ////////////////////////////////////////////////////////////
     std::string instName() { return staticInst->getName(); }
 
-
     void setMachInst(ExtMachInst inst);
 
+    ExtMachInst getMachInst() { return staticInst->machInst; }
+
     /** Sets the StaticInst. */
     void setStaticInst(StaticInstPtr &static_inst);
 
@@ -411,68 +404,88 @@ class InOrderDynInst : public FastAlloc, public RefCounted
     // RESOURCE SCHEDULING
     //
     /////////////////////////////////////////////
+    typedef ThePipeline::RSkedPtr RSkedPtr;
+    bool inFrontEnd;
+
+    RSkedPtr frontSked;
+    RSkedIt frontSked_end;
+
+    RSkedPtr backSked;
+    RSkedIt backSked_end;
+
+    RSkedIt curSkedEntry;
+
+    void setFrontSked(RSkedPtr front_sked)
+    {
+        frontSked = front_sked;
+        frontSked_end.init(frontSked);
+        frontSked_end = frontSked->end();
+        //DPRINTF(InOrderDynInst, "Set FrontSked End to : %x \n" ,
+        //        frontSked_end.getIt()/*, frontSked->end()*/);
+        //assert(frontSked_end == frontSked->end());
+
+        // This initializes instruction to be able
+        // to walk the resource schedule
+        curSkedEntry.init(frontSked);
+        curSkedEntry = frontSked->begin();
+    }
+
+    void setBackSked(RSkedPtr back_sked)
+    {
+        backSked = back_sked;
+        backSked_end.init(backSked);
+        backSked_end = backSked->end();
+    }
 
     void setNextStage(int stage_num) { nextStage = stage_num; }
     int getNextStage() { return nextStage; }
 
-    ThePipeline::InstStage *addStage();
-    ThePipeline::InstStage *addStage(int stage);
-    ThePipeline::InstStage *currentStage() { return currentInstStage; }
-    void deleteStages();
-
-    /** Add A Entry To Reource Schedule */
-    void addToSched(ScheduleEntry* sched_entry)
-    { resSched.push(sched_entry); }
-
-
     /** Print Resource Schedule */
-    /** @NOTE: DEBUG ONLY */
-    void printSched()
+    void printSked()
     {
-        ThePipeline::ResSchedule tempSched;
-        std::cerr << "\tInst. Res. Schedule: ";
-        while (!resSched.empty()) {
-            std::cerr << '\t' << resSched.top()->stageNum << "-"
-                 << resSched.top()->resNum << ", ";
-
-            tempSched.push(resSched.top());
-            resSched.pop();
+        if (frontSked != NULL) {
+            frontSked->print();
         }
 
-        std::cerr << std::endl;
-        resSched = tempSched;
+        if (backSked != NULL) {
+            backSked->print();
+        }
     }
 
     /** Return Next Resource Stage To Be Used */
     int nextResStage()
     {
-        if (resSched.empty())
-            return -1;
-        else
-            return resSched.top()->stageNum;
+        assert((inFrontEnd && curSkedEntry != frontSked_end) ||
+               (!inFrontEnd && curSkedEntry != backSked_end));
+
+        return curSkedEntry->stageNum;
     }
 
 
     /** Return Next Resource To Be Used */
     int nextResource()
     {
-        if (resSched.empty())
-            return -1;
-        else
-            return resSched.top()->resNum;
+        assert((inFrontEnd && curSkedEntry != frontSked_end) ||
+               (!inFrontEnd && curSkedEntry != backSked_end));
+
+        return curSkedEntry->resNum;
     }
 
-    /** Remove & Deallocate a schedule entry */
-    void popSchedEntry()
+    /** Finish a schedule entry and advance; true once the back sked ends */
+    bool finishSkedEntry()
     {
-        if (!resSched.empty()) {
-            ScheduleEntry* sked = resSched.top();
-            resSched.pop();
-            if (sked != 0) {
-                delete sked;
-                
-            }            
+        curSkedEntry++;
+
+        if (inFrontEnd && curSkedEntry == frontSked_end) {
+            assert(backSked != NULL);
+            curSkedEntry.init(backSked);
+            curSkedEntry = backSked->begin();
+            inFrontEnd = false;
+        } else if (!inFrontEnd && curSkedEntry == backSked_end) {
+            return true;
         }
+
+        return false;
     }
 
     /** Release a Resource Request (Currently Unused) */
diff --git a/src/cpu/inorder/pipeline_stage.cc b/src/cpu/inorder/pipeline_stage.cc
index 744ffd4d2b..b267ac00e4 100644
--- a/src/cpu/inorder/pipeline_stage.cc
+++ b/src/cpu/inorder/pipeline_stage.cc
@@ -44,12 +44,17 @@ PipelineStage::PipelineStage(Params *params, unsigned stage_num)
       stageBufferMax(params->stageWidth),
       prevStageValid(false), nextStageValid(false), idle(false)
 {
-    switchedOutBuffer.resize(ThePipeline::MaxThreads);
-    switchedOutValid.resize(ThePipeline::MaxThreads);
-    
     init(params);
 }
 
+PipelineStage::~PipelineStage()
+{
+    for (ThreadID tid = 0; tid < numThreads; tid++) {
+        skidBuffer[tid].clear();
+        stalls[tid].resources.clear();
+    }
+}
+
 void
 PipelineStage::init(Params *params)
 {
@@ -66,6 +71,12 @@ PipelineStage::init(Params *params)
         else
             lastStallingStage[tid] = NumStages - 1;
     }
+
+    if ((InOrderCPU::ThreadModel) params->threadModel ==
+        InOrderCPU::SwitchOnCacheMiss) {
+        switchedOutBuffer.resize(ThePipeline::MaxThreads);
+        switchedOutValid.resize(ThePipeline::MaxThreads);
+    }
 }
 
 
@@ -190,9 +201,6 @@ PipelineStage::takeOverFrom()
 
         stalls[tid].resources.clear();
 
-        while (!insts[tid].empty())
-            insts[tid].pop();
-
         skidBuffer[tid].clear();
     }
     wroteToTimeBuffer = false;
@@ -938,17 +946,24 @@ PipelineStage::processInstSchedule(DynInstPtr inst,int &reqs_processed)
                     "\n", tid, inst->seqNum, cpu->resPool->name(res_num));
 
             ResReqPtr req = cpu->resPool->request(res_num, inst);
+            assert(req->valid);
 
-            if (req->isCompleted()) {
+            bool req_completed = req->isCompleted();
+            bool done_in_pipeline = false;
+            if (req_completed) {
                 DPRINTF(InOrderStage, "[tid:%i]: [sn:%i] request to %s "
                         "completed.\n", tid, inst->seqNum, 
                         cpu->resPool->name(res_num));
 
-                inst->popSchedEntry();
-
                 reqs_processed++;                
 
                 req->stagePasses++;                
+
+                done_in_pipeline = inst->finishSkedEntry();
+                if (done_in_pipeline) {
+                    DPRINTF(InOrderDynInst, "[tid:%i]: [sn:%i] finished "
+                            "in pipeline.\n", tid, inst->seqNum);
+                }
             } else {
                 DPRINTF(InOrderStage, "[tid:%i]: [sn:%i] request to %s failed."
                         "\n", tid, inst->seqNum, cpu->resPool->name(res_num));
@@ -982,23 +997,20 @@ PipelineStage::processInstSchedule(DynInstPtr inst,int &reqs_processed)
                     // Activate Next Ready Thread at end of cycle
                     DPRINTF(ThreadModel, "Attempting to activate next ready "
                             "thread due to cache miss.\n");
-                    cpu->activateNextReadyContext();                                                                                               
+                    cpu->activateNextReadyContext();
                 }
-                
-                // Mark request for deletion
-                // if it isnt currently being used by a resource
-                if (!req->hasSlot()) {                   
-                    DPRINTF(InOrderStage, "[sn:%i] Deleting Request, has no "
-                            "slot in resource.\n", inst->seqNum);
-                    
-                    cpu->reqRemoveList.push(req);
-                } else {
-                    DPRINTF(InOrderStage, "[sn:%i] Ignoring Request Deletion, "
-                            "in resource [slot:%i].\n", inst->seqNum,
-                            req->getSlot());
-                }
-                
-                
+            }
+
+            // If this request no longer needs to take up bandwidth in the
+            // resource, go ahead and free that bandwidth up
+            if (req->doneInResource) {
+                req->freeSlot();
+            }
+
+            // No longer need to process this instruction if the last
+            // request it had wasn't completed or if there is nothing
+            // else for it to do in the pipeline
+            if (done_in_pipeline || !req_completed) {
                 break;
             }
 
diff --git a/src/cpu/inorder/pipeline_stage.hh b/src/cpu/inorder/pipeline_stage.hh
index dfa88de87c..ec70fefc5d 100644
--- a/src/cpu/inorder/pipeline_stage.hh
+++ b/src/cpu/inorder/pipeline_stage.hh
@@ -91,10 +91,7 @@ class PipelineStage
   public:
     PipelineStage(Params *params, unsigned stage_num);
 
-    /** MUST use init() function if this constructor is used. */
-    PipelineStage() { }
-
-    virtual ~PipelineStage() { }
+    virtual ~PipelineStage();
 
     /** PipelineStage initialization. */
     void init(Params *params);
@@ -268,16 +265,6 @@ class PipelineStage
      */
     unsigned instsProcessed;    
 
-    /** Queue of all instructions coming from previous stage on this cycle. */
-    std::queue<DynInstPtr> insts[ThePipeline::MaxThreads];
-
-    /** Queue of instructions that are finished processing and ready to go 
-     *  next stage. This is used to prevent from processing an instrution more 
-     *  than once on any stage. NOTE: It is up to the PROGRAMMER must manage 
-     *  this as a queue
-     */
-    std::list<DynInstPtr> instsToNextStage;
-
     /** Skid buffer between previous stage and this one. */
     std::list<DynInstPtr> skidBuffer[ThePipeline::MaxThreads];
 
diff --git a/src/cpu/inorder/pipeline_traits.cc b/src/cpu/inorder/pipeline_traits.cc
deleted file mode 100644
index a6fad68f71..0000000000
--- a/src/cpu/inorder/pipeline_traits.cc
+++ /dev/null
@@ -1,171 +0,0 @@
-/*
- * Copyright (c) 2007 MIPS Technologies, Inc.
- * All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions are
- * met: redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer;
- * redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in the
- * documentation and/or other materials provided with the distribution;
- * neither the name of the copyright holders nor the names of its
- * contributors may be used to endorse or promote products derived from
- * this software without specific prior written permission.
- *
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- *
- * Authors: Korey Sewell
- *
- */
-
-#include "cpu/inorder/pipeline_traits.hh"
-#include "cpu/inorder/inorder_dyn_inst.hh"
-#include "cpu/inorder/resources/resource_list.hh"
-
-using namespace std;
-
-namespace ThePipeline {
-
-//@TODO: create my own Instruction Schedule Class
-//that operates as a Priority QUEUE
-int getNextPriority(DynInstPtr &inst, int stage_num)
-{
-    int cur_pri = 20;
-
-    /*
-    std::priority_queue<ScheduleEntry*, std::vector<ScheduleEntry*>,
-        entryCompare>::iterator sked_it = inst->resSched.begin();
-
-    std::priority_queue<ScheduleEntry*, std::vector<ScheduleEntry*>,
-        entryCompare>::iterator sked_end = inst->resSched.end();
-
-    while (sked_it != sked_end) {
-
-        if (sked_it.top()->stageNum == stage_num) {
-            cur_pri = sked_it.top()->priority;
-        }
-
-        sked_it++;
-    }
-    */
-
-    return cur_pri;
-}
-
-void createFrontEndSchedule(DynInstPtr &inst)
-{
-    InstStage *F = inst->addStage();
-    InstStage *D = inst->addStage();
-
-    // FETCH
-    F->needs(FetchSeq, FetchSeqUnit::AssignNextPC);
-    F->needs(ICache, FetchUnit::InitiateFetch);
-
-    // DECODE
-    D->needs(ICache, FetchUnit::CompleteFetch);
-    D->needs(Decode, DecodeUnit::DecodeInst);
-    D->needs(BPred, BranchPredictor::PredictBranch);
-    D->needs(FetchSeq, FetchSeqUnit::UpdateTargetPC);
-
-    inst->resSched.init();
-}
-
-bool createBackEndSchedule(DynInstPtr &inst)
-{
-    if (!inst->staticInst) {
-        return false;
-    }
-
-    InstStage *X = inst->addStage();
-    InstStage *M = inst->addStage();
-    InstStage *W = inst->addStage();
-
-    // EXECUTE
-    for (int idx=0; idx < inst->numSrcRegs(); idx++) {
-        if (!idx || !inst->isStore()) {
-            X->needs(RegManager, UseDefUnit::ReadSrcReg, idx);
-        }
-    }
-
-    if ( inst->isNonSpeculative() ) {
-        // skip execution of non speculative insts until later
-    } else if ( inst->isMemRef() ) {
-        if ( inst->isLoad() ) {
-            X->needs(AGEN, AGENUnit::GenerateAddr);
-        }
-    } else if (inst->opClass() == IntMultOp || inst->opClass() == IntDivOp) {
-        X->needs(MDU, MultDivUnit::StartMultDiv);
-    } else {
-        X->needs(ExecUnit, ExecutionUnit::ExecuteInst);
-    }
-
-    if (inst->opClass() == IntMultOp || inst->opClass() == IntDivOp) {
-        X->needs(MDU, MultDivUnit::EndMultDiv);
-    }
-
-    // MEMORY
-    if ( inst->isLoad() ) {
-        M->needs(DCache, CacheUnit::InitiateReadData);
-    } else if ( inst->isStore() ) {
-        if ( inst->numSrcRegs() >= 2 ) {            
-            M->needs(RegManager, UseDefUnit::ReadSrcReg, 1);
-        }        
-        M->needs(AGEN, AGENUnit::GenerateAddr);
-        M->needs(DCache, CacheUnit::InitiateWriteData);
-    }
-
-
-    // WRITEBACK
-    if ( inst->isLoad() ) {
-        W->needs(DCache, CacheUnit::CompleteReadData);
-    } else if ( inst->isStore() ) {
-        W->needs(DCache, CacheUnit::CompleteWriteData);
-    }
-
-    if ( inst->isNonSpeculative() ) {
-        if ( inst->isMemRef() ) fatal("Non-Speculative Memory Instruction");
-        W->needs(ExecUnit, ExecutionUnit::ExecuteInst);
-    }
-
-    for (int idx=0; idx < inst->numDestRegs(); idx++) {
-        W->needs(RegManager, UseDefUnit::WriteDestReg, idx);
-    }
-
-    W->needs(Grad, GraduationUnit::GraduateInst);
-
-    return true;
-}
-
-InstStage::InstStage(DynInstPtr inst, int stage_num)
-{
-    stageNum = stage_num;
-    nextTaskPriority = 0;
-    instSched = &inst->resSched;
-}
-
-void
-InstStage::needs(int unit, int request) {
-    instSched->push( new ScheduleEntry(
-                         stageNum, nextTaskPriority++, unit, request
-                         ));
-}
-
-void
-InstStage::needs(int unit, int request, int param) {
-    instSched->push( new ScheduleEntry(
-                         stageNum, nextTaskPriority++, unit, request, param
-                         ));
-}
-
-};
diff --git a/src/cpu/inorder/pipeline_traits.hh b/src/cpu/inorder/pipeline_traits.hh
index df964e2544..573c0200ab 100644
--- a/src/cpu/inorder/pipeline_traits.hh
+++ b/src/cpu/inorder/pipeline_traits.hh
@@ -51,7 +51,7 @@ class ResourceSked;
 namespace ThePipeline {
     // Pipeline Constants
     const unsigned NumStages = 5;
-    const ThreadID MaxThreads = 8;
+    const ThreadID MaxThreads = 1;
     const unsigned BackEndStartStage = 2;
 
     // List of Resources The Pipeline Uses
@@ -77,23 +77,7 @@ namespace ThePipeline {
     // RESOURCE SCHEDULING
     //////////////////////////
     typedef ResourceSked ResSchedule;
-
-    void createFrontEndSchedule(DynInstPtr &inst);
-    bool createBackEndSchedule(DynInstPtr &inst);
-    int getNextPriority(DynInstPtr &inst, int stage_num);
-
-    class InstStage {
-      private:
-        int nextTaskPriority;
-        int stageNum;
-        ResSchedule *instSched;
-
-      public:
-        InstStage(DynInstPtr inst, int stage_num);
-
-        void needs(int unit, int request);
-        void needs(int unit, int request, int param);
-    };
+    typedef ResourceSked* RSkedPtr;
 };
 
 
diff --git a/src/cpu/inorder/reg_dep_map.cc b/src/cpu/inorder/reg_dep_map.cc
index 98a0727a9a..48820b50e5 100644
--- a/src/cpu/inorder/reg_dep_map.cc
+++ b/src/cpu/inorder/reg_dep_map.cc
@@ -45,6 +45,14 @@ RegDepMap::RegDepMap(int size)
     regMap.resize(size);
 }
 
+RegDepMap::~RegDepMap()
+{
+    for (int i = 0; i < regMap.size(); i++) {
+        regMap[i].clear();
+    }
+    regMap.clear();
+}
+
 string
 RegDepMap::name()
 {
diff --git a/src/cpu/inorder/reg_dep_map.hh b/src/cpu/inorder/reg_dep_map.hh
index fa4fe45f37..047e4d1291 100644
--- a/src/cpu/inorder/reg_dep_map.hh
+++ b/src/cpu/inorder/reg_dep_map.hh
@@ -48,7 +48,7 @@ class RegDepMap
   public:
     RegDepMap(int size = TheISA::TotalNumRegs);
 
-    ~RegDepMap() { }
+    ~RegDepMap();
 
     std::string name();
 
diff --git a/src/cpu/inorder/resource.cc b/src/cpu/inorder/resource.cc
index 51beb5aa0e..24211532ef 100644
--- a/src/cpu/inorder/resource.cc
+++ b/src/cpu/inorder/resource.cc
@@ -31,6 +31,8 @@
 
 #include <vector>
 #include <list>
+
+#include "base/str.hh"
 #include "cpu/inorder/resource.hh"
 #include "cpu/inorder/cpu.hh"
 using namespace std;
@@ -40,22 +42,42 @@ Resource::Resource(string res_name, int res_id, int res_width,
     : resName(res_name), id(res_id),
       width(res_width), latency(res_latency), cpu(_cpu)
 {
+    reqs.resize(width);
+
     // Use to deny a instruction a resource.
-    deniedReq = new ResourceRequest(this, NULL, 0, 0, 0, 0);
+    deniedReq = new ResourceRequest(this);
+    deniedReq->valid = true;
 }
 
 Resource::~Resource()
 {
-    delete [] resourceEvent;
-    delete deniedReq;    
+    if (resourceEvent) {
+        delete [] resourceEvent;
+    }
+
+    delete deniedReq;
+
+    for (int i = 0; i < width; i++) {
+        delete reqs[i];
+    }
 }
 
 
 void
 Resource::init()
 {
-    // Set Up Resource Events to Appropriate Resource BandWidth
-    resourceEvent = new ResourceEvent[width];
+    // If the resource has zero latency, its requests
+    // execute immediately, so there is no need for
+    // events to process them on a later tick
+    if (latency > 0) {
+        resourceEvent = new ResourceEvent[width];
+    } else {
+        resourceEvent = NULL;
+    }
+
+    for (int i = 0; i < width; i++) {
+        reqs[i] = new ResourceRequest(this);
+    }
 
     initSlots();
 }
@@ -66,7 +88,10 @@ Resource::initSlots()
     // Add available slot numbers for resource
     for (int slot_idx = 0; slot_idx < width; slot_idx++) {
         availSlots.push_back(slot_idx);
-        resourceEvent[slot_idx].init(this, slot_idx);
+
+        if (resourceEvent) {
+            resourceEvent[slot_idx].init(this, slot_idx);
+        }
     }
 }
 
@@ -91,42 +116,34 @@ Resource::slotsInUse()
 void
 Resource::freeSlot(int slot_idx)
 {
+    DPRINTF(Resource, "Deallocating [slot:%i].\n",
+            slot_idx);
+
     // Put slot number on this resource's free list
     availSlots.push_back(slot_idx);
 
-    // Erase Request Pointer From Request Map
-    std::map<int, ResReqPtr>::iterator req_it = reqMap.find(slot_idx);
-
-    assert(req_it != reqMap.end());
-    reqMap.erase(req_it);
-
+    // Invalidate Request & Reset its flags
+    reqs[slot_idx]->clearRequest();
 }
 
-// TODO: More efficiently search for instruction's slot within
-// resource.
 int
 Resource::findSlot(DynInstPtr inst)
 {
-    map<int, ResReqPtr>::iterator map_it = reqMap.begin();
-    map<int, ResReqPtr>::iterator map_end = reqMap.end();
-
     int slot_num = -1;
 
-    while (map_it != map_end) {
-        if ((*map_it).second->getInst()->seqNum ==
-            inst->seqNum) {
-            slot_num = (*map_it).second->getSlot();
+    for (int i = 0; i < width; i++) {
+        if (reqs[i]->valid &&
+            reqs[i]->getInst()->seqNum == inst->seqNum) {
+            slot_num = reqs[i]->getSlot();
         }
-        map_it++;
     }
-
     return slot_num;
 }
 
 int
 Resource::getSlot(DynInstPtr inst)
 {
-    int slot_num;
+    int slot_num = -1;
 
     if (slotsAvail() != 0) {
         slot_num = availSlots[0];
@@ -136,24 +153,6 @@ Resource::getSlot(DynInstPtr inst)
         assert(slot_num == *vect_it);
 
         availSlots.erase(vect_it);
-    } else {
-        DPRINTF(Resource, "[tid:%i]: No slots in resource "
-                "available to service [sn:%i].\n", inst->readTid(),
-                inst->seqNum);
-        slot_num = -1;
-
-        map<int, ResReqPtr>::iterator map_it = reqMap.begin();
-        map<int, ResReqPtr>::iterator map_end = reqMap.end();
-
-        while (map_it != map_end) {
-            if ((*map_it).second) {
-                DPRINTF(Resource, "Currently Serving request from: "
-                        "[tid:%i] [sn:%i].\n",
-                        (*map_it).second->getInst()->readTid(),
-                        (*map_it).second->getInst()->seqNum);
-            }
-            map_it++;
-        }
     }
 
     return slot_num;
@@ -183,9 +182,12 @@ Resource::request(DynInstPtr inst)
         slot_num = getSlot(inst);
 
         if (slot_num != -1) {
+            DPRINTF(Resource, "Allocating [slot:%i] for [tid:%i]: [sn:%i]\n",
+                    slot_num, inst->readTid(), inst->seqNum);
+
             // Get Stage # from Schedule Entry
-            stage_num = inst->resSched.top()->stageNum;
-            unsigned cmd = inst->resSched.top()->cmd;
+            stage_num = inst->curSkedEntry->stageNum;
+            unsigned cmd = inst->curSkedEntry->cmd;
 
             // Generate Resource Request
             inst_req = getRequest(inst, stage_num, id, slot_num, cmd);
@@ -200,10 +202,12 @@ Resource::request(DynInstPtr inst)
                         inst->readTid());
             }
 
-            reqMap[slot_num] = inst_req;
-
             try_request = true;
+        } else {
+            DPRINTF(Resource, "No slot available for [tid:%i]: [sn:%i]\n",
+                    inst->readTid(), inst->seqNum);
         }
+
     }
 
     if (try_request) {
@@ -236,32 +240,21 @@ ResReqPtr
 Resource::getRequest(DynInstPtr inst, int stage_num, int res_idx,
                      int slot_num, unsigned cmd)
 {
-    return new ResourceRequest(this, inst, stage_num, id, slot_num,
-                               cmd);
+    reqs[slot_num]->setRequest(inst, stage_num, id, slot_num, cmd);
+    return reqs[slot_num];
 }
 
 ResReqPtr
 Resource::findRequest(DynInstPtr inst)
 {
-    map<int, ResReqPtr>::iterator map_it = reqMap.begin();
-    map<int, ResReqPtr>::iterator map_end = reqMap.end();
-
-    bool found = false;
-    ResReqPtr req = NULL;
-    
-    while (map_it != map_end) {
-        if ((*map_it).second &&
-            (*map_it).second->getInst() == inst) {            
-            req = (*map_it).second;
-            //return (*map_it).second;
-            assert(found == false);
-            found = true;            
+    for (int i = 0; i < width; i++) {
+        if (reqs[i]->valid &&
+            reqs[i]->getInst() == inst) {
+            return reqs[i];
         }
-        map_it++;
     }
 
-    return req;    
-    //return NULL;
+    return NULL;
 }
 
 void
@@ -275,9 +268,9 @@ void
 Resource::execute(int slot_idx)
 {
     DPRINTF(Resource, "[tid:%i]: Executing %s resource.\n",
-            reqMap[slot_idx]->getTid(), name());
-    reqMap[slot_idx]->setCompleted(true);
-    reqMap[slot_idx]->done();
+            reqs[slot_idx]->getTid(), name());
+    reqs[slot_idx]->setCompleted(true);
+    reqs[slot_idx]->done();
 }
 
 void
@@ -293,15 +286,10 @@ void
 Resource::squash(DynInstPtr inst, int stage_num, InstSeqNum squash_seq_num,
                  ThreadID tid)
 {
-    std::vector<int> slot_remove_list;
+    for (int i = 0; i < width; i++) {
+        ResReqPtr req_ptr = reqs[i];
 
-    map<int, ResReqPtr>::iterator map_it = reqMap.begin();
-    map<int, ResReqPtr>::iterator map_end = reqMap.end();
-
-    while (map_it != map_end) {
-        ResReqPtr req_ptr = (*map_it).second;
-
-        if (req_ptr &&
+        if (req_ptr->valid &&
             req_ptr->getInst()->readTid() == tid &&
             req_ptr->getInst()->seqNum > squash_seq_num) {
 
@@ -316,19 +304,8 @@ Resource::squash(DynInstPtr inst, int stage_num, InstSeqNum squash_seq_num,
             if (resourceEvent[req_slot_num].scheduled())
                 unscheduleEvent(req_slot_num);
 
-            // Mark request for later removal
-            cpu->reqRemoveList.push(req_ptr);
-
-            // Mark slot for removal from resource
-            slot_remove_list.push_back(req_ptr->getSlot());
+            freeSlot(req_slot_num);
         }
-
-        map_it++;
-    }
-
-    // Now Delete Slot Entry from Req. Map
-    for (int i = 0; i < slot_remove_list.size(); i++) {
-        freeSlot(slot_remove_list[i]);
     }
 }
 
@@ -350,10 +327,8 @@ Resource::ticks(int num_cycles)
 void
 Resource::scheduleExecution(int slot_num)
 {
-    int res_latency = getLatency(slot_num);
-
-    if (res_latency >= 1) {
-        scheduleEvent(slot_num, res_latency);
+    if (latency >= 1) {
+        scheduleEvent(slot_num, latency);
     } else {
         execute(slot_num);
     }
@@ -363,8 +338,8 @@ void
 Resource::scheduleEvent(int slot_idx, int delay)
 {
     DPRINTF(Resource, "[tid:%i]: Scheduling event for [sn:%i] on tick %i.\n",
-            reqMap[slot_idx]->inst->readTid(),
-            reqMap[slot_idx]->inst->seqNum,
+            reqs[slot_idx]->inst->readTid(),
+            reqs[slot_idx]->inst->seqNum,
             cpu->ticks(delay) + curTick());
     resourceEvent[slot_idx].scheduleEvent(delay);
 }
@@ -401,32 +376,11 @@ int ResourceRequest::resReqID = 0;
 
 int ResourceRequest::maxReqCount = 0;
 
-ResourceRequest::ResourceRequest(Resource *_res, DynInstPtr _inst, 
-                                 int stage_num, int res_idx, int slot_num, 
-                                 unsigned _cmd)
-    : res(_res), inst(_inst), cmd(_cmd),  stageNum(stage_num),
-      resIdx(res_idx), slotNum(slot_num), completed(false),
-      squashed(false), processing(false), memStall(false)
+ResourceRequest::ResourceRequest(Resource *_res)
+    : res(_res), inst(NULL), stagePasses(0), valid(false), doneInResource(false),
+      completed(false), squashed(false), processing(false),
+      memStall(false)
 {
-#ifdef DEBUG
-        reqID = resReqID++;
-        res->cpu->resReqCount++;
-        DPRINTF(ResReqCount, "Res. Req %i created. resReqCount=%i.\n", reqID, 
-                res->cpu->resReqCount);
-
-        if (res->cpu->resReqCount > 100) {
-            fatal("Too many undeleted resource requests. Memory leak?\n");
-        }
-
-        if (res->cpu->resReqCount > maxReqCount) {            
-            maxReqCount = res->cpu->resReqCount;
-        }
-        
-#endif
-
-        stagePasses = 0;
-        complSlotNum = -1;
-        
 }
 
 ResourceRequest::~ResourceRequest()
@@ -436,6 +390,46 @@ ResourceRequest::~ResourceRequest()
         DPRINTF(ResReqCount, "Res. Req %i deleted. resReqCount=%i.\n", reqID, 
                 res->cpu->resReqCount);
 #endif
+        inst = NULL;
+}
+
+std::string
+ResourceRequest::name()
+{
+    return res->name() + "."  + to_string(slotNum);
+}
+
+void
+ResourceRequest::setRequest(DynInstPtr _inst, int stage_num,
+                            int res_idx, int slot_num, unsigned _cmd)
+{
+    valid = true;
+    inst = _inst;
+    stageNum = stage_num;
+    resIdx = res_idx;
+    slotNum = slot_num;
+    cmd = _cmd;
+}
+
+void
+ResourceRequest::clearRequest()
+{
+    valid = false;
+    inst = NULL;
+    stagePasses = 0;
+    completed = false;
+    doneInResource = false;
+    squashed = false;
+    memStall = false;
+}
+
+void
+ResourceRequest::freeSlot()
+{
+    assert(res);
+
+    // Free Slot So Another Instruction Can Use This Resource
+    res->freeSlot(slotNum);
 }
 
 void
@@ -447,25 +441,7 @@ ResourceRequest::done(bool completed)
 
     setCompleted(completed);
 
-    // Used for debugging purposes
-    if (completed) {
-        complSlotNum = slotNum;
-    
-        // Would like to start a convention such as all requests deleted in
-        // resources/pipeline
-        // but a little more complex then it seems...
-        // For now, all COMPLETED requests deleted in resource..
-        //          all FAILED requests deleted in pipeline stage
-        //          *all SQUASHED requests deleted in resource
-        res->cpu->reqRemoveList.push(res->reqMap[slotNum]);
-    }
-    
-    // Free Slot So Another Instruction Can Use This Resource
-    res->freeSlot(slotNum);
-
-    // change slot # to -1, since we check slotNum to see if request is
-    // still valid
-    slotNum = -1;
+    doneInResource = true;
 }
 
 ResourceEvent::ResourceEvent()
@@ -493,7 +469,8 @@ ResourceEvent::process()
 const char *
 ResourceEvent::description()
 {
-    string desc = resource->name() + " event";
+    string desc = resource->name() + "-event:slot[" + to_string(slotIdx)
+        + "]";
 
     return desc.c_str();
 }
diff --git a/src/cpu/inorder/resource.hh b/src/cpu/inorder/resource.hh
index bd9ec48ca3..7899a215fe 100644
--- a/src/cpu/inorder/resource.hh
+++ b/src/cpu/inorder/resource.hh
@@ -221,8 +221,10 @@ class Resource {
     const int latency;
 
   public:
-    /** Mapping of slot-numbers to the resource-request pointers */
-    std::map<int, ResReqPtr> reqMap;
+    /** List of all Requests the Resource is Servicing. Each request
+        represents part of the resource's bandwidth
+    */
+    std::vector<ResReqPtr> reqs;
 
     /** A list of all the available execution slots for this resource.
      *  This correlates with the actual resource event idx.
@@ -245,7 +247,7 @@ class Resource {
 class ResourceEvent : public Event
 {
   public:
-    /** Pointer to the CPU. */
+    /** Pointer to the Resource this is an event for */
     Resource *resource;
 
 
@@ -297,21 +299,29 @@ class ResourceRequest
 
     static int maxReqCount;
     
+    friend class Resource;
+
   public:
-    ResourceRequest(Resource *_res, DynInstPtr _inst, int stage_num,
-                    int res_idx, int slot_num, unsigned _cmd);
+    ResourceRequest(Resource *_res);
     
     virtual ~ResourceRequest();
+
+    std::string name();
     
     int reqID;
 
+    virtual void setRequest(DynInstPtr _inst, int stage_num,
+                    int res_idx, int slot_num, unsigned _cmd);
+
+    virtual void clearRequest();
+
     /** Acknowledge that this is a request is done and remove
      *  from resource.
      */
     void done(bool completed = true);
-
-    short stagePasses;
     
+    void freeSlot();
+
     /////////////////////////////////////////////
     //
     // GET RESOURCE REQUEST IDENTIFICATION / INFO
@@ -319,11 +329,9 @@ class ResourceRequest
     /////////////////////////////////////////////
     /** Get Resource Index */
     int getResIdx() { return resIdx; }
-
        
     /** Get Slot Number */
     int getSlot() { return slotNum; }
-    int getComplSlot() { return complSlotNum; }
     bool hasSlot()  { return slotNum >= 0; }     
 
     /** Get Stage Number */
@@ -353,6 +361,12 @@ class ResourceRequest
     /** Command For This Resource */
     unsigned cmd;
 
+    short stagePasses;
+
+    bool valid;
+
+    bool doneInResource;
+
     ////////////////////////////////////////
     //
     // GET RESOURCE REQUEST STATUS FROM VARIABLES
@@ -380,7 +394,6 @@ class ResourceRequest
     int stageNum;
     int resIdx;
     int slotNum;
-    int complSlotNum;
     
     /** Resource Request Status */
     bool completed;
diff --git a/src/cpu/inorder/resource_pool.cc b/src/cpu/inorder/resource_pool.cc
index a037cbe9ed..4e2f930ab5 100644
--- a/src/cpu/inorder/resource_pool.cc
+++ b/src/cpu/inorder/resource_pool.cc
@@ -55,7 +55,7 @@ ResourcePool::ResourcePool(InOrderCPU *_cpu, ThePipeline::Params *params)
 
     memObjects.push_back(ICache);
     resources.push_back(new FetchUnit("icache_port", ICache,
-                                      stage_width * MaxThreads, 0, _cpu,
+                                      stage_width * 2 + MaxThreads, 0, _cpu,
                                       params));
 
     resources.push_back(new DecodeUnit("Decode-Unit", Decode, 
@@ -68,7 +68,7 @@ ResourcePool::ResourcePool(InOrderCPU *_cpu, ThePipeline::Params *params)
                                        0, _cpu, params));
 
     resources.push_back(new UseDefUnit("RegFile-Manager", RegManager, 
-                                       stage_width * MaxThreads, 0, _cpu,
+                                       stage_width * 3, 0, _cpu,
                                        params));
 
     resources.push_back(new AGENUnit("AGEN-Unit", AGEN, 
@@ -77,20 +77,21 @@ ResourcePool::ResourcePool(InOrderCPU *_cpu, ThePipeline::Params *params)
     resources.push_back(new ExecutionUnit("Execution-Unit", ExecUnit, 
                                           stage_width, 0, _cpu, params));
 
-    resources.push_back(new MultDivUnit("Mult-Div-Unit", MDU, 5, 0, _cpu, 
-                                        params));
+    resources.push_back(new MultDivUnit("Mult-Div-Unit", MDU,
+                                        stage_width * 2, 0, _cpu, params));
 
     memObjects.push_back(DCache);
     resources.push_back(new CacheUnit("dcache_port", DCache, 
-                                      stage_width * MaxThreads, 0, _cpu,
+                                      stage_width * 2 + MaxThreads, 0, _cpu,
                                       params));
 
     resources.push_back(new GraduationUnit("Graduation-Unit", Grad, 
-                                           stage_width * MaxThreads, 0, _cpu,
+                                           stage_width, 0, _cpu,
                                            params));
 
     resources.push_back(new InstBuffer("Fetch-Buffer-T1", FetchBuff2, 4, 
                                        0, _cpu, params));
+
 }
 
 ResourcePool::~ResourcePool()
@@ -122,6 +123,16 @@ ResourcePool::name()
     return cpu->name() + ".ResourcePool";
 }
 
+void
+ResourcePool::print()
+{
+    for (int i=0; i < resources.size(); i++) {
+        DPRINTF(InOrderDynInst, "Res:%i %s\n",
+                i, resources[i]->name());
+    }
+
+}
+
 
 void
 ResourcePool::regStats()
diff --git a/src/cpu/inorder/resource_pool.hh b/src/cpu/inorder/resource_pool.hh
index e8061d3ffa..fde38b4e9a 100644
--- a/src/cpu/inorder/resource_pool.hh
+++ b/src/cpu/inorder/resource_pool.hh
@@ -130,6 +130,8 @@ class ResourcePool {
 
     void init();
 
+    void print();
+
     /** Register Statistics in All Resources */
     void regStats();
 
diff --git a/src/cpu/inorder/resource_sked.cc b/src/cpu/inorder/resource_sked.cc
index 4104e69898..4cf791228f 100644
--- a/src/cpu/inorder/resource_sked.cc
+++ b/src/cpu/inorder/resource_sked.cc
@@ -34,30 +34,30 @@
 
 #include <vector>
 #include <list>
-#include <stdio.h>
+#include <cstdio>
 
 using namespace std;
 using namespace ThePipeline;
 
 ResourceSked::ResourceSked()
 {
-    sked.resize(NumStages);
+    stages.resize(NumStages);
 }
 
 void
 ResourceSked::init()
 {
-    assert(!sked[0].empty());
+    assert(!stages[0].empty());
 
-    curSkedEntry = sked[0].begin();
+    curSkedEntry = stages[0].begin();
 }
 
 int
 ResourceSked::size()
 {
     int total = 0;
-    for (int i = 0; i < sked.size(); i++) {
-        total += sked[i].size();
+    for (int i = 0; i < stages.size(); i++) {
+        total += stages[i].size();
     }
 
     return total;
@@ -69,6 +69,26 @@ ResourceSked::empty()
     return size() == 0;
 }
 
+
+ResourceSked::SkedIt
+ResourceSked::begin()
+{
+    int num_stages = stages.size();
+    for (int i = 0; i < num_stages; i++) {
+        if (stages[i].size() > 0)
+            return stages[i].begin();
+    }
+
+    return stages[num_stages - 1].end();
+}
+
+ResourceSked::SkedIt
+ResourceSked::end()
+{
+    int num_stages = stages.size();
+    return stages[num_stages - 1].end();
+}
+
 ScheduleEntry*
 ResourceSked::top()
 {
@@ -82,18 +102,18 @@ ResourceSked::pop()
 {
     int stage_num = (*curSkedEntry)->stageNum;
 
-    sked[stage_num].erase(curSkedEntry);
+    stages[stage_num].erase(curSkedEntry);
 
-    if (!sked[stage_num].empty()) {
-        curSkedEntry = sked[stage_num].begin();
+    if (!stages[stage_num].empty()) {
+        curSkedEntry = stages[stage_num].begin();
     } else {
         int next_stage = stage_num + 1;
 
         while (next_stage < NumStages) {
-            if (sked[next_stage].empty()) {
+            if (stages[next_stage].empty()) {
                 next_stage++;
             } else {
-                curSkedEntry = sked[next_stage].begin();
+                curSkedEntry = stages[next_stage].begin();
                 break;
             }
         }
@@ -108,7 +128,7 @@ ResourceSked::push(ScheduleEntry* sked_entry)
 
     SkedIt pri_iter = findIterByPriority(sked_entry, stage_num);
 
-    sked[stage_num].insert(pri_iter, sked_entry);
+    stages[stage_num].insert(pri_iter, sked_entry);
 }
 
 void
@@ -122,23 +142,23 @@ ResourceSked::pushBefore(ScheduleEntry* sked_entry, int sked_cmd,
     SkedIt pri_iter = findIterByCommand(sked_entry, stage_num,
                                         sked_cmd, sked_cmd_idx);
 
-    assert(pri_iter != sked[stage_num].end() &&
+    assert(pri_iter != stages[stage_num].end() &&
            "Could not find command to insert in front of.");
 
-    sked[stage_num].insert(pri_iter, sked_entry);
+    stages[stage_num].insert(pri_iter, sked_entry);
 }
 
 ResourceSked::SkedIt
 ResourceSked::findIterByPriority(ScheduleEntry* sked_entry, int stage_num)
 {
-    if (sked[stage_num].empty()) {
-        return sked[stage_num].end();
+    if (stages[stage_num].empty()) {
+        return stages[stage_num].end();
     }
 
     int priority = sked_entry->priority;
 
-    SkedIt sked_it = sked[stage_num].begin();
-    SkedIt sked_end = sked[stage_num].end();
+    SkedIt sked_it = stages[stage_num].begin();
+    SkedIt sked_end = stages[stage_num].end();
 
     while (sked_it != sked_end) {
         if ((*sked_it)->priority > priority)
@@ -154,12 +174,12 @@ ResourceSked::SkedIt
 ResourceSked::findIterByCommand(ScheduleEntry* sked_entry, int stage_num,
                                 int sked_cmd, int sked_cmd_idx)
 {
-    if (sked[stage_num].empty()) {
-        return sked[stage_num].end();
+    if (stages[stage_num].empty()) {
+        return stages[stage_num].end();
     }
 
-    SkedIt sked_it = sked[stage_num].begin();
-    SkedIt sked_end = sked[stage_num].end();
+    SkedIt sked_it = stages[stage_num].begin();
+    SkedIt sked_end = stages[stage_num].end();
 
     while (sked_it != sked_end) {
         if ((*sked_it)->cmd == sked_cmd &&
@@ -175,12 +195,16 @@ ResourceSked::findIterByCommand(ScheduleEntry* sked_entry, int stage_num,
 void
 ResourceSked::print()
 {
-    for (int i = 0; i < sked.size(); i++) {
-        cprintf("Stage %i\n====\n", i);
-        SkedIt sked_it = sked[i].begin();
-        SkedIt sked_end = sked[i].end();
+    for (int i = 0; i < stages.size(); i++) {
+        //ccprintf(cerr, "Stage %i\n====\n", i);
+        SkedIt sked_it = stages[i].begin();
+        SkedIt sked_end = stages[i].end();
         while (sked_it != sked_end) {
-            cprintf("\t res:%i cmd:%i idx:%i\n", (*sked_it)->resNum, (*sked_it)->cmd, (*sked_it)->idx);
+            DPRINTF(SkedCache, "\t stage:%i res:%i cmd:%i idx:%i\n",
+                    (*sked_it)->stageNum,
+                    (*sked_it)->resNum,
+                    (*sked_it)->cmd,
+                    (*sked_it)->idx);
             sked_it++;
         }
     }
diff --git a/src/cpu/inorder/resource_sked.hh b/src/cpu/inorder/resource_sked.hh
index 22e29d7285..bd002e161e 100644
--- a/src/cpu/inorder/resource_sked.hh
+++ b/src/cpu/inorder/resource_sked.hh
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2010 The Regents of The University of Michigan
+ * Copyright (c) 2010-2011 The Regents of The University of Michigan
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
@@ -34,7 +34,19 @@
 
 #include <vector>
 #include <list>
+#include <cstdlib>
 
+/** ScheduleEntry class represents a single function that an instruction
+    wants to do at any pipeline stage. For example, if an instruction
+    needs to be decoded and do a branch prediction all in one stage
+    then each of those tasks would need its own ScheduleEntry.
+
+    Each schedule entry corresponds to some resource that the instruction
+    wants to interact with.
+
+    The file pipeline_traits.cc shows how a typical instruction schedule is
+    made up of these schedule entries.
+*/
 class ScheduleEntry {
   public:
     ScheduleEntry(int stage_num, int _priority, int res_num, int _cmd = 0,
@@ -43,45 +55,225 @@ class ScheduleEntry {
         idx(_idx), priority(_priority)
     { }
 
-    // Stage number to perform this service.
+    /** Stage number to perform this service. */
     int stageNum;
 
-    // Resource ID to access
+    /** Resource ID to access */
     int resNum;
 
-    // See specific resource for meaning
+    /** See specific resource for meaning */
     unsigned cmd;
 
-    // See specific resource for meaning
+    /** See specific resource for meaning */
     unsigned idx;
 
-    // Some Resources May Need Priority
+    /** Some Resources May Need Priority */
     int priority;
 };
 
+/** The ResourceSked maintains the complete schedule
+    for an instruction. That schedule includes what
+    resources an instruction wants to acquire at each
+    pipeline stage and is represented by a collection
+    of ScheduleEntry objects (described above) that
+    must be executed in-order.
+
+    In every pipeline stage, the InOrder model will
+    process all entries on the resource schedule for
+    that stage and then send the instruction to the next
+    stage if and only if the instruction successfully
+    completed each ScheduleEntry.
+*/
 class ResourceSked {
   public:
     typedef std::list<ScheduleEntry*>::iterator SkedIt;
+    typedef std::vector<std::list<ScheduleEntry*> > StageList;
 
     ResourceSked();
 
+    /** Initialize the current entry pointer to
+        pipeline stage 0 and the 1st schedule entry
+    */
     void init();
 
+    /** Goes through the remaining stages on the schedule
+        and sums all the remaining entries left to be
+        processed
+    */
     int size();
+
+    /** Is the schedule empty? */
     bool empty();
+
+    /** Beginning Entry of this schedule */
+    SkedIt begin();
+
+    /** Ending Entry of this schedule */
+    SkedIt end();
+
+    /** What is the next task for this instruction schedule? */
     ScheduleEntry* top();
+
+    /** Top() Task is completed, remove it from schedule */
     void pop();
+
+    /** Add To Schedule based on stage num and priority of
+        Schedule Entry
+    */
     void push(ScheduleEntry* sked_entry);
+
+    /** Add Schedule Entry to be in front of another Entry */
     void pushBefore(ScheduleEntry* sked_entry, int sked_cmd, int sked_cmd_idx);
+
+    /** Print what's left on the instruction schedule */
     void print();
 
-  private:
-    SkedIt curSkedEntry;
-    std::vector<std::list<ScheduleEntry*> > sked;
+    StageList *getStages()
+    {
+        return &stages;
+    }
 
+  private:
+    /** Current Schedule Entry Pointer */
+    SkedIt curSkedEntry;
+
+    /** The Stage-by-Stage Resource Schedule:
+        Resized to Number of Stages in the constructor
+    */
+    StageList stages;
+
+    /** Find a place to insert the instruction using the
+        schedule entry's priority
+    */
     SkedIt findIterByPriority(ScheduleEntry *sked_entry, int stage_num);
+
+    /** Find a place to insert the instruction using a particular command
+        to look for.
+    */
     SkedIt findIterByCommand(ScheduleEntry *sked_entry, int stage_num,
                              int sked_cmd, int sked_cmd_idx = -1);
 };
 
+/** Wrapper class around the SkedIt iterator in the Resource Sked so that
+    we can use ++ operator to automatically go to the next available
+    resource schedule entry but otherwise maintain same functionality
+    as a normal iterator.
+*/
+class RSkedIt
+{
+  public:
+    RSkedIt()
+        : curStage(0), numStages(0)
+    { }
+
+
+    /** init() must be called before the use of any other member
+        in the RSkedIt class.
+    */
+    void init(ResourceSked* rsked)
+    {
+        stages = rsked->getStages();
+        numStages = stages->size();
+    }
+
+    /* Update the encapsulated "myIt" iterator, but only
+       update curStage/curStage_end if the iterator is valid.
+       The iterator could be invalid in the case where
+       someone is saving the end of a list (i.e. std::list->end())
+    */
+    RSkedIt operator=(ResourceSked::SkedIt const &rhs)
+    {
+        myIt = rhs;
+        if (myIt != (*stages)[numStages-1].end()) {
+            curStage = (*myIt)->stageNum;
+            curStage_end = (*stages)[curStage].end();
+        }
+        return *this;
+    }
+
+    /** Increment to the next entry in current stage.
+        If no more entries then find the next stage that has
+        resource schedule entries to complete.
+        If no more stages, then return the end() iterator from
+        the last stage to indicate we are done.
+    */
+    RSkedIt &operator++(int unused)
+    {
+        if (++myIt == curStage_end) {
+            curStage++;
+            while (curStage < numStages) {
+                if ((*stages)[curStage].empty()) {
+                    curStage++;
+                } else {
+                    myIt = (*stages)[curStage].begin();
+                    curStage_end = (*stages)[curStage].end();
+                    return *this;
+                }
+            }
+
+            myIt = (*stages)[numStages - 1].end();
+        }
+
+        return *this;
+    }
+
+    /** The "pointer" operator can be used on a RSkedIt and it
+        will use the encapsulated iterator
+    */
+    ScheduleEntry* operator->()
+    {
+        return *myIt;
+    }
+
+    /** Dereferencing a RSkedIt will access the encapsulated
+        iterator
+    */
+    ScheduleEntry* operator*()
+    {
+        return *myIt;
+    }
+
+    /** Equality for RSkedIt only compares the "myIt" iterators,
+        as the other members are just ancillary
+    */
+    bool operator==(RSkedIt const &rhs)
+    {
+        return this->myIt == rhs.myIt;
+    }
+
+    /** Inequality for RSkedIt only compares the "myIt" iterators,
+        as the other members are just ancillary
+    */
+    bool operator!=(RSkedIt const &rhs)
+    {
+        return this->myIt != rhs.myIt;
+    }
+
+    /* The == and != operator overloads should be sufficient
+       here, but if direct access to the schedule iterator
+       is needed, then this can be used */
+    ResourceSked::SkedIt getIt()
+    {
+        return myIt;
+    }
+
+  private:
+    /** Schedule Iterator that this class is encapsulating */
+    ResourceSked::SkedIt myIt;
+
+    /** Ptr to resource schedule that the 'myIt' iterator
+        belongs to
+    */
+    ResourceSked::StageList *stages;
+
+    /** The end() iterator of the current stage. */
+    ResourceSked::SkedIt curStage_end;
+
+    /** Current Stage that "myIt" refers to. */
+    int curStage;
+
+    /** Number of stages in the "*stages" object. */
+    int numStages;
+};
+
 #endif //__CPU_INORDER_RESOURCE_SKED_HH__
diff --git a/src/cpu/inorder/resources/agen_unit.cc b/src/cpu/inorder/resources/agen_unit.cc
index f1862b94a7..764cd9446e 100644
--- a/src/cpu/inorder/resources/agen_unit.cc
+++ b/src/cpu/inorder/resources/agen_unit.cc
@@ -50,8 +50,8 @@ AGENUnit::regStats()
 void
 AGENUnit::execute(int slot_num)
 {
-    ResourceRequest* agen_req = reqMap[slot_num];
-    DynInstPtr inst = reqMap[slot_num]->inst;
+    ResourceRequest* agen_req = reqs[slot_num];
+    DynInstPtr inst = reqs[slot_num]->inst;
 #if TRACING_ON
     ThreadID tid = inst->readTid();
 #endif
diff --git a/src/cpu/inorder/resources/branch_predictor.cc b/src/cpu/inorder/resources/branch_predictor.cc
index 8ca5a97186..5a22e40ebf 100644
--- a/src/cpu/inorder/resources/branch_predictor.cc
+++ b/src/cpu/inorder/resources/branch_predictor.cc
@@ -66,7 +66,7 @@ BranchPredictor::execute(int slot_num)
 {
     // After this is working, change this to a reinterpret cast
     // for performance considerations
-    ResourceRequest* bpred_req = reqMap[slot_num];
+    ResourceRequest* bpred_req = reqs[slot_num];
     DynInstPtr inst = bpred_req->inst;
     ThreadID tid = inst->readTid();
     int seq_num = inst->seqNum;
diff --git a/src/cpu/inorder/resources/cache_unit.cc b/src/cpu/inorder/resources/cache_unit.cc
index 8b4dd4402c..b17e5b3da4 100644
--- a/src/cpu/inorder/resources/cache_unit.cc
+++ b/src/cpu/inorder/resources/cache_unit.cc
@@ -133,6 +133,10 @@ CacheUnit::getPort(const string &if_name, int idx)
 void
 CacheUnit::init()
 {
+    for (int i = 0; i < width; i++) {
+        reqs[i] = new CacheRequest(this);
+    }
+
     // Currently Used to Model TLB Latency. Eventually
     // Switch to Timing TLB translations.
     resourceEvent = new CacheUnitEvent[width];
@@ -250,20 +254,16 @@ CacheUnit::removeAddrDependency(DynInstPtr inst)
 ResReqPtr
 CacheUnit::findRequest(DynInstPtr inst)
 {
-    map<int, ResReqPtr>::iterator map_it = reqMap.begin();
-    map<int, ResReqPtr>::iterator map_end = reqMap.end();
-
-    while (map_it != map_end) {
+    for (int i = 0; i < width; i++) {
         CacheRequest* cache_req =
-            dynamic_cast<CacheRequest*>((*map_it).second);
+            dynamic_cast<CacheRequest*>(reqs[i]);
         assert(cache_req);
 
-        if (cache_req &&
+        if (cache_req->valid &&
             cache_req->getInst() == inst &&
-            cache_req->instIdx == inst->resSched.top()->idx) {
+            cache_req->instIdx == inst->curSkedEntry->idx) {
             return cache_req;
         }
-        map_it++;
     }
 
     return NULL;
@@ -272,20 +272,16 @@ CacheUnit::findRequest(DynInstPtr inst)
 ResReqPtr
 CacheUnit::findRequest(DynInstPtr inst, int idx)
 {
-    map<int, ResReqPtr>::iterator map_it = reqMap.begin();
-    map<int, ResReqPtr>::iterator map_end = reqMap.end();
-
-    while (map_it != map_end) {
+    for (int i = 0; i < width; i++) {
         CacheRequest* cache_req =
-            dynamic_cast<CacheRequest*>((*map_it).second);
+            dynamic_cast<CacheRequest*>(reqs[i]);
         assert(cache_req);
 
-        if (cache_req &&
+        if (cache_req->valid &&
             cache_req->getInst() == inst &&
             cache_req->instIdx == idx) {
             return cache_req;
         }
-        map_it++;
     }
 
     return NULL;
@@ -296,7 +292,8 @@ ResReqPtr
 CacheUnit::getRequest(DynInstPtr inst, int stage_num, int res_idx,
                      int slot_num, unsigned cmd)
 {
-    ScheduleEntry* sched_entry = inst->resSched.top();
+    ScheduleEntry* sched_entry = *inst->curSkedEntry;
+    CacheRequest* cache_req = dynamic_cast<CacheRequest*>(reqs[slot_num]);
 
     if (!inst->validMemAddr()) {
         panic("Mem. Addr. must be set before requesting cache access\n");
@@ -343,10 +340,10 @@ CacheUnit::getRequest(DynInstPtr inst, int stage_num, int res_idx,
               sched_entry->cmd, name());
     }
 
-    return new CacheRequest(this, inst, stage_num, id, slot_num,
-                            sched_entry->cmd, 0, pkt_cmd,
-                            0/*flags*/, this->cpu->readCpuId(),
-                            inst->resSched.top()->idx);
+    cache_req->setRequest(inst, stage_num, id, slot_num,
+                          sched_entry->cmd, pkt_cmd,
+                          inst->curSkedEntry->idx);
+    return cache_req;
 }
 
 void
@@ -357,17 +354,17 @@ CacheUnit::requestAgain(DynInstPtr inst, bool &service_request)
 
     // Check to see if this instruction is requesting the same command
     // or a different one
-    if (cache_req->cmd != inst->resSched.top()->cmd &&
-        cache_req->instIdx == inst->resSched.top()->idx) {
+    if (cache_req->cmd != inst->curSkedEntry->cmd &&
+        cache_req->instIdx == inst->curSkedEntry->idx) {
         // If different, then update command in the request
-        cache_req->cmd = inst->resSched.top()->cmd;
+        cache_req->cmd = inst->curSkedEntry->cmd;
         DPRINTF(InOrderCachePort,
                 "[tid:%i]: [sn:%i]: Updating the command for this "
                 "instruction\n ", inst->readTid(), inst->seqNum);
 
         service_request = true;
-    } else if (inst->resSched.top()->idx != CacheUnit::InitSecondSplitRead &&
-               inst->resSched.top()->idx != CacheUnit::InitSecondSplitWrite) {        
+    } else if (inst->curSkedEntry->idx != CacheUnit::InitSecondSplitRead &&
+               inst->curSkedEntry->idx != CacheUnit::InitSecondSplitWrite) {
         // If same command, just check to see if memory access was completed
         // but dont try to re-execute
         DPRINTF(InOrderCachePort,
@@ -487,14 +484,20 @@ CacheUnit::read(DynInstPtr inst, Addr addr,
         inst->splitMemData = new uint8_t[size];
         
         if (!inst->splitInstSked) {
+            assert(0 && "Split Requests Not Supported for Now...");
+
             // Schedule Split Read/Complete for Instruction
             // ==============================
             int stage_num = cache_req->getStageNum();
-        
-            int stage_pri = ThePipeline::getNextPriority(inst, stage_num);
+            RSkedPtr inst_sked = (stage_num >= ThePipeline::BackEndStartStage) ?
+                inst->backSked : inst->frontSked;
+
+            // this is just an arbitrarily high priority to ensure that this
+            // gets pushed to the back of the list
+            int stage_pri = 20;
         
             int isplit_cmd = CacheUnit::InitSecondSplitRead;
-            inst->resSched.push(new
+            inst_sked->push(new
                                 ScheduleEntry(stage_num,
                                               stage_pri,
                                               cpu->resPool->getResIdx(DCache),
@@ -502,7 +505,7 @@ CacheUnit::read(DynInstPtr inst, Addr addr,
                                               1));
 
             int csplit_cmd = CacheUnit::CompleteSecondSplitRead;
-            inst->resSched.push(new
+            inst_sked->push(new
                                 ScheduleEntry(stage_num + 1,
                                               1/*stage_pri*/,
                                               cpu->resPool->getResIdx(DCache),
@@ -590,27 +593,33 @@ CacheUnit::write(DynInstPtr inst, uint8_t *data, unsigned size,
         inst->splitInst = true;        
 
         if (!inst->splitInstSked) {
+            assert(0 && "Split Requests Not Supported for Now...");
+
             // Schedule Split Read/Complete for Instruction
             // ==============================
             int stage_num = cache_req->getStageNum();
+            RSkedPtr inst_sked = (stage_num >= ThePipeline::BackEndStartStage) ?
+                inst->backSked : inst->frontSked;
         
-            int stage_pri = ThePipeline::getNextPriority(inst, stage_num);
+            // this is just an arbitrarily high priority to ensure that this
+            // gets pushed to the back of the list
+            int stage_pri = 20;
         
             int isplit_cmd = CacheUnit::InitSecondSplitWrite;
-            inst->resSched.push(new
-                                ScheduleEntry(stage_num,
-                                              stage_pri,
-                                              cpu->resPool->getResIdx(DCache),
-                                              isplit_cmd,
-                                              1));
+            inst_sked->push(new
+                            ScheduleEntry(stage_num,
+                                          stage_pri,
+                                          cpu->resPool->getResIdx(DCache),
+                                          isplit_cmd,
+                                          1));
 
             int csplit_cmd = CacheUnit::CompleteSecondSplitWrite;
-            inst->resSched.push(new
-                                ScheduleEntry(stage_num + 1,
-                                              1/*stage_pri*/,
-                                              cpu->resPool->getResIdx(DCache),
-                                              csplit_cmd,
-                                              1));
+            inst_sked->push(new
+                            ScheduleEntry(stage_num + 1,
+                                          1/*stage_pri*/,
+                                          cpu->resPool->getResIdx(DCache),
+                                          csplit_cmd,
+                                          1));
             inst->splitInstSked = true;
         } else {
             DPRINTF(InOrderCachePort, "[tid:%i] sn:%i] Retrying Split Read "
@@ -639,8 +648,6 @@ CacheUnit::write(DynInstPtr inst, uint8_t *data, unsigned size,
 
     if (inst->fault == NoFault) {
         if (!cache_req->splitAccess) {            
-            // Remove this line since storeData is saved in INST?
-            cache_req->reqData = new uint8_t[size];
             doCacheAccess(inst, write_res);
         } else {            
             doCacheAccess(inst, write_res, cache_req);            
@@ -655,16 +662,19 @@ CacheUnit::write(DynInstPtr inst, uint8_t *data, unsigned size,
 void
 CacheUnit::execute(int slot_num)
 {
-    CacheReqPtr cache_req = dynamic_cast<CacheReqPtr>(reqMap[slot_num]);
+    CacheReqPtr cache_req = dynamic_cast<CacheReqPtr>(reqs[slot_num]);
     assert(cache_req);
 
-    if (cachePortBlocked) {
+    if (cachePortBlocked &&
+        (cache_req->cmd == InitiateReadData ||
+         cache_req->cmd == InitiateWriteData ||
+         cache_req->cmd == InitSecondSplitRead ||
+         cache_req->cmd == InitSecondSplitWrite)) {
         DPRINTF(InOrderCachePort, "Cache Port Blocked. Cannot Access\n");
-        cache_req->setCompleted(false);
+        cache_req->done(false);
         return;
     }
 
-
     DynInstPtr inst = cache_req->inst;
 #if TRACING_ON
     ThreadID tid = inst->readTid();
@@ -681,7 +691,12 @@ CacheUnit::execute(int slot_num)
         acc_type = "read";
 #endif        
       case InitiateWriteData:
-            
+        if (cachePortBlocked) {
+            DPRINTF(InOrderCachePort, "Cache Port Blocked. Cannot Access\n");
+            cache_req->done(false);
+            return;
+        }
+
         DPRINTF(InOrderCachePort,
                 "[tid:%u]: [sn:%i] Initiating data %s access to %s for "
                 "addr. %08p\n", tid, inst->seqNum, acc_type, name(),
@@ -796,7 +811,7 @@ CacheUnit::doCacheAccess(DynInstPtr inst, uint64_t *write_res,
     CacheReqPtr cache_req;
     
     if (split_req == NULL) {        
-        cache_req = dynamic_cast<CacheReqPtr>(reqMap[inst->getCurResSlot()]);
+        cache_req = dynamic_cast<CacheReqPtr>(reqs[inst->getCurResSlot()]);
     } else{
         cache_req = split_req;
     }        
@@ -855,7 +870,7 @@ CacheUnit::doCacheAccess(DynInstPtr inst, uint64_t *write_res,
                     "[tid:%i] [sn:%i] cannot access cache, because port "
                     "is blocked. now waiting to retry request\n", tid, 
                     inst->seqNum);
-            cache_req->setCompleted(false);
+            cache_req->done(false);
             cachePortBlocked = true;
         } else {
             DPRINTF(InOrderCachePort,
@@ -879,7 +894,7 @@ CacheUnit::doCacheAccess(DynInstPtr inst, uint64_t *write_res,
         // Make cache request again since access due to
         // inability to access
         DPRINTF(InOrderStall, "STALL: \n");
-        cache_req->setCompleted(false);
+        cache_req->done(false);
     }
 
 }
@@ -902,7 +917,7 @@ CacheUnit::processCacheCompletion(PacketPtr pkt)
                 cache_pkt->cacheReq->getTid(),
                 cache_pkt->cacheReq->seqNum);
 
-        cache_pkt->cacheReq->done();
+        cache_pkt->cacheReq->freeSlot();
         delete cache_pkt;
 
         cpu->wakeCPU();
@@ -1047,10 +1062,10 @@ CacheUnitEvent::CacheUnitEvent()
 void
 CacheUnitEvent::process()
 {
-    DynInstPtr inst = resource->reqMap[slotIdx]->inst;
-    int stage_num = resource->reqMap[slotIdx]->getStageNum();
+    DynInstPtr inst = resource->reqs[slotIdx]->inst;
+    int stage_num = resource->reqs[slotIdx]->getStageNum();
     ThreadID tid = inst->threadNumber;
-    CacheReqPtr req_ptr = dynamic_cast<CacheReqPtr>(resource->reqMap[slotIdx]);
+    CacheReqPtr req_ptr = dynamic_cast<CacheReqPtr>(resource->reqs[slotIdx]);
 
     DPRINTF(InOrderTLB, "Waking up from TLB Miss caused by [sn:%i].\n",
             inst->seqNum);
@@ -1061,13 +1076,15 @@ CacheUnitEvent::process()
     tlb_res->tlbBlocked[tid] = false;
 
     tlb_res->cpu->pipelineStage[stage_num]->
-        unsetResStall(tlb_res->reqMap[slotIdx], tid);
+        unsetResStall(tlb_res->reqs[slotIdx], tid);
 
     req_ptr->tlbStall = false;
 
     if (req_ptr->isSquashed()) {
-        req_ptr->done();
+        req_ptr->freeSlot();
     }
+
+    tlb_res->cpu->wakeCPU();
 }
 
 void
@@ -1112,15 +1129,10 @@ void
 CacheUnit::squash(DynInstPtr inst, int stage_num,
                   InstSeqNum squash_seq_num, ThreadID tid)
 {
-    vector<int> slot_remove_list;
+    for (int i = 0; i < width; i++) {
+        ResReqPtr req_ptr = reqs[i];
 
-    map<int, ResReqPtr>::iterator map_it = reqMap.begin();
-    map<int, ResReqPtr>::iterator map_end = reqMap.end();
-
-    while (map_it != map_end) {
-        ResReqPtr req_ptr = (*map_it).second;
-
-        if (req_ptr &&
+        if (req_ptr->valid &&
             req_ptr->getInst()->readTid() == tid &&
             req_ptr->getInst()->seqNum > squash_seq_num) {
 
@@ -1133,7 +1145,6 @@ CacheUnit::squash(DynInstPtr inst, int stage_num,
                         "squashed, ignoring squash process.\n",
                         req_ptr->getInst()->readTid(),
                         req_ptr->getInst()->seqNum);
-                map_it++;                
                 continue;                
             }
 
@@ -1147,18 +1158,14 @@ CacheUnit::squash(DynInstPtr inst, int stage_num,
             if (cache_req->tlbStall) {
                 tlbBlocked[tid] = false;
 
-                int stall_stage = reqMap[req_slot_num]->getStageNum();
+                int stall_stage = reqs[req_slot_num]->getStageNum();
 
                 cpu->pipelineStage[stall_stage]->
-                    unsetResStall(reqMap[req_slot_num], tid);
+                    unsetResStall(reqs[req_slot_num], tid);
             }
 
             if (!cache_req->tlbStall && !cache_req->isMemAccPending()) {
-                // Mark request for later removal
-                cpu->reqRemoveList.push(req_ptr);
-
-                // Mark slot for removal from resource
-                slot_remove_list.push_back(req_ptr->getSlot());
+                freeSlot(req_slot_num);
             } else {
                 DPRINTF(InOrderCachePort,
                         "[tid:%i] Request from [sn:%i] squashed, but still "
@@ -1170,14 +1177,8 @@ CacheUnit::squash(DynInstPtr inst, int stage_num,
                         req_ptr->getInst()->readTid(), req_ptr->getInst()->seqNum,
                         req_ptr->getInst()->splitInst);
             }
-
         }
-
-        map_it++;
     }
 
-    // Now Delete Slot Entry from Req. Map
-    for (int i = 0; i < slot_remove_list.size(); i++)
-        freeSlot(slot_remove_list[i]);
 }
 
diff --git a/src/cpu/inorder/resources/cache_unit.hh b/src/cpu/inorder/resources/cache_unit.hh
index afcb36a24a..097b6fa7ab 100644
--- a/src/cpu/inorder/resources/cache_unit.hh
+++ b/src/cpu/inorder/resources/cache_unit.hh
@@ -219,20 +219,18 @@ class CacheUnitEvent : public ResourceEvent {
     void process();
 };
 
+//@todo: Move into CacheUnit Class for private access to "valid" field
 class CacheRequest : public ResourceRequest
 {
   public:
-    CacheRequest(CacheUnit *cres, DynInstPtr inst, int stage_num, int res_idx,
-                 int slot_num, unsigned cmd, int req_size,
-                 MemCmd::Command pkt_cmd, unsigned flags, int cpu_id, int idx)
-        : ResourceRequest(cres, inst, stage_num, res_idx, slot_num, cmd),
-          pktCmd(pkt_cmd), memReq(NULL), reqData(NULL), dataPkt(NULL),
-          retryPkt(NULL), memAccComplete(false), memAccPending(false),
-          tlbStall(false), splitAccess(false), splitAccessNum(-1),
-          split2ndAccess(false), instIdx(idx), fetchBufferFill(false)
+    CacheRequest(CacheUnit *cres)
+        :  ResourceRequest(cres), memReq(NULL), reqData(NULL),
+           dataPkt(NULL), retryPkt(NULL), memAccComplete(false),
+           memAccPending(false), tlbStall(false), splitAccess(false),
+           splitAccessNum(-1), split2ndAccess(false),
+           fetchBufferFill(false)
     { }
 
-
     virtual ~CacheRequest()
     {
         if (reqData && !splitAccess) {
@@ -240,6 +238,37 @@ class CacheRequest : public ResourceRequest
         }
     }
 
+    void setRequest(DynInstPtr _inst, int stage_num, int res_idx, int slot_num,
+                    unsigned _cmd, MemCmd::Command pkt_cmd, int idx)
+    {
+        pktCmd = pkt_cmd;
+        instIdx = idx;
+
+        ResourceRequest::setRequest(_inst, stage_num, res_idx, slot_num, _cmd);
+    }
+
+    void clearRequest()
+    {
+        if (reqData && !splitAccess) {
+            delete [] reqData;
+        }
+
+        memReq = NULL;
+        reqData = NULL;
+        dataPkt = NULL;
+        retryPkt = NULL;
+        memAccComplete = false;
+        memAccPending = false;
+        tlbStall = false;
+        splitAccess = false;
+        splitAccessNum = -1;
+        split2ndAccess = false;
+        instIdx = 0;
+        fetchBufferFill = false;
+
+        ResourceRequest::clearRequest();
+    }
+
     virtual PacketDataPtr getData()
     { return reqData; }
 
diff --git a/src/cpu/inorder/resources/decode_unit.cc b/src/cpu/inorder/resources/decode_unit.cc
index c2f7ae22d8..71d33ab90d 100644
--- a/src/cpu/inorder/resources/decode_unit.cc
+++ b/src/cpu/inorder/resources/decode_unit.cc
@@ -49,21 +49,24 @@ DecodeUnit::DecodeUnit(std::string res_name, int res_id, int res_width,
 void
 DecodeUnit::execute(int slot_num)
 {
-    ResourceRequest* decode_req = reqMap[slot_num];
-    DynInstPtr inst = reqMap[slot_num]->inst;
+    ResourceRequest* decode_req = reqs[slot_num];
+    DynInstPtr inst = reqs[slot_num]->inst;
     ThreadID tid = inst->readTid();
 
     switch (decode_req->cmd)
     {
       case DecodeInst:
         {
-            bool done_sked = ThePipeline::createBackEndSchedule(inst);
+            inst->setBackSked(cpu->createBackEndSked(inst));
 
-            if (done_sked) {
+            if (inst->backSked != NULL) {
                 DPRINTF(InOrderDecode,
                     "[tid:%i]: Setting Destination Register(s) for [sn:%i].\n",
                     tid, inst->seqNum);
                 regDepMap[tid]->insert(inst);
+
+                //inst->printSked();
+
                 decode_req->done();
             } else {
                 DPRINTF(Resource,
diff --git a/src/cpu/inorder/resources/execution_unit.cc b/src/cpu/inorder/resources/execution_unit.cc
index 36bf2a4dcc..b2540cff8b 100644
--- a/src/cpu/inorder/resources/execution_unit.cc
+++ b/src/cpu/inorder/resources/execution_unit.cc
@@ -42,7 +42,7 @@ ExecutionUnit::ExecutionUnit(string res_name, int res_id, int res_width,
                              int res_latency, InOrderCPU *_cpu,
                              ThePipeline::Params *params)
     : Resource(res_name, res_id, res_width, res_latency, _cpu),
-      lastExecuteTick(0), lastControlTick(0)
+      lastExecuteTick(0), lastControlTick(0), serializeTick(0)
 { }
 
 void
@@ -82,27 +82,52 @@ ExecutionUnit::regStats()
 void
 ExecutionUnit::execute(int slot_num)
 {
-    ResourceRequest* exec_req = reqMap[slot_num];
-    DynInstPtr inst = reqMap[slot_num]->inst;
+    ResourceRequest* exec_req = reqs[slot_num];
+    DynInstPtr inst = reqs[slot_num]->inst;
     Fault fault = NoFault;
     int seq_num = inst->seqNum;
+    Tick cur_tick = curTick();
+
+    if (cur_tick == serializeTick) {
+        DPRINTF(InOrderExecute, "Can not execute [tid:%i][sn:%i][PC:%s] %s. "
+                "All instructions are being serialized this cycle\n",
+                inst->readTid(), seq_num, inst->pcState(), inst->instName());
+        exec_req->done(false);
+        return;
+    }
 
-    DPRINTF(InOrderExecute, "[tid:%i] Executing [sn:%i] [PC:%s] %s.\n",
-            inst->readTid(), seq_num, inst->pcState(), inst->instName());
 
     switch (exec_req->cmd)
     {
       case ExecuteInst:
         {
-            if (curTick() != lastExecuteTick) {
-                lastExecuteTick = curTick();
+            if (inst->isNop()) {
+                DPRINTF(InOrderExecute, "[tid:%i] [sn:%i] [PC:%s] Ignoring execution"
+                        " of %s.\n", inst->readTid(), seq_num, inst->pcState(),
+                        inst->instName());
+                inst->setExecuted();
+                exec_req->done();
+                return;
+            } else {
+                DPRINTF(InOrderExecute, "[tid:%i] Executing [sn:%i] [PC:%s] %s.\n",
+                        inst->readTid(), seq_num, inst->pcState(), inst->instName());
             }
 
+            if (cur_tick != lastExecuteTick) {
+                lastExecuteTick = cur_tick;
+            }
 
-            if (inst->isMemRef()) {
-                panic("%s not configured to handle memory ops.\n", resName);
-            } else if (inst->isControl()) {
-                if (lastControlTick == curTick()) {
+            assert(!inst->isMemRef());
+
+            if (inst->isSerializeAfter()) {
+                serializeTick = cur_tick;
+                DPRINTF(InOrderExecute, "Serializing execution after [tid:%i] "
+                        "[sn:%i] [PC:%s] %s.\n", inst->readTid(), seq_num,
+                        inst->pcState(), inst->instName());
+            }
+
+            if (inst->isControl()) {
+                if (lastControlTick == cur_tick) {
                     DPRINTF(InOrderExecute, "Can not Execute More than One Control "
                             "Inst Per Cycle. Blocking Request.\n");
                     exec_req->done(false);
diff --git a/src/cpu/inorder/resources/execution_unit.hh b/src/cpu/inorder/resources/execution_unit.hh
index a6694ddb5e..b03a6655e5 100644
--- a/src/cpu/inorder/resources/execution_unit.hh
+++ b/src/cpu/inorder/resources/execution_unit.hh
@@ -76,6 +76,7 @@ class ExecutionUnit : public Resource {
     Stats::Scalar executions;
     Tick lastExecuteTick;
     Tick lastControlTick;
+    Tick serializeTick;
 };
 
 
diff --git a/src/cpu/inorder/resources/fetch_seq_unit.cc b/src/cpu/inorder/resources/fetch_seq_unit.cc
index 6f84a333dc..d23ea0a826 100644
--- a/src/cpu/inorder/resources/fetch_seq_unit.cc
+++ b/src/cpu/inorder/resources/fetch_seq_unit.cc
@@ -62,13 +62,17 @@ FetchSeqUnit::init()
 {
     resourceEvent = new FetchSeqEvent[width];
 
+    for (int i = 0; i < width; i++) {
+        reqs[i] = new ResourceRequest(this);
+    }
+
     initSlots();
 }
 
 void
 FetchSeqUnit::execute(int slot_num)
 {
-    ResourceRequest* fs_req = reqMap[slot_num];
+    ResourceRequest* fs_req = reqs[slot_num];
     DynInstPtr inst = fs_req->inst;
     ThreadID tid = inst->readTid();
     int stage_num = fs_req->getStageNum();
@@ -96,7 +100,7 @@ FetchSeqUnit::execute(int slot_num)
                 fs_req->done();
             } else {
                 DPRINTF(InOrderStall, "STALL: [tid:%i]: NPC not valid\n", tid);
-                fs_req->setCompleted(false);
+                fs_req->done(false);
             }
         }
         break;
diff --git a/src/cpu/inorder/resources/fetch_unit.cc b/src/cpu/inorder/resources/fetch_unit.cc
index 0e98667086..a0d830ecf1 100644
--- a/src/cpu/inorder/resources/fetch_unit.cc
+++ b/src/cpu/inorder/resources/fetch_unit.cc
@@ -56,6 +56,31 @@ FetchUnit::FetchUnit(string res_name, int res_id, int res_width,
       predecoder(NULL)
 { }
 
+FetchUnit::~FetchUnit()
+{
+    std::list<FetchBlock*>::iterator fetch_it = fetchBuffer.begin();
+    std::list<FetchBlock*>::iterator end_it = fetchBuffer.end();
+    while (fetch_it != end_it) {
+        delete [] (*fetch_it)->block;
+        delete *fetch_it;
+        fetch_it++;
+    }
+    fetchBuffer.clear();
+
+
+    std::list<FetchBlock*>::iterator pend_it = pendingFetch.begin();
+    std::list<FetchBlock*>::iterator pend_end = pendingFetch.end();
+    while (pend_it != pend_end) {
+        if ((*pend_it)->block) {
+            delete [] (*pend_it)->block;
+        }
+
+        delete *pend_it;
+        pend_it++;
+    }
+    pendingFetch.clear();
+}
+
 void
 FetchUnit::createMachInst(std::list<FetchBlock*>::iterator fetch_it,
                           DynInstPtr inst)
@@ -118,33 +143,24 @@ ResReqPtr
 FetchUnit::getRequest(DynInstPtr inst, int stage_num, int res_idx,
                      int slot_num, unsigned cmd)
 {
-    ScheduleEntry* sched_entry = inst->resSched.top();
+    ScheduleEntry* sched_entry = *inst->curSkedEntry;
+    CacheRequest* cache_req = dynamic_cast<CacheRequest*>(reqs[slot_num]);
 
     if (!inst->validMemAddr()) {
         panic("Mem. Addr. must be set before requesting cache access\n");
     }
 
-    MemCmd::Command pkt_cmd;
+    assert(sched_entry->cmd == InitiateFetch);
 
-    switch (sched_entry->cmd)
-    {
-      case InitiateFetch:
-        pkt_cmd = MemCmd::ReadReq;
+    DPRINTF(InOrderCachePort,
+            "[tid:%i]: Fetch request from [sn:%i] for addr %08p\n",
+            inst->readTid(), inst->seqNum, inst->getMemAddr());
 
-        DPRINTF(InOrderCachePort,
-                "[tid:%i]: Fetch request from [sn:%i] for addr %08p\n",
-                inst->readTid(), inst->seqNum, inst->getMemAddr());
-        break;
+    cache_req->setRequest(inst, stage_num, id, slot_num,
+                          sched_entry->cmd, MemCmd::ReadReq,
+                          inst->curSkedEntry->idx);
 
-      default:
-        panic("%i: Unexpected request type (%i) to %s", curTick(),
-              sched_entry->cmd, name());
-    }
-
-    return new CacheRequest(this, inst, stage_num, id, slot_num,
-                            sched_entry->cmd, 0, pkt_cmd,
-                            0/*flags*/, this->cpu->readCpuId(),
-                            inst->resSched.top()->idx);
+    return cache_req;
 }
 
 void
@@ -214,12 +230,12 @@ FetchUnit::markBlockUsed(std::list<FetchBlock*>::iterator block_it)
 void
 FetchUnit::execute(int slot_num)
 {
-    CacheReqPtr cache_req = dynamic_cast<CacheReqPtr>(reqMap[slot_num]);
+    CacheReqPtr cache_req = dynamic_cast<CacheReqPtr>(reqs[slot_num]);
     assert(cache_req);
 
-    if (cachePortBlocked) {
+    if (cachePortBlocked && cache_req->cmd == InitiateFetch) {
         DPRINTF(InOrderCachePort, "Cache Port Blocked. Cannot Access\n");
-        cache_req->setCompleted(false);
+        cache_req->done(false);
         return;
     }
 
@@ -270,7 +286,7 @@ FetchUnit::execute(int slot_num)
             // If not, block this request.
             if (pendingFetch.size() >= fetchBuffSize) {
                 DPRINTF(InOrderCachePort, "No room available in fetch buffer.\n");
-                cache_req->setCompleted(false);
+                cache_req->done(false);
                 return;
             }
 
@@ -337,6 +353,8 @@ FetchUnit::execute(int slot_num)
                     return;
                 }
 
+                delete [] (*repl_it)->block;
+                delete *repl_it;
                 fetchBuffer.erase(repl_it);
             }
 
@@ -414,6 +432,7 @@ FetchUnit::processCacheCompletion(PacketPtr pkt)
                 cache_pkt->cacheReq->seqNum);
 
         cache_pkt->cacheReq->done();
+        cache_pkt->cacheReq->freeSlot();
         delete cache_pkt;
 
         cpu->wakeCPU();
@@ -447,7 +466,7 @@ FetchUnit::processCacheCompletion(PacketPtr pkt)
     short asid = cpu->asid[tid];
 
     assert(!cache_req->isSquashed());
-    assert(inst->resSched.top()->cmd == CompleteFetch);
+    assert(inst->curSkedEntry->cmd == CompleteFetch);
 
     DPRINTF(InOrderCachePort,
             "[tid:%u]: [sn:%i]: Processing fetch access for block %#x\n",
@@ -514,6 +533,10 @@ FetchUnit::squashCacheRequest(CacheReqPtr req_ptr)
                 DPRINTF(InOrderCachePort, "[sn:%i] Removing Pending Fetch "
                         "for block %08p (cnt=%i)\n", inst->seqNum,
                         block_addr, (*block_it)->cnt);
+                if ((*block_it)->block) {
+                    delete [] (*block_it)->block;
+                }
+                delete *block_it;
                 pendingFetch.erase(block_it);
             }
         }
diff --git a/src/cpu/inorder/resources/fetch_unit.hh b/src/cpu/inorder/resources/fetch_unit.hh
index 035f3f4a17..fa133b9eb1 100644
--- a/src/cpu/inorder/resources/fetch_unit.hh
+++ b/src/cpu/inorder/resources/fetch_unit.hh
@@ -55,6 +55,8 @@ class FetchUnit : public CacheUnit
     FetchUnit(std::string res_name, int res_id, int res_width,
               int res_latency, InOrderCPU *_cpu, ThePipeline::Params *params);
 
+    virtual ~FetchUnit();
+
     typedef ThePipeline::DynInstPtr DynInstPtr;
     typedef TheISA::ExtMachInst ExtMachInst;
 
diff --git a/src/cpu/inorder/resources/graduation_unit.cc b/src/cpu/inorder/resources/graduation_unit.cc
index 8ccdaa36a9..edc2fb3ff7 100644
--- a/src/cpu/inorder/resources/graduation_unit.cc
+++ b/src/cpu/inorder/resources/graduation_unit.cc
@@ -37,8 +37,7 @@ GraduationUnit::GraduationUnit(std::string res_name, int res_id, int res_width,
                                int res_latency, InOrderCPU *_cpu,
                                ThePipeline::Params *params)
     : Resource(res_name, res_id, res_width, res_latency, _cpu),
-      lastCycleGrad(0), numCycleGrad(0)
-      
+      lastNonSpecTick(0)
 {
     for (ThreadID tid = 0; tid < ThePipeline::MaxThreads; tid++) {
         nonSpecInstActive[tid] = &cpu->nonSpecInstActive[tid];
@@ -49,23 +48,27 @@ GraduationUnit::GraduationUnit(std::string res_name, int res_id, int res_width,
 void
 GraduationUnit::execute(int slot_num)
 {
-    ResourceRequest* grad_req = reqMap[slot_num];
-    DynInstPtr inst = reqMap[slot_num]->inst;
+    ResourceRequest* grad_req = reqs[slot_num];
+    DynInstPtr inst = reqs[slot_num]->inst;
     ThreadID tid = inst->readTid();
-    int stage_num = inst->resSched.top()->stageNum;
+    int stage_num = inst->curSkedEntry->stageNum;
 
     switch (grad_req->cmd)
     {
       case GraduateInst:
         {
-            // Make sure this is the last thing on the resource schedule
-            assert(inst->resSched.size() == 1);
+            if (lastNonSpecTick == curTick()) {
+                DPRINTF(InOrderGraduation, "Unable to graduate [sn:%i]. Only 1 "
+                        "nonspec inst. per cycle can graduate.\n", inst->seqNum);
+                grad_req->done(false);
+                return;
+            }
 
-             // Handle Any Faults Before Graduating Instruction
+            // Handle Any Faults Before Graduating Instruction
             if (inst->fault != NoFault) {
                 cpu->trap(inst->fault, tid, inst);
                 grad_req->setCompleted(false);
-                 return;
+                return;
             }
 
             DPRINTF(InOrderGraduation,
@@ -80,6 +83,7 @@ GraduationUnit::execute(int slot_num)
                 DPRINTF(InOrderGraduation,
                         "[tid:%i] Non-speculative inst [sn:%i] graduated\n",
                         tid, inst->seqNum);
+                lastNonSpecTick = curTick();
             }
 
             if (inst->traceData) {
diff --git a/src/cpu/inorder/resources/graduation_unit.hh b/src/cpu/inorder/resources/graduation_unit.hh
index aae41993fc..59631bfcbb 100644
--- a/src/cpu/inorder/resources/graduation_unit.hh
+++ b/src/cpu/inorder/resources/graduation_unit.hh
@@ -57,9 +57,7 @@ class GraduationUnit : public Resource {
     void execute(int slot_num);
 
   protected:
-    Tick lastCycleGrad;
-    int numCycleGrad;
-
+    Tick lastNonSpecTick;
     bool *nonSpecInstActive[ThePipeline::MaxThreads];
 
     InstSeqNum *nonSpecSeqNum[ThePipeline::MaxThreads];
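
Note on lastNonSpecTick: the graduation change above refuses to graduate an
instruction in the same tick in which a non-speculative instruction has
already graduated. The sketch below isolates that gate under stated
assumptions: Tick is taken to be a 64-bit unsigned count, and GraduationGate,
mayGraduate() and recordNonSpec() are hypothetical names. Unlike the patch,
the sketch keeps a separate 'seen' flag so that tick 0 is not blocked before
anything has graduated.

```cpp
#include <cassert>
#include <cstdint>

typedef uint64_t Tick;  // assumption: ticks are 64-bit unsigned counts

// Hypothetical gate: once a non-speculative instruction graduates in a tick,
// nothing else may graduate until the tick advances.
class GraduationGate {
  public:
    GraduationGate() : lastNonSpecTick(0), seen(false) {}

    // True if an instruction may graduate at tick 'now'.
    bool mayGraduate(Tick now) const
    {
        return !(seen && lastNonSpecTick == now);
    }

    // Record that a non-speculative instruction graduated at tick 'now'.
    void recordNonSpec(Tick now)
    {
        lastNonSpecTick = now;
        seen = true;
    }

  private:
    Tick lastNonSpecTick;
    bool seen;  // sketch-only flag; the patch initialises the tick to 0 instead
};
```

A caller would check mayGraduate() first and report the request as not done
when it returns false, then call recordNonSpec() after a non-speculative
instruction graduates.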
diff --git a/src/cpu/inorder/resources/inst_buffer.cc b/src/cpu/inorder/resources/inst_buffer.cc
index 18dd26a78f..46f5cce729 100644
--- a/src/cpu/inorder/resources/inst_buffer.cc
+++ b/src/cpu/inorder/resources/inst_buffer.cc
@@ -62,7 +62,7 @@ InstBuffer::regStats()
 void
 InstBuffer::execute(int slot_idx)
 {
-    ResReqPtr ib_req = reqMap[slot_idx];
+    ResReqPtr ib_req = reqs[slot_idx];
     DynInstPtr inst = ib_req->inst;
     ThreadID tid = inst->readTid();
     int stage_num = ib_req->getStageNum();
@@ -99,19 +99,22 @@ InstBuffer::execute(int slot_idx)
                         inst->seqNum, next_stage);
 
                 // Add to schedule: Insert into buffer in next stage
-                int stage_pri = ThePipeline::getNextPriority(inst,
-                                                             next_stage);
+                int stage_pri = 20;
+                RSkedPtr insert_sked = (stage_num >= ThePipeline::BackEndStartStage) ?
+                    inst->backSked : inst->frontSked;
 
-                inst->resSched.push(new ScheduleEntry(next_stage,
+                insert_sked->push(new ScheduleEntry(next_stage,
                                                       stage_pri,
                                                       id,
                                                       InstBuffer::InsertInst));
 
                 // Add to schedule: Remove from buffer in next next (bypass)
                 // stage
-                stage_pri = ThePipeline::getNextPriority(inst, bypass_stage);
+                stage_pri = 20;
+                RSkedPtr bypass_sked = (stage_num >= ThePipeline::BackEndStartStage) ?
+                    inst->backSked : inst->frontSked;
 
-                inst->resSched.push(new ScheduleEntry(bypass_stage,
+                bypass_sked->push(new ScheduleEntry(bypass_stage,
                                                       stage_pri,
                                                       id,
                                                       InstBuffer::RemoveInst));
diff --git a/src/cpu/inorder/resources/inst_buffer_new.cc b/src/cpu/inorder/resources/inst_buffer_new.cc
deleted file mode 100644
index 2e5a9666a5..0000000000
--- a/src/cpu/inorder/resources/inst_buffer_new.cc
+++ /dev/null
@@ -1,158 +0,0 @@
-/*
- * Copyright (c) 2007 MIPS Technologies, Inc.
- * All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions are
- * met: redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer;
- * redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in the
- * documentation and/or other materials provided with the distribution;
- * neither the name of the copyright holders nor the names of its
- * contributors may be used to endorse or promote products derived from
- * this software without specific prior written permission.
- *
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- *
- * Authors: Korey Sewell
- *
- */
-
-#include <vector>
-#include <list>
-
-#include "arch/isa_traits.hh"
-#include "config/the_isa.hh"
-#include "cpu/inorder/pipeline_traits.hh"
-#include "cpu/inorder/resources/inst_buffer.hh"
-#include "cpu/inorder/cpu.hh"
-
-using namespace std;
-using namespace TheISA;
-using namespace ThePipeline;
-
-InstBuffer::InstBuffer(string res_name, int res_id, int res_width,
-                 int res_latency, InOrderCPU *_cpu)
-    : Resource(res_name, res_id, res_width, res_latency, _cpu)
-{ }
-
-ResReqPtr
-InstBuffer::getRequest(DynInstPtr inst, int stage_num, int res_idx,
-                     int slot_num)
-{
-    // After this is working, change this to a reinterpret cast
-    // for performance considerations
-    InstBufferEntry* ib_entry = dynamic_cast<InstBufferEntry*>(inst->resSched.top());
-    assert(ib_entry);
-
-    return new InstBufferRequest(this, inst, stage_num, id, slot_num,
-                             ib_entry->cmd);
-}
-
-void
-InstBuffer::execute(int slot_idx)
-{
-    // After this is working, change this to a reinterpret cast
-    // for performance considerations
-    InstBufferRequest* ib_req = dynamic_cast<InstBufferRequest*>(reqMap[slot_idx]);
-    assert(ib_req);
-
-    DynInstPtr inst = ib_req->inst;
-    ThreadID tid = inst->readTid();
-    int seq_num = inst->seqNum;
-    ib_req->fault = NoFault;
-
-    switch (ib_req->cmd)
-    {
-      case InsertInst:
-        {
-            DPRINTF(Resource, "[tid:%i]: Inserting [sn:%i] into buffer.\n",
-                tid, seq_num);
-            insert(inst);
-            ib_req->done();
-        }
-        break;
-
-      case RemoveInst:
-        {
-            DPRINTF(Resource, "[tid:%i]: Removing [sn:%i] from buffer.\n",
-                    tid, seq_num);
-            remove(inst);
-            ib_req->done();
-        }
-        break;
-
-      default:
-        fatal("Unrecognized command to %s", resName);
-    }
-
-    DPRINTF(Resource, "Buffer now contains %i insts.\n", instList.size());
-}
-
-void
-InstBuffer::insert(DynInstPtr inst)
-{
-    instList.push_back(inst);
-}
-
-void
-InstBuffer::remove(DynInstPtr inst)
-{
-    std::list<DynInstPtr>::iterator list_it = instList.begin();
-    std::list<DynInstPtr>::iterator list_end = instList.end();
-
-    while (list_it != list_end) {
-        if((*list_it) == inst) {
-            instList.erase(list_it);
-            break;
-        }
-        list_it++;
-    }
-}
-
-void
-InstBuffer::pop()
-{ instList.pop_front(); }
-
-ThePipeline::DynInstPtr
-InstBuffer::top()
-{ return instList.front(); }
-
-void
-InstBuffer::squash(InstSeqNum squash_seq_num, ThreadID tid)
-{
-    list<DynInstPtr>::iterator list_it = instList.begin();
-    list<DynInstPtr>::iterator list_end = instList.end();
-    queue<list<DynInstPtr>::iterator> remove_list;
-
-    // Collect All Instructions to be Removed in Remove List
-    while (list_it != list_end) {
-        if((*list_it)->seqNum > squash_seq_num) {
-            DPRINTF(Resource, "[tid:%i]: Squashing [sn:%i] in resource.\n",
-                    tid, (*list_it)->seqNum);
-            (*list_it)->setSquashed();
-            remove_list.push(list_it);
-        }
-
-        list_it++;
-    }
-
-    // Removed Instructions from InstList & Clear Remove List
-    while (!remove_list.empty()) {
-        instList.erase(remove_list.front());
-        remove_list.pop();
-    }
-
-    Resource::squash(squash_seq_num, tid);
-}
diff --git a/src/cpu/inorder/resources/inst_buffer_new.hh b/src/cpu/inorder/resources/inst_buffer_new.hh
deleted file mode 100644
index b1d5a7b09a..0000000000
--- a/src/cpu/inorder/resources/inst_buffer_new.hh
+++ /dev/null
@@ -1,109 +0,0 @@
-/*
- * Copyright (c) 2007 MIPS Technologies, Inc.
- * All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions are
- * met: redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer;
- * redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in the
- * documentation and/or other materials provided with the distribution;
- * neither the name of the copyright holders nor the names of its
- * contributors may be used to endorse or promote products derived from
- * this software without specific prior written permission.
- *
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- *
- * Authors: Korey Sewell
- *
- */
-
-#ifndef __CPU_INORDER_INST_BUFF_UNIT_HH__
-#define __CPU_INORDER_INST_BUFF_UNIT_HH__
-
-#include <vector>
-#include <list>
-#include <string>
-
-#include "cpu/inorder/resource.hh"
-#include "cpu/inorder/inorder_dyn_inst.hh"
-#include "cpu/inorder/pipeline_traits.hh"
-#include "cpu/inorder/cpu.hh"
-
-class InstBuffer : public Resource {
-  public:
-    typedef InOrderDynInst::DynInstPtr DynInstPtr;
-
-  public:
-    enum Command {
-        InsertInst,
-        InsertAddr,
-        RemoveInst,
-        RemoveAddr
-    };
-
-  public:
-    InstBuffer(std::string res_name, int res_id, int res_width,
-              int res_latency, InOrderCPU *_cpu);
-    virtual ~InstBuffer() {}
-
-    virtual ResourceRequest* getRequest(DynInstPtr _inst, int stage_num,
-                                        int res_idx, int slot_num);
-
-    virtual void execute(int slot_num);
-
-    virtual void insert(DynInstPtr inst);
-
-    virtual void remove(DynInstPtr inst);
-
-    virtual void pop();
-
-    virtual DynInstPtr top();
-
-    virtual void squash(InstSeqNum squash_seq_num, ThreadID tid);
-
-  protected:
-    /** List of instructions this resource is currently
-     *  processing.
-     */
-    std::list<DynInstPtr> instList;
-
-    /** @todo: Add Resource Stats Here */
-
-};
-
-struct InstBufferEntry : public ThePipeline::ScheduleEntry {
-    InstBufferEntry(int stage_num, int res_num, InstBuffer::Command _cmd) :
-        ScheduleEntry(stage_num, res_num), cmd(_cmd)
-    { }
-
-    InstBuffer::Command cmd;
-};
-
-class InstBufferRequest : public ResourceRequest {
-  public:
-    typedef InOrderDynInst::DynInstPtr DynInstPtr;
-
-  public:
-    InstBufferRequest(InstBuffer *res, DynInstPtr inst, int stage_num, int res_idx, int slot_num,
-                  InstBuffer::Command _cmd)
-        : ResourceRequest(res, inst, stage_num, res_idx, slot_num),
-          cmd(_cmd)
-    { }
-
-    InstBuffer::Command cmd;
-};
-
-
-#endif //__CPU_INORDER_INST_BUFF_UNIT_HH__
diff --git a/src/cpu/inorder/resources/mult_div_unit.cc b/src/cpu/inorder/resources/mult_div_unit.cc
index 5aa0b0aa18..ad8b2b47bc 100644
--- a/src/cpu/inorder/resources/mult_div_unit.cc
+++ b/src/cpu/inorder/resources/mult_div_unit.cc
@@ -76,6 +76,10 @@ MultDivUnit::init()
     // Set Up Resource Events to Appropriate Resource BandWidth
     resourceEvent = new MDUEvent[width];
 
+    for (int i = 0; i < width; i++) {
+        reqs[i] = new ResourceRequest(this);
+    }
+
     initSlots();
 }
 
@@ -92,7 +96,7 @@ void
 MultDivUnit::freeSlot(int slot_idx)
 {
     DPRINTF(InOrderMDU, "Freeing slot for inst:%i\n | slots-free:%i | "
-            "slots-used:%i\n", reqMap[slot_idx]->getInst()->seqNum,
+            "slots-used:%i\n", reqs[slot_idx]->getInst()->seqNum,
             slotsAvail(), slotsInUse());
     
     Resource::freeSlot(slot_idx);    
@@ -110,9 +114,9 @@ MultDivUnit::requestAgain(DynInstPtr inst, bool &service_request)
 
     // Check to see if this instruction is requesting the same command
     // or a different one
-    if (mult_div_req->cmd != inst->resSched.top()->cmd) {
+    if (mult_div_req->cmd != inst->curSkedEntry->cmd) {
         // If different, then update command in the request
-        mult_div_req->cmd = inst->resSched.top()->cmd;
+        mult_div_req->cmd = inst->curSkedEntry->cmd;
         DPRINTF(InOrderMDU,
                 "[tid:%i]: [sn:%i]: Updating the command for this "
                 "instruction\n", inst->readTid(), inst->seqNum);
@@ -132,7 +136,7 @@ MultDivUnit::getSlot(DynInstPtr inst)
 
     // If we have this instruction's request already then return
     if (slot_num != -1 &&         
-        inst->resSched.top()->cmd == reqMap[slot_num]->cmd)
+        inst->curSkedEntry->cmd == reqs[slot_num]->cmd)
         return slot_num;
     
     unsigned repeat_rate = 0;
@@ -202,8 +206,8 @@ MultDivUnit::getDivOpSize(DynInstPtr inst)
 void 
 MultDivUnit::execute(int slot_num)
 {
-    ResourceRequest* mult_div_req = reqMap[slot_num];
-    DynInstPtr inst = reqMap[slot_num]->inst;
+    ResourceRequest* mult_div_req = reqs[slot_num];
+    DynInstPtr inst = reqs[slot_num]->inst;
  
     switch (mult_div_req->cmd)
     {
@@ -275,8 +279,8 @@ MultDivUnit::execute(int slot_num)
 void 
 MultDivUnit::exeMulDiv(int slot_num)
 {
-    ResourceRequest* mult_div_req = reqMap[slot_num];
-    DynInstPtr inst = reqMap[slot_num]->inst;
+    ResourceRequest* mult_div_req = reqs[slot_num];
+    DynInstPtr inst = reqs[slot_num]->inst;
 
     inst->fault = inst->execute();
 
@@ -310,7 +314,7 @@ MDUEvent::process()
 
     mdu_res->exeMulDiv(slotIdx);
 
-    ResourceRequest* mult_div_req = resource->reqMap[slotIdx];
+    ResourceRequest* mult_div_req = resource->reqs[slotIdx];
 
     mult_div_req->done();    
 }
diff --git a/src/cpu/inorder/resources/tlb_unit.cc b/src/cpu/inorder/resources/tlb_unit.cc
index 59840d15bb..37aec22091 100644
--- a/src/cpu/inorder/resources/tlb_unit.cc
+++ b/src/cpu/inorder/resources/tlb_unit.cc
@@ -72,6 +72,10 @@ TLBUnit::init()
 {
     resourceEvent = new TLBUnitEvent[width];
 
+    for (int i = 0; i < width; i++) {
+        reqs[i] = new TLBUnitRequest(this);
+    }
+
     initSlots();
 }
 
@@ -90,8 +94,9 @@ TLBUnit::getRequest(DynInstPtr _inst, int stage_num,
                             int res_idx, int slot_num,
                             unsigned cmd)
 {
-    return new TLBUnitRequest(this, _inst, stage_num, res_idx, slot_num,
-                          cmd);
+    TLBUnitRequest *tlb_req = dynamic_cast<TLBUnitRequest*>(reqs[slot_num]);
+    tlb_req->setRequest(_inst, stage_num, id, slot_num, cmd);
+    return tlb_req;
 }
 
 void
@@ -99,7 +104,7 @@ TLBUnit::execute(int slot_idx)
 {
     // After this is working, change this to a reinterpret cast
     // for performance considerations
-    TLBUnitRequest* tlb_req = dynamic_cast<TLBUnitRequest*>(reqMap[slot_idx]);
+    TLBUnitRequest* tlb_req = dynamic_cast<TLBUnitRequest*>(reqs[slot_idx]);
     assert(tlb_req != 0x0);
 
     DynInstPtr inst = tlb_req->inst;
@@ -200,8 +205,8 @@ TLBUnitEvent::TLBUnitEvent()
 void
 TLBUnitEvent::process()
 {
-    DynInstPtr inst = resource->reqMap[slotIdx]->inst;
-    int stage_num = resource->reqMap[slotIdx]->getStageNum();
+    DynInstPtr inst = resource->reqs[slotIdx]->inst;
+    int stage_num = resource->reqs[slotIdx]->getStageNum();
     ThreadID tid = inst->threadNumber;
 
     DPRINTF(InOrderTLB, "Waking up from TLB Miss caused by [sn:%i].\n",
@@ -212,31 +217,18 @@ TLBUnitEvent::process()
 
     tlb_res->tlbBlocked[tid] = false;
 
-    tlb_res->cpu->pipelineStage[stage_num]->unsetResStall(tlb_res->reqMap[slotIdx], tid);
-
-    // Effectively NOP the instruction but still allow it
-    // to commit
-    //while (!inst->resSched.empty() &&
-    //   inst->resSched.top()->stageNum != ThePipeline::NumStages - 1) {
-    //inst->resSched.pop();
-    //}
+    tlb_res->cpu->pipelineStage[stage_num]->
+        unsetResStall(tlb_res->reqs[slotIdx], tid);
 }
 
 void
 TLBUnit::squash(DynInstPtr inst, int stage_num,
                    InstSeqNum squash_seq_num, ThreadID tid)
 {
-     //@TODO: Figure out a way to consolidate common parts
-     //       of this squash code
-     std::vector<int> slot_remove_list;
+    for (int i = 0; i < width; i++) {
+        ResReqPtr req_ptr = reqs[i];
 
-     map<int, ResReqPtr>::iterator map_it = reqMap.begin();
-     map<int, ResReqPtr>::iterator map_end = reqMap.end();
-
-     while (map_it != map_end) {
-         ResReqPtr req_ptr = (*map_it).second;
-
-         if (req_ptr &&
+         if (req_ptr->valid &&
              req_ptr->getInst()->readTid() == tid &&
              req_ptr->getInst()->seqNum > squash_seq_num) {
 
@@ -250,26 +242,16 @@ TLBUnit::squash(DynInstPtr inst, int stage_num,
 
              tlbBlocked[tid] = false;
 
-             int stall_stage = reqMap[req_slot_num]->getStageNum();
+             int stall_stage = reqs[req_slot_num]->getStageNum();
 
-             cpu->pipelineStage[stall_stage]->unsetResStall(reqMap[req_slot_num], tid);
+             cpu->pipelineStage[stall_stage]->
+                 unsetResStall(reqs[req_slot_num], tid);
 
              if (resourceEvent[req_slot_num].scheduled())
                  unscheduleEvent(req_slot_num);
 
-             // Mark request for later removal
-             cpu->reqRemoveList.push(req_ptr);
-
-             // Mark slot for removal from resource
-             slot_remove_list.push_back(req_ptr->getSlot());
+             freeSlot(req_slot_num);
          }
-
-         map_it++;
-     }
-
-     // Now Delete Slot Entry from Req. Map
-     for (int i = 0; i < slot_remove_list.size(); i++) {
-         freeSlot(slot_remove_list[i]);
      }
 }
 
diff --git a/src/cpu/inorder/resources/tlb_unit.hh b/src/cpu/inorder/resources/tlb_unit.hh
index eb1bf55f08..904ac3eba2 100644
--- a/src/cpu/inorder/resources/tlb_unit.hh
+++ b/src/cpu/inorder/resources/tlb_unit.hh
@@ -99,9 +99,15 @@ class TLBUnitRequest : public ResourceRequest {
     typedef ThePipeline::DynInstPtr DynInstPtr;
 
   public:
-    TLBUnitRequest(TLBUnit *res, DynInstPtr inst, int stage_num, int res_idx, int slot_num,
-                   unsigned _cmd)
-        : ResourceRequest(res, inst, stage_num, res_idx, slot_num, _cmd)
+    TLBUnitRequest(TLBUnit *res)
+        : ResourceRequest(res), memReq(NULL)
+    {
+    }
+
+    RequestPtr memReq;
+
+    void setRequest(DynInstPtr inst, int stage_num, int res_idx, int slot_num,
+                    unsigned _cmd)
     {
         Addr aligned_addr;
         int req_size;
@@ -131,9 +137,10 @@ class TLBUnitRequest : public ResourceRequest {
                                            inst->readTid());
             memReq = inst->dataMemReq;
         }
+
+        ResourceRequest::setRequest(inst, stage_num, res_idx, slot_num, _cmd);
     }
 
-    RequestPtr memReq;
 };
 
 
diff --git a/src/cpu/inorder/resources/use_def.cc b/src/cpu/inorder/resources/use_def.cc
index 7430115738..19246a30bd 100644
--- a/src/cpu/inorder/resources/use_def.cc
+++ b/src/cpu/inorder/resources/use_def.cc
@@ -88,33 +88,48 @@ UseDefUnit::regStats()
     Resource::regStats();
 }
 
+void
+UseDefUnit::init()
+{
+    // Set Up Resource Events to Appropriate Resource BandWidth
+    if (latency > 0) {
+        resourceEvent = new ResourceEvent[width];
+    } else {
+        resourceEvent = NULL;
+    }
+
+    for (int i = 0; i < width; i++) {
+        reqs[i] = new UseDefRequest(this);
+    }
+
+    initSlots();
+}
+
 ResReqPtr
 UseDefUnit::getRequest(DynInstPtr inst, int stage_num, int res_idx,
                      int slot_num, unsigned cmd)
 {
-    return new UseDefRequest(this, inst, stage_num, id, slot_num, cmd,
-                             inst->resSched.top()->idx);
+    UseDefRequest *ud_req = dynamic_cast<UseDefRequest*>(reqs[slot_num]);
+    ud_req->setRequest(inst, stage_num, id, slot_num, cmd,
+                       inst->curSkedEntry->idx);
+    return ud_req;
 }
 
 
 ResReqPtr
 UseDefUnit::findRequest(DynInstPtr inst)
 {
-    map<int, ResReqPtr>::iterator map_it = reqMap.begin();
-    map<int, ResReqPtr>::iterator map_end = reqMap.end();
-
-    while (map_it != map_end) {
-        UseDefRequest* ud_req = 
-            dynamic_cast<UseDefRequest*>((*map_it).second);
+    for (int i = 0; i < width; i++) {
+        UseDefRequest* ud_req =
+            dynamic_cast<UseDefRequest*>(reqs[i]);
         assert(ud_req);
 
-        if (ud_req &&
+        if (ud_req->valid &&
             ud_req->getInst() == inst &&
-            ud_req->cmd == inst->resSched.top()->cmd &&
-            ud_req->useDefIdx == inst->resSched.top()->idx) {
+            ud_req->cmd == inst->curSkedEntry->cmd &&
+            ud_req->useDefIdx == inst->curSkedEntry->idx) {
             return ud_req;
         }
-        map_it++;
     }
 
     return NULL;
@@ -125,7 +140,7 @@ UseDefUnit::execute(int slot_idx)
 {
     // After this is working, change this to a reinterpret cast
     // for performance considerations
-    UseDefRequest* ud_req = dynamic_cast<UseDefRequest*>(reqMap[slot_idx]);
+    UseDefRequest* ud_req = dynamic_cast<UseDefRequest*>(reqs[slot_idx]);
     assert(ud_req);
 
     DynInstPtr inst = ud_req->inst;
@@ -408,15 +423,10 @@ UseDefUnit::squash(DynInstPtr inst, int stage_num, InstSeqNum squash_seq_num,
     DPRINTF(InOrderUseDef, "[tid:%i]: Updating Due To Squash After [sn:%i].\n",
             tid, squash_seq_num);
 
-    std::vector<int> slot_remove_list;
+    for (int i = 0; i < width; i++) {
+        ResReqPtr req_ptr = reqs[i];
 
-    map<int, ResReqPtr>::iterator map_it = reqMap.begin();
-    map<int, ResReqPtr>::iterator map_end = reqMap.end();
-
-    while (map_it != map_end) {
-        ResReqPtr req_ptr = (*map_it).second;
-
-        if (req_ptr &&
+        if (req_ptr->valid &&
             req_ptr->getInst()->readTid() == tid &&
             req_ptr->getInst()->seqNum > squash_seq_num) {
 
@@ -431,20 +441,9 @@ UseDefUnit::squash(DynInstPtr inst, int stage_num, InstSeqNum squash_seq_num,
                 
                 unscheduleEvent(req_slot_num);
             }
-            
-            // Mark request for later removal
-            cpu->reqRemoveList.push(req_ptr);
 
-            // Mark slot for removal from resource
-            slot_remove_list.push_back(req_ptr->getSlot());
+            freeSlot(req_slot_num);
         }
-
-        map_it++;
-    }
-
-    // Now Delete Slot Entry from Req. Map
-    for (int i = 0; i < slot_remove_list.size(); i++) {
-        freeSlot(slot_remove_list[i]);
     }
 
     if (outReadSeqNum[tid] >= squash_seq_num) {
diff --git a/src/cpu/inorder/resources/use_def.hh b/src/cpu/inorder/resources/use_def.hh
index d2cc55315b..21770cec6d 100644
--- a/src/cpu/inorder/resources/use_def.hh
+++ b/src/cpu/inorder/resources/use_def.hh
@@ -56,6 +56,8 @@ class UseDefUnit : public Resource {
     UseDefUnit(std::string res_name, int res_id, int res_width,
                int res_latency, InOrderCPU *_cpu, ThePipeline::Params *params);
 
+    void init();
+
     ResourceRequest* getRequest(DynInstPtr _inst, int stage_num,
                                         int res_idx, int slot_num,
                                         unsigned cmd);
@@ -96,14 +98,20 @@ class UseDefUnit : public Resource {
         typedef ThePipeline::DynInstPtr DynInstPtr;
 
       public:
-        UseDefRequest(UseDefUnit *res, DynInstPtr inst, int stage_num, 
-                      int res_idx, int slot_num, unsigned cmd, 
-                      int use_def_idx)
-            : ResourceRequest(res, inst, stage_num, res_idx, slot_num, cmd),
-              useDefIdx(use_def_idx)
+        UseDefRequest(UseDefUnit *res)
+            : ResourceRequest(res)
         { }
 
         int useDefIdx;
+
+        void setRequest(DynInstPtr _inst, int stage_num, int res_idx,
+                        int slot_num, unsigned _cmd, int idx)
+        {
+            useDefIdx = idx;
+
+            ResourceRequest::setRequest(_inst, stage_num, res_idx, slot_num,
+                                        _cmd);
+        }
     };
 
   protected:
diff --git a/src/cpu/o3/fetch.hh b/src/cpu/o3/fetch.hh
index 92691720be..647c48a763 100644
--- a/src/cpu/o3/fetch.hh
+++ b/src/cpu/o3/fetch.hh
@@ -136,6 +136,10 @@ class DefaultFetch
             : fetch(_fetch)
         {}
 
+        void
+        markDelayed()
+        {}
+
         void
         finish(Fault fault, RequestPtr req, ThreadContext *tc,
                BaseTLB::Mode mode)
diff --git a/src/cpu/o3/fetch_impl.hh b/src/cpu/o3/fetch_impl.hh
index d0c83d586e..d2cde496e5 100644
--- a/src/cpu/o3/fetch_impl.hh
+++ b/src/cpu/o3/fetch_impl.hh
@@ -604,6 +604,9 @@ DefaultFetch<Impl>::finishTranslation(Fault fault, RequestPtr mem_req)
     ThreadID tid = mem_req->threadId();
     Addr block_PC = mem_req->getVaddr();
 
+    // Wake up CPU if it was idle
+    cpu->wakeCPU();
+
     // If translation was successful, attempt to read the icache block.
     if (fault == NoFault) {
         // Build packet here.
@@ -654,6 +657,9 @@ DefaultFetch<Impl>::finishTranslation(Fault fault, RequestPtr mem_req)
         instruction->fault = fault;
         wroteToTimeBuffer = true;
 
+        DPRINTF(Activity, "Activity this cycle.\n");
+        cpu->activityThisCycle();
+
         fetchStatus[tid] = TrapPending;
 
         DPRINTF(Fetch, "[tid:%i]: Blocked, need to handle the trap.\n", tid);
@@ -1064,6 +1070,8 @@ DefaultFetch<Impl>::fetch(bool &status_change)
     Addr pcOffset = fetchOffset[tid];
     Addr fetchAddr = (thisPC.instAddr() + pcOffset) & BaseCPU::PCMask;
 
+    bool inRom = isRomMicroPC(thisPC.microPC());
+
     // If returning from the delay of a cache miss, then update the status
     // to running, otherwise do the cache access.  Possibly move this up
     // to tick() function.
@@ -1077,7 +1085,7 @@ DefaultFetch<Impl>::fetch(bool &status_change)
         Addr block_PC = icacheBlockAlignPC(fetchAddr);
 
         // Unless buffer already got the block, fetch it from icache.
-        if (!cacheDataValid[tid] || block_PC != cacheDataPC[tid]) {
+        if (!(cacheDataValid[tid] && block_PC == cacheDataPC[tid]) && !inRom) {
             DPRINTF(Fetch, "[tid:%i]: Attempting to translate and read "
                     "instruction, starting at PC %s.\n", tid, thisPC);
 
@@ -1149,7 +1157,7 @@ DefaultFetch<Impl>::fetch(bool &status_change)
            !predictedBranch) {
 
         // If we need to process more memory, do it now.
-        if (!curMacroop && !predecoder.extMachInstReady()) {
+        if (!(curMacroop || inRom) && !predecoder.extMachInstReady()) {
             if (ISA_HAS_DELAY_SLOT && pcOffset == 0) {
                 // Walk past any annulled delay slot instructions.
                 Addr pcAddr = thisPC.instAddr() & BaseCPU::PCMask;
@@ -1175,7 +1183,7 @@ DefaultFetch<Impl>::fetch(bool &status_change)
         // Extract as many instructions and/or microops as we can from
         // the memory we've processed so far.
         do {
-            if (!curMacroop) {
+            if (!(curMacroop || inRom)) {
                 if (predecoder.extMachInstReady()) {
                     ExtMachInst extMachInst;
 
@@ -1196,8 +1204,13 @@ DefaultFetch<Impl>::fetch(bool &status_change)
                     break;
                 }
             }
-            if (curMacroop) {
-                staticInst = curMacroop->fetchMicroop(thisPC.microPC());
+            if (curMacroop || inRom) {
+                if (inRom) {
+                    staticInst = cpu->microcodeRom.fetchMicroop(
+                            thisPC.microPC(), curMacroop);
+                } else {
+                    staticInst = curMacroop->fetchMicroop(thisPC.microPC());
+                }
                 if (staticInst->isLastMicroop()) {
                     curMacroop = NULL;
                     pcOffset = 0;
diff --git a/src/cpu/o3/iew_impl.hh b/src/cpu/o3/iew_impl.hh
index 3f3761ff32..03f73c798f 100644
--- a/src/cpu/o3/iew_impl.hh
+++ b/src/cpu/o3/iew_impl.hh
@@ -1241,12 +1241,33 @@ DefaultIEW<Impl>::executeInsts()
                 // Loads will mark themselves as executed, and their writeback
                 // event adds the instruction to the queue to commit
                 fault = ldstQueue.executeLoad(inst);
+
+                if (inst->isTranslationDelayed() &&
+                    fault == NoFault) {
+                    // A hw page table walk is currently going on; the
+                    // instruction must be deferred.
+                    DPRINTF(IEW, "Execute: Delayed translation, deferring "
+                            "load.\n");
+                    instQueue.deferMemInst(inst);
+                    continue;
+                }
+
                 if (inst->isDataPrefetch() || inst->isInstPrefetch()) {
                     fault = NoFault;
                 }
             } else if (inst->isStore()) {
                 fault = ldstQueue.executeStore(inst);
 
+                if (inst->isTranslationDelayed() &&
+                    fault == NoFault) {
+                    // A hw page table walk is currently going on; the
+                    // instruction must be deferred.
+                    DPRINTF(IEW, "Execute: Delayed translation, deferring "
+                            "store.\n");
+                    instQueue.deferMemInst(inst);
+                    continue;
+                }
+
                 // If the store had a fault then it may not have a mem req
                 if (fault != NoFault || inst->readPredicate() == false ||
                         !inst->isStoreConditional()) {
diff --git a/src/cpu/o3/inst_queue.hh b/src/cpu/o3/inst_queue.hh
index be936e2045..64df357438 100644
--- a/src/cpu/o3/inst_queue.hh
+++ b/src/cpu/o3/inst_queue.hh
@@ -1,4 +1,16 @@
 /*
+ * Copyright (c) 2011 ARM Limited
+ * All rights reserved.
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
  * Copyright (c) 2004-2006 The Regents of The University of Michigan
  * All rights reserved.
  *
@@ -180,6 +192,11 @@ class InstructionQueue
      */
     DynInstPtr getInstToExecute();
 
+    /** Returns a memory instruction that was deferred due to a delayed DTB
+     *  translation if it is now ready to execute.
+     */
+    DynInstPtr getDeferredMemInstToExecute();
+
     /**
      * Records the instruction as the producer of a register without
      * adding it to the rest of the IQ.
@@ -223,6 +240,12 @@ class InstructionQueue
     /** Completes a memory operation. */
     void completeMemInst(DynInstPtr &completed_inst);
 
+    /**
+     * Defers a memory instruction when its DTB translation incurs a hw
+     * page table walk.
+     */
+    void deferMemInst(DynInstPtr &deferred_inst);
+
     /** Indicates an ordering violation between a store and a load. */
     void violation(DynInstPtr &store, DynInstPtr &faulting_load);
 
@@ -284,6 +307,11 @@ class InstructionQueue
     /** List of instructions that are ready to be executed. */
     std::list<DynInstPtr> instsToExecute;
 
+    /** List of instructions waiting for their DTB translation to
+     *  complete (hw page table walk in progress).
+     */
+    std::list<DynInstPtr> deferredMemInsts;
+
     /**
      * Struct for comparing entries to be added to the priority queue.
      * This gives reverse ordering to the instructions in terms of
diff --git a/src/cpu/o3/inst_queue_impl.hh b/src/cpu/o3/inst_queue_impl.hh
index 91cb2f0c82..aa21a0edc8 100644
--- a/src/cpu/o3/inst_queue_impl.hh
+++ b/src/cpu/o3/inst_queue_impl.hh
@@ -1,4 +1,16 @@
 /*
+ * Copyright (c) 2011 ARM Limited
+ * All rights reserved.
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
  * Copyright (c) 2004-2006 The Regents of The University of Michigan
  * All rights reserved.
  *
@@ -397,6 +409,7 @@ InstructionQueue<Impl>::resetState()
     }
     nonSpecInsts.clear();
     listOrder.clear();
+    deferredMemInsts.clear();
 }
 
 template <class Impl>
@@ -733,6 +746,15 @@ InstructionQueue<Impl>::scheduleReadyInsts()
 
     IssueStruct *i2e_info = issueToExecuteQueue->access(0);
 
+    DynInstPtr deferred_mem_inst;
+    int total_deferred_mem_issued = 0;
+    while (total_deferred_mem_issued < totalWidth &&
+           (deferred_mem_inst = getDeferredMemInstToExecute()) != 0) {
+        issueToExecuteQueue->access(0)->size++;
+        instsToExecute.push_back(deferred_mem_inst);
+        total_deferred_mem_issued++;
+    }
+
     // Have iterator to head of the list
     // While I haven't exceeded bandwidth or reached the end of the list,
     // Try to get a FU that can do what this op needs.
@@ -745,7 +767,7 @@ InstructionQueue<Impl>::scheduleReadyInsts()
     ListOrderIt order_end_it = listOrder.end();
     int total_issued = 0;
 
-    while (total_issued < totalWidth &&
+    while (total_issued < (totalWidth - total_deferred_mem_issued) &&
            iewStage->canIssue() &&
            order_it != order_end_it) {
         OpClass op_class = (*order_it).queueType;
@@ -858,7 +880,7 @@ InstructionQueue<Impl>::scheduleReadyInsts()
     iqInstsIssued+= total_issued;
 
     // If we issued any instructions, tell the CPU we had activity.
-    if (total_issued) {
+    if (total_issued || total_deferred_mem_issued) {
         cpu->activityThisCycle();
     } else {
         DPRINTF(IQ, "Not able to schedule any instructions.\n");
@@ -1021,6 +1043,11 @@ void
 InstructionQueue<Impl>::rescheduleMemInst(DynInstPtr &resched_inst)
 {
     DPRINTF(IQ, "Rescheduling mem inst [sn:%lli]\n", resched_inst->seqNum);
+
+    // Reset DTB translation state
+    resched_inst->translationStarted = false;
+    resched_inst->translationCompleted = false;
+
     resched_inst->clearCanIssue();
     memDepUnit[resched_inst->threadNumber].reschedule(resched_inst);
 }
@@ -1049,6 +1076,28 @@ InstructionQueue<Impl>::completeMemInst(DynInstPtr &completed_inst)
     count[tid]--;
 }
 
+template <class Impl>
+void
+InstructionQueue<Impl>::deferMemInst(DynInstPtr &deferred_inst)
+{
+    deferredMemInsts.push_back(deferred_inst);
+}
+
+template <class Impl>
+typename Impl::DynInstPtr
+InstructionQueue<Impl>::getDeferredMemInstToExecute()
+{
+    for (ListIt it = deferredMemInsts.begin(); it != deferredMemInsts.end();
+         ++it) {
+        if ((*it)->translationCompleted) {
+            DynInstPtr ret = *it;
+            deferredMemInsts.erase(it);
+            return ret;
+        }
+    }
+    return NULL;
+}
+
 template <class Impl>
 void
 InstructionQueue<Impl>::violation(DynInstPtr &store,
diff --git a/src/cpu/o3/lsq_unit_impl.hh b/src/cpu/o3/lsq_unit_impl.hh
index dd3604ffe2..b5d3379356 100644
--- a/src/cpu/o3/lsq_unit_impl.hh
+++ b/src/cpu/o3/lsq_unit_impl.hh
@@ -445,12 +445,16 @@ LSQUnit<Impl>::executeLoad(DynInstPtr &inst)
     Fault load_fault = NoFault;
 
     DPRINTF(LSQUnit, "Executing load PC %s, [sn:%lli]\n",
-            inst->pcState(),inst->seqNum);
+            inst->pcState(), inst->seqNum);
 
     assert(!inst->isSquashed());
 
     load_fault = inst->initiateAcc();
 
+    if (inst->isTranslationDelayed() &&
+        load_fault == NoFault)
+        return load_fault;
+
     // If the instruction faulted or predicated false, then we need to send it
     // along to commit without the instruction completing.
     if (load_fault != NoFault || inst->readPredicate() == false) {
@@ -532,6 +536,10 @@ LSQUnit<Impl>::executeStore(DynInstPtr &store_inst)
 
     Fault store_fault = store_inst->initiateAcc();
 
+    if (store_inst->isTranslationDelayed() &&
+        store_fault == NoFault)
+        return store_fault;
+
     if (store_inst->readPredicate() == false)
         store_inst->forwardOldRegs();
 
diff --git a/src/cpu/simple/timing.cc b/src/cpu/simple/timing.cc
index 453699f841..ab1ff91e8b 100644
--- a/src/cpu/simple/timing.cc
+++ b/src/cpu/simple/timing.cc
@@ -752,6 +752,7 @@ TimingSimpleCPU::sendFetch(Fault fault, RequestPtr req, ThreadContext *tc)
     } else {
         delete req;
         // fetch fault: advance directly to next instruction (fault handler)
+        _status = Running;
         advanceInst(fault);
     }
 
@@ -805,12 +806,11 @@ TimingSimpleCPU::completeIfetch(PacketPtr pkt)
     if (curStaticInst && curStaticInst->isMemRef()) {
         // load or store: just send to dcache
         Fault fault = curStaticInst->initiateAcc(this, traceData);
-        if (_status != Running) {
-            // instruction will complete in dcache response callback
-            assert(_status == DcacheWaitResponse ||
-                    _status == DcacheRetry || DTBWaitResponse);
-            assert(fault == NoFault);
-        } else {
+
+        // If we're not running now the instruction will complete in a dcache
+        // response callback or the instruction faulted and has started an
+        // ifetch
+        if (_status == Running) {
             if (fault != NoFault && traceData) {
                 // If there was a fault, we shouldn't trace this instruction.
                 delete traceData;
diff --git a/src/cpu/simple/timing.hh b/src/cpu/simple/timing.hh
index 2b0c8942a3..a7a3eb7c3c 100644
--- a/src/cpu/simple/timing.hh
+++ b/src/cpu/simple/timing.hh
@@ -107,6 +107,13 @@ class TimingSimpleCPU : public BaseSimpleCPU
             : cpu(_cpu)
         {}
 
+        void
+        markDelayed()
+        {
+            assert(cpu->_status == Running);
+            cpu->_status = ITBWaitResponse;
+        }
+
         void
         finish(Fault fault, RequestPtr req, ThreadContext *tc,
                BaseTLB::Mode mode)
diff --git a/src/cpu/translation.hh b/src/cpu/translation.hh
index 7db7c381aa..60953540fa 100644
--- a/src/cpu/translation.hh
+++ b/src/cpu/translation.hh
@@ -1,4 +1,16 @@
 /*
+ * Copyright (c) 2011 ARM Limited
+ * All rights reserved.
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
  * Copyright (c) 2002-2005 The Regents of The University of Michigan
  * Copyright (c) 2009 The University of Edinburgh
  * All rights reserved.
@@ -53,6 +65,7 @@ class WholeTranslationState
     Fault faults[2];
 
   public:
+    bool delay;
     bool isSplit;
     RequestPtr mainReq;
     RequestPtr sreqLow;
@@ -67,8 +80,8 @@ class WholeTranslationState
      */
     WholeTranslationState(RequestPtr _req, uint8_t *_data, uint64_t *_res,
                           BaseTLB::Mode _mode)
-        : outstanding(1), isSplit(false), mainReq(_req), sreqLow(NULL),
-          sreqHigh(NULL), data(_data), res(_res), mode(_mode)
+        : outstanding(1), delay(false), isSplit(false), mainReq(_req),
+          sreqLow(NULL), sreqHigh(NULL), data(_data), res(_res), mode(_mode)
     {
         faults[0] = faults[1] = NoFault;
         assert(mode == BaseTLB::Read || mode == BaseTLB::Write);
@@ -82,8 +95,9 @@ class WholeTranslationState
     WholeTranslationState(RequestPtr _req, RequestPtr _sreqLow,
                           RequestPtr _sreqHigh, uint8_t *_data, uint64_t *_res,
                           BaseTLB::Mode _mode)
-        : outstanding(2), isSplit(true), mainReq(_req), sreqLow(_sreqLow),
-          sreqHigh(_sreqHigh), data(_data), res(_res), mode(_mode)
+        : outstanding(2), delay(false), isSplit(true), mainReq(_req),
+          sreqLow(_sreqLow), sreqHigh(_sreqHigh), data(_data), res(_res),
+          mode(_mode)
     {
         faults[0] = faults[1] = NoFault;
         assert(mode == BaseTLB::Read || mode == BaseTLB::Write);
@@ -220,6 +234,16 @@ class DataTranslation : public BaseTLB::Translation
     {
     }
 
+    /**
+     * Signal the translation state that the translation has been delayed due
+     * to a hw page table walk.  Split requests are transparently handled.
+     */
+    void
+    markDelayed()
+    {
+        state->delay = true;
+    }
+
     /**
      * Finish this part of the translation and indicate that the whole
      * translation is complete if the state says so.
diff --git a/src/dev/SConscript b/src/dev/SConscript
index 7cdea7961f..5243da683c 100644
--- a/src/dev/SConscript
+++ b/src/dev/SConscript
@@ -69,6 +69,7 @@ if env['FULL_SYSTEM']:
     Source('pcidev.cc')
     Source('pktfifo.cc')
     Source('platform.cc')
+    Source('ps2.cc')
     Source('simple_disk.cc')
     Source('sinic.cc')
     Source('terminal.cc')
diff --git a/src/dev/arm/RealView.py b/src/dev/arm/RealView.py
index cdc06e4ef7..ef3f68a884 100644
--- a/src/dev/arm/RealView.py
+++ b/src/dev/arm/RealView.py
@@ -52,6 +52,14 @@ class AmbaDevice(BasicPioDevice):
     abstract = True
     amba_id = Param.UInt32("ID of AMBA device for kernel detection")
 
+class AmbaIntDevice(AmbaDevice):
+    type = 'AmbaIntDevice'
+    abstract = True
+    gic = Param.Gic(Parent.any, "Gic to use for interrupting")
+    int_num = Param.UInt32("Interrupt number that connects to GIC")
+    int_delay = Param.Latency("100ns",
+            "Time between action and interrupt generation by device")
+
 class AmbaDmaDevice(DmaDevice):
     type = 'AmbaDmaDevice'
     abstract = True
@@ -94,16 +102,17 @@ class Sp804(AmbaDevice):
     clock1 = Param.Clock('1MHz', "Clock speed of the input")
     amba_id = 0x00141804
 
-class Pl050(AmbaDevice):
+class Pl050(AmbaIntDevice):
     type = 'Pl050'
-    gic = Param.Gic(Parent.any, "Gic to use for interrupting")
-    int_num = Param.UInt32("Interrupt number that connects to GIC")
-    int_delay = Param.Latency("100ns", "Time between action and interrupt generation by UART")
+    vnc = Param.VncServer(Parent.any, "Vnc server for remote frame buffer display")
+    is_mouse = Param.Bool(False, "True if this interface is a mouse, False if it is a keyboard")
+    int_delay = '1us'
     amba_id = 0x00141050
 
 class Pl111(AmbaDmaDevice):
     type = 'Pl111'
     clock = Param.Clock('24MHz', "Clock speed of the input")
+    vnc   = Param.VncServer(Parent.any, "Vnc server for remote frame buffer display")
     amba_id = 0x00141111
 
 class RealView(Platform):
@@ -121,7 +130,7 @@ class RealViewPBX(RealView):
     timer1 = Sp804(int_num0=37, int_num1=37, pio_addr=0x10012000)
     clcd = Pl111(pio_addr=0x10020000, int_num=55)
     kmi0   = Pl050(pio_addr=0x10006000, int_num=52)
-    kmi1   = Pl050(pio_addr=0x10007000, int_num=53)
+    kmi1   = Pl050(pio_addr=0x10007000, int_num=53, is_mouse=True)
 
     l2x0_fake     = IsaFake(pio_addr=0x1f002000, pio_size=0xfff)
     flash_fake    = IsaFake(pio_addr=0x40000000, pio_size=0x4000000)
@@ -140,7 +149,7 @@ class RealViewPBX(RealView):
     aaci_fake     = AmbaFake(pio_addr=0x10004000)
     mmc_fake      = AmbaFake(pio_addr=0x10005000)
     rtc_fake      = AmbaFake(pio_addr=0x10017000, amba_id=0x41031)
-
+    cf0_fake      = IsaFake(pio_addr=0x18000000, pio_size=0xfff)
 
 
     # Attach I/O devices that are on chip
@@ -175,6 +184,7 @@ class RealViewPBX(RealView):
        self.mmc_fake.pio      = bus.port
        self.rtc_fake.pio      = bus.port
        self.flash_fake.pio    = bus.port
+       self.cf0_fake.pio      = bus.port
 
 # Reference for memory map and interrupt number
 # RealView Emulation Baseboard User Guide (ARM DUI 0143B)
@@ -187,7 +197,7 @@ class RealViewEB(RealView):
     timer1 = Sp804(int_num0=37, int_num1=37, pio_addr=0x10012000)
     clcd   = Pl111(pio_addr=0x10020000, int_num=23)
     kmi0   = Pl050(pio_addr=0x10006000, int_num=20)
-    kmi1   = Pl050(pio_addr=0x10007000, int_num=21)
+    kmi1   = Pl050(pio_addr=0x10007000, int_num=21, is_mouse=True)
 
     l2x0_fake     = IsaFake(pio_addr=0x1f002000, pio_size=0xfff, warn_access="1")
     dmac_fake     = AmbaFake(pio_addr=0x10030000)
diff --git a/src/dev/arm/amba_device.cc b/src/dev/arm/amba_device.cc
index e5d53d6a38..37eb77ae1b 100644
--- a/src/dev/arm/amba_device.cc
+++ b/src/dev/arm/amba_device.cc
@@ -47,11 +47,19 @@
 #include "mem/packet_access.hh"
 
 const uint64_t AmbaVendor = ULL(0xb105f00d00000000);
+
 AmbaDevice::AmbaDevice(const Params *p)
     : BasicPioDevice(p), ambaId(AmbaVendor | p->amba_id)
 {
 }
 
+AmbaIntDevice::AmbaIntDevice(const Params *p)
+    : AmbaDevice(p), intNum(p->int_num), gic(p->gic), intDelay(p->int_delay)
+{
+}
+
+
+
 AmbaDmaDevice::AmbaDmaDevice(const Params *p)
     : DmaDevice(p), ambaId(AmbaVendor | p->amba_id),
       pioAddr(p->pio_addr), pioSize(0),
diff --git a/src/dev/arm/amba_device.hh b/src/dev/arm/amba_device.hh
index 1782fb003f..297a78f827 100644
--- a/src/dev/arm/amba_device.hh
+++ b/src/dev/arm/amba_device.hh
@@ -55,6 +55,7 @@
 #include "mem/packet.hh"
 #include "mem/packet_access.hh"
 #include "params/AmbaDevice.hh"
+#include "params/AmbaIntDevice.hh"
 #include "params/AmbaDmaDevice.hh"
 
 namespace AmbaDev {
@@ -81,6 +82,18 @@ class AmbaDevice : public BasicPioDevice
     AmbaDevice(const Params *p);
 };
 
+class AmbaIntDevice : public AmbaDevice
+{
+  protected:
+    int intNum;
+    Gic *gic;
+    Tick intDelay;
+
+  public:
+    typedef AmbaIntDeviceParams Params;
+    AmbaIntDevice(const Params *p);
+};
+
 class AmbaDmaDevice : public DmaDevice
 {
   protected:
diff --git a/src/dev/arm/kmi.cc b/src/dev/arm/kmi.cc
index 6cd61fd090..adf1439b3b 100644
--- a/src/dev/arm/kmi.cc
+++ b/src/dev/arm/kmi.cc
@@ -37,21 +37,31 @@
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
  * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  *
- * Authors: William Wang
+ * Authors: Ali Saidi
+ *          William Wang
  */
 
 #include "base/trace.hh"
+#include "base/vnc/vncserver.hh"
 #include "dev/arm/amba_device.hh"
 #include "dev/arm/kmi.hh"
+#include "dev/ps2.hh"
 #include "mem/packet.hh"
 #include "mem/packet_access.hh"
 
 Pl050::Pl050(const Params *p)
-    : AmbaDevice(p), control(0x00), status(0x43), kmidata(0x00), clkdiv(0x00),
-      intreg(0x00), intNum(p->int_num), gic(p->gic), intDelay(p->int_delay),
-      intEvent(this)
+    : AmbaIntDevice(p), control(0), status(0x43), clkdiv(0), interrupts(0),
+      rawInterrupts(0), ackNext(false), shiftDown(false), vnc(p->vnc),
+      driverInitialized(false), intEvent(this)
 {
     pioSize = 0xfff;
+
+    if (vnc) {
+        if (!p->is_mouse)
+            vnc->setKeyboard(this);
+        else
+            vnc->setMouse(this);
+    }
 }
 
 Tick
@@ -62,28 +72,39 @@ Pl050::read(PacketPtr pkt)
     Addr daddr = pkt->getAddr() - pioAddr;
     pkt->allocate();
 
-    DPRINTF(Pl050, " read register %#x size=%d\n", daddr, pkt->getSize());
 
-    // use a temporary data since the KMI registers are read/written with
-    // different size operations
-    //
     uint32_t data = 0;
 
     switch (daddr) {
       case kmiCr:
+        DPRINTF(Pl050, "Read Command: %#x\n", (uint32_t)control);
         data = control;
         break;
       case kmiStat:
+        if (rxQueue.empty())
+            status.rxfull = 0;
+        else
+            status.rxfull = 1;
+
+        DPRINTF(Pl050, "Read Status: %#x\n", (uint32_t)status);
         data = status;
         break;
       case kmiData:
-        data = kmidata;
+        if (rxQueue.empty()) {
+            data = 0;
+        } else {
+            data = rxQueue.front();
+            rxQueue.pop_front();
+        }
+        DPRINTF(Pl050, "Read Data: %#x\n", (uint32_t)data);
+        updateIntStatus();
         break;
       case kmiClkDiv:
         data = clkdiv;
         break;
       case kmiISR:
-        data = intreg;
+        data = interrupts;
+        DPRINTF(Pl050, "Read Interrupts: %#x\n", (uint32_t)interrupts);
         break;
       default:
         if (AmbaDev::readId(pkt, ambaId, pioAddr)) {
@@ -123,47 +144,22 @@ Pl050::write(PacketPtr pkt)
 
     Addr daddr = pkt->getAddr() - pioAddr;
 
-    DPRINTF(Pl050, " write register %#x value %#x size=%d\n", daddr,
-            pkt->get<uint8_t>(), pkt->getSize());
-
-    // use a temporary data since the KMI registers are read/written with
-    // different size operations
-    //
-    uint32_t data = 0;
-
-    switch (pkt->getSize()) {
-      case 1:
-        data = pkt->get<uint8_t>();
-        break;
-      case 2:
-        data = pkt->get<uint16_t>();
-        break;
-      case 4:
-        data = pkt->get<uint32_t>();
-        break;
-      default:
-        panic("KMI write size too big?\n");
-        break;
-    }
+    assert(pkt->getSize() == sizeof(uint8_t));
 
 
     switch (daddr) {
       case kmiCr:
-        control = data;
-        break;
-      case kmiStat:
-        panic("Tried to write PL050 register(read only) at offset %#x\n",
-              daddr);
+        DPRINTF(Pl050, "Write Command: %#x\n", (uint32_t)pkt->get<uint8_t>());
+        control = pkt->get<uint8_t>();
+        updateIntStatus();
         break;
       case kmiData:
-        kmidata = data;
+        DPRINTF(Pl050, "Write Data: %#x\n", (uint32_t)pkt->get<uint8_t>());
+        processCommand(pkt->get<uint8_t>());
+        updateIntStatus();
         break;
       case kmiClkDiv:
-        clkdiv = data;
-        break;
-      case kmiISR:
-        panic("Tried to write PL050 register(read only) at offset %#x\n",
-              daddr);
+        clkdiv = pkt->get<uint8_t>();
         break;
       default:
         warn("Tried to write PL050 at offset %#x that doesn't exist\n", daddr);
@@ -173,15 +169,199 @@ Pl050::write(PacketPtr pkt)
     return pioDelay;
 }
 
+void
+Pl050::processCommand(uint8_t byte)
+{
+    using namespace Ps2;
+
+    if (ackNext) {
+        ackNext--;
+        rxQueue.push_back(Ack);
+        updateIntStatus();
+        return;
+    }
+
+    switch (byte) {
+      case Ps2Reset:
+        rxQueue.push_back(Ack);
+        rxQueue.push_back(SelfTestPass);
+        break;
+      case SetResolution:
+      case SetRate:
+      case SetStatusLed:
+      case SetScaling1_1:
+      case SetScaling1_2:
+        rxQueue.push_back(Ack);
+        ackNext = 1;
+        break;
+      case ReadId:
+        rxQueue.push_back(Ack);
+        if (params()->is_mouse)
+            rxQueue.push_back(MouseId);
+        else
+            rxQueue.push_back(KeyboardId);
+        break;
+      case TpReadId:
+        if (!params()->is_mouse)
+            break;
+        // We're not a trackpoint device, this should make the probe go away
+        rxQueue.push_back(Ack);
+        rxQueue.push_back(0);
+        rxQueue.push_back(0);
+        // fall through
+      case Disable:
+      case Enable:
+        rxQueue.push_back(Ack);
+        break;
+      case StatusRequest:
+        rxQueue.push_back(Ack);
+        rxQueue.push_back(0);
+        rxQueue.push_back(2); // default resolution
+        rxQueue.push_back(100); // default sample rate
+        break;
+      case TouchKitId:
+        ackNext = 2;
+        rxQueue.push_back(Ack);
+        rxQueue.push_back(TouchKitId);
+        rxQueue.push_back(1);
+        rxQueue.push_back('A');
+
+        driverInitialized = true;
+        break;
+      default:
+        panic("Unknown byte received: %d\n", byte);
+    }
+
+    updateIntStatus();
+}
+
+
+void
+Pl050::updateIntStatus()
+{
+    if (!rxQueue.empty())
+        rawInterrupts.rx = 1;
+    else
+        rawInterrupts.rx = 0;
+
+    interrupts.tx = rawInterrupts.tx & control.txint_enable;
+    interrupts.rx = rawInterrupts.rx & control.rxint_enable;
+
+    DPRINTF(Pl050, "rawInterrupts=%#x control=%#x interrupts=%#x\n",
+            (uint32_t)rawInterrupts, (uint32_t)control, (uint32_t)interrupts);
+
+    if (interrupts && !intEvent.scheduled())
+        schedule(intEvent, curTick() + intDelay);
+}
+
 void
 Pl050::generateInterrupt()
 {
-    if (intreg.rxintr || intreg.txintr) {
+
+    if (interrupts) {
         gic->sendInt(intNum);
-        DPRINTF(Pl050, " -- Generated\n");
+        DPRINTF(Pl050, "Generated interrupt\n");
     }
 }
 
+void
+Pl050::mouseAt(uint16_t x, uint16_t y, uint8_t buttons)
+{
+    using namespace Ps2;
+
+    // If the driver hasn't initialized the device yet, no need to try and send
+    // it anything. Similarly we can get vnc mouse events orders of magnitude
+    // faster than m5 can process them. Only queue up two sets of mouse
+    // movements and don't add more until those are processed.
+    if (!driverInitialized || rxQueue.size() > 10)
+        return;
+
+    // We shouldn't be here unless a vnc server called us in which case
+    // we should have a pointer to it
+    assert(vnc);
+
+    // Convert screen coordinates to touchpad coordinates
+    uint16_t _x = (2047.0/vnc->videoWidth()) * x;
+    uint16_t _y = (2047.0/vnc->videoHeight()) * y;
+
+    rxQueue.push_back(buttons);
+    rxQueue.push_back(_x >> 7);
+    rxQueue.push_back(_x & 0x7f);
+    rxQueue.push_back(_y >> 7);
+    rxQueue.push_back(_y & 0x7f);
+
+    updateIntStatus();
+}
+
+
+void
+Pl050::keyPress(uint32_t key, bool down)
+{
+    using namespace Ps2;
+
+    std::list<uint8_t> keys;
+
+    // convert the X11 keysym into ps2 codes
+    keySymToPs2(key, down, shiftDown, keys);
+
+    // Insert into our queue of characters
+    rxQueue.splice(rxQueue.end(), keys);
+    updateIntStatus();
+}
+
+void
+Pl050::serialize(std::ostream &os)
+{
+    uint8_t ctrlreg = control;
+    SERIALIZE_SCALAR(ctrlreg);
+
+    uint8_t stsreg = status;
+    SERIALIZE_SCALAR(stsreg);
+    SERIALIZE_SCALAR(clkdiv);
+
+    uint8_t ints = interrupts;
+    SERIALIZE_SCALAR(ints);
+
+    uint8_t raw_ints = rawInterrupts;
+    SERIALIZE_SCALAR(raw_ints);
+
+    SERIALIZE_SCALAR(ackNext);
+    SERIALIZE_SCALAR(shiftDown);
+    SERIALIZE_SCALAR(driverInitialized);
+
+    arrayParamOut(os, "rxQueue", rxQueue);
+}
+
+void
+Pl050::unserialize(Checkpoint *cp, const std::string &section)
+{
+    uint8_t ctrlreg;
+    UNSERIALIZE_SCALAR(ctrlreg);
+    control = ctrlreg;
+
+    uint8_t stsreg;
+    UNSERIALIZE_SCALAR(stsreg);
+    status = stsreg;
+
+    UNSERIALIZE_SCALAR(clkdiv);
+
+    uint8_t ints;
+    UNSERIALIZE_SCALAR(ints);
+    interrupts = ints;
+
+    uint8_t raw_ints;
+    UNSERIALIZE_SCALAR(raw_ints);
+    rawInterrupts = raw_ints;
+
+    UNSERIALIZE_SCALAR(ackNext);
+    UNSERIALIZE_SCALAR(shiftDown);
+    UNSERIALIZE_SCALAR(driverInitialized);
+
+    arrayParamIn(cp, section, "rxQueue", rxQueue);
+}
+
+
+
 Pl050 *
 Pl050Params::create()
 {
diff --git a/src/dev/arm/kmi.hh b/src/dev/arm/kmi.hh
index c96dd55a97..1e25f89748 100644
--- a/src/dev/arm/kmi.hh
+++ b/src/dev/arm/kmi.hh
@@ -48,13 +48,16 @@
 #ifndef __DEV_ARM_PL050_HH__
 #define __DEV_ARM_PL050_HH__
 
+#include <list>
+
 #include "base/range.hh"
-#include "dev/io_device.hh"
+#include "base/vnc/vncserver.hh"
+#include "dev/arm/amba_device.hh"
 #include "params/Pl050.hh"
 
 class Gic;
 
-class Pl050 : public AmbaDevice
+class Pl050 : public AmbaIntDevice, public VncKeyboard, public VncMouse
 {
   protected:
     static const int kmiCr       = 0x000;
@@ -63,34 +66,68 @@ class Pl050 : public AmbaDevice
     static const int kmiClkDiv   = 0x00C;
     static const int kmiISR      = 0x010;
 
-    // control register
-    uint8_t control;
+    BitUnion8(ControlReg)
+        Bitfield<0> force_clock_low;
+        Bitfield<1> force_data_low;
+        Bitfield<2> enable;
+        Bitfield<3> txint_enable;
+        Bitfield<4> rxint_enable;
+        Bitfield<5> type;
+    EndBitUnion(ControlReg)
 
-    // status register
-    uint8_t status;
+    /** control register
+     */
+    ControlReg control;
 
-    // received data (read) or data to be transmitted (write)
-    uint8_t kmidata;
+    /** KMI status register */
+    BitUnion8(StatusReg)
+        Bitfield<0> data_in;
+        Bitfield<1> clk_in;
+        Bitfield<2> rxparity;
+        Bitfield<3> rxbusy;
+        Bitfield<4> rxfull;
+        Bitfield<5> txbusy;
+        Bitfield<6> txempty;
+    EndBitUnion(StatusReg)
 
-    // clock divisor register
+    StatusReg status;
+
+    /** clock divisor register
+     * This register is just kept around to satisfy reads after driver does
+     * writes. The divisor does nothing, as we're not actually signaling ps2
+     * serial commands to anything.
+     */
     uint8_t clkdiv;
 
-    BitUnion8(IntReg)
-    Bitfield<0> txintr;
-    Bitfield<1> rxintr;
-    EndBitUnion(IntReg)
+    BitUnion8(InterruptReg)
+        Bitfield<0> rx;
+        Bitfield<1> tx;
+    EndBitUnion(InterruptReg)
 
-    /** interrupt mask register. */
-    IntReg intreg;
+    /** interrupt status register. */
+    InterruptReg interrupts;
 
-    /** Interrupt number to generate */
-    int intNum;
+    /** raw interrupt register (unmasked) */
+    InterruptReg rawInterrupts;
 
-    /** Gic to use for interrupting */
-    Gic *gic;
+    /** If the controller should ignore the next data byte and acknowledge it.
+     * The driver is attempting to set up some feature we don't care about
+     */
+    int ackNext;
 
-    /** Delay before interrupting */
-    Tick intDelay;
+    /** is the shift key currently down */
+    bool shiftDown;
+
+    /** The vnc server we're connected to (if any) */
+    VncServer *vnc;
+
+    /** Whether the Linux driver has initialized the device yet, and thus
+     * whether we can send mouse data */
+    bool driverInitialized;
+
+    /** Update the status of the interrupt registers and schedule an interrupt
+     * if required */
+    void updateIntStatus();
 
     /** Function to generate interrupt */
     void generateInterrupt();
@@ -98,6 +135,15 @@ class Pl050 : public AmbaDevice
     /** Wrapper to create an event out of the thing */
     EventWrapper<Pl050, &Pl050::generateInterrupt> intEvent;
 
+    /** Receive queue. This list contains all the pending commands that
+     * need to be sent to the driver
+     */
+    std::list<uint8_t> rxQueue;
+
+    /** Handle a command sent to the kmi and respond appropriately
+     */
+    void processCommand(uint8_t byte);
+
   public:
     typedef Pl050Params Params;
     const Params *
@@ -111,12 +157,11 @@ class Pl050 : public AmbaDevice
     virtual Tick read(PacketPtr pkt);
     virtual Tick write(PacketPtr pkt);
 
-    /**
-     * Return if we have an interrupt pending
-     * @return interrupt status
-     * @todo fix me when implementation improves
-     */
-    virtual bool intStatus() { return false; }
+    virtual void mouseAt(uint16_t x, uint16_t y, uint8_t buttons);
+    virtual void keyPress(uint32_t key, bool down);
+
+    virtual void serialize(std::ostream &os);
+    virtual void unserialize(Checkpoint *cp, const std::string &section);
 };
 
-#endif
+#endif // __DEV_ARM_PL050_HH__
diff --git a/src/dev/arm/pl111.cc b/src/dev/arm/pl111.cc
index e597bf2724..e884d9b58f 100644
--- a/src/dev/arm/pl111.cc
+++ b/src/dev/arm/pl111.cc
@@ -35,9 +35,13 @@
  * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  *
  * Authors: William Wang
+ *          Ali Saidi
  */
 
+#include "base/bitmap.hh"
+#include "base/output.hh"
 #include "base/trace.hh"
+#include "base/vnc/vncserver.hh"
 #include "dev/arm/amba_device.hh"
 #include "dev/arm/gic.hh"
 #include "dev/arm/pl111.hh"
@@ -50,20 +54,27 @@ using namespace AmbaDev;
 Pl111::Pl111(const Params *p)
     : AmbaDmaDevice(p), lcdTiming0(0), lcdTiming1(0), lcdTiming2(0),
       lcdTiming3(0), lcdUpbase(0), lcdLpbase(0), lcdControl(0), lcdImsc(0),
-      lcdRis(0), lcdMis(0), lcdIcr(0), lcdUpcurr(0), lcdLpcurr(0),
+      lcdRis(0), lcdMis(0),
       clcdCrsrCtrl(0), clcdCrsrConfig(0), clcdCrsrPalette0(0),
       clcdCrsrPalette1(0), clcdCrsrXY(0), clcdCrsrClip(0), clcdCrsrImsc(0),
       clcdCrsrIcr(0), clcdCrsrRis(0), clcdCrsrMis(0), clock(p->clock),
-      height(0), width(0), startTime(0), startAddr(0), maxAddr(0), curAddr(0),
+      vncserver(p->vnc), bmp(NULL), width(LcdMaxWidth), height(LcdMaxHeight),
+      bytesPerPixel(4), startTime(0), startAddr(0), maxAddr(0), curAddr(0),
       waterMark(0), dmaPendingNum(0), readEvent(this), fillFifoEvent(this),
       dmaDoneEvent(maxOutstandingDma, this), intEvent(this)
 {
     pioSize = 0xFFFF;
 
+    pic = simout.create("framebuffer.bmp", true);
+
+    dmaBuffer = new uint8_t[LcdMaxWidth * LcdMaxHeight * sizeof(uint32_t)];
+
     memset(lcdPalette, 0, sizeof(lcdPalette));
     memset(cursorImage, 0, sizeof(cursorImage));
     memset(dmaBuffer, 0, sizeof(dmaBuffer));
-    memset(frameBuffer, 0, sizeof(frameBuffer));
+
+    if (vncserver)
+        vncserver->setFramebufferAddr(dmaBuffer);
 }
 
 // read registers and frame buffer
@@ -75,111 +86,105 @@ Pl111::read(PacketPtr pkt)
 
     uint32_t data = 0;
 
-    if ((pkt->getAddr()& 0xffff0000) == pioAddr) {
+    assert(pkt->getAddr() >= pioAddr &&
+           pkt->getAddr() < pioAddr + pioSize);
 
-        assert(pkt->getAddr() >= pioAddr &&
-               pkt->getAddr() < pioAddr + pioSize);
+    Addr daddr = pkt->getAddr() - pioAddr;
+    pkt->allocate();
 
-        Addr daddr = pkt->getAddr()&0xFFFF;
-        pkt->allocate();
+    DPRINTF(PL111, " read register %#x size=%d\n", daddr, pkt->getSize());
 
-        DPRINTF(PL111, " read register %#x size=%d\n", daddr, pkt->getSize());
-
-        switch (daddr) {
-          case LcdTiming0:
-            data = lcdTiming0;
+    switch (daddr) {
+      case LcdTiming0:
+        data = lcdTiming0;
+        break;
+      case LcdTiming1:
+        data = lcdTiming1;
+        break;
+      case LcdTiming2:
+        data = lcdTiming2;
+        break;
+      case LcdTiming3:
+        data = lcdTiming3;
+        break;
+      case LcdUpBase:
+        data = lcdUpbase;
+        break;
+      case LcdLpBase:
+        data = lcdLpbase;
+        break;
+      case LcdControl:
+        data = lcdControl;
+        break;
+      case LcdImsc:
+        data = lcdImsc;
+        break;
+      case LcdRis:
+        data = lcdRis;
+        break;
+      case LcdMis:
+        data = lcdMis;
+        break;
+      case LcdIcr:
+        panic("LCD register at offset %#x is Write-Only\n", daddr);
+        break;
+      case LcdUpCurr:
+        data = curAddr;
+        break;
+      case LcdLpCurr:
+        data = curAddr;
+        break;
+      case ClcdCrsrCtrl:
+        data = clcdCrsrCtrl;
+        break;
+      case ClcdCrsrConfig:
+        data = clcdCrsrConfig;
+        break;
+      case ClcdCrsrPalette0:
+        data = clcdCrsrPalette0;
+        break;
+      case ClcdCrsrPalette1:
+        data = clcdCrsrPalette1;
+        break;
+      case ClcdCrsrXY:
+        data = clcdCrsrXY;
+        break;
+      case ClcdCrsrClip:
+        data = clcdCrsrClip;
+        break;
+      case ClcdCrsrImsc:
+        data = clcdCrsrImsc;
+        break;
+      case ClcdCrsrIcr:
+        panic("CLCD register at offset %#x is Write-Only\n", daddr);
+        break;
+      case ClcdCrsrRis:
+        data = clcdCrsrRis;
+        break;
+      case ClcdCrsrMis:
+        data = clcdCrsrMis;
+        break;
+      default:
+        if (AmbaDev::readId(pkt, AMBA_ID, pioAddr)) {
+            // Hack for variable size accesses
+            data = pkt->get<uint32_t>();
             break;
-          case LcdTiming1:
-            data = lcdTiming1;
+        } else if (daddr >= CrsrImage && daddr <= 0xBFC) {
+            // CURSOR IMAGE
+            int index;
+            index = (daddr - CrsrImage) >> 2;
+            data = cursorImage[index];
             break;
-          case LcdTiming2:
-            data = lcdTiming2;
+        } else if (daddr >= LcdPalette && daddr <= 0x3FC) {
+            // LCD Palette
+            int index;
+            index = (daddr - LcdPalette) >> 2;
+            data = lcdPalette[index];
             break;
-          case LcdTiming3:
-            data = lcdTiming3;
-            break;
-          case LcdUpBase:
-            data = lcdUpbase;
-            break;
-          case LcdLpBase:
-            data = lcdLpbase;
-            break;
-          case LcdControl:
-            data = lcdControl;
-            break;
-          case LcdImsc:
-            warn("LCD interrupt set/clear function not supported\n");
-            data = lcdImsc;
-            break;
-          case LcdRis:
-            warn("LCD Raw interrupt status function not supported\n");
-            data = lcdRis;
-            break;
-          case LcdMis:
-            warn("LCD Masked interrupt status function not supported\n");
-            data = lcdMis;
-            break;
-          case LcdIcr:
-            panic("LCD register at offset %#x is Write-Only\n", daddr);
-            break;
-          case LcdUpCurr:
-            data = lcdUpcurr;
-            break;
-          case LcdLpCurr:
-            data = lcdLpcurr;
-            break;
-          case ClcdCrsrCtrl:
-            data = clcdCrsrCtrl;
-            break;
-          case ClcdCrsrConfig:
-            data = clcdCrsrConfig;
-            break;
-          case ClcdCrsrPalette0:
-            data = clcdCrsrPalette0;
-            break;
-          case ClcdCrsrPalette1:
-            data = clcdCrsrPalette1;
-            break;
-          case ClcdCrsrXY:
-            data = clcdCrsrXY;
-            break;
-          case ClcdCrsrClip:
-            data = clcdCrsrClip;
-            break;
-          case ClcdCrsrImsc:
-            data = clcdCrsrImsc;
-            break;
-          case ClcdCrsrIcr:
-            panic("CLCD register at offset %#x is Write-Only\n", daddr);
-            break;
-          case ClcdCrsrRis:
-            data = clcdCrsrRis;
-            break;
-          case ClcdCrsrMis:
-            data = clcdCrsrMis;
-            break;
-          default:
-            if (AmbaDev::readId(pkt, AMBA_ID, pioAddr)) {
-                // Hack for variable size accesses
-                data = pkt->get<uint32_t>();
-                break;
-            } else if (daddr >= CrsrImage && daddr <= 0xBFC) {
-                // CURSOR IMAGE
-                int index;
-                index = (daddr - CrsrImage) >> 2;
-                data= cursorImage[index];
-                break;
-            } else if (daddr >= LcdPalette && daddr <= 0x3FC) {
-                // LCD Palette
-                int index;
-                index = (daddr - LcdPalette) >> 2;
-                data = lcdPalette[index];
-                break;
-            } else {
-                panic("Tried to read CLCD register at offset %#x that \
+        } else {
+            panic("Tried to read CLCD register at offset %#x that \
                        doesn't exist\n", daddr);
-                break;
-            }
+            break;
         }
     }
 
@@ -226,119 +231,133 @@ Pl111::write(PacketPtr pkt)
         break;
     }
 
-    if ((pkt->getAddr()& 0xffff0000) == pioAddr) {
+    assert(pkt->getAddr() >= pioAddr &&
+           pkt->getAddr() < pioAddr + pioSize);
 
-        assert(pkt->getAddr() >= pioAddr &&
-               pkt->getAddr() < pioAddr + pioSize);
+    Addr daddr = pkt->getAddr() - pioAddr;
 
-        Addr daddr = pkt->getAddr() - pioAddr;
+    DPRINTF(PL111, " write register %#x value %#x size=%d\n", daddr,
+            pkt->get<uint8_t>(), pkt->getSize());
 
-        DPRINTF(PL111, " write register %#x value %#x size=%d\n", daddr,
-                pkt->get<uint8_t>(), pkt->getSize());
+    switch (daddr) {
+      case LcdTiming0:
+        lcdTiming0 = data;
+        // width = 16 * (PPL+1)
+        width = (lcdTiming0.ppl + 1) << 4;
+        break;
+      case LcdTiming1:
+        lcdTiming1 = data;
+        // height = LPP + 1
+        height = (lcdTiming1.lpp) + 1;
+        break;
+      case LcdTiming2:
+        lcdTiming2 = data;
+        break;
+      case LcdTiming3:
+        lcdTiming3 = data;
+        break;
+      case LcdUpBase:
+        lcdUpbase = data;
+        DPRINTF(PL111, "####### Upper panel base set to: %#x #######\n", lcdUpbase);
+        break;
+      case LcdLpBase:
+        warn("LCD dual screen mode not supported\n");
+        lcdLpbase = data;
+        DPRINTF(PL111, "###### Lower panel base set to: %#x #######\n", lcdLpbase);
+        break;
+      case LcdControl:
+        int old_lcdpwr;
+        old_lcdpwr = lcdControl.lcdpwr;
+        lcdControl = data;
 
-        switch (daddr) {
-          case LcdTiming0:
-            lcdTiming0 = data;
-            // width = 16 * (PPL+1)
-            width = (lcdTiming0.ppl + 1) << 4;
+        DPRINTF(PL111, "LCD power is:%d\n", lcdControl.lcdpwr);
+
+        // LCD power enable
+        if (lcdControl.lcdpwr && !old_lcdpwr) {
+            updateVideoParams();
+            DPRINTF(PL111, " lcd size: height %d width %d\n", height, width);
+            waterMark = lcdControl.watermark ? 8 : 4;
+            startDma();
+        }
+        break;
+      case LcdImsc:
+        lcdImsc = data;
+        if (lcdImsc.vcomp)
+            panic("Interrupting on vcomp not supported\n");
+
+        lcdMis = lcdImsc & lcdRis;
+
+        if (!lcdMis)
+            gic->clearInt(intNum);
+
+        break;
+      case LcdRis:
+        panic("LCD register at offset %#x is Read-Only\n", daddr);
+        break;
+      case LcdMis:
+        panic("LCD register at offset %#x is Read-Only\n", daddr);
+        break;
+      case LcdIcr:
+        lcdRis = lcdRis & ~data;
+        lcdMis = lcdImsc & lcdRis;
+
+        if (!lcdMis)
+            gic->clearInt(intNum);
+
+        break;
+      case LcdUpCurr:
+        panic("LCD register at offset %#x is Read-Only\n", daddr);
+        break;
+      case LcdLpCurr:
+        panic("LCD register at offset %#x is Read-Only\n", daddr);
+        break;
+      case ClcdCrsrCtrl:
+        clcdCrsrCtrl = data;
+        break;
+      case ClcdCrsrConfig:
+        clcdCrsrConfig = data;
+        break;
+      case ClcdCrsrPalette0:
+        clcdCrsrPalette0 = data;
+        break;
+      case ClcdCrsrPalette1:
+        clcdCrsrPalette1 = data;
+        break;
+      case ClcdCrsrXY:
+        clcdCrsrXY = data;
+        break;
+      case ClcdCrsrClip:
+        clcdCrsrClip = data;
+        break;
+      case ClcdCrsrImsc:
+        clcdCrsrImsc = data;
+        break;
+      case ClcdCrsrIcr:
+        clcdCrsrIcr = data;
+        break;
+      case ClcdCrsrRis:
+        panic("CLCD register at offset %#x is Read-Only\n", daddr);
+        break;
+      case ClcdCrsrMis:
+        panic("CLCD register at offset %#x is Read-Only\n", daddr);
+        break;
+      default:
+        if (daddr >= CrsrImage && daddr <= 0xBFC) {
+            // CURSOR IMAGE
+            int index;
+            index = (daddr - CrsrImage) >> 2;
+            cursorImage[index] = data;
             break;
-          case LcdTiming1:
-            lcdTiming1 = data;
-            // height = LPP + 1
-            height  = (lcdTiming1.lpp) + 1;
+        } else if (daddr >= LcdPalette && daddr <= 0x3FC) {
+            // LCD Palette
+            int index;
+            index = (daddr - LcdPalette) >> 2;
+            lcdPalette[index] = data;
             break;
-          case LcdTiming2:
-            lcdTiming2 = data;
-            break;
-          case LcdTiming3:
-            lcdTiming3 = data;
-            break;
-          case LcdUpBase:
-            lcdUpbase  = data;
-            break;
-          case LcdLpBase:
-            warn("LCD dual screen mode not supported\n");
-            lcdLpbase  = data;
-            break;
-          case LcdControl:
-            int old_lcdpwr;
-            old_lcdpwr = lcdControl.lcdpwr;
-            lcdControl = data;
-            // LCD power enable
-            if (lcdControl.lcdpwr&&!old_lcdpwr) {
-                DPRINTF(PL111, " lcd size: height %d width %d\n", height, width);
-                waterMark = lcdControl.watermark ? 8 : 4;
-                readFramebuffer();
-            }
-            break;
-          case LcdImsc:
-            warn("LCD interrupt mask set/clear not supported\n");
-            lcdImsc    = data;
-            break;
-          case LcdRis:
-            warn("LCD register at offset %#x is Read-Only\n", daddr);
-            break;
-          case LcdMis:
-            warn("LCD register at offset %#x is Read-Only\n", daddr);
-            break;
-          case LcdIcr:
-            warn("LCD interrupt clear not supported\n");
-            lcdIcr     = data;
-            break;
-          case LcdUpCurr:
-            warn("LCD register at offset %#x is Read-Only\n", daddr);
-            break;
-          case LcdLpCurr:
-            warn("LCD register at offset %#x is Read-Only\n", daddr);
-            break;
-          case ClcdCrsrCtrl:
-            clcdCrsrCtrl = data;
-            break;
-          case ClcdCrsrConfig:
-            clcdCrsrConfig = data;
-            break;
-          case ClcdCrsrPalette0:
-            clcdCrsrPalette0 = data;
-            break;
-          case ClcdCrsrPalette1:
-            clcdCrsrPalette1 = data;
-            break;
-          case ClcdCrsrXY:
-            clcdCrsrXY = data;
-            break;
-          case ClcdCrsrClip:
-            clcdCrsrClip = data;
-            break;
-          case ClcdCrsrImsc:
-            clcdCrsrImsc = data;
-            break;
-          case ClcdCrsrIcr:
-            clcdCrsrIcr = data;
-            break;
-          case ClcdCrsrRis:
-            warn("CLCD register at offset %#x is Read-Only\n", daddr);
-            break;
-          case ClcdCrsrMis:
-            warn("CLCD register at offset %#x is Read-Only\n", daddr);
-            break;
-          default:
-            if (daddr >= CrsrImage && daddr <= 0xBFC) {
-                // CURSOR IMAGE
-                int index;
-                index = (daddr - CrsrImage) >> 2;
-                cursorImage[index] = data;
-                break;
-            } else if (daddr >= LcdPalette && daddr <= 0x3FC) {
-                // LCD Palette
-                int index;
-                index = (daddr - LcdPalette) >> 2;
-                lcdPalette[index] = data;
-                break;
-            } else {
-                panic("Tried to write PL111 register at offset %#x that \
+        } else {
+            panic("Tried to write PL111 register at offset %#x that \
                        doesn't exist\n", daddr);
-                break;
-            }
+            break;
         }
     }
 
@@ -346,18 +365,76 @@ Pl111::write(PacketPtr pkt)
     return pioDelay;
 }
 
+void
+Pl111::updateVideoParams()
+{
+        if (lcdControl.lcdbpp == bpp24) {
+            bytesPerPixel = 4;
+        } else if (lcdControl.lcdbpp == bpp16m565) {
+            bytesPerPixel = 2;
+        }
+
+        if (vncserver) {
+            if (lcdControl.lcdbpp == bpp24 && lcdControl.bgr)
+                vncserver->setFrameBufferParams(VideoConvert::bgr8888, width,
+                       height);
+            else if (lcdControl.lcdbpp == bpp24 && !lcdControl.bgr)
+                vncserver->setFrameBufferParams(VideoConvert::rgb8888, width,
+                       height);
+            else if (lcdControl.lcdbpp == bpp16m565 && lcdControl.bgr)
+                vncserver->setFrameBufferParams(VideoConvert::bgr565, width,
+                       height);
+            else if (lcdControl.lcdbpp == bpp16m565 && !lcdControl.bgr)
+                vncserver->setFrameBufferParams(VideoConvert::rgb565, width,
+                       height);
+            else
+                panic("Unimplemented video mode\n");
+        }
+
+        if (bmp)
+            delete bmp;
+
+        if (lcdControl.lcdbpp == bpp24 && lcdControl.bgr)
+            bmp = new Bitmap(VideoConvert::bgr8888, width, height, dmaBuffer);
+        else if (lcdControl.lcdbpp == bpp24 && !lcdControl.bgr)
+            bmp = new Bitmap(VideoConvert::rgb8888, width, height, dmaBuffer);
+        else if (lcdControl.lcdbpp == bpp16m565 && lcdControl.bgr)
+            bmp = new Bitmap(VideoConvert::bgr565, width, height, dmaBuffer);
+        else if (lcdControl.lcdbpp == bpp16m565 && !lcdControl.bgr)
+            bmp = new Bitmap(VideoConvert::rgb565, width, height, dmaBuffer);
+        else
+            panic("Unimplemented video mode\n");
+}
+
+void
+Pl111::startDma()
+{
+    if (dmaPendingNum != 0 || readEvent.scheduled())
+        return;
+    readFramebuffer();
+}
+
 void
 Pl111::readFramebuffer()
 {
     // initialization for dma read from frame buffer to dma buffer
-    uint32_t length  = height*width;
-    if (startAddr != lcdUpbase) {
+    uint32_t length = height * width;
+    if (startAddr != lcdUpbase)
         startAddr = lcdUpbase;
-    }
+
+    // Updating base address, interrupt if we're supposed to
+    lcdRis.baseaddr = 1;
+    if (!intEvent.scheduled())
+        schedule(intEvent, nextCycle());
+
     curAddr = 0;
     startTime = curTick();
-    maxAddr = static_cast<Addr>(length*sizeof(uint32_t));
-    dmaPendingNum =0 ;
+
+    maxAddr = static_cast<Addr>(length * bytesPerPixel);
+
+    DPRINTF(PL111, " lcd frame buffer size of %d bytes \n", maxAddr);
+
+    dmaPendingNum = 0;
 
     fillFifo();
 }
@@ -369,11 +446,16 @@ Pl111::fillFifo()
         // concurrent dma reads need different dma done events
         // due to assertion in scheduling state
         ++dmaPendingNum;
-        DPRINTF(PL111, " ++ DMA pending number %d read addr %#x\n",
-                dmaPendingNum, curAddr);
+
         assert(!dmaDoneEvent[dmaPendingNum-1].scheduled());
-        dmaRead(curAddr + startAddr, dmaSize, &dmaDoneEvent[dmaPendingNum-1],
-                curAddr + dmaBuffer);
+
+        // We use an uncacheable request here because the requests from the CPU
+        // will be uncacheable as well. If we have uncacheable and cacheable
+        // requests in the memory system for the same address it won't be
+        // pleased
+        dmaPort->dmaAction(MemCmd::ReadReq, curAddr + startAddr, dmaSize,
+                &dmaDoneEvent[dmaPendingNum-1], curAddr + dmaBuffer, 0,
+                Request::UNCACHEABLE);
         curAddr += dmaSize;
     }
 }
@@ -381,27 +463,34 @@ Pl111::fillFifo()
 void
 Pl111::dmaDone()
 {
-    Tick maxFrameTime = lcdTiming2.cpl*height*clock;
+    Tick maxFrameTime = lcdTiming2.cpl * height * clock;
 
     --dmaPendingNum;
 
-    DPRINTF(PL111, " -- DMA pending number %d\n", dmaPendingNum);
-
     if (maxAddr == curAddr && !dmaPendingNum) {
-        if ((curTick() - startTime) > maxFrameTime)
+        if ((curTick() - startTime) > maxFrameTime) {
             warn("CLCD controller buffer underrun, took %d cycles when should"
                  " have taken %d\n", curTick() - startTime, maxFrameTime);
+            lcdRis.underflow = 1;
+            if (!intEvent.scheduled())
+                schedule(intEvent, nextCycle());
+        }
 
-        // double buffering so the vnc server doesn't see a tear in the screen
-        memcpy(frameBuffer, dmaBuffer, maxAddr);
         assert(!readEvent.scheduled());
+        if (vncserver)
+            vncserver->setDirty();
 
         DPRINTF(PL111, "-- write out frame buffer into bmp\n");
-        writeBMP(frameBuffer);
+
+        assert(bmp);
+        pic->seekp(0);
+        bmp->write(pic);
 
         DPRINTF(PL111, "-- schedule next dma read event at %d tick \n",
                 maxFrameTime + curTick());
-        schedule(readEvent, nextCycle(startTime + maxFrameTime));
+
+        if (lcdControl.lcden)
+            schedule(readEvent, nextCycle(startTime + maxFrameTime));
     }
 
     if (dmaPendingNum > (maxOutstandingDma - waterMark))
@@ -409,9 +498,9 @@ Pl111::dmaDone()
 
     if (!fillFifoEvent.scheduled())
         schedule(fillFifoEvent, nextCycle());
-
 }
 
+
 Tick
 Pl111::nextCycle()
 {
@@ -431,33 +520,6 @@ Pl111::nextCycle(Tick beginTick)
     return nextTick;
 }
 
-// write out the frame buffer into a bitmap file
-void
-Pl111::writeBMP(uint32_t* frameBuffer)
-{
-    fstream pic;
-
-    // write out bmp head
-    std::string filename = "./m5out/frameBuffer.bmp";
-    pic.open(filename.c_str(), ios::out|ios::binary);
-    Bitmap bm(pic, height, width);
-
-    DPRINTF(PL111, "-- write out data into bmp\n");
-
-    // write out frame buffer data
-    for (int i = height -1; i >= 0; --i) {
-        for (int j = 0; j< width; ++j) {
-            uint32_t pixel = frameBuffer[i*width + j];
-            pic.write(reinterpret_cast<char*>(&pixel),
-                      sizeof(uint32_t));
-            DPRINTF(PL111, " write pixel data  %#x at addr %#x\n",
-                    pixel, i*width + j);
-        }
-    }
-
-    pic.close();
-}
-
 void
 Pl111::serialize(std::ostream &os)
 {
@@ -490,9 +552,6 @@ Pl111::serialize(std::ostream &os)
     uint8_t lcdMis_serial = lcdMis;
     SERIALIZE_SCALAR(lcdMis_serial);
 
-    uint8_t lcdIcr_serial = lcdIcr;
-    SERIALIZE_SCALAR(lcdIcr_serial);
-
     SERIALIZE_ARRAY(lcdPalette, LcdPaletteSize);
     SERIALIZE_ARRAY(cursorImage, CrsrImageSize);
 
@@ -518,9 +577,9 @@ Pl111::serialize(std::ostream &os)
     SERIALIZE_SCALAR(clock);
     SERIALIZE_SCALAR(height);
     SERIALIZE_SCALAR(width);
+    SERIALIZE_SCALAR(bytesPerPixel);
 
-    SERIALIZE_ARRAY(dmaBuffer, height*width);
-    SERIALIZE_ARRAY(frameBuffer, height*width);
+    SERIALIZE_ARRAY(dmaBuffer, height * width);
     SERIALIZE_SCALAR(startTime);
     SERIALIZE_SCALAR(startAddr);
     SERIALIZE_SCALAR(maxAddr);
@@ -569,10 +628,6 @@ Pl111::unserialize(Checkpoint *cp, const std::string &section)
     UNSERIALIZE_SCALAR(lcdMis_serial);
     lcdMis = lcdMis_serial;
 
-    uint8_t lcdIcr_serial;
-    UNSERIALIZE_SCALAR(lcdIcr_serial);
-    lcdIcr = lcdIcr_serial;
-
     UNSERIALIZE_ARRAY(lcdPalette, LcdPaletteSize);
     UNSERIALIZE_ARRAY(cursorImage, CrsrImageSize);
 
@@ -602,25 +657,29 @@ Pl111::unserialize(Checkpoint *cp, const std::string &section)
     UNSERIALIZE_SCALAR(clock);
     UNSERIALIZE_SCALAR(height);
     UNSERIALIZE_SCALAR(width);
+    UNSERIALIZE_SCALAR(bytesPerPixel);
 
-    UNSERIALIZE_ARRAY(dmaBuffer, height*width);
-    UNSERIALIZE_ARRAY(frameBuffer, height*width);
+    UNSERIALIZE_ARRAY(dmaBuffer, height * width);
     UNSERIALIZE_SCALAR(startTime);
     UNSERIALIZE_SCALAR(startAddr);
     UNSERIALIZE_SCALAR(maxAddr);
     UNSERIALIZE_SCALAR(curAddr);
     UNSERIALIZE_SCALAR(waterMark);
     UNSERIALIZE_SCALAR(dmaPendingNum);
+
+    updateVideoParams();
+    if (vncserver)
+        vncserver->setDirty();
 }
 
 void
 Pl111::generateInterrupt()
 {
     DPRINTF(PL111, "Generate Interrupt: lcdImsc=0x%x lcdRis=0x%x lcdMis=0x%x\n",
-            lcdImsc, lcdRis, lcdMis);
+            (uint32_t)lcdImsc, (uint32_t)lcdRis, (uint32_t)lcdMis);
     lcdMis = lcdImsc & lcdRis;
 
-    if (lcdMis.ffufie || lcdMis.nbupie || lcdMis.vtcpie || lcdMis.ahmeie) {
+    if (lcdMis.underflow || lcdMis.baseaddr || lcdMis.vcomp || lcdMis.ahbmaster) {
         gic->sendInt(intNum);
         DPRINTF(PL111, " -- Generated\n");
     }
@@ -639,15 +698,4 @@ Pl111Params::create()
     return new Pl111(this);
 }
 
-// bitmap class ctor
-Bitmap::Bitmap(std::fstream& bmp, uint16_t h, uint16_t w)
-{
-    Magic  magic  = {{'B','M'}};
-    Header header = {sizeof(Color)*w*h , 0, 0, 54};
-    Info   info   = {sizeof(Info), w, h, 1, sizeof(Color)*8, 0,
-                     ( sizeof(Color) *(w*h) ), 1, 1, 0, 0};
 
-    bmp.write(reinterpret_cast<char*>(&magic),  sizeof(magic));
-    bmp.write(reinterpret_cast<char*>(&header), sizeof(header));
-    bmp.write(reinterpret_cast<char*>(&info),   sizeof(info));
-}
diff --git a/src/dev/arm/pl111.hh b/src/dev/arm/pl111.hh
index 4e75af4e84..f36dc68103 100644
--- a/src/dev/arm/pl111.hh
+++ b/src/dev/arm/pl111.hh
@@ -35,6 +35,7 @@
  * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  *
  * Authors: William Wang
+ *          Ali Saidi
  */
 
 
@@ -55,6 +56,8 @@
 using namespace std;
 
 class Gic;
+class VncServer;
+class Bitmap;
 
 class Pl111: public AmbaDmaDevice
 {
@@ -96,58 +99,69 @@ class Pl111: public AmbaDmaDevice
     static const int dmaSize            = 8;    // 64 bits
     static const int maxOutstandingDma  = 16;   // 16 deep FIFO of 64 bits
 
+    enum LcdMode {
+        bpp1 = 0,
+        bpp2,
+        bpp4,
+        bpp8,
+        bpp16,
+        bpp24,
+        bpp16m565,
+        bpp12
+    };
+
     BitUnion8(InterruptReg)
-    Bitfield<1> ffufie;
-    Bitfield<2> nbupie;
-    Bitfield<3> vtcpie;
-    Bitfield<4> ahmeie;
+        Bitfield<1> underflow;
+        Bitfield<2> baseaddr;
+        Bitfield<3> vcomp;
+        Bitfield<4> ahbmaster;
     EndBitUnion(InterruptReg)
 
     BitUnion32(TimingReg0)
-    Bitfield<7,2> ppl;
-    Bitfield<15,8> hsw;
-    Bitfield<23,16> hfp;
-    Bitfield<31,24> hbp;
+        Bitfield<7,2> ppl;
+        Bitfield<15,8> hsw;
+        Bitfield<23,16> hfp;
+        Bitfield<31,24> hbp;
     EndBitUnion(TimingReg0)
 
     BitUnion32(TimingReg1)
-    Bitfield<9,0> lpp;
-    Bitfield<15,10> vsw;
-    Bitfield<23,16> vfp;
-    Bitfield<31,24> vbp;
+        Bitfield<9,0> lpp;
+        Bitfield<15,10> vsw;
+        Bitfield<23,16> vfp;
+        Bitfield<31,24> vbp;
     EndBitUnion(TimingReg1)
 
     BitUnion32(TimingReg2)
-    Bitfield<4,0> pcdlo;
-    Bitfield<5> clksel;
-    Bitfield<10,6> acb;
-    Bitfield<11> avs;
-    Bitfield<12> ihs;
-    Bitfield<13> ipc;
-    Bitfield<14> ioe;
-    Bitfield<25,16> cpl;
-    Bitfield<26> bcd;
-    Bitfield<31,27> pcdhi;
+        Bitfield<4,0> pcdlo;
+        Bitfield<5> clksel;
+        Bitfield<10,6> acb;
+        Bitfield<11> avs;
+        Bitfield<12> ihs;
+        Bitfield<13> ipc;
+        Bitfield<14> ioe;
+        Bitfield<25,16> cpl;
+        Bitfield<26> bcd;
+        Bitfield<31,27> pcdhi;
     EndBitUnion(TimingReg2)
 
     BitUnion32(TimingReg3)
-    Bitfield<6,0> led;
-    Bitfield<16> lee;
+        Bitfield<6,0> led;
+        Bitfield<16> lee;
     EndBitUnion(TimingReg3)
 
     BitUnion32(ControlReg)
-    Bitfield<0> lcden;
-    Bitfield<3,1> lcdbpp;
-    Bitfield<4> lcdbw;
-    Bitfield<5> lcdtft;
-    Bitfield<6> lcdmono8;
-    Bitfield<7> lcddual;
-    Bitfield<8> bgr;
-    Bitfield<9> bebo;
-    Bitfield<10> bepo;
-    Bitfield<11> lcdpwr;
-    Bitfield<13,12> lcdvcomp;
-    Bitfield<16> watermark;
+        Bitfield<0> lcden;
+        Bitfield<3,1> lcdbpp;
+        Bitfield<4> lcdbw;
+        Bitfield<5> lcdtft;
+        Bitfield<6> lcdmono8;
+        Bitfield<7> lcddual;
+        Bitfield<8> bgr;
+        Bitfield<9> bebo;
+        Bitfield<10> bepo;
+        Bitfield<11> lcdpwr;
+        Bitfield<13,12> lcdvcomp;
+        Bitfield<16> watermark;
     EndBitUnion(ControlReg)
 
     /** Horizontal axis panel control register */
@@ -180,15 +194,6 @@ class Pl111: public AmbaDmaDevice
     /** Masked interrupt status register */
     InterruptReg lcdMis;
 
-    /** Interrupt clear register */
-    InterruptReg lcdIcr;
-
-    /** Upper panel current address value register - ro */
-    int lcdUpcurr;
-
-    /** Lower panel current address value register - ro */
-    int lcdLpcurr;
-
     /** 256x16-bit color palette registers
      * 256 palette entries organized as 128 locations of two entries per word */
     int lcdPalette[LcdPaletteSize];
@@ -228,17 +233,26 @@ class Pl111: public AmbaDmaDevice
     /** Clock speed */
     Tick clock;
 
-    /** Frame buffer height - lines per panel */
-    uint16_t height;
+    /** VNC server */
+    VncServer *vncserver;
+
+    /** Helper to write out bitmaps */
+    Bitmap *bmp;
+
+    /** Picture of what the current frame buffer looks like */
+    std::ostream *pic;
 
     /** Frame buffer width - pixels per line */
     uint16_t width;
 
-    /** CLCDC supports up to 1024x768 */
-    uint8_t dmaBuffer[LcdMaxWidth * LcdMaxHeight * sizeof(uint32_t)];
+    /** Frame buffer height - lines per panel */
+    uint16_t height;
 
-    /** Double buffering */
-    uint32_t frameBuffer[LcdMaxWidth * LcdMaxHeight];
+    /** Bytes per pixel */
+    uint8_t bytesPerPixel;
+
+    /** DMA buffer (CLCDC supports up to 1024x768) */
+    uint8_t *dmaBuffer;
 
     /** Start time for frame buffer dma read */
     Tick startTime;
@@ -258,12 +272,12 @@ class Pl111: public AmbaDmaDevice
     /** Number of pending dma reads */
     int dmaPendingNum;
 
+    /** Send updated parameters to the vnc server */
+    void updateVideoParams();
+
     /** DMA framebuffer read */
     void readFramebuffer();
 
-    /** Write framebuffer to a bmp file */
-    void writeBMP(uint32_t*);
-
     /** Generate dma framebuffer read event */
     void generateReadEvent();
 
@@ -273,6 +287,9 @@ class Pl111: public AmbaDmaDevice
     /** fillFIFO event */
     void fillFifo();
 
+    /** Start the DMAs once power is enabled */
+    void startDma();
+
     /** DMA done event */
     void dmaDone();
 
@@ -289,7 +306,7 @@ class Pl111: public AmbaDmaDevice
     /** DMA done event */
     vector<EventWrapper<Pl111, &Pl111::dmaDone> > dmaDoneEvent;
 
-    /** Wrapper to create an event out of the thing */
+    /** Wrapper to create an event out of the interrupt */
     EventWrapper<Pl111, &Pl111::generateInterrupt> intEvent;
 
   public:
@@ -312,57 +329,6 @@ class Pl111: public AmbaDmaDevice
      * @param range_list range list to populate with ranges
      */
     void addressRanges(AddrRangeList &range_list);
-
-    /**
-     * Return if we have an interrupt pending
-     * @return interrupt status
-     * @todo fix me when implementation improves
-     */
-    virtual bool intStatus() { return false; }
-};
-
-// write frame buffer into a bitmap picture
-class  Bitmap
-{
-  public:
-    Bitmap(std::fstream& bmp, uint16_t h, uint16_t w);
-
-  private:
-    struct Magic
-    {
-        unsigned char magic_number[2];
-    } magic;
-
-    struct Header
-    {
-        uint32_t size;
-        uint16_t reserved1;
-        uint16_t reserved2;
-        uint32_t offset;
-    } header;
-
-    struct Info
-    {
-        uint32_t Size;
-        uint32_t Width;
-        uint32_t Height;
-        uint16_t Planes;
-        uint16_t BitCount;
-        uint32_t Compression;
-        uint32_t SizeImage;
-        uint32_t XPelsPerMeter;
-        uint32_t YPelsPerMeter;
-        uint32_t ClrUsed;
-        uint32_t ClrImportant;
-    } info;
-
-    struct Color
-    {
-        unsigned char b;
-        unsigned char g;
-        unsigned char r;
-        unsigned char a;
-    } color;
 };
 
 #endif
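Note on the LcdMode enum above: its eight enumerators line up with the 3-bit lcdbpp field (bits 3:1) of ControlReg. The following is a minimal standalone sketch of that decoding, not code from the patch. The helper names lcdModeOf and bitsPerPixel are hypothetical, and the bits-per-pixel values are an assumption inferred from the enumerator names.

```cpp
#include <cstdint>

// Enumerator order copied from the LcdMode enum added by the patch.
enum LcdMode {
    bpp1 = 0, bpp2, bpp4, bpp8, bpp16, bpp24, bpp16m565, bpp12
};

// Extract the 3-bit lcdbpp field (bits 3:1 of ControlReg in the patch).
// Hypothetical helper; the patch itself uses the BitUnion accessor.
static LcdMode
lcdModeOf(uint32_t control)
{
    return static_cast<LcdMode>((control >> 1) & 0x7);
}

// Hypothetical helper: bits per pixel, inferred from the enumerator names.
static int
bitsPerPixel(LcdMode mode)
{
    switch (mode) {
      case bpp1:      return 1;
      case bpp2:      return 2;
      case bpp4:      return 4;
      case bpp8:      return 8;
      case bpp16:     return 16;
      case bpp24:     return 24;
      case bpp16m565: return 16;
      case bpp12:     return 12;
    }
    return 0;
}
```

With a control value of 0xb (lcden set, lcdbpp = 5) the sketch selects bpp24.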
diff --git a/src/dev/arm/rv_ctrl.cc b/src/dev/arm/rv_ctrl.cc
index c0ba4c7aa1..b1bbc065b4 100644
--- a/src/dev/arm/rv_ctrl.cc
+++ b/src/dev/arm/rv_ctrl.cc
@@ -68,6 +68,27 @@ RealViewCtrl::read(PacketPtr pkt)
       case Flash:
         pkt->set<uint32_t>(0);
         break;
+      case Clcd:
+        pkt->set<uint32_t>(0x00001F00);
+        break;
+      case Osc0:
+        pkt->set<uint32_t>(0x00012C5C);
+        break;
+      case Osc1:
+        pkt->set<uint32_t>(0x00002CC0);
+        break;
+      case Osc2:
+        pkt->set<uint32_t>(0x00002C75);
+        break;
+      case Osc3:
+        pkt->set<uint32_t>(0x00020211);
+        break;
+      case Osc4:
+        pkt->set<uint32_t>(0x00002C75);
+        break;
+      case Lock:
+        pkt->set<uint32_t>(sysLock);
+        break;
       default:
         panic("Tried to read RealView I/O at offset %#x that doesn't exist\n", daddr);
         break;
@@ -85,6 +106,15 @@ RealViewCtrl::write(PacketPtr pkt)
     Addr daddr = pkt->getAddr() - pioAddr;
     switch (daddr) {
       case Flash:
+      case Clcd:
+      case Osc0:
+      case Osc1:
+      case Osc2:
+      case Osc3:
+      case Osc4:
+        break;
+      case Lock:
+        sysLock.lockVal = pkt->get<uint16_t>();
         break;
       default:
         panic("Tried to write RVIO at offset %#x that doesn't exist\n", daddr);
diff --git a/src/dev/arm/rv_ctrl.hh b/src/dev/arm/rv_ctrl.hh
index 00a19d7158..ceed5ef2f4 100644
--- a/src/dev/arm/rv_ctrl.hh
+++ b/src/dev/arm/rv_ctrl.hh
@@ -40,6 +40,7 @@
 #ifndef __DEV_ARM_RV_HH__
 #define __DEV_ARM_RV_HH__
 
+#include "base/bitunion.hh"
 #include "base/range.hh"
 #include "dev/io_device.hh"
 #include "params/RealViewCtrl.hh"
@@ -86,6 +87,14 @@ class RealViewCtrl : public BasicPioDevice
         TestOsc4   = 0xD0
     };
 
+    // system lock value
+    BitUnion32(SysLockReg)
+        Bitfield<15,0> lockVal;
+        Bitfield<16> locked;
+    EndBitUnion(SysLockReg)
+
+    SysLockReg sysLock;
+
   public:
     typedef RealViewCtrlParams Params;
     const Params *
@@ -120,4 +129,3 @@ class RealViewCtrl : public BasicPioDevice
 
 
 #endif // __DEV_ARM_RV_HH__
-
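Note on SysLockReg above: the write handler in rv_ctrl.cc assigns only lockVal, so bit 16 (locked) keeps whatever value it already had. The following is a minimal sketch of the same layout with plain masks, assuming nothing beyond the two Bitfield declarations. The names LockValMask, LockedBit, writeLockVal, lockVal and isLocked are hypothetical and do not appear in the patch.

```cpp
#include <cstdint>

// Layout declared by SysLockReg in the patch:
// bits 15:0 hold lockVal, bit 16 holds locked.
static const uint32_t LockValMask = 0x0000ffff;
static const uint32_t LockedBit   = 1u << 16;

// Mirror of the write path: only the low 16 bits are replaced,
// every other bit (including locked) is preserved.
static uint32_t
writeLockVal(uint32_t reg, uint16_t val)
{
    return (reg & ~LockValMask) | val;
}

// Read back the 16-bit lock value.
static uint16_t
lockVal(uint32_t reg)
{
    return static_cast<uint16_t>(reg & LockValMask);
}

// Test the locked flag.
static bool
isLocked(uint32_t reg)
{
    return (reg & LockedBit) != 0;
}
```

A read of the Lock offset returns the whole 32-bit register, so the guest sees both fields at once.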
diff --git a/src/dev/arm/timer_sp804.cc b/src/dev/arm/timer_sp804.cc
index 04668d2685..e6d2657ea9 100644
--- a/src/dev/arm/timer_sp804.cc
+++ b/src/dev/arm/timer_sp804.cc
@@ -178,11 +178,11 @@ Sp804::Timer::restartCounter(uint32_t val)
     if (!control.timerEnable)
         return;
 
-    Tick time = clock << power(16, control.timerPrescale);
+    Tick time = clock * power(16, control.timerPrescale);
     if (control.timerSize)
-        time *= bits(val,15,0);
-    else
         time *= val;
+    else
+        time *= bits(val,15,0);
 
     if (zeroEvent.scheduled()) {
         DPRINTF(Timer, "-- Event was already schedule, de-scheduling\n");
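Note on the restartCounter() change above: the old code shifted clock left by power(16, prescale) bits instead of multiplying by it, and it masked the counter value in 32-bit mode rather than in 16-bit mode. The following is a standalone sketch of the corrected arithmetic. Tick, ipow and timerPeriod are hypothetical stand-ins for the simulator's own types and helpers.

```cpp
#include <cstdint>

// Hypothetical stand-in for the simulator's Tick type.
typedef uint64_t Tick;

// Integer power helper, assumed to behave like the power() call in the patch.
static uint64_t
ipow(uint32_t base, uint32_t exp)
{
    uint64_t result = 1;
    for (uint32_t i = 0; i < exp; ++i)
        result *= base;
    return result;
}

// Ticks until the counter reaches zero.
//   clock    - ticks per timer clock cycle
//   prescale - prescale field; the divisor is 16^prescale
//   size32   - true for 32-bit counter mode, false for 16-bit mode
//   val      - value loaded into the counter
static Tick
timerPeriod(Tick clock, uint32_t prescale, bool size32, uint32_t val)
{
    Tick time = clock * ipow(16, prescale);
    if (size32)
        time *= val;
    else
        time *= val & 0xffff;   // only bits 15:0 count in 16-bit mode
    return time;
}
```

With clock = 1000 and prescale = 1, ten counts take 160000 ticks. The shift in the old code would have scaled clock by 2^16 instead of 16.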
diff --git a/src/dev/ps2.cc b/src/dev/ps2.cc
new file mode 100644
index 0000000000..fe90ce6bc3
--- /dev/null
+++ b/src/dev/ps2.cc
@@ -0,0 +1,200 @@
+/*
+ * Copyright (c) 2011 ARM Limited
+ * All rights reserved
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: Ali Saidi
+ */
+
+#include <list>
+#include "x11keysym/keysym.h"
+
+#include "base/misc.hh"
+#include "dev/ps2.hh"
+
+
+namespace Ps2 {
+
+/** Table to convert simple key symbols (0x00XX) into ps2 bytes. The lower
+ * byte is the scan code to send and the upper byte is nonzero if the shift
+ * modifier is required to generate it. The table generates US keyboard
+ * codes (i.e. the guest is expected to treat the keyboard as en_US). A new
+ * table would be required for another locale.
+ */
+
+static const uint16_t keySymToPs2Byte[128] = {
+// 0 / 8   1 / 9   2 / A   3 / B   4 / C   5 / D   6 / E   7 / F
+   0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, // 0x00-0x07
+   0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, // 0x08-0x0f
+   0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, // 0x10-0x17
+   0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, // 0x18-0x1f
+   0x0029, 0x0116, 0x0152, 0x0126, 0x0125, 0x012e, 0x013d, 0x0052, // 0x20-0x27
+   0x0146, 0x0145, 0x013e, 0x0155, 0x0041, 0x004e, 0x0049, 0x004a, // 0x28-0x2f
+   0x0045, 0x0016, 0x001e, 0x0026, 0x0025, 0x002e, 0x0036, 0x003d, // 0x30-0x37
+   0x003e, 0x0046, 0x014c, 0x004c, 0x0141, 0x0055, 0x0149, 0x014a, // 0x38-0x3f
+   0x011e, 0x011c, 0x0132, 0x0121, 0x0123, 0x0124, 0x012b, 0x0134, // 0x40-0x47
+   0x0133, 0x0143, 0x013b, 0x0142, 0x014b, 0x013a, 0x0131, 0x0144, // 0x48-0x4f
+   0x014d, 0x0115, 0x012d, 0x011b, 0x012c, 0x013c, 0x012a, 0x011d, // 0x50-0x57
+   0x0122, 0x0135, 0x011a, 0x0054, 0x005d, 0x005b, 0x0136, 0x014e, // 0x58-0x5f
+   0x000e, 0x001c, 0x0032, 0x0021, 0x0023, 0x0024, 0x002b, 0x0034, // 0x60-0x67
+   0x0033, 0x0043, 0x003b, 0x0042, 0x004b, 0x003a, 0x0031, 0x0044, // 0x68-0x6f
+   0x004d, 0x0015, 0x002d, 0x001b, 0x002c, 0x003c, 0x002a, 0x001d, // 0x70-0x77
+   0x0022, 0x0035, 0x001a, 0x0154, 0x015d, 0x015b, 0x010e, 0x0000  // 0x78-0x7f
+};
+
+const uint8_t ShiftKey = 0x12;
+const uint8_t BreakKey = 0xf0;
+const uint8_t ExtendedKey = 0xe0;
+const uint32_t UpperKeys = 0xff00;
+
+void
+keySymToPs2(uint32_t key, bool down, bool &cur_shift,
+        std::list<uint8_t> &keys)
+{
+    if (key <= XK_asciitilde) {
+        uint16_t tmp = keySymToPs2Byte[key];
+        uint8_t code = tmp & 0xff;
+        bool shift = tmp >> 8;
+
+        if (down) {
+            if (!cur_shift && shift) {
+                keys.push_back(ShiftKey);
+                cur_shift = true;
+            }
+            keys.push_back(code);
+        } else {
+            if (cur_shift && !shift) {
+                keys.push_back(BreakKey);
+                keys.push_back(ShiftKey);
+                cur_shift = false;
+            }
+            keys.push_back(BreakKey);
+            keys.push_back(code);
+        }
+    } else {
+        if ((key & UpperKeys) == UpperKeys) {
+            bool extended = false;
+            switch (key) {
+              case XK_BackSpace:
+                keys.push_back(0x66);
+                break;
+              case XK_Tab:
+                keys.push_back(0x0d);
+                break;
+              case XK_Return:
+                keys.push_back(0x5a);
+                break;
+              case XK_Escape:
+                keys.push_back(0x76);
+                break;
+              case XK_Delete:
+                extended = true;
+                keys.push_back(0x71);
+                break;
+              case XK_Home:
+                extended = true;
+                keys.push_back(0x6c);
+                break;
+              case XK_Left:
+                extended = true;
+                keys.push_back(0x6b);
+                break;
+              case XK_Right:
+                extended = true;
+                keys.push_back(0x74);
+                break;
+              case XK_Down:
+                extended = true;
+                keys.push_back(0x72);
+                break;
+              case XK_Up:
+                extended = true;
+                keys.push_back(0x75);
+                break;
+              case XK_Page_Up:
+                extended = true;
+                keys.push_back(0x7d);
+                break;
+              case XK_Page_Down:
+                extended = true;
+                keys.push_back(0x7a);
+                break;
+              case XK_End:
+                extended = true;
+                keys.push_back(0x69);
+                break;
+              case XK_Shift_L:
+                keys.push_back(0x12);
+                if (down)
+                    cur_shift = true;
+                else
+                    cur_shift = false;
+                break;
+              case XK_Shift_R:
+                keys.push_back(0x59);
+                if (down)
+                    cur_shift = true;
+                else
+                    cur_shift = false;
+                break;
+              case XK_Control_L:
+                keys.push_back(0x14);
+                break;
+              case XK_Control_R:
+                extended = true;
+                keys.push_back(0x14);
+                break;
+              default:
+                warn("Unknown extended key %#x\n", key);
+                return;
+            }
+
+            if (extended) {
+                if (down) {
+                    keys.push_front(ExtendedKey);
+                } else {
+                    keys.push_front(BreakKey);
+                    keys.push_front(ExtendedKey);
+                }
+            } else {
+                if (!down)
+                    keys.push_front(BreakKey);
+            }
+        } // upper keys
+    } // extended keys
+    return;
+}
+
+} /* namespace Ps2 */
+
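Note on keySymToPs2() above: for the special keys the scan code is appended first and the prefix bytes are pushed to the front afterwards, so the byte order only comes out right when the caller passes in an empty list. The following is a minimal standalone sketch of that ordering. The helper specialKeyBytes is hypothetical, and only the two constants are copied from the patch.

```cpp
#include <cstdint>
#include <list>

// Constants copied from the patch.
static const uint8_t BreakKey = 0xf0;
static const uint8_t ExtendedKey = 0xe0;

// Hypothetical helper: build the byte sequence for one special key using
// the same push_back/push_front order as keySymToPs2().
static std::list<uint8_t>
specialKeyBytes(uint8_t code, bool extended, bool down)
{
    std::list<uint8_t> keys;
    keys.push_back(code);
    if (extended) {
        if (down) {
            keys.push_front(ExtendedKey);
        } else {
            // Pushed in reverse so the final order is e0, f0, code.
            keys.push_front(BreakKey);
            keys.push_front(ExtendedKey);
        }
    } else if (!down) {
        keys.push_front(BreakKey);
    }
    return keys;
}
```

Releasing the Left arrow (code 0x6b, extended) produces the three bytes e0 f0 6b, while pressing Tab (code 0x0d, not extended) produces the single byte 0d.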
diff --git a/src/dev/ps2.hh b/src/dev/ps2.hh
new file mode 100644
index 0000000000..73f3f9cd8c
--- /dev/null
+++ b/src/dev/ps2.hh
@@ -0,0 +1,94 @@
+/*
+ * Copyright (c) 2011 ARM Limited
+ * All rights reserved
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: Ali Saidi
+ */
+
+#ifndef __DEV_PS2_HH__
+#define __DEV_PS2_HH__
+
+#include <stdint.h>
+#include <list>
+
+#include "base/bitunion.hh"
+
+/** @file misc functions and constants required to interface with or
+ * emulate ps2 devices */
+
+namespace Ps2 {
+enum {
+    Ps2Reset        = 0xff,
+    SelfTestPass    = 0xaa,
+    SetStatusLed    = 0xed,
+    SetResolution   = 0xe8,
+    StatusRequest   = 0xe9,
+    SetScaling1_2   = 0xe7,
+    SetScaling1_1   = 0xe6,
+    ReadId          = 0xf2,
+    TpReadId        = 0xe1,
+    Ack             = 0xfa,
+    SetRate         = 0xf3,
+    Enable          = 0xf4,
+    Disable         = 0xf6,
+    KeyboardId      = 0xab,
+    TouchKitId      = 0x0a,
+    MouseId         = 0x00,
+};
+
+/** A bitfield that represents the first byte of a mouse movement packet
+ */
+BitUnion8(Ps2MouseMovement)
+    Bitfield<0> leftButton;
+    Bitfield<1> rightButton;
+    Bitfield<2> middleButton;
+    Bitfield<3> one;
+    Bitfield<4> xSign;
+    Bitfield<5> ySign;
+    Bitfield<6> xOverflow;
+    Bitfield<7> yOverflow;
+EndBitUnion(Ps2MouseMovement)
+
+/** Convert an x11 key symbol into a set of ps2 characters.
+ * @param key x11 key symbol
+ * @param down true if the key is being pressed, false if released
+ * @param cur_shift if device has already sent a shift
+ * @param keys list of key commands to send to emulate the x11 key symbol
+ */
+void keySymToPs2(uint32_t key, bool down, bool &cur_shift,
+        std::list<uint8_t> &keys);
+
+} /* namespace Ps2 */
+#endif // __DEV_PS2_HH__
diff --git a/src/mem/protocol/MESI_CMP_directory-L1cache.sm b/src/mem/protocol/MESI_CMP_directory-L1cache.sm
index 8744a71225..4442cee412 100644
--- a/src/mem/protocol/MESI_CMP_directory-L1cache.sm
+++ b/src/mem/protocol/MESI_CMP_directory-L1cache.sm
@@ -287,20 +287,21 @@ machine(L1Cache, "MSI Directory L1 Cache CMP")
         if (in_msg.Type == CacheRequestType:IFETCH) {
           // ** INSTRUCTION ACCESS ***
 
-          // Check to see if it is in the OTHER L1
-          Entry L1Dcache_entry := getL1DCacheEntry(in_msg.LineAddress);
-          if (is_valid(L1Dcache_entry)) {
-            // The block is in the wrong L1, put the request on the queue to the shared L2
-            trigger(Event:L1_Replacement, in_msg.LineAddress,
-                    L1Dcache_entry, L1_TBEs[in_msg.LineAddress]);
-          }
-
           Entry L1Icache_entry := getL1ICacheEntry(in_msg.LineAddress);
           if (is_valid(L1Icache_entry)) {
             // The tag matches for the L1, so the L1 asks the L2 for it.
             trigger(mandatory_request_type_to_event(in_msg.Type), in_msg.LineAddress,
                     L1Icache_entry, L1_TBEs[in_msg.LineAddress]);
           } else {
+
+            // Check to see if it is in the OTHER L1
+            Entry L1Dcache_entry := getL1DCacheEntry(in_msg.LineAddress);
+            if (is_valid(L1Dcache_entry)) {
+              // The block is in the wrong L1, put the request on the queue to the shared L2
+              trigger(Event:L1_Replacement, in_msg.LineAddress,
+                      L1Dcache_entry, L1_TBEs[in_msg.LineAddress]);
+            }
+
             if (L1IcacheMemory.cacheAvail(in_msg.LineAddress)) {
               // L1 does't have the line, but we have space for it in the L1 so let's see if the L2 has it
               trigger(mandatory_request_type_to_event(in_msg.Type), in_msg.LineAddress,
@@ -313,21 +314,23 @@ machine(L1Cache, "MSI Directory L1 Cache CMP")
             }
           }
         } else {
-          // *** DATA ACCESS ***
-          // Check to see if it is in the OTHER L1
-          Entry L1Icache_entry := getL1ICacheEntry(in_msg.LineAddress);
-          if (is_valid(L1Icache_entry)) {
-            // The block is in the wrong L1, put the request on the queue to the shared L2
-            trigger(Event:L1_Replacement, in_msg.LineAddress,
-                    L1Icache_entry, L1_TBEs[in_msg.LineAddress]);
-          }
 
+          // *** DATA ACCESS ***
           Entry L1Dcache_entry := getL1DCacheEntry(in_msg.LineAddress);
           if (is_valid(L1Dcache_entry)) {
             // The tag matches for the L1, so the L1 ask the L2 for it
             trigger(mandatory_request_type_to_event(in_msg.Type), in_msg.LineAddress,
                     L1Dcache_entry, L1_TBEs[in_msg.LineAddress]);
           } else {
+
+            // Check to see if it is in the OTHER L1
+            Entry L1Icache_entry := getL1ICacheEntry(in_msg.LineAddress);
+            if (is_valid(L1Icache_entry)) {
+              // The block is in the wrong L1, put the request on the queue to the shared L2
+              trigger(Event:L1_Replacement, in_msg.LineAddress,
+                      L1Icache_entry, L1_TBEs[in_msg.LineAddress]);
+            }
+
             if (L1DcacheMemory.cacheAvail(in_msg.LineAddress)) {
               // L1 does't have the line, but we have space for it in the L1 let's see if the L2 has it
               trigger(mandatory_request_type_to_event(in_msg.Type), in_msg.LineAddress,
diff --git a/src/mem/protocol/MOESI_CMP_directory-L1cache.sm b/src/mem/protocol/MOESI_CMP_directory-L1cache.sm
index 4082f23c9f..e590c952ad 100644
--- a/src/mem/protocol/MOESI_CMP_directory-L1cache.sm
+++ b/src/mem/protocol/MOESI_CMP_directory-L1cache.sm
@@ -44,7 +44,6 @@ machine(L1Cache, "Directory protocol")
   // From this node's L1 cache TO the network
   // a local L1 -> this L2 bank, currently ordered with directory forwarded requests
   MessageBuffer requestFromL1Cache, network="To", virtual_network="0", ordered="false";
-  MessageBuffer foo, network="To", virtual_network="1", ordered="false";
   // a local L1 -> this L2 bank
   MessageBuffer responseFromL1Cache, network="To", virtual_network="2", ordered="false";
 //  MessageBuffer writebackFromL1Cache, network="To", virtual_network="3", ordered="false";
@@ -53,7 +52,6 @@ machine(L1Cache, "Directory protocol")
   // To this node's L1 cache FROM the network
   // a L2 bank -> this L1
   MessageBuffer requestToL1Cache, network="From", virtual_network="0", ordered="false";
-  MessageBuffer goo, network="From", virtual_network="1", ordered="false";
   // a L2 bank -> this L1
   MessageBuffer responseToL1Cache, network="From", virtual_network="2", ordered="false";
 
@@ -229,7 +227,6 @@ machine(L1Cache, "Directory protocol")
   out_port(requestNetwork_out, RequestMsg, requestFromL1Cache);
   out_port(responseNetwork_out, ResponseMsg, responseFromL1Cache);
   out_port(triggerQueue_out, TriggerMsg, triggerQueue);
-  out_port(foo_out, ResponseMsg, foo);
 
   // ** IN_PORTS **
 
@@ -242,15 +239,6 @@ machine(L1Cache, "Directory protocol")
     }
   }
 
-
-  in_port(goo_in, RequestMsg, goo) {
-    if (goo_in.isReady()) {
-      peek(goo_in, RequestMsg) {
-        assert(false);
-      }
-    }
-  }
-
   // Trigger Queue
   in_port(triggerQueue_in, TriggerMsg, triggerQueue) {
     if (triggerQueue_in.isReady()) {
@@ -338,14 +326,6 @@ machine(L1Cache, "Directory protocol")
         if (in_msg.Type == CacheRequestType:IFETCH) {
           // ** INSTRUCTION ACCESS ***
 
-          Entry L1Dcache_entry := getL1DCacheEntry(in_msg.LineAddress);
-          // Check to see if it is in the OTHER L1
-          if (is_valid(L1Dcache_entry)) {
-            // The block is in the wrong L1, put the request on the queue to the shared L2
-            trigger(Event:L1_Replacement, in_msg.LineAddress, L1Dcache_entry,
-                    TBEs[in_msg.LineAddress]);
-          }
-
           Entry L1Icache_entry := getL1ICacheEntry(in_msg.LineAddress);
           if (is_valid(L1Icache_entry)) {
             // The tag matches for the L1, so the L1 asks the L2 for it.
@@ -353,6 +333,14 @@ machine(L1Cache, "Directory protocol")
                     in_msg.LineAddress, L1Icache_entry,
                     TBEs[in_msg.LineAddress]);
           } else {
+
+            Entry L1Dcache_entry := getL1DCacheEntry(in_msg.LineAddress);
+            // Check to see if it is in the OTHER L1
+            if (is_valid(L1Dcache_entry)) {
+              // The block is in the wrong L1, put the request on the queue to the shared L2
+              trigger(Event:L1_Replacement, in_msg.LineAddress, L1Dcache_entry,
+                      TBEs[in_msg.LineAddress]);
+            }
             if (L1IcacheMemory.cacheAvail(in_msg.LineAddress)) {
               // L1 does't have the line, but we have space for it in the L1 so let's see if the L2 has it
               trigger(mandatory_request_type_to_event(in_msg.Type),
@@ -369,14 +357,6 @@ machine(L1Cache, "Directory protocol")
         } else {
           // *** DATA ACCESS ***
 
-          Entry L1Icache_entry := getL1ICacheEntry(in_msg.LineAddress);
-          // Check to see if it is in the OTHER L1
-          if (is_valid(L1Icache_entry)) {
-            // The block is in the wrong L1, put the request on the queue to the shared L2
-            trigger(Event:L1_Replacement, in_msg.LineAddress,
-                    L1Icache_entry, TBEs[in_msg.LineAddress]);
-          }
-
           Entry L1Dcache_entry := getL1DCacheEntry(in_msg.LineAddress);
           if (is_valid(L1Dcache_entry)) {
             // The tag matches for the L1, so the L1 ask the L2 for it
@@ -384,6 +364,14 @@ machine(L1Cache, "Directory protocol")
                     in_msg.LineAddress, L1Dcache_entry,
                     TBEs[in_msg.LineAddress]);
           } else {
+
+            Entry L1Icache_entry := getL1ICacheEntry(in_msg.LineAddress);
+            // Check to see if it is in the OTHER L1
+            if (is_valid(L1Icache_entry)) {
+              // The block is in the wrong L1, put the request on the queue to the shared L2
+              trigger(Event:L1_Replacement, in_msg.LineAddress,
+                      L1Icache_entry, TBEs[in_msg.LineAddress]);
+            }
             if (L1DcacheMemory.cacheAvail(in_msg.LineAddress)) {
               // L1 does't have the line, but we have space for it in the L1 let's see if the L2 has it
               trigger(mandatory_request_type_to_event(in_msg.Type),
@@ -411,6 +399,7 @@ machine(L1Cache, "Directory protocol")
         out_msg.Address := address;
         out_msg.Type := CoherenceRequestType:GETS;
         out_msg.Requestor := machineID;
+        out_msg.RequestorMachine := MachineType:L1Cache;
         out_msg.Destination.add(mapAddressToRange(address, MachineType:L2Cache, 
               l2_select_low_bit, l2_select_num_bits));
         out_msg.MessageSize := MessageSizeType:Request_Control;
@@ -455,6 +444,7 @@ machine(L1Cache, "Directory protocol")
       out_msg.Address := address;
       out_msg.Type := CoherenceRequestType:PUTO;
       out_msg.Requestor := machineID;
+      out_msg.RequestorMachine := MachineType:L1Cache;
       out_msg.Destination.add(mapAddressToRange(address, MachineType:L2Cache, 
             l2_select_low_bit, l2_select_num_bits));
       out_msg.MessageSize := MessageSizeType:Writeback_Control;
@@ -467,6 +457,7 @@ machine(L1Cache, "Directory protocol")
       out_msg.Address := address;
       out_msg.Type := CoherenceRequestType:PUTS;
       out_msg.Requestor := machineID;
+      out_msg.RequestorMachine := MachineType:L1Cache;
       out_msg.Destination.add(mapAddressToRange(address, MachineType:L2Cache, 
             l2_select_low_bit, l2_select_num_bits));
       out_msg.MessageSize := MessageSizeType:Writeback_Control;
@@ -481,6 +472,7 @@ machine(L1Cache, "Directory protocol")
           out_msg.Address := address;
           out_msg.Type := CoherenceResponseType:DATA;
           out_msg.Sender := machineID;
+          out_msg.SenderMachine := MachineType:L1Cache;
           out_msg.Destination.add(mapAddressToRange(address, MachineType:L2Cache, 
                 l2_select_low_bit, l2_select_num_bits));
           out_msg.DataBlk := cache_entry.DataBlk;
@@ -496,6 +488,7 @@ machine(L1Cache, "Directory protocol")
           out_msg.Address := address;
           out_msg.Type := CoherenceResponseType:DATA;
           out_msg.Sender := machineID;
+          out_msg.SenderMachine := MachineType:L1Cache;
           out_msg.Destination.add(in_msg.Requestor);
           out_msg.DataBlk := cache_entry.DataBlk;
           // out_msg.Dirty := cache_entry.Dirty;
@@ -514,6 +507,7 @@ machine(L1Cache, "Directory protocol")
       out_msg.Address := address;
       out_msg.Type := CoherenceResponseType:DATA;
       out_msg.Sender := machineID;
+      out_msg.SenderMachine := MachineType:L1Cache;
       out_msg.Destination.add(mapAddressToRange(address, MachineType:L2Cache, 
             l2_select_low_bit, l2_select_num_bits));
       out_msg.DataBlk := cache_entry.DataBlk;
@@ -592,6 +586,7 @@ machine(L1Cache, "Directory protocol")
       out_msg.Address := address;
       out_msg.Type := CoherenceResponseType:UNBLOCK;
       out_msg.Sender := machineID;
+      out_msg.SenderMachine := MachineType:L1Cache;
       out_msg.Destination.add(mapAddressToRange(address, MachineType:L2Cache, 
             l2_select_low_bit, l2_select_num_bits));
       out_msg.MessageSize := MessageSizeType:Unblock_Control;
@@ -690,6 +685,7 @@ machine(L1Cache, "Directory protocol")
           out_msg.Address := address;
           out_msg.Type := CoherenceResponseType:DATA;
           out_msg.Sender := machineID;
+          out_msg.SenderMachine := MachineType:L1Cache;
           out_msg.Destination.add(in_msg.Requestor);
           out_msg.DataBlk := tbe.DataBlk;
           // out_msg.Dirty := tbe.Dirty;
@@ -703,6 +699,7 @@ machine(L1Cache, "Directory protocol")
           out_msg.Address := address;
           out_msg.Type := CoherenceResponseType:DATA;
           out_msg.Sender := machineID;
+          out_msg.SenderMachine := MachineType:L1Cache;
           out_msg.Destination.add(mapAddressToRange(address, MachineType:L2Cache, 
                 l2_select_low_bit, l2_select_num_bits));
           out_msg.DataBlk := tbe.DataBlk;
@@ -723,6 +720,7 @@ machine(L1Cache, "Directory protocol")
           out_msg.Address := address;
           out_msg.Type := CoherenceResponseType:DATA_EXCLUSIVE;
           out_msg.Sender := machineID;
+          out_msg.SenderMachine := MachineType:L1Cache;
           out_msg.Destination.add(in_msg.Requestor);
           out_msg.DataBlk := tbe.DataBlk;
           out_msg.Dirty := tbe.Dirty;
@@ -735,6 +733,7 @@ machine(L1Cache, "Directory protocol")
           out_msg.Address := address;
           out_msg.Type := CoherenceResponseType:DATA_EXCLUSIVE;
           out_msg.Sender := machineID;
+          out_msg.SenderMachine := MachineType:L1Cache;
           out_msg.Destination.add(mapAddressToRange(address, MachineType:L2Cache, 
                 l2_select_low_bit, l2_select_num_bits));
           out_msg.DataBlk := tbe.DataBlk;
diff --git a/src/mem/protocol/MOESI_CMP_token-L1cache.sm b/src/mem/protocol/MOESI_CMP_token-L1cache.sm
index 00e9404c9b..226f213744 100644
--- a/src/mem/protocol/MOESI_CMP_token-L1cache.sm
+++ b/src/mem/protocol/MOESI_CMP_token-L1cache.sm
@@ -647,20 +647,21 @@ machine(L1Cache, "Token protocol")
         if (in_msg.Type == CacheRequestType:IFETCH) {
           // ** INSTRUCTION ACCESS ***
 
-          // Check to see if it is in the OTHER L1
-          Entry L1Dcache_entry := getL1DCacheEntry(in_msg.LineAddress);
-          if (is_valid(L1Dcache_entry)) {
-            // The block is in the wrong L1, try to write it to the L2
-              trigger(Event:L1_Replacement, in_msg.LineAddress,
-                      L1Dcache_entry, tbe);
-          }
-
           Entry L1Icache_entry := getL1ICacheEntry(in_msg.LineAddress);
           if (is_valid(L1Icache_entry)) {
             // The tag matches for the L1, so the L1 fetches the line.  We know it can't be in the L2 due to exclusion
             trigger(mandatory_request_type_to_event(in_msg.Type),
                     in_msg.LineAddress, L1Icache_entry, tbe);
           } else {
+
+            // Check to see if it is in the OTHER L1
+            Entry L1Dcache_entry := getL1DCacheEntry(in_msg.LineAddress);
+            if (is_valid(L1Dcache_entry)) {
+              // The block is in the wrong L1, try to write it to the L2
+              trigger(Event:L1_Replacement, in_msg.LineAddress,
+                      L1Dcache_entry, tbe);
+            }
+
             if (L1IcacheMemory.cacheAvail(in_msg.LineAddress)) {
               // L1 does't have the line, but we have space for it in the L1
               trigger(mandatory_request_type_to_event(in_msg.Type),
@@ -676,21 +677,21 @@ machine(L1Cache, "Token protocol")
         } else {
           // *** DATA ACCESS ***
 
-            // Check to see if it is in the OTHER L1
-          Entry L1Icache_entry := getL1ICacheEntry(in_msg.LineAddress);
-
-          if (is_valid(L1Icache_entry)) {
-            // The block is in the wrong L1, try to write it to the L2
-            trigger(Event:L1_Replacement, in_msg.LineAddress,
-                    L1Icache_entry, tbe);
-          }
-
           Entry L1Dcache_entry := getL1DCacheEntry(in_msg.LineAddress);
           if (is_valid(L1Dcache_entry)) {
             // The tag matches for the L1, so the L1 fetches the line.  We know it can't be in the L2 due to exclusion
             trigger(mandatory_request_type_to_event(in_msg.Type),
                     in_msg.LineAddress, L1Dcache_entry, tbe);
           } else {
+
+            // Check to see if it is in the OTHER L1
+            Entry L1Icache_entry := getL1ICacheEntry(in_msg.LineAddress);
+            if (is_valid(L1Icache_entry)) {
+              // The block is in the wrong L1, try to write it to the L2
+              trigger(Event:L1_Replacement, in_msg.LineAddress,
+                      L1Icache_entry, tbe);
+            }
+
             if (L1DcacheMemory.cacheAvail(in_msg.LineAddress)) {
               // L1 does't have the line, but we have space for it in the L1
               trigger(mandatory_request_type_to_event(in_msg.Type),
diff --git a/src/mem/protocol/MOESI_hammer-cache.sm b/src/mem/protocol/MOESI_hammer-cache.sm
index 26598f5418..f9d5ffcab2 100644
--- a/src/mem/protocol/MOESI_hammer-cache.sm
+++ b/src/mem/protocol/MOESI_hammer-cache.sm
@@ -377,26 +377,26 @@ machine(L1Cache, "AMD Hammer-like protocol")
         if (in_msg.Type == CacheRequestType:IFETCH) {
           // ** INSTRUCTION ACCESS ***
 
-          // Check to see if it is in the OTHER L1
-          Entry L1Dcache_entry := getL1DCacheEntry(in_msg.LineAddress);
-          if (is_valid(L1Dcache_entry)) {
-            // The block is in the wrong L1, try to write it to the L2
-            if (L2cacheMemory.cacheAvail(in_msg.LineAddress)) {
-              trigger(Event:L1_to_L2, in_msg.LineAddress, L1Dcache_entry, tbe);
-            } else {
-              trigger(Event:L2_Replacement,
-                      L2cacheMemory.cacheProbe(in_msg.LineAddress),
-                      getL2CacheEntry(L2cacheMemory.cacheProbe(in_msg.LineAddress)),
-                      TBEs[L2cacheMemory.cacheProbe(in_msg.LineAddress)]);
-            }
-          }
-
           Entry L1Icache_entry := getL1ICacheEntry(in_msg.LineAddress);
           if (is_valid(L1Icache_entry)) {
             // The tag matches for the L1, so the L1 fetches the line.  We know it can't be in the L2 due to exclusion
             trigger(mandatory_request_type_to_event(in_msg.Type),
                     in_msg.LineAddress, L1Icache_entry, tbe);
           } else {
+            // Check to see if it is in the OTHER L1
+            Entry L1Dcache_entry := getL1DCacheEntry(in_msg.LineAddress);
+            if (is_valid(L1Dcache_entry)) {
+              // The block is in the wrong L1, try to write it to the L2
+              if (L2cacheMemory.cacheAvail(in_msg.LineAddress)) {
+                trigger(Event:L1_to_L2, in_msg.LineAddress, L1Dcache_entry, tbe);
+              } else {
+                trigger(Event:L2_Replacement,
+                        L2cacheMemory.cacheProbe(in_msg.LineAddress),
+                        getL2CacheEntry(L2cacheMemory.cacheProbe(in_msg.LineAddress)),
+                        TBEs[L2cacheMemory.cacheProbe(in_msg.LineAddress)]);
+              }
+            }
+
             if (L1IcacheMemory.cacheAvail(in_msg.LineAddress)) {
               // L1 does't have the line, but we have space for it in the L1
 
@@ -430,26 +430,27 @@ machine(L1Cache, "AMD Hammer-like protocol")
         } else {
           // *** DATA ACCESS ***
 
-          // Check to see if it is in the OTHER L1
-          Entry L1Icache_entry := getL1ICacheEntry(in_msg.LineAddress);
-          if (is_valid(L1Icache_entry)) {
-            // The block is in the wrong L1, try to write it to the L2
-            if (L2cacheMemory.cacheAvail(in_msg.LineAddress)) {
-              trigger(Event:L1_to_L2, in_msg.LineAddress, L1Icache_entry, tbe);
-            } else {
-              trigger(Event:L2_Replacement,
-                      L2cacheMemory.cacheProbe(in_msg.LineAddress),
-                      getL2CacheEntry(L2cacheMemory.cacheProbe(in_msg.LineAddress)),
-                      TBEs[L2cacheMemory.cacheProbe(in_msg.LineAddress)]);
-            }
-          }
-
           Entry L1Dcache_entry := getL1DCacheEntry(in_msg.LineAddress);
           if (is_valid(L1Dcache_entry)) {
             // The tag matches for the L1, so the L1 fetches the line.  We know it can't be in the L2 due to exclusion
             trigger(mandatory_request_type_to_event(in_msg.Type),
                     in_msg.LineAddress, L1Dcache_entry, tbe);
           } else {
+
+            // Check to see if it is in the OTHER L1
+            Entry L1Icache_entry := getL1ICacheEntry(in_msg.LineAddress);
+            if (is_valid(L1Icache_entry)) {
+              // The block is in the wrong L1, try to write it to the L2
+              if (L2cacheMemory.cacheAvail(in_msg.LineAddress)) {
+                trigger(Event:L1_to_L2, in_msg.LineAddress, L1Icache_entry, tbe);
+              } else {
+                trigger(Event:L2_Replacement,
+                        L2cacheMemory.cacheProbe(in_msg.LineAddress),
+                        getL2CacheEntry(L2cacheMemory.cacheProbe(in_msg.LineAddress)),
+                        TBEs[L2cacheMemory.cacheProbe(in_msg.LineAddress)]);
+              }
+            }
+
             if (L1DcacheMemory.cacheAvail(in_msg.LineAddress)) {
               // L1 does't have the line, but we have space for it in the L1
               Entry L2cache_entry := getL2CacheEntry(in_msg.LineAddress);
diff --git a/src/mem/ruby/buffers/MessageBuffer.cc b/src/mem/ruby/buffers/MessageBuffer.cc
index f6b79c5803..2255950053 100644
--- a/src/mem/ruby/buffers/MessageBuffer.cc
+++ b/src/mem/ruby/buffers/MessageBuffer.cc
@@ -58,6 +58,8 @@ MessageBuffer::MessageBuffer(const string &name)
     m_name = name;
 
     m_stall_msg_map.clear();
+    m_input_link_id = 0;
+    m_vnet_id = 0;
 }
 
 int
@@ -228,6 +230,7 @@ MessageBuffer::enqueue(MsgPtr message, Time delta)
     // Schedule the wakeup
     if (m_consumer_ptr != NULL) {
         g_eventQueue_ptr->scheduleEventAbsolute(m_consumer_ptr, arrival_time);
+        m_consumer_ptr->storeEventInfo(m_vnet_id);
     } else {
         panic("No consumer: %s name: %s\n", *this, m_name);
     }
diff --git a/src/mem/ruby/buffers/MessageBuffer.hh b/src/mem/ruby/buffers/MessageBuffer.hh
index 62cc656701..88df5b788e 100644
--- a/src/mem/ruby/buffers/MessageBuffer.hh
+++ b/src/mem/ruby/buffers/MessageBuffer.hh
@@ -142,6 +142,9 @@ class MessageBuffer
     void printStats(std::ostream& out);
     void clearStats() { m_not_avail_count = 0; m_msg_counter = 0; }
 
+    void setIncomingLink(int link_id) { m_input_link_id = link_id; }
+    void setVnet(int net) { m_vnet_id = net; }
+
   private:
     //added by SS
     int m_recycle_latency;
@@ -184,6 +187,9 @@ class MessageBuffer
     bool m_ordering_set;
     bool m_randomization;
     Time m_last_arrival_time;
+
+    int m_input_link_id;
+    int m_vnet_id;
 };
 
 inline std::ostream&
diff --git a/src/mem/ruby/common/Consumer.hh b/src/mem/ruby/common/Consumer.hh
index c1f8bc42e6..a119abb390 100644
--- a/src/mem/ruby/common/Consumer.hh
+++ b/src/mem/ruby/common/Consumer.hh
@@ -67,6 +67,7 @@ class Consumer
 
     virtual void wakeup() = 0;
     virtual void print(std::ostream& out) const = 0;
+    virtual void storeEventInfo(int info) {}
 
     const Time&
     getLastScheduledWakeup() const
diff --git a/src/mem/ruby/network/simple/PerfectSwitch.cc b/src/mem/ruby/network/simple/PerfectSwitch.cc
index 7229c724f0..5c461c63fe 100644
--- a/src/mem/ruby/network/simple/PerfectSwitch.cc
+++ b/src/mem/ruby/network/simple/PerfectSwitch.cc
@@ -54,6 +54,11 @@ PerfectSwitch::PerfectSwitch(SwitchID sid, SimpleNetwork* network_ptr)
     m_round_robin_start = 0;
     m_network_ptr = network_ptr;
     m_wakeups_wo_switch = 0;
+
+    for (int i = 0; i < m_virtual_networks; ++i)
+    {
+        m_pending_message_count.push_back(0);
+    }
 }
 
 void
@@ -62,12 +67,15 @@ PerfectSwitch::addInPort(const vector<MessageBuffer*>& in)
     assert(in.size() == m_virtual_networks);
     NodeID port = m_in.size();
     m_in.push_back(in);
+
     for (int j = 0; j < m_virtual_networks; j++) {
         m_in[port][j]->setConsumer(this);
         string desc = csprintf("[Queue from port %s %s %s to PerfectSwitch]",
             NodeIDToString(m_switch_id), NodeIDToString(port),
             NodeIDToString(j));
         m_in[port][j]->setDescription(desc);
+        m_in[port][j]->setIncomingLink(port);
+        m_in[port][j]->setVnet(j);
     }
 }
 
@@ -154,160 +162,169 @@ PerfectSwitch::wakeup()
             m_round_robin_start = 0;
         }
 
-        // for all input ports, use round robin scheduling
-        for (int counter = 0; counter < m_in.size(); counter++) {
-            // Round robin scheduling
-            incoming++;
-            if (incoming >= m_in.size()) {
-                incoming = 0;
-            }
-
-            // temporary vectors to store the routing results
-            vector<LinkID> output_links;
-            vector<NetDest> output_link_destinations;
-
-            // Is there a message waiting?
-            while (m_in[incoming][vnet]->isReady()) {
-                DPRINTF(RubyNetwork, "incoming: %d\n", incoming);
-
-                // Peek at message
-                msg_ptr = m_in[incoming][vnet]->peekMsgPtr();
-                net_msg_ptr = safe_cast<NetworkMessage*>(msg_ptr.get());
-                DPRINTF(RubyNetwork, "Message: %s\n", (*net_msg_ptr));
-
-                output_links.clear();
-                output_link_destinations.clear();
-                NetDest msg_dsts =
-                    net_msg_ptr->getInternalDestination();
-
-                // Unfortunately, the token-protocol sends some
-                // zero-destination messages, so this assert isn't valid
-                // assert(msg_dsts.count() > 0);
-
-                assert(m_link_order.size() == m_routing_table.size());
-                assert(m_link_order.size() == m_out.size());
-
-                if (m_network_ptr->getAdaptiveRouting()) {
-                    if (m_network_ptr->isVNetOrdered(vnet)) {
-                        // Don't adaptively route
-                        for (int out = 0; out < m_out.size(); out++) {
-                            m_link_order[out].m_link = out;
-                            m_link_order[out].m_value = 0;
-                        }
-                    } else {
-                        // Find how clogged each link is
-                        for (int out = 0; out < m_out.size(); out++) {
-                            int out_queue_length = 0;
-                            for (int v = 0; v < m_virtual_networks; v++) {
-                                out_queue_length += m_out[out][v]->getSize();
-                            }
-                            int value =
-                                (out_queue_length << 8) | (random() & 0xff);
-                            m_link_order[out].m_link = out;
-                            m_link_order[out].m_value = value;
-                        }
-
-                        // Look at the most empty link first
-                        sort(m_link_order.begin(), m_link_order.end());
-                    }
+        if (m_pending_message_count[vnet] > 0) {
+            // for all input ports, use round robin scheduling
+            for (int counter = 0; counter < m_in.size(); counter++) {
+                // Round robin scheduling
+                incoming++;
+                if (incoming >= m_in.size()) {
+                    incoming = 0;
                 }
 
-                for (int i = 0; i < m_routing_table.size(); i++) {
-                    // pick the next link to look at
-                    int link = m_link_order[i].m_link;
-                    NetDest dst = m_routing_table[link];
-                    DPRINTF(RubyNetwork, "dst: %s\n", dst);
+                // temporary vectors to store the routing results
+                vector<LinkID> output_links;
+                vector<NetDest> output_link_destinations;
 
-                    if (!msg_dsts.intersectionIsNotEmpty(dst))
-                        continue;
+                // Is there a message waiting?
+                while (m_in[incoming][vnet]->isReady()) {
+                    DPRINTF(RubyNetwork, "incoming: %d\n", incoming);
 
-                    // Remember what link we're using
-                    output_links.push_back(link);
-
-                    // Need to remember which destinations need this
-                    // message in another vector.  This Set is the
-                    // intersection of the routing_table entry and the
-                    // current destination set.  The intersection must
-                    // not be empty, since we are inside "if"
-                    output_link_destinations.push_back(msg_dsts.AND(dst));
-
-                    // Next, we update the msg_destination not to
-                    // include those nodes that were already handled
-                    // by this link
-                    msg_dsts.removeNetDest(dst);
-                }
-
-                assert(msg_dsts.count() == 0);
-                //assert(output_links.size() > 0);
-
-                // Check for resources - for all outgoing queues
-                bool enough = true;
-                for (int i = 0; i < output_links.size(); i++) {
-                    int outgoing = output_links[i];
-                    if (!m_out[outgoing][vnet]->areNSlotsAvailable(1))
-                        enough = false;
-                    DPRINTF(RubyNetwork, "Checking if node is blocked\n"
-                            "outgoing: %d, vnet: %d, enough: %d\n",
-                            outgoing, vnet, enough);
-                }
-
-                // There were not enough resources
-                if (!enough) {
-                    g_eventQueue_ptr->scheduleEvent(this, 1);
-                    DPRINTF(RubyNetwork, "Can't deliver message since a node "
-                            "is blocked\n"
-                            "Message: %s\n", (*net_msg_ptr));
-                    break; // go to next incoming port
-                }
-
-                MsgPtr unmodified_msg_ptr;
-
-                if (output_links.size() > 1) {
-                    // If we are sending this message down more than
-                    // one link (size>1), we need to make a copy of
-                    // the message so each branch can have a different
-                    // internal destination we need to create an
-                    // unmodified MsgPtr because the MessageBuffer
-                    // enqueue func will modify the message
-
-                    // This magic line creates a private copy of the
-                    // message
-                    unmodified_msg_ptr = msg_ptr->clone();
-                }
-
-                // Enqueue it - for all outgoing queues
-                for (int i=0; i<output_links.size(); i++) {
-                    int outgoing = output_links[i];
-
-                    if (i > 0) {
-                        // create a private copy of the unmodified
-                        // message
-                        msg_ptr = unmodified_msg_ptr->clone();
-                    }
-
-                    // Change the internal destination set of the
-                    // message so it knows which destinations this
-                    // link is responsible for.
+                    // Peek at message
+                    msg_ptr = m_in[incoming][vnet]->peekMsgPtr();
                     net_msg_ptr = safe_cast<NetworkMessage*>(msg_ptr.get());
-                    net_msg_ptr->getInternalDestination() =
-                        output_link_destinations[i];
+                    DPRINTF(RubyNetwork, "Message: %s\n", (*net_msg_ptr));
 
-                    // Enqeue msg
-                    DPRINTF(RubyNetwork, "Switch: %d enqueuing net msg from "
-                            "inport[%d][%d] to outport [%d][%d] time: %lld.\n",
-                            m_switch_id, incoming, vnet, outgoing, vnet,
-                            g_eventQueue_ptr->getTime());
+                    output_links.clear();
+                    output_link_destinations.clear();
+                    NetDest msg_dsts =
+                        net_msg_ptr->getInternalDestination();
 
-                    m_out[outgoing][vnet]->enqueue(msg_ptr);
+                    // Unfortunately, the token-protocol sends some
+                    // zero-destination messages, so this assert isn't valid
+                    // assert(msg_dsts.count() > 0);
+
+                    assert(m_link_order.size() == m_routing_table.size());
+                    assert(m_link_order.size() == m_out.size());
+
+                    if (m_network_ptr->getAdaptiveRouting()) {
+                        if (m_network_ptr->isVNetOrdered(vnet)) {
+                            // Don't adaptively route
+                            for (int out = 0; out < m_out.size(); out++) {
+                                m_link_order[out].m_link = out;
+                                m_link_order[out].m_value = 0;
+                            }
+                        } else {
+                            // Find how clogged each link is
+                            for (int out = 0; out < m_out.size(); out++) {
+                                int out_queue_length = 0;
+                                for (int v = 0; v < m_virtual_networks; v++) {
+                                    out_queue_length += m_out[out][v]->getSize();
+                                }
+                                int value =
+                                    (out_queue_length << 8) | (random() & 0xff);
+                                m_link_order[out].m_link = out;
+                                m_link_order[out].m_value = value;
+                            }
+
+                            // Look at the most empty link first
+                            sort(m_link_order.begin(), m_link_order.end());
+                        }
+                    }
+
+                    for (int i = 0; i < m_routing_table.size(); i++) {
+                        // pick the next link to look at
+                        int link = m_link_order[i].m_link;
+                        NetDest dst = m_routing_table[link];
+                        DPRINTF(RubyNetwork, "dst: %s\n", dst);
+
+                        if (!msg_dsts.intersectionIsNotEmpty(dst))
+                            continue;
+
+                        // Remember what link we're using
+                        output_links.push_back(link);
+
+                        // Need to remember which destinations need this
+                        // message in another vector.  This Set is the
+                        // intersection of the routing_table entry and the
+                        // current destination set.  The intersection must
+                        // not be empty, since we are inside "if"
+                        output_link_destinations.push_back(msg_dsts.AND(dst));
+
+                        // Next, we update the msg_destination not to
+                        // include those nodes that were already handled
+                        // by this link
+                        msg_dsts.removeNetDest(dst);
+                    }
+
+                    assert(msg_dsts.count() == 0);
+                    //assert(output_links.size() > 0);
+
+                    // Check for resources - for all outgoing queues
+                    bool enough = true;
+                    for (int i = 0; i < output_links.size(); i++) {
+                        int outgoing = output_links[i];
+                        if (!m_out[outgoing][vnet]->areNSlotsAvailable(1))
+                            enough = false;
+                        DPRINTF(RubyNetwork, "Checking if node is blocked\n"
+                                "outgoing: %d, vnet: %d, enough: %d\n",
+                                outgoing, vnet, enough);
+                    }
+
+                    // There were not enough resources
+                    if (!enough) {
+                        g_eventQueue_ptr->scheduleEvent(this, 1);
+                        DPRINTF(RubyNetwork, "Can't deliver message since a node "
+                                "is blocked\n"
+                                "Message: %s\n", (*net_msg_ptr));
+                        break; // go to next incoming port
+                    }
+
+                    MsgPtr unmodified_msg_ptr;
+
+                    if (output_links.size() > 1) {
+                        // If we are sending this message down more than
+                        // one link (size>1), we need to make a copy of
+                        // the message so each branch can have a different
+                        // internal destination we need to create an
+                        // unmodified MsgPtr because the MessageBuffer
+                        // enqueue func will modify the message
+
+                        // This magic line creates a private copy of the
+                        // message
+                        unmodified_msg_ptr = msg_ptr->clone();
+                    }
+
+                    // Enqueue it - for all outgoing queues
+                    for (int i=0; i<output_links.size(); i++) {
+                        int outgoing = output_links[i];
+
+                        if (i > 0) {
+                            // create a private copy of the unmodified
+                            // message
+                            msg_ptr = unmodified_msg_ptr->clone();
+                        }
+
+                        // Change the internal destination set of the
+                        // message so it knows which destinations this
+                        // link is responsible for.
+                        net_msg_ptr = safe_cast<NetworkMessage*>(msg_ptr.get());
+                        net_msg_ptr->getInternalDestination() =
+                            output_link_destinations[i];
+
+                        // Enqueue msg
+                        DPRINTF(RubyNetwork, "Switch: %d enqueuing net msg from "
+                                "inport[%d][%d] to outport [%d][%d] time: %lld.\n",
+                                m_switch_id, incoming, vnet, outgoing, vnet,
+                                g_eventQueue_ptr->getTime());
+
+                        m_out[outgoing][vnet]->enqueue(msg_ptr);
+                    }
+
+                    // Dequeue msg
+                    m_in[incoming][vnet]->pop();
+                    m_pending_message_count[vnet]--;
                 }
-
-                // Dequeue msg
-                m_in[incoming][vnet]->pop();
             }
         }
     }
 }
 
+void
+PerfectSwitch::storeEventInfo(int info)
+{
+    m_pending_message_count[info]++;
+}
+
 void
 PerfectSwitch::printStats(std::ostream& out) const
 {
diff --git a/src/mem/ruby/network/simple/PerfectSwitch.hh b/src/mem/ruby/network/simple/PerfectSwitch.hh
index a7e577df01..cd0219fd9f 100644
--- a/src/mem/ruby/network/simple/PerfectSwitch.hh
+++ b/src/mem/ruby/network/simple/PerfectSwitch.hh
@@ -69,6 +69,7 @@ class PerfectSwitch : public Consumer
     int getOutLinks() const { return m_out.size(); }
 
     void wakeup();
+    void storeEventInfo(int info);
 
     void printStats(std::ostream& out) const;
     void clearStats();
@@ -92,6 +93,7 @@ class PerfectSwitch : public Consumer
     int m_round_robin_start;
     int m_wakeups_wo_switch;
     SimpleNetwork* m_network_ptr;
+    std::vector<int> m_pending_message_count;
 };
 
 inline std::ostream&
diff --git a/src/mem/ruby/slicc_interface/Message.hh b/src/mem/ruby/slicc_interface/Message.hh
index ff94fdd409..7fcfabe9ca 100644
--- a/src/mem/ruby/slicc_interface/Message.hh
+++ b/src/mem/ruby/slicc_interface/Message.hh
@@ -57,6 +57,8 @@ class Message : public RefCounted
 
     virtual Message* clone() const = 0;
     virtual void print(std::ostream& out) const = 0;
+    virtual void setIncomingLink(int) {}
+    virtual void setVnet(int) {}
 
     void setDelayedCycles(const int& cycles) { m_DelayedCycles = cycles; }
     const int& getDelayedCycles() const {return m_DelayedCycles;}
diff --git a/src/mem/ruby/slicc_interface/NetworkMessage.hh b/src/mem/ruby/slicc_interface/NetworkMessage.hh
index 082481e054..a8f9c625b3 100644
--- a/src/mem/ruby/slicc_interface/NetworkMessage.hh
+++ b/src/mem/ruby/slicc_interface/NetworkMessage.hh
@@ -82,9 +82,16 @@ class NetworkMessage : public Message
 
     virtual void print(std::ostream& out) const = 0;
 
+    int getIncomingLink() const { return incoming_link; }
+    void setIncomingLink(int link) { incoming_link = link; }
+    int getVnet() const { return vnet; }
+    void setVnet(int net) { vnet = net; }
+
   private:
     NetDest m_internal_dest;
     bool m_internal_dest_valid;
+    int incoming_link;
+    int vnet;
 };
 
 inline std::ostream&
diff --git a/src/python/m5/main.py b/src/python/m5/main.py
index cd139ccb39..23a012166a 100644
--- a/src/python/m5/main.py
+++ b/src/python/m5/main.py
@@ -61,8 +61,6 @@ add_option('-C', "--copyright", action="store_true", default=False,
     help="Show full copyright information")
 add_option('-R', "--readme", action="store_true", default=False,
     help="Show the readme")
-add_option('-N', "--release-notes", action="store_true", default=False,
-    help="Show the release notes")
 
 # Options for configuring the base simulator
 add_option('-d', "--outdir", metavar="DIR", default="m5out",
@@ -207,13 +205,6 @@ def main():
         print info.README
         print
 
-    if options.release_notes:
-        done = True
-        print 'Release Notes:'
-        print
-        print info.RELEASE_NOTES
-        print
-
     if options.trace_help:
         done = True
         check_tracing()
diff --git a/src/sim/root.cc b/src/sim/root.cc
index 1dc9b60586..d51fcbda60 100644
--- a/src/sim/root.cc
+++ b/src/sim/root.cc
@@ -108,7 +108,18 @@ Root::Root(RootParams *p) : SimObject(p), _enabled(false),
     assert(_root == NULL);
     _root = this;
     lastTime.setTimer();
-    timeSyncEnable(p->time_sync_enable);
+}
+
+void
+Root::initState()
+{
+    timeSyncEnable(params()->time_sync_enable);
+}
+
+void
+Root::loadState(Checkpoint *cp)
+{
+    timeSyncEnable(params()->time_sync_enable);
 }
 
 Root *
diff --git a/src/sim/root.hh b/src/sim/root.hh
index 2beced9d4a..76a508c194 100644
--- a/src/sim/root.hh
+++ b/src/sim/root.hh
@@ -95,7 +95,22 @@ class Root : public SimObject
     /// Set the threshold for time remaining to spin wait.
     void timeSyncSpinThreshold(Time newThreshold);
 
-    Root(RootParams *p);
+    typedef RootParams Params;
+    const Params *
+    params() const
+    {
+        return dynamic_cast<const Params *>(_params);
+    }
+
+    Root(Params *p);
+
+    /** Schedule the timesync event at loadState() so that curTick is correct
+     */
+    void loadState(Checkpoint *cp);
+
+    /** Schedule the timesync event at initState() when not unserializing
+     */
+    void initState();
 };
 
 #endif // __SIM_ROOT_HH__
diff --git a/src/sim/serialize.cc b/src/sim/serialize.cc
index d28f335beb..44fe7b2e7f 100644
--- a/src/sim/serialize.cc
+++ b/src/sim/serialize.cc
@@ -201,6 +201,23 @@ arrayParamOut(ostream &os, const string &name, const vector<T> &param)
     os << "\n";
 }
 
+template <class T>
+void
+arrayParamOut(ostream &os, const string &name, const list<T> &param)
+{
+    typename list<T>::const_iterator it = param.begin();
+
+    os << name << "=";
+    if (it != param.end()) {
+        showParam(os, *it++);
+    }
+    while (it != param.end()) {
+        os << " ";
+        showParam(os, *it);
+        it++;
+    }
+    os << "\n";
+}
 
 template <class T>
 void
@@ -326,6 +343,37 @@ arrayParamIn(Checkpoint *cp, const string &section,
     }
 }
 
+template <class T>
+void
+arrayParamIn(Checkpoint *cp, const string &section,
+             const string &name, list<T> &param)
+{
+    string str;
+    if (!cp->find(section, name, str)) {
+        fatal("Can't unserialize '%s:%s'\n", section, name);
+    }
+    param.clear();
+
+    vector<string> tokens;
+    tokenize(tokens, str, ' ');
+
+    for (vector<string>::size_type i = 0; i < tokens.size(); i++) {
+        T scalar_value = 0;
+        if (!parseParam(tokens[i], scalar_value)) {
+            string err("could not parse \"");
+
+            err += str;
+            err += "\"";
+
+            fatal(err);
+        }
+
+        // append parsed value to list
+        param.push_back(scalar_value);
+    }
+}
+
+
 void
 objParamIn(Checkpoint *cp, const string &section,
            const string &name, SimObject * &param)
@@ -356,7 +404,13 @@ arrayParamOut(ostream &os, const string &name,                          \
               const vector<type> &param);                               \
 template void                                                           \
 arrayParamIn(Checkpoint *cp, const string &section,                     \
-             const string &name, vector<type> &param);
+             const string &name, vector<type> &param);                  \
+template void                                                           \
+arrayParamOut(ostream &os, const string &name,                          \
+              const list<type> &param);                                 \
+template void                                                           \
+arrayParamIn(Checkpoint *cp, const string &section,                     \
+             const string &name, list<type> &param);
 
 INSTANTIATE_PARAM_TEMPLATES(char)
 INSTANTIATE_PARAM_TEMPLATES(signed char)
diff --git a/src/sim/serialize.hh b/src/sim/serialize.hh
index 5ea632ea4b..6be8ce3b6e 100644
--- a/src/sim/serialize.hh
+++ b/src/sim/serialize.hh
@@ -69,6 +69,10 @@ template <class T>
 void arrayParamOut(std::ostream &os, const std::string &name,
                    const std::vector<T> &param);
 
+template <class T>
+void arrayParamOut(std::ostream &os, const std::string &name,
+                   const std::list<T> &param);
+
 template <class T>
 void arrayParamIn(Checkpoint *cp, const std::string &section,
                   const std::string &name, T *param, unsigned size);
@@ -77,6 +81,10 @@ template <class T>
 void arrayParamIn(Checkpoint *cp, const std::string &section,
                   const std::string &name, std::vector<T> &param);
 
+template <class T>
+void arrayParamIn(Checkpoint *cp, const std::string &section,
+                  const std::string &name, std::list<T> &param);
+
 void
 objParamIn(Checkpoint *cp, const std::string &section,
            const std::string &name, SimObject * &param);
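Note on the new std::list overloads: the checkpoint line is written as
"name=v0 v1 ..." and read back by splitting on spaces. The following is a
minimal, self-contained sketch of that round trip, not the simulator's code.
listToLine and lineToList are hypothetical stand-ins for arrayParamOut and
arrayParamIn, and stream extraction is assumed in place of showParam,
parseParam, and tokenize.

```cpp
#include <cassert>
#include <list>
#include <sstream>
#include <string>

// Hypothetical stand-in for arrayParamOut: builds "name=v0 v1 ...\n".
template <class T>
std::string
listToLine(const std::string &name, const std::list<T> &param)
{
    std::ostringstream os;
    os << name << "=";
    typename std::list<T>::const_iterator it = param.begin();
    // Guard the first element so an empty list never advances past end().
    if (it != param.end()) {
        os << *it;
        ++it;
    }
    for (; it != param.end(); ++it)
        os << " " << *it;
    os << "\n";
    return os.str();
}

// Hypothetical stand-in for arrayParamIn: parses the value part of a line.
// Returns false if any whitespace-separated token is not a complete T.
template <class T>
bool
lineToList(const std::string &values, std::list<T> &param)
{
    param.clear();
    std::istringstream is(values);
    std::string token;
    while (is >> token) {
        std::istringstream ts(token);
        T value;
        if (!(ts >> value) || !ts.eof())
            return false;
        param.push_back(value);
    }
    return true;
}
```

An empty list serializes to "name=" followed by a newline and parses back to
an empty list, which is the case the iterator guard protects.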
diff --git a/src/sim/tlb.hh b/src/sim/tlb.hh
index 1512bc0fae..253f120720 100644
--- a/src/sim/tlb.hh
+++ b/src/sim/tlb.hh
@@ -1,4 +1,16 @@
 /*
+ * Copyright (c) 2011 ARM Limited
+ * All rights reserved.
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder.  You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
  * Copyright (c) 2006 The Regents of The University of Michigan
  * All rights reserved.
  *
@@ -64,6 +76,12 @@ class BaseTLB : public SimObject
         virtual ~Translation()
         {}
 
+        /**
+         * Signal that the translation has been delayed due to a hw page table
+         * walk.
+         */
+        virtual void markDelayed() = 0;
+
         /*
          * The memory for this object may be dynamically allocated, and it may
          * be responsible for cleaning itself up which will happen in this
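Note on markDelayed(): because the new hook is pure virtual, every concrete
Translation subclass has to provide it. The sketch below shows the shape of
such an override on a simplified stand-in class. Translation here carries only
the new hook, and RecordingTranslation and wasDelayed are hypothetical names
used for illustration, not part of the simulator.

```cpp
#include <cassert>

// Simplified stand-in for BaseTLB::Translation with only the new hook.
class Translation
{
  public:
    virtual ~Translation() {}
    virtual void markDelayed() = 0;
};

// Hypothetical client that remembers whether its translation was delayed,
// e.g. because a hardware page table walk had to be started.
class RecordingTranslation : public Translation
{
  public:
    RecordingTranslation() : delayed(false) {}

    void markDelayed() { delayed = true; }
    bool wasDelayed() const { return delayed; }

  private:
    bool delayed;
};
```

A TLB model would call markDelayed() through the base-class pointer it was
handed, so the client can tell an immediate translation from a deferred one.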
diff --git a/system/alpha/console/Makefile b/system/alpha/console/Makefile
new file mode 100644
index 0000000000..9fea133a2f
--- /dev/null
+++ b/system/alpha/console/Makefile
@@ -0,0 +1,60 @@
+# Copyright (c) 2005 The Regents of The University of Michigan
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are
+# met: redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer;
+# redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in the
+# documentation and/or other materials provided with the distribution;
+# neither the name of the copyright holders nor the names of its
+# contributors may be used to endorse or promote products derived from
+# this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+#
+# Authors: Nathan L. Binkert
+#          Ali G. Saidi
+
+# Point to the M5 directory so we can get some headers
+M5?=../../..
+
+### If we are not compiling on an alpha, we must use cross tools ###    
+ifneq ($(shell uname -m), alpha)
+CROSS_COMPILE?=alpha-unknown-linux-gnu-
+endif
+CC=$(CROSS_COMPILE)gcc
+AS=$(CROSS_COMPILE)as
+LD=$(CROSS_COMPILE)ld
+
+DBMENTRY= fffffc0000010000
+CFLAGS=-I . -I ../h -I$(M5)/src/dev/alpha -I$(M5)/util/m5/ -fno-builtin -Wa,-m21164
+OBJS=dbmentry.o printf.o paljtokern.o paljtoslave.o console.o m5op.o
+
+all: console
+
+m5op.o: $(M5)/util/m5/m5op_alpha.S
+	$(CC) $(CFLAGS) -nostdinc -o $@ -c $<
+
+%.o: %.S
+	$(CC) $(CFLAGS) -nostdinc -o $@ -c $<
+
+%.o: %.c
+	$(CC)  -g3 $(CFLAGS) -o $@ -c $<
+
+console: $(OBJS)
+	$(LD) -o console -N -Ttext $(DBMENTRY) -non_shared $(OBJS) -lc
+
+clean:
+	rm -f *.o console
diff --git a/system/alpha/console/console.c b/system/alpha/console/console.c
new file mode 100644
index 0000000000..f57ce054f0
--- /dev/null
+++ b/system/alpha/console/console.c
@@ -0,0 +1,1074 @@
+/*
+ * Copyright (c) 2003-2004 The Regents of The University of Michigan
+ * Copyright (c) 1993 The Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/* ******************************************
+ * M5 Console
+ * ******************************************/
+
+#include <linux/stddef.h>
+#include <sys/types.h>
+
+#define CONSOLE
+#include "access.h"
+#include "cserve.h"
+#include "rpb.h"
+
+#define CONS_INT_TX   0x01  /* interrupt enable / state bits */
+#define CONS_INT_RX   0x02
+
+#define PAGE_SIZE (8192)
+
+#define KSTACK_REGION_VA 0x20040000
+
+#define KSEG   0xfffffc0000000000
+#define K1BASE 0xfffffc8000000000
+#define KSEG_TO_PHYS(x) (((ulong)x) & ~KSEG)
+
+#define ROUNDUP8(x) ((ulong)(((ulong)x)+7) & ~7)
+#define ROUNDUP128(x) ((ulong)(((ulong)x) + 127) & ~127)
+#define ROUNDUP8K(x) ((ulong)(((ulong)(x)) + 8191) & ~8191)
+
+#define FIRST(x)  ((((ulong)(x)) >> 33) & 0x3ff)
+#define SECOND(x) ((((ulong)(x)) >> 23) & 0x3ff)
+#define THIRD(x) ((((ulong)(x)) >> 13) & 0x3ff)
+#define THIRD_XXX(x)  ((((ulong)(x)) >> 13) & 0xfff)
+#define PFN(x)  ((((ulong)(x) & ~KSEG) >> 13))
+
+/* Kernel write | kernel read | valid */
+#define KPTE(x) ((ulong)((((ulong)(x)) << 32) | 0x1101))
+
+#define HWRPB_PAGES 16
+
+#define NUM_KERNEL_THIRD (4)
+
+#define printf_lock(args...)		\
+    do {				\
+        SpinLock(&theLock);		\
+        printf(args);			\
+        SpinUnlock(&theLock);		\
+    } while (0)
+
+
+void unixBoot(int argc, char **argv);
+void JToKern(char *bootadr, ulong rpb_percpu, ulong free_pfn, ulong k_argc,
+             char **k_argv, char **envp);
+void JToPal(ulong bootadr);
+void SlaveLoop(int cpu);
+
+volatile struct AlphaAccess *m5AlphaAccess;
+struct AlphaAccess m5Conf;
+
+ulong theLock;
+
+extern void SpinLock(ulong *lock);
+#define SpinUnlock(_x) *(_x) = 0;
+
+struct _kernel_params {
+    char *bootadr;
+    ulong rpb_percpu;
+    ulong free_pfn;
+    ulong argc;
+    ulong argv;
+    ulong envp; /* NULL */
+};
+
+extern consoleCallback[];
+extern consoleFixup[];
+long CallBackDispatcher();
+long CallBackFixup();
+
+/*
+ * m5 console output
+ */
+
+void
+InitConsole()
+{
+}
+
+char
+GetChar()
+{
+    return m5AlphaAccess->inputChar;
+}
+
+void
+PutChar(char c)
+{
+    m5AlphaAccess->outputChar = c;
+}
+
+int
+passArgs(int argc)
+{
+    return 0;
+}
+
+int
+main(int argc, char **argv)
+{
+    int x, i;
+    uint *k1ptr, *ksegptr;
+
+    InitConsole();
+    printf_lock("M5 console: m5AlphaAccess @ 0x%x\n", m5AlphaAccess);
+
+    /*
+     * get configuration from backdoor
+     */
+    m5Conf.last_offset = m5AlphaAccess->last_offset;
+    printf_lock("Got Configuration %d\n", m5Conf.last_offset);
+
+    m5Conf.last_offset = m5AlphaAccess->last_offset;
+    m5Conf.version = m5AlphaAccess->version;
+    m5Conf.numCPUs = m5AlphaAccess->numCPUs;
+    m5Conf.intrClockFrequency = m5AlphaAccess->intrClockFrequency;
+    m5Conf.cpuClock = m5AlphaAccess->cpuClock;
+    m5Conf.mem_size = m5AlphaAccess->mem_size;
+    m5Conf.kernStart = m5AlphaAccess->kernStart;
+    m5Conf.kernEnd = m5AlphaAccess->kernEnd;
+    m5Conf.entryPoint = m5AlphaAccess->entryPoint;
+    m5Conf.diskUnit = m5AlphaAccess->diskUnit;
+    m5Conf.diskCount = m5AlphaAccess->diskCount;
+    m5Conf.diskPAddr = m5AlphaAccess->diskPAddr;
+    m5Conf.diskBlock = m5AlphaAccess->diskBlock;
+    m5Conf.diskOperation = m5AlphaAccess->diskOperation;
+    m5Conf.outputChar = m5AlphaAccess->outputChar;
+    m5Conf.inputChar = m5AlphaAccess->inputChar;
+
+    if (m5Conf.version != ALPHA_ACCESS_VERSION)  {
+        panic("Console version mismatch. Console expects %d, got %d\n",
+              ALPHA_ACCESS_VERSION, m5Conf.version);
+    }
+
+    /*
+     * setup arguments to kernel
+     */
+    unixBoot(argc, argv);
+
+    panic("unix failed to boot\n");
+    return 1;
+}
+
+/*
+ * BOOTING
+ */
+struct rpb m5_rpb = {
+    NULL,		/* 000: physical self-reference */
+    ((long)'H') | (((long)'W') << 8) | (((long)'R') << 16) |
+    ((long)'P' << 24) | (((long)'B') << 32),  /* 008: contains "HWRPB" */
+    6,			/* 010: HWRPB version number */
+    /* the byte count is wrong, but who needs it? - lance */
+    0,			/* 018: bytes in RPB perCPU CTB CRB MEDSC */
+    0,			/* 020: primary cpu id */
+    PAGE_SIZE,		/* 028: page size in bytes */
+    43,			/* 030: number of phys addr bits */
+    127,		/* 038: max valid ASN */
+    {'0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','1'},
+                        /* 040: system serial num: 10 ascii chars */
+    0, /* OVERRIDDEN */
+    (1<<10),		/* 058: system variation */
+    'c'|('o'<<8)|('o'<<16)|('l'<< 24),	/* 060: system revision */
+    1024*4096,		/* 068: scaled interval clock intr freq */
+    0,			/* 070: cycle counter frequency */
+    0x200000000,	/* 078: virtual page table base */
+    0,			/* 080: reserved */
+    0,			/* 088: offset to translation buffer hint */
+    1,			/* 090: number of processor slots OVERRIDDEN*/
+    sizeof(struct rpb_percpu),	/* 098: per-cpu slot size. OVERRIDDEN */
+    0,			/* 0A0: offset to per_cpu slots */
+    1,			/* 0A8: number of CTBs */
+    sizeof(struct ctb_tt),
+    0,			/* 0B8: offset to CTB (cons term block) */
+    0,			/* 0C0: offset to CRB (cons routine block) */
+    0,			/* 0C8: offset to memory descriptor table */
+    0,			/* 0D0: offset to config data block */
+    0,			/* 0D8: offset to FRU table */
+    0,			/* 0E0: virt addr of save term routine */
+    0,			/* 0E8: proc value for save term routine */
+    0,			/* 0F0: virt addr of restore term routine */
+    0,			/* 0F8: proc value for restore term routine */
+    0,			/* 100: virt addr of CPU restart routine */
+    0,			/* 108: proc value for CPU restart routine */
+    0,			/* 110: used to determine presence of kdebug */
+    0,			/* 118: reserved for hardware */
+/* the checksum is wrong, but who needs it? - lance */
+    0,			/* 120: checksum of prior entries in rpb */
+    0,			/* 128: receive ready bitmask */
+    0,			/* 130: transmit ready bitmask */
+    0,			/* 138: Dynamic System Recog. offset */
+};
+
+ulong m5_tbb[] = { 0x1e1e1e1e1e1e1e1e, 0x1e1e1e1e1e1e1e1e,
+                   0x1e1e1e1e1e1e1e1e, 0x1e1e1e1e1e1e1e1e,
+                   0x1e1e1e1e1e1e1e1e, 0x1e1e1e1e1e1e1e1e,
+                   0x1e1e1e1e1e1e1e1e, 0x1e1e1e1e1e1e1e1e };
+
+struct rpb_percpu m5_rpb_percpu = {
+    {0,0,0,0,0,0,1,{0,0},{0,0,0,0,0,0,0,0}}, /* 000: boot/restart HWPCB */
+    (STATE_PA | STATE_PP | STATE_CV |
+     STATE_PV | STATE_PMV | STATE_PL), 	/* 080: per-cpu state bits */
+    0xc000,				/* 088: palcode memory length */
+    0x2000,				/* 090: palcode scratch length */
+    0x4000,				/* 098: paddr of pal mem space */
+    0x2000,				/* 0A0: paddr of pal scratch space */
+    (2 << 16) | (5 << 8) | 1,		/* 0A8: PALcode rev required */
+    11 | (2L  << 32),			/* 0B0: processor type */
+    7,					/* 0B8: processor variation */
+    'M'|('5'<<8)|('A'<<16)|('0'<<24),	/* 0C0: processor revision */
+    {'M','5','/','A','l','p','h','a','0','0','0','0','0','0','0','0'},
+                                        /* 0C8: proc serial num: 10 chars */
+    0,					/* 0D8: phys addr of logout area */
+    0,					/* 0E0: len in bytes of logout area */
+    0,					/* 0E8: halt pcb base */
+    0,					/* 0F0: halt pc */
+    0,					/* 0F8: halt ps */
+    0,					/* 100: halt arg list (R25) */
+    0,					/* 108: halt return address (R26) */
+    0,					/* 110: halt procedure value (R27) */
+    0,		       			/* 118: reason for halt */
+    0,		       			/* 120: for software */
+    {0},				/* 128: inter-console comm buffer */
+    {1,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0},	/* 1D0: PALcode revs available */
+    0					/* 250: reserved for arch use */
+/* the dump stack grows from the end of the rpb page not to reach here */
+};
+
+struct _m5_rpb_mdt {
+    long   rpb_checksum;	/* 000: checksum of entire mem desc table */
+    long   rpb_impaddr;		/* 008: PA of implementation dep info */
+    long   rpb_numcl;		/* 010: number of clusters */
+    struct rpb_cluster rpb_cluster[3];	/* first instance of a cluster */
+};
+
+struct _m5_rpb_mdt m5_rpb_mdt = {
+    0,			/* 000: checksum of entire mem desc table */
+    0,			/* 008: PA of implementation dep info */
+    0,			/* 010: number of clusters */
+    {{	0,		/* 000: starting PFN of this cluster */
+        0,		/* 008: count of PFNs in this cluster */
+        0,		/* 010: count of tested PFNs in cluster */
+        0,		/* 018: va of bitmap */
+        0,		/* 020: pa of bitmap */
+        0,		/* 028: checksum of bitmap */
+        1		/* 030: usage of cluster */
+     },
+     {   0,		/* 000: starting PFN of this cluster */
+         0,		/* 008: count of PFNs in this cluster */
+         0,		/* 010: count of tested PFNs in cluster */
+         0,		/* 018: va of bitmap */
+         0,		/* 020: pa of bitmap */
+         0,		/* 028: checksum of bitmap */
+         0		/* 030: usage of cluster */
+     },
+     {   0,		/* 000: starting PFN of this cluster */
+         0,		/* 008: count of PFNs in this cluster */
+         0,		/* 010: count of tested PFNs in cluster */
+         0,		/* 018: va of bitmap */
+         0,		/* 020: pa of bitmap */
+         0,		/* 028: checksum of bitmap */
+         0		/* 030: usage of cluster */
+     }}
+};
+
+/* constants for slotinfo bus_type subfield */
+#define SLOTINFO_TC	0
+#define SLOTINFO_ISA	1
+#define SLOTINFO_EISA	2
+#define SLOTINFO_PCI	3
+
+struct rpb_ctb m5_rpb_ctb = {
+    CONS_DZ,	/* 000: console type */
+    0,		/* 008: console unit */
+    0,		/* 010: reserved */
+    0		/* 018: byte length of device dep portion */
+};
+
+/* we don't do any fixup (aka relocate the console) - we hope */
+struct rpb_crb m5_rpb_crb = {
+    0,		/* va of call-back dispatch rtn */
+    0,		/* pa of call-back dispatch rtn */
+    0,		/* va of call-back fixup rtn */
+    0,		/* pa of call-back fixup rtn */
+    0,		/* number of entries in phys/virt map */
+    0		/* Number of pages to be mapped */
+};
+
+struct _rpb_name {
+    ulong length;
+    char name[16];
+};
+
+extern struct _rpb_name m5_name;
+
+struct rpb_dsr m5_rpb_dsr = {
+    0,
+    0,
+    0,
+};
+
+struct _rpb_name m5_name = {
+    16,
+    {'U','M','I','C','H',' ','M','5','/','A','L','P','H','A',' ',0},
+};
+
+/*
+ * M5 has one LURT entry:
+ *   1050 is for workstations
+ *   1100 is servers (and is needed for CXX)
+ */
+long m5_lurt[10] = { 9, 12, -1, -1, -1, -1, -1, -1, 1100, 1100 };
+
+ulong unix_boot_mem;
+ulong bootadr;
+
+char **kargv;
+int kargc;
+ulong free_pfn;
+struct rpb_percpu *rpb_percpu;
+
+char *
+unix_boot_alloc(int pages)
+{
+    char *ret = (char *)unix_boot_mem;
+    unix_boot_mem += (pages * PAGE_SIZE);
+    return ret;
+}
+
+ulong *first = 0;
+ulong *third_rpb = 0;
+ulong *reservedFixup = 0;
+
+int strcpy(char *dst, char *src);
+
+struct rpb *rpb;
+extern ulong _end;
+
+void
+unixBoot(int argc, char **argv)
+{
+    ulong *second,  *third_kernel, ptr, *tbb, size, *percpu_logout;
+    unsigned char *mdt_bitmap;
+    long *lp1, *lp2, sum;
+    int i, cl;
+    ulong kern_first_page;
+    ulong mem_size = m5Conf.mem_size;
+
+    ulong mem_pages = mem_size / PAGE_SIZE, cons_pages;
+    ulong mdt_bitmap_pages = mem_pages / (PAGE_SIZE*8);
+
+    ulong kernel_bytes, ksp, kernel_end, *unix_kernel_stack, bss,
+        ksp_bottom, ksp_top;
+    struct rpb_ctb *rpb_ctb;
+    struct ctb_tt *ctb_tt;
+    struct rpb_dsr *rpb_dsr;
+    struct rpb_crb *rpb_crb;
+    struct _m5_rpb_mdt *rpb_mdt;
+    int *rpb_lurt;
+    char *rpb_name;
+    ulong nextPtr;
+
+    printf_lock("memsize %x pages %x \n", mem_size, mem_pages);
+
+    /* Allocate:
+     *   HWRPB_PAGES pages for the HWRPB
+     *   page table pages:
+     *     1: First level page table
+     *     1: Second level page table
+     *     1: Third level page table for HWRPB
+     *     NUM_KERNEL_THIRD: Third level page tables for kernel (8MB each)
+     * set up the page tables
+     * load the kernel at the physical address 0x230000
+     * build the HWRPB
+     *   set up memory descriptor table to give up the
+     *   physical memory between the end of the page
+     *   tables and the start of the kernel
+     * enable kseg addressing
+     * jump to the kernel
+     */
+
+    unix_boot_mem = ROUNDUP8K(&_end);
+
+    printf_lock("First free page after ROM 0x%x\n", unix_boot_mem);
+
+    rpb = (struct rpb *)unix_boot_alloc(HWRPB_PAGES);
+
+    mdt_bitmap = (unsigned char *)unix_boot_alloc(mdt_bitmap_pages); 
+    first = (ulong *)unix_boot_alloc(1);
+    second = (ulong *)unix_boot_alloc(1);
+    third_rpb = (ulong *)unix_boot_alloc(1);
+    reservedFixup = (ulong*) unix_boot_alloc(1);
+    third_kernel = (ulong *)unix_boot_alloc(NUM_KERNEL_THIRD);
+    percpu_logout = (ulong*)unix_boot_alloc(1);
+
+    cons_pages = KSEG_TO_PHYS(unix_boot_mem) / PAGE_SIZE;
+
+    /* Set up the page tables */
+    bzero((char *)first, PAGE_SIZE);
+    bzero((char *)second, PAGE_SIZE);
+    bzero((char *)reservedFixup, PAGE_SIZE);
+    bzero((char *)third_rpb, HWRPB_PAGES * PAGE_SIZE);
+    bzero((char *)third_kernel, PAGE_SIZE * NUM_KERNEL_THIRD);
+
+    first[0] = KPTE(PFN(second));
+    first[1] = KPTE(PFN(first)); /* Region 3 */
+
+    /* Region 0 */
+    second[SECOND(0x10000000)] = KPTE(PFN(third_rpb));
+
+    for (i = 0; i < NUM_KERNEL_THIRD; i++) {
+        /* Region 1 */
+        second[SECOND(0x20000000) + i] = KPTE(PFN(third_kernel) + i);
+    }
+
+    /* Region 2 */
+    second[SECOND(0x40000000)] = KPTE(PFN(second));
+
+
+    /* For some obscure reason, Dec Unix's database read
+     * from /etc/sysconfigtab is written to this fixed
+     * mapped memory location. Go figure, since it is
+     * not initialized by the console. Maybe it is
+     * to look at the database from the console
+     * after a boot/crash.
+     *
+     * Black magic to estimate the max size. SEGVs on overflow
+     * bugnion
+     */
+
+#define DATABASE_BASE           0x20000000
+#define DATABASE_END            0x20020000
+
+    ulong *dbPage = (ulong*)unix_boot_alloc(1);
+    bzero(dbPage, PAGE_SIZE);
+    second[SECOND(DATABASE_BASE)] = KPTE(PFN(dbPage));
+    for (i = DATABASE_BASE; i < DATABASE_END ; i += PAGE_SIZE) {
+        ulong *db = (ulong*)unix_boot_alloc(1);
+        dbPage[THIRD(i)] = KPTE(PFN(db));
+    }
+
+    /* Region 0 */
+    /* Map the HWRPB */
+    for (i = 0; i < HWRPB_PAGES; i++)
+        third_rpb[i] = KPTE(PFN(rpb) + i);
+
+    /* Map the MDT bitmap table */
+    for (i = 0; i < mdt_bitmap_pages; i++) {
+        third_rpb[HWRPB_PAGES + i] = KPTE(PFN(mdt_bitmap) + i);
+    }
+
+    /* Protect the PAL pages */
+    for (i = 1; i < PFN(first); i++)
+        third_rpb[HWRPB_PAGES + mdt_bitmap_pages + i] = KPTE(i);
+
+    /* Set up third_kernel after it's loaded, when we know where it is */
+    kern_first_page = (KSEG_TO_PHYS(m5Conf.kernStart)/PAGE_SIZE);
+    kernel_end = ROUNDUP8K(m5Conf.kernEnd);
+    bootadr = m5Conf.entryPoint;
+
+    printf_lock("HWRPB 0x%x l1pt 0x%x l2pt 0x%x l3pt_rpb 0x%x l3pt_kernel 0x%x"
+                " l2reserv 0x%x\n",
+                rpb, first, second, third_rpb, third_kernel, reservedFixup);
+    if (kernel_end - m5Conf.kernStart > (0x800000*NUM_KERNEL_THIRD)) {
+        printf_lock("Kernel is too large to map: 0x%x - 0x%x = 0x%x\n",
+                    kernel_end, m5Conf.kernStart,
+                    kernel_end - m5Conf.kernStart );
+        panic("kernel too big\n");
+    }
+    printf_lock("kstart = 0x%x, kend = 0x%x, kentry = 0x%x, numCPUs = 0x%x\n", m5Conf.kernStart, m5Conf.kernEnd, m5Conf.entryPoint, m5Conf.numCPUs);
+    ksp_bottom = (ulong)unix_boot_alloc(1);
+    ksp_top = ksp_bottom + PAGE_SIZE;
+    ptr = (ulong) ksp_bottom;
+    bzero((char *)ptr, PAGE_SIZE);
+    dbPage[THIRD(KSTACK_REGION_VA)] = 0;		          /* Stack Guard Page */
+    dbPage[THIRD(KSTACK_REGION_VA + PAGE_SIZE)] = KPTE(PFN(ptr)); /* Kernel Stack Page */
+    dbPage[THIRD(KSTACK_REGION_VA + 2*PAGE_SIZE)] = 0;		  /* Stack Guard Page */
+
+    /* put argv into the bottom of the stack - argv starts at 1 because
+     * the command that got us here (i.e. "unixboot") is in argv[0].
+     */
+    ksp = ksp_top - 8;			/* Back up one longword */
+    ksp -= argc * sizeof(char *);	/* Make room for argv */
+    kargv = (char **) ksp;
+    for (i = 1; i < argc; i++) {	/* Copy arguments to stack */
+        ksp -= ((strlen(argv[i]) + 1) + 7) & ~0x7;
+        kargv[i-1] = (char *) ksp;
+        strcpy(kargv[i - 1], argv[i]);
+    }
+    kargc = i - 1;
+    kargv[kargc] = NULL;	/* just to be sure; doesn't seem to be used */
+    ksp -= sizeof(char *);	/* point above last arg for no real reason */
+
+    free_pfn = PFN(kernel_end);
+
+    bcopy((char *)&m5_rpb, (char *)rpb, sizeof(struct rpb));
+
+    rpb->rpb_selfref = (struct rpb *) KSEG_TO_PHYS(rpb);
+    rpb->rpb_string = 0x0000004250525748;
+
+    tbb = (ulong *) (((char *) rpb) + ROUNDUP8(sizeof(struct rpb)));
+    rpb->rpb_trans_off = (ulong)tbb - (ulong)rpb;
+    bcopy((char *)m5_tbb, (char *)tbb, sizeof(m5_tbb));
+
+    /*
+     * rpb_counter. Use to determine timeouts in OS.
+     * XXX must be patched after a checkpoint restore (I guess)
+     */
+
+    printf_lock("CPU Clock at %d MHz IntrClockFrequency=%d \n",
+                m5Conf.cpuClock, m5Conf.intrClockFrequency);
+    rpb->rpb_counter = m5Conf.cpuClock * 1000 * 1000;
+
+    /*
+     * By definition, the rpb_clock is scaled by 4096 (in hz)
+     */
+    rpb->rpb_clock = m5Conf.intrClockFrequency * 4096;
+
+    /*
+     * Per CPU Slots. Multiprocessor support.
+     */
+    int percpu_size = ROUNDUP128(sizeof(struct rpb_percpu));
+
+    printf_lock("Booting with %d processor(s) \n", m5Conf.numCPUs);
+
+    rpb->rpb_numprocs = m5Conf.numCPUs;
+    rpb->rpb_slotsize = percpu_size;
+    rpb_percpu = (struct rpb_percpu *)
+        ROUNDUP128(((ulong)tbb) + (sizeof(m5_tbb)));
+
+    rpb->rpb_percpu_off = (ulong)rpb_percpu - (ulong)rpb;
+
+    for (i = 0; i < m5Conf.numCPUs; i++) {
+        struct rpb_percpu *thisCPU = (struct rpb_percpu*)
+            ((ulong)rpb_percpu + percpu_size * i);
+
+        bzero((char *)thisCPU, percpu_size);
+        bcopy((char *)&m5_rpb_percpu, (char *)thisCPU,
+              sizeof(struct rpb_percpu));
+
+        thisCPU->rpb_pcb.rpb_ksp = (KSTACK_REGION_VA + 2*PAGE_SIZE - (ksp_top - ksp));
+        thisCPU->rpb_pcb.rpb_ptbr = PFN(first);
+
+        thisCPU->rpb_logout = KSEG_TO_PHYS(percpu_logout);
+        thisCPU->rpb_logout_len = PAGE_SIZE;
+
+        printf_lock("KSP: 0x%x PTBR 0x%x\n",
+                    thisCPU->rpb_pcb.rpb_ksp, thisCPU->rpb_pcb.rpb_ptbr);
+    }
+
+    nextPtr = (ulong)rpb_percpu + percpu_size * m5Conf.numCPUs;
+
+    /*
+     * Console Terminal Block
+     */
+    rpb_ctb = (struct rpb_ctb *) nextPtr;
+    ctb_tt = (struct ctb_tt*) rpb_ctb;
+
+    rpb->rpb_ctb_off = ((ulong)rpb_ctb) - (ulong)rpb;
+    rpb->rpb_ctb_size  = sizeof(struct rpb_ctb);
+
+    bzero((char *)rpb_ctb, sizeof(struct ctb_tt));
+
+    rpb_ctb->rpb_type = CONS_DZ;
+    rpb_ctb->rpb_length = sizeof(ctb_tt) - sizeof(rpb_ctb);
+
+    /*
+     * uart initialization
+     */
+    ctb_tt->ctb_tintr_vec = 0x6c0;  /* matches tlaser pal code */
+    ctb_tt->ctb_rintr_vec = 0x680;  /* matches tlaser pal code */
+    ctb_tt->ctb_term_type = CTB_GRAPHICS;
+
+    rpb_crb = (struct rpb_crb *) (((ulong)rpb_ctb) + sizeof(struct ctb_tt));
+    rpb->rpb_crb_off = ((ulong)rpb_crb) - (ulong)rpb;
+
+    bzero((char *)rpb_crb, sizeof(struct rpb_crb));
+
+    /*
+     * console callback stuff (m5)
+     */
+    rpb_crb->rpb_num = 1;
+    rpb_crb->rpb_mapped_pages = HWRPB_PAGES;
+    rpb_crb->rpb_map[0].rpb_virt = 0x10000000;
+    rpb_crb->rpb_map[0].rpb_phys = KSEG_TO_PHYS(((ulong)rpb) & ~0x1fff);
+    rpb_crb->rpb_map[0].rpb_pgcount = HWRPB_PAGES;
+
+    printf_lock("Console Callback at 0x%x, fixup at 0x%x, crb offset: 0x%x\n",
+                rpb_crb->rpb_va_disp, rpb_crb->rpb_va_fixup, rpb->rpb_crb_off);
+
+    rpb_mdt = (struct _m5_rpb_mdt *)((ulong)rpb_crb + sizeof(struct rpb_crb));
+    rpb->rpb_mdt_off = (ulong)rpb_mdt - (ulong)rpb;
+    bcopy((char *)&m5_rpb_mdt, (char *)rpb_mdt, sizeof(struct _m5_rpb_mdt));
+
+
+    cl = 0;
+    rpb_mdt->rpb_cluster[cl].rpb_pfncount = kern_first_page;
+    cl++;
+
+    rpb_mdt->rpb_cluster[cl].rpb_pfn = kern_first_page;
+    rpb_mdt->rpb_cluster[cl].rpb_pfncount = mem_pages - kern_first_page;
+    rpb_mdt->rpb_cluster[cl].rpb_pfntested =
+        rpb_mdt->rpb_cluster[cl].rpb_pfncount;
+    rpb_mdt->rpb_cluster[cl].rpb_pa = KSEG_TO_PHYS(mdt_bitmap);
+    rpb_mdt->rpb_cluster[cl].rpb_va = 0x10000000 + HWRPB_PAGES * PAGE_SIZE;
+    cl++;
+
+    rpb_mdt->rpb_numcl = cl;
+
+    for (i = 0; i < cl; i++)
+        printf_lock("Memory cluster %d [%d - %d]\n", i,
+                    rpb_mdt->rpb_cluster[i].rpb_pfn,
+                    rpb_mdt->rpb_cluster[i].rpb_pfncount);
+
+    /* Checksum the rpb for good luck */
+    sum = 0;
+    lp1 = (long *)&rpb_mdt->rpb_impaddr;
+    lp2 = (long *)&rpb_mdt->rpb_cluster[cl];
+    while (lp1 < lp2) sum += *lp1++;
+    rpb_mdt->rpb_checksum = sum;
+
+    /* XXX should checksum the cluster descriptors */
+    bzero((char *)mdt_bitmap, mdt_bitmap_pages * PAGE_SIZE);
+    for (i = 0; i < mem_pages/8; i++)
+        ((unsigned char *)mdt_bitmap)[i] = 0xff;
+
+    printf_lock("Initializing mdt_bitmap addr 0x%x mem_pages %x \n",
+                (long)mdt_bitmap,(long)mem_pages);
+
+    m5_rpb.rpb_config_off = 0;
+    m5_rpb.rpb_fru_off = 0;
+
+    rpb_dsr = (struct rpb_dsr *)((ulong)rpb_mdt + sizeof(struct _m5_rpb_mdt));
+    rpb->rpb_dsr_off = (ulong)rpb_dsr - (ulong)rpb;
+    bzero((char *)rpb_dsr, sizeof(struct rpb_dsr));
+    rpb_dsr->rpb_smm = 1578; /* Official XXM SMM number as per SRM */
+    rpb_dsr->rpb_smm = 1089; /* Official Alcor SMM number as per SRM */
+
+    rpb_lurt = (int *) ROUNDUP8((ulong)rpb_dsr + sizeof(struct rpb_dsr));
+    rpb_dsr->rpb_lurt_off = ((ulong) rpb_lurt) - (ulong) rpb_dsr;
+    bcopy((char *)m5_lurt, (char *)rpb_lurt, sizeof(m5_lurt));
+
+    rpb_name = (char *) ROUNDUP8(((ulong)rpb_lurt) + sizeof(m5_lurt));
+    rpb_dsr->rpb_sysname_off = ((ulong) rpb_name) - (ulong) rpb_dsr;
+#define THENAME "             M5/Alpha       "
+    sum = sizeof(THENAME);
+    bcopy(THENAME, rpb_name, sum);
+    *(ulong *)rpb_name = sizeof(THENAME); /* put in length field */
+
+    /* calculate size of rpb */
+    rpb->rpb_size = ((ulong) &rpb_name[sum]) - (ulong)rpb;
+
+    if (rpb->rpb_size > PAGE_SIZE * HWRPB_PAGES) {
+        panic("HWRPB_PAGES=%d too small for HWRPB !!! \n", HWRPB_PAGES);
+    }
+
+    ulong *rpbptr = (ulong*)((char*)rpb_dsr + sizeof(struct rpb_dsr));
+    rpb_crb->rpb_pa_disp = KSEG_TO_PHYS(rpbptr);
+    rpb_crb->rpb_va_disp = 0x10000000 +
+        (((ulong)rpbptr - (ulong)rpb) & (0x2000 * HWRPB_PAGES - 1));
+    printf_lock("ConsoleDispatch at virt %x phys %x val %x\n",
+                rpb_crb->rpb_va_disp, rpb_crb->rpb_pa_disp, consoleCallback);
+    *rpbptr++ = 0;
+    *rpbptr++ = (ulong) consoleCallback;
+    rpb_crb->rpb_pa_fixup = KSEG_TO_PHYS(rpbptr);
+    rpb_crb->rpb_va_fixup = 0x10000000 +
+        (((ulong)rpbptr - (ulong)rpb) & (0x2000 * HWRPB_PAGES - 1));
+    *rpbptr++ = 0;
+
+    *rpbptr++ = (ulong) consoleFixup;
+
+    /* Checksum the rpb for good luck */
+    sum = 0;
+    lp1 = (long *)rpb;
+    lp2 = &rpb->rpb_checksum;
+    while (lp1 < lp2)
+        sum += *lp1++;
+    *lp2 = sum;
+
+    /*
+     * MP bootstrap
+     */
+    for (i = 1; i < m5Conf.numCPUs; i++) {
+        ulong stack = (ulong)unix_boot_alloc(1);
+        printf_lock("Bootstrapping CPU %d with sp=0x%x\n", i, stack);
+        m5AlphaAccess->cpuStack[i] = stack;
+    }
+
+    /*
+     * Make sure that we are not stepping on the kernel
+     */
+    if ((ulong)unix_boot_mem >= (ulong)m5Conf.kernStart) {
+        panic("CONSOLE: too much memory. Smashing kernel\n");
+    } else {
+        printf_lock("unix_boot_mem ends at %x \n", unix_boot_mem);
+    }
+
+    JToKern((char *)bootadr, (ulong)rpb_percpu, free_pfn, kargc, kargv, NULL);
+}
+
+
+void
+JToKern(char *bootadr, ulong rpb_percpu, ulong free_pfn, ulong k_argc,
+        char **k_argv, char **envp)
+{
+    extern ulong palJToKern[];
+
+    struct _kernel_params *kernel_params = (struct _kernel_params *) KSEG;
+    int i;
+
+    printf_lock("k_argc = %d ", k_argc);
+    for (i = 0; i < k_argc; i++) {
+        printf_lock("'%s' ", k_argv[i]);
+    }
+    printf_lock("\n");
+
+    kernel_params->bootadr = bootadr;
+    kernel_params->rpb_percpu = KSEG_TO_PHYS(rpb_percpu);
+    kernel_params->free_pfn = free_pfn;
+    kernel_params->argc = k_argc;
+    kernel_params->argv = (ulong)k_argv;
+    kernel_params->envp = (ulong)envp;
+    printf_lock("jumping to kernel at 0x%x, (PCBB 0x%x pfn %d)\n",
+                bootadr, rpb_percpu, free_pfn);
+    JToPal(KSEG_TO_PHYS(palJToKern));
+    printf_lock("returned from JToPal. Looping\n");
+    while (1)
+        continue;
+}
+
+void
+JToPal(ulong bootadr)
+{
+    cServe(bootadr, 0, CSERVE_K_JTOPAL);
+
+    /*
+     * Make sure that floating point is enabled in case
+     * it was disabled by the user program.
+     */
+    wrfen(1);
+}
+
+int
+strcpy(char *dst, char *src)
+{
+    int i = 0;
+    /* Copy up to and including the terminating NUL. */
+    while ((dst[i] = src[i]) != '\0') {
+        i++;
+    }
+    return i;
+}
+
+/*
+ * Console I/O
+ *
+ */
+
+int numOpenDevices = 11;
+struct {
+    char name[128];
+} deviceState[32];
+
+#define BOOTDEVICE_NAME "SCSI 1 0 0 1 100 0"
+
+void
+DeviceOperation(long op, long channel, long count, long address, long block)
+{
+    long pAddr;
+
+    if (strcmp(deviceState[channel].name, BOOTDEVICE_NAME )) {
+        panic("DeviceRead: only implemented for root disk \n");
+    }
+    pAddr = KSEG_TO_PHYS(address);
+    if (pAddr + count > m5Conf.mem_size) {
+        panic("DeviceRead: request out of range \n");
+    }
+
+    m5AlphaAccess->diskCount = count;
+    m5AlphaAccess->diskPAddr = pAddr;
+    m5AlphaAccess->diskBlock = block;
+    m5AlphaAccess->diskOperation = op; /* launch */
+}
+
+/*
+ * M5 Console callbacks
+ *
+ */
+
+/* AXP manual 2-31 */
+#define CONSCB_GETC 0x1
+#define CONSCB_PUTS 0x2
+#define CONSCB_RESET_TERM 0x3
+#define CONSCB_SET_TERM_INT 0x4
+#define CONSCB_SET_TERM_CTL 0x5
+#define CONSCB_PROCESS_KEY 0x6
+#define CONSCB_OPEN_CONSOLE 0x7
+#define CONSCB_CLOSE_CONSOLE 0x8
+
+#define CONSCB_OPEN 0x10
+#define CONSCB_CLOSE 0x11
+#define CONSCB_READ 0x13
+
+#define CONSCB_GETENV 0x22
+
+/* AXP manual 2-26 */
+#define	ENV_AUTO_ACTION		0X01
+#define	ENV_BOOT_DEV		0X02
+#define	ENV_BOOTDEF_DEV		0X03
+#define	ENV_BOOTED_DEV		0X04
+#define	ENV_BOOT_FILE		0X05
+#define	ENV_BOOTED_FILE		0X06
+#define	ENV_BOOT_OSFLAGS	0X07
+#define	ENV_BOOTED_OSFLAGS	0X08
+#define	ENV_BOOT_RESET		0X09
+#define	ENV_DUMP_DEV		0X0A
+#define	ENV_ENABLE_AUDIT	0X0B
+#define	ENV_LICENSE		0X0C
+#define	ENV_CHAR_SET		0X0D
+#define	ENV_LANGUAGE		0X0E
+#define	ENV_TTY_DEV		0X0F
+#define	ENV_SCSIID		0X42
+#define	ENV_SCSIFAST		0X43
+#define	ENV_COM1_BAUD		0X44
+#define	ENV_COM1_MODEM		0X45
+#define	ENV_COM1_FLOW		0X46
+#define	ENV_COM1_MISC		0X47
+#define	ENV_COM2_BAUD		0X48
+#define	ENV_COM2_MODEM		0X49
+#define	ENV_COM2_FLOW		0X4A
+#define	ENV_COM2_MISC		0X4B
+#define	ENV_PASSWORD		0X4C
+#define	ENV_SECURE		0X4D
+#define	ENV_LOGFAIL		0X4E
+#define	ENV_SRM2DEV_ID		0X4F
+
+#define MAX_ENVLEN 32
+
+char env_auto_action[MAX_ENVLEN]	= "BOOT";
+char env_boot_dev[MAX_ENVLEN]		= "";
+char env_bootdef_dev[MAX_ENVLEN]	= "";
+char env_booted_dev[MAX_ENVLEN]		= BOOTDEVICE_NAME;
+char env_boot_file[MAX_ENVLEN]		= "";
+char env_booted_file[MAX_ENVLEN]	= "";
+char env_boot_osflags[MAX_ENVLEN]	= "";
+char env_booted_osflags[MAX_ENVLEN]	= "";
+char env_boot_reset[MAX_ENVLEN]		= "";
+char env_dump_dev[MAX_ENVLEN]		= "";
+char env_enable_audit[MAX_ENVLEN]	= "";
+char env_license[MAX_ENVLEN]		= "";
+char env_char_set[MAX_ENVLEN]		= "";
+char env_language[MAX_ENVLEN]		= "";
+char env_tty_dev[MAX_ENVLEN]		= "0";
+char env_scsiid[MAX_ENVLEN]		= "";
+char env_scsifast[MAX_ENVLEN]		= "";
+char env_com1_baud[MAX_ENVLEN]		= "";
+char env_com1_modem[MAX_ENVLEN]		= "";
+char env_com1_flow[MAX_ENVLEN]		= "";
+char env_com1_misc[MAX_ENVLEN]		= "";
+char env_com2_baud[MAX_ENVLEN]		= "";
+char env_com2_modem[MAX_ENVLEN]		= "";
+char env_com2_flow[MAX_ENVLEN]		= "";
+char env_com2_misc[MAX_ENVLEN]		= "";
+char env_password[MAX_ENVLEN]		= "";
+char env_secure[MAX_ENVLEN]		= "";
+char env_logfail[MAX_ENVLEN]		= "";
+char env_srm2dev_id[MAX_ENVLEN]		= "";
+
+#define MAX_ENV_INDEX 100
+char *envptr[MAX_ENV_INDEX] = {
+    0,					/* 0x00 */
+    env_auto_action,			/* 0x01 */
+    env_boot_dev,			/* 0x02 */
+    env_bootdef_dev,			/* 0x03 */
+    env_booted_dev,			/* 0x04 */
+    env_boot_file,			/* 0x05 */
+    env_booted_file,			/* 0x06 */
+    env_boot_osflags,			/* 0x07 */
+    env_booted_osflags,			/* 0x08 */
+    env_boot_reset,			/* 0x09 */
+    env_dump_dev,			/* 0x0A */
+    env_enable_audit,			/* 0x0B */
+    env_license,			/* 0x0C */
+    env_char_set,			/* 0x0D */
+    (char *)&env_language,		/* 0x0E */
+    env_tty_dev,			/* 0x0F */
+    0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0,	/* 0x10 - 0x1F */
+    0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0,	/* 0x20 - 0x2F */
+    0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0,	/* 0x30 - 0x3F */
+    0,					/* 0x40 */
+    0,					/* 0x41 */
+    env_scsiid,				/* 0x42 */
+    env_scsifast,			/* 0x43 */
+    env_com1_baud,			/* 0x44 */
+    env_com1_modem,			/* 0x45 */
+    env_com1_flow,			/* 0x46 */
+    env_com1_misc,			/* 0x47 */
+    env_com2_baud,			/* 0x48 */
+    env_com2_modem,			/* 0x49 */
+    env_com2_flow,			/* 0x4A */
+    env_com2_misc,			/* 0x4B */
+    env_password,			/* 0x4C */
+    env_secure,				/* 0x4D */
+    env_logfail,			/* 0x4E */
+    env_srm2dev_id,			/* 0x4F */
+    0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0,	/* 0x50 - 0x5F */
+    0,					/* 0x60 */
+    0,					/* 0x61 */
+    0,					/* 0x62 */
+    0,					/* 0x63 */
+};
+
+long
+CallBackDispatcher(long a0, long a1, long a2, long a3, long a4)
+{
+    long i;
+    switch (a0) {
+      case CONSCB_GETC:
+        return GetChar();
+
+      case CONSCB_PUTS:
+        for (i = 0; i < a3; i++)
+            PutChar(*((char *)a2 + i));
+        return a3;
+
+      case CONSCB_GETENV:
+        if (a1 >= 0 && a1 < MAX_ENV_INDEX && envptr[a1] != 0 && *envptr[a1]) {
+            i = strcpy((char*)a2, envptr[a1]);
+        } else {
+            strcpy((char*)a2, "");
+            i = (long)0xc000000000000000;
+            if (a1 >= 0 && a1 < MAX_ENV_INDEX)
+                printf_lock("GETENV unsupported option %d (0x%x)\n", a1, a1);
+            else
+                printf_lock("GETENV unsupported option %d\n", a1);
+        }
+
+        if (i > a3)
+            panic("CONSCB_GETENV overwrote buffer\n");
+        return i;
+
+      case CONSCB_OPEN:
+        bcopy((char*)a1, deviceState[numOpenDevices].name, a2);
+        deviceState[numOpenDevices].name[a2] = '\0';
+        printf_lock("CONSOLE OPEN : %s --> success \n",
+                    deviceState[numOpenDevices].name);
+        return numOpenDevices++;
+
+      case CONSCB_READ:
+        DeviceOperation(a0, a1, a2, a3, a4);
+        break;
+
+      case CONSCB_CLOSE:
+        break;
+
+      case CONSCB_OPEN_CONSOLE:
+        printf_lock("CONSOLE OPEN\n");
+        return 0; /* success */
+        break; /* not reached */
+
+      case CONSCB_CLOSE_CONSOLE:
+        printf_lock("CONSOLE CLOSE\n");
+        return 0; /* success */
+        break; /* not reached */
+
+      default:
+        panic("CallBackDispatcher(%x,%x,%x,%x,%x)\n", a0, a1, a2, a3, a4);
+    }
+
+    return 0;
+}
+
+long
+CallBackFixup(int a0, int a1, int a2)
+{
+    long temp;
+    /*
+     * Linux uses r8 for the current pointer (pointer to the data
+     * structure containing info about the currently running process).
+     * It is set when the kernel starts and is expected to remain
+     * there. The problem is that, unlike the kernel, the console
+     * does not prevent the assembler from using r8, so here is a
+     * workaround. So far this has only been a problem in
+     * CallBackFixup(), but any other callback function could cause
+     * a problem at some point.
+     */
+
+    /* save off the current pointer to a temp variable */
+    asm("bis $8, $31, %0" : "=r" (temp));
+
+    /* call original code */
+    printf_lock("CallbackFixup %x %x, t7=%x\n", a0, a1, temp);
+
+    /* restore the current pointer */
+    asm("bis %0, $31, $8" : : "r" (temp) : "$8");
+
+    return 0;
+}
+
+void
+SlaveCmd(int cpu, struct rpb_percpu *my_rpb)
+{
+    extern ulong palJToSlave[];
+
+    printf_lock("Slave CPU %d console command %s", cpu,
+                my_rpb->rpb_iccb.iccb_rxbuf);
+
+    my_rpb->rpb_state |= STATE_BIP;
+    my_rpb->rpb_state &= ~STATE_RC;
+
+    printf_lock("SlaveCmd: restart %x %x vptb %x my_rpb %x my_rpb_phys %x\n",
+                rpb->rpb_restart, rpb->rpb_restart_pv, rpb->rpb_vptb, my_rpb,
+                KSEG_TO_PHYS(my_rpb));
+
+    cServe(KSEG_TO_PHYS(palJToSlave), (ulong)rpb->rpb_restart,
+           CSERVE_K_JTOPAL, rpb->rpb_restart_pv, rpb->rpb_vptb,
+           KSEG_TO_PHYS(my_rpb));
+
+    panic("SlaveCmd returned \n");
+}
+
+void
+SlaveLoop(int cpu)
+{
+    int size = ROUNDUP128(sizeof(struct rpb_percpu));
+    struct rpb_percpu *my_rpb = (struct rpb_percpu*)
+        ((ulong)rpb_percpu + size * cpu);
+
+    if (cpu == 0) {
+        panic("CPU 0 entering slaveLoop. Reentering the console. HOSED\n");
+    } else {
+        printf_lock("Entering slaveloop for cpu %d my_rpb=%x\n", cpu, my_rpb);
+    }
+
+    // swap the processor's context to the one in the
+    // rpb_percpu struct very carefully (i.e. no stack usage)
+    // so that Linux knows which processor ends up in __smp_callin
+    // and we don't trash any data in the process
+    SlaveSpin(cpu, my_rpb, &my_rpb->rpb_iccb.iccb_rxlen);
+}
diff --git a/system/alpha/console/dbmentry.S b/system/alpha/console/dbmentry.S
new file mode 100644
index 0000000000..56a2c1950b
--- /dev/null
+++ b/system/alpha/console/dbmentry.S
@@ -0,0 +1,213 @@
+/*
+ * Copyright (c) 2003-2004 The Regents of The University of Michigan
+ * Copyright (c) 1993 The Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+ * Debug Monitor Entry code
+ */
+#include "fromHudsonOsf.h"
+
+        .extern myAlphaAccess
+        .text
+
+/* return address and padding to octaword align */
+#define STARTFRM 16
+
+        .globl  _start
+        .ent    _start, 0
+_start:
+_entry:
+        br      t0, 2f			# get the current PC
+2:	ldgp    gp, 0(t0)               # init gp
+
+/* Processor 0's start stack frame is the beginning of physical memory (0).
+   Other processors spin here waiting to get their stacks from
+   Processor 0, then they can progress as normal.
+*/
+        call_pal PAL_WHAMI_ENTRY
+        beq v0, cpuz
+        ldq  t3, m5AlphaAccess
+        addq t3,0x70,t3 # *** If offset in console alpha access struct changes
+                        # This must be changed as well!
+        bis  zero,8,t4
+        mulq t4,v0,t4
+        addq t3,t4,t3
+        ldah a0, 3(zero)  # load arg0 with 65536*3
+cpuwait: .long 0x6000002  # jsr quiesceNs
+        ldq  t4, 0(t3)
+        beq  t4, cpuwait
+        bis  t4,t4,sp
+
+
+cpuz:	bis	sp,sp,s0 /* save sp */
+
+slave:	lda	v0,(8*1024)(sp) /* end of page  */
+
+        subq	zero, 1, t0
+        sll	t0, 42, t0
+        bis	t0, v0, sp
+
+        lda     sp, -STARTFRM(sp)	# Create a stack frame
+        stq     ra, 0(sp)		# Place return address on the stack
+
+        .mask   0x84000000, -8
+        .frame  sp, STARTFRM, ra
+
+/*
+ *	Enable the Floating Point Unit
+ */
+        lda	a0, 1(zero)
+        call_pal PAL_WRFEN_ENTRY
+
+/*
+ *	Every good C program has a main()
+ */
+
+/* If stack pointer was 0, then this is CPU0*/
+        beq	s0,master
+
+        call_pal PAL_WHAMI_ENTRY
+        bis	v0,v0,a0
+        jsr	ra, SlaveLoop
+master:
+        jsr	ra, main
+
+
+
+/*
+ *	The Debug Monitor should never return.
+ *	However, just in case...
+ */
+        ldgp	gp, 0(ra)
+        bsr	zero, _exit
+
+.end	_start
+
+
+
+        .globl  _exit
+        .ent    _exit, 0
+_exit:
+
+        ldq     ra, 0(sp)		# restore return address
+        lda	sp, STARTFRM(sp)	# prune back the stack
+        ret	zero, (ra)		# Back from whence we came
+.end	_exit
+
+        .globl	cServe
+        .ent	cServe 2
+cServe:
+        .option	O1
+        .frame	sp, 0, ra
+        call_pal PAL_CSERVE_ENTRY
+        ret	zero, (ra)
+        .end	cServe
+
+        .globl	wrfen
+        .ent	wrfen 2
+wrfen:
+        .option	O1
+        .frame	sp, 0, ra
+        call_pal PAL_WRFEN_ENTRY
+        ret	zero, (ra)
+        .end	wrfen
+        .globl	consoleCallback
+        .ent	consoleCallback 2
+consoleCallback:
+        br      t0, 2f			# get the current PC
+2:	ldgp    gp, 0(t0)               # init gp
+        lda     sp,-64(sp)
+        stq     ra,0(sp)
+        jsr     CallBackDispatcher
+        ldq     ra,0(sp)
+        lda     sp,64(sp)
+        ret     zero,(ra)
+        .end    consoleCallback
+
+
+        .globl	consoleFixup
+        .ent	consoleFixup 2
+consoleFixup:
+        br      t0, 2f			# get the current PC
+2:	ldgp    gp, 0(t0)               # init gp
+        lda     sp,-64(sp)
+        stq     ra,0(sp)
+        jsr     CallBackFixup
+        ldq     ra,0(sp)
+        lda     sp,64(sp)
+        ret     zero,(ra)
+        .end    consoleFixup
+
+
+
+        .globl	SpinLock
+        .ent	SpinLock 2
+SpinLock:
+1:
+        ldq_l	a1,0(a0)		# interlock complete lock state
+        subl	ra,3,v0			# get calling addr[31:0] + 1
+        blbs	a1,2f			# branch if lock is busy
+        stq_c	v0,0(a0)		# attempt to acquire lock
+        beq	v0,2f			# branch if lost atomicity
+        mb				# ensure memory coherence
+        ret	zero,(ra)		# return to caller (v0 is 1)
+2:
+        br	zero,1b
+        .end	SpinLock
+
+        .globl	loadContext
+        .ent	loadContext 2
+loadContext:
+        .option	O1
+        .frame	sp, 0, ra
+        call_pal PAL_SWPCTX_ENTRY
+        ret	zero, (ra)
+        .end	loadContext
+
+
+        .globl	SlaveSpin          # Very carefully spin wait
+        .ent	SlaveSpin 2        # and swap context without
+SlaveSpin:                         # using any stack space
+        .option	O1
+        .frame	sp, 0, ra
+        mov a0, t0                 # cpu number
+        mov a1, t1                 # cpu rpb pointer (virtual)
+        mov a2, t2                 # what to spin on
+        ldah a0, 3(zero)  # load arg0 with 65536*3
+test:   .long 0x6000002  # jsr quiesceNs     # wait 65us*3
+        ldl  t3, 0(t2)
+        beq  t3, test
+        zapnot t1,0x1f,a0          # make rpb physical
+        call_pal PAL_SWPCTX_ENTRY  # switch to pcb
+        mov t0, a0                 # setup args for SlaveCmd
+        mov t1, a1
+        jsr SlaveCmd               # call SlaveCmd
+        ret	zero, (ra)         # Should never be reached
+        .end	SlaveSpin
+
+
diff --git a/system/alpha/console/paljtokern.S b/system/alpha/console/paljtokern.S
new file mode 100644
index 0000000000..dfaf325331
--- /dev/null
+++ b/system/alpha/console/paljtokern.S
@@ -0,0 +1,174 @@
+/*
+ * Copyright (c) 2003-2004 The Regents of The University of Michigan
+ * Copyright (c) 1993 The Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "dc21164FromGasSources.h"	// DECchip 21164 specific definitions
+#include "ev5_defs.h"
+#include "fromHudsonOsf.h"		// OSF/1 specific definitions
+#include "fromHudsonMacros.h"		// Global macro definitions
+
+/* Jump to kernel
+ * args:
+ *	Kernel address - a0
+ *	PCBB           - a1
+ *	First free PFN - a3?
+ *
+ *	Enable kseg addressing in ICSR
+ *	Enable kseg addressing in MCSR
+ *	Set VTBR -- Set to 1GB as per SRM, or maybe 8GB??
+ *	Set PCBB -- pass pointer in arg
+ *	Set PTBR -- get it out of PCB
+ *	Set KSP  -- get it out of PCB
+ *
+ *	Jump to kernel address
+ *
+ *	Kernel args-
+ *	s0 first free PFN
+ *	s1 ptbr
+ *	s2 argc 0
+ *	s3 argv NULL
+ *	s5 osf_param (sysconfigtab) NULL
+ */
+
+        .global palJToKern
+        .text 3
+palJToKern:
+        ALIGN_BRANCH
+
+        ldq_p	a0, 0(zero)
+        ldq_p	a1, 8(zero)
+        ldq_p	a3, 16(zero)
+
+        /* Point the Vptbr at 8GB */
+        lda	t0, 0x1(zero)
+        sll	t0, 33, t0
+
+        mtpr	t0, mVptBr	// Load Mbox copy
+        mtpr	t0, iVptBr	// Load Ibox copy
+        STALL			// don't dual issue the load with mtpr -pb
+
+        /* Turn on superpage mapping in the mbox and icsr */
+        lda	t0, (2<<MCSR_V_SP)(zero) // Get a '10' (binary) in MCSR<SP>
+        STALL			// don't dual issue the load with mtpr -pb
+        mtpr	t0, mcsr	// Set the super page mode enable bit
+        STALL			// don't dual issue the load with mtpr -pb
+
+        lda	t0, 0(zero)
+        mtpr	t0, dtbAsn
+        mtpr	t0, itbAsn
+
+        LDLI	(t1,0x20000000)
+        STALL			// don't dual issue the load with mtpr -pb
+        mfpr	t0, icsr	// Enable superpage mapping
+        STALL			// don't dual issue the load with mtpr -pb
+        bis	t0, t1, t0
+        mtpr	t0, icsr
+
+        STALL			// Required stall to update chip ...
+        STALL
+        STALL
+        STALL
+        STALL
+
+        ldq_p	s0, PCB_Q_PTBR(a1)
+        sll	s0, VA_S_OFF, s0 // Shift PTBR into position
+        STALL			// don't dual issue the load with mtpr -pb
+        mtpr	s0, ptPtbr	// PHYSICAL MBOX INST -> MT PT20 IN 0,1
+        STALL			// don't dual issue the load with mtpr -pb
+        ldq_p	sp, PCB_Q_KSP(a1)
+
+        mtpr	a0, excAddr	// Load the dispatch address.
+        STALL			// don't dual issue the load with mtpr -pb
+        bis	a3, zero, a0	// first free PFN
+        ldq_p	a1, PCB_Q_PTBR(a1) // ptbr
+        ldq_p	a2, 24(zero)	// argc
+        ldq_p	a3, 32(zero)	// argv
+        ldq_p	a4, 40(zero)	// environ
+        lda	a5, 0(zero)	// osf_param
+        STALL			// don't dual issue the load with mtpr -pb
+        mtpr	zero, dtbIa	// Flush all D-stream TB entries
+        mtpr	zero, itbIa	// Flush all I-stream TB entries
+        br	zero, 2f
+
+        ALIGN_BLOCK
+
+2:      NOP
+        mtpr	zero, icFlush	// Flush the icache.
+        NOP
+        NOP
+
+        NOP			// Required NOPs ... 1-10
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+
+        NOP			// Required NOPs ... 11-20
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+
+        NOP			// Required NOPs ... 21-30
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+
+        NOP			// Required NOPs ... 31-40
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+
+        NOP			// Required NOPs ... 41-44
+        NOP
+        NOP
+        NOP
+
+        hw_rei_stall		// Dispatch to kernel
diff --git a/system/alpha/console/paljtoslave.S b/system/alpha/console/paljtoslave.S
new file mode 100644
index 0000000000..59cfb210d3
--- /dev/null
+++ b/system/alpha/console/paljtoslave.S
@@ -0,0 +1,161 @@
+/*
+ * Copyright (c) 2003-2004 The Regents of The University of Michigan
+ * Copyright (c) 1993 The Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "dc21164FromGasSources.h"	// DECchip 21164 specific definitions
+#include "ev5_defs.h"
+#include "fromHudsonOsf.h"		// OSF/1 specific definitions
+#include "fromHudsonMacros.h"		// Global macro definitions
+
+/*
+ * args:
+ *   a0: here
+ *   a1: boot location
+ *   a2: CSERVE_K_JTOPAL
+ *   a3: restart_pv
+ *   a4: vptb
+ *   a5: my_rpb
+ *
+ * SRM Console Architecture III 3-26
+ */
+
+        .global	palJToSlave
+        .text	3
+palJToSlave:
+
+        ALIGN_BRANCH
+
+        bis	a3, zero, pv
+        bis	zero, zero, t11
+        bis	zero, zero, ra
+
+        /* Point the Vptbr at the vptb passed in a4 */
+
+        mtpr	a4, mVptBr	// Load Mbox copy
+        mtpr	a4, iVptBr	// Load Ibox copy
+        STALL			// don't dual issue the load with mtpr -pb
+
+        /* Turn on superpage mapping in the mbox and icsr */
+        lda	t0, (2<<MCSR_V_SP)(zero) // Get a '10' (binary) in MCSR<SP>
+        STALL			// don't dual issue the load with mtpr -pb
+        mtpr	t0, mcsr	// Set the super page mode enable bit
+        STALL			// don't dual issue the load with mtpr -pb
+
+        lda	t0, 0(zero)
+        mtpr	t0, dtbAsn
+        mtpr	t0, itbAsn
+
+        LDLI	(t1,0x20000000)
+        STALL			// don't dual issue the load with mtpr -pb
+        mfpr	t0, icsr	// Enable superpage mapping
+        STALL			// don't dual issue the load with mtpr -pb
+        bis	t0, t1, t0
+        mtpr	t0, icsr
+
+        STALL			// Required stall to update chip ...
+        STALL
+        STALL
+        STALL
+        STALL
+
+        ldq_p	s0, PCB_Q_PTBR(a5)
+        sll	s0, VA_S_OFF, s0 // Shift PTBR into position
+        STALL			// don't dual issue the load with mtpr -pb
+        mtpr	s0, ptPtbr	// PHYSICAL MBOX INST -> MT PT20 IN 0,1
+        STALL			// don't dual issue the load with mtpr -pb
+        ldq_p	sp, PCB_Q_KSP(a5)
+
+        mtpr	zero, dtbIa	// Flush all D-stream TB entries
+        mtpr	zero, itbIa	// Flush all I-stream TB entries
+
+        mtpr	a1, excAddr	// Load the dispatch address.
+
+        STALL			// don't dual issue the load with mtpr -pb
+        STALL			// don't dual issue the load with mtpr -pb
+        mtpr	zero, dtbIa	// Flush all D-stream TB entries
+        mtpr	zero, itbIa	// Flush all I-stream TB entries
+        br	zero, 2f
+
+        ALIGN_BLOCK
+
+2:	NOP
+        mtpr	zero, icFlush	// Flush the icache.
+        NOP
+        NOP
+
+        NOP			// Required NOPs ... 1-10
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+
+        NOP			// Required NOPs ... 11-20
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+
+        NOP			// Required NOPs ... 21-30
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+
+        NOP			// Required NOPs ... 31-40
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+        NOP
+
+        NOP			// Required NOPs ... 41-44
+        NOP
+        NOP
+        NOP
+
+        hw_rei_stall		// Dispatch to kernel
+
diff --git a/system/alpha/console/printf.c b/system/alpha/console/printf.c
new file mode 100644
index 0000000000..3d8cb41084
--- /dev/null
+++ b/system/alpha/console/printf.c
@@ -0,0 +1,301 @@
+/*
+ * Copyright (c) 2003-2004 The Regents of The University of Michigan
+ * Copyright (c) 1993 The Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <sys/types.h>
+#include <stdarg.h>
+#include <stdint.h>
+#include "m5op.h"
+
+/* The string s is terminated by a '\0' */
+void
+PutString(const char *s)
+{
+    while (*s)
+        PutChar(*s++);
+}
+
+/* print c count times */
+void
+PutRepChar(char c, int count)
+{
+    while (count--)
+        PutChar(c);
+}
+
+/* put string reverse */
+void
+PutStringReverse(const char *s, int index)
+{
+    while (index-- > 0)
+        PutChar(s[index]);
+}
+
+/*
+ * prints value in radix, in a field width width, with fill
+ * character fill
+ * if radix is negative, print as signed quantity
+ * if width is negative, left justify
+ * if width is 0, use whatever is needed
+ * if fill is 0, use ' '
+ */
+void
+PutNumber(long value, int radix, int width, char fill)
+{
+    char buffer[40];
+    uint bufferindex = 0;
+    ulong uvalue;
+    ushort digit;
+    ushort left = 0;
+    ushort negative = 0;
+
+    if (fill == 0)
+        fill = ' ';
+
+    if (width < 0) {
+        width = -width;
+        left = 1;
+    }
+
+    if (width < 0 || width > 80)
+        width = 0;
+
+    if (radix < 0) {
+        radix = -radix;
+        if (value < 0) {
+            negative = 1;
+            value = -value;
+        }
+    }
+
+    switch (radix) {
+      case 8:
+      case 10:
+      case 16:
+        break;
+
+      default:
+        PutString("****");
+        return;
+    }
+
+    uvalue = value;
+
+    do {
+        if (radix != 16) {
+            digit = (ushort)(uvalue % radix);
+            uvalue /= radix;
+        } else {
+            digit = (ushort)(uvalue & 0xf);
+            uvalue = uvalue >> 4;
+        }
+        buffer[bufferindex] = digit + ((digit <= 9) ? '0' : ('A' - 10));
+        bufferindex += 1;
+    } while (uvalue != 0);
+
+  /* fill # ' ' and negative cannot happen at once */
+    if (negative) {
+        buffer[bufferindex] = '-';
+        bufferindex += 1;
+    }
+
+    if ((uint)width <= bufferindex) {
+        PutStringReverse(buffer, bufferindex);
+    } else {
+        width -= bufferindex;
+        if (!left)
+            PutRepChar(fill, width);
+        PutStringReverse(buffer, bufferindex);
+        if (left)
+            PutRepChar(fill, width);
+    }
+}
+
+ulong
+power(long base, long n)
+{
+    ulong p;
+
+    for (p = 1; n > 0; --n)
+        p = p * base;
+    return p;
+}
+
+void
+putFloat(double a, int fieldwidth, char fill)
+{
+    int i;
+    ulong b;
+
+    /*
+     *  Put out everything before the decimal place.
+     */
+    PutNumber(((ulong) a), 10, fieldwidth, fill);
+
+    /*
+     *  Output the decimal place.
+     */
+    PutChar('.' & 0x7f);
+
+    /*
+     *  Output the five digits after the decimal point.
+     */
+    for (i = 1; i < 6; i++) {
+        b = (ulong)(power(10, i) * (double)(a - (ulong) a));
+        PutChar((char)(b % 10) + '0');
+    }
+}
+
+const char *
+FormatItem(const char *f, va_list *ap)
+{
+    char c;
+    int fieldwidth = 0;
+    int leftjust = 0;
+    int radix = 0;
+    char fill = ' ';
+
+    if (*f == '0')
+        fill = '0';
+
+    while (c = *f++) {
+        if (c >= '0' && c <= '9') {
+            fieldwidth = (fieldwidth * 10) + (c - '0');
+        } else {
+            switch (c) {
+              case '\000':
+                return(--f);
+              case '%':
+                PutChar('%');
+                return(f);
+              case '-':
+                leftjust = 1;
+                break;
+              case 'c': {
+                  char a = (char)va_arg(*ap, int);
+
+                  if (leftjust)
+                      PutChar(a & 0x7f);
+                  if (fieldwidth > 0)
+                      PutRepChar(fill, fieldwidth - 1);
+                  if (!leftjust)
+                      PutChar(a & 0x7f);
+                  return(f);
+              }
+              case 's': {
+                  const char *a = va_arg(*ap, const char *);
+
+                  if (leftjust)
+                      PutString((const char *) a);
+                  if (fieldwidth > strlen((const char *) a))
+                      PutRepChar(fill, fieldwidth - strlen((const char *)a));
+                  if (!leftjust)
+                      PutString((const char *) a);
+                  return(f);
+              }
+              case 'd':
+                radix = -10;
+                break;
+              case 'u':
+                radix = 10;
+                break;
+              case 'x':
+                radix = 16;
+                break;
+              case 'X':
+                radix = 16;
+                break;
+              case 'o':
+                radix = 8;
+                break;
+              case 'f': {
+                  double a = va_arg(*ap, double);
+
+                  putFloat(a, fieldwidth, fill);
+                  return(f);
+              }
+              default:   /* unknown switch! */
+                radix = 3;
+                break;
+            }
+        }
+
+        if (radix)
+            break;
+    }
+
+    if (leftjust)
+        fieldwidth = -fieldwidth;
+
+    long a = va_arg(*ap, long);
+    PutNumber(a, radix, fieldwidth, fill);
+
+    return(f);
+}
+
+int
+printf(const char *f, ...)
+{
+    va_list ap;
+
+    va_start(ap, f);
+
+    while (*f) {
+        if (*f == '%')
+            f = FormatItem(f + 1, &ap);
+        else
+            PutChar(*f++);
+    }
+
+    if (*(f - 1) == '\n') {
+        /* add a carriage return (SimOS console output goes to shell) */
+        PutChar('\r');
+    }
+
+    va_end(ap);         /* clean up */
+    return 0;
+}
+
+void
+panic(const char *f, ...)
+{
+    va_list ap;
+
+    va_start(ap, f);
+
+    printf("CONSOLE PANIC (looping): ");
+    while (*f) {
+        if (*f == '%')
+            f = FormatItem(f + 1, &ap);
+        else
+            PutChar(*f++);
+    }
+
+    va_end(ap);         /* clean up */
+    m5_panic();
+}
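
Reviewer sketch for the putFloat() loop above: the fractional digits are produced by scaling the fractional part by 10^i and keeping the last decimal digit, so the output is truncated rather than rounded. The following is a minimal standalone illustration, not part of the patch. The names ipow() and frac_digits() are hypothetical, it writes into a buffer instead of calling PutChar(), and it assumes a non-negative input.

```c
#include <assert.h>

/* Integer power helper, standing in for the patch's power(). */
static unsigned long ipow(unsigned long base, int n)
{
    unsigned long p = 1;

    while (n-- > 0)
        p *= base;
    return p;
}

/*
 * Hypothetical helper: store the first five fractional digits of a
 * non-negative value in out, which must hold at least 6 bytes.
 */
static void frac_digits(double a, char *out)
{
    double frac;
    int i;

    assert(a >= 0.0);               /* the cast below needs a >= 0 */
    frac = a - (double)(unsigned long)a;

    for (i = 1; i < 6; i++) {
        /* Scale by 10^i, truncate, and keep the lowest digit. */
        unsigned long b = (unsigned long)((double)ipow(10, i) * frac);

        out[i - 1] = (char)('0' + b % 10);
    }
    out[5] = '\0';
}
```

For example, frac_digits(3.25, buf) leaves "25000" in buf. Because each digit is truncated, values that are not exactly representable in binary can print one unit lower in the last place than a rounding formatter would.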
diff --git a/system/alpha/h/cserve.h b/system/alpha/h/cserve.h
new file mode 100644
index 0000000000..5b0e0f61c6
--- /dev/null
+++ b/system/alpha/h/cserve.h
@@ -0,0 +1,52 @@
+/*
+ * Copyright (c) 1993 The Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#define	__CSERVE_LOADED	1
+
+/*
+ * Console Service (cserve) sub-function codes:
+ */
+#define CSERVE_K_LDQP           0x01
+#define CSERVE_K_STQP           0x02
+#define CSERVE_K_JTOPAL         0x09
+#define CSERVE_K_WR_INT         0x0A
+#define CSERVE_K_RD_IMPURE      0x0B
+#define CSERVE_K_PUTC           0x0F
+#define CSERVE_K_WR_ICSR	0x10
+#define CSERVE_K_WR_ICCSR	0x10    /* for ev4 backwards compatibility */
+#define CSERVE_K_RD_ICSR	0x11
+#define CSERVE_K_RD_ICCSR	0x11    /* for ev4 backwards compatibility */
+#define CSERVE_K_RD_BCCTL	0x12
+#define CSERVE_K_RD_BCCFG	0x13
+
+#define CSERVE_K_WR_BCACHE      0x16
+
+#define CSERVE_K_RD_BCCFG_OFF   0x17
+#define CSERVE_K_JTOKERN	0x18
+
+
diff --git a/system/alpha/h/dc21164FromGasSources.h b/system/alpha/h/dc21164FromGasSources.h
new file mode 100644
index 0000000000..038861d360
--- /dev/null
+++ b/system/alpha/h/dc21164FromGasSources.h
@@ -0,0 +1,886 @@
+/*
+ * Copyright (c) 1993 The Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef DC21164FROMGASSOURCES_INCLUDED
+#define	DC21164FROMGASSOURCES_INCLUDED	1
+
+/*
+**
+**  INTERNAL PROCESSOR REGISTER DEFINITIONS
+**
+**  The internal processor register definitions below are annotated
+**  with one of the following symbols:
+**
+**	RW - The register may be read and written
+**	RO - The register may only be read
+**	WO - The register may only be written
+**
+**  For RO and WO registers, all bits and fields within the register are
+**  also read-only or write-only.  For RW registers, each bit or field
+**  within the register is annotated with one of the following:
+**
+**	RW  - The bit/field may be read and written
+** 	RO  - The bit/field may be read; writes are ignored
+**	WO  - The bit/field may be written; reads return UNPREDICTABLE
+**	WZ  - The bit/field may be written; reads return a zero value
+**	W0C - The bit/field may be read; write-zero-to-clear
+**	W1C - The bit/field may be read; write-one-to-clear
+**	WA  - The bit/field may be read; write-anything-to-clear
+**	RC  - The bit/field may be read, causing state to clear;
+**	      writes are ignored
+**
+*/
+
+
+/*
+**
+**  Ibox IPR Definitions:
+**
+*/
+
+// replaced by ev5_defs.h #define isr		0x100	/* RO - Interrupt Summary */
+#define itbTag		0x101	/* WO - ITB Tag */
+#define	itbPte		0x102	/* RW - ITB Page Table Entry */
+#define itbAsn		0x103	/* RW - ITB Address Space Number */
+#define itbPteTemp	0x104	/* RO - ITB Page Table Entry Temporary */
+#define	itbIa		0x105	/* WO - ITB Invalidate All */
+#define itbIap		0x106	/* WO - ITB Invalidate All Process */
+#define itbIs		0x107	/* WO - ITB Invalidate Single */
+// replaced by ev5_defs.h #define sirr		0x108	/* RW - Software Interrupt Request */
+// replaced by ev5_defs.h #define astrr		0x109	/* RW - Async. System Trap Request */
+// replaced by ev5_defs.h #define aster		0x10A	/* RW - Async. System Trap Enable */
+#define excAddr		0x10B	/* RW - Exception Address */
+#define excSum		0x10C	/* RW - Exception Summary */
+#define excMask		0x10D	/* RO - Exception Mask */
+#define palBase		0x10E	/* RW - PAL Base */
+#define ips		0x10F	/* RW - Processor Status */
+// replaced by ev5_defs.h #define ipl		0x110	/* RW - Interrupt Priority Level */
+#define intId		0x111	/* RO - Interrupt ID */
+#define iFaultVaForm	0x112	/* RO - Formatted Faulting VA */
+#define iVptBr		0x113	/* RW - I-Stream Virtual Page Table Base */
+#define hwIntClr	0x115	/* WO - Hardware Interrupt Clear */
+#define slXmit		0x116	/* WO - Serial Line Transmit */
+#define slRcv		0x117	/* RO - Serial Line Receive */
+// replaced by ev5_defs.h #define icsr		0x118	/* RW - Ibox Control/Status */
+#define icFlush		0x119	/* WO - I-Cache Flush Control */
+#define flushIc         0x119   /* WO - I-Cache Flush Control (DC21064 Symbol) */
+#define icPerr		0x11A	/* RW - I-Cache Parity Error Status */
+#define PmCtr		0x11C	/* RW - Performance Counter */
+
+/*
+**
+**  Ibox Control/Status Register (ICSR) Bit Summary
+**
+**	Extent	Size	Name	Type	Function
+**	------	----	----	----	------------------------------------
+**	 <39>	 1	TST	RW,0	Assert Test Status
+**	 <38>	 1	ISTA	RO	I-Cache BIST Status
+**	 <37>	 1	DBS	RW,1	Debug Port Select
+**	 <36>	 1	FBD	RW,0	Force Bad I-Cache Data Parity
+**	 <35>	 1	FBT	RW,0	Force Bad I-Cache Tag Parity
+**	 <34>	 1	FMS	RW,0	Force I-Cache Miss
+**	 <33>	 1	SLE	RW,0	Enable Serial Line Interrupts
+**	 <32>	 1	CRDE	RW,0	Enable Correctable Error Interrupts
+**	 <30>	 1	SDE	RW,0	Enable PAL Shadow Registers
+**	<29:28>	 2	SPE	RW,0	Enable I-Stream Super Page Mode
+**	 <27>	 1	HWE	RW,0	Enable PALRES Instrs in Kernel Mode
+**	 <26>	 1	FPE	RW,0	Enable Floating Point Instructions
+**	 <25>	 1	TMD	RW,0	Disable Ibox Timeout Counter
+**	 <24>	 1	TMM	RW,0	Timeout Counter Mode
+**
+*/
+
+#define ICSR_V_TST	39
+#define ICSR_M_TST	(1<<ICSR_V_TST)
+#define ICSR_V_ISTA	38
+#define ICSR_M_ISTA	(1<<ICSR_V_ISTA)
+#define ICSR_V_DBS	37
+#define ICSR_M_DBS	(1<<ICSR_V_DBS)
+#define ICSR_V_FBD	36
+#define ICSR_M_FBD	(1<<ICSR_V_FBD)
+#define ICSR_V_FBT	35
+#define	ICSR_M_FBT	(1<<ICSR_V_FBT)
+#define ICSR_V_FMS	34
+#define ICSR_M_FMS	(1<<ICSR_V_FMS)
+#define	ICSR_V_SLE	33
+#define ICSR_M_SLE	(1<<ICSR_V_SLE)
+#define ICSR_V_CRDE	32
+#define ICSR_M_CRDE	(1<<ICSR_V_CRDE)
+#define ICSR_V_SDE	30
+#define ICSR_M_SDE	(1<<ICSR_V_SDE)
+#define ICSR_V_SPE	28
+#define ICSR_M_SPE	(3<<ICSR_V_SPE)
+#define ICSR_V_HWE	27
+#define ICSR_M_HWE	(1<<ICSR_V_HWE)
+#define ICSR_V_FPE	26
+#define ICSR_M_FPE	(1<<ICSR_V_FPE)
+#define ICSR_V_TMD	25
+#define ICSR_M_TMD	(1<<ICSR_V_TMD)
+#define ICSR_V_TMM	24
+#define ICSR_M_TMM	(1<<ICSR_V_TMM)
+
+/*
+**
+**  Serial Line Transmit Register (SL_XMIT)
+**
+**	Extent	Size	Name	Type	Function
+**	------	----	----	----	------------------------------------
+**	 <7>	 1	TMT	WO,1	Serial line transmit data
+**
+*/
+
+#define	SLXMIT_V_TMT   	7
+#define SLXMIT_M_TMT	(1<<SLXMIT_V_TMT)
+
+/*
+**
+**  Serial Line Receive Register (SL_RCV)
+**
+**	Extent	Size	Name	Type	Function
+**	------	----	----	----	------------------------------------
+**	 <6>	 1	RCV	RO	Serial line receive data
+**
+*/
+
+#define	SLRCV_V_RCV   	6
+#define SLRCV_M_RCV	(1<<SLRCV_V_RCV)
+
+/*
+**
+**  Icache Parity Error Status Register (ICPERR) Bit Summary
+**
+**	Extent	Size	Name	Type	Function
+**	------	----	----	----	------------------------------------
+**	 <13>	 1	TMR	W1C	Timeout reset error
+**	 <12>	 1	TPE	W1C	Tag parity error
+**	 <11>	 1	DPE	W1C	Data parity error
+**
+*/
+
+#define	ICPERR_V_TMR   	13
+#define ICPERR_M_TMR	(1<<ICPERR_V_TMR)
+#define ICPERR_V_TPE	12
+#define ICPERR_M_TPE	(1<<ICPERR_V_TPE)
+#define ICPERR_V_DPE	11
+#define ICPERR_M_DPE	(1<<ICPERR_V_DPE)
+
+#define ICPERR_M_ALL	(ICPERR_M_TMR | ICPERR_M_TPE | ICPERR_M_DPE)
+
+/*
+**
+**  Exception Summary Register (EXC_SUM) Bit Summary
+**
+**	Extent	Size	Name	Type	Function
+**	------	----	----	----	------------------------------------
+**	 <16>	 1	IOV	 WA	Integer overflow
+**	 <15>	 1	INE	 WA	Inexact result
+**	 <14>	 1	UNF	 WA	Underflow
+**	 <13>	 1	FOV	 WA	Overflow
+**	 <12>	 1	DZE	 WA	Division by zero
+**	 <11>	 1	INV	 WA	Invalid operation
+**	 <10>	 1	SWC	 WA	Software completion
+**
+*/
+
+#define EXC_V_IOV	16
+#define EXC_M_IOV	(1<<EXC_V_IOV)
+#define EXC_V_INE	15
+#define EXC_M_INE	(1<<EXC_V_INE)
+#define EXC_V_UNF	14
+#define EXC_M_UNF	(1<<EXC_V_UNF)
+#define EXC_V_FOV	13
+#define EXC_M_FOV	(1<<EXC_V_FOV)
+#define EXC_V_DZE	12
+#define	EXC_M_DZE	(1<<EXC_V_DZE)
+#define EXC_V_INV	11
+#define EXC_M_INV	(1<<EXC_V_INV)
+#define	EXC_V_SWC	10
+#define EXC_M_SWC	(1<<EXC_V_SWC)
+
+/*
+**
+**  Hardware Interrupt Clear Register (HWINT_CLR) Bit Summary
+**
+**	 Extent	Size	Name	Type	Function
+**	 ------	----	----	----	---------------------------------
+**	  <33>	  1	SLC	W1C	Clear Serial Line interrupt
+**	  <32>	  1	CRDC	W1C	Clear Correctable Read Data interrupt
+**	  <29>	  1	PC2C	W1C	Clear Performance Counter 2 interrupt
+**	  <28>	  1	PC1C	W1C	Clear Performance Counter 1 interrupt
+**	  <27>	  1	PC0C    W1C	Clear Performance Counter 0 interrupt
+**
+*/
+
+#define HWINT_V_SLC	33
+#define HWINT_M_SLC	(1<<HWINT_V_SLC)
+#define HWINT_V_CRDC	32
+#define HWINT_M_CRDC	(1<<HWINT_V_CRDC)
+#define HWINT_V_PC2C	29
+#define HWINT_M_PC2C	(1<<HWINT_V_PC2C)
+#define HWINT_V_PC1C	28
+#define HWINT_M_PC1C	(1<<HWINT_V_PC1C)
+#define HWINT_V_PC0C	27
+#define HWINT_M_PC0C	(1<<HWINT_V_PC0C)
+
+/*
+**
+**  Interrupt Summary Register (ISR) Bit Summary
+**
+**	 Extent	Size	Name	Type	Function
+**	 ------	----	----	----	---------------------------------
+**	  <34>	  1	HLT    	RO	External Halt interrupt
+**	  <33>	  1	SLI	RO	Serial Line interrupt
+**	  <32>	  1	CRD	RO	Correctable ECC errors
+**	  <31>	  1	MCK	RO	System Machine Check
+**	  <30>	  1	PFL	RO	Power Fail
+**	  <29>	  1	PC2	RO	Performance Counter 2 interrupt
+**	  <28>	  1	PC1	RO	Performance Counter 1 interrupt
+**	  <27>	  1	PC0	RO	Performance Counter 0 interrupt
+**	  <23>	  1	I23	RO	External Hardware interrupt
+**	  <22>	  1	I22	RO	External Hardware interrupt
+**	  <21>	  1	I21	RO	External Hardware interrupt
+**	  <20>	  1	I20	RO	External Hardware interrupt
+**	  <19>	  1	ATR	RO	Async. System Trap request
+**	 <18:4>	 15	SIRR	RO,0	Software Interrupt request
+**	  <3:0>	  4	ASTRR	RO	Async. System Trap request (USEK)
+**
+**/
+
+#define ISR_V_HLT	34
+#define ISR_M_HLT	(1<<ISR_V_HLT)
+#define ISR_V_SLI	33
+#define ISR_M_SLI	(1<<ISR_V_SLI)
+#define ISR_V_CRD	32
+#define ISR_M_CRD	(1<<ISR_V_CRD)
+#define ISR_V_MCK	31
+#define ISR_M_MCK	(1<<ISR_V_MCK)
+#define ISR_V_PFL	30
+#define ISR_M_PFL	(1<<ISR_V_PFL)
+#define ISR_V_PC2	29
+#define ISR_M_PC2	(1<<ISR_V_PC2)
+#define ISR_V_PC1	28
+#define ISR_M_PC1	(1<<ISR_V_PC1)
+#define ISR_V_PC0	27
+#define ISR_M_PC0	(1<<ISR_V_PC0)
+#define ISR_V_I23	23
+#define ISR_M_I23	(1<<ISR_V_I23)
+#define ISR_V_I22	22
+#define ISR_M_I22	(1<<ISR_V_I22)
+#define ISR_V_I21	21
+#define ISR_M_I21	(1<<ISR_V_I21)
+#define ISR_V_I20	20
+#define ISR_M_I20	(1<<ISR_V_I20)
+#define ISR_V_ATR	19
+#define ISR_M_ATR	(1<<ISR_V_ATR)
+#define ISR_V_SIRR	4
+#define ISR_M_SIRR	(0x7FFF<<ISR_V_SIRR)
+#define ISR_V_ASTRR	0
+#define ISR_M_ASTRR	(0xF<<ISR_V_ASTRR)
+
+/*
+**
+**  Mbox and D-Cache IPR Definitions:
+**
+*/
+
+#define dtbAsn		0x200	/* WO - DTB Address Space Number */
+#define dtbCm		0x201	/* WO - DTB Current Mode */
+#define dtbTag		0x202	/* WO - DTB Tag */
+#define dtbPte		0x203	/* RW - DTB Page Table Entry */
+#define dtbPteTemp	0x204	/* RO - DTB Page Table Entry Temporary */
+#define mmStat		0x205	/* RO - D-Stream MM Fault Status */
+// replaced by ev5_defs.h #define va		0x206	/* RO - Faulting Virtual Address */
+#define vaForm		0x207	/* RO - Formatted Virtual Address */
+#define mVptBr		0x208	/* WO - Mbox Virtual Page Table Base */
+#define dtbIap		0x209	/* WO - DTB Invalidate All Process */
+#define dtbIa		0x20A	/* WO - DTB Invalidate All */
+#define dtbIs		0x20B	/* WO - DTB Invalidate Single */
+#define altMode		0x20C	/* WO - Alternate Mode */
+// replaced by ev5_defs.h #define cc		0x20D	/* WO - Cycle Counter */
+#define ccCtl		0x20E	/* WO - Cycle Counter Control */
+// replaced by ev5_defs.h #define mcsr		0x20F	/* RW - Mbox Control Register */
+#define dcFlush		0x210	/* WO - Dcache Flush */
+#define dcPerr	        0x212	/* RW - Dcache Parity Error Status */
+#define dcTestCtl	0x213	/* RW - Dcache Test Tag Control */
+#define dcTestTag	0x214	/* RW - Dcache Test Tag */
+#define dcTestTagTemp	0x215	/* RW - Dcache Test Tag Temporary */
+#define dcMode		0x216	/* RW - Dcache Mode */
+#define mafMode		0x217	/* RW - Miss Address File Mode */
+
+/*
+**
+**  D-Stream MM Fault Status Register (MM_STAT) Bit Summary
+**
+**	 Extent	Size	Name	  Type	Function
+**	 ------	----	----	  ----	---------------------------------
+**	<16:11>	  6	OPCODE 	  RO	Opcode of faulting instruction
+**	<10:06>	  5	RA	  RO	Ra field of faulting instruction
+**          <5>	  1	BAD_VA	  RO	Bad virtual address
+**	    <4>	  1	DTB_MISS  RO	Reference resulted in DTB miss
+**	    <3>	  1	FOW	  RO	Fault on write
+**	    <2>	  1	FOR	  RO	Fault on read
+**	    <1>   1     ACV	  RO	Access violation
+**          <0>	  1	WR	  RO	Reference type
+**
+*/
+
+#define	MMSTAT_V_OPC		11
+#define MMSTAT_M_OPC		(0x3F<<MMSTAT_V_OPC)
+#define MMSTAT_V_RA		6
+#define MMSTAT_M_RA		(0x1F<<MMSTAT_V_RA)
+#define MMSTAT_V_BAD_VA		5
+#define MMSTAT_M_BAD_VA		(1<<MMSTAT_V_BAD_VA)
+#define MMSTAT_V_DTB_MISS	4
+#define MMSTAT_M_DTB_MISS	(1<<MMSTAT_V_DTB_MISS)
+#define MMSTAT_V_FOW		3
+#define MMSTAT_M_FOW		(1<<MMSTAT_V_FOW)
+#define MMSTAT_V_FOR		2
+#define MMSTAT_M_FOR		(1<<MMSTAT_V_FOR)
+#define MMSTAT_V_ACV		1
+#define MMSTAT_M_ACV		(1<<MMSTAT_V_ACV)
+#define MMSTAT_V_WR		0
+#define MMSTAT_M_WR		(1<<MMSTAT_V_WR)
+
+
+/*
+**
+** Mbox Control Register (MCSR) Bit Summary
+**
+**	 Extent	Size	Name	Type	Function
+**	 ------	----	----	----	---------------------------------
+**	   <5>	  1	DBG1	RW,0   	Mbox Debug Packet Select
+**	   <4>	  1	E_BE	RW,0	Ebox Big Endian mode enable
+**	   <3>	  1	DBG0	RW,0	Debug Test Select
+**	  <2:1>	  2	SP	RW,0   	Superpage mode enable
+**	   <0>	  1	M_BE	RW,0    Mbox Big Endian mode enable
+**
+*/
+
+#define MCSR_V_DBG1	5
+#define MCSR_M_DBG1	(1<<MCSR_V_DBG1)
+#define MCSR_V_E_BE	4
+#define MCSR_M_E_BE	(1<<MCSR_V_E_BE)
+#define MCSR_V_DBG0	3
+#define MCSR_M_DBG0	(1<<MCSR_V_DBG0)
+#define MCSR_V_SP	1
+#define MCSR_M_SP	(3<<MCSR_V_SP)
+#define MCSR_V_M_BE	0
+#define MCSR_M_M_BE	(1<<MCSR_V_M_BE)
+
+/*
+**
+**  Dcache Parity Error Status Register (DCPERR) Bit Summary
+**
+**	Extent	Size	Name	Type	Function
+**	------	----	----	----	------------------------------------
+**	 <5>	 1	TP1	RO	Dcache bank 1 tag parity error
+**	 <4>	 1	TP0	RO	Dcache bank 0 tag parity error
+**	 <3>	 1	DP1	RO	Dcache bank 1 data parity error
+**	 <2>	 1	DP0	RO	Dcache bank 0 data parity error
+**	 <1>	 1	LOCK	W1C	Locks/clears bits <5:2>
+**	 <0>	 1	SEO	W1C	Second Dcache parity error occurred
+**
+*/
+
+#define DCPERR_V_TP1	5
+#define DCPERR_M_TP1	(1<<DCPERR_V_TP1)
+#define	DCPERR_V_TP0   	4
+#define DCPERR_M_TP0	(1<<DCPERR_V_TP0)
+#define DCPERR_V_DP1	3
+#define DCPERR_M_DP1	(1<<DCPERR_V_DP1)
+#define DCPERR_V_DP0    2
+#define DCPERR_M_DP0	(1<<DCPERR_V_DP0)
+#define DCPERR_V_LOCK	1
+#define DCPERR_M_LOCK	(1<<DCPERR_V_LOCK)
+#define DCPERR_V_SEO	0
+#define DCPERR_M_SEO	(1<<DCPERR_V_SEO)
+
+#define DCPERR_M_ALL	(DCPERR_M_LOCK | DCPERR_M_SEO)
+
+/*
+**
+**  Dcache Mode Register (DC_MODE) Bit Summary
+**
+**	 Extent	Size	Name	  Type	Function
+**	 ------	----	----	  ----	---------------------------------
+**	   <4>	  1	DOA	  RO    Hardware Dcache Disable
+**	   <3>	  1	PERR_DIS  RW,0	Disable Dcache Parity Error reporting
+**	   <2>	  1	BAD_DP	  RW,0	Force Dcache data bad parity
+**	   <1>	  1	FHIT	  RW,0	Force Dcache hit
+**	   <0>	  1	ENA 	  RW,0	Software Dcache Enable
+**
+*/
+
+#define	DC_V_DOA	4
+#define DC_M_DOA        (1<<DC_V_DOA)
+#define DC_V_PERR_DIS	3
+#define DC_M_PERR_DIS	(1<<DC_V_PERR_DIS)
+#define DC_V_BAD_DP	2
+#define DC_M_BAD_DP	(1<<DC_V_BAD_DP)
+#define DC_V_FHIT	1
+#define DC_M_FHIT	(1<<DC_V_FHIT)
+#define DC_V_ENA	0
+#define DC_M_ENA	(1<<DC_V_ENA)
+
+/*
+**
+**  Miss Address File Mode Register (MAF_MODE) Bit Summary
+**
+**	 Extent	Size	Name	  Type	Function
+**	 ------	----	----	  ----	---------------------------------
+**         <7>    1     WB        RO,0  If set, pending WB request
+**	   <6>	  1	DREAD	  RO,0  If set, pending D-read request
+**
+*/
+
+#define MAF_V_WB_PENDING        7
+#define MAF_M_WB_PENDING        (1<<MAF_V_WB_PENDING)
+#define MAF_V_DREAD_PENDING     6
+#define MAF_M_DREAD_PENDING     (1<<MAF_V_DREAD_PENDING)
+
+/*
+**
+**  Cbox IPR Definitions:
+**
+*/
+
+#define scCtl		0x0A8	/* RW - Scache Control */
+#define scStat		0x0E8	/* RO - Scache Error Status */
+#define scAddr		0x188	/* RO - Scache Error Address */
+#define	bcCtl		0x128	/* WO - Bcache/System Interface Control */
+#define bcCfg		0x1C8	/* WO - Bcache Configuration Parameters */
+#define bcTagAddr	0x108	/* RO - Bcache Tag */
+#define eiStat		0x168	/* RO - Bcache/System Error Status */
+#define eiAddr		0x148	/* RO - Bcache/System Error Address */
+#define fillSyn		0x068	/* RO - Fill Syndrome */
+#define ldLock		0x1E8	/* RO - LDx_L Address */
+
+/*
+**
+**  Scache Control Register (SC_CTL) Bit Summary
+**
+**	 Extent	Size	Name	  Type	Function
+**	 ------	----	----	  ----	---------------------------------
+**	 <15:13>  3	SET_EN	  RW,1  Set enable
+**	    <12>  1	BLK_SIZE  RW,1	Scache/Bcache block size select
+**	 <11:08>  4	FB_DP	  RW,0	Force bad data parity
+**	 <07:02>  6	TAG_STAT  RW	Tag status and parity
+**	     <1>  1	FLUSH	  RW,0	If set, clear all tag valid bits
+**	     <0>  1     FHIT	  RW,0  Force hits
+**
+*/
+
+#define	SC_V_SET_EN	13
+#define SC_M_SET_EN	(7<<SC_V_SET_EN)
+#define SC_V_BLK_SIZE	12
+#define SC_M_BLK_SIZE	(1<<SC_V_BLK_SIZE)
+#define SC_V_FB_DP	8
+#define SC_M_FB_DP	(0xF<<SC_V_FB_DP)
+#define SC_V_TAG_STAT	2
+#define SC_M_TAG_STAT	(0x3F<<SC_V_TAG_STAT)
+#define SC_V_FLUSH	1
+#define SC_M_FLUSH	(1<<SC_V_FLUSH)
+#define SC_V_FHIT	0
+#define SC_M_FHIT	(1<<SC_V_FHIT)
+
+/*
+**
+**  Bcache Control Register (BC_CTL) Bit Summary
+**
+**	 Extent	Size  Name	    Type  Function
+**	 ------	----  ----	    ----  ---------------------------------
+**	    <27>  1   DIS_VIC_BUF   WO,0  Disable Scache victim buffer
+**	    <26>  1   DIS_BAF_BYP   WO,0  Disable speculative Bcache reads
+**	    <25>  1   DBG_MUX_SEL   WO,0  Debug MUX select
+**	 <24:19>  6   PM_MUX_SEL    WO,0  Performance counter MUX select
+**       <18:17>  2   BC_WAVE       WO,0  Number of cycles of wave pipelining
+**	    <16>  1   TL_PIPE_LATCH WO,0  Pipe system control pins
+**	    <15>  1   EI_DIS_ERR    WO,1  Disable ECC (parity) error
+**       <14:13>  2   BC_BAD_DAT    WO,0  Force bad data
+**       <12:08>  5   BC_TAG_STAT   WO    Bcache tag status and parity
+**           <7>  1   BC_FHIT       WO,0  Bcache force hit
+**           <6>  1   EI_ECC        WO,1  ECC or byte parity mode
+**           <5>  1   VTM_FIRST     WO,1  Drive out victim block address first
+**           <4>  1   CORR_FILL_DAT WO,1  Correct fill data
+**           <3>  1   EI_CMD_GRP3   WO,0  Drive MB command to external pins
+**           <2>  1   EI_CMD_GRP2   WO,0  Drive LOCK & SET_DIRTY to ext. pins
+**           <1>  1   ALLOC_CYC     WO,0  Allocate cycle for non-cached LDs.
+**           <0>  1   BC_ENA        WO,0  Bcache enable
+**
+*/
+#define BC_V_DIS_SC_VIC_BUF	27
+#define BC_M_DIS_SC_VIC_BUF	(1<<BC_V_DIS_SC_VIC_BUF)
+#define BC_V_DIS_BAF_BYP	26
+#define BC_M_DIS_BAF_BYP	(1<<BC_V_DIS_BAF_BYP)
+#define BC_V_DBG_MUX_SEL	25
+#define BC_M_DBG_MUX_SEL	(1<<BC_V_DBG_MUX_SEL)
+#define BC_V_PM_MUX_SEL		19
+#define BC_M_PM_MUX_SEL		(0x3F<<BC_V_PM_MUX_SEL)
+#define BC_V_BC_WAVE		17
+#define BC_M_BC_WAVE		(3<<BC_V_BC_WAVE)
+#define BC_V_TL_PIPE_LATCH	16
+#define BC_M_TL_PIPE_LATCH	(1<<BC_V_TL_PIPE_LATCH)
+#define BC_V_EI_DIS_ERR		15
+#define BC_M_EI_DIS_ERR		(1<<BC_V_EI_DIS_ERR)
+#define BC_V_BC_BAD_DAT		13
+#define BC_M_BC_BAD_DAT		(3<<BC_V_BC_BAD_DAT)
+#define BC_V_BC_TAG_STAT	8
+#define BC_M_BC_TAG_STAT	(0x1F<<BC_V_BC_TAG_STAT)
+#define BC_V_BC_FHIT		7
+#define BC_M_BC_FHIT		(1<<BC_V_BC_FHIT)
+#define BC_V_EI_ECC_OR_PARITY	6
+#define BC_M_EI_ECC_OR_PARITY	(1<<BC_V_EI_ECC_OR_PARITY)
+#define BC_V_VTM_FIRST		5
+#define BC_M_VTM_FIRST		(1<<BC_V_VTM_FIRST)
+#define BC_V_CORR_FILL_DAT	4
+#define BC_M_CORR_FILL_DAT	(1<<BC_V_CORR_FILL_DAT)
+#define BC_V_EI_CMD_GRP3	3
+#define BC_M_EI_CMD_GRP3	(1<<BC_V_EI_CMD_GRP3)
+#define BC_V_EI_CMD_GRP2	2
+#define BC_M_EI_CMD_GRP2	(1<<BC_V_EI_CMD_GRP2)
+#define BC_V_ALLOC_CYC		1
+#define BC_M_ALLOC_CYC		(1<<BC_V_ALLOC_CYC)
+#define BC_V_BC_ENA		0
+#define BC_M_BC_ENA		(1<<BC_V_BC_ENA)
+
+#define BC_K_DFAULT \
+        (((BC_M_EI_DIS_ERR)       | \
+          (BC_M_EI_ECC_OR_PARITY) | \
+          (BC_M_VTM_FIRST)        | \
+          (BC_M_CORR_FILL_DAT))>>1)
+/*
+**
+**  Bcache Configuration Register (BC_CONFIG) Bit Summary
+**
+**	 Extent	Size  Name	    Type  Function
+**	 ------	----  ----	    ----  ---------------------------------
+**	<35:29>   7   RSVD	    WO    Reserved - Must Be Zero
+**	<28:20>   9   WE_CTL        WO,0  Bcache write enable control
+**	<19:19>   1   RSVD	    WO,0  Reserved - Must Be Zero
+**	<18:16>   3   WE_OFF        WO,1  Bcache fill write enable pulse offset
+**	<15:15>   1   RSVD          WO,0  Reserved - Must Be Zero
+**	<14:12>   3   RD_WR_SPC     WO,7  Bcache private read/write spacing
+**	<11:08>   4   WR_SPD        WO,4  Bcache write speed in CPU cycles
+**	<07:04>   4   RD_SPD	    WO,4  Bcache read speed in CPU cycles
+**	<03:03>   1   RSVD	    WO,0  Reserved - Must Be Zero
+**	<02:00>   3   SIZE	    WO,1  Bcache size
+*/
+#define	BC_V_WE_CTL	20
+#define BC_M_WE_CTL	(0x1FF<<BC_V_WE_CTL)
+#define BC_V_WE_OFF	16
+#define BC_M_WE_OFF	(0x7<<BC_V_WE_OFF)
+#define BC_V_RD_WR_SPC	12
+#define BC_M_RD_WR_SPC	(0x7<<BC_V_RD_WR_SPC)
+#define BC_V_WR_SPD	8
+#define BC_M_WR_SPD	(0xF<<BC_V_WR_SPD)
+#define BC_V_RD_SPD	4
+#define BC_M_RD_SPD	(0xF<<BC_V_RD_SPD)
+#define BC_V_SIZE	0
+#define BC_M_SIZE	(0x7<<BC_V_SIZE)
+
+#define BC_K_CONFIG \
+        ((0x1<<BC_V_WE_OFF)    | \
+         (0x7<<BC_V_RD_WR_SPC) | \
+         (0x4<<BC_V_WR_SPD)    | \
+         (0x4<<BC_V_RD_SPD)    | \
+         (0x1<<BC_V_SIZE))
+
+/*
+**
+**  DECchip 21164 Privileged Architecture Library Entry Offsets:
+**
+**	Entry Name	    Offset (Hex)
+**
+**	RESET			0000
+**	IACCVIO			0080
+**	INTERRUPT	       	0100
+**	ITB_MISS		0180
+**	DTB_MISS (Single)       0200
+**	DTB_MISS (Double)       0280
+**	UNALIGN			0300
+**	D_FAULT			0380
+**	MCHK			0400
+**	OPCDEC			0480
+**	ARITH			0500
+**	FEN			0580
+**	CALL_PAL (Privileged)	2000
+**	CALL_PAL (Unprivileged)	3000
+**
+*/
+
+#define PAL_RESET_ENTRY		    0x0000
+#define PAL_IACCVIO_ENTRY	    0x0080
+#define PAL_INTERRUPT_ENTRY	    0x0100
+#define PAL_ITB_MISS_ENTRY	    0x0180
+#define PAL_DTB_MISS_ENTRY	    0x0200
+#define PAL_DOUBLE_MISS_ENTRY	    0x0280
+#define PAL_UNALIGN_ENTRY	    0x0300
+#define PAL_D_FAULT_ENTRY	    0x0380
+#define PAL_MCHK_ENTRY		    0x0400
+#define PAL_OPCDEC_ENTRY	    0x0480
+#define PAL_ARITH_ENTRY	    	    0x0500
+#define PAL_FEN_ENTRY		    0x0580
+#define PAL_CALL_PAL_PRIV_ENTRY	    0x2000
+#define PAL_CALL_PAL_UNPRIV_ENTRY   0x3000
+
+/*
+**
+** Architecturally Reserved Opcode (PALRES) Definitions:
+**
+*/
+
+#define	mtpr	    hw_mtpr
+#define	mfpr	    hw_mfpr
+
+#define	ldl_a	    hw_ldl/a
+#define ldq_a	    hw_ldq/a
+#define stq_a	    hw_stq/a
+#define stl_a	    hw_stl/a
+
+#define ldl_p	    hw_ldl/p
+#define ldq_p	    hw_ldq/p
+#define stl_p	    hw_stl/p
+#define stq_p	    hw_stq/p
+
+/*
+** Virtual PTE fetch variants of HW_LD.
+*/
+#define ld_vpte     hw_ldq/v
+
+/*
+** Physical mode load-lock and store-conditional variants of
+** HW_LD and HW_ST.
+*/
+
+#define ldq_lp	    hw_ldq/pl
+#define stq_cp	    hw_stq/pc
+
+/*
+**
+**  General Purpose Register Definitions:
+**
+*/
+
+#define	r0		$0
+#define r1		$1
+#define r2		$2
+#define r3		$3
+#define r4		$4
+#define r5		$5
+#define r6		$6
+#define r7		$7
+#define r8		$8
+#define r9		$9
+#define r10		$10
+#define r11		$11
+#define r12		$12
+#define r13		$13
+#define r14		$14
+#define	r15		$15
+#define	r16		$16
+#define	r17		$17
+#define	r18		$18
+#define	r19		$19
+#define	r20		$20
+#define	r21		$21
+#define r22		$22
+#define r23		$23
+#define r24		$24
+#define r25		$25
+#define r26		$26
+#define r27		$27
+#define r28		$28
+#define r29		$29
+#define r30		$30
+#define r31		$31
+
+/*
+**
+** Floating Point Register Definitions:
+**
+*/
+
+#define	f0		$f0
+#define f1		$f1
+#define f2		$f2
+#define f3		$f3
+#define f4		$f4
+#define f5		$f5
+#define f6		$f6
+#define f7		$f7
+#define f8		$f8
+#define f9		$f9
+#define f10		$f10
+#define f11		$f11
+#define f12		$f12
+#define f13		$f13
+#define f14		$f14
+#define	f15		$f15
+#define	f16		$f16
+#define	f17		$f17
+#define	f18		$f18
+#define	f19		$f19
+#define	f20		$f20
+#define	f21		$f21
+#define f22		$f22
+#define f23		$f23
+#define f24		$f24
+#define f25		$f25
+#define f26		$f26
+#define f27		$f27
+#define f28		$f28
+#define f29		$f29
+#define f30		$f30
+#define f31		$f31
+
+/*
+**
+**  PAL Temporary Register Definitions:
+**
+*/
+
+// covered by fetch distribution..pb Nov/95
+
+// #define	pt0		0x140
+// #define	pt1		0x141
+// #define	pt2		0x142
+// #define	pt3		0x143
+// #define	pt4		0x144
+// #define	pt5		0x145
+// #define	pt6		0x146
+// #define	pt7		0x147
+// #define	pt8		0x148
+// #define	pt9		0x149
+// #define	pt10		0x14A
+// #define	pt11		0x14B
+// #define	pt12		0x14C
+// #define	pt13		0x14D
+// #define	pt14		0x14E
+// #define	pt15		0x14F
+// #define	pt16		0x150
+// #define	pt17		0x151
+// #define	pt18		0x152
+// #define	pt19		0x153
+// #define	pt20		0x154
+// #define	pt21		0x155
+// #define	pt22		0x156
+// #define	pt23		0x157
+
+/*
+**  PAL Shadow Registers:
+**
+**  The DECchip 21164 shadows r8-r14 and r25 when in PALmode and
+**  ICSR<SDE> = 1.
+*/
+
+#define	p0		r8	/* ITB/DTB Miss Scratch */
+#define p1		r9	/* ITB/DTB Miss Scratch */
+#define p2		r10	/* ITB/DTB Miss Scratch */
+#define p3		r11
+// #define ps		r11	/* Processor Status */
+#define p4		r12	/* Local Scratch */
+#define p5		r13	/* Local Scratch */
+#define p6		r14	/* Local Scratch */
+#define p7		r25	/* Local Scratch */
+
+/*
+** SRM Defined State Definitions:
+*/
+
+/*
+**  This table is an accounting of the DECchip 21164 storage used to
+**  implement the SRM defined state for OSF/1.
+**
+** 	IPR Name			Internal Storage
+**      --------                        ----------------
+**	Processor Status		ps, dtbCm, ipl, r11
+**	Program Counter			Ibox
+**	Interrupt Entry			ptEntInt
+**	Arith Trap Entry		ptEntArith
+**	MM Fault Entry			ptEntMM
+**	Unaligned Access Entry		ptEntUna
+**	Instruction Fault Entry		ptEntIF
+**	Call System Entry		ptEntSys
+**	User Stack Pointer		ptUsp
+**	Kernel Stack Pointer		ptKsp
+**	Kernel Global Pointer		ptKgp
+**	System Value			ptSysVal
+**	Page Table Base Register	ptPtbr
+**	Virtual Page Table Base		iVptBr, mVptBr
+**	Process Control Block Base	ptPcbb
+**	Address Space Number		itbAsn, dtbAsn
+**	Cycle Counter			cc, ccCtl
+**	Floating Point Enable		icsr
+**	Lock Flag			Cbox/System
+**	Unique				PCB
+**	Who-Am-I			ptWhami
+*/
+
+#define ptEntUna	pt2	/* Unaligned Access Dispatch Entry */
+#define ptImpure	pt3	/* Pointer To PAL Scratch Area */
+#define ptEntIF		pt7	/* Instruction Fault Dispatch Entry */
+#define ptIntMask	pt8	/* Interrupt Enable Mask */
+#define ptEntSys	pt9	/* Call System Dispatch Entry */
+#define ptTrap          pt11
+#define ptEntInt	pt11	/* Hardware Interrupt Dispatch Entry */
+#define ptEntArith	pt12	/* Arithmetic Trap Dispatch Entry */
+#if defined(KDEBUG)
+#define ptEntDbg	pt13	/* Kernel Debugger Dispatch Entry */
+#endif /* KDEBUG */
+#define ptMisc          pt16    /* Miscellaneous Flags */
+#define ptWhami		pt16	/* Who-Am-I Register Pt16<15:8> */
+#define ptMces		pt16	/* Machine Check Error Summary Pt16<4:0> */
+#define ptSysVal	pt17	/* Per-Processor System Value */
+#define ptUsp		pt18	/* User Stack Pointer */
+#define ptKsp		pt19	/* Kernel Stack Pointer */
+#define ptPtbr		pt20	/* Page Table Base Register */
+#define ptEntMM		pt21	/* MM Fault Dispatch Entry */
+#define ptKgp		pt22	/* Kernel Global Pointer */
+#define ptPcbb		pt23	/* Process Control Block Base */
+
+/*
+**
+**   Miscellaneous PAL State Flags (ptMisc) Bit Summary
+**
+**	 Extent	Size  Name	Function
+**	 ------	----  ----	---------------------------------
+**	 <55:48>  8   SWAP      Swap PALcode flag -- character 'S'
+**	 <47:32> 16   MCHK      Machine Check Error code
+**	 <31:16> 16   SCB       System Control Block vector
+**	 <15:08>  8   WHAMI     Who-Am-I identifier
+**	 <04:00>  5   MCES      Machine Check Error Summary bits
+**
+*/
+
+#define PT16_V_MCES	0
+#define PT16_V_WHAMI	8
+#define PT16_V_SCB	16
+#define PT16_V_MCHK	32
+#define PT16_V_SWAP	48
+
+#endif /* DC21164FROMGASSOURCES_INCLUDED */
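Note on the ptMisc layout above: the `PT16_V_*` constants are shift counts, so each field of pt16 is read by shifting right and masking to the width given in the bit summary. The following is a minimal sketch in plain C rather than PALcode. The helper names `pt16_field` and `pt16_set_whami` are hypothetical and do not appear in the header; only the `PT16_V_*` values and the field widths are taken from it.

```c
#include <assert.h>
#include <stdint.h>

/* Shift counts copied from the header above. */
#define PT16_V_MCES	0
#define PT16_V_WHAMI	8
#define PT16_V_SCB	16
#define PT16_V_MCHK	32
#define PT16_V_SWAP	48

/* Hypothetical helper: return `width` bits of pt16 starting at bit `shift`. */
static uint64_t pt16_field(uint64_t pt16, unsigned shift, unsigned width)
{
    assert(width > 0 && width < 64 && shift + width <= 64);
    return (pt16 >> shift) & ((UINT64_C(1) << width) - 1);
}

/* Hypothetical helper: return pt16 with the 8-bit WHAMI field replaced. */
static uint64_t pt16_set_whami(uint64_t pt16, uint8_t whami)
{
    uint64_t mask = UINT64_C(0xff) << PT16_V_WHAMI;
    return (pt16 & ~mask) | ((uint64_t)whami << PT16_V_WHAMI);
}
```

For instance, a pt16 value of `0x0053000000000305` would carry the swap flag 'S' in <55:48>, WHAMI 3 in <15:08>, and MCES 5 in <04:00>.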
diff --git a/system/alpha/h/ev5_alpha_defs.h b/system/alpha/h/ev5_alpha_defs.h
new file mode 100644
index 0000000000..d0264a4ca9
--- /dev/null
+++ b/system/alpha/h/ev5_alpha_defs.h
@@ -0,0 +1,314 @@
+/*
+ * Copyright (c) 1993 The Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef EV5_ALPHA_DEFS_INCLUDED
+#define EV5_ALPHA_DEFS_INCLUDED 1
+
+// from ev5_alpha_defs.mar from Lance's fetch directory
+// Lower-caseified and $ signs removed ... pb Nov/95
+
+//
+// PS Layout - PS
+//	Loc	Size	name 	function
+//	------	------	------	-----------------------------------
+//	<61:56>	6	SP	Stack alignment
+//	<55:13>	43	RES	Reserved MBZ
+//	<12:8>	5	IPL	Priority level
+//	<7>	1	VMM	Virtual Mach Monitor
+//	<6:5>	2	RES	Reserved MBZ
+//	<4:3>	2	CM	Current Mode
+//	<2>	1	IP	Interrupt Pending
+//	<1:0>	2	SW	Software bits
+//
+
+#define ps_v_sw		0
+#define ps_m_sw		(3<<ps_v_sw)
+
+#define ps_v_ip		2
+#define ps_m_ip		(1<<ps_v_ip)
+
+#define ps_v_cm		3
+#define ps_m_cm		(3<<ps_v_cm)
+
+#define ps_v_vmm	7
+#define ps_m_vmm	(1<<ps_v_vmm)
+
+#define ps_v_ipl	8
+#define ps_m_ipl	(0x1f<<ps_v_ipl)
+
+#define ps_v_sp		(0x38)
+#define ps_m_sp		(0x3f<<ps_v_sp)
+
+
+#define ps_c_kern	(0x00)
+#define ps_c_exec	(0x08)
+#define ps_c_supr	(0x10)
+#define ps_c_user	(0x18)
+#define ps_c_ipl0	(0x0000)
+#define ps_c_ipl1	(0x0100)
+#define ps_c_ipl2	(0x0200)
+#define ps_c_ipl3	(0x0300)
+#define ps_c_ipl4	(0x0400)
+#define ps_c_ipl5	(0x0500)
+#define ps_c_ipl6	(0x0600)
+#define ps_c_ipl7	(0x0700)
+#define ps_c_ipl8	(0x0800)
+#define ps_c_ipl9	(0x0900)
+#define ps_c_ipl10	(0x0A00)
+#define ps_c_ipl11	(0x0B00)
+#define ps_c_ipl12	(0x0C00)
+#define ps_c_ipl13	(0x0D00)
+#define ps_c_ipl14	(0x0E00)
+#define ps_c_ipl15	(0x0F00)
+#define ps_c_ipl16	(0x1000)
+#define ps_c_ipl17	(0x1100)
+#define ps_c_ipl18	(0x1200)
+#define ps_c_ipl19	(0x1300)
+#define ps_c_ipl20	(0x1400)
+#define ps_c_ipl21	(0x1500)
+#define ps_c_ipl22	(0x1600)
+#define ps_c_ipl23	(0x1700)
+#define ps_c_ipl24	(0x1800)
+#define ps_c_ipl25	(0x1900)
+#define ps_c_ipl26	(0x1A00)
+#define ps_c_ipl27	(0x1B00)
+#define ps_c_ipl28	(0x1C00)
+#define ps_c_ipl29	(0x1D00)
+#define ps_c_ipl30	(0x1E00)
+#define ps_c_ipl31	(0x1F00)
+
+//
+// PTE layout - symbol prefix PTE_
+//
+//	Loc	Size	name 	function
+//	------	------	------	-----------------------------------
+//	<63:32>	32	PFN	Page Frame Number
+//	<31:16>	16	SOFT	Bits reserved for software use
+//	<15>	1	UWE	User write enable
+//	<14>	1	SWE	Super write enable
+//	<13>	1	EWE	Exec write enable
+//	<12>	1	KWE	Kernel write enable
+//	<11>	1	URE	User read enable
+//	<10>	1	SRE	Super read enable
+//	<9>	1	ERE	Exec read enable
+//	<8>	1	KRE	Kernel read enable
+//	<7:6>	2	RES	Reserved SBZ
+//	<5>	1	HPF	Huge Page Flag
+//	<4>	1	ASM	Wild card address space number match
+//	<3>	1	FOE	Fault On execute
+//	<2>	1	FOW	Fault On Write
+//	<1>	1	FOR	Fault On Read
+// 	<0>	1	V	valid bit
+//
+
+#define pte_v_pfn	32
+#define pte_m_soft	(0xFFFF0000)
+#define pte_v_soft	16
+#define pte_m_uwe	(0x8000)
+#define pte_v_uwe	15
+#define pte_m_swe	(0x4000)
+#define pte_v_swe	14
+#define pte_m_ewe	(0x2000)
+#define pte_v_ewe	13
+#define pte_m_kwe	(0x1000)
+#define pte_v_kwe	12
+#define pte_m_ure	(0x0800)
+#define pte_v_ure	11
+#define pte_m_sre	(0x0400)
+#define pte_v_sre	10
+#define pte_m_ere	(0x0200)
+#define pte_v_ere	 9
+#define pte_m_kre	(0x0100)
+#define pte_v_kre	 8
+#define pte_m_hpf	(0x0020)
+#define pte_v_hpf	5
+#define pte_m_asm	(0x0010)
+#define pte_v_asm	4
+#define pte_m_foe	(0x0008)
+#define pte_v_foe	3
+#define pte_m_fow	(0x0004)
+#define pte_v_fow	2
+#define pte_m_for	(0x0002)
+#define pte_v_for	1
+#define pte_m_v		(0x0001)
+#define pte_v_v		0
+
+//
+// VA layout - symbol prefix VA_
+//
+//	Loc	Size	name 	function
+//	------	------	-------	-----------------------------------
+//	<42:33>	10	SEG1	First seg table offset for mapping
+//	<32:23>	10	SEG2	Second seg table offset for mapping
+//	<22:13>	10	SEG3	Third seg table offset for mapping
+//	<12:0>	13	OFFSET	Byte within page
+//
+
+#define va_m_offset	(0x000000001FFF)
+#define va_v_offset	0
+#define va_m_seg3	(0x0000007FE000)
+#define va_v_seg3	13
+#define va_m_seg2	(0x0001FF800000)
+#define va_v_seg2	23
+#define va_m_seg1	(0x7FE00000000)
+#define va_v_seg1	33
+
+//
+//PRIVILEGED CONTEXT BLOCK (PCB)
+//
+#define pcb_q_ksp	0
+#define pcb_q_esp	8
+#define pcb_q_ssp	16
+#define pcb_q_usp	24
+#define pcb_q_ptbr	32
+#define pcb_q_asn	40
+#define pcb_q_ast	48
+#define pcb_q_fen	56
+#define pcb_q_cc	64
+#define pcb_q_unq	72
+#define pcb_q_sct	80
+
+#define pcb_v_asten	0
+#define pcb_m_asten	(0x0f<<pcb_v_asten)
+#define pcb_v_astsr	4
+#define pcb_m_astsr	(0x0f<<pcb_v_astsr)
+#define pcb_v_dat	63
+#define pcb_v_pme	62
+
+//
+// SYSTEM CONTROL BLOCK (SCB)
+//
+
+#define scb_v_fen		(0x0010)
+#define scb_v_acv		(0x0080)
+#define scb_v_tnv		(0x0090)
+#define scb_v_for		(0x00A0)
+#define scb_v_fow		(0x00B0)
+#define scb_v_foe		(0x00C0)
+#define scb_v_arith		(0x0200)
+#define scb_v_kast		(0x0240)
+#define scb_v_east		(0x0250)
+#define scb_v_sast		(0x0260)
+#define scb_v_uast		(0x0270)
+#define scb_v_unalign		(0x0280)
+#define scb_v_bpt		(0x0400)
+#define scb_v_bugchk		(0x0410)
+#define scb_v_opcdec		(0x0420)
+#define scb_v_illpal		(0x0430)
+#define scb_v_trap		(0x0440)
+#define scb_v_chmk		(0x0480)
+#define scb_v_chme		(0x0490)
+#define scb_v_chms		(0x04A0)
+#define scb_v_chmu		(0x04B0)
+#define scb_v_sw0		(0x0500)
+#define scb_v_sw1		(0x0510)
+#define scb_v_sw2		(0x0520)
+#define scb_v_sw3		(0x0530)
+#define scb_v_sw4		(0x0540)
+#define scb_v_sw5		(0x0550)
+#define scb_v_sw6		(0x0560)
+#define scb_v_sw7		(0x0570)
+#define scb_v_sw8		(0x0580)
+#define scb_v_sw9		(0x0590)
+#define scb_v_sw10		(0x05A0)
+#define scb_v_sw11		(0x05B0)
+#define scb_v_sw12		(0x05C0)
+#define scb_v_sw13		(0x05D0)
+#define scb_v_sw14		(0x05E0)
+#define scb_v_sw15		(0x05F0)
+#define scb_v_clock		(0x0600)
+#define scb_v_inter		(0x0610)
+#define scb_v_sys_corr_err	(0x0620)
+#define scb_v_proc_corr_err	(0x0630)
+#define scb_v_pwrfail		(0x0640)
+#define scb_v_perfmon		(0x0650)
+#define scb_v_sysmchk		(0x0660)
+#define scb_v_procmchk		(0x0670)
+#define scb_v_passive_rel	(0x06F0)
+
+//
+// Stack frame (FRM)
+//
+
+#define frm_v_r2		(0x0000)
+#define frm_v_r3		(0x0008)
+#define frm_v_r4		(0x0010)
+#define frm_v_r5		(0x0018)
+#define frm_v_r6		(0x0020)
+#define frm_v_r7		(0x0028)
+#define frm_v_pc		(0x0030)
+#define frm_v_ps		(0x0038)
+
+//
+// Exception summary register (EXS)
+//
+// exs_v_swc		<0>	; Software completion
+// exs_v_inv		<1>	; Invalid operation
+// exs_v_dze		<2>	; Div by zero
+// exs_v_fov		<3>	; Floating point overflow
+// exs_v_unf		<4>	; Floating point underflow
+// exs_v_ine		<5>	; Floating point inexact
+// exs_v_iov		<6>	; Floating convert to integer overflow
+#define exs_v_swc	  0
+#define exs_v_inv	  1
+#define exs_v_dze	  2
+#define exs_v_fov	  3
+#define exs_v_unf	  4
+#define exs_v_ine	  5
+#define exs_v_iov	  6
+
+#define exs_m_swc               (1<<exs_v_swc)
+#define exs_m_inv               (1<<exs_v_inv)
+#define exs_m_dze               (1<<exs_v_dze)
+#define exs_m_fov               (1<<exs_v_fov)
+#define exs_m_unf               (1<<exs_v_unf)
+#define exs_m_ine               (1<<exs_v_ine)
+#define exs_m_iov               (1<<exs_v_iov)
+
+//
+// machine check error summary register (mces)
+//
+// mces_v_mchk		<0>	; machine check in progress
+// mces_v_sce		<1>	; system correctable error
+// mces_v_pce		<2>	; processor correctable error
+// mces_v_dpc		<3>	; disable reporting of processor correctable errors
+// mces_v_dsc		<4>	; disable reporting of system correctable errors
+#define mces_v_mchk	 0
+#define mces_v_sce	 1
+#define mces_v_pce	 2
+#define mces_v_dpc	 3
+#define mces_v_dsc	 4
+
+#define mces_m_mchk              (1<<mces_v_mchk)
+#define mces_m_sce               (1<<mces_v_sce)
+#define mces_m_pce               (1<<mces_v_pce)
+#define mces_m_dpc               (1<<mces_v_dpc)
+#define mces_m_dsc               (1<<mces_v_dsc)
+#define mces_m_all		 ((1<<mces_v_mchk) | (1<<mces_v_sce) | (1<<mces_v_pce) | (1<<mces_v_dpc) | (1<<mces_v_dsc))
+
+#endif
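Note on the VA layout in ev5_alpha_defs.h: each `va_m_*` mask pairs with a `va_v_*` shift, so a virtual address splits into three 10-bit segment-table indices and a 13-bit page offset. The following is a minimal sketch in plain C. The struct `va_parts` and the function `va_split` are hypothetical names used for illustration; the mask and shift values are copied from the header.

```c
#include <assert.h>
#include <stdint.h>

/* Masks and shifts copied from ev5_alpha_defs.h. */
#define va_m_offset	(0x000000001FFF)
#define va_v_offset	0
#define va_m_seg3	(0x0000007FE000)
#define va_v_seg3	13
#define va_m_seg2	(0x0001FF800000)
#define va_v_seg2	23
#define va_m_seg1	(0x7FE00000000)
#define va_v_seg1	33

/* Compile-time check that each segment mask is 10 bits wide at its shift. */
static_assert((va_m_seg1 >> va_v_seg1) == 0x3ff, "SEG1 is 10 bits wide");
static_assert((va_m_seg2 >> va_v_seg2) == 0x3ff, "SEG2 is 10 bits wide");
static_assert((va_m_seg3 >> va_v_seg3) == 0x3ff, "SEG3 is 10 bits wide");

/* Hypothetical container for the four VA fields. */
struct va_parts {
    unsigned seg1;    /* first-level table index  <42:33> */
    unsigned seg2;    /* second-level table index <32:23> */
    unsigned seg3;    /* third-level table index  <22:13> */
    unsigned offset;  /* byte within page         <12:0>  */
};

/* Hypothetical helper: split a virtual address using the header's masks. */
static struct va_parts va_split(uint64_t va)
{
    struct va_parts p;
    p.seg1   = (unsigned)((va & va_m_seg1) >> va_v_seg1);
    p.seg2   = (unsigned)((va & va_m_seg2) >> va_v_seg2);
    p.seg3   = (unsigned)((va & va_m_seg3) >> va_v_seg3);
    p.offset = (unsigned)((va & va_m_offset) >> va_v_offset);
    return p;
}
```

Bits above <42> are ignored by this sketch; how they are treated on real hardware is outside what the header states.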
diff --git a/system/alpha/h/ev5_defs.h b/system/alpha/h/ev5_defs.h
new file mode 100644
index 0000000000..c8b2f5b2ee
--- /dev/null
+++ b/system/alpha/h/ev5_defs.h
@@ -0,0 +1,598 @@
+/*
+ * Copyright (c) 1995 The Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef EV5_DEFS_INCLUDED
+#define EV5_DEFS_INCLUDED 1
+
+// adapted from the version emailed to lance..pb Nov/95
+
+//  In the definitions below, registers are annotated with one of the
+//  following symbols:
+//
+//      RW - The register may be read and written
+//   	RO - The register may only be read
+//   	WO - The register may only be written
+//
+//  For RO and WO registers, all bits and fields within the register
+//  are also read-only or write-only.  For RW registers, each bit or
+//  field within the register is annotated with one of the following:
+//
+//   	RW - The bit/field may be read and written
+//   	RO - The bit/field may be read; writes are ignored
+//   	WO - The bit/field may be written; reads return an UNPREDICTABLE result
+//   	WZ - The bit/field may be written; reads return a 0
+//   	WC - The bit/field may be read; writes cause state to clear
+//   	RC - The bit/field may be read, which also causes state to clear;
+//           writes are ignored
+//  Architecturally-defined (SRM) registers for EVMS
+
+#define pt0 320
+#define pt1 321
+#define pt2 322
+#define pt3 323
+#define pt4 324
+#define pt5 325
+#define pt6 326
+#define pt7 327
+#define pt8 328
+#define pt9 329
+#define pt10 330
+#define pt11 331
+#define pt12 332
+#define pt13 333
+#define pt14 334
+#define pt15 335
+#define pt16 336
+#define pt17 337
+#define pt18 338
+#define pt19 339
+#define pt20 340
+#define pt21 341
+#define pt22 342
+#define pt23 343
+#define cbox_ipr_offset 16777200
+#define sc_ctl 168
+#define sc_stat 232
+#define sc_addr 392
+#define sc_addr_nm 392
+#define sc_addr_fhm 392
+#define bc_ctl 296
+#define bc_config 456
+#define ei_stat 360
+#define ei_addr 328
+#define fill_syn 104
+#define bc_tag_addr 264
+#define ld_lock 488
+#define aster 266
+#define astrr 265
+#define exc_addr 267
+#define exc_sum 268
+#define exc_mask 269
+#define hwint_clr 277
+#define ic_flush_ctl 281
+#define icperr_stat 282
+#define ic_perr_stat 282
+#define ic_row_map 283
+#define icsr 280
+#define ifault_va_form 274
+#define intid 273
+#define ipl 272
+#define isr 256
+#define itb_is 263
+#define itb_asn 259
+#define itb_ia 261
+#define itb_iap 262
+#define itb_pte 258
+#define itb_pte_temp 260
+#define itb_tag 257
+#define ivptbr 275
+#define pal_base 270
+#define pmctr 284
+// this is not the register ps .. pb #define ps 271
+#define sirr 264
+#define sl_txmit 278
+#define sl_rcv 279
+#define alt_mode 524
+#define cc 525
+#define cc_ctl 526
+#define dc_flush 528
+#define dcperr_stat 530
+#define dc_test_ctl 531
+#define dc_test_tag 532
+#define dc_test_tag_temp 533
+#define dtb_asn 512
+#define dtb_cm 513
+#define dtb_ia 522
+#define dtb_iap 521
+#define dtb_is 523
+#define dtb_pte 515
+#define dtb_pte_temp 516
+#define dtb_tag 514
+#define mcsr 527
+#define dc_mode 534
+#define maf_mode 535
+#define mm_stat 517
+#define mvptbr 520
+#define va 518
+#define va_form 519
+#define ev5_srm__ps 0
+#define ev5_srm__pc 0
+#define ev5_srm__asten 0
+#define ev5_srm__astsr 0
+#define ev5_srm__ipir 0
+#define ev5_srm__ipl 0
+#define ev5_srm__mces 0
+#define ev5_srm__pcbb 0
+#define ev5_srm__prbr 0
+#define ev5_srm__ptbr 0
+#define ev5_srm__scbb 0
+#define ev5_srm__sirr 0
+#define ev5_srm__sisr 0
+#define ev5_srm__tbchk 0
+#define ev5_srm__tb1a 0
+#define ev5_srm__tb1ap 0
+#define ev5_srm__tb1ad 0
+#define ev5_srm__tb1ai 0
+#define ev5_srm__tbis 0
+#define ev5_srm__ksp 0
+#define ev5_srm__esp 0
+#define ev5_srm__ssp 0
+#define ev5_srm__usp 0
+#define ev5_srm__vptb 0
+#define ev5_srm__whami 0
+#define ev5_srm__cc 0
+#define ev5_srm__unq 0
+//  processor-specific iprs.
+#define ev5__sc_ctl 168
+#define ev5__sc_stat 232
+#define ev5__sc_addr 392
+#define ev5__bc_ctl 296
+#define ev5__bc_config 456
+#define bc_config_k_size_1mb 1
+#define bc_config_k_size_2mb 2
+#define bc_config_k_size_4mb 3
+#define bc_config_k_size_8mb 4
+#define bc_config_k_size_16mb 5
+#define bc_config_k_size_32mb 6
+#define bc_config_k_size_64mb 7
+#define ev5__ei_stat 360
+#define ev5__ei_addr 328
+#define ev5__fill_syn 104
+#define ev5__bc_tag_addr 264
+#define ev5__aster 266
+#define ev5__astrr 265
+#define ev5__exc_addr 267
+#define exc_addr_v_pa 2
+#define exc_addr_s_pa 62
+#define ev5__exc_sum 268
+#define ev5__exc_mask 269
+#define ev5__hwint_clr 277
+#define ev5__ic_flush_ctl 281
+#define ev5__icperr_stat 282
+#define ev5__ic_perr_stat 282
+#define ev5__ic_row_map 283
+#define ev5__icsr 280
+#define ev5__ifault_va_form 274
+#define ev5__ifault_va_form_nt 274
+#define ifault_va_form_nt_v_vptb 30
+#define ifault_va_form_nt_s_vptb 34
+#define ev5__intid 273
+#define ev5__ipl 272
+#define ev5__itb_is 263
+#define ev5__itb_asn 259
+#define ev5__itb_ia 261
+#define ev5__itb_iap 262
+#define ev5__itb_pte 258
+#define ev5__itb_pte_temp 260
+#define ev5__itb_tag 257
+#define ev5__ivptbr 275
+#define ivptbr_v_vptb 30
+#define ivptbr_s_vptb 34
+#define ev5__pal_base 270
+#define ev5__pmctr 284
+#define ev5__ps 271
+#define ev5__isr 256
+#define ev5__sirr 264
+#define ev5__sl_txmit 278
+#define ev5__sl_rcv 279
+#define ev5__alt_mode 524
+#define ev5__cc 525
+#define ev5__cc_ctl 526
+#define ev5__dc_flush 528
+#define ev5__dcperr_stat 530
+#define ev5__dc_test_ctl 531
+#define ev5__dc_test_tag 532
+#define ev5__dc_test_tag_temp 533
+#define ev5__dtb_asn 512
+#define ev5__dtb_cm 513
+#define ev5__dtb_ia 522
+#define ev5__dtb_iap 521
+#define ev5__dtb_is 523
+#define ev5__dtb_pte 515
+#define ev5__dtb_pte_temp 516
+#define ev5__dtb_tag 514
+#define ev5__mcsr 527
+#define ev5__dc_mode 534
+#define ev5__maf_mode 535
+#define ev5__mm_stat 517
+#define ev5__mvptbr 520
+#define ev5__va 518
+#define ev5__va_form 519
+#define ev5__va_form_nt 519
+#define va_form_nt_s_va 19
+#define va_form_nt_v_vptb 30
+#define va_form_nt_s_vptb 34
+#define ev5s_ev5_def 10
+#define ev5_def 0
+//  cbox registers.
+#define sc_ctl_v_sc_fhit 0
+#define sc_ctl_v_sc_flush 1
+#define sc_ctl_s_sc_tag_stat 6
+#define sc_ctl_v_sc_tag_stat 2
+#define sc_ctl_s_sc_fb_dp 4
+#define sc_ctl_v_sc_fb_dp 8
+#define sc_ctl_v_sc_blk_size 12
+#define sc_ctl_s_sc_set_en 3
+#define sc_ctl_v_sc_set_en 13
+#define sc_ctl_s_sc_soft_repair 3
+#define sc_ctl_v_sc_soft_repair 16
+#define sc_stat_s_sc_tperr 3
+#define sc_stat_v_sc_tperr 0
+#define sc_stat_s_sc_dperr 8
+#define sc_stat_v_sc_dperr 3
+#define sc_stat_s_cbox_cmd 5
+#define sc_stat_v_cbox_cmd 11
+#define sc_stat_v_sc_scnd_err 16
+#define sc_addr_fhm_v_sc_tag_parity 4
+#define sc_addr_fhm_s_tag_stat_sb0 3
+#define sc_addr_fhm_v_tag_stat_sb0 5
+#define sc_addr_fhm_s_tag_stat_sb1 3
+#define sc_addr_fhm_v_tag_stat_sb1 8
+#define sc_addr_fhm_s_ow_mod0 2
+#define sc_addr_fhm_v_ow_mod0 11
+#define sc_addr_fhm_s_ow_mod1 2
+#define sc_addr_fhm_v_ow_mod1 13
+#define sc_addr_fhm_s_tag_lo 17
+#define sc_addr_fhm_v_tag_lo 15
+#define sc_addr_fhm_s_tag_hi 7
+#define sc_addr_fhm_v_tag_hi 32
+#define bc_ctl_v_bc_enabled 0
+#define bc_ctl_v_alloc_cyc 1
+#define bc_ctl_v_ei_opt_cmd 2
+#define bc_ctl_v_ei_opt_cmd_mb 3
+#define bc_ctl_v_corr_fill_dat 4
+#define bc_ctl_v_vtm_first 5
+#define bc_ctl_v_ei_ecc_or_parity 6
+#define bc_ctl_v_bc_fhit 7
+#define bc_ctl_s_bc_tag_stat 5
+#define bc_ctl_v_bc_tag_stat 8
+#define bc_ctl_s_bc_bad_dat 2
+#define bc_ctl_v_bc_bad_dat 13
+#define bc_ctl_v_ei_dis_err 15
+#define bc_ctl_v_tl_pipe_latch 16
+#define bc_ctl_s_bc_wave_pipe 2
+#define bc_ctl_v_bc_wave_pipe 17
+#define bc_ctl_s_pm_mux_sel 6
+#define bc_ctl_v_pm_mux_sel 19
+#define bc_ctl_v_dbg_mux_sel 25
+#define bc_ctl_v_dis_baf_byp 26
+#define bc_ctl_v_dis_sc_vic_buf 27
+#define bc_ctl_v_dis_sys_addr_par 28
+#define bc_ctl_v_read_dirty_cln_shr 29
+#define bc_ctl_v_write_read_bubble 30
+#define bc_ctl_v_bc_wave_pipe_2 31
+#define bc_ctl_v_auto_dack 32
+#define bc_ctl_v_dis_byte_word 33
+#define bc_ctl_v_stclk_delay 34
+#define bc_ctl_v_write_under_miss 35
+#define bc_config_s_bc_size 3
+#define bc_config_v_bc_size 0
+#define bc_config_s_bc_rd_spd 4
+#define bc_config_v_bc_rd_spd 4
+#define bc_config_s_bc_wr_spd 4
+#define bc_config_v_bc_wr_spd 8
+#define bc_config_s_bc_rd_wr_spc 3
+#define bc_config_v_bc_rd_wr_spc 12
+#define bc_config_s_fill_we_offset 3
+#define bc_config_v_fill_we_offset 16
+#define bc_config_s_bc_we_ctl 9
+#define bc_config_v_bc_we_ctl 20
+//  cbox registers, continued
+#define ei_stat_s_sys_id 4
+#define ei_stat_v_sys_id 24
+#define ei_stat_v_bc_tperr 28
+#define ei_stat_v_bc_tc_perr 29
+#define ei_stat_v_ei_es 30
+#define ei_stat_v_cor_ecc_err 31
+#define ei_stat_v_unc_ecc_err 32
+#define ei_stat_v_ei_par_err 33
+#define ei_stat_v_fil_ird 34
+#define ei_stat_v_seo_hrd_err 35
+//
+#define bc_tag_addr_v_hit 12
+#define bc_tag_addr_v_tagctl_p 13
+#define bc_tag_addr_v_tagctl_d 14
+#define bc_tag_addr_v_tagctl_s 15
+#define bc_tag_addr_v_tagctl_v 16
+#define bc_tag_addr_v_tag_p 17
+#define bc_tag_addr_s_bc_tag 19
+#define bc_tag_addr_v_bc_tag 20
+//  ibox and icache registers.
+#define aster_v_kar 0
+#define aster_v_ear 1
+#define aster_v_sar 2
+#define aster_v_uar 3
+#define astrr_v_kar 0
+#define astrr_v_ear 1
+#define astrr_v_sar 2
+#define astrr_v_uar 3
+#define exc_addr_v_pal 0
+#define exc_sum_v_swc 10
+#define exc_sum_v_inv 11
+#define exc_sum_v_dze 12
+#define exc_sum_v_fov 13
+#define exc_sum_v_unf 14
+#define exc_sum_v_ine 15
+#define exc_sum_v_iov 16
+#define hwint_clr_v_pc0c 27
+#define hwint_clr_v_pc1c 28
+#define hwint_clr_v_pc2c 29
+#define hwint_clr_v_crdc 32
+#define hwint_clr_v_slc 33
+//  ibox and icache registers, continued
+#define icperr_stat_v_dpe 11
+#define icperr_stat_v_tpe 12
+#define icperr_stat_v_tmr 13
+#define ic_perr_stat_v_dpe 11
+#define ic_perr_stat_v_tpe 12
+#define ic_perr_stat_v_tmr 13
+#define icsr_v_pma 8
+#define icsr_v_pmp 9
+#define icsr_v_byt 17
+#define icsr_v_fmp 18
+#define icsr_v_im0 20
+#define icsr_v_im1 21
+#define icsr_v_im2 22
+#define icsr_v_im3 23
+#define icsr_v_tmm 24
+#define icsr_v_tmd 25
+#define icsr_v_fpe 26
+#define icsr_v_hwe 27
+#define icsr_s_spe 2
+#define icsr_v_spe 28
+#define icsr_v_sde 30
+#define icsr_v_crde 32
+#define icsr_v_sle 33
+#define icsr_v_fms 34
+#define icsr_v_fbt 35
+#define icsr_v_fbd 36
+#define icsr_v_dbs 37
+#define icsr_v_ista 38
+#define icsr_v_tst 39
+#define ifault_va_form_s_va 30
+#define ifault_va_form_v_va 3
+#define ifault_va_form_s_vptb 31
+#define ifault_va_form_v_vptb 33
+#define ifault_va_form_nt_s_va 19
+#define ifault_va_form_nt_v_va 3
+#define intid_s_intid 5
+#define intid_v_intid 0
+//  ibox and icache registers, continued
+#define ipl_s_ipl 5
+#define ipl_v_ipl 0
+#define itb_is_s_va 30
+#define itb_is_v_va 13
+#define itb_asn_s_asn 7
+#define itb_asn_v_asn 4
+#define itb_pte_v_asm 4
+#define itb_pte_s_gh 2
+#define itb_pte_v_gh 5
+#define itb_pte_v_kre 8
+#define itb_pte_v_ere 9
+#define itb_pte_v_sre 10
+#define itb_pte_v_ure 11
+#define itb_pte_s_pfn 27
+#define itb_pte_v_pfn 32
+#define itb_pte_temp_v_asm 13
+#define itb_pte_temp_v_kre 18
+#define itb_pte_temp_v_ere 19
+#define itb_pte_temp_v_sre 20
+#define itb_pte_temp_v_ure 21
+#define itb_pte_temp_s_gh 3
+#define itb_pte_temp_v_gh 29
+#define itb_pte_temp_s_pfn 27
+#define itb_pte_temp_v_pfn 32
+//  ibox and icache registers, continued
+#define itb_tag_s_va 30
+#define itb_tag_v_va 13
+#define pal_base_s_pal_base 26
+#define pal_base_v_pal_base 14
+#define pmctr_s_sel2 4
+#define pmctr_v_sel2 0
+#define pmctr_s_sel1 4
+#define pmctr_v_sel1 4
+#define pmctr_v_killk 8
+#define pmctr_v_killp 9
+#define pmctr_s_ctl2 2
+#define pmctr_v_ctl2 10
+#define pmctr_s_ctl1 2
+#define pmctr_v_ctl1 12
+#define pmctr_s_ctl0 2
+#define pmctr_v_ctl0 14
+#define pmctr_s_ctr2 14
+#define pmctr_v_ctr2 16
+#define pmctr_v_killu 30
+#define pmctr_v_sel0 31
+#define pmctr_s_ctr1 16
+#define pmctr_v_ctr1 32
+#define pmctr_s_ctr0 16
+#define pmctr_v_ctr0 48
+#define ps_v_cm0 3
+#define ps_v_cm1 4
+#define isr_s_astrr 4
+#define isr_v_astrr 0
+#define isr_s_sisr 15
+#define isr_v_sisr 4
+#define isr_v_atr 19
+#define isr_v_i20 20
+#define isr_v_i21 21
+#define isr_v_i22 22
+#define isr_v_i23 23
+#define isr_v_pc0 27
+#define isr_v_pc1 28
+#define isr_v_pc2 29
+#define isr_v_pfl 30
+#define isr_v_mck 31
+#define isr_v_crd 32
+#define isr_v_sli 33
+#define isr_v_hlt 34
+#define sirr_s_sirr 15
+#define sirr_v_sirr 4
+//  ibox and icache registers, continued
+#define sl_txmit_v_tmt 7
+#define sl_rcv_v_rcv 6
+//  mbox and dcache registers.
+#define alt_mode_v_am0 3
+#define alt_mode_v_am1 4
+#define cc_ctl_v_cc_ena 32
+#define dcperr_stat_v_seo 0
+#define dcperr_stat_v_lock 1
+#define dcperr_stat_v_dp0 2
+#define dcperr_stat_v_dp1 3
+#define dcperr_stat_v_tp0 4
+#define dcperr_stat_v_tp1 5
+//  the following two registers are used exclusively for test and diagnostics.
+//  they should not be referenced in normal operation.
+#define dc_test_ctl_v_bank0 0
+#define dc_test_ctl_v_bank1 1
+#define dc_test_ctl_v_fill_0 2
+#define dc_test_ctl_s_index 10
+#define dc_test_ctl_v_index 3
+#define dc_test_ctl_s_fill_1 19
+#define dc_test_ctl_v_fill_1 13
+#define dc_test_ctl_s_fill_2 32
+#define dc_test_ctl_v_fill_2 32
+//  mbox and dcache registers, continued.
+#define dc_test_tag_v_tag_par 2
+#define dc_test_tag_v_ow0 11
+#define dc_test_tag_v_ow1 12
+#define dc_test_tag_s_tag 26
+#define dc_test_tag_v_tag 13
+#define dc_test_tag_temp_v_tag_par 2
+#define dc_test_tag_temp_v_d0p0 3
+#define dc_test_tag_temp_v_d0p1 4
+#define dc_test_tag_temp_v_d1p0 5
+#define dc_test_tag_temp_v_d1p1 6
+#define dc_test_tag_temp_v_ow0 11
+#define dc_test_tag_temp_v_ow1 12
+#define dc_test_tag_temp_s_tag 26
+#define dc_test_tag_temp_v_tag 13
+#define dtb_asn_s_asn 7
+#define dtb_asn_v_asn 57
+#define dtb_cm_v_cm0 3
+#define dtb_cm_v_cm1 4
+#define dtbis_s_va0 30
+#define dtbis_v_va0 13
+#define dtb_pte_v_for 1
+#define dtb_pte_v_fow 2
+#define dtb_pte_v_asm 4
+#define dtb_pte_s_gh 2
+#define dtb_pte_v_gh 5
+#define dtb_pte_v_kre 8
+#define dtb_pte_v_ere 9
+#define dtb_pte_v_sre 10
+#define dtb_pte_v_ure 11
+#define dtb_pte_v_kwe 12
+#define dtb_pte_v_ewe 13
+#define dtb_pte_v_swe 14
+#define dtb_pte_v_uwe 15
+#define dtb_pte_s_pfn 27
+#define dtb_pte_v_pfn 32
+//  mbox and dcache registers, continued.
+#define dtb_pte_temp_v_for 0
+#define dtb_pte_temp_v_fow 1
+#define dtb_pte_temp_v_kre 2
+#define dtb_pte_temp_v_ere 3
+#define dtb_pte_temp_v_sre 4
+#define dtb_pte_temp_v_ure 5
+#define dtb_pte_temp_v_kwe 6
+#define dtb_pte_temp_v_ewe 7
+#define dtb_pte_temp_v_swe 8
+#define dtb_pte_temp_v_uwe 9
+#define dtb_pte_temp_v_asm 10
+#define dtb_pte_temp_s_fill_0 2
+#define dtb_pte_temp_v_fill_0 11
+#define dtb_pte_temp_s_pfn 27
+#define dtb_pte_temp_v_pfn 13
+#define dtb_tag_s_va 30
+#define dtb_tag_v_va 13
+//  most mcsr bits are used for testability and diagnostics only.
+//  for normal operation, they will be supported in the following configuration:
+//  split_dcache = 1, maf_nomerge = 0, wb_flush_always = 0, wb_nomerge = 0,
+//  dc_ena<1:0> = 1, dc_fhit = 0, dc_bad_parity = 0
+#define mcsr_v_big_endian 0
+#define mcsr_v_sp0 1
+#define mcsr_v_sp1 2
+#define mcsr_v_mbox_sel 3
+#define mcsr_v_e_big_endian 4
+#define mcsr_v_dbg_packet_sel 5
+#define dc_mode_v_dc_ena 0
+#define dc_mode_v_dc_fhit 1
+#define dc_mode_v_dc_bad_parity 2
+#define dc_mode_v_dc_perr_dis 3
+#define dc_mode_v_dc_doa 4
+#define maf_mode_v_maf_nomerge 0
+#define maf_mode_v_wb_flush_always 1
+#define maf_mode_v_wb_nomerge 2
+#define maf_mode_v_io_nomerge 3
+#define maf_mode_v_wb_cnt_disable 4
+#define maf_mode_v_maf_arb_disable 5
+#define maf_mode_v_dread_pending 6
+#define maf_mode_v_wb_pending 7
+//  mbox and dcache registers, continued.
+#define mm_stat_v_wr 0
+#define mm_stat_v_acv 1
+#define mm_stat_v_for 2
+#define mm_stat_v_fow 3
+#define mm_stat_v_dtb_miss 4
+#define mm_stat_v_bad_va 5
+#define mm_stat_s_ra 5
+#define mm_stat_v_ra 6
+#define mm_stat_s_opcode 6
+#define mm_stat_v_opcode 11
+#define mvptbr_s_vptb 31
+#define mvptbr_v_vptb 33
+#define va_form_s_va 30
+#define va_form_v_va 3
+#define va_form_s_vptb 31
+#define va_form_v_vptb 33
+#define va_form_nt_s_va 19
+#define va_form_nt_v_va 3
+//.endm
+
+#endif
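Note on the naming in ev5_defs.h: the field constants appear to come in pairs, where `<reg>_v_<field>` is the starting bit position and `<reg>_s_<field>` is the field size in bits. That reading is an assumption inferred from the values (for example `icsr_s_spe 2` with `icsr_v_spe 28`); the header does not state it. Single-bit fields have only a `_v_` constant. Under that assumption, a multi-bit field can be read as in this sketch. The function `ipr_extract` and the macro `IPR_GET` are hypothetical and not part of the header.

```c
#include <assert.h>
#include <stdint.h>

/* A few start/size pairs copied from ev5_defs.h. */
#define pmctr_s_ctr0 16
#define pmctr_v_ctr0 48
#define icsr_s_spe 2
#define icsr_v_spe 28
#define mm_stat_s_opcode 6
#define mm_stat_v_opcode 11

/* Hypothetical helper: return `size` bits of `val` starting at bit `start`. */
static uint64_t ipr_extract(uint64_t val, unsigned start, unsigned size)
{
    assert(size > 0 && size < 64 && start + size <= 64);
    return (val >> start) & ((UINT64_C(1) << size) - 1);
}

/*
 * Hypothetical convenience macro: build the _v_ and _s_ names by token
 * pasting, so IPR_GET(x, icsr, spe) uses icsr_v_spe and icsr_s_spe.
 */
#define IPR_GET(val, reg, field) \
    ipr_extract((val), reg##_v_##field, reg##_s_##field)
```

With these definitions, `IPR_GET(mm_stat_value, mm_stat, opcode)` would return the 6-bit opcode held in bits <16:11> of a saved mm_stat value, where `mm_stat_value` is whatever 64-bit value the caller read from the register.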
diff --git a/system/alpha/h/ev5_impure.h b/system/alpha/h/ev5_impure.h
new file mode 100644
index 0000000000..88634a7ef0
--- /dev/null
+++ b/system/alpha/h/ev5_impure.h
@@ -0,0 +1,420 @@
+/*
+ * Copyright (c) 1993 The Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef EV5_IMPURE_INCLUDED
+#define EV5_IMPURE_INCLUDED
+
+// This uses the Hudson file format from "impure.h" but with the fields from
+// the distributed palcode "ev5_impure.sdl" .. pboyle Nov/95
+
+//  file:	impure.sdl
+//
+//  PAL impure scratch area and logout area data structure definitions for
+// 		Alpha firmware.
+//
+//
+//  module	$pal_impure;
+//
+//  Edit   Date     Who       Description
+//  ---- ---------  ---  ---------------------
+//     1   7-Jul-93 JEM   Initial Entry
+//     2  18-nov-93 JEM   Add shadow bc_ctl and pmctr_ctl to impure area
+// 			  Delete mvptbr
+// 			  Calculate pal$logout from end of impure area
+//     3   6-dec-93 JEM   Add pmctr_ctl bitfield definitions
+//     4   3-feb-94 JEM   Remove f31,r31 from impure area; Remove bc_ctl,
+//                        pmctr_ctl; add ic_perr_stat, pmctr, dc_perr_stat,
+//                        sc_stat, sc_addr, sc_ctl, bc_tag_addr, ei_stat,
+//                        ei_addr, fill_syn, ld_lock
+//     5  19-feb-94 JEM   add gpr constants, and add f31,r31 back in to be
+//                        consistent with ev4
+//                        add cns$ipr_offset
+//     6  18-apr-94 JEM   Add shadow bc_ctl and pmctr_ctl to impure area again.
+//     7  18-jul-94 JEM   Add bc_config shadow.   Add mchk$sys_base constant
+//                        to mchk logout frame
+//
+//
+//     constant REVISION equals 7 prefix IMPURE$;            // Revision number of this file
+//orig
+
+/*
+** Macros for saving/restoring data to/from the PAL impure scratch
+** area.
+**
+** The console save state area is larger than the addressability
+** of the HW_LD/ST instructions (10-bit signed byte displacement),
+** so some adjustments to the base offsets, as well as the offsets
+** within each base region, are necessary.
+**
+** The console save state area is divided into two segments; the
+** CPU-specific segment and the platform-specific segment.  The
+** state that is saved in the CPU-specific segment includes GPRs,
+** FPRs, IPRs, halt code, MCHK flag, etc.  All other state is saved
+** in the platform-specific segment.
+**
+** The impure pointer will need to be adjusted by a different offset
+** value for each region within a given segment.  The SAVE and RESTORE
+** macros will auto-magically adjust the offsets accordingly.
+**
+*/
+//#define SEXT10(X) (((X) & 0x200) ? ((X) | 0xfffffffffffffc00) : (X))
+#define SEXT10(X) ((X) & 0x3ff)
+//#define SEXT10(X) (((X) << 55) >> 55)
+
+#define SAVE_GPR(reg,offset,base) \
+        stq_p	reg, (SEXT10(offset-0x200))(base)
+
+#define RESTORE_GPR(reg,offset,base) \
+        ldq_p	reg, (SEXT10(offset-0x200))(base)
+
+
+#define SAVE_FPR(reg,offset,base) \
+        stt	reg, (SEXT10(offset-0x200))(base)
+
+#define RESTORE_FPR(reg,offset,base) \
+        ldt	reg, (SEXT10(offset-0x200))(base)
+
+#define SAVE_IPR(reg,offset,base) \
+        mfpr	v0, reg;	  \
+        stq_p	v0, (SEXT10(offset-CNS_Q_IPR))(base)
+
+#define RESTORE_IPR(reg,offset,base) \
+        ldq_p	v0, (SEXT10(offset-CNS_Q_IPR))(base); \
+        mtpr	v0, reg
+
+#define SAVE_SHADOW(reg,offset,base) \
+        stq_p	reg, (SEXT10(offset-CNS_Q_IPR))(base)
+
+#define	RESTORE_SHADOW(reg,offset,base)\
+        ldq_p	reg, (SEXT10(offset-CNS_Q_IPR))(base)
+
+/* Structure of the processor-specific impure area */
+
+/* aggregate impure struct prefix "" tag "";
+ * 	cns$flag	quadword;
+ * 	cns$hlt		quadword;
+ */
+
+/* Define base for debug monitor compatibility */
+#define CNS_Q_BASE      0x000
+#define CNS_Q_FLAG	0x100
+#define CNS_Q_HALT	0x108
+
+
+/* constant (
+ * 	cns$r0,cns$r1,cns$r2,cns$r3,cns$r4,cns$r5,cns$r6,cns$r7,
+ * 	cns$r8,cns$r9,cns$r10,cns$r11,cns$r12,cns$r13,cns$r14,cns$r15,
+ * 	cns$r16,cns$r17,cns$r18,cns$r19,cns$r20,cns$r21,cns$r22,cns$r23,
+ * 	cns$r24,cns$r25,cns$r26,cns$r27,cns$r28,cns$r29,cns$r30,cns$r31
+ * 	) equals . increment 8 prefix "" tag "";
+ * 	cns$gpr	quadword dimension 32;
+ */
+
+/* Offset to base of saved GPR area - 32 quadwords */
+#define CNS_Q_GPR	0x110
+#define cns_gpr CNS_Q_GPR
+
+/* constant (
+ * 	cns$f0,cns$f1,cns$f2,cns$f3,cns$f4,cns$f5,cns$f6,cns$f7,
+ * 	cns$f8,cns$f9,cns$f10,cns$f11,cns$f12,cns$f13,cns$f14,cns$f15,
+ * 	cns$f16,cns$f17,cns$f18,cns$f19,cns$f20,cns$f21,cns$f22,cns$f23,
+ * 	cns$f24,cns$f25,cns$f26,cns$f27,cns$f28,cns$f29,cns$f30,cns$f31
+ * 	) equals . increment 8 prefix "" tag "";
+ * 	cns$fpr	quadword dimension 32;
+ */
+
+/* Offset to base of saved FPR area - 32 quadwords */
+#define CNS_Q_FPR	0x210
+
+/* 	#t=.;
+ * 	cns$mchkflag quadword;
+ */
+#define CNS_Q_MCHK	0x310
+
+/* 	constant cns$pt_offset equals .;
+ *  constant (
+ * 	cns$pt0,cns$pt1,cns$pt2,cns$pt3,cns$pt4,cns$pt5,cns$pt6,
+ * 	cns$pt7,cns$pt8,cns$pt9,cns$pt10,cns$pt11,cns$pt12,cns$pt13,
+ * 	cns$pt14,cns$pt15,cns$pt16,cns$pt17,cns$pt18,cns$pt19,cns$pt20,
+ * 	cns$pt21,cns$pt22,cns$pt23
+ * 	) equals . increment 8 prefix "" tag "";
+ * 	cns$pt	quadword dimension 24;
+ */
+/* Offset to base of saved PALtemp area - 24 quadwords */
+#define CNS_Q_PT	0x318
+
+/* 	cns$shadow8	quadword;
+ * 	cns$shadow9	quadword;
+ * 	cns$shadow10	quadword;
+ * 	cns$shadow11	quadword;
+ * 	cns$shadow12	quadword;
+ * 	cns$shadow13	quadword;
+ * 	cns$shadow14	quadword;
+ * 	cns$shadow25	quadword;
+ */
+/* Offset to base of saved PALshadow area - 8 quadwords */
+#define CNS_Q_SHADOW	0x3D8
+
+/* Offset to base of saved IPR area */
+#define CNS_Q_IPR	0x418
+
+/* 	constant cns$ipr_offset equals .; */
+/* 	cns$exc_addr	quadword; */
+#define CNS_Q_EXC_ADDR		0x418
+/* 	cns$pal_base	quadword; */
+#define CNS_Q_PAL_BASE		0x420
+/* 	cns$mm_stat	quadword; */
+#define CNS_Q_MM_STAT		0x428
+/* 	cns$va		quadword; */
+#define CNS_Q_VA		0x430
+/* 	cns$icsr	quadword; */
+#define CNS_Q_ICSR		0x438
+/* 	cns$ipl		quadword; */
+#define CNS_Q_IPL		0x440
+/* 	cns$ps		quadword;	// Ibox current mode */
+#define CNS_Q_IPS		0x448
+/* 	cns$itb_asn	quadword; */
+#define CNS_Q_ITB_ASN		0x450
+/* 	cns$aster	quadword; */
+#define CNS_Q_ASTER		0x458
+/* 	cns$astrr	quadword; */
+#define CNS_Q_ASTRR		0x460
+/* 	cns$isr		quadword; */
+#define CNS_Q_ISR		0x468
+/* 	cns$ivptbr	quadword; */
+#define CNS_Q_IVPTBR		0x470
+/* 	cns$mcsr	quadword; */
+#define CNS_Q_MCSR		0x478
+/* 	cns$dc_mode	quadword; */
+#define CNS_Q_DC_MODE		0x480
+/* 	cns$maf_mode	quadword; */
+#define CNS_Q_MAF_MODE		0x488
+/* 	cns$sirr	quadword; */
+#define CNS_Q_SIRR		0x490
+/* 	cns$fpcsr	quadword; */
+#define CNS_Q_FPCSR		0x498
+/* 	cns$icperr_stat	quadword; */
+#define CNS_Q_ICPERR_STAT	0x4A0
+/* 	cns$pmctr	quadword; */
+#define CNS_Q_PM_CTR		0x4A8
+/* 	cns$exc_sum	quadword; */
+#define CNS_Q_EXC_SUM		0x4B0
+/* 	cns$exc_mask	quadword; */
+#define CNS_Q_EXC_MASK		0x4B8
+/* 	cns$intid	quadword; */
+#define CNS_Q_INT_ID		0x4C0
+/* 	cns$dcperr_stat quadword; */
+#define CNS_Q_DCPERR_STAT	0x4C8
+/* 	cns$sc_stat	quadword; */
+#define CNS_Q_SC_STAT		0x4D0
+/* 	cns$sc_addr	quadword; */
+#define CNS_Q_SC_ADDR		0x4D8
+/* 	cns$sc_ctl	quadword; */
+#define CNS_Q_SC_CTL		0x4E0
+/* 	cns$bc_tag_addr	quadword; */
+#define CNS_Q_BC_TAG_ADDR	0x4E8
+/* 	cns$ei_stat	quadword; */
+#define CNS_Q_EI_STAT		0x4F0
+/* 	cns$ei_addr	quadword; */
+#define CNS_Q_EI_ADDR		0x4F8
+/* 	cns$fill_syn	quadword; */
+#define CNS_Q_FILL_SYN		0x500
+/* 	cns$ld_lock	quadword; */
+#define CNS_Q_LD_LOCK		0x508
+/* 	cns$bc_ctl	quadword;	// shadow of on chip bc_ctl  */
+#define CNS_Q_BC_CTL		0x510
+/* 	cns$pmctr_ctl   quadword;	// saved frequency select info for performance monitor counter */
+#define CNS_Q_PM_CTL		0x518
+/* 	cns$bc_config	quadword;	// shadow of on chip bc_config */
+#define CNS_Q_BC_CFG            0x520
+
+/* 	constant cns$size equals .;
+ *
+ * 	constant pal$impure_common_size equals (%x0200 +7) & %xfff8;
+ * 	constant pal$impure_specific_size equals (.+7) & %xfff8;
+ * 	constant cns$mchksize equals (.+7-#t) & %xfff8;
+ * 	constant pal$logout_area	equals pal$impure_specific_size ;
+ * end impure;
+*/
+
+/* This next set of stuff came from the old code ..pb */
+#define CNS_Q_SROM_REV          0x528
+#define CNS_Q_PROC_ID           0x530
+#define CNS_Q_MEM_SIZE          0x538
+#define CNS_Q_CYCLE_CNT         0x540
+#define CNS_Q_SIGNATURE         0x548
+#define CNS_Q_PROC_MASK         0x550
+#define CNS_Q_SYSCTX            0x558
+
+
+
+#define MACHINE_CHECK_CRD_BASE 0
+#define MACHINE_CHECK_SIZE ((CNS_Q_SYSCTX + 7 - CNS_Q_MCHK) & 0xfff8)
+
+
+
+/*
+ * aggregate EV5PMCTRCTL_BITS structure fill prefix PMCTR_CTL$;
+ * 	SPROCESS bitfield length 1 ;
+ * 	FILL_0 bitfield length 3 fill tag $$;
+ * 	FRQ2 bitfield length 2 ;
+ * 	FRQ1 bitfield length 2 ;
+ * 	FRQ0 bitfield length 2 ;
+ * 	CTL2 bitfield length 2 ;
+ * 	CTL1 bitfield length 2 ;
+ * 	CTL0 bitfield length 2 ;
+ * 	FILL_1 bitfield length 16 fill tag $$;
+ * 	FILL_2 bitfield length 32 fill tag $$;
+ * end EV5PMCTRCTL_BITS;
+ *
+ * end_module $pal_impure;
+ *
+ * module	$pal_logout;
+ *
+ * //
+ * // Start definition of Corrected Error Frame
+ * //
+ */
+
+/*
+ * aggregate crd_logout struct prefix "" tag "";
+ */
+
+#define pal_logout_area 0x600
+#define mchk_crd_base  0
+
+/* 	mchk$crd_flag		quadword; */
+#define mchk_crd_flag 0
+/* 	mchk$crd_offsets	quadword; */
+#define mchk_crd_offsets 8
+/*
+ * 	// Pal-specific information	*/
+#define mchk_crd_mchk_code 0x10
+/* 	mchk$crd_mchk_code	quadword;
+ *
+ * 	// CPU-specific information
+ * 	constant mchk$crd_cpu_base equals . ;
+ * 	mchk$crd_ei_addr	quadword; */
+#define mchk_crd_ei_addr 0x18
+/* 	mchk$crd_fill_syn	quadword; */
+#define mchk_crd_fill_syn 0x20
+/* 	mchk$crd_ei_stat	quadword; */
+#define mchk_crd_ei_stat 0x28
+/* 	mchk$crd_isr		quadword; */
+#define mchk_crd_isr 0x30
+
+/*
+ * Hacked-up constants for the turbolaser build. Hopefully
+ * these are more or less correct.
+ */
+
+#define mchk_crd_whami   0x38
+#define mchk_crd_tldev   0x40
+#define mchk_crd_tlber   0x48
+#define mchk_crd_tlesr0  0x50
+#define mchk_crd_tlesr1  0x58
+#define mchk_crd_tlesr2  0x60
+#define mchk_crd_tlesr3  0x68
+#define mchk_crd_rsvd    0x70
+
+
+/*
+ * mchk area seems different for tlaser
+ */
+
+#define mchk_crd_size   0x80
+#define mchk_mchk_base (mchk_crd_size)
+
+#define mchk_tlber      0x0
+#define mchk_tlepaerr   0x8
+#define mchk_tlepderr   0x10
+#define mchk_tlepmerr   0x18
+
+
+/*
+ * 	// System-specific information
+ * 	constant mchk$crd_sys_base equals . ;
+ * 	constant mchk$crd_size equals (.+7) & %xfff8;
+ *
+ * end crd_logout;
+ * //
+ * // Start definition of Machine check logout Frame
+ * //
+ * aggregate logout struct prefix "" tag "";
+ * 	mchk$flag		quadword; */
+/* 	mchk$offsets		quadword; */
+/*
+ *  // Pal-specific information
+ * 	mchk$mchk_code		quadword; */
+/*
+
+ * 	mchk$pt	quadword dimension 24;
+ *
+ *  // CPU-specific information
+ * 	constant mchk$cpu_base equals . ;
+ * 	mchk$exc_addr		quadword;
+ * 	mchk$exc_sum		quadword;
+ * 	mchk$exc_mask		quadword;
+ * 	mchk$pal_base		quadword;
+ * 	mchk$isr		quadword;
+ * 	mchk$icsr		quadword;
+ * 	mchk$ic_perr_stat       quadword;
+ * 	mchk$dc_perr_stat	quadword;
+ * 	mchk$va		        quadword;
+ * 	mchk$mm_stat		quadword;
+ * 	mchk$sc_addr		quadword;
+ * 	mchk$sc_stat		quadword;
+ * 	mchk$bc_tag_addr	quadword;
+ * 	mchk$ei_addr		quadword;
+ * 	mchk$fill_syn		quadword;
+ * 	mchk$ei_stat		quadword;
+ * 	mchk$ld_lock		quadword;
+ *
+ *         // System-specific information
+ *
+ * 	constant mchk$sys_base equals . ;
+ * 	mchk$sys_ipr1		quadword	; // Holder for system-specific stuff
+ *
+ * 	constant mchk$size equals (.+7) & %xfff8;
+ *
+ *
+ * 	constant mchk$crd_base	equals 0 ;
+ * 	constant mchk$mchk_base	equals mchk$crd_size ;
+ *
+ *
+ * end logout;
+ *
+ * end_module $pal_logout;
+*/
+
+/*
+ * this is lingering in the old ladbx code but looks like it was from
+ * ev4 days.  This was 0x160 in the old days..pb
+ */
+#define LAF_K_SIZE         MACHINE_CHECK_SIZE
+#endif
diff --git a/system/alpha/h/ev5_osfalpha_defs.h b/system/alpha/h/ev5_osfalpha_defs.h
new file mode 100644
index 0000000000..bb98503b41
--- /dev/null
+++ b/system/alpha/h/ev5_osfalpha_defs.h
@@ -0,0 +1,152 @@
+/*
+ * Copyright (c) 1993 The Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef EV5_OSFALPHA_DEFS_INCLUDED
+#define EV5_OSFALPHA_DEFS_INCLUDED 1
+
+// from ev5_osfalpha_defs.mar from Lance's fetch directory
+// lowercaseified and $ changed to _ and reformatting for gas...pb Nov/95
+
+//
+// PS Layout - PS
+//	Loc	Size	name 	function
+//	------	------	-----	-----------------------------------
+//	<0:2>	3	IPL	Interrupt priority level
+//	<3>	1	CM	Current Mode
+//
+
+#define	osfps_v_mode		3
+#define	osfps_m_mode		(1<<osfps_v_mode)
+#define	osfps_v_ipl		0
+#define	osfps_m_ipl		(7<<osfps_v_ipl)
+
+#define	osfipl_c_mchk		7
+#define	osfipl_c_rt		6
+#define	osfipl_c_clk		5
+#define	osfipl_c_dev1		4
+#define	osfipl_c_dev0		3
+#define	osfipl_c_sw1		2
+#define	osfipl_c_sw0		1
+#define	osfipl_c_zero		0
+
+#define	osfint_c_mchk		2
+#define	osfint_c_clk		1
+#define	osfint_c_dev		3
+#define	osfint_c_ip		0
+#define	osfint_c_perf		4
+#define	osfint_c_passrel	5
+
+//
+// PTE layout - symbol prefix osfpte_
+//
+//	Loc	Size	name 	function
+//	------	------	------	-----------------------------------
+//	<63:32>	32	PFN	Page Frame Number
+//	<31:16>	16	SOFT	Bits reserved for software use
+//	<15:14>	2
+//	<13>	1	UWE	User write enable
+//	<12>	1	KWE	Kernel write enable
+//	<11:10>	2
+//	<9>	1	URE	User read enable
+//	<8>	1	KRE	Kernel read enable
+//	<7:6>	2	RES	Reserved SBZ
+//	<5>	1	HPF	Huge Page Flag
+//	<4>	1	ASM	Wild card address space number match
+//	<3>	1	FOE	Fault On execute
+//	<2>	1	FOW	Fault On Write
+//	<1>	1	FOR	Fault On Read
+// 	<0>	1	V	valid bit
+//
+
+#define	osfpte_v_pfn	32
+#define	osfpte_m_soft	(0xFFFF0000)
+#define	osfpte_v_soft	16
+#define	osfpte_m_uwe	(0x2000)
+#define	osfpte_v_uwe	13
+#define	osfpte_m_kwe	(0x1000)
+#define	osfpte_v_kwe	12
+#define	osfpte_m_ure	(0x0200)
+#define	osfpte_v_ure	 9
+#define	osfpte_m_kre	(0x0100)
+#define	osfpte_v_kre	 8
+#define	osfpte_m_hpf	(0x0020)
+#define	osfpte_v_hpf	5
+#define	osfpte_m_asm	(0x0010)
+#define	osfpte_v_asm	4
+#define	osfpte_m_foe	(0x0008)
+#define	osfpte_v_foe	3
+#define	osfpte_m_fow	(0x0004)
+#define	osfpte_v_fow	2
+#define	osfpte_m_for	(0x0002)
+#define	osfpte_v_for	1
+#define	osfpte_m_v	(0x0001)
+#define	osfpte_v_v	0
+
+#define	osfpte_m_ksegbits	(osfpte_m_kre | osfpte_m_kwe | osfpte_m_v | osfpte_m_asm)
+#define	osfpte_m_prot	(osfpte_m_ure | osfpte_m_uwe | osfpte_m_kre | osfpte_m_kwe)
+
+//
+// VA layout - symbol prefix osfva_
+//
+//	Loc	Size	name 	function
+//	------	------	-------	-----------------------------------
+//	<42:33>	10	SEG1	First seg table offset for mapping
+//	<32:23>	10	SEG2	Second seg table offset for mapping
+//	<22:13>	10	SEG3	Third seg table offset for mapping
+//	<12:0>	13	OFFSET	Byte within page
+//
+
+#define	osfva_m_offset	(0x000000001FFF)
+#define	osfva_v_offset	0
+#define	osfva_m_seg3	(0x0000007FE000)
+#define	osfva_v_seg3	13
+#define	osfva_m_seg2	(0x0001FF800000)
+#define	osfva_v_seg2	23
+#define	osfva_m_seg1	(0x7FE00000000)
+#define	osfva_v_seg1	33
+
+#define	osfpcb_q_ksp	(0x0000)
+#define	osfpcb_q_usp	(0x0008)
+#define	osfpcb_q_Usp	(0x0008)
+#define	osfpcb_q_mmptr	(0x0010)
+#define	osfpcb_q_Mmptr	(0x0010)
+#define	osfpcb_l_cc	(0x0018)
+#define	osfpcb_l_asn	(0x001C)
+#define	osfpcb_q_unique (0x0020)
+#define	osfpcb_q_fen	(0x0028)
+#define	osfpcb_v_pme	62
+
+#define	osfsf_ps	(0x00)
+#define	osfsf_pc	(0x08)
+#define	osfsf_gp	(0x10)
+#define	osfsf_a0	(0x18)
+#define	osfsf_a1	(0x20)
+#define	osfsf_a2	(0x28)
+#define	osfsf_c_size	(0x30)
+
+#endif
diff --git a/system/alpha/h/ev5_paldef.h b/system/alpha/h/ev5_paldef.h
new file mode 100644
index 0000000000..49cea5faab
--- /dev/null
+++ b/system/alpha/h/ev5_paldef.h
@@ -0,0 +1,162 @@
+/*
+ * Copyright (c) 1993 The Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef EV5_PALDEF_INCLUDED
+#define EV5_PALDEF_INCLUDED 1
+
+// from ev5_paldef.mar from Lance's fetch directory...pb Nov/95
+// some entries have been superseded by the more recent evt_defs.h
+
+// These are lower-caseified and have the $ signs (unnecessarily we
+// now discover) removed.
+
+// Note that at the bottom of this file is the version of ev5_defs.mar
+// which is more recent than the top part of the file and contains
+// overlapping information...pb Nov/95
+
+#define hlt_c_reset		0
+#define hlt_c_hw_halt		1
+#define hlt_c_ksp_inval		2
+#define hlt_c_scbb_inval	3
+#define hlt_c_ptbr_inval	4
+#define hlt_c_sw_halt		5
+#define hlt_c_dbl_mchk		6
+#define hlt_c_mchk_from_pal	7
+#define hlt_c_start		32
+#define hlt_c_callback		33
+#define hlt_c_mpstart		34
+#define hlt_c_lfu_start		35
+
+#define mchk_c_tperr			(64<<1)
+#define mchk_c_tcperr			(65<<1)
+#define mchk_c_herr			(66<<1)
+#define mchk_c_ecc_c			(67<<1)
+#define mchk_c_ecc_nc			(68<<1)
+#define mchk_c_unknown		        (69<<1)
+#define mchk_c_cacksoft			(70<<1)
+#define mchk_c_bugcheck			(71<<1)
+#define mchk_c_os_bugcheck		(72<<1)
+#define mchk_c_dcperr			(73<<1)
+#define mchk_c_icperr			(74<<1)
+#define mchk_c_retryable_ird		(75<<1)
+#define mchk_c_proc_hrd_error		(76<<1)
+#define mchk_c_scperr			(77<<1)
+#define mchk_c_bcperr			(78<<1)
+//; mchk codes above 255 reserved for platform specific errors
+
+
+#define mchk_c_read_nxm			(256<<1)
+#define mchk_c_sys_hrd_error		(257<<1)
+#define mchk_c_sys_ecc			(258<<1)
+
+#define page_seg_size_bits	 10
+#define page_offset_size_bits	 13
+#define page_size_bytes		 8192
+#define va_size_bits		 43
+#define pa_size_bits		 45
+
+// replaced by ev5_defs.h #define pt0  		(0x140)
+// replaced by ev5_defs.h #define pt1  		(0x141)
+// replaced by ev5_defs.h #define pt2  		(0x142)
+#define pt_entuna	(0x142)
+// replaced by ev5_defs.h #define pt3	 	(0x143)
+#define pt_impure	(0x143)
+// replaced by ev5_defs.h #define pt4  		(0x144)
+// replaced by ev5_defs.h #define pt5  		(0x145)
+// replaced by ev5_defs.h #define pt6  		(0x146)
+// replaced by ev5_defs.h #define pt7  		(0x147)
+#define pt_entif	(0x147)
+// replaced by ev5_defs.h #define pt8  		(0x148)
+#define pt_intmask	(0x148)
+// replaced by ev5_defs.h #define pt9  		(0x149)
+#define pt_entsys	(0x149)
+#define pt_ps  		(0x149)
+// replaced by ev5_defs.h #define pt10  		(0x14a)
+// replaced by ev5_defs.h #define pt11  		(0x14b)
+#define pt_trap		(0x14b)
+#define pt_entint	(0x14b)
+// replaced by ev5_defs.h #define pt12  		(0x14c)
+#define pt_entarith	(0x14c)
+// replaced by ev5_defs.h #define pt13		(0x14d)
+#define pt_sys0		(0x14d)
+// replaced by ev5_defs.h #define pt14		(0x14e)
+#define pt_sys1		(0x14e)
+// replaced by ev5_defs.h #define pt15		(0x14f)
+#define pt_sys2		(0x14f)
+// replaced by ev5_defs.h #define pt16  		(0x150)
+#define pt_whami	(0x150)
+#define pt_mces		(0x150)
+#define pt_misc 	(0x150)
+// replaced by ev5_defs.h #define pt17  		(0x151)
+#define pt_scc 		(0x151)
+#define pt_sysval	(0x151)
+// replaced by ev5_defs.h #define pt18  		(0x152)
+#define pt_prbr		(0x152)
+#define pt_usp		(0x152)
+// replaced by ev5_defs.h #define pt19  		(0x153)
+#define pt_ksp 		(0x153)
+// replaced by ev5_defs.h #define pt20  		(0x154)
+#define pt_ptbr		(0x154)
+// replaced by ev5_defs.h #define pt21  		(0x155)
+#define pt_vptbr	(0x155)
+#define pt_entmm	(0x155)
+// replaced by ev5_defs.h #define pt22  		(0x156)
+#define pt_scbb		(0x156)
+#define pt_kgp		(0x156)
+// replaced by ev5_defs.h #define pt23  		(0x157)
+#define pt_pcbb		(0x157)
+
+
+#define pt_misc_v_switch 48
+#define pt_misc_v_cm     56
+
+#define mmcsr_c_tnv		0
+#define mmcsr_c_acv		1
+#define mmcsr_c_for		2
+#define mmcsr_c_foe		3
+#define mmcsr_c_fow		4
+
+#define mm_stat_m_opcode  	(0x3F)
+#define mm_stat_m_ra  		(0x1F)
+#define evx_opc_sync	 	(0x18)
+#define EVX_OPC_SYNC	 	(0x18)
+#define evx_opc_hw_ld	 	(0x1B)
+
+#define osf_a0_bpt	  	(0x0)
+#define osf_a0_bugchk	  	(0x1)
+#define osf_a0_gentrap	  	(0x2)
+#define osf_a0_fen	  	(0x3)
+#define osf_a0_opdec	  	(0x4)
+
+#define ipl_machine_check	31
+#define ipl_powerfail		30
+#define ipl_perf_count		29
+#define ipl_clock		22
+#define ipl_interprocessor	22
+
+#endif
diff --git a/system/alpha/h/fromHudsonMacros.h b/system/alpha/h/fromHudsonMacros.h
new file mode 100644
index 0000000000..68f8999c05
--- /dev/null
+++ b/system/alpha/h/fromHudsonMacros.h
@@ -0,0 +1,88 @@
+/*
+ * Copyright (c) 1993-1994 The Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef HUDSON_MACROS_LOADED
+#define	HUDSON_MACROS_LOADED	    1
+
+#define	STALL \
+    mfpr    r31, pt0
+
+#define	NOP \
+    bis	    $31, $31, $31
+
+/*
+** Align code on an 8K byte page boundary.
+*/
+
+#define	ALIGN_PAGE \
+    .align  13
+
+/*
+** Align code on a 32 byte block boundary.
+*/
+
+#define	ALIGN_BLOCK \
+    .align  5
+
+/*
+** Align code on a quadword boundary.
+*/
+
+#define ALIGN_BRANCH \
+    .align  3
+
+/*
+** Hardware vectors go in .text 0 sub-segment.
+*/
+
+#define	HDW_VECTOR(offset) \
+    . = offset
+
+/*
+** Privileged CALL_PAL functions are in .text 1 sub-segment.
+*/
+
+#define	CALL_PAL_PRIV(vector) \
+    . = (PAL_CALL_PAL_PRIV_ENTRY+(vector<<6))
+
+/*
+** Unprivileged CALL_PAL functions are in .text 1 sub-segment;
+** the privileged bit is removed from these vectors.
+*/
+
+#define CALL_PAL_UNPRIV(vector) \
+    . = (PAL_CALL_PAL_UNPRIV_ENTRY+((vector&0x3F)<<6))
+
+/*
+** Implements a load "immediate" longword function
+*/
+#define LDLI(reg,val) \
+        ldah	reg, ((val+0x8000) >> 16)(zero); \
+        lda	reg, (val&0xffff)(reg)
+
+#endif
diff --git a/system/alpha/h/fromHudsonOsf.h b/system/alpha/h/fromHudsonOsf.h
new file mode 100644
index 0000000000..e1dfc8171d
--- /dev/null
+++ b/system/alpha/h/fromHudsonOsf.h
@@ -0,0 +1,483 @@
+/*
+ * Copyright (c) 1993-1994 The Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef FROMHUDSONOSF_INCLUDED
+#define FROMHUDSONOSF_INCLUDED 1
+
+#define	__OSF_LOADED	1
+/*
+**  Seg0 and Seg1 Virtual Address (VA) Format
+**
+**	  Loc	Size	Name	Function
+**	 -----	----	----	---------------------------------
+**	<42:33>  10	SEG1	First level page table offset
+**	<32:23>  10	SEG2	Second level page table offset
+**	<22:13>  10	SEG3	Third level page table offset
+**	<12:00>  13	OFFSET	Byte within page offset
+*/
+
+#define VA_V_SEG1	33
+#define	VA_M_SEG1	(0x3FF<<VA_V_SEG1)
+#define VA_V_SEG2	23
+#define VA_M_SEG2	(0x3FF<<VA_V_SEG2)
+#define VA_V_SEG3	13
+#define VA_M_SEG3	(0x3FF<<VA_V_SEG3)
+#define VA_V_OFFSET	0
+#define VA_M_OFFSET	0x1FFF
+
+/*
+**  Virtual Address Options: 8K byte page size
+*/
+
+#define	VA_S_SIZE	43
+#define	VA_S_OFF	13
+#define	va_s_off	13
+#define VA_S_SEG	10
+#define VA_S_PAGE_SIZE	8192
+
+/*
+**  Page Table Entry (PTE) Format
+**
+**	 Extent	Size	Name	Function
+**	 ------	----	----	---------------------------------
+**	<63:32>	  32	PFN	Page Frame Number
+**	<31:16>	  16	SW	Reserved for software
+**	<15:14>	   2	RSV0	Reserved for hardware SBZ
+**	   <13>	   1	UWE	User Write Enable
+**	   <12>	   1	KWE	Kernel Write Enable
+**	<11:10>	   2	RSV1	Reserved for hardware SBZ
+**	    <9>	   1	URE	User Read Enable
+**	    <8>	   1	KRE	Kernel Read Enable
+**	    <7>	   1	RSV2	Reserved for hardware SBZ
+**	  <6:5>	   2	GH	Granularity Hint
+**	    <4>	   1	ASM	Address Space Match
+**	    <3>	   1	FOE	Fault On Execute
+**	    <2>	   1	FOW	Fault On Write
+**	    <1>	   1	FOR	Fault On Read
+**	    <0>	   1	V	Valid
+*/
+
+#define	PTE_V_PFN	32
+#define PTE_M_PFN	0xFFFFFFFF00000000
+#define PTE_V_SW	16
+#define PTE_M_SW	0x00000000FFFF0000
+#define PTE_V_UWE	13
+#define PTE_M_UWE	(1<<PTE_V_UWE)
+#define PTE_V_KWE	12
+#define PTE_M_KWE	(1<<PTE_V_KWE)
+#define PTE_V_URE	9
+#define PTE_M_URE	(1<<PTE_V_URE)
+#define PTE_V_KRE	8
+#define PTE_M_KRE	(1<<PTE_V_KRE)
+#define PTE_V_GH	5
+#define PTE_M_GH	(3<<PTE_V_GH)
+#define PTE_V_ASM	4
+#define PTE_M_ASM	(1<<PTE_V_ASM)
+#define PTE_V_FOE	3
+#define PTE_M_FOE	(1<<PTE_V_FOE)
+#define PTE_V_FOW	2
+#define PTE_M_FOW	(1<<PTE_V_FOW)
+#define PTE_V_FOR	1
+#define PTE_M_FOR	(1<<PTE_V_FOR)
+#define PTE_V_VALID	0
+#define PTE_M_VALID	(1<<PTE_V_VALID)
+
+#define PTE_M_KSEG	0x1111
+#define PTE_M_PROT	0x3300
+#define pte_m_prot	0x3300
+
+/*
+**  System Entry Instruction Fault (entIF) Constants:
+*/
+
+#define IF_K_BPT        0x0
+#define IF_K_BUGCHK     0x1
+#define IF_K_GENTRAP    0x2
+#define IF_K_FEN        0x3
+#define IF_K_OPCDEC     0x4
+
+/*
+**  System Entry Hardware Interrupt (entInt) Constants:
+*/
+
+#define INT_K_IP	0x0
+#define INT_K_CLK	0x1
+#define INT_K_MCHK	0x2
+#define INT_K_DEV	0x3
+#define INT_K_PERF	0x4
+
+/*
+**  System Entry MM Fault (entMM) Constants:
+*/
+
+#define	MM_K_TNV	0x0
+#define MM_K_ACV	0x1
+#define MM_K_FOR	0x2
+#define MM_K_FOE	0x3
+#define MM_K_FOW	0x4
+
+/*
+**  Process Control Block (PCB) Offsets:
+*/
+
+#define PCB_Q_KSP	0x0000
+#define PCB_Q_USP	0x0008
+#define PCB_Q_PTBR	0x0010
+#define PCB_L_PCC	0x0018
+#define PCB_L_ASN	0x001C
+#define PCB_Q_UNIQUE	0x0020
+#define PCB_Q_FEN	0x0028
+#define PCB_Q_RSV0	0x0030
+#define PCB_Q_RSV1	0x0038
+
+/*
+**  Processor Status Register (PS) Bit Summary
+**
+**	Extent	Size	Name	Function
+**	------	----	----	---------------------------------
+**	  <3>	 1	CM	Current Mode
+**	<2:0>	 3	IPL	Interrupt Priority Level
+**/
+
+#define	PS_V_CM		3
+#define PS_M_CM		(1<<PS_V_CM)
+#define	PS_V_IPL	0
+#define	PS_M_IPL	(7<<PS_V_IPL)
+
+#define	PS_K_KERN	(0<<PS_V_CM)
+#define PS_K_USER	(1<<PS_V_CM)
+
+#define	IPL_K_ZERO	0x0
+#define IPL_K_SW0	0x1
+#define IPL_K_SW1	0x2
+#define IPL_K_DEV0	0x3
+#define IPL_K_DEV1	0x4
+#define IPL_K_CLK	0x5
+#define IPL_K_RT	0x6
+#define IPL_K_PERF      0x6
+#define IPL_K_PFAIL     0x6
+#define IPL_K_MCHK	0x7
+
+#define IPL_K_LOW	0x0
+#define IPL_K_HIGH	0x7
+
+/*
+**  SCB Offset Definitions:
+*/
+
+#define SCB_Q_FEN	    	0x0010
+#define SCB_Q_ACV		0x0080
+#define SCB_Q_TNV		0x0090
+#define SCB_Q_FOR		0x00A0
+#define SCB_Q_FOW		0x00B0
+#define SCB_Q_FOE		0x00C0
+#define SCB_Q_ARITH		0x0200
+#define SCB_Q_KAST		0x0240
+#define SCB_Q_EAST		0x0250
+#define SCB_Q_SAST		0x0260
+#define SCB_Q_UAST		0x0270
+#define SCB_Q_UNALIGN		0x0280
+#define SCB_Q_BPT		0x0400
+#define SCB_Q_BUGCHK		0x0410
+#define SCB_Q_OPCDEC		0x0420
+#define SCB_Q_ILLPAL		0x0430
+#define SCB_Q_TRAP		0x0440
+#define SCB_Q_CHMK		0x0480
+#define SCB_Q_CHME		0x0490
+#define SCB_Q_CHMS		0x04A0
+#define SCB_Q_CHMU		0x04B0
+#define SCB_Q_SW0		0x0500
+#define SCB_Q_SW1		0x0510
+#define SCB_Q_SW2		0x0520
+#define SCB_Q_SW3		0x0530
+#define	SCB_Q_SW4		0x0540
+#define SCB_Q_SW5		0x0550
+#define SCB_Q_SW6		0x0560
+#define SCB_Q_SW7		0x0570
+#define SCB_Q_SW8		0x0580
+#define SCB_Q_SW9		0x0590
+#define SCB_Q_SW10		0x05A0
+#define SCB_Q_SW11		0x05B0
+#define SCB_Q_SW12		0x05C0
+#define SCB_Q_SW13		0x05D0
+#define SCB_Q_SW14		0x05E0
+#define SCB_Q_SW15		0x05F0
+#define SCB_Q_CLOCK		0x0600
+#define SCB_Q_INTER		0x0610
+#define SCB_Q_SYSERR        	0x0620
+#define SCB_Q_PROCERR		0x0630
+#define SCB_Q_PWRFAIL		0x0640
+#define SCB_Q_PERFMON		0x0650
+#define SCB_Q_SYSMCHK		0x0660
+#define SCB_Q_PROCMCHK      	0x0670
+#define SCB_Q_PASSREL		0x0680
+
+/*
+**  Stack Frame (FRM) Offsets:
+**
+**  There are two types of system entries for OSF/1 - those for the
+**  callsys CALL_PAL function and those for exceptions and interrupts.
+**  Both entry types use the same stack frame layout.  The stack frame
+**  contains space for the PC, the PS, the saved GP, and the saved
+**  argument registers a0, a1, and a2.  On entry, SP points to the
+**  saved PS.
+*/
+
+#define	FRM_Q_PS	0x0000
+#define FRM_Q_PC	0x0008
+#define FRM_Q_GP	0x0010
+#define FRM_Q_A0	0x0018
+#define FRM_Q_A1	0x0020
+#define FRM_Q_A2	0x0028
+
+#define FRM_K_SIZE	48
+
+#define STACK_FRAME(tmp1,tmp2)	\
+        sll	ps, 63-PS_V_CM, p7;	\
+        bge	p7, 0f;			\
+        bis	zero, zero, ps;		\
+        mtpr	sp, ptUsp;		\
+        mfpr	sp, ptKsp;		\
+0:	lda	sp, 0-FRM_K_SIZE(sp);	\
+        stq	tmp1, FRM_Q_PS(sp);	\
+        stq	tmp2, FRM_Q_PC(sp);	\
+        stq	gp, FRM_Q_GP(sp);	\
+        stq	a0, FRM_Q_A0(sp);	\
+        stq	a1, FRM_Q_A1(sp);	\
+        stq	a2, FRM_Q_A2(sp)
+
+/*
+**  Halt Codes:
+*/
+
+#define HLT_K_RESET	    0x0000
+#define HLT_K_HW_HALT	    0x0001
+#define HLT_K_KSP_INVAL	    0x0002
+#define HLT_K_SCBB_INVAL    0x0003
+#define HLT_K_PTBR_INVAL    0x0004
+#define HLT_K_SW_HALT	    0x0005
+#define HLT_K_DBL_MCHK	    0x0006
+#define HLT_K_MCHK_FROM_PAL 0x0007
+
+/*
+**  Machine Check Codes:
+*/
+
+#define MCHK_K_TPERR	    0x0080
+#define MCHK_K_TCPERR	    0x0082
+#define MCHK_K_HERR	    0x0084
+#define MCHK_K_ECC_C	    0x0086
+#define MCHK_K_ECC_NC	    0x0088
+#define MCHK_K_UNKNOWN	    0x008A
+#define MCHK_K_CACKSOFT	    0x008C
+#define MCHK_K_BUGCHECK	    0x008E
+#define MCHK_K_OS_BUGCHECK  0x0090
+#define MCHK_K_DCPERR	    0x0092
+#define MCHK_K_ICPERR	    0x0094
+#define MCHK_K_RETRY_IRD    0x0096
+#define MCHK_K_PROC_HERR    0x0098
+
+/*
+** System Machine Check Codes:
+*/
+
+#define MCHK_K_READ_NXM     0x0200
+#define MCHK_K_SYS_HERR     0x0202
+
+/*
+**  Machine Check Error Status Summary (MCES) Register Format
+**
+**	 Extent	Size	Name	Function
+**	 ------	----	----	---------------------------------
+**	  <0>	  1	MIP	Machine check in progress
+**	  <1>	  1	SCE	System correctable error in progress
+**	  <2>	  1	PCE	Processor correctable error in progress
+**	  <3>	  1	DPC	Disable PCE error reporting
+**	  <4>	  1	DSC	Disable SCE error reporting
+*/
+
+#define MCES_V_MIP	0
+#define MCES_M_MIP	(1<<MCES_V_MIP)
+#define MCES_V_SCE	1
+#define MCES_M_SCE	(1<<MCES_V_SCE)
+#define MCES_V_PCE	2
+#define MCES_M_PCE	(1<<MCES_V_PCE)
+#define MCES_V_DPC	3
+#define MCES_M_DPC	(1<<MCES_V_DPC)
+#define MCES_V_DSC	4
+#define MCES_M_DSC	(1<<MCES_V_DSC)
+
+#define MCES_M_ALL      (MCES_M_MIP | MCES_M_SCE | MCES_M_PCE | MCES_M_DPC \
+                         | MCES_M_DSC)
+
+/*
+**  Who-Am-I (WHAMI) Register Format
+**
+**	 Extent	Size	Name	Function
+**	 ------	----	----	---------------------------------
+**	  <7:0>	  8	ID	Who-Am-I identifier
+**	  <15:8>   8	SWAP	Swap PALcode flag - character 'S'
+*/
+
+#define WHAMI_V_SWAP	8
+#define WHAMI_M_SWAP	(1<<WHAMI_V_SWAP)
+#define WHAMI_V_ID	0
+#define WHAMI_M_ID	0xFF
+
+#define WHAMI_K_SWAP    0x53    /* Character 'S' */
+
+/*
+**  Conventional Register Usage Definitions
+**
+**  Assembler temporary `at' is `AT' so it doesn't conflict with the
+**  `.set at' assembler directive.
+*/
+
+#define v0		$0	/* Function Return Value Register */
+#define t0		$1	/* Scratch (Temporary) Registers ... */
+#define t1		$2
+#define t2		$3
+#define t3		$4
+#define t4		$5
+#define t5		$6
+#define t6		$7
+#define t7		$8
+#define s0		$9	/* Saved (Non-Volatile) Registers ... */
+#define s1		$10
+#define s2		$11
+#define s3		$12
+#define s4		$13
+#define s5		$14
+#define fp		$15	/* Frame Pointer Register, Or S6 */
+#define s6		$15
+#define a0		$16	/* Argument Registers ... */
+#define a1		$17
+#define a2		$18
+#define a3		$19
+#define a4		$20
+#define a5		$21
+#define t8		$22	/* Scratch (Temporary) Registers ... */
+#define t9		$23
+#define t10		$24
+#define t11		$25
+#define ra		$26	/* Return Address Register */
+#define pv		$27	/* Procedure Value Register, Or T12 */
+#define t12		$27
+#define AT		$28	/* Assembler Temporary (Volatile) Register */
+#define gp		$29	/* Global Pointer Register */
+#define sp		$30	/* Stack Pointer Register */
+#define zero		$31	/* Zero Register */
+
+/*
+**  OSF/1 Unprivileged CALL_PAL Entry Offsets:
+**
+**	Entry Name	    Offset (Hex)
+**
+**	bpt		     0080
+**	bugchk		     0081
+**	callsys		     0083
+**	imb		     0086
+**	rdunique	     009E
+**	wrunique	     009F
+**	gentrap		     00AA
+**	dbgstop		     00AD
+*/
+
+#define UNPRIV			    0x80
+#define	PAL_BPT_ENTRY		    0x80
+#define PAL_BUGCHK_ENTRY	    0x81
+#define PAL_CALLSYS_ENTRY	    0x83
+#define PAL_IMB_ENTRY		    0x86
+#define PAL_RDUNIQUE_ENTRY	    0x9E
+#define PAL_WRUNIQUE_ENTRY	    0x9F
+#define PAL_GENTRAP_ENTRY	    0xAA
+
+#if defined(KDEBUG)
+#define	PAL_DBGSTOP_ENTRY	    0xAD
+/* #define NUM_UNPRIV_CALL_PALS	    10 */
+#else
+/* #define NUM_UNPRIV_CALL_PALS	    9  */
+#endif /* KDEBUG */
+
+/*
+**  OSF/1 Privileged CALL_PAL Entry Offsets:
+**
+**	Entry Name	    Offset (Hex)
+**
+**	halt		     0000
+**	cflush		     0001
+**	draina		     0002
+**	cserve		     0009
+**	swppal		     000A
+**	rdmces		     0010
+**	wrmces		     0011
+**	wrfen		     002B
+**	wrvptptr	     002D
+**	swpctx		     0030
+**	wrval		     0031
+**	rdval		     0032
+**	tbi		     0033
+**	wrent		     0034
+**	swpipl		     0035
+**	rdps		     0036
+**	wrkgp		     0037
+**	wrusp		     0038
+**	rdusp		     003A
+**	whami		     003C
+**	retsys		     003D
+**	rti		     003F
+*/
+
+#define PAL_HALT_ENTRY	    0x0000
+#define PAL_CFLUSH_ENTRY    0x0001
+#define PAL_DRAINA_ENTRY    0x0002
+#define PAL_CSERVE_ENTRY    0x0009
+#define PAL_SWPPAL_ENTRY    0x000A
+#define PAL_WRIPIR_ENTRY    0x000D
+#define PAL_RDMCES_ENTRY    0x0010
+#define PAL_WRMCES_ENTRY    0x0011
+#define PAL_WRFEN_ENTRY	    0x002B
+#define PAL_WRVPTPTR_ENTRY  0x002D
+#define PAL_SWPCTX_ENTRY    0x0030
+#define PAL_WRVAL_ENTRY	    0x0031
+#define PAL_RDVAL_ENTRY	    0x0032
+#define PAL_TBI_ENTRY	    0x0033
+#define PAL_WRENT_ENTRY	    0x0034
+#define PAL_SWPIPL_ENTRY    0x0035
+#define PAL_RDPS_ENTRY	    0x0036
+#define PAL_WRKGP_ENTRY	    0x0037
+#define PAL_WRUSP_ENTRY	    0x0038
+#define PAL_RDUSP_ENTRY	    0x003A
+#define PAL_WHAMI_ENTRY	    0x003C
+#define PAL_RETSYS_ENTRY    0x003D
+#define PAL_RTI_ENTRY	    0x003F
+
+#define NUM_PRIV_CALL_PALS  23
+
+#endif
+
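Review note on ev5_osfalpha_defs.h: the stack frame offsets and the MCES masks above can be cross-checked from C. The following is a minimal sketch under stated assumptions, not code from the patch: `struct osf_frame`, `mces_known_bits`, and `mces_check_in_progress` are hypothetical names introduced only for illustration, and the struct assumes each saved register occupies one 64-bit quadword, as the `FRM_Q_*` offsets suggest.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Offsets and bit positions copied from ev5_osfalpha_defs.h. */
#define FRM_Q_PS    0x0000
#define FRM_Q_PC    0x0008
#define FRM_Q_GP    0x0010
#define FRM_Q_A0    0x0018
#define FRM_Q_A1    0x0020
#define FRM_Q_A2    0x0028
#define FRM_K_SIZE  48

#define MCES_V_MIP  0
#define MCES_M_MIP  (1 << MCES_V_MIP)
#define MCES_V_SCE  1
#define MCES_M_SCE  (1 << MCES_V_SCE)
#define MCES_V_PCE  2
#define MCES_M_PCE  (1 << MCES_V_PCE)
#define MCES_V_DPC  3
#define MCES_M_DPC  (1 << MCES_V_DPC)
#define MCES_V_DSC  4
#define MCES_M_DSC  (1 << MCES_V_DSC)
#define MCES_M_ALL  (MCES_M_MIP | MCES_M_SCE | MCES_M_PCE | MCES_M_DPC \
                     | MCES_M_DSC)

/* Hypothetical C view of the frame that STACK_FRAME builds (not in the patch). */
struct osf_frame {
    uint64_t ps;   /* lowest address: SP points here on entry */
    uint64_t pc;
    uint64_t gp;
    uint64_t a0;
    uint64_t a1;
    uint64_t a2;
};

/* Compile-time checks that the struct matches the documented offsets. */
static_assert(offsetof(struct osf_frame, ps) == FRM_Q_PS, "PS offset");
static_assert(offsetof(struct osf_frame, pc) == FRM_Q_PC, "PC offset");
static_assert(offsetof(struct osf_frame, gp) == FRM_Q_GP, "GP offset");
static_assert(offsetof(struct osf_frame, a0) == FRM_Q_A0, "A0 offset");
static_assert(offsetof(struct osf_frame, a1) == FRM_Q_A1, "A1 offset");
static_assert(offsetof(struct osf_frame, a2) == FRM_Q_A2, "A2 offset");
static_assert(sizeof(struct osf_frame) == FRM_K_SIZE, "frame size");

/* Hypothetical helper: keep only the five MCES bits defined in the header. */
static uint64_t mces_known_bits(uint64_t raw)
{
    return raw & (uint64_t)MCES_M_ALL;
}

/* Hypothetical helper: 1 if a machine check is in progress (MIP set), else 0. */
static int mces_check_in_progress(uint64_t mces)
{
    return (mces & (uint64_t)MCES_M_MIP) != 0;
}
```

If the struct and the `FRM_*` constants ever drift apart, the `static_assert` lines fail at compile time, which is the main reason to prefer them over run-time checks here.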
diff --git a/system/alpha/h/rpb.h b/system/alpha/h/rpb.h
new file mode 100644
index 0000000000..1adaf82e7b
--- /dev/null
+++ b/system/alpha/h/rpb.h
@@ -0,0 +1,387 @@
+/*
+ * Copyright (c) 1990 The Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+
+/*
+ * Copyright (c) 1994, 1995, 1996 Carnegie-Mellon University.
+ * All rights reserved.
+ *
+ * Author: Keith Bostic, Chris G. Demetriou
+ *
+ * Permission to use, copy, modify and distribute this software and
+ * its documentation is hereby granted, provided that both the copyright
+ * notice and this permission notice appear in all copies of the
+ * software, derivative works or modified versions, and any portions
+ * thereof, and that both notices appear in supporting documentation.
+ *
+ * CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS"
+ * CONDITION.  CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND
+ * FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE.
+ *
+ * Carnegie Mellon requests users of this software to return to
+ *
+ *  Software Distribution Coordinator  or  Software.Distribution@CS.CMU.EDU
+ *  School of Computer Science
+ *  Carnegie Mellon University
+ *  Pittsburgh PA 15213-3890
+ *
+ * any improvements or extensions that they make and grant Carnegie the
+ * rights to redistribute these changes.
+ */
+
+/*
+ * Defines for the architected startup addresses.
+ */
+#define HWRPB_ADDR	0x10000000	/* 256 MB */
+#define BOOT_ADDR	0x20000000	/* 512 MB */
+#define PGTBL_ADDR	0x40000000	/*   1 GB */
+
+/*
+ * Values for the "haltcode" field in the per-cpu portion of the HWRPB.
+ *
+ * (The bit defines for the "sysvar" field in the HWRPB appear further
+ * below, after the "state" bits; each platform has different values for
+ * the SYSBOARD and IOBOARD bits.)
+#define HALT_PWRUP	0		/* power up */
+#define HALT_OPR	1		/* operator issued halt cmd */
+#define HALT_KSTK	2		/* kernel stack not valid */
+#define HALT_SCBB	3		/* invalid SCBB */
+#define HALT_PTBR	4		/* invalid PTBR */
+#define HALT_EXE	5		/* kernel executed halt instruction */
+#define HALT_DBLE	6		/* double error abort */
+
+/*
+ * Bit defines for the "state" field in the per-cpu portion of the HWRPB
+ */
+#define STATE_BIP	0x00000001	/* bootstrap in progress */
+#define STATE_RC	0x00000002	/* restart capable */
+#define STATE_PA	0x00000004	/* processor available to OS */
+#define STATE_PP	0x00000008	/* processor present */
+#define STATE_OH	0x00000010	/* operator halted */
+#define STATE_CV	0x00000020	/* context valid */
+#define STATE_PV	0x00000040	/* PALcode valid */
+#define STATE_PMV	0x00000080	/* PALcode memory valid */
+#define STATE_PL	0x00000100	/* PALcode loaded */
+#define STATE_HALT_MASK	0x00ff0000	/* Mask for Halt Requested field */
+#define STATE_DEFAULT	0x00000000	/* Default (no specific action) */
+#define STATE_SVRS_TERM	0x00010000	/* SAVE_TERM/RESTORE_TERM Exit */
+#define STATE_COLD_BOOT	0x00020000	/* Cold Bootstrap Requested */
+#define STATE_WARM_BOOT	0x00030000	/* Warm Bootstrap Requested */
+#define STATE_HALT	0x00040000	/* Remain halted (no restart) */
+
+/* Bit defines for the "sysvar" field in the HWRPB (SV_* values). */
+#define SV_PF_RSVD	0x00000000	/* RESERVED */
+#define SV_RESERVED	0x00000000	/* All STS bits; 0 for back compat */
+#define SV_MPCAP	0x00000001	/* MP capable */
+#define SV_PF_UNITED	0x00000020	/* United */
+#define SV_PF_SEPARATE	0x00000040	/* Separate */
+#define SV_PF_FULLBB	0x00000060	/* Full battery backup */
+#define SV_POWERFAIL	0x000000e0	/* Powerfail implementation */
+#define SV_PF_RESTART	0x00000100	/* Powerfail restart */
+
+#define SV_GRAPHICS	0x00000200	/* Embedded graphics processor */
+
+#define SV_STS_MASK	0x0000fc00	/* STS bits - system and I/O board */
+#define SV_SANDPIPER	0x00000400	/* others define system platforms */
+#define SV_FLAMINGO	0x00000800	/* STS BIT SETTINGS */
+#define SV_HOTPINK	0x00000c00	/* STS BIT SETTINGS */
+#define SV_FLAMINGOPLUS	0x00001000	/* STS BIT SETTINGS */
+#define SV_ULTRA	0x00001400	/* STS BIT SETTINGS */
+#define SV_SANDPLUS	0x00001800	/* STS BIT SETTINGS */
+#define SV_SANDPIPER45	0x00001c00	/* STS BIT SETTINGS */
+#define SV_FLAMINGO45	0x00002000	/* STS BIT SETTINGS */
+
+#define SV_SABLE	0x00000400	/* STS BIT SETTINGS */
+
+#define SV_KN20AA	0x00000400	/* STS BIT SETTINGS */
+
+/*
+ * Values for the "console type" field in the CTB portion of the HWRPB
+ */
+#define CONS_NONE	0		/* no console present */
+#define CONS_SRVC	1		/* console is service processor */
+#define CONS_DZ		2		/* console is dz/dl VT device */
+#define CONS_GRPH	3		/* cons is gfx dev w/ dz/dl keybd*/
+#define CONS_REM	4		/* cons is remote, protocol enet/MOP */
+
+/*
+ * PALcode variants that we're interested in.
+ * Used as indices into the "palrev_avail" array in the per-cpu portion
+ * of the HWRPB.
+ */
+#define PALvar_reserved	0
+#define PALvar_OpenVMS	1
+#define PALvar_OSF1	2
+
+/*
+ * The Alpha restart parameter block, which is a page or 2 in low memory
+ */
+struct rpb {
+    struct rpb *rpb_selfref;	/* 000: physical self-reference */
+    long  rpb_string;		/* 008: contains string "HWRPB" */
+    long  rpb_vers;		/* 010: HWRPB version number */
+    ulong rpb_size;		/* 018: bytes in RPB perCPU CTB CRB MEMDSC */
+    ulong rpb_cpuid;		/* 020: primary cpu id */
+    ulong rpb_pagesize;		/* 028: page size in bytes */
+    ulong rpb_addrbits;		/* 030: number of phys addr bits */
+    ulong rpb_maxasn;		/* 038: max valid ASN */
+    char  rpb_ssn[16];		/* 040: system serial num: 10 ascii chars */
+    ulong grpb_systype;		/* 050: system type */
+    long  rpb_sysvar;		/* 058: system variation */
+    long  rpb_sysrev;		/* 060: system revision */
+    ulong rpb_clock;		/* 068: scaled interval clock intr freq */
+    ulong rpb_counter;		/* 070: cycle counter frequency */
+    ulong rpb_vptb;		/* 078: virtual page table base */
+    long  rpb_res1;		/* 080: reserved */
+    ulong rpb_trans_off;	/* 088: offset to translation buffer hint */
+    ulong rpb_numprocs;		/* 090: number of processor slots */
+    ulong rpb_slotsize;		/* 098: per-cpu slot size */
+    ulong rpb_percpu_off;	/* 0A0: offset to per_cpu slots */
+    ulong rpb_num_ctb;		/* 0A8: number of CTBs */
+    ulong rpb_ctb_size;		/* 0B0: bytes in largest CTB */
+    ulong rpb_ctb_off;		/* 0B8: offset to CTB (cons term block) */
+    ulong rpb_crb_off;		/* 0C0: offset to CRB (cons routine block) */
+    ulong rpb_mdt_off;		/* 0C8: offset to memory descriptor table */
+    ulong rpb_config_off;	/* 0D0: offset to config data block */
+    ulong rpb_fru_off;		/* 0D8: offset to FRU table */
+    void  (*rpb_saveterm)();	/* 0E0: virt addr of save term routine */
+    long  rpb_saveterm_pv;	/* 0E8: proc value for save term routine */
+    void  (*rpb_rstrterm)();	/* 0F0: virt addr of restore term routine */
+    long  rpb_rstrterm_pv;	/* 0F8: proc value for restore term routine */
+    void  (*rpb_restart)();	/* 100: virt addr of CPU restart routine */
+    long  rpb_restart_pv;	/* 108: proc value for CPU restart routine */
+    long  rpb_software;		/* 110: used to determine presence of kdebug */
+    long  rpb_hardware;		/* 118: reserved for hardware */
+    long  rpb_checksum;		/* 120: checksum of prior entries in rpb */
+    long  rpb_rxrdy;		/* 128: receive ready bitmask */
+    long  rpb_txrdy;		/* 130: transmit ready bitmask */
+    ulong rpb_dsr_off;		/* 138: Dynamic System Recog. offset */
+};
+
+#define rpb_kdebug rpb_software
+
+#define OSF_HWRPB_ADDR	((vm_offset_t)(-1L << 23))
+
+/*
+ * This is the format for the boot/restart HWPCB.  It must match the
+ * initial fields of the pcb structure as defined in pcb.h, but must
+ * additionally contain the appropriate amount of padding to line up
+ * with formats used by other palcode types.
+ */
+struct bootpcb {
+    long rpb_ksp;		/* 000: kernel stack pointer */
+    long rpb_usp;		/* 008: user stack pointer */
+    long rpb_ptbr;		/* 010: page table base register */
+    int  rpb_cc;		/* 018: cycle counter */
+    int  rpb_asn;		/* 01C: address space number */
+    long rpb_proc_uniq;		/* 020: proc/thread unique value */
+    long rpb_fen;		/* 028: floating point enable */
+    long rpb_palscr[2];		/* 030: pal scratch area */
+    long rpb_pcbpad[8];		/* 040: padding for fixed size */
+};
+
+/*
+ * Inter-Console Communications Buffer
+ * Used for the primary processor to communicate with the console
+ * of secondary processors.
+ */
+struct iccb {
+    uint iccb_rxlen;		/* receive length in bytes      */
+    uint iccb_txlen;		/* transmit length in bytes     */
+    char iccb_rxbuf[80];	/* receive buffer               */
+    char iccb_txbuf[80];	/* transmit buffer              */
+};
+
+/*
+ * The per-cpu portion of the Alpha HWRPB.
+ * Note that the main portion of the HWRPB is of variable size,
+ * hence this must be a separate structure.
+ *
+ */
+struct rpb_percpu {
+    struct bootpcb rpb_pcb;	/* 000: boot/restart HWPCB */
+    long rpb_state;		/* 080: per-cpu state bits */
+    long rpb_palmem;		/* 088: palcode memory length */
+    long rpb_palscratch;	/* 090: palcode scratch length */
+    long rpb_palmem_addr;	/* 098: phys addr of palcode mem space */
+    long rpb_palscratch_addr;	/* 0A0: phys addr of palcode scratch space */
+    long rpb_palrev;		/* 0A8: PALcode rev required */
+    long rpb_proctype;		/* 0B0: processor type */
+    long rpb_procvar;		/* 0B8: processor variation */
+    long rpb_procrev;		/* 0C0: processor revision */
+    char rpb_procsn[16];	/* 0C8: proc serial num: 10 ascii chars */
+    long rpb_logout;		/* 0D8: phys addr of logout area */
+    long rpb_logout_len;	/* 0E0: length in bytes of logout area */
+    long rpb_haltpb;		/* 0E8: halt pcb base */
+    long rpb_haltpc;		/* 0F0: halt pc */
+    long rpb_haltps;		/* 0F8: halt ps */
+    long rpb_haltal;		/* 100: halt arg list (R25) */
+    long rpb_haltra;		/* 108: halt return address (R26) */
+    long rpb_haltpv;		/* 110: halt procedure value (R27) */
+    long rpb_haltcode;		/* 118: reason for halt */
+    long rpb_software;		/* 120: for software */
+    struct iccb	rpb_iccb;       /* 128: inter-console communications buffer */
+    long rpb_palrev_avail[16];	/* 1D0: PALcode revs available */
+    long rpb_pcrsvd[6];		/* 250: reserved for arch use */
+/* the dump stack grows from the end of the rpb page not to reach here */
+};
+
+/* The firmware revision is in the (unused) first entry of palrevs available */
+#define rpb_firmrev rpb_palrev_avail[0]
+
+/*
+ * The memory cluster descriptor.
+ */
+struct rpb_cluster {
+    long rpb_pfn;		/* 000: starting PFN of this cluster */
+    long rpb_pfncount;		/* 008: count of PFNs in this cluster */
+    long rpb_pfntested;		/* 010: count of tested PFNs in cluster */
+    long rpb_va;		/* 018: va of bitmap */
+    long rpb_pa;		/* 020: pa of bitmap */
+    long rpb_checksum;		/* 028: checksum of bitmap */
+    long rpb_usage;		/* 030: usage of cluster */
+};
+#define CLUSTER_USAGE_OS	((long)0)
+#define CLUSTER_USAGE_PAL	((long)1)
+#define CLUSTER_USAGE_NVRAM	((long)2)
+
+/*
+ * The "memory descriptor table" portion of the HWRPB.
+ * Note that the main portion of the HWRPB is of variable size and there is a
+ * variable number of per-cpu slots, hence this must be a separate structure.
+ * Also note that the memory descriptor table contains a fixed portion plus
+ * a variable number of "memory cluster descriptors" (one for each "cluster"
+ * of memory).
+ */
+struct rpb_mdt {
+    long rpb_checksum;		/* 000: checksum of entire mem desc table */
+    long rpb_impaddr;		/* 008: PA of implementation dep info */
+    long rpb_numcl;		/* 010: number of clusters */
+    struct rpb_cluster rpb_cluster[1];	/* first instance of a cluster */
+};
+
+/*
+ * The "Console Terminal Block" portion of the HWRPB, for serial line
+ * UART console device.
+ */
+struct ctb_tt {
+
+    long ctb_type;               /*   0: always 4 */
+    long ctb_unit;               /*   8: */
+    long ctb_reserved;           /*  16: */
+    long ctb_len;                /*  24: bytes of info */
+    long ctb_ipl;                /*  32: console ipl level */
+    long ctb_tintr_vec;          /*  40: transmit vec (0x800) */
+    long ctb_rintr_vec;          /*  48: receive vec (0x800) */
+#define CTB_GRAPHICS       3     /* graphics device */
+#define CTB_NETWORK     0xC0     /* network device */
+#define CTB_PRINTERPORT    2     /* printer port on the SCC */
+    long ctb_term_type;          /*  56: terminal type */
+    long ctb_keybd_type;         /*  64: keyboard nationality */
+    long ctb_keybd_trans;        /*  72: trans. table addr */
+    long ctb_keybd_map;          /*  80: map table addr */
+    long ctb_keybd_state;        /*  88: keyboard flags */
+    long ctb_keybd_last;         /*  96: last key entered */
+    long ctb_font_us;            /* 104: US font table addr */
+    long ctb_font_mcs;           /* 112: MCS font table addr */
+    long ctb_font_width;         /* 120: font width, height */
+    long ctb_font_height;        /* 128:         in pixels */
+    long ctb_mon_width;          /* 136: monitor width, height */
+    long ctb_mon_height;         /* 144:         in pixels */
+    long ctb_dpi;                /* 152: monitor dots per inch */
+    long ctb_planes;             /* 160: # of planes */
+    long ctb_cur_width;          /* 168: cursor width, height */
+    long ctb_cur_height;         /* 176:         in pixels */
+    long ctb_head_cnt;           /* 184: # of heads */
+    long ctb_opwindow;           /* 192: opwindow on screen */
+    long ctb_head_offset;        /* 200: offset to head info */
+    long ctb_putchar;            /* 208: output char to TURBO */
+    long ctb_io_state;           /* 216: I/O flags */
+    long ctb_listen_state;       /* 224: listener flags */
+    long ctb_xaddr;              /* 232: extended info addr */
+    long ctb_turboslot;          /* 248: TURBOchannel slot # */
+    long ctb_server_off;         /* 256: offset to server info */
+    long ctb_line_off;           /* 264: line parameter offset */
+    char ctb_csd;                /* 272: console specific data */
+};
+
+/*
+ * The "Console Terminal Block" portion of the HWRPB.
+ */
+struct rpb_ctb {
+    long rpb_type;		/* 000: console type */
+    long rpb_unit;		/* 008: console unit */
+    long rpb_resv;		/* 010: reserved */
+    long rpb_length;		/* 018: byte length of device dep portion */
+    long rpb_first;		/* 000: first field of device dep portion */
+};
+
+/*
+ * The physical/virtual map for the console routine block.
+ */
+struct rpb_map {
+    long rpb_virt;		/* virtual address for map entry */
+    long rpb_phys;		/* phys address for map entry */
+    long rpb_pgcount;		/* page count for map entry */
+};
+
+/*
+ * The "Console Routine Block" portion of the HWRPB.
+ * Note: the "offsets" are all relative to the start of the HWRPB (HWRPB_ADDR).
+ */
+struct rpb_crb {
+    long rpb_va_disp;		/* va of call-back dispatch rtn */
+    long rpb_pa_disp;		/* pa of call-back dispatch rtn */
+    long rpb_va_fixup;		/* va of call-back fixup rtn */
+    long rpb_pa_fixup;		/* pa of call-back fixup rtn */
+    long rpb_num;		/* number of entries in phys/virt map */
+    long rpb_mapped_pages;	/* Number of pages to be mapped */
+    struct rpb_map rpb_map[1];	/* first instance of a map entry */
+};
+
+/*
+ * These macros define where within the HWRPB the CTB and CRB are located.
+ */
+#define CTB_SETUP \
+    ((struct rpb_ctb *) ((long)hwrpb_addr + (long)(hwrpb_addr->rpb_ctb_off)))
+
+#define CRB_SETUP \
+    ((struct rpb_crb *) ((long)hwrpb_addr + (long)(hwrpb_addr->rpb_crb_off)))
+
+/*
+ * The "Dynamic System Recognition" portion of the HWRPB.
+ * It is used to obtain the platform-specific data needed to let the
+ * platform define the platform name and the platform SMM and LURT
+ * data for software licensing.
+ */
+struct rpb_dsr {
+    long rpb_smm;		/* SMM number used by LMF	*/
+    ulong rpb_lurt_off;		/* offset to LURT table		*/
+    ulong rpb_sysname_off;	/* offset to sysname char count	*/
+    int	lurt[10];		/* XXM has one LURT entry	*/
+};
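Review note on rpb.h: `CTB_SETUP` and `CRB_SETUP` locate sub-blocks by adding an offset stored in the HWRPB to the HWRPB base address, and the per-cpu "state" word carries a Halt Requested field under `STATE_HALT_MASK`. The following is a minimal sketch of both ideas, not code from the patch. `RPB_CTB_OFF_FIELD`, `RPB_CRB_OFF_FIELD`, `struct bootpcb64`, `hwrpb_subblock`, and `state_halt_request` are hypothetical names. The sketch assumes Alpha's 64-bit `long` and 32-bit `int`, takes the field positions from the `0B8`/`0C0` offset comments in `struct rpb`, and reads the offsets in host byte order.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>   /* memcpy */

/* Copied from rpb.h. */
#define STATE_HALT_MASK  0x00ff0000
#define STATE_COLD_BOOT  0x00020000
#define STATE_WARM_BOOT  0x00030000

/* Byte positions of rpb_ctb_off and rpb_crb_off, per the struct rpb comments. */
#define RPB_CTB_OFF_FIELD  0xB8   /* hypothetical name */
#define RPB_CRB_OFF_FIELD  0xC0   /* hypothetical name */

/* Hypothetical fixed-width mirror of struct bootpcb (long = 64 bits, int = 32). */
struct bootpcb64 {
    int64_t ksp;         /* 000 */
    int64_t usp;         /* 008 */
    int64_t ptbr;        /* 010 */
    int32_t cc;          /* 018 */
    int32_t asn;         /* 01C */
    int64_t proc_uniq;   /* 020 */
    int64_t fen;         /* 028 */
    int64_t palscr[2];   /* 030 */
    int64_t pcbpad[8];   /* 040 */
};

/* rpb_state sits at 0x080 in struct rpb_percpu, so the HWPCB must be 0x80 bytes. */
static_assert(sizeof(struct bootpcb64) == 0x80, "HWPCB must be 0x80 bytes");

/*
 * Hypothetical helper: read the 64-bit offset stored at 'field_off' inside
 * the HWRPB image and return the address of the sub-block it refers to.
 * This is the same base + offset arithmetic that CTB_SETUP and CRB_SETUP use.
 * The offset is read in host byte order (an assumption of this sketch).
 */
static const uint8_t *hwrpb_subblock(const uint8_t *hwrpb, size_t field_off)
{
    uint64_t rel;
    memcpy(&rel, hwrpb + field_off, sizeof rel);
    return hwrpb + rel;
}

/* Hypothetical helper: extract the Halt Requested field from a state word. */
static uint64_t state_halt_request(uint64_t state)
{
    return state & (uint64_t)STATE_HALT_MASK;
}
```

Working on a byte buffer rather than on `struct rpb` itself keeps the sketch independent of the host's `long` width, which is 32 bits on some platforms.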
diff --git a/system/alpha/h/tlaser.h b/system/alpha/h/tlaser.h
new file mode 100644
index 0000000000..283d61be3c
--- /dev/null
+++ b/system/alpha/h/tlaser.h
@@ -0,0 +1,34 @@
+/*
+ * Copyright (c) 1990 Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#define tlsb_tlber      0x40
+#define tlsb_tldev      0x00
+#define tlsb_tlesr0     0x680
+#define tlsb_tlesr1     0x6C0
+#define tlsb_tlesr2     0x700
+#define tlsb_tlesr3     0x740
diff --git a/system/alpha/palcode/Makefile b/system/alpha/palcode/Makefile
new file mode 100644
index 0000000000..2f1eded33f
--- /dev/null
+++ b/system/alpha/palcode/Makefile
@@ -0,0 +1,92 @@
+# Copyright (c) 2003, 2004
+# The Regents of The University of Michigan
+# All Rights Reserved
+#
+# This code is part of the M5 simulator.
+#
+# Permission is granted to use, copy, create derivative works and
+# redistribute this software and such derivative works for any purpose,
+# so long as the copyright notice above, this grant of permission, and
+# the disclaimer below appear in all copies made; and so long as the
+# name of The University of Michigan is not used in any advertising or
+# publicity pertaining to the use or distribution of this software
+# without specific, written prior authorization.
+#
+# THIS SOFTWARE IS PROVIDED AS IS, WITHOUT REPRESENTATION FROM THE
+# UNIVERSITY OF MICHIGAN AS TO ITS FITNESS FOR ANY PURPOSE, AND WITHOUT
+# WARRANTY BY THE UNIVERSITY OF MICHIGAN OF ANY KIND, EITHER EXPRESS OR
+# IMPLIED, INCLUDING WITHOUT LIMITATION THE IMPLIED WARRANTIES OF
+# MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE REGENTS OF
+# THE UNIVERSITY OF MICHIGAN SHALL NOT BE LIABLE FOR ANY DAMAGES,
+# INCLUDING DIRECT, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL
+# DAMAGES, WITH RESPECT TO ANY CLAIM ARISING OUT OF OR IN CONNECTION
+# WITH THE USE OF THE SOFTWARE, EVEN IF IT HAS BEEN OR IS HEREAFTER
+# ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
+#
+# Authors: Nathan L. Binkert
+#          Ali G. Saidi
+
+# Makefile for palcode
+# Works on alpha-linux and builds elf executable
+
+### If we are not compiling on an alpha, we must use cross tools ###
+ifneq ($(shell uname -m), alpha)
+CROSS_COMPILE?=alpha-unknown-linux-gnu-
+endif
+CC=$(CROSS_COMPILE)gcc
+AS=$(CROSS_COMPILE)as
+LD=$(CROSS_COMPILE)ld
+
+CFLAGS=-I . -I ../h -nostdinc -nostdinc++ -Wa,-m21164
+LDFLAGS=-Ttext 0x4000
+
+TLOBJS = osfpal.o platform_tlaser.o
+TLOBJS_COPY = osfpal_cache_copy.o platform_tlaser.o
+TLOBJS_COPY_UNALIGNED = osfpal_cache_copy_unaligned.o platform_tlaser.o
+TSOBJS = osfpal.o platform_tsunami.o
+TSBOBJS = osfpal.o platform_bigtsunami.o
+TSOBJS_COPY = osfpal_cache_copy.o platform_tsunami.o
+TSOBJS_COPY_UNALIGNED = osfpal_cache_copy_unaligned.o platform_bigtsunami.o
+
+all: tlaser tsunami tsunami_b64
+
+all_copy: tlaser tlaser_copy tsunami tsunami_b64 tsunami_copy
+
+osfpal.o: osfpal.S
+	$(CC) $(CFLAGS) -o $@ -c $<
+
+osfpal_cache_copy.o: osfpal.S
+	$(CC) $(CFLAGS) -DCACHE_COPY -o $@ -c $<
+
+osfpal_cache_copy_unaligned.o: osfpal.S
+	$(CC) $(CFLAGS) -DCACHE_COPY -DCACHE_COPY_UNALIGNED -o $@ -c $<
+
+platform_tlaser.o: platform.S
+	$(CC) $(CFLAGS) -DTLASER -o $@ -c $<
+
+platform_tsunami.o: platform.S
+	$(CC) $(CFLAGS) -DTSUNAMI -o $@ -c $<
+
+platform_bigtsunami.o: platform.S
+	$(CC) $(CFLAGS) -DBIG_TSUNAMI -o $@ -c $<
+
+tlaser:  $(TLOBJS)
+	$(LD) $(LDFLAGS) -o tl_osfpal $(TLOBJS)
+
+tlaser_copy: $(TLOBJS_COPY) $(TLOBJS_COPY_UNALIGNED)
+	$(LD) $(LDFLAGS) -o tl_osfpal_cache $(TLOBJS_COPY)
+	$(LD) $(LDFLAGS) -o tl_osfpal_unalign $(TLOBJS_COPY_UNALIGNED)
+
+tsunami: $(TSOBJS)
+	$(LD) $(LDFLAGS) -o ts_osfpal $(TSOBJS)
+
+tsunami_b64: $(TSBOBJS)
+	$(LD) $(LDFLAGS) -o tsb_osfpal $(TSBOBJS)
+
+tsunami_copy: $(TSOBJS_COPY) $(TSOBJS_COPY_UNALIGNED)
+	$(LD) $(LDFLAGS) -o ts_osfpal_cache $(TSOBJS_COPY)
+	$(LD) $(LDFLAGS) -o ts_osfpal_unalign $(TSOBJS_COPY_UNALIGNED)
+
+clean:
+	rm -f *.o tl_osfpal tl_osfpal_cache tl_osfpal_unalign ts_osfpal \
+	ts_osfpal_cache ts_osfpal_unalign tsb_osfpal
diff --git a/system/alpha/palcode/osfpal.S b/system/alpha/palcode/osfpal.S
new file mode 100644
index 0000000000..3ec4d40118
--- /dev/null
+++ b/system/alpha/palcode/osfpal.S
@@ -0,0 +1,4202 @@
+/*
+ * Copyright (c) 2003-2006 The Regents of The University of Michigan
+ * Copyright (c) 1992-1995 Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: Ali G. Saidi
+ *          Nathan L. Binkert
+ */
+
+// modified to use the Hudson style "impure.h" instead of ev5_impure.sdl
+// since we don't have a mechanism to expand the data structures.... pb Nov/95
+#include "ev5_defs.h"
+#include "ev5_impure.h"
+#include "ev5_alpha_defs.h"
+#include "ev5_paldef.h"
+#include "ev5_osfalpha_defs.h"
+#include "fromHudsonMacros.h"
+#include "fromHudsonOsf.h"
+#include "dc21164FromGasSources.h"
+
+#define DEBUGSTORE(c) nop
+
+#define DEBUG_EXC_ADDR()\
+        bsr	r25, put_exc_addr; \
+        DEBUGSTORE(13)		; \
+        DEBUGSTORE(10)
+
+// This is the fix for the user-mode super page references causing the
+// machine to crash.
+#define hw_rei_spe	hw_rei
+
+#define vmaj 1
+#define vmin 18
+#define vms_pal 1
+#define osf_pal 2
+#define pal_type osf_pal
+#define osfpal_version_l ((pal_type<<16) | (vmaj<<8) | (vmin<<0))
+
+
+///////////////////////////
+// PALtemp register usage
+///////////////////////////
+
+//  The EV5 Ibox holds 24 PALtemp registers.  This maps the OSF PAL usage
+//  for these PALtemps:
+//
+//	pt0   local scratch
+//	pt1   local scratch
+//	pt2   entUna					pt_entUna
+//	pt3   CPU specific impure area pointer		pt_impure
+//	pt4   memory management temp
+//	pt5   memory management temp
+//	pt6   memory management temp
+//	pt7   entIF					pt_entIF
+//	pt8   intmask					pt_intmask
+//	pt9   entSys					pt_entSys
+//	pt10
+//	pt11  entInt					pt_entInt
+//	pt12  entArith					pt_entArith
+//	pt13  reserved for system specific PAL
+//	pt14  reserved for system specific PAL
+//	pt15  reserved for system specific PAL
+//	pt16  MISC: scratch ! WHAMI<7:0> ! 0 0 0 MCES<4:0> pt_misc, pt_whami,
+//                pt_mces
+//	pt17  sysval					pt_sysval
+//	pt18  usp					pt_usp
+//	pt19  ksp					pt_ksp
+//	pt20  PTBR					pt_ptbr
+//	pt21  entMM					pt_entMM
+//	pt22  kgp					pt_kgp
+//	pt23  PCBB					pt_pcbb
+//
+//
+
+
+/////////////////////////////
+// PALshadow register usage
+/////////////////////////////
+
+//
+// EV5 shadows R8-R14 and R25 when in PALmode and ICSR<shadow_enable> = 1.
+// This maps the OSF PAL usage of R8 - R14 and R25:
+//
+// 	r8    ITBmiss/DTBmiss scratch
+// 	r9    ITBmiss/DTBmiss scratch
+// 	r10   ITBmiss/DTBmiss scratch
+//	r11   PS
+//	r12   local scratch
+//	r13   local scratch
+//	r14   local scratch
+//	r25   local scratch
+//
+
+
+
+// .sbttl	"PALcode configuration options"
+
+// There are a number of options that may be assembled into this version of
+// PALcode. They should be adjusted in a prefix assembly file (i.e. do not edit
+// the following). The options that can be adjusted cause the resultant PALcode
+// to reflect the desired target system.
+
+// multiprocessor support can be enabled for a max of n processors by
+// setting the following to the number of processors on the system.
+// Note that this is really the max cpuid.
+
+#define max_cpuid 1
+#ifndef max_cpuid
+#define max_cpuid 8
+#endif
+
+#define osf_svmin 1
+#define osfpal_version_h ((max_cpuid<<16) | (osf_svmin<<0))
+
+//
+// RESET	-  Reset Trap Entry Point
+//
+// RESET - offset 0000
+// Entry:
+//	Vectored into via hardware trap on reset, or branched to
+//	on swppal.
+//
+//	r0 = whami
+//	r1 = pal_base
+//	r2 = base of scratch area
+//	r3 = halt code
+//
+//
+// Function:
+//
+//
+
+        .text	0
+        . = 0x0000
+        .globl _start
+        .globl Pal_Base
+_start:
+Pal_Base:
+        HDW_VECTOR(PAL_RESET_ENTRY)
+Trap_Reset:
+        nop
+        /*
+         * store into r1
+         */
+        br r1,sys_reset
+
+        // Specify PAL version info as a constant
+        // at a known location (reset + 8).
+
+        .long osfpal_version_l		// <pal_type@16> ! <vmaj@8> ! <vmin@0>
+        .long osfpal_version_h		// <max_cpuid@16> ! <osf_svmin@0>
+        .long 0
+        .long 0
+pal_impure_start:
+        .quad 0
+pal_debug_ptr:
+        .quad 0				// reserved for debug pointer ; 20
+
+
+//
+// IACCVIO - Istream Access Violation Trap Entry Point
+//
+// IACCVIO - offset 0080
+// Entry:
+//	Vectored into via hardware trap on Istream access violation or sign check error on PC.
+//
+// Function:
+//	Build stack frame
+//	a0 <- Faulting VA
+//	a1 <- MMCSR  (1 for ACV)
+//	a2 <- -1 (for ifetch fault)
+//	vector via entMM
+//
+
+        HDW_VECTOR(PAL_IACCVIO_ENTRY)
+Trap_Iaccvio:
+        DEBUGSTORE(0x42)
+        sll	r11, 63-osfps_v_mode, r25 // Shift mode up to MS bit
+        mtpr	r31, ev5__ps		// Set Ibox current mode to kernel
+
+        bis	r11, r31, r12		// Save PS
+        bge	r25, TRAP_IACCVIO_10_		// no stack swap needed if cm=kern
+
+
+        mtpr	r31, ev5__dtb_cm	// Set Mbox current mode to kernel -
+                                        //     no virt ref for next 2 cycles
+        mtpr	r30, pt_usp		// save user stack
+
+        bis	r31, r31, r12		// Set new PS
+        mfpr	r30, pt_ksp
+
+TRAP_IACCVIO_10_:
+        lda	sp, 0-osfsf_c_size(sp)// allocate stack space
+        mfpr	r14, exc_addr		// get pc
+
+        stq	r16, osfsf_a0(sp)	// save regs
+        bic	r14, 3, r16		// pass pc/va as a0
+
+        stq	r17, osfsf_a1(sp)	// a1
+        or	r31, mmcsr_c_acv, r17	// pass mm_csr as a1
+
+        stq	r18, osfsf_a2(sp) 	// a2
+        mfpr	r13, pt_entmm		// get entry point
+
+        stq	r11, osfsf_ps(sp)	// save old ps
+        bis	r12, r31, r11		// update ps
+
+        stq	r16, osfsf_pc(sp)	// save pc
+        stq	r29, osfsf_gp(sp) 	// save gp
+
+        mtpr	r13, exc_addr		// load exc_addr with entMM
+                                        // 1 cycle to hw_rei
+        mfpr	r29, pt_kgp		// get the kgp
+
+        subq	r31, 1, r18		// pass flag of istream, as a2
+        hw_rei_spe
+
+
+//
+// INTERRUPT - Interrupt Trap Entry Point
+//
+// INTERRUPT - offset 0100
+// Entry:
+//	Vectored into via trap on hardware interrupt
+//
+// Function:
+//	check for halt interrupt
+//	check for passive release (current ipl geq requestor)
+//	if necessary, switch to kernel mode, push stack frame,
+//      update ps (including current mode and ipl copies), sp, and gp
+//	pass the interrupt info to the system module
+//
+//
+        HDW_VECTOR(PAL_INTERRUPT_ENTRY)
+Trap_Interrupt:
+        mfpr    r13, ev5__intid         // Fetch level of interruptor
+        mfpr    r25, ev5__isr           // Fetch interrupt summary register
+
+        srl     r25, isr_v_hlt, r9     // Get HLT bit
+        mfpr	r14, ev5__ipl
+
+        mtpr	r31, ev5__dtb_cm	// Set Mbox current mode to kern
+        blbs    r9, sys_halt_interrupt	// halt_interrupt if HLT bit set
+
+        cmple   r13, r14, r8            // R8 = 1 if intid .less than or eql. ipl
+        bne     r8, sys_passive_release // Passive release if current rupt is lt or eq ipl
+
+        and	r11, osfps_m_mode, r10 // get mode bit
+        beq	r10, TRAP_INTERRUPT_10_		// Skip stack swap in kernel
+
+        mtpr	r30, pt_usp		// save user stack
+        mfpr	r30, pt_ksp		// get kern stack
+
+TRAP_INTERRUPT_10_:
+        lda	sp, (0-osfsf_c_size)(sp)// allocate stack space
+        mfpr	r14, exc_addr		// get pc
+
+        stq	r11, osfsf_ps(sp) 	// save ps
+        stq	r14, osfsf_pc(sp) 	// save pc
+
+        stq     r29, osfsf_gp(sp)       // push gp
+        stq	r16, osfsf_a0(sp)	// a0
+
+//	pvc_violate 354			// ps is cleared anyway,  if store to stack faults.
+        mtpr    r31, ev5__ps            // Set Ibox current mode to kernel
+        stq	r17, osfsf_a1(sp)	// a1
+
+        stq	r18, osfsf_a2(sp) 	// a2
+        subq	r13, 0x11, r12		// Start to translate from EV5IPL->OSFIPL
+
+        srl	r12, 1, r8		// 1d, 1e: ipl 6.  1f: ipl 7.
+        subq	r13, 0x1d, r9		// Check for 1d, 1e, 1f
+
+        cmovge	r9, r8, r12		// if .ge. 1d, then take shifted value
+        bis	r12, r31, r11		// set new ps
+
+        mfpr	r12, pt_intmask
+        and	r11, osfps_m_ipl, r14	// Isolate just new ipl (not really needed, since all non-ipl bits zeroed already)
+
+        /*
+         * Lance had space problems. We don't.
+         */
+        extbl	r12, r14, r14		// Translate new OSFIPL->EV5IPL
+        mfpr	r29, pt_kgp		// update gp
+        mtpr	r14, ev5__ipl		// load the new IPL into Ibox
+        br	r31, sys_interrupt	// Go handle interrupt
+
+
+
+//
+// ITBMISS - Istream TBmiss Trap Entry Point
+//
+// ITBMISS - offset 0180
+// Entry:
+//	Vectored into via hardware trap on Istream translation buffer miss.
+//
+// Function:
+//       Do a virtual fetch of the PTE, and fill the ITB if the PTE is valid.
+//       Can trap into DTBMISS_DOUBLE.
+//       This routine can use the PALshadow registers r8, r9, and r10
+//
+//
+
+        HDW_VECTOR(PAL_ITB_MISS_ENTRY)
+Trap_Itbmiss:
+        // Real MM mapping
+        nop
+        mfpr	r8, ev5__ifault_va_form // Get virtual address of PTE.
+
+        nop
+        mfpr    r10, exc_addr           // Get PC of faulting instruction in case of DTBmiss.
+
+pal_itb_ldq:
+        ld_vpte r8, 0(r8)             	// Get PTE, traps to DTBMISS_DOUBLE in case of TBmiss
+        mtpr	r10, exc_addr		// Restore exc_address if there was a trap.
+
+        mfpr	r31, ev5__va		// Unlock VA in case there was a double miss
+        nop
+
+        and	r8, osfpte_m_foe, r25 	// Look for FOE set.
+        blbc	r8, invalid_ipte_handler // PTE not valid.
+
+        nop
+        bne	r25, foe_ipte_handler	// FOE is set
+
+        nop
+        mtpr	r8, ev5__itb_pte	// Ibox remembers the VA, load the PTE into the ITB.
+
+        hw_rei_stall			//
+
+
+//
+// DTBMISS_SINGLE - Dstream Single TBmiss Trap Entry Point
+//
+// DTBMISS_SINGLE - offset 0200
+// Entry:
+//	Vectored into via hardware trap on Dstream single translation
+//      buffer miss.
+//
+// Function:
+//	Do a virtual fetch of the PTE, and fill the DTB if the PTE is valid.
+//	Can trap into DTBMISS_DOUBLE.
+//	This routine can use the PALshadow registers r8, r9, and r10
+//
+
+        HDW_VECTOR(PAL_DTB_MISS_ENTRY)
+Trap_Dtbmiss_Single:
+        mfpr	r8, ev5__va_form      	// Get virtual address of PTE - 1 cycle delay.  E0.
+        mfpr    r10, exc_addr           // Get PC of faulting instruction in case of error.  E1.
+
+//	DEBUGSTORE(0x45)
+//	DEBUG_EXC_ADDR()
+                                        // Real MM mapping
+        mfpr    r9, ev5__mm_stat	// Get read/write bit.  E0.
+        mtpr	r10, pt6		// Stash exc_addr away
+
+pal_dtb_ldq:
+        ld_vpte r8, 0(r8)             	// Get PTE, traps to DTBMISS_DOUBLE in case of TBmiss
+        nop				// Pad MF VA
+
+        mfpr	r10, ev5__va            // Get original faulting VA for TB load.  E0.
+        nop
+
+        mtpr    r8, ev5__dtb_pte       	// Write DTB PTE part.   E0.
+        blbc    r8, invalid_dpte_handler    // Handle invalid PTE
+
+        mtpr    r10, ev5__dtb_tag      	// Write DTB TAG part, completes DTB load.  No virt ref for 3 cycles.
+        mfpr	r10, pt6
+
+                                        // Following 2 instructions take 2 cycles
+        mtpr    r10, exc_addr           // Return linkage in case we trapped.  E1.
+        mfpr	r31,  pt0		// Pad the write to dtb_tag
+
+        hw_rei                          // Done, return
+
+
+//
+// DTBMISS_DOUBLE - Dstream Double TBmiss Trap Entry Point
+//
+//
+// DTBMISS_DOUBLE - offset 0280
+// Entry:
+//	Vectored into via hardware trap on Double TBmiss from single
+//      miss flows.
+//
+//	r8   - faulting VA
+//	r9   - original MMstat
+//	r10 - original exc_addr (both itb,dtb miss)
+//	pt6 - original exc_addr (dtb miss flow only)
+//	VA IPR - locked with original faulting VA
+//
+// Function:
+// 	Get PTE, if valid load TB and return.
+//	If not valid then take TNV/ACV exception.
+//
+//	pt4 and pt5 are reserved for this flow.
+//
+//
+//
+
+        HDW_VECTOR(PAL_DOUBLE_MISS_ENTRY)
+Trap_Dtbmiss_double:
+        mtpr 	r8, pt4			// save r8 to do exc_addr check
+        mfpr	r8, exc_addr
+        blbc	r8, Trap_Dtbmiss_Single	//if not in palmode, should be in the single routine, dummy!
+        mfpr	r8, pt4			// restore r8
+        nop
+        mtpr	r22, pt5		// Get some scratch space. E1.
+                                        // Due to virtual scheme, we can skip the first lookup and go
+                                        // right to fetch of level 2 PTE
+        sll     r8, (64-((2*page_seg_size_bits)+page_offset_size_bits)), r22  // Clean off upper bits of VA
+        mtpr	r21, pt4		// Get some scratch space. E1.
+
+        srl    	r22, 61-page_seg_size_bits, r22 // Get Va<seg1>*8
+        mfpr	r21, pt_ptbr		// Get physical address of the page table.
+
+        nop
+        addq    r21, r22, r21           // Index into page table for level 2 PTE.
+
+        sll    	r8, (64-((1*page_seg_size_bits)+page_offset_size_bits)), r22  // Clean off upper bits of VA
+        ldq_p   	r21, 0(r21)            	// Get level 2 PTE (addr<2:0> ignored)
+
+        srl    	r22, 61-page_seg_size_bits, r22	// Get Va<seg1>*8
+        blbc 	r21, double_pte_inv		// Check for Invalid PTE.
+
+        srl    	r21, 32, r21			// extract PFN from PTE
+        sll     r21, page_offset_size_bits, r21	// get PFN * 2^13 for add to <seg3>*8
+
+        addq    r21, r22, r21           // Index into page table for level 3 PTE.
+        nop
+
+        ldq_p   	r21, 0(r21)            	// Get level 3 PTE (addr<2:0> ignored)
+        blbc	r21, double_pte_inv	// Check for invalid PTE.
+
+        mtpr	r21, ev5__dtb_pte	// Write the PTE.  E0.
+        mfpr	r22, pt5		// Restore scratch register
+
+        mtpr	r8, ev5__dtb_tag	// Write the TAG. E0.  No virtual references in subsequent 3 cycles.
+        mfpr	r21, pt4		// Restore scratch register
+
+        nop				// Pad write to tag.
+        nop
+
+        nop				// Pad write to tag.
+        nop
+
+        hw_rei
+
+
+
+//
+// UNALIGN -- Dstream unalign trap
+//
+// UNALIGN - offset 0300
+// Entry:
+//	Vectored into via hardware trap on unaligned Dstream reference.
+//
+// Function:
+//	Build stack frame
+//	a0 <- Faulting VA
+//	a1 <- Opcode
+//	a2 <- src/dst register number
+//	vector via entUna
+//
+
+        HDW_VECTOR(PAL_UNALIGN_ENTRY)
+Trap_Unalign:
+/*	DEBUGSTORE(0x47)*/
+        sll	r11, 63-osfps_v_mode, r25 // Shift mode up to MS bit
+        mtpr	r31, ev5__ps		// Set Ibox current mode to kernel
+
+        mfpr	r8, ev5__mm_stat	// Get mmstat --ok to use r8, no tbmiss
+        mfpr	r14, exc_addr		// get pc
+
+        srl	r8, mm_stat_v_ra, r13	// Shift Ra field to ls bits
+        blbs	r14, pal_pal_bug_check  // Bugcheck if unaligned in PAL
+
+        blbs	r8, UNALIGN_NO_DISMISS // lsb only set on store or fetch_m
+                                        // not set, must be a load
+        and	r13, 0x1F, r8		// isolate ra
+
+        cmpeq   r8, 0x1F, r8		// check for r31/F31
+        bne     r8, dfault_fetch_ldr31_err // if it's a load to r31 or f31 -- dismiss the fault
+
+UNALIGN_NO_DISMISS:
+        bis	r11, r31, r12		// Save PS
+        bge	r25, UNALIGN_NO_DISMISS_10_		// no stack swap needed if cm=kern
+
+
+        mtpr	r31, ev5__dtb_cm	// Set Mbox current mode to kernel -
+                                        //     no virt ref for next 2 cycles
+        mtpr	r30, pt_usp		// save user stack
+
+        bis	r31, r31, r12		// Set new PS
+        mfpr	r30, pt_ksp
+
+UNALIGN_NO_DISMISS_10_:
+        mfpr	r25, ev5__va		// Unlock VA
+        lda	sp, 0-osfsf_c_size(sp)// allocate stack space
+
+        mtpr	r25, pt0		// Stash VA
+        stq	r18, osfsf_a2(sp) 	// a2
+
+        stq	r11, osfsf_ps(sp)	// save old ps
+        srl	r13, mm_stat_v_opcode-mm_stat_v_ra, r25// Isolate opcode
+
+        stq	r29, osfsf_gp(sp) 	// save gp
+        addq	r14, 4, r14		// inc PC past the ld/st
+
+        stq	r17, osfsf_a1(sp)	// a1
+        and	r25, mm_stat_m_opcode, r17// Clean opcode for a1
+
+        stq	r16, osfsf_a0(sp)	// save regs
+        mfpr	r16, pt0		// a0 <- va/unlock
+
+        stq	r14, osfsf_pc(sp)	// save pc
+        mfpr	r25, pt_entuna		// get entry point
+
+
+        bis	r12, r31, r11		// update ps
+        br 	r31, unalign_trap_cont
+
+
+//
+// DFAULT	- Dstream Fault Trap Entry Point
+//
+// DFAULT - offset 0380
+// Entry:
+//	Vectored into via hardware trap on dstream fault or sign check
+//      error on DVA.
+//
+// Function:
+//	Ignore faults on FETCH/FETCH_M
+//	Check for DFAULT in PAL
+//	Build stack frame
+//	a0 <- Faulting VA
+//	a1 <- MMCSR (1 for ACV, 2 for FOR, 4 for FOW)
+//	a2 <- R/W
+//	vector via entMM
+//
+//
+        HDW_VECTOR(PAL_D_FAULT_ENTRY)
+Trap_Dfault:
+//	DEBUGSTORE(0x48)
+        sll	r11, 63-osfps_v_mode, r25 // Shift mode up to MS bit
+        mtpr	r31, ev5__ps		// Set Ibox current mode to kernel
+
+        mfpr	r13, ev5__mm_stat	// Get mmstat
+        mfpr	r8, exc_addr		// get pc, preserve r14
+
+        srl	r13, mm_stat_v_opcode, r9 // Shift opcode field to ls bits
+        blbs	r8, dfault_in_pal
+
+        bis	r8, r31, r14		// move exc_addr to correct place
+        bis	r11, r31, r12		// Save PS
+
+        mtpr	r31, ev5__dtb_cm	// Set Mbox current mode to kernel -
+                                        //     no virt ref for next 2 cycles
+        and	r9, mm_stat_m_opcode, r9 // Clean all but opcode
+
+        cmpeq   r9, evx_opc_sync, r9 	// Is the opcode fetch/fetchm?
+        bne     r9, dfault_fetch_ldr31_err   // Yes, dismiss the fault
+
+        //dismiss exception if load to r31/f31
+        blbs	r13, dfault_no_dismiss	// mm_stat<0> set on store or fetchm
+
+                                        // not a store or fetch, must be a load
+        srl	r13, mm_stat_v_ra, r9	// Shift rnum to low bits
+
+        and	r9, 0x1F, r9		// isolate rnum
+        nop
+
+        cmpeq   r9, 0x1F, r9   	// Is the rnum r31 or f31?
+        bne     r9, dfault_fetch_ldr31_err    // Yes, dismiss the fault
+
+dfault_no_dismiss:
+        and	r13, 0xf, r13	// Clean extra bits in mm_stat
+        bge	r25, dfault_trap_cont	// no stack swap needed if cm=kern
+
+
+        mtpr	r30, pt_usp		// save user stack
+        bis	r31, r31, r12		// Set new PS
+
+        mfpr	r30, pt_ksp
+        br	r31, dfault_trap_cont
+
+
+//
+// MCHK	-  Machine Check Trap Entry Point
+//
+// MCHK - offset 0400
+// Entry:
+//	Vectored into via hardware trap on machine check.
+//
+// Function:
+//
+//
+
+        HDW_VECTOR(PAL_MCHK_ENTRY)
+Trap_Mchk:
+        DEBUGSTORE(0x49)
+        mtpr    r31, ic_flush_ctl       // Flush the Icache
+        br      r31, sys_machine_check
+
+
+//
+// OPCDEC	-  Illegal Opcode Trap Entry Point
+//
+// OPCDEC - offset 0480
+// Entry:
+//	Vectored into via hardware trap on illegal opcode.
+//
+//	Build stack frame
+//	a0 <- code
+//	a1 <- unpred
+//	a2 <- unpred
+//	vector via entIF
+//
+//
+
+        HDW_VECTOR(PAL_OPCDEC_ENTRY)
+Trap_Opcdec:
+        DEBUGSTORE(0x4a)
+//simos	DEBUG_EXC_ADDR()
+        sll	r11, 63-osfps_v_mode, r25 // Shift mode up to MS bit
+        mtpr	r31, ev5__ps		// Set Ibox current mode to kernel
+
+        mfpr	r14, exc_addr		// get pc
+        blbs	r14, pal_pal_bug_check	// check opcdec in palmode
+
+        bis	r11, r31, r12		// Save PS
+        bge	r25, TRAP_OPCDEC_10_		// no stack swap needed if cm=kern
+
+
+        mtpr	r31, ev5__dtb_cm	// Set Mbox current mode to kernel -
+                                        //     no virt ref for next 2 cycles
+        mtpr	r30, pt_usp		// save user stack
+
+        bis	r31, r31, r12		// Set new PS
+        mfpr	r30, pt_ksp
+
+TRAP_OPCDEC_10_:
+        lda	sp, 0-osfsf_c_size(sp)// allocate stack space
+        addq	r14, 4, r14		// inc pc
+
+        stq	r16, osfsf_a0(sp)	// save regs
+        bis	r31, osf_a0_opdec, r16	// set a0
+
+        stq	r11, osfsf_ps(sp)	// save old ps
+        mfpr	r13, pt_entif		// get entry point
+
+        stq	r18, osfsf_a2(sp) 	// a2
+        stq	r17, osfsf_a1(sp)	// a1
+
+        stq	r29, osfsf_gp(sp) 	// save gp
+        stq	r14, osfsf_pc(sp)	// save pc
+
+        bis	r12, r31, r11		// update ps
+        mtpr	r13, exc_addr		// load exc_addr with entIF
+                                        // 1 cycle to hw_rei, E1
+
+        mfpr	r29, pt_kgp		// get the kgp, E1
+
+        hw_rei_spe			// done, E1
+
+
+//
+// ARITH	-  Arithmetic Exception Trap Entry Point
+//
+// ARITH - offset 0500
+// Entry:
+//	Vectored into via hardware trap on arithmetic exception.
+//
+// Function:
+//	Build stack frame
+//	a0 <- exc_sum
+//	a1 <- exc_mask
+//	a2 <- unpred
+//	vector via entArith
+//
+//
+        HDW_VECTOR(PAL_ARITH_ENTRY)
+Trap_Arith:
+        DEBUGSTORE(0x4b)
+        and	r11, osfps_m_mode, r12 // get mode bit
+        mfpr	r31, ev5__va		// unlock mbox
+
+        bis	r11, r31, r25		// save ps
+        mfpr	r14, exc_addr		// get pc
+
+        nop
+        blbs	r14, pal_pal_bug_check	// arith trap from PAL
+
+        mtpr    r31, ev5__dtb_cm        // Set Mbox current mode to kernel -
+                                        //     no virt ref for next 2 cycles
+        beq	r12, TRAP_ARITH_10_		// if zero we are in kern now
+
+        bis	r31, r31, r25		// set the new ps
+        mtpr	r30, pt_usp		// save user stack
+
+        nop
+        mfpr	r30, pt_ksp		// get kern stack
+
+TRAP_ARITH_10_: 	lda	sp, 0-osfsf_c_size(sp)	// allocate stack space
+        mtpr	r31, ev5__ps		// Set Ibox current mode to kernel
+
+        nop				// Pad current mode write and stq
+        mfpr	r13, ev5__exc_sum	// get the exc_sum
+
+        mfpr	r12, pt_entarith
+        stq	r14, osfsf_pc(sp)	// save pc
+
+        stq	r17, osfsf_a1(sp)
+        mfpr    r17, ev5__exc_mask      // Get exception register mask IPR - no mtpr exc_sum in next cycle
+
+        stq	r11, osfsf_ps(sp)	// save ps
+        bis	r25, r31, r11		// set new ps
+
+        stq	r16, osfsf_a0(sp)	// save regs
+        srl	r13, exc_sum_v_swc, r16	// shift data to correct position
+
+        stq	r18, osfsf_a2(sp)
+//	pvc_violate 354			// ok, but make sure reads of exc_mask/sum are not in same trap shadow
+        mtpr	r31, ev5__exc_sum	// Unlock exc_sum and exc_mask
+
+        stq	r29, osfsf_gp(sp)
+        mtpr	r12, exc_addr		// Set new PC - 1 bubble to hw_rei - E1
+
+        mfpr	r29, pt_kgp		// get the kern gp - E1
+        hw_rei_spe			// done - E1
+
+
+//
+// FEN	-  Illegal Floating Point Operation Trap Entry Point
+//
+// FEN - offset 0580
+// Entry:
+//	Vectored into via hardware trap on illegal FP op.
+//
+// Function:
+//	Build stack frame
+//	a0 <- code
+//	a1 <- unpred
+//	a2 <- unpred
+//	vector via entIF
+//
+//
+
+        HDW_VECTOR(PAL_FEN_ENTRY)
+Trap_Fen:
+        sll	r11, 63-osfps_v_mode, r25 // Shift mode up to MS bit
+        mtpr	r31, ev5__ps		// Set Ibox current mode to kernel
+
+        mfpr	r14, exc_addr		// get pc
+        blbs	r14, pal_pal_bug_check	// check fen in palmode
+
+        mfpr	r13, ev5__icsr
+        nop
+
+        bis	r11, r31, r12		// Save PS
+        bge	r25, TRAP_FEN_10_		// no stack swap needed if cm=kern
+
+        mtpr	r31, ev5__dtb_cm	// Set Mbox current mode to kernel -
+                                        //     no virt ref for next 2 cycles
+        mtpr	r30, pt_usp		// save user stack
+
+        bis	r31, r31, r12		// Set new PS
+        mfpr	r30, pt_ksp
+
+TRAP_FEN_10_:
+        lda	sp, 0-osfsf_c_size(sp)// allocate stack space
+        srl     r13, icsr_v_fpe, r25   // Shift FP enable to bit 0
+
+
+        stq	r16, osfsf_a0(sp)	// save regs
+        mfpr	r13, pt_entif		// get entry point
+
+        stq	r18, osfsf_a2(sp) 	// a2
+        stq	r11, osfsf_ps(sp)	// save old ps
+
+        stq	r29, osfsf_gp(sp) 	// save gp
+        bis	r12, r31, r11		// set new ps
+
+        stq	r17, osfsf_a1(sp)	// a1
+        blbs	r25,fen_to_opcdec	// If FP is enabled, this is really OPCDEC.
+
+        bis	r31, osf_a0_fen, r16	// set a0
+        stq	r14, osfsf_pc(sp)	// save pc
+
+        mtpr	r13, exc_addr		// load exc_addr with entIF
+                                        // 1 cycle to hw_rei -E1
+
+        mfpr	r29, pt_kgp		// get the kgp -E1
+
+        hw_rei_spe			// done -E1
+
+//	FEN trap was taken, but the fault is really opcdec.
+        ALIGN_BRANCH
+fen_to_opcdec:
+        addq	r14, 4, r14		// save PC+4
+        bis	r31, osf_a0_opdec, r16	// set a0
+
+        stq	r14, osfsf_pc(sp)	// save pc
+        mtpr	r13, exc_addr		// load exc_addr with entIF
+                                        // 1 cycle to hw_rei
+
+        mfpr	r29, pt_kgp		// get the kgp
+        hw_rei_spe			// done
+
+
+
+//////////////////////////////////////////////////////////////////////////////
+// Misc handlers - Start area for misc code.
+//////////////////////////////////////////////////////////////////////////////
+
+//
+// dfault_trap_cont
+//	A dfault trap has been taken.  The sp has been updated if necessary.
+//	Push a stack frame and vector via entMM.
+//
+//	Current state:
+//		r12 - new PS
+//		r13 - MMstat
+//		VA - locked
+//
+//
+        ALIGN_BLOCK
+dfault_trap_cont:
+        lda	sp, 0-osfsf_c_size(sp)// allocate stack space
+        mfpr	r25, ev5__va		// Fetch VA/unlock
+
+        stq	r18, osfsf_a2(sp) 	// a2
+        and	r13, 1, r18		// Clean r/w bit for a2
+
+        stq	r16, osfsf_a0(sp)	// save regs
+        bis	r25, r31, r16		// a0 <- va
+
+        stq	r17, osfsf_a1(sp)	// a1
+        srl	r13, 1, r17		// shift fault bits to right position
+
+        stq	r11, osfsf_ps(sp)	// save old ps
+        bis	r12, r31, r11		// update ps
+
+        stq	r14, osfsf_pc(sp)	// save pc
+        mfpr	r25, pt_entmm		// get entry point
+
+        stq	r29, osfsf_gp(sp) 	// save gp
+        cmovlbs	r17, 1, r17		// a1. acv overrides fox.
+
+        mtpr	r25, exc_addr		// load exc_addr with entMM
+                                        // 1 cycle to hw_rei
+        mfpr	r29, pt_kgp		// get the kgp
+
+        hw_rei_spe			// done
+
+//
+//unalign_trap_cont
+//	An unalign trap has been taken.  Just need to finish up a few things.
+//
+//	Current state:
+//		r25 - entUna
+//		r13 - shifted MMstat
+//
+//
+        ALIGN_BLOCK
+unalign_trap_cont:
+        mtpr	r25, exc_addr		// load exc_addr with entUna
+                                        // 1 cycle to hw_rei
+
+
+        mfpr	r29, pt_kgp		// get the kgp
+        and	r13, mm_stat_m_ra, r18	// Clean Ra for a2
+
+        hw_rei_spe			// done
+
+
+
+//
+// dfault_in_pal
+//	Dfault trap was taken, exc_addr points to a PAL PC.
+//	r9 - mmstat<opcode> right justified
+//	r8 - exception address
+//
+//	These are the cases:
+//		opcode was STQ -- from a stack builder, KSP not valid halt
+//			r14 - original exc_addr
+//			r11 - original PS
+//		opcode was STL_C  -- rti or retsys clear lock_flag by stack write,
+//					KSP not valid halt
+//			r11 - original PS
+//			r14 - original exc_addr
+//		opcode was LDQ -- retsys or rti stack read, KSP not valid halt
+//			r11 - original PS
+//			r14 - original exc_addr
+//		opcode was HW_LD -- itbmiss or dtbmiss, bugcheck due to fault on page tables
+//			r10 - original exc_addr
+//			r11 - original PS
+//
+//
+//
+        ALIGN_BLOCK
+dfault_in_pal:
+        DEBUGSTORE(0x50)
+        bic     r8, 3, r8            // Clean PC
+        mfpr	r9, pal_base
+
+        mfpr	r31, va			// unlock VA
+
+        // if not real_mm, should never get here from miss flows
+
+        subq    r9, r8, r8            // pal_base - offset
+
+        lda     r9, pal_itb_ldq-pal_base(r8)
+        nop
+
+        beq 	r9, dfault_do_bugcheck
+        lda     r9, pal_dtb_ldq-pal_base(r8)
+
+        beq 	r9, dfault_do_bugcheck
+
+//
+// KSP invalid halt case --
+ksp_inval_halt:
+        DEBUGSTORE(76)
+        bic	r11, osfps_m_mode, r11	// set ps to kernel mode
+        mtpr    r0, pt0
+
+        mtpr	r31, dtb_cm		// Make sure that the CM IPRs are all kernel mode
+        mtpr	r31, ips
+
+        mtpr	r14, exc_addr		// Set PC to instruction that caused trouble
+        bsr     r0, pal_update_pcb      // update the pcb
+
+        lda     r0, hlt_c_ksp_inval(r31)  // set halt code to ksp invalid
+        br      r31, sys_enter_console  // enter the console
+
+        ALIGN_BRANCH
+dfault_do_bugcheck:
+        bis	r10, r31, r14		// bugcheck expects exc_addr in r14
+        br	r31, pal_pal_bug_check
+
+
+//
+// dfault_fetch_ldr31_err - ignore faults on fetch(m) and loads to r31/f31
+//	On entry -
+//		r14 - exc_addr
+//		VA is locked
+//
+//
+        ALIGN_BLOCK
+dfault_fetch_ldr31_err:
+        mtpr	r11, ev5__dtb_cm
+        mtpr	r11, ev5__ps		// Make sure ps hasn't changed
+
+        mfpr	r31, va			// unlock the mbox
+        addq	r14, 4, r14		// inc the pc to skip the fetch
+
+        mtpr	r14, exc_addr		// give ibox new PC
+        mfpr	r31, pt0		// pad exc_addr write
+
+        hw_rei
+
+
+
+        ALIGN_BLOCK
+//
+// sys_from_kern
+//	callsys from kernel mode - OS bugcheck machine check
+//
+//
+sys_from_kern:
+        mfpr	r14, exc_addr			// PC points to call_pal
+        subq	r14, 4, r14
+
+        lda	r25, mchk_c_os_bugcheck(r31)    // fetch mchk code
+        br      r31, pal_pal_mchk
+
+
+// Continuation of long call_pal flows
+//
+// wrent_tbl
+//	Table to write ent* in paltemps.
+//	4 instructions/entry
+//	r16 has new value
+//
+//
+        ALIGN_BLOCK
+wrent_tbl:
+//orig	pvc_jsr	wrent, dest=1
+        nop
+        mtpr	r16, pt_entint
+
+        mfpr	r31, pt0		// Pad for mt->mf paltemp rule
+        hw_rei
+
+
+//orig	pvc_jsr	wrent, dest=1
+        nop
+        mtpr	r16, pt_entarith
+
+        mfpr    r31, pt0                // Pad for mt->mf paltemp rule
+        hw_rei
+
+
+//orig	pvc_jsr	wrent, dest=1
+        nop
+        mtpr	r16, pt_entmm
+
+        mfpr    r31, pt0                // Pad for mt->mf paltemp rule
+        hw_rei
+
+
+//orig	pvc_jsr	wrent, dest=1
+        nop
+        mtpr	r16, pt_entif
+
+        mfpr    r31, pt0                // Pad for mt->mf paltemp rule
+        hw_rei
+
+
+//orig	pvc_jsr	wrent, dest=1
+        nop
+        mtpr	r16, pt_entuna
+
+        mfpr    r31, pt0                // Pad for mt->mf paltemp rule
+        hw_rei
+
+
+//orig	pvc_jsr	wrent, dest=1
+        nop
+        mtpr	r16, pt_entsys
+
+        mfpr    r31, pt0                // Pad for mt->mf paltemp rule
+        hw_rei
+
+        ALIGN_BLOCK
+//
+// tbi_tbl
+//	Table to do tbi instructions
+//	4 instructions per entry
+//
+tbi_tbl:
+        // -2 tbia
+//orig	pvc_jsr tbi, dest=1
+        mtpr	r31, ev5__dtb_ia	// Flush DTB
+        mtpr	r31, ev5__itb_ia	// Flush ITB
+
+        hw_rei_stall
+
+        nop				// Pad table
+
+        // -1 tbiap
+//orig	pvc_jsr tbi, dest=1
+        mtpr	r31, ev5__dtb_iap	// Flush DTB
+        mtpr	r31, ev5__itb_iap	// Flush ITB
+
+        hw_rei_stall
+
+        nop				// Pad table
+
+
+        // 0 unused
+//orig	pvc_jsr tbi, dest=1
+        hw_rei				// Pad table
+        nop
+        nop
+        nop
+
+
+        // 1 tbisi
+//orig	pvc_jsr tbi, dest=1
+
+        nop
+        nop
+        mtpr	r17, ev5__itb_is	// Flush ITB
+        hw_rei_stall
+
+        // 2 tbisd
+//orig	pvc_jsr tbi, dest=1
+        mtpr	r17, ev5__dtb_is	// Flush DTB.
+        nop
+
+        nop
+        hw_rei_stall
+
+
+        // 3 tbis
+//orig	pvc_jsr tbi, dest=1
+        mtpr	r17, ev5__dtb_is	// Flush DTB
+        br	r31, tbi_finish
+        ALIGN_BRANCH
+tbi_finish:
+        mtpr	r17, ev5__itb_is	// Flush ITB
+        hw_rei_stall
+
+
+
+        ALIGN_BLOCK
+//
+// bpt_bchk_common:
+//	Finish up the bpt/bchk instructions
+//
+bpt_bchk_common:
+        stq	r18, osfsf_a2(sp) 	// a2
+        mfpr	r13, pt_entif		// get entry point
+
+        stq	r12, osfsf_ps(sp)	// save old ps
+        stq	r14, osfsf_pc(sp)	// save pc
+
+        stq	r29, osfsf_gp(sp) 	// save gp
+        mtpr	r13, exc_addr		// load exc_addr with entIF
+                                        // 1 cycle to hw_rei
+
+        mfpr	r29, pt_kgp		// get the kgp
+
+
+        hw_rei_spe			// done
+
+
+        ALIGN_BLOCK
+//
+// rti_to_user
+//	Finish up the rti instruction
+//
+rti_to_user:
+        mtpr	r11, ev5__dtb_cm	// set Mbox current mode - no virt ref for 2 cycles
+        mtpr	r11, ev5__ps		// set Ibox current mode - 2 bubble to hw_rei
+
+        mtpr	r31, ev5__ipl		// set the ipl. No hw_rei for 2 cycles
+        mtpr	r25, pt_ksp		// save off in case RTI to user
+
+        mfpr	r30, pt_usp
+        hw_rei_spe			// and back
+
+
+        ALIGN_BLOCK
+//
+// rti_to_kern
+//	Finish up the rti instruction
+//
+rti_to_kern:
+        and	r12, osfps_m_ipl, r11	// clean ps
+        mfpr	r12, pt_intmask		// get int mask
+
+        extbl	r12, r11, r12		// get mask for this ipl
+        mtpr	r25, pt_ksp		// save off in case RTI to user
+
+        mtpr	r12, ev5__ipl		// set the new ipl.
+        or	r25, r31, sp		// sp
+
+//	pvc_violate 217			// possible hidden mt->mf ipl not a problem in callpals
+        hw_rei
+
+        ALIGN_BLOCK
+//
+// swpctx_cont
+//	Finish up the swpctx instruction
+//
+
+swpctx_cont:
+
+        bic	r25, r24, r25		// clean icsr<FPE,PMP>
+        sll	r12, icsr_v_fpe, r12	// shift new fen to pos
+
+        ldq_p	r14, osfpcb_q_mmptr(r16)// get new mmptr
+        srl	r22, osfpcb_v_pme, r22	// get pme down to bit 0
+
+        or	r25, r12, r25		// icsr with new fen
+        srl	r23, 32, r24		// move asn to low asn pos
+
+        and	r22, 1, r22
+        sll	r24, itb_asn_v_asn, r12
+
+        sll	r22, icsr_v_pmp, r22
+        nop
+
+        or	r25, r22, r25		// icsr with new pme
+
+        sll	r24, dtb_asn_v_asn, r24
+
+        subl	r23, r13, r13		// gen new cc offset
+        mtpr	r12, itb_asn		// no hw_rei_stall in 0,1,2,3,4
+
+        mtpr	r24, dtb_asn		// Load up new ASN
+        mtpr	r25, icsr		// write the icsr
+
+        sll	r14, page_offset_size_bits, r14 // Move PTBR into internal position.
+        ldq_p	r25, osfpcb_q_usp(r16)	// get new usp
+
+        insll	r13, 4, r13		// >> 32
+//	pvc_violate 379			// ldq_p can't trap except replay.  only problem if mf same ipr in same shadow
+        mtpr	r14, pt_ptbr		// load the new ptbr
+
+        mtpr	r13, cc			// set new offset
+        ldq_p	r30, osfpcb_q_ksp(r16)	// get new ksp
+
+//	pvc_violate 379			// ldq_p can't trap except replay.  only problem if mf same ipr in same shadow
+        mtpr	r25, pt_usp		// save usp
+
+no_pm_change_10_:	hw_rei_stall			// back we go
+
+        ALIGN_BLOCK
+//
+// swppal_cont - finish up the swppal call_pal
+//
+
+swppal_cont:
+        mfpr	r2, pt_misc		// get misc bits
+        sll	r0, pt_misc_v_switch, r0 // get the "I've switched" bit
+        or	r2, r0, r2		// set the bit
+        mtpr	r31, ev5__alt_mode	// ensure alt_mode set to 0 (kernel)
+        mtpr	r2, pt_misc		// update the chip
+
+        or	r3, r31, r4
+        mfpr	r3, pt_impure		// pass pointer to the impure area in r3
+//orig	fix_impure_ipr	r3		// adjust impure pointer for ipr read
+//orig	restore_reg1	bc_ctl, r1, r3, ipr=1		// pass cns_bc_ctl in r1
+//orig	restore_reg1	bc_config, r2, r3, ipr=1	// pass cns_bc_config in r2
+//orig	unfix_impure_ipr r3		// restore impure pointer
+        lda	r3, CNS_Q_IPR(r3)
+        RESTORE_SHADOW(r1,CNS_Q_BC_CTL,r3);
+        RESTORE_SHADOW(r1,CNS_Q_BC_CFG,r3);
+        lda	r3, -CNS_Q_IPR(r3)
+
+        or	r31, r31, r0		// set status to success
+//	pvc_violate	1007
+        jmp	r31, (r4)		// and call our friend, it's her problem now
+
+
+swppal_fail:
+        addq	r0, 1, r0		// set unknown pal or not loaded
+        hw_rei				// and return
+
+
+// .sbttl	"Memory management"
+
+        ALIGN_BLOCK
+//
+//foe_ipte_handler
+// IFOE detected on level 3 pte, sort out FOE vs ACV
+//
+// on entry:
+//	with
+//	R8	 = pte
+//	R10	 = pc
+//
+// Function
+//	Determine TNV vs ACV vs FOE. Build stack and dispatch
+//	Will not be here if TNV.
+//
+
+foe_ipte_handler:
+        sll	r11, 63-osfps_v_mode, r25 // Shift mode up to MS bit
+        mtpr	r31, ev5__ps		// Set Ibox current mode to kernel
+
+        bis	r11, r31, r12		// Save PS for stack write
+        bge	r25, foe_ipte_handler_10_		// no stack swap needed if cm=kern
+
+
+        mtpr	r31, ev5__dtb_cm	// Set Mbox current mode to kernel -
+                                        //     no virt ref for next 2 cycles
+        mtpr	r30, pt_usp		// save user stack
+
+        bis	r31, r31, r11		// Set new PS
+        mfpr	r30, pt_ksp
+
+        srl	r8, osfpte_v_ure-osfpte_v_kre, r8 // move pte user bits to kern
+        nop
+
+foe_ipte_handler_10_:	srl	r8, osfpte_v_kre, r25	// get kre to <0>
+        lda	sp, 0-osfsf_c_size(sp)// allocate stack space
+
+        or	r10, r31, r14		// Save pc/va in case TBmiss or fault on stack
+        mfpr	r13, pt_entmm		// get entry point
+
+        stq	r16, osfsf_a0(sp)	// a0
+        or	r14, r31, r16		// pass pc/va as a0
+
+        stq	r17, osfsf_a1(sp)	// a1
+        nop
+
+        stq	r18, osfsf_a2(sp) 	// a2
+        lda	r17, mmcsr_c_acv(r31)	// assume ACV
+
+        stq	r16, osfsf_pc(sp)	// save pc
+        cmovlbs r25, mmcsr_c_foe, r17	// otherwise FOE
+
+        stq	r12, osfsf_ps(sp)	// save ps
+        subq	r31, 1, r18		// pass flag of istream as a2
+
+        stq	r29, osfsf_gp(sp)
+        mtpr	r13, exc_addr		// set vector address
+
+        mfpr	r29, pt_kgp		// load kgp
+        hw_rei_spe			// out to exec
+
+        ALIGN_BLOCK
+//
+//invalid_ipte_handler
+// TNV detected on level 3 pte, sort out TNV vs ACV
+//
+// on entry:
+//	with
+//	R8	 = pte
+//	R10	 = pc
+//
+// Function
+//	Determine TNV vs ACV. Build stack and dispatch.
+//
+
+invalid_ipte_handler:
+        sll	r11, 63-osfps_v_mode, r25 // Shift mode up to MS bit
+        mtpr	r31, ev5__ps		// Set Ibox current mode to kernel
+
+        bis	r11, r31, r12		// Save PS for stack write
+        bge	r25, invalid_ipte_handler_10_		// no stack swap needed if cm=kern
+
+
+        mtpr	r31, ev5__dtb_cm	// Set Mbox current mode to kernel -
+                                        //     no virt ref for next 2 cycles
+        mtpr	r30, pt_usp		// save user stack
+
+        bis	r31, r31, r11		// Set new PS
+        mfpr	r30, pt_ksp
+
+        srl	r8, osfpte_v_ure-osfpte_v_kre, r8 // move pte user bits to kern
+        nop
+
+invalid_ipte_handler_10_:	srl	r8, osfpte_v_kre, r25	// get kre to <0>
+        lda	sp, 0-osfsf_c_size(sp)// allocate stack space
+
+        or	r10, r31, r14		// Save pc/va in case TBmiss on stack
+        mfpr	r13, pt_entmm		// get entry point
+
+        stq	r16, osfsf_a0(sp)	// a0
+        or	r14, r31, r16		// pass pc/va as a0
+
+        stq	r17, osfsf_a1(sp)	// a1
+        nop
+
+        stq	r18, osfsf_a2(sp) 	// a2
+        and	r25, 1, r17		// Isolate kre
+
+        stq	r16, osfsf_pc(sp)	// save pc
+        xor	r17, 1, r17		// map to acv/tnv as a1
+
+        stq	r12, osfsf_ps(sp)	// save ps
+        subq	r31, 1, r18		// pass flag of istream as a2
+
+        stq	r29, osfsf_gp(sp)
+        mtpr	r13, exc_addr		// set vector address
+
+        mfpr	r29, pt_kgp		// load kgp
+        hw_rei_spe			// out to exec
+
+
+
+
+        ALIGN_BLOCK
+//
+//invalid_dpte_handler
+// INVALID detected on level 3 pte, sort out TNV vs ACV
+//
+// on entry:
+//	with
+//	R10	 = va
+//	R8	 = pte
+//	R9	 = mm_stat
+//	PT6	 = pc
+//
+// Function
+//	Determine TNV vs ACV. Build stack and dispatch
+//
+
+
+invalid_dpte_handler:
+        mfpr	r12, pt6
+        blbs	r12, tnv_in_pal		// Special handler if original faulting reference was in PALmode
+
+        bis	r12, r31, r14		// save PC in case of tbmiss or fault
+        srl	r9, mm_stat_v_opcode, r25	// shift opc to <0>
+
+        mtpr	r11, pt0		// Save PS for stack write
+        and 	r25, mm_stat_m_opcode, r25	// isolate opcode
+
+        cmpeq	r25, evx_opc_sync, r25	// is it FETCH/FETCH_M?
+        blbs	r25, nmiss_fetch_ldr31_err	// yes
+
+        //dismiss exception if load to r31/f31
+        blbs	r9, invalid_dpte_no_dismiss	// mm_stat<0> set on store or fetchm
+
+                                        // not a store or fetch, must be a load
+        srl	r9, mm_stat_v_ra, r25	// Shift rnum to low bits
+
+        and	r25, 0x1F, r25		// isolate rnum
+        nop
+
+        cmpeq   r25, 0x1F, r25  	// Is the rnum r31 or f31?
+        bne     r25, nmiss_fetch_ldr31_err    // Yes, dismiss the fault
+
+invalid_dpte_no_dismiss:
+        sll	r11, 63-osfps_v_mode, r25 // Shift mode up to MS bit
+        mtpr	r31, ev5__ps		// Set Ibox current mode to kernel
+
+        mtpr	r31, ev5__dtb_cm	// Set Mbox current mode to kernel -
+                                        //     no virt ref for next 2 cycles
+        bge	r25, invalid_dpte_no_dismiss_10_		// no stack swap needed if cm=kern
+
+        srl	r8, osfpte_v_ure-osfpte_v_kre, r8 // move pte user bits to kern
+        mtpr	r30, pt_usp		// save user stack
+
+        bis	r31, r31, r11		// Set new PS
+        mfpr	r30, pt_ksp
+
+invalid_dpte_no_dismiss_10_:	srl	r8, osfpte_v_kre, r12	// get kre to <0>
+        lda	sp, 0-osfsf_c_size(sp)// allocate stack space
+
+        or	r10, r31, r25		// Save va in case TBmiss on stack
+        and	r9, 1, r13		// save r/w flag
+
+        stq	r16, osfsf_a0(sp)	// a0
+        or	r25, r31, r16		// pass va as a0
+
+        stq	r17, osfsf_a1(sp)	// a1
+        or	r31, mmcsr_c_acv, r17 	// assume acv
+
+        srl	r12, osfpte_v_kwe-osfpte_v_kre, r25 // get write enable to <0>
+        stq	r29, osfsf_gp(sp)
+
+        stq	r18, osfsf_a2(sp) 	// a2
+        cmovlbs r13, r25, r12		// if write access move acv based on write enable
+
+        or	r13, r31, r18		// pass flag of dstream access and read vs write
+        mfpr	r25, pt0		// get ps
+
+        stq	r14, osfsf_pc(sp)	// save pc
+        mfpr	r13, pt_entmm		// get entry point
+
+        stq	r25, osfsf_ps(sp)	// save ps
+        mtpr	r13, exc_addr		// set vector address
+
+        mfpr	r29, pt_kgp		// load kgp
+        cmovlbs	r12, mmcsr_c_tnv, r17 	// make p2 be tnv if access ok else acv
+
+        hw_rei_spe			// out to exec
+
+//
+//
+// We come here if we are erring on a dtb_miss, and the instr is a
+// fetch, fetch_m, or load to r31/f31.
+// The PC is incremented, and we return to the program,
+// essentially ignoring the instruction and error.
+//
+//
+        ALIGN_BLOCK
+nmiss_fetch_ldr31_err:
+        mfpr	r12, pt6
+        addq	r12, 4, r12		// bump pc to pc+4
+
+        mtpr	r12, exc_addr		// and set entry point
+        mfpr	r31, pt0		// pad exc_addr write
+
+        hw_rei				//
+
+        ALIGN_BLOCK
+//
+// double_pte_inv
+//	We had a single tbmiss which turned into a double tbmiss which found
+//	an invalid PTE.  Return to single miss with a fake pte, and the invalid
+//	single miss flow will report the error.
+//
+// on entry:
+//	r21  	PTE
+//	r22	available
+//	VA IPR	locked with original fault VA
+//       pt4  	saved r21
+//	pt5  	saved r22
+//	pt6	original exc_addr
+//
+// on return to tbmiss flow:
+//	r8	fake PTE
+//
+//
+//
+double_pte_inv:
+        srl	r21, osfpte_v_kre, r21	// get the kre bit to <0>
+        mfpr	r22, exc_addr		// get the pc
+
+        lda	r22, 4(r22)		// inc the pc
+        lda	r8, osfpte_m_prot(r31)	 // make a fake pte with xre and xwe set
+
+        cmovlbc r21, r31, r8		// set to all 0 for acv if pte<kre> is 0
+        mtpr	r22, exc_addr		// set for rei
+
+        mfpr	r21, pt4		// restore regs
+        mfpr	r22, pt5		// restore regs
+
+        hw_rei				// back to tb miss
+
+        ALIGN_BLOCK
+//
+//tnv_in_pal
+//	The only places in pal that load or store are the
+// 	stack builders, rti or retsys.  Any of these means we
+//	need to take a ksp-not-valid halt.
+//
+//
+tnv_in_pal:
+
+
+        br	r31, ksp_inval_halt
+
+
+// .sbttl	"Icache flush routines"
+
+        ALIGN_BLOCK
+//
+// Common Icache flush routine.
+//
+//
+//
+pal_ic_flush:
+        nop
+        mtpr	r31, ev5__ic_flush_ctl		// Icache flush - E1
+        nop
+        nop
+
+// Now, do 44 NOPs.  3RFB prefetches (24) + IC buffer,IB,slot,issue (20)
+        nop
+        nop
+        nop
+        nop
+
+        nop
+        nop
+        nop
+        nop
+
+        nop
+        nop		// 10
+
+        nop
+        nop
+        nop
+        nop
+
+        nop
+        nop
+        nop
+        nop
+
+        nop
+        nop		// 20
+
+        nop
+        nop
+        nop
+        nop
+
+        nop
+        nop
+        nop
+        nop
+
+        nop
+        nop		// 30
+        nop
+        nop
+        nop
+        nop
+
+        nop
+        nop
+        nop
+        nop
+
+        nop
+        nop		// 40
+
+        nop
+        nop
+
+one_cycle_and_hw_rei:
+        nop
+        nop
+
+        hw_rei_stall
+
+        ALIGN_BLOCK
+//
+//osfpal_calpal_opcdec
+//  Here for all opcdec CALL_PALs
+//
+//	Build stack frame
+//	a0 <- code
+//	a1 <- unpred
+//	a2 <- unpred
+//	vector via entIF
+//
+//
+
+osfpal_calpal_opcdec:
+        sll	r11, 63-osfps_v_mode, r25 // Shift mode up to MS bit
+        mtpr	r31, ev5__ps		// Set Ibox current mode to kernel
+
+        mfpr	r14, exc_addr		// get pc
+        nop
+
+        bis	r11, r31, r12		// Save PS for stack write
+        bge	r25, osfpal_calpal_opcdec_10_		// no stack swap needed if cm=kern
+
+
+        mtpr	r31, ev5__dtb_cm	// Set Mbox current mode to kernel -
+                                        //     no virt ref for next 2 cycles
+        mtpr	r30, pt_usp		// save user stack
+
+        bis	r31, r31, r11		// Set new PS
+        mfpr	r30, pt_ksp
+
+osfpal_calpal_opcdec_10_:
+        lda	sp, 0-osfsf_c_size(sp)// allocate stack space
+        nop
+
+        stq	r16, osfsf_a0(sp)	// save regs
+        bis	r31, osf_a0_opdec, r16	// set a0
+
+        stq	r18, osfsf_a2(sp) 	// a2
+        mfpr	r13, pt_entif		// get entry point
+
+        stq	r12, osfsf_ps(sp)	// save old ps
+        stq	r17, osfsf_a1(sp)	// a1
+
+        stq	r14, osfsf_pc(sp)	// save pc
+        nop
+
+        stq	r29, osfsf_gp(sp) 	// save gp
+        mtpr	r13, exc_addr		// load exc_addr with entIF
+                                        // 1 cycle to hw_rei
+
+        mfpr	r29, pt_kgp		// get the kgp
+
+
+        hw_rei_spe			// done
+
+
+
+
+
+//
+//pal_update_pcb
+//	Update the PCB with the current SP, AST, and CC info
+//
+//	r0 - return linkage
+//
+        ALIGN_BLOCK
+
+pal_update_pcb:
+        mfpr	r12, pt_pcbb		// get pcbb
+        and	r11, osfps_m_mode, r25	// get mode
+        beq	r25, pal_update_pcb_10_		// in kern? no need to update user sp
+        mtpr	r30, pt_usp		// save user stack
+        stq_p	r30, osfpcb_q_usp(r12)	// store usp
+        br	r31, pal_update_pcb_20_		// join common
+pal_update_pcb_10_:	stq_p	r30, osfpcb_q_ksp(r12)	// store ksp
+pal_update_pcb_20_:	rpcc	r13			// get cyccounter
+        srl	r13, 32, r14		// move offset
+        addl	r13, r14, r14		// merge for new time
+        stl_p	r14, osfpcb_l_cc(r12)	// save time
+
+//orig	pvc_jsr	updpcb, bsr=1, dest=1
+        ret	r31, (r0)
+
+
+//
+// pal_save_state
+//
+//	Function
+//		All chip state saved, all PT's, SR's, FR's, IPR's
+//
+//
+// Regs on entry...
+//
+//	R0 	= halt code
+//	pt0	= r0
+//	R1	= pointer to impure
+//	pt4	= r1
+//	R3	= return addr
+//	pt5	= r3
+//
+//	register usage:
+//		r0 = halt_code
+//		r1 = addr of impure area
+//		r3 = return_address
+//		r4 = scratch
+//
+//
+
+        ALIGN_BLOCK
+        .globl pal_save_state
+pal_save_state:
+//
+//
+// start of implementation independent save routine
+//
+// 		the impure area is larger than the addressability of hw_ld and hw_st
+//		therefore, we need to play some games:  The impure area
+//		is informally divided into the "machine independent" part and the
+//		"machine dependent" part.  The state that will be saved in the
+//    		"machine independent" part are gpr's, fpr's, hlt, flag, mchkflag (use  (un)fix_impure_gpr macros).
+//		All others will be in the "machine dependent" part (use (un)fix_impure_ipr macros).
+//		The impure pointer will need to be adjusted by a different offset for each.  The store/restore_reg
+//		macros will automagically adjust the offset correctly.
+//
+
+// The distributed code is commented out and followed by corresponding SRC code.
+// Beware: SAVE_IPR and RESTORE_IPR blow away r0(v0)
+
+//orig	fix_impure_gpr	r1		// adjust impure area pointer for stores to "gpr" part of impure area
+        lda	r1, 0x200(r1)		// Point to center of CPU segment
+//orig	store_reg1 flag, r31, r1, ipr=1	// clear dump area flag
+        SAVE_GPR(r31,CNS_Q_FLAG,r1)	// Clear the valid flag
+//orig	store_reg1 hlt, r0, r1, ipr=1
+        SAVE_GPR(r0,CNS_Q_HALT,r1)	// Save the halt code
+
+        mfpr	r0, pt0			// get r0 back			//orig
+//orig	store_reg1 0, r0, r1		// save r0
+        SAVE_GPR(r0,CNS_Q_GPR+0x00,r1)	// Save r0
+
+        mfpr	r0, pt4			// get r1 back			//orig
+//orig	store_reg1 1, r0, r1		// save r1
+        SAVE_GPR(r0,CNS_Q_GPR+0x08,r1)	// Save r1
+
+//orig	store_reg 2			// save r2
+        SAVE_GPR(r2,CNS_Q_GPR+0x10,r1)	// Save r2
+
+        mfpr	r0, pt5			// get r3 back			//orig
+//orig	store_reg1 3, r0, r1		// save r3
+        SAVE_GPR(r0,CNS_Q_GPR+0x18,r1)	// Save r3
+
+        // reason code has been saved
+        // r0 has been saved
+        // r1 has been saved
+        // r2 has been saved
+        // r3 has been saved
+        // pt0, pt4, pt5 have been lost
+
+        //
+        // Get out of shadow mode
+        //
+
+        mfpr	r2, icsr		// Get icsr
+        ldah	r0, (1<<(icsr_v_sde-16))(r31)
+        bic	r2, r0, r0		// ICSR with SDE clear
+        mtpr	r0, icsr		// Turn off SDE
+
+        mfpr	r31, pt0		// SDE bubble cycle 1
+        mfpr	r31, pt0		// SDE bubble cycle 2
+        mfpr	r31, pt0		// SDE bubble cycle 3
+        nop
+
+
+        // save integer regs r4-r31
+        SAVE_GPR(r4,CNS_Q_GPR+0x20,r1)
+        SAVE_GPR(r5,CNS_Q_GPR+0x28,r1)
+        SAVE_GPR(r6,CNS_Q_GPR+0x30,r1)
+        SAVE_GPR(r7,CNS_Q_GPR+0x38,r1)
+        SAVE_GPR(r8,CNS_Q_GPR+0x40,r1)
+        SAVE_GPR(r9,CNS_Q_GPR+0x48,r1)
+        SAVE_GPR(r10,CNS_Q_GPR+0x50,r1)
+        SAVE_GPR(r11,CNS_Q_GPR+0x58,r1)
+        SAVE_GPR(r12,CNS_Q_GPR+0x60,r1)
+        SAVE_GPR(r13,CNS_Q_GPR+0x68,r1)
+        SAVE_GPR(r14,CNS_Q_GPR+0x70,r1)
+        SAVE_GPR(r15,CNS_Q_GPR+0x78,r1)
+        SAVE_GPR(r16,CNS_Q_GPR+0x80,r1)
+        SAVE_GPR(r17,CNS_Q_GPR+0x88,r1)
+        SAVE_GPR(r18,CNS_Q_GPR+0x90,r1)
+        SAVE_GPR(r19,CNS_Q_GPR+0x98,r1)
+        SAVE_GPR(r20,CNS_Q_GPR+0xA0,r1)
+        SAVE_GPR(r21,CNS_Q_GPR+0xA8,r1)
+        SAVE_GPR(r22,CNS_Q_GPR+0xB0,r1)
+        SAVE_GPR(r23,CNS_Q_GPR+0xB8,r1)
+        SAVE_GPR(r24,CNS_Q_GPR+0xC0,r1)
+        SAVE_GPR(r25,CNS_Q_GPR+0xC8,r1)
+        SAVE_GPR(r26,CNS_Q_GPR+0xD0,r1)
+        SAVE_GPR(r27,CNS_Q_GPR+0xD8,r1)
+        SAVE_GPR(r28,CNS_Q_GPR+0xE0,r1)
+        SAVE_GPR(r29,CNS_Q_GPR+0xE8,r1)
+        SAVE_GPR(r30,CNS_Q_GPR+0xF0,r1)
+        SAVE_GPR(r31,CNS_Q_GPR+0xF8,r1)
+
+        // save all paltemp regs (pt0 included, unlike the original osf code)
+
+//orig	unfix_impure_gpr	r1		// adjust impure area pointer for gpr stores
+//orig	fix_impure_ipr	r1			// adjust impure area pointer for pt stores
+
+        lda	r1, -0x200(r1)		// Restore the impure base address.
+        lda	r1, CNS_Q_IPR(r1)	// Point to the base of IPR area.
+        SAVE_IPR(pt0,CNS_Q_PT+0x00,r1)		// the osf code didn't save/restore palTemp 0 ?? pboyle
+        SAVE_IPR(pt1,CNS_Q_PT+0x08,r1)
+        SAVE_IPR(pt2,CNS_Q_PT+0x10,r1)
+        SAVE_IPR(pt3,CNS_Q_PT+0x18,r1)
+        SAVE_IPR(pt4,CNS_Q_PT+0x20,r1)
+        SAVE_IPR(pt5,CNS_Q_PT+0x28,r1)
+        SAVE_IPR(pt6,CNS_Q_PT+0x30,r1)
+        SAVE_IPR(pt7,CNS_Q_PT+0x38,r1)
+        SAVE_IPR(pt8,CNS_Q_PT+0x40,r1)
+        SAVE_IPR(pt9,CNS_Q_PT+0x48,r1)
+        SAVE_IPR(pt10,CNS_Q_PT+0x50,r1)
+        SAVE_IPR(pt11,CNS_Q_PT+0x58,r1)
+        SAVE_IPR(pt12,CNS_Q_PT+0x60,r1)
+        SAVE_IPR(pt13,CNS_Q_PT+0x68,r1)
+        SAVE_IPR(pt14,CNS_Q_PT+0x70,r1)
+        SAVE_IPR(pt15,CNS_Q_PT+0x78,r1)
+        SAVE_IPR(pt16,CNS_Q_PT+0x80,r1)
+        SAVE_IPR(pt17,CNS_Q_PT+0x88,r1)
+        SAVE_IPR(pt18,CNS_Q_PT+0x90,r1)
+        SAVE_IPR(pt19,CNS_Q_PT+0x98,r1)
+        SAVE_IPR(pt20,CNS_Q_PT+0xA0,r1)
+        SAVE_IPR(pt21,CNS_Q_PT+0xA8,r1)
+        SAVE_IPR(pt22,CNS_Q_PT+0xB0,r1)
+        SAVE_IPR(pt23,CNS_Q_PT+0xB8,r1)
+
+        // Restore shadow mode
+        mfpr	r31, pt0		// pad write to icsr out of shadow of store (trap does not abort write)
+        mfpr	r31, pt0
+        mtpr	r2, icsr		// Restore original ICSR
+
+        mfpr	r31, pt0		// SDE bubble cycle 1
+        mfpr	r31, pt0		// SDE bubble cycle 2
+        mfpr	r31, pt0		// SDE bubble cycle 3
+        nop
+
+        // save all integer shadow regs
+        SAVE_SHADOW( r8,CNS_Q_SHADOW+0x00,r1)	// also called p0...p7 in the Hudson code
+        SAVE_SHADOW( r9,CNS_Q_SHADOW+0x08,r1)
+        SAVE_SHADOW(r10,CNS_Q_SHADOW+0x10,r1)
+        SAVE_SHADOW(r11,CNS_Q_SHADOW+0x18,r1)
+        SAVE_SHADOW(r12,CNS_Q_SHADOW+0x20,r1)
+        SAVE_SHADOW(r13,CNS_Q_SHADOW+0x28,r1)
+        SAVE_SHADOW(r14,CNS_Q_SHADOW+0x30,r1)
+        SAVE_SHADOW(r25,CNS_Q_SHADOW+0x38,r1)
+
+        SAVE_IPR(excAddr,CNS_Q_EXC_ADDR,r1)
+        SAVE_IPR(palBase,CNS_Q_PAL_BASE,r1)
+        SAVE_IPR(mmStat,CNS_Q_MM_STAT,r1)
+        SAVE_IPR(va,CNS_Q_VA,r1)
+        SAVE_IPR(icsr,CNS_Q_ICSR,r1)
+        SAVE_IPR(ipl,CNS_Q_IPL,r1)
+        SAVE_IPR(ips,CNS_Q_IPS,r1)
+        SAVE_IPR(itbAsn,CNS_Q_ITB_ASN,r1)
+        SAVE_IPR(aster,CNS_Q_ASTER,r1)
+        SAVE_IPR(astrr,CNS_Q_ASTRR,r1)
+        SAVE_IPR(sirr,CNS_Q_SIRR,r1)
+        SAVE_IPR(isr,CNS_Q_ISR,r1)
+        SAVE_IPR(iVptBr,CNS_Q_IVPTBR,r1)
+        SAVE_IPR(mcsr,CNS_Q_MCSR,r1)
+        SAVE_IPR(dcMode,CNS_Q_DC_MODE,r1)
+
+//orig	pvc_violate 379			// mf maf_mode after a store ok (pvc doesn't distinguish ld from st)
+//orig	store_reg maf_mode,	ipr=1	// save ipr -- no mbox instructions for
+//orig                                  // PVC violation applies only to
+pvc$osf35$379:				    // loads. HW_ST ok here, so ignore
+        SAVE_IPR(mafMode,CNS_Q_MAF_MODE,r1) // MBOX INST->MF MAF_MODE IN 0,1,2
+
+
+        //the following iprs are informational only -- will not be restored
+
+        SAVE_IPR(icPerr,CNS_Q_ICPERR_STAT,r1)
+        SAVE_IPR(PmCtr,CNS_Q_PM_CTR,r1)
+        SAVE_IPR(intId,CNS_Q_INT_ID,r1)
+        SAVE_IPR(excSum,CNS_Q_EXC_SUM,r1)
+        SAVE_IPR(excMask,CNS_Q_EXC_MASK,r1)
+        ldah	r14, 0xFFF0(zero)
+        zap	r14, 0xE0, r14		// Get base address of CBOX IPRs
+        NOP				// Pad mfpr dcPerr out of shadow of
+        NOP				// last store
+        NOP
+        SAVE_IPR(dcPerr,CNS_Q_DCPERR_STAT,r1)
+
+        // read cbox ipr state
+
+        mb
+        ldq_p	r2, scCtl(r14)
+        ldq_p	r13, ldLock(r14)
+        ldq_p	r4, scAddr(r14)
+        ldq_p	r5, eiAddr(r14)
+        ldq_p	r6, bcTagAddr(r14)
+        ldq_p	r7, fillSyn(r14)
+        bis	r5, r4, zero		// Make sure all loads complete before
+        bis	r7, r6, zero		// reading registers that unlock them.
+        ldq_p	r8, scStat(r14)		// Unlocks scAddr.
+        ldq_p	r9, eiStat(r14)		// Unlocks eiAddr, bcTagAddr, fillSyn.
+        ldq_p	zero, eiStat(r14)	// Make sure it is really unlocked.
+        mb
+
+        // save cbox ipr state
+        SAVE_SHADOW(r2,CNS_Q_SC_CTL,r1);
+        SAVE_SHADOW(r13,CNS_Q_LD_LOCK,r1);
+        SAVE_SHADOW(r4,CNS_Q_SC_ADDR,r1);
+        SAVE_SHADOW(r5,CNS_Q_EI_ADDR,r1);
+        SAVE_SHADOW(r6,CNS_Q_BC_TAG_ADDR,r1);
+        SAVE_SHADOW(r7,CNS_Q_FILL_SYN,r1);
+        SAVE_SHADOW(r8,CNS_Q_SC_STAT,r1);
+        SAVE_SHADOW(r9,CNS_Q_EI_STAT,r1);
+        //bc_config? sl_rcv?
+
+// restore impure base
+//orig	unfix_impure_ipr r1
+        lda	r1, -CNS_Q_IPR(r1)
+
+// save all floating regs
+        mfpr	r0, icsr		// get icsr
+        or	r31, 1, r2		// get a one
+        sll	r2, icsr_v_fpe, r2	// Shift it into ICSR<FPE> position
+        or	r2, r0, r0		// set FEN on
+        mtpr	r0, icsr		// write to icsr, enabling FEN
+
+// map the save area virtually
+        mtpr	r31, dtbIa		// Clear all DTB entries
+        srl	r1, va_s_off, r0	// Clean off byte-within-page offset
+        sll	r0, pte_v_pfn, r0	// Shift to form PFN
+        lda	r0, pte_m_prot(r0)	// Set all read/write enable bits
+        mtpr	r0, dtbPte		// Load the PTE and set valid
+        mtpr	r1, dtbTag		// Write the PTE and tag into the DTB
+
+
+// map the next page too - in case the impure area crosses a page boundary
+        lda	r4, (1<<va_s_off)(r1)	// Generate address for next page
+        srl	r4, va_s_off, r0	// Clean off byte-within-page offset
+        sll	r0, pte_v_pfn, r0	// Shift to form PFN
+        lda	r0, pte_m_prot(r0)	// Set all read/write enable bits
+        mtpr	r0, dtbPte		// Load the PTE and set valid
+        mtpr	r4, dtbTag		// Write the PTE and tag into the DTB
+
+        sll	r31, 0, r31		// stall cycle 1
+        sll	r31, 0, r31		// stall cycle 2
+        sll	r31, 0, r31		// stall cycle 3
+        nop
+
+// add offset for saving fpr regs
+//orig	fix_impure_gpr r1
+        lda	r1, 0x200(r1)		// Point to center of CPU segment
+
+// now save the regs - F0-F31
+        mf_fpcr  f0			// original
+
+        SAVE_FPR(f0,CNS_Q_FPR+0x00,r1)
+        SAVE_FPR(f1,CNS_Q_FPR+0x08,r1)
+        SAVE_FPR(f2,CNS_Q_FPR+0x10,r1)
+        SAVE_FPR(f3,CNS_Q_FPR+0x18,r1)
+        SAVE_FPR(f4,CNS_Q_FPR+0x20,r1)
+        SAVE_FPR(f5,CNS_Q_FPR+0x28,r1)
+        SAVE_FPR(f6,CNS_Q_FPR+0x30,r1)
+        SAVE_FPR(f7,CNS_Q_FPR+0x38,r1)
+        SAVE_FPR(f8,CNS_Q_FPR+0x40,r1)
+        SAVE_FPR(f9,CNS_Q_FPR+0x48,r1)
+        SAVE_FPR(f10,CNS_Q_FPR+0x50,r1)
+        SAVE_FPR(f11,CNS_Q_FPR+0x58,r1)
+        SAVE_FPR(f12,CNS_Q_FPR+0x60,r1)
+        SAVE_FPR(f13,CNS_Q_FPR+0x68,r1)
+        SAVE_FPR(f14,CNS_Q_FPR+0x70,r1)
+        SAVE_FPR(f15,CNS_Q_FPR+0x78,r1)
+        SAVE_FPR(f16,CNS_Q_FPR+0x80,r1)
+        SAVE_FPR(f17,CNS_Q_FPR+0x88,r1)
+        SAVE_FPR(f18,CNS_Q_FPR+0x90,r1)
+        SAVE_FPR(f19,CNS_Q_FPR+0x98,r1)
+        SAVE_FPR(f20,CNS_Q_FPR+0xA0,r1)
+        SAVE_FPR(f21,CNS_Q_FPR+0xA8,r1)
+        SAVE_FPR(f22,CNS_Q_FPR+0xB0,r1)
+        SAVE_FPR(f23,CNS_Q_FPR+0xB8,r1)
+        SAVE_FPR(f24,CNS_Q_FPR+0xC0,r1)
+        SAVE_FPR(f25,CNS_Q_FPR+0xC8,r1)
+        SAVE_FPR(f26,CNS_Q_FPR+0xD0,r1)
+        SAVE_FPR(f27,CNS_Q_FPR+0xD8,r1)
+        SAVE_FPR(f28,CNS_Q_FPR+0xE0,r1)
+        SAVE_FPR(f29,CNS_Q_FPR+0xE8,r1)
+        SAVE_FPR(f30,CNS_Q_FPR+0xF0,r1)
+        SAVE_FPR(f31,CNS_Q_FPR+0xF8,r1)
+
+//switch impure offset from gpr to ipr---
+//orig	unfix_impure_gpr	r1
+//orig	fix_impure_ipr	r1
+//orig	store_reg1 fpcsr, f0, r1, fpcsr=1
+
+        SAVE_FPR(f0,CNS_Q_FPCSR,r1)	// fpcsr loaded above into f0 -- can it reach?
+        lda	r1, -0x200(r1)		// Restore the impure base address
+
+// and back to gpr ---
+//orig	unfix_impure_ipr	r1
+//orig	fix_impure_gpr	r1
+
+//orig	lda	r0, cns_mchksize(r31)	// get size of mchk area
+//orig	store_reg1 mchkflag, r0, r1, ipr=1
+//orig	mb
+
+        lda	r1, CNS_Q_IPR(r1)	// Point to base of IPR area again
+        // save this using the IPR base (it is closer) not the GPR base as they used...pb
+        lda	r0, MACHINE_CHECK_SIZE(r31)	// get size of mchk area
+        SAVE_SHADOW(r0,CNS_Q_MCHK,r1);
+        mb
+
+//orig	or	r31, 1, r0		// get a one
+//orig	store_reg1 flag, r0, r1, ipr=1	// set dump area flag
+//orig	mb
+
+        lda	r1, -CNS_Q_IPR(r1)	// back to the base
+        lda	r1, 0x200(r1)		// Point to center of CPU segment
+        or	r31, 1, r0		// get a one
+        SAVE_GPR(r0,CNS_Q_FLAG,r1)	// set dump area valid flag
+        mb
+
+        // restore impure area base
+//orig	unfix_impure_gpr r1
+        lda	r1, -0x200(r1)		// Restore the impure base address
+
+        mtpr	r31, dtb_ia		// clear the dtb
+        mtpr	r31, itb_ia		// clear the itb
+
+//orig	pvc_jsr	savsta, bsr=1, dest=1
+        ret	r31, (r3)		// and back we go
+
+
+
+// .sbttl  "PAL_RESTORE_STATE"
+//
+//
+//	Pal_restore_state
+//
+//
+//	register usage:
+//		r1 = addr of impure area
+//		r3 = return_address
+//		all other regs are scratchable, as they are about to
+//		be reloaded from ram.
+//
+//	Function:
+//		All chip state restored, all SRs, FRs, PTs, IPRs
+//					*** except R1, R3, PT0, PT4, PT5 ***
+//
+//
+        ALIGN_BLOCK
+pal_restore_state:
+
+//need to restore sc_ctl,bc_ctl,bc_config??? if so, need to figure out a safe way to do so.
+
+// map the console io area virtually
+        mtpr	r31, dtbIa		// Clear all DTB entries
+        srl	r1, va_s_off, r0	// Clean off byte-within-page offset
+        sll	r0, pte_v_pfn, r0	// Shift to form PFN
+        lda	r0, pte_m_prot(r0)	// Set all read/write enable bits
+        mtpr	r0, dtbPte		// Load the PTE and set valid
+        mtpr	r1, dtbTag		// Write the PTE and tag into the DTB
+
+
+// map the next page too, in case impure area crosses page boundary
+        lda	r4, (1<<VA_S_OFF)(r1)	// Generate address for next page
+        srl	r4, va_s_off, r0	// Clean off byte-within-page offset
+        sll	r0, pte_v_pfn, r0	// Shift to form PFN
+        lda	r0, pte_m_prot(r0)	// Set all read/write enable bits
+        mtpr	r0, dtbPte		// Load the PTE and set valid
+        mtpr	r4, dtbTag		// Write the PTE and tag into the DTB
+
+// enable FPE and SDE so the floating regs and shadow regs can be restored
+        mfpr	r0, icsr		// Get current ICSR
+        bis	zero, 1, r2		// Get a '1'
+        or	r2, (1<<(icsr_v_sde-icsr_v_fpe)), r2
+        sll	r2, icsr_v_fpe, r2	// Shift bits into position
+        bis	r2, r2, r0		// Set ICSR<SDE> and ICSR<FPE>
+        mtpr	r0, icsr		// Update the chip
+
+        mfpr	r31, pt0		// FPE bubble cycle 1		//orig
+        mfpr	r31, pt0		// FPE bubble cycle 2		//orig
+        mfpr	r31, pt0		// FPE bubble cycle 3		//orig
+
+//orig	fix_impure_ipr r1
+//orig	restore_reg1 fpcsr, f0, r1, fpcsr=1
+//orig	mt_fpcr  f0
+//orig
+//orig	unfix_impure_ipr r1
+//orig	fix_impure_gpr r1		// adjust impure pointer offset for gpr access
+        lda	r1, 200(r1)	// Point to base of IPR area again
+        RESTORE_FPR(f0,CNS_Q_FPCSR,r1)		// can it reach?? pb
+        mt_fpcr  f0			// original
+
+        lda	r1, 0x200(r1)		// point to center of CPU segment
+
+// restore all floating regs
+        RESTORE_FPR(f0,CNS_Q_FPR+0x00,r1)
+        RESTORE_FPR(f1,CNS_Q_FPR+0x08,r1)
+        RESTORE_FPR(f2,CNS_Q_FPR+0x10,r1)
+        RESTORE_FPR(f3,CNS_Q_FPR+0x18,r1)
+        RESTORE_FPR(f4,CNS_Q_FPR+0x20,r1)
+        RESTORE_FPR(f5,CNS_Q_FPR+0x28,r1)
+        RESTORE_FPR(f6,CNS_Q_FPR+0x30,r1)
+        RESTORE_FPR(f7,CNS_Q_FPR+0x38,r1)
+        RESTORE_FPR(f8,CNS_Q_FPR+0x40,r1)
+        RESTORE_FPR(f9,CNS_Q_FPR+0x48,r1)
+        RESTORE_FPR(f10,CNS_Q_FPR+0x50,r1)
+        RESTORE_FPR(f11,CNS_Q_FPR+0x58,r1)
+        RESTORE_FPR(f12,CNS_Q_FPR+0x60,r1)
+        RESTORE_FPR(f13,CNS_Q_FPR+0x68,r1)
+        RESTORE_FPR(f14,CNS_Q_FPR+0x70,r1)
+        RESTORE_FPR(f15,CNS_Q_FPR+0x78,r1)
+        RESTORE_FPR(f16,CNS_Q_FPR+0x80,r1)
+        RESTORE_FPR(f17,CNS_Q_FPR+0x88,r1)
+        RESTORE_FPR(f18,CNS_Q_FPR+0x90,r1)
+        RESTORE_FPR(f19,CNS_Q_FPR+0x98,r1)
+        RESTORE_FPR(f20,CNS_Q_FPR+0xA0,r1)
+        RESTORE_FPR(f21,CNS_Q_FPR+0xA8,r1)
+        RESTORE_FPR(f22,CNS_Q_FPR+0xB0,r1)
+        RESTORE_FPR(f23,CNS_Q_FPR+0xB8,r1)
+        RESTORE_FPR(f24,CNS_Q_FPR+0xC0,r1)
+        RESTORE_FPR(f25,CNS_Q_FPR+0xC8,r1)
+        RESTORE_FPR(f26,CNS_Q_FPR+0xD0,r1)
+        RESTORE_FPR(f27,CNS_Q_FPR+0xD8,r1)
+        RESTORE_FPR(f28,CNS_Q_FPR+0xE0,r1)
+        RESTORE_FPR(f29,CNS_Q_FPR+0xE8,r1)
+        RESTORE_FPR(f30,CNS_Q_FPR+0xF0,r1)
+        RESTORE_FPR(f31,CNS_Q_FPR+0xF8,r1)
+
+// switch impure pointer from gpr to ipr area --
+//orig	unfix_impure_gpr r1
+//orig	fix_impure_ipr r1
+        lda	r1, -0x200(r1)		// Restore base address of impure area.
+        lda	r1, CNS_Q_IPR(r1)	// Point to base of IPR area.
+
+// restore all pal regs
+        RESTORE_IPR(pt0,CNS_Q_PT+0x00,r1)		// the osf code didn't save/restore palTemp 0 ?? pboyle
+        RESTORE_IPR(pt1,CNS_Q_PT+0x08,r1)
+        RESTORE_IPR(pt2,CNS_Q_PT+0x10,r1)
+        RESTORE_IPR(pt3,CNS_Q_PT+0x18,r1)
+        RESTORE_IPR(pt4,CNS_Q_PT+0x20,r1)
+        RESTORE_IPR(pt5,CNS_Q_PT+0x28,r1)
+        RESTORE_IPR(pt6,CNS_Q_PT+0x30,r1)
+        RESTORE_IPR(pt7,CNS_Q_PT+0x38,r1)
+        RESTORE_IPR(pt8,CNS_Q_PT+0x40,r1)
+        RESTORE_IPR(pt9,CNS_Q_PT+0x48,r1)
+        RESTORE_IPR(pt10,CNS_Q_PT+0x50,r1)
+        RESTORE_IPR(pt11,CNS_Q_PT+0x58,r1)
+        RESTORE_IPR(pt12,CNS_Q_PT+0x60,r1)
+        RESTORE_IPR(pt13,CNS_Q_PT+0x68,r1)
+        RESTORE_IPR(pt14,CNS_Q_PT+0x70,r1)
+        RESTORE_IPR(pt15,CNS_Q_PT+0x78,r1)
+        RESTORE_IPR(pt16,CNS_Q_PT+0x80,r1)
+        RESTORE_IPR(pt17,CNS_Q_PT+0x88,r1)
+        RESTORE_IPR(pt18,CNS_Q_PT+0x90,r1)
+        RESTORE_IPR(pt19,CNS_Q_PT+0x98,r1)
+        RESTORE_IPR(pt20,CNS_Q_PT+0xA0,r1)
+        RESTORE_IPR(pt21,CNS_Q_PT+0xA8,r1)
+        RESTORE_IPR(pt22,CNS_Q_PT+0xB0,r1)
+        RESTORE_IPR(pt23,CNS_Q_PT+0xB8,r1)
+
+
+//orig	restore_reg exc_addr,	ipr=1	// restore ipr
+//orig	restore_reg pal_base,	ipr=1	// restore ipr
+//orig	restore_reg ipl,	ipr=1	// restore ipr
+//orig	restore_reg ps,		ipr=1	// restore ipr
+//orig	mtpr	r0, dtb_cm		// set current mode in mbox too
+//orig	restore_reg itb_asn,	ipr=1
+//orig	srl	r0, itb_asn_v_asn, r0
+//orig	sll	r0, dtb_asn_v_asn, r0
+//orig	mtpr	r0, dtb_asn		// set ASN in Mbox too
+//orig	restore_reg ivptbr,	ipr=1
+//orig	mtpr	r0, mvptbr			// use ivptbr value to restore mvptbr
+//orig	restore_reg mcsr,	ipr=1
+//orig	restore_reg aster,	ipr=1
+//orig	restore_reg astrr,	ipr=1
+//orig	restore_reg sirr,	ipr=1
+//orig	restore_reg maf_mode, 	ipr=1		// no mbox instruction for 3 cycles
+//orig	mfpr	r31, pt0			// (may issue with mt maf_mode)
+//orig	mfpr	r31, pt0			// bubble cycle 1
+//orig	mfpr	r31, pt0                        // bubble cycle 2
+//orig	mfpr	r31, pt0                        // bubble cycle 3
+//orig	mfpr	r31, pt0			// (may issue with following ld)
+
+        // RESTORE_IPR leaves the restored value in r0, and the code below relies on that side effect (gag)
+        RESTORE_IPR(excAddr,CNS_Q_EXC_ADDR,r1)
+        RESTORE_IPR(palBase,CNS_Q_PAL_BASE,r1)
+        RESTORE_IPR(ipl,CNS_Q_IPL,r1)
+        RESTORE_IPR(ips,CNS_Q_IPS,r1)
+        mtpr	r0, dtbCm			// Set Mbox current mode too.
+        RESTORE_IPR(itbAsn,CNS_Q_ITB_ASN,r1)
+        srl	r0, 4, r0
+        sll	r0, 57, r0
+        mtpr	r0, dtbAsn			// Set Mbox ASN too
+        RESTORE_IPR(iVptBr,CNS_Q_IVPTBR,r1)
+        mtpr	r0, mVptBr			// Set Mbox VptBr too
+        RESTORE_IPR(mcsr,CNS_Q_MCSR,r1)
+        RESTORE_IPR(aster,CNS_Q_ASTER,r1)
+        RESTORE_IPR(astrr,CNS_Q_ASTRR,r1)
+        RESTORE_IPR(sirr,CNS_Q_SIRR,r1)
+        RESTORE_IPR(mafMode,CNS_Q_MAF_MODE,r1)
+        STALL
+        STALL
+        STALL
+        STALL
+        STALL
+
+
+        // restore all integer shadow regs
+        RESTORE_SHADOW( r8,CNS_Q_SHADOW+0x00,r1)	// also called p0...p7 in the Hudson code
+        RESTORE_SHADOW( r9,CNS_Q_SHADOW+0x08,r1)
+        RESTORE_SHADOW(r10,CNS_Q_SHADOW+0x10,r1)
+        RESTORE_SHADOW(r11,CNS_Q_SHADOW+0x18,r1)
+        RESTORE_SHADOW(r12,CNS_Q_SHADOW+0x20,r1)
+        RESTORE_SHADOW(r13,CNS_Q_SHADOW+0x28,r1)
+        RESTORE_SHADOW(r14,CNS_Q_SHADOW+0x30,r1)
+        RESTORE_SHADOW(r25,CNS_Q_SHADOW+0x38,r1)
+        RESTORE_IPR(dcMode,CNS_Q_DC_MODE,r1)
+
+        //
+        // Get out of shadow mode
+        //
+
+        mfpr	r31, pt0		// pad last load to icsr write (in case of replay, icsr will be written anyway)
+        mfpr	r31, pt0		// ""
+        mfpr	r0, icsr		// Get icsr
+        ldah	r2,  (1<<(ICSR_V_SDE-16))(r31)	// Get a one in SHADOW_ENABLE bit location
+        bic	r0, r2, r2		// ICSR with SDE clear
+        mtpr	r2, icsr		// Turn off SDE - no palshadow rd/wr for 3 bubble cycles
+
+        mfpr	r31, pt0		// SDE bubble cycle 1
+        mfpr	r31, pt0		// SDE bubble cycle 2
+        mfpr	r31, pt0		// SDE bubble cycle 3
+        nop
+
+// switch impure pointer from ipr to gpr area --
+//orig	unfix_impure_ipr	r1
+//orig	fix_impure_gpr	r1
+
+// Restore GPRs (r0, r2 are restored later, r1 and r3 are trashed) ...
+
+        lda	r1, -CNS_Q_IPR(r1)	// Restore base address of impure area
+        lda	r1, 0x200(r1)		// Point to center of CPU segment
+
+        // restore all integer regs
+        RESTORE_GPR(r4,CNS_Q_GPR+0x20,r1)
+        RESTORE_GPR(r5,CNS_Q_GPR+0x28,r1)
+        RESTORE_GPR(r6,CNS_Q_GPR+0x30,r1)
+        RESTORE_GPR(r7,CNS_Q_GPR+0x38,r1)
+        RESTORE_GPR(r8,CNS_Q_GPR+0x40,r1)
+        RESTORE_GPR(r9,CNS_Q_GPR+0x48,r1)
+        RESTORE_GPR(r10,CNS_Q_GPR+0x50,r1)
+        RESTORE_GPR(r11,CNS_Q_GPR+0x58,r1)
+        RESTORE_GPR(r12,CNS_Q_GPR+0x60,r1)
+        RESTORE_GPR(r13,CNS_Q_GPR+0x68,r1)
+        RESTORE_GPR(r14,CNS_Q_GPR+0x70,r1)
+        RESTORE_GPR(r15,CNS_Q_GPR+0x78,r1)
+        RESTORE_GPR(r16,CNS_Q_GPR+0x80,r1)
+        RESTORE_GPR(r17,CNS_Q_GPR+0x88,r1)
+        RESTORE_GPR(r18,CNS_Q_GPR+0x90,r1)
+        RESTORE_GPR(r19,CNS_Q_GPR+0x98,r1)
+        RESTORE_GPR(r20,CNS_Q_GPR+0xA0,r1)
+        RESTORE_GPR(r21,CNS_Q_GPR+0xA8,r1)
+        RESTORE_GPR(r22,CNS_Q_GPR+0xB0,r1)
+        RESTORE_GPR(r23,CNS_Q_GPR+0xB8,r1)
+        RESTORE_GPR(r24,CNS_Q_GPR+0xC0,r1)
+        RESTORE_GPR(r25,CNS_Q_GPR+0xC8,r1)
+        RESTORE_GPR(r26,CNS_Q_GPR+0xD0,r1)
+        RESTORE_GPR(r27,CNS_Q_GPR+0xD8,r1)
+        RESTORE_GPR(r28,CNS_Q_GPR+0xE0,r1)
+        RESTORE_GPR(r29,CNS_Q_GPR+0xE8,r1)
+        RESTORE_GPR(r30,CNS_Q_GPR+0xF0,r1)
+        RESTORE_GPR(r31,CNS_Q_GPR+0xF8,r1)
+
+//orig	// switch impure pointer from gpr to ipr area --
+//orig	unfix_impure_gpr	r1
+//orig	fix_impure_ipr	r1
+//orig	restore_reg icsr, ipr=1		// restore original icsr- 4 bubbles to hw_rei
+
+        lda	t0, -0x200(t0)		// Restore base address of impure area.
+        lda	t0, CNS_Q_IPR(t0)	// Point to base of IPR area again.
+        RESTORE_IPR(icsr,CNS_Q_ICSR,r1)
+
+//orig	// and back again --
+//orig	unfix_impure_ipr	r1
+//orig	fix_impure_gpr	r1
+//orig	store_reg1 	flag, r31, r1, ipr=1 // clear dump area valid flag
+//orig	mb
+
+        lda	t0, -CNS_Q_IPR(t0)	// Back to base of impure area again,
+        lda	t0, 0x200(t0)		// and back to center of CPU segment
+        SAVE_GPR(r31,CNS_Q_FLAG,r1)	// Clear the dump area valid flag
+        mb
+
+//orig	// and back we go
+//orig//	restore_reg 3
+//orig	restore_reg 2
+//orig//	restore_reg 1
+//orig	restore_reg 0
+//orig	// restore impure area base
+//orig	unfix_impure_gpr r1
+
+        RESTORE_GPR(r2,CNS_Q_GPR+0x10,r1)
+        RESTORE_GPR(r0,CNS_Q_GPR+0x00,r1)
+        lda	r1, -0x200(r1)		// Restore impure base address
+
+        mfpr	r31, pt0		// stall for ldq_p above		//orig
+
+        mtpr	r31, dtb_ia		// clear the tb			//orig
+        mtpr	r31, itb_ia		// clear the itb		//orig
+
+//orig	pvc_jsr	rststa, bsr=1, dest=1
+        ret	r31, (r3)		// back we go			//orig
+
+
+//
+// pal_pal_bug_check -- code has found a bugcheck situation.
+//	Set things up and join common machine check flow.
+//
+// Input:
+//	r14 	- exc_addr
+//
+// On exit:
+//	pt0	- saved r0
+//	pt1	- saved	r1
+//	pt4	- saved r4
+//	pt5	- saved r5
+//	pt6	- saved r6
+//	pt10	- saved exc_addr
+//       pt_misc<47:32> - mchk code
+//       pt_misc<31:16> - scb vector
+//	r14	- base of Cbox IPRs in IO space
+//	MCES<mchk> is set
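+//
+// Sketch of the pt_misc packing done below (illustrative only; field
+// positions are the ones listed above, the rest is read off the code):
+//      pt_misc = (pt_misc & ~0x0000FFFFFFFF0000)  // zap 0x3c clears <47:16>
+//              | (mchk code << 32)                // bugcheck code, +1 if from interrupt
+//              | (scb_v_procmchk << 16)
+//              | mces_m_mchk
+//      r14     = 0x000000FFFFF00000               // ldah 0xfff0 sign-extends, zap 0xE0 clears <63:40>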
+//
+
+                ALIGN_BLOCK
+        .globl pal_pal_bug_check_from_int
+pal_pal_bug_check_from_int:
+        DEBUGSTORE(0x79)
+//simos	DEBUG_EXC_ADDR()
+        DEBUGSTORE(0x20)
+//simos	bsr	r25, put_hex
+        lda	r25, mchk_c_bugcheck(r31)
+        addq	r25, 1, r25			// set flag indicating we came from interrupt and stack is already pushed
+        br	r31, pal_pal_mchk
+        nop
+
+pal_pal_bug_check:
+        lda     r25, mchk_c_bugcheck(r31)
+
+pal_pal_mchk:
+        sll	r25, 32, r25			// Move mchk code to position
+
+        mtpr	r14, pt10			// Stash exc_addr
+        mtpr	r14, exc_addr
+
+        mfpr	r12, pt_misc			// Get MCES and scratch
+        zap	r12, 0x3c, r12
+
+        or	r12, r25, r12			// Combine mchk code
+        lda	r25, scb_v_procmchk(r31)	// Get SCB vector
+
+        sll	r25, 16, r25			// Move SCBv to position
+        or	r12, r25, r25			// Combine SCBv
+
+        mtpr	r0, pt0				// Stash for scratch
+        bis	r25, mces_m_mchk, r25	// Set MCES<MCHK> bit
+
+        mtpr	r25, pt_misc			// Save mchk code!scbv!whami!mces
+        ldah	r14, 0xfff0(r31)
+
+        mtpr	r1, pt1				// Stash for scratch
+        zap	r14, 0xE0, r14			// Get Cbox IPR base
+
+        mtpr	r4, pt4
+        mtpr	r5, pt5
+
+        mtpr	r6, pt6
+        blbs	r12, sys_double_machine_check   // MCHK halt if double machine check
+
+        br	r31, sys_mchk_collect_iprs	// Join common machine check flow
+
+
+
+//	align_to_call_pal_section
+//      Align to address of first call_pal entry point - 2000
+
+//
+// HALT	- PALcode for HALT instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	GO to console code
+//
+//
+
+        .text	1
+//	. = 0x2000
+       CALL_PAL_PRIV(PAL_HALT_ENTRY)
+call_pal_halt:
+        mfpr	r31, pt0		// Pad exc_addr read
+        mfpr	r31, pt0
+
+        mfpr	r12, exc_addr		// get PC
+        subq	r12, 4, r12		// Point to the HALT
+
+        mtpr	r12, exc_addr
+        mtpr	r0, pt0
+
+//orig	pvc_jsr updpcb, bsr=1
+        bsr    r0, pal_update_pcb      	// update the pcb
+        lda    r0, hlt_c_sw_halt(r31)  	// set halt code to sw halt
+        br     r31, sys_enter_console  	// enter the console
+
+//
+// CFLUSH - PALcode for CFLUSH instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+//	R16 - contains the PFN of the page to be flushed
+//
+// Function:
+//	Flush all Dstream caches of 1 entire page
+//	The CFLUSH routine is in the system specific module.
+//
+//
+
+        CALL_PAL_PRIV(PAL_CFLUSH_ENTRY)
+Call_Pal_Cflush:
+        br	r31, sys_cflush
+
+//
+// DRAINA	- PALcode for DRAINA instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//	Implicit TRAPB performed by hardware.
+//
+// Function:
+//	Stall instruction issue until all prior instructions are guaranteed to
+//	complete without incurring aborts.  For the EV5 implementation, this
+//	means waiting until all pending DREADS are returned.
+//
+//
+
+        CALL_PAL_PRIV(PAL_DRAINA_ENTRY)
+Call_Pal_Draina:
+        ldah	r14, 0x100(r31)		// Init timeout counter to 0x100 << 16 (16777216).  Value?
+        nop
+
+DRAINA_LOOP:
+        subq	r14, 1, r14		// Decrement counter
+        mfpr	r13, ev5__maf_mode	// Fetch status bit
+
+        srl	r13, maf_mode_v_dread_pending, r13
+        ble	r14, DRAINA_LOOP_TOO_LONG
+
+        nop
+        blbs	r13, DRAINA_LOOP	// Wait until all DREADS clear
+
+        hw_rei
+
+DRAINA_LOOP_TOO_LONG:
+        br	r31, call_pal_halt
+
+// CALL_PAL OPCDECs
+
+        CALL_PAL_PRIV(0x0003)
+CallPal_OpcDec03:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0004)
+CallPal_OpcDec04:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0005)
+CallPal_OpcDec05:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0006)
+CallPal_OpcDec06:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0007)
+CallPal_OpcDec07:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0008)
+CallPal_OpcDec08:
+        br	r31, osfpal_calpal_opcdec
+
+//
+// CSERVE - PALcode for CSERVE instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//       Various functions for private use of console software
+//
+//       option selector in r0
+//       arguments in r16....
+//	The CSERVE routine is in the system specific module.
+//
+//
+
+        CALL_PAL_PRIV(PAL_CSERVE_ENTRY)
+Call_Pal_Cserve:
+        br	r31, sys_cserve
+
+//
+// swppal - PALcode for swppal instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//               R16 contains the new PAL identifier
+//               R17:R21 contain implementation-specific entry parameters
+//
+//               R0  receives status:
+//                0 success (PAL was switched)
+//                1 unknown PAL variant
+//                2 known PAL variant, but PAL not loaded
+//
+//
+// Function:
+//       Swap control to another PAL.
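+//
+// Argument decoding sketch (illustrative summary of the code below):
+//      a0 == 0 or a0 > 255  -> treat a0 as the address of the new PAL
+//      a0 == 2              -> OSF PAL: target = pal_base + stored offset,
+//                              rejected unless (target & 0x3FFF) == 0
+//      any other a0 <= 255  -> swppal_fail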
+//
+
+        CALL_PAL_PRIV(PAL_SWPPAL_ENTRY)
+Call_Pal_Swppal:
+        cmpule	r16, 255, r0		// see if a kibble was passed
+        cmoveq  r16, r16, r0            // if r16=0 then a valid address (ECO 59)
+
+        or	r16, r31, r3		// set r3 in case this is an address
+        blbc	r0, swppal_cont		// nope, try it as an address
+
+        cmpeq	r16, 2, r0		// is it our friend OSF?
+        blbc	r0, swppal_fail		// nope, don't know this fellow
+
+        br	r2, CALL_PAL_SWPPAL_10_			// tis our buddy OSF
+
+//	.global	osfpal_hw_entry_reset
+//	.weak	osfpal_hw_entry_reset
+//	.long	<osfpal_hw_entry_reset-pal_start>
+//orig	halt				// don't know how to get the address here - kludge ok, load pal at 0
+        .long	0			// ?? hack upon hack...pb
+
+CALL_PAL_SWPPAL_10_: 	ldl_p	r3, 0(r2)		// fetch target addr
+//	ble	r3, swppal_fail		; if OSF not linked in say not loaded.
+        mfpr	r2, pal_base		// fetch pal base
+
+        addq	r2, r3, r3		// add pal base
+        lda	r2, 0x3FFF(r31)		// get pal base checker mask
+
+        and	r3, r2, r2		// any funky bits set?
+        cmpeq	r2, 0, r0		//
+
+        blbc	r0, swppal_fail		// return unknown if bad bit set.
+        br	r31, swppal_cont
+
+// .sbttl	"CALL_PAL OPCDECs"
+
+        CALL_PAL_PRIV(0x000B)
+CallPal_OpcDec0B:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x000C)
+CallPal_OpcDec0C:
+        br	r31, osfpal_calpal_opcdec
+
+//
+// wripir - PALcode for wripir instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//	r16 = processor number to interrupt
+//
+// Function:
+//	IPIR	<- R16
+//	Handled in system-specific code
+//
+// Exit:
+//	interprocessor interrupt is recorded on the target processor
+//	and is initiated when the proper enabling conditions are present.
+//
+
+        CALL_PAL_PRIV(PAL_WRIPIR_ENTRY)
+Call_Pal_Wrpir:
+        br	r31, sys_wripir
+
+// .sbttl	"CALL_PAL OPCDECs"
+
+        CALL_PAL_PRIV(0x000E)
+CallPal_OpcDec0E:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x000F)
+CallPal_OpcDec0F:
+        br	r31, osfpal_calpal_opcdec
+
+//
+// rdmces - PALcode for rdmces instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	R0 <- ZEXT(MCES)
+//
+
+        CALL_PAL_PRIV(PAL_RDMCES_ENTRY)
+Call_Pal_Rdmces:
+        mfpr	r0, pt_mces		// Read from PALtemp
+        and	r0, mces_m_all, r0	// Clear other bits
+
+        hw_rei
+
+//
+// wrmces - PALcode for wrmces instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	If {R16<0> EQ 1} then MCES<0> <- 0 (MCHK)
+//	If {R16<1> EQ 1} then MCES<1> <- 0 (SCE)
+//	If {R16<2> EQ 1} then MCES<2> <- 0 (PCE)
+//	MCES<3> <- R16<3>		   (DPC)
+//	MCES<4> <- R16<4>		   (DSC)
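+//
+// Worked example (illustrative; assumes the bit positions listed above):
+//      new MCES = (old & ~(R16 & 0x07) & ~0x18) | (R16 & 0x18)
+//      old MCES = 0x0D (MCHK, PCE, DPC set), R16 = 0x13
+//        R16<1:0> set  -> MCHK cleared (SCE was already clear)
+//        R16<2> clear  -> PCE unchanged
+//        R16<4:3> = 10 -> DPC <- 0, DSC <- 1
+//      new MCES = 0x14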
+//
+//
+
+        CALL_PAL_PRIV(PAL_WRMCES_ENTRY)
+Call_Pal_Wrmces:
+        and	r16, ((1<<mces_v_mchk) | (1<<mces_v_sce) | (1<<mces_v_pce)), r13	// Isolate MCHK, SCE, PCE
+        mfpr	r14, pt_mces		// Get current value
+
+        ornot	r31, r13, r13		// Flip all the bits
+        and	r16, ((1<<mces_v_dpc) | (1<<mces_v_dsc)), r17
+
+        and	r14, r13, r1		// Update MCHK, SCE, PCE
+        bic	r1, ((1<<mces_v_dpc) | (1<<mces_v_dsc)), r1	// Clear old DPC, DSC
+
+        or	r1, r17, r1		// Update DPC and DSC
+        mtpr	r1, pt_mces		// Write MCES back
+
+        nop				// Pad to fix PT write->read restriction
+
+        nop
+        hw_rei
+
+
+
+// CALL_PAL OPCDECs
+
+        CALL_PAL_PRIV(0x0012)
+CallPal_OpcDec12:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0013)
+CallPal_OpcDec13:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0014)
+CallPal_OpcDec14:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0015)
+CallPal_OpcDec15:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0016)
+CallPal_OpcDec16:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0017)
+CallPal_OpcDec17:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0018)
+CallPal_OpcDec18:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0019)
+CallPal_OpcDec19:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x001A)
+CallPal_OpcDec1A:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x001B)
+CallPal_OpcDec1B:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x001C)
+CallPal_OpcDec1C:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x001D)
+CallPal_OpcDec1D:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x001E)
+CallPal_OpcDec1E:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x001F)
+CallPal_OpcDec1F:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0020)
+CallPal_OpcDec20:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0021)
+CallPal_OpcDec21:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0022)
+CallPal_OpcDec22:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0023)
+CallPal_OpcDec23:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0024)
+CallPal_OpcDec24:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0025)
+CallPal_OpcDec25:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0026)
+CallPal_OpcDec26:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0027)
+CallPal_OpcDec27:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0028)
+CallPal_OpcDec28:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x0029)
+CallPal_OpcDec29:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x002A)
+CallPal_OpcDec2A:
+        br	r31, osfpal_calpal_opcdec
+
+//
+// wrfen - PALcode for wrfen instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	a0<0> -> ICSR<FPE>
+//	Store new FEN in PCB
+//	Final value of t0 (r1), t8..t10 (r22..r24) and a0 (r16)
+//          are UNPREDICTABLE
+//
+// Issue: What about pending FP loads when FEN goes from on->off????
+//
+
+        CALL_PAL_PRIV(PAL_WRFEN_ENTRY)
+Call_Pal_Wrfen:
+        or	r31, 1, r13		// Get a one
+        mfpr	r1, ev5__icsr		// Get current FPE
+
+        sll	r13, icsr_v_fpe, r13	// shift 1 to icsr<fpe> spot, e0
+        and	r16, 1, r16		// clean new fen
+
+        sll	r16, icsr_v_fpe, r12	// shift new fen to correct bit position
+        bic	r1, r13, r1		// zero icsr<fpe>
+
+        or	r1, r12, r1		// Or new FEN into ICSR
+        mfpr	r12, pt_pcbb		// Get PCBB - E1
+
+        mtpr	r1, ev5__icsr		// write new ICSR.  3 Bubble cycles to HW_REI
+        stl_p	r16, osfpcb_q_fen(r12)	// Store FEN in PCB.
+
+        mfpr	r31, pt0		// Pad ICSR<FPE> write.
+        mfpr	r31, pt0
+
+        mfpr	r31, pt0
+//	pvc_violate 	225		// cuz PVC can't distinguish which bits changed
+        hw_rei
+
+
+        CALL_PAL_PRIV(0x002C)
+CallPal_OpcDec2C:
+        br	r31, osfpal_calpal_opcdec
+
+//
+// wrvptptr - PALcode for wrvptptr instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	vptptr <- a0 (r16)
+//
+
+        CALL_PAL_PRIV(PAL_WRVPTPTR_ENTRY)
+Call_Pal_Wrvptptr:
+        mtpr    r16, ev5__mvptbr                // Load Mbox copy
+        mtpr    r16, ev5__ivptbr                // Load Ibox copy
+        nop                                     // Pad IPR write
+        nop
+        hw_rei
+
+        CALL_PAL_PRIV(0x002E)
+CallPal_OpcDec2E:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_PRIV(0x002F)
+CallPal_OpcDec2F:
+        br	r31, osfpal_calpal_opcdec
+
+
+//
+// swpctx - PALcode for swpctx instruction
+//
+// Entry:
+//       hardware dispatch via callPal instruction
+//       R16 -> new pcb
+//
+// Function:
+//       dynamic state moved to old pcb
+//       new state loaded from new pcb
+//       pcbb pointer set
+//       old pcbb returned in R0
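+//
+// Cycle counter sketch (illustrative, read off the code below): rpcc
+// returns an offset in <63:32> and the counter in <31:0>; the value
+// saved in the old pcb is their 32-bit sum:
+//      old_pcb.cc = low 32 bits of (rpcc<31:0> + rpcc<63:32>)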
+//
+//  Note: need to add perf monitor stuff
+//
+
+        CALL_PAL_PRIV(PAL_SWPCTX_ENTRY)
+Call_Pal_Swpctx:
+        rpcc	r13			// get cyccounter
+        mfpr	r0, pt_pcbb		// get pcbb
+
+        ldq_p	r22, osfpcb_q_fen(r16)	// get new fen/pme
+        ldq_p	r23, osfpcb_l_cc(r16)	// get new asn
+
+        srl	r13, 32, r25		// move offset
+        mfpr	r24, pt_usp		// get usp
+
+        stq_p	r30, osfpcb_q_ksp(r0)	// store old ksp
+//	pvc_violate 379			// stq_p can't trap except replay.  only problem if mf same ipr in same shadow.
+        mtpr	r16, pt_pcbb		// set new pcbb
+
+        stq_p	r24, osfpcb_q_usp(r0)	// store usp
+        addl	r13, r25, r25		// merge for new time
+
+        stl_p	r25, osfpcb_l_cc(r0)	// save time
+        ldah	r24, (1<<(icsr_v_fpe-16))(r31)
+
+        and	r22, 1, r12		// isolate fen
+        mfpr	r25, icsr		// get current icsr
+
+        lda	r24, (1<<icsr_v_pmp)(r24)
+        br	r31, swpctx_cont
+
+//
+// wrval - PALcode for wrval instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	sysvalue <- a0 (r16)
+//
+
+        CALL_PAL_PRIV(PAL_WRVAL_ENTRY)
+Call_Pal_Wrval:
+        nop
+        mtpr	r16, pt_sysval		// Pad paltemp write
+        nop
+        nop
+        hw_rei
+
+//
+// rdval - PALcode for rdval instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	v0 (r0) <- sysvalue
+//
+
+        CALL_PAL_PRIV(PAL_RDVAL_ENTRY)
+Call_Pal_Rdval:
+        nop
+        mfpr	r0, pt_sysval
+        nop
+        hw_rei
+
+//
+// tbi - PALcode for tbi instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	TB invalidate
+//       r16/a0 = TBI type
+//       r17/a1 = Va for TBISx instructions
+//
+
+        CALL_PAL_PRIV(PAL_TBI_ENTRY)
+Call_Pal_Tbi:
+        addq	r16, 2, r16			// bias type by 2 so the valid range is 0-5
+        br	r23, CALL_PAL_tbi_10_		// get our address
+
+CALL_PAL_tbi_10_: cmpult	r16, 6, r22		// see if in range
+        lda	r23, tbi_tbl-CALL_PAL_tbi_10_(r23)	// set base to start of table
+        sll	r16, 4, r16		// * 16
+        blbc	r22, CALL_PAL_tbi_30_		// go rei, if not
+
+        addq	r23, r16, r23		// addr of our code
+//orig	pvc_jsr	tbi
+        jmp	r31, (r23)		// and go do it
+
+CALL_PAL_tbi_30_:
+        hw_rei
+        nop
+
+//
+// wrent - PALcode for wrent instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	Update ent* in paltemps
+//       r16/a0 = Address of entry routine
+//       r17/a1 = Entry Number 0..5
+//
+//       r22, r23 trashed
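+//
+// Dispatch sketch (illustrative): the "sll r17, 4" below implies that
+// each wrent_tbl entry is 16 bytes, i.e. four instructions:
+//      target = wrent_tbl + 16 * a1            // taken only when a1 < 6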
+//
+
+        CALL_PAL_PRIV(PAL_WRENT_ENTRY)
+Call_Pal_Wrent:
+        cmpult	r17, 6, r22			// see if in range
+        br	r23, CALL_PAL_wrent_10_		// get our address
+
+CALL_PAL_wrent_10_:	bic	r16, 3, r16	// clean pc
+        blbc	r22, CALL_PAL_wrent_30_		// go rei, if not in range
+
+        lda	r23, wrent_tbl-CALL_PAL_wrent_10_(r23)	// set base to start of table
+        sll	r17, 4, r17				// *16
+
+        addq  	r17, r23, r23		// Get address in table
+//orig	pvc_jsr	wrent
+        jmp	r31, (r23)		// and go do it
+
+CALL_PAL_wrent_30_:
+        hw_rei				// out of range, just return
+
+//
+// swpipl - PALcode for swpipl instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	v0 (r0)  <- PS<IPL>
+//	PS<IPL>  <- a0<2:0>  (r16)
+//
+//	t8 (r22) is scratch
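+//
+// Mask lookup sketch (illustrative; assumes pt_intmask packs one byte
+// of hardware IPL per OSF IPL, as the extbl below implies):
+//      ev5__ipl = (pt_intmask >> (8 * a0<2:0>)) & 0xFF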
+//
+
+        CALL_PAL_PRIV(PAL_SWPIPL_ENTRY)
+Call_Pal_Swpipl:
+        and	r16, osfps_m_ipl, r16	// clean New ipl
+        mfpr	r22, pt_intmask		// get int mask
+
+        extbl	r22, r16, r22		// get mask for this ipl
+        bis	r11, r31, r0		// return old ipl
+
+        bis	r16, r31, r11		// set new ps
+        mtpr	r22, ev5__ipl		// set new mask
+
+        mfpr	r31, pt0		// pad ipl write
+        mfpr	r31, pt0		// pad ipl write
+
+        hw_rei				// back
+
+//
+// rdps - PALcode for rdps instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	v0 (r0) <- ps
+//
+
+        CALL_PAL_PRIV(PAL_RDPS_ENTRY)
+Call_Pal_Rdps:
+        bis	r11, r31, r0		// Fetch PALshadow PS
+        nop				// Must be 2 cycles long
+        hw_rei
+
+//
+// wrkgp - PALcode for wrkgp instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	kgp <- a0 (r16)
+//
+
+        CALL_PAL_PRIV(PAL_WRKGP_ENTRY)
+Call_Pal_Wrkgp:
+        nop
+        mtpr	r16, pt_kgp
+        nop				// Pad for pt write->read restriction
+        nop
+        hw_rei
+
+//
+// wrusp - PALcode for wrusp instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//       usp <- a0 (r16)
+//
+
+        CALL_PAL_PRIV(PAL_WRUSP_ENTRY)
+Call_Pal_Wrusp:
+        nop
+        mtpr	r16, pt_usp
+        nop				// Pad possible pt write->read restriction
+        nop
+        hw_rei
+
+//
+// wrperfmon - PALcode for wrperfmon instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+//
+// Function:
+//	Various control functions for the onchip performance counters
+//
+//	option selector in r16
+//	option argument in r17
+//	returned status in r0
+//
+//
+//	r16 = 0	Disable performance monitoring for one or more CPUs
+//	  r17 = 0		disable no counters
+//	  r17 = bitmask		disable counters specified in bit mask (1=disable)
+//
+//	r16 = 1	Enable performance monitoring for one or more CPUs
+//	  r17 = 0		enable no counters
+//	  r17 = bitmask		enable counters specified in bit mask (1=enable)
+//
+//	r16 = 2	Mux select for one or more CPUs
+//	  r17 = Mux selection (cpu specific)
+//    		<24:19>  	 bc_ctl<pm_mux_sel> field (see spec)
+//		<31>,<7:4>,<3:0> pmctr <sel0>,<sel1>,<sel2> fields (see spec)
+//
+//	r16 = 3	Options
+//	  r17 = (cpu specific)
+//		<0> = 0 	log all processes
+//		<0> = 1		log only selected processes
+//		<30,9,8> 		mode select - ku,kp,kk
+//
+//	r16 = 4	Interrupt frequency select
+//	  r17 = (cpu specific)	indicates interrupt frequencies desired for each
+//				counter, with "zero interrupts" being an option
+//				frequency info in r17 bits as defined by PMCTR_CTL<FRQx> below
+//
+//	r16 = 5	Read Counters
+//	  r17 = na
+//	  r0  = value (same format as ev5 pmctr)
+//	        <0> = 0		Read failed
+//	        <0> = 1		Read succeeded
+//
+//	r16 = 6	Write Counters
+//	  r17 = value (same format as ev5 pmctr; all counters written simultaneously)
+//
+//	r16 = 7	Enable performance monitoring for one or more CPUs and reset counter to 0
+//	  r17 = 0		enable no counters
+//	  r17 = bitmask		enable & clear counters specified in bit mask (1=enable & clear)
+//
+//=============================================================================
+//Assumptions:
+//PMCTR_CTL:
+//
+//       <15:14>         CTL0 -- encoded frequency select and enable - CTR0
+//       <13:12>         CTL1 --			"		   - CTR1
+//       <11:10>         CTL2 --			"		   - CTR2
+//
+//       <9:8>           FRQ0 -- frequency select for CTR0 (no enable info)
+//       <7:6>           FRQ1 -- frequency select for CTR1
+//       <5:4>           FRQ2 -- frequency select for CTR2
+//
+//       <0>		all vs. select processes (0=all,1=select)
+//
+//     where
+//	FRQx<1:0>
+//	     0 1	disable interrupt
+//	     1 0	frequency = 65536 (16384 for ctr2)
+//	     1 1	frequency = 256
+//	note:  FRQx<1:0> = 00 will keep counters from ever being enabled.
+//
+//=============================================================================
+//
+        CALL_PAL_PRIV(0x0039)
+// unsupported in Hudson code .. pboyle Nov/95
+CALL_PAL_Wrperfmon:
+        // "real" performance monitoring code
+        cmpeq	r16, 1, r0		// check for enable
+        bne	r0, perfmon_en		// br if requested to enable
+
+        cmpeq	r16, 2, r0		// check for mux ctl
+        bne	r0, perfmon_muxctl	// br if request to set mux controls
+
+        cmpeq	r16, 3, r0		// check for options
+        bne	r0, perfmon_ctl		// br if request to set options
+
+        cmpeq	r16, 4, r0		// check for interrupt frequency select
+        bne	r0, perfmon_freq	// br if request to change frequency select
+
+        cmpeq	r16, 5, r0		// check for counter read request
+        bne	r0, perfmon_rd		// br if request to read counters
+
+        cmpeq	r16, 6, r0		// check for counter write request
+        bne	r0, perfmon_wr		// br if request to write counters
+
+        cmpeq	r16, 7, r0		// check for counter clear/enable request
+        bne	r0, perfmon_enclr	// br if request to clear/enable counters
+
+        beq	r16, perfmon_dis	// br if requested to disable (r16=0)
+        br	r31, perfmon_unknown	// br if unknown request
+
+//
+// rdusp - PALcode for rdusp instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	v0 (r0) <- usp
+//
+
+        CALL_PAL_PRIV(PAL_RDUSP_ENTRY)
+Call_Pal_Rdusp:
+        nop
+        mfpr	r0, pt_usp
+        hw_rei
+
+
+        CALL_PAL_PRIV(0x003B)
+CallPal_OpcDec3B:
+        br	r31, osfpal_calpal_opcdec
+
+//
+// whami - PALcode for whami instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	v0 (r0) <- whami
+//
+        CALL_PAL_PRIV(PAL_WHAMI_ENTRY)
+Call_Pal_Whami:
+        nop
+        mfpr    r0, pt_whami            // Get Whami
+        extbl	r0, 1, r0		// Isolate just whami bits
+        hw_rei
+
+//
+// retsys - PALcode for retsys instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//       00(sp) contains return pc
+//       08(sp) contains r29
+//
+// Function:
+//	Return from system call.
+//       mode switched from kern to user.
+//       stacks swapped, ugp, upc restored.
+//       r23, r25 junked
+//
+
+        CALL_PAL_PRIV(PAL_RETSYS_ENTRY)
+Call_Pal_Retsys:
+        lda	r25, osfsf_c_size(sp) 	// pop stack
+        bis	r25, r31, r14		// touch r25 & r14 to stall mf exc_addr
+
+        mfpr	r14, exc_addr		// save exc_addr in case of fault
+        ldq	r23, osfsf_pc(sp) 	// get pc
+
+        ldq	r29, osfsf_gp(sp) 	// get gp
+        stl_c	r31, -4(sp)		// clear lock_flag
+
+        lda	r11, 1<<osfps_v_mode(r31)// new PS:mode=user
+        mfpr	r30, pt_usp		// get users stack
+
+        bic	r23, 3, r23		// clean return pc
+        mtpr	r31, ev5__ipl		// zero ibox IPL - 2 bubbles to hw_rei
+
+        mtpr	r11, ev5__dtb_cm	// set Mbox current mode - no virt ref for 2 cycles
+        mtpr	r11, ev5__ps		// set Ibox current mode - 2 bubble to hw_rei
+
+        mtpr	r23, exc_addr		// set return address - 1 bubble to hw_rei
+        mtpr	r25, pt_ksp		// save kern stack
+
+        rc	r31			// clear inter_flag
+//	pvc_violate 248			// possible hidden mt->mf pt violation ok in callpal
+        hw_rei_spe			// and back
+
+
+        CALL_PAL_PRIV(0x003E)
+CallPal_OpcDec3E:
+        br	r31, osfpal_calpal_opcdec
+
+//
+// rti - PALcode for rti instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	00(sp) -> ps
+//	08(sp) -> pc
+//	16(sp) -> r29 (gp)
+//	24(sp) -> r16 (a0)
+//	32(sp) -> r17 (a1)
+//	40(sp) -> r18 (a2)
+//
+
+        CALL_PAL_PRIV(PAL_RTI_ENTRY)
+        /* called once by platform_tlaser */
+        .globl Call_Pal_Rti
+Call_Pal_Rti:
+        lda	r25, osfsf_c_size(sp)	// get updated sp
+        bis	r25, r31, r14		// touch r14,r25 to stall mf exc_addr
+
+        mfpr	r14, exc_addr		// save PC in case of fault
+        rc	r31			// clear intr_flag
+
+        ldq	r12, -6*8(r25)		// get ps
+        ldq	r13, -5*8(r25)		// pc
+
+        ldq	r18, -1*8(r25)		// a2
+        ldq	r17, -2*8(r25)		// a1
+
+        ldq	r16, -3*8(r25)		// a0
+        ldq	r29, -4*8(r25)		// gp
+
+        bic	r13, 3, r13		// clean return pc
+        stl_c	r31, -4(r25)		// clear lock_flag
+
+        and	r12, osfps_m_mode, r11	// get mode
+        mtpr	r13, exc_addr		// set return address
+
+        beq	r11, rti_to_kern	// br if rti to Kern
+        br	r31, rti_to_user	// out of call_pal space
+
+
+///////////////////////////////////////////////////
+// Start the Unprivileged CALL_PAL Entry Points
+///////////////////////////////////////////////////
+
+//
+// bpt - PALcode for bpt instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	Build stack frame
+//	a0 <- code
+//	a1 <- unpred
+//	a2 <- unpred
+//	vector via entIF
+//
+//
+//
+        .text	1
+//	. = 0x3000
+        CALL_PAL_UNPRIV(PAL_BPT_ENTRY)
+Call_Pal_Bpt:
+        sll	r11, 63-osfps_v_mode, r25 // Shift mode up to MS bit
+        mtpr	r31, ev5__ps		// Set Ibox current mode to kernel
+
+        bis	r11, r31, r12		// Save PS for stack write
+        bge	r25, CALL_PAL_bpt_10_		// no stack swap needed if cm=kern
+
+        mtpr	r31, ev5__dtb_cm	// Set Mbox current mode to kernel -
+                                        //     no virt ref for next 2 cycles
+        mtpr	r30, pt_usp		// save user stack
+
+        bis	r31, r31, r11		// Set new PS
+        mfpr	r30, pt_ksp
+
+CALL_PAL_bpt_10_:
+        lda	sp, 0-osfsf_c_size(sp)// allocate stack space
+        mfpr	r14, exc_addr		// get pc
+
+        stq	r16, osfsf_a0(sp)	// save regs
+        bis	r31, osf_a0_bpt, r16	// set a0
+
+        stq	r17, osfsf_a1(sp)	// a1
+        br	r31, bpt_bchk_common	// out of call_pal space
+
+
+//
+// bugchk - PALcode for bugchk instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	Build stack frame
+//	a0 <- code
+//	a1 <- unpred
+//	a2 <- unpred
+//	vector via entIF
+//
+//
+//
+        CALL_PAL_UNPRIV(PAL_BUGCHK_ENTRY)
+Call_Pal_Bugchk:
+        sll	r11, 63-osfps_v_mode, r25 // Shift mode up to MS bit
+        mtpr	r31, ev5__ps		// Set Ibox current mode to kernel
+
+        bis	r11, r31, r12		// Save PS for stack write
+        bge	r25, CALL_PAL_bugchk_10_		// no stack swap needed if cm=kern
+
+        mtpr	r31, ev5__dtb_cm	// Set Mbox current mode to kernel -
+                                        //     no virt ref for next 2 cycles
+        mtpr	r30, pt_usp		// save user stack
+
+        bis	r31, r31, r11		// Set new PS
+        mfpr	r30, pt_ksp
+
+CALL_PAL_bugchk_10_:
+        lda	sp, 0-osfsf_c_size(sp)// allocate stack space
+        mfpr	r14, exc_addr		// get pc
+
+        stq	r16, osfsf_a0(sp)	// save regs
+        bis	r31, osf_a0_bugchk, r16	// set a0
+
+        stq	r17, osfsf_a1(sp)	// a1
+        br	r31, bpt_bchk_common	// out of call_pal space
+
+
+        CALL_PAL_UNPRIV(0x0082)
+CallPal_OpcDec82:
+        br	r31, osfpal_calpal_opcdec
+
+//
+// callsys - PALcode for callsys instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+// 	Switch mode to kernel and build a callsys stack frame.
+//       sp = ksp
+//       gp = kgp
+//	t8 - t10 (r22-r24) trashed
+//
+//
+//
+        CALL_PAL_UNPRIV(PAL_CALLSYS_ENTRY)
+Call_Pal_Callsys:
+
+        and	r11, osfps_m_mode, r24	// get mode
+        mfpr	r22, pt_ksp		// get ksp
+
+        beq	r24, sys_from_kern 	// sysCall from kern is not allowed
+        mfpr	r12, pt_entsys		// get address of callSys routine
+
+//
+// from here on we know we are in user going to Kern
+//
+        mtpr	r31, ev5__dtb_cm	// set Mbox current mode - no virt ref for 2 cycles
+        mtpr	r31, ev5__ps		// set Ibox current mode - 2 bubble to hw_rei
+
+        bis	r31, r31, r11		// PS=0 (mode=kern)
+        mfpr	r23, exc_addr		// get pc
+
+        mtpr	r30, pt_usp		// save usp
+        lda	sp, 0-osfsf_c_size(r22)// set new sp
+
+        stq	r29, osfsf_gp(sp)	// save user gp/r29
+        stq	r24, osfsf_ps(sp)	// save ps
+
+        stq	r23, osfsf_pc(sp)	// save pc
+        mtpr	r12, exc_addr		// set address
+                                        // 1 cycle to hw_rei
+
+        mfpr	r29, pt_kgp		// get the kern gp/r29
+
+        hw_rei_spe			// and off we go!
+
+
+        CALL_PAL_UNPRIV(0x0084)
+CallPal_OpcDec84:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x0085)
+CallPal_OpcDec85:
+        br	r31, osfpal_calpal_opcdec
+
+//
+// imb - PALcode for imb instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//       Flush the writebuffer and flush the Icache
+//
+//
+//
+        CALL_PAL_UNPRIV(PAL_IMB_ENTRY)
+Call_Pal_Imb:
+        mb                              // Clear the writebuffer
+        mfpr    r31, ev5__mcsr          // Sync with clear
+        nop
+        nop
+        br      r31, pal_ic_flush           // Flush Icache
+
+
+// CALL_PAL OPCDECs
+
+        CALL_PAL_UNPRIV(0x0087)
+CallPal_OpcDec87:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x0088)
+CallPal_OpcDec88:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x0089)
+CallPal_OpcDec89:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x008A)
+CallPal_OpcDec8A:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x008B)
+CallPal_OpcDec8B:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x008C)
+CallPal_OpcDec8C:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x008D)
+CallPal_OpcDec8D:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x008E)
+CallPal_OpcDec8E:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x008F)
+CallPal_OpcDec8F:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x0090)
+CallPal_OpcDec90:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x0091)
+CallPal_OpcDec91:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x0092)
+CallPal_OpcDec92:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x0093)
+CallPal_OpcDec93:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x0094)
+CallPal_OpcDec94:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x0095)
+CallPal_OpcDec95:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x0096)
+CallPal_OpcDec96:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x0097)
+CallPal_OpcDec97:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x0098)
+CallPal_OpcDec98:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x0099)
+CallPal_OpcDec99:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x009A)
+CallPal_OpcDec9A:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x009B)
+CallPal_OpcDec9B:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x009C)
+CallPal_OpcDec9C:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x009D)
+CallPal_OpcDec9D:
+        br	r31, osfpal_calpal_opcdec
+
+//
+// rdunique - PALcode for rdunique instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	v0 (r0) <- unique
+//
+//
+//
+        CALL_PAL_UNPRIV(PAL_RDUNIQUE_ENTRY)
+CALL_PALrdunique_:
+        mfpr	r0, pt_pcbb		// get pcb pointer
+        ldq_p	r0, osfpcb_q_unique(r0) // get unique value
+
+        hw_rei
+
+//
+// wrunique - PALcode for wrunique instruction
+//
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	unique <- a0 (r16)
+//
+//
+//
+        CALL_PAL_UNPRIV(PAL_WRUNIQUE_ENTRY)
+CALL_PAL_Wrunique:
+        nop
+        mfpr	r12, pt_pcbb		// get pcb pointer
+        stq_p	r16, osfpcb_q_unique(r12)// store new value
+        nop				// Pad palshadow write
+        hw_rei				// back
+
+// CALL_PAL OPCDECs
+
+        CALL_PAL_UNPRIV(0x00A0)
+CallPal_OpcDecA0:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00A1)
+CallPal_OpcDecA1:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00A2)
+CallPal_OpcDecA2:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00A3)
+CallPal_OpcDecA3:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00A4)
+CallPal_OpcDecA4:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00A5)
+CallPal_OpcDecA5:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00A6)
+CallPal_OpcDecA6:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00A7)
+CallPal_OpcDecA7:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00A8)
+CallPal_OpcDecA8:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00A9)
+CallPal_OpcDecA9:
+        br	r31, osfpal_calpal_opcdec
+
+
+//
+// gentrap - PALcode for gentrap instruction
+//
+// CALL_PAL_gentrap:
+// Entry:
+//	Vectored into via hardware PALcode instruction dispatch.
+//
+// Function:
+//	Build stack frame
+//	a0 <- code
+//	a1 <- unpred
+//	a2 <- unpred
+//	vector via entIF
+//
+//
+
+        CALL_PAL_UNPRIV(0x00AA)
+// unsupported in Hudson code .. pboyle Nov/95
+CALL_PAL_gentrap:
+        sll	r11, 63-osfps_v_mode, r25 // Shift mode up to MS bit
+        mtpr	r31, ev5__ps		// Set Ibox current mode to kernel
+
+        bis	r11, r31, r12			// Save PS for stack write
+        bge	r25, CALL_PAL_gentrap_10_	// no stack swap needed if cm=kern
+
+        mtpr	r31, ev5__dtb_cm	// Set Mbox current mode to kernel -
+                                        //     no virt ref for next 2 cycles
+        mtpr	r30, pt_usp		// save user stack
+
+        bis	r31, r31, r11		// Set new PS
+        mfpr	r30, pt_ksp
+
+CALL_PAL_gentrap_10_:
+        lda	sp, 0-osfsf_c_size(sp)// allocate stack space
+        mfpr	r14, exc_addr		// get pc
+
+        stq	r16, osfsf_a0(sp)	// save regs
+        bis	r31, osf_a0_gentrap, r16// set a0
+
+        stq	r17, osfsf_a1(sp)	// a1
+        br	r31, bpt_bchk_common	// out of call_pal space
+
+
+// CALL_PAL OPCDECs
+
+        CALL_PAL_UNPRIV(0x00AB)
+CallPal_OpcDecAB:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00AC)
+CallPal_OpcDecAC:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00AD)
+CallPal_OpcDecAD:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00AE)
+CallPal_OpcDecAE:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00AF)
+CallPal_OpcDecAF:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00B0)
+CallPal_OpcDecB0:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00B1)
+CallPal_OpcDecB1:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00B2)
+CallPal_OpcDecB2:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00B3)
+CallPal_OpcDecB3:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00B4)
+CallPal_OpcDecB4:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00B5)
+CallPal_OpcDecB5:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00B6)
+CallPal_OpcDecB6:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00B7)
+CallPal_OpcDecB7:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00B8)
+CallPal_OpcDecB8:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00B9)
+CallPal_OpcDecB9:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00BA)
+CallPal_OpcDecBA:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00BB)
+CallPal_OpcDecBB:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00BC)
+CallPal_OpcDecBC:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00BD)
+CallPal_OpcDecBD:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00BE)
+CallPal_OpcDecBE:
+        br	r31, osfpal_calpal_opcdec
+
+        CALL_PAL_UNPRIV(0x00BF)
+CallPal_OpcDecBF:
+        // MODIFIED BY EGH 2/25/04
+        br	r31, copypal_impl
+
+
+/*======================================================================*/
+/*                   OSF/1 CALL_PAL CONTINUATION AREA                   */
+/*======================================================================*/
+
+        .text	2
+
+        . = 0x4000
+
+
+// Continuation of MTPR_PERFMON
+        ALIGN_BLOCK
+          // "real" performance monitoring code
+// mux ctl
+perfmon_muxctl:
+        lda     r8, 1(r31) 			// get a 1
+        sll     r8, pmctr_v_sel0, r8		// move to sel0 position
+        or      r8, ((0xf<<pmctr_v_sel1) | (0xf<<pmctr_v_sel2)), r8	// build mux select mask
+        and	r17, r8, r25			// isolate pmctr mux select bits
+        mfpr	r0, ev5__pmctr
+        bic	r0, r8, r0			// clear old mux select bits
+        or	r0,r25, r25			// or in new mux select bits
+        mtpr	r25, ev5__pmctr
+
+        // ok, now tackle cbox mux selects
+        ldah    r14, 0xfff0(r31)
+        zap     r14, 0xE0, r14                 // Get Cbox IPR base
+//orig	get_bc_ctl_shadow	r16		// bc_ctl returned in lower longword
+// adapted from ev5_pal_macros.mar
+        mfpr	r16, pt_impure
+        lda	r16, CNS_Q_IPR(r16)
+        RESTORE_SHADOW(r16,CNS_Q_BC_CTL,r16);
+
+        lda	r8, 0x3F(r31)			// build mux select mask
+        sll	r8, bc_ctl_v_pm_mux_sel, r8
+
+        and	r17, r8, r25			// isolate bc_ctl mux select bits
+        bic	r16, r8, r16			// clear old mux select bits
+        or	r16, r25, r25			// create new bc_ctl
+        mb					// clear out cbox for future ipr write
+        stq_p	r25, ev5__bc_ctl(r14)		// store to cbox ipr
+        mb					// clear out cbox for future ipr write
+
+//orig	update_bc_ctl_shadow	r25, r16	// r25=value, r16-overwritten with adjusted impure ptr
+// adapted from ev5_pal_macros.mar
+        mfpr	r16, pt_impure
+        lda	r16, CNS_Q_IPR(r16)
+        SAVE_SHADOW(r25,CNS_Q_BC_CTL,r16);
+
+        br 	r31, perfmon_success
+
+
+// requested to disable perf monitoring
+perfmon_dis:
+        mfpr	r14, ev5__pmctr		// read ibox pmctr ipr
+perfmon_dis_ctr0:			// and begin with ctr0
+        blbc	r17, perfmon_dis_ctr1	// do not disable ctr0
+        lda 	r8, 3(r31)
+        sll	r8, pmctr_v_ctl0, r8
+        bic	r14, r8, r14		// disable ctr0
+perfmon_dis_ctr1:
+        srl	r17, 1, r17
+        blbc	r17, perfmon_dis_ctr2	// do not disable ctr1
+        lda 	r8, 3(r31)
+        sll	r8, pmctr_v_ctl1, r8
+        bic	r14, r8, r14		// disable ctr1
+perfmon_dis_ctr2:
+        srl	r17, 1, r17
+        blbc	r17, perfmon_dis_update	// do not disable ctr2
+        lda 	r8, 3(r31)
+        sll	r8, pmctr_v_ctl2, r8
+        bic	r14, r8, r14		// disable ctr2
+perfmon_dis_update:
+        mtpr	r14, ev5__pmctr		// update pmctr ipr
+//;the following code is not needed for ev5 pass2 and later, but doesn't hurt anything to leave in
+// adapted from ev5_pal_macros.mar
+//orig	get_pmctr_ctl	r8, r25		// pmctr_ctl bit in r8.  adjusted impure pointer in r25
+        mfpr	r25, pt_impure
+        lda	r25, CNS_Q_IPR(r25)
+        RESTORE_SHADOW(r8,CNS_Q_PM_CTL,r25);
+
+        lda	r17, 0x3F(r31)		// build mask
+        sll	r17, pmctr_v_ctl2, r17 // shift mask to correct position
+        and 	r14, r17, r14		// isolate ctl bits
+        bic	r8, r17, r8		// clear out old ctl bits
+        or	r14, r8, r14		// create shadow ctl bits
+//orig	store_reg1 pmctr_ctl, r14, r25, ipr=1	// update pmctr_ctl register
+//adjusted impure pointer still in r25
+        SAVE_SHADOW(r14,CNS_Q_PM_CTL,r25);
+
+        br 	r31, perfmon_success
+
+
+// requested to enable perf monitoring
+//;the following code can be greatly simplified for pass2, but should work fine as is.
+
+
+perfmon_enclr:
+        lda	r9, 1(r31)		// set enclr flag
+        br perfmon_en_cont
+
+perfmon_en:
+        bis	r31, r31, r9		// clear enclr flag
+
+perfmon_en_cont:
+        mfpr	r8, pt_pcbb		// get PCB base
+//orig	get_pmctr_ctl r25, r25
+        mfpr	r25, pt_impure
+        lda	r25, CNS_Q_IPR(r25)
+        RESTORE_SHADOW(r25,CNS_Q_PM_CTL,r25);
+
+        ldq_p	r16, osfpcb_q_fen(r8)	// read DAT/PME/FEN quadword
+        mfpr	r14, ev5__pmctr		// read ibox pmctr ipr
+        srl 	r16, osfpcb_v_pme, r16	// get pme bit
+        mfpr	r13, icsr
+        and	r16,  1, r16		// isolate pme bit
+
+        // this code only needed in pass2 and later
+        lda	r12, 1<<icsr_v_pmp(r31)		// build icsr<pmp> bit mask
+        bic	r13, r12, r13		// clear pmp bit
+        sll	r16, icsr_v_pmp, r12	// move pme bit to icsr<pmp> position
+        or	r12, r13, r13		// new icsr with icsr<pmp> bit set/clear
+        mtpr	r13, icsr		// update icsr
+
+        bis	r31, 1, r16		// set r16<0> on pass2 to update pmctr always (icsr provides real enable)
+
+        sll	r25, 6, r25		// shift frequency bits into pmctr_v_ctl positions
+        bis	r14, r31, r13		// copy pmctr
+
+perfmon_en_ctr0:			// and begin with ctr0
+        blbc	r17, perfmon_en_ctr1	// do not enable ctr0
+
+        blbc	r9, perfmon_en_noclr0	// if enclr flag set, clear ctr0 field
+        lda	r8, 0xffff(r31)
+        zapnot  r8, 3, r8		// ctr0<15:0> mask
+        sll	r8, pmctr_v_ctr0, r8
+        bic	r14, r8, r14		// clear ctr bits
+        bic	r13, r8, r13		// clear ctr bits
+
+perfmon_en_noclr0:
+//orig	get_addr r8, 3<<pmctr_v_ctl0, r31
+        LDLI(r8, (3<<pmctr_v_ctl0))
+        and 	r25, r8, r12		//isolate frequency select bits for ctr0
+        bic	r14, r8, r14		// clear ctl0 bits in preparation for enabling
+        or	r14,r12,r14		// or in new ctl0 bits
+
+perfmon_en_ctr1:			// enable ctr1
+        srl	r17, 1, r17		// get ctr1 enable
+        blbc	r17, perfmon_en_ctr2	// do not enable ctr1
+
+        blbc	r9, perfmon_en_noclr1   // if enclr flag set, clear ctr1 field
+        lda	r8, 0xffff(r31)
+        zapnot  r8, 3, r8		// ctr1<15:0> mask
+        sll	r8, pmctr_v_ctr1, r8
+        bic	r14, r8, r14		// clear ctr bits
+        bic	r13, r8, r13		// clear ctr bits
+
+perfmon_en_noclr1:
+//orig	get_addr r8, 3<<pmctr_v_ctl1, r31
+        LDLI(r8, (3<<pmctr_v_ctl1))
+        and 	r25, r8, r12		//isolate frequency select bits for ctr1
+        bic	r14, r8, r14		// clear ctl1 bits in preparation for enabling
+        or	r14,r12,r14		// or in new ctl1 bits
+
+perfmon_en_ctr2:			// enable ctr2
+        srl	r17, 1, r17		// get ctr2 enable
+        blbc	r17, perfmon_en_return	// do not enable ctr2 - return
+
+        blbc	r9, perfmon_en_noclr2	// if enclr flag set, clear ctr2 field
+        lda	r8, 0x3FFF(r31)		// ctr2<13:0> mask
+        sll	r8, pmctr_v_ctr2, r8
+        bic	r14, r8, r14		// clear ctr bits
+        bic	r13, r8, r13		// clear ctr bits
+
+perfmon_en_noclr2:
+//orig	get_addr r8, 3<<pmctr_v_ctl2, r31
+        LDLI(r8, (3<<pmctr_v_ctl2))
+        and 	r25, r8, r12		//isolate frequency select bits for ctr2
+        bic	r14, r8, r14		// clear ctl2 bits in preparation for enabling
+        or	r14,r12,r14		// or in new ctl2 bits
+
+perfmon_en_return:
+        cmovlbs	r16, r14, r13		// if pme enabled, move enables into pmctr
+                                        // else only do the counter clears
+        mtpr	r13, ev5__pmctr		// update pmctr ipr
+
+//;this code not needed for pass2 and later, but does not hurt to leave it in
+        lda	r8, 0x3F(r31)
+//orig	get_pmctr_ctl r25, r12         	// read pmctr ctl; r12=adjusted impure pointer
+        mfpr	r12, pt_impure
+        lda	r12, CNS_Q_IPR(r12)
+        RESTORE_SHADOW(r25,CNS_Q_PM_CTL,r12);
+
+        sll	r8, pmctr_v_ctl2, r8	// build ctl mask
+        and	r8, r14, r14		// isolate new ctl bits
+        bic	r25, r8, r25		// clear out old ctl value
+        or	r25, r14, r14		// create new pmctr_ctl
+//orig	store_reg1 pmctr_ctl, r14, r12, ipr=1
+        SAVE_SHADOW(r14,CNS_Q_PM_CTL,r12); // r12 still has the adjusted impure ptr
+
+        br 	r31, perfmon_success
+
+
+// options...
+perfmon_ctl:
+
+// set mode
+//orig	get_pmctr_ctl r14, r12         	// read shadow pmctr ctl; r12=adjusted impure pointer
+        mfpr	r12, pt_impure
+        lda	r12, CNS_Q_IPR(r12)
+        RESTORE_SHADOW(r14,CNS_Q_PM_CTL,r12);
+
+        // build mode mask for pmctr register
+        LDLI(r8, ((1<<pmctr_v_killu) | (1<<pmctr_v_killp) | (1<<pmctr_v_killk)))
+        mfpr	r0, ev5__pmctr
+        and	r17, r8, r25			// isolate pmctr mode bits
+        bic	r0, r8, r0			// clear old mode bits
+        or	r0, r25, r25			// or in new mode bits
+        mtpr	r25, ev5__pmctr
+
+        // the following code will only be used in pass2, but should
+        // not hurt anything if run in pass1.
+        mfpr	r8, icsr
+        lda	r25, 1<<icsr_v_pma(r31)		// set icsr<pma> if r17<0>=0
+        bic 	r8, r25, r8			// clear old pma bit
+        cmovlbs r17, r31, r25			// and clear icsr<pma> if r17<0>=1
+        or	r8, r25, r8
+        mtpr	r8, icsr		// 4 bubbles to hw_rei
+        mfpr	r31, pt0			// pad icsr write
+        mfpr	r31, pt0			// pad icsr write
+
+        // the following code not needed for pass2 and later, but
+        // should work anyway.
+        bis     r14, 1, r14       		// set for select processes
+        blbs	r17, perfmon_sp			// branch if select processes
+        bic	r14, 1, r14			// all processes
+perfmon_sp:
+//orig	store_reg1 pmctr_ctl, r14, r12, ipr=1   // update pmctr_ctl register
+        SAVE_SHADOW(r14,CNS_Q_PM_CTL,r12); // r12 still has the adjusted impure ptr
+        br 	r31, perfmon_success
+
+// counter frequency select
+perfmon_freq:
+//orig	get_pmctr_ctl r14, r12         	// read shadow pmctr ctl; r12=adjusted impure pointer
+        mfpr	r12, pt_impure
+        lda	r12, CNS_Q_IPR(r12)
+        RESTORE_SHADOW(r14,CNS_Q_PM_CTL,r12);
+
+        lda	r8, 0x3F(r31)
+//orig	sll	r8, pmctr_ctl_v_frq2, r8		// build mask for frequency select field
+// I guess this should be a shift of 4 bits from the above control register structure
+#define	pmctr_ctl_v_frq2_SHIFT 4
+        sll	r8, pmctr_ctl_v_frq2_SHIFT, r8		// build mask for frequency select field
+
+        and 	r8, r17, r17
+        bic 	r14, r8, r14				// clear out old frequency select bits
+
+        or 	r17, r14, r14				// or in new frequency select info
+//orig	store_reg1 pmctr_ctl, r14, r12, ipr=1   // update pmctr_ctl register
+        SAVE_SHADOW(r14,CNS_Q_PM_CTL,r12); // r12 still has the adjusted impure ptr
+
+        br 	r31, perfmon_success
+
+// read counters
+perfmon_rd:
+        mfpr	r0, ev5__pmctr
+        or	r0, 1, r0	// or in return status
+        hw_rei			// back to user
+
+// write counters
+perfmon_wr:
+        mfpr	r14, ev5__pmctr
+        lda	r8, 0x3FFF(r31)		// ctr2<13:0> mask
+        sll	r8, pmctr_v_ctr2, r8
+
+        LDLI(r9, (0xFFFFFFFF))		// ctr0<15:0>,ctr1<15:0> mask
+        sll	r9, pmctr_v_ctr1, r9
+        or	r8, r9, r8		// or ctr2, ctr1, ctr0 mask
+        bic	r14, r8, r14		// clear ctr fields
+        and	r17, r8, r25		// clear all but ctr  fields
+        or	r25, r14, r14		// write ctr fields
+        mtpr	r14, ev5__pmctr		// update pmctr ipr
+
+        mfpr	r31, pt0		// pad pmctr write (needed only to keep PVC happy)
+
+perfmon_success:
+        or      r31, 1, r0                     // set success
+        hw_rei					// back to user
+
+perfmon_unknown:
+        or	r31, r31, r0		// set fail
+        hw_rei				// back to user
+
+
+//////////////////////////////////////////////////////////
+// Copy code
+//////////////////////////////////////////////////////////
+
+copypal_impl:
+        mov r16, r0
+#ifdef CACHE_COPY
+#ifndef CACHE_COPY_UNALIGNED
+        and r16, 63, r8
+        and r17, 63, r9
+        bis r8, r9, r8
+        bne r8, cache_copy_done
+#endif
+        bic r18, 63, r8
+        and r18, 63, r18
+        beq r8, cache_copy_done
+cache_loop:
+        ldf f17, 0(r16)
+        stf f17, 0(r16)
+        addq r17, 64, r17
+        addq r16, 64, r16
+        subq r8, 64, r8
+        bne r8, cache_loop
+cache_copy_done:
+#endif
+        ble r18, finished	// if len <=0 we are finished
+        ldq_u r8, 0(r17)
+        xor r17, r16, r9
+        and r9, 7, r9
+        and r16, 7, r10
+        bne r9, unaligned
+        beq r10, aligned
+        ldq_u r9, 0(r16)
+        addq r18, r10, r18
+        mskqh r8, r17, r8
+        mskql r9, r17, r9
+        bis r8, r9, r8
+aligned:
+        subq r18, 1, r10
+        bic r10, 7, r10
+        and r18, 7, r18
+        beq r10, aligned_done
+loop:
+        stq_u r8, 0(r16)
+        ldq_u r8, 8(r17)
+        subq r10, 8, r10
+        lda r16,8(r16)
+        lda r17,8(r17)
+        bne r10, loop
+aligned_done:
+        bne r18, few_left
+        stq_u r8, 0(r16)
+        br r31, finished
+few_left:
+        mskql r8, r18, r10
+        ldq_u r9, 0(r16)
+        mskqh r9, r18, r9
+        bis r10, r9, r10
+        stq_u r10, 0(r16)
+        br r31, finished
+unaligned:
+        addq r17, r18, r25
+        cmpule r18, 8, r9
+        bne r9, unaligned_few_left
+        beq r10, unaligned_dest_aligned
+        and r16, 7, r10
+        subq r31, r10, r10
+        addq r10, 8, r10
+        ldq_u r9, 7(r17)
+        extql r8, r17, r8
+        extqh r9, r17, r9
+        bis r8, r9, r12
+        insql r12, r16, r12
+        ldq_u r13, 0(r16)
+        mskql r13, r16, r13
+        bis r12, r13, r12
+        stq_u r12, 0(r16)
+        addq r16, r10, r16
+        addq r17, r10, r17
+        subq r18, r10, r18
+        ldq_u r8, 0(r17)
+unaligned_dest_aligned:
+        subq r18, 1, r10
+        bic r10, 7, r10
+        and r18, 7, r18
+        beq r10, unaligned_partial_left
+unaligned_loop:
+        ldq_u r9, 7(r17)
+        lda r17, 8(r17)
+        extql r8, r17, r12
+        extqh r9, r17, r13
+        subq r10, 8, r10
+        bis r12, r13, r13
+        stq r13, 0(r16)
+        lda r16, 8(r16)
+        beq r10, unaligned_second_partial_left
+        ldq_u r8, 7(r17)
+        lda r17, 8(r17)
+        extql r9, r17, r12
+        extqh r8, r17, r13
+        bis r12, r13, r13
+        subq r10, 8, r10
+        stq r13, 0(r16)
+        lda r16, 8(r16)
+        bne r10, unaligned_loop
+unaligned_partial_left:
+        mov r8, r9
+unaligned_second_partial_left:
+        ldq_u r8, -1(r25)
+        extql r9, r17, r9
+        extqh r8, r17, r8
+        bis r8, r9, r8
+        bne r18, few_left
+        stq_u r8, 0(r16)
+        br r31, finished
+unaligned_few_left:
+        ldq_u r9, -1(r25)
+        extql r8, r17, r8
+        extqh r9, r17, r9
+        bis r8, r9, r8
+        insqh r8, r16, r9
+        insql r8, r16, r8
+        lda r12, -1(r31)
+        mskql r12, r18, r13
+        cmovne r13, r13, r12
+        insqh r12, r16, r13
+        insql r12, r16, r12
+        addq r16, r18, r10
+        ldq_u r14, 0(r16)
+        ldq_u r25, -1(r10)
+        bic r14, r12, r14
+        bic r25, r13, r25
+        and r8, r12, r8
+        and r9, r13, r9
+        bis r8, r14, r8
+        bis r9, r25, r9
+        stq_u r9, -1(r10)
+        stq_u r8, 0(r16)
+finished:
+        hw_rei
diff --git a/system/alpha/palcode/platform.S b/system/alpha/palcode/platform.S
new file mode 100644
index 0000000000..da3f466c14
--- /dev/null
+++ b/system/alpha/palcode/platform.S
@@ -0,0 +1,2337 @@
+/*
+ * Copyright (c) 2003-2005 The Regents of The University of Michigan
+ * Copyright (c) 1993 Hewlett-Packard Development Company
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: Ali G. Saidi
+ *          Nathan L. Binkert
+ */
+
+#define max_cpuid 1
+#define hw_rei_spe hw_rei
+
+#include "ev5_defs.h"
+#include "ev5_impure.h"
+#include "ev5_alpha_defs.h"
+#include "ev5_paldef.h"
+#include "ev5_osfalpha_defs.h"
+#include "fromHudsonMacros.h"
+#include "fromHudsonOsf.h"
+#include "dc21164FromGasSources.h"
+#include "cserve.h"
+#include "tlaser.h"
+
+#define pt_entInt pt_entint
+#define pt_entArith pt_entarith
+#define mchk_size ((mchk_cpu_base + 7 + 8) & 0xfff8)
+#define mchk_flag CNS_Q_FLAG
+#define mchk_sys_base 56
+#define mchk_cpu_base (CNS_Q_LD_LOCK + 8)
+#define mchk_offsets CNS_Q_EXC_ADDR
+#define mchk_mchk_code 8
+#define mchk_ic_perr_stat CNS_Q_ICPERR_STAT
+#define mchk_dc_perr_stat CNS_Q_DCPERR_STAT
+#define mchk_sc_addr CNS_Q_SC_ADDR
+#define mchk_sc_stat CNS_Q_SC_STAT
+#define mchk_ei_addr CNS_Q_EI_ADDR
+#define mchk_bc_tag_addr CNS_Q_BC_TAG_ADDR
+#define mchk_fill_syn CNS_Q_FILL_SYN
+#define mchk_ei_stat CNS_Q_EI_STAT
+#define mchk_exc_addr CNS_Q_EXC_ADDR
+#define mchk_ld_lock CNS_Q_LD_LOCK
+#define osfpcb_q_Ksp pcb_q_ksp
+#define pal_impure_common_size ((0x200 + 7) & 0xfff8)
+
+#if defined(BIG_TSUNAMI)
+#define MAXPROC         0x3f
+#define IPIQ_addr       0x800
+#define IPIQ_shift      0
+#define IPIR_addr       0x840
+#define IPIR_shift      0
+#define RTC_addr        0x880
+#define RTC_shift       0
+#define DIR_addr        0xa2
+#elif defined(TSUNAMI)
+#define MAXPROC         0x3
+#define IPIQ_addr       0x080
+#define IPIQ_shift      12
+#define IPIR_addr       0x080
+#define IPIR_shift      8
+#define RTC_addr        0x080
+#define RTC_shift       4
+#define DIR_addr        0xa0
+#elif defined(TLASER)
+#define MAXPROC         0xf
+#else
+#error Must define BIG_TSUNAMI, TSUNAMI, or TLASER
+#endif
+
+#define ALIGN_BLOCK \
+        .align 5
+
+#define ALIGN_BRANCH \
+        .align 3
+
+#define EXPORT(_x)	\
+        .align 5;	\
+        .globl _x;	\
+_x:
+
+// XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
+// XXX the following is 'made up'
+// XXX bugnion
+
+// XXX bugnion not sure how to align 'quad'
+#define ALIGN_QUAD \
+        .align  3
+
+#define ALIGN_128 \
+        .align  7
+
+
+#define GET_IMPURE(_r) mfpr _r,pt_impure
+#define GET_ADDR(_r1,_off,_r2)  lda _r1,_off(_r2)
+
+
+#define BIT(_x) (1<<(_x))
+
+
+// System specific code - beh model version
+//
+//
+// Entry points
+//	SYS_CFLUSH - Cache flush
+//	SYS_CSERVE - Console service
+//	SYS_WRIPIR - interprocessor interrupts
+//	SYS_HALT_INTERRUPT - Halt interrupt
+//	SYS_PASSIVE_RELEASE - Interrupt, passive release
+//	SYS_INTERRUPT - Interrupt
+//	SYS_RESET - Reset
+//	SYS_ENTER_CONSOLE
+//
+//
+// Macro to read TLINTRSUMx
+//
+// Based on the CPU_NUMBER, read either the TLINTRSUM0 or TLINTRSUM1 register
+//
+// Assumed register usage:
+//   rsum TLINTRSUMx contents
+//   raddr node space address
+//   scratch scratch register
+//
+#define Read_TLINTRSUMx(_rsum, _raddr, _scratch)	                  \
+    nop;                                                                  \
+    mfpr  _scratch, pt_whami;      /* Get our whami (VID) */              \
+    extbl _scratch, 1, _scratch;   /* shift down to bit 0 */              \
+    lda	  _raddr, 0xff88(zero);    /* Get base node space address bits */ \
+    sll	  _raddr, 24, _raddr;      /* Shift up to proper position */      \
+    srl	  _scratch, 1, _rsum;      /* Shift off the cpu number */         \
+    sll   _rsum, 22, _rsum;        /* Get our node offset */		  \
+    addq  _raddr, _rsum, _raddr;   /* Get our base node space address */  \
+    blbs  _scratch, 1f;                                                   \
+    lda	  _raddr, 0x1180(_raddr);                                         \
+    br	  r31, 2f;                                                        \
+1:  lda	  _raddr, 0x11c0(_raddr);                                         \
+2:  ldl_p _rsum, 0(_raddr)         /* read the right tlintrsum reg */
+
+//
+// Macro to write TLINTRSUMx
+//
+//  Based on the CPU_NUMBER, write either the TLINTRSUM0 or TLINTRSUM1 register
+//
+// Assumed register usage:
+//   rsum TLINTRSUMx write data
+//   raddr node space address
+//   whami scratch register, overwritten with the whami (VID) value
+//
+#define Write_TLINTRSUMx(_rsum,_raddr,_whami)                              \
+    nop;                                                                   \
+    mfpr  _whami, pt_whami;       /* Get our whami (VID) */                \
+    extbl _whami, 1, _whami;      /* shift down to bit 0 */                \
+    lda   _raddr, 0xff88(zero);   /* Get base node space address bits */   \
+    sll   _raddr, 24, _raddr;     /* Shift up to proper position */        \
+    blbs  _whami, 1f;                                                      \
+    lda   _raddr, 0x1180(_raddr);                                          \
+    br    zero, 2f;                                                        \
+1:  lda   _raddr, 0x11c0(_raddr);                                          \
+2:  srl	  _whami, 1, _whami;      /* Get our node offset */                \
+    addq  _raddr, _whami, _raddr; /* Get our base node space address */    \
+    mb;                                                                    \
+    stq_p _rsum, 0(_raddr);       /* write the right tlintrsum reg */      \
+    ldq_p _rsum, 0(_raddr);       /* dummy read to tlintrsum */            \
+    bis   _rsum, _rsum, _rsum     /* needed to complete the ldqp above */
+
+
+//
+// Macro to determine highest priority TIOP Node ID from interrupt pending mask
+//
+// Assumed register usage:
+//  rmask - TLINTRSUMx contents, shifted to isolate IOx bits
+//  rid - TLSB Node ID of highest TIOP
+//
+#define Intr_Find_TIOP(_rmask,_rid)              \
+    srl  _rmask,4,_rid;    /* check IOP8 */      \
+    blbc _rid,1f;          /* not IOP8 */        \
+    lda  _rid,8(zero);     /* IOP8 */            \
+    br   zero,6f;                                \
+1:  srl  _rmask,3,_rid;    /* check IOP7 */      \
+    blbc _rid, 2f;         /* not IOP7 */        \
+    lda  _rid, 7(r31);     /* IOP7 */            \
+    br   r31, 6f;                                \
+2:  srl  _rmask, 2, _rid;  /* check IOP6 */      \
+    blbc _rid, 3f;         /* not IOP6 */        \
+    lda  _rid, 6(r31);     /* IOP6 */            \
+    br   r31, 6f;                                \
+3:  srl  _rmask, 1, _rid;  /* check IOP5 */      \
+    blbc _rid, 4f;         /* not IOP5 */        \
+    lda  _rid, 5(r31);     /* IOP5 */            \
+    br   r31, 6f;                                \
+4:  srl  _rmask, 0, _rid;  /* check IOP4 */      \
+    blbc _rid, 5f;         /* not IOP4 */        \
+    lda  _rid, 4(r31);     /* IOP4 */            \
+    br   r31, 6f;                                \
+5:  lda  _rid, 0(r31);     /* passive release */ \
+6:
+
+//
+// Macro to calculate base node space address for given node id
+//
+// Assumed register usage:
+//  rid - TLSB node id
+//  raddr - base node space address
+#define Get_TLSB_Node_Address(_rid,_raddr)  \
+    sll  _rid, 22, _rid;                    \
+    lda  _raddr, 0xff88(zero);              \
+    sll  _raddr, 24, _raddr;                \
+    addq _raddr, _rid, _raddr
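+//
+// Worked example (illustrative): for _rid = 4 the macro computes
+//   _rid   = 4 << 22                     = 0x0000000001000000
+//   _raddr = sext(0xff88) << 24          = 0xffffffff88000000
+//   _raddr = _raddr + _rid               = 0xffffffff89000000
+// whose low 40 bits are 0xff89000000.  This assumes lda sign-extends its
+// 16-bit displacement, as the Alpha architecture specifies.  Note that
+// the macro shifts _rid in place, so the node id is destroyed.
+//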
+
+
+#define OSFmchk_TLEPstore_1(_rlog,_rs,_rs1,_nodebase,_tlepreg)     \
+    lda   _rs1, tlep_##_tlepreg(zero);                             \
+    or    _rs1, _nodebase, _rs1;                                   \
+    ldl_p _rs, 0(_rs1);                                            \
+    stl_p _rs, mchk_##_tlepreg(_rlog)   /* store in frame */
+
+#define OSFmchk_TLEPstore(_tlepreg) \
+    OSFmchk_TLEPstore_1(r14,r8,r4,r13,_tlepreg)
+
+#define OSFcrd_TLEPstore_1(_rlog,_rs,_rs1,_nodebase,_tlepreg) 		\
+        lda	_rs1, tlep_##_tlepreg(zero);				\
+        or	_rs1, _nodebase, _rs1;  				\
+        ldl_p	_rs, 0(_rs1);						\
+        stl_p	_rs, mchk_crd_##_tlepreg(_rlog)
+
+#define OSFcrd_TLEPstore_tlsb_1(_rlog,_rs,_rs1,_nodebase,_tlepreg) 	\
+        lda	_rs1, tlsb_##_tlepreg(zero);				\
+        or	_rs1, _nodebase, _rs1;  				\
+        ldl_p	_rs, 0(_rs1);						\
+        stl_p	_rs,mchk_crd_##_tlepreg(_rlog)
+
+#define OSFcrd_TLEPstore_tlsb_clr_1(_rlog,_rs,_rs1,_nodebase,_tlepreg) 	\
+        lda	_rs1,tlsb_##_tlepreg(zero);				\
+        or	_rs1, _nodebase,_rs1;  					\
+        ldl_p	_rs, 0(_rs1);						\
+        stl_p	_rs, mchk_crd_##_tlepreg(_rlog);			\
+        stl_p   _rs, 0(_rs1)
+
+#define OSFcrd_TLEPstore(_tlepreg) \
+    OSFcrd_TLEPstore_1(r14,r8,r4,r13,_tlepreg)
+#define OSFcrd_TLEPstore_tlsb(_tlepreg) \
+    OSFcrd_TLEPstore_tlsb_1(r14,r8,r4,r13,_tlepreg)
+#define OSFcrd_TLEPstore_tlsb_clr(_tlepreg) \
+    OSFcrd_TLEPstore_tlsb_clr_1(r14,r8,r4,r13,_tlepreg)
+
+
+#define save_pcia_intr(_irq)                                            \
+    and   r13, 0xf, r25;            /* isolate low 4 bits */            \
+    addq  r14, 4, r14;              /* format the TIOP Node id field */ \
+    sll   r14, 4, r14;              /* shift the TIOP Node id */        \
+    or    r14, r25, r10;            /* merge Node id/hose/HPC */        \
+    mfpr  r14, pt14;                /* get saved value */               \
+    extbl r14, _irq, r25;           /* confirm none outstanding */      \
+    bne   r25, sys_machine_check_while_in_pal;                          \
+    insbl r10, _irq, r10;           /* align new info */                \
+    or    r14, r10, r14;            /* merge info */                    \
+    mtpr  r14, pt14;                /* save it */                       \
+    bic   r13, 0xf, r13             /* clear low 4 bits of vector */
+
+
+// wripir - PALcode for wripir instruction
+// R16 has the processor number.
+//
+        ALIGN_BLOCK
+EXPORT(sys_wripir)
+    //
+    // Convert the processor number to a CPU mask
+    //
+    and   r16, MAXPROC, r14	// mask the top stuff: MAXPROC+1 CPUs supported
+    bis   r31, 0x1, r16		// get a one
+    sll   r16, r14, r14		// shift the bit to the right place
+#if defined(TSUNAMI) || defined(BIG_TSUNAMI)
+    sll   r14,IPIQ_shift,r14
+#endif
+
+
+    //
+    // Build the Broadcast Space base address
+    //
+#if defined(TSUNAMI) || defined(BIG_TSUNAMI)
+    lda   r16,0xf01(r31)
+    sll   r16,32,r16
+    ldah  r13,0xa0(r31)
+    sll   r13,8,r13
+    bis   r16,r13,r16
+    lda   r16,IPIQ_addr(r16)
+#elif defined(TLASER)
+    lda   r13, 0xff8e(r31)	// Load the upper address bits
+    sll   r13, 24, r13		// shift them to the top
+#endif
+
+    //
+    // Send out the IP Intr
+    //
+#if defined(TSUNAMI) || defined(BIG_TSUNAMI)
+    stq_p r14, 0(r16)		// Tsunami MISC Register
+#elif defined(TLASER)
+    stq_p r14, 0x40(r13)	// Write to TLIPINTR reg
+#endif
+    wmb				// Push out the store
+    hw_rei
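+//
+// Worked example for the TSUNAMI path above (illustrative; IPIQ_addr and
+// IPIQ_shift are defined outside this section and are left symbolic):
+//   r14 = (1 << (cpu & MAXPROC)) << IPIQ_shift    e.g. cpu 2 -> 0x4 << IPIQ_shift
+//   r16 = 0xf01 << 32                             = 0xf0100000000
+//   r13 = (0xa0 << 16) << 8                       = 0x000a0000000
+//   r16 = (r16 | r13) + IPIQ_addr                 = 0xf01a0000000 + IPIQ_addr
+//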
+
+
+// cflush - PALcode for CFLUSH instruction
+//
+// SYS_CFLUSH
+// Entry:
+//	R16 - contains the PFN of the page to be flushed
+//
+// Function:
+//	Flush all Dstream caches of 1 entire page
+//
+//
+        ALIGN_BLOCK
+EXPORT(sys_cflush)
+
+//      #convert pfn to addr, and clean off <63:20>
+//      #sll	r16, <page_offset_size_bits>+<63-20>>, r12
+        sll	r16, page_offset_size_bits+(63-20),r12
+
+//      #ldah	r13,<<1@22>+32768>@-16(r31)// + xxx<31:16>
+//      # stolen from srcmax code. XXX bugnion
+        lda	r13, 0x10(r31)				   // assume 16Mbytes of cache
+        sll	r13, 20, r13				   // convert to bytes
+
+
+        srl	r12, 63-20, r12	// shift back to normal position
+        xor	r12, r13, r12		// xor addr<24> (r13 = 0x10 << 20)
+
+        or	r31, 8192/(32*8), r13	// get count of loads
+        nop
+
+cflush_loop:
+        subq	r13, 1, r13		// decr counter
+        mfpr    r25, ev5__intid         // Fetch level of interruptor
+
+        ldq_p	r31, 32*0(r12)		// do a load
+        ldq_p	r31, 32*1(r12)		// do next load
+
+        ldq_p	r31, 32*2(r12)		// do next load
+        ldq_p	r31, 32*3(r12)		// do next load
+
+        ldq_p	r31, 32*4(r12)		// do next load
+        ldq_p	r31, 32*5(r12)		// do next load
+
+        ldq_p	r31, 32*6(r12)		// do next load
+        ldq_p	r31, 32*7(r12)		// do next load
+
+        mfpr    r14, ev5__ipl           // Fetch current level
+        lda	r12, (32*8)(r12)	// skip to next cache block addr
+
+        cmple   r25, r14, r25           // R25 = 1 if intid .less than or eql ipl
+        beq	r25, 1f		// if any int's pending, re-queue CFLUSH -- need to check for hlt interrupt???
+
+        bne	r13, cflush_loop 	// loop till done
+        hw_rei				// back to user
+
+        ALIGN_BRANCH
+1:					// Here if interrupted
+        mfpr	r12, exc_addr
+        subq	r12, 4, r12		// Backup PC to point to CFLUSH
+
+        mtpr	r12, exc_addr
+        nop
+
+        mfpr	r31, pt0		// Pad exc_addr write
+        hw_rei
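+//
+// Loop arithmetic for cflush_loop (worked out from the constants above):
+//   iterations      = 8192 / (32 * 8) = 32
+//   bytes per pass  = 8 loads * 32-byte stride = 256
+//   bytes touched   = 32 * 256 = 8192
+// That is, the loop presumably walks one 8 KB page in 32-byte blocks,
+// reading from the address formed by XORing in bit 24 (0x10 << 20).
+//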
+
+
+        ALIGN_BLOCK
+//
+// sys_cserve - PALcode for CSERVE instruction
+//
+// Function:
+//	Various functions for private use of console software
+//
+//	option selector in r0
+//	arguments in r16....
+//
+//
+//	r0 = 0	unknown
+//
+//	r0 = 1	ldq_p
+//	r0 = 2	stq_p
+//		args, are as for normal STQ_P/LDQ_P in VMS PAL
+//
+//	r0 = 3	dump_tb's
+//	r16 = destination PA to dump tb's to.
+//
+//	r0<0> = 1, success
+//	r0<0> = 0, failure, or option not supported
+//	r0<63:1> = (generally 0, but may be function dependent)
+//	r0 - load data on ldq_p
+//
+//
+EXPORT(sys_cserve)
+
+        /* taken from scrmax */
+        cmpeq	r18, CSERVE_K_RD_IMPURE, r0
+        bne	r0, Sys_Cserve_Rd_Impure
+
+        cmpeq	r18, CSERVE_K_JTOPAL, r0
+        bne	r0, Sys_Cserve_Jtopal
+        call_pal        0
+
+        or	r31, r31, r0
+        hw_rei				// and back we go
+
+Sys_Cserve_Rd_Impure:
+        mfpr	r0, pt_impure		// Get base of impure scratch area.
+        hw_rei
+
+        ALIGN_BRANCH
+
+Sys_Cserve_Jtopal:
+        bic	a0, 3, t8		// Clear out low 2 bits of address
+        bis	t8, 1, t8		// Or in PAL mode bit
+        mtpr    t8,exc_addr
+        hw_rei
+
+        // ldq_p
+        ALIGN_QUAD
+1:
+        ldq_p	r0,0(r17)		// get the data
+        nop				// pad palshadow write
+
+        hw_rei				// and back we go
+
+
+        // stq_p
+        ALIGN_QUAD
+2:
+        stq_p	r18, 0(r17)		// store the data
+        lda     r0,17(r31) // bogus
+        hw_rei				// and back we go
+
+
+        ALIGN_QUAD
+csrv_callback:
+        ldq	r16, 0(r17)		// restore r16
+        ldq	r17, 8(r17)		// restore r17
+        lda	r0, hlt_c_callback(r31)
+        br	r31, sys_enter_console
+
+
+csrv_identify:
+        mfpr	r0, pal_base
+        ldq_p	r0, 8(r0)
+        hw_rei
+
+
+// dump tb's
+        ALIGN_QUAD
+0:
+        // DTB PTEs - 64 entries
+        addq	r31, 64, r0		// initialize loop counter
+        nop
+
+1:	mfpr	r12, ev5__dtb_pte_temp	// read out next pte to temp
+        mfpr	r12, ev5__dtb_pte	// read out next pte to reg file
+
+        subq	r0, 1, r0		// decrement loop counter
+        nop				// Pad - no Mbox instr in cycle after mfpr
+
+        stq_p	r12, 0(r16)		// store out PTE
+        addq	r16, 8 ,r16		// increment pointer
+
+        bne	r0, 1b
+
+        ALIGN_BRANCH
+        // ITB PTEs - 48 entries
+        addq	r31, 48, r0		// initialize loop counter
+        nop
+
+2:	mfpr	r12, ev5__itb_pte_temp	// read out next pte to temp
+        mfpr	r12, ev5__itb_pte	// read out next pte to reg file
+
+        subq	r0, 1, r0		// decrement loop counter
+        nop				//
+
+        stq_p	r12, 0(r16)		// store out PTE
+        addq	r16, 8 ,r16		// increment pointer
+
+        bne	r0, 2b
+        or	r31, 1, r0		// set success
+
+        hw_rei				// and back we go
+
+
+//
+// SYS_INTERRUPT  - Interrupt processing code
+//
+//	Current state:
+//		Stack is pushed
+//		ps, sp and gp are updated
+//		r12, r14 - available
+//		r13 - INTID (new EV5 IPL)
+//		r25 - ISR
+//		r16, r17, r18 - available
+//
+//
+EXPORT(sys_interrupt)
+    cmpeq  r13, 31, r12			// Check for level 31 interrupt
+    bne    r12, sys_int_mchk_or_crd	// machine check or crd
+
+    cmpeq  r13, 30, r12			// Check for level 30 interrupt
+    bne    r12, sys_int_powerfail	// powerfail
+
+    cmpeq  r13, 29, r12			// Check for level 29 interrupt
+    bne    r12, sys_int_perf_cnt	// performance counters
+
+    cmpeq  r13, 23, r12			// Check for level 23 interrupt
+    bne    r12, sys_int_23 		// IPI in Tsunami
+
+    cmpeq  r13, 22, r12			// Check for level 22 interrupt
+    bne    r12, sys_int_22 		// timer interrupt
+
+    cmpeq  r13, 21, r12			// Check for level 21 interrupt
+    bne    r12, sys_int_21	 	// I/O
+
+    cmpeq  r13, 20, r12			// Check for level 20 interrupt
+    bne    r12, sys_int_20		// system error interrupt
+                                        // (might be corrected)
+
+    mfpr   r14, exc_addr		// ooops, something is wrong
+    br     r31, pal_pal_bug_check_from_int
+
+
+//
+//sys_int_2*
+//	Routines to handle device interrupts at IPL 23-20.
+//	System specific method to ack/clear the interrupt, detect passive
+//      release, detect interprocessor (22),  interval clock (22),  corrected
+//	system error (20)
+//
+//	Current state:
+//		Stack is pushed
+//		ps, sp and gp are updated
+//		r12, r14 - available
+//		r13 - INTID (new EV5 IPL)
+//		r25 - ISR
+//
+//	On exit:
+//		Interrupt has been ack'd/cleared
+//		a0/r16 - signals IO device interrupt
+//		a1/r17 - contains interrupt vector
+//		exit to ent_int address
+//
+//
+
+#if defined(TSUNAMI) || defined(BIG_TSUNAMI)
+        ALIGN_BRANCH
+sys_int_23:
+        or      r31,0,r16                        // IPI interrupt A0 = 0
+        lda     r12,0xf01(r31)                   // build up an address for the MISC register
+        sll     r12,16,r12
+        lda     r12,0xa000(r12)
+        sll     r12,16,r12
+        lda     r12,IPIR_addr(r12)
+
+        mfpr    r10, pt_whami                   // get CPU ID
+        extbl	r10, 1, r10		        // Isolate just whami bits
+        or      r31,0x1,r14                     // load r14 with bit to clear
+        sll     r14,r10,r14                     // left shift by CPU ID
+        sll     r14,IPIR_shift,r14
+        stq_p   r14, 0(r12)                     // clear the ipi interrupt
+
+        br	r31, pal_post_interrupt		// Notify the OS
+
+
+        ALIGN_BRANCH
+sys_int_22:
+        or      r31,1,r16                       // a0 means it is a clock interrupt
+        lda     r12,0xf01(r31)                  // build up an address for the MISC register
+        sll     r12,16,r12
+        lda     r12,0xa000(r12)
+        sll     r12,16,r12
+        lda     r12,RTC_addr(r12)
+
+        mfpr    r10, pt_whami                   // get CPU ID
+        extbl	r10, 1, r10		        // Isolate just whami bits
+        or      r31,0x1,r14                     // load r14 with bit to clear
+        sll     r14,r10,r14                     // left shift by CPU ID
+        sll     r14,RTC_shift,r14               // put the bits in the right position
+        stq_p   r14, 0(r12)                     // clear the rtc interrupt
+
+        br	r31, pal_post_interrupt		// Tell the OS
+
+
+        ALIGN_BRANCH
+sys_int_20:
+        Read_TLINTRSUMx(r13,r10,r14)		// read the right TLINTRSUMx
+        srl	r13, 12, r13			// shift down to examine IPL15
+
+        Intr_Find_TIOP(r13,r14)
+        beq	r14, 1f
+
+        Get_TLSB_Node_Address(r14,r10)
+        lda	r10, 0xa40(r10)	// Get base TLILID address
+
+        ldl_p	r13, 0(r10)			// Read the TLILID register
+        bne	r13, pal_post_dev_interrupt
+        beq	r13, 1f
+
+        and	r13, 0x3, r10			// check for PCIA bits
+        beq	r10, pal_post_dev_interrupt	// done if nothing set
+        save_pcia_intr(1)
+        br	r31, pal_post_dev_interrupt	//
+
+1:	lda	r16, osfint_c_passrel(r31)	// passive release
+        br	r31, pal_post_interrupt		//
+
+
+        ALIGN_BRANCH
+sys_int_21:
+
+    lda     r12,0xf01(r31)                // calculate DIRn address
+    sll     r12,32,r12
+    ldah    r13,DIR_addr(r31)
+    sll	    r13,8,r13
+    bis	    r12,r13,r12
+
+    mfpr    r13, pt_whami                   // get CPU ID
+    extbl   r13, 1, r13		            // Isolate just whami bits
+
+#ifdef BIG_TSUNAMI
+    sll     r13,4,r13
+    or      r12,r13,r12
+#else
+    lda     r12,0x0080(r12)
+    and     r13,0x1,r14                     // grab LSB and shift left 6
+    sll     r14,6,r14
+    and     r13,0x2,r10                     // grab LSB+1 and shift left 9
+    sll     r10,9,r10
+
+    mskbl   r12,0,r12                       // calculate DIRn address
+    lda     r13,0x280(r31)
+    bis     r12,r13,r12
+    or      r12,r14,r12
+    or      r12,r10,r12
+#endif
+
+    ldq_p    r13, 0(r12)                     // read DIRn
+
+    or      r31,1,r14                       // set bit 55 (ISA Interrupt)
+    sll     r14,55,r14
+
+    and     r13, r14, r14                    // check if bit 55 is set
+    lda     r16,0x900(r31)                  // load offset for normal into r16
+    beq     r14, normal_int                 // if not compute the vector normally
+
+    lda     r16,0x800(r31)                  // replace with offset for pic
+    lda     r12,0xf01(r31)                   // build an addr to access PIC
+    sll     r12,32,r12                        // at f01fc000000
+    ldah    r13,0xfc(r31)
+    sll	    r13,8,r13
+    bis	    r12,r13,r12
+    ldq_p    r13,0x0020(r12)                   // read PIC1 ISR for interrupting dev
+
+normal_int:
+    //ctlz    r13,r14                          // count the number of leading zeros
+    // EV5 doesn't have ctlz, but we do, so let's use it
+    .byte 0x4e
+    .byte 0x06
+    .byte 0xed
+    .byte 0x73
+    lda     r10,63(r31)
+    subq    r10,r14,r17                     // r17 = 63 - leading zeros (highest set bit)
+
+    lda	    r13,0x10(r31)
+    mulq    r17,r13,r17                    // compute 0x900 + (0x10 * Highest DIRn-bit)
+    addq    r17,r16,r17
+
+    or      r31,3,r16                       // a0 means it is an I/O interrupt
+
+    br      r31, pal_post_interrupt
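+//
+// Notes on normal_int (derived from the code above; the example value is
+// illustrative):
+//   The four .byte values form the little-endian word 0x73ed064e, which
+//   decodes as opcode 0x1c, function 0x32, Rb = r13, Rc = r14, i.e.
+//   "ctlz r13, r14".
+//   vector (a1/r17) = r16 + 0x10 * (63 - ctlz(r13))
+//   e.g. highest set bit 5 with r16 = 0x900:  0x900 + 0x10 * 5 = 0x950
+//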
+
+#elif defined(TLASER)
+        ALIGN_BRANCH
+sys_int_23:
+        Read_TLINTRSUMx(r13,r10,r14)		// read the right TLINTRSUMx
+        srl	r13, 22, r13			// shift down to examine IPL17
+
+        Intr_Find_TIOP(r13,r14)
+        beq	r14, 1f
+
+        Get_TLSB_Node_Address(r14,r10)
+        lda	r10, 0xac0(r10)	// Get base TLILID address
+
+        ldl_p	r13, 0(r10)			// Read the TLILID register
+        bne	r13, pal_post_dev_interrupt
+
+1:	lda	r16, osfint_c_passrel(r31)	// passive release
+        br	r31, pal_post_interrupt		//
+
+
+        ALIGN_BRANCH
+sys_int_22:
+        Read_TLINTRSUMx(r13,r10,r14)		// read the right TLINTRSUMx
+        srl	r13, 6, r14			// check the Intim bit
+
+        blbs	r14, tlep_intim			// go service Intim
+        srl	r13, 5, r14			// check the IP Int bit
+
+        blbs	r14, tlep_ipint			// go service IP Int
+        srl	r13, 17, r13			// shift down to examine IPL16
+
+        Intr_Find_TIOP(r13,r14)
+        beq	r14, 1f
+
+        Get_TLSB_Node_Address(r14,r10)
+        lda	r10, 0xa80(r10)	// Get base TLILID address
+
+        ldl_p	r13, 0(r10)			// Read the TLILID register
+        bne	r13, pal_post_dev_interrupt
+        beq	r13, 1f
+
+        and	r13, 0x3, r10			// check for PCIA bits
+        beq	r10, pal_post_dev_interrupt	// done if nothing set
+        save_pcia_intr(2)
+        br	r31, pal_post_dev_interrupt	//
+
+1:	lda	r16, osfint_c_passrel(r31)	// passive release
+        br	r31, pal_post_interrupt		//
+
+
+        ALIGN_BRANCH
+sys_int_21:
+        Read_TLINTRSUMx(r13,r10,r14)		// read the right TLINTRSUMx
+        srl	r13, 12, r13			// shift down to examine IPL15
+
+        Intr_Find_TIOP(r13,r14)
+        beq	r14, 1f
+
+        Get_TLSB_Node_Address(r14,r10)
+        lda	r10, 0xa40(r10)	// Get base TLILID address
+
+        ldl_p	r13, 0(r10)			// Read the TLILID register
+        bne	r13, pal_post_dev_interrupt
+        beq	r13, 1f
+
+        and	r13, 0x3, r10			// check for PCIA bits
+        beq	r10, pal_post_dev_interrupt	// done if nothing set
+        save_pcia_intr(1)
+        br	r31, pal_post_dev_interrupt	//
+
+1:	lda	r16, osfint_c_passrel(r31)	// passive release
+        br	r31, pal_post_interrupt		//
+
+
+        ALIGN_BRANCH
+sys_int_20:
+        lda	r13, 1(r31)			// Duart0 bit
+        Write_TLINTRSUMx(r13,r10,r14)		// clear the duart0 bit
+
+        Read_TLINTRSUMx(r13,r10,r14)		// read the right TLINTRSUMx
+        blbs	r13, tlep_uart0			// go service UART int
+
+        srl	r13, 7, r13			// shift down to examine IPL14
+        Intr_Find_TIOP(r13,r14)
+
+        beq	r14, tlep_ecc			// Branch if not IPL14
+        Get_TLSB_Node_Address(r14,r10)
+
+        lda	r10, 0xa00(r10)	                // Get base TLILID0 address
+        ldl_p	r13, 0(r10)			// Read the TLILID register
+
+        bne	r13, pal_post_dev_interrupt
+        beq	r13, 1f
+
+        and	r13, 0x3, r10			// check for PCIA bits
+        beq	r10, pal_post_dev_interrupt	// done if nothing set
+        save_pcia_intr(0)
+        br	r31, pal_post_dev_interrupt	//
+1:	lda	r16, osfint_c_passrel(r31)	// passive release
+        br	r31, pal_post_interrupt		//
+
+
+        ALIGN_BRANCH
+tlep_intim:
+        lda	r13, 0xffb(r31)			// get upper GBUS address bits
+        sll	r13, 28, r13			// shift up to top
+
+        lda	r13, (0x300)(r13)  // full CSRC address (tlep watch csrc offset)
+        ldq_p	r13, 0(r13)			// read CSRC
+
+        lda	r13, 0x40(r31)			// load Intim bit
+        Write_TLINTRSUMx(r13,r10,r14)		// clear the Intim bit
+
+        lda	r16, osfint_c_clk(r31)		// a0 <- clock interrupt code
+        br	r31, pal_post_interrupt		// Build the stack frame
+
+
+        ALIGN_BRANCH
+tlep_ipint:
+        lda	r13, 0x20(r31)			// load IP Int bit
+        Write_TLINTRSUMx(r13,r10,r14)		// clear the IP Int bit
+
+        lda	r16, osfint_c_ip(r31)		// a0 <- interprocessor interrupt code
+        br	r31, pal_post_interrupt		// Build the stack frame
+
+
+        ALIGN_BRANCH
+tlep_uart0:
+        lda	r13, 0xffa(r31)			// get upper GBUS address bits
+        sll	r13, 28, r13			// shift up to top
+
+        ldl_p	r14, 0x80(r13)			// zero pointer register
+        lda	r14, 3(r31)			// index to RR3
+
+        stl_p	r14, 0x80(r13)			// write pointer register
+        mb
+
+        mb
+        ldl_p	r14, 0x80(r13)			// read RR3
+
+        srl	r14, 5, r10			// is it Channel A RX?
+        blbs	r10, uart0_rx
+
+        srl	r14, 4, r10			// is it Channel A TX?
+        blbs	r10, uart0_tx
+
+        srl	r14, 2, r10			// is it Channel B RX?
+        blbs	r10, uart1_rx
+
+        srl	r14, 1, r10			// is it Channel B TX?
+        blbs	r10, uart1_tx
+
+        lda	r8, 0(r31)			// passive release
+        br	r31, clear_duart0_int		// clear tlintrsum and post
+
+
+        ALIGN_BRANCH
+uart0_rx:
+        lda	r8, 0x680(r31)			// UART0 RX vector
+        br	r31, clear_duart0_int		// clear tlintrsum and post
+
+
+        ALIGN_BRANCH
+uart0_tx:
+        lda	r14, 0x28(r31)			// Reset TX Int Pending code
+        mb
+        stl_p	r14, 0x80(r13)			// write Channel A WR0
+        mb
+
+        lda	r8, 0x6c0(r31)			// UART0 TX vector
+        br	r31, clear_duart0_int		// clear tlintrsum and post
+
+
+        ALIGN_BRANCH
+uart1_rx:
+        lda	r8, 0x690(r31)			// UART1 RX vector
+        br	r31, clear_duart0_int		// clear tlintrsum and post
+
+
+        ALIGN_BRANCH
+uart1_tx:
+        lda	r14, 0x28(r31)			// Reset TX Int Pending code
+        stl_p	r14, 0(r13)			// write Channel B WR0
+
+        lda	r8, 0x6d0(r31)			// UART1 TX vector
+        br	r31, clear_duart0_int		// clear tlintrsum and post
+
+
+        ALIGN_BRANCH
+clear_duart0_int:
+        lda	r13, 1(r31)			// load duart0 bit
+        Write_TLINTRSUMx(r13,r10,r14)		// clear the duart0 bit
+
+        beq	r8, 1f
+        or	r8, r31, r13			// move vector to r13
+        br	r31, pal_post_dev_interrupt	// Build the stack frame
+1:	nop
+        nop
+        hw_rei
+//	lda	r16, osfint_c_passrel(r31)	// passive release
+//	br	r31, pal_post_interrupt		//
+
+
+        ALIGN_BRANCH
+tlep_ecc:
+        mfpr	r14, pt_whami			// get our node id
+        extbl	r14, 1, r14			// shift to bit 0
+
+        srl	r14, 1, r14			// shift off cpu number
+        Get_TLSB_Node_Address(r14,r10)		// compute our nodespace address
+
+        ldl_p	r13, 0x40(r10)	// read our TLBER WAS tlsb_tlber_offset
+        srl	r13, 17, r13			// shift down the CWDE/CRDE bits
+
+        and	r13, 3, r13			// mask the CWDE/CRDE bits
+        beq	r13, 1f
+
+        ornot	r31, r31, r12			// set flag
+        lda	r9, mchk_c_sys_ecc(r31)		// System Correctable error MCHK code
+        br	r31, sys_merge_sys_corr		// jump to CRD logout frame code
+
+1:	lda	r16, osfint_c_passrel(r31)	// passive release
+
+#endif // if TSUNAMI || BIG_TSUNAMI elif TLASER
+
+        ALIGN_BRANCH
+pal_post_dev_interrupt:
+        or	r13, r31, r17			// move vector to a1
+        or	r31, osfint_c_dev, r16		// a0 signals IO device interrupt
+
+pal_post_interrupt:
+        mfpr	r12, pt_entint
+
+        mtpr	r12, exc_addr
+
+        nop
+        nop
+
+        hw_rei_spe
+
+
+//
+// sys_passive_release
+//	Just pretend the interrupt never occurred.
+//
+
+EXPORT(sys_passive_release)
+        mtpr	r11, ev5__dtb_cm	// Restore Mbox current mode for ps
+        nop
+
+        mfpr	r31, pt0		// Pad write to dtb_cm
+        hw_rei
+
+//
+// sys_int_powerfail
+//	A powerfail interrupt has been detected.  The stack has been pushed.
+//	IPL and PS are updated as well.
+//
+//	I'm not sure what to do here; I'm treating it as an IO device interrupt
+//
+//
+
+        ALIGN_BLOCK
+sys_int_powerfail:
+        lda	r12, 0xffc4(r31)		// get GBUS_MISCR address bits
+        sll	r12, 24, r12			// shift to proper position
+        ldq_p	r12, 0(r12)			// read GBUS_MISCR
+        srl	r12, 5, r12			// isolate bit <5>
+        blbc	r12, 1f 			// if clear, no missed mchk
+
+                                                // Missed a CFAIL mchk
+        lda	r13, 0xffc7(r31)		// get GBUS$SERNUM address bits
+        sll	r13, 24, r13			// shift to proper position
+        lda	r14, 0x40(r31)			// get bit <6> mask
+        ldq_p	r12, 0(r13)			// read GBUS$SERNUM
+        or	r12, r14, r14			// set bit <6>
+        stq_p	r14, 0(r13)			// clear GBUS$SERNUM<6>
+        mb
+        mb
+
+1:	br	r31, sys_int_mchk		// do a machine check
+
+        lda	r17, scb_v_pwrfail(r31)	// a1 to interrupt vector
+        mfpr	r25, pt_entint
+
+        lda	r16, osfint_c_dev(r31)	// a0 to device code
+        mtpr	r25, exc_addr
+
+        nop				// pad exc_addr write
+        nop
+
+        hw_rei_spe
+
+//
+// sys_halt_interrupt
+//       A halt interrupt has been detected.  Pass control to the console.
+//
+//
+//
+        EXPORT(sys_halt_interrupt)
+
+        ldah	r13, 0x1800(r31)		// load Halt/^PHalt bits
+        Write_TLINTRSUMx(r13,r10,r14)		// clear the ^PHalt bits
+
+        mtpr	r11, dtb_cm		// Restore Mbox current mode
+        nop
+        nop
+        mtpr	r0, pt0
+        lda     r0, hlt_c_hw_halt(r31)  // set halt code to hw halt
+        br      r31, sys_enter_console  // enter the console
+
+
+
+//
+// sys_int_mchk_or_crd
+//
+//	Current state:
+//		Stack is pushed
+//		ps, sp and gp are updated
+//		r12
+//		r13 - INTID (new EV5 IPL)
+//		r14 - exc_addr
+//		r25 - ISR
+//		r16, r17, r18 - available
+//
+//
+        ALIGN_BLOCK
+sys_int_mchk_or_crd:
+        srl	r25, isr_v_mck, r12
+        blbs	r12, sys_int_mchk
+        //
+        // Not a Machine check interrupt, so must be an Internal CRD interrupt
+        //
+
+        mb					//Clear out Cbox prior to reading IPRs
+        srl 	r25, isr_v_crd, r13		//Check for CRD
+        blbc	r13, pal_pal_bug_check_from_int	//If CRD not set, shouldn't be here!!!
+
+        lda	r9, 1(r31)
+        sll 	r9, hwint_clr_v_crdc, r9	// get ack bit for crd
+        mtpr	r9, ev5__hwint_clr		// ack the crd interrupt
+
+        or	r31, r31, r12			// clear flag
+        lda	r9, mchk_c_ecc_c(r31)		// Correctable error MCHK code
+
+sys_merge_sys_corr:
+        ldah	r14, 0xfff0(r31)
+        mtpr   	r0, pt0				// save r0 for scratch
+        zap	r14, 0xE0, r14			// Get Cbox IPR base
+        mtpr   	r1, pt1				// save r1 for scratch
+
+        ldq_p	r0, ei_addr(r14)		// EI_ADDR IPR
+        ldq_p	r10, fill_syn(r14)		// FILL_SYN IPR
+        bis	r0, r10, r31			// Touch lds to make sure they complete before doing scrub
+
+        blbs	r12, 1f				// no scrubbing for IRQ0 case
+// XXX bugnion	pvc_jsr	crd_scrub_mem, bsr=1
+        bsr	r13, sys_crd_scrub_mem		// and go scrub
+
+                                                // ld/st pair in scrub routine will have finished due
+                                                // to ibox stall of stx_c.  Don't need another mb.
+        ldq_p	r8, ei_stat(r14)		// EI_STAT, unlock EI_ADDR, BC_TAG_ADDR, FILL_SYN
+        or	r8, r31, r12			// Must only be executed once in this flow, and must
+        br	r31, 2f				// be after the scrub routine.
+
+1:	ldq_p	r8, ei_stat(r14)		// EI_STAT, unlock EI_ADDR, BC_TAG_ADDR, FILL_SYN
+                                                // For IRQ0 CRD case only - meaningless data.
+
+2:	mfpr	r13, pt_mces			// Get MCES
+        srl	r12, ei_stat_v_ei_es, r14	// Isolate EI_STAT:EI_ES
+        blbc	r14, 6f			// branch if 630
+        srl	r13, mces_v_dsc, r14		// check if 620 reporting disabled
+        blbc	r14, 5f				// branch if enabled
+        or	r13, r31, r14			// don't set SCE if disabled
+        br	r31, 8f			// continue
+5:	bis	r13, BIT(mces_v_sce), r14	// Set MCES<SCE> bit
+        br	r31, 8f
+
+6:     	srl	r13, mces_v_dpc, r14		// check if 630 reporting disabled
+        blbc	r14, 7f			// branch if enabled
+        or	r13, r31, r14			// don't set PCE if disabled
+        br	r31, 8f			// continue
+7:	bis	r13, BIT(mces_v_pce), r14	// Set MCES<PCE> bit
+
+        // Setup SCB if dpc is not set
+8:	mtpr	r14, pt_mces			// Store updated MCES
+        srl	r13, mces_v_sce, r1		// Get SCE
+        srl	r13, mces_v_pce, r14		// Get PCE
+        or	r1, r14, r1			// SCE OR PCE, since they share
+                                                // the CRD logout frame
+        // Get base of the logout area.
+        GET_IMPURE(r14)				 // addr of per-cpu impure area
+        GET_ADDR(r14,(pal_logout_area+mchk_crd_base),r14)
+
+        blbc	r1, sys_crd_write_logout_frame	// If pce/sce not set, build the frame
+
+        // Set the 2nd error flag in the logout area:
+
+        lda     r1, 3(r31)			// Set retry and 2nd error flags
+        sll	r1, 30, r1			// Move to bits 31:30 of logout frame flag longword
+        stl_p	r1, mchk_crd_flag+4(r14)	// store flag longword
+        br 	sys_crd_ack
+
+sys_crd_write_logout_frame:
+        // should only be here if neither the pce nor the sce bit is set
+
+        //
+        // Write the mchk code to the logout area
+        //
+        stq_p	r9, mchk_crd_mchk_code(r14)
+
+
+        //
+        // Write the first 2 quadwords of the logout area:
+        //
+        lda     r1, 1(r31)		  	// Set retry flag
+        sll	r1, 63, r9		  	// Move retry flag to bit 63
+        lda	r1, mchk_crd_size(r9)	  	// Combine retry flag and frame size
+        stq_p	r1, mchk_crd_flag(r14)	  	// store flag/frame size
+
+        //
+        // Write error IPRs already fetched to the logout area
+        //
+        stq_p	r0, mchk_crd_ei_addr(r14)
+        stq_p	r10, mchk_crd_fill_syn(r14)
+        stq_p	r8, mchk_crd_ei_stat(r14)
+        stq_p	r25, mchk_crd_isr(r14)
+        //
+        // Log system specific info here
+        //
+crd_storeTLEP_:
+        lda	r1, 0xffc4(r31)			// Get GBUS$MISCR address
+        sll	r1, 24, r1
+        ldq_p	r1, 0(r1)			// Read GBUS$MISCR
+        sll	r1, 16, r1			// shift up to proper field
+        mfpr	r10, pt_whami			// get our node id
+        extbl	r10, 1, r10			// shift to bit 0
+        or	r1, r10, r1			// merge MISCR and WHAMI
+        stl_p	r1, mchk_crd_whami(r14)		// write to crd logout area
+        srl	r10, 1, r10			// shift off cpu number
+
+        Get_TLSB_Node_Address(r10,r0)		// compute our nodespace address
+
+        OSFcrd_TLEPstore_tlsb(tldev)
+        OSFcrd_TLEPstore_tlsb_clr(tlber)
+        OSFcrd_TLEPstore_tlsb_clr(tlesr0)
+        OSFcrd_TLEPstore_tlsb_clr(tlesr1)
+        OSFcrd_TLEPstore_tlsb_clr(tlesr2)
+        OSFcrd_TLEPstore_tlsb_clr(tlesr3)
+
+sys_crd_ack:
+        mfpr	r0, pt0					// restore r0
+        mfpr	r1, pt1					// restore r1
+
+        srl	r12, ei_stat_v_ei_es, r12
+        blbc	r12, 5f
+        srl	r13, mces_v_dsc, r10			// logging enabled?
+        br	r31, 6f
+5:	srl	r13, mces_v_dpc, r10			// logging enabled?
+6:	blbc	r10, sys_crd_post_interrupt		// logging enabled -- report it
+
+                                                        // logging not enabled
+        // Get base of the logout area.
+        GET_IMPURE(r13)				 // addr of per-cpu impure area
+        GET_ADDR(r13,(pal_logout_area+mchk_crd_base),r13)
+        ldl_p	r10, mchk_crd_rsvd(r13)			// bump counter
+        addl	r10, 1, r10
+        stl_p	r10, mchk_crd_rsvd(r13)
+        mb
+        br	r31, sys_crd_dismiss_interrupt		// just return
+
+        //
+        // The stack is pushed.  Load up a0,a1,a2 and vector via entInt
+        //
+        //
+
+        ALIGN_BRANCH
+sys_crd_post_interrupt:
+        lda	r16, osfint_c_mchk(r31)	// flag as mchk/crd in a0
+        lda	r17, scb_v_proc_corr_err(r31) // a1 <- interrupt vector
+
+        blbc	r12, 1f
+        lda	r17, scb_v_sys_corr_err(r31) // a1 <- interrupt vector
+
+1:	subq    r31, 1, r18            // get a -1
+        mfpr	r25, pt_entInt
+
+        srl     r18, 42, r18           // shift off low bits of kseg addr
+        mtpr	r25, exc_addr		// load interrupt vector
+
+        sll     r18, 42, r18           // shift back into position
+        or    	r14, r18, r18           // EV4 algorithm - pass pointer to mchk frame as kseg address
+
+        hw_rei_spe			// done
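+//
+// Worked example of the kseg pointer built above (illustrative):
+//   r18 = -1                 = 0xffffffffffffffff
+//   r18 = r18 >> 42          = 0x00000000003fffff
+//   r18 = r18 << 42          = 0xfffffc0000000000
+//   a2/r18 = r14 | r18       -> the value in r14 with bits <63:42> set
+//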
+
+
+        //
+        // The stack is pushed.  Need to back out of it all.
+        //
+
+sys_crd_dismiss_interrupt:
+        br	r31, Call_Pal_Rti
+
+
+// sys_crd_scrub_mem
+//
+//	r0 = addr of cache block
+//
+        ALIGN_BLOCK	// align for branch target
+sys_crd_scrub_mem:
+        // now find error in memory, and attempt to scrub that cache block
+        // This routine just scrubs the failing octaword
+        // Only need to "touch" one quadword per octaword to accomplish the scrub
+        srl	r0, 39, r8		// get high bit of bad pa
+        blbs	r8, 1f  		// don't attempt fixup on IO space addrs
+        nop				// needed to align the ldq_p to octaword boundary
+        nop				//             "
+
+        ldq_p 	r8,  0(r0) 		// attempt to read the bad memory
+                                        // location
+                                        //    (Note bits 63:40,3:0 of ei_addr
+                                        //     are set to 1, but as long as
+                                        //     we are doing a phys ref, should
+                                        //     be ok)
+        nop				// Needed to keep the Ibox from swapping the ldq_p into E1
+
+        stq_p 	r8,  0(r0) 		// Store it back if it is still there.
+                                        // If store fails, location already
+                                        //  scrubbed by someone else
+
+        nop				// needed to align the ldq_p to octaword boundary
+
+        lda	r8, 0x20(r31)		// flip bit 5 to touch next hexaword
+        xor	r8, r0, r0
+        nop				// needed to align the ldq_p to octaword boundary
+        nop				//             "
+
+        ldq_p 	r8,  0(r0) 		// attempt to read the bad memory
+                                        // location
+                                        //    (Note bits 63:40,3:0 of ei_addr
+                                        //     are set to 1, but as long as
+                                        //     we are doing a phys ref, should
+                                        //     be ok)
+        nop				// Needed to keep the Ibox from swapping the ldq_p into E1
+
+        stq_p 	r8,  0(r0) 		// Store it back if it is still there.
+                                        // If store fails, location already
+                                        //  scrubbed by someone else
+
+        lda	r8, 0x20(r31)		// restore r0 to original address
+        xor	r8, r0, r0
+
+        //at this point, ei_stat could be locked due to a new corr error on the ld,
+        //so read ei_stat to unlock AFTER this routine.
+
+// XXX bugnion	pvc$jsr	crd_scrub_mem, bsr=1, dest=1
+1:	ret	r31, (r13)		// and back we go
+
+
+//
+// sys_int_mchk - MCHK Interrupt code
+//
+// Machine check interrupt from the system.  Setup and join the
+// regular machine check flow.
+// On exit:
+//       pt0     - saved r0
+//       pt1     - saved r1
+//       pt4     - saved r4
+//       pt5     - saved r5
+//       pt6     - saved r6
+//       pt10    - saved exc_addr
+//       pt_misc<47:32> - mchk code
+//       pt_misc<31:16> - scb vector
+//       r14     - base of Cbox IPRs in IO space
+//       MCES<mchk> is set
+//
+        ALIGN_BLOCK
+sys_int_mchk:
+        lda	r14, mchk_c_sys_hrd_error(r31)
+        mfpr	r12, exc_addr
+
+        addq	r14, 1, r14			// Flag as interrupt
+        nop
+
+        sll	r14, 32, r14			// Move mchk code to position
+        mtpr	r12, pt10			// Stash exc_addr
+
+        mfpr	r12, pt_misc			// Get MCES and scratch
+        mtpr	r0, pt0				// Stash for scratch
+
+        zap	r12, 0x3c, r12			// Clear scratch
+        blbs    r12, sys_double_machine_check   // MCHK halt if double machine check
+
+        or	r12, r14, r12			// Combine mchk code
+        lda	r14, scb_v_sysmchk(r31)		// Get SCB vector
+
+        sll	r14, 16, r14			// Move SCBv to position
+        or	r12, r14, r14			// Combine SCBv
+
+        bis	r14, BIT(mces_v_mchk), r14	// Set MCES<MCHK> bit
+        mtpr	r14, pt_misc			// Save mchk code!scbv!whami!mces
+
+        ldah	r14, 0xfff0(r31)
+        mtpr	r1, pt1				// Stash for scratch
+
+        zap	r14, 0xE0, r14			// Get Cbox IPR base
+        mtpr	r4, pt4
+
+        mtpr	r5, pt5
+
+        mtpr	r6, pt6
+        br	r31, sys_mchk_collect_iprs	// Join common machine check flow
+
+
+//
+// sys_int_perf_cnt - Performance counter interrupt code
+//
+//	A performance counter interrupt has been detected.  The stack
+//	has been pushed. IPL and PS are updated as well.
+//
+//	on exit to interrupt entry point ENTINT::
+//		a0 = osfint$c_perf
+//		a1 = scb$v_perfmon (650)
+//		a2 = 0 if performance counter 0 fired
+//		a2 = 1 if performance counter 1 fired
+//		a2 = 2 if performance counter 2 fired
+//		     (if more than one counter overflowed, an interrupt will be
+//			generated for each counter that overflows)
+//
+//
+//
+        ALIGN_BLOCK
+sys_int_perf_cnt:			// Performance counter interrupt
+        lda	r17, scb_v_perfmon(r31)	// a1 to interrupt vector
+        mfpr	r25, pt_entint
+
+        lda	r16, osfint_c_perf(r31)	// a0 to perf counter code
+        mtpr	r25, exc_addr
+
+        //isolate which perf ctr fired, load code in a2, and ack
+        mfpr	r25, isr
+        or	r31, r31, r18			// assume interrupt was pc0
+
+        srl	r25, isr_v_pc1, r25		// isolate
+        cmovlbs	r25, 1, r18			// if pc1 set, load 1 into r18
+
+        srl	r25, 1, r25			// get pc2
+        cmovlbs r25, 2, r18			// if pc2 set, load 2 into r18
+
+        lda	r25, 1(r31)			// get a one
+        sll	r25, r18, r25
+
+        sll	r25, hwint_clr_v_pc0c, r25	// ack only the perf counter that generated the interrupt
+        mtpr	r25, hwint_clr
+
+        hw_rei_spe
+
+
+
+//
+//  sys_reset - System specific RESET code
+//	On entry:
+//       r1 = pal_base +8
+//
+//	Entry state on trap:
+//       r0 = whami
+//       r2 = base of scratch area
+//       r3 = halt code
+//	and the following 3 if init_cbox is enabled:
+//       r5 = sc_ctl
+//       r6 = bc_ctl
+//       r7 = bc_cnfg
+//
+//	Entry state on switch:
+//       r17 - new PC
+//       r18 - new PCBB
+//       r19 - new VPTB
+//
+
+        ALIGN_BLOCK
+        .globl sys_reset
+sys_reset:
+//	mtpr	r31, ic_flush_ctl	// do not flush the icache - done by hardware before SROM load
+        mtpr	r31, itb_ia		// clear the ITB
+        mtpr	r31, dtb_ia		// clear the DTB
+
+        lda	r1, -8(r1)		// point to start of code
+        mtpr	r1, pal_base		// initialize PAL_BASE
+
+        // Interrupts
+        mtpr	r31, astrr		// stop ASTs
+        mtpr	r31, aster		// stop ASTs
+        mtpr	r31, sirr		// clear software interrupts
+
+        mtpr	r0, pt1			// r0 is whami (unless we entered via swp)
+
+        ldah     r1,(BIT(icsr_v_sde-16)|BIT(icsr_v_fpe-16)|BIT(icsr_v_spe-16+1))(zero)
+
+        bis	r31, 1, r0
+        sll	r0, icsr_v_crde, r0	// A 1 in icsr<corr_read_enable>
+        or	r0, r1, r1		// Set the bit
+
+        mtpr	r1, icsr		// ICSR - Shadows enabled, Floating point enable,
+                                        //	super page enabled, correct read per assembly option
+
+        // Mbox/Dcache init
+        lda     r1,BIT(mcsr_v_sp1)(zero)
+
+        mtpr	r1, mcsr		// MCSR - Super page enabled
+        lda	r1, BIT(dc_mode_v_dc_ena)(r31)
+        ALIGN_BRANCH
+//	mtpr	r1, dc_mode		// turn Dcache on
+        nop
+
+        mfpr	r31, pt0		// No Mbox instr in 1,2,3,4
+        mfpr	r31, pt0
+        mfpr	r31, pt0
+        mfpr	r31, pt0
+        mtpr	r31, dc_flush		// flush Dcache
+
+        // build PS (IPL=7,CM=K,VMM=0,SW=0)
+        lda	r11, 0x7(r31)		// Set shadow copy of PS - kern mode, IPL=7
+        lda	r1, 0x1F(r31)
+        mtpr	r1, ipl			// set internal <ipl>=1F
+        mtpr	r31, ev5__ps			// set new ps<cm>=0, Ibox copy
+        mtpr	r31, dtb_cm		// set new ps<cm>=0, Mbox copy
+
+        // Create the PALtemp pt_intmask
+        //   MAP:
+        //	OSF IPL		EV5 internal IPL(hex)	note
+        //	0		0
+        //	1		1
+        //	2		2
+        //	3		14			device
+        //	4		15			device
+        //	5		16			device
+        //	6		1E			device, performance counter, powerfail
+        //	7		1F
+        //
+
+        ldah	r1, 0x1f1E(r31)		// Create upper lw of int_mask
+        lda	r1, 0x1615(r1)
+
+        sll	r1, 32, r1
+        ldah	r1, 0x1402(r1)		// Create lower lw of int_mask
+
+        lda	r1, 0x0100(r1)
+        mtpr	r1, pt_intmask		// Stash in PALtemp
+
+        // Unlock a bunch of chip internal IPRs
+        mtpr	r31, exc_sum		// clear out exception summary and exc_mask
+        mfpr	r31, va			// unlock va, mmstat
+        lda     r8,(BIT(icperr_stat_v_dpe)|BIT(icperr_stat_v_tpe)|BIT(icperr_stat_v_tmr))(zero)
+
+        mtpr	r8, icperr_stat			// Clear Icache parity error & timeout status
+        lda	r8,(BIT(dcperr_stat_v_lock)|BIT(dcperr_stat_v_seo))(r31)
+
+        mtpr	r8, dcperr_stat			// Clear Dcache parity error status
+
+        rc	r0			// clear intr_flag
+        mtpr	r31, pt_trap
+
+        mfpr	r0, pt_misc
+        srl	r0, pt_misc_v_switch, r1
+        blbs	r1, sys_reset_switch	// see if we got here from swppal
+
+        // Rest of the "real" reset flow
+        // ASN
+        mtpr	r31, dtb_asn
+        mtpr	r31, itb_asn
+
+        lda	r1, 0x67(r31)
+        sll	r1, hwint_clr_v_pc0c, r1
+        mtpr	r1, hwint_clr		// Clear hardware interrupt requests
+
+        lda	r1, BIT(mces_v_dpc)(r31) // 1 in disable processor correctable error
+        mfpr	r0, pt1			// get whami
+        insbl	r0, 1, r0		// isolate whami in correct pt_misc position
+        or	r0, r1, r1		// combine whami and mces
+        mtpr	r1, pt_misc		// store whami and mces, swap bit clear
+
+        zapnot	r3, 1, r0		// isolate halt code
+        mtpr	r0, pt0			// save entry type
+
+        // Cycle counter
+        or	r31, 1, r9		// get a one
+        sll	r9, 32, r9		// shift to <32>
+        mtpr	r31, cc			// clear Cycle Counter
+        mtpr	r9, cc_ctl		// clear and enable the Cycle Counter
+        mtpr	r31, pt_scc		// clear System Cycle Counter
+
+
+        // Misc PALtemps
+        mtpr	r31, maf_mode		// no mbox instructions for 3 cycles
+        or	r31, 1, r1		// get bogus scbb value
+        mtpr	r1, pt_scbb		// load scbb
+        mtpr	r31, pt_prbr		// clear out prbr
+#if defined(TSUNAMI) || defined(BIG_TSUNAMI)
+        // yes, this is ugly, but you figure out a better
+        // way to get the address of the kludge_initial_pcbb
+        // in r1 with an uncooperative assembler --ali
+        br     r1, kludge_getpcb_addr
+        br     r31, kludge_initial_pcbb
+kludge_getpcb_addr:
+        ldq_p   r19, 0(r1)
+        sll    r19, 44, r19
+        srl    r19, 44, r19
+        mulq   r19,4,r19
+        addq   r19, r1, r1
+        addq   r1,4,r1
+#elif defined(TLASER)
+        // or      zero,kludge_initial_pcbb,r1
+        GET_ADDR(r1, (kludge_initial_pcbb-pal_base), r1)
+#endif
+        mtpr	r1, pt_pcbb		// load pcbb
+        lda	r1, 2(r31)		// get a two
+        sll	r1, 32, r1		// gen up upper bits
+        mtpr	r1, mvptbr
+        mtpr	r1, ivptbr
+        mtpr	r31, pt_ptbr
+        // Performance counters
+        mtpr	r31, pmctr
+
+        // Clear pmctr_ctl in impure area
+
+
+        ldah	r14, 0xfff0(r31)
+        zap	r14, 0xE0, r14		// Get Cbox IPR base
+        GET_IMPURE(r13)
+        stq_p	r31, 0(r13)		// Clear lock_flag
+
+        mfpr	r0, pt0			// get entry type
+        br	r31, sys_enter_console	// enter the console
+
+
+        // swppal entry
+        // r0 - pt_misc
+        // r17 - new PC
+        // r18 - new PCBB
+        // r19 - new VPTB
+sys_reset_switch:
+        or	r31, 1, r9
+        sll	r9, pt_misc_v_switch, r9
+        bic	r0, r9, r0		// clear switch bit
+        mtpr	r0, pt_misc
+
+        rpcc	r1			// get cyccounter
+
+        ldq_p	r22, osfpcb_q_fen(r18)	// get new fen/pme
+        ldl_p	r23, osfpcb_l_cc(r18)	// get cycle counter
+        ldl_p	r24, osfpcb_l_asn(r18)	// get new asn
+
+
+        ldq_p	r25, osfpcb_q_Mmptr(r18)// get new mmptr
+        sll	r25, page_offset_size_bits, r25 // convert pfn to pa
+        mtpr	r25, pt_ptbr		// load the new mmptr
+        mtpr	r18, pt_pcbb		// set new pcbb
+
+        bic	r17, 3, r17		// clean user pc
+        mtpr	r17, exc_addr		// set new pc
+        mtpr	r19, mvptbr
+        mtpr	r19, ivptbr
+
+        ldq_p	r30, osfpcb_q_Usp(r18)	// get new usp
+        mtpr	r30, pt_usp		// save usp
+
+        sll	r24, dtb_asn_v_asn, r8
+        mtpr	r8, dtb_asn
+        sll	r24, itb_asn_v_asn, r24
+        mtpr	r24, itb_asn
+
+        mfpr	r25, icsr		// get current icsr
+        lda	r24, 1(r31)
+        sll	r24, icsr_v_fpe, r24	// 1 in icsr<fpe> position
+        bic	r25, r24, r25		// clean out old fpe
+        and	r22, 1, r22		// isolate new fen bit
+        sll	r22, icsr_v_fpe, r22
+        or	r22, r25, r25		// or in new fpe
+        mtpr	r25, icsr		// update ibox ipr
+
+        subl	r23, r1, r1		// gen new cc offset
+        insll	r1, 4, r1		// << 32
+        mtpr	r1, cc			// set new offset
+
+        or	r31, r31, r0		// set success
+        ldq_p	r30, osfpcb_q_Ksp(r18)	// get new ksp
+        mfpr	r31, pt0		// stall
+        hw_rei_stall
+
+//
+//sys_machine_check - Machine check PAL
+// 	A machine_check trap has occurred.  The Icache has been flushed.
+//
+//
+
+        ALIGN_BLOCK
+EXPORT(sys_machine_check)
+        // Need to fill up the refill buffer (32 instructions) and
+        // then flush the Icache again.
+        // Also, due to possible 2nd Cbox register file write for
+        // uncorrectable errors, no register file read or write for 7 cycles.
+
+        //nop
+        .long 0x4000054 // call M5 Panic
+        mtpr	r0, pt0		// Stash for scratch -- OK if Cbox overwrites
+                                //    r0 later
+        nop
+        nop
+
+        nop
+        nop
+
+        nop
+        nop
+
+        nop
+        nop
+                                // 10 instructions, 5 cycles
+
+        nop
+        nop
+
+        nop
+        nop
+
+                                                // Register file can now be written
+        lda	r0, scb_v_procmchk(r31)		// SCB vector
+        mfpr	r13, pt_mces			// Get MCES
+        sll	r0, 16, r0			// Move SCBv to correct position
+        bis	r13, BIT(mces_v_mchk), r14	// Set MCES<MCHK> bit
+
+
+        zap	r14, 0x3C, r14			// Clear mchk_code word and SCBv word
+        mtpr	r14, pt_mces
+                                                // 20 instructions
+
+        nop
+        or	r14, r0, r14			// Insert new SCB vector
+        lda	r0, mchk_c_proc_hrd_error(r31)	// MCHK code
+        mfpr	r12, exc_addr
+
+        sll	r0, 32, r0			// Move MCHK code to correct position
+        mtpr	r4, pt4
+        or	r14, r0, r14			// Insert new MCHK code
+        mtpr	r14, pt_misc			// Store updated MCES, MCHK code, and SCBv
+
+        ldah	r14, 0xfff0(r31)
+        mtpr	r1, pt1				// Stash for scratch - 30 instructions
+
+        zap	r14, 0xE0, r14			// Get Cbox IPR base
+        mtpr	r12, pt10			// Stash exc_addr
+
+
+
+        mtpr	r31, ic_flush_ctl			// Second Icache flush, now it is really flushed.
+        blbs	r13, sys_double_machine_check		// MCHK halt if double machine check
+
+        mtpr	r6, pt6
+        mtpr	r5, pt5
+
+        // Look for the powerfail cases here....
+        mfpr	r4, isr
+        srl	r4, isr_v_pfl, r4
+        blbc	r4, sys_mchk_collect_iprs	// skip if no powerfail interrupt pending
+        lda	r4, 0xffc4(r31)			// get GBUS$MISCR address bits
+        sll	r4, 24, r4			// shift to proper position
+        ldq_p	r4, 0(r4)			// read GBUS$MISCR
+        srl	r4, 5, r4			// isolate bit <5>
+        blbc	r4, sys_mchk_collect_iprs	// skip if already cleared
+                                                // No missed CFAIL mchk
+        lda	r5, 0xffc7(r31)			// get GBUS$SERNUM address bits
+        sll	r5, 24, r5			// shift to proper position
+        lda	r6, 0x40(r31)			// get bit <6> mask
+        ldq_p	r4, 0(r5)			// read GBUS$SERNUM
+        or	r4, r6, r6			// set bit <6>
+        stq_p	r6, 0(r5)			// clear GBUS$SERNUM<6>
+        mb
+        mb
+
+
+        //
+        // Start to collect the IPRs.  Common entry point for mchk flows.
+        //
+        // Current state:
+        //	pt0	- saved r0
+        //	pt1	- saved	r1
+        //	pt4	- saved r4
+        //	pt5	- saved r5
+        //	pt6	- saved r6
+        //	pt10	- saved exc_addr
+        //	pt_misc<47:32> - mchk code
+        //	pt_misc<31:16> - scb vector
+        //	r14	- base of Cbox IPRs in IO space
+        //	r0, r1, r4, r5, r6, r12, r13, r25 - available
+        //	r8, r9, r10 - available as all loads are physical
+        //	MCES<mchk> is set
+        //
+        //
+
+EXPORT(sys_mchk_collect_iprs)
+        .long 0x4000054 // call M5 Panic
+        //mb						// MB before reading Scache IPRs
+        mfpr	r1, icperr_stat
+
+        mfpr	r8, dcperr_stat
+        mtpr	r31, dc_flush				// Flush the Dcache
+
+        mfpr	r31, pt0				// Pad Mbox instructions from dc_flush
+        mfpr	r31, pt0
+        nop
+        nop
+
+        ldq_p	r9, sc_addr(r14)			// SC_ADDR IPR
+        bis	r9, r31, r31				// Touch ld to make sure it completes before
+                                                        // read of SC_STAT
+        ldq_p	r10, sc_stat(r14)			// SC_STAT, also unlocks SC_ADDR
+
+        ldq_p	r12, ei_addr(r14)			// EI_ADDR IPR
+        ldq_p	r13, bc_tag_addr(r14)			// BC_TAG_ADDR IPR
+        ldq_p	r0, fill_syn(r14)			// FILL_SYN IPR
+        bis	r12, r13, r31				// Touch lds to make sure they complete before reading EI_STAT
+        bis	r0, r0, r31				// Touch lds to make sure they complete before reading EI_STAT
+        ldq_p	r25, ei_stat(r14)			// EI_STAT, unlock EI_ADDR, BC_TAG_ADDR, FILL_SYN
+        ldq_p	r31, ei_stat(r14)			// Read again to ensure it is unlocked
+
+
+
+
+        //
+        // Look for nonretryable cases
+        // In this segment:
+        //	r5<0> = 1 means retryable
+        //	r4, r6, and r14 are available for scratch
+        //
+        //
+
+
+        bis	r31, r31, r5				// Clear local retryable flag
+        srl	r25, ei_stat_v_bc_tperr, r25		// Move EI_STAT status bits to low bits
+
+        lda	r4, 1(r31)
+        sll	r4, icperr_stat_v_tmr, r4
+        and 	r1, r4, r4				// Timeout reset
+        bne	r4, sys_cpu_mchk_not_retryable
+
+        and	r8, BIT(dcperr_stat_v_lock), r4		// DCache parity error locked
+        bne	r4, sys_cpu_mchk_not_retryable
+
+        lda	r4, 1(r31)
+        sll	r4, sc_stat_v_sc_scnd_err, r4
+        and	r10, r4, r4				// 2nd Scache error occurred
+        bne	r4, sys_cpu_mchk_not_retryable
+
+
+        bis	r31, 0xa3, r4				// EI_STAT Bcache Tag Parity Error, Bcache Tag Control
+                                                        // Parity Error, Interface Parity Error, 2nd Error
+
+        and	r25, r4, r4
+        bne	r4, sys_cpu_mchk_not_retryable
+
+//	bis	r31, #<1@<ei_stat$v_unc_ecc_err-ei_stat$v_bc_tperr>>, r4
+        bis	r31, BIT((ei_stat_v_unc_ecc_err-ei_stat_v_bc_tperr)), r4
+        and	r25, r4, r4				// Isolate the Uncorrectable Error Bit
+//	bis	r31, #<1@<ei_stat$v_fil_ird-ei_stat$v_bc_tperr>>, r6
+        and	r25, BIT((ei_stat_v_fil_ird-ei_stat_v_bc_tperr)), r6 // Isolate the Iread bit
+        cmovne	r6, 0, r4				// r4 = 0 if IRD or if No Uncorrectable Error
+        bne     r4, sys_cpu_mchk_not_retryable
+
+        lda	r4, 7(r31)
+        and 	r10, r4, r4				// Isolate the Scache Tag Parity Error bits
+        bne	r4, sys_cpu_mchk_not_retryable		// All Scache Tag PEs are not retryable
+
+
+        lda	r4, 0x7f8(r31)
+        and	r10, r4, r4				// Isolate the Scache Data Parity Error bits
+        srl	r10, sc_stat_v_cbox_cmd, r6
+        and	r6, 0x1f, r6				// Isolate Scache Command field
+        subq	r6, 1, r6				// Scache Iread command = 1
+        cmoveq	r6, 0, r4				// r4 = 0 if IRD or if No Parity Error
+        bne     r4, sys_cpu_mchk_not_retryable
+
+        // Look for the system unretryable cases here....
+
+        mfpr	r4, isr					// mchk_interrupt pin asserted
+        srl	r4, isr_v_mck, r4
+        blbs	r4, sys_cpu_mchk_not_retryable
+
+
+
+        //
+        // Look for retryable cases
+        // In this segment:
+        //	r5<0> = 1 means retryable
+        //	r6 - holds the mchk code
+        //	r4 and r14 are available for scratch
+        //
+        //
+
+
+        // Within the chip, the retryable cases are Istream errors
+        lda	r4, 3(r31)
+        sll	r4, icperr_stat_v_dpe, r4
+        and	r1, r4, r4
+        cmovne	r4, 1, r5				// Retryable if just Icache parity error
+
+
+        lda	r4, 0x7f8(r31)
+        and	r10, r4, r4				// Isolate the Scache Data Parity Error bits
+        srl	r10, sc_stat_v_cbox_cmd, r14
+        and	r14, 0x1f, r14				// Isolate Scache Command field
+        subq	r14, 1, r14				// Scache Iread command = 1
+        cmovne	r4, 1, r4				// r4 = 1 if Scache data parity error bit set
+        cmovne	r14, 0, r4				// r4 = 1 if Scache PE and Iread
+        bis	r4, r5, r5				// Accumulate
+
+
+        bis	r31, BIT((ei_stat_v_unc_ecc_err-ei_stat_v_bc_tperr)), r4
+        and	r25, r4, r4				// Isolate the Uncorrectable Error Bit
+        and	r25, BIT((ei_stat_v_fil_ird-ei_stat_v_bc_tperr)), r14 // Isolate the Iread bit
+        cmovne	r4, 1, r4				// r4 = 1 if uncorr error
+        cmoveq	r14, 0, r4				// r4 = 1 if uncorr and Iread
+        bis	r4, r5, r5				// Accumulate
+
+        mfpr	r6, pt_misc
+        extwl	r6, 4, r6				// Fetch mchk code
+        bic	r6, 1, r6				// Clear flag from interrupt flow
+        cmovne	r5, mchk_c_retryable_ird, r6		// Set mchk code
+
+
+        //
+        // Write the logout frame
+        //
+        // Current state:
+        //	r0	- fill_syn
+        //	r1	- icperr_stat
+        //	r4	- available
+        // 	r5<0>  	- retry flag
+        //	r6     	- mchk code
+        //	r8	- dcperr_stat
+        //	r9	- sc_addr
+        //	r10	- sc_stat
+        //	r12	- ei_addr
+        //	r13	- bc_tag_addr
+        //	r14	- available
+        //	r25	- ei_stat (shifted)
+        //	pt0	- saved r0
+        //	pt1	- saved	r1
+        //	pt4	- saved r4
+        //	pt5	- saved r5
+        //	pt6	- saved r6
+        //	pt10	- saved exc_addr
+        //
+        //
+
+sys_mchk_write_logout_frame:
+        // Get base of the logout area.
+        GET_IMPURE(r14)				 // addr of per-cpu impure area
+        GET_ADDR(r14,pal_logout_area+mchk_mchk_base,r14)
+
+        // Write the first 2 quadwords of the logout area:
+
+        sll	r5, 63, r5				// Move retry flag to bit 63
+        lda	r4, mchk_size(r5)			// Combine retry flag and frame size
+        stq_p	r4, mchk_flag(r14)			// store flag/frame size
+        lda	r4, mchk_sys_base(r31)			// sys offset
+        sll	r4, 32, r4
+        lda	r4, mchk_cpu_base(r4)			// cpu offset
+        stq_p	r4, mchk_offsets(r14)			// store sys offset/cpu offset into logout frame
+
+        //
+        // Write the mchk code to the logout area
+        // Write error IPRs already fetched to the logout area
+        // Restore some GPRs from PALtemps
+        //
+
+        mfpr	r5, pt5
+        stq_p	r6, mchk_mchk_code(r14)
+        mfpr	r4, pt4
+        stq_p	r1, mchk_ic_perr_stat(r14)
+        mfpr	r6, pt6
+        stq_p	r8, mchk_dc_perr_stat(r14)
+        mfpr	r1, pt1
+        stq_p	r9, mchk_sc_addr(r14)
+        stq_p	r10, mchk_sc_stat(r14)
+        stq_p	r12, mchk_ei_addr(r14)
+        stq_p	r13, mchk_bc_tag_addr(r14)
+        stq_p	r0,  mchk_fill_syn(r14)
+        mfpr	r0, pt0
+        sll	r25, ei_stat_v_bc_tperr, r25		// Move EI_STAT status bits back to expected position
+        // retrieve lower 28 bits again from ei_stat and restore before storing to logout frame
+        ldah    r13, 0xfff0(r31)
+        zapnot  r13, 0x1f, r13
+        ldq_p    r13, ei_stat(r13)
+        sll     r13, 64-ei_stat_v_bc_tperr, r13
+        srl     r13, 64-ei_stat_v_bc_tperr, r13
+        or      r25, r13, r25
+        stq_p	r25, mchk_ei_stat(r14)
+
+
+
+
+        //
+        // complete the CPU-specific part of the logout frame
+        //
+
+        ldah	r13, 0xfff0(r31)
+        zap	r13, 0xE0, r13			// Get Cbox IPR base
+        ldq_p	r13, ld_lock(r13)		// Get ld_lock IPR
+        stq_p	r13, mchk_ld_lock(r14)		// and stash it in the frame
+
+        // Unlock IPRs
+        lda	r8, (BIT(dcperr_stat_v_lock)|BIT(dcperr_stat_v_seo))(r31)
+        mtpr	r8, dcperr_stat			// Clear Dcache parity error status
+
+        lda	r8, (BIT(icperr_stat_v_dpe)|BIT(icperr_stat_v_tpe)|BIT(icperr_stat_v_tmr))(r31)
+        mtpr	r8, icperr_stat			// Clear Icache parity error & timeout status
+
+1:	ldq_p	r8, mchk_ic_perr_stat(r14)	// get ICPERR_STAT value
+        GET_ADDR(r0,0x1800,r31)		// get ICPERR_STAT mask value
+        and	r0, r8, r0			// compare
+        beq	r0, 2f				// check next case if nothing set
+        lda	r0, mchk_c_retryable_ird(r31)	// set new MCHK code
+        br	r31, do_670			// setup new vector
+
+2:	ldq_p	r8, mchk_dc_perr_stat(r14)	// get DCPERR_STAT value
+        GET_ADDR(r0,0x3f,r31)			// get DCPERR_STAT mask value
+        and	r0, r8, r0			// compare
+        beq	r0, 3f				// check next case if nothing set
+        lda	r0, mchk_c_dcperr(r31)		// set new MCHK code
+        br	r31, do_670			// setup new vector
+
+3:	ldq_p	r8, mchk_sc_stat(r14)		// get SC_STAT value
+        GET_ADDR(r0,0x107ff,r31)		// get SC_STAT mask value
+        and	r0, r8, r0			// compare
+        beq	r0, 4f				// check next case if nothing set
+        lda	r0, mchk_c_scperr(r31)		// set new MCHK code
+        br	r31, do_670			// setup new vector
+
+4:	ldq_p	r8, mchk_ei_stat(r14)		// get EI_STAT value
+        GET_ADDR(r0,0x30000000,r31)		// get EI_STAT mask value
+        and	r0, r8, r0			// compare
+        beq	r0, 5f				// check next case if nothing set
+        lda	r0, mchk_c_bcperr(r31)		// set new MCHK code
+        br	r31, do_670			// setup new vector
+
+5:	ldl_p	r8, mchk_tlber(r14)		// get TLBER value
+        GET_ADDR(r0,0xfe01,r31)	        	// get high TLBER mask value
+        sll	r0, 16, r0			// shift into proper position
+        GET_ADDR(r1,0x03ff,r31)		        // get low TLBER mask value
+        or	r0, r1, r0			// merge mask values
+        and	r0, r8, r0			// compare
+        beq	r0, 6f				// check next case if nothing set
+        GET_ADDR(r0, 0xfff0, r31)		// set new MCHK code
+        br	r31, do_660			// setup new vector
+
+6:	ldl_p	r8, mchk_tlepaerr(r14)		// get TLEPAERR value
+        GET_ADDR(r0,0xff7f,r31) 		// get TLEPAERR mask value
+        and	r0, r8, r0			// compare
+        beq	r0, 7f				// check next case if nothing set
+        GET_ADDR(r0, 0xfffa, r31)		// set new MCHK code
+        br	r31, do_660			// setup new vector
+
+7:	ldl_p	r8, mchk_tlepderr(r14)		// get TLEPDERR value
+        GET_ADDR(r0,0x7,r31)			// get TLEPDERR mask value
+        and	r0, r8, r0			// compare
+        beq	r0, 8f				// check next case if nothing set
+        GET_ADDR(r0, 0xfffb, r31)		// set new MCHK code
+        br	r31, do_660			// setup new vector
+
+8:	ldl_p	r8, mchk_tlepmerr(r14)		// get TLEPMERR value
+        GET_ADDR(r0,0x3f,r31)			// get TLEPMERR mask value
+        and	r0, r8, r0			// compare
+        beq	r0, 9f				// check next case if nothing set
+        GET_ADDR(r0, 0xfffc, r31)		// set new MCHK code
+        br	r31, do_660			// setup new vector
+
+9:	ldq_p	r8, mchk_ei_stat(r14)		// get EI_STAT value
+        GET_ADDR(r0,0xb,r31)			// get EI_STAT mask value
+        sll	r0, 32, r0			// shift to upper lw
+        and	r0, r8, r0			// compare
+        beq	r0, 1f				// check next case if nothing set
+        GET_ADDR(r0,0xfffd,r31) 		// set new MCHK code
+        br	r31, do_660			// setup new vector
+
+1:	ldl_p	r8, mchk_tlepaerr(r14)		// get TLEPAERR value
+        GET_ADDR(r0,0x80,r31)			// get TLEPAERR mask value
+        and	r0, r8, r0			// compare
+        beq	r0, cont_logout_frame		// no more cases; continue if nothing set
+        GET_ADDR(r0, 0xfffe, r31)		// set new MCHK code
+        br	r31, do_660			// setup new vector
+
+do_670:	lda	r8, scb_v_procmchk(r31)		// SCB vector
+        br	r31, do_6x0_cont
+do_660:	lda	r8, scb_v_sysmchk(r31)		// SCB vector
+do_6x0_cont:
+        sll	r8, 16, r8			// shift to proper position
+        mfpr	r1, pt_misc			// fetch current pt_misc
+        GET_ADDR(r4,0xffff, r31)		// mask for vector field
+        sll	r4, 16, r4			// shift to proper position
+        bic	r1, r4, r1			// clear out old vector field
+        or	r1, r8, r1			// merge in new vector
+        mtpr	r1, pt_misc			// save new vector field
+        stl_p	r0, mchk_mchk_code(r14)		// save new mchk code
+
+cont_logout_frame:
+        // Restore some GPRs from PALtemps
+        mfpr	r0, pt0
+        mfpr	r1, pt1
+        mfpr	r4, pt4
+
+        mfpr	r12, pt10			// fetch original PC
+        blbs	r12, sys_machine_check_while_in_pal	// MCHK halt if machine check in pal
+
+//XXXbugnion        pvc_jsr armc, bsr=1
+        bsr     r12, sys_arith_and_mchk     	// go check for and deal with arith trap
+
+        mtpr	r31, exc_sum			// Clear Exception Summary
+
+        mfpr	r25, pt10			// write exc_addr after arith_and_mchk to pickup new pc
+        stq_p	r25, mchk_exc_addr(r14)
+
+        //
+        // Set up the km trap
+        //
+
+
+sys_post_mchk_trap:
+        mfpr	r25, pt_misc		// Check for flag from mchk interrupt
+        extwl	r25, 4, r25
+        blbs	r25, sys_mchk_stack_done // Stack frame already pushed if from interrupt flow
+
+        bis	r14, r31, r12		// stash pointer to logout area
+        mfpr	r14, pt10		// get exc_addr
+
+        sll	r11, 63-3, r25		// get mode to msb
+        bge	r25, 3f
+
+        mtpr	r31, dtb_cm
+        mtpr	r31, ev5__ps
+
+        mtpr	r30, pt_usp		// save user stack
+        mfpr	r30, pt_ksp
+
+3:
+        lda	sp, 0-osfsf_c_size(sp)	// allocate stack space
+        nop
+
+        stq	r18, osfsf_a2(sp) 	// a2
+        stq	r11, osfsf_ps(sp)	// save ps
+
+        stq	r14, osfsf_pc(sp)	// save pc
+        mfpr	r25, pt_entint		// get the VA of the interrupt routine
+
+        stq	r16, osfsf_a0(sp)	// a0
+        lda	r16, osfint_c_mchk(r31)	// flag as mchk in a0
+
+        stq	r17, osfsf_a1(sp)	// a1
+        mfpr	r17, pt_misc		// get vector
+
+        stq	r29, osfsf_gp(sp) 	// old gp
+        mtpr	r25, exc_addr		//
+
+        or	r31, 7, r11		// get new ps (km, high ipl)
+        subq	r31, 1, r18		// get a -1
+
+        extwl	r17, 2, r17		// a1 <- interrupt vector
+        bis	r31, ipl_machine_check, r25
+
+        mtpr	r25, ipl		// Set internal ipl
+        srl    	r18, 42, r18          	// shift off low bits of kseg addr
+
+        sll    	r18, 42, r18          	// shift back into position
+        mfpr	r29, pt_kgp		// get the kern r29
+
+        or    	r12, r18, r18          	// EV4 algorithm - pass pointer to mchk frame as kseg address
+        hw_rei_spe			// out to interrupt dispatch routine
+
+
+        //
+        // The stack is pushed.  Load up a0,a1,a2 and vector via entInt
+        //
+        //
+        ALIGN_BRANCH
+sys_mchk_stack_done:
+        lda	r16, osfint_c_mchk(r31)	// flag as mchk/crd in a0
+        lda	r17, scb_v_sysmchk(r31) // a1 <- interrupt vector
+
+        subq    r31, 1, r18            // get a -1
+        mfpr	r25, pt_entInt
+
+        srl     r18, 42, r18           // shift off low bits of kseg addr
+        mtpr	r25, exc_addr		// load interrupt vector
+
+        sll     r18, 42, r18           // shift back into position
+        or    	r14, r18, r18           // EV4 algorithm - pass pointer to mchk frame as kseg address
+
+        hw_rei_spe			// done
+
+
+        ALIGN_BRANCH
+sys_cpu_mchk_not_retryable:
+        mfpr	r6, pt_misc
+        extwl	r6, 4, r6				// Fetch mchk code
+        br	r31,  sys_mchk_write_logout_frame	//
+
+
+
+//
+//sys_double_machine_check - a machine check was started, but MCES<MCHK> was
+//	already set.  We will now double machine check halt.
+//
+//	pt0 - old R0
+//
+//
+
+EXPORT(sys_double_machine_check)
+        lda	r0, hlt_c_dbl_mchk(r31)
+        br	r31, sys_enter_console
+
+//
+// sys_machine_check_while_in_pal - a machine check was started,
+//	exc_addr points to a PAL PC.  We will now machine check halt.
+//
+//	pt0 - old R0
+//
+//
+sys_machine_check_while_in_pal:
+        stq_p	r12, mchk_exc_addr(r14)		// exc_addr has not yet been written
+        lda	r0, hlt_c_mchk_from_pal(r31)
+        br	r31, sys_enter_console
+
+
+//ARITH and MCHK
+//  Check for arithmetic errors and build trap frame,
+//  but don't post the trap.
+//  on entry:
+//	pt10 - exc_addr
+//	r12  - return address
+//	r14  - logout frame pointer
+//	r13 - available
+//	r8,r9,r10 - available except across stq's
+//	pt0,1,6 - available
+//
+//  on exit:
+//	pt10 - new exc_addr
+//	r17 = exc_mask
+//	r16 = exc_sum
+//	r14 - logout frame pointer
+//
+        ALIGN_BRANCH
+sys_arith_and_mchk:
+        mfpr	r13, ev5__exc_sum
+        srl	r13, exc_sum_v_swc, r13
+        bne	r13, handle_arith_and_mchk
+
+// XXX bugnion        pvc$jsr armc, bsr=1, dest=1
+        ret     r31, (r12)              // return if no outstanding arithmetic error
+
+handle_arith_and_mchk:
+        mtpr    r31, ev5__dtb_cm        // Set Mbox current mode to kernel
+                                        //     no virt ref for next 2 cycles
+        mtpr	r14, pt0
+
+        mtpr	r1, pt1			// get a scratch reg
+        and     r11, osfps_m_mode, r1 // get mode bit
+
+        bis     r11, r31, r25           // save ps
+        beq     r1, 1f                 // if zero we are in kern now
+
+        bis     r31, r31, r25           // set the new ps
+        mtpr    r30, pt_usp             // save user stack
+
+        mfpr    r30, pt_ksp             // get kern stack
+1:
+        mfpr    r14, exc_addr           // get pc into r14 in case stack writes fault
+
+        lda     sp, 0-osfsf_c_size(sp)  // allocate stack space
+        mtpr    r31, ev5__ps            // Set Ibox current mode to kernel
+
+        mfpr    r1, pt_entArith
+        stq     r14, osfsf_pc(sp)       // save pc
+
+        stq     r17, osfsf_a1(sp)
+        mfpr    r17, ev5__exc_mask      // Get exception register mask IPR - no mtpr exc_sum in next cycle
+
+        stq     r29, osfsf_gp(sp)
+        stq     r16, osfsf_a0(sp)       // save regs
+
+        bis	r13, r31, r16		// move exc_sum to r16
+        stq     r18, osfsf_a2(sp)
+
+        stq     r11, osfsf_ps(sp)       // save ps
+        mfpr    r29, pt_kgp             // get the kern gp
+
+        mfpr	r14, pt0		// restore logout frame pointer from pt0
+        bis     r25, r31, r11           // set new ps
+
+        mtpr    r1, pt10		// Set new PC
+        mfpr	r1, pt1
+
+// XXX bugnion        pvc$jsr armc, bsr=1, dest=1
+        ret     r31, (r12)              // return to caller; trap frame is built but not posted
+
+
+
+// sys_enter_console - Common PALcode for ENTERING console
+//
+// Entry:
+//	Entered when PAL wants to enter the console,
+//	usually as the result of a HALT instruction or button,
+//	or catastrophic error.
+//
+// Regs on entry...
+//
+//	R0 	= halt code
+//	pt0	<- r0
+//
+// Function:
+//
+//	Save all readable machine state, and "call" the console
+//
+// Returns:
+//
+//
+// Notes:
+//
+//	In these routines, once the save state routine has been executed,
+//	the remainder of the registers become scratchable, as the only
+//	"valid" copy of them is the "saved" copy.
+//
+//	Any registers or PTs that are modified before calling the save
+//	routine will have their data lost. The code below will save all
+//	state, but will lose pt 0,4,5.
+//
+//
+
+        ALIGN_BLOCK
+EXPORT(sys_enter_console)
+        mtpr	r1, pt4
+        mtpr	r3, pt5
+        subq	r31, 1, r1
+        sll	r1, 42, r1
+        ldah	r1, 1(r1)
+
+        /* taken from scrmax, seems like the obvious thing to do */
+        mtpr	r1, exc_addr
+        mfpr	r1, pt4
+        mfpr	r3, pt5
+        STALL
+        STALL
+        hw_rei_stall
+
+
+//
+// sys_exit_console - Common PALcode for EXITING console
+//
+// Entry:
+//	Entered when console wants to reenter PAL,
+//	usually as the result of a CONTINUE.
+//
+//
+// Regs on entry...
+//
+//
+// Function:
+//
+//	Restore all readable machine state, and return to user code.
+//
+//
+//
+//
+        ALIGN_BLOCK
+sys_exit_console:
+
+        GET_IMPURE(r1)
+
+        // clear lock and intr_flags prior to leaving console
+        rc	r31			// clear intr_flag
+        // lock flag cleared by restore_state
+        // TB's have been flushed
+
+        ldq_p	r3, (cns_gpr+(8*3))(r1)		// restore r3
+        ldq_p	r1, (cns_gpr+8)(r1)		// restore r1
+        hw_rei_stall				// back to user
+
+
+// kludge_initial_pcbb - PCB for Boot use only
+
+        ALIGN_128
+.globl kludge_initial_pcbb
+kludge_initial_pcbb:			// PCB is 128 bytes long
+        nop
+        nop
+        nop
+        nop
+
+        nop
+        nop
+        nop
+        nop
+
+        nop
+        nop
+        nop
+        nop
+
+        nop
+        nop
+        nop
+        nop
+
+
+// SET_SC_BC_CTL subroutine
+//
+// Subroutine to set the SC_CTL, BC_CONFIG, and BC_CTL registers and
+// flush the Scache
+// There must be no outstanding memory references -- istream or
+// dstream -- when these registers are written.  EV5 prefetcher is
+// difficult to turn off.  So, this routine needs to be exactly 32
+// instructions long; the final jmp must be in the last octaword of a
+// page (prefetcher doesn't go across page)
+//
+//
+// Register expectations:
+//	r0	base address of CBOX iprs
+//      r5      value to set sc_ctl to (flush bit is added in)
+//      r6      value to set bc_ctl to
+//	r7	value to set bc_config to
+//	r10	return address
+// 	r19     old sc_ctl value
+// 	r20	old value of bc_ctl
+//	r21	old value of bc_config
+//	r23	flush scache flag
+// Register usage:
+//      r17     sc_ctl with flush bit cleared
+//	r22	loop address
+//
+//
+set_sc_bc_ctl:
+        ret	r31, (r10)		// return to where we came from
diff --git a/tests/long/00.gzip/ref/x86/linux/o3-timing/config.ini b/tests/long/00.gzip/ref/x86/linux/o3-timing/config.ini
index 503c61f1c7..9a2e60122c 100644
--- a/tests/long/00.gzip/ref/x86/linux/o3-timing/config.ini
+++ b/tests/long/00.gzip/ref/x86/linux/o3-timing/config.ini
@@ -488,7 +488,7 @@ type=ExeTracer
 [system.cpu.workload]
 type=LiveProcess
 cmd=gzip input.log 1
-cwd=build/X86_SE/tests/fast/long/00.gzip/x86/linux/o3-timing
+cwd=build/X86_SE/tests/opt/long/00.gzip/x86/linux/o3-timing
 egid=100
 env=
 errout=cerr
diff --git a/tests/long/00.gzip/ref/x86/linux/o3-timing/simout b/tests/long/00.gzip/ref/x86/linux/o3-timing/simout
index 3dbb4b0b40..d3a2b5cda1 100755
--- a/tests/long/00.gzip/ref/x86/linux/o3-timing/simout
+++ b/tests/long/00.gzip/ref/x86/linux/o3-timing/simout
@@ -5,11 +5,11 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:32:13
+M5 compiled Feb 12 2011 02:22:23
+M5 revision 5e76f9de6972 7961 default qtip tip x86branchdetectstats.patch
+M5 started Feb 12 2011 02:22:27
 M5 executing on burrito
-command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/long/00.gzip/x86/linux/o3-timing -re tests/run.py build/X86_SE/tests/fast/long/00.gzip/x86/linux/o3-timing
+command line: build/X86_SE/m5.opt -d build/X86_SE/tests/opt/long/00.gzip/x86/linux/o3-timing -re tests/run.py build/X86_SE/tests/opt/long/00.gzip/x86/linux/o3-timing
 Global frequency set at 1000000000000 ticks per second
 info: Entering event queue @ 0.  Starting simulation...
 spec_init
@@ -1067,4 +1067,4 @@ Uncompressing Data
 Uncompressed data 1048576 bytes in length
 Uncompressed data compared correctly
 Tested 1MB buffer: OK!
-Exiting @ tick 772390499500 because target called exit()
+Exiting @ tick 766217705000 because target called exit()
diff --git a/tests/long/00.gzip/ref/x86/linux/o3-timing/stats.txt b/tests/long/00.gzip/ref/x86/linux/o3-timing/stats.txt
index 05b37528bf..cc548bebc8 100644
--- a/tests/long/00.gzip/ref/x86/linux/o3-timing/stats.txt
+++ b/tests/long/00.gzip/ref/x86/linux/o3-timing/stats.txt
@@ -1,41 +1,41 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                 168346                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 232444                       # Number of bytes of host memory used
-host_seconds                                  9631.89                       # Real time elapsed on the host
-host_tick_rate                               80190939                       # Simulator tick rate (ticks/s)
+host_inst_rate                                 123498                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 236748                       # Number of bytes of host memory used
+host_seconds                                 13129.74                       # Real time elapsed on the host
+host_tick_rate                               58357436                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                  1621493982                       # Number of instructions simulated
-sim_seconds                                  0.772390                       # Number of seconds simulated
-sim_ticks                                772390499500                       # Number of ticks simulated
+sim_seconds                                  0.766218                       # Number of seconds simulated
+sim_ticks                                766217705000                       # Number of ticks simulated
 system.cpu.BPredUnit.BTBCorrect                     0                       # Number of correct BTB predictions (this stat may not work properly.
-system.cpu.BPredUnit.BTBHits                126254885                       # Number of BTB hits
-system.cpu.BPredUnit.BTBLookups             126894033                       # Number of BTB lookups
+system.cpu.BPredUnit.BTBHits                169776992                       # Number of BTB hits
+system.cpu.BPredUnit.BTBLookups             171183773                       # Number of BTB lookups
 system.cpu.BPredUnit.RASInCorrect                   0                       # Number of incorrect RAS predictions.
-system.cpu.BPredUnit.condIncorrect            5933287                       # Number of conditional branches incorrect
-system.cpu.BPredUnit.condPredicted          126894073                       # Number of conditional branches predicted
-system.cpu.BPredUnit.lookups                126894073                       # Number of BP lookups
+system.cpu.BPredUnit.condIncorrect            8003535                       # Number of conditional branches incorrect
+system.cpu.BPredUnit.condPredicted          180455810                       # Number of conditional branches predicted
+system.cpu.BPredUnit.lookups                180455810                       # Number of BP lookups
 system.cpu.BPredUnit.usedRAS                        0                       # Number of times the RAS was used to get a target.
 system.cpu.commit.COM:branches              107161579                       # Number of branches committed
-system.cpu.commit.COM:bw_lim_events           3710402                       # number cycles where commit BW limit reached
+system.cpu.commit.COM:bw_lim_events           7534042                       # number cycles where commit BW limit reached
 system.cpu.commit.COM:bw_limited                    0                       # number of insts not committed due to BW limits
-system.cpu.commit.COM:committed_per_cycle::samples   1511501895                       # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::mean     1.072770                       # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::stdev     1.173458                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::samples   1432274296                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::mean     1.132111                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::stdev     1.344268                       # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::underflows            0      0.00%      0.00% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::0    505879323     33.47%     33.47% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::1    677452709     44.82%     78.29% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::2    153213861     10.14%     88.43% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::3    112394621      7.44%     95.86% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::4     32585093      2.16%     98.02% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::5     19016713      1.26%     99.27% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::6      5421676      0.36%     99.63% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::7      1827497      0.12%     99.75% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::8      3710402      0.25%    100.00% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::0    536173455     37.44%     37.44% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::1    547306108     38.21%     75.65% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::2    130197340      9.09%     84.74% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::3    136647601      9.54%     94.28% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::4     42821104      2.99%     97.27% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::5     22915800      1.60%     98.87% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::6      3037283      0.21%     99.08% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::7      5641563      0.39%     99.47% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::8      7534042      0.53%    100.00% # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::overflows            0      0.00%    100.00% # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::min_value            0                       # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::max_value            8                       # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::total   1511501895                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::total   1432274296                       # Number of insts commited each cycle
 system.cpu.commit.COM:count                1621493982                       # Number of instructions committed
 system.cpu.commit.COM:fp_insts                      0                       # Number of committed floating point instructions.
 system.cpu.commit.COM:function_calls                0                       # Number of function calls committed.
@@ -44,422 +44,422 @@ system.cpu.commit.COM:loads                 419042125                       # Nu
 system.cpu.commit.COM:membars                       0                       # Number of memory barriers committed
 system.cpu.commit.COM:refs                  607228182                       # Number of memory references committed
 system.cpu.commit.COM:swp_count                     0                       # Number of s/w prefetches committed
-system.cpu.commit.branchMispredicts           5933318                       # The number of times a branch was mispredicted
+system.cpu.commit.branchMispredicts           8003567                       # The number of times a branch was mispredicted
 system.cpu.commit.commitCommittedInsts     1621493982                       # The number of committed instructions
 system.cpu.commit.commitNonSpecStalls              50                       # The number of times commit has been forced to stall to communicate backwards
-system.cpu.commit.commitSquashedInsts       227874068                       # The number of squashed insts skipped by commit
+system.cpu.commit.commitSquashedInsts       729601482                       # The number of squashed insts skipped by commit
 system.cpu.committedInsts                  1621493982                       # Number of Instructions Simulated
 system.cpu.committedInsts_total            1621493982                       # Number of Instructions Simulated
-system.cpu.cpi                               0.952690                       # CPI: Cycles Per Instruction
-system.cpu.cpi_total                         0.952690                       # CPI: Total CPI of All Threads
-system.cpu.dcache.ReadReq_accesses          326327666                       # number of ReadReq accesses(hits+misses)
-system.cpu.dcache.ReadReq_avg_miss_latency 10363.748203                       # average ReadReq miss latency
-system.cpu.dcache.ReadReq_avg_mshr_miss_latency  7391.735933                       # average ReadReq mshr miss latency
-system.cpu.dcache.ReadReq_hits              326125265                       # number of ReadReq hits
-system.cpu.dcache.ReadReq_miss_latency     2097633000                       # number of ReadReq miss cycles
-system.cpu.dcache.ReadReq_miss_rate          0.000620                       # miss rate for ReadReq accesses
-system.cpu.dcache.ReadReq_misses               202401                       # number of ReadReq misses
-system.cpu.dcache.ReadReq_mshr_hits              1725                       # number of ReadReq MSHR hits
-system.cpu.dcache.ReadReq_mshr_miss_latency   1483344000                       # number of ReadReq MSHR miss cycles
-system.cpu.dcache.ReadReq_mshr_miss_rate     0.000615                       # mshr miss rate for ReadReq accesses
-system.cpu.dcache.ReadReq_mshr_misses          200676                       # number of ReadReq MSHR misses
+system.cpu.cpi                               0.945076                       # CPI: Cycles Per Instruction
+system.cpu.cpi_total                         0.945076                       # CPI: Total CPI of All Threads
+system.cpu.dcache.ReadReq_accesses          330979138                       # number of ReadReq accesses(hits+misses)
+system.cpu.dcache.ReadReq_avg_miss_latency 10103.492713                       # average ReadReq miss latency
+system.cpu.dcache.ReadReq_avg_mshr_miss_latency  7153.561618                       # average ReadReq mshr miss latency
+system.cpu.dcache.ReadReq_hits              330761084                       # number of ReadReq hits
+system.cpu.dcache.ReadReq_miss_latency     2203107000                       # number of ReadReq miss cycles
+system.cpu.dcache.ReadReq_miss_rate          0.000659                       # miss rate for ReadReq accesses
+system.cpu.dcache.ReadReq_misses               218054                       # number of ReadReq misses
+system.cpu.dcache.ReadReq_mshr_hits              3264                       # number of ReadReq MSHR hits
+system.cpu.dcache.ReadReq_mshr_miss_latency   1536513500                       # number of ReadReq MSHR miss cycles
+system.cpu.dcache.ReadReq_mshr_miss_rate     0.000649                       # mshr miss rate for ReadReq accesses
+system.cpu.dcache.ReadReq_mshr_misses          214790                       # number of ReadReq MSHR misses
 system.cpu.dcache.WriteReq_accesses         188186057                       # number of WriteReq accesses(hits+misses)
-system.cpu.dcache.WriteReq_avg_miss_latency 19667.198248                       # average WriteReq miss latency
-system.cpu.dcache.WriteReq_avg_mshr_miss_latency 10021.451346                       # average WriteReq mshr miss latency
-system.cpu.dcache.WriteReq_hits             186945733                       # number of WriteReq hits
-system.cpu.dcache.WriteReq_miss_latency   24393698000                       # number of WriteReq miss cycles
-system.cpu.dcache.WriteReq_miss_rate         0.006591                       # miss rate for WriteReq accesses
-system.cpu.dcache.WriteReq_misses             1240324                       # number of WriteReq misses
-system.cpu.dcache.WriteReq_mshr_hits           994745                       # number of WriteReq MSHR hits
-system.cpu.dcache.WriteReq_mshr_miss_latency   2461058000                       # number of WriteReq MSHR miss cycles
-system.cpu.dcache.WriteReq_mshr_miss_rate     0.001305                       # mshr miss rate for WriteReq accesses
-system.cpu.dcache.WriteReq_mshr_misses         245579                       # number of WriteReq MSHR misses
+system.cpu.dcache.WriteReq_avg_miss_latency 19459.417847                       # average WriteReq miss latency
+system.cpu.dcache.WriteReq_avg_mshr_miss_latency 10004.386505                       # average WriteReq mshr miss latency
+system.cpu.dcache.WriteReq_hits             186948986                       # number of WriteReq hits
+system.cpu.dcache.WriteReq_miss_latency   24072681495                       # number of WriteReq miss cycles
+system.cpu.dcache.WriteReq_miss_rate         0.006574                       # miss rate for WriteReq accesses
+system.cpu.dcache.WriteReq_misses             1237071                       # number of WriteReq misses
+system.cpu.dcache.WriteReq_mshr_hits           986986                       # number of WriteReq MSHR hits
+system.cpu.dcache.WriteReq_mshr_miss_latency   2501946999                       # number of WriteReq MSHR miss cycles
+system.cpu.dcache.WriteReq_mshr_miss_rate     0.001329                       # mshr miss rate for WriteReq accesses
+system.cpu.dcache.WriteReq_mshr_misses         250085                       # number of WriteReq MSHR misses
 system.cpu.dcache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
-system.cpu.dcache.avg_blocked_cycles::no_targets 15789.833755                       # average number of cycles each access was blocked
-system.cpu.dcache.avg_refs                1149.728625                       # Average number of references to valid blocks.
+system.cpu.dcache.avg_blocked_cycles::no_targets 16007.596007                       # average number of cycles each access was blocked
+system.cpu.dcache.avg_refs                1113.654359                       # Average number of references to valid blocks.
 system.cpu.dcache.blocked::no_mshrs                 0                       # number of cycles access was blocked
-system.cpu.dcache.blocked::no_targets           29234                       # number of cycles access was blocked
+system.cpu.dcache.blocked::no_targets           29555                       # number of cycles access was blocked
 system.cpu.dcache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
-system.cpu.dcache.blocked_cycles::no_targets    461600000                       # number of cycles access was blocked
+system.cpu.dcache.blocked_cycles::no_targets    473104500                       # number of cycles access was blocked
 system.cpu.dcache.cache_copies                      0                       # number of cache copies performed
-system.cpu.dcache.demand_accesses           514513723                       # number of demand (read+write) accesses
-system.cpu.dcache.demand_avg_miss_latency 18362.010085                       # average overall miss latency
-system.cpu.dcache.demand_avg_mshr_miss_latency  8838.897043                       # average overall mshr miss latency
-system.cpu.dcache.demand_hits               513070998                       # number of demand (read+write) hits
-system.cpu.dcache.demand_miss_latency     26491331000                       # number of demand (read+write) miss cycles
-system.cpu.dcache.demand_miss_rate           0.002804                       # miss rate for demand accesses
-system.cpu.dcache.demand_misses               1442725                       # number of demand (read+write) misses
-system.cpu.dcache.demand_mshr_hits             996470                       # number of demand (read+write) MSHR hits
-system.cpu.dcache.demand_mshr_miss_latency   3944402000                       # number of demand (read+write) MSHR miss cycles
-system.cpu.dcache.demand_mshr_miss_rate      0.000867                       # mshr miss rate for demand accesses
-system.cpu.dcache.demand_mshr_misses           446255                       # number of demand (read+write) MSHR misses
+system.cpu.dcache.demand_accesses           519165195                       # number of demand (read+write) accesses
+system.cpu.dcache.demand_avg_miss_latency 18057.409841                       # average overall miss latency
+system.cpu.dcache.demand_avg_mshr_miss_latency  8687.196556                       # average overall mshr miss latency
+system.cpu.dcache.demand_hits               517710070                       # number of demand (read+write) hits
+system.cpu.dcache.demand_miss_latency     26275788495                       # number of demand (read+write) miss cycles
+system.cpu.dcache.demand_miss_rate           0.002803                       # miss rate for demand accesses
+system.cpu.dcache.demand_misses               1455125                       # number of demand (read+write) misses
+system.cpu.dcache.demand_mshr_hits             990250                       # number of demand (read+write) MSHR hits
+system.cpu.dcache.demand_mshr_miss_latency   4038460499                       # number of demand (read+write) MSHR miss cycles
+system.cpu.dcache.demand_mshr_miss_rate      0.000895                       # mshr miss rate for demand accesses
+system.cpu.dcache.demand_mshr_misses           464875                       # number of demand (read+write) MSHR misses
 system.cpu.dcache.fast_writes                       0                       # number of fast writes performed
 system.cpu.dcache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.dcache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.dcache.occ_%::0                   0.999781                       # Average percentage of cache occupancy
-system.cpu.dcache.occ_blocks::0           4095.101758                       # Average occupied blocks per context
-system.cpu.dcache.overall_accesses          514513723                       # number of overall (read+write) accesses
-system.cpu.dcache.overall_avg_miss_latency 18362.010085                       # average overall miss latency
-system.cpu.dcache.overall_avg_mshr_miss_latency  8838.897043                       # average overall mshr miss latency
+system.cpu.dcache.occ_%::0                   0.999796                       # Average percentage of cache occupancy
+system.cpu.dcache.occ_blocks::0           4095.162912                       # Average occupied blocks per context
+system.cpu.dcache.overall_accesses          519165195                       # number of overall (read+write) accesses
+system.cpu.dcache.overall_avg_miss_latency 18057.409841                       # average overall miss latency
+system.cpu.dcache.overall_avg_mshr_miss_latency  8687.196556                       # average overall mshr miss latency
 system.cpu.dcache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.dcache.overall_hits              513070998                       # number of overall hits
-system.cpu.dcache.overall_miss_latency    26491331000                       # number of overall miss cycles
-system.cpu.dcache.overall_miss_rate          0.002804                       # miss rate for overall accesses
-system.cpu.dcache.overall_misses              1442725                       # number of overall misses
-system.cpu.dcache.overall_mshr_hits            996470                       # number of overall MSHR hits
-system.cpu.dcache.overall_mshr_miss_latency   3944402000                       # number of overall MSHR miss cycles
-system.cpu.dcache.overall_mshr_miss_rate     0.000867                       # mshr miss rate for overall accesses
-system.cpu.dcache.overall_mshr_misses          446255                       # number of overall MSHR misses
+system.cpu.dcache.overall_hits              517710070                       # number of overall hits
+system.cpu.dcache.overall_miss_latency    26275788495                       # number of overall miss cycles
+system.cpu.dcache.overall_miss_rate          0.002803                       # miss rate for overall accesses
+system.cpu.dcache.overall_misses              1455125                       # number of overall misses
+system.cpu.dcache.overall_mshr_hits            990250                       # number of overall MSHR hits
+system.cpu.dcache.overall_mshr_miss_latency   4038460499                       # number of overall MSHR miss cycles
+system.cpu.dcache.overall_mshr_miss_rate     0.000895                       # mshr miss rate for overall accesses
+system.cpu.dcache.overall_mshr_misses          464875                       # number of overall MSHR misses
 system.cpu.dcache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.dcache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.cpu.dcache.replacements                 442158                       # number of replacements
-system.cpu.dcache.sampled_refs                 446254                       # Sample count of references to valid blocks.
+system.cpu.dcache.replacements                 460779                       # number of replacements
+system.cpu.dcache.sampled_refs                 464875                       # Sample count of references to valid blocks.
 system.cpu.dcache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.dcache.tagsinuse               4095.101758                       # Cycle average of tags in use
-system.cpu.dcache.total_refs                513070998                       # Total number of references to valid blocks.
-system.cpu.dcache.warmup_cycle              331552000                       # Cycle when the warmup percentage was hit.
-system.cpu.dcache.writebacks                   398281                       # number of writebacks
-system.cpu.decode.DECODE:BlockedCycles      176333648                       # Number of cycles decode is blocked
-system.cpu.decode.DECODE:DecodedInsts      1886463332                       # Number of instructions handled by decode
-system.cpu.decode.DECODE:IdleCycles         320369444                       # Number of cycles decode is idle
-system.cpu.decode.DECODE:RunCycles          981528406                       # Number of cycles decode is running
-system.cpu.decode.DECODE:SquashCycles        33063147                       # Number of cycles decode is squashing
-system.cpu.decode.DECODE:UnblockCycles       33270397                       # Number of cycles decode is unblocking
-system.cpu.fetch.Branches                   126894073                       # Number of branches that fetch encountered
-system.cpu.fetch.CacheLines                 119630706                       # Number of cache lines fetched
-system.cpu.fetch.Cycles                    1056772647                       # Number of cycles fetch has run and was not squashing or blocked
-system.cpu.fetch.IcacheSquashes                432705                       # Number of outstanding Icache misses that were squashed
-system.cpu.fetch.Insts                     1026147627                       # Number of instructions fetch has processed
-system.cpu.fetch.MiscStallCycles                   46                       # Number of cycles fetch has spent waiting on interrupts, or bad addresses, or out of MSHRs
-system.cpu.fetch.SquashCycles                 9324994                       # Number of cycles fetch has spent squashing
-system.cpu.fetch.branchRate                  0.082144                       # Number of branch fetches per cycle
-system.cpu.fetch.icacheStallCycles          119630706                       # Number of cycles fetch is stalled on an Icache miss
-system.cpu.fetch.predictedBranches          126254885                       # Number of branches that fetch has predicted taken
-system.cpu.fetch.rate                        0.664267                       # Number of inst fetches per cycle
-system.cpu.fetch.rateDist::samples         1544565042                       # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::mean              1.230490                       # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::stdev             1.292215                       # Number of instructions fetched each cycle (Total)
+system.cpu.dcache.tagsinuse               4095.162912                       # Cycle average of tags in use
+system.cpu.dcache.total_refs                517710070                       # Total number of references to valid blocks.
+system.cpu.dcache.warmup_cycle              317835000                       # Cycle when the warmup percentage was hit.
+system.cpu.dcache.writebacks                   411288                       # number of writebacks
+system.cpu.decode.DECODE:BlockedCycles      610366395                       # Number of cycles decode is blocked
+system.cpu.decode.DECODE:DecodedInsts      2477699501                       # Number of instructions handled by decode
+system.cpu.decode.DECODE:IdleCycles         436378814                       # Number of cycles decode is idle
+system.cpu.decode.DECODE:RunCycles          330621598                       # Number of cycles decode is running
+system.cpu.decode.DECODE:SquashCycles        99870091                       # Number of cycles decode is squashing
+system.cpu.decode.DECODE:UnblockCycles       54907489                       # Number of cycles decode is unblocking
+system.cpu.fetch.Branches                   180455810                       # Number of branches that fetch encountered
+system.cpu.fetch.CacheLines                 168863429                       # Number of cache lines fetched
+system.cpu.fetch.Cycles                     400342229                       # Number of cycles fetch has run and was not squashing or blocked
+system.cpu.fetch.IcacheSquashes                931185                       # Number of outstanding Icache misses that were squashed
+system.cpu.fetch.Insts                     1404767222                       # Number of instructions fetch has processed
+system.cpu.fetch.MiscStallCycles                   49                       # Number of cycles fetch has spent waiting on interrupts, or bad addresses, or out of MSHRs
+system.cpu.fetch.SquashCycles                14936403                       # Number of cycles fetch has spent squashing
+system.cpu.fetch.branchRate                  0.117758                       # Number of branch fetches per cycle
+system.cpu.fetch.icacheStallCycles          168863429                       # Number of cycles fetch is stalled on an Icache miss
+system.cpu.fetch.predictedBranches          169776992                       # Number of branches that fetch has predicted taken
+system.cpu.fetch.rate                        0.916689                       # Number of inst fetches per cycle
+system.cpu.fetch.rateDist::samples         1532144387                       # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::mean              1.666939                       # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::stdev             3.038798                       # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::underflows               0      0.00%      0.00% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::0                522111775     33.80%     33.80% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::1                496583342     32.15%     65.95% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::2                273451194     17.70%     83.66% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::3                224891951     14.56%     98.22% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::4                  8280335      0.54%     98.75% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::5                  1557581      0.10%     98.85% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::6                      722      0.00%     98.85% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::7                     8665      0.00%     98.86% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::8                 17679477      1.14%    100.00% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::0               1134818986     74.07%     74.07% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::1                 25831687      1.69%     75.75% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::2                 14383456      0.94%     76.69% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::3                 13631087      0.89%     77.58% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::4                 30570437      2.00%     79.58% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::5                 20250642      1.32%     80.90% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::6                 34285955      2.24%     83.14% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::7                 37728615      2.46%     85.60% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::8                220643522     14.40%    100.00% # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::overflows                0      0.00%    100.00% # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::min_value                0                       # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::max_value                8                       # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::total           1544565042                       # Number of instructions fetched each cycle (Total)
-system.cpu.fp_regfile_reads                         2                       # number of floating regfile reads
-system.cpu.icache.ReadReq_accesses          119630706                       # number of ReadReq accesses(hits+misses)
-system.cpu.icache.ReadReq_avg_miss_latency 37171.926007                       # average ReadReq miss latency
-system.cpu.icache.ReadReq_avg_mshr_miss_latency 35433.712121                       # average ReadReq mshr miss latency
-system.cpu.icache.ReadReq_hits              119629787                       # number of ReadReq hits
-system.cpu.icache.ReadReq_miss_latency       34161000                       # number of ReadReq miss cycles
-system.cpu.icache.ReadReq_miss_rate          0.000008                       # miss rate for ReadReq accesses
-system.cpu.icache.ReadReq_misses                  919                       # number of ReadReq misses
-system.cpu.icache.ReadReq_mshr_hits               127                       # number of ReadReq MSHR hits
-system.cpu.icache.ReadReq_mshr_miss_latency     28063500                       # number of ReadReq MSHR miss cycles
-system.cpu.icache.ReadReq_mshr_miss_rate     0.000007                       # mshr miss rate for ReadReq accesses
-system.cpu.icache.ReadReq_mshr_misses             792                       # number of ReadReq MSHR misses
+system.cpu.fetch.rateDist::total           1532144387                       # Number of instructions fetched each cycle (Total)
+system.cpu.fp_regfile_reads                        12                       # number of floating regfile reads
+system.cpu.icache.ReadReq_accesses          168863429                       # number of ReadReq accesses(hits+misses)
+system.cpu.icache.ReadReq_avg_miss_latency 34706.050695                       # average ReadReq miss latency
+system.cpu.icache.ReadReq_avg_mshr_miss_latency 35310.841984                       # average ReadReq mshr miss latency
+system.cpu.icache.ReadReq_hits              168862206                       # number of ReadReq hits
+system.cpu.icache.ReadReq_miss_latency       42445500                       # number of ReadReq miss cycles
+system.cpu.icache.ReadReq_miss_rate          0.000007                       # miss rate for ReadReq accesses
+system.cpu.icache.ReadReq_misses                 1223                       # number of ReadReq misses
+system.cpu.icache.ReadReq_mshr_hits               356                       # number of ReadReq MSHR hits
+system.cpu.icache.ReadReq_mshr_miss_latency     30614500                       # number of ReadReq MSHR miss cycles
+system.cpu.icache.ReadReq_mshr_miss_rate     0.000005                       # mshr miss rate for ReadReq accesses
+system.cpu.icache.ReadReq_mshr_misses             867                       # number of ReadReq MSHR misses
 system.cpu.icache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.icache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.icache.avg_refs               151047.710859                       # Average number of references to valid blocks.
+system.cpu.icache.avg_refs               194766.096886                       # Average number of references to valid blocks.
 system.cpu.icache.blocked::no_mshrs                 0                       # number of cycles access was blocked
 system.cpu.icache.blocked::no_targets               0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.icache.cache_copies                      0                       # number of cache copies performed
-system.cpu.icache.demand_accesses           119630706                       # number of demand (read+write) accesses
-system.cpu.icache.demand_avg_miss_latency 37171.926007                       # average overall miss latency
-system.cpu.icache.demand_avg_mshr_miss_latency 35433.712121                       # average overall mshr miss latency
-system.cpu.icache.demand_hits               119629787                       # number of demand (read+write) hits
-system.cpu.icache.demand_miss_latency        34161000                       # number of demand (read+write) miss cycles
-system.cpu.icache.demand_miss_rate           0.000008                       # miss rate for demand accesses
-system.cpu.icache.demand_misses                   919                       # number of demand (read+write) misses
-system.cpu.icache.demand_mshr_hits                127                       # number of demand (read+write) MSHR hits
-system.cpu.icache.demand_mshr_miss_latency     28063500                       # number of demand (read+write) MSHR miss cycles
-system.cpu.icache.demand_mshr_miss_rate      0.000007                       # mshr miss rate for demand accesses
-system.cpu.icache.demand_mshr_misses              792                       # number of demand (read+write) MSHR misses
+system.cpu.icache.demand_accesses           168863429                       # number of demand (read+write) accesses
+system.cpu.icache.demand_avg_miss_latency 34706.050695                       # average overall miss latency
+system.cpu.icache.demand_avg_mshr_miss_latency 35310.841984                       # average overall mshr miss latency
+system.cpu.icache.demand_hits               168862206                       # number of demand (read+write) hits
+system.cpu.icache.demand_miss_latency        42445500                       # number of demand (read+write) miss cycles
+system.cpu.icache.demand_miss_rate           0.000007                       # miss rate for demand accesses
+system.cpu.icache.demand_misses                  1223                       # number of demand (read+write) misses
+system.cpu.icache.demand_mshr_hits                356                       # number of demand (read+write) MSHR hits
+system.cpu.icache.demand_mshr_miss_latency     30614500                       # number of demand (read+write) MSHR miss cycles
+system.cpu.icache.demand_mshr_miss_rate      0.000005                       # mshr miss rate for demand accesses
+system.cpu.icache.demand_mshr_misses              867                       # number of demand (read+write) MSHR misses
 system.cpu.icache.fast_writes                       0                       # number of fast writes performed
 system.cpu.icache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.icache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.icache.occ_%::0                   0.352078                       # Average percentage of cache occupancy
-system.cpu.icache.occ_blocks::0            721.055018                       # Average occupied blocks per context
-system.cpu.icache.overall_accesses          119630706                       # number of overall (read+write) accesses
-system.cpu.icache.overall_avg_miss_latency 37171.926007                       # average overall miss latency
-system.cpu.icache.overall_avg_mshr_miss_latency 35433.712121                       # average overall mshr miss latency
+system.cpu.icache.occ_%::0                   0.386137                       # Average percentage of cache occupancy
+system.cpu.icache.occ_blocks::0            790.808810                       # Average occupied blocks per context
+system.cpu.icache.overall_accesses          168863429                       # number of overall (read+write) accesses
+system.cpu.icache.overall_avg_miss_latency 34706.050695                       # average overall miss latency
+system.cpu.icache.overall_avg_mshr_miss_latency 35310.841984                       # average overall mshr miss latency
 system.cpu.icache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.icache.overall_hits              119629787                       # number of overall hits
-system.cpu.icache.overall_miss_latency       34161000                       # number of overall miss cycles
-system.cpu.icache.overall_miss_rate          0.000008                       # miss rate for overall accesses
-system.cpu.icache.overall_misses                  919                       # number of overall misses
-system.cpu.icache.overall_mshr_hits               127                       # number of overall MSHR hits
-system.cpu.icache.overall_mshr_miss_latency     28063500                       # number of overall MSHR miss cycles
-system.cpu.icache.overall_mshr_miss_rate     0.000007                       # mshr miss rate for overall accesses
-system.cpu.icache.overall_mshr_misses             792                       # number of overall MSHR misses
+system.cpu.icache.overall_hits              168862206                       # number of overall hits
+system.cpu.icache.overall_miss_latency       42445500                       # number of overall miss cycles
+system.cpu.icache.overall_miss_rate          0.000007                       # miss rate for overall accesses
+system.cpu.icache.overall_misses                 1223                       # number of overall misses
+system.cpu.icache.overall_mshr_hits               356                       # number of overall MSHR hits
+system.cpu.icache.overall_mshr_miss_latency     30614500                       # number of overall MSHR miss cycles
+system.cpu.icache.overall_mshr_miss_rate     0.000005                       # mshr miss rate for overall accesses
+system.cpu.icache.overall_mshr_misses             867                       # number of overall MSHR misses
 system.cpu.icache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.icache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.cpu.icache.replacements                      4                       # number of replacements
-system.cpu.icache.sampled_refs                    792                       # Sample count of references to valid blocks.
+system.cpu.icache.replacements                     11                       # number of replacements
+system.cpu.icache.sampled_refs                    867                       # Sample count of references to valid blocks.
 system.cpu.icache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.icache.tagsinuse                721.055018                       # Cycle average of tags in use
-system.cpu.icache.total_refs                119629787                       # Total number of references to valid blocks.
+system.cpu.icache.tagsinuse                790.808810                       # Cycle average of tags in use
+system.cpu.icache.total_refs                168862206                       # Total number of references to valid blocks.
 system.cpu.icache.warmup_cycle                      0                       # Cycle when the warmup percentage was hit.
 system.cpu.icache.writebacks                        0                       # number of writebacks
-system.cpu.idleCycles                          215958                       # Total number of cycles that the CPU has spent unscheduled due to idling
-system.cpu.iew.EXEC:branches                108586362                       # Number of branches executed
+system.cpu.idleCycles                          291024                       # Total number of cycles that the CPU has spent unscheduled due to idling
+system.cpu.iew.EXEC:branches                111314295                       # Number of branches executed
 system.cpu.iew.EXEC:nop                             0                       # number of nop insts executed
-system.cpu.iew.EXEC:rate                     1.090888                       # Inst execution rate
-system.cpu.iew.EXEC:refs                    624680336                       # number of memory reference insts executed
-system.cpu.iew.EXEC:stores                  190102881                       # Number of stores executed
+system.cpu.iew.EXEC:rate                     1.203312                       # Inst execution rate
+system.cpu.iew.EXEC:refs                    636104355                       # number of memory reference insts executed
+system.cpu.iew.EXEC:stores                  191312994                       # Number of stores executed
 system.cpu.iew.EXEC:swp                             0                       # number of swp insts executed
-system.cpu.iew.WB:consumers                2506292363                       # num instructions consuming a value
-system.cpu.iew.WB:count                    1680860111                       # cumulative count of insts written-back
-system.cpu.iew.WB:fanout                     0.529936                       # average fanout of values written-back
+system.cpu.iew.WB:consumers                2089450315                       # num instructions consuming a value
+system.cpu.iew.WB:count                    1839101566                       # cumulative count of insts written-back
+system.cpu.iew.WB:fanout                     0.684612                       # average fanout of values written-back
 system.cpu.iew.WB:penalized                         0                       # number of instrctions required to write to 'other' IQ
 system.cpu.iew.WB:penalized_rate                    0                       # fraction of instructions written-back that wrote to 'other' IQ
-system.cpu.iew.WB:producers                1328173821                       # num instructions producing a value
-system.cpu.iew.WB:rate                       1.088090                       # insts written-back per cycle
-system.cpu.iew.WB:sent                     1681411195                       # cumulative count of insts sent to commit
-system.cpu.iew.branchMispredicts              6122546                       # Number of branch mispredicts detected at execute
-system.cpu.iew.iewBlockCycles                 1253236                       # Number of cycles IEW is blocking
-system.cpu.iew.iewDispLoadInsts             492554241                       # Number of dispatched load instructions
-system.cpu.iew.iewDispNonSpecInsts                 66                       # Number of dispatched non-speculative instructions
-system.cpu.iew.iewDispSquashedInsts           3215387                       # Number of squashed instructions skipped by dispatch
-system.cpu.iew.iewDispStoreInsts            210212351                       # Number of dispatched store instructions
-system.cpu.iew.iewDispatchedInsts          1849358863                       # Number of instructions dispatched to IQ
-system.cpu.iew.iewExecLoadInsts             434577455                       # Number of load instructions executed
-system.cpu.iew.iewExecSquashedInsts           8332046                       # Number of squashed instructions skipped in execute
-system.cpu.iew.iewExecutedInsts            1685183738                       # Number of executed instructions
-system.cpu.iew.iewIQFullEvents                  18939                       # Number of times the IQ has become full, causing a stall
+system.cpu.iew.WB:producers                1430463261                       # num instructions producing a value
+system.cpu.iew.WB:rate                       1.200117                       # insts written-back per cycle
+system.cpu.iew.WB:sent                     1842290775                       # cumulative count of insts sent to commit
+system.cpu.iew.branchMispredicts              8145736                       # Number of branch mispredicts detected at execute
+system.cpu.iew.iewBlockCycles                 1415270                       # Number of cycles IEW is blocking
+system.cpu.iew.iewDispLoadInsts             617903270                       # Number of dispatched load instructions
+system.cpu.iew.iewDispNonSpecInsts                 78                       # Number of dispatched non-speculative instructions
+system.cpu.iew.iewDispSquashedInsts            633937                       # Number of squashed instructions skipped by dispatch
+system.cpu.iew.iewDispStoreInsts            251132554                       # Number of dispatched store instructions
+system.cpu.iew.iewDispatchedInsts          2351086206                       # Number of instructions dispatched to IQ
+system.cpu.iew.iewExecLoadInsts             444791361                       # Number of load instructions executed
+system.cpu.iew.iewExecSquashedInsts          11969895                       # Number of squashed instructions skipped in execute
+system.cpu.iew.iewExecutedInsts            1843997360                       # Number of executed instructions
+system.cpu.iew.iewIQFullEvents                  60905                       # Number of times the IQ has become full, causing a stall
 system.cpu.iew.iewIdleCycles                        0                       # Number of cycles IEW is idle
-system.cpu.iew.iewLSQFullEvents                     0                       # Number of times the LSQ has become full, causing a stall
-system.cpu.iew.iewSquashCycles               33063147                       # Number of cycles IEW is squashing
-system.cpu.iew.iewUnblockCycles                 72665                       # Number of cycles IEW is unblocking
+system.cpu.iew.iewLSQFullEvents                     2                       # Number of times the LSQ has become full, causing a stall
+system.cpu.iew.iewSquashCycles               99870091                       # Number of cycles IEW is squashing
+system.cpu.iew.iewUnblockCycles                117847                       # Number of cycles IEW is unblocking
 system.cpu.iew.lsq.thread.0.blockedLoads            0                       # Number of blocked loads due to partial load-store forwarding
-system.cpu.iew.lsq.thread.0.cacheBlocked        29234                       # Number of times an access to memory failed due to the cache being blocked
-system.cpu.iew.lsq.thread.0.forwLoads       108234700                       # Number of loads that had data forwarded from stores
-system.cpu.iew.lsq.thread.0.ignoredResponses        16690                       # Number of memory responses ignored because the instruction is squashed
+system.cpu.iew.lsq.thread.0.cacheBlocked        29753                       # Number of times an access to memory failed due to the cache being blocked
+system.cpu.iew.lsq.thread.0.forwLoads       113796852                       # Number of loads that had data forwarded from stores
+system.cpu.iew.lsq.thread.0.ignoredResponses         8470                       # Number of memory responses ignored because the instruction is squashed
 system.cpu.iew.lsq.thread.0.invAddrLoads            0                       # Number of loads ignored due to an invalid address
 system.cpu.iew.lsq.thread.0.invAddrSwpfs            0                       # Number of software prefetches ignored due to an invalid address
-system.cpu.iew.lsq.thread.0.memOrderViolation      3968261                       # Number of memory ordering violations
-system.cpu.iew.lsq.thread.0.rescheduledLoads           13                       # Number of loads that were rescheduled
-system.cpu.iew.lsq.thread.0.squashedLoads     73512116                       # Number of loads squashed
-system.cpu.iew.lsq.thread.0.squashedStores     22026294                       # Number of stores squashed
-system.cpu.iew.memOrderViolationEvents        3968261                       # Number of memory order violations
-system.cpu.iew.predictedNotTakenIncorrect         2078                       # Number of branches that were predicted not taken incorrectly
-system.cpu.iew.predictedTakenIncorrect        6120468                       # Number of branches that were predicted taken incorrectly
-system.cpu.int_regfile_reads               4148897019                       # number of integer regfile reads
-system.cpu.int_regfile_writes              1677631671                       # number of integer regfile writes
-system.cpu.ipc                               1.049659                       # IPC: Instructions Per Cycle
-system.cpu.ipc_total                         1.049659                       # IPC: Total IPC of All Threads
-system.cpu.iq.ISSUE:FU_type_0::No_OpClass     24157467      1.43%      1.43% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::IntAlu      1040578234     61.44%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::IntMult              0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::IntDiv               0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatAdd             0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatCmp             0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatCvt             0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatMult            0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatDiv             0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatSqrt            0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdAdd              0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdAddAcc            0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdAlu              0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdCmp              0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdCvt              0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdMisc             0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdMult             0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdMultAcc            0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdShift            0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdShiftAcc            0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdSqrt             0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatAdd            0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatAlu            0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatCmp            0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatCvt            0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatDiv            0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatMisc            0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatMult            0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatMultAcc            0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatSqrt            0      0.00%     62.87% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::MemRead      438214492     25.88%     88.75% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::MemWrite     190565591     11.25%    100.00% # Type of FU issued
+system.cpu.iew.lsq.thread.0.memOrderViolation      6921754                       # Number of memory ordering violations
+system.cpu.iew.lsq.thread.0.rescheduledLoads           21                       # Number of loads that were rescheduled
+system.cpu.iew.lsq.thread.0.squashedLoads    198861145                       # Number of loads squashed
+system.cpu.iew.lsq.thread.0.squashedStores     62946497                       # Number of stores squashed
+system.cpu.iew.memOrderViolationEvents        6921754                       # Number of memory order violations
+system.cpu.iew.predictedNotTakenIncorrect      3700861                       # Number of branches that were predicted not taken incorrectly
+system.cpu.iew.predictedTakenIncorrect        4444875                       # Number of branches that were predicted taken incorrectly
+system.cpu.int_regfile_reads               3233304065                       # number of integer regfile reads
+system.cpu.int_regfile_writes              1832324218                       # number of integer regfile writes
+system.cpu.ipc                               1.058116                       # IPC: Instructions Per Cycle
+system.cpu.ipc_total                         1.058116                       # IPC: Total IPC of All Threads
+system.cpu.iq.ISSUE:FU_type_0::No_OpClass     27128947      1.46%      1.46% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::IntAlu      1186880889     63.95%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::IntMult              0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::IntDiv               0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatAdd             0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatCmp             0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatCvt             0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatMult            0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatDiv             0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatSqrt            0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdAdd              0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdAddAcc            0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdAlu              0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdCmp              0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdCvt              0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdMisc             0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdMult             0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdMultAcc            0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdShift            0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdShiftAcc            0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdSqrt             0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatAdd            0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatAlu            0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatCmp            0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatCvt            0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatDiv            0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatMisc            0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatMult            0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatMultAcc            0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatSqrt            0      0.00%     65.41% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::MemRead      450365179     24.27%     89.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::MemWrite     191592240     10.32%    100.00% # Type of FU issued
 system.cpu.iq.ISSUE:FU_type_0::IprAccess            0      0.00%    100.00% # Type of FU issued
 system.cpu.iq.ISSUE:FU_type_0::InstPrefetch            0      0.00%    100.00% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::total       1693515784                       # Type of FU issued
-system.cpu.iq.ISSUE:fu_busy_cnt                252744                       # FU busy when requested
-system.cpu.iq.ISSUE:fu_busy_rate             0.000149                       # FU busy rate (busy events/executed inst)
+system.cpu.iq.ISSUE:FU_type_0::total       1855967255                       # Type of FU issued
+system.cpu.iq.ISSUE:fu_busy_cnt               4437489                       # FU busy when requested
+system.cpu.iq.ISSUE:fu_busy_rate             0.002391                       # FU busy rate (busy events/executed inst)
 system.cpu.iq.ISSUE:fu_full::No_OpClass             0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::IntAlu                40      0.02%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::IntMult                0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::IntDiv                 0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatAdd               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatCmp               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatCvt               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatMult              0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatDiv               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatSqrt              0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdAdd                0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdAddAcc             0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdAlu                0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdCmp                0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdCvt                0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdMisc               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdMult               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdMultAcc            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdShift              0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdShiftAcc            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdSqrt               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatAdd            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatAlu            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatCmp            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatCvt            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatDiv            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatMisc            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatMult            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatMultAcc            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatSqrt            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::MemRead           250833     99.24%     99.26% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::MemWrite            1871      0.74%    100.00% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::IntAlu            118316      2.67%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::IntMult                0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::IntDiv                 0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatAdd               0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatCmp               0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatCvt               0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatMult              0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatDiv               0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatSqrt              0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdAdd                0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdAddAcc             0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdAlu                0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdCmp                0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdCvt                0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdMisc               0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdMult               0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdMultAcc            0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdShift              0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdShiftAcc            0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdSqrt               0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatAdd            0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatAlu            0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatCmp            0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatCvt            0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatDiv            0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatMisc            0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatMult            0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatMultAcc            0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatSqrt            0      0.00%      2.67% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::MemRead          3486899     78.58%     81.24% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::MemWrite          832274     18.76%    100.00% # attempts to use FU when none available
 system.cpu.iq.ISSUE:fu_full::IprAccess              0      0.00%    100.00% # attempts to use FU when none available
 system.cpu.iq.ISSUE:fu_full::InstPrefetch            0      0.00%    100.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:issued_per_cycle::samples   1544565042                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::mean     1.096435                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::stdev     0.983023                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::samples   1532144387                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::mean     1.211353                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::stdev     1.177271                       # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::underflows            0      0.00%      0.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::0     454758636     29.44%     29.44% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::1     667103033     43.19%     72.63% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::2     281275831     18.21%     90.84% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::3     105166888      6.81%     97.65% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::4      33264638      2.15%     99.81% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::5       2679834      0.17%     99.98% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::6        311387      0.02%    100.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::7          3979      0.00%    100.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::8           816      0.00%    100.00% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::0     466354124     30.44%     30.44% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::1     601647548     39.27%     69.71% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::2     244545222     15.96%     85.67% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::3     139808763      9.13%     94.79% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::4      60228260      3.93%     98.72% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::5      13792665      0.90%     99.62% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::6       4627487      0.30%     99.93% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::7        960857      0.06%     99.99% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::8        179461      0.01%    100.00% # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::overflows            0      0.00%    100.00% # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::min_value            0                       # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::max_value            8                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::total   1544565042                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:rate                     1.096282                       # Inst issue rate
-system.cpu.iq.fp_alu_accesses                       4                       # Number of floating point alu accesses
-system.cpu.iq.fp_inst_queue_reads                   8                       # Number of floating instruction queue reads
-system.cpu.iq.fp_inst_queue_wakeup_accesses            2                       # Number of floating instruction queue wakeup accesses
-system.cpu.iq.fp_inst_queue_writes                  8                       # Number of floating instruction queue writes
-system.cpu.iq.int_alu_accesses             1669611057                       # Number of integer alu accesses
-system.cpu.iq.int_inst_queue_reads         4931850619                       # Number of integer instruction queue reads
-system.cpu.iq.int_inst_queue_wakeup_accesses   1680860109                       # Number of integer instruction queue wakeup accesses
-system.cpu.iq.int_inst_queue_writes        2080058032                       # Number of integer instruction queue writes
-system.cpu.iq.iqInstsAdded                 1849358797                       # Number of instructions added to the IQ (excludes non-spec)
-system.cpu.iq.iqInstsIssued                1693515784                       # Number of instructions issued
-system.cpu.iq.iqNonSpecInstsAdded                  66                       # Number of non-speculative instructions added to the IQ
-system.cpu.iq.iqSquashedInstsExamined       226765112                       # Number of squashed instructions iterated over during squash; mainly for profiling
-system.cpu.iq.iqSquashedInstsIssued              1273                       # Number of squashed instructions issued
-system.cpu.iq.iqSquashedNonSpecRemoved             16                       # Number of squashed non-spec instructions that were removed
-system.cpu.iq.iqSquashedOperandsExamined    584800312                       # Number of squashed operands that are examined and possibly removed from graph
-system.cpu.l2cache.ReadExReq_accesses          245580                       # number of ReadExReq accesses(hits+misses)
-system.cpu.l2cache.ReadExReq_avg_miss_latency 34276.926221                       # average ReadExReq miss latency
-system.cpu.l2cache.ReadExReq_avg_mshr_miss_latency 31075.745964                       # average ReadExReq mshr miss latency
-system.cpu.l2cache.ReadExReq_hits              186864                       # number of ReadExReq hits
-system.cpu.l2cache.ReadExReq_miss_latency   2012604000                       # number of ReadExReq miss cycles
-system.cpu.l2cache.ReadExReq_miss_rate       0.239091                       # miss rate for ReadExReq accesses
-system.cpu.l2cache.ReadExReq_misses             58716                       # number of ReadExReq misses
-system.cpu.l2cache.ReadExReq_mshr_miss_latency   1824643500                       # number of ReadExReq MSHR miss cycles
-system.cpu.l2cache.ReadExReq_mshr_miss_rate     0.239091                       # mshr miss rate for ReadExReq accesses
-system.cpu.l2cache.ReadExReq_mshr_misses        58716                       # number of ReadExReq MSHR misses
-system.cpu.l2cache.ReadReq_accesses            201467                       # number of ReadReq accesses(hits+misses)
-system.cpu.l2cache.ReadReq_avg_miss_latency 34133.939861                       # average ReadReq miss latency
-system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 31003.577487                       # average ReadReq mshr miss latency
-system.cpu.l2cache.ReadReq_hits                169042                       # number of ReadReq hits
-system.cpu.l2cache.ReadReq_miss_latency    1106793000                       # number of ReadReq miss cycles
-system.cpu.l2cache.ReadReq_miss_rate         0.160944                       # miss rate for ReadReq accesses
-system.cpu.l2cache.ReadReq_misses               32425                       # number of ReadReq misses
-system.cpu.l2cache.ReadReq_mshr_miss_latency   1005291000                       # number of ReadReq MSHR miss cycles
-system.cpu.l2cache.ReadReq_mshr_miss_rate     0.160944                       # mshr miss rate for ReadReq accesses
-system.cpu.l2cache.ReadReq_mshr_misses          32425                       # number of ReadReq MSHR misses
-system.cpu.l2cache.Writeback_accesses          398281                       # number of Writeback accesses(hits+misses)
-system.cpu.l2cache.Writeback_hits              398281                       # number of Writeback hits
+system.cpu.iq.ISSUE:issued_per_cycle::total   1532144387                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:rate                     1.211123                       # Inst issue rate
+system.cpu.iq.fp_alu_accesses                      18                       # Number of floating point alu accesses
+system.cpu.iq.fp_inst_queue_reads                  33                       # Number of floating instruction queue reads
+system.cpu.iq.fp_inst_queue_wakeup_accesses           12                       # Number of floating instruction queue wakeup accesses
+system.cpu.iq.fp_inst_queue_writes                 32                       # Number of floating instruction queue writes
+system.cpu.iq.int_alu_accesses             1833275779                       # Number of integer alu accesses
+system.cpu.iq.int_inst_queue_reads         5248603279                       # Number of integer instruction queue reads
+system.cpu.iq.int_inst_queue_wakeup_accesses   1839101554                       # Number of integer instruction queue wakeup accesses
+system.cpu.iq.int_inst_queue_writes        3087460502                       # Number of integer instruction queue writes
+system.cpu.iq.iqInstsAdded                 2351086128                       # Number of instructions added to the IQ (excludes non-spec)
+system.cpu.iq.iqInstsIssued                1855967255                       # Number of instructions issued
+system.cpu.iq.iqNonSpecInstsAdded                  78                       # Number of non-speculative instructions added to the IQ
+system.cpu.iq.iqSquashedInstsExamined       729454588                       # Number of squashed instructions iterated over during squash; mainly for profiling
+system.cpu.iq.iqSquashedInstsIssued             86926                       # Number of squashed instructions issued
+system.cpu.iq.iqSquashedNonSpecRemoved             28                       # Number of squashed non-spec instructions that were removed
+system.cpu.iq.iqSquashedOperandsExamined   1543114171                       # Number of squashed operands that are examined and possibly removed from graph
+system.cpu.l2cache.ReadExReq_accesses          250094                       # number of ReadExReq accesses(hits+misses)
+system.cpu.l2cache.ReadExReq_avg_miss_latency 34363.888228                       # average ReadExReq miss latency
+system.cpu.l2cache.ReadExReq_avg_mshr_miss_latency 31092.455043                       # average ReadExReq mshr miss latency
+system.cpu.l2cache.ReadExReq_hits              191260                       # number of ReadExReq hits
+system.cpu.l2cache.ReadExReq_miss_latency   2021765000                       # number of ReadExReq miss cycles
+system.cpu.l2cache.ReadExReq_miss_rate       0.235248                       # miss rate for ReadExReq accesses
+system.cpu.l2cache.ReadExReq_misses             58834                       # number of ReadExReq misses
+system.cpu.l2cache.ReadExReq_mshr_miss_latency   1829293500                       # number of ReadExReq MSHR miss cycles
+system.cpu.l2cache.ReadExReq_mshr_miss_rate     0.235248                       # mshr miss rate for ReadExReq accesses
+system.cpu.l2cache.ReadExReq_mshr_misses        58834                       # number of ReadExReq MSHR misses
+system.cpu.l2cache.ReadReq_accesses            215648                       # number of ReadReq accesses(hits+misses)
+system.cpu.l2cache.ReadReq_avg_miss_latency 34134.880348                       # average ReadReq miss latency
+system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 31005.967489                       # average ReadReq mshr miss latency
+system.cpu.l2cache.ReadReq_hits                182552                       # number of ReadReq hits
+system.cpu.l2cache.ReadReq_miss_latency    1129728000                       # number of ReadReq miss cycles
+system.cpu.l2cache.ReadReq_miss_rate         0.153472                       # miss rate for ReadReq accesses
+system.cpu.l2cache.ReadReq_misses               33096                       # number of ReadReq misses
+system.cpu.l2cache.ReadReq_mshr_miss_latency   1026173500                       # number of ReadReq MSHR miss cycles
+system.cpu.l2cache.ReadReq_mshr_miss_rate     0.153472                       # mshr miss rate for ReadReq accesses
+system.cpu.l2cache.ReadReq_mshr_misses          33096                       # number of ReadReq MSHR misses
+system.cpu.l2cache.Writeback_accesses          411288                       # number of Writeback accesses(hits+misses)
+system.cpu.l2cache.Writeback_hits              411288                       # number of Writeback hits
 system.cpu.l2cache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.l2cache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.l2cache.avg_refs                  4.844642                       # Average number of references to valid blocks.
+system.cpu.l2cache.avg_refs                  5.099303                       # Average number of references to valid blocks.
 system.cpu.l2cache.blocked::no_mshrs                0                       # number of cycles access was blocked
 system.cpu.l2cache.blocked::no_targets              0                       # number of cycles access was blocked
 system.cpu.l2cache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.l2cache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.l2cache.cache_copies                     0                       # number of cache copies performed
-system.cpu.l2cache.demand_accesses             447047                       # number of demand (read+write) accesses
-system.cpu.l2cache.demand_avg_miss_latency 34226.056330                       # average overall miss latency
-system.cpu.l2cache.demand_avg_mshr_miss_latency 31050.070769                       # average overall mshr miss latency
-system.cpu.l2cache.demand_hits                 355906                       # number of demand (read+write) hits
-system.cpu.l2cache.demand_miss_latency     3119397000                       # number of demand (read+write) miss cycles
-system.cpu.l2cache.demand_miss_rate          0.203873                       # miss rate for demand accesses
-system.cpu.l2cache.demand_misses                91141                       # number of demand (read+write) misses
+system.cpu.l2cache.demand_accesses             465742                       # number of demand (read+write) accesses
+system.cpu.l2cache.demand_avg_miss_latency 34281.442402                       # average overall miss latency
+system.cpu.l2cache.demand_avg_mshr_miss_latency 31061.318394                       # average overall mshr miss latency
+system.cpu.l2cache.demand_hits                 373812                       # number of demand (read+write) hits
+system.cpu.l2cache.demand_miss_latency     3151493000                       # number of demand (read+write) miss cycles
+system.cpu.l2cache.demand_miss_rate          0.197384                       # miss rate for demand accesses
+system.cpu.l2cache.demand_misses                91930                       # number of demand (read+write) misses
 system.cpu.l2cache.demand_mshr_hits                 0                       # number of demand (read+write) MSHR hits
-system.cpu.l2cache.demand_mshr_miss_latency   2829934500                       # number of demand (read+write) MSHR miss cycles
-system.cpu.l2cache.demand_mshr_miss_rate     0.203873                       # mshr miss rate for demand accesses
-system.cpu.l2cache.demand_mshr_misses           91141                       # number of demand (read+write) MSHR misses
+system.cpu.l2cache.demand_mshr_miss_latency   2855467000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.l2cache.demand_mshr_miss_rate     0.197384                       # mshr miss rate for demand accesses
+system.cpu.l2cache.demand_mshr_misses           91930                       # number of demand (read+write) MSHR misses
 system.cpu.l2cache.fast_writes                      0                       # number of fast writes performed
 system.cpu.l2cache.mshr_cap_events                  0                       # number of times MSHR cap was activated
 system.cpu.l2cache.no_allocate_misses               0                       # Number of misses that were no-allocate
-system.cpu.l2cache.occ_%::0                  0.058867                       # Average percentage of cache occupancy
-system.cpu.l2cache.occ_%::1                  0.490866                       # Average percentage of cache occupancy
-system.cpu.l2cache.occ_blocks::0          1928.938344                       # Average occupied blocks per context
-system.cpu.l2cache.occ_blocks::1         16084.711341                       # Average occupied blocks per context
-system.cpu.l2cache.overall_accesses            447047                       # number of overall (read+write) accesses
-system.cpu.l2cache.overall_avg_miss_latency 34226.056330                       # average overall miss latency
-system.cpu.l2cache.overall_avg_mshr_miss_latency 31050.070769                       # average overall mshr miss latency
+system.cpu.l2cache.occ_%::0                  0.059053                       # Average percentage of cache occupancy
+system.cpu.l2cache.occ_%::1                  0.491352                       # Average percentage of cache occupancy
+system.cpu.l2cache.occ_blocks::0          1935.054426                       # Average occupied blocks per context
+system.cpu.l2cache.occ_blocks::1         16100.609355                       # Average occupied blocks per context
+system.cpu.l2cache.overall_accesses            465742                       # number of overall (read+write) accesses
+system.cpu.l2cache.overall_avg_miss_latency 34281.442402                       # average overall miss latency
+system.cpu.l2cache.overall_avg_mshr_miss_latency 31061.318394                       # average overall mshr miss latency
 system.cpu.l2cache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.l2cache.overall_hits                355906                       # number of overall hits
-system.cpu.l2cache.overall_miss_latency    3119397000                       # number of overall miss cycles
-system.cpu.l2cache.overall_miss_rate         0.203873                       # miss rate for overall accesses
-system.cpu.l2cache.overall_misses               91141                       # number of overall misses
+system.cpu.l2cache.overall_hits                373812                       # number of overall hits
+system.cpu.l2cache.overall_miss_latency    3151493000                       # number of overall miss cycles
+system.cpu.l2cache.overall_miss_rate         0.197384                       # miss rate for overall accesses
+system.cpu.l2cache.overall_misses               91930                       # number of overall misses
 system.cpu.l2cache.overall_mshr_hits                0                       # number of overall MSHR hits
-system.cpu.l2cache.overall_mshr_miss_latency   2829934500                       # number of overall MSHR miss cycles
-system.cpu.l2cache.overall_mshr_miss_rate     0.203873                       # mshr miss rate for overall accesses
-system.cpu.l2cache.overall_mshr_misses          91141                       # number of overall MSHR misses
+system.cpu.l2cache.overall_mshr_miss_latency   2855467000                       # number of overall MSHR miss cycles
+system.cpu.l2cache.overall_mshr_miss_rate     0.197384                       # mshr miss rate for overall accesses
+system.cpu.l2cache.overall_mshr_misses          91930                       # number of overall MSHR misses
 system.cpu.l2cache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.l2cache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.cpu.l2cache.replacements                 72873                       # number of replacements
-system.cpu.l2cache.sampled_refs                 88473                       # Sample count of references to valid blocks.
+system.cpu.l2cache.replacements                 73661                       # number of replacements
+system.cpu.l2cache.sampled_refs                 89262                       # Sample count of references to valid blocks.
 system.cpu.l2cache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.l2cache.tagsinuse             18013.649684                       # Cycle average of tags in use
-system.cpu.l2cache.total_refs                  428620                       # Total number of references to valid blocks.
+system.cpu.l2cache.tagsinuse             18035.663781                       # Cycle average of tags in use
+system.cpu.l2cache.total_refs                  455174                       # Total number of references to valid blocks.
 system.cpu.l2cache.warmup_cycle                     0                       # Cycle when the warmup percentage was hit.
-system.cpu.l2cache.writebacks                   58405                       # number of writebacks
-system.cpu.memDep0.conflictingLoads         289036318                       # Number of conflicting loads.
-system.cpu.memDep0.conflictingStores        113016383                       # Number of conflicting stores.
-system.cpu.memDep0.insertedLoads            492554241                       # Number of loads inserted to the mem dependence unit.
-system.cpu.memDep0.insertedStores           210212351                       # Number of stores inserted to the mem dependence unit.
-system.cpu.misc_regfile_reads               864820574                       # number of misc regfile reads
-system.cpu.numCycles                       1544781000                       # number of cpu cycles simulated
+system.cpu.l2cache.writebacks                   58542                       # number of writebacks
+system.cpu.memDep0.conflictingLoads         537232404                       # Number of conflicting loads.
+system.cpu.memDep0.conflictingStores        219207458                       # Number of conflicting stores.
+system.cpu.memDep0.insertedLoads            617903270                       # Number of loads inserted to the mem dependence unit.
+system.cpu.memDep0.insertedStores           251132554                       # Number of stores inserted to the mem dependence unit.
+system.cpu.misc_regfile_reads               931505074                       # number of misc regfile reads
+system.cpu.numCycles                       1532435411                       # number of cpu cycles simulated
 system.cpu.numWorkItemsCompleted                    0                       # number of work items this cpu completed
 system.cpu.numWorkItemsStarted                      0                       # number of work items this cpu started
-system.cpu.rename.RENAME:BlockCycles         55578139                       # Number of cycles rename is blocking
+system.cpu.rename.RENAME:BlockCycles        175534951                       # Number of cycles rename is blocking
 system.cpu.rename.RENAME:CommittedMaps     1617994650                       # Number of HB maps that are committed
-system.cpu.rename.RENAME:IQFullEvents        65710608                       # Number of times rename has blocked due to IQ full
-system.cpu.rename.RENAME:IdleCycles         361165681                       # Number of cycles rename is idle
-system.cpu.rename.RENAME:LSQFullEvents       36822801                       # Number of times rename has blocked due to LSQ full
-system.cpu.rename.RENAME:ROBFullEvents             16                       # Number of times rename has blocked due to ROB full
-system.cpu.rename.RENAME:RenameLookups     5668050381                       # Number of register rename lookups that rename has made
-system.cpu.rename.RENAME:RenamedInsts      1874385455                       # Number of instructions processed by rename
-system.cpu.rename.RENAME:RenamedOperands   1871676358                       # Number of destination operands rename has renamed
-system.cpu.rename.RENAME:RunCycles          968560202                       # Number of cycles rename is running
-system.cpu.rename.RENAME:SquashCycles        33063147                       # Number of cycles rename is squashing
-system.cpu.rename.RENAME:UnblockCycles      126195704                       # Number of cycles rename is unblocking
-system.cpu.rename.RENAME:UndoneMaps         253681708                       # Number of HB maps that are undone due to squashing
-system.cpu.rename.RENAME:fp_rename_lookups           32                       # Number of floating rename lookups
-system.cpu.rename.RENAME:int_rename_lookups   5668050349                       # Number of integer rename lookups
-system.cpu.rename.RENAME:serializeStallCycles         2169                       # count of cycles rename stalled for serializing inst
-system.cpu.rename.RENAME:serializingInsts           67                       # count of serializing insts renamed
-system.cpu.rename.RENAME:skidInsts          186996608                       # count of insts added to the skid buffer
-system.cpu.rename.RENAME:tempSerializingInsts           71                       # count of temporary serializing insts renamed
-system.cpu.rob.rob_reads                   3357159543                       # The number of ROB reads
-system.cpu.rob.rob_writes                  3732197477                       # The number of ROB writes
-system.cpu.timesIdled                           45108                       # Number of times that the entire CPU went into an idle state and unscheduled itself
+system.cpu.rename.RENAME:IQFullEvents       318243703                       # Number of times rename has blocked due to IQ full
+system.cpu.rename.RENAME:IdleCycles         499996104                       # Number of cycles rename is idle
+system.cpu.rename.RENAME:LSQFullEvents      107154792                       # Number of times rename has blocked due to LSQ full
+system.cpu.rename.RENAME:ROBFullEvents             44                       # Number of times rename has blocked due to ROB full
+system.cpu.rename.RENAME:RenameLookups     5827367622                       # Number of register rename lookups that rename has made
+system.cpu.rename.RENAME:RenamedInsts      2403532061                       # Number of instructions processed by rename
+system.cpu.rename.RENAME:RenamedOperands   2403383901                       # Number of destination operands rename has renamed
+system.cpu.rename.RENAME:RunCycles          306300874                       # Number of cycles rename is running
+system.cpu.rename.RENAME:SquashCycles        99870091                       # Number of cycles rename is squashing
+system.cpu.rename.RENAME:UnblockCycles      450439326                       # Number of cycles rename is unblocking
+system.cpu.rename.RENAME:UndoneMaps         785389251                       # Number of HB maps that are undone due to squashing
+system.cpu.rename.RENAME:fp_rename_lookups           96                       # Number of floating rename lookups
+system.cpu.rename.RENAME:int_rename_lookups   5827367526                       # Number of integer rename lookups
+system.cpu.rename.RENAME:serializeStallCycles         3041                       # count of cycles rename stalled for serializing inst
+system.cpu.rename.RENAME:serializingInsts           87                       # count of serializing insts renamed
+system.cpu.rename.RENAME:skidInsts          739921776                       # count of insts added to the skid buffer
+system.cpu.rename.RENAME:tempSerializingInsts           87                       # count of temporary serializing insts renamed
+system.cpu.rob.rob_reads                   3775835718                       # The number of ROB reads
+system.cpu.rob.rob_writes                  4802062478                       # The number of ROB writes
+system.cpu.timesIdled                           45517                       # Number of times that the entire CPU went into an idle state and unscheduled itself
 system.cpu.workload.PROG:num_syscalls              48                       # Number of system calls
 
 ---------- End Simulation Statistics   ----------
diff --git a/tests/long/00.gzip/ref/x86/linux/simple-atomic/simout b/tests/long/00.gzip/ref/x86/linux/simple-atomic/simout
index 1dd3bb0d2d..bb63956251 100755
--- a/tests/long/00.gzip/ref/x86/linux/simple-atomic/simout
+++ b/tests/long/00.gzip/ref/x86/linux/simple-atomic/simout
@@ -5,9 +5,9 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:38:48
+M5 compiled Feb  8 2011 00:58:32
+M5 revision 705a4d351a43 7939 default qtip resforflagsstats.patch tip
+M5 started Feb  8 2011 00:58:34
 M5 executing on burrito
 command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/long/00.gzip/x86/linux/simple-atomic -re tests/run.py build/X86_SE/tests/fast/long/00.gzip/x86/linux/simple-atomic
 Global frequency set at 1000000000000 ticks per second
diff --git a/tests/long/00.gzip/ref/x86/linux/simple-atomic/stats.txt b/tests/long/00.gzip/ref/x86/linux/simple-atomic/stats.txt
index ce8635d175..5b839ec88b 100644
--- a/tests/long/00.gzip/ref/x86/linux/simple-atomic/stats.txt
+++ b/tests/long/00.gzip/ref/x86/linux/simple-atomic/stats.txt
@@ -1,9 +1,9 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                1066510                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 223440                       # Number of bytes of host memory used
-host_seconds                                  1520.37                       # Real time elapsed on the host
-host_tick_rate                              634049597                       # Simulator tick rate (ticks/s)
+host_inst_rate                                2470310                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 224012                       # Number of bytes of host memory used
+host_seconds                                   656.39                       # Real time elapsed on the host
+host_tick_rate                             1468620897                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                  1621493983                       # Number of instructions simulated
 sim_seconds                                  0.963993                       # Number of seconds simulated
@@ -24,7 +24,7 @@ system.cpu.num_idle_cycles                          0                       # Nu
 system.cpu.num_insts                       1621493983                       # Number of instructions executed
 system.cpu.num_int_alu_accesses            1621354493                       # Number of integer alu accesses
 system.cpu.num_int_insts                   1621354493                       # number of integer instructions
-system.cpu.num_int_register_reads          4883555465                       # number of times the integer registers were read
+system.cpu.num_int_register_reads          3953866002                       # number of times the integer registers were read
 system.cpu.num_int_register_writes         1617994650                       # number of times the integer registers were written
 system.cpu.num_load_insts                   419042125                       # Number of load instructions
 system.cpu.num_mem_refs                     607228182                       # number of memory refs
diff --git a/tests/long/00.gzip/ref/x86/linux/simple-timing/simout b/tests/long/00.gzip/ref/x86/linux/simple-timing/simout
index 889c6868b8..9205746538 100755
--- a/tests/long/00.gzip/ref/x86/linux/simple-timing/simout
+++ b/tests/long/00.gzip/ref/x86/linux/simple-timing/simout
@@ -5,9 +5,9 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:32:35
+M5 compiled Feb  8 2011 00:58:32
+M5 revision 705a4d351a43 7939 default qtip resforflagsstats.patch tip
+M5 started Feb  8 2011 00:58:34
 M5 executing on burrito
 command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/long/00.gzip/x86/linux/simple-timing -re tests/run.py build/X86_SE/tests/fast/long/00.gzip/x86/linux/simple-timing
 Global frequency set at 1000000000000 ticks per second
diff --git a/tests/long/00.gzip/ref/x86/linux/simple-timing/stats.txt b/tests/long/00.gzip/ref/x86/linux/simple-timing/stats.txt
index 46400c9208..120240c595 100644
--- a/tests/long/00.gzip/ref/x86/linux/simple-timing/stats.txt
+++ b/tests/long/00.gzip/ref/x86/linux/simple-timing/stats.txt
@@ -1,9 +1,9 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                 685934                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 231240                       # Number of bytes of host memory used
-host_seconds                                  2363.92                       # Real time elapsed on the host
-host_tick_rate                              762824620                       # Simulator tick rate (ticks/s)
+host_inst_rate                                1667736                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 231728                       # Number of bytes of host memory used
+host_seconds                                   972.27                       # Real time elapsed on the host
+host_tick_rate                             1854683738                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                  1621493983                       # Number of instructions simulated
 sim_seconds                                  1.803259                       # Number of seconds simulated
@@ -213,7 +213,7 @@ system.cpu.num_idle_cycles                          0                       # Nu
 system.cpu.num_insts                       1621493983                       # Number of instructions executed
 system.cpu.num_int_alu_accesses            1621354493                       # Number of integer alu accesses
 system.cpu.num_int_insts                   1621354493                       # number of integer instructions
-system.cpu.num_int_register_reads          4883555465                       # number of times the integer registers were read
+system.cpu.num_int_register_reads          3953866002                       # number of times the integer registers were read
 system.cpu.num_int_register_writes         1617994650                       # number of times the integer registers were written
 system.cpu.num_load_insts                   419042125                       # Number of load instructions
 system.cpu.num_mem_refs                     607228182                       # number of memory refs
diff --git a/tests/long/10.linux-boot/ref/x86/linux/pc-simple-atomic/simout b/tests/long/10.linux-boot/ref/x86/linux/pc-simple-atomic/simout
index 1d53161472..30d3a70e12 100755
--- a/tests/long/10.linux-boot/ref/x86/linux/pc-simple-atomic/simout
+++ b/tests/long/10.linux-boot/ref/x86/linux/pc-simple-atomic/simout
@@ -5,13 +5,12 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 01:04:06
-M5 revision 8e058bca28fb 7927 default qtip tip x86fsstats.patch
-M5 started Feb  7 2011 01:04:09
+M5 compiled Feb  8 2011 00:58:27
+M5 revision 705a4d351a43 7939 default qtip resforflagsstats.patch tip
+M5 started Feb  8 2011 00:58:30
 M5 executing on burrito
-command line: build/X86_FS/m5.opt -d build/X86_FS/tests/opt/long/10.linux-boot/x86/linux/pc-simple-atomic -re tests/run.py build/X86_FS/tests/opt/long/10.linux-boot/x86/linux/pc-simple-atomic
+command line: build/X86_FS/m5.fast -d build/X86_FS/tests/fast/long/10.linux-boot/x86/linux/pc-simple-atomic -re tests/run.py build/X86_FS/tests/fast/long/10.linux-boot/x86/linux/pc-simple-atomic
 Global frequency set at 1000000000000 ticks per second
 info: kernel located at: /dist/m5/system/binaries/x86_64-vmlinux-2.6.22.9
-      0: rtc: Real-time clock set to Sun Jan  1 00:00:00 2012
 info: Entering event queue @ 0.  Starting simulation...
 Exiting @ tick 5112051463500 because m5_exit instruction encountered
diff --git a/tests/long/10.linux-boot/ref/x86/linux/pc-simple-atomic/stats.txt b/tests/long/10.linux-boot/ref/x86/linux/pc-simple-atomic/stats.txt
index 1cabd6a2df..113af673f1 100644
--- a/tests/long/10.linux-boot/ref/x86/linux/pc-simple-atomic/stats.txt
+++ b/tests/long/10.linux-boot/ref/x86/linux/pc-simple-atomic/stats.txt
@@ -1,9 +1,9 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                2329852                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 370744                       # Number of bytes of host memory used
-host_seconds                                   174.53                       # Real time elapsed on the host
-host_tick_rate                            29290692573                       # Simulator tick rate (ticks/s)
+host_inst_rate                                1892986                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 370804                       # Number of bytes of host memory used
+host_seconds                                   214.81                       # Real time elapsed on the host
+host_tick_rate                            23798444654                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                   406624453                       # Number of instructions simulated
 sim_seconds                                  5.112051                       # Number of seconds simulated
@@ -341,7 +341,7 @@ system.cpu.num_idle_cycles               9770620811.997942
 system.cpu.num_insts                        406624453                       # Number of instructions executed
 system.cpu.num_int_alu_accesses             391833833                       # Number of integer alu accesses
 system.cpu.num_int_insts                    391833833                       # number of integer instructions
-system.cpu.num_int_register_reads          1007515486                       # number of times the integer registers were read
+system.cpu.num_int_register_reads           836347867                       # number of times the integer registers were read
 system.cpu.num_int_register_writes          419160860                       # number of times the integer registers were written
 system.cpu.num_load_insts                    29720540                       # Number of load instructions
 system.cpu.num_mem_refs                      38133606                       # number of memory refs
diff --git a/tests/long/10.linux-boot/ref/x86/linux/pc-simple-timing/simout b/tests/long/10.linux-boot/ref/x86/linux/pc-simple-timing/simout
index 6d191e20f1..628b3cd613 100755
--- a/tests/long/10.linux-boot/ref/x86/linux/pc-simple-timing/simout
+++ b/tests/long/10.linux-boot/ref/x86/linux/pc-simple-timing/simout
@@ -5,13 +5,12 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 01:04:06
-M5 revision 8e058bca28fb 7927 default qtip tip x86fsstats.patch
-M5 started Feb  7 2011 01:04:09
+M5 compiled Feb  8 2011 00:58:27
+M5 revision 705a4d351a43 7939 default qtip resforflagsstats.patch tip
+M5 started Feb  8 2011 00:58:30
 M5 executing on burrito
-command line: build/X86_FS/m5.opt -d build/X86_FS/tests/opt/long/10.linux-boot/x86/linux/pc-simple-timing -re tests/run.py build/X86_FS/tests/opt/long/10.linux-boot/x86/linux/pc-simple-timing
+command line: build/X86_FS/m5.fast -d build/X86_FS/tests/fast/long/10.linux-boot/x86/linux/pc-simple-timing -re tests/run.py build/X86_FS/tests/fast/long/10.linux-boot/x86/linux/pc-simple-timing
 Global frequency set at 1000000000000 ticks per second
 info: kernel located at: /dist/m5/system/binaries/x86_64-vmlinux-2.6.22.9
-      0: rtc: Real-time clock set to Sun Jan  1 00:00:00 2012
 info: Entering event queue @ 0.  Starting simulation...
 Exiting @ tick 5187506658000 because m5_exit instruction encountered
diff --git a/tests/long/10.linux-boot/ref/x86/linux/pc-simple-timing/stats.txt b/tests/long/10.linux-boot/ref/x86/linux/pc-simple-timing/stats.txt
index b4552b7b70..091a2e71cf 100644
--- a/tests/long/10.linux-boot/ref/x86/linux/pc-simple-timing/stats.txt
+++ b/tests/long/10.linux-boot/ref/x86/linux/pc-simple-timing/stats.txt
@@ -1,9 +1,9 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                1700985                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 367580                       # Number of bytes of host memory used
-host_seconds                                   155.42                       # Real time elapsed on the host
-host_tick_rate                            33377224644                       # Simulator tick rate (ticks/s)
+host_inst_rate                                1227876                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 367348                       # Number of bytes of host memory used
+host_seconds                                   215.31                       # Real time elapsed on the host
+host_tick_rate                            24093749418                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                   264367743                       # Number of instructions simulated
 sim_seconds                                  5.187507                       # Number of seconds simulated
@@ -395,7 +395,7 @@ system.cpu.num_idle_cycles               9771315874.126116
 system.cpu.num_insts                        264367743                       # Number of instructions executed
 system.cpu.num_int_alu_accesses             249584659                       # Number of integer alu accesses
 system.cpu.num_int_insts                    249584659                       # number of integer instructions
-system.cpu.num_int_register_reads           660399505                       # number of times the integer registers were read
+system.cpu.num_int_register_reads           543556622                       # number of times the integer registers were read
 system.cpu.num_int_register_writes          266062505                       # number of times the integer registers were written
 system.cpu.num_load_insts                    14817593                       # Number of load instructions
 system.cpu.num_mem_refs                      23178416                       # number of memory refs
diff --git a/tests/long/10.mcf/ref/x86/linux/o3-timing/config.ini b/tests/long/10.mcf/ref/x86/linux/o3-timing/config.ini
index 8e006cde52..31cbafe2a1 100644
--- a/tests/long/10.mcf/ref/x86/linux/o3-timing/config.ini
+++ b/tests/long/10.mcf/ref/x86/linux/o3-timing/config.ini
@@ -488,7 +488,7 @@ type=ExeTracer
 [system.cpu.workload]
 type=LiveProcess
 cmd=mcf mcf.in
-cwd=build/X86_SE/tests/fast/long/10.mcf/x86/linux/o3-timing
+cwd=build/X86_SE/tests/opt/long/10.mcf/x86/linux/o3-timing
 egid=100
 env=
 errout=cerr
diff --git a/tests/long/10.mcf/ref/x86/linux/o3-timing/simout b/tests/long/10.mcf/ref/x86/linux/o3-timing/simout
index bf0cc96de4..41587c0af5 100755
--- a/tests/long/10.mcf/ref/x86/linux/o3-timing/simout
+++ b/tests/long/10.mcf/ref/x86/linux/o3-timing/simout
@@ -5,11 +5,11 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:32:24
+M5 compiled Feb 12 2011 02:22:23
+M5 revision 5e76f9de6972 7961 default qtip tip x86branchdetectstats.patch
+M5 started Feb 12 2011 02:22:27
 M5 executing on burrito
-command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/long/10.mcf/x86/linux/o3-timing -re tests/run.py build/X86_SE/tests/fast/long/10.mcf/x86/linux/o3-timing
+command line: build/X86_SE/m5.opt -d build/X86_SE/tests/opt/long/10.mcf/x86/linux/o3-timing -re tests/run.py build/X86_SE/tests/opt/long/10.mcf/x86/linux/o3-timing
 Global frequency set at 1000000000000 ticks per second
 info: Entering event queue @ 0.  Starting simulation...
 
@@ -28,4 +28,4 @@ simplex iterations         : 2663
 flow value                 : 3080014995
 checksum                   : 68389
 optimal
-Exiting @ tick 170680631000 because target called exit()
+Exiting @ tick 98622214000 because target called exit()
diff --git a/tests/long/10.mcf/ref/x86/linux/o3-timing/stats.txt b/tests/long/10.mcf/ref/x86/linux/o3-timing/stats.txt
index 3db6ff1612..33b45551d5 100644
--- a/tests/long/10.mcf/ref/x86/linux/o3-timing/stats.txt
+++ b/tests/long/10.mcf/ref/x86/linux/o3-timing/stats.txt
@@ -1,41 +1,41 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                  83481                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 366872                       # Number of bytes of host memory used
-host_seconds                                  3332.41                       # Real time elapsed on the host
-host_tick_rate                               51218385                       # Simulator tick rate (ticks/s)
+host_inst_rate                                 133029                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 371192                       # Number of bytes of host memory used
+host_seconds                                  2091.22                       # Real time elapsed on the host
+host_tick_rate                               47160241                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                   278192519                       # Number of instructions simulated
-sim_seconds                                  0.170681                       # Number of seconds simulated
-sim_ticks                                170680631000                       # Number of ticks simulated
+sim_seconds                                  0.098622                       # Number of seconds simulated
+sim_ticks                                 98622214000                       # Number of ticks simulated
 system.cpu.BPredUnit.BTBCorrect                     0                       # Number of correct BTB predictions (this stat may not work properly.
-system.cpu.BPredUnit.BTBHits                 50810617                       # Number of BTB hits
-system.cpu.BPredUnit.BTBLookups              51416767                       # Number of BTB lookups
+system.cpu.BPredUnit.BTBHits                 44152407                       # Number of BTB hits
+system.cpu.BPredUnit.BTBLookups              44769192                       # Number of BTB lookups
 system.cpu.BPredUnit.RASInCorrect                   0                       # Number of incorrect RAS predictions.
-system.cpu.BPredUnit.condIncorrect            4328981                       # Number of conditional branches incorrect
-system.cpu.BPredUnit.condPredicted           51416803                       # Number of conditional branches predicted
-system.cpu.BPredUnit.lookups                 51416803                       # Number of BP lookups
+system.cpu.BPredUnit.condIncorrect            3292099                       # Number of conditional branches incorrect
+system.cpu.BPredUnit.condPredicted           50608102                       # Number of conditional branches predicted
+system.cpu.BPredUnit.lookups                 50608102                       # Number of BP lookups
 system.cpu.BPredUnit.usedRAS                        0                       # Number of times the RAS was used to get a target.
 system.cpu.commit.COM:branches               29309710                       # Number of branches committed
-system.cpu.commit.COM:bw_lim_events           2488105                       # number cycles where commit BW limit reached
+system.cpu.commit.COM:bw_lim_events          11603540                       # number cycles where commit BW limit reached
 system.cpu.commit.COM:bw_limited                    0                       # number of insts not committed due to BW limits
-system.cpu.commit.COM:committed_per_cycle::samples    321793097                       # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::mean     0.864507                       # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::stdev     1.425920                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::samples    176948364                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::mean     1.572168                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::stdev     2.280995                       # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::underflows            0      0.00%      0.00% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::0    183622049     57.06%     57.06% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::1     75902754     23.59%     80.65% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::2     27223254      8.46%     89.11% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::3     17908154      5.57%     94.67% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::4      5463718      1.70%     96.37% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::5      3630830      1.13%     97.50% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::6      4674698      1.45%     98.95% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::7       879535      0.27%     99.23% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::8      2488105      0.77%    100.00% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::0     83964580     47.45%     47.45% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::1     36146762     20.43%     67.88% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::2     16087394      9.09%     76.97% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::3     14069173      7.95%     84.92% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::4      7224288      4.08%     89.00% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::5      2649535      1.50%     90.50% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::6      3731341      2.11%     92.61% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::7      1471751      0.83%     93.44% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::8     11603540      6.56%    100.00% # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::overflows            0      0.00%    100.00% # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::min_value            0                       # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::max_value            8                       # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::total    321793097                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::total    176948364                       # Number of insts commited each cycle
 system.cpu.commit.COM:count                 278192519                       # Number of instructions committed
 system.cpu.commit.COM:fp_insts                     40                       # Number of committed floating point instructions.
 system.cpu.commit.COM:function_calls                0                       # Number of function calls committed.
@@ -44,421 +44,430 @@ system.cpu.commit.COM:loads                  90779388                       # Nu
 system.cpu.commit.COM:membars                       0                       # Number of memory barriers committed
 system.cpu.commit.COM:refs                  122219139                       # Number of memory references committed
 system.cpu.commit.COM:swp_count                     0                       # Number of s/w prefetches committed
-system.cpu.commit.branchMispredicts           4328992                       # The number of times a branch was mispredicted
+system.cpu.commit.branchMispredicts           3292117                       # The number of times a branch was mispredicted
 system.cpu.commit.commitCommittedInsts      278192519                       # The number of committed instructions
 system.cpu.commit.commitNonSpecStalls             446                       # The number of times commit has been forced to stall to communicate backwards
-system.cpu.commit.commitSquashedInsts       111464423                       # The number of squashed insts skipped by commit
+system.cpu.commit.commitSquashedInsts       130955012                       # The number of squashed insts skipped by commit
 system.cpu.committedInsts                   278192519                       # Number of Instructions Simulated
 system.cpu.committedInsts_total             278192519                       # Number of Instructions Simulated
-system.cpu.cpi                               1.227068                       # CPI: Cycles Per Instruction
-system.cpu.cpi_total                         1.227068                       # CPI: Total CPI of All Threads
-system.cpu.dcache.ReadReq_accesses           82779625                       # number of ReadReq accesses(hits+misses)
-system.cpu.dcache.ReadReq_avg_miss_latency  5978.815311                       # average ReadReq miss latency
-system.cpu.dcache.ReadReq_avg_mshr_miss_latency  2941.059048                       # average ReadReq mshr miss latency
-system.cpu.dcache.ReadReq_hits               80764514                       # number of ReadReq hits
-system.cpu.dcache.ReadReq_miss_latency    12047976500                       # number of ReadReq miss cycles
-system.cpu.dcache.ReadReq_miss_rate          0.024343                       # miss rate for ReadReq accesses
-system.cpu.dcache.ReadReq_misses              2015111                       # number of ReadReq misses
-system.cpu.dcache.ReadReq_mshr_hits             45360                       # number of ReadReq MSHR hits
-system.cpu.dcache.ReadReq_mshr_miss_latency   5793154000                       # number of ReadReq MSHR miss cycles
-system.cpu.dcache.ReadReq_mshr_miss_rate     0.023795                       # mshr miss rate for ReadReq accesses
-system.cpu.dcache.ReadReq_mshr_misses         1969751                       # number of ReadReq MSHR misses
+system.cpu.cpi                               0.709021                       # CPI: Cycles Per Instruction
+system.cpu.cpi_total                         0.709021                       # CPI: Total CPI of All Threads
+system.cpu.dcache.ReadReq_accesses           69458873                       # number of ReadReq accesses(hits+misses)
+system.cpu.dcache.ReadReq_avg_miss_latency  6142.707591                       # average ReadReq miss latency
+system.cpu.dcache.ReadReq_avg_mshr_miss_latency  3039.983703                       # average ReadReq mshr miss latency
+system.cpu.dcache.ReadReq_hits               67343989                       # number of ReadReq hits
+system.cpu.dcache.ReadReq_miss_latency    12991114000                       # number of ReadReq miss cycles
+system.cpu.dcache.ReadReq_miss_rate          0.030448                       # miss rate for ReadReq accesses
+system.cpu.dcache.ReadReq_misses              2114884                       # number of ReadReq misses
+system.cpu.dcache.ReadReq_mshr_hits            142693                       # number of ReadReq MSHR hits
+system.cpu.dcache.ReadReq_mshr_miss_latency   5995428500                       # number of ReadReq MSHR miss cycles
+system.cpu.dcache.ReadReq_mshr_miss_rate     0.028394                       # mshr miss rate for ReadReq accesses
+system.cpu.dcache.ReadReq_mshr_misses         1972191                       # number of ReadReq MSHR misses
 system.cpu.dcache.WriteReq_accesses          31439751                       # number of WriteReq accesses(hits+misses)
-system.cpu.dcache.WriteReq_avg_miss_latency 20696.077989                       # average WriteReq miss latency
-system.cpu.dcache.WriteReq_avg_mshr_miss_latency 15440.513442                       # average WriteReq mshr miss latency
-system.cpu.dcache.WriteReq_hits              31284703                       # number of WriteReq hits
-system.cpu.dcache.WriteReq_miss_latency    3208885500                       # number of WriteReq miss cycles
-system.cpu.dcache.WriteReq_miss_rate         0.004932                       # miss rate for WriteReq accesses
-system.cpu.dcache.WriteReq_misses              155048                       # number of WriteReq misses
-system.cpu.dcache.WriteReq_mshr_hits            48629                       # number of WriteReq MSHR hits
-system.cpu.dcache.WriteReq_mshr_miss_latency   1643164000                       # number of WriteReq MSHR miss cycles
-system.cpu.dcache.WriteReq_mshr_miss_rate     0.003385                       # mshr miss rate for WriteReq accesses
-system.cpu.dcache.WriteReq_mshr_misses         106419                       # number of WriteReq MSHR misses
-system.cpu.dcache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
+system.cpu.dcache.WriteReq_avg_miss_latency 17842.235128                       # average WriteReq miss latency
+system.cpu.dcache.WriteReq_avg_mshr_miss_latency 17696.947420                       # average WriteReq mshr miss latency
+system.cpu.dcache.WriteReq_hits              31210017                       # number of WriteReq hits
+system.cpu.dcache.WriteReq_miss_latency    4098968045                       # number of WriteReq miss cycles
+system.cpu.dcache.WriteReq_miss_rate         0.007307                       # miss rate for WriteReq accesses
+system.cpu.dcache.WriteReq_misses              229734                       # number of WriteReq misses
+system.cpu.dcache.WriteReq_mshr_hits           123609                       # number of WriteReq MSHR hits
+system.cpu.dcache.WriteReq_mshr_miss_latency   1878088545                       # number of WriteReq MSHR miss cycles
+system.cpu.dcache.WriteReq_mshr_miss_rate     0.003376                       # mshr miss rate for WriteReq accesses
+system.cpu.dcache.WriteReq_mshr_misses         106125                       # number of WriteReq MSHR misses
+system.cpu.dcache.avg_blocked_cycles::no_mshrs  3358.823529                       # average number of cycles each access was blocked
 system.cpu.dcache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.dcache.avg_refs                  53.969218                       # Average number of references to valid blocks.
-system.cpu.dcache.blocked::no_mshrs                 0                       # number of cycles access was blocked
+system.cpu.dcache.avg_refs                  47.420176                       # Average number of references to valid blocks.
+system.cpu.dcache.blocked::no_mshrs                85                       # number of cycles access was blocked
 system.cpu.dcache.blocked::no_targets               0                       # number of cycles access was blocked
-system.cpu.dcache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
+system.cpu.dcache.blocked_cycles::no_mshrs       285500                       # number of cycles access was blocked
 system.cpu.dcache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.dcache.cache_copies                      0                       # number of cache copies performed
-system.cpu.dcache.demand_accesses           114219376                       # number of demand (read+write) accesses
-system.cpu.dcache.demand_avg_miss_latency  7030.296858                       # average overall miss latency
-system.cpu.dcache.demand_avg_mshr_miss_latency  3581.748123                       # average overall mshr miss latency
-system.cpu.dcache.demand_hits               112049217                       # number of demand (read+write) hits
-system.cpu.dcache.demand_miss_latency     15256862000                       # number of demand (read+write) miss cycles
-system.cpu.dcache.demand_miss_rate           0.019000                       # miss rate for demand accesses
-system.cpu.dcache.demand_misses               2170159                       # number of demand (read+write) misses
-system.cpu.dcache.demand_mshr_hits              93989                       # number of demand (read+write) MSHR hits
-system.cpu.dcache.demand_mshr_miss_latency   7436318000                       # number of demand (read+write) MSHR miss cycles
-system.cpu.dcache.demand_mshr_miss_rate      0.018177                       # mshr miss rate for demand accesses
-system.cpu.dcache.demand_mshr_misses          2076170                       # number of demand (read+write) MSHR misses
+system.cpu.dcache.demand_accesses           100898624                       # number of demand (read+write) accesses
+system.cpu.dcache.demand_avg_miss_latency  7289.068857                       # average overall miss latency
+system.cpu.dcache.demand_avg_mshr_miss_latency  3788.411890                       # average overall mshr miss latency
+system.cpu.dcache.demand_hits                98554006                       # number of demand (read+write) hits
+system.cpu.dcache.demand_miss_latency     17090082045                       # number of demand (read+write) miss cycles
+system.cpu.dcache.demand_miss_rate           0.023237                       # miss rate for demand accesses
+system.cpu.dcache.demand_misses               2344618                       # number of demand (read+write) misses
+system.cpu.dcache.demand_mshr_hits             266302                       # number of demand (read+write) MSHR hits
+system.cpu.dcache.demand_mshr_miss_latency   7873517045                       # number of demand (read+write) MSHR miss cycles
+system.cpu.dcache.demand_mshr_miss_rate      0.020598                       # mshr miss rate for demand accesses
+system.cpu.dcache.demand_mshr_misses          2078316                       # number of demand (read+write) MSHR misses
 system.cpu.dcache.fast_writes                       0                       # number of fast writes performed
 system.cpu.dcache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.dcache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.dcache.occ_%::0                   0.995143                       # Average percentage of cache occupancy
-system.cpu.dcache.occ_blocks::0           4076.104755                       # Average occupied blocks per context
-system.cpu.dcache.overall_accesses          114219376                       # number of overall (read+write) accesses
-system.cpu.dcache.overall_avg_miss_latency  7030.296858                       # average overall miss latency
-system.cpu.dcache.overall_avg_mshr_miss_latency  3581.748123                       # average overall mshr miss latency
+system.cpu.dcache.occ_%::0                   0.994974                       # Average percentage of cache occupancy
+system.cpu.dcache.occ_blocks::0           4075.414607                       # Average occupied blocks per context
+system.cpu.dcache.overall_accesses          100898624                       # number of overall (read+write) accesses
+system.cpu.dcache.overall_avg_miss_latency  7289.068857                       # average overall miss latency
+system.cpu.dcache.overall_avg_mshr_miss_latency  3788.411890                       # average overall mshr miss latency
 system.cpu.dcache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.dcache.overall_hits              112049217                       # number of overall hits
-system.cpu.dcache.overall_miss_latency    15256862000                       # number of overall miss cycles
-system.cpu.dcache.overall_miss_rate          0.019000                       # miss rate for overall accesses
-system.cpu.dcache.overall_misses              2170159                       # number of overall misses
-system.cpu.dcache.overall_mshr_hits             93989                       # number of overall MSHR hits
-system.cpu.dcache.overall_mshr_miss_latency   7436318000                       # number of overall MSHR miss cycles
-system.cpu.dcache.overall_mshr_miss_rate     0.018177                       # mshr miss rate for overall accesses
-system.cpu.dcache.overall_mshr_misses         2076170                       # number of overall MSHR misses
+system.cpu.dcache.overall_hits               98554006                       # number of overall hits
+system.cpu.dcache.overall_miss_latency    17090082045                       # number of overall miss cycles
+system.cpu.dcache.overall_miss_rate          0.023237                       # miss rate for overall accesses
+system.cpu.dcache.overall_misses              2344618                       # number of overall misses
+system.cpu.dcache.overall_mshr_hits            266302                       # number of overall MSHR hits
+system.cpu.dcache.overall_mshr_miss_latency   7873517045                       # number of overall MSHR miss cycles
+system.cpu.dcache.overall_mshr_miss_rate     0.020598                       # mshr miss rate for overall accesses
+system.cpu.dcache.overall_mshr_misses         2078316                       # number of overall MSHR misses
 system.cpu.dcache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.dcache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.cpu.dcache.replacements                2072073                       # number of replacements
-system.cpu.dcache.sampled_refs                2076169                       # Sample count of references to valid blocks.
+system.cpu.dcache.replacements                2074218                       # number of replacements
+system.cpu.dcache.sampled_refs                2078314                       # Sample count of references to valid blocks.
 system.cpu.dcache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.dcache.tagsinuse               4076.104755                       # Cycle average of tags in use
-system.cpu.dcache.total_refs                112049217                       # Total number of references to valid blocks.
-system.cpu.dcache.warmup_cycle            66009760000                       # Cycle when the warmup percentage was hit.
-system.cpu.dcache.writebacks                  1440063                       # number of writebacks
-system.cpu.decode.DECODE:BlockedCycles         922031                       # Number of cycles decode is blocked
-system.cpu.decode.DECODE:DecodedInsts       437195268                       # Number of instructions handled by decode
-system.cpu.decode.DECODE:IdleCycles          92021485                       # Number of cycles decode is idle
-system.cpu.decode.DECODE:RunCycles          228705655                       # Number of cycles decode is running
-system.cpu.decode.DECODE:SquashCycles        19453848                       # Number of cycles decode is squashing
-system.cpu.decode.DECODE:UnblockCycles         143926                       # Number of cycles decode is unblocking
-system.cpu.fetch.Branches                    51416803                       # Number of branches that fetch encountered
-system.cpu.fetch.CacheLines                  39245397                       # Number of cache lines fetched
-system.cpu.fetch.Cycles                     242939967                       # Number of cycles fetch has run and was not squashing or blocked
-system.cpu.fetch.IcacheSquashes                793923                       # Number of outstanding Icache misses that were squashed
-system.cpu.fetch.Insts                      249694241                       # Number of instructions fetch has processed
-system.cpu.fetch.MiscStallCycles                   16                       # Number of cycles fetch has spent waiting on interrupts, or bad addresses, or out of MSHRs
-system.cpu.fetch.SquashCycles                 9845420                       # Number of cycles fetch has spent squashing
-system.cpu.fetch.branchRate                  0.150623                       # Number of branch fetches per cycle
-system.cpu.fetch.icacheStallCycles           39245397                       # Number of cycles fetch is stalled on an Icache miss
-system.cpu.fetch.predictedBranches           50810617                       # Number of branches that fetch has predicted taken
-system.cpu.fetch.rate                        0.731466                       # Number of inst fetches per cycle
-system.cpu.fetch.rateDist::samples          341246945                       # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::mean              1.321737                       # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::stdev             1.251135                       # Number of instructions fetched each cycle (Total)
+system.cpu.dcache.tagsinuse               4075.414607                       # Cycle average of tags in use
+system.cpu.dcache.total_refs                 98554015                       # Total number of references to valid blocks.
+system.cpu.dcache.warmup_cycle            40655663000                       # Cycle when the warmup percentage was hit.
+system.cpu.dcache.writebacks                  1442059                       # number of writebacks
+system.cpu.decode.DECODE:BlockedCycles       21837286                       # Number of cycles decode is blocked
+system.cpu.decode.DECODE:DecodedInsts       443283148                       # Number of instructions handled by decode
+system.cpu.decode.DECODE:IdleCycles          77587406                       # Number of cycles decode is idle
+system.cpu.decode.DECODE:RunCycles           75762450                       # Number of cycles decode is running
+system.cpu.decode.DECODE:SquashCycles        19022168                       # Number of cycles decode is squashing
+system.cpu.decode.DECODE:UnblockCycles        1761222                       # Number of cycles decode is unblocking
+system.cpu.fetch.Branches                    50608102                       # Number of branches that fetch encountered
+system.cpu.fetch.CacheLines                  34652495                       # Number of cache lines fetched
+system.cpu.fetch.Cycles                      82344495                       # Number of cycles fetch has run and was not squashing or blocked
+system.cpu.fetch.IcacheSquashes                326035                       # Number of outstanding Icache misses that were squashed
+system.cpu.fetch.Insts                      259681215                       # Number of instructions fetch has processed
+system.cpu.fetch.MiscStallCycles                   35                       # Number of cycles fetch has spent waiting on interrupts, or bad addresses, or out of MSHRs
+system.cpu.fetch.SquashCycles                 3883025                       # Number of cycles fetch has spent squashing
+system.cpu.fetch.branchRate                  0.256576                       # Number of branch fetches per cycle
+system.cpu.fetch.icacheStallCycles           34652495                       # Number of cycles fetch is stalled on an Icache miss
+system.cpu.fetch.predictedBranches           44152407                       # Number of branches that fetch has predicted taken
+system.cpu.fetch.rate                        1.316545                       # Number of inst fetches per cycle
+system.cpu.fetch.rateDist::samples          195970532                       # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::mean              2.323843                       # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::stdev             3.188074                       # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::underflows               0      0.00%      0.00% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::0                105340577     30.87%     30.87% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::1                115413940     33.82%     64.69% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::2                 47580781     13.94%     78.63% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::3                 58732555     17.21%     95.84% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::4                  7189604      2.11%     97.95% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::5                  6451059      1.89%     99.84% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::6                   527277      0.15%    100.00% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::7                      932      0.00%    100.00% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::8                    10220      0.00%    100.00% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::0                116145210     59.27%     59.27% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::1                  6750085      3.44%     62.71% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::2                  3016102      1.54%     64.25% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::3                  8362073      4.27%     68.52% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::4                  7646936      3.90%     72.42% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::5                  6348764      3.24%     75.66% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::6                  9080088      4.63%     80.29% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::7                  8246058      4.21%     84.50% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::8                 30375216     15.50%    100.00% # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::overflows                0      0.00%    100.00% # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::min_value                0                       # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::max_value                8                       # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::total            341246945                       # Number of instructions fetched each cycle (Total)
-system.cpu.fp_regfile_reads                        44                       # number of floating regfile reads
-system.cpu.fp_regfile_writes                       31                       # number of floating regfile writes
-system.cpu.icache.ReadReq_accesses           39245397                       # number of ReadReq accesses(hits+misses)
-system.cpu.icache.ReadReq_avg_miss_latency 37208.490566                       # average ReadReq miss latency
-system.cpu.icache.ReadReq_avg_mshr_miss_latency 35316.192560                       # average ReadReq mshr miss latency
-system.cpu.icache.ReadReq_hits               39244337                       # number of ReadReq hits
-system.cpu.icache.ReadReq_miss_latency       39441000                       # number of ReadReq miss cycles
-system.cpu.icache.ReadReq_miss_rate          0.000027                       # miss rate for ReadReq accesses
-system.cpu.icache.ReadReq_misses                 1060                       # number of ReadReq misses
-system.cpu.icache.ReadReq_mshr_hits               146                       # number of ReadReq MSHR hits
-system.cpu.icache.ReadReq_mshr_miss_latency     32279000                       # number of ReadReq MSHR miss cycles
-system.cpu.icache.ReadReq_mshr_miss_rate     0.000023                       # mshr miss rate for ReadReq accesses
-system.cpu.icache.ReadReq_mshr_misses             914                       # number of ReadReq MSHR misses
+system.cpu.fetch.rateDist::total            195970532                       # Number of instructions fetched each cycle (Total)
+system.cpu.fp_regfile_reads                        75                       # number of floating regfile reads
+system.cpu.fp_regfile_writes                       41                       # number of floating regfile writes
+system.cpu.icache.ReadReq_accesses           34652495                       # number of ReadReq accesses(hits+misses)
+system.cpu.icache.ReadReq_avg_miss_latency 35675.242356                       # average ReadReq miss latency
+system.cpu.icache.ReadReq_avg_mshr_miss_latency 35201.684836                       # average ReadReq mshr miss latency
+system.cpu.icache.ReadReq_hits               34651154                       # number of ReadReq hits
+system.cpu.icache.ReadReq_miss_latency       47840500                       # number of ReadReq miss cycles
+system.cpu.icache.ReadReq_miss_rate          0.000039                       # miss rate for ReadReq accesses
+system.cpu.icache.ReadReq_misses                 1341                       # number of ReadReq misses
+system.cpu.icache.ReadReq_mshr_hits               332                       # number of ReadReq MSHR hits
+system.cpu.icache.ReadReq_mshr_miss_latency     35518500                       # number of ReadReq MSHR miss cycles
+system.cpu.icache.ReadReq_mshr_miss_rate     0.000029                       # mshr miss rate for ReadReq accesses
+system.cpu.icache.ReadReq_mshr_misses            1009                       # number of ReadReq MSHR misses
 system.cpu.icache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.icache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.icache.avg_refs               42936.911379                       # Average number of references to valid blocks.
+system.cpu.icache.avg_refs               34376.144841                       # Average number of references to valid blocks.
 system.cpu.icache.blocked::no_mshrs                 0                       # number of cycles access was blocked
 system.cpu.icache.blocked::no_targets               0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.icache.cache_copies                      0                       # number of cache copies performed
-system.cpu.icache.demand_accesses            39245397                       # number of demand (read+write) accesses
-system.cpu.icache.demand_avg_miss_latency 37208.490566                       # average overall miss latency
-system.cpu.icache.demand_avg_mshr_miss_latency 35316.192560                       # average overall mshr miss latency
-system.cpu.icache.demand_hits                39244337                       # number of demand (read+write) hits
-system.cpu.icache.demand_miss_latency        39441000                       # number of demand (read+write) miss cycles
-system.cpu.icache.demand_miss_rate           0.000027                       # miss rate for demand accesses
-system.cpu.icache.demand_misses                  1060                       # number of demand (read+write) misses
-system.cpu.icache.demand_mshr_hits                146                       # number of demand (read+write) MSHR hits
-system.cpu.icache.demand_mshr_miss_latency     32279000                       # number of demand (read+write) MSHR miss cycles
-system.cpu.icache.demand_mshr_miss_rate      0.000023                       # mshr miss rate for demand accesses
-system.cpu.icache.demand_mshr_misses              914                       # number of demand (read+write) MSHR misses
+system.cpu.icache.demand_accesses            34652495                       # number of demand (read+write) accesses
+system.cpu.icache.demand_avg_miss_latency 35675.242356                       # average overall miss latency
+system.cpu.icache.demand_avg_mshr_miss_latency 35201.684836                       # average overall mshr miss latency
+system.cpu.icache.demand_hits                34651154                       # number of demand (read+write) hits
+system.cpu.icache.demand_miss_latency        47840500                       # number of demand (read+write) miss cycles
+system.cpu.icache.demand_miss_rate           0.000039                       # miss rate for demand accesses
+system.cpu.icache.demand_misses                  1341                       # number of demand (read+write) misses
+system.cpu.icache.demand_mshr_hits                332                       # number of demand (read+write) MSHR hits
+system.cpu.icache.demand_mshr_miss_latency     35518500                       # number of demand (read+write) MSHR miss cycles
+system.cpu.icache.demand_mshr_miss_rate      0.000029                       # mshr miss rate for demand accesses
+system.cpu.icache.demand_mshr_misses             1009                       # number of demand (read+write) MSHR misses
 system.cpu.icache.fast_writes                       0                       # number of fast writes performed
 system.cpu.icache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.icache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.icache.occ_%::0                   0.360466                       # Average percentage of cache occupancy
-system.cpu.icache.occ_blocks::0            738.235227                       # Average occupied blocks per context
-system.cpu.icache.overall_accesses           39245397                       # number of overall (read+write) accesses
-system.cpu.icache.overall_avg_miss_latency 37208.490566                       # average overall miss latency
-system.cpu.icache.overall_avg_mshr_miss_latency 35316.192560                       # average overall mshr miss latency
+system.cpu.icache.occ_%::0                   0.392466                       # Average percentage of cache occupancy
+system.cpu.icache.occ_blocks::0            803.770978                       # Average occupied blocks per context
+system.cpu.icache.overall_accesses           34652495                       # number of overall (read+write) accesses
+system.cpu.icache.overall_avg_miss_latency 35675.242356                       # average overall miss latency
+system.cpu.icache.overall_avg_mshr_miss_latency 35201.684836                       # average overall mshr miss latency
 system.cpu.icache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.icache.overall_hits               39244337                       # number of overall hits
-system.cpu.icache.overall_miss_latency       39441000                       # number of overall miss cycles
-system.cpu.icache.overall_miss_rate          0.000027                       # miss rate for overall accesses
-system.cpu.icache.overall_misses                 1060                       # number of overall misses
-system.cpu.icache.overall_mshr_hits               146                       # number of overall MSHR hits
-system.cpu.icache.overall_mshr_miss_latency     32279000                       # number of overall MSHR miss cycles
-system.cpu.icache.overall_mshr_miss_rate     0.000023                       # mshr miss rate for overall accesses
-system.cpu.icache.overall_mshr_misses             914                       # number of overall MSHR misses
+system.cpu.icache.overall_hits               34651154                       # number of overall hits
+system.cpu.icache.overall_miss_latency       47840500                       # number of overall miss cycles
+system.cpu.icache.overall_miss_rate          0.000039                       # miss rate for overall accesses
+system.cpu.icache.overall_misses                 1341                       # number of overall misses
+system.cpu.icache.overall_mshr_hits               332                       # number of overall MSHR hits
+system.cpu.icache.overall_mshr_miss_latency     35518500                       # number of overall MSHR miss cycles
+system.cpu.icache.overall_mshr_miss_rate     0.000029                       # mshr miss rate for overall accesses
+system.cpu.icache.overall_mshr_misses            1009                       # number of overall MSHR misses
 system.cpu.icache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.icache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.cpu.icache.replacements                     37                       # number of replacements
-system.cpu.icache.sampled_refs                    914                       # Sample count of references to valid blocks.
+system.cpu.icache.replacements                     60                       # number of replacements
+system.cpu.icache.sampled_refs                   1008                       # Sample count of references to valid blocks.
 system.cpu.icache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.icache.tagsinuse                738.235227                       # Cycle average of tags in use
-system.cpu.icache.total_refs                 39244337                       # Total number of references to valid blocks.
+system.cpu.icache.tagsinuse                803.770978                       # Cycle average of tags in use
+system.cpu.icache.total_refs                 34651154                       # Total number of references to valid blocks.
 system.cpu.icache.warmup_cycle                      0                       # Cycle when the warmup percentage was hit.
 system.cpu.icache.writebacks                        0                       # number of writebacks
-system.cpu.idleCycles                          114318                       # Total number of cycles that the CPU has spent unscheduled due to idling
-system.cpu.iew.EXEC:branches                 31118985                       # Number of branches executed
+system.cpu.idleCycles                         1273897                       # Total number of cycles that the CPU has spent unscheduled due to idling
+system.cpu.iew.EXEC:branches                 33755681                       # Number of branches executed
 system.cpu.iew.EXEC:nop                             0                       # number of nop insts executed
-system.cpu.iew.EXEC:rate                     0.940576                       # Inst execution rate
-system.cpu.iew.EXEC:refs                    137464023                       # number of memory reference insts executed
-system.cpu.iew.EXEC:stores                   32172568                       # Number of stores executed
+system.cpu.iew.EXEC:rate                     1.719732                       # Inst execution rate
+system.cpu.iew.EXEC:refs                    143271490                       # number of memory reference insts executed
+system.cpu.iew.EXEC:stores                   33964004                       # Number of stores executed
 system.cpu.iew.EXEC:swp                             0                       # number of swp insts executed
-system.cpu.iew.WB:consumers                 361852587                       # num instructions consuming a value
-system.cpu.iew.WB:count                     317781549                       # cumulative count of insts written-back
-system.cpu.iew.WB:fanout                     0.623035                       # average fanout of values written-back
+system.cpu.iew.WB:consumers                 356152066                       # num instructions consuming a value
+system.cpu.iew.WB:count                     334303723                       # cumulative count of insts written-back
+system.cpu.iew.WB:fanout                     0.713943                       # average fanout of values written-back
 system.cpu.iew.WB:penalized                         0                       # number of instrctions required to write to 'other' IQ
 system.cpu.iew.WB:penalized_rate                    0                       # fraction of instructions written-back that wrote to 'other' IQ
-system.cpu.iew.WB:producers                 225446782                       # num instructions producing a value
-system.cpu.iew.WB:rate                       0.930924                       # insts written-back per cycle
-system.cpu.iew.WB:sent                      318008427                       # cumulative count of insts sent to commit
-system.cpu.iew.branchMispredicts              5390321                       # Number of branch mispredicts detected at execute
-system.cpu.iew.iewBlockCycles                  197365                       # Number of cycles IEW is blocking
-system.cpu.iew.iewDispLoadInsts             131280417                       # Number of dispatched load instructions
-system.cpu.iew.iewDispNonSpecInsts                455                       # Number of dispatched non-speculative instructions
-system.cpu.iew.iewDispSquashedInsts           3671049                       # Number of squashed instructions skipped by dispatch
-system.cpu.iew.iewDispStoreInsts             41039188                       # Number of dispatched store instructions
-system.cpu.iew.iewDispatchedInsts           389592858                       # Number of instructions dispatched to IQ
-system.cpu.iew.iewExecLoadInsts             105291455                       # Number of load instructions executed
-system.cpu.iew.iewExecSquashedInsts          12266571                       # Number of squashed instructions skipped in execute
-system.cpu.iew.iewExecutedInsts             321076071                       # Number of executed instructions
-system.cpu.iew.iewIQFullEvents                   2799                       # Number of times the IQ has become full, causing a stall
+system.cpu.iew.WB:producers                 254272214                       # num instructions producing a value
+system.cpu.iew.WB:rate                       1.694870                       # insts written-back per cycle
+system.cpu.iew.WB:sent                      336664522                       # cumulative count of insts sent to commit
+system.cpu.iew.branchMispredicts              3987132                       # Number of branch mispredicts detected at execute
+system.cpu.iew.iewBlockCycles                  754395                       # Number of cycles IEW is blocking
+system.cpu.iew.iewDispLoadInsts             138835558                       # Number of dispatched load instructions
+system.cpu.iew.iewDispNonSpecInsts                465                       # Number of dispatched non-speculative instructions
+system.cpu.iew.iewDispSquashedInsts            663120                       # Number of squashed instructions skipped by dispatch
+system.cpu.iew.iewDispStoreInsts             42750154                       # Number of dispatched store instructions
+system.cpu.iew.iewDispatchedInsts           409142439                       # Number of instructions dispatched to IQ
+system.cpu.iew.iewExecLoadInsts             109307486                       # Number of load instructions executed
+system.cpu.iew.iewExecSquashedInsts           6572046                       # Number of squashed instructions skipped in execute
+system.cpu.iew.iewExecutedInsts             339207523                       # Number of executed instructions
+system.cpu.iew.iewIQFullEvents                   2275                       # Number of times the IQ has become full, causing a stall
 system.cpu.iew.iewIdleCycles                        0                       # Number of cycles IEW is idle
-system.cpu.iew.iewLSQFullEvents                  1704                       # Number of times the LSQ has become full, causing a stall
-system.cpu.iew.iewSquashCycles               19453848                       # Number of cycles IEW is squashing
-system.cpu.iew.iewUnblockCycles                 10507                       # Number of cycles IEW is unblocking
+system.cpu.iew.iewLSQFullEvents                 78833                       # Number of times the LSQ has become full, causing a stall
+system.cpu.iew.iewSquashCycles               19022168                       # Number of cycles IEW is squashing
+system.cpu.iew.iewUnblockCycles                104797                       # Number of cycles IEW is unblocking
 system.cpu.iew.lsq.thread.0.blockedLoads            0                       # Number of blocked loads due to partial load-store forwarding
-system.cpu.iew.lsq.thread.0.cacheBlocked            0                       # Number of times an access to memory failed due to the cache being blocked
-system.cpu.iew.lsq.thread.0.forwLoads        22405068                       # Number of loads that had data forwarded from stores
-system.cpu.iew.lsq.thread.0.ignoredResponses        64376                       # Number of memory responses ignored because the instruction is squashed
+system.cpu.iew.lsq.thread.0.cacheBlocked        14565                       # Number of times an access to memory failed due to the cache being blocked
+system.cpu.iew.lsq.thread.0.forwLoads        39666706                       # Number of loads that had data forwarded from stores
+system.cpu.iew.lsq.thread.0.ignoredResponses        30063                       # Number of memory responses ignored because the instruction is squashed
 system.cpu.iew.lsq.thread.0.invAddrLoads            0                       # Number of loads ignored due to an invalid address
 system.cpu.iew.lsq.thread.0.invAddrSwpfs            0                       # Number of software prefetches ignored due to an invalid address
-system.cpu.iew.lsq.thread.0.memOrderViolation      5520980                       # Number of memory ordering violations
-system.cpu.iew.lsq.thread.0.rescheduledLoads         2668                       # Number of loads that were rescheduled
-system.cpu.iew.lsq.thread.0.squashedLoads     40501029                       # Number of loads squashed
-system.cpu.iew.lsq.thread.0.squashedStores      9599437                       # Number of stores squashed
-system.cpu.iew.memOrderViolationEvents        5520980                       # Number of memory order violations
-system.cpu.iew.predictedNotTakenIncorrect        16897                       # Number of branches that were predicted not taken incorrectly
-system.cpu.iew.predictedTakenIncorrect        5373424                       # Number of branches that were predicted taken incorrectly
-system.cpu.int_regfile_reads                754340794                       # number of integer regfile reads
-system.cpu.int_regfile_writes               286169707                       # number of integer regfile writes
-system.cpu.ipc                               0.814950                       # IPC: Instructions Per Cycle
-system.cpu.ipc_total                         0.814950                       # IPC: Total IPC of All Threads
-system.cpu.iq.ISSUE:FU_type_0::No_OpClass        16700      0.01%      0.01% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::IntAlu       193455065     58.03%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::IntMult              0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::IntDiv               0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatAdd            15      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatCmp             0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatCvt             0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatMult            0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatDiv             0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatSqrt            0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdAdd              0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdAddAcc            0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdAlu              0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdCmp              0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdCvt              0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdMisc             0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdMult             0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdMultAcc            0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdShift            0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdShiftAcc            0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdSqrt             0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatAdd            0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatAlu            0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatCmp            0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatCvt            0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatDiv            0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatMisc            0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatMult            0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatMultAcc            0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatSqrt            0      0.00%     58.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::MemRead      107162338     32.15%     90.19% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::MemWrite      32708524      9.81%    100.00% # Type of FU issued
+system.cpu.iew.lsq.thread.0.memOrderViolation      1469253                       # Number of memory ordering violations
+system.cpu.iew.lsq.thread.0.rescheduledLoads         2742                       # Number of loads that were rescheduled
+system.cpu.iew.lsq.thread.0.squashedLoads     48056170                       # Number of loads squashed
+system.cpu.iew.lsq.thread.0.squashedStores     11310403                       # Number of stores squashed
+system.cpu.iew.memOrderViolationEvents        1469253                       # Number of memory order violations
+system.cpu.iew.predictedNotTakenIncorrect       865481                       # Number of branches that were predicted not taken incorrectly
+system.cpu.iew.predictedTakenIncorrect        3121651                       # Number of branches that were predicted taken incorrectly
+system.cpu.int_regfile_reads                577634708                       # number of integer regfile reads
+system.cpu.int_regfile_writes               302216415                       # number of integer regfile writes
+system.cpu.ipc                               1.410395                       # IPC: Instructions Per Cycle
+system.cpu.ipc_total                         1.410395                       # IPC: Total IPC of All Threads
+system.cpu.iq.ISSUE:FU_type_0::No_OpClass        16702      0.00%      0.00% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::IntAlu       200471700     57.98%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::IntMult              0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::IntDiv               0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatAdd            15      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatCmp             0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatCvt             0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatMult            0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatDiv             0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatSqrt            0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdAdd              0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdAddAcc            0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdAlu              0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdCmp              0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdCvt              0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdMisc             0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdMult             0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdMultAcc            0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdShift            0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdShiftAcc            0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdSqrt             0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatAdd            0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatAlu            0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatCmp            0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatCvt            0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatDiv            0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatMisc            0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatMult            0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatMultAcc            0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatSqrt            0      0.00%     57.98% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::MemRead      110857049     32.06%     90.04% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::MemWrite      34434103      9.96%    100.00% # Type of FU issued
 system.cpu.iq.ISSUE:FU_type_0::IprAccess            0      0.00%    100.00% # Type of FU issued
 system.cpu.iq.ISSUE:FU_type_0::InstPrefetch            0      0.00%    100.00% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::total        333342642                       # Type of FU issued
-system.cpu.iq.ISSUE:fu_busy_cnt                 98152                       # FU busy when requested
-system.cpu.iq.ISSUE:fu_busy_rate             0.000294                       # FU busy rate (busy events/executed inst)
+system.cpu.iq.ISSUE:FU_type_0::total        345779569                       # Type of FU issued
+system.cpu.iq.ISSUE:fu_busy_cnt               4109732                       # FU busy when requested
+system.cpu.iq.ISSUE:fu_busy_rate             0.011885                       # FU busy rate (busy events/executed inst)
 system.cpu.iq.ISSUE:fu_full::No_OpClass             0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::IntAlu                15      0.02%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::IntMult                0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::IntDiv                 0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatAdd               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatCmp               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatCvt               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatMult              0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatDiv               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatSqrt              0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdAdd                0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdAddAcc             0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdAlu                0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdCmp                0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdCvt                0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdMisc               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdMult               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdMultAcc            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdShift              0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdShiftAcc            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdSqrt               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatAdd            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatAlu            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatCmp            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatCvt            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatDiv            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatMisc            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatMult            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatMultAcc            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatSqrt            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::MemRead            97651     99.49%     99.50% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::MemWrite             486      0.50%    100.00% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::IntAlu             26819      0.65%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::IntMult                0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::IntDiv                 0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatAdd               0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatCmp               0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatCvt               0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatMult              0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatDiv               0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatSqrt              0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdAdd                0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdAddAcc             0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdAlu                0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdCmp                0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdCvt                0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdMisc               0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdMult               0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdMultAcc            0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdShift              0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdShiftAcc            0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdSqrt               0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatAdd            0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatAlu            0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatCmp            0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatCvt            0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatDiv            0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatMisc            0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatMult            0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatMultAcc            0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatSqrt            0      0.00%      0.65% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::MemRead          3817756     92.90%     93.55% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::MemWrite          265157      6.45%    100.00% # attempts to use FU when none available
 system.cpu.iq.ISSUE:fu_full::IprAccess              0      0.00%    100.00% # attempts to use FU when none available
 system.cpu.iq.ISSUE:fu_full::InstPrefetch            0      0.00%    100.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:issued_per_cycle::samples    341246945                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::mean     0.976837                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::stdev     1.032280                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::samples    195970532                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::mean     1.764447                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::stdev     1.745109                       # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::underflows            0      0.00%      0.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::0     143332703     42.00%     42.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::1      98734149     28.93%     70.94% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::2      68142120     19.97%     90.90% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::3      26890607      7.88%     98.78% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::4       3089152      0.91%     99.69% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::5       1054470      0.31%    100.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::6          2951      0.00%    100.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::7           576      0.00%    100.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::8           217      0.00%    100.00% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::0      63955785     32.64%     32.64% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::1      38956843     19.88%     52.51% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::2      30997952     15.82%     68.33% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::3      27554899     14.06%     82.39% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::4      19728653     10.07%     92.46% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::5       8783605      4.48%     96.94% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::6       3191043      1.63%     98.57% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::7       2230786      1.14%     99.71% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::8        570966      0.29%    100.00% # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::overflows            0      0.00%    100.00% # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::min_value            0                       # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::max_value            8                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::total    341246945                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:rate                     0.976510                       # Inst issue rate
-system.cpu.iq.fp_alu_accesses                      55                       # Number of floating point alu accesses
-system.cpu.iq.fp_inst_queue_reads                 110                       # Number of floating instruction queue reads
-system.cpu.iq.fp_inst_queue_wakeup_accesses           49                       # Number of floating instruction queue wakeup accesses
-system.cpu.iq.fp_inst_queue_writes                110                       # Number of floating instruction queue writes
-system.cpu.iq.int_alu_accesses              333424039                       # Number of integer alu accesses
-system.cpu.iq.int_inst_queue_reads         1008030271                       # Number of integer instruction queue reads
-system.cpu.iq.int_inst_queue_wakeup_accesses    317781500                       # Number of integer instruction queue wakeup accesses
-system.cpu.iq.int_inst_queue_writes         504991584                       # Number of integer instruction queue writes
-system.cpu.iq.iqInstsAdded                  389592403                       # Number of instructions added to the IQ (excludes non-spec)
-system.cpu.iq.iqInstsIssued                 333342642                       # Number of instructions issued
-system.cpu.iq.iqNonSpecInstsAdded                 455                       # Number of non-speculative instructions added to the IQ
-system.cpu.iq.iqSquashedInstsExamined       109882124                       # Number of squashed instructions iterated over during squash; mainly for profiling
-system.cpu.iq.iqSquashedNonSpecRemoved              9                       # Number of squashed non-spec instructions that were removed
-system.cpu.iq.iqSquashedOperandsExamined    237362106                       # Number of squashed operands that are examined and possibly removed from graph
-system.cpu.l2cache.ReadExReq_accesses          106419                       # number of ReadExReq accesses(hits+misses)
-system.cpu.l2cache.ReadExReq_avg_miss_latency 34277.831445                       # average ReadExReq miss latency
-system.cpu.l2cache.ReadExReq_avg_mshr_miss_latency 31049.336758                       # average ReadExReq mshr miss latency
-system.cpu.l2cache.ReadExReq_hits               63976                       # number of ReadExReq hits
-system.cpu.l2cache.ReadExReq_miss_latency   1454854000                       # number of ReadExReq miss cycles
-system.cpu.l2cache.ReadExReq_miss_rate       0.398829                       # miss rate for ReadExReq accesses
-system.cpu.l2cache.ReadExReq_misses             42443                       # number of ReadExReq misses
-system.cpu.l2cache.ReadExReq_mshr_miss_latency   1317827000                       # number of ReadExReq MSHR miss cycles
-system.cpu.l2cache.ReadExReq_mshr_miss_rate     0.398829                       # mshr miss rate for ReadExReq accesses
-system.cpu.l2cache.ReadExReq_mshr_misses        42443                       # number of ReadExReq MSHR misses
-system.cpu.l2cache.ReadReq_accesses           1970665                       # number of ReadReq accesses(hits+misses)
-system.cpu.l2cache.ReadReq_avg_miss_latency 34310.495712                       # average ReadReq miss latency
-system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 31007.530164                       # average ReadReq mshr miss latency
-system.cpu.l2cache.ReadReq_hits               1936270                       # number of ReadReq hits
-system.cpu.l2cache.ReadReq_miss_latency    1180109500                       # number of ReadReq miss cycles
-system.cpu.l2cache.ReadReq_miss_rate         0.017453                       # miss rate for ReadReq accesses
-system.cpu.l2cache.ReadReq_misses               34395                       # number of ReadReq misses
-system.cpu.l2cache.ReadReq_mshr_miss_latency   1066504000                       # number of ReadReq MSHR miss cycles
-system.cpu.l2cache.ReadReq_mshr_miss_rate     0.017453                       # mshr miss rate for ReadReq accesses
-system.cpu.l2cache.ReadReq_mshr_misses          34395                       # number of ReadReq MSHR misses
-system.cpu.l2cache.Writeback_accesses         1440063                       # number of Writeback accesses(hits+misses)
-system.cpu.l2cache.Writeback_hits             1440063                       # number of Writeback hits
-system.cpu.l2cache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
+system.cpu.iq.ISSUE:issued_per_cycle::total    195970532                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:rate                     1.753051                       # Inst issue rate
+system.cpu.iq.fp_alu_accesses                     110                       # Number of floating point alu accesses
+system.cpu.iq.fp_inst_queue_reads                 224                       # Number of floating instruction queue reads
+system.cpu.iq.fp_inst_queue_wakeup_accesses           83                       # Number of floating instruction queue wakeup accesses
+system.cpu.iq.fp_inst_queue_writes                263                       # Number of floating instruction queue writes
+system.cpu.iq.int_alu_accesses              349872489                       # Number of integer alu accesses
+system.cpu.iq.int_inst_queue_reads          891669703                       # Number of integer instruction queue reads
+system.cpu.iq.int_inst_queue_wakeup_accesses    334303640                       # Number of integer instruction queue wakeup accesses
+system.cpu.iq.int_inst_queue_writes         540919004                       # Number of integer instruction queue writes
+system.cpu.iq.iqInstsAdded                  409141974                       # Number of instructions added to the IQ (excludes non-spec)
+system.cpu.iq.iqInstsIssued                 345779569                       # Number of instructions issued
+system.cpu.iq.iqNonSpecInstsAdded                 465                       # Number of non-speculative instructions added to the IQ
+system.cpu.iq.iqSquashedInstsExamined       130872312                       # Number of squashed instructions iterated over during squash; mainly for profiling
+system.cpu.iq.iqSquashedInstsIssued             30525                       # Number of squashed instructions issued
+system.cpu.iq.iqSquashedNonSpecRemoved             19                       # Number of squashed non-spec instructions that were removed
+system.cpu.iq.iqSquashedOperandsExamined    221868127                       # Number of squashed operands that are examined and possibly removed from graph
+system.cpu.l2cache.ReadExReq_accesses          106126                       # number of ReadExReq accesses(hits+misses)
+system.cpu.l2cache.ReadExReq_avg_miss_latency 34139.167845                       # average ReadExReq miss latency
+system.cpu.l2cache.ReadExReq_avg_mshr_miss_latency 31050.412541                       # average ReadExReq mshr miss latency
+system.cpu.l2cache.ReadExReq_hits               63706                       # number of ReadExReq hits
+system.cpu.l2cache.ReadExReq_miss_latency   1448183500                       # number of ReadExReq miss cycles
+system.cpu.l2cache.ReadExReq_miss_rate       0.399714                       # miss rate for ReadExReq accesses
+system.cpu.l2cache.ReadExReq_misses             42420                       # number of ReadExReq misses
+system.cpu.l2cache.ReadExReq_mshr_miss_latency   1317158500                       # number of ReadExReq MSHR miss cycles
+system.cpu.l2cache.ReadExReq_mshr_miss_rate     0.399714                       # mshr miss rate for ReadExReq accesses
+system.cpu.l2cache.ReadExReq_mshr_misses        42420                       # number of ReadExReq MSHR misses
+system.cpu.l2cache.ReadReq_accesses           1973197                       # number of ReadReq accesses(hits+misses)
+system.cpu.l2cache.ReadReq_avg_miss_latency 34279.521718                       # average ReadReq miss latency
+system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 31013.978995                       # average ReadReq mshr miss latency
+system.cpu.l2cache.ReadReq_hits               1938824                       # number of ReadReq hits
+system.cpu.l2cache.ReadReq_miss_latency    1178290000                       # number of ReadReq miss cycles
+system.cpu.l2cache.ReadReq_miss_rate         0.017420                       # miss rate for ReadReq accesses
+system.cpu.l2cache.ReadReq_misses               34373                       # number of ReadReq misses
+system.cpu.l2cache.ReadReq_mshr_miss_latency   1066043500                       # number of ReadReq MSHR miss cycles
+system.cpu.l2cache.ReadReq_mshr_miss_rate     0.017420                       # mshr miss rate for ReadReq accesses
+system.cpu.l2cache.ReadReq_mshr_misses          34373                       # number of ReadReq MSHR misses
+system.cpu.l2cache.UpgradeReq_accesses              1                       # number of UpgradeReq accesses(hits+misses)
+system.cpu.l2cache.UpgradeReq_avg_mshr_miss_latency        31000                       # average UpgradeReq mshr miss latency
+system.cpu.l2cache.UpgradeReq_miss_rate             1                       # miss rate for UpgradeReq accesses
+system.cpu.l2cache.UpgradeReq_misses                1                       # number of UpgradeReq misses
+system.cpu.l2cache.UpgradeReq_mshr_miss_latency        31000                       # number of UpgradeReq MSHR miss cycles
+system.cpu.l2cache.UpgradeReq_mshr_miss_rate            1                       # mshr miss rate for UpgradeReq accesses
+system.cpu.l2cache.UpgradeReq_mshr_misses            1                       # number of UpgradeReq MSHR misses
+system.cpu.l2cache.Writeback_accesses         1442058                       # number of Writeback accesses(hits+misses)
+system.cpu.l2cache.Writeback_hits             1442058                       # number of Writeback hits
+system.cpu.l2cache.avg_blocked_cycles::no_mshrs  2176.470588                       # average number of cycles each access was blocked
 system.cpu.l2cache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.l2cache.avg_refs                 42.751383                       # Average number of references to valid blocks.
-system.cpu.l2cache.blocked::no_mshrs                0                       # number of cycles access was blocked
+system.cpu.l2cache.avg_refs                 42.835533                       # Average number of references to valid blocks.
+system.cpu.l2cache.blocked::no_mshrs               17                       # number of cycles access was blocked
 system.cpu.l2cache.blocked::no_targets              0                       # number of cycles access was blocked
-system.cpu.l2cache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
+system.cpu.l2cache.blocked_cycles::no_mshrs        37000                       # number of cycles access was blocked
 system.cpu.l2cache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.l2cache.cache_copies                     0                       # number of cache copies performed
-system.cpu.l2cache.demand_accesses            2077084                       # number of demand (read+write) accesses
-system.cpu.l2cache.demand_avg_miss_latency 34292.452953                       # average overall miss latency
-system.cpu.l2cache.demand_avg_mshr_miss_latency 31030.622869                       # average overall mshr miss latency
-system.cpu.l2cache.demand_hits                2000246                       # number of demand (read+write) hits
-system.cpu.l2cache.demand_miss_latency     2634963500                       # number of demand (read+write) miss cycles
-system.cpu.l2cache.demand_miss_rate          0.036993                       # miss rate for demand accesses
-system.cpu.l2cache.demand_misses                76838                       # number of demand (read+write) misses
+system.cpu.l2cache.demand_accesses            2079323                       # number of demand (read+write) accesses
+system.cpu.l2cache.demand_avg_miss_latency 34201.991067                       # average overall miss latency
+system.cpu.l2cache.demand_avg_mshr_miss_latency 31034.104671                       # average overall mshr miss latency
+system.cpu.l2cache.demand_hits                2002530                       # number of demand (read+write) hits
+system.cpu.l2cache.demand_miss_latency     2626473500                       # number of demand (read+write) miss cycles
+system.cpu.l2cache.demand_miss_rate          0.036932                       # miss rate for demand accesses
+system.cpu.l2cache.demand_misses                76793                       # number of demand (read+write) misses
 system.cpu.l2cache.demand_mshr_hits                 0                       # number of demand (read+write) MSHR hits
-system.cpu.l2cache.demand_mshr_miss_latency   2384331000                       # number of demand (read+write) MSHR miss cycles
-system.cpu.l2cache.demand_mshr_miss_rate     0.036993                       # mshr miss rate for demand accesses
-system.cpu.l2cache.demand_mshr_misses           76838                       # number of demand (read+write) MSHR misses
+system.cpu.l2cache.demand_mshr_miss_latency   2383202000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.l2cache.demand_mshr_miss_rate     0.036932                       # mshr miss rate for demand accesses
+system.cpu.l2cache.demand_mshr_misses           76793                       # number of demand (read+write) MSHR misses
 system.cpu.l2cache.fast_writes                      0                       # number of fast writes performed
 system.cpu.l2cache.mshr_cap_events                  0                       # number of times MSHR cap was activated
 system.cpu.l2cache.no_allocate_misses               0                       # Number of misses that were no-allocate
-system.cpu.l2cache.occ_%::0                  0.192442                       # Average percentage of cache occupancy
-system.cpu.l2cache.occ_%::1                  0.349126                       # Average percentage of cache occupancy
-system.cpu.l2cache.occ_blocks::0          6305.950681                       # Average occupied blocks per context
-system.cpu.l2cache.occ_blocks::1         11440.167306                       # Average occupied blocks per context
-system.cpu.l2cache.overall_accesses           2077084                       # number of overall (read+write) accesses
-system.cpu.l2cache.overall_avg_miss_latency 34292.452953                       # average overall miss latency
-system.cpu.l2cache.overall_avg_mshr_miss_latency 31030.622869                       # average overall mshr miss latency
+system.cpu.l2cache.occ_%::0                  0.185144                       # Average percentage of cache occupancy
+system.cpu.l2cache.occ_%::1                  0.337522                       # Average percentage of cache occupancy
+system.cpu.l2cache.occ_blocks::0          6066.784489                       # Average occupied blocks per context
+system.cpu.l2cache.occ_blocks::1         11059.931141                       # Average occupied blocks per context
+system.cpu.l2cache.overall_accesses           2079323                       # number of overall (read+write) accesses
+system.cpu.l2cache.overall_avg_miss_latency 34201.991067                       # average overall miss latency
+system.cpu.l2cache.overall_avg_mshr_miss_latency 31034.104671                       # average overall mshr miss latency
 system.cpu.l2cache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.l2cache.overall_hits               2000246                       # number of overall hits
-system.cpu.l2cache.overall_miss_latency    2634963500                       # number of overall miss cycles
-system.cpu.l2cache.overall_miss_rate         0.036993                       # miss rate for overall accesses
-system.cpu.l2cache.overall_misses               76838                       # number of overall misses
+system.cpu.l2cache.overall_hits               2002530                       # number of overall hits
+system.cpu.l2cache.overall_miss_latency    2626473500                       # number of overall miss cycles
+system.cpu.l2cache.overall_miss_rate         0.036932                       # miss rate for overall accesses
+system.cpu.l2cache.overall_misses               76793                       # number of overall misses
 system.cpu.l2cache.overall_mshr_hits                0                       # number of overall MSHR hits
-system.cpu.l2cache.overall_mshr_miss_latency   2384331000                       # number of overall MSHR miss cycles
-system.cpu.l2cache.overall_mshr_miss_rate     0.036993                       # mshr miss rate for overall accesses
-system.cpu.l2cache.overall_mshr_misses          76838                       # number of overall MSHR misses
+system.cpu.l2cache.overall_mshr_miss_latency   2383202000                       # number of overall MSHR miss cycles
+system.cpu.l2cache.overall_mshr_miss_rate     0.036932                       # mshr miss rate for overall accesses
+system.cpu.l2cache.overall_mshr_misses          76793                       # number of overall MSHR misses
 system.cpu.l2cache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.l2cache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.cpu.l2cache.replacements                 49392                       # number of replacements
-system.cpu.l2cache.sampled_refs                 77392                       # Sample count of references to valid blocks.
+system.cpu.l2cache.replacements                 49342                       # number of replacements
+system.cpu.l2cache.sampled_refs                 77347                       # Sample count of references to valid blocks.
 system.cpu.l2cache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.l2cache.tagsinuse             17746.117987                       # Cycle average of tags in use
-system.cpu.l2cache.total_refs                 3308615                       # Total number of references to valid blocks.
+system.cpu.l2cache.tagsinuse             17126.715630                       # Cycle average of tags in use
+system.cpu.l2cache.total_refs                 3313200                       # Total number of references to valid blocks.
 system.cpu.l2cache.warmup_cycle                     0                       # Cycle when the warmup percentage was hit.
-system.cpu.l2cache.writebacks                   29474                       # number of writebacks
-system.cpu.memDep0.conflictingLoads          22358679                       # Number of conflicting loads.
-system.cpu.memDep0.conflictingStores          3757180                       # Number of conflicting stores.
-system.cpu.memDep0.insertedLoads            131280417                       # Number of loads inserted to the mem dependence unit.
-system.cpu.memDep0.insertedStores            41039188                       # Number of stores inserted to the mem dependence unit.
-system.cpu.misc_regfile_reads               204301939                       # number of misc regfile reads
-system.cpu.numCycles                        341361263                       # number of cpu cycles simulated
+system.cpu.l2cache.writebacks                   29450                       # number of writebacks
+system.cpu.memDep0.conflictingLoads          87882428                       # Number of conflicting loads.
+system.cpu.memDep0.conflictingStores         16100005                       # Number of conflicting stores.
+system.cpu.memDep0.insertedLoads            138835558                       # Number of loads inserted to the mem dependence unit.
+system.cpu.memDep0.insertedStores            42750154                       # Number of stores inserted to the mem dependence unit.
+system.cpu.misc_regfile_reads               218323859                       # number of misc regfile reads
+system.cpu.numCycles                        197244429                       # number of cpu cycles simulated
 system.cpu.numWorkItemsCompleted                    0                       # number of work items this cpu completed
 system.cpu.numWorkItemsStarted                      0                       # number of work items this cpu started
-system.cpu.rename.RENAME:BlockCycles           486743                       # Number of cycles rename is blocking
+system.cpu.rename.RENAME:BlockCycles          6557218                       # Number of cycles rename is blocking
 system.cpu.rename.RENAME:CommittedMaps      248344192                       # Number of HB maps that are committed
-system.cpu.rename.RENAME:IQFullEvents           12249                       # Number of times rename has blocked due to IQ full
-system.cpu.rename.RENAME:IdleCycles          98511117                       # Number of cycles rename is idle
-system.cpu.rename.RENAME:LSQFullEvents         368076                       # Number of times rename has blocked due to LSQ full
-system.cpu.rename.RENAME:RenameLookups     1292599643                       # Number of register rename lookups that rename has made
-system.cpu.rename.RENAME:RenamedInsts       423407319                       # Number of instructions processed by rename
-system.cpu.rename.RENAME:RenamedOperands    377348250                       # Number of destination operands rename has renamed
-system.cpu.rename.RENAME:RunCycles          222275258                       # Number of cycles rename is running
-system.cpu.rename.RENAME:SquashCycles        19453848                       # Number of cycles rename is squashing
-system.cpu.rename.RENAME:UnblockCycles         514692                       # Number of cycles rename is unblocking
-system.cpu.rename.RENAME:UndoneMaps         129004058                       # Number of HB maps that are undone due to squashing
-system.cpu.rename.RENAME:fp_rename_lookups          291                       # Number of floating rename lookups
-system.cpu.rename.RENAME:int_rename_lookups   1292599352                       # Number of integer rename lookups
-system.cpu.rename.RENAME:serializeStallCycles         5287                       # count of cycles rename stalled for serializing inst
-system.cpu.rename.RENAME:serializingInsts          454                       # count of serializing insts renamed
-system.cpu.rename.RENAME:skidInsts             779091                       # count of insts added to the skid buffer
-system.cpu.rename.RENAME:tempSerializingInsts          452                       # count of temporary serializing insts renamed
-system.cpu.rob.rob_reads                    708961934                       # The number of ROB reads
-system.cpu.rob.rob_writes                   799263493                       # The number of ROB writes
-system.cpu.timesIdled                            5627                       # Number of times that the entire CPU went into an idle state and unscheduled itself
+system.cpu.rename.RENAME:IQFullEvents          228138                       # Number of times rename has blocked due to IQ full
+system.cpu.rename.RENAME:IdleCycles          83203716                       # Number of cycles rename is idle
+system.cpu.rename.RENAME:LSQFullEvents       14824029                       # Number of times rename has blocked due to LSQ full
+system.cpu.rename.RENAME:ROBFullEvents             13                       # Number of times rename has blocked due to ROB full
+system.cpu.rename.RENAME:RenameLookups     1059543178                       # Number of register rename lookups that rename has made
+system.cpu.rename.RENAME:RenamedInsts       431467970                       # Number of instructions processed by rename
+system.cpu.rename.RENAME:RenamedOperands    388798641                       # Number of destination operands rename has renamed
+system.cpu.rename.RENAME:RunCycles           71280917                       # Number of cycles rename is running
+system.cpu.rename.RENAME:SquashCycles        19022168                       # Number of cycles rename is squashing
+system.cpu.rename.RENAME:UnblockCycles       15900092                       # Number of cycles rename is unblocking
+system.cpu.rename.RENAME:UndoneMaps         140454449                       # Number of HB maps that are undone due to squashing
+system.cpu.rename.RENAME:fp_rename_lookups          574                       # Number of floating rename lookups
+system.cpu.rename.RENAME:int_rename_lookups   1059542604                       # Number of integer rename lookups
+system.cpu.rename.RENAME:serializeStallCycles         6421                       # count of cycles rename stalled for serializing inst
+system.cpu.rename.RENAME:serializingInsts          469                       # count of serializing insts renamed
+system.cpu.rename.RENAME:skidInsts           38067869                       # count of insts added to the skid buffer
+system.cpu.rename.RENAME:tempSerializingInsts          463                       # count of temporary serializing insts renamed
+system.cpu.rob.rob_reads                    574492355                       # The number of ROB reads
+system.cpu.rob.rob_writes                   837321831                       # The number of ROB writes
+system.cpu.timesIdled                           40675                       # Number of times that the entire CPU went into an idle state and unscheduled itself
 system.cpu.workload.PROG:num_syscalls             444                       # Number of system calls
 
 ---------- End Simulation Statistics   ----------
diff --git a/tests/long/10.mcf/ref/x86/linux/simple-atomic/simout b/tests/long/10.mcf/ref/x86/linux/simple-atomic/simout
index e76d608191..2aa2852bef 100755
--- a/tests/long/10.mcf/ref/x86/linux/simple-atomic/simout
+++ b/tests/long/10.mcf/ref/x86/linux/simple-atomic/simout
@@ -5,9 +5,9 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:32:12
+M5 compiled Feb  8 2011 00:58:32
+M5 revision 705a4d351a43 7939 default qtip resforflagsstats.patch tip
+M5 started Feb  8 2011 00:58:34
 M5 executing on burrito
 command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/long/10.mcf/x86/linux/simple-atomic -re tests/run.py build/X86_SE/tests/fast/long/10.mcf/x86/linux/simple-atomic
 Global frequency set at 1000000000000 ticks per second
diff --git a/tests/long/10.mcf/ref/x86/linux/simple-atomic/stats.txt b/tests/long/10.mcf/ref/x86/linux/simple-atomic/stats.txt
index bcab65c404..aacdb23096 100644
--- a/tests/long/10.mcf/ref/x86/linux/simple-atomic/stats.txt
+++ b/tests/long/10.mcf/ref/x86/linux/simple-atomic/stats.txt
@@ -1,9 +1,9 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                 722489                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 358012                       # Number of bytes of host memory used
-host_seconds                                   385.05                       # Real time elapsed on the host
-host_tick_rate                              438776725                       # Simulator tick rate (ticks/s)
+host_inst_rate                                1568972                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 358500                       # Number of bytes of host memory used
+host_seconds                                   177.31                       # Real time elapsed on the host
+host_tick_rate                              952856596                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                   278192520                       # Number of instructions simulated
 sim_seconds                                  0.168950                       # Number of seconds simulated
@@ -24,7 +24,7 @@ system.cpu.num_idle_cycles                          0                       # Nu
 system.cpu.num_insts                        278192520                       # Number of instructions executed
 system.cpu.num_int_alu_accesses             278186228                       # Number of integer alu accesses
 system.cpu.num_int_insts                    278186228                       # number of integer instructions
-system.cpu.num_int_register_reads           855210512                       # number of times the integer registers were read
+system.cpu.num_int_register_reads           685043114                       # number of times the integer registers were read
 system.cpu.num_int_register_writes          248344166                       # number of times the integer registers were written
 system.cpu.num_load_insts                    90779388                       # Number of load instructions
 system.cpu.num_mem_refs                     122219139                       # number of memory refs
diff --git a/tests/long/10.mcf/ref/x86/linux/simple-timing/simout b/tests/long/10.mcf/ref/x86/linux/simple-timing/simout
index 0b92276cca..56b5fe9df9 100755
--- a/tests/long/10.mcf/ref/x86/linux/simple-timing/simout
+++ b/tests/long/10.mcf/ref/x86/linux/simple-timing/simout
@@ -5,9 +5,9 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:32:12
+M5 compiled Feb  8 2011 00:58:32
+M5 revision 705a4d351a43 7939 default qtip resforflagsstats.patch tip
+M5 started Feb  8 2011 00:58:34
 M5 executing on burrito
 command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/long/10.mcf/x86/linux/simple-timing -re tests/run.py build/X86_SE/tests/fast/long/10.mcf/x86/linux/simple-timing
 Global frequency set at 1000000000000 ticks per second
diff --git a/tests/long/10.mcf/ref/x86/linux/simple-timing/stats.txt b/tests/long/10.mcf/ref/x86/linux/simple-timing/stats.txt
index cf6f03e98a..e90dea7b7d 100644
--- a/tests/long/10.mcf/ref/x86/linux/simple-timing/stats.txt
+++ b/tests/long/10.mcf/ref/x86/linux/simple-timing/stats.txt
@@ -1,9 +1,9 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                 424375                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 365728                       # Number of bytes of host memory used
-host_seconds                                   655.54                       # Real time elapsed on the host
-host_tick_rate                              564440982                       # Simulator tick rate (ticks/s)
+host_inst_rate                                1018906                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 366224                       # Number of bytes of host memory used
+host_seconds                                   273.03                       # Real time elapsed on the host
+host_tick_rate                             1355197592                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                   278192520                       # Number of instructions simulated
 sim_seconds                                  0.370011                       # Number of seconds simulated
@@ -213,7 +213,7 @@ system.cpu.num_idle_cycles                          0                       # Nu
 system.cpu.num_insts                        278192520                       # Number of instructions executed
 system.cpu.num_int_alu_accesses             278186228                       # Number of integer alu accesses
 system.cpu.num_int_insts                    278186228                       # number of integer instructions
-system.cpu.num_int_register_reads           855210512                       # number of times the integer registers were read
+system.cpu.num_int_register_reads           685043114                       # number of times the integer registers were read
 system.cpu.num_int_register_writes          248344166                       # number of times the integer registers were written
 system.cpu.num_load_insts                    90779388                       # Number of load instructions
 system.cpu.num_mem_refs                     122219139                       # number of memory refs
diff --git a/tests/long/20.parser/ref/x86/linux/o3-timing/config.ini b/tests/long/20.parser/ref/x86/linux/o3-timing/config.ini
index 8363ae747d..da344ea4b5 100644
--- a/tests/long/20.parser/ref/x86/linux/o3-timing/config.ini
+++ b/tests/long/20.parser/ref/x86/linux/o3-timing/config.ini
@@ -488,7 +488,7 @@ type=ExeTracer
 [system.cpu.workload]
 type=LiveProcess
 cmd=parser 2.1.dict -batch
-cwd=build/X86_SE/tests/fast/long/20.parser/x86/linux/o3-timing
+cwd=build/X86_SE/tests/opt/long/20.parser/x86/linux/o3-timing
 egid=100
 env=
 errout=cerr
diff --git a/tests/long/20.parser/ref/x86/linux/o3-timing/simout b/tests/long/20.parser/ref/x86/linux/o3-timing/simout
index 4d3b5f29b8..696087afcd 100755
--- a/tests/long/20.parser/ref/x86/linux/o3-timing/simout
+++ b/tests/long/20.parser/ref/x86/linux/o3-timing/simout
@@ -5,16 +5,16 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:32:13
+M5 compiled Feb 12 2011 02:22:23
+M5 revision 5e76f9de6972 7961 default qtip tip x86branchdetectstats.patch
+M5 started Feb 12 2011 02:22:27
 M5 executing on burrito
-command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/long/20.parser/x86/linux/o3-timing -re tests/run.py build/X86_SE/tests/fast/long/20.parser/x86/linux/o3-timing
+command line: build/X86_SE/m5.opt -d build/X86_SE/tests/opt/long/20.parser/x86/linux/o3-timing -re tests/run.py build/X86_SE/tests/opt/long/20.parser/x86/linux/o3-timing
 Global frequency set at 1000000000000 ticks per second
 info: Entering event queue @ 0.  Starting simulation...
 
- Reading the dictionary files: *****************************info: Increasing stack size by one page.
-********************
+ Reading the dictionary files: ***********************info: Increasing stack size by one page.
+**************************
  58924 words stored in 3784810 bytes
 
 
@@ -74,4 +74,4 @@ info: Increasing stack size by one page.
   about 2 million people attended 
   the five best costumes got prizes 
 No errors!
-Exiting @ tick 817002039000 because target called exit()
+Exiting @ tick 610952992000 because target called exit()
diff --git a/tests/long/20.parser/ref/x86/linux/o3-timing/stats.txt b/tests/long/20.parser/ref/x86/linux/o3-timing/stats.txt
index c39e8dfaec..070979214c 100644
--- a/tests/long/20.parser/ref/x86/linux/o3-timing/stats.txt
+++ b/tests/long/20.parser/ref/x86/linux/o3-timing/stats.txt
@@ -1,475 +1,475 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                 160923                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 240360                       # Number of bytes of host memory used
-host_seconds                                  9501.35                       # Real time elapsed on the host
-host_tick_rate                               85987979                       # Simulator tick rate (ticks/s)
+host_inst_rate                                 130186                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 285488                       # Number of bytes of host memory used
+host_seconds                                 11733.03                       # Real time elapsed on the host
+host_tick_rate                               52071207                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
-sim_insts                                  1528988756                       # Number of instructions simulated
-sim_seconds                                  0.817002                       # Number of seconds simulated
-sim_ticks                                817002039000                       # Number of ticks simulated
+sim_insts                                  1527476062                       # Number of instructions simulated
+sim_seconds                                  0.610953                       # Number of seconds simulated
+sim_ticks                                610952992000                       # Number of ticks simulated
 system.cpu.BPredUnit.BTBCorrect                     0                       # Number of correct BTB predictions (this stat may not work properly.
-system.cpu.BPredUnit.BTBHits                197674461                       # Number of BTB hits
-system.cpu.BPredUnit.BTBLookups             215147546                       # Number of BTB lookups
+system.cpu.BPredUnit.BTBHits                220273443                       # Number of BTB hits
+system.cpu.BPredUnit.BTBLookups             239822696                       # Number of BTB lookups
 system.cpu.BPredUnit.RASInCorrect                   0                       # Number of incorrect RAS predictions.
-system.cpu.BPredUnit.condIncorrect           17901021                       # Number of conditional branches incorrect
-system.cpu.BPredUnit.condPredicted          215739151                       # Number of conditional branches predicted
-system.cpu.BPredUnit.lookups                215739151                       # Number of BP lookups
+system.cpu.BPredUnit.condIncorrect           16691862                       # Number of conditional branches incorrect
+system.cpu.BPredUnit.condPredicted          254901320                       # Number of conditional branches predicted
+system.cpu.BPredUnit.lookups                254901320                       # Number of BP lookups
 system.cpu.BPredUnit.usedRAS                        0                       # Number of times the RAS was used to get a target.
-system.cpu.commit.COM:branches              149758588                       # Number of branches committed
-system.cpu.commit.COM:bw_lim_events           8186576                       # number cycles where commit BW limit reached
+system.cpu.commit.COM:branches              149616585                       # Number of branches committed
+system.cpu.commit.COM:bw_lim_events          33918821                       # number cycles where commit BW limit reached
 system.cpu.commit.COM:bw_limited                    0                       # number of insts not committed due to BW limits
-system.cpu.commit.COM:committed_per_cycle::samples   1552269342                       # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::mean     0.985002                       # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::stdev     1.301395                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::samples   1083369873                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::mean     1.409930                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::stdev     1.877801                       # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::underflows            0      0.00%      0.00% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::0    694185983     44.72%     44.72% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::1    509617235     32.83%     77.55% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::2    176087126     11.34%     88.90% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::3    105147186      6.77%     95.67% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::4     31137095      2.01%     97.67% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::5     11224991      0.72%     98.40% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::6     11192282      0.72%     99.12% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::7      5490868      0.35%     99.47% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::8      8186576      0.53%    100.00% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::0    454928288     41.99%     41.99% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::1    282557908     26.08%     68.07% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::2    120287774     11.10%     79.18% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::3    105365409      9.73%     88.90% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::4     40172301      3.71%     92.61% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::5     27676804      2.55%     95.16% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::6     11415389      1.05%     96.22% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::7      7047179      0.65%     96.87% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::8     33918821      3.13%    100.00% # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::overflows            0      0.00%    100.00% # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::min_value            0                       # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::max_value            8                       # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::total   1552269342                       # Number of insts commited each cycle
-system.cpu.commit.COM:count                1528988756                       # Number of instructions committed
+system.cpu.commit.COM:committed_per_cycle::total   1083369873                       # Number of insts commited each cycle
+system.cpu.commit.COM:count                1527476062                       # Number of instructions committed
 system.cpu.commit.COM:fp_insts                      0                       # Number of committed floating point instructions.
 system.cpu.commit.COM:function_calls                0                       # Number of function calls committed.
-system.cpu.commit.COM:int_insts            1528317614                       # Number of committed integer instructions.
-system.cpu.commit.COM:loads                 384102160                       # Number of loads committed
+system.cpu.commit.COM:int_insts            1526804920                       # Number of committed integer instructions.
+system.cpu.commit.COM:loads                 383724495                       # Number of loads committed
 system.cpu.commit.COM:membars                       0                       # Number of memory barriers committed
-system.cpu.commit.COM:refs                  533262345                       # Number of memory references committed
+system.cpu.commit.COM:refs                  532790180                       # Number of memory references committed
 system.cpu.commit.COM:swp_count                     0                       # Number of s/w prefetches committed
-system.cpu.commit.branchMispredicts          17902344                       # The number of times a branch was mispredicted
-system.cpu.commit.commitCommittedInsts     1528988756                       # The number of committed instructions
+system.cpu.commit.branchMispredicts          16726957                       # The number of times a branch was mispredicted
+system.cpu.commit.commitCommittedInsts     1527476062                       # The number of committed instructions
 system.cpu.commit.commitNonSpecStalls             553                       # The number of times commit has been forced to stall to communicate backwards
-system.cpu.commit.commitSquashedInsts       459109010                       # The number of squashed insts skipped by commit
-system.cpu.committedInsts                  1528988756                       # Number of Instructions Simulated
-system.cpu.committedInsts_total            1528988756                       # Number of Instructions Simulated
-system.cpu.cpi                               1.068683                       # CPI: Cycles Per Instruction
-system.cpu.cpi_total                         1.068683                       # CPI: Total CPI of All Threads
-system.cpu.dcache.ReadReq_accesses          352008034                       # number of ReadReq accesses(hits+misses)
-system.cpu.dcache.ReadReq_avg_miss_latency 14100.976079                       # average ReadReq miss latency
-system.cpu.dcache.ReadReq_avg_mshr_miss_latency  8499.435037                       # average ReadReq mshr miss latency
-system.cpu.dcache.ReadReq_hits              350035037                       # number of ReadReq hits
-system.cpu.dcache.ReadReq_miss_latency    27821183500                       # number of ReadReq miss cycles
-system.cpu.dcache.ReadReq_miss_rate          0.005605                       # miss rate for ReadReq accesses
-system.cpu.dcache.ReadReq_misses              1972997                       # number of ReadReq misses
-system.cpu.dcache.ReadReq_mshr_hits            237485                       # number of ReadReq MSHR hits
-system.cpu.dcache.ReadReq_mshr_miss_latency  14750871500                       # number of ReadReq MSHR miss cycles
-system.cpu.dcache.ReadReq_mshr_miss_rate     0.004930                       # mshr miss rate for ReadReq accesses
-system.cpu.dcache.ReadReq_mshr_misses         1735512                       # number of ReadReq MSHR misses
-system.cpu.dcache.WriteReq_accesses         149160201                       # number of WriteReq accesses(hits+misses)
-system.cpu.dcache.WriteReq_avg_miss_latency 15942.157352                       # average WriteReq miss latency
-system.cpu.dcache.WriteReq_avg_mshr_miss_latency 12645.445755                       # average WriteReq mshr miss latency
-system.cpu.dcache.WriteReq_hits             148213244                       # number of WriteReq hits
-system.cpu.dcache.WriteReq_miss_latency   15096537500                       # number of WriteReq miss cycles
-system.cpu.dcache.WriteReq_miss_rate         0.006349                       # miss rate for WriteReq accesses
-system.cpu.dcache.WriteReq_misses              946957                       # number of WriteReq misses
-system.cpu.dcache.WriteReq_mshr_hits           159966                       # number of WriteReq MSHR hits
-system.cpu.dcache.WriteReq_mshr_miss_latency   9951852000                       # number of WriteReq MSHR miss cycles
-system.cpu.dcache.WriteReq_mshr_miss_rate     0.005276                       # mshr miss rate for WriteReq accesses
-system.cpu.dcache.WriteReq_mshr_misses         786991                       # number of WriteReq MSHR misses
+system.cpu.commit.commitSquashedInsts       841443918                       # The number of squashed insts skipped by commit
+system.cpu.committedInsts                  1527476062                       # Number of Instructions Simulated
+system.cpu.committedInsts_total            1527476062                       # Number of Instructions Simulated
+system.cpu.cpi                               0.799951                       # CPI: Cycles Per Instruction
+system.cpu.cpi_total                         0.799951                       # CPI: Total CPI of All Threads
+system.cpu.dcache.ReadReq_accesses          320046346                       # number of ReadReq accesses(hits+misses)
+system.cpu.dcache.ReadReq_avg_miss_latency 15794.070061                       # average ReadReq miss latency
+system.cpu.dcache.ReadReq_avg_mshr_miss_latency  8150.695480                       # average ReadReq mshr miss latency
+system.cpu.dcache.ReadReq_hits              317137092                       # number of ReadReq hits
+system.cpu.dcache.ReadReq_miss_latency    45948961500                       # number of ReadReq miss cycles
+system.cpu.dcache.ReadReq_miss_rate          0.009090                       # miss rate for ReadReq accesses
+system.cpu.dcache.ReadReq_misses              2909254                       # number of ReadReq misses
+system.cpu.dcache.ReadReq_mshr_hits           1183970                       # number of ReadReq MSHR hits
+system.cpu.dcache.ReadReq_mshr_miss_latency  14062264500                       # number of ReadReq MSHR miss cycles
+system.cpu.dcache.ReadReq_mshr_miss_rate     0.005391                       # mshr miss rate for ReadReq accesses
+system.cpu.dcache.ReadReq_mshr_misses         1725284                       # number of ReadReq MSHR misses
+system.cpu.dcache.WriteReq_accesses         149065701                       # number of WriteReq accesses(hits+misses)
+system.cpu.dcache.WriteReq_avg_miss_latency 23554.108597                       # average WriteReq miss latency
+system.cpu.dcache.WriteReq_avg_mshr_miss_latency 18051.470496                       # average WriteReq mshr miss latency
+system.cpu.dcache.WriteReq_hits             147419835                       # number of WriteReq hits
+system.cpu.dcache.WriteReq_miss_latency   38766906500                       # number of WriteReq miss cycles
+system.cpu.dcache.WriteReq_miss_rate         0.011041                       # miss rate for WriteReq accesses
+system.cpu.dcache.WriteReq_misses             1645866                       # number of WriteReq misses
+system.cpu.dcache.WriteReq_mshr_hits           608291                       # number of WriteReq MSHR hits
+system.cpu.dcache.WriteReq_mshr_miss_latency  18729754500                       # number of WriteReq MSHR miss cycles
+system.cpu.dcache.WriteReq_mshr_miss_rate     0.006961                       # mshr miss rate for WriteReq accesses
+system.cpu.dcache.WriteReq_mshr_misses        1037575                       # number of WriteReq MSHR misses
 system.cpu.dcache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.dcache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.dcache.avg_refs                 197.709284                       # Average number of references to valid blocks.
+system.cpu.dcache.avg_refs                 185.704246                       # Average number of references to valid blocks.
 system.cpu.dcache.blocked::no_mshrs                 0                       # number of cycles access was blocked
 system.cpu.dcache.blocked::no_targets               0                       # number of cycles access was blocked
 system.cpu.dcache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.dcache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.dcache.cache_copies                      0                       # number of cache copies performed
-system.cpu.dcache.demand_accesses           501168235                       # number of demand (read+write) accesses
-system.cpu.dcache.demand_avg_miss_latency 14698.081203                       # average overall miss latency
-system.cpu.dcache.demand_avg_mshr_miss_latency  9792.941178                       # average overall mshr miss latency
-system.cpu.dcache.demand_hits               498248281                       # number of demand (read+write) hits
-system.cpu.dcache.demand_miss_latency     42917721000                       # number of demand (read+write) miss cycles
-system.cpu.dcache.demand_miss_rate           0.005826                       # miss rate for demand accesses
-system.cpu.dcache.demand_misses               2919954                       # number of demand (read+write) misses
-system.cpu.dcache.demand_mshr_hits             397451                       # number of demand (read+write) MSHR hits
-system.cpu.dcache.demand_mshr_miss_latency  24702723500                       # number of demand (read+write) MSHR miss cycles
-system.cpu.dcache.demand_mshr_miss_rate      0.005033                       # mshr miss rate for demand accesses
-system.cpu.dcache.demand_mshr_misses          2522503                       # number of demand (read+write) MSHR misses
+system.cpu.dcache.demand_accesses           469112047                       # number of demand (read+write) accesses
+system.cpu.dcache.demand_avg_miss_latency 18597.944291                       # average overall miss latency
+system.cpu.dcache.demand_avg_mshr_miss_latency 11868.871701                       # average overall mshr miss latency
+system.cpu.dcache.demand_hits               464556927                       # number of demand (read+write) hits
+system.cpu.dcache.demand_miss_latency     84715868000                       # number of demand (read+write) miss cycles
+system.cpu.dcache.demand_miss_rate           0.009710                       # miss rate for demand accesses
+system.cpu.dcache.demand_misses               4555120                       # number of demand (read+write) misses
+system.cpu.dcache.demand_mshr_hits            1792261                       # number of demand (read+write) MSHR hits
+system.cpu.dcache.demand_mshr_miss_latency  32792019000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.dcache.demand_mshr_miss_rate      0.005890                       # mshr miss rate for demand accesses
+system.cpu.dcache.demand_mshr_misses          2762859                       # number of demand (read+write) MSHR misses
 system.cpu.dcache.fast_writes                       0                       # number of fast writes performed
 system.cpu.dcache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.dcache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.dcache.occ_%::0                   0.997749                       # Average percentage of cache occupancy
-system.cpu.dcache.occ_blocks::0           4086.780222                       # Average occupied blocks per context
-system.cpu.dcache.overall_accesses          501168235                       # number of overall (read+write) accesses
-system.cpu.dcache.overall_avg_miss_latency 14698.081203                       # average overall miss latency
-system.cpu.dcache.overall_avg_mshr_miss_latency  9792.941178                       # average overall mshr miss latency
+system.cpu.dcache.occ_%::0                   0.998028                       # Average percentage of cache occupancy
+system.cpu.dcache.occ_blocks::0           4087.922333                       # Average occupied blocks per context
+system.cpu.dcache.overall_accesses          469112047                       # number of overall (read+write) accesses
+system.cpu.dcache.overall_avg_miss_latency 18597.944291                       # average overall miss latency
+system.cpu.dcache.overall_avg_mshr_miss_latency 11868.871701                       # average overall mshr miss latency
 system.cpu.dcache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.dcache.overall_hits              498248281                       # number of overall hits
-system.cpu.dcache.overall_miss_latency    42917721000                       # number of overall miss cycles
-system.cpu.dcache.overall_miss_rate          0.005826                       # miss rate for overall accesses
-system.cpu.dcache.overall_misses              2919954                       # number of overall misses
-system.cpu.dcache.overall_mshr_hits            397451                       # number of overall MSHR hits
-system.cpu.dcache.overall_mshr_miss_latency  24702723500                       # number of overall MSHR miss cycles
-system.cpu.dcache.overall_mshr_miss_rate     0.005033                       # mshr miss rate for overall accesses
-system.cpu.dcache.overall_mshr_misses         2522503                       # number of overall MSHR misses
+system.cpu.dcache.overall_hits              464556927                       # number of overall hits
+system.cpu.dcache.overall_miss_latency    84715868000                       # number of overall miss cycles
+system.cpu.dcache.overall_miss_rate          0.009710                       # miss rate for overall accesses
+system.cpu.dcache.overall_misses              4555120                       # number of overall misses
+system.cpu.dcache.overall_mshr_hits           1792261                       # number of overall MSHR hits
+system.cpu.dcache.overall_mshr_miss_latency  32792019000                       # number of overall MSHR miss cycles
+system.cpu.dcache.overall_mshr_miss_rate     0.005890                       # mshr miss rate for overall accesses
+system.cpu.dcache.overall_mshr_misses         2762859                       # number of overall MSHR misses
 system.cpu.dcache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.dcache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.cpu.dcache.replacements                2516044                       # number of replacements
-system.cpu.dcache.sampled_refs                2520140                       # Sample count of references to valid blocks.
+system.cpu.dcache.replacements                2504740                       # number of replacements
+system.cpu.dcache.sampled_refs                2508836                       # Sample count of references to valid blocks.
 system.cpu.dcache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.dcache.tagsinuse               4086.780222                       # Cycle average of tags in use
-system.cpu.dcache.total_refs                498255076                       # Total number of references to valid blocks.
-system.cpu.dcache.warmup_cycle             3876881000                       # Cycle when the warmup percentage was hit.
-system.cpu.dcache.writebacks                  2224034                       # number of writebacks
-system.cpu.decode.DECODE:BlockedCycles       25470243                       # Number of cycles decode is blocked
-system.cpu.decode.DECODE:DecodedInsts      2119227193                       # Number of instructions handled by decode
-system.cpu.decode.DECODE:IdleCycles         403203369                       # Number of cycles decode is idle
-system.cpu.decode.DECODE:RunCycles         1116867689                       # Number of cycles decode is running
-system.cpu.decode.DECODE:SquashCycles        71636028                       # Number of cycles decode is squashing
-system.cpu.decode.DECODE:UnblockCycles        6728041                       # Number of cycles decode is unblocking
-system.cpu.fetch.Branches                   215739151                       # Number of branches that fetch encountered
-system.cpu.fetch.CacheLines                 165973622                       # Number of cache lines fetched
-system.cpu.fetch.Cycles                    1190006834                       # Number of cycles fetch has run and was not squashing or blocked
-system.cpu.fetch.IcacheSquashes               2725815                       # Number of outstanding Icache misses that were squashed
-system.cpu.fetch.Insts                     1144873460                       # Number of instructions fetch has processed
-system.cpu.fetch.MiscStallCycles                 1839                       # Number of cycles fetch has spent waiting on interrupts, or bad addresses, or out of MSHRs
-system.cpu.fetch.SquashCycles                29822694                       # Number of cycles fetch has spent squashing
-system.cpu.fetch.branchRate                  0.132031                       # Number of branch fetches per cycle
-system.cpu.fetch.icacheStallCycles          165973622                       # Number of cycles fetch is stalled on an Icache miss
-system.cpu.fetch.predictedBranches          197674461                       # Number of branches that fetch has predicted taken
-system.cpu.fetch.rate                        0.700655                       # Number of inst fetches per cycle
-system.cpu.fetch.rateDist::samples         1623905370                       # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::mean              1.336094                       # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::stdev             1.273592                       # Number of instructions fetched each cycle (Total)
+system.cpu.dcache.tagsinuse               4087.922333                       # Cycle average of tags in use
+system.cpu.dcache.total_refs                465901497                       # Total number of references to valid blocks.
+system.cpu.dcache.warmup_cycle             2529382000                       # Cycle when the warmup percentage was hit.
+system.cpu.dcache.writebacks                  2229751                       # number of writebacks
+system.cpu.decode.DECODE:BlockedCycles      215366555                       # Number of cycles decode is blocked
+system.cpu.decode.DECODE:DecodedInsts      2516935544                       # Number of instructions handled by decode
+system.cpu.decode.DECODE:IdleCycles         437043857                       # Number of cycles decode is idle
+system.cpu.decode.DECODE:RunCycles          404205746                       # Number of cycles decode is running
+system.cpu.decode.DECODE:SquashCycles       113949773                       # Number of cycles decode is squashing
+system.cpu.decode.DECODE:UnblockCycles       26753715                       # Number of cycles decode is unblocking
+system.cpu.fetch.Branches                   254901320                       # Number of branches that fetch encountered
+system.cpu.fetch.CacheLines                 190461812                       # Number of cache lines fetched
+system.cpu.fetch.Cycles                     445534669                       # Number of cycles fetch has run and was not squashing or blocked
+system.cpu.fetch.IcacheSquashes               3068431                       # Number of outstanding Icache misses that were squashed
+system.cpu.fetch.Insts                     1374706338                       # Number of instructions fetch has processed
+system.cpu.fetch.MiscStallCycles                85274                       # Number of cycles fetch has spent waiting on interrupts, or bad addresses, or out of MSHRs
+system.cpu.fetch.SquashCycles                18549281                       # Number of cycles fetch has spent squashing
+system.cpu.fetch.branchRate                  0.208610                       # Number of branch fetches per cycle
+system.cpu.fetch.icacheStallCycles          190461812                       # Number of cycles fetch is stalled on an Icache miss
+system.cpu.fetch.predictedBranches          220273443                       # Number of branches that fetch has predicted taken
+system.cpu.fetch.rate                        1.125051                       # Number of inst fetches per cycle
+system.cpu.fetch.rateDist::samples         1197319646                       # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::mean              2.144693                       # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::stdev             3.178811                       # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::underflows               0      0.00%      0.00% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::0                477535637     29.41%     29.41% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::1                564706157     34.77%     64.18% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::2                259330057     15.97%     80.15% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::3                261180842     16.08%     96.23% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::4                 22809127      1.40%     97.64% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::5                 31399021      1.93%     99.57% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::6                   502829      0.03%     99.60% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::7                       12      0.00%     99.60% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::8                  6441688      0.40%    100.00% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::0                756027205     63.14%     63.14% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::1                 34054494      2.84%     65.99% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::2                 36745231      3.07%     69.06% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::3                 33767076      2.82%     71.88% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::4                 21459245      1.79%     73.67% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::5                 40493114      3.38%     77.05% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::6                 45860411      3.83%     80.88% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::7                 35731624      2.98%     83.87% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::8                193181246     16.13%    100.00% # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::overflows                0      0.00%    100.00% # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::min_value                0                       # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::max_value                8                       # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::total           1623905370                       # Number of instructions fetched each cycle (Total)
-system.cpu.fp_regfile_reads                        10                       # number of floating regfile reads
-system.cpu.icache.ReadReq_accesses          165973622                       # number of ReadReq accesses(hits+misses)
-system.cpu.icache.ReadReq_avg_miss_latency 22741.617211                       # average ReadReq miss latency
-system.cpu.icache.ReadReq_avg_mshr_miss_latency 19372.661290                       # average ReadReq mshr miss latency
-system.cpu.icache.ReadReq_hits              165966882                       # number of ReadReq hits
-system.cpu.icache.ReadReq_miss_latency      153278500                       # number of ReadReq miss cycles
-system.cpu.icache.ReadReq_miss_rate          0.000041                       # miss rate for ReadReq accesses
-system.cpu.icache.ReadReq_misses                 6740                       # number of ReadReq misses
-system.cpu.icache.ReadReq_mshr_hits               540                       # number of ReadReq MSHR hits
-system.cpu.icache.ReadReq_mshr_miss_latency    120110500                       # number of ReadReq MSHR miss cycles
-system.cpu.icache.ReadReq_mshr_miss_rate     0.000037                       # mshr miss rate for ReadReq accesses
-system.cpu.icache.ReadReq_mshr_misses            6200                       # number of ReadReq MSHR misses
+system.cpu.fetch.rateDist::total           1197319646                       # Number of instructions fetched each cycle (Total)
+system.cpu.fp_regfile_reads                        31                       # number of floating regfile reads
+system.cpu.icache.ReadReq_accesses          190461812                       # number of ReadReq accesses(hits+misses)
+system.cpu.icache.ReadReq_avg_miss_latency  6527.954910                       # average ReadReq miss latency
+system.cpu.icache.ReadReq_avg_mshr_miss_latency  3419.281975                       # average ReadReq mshr miss latency
+system.cpu.icache.ReadReq_hits              190192396                       # number of ReadReq hits
+system.cpu.icache.ReadReq_miss_latency     1758735500                       # number of ReadReq miss cycles
+system.cpu.icache.ReadReq_miss_rate          0.001415                       # miss rate for ReadReq accesses
+system.cpu.icache.ReadReq_misses               269416                       # number of ReadReq misses
+system.cpu.icache.ReadReq_mshr_hits              1570                       # number of ReadReq MSHR hits
+system.cpu.icache.ReadReq_mshr_miss_latency    915841000                       # number of ReadReq MSHR miss cycles
+system.cpu.icache.ReadReq_mshr_miss_rate     0.001406                       # mshr miss rate for ReadReq accesses
+system.cpu.icache.ReadReq_mshr_misses          267846                       # number of ReadReq MSHR misses
 system.cpu.icache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.icache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.icache.avg_refs               49795.025203                       # Average number of references to valid blocks.
+system.cpu.icache.avg_refs               17699.832480                       # Average number of references to valid blocks.
 system.cpu.icache.blocked::no_mshrs                 0                       # number of cycles access was blocked
 system.cpu.icache.blocked::no_targets               0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.icache.cache_copies                      0                       # number of cache copies performed
-system.cpu.icache.demand_accesses           165973622                       # number of demand (read+write) accesses
-system.cpu.icache.demand_avg_miss_latency 22741.617211                       # average overall miss latency
-system.cpu.icache.demand_avg_mshr_miss_latency 19372.661290                       # average overall mshr miss latency
-system.cpu.icache.demand_hits               165966882                       # number of demand (read+write) hits
-system.cpu.icache.demand_miss_latency       153278500                       # number of demand (read+write) miss cycles
-system.cpu.icache.demand_miss_rate           0.000041                       # miss rate for demand accesses
-system.cpu.icache.demand_misses                  6740                       # number of demand (read+write) misses
-system.cpu.icache.demand_mshr_hits                540                       # number of demand (read+write) MSHR hits
-system.cpu.icache.demand_mshr_miss_latency    120110500                       # number of demand (read+write) MSHR miss cycles
-system.cpu.icache.demand_mshr_miss_rate      0.000037                       # mshr miss rate for demand accesses
-system.cpu.icache.demand_mshr_misses             6200                       # number of demand (read+write) MSHR misses
+system.cpu.icache.demand_accesses           190461812                       # number of demand (read+write) accesses
+system.cpu.icache.demand_avg_miss_latency  6527.954910                       # average overall miss latency
+system.cpu.icache.demand_avg_mshr_miss_latency  3419.281975                       # average overall mshr miss latency
+system.cpu.icache.demand_hits               190192396                       # number of demand (read+write) hits
+system.cpu.icache.demand_miss_latency      1758735500                       # number of demand (read+write) miss cycles
+system.cpu.icache.demand_miss_rate           0.001415                       # miss rate for demand accesses
+system.cpu.icache.demand_misses                269416                       # number of demand (read+write) misses
+system.cpu.icache.demand_mshr_hits               1570                       # number of demand (read+write) MSHR hits
+system.cpu.icache.demand_mshr_miss_latency    915841000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.icache.demand_mshr_miss_rate      0.001406                       # mshr miss rate for demand accesses
+system.cpu.icache.demand_mshr_misses           267846                       # number of demand (read+write) MSHR misses
 system.cpu.icache.fast_writes                       0                       # number of fast writes performed
 system.cpu.icache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.icache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.icache.occ_%::0                   0.436573                       # Average percentage of cache occupancy
-system.cpu.icache.occ_blocks::0            894.100654                       # Average occupied blocks per context
-system.cpu.icache.overall_accesses          165973622                       # number of overall (read+write) accesses
-system.cpu.icache.overall_avg_miss_latency 22741.617211                       # average overall miss latency
-system.cpu.icache.overall_avg_mshr_miss_latency 19372.661290                       # average overall mshr miss latency
+system.cpu.icache.occ_%::0                   0.466021                       # Average percentage of cache occupancy
+system.cpu.icache.occ_blocks::0            954.411836                       # Average occupied blocks per context
+system.cpu.icache.overall_accesses          190461812                       # number of overall (read+write) accesses
+system.cpu.icache.overall_avg_miss_latency  6527.954910                       # average overall miss latency
+system.cpu.icache.overall_avg_mshr_miss_latency  3419.281975                       # average overall mshr miss latency
 system.cpu.icache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.icache.overall_hits              165966882                       # number of overall hits
-system.cpu.icache.overall_miss_latency      153278500                       # number of overall miss cycles
-system.cpu.icache.overall_miss_rate          0.000041                       # miss rate for overall accesses
-system.cpu.icache.overall_misses                 6740                       # number of overall misses
-system.cpu.icache.overall_mshr_hits               540                       # number of overall MSHR hits
-system.cpu.icache.overall_mshr_miss_latency    120110500                       # number of overall MSHR miss cycles
-system.cpu.icache.overall_mshr_miss_rate     0.000037                       # mshr miss rate for overall accesses
-system.cpu.icache.overall_mshr_misses            6200                       # number of overall MSHR misses
+system.cpu.icache.overall_hits              190192396                       # number of overall hits
+system.cpu.icache.overall_miss_latency     1758735500                       # number of overall miss cycles
+system.cpu.icache.overall_miss_rate          0.001415                       # miss rate for overall accesses
+system.cpu.icache.overall_misses               269416                       # number of overall misses
+system.cpu.icache.overall_mshr_hits              1570                       # number of overall MSHR hits
+system.cpu.icache.overall_mshr_miss_latency    915841000                       # number of overall MSHR miss cycles
+system.cpu.icache.overall_mshr_miss_rate     0.001406                       # mshr miss rate for overall accesses
+system.cpu.icache.overall_mshr_misses          267846                       # number of overall MSHR misses
 system.cpu.icache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.icache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.cpu.icache.replacements                   1750                       # number of replacements
-system.cpu.icache.sampled_refs                   3333                       # Sample count of references to valid blocks.
+system.cpu.icache.replacements                   9298                       # number of replacements
+system.cpu.icache.sampled_refs                  10745                       # Sample count of references to valid blocks.
 system.cpu.icache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.icache.tagsinuse                894.100654                       # Cycle average of tags in use
-system.cpu.icache.total_refs                165966819                       # Total number of references to valid blocks.
+system.cpu.icache.tagsinuse                954.411836                       # Cycle average of tags in use
+system.cpu.icache.total_refs                190184700                       # Total number of references to valid blocks.
 system.cpu.icache.warmup_cycle                      0                       # Cycle when the warmup percentage was hit.
-system.cpu.icache.writebacks                        0                       # number of writebacks
-system.cpu.idleCycles                        10098709                       # Total number of cycles that the CPU has spent unscheduled due to idling
-system.cpu.iew.EXEC:branches                158001976                       # Number of branches executed
+system.cpu.icache.writebacks                        3                       # number of writebacks
+system.cpu.idleCycles                        24586339                       # Total number of cycles that the CPU has spent unscheduled due to idling
+system.cpu.iew.EXEC:branches                175611349                       # Number of branches executed
 system.cpu.iew.EXEC:nop                             0                       # number of nop insts executed
-system.cpu.iew.EXEC:rate                     1.044762                       # Inst execution rate
-system.cpu.iew.EXEC:refs                    586795750                       # number of memory reference insts executed
-system.cpu.iew.EXEC:stores                  160862585                       # Number of stores executed
+system.cpu.iew.EXEC:rate                     1.537639                       # Inst execution rate
+system.cpu.iew.EXEC:refs                    604612823                       # number of memory reference insts executed
+system.cpu.iew.EXEC:stores                  164362000                       # Number of stores executed
 system.cpu.iew.EXEC:swp                             0                       # number of swp insts executed
-system.cpu.iew.WB:consumers                2114014731                       # num instructions consuming a value
-system.cpu.iew.WB:count                    1694146367                       # cumulative count of insts written-back
-system.cpu.iew.WB:fanout                     0.583880                       # average fanout of values written-back
+system.cpu.iew.WB:consumers                2150205320                       # num instructions consuming a value
+system.cpu.iew.WB:count                    1865910107                       # cumulative count of insts written-back
+system.cpu.iew.WB:fanout                     0.666196                       # average fanout of values written-back
 system.cpu.iew.WB:penalized                         0                       # number of instrctions required to write to 'other' IQ
 system.cpu.iew.WB:penalized_rate                    0                       # fraction of instructions written-back that wrote to 'other' IQ
-system.cpu.iew.WB:producers                1234331323                       # num instructions producing a value
-system.cpu.iew.WB:rate                       1.036807                       # insts written-back per cycle
-system.cpu.iew.WB:sent                     1697627373                       # cumulative count of insts sent to commit
-system.cpu.iew.branchMispredicts             18573506                       # Number of branch mispredicts detected at execute
-system.cpu.iew.iewBlockCycles                 6103126                       # Number of cycles IEW is blocking
-system.cpu.iew.iewDispLoadInsts             508224738                       # Number of dispatched load instructions
-system.cpu.iew.iewDispNonSpecInsts                579                       # Number of dispatched non-speculative instructions
-system.cpu.iew.iewDispSquashedInsts          12080656                       # Number of squashed instructions skipped by dispatch
-system.cpu.iew.iewDispStoreInsts            194089353                       # Number of dispatched store instructions
-system.cpu.iew.iewDispatchedInsts          1988097398                       # Number of instructions dispatched to IQ
-system.cpu.iew.iewExecLoadInsts             425933165                       # Number of load instructions executed
-system.cpu.iew.iewExecSquashedInsts          26013466                       # Number of squashed instructions skipped in execute
-system.cpu.iew.iewExecutedInsts            1707144682                       # Number of executed instructions
-system.cpu.iew.iewIQFullEvents                 381189                       # Number of times the IQ has become full, causing a stall
+system.cpu.iew.WB:producers                1432458045                       # num instructions producing a value
+system.cpu.iew.WB:rate                       1.527049                       # insts written-back per cycle
+system.cpu.iew.WB:sent                     1872952311                       # cumulative count of insts sent to commit
+system.cpu.iew.branchMispredicts             18187438                       # Number of branch mispredicts detected at execute
+system.cpu.iew.iewBlockCycles                 9702727                       # Number of cycles IEW is blocking
+system.cpu.iew.iewDispLoadInsts             598780500                       # Number of dispatched load instructions
+system.cpu.iew.iewDispNonSpecInsts               6555                       # Number of dispatched non-speculative instructions
+system.cpu.iew.iewDispSquashedInsts           2427132                       # Number of squashed instructions skipped by dispatch
+system.cpu.iew.iewDispStoreInsts            227725972                       # Number of dispatched store instructions
+system.cpu.iew.iewDispatchedInsts          2368916953                       # Number of instructions dispatched to IQ
+system.cpu.iew.iewExecLoadInsts             440250823                       # Number of load instructions executed
+system.cpu.iew.iewExecSquashedInsts          24902522                       # Number of squashed instructions skipped in execute
+system.cpu.iew.iewExecutedInsts            1878850199                       # Number of executed instructions
+system.cpu.iew.iewIQFullEvents                 999062                       # Number of times the IQ has become full, causing a stall
 system.cpu.iew.iewIdleCycles                        0                       # Number of cycles IEW is idle
-system.cpu.iew.iewLSQFullEvents                 10588                       # Number of times the LSQ has become full, causing a stall
-system.cpu.iew.iewSquashCycles               71636028                       # Number of cycles IEW is squashing
-system.cpu.iew.iewUnblockCycles                847228                       # Number of cycles IEW is unblocking
+system.cpu.iew.iewLSQFullEvents                 48995                       # Number of times the LSQ has become full, causing a stall
+system.cpu.iew.iewSquashCycles              113949773                       # Number of cycles IEW is squashing
+system.cpu.iew.iewUnblockCycles               1501929                       # Number of cycles IEW is unblocking
 system.cpu.iew.lsq.thread.0.blockedLoads            0                       # Number of blocked loads due to partial load-store forwarding
 system.cpu.iew.lsq.thread.0.cacheBlocked            0                       # Number of times an access to memory failed due to the cache being blocked
-system.cpu.iew.lsq.thread.0.forwLoads        72909425                       # Number of loads that had data forwarded from stores
-system.cpu.iew.lsq.thread.0.ignoredResponses       277837                       # Number of memory responses ignored because the instruction is squashed
+system.cpu.iew.lsq.thread.0.forwLoads       119150872                       # Number of loads that had data forwarded from stores
+system.cpu.iew.lsq.thread.0.ignoredResponses       153037                       # Number of memory responses ignored because the instruction is squashed
 system.cpu.iew.lsq.thread.0.invAddrLoads            0                       # Number of loads ignored due to an invalid address
 system.cpu.iew.lsq.thread.0.invAddrSwpfs            0                       # Number of software prefetches ignored due to an invalid address
-system.cpu.iew.lsq.thread.0.memOrderViolation     11954619                       # Number of memory ordering violations
-system.cpu.iew.lsq.thread.0.rescheduledLoads          832                       # Number of loads that were rescheduled
-system.cpu.iew.lsq.thread.0.squashedLoads    124122578                       # Number of loads squashed
-system.cpu.iew.lsq.thread.0.squashedStores     44929168                       # Number of stores squashed
-system.cpu.iew.memOrderViolationEvents       11954619                       # Number of memory order violations
-system.cpu.iew.predictedNotTakenIncorrect       280770                       # Number of branches that were predicted not taken incorrectly
-system.cpu.iew.predictedTakenIncorrect       18292736                       # Number of branches that were predicted taken incorrectly
-system.cpu.int_regfile_reads               3876226209                       # number of integer regfile reads
-system.cpu.int_regfile_writes              1582892637                       # number of integer regfile writes
-system.cpu.ipc                               0.935731                       # IPC: Instructions Per Cycle
-system.cpu.ipc_total                         0.935731                       # IPC: Total IPC of All Threads
-system.cpu.iq.ISSUE:FU_type_0::No_OpClass      1927969      0.11%      0.11% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::IntAlu      1131725915     65.30%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::IntMult              0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::IntDiv               0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatAdd             0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatCmp             0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatCvt             0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatMult            0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatDiv             0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatSqrt            0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdAdd              0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdAddAcc            0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdAlu              0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdCmp              0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdCvt              0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdMisc             0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdMult             0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdMultAcc            0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdShift            0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdShiftAcc            0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdSqrt             0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatAdd            0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatAlu            0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatCmp            0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatCvt            0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatDiv            0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatMisc            0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatMult            0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatMultAcc            0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatSqrt            0      0.00%     65.41% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::MemRead      435582288     25.13%     90.54% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::MemWrite     163921976      9.46%    100.00% # Type of FU issued
+system.cpu.iew.lsq.thread.0.memOrderViolation      1905759                       # Number of memory ordering violations
+system.cpu.iew.lsq.thread.0.rescheduledLoads         1230                       # Number of loads that were rescheduled
+system.cpu.iew.lsq.thread.0.squashedLoads    215056005                       # Number of loads squashed
+system.cpu.iew.lsq.thread.0.squashedStores     78660287                       # Number of stores squashed
+system.cpu.iew.memOrderViolationEvents        1905759                       # Number of memory order violations
+system.cpu.iew.predictedNotTakenIncorrect      2718790                       # Number of branches that were predicted not taken incorrectly
+system.cpu.iew.predictedTakenIncorrect       15468648                       # Number of branches that were predicted taken incorrectly
+system.cpu.int_regfile_reads               3097184079                       # number of integer regfile reads
+system.cpu.int_regfile_writes              1741804464                       # number of integer regfile writes
+system.cpu.ipc                               1.250077                       # IPC: Instructions Per Cycle
+system.cpu.ipc_total                         1.250077                       # IPC: Total IPC of All Threads
+system.cpu.iq.ISSUE:FU_type_0::No_OpClass      2283854      0.12%      0.12% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::IntAlu      1286143659     67.56%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::IntMult              0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::IntDiv               0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatAdd             0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatCmp             0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatCvt             0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatMult            0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatDiv             0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatSqrt            0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdAdd              0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdAddAcc            0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdAlu              0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdCmp              0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdCvt              0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdMisc             0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdMult             0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdMultAcc            0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdShift            0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdShiftAcc            0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdSqrt             0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatAdd            0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatAlu            0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatCmp            0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatCvt            0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatDiv            0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatMisc            0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatMult            0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatMultAcc            0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatSqrt            0      0.00%     67.68% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::MemRead      446588315     23.46%     91.14% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::MemWrite     168736893      8.86%    100.00% # Type of FU issued
 system.cpu.iq.ISSUE:FU_type_0::IprAccess            0      0.00%    100.00% # Type of FU issued
 system.cpu.iq.ISSUE:FU_type_0::InstPrefetch            0      0.00%    100.00% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::total       1733158148                       # Type of FU issued
-system.cpu.iq.ISSUE:fu_busy_cnt               1029171                       # FU busy when requested
-system.cpu.iq.ISSUE:fu_busy_rate             0.000594                       # FU busy rate (busy events/executed inst)
+system.cpu.iq.ISSUE:FU_type_0::total       1903752721                       # Type of FU issued
+system.cpu.iq.ISSUE:fu_busy_cnt              12019370                       # FU busy when requested
+system.cpu.iq.ISSUE:fu_busy_rate             0.006314                       # FU busy rate (busy events/executed inst)
 system.cpu.iq.ISSUE:fu_full::No_OpClass             0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::IntAlu               182      0.02%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::IntMult                0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::IntDiv                 0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatAdd               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatCmp               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatCvt               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatMult              0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatDiv               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatSqrt              0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdAdd                0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdAddAcc             0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdAlu                0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdCmp                0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdCvt                0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdMisc               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdMult               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdMultAcc            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdShift              0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdShiftAcc            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdSqrt               0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatAdd            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatAlu            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatCmp            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatCvt            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatDiv            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatMisc            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatMult            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatMultAcc            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatSqrt            0      0.00%      0.02% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::MemRead           466697     45.35%     45.36% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::MemWrite          562292     54.64%    100.00% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::IntAlu           1063366      8.85%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::IntMult                0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::IntDiv                 0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatAdd               0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatCmp               0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatCvt               0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatMult              0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatDiv               0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatSqrt              0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdAdd                0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdAddAcc             0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdAlu                0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdCmp                0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdCvt                0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdMisc               0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdMult               0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdMultAcc            0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdShift              0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdShiftAcc            0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdSqrt               0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatAdd            0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatAlu            0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatCmp            0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatCvt            0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatDiv            0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatMisc            0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatMult            0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatMultAcc            0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatSqrt            0      0.00%      8.85% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::MemRead          7508013     62.47%     71.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::MemWrite         3447991     28.69%    100.00% # attempts to use FU when none available
 system.cpu.iq.ISSUE:fu_full::IprAccess              0      0.00%    100.00% # attempts to use FU when none available
 system.cpu.iq.ISSUE:fu_full::InstPrefetch            0      0.00%    100.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:issued_per_cycle::samples   1623905370                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::mean     1.067278                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::stdev     1.066518                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::samples   1197319646                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::mean     1.590012                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::stdev     1.576110                       # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::underflows            0      0.00%      0.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::0     608633589     37.48%     37.48% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::1     503635145     31.01%     68.49% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::2     353739534     21.78%     90.28% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::3     117719188      7.25%     97.53% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::4      32883027      2.02%     99.55% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::5       6737765      0.41%     99.97% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::6        234496      0.01%     99.98% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::7        322546      0.02%    100.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::8            80      0.00%    100.00% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::0     380569061     31.79%     31.79% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::1     297509781     24.85%     56.63% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::2     210374930     17.57%     74.20% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::3     147240855     12.30%     86.50% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::4      95168176      7.95%     94.45% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::5      42314918      3.53%     97.98% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::6      17818883      1.49%     99.47% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::7       5974413      0.50%     99.97% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::8        348629      0.03%    100.00% # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::overflows            0      0.00%    100.00% # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::min_value            0                       # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::max_value            8                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::total   1623905370                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:rate                     1.060682                       # Inst issue rate
-system.cpu.iq.fp_alu_accesses                      24                       # Number of floating point alu accesses
-system.cpu.iq.fp_inst_queue_reads                  48                       # Number of floating instruction queue reads
-system.cpu.iq.fp_inst_queue_wakeup_accesses           10                       # Number of floating instruction queue wakeup accesses
-system.cpu.iq.fp_inst_queue_writes                 68                       # Number of floating instruction queue writes
-system.cpu.iq.int_alu_accesses             1732259326                       # Number of integer alu accesses
-system.cpu.iq.int_inst_queue_reads         5091250901                       # Number of integer instruction queue reads
-system.cpu.iq.int_inst_queue_wakeup_accesses   1694146357                       # Number of integer instruction queue wakeup accesses
-system.cpu.iq.int_inst_queue_writes        2453039449                       # Number of integer instruction queue writes
-system.cpu.iq.iqInstsAdded                 1988096819                       # Number of instructions added to the IQ (excludes non-spec)
-system.cpu.iq.iqInstsIssued                1733158148                       # Number of instructions issued
-system.cpu.iq.iqNonSpecInstsAdded                 579                       # Number of non-speculative instructions added to the IQ
-system.cpu.iq.iqSquashedInstsExamined       452995728                       # Number of squashed instructions iterated over during squash; mainly for profiling
-system.cpu.iq.iqSquashedInstsIssued               112                       # Number of squashed instructions issued
-system.cpu.iq.iqSquashedNonSpecRemoved             26                       # Number of squashed non-spec instructions that were removed
-system.cpu.iq.iqSquashedOperandsExamined   1010995901                       # Number of squashed operands that are examined and possibly removed from graph
-system.cpu.l2cache.ReadExReq_accesses          789062                       # number of ReadExReq accesses(hits+misses)
-system.cpu.l2cache.ReadExReq_avg_miss_latency 34275.179377                       # average ReadExReq miss latency
-system.cpu.l2cache.ReadExReq_avg_mshr_miss_latency 31001.682665                       # average ReadExReq mshr miss latency
-system.cpu.l2cache.ReadExReq_hits              541538                       # number of ReadExReq hits
-system.cpu.l2cache.ReadExReq_miss_latency   8483929500                       # number of ReadExReq miss cycles
-system.cpu.l2cache.ReadExReq_miss_rate       0.313694                       # miss rate for ReadExReq accesses
-system.cpu.l2cache.ReadExReq_misses            247524                       # number of ReadExReq misses
-system.cpu.l2cache.ReadExReq_mshr_miss_latency   7673660500                       # number of ReadExReq MSHR miss cycles
-system.cpu.l2cache.ReadExReq_mshr_miss_rate     0.313694                       # mshr miss rate for ReadExReq accesses
-system.cpu.l2cache.ReadExReq_mshr_misses       247524                       # number of ReadExReq MSHR misses
-system.cpu.l2cache.ReadReq_accesses           1734408                       # number of ReadReq accesses(hits+misses)
-system.cpu.l2cache.ReadReq_avg_miss_latency 34153.383782                       # average ReadReq miss latency
-system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 31001.108327                       # average ReadReq mshr miss latency
-system.cpu.l2cache.ReadReq_hits               1401925                       # number of ReadReq hits
-system.cpu.l2cache.ReadReq_miss_latency   11355419500                       # number of ReadReq miss cycles
-system.cpu.l2cache.ReadReq_miss_rate         0.191698                       # miss rate for ReadReq accesses
-system.cpu.l2cache.ReadReq_misses              332483                       # number of ReadReq misses
-system.cpu.l2cache.ReadReq_mshr_miss_latency  10307341500                       # number of ReadReq MSHR miss cycles
-system.cpu.l2cache.ReadReq_mshr_miss_rate     0.191698                       # mshr miss rate for ReadReq accesses
-system.cpu.l2cache.ReadReq_mshr_misses         332483                       # number of ReadReq MSHR misses
-system.cpu.l2cache.UpgradeReq_accesses           2863                       # number of UpgradeReq accesses(hits+misses)
-system.cpu.l2cache.UpgradeReq_avg_miss_latency    24.346581                       # average UpgradeReq miss latency
-system.cpu.l2cache.UpgradeReq_avg_mshr_miss_latency 31002.148228                       # average UpgradeReq mshr miss latency
-system.cpu.l2cache.UpgradeReq_hits                 70                       # number of UpgradeReq hits
-system.cpu.l2cache.UpgradeReq_miss_latency        68000                       # number of UpgradeReq miss cycles
-system.cpu.l2cache.UpgradeReq_miss_rate      0.975550                       # miss rate for UpgradeReq accesses
-system.cpu.l2cache.UpgradeReq_misses             2793                       # number of UpgradeReq misses
-system.cpu.l2cache.UpgradeReq_mshr_miss_latency     86589000                       # number of UpgradeReq MSHR miss cycles
-system.cpu.l2cache.UpgradeReq_mshr_miss_rate     0.975550                       # mshr miss rate for UpgradeReq accesses
-system.cpu.l2cache.UpgradeReq_mshr_misses         2793                       # number of UpgradeReq MSHR misses
-system.cpu.l2cache.Writeback_accesses         2224034                       # number of Writeback accesses(hits+misses)
-system.cpu.l2cache.Writeback_hits             2224034                       # number of Writeback hits
+system.cpu.iq.ISSUE:issued_per_cycle::total   1197319646                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:rate                     1.558019                       # Inst issue rate
+system.cpu.iq.fp_alu_accesses                      59                       # Number of floating point alu accesses
+system.cpu.iq.fp_inst_queue_reads                 119                       # Number of floating instruction queue reads
+system.cpu.iq.fp_inst_queue_wakeup_accesses           31                       # Number of floating instruction queue wakeup accesses
+system.cpu.iq.fp_inst_queue_writes               7970                       # Number of floating instruction queue writes
+system.cpu.iq.int_alu_accesses             1913488178                       # Number of integer alu accesses
+system.cpu.iq.int_inst_queue_reads         5017400189                       # Number of integer instruction queue reads
+system.cpu.iq.int_inst_queue_wakeup_accesses   1865910076                       # Number of integer instruction queue wakeup accesses
+system.cpu.iq.int_inst_queue_writes        3209512631                       # Number of integer instruction queue writes
+system.cpu.iq.iqInstsAdded                 2368910398                       # Number of instructions added to the IQ (excludes non-spec)
+system.cpu.iq.iqInstsIssued                1903752721                       # Number of instructions issued
+system.cpu.iq.iqNonSpecInstsAdded                6555                       # Number of non-speculative instructions added to the IQ
+system.cpu.iq.iqSquashedInstsExamined       838752495                       # Number of squashed instructions iterated over during squash; mainly for profiling
+system.cpu.iq.iqSquashedInstsIssued            555850                       # Number of squashed instructions issued
+system.cpu.iq.iqSquashedNonSpecRemoved           6002                       # Number of squashed non-spec instructions that were removed
+system.cpu.iq.iqSquashedOperandsExamined   1472792375                       # Number of squashed operands that are examined and possibly removed from graph
+system.cpu.l2cache.ReadExReq_accesses          786848                       # number of ReadExReq accesses(hits+misses)
+system.cpu.l2cache.ReadExReq_avg_miss_latency 34255.494728                       # average ReadExReq miss latency
+system.cpu.l2cache.ReadExReq_avg_mshr_miss_latency 31001.453653                       # average ReadExReq mshr miss latency
+system.cpu.l2cache.ReadExReq_hits              539884                       # number of ReadExReq hits
+system.cpu.l2cache.ReadExReq_miss_latency   8459874000                       # number of ReadExReq miss cycles
+system.cpu.l2cache.ReadExReq_miss_rate       0.313865                       # miss rate for ReadExReq accesses
+system.cpu.l2cache.ReadExReq_misses            246964                       # number of ReadExReq misses
+system.cpu.l2cache.ReadExReq_mshr_miss_latency   7656243000                       # number of ReadExReq MSHR miss cycles
+system.cpu.l2cache.ReadExReq_mshr_miss_rate     0.313865                       # mshr miss rate for ReadExReq accesses
+system.cpu.l2cache.ReadExReq_mshr_misses       246964                       # number of ReadExReq MSHR misses
+system.cpu.l2cache.ReadReq_accesses           1732679                       # number of ReadReq accesses(hits+misses)
+system.cpu.l2cache.ReadReq_avg_miss_latency 34171.480760                       # average ReadReq miss latency
+system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 31002.917505                       # average ReadReq mshr miss latency
+system.cpu.l2cache.ReadReq_hits               1415970                       # number of ReadReq hits
+system.cpu.l2cache.ReadReq_miss_latency   10822415500                       # number of ReadReq miss cycles
+system.cpu.l2cache.ReadReq_miss_rate         0.182786                       # miss rate for ReadReq accesses
+system.cpu.l2cache.ReadReq_misses              316709                       # number of ReadReq misses
+system.cpu.l2cache.ReadReq_mshr_miss_latency   9818903000                       # number of ReadReq MSHR miss cycles
+system.cpu.l2cache.ReadReq_mshr_miss_rate     0.182786                       # mshr miss rate for ReadReq accesses
+system.cpu.l2cache.ReadReq_mshr_misses         316709                       # number of ReadReq MSHR misses
+system.cpu.l2cache.UpgradeReq_accesses         256943                       # number of UpgradeReq accesses(hits+misses)
+system.cpu.l2cache.UpgradeReq_avg_miss_latency    40.077896                       # average UpgradeReq miss latency
+system.cpu.l2cache.UpgradeReq_avg_mshr_miss_latency 31003.030576                       # average UpgradeReq mshr miss latency
+system.cpu.l2cache.UpgradeReq_hits               1216                       # number of UpgradeReq hits
+system.cpu.l2cache.UpgradeReq_miss_latency     10249000                       # number of UpgradeReq miss cycles
+system.cpu.l2cache.UpgradeReq_miss_rate      0.995267                       # miss rate for UpgradeReq accesses
+system.cpu.l2cache.UpgradeReq_misses           255727                       # number of UpgradeReq misses
+system.cpu.l2cache.UpgradeReq_mshr_miss_latency   7928312000                       # number of UpgradeReq MSHR miss cycles
+system.cpu.l2cache.UpgradeReq_mshr_miss_rate     0.995267                       # mshr miss rate for UpgradeReq accesses
+system.cpu.l2cache.UpgradeReq_mshr_misses       255727                       # number of UpgradeReq MSHR misses
+system.cpu.l2cache.Writeback_accesses         2229754                       # number of Writeback accesses(hits+misses)
+system.cpu.l2cache.Writeback_hits             2229754                       # number of Writeback hits
 system.cpu.l2cache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.l2cache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.l2cache.avg_refs                  5.356881                       # Average number of references to valid blocks.
+system.cpu.l2cache.avg_refs                  5.404070                       # Average number of references to valid blocks.
 system.cpu.l2cache.blocked::no_mshrs                0                       # number of cycles access was blocked
 system.cpu.l2cache.blocked::no_targets              0                       # number of cycles access was blocked
 system.cpu.l2cache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.l2cache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.l2cache.cache_copies                     0                       # number of cache copies performed
-system.cpu.l2cache.demand_accesses            2523470                       # number of demand (read+write) accesses
-system.cpu.l2cache.demand_avg_miss_latency 34205.361315                       # average overall miss latency
-system.cpu.l2cache.demand_avg_mshr_miss_latency 31001.353432                       # average overall mshr miss latency
-system.cpu.l2cache.demand_hits                1943463                       # number of demand (read+write) hits
-system.cpu.l2cache.demand_miss_latency    19839349000                       # number of demand (read+write) miss cycles
-system.cpu.l2cache.demand_miss_rate          0.229845                       # miss rate for demand accesses
-system.cpu.l2cache.demand_misses               580007                       # number of demand (read+write) misses
+system.cpu.l2cache.demand_accesses            2519527                       # number of demand (read+write) accesses
+system.cpu.l2cache.demand_avg_miss_latency 34208.290090                       # average overall miss latency
+system.cpu.l2cache.demand_avg_mshr_miss_latency 31002.276142                       # average overall mshr miss latency
+system.cpu.l2cache.demand_hits                1955854                       # number of demand (read+write) hits
+system.cpu.l2cache.demand_miss_latency    19282289500                       # number of demand (read+write) miss cycles
+system.cpu.l2cache.demand_miss_rate          0.223722                       # miss rate for demand accesses
+system.cpu.l2cache.demand_misses               563673                       # number of demand (read+write) misses
 system.cpu.l2cache.demand_mshr_hits                 0                       # number of demand (read+write) MSHR hits
-system.cpu.l2cache.demand_mshr_miss_latency  17981002000                       # number of demand (read+write) MSHR miss cycles
-system.cpu.l2cache.demand_mshr_miss_rate     0.229845                       # mshr miss rate for demand accesses
-system.cpu.l2cache.demand_mshr_misses          580007                       # number of demand (read+write) MSHR misses
+system.cpu.l2cache.demand_mshr_miss_latency  17475146000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.l2cache.demand_mshr_miss_rate     0.223722                       # mshr miss rate for demand accesses
+system.cpu.l2cache.demand_mshr_misses          563673                       # number of demand (read+write) MSHR misses
 system.cpu.l2cache.fast_writes                      0                       # number of fast writes performed
 system.cpu.l2cache.mshr_cap_events                  0                       # number of times MSHR cap was activated
 system.cpu.l2cache.no_allocate_misses               0                       # Number of misses that were no-allocate
-system.cpu.l2cache.occ_%::0                  0.233067                       # Average percentage of cache occupancy
-system.cpu.l2cache.occ_%::1                  0.421257                       # Average percentage of cache occupancy
-system.cpu.l2cache.occ_blocks::0          7637.149597                       # Average occupied blocks per context
-system.cpu.l2cache.occ_blocks::1         13803.753842                       # Average occupied blocks per context
-system.cpu.l2cache.overall_accesses           2523470                       # number of overall (read+write) accesses
-system.cpu.l2cache.overall_avg_miss_latency 34205.361315                       # average overall miss latency
-system.cpu.l2cache.overall_avg_mshr_miss_latency 31001.353432                       # average overall mshr miss latency
+system.cpu.l2cache.occ_%::0                  0.213694                       # Average percentage of cache occupancy
+system.cpu.l2cache.occ_%::1                  0.433705                       # Average percentage of cache occupancy
+system.cpu.l2cache.occ_blocks::0          7002.339473                       # Average occupied blocks per context
+system.cpu.l2cache.occ_blocks::1         14211.631717                       # Average occupied blocks per context
+system.cpu.l2cache.overall_accesses           2519527                       # number of overall (read+write) accesses
+system.cpu.l2cache.overall_avg_miss_latency 34208.290090                       # average overall miss latency
+system.cpu.l2cache.overall_avg_mshr_miss_latency 31002.276142                       # average overall mshr miss latency
 system.cpu.l2cache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.l2cache.overall_hits               1943463                       # number of overall hits
-system.cpu.l2cache.overall_miss_latency   19839349000                       # number of overall miss cycles
-system.cpu.l2cache.overall_miss_rate         0.229845                       # miss rate for overall accesses
-system.cpu.l2cache.overall_misses              580007                       # number of overall misses
+system.cpu.l2cache.overall_hits               1955854                       # number of overall hits
+system.cpu.l2cache.overall_miss_latency   19282289500                       # number of overall miss cycles
+system.cpu.l2cache.overall_miss_rate         0.223722                       # miss rate for overall accesses
+system.cpu.l2cache.overall_misses              563673                       # number of overall misses
 system.cpu.l2cache.overall_mshr_hits                0                       # number of overall MSHR hits
-system.cpu.l2cache.overall_mshr_miss_latency  17981002000                       # number of overall MSHR miss cycles
-system.cpu.l2cache.overall_mshr_miss_rate     0.229845                       # mshr miss rate for overall accesses
-system.cpu.l2cache.overall_mshr_misses         580007                       # number of overall MSHR misses
+system.cpu.l2cache.overall_mshr_miss_latency  17475146000                       # number of overall MSHR miss cycles
+system.cpu.l2cache.overall_mshr_miss_rate     0.223722                       # mshr miss rate for overall accesses
+system.cpu.l2cache.overall_mshr_misses         563673                       # number of overall MSHR misses
 system.cpu.l2cache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.l2cache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.cpu.l2cache.replacements                569254                       # number of replacements
-system.cpu.l2cache.sampled_refs                588327                       # Sample count of references to valid blocks.
+system.cpu.l2cache.replacements                553099                       # number of replacements
+system.cpu.l2cache.sampled_refs                571950                       # Sample count of references to valid blocks.
 system.cpu.l2cache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.l2cache.tagsinuse             21440.903439                       # Cycle average of tags in use
-system.cpu.l2cache.total_refs                 3151598                       # Total number of references to valid blocks.
-system.cpu.l2cache.warmup_cycle          469235659000                       # Cycle when the warmup percentage was hit.
-system.cpu.l2cache.writebacks                  411363                       # number of writebacks
-system.cpu.memDep0.conflictingLoads         151128770                       # Number of conflicting loads.
-system.cpu.memDep0.conflictingStores         47539398                       # Number of conflicting stores.
-system.cpu.memDep0.insertedLoads            508224738                       # Number of loads inserted to the mem dependence unit.
-system.cpu.memDep0.insertedStores           194089353                       # Number of stores inserted to the mem dependence unit.
-system.cpu.misc_regfile_reads               947795380                       # number of misc regfile reads
-system.cpu.numCycles                       1634004079                       # number of cpu cycles simulated
+system.cpu.l2cache.tagsinuse             21213.971190                       # Cycle average of tags in use
+system.cpu.l2cache.total_refs                 3090858                       # Total number of references to valid blocks.
+system.cpu.l2cache.warmup_cycle          329890014000                       # Cycle when the warmup percentage was hit.
+system.cpu.l2cache.writebacks                  404346                       # number of writebacks
+system.cpu.memDep0.conflictingLoads         432040536                       # Number of conflicting loads.
+system.cpu.memDep0.conflictingStores        167867809                       # Number of conflicting stores.
+system.cpu.memDep0.insertedLoads            598780500                       # Number of loads inserted to the mem dependence unit.
+system.cpu.memDep0.insertedStores           227724252                       # Number of stores inserted to the mem dependence unit.
+system.cpu.misc_regfile_reads              1024928879                       # number of misc regfile reads
+system.cpu.numCycles                       1221905985                       # number of cpu cycles simulated
 system.cpu.numWorkItemsCompleted                    0                       # number of work items this cpu completed
 system.cpu.numWorkItemsStarted                      0                       # number of work items this cpu started
-system.cpu.rename.RENAME:BlockCycles         11181498                       # Number of cycles rename is blocking
-system.cpu.rename.RENAME:CommittedMaps     1427299027                       # Number of HB maps that are committed
-system.cpu.rename.RENAME:IQFullEvents         8162354                       # Number of times rename has blocked due to IQ full
-system.cpu.rename.RENAME:IdleCycles         430755417                       # Number of cycles rename is idle
-system.cpu.rename.RENAME:LSQFullEvents        1988994                       # Number of times rename has blocked due to LSQ full
-system.cpu.rename.RENAME:ROBFullEvents             37                       # Number of times rename has blocked due to ROB full
-system.cpu.rename.RENAME:RenameLookups     6064799926                       # Number of register rename lookups that rename has made
-system.cpu.rename.RENAME:RenamedInsts      2072679155                       # Number of instructions processed by rename
-system.cpu.rename.RENAME:RenamedOperands   1965930252                       # Number of destination operands rename has renamed
-system.cpu.rename.RENAME:RunCycles         1095363349                       # Number of cycles rename is running
-system.cpu.rename.RENAME:SquashCycles        71636028                       # Number of cycles rename is squashing
-system.cpu.rename.RENAME:UnblockCycles       14962968                       # Number of cycles rename is unblocking
-system.cpu.rename.RENAME:UndoneMaps         538631225                       # Number of HB maps that are undone due to squashing
-system.cpu.rename.RENAME:fp_rename_lookups          168                       # Number of floating rename lookups
-system.cpu.rename.RENAME:int_rename_lookups   6064799758                       # Number of integer rename lookups
-system.cpu.rename.RENAME:serializeStallCycles         6110                       # count of cycles rename stalled for serializing inst
-system.cpu.rename.RENAME:serializingInsts          566                       # count of serializing insts renamed
-system.cpu.rename.RENAME:skidInsts           21122292                       # count of insts added to the skid buffer
-system.cpu.rename.RENAME:tempSerializingInsts          563                       # count of temporary serializing insts renamed
-system.cpu.rob.rob_reads                   3532180532                       # The number of ROB reads
-system.cpu.rob.rob_writes                  4048956705                       # The number of ROB writes
-system.cpu.timesIdled                          351337                       # Number of times that the entire CPU went into an idle state and unscheduled itself
+system.cpu.rename.RENAME:BlockCycles         64472267                       # Number of cycles rename is blocking
+system.cpu.rename.RENAME:CommittedMaps     1425688721                       # Number of HB maps that are committed
+system.cpu.rename.RENAME:IQFullEvents        52544368                       # Number of times rename has blocked due to IQ full
+system.cpu.rename.RENAME:IdleCycles         479786184                       # Number of cycles rename is idle
+system.cpu.rename.RENAME:LSQFullEvents       82632603                       # Number of times rename has blocked due to LSQ full
+system.cpu.rename.RENAME:ROBFullEvents           8428                       # Number of times rename has blocked due to ROB full
+system.cpu.rename.RENAME:RenameLookups     5772028874                       # Number of register rename lookups that rename has made
+system.cpu.rename.RENAME:RenamedInsts      2456264739                       # Number of instructions processed by rename
+system.cpu.rename.RENAME:RenamedOperands   2290118455                       # Number of destination operands rename has renamed
+system.cpu.rename.RENAME:RunCycles          385614091                       # Number of cycles rename is running
+system.cpu.rename.RENAME:SquashCycles       113949773                       # Number of cycles rename is squashing
+system.cpu.rename.RENAME:UnblockCycles      153477395                       # Number of cycles rename is unblocking
+system.cpu.rename.RENAME:UndoneMaps         864429734                       # Number of HB maps that are undone due to squashing
+system.cpu.rename.RENAME:fp_rename_lookups        19762                       # Number of floating rename lookups
+system.cpu.rename.RENAME:int_rename_lookups   5772009112                       # Number of integer rename lookups
+system.cpu.rename.RENAME:serializeStallCycles        19936                       # count of cycles rename stalled for serializing inst
+system.cpu.rename.RENAME:serializingInsts         2550                       # count of serializing insts renamed
+system.cpu.rename.RENAME:skidInsts          360051799                       # count of insts added to the skid buffer
+system.cpu.rename.RENAME:tempSerializingInsts         2561                       # count of temporary serializing insts renamed
+system.cpu.rob.rob_reads                   3418371032                       # The number of ROB reads
+system.cpu.rob.rob_writes                  4851844016                       # The number of ROB writes
+system.cpu.timesIdled                          625791                       # Number of times that the entire CPU went into an idle state and unscheduled itself
 system.cpu.workload.PROG:num_syscalls             551                       # Number of system calls
 
 ---------- End Simulation Statistics   ----------
diff --git a/tests/long/20.parser/ref/x86/linux/simple-atomic/config.ini b/tests/long/20.parser/ref/x86/linux/simple-atomic/config.ini
index fdc891c593..adfcd9b98f 100644
--- a/tests/long/20.parser/ref/x86/linux/simple-atomic/config.ini
+++ b/tests/long/20.parser/ref/x86/linux/simple-atomic/config.ini
@@ -61,7 +61,7 @@ type=ExeTracer
 [system.cpu.workload]
 type=LiveProcess
 cmd=parser 2.1.dict -batch
-cwd=build/X86_SE/tests/fast/long/20.parser/x86/linux/simple-atomic
+cwd=build/X86_SE/tests/opt/long/20.parser/x86/linux/simple-atomic
 egid=100
 env=
 errout=cerr
diff --git a/tests/long/20.parser/ref/x86/linux/simple-atomic/simout b/tests/long/20.parser/ref/x86/linux/simple-atomic/simout
index 70ab31a108..e27ac87ea7 100755
--- a/tests/long/20.parser/ref/x86/linux/simple-atomic/simout
+++ b/tests/long/20.parser/ref/x86/linux/simple-atomic/simout
@@ -5,11 +5,11 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:32:12
+M5 compiled Feb 11 2011 23:35:10
+M5 revision c3deaa585dd3 7949 default qtip resforflagsstats.patch tip
+M5 started Feb 11 2011 23:35:13
 M5 executing on burrito
-command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/long/20.parser/x86/linux/simple-atomic -re tests/run.py build/X86_SE/tests/fast/long/20.parser/x86/linux/simple-atomic
+command line: build/X86_SE/m5.opt -d build/X86_SE/tests/opt/long/20.parser/x86/linux/simple-atomic -re tests/run.py build/X86_SE/tests/opt/long/20.parser/x86/linux/simple-atomic
 Global frequency set at 1000000000000 ticks per second
 info: Entering event queue @ 0.  Starting simulation...
 
diff --git a/tests/long/20.parser/ref/x86/linux/simple-atomic/stats.txt b/tests/long/20.parser/ref/x86/linux/simple-atomic/stats.txt
index 836ed15197..afe5ef2355 100644
--- a/tests/long/20.parser/ref/x86/linux/simple-atomic/stats.txt
+++ b/tests/long/20.parser/ref/x86/linux/simple-atomic/stats.txt
@@ -1,9 +1,9 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                 904614                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 227300                       # Number of bytes of host memory used
-host_seconds                                  1690.21                       # Real time elapsed on the host
-host_tick_rate                              523739013                       # Simulator tick rate (ticks/s)
+host_inst_rate                                1866600                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 231212                       # Number of bytes of host memory used
+host_seconds                                   819.13                       # Real time elapsed on the host
+host_tick_rate                             1080693863                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                  1528988757                       # Number of instructions simulated
 sim_seconds                                  0.885229                       # Number of seconds simulated
@@ -24,7 +24,7 @@ system.cpu.num_idle_cycles                          0                       # Nu
 system.cpu.num_insts                       1528988757                       # Number of instructions executed
 system.cpu.num_int_alu_accesses            1528317615                       # Number of integer alu accesses
 system.cpu.num_int_insts                   1528317615                       # number of integer instructions
-system.cpu.num_int_register_reads          4418676175                       # number of times the integer registers were read
+system.cpu.num_int_register_reads          3581460239                       # number of times the integer registers were read
 system.cpu.num_int_register_writes         1427299027                       # number of times the integer registers were written
 system.cpu.num_load_insts                   384102160                       # Number of load instructions
 system.cpu.num_mem_refs                     533262345                       # number of memory refs
diff --git a/tests/long/20.parser/ref/x86/linux/simple-timing/config.ini b/tests/long/20.parser/ref/x86/linux/simple-timing/config.ini
index 4c1fe374d9..00b5b00f6c 100644
--- a/tests/long/20.parser/ref/x86/linux/simple-timing/config.ini
+++ b/tests/long/20.parser/ref/x86/linux/simple-timing/config.ini
@@ -161,7 +161,7 @@ type=ExeTracer
 [system.cpu.workload]
 type=LiveProcess
 cmd=parser 2.1.dict -batch
-cwd=build/X86_SE/tests/fast/long/20.parser/x86/linux/simple-timing
+cwd=build/X86_SE/tests/opt/long/20.parser/x86/linux/simple-timing
 egid=100
 env=
 errout=cerr
diff --git a/tests/long/20.parser/ref/x86/linux/simple-timing/simout b/tests/long/20.parser/ref/x86/linux/simple-timing/simout
index 9e491e5009..1e739aa16c 100755
--- a/tests/long/20.parser/ref/x86/linux/simple-timing/simout
+++ b/tests/long/20.parser/ref/x86/linux/simple-timing/simout
@@ -5,11 +5,11 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:36:47
+M5 compiled Feb 11 2011 23:35:10
+M5 revision c3deaa585dd3 7949 default qtip resforflagsstats.patch tip
+M5 started Feb 11 2011 23:35:13
 M5 executing on burrito
-command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/long/20.parser/x86/linux/simple-timing -re tests/run.py build/X86_SE/tests/fast/long/20.parser/x86/linux/simple-timing
+command line: build/X86_SE/m5.opt -d build/X86_SE/tests/opt/long/20.parser/x86/linux/simple-timing -re tests/run.py build/X86_SE/tests/opt/long/20.parser/x86/linux/simple-timing
 Global frequency set at 1000000000000 ticks per second
 info: Entering event queue @ 0.  Starting simulation...
 
diff --git a/tests/long/20.parser/ref/x86/linux/simple-timing/stats.txt b/tests/long/20.parser/ref/x86/linux/simple-timing/stats.txt
index 2cd3235734..dbe8c165b9 100644
--- a/tests/long/20.parser/ref/x86/linux/simple-timing/stats.txt
+++ b/tests/long/20.parser/ref/x86/linux/simple-timing/stats.txt
@@ -1,9 +1,9 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                 738382                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 235020                       # Number of bytes of host memory used
-host_seconds                                  2070.73                       # Real time elapsed on the host
-host_tick_rate                              801036637                       # Simulator tick rate (ticks/s)
+host_inst_rate                                1188316                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 238940                       # Number of bytes of host memory used
+host_seconds                                  1286.69                       # Real time elapsed on the host
+host_tick_rate                             1289149200                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                  1528988757                       # Number of instructions simulated
 sim_seconds                                  1.658730                       # Number of seconds simulated
@@ -213,7 +213,7 @@ system.cpu.num_idle_cycles                          0                       # Nu
 system.cpu.num_insts                       1528988757                       # Number of instructions executed
 system.cpu.num_int_alu_accesses            1528317615                       # Number of integer alu accesses
 system.cpu.num_int_insts                   1528317615                       # number of integer instructions
-system.cpu.num_int_register_reads          4418676175                       # number of times the integer registers were read
+system.cpu.num_int_register_reads          3581460239                       # number of times the integer registers were read
 system.cpu.num_int_register_writes         1427299027                       # number of times the integer registers were written
 system.cpu.num_load_insts                   384102160                       # Number of load instructions
 system.cpu.num_mem_refs                     533262345                       # number of memory refs
diff --git a/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/simout b/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/simout
index 1ec8b66f17..55fcb321af 100755
--- a/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/simout
+++ b/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/simout
@@ -5,12 +5,12 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 01:47:18
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 01:47:38
-M5 executing on burrito
+M5 compiled Feb 18 2011 15:40:30
+M5 revision Unknown
+M5 started Feb 18 2011 18:53:22
+M5 executing on m55-001.pool
 command line: build/ALPHA_SE/m5.fast -d build/ALPHA_SE/tests/fast/long/50.vortex/alpha/tru64/inorder-timing -re tests/run.py build/ALPHA_SE/tests/fast/long/50.vortex/alpha/tru64/inorder-timing
 Global frequency set at 1000000000000 ticks per second
 info: Entering event queue @ 0.  Starting simulation...
 info: Increasing stack size by one page.
-Exiting @ tick 43686968500 because halt instruction encountered
+Exiting @ tick 43687852500 because target called exit()
diff --git a/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/stats.txt b/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/stats.txt
index d26ecb3497..883ec05af4 100644
--- a/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/stats.txt
+++ b/tests/long/50.vortex/ref/alpha/tru64/inorder-timing/stats.txt
@@ -1,25 +1,25 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                  27953                       # Simulator instruction rate (inst/s)
-host_mem_usage                                1692040                       # Number of bytes of host memory used
-host_seconds                                  3160.33                       # Real time elapsed on the host
-host_tick_rate                               13823537                       # Simulator tick rate (ticks/s)
+host_inst_rate                                 140237                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 237028                       # Number of bytes of host memory used
+host_seconds                                   629.94                       # Real time elapsed on the host
+host_tick_rate                               69352666                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
-sim_insts                                    88340674                       # Number of instructions simulated
-sim_seconds                                  0.043687                       # Number of seconds simulated
-sim_ticks                                 43686968500                       # Number of ticks simulated
+sim_insts                                    88340673                       # Number of instructions simulated
+sim_seconds                                  0.043688                       # Number of seconds simulated
+sim_ticks                                 43687852500                       # Number of ticks simulated
 system.cpu.AGEN-Unit.agens                   35033051                       # Number of Address Generations
-system.cpu.Branch-Predictor.BTBHitPct       40.125175                       # BTB Hit Percentage
-system.cpu.Branch-Predictor.BTBHits           4678518                       # Number of BTB hits
-system.cpu.Branch-Predictor.BTBLookups       11659807                       # Number of BTB lookups
+system.cpu.Branch-Predictor.BTBHitPct       40.125186                       # BTB Hit Percentage
+system.cpu.Branch-Predictor.BTBHits           4678520                       # Number of BTB hits
+system.cpu.Branch-Predictor.BTBLookups       11659809                       # Number of BTB lookups
 system.cpu.Branch-Predictor.RASInCorrect         1539                       # Number of incorrect RAS predictions.
 system.cpu.Branch-Predictor.condIncorrect       753993                       # Number of conditional branches incorrect
-system.cpu.Branch-Predictor.condPredicted      9173158                       # Number of conditional branches predicted
-system.cpu.Branch-Predictor.lookups          14237669                       # Number of BP lookups
+system.cpu.Branch-Predictor.condPredicted      9173160                       # Number of conditional branches predicted
+system.cpu.Branch-Predictor.lookups          14237671                       # Number of BP lookups
 system.cpu.Branch-Predictor.predictedNotTaken      6139595                       # Number of Branches Predicted As Not Taken (False).
-system.cpu.Branch-Predictor.predictedTaken      8098074                       # Number of Branches Predicted As Taken (True).
+system.cpu.Branch-Predictor.predictedTaken      8098076                       # Number of Branches Predicted As Taken (True).
 system.cpu.Branch-Predictor.usedRAS           1660495                       # Number of times the RAS was used to get a target.
-system.cpu.Execution-Unit.executions         53620617                       # Number of Instructions Executed.
+system.cpu.Execution-Unit.executions         44841137                       # Number of Instructions Executed.
 system.cpu.Execution-Unit.mispredictPct      5.481801                       # Percentage of Incorrect Branches Predicts
 system.cpu.Execution-Unit.mispredicted         753993                       # Number of Branches Incorrectly Predicted
 system.cpu.Execution-Unit.predicted          13000484                       # Number of Branches Incorrectly Predicted
@@ -27,43 +27,43 @@ system.cpu.Execution-Unit.predictedNotTakenIncorrect       550902
 system.cpu.Execution-Unit.predictedTakenIncorrect       203091                       # Number of Branches Incorrectly Predicted As Taken.
 system.cpu.Mult-Div-Unit.divides                    0                       # Number of Divide Operations Executed
 system.cpu.Mult-Div-Unit.multiplies             41101                       # Number of Multipy Operations Executed
-system.cpu.RegFile-Manager.regFileAccesses    145605016                       # Number of Total Accesses (Read+Write) to the Register File
-system.cpu.RegFile-Manager.regFileReads      93058135                       # Number of Reads from Register File
+system.cpu.RegFile-Manager.regFileAccesses    145605009                       # Number of Total Accesses (Read+Write) to the Register File
+system.cpu.RegFile-Manager.regFileReads      93058128                       # Number of Reads from Register File
 system.cpu.RegFile-Manager.regFileWrites     52546881                       # Number of Writes to Register File
-system.cpu.RegFile-Manager.regForwards       13517269                       # Number of Registers Read Through Forwarding Logic
-system.cpu.activity                         70.714707                       # Percentage of cycles cpu is active
+system.cpu.RegFile-Manager.regForwards       13517276                       # Number of Registers Read Through Forwarding Logic
+system.cpu.activity                         70.715162                       # Percentage of cycles cpu is active
 system.cpu.comBranches                       13754477                       # Number of Branches instructions committed
 system.cpu.comFloats                           151453                       # Number of Floating Point instructions committed
 system.cpu.comInts                           30791227                       # Number of Integer instructions committed
 system.cpu.comLoads                          20276638                       # Number of Load instructions committed
-system.cpu.comNonSpec                            4584                       # Number of Non-Speculative instructions committed
+system.cpu.comNonSpec                            4583                       # Number of Non-Speculative instructions committed
 system.cpu.comNops                            8748916                       # Number of Nop instructions committed
 system.cpu.comStores                         14613377                       # Number of Store instructions committed
-system.cpu.committedInsts                    88340674                       # Number of Instructions Simulated (Per-Thread)
-system.cpu.committedInsts_total              88340674                       # Number of Instructions Simulated (Total)
+system.cpu.committedInsts                    88340673                       # Number of Instructions Simulated (Per-Thread)
+system.cpu.committedInsts_total              88340673                       # Number of Instructions Simulated (Total)
 system.cpu.contextSwitches                          1                       # Number of context switches
-system.cpu.cpi                               0.989057                       # CPI: Cycles Per Instruction (Per-Thread)
-system.cpu.cpi_total                         0.989057                       # CPI: Total CPI of All Threads
+system.cpu.cpi                               0.989077                       # CPI: Cycles Per Instruction (Per-Thread)
+system.cpu.cpi_total                         0.989077                       # CPI: Total CPI of All Threads
 system.cpu.dcache.ReadReq_accesses           20276638                       # number of ReadReq accesses(hits+misses)
 system.cpu.dcache.ReadReq_avg_miss_latency 43413.349504                       # average ReadReq miss latency
-system.cpu.dcache.ReadReq_avg_mshr_miss_latency 34421.543297                       # average ReadReq mshr miss latency
+system.cpu.dcache.ReadReq_avg_mshr_miss_latency 34421.526841                       # average ReadReq mshr miss latency
 system.cpu.dcache.ReadReq_hits               20182230                       # number of ReadReq hits
 system.cpu.dcache.ReadReq_miss_latency     4098567500                       # number of ReadReq miss cycles
 system.cpu.dcache.ReadReq_miss_rate          0.004656                       # miss rate for ReadReq accesses
 system.cpu.dcache.ReadReq_misses                94408                       # number of ReadReq misses
 system.cpu.dcache.ReadReq_mshr_hits             33642                       # number of ReadReq MSHR hits
-system.cpu.dcache.ReadReq_mshr_miss_latency   2091659500                       # number of ReadReq MSHR miss cycles
+system.cpu.dcache.ReadReq_mshr_miss_latency   2091658500                       # number of ReadReq MSHR miss cycles
 system.cpu.dcache.ReadReq_mshr_miss_rate     0.002997                       # mshr miss rate for ReadReq accesses
 system.cpu.dcache.ReadReq_mshr_misses           60766                       # number of ReadReq MSHR misses
 system.cpu.dcache.WriteReq_accesses          14613377                       # number of WriteReq accesses(hits+misses)
-system.cpu.dcache.WriteReq_avg_miss_latency 50157.670646                       # average WriteReq miss latency
-system.cpu.dcache.WriteReq_avg_mshr_miss_latency 49503.458051                       # average WriteReq mshr miss latency
+system.cpu.dcache.WriteReq_avg_miss_latency 50157.576620                       # average WriteReq miss latency
+system.cpu.dcache.WriteReq_avg_mshr_miss_latency 49503.360543                       # average WriteReq mshr miss latency
 system.cpu.dcache.WriteReq_hits              14405989                       # number of WriteReq hits
-system.cpu.dcache.WriteReq_miss_latency   10402099000                       # number of WriteReq miss cycles
+system.cpu.dcache.WriteReq_miss_latency   10402079500                       # number of WriteReq miss cycles
 system.cpu.dcache.WriteReq_miss_rate         0.014192                       # miss rate for WriteReq accesses
 system.cpu.dcache.WriteReq_misses              207388                       # number of WriteReq misses
 system.cpu.dcache.WriteReq_mshr_hits            63810                       # number of WriteReq MSHR hits
-system.cpu.dcache.WriteReq_mshr_miss_latency   7107607500                       # number of WriteReq MSHR miss cycles
+system.cpu.dcache.WriteReq_mshr_miss_latency   7107593500                       # number of WriteReq MSHR miss cycles
 system.cpu.dcache.WriteReq_mshr_miss_rate     0.009825                       # mshr miss rate for WriteReq accesses
 system.cpu.dcache.WriteReq_mshr_misses         143578                       # number of WriteReq MSHR misses
 system.cpu.dcache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
@@ -75,31 +75,31 @@ system.cpu.dcache.blocked_cycles::no_mshrs            0                       #
 system.cpu.dcache.blocked_cycles::no_targets      2727000                       # number of cycles access was blocked
 system.cpu.dcache.cache_copies                      0                       # number of cache copies performed
 system.cpu.dcache.demand_accesses            34890015                       # number of demand (read+write) accesses
-system.cpu.dcache.demand_avg_miss_latency 48047.908190                       # average overall miss latency
-system.cpu.dcache.demand_avg_mshr_miss_latency 45018.532475                       # average overall mshr miss latency
+system.cpu.dcache.demand_avg_miss_latency 48047.843576                       # average overall miss latency
+system.cpu.dcache.demand_avg_mshr_miss_latency 45018.459069                       # average overall mshr miss latency
 system.cpu.dcache.demand_hits                34588219                       # number of demand (read+write) hits
-system.cpu.dcache.demand_miss_latency     14500666500                       # number of demand (read+write) miss cycles
+system.cpu.dcache.demand_miss_latency     14500647000                       # number of demand (read+write) miss cycles
 system.cpu.dcache.demand_miss_rate           0.008650                       # miss rate for demand accesses
 system.cpu.dcache.demand_misses                301796                       # number of demand (read+write) misses
 system.cpu.dcache.demand_mshr_hits              97452                       # number of demand (read+write) MSHR hits
-system.cpu.dcache.demand_mshr_miss_latency   9199267000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.dcache.demand_mshr_miss_latency   9199252000                       # number of demand (read+write) MSHR miss cycles
 system.cpu.dcache.demand_mshr_miss_rate      0.005857                       # mshr miss rate for demand accesses
 system.cpu.dcache.demand_mshr_misses           204344                       # number of demand (read+write) MSHR misses
 system.cpu.dcache.fast_writes                       0                       # number of fast writes performed
 system.cpu.dcache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.dcache.no_allocate_misses                0                       # Number of misses that were no-allocate
 system.cpu.dcache.occ_%::0                   0.994103                       # Average percentage of cache occupancy
-system.cpu.dcache.occ_blocks::0           4071.844776                       # Average occupied blocks per context
+system.cpu.dcache.occ_blocks::0           4071.844772                       # Average occupied blocks per context
 system.cpu.dcache.overall_accesses           34890015                       # number of overall (read+write) accesses
-system.cpu.dcache.overall_avg_miss_latency 48047.908190                       # average overall miss latency
-system.cpu.dcache.overall_avg_mshr_miss_latency 45018.532475                       # average overall mshr miss latency
+system.cpu.dcache.overall_avg_miss_latency 48047.843576                       # average overall miss latency
+system.cpu.dcache.overall_avg_mshr_miss_latency 45018.459069                       # average overall mshr miss latency
 system.cpu.dcache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
 system.cpu.dcache.overall_hits               34588219                       # number of overall hits
-system.cpu.dcache.overall_miss_latency    14500666500                       # number of overall miss cycles
+system.cpu.dcache.overall_miss_latency    14500647000                       # number of overall miss cycles
 system.cpu.dcache.overall_miss_rate          0.008650                       # miss rate for overall accesses
 system.cpu.dcache.overall_misses               301796                       # number of overall misses
 system.cpu.dcache.overall_mshr_hits             97452                       # number of overall MSHR hits
-system.cpu.dcache.overall_mshr_miss_latency   9199267000                       # number of overall MSHR miss cycles
+system.cpu.dcache.overall_mshr_miss_latency   9199252000                       # number of overall MSHR miss cycles
 system.cpu.dcache.overall_mshr_miss_rate     0.005857                       # mshr miss rate for overall accesses
 system.cpu.dcache.overall_mshr_misses          204344                       # number of overall MSHR misses
 system.cpu.dcache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
@@ -107,9 +107,9 @@ system.cpu.dcache.overall_mshr_uncacheable_misses            0
 system.cpu.dcache.replacements                 200248                       # number of replacements
 system.cpu.dcache.sampled_refs                 204344                       # Sample count of references to valid blocks.
 system.cpu.dcache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.dcache.tagsinuse               4071.844776                       # Cycle average of tags in use
+system.cpu.dcache.tagsinuse               4071.844772                       # Cycle average of tags in use
 system.cpu.dcache.total_refs                 34588219                       # Total number of references to valid blocks.
-system.cpu.dcache.warmup_cycle              497786000                       # Cycle when the warmup percentage was hit.
+system.cpu.dcache.warmup_cycle              497796000                       # Cycle when the warmup percentage was hit.
 system.cpu.dcache.writebacks                   161214                       # number of writebacks
 system.cpu.dtb.data_accesses                 34987415                       # DTB accesses
 system.cpu.dtb.data_acv                             0                       # DTB access violations
@@ -127,51 +127,51 @@ system.cpu.dtb.write_accesses                14620629                       # DT
 system.cpu.dtb.write_acv                            0                       # DTB write access violations
 system.cpu.dtb.write_hits                    14613377                       # DTB write hits
 system.cpu.dtb.write_misses                      7252                       # DTB write misses
-system.cpu.icache.ReadReq_accesses           11384473                       # number of ReadReq accesses(hits+misses)
-system.cpu.icache.ReadReq_avg_miss_latency 18619.899316                       # average ReadReq miss latency
-system.cpu.icache.ReadReq_avg_mshr_miss_latency 15557.624423                       # average ReadReq mshr miss latency
-system.cpu.icache.ReadReq_hits               11286741                       # number of ReadReq hits
-system.cpu.icache.ReadReq_miss_latency     1819760000                       # number of ReadReq miss cycles
+system.cpu.icache.ReadReq_accesses           11384439                       # number of ReadReq accesses(hits+misses)
+system.cpu.icache.ReadReq_avg_miss_latency 18620.927639                       # average ReadReq miss latency
+system.cpu.icache.ReadReq_avg_mshr_miss_latency 15557.720286                       # average ReadReq mshr miss latency
+system.cpu.icache.ReadReq_hits               11286707                       # number of ReadReq hits
+system.cpu.icache.ReadReq_miss_latency     1819860500                       # number of ReadReq miss cycles
 system.cpu.icache.ReadReq_miss_rate          0.008585                       # miss rate for ReadReq accesses
 system.cpu.icache.ReadReq_misses                97732                       # number of ReadReq misses
 system.cpu.icache.ReadReq_mshr_hits              9063                       # number of ReadReq MSHR hits
-system.cpu.icache.ReadReq_mshr_miss_latency   1379479000                       # number of ReadReq MSHR miss cycles
+system.cpu.icache.ReadReq_mshr_miss_latency   1379487500                       # number of ReadReq MSHR miss cycles
 system.cpu.icache.ReadReq_mshr_miss_rate     0.007789                       # mshr miss rate for ReadReq accesses
 system.cpu.icache.ReadReq_mshr_misses           88669                       # number of ReadReq MSHR misses
 system.cpu.icache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.icache.avg_blocked_cycles::no_targets 18115.384615                       # average number of cycles each access was blocked
-system.cpu.icache.avg_refs                 127.292157                       # Average number of references to valid blocks.
+system.cpu.icache.avg_refs                 127.291774                       # Average number of references to valid blocks.
 system.cpu.icache.blocked::no_mshrs                 0                       # number of cycles access was blocked
 system.cpu.icache.blocked::no_targets              39                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_targets       706500                       # number of cycles access was blocked
 system.cpu.icache.cache_copies                      0                       # number of cache copies performed
-system.cpu.icache.demand_accesses            11384473                       # number of demand (read+write) accesses
-system.cpu.icache.demand_avg_miss_latency 18619.899316                       # average overall miss latency
-system.cpu.icache.demand_avg_mshr_miss_latency 15557.624423                       # average overall mshr miss latency
-system.cpu.icache.demand_hits                11286741                       # number of demand (read+write) hits
-system.cpu.icache.demand_miss_latency      1819760000                       # number of demand (read+write) miss cycles
+system.cpu.icache.demand_accesses            11384439                       # number of demand (read+write) accesses
+system.cpu.icache.demand_avg_miss_latency 18620.927639                       # average overall miss latency
+system.cpu.icache.demand_avg_mshr_miss_latency 15557.720286                       # average overall mshr miss latency
+system.cpu.icache.demand_hits                11286707                       # number of demand (read+write) hits
+system.cpu.icache.demand_miss_latency      1819860500                       # number of demand (read+write) miss cycles
 system.cpu.icache.demand_miss_rate           0.008585                       # miss rate for demand accesses
 system.cpu.icache.demand_misses                 97732                       # number of demand (read+write) misses
 system.cpu.icache.demand_mshr_hits               9063                       # number of demand (read+write) MSHR hits
-system.cpu.icache.demand_mshr_miss_latency   1379479000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.icache.demand_mshr_miss_latency   1379487500                       # number of demand (read+write) MSHR miss cycles
 system.cpu.icache.demand_mshr_miss_rate      0.007789                       # mshr miss rate for demand accesses
 system.cpu.icache.demand_mshr_misses            88669                       # number of demand (read+write) MSHR misses
 system.cpu.icache.fast_writes                       0                       # number of fast writes performed
 system.cpu.icache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.icache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.icache.occ_%::0                   0.918761                       # Average percentage of cache occupancy
-system.cpu.icache.occ_blocks::0           1881.622790                       # Average occupied blocks per context
-system.cpu.icache.overall_accesses           11384473                       # number of overall (read+write) accesses
-system.cpu.icache.overall_avg_miss_latency 18619.899316                       # average overall miss latency
-system.cpu.icache.overall_avg_mshr_miss_latency 15557.624423                       # average overall mshr miss latency
+system.cpu.icache.occ_%::0                   0.918759                       # Average percentage of cache occupancy
+system.cpu.icache.occ_blocks::0           1881.619179                       # Average occupied blocks per context
+system.cpu.icache.overall_accesses           11384439                       # number of overall (read+write) accesses
+system.cpu.icache.overall_avg_miss_latency 18620.927639                       # average overall miss latency
+system.cpu.icache.overall_avg_mshr_miss_latency 15557.720286                       # average overall mshr miss latency
 system.cpu.icache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.icache.overall_hits               11286741                       # number of overall hits
-system.cpu.icache.overall_miss_latency     1819760000                       # number of overall miss cycles
+system.cpu.icache.overall_hits               11286707                       # number of overall hits
+system.cpu.icache.overall_miss_latency     1819860500                       # number of overall miss cycles
 system.cpu.icache.overall_miss_rate          0.008585                       # miss rate for overall accesses
 system.cpu.icache.overall_misses                97732                       # number of overall misses
 system.cpu.icache.overall_mshr_hits              9063                       # number of overall MSHR hits
-system.cpu.icache.overall_mshr_miss_latency   1379479000                       # number of overall MSHR miss cycles
+system.cpu.icache.overall_mshr_miss_latency   1379487500                       # number of overall MSHR miss cycles
 system.cpu.icache.overall_mshr_miss_rate     0.007789                       # mshr miss rate for overall accesses
 system.cpu.icache.overall_mshr_misses           88669                       # number of overall MSHR misses
 system.cpu.icache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
@@ -179,20 +179,20 @@ system.cpu.icache.overall_mshr_uncacheable_misses            0
 system.cpu.icache.replacements                  86622                       # number of replacements
 system.cpu.icache.sampled_refs                  88668                       # Sample count of references to valid blocks.
 system.cpu.icache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.icache.tagsinuse               1881.622790                       # Cycle average of tags in use
-system.cpu.icache.total_refs                 11286741                       # Total number of references to valid blocks.
+system.cpu.icache.tagsinuse               1881.619179                       # Cycle average of tags in use
+system.cpu.icache.total_refs                 11286707                       # Total number of references to valid blocks.
 system.cpu.icache.warmup_cycle                      0                       # Cycle when the warmup percentage was hit.
 system.cpu.icache.writebacks                        0                       # number of writebacks
-system.cpu.idleCycles                        25587714                       # Number of cycles cpu's stages were not processed
-system.cpu.ipc                               1.011064                       # IPC: Instructions Per Cycle (Per-Thread)
-system.cpu.ipc_total                         1.011064                       # IPC: Total IPC of All Threads
+system.cpu.idleCycles                        25587834                       # Number of cycles cpu's stages were not processed
+system.cpu.ipc                               1.011044                       # IPC: Instructions Per Cycle (Per-Thread)
+system.cpu.ipc_total                         1.011044                       # IPC: Total IPC of All Threads
 system.cpu.itb.data_accesses                        0                       # DTB accesses
 system.cpu.itb.data_acv                             0                       # DTB access violations
 system.cpu.itb.data_hits                            0                       # DTB hits
 system.cpu.itb.data_misses                          0                       # DTB misses
-system.cpu.itb.fetch_accesses                11389750                       # ITB accesses
+system.cpu.itb.fetch_accesses                11389716                       # ITB accesses
 system.cpu.itb.fetch_acv                            0                       # ITB acv
-system.cpu.itb.fetch_hits                    11384494                       # ITB hits
+system.cpu.itb.fetch_hits                    11384460                       # ITB hits
 system.cpu.itb.fetch_misses                      5256                       # ITB misses
 system.cpu.itb.read_accesses                        0                       # DTB read accesses
 system.cpu.itb.read_acv                             0                       # DTB read access violations
@@ -203,23 +203,23 @@ system.cpu.itb.write_acv                            0                       # DT
 system.cpu.itb.write_hits                           0                       # DTB write hits
 system.cpu.itb.write_misses                         0                       # DTB write misses
 system.cpu.l2cache.ReadExReq_accesses          143582                       # number of ReadExReq accesses(hits+misses)
-system.cpu.l2cache.ReadExReq_avg_miss_latency 52040.936228                       # average ReadExReq miss latency
-system.cpu.l2cache.ReadExReq_avg_mshr_miss_latency 40000.851808                       # average ReadExReq mshr miss latency
+system.cpu.l2cache.ReadExReq_avg_miss_latency 52040.829752                       # average ReadExReq miss latency
+system.cpu.l2cache.ReadExReq_avg_mshr_miss_latency 40000.848005                       # average ReadExReq mshr miss latency
 system.cpu.l2cache.ReadExReq_hits               12097                       # number of ReadExReq hits
-system.cpu.l2cache.ReadExReq_miss_latency   6842602500                       # number of ReadExReq miss cycles
+system.cpu.l2cache.ReadExReq_miss_latency   6842588500                       # number of ReadExReq miss cycles
 system.cpu.l2cache.ReadExReq_miss_rate       0.915748                       # miss rate for ReadExReq accesses
 system.cpu.l2cache.ReadExReq_misses            131485                       # number of ReadExReq misses
-system.cpu.l2cache.ReadExReq_mshr_miss_latency   5259512000                       # number of ReadExReq MSHR miss cycles
+system.cpu.l2cache.ReadExReq_mshr_miss_latency   5259511500                       # number of ReadExReq MSHR miss cycles
 system.cpu.l2cache.ReadExReq_mshr_miss_rate     0.915748                       # mshr miss rate for ReadExReq accesses
 system.cpu.l2cache.ReadExReq_mshr_misses       131485                       # number of ReadExReq MSHR misses
 system.cpu.l2cache.ReadReq_accesses            149430                       # number of ReadReq accesses(hits+misses)
-system.cpu.l2cache.ReadReq_avg_miss_latency 52294.227145                       # average ReadReq miss latency
-system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 40025.874305                       # average ReadReq mshr miss latency
+system.cpu.l2cache.ReadReq_avg_miss_latency 52294.157340                       # average ReadReq miss latency
+system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 40025.851037                       # average ReadReq mshr miss latency
 system.cpu.l2cache.ReadReq_hits                106453                       # number of ReadReq hits
-system.cpu.l2cache.ReadReq_miss_latency    2247449000                       # number of ReadReq miss cycles
+system.cpu.l2cache.ReadReq_miss_latency    2247446000                       # number of ReadReq miss cycles
 system.cpu.l2cache.ReadReq_miss_rate         0.287606                       # miss rate for ReadReq accesses
 system.cpu.l2cache.ReadReq_misses               42977                       # number of ReadReq misses
-system.cpu.l2cache.ReadReq_mshr_miss_latency   1720192000                       # number of ReadReq MSHR miss cycles
+system.cpu.l2cache.ReadReq_mshr_miss_latency   1720191000                       # number of ReadReq MSHR miss cycles
 system.cpu.l2cache.ReadReq_mshr_miss_rate     0.287606                       # mshr miss rate for ReadReq accesses
 system.cpu.l2cache.ReadReq_mshr_misses          42977                       # number of ReadReq MSHR misses
 system.cpu.l2cache.Writeback_accesses          161214                       # number of Writeback accesses(hits+misses)
@@ -233,33 +233,33 @@ system.cpu.l2cache.blocked_cycles::no_mshrs            0                       #
 system.cpu.l2cache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.l2cache.cache_copies                     0                       # number of cache copies performed
 system.cpu.l2cache.demand_accesses             293012                       # number of demand (read+write) accesses
-system.cpu.l2cache.demand_avg_miss_latency 52103.331958                       # average overall miss latency
-system.cpu.l2cache.demand_avg_mshr_miss_latency 40007.015854                       # average overall mshr miss latency
+system.cpu.l2cache.demand_avg_miss_latency 52103.234515                       # average overall miss latency
+system.cpu.l2cache.demand_avg_mshr_miss_latency 40007.007257                       # average overall mshr miss latency
 system.cpu.l2cache.demand_hits                 118550                       # number of demand (read+write) hits
-system.cpu.l2cache.demand_miss_latency     9090051500                       # number of demand (read+write) miss cycles
+system.cpu.l2cache.demand_miss_latency     9090034500                       # number of demand (read+write) miss cycles
 system.cpu.l2cache.demand_miss_rate          0.595409                       # miss rate for demand accesses
 system.cpu.l2cache.demand_misses               174462                       # number of demand (read+write) misses
 system.cpu.l2cache.demand_mshr_hits                 0                       # number of demand (read+write) MSHR hits
-system.cpu.l2cache.demand_mshr_miss_latency   6979704000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.l2cache.demand_mshr_miss_latency   6979702500                       # number of demand (read+write) MSHR miss cycles
 system.cpu.l2cache.demand_mshr_miss_rate     0.595409                       # mshr miss rate for demand accesses
 system.cpu.l2cache.demand_mshr_misses          174462                       # number of demand (read+write) MSHR misses
 system.cpu.l2cache.fast_writes                      0                       # number of fast writes performed
 system.cpu.l2cache.mshr_cap_events                  0                       # number of times MSHR cap was activated
 system.cpu.l2cache.no_allocate_misses               0                       # Number of misses that were no-allocate
-system.cpu.l2cache.occ_%::0                  0.093045                       # Average percentage of cache occupancy
+system.cpu.l2cache.occ_%::0                  0.093044                       # Average percentage of cache occupancy
 system.cpu.l2cache.occ_%::1                  0.476016                       # Average percentage of cache occupancy
-system.cpu.l2cache.occ_blocks::0          3048.903015                       # Average occupied blocks per context
-system.cpu.l2cache.occ_blocks::1         15598.107451                       # Average occupied blocks per context
+system.cpu.l2cache.occ_blocks::0          3048.873160                       # Average occupied blocks per context
+system.cpu.l2cache.occ_blocks::1         15598.097053                       # Average occupied blocks per context
 system.cpu.l2cache.overall_accesses            293012                       # number of overall (read+write) accesses
-system.cpu.l2cache.overall_avg_miss_latency 52103.331958                       # average overall miss latency
-system.cpu.l2cache.overall_avg_mshr_miss_latency 40007.015854                       # average overall mshr miss latency
+system.cpu.l2cache.overall_avg_miss_latency 52103.234515                       # average overall miss latency
+system.cpu.l2cache.overall_avg_mshr_miss_latency 40007.007257                       # average overall mshr miss latency
 system.cpu.l2cache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
 system.cpu.l2cache.overall_hits                118550                       # number of overall hits
-system.cpu.l2cache.overall_miss_latency    9090051500                       # number of overall miss cycles
+system.cpu.l2cache.overall_miss_latency    9090034500                       # number of overall miss cycles
 system.cpu.l2cache.overall_miss_rate         0.595409                       # miss rate for overall accesses
 system.cpu.l2cache.overall_misses              174462                       # number of overall misses
 system.cpu.l2cache.overall_mshr_hits                0                       # number of overall MSHR hits
-system.cpu.l2cache.overall_mshr_miss_latency   6979704000                       # number of overall MSHR miss cycles
+system.cpu.l2cache.overall_mshr_miss_latency   6979702500                       # number of overall MSHR miss cycles
 system.cpu.l2cache.overall_mshr_miss_rate     0.595409                       # mshr miss rate for overall accesses
 system.cpu.l2cache.overall_mshr_misses         174462                       # number of overall MSHR misses
 system.cpu.l2cache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
@@ -267,35 +267,35 @@ system.cpu.l2cache.overall_mshr_uncacheable_misses            0
 system.cpu.l2cache.replacements                148090                       # number of replacements
 system.cpu.l2cache.sampled_refs                173435                       # Sample count of references to valid blocks.
 system.cpu.l2cache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.l2cache.tagsinuse             18647.010465                       # Cycle average of tags in use
+system.cpu.l2cache.tagsinuse             18646.970214                       # Cycle average of tags in use
 system.cpu.l2cache.total_refs                  134496                       # Total number of references to valid blocks.
 system.cpu.l2cache.warmup_cycle                     0                       # Cycle when the warmup percentage was hit.
 system.cpu.l2cache.writebacks                  120516                       # number of writebacks
-system.cpu.numCycles                         87373938                       # number of cpu cycles simulated
+system.cpu.numCycles                         87375706                       # number of cpu cycles simulated
 system.cpu.numWorkItemsCompleted                    0                       # number of work items this cpu completed
 system.cpu.numWorkItemsStarted                      0                       # number of work items this cpu started
-system.cpu.runCycles                         61786224                       # Number of cycles cpu stages are processed.
+system.cpu.runCycles                         61787872                       # Number of cycles cpu stages are processed.
 system.cpu.smtCommittedInsts                        0                       # Number of SMT Instructions Simulated (Per-Thread)
 system.cpu.smtCycles                                0                       # Total number of cycles that the CPU was in SMT-mode
 system.cpu.smt_cpi                           no_value                       # CPI: Total SMT-CPI
 system.cpu.smt_ipc                           no_value                       # IPC: Total SMT-IPC
-system.cpu.stage-0.idleCycles                42492197                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-0.runCycles                 44881741                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-0.utilization              51.367424                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-1.idleCycles                48180975                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-1.runCycles                 39192963                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-1.utilization              44.856583                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-2.idleCycles                46081271                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-2.runCycles                 41292667                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-2.utilization              47.259707                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-3.idleCycles                63475501                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-0.idleCycles                42493951                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-0.runCycles                 44881755                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-0.utilization              51.366400                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-1.idleCycles                48181868                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-1.runCycles                 39193838                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-1.utilization              44.856677                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-2.idleCycles                46079607                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-2.runCycles                 41296099                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-2.utilization              47.262678                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-3.idleCycles                63477269                       # Number of cycles 0 instructions are processed.
 system.cpu.stage-3.runCycles                 23898437                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-3.utilization              27.351906                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-4.idleCycles                39335442                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-4.runCycles                 48038496                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-4.utilization              54.980349                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.threadCycles                      69006043                       # Total Number of Cycles A Thread Was Active in CPU (Per-Thread)
-system.cpu.timesIdled                          289198                       # Number of times that the entire CPU went into an idle state and unscheduled itself
+system.cpu.stage-3.utilization              27.351352                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-4.idleCycles                39338499                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-4.runCycles                 48037207                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-4.utilization              54.977761                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.threadCycles                      69007682                       # Total Number of Cycles A Thread Was Active in CPU (Per-Thread)
+system.cpu.timesIdled                          289197                       # Number of times that the entire CPU went into an idle state and unscheduled itself
 system.cpu.workload.PROG:num_syscalls            4583                       # Number of system calls
 
 ---------- End Simulation Statistics   ----------
diff --git a/tests/long/60.bzip2/ref/x86/linux/simple-atomic/simout b/tests/long/60.bzip2/ref/x86/linux/simple-atomic/simout
index 228e6ab0ce..403cb4d0bb 100755
--- a/tests/long/60.bzip2/ref/x86/linux/simple-atomic/simout
+++ b/tests/long/60.bzip2/ref/x86/linux/simple-atomic/simout
@@ -5,9 +5,9 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:32:13
+M5 compiled Feb  8 2011 00:58:32
+M5 revision 705a4d351a43 7939 default qtip resforflagsstats.patch tip
+M5 started Feb  8 2011 00:58:34
 M5 executing on burrito
 command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/long/60.bzip2/x86/linux/simple-atomic -re tests/run.py build/X86_SE/tests/fast/long/60.bzip2/x86/linux/simple-atomic
 Global frequency set at 1000000000000 ticks per second
diff --git a/tests/long/60.bzip2/ref/x86/linux/simple-atomic/stats.txt b/tests/long/60.bzip2/ref/x86/linux/simple-atomic/stats.txt
index a0361e843d..9e70ccdd10 100644
--- a/tests/long/60.bzip2/ref/x86/linux/simple-atomic/stats.txt
+++ b/tests/long/60.bzip2/ref/x86/linux/simple-atomic/stats.txt
@@ -1,9 +1,9 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                1421831                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 223380                       # Number of bytes of host memory used
-host_seconds                                  3296.36                       # Real time elapsed on the host
-host_tick_rate                              863379215                       # Simulator tick rate (ticks/s)
+host_inst_rate                                2540540                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 223860                       # Number of bytes of host memory used
+host_seconds                                  1844.83                       # Real time elapsed on the host
+host_tick_rate                             1542694185                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                  4686862651                       # Number of instructions simulated
 sim_seconds                                  2.846007                       # Number of seconds simulated
@@ -24,7 +24,7 @@ system.cpu.num_idle_cycles                          0                       # Nu
 system.cpu.num_insts                       4686862651                       # Number of instructions executed
 system.cpu.num_int_alu_accesses            4686862580                       # Number of integer alu accesses
 system.cpu.num_int_insts                   4686862580                       # number of integer instructions
-system.cpu.num_int_register_reads         14008880122                       # number of times the integer registers were read
+system.cpu.num_int_register_reads         11558008181                       # number of times the integer registers were read
 system.cpu.num_int_register_writes         4679057393                       # number of times the integer registers were written
 system.cpu.num_load_insts                  1239184749                       # Number of load instructions
 system.cpu.num_mem_refs                    1677713086                       # number of memory refs
diff --git a/tests/long/60.bzip2/ref/x86/linux/simple-timing/simout b/tests/long/60.bzip2/ref/x86/linux/simple-timing/simout
index 2ae1841323..65c0a88403 100755
--- a/tests/long/60.bzip2/ref/x86/linux/simple-timing/simout
+++ b/tests/long/60.bzip2/ref/x86/linux/simple-timing/simout
@@ -5,9 +5,9 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:32:12
+M5 compiled Feb  8 2011 00:58:32
+M5 revision 705a4d351a43 7939 default qtip resforflagsstats.patch tip
+M5 started Feb  8 2011 00:58:34
 M5 executing on burrito
 command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/long/60.bzip2/x86/linux/simple-timing -re tests/run.py build/X86_SE/tests/fast/long/60.bzip2/x86/linux/simple-timing
 Global frequency set at 1000000000000 ticks per second
diff --git a/tests/long/60.bzip2/ref/x86/linux/simple-timing/stats.txt b/tests/long/60.bzip2/ref/x86/linux/simple-timing/stats.txt
index 21d2dce983..59534c87e0 100644
--- a/tests/long/60.bzip2/ref/x86/linux/simple-timing/stats.txt
+++ b/tests/long/60.bzip2/ref/x86/linux/simple-timing/stats.txt
@@ -1,9 +1,9 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                 980837                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 231100                       # Number of bytes of host memory used
-host_seconds                                  4778.43                       # Real time elapsed on the host
-host_tick_rate                             1239642391                       # Simulator tick rate (ticks/s)
+host_inst_rate                                1546064                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 231584                       # Number of bytes of host memory used
+host_seconds                                  3031.48                       # Real time elapsed on the host
+host_tick_rate                             1954011316                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                  4686862651                       # Number of instructions simulated
 sim_seconds                                  5.923548                       # Number of seconds simulated
@@ -213,7 +213,7 @@ system.cpu.num_idle_cycles                          0                       # Nu
 system.cpu.num_insts                       4686862651                       # Number of instructions executed
 system.cpu.num_int_alu_accesses            4686862580                       # Number of integer alu accesses
 system.cpu.num_int_insts                   4686862580                       # number of integer instructions
-system.cpu.num_int_register_reads         14008880122                       # number of times the integer registers were read
+system.cpu.num_int_register_reads         11558008181                       # number of times the integer registers were read
 system.cpu.num_int_register_writes         4679057393                       # number of times the integer registers were written
 system.cpu.num_load_insts                  1239184749                       # Number of load instructions
 system.cpu.num_mem_refs                    1677713086                       # number of memory refs
diff --git a/tests/long/70.twolf/ref/alpha/tru64/inorder-timing/simout b/tests/long/70.twolf/ref/alpha/tru64/inorder-timing/simout
index 2bd9f81402..d80de6314c 100755
--- a/tests/long/70.twolf/ref/alpha/tru64/inorder-timing/simout
+++ b/tests/long/70.twolf/ref/alpha/tru64/inorder-timing/simout
@@ -5,10 +5,10 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 01:47:18
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 01:47:37
-M5 executing on burrito
+M5 compiled Feb 18 2011 15:40:30
+M5 revision Unknown
+M5 started Feb 18 2011 19:04:15
+M5 executing on m55-001.pool
 command line: build/ALPHA_SE/m5.fast -d build/ALPHA_SE/tests/fast/long/70.twolf/alpha/tru64/inorder-timing -re tests/run.py build/ALPHA_SE/tests/fast/long/70.twolf/alpha/tru64/inorder-timing
 Couldn't unlink  build/ALPHA_SE/tests/fast/long/70.twolf/alpha/tru64/inorder-timing/smred.sav
 Couldn't unlink  build/ALPHA_SE/tests/fast/long/70.twolf/alpha/tru64/inorder-timing/smred.sv2
@@ -28,4 +28,4 @@ Authors: Carl Sechen, Bill Swartz
  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90 
  91  92  93  94  95  96  97  98  99 100 101 102 103 104 105 
 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 
-122 123 124 Exiting @ tick 40531473000 because halt instruction encountered
+122 123 124 Exiting @ tick 40531279000 because target called exit()
diff --git a/tests/long/70.twolf/ref/alpha/tru64/inorder-timing/stats.txt b/tests/long/70.twolf/ref/alpha/tru64/inorder-timing/stats.txt
index bb16b8b96a..b786833034 100644
--- a/tests/long/70.twolf/ref/alpha/tru64/inorder-timing/stats.txt
+++ b/tests/long/70.twolf/ref/alpha/tru64/inorder-timing/stats.txt
@@ -1,25 +1,25 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                  25888                       # Simulator instruction rate (inst/s)
-host_mem_usage                                1480704                       # Number of bytes of host memory used
-host_seconds                                  3550.03                       # Real time elapsed on the host
-host_tick_rate                               11417230                       # Simulator tick rate (ticks/s)
+host_inst_rate                                 137731                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 254052                       # Number of bytes of host memory used
+host_seconds                                   667.27                       # Real time elapsed on the host
+host_tick_rate                               60742348                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
-sim_insts                                    91903057                       # Number of instructions simulated
+sim_insts                                    91903056                       # Number of instructions simulated
 sim_seconds                                  0.040531                       # Number of seconds simulated
-sim_ticks                                 40531473000                       # Number of ticks simulated
+sim_ticks                                 40531279000                       # Number of ticks simulated
 system.cpu.AGEN-Unit.agens                   27308571                       # Number of Address Generations
-system.cpu.Branch-Predictor.BTBHitPct       59.146475                       # BTB Hit Percentage
+system.cpu.Branch-Predictor.BTBHitPct       59.146483                       # BTB Hit Percentage
 system.cpu.Branch-Predictor.BTBHits           4489525                       # Number of BTB hits
-system.cpu.Branch-Predictor.BTBLookups        7590520                       # Number of BTB lookups
+system.cpu.Branch-Predictor.BTBLookups        7590519                       # Number of BTB lookups
 system.cpu.Branch-Predictor.RASInCorrect          138                       # Number of incorrect RAS predictions.
 system.cpu.Branch-Predictor.condIncorrect      2806970                       # Number of conditional branches incorrect
 system.cpu.Branch-Predictor.condPredicted      7883251                       # Number of conditional branches predicted
-system.cpu.Branch-Predictor.lookups          11539981                       # Number of BP lookups
+system.cpu.Branch-Predictor.lookups          11539980                       # Number of BP lookups
 system.cpu.Branch-Predictor.predictedNotTaken      4913265                       # Number of Branches Predicted As Not Taken (False).
-system.cpu.Branch-Predictor.predictedTaken      6626716                       # Number of Branches Predicted As Taken (True).
+system.cpu.Branch-Predictor.predictedTaken      6626715                       # Number of Branches Predicted As Taken (True).
 system.cpu.Branch-Predictor.usedRAS           1029619                       # Number of times the RAS was used to get a target.
-system.cpu.Execution-Unit.executions         66407277                       # Number of Instructions Executed.
+system.cpu.Execution-Unit.executions         57928840                       # Number of Instructions Executed.
 system.cpu.Execution-Unit.mispredictPct     27.409983                       # Percentage of Incorrect Branches Predicts
 system.cpu.Execution-Unit.mispredicted        2806970                       # Number of Branches Incorrectly Predicted
 system.cpu.Execution-Unit.predicted           7433715                       # Number of Branches Incorrectly Predicted
@@ -27,43 +27,43 @@ system.cpu.Execution-Unit.predictedNotTakenIncorrect      1384945
 system.cpu.Execution-Unit.predictedTakenIncorrect      1422025                       # Number of Branches Incorrectly Predicted As Taken.
 system.cpu.Mult-Div-Unit.divides                    0                       # Number of Divide Operations Executed
 system.cpu.Mult-Div-Unit.multiplies            458252                       # Number of Multipy Operations Executed
-system.cpu.RegFile-Manager.regFileAccesses    152685933                       # Number of Total Accesses (Read+Write) to the Register File
-system.cpu.RegFile-Manager.regFileReads      84258572                       # Number of Reads from Register File
+system.cpu.RegFile-Manager.regFileAccesses    152685930                       # Number of Total Accesses (Read+Write) to the Register File
+system.cpu.RegFile-Manager.regFileReads      84258569                       # Number of Reads from Register File
 system.cpu.RegFile-Manager.regFileWrites     68427361                       # Number of Writes to Register File
-system.cpu.RegFile-Manager.regForwards       38185925                       # Number of Registers Read Through Forwarding Logic
-system.cpu.activity                         91.670105                       # Percentage of cycles cpu is active
+system.cpu.RegFile-Manager.regForwards       38185928                       # Number of Registers Read Through Forwarding Logic
+system.cpu.activity                         91.670040                       # Percentage of cycles cpu is active
 system.cpu.comBranches                       10240685                       # Number of Branches instructions committed
 system.cpu.comFloats                          3775974                       # Number of Floating Point instructions committed
 system.cpu.comInts                           43665352                       # Number of Integer instructions committed
 system.cpu.comLoads                          19996198                       # Number of Load instructions committed
-system.cpu.comNonSpec                             390                       # Number of Non-Speculative instructions committed
+system.cpu.comNonSpec                             389                       # Number of Non-Speculative instructions committed
 system.cpu.comNops                            7723346                       # Number of Nop instructions committed
 system.cpu.comStores                          6501103                       # Number of Store instructions committed
-system.cpu.committedInsts                    91903057                       # Number of Instructions Simulated (Per-Thread)
-system.cpu.committedInsts_total              91903057                       # Number of Instructions Simulated (Total)
+system.cpu.committedInsts                    91903056                       # Number of Instructions Simulated (Per-Thread)
+system.cpu.committedInsts_total              91903056                       # Number of Instructions Simulated (Total)
 system.cpu.contextSwitches                          1                       # Number of context switches
-system.cpu.cpi                               0.882048                       # CPI: Cycles Per Instruction (Per-Thread)
-system.cpu.cpi_total                         0.882048                       # CPI: Total CPI of All Threads
+system.cpu.cpi                               0.882044                       # CPI: Cycles Per Instruction (Per-Thread)
+system.cpu.cpi_total                         0.882044                       # CPI: Total CPI of All Threads
 system.cpu.dcache.ReadReq_accesses           19996198                       # number of ReadReq accesses(hits+misses)
-system.cpu.dcache.ReadReq_avg_miss_latency 51751.953125                       # average ReadReq miss latency
-system.cpu.dcache.ReadReq_avg_mshr_miss_latency 48809.473684                       # average ReadReq mshr miss latency
+system.cpu.dcache.ReadReq_avg_miss_latency 51752.929688                       # average ReadReq miss latency
+system.cpu.dcache.ReadReq_avg_mshr_miss_latency 48810.526316                       # average ReadReq mshr miss latency
 system.cpu.dcache.ReadReq_hits               19995686                       # number of ReadReq hits
-system.cpu.dcache.ReadReq_miss_latency       26497000                       # number of ReadReq miss cycles
+system.cpu.dcache.ReadReq_miss_latency       26497500                       # number of ReadReq miss cycles
 system.cpu.dcache.ReadReq_miss_rate          0.000026                       # miss rate for ReadReq accesses
 system.cpu.dcache.ReadReq_misses                  512                       # number of ReadReq misses
 system.cpu.dcache.ReadReq_mshr_hits                37                       # number of ReadReq MSHR hits
-system.cpu.dcache.ReadReq_mshr_miss_latency     23184500                       # number of ReadReq MSHR miss cycles
+system.cpu.dcache.ReadReq_mshr_miss_latency     23185000                       # number of ReadReq MSHR miss cycles
 system.cpu.dcache.ReadReq_mshr_miss_rate     0.000024                       # mshr miss rate for ReadReq accesses
 system.cpu.dcache.ReadReq_mshr_misses             475                       # number of ReadReq MSHR misses
 system.cpu.dcache.WriteReq_accesses           6501103                       # number of WriteReq accesses(hits+misses)
-system.cpu.dcache.WriteReq_avg_miss_latency 55922.090261                       # average WriteReq miss latency
-system.cpu.dcache.WriteReq_avg_mshr_miss_latency 52793.478261                       # average WriteReq mshr miss latency
+system.cpu.dcache.WriteReq_avg_miss_latency 55921.258907                       # average WriteReq miss latency
+system.cpu.dcache.WriteReq_avg_mshr_miss_latency 52792.620137                       # average WriteReq mshr miss latency
 system.cpu.dcache.WriteReq_hits               6496893                       # number of WriteReq hits
-system.cpu.dcache.WriteReq_miss_latency     235432000                       # number of WriteReq miss cycles
+system.cpu.dcache.WriteReq_miss_latency     235428500                       # number of WriteReq miss cycles
 system.cpu.dcache.WriteReq_miss_rate         0.000648                       # miss rate for WriteReq accesses
 system.cpu.dcache.WriteReq_misses                4210                       # number of WriteReq misses
 system.cpu.dcache.WriteReq_mshr_hits             2462                       # number of WriteReq MSHR hits
-system.cpu.dcache.WriteReq_mshr_miss_latency     92283000                       # number of WriteReq MSHR miss cycles
+system.cpu.dcache.WriteReq_mshr_miss_latency     92281500                       # number of WriteReq MSHR miss cycles
 system.cpu.dcache.WriteReq_mshr_miss_rate     0.000269                       # mshr miss rate for WriteReq accesses
 system.cpu.dcache.WriteReq_mshr_misses           1748                       # number of WriteReq MSHR misses
 system.cpu.dcache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
@@ -75,31 +75,31 @@ system.cpu.dcache.blocked_cycles::no_mshrs            0                       #
 system.cpu.dcache.blocked_cycles::no_targets      1373500                       # number of cycles access was blocked
 system.cpu.dcache.cache_copies                      0                       # number of cache copies performed
 system.cpu.dcache.demand_accesses            26497301                       # number of demand (read+write) accesses
-system.cpu.dcache.demand_avg_miss_latency 55469.927997                       # average overall miss latency
-system.cpu.dcache.demand_avg_mshr_miss_latency 51942.195232                       # average overall mshr miss latency
+system.cpu.dcache.demand_avg_miss_latency 55469.292673                       # average overall miss latency
+system.cpu.dcache.demand_avg_mshr_miss_latency 51941.745389                       # average overall mshr miss latency
 system.cpu.dcache.demand_hits                26492579                       # number of demand (read+write) hits
-system.cpu.dcache.demand_miss_latency       261929000                       # number of demand (read+write) miss cycles
+system.cpu.dcache.demand_miss_latency       261926000                       # number of demand (read+write) miss cycles
 system.cpu.dcache.demand_miss_rate           0.000178                       # miss rate for demand accesses
 system.cpu.dcache.demand_misses                  4722                       # number of demand (read+write) misses
 system.cpu.dcache.demand_mshr_hits               2499                       # number of demand (read+write) MSHR hits
-system.cpu.dcache.demand_mshr_miss_latency    115467500                       # number of demand (read+write) MSHR miss cycles
+system.cpu.dcache.demand_mshr_miss_latency    115466500                       # number of demand (read+write) MSHR miss cycles
 system.cpu.dcache.demand_mshr_miss_rate      0.000084                       # mshr miss rate for demand accesses
 system.cpu.dcache.demand_mshr_misses             2223                       # number of demand (read+write) MSHR misses
 system.cpu.dcache.fast_writes                       0                       # number of fast writes performed
 system.cpu.dcache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.dcache.no_allocate_misses                0                       # Number of misses that were no-allocate
 system.cpu.dcache.occ_%::0                   0.351931                       # Average percentage of cache occupancy
-system.cpu.dcache.occ_blocks::0           1441.507978                       # Average occupied blocks per context
+system.cpu.dcache.occ_blocks::0           1441.508051                       # Average occupied blocks per context
 system.cpu.dcache.overall_accesses           26497301                       # number of overall (read+write) accesses
-system.cpu.dcache.overall_avg_miss_latency 55469.927997                       # average overall miss latency
-system.cpu.dcache.overall_avg_mshr_miss_latency 51942.195232                       # average overall mshr miss latency
+system.cpu.dcache.overall_avg_miss_latency 55469.292673                       # average overall miss latency
+system.cpu.dcache.overall_avg_mshr_miss_latency 51941.745389                       # average overall mshr miss latency
 system.cpu.dcache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
 system.cpu.dcache.overall_hits               26492579                       # number of overall hits
-system.cpu.dcache.overall_miss_latency      261929000                       # number of overall miss cycles
+system.cpu.dcache.overall_miss_latency      261926000                       # number of overall miss cycles
 system.cpu.dcache.overall_miss_rate          0.000178                       # miss rate for overall accesses
 system.cpu.dcache.overall_misses                 4722                       # number of overall misses
 system.cpu.dcache.overall_mshr_hits              2499                       # number of overall MSHR hits
-system.cpu.dcache.overall_mshr_miss_latency    115467500                       # number of overall MSHR miss cycles
+system.cpu.dcache.overall_mshr_miss_latency    115466500                       # number of overall MSHR miss cycles
 system.cpu.dcache.overall_mshr_miss_rate     0.000084                       # mshr miss rate for overall accesses
 system.cpu.dcache.overall_mshr_misses            2223                       # number of overall MSHR misses
 system.cpu.dcache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
@@ -107,7 +107,7 @@ system.cpu.dcache.overall_mshr_uncacheable_misses            0
 system.cpu.dcache.replacements                    157                       # number of replacements
 system.cpu.dcache.sampled_refs                   2223                       # Sample count of references to valid blocks.
 system.cpu.dcache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.dcache.tagsinuse               1441.507978                       # Cycle average of tags in use
+system.cpu.dcache.tagsinuse               1441.508051                       # Cycle average of tags in use
 system.cpu.dcache.total_refs                 26492579                       # Total number of references to valid blocks.
 system.cpu.dcache.warmup_cycle                      0                       # Cycle when the warmup percentage was hit.
 system.cpu.dcache.writebacks                      107                       # number of writebacks
@@ -127,51 +127,51 @@ system.cpu.dtb.write_accesses                 6501126                       # DT
 system.cpu.dtb.write_acv                            0                       # DTB write access violations
 system.cpu.dtb.write_hits                     6501103                       # DTB write hits
 system.cpu.dtb.write_misses                        23                       # DTB write misses
-system.cpu.icache.ReadReq_accesses            9759566                       # number of ReadReq accesses(hits+misses)
-system.cpu.icache.ReadReq_avg_miss_latency 26777.900606                       # average ReadReq miss latency
-system.cpu.icache.ReadReq_avg_mshr_miss_latency 23139.891881                       # average ReadReq mshr miss latency
-system.cpu.icache.ReadReq_hits                9749163                       # number of ReadReq hits
-system.cpu.icache.ReadReq_miss_latency      278570500                       # number of ReadReq miss cycles
+system.cpu.icache.ReadReq_accesses            9759564                       # number of ReadReq accesses(hits+misses)
+system.cpu.icache.ReadReq_avg_miss_latency 26779.967317                       # average ReadReq miss latency
+system.cpu.icache.ReadReq_avg_mshr_miss_latency 23139.993880                       # average ReadReq mshr miss latency
+system.cpu.icache.ReadReq_hits                9749161                       # number of ReadReq hits
+system.cpu.icache.ReadReq_miss_latency      278592000                       # number of ReadReq miss cycles
 system.cpu.icache.ReadReq_miss_rate          0.001066                       # miss rate for ReadReq accesses
 system.cpu.icache.ReadReq_misses                10403                       # number of ReadReq misses
 system.cpu.icache.ReadReq_mshr_hits               599                       # number of ReadReq MSHR hits
-system.cpu.icache.ReadReq_mshr_miss_latency    226863500                       # number of ReadReq MSHR miss cycles
+system.cpu.icache.ReadReq_mshr_miss_latency    226864500                       # number of ReadReq MSHR miss cycles
 system.cpu.icache.ReadReq_mshr_miss_rate     0.001005                       # mshr miss rate for ReadReq accesses
 system.cpu.icache.ReadReq_mshr_misses            9804                       # number of ReadReq MSHR misses
 system.cpu.icache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.icache.avg_blocked_cycles::no_targets 18409.090909                       # average number of cycles each access was blocked
-system.cpu.icache.avg_refs                 994.406671                       # Average number of references to valid blocks.
+system.cpu.icache.avg_refs                 994.406467                       # Average number of references to valid blocks.
 system.cpu.icache.blocked::no_mshrs                 0                       # number of cycles access was blocked
 system.cpu.icache.blocked::no_targets              11                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_targets       202500                       # number of cycles access was blocked
 system.cpu.icache.cache_copies                      0                       # number of cache copies performed
-system.cpu.icache.demand_accesses             9759566                       # number of demand (read+write) accesses
-system.cpu.icache.demand_avg_miss_latency 26777.900606                       # average overall miss latency
-system.cpu.icache.demand_avg_mshr_miss_latency 23139.891881                       # average overall mshr miss latency
-system.cpu.icache.demand_hits                 9749163                       # number of demand (read+write) hits
-system.cpu.icache.demand_miss_latency       278570500                       # number of demand (read+write) miss cycles
+system.cpu.icache.demand_accesses             9759564                       # number of demand (read+write) accesses
+system.cpu.icache.demand_avg_miss_latency 26779.967317                       # average overall miss latency
+system.cpu.icache.demand_avg_mshr_miss_latency 23139.993880                       # average overall mshr miss latency
+system.cpu.icache.demand_hits                 9749161                       # number of demand (read+write) hits
+system.cpu.icache.demand_miss_latency       278592000                       # number of demand (read+write) miss cycles
 system.cpu.icache.demand_miss_rate           0.001066                       # miss rate for demand accesses
 system.cpu.icache.demand_misses                 10403                       # number of demand (read+write) misses
 system.cpu.icache.demand_mshr_hits                599                       # number of demand (read+write) MSHR hits
-system.cpu.icache.demand_mshr_miss_latency    226863500                       # number of demand (read+write) MSHR miss cycles
+system.cpu.icache.demand_mshr_miss_latency    226864500                       # number of demand (read+write) MSHR miss cycles
 system.cpu.icache.demand_mshr_miss_rate      0.001005                       # mshr miss rate for demand accesses
 system.cpu.icache.demand_mshr_misses             9804                       # number of demand (read+write) MSHR misses
 system.cpu.icache.fast_writes                       0                       # number of fast writes performed
 system.cpu.icache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.icache.no_allocate_misses                0                       # Number of misses that were no-allocate
 system.cpu.icache.occ_%::0                   0.729171                       # Average percentage of cache occupancy
-system.cpu.icache.occ_blocks::0           1493.341258                       # Average occupied blocks per context
-system.cpu.icache.overall_accesses            9759566                       # number of overall (read+write) accesses
-system.cpu.icache.overall_avg_miss_latency 26777.900606                       # average overall miss latency
-system.cpu.icache.overall_avg_mshr_miss_latency 23139.891881                       # average overall mshr miss latency
+system.cpu.icache.occ_blocks::0           1493.341252                       # Average occupied blocks per context
+system.cpu.icache.overall_accesses            9759564                       # number of overall (read+write) accesses
+system.cpu.icache.overall_avg_miss_latency 26779.967317                       # average overall miss latency
+system.cpu.icache.overall_avg_mshr_miss_latency 23139.993880                       # average overall mshr miss latency
 system.cpu.icache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.icache.overall_hits                9749163                       # number of overall hits
-system.cpu.icache.overall_miss_latency      278570500                       # number of overall miss cycles
+system.cpu.icache.overall_hits                9749161                       # number of overall hits
+system.cpu.icache.overall_miss_latency      278592000                       # number of overall miss cycles
 system.cpu.icache.overall_miss_rate          0.001066                       # miss rate for overall accesses
 system.cpu.icache.overall_misses                10403                       # number of overall misses
 system.cpu.icache.overall_mshr_hits               599                       # number of overall MSHR hits
-system.cpu.icache.overall_mshr_miss_latency    226863500                       # number of overall MSHR miss cycles
+system.cpu.icache.overall_mshr_miss_latency    226864500                       # number of overall MSHR miss cycles
 system.cpu.icache.overall_mshr_miss_rate     0.001005                       # mshr miss rate for overall accesses
 system.cpu.icache.overall_mshr_misses            9804                       # number of overall MSHR misses
 system.cpu.icache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
@@ -179,20 +179,20 @@ system.cpu.icache.overall_mshr_uncacheable_misses            0
 system.cpu.icache.replacements                   7919                       # number of replacements
 system.cpu.icache.sampled_refs                   9804                       # Sample count of references to valid blocks.
 system.cpu.icache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.icache.tagsinuse               1493.341258                       # Cycle average of tags in use
-system.cpu.icache.total_refs                  9749163                       # Total number of references to valid blocks.
+system.cpu.icache.tagsinuse               1493.341252                       # Cycle average of tags in use
+system.cpu.icache.total_refs                  9749161                       # Total number of references to valid blocks.
 system.cpu.icache.warmup_cycle                      0                       # Cycle when the warmup percentage was hit.
 system.cpu.icache.writebacks                        0                       # number of writebacks
-system.cpu.idleCycles                         6752458                       # Number of cycles cpu's stages were not processed
-system.cpu.ipc                               1.133725                       # IPC: Instructions Per Cycle (Per-Thread)
-system.cpu.ipc_total                         1.133725                       # IPC: Total IPC of All Threads
+system.cpu.idleCycles                         6752479                       # Number of cycles cpu's stages were not processed
+system.cpu.ipc                               1.133730                       # IPC: Instructions Per Cycle (Per-Thread)
+system.cpu.ipc_total                         1.133730                       # IPC: Total IPC of All Threads
 system.cpu.itb.data_accesses                        0                       # DTB accesses
 system.cpu.itb.data_acv                             0                       # DTB access violations
 system.cpu.itb.data_hits                            0                       # DTB hits
 system.cpu.itb.data_misses                          0                       # DTB misses
-system.cpu.itb.fetch_accesses                 9759621                       # ITB accesses
+system.cpu.itb.fetch_accesses                 9759619                       # ITB accesses
 system.cpu.itb.fetch_acv                            0                       # ITB acv
-system.cpu.itb.fetch_hits                     9759574                       # ITB hits
+system.cpu.itb.fetch_hits                     9759572                       # ITB hits
 system.cpu.itb.fetch_misses                        47                       # ITB misses
 system.cpu.itb.read_accesses                        0                       # DTB read accesses
 system.cpu.itb.read_acv                             0                       # DTB read access violations
@@ -203,23 +203,23 @@ system.cpu.itb.write_acv                            0                       # DT
 system.cpu.itb.write_hits                           0                       # DTB write hits
 system.cpu.itb.write_misses                         0                       # DTB write misses
 system.cpu.l2cache.ReadExReq_accesses            1748                       # number of ReadExReq accesses(hits+misses)
-system.cpu.l2cache.ReadExReq_avg_miss_latency 52356.562137                       # average ReadExReq miss latency
+system.cpu.l2cache.ReadExReq_avg_miss_latency 52355.691057                       # average ReadExReq miss latency
 system.cpu.l2cache.ReadExReq_avg_mshr_miss_latency 40114.401858                       # average ReadExReq mshr miss latency
 system.cpu.l2cache.ReadExReq_hits                  26                       # number of ReadExReq hits
-system.cpu.l2cache.ReadExReq_miss_latency     90158000                       # number of ReadExReq miss cycles
+system.cpu.l2cache.ReadExReq_miss_latency     90156500                       # number of ReadExReq miss cycles
 system.cpu.l2cache.ReadExReq_miss_rate       0.985126                       # miss rate for ReadExReq accesses
 system.cpu.l2cache.ReadExReq_misses              1722                       # number of ReadExReq misses
 system.cpu.l2cache.ReadExReq_mshr_miss_latency     69077000                       # number of ReadExReq MSHR miss cycles
 system.cpu.l2cache.ReadExReq_mshr_miss_rate     0.985126                       # mshr miss rate for ReadExReq accesses
 system.cpu.l2cache.ReadExReq_mshr_misses         1722                       # number of ReadExReq MSHR misses
 system.cpu.l2cache.ReadReq_accesses             10279                       # number of ReadReq accesses(hits+misses)
-system.cpu.l2cache.ReadReq_avg_miss_latency 52322.450249                       # average ReadReq miss latency
-system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 40125.777363                       # average ReadReq mshr miss latency
+system.cpu.l2cache.ReadReq_avg_miss_latency 52322.761194                       # average ReadReq miss latency
+system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 40125.621891                       # average ReadReq mshr miss latency
 system.cpu.l2cache.ReadReq_hits                  7063                       # number of ReadReq hits
-system.cpu.l2cache.ReadReq_miss_latency     168269000                       # number of ReadReq miss cycles
+system.cpu.l2cache.ReadReq_miss_latency     168270000                       # number of ReadReq miss cycles
 system.cpu.l2cache.ReadReq_miss_rate         0.312871                       # miss rate for ReadReq accesses
 system.cpu.l2cache.ReadReq_misses                3216                       # number of ReadReq misses
-system.cpu.l2cache.ReadReq_mshr_miss_latency    129044500                       # number of ReadReq MSHR miss cycles
+system.cpu.l2cache.ReadReq_mshr_miss_latency    129044000                       # number of ReadReq MSHR miss cycles
 system.cpu.l2cache.ReadReq_mshr_miss_rate     0.312871                       # mshr miss rate for ReadReq accesses
 system.cpu.l2cache.ReadReq_mshr_misses           3216                       # number of ReadReq MSHR misses
 system.cpu.l2cache.Writeback_accesses             107                       # number of Writeback accesses(hits+misses)
@@ -233,14 +233,14 @@ system.cpu.l2cache.blocked_cycles::no_mshrs            0                       #
 system.cpu.l2cache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.l2cache.cache_copies                     0                       # number of cache copies performed
 system.cpu.l2cache.demand_accesses              12027                       # number of demand (read+write) accesses
-system.cpu.l2cache.demand_avg_miss_latency 52334.345889                       # average overall miss latency
-system.cpu.l2cache.demand_avg_mshr_miss_latency 40121.810450                       # average overall mshr miss latency
+system.cpu.l2cache.demand_avg_miss_latency 52334.244633                       # average overall miss latency
+system.cpu.l2cache.demand_avg_mshr_miss_latency 40121.709194                       # average overall mshr miss latency
 system.cpu.l2cache.demand_hits                   7089                       # number of demand (read+write) hits
-system.cpu.l2cache.demand_miss_latency      258427000                       # number of demand (read+write) miss cycles
+system.cpu.l2cache.demand_miss_latency      258426500                       # number of demand (read+write) miss cycles
 system.cpu.l2cache.demand_miss_rate          0.410576                       # miss rate for demand accesses
 system.cpu.l2cache.demand_misses                 4938                       # number of demand (read+write) misses
 system.cpu.l2cache.demand_mshr_hits                 0                       # number of demand (read+write) MSHR hits
-system.cpu.l2cache.demand_mshr_miss_latency    198121500                       # number of demand (read+write) MSHR miss cycles
+system.cpu.l2cache.demand_mshr_miss_latency    198121000                       # number of demand (read+write) MSHR miss cycles
 system.cpu.l2cache.demand_mshr_miss_rate     0.410576                       # mshr miss rate for demand accesses
 system.cpu.l2cache.demand_mshr_misses            4938                       # number of demand (read+write) MSHR misses
 system.cpu.l2cache.fast_writes                      0                       # number of fast writes performed
@@ -248,18 +248,18 @@ system.cpu.l2cache.mshr_cap_events                  0                       # nu
 system.cpu.l2cache.no_allocate_misses               0                       # Number of misses that were no-allocate
 system.cpu.l2cache.occ_%::0                  0.066327                       # Average percentage of cache occupancy
 system.cpu.l2cache.occ_%::1                  0.000542                       # Average percentage of cache occupancy
-system.cpu.l2cache.occ_blocks::0          2173.408404                       # Average occupied blocks per context
-system.cpu.l2cache.occ_blocks::1            17.762794                       # Average occupied blocks per context
+system.cpu.l2cache.occ_blocks::0          2173.408531                       # Average occupied blocks per context
+system.cpu.l2cache.occ_blocks::1            17.762817                       # Average occupied blocks per context
 system.cpu.l2cache.overall_accesses             12027                       # number of overall (read+write) accesses
-system.cpu.l2cache.overall_avg_miss_latency 52334.345889                       # average overall miss latency
-system.cpu.l2cache.overall_avg_mshr_miss_latency 40121.810450                       # average overall mshr miss latency
+system.cpu.l2cache.overall_avg_miss_latency 52334.244633                       # average overall miss latency
+system.cpu.l2cache.overall_avg_mshr_miss_latency 40121.709194                       # average overall mshr miss latency
 system.cpu.l2cache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
 system.cpu.l2cache.overall_hits                  7089                       # number of overall hits
-system.cpu.l2cache.overall_miss_latency     258427000                       # number of overall miss cycles
+system.cpu.l2cache.overall_miss_latency     258426500                       # number of overall miss cycles
 system.cpu.l2cache.overall_miss_rate         0.410576                       # miss rate for overall accesses
 system.cpu.l2cache.overall_misses                4938                       # number of overall misses
 system.cpu.l2cache.overall_mshr_hits                0                       # number of overall MSHR hits
-system.cpu.l2cache.overall_mshr_miss_latency    198121500                       # number of overall MSHR miss cycles
+system.cpu.l2cache.overall_mshr_miss_latency    198121000                       # number of overall MSHR miss cycles
 system.cpu.l2cache.overall_mshr_miss_rate     0.410576                       # mshr miss rate for overall accesses
 system.cpu.l2cache.overall_mshr_misses           4938                       # number of overall MSHR misses
 system.cpu.l2cache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
@@ -267,35 +267,35 @@ system.cpu.l2cache.overall_mshr_uncacheable_misses            0
 system.cpu.l2cache.replacements                     0                       # number of replacements
 system.cpu.l2cache.sampled_refs                  3282                       # Sample count of references to valid blocks.
 system.cpu.l2cache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.l2cache.tagsinuse              2191.171198                       # Cycle average of tags in use
+system.cpu.l2cache.tagsinuse              2191.171348                       # Cycle average of tags in use
 system.cpu.l2cache.total_refs                    7072                       # Total number of references to valid blocks.
 system.cpu.l2cache.warmup_cycle                     0                       # Cycle when the warmup percentage was hit.
 system.cpu.l2cache.writebacks                       0                       # number of writebacks
-system.cpu.numCycles                         81062947                       # number of cpu cycles simulated
+system.cpu.numCycles                         81062559                       # number of cpu cycles simulated
 system.cpu.numWorkItemsCompleted                    0                       # number of work items this cpu completed
 system.cpu.numWorkItemsStarted                      0                       # number of work items this cpu started
-system.cpu.runCycles                         74310489                       # Number of cycles cpu stages are processed.
+system.cpu.runCycles                         74310080                       # Number of cycles cpu stages are processed.
 system.cpu.smtCommittedInsts                        0                       # Number of SMT Instructions Simulated (Per-Thread)
 system.cpu.smtCycles                                0                       # Total number of cycles that the CPU was in SMT-mode
 system.cpu.smt_cpi                           no_value                       # CPI: Total SMT-CPI
 system.cpu.smt_ipc                           no_value                       # IPC: Total SMT-IPC
-system.cpu.stage-0.idleCycles                27951481                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-0.runCycles                 53111466                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-0.utilization              65.518795                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-1.idleCycles                33263015                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-1.runCycles                 47799932                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-1.utilization              58.966438                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-2.idleCycles                32674388                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-2.runCycles                 48388559                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-2.utilization              59.692573                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-3.idleCycles                63236669                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-3.runCycles                 17826278                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-3.utilization              21.990661                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-4.idleCycles                26883449                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-4.runCycles                 54179498                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-4.utilization              66.836329                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.threadCycles                      80608290                       # Total Number of Cycles A Thread Was Active in CPU (Per-Thread)
-system.cpu.timesIdled                           10787                       # Number of times that the entire CPU went into an idle state and unscheduled itself
+system.cpu.stage-0.idleCycles                27951091                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-0.runCycles                 53111468                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-0.utilization              65.519111                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-1.idleCycles                33262621                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-1.runCycles                 47799938                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-1.utilization              58.966727                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-2.idleCycles                32674404                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-2.runCycles                 48388155                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-2.utilization              59.692361                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-3.idleCycles                63236282                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-3.runCycles                 17826277                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-3.utilization              21.990765                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-4.idleCycles                26883065                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-4.runCycles                 54179494                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-4.utilization              66.836644                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.threadCycles                      80607865                       # Total Number of Cycles A Thread Was Active in CPU (Per-Thread)
+system.cpu.timesIdled                           10786                       # Number of times that the entire CPU went into an idle state and unscheduled itself
 system.cpu.workload.PROG:num_syscalls             389                       # Number of system calls
 
 ---------- End Simulation Statistics   ----------
diff --git a/tests/long/70.twolf/ref/x86/linux/o3-timing/config.ini b/tests/long/70.twolf/ref/x86/linux/o3-timing/config.ini
index 78a8cbd6c3..f69fd4da6a 100644
--- a/tests/long/70.twolf/ref/x86/linux/o3-timing/config.ini
+++ b/tests/long/70.twolf/ref/x86/linux/o3-timing/config.ini
@@ -488,7 +488,7 @@ type=ExeTracer
 [system.cpu.workload]
 type=LiveProcess
 cmd=twolf smred
-cwd=build/X86_SE/tests/fast/long/70.twolf/x86/linux/o3-timing
+cwd=build/X86_SE/tests/opt/long/70.twolf/x86/linux/o3-timing
 egid=100
 env=
 errout=cerr
diff --git a/tests/long/70.twolf/ref/x86/linux/o3-timing/simout b/tests/long/70.twolf/ref/x86/linux/o3-timing/simout
index e89403a2fa..2ac976df68 100755
--- a/tests/long/70.twolf/ref/x86/linux/o3-timing/simout
+++ b/tests/long/70.twolf/ref/x86/linux/o3-timing/simout
@@ -5,13 +5,11 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:32:12
+M5 compiled Feb 12 2011 02:22:23
+M5 revision 5e76f9de6972 7961 default qtip tip x86branchdetectstats.patch
+M5 started Feb 12 2011 02:22:27
 M5 executing on burrito
-command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/long/70.twolf/x86/linux/o3-timing -re tests/run.py build/X86_SE/tests/fast/long/70.twolf/x86/linux/o3-timing
-Couldn't unlink  build/X86_SE/tests/fast/long/70.twolf/x86/linux/o3-timing/smred.sav
-Couldn't unlink  build/X86_SE/tests/fast/long/70.twolf/x86/linux/o3-timing/smred.sv2
+command line: build/X86_SE/m5.opt -d build/X86_SE/tests/opt/long/70.twolf/x86/linux/o3-timing -re tests/run.py build/X86_SE/tests/opt/long/70.twolf/x86/linux/o3-timing
 Global frequency set at 1000000000000 ticks per second
 info: Entering event queue @ 0.  Starting simulation...
 
@@ -29,4 +27,4 @@ info: Increasing stack size by one page.
  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90 
  91  92  93  94  95  96  97  98  99 100 101 102 103 104 105 
 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 
-122 123 124 Exiting @ tick 127560542500 because target called exit()
+122 123 124 Exiting @ tick 108875474000 because target called exit()
diff --git a/tests/long/70.twolf/ref/x86/linux/o3-timing/stats.txt b/tests/long/70.twolf/ref/x86/linux/o3-timing/stats.txt
index 58c1a12590..a77afc8496 100644
--- a/tests/long/70.twolf/ref/x86/linux/o3-timing/stats.txt
+++ b/tests/long/70.twolf/ref/x86/linux/o3-timing/stats.txt
@@ -1,41 +1,41 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                  87424                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 240332                       # Number of bytes of host memory used
-host_seconds                                  2532.06                       # Real time elapsed on the host
-host_tick_rate                               50378144                       # Simulator tick rate (ticks/s)
+host_inst_rate                                  92938                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 245208                       # Number of bytes of host memory used
+host_seconds                                  2381.84                       # Real time elapsed on the host
+host_tick_rate                               45710653                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                   221363017                       # Number of instructions simulated
-sim_seconds                                  0.127561                       # Number of seconds simulated
-sim_ticks                                127560542500                       # Number of ticks simulated
+sim_seconds                                  0.108875                       # Number of seconds simulated
+sim_ticks                                108875474000                       # Number of ticks simulated
 system.cpu.BPredUnit.BTBCorrect                     0                       # Number of correct BTB predictions (this stat may not work properly.
-system.cpu.BPredUnit.BTBHits                 16939138                       # Number of BTB hits
-system.cpu.BPredUnit.BTBLookups              19067543                       # Number of BTB lookups
+system.cpu.BPredUnit.BTBHits                 19725800                       # Number of BTB hits
+system.cpu.BPredUnit.BTBLookups              22620341                       # Number of BTB lookups
 system.cpu.BPredUnit.RASInCorrect                   0                       # Number of incorrect RAS predictions.
-system.cpu.BPredUnit.condIncorrect            3582609                       # Number of conditional branches incorrect
-system.cpu.BPredUnit.condPredicted           19223942                       # Number of conditional branches predicted
-system.cpu.BPredUnit.lookups                 19223942                       # Number of BP lookups
+system.cpu.BPredUnit.condIncorrect            3050205                       # Number of conditional branches incorrect
+system.cpu.BPredUnit.condPredicted           25317132                       # Number of conditional branches predicted
+system.cpu.BPredUnit.lookups                 25317132                       # Number of BP lookups
 system.cpu.BPredUnit.usedRAS                        0                       # Number of times the RAS was used to get a target.
 system.cpu.commit.COM:branches               12326943                       # Number of branches committed
-system.cpu.commit.COM:bw_lim_events            324452                       # number cycles where commit BW limit reached
+system.cpu.commit.COM:bw_lim_events           2257656                       # number cycles where commit BW limit reached
 system.cpu.commit.COM:bw_limited                    0                       # number of insts not committed due to BW limits
-system.cpu.commit.COM:committed_per_cycle::samples    243992167                       # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::mean     0.907255                       # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::stdev     1.057266                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::samples    193712128                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::mean     1.142742                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::stdev     1.492040                       # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::underflows            0      0.00%      0.00% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::0     97637775     40.02%     40.02% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::1    102801930     42.13%     82.15% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::2     24473335     10.03%     92.18% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::3     10688182      4.38%     96.56% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::4      6438517      2.64%     99.20% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::5       836047      0.34%     99.54% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::6       523551      0.21%     99.76% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::7       268378      0.11%     99.87% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::8       324452      0.13%    100.00% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::0     76077426     39.27%     39.27% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::1     72463860     37.41%     76.68% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::2     18818378      9.71%     86.40% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::3     12600057      6.50%     92.90% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::4      5960288      3.08%     95.98% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::5      2688234      1.39%     97.37% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::6      1804943      0.93%     98.30% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::7      1041286      0.54%     98.83% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::8      2257656      1.17%    100.00% # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::overflows            0      0.00%    100.00% # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::min_value            0                       # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::max_value            8                       # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::total    243992167                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::total    193712128                       # Number of insts commited each cycle
 system.cpu.commit.COM:count                 221363017                       # Number of instructions committed
 system.cpu.commit.COM:fp_insts                2162459                       # Number of committed floating point instructions.
 system.cpu.commit.COM:function_calls                0                       # Number of function calls committed.
@@ -44,423 +44,424 @@ system.cpu.commit.COM:loads                  56649590                       # Nu
 system.cpu.commit.COM:membars                       0                       # Number of memory barriers committed
 system.cpu.commit.COM:refs                   77165306                       # Number of memory references committed
 system.cpu.commit.COM:swp_count                     0                       # Number of s/w prefetches committed
-system.cpu.commit.branchMispredicts           3582617                       # The number of times a branch was mispredicted
+system.cpu.commit.branchMispredicts           3050238                       # The number of times a branch was mispredicted
 system.cpu.commit.commitCommittedInsts      221363017                       # The number of committed instructions
 system.cpu.commit.commitNonSpecStalls            1246                       # The number of times commit has been forced to stall to communicate backwards
-system.cpu.commit.commitSquashedInsts        70151117                       # The number of squashed insts skipped by commit
+system.cpu.commit.commitSquashedInsts       180173936                       # The number of squashed insts skipped by commit
 system.cpu.committedInsts                   221363017                       # Number of Instructions Simulated
 system.cpu.committedInsts_total             221363017                       # Number of Instructions Simulated
-system.cpu.cpi                               1.152501                       # CPI: Cycles Per Instruction
-system.cpu.cpi_total                         1.152501                       # CPI: Total CPI of All Threads
-system.cpu.dcache.ReadReq_accesses           51727133                       # number of ReadReq accesses(hits+misses)
-system.cpu.dcache.ReadReq_avg_miss_latency 34247.563353                       # average ReadReq miss latency
-system.cpu.dcache.ReadReq_avg_mshr_miss_latency 34193.055556                       # average ReadReq mshr miss latency
-system.cpu.dcache.ReadReq_hits               51726620                       # number of ReadReq hits
-system.cpu.dcache.ReadReq_miss_latency       17569000                       # number of ReadReq miss cycles
-system.cpu.dcache.ReadReq_miss_rate          0.000010                       # miss rate for ReadReq accesses
-system.cpu.dcache.ReadReq_misses                  513                       # number of ReadReq misses
-system.cpu.dcache.ReadReq_mshr_hits               153                       # number of ReadReq MSHR hits
-system.cpu.dcache.ReadReq_mshr_miss_latency     12309500                       # number of ReadReq MSHR miss cycles
-system.cpu.dcache.ReadReq_mshr_miss_rate     0.000007                       # mshr miss rate for ReadReq accesses
-system.cpu.dcache.ReadReq_mshr_misses             360                       # number of ReadReq MSHR misses
+system.cpu.cpi                               0.983683                       # CPI: Cycles Per Instruction
+system.cpu.cpi_total                         0.983683                       # CPI: Total CPI of All Threads
+system.cpu.dcache.ReadReq_accesses           50495037                       # number of ReadReq accesses(hits+misses)
+system.cpu.dcache.ReadReq_avg_miss_latency 33300.295858                       # average ReadReq miss latency
+system.cpu.dcache.ReadReq_avg_mshr_miss_latency 34031.250000                       # average ReadReq mshr miss latency
+system.cpu.dcache.ReadReq_hits               50494361                       # number of ReadReq hits
+system.cpu.dcache.ReadReq_miss_latency       22511000                       # number of ReadReq miss cycles
+system.cpu.dcache.ReadReq_miss_rate          0.000013                       # miss rate for ReadReq accesses
+system.cpu.dcache.ReadReq_misses                  676                       # number of ReadReq misses
+system.cpu.dcache.ReadReq_mshr_hits               292                       # number of ReadReq MSHR hits
+system.cpu.dcache.ReadReq_mshr_miss_latency     13068000                       # number of ReadReq MSHR miss cycles
+system.cpu.dcache.ReadReq_mshr_miss_rate     0.000008                       # mshr miss rate for ReadReq accesses
+system.cpu.dcache.ReadReq_mshr_misses             384                       # number of ReadReq MSHR misses
 system.cpu.dcache.WriteReq_accesses          20515730                       # number of WriteReq accesses(hits+misses)
-system.cpu.dcache.WriteReq_avg_miss_latency 26394.870828                       # average WriteReq miss latency
-system.cpu.dcache.WriteReq_avg_mshr_miss_latency 35294.285714                       # average WriteReq mshr miss latency
-system.cpu.dcache.WriteReq_hits              20510427                       # number of WriteReq hits
-system.cpu.dcache.WriteReq_miss_latency     139972000                       # number of WriteReq miss cycles
-system.cpu.dcache.WriteReq_miss_rate         0.000258                       # miss rate for WriteReq accesses
-system.cpu.dcache.WriteReq_misses                5303                       # number of WriteReq misses
-system.cpu.dcache.WriteReq_mshr_hits             3728                       # number of WriteReq MSHR hits
-system.cpu.dcache.WriteReq_mshr_miss_latency     55588500                       # number of WriteReq MSHR miss cycles
-system.cpu.dcache.WriteReq_mshr_miss_rate     0.000077                       # mshr miss rate for WriteReq accesses
-system.cpu.dcache.WriteReq_mshr_misses           1575                       # number of WriteReq MSHR misses
+system.cpu.dcache.WriteReq_avg_miss_latency 26250.708416                       # average WriteReq miss latency
+system.cpu.dcache.WriteReq_avg_mshr_miss_latency 35437.100894                       # average WriteReq mshr miss latency
+system.cpu.dcache.WriteReq_hits              20508672                       # number of WriteReq hits
+system.cpu.dcache.WriteReq_miss_latency     185277500                       # number of WriteReq miss cycles
+system.cpu.dcache.WriteReq_miss_rate         0.000344                       # miss rate for WriteReq accesses
+system.cpu.dcache.WriteReq_misses                7058                       # number of WriteReq misses
+system.cpu.dcache.WriteReq_mshr_hits             5492                       # number of WriteReq MSHR hits
+system.cpu.dcache.WriteReq_mshr_miss_latency     55494500                       # number of WriteReq MSHR miss cycles
+system.cpu.dcache.WriteReq_mshr_miss_rate     0.000076                       # mshr miss rate for WriteReq accesses
+system.cpu.dcache.WriteReq_mshr_misses           1566                       # number of WriteReq MSHR misses
 system.cpu.dcache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.dcache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.dcache.avg_refs               37331.807235                       # Average number of references to valid blocks.
+system.cpu.dcache.avg_refs               36411.811795                       # Average number of references to valid blocks.
 system.cpu.dcache.blocked::no_mshrs                 0                       # number of cycles access was blocked
 system.cpu.dcache.blocked::no_targets               0                       # number of cycles access was blocked
 system.cpu.dcache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.dcache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.dcache.cache_copies                      0                       # number of cache copies performed
-system.cpu.dcache.demand_accesses            72242863                       # number of demand (read+write) accesses
-system.cpu.dcache.demand_avg_miss_latency 27087.517194                       # average overall miss latency
-system.cpu.dcache.demand_avg_mshr_miss_latency 35089.405685                       # average overall mshr miss latency
-system.cpu.dcache.demand_hits                72237047                       # number of demand (read+write) hits
-system.cpu.dcache.demand_miss_latency       157541000                       # number of demand (read+write) miss cycles
-system.cpu.dcache.demand_miss_rate           0.000081                       # miss rate for demand accesses
-system.cpu.dcache.demand_misses                  5816                       # number of demand (read+write) misses
-system.cpu.dcache.demand_mshr_hits               3881                       # number of demand (read+write) MSHR hits
-system.cpu.dcache.demand_mshr_miss_latency     67898000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.dcache.demand_accesses            71010767                       # number of demand (read+write) accesses
+system.cpu.dcache.demand_avg_miss_latency 26866.886475                       # average overall miss latency
+system.cpu.dcache.demand_avg_mshr_miss_latency 35160.256410                       # average overall mshr miss latency
+system.cpu.dcache.demand_hits                71003033                       # number of demand (read+write) hits
+system.cpu.dcache.demand_miss_latency       207788500                       # number of demand (read+write) miss cycles
+system.cpu.dcache.demand_miss_rate           0.000109                       # miss rate for demand accesses
+system.cpu.dcache.demand_misses                  7734                       # number of demand (read+write) misses
+system.cpu.dcache.demand_mshr_hits               5784                       # number of demand (read+write) MSHR hits
+system.cpu.dcache.demand_mshr_miss_latency     68562500                       # number of demand (read+write) MSHR miss cycles
 system.cpu.dcache.demand_mshr_miss_rate      0.000027                       # mshr miss rate for demand accesses
-system.cpu.dcache.demand_mshr_misses             1935                       # number of demand (read+write) MSHR misses
+system.cpu.dcache.demand_mshr_misses             1950                       # number of demand (read+write) MSHR misses
 system.cpu.dcache.fast_writes                       0                       # number of fast writes performed
 system.cpu.dcache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.dcache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.dcache.occ_%::0                   0.336997                       # Average percentage of cache occupancy
-system.cpu.dcache.occ_blocks::0           1380.340507                       # Average occupied blocks per context
-system.cpu.dcache.overall_accesses           72242863                       # number of overall (read+write) accesses
-system.cpu.dcache.overall_avg_miss_latency 27087.517194                       # average overall miss latency
-system.cpu.dcache.overall_avg_mshr_miss_latency 35089.405685                       # average overall mshr miss latency
+system.cpu.dcache.occ_%::0                   0.340706                       # Average percentage of cache occupancy
+system.cpu.dcache.occ_blocks::0           1395.531138                       # Average occupied blocks per context
+system.cpu.dcache.overall_accesses           71010767                       # number of overall (read+write) accesses
+system.cpu.dcache.overall_avg_miss_latency 26866.886475                       # average overall miss latency
+system.cpu.dcache.overall_avg_mshr_miss_latency 35160.256410                       # average overall mshr miss latency
 system.cpu.dcache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.dcache.overall_hits               72237047                       # number of overall hits
-system.cpu.dcache.overall_miss_latency      157541000                       # number of overall miss cycles
-system.cpu.dcache.overall_miss_rate          0.000081                       # miss rate for overall accesses
-system.cpu.dcache.overall_misses                 5816                       # number of overall misses
-system.cpu.dcache.overall_mshr_hits              3881                       # number of overall MSHR hits
-system.cpu.dcache.overall_mshr_miss_latency     67898000                       # number of overall MSHR miss cycles
+system.cpu.dcache.overall_hits               71003033                       # number of overall hits
+system.cpu.dcache.overall_miss_latency      207788500                       # number of overall miss cycles
+system.cpu.dcache.overall_miss_rate          0.000109                       # miss rate for overall accesses
+system.cpu.dcache.overall_misses                 7734                       # number of overall misses
+system.cpu.dcache.overall_mshr_hits              5784                       # number of overall MSHR hits
+system.cpu.dcache.overall_mshr_miss_latency     68562500                       # number of overall MSHR miss cycles
 system.cpu.dcache.overall_mshr_miss_rate     0.000027                       # mshr miss rate for overall accesses
-system.cpu.dcache.overall_mshr_misses            1935                       # number of overall MSHR misses
+system.cpu.dcache.overall_mshr_misses            1950                       # number of overall MSHR misses
 system.cpu.dcache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.dcache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.cpu.dcache.replacements                     46                       # number of replacements
-system.cpu.dcache.sampled_refs                   1935                       # Sample count of references to valid blocks.
+system.cpu.dcache.replacements                     48                       # number of replacements
+system.cpu.dcache.sampled_refs                   1950                       # Sample count of references to valid blocks.
 system.cpu.dcache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.dcache.tagsinuse               1380.340507                       # Cycle average of tags in use
-system.cpu.dcache.total_refs                 72237047                       # Total number of references to valid blocks.
+system.cpu.dcache.tagsinuse               1395.531138                       # Cycle average of tags in use
+system.cpu.dcache.total_refs                 71003033                       # Total number of references to valid blocks.
 system.cpu.dcache.warmup_cycle                      0                       # Cycle when the warmup percentage was hit.
-system.cpu.dcache.writebacks                        9                       # number of writebacks
-system.cpu.decode.DECODE:BlockedCycles        5656231                       # Number of cycles decode is blocked
-system.cpu.decode.DECODE:DecodedInsts       309852988                       # Number of instructions handled by decode
-system.cpu.decode.DECODE:IdleCycles          53029625                       # Number of cycles decode is idle
-system.cpu.decode.DECODE:RunCycles          184220573                       # Number of cycles decode is running
-system.cpu.decode.DECODE:SquashCycles        11003980                       # Number of cycles decode is squashing
-system.cpu.decode.DECODE:UnblockCycles        1085738                       # Number of cycles decode is unblocking
-system.cpu.fetch.Branches                    19223942                       # Number of branches that fetch encountered
-system.cpu.fetch.CacheLines                  20440935                       # Number of cache lines fetched
-system.cpu.fetch.Cycles                     196264127                       # Number of cycles fetch has run and was not squashing or blocked
-system.cpu.fetch.IcacheSquashes                182297                       # Number of outstanding Icache misses that were squashed
-system.cpu.fetch.Insts                      184675827                       # Number of instructions fetch has processed
-system.cpu.fetch.MiscStallCycles                   11                       # Number of cycles fetch has spent waiting on interrupts, or bad addresses, or out of MSHRs
-system.cpu.fetch.SquashCycles                 4455378                       # Number of cycles fetch has spent squashing
-system.cpu.fetch.branchRate                  0.075352                       # Number of branch fetches per cycle
-system.cpu.fetch.icacheStallCycles           20440935                       # Number of cycles fetch is stalled on an Icache miss
-system.cpu.fetch.predictedBranches           16939138                       # Number of branches that fetch has predicted taken
-system.cpu.fetch.rate                        0.723875                       # Number of inst fetches per cycle
-system.cpu.fetch.rateDist::samples          254996147                       # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::mean              1.239017                       # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::stdev             1.348981                       # Number of instructions fetched each cycle (Total)
+system.cpu.dcache.writebacks                       10                       # number of writebacks
+system.cpu.decode.DECODE:BlockedCycles       58788191                       # Number of cycles decode is blocked
+system.cpu.decode.DECODE:DecodedInsts       426377378                       # Number of instructions handled by decode
+system.cpu.decode.DECODE:IdleCycles          67892396                       # Number of cycles decode is idle
+system.cpu.decode.DECODE:RunCycles           61042516                       # Number of cycles decode is running
+system.cpu.decode.DECODE:SquashCycles        23949638                       # Number of cycles decode is squashing
+system.cpu.decode.DECODE:UnblockCycles        5989025                       # Number of cycles decode is unblocking
+system.cpu.fetch.Branches                    25317132                       # Number of branches that fetch encountered
+system.cpu.fetch.CacheLines                  27858568                       # Number of cache lines fetched
+system.cpu.fetch.Cycles                      70494302                       # Number of cycles fetch has run and was not squashing or blocked
+system.cpu.fetch.IcacheSquashes                451015                       # Number of outstanding Icache misses that were squashed
+system.cpu.fetch.Insts                      267008364                       # Number of instructions fetch has processed
+system.cpu.fetch.MiscStallCycles                   61                       # Number of cycles fetch has spent waiting on interrupts, or bad addresses, or out of MSHRs
+system.cpu.fetch.SquashCycles                 3227425                       # Number of cycles fetch has spent squashing
+system.cpu.fetch.branchRate                  0.116266                       # Number of branch fetches per cycle
+system.cpu.fetch.icacheStallCycles           27858568                       # Number of cycles fetch is stalled on an Icache miss
+system.cpu.fetch.predictedBranches           19725800                       # Number of branches that fetch has predicted taken
+system.cpu.fetch.rate                        1.226210                       # Number of inst fetches per cycle
+system.cpu.fetch.rateDist::samples          217661766                       # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::mean              2.006543                       # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::stdev             3.224025                       # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::underflows               0      0.00%      0.00% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::0                 66307953     26.00%     26.00% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::1                121646972     47.71%     73.71% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::2                 37731127     14.80%     88.51% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::3                 20479784      8.03%     96.54% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::4                  1948325      0.76%     97.30% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::5                  1108960      0.43%     97.74% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::6                  1062530      0.42%     98.15% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::7                     1340      0.00%     98.15% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::8                  4709156      1.85%    100.00% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::0                148998369     68.45%     68.45% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::1                  3780164      1.74%     70.19% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::2                  3170889      1.46%     71.65% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::3                  4293321      1.97%     73.62% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::4                  4655999      2.14%     75.76% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::5                  4463846      2.05%     77.81% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::6                  5161555      2.37%     80.18% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::7                  3267808      1.50%     81.68% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::8                 39869815     18.32%    100.00% # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::overflows                0      0.00%    100.00% # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::min_value                0                       # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::max_value                8                       # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::total            254996147                       # Number of instructions fetched each cycle (Total)
-system.cpu.fp_regfile_reads                   3212472                       # number of floating regfile reads
-system.cpu.fp_regfile_writes                  2049220                       # number of floating regfile writes
-system.cpu.icache.ReadReq_accesses           20440935                       # number of ReadReq accesses(hits+misses)
-system.cpu.icache.ReadReq_avg_miss_latency 25661.556820                       # average ReadReq miss latency
-system.cpu.icache.ReadReq_avg_mshr_miss_latency 22374.875175                       # average ReadReq mshr miss latency
-system.cpu.icache.ReadReq_hits               20435488                       # number of ReadReq hits
-system.cpu.icache.ReadReq_miss_latency      139778500                       # number of ReadReq miss cycles
-system.cpu.icache.ReadReq_miss_rate          0.000266                       # miss rate for ReadReq accesses
-system.cpu.icache.ReadReq_misses                 5447                       # number of ReadReq misses
-system.cpu.icache.ReadReq_mshr_hits               440                       # number of ReadReq MSHR hits
-system.cpu.icache.ReadReq_mshr_miss_latency    112031000                       # number of ReadReq MSHR miss cycles
-system.cpu.icache.ReadReq_mshr_miss_rate     0.000245                       # mshr miss rate for ReadReq accesses
-system.cpu.icache.ReadReq_mshr_misses            5007                       # number of ReadReq MSHR misses
+system.cpu.fetch.rateDist::total            217661766                       # Number of instructions fetched each cycle (Total)
+system.cpu.fp_regfile_reads                   3513078                       # number of floating regfile reads
+system.cpu.fp_regfile_writes                  2177890                       # number of floating regfile writes
+system.cpu.icache.ReadReq_accesses           27858568                       # number of ReadReq accesses(hits+misses)
+system.cpu.icache.ReadReq_avg_miss_latency 25516.664059                       # average ReadReq miss latency
+system.cpu.icache.ReadReq_avg_mshr_miss_latency 22464.816190                       # average ReadReq mshr miss latency
+system.cpu.icache.ReadReq_hits               27852177                       # number of ReadReq hits
+system.cpu.icache.ReadReq_miss_latency      163077000                       # number of ReadReq miss cycles
+system.cpu.icache.ReadReq_miss_rate          0.000229                       # miss rate for ReadReq accesses
+system.cpu.icache.ReadReq_misses                 6391                       # number of ReadReq misses
+system.cpu.icache.ReadReq_mshr_hits              1005                       # number of ReadReq MSHR hits
+system.cpu.icache.ReadReq_mshr_miss_latency    120995500                       # number of ReadReq MSHR miss cycles
+system.cpu.icache.ReadReq_mshr_miss_rate     0.000193                       # mshr miss rate for ReadReq accesses
+system.cpu.icache.ReadReq_mshr_misses            5386                       # number of ReadReq MSHR misses
 system.cpu.icache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.icache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.icache.avg_refs                4082.198961                       # Average number of references to valid blocks.
+system.cpu.icache.avg_refs                5171.217416                       # Average number of references to valid blocks.
 system.cpu.icache.blocked::no_mshrs                 0                       # number of cycles access was blocked
 system.cpu.icache.blocked::no_targets               0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.icache.cache_copies                      0                       # number of cache copies performed
-system.cpu.icache.demand_accesses            20440935                       # number of demand (read+write) accesses
-system.cpu.icache.demand_avg_miss_latency 25661.556820                       # average overall miss latency
-system.cpu.icache.demand_avg_mshr_miss_latency 22374.875175                       # average overall mshr miss latency
-system.cpu.icache.demand_hits                20435488                       # number of demand (read+write) hits
-system.cpu.icache.demand_miss_latency       139778500                       # number of demand (read+write) miss cycles
-system.cpu.icache.demand_miss_rate           0.000266                       # miss rate for demand accesses
-system.cpu.icache.demand_misses                  5447                       # number of demand (read+write) misses
-system.cpu.icache.demand_mshr_hits                440                       # number of demand (read+write) MSHR hits
-system.cpu.icache.demand_mshr_miss_latency    112031000                       # number of demand (read+write) MSHR miss cycles
-system.cpu.icache.demand_mshr_miss_rate      0.000245                       # mshr miss rate for demand accesses
-system.cpu.icache.demand_mshr_misses             5007                       # number of demand (read+write) MSHR misses
+system.cpu.icache.demand_accesses            27858568                       # number of demand (read+write) accesses
+system.cpu.icache.demand_avg_miss_latency 25516.664059                       # average overall miss latency
+system.cpu.icache.demand_avg_mshr_miss_latency 22464.816190                       # average overall mshr miss latency
+system.cpu.icache.demand_hits                27852177                       # number of demand (read+write) hits
+system.cpu.icache.demand_miss_latency       163077000                       # number of demand (read+write) miss cycles
+system.cpu.icache.demand_miss_rate           0.000229                       # miss rate for demand accesses
+system.cpu.icache.demand_misses                  6391                       # number of demand (read+write) misses
+system.cpu.icache.demand_mshr_hits               1005                       # number of demand (read+write) MSHR hits
+system.cpu.icache.demand_mshr_miss_latency    120995500                       # number of demand (read+write) MSHR miss cycles
+system.cpu.icache.demand_mshr_miss_rate      0.000193                       # mshr miss rate for demand accesses
+system.cpu.icache.demand_mshr_misses             5386                       # number of demand (read+write) MSHR misses
 system.cpu.icache.fast_writes                       0                       # number of fast writes performed
 system.cpu.icache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.icache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.icache.occ_%::0                   0.746987                       # Average percentage of cache occupancy
-system.cpu.icache.occ_blocks::0           1529.828433                       # Average occupied blocks per context
-system.cpu.icache.overall_accesses           20440935                       # number of overall (read+write) accesses
-system.cpu.icache.overall_avg_miss_latency 25661.556820                       # average overall miss latency
-system.cpu.icache.overall_avg_mshr_miss_latency 22374.875175                       # average overall mshr miss latency
+system.cpu.icache.occ_%::0                   0.783470                       # Average percentage of cache occupancy
+system.cpu.icache.occ_blocks::0           1604.546925                       # Average occupied blocks per context
+system.cpu.icache.overall_accesses           27858568                       # number of overall (read+write) accesses
+system.cpu.icache.overall_avg_miss_latency 25516.664059                       # average overall miss latency
+system.cpu.icache.overall_avg_mshr_miss_latency 22464.816190                       # average overall mshr miss latency
 system.cpu.icache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.icache.overall_hits               20435488                       # number of overall hits
-system.cpu.icache.overall_miss_latency      139778500                       # number of overall miss cycles
-system.cpu.icache.overall_miss_rate          0.000266                       # miss rate for overall accesses
-system.cpu.icache.overall_misses                 5447                       # number of overall misses
-system.cpu.icache.overall_mshr_hits               440                       # number of overall MSHR hits
-system.cpu.icache.overall_mshr_miss_latency    112031000                       # number of overall MSHR miss cycles
-system.cpu.icache.overall_mshr_miss_rate     0.000245                       # mshr miss rate for overall accesses
-system.cpu.icache.overall_mshr_misses            5007                       # number of overall MSHR misses
+system.cpu.icache.overall_hits               27852177                       # number of overall hits
+system.cpu.icache.overall_miss_latency      163077000                       # number of overall miss cycles
+system.cpu.icache.overall_miss_rate          0.000229                       # miss rate for overall accesses
+system.cpu.icache.overall_misses                 6391                       # number of overall misses
+system.cpu.icache.overall_mshr_hits              1005                       # number of overall MSHR hits
+system.cpu.icache.overall_mshr_miss_latency    120995500                       # number of overall MSHR miss cycles
+system.cpu.icache.overall_mshr_miss_rate     0.000193                       # mshr miss rate for overall accesses
+system.cpu.icache.overall_mshr_misses            5386                       # number of overall MSHR misses
 system.cpu.icache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.icache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.cpu.icache.replacements                   3101                       # number of replacements
-system.cpu.icache.sampled_refs                   5006                       # Sample count of references to valid blocks.
+system.cpu.icache.replacements                   3428                       # number of replacements
+system.cpu.icache.sampled_refs                   5386                       # Sample count of references to valid blocks.
 system.cpu.icache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.icache.tagsinuse               1529.828433                       # Cycle average of tags in use
-system.cpu.icache.total_refs                 20435488                       # Total number of references to valid blocks.
+system.cpu.icache.tagsinuse               1604.546925                       # Cycle average of tags in use
+system.cpu.icache.total_refs                 27852177                       # Total number of references to valid blocks.
 system.cpu.icache.warmup_cycle                      0                       # Cycle when the warmup percentage was hit.
 system.cpu.icache.writebacks                        0                       # number of writebacks
-system.cpu.idleCycles                          124939                       # Total number of cycles that the CPU has spent unscheduled due to idling
-system.cpu.iew.EXEC:branches                 13366188                       # Number of branches executed
+system.cpu.idleCycles                           89183                       # Total number of cycles that the CPU has spent unscheduled due to idling
+system.cpu.iew.EXEC:branches                 15799905                       # Number of branches executed
 system.cpu.iew.EXEC:nop                             0                       # number of nop insts executed
-system.cpu.iew.EXEC:rate                     0.954963                       # Inst execution rate
-system.cpu.iew.EXEC:refs                     84717237                       # number of memory reference insts executed
-system.cpu.iew.EXEC:stores                   21535662                       # Number of stores executed
+system.cpu.iew.EXEC:rate                     1.276995                       # Inst execution rate
+system.cpu.iew.EXEC:refs                     89573185                       # number of memory reference insts executed
+system.cpu.iew.EXEC:stores                   22888685                       # Number of stores executed
 system.cpu.iew.EXEC:swp                             0                       # number of swp insts executed
-system.cpu.iew.WB:consumers                 389337537                       # num instructions consuming a value
-system.cpu.iew.WB:count                     241459353                       # cumulative count of insts written-back
-system.cpu.iew.WB:fanout                     0.499412                       # average fanout of values written-back
+system.cpu.iew.WB:consumers                 372933305                       # num instructions consuming a value
+system.cpu.iew.WB:count                     276026292                       # cumulative count of insts written-back
+system.cpu.iew.WB:fanout                     0.598611                       # average fanout of values written-back
 system.cpu.iew.WB:penalized                         0                       # number of instrctions required to write to 'other' IQ
 system.cpu.iew.WB:penalized_rate                    0                       # fraction of instructions written-back that wrote to 'other' IQ
-system.cpu.iew.WB:producers                 194439848                       # num instructions producing a value
-system.cpu.iew.WB:rate                       0.946450                       # insts written-back per cycle
-system.cpu.iew.WB:sent                      242120517                       # cumulative count of insts sent to commit
-system.cpu.iew.branchMispredicts              3656523                       # Number of branch mispredicts detected at execute
-system.cpu.iew.iewBlockCycles                  214895                       # Number of cycles IEW is blocking
-system.cpu.iew.iewDispLoadInsts              75869162                       # Number of dispatched load instructions
-system.cpu.iew.iewDispNonSpecInsts               1275                       # Number of dispatched non-speculative instructions
-system.cpu.iew.iewDispSquashedInsts           2489008                       # Number of squashed instructions skipped by dispatch
-system.cpu.iew.iewDispStoreInsts             25600521                       # Number of dispatched store instructions
-system.cpu.iew.iewDispatchedInsts           291514094                       # Number of instructions dispatched to IQ
-system.cpu.iew.iewExecLoadInsts              63181575                       # Number of load instructions executed
-system.cpu.iew.iewExecSquashedInsts           4005104                       # Number of squashed instructions skipped in execute
-system.cpu.iew.iewExecutedInsts             243631219                       # Number of executed instructions
-system.cpu.iew.iewIQFullEvents                  25200                       # Number of times the IQ has become full, causing a stall
+system.cpu.iew.WB:producers                 223241922                       # num instructions producing a value
+system.cpu.iew.WB:rate                       1.267624                       # insts written-back per cycle
+system.cpu.iew.WB:sent                      277033647                       # cumulative count of insts sent to commit
+system.cpu.iew.branchMispredicts              3251135                       # Number of branch mispredicts detected at execute
+system.cpu.iew.iewBlockCycles                  619969                       # Number of cycles IEW is blocking
+system.cpu.iew.iewDispLoadInsts             106923422                       # Number of dispatched load instructions
+system.cpu.iew.iewDispNonSpecInsts               1424                       # Number of dispatched non-speculative instructions
+system.cpu.iew.iewDispSquashedInsts            171683                       # Number of squashed instructions skipped by dispatch
+system.cpu.iew.iewDispStoreInsts             37463806                       # Number of dispatched store instructions
+system.cpu.iew.iewDispatchedInsts           401512728                       # Number of instructions dispatched to IQ
+system.cpu.iew.iewExecLoadInsts              66684500                       # Number of load instructions executed
+system.cpu.iew.iewExecSquashedInsts           3440679                       # Number of squashed instructions skipped in execute
+system.cpu.iew.iewExecutedInsts             278066855                       # Number of executed instructions
+system.cpu.iew.iewIQFullEvents                 560615                       # Number of times the IQ has become full, causing a stall
 system.cpu.iew.iewIdleCycles                        0                       # Number of cycles IEW is idle
-system.cpu.iew.iewLSQFullEvents                     0                       # Number of times the LSQ has become full, causing a stall
-system.cpu.iew.iewSquashCycles               11003980                       # Number of cycles IEW is squashing
-system.cpu.iew.iewUnblockCycles                 40028                       # Number of cycles IEW is unblocking
+system.cpu.iew.iewLSQFullEvents                 30447                       # Number of times the LSQ has become full, causing a stall
+system.cpu.iew.iewSquashCycles               23949638                       # Number of cycles IEW is squashing
+system.cpu.iew.iewUnblockCycles                623802                       # Number of cycles IEW is unblocking
 system.cpu.iew.lsq.thread.0.blockedLoads            0                       # Number of blocked loads due to partial load-store forwarding
 system.cpu.iew.lsq.thread.0.cacheBlocked            0                       # Number of times an access to memory failed due to the cache being blocked
-system.cpu.iew.lsq.thread.0.forwLoads        11103688                       # Number of loads that had data forwarded from stores
-system.cpu.iew.lsq.thread.0.ignoredResponses        71380                       # Number of memory responses ignored because the instruction is squashed
+system.cpu.iew.lsq.thread.0.forwLoads        15985064                       # Number of loads that had data forwarded from stores
+system.cpu.iew.lsq.thread.0.ignoredResponses        21414                       # Number of memory responses ignored because the instruction is squashed
 system.cpu.iew.lsq.thread.0.invAddrLoads            0                       # Number of loads ignored due to an invalid address
 system.cpu.iew.lsq.thread.0.invAddrSwpfs            0                       # Number of software prefetches ignored due to an invalid address
-system.cpu.iew.lsq.thread.0.memOrderViolation       879354                       # Number of memory ordering violations
-system.cpu.iew.lsq.thread.0.rescheduledLoads        44904                       # Number of loads that were rescheduled
-system.cpu.iew.lsq.thread.0.squashedLoads     19219572                       # Number of loads squashed
-system.cpu.iew.lsq.thread.0.squashedStores      5084805                       # Number of stores squashed
-system.cpu.iew.memOrderViolationEvents         879354                       # Number of memory order violations
-system.cpu.iew.predictedNotTakenIncorrect       151398                       # Number of branches that were predicted not taken incorrectly
-system.cpu.iew.predictedTakenIncorrect        3505125                       # Number of branches that were predicted taken incorrectly
-system.cpu.int_regfile_reads                614135119                       # number of integer regfile reads
-system.cpu.int_regfile_writes               252115460                       # number of integer regfile writes
-system.cpu.ipc                               0.867678                       # IPC: Instructions Per Cycle
-system.cpu.ipc_total                         0.867678                       # IPC: Total IPC of All Threads
-system.cpu.iq.ISSUE:FU_type_0::No_OpClass      1180294      0.48%      0.48% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::IntAlu       158353329     63.95%     64.42% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::IntMult              0      0.00%     64.42% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::IntDiv               0      0.00%     64.42% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatAdd       1520272      0.61%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatCmp             0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatCvt             0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatMult            0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatDiv             0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatSqrt            0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdAdd              0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdAddAcc            0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdAlu              0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdCmp              0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdCvt              0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdMisc             0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdMult             0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdMultAcc            0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdShift            0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdShiftAcc            0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdSqrt             0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatAdd            0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatAlu            0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatCmp            0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatCvt            0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatDiv            0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatMisc            0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatMult            0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatMultAcc            0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatSqrt            0      0.00%     65.04% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::MemRead       64587764     26.08%     91.12% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::MemWrite      21994664      8.88%    100.00% # Type of FU issued
+system.cpu.iew.lsq.thread.0.memOrderViolation       187512                       # Number of memory ordering violations
+system.cpu.iew.lsq.thread.0.rescheduledLoads        45117                       # Number of loads that were rescheduled
+system.cpu.iew.lsq.thread.0.squashedLoads     50273832                       # Number of loads squashed
+system.cpu.iew.lsq.thread.0.squashedStores     16948090                       # Number of stores squashed
+system.cpu.iew.memOrderViolationEvents         187512                       # Number of memory order violations
+system.cpu.iew.predictedNotTakenIncorrect       737658                       # Number of branches that were predicted not taken incorrectly
+system.cpu.iew.predictedTakenIncorrect        2513477                       # Number of branches that were predicted taken incorrectly
+system.cpu.int_regfile_reads                514946932                       # number of integer regfile reads
+system.cpu.int_regfile_writes               284476955                       # number of integer regfile writes
+system.cpu.ipc                               1.016588                       # IPC: Instructions Per Cycle
+system.cpu.ipc_total                         1.016588                       # IPC: Total IPC of All Threads
+system.cpu.iq.ISSUE:FU_type_0::No_OpClass      1195391      0.42%      0.42% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::IntAlu       187555358     66.63%     67.05% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::IntMult              0      0.00%     67.05% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::IntDiv               0      0.00%     67.05% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatAdd       1589850      0.56%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatCmp             0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatCvt             0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatMult            0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatDiv             0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatSqrt            0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdAdd              0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdAddAcc            0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdAlu              0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdCmp              0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdCvt              0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdMisc             0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdMult             0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdMultAcc            0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdShift            0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdShiftAcc            0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdSqrt             0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatAdd            0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatAlu            0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatCmp            0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatCvt            0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatDiv            0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatMisc            0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatMult            0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatMultAcc            0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatSqrt            0      0.00%     67.61% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::MemRead       67998663     24.16%     91.77% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::MemWrite      23168272      8.23%    100.00% # Type of FU issued
 system.cpu.iq.ISSUE:FU_type_0::IprAccess            0      0.00%    100.00% # Type of FU issued
 system.cpu.iq.ISSUE:FU_type_0::InstPrefetch            0      0.00%    100.00% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::total        247636323                       # Type of FU issued
-system.cpu.iq.ISSUE:fu_busy_cnt                 40899                       # FU busy when requested
-system.cpu.iq.ISSUE:fu_busy_rate             0.000165                       # FU busy rate (busy events/executed inst)
+system.cpu.iq.ISSUE:FU_type_0::total        281507534                       # Type of FU issued
+system.cpu.iq.ISSUE:fu_busy_cnt               2779468                       # FU busy when requested
+system.cpu.iq.ISSUE:fu_busy_rate             0.009874                       # FU busy rate (busy events/executed inst)
 system.cpu.iq.ISSUE:fu_full::No_OpClass             0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::IntAlu                 0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::IntMult                0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::IntDiv                 0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatAdd               0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatCmp               0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatCvt               0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatMult              0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatDiv               0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatSqrt              0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdAdd                0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdAddAcc             0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdAlu                0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdCmp                0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdCvt                0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdMisc               0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdMult               0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdMultAcc            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdShift              0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdShiftAcc            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdSqrt               0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatAdd            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatAlu            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatCmp            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatCvt            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatDiv            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatMisc            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatMult            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatMultAcc            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatSqrt            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::MemRead            37912     92.70%     92.70% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::MemWrite            2987      7.30%    100.00% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::IntAlu             58461      2.10%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::IntMult                0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::IntDiv                 0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatAdd               0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatCmp               0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatCvt               0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatMult              0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatDiv               0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatSqrt              0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdAdd                0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdAddAcc             0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdAlu                0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdCmp                0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdCvt                0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdMisc               0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdMult               0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdMultAcc            0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdShift              0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdShiftAcc            0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdSqrt               0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatAdd            0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatAlu            0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatCmp            0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatCvt            0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatDiv            0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatMisc            0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatMult            0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatMultAcc            0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatSqrt            0      0.00%      2.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::MemRead          2334735     84.00%     86.10% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::MemWrite          386272     13.90%    100.00% # attempts to use FU when none available
 system.cpu.iq.ISSUE:fu_full::IprAccess              0      0.00%    100.00% # attempts to use FU when none available
 system.cpu.iq.ISSUE:fu_full::InstPrefetch            0      0.00%    100.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:issued_per_cycle::samples    254996147                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::mean     0.971138                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::stdev     0.960460                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::samples    217661766                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::mean     1.293326                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::stdev     1.357747                       # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::underflows            0      0.00%      0.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::0      97493255     38.23%     38.23% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::1      86911390     34.08%     72.32% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::2      54912481     21.53%     93.85% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::3      12234045      4.80%     98.65% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::4       3109625      1.22%     99.87% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::5        255105      0.10%     99.97% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::6         77911      0.03%    100.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::7          2335      0.00%    100.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::8             0      0.00%    100.00% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::0      75328501     34.61%     34.61% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::1      67045740     30.80%     65.41% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::2      37681009     17.31%     82.72% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::3      20059185      9.22%     91.94% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::4      11722195      5.39%     97.32% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::5       3737927      1.72%     99.04% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::6       1378220      0.63%     99.67% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::7        597426      0.27%     99.95% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::8        111563      0.05%    100.00% # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::overflows            0      0.00%    100.00% # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::min_value            0                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::max_value            7                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::total    254996147                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:rate                     0.970662                       # Inst issue rate
-system.cpu.iq.fp_alu_accesses                 2542426                       # Number of floating point alu accesses
-system.cpu.iq.fp_inst_queue_reads             5084249                       # Number of floating instruction queue reads
-system.cpu.iq.fp_inst_queue_wakeup_accesses      2387245                       # Number of floating instruction queue wakeup accesses
-system.cpu.iq.fp_inst_queue_writes            3193021                       # Number of floating instruction queue writes
-system.cpu.iq.int_alu_accesses              243954502                       # Number of integer alu accesses
-system.cpu.iq.int_inst_queue_reads          745226741                       # Number of integer instruction queue reads
-system.cpu.iq.int_inst_queue_wakeup_accesses    239072108                       # Number of integer instruction queue wakeup accesses
-system.cpu.iq.int_inst_queue_writes         358869082                       # Number of integer instruction queue writes
-system.cpu.iq.iqInstsAdded                  291512819                       # Number of instructions added to the IQ (excludes non-spec)
-system.cpu.iq.iqInstsIssued                 247636323                       # Number of instructions issued
-system.cpu.iq.iqNonSpecInstsAdded                1275                       # Number of non-speculative instructions added to the IQ
-system.cpu.iq.iqSquashedInstsExamined        69673728                       # Number of squashed instructions iterated over during squash; mainly for profiling
-system.cpu.iq.iqSquashedInstsIssued              1298                       # Number of squashed instructions issued
-system.cpu.iq.iqSquashedNonSpecRemoved             29                       # Number of squashed non-spec instructions that were removed
-system.cpu.iq.iqSquashedOperandsExamined    182988092                       # Number of squashed operands that are examined and possibly removed from graph
-system.cpu.l2cache.ReadExReq_accesses            1575                       # number of ReadExReq accesses(hits+misses)
-system.cpu.l2cache.ReadExReq_avg_miss_latency 34364.012739                       # average ReadExReq miss latency
-system.cpu.l2cache.ReadExReq_avg_mshr_miss_latency 31058.917197                       # average ReadExReq mshr miss latency
-system.cpu.l2cache.ReadExReq_hits                   5                       # number of ReadExReq hits
-system.cpu.l2cache.ReadExReq_miss_latency     53951500                       # number of ReadExReq miss cycles
-system.cpu.l2cache.ReadExReq_miss_rate       0.996825                       # miss rate for ReadExReq accesses
-system.cpu.l2cache.ReadExReq_misses              1570                       # number of ReadExReq misses
-system.cpu.l2cache.ReadExReq_mshr_miss_latency     48762500                       # number of ReadExReq MSHR miss cycles
-system.cpu.l2cache.ReadExReq_mshr_miss_rate     0.996825                       # mshr miss rate for ReadExReq accesses
-system.cpu.l2cache.ReadExReq_mshr_misses         1570                       # number of ReadExReq MSHR misses
-system.cpu.l2cache.ReadReq_accesses              5367                       # number of ReadReq accesses(hits+misses)
-system.cpu.l2cache.ReadReq_avg_miss_latency 34265.528407                       # average ReadReq miss latency
-system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 31035.178098                       # average ReadReq mshr miss latency
-system.cpu.l2cache.ReadReq_hits                  1970                       # number of ReadReq hits
-system.cpu.l2cache.ReadReq_miss_latency     116400000                       # number of ReadReq miss cycles
-system.cpu.l2cache.ReadReq_miss_rate         0.632942                       # miss rate for ReadReq accesses
-system.cpu.l2cache.ReadReq_misses                3397                       # number of ReadReq misses
-system.cpu.l2cache.ReadReq_mshr_miss_latency    105426500                       # number of ReadReq MSHR miss cycles
-system.cpu.l2cache.ReadReq_mshr_miss_rate     0.632942                       # mshr miss rate for ReadReq accesses
-system.cpu.l2cache.ReadReq_mshr_misses           3397                       # number of ReadReq MSHR misses
-system.cpu.l2cache.Writeback_accesses               9                       # number of Writeback accesses(hits+misses)
-system.cpu.l2cache.Writeback_hits                   9                       # number of Writeback hits
+system.cpu.iq.ISSUE:issued_per_cycle::max_value            8                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::total    217661766                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:rate                     1.292796                       # Inst issue rate
+system.cpu.iq.fp_alu_accesses                 2630821                       # Number of floating point alu accesses
+system.cpu.iq.fp_inst_queue_reads             5219937                       # Number of floating instruction queue reads
+system.cpu.iq.fp_inst_queue_wakeup_accesses      2526643                       # Number of floating instruction queue wakeup accesses
+system.cpu.iq.fp_inst_queue_writes            5714467                       # Number of floating instruction queue writes
+system.cpu.iq.int_alu_accesses              280460790                       # Number of integer alu accesses
+system.cpu.iq.int_inst_queue_reads          778290063                       # Number of integer instruction queue reads
+system.cpu.iq.int_inst_queue_wakeup_accesses    273499649                       # Number of integer instruction queue wakeup accesses
+system.cpu.iq.int_inst_queue_writes         575780653                       # Number of integer instruction queue writes
+system.cpu.iq.iqInstsAdded                  401511304                       # Number of instructions added to the IQ (excludes non-spec)
+system.cpu.iq.iqInstsIssued                 281507534                       # Number of instructions issued
+system.cpu.iq.iqNonSpecInstsAdded                1424                       # Number of non-speculative instructions added to the IQ
+system.cpu.iq.iqSquashedInstsExamined       179800569                       # Number of squashed instructions iterated over during squash; mainly for profiling
+system.cpu.iq.iqSquashedInstsIssued             53698                       # Number of squashed instructions issued
+system.cpu.iq.iqSquashedNonSpecRemoved            178                       # Number of squashed non-spec instructions that were removed
+system.cpu.iq.iqSquashedOperandsExamined    375388973                       # Number of squashed operands that are examined and possibly removed from graph
+system.cpu.l2cache.ReadExReq_accesses            1566                       # number of ReadExReq accesses(hits+misses)
+system.cpu.l2cache.ReadExReq_avg_miss_latency 34512.500000                       # average ReadExReq miss latency
+system.cpu.l2cache.ReadExReq_avg_mshr_miss_latency 31347.756410                       # average ReadExReq mshr miss latency
+system.cpu.l2cache.ReadExReq_hits                   6                       # number of ReadExReq hits
+system.cpu.l2cache.ReadExReq_miss_latency     53839500                       # number of ReadExReq miss cycles
+system.cpu.l2cache.ReadExReq_miss_rate       0.996169                       # miss rate for ReadExReq accesses
+system.cpu.l2cache.ReadExReq_misses              1560                       # number of ReadExReq misses
+system.cpu.l2cache.ReadExReq_mshr_miss_latency     48902500                       # number of ReadExReq MSHR miss cycles
+system.cpu.l2cache.ReadExReq_mshr_miss_rate     0.996169                       # mshr miss rate for ReadExReq accesses
+system.cpu.l2cache.ReadExReq_mshr_misses         1560                       # number of ReadExReq MSHR misses
+system.cpu.l2cache.ReadReq_accesses              5770                       # number of ReadReq accesses(hits+misses)
+system.cpu.l2cache.ReadReq_avg_miss_latency 34287.021858                       # average ReadReq miss latency
+system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 31043.032787                       # average ReadReq mshr miss latency
+system.cpu.l2cache.ReadReq_hits                  2110                       # number of ReadReq hits
+system.cpu.l2cache.ReadReq_miss_latency     125490500                       # number of ReadReq miss cycles
+system.cpu.l2cache.ReadReq_miss_rate         0.634315                       # miss rate for ReadReq accesses
+system.cpu.l2cache.ReadReq_misses                3660                       # number of ReadReq misses
+system.cpu.l2cache.ReadReq_mshr_miss_latency    113617500                       # number of ReadReq MSHR miss cycles
+system.cpu.l2cache.ReadReq_mshr_miss_rate     0.634315                       # mshr miss rate for ReadReq accesses
+system.cpu.l2cache.ReadReq_mshr_misses           3660                       # number of ReadReq MSHR misses
+system.cpu.l2cache.Writeback_accesses              10                       # number of Writeback accesses(hits+misses)
+system.cpu.l2cache.Writeback_hits                  10                       # number of Writeback hits
 system.cpu.l2cache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.l2cache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.l2cache.avg_refs                  0.579412                       # Average number of references to valid blocks.
+system.cpu.l2cache.avg_refs                  0.575873                       # Average number of references to valid blocks.
 system.cpu.l2cache.blocked::no_mshrs                0                       # number of cycles access was blocked
 system.cpu.l2cache.blocked::no_targets              0                       # number of cycles access was blocked
 system.cpu.l2cache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.l2cache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.l2cache.cache_copies                     0                       # number of cache copies performed
-system.cpu.l2cache.demand_accesses               6942                       # number of demand (read+write) accesses
-system.cpu.l2cache.demand_avg_miss_latency 34296.657942                       # average overall miss latency
-system.cpu.l2cache.demand_avg_mshr_miss_latency 31042.681699                       # average overall mshr miss latency
-system.cpu.l2cache.demand_hits                   1975                       # number of demand (read+write) hits
-system.cpu.l2cache.demand_miss_latency      170351500                       # number of demand (read+write) miss cycles
-system.cpu.l2cache.demand_miss_rate          0.715500                       # miss rate for demand accesses
-system.cpu.l2cache.demand_misses                 4967                       # number of demand (read+write) misses
+system.cpu.l2cache.demand_accesses               7336                       # number of demand (read+write) accesses
+system.cpu.l2cache.demand_avg_miss_latency 34354.406130                       # average overall miss latency
+system.cpu.l2cache.demand_avg_mshr_miss_latency 31134.099617                       # average overall mshr miss latency
+system.cpu.l2cache.demand_hits                   2116                       # number of demand (read+write) hits
+system.cpu.l2cache.demand_miss_latency      179330000                       # number of demand (read+write) miss cycles
+system.cpu.l2cache.demand_miss_rate          0.711559                       # miss rate for demand accesses
+system.cpu.l2cache.demand_misses                 5220                       # number of demand (read+write) misses
 system.cpu.l2cache.demand_mshr_hits                 0                       # number of demand (read+write) MSHR hits
-system.cpu.l2cache.demand_mshr_miss_latency    154189000                       # number of demand (read+write) MSHR miss cycles
-system.cpu.l2cache.demand_mshr_miss_rate     0.715500                       # mshr miss rate for demand accesses
-system.cpu.l2cache.demand_mshr_misses            4967                       # number of demand (read+write) MSHR misses
+system.cpu.l2cache.demand_mshr_miss_latency    162520000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.l2cache.demand_mshr_miss_rate     0.711559                       # mshr miss rate for demand accesses
+system.cpu.l2cache.demand_mshr_misses            5220                       # number of demand (read+write) MSHR misses
 system.cpu.l2cache.fast_writes                      0                       # number of fast writes performed
 system.cpu.l2cache.mshr_cap_events                  0                       # number of times MSHR cap was activated
 system.cpu.l2cache.no_allocate_misses               0                       # Number of misses that were no-allocate
-system.cpu.l2cache.occ_%::0                  0.068086                       # Average percentage of cache occupancy
+system.cpu.l2cache.occ_%::0                  0.074027                       # Average percentage of cache occupancy
 system.cpu.l2cache.occ_%::1                  0.000031                       # Average percentage of cache occupancy
-system.cpu.l2cache.occ_blocks::0          2231.049035                       # Average occupied blocks per context
-system.cpu.l2cache.occ_blocks::1             1.015700                       # Average occupied blocks per context
-system.cpu.l2cache.overall_accesses              6942                       # number of overall (read+write) accesses
-system.cpu.l2cache.overall_avg_miss_latency 34296.657942                       # average overall miss latency
-system.cpu.l2cache.overall_avg_mshr_miss_latency 31042.681699                       # average overall mshr miss latency
+system.cpu.l2cache.occ_blocks::0          2425.713909                       # Average occupied blocks per context
+system.cpu.l2cache.occ_blocks::1             1.014918                       # Average occupied blocks per context
+system.cpu.l2cache.overall_accesses              7336                       # number of overall (read+write) accesses
+system.cpu.l2cache.overall_avg_miss_latency 34354.406130                       # average overall miss latency
+system.cpu.l2cache.overall_avg_mshr_miss_latency 31134.099617                       # average overall mshr miss latency
 system.cpu.l2cache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.l2cache.overall_hits                  1975                       # number of overall hits
-system.cpu.l2cache.overall_miss_latency     170351500                       # number of overall miss cycles
-system.cpu.l2cache.overall_miss_rate         0.715500                       # miss rate for overall accesses
-system.cpu.l2cache.overall_misses                4967                       # number of overall misses
+system.cpu.l2cache.overall_hits                  2116                       # number of overall hits
+system.cpu.l2cache.overall_miss_latency     179330000                       # number of overall miss cycles
+system.cpu.l2cache.overall_miss_rate         0.711559                       # miss rate for overall accesses
+system.cpu.l2cache.overall_misses                5220                       # number of overall misses
 system.cpu.l2cache.overall_mshr_hits                0                       # number of overall MSHR hits
-system.cpu.l2cache.overall_mshr_miss_latency    154189000                       # number of overall MSHR miss cycles
-system.cpu.l2cache.overall_mshr_miss_rate     0.715500                       # mshr miss rate for overall accesses
-system.cpu.l2cache.overall_mshr_misses           4967                       # number of overall MSHR misses
+system.cpu.l2cache.overall_mshr_miss_latency    162520000                       # number of overall MSHR miss cycles
+system.cpu.l2cache.overall_mshr_miss_rate     0.711559                       # mshr miss rate for overall accesses
+system.cpu.l2cache.overall_mshr_misses           5220                       # number of overall MSHR misses
 system.cpu.l2cache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.l2cache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
 system.cpu.l2cache.replacements                     0                       # number of replacements
-system.cpu.l2cache.sampled_refs                  3400                       # Sample count of references to valid blocks.
+system.cpu.l2cache.sampled_refs                  3664                       # Sample count of references to valid blocks.
 system.cpu.l2cache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.l2cache.tagsinuse              2232.064735                       # Cycle average of tags in use
-system.cpu.l2cache.total_refs                    1970                       # Total number of references to valid blocks.
+system.cpu.l2cache.tagsinuse              2426.728827                       # Cycle average of tags in use
+system.cpu.l2cache.total_refs                    2110                       # Total number of references to valid blocks.
 system.cpu.l2cache.warmup_cycle                     0                       # Cycle when the warmup percentage was hit.
 system.cpu.l2cache.writebacks                       0                       # number of writebacks
-system.cpu.memDep0.conflictingLoads          21807942                       # Number of conflicting loads.
-system.cpu.memDep0.conflictingStores          4495847                       # Number of conflicting stores.
-system.cpu.memDep0.insertedLoads             75869162                       # Number of loads inserted to the mem dependence unit.
-system.cpu.memDep0.insertedStores            25600521                       # Number of stores inserted to the mem dependence unit.
-system.cpu.misc_regfile_reads               125958118                       # number of misc regfile reads
+system.cpu.memDep0.conflictingLoads          95035235                       # Number of conflicting loads.
+system.cpu.memDep0.conflictingStores         32152607                       # Number of conflicting stores.
+system.cpu.memDep0.insertedLoads            106923422                       # Number of loads inserted to the mem dependence unit.
+system.cpu.memDep0.insertedStores            37463806                       # Number of stores inserted to the mem dependence unit.
+system.cpu.misc_regfile_reads               144601816                       # number of misc regfile reads
 system.cpu.misc_regfile_writes                    844                       # number of misc regfile writes
-system.cpu.numCycles                        255121086                       # number of cpu cycles simulated
+system.cpu.numCycles                        217750949                       # number of cpu cycles simulated
 system.cpu.numWorkItemsCompleted                    0                       # number of work items this cpu completed
 system.cpu.numWorkItemsStarted                      0                       # number of work items this cpu started
-system.cpu.rename.RENAME:BlockCycles          1303093                       # Number of cycles rename is blocking
+system.cpu.rename.RENAME:BlockCycles         18951054                       # Number of cycles rename is blocking
 system.cpu.rename.RENAME:CommittedMaps      234363409                       # Number of HB maps that are committed
-system.cpu.rename.RENAME:IQFullEvents         2662460                       # Number of times rename has blocked due to IQ full
-system.cpu.rename.RENAME:IdleCycles          57579297                       # Number of cycles rename is idle
-system.cpu.rename.RENAME:LSQFullEvents         975892                       # Number of times rename has blocked due to LSQ full
-system.cpu.rename.RENAME:RenameLookups      963293874                       # Number of register rename lookups that rename has made
-system.cpu.rename.RENAME:RenamedInsts       304077108                       # Number of instructions processed by rename
-system.cpu.rename.RENAME:RenamedOperands    331962025                       # Number of destination operands rename has renamed
-system.cpu.rename.RENAME:RunCycles          180705413                       # Number of cycles rename is running
-system.cpu.rename.RENAME:SquashCycles        11003980                       # Number of cycles rename is squashing
-system.cpu.rename.RENAME:UnblockCycles        4387817                       # Number of cycles rename is unblocking
-system.cpu.rename.RENAME:UndoneMaps          97598616                       # Number of HB maps that are undone due to squashing
-system.cpu.rename.RENAME:fp_rename_lookups      7191870                       # Number of floating rename lookups
-system.cpu.rename.RENAME:int_rename_lookups    956102004                       # Number of integer rename lookups
-system.cpu.rename.RENAME:serializeStallCycles        16547                       # count of cycles rename stalled for serializing inst
-system.cpu.rename.RENAME:serializingInsts         1274                       # count of serializing insts renamed
-system.cpu.rename.RENAME:skidInsts            8156807                       # count of insts added to the skid buffer
-system.cpu.rename.RENAME:tempSerializingInsts         1279                       # count of temporary serializing insts renamed
-system.cpu.rob.rob_reads                    535181849                       # The number of ROB reads
-system.cpu.rob.rob_writes                   594057529                       # The number of ROB writes
-system.cpu.timesIdled                            2349                       # Number of times that the entire CPU went into an idle state and unscheduled itself
+system.cpu.rename.RENAME:IQFullEvents        22087788                       # Number of times rename has blocked due to IQ full
+system.cpu.rename.RENAME:IdleCycles          75841753                       # Number of cycles rename is idle
+system.cpu.rename.RENAME:LSQFullEvents       16619805                       # Number of times rename has blocked due to LSQ full
+system.cpu.rename.RENAME:ROBFullEvents              9                       # Number of times rename has blocked due to ROB full
+system.cpu.rename.RENAME:RenameLookups     1071149424                       # Number of register rename lookups that rename has made
+system.cpu.rename.RENAME:RenamedInsts       415976206                       # Number of instructions processed by rename
+system.cpu.rename.RENAME:RenamedOperands    437655168                       # Number of destination operands rename has renamed
+system.cpu.rename.RENAME:RunCycles           58179410                       # Number of cycles rename is running
+system.cpu.rename.RENAME:SquashCycles        23949638                       # Number of cycles rename is squashing
+system.cpu.rename.RENAME:UnblockCycles       40717504                       # Number of cycles rename is unblocking
+system.cpu.rename.RENAME:UndoneMaps         203291759                       # Number of HB maps that are undone due to squashing
+system.cpu.rename.RENAME:fp_rename_lookups     11132052                       # Number of floating rename lookups
+system.cpu.rename.RENAME:int_rename_lookups   1060017372                       # Number of integer rename lookups
+system.cpu.rename.RENAME:serializeStallCycles        22407                       # count of cycles rename stalled for serializing inst
+system.cpu.rename.RENAME:serializingInsts         1440                       # count of serializing insts renamed
+system.cpu.rename.RENAME:skidInsts           84366850                       # count of insts added to the skid buffer
+system.cpu.rename.RENAME:tempSerializingInsts         1310                       # count of temporary serializing insts renamed
+system.cpu.rob.rob_reads                    592991425                       # The number of ROB reads
+system.cpu.rob.rob_writes                   827053987                       # The number of ROB writes
+system.cpu.timesIdled                            1919                       # Number of times that the entire CPU went into an idle state and unscheduled itself
 system.cpu.workload.PROG:num_syscalls             400                       # Number of system calls
 
 ---------- End Simulation Statistics   ----------
diff --git a/tests/long/70.twolf/ref/x86/linux/simple-atomic/simout b/tests/long/70.twolf/ref/x86/linux/simple-atomic/simout
index 3569c883b0..9f05df4339 100755
--- a/tests/long/70.twolf/ref/x86/linux/simple-atomic/simout
+++ b/tests/long/70.twolf/ref/x86/linux/simple-atomic/simout
@@ -5,13 +5,11 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:36:47
+M5 compiled Feb  8 2011 00:58:32
+M5 revision 705a4d351a43 7939 default qtip resforflagsstats.patch tip
+M5 started Feb  8 2011 00:58:34
 M5 executing on burrito
 command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/long/70.twolf/x86/linux/simple-atomic -re tests/run.py build/X86_SE/tests/fast/long/70.twolf/x86/linux/simple-atomic
-Couldn't unlink  build/X86_SE/tests/fast/long/70.twolf/x86/linux/simple-atomic/smred.sav
-Couldn't unlink  build/X86_SE/tests/fast/long/70.twolf/x86/linux/simple-atomic/smred.sv2
 Global frequency set at 1000000000000 ticks per second
 info: Entering event queue @ 0.  Starting simulation...
 
diff --git a/tests/long/70.twolf/ref/x86/linux/simple-atomic/stats.txt b/tests/long/70.twolf/ref/x86/linux/simple-atomic/stats.txt
index da648dcbf6..0c54c7d410 100644
--- a/tests/long/70.twolf/ref/x86/linux/simple-atomic/stats.txt
+++ b/tests/long/70.twolf/ref/x86/linux/simple-atomic/stats.txt
@@ -1,9 +1,9 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                 777141                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 230844                       # Number of bytes of host memory used
-host_seconds                                   284.84                       # Real time elapsed on the host
-host_tick_rate                              461282227                       # Simulator tick rate (ticks/s)
+host_inst_rate                                1396551                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 231332                       # Number of bytes of host memory used
+host_seconds                                   158.51                       # Real time elapsed on the host
+host_tick_rate                              828940820                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                   221363018                       # Number of instructions simulated
 sim_seconds                                  0.131393                       # Number of seconds simulated
@@ -24,7 +24,7 @@ system.cpu.num_idle_cycles                          0                       # Nu
 system.cpu.num_insts                        221363018                       # Number of instructions executed
 system.cpu.num_int_alu_accesses             220339607                       # Number of integer alu accesses
 system.cpu.num_int_insts                    220339607                       # number of integer instructions
-system.cpu.num_int_register_reads           686620674                       # number of times the integer registers were read
+system.cpu.num_int_register_reads           567557364                       # number of times the integer registers were read
 system.cpu.num_int_register_writes          232532006                       # number of times the integer registers were written
 system.cpu.num_load_insts                    56649590                       # Number of load instructions
 system.cpu.num_mem_refs                      77165306                       # number of memory refs
diff --git a/tests/long/70.twolf/ref/x86/linux/simple-timing/simout b/tests/long/70.twolf/ref/x86/linux/simple-timing/simout
index 31ab1843bb..72c0f8f4d3 100755
--- a/tests/long/70.twolf/ref/x86/linux/simple-timing/simout
+++ b/tests/long/70.twolf/ref/x86/linux/simple-timing/simout
@@ -5,13 +5,11 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:32:24
+M5 compiled Feb  8 2011 00:58:32
+M5 revision 705a4d351a43 7939 default qtip resforflagsstats.patch tip
+M5 started Feb  8 2011 00:58:34
 M5 executing on burrito
 command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/long/70.twolf/x86/linux/simple-timing -re tests/run.py build/X86_SE/tests/fast/long/70.twolf/x86/linux/simple-timing
-Couldn't unlink  build/X86_SE/tests/fast/long/70.twolf/x86/linux/simple-timing/smred.sav
-Couldn't unlink  build/X86_SE/tests/fast/long/70.twolf/x86/linux/simple-timing/smred.sv2
 Global frequency set at 1000000000000 ticks per second
 info: Entering event queue @ 0.  Starting simulation...
 
diff --git a/tests/long/70.twolf/ref/x86/linux/simple-timing/stats.txt b/tests/long/70.twolf/ref/x86/linux/simple-timing/stats.txt
index ebc389a3a6..bbd74268b9 100644
--- a/tests/long/70.twolf/ref/x86/linux/simple-timing/stats.txt
+++ b/tests/long/70.twolf/ref/x86/linux/simple-timing/stats.txt
@@ -1,9 +1,9 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                 446836                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 238556                       # Number of bytes of host memory used
-host_seconds                                   495.40                       # Real time elapsed on the host
-host_tick_rate                              506580174                       # Simulator tick rate (ticks/s)
+host_inst_rate                                 920852                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 239052                       # Number of bytes of host memory used
+host_seconds                                   240.39                       # Real time elapsed on the host
+host_tick_rate                             1043974445                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                   221363018                       # Number of instructions simulated
 sim_seconds                                  0.250961                       # Number of seconds simulated
@@ -213,7 +213,7 @@ system.cpu.num_idle_cycles                          0                       # Nu
 system.cpu.num_insts                        221363018                       # Number of instructions executed
 system.cpu.num_int_alu_accesses             220339607                       # Number of integer alu accesses
 system.cpu.num_int_insts                    220339607                       # number of integer instructions
-system.cpu.num_int_register_reads           686620674                       # number of times the integer registers were read
+system.cpu.num_int_register_reads           567557364                       # number of times the integer registers were read
 system.cpu.num_int_register_writes          232532006                       # number of times the integer registers were written
 system.cpu.num_load_insts                    56649590                       # Number of load instructions
 system.cpu.num_mem_refs                      77165306                       # number of memory refs
diff --git a/tests/quick/00.hello/ref/alpha/linux/inorder-timing/simout b/tests/quick/00.hello/ref/alpha/linux/inorder-timing/simout
index 254c4b8b1d..fa50fea550 100755
--- a/tests/quick/00.hello/ref/alpha/linux/inorder-timing/simout
+++ b/tests/quick/00.hello/ref/alpha/linux/inorder-timing/simout
@@ -5,13 +5,13 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 01:47:18
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 01:47:37
-M5 executing on burrito
+M5 compiled Feb 18 2011 15:40:30
+M5 revision Unknown
+M5 started Feb 18 2011 18:52:59
+M5 executing on m55-001.pool
 command line: build/ALPHA_SE/m5.fast -d build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/inorder-timing -re tests/run.py build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/inorder-timing
 Global frequency set at 1000000000000 ticks per second
 info: Entering event queue @ 0.  Starting simulation...
 info: Increasing stack size by one page.
 Hello world!
-Exiting @ tick 22288500 because target called exit()
+Exiting @ tick 22294500 because target called exit()
diff --git a/tests/quick/00.hello/ref/alpha/linux/inorder-timing/stats.txt b/tests/quick/00.hello/ref/alpha/linux/inorder-timing/stats.txt
index 246665e32f..bb298d30ae 100644
--- a/tests/quick/00.hello/ref/alpha/linux/inorder-timing/stats.txt
+++ b/tests/quick/00.hello/ref/alpha/linux/inorder-timing/stats.txt
@@ -1,37 +1,37 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                  37548                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 223436                       # Number of bytes of host memory used
-host_seconds                                     0.17                       # Real time elapsed on the host
-host_tick_rate                              130476959                       # Simulator tick rate (ticks/s)
+host_inst_rate                                  97475                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 190320                       # Number of bytes of host memory used
+host_seconds                                     0.07                       # Real time elapsed on the host
+host_tick_rate                              337940129                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                        6404                       # Number of instructions simulated
 sim_seconds                                  0.000022                       # Number of seconds simulated
-sim_ticks                                    22288500                       # Number of ticks simulated
-system.cpu.AGEN-Unit.agens                       2187                       # Number of Address Generations
+sim_ticks                                    22294500                       # Number of ticks simulated
+system.cpu.AGEN-Unit.agens                       2186                       # Number of Address Generations
 system.cpu.Branch-Predictor.BTBHitPct       23.015873                       # BTB Hit Percentage
 system.cpu.Branch-Predictor.BTBHits                87                       # Number of BTB hits
 system.cpu.Branch-Predictor.BTBLookups            378                       # Number of BTB lookups
 system.cpu.Branch-Predictor.RASInCorrect            0                       # Number of incorrect RAS predictions.
-system.cpu.Branch-Predictor.condIncorrect          543                       # Number of conditional branches incorrect
+system.cpu.Branch-Predictor.condIncorrect          542                       # Number of conditional branches incorrect
 system.cpu.Branch-Predictor.condPredicted          995                       # Number of conditional branches predicted
 system.cpu.Branch-Predictor.lookups              1423                       # Number of BP lookups
 system.cpu.Branch-Predictor.predictedNotTaken         1183                       # Number of Branches Predicted As Not Taken (False).
 system.cpu.Branch-Predictor.predictedTaken          240                       # Number of Branches Predicted As Taken (True).
 system.cpu.Branch-Predictor.usedRAS               125                       # Number of times the RAS was used to get a target.
-system.cpu.Execution-Unit.executions             4617                       # Number of Instructions Executed.
-system.cpu.Execution-Unit.mispredictPct     51.615970                       # Percentage of Incorrect Branches Predicts
-system.cpu.Execution-Unit.mispredicted            543                       # Number of Branches Incorrectly Predicted
+system.cpu.Execution-Unit.executions             4596                       # Number of Instructions Executed.
+system.cpu.Execution-Unit.mispredictPct     51.569933                       # Percentage of Incorrect Branches Predicts
+system.cpu.Execution-Unit.mispredicted            542                       # Number of Branches Incorrectly Predicted
 system.cpu.Execution-Unit.predicted               509                       # Number of Branches Incorrectly Predicted
-system.cpu.Execution-Unit.predictedNotTakenIncorrect          538                       # Number of Branches Incorrectly Predicted As Not Taken).
+system.cpu.Execution-Unit.predictedNotTakenIncorrect          537                       # Number of Branches Incorrectly Predicted As Not Taken).
 system.cpu.Execution-Unit.predictedTakenIncorrect            5                       # Number of Branches Incorrectly Predicted As Taken.
 system.cpu.Mult-Div-Unit.divides                    0                       # Number of Divide Operations Executed
 system.cpu.Mult-Div-Unit.multiplies                 1                       # Number of Multipy Operations Executed
-system.cpu.RegFile-Manager.regFileAccesses        10532                       # Number of Total Accesses (Read+Write) to the Register File
-system.cpu.RegFile-Manager.regFileReads          5949                       # Number of Reads from Register File
+system.cpu.RegFile-Manager.regFileAccesses        10530                       # Number of Total Accesses (Read+Write) to the Register File
+system.cpu.RegFile-Manager.regFileReads          5947                       # Number of Reads from Register File
 system.cpu.RegFile-Manager.regFileWrites         4583                       # Number of Writes to Register File
 system.cpu.RegFile-Manager.regForwards           2845                       # Number of Registers Read Through Forwarding Logic
-system.cpu.activity                         16.048275                       # Percentage of cycles cpu is active
+system.cpu.activity                         16.075353                       # Percentage of cycles cpu is active
 system.cpu.comBranches                           1051                       # Number of Branches instructions committed
 system.cpu.comFloats                                2                       # Number of Floating Point instructions committed
 system.cpu.comInts                               3265                       # Number of Integer instructions committed
@@ -42,17 +42,17 @@ system.cpu.comStores                              865                       # Nu
 system.cpu.committedInsts                        6404                       # Number of Instructions Simulated (Per-Thread)
 system.cpu.committedInsts_total                  6404                       # Number of Instructions Simulated (Total)
 system.cpu.contextSwitches                          1                       # Number of context switches
-system.cpu.cpi                               6.960962                       # CPI: Cycles Per Instruction (Per-Thread)
-system.cpu.cpi_total                         6.960962                       # CPI: Total CPI of All Threads
+system.cpu.cpi                               6.962836                       # CPI: Cycles Per Instruction (Per-Thread)
+system.cpu.cpi_total                         6.962836                       # CPI: Total CPI of All Threads
 system.cpu.dcache.ReadReq_accesses               1185                       # number of ReadReq accesses(hits+misses)
-system.cpu.dcache.ReadReq_avg_miss_latency 56781.250000                       # average ReadReq miss latency
-system.cpu.dcache.ReadReq_avg_mshr_miss_latency 53784.210526                       # average ReadReq mshr miss latency
+system.cpu.dcache.ReadReq_avg_miss_latency 56786.458333                       # average ReadReq miss latency
+system.cpu.dcache.ReadReq_avg_mshr_miss_latency 53789.473684                       # average ReadReq mshr miss latency
 system.cpu.dcache.ReadReq_hits                   1089                       # number of ReadReq hits
-system.cpu.dcache.ReadReq_miss_latency        5451000                       # number of ReadReq miss cycles
+system.cpu.dcache.ReadReq_miss_latency        5451500                       # number of ReadReq miss cycles
 system.cpu.dcache.ReadReq_miss_rate          0.081013                       # miss rate for ReadReq accesses
 system.cpu.dcache.ReadReq_misses                   96                       # number of ReadReq misses
 system.cpu.dcache.ReadReq_mshr_hits                 1                       # number of ReadReq MSHR hits
-system.cpu.dcache.ReadReq_mshr_miss_latency      5109500                       # number of ReadReq MSHR miss cycles
+system.cpu.dcache.ReadReq_mshr_miss_latency      5110000                       # number of ReadReq MSHR miss cycles
 system.cpu.dcache.ReadReq_mshr_miss_rate     0.080169                       # mshr miss rate for ReadReq accesses
 system.cpu.dcache.ReadReq_mshr_misses              95                       # number of ReadReq MSHR misses
 system.cpu.dcache.WriteReq_accesses               865                       # number of WriteReq accesses(hits+misses)
@@ -75,31 +75,31 @@ system.cpu.dcache.blocked_cycles::no_mshrs            0                       #
 system.cpu.dcache.blocked_cycles::no_targets       162000                       # number of cycles access was blocked
 system.cpu.dcache.cache_copies                      0                       # number of cache copies performed
 system.cpu.dcache.demand_accesses                2050                       # number of demand (read+write) accesses
-system.cpu.dcache.demand_avg_miss_latency 56661.157025                       # average overall miss latency
-system.cpu.dcache.demand_avg_mshr_miss_latency 53687.500000                       # average overall mshr miss latency
+system.cpu.dcache.demand_avg_miss_latency 56663.223140                       # average overall miss latency
+system.cpu.dcache.demand_avg_mshr_miss_latency 53690.476190                       # average overall mshr miss latency
 system.cpu.dcache.demand_hits                    1808                       # number of demand (read+write) hits
-system.cpu.dcache.demand_miss_latency        13712000                       # number of demand (read+write) miss cycles
+system.cpu.dcache.demand_miss_latency        13712500                       # number of demand (read+write) miss cycles
 system.cpu.dcache.demand_miss_rate           0.118049                       # miss rate for demand accesses
 system.cpu.dcache.demand_misses                   242                       # number of demand (read+write) misses
 system.cpu.dcache.demand_mshr_hits                 74                       # number of demand (read+write) MSHR hits
-system.cpu.dcache.demand_mshr_miss_latency      9019500                       # number of demand (read+write) MSHR miss cycles
+system.cpu.dcache.demand_mshr_miss_latency      9020000                       # number of demand (read+write) MSHR miss cycles
 system.cpu.dcache.demand_mshr_miss_rate      0.081951                       # mshr miss rate for demand accesses
 system.cpu.dcache.demand_mshr_misses              168                       # number of demand (read+write) MSHR misses
 system.cpu.dcache.fast_writes                       0                       # number of fast writes performed
 system.cpu.dcache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.dcache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.dcache.occ_%::0                   0.024901                       # Average percentage of cache occupancy
-system.cpu.dcache.occ_blocks::0            101.993452                       # Average occupied blocks per context
+system.cpu.dcache.occ_%::0                   0.024898                       # Average percentage of cache occupancy
+system.cpu.dcache.occ_blocks::0            101.981030                       # Average occupied blocks per context
 system.cpu.dcache.overall_accesses               2050                       # number of overall (read+write) accesses
-system.cpu.dcache.overall_avg_miss_latency 56661.157025                       # average overall miss latency
-system.cpu.dcache.overall_avg_mshr_miss_latency 53687.500000                       # average overall mshr miss latency
+system.cpu.dcache.overall_avg_miss_latency 56663.223140                       # average overall miss latency
+system.cpu.dcache.overall_avg_mshr_miss_latency 53690.476190                       # average overall mshr miss latency
 system.cpu.dcache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
 system.cpu.dcache.overall_hits                   1808                       # number of overall hits
-system.cpu.dcache.overall_miss_latency       13712000                       # number of overall miss cycles
+system.cpu.dcache.overall_miss_latency       13712500                       # number of overall miss cycles
 system.cpu.dcache.overall_miss_rate          0.118049                       # miss rate for overall accesses
 system.cpu.dcache.overall_misses                  242                       # number of overall misses
 system.cpu.dcache.overall_mshr_hits                74                       # number of overall MSHR hits
-system.cpu.dcache.overall_mshr_miss_latency      9019500                       # number of overall MSHR miss cycles
+system.cpu.dcache.overall_mshr_miss_latency      9020000                       # number of overall MSHR miss cycles
 system.cpu.dcache.overall_mshr_miss_rate     0.081951                       # mshr miss rate for overall accesses
 system.cpu.dcache.overall_mshr_misses             168                       # number of overall MSHR misses
 system.cpu.dcache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
@@ -107,7 +107,7 @@ system.cpu.dcache.overall_mshr_uncacheable_misses            0
 system.cpu.dcache.replacements                      0                       # number of replacements
 system.cpu.dcache.sampled_refs                    168                       # Sample count of references to valid blocks.
 system.cpu.dcache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.dcache.tagsinuse                101.993452                       # Cycle average of tags in use
+system.cpu.dcache.tagsinuse                101.981030                       # Cycle average of tags in use
 system.cpu.dcache.total_refs                     1808                       # Total number of references to valid blocks.
 system.cpu.dcache.warmup_cycle                      0                       # Cycle when the warmup percentage was hit.
 system.cpu.dcache.writebacks                        0                       # number of writebacks
@@ -128,10 +128,10 @@ system.cpu.dtb.write_acv                            0                       # DT
 system.cpu.dtb.write_hits                         865                       # DTB write hits
 system.cpu.dtb.write_misses                         3                       # DTB write misses
 system.cpu.icache.ReadReq_accesses                955                       # number of ReadReq accesses(hits+misses)
-system.cpu.icache.ReadReq_avg_miss_latency 55326.979472                       # average ReadReq miss latency
+system.cpu.icache.ReadReq_avg_miss_latency 55322.580645                       # average ReadReq miss latency
 system.cpu.icache.ReadReq_avg_mshr_miss_latency 53094.684385                       # average ReadReq mshr miss latency
 system.cpu.icache.ReadReq_hits                    614                       # number of ReadReq hits
-system.cpu.icache.ReadReq_miss_latency       18866500                       # number of ReadReq miss cycles
+system.cpu.icache.ReadReq_miss_latency       18865000                       # number of ReadReq miss cycles
 system.cpu.icache.ReadReq_miss_rate          0.357068                       # miss rate for ReadReq accesses
 system.cpu.icache.ReadReq_misses                  341                       # number of ReadReq misses
 system.cpu.icache.ReadReq_mshr_hits                40                       # number of ReadReq MSHR hits
@@ -147,10 +147,10 @@ system.cpu.icache.blocked_cycles::no_mshrs            0                       #
 system.cpu.icache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.icache.cache_copies                      0                       # number of cache copies performed
 system.cpu.icache.demand_accesses                 955                       # number of demand (read+write) accesses
-system.cpu.icache.demand_avg_miss_latency 55326.979472                       # average overall miss latency
+system.cpu.icache.demand_avg_miss_latency 55322.580645                       # average overall miss latency
 system.cpu.icache.demand_avg_mshr_miss_latency 53094.684385                       # average overall mshr miss latency
 system.cpu.icache.demand_hits                     614                       # number of demand (read+write) hits
-system.cpu.icache.demand_miss_latency        18866500                       # number of demand (read+write) miss cycles
+system.cpu.icache.demand_miss_latency        18865000                       # number of demand (read+write) miss cycles
 system.cpu.icache.demand_miss_rate           0.357068                       # miss rate for demand accesses
 system.cpu.icache.demand_misses                   341                       # number of demand (read+write) misses
 system.cpu.icache.demand_mshr_hits                 40                       # number of demand (read+write) MSHR hits
@@ -160,14 +160,14 @@ system.cpu.icache.demand_mshr_misses              301                       # nu
 system.cpu.icache.fast_writes                       0                       # number of fast writes performed
 system.cpu.icache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.icache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.icache.occ_%::0                   0.066887                       # Average percentage of cache occupancy
-system.cpu.icache.occ_blocks::0            136.984147                       # Average occupied blocks per context
+system.cpu.icache.occ_%::0                   0.066877                       # Average percentage of cache occupancy
+system.cpu.icache.occ_blocks::0            136.964505                       # Average occupied blocks per context
 system.cpu.icache.overall_accesses                955                       # number of overall (read+write) accesses
-system.cpu.icache.overall_avg_miss_latency 55326.979472                       # average overall miss latency
+system.cpu.icache.overall_avg_miss_latency 55322.580645                       # average overall miss latency
 system.cpu.icache.overall_avg_mshr_miss_latency 53094.684385                       # average overall mshr miss latency
 system.cpu.icache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
 system.cpu.icache.overall_hits                    614                       # number of overall hits
-system.cpu.icache.overall_miss_latency       18866500                       # number of overall miss cycles
+system.cpu.icache.overall_miss_latency       18865000                       # number of overall miss cycles
 system.cpu.icache.overall_miss_rate          0.357068                       # miss rate for overall accesses
 system.cpu.icache.overall_misses                  341                       # number of overall misses
 system.cpu.icache.overall_mshr_hits                40                       # number of overall MSHR hits
@@ -179,13 +179,13 @@ system.cpu.icache.overall_mshr_uncacheable_misses            0
 system.cpu.icache.replacements                      0                       # number of replacements
 system.cpu.icache.sampled_refs                    300                       # Sample count of references to valid blocks.
 system.cpu.icache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.icache.tagsinuse                136.984147                       # Cycle average of tags in use
+system.cpu.icache.tagsinuse                136.964505                       # Cycle average of tags in use
 system.cpu.icache.total_refs                      614                       # Total number of references to valid blocks.
 system.cpu.icache.warmup_cycle                      0                       # Cycle when the warmup percentage was hit.
 system.cpu.icache.writebacks                        0                       # number of writebacks
-system.cpu.idleCycles                           37424                       # Number of cycles cpu's stages were not processed
-system.cpu.ipc                               0.143658                       # IPC: Instructions Per Cycle (Per-Thread)
-system.cpu.ipc_total                         0.143658                       # IPC: Total IPC of All Threads
+system.cpu.idleCycles                           37422                       # Number of cycles cpu's stages were not processed
+system.cpu.ipc                               0.143620                       # IPC: Instructions Per Cycle (Per-Thread)
+system.cpu.ipc_total                         0.143620                       # IPC: Total IPC of All Threads
 system.cpu.itb.data_accesses                        0                       # DTB accesses
 system.cpu.itb.data_acv                             0                       # DTB access violations
 system.cpu.itb.data_hits                            0                       # DTB hits
@@ -243,8 +243,8 @@ system.cpu.l2cache.demand_mshr_misses             468                       # nu
 system.cpu.l2cache.fast_writes                      0                       # number of fast writes performed
 system.cpu.l2cache.mshr_cap_events                  0                       # number of times MSHR cap was activated
 system.cpu.l2cache.no_allocate_misses               0                       # Number of misses that were no-allocate
-system.cpu.l2cache.occ_%::0                  0.005889                       # Average percentage of cache occupancy
-system.cpu.l2cache.occ_blocks::0           192.975400                       # Average occupied blocks per context
+system.cpu.l2cache.occ_%::0                  0.005888                       # Average percentage of cache occupancy
+system.cpu.l2cache.occ_blocks::0           192.950109                       # Average occupied blocks per context
 system.cpu.l2cache.overall_accesses               469                       # number of overall (read+write) accesses
 system.cpu.l2cache.overall_avg_miss_latency 52243.589744                       # average overall miss latency
 system.cpu.l2cache.overall_avg_mshr_miss_latency 40087.606838                       # average overall mshr miss latency
@@ -262,34 +262,34 @@ system.cpu.l2cache.overall_mshr_uncacheable_misses            0
 system.cpu.l2cache.replacements                     0                       # number of replacements
 system.cpu.l2cache.sampled_refs                   394                       # Sample count of references to valid blocks.
 system.cpu.l2cache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.l2cache.tagsinuse               192.975400                       # Cycle average of tags in use
+system.cpu.l2cache.tagsinuse               192.950109                       # Cycle average of tags in use
 system.cpu.l2cache.total_refs                       1                       # Total number of references to valid blocks.
 system.cpu.l2cache.warmup_cycle                     0                       # Cycle when the warmup percentage was hit.
 system.cpu.l2cache.writebacks                       0                       # number of writebacks
-system.cpu.numCycles                            44578                       # number of cpu cycles simulated
+system.cpu.numCycles                            44590                       # number of cpu cycles simulated
 system.cpu.numWorkItemsCompleted                    0                       # number of work items this cpu completed
 system.cpu.numWorkItemsStarted                      0                       # number of work items this cpu started
-system.cpu.runCycles                             7154                       # Number of cycles cpu stages are processed.
+system.cpu.runCycles                             7168                       # Number of cycles cpu stages are processed.
 system.cpu.smtCommittedInsts                        0                       # Number of SMT Instructions Simulated (Per-Thread)
 system.cpu.smtCycles                                0                       # Total number of cycles that the CPU was in SMT-mode
 system.cpu.smt_cpi                           no_value                       # CPI: Total SMT-CPI
 system.cpu.smt_ipc                           no_value                       # IPC: Total SMT-IPC
-system.cpu.stage-0.idleCycles                   39836                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-0.runCycles                     4742                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-0.utilization              10.637534                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-1.idleCycles                   40747                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-1.runCycles                     3831                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-1.utilization               8.593925                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-2.idleCycles                   40491                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-2.runCycles                     4087                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-2.utilization               9.168200                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-3.idleCycles                   43168                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-0.idleCycles                   39847                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-0.runCycles                     4743                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-0.utilization              10.636914                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-1.idleCycles                   40758                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-1.runCycles                     3832                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-1.utilization               8.593855                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-2.idleCycles                   40488                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-2.runCycles                     4102                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-2.utilization               9.199372                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-3.idleCycles                   43180                       # Number of cycles 0 instructions are processed.
 system.cpu.stage-3.runCycles                     1410                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-3.utilization               3.162995                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-4.idleCycles                   40170                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-4.runCycles                     4408                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-4.utilization               9.888286                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.threadCycles                         11304                       # Total Number of Cycles A Thread Was Active in CPU (Per-Thread)
+system.cpu.stage-3.utilization               3.162144                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-4.idleCycles                   40181                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-4.runCycles                     4409                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-4.utilization               9.887867                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.threadCycles                         11319                       # Total Number of Cycles A Thread Was Active in CPU (Per-Thread)
 system.cpu.timesIdled                             425                       # Number of times that the entire CPU went into an idle state and unscheduled itself
 system.cpu.workload.PROG:num_syscalls              17                       # Number of system calls
 
diff --git a/tests/quick/00.hello/ref/mips/linux/inorder-timing/simout b/tests/quick/00.hello/ref/mips/linux/inorder-timing/simout
index 2ad70ea487..41a76071a3 100755
--- a/tests/quick/00.hello/ref/mips/linux/inorder-timing/simout
+++ b/tests/quick/00.hello/ref/mips/linux/inorder-timing/simout
@@ -5,13 +5,13 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 01:55:51
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 01:56:02
-M5 executing on burrito
+M5 compiled Feb 18 2011 18:35:15
+M5 revision Unknown
+M5 started Feb 18 2011 18:52:36
+M5 executing on m55-001.pool
 command line: build/MIPS_SE/m5.fast -d build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/inorder-timing -re tests/run.py build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/inorder-timing
 Global frequency set at 1000000000000 ticks per second
 info: Entering event queue @ 0.  Starting simulation...
 info: Increasing stack size by one page.
 Hello World!
-Exiting @ tick 21534000 because target called exit()
+Exiting @ tick 21538000 because target called exit()
diff --git a/tests/quick/00.hello/ref/mips/linux/inorder-timing/stats.txt b/tests/quick/00.hello/ref/mips/linux/inorder-timing/stats.txt
index 1e86aa8627..ac0fe4aec4 100644
--- a/tests/quick/00.hello/ref/mips/linux/inorder-timing/stats.txt
+++ b/tests/quick/00.hello/ref/mips/linux/inorder-timing/stats.txt
@@ -1,37 +1,37 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                  32668                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 224608                       # Number of bytes of host memory used
-host_seconds                                     0.18                       # Real time elapsed on the host
-host_tick_rate                              120542676                       # Simulator tick rate (ticks/s)
+host_inst_rate                                  94112                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 191540                       # Number of bytes of host memory used
+host_seconds                                     0.06                       # Real time elapsed on the host
+host_tick_rate                              346291258                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                        5827                       # Number of instructions simulated
 sim_seconds                                  0.000022                       # Number of seconds simulated
-sim_ticks                                    21534000                       # Number of ticks simulated
+sim_ticks                                    21538000                       # Number of ticks simulated
 system.cpu.AGEN-Unit.agens                       2404                       # Number of Address Generations
 system.cpu.Branch-Predictor.BTBHitPct       14.054054                       # BTB Hit Percentage
 system.cpu.Branch-Predictor.BTBHits                26                       # Number of BTB hits
 system.cpu.Branch-Predictor.BTBLookups            185                       # Number of BTB lookups
 system.cpu.Branch-Predictor.RASInCorrect           30                       # Number of incorrect RAS predictions.
-system.cpu.Branch-Predictor.condIncorrect          845                       # Number of conditional branches incorrect
+system.cpu.Branch-Predictor.condIncorrect          844                       # Number of conditional branches incorrect
 system.cpu.Branch-Predictor.condPredicted          778                       # Number of conditional branches predicted
 system.cpu.Branch-Predictor.lookups              1066                       # Number of BP lookups
 system.cpu.Branch-Predictor.predictedNotTaken          949                       # Number of Branches Predicted As Not Taken (False).
 system.cpu.Branch-Predictor.predictedTaken          117                       # Number of Branches Predicted As Taken (True).
 system.cpu.Branch-Predictor.usedRAS                86                       # Number of times the RAS was used to get a target.
-system.cpu.Execution-Unit.executions             3963                       # Number of Instructions Executed.
-system.cpu.Execution-Unit.mispredictPct     92.148310                       # Percentage of Incorrect Branches Predicts
-system.cpu.Execution-Unit.mispredicted            845                       # Number of Branches Incorrectly Predicted
+system.cpu.Execution-Unit.executions             3261                       # Number of Instructions Executed.
+system.cpu.Execution-Unit.mispredictPct     92.139738                       # Percentage of Incorrect Branches Predicts
+system.cpu.Execution-Unit.mispredicted            844                       # Number of Branches Incorrectly Predicted
 system.cpu.Execution-Unit.predicted                72                       # Number of Branches Incorrectly Predicted
-system.cpu.Execution-Unit.predictedNotTakenIncorrect          813                       # Number of Branches Incorrectly Predicted As Not Taken).
+system.cpu.Execution-Unit.predictedNotTakenIncorrect          812                       # Number of Branches Incorrectly Predicted As Not Taken).
 system.cpu.Execution-Unit.predictedTakenIncorrect           32                       # Number of Branches Incorrectly Predicted As Taken.
 system.cpu.Mult-Div-Unit.divides                    1                       # Number of Divide Operations Executed
 system.cpu.Mult-Div-Unit.multiplies                 3                       # Number of Multipy Operations Executed
-system.cpu.RegFile-Manager.regFileAccesses        10006                       # Number of Total Accesses (Read+Write) to the Register File
-system.cpu.RegFile-Manager.regFileReads          6596                       # Number of Reads from Register File
+system.cpu.RegFile-Manager.regFileAccesses        10004                       # Number of Total Accesses (Read+Write) to the Register File
+system.cpu.RegFile-Manager.regFileReads          6594                       # Number of Reads from Register File
 system.cpu.RegFile-Manager.regFileWrites         3410                       # Number of Writes to Register File
 system.cpu.RegFile-Manager.regForwards           1378                       # Number of Registers Read Through Forwarding Logic
-system.cpu.activity                         13.935777                       # Percentage of cycles cpu is active
+system.cpu.activity                         13.954082                       # Percentage of cycles cpu is active
 system.cpu.comBranches                            916                       # Number of Branches instructions committed
 system.cpu.comFloats                                0                       # Number of Floating Point instructions committed
 system.cpu.comInts                               2155                       # Number of Integer instructions committed
@@ -42,17 +42,17 @@ system.cpu.comStores                              925                       # Nu
 system.cpu.committedInsts                        5827                       # Number of Instructions Simulated (Per-Thread)
 system.cpu.committedInsts_total                  5827                       # Number of Instructions Simulated (Total)
 system.cpu.contextSwitches                          1                       # Number of context switches
-system.cpu.cpi                               7.391282                       # CPI: Cycles Per Instruction (Per-Thread)
-system.cpu.cpi_total                         7.391282                       # CPI: Total CPI of All Threads
+system.cpu.cpi                               7.392655                       # CPI: Cycles Per Instruction (Per-Thread)
+system.cpu.cpi_total                         7.392655                       # CPI: Total CPI of All Threads
 system.cpu.dcache.ReadReq_accesses               1164                       # number of ReadReq accesses(hits+misses)
-system.cpu.dcache.ReadReq_avg_miss_latency 56681.818182                       # average ReadReq miss latency
-system.cpu.dcache.ReadReq_avg_mshr_miss_latency 53683.908046                       # average ReadReq mshr miss latency
+system.cpu.dcache.ReadReq_avg_miss_latency 56676.136364                       # average ReadReq miss latency
+system.cpu.dcache.ReadReq_avg_mshr_miss_latency 53678.160920                       # average ReadReq mshr miss latency
 system.cpu.dcache.ReadReq_hits                   1076                       # number of ReadReq hits
-system.cpu.dcache.ReadReq_miss_latency        4988000                       # number of ReadReq miss cycles
+system.cpu.dcache.ReadReq_miss_latency        4987500                       # number of ReadReq miss cycles
 system.cpu.dcache.ReadReq_miss_rate          0.075601                       # miss rate for ReadReq accesses
 system.cpu.dcache.ReadReq_misses                   88                       # number of ReadReq misses
 system.cpu.dcache.ReadReq_mshr_hits                 1                       # number of ReadReq MSHR hits
-system.cpu.dcache.ReadReq_mshr_miss_latency      4670500                       # number of ReadReq MSHR miss cycles
+system.cpu.dcache.ReadReq_mshr_miss_latency      4670000                       # number of ReadReq MSHR miss cycles
 system.cpu.dcache.ReadReq_mshr_miss_rate     0.074742                       # mshr miss rate for ReadReq accesses
 system.cpu.dcache.ReadReq_mshr_misses              87                       # number of ReadReq MSHR misses
 system.cpu.dcache.WriteReq_accesses               925                       # number of WriteReq accesses(hits+misses)
@@ -75,31 +75,31 @@ system.cpu.dcache.blocked_cycles::no_mshrs            0                       #
 system.cpu.dcache.blocked_cycles::no_targets       265500                       # number of cycles access was blocked
 system.cpu.dcache.cache_copies                      0                       # number of cache copies performed
 system.cpu.dcache.demand_accesses                2089                       # number of demand (read+write) accesses
-system.cpu.dcache.demand_avg_miss_latency 56298.342541                       # average overall miss latency
-system.cpu.dcache.demand_avg_mshr_miss_latency 53666.666667                       # average overall mshr miss latency
+system.cpu.dcache.demand_avg_miss_latency 56295.580110                       # average overall miss latency
+system.cpu.dcache.demand_avg_mshr_miss_latency 53663.043478                       # average overall mshr miss latency
 system.cpu.dcache.demand_hits                    1908                       # number of demand (read+write) hits
-system.cpu.dcache.demand_miss_latency        10190000                       # number of demand (read+write) miss cycles
+system.cpu.dcache.demand_miss_latency        10189500                       # number of demand (read+write) miss cycles
 system.cpu.dcache.demand_miss_rate           0.086644                       # miss rate for demand accesses
 system.cpu.dcache.demand_misses                   181                       # number of demand (read+write) misses
 system.cpu.dcache.demand_mshr_hits                 43                       # number of demand (read+write) MSHR hits
-system.cpu.dcache.demand_mshr_miss_latency      7406000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.dcache.demand_mshr_miss_latency      7405500                       # number of demand (read+write) MSHR miss cycles
 system.cpu.dcache.demand_mshr_miss_rate      0.066060                       # mshr miss rate for demand accesses
 system.cpu.dcache.demand_mshr_misses              138                       # number of demand (read+write) MSHR misses
 system.cpu.dcache.fast_writes                       0                       # number of fast writes performed
 system.cpu.dcache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.dcache.no_allocate_misses                0                       # Number of misses that were no-allocate
 system.cpu.dcache.occ_%::0                   0.021745                       # Average percentage of cache occupancy
-system.cpu.dcache.occ_blocks::0             89.066455                       # Average occupied blocks per context
+system.cpu.dcache.occ_blocks::0             89.067186                       # Average occupied blocks per context
 system.cpu.dcache.overall_accesses               2089                       # number of overall (read+write) accesses
-system.cpu.dcache.overall_avg_miss_latency 56298.342541                       # average overall miss latency
-system.cpu.dcache.overall_avg_mshr_miss_latency 53666.666667                       # average overall mshr miss latency
+system.cpu.dcache.overall_avg_miss_latency 56295.580110                       # average overall miss latency
+system.cpu.dcache.overall_avg_mshr_miss_latency 53663.043478                       # average overall mshr miss latency
 system.cpu.dcache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
 system.cpu.dcache.overall_hits                   1908                       # number of overall hits
-system.cpu.dcache.overall_miss_latency       10190000                       # number of overall miss cycles
+system.cpu.dcache.overall_miss_latency       10189500                       # number of overall miss cycles
 system.cpu.dcache.overall_miss_rate          0.086644                       # miss rate for overall accesses
 system.cpu.dcache.overall_misses                  181                       # number of overall misses
 system.cpu.dcache.overall_mshr_hits                43                       # number of overall MSHR hits
-system.cpu.dcache.overall_mshr_miss_latency      7406000                       # number of overall MSHR miss cycles
+system.cpu.dcache.overall_mshr_miss_latency      7405500                       # number of overall MSHR miss cycles
 system.cpu.dcache.overall_mshr_miss_rate     0.066060                       # mshr miss rate for overall accesses
 system.cpu.dcache.overall_mshr_misses             138                       # number of overall MSHR misses
 system.cpu.dcache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
@@ -107,7 +107,7 @@ system.cpu.dcache.overall_mshr_uncacheable_misses            0
 system.cpu.dcache.replacements                      0                       # number of replacements
 system.cpu.dcache.sampled_refs                    138                       # Sample count of references to valid blocks.
 system.cpu.dcache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.dcache.tagsinuse                 89.066455                       # Cycle average of tags in use
+system.cpu.dcache.tagsinuse                 89.067186                       # Cycle average of tags in use
 system.cpu.dcache.total_refs                     1908                       # Total number of references to valid blocks.
 system.cpu.dcache.warmup_cycle                      0                       # Cycle when the warmup percentage was hit.
 system.cpu.dcache.writebacks                        0                       # number of writebacks
@@ -121,14 +121,14 @@ system.cpu.dtb.write_accesses                       0                       # DT
 system.cpu.dtb.write_hits                           0                       # DTB write hits
 system.cpu.dtb.write_misses                         0                       # DTB write misses
 system.cpu.icache.ReadReq_accesses                853                       # number of ReadReq accesses(hits+misses)
-system.cpu.icache.ReadReq_avg_miss_latency 55526.246719                       # average ReadReq miss latency
-system.cpu.icache.ReadReq_avg_mshr_miss_latency 53153.605016                       # average ReadReq mshr miss latency
+system.cpu.icache.ReadReq_avg_miss_latency 55527.559055                       # average ReadReq miss latency
+system.cpu.icache.ReadReq_avg_mshr_miss_latency 53156.739812                       # average ReadReq mshr miss latency
 system.cpu.icache.ReadReq_hits                    472                       # number of ReadReq hits
-system.cpu.icache.ReadReq_miss_latency       21155500                       # number of ReadReq miss cycles
+system.cpu.icache.ReadReq_miss_latency       21156000                       # number of ReadReq miss cycles
 system.cpu.icache.ReadReq_miss_rate          0.446659                       # miss rate for ReadReq accesses
 system.cpu.icache.ReadReq_misses                  381                       # number of ReadReq misses
 system.cpu.icache.ReadReq_mshr_hits                62                       # number of ReadReq MSHR hits
-system.cpu.icache.ReadReq_mshr_miss_latency     16956000                       # number of ReadReq MSHR miss cycles
+system.cpu.icache.ReadReq_mshr_miss_latency     16957000                       # number of ReadReq MSHR miss cycles
 system.cpu.icache.ReadReq_mshr_miss_rate     0.373974                       # mshr miss rate for ReadReq accesses
 system.cpu.icache.ReadReq_mshr_misses             319                       # number of ReadReq MSHR misses
 system.cpu.icache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
@@ -140,31 +140,31 @@ system.cpu.icache.blocked_cycles::no_mshrs            0                       #
 system.cpu.icache.blocked_cycles::no_targets        62000                       # number of cycles access was blocked
 system.cpu.icache.cache_copies                      0                       # number of cache copies performed
 system.cpu.icache.demand_accesses                 853                       # number of demand (read+write) accesses
-system.cpu.icache.demand_avg_miss_latency 55526.246719                       # average overall miss latency
-system.cpu.icache.demand_avg_mshr_miss_latency 53153.605016                       # average overall mshr miss latency
+system.cpu.icache.demand_avg_miss_latency 55527.559055                       # average overall miss latency
+system.cpu.icache.demand_avg_mshr_miss_latency 53156.739812                       # average overall mshr miss latency
 system.cpu.icache.demand_hits                     472                       # number of demand (read+write) hits
-system.cpu.icache.demand_miss_latency        21155500                       # number of demand (read+write) miss cycles
+system.cpu.icache.demand_miss_latency        21156000                       # number of demand (read+write) miss cycles
 system.cpu.icache.demand_miss_rate           0.446659                       # miss rate for demand accesses
 system.cpu.icache.demand_misses                   381                       # number of demand (read+write) misses
 system.cpu.icache.demand_mshr_hits                 62                       # number of demand (read+write) MSHR hits
-system.cpu.icache.demand_mshr_miss_latency     16956000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.icache.demand_mshr_miss_latency     16957000                       # number of demand (read+write) MSHR miss cycles
 system.cpu.icache.demand_mshr_miss_rate      0.373974                       # mshr miss rate for demand accesses
 system.cpu.icache.demand_mshr_misses              319                       # number of demand (read+write) MSHR misses
 system.cpu.icache.fast_writes                       0                       # number of fast writes performed
 system.cpu.icache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.icache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.icache.occ_%::0                   0.070944                       # Average percentage of cache occupancy
-system.cpu.icache.occ_blocks::0            145.293265                       # Average occupied blocks per context
+system.cpu.icache.occ_%::0                   0.070945                       # Average percentage of cache occupancy
+system.cpu.icache.occ_blocks::0            145.295903                       # Average occupied blocks per context
 system.cpu.icache.overall_accesses                853                       # number of overall (read+write) accesses
-system.cpu.icache.overall_avg_miss_latency 55526.246719                       # average overall miss latency
-system.cpu.icache.overall_avg_mshr_miss_latency 53153.605016                       # average overall mshr miss latency
+system.cpu.icache.overall_avg_miss_latency 55527.559055                       # average overall miss latency
+system.cpu.icache.overall_avg_mshr_miss_latency 53156.739812                       # average overall mshr miss latency
 system.cpu.icache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
 system.cpu.icache.overall_hits                    472                       # number of overall hits
-system.cpu.icache.overall_miss_latency       21155500                       # number of overall miss cycles
+system.cpu.icache.overall_miss_latency       21156000                       # number of overall miss cycles
 system.cpu.icache.overall_miss_rate          0.446659                       # miss rate for overall accesses
 system.cpu.icache.overall_misses                  381                       # number of overall misses
 system.cpu.icache.overall_mshr_hits                62                       # number of overall MSHR hits
-system.cpu.icache.overall_mshr_miss_latency     16956000                       # number of overall MSHR miss cycles
+system.cpu.icache.overall_mshr_miss_latency     16957000                       # number of overall MSHR miss cycles
 system.cpu.icache.overall_mshr_miss_rate     0.373974                       # mshr miss rate for overall accesses
 system.cpu.icache.overall_mshr_misses             319                       # number of overall MSHR misses
 system.cpu.icache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
@@ -172,13 +172,13 @@ system.cpu.icache.overall_mshr_uncacheable_misses            0
 system.cpu.icache.replacements                     13                       # number of replacements
 system.cpu.icache.sampled_refs                    319                       # Sample count of references to valid blocks.
 system.cpu.icache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.icache.tagsinuse                145.293265                       # Cycle average of tags in use
+system.cpu.icache.tagsinuse                145.295903                       # Cycle average of tags in use
 system.cpu.icache.total_refs                      472                       # Total number of references to valid blocks.
 system.cpu.icache.warmup_cycle                      0                       # Cycle when the warmup percentage was hit.
 system.cpu.icache.writebacks                        0                       # number of writebacks
-system.cpu.idleCycles                           37067                       # Number of cycles cpu's stages were not processed
-system.cpu.ipc                               0.135295                       # IPC: Instructions Per Cycle (Per-Thread)
-system.cpu.ipc_total                         0.135295                       # IPC: Total IPC of All Threads
+system.cpu.idleCycles                           37066                       # Number of cycles cpu's stages were not processed
+system.cpu.ipc                               0.135269                       # IPC: Instructions Per Cycle (Per-Thread)
+system.cpu.ipc_total                         0.135269                       # IPC: Total IPC of All Threads
 system.cpu.itb.accesses                             0                       # DTB accesses
 system.cpu.itb.hits                                 0                       # DTB hits
 system.cpu.itb.misses                               0                       # DTB misses
@@ -198,13 +198,13 @@ system.cpu.l2cache.ReadExReq_mshr_miss_latency      2052000
 system.cpu.l2cache.ReadExReq_mshr_miss_rate            1                       # mshr miss rate for ReadExReq accesses
 system.cpu.l2cache.ReadExReq_mshr_misses           51                       # number of ReadExReq MSHR misses
 system.cpu.l2cache.ReadReq_accesses               406                       # number of ReadReq accesses(hits+misses)
-system.cpu.l2cache.ReadReq_avg_miss_latency 52355.198020                       # average ReadReq miss latency
-system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 40152.227723                       # average ReadReq mshr miss latency
+system.cpu.l2cache.ReadReq_avg_miss_latency 52357.673267                       # average ReadReq miss latency
+system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 40153.465347                       # average ReadReq mshr miss latency
 system.cpu.l2cache.ReadReq_hits                     2                       # number of ReadReq hits
-system.cpu.l2cache.ReadReq_miss_latency      21151500                       # number of ReadReq miss cycles
+system.cpu.l2cache.ReadReq_miss_latency      21152500                       # number of ReadReq miss cycles
 system.cpu.l2cache.ReadReq_miss_rate         0.995074                       # miss rate for ReadReq accesses
 system.cpu.l2cache.ReadReq_misses                 404                       # number of ReadReq misses
-system.cpu.l2cache.ReadReq_mshr_miss_latency     16221500                       # number of ReadReq MSHR miss cycles
+system.cpu.l2cache.ReadReq_mshr_miss_latency     16222000                       # number of ReadReq MSHR miss cycles
 system.cpu.l2cache.ReadReq_mshr_miss_rate     0.995074                       # mshr miss rate for ReadReq accesses
 system.cpu.l2cache.ReadReq_mshr_misses            404                       # number of ReadReq MSHR misses
 system.cpu.l2cache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
@@ -216,31 +216,31 @@ system.cpu.l2cache.blocked_cycles::no_mshrs            0                       #
 system.cpu.l2cache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.l2cache.cache_copies                     0                       # number of cache copies performed
 system.cpu.l2cache.demand_accesses                457                       # number of demand (read+write) accesses
-system.cpu.l2cache.demand_avg_miss_latency 52368.131868                       # average overall miss latency
-system.cpu.l2cache.demand_avg_mshr_miss_latency 40161.538462                       # average overall mshr miss latency
+system.cpu.l2cache.demand_avg_miss_latency 52370.329670                       # average overall miss latency
+system.cpu.l2cache.demand_avg_mshr_miss_latency 40162.637363                       # average overall mshr miss latency
 system.cpu.l2cache.demand_hits                      2                       # number of demand (read+write) hits
-system.cpu.l2cache.demand_miss_latency       23827500                       # number of demand (read+write) miss cycles
+system.cpu.l2cache.demand_miss_latency       23828500                       # number of demand (read+write) miss cycles
 system.cpu.l2cache.demand_miss_rate          0.995624                       # miss rate for demand accesses
 system.cpu.l2cache.demand_misses                  455                       # number of demand (read+write) misses
 system.cpu.l2cache.demand_mshr_hits                 0                       # number of demand (read+write) MSHR hits
-system.cpu.l2cache.demand_mshr_miss_latency     18273500                       # number of demand (read+write) MSHR miss cycles
+system.cpu.l2cache.demand_mshr_miss_latency     18274000                       # number of demand (read+write) MSHR miss cycles
 system.cpu.l2cache.demand_mshr_miss_rate     0.995624                       # mshr miss rate for demand accesses
 system.cpu.l2cache.demand_mshr_misses             455                       # number of demand (read+write) MSHR misses
 system.cpu.l2cache.fast_writes                      0                       # number of fast writes performed
 system.cpu.l2cache.mshr_cap_events                  0                       # number of times MSHR cap was activated
 system.cpu.l2cache.no_allocate_misses               0                       # Number of misses that were no-allocate
 system.cpu.l2cache.occ_%::0                  0.006169                       # Average percentage of cache occupancy
-system.cpu.l2cache.occ_blocks::0           202.148379                       # Average occupied blocks per context
+system.cpu.l2cache.occ_blocks::0           202.151439                       # Average occupied blocks per context
 system.cpu.l2cache.overall_accesses               457                       # number of overall (read+write) accesses
-system.cpu.l2cache.overall_avg_miss_latency 52368.131868                       # average overall miss latency
-system.cpu.l2cache.overall_avg_mshr_miss_latency 40161.538462                       # average overall mshr miss latency
+system.cpu.l2cache.overall_avg_miss_latency 52370.329670                       # average overall miss latency
+system.cpu.l2cache.overall_avg_mshr_miss_latency 40162.637363                       # average overall mshr miss latency
 system.cpu.l2cache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
 system.cpu.l2cache.overall_hits                     2                       # number of overall hits
-system.cpu.l2cache.overall_miss_latency      23827500                       # number of overall miss cycles
+system.cpu.l2cache.overall_miss_latency      23828500                       # number of overall miss cycles
 system.cpu.l2cache.overall_miss_rate         0.995624                       # miss rate for overall accesses
 system.cpu.l2cache.overall_misses                 455                       # number of overall misses
 system.cpu.l2cache.overall_mshr_hits                0                       # number of overall MSHR hits
-system.cpu.l2cache.overall_mshr_miss_latency     18273500                       # number of overall MSHR miss cycles
+system.cpu.l2cache.overall_mshr_miss_latency     18274000                       # number of overall MSHR miss cycles
 system.cpu.l2cache.overall_mshr_miss_rate     0.995624                       # mshr miss rate for overall accesses
 system.cpu.l2cache.overall_mshr_misses            455                       # number of overall MSHR misses
 system.cpu.l2cache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
@@ -248,34 +248,34 @@ system.cpu.l2cache.overall_mshr_uncacheable_misses            0
 system.cpu.l2cache.replacements                     0                       # number of replacements
 system.cpu.l2cache.sampled_refs                   404                       # Sample count of references to valid blocks.
 system.cpu.l2cache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.l2cache.tagsinuse               202.148379                       # Cycle average of tags in use
+system.cpu.l2cache.tagsinuse               202.151439                       # Cycle average of tags in use
 system.cpu.l2cache.total_refs                       2                       # Total number of references to valid blocks.
 system.cpu.l2cache.warmup_cycle                     0                       # Cycle when the warmup percentage was hit.
 system.cpu.l2cache.writebacks                       0                       # number of writebacks
-system.cpu.numCycles                            43069                       # number of cpu cycles simulated
+system.cpu.numCycles                            43077                       # number of cpu cycles simulated
 system.cpu.numWorkItemsCompleted                    0                       # number of work items this cpu completed
 system.cpu.numWorkItemsStarted                      0                       # number of work items this cpu started
-system.cpu.runCycles                             6002                       # Number of cycles cpu stages are processed.
+system.cpu.runCycles                             6011                       # Number of cycles cpu stages are processed.
 system.cpu.smtCommittedInsts                        0                       # Number of SMT Instructions Simulated (Per-Thread)
 system.cpu.smtCycles                                0                       # Total number of cycles that the CPU was in SMT-mode
 system.cpu.smt_cpi                           no_value                       # CPI: Total SMT-CPI
 system.cpu.smt_ipc                           no_value                       # IPC: Total SMT-IPC
-system.cpu.stage-0.idleCycles                   39196                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-0.runCycles                     3873                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-0.utilization               8.992547                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-1.idleCycles                   40152                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-1.runCycles                     2917                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-1.utilization               6.772853                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-2.idleCycles                   40243                       # Number of cycles 0 instructions are processed.
-system.cpu.stage-2.runCycles                     2826                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-2.utilization               6.561564                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-3.idleCycles                   41749                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-0.idleCycles                   39203                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-0.runCycles                     3874                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-0.utilization               8.993198                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-1.idleCycles                   40159                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-1.runCycles                     2918                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-1.utilization               6.773916                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-2.idleCycles                   40245                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-2.runCycles                     2832                       # Number of cycles 1+ instructions are processed.
+system.cpu.stage-2.utilization               6.574274                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-3.idleCycles                   41757                       # Number of cycles 0 instructions are processed.
 system.cpu.stage-3.runCycles                     1320                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-3.utilization               3.064849                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.stage-4.idleCycles                   39866                       # Number of cycles 0 instructions are processed.
+system.cpu.stage-3.utilization               3.064280                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.stage-4.idleCycles                   39874                       # Number of cycles 0 instructions are processed.
 system.cpu.stage-4.runCycles                     3203                       # Number of cycles 1+ instructions are processed.
-system.cpu.stage-4.utilization               7.436904                       # Percentage of cycles stage was utilized (processing insts).
-system.cpu.threadCycles                         10184                       # Total Number of Cycles A Thread Was Active in CPU (Per-Thread)
+system.cpu.stage-4.utilization               7.435522                       # Percentage of cycles stage was utilized (processing insts).
+system.cpu.threadCycles                         10193                       # Total Number of Cycles A Thread Was Active in CPU (Per-Thread)
 system.cpu.timesIdled                             427                       # Number of times that the entire CPU went into an idle state and unscheduled itself
 system.cpu.workload.PROG:num_syscalls               8                       # Number of system calls
 
diff --git a/tests/quick/00.hello/ref/x86/linux/o3-timing/simout b/tests/quick/00.hello/ref/x86/linux/o3-timing/simout
index 0767b97775..1943466e8b 100755
--- a/tests/quick/00.hello/ref/x86/linux/o3-timing/simout
+++ b/tests/quick/00.hello/ref/x86/linux/o3-timing/simout
@@ -5,12 +5,13 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:32:13
+M5 compiled Feb 12 2011 02:22:23
+M5 revision 5e76f9de6972 7961 default qtip tip x86branchdetectstats.patch
+M5 started Feb 12 2011 02:22:27
 M5 executing on burrito
-command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/quick/00.hello/x86/linux/o3-timing -re tests/run.py build/X86_SE/tests/fast/quick/00.hello/x86/linux/o3-timing
+command line: build/X86_SE/m5.opt -d build/X86_SE/tests/opt/quick/00.hello/x86/linux/o3-timing -re tests/run.py build/X86_SE/tests/opt/quick/00.hello/x86/linux/o3-timing
 Global frequency set at 1000000000000 ticks per second
 info: Entering event queue @ 0.  Starting simulation...
+info: Increasing stack size by one page.
 Hello world!
-Exiting @ tick 13766000 because target called exit()
+Exiting @ tick 11421500 because target called exit()
diff --git a/tests/quick/00.hello/ref/x86/linux/o3-timing/stats.txt b/tests/quick/00.hello/ref/x86/linux/o3-timing/stats.txt
index 182e72d256..c2dfaa3ff4 100644
--- a/tests/quick/00.hello/ref/x86/linux/o3-timing/stats.txt
+++ b/tests/quick/00.hello/ref/x86/linux/o3-timing/stats.txt
@@ -1,41 +1,41 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                  47133                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 227692                       # Number of bytes of host memory used
+host_inst_rate                                  47598                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 231896                       # Number of bytes of host memory used
 host_seconds                                     0.21                       # Real time elapsed on the host
-host_tick_rate                               66053082                       # Simulator tick rate (ticks/s)
+host_tick_rate                               55349277                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                        9809                       # Number of instructions simulated
-sim_seconds                                  0.000014                       # Number of seconds simulated
-sim_ticks                                    13766000                       # Number of ticks simulated
+sim_seconds                                  0.000011                       # Number of seconds simulated
+sim_ticks                                    11421500                       # Number of ticks simulated
 system.cpu.BPredUnit.BTBCorrect                     0                       # Number of correct BTB predictions (this stat may not work properly.
-system.cpu.BPredUnit.BTBHits                      772                       # Number of BTB hits
-system.cpu.BPredUnit.BTBLookups                  1892                       # Number of BTB lookups
+system.cpu.BPredUnit.BTBHits                      944                       # Number of BTB hits
+system.cpu.BPredUnit.BTBLookups                  2550                       # Number of BTB lookups
 system.cpu.BPredUnit.RASInCorrect                   0                       # Number of incorrect RAS predictions.
-system.cpu.BPredUnit.condIncorrect                458                       # Number of conditional branches incorrect
-system.cpu.BPredUnit.condPredicted               1920                       # Number of conditional branches predicted
-system.cpu.BPredUnit.lookups                     1920                       # Number of BP lookups
+system.cpu.BPredUnit.condIncorrect                485                       # Number of conditional branches incorrect
+system.cpu.BPredUnit.condPredicted               2777                       # Number of conditional branches predicted
+system.cpu.BPredUnit.lookups                     2777                       # Number of BP lookups
 system.cpu.BPredUnit.usedRAS                        0                       # Number of times the RAS was used to get a target.
 system.cpu.commit.COM:branches                   1214                       # Number of branches committed
-system.cpu.commit.COM:bw_lim_events                37                       # number cycles where commit BW limit reached
+system.cpu.commit.COM:bw_lim_events               139                       # number cycles where commit BW limit reached
 system.cpu.commit.COM:bw_limited                    0                       # number of insts not committed due to BW limits
-system.cpu.commit.COM:committed_per_cycle::samples        15124                       # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::mean     0.648572                       # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::stdev     1.100130                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::samples        11906                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::mean     0.823870                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::stdev     1.588166                       # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::underflows            0      0.00%      0.00% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::0         9612     63.55%     63.55% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::1         3088     20.42%     83.97% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::2         1220      8.07%     92.04% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::3          836      5.53%     97.57% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::4          232      1.53%     99.10% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::5           57      0.38%     99.48% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::6           30      0.20%     99.68% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::7           12      0.08%     99.76% # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::8           37      0.24%    100.00% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::0         8274     69.49%     69.49% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::1         1230     10.33%     79.83% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::2          588      4.94%     84.76% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::3          963      8.09%     92.85% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::4          395      3.32%     96.17% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::5          136      1.14%     97.31% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::6          125      1.05%     98.36% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::7           56      0.47%     98.83% # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::8          139      1.17%    100.00% # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::overflows            0      0.00%    100.00% # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::min_value            0                       # Number of insts commited each cycle
 system.cpu.commit.COM:committed_per_cycle::max_value            8                       # Number of insts commited each cycle
-system.cpu.commit.COM:committed_per_cycle::total        15124                       # Number of insts commited each cycle
+system.cpu.commit.COM:committed_per_cycle::total        11906                       # Number of insts commited each cycle
 system.cpu.commit.COM:count                      9809                       # Number of instructions committed
 system.cpu.commit.COM:fp_insts                      0                       # Number of committed floating point instructions.
 system.cpu.commit.COM:function_calls                0                       # Number of function calls committed.
@@ -44,415 +44,417 @@ system.cpu.commit.COM:loads                      1056                       # Nu
 system.cpu.commit.COM:membars                       0                       # Number of memory barriers committed
 system.cpu.commit.COM:refs                       1990                       # Number of memory references committed
 system.cpu.commit.COM:swp_count                     0                       # Number of s/w prefetches committed
-system.cpu.commit.branchMispredicts               462                       # The number of times a branch was mispredicted
+system.cpu.commit.branchMispredicts               485                       # The number of times a branch was mispredicted
 system.cpu.commit.commitCommittedInsts           9809                       # The number of committed instructions
 system.cpu.commit.commitNonSpecStalls              13                       # The number of times commit has been forced to stall to communicate backwards
-system.cpu.commit.commitSquashedInsts            3832                       # The number of squashed insts skipped by commit
+system.cpu.commit.commitSquashedInsts            9374                       # The number of squashed insts skipped by commit
 system.cpu.committedInsts                        9809                       # Number of Instructions Simulated
 system.cpu.committedInsts_total                  9809                       # Number of Instructions Simulated
-system.cpu.cpi                               2.806912                       # CPI: Cycles Per Instruction
-system.cpu.cpi_total                         2.806912                       # CPI: Total CPI of All Threads
-system.cpu.dcache.ReadReq_accesses               1244                       # number of ReadReq accesses(hits+misses)
-system.cpu.dcache.ReadReq_avg_miss_latency 37105.263158                       # average ReadReq miss latency
-system.cpu.dcache.ReadReq_avg_mshr_miss_latency 35048.387097                       # average ReadReq mshr miss latency
-system.cpu.dcache.ReadReq_hits                   1168                       # number of ReadReq hits
-system.cpu.dcache.ReadReq_miss_latency        2820000                       # number of ReadReq miss cycles
-system.cpu.dcache.ReadReq_miss_rate          0.061093                       # miss rate for ReadReq accesses
-system.cpu.dcache.ReadReq_misses                   76                       # number of ReadReq misses
-system.cpu.dcache.ReadReq_mshr_hits                14                       # number of ReadReq MSHR hits
-system.cpu.dcache.ReadReq_mshr_miss_latency      2173000                       # number of ReadReq MSHR miss cycles
-system.cpu.dcache.ReadReq_mshr_miss_rate     0.049839                       # mshr miss rate for ReadReq accesses
-system.cpu.dcache.ReadReq_mshr_misses              62                       # number of ReadReq MSHR misses
+system.cpu.cpi                               2.328882                       # CPI: Cycles Per Instruction
+system.cpu.cpi_total                         2.328882                       # CPI: Total CPI of All Threads
+system.cpu.dcache.ReadReq_accesses               1541                       # number of ReadReq accesses(hits+misses)
+system.cpu.dcache.ReadReq_avg_miss_latency 34473.684211                       # average ReadReq miss latency
+system.cpu.dcache.ReadReq_avg_mshr_miss_latency 35119.402985                       # average ReadReq mshr miss latency
+system.cpu.dcache.ReadReq_hits                   1427                       # number of ReadReq hits
+system.cpu.dcache.ReadReq_miss_latency        3930000                       # number of ReadReq miss cycles
+system.cpu.dcache.ReadReq_miss_rate          0.073978                       # miss rate for ReadReq accesses
+system.cpu.dcache.ReadReq_misses                  114                       # number of ReadReq misses
+system.cpu.dcache.ReadReq_mshr_hits                47                       # number of ReadReq MSHR hits
+system.cpu.dcache.ReadReq_mshr_miss_latency      2353000                       # number of ReadReq MSHR miss cycles
+system.cpu.dcache.ReadReq_mshr_miss_rate     0.043478                       # mshr miss rate for ReadReq accesses
+system.cpu.dcache.ReadReq_mshr_misses              67                       # number of ReadReq MSHR misses
 system.cpu.dcache.WriteReq_accesses               934                       # number of WriteReq accesses(hits+misses)
-system.cpu.dcache.WriteReq_avg_miss_latency 33138.977636                       # average WriteReq miss latency
-system.cpu.dcache.WriteReq_avg_mshr_miss_latency 35775.641026                       # average WriteReq mshr miss latency
+system.cpu.dcache.WriteReq_avg_miss_latency 34089.456869                       # average WriteReq miss latency
+system.cpu.dcache.WriteReq_avg_mshr_miss_latency 36012.987013                       # average WriteReq mshr miss latency
 system.cpu.dcache.WriteReq_hits                   621                       # number of WriteReq hits
-system.cpu.dcache.WriteReq_miss_latency      10372500                       # number of WriteReq miss cycles
+system.cpu.dcache.WriteReq_miss_latency      10670000                       # number of WriteReq miss cycles
 system.cpu.dcache.WriteReq_miss_rate         0.335118                       # miss rate for WriteReq accesses
 system.cpu.dcache.WriteReq_misses                 313                       # number of WriteReq misses
-system.cpu.dcache.WriteReq_mshr_hits              235                       # number of WriteReq MSHR hits
-system.cpu.dcache.WriteReq_mshr_miss_latency      2790500                       # number of WriteReq MSHR miss cycles
-system.cpu.dcache.WriteReq_mshr_miss_rate     0.083512                       # mshr miss rate for WriteReq accesses
-system.cpu.dcache.WriteReq_mshr_misses             78                       # number of WriteReq MSHR misses
+system.cpu.dcache.WriteReq_mshr_hits              236                       # number of WriteReq MSHR hits
+system.cpu.dcache.WriteReq_mshr_miss_latency      2773000                       # number of WriteReq MSHR miss cycles
+system.cpu.dcache.WriteReq_mshr_miss_rate     0.082441                       # mshr miss rate for WriteReq accesses
+system.cpu.dcache.WriteReq_mshr_misses             77                       # number of WriteReq MSHR misses
 system.cpu.dcache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.dcache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.dcache.avg_refs                  12.870504                       # Average number of references to valid blocks.
+system.cpu.dcache.avg_refs                  14.321678                       # Average number of references to valid blocks.
 system.cpu.dcache.blocked::no_mshrs                 0                       # number of cycles access was blocked
 system.cpu.dcache.blocked::no_targets               0                       # number of cycles access was blocked
 system.cpu.dcache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.dcache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.dcache.cache_copies                      0                       # number of cache copies performed
-system.cpu.dcache.demand_accesses                2178                       # number of demand (read+write) accesses
-system.cpu.dcache.demand_avg_miss_latency 33913.881748                       # average overall miss latency
-system.cpu.dcache.demand_avg_mshr_miss_latency 35453.571429                       # average overall mshr miss latency
-system.cpu.dcache.demand_hits                    1789                       # number of demand (read+write) hits
-system.cpu.dcache.demand_miss_latency        13192500                       # number of demand (read+write) miss cycles
-system.cpu.dcache.demand_miss_rate           0.178604                       # miss rate for demand accesses
-system.cpu.dcache.demand_misses                   389                       # number of demand (read+write) misses
-system.cpu.dcache.demand_mshr_hits                249                       # number of demand (read+write) MSHR hits
-system.cpu.dcache.demand_mshr_miss_latency      4963500                       # number of demand (read+write) MSHR miss cycles
-system.cpu.dcache.demand_mshr_miss_rate      0.064279                       # mshr miss rate for demand accesses
-system.cpu.dcache.demand_mshr_misses              140                       # number of demand (read+write) MSHR misses
+system.cpu.dcache.demand_accesses                2475                       # number of demand (read+write) accesses
+system.cpu.dcache.demand_avg_miss_latency 34192.037471                       # average overall miss latency
+system.cpu.dcache.demand_avg_mshr_miss_latency 35597.222222                       # average overall mshr miss latency
+system.cpu.dcache.demand_hits                    2048                       # number of demand (read+write) hits
+system.cpu.dcache.demand_miss_latency        14600000                       # number of demand (read+write) miss cycles
+system.cpu.dcache.demand_miss_rate           0.172525                       # miss rate for demand accesses
+system.cpu.dcache.demand_misses                   427                       # number of demand (read+write) misses
+system.cpu.dcache.demand_mshr_hits                283                       # number of demand (read+write) MSHR hits
+system.cpu.dcache.demand_mshr_miss_latency      5126000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.dcache.demand_mshr_miss_rate      0.058182                       # mshr miss rate for demand accesses
+system.cpu.dcache.demand_mshr_misses              144                       # number of demand (read+write) MSHR misses
 system.cpu.dcache.fast_writes                       0                       # number of fast writes performed
 system.cpu.dcache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.dcache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.dcache.occ_%::0                   0.020744                       # Average percentage of cache occupancy
-system.cpu.dcache.occ_blocks::0             84.965644                       # Average occupied blocks per context
-system.cpu.dcache.overall_accesses               2178                       # number of overall (read+write) accesses
-system.cpu.dcache.overall_avg_miss_latency 33913.881748                       # average overall miss latency
-system.cpu.dcache.overall_avg_mshr_miss_latency 35453.571429                       # average overall mshr miss latency
+system.cpu.dcache.occ_%::0                   0.020970                       # Average percentage of cache occupancy
+system.cpu.dcache.occ_blocks::0             85.892970                       # Average occupied blocks per context
+system.cpu.dcache.overall_accesses               2475                       # number of overall (read+write) accesses
+system.cpu.dcache.overall_avg_miss_latency 34192.037471                       # average overall miss latency
+system.cpu.dcache.overall_avg_mshr_miss_latency 35597.222222                       # average overall mshr miss latency
 system.cpu.dcache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.dcache.overall_hits                   1789                       # number of overall hits
-system.cpu.dcache.overall_miss_latency       13192500                       # number of overall miss cycles
-system.cpu.dcache.overall_miss_rate          0.178604                       # miss rate for overall accesses
-system.cpu.dcache.overall_misses                  389                       # number of overall misses
-system.cpu.dcache.overall_mshr_hits               249                       # number of overall MSHR hits
-system.cpu.dcache.overall_mshr_miss_latency      4963500                       # number of overall MSHR miss cycles
-system.cpu.dcache.overall_mshr_miss_rate     0.064279                       # mshr miss rate for overall accesses
-system.cpu.dcache.overall_mshr_misses             140                       # number of overall MSHR misses
+system.cpu.dcache.overall_hits                   2048                       # number of overall hits
+system.cpu.dcache.overall_miss_latency       14600000                       # number of overall miss cycles
+system.cpu.dcache.overall_miss_rate          0.172525                       # miss rate for overall accesses
+system.cpu.dcache.overall_misses                  427                       # number of overall misses
+system.cpu.dcache.overall_mshr_hits               283                       # number of overall MSHR hits
+system.cpu.dcache.overall_mshr_miss_latency      5126000                       # number of overall MSHR miss cycles
+system.cpu.dcache.overall_mshr_miss_rate     0.058182                       # mshr miss rate for overall accesses
+system.cpu.dcache.overall_mshr_misses             144                       # number of overall MSHR misses
 system.cpu.dcache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.dcache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
 system.cpu.dcache.replacements                      0                       # number of replacements
-system.cpu.dcache.sampled_refs                    139                       # Sample count of references to valid blocks.
+system.cpu.dcache.sampled_refs                    143                       # Sample count of references to valid blocks.
 system.cpu.dcache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.dcache.tagsinuse                 84.965644                       # Cycle average of tags in use
-system.cpu.dcache.total_refs                     1789                       # Total number of references to valid blocks.
+system.cpu.dcache.tagsinuse                 85.892970                       # Cycle average of tags in use
+system.cpu.dcache.total_refs                     2048                       # Total number of references to valid blocks.
 system.cpu.dcache.warmup_cycle                      0                       # Cycle when the warmup percentage was hit.
 system.cpu.dcache.writebacks                        0                       # number of writebacks
-system.cpu.decode.DECODE:BlockedCycles            464                       # Number of cycles decode is blocked
-system.cpu.decode.DECODE:DecodedInsts           15304                       # Number of instructions handled by decode
-system.cpu.decode.DECODE:IdleCycles              6233                       # Number of cycles decode is idle
-system.cpu.decode.DECODE:RunCycles               8371                       # Number of cycles decode is running
-system.cpu.decode.DECODE:SquashCycles             721                       # Number of cycles decode is squashing
-system.cpu.decode.DECODE:UnblockCycles             56                       # Number of cycles decode is unblocking
-system.cpu.fetch.Branches                        1920                       # Number of branches that fetch encountered
-system.cpu.fetch.CacheLines                      1255                       # Number of cache lines fetched
-system.cpu.fetch.Cycles                          9031                       # Number of cycles fetch has run and was not squashing or blocked
-system.cpu.fetch.IcacheSquashes                   117                       # Number of outstanding Icache misses that were squashed
-system.cpu.fetch.Insts                           8830                       # Number of instructions fetch has processed
-system.cpu.fetch.MiscStallCycles                    8                       # Number of cycles fetch has spent waiting on interrupts, or bad addresses, or out of MSHRs
-system.cpu.fetch.SquashCycles                     469                       # Number of cycles fetch has spent squashing
-system.cpu.fetch.branchRate                  0.069735                       # Number of branch fetches per cycle
-system.cpu.fetch.icacheStallCycles               1255                       # Number of cycles fetch is stalled on an Icache miss
-system.cpu.fetch.predictedBranches                772                       # Number of branches that fetch has predicted taken
-system.cpu.fetch.rate                        0.320706                       # Number of inst fetches per cycle
-system.cpu.fetch.rateDist::samples              15845                       # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::mean              1.002083                       # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::stdev             1.178869                       # Number of instructions fetched each cycle (Total)
+system.cpu.decode.DECODE:BlockedCycles           1367                       # Number of cycles decode is blocked
+system.cpu.decode.DECODE:DecodedInsts           22275                       # Number of instructions handled by decode
+system.cpu.decode.DECODE:IdleCycles              7155                       # Number of cycles decode is idle
+system.cpu.decode.DECODE:RunCycles               3308                       # Number of cycles decode is running
+system.cpu.decode.DECODE:SquashCycles            1504                       # Number of cycles decode is squashing
+system.cpu.decode.DECODE:UnblockCycles             76                       # Number of cycles decode is unblocking
+system.cpu.fetch.Branches                        2777                       # Number of branches that fetch encountered
+system.cpu.fetch.CacheLines                      1732                       # Number of cache lines fetched
+system.cpu.fetch.Cycles                          3623                       # Number of cycles fetch has run and was not squashing or blocked
+system.cpu.fetch.IcacheSquashes                   245                       # Number of outstanding Icache misses that were squashed
+system.cpu.fetch.Insts                          12976                       # Number of instructions fetch has processed
+system.cpu.fetch.MiscStallCycles                    4                       # Number of cycles fetch has spent waiting on interrupts, or bad addresses, or out of MSHRs
+system.cpu.fetch.SquashCycles                     508                       # Number of cycles fetch has spent squashing
+system.cpu.fetch.branchRate                  0.121564                       # Number of branch fetches per cycle
+system.cpu.fetch.icacheStallCycles               1732                       # Number of cycles fetch is stalled on an Icache miss
+system.cpu.fetch.predictedBranches                944                       # Number of branches that fetch has predicted taken
+system.cpu.fetch.rate                        0.568027                       # Number of inst fetches per cycle
+system.cpu.fetch.rateDist::samples              13410                       # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::mean              1.734526                       # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::stdev             3.109133                       # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::underflows               0      0.00%      0.00% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::0                     7129     44.99%     44.99% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::1                     4489     28.33%     73.32% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::2                     1878     11.85%     85.18% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::3                     2046     12.91%     98.09% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::4                       57      0.36%     98.45% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::5                      227      1.43%     99.88% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::6                        6      0.04%     99.92% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::7                        8      0.05%     99.97% # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::8                        5      0.03%    100.00% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::0                     9877     73.65%     73.65% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::1                      162      1.21%     74.86% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::2                      123      0.92%     75.78% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::3                      227      1.69%     77.47% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::4                      192      1.43%     78.90% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::5                      174      1.30%     80.20% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::6                      266      1.98%     82.18% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::7                      175      1.30%     83.49% # Number of instructions fetched each cycle (Total)
+system.cpu.fetch.rateDist::8                     2214     16.51%    100.00% # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::overflows                0      0.00%    100.00% # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::min_value                0                       # Number of instructions fetched each cycle (Total)
 system.cpu.fetch.rateDist::max_value                8                       # Number of instructions fetched each cycle (Total)
-system.cpu.fetch.rateDist::total                15845                       # Number of instructions fetched each cycle (Total)
-system.cpu.fp_regfile_reads                         2                       # number of floating regfile reads
-system.cpu.icache.ReadReq_accesses               1255                       # number of ReadReq accesses(hits+misses)
-system.cpu.icache.ReadReq_avg_miss_latency 37417.543860                       # average ReadReq miss latency
-system.cpu.icache.ReadReq_avg_mshr_miss_latency 35040.697674                       # average ReadReq mshr miss latency
-system.cpu.icache.ReadReq_hits                    970                       # number of ReadReq hits
-system.cpu.icache.ReadReq_miss_latency       10664000                       # number of ReadReq miss cycles
-system.cpu.icache.ReadReq_miss_rate          0.227092                       # miss rate for ReadReq accesses
-system.cpu.icache.ReadReq_misses                  285                       # number of ReadReq misses
-system.cpu.icache.ReadReq_mshr_hits                27                       # number of ReadReq MSHR hits
-system.cpu.icache.ReadReq_mshr_miss_latency      9040500                       # number of ReadReq MSHR miss cycles
-system.cpu.icache.ReadReq_mshr_miss_rate     0.205578                       # mshr miss rate for ReadReq accesses
-system.cpu.icache.ReadReq_mshr_misses             258                       # number of ReadReq MSHR misses
+system.cpu.fetch.rateDist::total                13410                       # Number of instructions fetched each cycle (Total)
+system.cpu.fp_regfile_reads                         4                       # number of floating regfile reads
+system.cpu.icache.ReadReq_accesses               1732                       # number of ReadReq accesses(hits+misses)
+system.cpu.icache.ReadReq_avg_miss_latency 36454.794521                       # average ReadReq miss latency
+system.cpu.icache.ReadReq_avg_mshr_miss_latency 35105.084746                       # average ReadReq mshr miss latency
+system.cpu.icache.ReadReq_hits                   1367                       # number of ReadReq hits
+system.cpu.icache.ReadReq_miss_latency       13306000                       # number of ReadReq miss cycles
+system.cpu.icache.ReadReq_miss_rate          0.210739                       # miss rate for ReadReq accesses
+system.cpu.icache.ReadReq_misses                  365                       # number of ReadReq misses
+system.cpu.icache.ReadReq_mshr_hits                70                       # number of ReadReq MSHR hits
+system.cpu.icache.ReadReq_mshr_miss_latency     10356000                       # number of ReadReq MSHR miss cycles
+system.cpu.icache.ReadReq_mshr_miss_rate     0.170323                       # mshr miss rate for ReadReq accesses
+system.cpu.icache.ReadReq_mshr_misses             295                       # number of ReadReq MSHR misses
 system.cpu.icache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.icache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.icache.avg_refs                   3.759690                       # Average number of references to valid blocks.
+system.cpu.icache.avg_refs                   4.633898                       # Average number of references to valid blocks.
 system.cpu.icache.blocked::no_mshrs                 0                       # number of cycles access was blocked
 system.cpu.icache.blocked::no_targets               0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.icache.cache_copies                      0                       # number of cache copies performed
-system.cpu.icache.demand_accesses                1255                       # number of demand (read+write) accesses
-system.cpu.icache.demand_avg_miss_latency 37417.543860                       # average overall miss latency
-system.cpu.icache.demand_avg_mshr_miss_latency 35040.697674                       # average overall mshr miss latency
-system.cpu.icache.demand_hits                     970                       # number of demand (read+write) hits
-system.cpu.icache.demand_miss_latency        10664000                       # number of demand (read+write) miss cycles
-system.cpu.icache.demand_miss_rate           0.227092                       # miss rate for demand accesses
-system.cpu.icache.demand_misses                   285                       # number of demand (read+write) misses
-system.cpu.icache.demand_mshr_hits                 27                       # number of demand (read+write) MSHR hits
-system.cpu.icache.demand_mshr_miss_latency      9040500                       # number of demand (read+write) MSHR miss cycles
-system.cpu.icache.demand_mshr_miss_rate      0.205578                       # mshr miss rate for demand accesses
-system.cpu.icache.demand_mshr_misses              258                       # number of demand (read+write) MSHR misses
+system.cpu.icache.demand_accesses                1732                       # number of demand (read+write) accesses
+system.cpu.icache.demand_avg_miss_latency 36454.794521                       # average overall miss latency
+system.cpu.icache.demand_avg_mshr_miss_latency 35105.084746                       # average overall mshr miss latency
+system.cpu.icache.demand_hits                    1367                       # number of demand (read+write) hits
+system.cpu.icache.demand_miss_latency        13306000                       # number of demand (read+write) miss cycles
+system.cpu.icache.demand_miss_rate           0.210739                       # miss rate for demand accesses
+system.cpu.icache.demand_misses                   365                       # number of demand (read+write) misses
+system.cpu.icache.demand_mshr_hits                 70                       # number of demand (read+write) MSHR hits
+system.cpu.icache.demand_mshr_miss_latency     10356000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.icache.demand_mshr_miss_rate      0.170323                       # mshr miss rate for demand accesses
+system.cpu.icache.demand_mshr_misses              295                       # number of demand (read+write) MSHR misses
 system.cpu.icache.fast_writes                       0                       # number of fast writes performed
 system.cpu.icache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.icache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.icache.occ_%::0                   0.061525                       # Average percentage of cache occupancy
-system.cpu.icache.occ_blocks::0            126.002915                       # Average occupied blocks per context
-system.cpu.icache.overall_accesses               1255                       # number of overall (read+write) accesses
-system.cpu.icache.overall_avg_miss_latency 37417.543860                       # average overall miss latency
-system.cpu.icache.overall_avg_mshr_miss_latency 35040.697674                       # average overall mshr miss latency
+system.cpu.icache.occ_%::0                   0.070726                       # Average percentage of cache occupancy
+system.cpu.icache.occ_blocks::0            144.846093                       # Average occupied blocks per context
+system.cpu.icache.overall_accesses               1732                       # number of overall (read+write) accesses
+system.cpu.icache.overall_avg_miss_latency 36454.794521                       # average overall miss latency
+system.cpu.icache.overall_avg_mshr_miss_latency 35105.084746                       # average overall mshr miss latency
 system.cpu.icache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.icache.overall_hits                    970                       # number of overall hits
-system.cpu.icache.overall_miss_latency       10664000                       # number of overall miss cycles
-system.cpu.icache.overall_miss_rate          0.227092                       # miss rate for overall accesses
-system.cpu.icache.overall_misses                  285                       # number of overall misses
-system.cpu.icache.overall_mshr_hits                27                       # number of overall MSHR hits
-system.cpu.icache.overall_mshr_miss_latency      9040500                       # number of overall MSHR miss cycles
-system.cpu.icache.overall_mshr_miss_rate     0.205578                       # mshr miss rate for overall accesses
-system.cpu.icache.overall_mshr_misses             258                       # number of overall MSHR misses
+system.cpu.icache.overall_hits                   1367                       # number of overall hits
+system.cpu.icache.overall_miss_latency       13306000                       # number of overall miss cycles
+system.cpu.icache.overall_miss_rate          0.210739                       # miss rate for overall accesses
+system.cpu.icache.overall_misses                  365                       # number of overall misses
+system.cpu.icache.overall_mshr_hits                70                       # number of overall MSHR hits
+system.cpu.icache.overall_mshr_miss_latency     10356000                       # number of overall MSHR miss cycles
+system.cpu.icache.overall_mshr_miss_rate     0.170323                       # mshr miss rate for overall accesses
+system.cpu.icache.overall_mshr_misses             295                       # number of overall MSHR misses
 system.cpu.icache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.icache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
 system.cpu.icache.replacements                      0                       # number of replacements
-system.cpu.icache.sampled_refs                    258                       # Sample count of references to valid blocks.
+system.cpu.icache.sampled_refs                    295                       # Sample count of references to valid blocks.
 system.cpu.icache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.icache.tagsinuse                126.002915                       # Cycle average of tags in use
-system.cpu.icache.total_refs                      970                       # Total number of references to valid blocks.
+system.cpu.icache.tagsinuse                144.846093                       # Cycle average of tags in use
+system.cpu.icache.total_refs                     1367                       # Total number of references to valid blocks.
 system.cpu.icache.warmup_cycle                      0                       # Cycle when the warmup percentage was hit.
 system.cpu.icache.writebacks                        0                       # number of writebacks
-system.cpu.idleCycles                           11688                       # Total number of cycles that the CPU has spent unscheduled due to idling
-system.cpu.iew.EXEC:branches                     1318                       # Number of branches executed
+system.cpu.idleCycles                            9434                       # Total number of cycles that the CPU has spent unscheduled due to idling
+system.cpu.iew.EXEC:branches                     1551                       # Number of branches executed
 system.cpu.iew.EXEC:nop                             0                       # number of nop insts executed
-system.cpu.iew.EXEC:rate                     0.434678                       # Inst execution rate
-system.cpu.iew.EXEC:refs                         2353                       # number of memory reference insts executed
-system.cpu.iew.EXEC:stores                       1060                       # Number of stores executed
+system.cpu.iew.EXEC:rate                     0.676151                       # Inst execution rate
+system.cpu.iew.EXEC:refs                         2971                       # number of memory reference insts executed
+system.cpu.iew.EXEC:stores                       1306                       # Number of stores executed
 system.cpu.iew.EXEC:swp                             0                       # number of swp insts executed
-system.cpu.iew.WB:consumers                     10358                       # num instructions consuming a value
-system.cpu.iew.WB:count                         11818                       # cumulative count of insts written-back
-system.cpu.iew.WB:fanout                     0.702935                       # average fanout of values written-back
+system.cpu.iew.WB:consumers                     14704                       # num instructions consuming a value
+system.cpu.iew.WB:count                         15138                       # cumulative count of insts written-back
+system.cpu.iew.WB:fanout                     0.679747                       # average fanout of values written-back
 system.cpu.iew.WB:penalized                         0                       # number of instrctions required to write to 'other' IQ
 system.cpu.iew.WB:penalized_rate                    0                       # fraction of instructions written-back that wrote to 'other' IQ
-system.cpu.iew.WB:producers                      7281                       # num instructions producing a value
-system.cpu.iew.WB:rate                       0.429230                       # insts written-back per cycle
-system.cpu.iew.WB:sent                          11866                       # cumulative count of insts sent to commit
-system.cpu.iew.branchMispredicts                  487                       # Number of branch mispredicts detected at execute
-system.cpu.iew.iewBlockCycles                      58                       # Number of cycles IEW is blocking
-system.cpu.iew.iewDispLoadInsts                  1535                       # Number of dispatched load instructions
-system.cpu.iew.iewDispNonSpecInsts                 17                       # Number of dispatched non-speculative instructions
-system.cpu.iew.iewDispSquashedInsts               418                       # Number of squashed instructions skipped by dispatch
-system.cpu.iew.iewDispStoreInsts                 1238                       # Number of dispatched store instructions
-system.cpu.iew.iewDispatchedInsts               13635                       # Number of instructions dispatched to IQ
-system.cpu.iew.iewExecLoadInsts                  1293                       # Number of load instructions executed
-system.cpu.iew.iewExecSquashedInsts               536                       # Number of squashed instructions skipped in execute
-system.cpu.iew.iewExecutedInsts                 11968                       # Number of executed instructions
-system.cpu.iew.iewIQFullEvents                      3                       # Number of times the IQ has become full, causing a stall
+system.cpu.iew.WB:producers                      9995                       # num instructions producing a value
+system.cpu.iew.WB:rate                       0.662669                       # insts written-back per cycle
+system.cpu.iew.WB:sent                          15263                       # cumulative count of insts sent to commit
+system.cpu.iew.branchMispredicts                  565                       # Number of branch mispredicts detected at execute
+system.cpu.iew.iewBlockCycles                     187                       # Number of cycles IEW is blocking
+system.cpu.iew.iewDispLoadInsts                  2105                       # Number of dispatched load instructions
+system.cpu.iew.iewDispNonSpecInsts                 30                       # Number of dispatched non-speculative instructions
+system.cpu.iew.iewDispSquashedInsts               207                       # Number of squashed instructions skipped by dispatch
+system.cpu.iew.iewDispStoreInsts                 1639                       # Number of dispatched store instructions
+system.cpu.iew.iewDispatchedInsts               19184                       # Number of instructions dispatched to IQ
+system.cpu.iew.iewExecLoadInsts                  1665                       # Number of load instructions executed
+system.cpu.iew.iewExecSquashedInsts               710                       # Number of squashed instructions skipped in execute
+system.cpu.iew.iewExecutedInsts                 15446                       # Number of executed instructions
+system.cpu.iew.iewIQFullEvents                     12                       # Number of times the IQ has become full, causing a stall
 system.cpu.iew.iewIdleCycles                        0                       # Number of cycles IEW is idle
-system.cpu.iew.iewLSQFullEvents                     1                       # Number of times the LSQ has become full, causing a stall
-system.cpu.iew.iewSquashCycles                    721                       # Number of cycles IEW is squashing
-system.cpu.iew.iewUnblockCycles                     6                       # Number of cycles IEW is unblocking
+system.cpu.iew.iewLSQFullEvents                     0                       # Number of times the LSQ has become full, causing a stall
+system.cpu.iew.iewSquashCycles                   1504                       # Number of cycles IEW is squashing
+system.cpu.iew.iewUnblockCycles                    20                       # Number of cycles IEW is unblocking
 system.cpu.iew.lsq.thread.0.blockedLoads            0                       # Number of blocked loads due to partial load-store forwarding
 system.cpu.iew.lsq.thread.0.cacheBlocked            0                       # Number of times an access to memory failed due to the cache being blocked
-system.cpu.iew.lsq.thread.0.forwLoads              21                       # Number of loads that had data forwarded from stores
-system.cpu.iew.lsq.thread.0.ignoredResponses            1                       # Number of memory responses ignored because the instruction is squashed
+system.cpu.iew.lsq.thread.0.forwLoads              68                       # Number of loads that had data forwarded from stores
+system.cpu.iew.lsq.thread.0.ignoredResponses           12                       # Number of memory responses ignored because the instruction is squashed
 system.cpu.iew.lsq.thread.0.invAddrLoads            0                       # Number of loads ignored due to an invalid address
 system.cpu.iew.lsq.thread.0.invAddrSwpfs            0                       # Number of software prefetches ignored due to an invalid address
-system.cpu.iew.lsq.thread.0.memOrderViolation            7                       # Number of memory ordering violations
-system.cpu.iew.lsq.thread.0.rescheduledLoads            0                       # Number of loads that were rescheduled
-system.cpu.iew.lsq.thread.0.squashedLoads          479                       # Number of loads squashed
-system.cpu.iew.lsq.thread.0.squashedStores          304                       # Number of stores squashed
-system.cpu.iew.memOrderViolationEvents              7                       # Number of memory order violations
-system.cpu.iew.predictedNotTakenIncorrect          390                       # Number of branches that were predicted not taken incorrectly
-system.cpu.iew.predictedTakenIncorrect             97                       # Number of branches that were predicted taken incorrectly
-system.cpu.int_regfile_reads                    25083                       # number of integer regfile reads
-system.cpu.int_regfile_writes                   11189                       # number of integer regfile writes
-system.cpu.ipc                               0.356263                       # IPC: Instructions Per Cycle
-system.cpu.ipc_total                         0.356263                       # IPC: Total IPC of All Threads
-system.cpu.iq.ISSUE:FU_type_0::No_OpClass            3      0.02%      0.02% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::IntAlu           10018     80.12%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::IntMult              0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::IntDiv               0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatAdd             0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatCmp             0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatCvt             0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatMult            0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatDiv             0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::FloatSqrt            0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdAdd              0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdAddAcc            0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdAlu              0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdCmp              0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdCvt              0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdMisc             0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdMult             0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdMultAcc            0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdShift            0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdShiftAcc            0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdSqrt             0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatAdd            0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatAlu            0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatCmp            0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatCvt            0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatDiv            0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatMisc            0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatMult            0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatMultAcc            0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::SimdFloatSqrt            0      0.00%     80.14% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::MemRead           1360     10.88%     91.02% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::MemWrite          1123      8.98%    100.00% # Type of FU issued
+system.cpu.iew.lsq.thread.0.memOrderViolation           31                       # Number of memory ordering violations
+system.cpu.iew.lsq.thread.0.rescheduledLoads            1                       # Number of loads that were rescheduled
+system.cpu.iew.lsq.thread.0.squashedLoads         1049                       # Number of loads squashed
+system.cpu.iew.lsq.thread.0.squashedStores          705                       # Number of stores squashed
+system.cpu.iew.memOrderViolationEvents             31                       # Number of memory order violations
+system.cpu.iew.predictedNotTakenIncorrect          496                       # Number of branches that were predicted not taken incorrectly
+system.cpu.iew.predictedTakenIncorrect             69                       # Number of branches that were predicted taken incorrectly
+system.cpu.int_regfile_reads                    23051                       # number of integer regfile reads
+system.cpu.int_regfile_writes                   14062                       # number of integer regfile writes
+system.cpu.ipc                               0.429391                       # IPC: Instructions Per Cycle
+system.cpu.ipc_total                         0.429391                       # IPC: Total IPC of All Threads
+system.cpu.iq.ISSUE:FU_type_0::No_OpClass            4      0.02%      0.02% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::IntAlu           12967     80.26%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::IntMult              0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::IntDiv               0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatAdd             0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatCmp             0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatCvt             0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatMult            0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatDiv             0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::FloatSqrt            0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdAdd              0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdAddAcc            0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdAlu              0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdCmp              0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdCvt              0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdMisc             0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdMult             0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdMultAcc            0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdShift            0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdShiftAcc            0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdSqrt             0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatAdd            0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatAlu            0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatCmp            0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatCvt            0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatDiv            0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatMisc            0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatMult            0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatMultAcc            0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::SimdFloatSqrt            0      0.00%     80.29% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::MemRead           1786     11.05%     91.34% # Type of FU issued
+system.cpu.iq.ISSUE:FU_type_0::MemWrite          1399      8.66%    100.00% # Type of FU issued
 system.cpu.iq.ISSUE:FU_type_0::IprAccess            0      0.00%    100.00% # Type of FU issued
 system.cpu.iq.ISSUE:FU_type_0::InstPrefetch            0      0.00%    100.00% # Type of FU issued
-system.cpu.iq.ISSUE:FU_type_0::total            12504                       # Type of FU issued
-system.cpu.iq.ISSUE:fu_busy_cnt                     4                       # FU busy when requested
-system.cpu.iq.ISSUE:fu_busy_rate             0.000320                       # FU busy rate (busy events/executed inst)
+system.cpu.iq.ISSUE:FU_type_0::total            16156                       # Type of FU issued
+system.cpu.iq.ISSUE:fu_busy_cnt                   142                       # FU busy when requested
+system.cpu.iq.ISSUE:fu_busy_rate             0.008789                       # FU busy rate (busy events/executed inst)
 system.cpu.iq.ISSUE:fu_full::No_OpClass             0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::IntAlu                 0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::IntMult                0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::IntDiv                 0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatAdd               0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatCmp               0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatCvt               0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatMult              0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatDiv               0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::FloatSqrt              0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdAdd                0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdAddAcc             0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdAlu                0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdCmp                0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdCvt                0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdMisc               0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdMult               0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdMultAcc            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdShift              0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdShiftAcc            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdSqrt               0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatAdd            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatAlu            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatCmp            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatCvt            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatDiv            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatMisc            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatMult            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatMultAcc            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::SimdFloatSqrt            0      0.00%      0.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::MemRead                3     75.00%     75.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:fu_full::MemWrite               1     25.00%    100.00% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::IntAlu                97     68.31%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::IntMult                0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::IntDiv                 0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatAdd               0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatCmp               0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatCvt               0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatMult              0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatDiv               0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::FloatSqrt              0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdAdd                0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdAddAcc             0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdAlu                0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdCmp                0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdCvt                0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdMisc               0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdMult               0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdMultAcc            0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdShift              0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdShiftAcc            0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdSqrt               0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatAdd            0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatAlu            0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatCmp            0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatCvt            0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatDiv            0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatMisc            0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatMult            0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatMultAcc            0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::SimdFloatSqrt            0      0.00%     68.31% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::MemRead               26     18.31%     86.62% # attempts to use FU when none available
+system.cpu.iq.ISSUE:fu_full::MemWrite              19     13.38%    100.00% # attempts to use FU when none available
 system.cpu.iq.ISSUE:fu_full::IprAccess              0      0.00%    100.00% # attempts to use FU when none available
 system.cpu.iq.ISSUE:fu_full::InstPrefetch            0      0.00%    100.00% # attempts to use FU when none available
-system.cpu.iq.ISSUE:issued_per_cycle::samples        15845                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::mean     0.789145                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::stdev     0.977935                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::samples        13410                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::mean     1.204773                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::stdev     1.912582                       # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::underflows            0      0.00%      0.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::0          8160     51.50%     51.50% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::1          4079     25.74%     77.24% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::2          2594     16.37%     93.61% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::3           834      5.26%     98.88% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::4           157      0.99%     99.87% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::5            19      0.12%     99.99% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::6             2      0.01%    100.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::7             0      0.00%    100.00% # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::8             0      0.00%    100.00% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::0          8282     61.76%     61.76% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::1          1307      9.75%     71.51% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::2           986      7.35%     78.86% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::3           745      5.56%     84.41% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::4           787      5.87%     90.28% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::5           588      4.38%     94.67% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::6           498      3.71%     98.38% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::7           170      1.27%     99.65% # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::8            47      0.35%    100.00% # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::overflows            0      0.00%    100.00% # Number of insts issued each cycle
 system.cpu.iq.ISSUE:issued_per_cycle::min_value            0                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::max_value            6                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:issued_per_cycle::total        15845                       # Number of insts issued each cycle
-system.cpu.iq.ISSUE:rate                     0.454146                       # Inst issue rate
-system.cpu.iq.fp_alu_accesses                       4                       # Number of floating point alu accesses
-system.cpu.iq.fp_inst_queue_reads                   8                       # Number of floating instruction queue reads
-system.cpu.iq.fp_inst_queue_wakeup_accesses            2                       # Number of floating instruction queue wakeup accesses
-system.cpu.iq.fp_inst_queue_writes                  8                       # Number of floating instruction queue writes
-system.cpu.iq.int_alu_accesses                  12501                       # Number of integer alu accesses
-system.cpu.iq.int_inst_queue_reads              40849                       # Number of integer instruction queue reads
-system.cpu.iq.int_inst_queue_wakeup_accesses        11816                       # Number of integer instruction queue wakeup accesses
-system.cpu.iq.int_inst_queue_writes             16975                       # Number of integer instruction queue writes
-system.cpu.iq.iqInstsAdded                      13618                       # Number of instructions added to the IQ (excludes non-spec)
-system.cpu.iq.iqInstsIssued                     12504                       # Number of instructions issued
-system.cpu.iq.iqNonSpecInstsAdded                  17                       # Number of non-speculative instructions added to the IQ
-system.cpu.iq.iqSquashedInstsExamined            3342                       # Number of squashed instructions iterated over during squash; mainly for profiling
-system.cpu.iq.iqSquashedNonSpecRemoved              4                       # Number of squashed non-spec instructions that were removed
-system.cpu.iq.iqSquashedOperandsExamined         5066                       # Number of squashed operands that are examined and possibly removed from graph
-system.cpu.l2cache.ReadExReq_accesses              78                       # number of ReadExReq accesses(hits+misses)
-system.cpu.l2cache.ReadExReq_avg_miss_latency 34493.589744                       # average ReadExReq miss latency
-system.cpu.l2cache.ReadExReq_avg_mshr_miss_latency 31326.923077                       # average ReadExReq mshr miss latency
-system.cpu.l2cache.ReadExReq_miss_latency      2690500                       # number of ReadExReq miss cycles
+system.cpu.iq.ISSUE:issued_per_cycle::max_value            8                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:issued_per_cycle::total        13410                       # Number of insts issued each cycle
+system.cpu.iq.ISSUE:rate                     0.707232                       # Inst issue rate
+system.cpu.iq.fp_alu_accesses                       5                       # Number of floating point alu accesses
+system.cpu.iq.fp_inst_queue_reads                   9                       # Number of floating instruction queue reads
+system.cpu.iq.fp_inst_queue_wakeup_accesses            4                       # Number of floating instruction queue wakeup accesses
+system.cpu.iq.fp_inst_queue_writes                  4                       # Number of floating instruction queue writes
+system.cpu.iq.int_alu_accesses                  16289                       # Number of integer alu accesses
+system.cpu.iq.int_inst_queue_reads              45908                       # Number of integer instruction queue reads
+system.cpu.iq.int_inst_queue_wakeup_accesses        15134                       # Number of integer instruction queue wakeup accesses
+system.cpu.iq.int_inst_queue_writes             27963                       # Number of integer instruction queue writes
+system.cpu.iq.iqInstsAdded                      19154                       # Number of instructions added to the IQ (excludes non-spec)
+system.cpu.iq.iqInstsIssued                     16156                       # Number of instructions issued
+system.cpu.iq.iqNonSpecInstsAdded                  30                       # Number of non-speculative instructions added to the IQ
+system.cpu.iq.iqSquashedInstsExamined            8758                       # Number of squashed instructions iterated over during squash; mainly for profiling
+system.cpu.iq.iqSquashedInstsIssued                53                       # Number of squashed instructions issued
+system.cpu.iq.iqSquashedNonSpecRemoved             17                       # Number of squashed non-spec instructions that were removed
+system.cpu.iq.iqSquashedOperandsExamined        11067                       # Number of squashed operands that are examined and possibly removed from graph
+system.cpu.l2cache.ReadExReq_accesses              77                       # number of ReadExReq accesses(hits+misses)
+system.cpu.l2cache.ReadExReq_avg_miss_latency 34616.883117                       # average ReadExReq miss latency
+system.cpu.l2cache.ReadExReq_avg_mshr_miss_latency 31389.610390                       # average ReadExReq mshr miss latency
+system.cpu.l2cache.ReadExReq_miss_latency      2665500                       # number of ReadExReq miss cycles
 system.cpu.l2cache.ReadExReq_miss_rate              1                       # miss rate for ReadExReq accesses
-system.cpu.l2cache.ReadExReq_misses                78                       # number of ReadExReq misses
-system.cpu.l2cache.ReadExReq_mshr_miss_latency      2443500                       # number of ReadExReq MSHR miss cycles
+system.cpu.l2cache.ReadExReq_misses                77                       # number of ReadExReq misses
+system.cpu.l2cache.ReadExReq_mshr_miss_latency      2417000                       # number of ReadExReq MSHR miss cycles
 system.cpu.l2cache.ReadExReq_mshr_miss_rate            1                       # mshr miss rate for ReadExReq accesses
-system.cpu.l2cache.ReadExReq_mshr_misses           78                       # number of ReadExReq MSHR misses
-system.cpu.l2cache.ReadReq_accesses               320                       # number of ReadReq accesses(hits+misses)
-system.cpu.l2cache.ReadReq_avg_miss_latency 34188.679245                       # average ReadReq miss latency
-system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 31004.716981                       # average ReadReq mshr miss latency
+system.cpu.l2cache.ReadExReq_mshr_misses           77                       # number of ReadExReq MSHR misses
+system.cpu.l2cache.ReadReq_accesses               362                       # number of ReadReq accesses(hits+misses)
+system.cpu.l2cache.ReadReq_avg_miss_latency 34245.833333                       # average ReadReq miss latency
+system.cpu.l2cache.ReadReq_avg_mshr_miss_latency 31040.277778                       # average ReadReq mshr miss latency
 system.cpu.l2cache.ReadReq_hits                     2                       # number of ReadReq hits
-system.cpu.l2cache.ReadReq_miss_latency      10872000                       # number of ReadReq miss cycles
-system.cpu.l2cache.ReadReq_miss_rate         0.993750                       # miss rate for ReadReq accesses
-system.cpu.l2cache.ReadReq_misses                 318                       # number of ReadReq misses
-system.cpu.l2cache.ReadReq_mshr_miss_latency      9859500                       # number of ReadReq MSHR miss cycles
-system.cpu.l2cache.ReadReq_mshr_miss_rate     0.993750                       # mshr miss rate for ReadReq accesses
-system.cpu.l2cache.ReadReq_mshr_misses            318                       # number of ReadReq MSHR misses
+system.cpu.l2cache.ReadReq_miss_latency      12328500                       # number of ReadReq miss cycles
+system.cpu.l2cache.ReadReq_miss_rate         0.994475                       # miss rate for ReadReq accesses
+system.cpu.l2cache.ReadReq_misses                 360                       # number of ReadReq misses
+system.cpu.l2cache.ReadReq_mshr_miss_latency     11174500                       # number of ReadReq MSHR miss cycles
+system.cpu.l2cache.ReadReq_mshr_miss_rate     0.994475                       # mshr miss rate for ReadReq accesses
+system.cpu.l2cache.ReadReq_mshr_misses            360                       # number of ReadReq MSHR misses
 system.cpu.l2cache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.l2cache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.l2cache.avg_refs                  0.006309                       # Average number of references to valid blocks.
+system.cpu.l2cache.avg_refs                  0.005571                       # Average number of references to valid blocks.
 system.cpu.l2cache.blocked::no_mshrs                0                       # number of cycles access was blocked
 system.cpu.l2cache.blocked::no_targets              0                       # number of cycles access was blocked
 system.cpu.l2cache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.l2cache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.l2cache.cache_copies                     0                       # number of cache copies performed
-system.cpu.l2cache.demand_accesses                398                       # number of demand (read+write) accesses
-system.cpu.l2cache.demand_avg_miss_latency 34248.737374                       # average overall miss latency
-system.cpu.l2cache.demand_avg_mshr_miss_latency 31068.181818                       # average overall mshr miss latency
+system.cpu.l2cache.demand_accesses                439                       # number of demand (read+write) accesses
+system.cpu.l2cache.demand_avg_miss_latency 34311.212815                       # average overall miss latency
+system.cpu.l2cache.demand_avg_mshr_miss_latency 31101.830664                       # average overall mshr miss latency
 system.cpu.l2cache.demand_hits                      2                       # number of demand (read+write) hits
-system.cpu.l2cache.demand_miss_latency       13562500                       # number of demand (read+write) miss cycles
-system.cpu.l2cache.demand_miss_rate          0.994975                       # miss rate for demand accesses
-system.cpu.l2cache.demand_misses                  396                       # number of demand (read+write) misses
+system.cpu.l2cache.demand_miss_latency       14994000                       # number of demand (read+write) miss cycles
+system.cpu.l2cache.demand_miss_rate          0.995444                       # miss rate for demand accesses
+system.cpu.l2cache.demand_misses                  437                       # number of demand (read+write) misses
 system.cpu.l2cache.demand_mshr_hits                 0                       # number of demand (read+write) MSHR hits
-system.cpu.l2cache.demand_mshr_miss_latency     12303000                       # number of demand (read+write) MSHR miss cycles
-system.cpu.l2cache.demand_mshr_miss_rate     0.994975                       # mshr miss rate for demand accesses
-system.cpu.l2cache.demand_mshr_misses             396                       # number of demand (read+write) MSHR misses
+system.cpu.l2cache.demand_mshr_miss_latency     13591500                       # number of demand (read+write) MSHR miss cycles
+system.cpu.l2cache.demand_mshr_miss_rate     0.995444                       # mshr miss rate for demand accesses
+system.cpu.l2cache.demand_mshr_misses             437                       # number of demand (read+write) MSHR misses
 system.cpu.l2cache.fast_writes                      0                       # number of fast writes performed
 system.cpu.l2cache.mshr_cap_events                  0                       # number of times MSHR cap was activated
 system.cpu.l2cache.no_allocate_misses               0                       # Number of misses that were no-allocate
-system.cpu.l2cache.occ_%::0                  0.004816                       # Average percentage of cache occupancy
-system.cpu.l2cache.occ_blocks::0           157.820330                       # Average occupied blocks per context
-system.cpu.l2cache.overall_accesses               398                       # number of overall (read+write) accesses
-system.cpu.l2cache.overall_avg_miss_latency 34248.737374                       # average overall miss latency
-system.cpu.l2cache.overall_avg_mshr_miss_latency 31068.181818                       # average overall mshr miss latency
+system.cpu.l2cache.occ_%::0                  0.005436                       # Average percentage of cache occupancy
+system.cpu.l2cache.occ_blocks::0           178.138745                       # Average occupied blocks per context
+system.cpu.l2cache.overall_accesses               439                       # number of overall (read+write) accesses
+system.cpu.l2cache.overall_avg_miss_latency 34311.212815                       # average overall miss latency
+system.cpu.l2cache.overall_avg_mshr_miss_latency 31101.830664                       # average overall mshr miss latency
 system.cpu.l2cache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
 system.cpu.l2cache.overall_hits                     2                       # number of overall hits
-system.cpu.l2cache.overall_miss_latency      13562500                       # number of overall miss cycles
-system.cpu.l2cache.overall_miss_rate         0.994975                       # miss rate for overall accesses
-system.cpu.l2cache.overall_misses                 396                       # number of overall misses
+system.cpu.l2cache.overall_miss_latency      14994000                       # number of overall miss cycles
+system.cpu.l2cache.overall_miss_rate         0.995444                       # miss rate for overall accesses
+system.cpu.l2cache.overall_misses                 437                       # number of overall misses
 system.cpu.l2cache.overall_mshr_hits                0                       # number of overall MSHR hits
-system.cpu.l2cache.overall_mshr_miss_latency     12303000                       # number of overall MSHR miss cycles
-system.cpu.l2cache.overall_mshr_miss_rate     0.994975                       # mshr miss rate for overall accesses
-system.cpu.l2cache.overall_mshr_misses            396                       # number of overall MSHR misses
+system.cpu.l2cache.overall_mshr_miss_latency     13591500                       # number of overall MSHR miss cycles
+system.cpu.l2cache.overall_mshr_miss_rate     0.995444                       # mshr miss rate for overall accesses
+system.cpu.l2cache.overall_mshr_misses            437                       # number of overall MSHR misses
 system.cpu.l2cache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.l2cache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
 system.cpu.l2cache.replacements                     0                       # number of replacements
-system.cpu.l2cache.sampled_refs                   317                       # Sample count of references to valid blocks.
+system.cpu.l2cache.sampled_refs                   359                       # Sample count of references to valid blocks.
 system.cpu.l2cache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.l2cache.tagsinuse               157.820330                       # Cycle average of tags in use
+system.cpu.l2cache.tagsinuse               178.138745                       # Cycle average of tags in use
 system.cpu.l2cache.total_refs                       2                       # Total number of references to valid blocks.
 system.cpu.l2cache.warmup_cycle                     0                       # Cycle when the warmup percentage was hit.
 system.cpu.l2cache.writebacks                       0                       # number of writebacks
-system.cpu.memDep0.conflictingLoads                 4                       # Number of conflicting loads.
-system.cpu.memDep0.conflictingStores                0                       # Number of conflicting stores.
-system.cpu.memDep0.insertedLoads                 1535                       # Number of loads inserted to the mem dependence unit.
-system.cpu.memDep0.insertedStores                1238                       # Number of stores inserted to the mem dependence unit.
-system.cpu.misc_regfile_reads                    5334                       # number of misc regfile reads
-system.cpu.numCycles                            27533                       # number of cpu cycles simulated
+system.cpu.memDep0.conflictingLoads                24                       # Number of conflicting loads.
+system.cpu.memDep0.conflictingStores                3                       # Number of conflicting stores.
+system.cpu.memDep0.insertedLoads                 2105                       # Number of loads inserted to the mem dependence unit.
+system.cpu.memDep0.insertedStores                1639                       # Number of stores inserted to the mem dependence unit.
+system.cpu.misc_regfile_reads                    6857                       # number of misc regfile reads
+system.cpu.numCycles                            22844                       # number of cpu cycles simulated
 system.cpu.numWorkItemsCompleted                    0                       # number of work items this cpu completed
 system.cpu.numWorkItemsStarted                      0                       # number of work items this cpu started
-system.cpu.rename.RENAME:BlockCycles              105                       # Number of cycles rename is blocking
+system.cpu.rename.RENAME:BlockCycles              565                       # Number of cycles rename is blocking
 system.cpu.rename.RENAME:CommittedMaps           9368                       # Number of HB maps that are committed
-system.cpu.rename.RENAME:IQFullEvents               6                       # Number of times rename has blocked due to IQ full
-system.cpu.rename.RENAME:IdleCycles              6603                       # Number of cycles rename is idle
-system.cpu.rename.RENAME:LSQFullEvents             15                       # Number of times rename has blocked due to LSQ full
-system.cpu.rename.RENAME:RenameLookups          38664                       # Number of register rename lookups that rename has made
-system.cpu.rename.RENAME:RenamedInsts           14745                       # Number of instructions processed by rename
-system.cpu.rename.RENAME:RenamedOperands        13787                       # Number of destination operands rename has renamed
-system.cpu.rename.RENAME:RunCycles               8027                       # Number of cycles rename is running
-system.cpu.rename.RENAME:SquashCycles             721                       # Number of cycles rename is squashing
-system.cpu.rename.RENAME:UnblockCycles            108                       # Number of cycles rename is unblocking
-system.cpu.rename.RENAME:UndoneMaps              4419                       # Number of HB maps that are undone due to squashing
+system.cpu.rename.RENAME:IQFullEvents              51                       # Number of times rename has blocked due to IQ full
+system.cpu.rename.RENAME:IdleCycles              7399                       # Number of cycles rename is idle
+system.cpu.rename.RENAME:LSQFullEvents            247                       # Number of times rename has blocked due to LSQ full
+system.cpu.rename.RENAME:ROBFullEvents              3                       # Number of times rename has blocked due to ROB full
+system.cpu.rename.RENAME:RenameLookups          44700                       # Number of register rename lookups that rename has made
+system.cpu.rename.RENAME:RenamedInsts           21187                       # Number of instructions processed by rename
+system.cpu.rename.RENAME:RenamedOperands        19905                       # Number of destination operands rename has renamed
+system.cpu.rename.RENAME:RunCycles               3124                       # Number of cycles rename is running
+system.cpu.rename.RENAME:SquashCycles            1504                       # Number of cycles rename is squashing
+system.cpu.rename.RENAME:UnblockCycles            378                       # Number of cycles rename is unblocking
+system.cpu.rename.RENAME:UndoneMaps             10537                       # Number of HB maps that are undone due to squashing
 system.cpu.rename.RENAME:fp_rename_lookups           16                       # Number of floating rename lookups
-system.cpu.rename.RENAME:int_rename_lookups        38648                       # Number of integer rename lookups
-system.cpu.rename.RENAME:serializeStallCycles          281                       # count of cycles rename stalled for serializing inst
-system.cpu.rename.RENAME:serializingInsts           20                       # count of serializing insts renamed
-system.cpu.rename.RENAME:skidInsts                169                       # count of insts added to the skid buffer
-system.cpu.rename.RENAME:tempSerializingInsts           17                       # count of temporary serializing insts renamed
-system.cpu.rob.rob_reads                        28728                       # The number of ROB reads
-system.cpu.rob.rob_writes                       28005                       # The number of ROB writes
-system.cpu.timesIdled                             208                       # Number of times that the entire CPU went into an idle state and unscheduled itself
+system.cpu.rename.RENAME:int_rename_lookups        44684                       # Number of integer rename lookups
+system.cpu.rename.RENAME:serializeStallCycles          440                       # count of cycles rename stalled for serializing inst
+system.cpu.rename.RENAME:serializingInsts           32                       # count of serializing insts renamed
+system.cpu.rename.RENAME:skidInsts               1476                       # count of insts added to the skid buffer
+system.cpu.rename.RENAME:tempSerializingInsts           31                       # count of temporary serializing insts renamed
+system.cpu.rob.rob_reads                        30950                       # The number of ROB reads
+system.cpu.rob.rob_writes                       39896                       # The number of ROB writes
+system.cpu.timesIdled                             184                       # Number of times that the entire CPU went into an idle state and unscheduled itself
 system.cpu.workload.PROG:num_syscalls              11                       # Number of system calls
 
 ---------- End Simulation Statistics   ----------
diff --git a/tests/quick/00.hello/ref/x86/linux/simple-atomic/simout b/tests/quick/00.hello/ref/x86/linux/simple-atomic/simout
index 09f4d0b50e..8fb08388b9 100755
--- a/tests/quick/00.hello/ref/x86/linux/simple-atomic/simout
+++ b/tests/quick/00.hello/ref/x86/linux/simple-atomic/simout
@@ -5,9 +5,9 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:32:13
+M5 compiled Feb  8 2011 00:58:32
+M5 revision 705a4d351a43 7939 default qtip resforflagsstats.patch tip
+M5 started Feb  8 2011 00:58:34
 M5 executing on burrito
 command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/quick/00.hello/x86/linux/simple-atomic -re tests/run.py build/X86_SE/tests/fast/quick/00.hello/x86/linux/simple-atomic
 Global frequency set at 1000000000000 ticks per second
diff --git a/tests/quick/00.hello/ref/x86/linux/simple-atomic/stats.txt b/tests/quick/00.hello/ref/x86/linux/simple-atomic/stats.txt
index 1dca11ec54..cddb4c7b6c 100644
--- a/tests/quick/00.hello/ref/x86/linux/simple-atomic/stats.txt
+++ b/tests/quick/00.hello/ref/x86/linux/simple-atomic/stats.txt
@@ -1,9 +1,9 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                 180423                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 219128                       # Number of bytes of host memory used
-host_seconds                                     0.05                       # Real time elapsed on the host
-host_tick_rate                              103433649                       # Simulator tick rate (ticks/s)
+host_inst_rate                                 992012                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 219616                       # Number of bytes of host memory used
+host_seconds                                     0.01                       # Real time elapsed on the host
+host_tick_rate                              556721453                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                        9810                       # Number of instructions simulated
 sim_seconds                                  0.000006                       # Number of seconds simulated
@@ -24,7 +24,7 @@ system.cpu.num_idle_cycles                          0                       # Nu
 system.cpu.num_insts                             9810                       # Number of instructions executed
 system.cpu.num_int_alu_accesses                  9715                       # Number of integer alu accesses
 system.cpu.num_int_insts                         9715                       # number of integer instructions
-system.cpu.num_int_register_reads               26194                       # number of times the integer registers were read
+system.cpu.num_int_register_reads               21313                       # number of times the integer registers were read
 system.cpu.num_int_register_writes               9368                       # number of times the integer registers were written
 system.cpu.num_load_insts                        1056                       # Number of load instructions
 system.cpu.num_mem_refs                          1990                       # number of memory refs
diff --git a/tests/quick/00.hello/ref/x86/linux/simple-timing-ruby/ruby.stats b/tests/quick/00.hello/ref/x86/linux/simple-timing-ruby/ruby.stats
index a12716c024..5696629363 100644
--- a/tests/quick/00.hello/ref/x86/linux/simple-timing-ruby/ruby.stats
+++ b/tests/quick/00.hello/ref/x86/linux/simple-timing-ruby/ruby.stats
@@ -34,7 +34,7 @@ periodic_stats_period: 1000000
 ================ End RubySystem Configuration Print ================
 
 
-Real time: Feb/07/2011 02:32:13
+Real time: Feb/08/2011 00:58:34
 
 Profiler Stats
 --------------
@@ -43,18 +43,18 @@ Elapsed_time_in_minutes: 0
 Elapsed_time_in_hours: 0
 Elapsed_time_in_days: 0
 
-Virtual_time_in_seconds: 0.35
-Virtual_time_in_minutes: 0.00583333
-Virtual_time_in_hours:   9.72222e-05
-Virtual_time_in_days:    4.05093e-06
+Virtual_time_in_seconds: 0.26
+Virtual_time_in_minutes: 0.00433333
+Virtual_time_in_hours:   7.22222e-05
+Virtual_time_in_days:    3.00926e-06
 
 Ruby_current_time: 276484
 Ruby_start_time: 0
 Ruby_cycles: 276484
 
-mbytes_resident: 38.6094
-mbytes_total: 231.508
-resident_ratio: 0.16679
+mbytes_resident: 38.6797
+mbytes_total: 231.98
+resident_ratio: 0.166754
 
 ruby_cycles_executed: [ 276485 ]
 
@@ -125,7 +125,7 @@ Resource Usage
 page_size: 4096
 user_time: 0
 system_time: 0
-page_reclaims: 10950
+page_reclaims: 11003
 page_faults: 0
 swaps: 0
 block_inputs: 0
diff --git a/tests/quick/00.hello/ref/x86/linux/simple-timing-ruby/simout b/tests/quick/00.hello/ref/x86/linux/simple-timing-ruby/simout
index 877c8d9b9d..ab908eedc2 100755
--- a/tests/quick/00.hello/ref/x86/linux/simple-timing-ruby/simout
+++ b/tests/quick/00.hello/ref/x86/linux/simple-timing-ruby/simout
@@ -5,9 +5,9 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:32:13
+M5 compiled Feb  8 2011 00:58:32
+M5 revision 705a4d351a43 7939 default qtip resforflagsstats.patch tip
+M5 started Feb  8 2011 00:58:34
 M5 executing on burrito
 command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/quick/00.hello/x86/linux/simple-timing-ruby -re tests/run.py build/X86_SE/tests/fast/quick/00.hello/x86/linux/simple-timing-ruby
 Global frequency set at 1000000000 ticks per second
diff --git a/tests/quick/00.hello/ref/x86/linux/simple-timing-ruby/stats.txt b/tests/quick/00.hello/ref/x86/linux/simple-timing-ruby/stats.txt
index b88df01c54..491eaf1d1f 100644
--- a/tests/quick/00.hello/ref/x86/linux/simple-timing-ruby/stats.txt
+++ b/tests/quick/00.hello/ref/x86/linux/simple-timing-ruby/stats.txt
@@ -1,9 +1,9 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                  32378                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 237068                       # Number of bytes of host memory used
-host_seconds                                     0.30                       # Real time elapsed on the host
-host_tick_rate                                 911908                       # Simulator tick rate (ticks/s)
+host_inst_rate                                  81703                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 237552                       # Number of bytes of host memory used
+host_seconds                                     0.12                       # Real time elapsed on the host
+host_tick_rate                                2292859                       # Simulator tick rate (ticks/s)
 sim_freq                                   1000000000                       # Frequency of simulated ticks
 sim_insts                                        9810                       # Number of instructions simulated
 sim_seconds                                  0.000276                       # Number of seconds simulated
@@ -24,7 +24,7 @@ system.cpu.num_idle_cycles                          0                       # Nu
 system.cpu.num_insts                             9810                       # Number of instructions executed
 system.cpu.num_int_alu_accesses                  9715                       # Number of integer alu accesses
 system.cpu.num_int_insts                         9715                       # number of integer instructions
-system.cpu.num_int_register_reads               26194                       # number of times the integer registers were read
+system.cpu.num_int_register_reads               21313                       # number of times the integer registers were read
 system.cpu.num_int_register_writes               9368                       # number of times the integer registers were written
 system.cpu.num_load_insts                        1056                       # Number of load instructions
 system.cpu.num_mem_refs                          1990                       # number of memory refs
diff --git a/tests/quick/00.hello/ref/x86/linux/simple-timing/simout b/tests/quick/00.hello/ref/x86/linux/simple-timing/simout
index d6afbecf05..43766d7be8 100755
--- a/tests/quick/00.hello/ref/x86/linux/simple-timing/simout
+++ b/tests/quick/00.hello/ref/x86/linux/simple-timing/simout
@@ -5,9 +5,9 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 02:32:07
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 02:32:24
+M5 compiled Feb  8 2011 00:58:32
+M5 revision 705a4d351a43 7939 default qtip resforflagsstats.patch tip
+M5 started Feb  8 2011 00:58:34
 M5 executing on burrito
 command line: build/X86_SE/m5.fast -d build/X86_SE/tests/fast/quick/00.hello/x86/linux/simple-timing -re tests/run.py build/X86_SE/tests/fast/quick/00.hello/x86/linux/simple-timing
 Global frequency set at 1000000000000 ticks per second
diff --git a/tests/quick/00.hello/ref/x86/linux/simple-timing/stats.txt b/tests/quick/00.hello/ref/x86/linux/simple-timing/stats.txt
index 0c21882f55..fc7acffe16 100644
--- a/tests/quick/00.hello/ref/x86/linux/simple-timing/stats.txt
+++ b/tests/quick/00.hello/ref/x86/linux/simple-timing/stats.txt
@@ -1,9 +1,9 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                 594010                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 226844                       # Number of bytes of host memory used
+host_inst_rate                                 525864                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 227336                       # Number of bytes of host memory used
 host_seconds                                     0.02                       # Real time elapsed on the host
-host_tick_rate                             1712507148                       # Simulator tick rate (ticks/s)
+host_tick_rate                             1518719132                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
 sim_insts                                        9810                       # Number of instructions simulated
 sim_seconds                                  0.000029                       # Number of seconds simulated
@@ -208,7 +208,7 @@ system.cpu.num_idle_cycles                          0                       # Nu
 system.cpu.num_insts                             9810                       # Number of instructions executed
 system.cpu.num_int_alu_accesses                  9715                       # Number of integer alu accesses
 system.cpu.num_int_insts                         9715                       # number of integer instructions
-system.cpu.num_int_register_reads               26194                       # number of times the integer registers were read
+system.cpu.num_int_register_reads               21313                       # number of times the integer registers were read
 system.cpu.num_int_register_writes               9368                       # number of times the integer registers were written
 system.cpu.num_load_insts                        1056                       # Number of load instructions
 system.cpu.num_mem_refs                          1990                       # number of memory refs
diff --git a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/config.ini b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/config.ini
index 79021a9585..859778cbe2 100644
--- a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/config.ini
+++ b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/config.ini
@@ -7,11 +7,11 @@ time_sync_spin_threshold=100000000
 
 [system]
 type=LinuxArmSystem
-children=bridge cpu diskmem intrctrl iobus iocache l2c membus physmem realview terminal toL2Bus
+children=bridge cpu diskmem intrctrl iobus iocache l2c membus physmem realview terminal toL2Bus vncserver
 boot_cpu_frequency=500
 boot_osflags=earlyprintk mem=128MB console=ttyAMA0 lpj=19988480 norandmaps slram=slram0,0x8000000,+0x8000000 mtdparts=slram0:- rw loglevel=8 root=/dev/mtdblock0
 init_param=0
-kernel=/dist/m5/system/binaries/vmlinux.arm
+kernel=/chips/pd/randd/dist/binaries/vmlinux.arm
 load_addr_mask=268435455
 machine_type=RealView_PBX
 mem_mode=atomic
@@ -167,7 +167,7 @@ type=ExeTracer
 
 [system.diskmem]
 type=PhysicalMemory
-file=/dist/m5/system/disks/ael-arm.ext2
+file=/chips/pd/randd/dist/disks/ael-arm.ext2
 latency=30000
 latency_var=0
 null=false
@@ -187,7 +187,7 @@ clock=1000
 header_cycles=1
 use_default_range=false
 width=64
-port=system.bridge.side_a system.realview.uart.pio system.realview.realview_io.pio system.realview.timer0.pio system.realview.timer1.pio system.realview.clcd.pio system.realview.kmi0.pio system.realview.kmi1.pio system.realview.dmac_fake.pio system.realview.uart1_fake.pio system.realview.uart2_fake.pio system.realview.uart3_fake.pio system.realview.smc_fake.pio system.realview.sp810_fake.pio system.realview.watchdog_fake.pio system.realview.gpio0_fake.pio system.realview.gpio1_fake.pio system.realview.gpio2_fake.pio system.realview.ssp_fake.pio system.realview.sci_fake.pio system.realview.aaci_fake.pio system.realview.mmc_fake.pio system.realview.rtc_fake.pio system.realview.flash_fake.pio system.iocache.cpu_side system.realview.clcd.dma
+port=system.bridge.side_a system.realview.uart.pio system.realview.realview_io.pio system.realview.timer0.pio system.realview.timer1.pio system.realview.clcd.pio system.realview.kmi0.pio system.realview.kmi1.pio system.realview.dmac_fake.pio system.realview.uart1_fake.pio system.realview.uart2_fake.pio system.realview.uart3_fake.pio system.realview.smc_fake.pio system.realview.sp810_fake.pio system.realview.watchdog_fake.pio system.realview.gpio0_fake.pio system.realview.gpio1_fake.pio system.realview.gpio2_fake.pio system.realview.ssp_fake.pio system.realview.sci_fake.pio system.realview.aaci_fake.pio system.realview.mmc_fake.pio system.realview.rtc_fake.pio system.realview.flash_fake.pio system.realview.cf0_fake.pio system.iocache.cpu_side system.realview.clcd.dma
 
 [system.iocache]
 type=BaseCache
@@ -217,7 +217,7 @@ tgts_per_mshr=12
 trace_addr=0
 two_queue=false
 write_buffers=8
-cpu_side=system.iobus.port[24]
+cpu_side=system.iobus.port[25]
 mem_side=system.membus.port[5]
 
 [system.l2c]
@@ -291,7 +291,7 @@ port=system.membus.port[1]
 
 [system.realview]
 type=RealView
-children=aaci_fake clcd dmac_fake flash_fake gic gpio0_fake gpio1_fake gpio2_fake kmi0 kmi1 l2x0_fake mmc_fake realview_io rtc_fake sci_fake smc_fake sp810_fake ssp_fake timer0 timer1 uart uart1_fake uart2_fake uart3_fake watchdog_fake
+children=aaci_fake cf0_fake clcd dmac_fake flash_fake gic gpio0_fake gpio1_fake gpio2_fake kmi0 kmi1 l2x0_fake mmc_fake realview_io rtc_fake sci_fake smc_fake sp810_fake ssp_fake timer0 timer1 uart uart1_fake uart2_fake uart3_fake watchdog_fake
 intrctrl=system.intrctrl
 system=system
 
@@ -305,6 +305,22 @@ platform=system.realview
 system=system
 pio=system.iobus.port[20]
 
+[system.realview.cf0_fake]
+type=IsaFake
+pio_addr=402653184
+pio_latency=1000
+pio_size=4095
+platform=system.realview
+ret_bad_addr=false
+ret_data16=65535
+ret_data32=4294967295
+ret_data64=18446744073709551615
+ret_data8=255
+system=system
+update_data=false
+warn_access=
+pio=system.iobus.port[24]
+
 [system.realview.clcd]
 type=Pl111
 amba_id=1315089
@@ -317,7 +333,8 @@ pio_addr=268566528
 pio_latency=10000
 platform=system.realview
 system=system
-dma=system.iobus.port[25]
+vnc=system.vncserver
+dma=system.iobus.port[26]
 pio=system.iobus.port[5]
 
 [system.realview.dmac_fake]
@@ -391,24 +408,28 @@ pio=system.iobus.port[17]
 type=Pl050
 amba_id=1314896
 gic=system.realview.gic
-int_delay=100000
+int_delay=1000000
 int_num=52
+is_mouse=false
 pio_addr=268460032
 pio_latency=1000
 platform=system.realview
 system=system
+vnc=system.vncserver
 pio=system.iobus.port[6]
 
 [system.realview.kmi1]
 type=Pl050
 amba_id=1314896
 gic=system.realview.gic
-int_delay=100000
+int_delay=1000000
 int_num=53
+is_mouse=true
 pio_addr=268464128
 pio_latency=1000
 platform=system.realview
 system=system
+vnc=system.vncserver
 pio=system.iobus.port[7]
 
 [system.realview.l2x0_fake]
@@ -594,3 +615,8 @@ use_default_range=false
 width=64
 port=system.l2c.cpu_side system.cpu.icache.mem_side system.cpu.dcache.mem_side system.cpu.itb.walker.port system.cpu.dtb.walker.port
 
+[system.vncserver]
+type=VncServer
+number=0
+port=5900
+
diff --git a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/simerr b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/simerr
index 1225613070..63ac398c96 100755
--- a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/simerr
+++ b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/simerr
@@ -1,3 +1,5 @@
+warn: Sockets disabled, not accepting vnc client connections
+For more information see: http://www.m5sim.org/warn/af6a84f6
 warn: Sockets disabled, not accepting terminal connections
 For more information see: http://www.m5sim.org/warn/8742226b
 warn: Sockets disabled, not accepting gdb connections
diff --git a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/simout b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/simout
index ba4c6742c0..180619cc16 100755
--- a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/simout
+++ b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/simout
@@ -5,12 +5,12 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 01:53:13
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 01:53:26
-M5 executing on burrito
+M5 compiled Feb 11 2011 17:53:57
+M5 revision 6c65f7ee86c1 7949 default qtip tip ext/vnc_stats_updates.patch
+M5 started Feb 11 2011 17:54:00
+M5 executing on u200439-lin.austin.arm.com
 command line: build/ARM_FS/m5.fast -d build/ARM_FS/tests/fast/quick/10.linux-boot/arm/linux/realview-simple-atomic -re tests/run.py build/ARM_FS/tests/fast/quick/10.linux-boot/arm/linux/realview-simple-atomic
 Global frequency set at 1000000000000 ticks per second
-info: kernel located at: /dist/m5/system/binaries/vmlinux.arm
+info: kernel located at: /chips/pd/randd/dist/binaries/vmlinux.arm
 info: Entering event queue @ 0.  Starting simulation...
-Exiting @ tick 25821310500 because m5_exit instruction encountered
+Exiting @ tick 26073617500 because m5_exit instruction encountered
diff --git a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/stats.txt b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/stats.txt
index 0a7542a7c1..9854d94dfc 100644
--- a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/stats.txt
+++ b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-atomic/stats.txt
@@ -1,63 +1,63 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                 739167                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 360776                       # Number of bytes of host memory used
-host_seconds                                    68.93                       # Real time elapsed on the host
-host_tick_rate                              374609475                       # Simulator tick rate (ticks/s)
+host_inst_rate                                2481190                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 374936                       # Number of bytes of host memory used
+host_seconds                                    20.74                       # Real time elapsed on the host
+host_tick_rate                             1257294139                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
-sim_insts                                    50949504                       # Number of instructions simulated
-sim_seconds                                  0.025821                       # Number of seconds simulated
-sim_ticks                                 25821310500                       # Number of ticks simulated
-system.cpu.dcache.LoadLockedReq_accesses::0        96794                       # number of LoadLockedReq accesses(hits+misses)
-system.cpu.dcache.LoadLockedReq_accesses::total        96794                       # number of LoadLockedReq accesses(hits+misses)
-system.cpu.dcache.LoadLockedReq_hits::0         91895                       # number of LoadLockedReq hits
-system.cpu.dcache.LoadLockedReq_hits::total        91895                       # number of LoadLockedReq hits
-system.cpu.dcache.LoadLockedReq_miss_rate::0     0.050613                       # miss rate for LoadLockedReq accesses
-system.cpu.dcache.LoadLockedReq_misses::0         4899                       # number of LoadLockedReq misses
-system.cpu.dcache.LoadLockedReq_misses::total         4899                       # number of LoadLockedReq misses
-system.cpu.dcache.ReadReq_accesses::0         7714516                       # number of ReadReq accesses(hits+misses)
-system.cpu.dcache.ReadReq_accesses::total      7714516                       # number of ReadReq accesses(hits+misses)
-system.cpu.dcache.ReadReq_hits::0             7482193                       # number of ReadReq hits
-system.cpu.dcache.ReadReq_hits::total         7482193                       # number of ReadReq hits
-system.cpu.dcache.ReadReq_miss_rate::0       0.030115                       # miss rate for ReadReq accesses
-system.cpu.dcache.ReadReq_misses::0            232323                       # number of ReadReq misses
-system.cpu.dcache.ReadReq_misses::total        232323                       # number of ReadReq misses
-system.cpu.dcache.StoreCondReq_accesses::0        96793                       # number of StoreCondReq accesses(hits+misses)
-system.cpu.dcache.StoreCondReq_accesses::total        96793                       # number of StoreCondReq accesses(hits+misses)
-system.cpu.dcache.StoreCondReq_hits::0          96793                       # number of StoreCondReq hits
-system.cpu.dcache.StoreCondReq_hits::total        96793                       # number of StoreCondReq hits
-system.cpu.dcache.WriteReq_accesses::0        6604860                       # number of WriteReq accesses(hits+misses)
-system.cpu.dcache.WriteReq_accesses::total      6604860                       # number of WriteReq accesses(hits+misses)
-system.cpu.dcache.WriteReq_hits::0            6433311                       # number of WriteReq hits
-system.cpu.dcache.WriteReq_hits::total        6433311                       # number of WriteReq hits
-system.cpu.dcache.WriteReq_miss_rate::0      0.025973                       # miss rate for WriteReq accesses
-system.cpu.dcache.WriteReq_misses::0           171549                       # number of WriteReq misses
-system.cpu.dcache.WriteReq_misses::total       171549                       # number of WriteReq misses
+sim_insts                                    51454118                       # Number of instructions simulated
+sim_seconds                                  0.026074                       # Number of seconds simulated
+sim_ticks                                 26073617500                       # Number of ticks simulated
+system.cpu.dcache.LoadLockedReq_accesses::0       100454                       # number of LoadLockedReq accesses(hits+misses)
+system.cpu.dcache.LoadLockedReq_accesses::total       100454                       # number of LoadLockedReq accesses(hits+misses)
+system.cpu.dcache.LoadLockedReq_hits::0         95292                       # number of LoadLockedReq hits
+system.cpu.dcache.LoadLockedReq_hits::total        95292                       # number of LoadLockedReq hits
+system.cpu.dcache.LoadLockedReq_miss_rate::0     0.051387                       # miss rate for LoadLockedReq accesses
+system.cpu.dcache.LoadLockedReq_misses::0         5162                       # number of LoadLockedReq misses
+system.cpu.dcache.LoadLockedReq_misses::total         5162                       # number of LoadLockedReq misses
+system.cpu.dcache.ReadReq_accesses::0         7830681                       # number of ReadReq accesses(hits+misses)
+system.cpu.dcache.ReadReq_accesses::total      7830681                       # number of ReadReq accesses(hits+misses)
+system.cpu.dcache.ReadReq_hits::0             7594158                       # number of ReadReq hits
+system.cpu.dcache.ReadReq_hits::total         7594158                       # number of ReadReq hits
+system.cpu.dcache.ReadReq_miss_rate::0       0.030205                       # miss rate for ReadReq accesses
+system.cpu.dcache.ReadReq_misses::0            236523                       # number of ReadReq misses
+system.cpu.dcache.ReadReq_misses::total        236523                       # number of ReadReq misses
+system.cpu.dcache.StoreCondReq_accesses::0       100453                       # number of StoreCondReq accesses(hits+misses)
+system.cpu.dcache.StoreCondReq_accesses::total       100453                       # number of StoreCondReq accesses(hits+misses)
+system.cpu.dcache.StoreCondReq_hits::0         100453                       # number of StoreCondReq hits
+system.cpu.dcache.StoreCondReq_hits::total       100453                       # number of StoreCondReq hits
+system.cpu.dcache.WriteReq_accesses::0        6676067                       # number of WriteReq accesses(hits+misses)
+system.cpu.dcache.WriteReq_accesses::total      6676067                       # number of WriteReq accesses(hits+misses)
+system.cpu.dcache.WriteReq_hits::0            6503881                       # number of WriteReq hits
+system.cpu.dcache.WriteReq_hits::total        6503881                       # number of WriteReq hits
+system.cpu.dcache.WriteReq_miss_rate::0      0.025792                       # miss rate for WriteReq accesses
+system.cpu.dcache.WriteReq_misses::0           172186                       # number of WriteReq misses
+system.cpu.dcache.WriteReq_misses::total       172186                       # number of WriteReq misses
 system.cpu.dcache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.dcache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.dcache.avg_refs                  34.663994                       # Average number of references to valid blocks.
+system.cpu.dcache.avg_refs                  34.695419                       # Average number of references to valid blocks.
 system.cpu.dcache.blocked::no_mshrs                 0                       # number of cycles access was blocked
 system.cpu.dcache.blocked::no_targets               0                       # number of cycles access was blocked
 system.cpu.dcache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.dcache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.dcache.cache_copies                      0                       # number of cache copies performed
-system.cpu.dcache.demand_accesses::0         14319376                       # number of demand (read+write) accesses
+system.cpu.dcache.demand_accesses::0         14506748                       # number of demand (read+write) accesses
 system.cpu.dcache.demand_accesses::1                0                       # number of demand (read+write) accesses
-system.cpu.dcache.demand_accesses::total     14319376                       # number of demand (read+write) accesses
+system.cpu.dcache.demand_accesses::total     14506748                       # number of demand (read+write) accesses
 system.cpu.dcache.demand_avg_miss_latency::0            0                       # average overall miss latency
 system.cpu.dcache.demand_avg_miss_latency::1     no_value                       # average overall miss latency
 system.cpu.dcache.demand_avg_miss_latency::total     no_value                       # average overall miss latency
 system.cpu.dcache.demand_avg_mshr_miss_latency     no_value                       # average overall mshr miss latency
-system.cpu.dcache.demand_hits::0             13915504                       # number of demand (read+write) hits
+system.cpu.dcache.demand_hits::0             14098039                       # number of demand (read+write) hits
 system.cpu.dcache.demand_hits::1                    0                       # number of demand (read+write) hits
-system.cpu.dcache.demand_hits::total         13915504                       # number of demand (read+write) hits
+system.cpu.dcache.demand_hits::total         14098039                       # number of demand (read+write) hits
 system.cpu.dcache.demand_miss_latency               0                       # number of demand (read+write) miss cycles
-system.cpu.dcache.demand_miss_rate::0        0.028205                       # miss rate for demand accesses
+system.cpu.dcache.demand_miss_rate::0        0.028174                       # miss rate for demand accesses
 system.cpu.dcache.demand_miss_rate::1        no_value                       # miss rate for demand accesses
 system.cpu.dcache.demand_miss_rate::total     no_value                       # miss rate for demand accesses
-system.cpu.dcache.demand_misses::0             403872                       # number of demand (read+write) misses
+system.cpu.dcache.demand_misses::0             408709                       # number of demand (read+write) misses
 system.cpu.dcache.demand_misses::1                  0                       # number of demand (read+write) misses
-system.cpu.dcache.demand_misses::total         403872                       # number of demand (read+write) misses
+system.cpu.dcache.demand_misses::total         408709                       # number of demand (read+write) misses
 system.cpu.dcache.demand_mshr_hits                  0                       # number of demand (read+write) MSHR hits
 system.cpu.dcache.demand_mshr_miss_latency            0                       # number of demand (read+write) MSHR miss cycles
 system.cpu.dcache.demand_mshr_miss_rate::0            0                       # mshr miss rate for demand accesses
@@ -67,26 +67,26 @@ system.cpu.dcache.demand_mshr_misses                0                       # nu
 system.cpu.dcache.fast_writes                       0                       # number of fast writes performed
 system.cpu.dcache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.dcache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.dcache.occ_%::0                   0.999475                       # Average percentage of cache occupancy
-system.cpu.dcache.occ_blocks::0            511.731250                       # Average occupied blocks per context
-system.cpu.dcache.overall_accesses::0        14319376                       # number of overall (read+write) accesses
+system.cpu.dcache.occ_%::0                   0.999480                       # Average percentage of cache occupancy
+system.cpu.dcache.occ_blocks::0            511.733850                       # Average occupied blocks per context
+system.cpu.dcache.overall_accesses::0        14506748                       # number of overall (read+write) accesses
 system.cpu.dcache.overall_accesses::1               0                       # number of overall (read+write) accesses
-system.cpu.dcache.overall_accesses::total     14319376                       # number of overall (read+write) accesses
+system.cpu.dcache.overall_accesses::total     14506748                       # number of overall (read+write) accesses
 system.cpu.dcache.overall_avg_miss_latency::0            0                       # average overall miss latency
 system.cpu.dcache.overall_avg_miss_latency::1     no_value                       # average overall miss latency
 system.cpu.dcache.overall_avg_miss_latency::total     no_value                       # average overall miss latency
 system.cpu.dcache.overall_avg_mshr_miss_latency     no_value                       # average overall mshr miss latency
 system.cpu.dcache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.dcache.overall_hits::0            13915504                       # number of overall hits
+system.cpu.dcache.overall_hits::0            14098039                       # number of overall hits
 system.cpu.dcache.overall_hits::1                   0                       # number of overall hits
-system.cpu.dcache.overall_hits::total        13915504                       # number of overall hits
+system.cpu.dcache.overall_hits::total        14098039                       # number of overall hits
 system.cpu.dcache.overall_miss_latency              0                       # number of overall miss cycles
-system.cpu.dcache.overall_miss_rate::0       0.028205                       # miss rate for overall accesses
+system.cpu.dcache.overall_miss_rate::0       0.028174                       # miss rate for overall accesses
 system.cpu.dcache.overall_miss_rate::1       no_value                       # miss rate for overall accesses
 system.cpu.dcache.overall_miss_rate::total     no_value                       # miss rate for overall accesses
-system.cpu.dcache.overall_misses::0            403872                       # number of overall misses
+system.cpu.dcache.overall_misses::0            408709                       # number of overall misses
 system.cpu.dcache.overall_misses::1                 0                       # number of overall misses
-system.cpu.dcache.overall_misses::total        403872                       # number of overall misses
+system.cpu.dcache.overall_misses::total        408709                       # number of overall misses
 system.cpu.dcache.overall_mshr_hits                 0                       # number of overall MSHR hits
 system.cpu.dcache.overall_mshr_miss_latency            0                       # number of overall MSHR miss cycles
 system.cpu.dcache.overall_mshr_miss_rate::0            0                       # mshr miss rate for overall accesses
@@ -95,66 +95,66 @@ system.cpu.dcache.overall_mshr_miss_rate::total     no_value
 system.cpu.dcache.overall_mshr_misses               0                       # number of overall MSHR misses
 system.cpu.dcache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.dcache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.cpu.dcache.replacements                 406424                       # number of replacements
-system.cpu.dcache.sampled_refs                 406936                       # Sample count of references to valid blocks.
+system.cpu.dcache.replacements                 411520                       # number of replacements
+system.cpu.dcache.sampled_refs                 412032                       # Sample count of references to valid blocks.
 system.cpu.dcache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.dcache.tagsinuse                511.731250                       # Cycle average of tags in use
-system.cpu.dcache.total_refs                 14106027                       # Total number of references to valid blocks.
+system.cpu.dcache.tagsinuse                511.733850                       # Cycle average of tags in use
+system.cpu.dcache.total_refs                 14295623                       # Total number of references to valid blocks.
 system.cpu.dcache.warmup_cycle               21760000                       # Cycle when the warmup percentage was hit.
-system.cpu.dcache.writebacks                   379025                       # number of writebacks
-system.cpu.dtb.accesses                      15336291                       # DTB accesses
+system.cpu.dcache.writebacks                   381867                       # number of writebacks
+system.cpu.dtb.accesses                      15531286                       # DTB accesses
 system.cpu.dtb.align_faults                         0                       # Number of TLB faults due to alignment restrictions
 system.cpu.dtb.domain_faults                        0                       # Number of TLB faults due to domain restrictions
-system.cpu.dtb.flush_entries                     2242                       # Number of entries that have been flushed from TLB
+system.cpu.dtb.flush_entries                     2267                       # Number of entries that have been flushed from TLB
 system.cpu.dtb.flush_tlb                            2                       # Number of times complete TLB was flushed
 system.cpu.dtb.flush_tlb_asid                      40                       # Number of times TLB was flushed by ASID
 system.cpu.dtb.flush_tlb_mva                        0                       # Number of times TLB was flushed by MVA
 system.cpu.dtb.flush_tlb_mva_asid               33670                       # Number of times TLB was flushed by MVA & ASID
-system.cpu.dtb.hits                          15330762                       # DTB hits
+system.cpu.dtb.hits                          15525735                       # DTB hits
 system.cpu.dtb.inst_accesses                        0                       # ITB inst accesses
 system.cpu.dtb.inst_hits                            0                       # ITB inst hits
 system.cpu.dtb.inst_misses                          0                       # ITB inst misses
-system.cpu.dtb.misses                            5529                       # DTB misses
+system.cpu.dtb.misses                            5551                       # DTB misses
 system.cpu.dtb.perms_faults                       255                       # Number of TLB faults due to permissions restrictions
-system.cpu.dtb.prefetch_faults                    768                       # Number of TLB faults due to prefetch
-system.cpu.dtb.read_accesses                  8622893                       # DTB read accesses
-system.cpu.dtb.read_hits                      8618361                       # DTB read hits
-system.cpu.dtb.read_misses                       4532                       # DTB read misses
-system.cpu.dtb.write_accesses                 6713398                       # DTB write accesses
-system.cpu.dtb.write_hits                     6712401                       # DTB write hits
-system.cpu.dtb.write_misses                       997                       # DTB write misses
-system.cpu.icache.ReadReq_accesses::0        41172623                       # number of ReadReq accesses(hits+misses)
-system.cpu.icache.ReadReq_accesses::total     41172623                       # number of ReadReq accesses(hits+misses)
-system.cpu.icache.ReadReq_hits::0            40741841                       # number of ReadReq hits
-system.cpu.icache.ReadReq_hits::total        40741841                       # number of ReadReq hits
-system.cpu.icache.ReadReq_miss_rate::0       0.010463                       # miss rate for ReadReq accesses
-system.cpu.icache.ReadReq_misses::0            430782                       # number of ReadReq misses
-system.cpu.icache.ReadReq_misses::total        430782                       # number of ReadReq misses
+system.cpu.dtb.prefetch_faults                    775                       # Number of TLB faults due to prefetch
+system.cpu.dtb.read_accesses                  8743013                       # DTB read accesses
+system.cpu.dtb.read_hits                      8738461                       # DTB read hits
+system.cpu.dtb.read_misses                       4552                       # DTB read misses
+system.cpu.dtb.write_accesses                 6788273                       # DTB write accesses
+system.cpu.dtb.write_hits                     6787274                       # DTB write hits
+system.cpu.dtb.write_misses                       999                       # DTB write misses
+system.cpu.icache.ReadReq_accesses::0        41564629                       # number of ReadReq accesses(hits+misses)
+system.cpu.icache.ReadReq_accesses::total     41564629                       # number of ReadReq accesses(hits+misses)
+system.cpu.icache.ReadReq_hits::0            41131432                       # number of ReadReq hits
+system.cpu.icache.ReadReq_hits::total        41131432                       # number of ReadReq hits
+system.cpu.icache.ReadReq_miss_rate::0       0.010422                       # miss rate for ReadReq accesses
+system.cpu.icache.ReadReq_misses::0            433197                       # number of ReadReq misses
+system.cpu.icache.ReadReq_misses::total        433197                       # number of ReadReq misses
 system.cpu.icache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.icache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.icache.avg_refs                  94.576690                       # Average number of references to valid blocks.
+system.cpu.icache.avg_refs                  94.948781                       # Average number of references to valid blocks.
 system.cpu.icache.blocked::no_mshrs                 0                       # number of cycles access was blocked
 system.cpu.icache.blocked::no_targets               0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.icache.cache_copies                      0                       # number of cache copies performed
-system.cpu.icache.demand_accesses::0         41172623                       # number of demand (read+write) accesses
+system.cpu.icache.demand_accesses::0         41564629                       # number of demand (read+write) accesses
 system.cpu.icache.demand_accesses::1                0                       # number of demand (read+write) accesses
-system.cpu.icache.demand_accesses::total     41172623                       # number of demand (read+write) accesses
+system.cpu.icache.demand_accesses::total     41564629                       # number of demand (read+write) accesses
 system.cpu.icache.demand_avg_miss_latency::0            0                       # average overall miss latency
 system.cpu.icache.demand_avg_miss_latency::1     no_value                       # average overall miss latency
 system.cpu.icache.demand_avg_miss_latency::total     no_value                       # average overall miss latency
 system.cpu.icache.demand_avg_mshr_miss_latency     no_value                       # average overall mshr miss latency
-system.cpu.icache.demand_hits::0             40741841                       # number of demand (read+write) hits
+system.cpu.icache.demand_hits::0             41131432                       # number of demand (read+write) hits
 system.cpu.icache.demand_hits::1                    0                       # number of demand (read+write) hits
-system.cpu.icache.demand_hits::total         40741841                       # number of demand (read+write) hits
+system.cpu.icache.demand_hits::total         41131432                       # number of demand (read+write) hits
 system.cpu.icache.demand_miss_latency               0                       # number of demand (read+write) miss cycles
-system.cpu.icache.demand_miss_rate::0        0.010463                       # miss rate for demand accesses
+system.cpu.icache.demand_miss_rate::0        0.010422                       # miss rate for demand accesses
 system.cpu.icache.demand_miss_rate::1        no_value                       # miss rate for demand accesses
 system.cpu.icache.demand_miss_rate::total     no_value                       # miss rate for demand accesses
-system.cpu.icache.demand_misses::0             430782                       # number of demand (read+write) misses
+system.cpu.icache.demand_misses::0             433197                       # number of demand (read+write) misses
 system.cpu.icache.demand_misses::1                  0                       # number of demand (read+write) misses
-system.cpu.icache.demand_misses::total         430782                       # number of demand (read+write) misses
+system.cpu.icache.demand_misses::total         433197                       # number of demand (read+write) misses
 system.cpu.icache.demand_mshr_hits                  0                       # number of demand (read+write) MSHR hits
 system.cpu.icache.demand_mshr_miss_latency            0                       # number of demand (read+write) MSHR miss cycles
 system.cpu.icache.demand_mshr_miss_rate::0            0                       # mshr miss rate for demand accesses
@@ -164,26 +164,26 @@ system.cpu.icache.demand_mshr_misses                0                       # nu
 system.cpu.icache.fast_writes                       0                       # number of fast writes performed
 system.cpu.icache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.icache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.icache.occ_%::0                   0.929162                       # Average percentage of cache occupancy
-system.cpu.icache.occ_blocks::0            475.731149                       # Average occupied blocks per context
-system.cpu.icache.overall_accesses::0        41172623                       # number of overall (read+write) accesses
+system.cpu.icache.occ_%::0                   0.930040                       # Average percentage of cache occupancy
+system.cpu.icache.occ_blocks::0            476.180679                       # Average occupied blocks per context
+system.cpu.icache.overall_accesses::0        41564629                       # number of overall (read+write) accesses
 system.cpu.icache.overall_accesses::1               0                       # number of overall (read+write) accesses
-system.cpu.icache.overall_accesses::total     41172623                       # number of overall (read+write) accesses
+system.cpu.icache.overall_accesses::total     41564629                       # number of overall (read+write) accesses
 system.cpu.icache.overall_avg_miss_latency::0            0                       # average overall miss latency
 system.cpu.icache.overall_avg_miss_latency::1     no_value                       # average overall miss latency
 system.cpu.icache.overall_avg_miss_latency::total     no_value                       # average overall miss latency
 system.cpu.icache.overall_avg_mshr_miss_latency     no_value                       # average overall mshr miss latency
 system.cpu.icache.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.cpu.icache.overall_hits::0            40741841                       # number of overall hits
+system.cpu.icache.overall_hits::0            41131432                       # number of overall hits
 system.cpu.icache.overall_hits::1                   0                       # number of overall hits
-system.cpu.icache.overall_hits::total        40741841                       # number of overall hits
+system.cpu.icache.overall_hits::total        41131432                       # number of overall hits
 system.cpu.icache.overall_miss_latency              0                       # number of overall miss cycles
-system.cpu.icache.overall_miss_rate::0       0.010463                       # miss rate for overall accesses
+system.cpu.icache.overall_miss_rate::0       0.010422                       # miss rate for overall accesses
 system.cpu.icache.overall_miss_rate::1       no_value                       # miss rate for overall accesses
 system.cpu.icache.overall_miss_rate::total     no_value                       # miss rate for overall accesses
-system.cpu.icache.overall_misses::0            430782                       # number of overall misses
+system.cpu.icache.overall_misses::0            433197                       # number of overall misses
 system.cpu.icache.overall_misses::1                 0                       # number of overall misses
-system.cpu.icache.overall_misses::total        430782                       # number of overall misses
+system.cpu.icache.overall_misses::total        433197                       # number of overall misses
 system.cpu.icache.overall_mshr_hits                 0                       # number of overall MSHR hits
 system.cpu.icache.overall_mshr_miss_latency            0                       # number of overall MSHR miss cycles
 system.cpu.icache.overall_mshr_miss_rate::0            0                       # mshr miss rate for overall accesses
@@ -192,15 +192,15 @@ system.cpu.icache.overall_mshr_miss_rate::total     no_value
 system.cpu.icache.overall_mshr_misses               0                       # number of overall MSHR misses
 system.cpu.icache.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.cpu.icache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.cpu.icache.replacements                 430269                       # number of replacements
-system.cpu.icache.sampled_refs                 430781                       # Sample count of references to valid blocks.
+system.cpu.icache.replacements                 432684                       # number of replacements
+system.cpu.icache.sampled_refs                 433196                       # Sample count of references to valid blocks.
 system.cpu.icache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.icache.tagsinuse                475.731149                       # Cycle average of tags in use
-system.cpu.icache.total_refs                 40741841                       # Total number of references to valid blocks.
+system.cpu.icache.tagsinuse                476.180679                       # Cycle average of tags in use
+system.cpu.icache.total_refs                 41131432                       # Total number of references to valid blocks.
 system.cpu.icache.warmup_cycle             4544230000                       # Cycle when the warmup percentage was hit.
-system.cpu.icache.writebacks                    33727                       # number of writebacks
+system.cpu.icache.writebacks                    33708                       # number of writebacks
 system.cpu.idle_fraction                            0                       # Percentage of idle cycles
-system.cpu.itb.accesses                      41173750                       # DTB accesses
+system.cpu.itb.accesses                      41565756                       # DTB accesses
 system.cpu.itb.align_faults                         0                       # Number of TLB faults due to alignment restrictions
 system.cpu.itb.domain_faults                        0                       # Number of TLB faults due to domain restrictions
 system.cpu.itb.flush_entries                     1478                       # Number of entries that have been flushed from TLB
@@ -208,9 +208,9 @@ system.cpu.itb.flush_tlb                            2                       # Nu
 system.cpu.itb.flush_tlb_asid                      40                       # Number of times TLB was flushed by ASID
 system.cpu.itb.flush_tlb_mva                        0                       # Number of times TLB was flushed by MVA
 system.cpu.itb.flush_tlb_mva_asid               33670                       # Number of times TLB was flushed by MVA & ASID
-system.cpu.itb.hits                          41170928                       # DTB hits
-system.cpu.itb.inst_accesses                 41173750                       # ITB inst accesses
-system.cpu.itb.inst_hits                     41170928                       # ITB inst hits
+system.cpu.itb.hits                          41562934                       # DTB hits
+system.cpu.itb.inst_accesses                 41565756                       # ITB inst accesses
+system.cpu.itb.inst_hits                     41562934                       # ITB inst hits
 system.cpu.itb.inst_misses                       2822                       # ITB inst misses
 system.cpu.itb.misses                            2822                       # DTB misses
 system.cpu.itb.perms_faults                         0                       # Number of TLB faults due to permissions restrictions
@@ -224,10 +224,10 @@ system.cpu.itb.write_misses                         0                       # DT
 system.cpu.kern.inst.arm                            0                       # number of arm instructions executed
 system.cpu.kern.inst.quiesce                        0                       # number of quiesce instructions executed
 system.cpu.not_idle_fraction                        1                       # Percentage of non-idle cycles
-system.cpu.numCycles                         51642622                       # number of cpu cycles simulated
+system.cpu.numCycles                         52147236                       # number of cpu cycles simulated
 system.cpu.numWorkItemsCompleted                    0                       # number of work items this cpu completed
 system.cpu.numWorkItemsStarted                      0                       # number of work items this cpu started
-system.cpu.num_busy_cycles                   51642622                       # Number of busy cycles
+system.cpu.num_busy_cycles                   52147236                       # Number of busy cycles
 system.cpu.num_conditional_control_insts            0                       # number of instructions that are conditional controls
 system.cpu.num_fp_alu_accesses                   6059                       # Number of float alu accesses
 system.cpu.num_fp_insts                          6059                       # number of float instructions
@@ -235,14 +235,14 @@ system.cpu.num_fp_register_reads                 4227                       # nu
 system.cpu.num_fp_register_writes                1834                       # number of times the floating registers were written
 system.cpu.num_func_calls                           0                       # number of times a function call or return occured
 system.cpu.num_idle_cycles                          0                       # Number of idle cycles
-system.cpu.num_insts                         50949504                       # Number of instructions executed
-system.cpu.num_int_alu_accesses              41395090                       # Number of integer alu accesses
-system.cpu.num_int_insts                     41395090                       # number of integer instructions
-system.cpu.num_int_register_reads           128438705                       # number of times the integer registers were read
-system.cpu.num_int_register_writes           33973128                       # number of times the integer registers were written
-system.cpu.num_load_insts                     9082722                       # Number of load instructions
-system.cpu.num_mem_refs                      16092645                       # number of memory refs
-system.cpu.num_store_insts                    7009923                       # Number of store instructions
+system.cpu.num_insts                         51454118                       # Number of instructions executed
+system.cpu.num_int_alu_accesses              41848094                       # Number of integer alu accesses
+system.cpu.num_int_insts                     41848094                       # number of integer instructions
+system.cpu.num_int_register_reads           129780130                       # number of times the integer registers were read
+system.cpu.num_int_register_writes           34330061                       # number of times the integer registers were written
+system.cpu.num_load_insts                     9213901                       # Number of load instructions
+system.cpu.num_mem_refs                      16300106                       # number of memory refs
+system.cpu.num_store_insts                    7086205                       # Number of store instructions
 system.iocache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.iocache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
 system.iocache.avg_refs                      no_value                       # Average number of references to valid blocks.
@@ -310,61 +310,61 @@ system.iocache.tagsinuse                            0                       # Cy
 system.iocache.total_refs                           0                       # Total number of references to valid blocks.
 system.iocache.warmup_cycle                         0                       # Cycle when the warmup percentage was hit.
 system.iocache.writebacks                           0                       # number of writebacks
-system.l2c.ReadExReq_accesses::0               169714                       # number of ReadExReq accesses(hits+misses)
-system.l2c.ReadExReq_accesses::total           169714                       # number of ReadExReq accesses(hits+misses)
-system.l2c.ReadExReq_hits::0                    60310                       # number of ReadExReq hits
-system.l2c.ReadExReq_hits::total                60310                       # number of ReadExReq hits
-system.l2c.ReadExReq_miss_rate::0            0.644637                       # miss rate for ReadExReq accesses
-system.l2c.ReadExReq_misses::0                 109404                       # number of ReadExReq misses
-system.l2c.ReadExReq_misses::total             109404                       # number of ReadExReq misses
-system.l2c.ReadReq_accesses::0                 665898                       # number of ReadReq accesses(hits+misses)
-system.l2c.ReadReq_accesses::1                   6073                       # number of ReadReq accesses(hits+misses)
-system.l2c.ReadReq_accesses::total             671971                       # number of ReadReq accesses(hits+misses)
-system.l2c.ReadReq_hits::0                     648226                       # number of ReadReq hits
-system.l2c.ReadReq_hits::1                       6049                       # number of ReadReq hits
-system.l2c.ReadReq_hits::total                 654275                       # number of ReadReq hits
-system.l2c.ReadReq_miss_rate::0              0.026539                       # miss rate for ReadReq accesses
-system.l2c.ReadReq_miss_rate::1              0.003952                       # miss rate for ReadReq accesses
-system.l2c.ReadReq_miss_rate::total          0.030491                       # miss rate for ReadReq accesses
-system.l2c.ReadReq_misses::0                    17672                       # number of ReadReq misses
-system.l2c.ReadReq_misses::1                       24                       # number of ReadReq misses
-system.l2c.ReadReq_misses::total                17696                       # number of ReadReq misses
-system.l2c.UpgradeReq_accesses::0                1835                       # number of UpgradeReq accesses(hits+misses)
-system.l2c.UpgradeReq_accesses::total            1835                       # number of UpgradeReq accesses(hits+misses)
+system.l2c.ReadExReq_accesses::0               170347                       # number of ReadExReq accesses(hits+misses)
+system.l2c.ReadExReq_accesses::total           170347                       # number of ReadExReq accesses(hits+misses)
+system.l2c.ReadExReq_hits::0                    60613                       # number of ReadExReq hits
+system.l2c.ReadExReq_hits::total                60613                       # number of ReadExReq hits
+system.l2c.ReadExReq_miss_rate::0            0.644179                       # miss rate for ReadExReq accesses
+system.l2c.ReadExReq_misses::0                 109734                       # number of ReadExReq misses
+system.l2c.ReadExReq_misses::total             109734                       # number of ReadExReq misses
+system.l2c.ReadReq_accesses::0                 672769                       # number of ReadReq accesses(hits+misses)
+system.l2c.ReadReq_accesses::1                   6110                       # number of ReadReq accesses(hits+misses)
+system.l2c.ReadReq_accesses::total             678879                       # number of ReadReq accesses(hits+misses)
+system.l2c.ReadReq_hits::0                     651602                       # number of ReadReq hits
+system.l2c.ReadReq_hits::1                       6087                       # number of ReadReq hits
+system.l2c.ReadReq_hits::total                 657689                       # number of ReadReq hits
+system.l2c.ReadReq_miss_rate::0              0.031463                       # miss rate for ReadReq accesses
+system.l2c.ReadReq_miss_rate::1              0.003764                       # miss rate for ReadReq accesses
+system.l2c.ReadReq_miss_rate::total          0.035227                       # miss rate for ReadReq accesses
+system.l2c.ReadReq_misses::0                    21167                       # number of ReadReq misses
+system.l2c.ReadReq_misses::1                       23                       # number of ReadReq misses
+system.l2c.ReadReq_misses::total                21190                       # number of ReadReq misses
+system.l2c.UpgradeReq_accesses::0                1839                       # number of UpgradeReq accesses(hits+misses)
+system.l2c.UpgradeReq_accesses::total            1839                       # number of UpgradeReq accesses(hits+misses)
 system.l2c.UpgradeReq_hits::0                      17                       # number of UpgradeReq hits
 system.l2c.UpgradeReq_hits::total                  17                       # number of UpgradeReq hits
-system.l2c.UpgradeReq_miss_rate::0           0.990736                       # miss rate for UpgradeReq accesses
-system.l2c.UpgradeReq_misses::0                  1818                       # number of UpgradeReq misses
-system.l2c.UpgradeReq_misses::total              1818                       # number of UpgradeReq misses
-system.l2c.Writeback_accesses::0               412752                       # number of Writeback accesses(hits+misses)
-system.l2c.Writeback_accesses::total           412752                       # number of Writeback accesses(hits+misses)
-system.l2c.Writeback_hits::0                   412752                       # number of Writeback hits
-system.l2c.Writeback_hits::total               412752                       # number of Writeback hits
+system.l2c.UpgradeReq_miss_rate::0           0.990756                       # miss rate for UpgradeReq accesses
+system.l2c.UpgradeReq_misses::0                  1822                       # number of UpgradeReq misses
+system.l2c.UpgradeReq_misses::total              1822                       # number of UpgradeReq misses
+system.l2c.Writeback_accesses::0               415575                       # number of Writeback accesses(hits+misses)
+system.l2c.Writeback_accesses::total           415575                       # number of Writeback accesses(hits+misses)
+system.l2c.Writeback_hits::0                   415575                       # number of Writeback hits
+system.l2c.Writeback_hits::total               415575                       # number of Writeback hits
 system.l2c.avg_blocked_cycles::no_mshrs      no_value                       # average number of cycles each access was blocked
 system.l2c.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.l2c.avg_refs                          6.885433                       # Average number of references to valid blocks.
+system.l2c.avg_refs                          6.741439                       # Average number of references to valid blocks.
 system.l2c.blocked::no_mshrs                        0                       # number of cycles access was blocked
 system.l2c.blocked::no_targets                      0                       # number of cycles access was blocked
 system.l2c.blocked_cycles::no_mshrs                 0                       # number of cycles access was blocked
 system.l2c.blocked_cycles::no_targets               0                       # number of cycles access was blocked
 system.l2c.cache_copies                             0                       # number of cache copies performed
-system.l2c.demand_accesses::0                  835612                       # number of demand (read+write) accesses
-system.l2c.demand_accesses::1                    6073                       # number of demand (read+write) accesses
-system.l2c.demand_accesses::total              841685                       # number of demand (read+write) accesses
+system.l2c.demand_accesses::0                  843116                       # number of demand (read+write) accesses
+system.l2c.demand_accesses::1                    6110                       # number of demand (read+write) accesses
+system.l2c.demand_accesses::total              849226                       # number of demand (read+write) accesses
 system.l2c.demand_avg_miss_latency::0               0                       # average overall miss latency
 system.l2c.demand_avg_miss_latency::1               0                       # average overall miss latency
 system.l2c.demand_avg_miss_latency::total            0                       # average overall miss latency
 system.l2c.demand_avg_mshr_miss_latency      no_value                       # average overall mshr miss latency
-system.l2c.demand_hits::0                      708536                       # number of demand (read+write) hits
-system.l2c.demand_hits::1                        6049                       # number of demand (read+write) hits
-system.l2c.demand_hits::total                  714585                       # number of demand (read+write) hits
+system.l2c.demand_hits::0                      712215                       # number of demand (read+write) hits
+system.l2c.demand_hits::1                        6087                       # number of demand (read+write) hits
+system.l2c.demand_hits::total                  718302                       # number of demand (read+write) hits
 system.l2c.demand_miss_latency                      0                       # number of demand (read+write) miss cycles
-system.l2c.demand_miss_rate::0               0.152075                       # miss rate for demand accesses
-system.l2c.demand_miss_rate::1               0.003952                       # miss rate for demand accesses
-system.l2c.demand_miss_rate::total           0.156027                       # miss rate for demand accesses
-system.l2c.demand_misses::0                    127076                       # number of demand (read+write) misses
-system.l2c.demand_misses::1                        24                       # number of demand (read+write) misses
-system.l2c.demand_misses::total                127100                       # number of demand (read+write) misses
+system.l2c.demand_miss_rate::0               0.155259                       # miss rate for demand accesses
+system.l2c.demand_miss_rate::1               0.003764                       # miss rate for demand accesses
+system.l2c.demand_miss_rate::total           0.159023                       # miss rate for demand accesses
+system.l2c.demand_misses::0                    130901                       # number of demand (read+write) misses
+system.l2c.demand_misses::1                        23                       # number of demand (read+write) misses
+system.l2c.demand_misses::total                130924                       # number of demand (read+write) misses
 system.l2c.demand_mshr_hits                         0                       # number of demand (read+write) MSHR hits
 system.l2c.demand_mshr_miss_latency                 0                       # number of demand (read+write) MSHR miss cycles
 system.l2c.demand_mshr_miss_rate::0                 0                       # mshr miss rate for demand accesses
@@ -374,28 +374,28 @@ system.l2c.demand_mshr_misses                       0                       # nu
 system.l2c.fast_writes                              0                       # number of fast writes performed
 system.l2c.mshr_cap_events                          0                       # number of times MSHR cap was activated
 system.l2c.no_allocate_misses                       0                       # Number of misses that were no-allocate
-system.l2c.occ_%::0                          0.072507                       # Average percentage of cache occupancy
-system.l2c.occ_%::1                          0.478199                       # Average percentage of cache occupancy
-system.l2c.occ_blocks::0                  4751.792305                       # Average occupied blocks per context
-system.l2c.occ_blocks::1                 31339.221407                       # Average occupied blocks per context
-system.l2c.overall_accesses::0                 835612                       # number of overall (read+write) accesses
-system.l2c.overall_accesses::1                   6073                       # number of overall (read+write) accesses
-system.l2c.overall_accesses::total             841685                       # number of overall (read+write) accesses
+system.l2c.occ_%::0                          0.076407                       # Average percentage of cache occupancy
+system.l2c.occ_%::1                          0.476934                       # Average percentage of cache occupancy
+system.l2c.occ_blocks::0                  5007.401793                       # Average occupied blocks per context
+system.l2c.occ_blocks::1                 31256.365097                       # Average occupied blocks per context
+system.l2c.overall_accesses::0                 843116                       # number of overall (read+write) accesses
+system.l2c.overall_accesses::1                   6110                       # number of overall (read+write) accesses
+system.l2c.overall_accesses::total             849226                       # number of overall (read+write) accesses
 system.l2c.overall_avg_miss_latency::0              0                       # average overall miss latency
 system.l2c.overall_avg_miss_latency::1              0                       # average overall miss latency
 system.l2c.overall_avg_miss_latency::total            0                       # average overall miss latency
 system.l2c.overall_avg_mshr_miss_latency     no_value                       # average overall mshr miss latency
 system.l2c.overall_avg_mshr_uncacheable_latency     no_value                       # average overall mshr uncacheable latency
-system.l2c.overall_hits::0                     708536                       # number of overall hits
-system.l2c.overall_hits::1                       6049                       # number of overall hits
-system.l2c.overall_hits::total                 714585                       # number of overall hits
+system.l2c.overall_hits::0                     712215                       # number of overall hits
+system.l2c.overall_hits::1                       6087                       # number of overall hits
+system.l2c.overall_hits::total                 718302                       # number of overall hits
 system.l2c.overall_miss_latency                     0                       # number of overall miss cycles
-system.l2c.overall_miss_rate::0              0.152075                       # miss rate for overall accesses
-system.l2c.overall_miss_rate::1              0.003952                       # miss rate for overall accesses
-system.l2c.overall_miss_rate::total          0.156027                       # miss rate for overall accesses
-system.l2c.overall_misses::0                   127076                       # number of overall misses
-system.l2c.overall_misses::1                       24                       # number of overall misses
-system.l2c.overall_misses::total               127100                       # number of overall misses
+system.l2c.overall_miss_rate::0              0.155259                       # miss rate for overall accesses
+system.l2c.overall_miss_rate::1              0.003764                       # miss rate for overall accesses
+system.l2c.overall_miss_rate::total          0.159023                       # miss rate for overall accesses
+system.l2c.overall_misses::0                   130901                       # number of overall misses
+system.l2c.overall_misses::1                       23                       # number of overall misses
+system.l2c.overall_misses::total               130924                       # number of overall misses
 system.l2c.overall_mshr_hits                        0                       # number of overall MSHR hits
 system.l2c.overall_mshr_miss_latency                0                       # number of overall MSHR miss cycles
 system.l2c.overall_mshr_miss_rate::0                0                       # mshr miss rate for overall accesses
@@ -404,12 +404,12 @@ system.l2c.overall_mshr_miss_rate::total            0                       # ms
 system.l2c.overall_mshr_misses                      0                       # number of overall MSHR misses
 system.l2c.overall_mshr_uncacheable_latency            0                       # number of overall MSHR uncacheable cycles
 system.l2c.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.l2c.replacements                         95922                       # number of replacements
-system.l2c.sampled_refs                        125830                       # Sample count of references to valid blocks.
+system.l2c.replacements                         97028                       # number of replacements
+system.l2c.sampled_refs                        129660                       # Sample count of references to valid blocks.
 system.l2c.soft_prefetch_mshr_full                  0                       # number of mshr full events for SW prefetching instrutions
-system.l2c.tagsinuse                     36091.013712                       # Cycle average of tags in use
-system.l2c.total_refs                          866394                       # Total number of references to valid blocks.
+system.l2c.tagsinuse                     36263.766890                       # Cycle average of tags in use
+system.l2c.total_refs                          874095                       # Total number of references to valid blocks.
 system.l2c.warmup_cycle                             0                       # Cycle when the warmup percentage was hit.
-system.l2c.writebacks                           90126                       # number of writebacks
+system.l2c.writebacks                           90970                       # number of writebacks
 
 ---------- End Simulation Statistics   ----------
diff --git a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/config.ini b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/config.ini
index 4fad32362e..49b04d190d 100644
--- a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/config.ini
+++ b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/config.ini
@@ -7,11 +7,11 @@ time_sync_spin_threshold=100000000
 
 [system]
 type=LinuxArmSystem
-children=bridge cpu diskmem intrctrl iobus iocache l2c membus physmem realview terminal toL2Bus
+children=bridge cpu diskmem intrctrl iobus iocache l2c membus physmem realview terminal toL2Bus vncserver
 boot_cpu_frequency=500
 boot_osflags=earlyprintk mem=128MB console=ttyAMA0 lpj=19988480 norandmaps slram=slram0,0x8000000,+0x8000000 mtdparts=slram0:- rw loglevel=8 root=/dev/mtdblock0
 init_param=0
-kernel=/dist/m5/system/binaries/vmlinux.arm
+kernel=/chips/pd/randd/dist/binaries/vmlinux.arm
 load_addr_mask=268435455
 machine_type=RealView_PBX
 mem_mode=timing
@@ -164,7 +164,7 @@ type=ExeTracer
 
 [system.diskmem]
 type=PhysicalMemory
-file=/dist/m5/system/disks/ael-arm.ext2
+file=/chips/pd/randd/dist/disks/ael-arm.ext2
 latency=30000
 latency_var=0
 null=false
@@ -184,7 +184,7 @@ clock=1000
 header_cycles=1
 use_default_range=false
 width=64
-port=system.bridge.side_a system.realview.uart.pio system.realview.realview_io.pio system.realview.timer0.pio system.realview.timer1.pio system.realview.clcd.pio system.realview.kmi0.pio system.realview.kmi1.pio system.realview.dmac_fake.pio system.realview.uart1_fake.pio system.realview.uart2_fake.pio system.realview.uart3_fake.pio system.realview.smc_fake.pio system.realview.sp810_fake.pio system.realview.watchdog_fake.pio system.realview.gpio0_fake.pio system.realview.gpio1_fake.pio system.realview.gpio2_fake.pio system.realview.ssp_fake.pio system.realview.sci_fake.pio system.realview.aaci_fake.pio system.realview.mmc_fake.pio system.realview.rtc_fake.pio system.realview.flash_fake.pio system.iocache.cpu_side system.realview.clcd.dma
+port=system.bridge.side_a system.realview.uart.pio system.realview.realview_io.pio system.realview.timer0.pio system.realview.timer1.pio system.realview.clcd.pio system.realview.kmi0.pio system.realview.kmi1.pio system.realview.dmac_fake.pio system.realview.uart1_fake.pio system.realview.uart2_fake.pio system.realview.uart3_fake.pio system.realview.smc_fake.pio system.realview.sp810_fake.pio system.realview.watchdog_fake.pio system.realview.gpio0_fake.pio system.realview.gpio1_fake.pio system.realview.gpio2_fake.pio system.realview.ssp_fake.pio system.realview.sci_fake.pio system.realview.aaci_fake.pio system.realview.mmc_fake.pio system.realview.rtc_fake.pio system.realview.flash_fake.pio system.realview.cf0_fake.pio system.iocache.cpu_side system.realview.clcd.dma
 
 [system.iocache]
 type=BaseCache
@@ -214,7 +214,7 @@ tgts_per_mshr=12
 trace_addr=0
 two_queue=false
 write_buffers=8
-cpu_side=system.iobus.port[24]
+cpu_side=system.iobus.port[25]
 mem_side=system.membus.port[5]
 
 [system.l2c]
@@ -288,7 +288,7 @@ port=system.membus.port[1]
 
 [system.realview]
 type=RealView
-children=aaci_fake clcd dmac_fake flash_fake gic gpio0_fake gpio1_fake gpio2_fake kmi0 kmi1 l2x0_fake mmc_fake realview_io rtc_fake sci_fake smc_fake sp810_fake ssp_fake timer0 timer1 uart uart1_fake uart2_fake uart3_fake watchdog_fake
+children=aaci_fake cf0_fake clcd dmac_fake flash_fake gic gpio0_fake gpio1_fake gpio2_fake kmi0 kmi1 l2x0_fake mmc_fake realview_io rtc_fake sci_fake smc_fake sp810_fake ssp_fake timer0 timer1 uart uart1_fake uart2_fake uart3_fake watchdog_fake
 intrctrl=system.intrctrl
 system=system
 
@@ -302,6 +302,22 @@ platform=system.realview
 system=system
 pio=system.iobus.port[20]
 
+[system.realview.cf0_fake]
+type=IsaFake
+pio_addr=402653184
+pio_latency=1000
+pio_size=4095
+platform=system.realview
+ret_bad_addr=false
+ret_data16=65535
+ret_data32=4294967295
+ret_data64=18446744073709551615
+ret_data8=255
+system=system
+update_data=false
+warn_access=
+pio=system.iobus.port[24]
+
 [system.realview.clcd]
 type=Pl111
 amba_id=1315089
@@ -314,7 +330,8 @@ pio_addr=268566528
 pio_latency=10000
 platform=system.realview
 system=system
-dma=system.iobus.port[25]
+vnc=system.vncserver
+dma=system.iobus.port[26]
 pio=system.iobus.port[5]
 
 [system.realview.dmac_fake]
@@ -388,24 +405,28 @@ pio=system.iobus.port[17]
 type=Pl050
 amba_id=1314896
 gic=system.realview.gic
-int_delay=100000
+int_delay=1000000
 int_num=52
+is_mouse=false
 pio_addr=268460032
 pio_latency=1000
 platform=system.realview
 system=system
+vnc=system.vncserver
 pio=system.iobus.port[6]
 
 [system.realview.kmi1]
 type=Pl050
 amba_id=1314896
 gic=system.realview.gic
-int_delay=100000
+int_delay=1000000
 int_num=53
+is_mouse=true
 pio_addr=268464128
 pio_latency=1000
 platform=system.realview
 system=system
+vnc=system.vncserver
 pio=system.iobus.port[7]
 
 [system.realview.l2x0_fake]
@@ -591,3 +612,8 @@ use_default_range=false
 width=64
 port=system.l2c.cpu_side system.cpu.icache.mem_side system.cpu.dcache.mem_side system.cpu.itb.walker.port system.cpu.dtb.walker.port
 
+[system.vncserver]
+type=VncServer
+number=0
+port=5900
+
diff --git a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/simerr b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/simerr
index e76a50eec7..1cff4671c4 100755
--- a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/simerr
+++ b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/simerr
@@ -1,3 +1,5 @@
+warn: Sockets disabled, not accepting vnc client connections
+For more information see: http://www.m5sim.org/warn/af6a84f6
 warn: Sockets disabled, not accepting terminal connections
 For more information see: http://www.m5sim.org/warn/8742226b
 warn: Sockets disabled, not accepting gdb connections
diff --git a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/simout b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/simout
index 994dfb6a2d..2a456e7be0 100755
--- a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/simout
+++ b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/simout
@@ -5,12 +5,12 @@ The Regents of The University of Michigan
 All Rights Reserved
 
 
-M5 compiled Feb  7 2011 01:53:13
-M5 revision 4b4b02c5553c 7929 default qtip reupdatestats.patch tip
-M5 started Feb  7 2011 01:53:26
-M5 executing on burrito
+M5 compiled Feb 11 2011 17:53:57
+M5 revision 6c65f7ee86c1 7949 default qtip tip ext/vnc_stats_updates.patch
+M5 started Feb 11 2011 17:54:00
+M5 executing on u200439-lin.austin.arm.com
 command line: build/ARM_FS/m5.fast -d build/ARM_FS/tests/fast/quick/10.linux-boot/arm/linux/realview-simple-timing -re tests/run.py build/ARM_FS/tests/fast/quick/10.linux-boot/arm/linux/realview-simple-timing
 Global frequency set at 1000000000000 ticks per second
-info: kernel located at: /dist/m5/system/binaries/vmlinux.arm
+info: kernel located at: /chips/pd/randd/dist/binaries/vmlinux.arm
 info: Entering event queue @ 0.  Starting simulation...
-Exiting @ tick 114721074000 because m5_exit instruction encountered
+Exiting @ tick 114726567000 because m5_exit instruction encountered
diff --git a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/stats.txt b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/stats.txt
index 85fb992207..c96422cfa6 100644
--- a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/stats.txt
+++ b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/stats.txt
@@ -1,254 +1,254 @@
 
 ---------- Begin Simulation Statistics ----------
-host_inst_rate                                 433208                       # Simulator instruction rate (inst/s)
-host_mem_usage                                 360908                       # Number of bytes of host memory used
-host_seconds                                   116.74                       # Real time elapsed on the host
-host_tick_rate                              982709659                       # Simulator tick rate (ticks/s)
+host_inst_rate                                1425483                       # Simulator instruction rate (inst/s)
+host_mem_usage                                 374960                       # Number of bytes of host memory used
+host_seconds                                    35.49                       # Real time elapsed on the host
+host_tick_rate                             3232752918                       # Simulator tick rate (ticks/s)
 sim_freq                                 1000000000000                       # Frequency of simulated ticks
-sim_insts                                    50572425                       # Number of instructions simulated
-sim_seconds                                  0.114721                       # Number of seconds simulated
-sim_ticks                                114721074000                       # Number of ticks simulated
-system.cpu.dcache.LoadLockedReq_accesses::0       100214                       # number of LoadLockedReq accesses(hits+misses)
-system.cpu.dcache.LoadLockedReq_accesses::total       100214                       # number of LoadLockedReq accesses(hits+misses)
-system.cpu.dcache.LoadLockedReq_avg_miss_latency::0 15147.115385                       # average LoadLockedReq miss latency
+sim_insts                                    50588397                       # Number of instructions simulated
+sim_seconds                                  0.114727                       # Number of seconds simulated
+sim_ticks                                114726567000                       # Number of ticks simulated
+system.cpu.dcache.LoadLockedReq_accesses::0       100290                       # number of LoadLockedReq accesses(hits+misses)
+system.cpu.dcache.LoadLockedReq_accesses::total       100290                       # number of LoadLockedReq accesses(hits+misses)
+system.cpu.dcache.LoadLockedReq_avg_miss_latency::0 14562.978560                       # average LoadLockedReq miss latency
 system.cpu.dcache.LoadLockedReq_avg_miss_latency::1          inf                       # average LoadLockedReq miss latency
 system.cpu.dcache.LoadLockedReq_avg_miss_latency::total          inf                       # average LoadLockedReq miss latency
-system.cpu.dcache.LoadLockedReq_avg_mshr_miss_latency 12147.115385                       # average LoadLockedReq mshr miss latency
+system.cpu.dcache.LoadLockedReq_avg_mshr_miss_latency 11562.978560                       # average LoadLockedReq mshr miss latency
 system.cpu.dcache.LoadLockedReq_avg_mshr_uncacheable_latency          inf                       # average LoadLockedReq mshr uncacheable latency
-system.cpu.dcache.LoadLockedReq_hits::0         95014                       # number of LoadLockedReq hits
-system.cpu.dcache.LoadLockedReq_hits::total        95014                       # number of LoadLockedReq hits
-system.cpu.dcache.LoadLockedReq_miss_latency     78765000                       # number of LoadLockedReq miss cycles
-system.cpu.dcache.LoadLockedReq_miss_rate::0     0.051889                       # miss rate for LoadLockedReq accesses
-system.cpu.dcache.LoadLockedReq_misses::0         5200                       # number of LoadLockedReq misses
-system.cpu.dcache.LoadLockedReq_misses::total         5200                       # number of LoadLockedReq misses
-system.cpu.dcache.LoadLockedReq_mshr_miss_latency     63165000                       # number of LoadLockedReq MSHR miss cycles
-system.cpu.dcache.LoadLockedReq_mshr_miss_rate::0     0.051889                       # mshr miss rate for LoadLockedReq accesses
+system.cpu.dcache.LoadLockedReq_hits::0         95066                       # number of LoadLockedReq hits
+system.cpu.dcache.LoadLockedReq_hits::total        95066                       # number of LoadLockedReq hits
+system.cpu.dcache.LoadLockedReq_miss_latency     76077000                       # number of LoadLockedReq miss cycles
+system.cpu.dcache.LoadLockedReq_miss_rate::0     0.052089                       # miss rate for LoadLockedReq accesses
+system.cpu.dcache.LoadLockedReq_misses::0         5224                       # number of LoadLockedReq misses
+system.cpu.dcache.LoadLockedReq_misses::total         5224                       # number of LoadLockedReq misses
+system.cpu.dcache.LoadLockedReq_mshr_miss_latency     60405000                       # number of LoadLockedReq MSHR miss cycles
+system.cpu.dcache.LoadLockedReq_mshr_miss_rate::0     0.052089                       # mshr miss rate for LoadLockedReq accesses
 system.cpu.dcache.LoadLockedReq_mshr_miss_rate::1          inf                       # mshr miss rate for LoadLockedReq accesses
 system.cpu.dcache.LoadLockedReq_mshr_miss_rate::total          inf                       # mshr miss rate for LoadLockedReq accesses
-system.cpu.dcache.LoadLockedReq_mshr_misses         5200                       # number of LoadLockedReq MSHR misses
-system.cpu.dcache.LoadLockedReq_mshr_uncacheable_latency    310267000                       # number of LoadLockedReq MSHR uncacheable cycles
-system.cpu.dcache.ReadReq_accesses::0         7824780                       # number of ReadReq accesses(hits+misses)
-system.cpu.dcache.ReadReq_accesses::total      7824780                       # number of ReadReq accesses(hits+misses)
-system.cpu.dcache.ReadReq_avg_miss_latency::0 15798.342892                       # average ReadReq miss latency
+system.cpu.dcache.LoadLockedReq_mshr_misses         5224                       # number of LoadLockedReq MSHR misses
+system.cpu.dcache.LoadLockedReq_mshr_uncacheable_latency    310532000                       # number of LoadLockedReq MSHR uncacheable cycles
+system.cpu.dcache.ReadReq_accesses::0         7828656                       # number of ReadReq accesses(hits+misses)
+system.cpu.dcache.ReadReq_accesses::total      7828656                       # number of ReadReq accesses(hits+misses)
+system.cpu.dcache.ReadReq_avg_miss_latency::0 15679.539912                       # average ReadReq miss latency
 system.cpu.dcache.ReadReq_avg_miss_latency::1          inf                       # average ReadReq miss latency
 system.cpu.dcache.ReadReq_avg_miss_latency::total          inf                       # average ReadReq miss latency
-system.cpu.dcache.ReadReq_avg_mshr_miss_latency 12798.015358                       # average ReadReq mshr miss latency
+system.cpu.dcache.ReadReq_avg_mshr_miss_latency 12679.195749                       # average ReadReq mshr miss latency
 system.cpu.dcache.ReadReq_avg_mshr_uncacheable_latency          inf                       # average ReadReq mshr uncacheable latency
-system.cpu.dcache.ReadReq_hits::0             7588163                       # number of ReadReq hits
-system.cpu.dcache.ReadReq_hits::total         7588163                       # number of ReadReq hits
-system.cpu.dcache.ReadReq_miss_latency     3738156500                       # number of ReadReq miss cycles
-system.cpu.dcache.ReadReq_miss_rate::0       0.030239                       # miss rate for ReadReq accesses
-system.cpu.dcache.ReadReq_misses::0            236617                       # number of ReadReq misses
-system.cpu.dcache.ReadReq_misses::total        236617                       # number of ReadReq misses
-system.cpu.dcache.ReadReq_mshr_miss_latency   3028228000                       # number of ReadReq MSHR miss cycles
-system.cpu.dcache.ReadReq_mshr_miss_rate::0     0.030239                       # mshr miss rate for ReadReq accesses
+system.cpu.dcache.ReadReq_hits::0             7590397                       # number of ReadReq hits
+system.cpu.dcache.ReadReq_hits::total         7590397                       # number of ReadReq hits
+system.cpu.dcache.ReadReq_miss_latency     3735791500                       # number of ReadReq miss cycles
+system.cpu.dcache.ReadReq_miss_rate::0       0.030434                       # miss rate for ReadReq accesses
+system.cpu.dcache.ReadReq_misses::0            238259                       # number of ReadReq misses
+system.cpu.dcache.ReadReq_misses::total        238259                       # number of ReadReq misses
+system.cpu.dcache.ReadReq_mshr_miss_latency   3020932500                       # number of ReadReq MSHR miss cycles
+system.cpu.dcache.ReadReq_mshr_miss_rate::0     0.030434                       # mshr miss rate for ReadReq accesses
 system.cpu.dcache.ReadReq_mshr_miss_rate::1          inf                       # mshr miss rate for ReadReq accesses
 system.cpu.dcache.ReadReq_mshr_miss_rate::total          inf                       # mshr miss rate for ReadReq accesses
-system.cpu.dcache.ReadReq_mshr_misses          236617                       # number of ReadReq MSHR misses
-system.cpu.dcache.ReadReq_mshr_uncacheable_latency  38190415500                       # number of ReadReq MSHR uncacheable cycles
-system.cpu.dcache.StoreCondReq_accesses::0       100213                       # number of StoreCondReq accesses(hits+misses)
-system.cpu.dcache.StoreCondReq_accesses::total       100213                       # number of StoreCondReq accesses(hits+misses)
-system.cpu.dcache.StoreCondReq_hits::0         100213                       # number of StoreCondReq hits
-system.cpu.dcache.StoreCondReq_hits::total       100213                       # number of StoreCondReq hits
-system.cpu.dcache.WriteReq_accesses::0        6671860                       # number of WriteReq accesses(hits+misses)
-system.cpu.dcache.WriteReq_accesses::total      6671860                       # number of WriteReq accesses(hits+misses)
-system.cpu.dcache.WriteReq_avg_miss_latency::0 40836.063764                       # average WriteReq miss latency
+system.cpu.dcache.ReadReq_mshr_misses          238259                       # number of ReadReq MSHR misses
+system.cpu.dcache.ReadReq_mshr_uncacheable_latency  38191771500                       # number of ReadReq MSHR uncacheable cycles
+system.cpu.dcache.StoreCondReq_accesses::0       100289                       # number of StoreCondReq accesses(hits+misses)
+system.cpu.dcache.StoreCondReq_accesses::total       100289                       # number of StoreCondReq accesses(hits+misses)
+system.cpu.dcache.StoreCondReq_hits::0         100289                       # number of StoreCondReq hits
+system.cpu.dcache.StoreCondReq_hits::total       100289                       # number of StoreCondReq hits
+system.cpu.dcache.WriteReq_accesses::0        6674369                       # number of WriteReq accesses(hits+misses)
+system.cpu.dcache.WriteReq_accesses::total      6674369                       # number of WriteReq accesses(hits+misses)
+system.cpu.dcache.WriteReq_avg_miss_latency::0 40728.962545                       # average WriteReq miss latency
 system.cpu.dcache.WriteReq_avg_miss_latency::1          inf                       # average WriteReq miss latency
 system.cpu.dcache.WriteReq_avg_miss_latency::total          inf                       # average WriteReq miss latency
-system.cpu.dcache.WriteReq_avg_mshr_miss_latency 37835.781907                       # average WriteReq mshr miss latency
+system.cpu.dcache.WriteReq_avg_mshr_miss_latency 37728.712808                       # average WriteReq mshr miss latency
 system.cpu.dcache.WriteReq_avg_mshr_uncacheable_latency          inf                       # average WriteReq mshr uncacheable latency
-system.cpu.dcache.WriteReq_hits::0            6499787                       # number of WriteReq hits
-system.cpu.dcache.WriteReq_hits::total        6499787                       # number of WriteReq hits
-system.cpu.dcache.WriteReq_miss_latency    7026784000                       # number of WriteReq miss cycles
-system.cpu.dcache.WriteReq_miss_rate::0      0.025791                       # miss rate for WriteReq accesses
-system.cpu.dcache.WriteReq_misses::0           172073                       # number of WriteReq misses
-system.cpu.dcache.WriteReq_misses::total       172073                       # number of WriteReq misses
-system.cpu.dcache.WriteReq_mshr_miss_latency   6510516500                       # number of WriteReq MSHR miss cycles
-system.cpu.dcache.WriteReq_mshr_miss_rate::0     0.025791                       # mshr miss rate for WriteReq accesses
+system.cpu.dcache.WriteReq_hits::0            6502188                       # number of WriteReq hits
+system.cpu.dcache.WriteReq_hits::total        6502188                       # number of WriteReq hits
+system.cpu.dcache.WriteReq_miss_latency    7012753500                       # number of WriteReq miss cycles
+system.cpu.dcache.WriteReq_miss_rate::0      0.025797                       # miss rate for WriteReq accesses
+system.cpu.dcache.WriteReq_misses::0           172181                       # number of WriteReq misses
+system.cpu.dcache.WriteReq_misses::total       172181                       # number of WriteReq misses
+system.cpu.dcache.WriteReq_mshr_miss_latency   6496167500                       # number of WriteReq MSHR miss cycles
+system.cpu.dcache.WriteReq_mshr_miss_rate::0     0.025797                       # mshr miss rate for WriteReq accesses
 system.cpu.dcache.WriteReq_mshr_miss_rate::1          inf                       # mshr miss rate for WriteReq accesses
 system.cpu.dcache.WriteReq_mshr_miss_rate::total          inf                       # mshr miss rate for WriteReq accesses
-system.cpu.dcache.WriteReq_mshr_misses         172073                       # number of WriteReq MSHR misses
-system.cpu.dcache.WriteReq_mshr_uncacheable_latency    926046500                       # number of WriteReq MSHR uncacheable cycles
+system.cpu.dcache.WriteReq_mshr_misses         172181                       # number of WriteReq MSHR misses
+system.cpu.dcache.WriteReq_mshr_uncacheable_latency    927436000                       # number of WriteReq MSHR uncacheable cycles
 system.cpu.dcache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.dcache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.dcache.avg_refs                  34.660375                       # Average number of references to valid blocks.
+system.cpu.dcache.avg_refs                  34.529769                       # Average number of references to valid blocks.
 system.cpu.dcache.blocked::no_mshrs                 0                       # number of cycles access was blocked
 system.cpu.dcache.blocked::no_targets               0                       # number of cycles access was blocked
 system.cpu.dcache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.dcache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.dcache.cache_copies                      0                       # number of cache copies performed
-system.cpu.dcache.demand_accesses::0         14496640                       # number of demand (read+write) accesses
+system.cpu.dcache.demand_accesses::0         14503025                       # number of demand (read+write) accesses
 system.cpu.dcache.demand_accesses::1                0                       # number of demand (read+write) accesses
-system.cpu.dcache.demand_accesses::total     14496640                       # number of demand (read+write) accesses
-system.cpu.dcache.demand_avg_miss_latency::0 26340.112310                       # average overall miss latency
+system.cpu.dcache.demand_accesses::total     14503025                       # number of demand (read+write) accesses
+system.cpu.dcache.demand_avg_miss_latency::0 26187.859370                       # average overall miss latency
 system.cpu.dcache.demand_avg_miss_latency::1          inf                       # average overall miss latency
 system.cpu.dcache.demand_avg_miss_latency::total          inf                       # average overall miss latency
-system.cpu.dcache.demand_avg_mshr_miss_latency 23339.804008                       # average overall mshr miss latency
-system.cpu.dcache.demand_hits::0             14087950                       # number of demand (read+write) hits
+system.cpu.dcache.demand_avg_mshr_miss_latency 23187.554819                       # average overall mshr miss latency
+system.cpu.dcache.demand_hits::0             14092585                       # number of demand (read+write) hits
 system.cpu.dcache.demand_hits::1                    0                       # number of demand (read+write) hits
-system.cpu.dcache.demand_hits::total         14087950                       # number of demand (read+write) hits
-system.cpu.dcache.demand_miss_latency     10764940500                       # number of demand (read+write) miss cycles
-system.cpu.dcache.demand_miss_rate::0        0.028192                       # miss rate for demand accesses
+system.cpu.dcache.demand_hits::total         14092585                       # number of demand (read+write) hits
+system.cpu.dcache.demand_miss_latency     10748545000                       # number of demand (read+write) miss cycles
+system.cpu.dcache.demand_miss_rate::0        0.028300                       # miss rate for demand accesses
 system.cpu.dcache.demand_miss_rate::1        no_value                       # miss rate for demand accesses
 system.cpu.dcache.demand_miss_rate::total     no_value                       # miss rate for demand accesses
-system.cpu.dcache.demand_misses::0             408690                       # number of demand (read+write) misses
+system.cpu.dcache.demand_misses::0             410440                       # number of demand (read+write) misses
 system.cpu.dcache.demand_misses::1                  0                       # number of demand (read+write) misses
-system.cpu.dcache.demand_misses::total         408690                       # number of demand (read+write) misses
+system.cpu.dcache.demand_misses::total         410440                       # number of demand (read+write) misses
 system.cpu.dcache.demand_mshr_hits                  0                       # number of demand (read+write) MSHR hits
-system.cpu.dcache.demand_mshr_miss_latency   9538744500                       # number of demand (read+write) MSHR miss cycles
-system.cpu.dcache.demand_mshr_miss_rate::0     0.028192                       # mshr miss rate for demand accesses
+system.cpu.dcache.demand_mshr_miss_latency   9517100000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.dcache.demand_mshr_miss_rate::0     0.028300                       # mshr miss rate for demand accesses
 system.cpu.dcache.demand_mshr_miss_rate::1          inf                       # mshr miss rate for demand accesses
 system.cpu.dcache.demand_mshr_miss_rate::total          inf                       # mshr miss rate for demand accesses
-system.cpu.dcache.demand_mshr_misses           408690                       # number of demand (read+write) MSHR misses
+system.cpu.dcache.demand_mshr_misses           410440                       # number of demand (read+write) MSHR misses
 system.cpu.dcache.fast_writes                       0                       # number of fast writes performed
 system.cpu.dcache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.dcache.no_allocate_misses                0                       # Number of misses that were no-allocate
 system.cpu.dcache.occ_%::0                   0.994530                       # Average percentage of cache occupancy
-system.cpu.dcache.occ_blocks::0            509.199113                       # Average occupied blocks per context
-system.cpu.dcache.overall_accesses::0        14496640                       # number of overall (read+write) accesses
+system.cpu.dcache.occ_blocks::0            509.199247                       # Average occupied blocks per context
+system.cpu.dcache.overall_accesses::0        14503025                       # number of overall (read+write) accesses
 system.cpu.dcache.overall_accesses::1               0                       # number of overall (read+write) accesses
-system.cpu.dcache.overall_accesses::total     14496640                       # number of overall (read+write) accesses
-system.cpu.dcache.overall_avg_miss_latency::0 26340.112310                       # average overall miss latency
+system.cpu.dcache.overall_accesses::total     14503025                       # number of overall (read+write) accesses
+system.cpu.dcache.overall_avg_miss_latency::0 26187.859370                       # average overall miss latency
 system.cpu.dcache.overall_avg_miss_latency::1          inf                       # average overall miss latency
 system.cpu.dcache.overall_avg_miss_latency::total          inf                       # average overall miss latency
-system.cpu.dcache.overall_avg_mshr_miss_latency 23339.804008                       # average overall mshr miss latency
+system.cpu.dcache.overall_avg_mshr_miss_latency 23187.554819                       # average overall mshr miss latency
 system.cpu.dcache.overall_avg_mshr_uncacheable_latency          inf                       # average overall mshr uncacheable latency
-system.cpu.dcache.overall_hits::0            14087950                       # number of overall hits
+system.cpu.dcache.overall_hits::0            14092585                       # number of overall hits
 system.cpu.dcache.overall_hits::1                   0                       # number of overall hits
-system.cpu.dcache.overall_hits::total        14087950                       # number of overall hits
-system.cpu.dcache.overall_miss_latency    10764940500                       # number of overall miss cycles
-system.cpu.dcache.overall_miss_rate::0       0.028192                       # miss rate for overall accesses
+system.cpu.dcache.overall_hits::total        14092585                       # number of overall hits
+system.cpu.dcache.overall_miss_latency    10748545000                       # number of overall miss cycles
+system.cpu.dcache.overall_miss_rate::0       0.028300                       # miss rate for overall accesses
 system.cpu.dcache.overall_miss_rate::1       no_value                       # miss rate for overall accesses
 system.cpu.dcache.overall_miss_rate::total     no_value                       # miss rate for overall accesses
-system.cpu.dcache.overall_misses::0            408690                       # number of overall misses
+system.cpu.dcache.overall_misses::0            410440                       # number of overall misses
 system.cpu.dcache.overall_misses::1                 0                       # number of overall misses
-system.cpu.dcache.overall_misses::total        408690                       # number of overall misses
+system.cpu.dcache.overall_misses::total        410440                       # number of overall misses
 system.cpu.dcache.overall_mshr_hits                 0                       # number of overall MSHR hits
-system.cpu.dcache.overall_mshr_miss_latency   9538744500                       # number of overall MSHR miss cycles
-system.cpu.dcache.overall_mshr_miss_rate::0     0.028192                       # mshr miss rate for overall accesses
+system.cpu.dcache.overall_mshr_miss_latency   9517100000                       # number of overall MSHR miss cycles
+system.cpu.dcache.overall_mshr_miss_rate::0     0.028300                       # mshr miss rate for overall accesses
 system.cpu.dcache.overall_mshr_miss_rate::1          inf                       # mshr miss rate for overall accesses
 system.cpu.dcache.overall_mshr_miss_rate::total          inf                       # mshr miss rate for overall accesses
-system.cpu.dcache.overall_mshr_misses          408690                       # number of overall MSHR misses
-system.cpu.dcache.overall_mshr_uncacheable_latency  39116462000                       # number of overall MSHR uncacheable cycles
+system.cpu.dcache.overall_mshr_misses          410440                       # number of overall MSHR misses
+system.cpu.dcache.overall_mshr_uncacheable_latency  39119207500                       # number of overall MSHR uncacheable cycles
 system.cpu.dcache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.cpu.dcache.replacements                 411628                       # number of replacements
-system.cpu.dcache.sampled_refs                 412140                       # Sample count of references to valid blocks.
+system.cpu.dcache.replacements                 413327                       # number of replacements
+system.cpu.dcache.sampled_refs                 413839                       # Sample count of references to valid blocks.
 system.cpu.dcache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.dcache.tagsinuse                509.199113                       # Cycle average of tags in use
-system.cpu.dcache.total_refs                 14284927                       # Total number of references to valid blocks.
+system.cpu.dcache.tagsinuse                509.199247                       # Cycle average of tags in use
+system.cpu.dcache.total_refs                 14289765                       # Total number of references to valid blocks.
 system.cpu.dcache.warmup_cycle              658097000                       # Cycle when the warmup percentage was hit.
-system.cpu.dcache.writebacks                   382676                       # number of writebacks
-system.cpu.dtb.accesses                      15524935                       # DTB accesses
+system.cpu.dcache.writebacks                   381698                       # number of writebacks
+system.cpu.dtb.accesses                      15531532                       # DTB accesses
 system.cpu.dtb.align_faults                         0                       # Number of TLB faults due to alignment restrictions
 system.cpu.dtb.domain_faults                        0                       # Number of TLB faults due to domain restrictions
-system.cpu.dtb.flush_entries                     2199                       # Number of entries that have been flushed from TLB
+system.cpu.dtb.flush_entries                     2220                       # Number of entries that have been flushed from TLB
 system.cpu.dtb.flush_tlb                            2                       # Number of times complete TLB was flushed
 system.cpu.dtb.flush_tlb_asid                      40                       # Number of times TLB was flushed by ASID
 system.cpu.dtb.flush_tlb_mva                        0                       # Number of times TLB was flushed by MVA
 system.cpu.dtb.flush_tlb_mva_asid               33670                       # Number of times TLB was flushed by MVA & ASID
-system.cpu.dtb.hits                          15519414                       # DTB hits
+system.cpu.dtb.hits                          15525999                       # DTB hits
 system.cpu.dtb.inst_accesses                        0                       # ITB inst accesses
 system.cpu.dtb.inst_hits                            0                       # ITB inst hits
 system.cpu.dtb.inst_misses                          0                       # ITB inst misses
-system.cpu.dtb.misses                            5521                       # DTB misses
+system.cpu.dtb.misses                            5533                       # DTB misses
 system.cpu.dtb.perms_faults                       255                       # Number of TLB faults due to permissions restrictions
-system.cpu.dtb.prefetch_faults                    756                       # Number of TLB faults due to prefetch
-system.cpu.dtb.read_accesses                  8740303                       # DTB read accesses
-system.cpu.dtb.read_hits                      8735762                       # DTB read hits
-system.cpu.dtb.read_misses                       4541                       # DTB read misses
-system.cpu.dtb.write_accesses                 6784632                       # DTB write accesses
-system.cpu.dtb.write_hits                     6783652                       # DTB write hits
-system.cpu.dtb.write_misses                       980                       # DTB write misses
-system.cpu.icache.ReadReq_accesses::0        41543801                       # number of ReadReq accesses(hits+misses)
-system.cpu.icache.ReadReq_accesses::total     41543801                       # number of ReadReq accesses(hits+misses)
-system.cpu.icache.ReadReq_avg_miss_latency::0 14800.791885                       # average ReadReq miss latency
+system.cpu.dtb.prefetch_faults                    757                       # Number of TLB faults due to prefetch
+system.cpu.dtb.read_accesses                  8744287                       # DTB read accesses
+system.cpu.dtb.read_hits                      8739733                       # DTB read hits
+system.cpu.dtb.read_misses                       4554                       # DTB read misses
+system.cpu.dtb.write_accesses                 6787245                       # DTB write accesses
+system.cpu.dtb.write_hits                     6786266                       # DTB write hits
+system.cpu.dtb.write_misses                       979                       # DTB write misses
+system.cpu.icache.ReadReq_accesses::0        41555414                       # number of ReadReq accesses(hits+misses)
+system.cpu.icache.ReadReq_accesses::total     41555414                       # number of ReadReq accesses(hits+misses)
+system.cpu.icache.ReadReq_avg_miss_latency::0 14790.398445                       # average ReadReq miss latency
 system.cpu.icache.ReadReq_avg_miss_latency::1          inf                       # average ReadReq miss latency
 system.cpu.icache.ReadReq_avg_miss_latency::total          inf                       # average ReadReq miss latency
-system.cpu.icache.ReadReq_avg_mshr_miss_latency 11799.492843                       # average ReadReq mshr miss latency
+system.cpu.icache.ReadReq_avg_mshr_miss_latency 11789.103925                       # average ReadReq mshr miss latency
 system.cpu.icache.ReadReq_avg_mshr_uncacheable_latency          inf                       # average ReadReq mshr uncacheable latency
-system.cpu.icache.ReadReq_hits::0            41110405                       # number of ReadReq hits
-system.cpu.icache.ReadReq_hits::total        41110405                       # number of ReadReq hits
-system.cpu.icache.ReadReq_miss_latency     6414604000                       # number of ReadReq miss cycles
-system.cpu.icache.ReadReq_miss_rate::0       0.010432                       # miss rate for ReadReq accesses
-system.cpu.icache.ReadReq_misses::0            433396                       # number of ReadReq misses
-system.cpu.icache.ReadReq_misses::total        433396                       # number of ReadReq misses
-system.cpu.icache.ReadReq_mshr_miss_latency   5113853000                       # number of ReadReq MSHR miss cycles
-system.cpu.icache.ReadReq_mshr_miss_rate::0     0.010432                       # mshr miss rate for ReadReq accesses
+system.cpu.icache.ReadReq_hits::0            41121276                       # number of ReadReq hits
+system.cpu.icache.ReadReq_hits::total        41121276                       # number of ReadReq hits
+system.cpu.icache.ReadReq_miss_latency     6421074000                       # number of ReadReq miss cycles
+system.cpu.icache.ReadReq_miss_rate::0       0.010447                       # miss rate for ReadReq accesses
+system.cpu.icache.ReadReq_misses::0            434138                       # number of ReadReq misses
+system.cpu.icache.ReadReq_misses::total        434138                       # number of ReadReq misses
+system.cpu.icache.ReadReq_mshr_miss_latency   5118098000                       # number of ReadReq MSHR miss cycles
+system.cpu.icache.ReadReq_mshr_miss_rate::0     0.010447                       # mshr miss rate for ReadReq accesses
 system.cpu.icache.ReadReq_mshr_miss_rate::1          inf                       # mshr miss rate for ReadReq accesses
 system.cpu.icache.ReadReq_mshr_miss_rate::total          inf                       # mshr miss rate for ReadReq accesses
-system.cpu.icache.ReadReq_mshr_misses          433396                       # number of ReadReq MSHR misses
+system.cpu.icache.ReadReq_mshr_misses          434138                       # number of ReadReq MSHR misses
 system.cpu.icache.ReadReq_mshr_uncacheable_latency    349111000                       # number of ReadReq MSHR uncacheable cycles
 system.cpu.icache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.cpu.icache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.cpu.icache.avg_refs                  94.856667                       # Average number of references to valid blocks.
+system.cpu.icache.avg_refs                  94.719366                       # Average number of references to valid blocks.
 system.cpu.icache.blocked::no_mshrs                 0                       # number of cycles access was blocked
 system.cpu.icache.blocked::no_targets               0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_mshrs            0                       # number of cycles access was blocked
 system.cpu.icache.blocked_cycles::no_targets            0                       # number of cycles access was blocked
 system.cpu.icache.cache_copies                      0                       # number of cache copies performed
-system.cpu.icache.demand_accesses::0         41543801                       # number of demand (read+write) accesses
+system.cpu.icache.demand_accesses::0         41555414                       # number of demand (read+write) accesses
 system.cpu.icache.demand_accesses::1                0                       # number of demand (read+write) accesses
-system.cpu.icache.demand_accesses::total     41543801                       # number of demand (read+write) accesses
-system.cpu.icache.demand_avg_miss_latency::0 14800.791885                       # average overall miss latency
+system.cpu.icache.demand_accesses::total     41555414                       # number of demand (read+write) accesses
+system.cpu.icache.demand_avg_miss_latency::0 14790.398445                       # average overall miss latency
 system.cpu.icache.demand_avg_miss_latency::1          inf                       # average overall miss latency
 system.cpu.icache.demand_avg_miss_latency::total          inf                       # average overall miss latency
-system.cpu.icache.demand_avg_mshr_miss_latency 11799.492843                       # average overall mshr miss latency
-system.cpu.icache.demand_hits::0             41110405                       # number of demand (read+write) hits
+system.cpu.icache.demand_avg_mshr_miss_latency 11789.103925                       # average overall mshr miss latency
+system.cpu.icache.demand_hits::0             41121276                       # number of demand (read+write) hits
 system.cpu.icache.demand_hits::1                    0                       # number of demand (read+write) hits
-system.cpu.icache.demand_hits::total         41110405                       # number of demand (read+write) hits
-system.cpu.icache.demand_miss_latency      6414604000                       # number of demand (read+write) miss cycles
-system.cpu.icache.demand_miss_rate::0        0.010432                       # miss rate for demand accesses
+system.cpu.icache.demand_hits::total         41121276                       # number of demand (read+write) hits
+system.cpu.icache.demand_miss_latency      6421074000                       # number of demand (read+write) miss cycles
+system.cpu.icache.demand_miss_rate::0        0.010447                       # miss rate for demand accesses
 system.cpu.icache.demand_miss_rate::1        no_value                       # miss rate for demand accesses
 system.cpu.icache.demand_miss_rate::total     no_value                       # miss rate for demand accesses
-system.cpu.icache.demand_misses::0             433396                       # number of demand (read+write) misses
+system.cpu.icache.demand_misses::0             434138                       # number of demand (read+write) misses
 system.cpu.icache.demand_misses::1                  0                       # number of demand (read+write) misses
-system.cpu.icache.demand_misses::total         433396                       # number of demand (read+write) misses
+system.cpu.icache.demand_misses::total         434138                       # number of demand (read+write) misses
 system.cpu.icache.demand_mshr_hits                  0                       # number of demand (read+write) MSHR hits
-system.cpu.icache.demand_mshr_miss_latency   5113853000                       # number of demand (read+write) MSHR miss cycles
-system.cpu.icache.demand_mshr_miss_rate::0     0.010432                       # mshr miss rate for demand accesses
+system.cpu.icache.demand_mshr_miss_latency   5118098000                       # number of demand (read+write) MSHR miss cycles
+system.cpu.icache.demand_mshr_miss_rate::0     0.010447                       # mshr miss rate for demand accesses
 system.cpu.icache.demand_mshr_miss_rate::1          inf                       # mshr miss rate for demand accesses
 system.cpu.icache.demand_mshr_miss_rate::total          inf                       # mshr miss rate for demand accesses
-system.cpu.icache.demand_mshr_misses           433396                       # number of demand (read+write) MSHR misses
+system.cpu.icache.demand_mshr_misses           434138                       # number of demand (read+write) MSHR misses
 system.cpu.icache.fast_writes                       0                       # number of fast writes performed
 system.cpu.icache.mshr_cap_events                   0                       # number of times MSHR cap was activated
 system.cpu.icache.no_allocate_misses                0                       # Number of misses that were no-allocate
-system.cpu.icache.occ_%::0                   0.945788                       # Average percentage of cache occupancy
-system.cpu.icache.occ_blocks::0            484.243503                       # Average occupied blocks per context
-system.cpu.icache.overall_accesses::0        41543801                       # number of overall (read+write) accesses
+system.cpu.icache.occ_%::0                   0.946115                       # Average percentage of cache occupancy
+system.cpu.icache.occ_blocks::0            484.411008                       # Average occupied blocks per context
+system.cpu.icache.overall_accesses::0        41555414                       # number of overall (read+write) accesses
 system.cpu.icache.overall_accesses::1               0                       # number of overall (read+write) accesses
-system.cpu.icache.overall_accesses::total     41543801                       # number of overall (read+write) accesses
-system.cpu.icache.overall_avg_miss_latency::0 14800.791885                       # average overall miss latency
+system.cpu.icache.overall_accesses::total     41555414                       # number of overall (read+write) accesses
+system.cpu.icache.overall_avg_miss_latency::0 14790.398445                       # average overall miss latency
 system.cpu.icache.overall_avg_miss_latency::1          inf                       # average overall miss latency
 system.cpu.icache.overall_avg_miss_latency::total          inf                       # average overall miss latency
-system.cpu.icache.overall_avg_mshr_miss_latency 11799.492843                       # average overall mshr miss latency
+system.cpu.icache.overall_avg_mshr_miss_latency 11789.103925                       # average overall mshr miss latency
 system.cpu.icache.overall_avg_mshr_uncacheable_latency          inf                       # average overall mshr uncacheable latency
-system.cpu.icache.overall_hits::0            41110405                       # number of overall hits
+system.cpu.icache.overall_hits::0            41121276                       # number of overall hits
 system.cpu.icache.overall_hits::1                   0                       # number of overall hits
-system.cpu.icache.overall_hits::total        41110405                       # number of overall hits
-system.cpu.icache.overall_miss_latency     6414604000                       # number of overall miss cycles
-system.cpu.icache.overall_miss_rate::0       0.010432                       # miss rate for overall accesses
+system.cpu.icache.overall_hits::total        41121276                       # number of overall hits
+system.cpu.icache.overall_miss_latency     6421074000                       # number of overall miss cycles
+system.cpu.icache.overall_miss_rate::0       0.010447                       # miss rate for overall accesses
 system.cpu.icache.overall_miss_rate::1       no_value                       # miss rate for overall accesses
 system.cpu.icache.overall_miss_rate::total     no_value                       # miss rate for overall accesses
-system.cpu.icache.overall_misses::0            433396                       # number of overall misses
+system.cpu.icache.overall_misses::0            434138                       # number of overall misses
 system.cpu.icache.overall_misses::1                 0                       # number of overall misses
-system.cpu.icache.overall_misses::total        433396                       # number of overall misses
+system.cpu.icache.overall_misses::total        434138                       # number of overall misses
 system.cpu.icache.overall_mshr_hits                 0                       # number of overall MSHR hits
-system.cpu.icache.overall_mshr_miss_latency   5113853000                       # number of overall MSHR miss cycles
-system.cpu.icache.overall_mshr_miss_rate::0     0.010432                       # mshr miss rate for overall accesses
+system.cpu.icache.overall_mshr_miss_latency   5118098000                       # number of overall MSHR miss cycles
+system.cpu.icache.overall_mshr_miss_rate::0     0.010447                       # mshr miss rate for overall accesses
 system.cpu.icache.overall_mshr_miss_rate::1          inf                       # mshr miss rate for overall accesses
 system.cpu.icache.overall_mshr_miss_rate::total          inf                       # mshr miss rate for overall accesses
-system.cpu.icache.overall_mshr_misses          433396                       # number of overall MSHR misses
+system.cpu.icache.overall_mshr_misses          434138                       # number of overall MSHR misses
 system.cpu.icache.overall_mshr_uncacheable_latency    349111000                       # number of overall MSHR uncacheable cycles
 system.cpu.icache.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.cpu.icache.replacements                 432883                       # number of replacements
-system.cpu.icache.sampled_refs                 433395                       # Sample count of references to valid blocks.
+system.cpu.icache.replacements                 433626                       # number of replacements
+system.cpu.icache.sampled_refs                 434138                       # Sample count of references to valid blocks.
 system.cpu.icache.soft_prefetch_mshr_full            0                       # number of mshr full events for SW prefetching instrutions
-system.cpu.icache.tagsinuse                484.243503                       # Cycle average of tags in use
-system.cpu.icache.total_refs                 41110405                       # Total number of references to valid blocks.
+system.cpu.icache.tagsinuse                484.411008                       # Cycle average of tags in use
+system.cpu.icache.total_refs                 41121276                       # Total number of references to valid blocks.
 system.cpu.icache.warmup_cycle            14253306000                       # Cycle when the warmup percentage was hit.
-system.cpu.icache.writebacks                    33555                       # number of writebacks
+system.cpu.icache.writebacks                    34007                       # number of writebacks
 system.cpu.idle_fraction                            0                       # Percentage of idle cycles
-system.cpu.itb.accesses                      41546620                       # DTB accesses
+system.cpu.itb.accesses                      41558233                       # DTB accesses
 system.cpu.itb.align_faults                         0                       # Number of TLB faults due to alignment restrictions
 system.cpu.itb.domain_faults                        0                       # Number of TLB faults due to domain restrictions
 system.cpu.itb.flush_entries                     1478                       # Number of entries that have been flushed from TLB
@@ -256,9 +256,9 @@ system.cpu.itb.flush_tlb                            2                       # Nu
 system.cpu.itb.flush_tlb_asid                      40                       # Number of times TLB was flushed by ASID
 system.cpu.itb.flush_tlb_mva                        0                       # Number of times TLB was flushed by MVA
 system.cpu.itb.flush_tlb_mva_asid               33670                       # Number of times TLB was flushed by MVA & ASID
-system.cpu.itb.hits                          41543801                       # DTB hits
-system.cpu.itb.inst_accesses                 41546620                       # ITB inst accesses
-system.cpu.itb.inst_hits                     41543801                       # ITB inst hits
+system.cpu.itb.hits                          41555414                       # DTB hits
+system.cpu.itb.inst_accesses                 41558233                       # ITB inst accesses
+system.cpu.itb.inst_hits                     41555414                       # ITB inst hits
 system.cpu.itb.inst_misses                       2819                       # ITB inst misses
 system.cpu.itb.misses                            2819                       # DTB misses
 system.cpu.itb.perms_faults                         0                       # Number of TLB faults due to permissions restrictions
@@ -272,10 +272,10 @@ system.cpu.itb.write_misses                         0                       # DT
 system.cpu.kern.inst.arm                            0                       # number of arm instructions executed
 system.cpu.kern.inst.quiesce                        0                       # number of quiesce instructions executed
 system.cpu.not_idle_fraction                        1                       # Percentage of non-idle cycles
-system.cpu.numCycles                        229442148                       # number of cpu cycles simulated
+system.cpu.numCycles                        229453134                       # number of cpu cycles simulated
 system.cpu.numWorkItemsCompleted                    0                       # number of work items this cpu completed
 system.cpu.numWorkItemsStarted                      0                       # number of work items this cpu started
-system.cpu.num_busy_cycles                  229442148                       # Number of busy cycles
+system.cpu.num_busy_cycles                  229453134                       # Number of busy cycles
 system.cpu.num_conditional_control_insts            0                       # number of instructions that are conditional controls
 system.cpu.num_fp_alu_accesses                   6058                       # Number of float alu accesses
 system.cpu.num_fp_insts                          6058                       # number of float instructions
@@ -283,14 +283,14 @@ system.cpu.num_fp_register_reads                 4226                       # nu
 system.cpu.num_fp_register_writes                1834                       # number of times the floating registers were written
 system.cpu.num_func_calls                           0                       # number of times a function call or return occured
 system.cpu.num_idle_cycles                          0                       # Number of idle cycles
-system.cpu.num_insts                         50572425                       # Number of instructions executed
-system.cpu.num_int_alu_accesses              41827211                       # Number of integer alu accesses
-system.cpu.num_int_insts                     41827211                       # number of integer instructions
-system.cpu.num_int_register_reads           137988684                       # number of times the integer registers were read
-system.cpu.num_int_register_writes           34313952                       # number of times the integer registers were written
-system.cpu.num_load_insts                     9208240                       # Number of load instructions
-system.cpu.num_mem_refs                      16289993                       # number of memory refs
-system.cpu.num_store_insts                    7081753                       # Number of store instructions
+system.cpu.num_insts                         50588397                       # Number of instructions executed
+system.cpu.num_int_alu_accesses              41841366                       # Number of integer alu accesses
+system.cpu.num_int_insts                     41841366                       # number of integer instructions
+system.cpu.num_int_register_reads           138034734                       # number of times the integer registers were read
+system.cpu.num_int_register_writes           34325875                       # number of times the integer registers were written
+system.cpu.num_load_insts                     9211791                       # Number of load instructions
+system.cpu.num_mem_refs                      16296219                       # number of memory refs
+system.cpu.num_store_insts                    7084428                       # Number of store instructions
 system.iocache.avg_blocked_cycles::no_mshrs     no_value                       # average number of cycles each access was blocked
 system.iocache.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
 system.iocache.avg_refs                      no_value                       # Average number of references to valid blocks.
@@ -359,141 +359,141 @@ system.iocache.total_refs                           0                       # To
 system.iocache.warmup_cycle                         0                       # Cycle when the warmup percentage was hit.
 system.iocache.writebacks                           0                       # number of writebacks
 system.l2c.LoadLockedReq_avg_mshr_uncacheable_latency          inf                       # average LoadLockedReq mshr uncacheable latency
-system.l2c.LoadLockedReq_mshr_uncacheable_latency    234160000                       # number of LoadLockedReq MSHR uncacheable cycles
-system.l2c.ReadExReq_accesses::0               170323                       # number of ReadExReq accesses(hits+misses)
-system.l2c.ReadExReq_accesses::total           170323                       # number of ReadExReq accesses(hits+misses)
+system.l2c.LoadLockedReq_mshr_uncacheable_latency    234360000                       # number of LoadLockedReq MSHR uncacheable cycles
+system.l2c.ReadExReq_accesses::0               170356                       # number of ReadExReq accesses(hits+misses)
+system.l2c.ReadExReq_accesses::total           170356                       # number of ReadExReq accesses(hits+misses)
 system.l2c.ReadExReq_avg_miss_latency::0        52000                       # average ReadExReq miss latency
 system.l2c.ReadExReq_avg_miss_latency::1          inf                       # average ReadExReq miss latency
 system.l2c.ReadExReq_avg_miss_latency::total          inf                       # average ReadExReq miss latency
 system.l2c.ReadExReq_avg_mshr_miss_latency        40000                       # average ReadExReq mshr miss latency
-system.l2c.ReadExReq_hits::0                    62071                       # number of ReadExReq hits
-system.l2c.ReadExReq_hits::total                62071                       # number of ReadExReq hits
-system.l2c.ReadExReq_miss_latency          5629104000                       # number of ReadExReq miss cycles
-system.l2c.ReadExReq_miss_rate::0            0.635569                       # miss rate for ReadExReq accesses
-system.l2c.ReadExReq_misses::0                 108252                       # number of ReadExReq misses
-system.l2c.ReadExReq_misses::total             108252                       # number of ReadExReq misses
-system.l2c.ReadExReq_mshr_miss_latency     4330080000                       # number of ReadExReq MSHR miss cycles
-system.l2c.ReadExReq_mshr_miss_rate::0       0.635569                       # mshr miss rate for ReadExReq accesses
+system.l2c.ReadExReq_hits::0                    62546                       # number of ReadExReq hits
+system.l2c.ReadExReq_hits::total                62546                       # number of ReadExReq hits
+system.l2c.ReadExReq_miss_latency          5606120000                       # number of ReadExReq miss cycles
+system.l2c.ReadExReq_miss_rate::0            0.632851                       # miss rate for ReadExReq accesses
+system.l2c.ReadExReq_misses::0                 107810                       # number of ReadExReq misses
+system.l2c.ReadExReq_misses::total             107810                       # number of ReadExReq misses
+system.l2c.ReadExReq_mshr_miss_latency     4312400000                       # number of ReadExReq MSHR miss cycles
+system.l2c.ReadExReq_mshr_miss_rate::0       0.632851                       # mshr miss rate for ReadExReq accesses
 system.l2c.ReadExReq_mshr_miss_rate::1            inf                       # mshr miss rate for ReadExReq accesses
 system.l2c.ReadExReq_mshr_miss_rate::total          inf                       # mshr miss rate for ReadExReq accesses
-system.l2c.ReadExReq_mshr_misses               108252                       # number of ReadExReq MSHR misses
-system.l2c.ReadReq_accesses::0                 673101                       # number of ReadReq accesses(hits+misses)
-system.l2c.ReadReq_accesses::1                   5652                       # number of ReadReq accesses(hits+misses)
-system.l2c.ReadReq_accesses::total             678753                       # number of ReadReq accesses(hits+misses)
-system.l2c.ReadReq_avg_miss_latency::0   52096.523258                       # average ReadReq miss latency
-system.l2c.ReadReq_avg_miss_latency::1   28127657.142857                       # average ReadReq miss latency
-system.l2c.ReadReq_avg_miss_latency::total 28179753.666115                       # average ReadReq miss latency
+system.l2c.ReadExReq_mshr_misses               107810                       # number of ReadExReq MSHR misses
+system.l2c.ReadReq_accesses::0                 675489                       # number of ReadReq accesses(hits+misses)
+system.l2c.ReadReq_accesses::1                   5600                       # number of ReadReq accesses(hits+misses)
+system.l2c.ReadReq_accesses::total             681089                       # number of ReadReq accesses(hits+misses)
+system.l2c.ReadReq_avg_miss_latency::0   52080.437900                       # average ReadReq miss latency
+system.l2c.ReadReq_avg_miss_latency::1   33725803.571429                       # average ReadReq miss latency
+system.l2c.ReadReq_avg_miss_latency::total 33777884.009328                       # average ReadReq miss latency
 system.l2c.ReadReq_avg_mshr_miss_latency        40000                       # average ReadReq mshr miss latency
 system.l2c.ReadReq_avg_mshr_uncacheable_latency          inf                       # average ReadReq mshr uncacheable latency
-system.l2c.ReadReq_hits::0                     654204                       # number of ReadReq hits
-system.l2c.ReadReq_hits::1                       5617                       # number of ReadReq hits
-system.l2c.ReadReq_hits::total                 659821                       # number of ReadReq hits
-system.l2c.ReadReq_miss_latency             984468000                       # number of ReadReq miss cycles
-system.l2c.ReadReq_miss_rate::0              0.028075                       # miss rate for ReadReq accesses
-system.l2c.ReadReq_miss_rate::1              0.006192                       # miss rate for ReadReq accesses
-system.l2c.ReadReq_miss_rate::total          0.034267                       # miss rate for ReadReq accesses
-system.l2c.ReadReq_misses::0                    18897                       # number of ReadReq misses
-system.l2c.ReadReq_misses::1                       35                       # number of ReadReq misses
-system.l2c.ReadReq_misses::total                18932                       # number of ReadReq misses
-system.l2c.ReadReq_mshr_miss_latency        757280000                       # number of ReadReq MSHR miss cycles
-system.l2c.ReadReq_mshr_miss_rate::0         0.028127                       # mshr miss rate for ReadReq accesses
-system.l2c.ReadReq_mshr_miss_rate::1         3.349611                       # mshr miss rate for ReadReq accesses
-system.l2c.ReadReq_mshr_miss_rate::total     3.377737                       # mshr miss rate for ReadReq accesses
-system.l2c.ReadReq_mshr_misses                  18932                       # number of ReadReq MSHR misses
-system.l2c.ReadReq_mshr_uncacheable_latency  29199338000                       # number of ReadReq MSHR uncacheable cycles
-system.l2c.UpgradeReq_accesses::0                1750                       # number of UpgradeReq accesses(hits+misses)
-system.l2c.UpgradeReq_accesses::total            1750                       # number of UpgradeReq accesses(hits+misses)
-system.l2c.UpgradeReq_avg_miss_latency::0   660.126947                       # average UpgradeReq miss latency
+system.l2c.ReadReq_hits::0                     657357                       # number of ReadReq hits
+system.l2c.ReadReq_hits::1                       5572                       # number of ReadReq hits
+system.l2c.ReadReq_hits::total                 662929                       # number of ReadReq hits
+system.l2c.ReadReq_miss_latency             944322500                       # number of ReadReq miss cycles
+system.l2c.ReadReq_miss_rate::0              0.026843                       # miss rate for ReadReq accesses
+system.l2c.ReadReq_miss_rate::1              0.005000                       # miss rate for ReadReq accesses
+system.l2c.ReadReq_miss_rate::total          0.031843                       # miss rate for ReadReq accesses
+system.l2c.ReadReq_misses::0                    18132                       # number of ReadReq misses
+system.l2c.ReadReq_misses::1                       28                       # number of ReadReq misses
+system.l2c.ReadReq_misses::total                18160                       # number of ReadReq misses
+system.l2c.ReadReq_mshr_miss_latency        726400000                       # number of ReadReq MSHR miss cycles
+system.l2c.ReadReq_mshr_miss_rate::0         0.026884                       # mshr miss rate for ReadReq accesses
+system.l2c.ReadReq_mshr_miss_rate::1         3.242857                       # mshr miss rate for ReadReq accesses
+system.l2c.ReadReq_mshr_miss_rate::total     3.269741                       # mshr miss rate for ReadReq accesses
+system.l2c.ReadReq_mshr_misses                  18160                       # number of ReadReq MSHR misses
+system.l2c.ReadReq_mshr_uncacheable_latency  29200446000                       # number of ReadReq MSHR uncacheable cycles
+system.l2c.UpgradeReq_accesses::0                1825                       # number of UpgradeReq accesses(hits+misses)
+system.l2c.UpgradeReq_accesses::total            1825                       # number of UpgradeReq accesses(hits+misses)
+system.l2c.UpgradeReq_avg_miss_latency::0   489.208633                       # average UpgradeReq miss latency
 system.l2c.UpgradeReq_avg_miss_latency::1          inf                       # average UpgradeReq miss latency
 system.l2c.UpgradeReq_avg_miss_latency::total          inf                       # average UpgradeReq miss latency
 system.l2c.UpgradeReq_avg_mshr_miss_latency        40000                       # average UpgradeReq mshr miss latency
-system.l2c.UpgradeReq_hits::0                      17                       # number of UpgradeReq hits
-system.l2c.UpgradeReq_hits::total                  17                       # number of UpgradeReq hits
-system.l2c.UpgradeReq_miss_latency            1144000                       # number of UpgradeReq miss cycles
-system.l2c.UpgradeReq_miss_rate::0           0.990286                       # miss rate for UpgradeReq accesses
-system.l2c.UpgradeReq_misses::0                  1733                       # number of UpgradeReq misses
-system.l2c.UpgradeReq_misses::total              1733                       # number of UpgradeReq misses
-system.l2c.UpgradeReq_mshr_miss_latency      69320000                       # number of UpgradeReq MSHR miss cycles
-system.l2c.UpgradeReq_mshr_miss_rate::0      0.990286                       # mshr miss rate for UpgradeReq accesses
+system.l2c.UpgradeReq_hits::0                      18                       # number of UpgradeReq hits
+system.l2c.UpgradeReq_hits::total                  18                       # number of UpgradeReq hits
+system.l2c.UpgradeReq_miss_latency             884000                       # number of UpgradeReq miss cycles
+system.l2c.UpgradeReq_miss_rate::0           0.990137                       # miss rate for UpgradeReq accesses
+system.l2c.UpgradeReq_misses::0                  1807                       # number of UpgradeReq misses
+system.l2c.UpgradeReq_misses::total              1807                       # number of UpgradeReq misses
+system.l2c.UpgradeReq_mshr_miss_latency      72280000                       # number of UpgradeReq MSHR miss cycles
+system.l2c.UpgradeReq_mshr_miss_rate::0      0.990137                       # mshr miss rate for UpgradeReq accesses
 system.l2c.UpgradeReq_mshr_miss_rate::1           inf                       # mshr miss rate for UpgradeReq accesses
 system.l2c.UpgradeReq_mshr_miss_rate::total          inf                       # mshr miss rate for UpgradeReq accesses
-system.l2c.UpgradeReq_mshr_misses                1733                       # number of UpgradeReq MSHR misses
+system.l2c.UpgradeReq_mshr_misses                1807                       # number of UpgradeReq MSHR misses
 system.l2c.WriteReq_avg_mshr_uncacheable_latency          inf                       # average WriteReq mshr uncacheable latency
-system.l2c.WriteReq_mshr_uncacheable_latency    739844000                       # number of WriteReq MSHR uncacheable cycles
-system.l2c.Writeback_accesses::0               416231                       # number of Writeback accesses(hits+misses)
-system.l2c.Writeback_accesses::total           416231                       # number of Writeback accesses(hits+misses)
-system.l2c.Writeback_hits::0                   416231                       # number of Writeback hits
-system.l2c.Writeback_hits::total               416231                       # number of Writeback hits
+system.l2c.WriteReq_mshr_uncacheable_latency    740884000                       # number of WriteReq MSHR uncacheable cycles
+system.l2c.Writeback_accesses::0               415705                       # number of Writeback accesses(hits+misses)
+system.l2c.Writeback_accesses::total           415705                       # number of Writeback accesses(hits+misses)
+system.l2c.Writeback_hits::0                   415705                       # number of Writeback hits
+system.l2c.Writeback_hits::total               415705                       # number of Writeback hits
 system.l2c.avg_blocked_cycles::no_mshrs      no_value                       # average number of cycles each access was blocked
 system.l2c.avg_blocked_cycles::no_targets     no_value                       # average number of cycles each access was blocked
-system.l2c.avg_refs                          6.975292                       # Average number of references to valid blocks.
+system.l2c.avg_refs                          7.060757                       # Average number of references to valid blocks.
 system.l2c.blocked::no_mshrs                        0                       # number of cycles access was blocked
 system.l2c.blocked::no_targets                      0                       # number of cycles access was blocked
 system.l2c.blocked_cycles::no_mshrs                 0                       # number of cycles access was blocked
 system.l2c.blocked_cycles::no_targets               0                       # number of cycles access was blocked
 system.l2c.cache_copies                             0                       # number of cache copies performed
-system.l2c.demand_accesses::0                  843424                       # number of demand (read+write) accesses
-system.l2c.demand_accesses::1                    5652                       # number of demand (read+write) accesses
-system.l2c.demand_accesses::total              849076                       # number of demand (read+write) accesses
-system.l2c.demand_avg_miss_latency::0    52014.345374                       # average overall miss latency
-system.l2c.demand_avg_miss_latency::1       188959200                       # average overall miss latency
-system.l2c.demand_avg_miss_latency::total 189011214.345374                       # average overall miss latency
+system.l2c.demand_accesses::0                  845845                       # number of demand (read+write) accesses
+system.l2c.demand_accesses::1                    5600                       # number of demand (read+write) accesses
+system.l2c.demand_accesses::total              851445                       # number of demand (read+write) accesses
+system.l2c.demand_avg_miss_latency::0    52011.580728                       # average overall miss latency
+system.l2c.demand_avg_miss_latency::1       233944375                       # average overall miss latency
+system.l2c.demand_avg_miss_latency::total 233996386.580728                       # average overall miss latency
 system.l2c.demand_avg_mshr_miss_latency         40000                       # average overall mshr miss latency
-system.l2c.demand_hits::0                      716275                       # number of demand (read+write) hits
-system.l2c.demand_hits::1                        5617                       # number of demand (read+write) hits
-system.l2c.demand_hits::total                  721892                       # number of demand (read+write) hits
-system.l2c.demand_miss_latency             6613572000                       # number of demand (read+write) miss cycles
-system.l2c.demand_miss_rate::0               0.150753                       # miss rate for demand accesses
-system.l2c.demand_miss_rate::1               0.006192                       # miss rate for demand accesses
-system.l2c.demand_miss_rate::total           0.156946                       # miss rate for demand accesses
-system.l2c.demand_misses::0                    127149                       # number of demand (read+write) misses
-system.l2c.demand_misses::1                        35                       # number of demand (read+write) misses
-system.l2c.demand_misses::total                127184                       # number of demand (read+write) misses
+system.l2c.demand_hits::0                      719903                       # number of demand (read+write) hits
+system.l2c.demand_hits::1                        5572                       # number of demand (read+write) hits
+system.l2c.demand_hits::total                  725475                       # number of demand (read+write) hits
+system.l2c.demand_miss_latency             6550442500                       # number of demand (read+write) miss cycles
+system.l2c.demand_miss_rate::0               0.148895                       # miss rate for demand accesses
+system.l2c.demand_miss_rate::1               0.005000                       # miss rate for demand accesses
+system.l2c.demand_miss_rate::total           0.153895                       # miss rate for demand accesses
+system.l2c.demand_misses::0                    125942                       # number of demand (read+write) misses
+system.l2c.demand_misses::1                        28                       # number of demand (read+write) misses
+system.l2c.demand_misses::total                125970                       # number of demand (read+write) misses
 system.l2c.demand_mshr_hits                         0                       # number of demand (read+write) MSHR hits
-system.l2c.demand_mshr_miss_latency        5087360000                       # number of demand (read+write) MSHR miss cycles
-system.l2c.demand_mshr_miss_rate::0          0.150795                       # mshr miss rate for demand accesses
-system.l2c.demand_mshr_miss_rate::1         22.502477                       # mshr miss rate for demand accesses
-system.l2c.demand_mshr_miss_rate::total     22.653272                       # mshr miss rate for demand accesses
-system.l2c.demand_mshr_misses                  127184                       # number of demand (read+write) MSHR misses
+system.l2c.demand_mshr_miss_latency        5038800000                       # number of demand (read+write) MSHR miss cycles
+system.l2c.demand_mshr_miss_rate::0          0.148928                       # mshr miss rate for demand accesses
+system.l2c.demand_mshr_miss_rate::1         22.494643                       # mshr miss rate for demand accesses
+system.l2c.demand_mshr_miss_rate::total     22.643571                       # mshr miss rate for demand accesses
+system.l2c.demand_mshr_misses                  125970                       # number of demand (read+write) MSHR misses
 system.l2c.fast_writes                              0                       # number of fast writes performed
 system.l2c.mshr_cap_events                          0                       # number of times MSHR cap was activated
 system.l2c.no_allocate_misses                       0                       # Number of misses that were no-allocate
-system.l2c.occ_%::0                          0.086431                       # Average percentage of cache occupancy
-system.l2c.occ_%::1                          0.477933                       # Average percentage of cache occupancy
-system.l2c.occ_blocks::0                  5664.361976                       # Average occupied blocks per context
-system.l2c.occ_blocks::1                 31321.847814                       # Average occupied blocks per context
-system.l2c.overall_accesses::0                 843424                       # number of overall (read+write) accesses
-system.l2c.overall_accesses::1                   5652                       # number of overall (read+write) accesses
-system.l2c.overall_accesses::total             849076                       # number of overall (read+write) accesses
-system.l2c.overall_avg_miss_latency::0   52014.345374                       # average overall miss latency
-system.l2c.overall_avg_miss_latency::1      188959200                       # average overall miss latency
-system.l2c.overall_avg_miss_latency::total 189011214.345374                       # average overall miss latency
+system.l2c.occ_%::0                          0.081481                       # Average percentage of cache occupancy
+system.l2c.occ_%::1                          0.477898                       # Average percentage of cache occupancy
+system.l2c.occ_blocks::0                  5339.953820                       # Average occupied blocks per context
+system.l2c.occ_blocks::1                 31319.548737                       # Average occupied blocks per context
+system.l2c.overall_accesses::0                 845845                       # number of overall (read+write) accesses
+system.l2c.overall_accesses::1                   5600                       # number of overall (read+write) accesses
+system.l2c.overall_accesses::total             851445                       # number of overall (read+write) accesses
+system.l2c.overall_avg_miss_latency::0   52011.580728                       # average overall miss latency
+system.l2c.overall_avg_miss_latency::1      233944375                       # average overall miss latency
+system.l2c.overall_avg_miss_latency::total 233996386.580728                       # average overall miss latency
 system.l2c.overall_avg_mshr_miss_latency        40000                       # average overall mshr miss latency
 system.l2c.overall_avg_mshr_uncacheable_latency          inf                       # average overall mshr uncacheable latency
-system.l2c.overall_hits::0                     716275                       # number of overall hits
-system.l2c.overall_hits::1                       5617                       # number of overall hits
-system.l2c.overall_hits::total                 721892                       # number of overall hits
-system.l2c.overall_miss_latency            6613572000                       # number of overall miss cycles
-system.l2c.overall_miss_rate::0              0.150753                       # miss rate for overall accesses
-system.l2c.overall_miss_rate::1              0.006192                       # miss rate for overall accesses
-system.l2c.overall_miss_rate::total          0.156946                       # miss rate for overall accesses
-system.l2c.overall_misses::0                   127149                       # number of overall misses
-system.l2c.overall_misses::1                       35                       # number of overall misses
-system.l2c.overall_misses::total               127184                       # number of overall misses
+system.l2c.overall_hits::0                     719903                       # number of overall hits
+system.l2c.overall_hits::1                       5572                       # number of overall hits
+system.l2c.overall_hits::total                 725475                       # number of overall hits
+system.l2c.overall_miss_latency            6550442500                       # number of overall miss cycles
+system.l2c.overall_miss_rate::0              0.148895                       # miss rate for overall accesses
+system.l2c.overall_miss_rate::1              0.005000                       # miss rate for overall accesses
+system.l2c.overall_miss_rate::total          0.153895                       # miss rate for overall accesses
+system.l2c.overall_misses::0                   125942                       # number of overall misses
+system.l2c.overall_misses::1                       28                       # number of overall misses
+system.l2c.overall_misses::total               125970                       # number of overall misses
 system.l2c.overall_mshr_hits                        0                       # number of overall MSHR hits
-system.l2c.overall_mshr_miss_latency       5087360000                       # number of overall MSHR miss cycles
-system.l2c.overall_mshr_miss_rate::0         0.150795                       # mshr miss rate for overall accesses
-system.l2c.overall_mshr_miss_rate::1        22.502477                       # mshr miss rate for overall accesses
-system.l2c.overall_mshr_miss_rate::total    22.653272                       # mshr miss rate for overall accesses
-system.l2c.overall_mshr_misses                 127184                       # number of overall MSHR misses
-system.l2c.overall_mshr_uncacheable_latency  29939182000                       # number of overall MSHR uncacheable cycles
+system.l2c.overall_mshr_miss_latency       5038800000                       # number of overall MSHR miss cycles
+system.l2c.overall_mshr_miss_rate::0         0.148928                       # mshr miss rate for overall accesses
+system.l2c.overall_mshr_miss_rate::1        22.494643                       # mshr miss rate for overall accesses
+system.l2c.overall_mshr_miss_rate::total    22.643571                       # mshr miss rate for overall accesses
+system.l2c.overall_mshr_misses                 125970                       # number of overall MSHR misses
+system.l2c.overall_mshr_uncacheable_latency  29941330000                       # number of overall MSHR uncacheable cycles
 system.l2c.overall_mshr_uncacheable_misses            0                       # number of overall MSHR uncacheable misses
-system.l2c.replacements                         94170                       # number of replacements
-system.l2c.sampled_refs                        125831                       # Sample count of references to valid blocks.
+system.l2c.replacements                         93233                       # number of replacements
+system.l2c.sampled_refs                        124676                       # Sample count of references to valid blocks.
 system.l2c.soft_prefetch_mshr_full                  0                       # number of mshr full events for SW prefetching instrutions
-system.l2c.tagsinuse                     36986.209790                       # Cycle average of tags in use
-system.l2c.total_refs                          877708                       # Total number of references to valid blocks.
+system.l2c.tagsinuse                     36659.502556                       # Cycle average of tags in use
+system.l2c.total_refs                          880307                       # Total number of references to valid blocks.
 system.l2c.warmup_cycle                             0                       # Cycle when the warmup percentage was hit.
-system.l2c.writebacks                           87626                       # number of writebacks
+system.l2c.writebacks                           87349                       # number of writebacks
 
 ---------- End Simulation Statistics   ----------
diff --git a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/system.terminal b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/system.terminal
index f3053783c0..3921585dfd 100644
Binary files a/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/system.terminal and b/tests/quick/10.linux-boot/ref/arm/linux/realview-simple-timing/system.terminal differ
diff --git a/util/make_release.py b/util/make_release.py
deleted file mode 100755
index 5a47f06587..0000000000
--- a/util/make_release.py
+++ /dev/null
@@ -1,222 +0,0 @@
-#!/usr/bin/env python
-# Copyright (c) 2006-2008 The Regents of The University of Michigan
-# All rights reserved.
-#
-# Redistribution and use in source and binary forms, with or without
-# modification, are permitted provided that the following conditions are
-# met: redistributions of source code must retain the above copyright
-# notice, this list of conditions and the following disclaimer;
-# redistributions in binary form must reproduce the above copyright
-# notice, this list of conditions and the following disclaimer in the
-# documentation and/or other materials provided with the distribution;
-# neither the name of the copyright holders nor the names of its
-# contributors may be used to endorse or promote products derived from
-# this software without specific prior written permission.
-#
-# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-#
-# Authors: Ali Saidi
-#          Steve Reinhardt
-#          Nathan Binkert
-
-import os
-import re
-import shutil
-import sys
-import time
-
-from glob import glob
-from os import system
-from os.path import basename, dirname, exists, isdir, isfile, join as joinpath
-
-def mkdir(*args):
-    path = joinpath(*args)
-    os.mkdir(path)
-
-def touch(*args, **kwargs):
-    when = kwargs.get('when', None)
-    path = joinpath(*args)
-    os.utime(path, when)
-
-def rmtree(*args):
-    path = joinpath(*args)
-    for match in glob(path):
-        if isdir(match):
-            shutil.rmtree(match)
-        else:
-            os.unlink(match)
-
-def remove(*args):
-    path = joinpath(*args)
-    for match in glob(path):
-        if not isdir(match):
-            os.unlink(match)
-
-def movedir(srcdir, destdir, dir):
-    src = joinpath(srcdir, dir)
-    dest = joinpath(destdir, dir)
-
-    if not isdir(src):
-        raise AttributeError
-
-    os.makedirs(dirname(dest))
-    shutil.move(src, dest)
-
-if not isdir('.hg'):
-    sys.exit('Not in the top level of an m5 tree!')
-
-usage = '%s <destdir> <release name>' % sys.argv[0]
-
-if len(sys.argv) != 3:
-    sys.exit(usage)
-
-destdir = sys.argv[1]
-releasename = sys.argv[2]
-release_dest = joinpath(destdir, 'release')
-#encumbered_dest = joinpath(destdir, 'encumbered')
-release_dir = joinpath(release_dest, releasename)
-#encumbered_dir = joinpath(encumbered_dest, releasename)
-
-if exists(destdir):
-    if not isdir(destdir):
-        raise AttributeError, '%s exists, but is not a directory' % destdir
-else:
-    mkdir(destdir)
-
-if exists(release_dest):
-    if not isdir(release_dest):
-        raise AttributeError, \
-              '%s exists, but is not a directory' % release_dest
-    rmtree(release_dest)
-
-#if exists(encumbered_dest):
-#    if not isdir(encumbered_dest):
-#       raise AttributeError, \
-#             '%s exists, but is not a directory' % encumbered_dest
-#   rmtree(encumbered_dest)
-
-mkdir(release_dest)
-#mkdir(encumbered_dest)
-mkdir(release_dir)
-#mkdir(encumbered_dir)
-
-system('hg update')
-system('rsync -av --exclude ".hg*" --exclude build . %s' % release_dir)
-# move the time forward on some files by a couple of minutes so we can
-# avoid building things unnecessarily
-when = int(time.time()) + 120
-
-# make sure scons doesn't try to run flex unnecessarily
-#touch(release_dir, 'src/encumbered/eio/exolex.cc', when=(when, when))
-
-# get rid of non-shipping code
-#rmtree(release_dir, 'src/encumbered/dev')
-rmtree(release_dir, 'src/cpu/ozone')
-#rmtree(release_dir, 'src/mem/cache/tags/split*.cc')
-#rmtree(release_dir, 'src/mem/cache/tags/split*.hh')
-#rmtree(release_dir, 'src/mem/cache/prefetch/ghb_*.cc')
-#rmtree(release_dir, 'src/mem/cache/prefetch/ghb_*.hh')
-#rmtree(release_dir, 'src/mem/cache/prefetch/stride_*.cc')
-#rmtree(release_dir, 'src/mem/cache/prefetch/stride_*.hh')
-rmtree(release_dir, 'configs/fullsys')
-rmtree(release_dir, 'configs/test')
-rmtree(release_dir, 'tests/long/*/ref')
-rmtree(release_dir, 'tests/old')
-rmtree(release_dir, 'tests/quick/00.hello/ref/x86')
-rmtree(release_dir, 'tests/quick/02.insttest')
-rmtree(release_dir, 'tests/test-progs/hello/bin/x86')
-
-remove(release_dir, 'src/cpu/nativetrace.hh')
-remove(release_dir, 'src/cpu/nativetrace.cc')
-
-# get rid of some of private scripts
-remove(release_dir, 'util/chgcopyright')
-remove(release_dir, 'util/make_release.py')
-
-def remove_sources(regex, subdir):
-    script = joinpath(release_dir, subdir, 'SConscript')
-    if isinstance(regex, str):
-        regex = re.compile(regex)
-    inscript = file(script, 'r').readlines()
-    outscript = file(script, 'w')
-    for line in inscript:
-        if regex.match(line):
-            continue
-
-        outscript.write(line)
-    outscript.close()
-
-def remove_lines(s_regex, e_regex, f):
-    f = joinpath(release_dir, f)
-    if isinstance(s_regex, str):
-        s_regex = re.compile(s_regex)
-    if isinstance(e_regex, str):
-        e_regex = re.compile(e_regex)
-    inscript = file(f, 'r').readlines()
-    outscript = file(f, 'w')
-    skipping = False
-    for line in inscript:
-        if (not skipping and s_regex.match(line)) or \
-                (e_regex and skipping and not e_regex.match(line)):
-            if e_regex:
-                skipping = True
-            continue
-        skipping = False
-        outscript.write(line)
-    outscript.close()
-
-def replace_line(s_regex, f, rl):
-    f = joinpath(release_dir, f)
-    if isinstance(s_regex, str):
-        s_regex = re.compile(s_regex)
-    inscript = file(f, 'r').readlines()
-    outscript = file(f, 'w')
-    for line in inscript:
-        if s_regex.match(line):
-            outscript.write(rl)
-            continue
-        outscript.write(line)
-    outscript.close()
-
-
-# fix up the SConscript to deal with files we've removed
-#remove_sources(r'.*split.*\.cc', 'src/mem/cache/tags')
-#remove_sources(r'.*(ghb|stride)_prefetcher\.cc', 'src/mem/cache/prefetch')
-remove_sources(r'.*nativetrace.*', 'src/cpu')
-
-benches = [ 'bzip2', 'eon', 'gzip', 'mcf', 'parser', 'perlbmk',
-            'twolf', 'vortex' ]
-for bench in benches:
-    rmtree(release_dir, 'tests', 'test-progs', bench)
-
-#movedir(release_dir, encumbered_dir, 'src/encumbered')
-rmtree(release_dir, 'tests/test-progs/anagram')
-rmtree(release_dir, 'tests/quick/20.eio-short')
-
-f = open('src/cpu/SConsopts', 'w+')
-f.writelines(("Import('*')\n", "all_cpu_list.append('DummyCPUMakeSconsHappy')\n"))
-f.close()
-
-
-def taritup(directory, destdir, filename):
-    basedir = dirname(directory)
-    tarball = joinpath(destdir, filename)
-    tardir = basename(directory)
-
-    system('cd %s; tar cfj %s %s' % (basedir, tarball, tardir))
-
-taritup(release_dir, destdir, '%s.tar.bz2' % releasename)
-#taritup(encumbered_dir, destdir, '%s-encumbered.tar.bz2' % releasename)
-
-print "release created in %s" % destdir
-print "don't forget to tag the repository!"