misc: Merge branch 'release-staging-v23-0' into stable

Change-Id: Ie2012ea0ae86401181cf02de3e22401e406a18e6
This commit is contained in:
Bobby R. Bruce
2023-07-07 19:25:10 -07:00
1793 changed files with 67805 additions and 18789 deletions


@@ -26,3 +26,6 @@ c3bd8eb1214cbebbc92c7958b80aa06913bce3ba
# A commit which ran Python Black on all Python files.
# https://gem5-review.googlesource.com/c/public/gem5/+/47024
787204c92d876dd81357b75aede52d8ef5e053d3
# A commit which ran flynt on all Python files.
e73655d038cdfa68964109044e33c9a6e7d85ac9


@@ -1,3 +1,121 @@
# Version 23.0
This release has approximately 500 contributions from 50 unique contributors.
Below we highlight key gem5 features and improvements in this release.
## Significant API and user-facing changes
### Major renaming of CPU stats
The CPU stats have been renamed.
See <https://gem5.atlassian.net/browse/GEM5-1304> for details.
Now, each stage (fetch, execute, commit) has its own stat group.
Stats that are shared between the different CPU models (O3, Minor, Simple) now have exactly the same names.
**Important:** Some stat names were misleading before this change.
With this change, stats with the same names between different CPU models have the same meaning.
### `fs.py` and `se.py` deprecated
These scripts have not been well supported for many gem5 releases.
With gem5 23.0, we have officially deprecated these scripts.
They have been moved into the `deprecated` directory, **but they will be removed in a future release.**
As a replacement, we strongly suggest using the gem5 standard library.
See <https://www.gem5.org/documentation/gem5-stdlib/overview> for more information.
### Renaming of `DEBUG` guard into `GEM5_DEBUG`
Scons no longer defines the `DEBUG` guard in debug builds, so code making use of it should use `GEM5_DEBUG` instead.
### Other API changes
Also, this release:
- Removes deprecated namespaces. Namespace names were updated a couple of releases ago. This release removes the old names.
- Adds `MemberEventWrapper`, which should be used in favor of `EventWrapper` for instance member functions.
- Adds an extension mechanism to `Packet` and `Request`.
- Sets the x86 CPU vendor string to "HygonGenuine" to better support GLIBC.
## New features and improvements
### Large improvements to gem5 resources and gem5 resources website
We now have a new web portal for the gem5 resources: <https://resources.gem5.org>
This web portal will allow users to browse the resources available (e.g., disk images, kernels, workloads, binaries, simpoints, etc.) to use out-of-the-box with the gem5 standard library.
You can filter based on architecture, resource type, and compatible gem5 versions.
For each resource, there are examples of how to use the resource and pointers to examples using the resource in the gem5 codebase.
More information can be found on gem5's website: <https://www.gem5.org/documentation/general_docs/gem5_resources/>
We will be expanding gem5 resources with more workloads and resources over the course of the next release.
If you would like to contribute to gem5 resources by uploading your own workloads, disk images, etc., please create an issue on GitHub.
In addition to the new gem5 Resources web portal, the gem5 Resources API has been significantly updated and improved.
There are now much simpler functions for getting resources, such as `obtain_resource(<name>)`, which will download the resource by name and return a reference that can be used (e.g., as a binary in the `set_se_workload` function on the board).
As such, the generic `Resource` class has been deprecated and will be removed in a future release.
Resources are now specialized for their particular category.
For example, there is now a `BinaryResource` class, which is returned when a user specifies a binary resource via the `obtain_resource` function.
This allows for resource typing and for greater resource specialization.
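The typed-resource dispatch described here can be sketched as a minimal mock. The registry and resource names below are illustrative stand-ins, not gem5's actual implementation; only the dispatch pattern mirrors the stdlib API:

```python
# Minimal sketch of category-specific resource classes. The catalog is a
# hypothetical stand-in for the real gem5 resources database.

class AbstractResource:
    def __init__(self, name):
        self.name = name

class BinaryResource(AbstractResource):
    pass

class DiskImageResource(AbstractResource):
    pass

# Hypothetical catalog mapping resource names to their categories.
_CATALOG = {
    "x86-hello64-static": BinaryResource,
    "x86-ubuntu-18.04-img": DiskImageResource,
}

def obtain_resource(name):
    """Return a resource object specialized for its category."""
    try:
        return _CATALOG[name](name)
    except KeyError:
        raise KeyError(f"Unknown resource: {name}")

binary = obtain_resource("x86-hello64-static")
print(type(binary).__name__)  # BinaryResource
```

The benefit of returning a specialized class is that each category can carry its own helpers (e.g., a disk-image resource knowing its root partition) while `obtain_resource` stays a single entry point.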
### Arm ISA improvements
This release adds architectural support for the Armv9 [Scalable Matrix Extension](https://developer.arm.com/documentation/ddi0616/latest) (FEAT_SME).
The implementation employs a simple renaming scheme for the Za array register in the O3 CPU, so that writes to different tiles in the register are considered a dependency and are therefore serialized.
The following SVE and SIMD & FP extensions have also been implemented:
* FEAT_F64MM
* FEAT_F32MM
* FEAT_DOTPROD
* FEAT_I8MM
And more generally:
* FEAT_TLBIOS
* FEAT_FLAGM
* FEAT_FLAGM2
* FEAT_RNG
* FEAT_RNG_TRAP
* FEAT_EVT
### Support for DRAMSys
gem5 can now use DRAMSys <https://github.com/tukl-msd/DRAMSys> as a DRAM backend.
### RISC-V improvements
This release:
- Fully implements RISC-V scalar cryptography extensions.
- Fully implements RISC-V rv32.
- Implements PMP lock features.
- Adds general RISC-V improvements to provide better stability.
### Standard library improvements and new components
This release:
- Adds a MESI_Three_Level component.
- Supports ELFies and LoopPoint analysis output from Sniper.
- Supports DRAMSys in the stdlib.
## Bugfixes and other small improvements
This release also:
- Removes deprecated python libraries.
- Adds a DDR5 model.
- Adds AMD GPU MI200/gfx90a support.
- Changes building so it no longer "duplicates sources" in build/ which improves support for some IDEs and code analysis. If you still need to duplicate sources you can use the `--duplicate-sources` option to `scons`.
- Enables `--debug-activate=<object name>` to use debug trace for only a single SimObject (the opposite of `--debug-ignore`). See `--debug-help` for more information.
- Adds support to exit the simulation loop based on Arm-PMU events.
- Supports Python 3.11.
- Adds the idea of a CpuCluster to gem5.
# Version 22.1.0.0
This release has 500 contributions from 48 unique contributors and marks our second major release of 2022.


@@ -1,6 +1,6 @@
# -*- mode:python -*-
# Copyright (c) 2013, 2015-2020 ARM Limited
# Copyright (c) 2013, 2015-2020, 2023 ARM Limited
# All rights reserved.
#
# The license below extends only to copyright in the software and shall
@@ -145,6 +145,15 @@ AddOption('--gprof', action='store_true',
help='Enable support for the gprof profiler')
AddOption('--pprof', action='store_true',
help='Enable support for the pprof profiler')
# Default to --no-duplicate-sources, but keep --duplicate-sources to opt-out
# of this new build behaviour in case it introduces regressions. We could use
# action=argparse.BooleanOptionalAction here once Python 3.9 is required.
AddOption('--duplicate-sources', action='store_true', default=False,
dest='duplicate_sources',
help='Create symlinks to sources in the build directory')
AddOption('--no-duplicate-sources', action='store_false',
dest='duplicate_sources',
help='Do not create symlinks to sources in the build directory')
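The comment above mentions `argparse.BooleanOptionalAction` as a future replacement for the paired flags once Python 3.9 is required. A standalone sketch of the equivalence (plain `argparse` rather than SCons's `AddOption`):

```python
import argparse

# BooleanOptionalAction (Python 3.9+) generates both --duplicate-sources
# and --no-duplicate-sources from a single declaration, replacing the
# pair of AddOption calls above.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--duplicate-sources",
    action=argparse.BooleanOptionalAction,
    default=False,
    help="Create symlinks to sources in the build directory",
)

assert parser.parse_args([]).duplicate_sources is False
assert parser.parse_args(["--duplicate-sources"]).duplicate_sources is True
assert parser.parse_args(["--no-duplicate-sources"]).duplicate_sources is False
```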
# Inject the built_tools directory into the python path.
sys.path[1:1] = [ Dir('#build_tools').abspath ]
@@ -168,6 +177,10 @@ SetOption('warn', 'no-duplicate-environment')
Export('MakeAction')
# Patch re.compile to support inline flags anywhere within a RE
# string. Required to use PLY with Python 3.11+.
gem5_scons.patch_re_compile_for_inline_flags()
########################################################################
#
# Set up the main build environment.
@@ -264,6 +277,8 @@ main.Append(CPPPATH=[Dir('ext')])
# Add shared top-level headers
main.Prepend(CPPPATH=Dir('include'))
if not GetOption('duplicate_sources'):
main.Prepend(CPPPATH=Dir('src'))
########################################################################
@@ -290,6 +305,17 @@ main['CLANG'] = CXX_version and CXX_version.find('clang') >= 0
if main['GCC'] + main['CLANG'] > 1:
error('Two compilers enabled at once?')
# Find the gem5 binary target architecture (usually host architecture). The
# "Target: <target>" is consistent across gcc and clang at the time of
# writing this.
bin_target_arch = readCommand([main['CXX'], '--verbose'], exception=False)
main["BIN_TARGET_ARCH"] = (
"x86_64"
if bin_target_arch.find("Target: x86_64") != -1
else "aarch64"
if bin_target_arch.find("Target: aarch64") != -1
else "unknown"
)
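The detection logic above boils down to scanning the compiler's `--verbose` output for a `Target:` line. A self-contained sketch of that parse (the sample output fragments are illustrative):

```python
def parse_bin_target_arch(verbose_output: str) -> str:
    """Mirror the Target:-line scan above that picks the gem5
    binary target architecture from `$CXX --verbose` output."""
    if "Target: x86_64" in verbose_output:
        return "x86_64"
    if "Target: aarch64" in verbose_output:
        return "aarch64"
    return "unknown"

# Illustrative gcc/clang --verbose fragments.
assert parse_bin_target_arch("gcc ...\nTarget: x86_64-linux-gnu\n...") == "x86_64"
assert parse_bin_target_arch("Target: aarch64-unknown-linux-gnu") == "aarch64"
assert parse_bin_target_arch("Target: riscv64-linux-gnu") == "unknown"
```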
########################################################################
#
@@ -420,6 +446,8 @@ for variant_path in variant_paths:
conf.CheckLinkFlag('-Wl,--threads')
conf.CheckLinkFlag(
'-Wl,--thread-count=%d' % GetOption('num_jobs'))
else:
error('\n'.join((
"Don't know what compiler options to use for your compiler.",
@@ -439,10 +467,6 @@ for variant_path in variant_paths:
error('gcc version 7 or newer required.\n'
'Installed version:', env['CXXVERSION'])
with gem5_scons.Configure(env) as conf:
# This warning has a false positive in the systemc in g++ 11.1.
conf.CheckCxxFlag('-Wno-free-nonheap-object')
# Add the appropriate Link-Time Optimization (LTO) flags if
# `--with-lto` is set.
if GetOption('with_lto'):
@@ -464,6 +488,17 @@ for variant_path in variant_paths:
'-fno-builtin-malloc', '-fno-builtin-calloc',
'-fno-builtin-realloc', '-fno-builtin-free'])
if compareVersions(env['CXXVERSION'], "9") < 0:
# `libstdc++fs` must be explicitly linked for `std::filesystem`
# in GCC version 8. As of GCC version 9, this is not required.
#
# In GCC 7, explicit linkage of the `libstdc++fs` library is also
# required, but `std::filesystem` is under the `experimental`
# namespace (`std::experimental::filesystem`).
#
# Note: gem5 does not support GCC versions < 7.
env.Append(LIBS=['stdc++fs'])
elif env['CLANG']:
if compareVersions(env['CXXVERSION'], "6") < 0:
error('clang version 6 or newer required.\n'
@@ -481,6 +516,18 @@ for variant_path in variant_paths:
env.Append(TCMALLOC_CCFLAGS=['-fno-builtin'])
if compareVersions(env['CXXVERSION'], "11") < 0:
# `libstdc++fs` must be explicitly linked for `std::filesystem`
# in clang versions 6 through 10.
#
# In addition, for these versions, the
# `std::filesystem` is under the `experimental`
# namespace (`std::experimental::filesystem`).
#
# Note: gem5 does not support clang versions < 6.
env.Append(LIBS=['stdc++fs'])
# On Mac OS X/Darwin we need to also use libc++ (part of XCode) as
# opposed to libstdc++, as the latter is dated.
if sys.platform == "darwin":
@@ -511,7 +558,38 @@ for variant_path in variant_paths:
if env['GCC'] or env['CLANG']:
env.Append(CCFLAGS=['-fsanitize=%s' % sanitizers,
'-fno-omit-frame-pointer'],
LINKFLAGS='-fsanitize=%s' % sanitizers)
LINKFLAGS=['-fsanitize=%s' % sanitizers,
'-static-libasan'])
if main["BIN_TARGET_ARCH"] == "x86_64":
# Sanitizers can enlarge binary size dramatically, north of
# 2GB. This can prevent successful linkage due to symbol
# relocation outside of the 2GB region allocated by the small
# x86_64 code model that is enabled by default (32-bit relative
# offset limitation). Switching to the medium model in x86_64
# enables 64-bit relative offset for large objects (>64KB by
# default) while sticking to 32-bit relative addressing for
# code and smaller objects. Note this comes at a potential
# performance cost so it should not be enabled in all cases.
# This should still be a very happy medium for
# non-perf-critical sanitized builds.
env.Append(CCFLAGS='-mcmodel=medium')
env.Append(LINKFLAGS='-mcmodel=medium')
elif main["BIN_TARGET_ARCH"] == "aarch64":
# aarch64 default code model is small but with different
# constraints than for x86_64. With aarch64, the small code
# model enables 4GB distance between symbols. This is
# sufficient for the largest ALL/gem5.debug target with all
# sanitizers enabled at the time of writing this. Note that
# the next aarch64 code model is "large" which prevents dynamic
# linkage so it should be avoided when possible.
pass
else:
warning(
"Unknown code model options for your architecture. "
"Linkage might fail for larger binaries "
"(e.g., ALL/gem5.debug with sanitizers enabled)."
)
else:
warning("Don't know how to enable %s sanitizer(s) for your "
"compiler." % sanitizers)
@@ -563,9 +641,9 @@ for variant_path in variant_paths:
if not GetOption('without_tcmalloc'):
with gem5_scons.Configure(env) as conf:
if conf.CheckLib('tcmalloc'):
if conf.CheckLib('tcmalloc_minimal'):
conf.env.Append(CCFLAGS=conf.env['TCMALLOC_CCFLAGS'])
elif conf.CheckLib('tcmalloc_minimal'):
elif conf.CheckLib('tcmalloc'):
conf.env.Append(CCFLAGS=conf.env['TCMALLOC_CCFLAGS'])
else:
warning("You can get a 12% performance improvement by "
@@ -728,11 +806,13 @@ Build variables for {dir}:
build_dir = os.path.relpath(root, ext_dir)
SConscript(os.path.join(root, 'SConscript'),
variant_dir=os.path.join(variant_ext, build_dir),
exports=exports)
exports=exports,
duplicate=GetOption('duplicate_sources'))
# The src/SConscript file sets up the build rules in 'env' according
# to the configured variables. It returns a list of environments,
# one for each variant build (debug, opt, etc.)
SConscript('src/SConscript', variant_dir=variant_path, exports=exports)
SConscript('src/SConscript', variant_dir=variant_path, exports=exports,
duplicate=GetOption('duplicate_sources'))
atexit.register(summarize_warnings)


@@ -86,10 +86,10 @@ For instance, if you want to run only with `gem5.opt`, you can use
./main.py run --variant opt
```
Or, if you want to just run X86 tests with the `gem5.opt` binary:
Or, if you want to just run quick tests with the `gem5.opt` binary:
```shell
./main.py run --length quick --variant opt --isa X86
./main.py run --length quick --variant opt
```
@@ -102,6 +102,14 @@ To view all of the available tags, use
The output is split into tag *types* (e.g., isa, variant, length) and the
tags for each type are listed after the type name.
Note that tests were traditionally sorted by isa tag based on which
compilation each required. However, tests have now switched to the ALL
compilation, which includes every ISA, so they no longer need to be compiled
for each ISA individually. Using the isa tag for ISAs other than ALL has
therefore become a less useful way of searching for tests; it is better to
run subsets of tests based on their directories, as described above.
You can specify "or" between tags within the same type by using the tag flag
multiple times. For instance, to run everything that is tagged "opt" or "fast"
use
@@ -112,10 +120,10 @@ use
You can also specify "and" between different types of tags by specifying more
than one type on the command line. For instance, this will only run tests with
both the "X86" and "opt" tags.
both the "ALL" and "opt" tags.
```shell
./main.py run --isa X86 --variant opt
./main.py run --isa ALL --variant opt
```
## Running tests in batch


@@ -255,9 +255,7 @@ for param in sim_object._params.values():
code('} else if (name == "${{param.name}}") {')
code.indent()
code("${{param.name}}.clear();")
code(
"for (auto i = values.begin(); " "ret && i != values.end(); i ++)"
)
code("for (auto i = values.begin(); ret && i != values.end(); i ++)")
code("{")
code.indent()
code("${{param.ptype.cxx_type}} elem;")


@@ -82,7 +82,6 @@ code(
namespace gem5
{
GEM5_DEPRECATED_NAMESPACE(Debug, debug);
namespace debug
{


@@ -87,7 +87,7 @@ namespace gem5
)
if enum.wrapper_is_struct:
code("const char *${wrapper_name}::${name}Strings" "[Num_${name}] =")
code("const char *${wrapper_name}::${name}Strings[Num_${name}] =")
else:
if enum.is_class:
code(
@@ -97,8 +97,7 @@ const char *${name}Strings[static_cast<int>(${name}::Num_${name})] =
)
else:
code(
"""GEM5_DEPRECATED_NAMESPACE(Enums, enums);
namespace enums
"""namespace enums
{"""
)
code.indent(1)


@@ -48,7 +48,9 @@ interpreters, and so the exact same interpreter should be used both to run
this script, and to read in and execute the marshalled code later.
"""
import locale
import marshal
import os
import sys
import zlib
@@ -65,6 +67,11 @@ if len(sys.argv) < 4:
print(f"Usage: {sys.argv[0]} CPP PY MODPATH ABSPATH", file=sys.stderr)
sys.exit(1)
# Set Python's locale settings manually based on the `LC_CTYPE`
# environment variable.
if "LC_CTYPE" in os.environ:
locale.setlocale(locale.LC_CTYPE, os.environ["LC_CTYPE"])
_, cpp, python, modpath, abspath = sys.argv
with open(python, "r") as f:


@@ -60,15 +60,15 @@ def _get_hwp(hwp_option):
def _get_cache_opts(level, options):
opts = {}
size_attr = "{}_size".format(level)
size_attr = f"{level}_size"
if hasattr(options, size_attr):
opts["size"] = getattr(options, size_attr)
assoc_attr = "{}_assoc".format(level)
assoc_attr = f"{level}_assoc"
if hasattr(options, assoc_attr):
opts["assoc"] = getattr(options, assoc_attr)
prefetcher_attr = "{}_hwp_type".format(level)
prefetcher_attr = f"{level}_hwp_type"
if hasattr(options, prefetcher_attr):
opts["prefetcher"] = _get_hwp(getattr(options, prefetcher_attr))


@@ -51,7 +51,7 @@ from shutil import rmtree, copyfile
def hex_mask(terms):
dec_mask = reduce(operator.or_, [2**i for i in terms], 0)
return "%08x" % dec_mask
return f"{dec_mask:08x}"
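The before/after forms of `hex_mask` in this hunk are behaviorally identical; here is the converted function as a runnable whole:

```python
import operator
from functools import reduce

def hex_mask(terms):
    """OR together the bit positions in `terms` and format the result
    as 8 lowercase hex digits (the f-string form adopted above)."""
    dec_mask = reduce(operator.or_, [2**i for i in terms], 0)
    return f"{dec_mask:08x}"

# Bits 0, 1, and 4 set -> 0b10011 == 0x13.
assert hex_mask([0, 1, 4]) == "00000013"
assert hex_mask([]) == "00000000"
```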
def file_append(path, contents):
@@ -252,13 +252,13 @@ def _redirect_paths(options):
# Redirect filesystem syscalls from src to the first matching dests
redirect_paths = [
RedirectPath(
app_path="/proc", host_paths=["%s/fs/proc" % m5.options.outdir]
app_path="/proc", host_paths=[f"{m5.options.outdir}/fs/proc"]
),
RedirectPath(
app_path="/sys", host_paths=["%s/fs/sys" % m5.options.outdir]
app_path="/sys", host_paths=[f"{m5.options.outdir}/fs/sys"]
),
RedirectPath(
app_path="/tmp", host_paths=["%s/fs/tmp" % m5.options.outdir]
app_path="/tmp", host_paths=[f"{m5.options.outdir}/fs/tmp"]
),
]
@@ -275,7 +275,7 @@ def _redirect_paths(options):
if chroot:
redirect_paths.append(
RedirectPath(
app_path="/", host_paths=["%s" % os.path.expanduser(chroot)]
app_path="/", host_paths=[f"{os.path.expanduser(chroot)}"]
)
)


@@ -204,8 +204,8 @@ def config_tlb_hierarchy(
# add the different TLB levels to the system
# Modify here if you want to make the TLB hierarchy a child of
# the shader.
exec("system.%s = TLB_array" % system_TLB_name)
exec("system.%s = Coalescer_array" % system_Coalescer_name)
exec(f"system.{system_TLB_name} = TLB_array")
exec(f"system.{system_Coalescer_name} = Coalescer_array")
# ===========================================================
# Specify the TLB hierarchy (i.e., port connections)
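The `exec(f"system.{name} = ...")` pattern above assigns to a dynamically named attribute; the same effect is available without `exec` via `setattr`. This is a general Python note, not a change made in this commit, and the names below are stand-ins:

```python
# Hypothetical stand-ins for the real system and TLB objects.
class System:
    pass

system = System()
tlb_name = "l1_tlb"            # dynamically chosen attribute name
TLB_array = ["tlb0", "tlb1"]   # stand-in for the real TLB array

# Equivalent to: exec(f"system.{tlb_name} = TLB_array")
setattr(system, tlb_name, TLB_array)

assert system.l1_tlb == ["tlb0", "tlb1"]
```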


@@ -65,22 +65,18 @@ class ObjectList(object):
sub_cls = self._sub_classes[real_name]
return sub_cls
except KeyError:
print(
"{} is not a valid sub-class of {}.".format(
name, self.base_cls
)
)
print(f"{name} is not a valid sub-class of {self.base_cls}.")
raise
def print(self):
"""Print a list of available sub-classes and aliases."""
print("Available {} classes:".format(self.base_cls))
print(f"Available {self.base_cls} classes:")
doc_wrapper = TextWrapper(
initial_indent="\t\t", subsequent_indent="\t\t"
)
for name, cls in list(self._sub_classes.items()):
print("\t{}".format(name))
print(f"\t{name}")
# Try to extract the class documentation from the class help
# string.
@@ -92,7 +88,7 @@ class ObjectList(object):
if self._aliases:
print("\nAliases:")
for alias, target in list(self._aliases.items()):
print("\t{} => {}".format(alias, target))
print(f"\t{alias} => {target}")
def get_names(self):
"""Return a list of valid sub-class names and aliases."""


@@ -217,7 +217,7 @@ def addNoISAOptions(parser):
"--maxtime",
type=float,
default=None,
help="Run to the specified absolute simulated time in " "seconds",
help="Run to the specified absolute simulated time in seconds",
)
parser.add_argument(
"-P",
@@ -691,7 +691,7 @@ def addSEOptions(parser):
"-o",
"--options",
default="",
help="""The options to pass to the binary, use " "
help="""The options to pass to the binary, use
around the entire string""",
)
parser.add_argument(
@@ -834,8 +834,7 @@ def addFSOptions(parser):
action="store",
type=str,
dest="benchmark",
help="Specify the benchmark to run. Available benchmarks: %s"
% DefinedBenchmarks,
help=f"Specify the benchmark to run. Available benchmarks: {DefinedBenchmarks}",
)
# Metafile options


@@ -71,7 +71,7 @@ def setCPUClass(options):
TmpClass, test_mem_mode = getCPUClass(options.cpu_type)
CPUClass = None
if TmpClass.require_caches() and not options.caches and not options.ruby:
fatal("%s must be used with caches" % options.cpu_type)
fatal(f"{options.cpu_type} must be used with caches")
if options.checkpoint_restore != None:
if options.restore_with_cpu != options.cpu_type:
@@ -144,7 +144,7 @@ def findCptDir(options, cptdir, testsys):
fatal("Unable to find simpoint")
inst += int(testsys.cpu[0].workload[0].simpoint)
checkpoint_dir = joinpath(cptdir, "cpt.%s.%s" % (options.bench, inst))
checkpoint_dir = joinpath(cptdir, f"cpt.{options.bench}.{inst}")
if not exists(checkpoint_dir):
fatal("Unable to find checkpoint directory %s", checkpoint_dir)
@@ -204,7 +204,7 @@ def findCptDir(options, cptdir, testsys):
fatal("Checkpoint %d not found", cpt_num)
cpt_starttick = int(cpts[cpt_num - 1])
checkpoint_dir = joinpath(cptdir, "cpt.%s" % cpts[cpt_num - 1])
checkpoint_dir = joinpath(cptdir, f"cpt.{cpts[cpt_num - 1]}")
return cpt_starttick, checkpoint_dir
@@ -220,7 +220,7 @@ def scriptCheckpoints(options, maxtick, cptdir):
print("Creating checkpoint at inst:%d" % (checkpoint_inst))
exit_event = m5.simulate()
exit_cause = exit_event.getCause()
print("exit cause = %s" % exit_cause)
print(f"exit cause = {exit_cause}")
# skip checkpoint instructions should they exist
while exit_cause == "checkpoint":
@@ -549,10 +549,10 @@ def run(options, root, testsys, cpu_class):
if options.repeat_switch:
switch_class = getCPUClass(options.cpu_type)[0]
if switch_class.require_caches() and not options.caches:
print("%s: Must be used with caches" % str(switch_class))
print(f"{str(switch_class)}: Must be used with caches")
sys.exit(1)
if not switch_class.support_take_over():
print("%s: CPU switching not supported" % str(switch_class))
print(f"{str(switch_class)}: CPU switching not supported")
sys.exit(1)
repeat_switch_cpus = [
@@ -740,9 +740,9 @@ def run(options, root, testsys, cpu_class):
)
exit_event = m5.simulate()
else:
print("Switch at curTick count:%s" % str(10000))
print(f"Switch at curTick count:{str(10000)}")
exit_event = m5.simulate(10000)
print("Switched CPUS @ tick %s" % (m5.curTick()))
print(f"Switched CPUS @ tick {m5.curTick()}")
m5.switchCpus(testsys, switch_cpu_list)
@@ -757,7 +757,7 @@ def run(options, root, testsys, cpu_class):
exit_event = m5.simulate()
else:
exit_event = m5.simulate(options.standard_switch)
print("Switching CPUS @ tick %s" % (m5.curTick()))
print(f"Switching CPUS @ tick {m5.curTick()}")
print(
"Simulation ends instruction count:%d"
% (testsys.switch_cpus_1[0].max_insts_any_thread)


@@ -73,9 +73,7 @@ class PathSearchFunc(object):
return next(p for p in paths if os.path.exists(p))
except StopIteration:
raise IOError(
"Can't find file '{}' on {}.".format(
filepath, self.environment_variable
)
f"Can't find file '{filepath}' on {self.environment_variable}."
)
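The search-and-raise pattern in `PathSearchFunc` can be sketched standalone. The environment-variable name here is illustrative:

```python
import os
import tempfile

def first_existing(filepath, search_dirs, env_var="M5_PATH"):
    """Return the first candidate path that exists, mirroring the
    next()/StopIteration pattern above."""
    paths = [os.path.join(d, filepath) for d in search_dirs]
    try:
        return next(p for p in paths if os.path.exists(p))
    except StopIteration:
        raise IOError(f"Can't find file '{filepath}' on {env_var}.")

with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "kernel"), "w").close()
    # Skips the nonexistent directory and finds the file in d.
    assert first_existing("kernel", ["/nonexistent", d]) == os.path.join(d, "kernel")
```

Using `next()` over a generator avoids building the whole list of existing paths, and the `StopIteration` handler converts "no match" into a domain-specific error message.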


@@ -1420,6 +1420,7 @@ class HPI_FloatSimdFU(MinorFU):
"SimdMisc",
"SimdMult",
"SimdMultAcc",
"SimdMatMultAcc",
"SimdShift",
"SimdShiftAcc",
"SimdSqrt",
@@ -1431,6 +1432,7 @@ class HPI_FloatSimdFU(MinorFU):
"SimdFloatMisc",
"SimdFloatMult",
"SimdFloatMultAcc",
"SimdFloatMatMultAcc",
"SimdFloatSqrt",
]
)


@@ -53,6 +53,7 @@ class O3_ARM_v7a_FP(FUDesc):
OpDesc(opClass="SimdMisc", opLat=3),
OpDesc(opClass="SimdMult", opLat=5),
OpDesc(opClass="SimdMultAcc", opLat=5),
OpDesc(opClass="SimdMatMultAcc", opLat=5),
OpDesc(opClass="SimdShift", opLat=3),
OpDesc(opClass="SimdShiftAcc", opLat=3),
OpDesc(opClass="SimdSqrt", opLat=9),
@@ -64,6 +65,7 @@ class O3_ARM_v7a_FP(FUDesc):
OpDesc(opClass="SimdFloatMisc", opLat=3),
OpDesc(opClass="SimdFloatMult", opLat=3),
OpDesc(opClass="SimdFloatMultAcc", opLat=5),
OpDesc(opClass="SimdFloatMatMultAcc", opLat=5),
OpDesc(opClass="SimdFloatSqrt", opLat=9),
OpDesc(opClass="FloatAdd", opLat=5),
OpDesc(opClass="FloatCmp", opLat=5),


@@ -56,6 +56,7 @@ class ex5_LITTLE_FP(MinorDefaultFloatSimdFU):
OpDesc(opClass="SimdMisc", opLat=3),
OpDesc(opClass="SimdMult", opLat=4),
OpDesc(opClass="SimdMultAcc", opLat=5),
OpDesc(opClass="SimdMatMultAcc", opLat=5),
OpDesc(opClass="SimdShift", opLat=3),
OpDesc(opClass="SimdShiftAcc", opLat=3),
OpDesc(opClass="SimdSqrt", opLat=9),
@@ -67,6 +68,7 @@ class ex5_LITTLE_FP(MinorDefaultFloatSimdFU):
OpDesc(opClass="SimdFloatMisc", opLat=6),
OpDesc(opClass="SimdFloatMult", opLat=15),
OpDesc(opClass="SimdFloatMultAcc", opLat=6),
OpDesc(opClass="SimdFloatMatMultAcc", opLat=6),
OpDesc(opClass="SimdFloatSqrt", opLat=17),
OpDesc(opClass="FloatAdd", opLat=8),
OpDesc(opClass="FloatCmp", opLat=6),


@@ -58,6 +58,7 @@ class ex5_big_FP(FUDesc):
OpDesc(opClass="SimdMisc", opLat=3),
OpDesc(opClass="SimdMult", opLat=6),
OpDesc(opClass="SimdMultAcc", opLat=5),
OpDesc(opClass="SimdMatMultAcc", opLat=5),
OpDesc(opClass="SimdShift", opLat=3),
OpDesc(opClass="SimdShiftAcc", opLat=3),
OpDesc(opClass="SimdSqrt", opLat=9),
@@ -69,6 +70,7 @@ class ex5_big_FP(FUDesc):
OpDesc(opClass="SimdFloatMisc", opLat=3),
OpDesc(opClass="SimdFloatMult", opLat=6),
OpDesc(opClass="SimdFloatMultAcc", opLat=1),
OpDesc(opClass="SimdFloatMatMultAcc", opLat=1),
OpDesc(opClass="SimdFloatSqrt", opLat=9),
OpDesc(opClass="FloatAdd", opLat=6),
OpDesc(opClass="FloatCmp", opLat=5),


@@ -83,7 +83,7 @@ class Benchmark(object):
self.args = []
if not hasattr(self.__class__, "output"):
self.output = "%s.out" % self.name
self.output = f"{self.name}.out"
if not hasattr(self.__class__, "simpoint"):
self.simpoint = None
@@ -92,13 +92,12 @@ class Benchmark(object):
func = getattr(self.__class__, input_set)
except AttributeError:
raise AttributeError(
"The benchmark %s does not have the %s input set"
% (self.name, input_set)
f"The benchmark {self.name} does not have the {input_set} input set"
)
executable = joinpath(spec_dist, "binaries", isa, os, self.binary)
if not isfile(executable):
raise AttributeError("%s not found" % executable)
raise AttributeError(f"{executable} not found")
self.executable = executable
# root of tree for input & output data files
@@ -112,7 +111,7 @@ class Benchmark(object):
self.input_set = input_set
if not isdir(inputs_dir):
raise AttributeError("%s not found" % inputs_dir)
raise AttributeError(f"{inputs_dir} not found")
self.inputs_dir = [inputs_dir]
if isdir(all_dir):
@@ -121,12 +120,12 @@ class Benchmark(object):
self.outputs_dir = outputs_dir
if not hasattr(self.__class__, "stdin"):
self.stdin = joinpath(inputs_dir, "%s.in" % self.name)
self.stdin = joinpath(inputs_dir, f"{self.name}.in")
if not isfile(self.stdin):
self.stdin = None
if not hasattr(self.__class__, "stdout"):
self.stdout = joinpath(outputs_dir, "%s.out" % self.name)
self.stdout = joinpath(outputs_dir, f"{self.name}.out")
if not isfile(self.stdout):
self.stdout = None
@@ -387,9 +386,9 @@ class mesa(Benchmark):
"-frames",
frames,
"-meshfile",
"%s.in" % self.name,
f"{self.name}.in",
"-ppmfile",
"%s.ppm" % self.name,
f"{self.name}.ppm",
]
def test(self, isa, os):
@@ -876,34 +875,34 @@ class vortex(Benchmark):
elif isa == "sparc" or isa == "sparc32":
self.endian = "bendian"
else:
raise AttributeError("unknown ISA %s" % isa)
raise AttributeError(f"unknown ISA {isa}")
super(vortex, self).__init__(isa, os, input_set)
def test(self, isa, os):
self.args = ["%s.raw" % self.endian]
self.args = [f"{self.endian}.raw"]
self.output = "vortex.out"
def train(self, isa, os):
self.args = ["%s.raw" % self.endian]
self.args = [f"{self.endian}.raw"]
self.output = "vortex.out"
def smred(self, isa, os):
self.args = ["%s.raw" % self.endian]
self.args = [f"{self.endian}.raw"]
self.output = "vortex.out"
def mdred(self, isa, os):
self.args = ["%s.raw" % self.endian]
self.args = [f"{self.endian}.raw"]
self.output = "vortex.out"
def lgred(self, isa, os):
self.args = ["%s.raw" % self.endian]
self.args = [f"{self.endian}.raw"]
self.output = "vortex.out"
class vortex1(vortex):
def ref(self, isa, os):
self.args = ["%s1.raw" % self.endian]
self.args = [f"{self.endian}1.raw"]
self.output = "vortex1.out"
self.simpoint = 271 * 100e6
@@ -911,14 +910,14 @@ class vortex1(vortex):
class vortex2(vortex):
def ref(self, isa, os):
self.simpoint = 1024 * 100e6
self.args = ["%s2.raw" % self.endian]
self.args = [f"{self.endian}2.raw"]
self.output = "vortex2.out"
class vortex3(vortex):
def ref(self, isa, os):
self.simpoint = 564 * 100e6
self.args = ["%s3.raw" % self.endian]
self.args = [f"{self.endian}3.raw"]
self.output = "vortex3.out"
@@ -1031,8 +1030,8 @@ if __name__ == "__main__":
for bench in all:
for input_set in "ref", "test", "train":
print("class: %s" % bench.__name__)
print(f"class: {bench.__name__}")
x = bench("x86", "linux", input_set)
print("%s: %s" % (x, input_set))
print(f"{x}: {input_set}")
pprint(x.makeProcessArgs())
print()


@@ -0,0 +1,444 @@
# Copyright (c) 2010-2013, 2016, 2019-2020 ARM Limited
# Copyright (c) 2020 Barkhausen Institut
# All rights reserved.
#
# The license below extends only to copyright in the software and shall
# not be construed as granting a license to any other intellectual
# property including but not limited to intellectual property relating
# to a hardware implementation of the functionality of the software
# licensed hereunder. You may use the software subject to the license
# terms below provided that you ensure that this notice is replicated
# unmodified and in its entirety in all distributions of the software,
# modified or unmodified, in source code or in binary form.
#
# Copyright (c) 2012-2014 Mark D. Hill and David A. Wood
# Copyright (c) 2009-2011 Advanced Micro Devices, Inc.
# Copyright (c) 2006-2007 The Regents of The University of Michigan
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import argparse
import sys
import m5
from m5.defines import buildEnv
from m5.objects import *
from m5.util import addToPath, fatal, warn
from m5.util.fdthelper import *
from gem5.isas import ISA
from gem5.runtime import get_runtime_isa
addToPath("../../")
from ruby import Ruby
from common.FSConfig import *
from common.SysPaths import *
from common.Benchmarks import *
from common import Simulation
from common import CacheConfig
from common import CpuConfig
from common import MemConfig
from common import ObjectList
from common.Caches import *
from common import Options
def cmd_line_template():
if args.command_line and args.command_line_file:
print(
"Error: --command-line and --command-line-file are "
"mutually exclusive"
)
sys.exit(1)
if args.command_line:
return args.command_line
if args.command_line_file:
return open(args.command_line_file).read().strip()
return None
def build_test_system(np):
cmdline = cmd_line_template()
isa = get_runtime_isa()
if isa == ISA.MIPS:
test_sys = makeLinuxMipsSystem(test_mem_mode, bm[0], cmdline=cmdline)
elif isa == ISA.SPARC:
test_sys = makeSparcSystem(test_mem_mode, bm[0], cmdline=cmdline)
elif isa == ISA.RISCV:
test_sys = makeBareMetalRiscvSystem(
test_mem_mode, bm[0], cmdline=cmdline
)
elif isa == ISA.X86:
test_sys = makeLinuxX86System(
test_mem_mode, np, bm[0], args.ruby, cmdline=cmdline
)
elif isa == ISA.ARM:
test_sys = makeArmSystem(
test_mem_mode,
args.machine_type,
np,
bm[0],
args.dtb_filename,
bare_metal=args.bare_metal,
cmdline=cmdline,
external_memory=args.external_memory_system,
ruby=args.ruby,
vio_9p=args.vio_9p,
bootloader=args.bootloader,
)
if args.enable_context_switch_stats_dump:
test_sys.enable_context_switch_stats_dump = True
else:
fatal("Incapable of building %s full system!", isa.name)
# Set the cache line size for the entire system
test_sys.cache_line_size = args.cacheline_size
# Create a top-level voltage domain
test_sys.voltage_domain = VoltageDomain(voltage=args.sys_voltage)
# Create a source clock for the system and set the clock period
test_sys.clk_domain = SrcClockDomain(
clock=args.sys_clock, voltage_domain=test_sys.voltage_domain
)
# Create a CPU voltage domain
test_sys.cpu_voltage_domain = VoltageDomain()
# Create a source clock for the CPUs and set the clock period
test_sys.cpu_clk_domain = SrcClockDomain(
clock=args.cpu_clock, voltage_domain=test_sys.cpu_voltage_domain
)
if buildEnv["USE_RISCV_ISA"]:
test_sys.workload.bootloader = args.kernel
elif args.kernel is not None:
test_sys.workload.object_file = binary(args.kernel)
if args.script is not None:
test_sys.readfile = args.script
test_sys.init_param = args.init_param
# For now, assign all the CPUs to the same clock domain
test_sys.cpu = [
TestCPUClass(clk_domain=test_sys.cpu_clk_domain, cpu_id=i)
for i in range(np)
]
if args.ruby:
bootmem = getattr(test_sys, "_bootmem", None)
Ruby.create_system(
args, True, test_sys, test_sys.iobus, test_sys._dma_ports, bootmem
)
# Create a separate clock domain for Ruby
test_sys.ruby.clk_domain = SrcClockDomain(
clock=args.ruby_clock, voltage_domain=test_sys.voltage_domain
)
# Connect the ruby io port to the PIO bus,
# assuming that there is just one such port.
test_sys.iobus.mem_side_ports = test_sys.ruby._io_port.in_ports
for (i, cpu) in enumerate(test_sys.cpu):
#
# Tie the cpu ports to the correct ruby system ports
#
cpu.clk_domain = test_sys.cpu_clk_domain
cpu.createThreads()
cpu.createInterruptController()
test_sys.ruby._cpu_ports[i].connectCpuPorts(cpu)
else:
if args.caches or args.l2cache:
# By default the IOCache runs at the system clock
test_sys.iocache = IOCache(addr_ranges=test_sys.mem_ranges)
test_sys.iocache.cpu_side = test_sys.iobus.mem_side_ports
test_sys.iocache.mem_side = test_sys.membus.cpu_side_ports
elif not args.external_memory_system:
test_sys.iobridge = Bridge(
delay="50ns", ranges=test_sys.mem_ranges
)
test_sys.iobridge.cpu_side_port = test_sys.iobus.mem_side_ports
test_sys.iobridge.mem_side_port = test_sys.membus.cpu_side_ports
# Sanity check
if args.simpoint_profile:
if not ObjectList.is_noncaching_cpu(TestCPUClass):
fatal("SimPoint generation should be done with atomic cpu")
if np > 1:
fatal(
"SimPoint generation not supported with more than one CPUs"
)
for i in range(np):
if args.simpoint_profile:
test_sys.cpu[i].addSimPointProbe(args.simpoint_interval)
if args.checker:
test_sys.cpu[i].addCheckerCpu()
if not ObjectList.is_kvm_cpu(TestCPUClass):
if args.bp_type:
bpClass = ObjectList.bp_list.get(args.bp_type)
test_sys.cpu[i].branchPred = bpClass()
if args.indirect_bp_type:
IndirectBPClass = ObjectList.indirect_bp_list.get(
args.indirect_bp_type
)
test_sys.cpu[
i
].branchPred.indirectBranchPred = IndirectBPClass()
test_sys.cpu[i].createThreads()
# If elastic tracing is enabled when not restoring from checkpoint and
# when not fast forwarding using the atomic cpu, then check that the
# TestCPUClass is DerivO3CPU or inherits from DerivO3CPU. If the check
# passes then attach the elastic trace probe.
# If restoring from checkpoint or fast forwarding, the code that does this for
# FutureCPUClass is in the Simulation module. If the check passes then the
# elastic trace probe is attached to the switch CPUs.
if (
args.elastic_trace_en
and args.checkpoint_restore == None
and not args.fast_forward
):
CpuConfig.config_etrace(TestCPUClass, test_sys.cpu, args)
CacheConfig.config_cache(args, test_sys)
MemConfig.config_mem(args, test_sys)
if ObjectList.is_kvm_cpu(TestCPUClass) or ObjectList.is_kvm_cpu(
FutureClass
):
# Assign KVM CPUs to their own event queues / threads. This
# has to be done after creating caches and other child objects
# since these mustn't inherit the CPU event queue.
for i, cpu in enumerate(test_sys.cpu):
# Child objects usually inherit the parent's event
# queue. Override that and use the same event queue for
# all devices.
for obj in cpu.descendants():
obj.eventq_index = 0
cpu.eventq_index = i + 1
test_sys.kvm_vm = KvmVM()
return test_sys
def build_drive_system(np):
# driver system CPU is always simple, so is the memory
# Note this is an assignment of a class, not an instance.
DriveCPUClass = AtomicSimpleCPU
drive_mem_mode = "atomic"
DriveMemClass = SimpleMemory
cmdline = cmd_line_template()
if buildEnv["USE_MIPS_ISA"]:
drive_sys = makeLinuxMipsSystem(drive_mem_mode, bm[1], cmdline=cmdline)
elif buildEnv["USE_SPARC_ISA"]:
drive_sys = makeSparcSystem(drive_mem_mode, bm[1], cmdline=cmdline)
elif buildEnv["USE_X86_ISA"]:
drive_sys = makeLinuxX86System(
drive_mem_mode, np, bm[1], cmdline=cmdline
)
elif buildEnv["USE_ARM_ISA"]:
drive_sys = makeArmSystem(
drive_mem_mode,
args.machine_type,
np,
bm[1],
args.dtb_filename,
cmdline=cmdline,
)
# Create a top-level voltage domain
drive_sys.voltage_domain = VoltageDomain(voltage=args.sys_voltage)
# Create a source clock for the system and set the clock period
drive_sys.clk_domain = SrcClockDomain(
clock=args.sys_clock, voltage_domain=drive_sys.voltage_domain
)
# Create a CPU voltage domain
drive_sys.cpu_voltage_domain = VoltageDomain()
# Create a source clock for the CPUs and set the clock period
drive_sys.cpu_clk_domain = SrcClockDomain(
clock=args.cpu_clock, voltage_domain=drive_sys.cpu_voltage_domain
)
drive_sys.cpu = DriveCPUClass(
clk_domain=drive_sys.cpu_clk_domain, cpu_id=0
)
drive_sys.cpu.createThreads()
drive_sys.cpu.createInterruptController()
drive_sys.cpu.connectBus(drive_sys.membus)
if args.kernel is not None:
drive_sys.workload.object_file = binary(args.kernel)
if ObjectList.is_kvm_cpu(DriveCPUClass):
drive_sys.kvm_vm = KvmVM()
drive_sys.iobridge = Bridge(delay="50ns", ranges=drive_sys.mem_ranges)
drive_sys.iobridge.cpu_side_port = drive_sys.iobus.mem_side_ports
drive_sys.iobridge.mem_side_port = drive_sys.membus.cpu_side_ports
# Create the appropriate memory controllers and connect them to the
# memory bus
drive_sys.mem_ctrls = [
DriveMemClass(range=r) for r in drive_sys.mem_ranges
]
for i in range(len(drive_sys.mem_ctrls)):
drive_sys.mem_ctrls[i].port = drive_sys.membus.mem_side_ports
drive_sys.init_param = args.init_param
return drive_sys
warn(
"The fs.py script is deprecated. It will be removed in future releases of "
" gem5."
)
# Add args
parser = argparse.ArgumentParser()
Options.addCommonOptions(parser)
Options.addFSOptions(parser)
# Add the ruby specific and protocol specific args
if "--ruby" in sys.argv:
Ruby.define_options(parser)
args = parser.parse_args()
# system under test can be any CPU
(TestCPUClass, test_mem_mode, FutureClass) = Simulation.setCPUClass(args)
# Match the memories with the CPUs, based on the options for the test system
TestMemClass = Simulation.setMemClass(args)
if args.benchmark:
try:
bm = Benchmarks[args.benchmark]
except KeyError:
print(f"Error benchmark {args.benchmark} has not been defined.")
print(f"Valid benchmarks are: {DefinedBenchmarks}")
sys.exit(1)
else:
if args.dual:
bm = [
SysConfig(
disks=args.disk_image,
rootdev=args.root_device,
mem=args.mem_size,
os_type=args.os_type,
),
SysConfig(
disks=args.disk_image,
rootdev=args.root_device,
mem=args.mem_size,
os_type=args.os_type,
),
]
else:
bm = [
SysConfig(
disks=args.disk_image,
rootdev=args.root_device,
mem=args.mem_size,
os_type=args.os_type,
)
]
np = args.num_cpus
test_sys = build_test_system(np)
if len(bm) == 2:
drive_sys = build_drive_system(np)
root = makeDualRoot(True, test_sys, drive_sys, args.etherdump)
elif len(bm) == 1 and args.dist:
# This system is part of a dist-gem5 simulation
root = makeDistRoot(
test_sys,
args.dist_rank,
args.dist_size,
args.dist_server_name,
args.dist_server_port,
args.dist_sync_repeat,
args.dist_sync_start,
args.ethernet_linkspeed,
args.ethernet_linkdelay,
args.etherdump,
)
elif len(bm) == 1:
root = Root(full_system=True, system=test_sys)
else:
print("Error I don't know how to create more than 2 systems.")
sys.exit(1)
if ObjectList.is_kvm_cpu(TestCPUClass) or ObjectList.is_kvm_cpu(FutureClass):
# Required for running kvm on multiple host cores.
# Uses gem5's parallel event queue feature
# Note: The simulator is quite picky about this number!
root.sim_quantum = int(1e9) # 1 ms
if args.timesync:
root.time_sync_enable = True
if args.frame_capture:
VncServer.frame_capture = True
if buildEnv["USE_ARM_ISA"] and not args.bare_metal and not args.dtb_filename:
if args.machine_type not in [
"VExpress_GEM5",
"VExpress_GEM5_V1",
"VExpress_GEM5_V2",
"VExpress_GEM5_Foundation",
]:
warn(
"Can only correctly generate a dtb for VExpress_GEM5_* "
"platforms, unless custom hardware models have been equipped "
"with generation functionality."
)
# Generate a Device Tree
for sysname in ("system", "testsys", "drivesys"):
if hasattr(root, sysname):
sys = getattr(root, sysname)
sys.workload.dtb_filename = os.path.join(
m5.options.outdir, f"{sysname}.dtb"
)
sys.generateDtb(sys.workload.dtb_filename)
if args.wait_gdb:
test_sys.workload.wait_for_remote_gdb = True
Simulation.setWorkCountOptions(test_sys, args)
Simulation.run(args, root, test_sys, FutureClass)

View File

@@ -0,0 +1,292 @@
# Copyright (c) 2012-2013 ARM Limited
# All rights reserved.
#
# The license below extends only to copyright in the software and shall
# not be construed as granting a license to any other intellectual
# property including but not limited to intellectual property relating
# to a hardware implementation of the functionality of the software
# licensed hereunder. You may use the software subject to the license
# terms below provided that you ensure that this notice is replicated
# unmodified and in its entirety in all distributions of the software,
# modified or unmodified, in source code or in binary form.
#
# Copyright (c) 2006-2008 The Regents of The University of Michigan
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# Simple test script
#
# "m5 test.py"
import argparse
import sys
import os
import m5
from m5.defines import buildEnv
from m5.objects import *
from m5.params import NULL
from m5.util import addToPath, fatal, warn
from gem5.isas import ISA
from gem5.runtime import get_runtime_isa
addToPath("../../")
from ruby import Ruby
from common import Options
from common import Simulation
from common import CacheConfig
from common import CpuConfig
from common import ObjectList
from common import MemConfig
from common.FileSystemConfig import config_filesystem
from common.Caches import *
from common.cpu2000 import *
def get_processes(args):
"""Interprets provided args and returns a list of processes"""
multiprocesses = []
inputs = []
outputs = []
errouts = []
pargs = []
workloads = args.cmd.split(";")
if args.input != "":
inputs = args.input.split(";")
if args.output != "":
outputs = args.output.split(";")
if args.errout != "":
errouts = args.errout.split(";")
if args.options != "":
pargs = args.options.split(";")
idx = 0
for wrkld in workloads:
process = Process(pid=100 + idx)
process.executable = wrkld
process.cwd = os.getcwd()
process.gid = os.getgid()
if args.env:
with open(args.env, "r") as f:
process.env = [line.rstrip() for line in f]
if len(pargs) > idx:
process.cmd = [wrkld] + pargs[idx].split()
else:
process.cmd = [wrkld]
if len(inputs) > idx:
process.input = inputs[idx]
if len(outputs) > idx:
process.output = outputs[idx]
if len(errouts) > idx:
process.errout = errouts[idx]
multiprocesses.append(process)
idx += 1
if args.smt:
assert args.cpu_type == "DerivO3CPU"
return multiprocesses, idx
else:
return multiprocesses, 1
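The indexing in `get_processes()` can be summarized as: workload *i* gets input/output/option string *i* if one was supplied, and bare defaults otherwise. This is a simplified standalone sketch of that pairing (hypothetical helper name, plain lists instead of gem5 `Process` objects):

```python
# Sketch of how get_processes() pairs the ";"-separated --cmd and --options
# strings by index: workload i gets option string i if one exists,
# otherwise just the bare command.
def pair_workloads(cmd, options=""):
    workloads = cmd.split(";")
    pargs = options.split(";") if options else []
    cmds = []
    for idx, wrkld in enumerate(workloads):
        if len(pargs) > idx:
            # Option string idx belongs to workload idx.
            cmds.append([wrkld] + pargs[idx].split())
        else:
            # No options left for this workload.
            cmds.append([wrkld])
    return cmds

print(pair_workloads("ls;cat", "-l"))  # [['ls', '-l'], ['cat']]
```

Note the asymmetry this implies in the real script: with fewer `--options` entries than workloads, trailing workloads simply run with no arguments.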
warn(
"The se.py script is deprecated. It will be removed in future releases of "
" gem5."
)
parser = argparse.ArgumentParser()
Options.addCommonOptions(parser)
Options.addSEOptions(parser)
if "--ruby" in sys.argv:
Ruby.define_options(parser)
args = parser.parse_args()
multiprocesses = []
numThreads = 1
if args.bench:
apps = args.bench.split("-")
if len(apps) != args.num_cpus:
print("number of benchmarks not equal to set num_cpus!")
sys.exit(1)
for app in apps:
try:
if get_runtime_isa() == ISA.ARM:
exec(
"workload = %s('arm_%s', 'linux', '%s')"
% (app, args.arm_iset, args.spec_input)
)
else:
# TARGET_ISA has been removed, but this is missing a ], so it
# has incorrect syntax and wasn't being used anyway.
exec(
"workload = %s(buildEnv['TARGET_ISA', 'linux', '%s')"
% (app, args.spec_input)
)
multiprocesses.append(workload.makeProcess())
except:
print(
f"Unable to find workload for {get_runtime_isa().name()}: {app}",
file=sys.stderr,
)
sys.exit(1)
elif args.cmd:
multiprocesses, numThreads = get_processes(args)
else:
print("No workload specified. Exiting!\n", file=sys.stderr)
sys.exit(1)
(CPUClass, test_mem_mode, FutureClass) = Simulation.setCPUClass(args)
CPUClass.numThreads = numThreads
# Check -- do not allow SMT with multiple CPUs
if args.smt and args.num_cpus > 1:
fatal("You cannot use SMT with multiple CPUs!")
np = args.num_cpus
mp0_path = multiprocesses[0].executable
system = System(
cpu=[CPUClass(cpu_id=i) for i in range(np)],
mem_mode=test_mem_mode,
mem_ranges=[AddrRange(args.mem_size)],
cache_line_size=args.cacheline_size,
)
if numThreads > 1:
system.multi_thread = True
# Create a top-level voltage domain
system.voltage_domain = VoltageDomain(voltage=args.sys_voltage)
# Create a source clock for the system and set the clock period
system.clk_domain = SrcClockDomain(
clock=args.sys_clock, voltage_domain=system.voltage_domain
)
# Create a CPU voltage domain
system.cpu_voltage_domain = VoltageDomain()
# Create a separate clock domain for the CPUs
system.cpu_clk_domain = SrcClockDomain(
clock=args.cpu_clock, voltage_domain=system.cpu_voltage_domain
)
# If elastic tracing is enabled, then configure the cpu and attach the elastic
# trace probe
if args.elastic_trace_en:
CpuConfig.config_etrace(CPUClass, system.cpu, args)
# All cpus belong to a common cpu_clk_domain, therefore running at a common
# frequency.
for cpu in system.cpu:
cpu.clk_domain = system.cpu_clk_domain
if ObjectList.is_kvm_cpu(CPUClass) or ObjectList.is_kvm_cpu(FutureClass):
if buildEnv["USE_X86_ISA"]:
system.kvm_vm = KvmVM()
system.m5ops_base = 0xFFFF0000
for process in multiprocesses:
process.useArchPT = True
process.kvmInSE = True
else:
fatal("KvmCPU can only be used in SE mode with x86")
# Sanity check
if args.simpoint_profile:
if not ObjectList.is_noncaching_cpu(CPUClass):
fatal("SimPoint/BPProbe should be done with an atomic cpu")
if np > 1:
fatal("SimPoint generation not supported with more than one CPUs")
for i in range(np):
if args.smt:
system.cpu[i].workload = multiprocesses
elif len(multiprocesses) == 1:
system.cpu[i].workload = multiprocesses[0]
else:
system.cpu[i].workload = multiprocesses[i]
if args.simpoint_profile:
system.cpu[i].addSimPointProbe(args.simpoint_interval)
if args.checker:
system.cpu[i].addCheckerCpu()
if args.bp_type:
bpClass = ObjectList.bp_list.get(args.bp_type)
system.cpu[i].branchPred = bpClass()
if args.indirect_bp_type:
indirectBPClass = ObjectList.indirect_bp_list.get(
args.indirect_bp_type
)
system.cpu[i].branchPred.indirectBranchPred = indirectBPClass()
system.cpu[i].createThreads()
if args.ruby:
Ruby.create_system(args, False, system)
assert args.num_cpus == len(system.ruby._cpu_ports)
system.ruby.clk_domain = SrcClockDomain(
clock=args.ruby_clock, voltage_domain=system.voltage_domain
)
for i in range(np):
ruby_port = system.ruby._cpu_ports[i]
# Create the interrupt controller and connect its ports to Ruby
# Note that the interrupt controller is always present but only
# in x86 does it have message ports that need to be connected
system.cpu[i].createInterruptController()
# Connect the cpu's cache ports to Ruby
ruby_port.connectCpuPorts(system.cpu[i])
else:
MemClass = Simulation.setMemClass(args)
system.membus = SystemXBar()
system.system_port = system.membus.cpu_side_ports
CacheConfig.config_cache(args, system)
MemConfig.config_mem(args, system)
config_filesystem(system, args)
system.workload = SEWorkload.init_compatible(mp0_path)
if args.wait_gdb:
system.workload.wait_for_remote_gdb = True
root = Root(full_system=False, system=system)
Simulation.run(args, root, system, FutureClass)

View File

@@ -85,7 +85,7 @@ parser.add_argument(
"--cu-per-sqc",
type=int,
default=4,
help="number of CUs" "sharing an SQC (icache, and thus icache TLB)",
help="number of CUssharing an SQC (icache, and thus icache TLB)",
)
parser.add_argument(
"--cu-per-scalar-cache",
@@ -94,7 +94,7 @@ parser.add_argument(
help="Number of CUs sharing a scalar cache",
)
parser.add_argument(
"--simds-per-cu", type=int, default=4, help="SIMD units" "per CU"
"--simds-per-cu", type=int, default=4, help="SIMD unitsper CU"
)
parser.add_argument(
"--cu-per-sa",
@@ -140,13 +140,13 @@ parser.add_argument(
"--glbmem-wr-bus-width",
type=int,
default=32,
help="VGPR to Coalescer (Global Memory) data bus width " "in bytes",
help="VGPR to Coalescer (Global Memory) data bus width in bytes",
)
parser.add_argument(
"--glbmem-rd-bus-width",
type=int,
default=32,
help="Coalescer to VGPR (Global Memory) data bus width in " "bytes",
help="Coalescer to VGPR (Global Memory) data bus width in bytes",
)
# Currently we only support 1 local memory pipe
parser.add_argument(
@@ -166,7 +166,7 @@ parser.add_argument(
"--wfs-per-simd",
type=int,
default=10,
help="Number of " "WF slots per SIMD",
help="Number of WF slots per SIMD",
)
parser.add_argument(
@@ -276,12 +276,25 @@ parser.add_argument(
help="Latency for responses from ruby to the cu.",
)
parser.add_argument(
"--TLB-prefetch", type=int, help="prefetch depth for" "TLBs"
"--scalar-mem-req-latency",
type=int,
default=50,
help="Latency for scalar requests from the cu to ruby.",
)
parser.add_argument(
"--scalar-mem-resp-latency",
type=int,
# Set to 0 as the scalar cache response path does not model
# response latency yet and this parameter is currently not used
default=0,
help="Latency for scalar responses from ruby to the cu.",
)
parser.add_argument("--TLB-prefetch", type=int, help="prefetch depth for TLBs")
parser.add_argument(
"--pf-type",
type=str,
help="type of prefetch: " "PF_CU, PF_WF, PF_PHASE, PF_STRIDE",
help="type of prefetch: PF_CU, PF_WF, PF_PHASE, PF_STRIDE",
)
parser.add_argument("--pf-stride", type=int, help="set prefetch stride")
parser.add_argument(
@@ -354,7 +367,7 @@ parser.add_argument(
type=str,
default="gfx801",
choices=GfxVersion.vals,
help="Gfx version for gpu" "Note: gfx902 is not fully supported by ROCm",
help="Gfx version for gpuNote: gfx902 is not fully supported by ROCm",
)
Ruby.define_options(parser)
@@ -463,6 +476,8 @@ for i in range(n_cu):
vrf_lm_bus_latency=args.vrf_lm_bus_latency,
mem_req_latency=args.mem_req_latency,
mem_resp_latency=args.mem_resp_latency,
scalar_mem_req_latency=args.scalar_mem_req_latency,
scalar_mem_resp_latency=args.scalar_mem_resp_latency,
localDataStore=LdsState(
banks=args.numLdsBanks,
bankConflictPenalty=args.ldsBankConflictPenalty,
@@ -668,7 +683,7 @@ def find_path(base_list, rel_path, test):
full_path = os.path.join(base, rel_path)
if test(full_path):
return full_path
fatal("%s not found in %s" % (rel_path, base_list))
fatal(f"{rel_path} not found in {base_list}")
def find_file(base_list, rel_path):
@@ -702,7 +717,7 @@ else:
"/usr/lib/x86_64-linux-gnu",
]
),
"HOME=%s" % os.getenv("HOME", "/"),
f"HOME={os.getenv('HOME', '/')}",
# Disable the VM fault handler signal creation for dGPUs also
# forces the use of DefaultSignals instead of driver-controlled
# InteruptSignals throughout the runtime. DefaultSignals poll
@@ -907,14 +922,10 @@ else:
redirect_paths = [
RedirectPath(
app_path="/proc", host_paths=["%s/fs/proc" % m5.options.outdir]
),
RedirectPath(
app_path="/sys", host_paths=["%s/fs/sys" % m5.options.outdir]
),
RedirectPath(
app_path="/tmp", host_paths=["%s/fs/tmp" % m5.options.outdir]
app_path="/proc", host_paths=[f"{m5.options.outdir}/fs/proc"]
),
RedirectPath(app_path="/sys", host_paths=[f"{m5.options.outdir}/fs/sys"]),
RedirectPath(app_path="/tmp", host_paths=[f"{m5.options.outdir}/fs/tmp"]),
]
system.redirect_paths = redirect_paths
@@ -966,7 +977,7 @@ exit_event = m5.simulate(maxtick)
if args.fast_forward:
if exit_event.getCause() == "a thread reached the max instruction count":
m5.switchCpus(system, switch_cpu_list)
print("Switched CPUS @ tick %s" % (m5.curTick()))
print(f"Switched CPUS @ tick {m5.curTick()}")
m5.stats.reset()
exit_event = m5.simulate(maxtick - m5.curTick())
elif args.fast_forward_pseudo_op:
@@ -977,7 +988,7 @@ elif args.fast_forward_pseudo_op:
print("Dumping stats...")
m5.stats.dump()
m5.switchCpus(system, switch_cpu_list)
print("Switched CPUS @ tick %s" % (m5.curTick()))
print(f"Switched CPUS @ tick {m5.curTick()}")
m5.stats.reset()
# This lets us switch back and forth without keeping a counter
switch_cpu_list = [(x[1], x[0]) for x in switch_cpu_list]

View File

@@ -1,4 +1,4 @@
# Copyright (c) 2016-2017,2019-2021 ARM Limited
# Copyright (c) 2016-2017,2019-2023 Arm Limited
# All rights reserved.
#
# The license below extends only to copyright in the software and shall
@@ -44,6 +44,7 @@ import m5
from m5.util import addToPath
from m5.objects import *
from m5.options import *
from gem5.simulate.exit_event import ExitEvent
import argparse
m5.util.addToPath("../..")
@@ -52,6 +53,7 @@ from common import SysPaths
from common import MemConfig
from common import ObjectList
from common.cores.arm import HPI
from common.cores.arm import O3_ARM_v7a
import devices
import workloads
@@ -63,8 +65,26 @@ cpu_types = {
"atomic": (AtomicSimpleCPU, None, None, None),
"minor": (MinorCPU, devices.L1I, devices.L1D, devices.L2),
"hpi": (HPI.HPI, HPI.HPI_ICache, HPI.HPI_DCache, HPI.HPI_L2),
"o3": (
O3_ARM_v7a.O3_ARM_v7a_3,
O3_ARM_v7a.O3_ARM_v7a_ICache,
O3_ARM_v7a.O3_ARM_v7a_DCache,
O3_ARM_v7a.O3_ARM_v7aL2,
),
}
pmu_control_events = {
"enable": ExitEvent.PERF_COUNTER_ENABLE,
"disable": ExitEvent.PERF_COUNTER_DISABLE,
"reset": ExitEvent.PERF_COUNTER_RESET,
}
pmu_interrupt_events = {
"interrupt": ExitEvent.PERF_COUNTER_INTERRUPT,
}
pmu_stats_events = dict(**pmu_control_events, **pmu_interrupt_events)
def create_cow_image(name):
"""Helper function to create a Copy-on-Write disk image"""
@@ -77,7 +97,7 @@ def create(args):
"""Create and configure the system object."""
if args.readfile and not os.path.isfile(args.readfile):
print("Error: Bootscript %s does not exist" % args.readfile)
print(f"Error: Bootscript {args.readfile} does not exist")
sys.exit(1)
object_file = args.kernel if args.kernel else ""
@@ -122,8 +142,14 @@ def create(args):
# Add CPU clusters to the system
system.cpu_cluster = [
devices.CpuCluster(
system, args.num_cores, args.cpu_freq, "1.0V", *cpu_types[args.cpu]
devices.ArmCpuCluster(
system,
args.num_cores,
args.cpu_freq,
"1.0V",
*cpu_types[args.cpu],
tarmac_gen=args.tarmac_gen,
tarmac_dest=args.tarmac_dest,
)
]
@@ -136,34 +162,85 @@ def create(args):
system.auto_reset_addr = True
# Using GICv3
system.realview.gic.gicv4 = False
if hasattr(system.realview.gic, "gicv4"):
system.realview.gic.gicv4 = False
system.highest_el_is_64 = True
workload_class = workloads.workload_list.get(args.workload)
system.workload = workload_class(object_file, system)
if args.with_pmu:
enabled_pmu_events = set(
(*args.pmu_dump_stats_on, *args.pmu_reset_stats_on)
)
exit_sim_on_control = bool(
enabled_pmu_events & set(pmu_control_events.keys())
)
exit_sim_on_interrupt = bool(
enabled_pmu_events & set(pmu_interrupt_events.keys())
)
for cluster in system.cpu_cluster:
interrupt_numbers = [args.pmu_ppi_number] * len(cluster)
cluster.addPMUs(
interrupt_numbers,
exit_sim_on_control=exit_sim_on_control,
exit_sim_on_interrupt=exit_sim_on_interrupt,
)
if args.exit_on_uart_eot:
for uart in system.realview.uart:
uart.end_on_eot = True
return system
def run(args):
cptdir = m5.options.outdir
if args.checkpoint:
print("Checkpoint directory: %s" % cptdir)
print(f"Checkpoint directory: {cptdir}")
pmu_exit_msgs = tuple(evt.value for evt in pmu_stats_events.values())
pmu_stats_dump_msgs = tuple(
pmu_stats_events[evt].value for evt in set(args.pmu_dump_stats_on)
)
pmu_stats_reset_msgs = tuple(
pmu_stats_events[evt].value for evt in set(args.pmu_reset_stats_on)
)
while True:
event = m5.simulate()
exit_msg = event.getCause()
if exit_msg == "checkpoint":
print("Dropping checkpoint at tick %d" % m5.curTick())
if exit_msg == ExitEvent.CHECKPOINT.value:
print(f"Dropping checkpoint at tick {m5.curTick():d}")
cpt_dir = os.path.join(m5.options.outdir, "cpt.%d" % m5.curTick())
m5.checkpoint(os.path.join(cpt_dir))
print("Checkpoint done.")
elif exit_msg in pmu_exit_msgs:
if exit_msg in pmu_stats_dump_msgs:
print(
f"Dumping stats at tick {m5.curTick():d}, "
f"due to {exit_msg}"
)
m5.stats.dump()
if exit_msg in pmu_stats_reset_msgs:
print(
f"Resetting stats at tick {m5.curTick():d}, "
f"due to {exit_msg}"
)
m5.stats.reset()
else:
print(exit_msg, " @ ", m5.curTick())
print(f"{exit_msg} ({event.getCode()}) @ {m5.curTick()}")
break
sys.exit(event.getCode())
def arm_ppi_arg(int_num: int) -> int:
"""Argparse argument parser for valid Arm PPI numbers."""
# Extended PPIs (1056 <= int_num <= 1119) are not yet supported by gem5
int_num = int(int_num)
if 16 <= int_num <= 31:
return int_num
raise ValueError(f"{int_num} is not a valid Arm PPI number")
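The validator above is an ordinary function, so it can be exercised outside argparse. A standalone copy for illustration (the usage below is a sketch, not part of the script):

```python
# Standalone copy of the --pmu-ppi-number validator: Arm PPIs occupy
# interrupt IDs 16 through 31; anything else is rejected.
def arm_ppi_arg(int_num):
    int_num = int(int_num)
    if 16 <= int_num <= 31:
        return int_num
    raise ValueError(f"{int_num} is not a valid Arm PPI number")

print(arm_ppi_arg("23"))  # 23
# arm_ppi_arg("32") would raise ValueError, which argparse reports
# as an "invalid arm_ppi_arg value" usage error.
```

Because argparse treats any exception from a `type` callable as a parse failure, raising `ValueError` here is enough to get a clean command-line error message.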
def main():
@@ -230,6 +307,55 @@ def main():
)
parser.add_argument("--checkpoint", action="store_true")
parser.add_argument("--restore", type=str, default=None)
parser.add_argument(
"--tarmac-gen",
action="store_true",
help="Write a Tarmac trace.",
)
parser.add_argument(
"--tarmac-dest",
choices=TarmacDump.vals,
default="stdoutput",
help="Destination for the Tarmac trace output. [Default: stdoutput]",
)
parser.add_argument(
"--with-pmu",
action="store_true",
help="Add a PMU to each core in the cluster.",
)
parser.add_argument(
"--pmu-ppi-number",
type=arm_ppi_arg,
default=23,
help="The number of the PPI to use to connect each PMU to its core. "
"Must be an integer and a valid PPI number (16 <= int_num <= 31).",
)
parser.add_argument(
"--pmu-dump-stats-on",
type=str,
default=[],
action="append",
choices=pmu_stats_events.keys(),
help="Specify the PMU events on which to dump the gem5 stats. "
"This option may be specified multiple times to enable multiple "
"PMU events.",
)
parser.add_argument(
"--pmu-reset-stats-on",
type=str,
default=[],
action="append",
choices=pmu_stats_events.keys(),
help="Specify the PMU events on which to reset the gem5 stats. "
"This option may be specified multiple times to enable multiple "
"PMU events.",
)
parser.add_argument(
"--exit-on-uart-eot",
action="store_true",
help="Exit simulation if any of the UARTs receive an EOT. Many "
"workloads signal termination by sending an EOT character.",
)
parser.add_argument(
"--dtb-gen",
action="store_true",
@@ -242,25 +368,25 @@ def main():
"--semi-stdin",
type=str,
default="stdin",
help="Standard input for semihosting " "(default: gem5's stdin)",
help="Standard input for semihosting (default: gem5's stdin)",
)
parser.add_argument(
"--semi-stdout",
type=str,
default="stdout",
help="Standard output for semihosting " "(default: gem5's stdout)",
help="Standard output for semihosting (default: gem5's stdout)",
)
parser.add_argument(
"--semi-stderr",
type=str,
default="stderr",
help="Standard error for semihosting " "(default: gem5's stderr)",
help="Standard error for semihosting (default: gem5's stderr)",
)
parser.add_argument(
"--semi-path",
type=str,
default="",
help=("Search path for files to be loaded through " "Arm Semihosting"),
help=("Search path for files to be loaded through Arm Semihosting"),
)
parser.add_argument(
"args",

View File

@@ -1,4 +1,4 @@
# Copyright (c) 2016-2017, 2019, 2021 Arm Limited
# Copyright (c) 2016-2017, 2019, 2021-2023 Arm Limited
# All rights reserved.
#
# The license below extends only to copyright in the software and shall
@@ -95,7 +95,7 @@ class MemBus(SystemXBar):
default = Self.badaddr_responder.pio
class CpuCluster(SubSystem):
class ArmCpuCluster(CpuCluster):
def __init__(
self,
system,
@@ -106,8 +106,10 @@ class CpuCluster(SubSystem):
l1i_type,
l1d_type,
l2_type,
tarmac_gen=False,
tarmac_dest=None,
):
super(CpuCluster, self).__init__()
super().__init__()
self._cpu_type = cpu_type
self._l1i_type = l1i_type
self._l1d_type = l1d_type
@@ -120,24 +122,15 @@ class CpuCluster(SubSystem):
clock=cpu_clock, voltage_domain=self.voltage_domain
)
self.cpus = [
self._cpu_type(
cpu_id=system.numCpus() + idx, clk_domain=self.clk_domain
)
for idx in range(num_cpus)
]
self.generate_cpus(cpu_type, num_cpus)
for cpu in self.cpus:
cpu.createThreads()
cpu.createInterruptController()
cpu.socket_id = system.numCpuClusters()
system.addCpuCluster(self, num_cpus)
if tarmac_gen:
cpu.tracer = TarmacTracer()
if tarmac_dest is not None:
cpu.tracer.outfile = tarmac_dest
def requireCaches(self):
return self._cpu_type.require_caches()
def memoryMode(self):
return self._cpu_type.memory_mode()
system.addCpuCluster(self)
def addL1(self):
for cpu in self.cpus:
@@ -154,7 +147,13 @@ class CpuCluster(SubSystem):
cpu.connectCachedPorts(self.toL2Bus.cpu_side_ports)
self.toL2Bus.mem_side_ports = self.l2.cpu_side
def addPMUs(self, ints, events=[]):
def addPMUs(
self,
ints,
events=[],
exit_sim_on_control=False,
exit_sim_on_interrupt=False,
):
"""
Instantiates 1 ArmPMU per PE. The method is accepting a list of
interrupt numbers (ints) used by the PMU and a list of events to
@@ -166,12 +165,21 @@ class CpuCluster(SubSystem):
:type ints: List[int]
:param events: Additional events to be measured by the PMUs
:type events: List[Union[ProbeEvent, SoftwareIncrement]]
:param exit_sim_on_control: If true, exit the sim loop when the PMU is
enabled, disabled, or reset.
:type exit_sim_on_control: bool
:param exit_sim_on_interrupt: If true, exit the sim loop when the PMU
triggers an interrupt.
:type exit_sim_on_interrupt: bool
"""
assert len(ints) == len(self.cpus)
for cpu, pint in zip(self.cpus, ints):
int_cls = ArmPPI if pint < 32 else ArmSPI
for isa in cpu.isa:
isa.pmu = ArmPMU(interrupt=int_cls(num=pint))
isa.pmu.exitOnPMUControl = exit_sim_on_control
isa.pmu.exitOnPMUInterrupt = exit_sim_on_interrupt
isa.pmu.addArchEvents(
cpu=cpu,
itb=cpu.mmu.itb,
@@ -191,36 +199,63 @@ class CpuCluster(SubSystem):
cpu.connectCachedPorts(bus.cpu_side_ports)
class AtomicCluster(CpuCluster):
def __init__(self, system, num_cpus, cpu_clock, cpu_voltage="1.0V"):
cpu_config = [
ObjectList.cpu_list.get("AtomicSimpleCPU"),
None,
None,
None,
]
super(AtomicCluster, self).__init__(
system, num_cpus, cpu_clock, cpu_voltage, *cpu_config
class AtomicCluster(ArmCpuCluster):
def __init__(
self,
system,
num_cpus,
cpu_clock,
cpu_voltage="1.0V",
tarmac_gen=False,
tarmac_dest=None,
):
super().__init__(
system,
num_cpus,
cpu_clock,
cpu_voltage,
cpu_type=ObjectList.cpu_list.get("AtomicSimpleCPU"),
l1i_type=None,
l1d_type=None,
l2_type=None,
tarmac_gen=tarmac_gen,
tarmac_dest=tarmac_dest,
)
def addL1(self):
pass
class KvmCluster(CpuCluster):
def __init__(self, system, num_cpus, cpu_clock, cpu_voltage="1.0V"):
cpu_config = [ObjectList.cpu_list.get("ArmV8KvmCPU"), None, None, None]
super(KvmCluster, self).__init__(
system, num_cpus, cpu_clock, cpu_voltage, *cpu_config
class KvmCluster(ArmCpuCluster):
def __init__(
self,
system,
num_cpus,
cpu_clock,
cpu_voltage="1.0V",
tarmac_gen=False,
tarmac_dest=None,
):
super().__init__(
system,
num_cpus,
cpu_clock,
cpu_voltage,
cpu_type=ObjectList.cpu_list.get("ArmV8KvmCPU"),
l1i_type=None,
l1d_type=None,
l2_type=None,
tarmac_gen=tarmac_gen,
tarmac_dest=tarmac_dest,
)
def addL1(self):
pass
class FastmodelCluster(SubSystem):
class FastmodelCluster(CpuCluster):
def __init__(self, system, num_cpus, cpu_clock, cpu_voltage="1.0V"):
super(FastmodelCluster, self).__init__()
super().__init__()
# Setup GIC
gic = system.realview.gic
@@ -285,12 +320,12 @@ class FastmodelCluster(SubSystem):
self.cpu_hub.a2t = a2t
self.cpu_hub.t2g = t2g
system.addCpuCluster(self, num_cpus)
system.addCpuCluster(self)
def requireCaches(self):
def require_caches(self):
return False
def memoryMode(self):
def memory_mode(self):
return "atomic_noncaching"
def addL1(self):
@@ -330,7 +365,6 @@ class BaseSimpleSystem(ArmSystem):
self.mem_ranges = self.getMemRanges(int(Addr(mem_size)))
self._clusters = []
self._num_cpus = 0
def getMemRanges(self, mem_size):
"""
@@ -357,14 +391,8 @@ class BaseSimpleSystem(ArmSystem):
def numCpuClusters(self):
return len(self._clusters)
def addCpuCluster(self, cpu_cluster, num_cpus):
assert cpu_cluster not in self._clusters
assert num_cpus > 0
def addCpuCluster(self, cpu_cluster):
self._clusters.append(cpu_cluster)
self._num_cpus += num_cpus
def numCpus(self):
return self._num_cpus
def addCaches(self, need_caches, last_cache_level):
if not need_caches:


@@ -51,7 +51,7 @@ import sw
def addOptions(parser):
# Options for distributed simulation (i.e. dist-gem5)
parser.add_argument(
"--dist", action="store_true", help="Distributed gem5" " simulation."
"--dist", action="store_true", help="Distributed gem5 simulation."
)
parser.add_argument(
"--is-switch",
@@ -71,14 +71,14 @@ def addOptions(parser):
default=0,
action="store",
type=int,
help="Number of gem5 processes within the dist gem5" " run.",
help="Number of gem5 processes within the dist gem5 run.",
)
parser.add_argument(
"--dist-server-name",
default="127.0.0.1",
action="store",
type=str,
help="Name of the message server host\nDEFAULT:" " localhost",
help="Name of the message server host\nDEFAULT: localhost",
)
parser.add_argument(
"--dist-server-port",


@@ -79,7 +79,7 @@ def _using_pdes(root):
return False
class BigCluster(devices.CpuCluster):
class BigCluster(devices.ArmCpuCluster):
def __init__(self, system, num_cpus, cpu_clock, cpu_voltage="1.0V"):
cpu_config = [
ObjectList.cpu_list.get("O3_ARM_v7a_3"),
@@ -87,12 +87,10 @@ class BigCluster(devices.CpuCluster):
devices.L1D,
devices.L2,
]
super(BigCluster, self).__init__(
system, num_cpus, cpu_clock, cpu_voltage, *cpu_config
)
super().__init__(system, num_cpus, cpu_clock, cpu_voltage, *cpu_config)
class LittleCluster(devices.CpuCluster):
class LittleCluster(devices.ArmCpuCluster):
def __init__(self, system, num_cpus, cpu_clock, cpu_voltage="1.0V"):
cpu_config = [
ObjectList.cpu_list.get("MinorCPU"),
@@ -100,9 +98,7 @@ class LittleCluster(devices.CpuCluster):
devices.L1D,
devices.L2,
]
super(LittleCluster, self).__init__(
system, num_cpus, cpu_clock, cpu_voltage, *cpu_config
)
super().__init__(system, num_cpus, cpu_clock, cpu_voltage, *cpu_config)
class Ex5BigCluster(devices.CpuCluster):
@@ -113,9 +109,7 @@ class Ex5BigCluster(devices.CpuCluster):
ex5_big.L1D,
ex5_big.L2,
]
super(Ex5BigCluster, self).__init__(
system, num_cpus, cpu_clock, cpu_voltage, *cpu_config
)
super().__init__(system, num_cpus, cpu_clock, cpu_voltage, *cpu_config)
class Ex5LittleCluster(devices.CpuCluster):
@@ -126,9 +120,7 @@ class Ex5LittleCluster(devices.CpuCluster):
ex5_LITTLE.L1D,
ex5_LITTLE.L2,
]
super(Ex5LittleCluster, self).__init__(
system, num_cpus, cpu_clock, cpu_voltage, *cpu_config
)
super().__init__(system, num_cpus, cpu_clock, cpu_voltage, *cpu_config)
def createSystem(
@@ -339,10 +331,10 @@ def build(options):
"lpj=19988480",
"norandmaps",
"loglevel=8",
"mem=%s" % options.mem_size,
"root=%s" % options.root,
f"mem={options.mem_size}",
f"root={options.root}",
"rw",
"init=%s" % options.kernel_init,
f"init={options.kernel_init}",
"vmalloc=768MB",
]
@@ -376,7 +368,7 @@ def build(options):
system.bigCluster = big_model(
system, options.big_cpus, options.big_cpu_clock
)
system.mem_mode = system.bigCluster.memoryMode()
system.mem_mode = system.bigCluster.memory_mode()
all_cpus += system.bigCluster.cpus
# little cluster
@@ -384,23 +376,24 @@ def build(options):
system.littleCluster = little_model(
system, options.little_cpus, options.little_cpu_clock
)
system.mem_mode = system.littleCluster.memoryMode()
system.mem_mode = system.littleCluster.memory_mode()
all_cpus += system.littleCluster.cpus
# Figure out the memory mode
if (
options.big_cpus > 0
and options.little_cpus > 0
and system.bigCluster.memoryMode() != system.littleCluster.memoryMode()
and system.bigCluster.memory_mode()
!= system.littleCluster.memory_mode()
):
        m5.util.panic("Memory mode mismatch among CPU clusters")
# create caches
system.addCaches(options.caches, options.last_cache_level)
if not options.caches:
if options.big_cpus > 0 and system.bigCluster.requireCaches():
if options.big_cpus > 0 and system.bigCluster.require_caches():
m5.util.panic("Big CPU model requires caches")
if options.little_cpus > 0 and system.littleCluster.requireCaches():
if options.little_cpus > 0 and system.littleCluster.require_caches():
m5.util.panic("Little CPU model requires caches")
# Create a KVM VM and do KVM-specific configuration


@@ -79,7 +79,7 @@ class L2PowerOn(MathExprPowerModel):
# Example to report l2 Cache overallAccesses
# The estimated power is converted to Watt and will vary based
# on the size of the cache
self.dyn = "{}.overallAccesses * 0.000018000".format(l2_path)
self.dyn = f"{l2_path}.overallAccesses * 0.000018000"
self.st = "(voltage * 3)/10"


@@ -100,7 +100,7 @@ def create(args):
"""Create and configure the system object."""
if args.script and not os.path.isfile(args.script):
print("Error: Bootscript %s does not exist" % args.script)
print(f"Error: Bootscript {args.script} does not exist")
sys.exit(1)
cpu_class = cpu_types[args.cpu]
@@ -115,7 +115,7 @@ def create(args):
# Add CPU clusters to the system
system.cpu_cluster = [
devices.CpuCluster(
devices.ArmCpuCluster(
system,
args.num_cpus,
args.cpu_freq,
@@ -171,11 +171,11 @@ def create(args):
# memory layout.
"norandmaps",
# Tell Linux where to find the root disk image.
"root=%s" % args.root_device,
f"root={args.root_device}",
# Mount the root disk read-write by default.
"rw",
# Tell Linux about the amount of physical memory present.
"mem=%s" % args.mem_size,
f"mem={args.mem_size}",
]
system.workload.command_line = " ".join(kernel_cmd)
@@ -185,7 +185,7 @@ def create(args):
def run(args):
cptdir = m5.options.outdir
if args.checkpoint:
print("Checkpoint directory: %s" % cptdir)
print(f"Checkpoint directory: {cptdir}")
while True:
event = m5.simulate()
@@ -221,9 +221,7 @@ def main():
"--root-device",
type=str,
default=default_root_device,
help="OS device name for root partition (default: {})".format(
default_root_device
),
help=f"OS device name for root partition (default: {default_root_device})",
)
parser.add_argument(
"--script", type=str, default="", help="Linux bootscript"


@@ -88,7 +88,7 @@ def create(args):
"""Create and configure the system object."""
if args.script and not os.path.isfile(args.script):
print("Error: Bootscript %s does not exist" % args.script)
print(f"Error: Bootscript {args.script} does not exist")
sys.exit(1)
cpu_class = cpu_types[args.cpu][0]
@@ -128,8 +128,14 @@ def create(args):
# Add CPU clusters to the system
system.cpu_cluster = [
devices.CpuCluster(
system, args.num_cores, args.cpu_freq, "1.0V", *cpu_types[args.cpu]
devices.ArmCpuCluster(
system,
args.num_cores,
args.cpu_freq,
"1.0V",
*cpu_types[args.cpu],
tarmac_gen=args.tarmac_gen,
tarmac_dest=args.tarmac_dest,
)
]
@@ -163,21 +169,26 @@ def create(args):
# memory layout.
"norandmaps",
# Tell Linux where to find the root disk image.
"root=%s" % args.root_device,
f"root={args.root_device}",
# Mount the root disk read-write by default.
"rw",
# Tell Linux about the amount of physical memory present.
"mem=%s" % args.mem_size,
f"mem={args.mem_size}",
]
system.workload.command_line = " ".join(kernel_cmd)
if args.with_pmu:
for cluster in system.cpu_cluster:
interrupt_numbers = [args.pmu_ppi_number] * len(cluster)
cluster.addPMUs(interrupt_numbers)
return system
def run(args):
cptdir = m5.options.outdir
if args.checkpoint:
print("Checkpoint directory: %s" % cptdir)
print(f"Checkpoint directory: {cptdir}")
while True:
event = m5.simulate()
@@ -188,10 +199,17 @@ def run(args):
m5.checkpoint(os.path.join(cpt_dir))
print("Checkpoint done.")
else:
print(exit_msg, " @ ", m5.curTick())
print(f"{exit_msg} ({event.getCode()}) @ {m5.curTick()}")
break
sys.exit(event.getCode())
def arm_ppi_arg(int_num: int) -> int:
"""Argparse argument parser for valid Arm PPI numbers."""
# PPIs (1056 <= int_num <= 1119) are not yet supported by gem5
int_num = int(int_num)
if 16 <= int_num <= 31:
return int_num
raise ValueError(f"{int_num} is not a valid Arm PPI number")
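The validator above can be exercised on its own. A minimal sketch (the argparse wiring mirrors the script's `--pmu-ppi-number` option; the standalone parser here is illustrative, not the script's full parser):

```python
import argparse


def arm_ppi_arg(int_num: int) -> int:
    """Return int_num if it is a valid Arm PPI number (16-31), else raise."""
    int_num = int(int_num)
    if 16 <= int_num <= 31:
        return int_num
    raise ValueError(f"{int_num} is not a valid Arm PPI number")


parser = argparse.ArgumentParser()
# argparse calls the type callable on the raw string; a ValueError is
# reported to the user as a clean "invalid arm_ppi_arg value" error.
parser.add_argument("--pmu-ppi-number", type=arm_ppi_arg, default=23)

args = parser.parse_args(["--pmu-ppi-number", "27"])
print(args.pmu_ppi_number)  # parses the string "27" to the int 27
```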
def main():
@@ -219,9 +237,7 @@ def main():
"--root-device",
type=str,
default=default_root_device,
help="OS device name for root partition (default: {})".format(
default_root_device
),
help=f"OS device name for root partition (default: {default_root_device})",
)
parser.add_argument(
"--script", type=str, default="", help="Linux bootscript"
@@ -259,6 +275,29 @@ def main():
default="2GB",
help="Specify the physical memory size",
)
parser.add_argument(
"--tarmac-gen",
action="store_true",
help="Write a Tarmac trace.",
)
parser.add_argument(
"--tarmac-dest",
choices=TarmacDump.vals,
default="stdoutput",
help="Destination for the Tarmac trace output. [Default: stdoutput]",
)
parser.add_argument(
"--with-pmu",
action="store_true",
help="Add a PMU to each core in the cluster.",
)
parser.add_argument(
"--pmu-ppi-number",
type=arm_ppi_arg,
default=23,
help="The number of the PPI to use to connect each PMU to its core. "
"Must be an integer and a valid PPI number (16 <= int_num <= 31).",
)
parser.add_argument("--checkpoint", action="store_true")
parser.add_argument("--restore", type=str, default=None)


@@ -1,4 +1,4 @@
# Copyright (c) 2016-2017 ARM Limited
# Copyright (c) 2016-2017, 2022-2023 Arm Limited
# All rights reserved.
#
# The license below extends only to copyright in the software and shall
@@ -95,30 +95,36 @@ class SimpleSeSystem(System):
# Add CPUs to the system. A cluster of CPUs typically have
# private L1 caches and a shared L2 cache.
self.cpu_cluster = devices.CpuCluster(
self, args.num_cores, args.cpu_freq, "1.2V", *cpu_types[args.cpu]
self.cpu_cluster = devices.ArmCpuCluster(
self,
args.num_cores,
args.cpu_freq,
"1.2V",
*cpu_types[args.cpu],
tarmac_gen=args.tarmac_gen,
tarmac_dest=args.tarmac_dest,
)
# Create a cache hierarchy (unless we are simulating a
# functional CPU in atomic memory mode) for the CPU cluster
# and connect it to the shared memory bus.
if self.cpu_cluster.memoryMode() == "timing":
if self.cpu_cluster.memory_mode() == "timing":
self.cpu_cluster.addL1()
self.cpu_cluster.addL2(self.cpu_cluster.clk_domain)
self.cpu_cluster.connectMemSide(self.membus)
# Tell gem5 about the memory mode used by the CPUs we are
# simulating.
self.mem_mode = self.cpu_cluster.memoryMode()
self.mem_mode = self.cpu_cluster.memory_mode()
def numCpuClusters(self):
return len(self._clusters)
def addCpuCluster(self, cpu_cluster, num_cpus):
def addCpuCluster(self, cpu_cluster):
assert cpu_cluster not in self._clusters
assert num_cpus > 0
assert len(cpu_cluster) > 0
self._clusters.append(cpu_cluster)
self._num_cpus += num_cpus
self._num_cpus += len(cpu_cluster)
def numCpus(self):
return self._num_cpus
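The new registration API above drops the explicit `num_cpus` argument and instead relies on clusters reporting their own size via `len()`. A minimal, hypothetical sketch of the pattern (stand-in classes, not the actual gem5 `CpuCluster`/`System`):

```python
class FakeCluster:
    """Stand-in for a CPU cluster; gem5's clusters similarly support len()."""

    def __init__(self, num_cpus):
        self.cpus = list(range(num_cpus))

    def __len__(self):
        return len(self.cpus)


class FakeSystem:
    def __init__(self):
        self._clusters = []
        self._num_cpus = 0

    def addCpuCluster(self, cpu_cluster):
        # No separate num_cpus parameter: the cluster knows its own size.
        assert cpu_cluster not in self._clusters
        assert len(cpu_cluster) > 0
        self._clusters.append(cpu_cluster)
        self._num_cpus += len(cpu_cluster)

    def numCpus(self):
        return self._num_cpus


system = FakeSystem()
system.addCpuCluster(FakeCluster(4))
system.addCpuCluster(FakeCluster(2))
print(system.numCpus())  # 6
```

Keeping the count inside the cluster removes the chance of the caller passing a `num_cpus` that disagrees with the cluster's actual CPU list.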
@@ -215,6 +221,17 @@ def main():
default="2GB",
help="Specify the physical memory size",
)
parser.add_argument(
"--tarmac-gen",
action="store_true",
help="Write a Tarmac trace.",
)
parser.add_argument(
"--tarmac-dest",
choices=TarmacDump.vals,
default="stdoutput",
help="Destination for the Tarmac trace output. [Default: stdoutput]",
)
args = parser.parse_args()
@@ -240,8 +257,7 @@ def main():
# Print the reason for the simulation exit. Some exit codes are
# requests for service (e.g., checkpoints) from the simulation
# script. We'll just ignore them here and exit.
print(event.getCause(), " @ ", m5.curTick())
sys.exit(event.getCode())
print(f"{event.getCause()} ({event.getCode()}) @ {m5.curTick()}")
if __name__ == "__m5_main__":

configs/example/dramsys.py (new executable file)

@@ -0,0 +1,63 @@
# Copyright (c) 2022 Fraunhofer IESE
# All rights reserved
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import m5
from m5.objects import *
traffic_gen = PyTrafficGen()
system = System()
vd = VoltageDomain(voltage="1V")
system.mem_mode = "timing"
system.cpu = traffic_gen
dramsys = DRAMSys(
configuration="ext/dramsys/DRAMSys/DRAMSys/"
"library/resources/simulations/ddr4-example.json",
resource_directory="ext/dramsys/DRAMSys/DRAMSys/library/resources",
)
system.target = dramsys
system.transactor = Gem5ToTlmBridge32()
system.clk_domain = SrcClockDomain(clock="1.5GHz", voltage_domain=vd)
# Connect everything:
system.transactor.gem5 = system.cpu.port
system.transactor.tlm = system.target.tlm
kernel = SystemC_Kernel(system=system)
root = Root(full_system=False, systemc_kernel=kernel)
m5.instantiate()
idle = traffic_gen.createIdle(100000)
linear = traffic_gen.createLinear(10000000, 0, 16777216, 64, 500, 1500, 65, 0)
random = traffic_gen.createRandom(10000000, 0, 16777216, 64, 500, 1500, 65, 0)
traffic_gen.start([linear, idle, random])
cause = m5.simulate(20000000).getCause()
print(cause)


@@ -1,19 +1,4 @@
# Copyright (c) 2010-2013, 2016, 2019-2020 ARM Limited
# Copyright (c) 2020 Barkhausen Institut
# All rights reserved.
#
# The license below extends only to copyright in the software and shall
# not be construed as granting a license to any other intellectual
# property including but not limited to intellectual property relating
# to a hardware implementation of the functionality of the software
# licensed hereunder. You may use the software subject to the license
# terms below provided that you ensure that this notice is replicated
# unmodified and in its entirety in all distributions of the software,
# modified or unmodified, in source code or in binary form.
#
# Copyright (c) 2012-2014 Mark D. Hill and David A. Wood
# Copyright (c) 2009-2011 Advanced Micro Devices, Inc.
# Copyright (c) 2006-2007 The Regents of The University of Michigan
# Copyright (c) 2023 The Regents of the University of California
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
@@ -39,401 +24,10 @@
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import argparse
import sys
from m5.util import fatal
import m5
from m5.defines import buildEnv
from m5.objects import *
from m5.util import addToPath, fatal, warn
from m5.util.fdthelper import *
from gem5.isas import ISA
from gem5.runtime import get_runtime_isa
addToPath("../")
from ruby import Ruby
from common.FSConfig import *
from common.SysPaths import *
from common.Benchmarks import *
from common import Simulation
from common import CacheConfig
from common import CpuConfig
from common import MemConfig
from common import ObjectList
from common.Caches import *
from common import Options
def cmd_line_template():
if args.command_line and args.command_line_file:
print(
"Error: --command-line and --command-line-file are "
"mutually exclusive"
)
sys.exit(1)
if args.command_line:
return args.command_line
if args.command_line_file:
return open(args.command_line_file).read().strip()
return None
def build_test_system(np):
cmdline = cmd_line_template()
isa = get_runtime_isa()
if isa == ISA.MIPS:
test_sys = makeLinuxMipsSystem(test_mem_mode, bm[0], cmdline=cmdline)
elif isa == ISA.SPARC:
test_sys = makeSparcSystem(test_mem_mode, bm[0], cmdline=cmdline)
elif isa == ISA.RISCV:
test_sys = makeBareMetalRiscvSystem(
test_mem_mode, bm[0], cmdline=cmdline
)
elif isa == ISA.X86:
test_sys = makeLinuxX86System(
test_mem_mode, np, bm[0], args.ruby, cmdline=cmdline
)
elif isa == ISA.ARM:
test_sys = makeArmSystem(
test_mem_mode,
args.machine_type,
np,
bm[0],
args.dtb_filename,
bare_metal=args.bare_metal,
cmdline=cmdline,
external_memory=args.external_memory_system,
ruby=args.ruby,
vio_9p=args.vio_9p,
bootloader=args.bootloader,
)
if args.enable_context_switch_stats_dump:
test_sys.enable_context_switch_stats_dump = True
else:
fatal("Incapable of building %s full system!", isa.name)
# Set the cache line size for the entire system
test_sys.cache_line_size = args.cacheline_size
# Create a top-level voltage domain
test_sys.voltage_domain = VoltageDomain(voltage=args.sys_voltage)
# Create a source clock for the system and set the clock period
test_sys.clk_domain = SrcClockDomain(
clock=args.sys_clock, voltage_domain=test_sys.voltage_domain
)
# Create a CPU voltage domain
test_sys.cpu_voltage_domain = VoltageDomain()
# Create a source clock for the CPUs and set the clock period
test_sys.cpu_clk_domain = SrcClockDomain(
clock=args.cpu_clock, voltage_domain=test_sys.cpu_voltage_domain
)
if buildEnv["USE_RISCV_ISA"]:
test_sys.workload.bootloader = args.kernel
elif args.kernel is not None:
test_sys.workload.object_file = binary(args.kernel)
if args.script is not None:
test_sys.readfile = args.script
test_sys.init_param = args.init_param
# For now, assign all the CPUs to the same clock domain
test_sys.cpu = [
TestCPUClass(clk_domain=test_sys.cpu_clk_domain, cpu_id=i)
for i in range(np)
]
if args.ruby:
bootmem = getattr(test_sys, "_bootmem", None)
Ruby.create_system(
args, True, test_sys, test_sys.iobus, test_sys._dma_ports, bootmem
)
        # Create a separate clock domain for Ruby
test_sys.ruby.clk_domain = SrcClockDomain(
clock=args.ruby_clock, voltage_domain=test_sys.voltage_domain
)
# Connect the ruby io port to the PIO bus,
# assuming that there is just one such port.
test_sys.iobus.mem_side_ports = test_sys.ruby._io_port.in_ports
for (i, cpu) in enumerate(test_sys.cpu):
#
# Tie the cpu ports to the correct ruby system ports
#
cpu.clk_domain = test_sys.cpu_clk_domain
cpu.createThreads()
cpu.createInterruptController()
test_sys.ruby._cpu_ports[i].connectCpuPorts(cpu)
else:
if args.caches or args.l2cache:
# By default the IOCache runs at the system clock
test_sys.iocache = IOCache(addr_ranges=test_sys.mem_ranges)
test_sys.iocache.cpu_side = test_sys.iobus.mem_side_ports
test_sys.iocache.mem_side = test_sys.membus.cpu_side_ports
elif not args.external_memory_system:
test_sys.iobridge = Bridge(
delay="50ns", ranges=test_sys.mem_ranges
)
test_sys.iobridge.cpu_side_port = test_sys.iobus.mem_side_ports
test_sys.iobridge.mem_side_port = test_sys.membus.cpu_side_ports
# Sanity check
if args.simpoint_profile:
if not ObjectList.is_noncaching_cpu(TestCPUClass):
fatal("SimPoint generation should be done with atomic cpu")
if np > 1:
fatal(
                "SimPoint generation is not supported with more than one CPU"
)
for i in range(np):
if args.simpoint_profile:
test_sys.cpu[i].addSimPointProbe(args.simpoint_interval)
if args.checker:
test_sys.cpu[i].addCheckerCpu()
if not ObjectList.is_kvm_cpu(TestCPUClass):
if args.bp_type:
bpClass = ObjectList.bp_list.get(args.bp_type)
test_sys.cpu[i].branchPred = bpClass()
if args.indirect_bp_type:
IndirectBPClass = ObjectList.indirect_bp_list.get(
args.indirect_bp_type
)
test_sys.cpu[
i
].branchPred.indirectBranchPred = IndirectBPClass()
test_sys.cpu[i].createThreads()
# If elastic tracing is enabled when not restoring from checkpoint and
# when not fast forwarding using the atomic cpu, then check that the
# TestCPUClass is DerivO3CPU or inherits from DerivO3CPU. If the check
# passes then attach the elastic trace probe.
# If restoring from checkpoint or fast forwarding, the code that does this for
# FutureCPUClass is in the Simulation module. If the check passes then the
# elastic trace probe is attached to the switch CPUs.
if (
args.elastic_trace_en
and args.checkpoint_restore == None
and not args.fast_forward
):
CpuConfig.config_etrace(TestCPUClass, test_sys.cpu, args)
CacheConfig.config_cache(args, test_sys)
MemConfig.config_mem(args, test_sys)
if ObjectList.is_kvm_cpu(TestCPUClass) or ObjectList.is_kvm_cpu(
FutureClass
):
# Assign KVM CPUs to their own event queues / threads. This
# has to be done after creating caches and other child objects
# since these mustn't inherit the CPU event queue.
for i, cpu in enumerate(test_sys.cpu):
# Child objects usually inherit the parent's event
# queue. Override that and use the same event queue for
# all devices.
for obj in cpu.descendants():
obj.eventq_index = 0
cpu.eventq_index = i + 1
test_sys.kvm_vm = KvmVM()
return test_sys
def build_drive_system(np):
# driver system CPU is always simple, so is the memory
# Note this is an assignment of a class, not an instance.
DriveCPUClass = AtomicSimpleCPU
drive_mem_mode = "atomic"
DriveMemClass = SimpleMemory
cmdline = cmd_line_template()
if buildEnv["USE_MIPS_ISA"]:
drive_sys = makeLinuxMipsSystem(drive_mem_mode, bm[1], cmdline=cmdline)
elif buildEnv["USE_SPARC_ISA"]:
drive_sys = makeSparcSystem(drive_mem_mode, bm[1], cmdline=cmdline)
elif buildEnv["USE_X86_ISA"]:
drive_sys = makeLinuxX86System(
drive_mem_mode, np, bm[1], cmdline=cmdline
)
elif buildEnv["USE_ARM_ISA"]:
drive_sys = makeArmSystem(
drive_mem_mode,
args.machine_type,
np,
bm[1],
args.dtb_filename,
cmdline=cmdline,
)
# Create a top-level voltage domain
drive_sys.voltage_domain = VoltageDomain(voltage=args.sys_voltage)
# Create a source clock for the system and set the clock period
drive_sys.clk_domain = SrcClockDomain(
clock=args.sys_clock, voltage_domain=drive_sys.voltage_domain
)
# Create a CPU voltage domain
drive_sys.cpu_voltage_domain = VoltageDomain()
# Create a source clock for the CPUs and set the clock period
drive_sys.cpu_clk_domain = SrcClockDomain(
clock=args.cpu_clock, voltage_domain=drive_sys.cpu_voltage_domain
)
drive_sys.cpu = DriveCPUClass(
clk_domain=drive_sys.cpu_clk_domain, cpu_id=0
)
drive_sys.cpu.createThreads()
drive_sys.cpu.createInterruptController()
drive_sys.cpu.connectBus(drive_sys.membus)
if args.kernel is not None:
drive_sys.workload.object_file = binary(args.kernel)
if ObjectList.is_kvm_cpu(DriveCPUClass):
drive_sys.kvm_vm = KvmVM()
drive_sys.iobridge = Bridge(delay="50ns", ranges=drive_sys.mem_ranges)
drive_sys.iobridge.cpu_side_port = drive_sys.iobus.mem_side_ports
drive_sys.iobridge.mem_side_port = drive_sys.membus.cpu_side_ports
# Create the appropriate memory controllers and connect them to the
# memory bus
drive_sys.mem_ctrls = [
DriveMemClass(range=r) for r in drive_sys.mem_ranges
]
for i in range(len(drive_sys.mem_ctrls)):
drive_sys.mem_ctrls[i].port = drive_sys.membus.mem_side_ports
drive_sys.init_param = args.init_param
return drive_sys
# Add args
parser = argparse.ArgumentParser()
Options.addCommonOptions(parser)
Options.addFSOptions(parser)
# Add the ruby specific and protocol specific args
if "--ruby" in sys.argv:
Ruby.define_options(parser)
args = parser.parse_args()
# system under test can be any CPU
(TestCPUClass, test_mem_mode, FutureClass) = Simulation.setCPUClass(args)
# Match the memories with the CPUs, based on the options for the test system
TestMemClass = Simulation.setMemClass(args)
if args.benchmark:
try:
bm = Benchmarks[args.benchmark]
except KeyError:
        print("Error: benchmark %s has not been defined." % args.benchmark)
print("Valid benchmarks are: %s" % DefinedBenchmarks)
sys.exit(1)
else:
if args.dual:
bm = [
SysConfig(
disks=args.disk_image,
rootdev=args.root_device,
mem=args.mem_size,
os_type=args.os_type,
),
SysConfig(
disks=args.disk_image,
rootdev=args.root_device,
mem=args.mem_size,
os_type=args.os_type,
),
]
else:
bm = [
SysConfig(
disks=args.disk_image,
rootdev=args.root_device,
mem=args.mem_size,
os_type=args.os_type,
)
]
np = args.num_cpus
test_sys = build_test_system(np)
if len(bm) == 2:
drive_sys = build_drive_system(np)
root = makeDualRoot(True, test_sys, drive_sys, args.etherdump)
elif len(bm) == 1 and args.dist:
# This system is part of a dist-gem5 simulation
root = makeDistRoot(
test_sys,
args.dist_rank,
args.dist_size,
args.dist_server_name,
args.dist_server_port,
args.dist_sync_repeat,
args.dist_sync_start,
args.ethernet_linkspeed,
args.ethernet_linkdelay,
args.etherdump,
)
elif len(bm) == 1:
root = Root(full_system=True, system=test_sys)
else:
print("Error I don't know how to create more than 2 systems.")
sys.exit(1)
if ObjectList.is_kvm_cpu(TestCPUClass) or ObjectList.is_kvm_cpu(FutureClass):
# Required for running kvm on multiple host cores.
# Uses gem5's parallel event queue feature
# Note: The simulator is quite picky about this number!
root.sim_quantum = int(1e9) # 1 ms
if args.timesync:
root.time_sync_enable = True
if args.frame_capture:
VncServer.frame_capture = True
if buildEnv["USE_ARM_ISA"] and not args.bare_metal and not args.dtb_filename:
if args.machine_type not in [
"VExpress_GEM5",
"VExpress_GEM5_V1",
"VExpress_GEM5_V2",
"VExpress_GEM5_Foundation",
]:
warn(
"Can only correctly generate a dtb for VExpress_GEM5_* "
"platforms, unless custom hardware models have been equipped "
"with generation functionality."
)
# Generate a Device Tree
for sysname in ("system", "testsys", "drivesys"):
if hasattr(root, sysname):
sys = getattr(root, sysname)
sys.workload.dtb_filename = os.path.join(
m5.options.outdir, "%s.dtb" % sysname
)
sys.generateDtb(sys.workload.dtb_filename)
if args.wait_gdb:
test_sys.workload.wait_for_remote_gdb = True
Simulation.setWorkCountOptions(test_sys, args)
Simulation.run(args, root, test_sys, FutureClass)
fatal(
"The 'configs/example/fs.py' script has been deprecated. It can be "
"found in 'configs/deprecated/example' if required. Its usage should be "
"avoided as it will be removed in future releases of gem5."
)


@@ -90,7 +90,7 @@ board = SimpleBoard(
board.set_se_binary_workload(
# the workload should be the same as the save-checkpoint script
Resource("riscv-hello"),
checkpoint=Resource("riscv-hello-example-checkpoint-v22-1"),
checkpoint=Resource("riscv-hello-example-checkpoint-v23"),
)
simulator = Simulator(


@@ -58,6 +58,7 @@ from gem5.components.processors.simple_processor import SimpleProcessor
from gem5.components.processors.cpu_types import CPUTypes
from gem5.isas import ISA
from gem5.resources.workload import Workload
from gem5.resources.resource import obtain_resource, SimpointResource
from pathlib import Path
from gem5.components.cachehierarchies.classic.no_cache import NoCache
from gem5.simulate.exit_event_generators import (
@@ -108,7 +109,23 @@ board = SimpleBoard(
cache_hierarchy=cache_hierarchy,
)
board.set_workload(Workload("x86-print-this-15000-with-simpoints"))
# board.set_workload(
# Workload("x86-print-this-15000-with-simpoints")
#
# **Note: This has been removed until we update the resources.json file to
# encapsulate the new Simpoint format.
# Below we set the simpoint manually.
board.set_se_simpoint_workload(
binary=obtain_resource("x86-print-this"),
arguments=["print this", 15000],
simpoint=SimpointResource(
simpoint_interval=1000000,
simpoint_list=[2, 3, 4, 15],
weight_list=[0.1, 0.2, 0.4, 0.3],
warmup_interval=1000000,
),
)
dir = Path(args.checkpoint_path)
dir.mkdir(exist_ok=True)


@@ -63,8 +63,9 @@ from gem5.components.memory import DualChannelDDR4_2400
from gem5.components.processors.simple_processor import SimpleProcessor
from gem5.components.processors.cpu_types import CPUTypes
from gem5.isas import ISA
from gem5.resources.resource import Resource
from gem5.resources.resource import SimpointResource, obtain_resource
from gem5.resources.workload import Workload
from gem5.resources.resource import SimpointResource
from pathlib import Path
from m5.stats import reset, dump
@@ -96,11 +97,29 @@ board = SimpleBoard(
cache_hierarchy=cache_hierarchy,
)
# Here we obtain the workloadfrom gem5 resources, the checkpoint in this
# Here we obtain the workload from gem5 resources, the checkpoint in this
# workload was generated from
# `configs/example/gem5_library/checkpoints/simpoints-se-checkpoint.py`.
board.set_workload(
Workload("x86-print-this-15000-with-simpoints-and-checkpoint")
# board.set_workload(
# Workload("x86-print-this-15000-with-simpoints-and-checkpoint")
#
# **Note: This has been removed until we update the resources.json file to
# encapsulate the new Simpoint format.
# Below we set the simpoint manually.
#
# This loads a single checkpoint as an example of using simpoints to simulate
# the function of a single simpoint region.
board.set_se_simpoint_workload(
binary=obtain_resource("x86-print-this"),
arguments=["print this", 15000],
simpoint=SimpointResource(
simpoint_interval=1000000,
simpoint_list=[2, 3, 4, 15],
weight_list=[0.1, 0.2, 0.4, 0.3],
warmup_interval=1000000,
),
checkpoint=obtain_resource("simpoints-se-checkpoints-v23-0-v1"),
)


@@ -0,0 +1,92 @@
# Copyright (c) 2021 The Regents of the University of California
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
"""
This gem5 configuration script creates a simple board to run an ARM
"hello world" binary using the DRAMSys simulator.
**Important Note**: DRAMSys must be compiled into the gem5 binary to use the
DRAMSys simulator. Please consult 'ext/dramsys/README' for compilation
instructions. If this is not done correctly, this script will fail with an error.
"""
from gem5.isas import ISA
from gem5.utils.requires import requires
from gem5.resources.resource import Resource
from gem5.components.memory import DRAMSysDDR3_1600
from gem5.components.processors.cpu_types import CPUTypes
from gem5.components.boards.simple_board import SimpleBoard
from gem5.components.cachehierarchies.classic.private_l1_cache_hierarchy import (
PrivateL1CacheHierarchy,
)
from gem5.components.processors.simple_processor import SimpleProcessor
from gem5.simulate.simulator import Simulator
# This check ensures the gem5 binary is compiled to the ARM ISA target. If not,
# an exception will be thrown.
requires(isa_required=ISA.ARM)
# We need a cache as DRAMSys only accepts requests with the size of a cache line
cache_hierarchy = PrivateL1CacheHierarchy(l1d_size="32kB", l1i_size="32kB")
# We use a single channel DDR3_1600 memory system
memory = DRAMSysDDR3_1600(recordable=True)
# We use a simple Timing processor with one core.
processor = SimpleProcessor(cpu_type=CPUTypes.TIMING, isa=ISA.ARM, num_cores=1)
# The gem5 library simple board which can be used to run simple SE-mode
# simulations.
board = SimpleBoard(
clk_freq="3GHz",
processor=processor,
memory=memory,
cache_hierarchy=cache_hierarchy,
)
# Here we set the workload. In this case we want to run a simple "Hello World!"
# program compiled to the ARM ISA. The `Resource` class will automatically
# download the binary from the gem5 Resources cloud bucket if it's not already
# present.
board.set_se_binary_workload(
# The `Resource` class reads the `resources.json` file from the gem5
# resources repository:
# https://gem5.googlesource.com/public/gem5-resources.
# Any resource specified in this file will be automatically retrieved.
# At the time of writing, this file is a WIP and does not contain all
# resources. Jira ticket: https://gem5.atlassian.net/browse/GEM5-1096
Resource("arm-hello64-static")
)
# Lastly we run the simulation.
simulator = Simulator(board=board)
simulator.run()
print(
"Exiting @ tick {} because {}.".format(
simulator.get_current_tick(), simulator.get_last_exit_event_cause()
)
)

View File

@@ -0,0 +1,62 @@
# Copyright (c) 2023 The Regents of the University of California
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
"""
This script is used for running a traffic generator connected to the
DRAMSys simulator.
**Important Note**: DRAMSys must be compiled into the gem5 binary to use the
DRAMSys simulator. Please consult 'ext/dramsys/README' for compilation
instructions. If this is not done correctly, this script will fail with an error.
"""
import m5
from gem5.components.memory import DRAMSysMem
from gem5.components.boards.test_board import TestBoard
from gem5.components.processors.linear_generator import LinearGenerator
from m5.objects import Root
memory = DRAMSysMem(
configuration="ext/dramsys/DRAMSys/DRAMSys/"
"library/resources/simulations/ddr4-example.json",
resource_directory="ext/dramsys/DRAMSys/DRAMSys/library/resources",
recordable=True,
size="4GB",
)
generator = LinearGenerator(
duration="250us",
rate="40GB/s",
num_cores=1,
max_addr=memory.get_size(),
)
board = TestBoard(
clk_freq="3GHz", generator=generator, memory=memory, cache_hierarchy=None
)
root = Root(full_system=False, system=board)
board._pre_instantiate()
m5.instantiate()
generator.start_traffic()
exit_event = m5.simulate()
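As a rough sanity check on the generator parameters above (this sketch is illustrative plain Python, not part of the gem5 API): a 40GB/s linear stream over 250us moves about 10MB of data, well within the 4GB address range it sweeps.

```python
def bytes_transferred(rate_gb_per_s: float, duration_us: float) -> int:
    """Approximate number of bytes a generator at `rate` GB/s issues in `duration` us."""
    return int(rate_gb_per_s * 1e9 * duration_us * 1e-6)

print(bytes_transferred(40, 250))  # 10000000 bytes, i.e. ~10 MB
```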

View File

@@ -0,0 +1,138 @@
# Copyright (c) 2023 The Regents of the University of California
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
"""
This configuration script shows an example of how to take checkpoints for
LoopPoint using the gem5 stdlib. To take checkpoints for LoopPoint simulation
regions, there must be a LoopPoint data file generated by Pin or the gem5
simulator. With the information in the LoopPoint data file, the stdlib
modules will take checkpoints at the beginning of the simulation regions
(warmup region included, if it exists) and record all the information needed
for restoring into a JSON file. The JSON file is required for later restores,
so please call `looppoint.output_json_file()` at the end of the simulation.
This script builds a simple board with the gem5 stdlib with no cache and a
simple memory structure to take checkpoints. Some of the components, such as
cache hierarchy, can be changed when restoring checkpoints.
Usage
-----
```
scons build/X86/gem5.opt
./build/X86/gem5.opt \
configs/example/gem5_library/looppoints/create-looppoint-checkpoint.py
```
"""
from gem5.simulate.exit_event import ExitEvent
from gem5.simulate.simulator import Simulator
from gem5.utils.requires import requires
from gem5.components.cachehierarchies.classic.no_cache import NoCache
from gem5.components.boards.simple_board import SimpleBoard
from gem5.components.memory.single_channel import SingleChannelDDR3_1600
from gem5.components.processors.simple_processor import SimpleProcessor
from gem5.components.processors.cpu_types import CPUTypes
from gem5.isas import ISA
from gem5.resources.workload import Workload
from pathlib import Path
from gem5.simulate.exit_event_generators import (
looppoint_save_checkpoint_generator,
)
import argparse
requires(isa_required=ISA.X86)
parser = argparse.ArgumentParser(
description="An example looppoint workload file path"
)
# The lone argument is a file path to the directory in which to store the checkpoints.
parser.add_argument(
"--checkpoint-path",
type=str,
required=False,
default="looppoint_checkpoints_folder",
help="The directory to store the checkpoints.",
)
args = parser.parse_args()
# When taking a checkpoint, the cache state is not saved, so the cache
# hierarchy can be changed completely when restoring from a checkpoint.
# By using NoCache() to take checkpoints, it can slightly improve the
# performance when running in atomic mode, and it will not put any restrictions
# on what people can do with the checkpoints.
cache_hierarchy = NoCache()
# Using simple memory to take checkpoints might slightly improve the
# performance in atomic mode. The memory structure can be changed when
# restoring from a checkpoint, but the size of the memory must be equal to or
# greater than that used when creating the checkpoint.
memory = SingleChannelDDR3_1600(size="2GB")
processor = SimpleProcessor(
cpu_type=CPUTypes.ATOMIC,
isa=ISA.X86,
# LoopPoint can work with multicore workloads
num_cores=9,
)
board = SimpleBoard(
clk_freq="3GHz",
processor=processor,
memory=memory,
cache_hierarchy=cache_hierarchy,
)
board.set_workload(Workload("x86-matrix-multiply-omp-100-8-looppoint-csv"))
dir = Path(args.checkpoint_path)
dir.mkdir(exist_ok=True)
simulator = Simulator(
board=board,
on_exit_event={
ExitEvent.SIMPOINT_BEGIN: looppoint_save_checkpoint_generator(
checkpoint_dir=dir,
looppoint=board.get_looppoint(),
# True if the relative PC count pairs should be updated during the
# simulation. Defaults to True.
update_relatives=True,
# True if the simulation loop should exit after all the PC count
# pairs in the LoopPoint data file have been encountered.
# Defaults to True.
exit_when_empty=True,
)
},
)
simulator.run()
# Output the JSON file
board.get_looppoint().output_json_file()

View File

@@ -0,0 +1,139 @@
# Copyright (c) 2023 The Regents of the University of California
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
"""
This configuration script shows an example of how to restore a checkpoint that
was taken for a LoopPoint simulation region by `create-looppoint-checkpoint.py`.
All the LoopPoint information should be passed in through the JSON file
generated by the gem5 simulator when all the checkpoints were taken.
This script builds a more complex board than the board used for taking
checkpoints.
Usage
-----
```
./build/X86/gem5.opt \
configs/example/gem5_library/looppoints/restore-looppoint-checkpoint.py
```
"""
import argparse
from gem5.simulate.exit_event import ExitEvent
from gem5.simulate.simulator import Simulator
from gem5.utils.requires import requires
from gem5.components.cachehierarchies.classic.private_l1_private_l2_cache_hierarchy import (
PrivateL1PrivateL2CacheHierarchy,
)
from gem5.components.boards.simple_board import SimpleBoard
from gem5.components.memory import DualChannelDDR4_2400
from gem5.components.processors.simple_processor import SimpleProcessor
from gem5.components.processors.cpu_types import CPUTypes
from gem5.isas import ISA
from gem5.resources.resource import obtain_resource
from gem5.resources.workload import Workload
from m5.stats import reset, dump
requires(isa_required=ISA.X86)
parser = argparse.ArgumentParser(description="A checkpoint-restore script.")
parser.add_argument(
"--checkpoint-region",
type=str,
required=False,
choices=(
"1",
"2",
"3",
"5",
"6",
"7",
"8",
"9",
"10",
"11",
"12",
"13",
"14",
),
default="1",
help="The checkpoint region to restore from.",
)
args = parser.parse_args()
# The cache hierarchy can be different from the cache hierarchy used in taking
# the checkpoints
cache_hierarchy = PrivateL1PrivateL2CacheHierarchy(
l1d_size="32kB",
l1i_size="32kB",
l2_size="256kB",
)
# The memory structure can be different from the memory structure used in
# taking the checkpoints, but the size of the memory must be equal or larger.
memory = DualChannelDDR4_2400(size="2GB")
processor = SimpleProcessor(
cpu_type=CPUTypes.TIMING,
isa=ISA.X86,
# The number of cores must be equal to or greater than the number used when
# taking the checkpoint.
num_cores=9,
)
board = SimpleBoard(
clk_freq="3GHz",
processor=processor,
memory=memory,
cache_hierarchy=cache_hierarchy,
)
board.set_workload(
Workload(
f"x86-matrix-multiply-omp-100-8-looppoint-region-{args.checkpoint_region}"
)
)
# This generator will dump the stats and exit the simulation loop when the
# simulation region reaches its end. If there is a warmup interval,
# the simulation stats are reset after the warmup completes.
def reset_and_dump():
if len(board.get_looppoint().get_targets()) > 1:
print("Warmup region ended. Resetting stats.")
reset()
yield False
print("Region ended. Dumping stats.")
dump()
yield True
simulator = Simulator(
board=board,
on_exit_event={ExitEvent.SIMPOINT_BEGIN: reset_and_dump()},
)
simulator.run()
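The `reset_and_dump` generator above relies on the stdlib convention that an exit-event handler yields `False` to continue the simulation and `True` to stop it. A self-contained sketch of the same control flow, with plain prints standing in for the gem5 stat calls:

```python
def reset_and_dump_sketch(num_targets: int):
    """Mimic the handler above: optional warmup reset, then a final dump."""
    if num_targets > 1:
        print("Warmup region ended. Resetting stats.")
        yield False  # continue simulating through the region of interest
    print("Region ended. Dumping stats.")
    yield True  # exit the simulation loop

handler = reset_and_dump_sketch(2)
print(next(handler))  # False: warmup boundary, keep simulating
print(next(handler))  # True: region end, stop
```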

View File

@@ -0,0 +1,89 @@
# Copyright (c) 2023 The Regents of the University of California
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
"""
This gem5 configuration script creates a simple board to run a POWER
"hello world" binary.
This is close to the simplest setup possible using the gem5
library. It does not contain any kind of caching, IO, or any non-essential
components.
Usage
-----
```
scons build/POWER/gem5.opt
./build/POWER/gem5.opt configs/example/gem5_library/power-hello.py
```
"""
from gem5.isas import ISA
from gem5.utils.requires import requires
from gem5.resources.resource import Resource
from gem5.components.memory import SingleChannelDDR4_2400
from gem5.components.processors.cpu_types import CPUTypes
from gem5.components.boards.simple_board import SimpleBoard
from gem5.components.cachehierarchies.classic.no_cache import NoCache
from gem5.components.processors.simple_processor import SimpleProcessor
from gem5.simulate.simulator import Simulator
# This check ensures the gem5 binary is compiled to the POWER ISA target.
# If not, an exception will be thrown.
requires(isa_required=ISA.POWER)
# In this setup we don't have a cache. `NoCache` can be used for such setups.
cache_hierarchy = NoCache()
# We use a single channel DDR4_2400 memory system
memory = SingleChannelDDR4_2400(size="32MB")
# We use a simple ATOMIC processor with one core.
processor = SimpleProcessor(
cpu_type=CPUTypes.ATOMIC, isa=ISA.POWER, num_cores=1
)
# The gem5 library simple board which can be used to run simple SE-mode
# simulations.
board = SimpleBoard(
clk_freq="3GHz",
processor=processor,
memory=memory,
cache_hierarchy=cache_hierarchy,
)
board.set_se_binary_workload(Resource("power-hello"))
# Lastly we run the simulation.
simulator = Simulator(board=board)
simulator.run()
print(
"Exiting @ tick {} because {}.".format(
simulator.get_current_tick(),
simulator.get_last_exit_event_cause(),
)
)

View File

@@ -39,9 +39,7 @@ scons build/RISCV/gem5.opt
from gem5.resources.resource import Resource
from gem5.simulate.simulator import Simulator
from python.gem5.prebuilt.riscvmatched.riscvmatched_board import (
RISCVMatchedBoard,
)
from gem5.prebuilt.riscvmatched.riscvmatched_board import RISCVMatchedBoard
from gem5.isas import ISA
from gem5.utils.requires import requires

View File

@@ -195,9 +195,9 @@ if args.synthetic == "1":
)
exit(-1)
command = "./{} -g {}\n".format(args.benchmark, args.size)
command = f"./{args.benchmark} -g {args.size}\n"
else:
command = "./{} -sf ../{}".format(args.benchmark, args.size)
command = f"./{args.benchmark} -sf ../{args.size}"
board.set_kernel_disk_workload(
# The x86 linux kernel will be automatically downloaded to the
@@ -262,7 +262,9 @@ print("Done with the simulation")
print()
print("Performance statistics:")
print("Simulated time in ROI: %.2fs" % ((end_tick - start_tick) / 1e12))
print(
f"Simulated time in ROI: {(end_tick - start_tick) / 1000000000000.0:.2f}s"
)
print(
"Ran a total of", simulator.get_current_tick() / 1e12, "simulated seconds"
)

View File

@@ -195,7 +195,7 @@ board = X86Board(
# properly.
command = (
"/home/gem5/NPB3.3-OMP/bin/{}.{}.x;".format(args.benchmark, args.size)
f"/home/gem5/NPB3.3-OMP/bin/{args.benchmark}.{args.size}.x;"
+ "sleep 5;"
+ "m5 exit;"
)

View File

@@ -177,10 +177,7 @@ board = X86Board(
command = (
"cd /home/gem5/parsec-benchmark;".format(args.benchmark)
+ "source env.sh;"
+ "parsecmgmt -a run -p {} -c gcc-hooks -i {} \
-n {};".format(
args.benchmark, args.size, "2"
)
+ f"parsecmgmt -a run -p {args.benchmark} -c gcc-hooks -i {args.size} -n 2;"
+ "sleep 5;"
+ "m5 exit;"
)

View File

@@ -179,7 +179,7 @@ if not os.path.exists(args.image):
print(
"https://gem5art.readthedocs.io/en/latest/tutorials/spec-tutorial.html"
)
fatal("The disk-image is not found at {}".format(args.image))
fatal(f"The disk-image is not found at {args.image}")
# Setting up all the fixed system parameters here
# Caches: MESI Two Level Cache Hierarchy
@@ -252,7 +252,7 @@ except FileExistsError:
# The runscript.sh file places `m5 exit` before and after the following command
# Therefore, we only pass this command without m5 exit.
command = "{} {} {}".format(args.benchmark, args.size, output_dir)
command = f"{args.benchmark} {args.size} {output_dir}"
board.set_kernel_disk_workload(
# The x86 linux kernel will be automatically downloaded to the
@@ -262,7 +262,7 @@ board.set_kernel_disk_workload(
kernel=Resource("x86-linux-kernel-4.19.83"),
# The location of the x86 SPEC CPU 2017 image
disk_image=CustomDiskImageResource(
args.image, disk_root_partition=args.partition
args.image, root_partition=args.partition
),
readfile_contents=command,
)
@@ -272,6 +272,7 @@ def handle_exit():
print("Done bootling Linux")
print("Resetting stats at the start of ROI!")
m5.stats.reset()
processor.switch()
yield False # E.g., continue the simulation.
print("Dump stats at the end of the ROI!")
m5.stats.dump()
@@ -304,7 +305,11 @@ print("All simulation events were successful.")
print("Performance statistics:")
print("Simulated time: " + ((str(simulator.get_roi_ticks()[0]))))
roi_begin_ticks = simulator.get_tick_stopwatch()[0][1]
roi_end_ticks = simulator.get_tick_stopwatch()[1][1]
print("roi simulated ticks: " + str(roi_end_ticks - roi_begin_ticks))
print(
"Ran a total of", simulator.get_current_tick() / 1e12, "simulated seconds"
)

View File

@@ -193,7 +193,7 @@ if not os.path.exists(args.image):
print(
"https://gem5art.readthedocs.io/en/latest/tutorials/spec-tutorial.html"
)
fatal("The disk-image is not found at {}".format(args.image))
fatal(f"The disk-image is not found at {args.image}")
# Setting up all the fixed system parameters here
# Caches: MESI Two Level Cache Hierarchy
@@ -266,7 +266,7 @@ except FileExistsError:
# The runscript.sh file places `m5 exit` before and after the following command
# Therefore, we only pass this command without m5 exit.
command = "{} {} {}".format(args.benchmark, args.size, output_dir)
command = f"{args.benchmark} {args.size} {output_dir}"
# For enabling CustomResource, we pass an additional parameter to mount the
# correct partition.
@@ -278,7 +278,7 @@ board.set_kernel_disk_workload(
kernel=Resource("x86-linux-kernel-4.19.83"),
# The location of the x86 SPEC CPU 2017 image
disk_image=CustomDiskImageResource(
args.image, disk_root_partition=args.partition
args.image, root_partition=args.partition
),
readfile_contents=command,
)
@@ -288,6 +288,7 @@ def handle_exit():
print("Done bootling Linux")
print("Resetting stats at the start of ROI!")
m5.stats.reset()
processor.switch()
yield False # E.g., continue the simulation.
print("Dump stats at the end of the ROI!")
m5.stats.dump()
@@ -319,7 +320,11 @@ print("Done with the simulation")
print()
print("Performance statistics:")
print("Simulated time in ROI: " + ((str(simulator.get_roi_ticks()[0]))))
roi_begin_ticks = simulator.get_tick_stopwatch()[0][1]
roi_end_ticks = simulator.get_tick_stopwatch()[1][1]
print("roi simulated ticks: " + str(roi_end_ticks - roi_begin_ticks))
print(
"Ran a total of", simulator.get_current_tick() / 1e12, "simulated seconds"
)

View File

@@ -48,7 +48,7 @@ class DisjointSimple(SimpleNetwork):
def connectCPU(self, opts, controllers):
# Setup parameters for makeTopology call for CPU network
topo_module = import_module("topologies.%s" % opts.cpu_topology)
topo_module = import_module(f"topologies.{opts.cpu_topology}")
topo_class = getattr(topo_module, opts.cpu_topology)
_topo = topo_class(controllers)
_topo.makeTopology(opts, self, SimpleIntLink, SimpleExtLink, Switch)
@@ -58,7 +58,7 @@ class DisjointSimple(SimpleNetwork):
def connectGPU(self, opts, controllers):
# Setup parameters for makeTopology call for GPU network
topo_module = import_module("topologies.%s" % opts.gpu_topology)
topo_module = import_module(f"topologies.{opts.gpu_topology}")
topo_class = getattr(topo_module, opts.gpu_topology)
_topo = topo_class(controllers)
_topo.makeTopology(opts, self, SimpleIntLink, SimpleExtLink, Switch)
@@ -84,7 +84,7 @@ class DisjointGarnet(GarnetNetwork):
def connectCPU(self, opts, controllers):
# Setup parameters for makeTopology call for CPU network
topo_module = import_module("topologies.%s" % opts.cpu_topology)
topo_module = import_module(f"topologies.{opts.cpu_topology}")
topo_class = getattr(topo_module, opts.cpu_topology)
_topo = topo_class(controllers)
_topo.makeTopology(
@@ -96,7 +96,7 @@ class DisjointGarnet(GarnetNetwork):
def connectGPU(self, opts, controllers):
# Setup parameters for makeTopology call
topo_module = import_module("topologies.%s" % opts.gpu_topology)
topo_module = import_module(f"topologies.{opts.gpu_topology}")
topo_class = getattr(topo_module, opts.gpu_topology)
_topo = topo_class(controllers)
_topo.makeTopology(

View File

@@ -49,7 +49,7 @@ def addAmdGPUOptions(parser):
"--cu-per-sqc",
type=int,
default=4,
help="number of CUs sharing an SQC" " (icache, and thus icache TLB)",
help="number of CUs sharing an SQC (icache, and thus icache TLB)",
)
parser.add_argument(
"--cu-per-scalar-cache",
@@ -102,19 +102,19 @@ def addAmdGPUOptions(parser):
"--issue-period",
type=int,
default=4,
help="Number of cycles per vector instruction issue" " period",
help="Number of cycles per vector instruction issue period",
)
parser.add_argument(
"--glbmem-wr-bus-width",
type=int,
default=32,
help="VGPR to Coalescer (Global Memory) data bus width" " in bytes",
help="VGPR to Coalescer (Global Memory) data bus width in bytes",
)
parser.add_argument(
"--glbmem-rd-bus-width",
type=int,
default=32,
help="Coalescer to VGPR (Global Memory) data bus width" " in bytes",
help="Coalescer to VGPR (Global Memory) data bus width in bytes",
)
# Currently we only support 1 local memory pipe
parser.add_argument(
@@ -204,20 +204,20 @@ def addAmdGPUOptions(parser):
parser.add_argument(
"--LocalMemBarrier",
action="store_true",
help="Barrier does not wait for writethroughs to " " complete",
help="Barrier does not wait for writethroughs to complete",
)
parser.add_argument(
"--countPages",
action="store_true",
help="Count Page Accesses and output in " " per-CU output files",
help="Count Page Accesses and output in per-CU output files",
)
parser.add_argument(
"--TLB-prefetch", type=int, help="prefetch depth for" "TLBs"
"--TLB-prefetch", type=int, help="prefetch depth for TLBs"
)
parser.add_argument(
"--pf-type",
type=str,
help="type of prefetch: " "PF_CU, PF_WF, PF_PHASE, PF_STRIDE",
help="type of prefetch: PF_CU, PF_WF, PF_PHASE, PF_STRIDE",
)
parser.add_argument("--pf-stride", type=int, help="set prefetch stride")
parser.add_argument(

View File

@@ -42,7 +42,7 @@ from ruby import Ruby
cookbook_runscript = """\
export LD_LIBRARY_PATH=/opt/rocm/lib:$LD_LIBRARY_PATH
export HSA_ENABLE_INTERRUPT=0
dmesg -n3
dmesg -n8
dd if=/root/roms/vega10.rom of=/dev/mem bs=1k seek=768 count=128
if [ ! -f /lib/modules/`uname -r`/updates/dkms/amdgpu.ko ]; then
echo "ERROR: Missing DKMS package for kernel `uname -r`. Exiting gem5."
@@ -99,18 +99,16 @@ if __name__ == "__m5_main__":
# Create temp script to run application
if args.app is None:
print("No application given. Use %s -a <app>" % sys.argv[0])
print(f"No application given. Use {sys.argv[0]} -a <app>")
sys.exit(1)
elif args.kernel is None:
print("No kernel path given. Use %s --kernel <vmlinux>" % sys.argv[0])
print(f"No kernel path given. Use {sys.argv[0]} --kernel <vmlinux>")
sys.exit(1)
elif args.disk_image is None:
print("No disk path given. Use %s --disk-image <linux>" % sys.argv[0])
print(f"No disk path given. Use {sys.argv[0]} --disk-image <linux>")
sys.exit(1)
elif args.gpu_mmio_trace is None:
print(
"No MMIO trace path. Use %s --gpu-mmio-trace <path>" % sys.argv[0]
)
print(f"No MMIO trace path. Use {sys.argv[0]} --gpu-mmio-trace <path>")
sys.exit(1)
_, tempRunscript = tempfile.mkstemp()

View File

@@ -43,7 +43,7 @@ from ruby import Ruby
rodinia_runscript = """\
export LD_LIBRARY_PATH=/opt/rocm/lib:$LD_LIBRARY_PATH
export HSA_ENABLE_INTERRUPT=0
dmesg -n3
dmesg -n8
dd if=/root/roms/vega10.rom of=/dev/mem bs=1k seek=768 count=128
if [ ! -f /lib/modules/`uname -r`/updates/dkms/amdgpu.ko ]; then
echo "ERROR: Missing DKMS package for kernel `uname -r`. Exiting gem5."
@@ -107,18 +107,16 @@ if __name__ == "__m5_main__":
# Create temp script to run application
if args.app is None:
print("No application given. Use %s -a <app>" % sys.argv[0])
print(f"No application given. Use {sys.argv[0]} -a <app>")
sys.exit(1)
elif args.kernel is None:
print("No kernel path given. Use %s --kernel <vmlinux>" % sys.argv[0])
print(f"No kernel path given. Use {sys.argv[0]} --kernel <vmlinux>")
sys.exit(1)
elif args.disk_image is None:
print("No disk path given. Use %s --disk-image <linux>" % sys.argv[0])
print(f"No disk path given. Use {sys.argv[0]} --disk-image <linux>")
sys.exit(1)
elif args.gpu_mmio_trace is None:
print(
"No MMIO trace path. Use %s --gpu-mmio-trace <path>" % sys.argv[0]
)
print(f"No MMIO trace path. Use {sys.argv[0]} --gpu-mmio-trace <path>")
sys.exit(1)
_, tempRunscript = tempfile.mkstemp()

View File

@@ -42,7 +42,7 @@ from ruby import Ruby
samples_runscript = """\
export LD_LIBRARY_PATH=/opt/rocm/lib:$LD_LIBRARY_PATH
export HSA_ENABLE_INTERRUPT=0
dmesg -n3
dmesg -n8
dd if=/root/roms/vega10.rom of=/dev/mem bs=1k seek=768 count=128
if [ ! -f /lib/modules/`uname -r`/updates/dkms/amdgpu.ko ]; then
echo "ERROR: Missing DKMS package for kernel `uname -r`. Exiting gem5."
@@ -97,18 +97,16 @@ if __name__ == "__m5_main__":
# Create temp script to run application
if args.app is None:
print("No application given. Use %s -a <app>" % sys.argv[0])
print(f"No application given. Use {sys.argv[0]} -a <app>")
sys.exit(1)
elif args.kernel is None:
print("No kernel path given. Use %s --kernel <vmlinux>" % sys.argv[0])
print(f"No kernel path given. Use {sys.argv[0]} --kernel <vmlinux>")
sys.exit(1)
elif args.disk_image is None:
print("No disk path given. Use %s --disk-image <linux>" % sys.argv[0])
print(f"No disk path given. Use {sys.argv[0]} --disk-image <linux>")
sys.exit(1)
elif args.gpu_mmio_trace is None:
print(
"No MMIO trace path. Use %s --gpu-mmio-trace <path>" % sys.argv[0]
)
print(f"No MMIO trace path. Use {sys.argv[0]} --gpu-mmio-trace <path>")
sys.exit(1)
_, tempRunscript = tempfile.mkstemp()

View File

@@ -30,6 +30,7 @@
# System includes
import argparse
import math
import hashlib
# gem5 related
import m5
@@ -110,13 +111,13 @@ def addRunFSOptions(parser):
action="store",
type=str,
default="16GB",
help="Specify the dGPU physical memory" " size",
help="Specify the dGPU physical memory size",
)
parser.add_argument(
"--dgpu-num-dirs",
type=int,
default=1,
help="Set " "the number of dGPU directories (memory controllers",
help="Set the number of dGPU directories (memory controllers",
)
parser.add_argument(
"--dgpu-mem-type",
@@ -125,6 +126,17 @@ def addRunFSOptions(parser):
help="type of memory to use",
)
# These are the models supported both by gem5 and by the ROCm versions
# that gem5 supports in full system mode. Other gfx versions have
# partial support in syscall emulation mode.
parser.add_argument(
"--gpu-device",
default="Vega10",
choices=["Vega10", "MI100", "MI200"],
help="GPU model to run: Vega10 (gfx900), MI100 (gfx908), or "
"MI200 (gfx90a)",
)
def runGpuFSSystem(args):
"""
@@ -145,6 +157,11 @@ def runGpuFSSystem(args):
math.ceil(float(n_cu) / args.cu_per_scalar_cache)
)
# Verify MMIO trace is valid
mmio_md5 = hashlib.md5(open(args.gpu_mmio_trace, "rb").read()).hexdigest()
if mmio_md5 != "c4ff3326ae8a036e329b8b595c83bd6d":
m5.util.panic("MMIO file does not match gem5 resources")
system = makeGpuFSSystem(args)
root = Root(
@@ -184,7 +201,7 @@ def runGpuFSSystem(args):
break
else:
print(
"Unknown exit event: %s. Continuing..." % exit_event.getCause()
f"Unknown exit event: {exit_event.getCause()}. Continuing..."
)
print(
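The MMIO-trace check added in this file hashes the trace and panics on a mismatch with the digest of the gem5-resources trace. A standalone sketch of that mechanism (file contents and digest here are examples, not the real trace):

```python
# Sketch of the MMIO-trace integrity check: hash the file and compare
# against a known-good digest; in the config a mismatch calls m5.util.panic.
import hashlib
import os
import tempfile

data = b"example mmio trace contents"       # illustrative stand-in
expected = hashlib.md5(data).hexdigest()    # known-good digest

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(data)
    path = f.name

with open(path, "rb") as f:
    actual = hashlib.md5(f.read()).hexdigest()

assert actual == expected  # would otherwise panic("MMIO file does not match")
os.remove(path)
print("trace verified")
```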

View File

@@ -170,3 +170,18 @@ def connectGPU(system, args):
system.pc.south_bridge.gpu.checkpoint_before_mmios = (
args.checkpoint_before_mmios
)
system.pc.south_bridge.gpu.device_name = args.gpu_device
if args.gpu_device == "MI100":
system.pc.south_bridge.gpu.DeviceID = 0x738C
system.pc.south_bridge.gpu.SubsystemVendorID = 0x1002
system.pc.south_bridge.gpu.SubsystemID = 0x0C34
elif args.gpu_device == "MI200":
system.pc.south_bridge.gpu.DeviceID = 0x740F
system.pc.south_bridge.gpu.SubsystemVendorID = 0x1002
system.pc.south_bridge.gpu.SubsystemID = 0x0C34
elif args.gpu_device == "Vega10":
system.pc.south_bridge.gpu.DeviceID = 0x6863
else:
panic("Unknown GPU device: {}".format(args.gpu_device))
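The if/elif chain above assigns PCI IDs per GPU model; the same data can be expressed as a lookup table. The ID values below are copied from the hunk (Vega10 keeps the default subsystem IDs):

```python
# PCI device/subsystem IDs per GPU model, as set in connectGPU above.
pci_ids = {
    "MI100": {"DeviceID": 0x738C, "SubsystemVendorID": 0x1002, "SubsystemID": 0x0C34},
    "MI200": {"DeviceID": 0x740F, "SubsystemVendorID": 0x1002, "SubsystemID": 0x0C34},
    "Vega10": {"DeviceID": 0x6863},
}

device = "MI100"
if device not in pci_ids:
    raise ValueError(f"Unknown GPU device: {device}")
print(hex(pci_ids[device]["DeviceID"]))
```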

View File

@@ -61,7 +61,9 @@ def makeGpuFSSystem(args):
panic("Need at least 2GB of system memory to load amdgpu module")
# Use the common FSConfig to setup a Linux X86 System
(TestCPUClass, test_mem_mode, FutureClass) = Simulation.setCPUClass(args)
(TestCPUClass, test_mem_mode) = Simulation.getCPUClass(args.cpu_type)
if test_mem_mode == "atomic":
test_mem_mode = "atomic_noncaching"
disks = [args.disk_image]
if args.second_disk is not None:
disks.extend([args.second_disk])
@@ -91,10 +93,11 @@ def makeGpuFSSystem(args):
# Create specified number of CPUs. GPUFS really only needs one.
system.cpu = [
X86KvmCPU(clk_domain=system.cpu_clk_domain, cpu_id=i)
TestCPUClass(clk_domain=system.cpu_clk_domain, cpu_id=i)
for i in range(args.num_cpus)
]
system.kvm_vm = KvmVM()
if ObjectList.is_kvm_cpu(TestCPUClass):
system.kvm_vm = KvmVM()
# Create AMDGPU and attach to southbridge
shader = createGPU(system, args)
@@ -112,7 +115,8 @@ def makeGpuFSSystem(args):
numHWQueues=args.num_hw_queues,
walker=hsapp_pt_walker,
)
dispatcher = GPUDispatcher()
dispatcher_exit_events = args.exit_at_gpu_kernel > -1
dispatcher = GPUDispatcher(kernel_exit_events=dispatcher_exit_events)
cp_pt_walker = VegaPagetableWalker()
gpu_cmd_proc = GPUCommandProcessor(
hsapp=gpu_hsapp, dispatcher=dispatcher, walker=cp_pt_walker
@@ -126,15 +130,55 @@ def makeGpuFSSystem(args):
device_ih = AMDGPUInterruptHandler()
system.pc.south_bridge.gpu.device_ih = device_ih
# Setup the SDMA engines
sdma0_pt_walker = VegaPagetableWalker()
sdma1_pt_walker = VegaPagetableWalker()
# Setup the SDMA engines depending on device. The MMIO base addresses
# can be found in the driver code under:
# include/asic_reg/sdmaX/sdmaX_Y_Z_offset.h
num_sdmas = 2
sdma_bases = []
sdma_sizes = []
if args.gpu_device == "Vega10":
num_sdmas = 2
sdma_bases = [0x4980, 0x5180]
sdma_sizes = [0x800] * 2
elif args.gpu_device == "MI100":
num_sdmas = 8
sdma_bases = [
0x4980,
0x6180,
0x78000,
0x79000,
0x7A000,
0x7B000,
0x7C000,
0x7D000,
]
sdma_sizes = [0x1000] * 8
elif args.gpu_device == "MI200":
num_sdmas = 5
sdma_bases = [
0x4980,
0x6180,
0x78000,
0x79000,
0x7A000,
]
sdma_sizes = [0x1000] * 5
else:
m5.util.panic(f"Unknown GPU device {args.gpu_device}")
sdma0 = SDMAEngine(walker=sdma0_pt_walker)
sdma1 = SDMAEngine(walker=sdma1_pt_walker)
sdma_pt_walkers = []
sdma_engines = []
for sdma_idx in range(num_sdmas):
sdma_pt_walker = VegaPagetableWalker()
sdma_engine = SDMAEngine(
walker=sdma_pt_walker,
mmio_base=sdma_bases[sdma_idx],
mmio_size=sdma_sizes[sdma_idx],
)
sdma_pt_walkers.append(sdma_pt_walker)
sdma_engines.append(sdma_engine)
system.pc.south_bridge.gpu.sdma0 = sdma0
system.pc.south_bridge.gpu.sdma1 = sdma1
system.pc.south_bridge.gpu.sdmas = sdma_engines
# Setup PM4 packet processor
pm4_pkt_proc = PM4PacketProcessor()
@@ -152,22 +196,22 @@ def makeGpuFSSystem(args):
system._dma_ports.append(gpu_hsapp)
system._dma_ports.append(gpu_cmd_proc)
system._dma_ports.append(system.pc.south_bridge.gpu)
system._dma_ports.append(sdma0)
system._dma_ports.append(sdma1)
for sdma in sdma_engines:
system._dma_ports.append(sdma)
system._dma_ports.append(device_ih)
system._dma_ports.append(pm4_pkt_proc)
system._dma_ports.append(system_hub)
system._dma_ports.append(gpu_mem_mgr)
system._dma_ports.append(hsapp_pt_walker)
system._dma_ports.append(cp_pt_walker)
system._dma_ports.append(sdma0_pt_walker)
system._dma_ports.append(sdma1_pt_walker)
for sdma_pt_walker in sdma_pt_walkers:
system._dma_ports.append(sdma_pt_walker)
gpu_hsapp.pio = system.iobus.mem_side_ports
gpu_cmd_proc.pio = system.iobus.mem_side_ports
system.pc.south_bridge.gpu.pio = system.iobus.mem_side_ports
sdma0.pio = system.iobus.mem_side_ports
sdma1.pio = system.iobus.mem_side_ports
for sdma in sdma_engines:
sdma.pio = system.iobus.mem_side_ports
device_ih.pio = system.iobus.mem_side_ports
pm4_pkt_proc.pio = system.iobus.mem_side_ports
system_hub.pio = system.iobus.mem_side_ports
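The hunk above replaces the fixed `sdma0`/`sdma1` pair with a per-device list of SDMA engines. The layout data from the diff, as a table (one engine per MMIO base; all engines of a device share the aperture size):

```python
# Per-device SDMA configuration from makeGpuFSSystem: MMIO base addresses
# and aperture sizes, values copied from the diff above.
sdma_layout = {
    "Vega10": ([0x4980, 0x5180], 0x800),
    "MI100": ([0x4980, 0x6180, 0x78000, 0x79000, 0x7A000,
               0x7B000, 0x7C000, 0x7D000], 0x1000),
    "MI200": ([0x4980, 0x6180, 0x78000, 0x79000, 0x7A000], 0x1000),
}

for device, (bases, size) in sdma_layout.items():
    print(device, len(bases), hex(size))

assert len(sdma_layout["MI100"][0]) == 8
assert len(sdma_layout["MI200"][0]) == 5
```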

View File

@@ -0,0 +1,153 @@
# Copyright (c) 2022-2023 Advanced Micro Devices, Inc.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# 1. Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# 3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from this
# software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
import m5
import runfs
import base64
import tempfile
import argparse
import sys
import os
from amd import AmdGPUOptions
from common import Options
from common import GPUTLBOptions
from ruby import Ruby
demo_runscript_without_checkpoint = """\
export LD_LIBRARY_PATH=/opt/rocm/lib:$LD_LIBRARY_PATH
export HSA_ENABLE_INTERRUPT=0
dmesg -n8
dd if=/root/roms/vega10.rom of=/dev/mem bs=1k seek=768 count=128
if [ ! -f /lib/modules/`uname -r`/updates/dkms/amdgpu.ko ]; then
echo "ERROR: Missing DKMS package for kernel `uname -r`. Exiting gem5."
/sbin/m5 exit
fi
modprobe -v amdgpu ip_block_mask=0xff ppfeaturemask=0 dpm=0 audio=0
echo "Running {} {}"
echo "{}" | base64 -d > myapp
chmod +x myapp
./myapp {}
/sbin/m5 exit
"""
demo_runscript_with_checkpoint = """\
export LD_LIBRARY_PATH=/opt/rocm/lib:$LD_LIBRARY_PATH
export HSA_ENABLE_INTERRUPT=0
dmesg -n8
dd if=/root/roms/vega10.rom of=/dev/mem bs=1k seek=768 count=128
if [ ! -f /lib/modules/`uname -r`/updates/dkms/amdgpu.ko ]; then
echo "ERROR: Missing DKMS package for kernel `uname -r`. Exiting gem5."
/sbin/m5 exit
fi
modprobe -v amdgpu ip_block_mask=0xff ppfeaturemask=0 dpm=0 audio=0
echo "Running {} {}"
echo "{}" | base64 -d > myapp
chmod +x myapp
/sbin/m5 checkpoint
./myapp {}
/sbin/m5 exit
"""
def addDemoOptions(parser):
parser.add_argument(
"-a", "--app", default=None, help="GPU application to run"
)
parser.add_argument(
"-o", "--opts", default="", help="GPU application arguments"
)
def runVegaGPUFS(cpu_type):
parser = argparse.ArgumentParser()
runfs.addRunFSOptions(parser)
Options.addCommonOptions(parser)
AmdGPUOptions.addAmdGPUOptions(parser)
Ruby.define_options(parser)
GPUTLBOptions.tlb_options(parser)
addDemoOptions(parser)
# Parse now so we can override options
args = parser.parse_args()
demo_runscript = ""
# Create temp script to run application
if args.app is None:
print(f"No application given. Use {sys.argv[0]} -a <app>")
sys.exit(1)
elif args.kernel is None:
print(f"No kernel path given. Use {sys.argv[0]} --kernel <vmlinux>")
sys.exit(1)
elif args.disk_image is None:
print(f"No disk path given. Use {sys.argv[0]} --disk-image <linux>")
sys.exit(1)
elif args.gpu_mmio_trace is None:
print(f"No MMIO trace path. Use {sys.argv[0]} --gpu-mmio-trace <path>")
sys.exit(1)
elif not os.path.isfile(args.app):
print("Could not find applcation", args.app)
sys.exit(1)
# Choose runscript based on whether any checkpointing args are set
if args.checkpoint_dir is not None:
demo_runscript = demo_runscript_with_checkpoint
else:
demo_runscript = demo_runscript_without_checkpoint
with open(os.path.abspath(args.app), "rb") as binfile:
encodedBin = base64.b64encode(binfile.read()).decode()
_, tempRunscript = tempfile.mkstemp()
with open(tempRunscript, "w") as b64file:
runscriptStr = demo_runscript.format(
args.app, args.opts, encodedBin, args.opts
)
b64file.write(runscriptStr)
if args.second_disk is None:
args.second_disk = args.disk_image
# Defaults for Vega10
args.ruby = True
args.cpu_type = cpu_type
args.num_cpus = 1
args.mem_size = "3GB"
args.dgpu = True
args.dgpu_mem_size = "16GB"
args.dgpu_start = "0GB"
args.checkpoint_restore = 0
args.disjoint = True
args.timing_gpu = True
args.script = tempRunscript
args.dgpu_xor_low_bit = 0
# Run gem5
runfs.runGpuFSSystem(args)
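The demo script above ships the GPU application into the guest by base64-encoding the binary into the runscript, where `base64 -d` reconstructs it. A minimal round-trip sketch of that mechanism (the binary bytes are fake):

```python
# Embed a binary into a shell runscript via base64, as the demo runscripts do;
# decoding on the guest side recovers the original bytes exactly.
import base64

binary = b"\x7fELF fake application bytes"  # illustrative stand-in
encoded = base64.b64encode(binary).decode()

runscript = f'echo "{encoded}" | base64 -d > myapp\nchmod +x myapp\n./myapp\n'

assert base64.b64decode(encoded) == binary
print("round-trip ok")
```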

View File

@@ -0,0 +1,32 @@
# Copyright (c) 2023 Advanced Micro Devices, Inc.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# 1. Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
#
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# 3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from this
# software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
import vega10
vega10.runVegaGPUFS("AtomicSimpleCPU")

View File

@@ -1,4 +1,4 @@
# Copyright (c) 2022 Advanced Micro Devices, Inc.
# Copyright (c) 2022-2023 Advanced Micro Devices, Inc.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
@@ -27,104 +27,6 @@
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
import m5
import runfs
import base64
import tempfile
import argparse
import sys
import os
import vega10
from amd import AmdGPUOptions
from common import Options
from common import GPUTLBOptions
from ruby import Ruby
demo_runscript = """\
export LD_LIBRARY_PATH=/opt/rocm/lib:$LD_LIBRARY_PATH
export HSA_ENABLE_INTERRUPT=0
dmesg -n3
dd if=/root/roms/vega10.rom of=/dev/mem bs=1k seek=768 count=128
if [ ! -f /lib/modules/`uname -r`/updates/dkms/amdgpu.ko ]; then
echo "ERROR: Missing DKMS package for kernel `uname -r`. Exiting gem5."
/sbin/m5 exit
fi
modprobe -v amdgpu ip_block_mask=0xff ppfeaturemask=0 dpm=0 audio=0
echo "Running {} {}"
echo "{}" | base64 -d > myapp
chmod +x myapp
./myapp {}
/sbin/m5 exit
"""
def addDemoOptions(parser):
parser.add_argument(
"-a", "--app", default=None, help="GPU application to run"
)
parser.add_argument(
"-o", "--opts", default="", help="GPU application arguments"
)
if __name__ == "__m5_main__":
parser = argparse.ArgumentParser()
runfs.addRunFSOptions(parser)
Options.addCommonOptions(parser)
AmdGPUOptions.addAmdGPUOptions(parser)
Ruby.define_options(parser)
GPUTLBOptions.tlb_options(parser)
addDemoOptions(parser)
# Parse now so we can override options
args = parser.parse_args()
# Create temp script to run application
if args.app is None:
print("No application given. Use %s -a <app>" % sys.argv[0])
sys.exit(1)
elif args.kernel is None:
print("No kernel path given. Use %s --kernel <vmlinux>" % sys.argv[0])
sys.exit(1)
elif args.disk_image is None:
print("No disk path given. Use %s --disk-image <linux>" % sys.argv[0])
sys.exit(1)
elif args.gpu_mmio_trace is None:
print(
"No MMIO trace path. Use %s --gpu-mmio-trace <path>" % sys.argv[0]
)
sys.exit(1)
elif not os.path.isfile(args.app):
print("Could not find applcation", args.app)
sys.exit(1)
with open(os.path.abspath(args.app), "rb") as binfile:
encodedBin = base64.b64encode(binfile.read()).decode()
_, tempRunscript = tempfile.mkstemp()
with open(tempRunscript, "w") as b64file:
runscriptStr = demo_runscript.format(
args.app, args.opts, encodedBin, args.opts
)
b64file.write(runscriptStr)
if args.second_disk == None:
args.second_disk = args.disk_image
# Defaults for Vega10
args.ruby = True
args.cpu_type = "X86KvmCPU"
args.num_cpus = 1
args.mem_size = "3GB"
args.dgpu = True
args.dgpu_mem_size = "16GB"
args.dgpu_start = "0GB"
args.checkpoint_restore = 0
args.disjoint = True
args.timing_gpu = True
args.script = tempRunscript
args.dgpu_xor_low_bit = 0
# Run gem5
runfs.runGpuFSSystem(args)
vega10.runVegaGPUFS("X86KvmCPU")

View File

@@ -118,11 +118,11 @@ def createVegaTopology(options):
# Populate CPU node properties
node_prop = (
"cpu_cores_count %s\n" % options.num_cpus
f"cpu_cores_count {options.num_cpus}\n"
+ "simd_count 0\n"
+ "mem_banks_count 1\n"
+ "caches_count 0\n"
+ "io_links_count %s\n" % io_links
+ f"io_links_count {io_links}\n"
+ "cpu_core_id_base 0\n"
+ "simd_id_base 0\n"
+ "max_waves_per_simd 0\n"
@@ -200,8 +200,8 @@ def createVegaTopology(options):
"cpu_cores_count 0\n"
+ "simd_count 256\n"
+ "mem_banks_count 1\n"
+ "caches_count %s\n" % caches
+ "io_links_count %s\n" % io_links
+ f"caches_count {caches}\n"
+ f"io_links_count {io_links}\n"
+ "cpu_core_id_base 0\n"
+ "simd_id_base 2147487744\n"
+ "max_waves_per_simd 10\n"
@@ -212,11 +212,11 @@ def createVegaTopology(options):
+ "simd_arrays_per_engine 1\n"
+ "cu_per_simd_array 16\n"
+ "simd_per_cu 4\n"
+ "max_slots_scratch_cu %s\n" % cu_scratch
+ f"max_slots_scratch_cu {cu_scratch}\n"
+ "vendor_id 4098\n"
+ "device_id 26720\n"
+ "location_id 1024\n"
+ "drm_render_minor %s\n" % drm_num
+ f"drm_render_minor {drm_num}\n"
+ "hive_id 0\n"
+ "num_sdma_engines 2\n"
+ "num_sdma_xgmi_engines 0\n"
@@ -313,11 +313,11 @@ def createFijiTopology(options):
# Populate CPU node properties
node_prop = (
"cpu_cores_count %s\n" % options.num_cpus
f"cpu_cores_count {options.num_cpus}\n"
+ "simd_count 0\n"
+ "mem_banks_count 1\n"
+ "caches_count 0\n"
+ "io_links_count %s\n" % io_links
+ f"io_links_count {io_links}\n"
+ "cpu_core_id_base 0\n"
+ "simd_id_base 0\n"
+ "max_waves_per_simd 0\n"
@@ -392,33 +392,30 @@ def createFijiTopology(options):
# Populate GPU node properties
node_prop = (
"cpu_cores_count 0\n"
+ "simd_count %s\n"
% (options.num_compute_units * options.simds_per_cu)
+ f"simd_count {options.num_compute_units * options.simds_per_cu}\n"
+ "mem_banks_count 1\n"
+ "caches_count %s\n" % caches
+ "io_links_count %s\n" % io_links
+ f"caches_count {caches}\n"
+ f"io_links_count {io_links}\n"
+ "cpu_core_id_base 0\n"
+ "simd_id_base 2147487744\n"
+ "max_waves_per_simd %s\n" % options.wfs_per_simd
+ "lds_size_in_kb %s\n" % int(options.lds_size / 1024)
+ f"max_waves_per_simd {options.wfs_per_simd}\n"
+ f"lds_size_in_kb {int(options.lds_size / 1024)}\n"
+ "gds_size_in_kb 0\n"
+ "wave_front_size %s\n" % options.wf_size
+ f"wave_front_size {options.wf_size}\n"
+ "array_count 4\n"
+ "simd_arrays_per_engine %s\n" % options.sa_per_complex
+ "cu_per_simd_array %s\n" % options.cu_per_sa
+ "simd_per_cu %s\n" % options.simds_per_cu
+ f"simd_arrays_per_engine {options.sa_per_complex}\n"
+ f"cu_per_simd_array {options.cu_per_sa}\n"
+ f"simd_per_cu {options.simds_per_cu}\n"
+ "max_slots_scratch_cu 32\n"
+ "vendor_id 4098\n"
+ "device_id 29440\n"
+ "location_id 512\n"
+ "drm_render_minor %s\n" % drm_num
+ "max_engine_clk_fcompute %s\n"
% int(toFrequency(options.gpu_clock) / 1e6)
+ f"drm_render_minor {drm_num}\n"
+ f"max_engine_clk_fcompute {int(toFrequency(options.gpu_clock) / 1000000.0)}\n"
+ "local_mem_size 4294967296\n"
+ "fw_version 730\n"
+ "capability 4736\n"
+ "max_engine_clk_ccompute %s\n"
% int(toFrequency(options.CPUClock) / 1e6)
+ f"max_engine_clk_ccompute {int(toFrequency(options.CPUClock) / 1000000.0)}\n"
)
file_append((node_dir, "properties"), node_prop)
@@ -484,34 +481,31 @@ def createCarrizoTopology(options):
# populate global node properties
# NOTE: SIMD count triggers a valid GPU agent creation
node_prop = (
"cpu_cores_count %s\n" % options.num_cpus
+ "simd_count %s\n"
% (options.num_compute_units * options.simds_per_cu)
+ "mem_banks_count %s\n" % mem_banks_cnt
f"cpu_cores_count {options.num_cpus}\n"
+ f"simd_count {options.num_compute_units * options.simds_per_cu}\n"
+ f"mem_banks_count {mem_banks_cnt}\n"
+ "caches_count 0\n"
+ "io_links_count 0\n"
+ "cpu_core_id_base 16\n"
+ "simd_id_base 2147483648\n"
+ "max_waves_per_simd %s\n" % options.wfs_per_simd
+ "lds_size_in_kb %s\n" % int(options.lds_size / 1024)
+ f"max_waves_per_simd {options.wfs_per_simd}\n"
+ f"lds_size_in_kb {int(options.lds_size / 1024)}\n"
+ "gds_size_in_kb 0\n"
+ "wave_front_size %s\n" % options.wf_size
+ f"wave_front_size {options.wf_size}\n"
+ "array_count 1\n"
+ "simd_arrays_per_engine %s\n" % options.sa_per_complex
+ "cu_per_simd_array %s\n" % options.cu_per_sa
+ "simd_per_cu %s\n" % options.simds_per_cu
+ f"simd_arrays_per_engine {options.sa_per_complex}\n"
+ f"cu_per_simd_array {options.cu_per_sa}\n"
+ f"simd_per_cu {options.simds_per_cu}\n"
+ "max_slots_scratch_cu 32\n"
+ "vendor_id 4098\n"
+ "device_id %s\n" % device_id
+ f"device_id {device_id}\n"
+ "location_id 8\n"
+ "drm_render_minor %s\n" % drm_num
+ "max_engine_clk_fcompute %s\n"
% int(toFrequency(options.gpu_clock) / 1e6)
+ f"drm_render_minor {drm_num}\n"
+ f"max_engine_clk_fcompute {int(toFrequency(options.gpu_clock) / 1000000.0)}\n"
+ "local_mem_size 0\n"
+ "fw_version 699\n"
+ "capability 4738\n"
+ "max_engine_clk_ccompute %s\n"
% int(toFrequency(options.CPUClock) / 1e6)
+ f"max_engine_clk_ccompute {int(toFrequency(options.CPUClock) / 1000000.0)}\n"
)
file_append((node_dir, "properties"), node_prop)
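The `node_prop` strings above are long `+`-concatenated sequences of `"key value\n"` pairs written to the emulated KFD topology. The same string can be built from a dict; the keys below are a subset from the hunk and the values are illustrative:

```python
# Build a KFD topology "properties" string from a key/value table instead of
# string concatenation (subset of keys, illustrative values).
props = {
    "cpu_cores_count": 1,
    "simd_count": 0,
    "mem_banks_count": 1,
    "io_links_count": 1,
}
node_prop = "".join(f"{key} {value}\n" for key, value in props.items())
print(node_prop, end="")
```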

View File

@@ -113,6 +113,4 @@ print("Beginning simulation!")
exit_event = m5.simulate(args.max_ticks)
print(
"Exiting @ tick {} because {}.".format(m5.curTick(), exit_event.getCause())
)
print(f"Exiting @ tick {m5.curTick()} because {exit_event.getCause()}.")

View File

@@ -330,7 +330,7 @@ def make_cache_level(ncaches, prototypes, level, next_cache):
make_cache_level(cachespec, cache_proto, len(cachespec), None)
# Connect the lowest level crossbar to the memory
last_subsys = getattr(system, "l%dsubsys0" % len(cachespec))
last_subsys = getattr(system, f"l{len(cachespec)}subsys0")
last_subsys.xbar.mem_side_ports = system.physmem.port
last_subsys.xbar.point_of_coherency = True

View File

@@ -211,8 +211,7 @@ else:
if numtesters(cachespec, testerspec) > block_size:
print(
"Error: Limited to %s testers because of false sharing"
% (block_size)
f"Error: Limited to {block_size} testers because of false sharing"
)
sys.exit(1)
@@ -351,7 +350,7 @@ make_cache_level(cachespec, cache_proto, len(cachespec), None)
# Connect the lowest level crossbar to the last-level cache and memory
# controller
last_subsys = getattr(system, "l%dsubsys0" % len(cachespec))
last_subsys = getattr(system, f"l{len(cachespec)}subsys0")
last_subsys.xbar.point_of_coherency = True
if args.noncoherent_cache:
system.llc = NoncoherentCache(

View File

@@ -68,8 +68,7 @@ sim_object_classes_by_name = {
def no_parser(cls, flags, param):
raise Exception(
"Can't parse string: %s for parameter"
" class: %s" % (str(param), cls.__name__)
f"Can't parse string: {str(param)} for parameter class: {cls.__name__}"
)
@@ -114,7 +113,7 @@ def memory_bandwidth_parser(cls, flags, param):
value = 1.0 / float(param)
# Convert to byte/s
value = ticks.fromSeconds(value)
return cls("%fB/s" % value)
return cls(f"{value:f}B/s")
# These parameters have trickier parsing from .ini files than might be
@@ -201,8 +200,7 @@ class ConfigManager(object):
if object_type not in sim_object_classes_by_name:
raise Exception(
"No SimObject type %s is available to"
" build: %s" % (object_type, object_name)
f"No SimObject type {object_type} is available to build: {object_name}"
)
object_class = sim_object_classes_by_name[object_type]
@@ -479,7 +477,7 @@ class ConfigIniFile(ConfigFile):
if object_name == "root":
return child_name
else:
return "%s.%s" % (object_name, child_name)
return f"{object_name}.{child_name}"
return [(name, make_path(name)) for name in child_names]
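The `memory_bandwidth_parser` above reads bandwidth stored in the `.ini` file as seconds-per-byte and inverts it back to bytes-per-second. A standalone sketch of that inversion (function name hypothetical; the real parser also converts through gem5 ticks):

```python
# Invert a seconds-per-byte latency back into a bytes-per-second string,
# mirroring memory_bandwidth_parser without the ticks.fromSeconds step.
def parse_bandwidth(seconds_per_byte: str) -> str:
    bytes_per_second = 1.0 / float(seconds_per_byte)
    return f"{bytes_per_second:f}B/s"

print(parse_bandwidth("1e-9"))  # 1 ns per byte -> 1 GB/s
```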

View File

@@ -91,7 +91,7 @@ from common import Options
def generateMemNode(state, mem_range):
node = FdtNode("memory@%x" % int(mem_range.start))
node = FdtNode(f"memory@{int(mem_range.start):x}")
node.append(FdtPropertyStrings("device_type", ["memory"]))
node.append(
FdtPropertyWords(
@@ -187,6 +187,7 @@ system.platform = HiFive()
# RTCCLK (Set to 100MHz for faster simulation)
system.platform.rtc = RiscvRTC(frequency=Frequency("100MHz"))
system.platform.clint.int_pin = system.platform.rtc.int_pin
system.platform.pci_host.pio = system.iobus.mem_side_ports
# VirtIOMMIO
if args.disk_image:
@@ -236,8 +237,6 @@ system.cpu_clk_domain = SrcClockDomain(
clock=args.cpu_clock, voltage_domain=system.cpu_voltage_domain
)
system.workload.object_file = args.kernel
# NOTE: Not yet tested
if args.script is not None:
system.readfile = args.script

View File

@@ -1,16 +1,4 @@
# Copyright (c) 2012-2013 ARM Limited
# All rights reserved.
#
# The license below extends only to copyright in the software and shall
# not be construed as granting a license to any other intellectual
# property including but not limited to intellectual property relating
# to a hardware implementation of the functionality of the software
# licensed hereunder. You may use the software subject to the license
# terms below provided that you ensure that this notice is replicated
# unmodified and in its entirety in all distributions of the software,
# modified or unmodified, in source code or in binary form.
#
# Copyright (c) 2006-2008 The Regents of The University of Michigan
# Copyright (c) 2023 The Regents of the University of California
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
@@ -36,253 +24,10 @@
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# Simple test script
#
# "m5 test.py"
from m5.util import fatal
import argparse
import sys
import os
import m5
from m5.defines import buildEnv
from m5.objects import *
from m5.params import NULL
from m5.util import addToPath, fatal, warn
from gem5.isas import ISA
from gem5.runtime import get_runtime_isa
addToPath("../")
from ruby import Ruby
from common import Options
from common import Simulation
from common import CacheConfig
from common import CpuConfig
from common import ObjectList
from common import MemConfig
from common.FileSystemConfig import config_filesystem
from common.Caches import *
from common.cpu2000 import *
def get_processes(args):
"""Interprets provided args and returns a list of processes"""
multiprocesses = []
inputs = []
outputs = []
errouts = []
pargs = []
workloads = args.cmd.split(";")
if args.input != "":
inputs = args.input.split(";")
if args.output != "":
outputs = args.output.split(";")
if args.errout != "":
errouts = args.errout.split(";")
if args.options != "":
pargs = args.options.split(";")
idx = 0
for wrkld in workloads:
process = Process(pid=100 + idx)
process.executable = wrkld
process.cwd = os.getcwd()
process.gid = os.getgid()
if args.env:
with open(args.env, "r") as f:
process.env = [line.rstrip() for line in f]
if len(pargs) > idx:
process.cmd = [wrkld] + pargs[idx].split()
else:
process.cmd = [wrkld]
if len(inputs) > idx:
process.input = inputs[idx]
if len(outputs) > idx:
process.output = outputs[idx]
if len(errouts) > idx:
process.errout = errouts[idx]
multiprocesses.append(process)
idx += 1
if args.smt:
assert args.cpu_type == "DerivO3CPU"
return multiprocesses, idx
else:
return multiprocesses, 1
parser = argparse.ArgumentParser()
Options.addCommonOptions(parser)
Options.addSEOptions(parser)
if "--ruby" in sys.argv:
Ruby.define_options(parser)
args = parser.parse_args()
multiprocesses = []
numThreads = 1
if args.bench:
apps = args.bench.split("-")
if len(apps) != args.num_cpus:
print("number of benchmarks not equal to set num_cpus!")
sys.exit(1)
for app in apps:
try:
if get_runtime_isa() == ISA.ARM:
exec(
"workload = %s('arm_%s', 'linux', '%s')"
% (app, args.arm_iset, args.spec_input)
)
else:
# TARGET_ISA has been removed, but this is missing a ], so it
# has incorrect syntax and wasn't being used anyway.
exec(
"workload = %s(buildEnv['TARGET_ISA', 'linux', '%s')"
% (app, args.spec_input)
)
multiprocesses.append(workload.makeProcess())
except:
print(
"Unable to find workload for %s: %s"
% (get_runtime_isa().name(), app),
file=sys.stderr,
)
sys.exit(1)
elif args.cmd:
multiprocesses, numThreads = get_processes(args)
else:
print("No workload specified. Exiting!\n", file=sys.stderr)
sys.exit(1)
(CPUClass, test_mem_mode, FutureClass) = Simulation.setCPUClass(args)
CPUClass.numThreads = numThreads
# Check -- do not allow SMT with multiple CPUs
if args.smt and args.num_cpus > 1:
fatal("You cannot use SMT with multiple CPUs!")
np = args.num_cpus
mp0_path = multiprocesses[0].executable
system = System(
cpu=[CPUClass(cpu_id=i) for i in range(np)],
mem_mode=test_mem_mode,
mem_ranges=[AddrRange(args.mem_size)],
cache_line_size=args.cacheline_size,
fatal(
"The 'configs/example/se.py' script has been deprecated. It can be "
"found in 'configs/deprecated/example' if required. Its usage should be "
"avoided as it will be removed in future releases of gem5."
)
if numThreads > 1:
system.multi_thread = True
# Create a top-level voltage domain
system.voltage_domain = VoltageDomain(voltage=args.sys_voltage)
# Create a source clock for the system and set the clock period
system.clk_domain = SrcClockDomain(
clock=args.sys_clock, voltage_domain=system.voltage_domain
)
# Create a CPU voltage domain
system.cpu_voltage_domain = VoltageDomain()
# Create a separate clock domain for the CPUs
system.cpu_clk_domain = SrcClockDomain(
clock=args.cpu_clock, voltage_domain=system.cpu_voltage_domain
)
# If elastic tracing is enabled, then configure the cpu and attach the elastic
# trace probe
if args.elastic_trace_en:
CpuConfig.config_etrace(CPUClass, system.cpu, args)
# All cpus belong to a common cpu_clk_domain, therefore running at a common
# frequency.
for cpu in system.cpu:
cpu.clk_domain = system.cpu_clk_domain
if ObjectList.is_kvm_cpu(CPUClass) or ObjectList.is_kvm_cpu(FutureClass):
if buildEnv["USE_X86_ISA"]:
system.kvm_vm = KvmVM()
system.m5ops_base = 0xFFFF0000
for process in multiprocesses:
process.useArchPT = True
process.kvmInSE = True
else:
fatal("KvmCPU can only be used in SE mode with x86")
# Sanity check
if args.simpoint_profile:
if not ObjectList.is_noncaching_cpu(CPUClass):
fatal("SimPoint/BPProbe should be done with an atomic cpu")
if np > 1:
fatal("SimPoint generation not supported with more than one CPUs")
for i in range(np):
if args.smt:
system.cpu[i].workload = multiprocesses
elif len(multiprocesses) == 1:
system.cpu[i].workload = multiprocesses[0]
else:
system.cpu[i].workload = multiprocesses[i]
if args.simpoint_profile:
system.cpu[i].addSimPointProbe(args.simpoint_interval)
if args.checker:
system.cpu[i].addCheckerCpu()
if args.bp_type:
bpClass = ObjectList.bp_list.get(args.bp_type)
system.cpu[i].branchPred = bpClass()
if args.indirect_bp_type:
indirectBPClass = ObjectList.indirect_bp_list.get(
args.indirect_bp_type
)
system.cpu[i].branchPred.indirectBranchPred = indirectBPClass()
system.cpu[i].createThreads()
if args.ruby:
Ruby.create_system(args, False, system)
assert args.num_cpus == len(system.ruby._cpu_ports)
system.ruby.clk_domain = SrcClockDomain(
clock=args.ruby_clock, voltage_domain=system.voltage_domain
)
for i in range(np):
ruby_port = system.ruby._cpu_ports[i]
# Create the interrupt controller and connect its ports to Ruby
# Note that the interrupt controller is always present but only
# in x86 does it have message ports that need to be connected
system.cpu[i].createInterruptController()
# Connect the cpu's cache ports to Ruby
ruby_port.connectCpuPorts(system.cpu[i])
else:
MemClass = Simulation.setMemClass(args)
system.membus = SystemXBar()
system.system_port = system.membus.cpu_side_ports
CacheConfig.config_cache(args, system)
MemConfig.config_mem(args, system)
config_filesystem(system, args)
system.workload = SEWorkload.init_compatible(mp0_path)
if args.wait_gdb:
system.workload.wait_for_remote_gdb = True
root = Root(full_system=False, system=system)
Simulation.run(args, root, system, FutureClass)

View File

@@ -35,7 +35,7 @@ import argparse
def generateMemNode(state, mem_range):
node = FdtNode("memory@%x" % int(mem_range.start))
node = FdtNode(f"memory@{int(mem_range.start):x}")
node.append(FdtPropertyStrings("device_type", ["memory"]))
node.append(
FdtPropertyWords(

View File

@@ -75,7 +75,7 @@ class L1ICache(L1Cache):
size = "16kB"
SimpleOpts.add_option(
"--l1i_size", help="L1 instruction cache size. Default: %s" % size
"--l1i_size", help=f"L1 instruction cache size. Default: {size}"
)
def __init__(self, opts=None):
@@ -96,7 +96,7 @@ class L1DCache(L1Cache):
size = "64kB"
SimpleOpts.add_option(
"--l1d_size", help="L1 data cache size. Default: %s" % size
"--l1d_size", help=f"L1 data cache size. Default: {size}"
)
def __init__(self, opts=None):
@@ -122,9 +122,7 @@ class L2Cache(Cache):
mshrs = 20
tgts_per_mshr = 12
SimpleOpts.add_option(
"--l2_size", help="L2 cache size. Default: %s" % size
)
SimpleOpts.add_option("--l2_size", help=f"L2 cache size. Default: {size}")
def __init__(self, opts=None):
super(L2Cache, self).__init__()

View File

@@ -78,6 +78,4 @@ m5.instantiate()
print("Beginning simulation!")
exit_event = m5.simulate()
print(
"Exiting @ tick {} because {}".format(m5.curTick(), exit_event.getCause())
)
print(f"Exiting @ tick {m5.curTick()} because {exit_event.getCause()}")


@@ -110,6 +110,4 @@ m5.instantiate()
print("Beginning simulation!")
exit_event = m5.simulate()
print(
"Exiting @ tick {} because {}".format(m5.curTick(), exit_event.getCause())
)
print(f"Exiting @ tick {m5.curTick()} because {exit_event.getCause()}")


@@ -280,6 +280,6 @@ def create_system(
elif options.topology in ["Crossbar", "Pt2Pt"]:
topology = create_topology(network_cntrls, options)
else:
m5.fatal("%s not supported!" % options.topology)
m5.fatal(f"{options.topology} not supported!")
return (cpu_sequencers, mem_cntrls, topology)


@@ -428,7 +428,7 @@ class CPUSequencerWrapper:
cpu.icache_port = self.inst_seq.in_ports
for p in cpu._cached_ports:
if str(p) != "icache_port":
exec("cpu.%s = self.data_seq.in_ports" % p)
exec(f"cpu.{p} = self.data_seq.in_ports")
cpu.connectUncachedPorts(
self.data_seq.in_ports, self.data_seq.interrupt_out_port
)


@@ -120,8 +120,8 @@ def define_options(parser):
)
protocol = buildEnv["PROTOCOL"]
exec("from . import %s" % protocol)
eval("%s.define_options(parser)" % protocol)
exec(f"from . import {protocol}")
eval(f"{protocol}.define_options(parser)")
Network.define_options(parser)
@@ -207,8 +207,8 @@ def create_topology(controllers, options):
found in configs/topologies/BaseTopology.py
This is a wrapper for the legacy topologies.
"""
exec("import topologies.%s as Topo" % options.topology)
topology = eval("Topo.%s(controllers)" % options.topology)
exec(f"import topologies.{options.topology} as Topo")
topology = eval(f"Topo.{options.topology}(controllers)")
return topology
@@ -242,7 +242,7 @@ def create_system(
cpus = system.cpu
protocol = buildEnv["PROTOCOL"]
exec("from . import %s" % protocol)
exec(f"from . import {protocol}")
try:
(cpu_sequencers, dir_cntrls, topology) = eval(
"%s.create_system(options, full_system, system, dma_ports,\
@@ -250,7 +250,7 @@ def create_system(
% protocol
)
except:
print("Error: could not create system for ruby protocol %s" % protocol)
print(f"Error: could not create system for ruby protocol {protocol}")
raise
# Create the network topology
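The string-building exec/eval pattern converted to f-strings above can also be avoided entirely with importlib plus getattr; a minimal sketch (the module and function names below are illustrative stand-ins, not gem5 code):

```python
import importlib

def call_by_name(module_name, func_name, *args):
    # Dynamic import by name, replacing exec(f"from . import {protocol}")
    mod = importlib.import_module(module_name)
    # Attribute lookup, replacing eval(f"{protocol}.define_options(parser)")
    return getattr(mod, func_name)(*args)

# Stand-in: the stdlib "math" module plays the role of a protocol module here.
print(call_by_name("math", "sqrt", 16.0))  # → 4.0
```

This keeps the dynamic-dispatch behavior while avoiding evaluation of constructed code strings.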


@@ -325,9 +325,7 @@ class CustomMesh(SimpleTopology):
rni_io_params = check_same(type(n).NoC_Params, rni_io_params)
else:
fatal(
"topologies.CustomMesh: {} not supported".format(
n.__class__.__name__
)
f"topologies.CustomMesh: {n.__class__.__name__} not supported"
)
# Create all mesh routers
@@ -420,11 +418,11 @@ class CustomMesh(SimpleTopology):
if pair_debug:
print(c.path())
for r in c.addr_ranges:
print("%s" % r)
print(f"{r}")
for p in c._pairing:
print("\t" + p.path())
for r in p.addr_ranges:
print("\t%s" % r)
print(f"\t{r}")
# all must be paired
for c in all_cache:
@@ -516,8 +514,8 @@ class CustomMesh(SimpleTopology):
assert len(c._pairing) == pairing_check
print(c.path())
for r in c.addr_ranges:
print("%s" % r)
print(f"{r}")
for p in c._pairing:
print("\t" + p.path())
for r in p.addr_ranges:
print("\t%s" % r)
print(f"\t{r}")


@@ -41,7 +41,7 @@ import os
Import('env')
env.Prepend(CPPPATH=Dir('./src'))
env.Prepend(CPPPATH=Dir('./src').srcnode())
# Add the appropriate files for the library
drampower_files = []


@@ -59,7 +59,7 @@ DRAMFile('AddressMapping.cpp')
DRAMFile('Bank.cpp')
DRAMFile('BankState.cpp')
DRAMFile('BusPacket.cpp')
DRAMFile('ClockDoenv.cpp')
DRAMFile('ClockDomain.cpp')
DRAMFile('CommandQueue.cpp')
DRAMFile('IniReader.cpp')
DRAMFile('MemoryController.cpp')
@@ -85,6 +85,6 @@ dramenv.Append(CCFLAGS=['-DNO_STORAGE'])
dramenv.Library('dramsim2', [dramenv.SharedObject(f) for f in dram_files])
env.Prepend(CPPPATH=Dir('.'))
env.Prepend(CPPPATH=Dir('.').srcnode())
env.Append(LIBS=['dramsim2'])
env.Prepend(LIBPATH=[Dir('.')])


@@ -56,12 +56,12 @@ dramsim_path = os.path.join(Dir('#').abspath, 'ext/dramsim3/DRAMsim3/')
if thermal:
superlu_path = os.path.join(dramsim_path, 'ext/SuperLU_MT_3.1/lib')
env.Prepend(CPPPATH=Dir('.'))
env.Prepend(CPPPATH=Dir('.').srcnode())
env.Append(LIBS=['dramsim3', 'superlu_mt_OPENMP', 'm', 'f77blas',
'atlas', 'gomp'],
LIBPATH=[dramsim_path, superlu_path])
else:
env.Prepend(CPPPATH=Dir('.'))
env.Prepend(CPPPATH=Dir('.').srcnode())
# a little hacky, but gets a shared library working
env.Append(LIBS=['dramsim3', 'gomp'],
LIBPATH=[dramsim_path], # compile-time lookup

ext/dramsys/README (new file)

@@ -0,0 +1,10 @@
Follow these steps to get DRAMSys as part of gem5
1. Go to ext/dramsys (this directory)
2. Clone DRAMSys: 'git clone --recursive git@github.com:tukl-msd/DRAMSys.git DRAMSys'
3. Change directory to DRAMSys: 'cd DRAMSys'
4. Checkout the correct commit: 'git checkout -b gem5 09f6dcbb91351e6ee7cadfc7bc8b29d97625db8f'
If you wish to run a simulation using the gem5 processor cores, make sure to enable the storage mode in DRAMSys.
This is done by setting the value of the "StoreMode" key to "Store" in the base configuration file.
These configuration files can be found in 'DRAMSys/library/resources/configs/simulator'.

ext/dramsys/SConscript (new file)

@@ -0,0 +1,96 @@
# Copyright (c) 2022 Fraunhofer IESE
# All rights reserved
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met: redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer;
# redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution;
# neither the name of the copyright holders nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import os
Import('env')
build_root = Dir('../..').abspath
src_root = Dir('DRAMSys/DRAMSys/library').srcnode().abspath
# See if we got a cloned DRAMSys repo as a subdirectory and set the
# HAVE_DRAMSys flag accordingly
if not os.path.exists(Dir('.').srcnode().abspath + '/DRAMSys'):
env['HAVE_DRAMSYS'] = False
Return()
env['HAVE_DRAMSYS'] = True
dramsys_files = []
dramsys_configuration_files = []
dramsys_files.extend(Glob("%s/*.cpp" % f"{src_root}/src/controller"))
for root, dirs, files in os.walk(f"{src_root}/src/controller", topdown=False):
for dir in dirs:
dramsys_files.extend(Glob("%s/*.cpp" % os.path.join(root, dir)))
dramsys_files.extend(Glob("%s/*.cpp" % f"{src_root}/src/simulation"))
for root, dirs, files in os.walk(f"{src_root}/src/simulation", topdown=False):
for dir in dirs:
dramsys_files.extend(Glob("%s/*.cpp" % os.path.join(root, dir)))
dramsys_files.extend(Glob("%s/*.cpp" % f"{src_root}/src/configuration"))
for root, dirs, files in os.walk(f"{src_root}/src/configuration", topdown=False):
for dir in dirs:
dramsys_files.extend(Glob("%s/*.cpp" % os.path.join(root, dir)))
dramsys_files.extend(Glob("%s/*.cpp" % f"{src_root}/src/error"))
dramsys_files.extend(Glob(f"{src_root}/src/error/ECC/Bit.cpp"))
dramsys_files.extend(Glob(f"{src_root}/src/error/ECC/ECC.cpp"))
dramsys_files.extend(Glob(f"{src_root}/src/error/ECC/Word.cpp"))
dramsys_files.extend(Glob("%s/*.cpp" % f"{src_root}/src/common"))
dramsys_files.extend(Glob("%s/*.cpp" % f"{src_root}/src/common/configuration"))
dramsys_files.extend(Glob("%s/*.cpp" % f"{src_root}/src/common/configuration/memspec"))
dramsys_files.extend(Glob("%s/*.c" % f"{src_root}/src/common/third_party/sqlite-amalgamation"))
env.Prepend(CPPPATH=[
src_root + "/src",
src_root + "/src/common/configuration",
src_root + "/src/common/third_party/nlohmann/include",
])
env.Prepend(CPPDEFINES=[("DRAMSysResourceDirectory", '\\"' + os.getcwd() + '/resources' + '\\"')])
env.Prepend(CPPDEFINES=[("SYSTEMC_VERSION", 20191203)])
dramsys = env.Clone()
if '-Werror' in dramsys['CCFLAGS']:
dramsys['CCFLAGS'].remove('-Werror')
dramsys.Prepend(CPPPATH=[
src_root + "/src/common/third_party/sqlite-amalgamation",
build_root + "/systemc/ext"
])
dramsys.Prepend(CPPDEFINES=[("SQLITE_ENABLE_RTREE", "1")])
dramsys_configuration = env.Clone()
dramsys.Library('dramsys', dramsys_files)
env.Append(LIBS=['dramsys', 'dl'])
env.Append(LIBPATH=[Dir('.')])


@@ -30,7 +30,7 @@
Import('env')
env.Prepend(CPPPATH=Dir('./include'))
env.Prepend(CPPPATH=Dir('./include').srcnode())
fpenv = env.Clone()

ext/gdbremote/signals.hh (new file)

@@ -0,0 +1,181 @@
//===-- Generated From GDBRemoteSignals.cpp ------------------------===//
//
// Part of the LLVM Project,
// under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===---------------------------------------------------------------===//
#include <stdint.h>
#ifndef __BASE_GDB_SIGNALS_HH__
#define __BASE_GDB_SIGNALS_HH__
/*
These signals definitions are produced from LLVM's
lldb/source/Plugins/Process/Utility/GDBRemoteSignals.cpp
*/
namespace gem5{
enum class GDBSignal : uint8_t
{
ZERO = 0, //Signal 0
HUP = 1, //hangup
INT = 2, //interrupt
QUIT = 3, //quit
ILL = 4, //illegal instruction
TRAP = 5, //trace trap (not reset when caught)
ABRT = 6, //SIGIOT
EMT = 7, //emulation trap
FPE = 8, //floating point exception
KILL = 9, //kill
BUS = 10, //bus error
SEGV = 11, //segmentation violation
SYS = 12, //invalid system call
PIPE = 13, //write to pipe with reading end closed
ALRM = 14, //alarm
TERM = 15, //termination requested
URG = 16, //urgent data on socket
STOP = 17, //process stop
TSTP = 18, //tty stop
CONT = 19, //process continue
CHLD = 20, //SIGCLD
TTIN = 21, //background tty read
TTOU = 22, //background tty write
IO = 23, //input/output ready/Pollable event
XCPU = 24, //CPU resource exceeded
XFSZ = 25, //file size limit exceeded
VTALRM = 26, //virtual time alarm
PROF = 27, //profiling time alarm
WINCH = 28, //window size changes
LOST = 29, //resource lost
USR1 = 30, //user defined signal 1
USR2 = 31, //user defined signal 2
PWR = 32, //power failure
POLL = 33, //pollable event
WIND = 34, //SIGWIND
PHONE = 35, //SIGPHONE
WAITING = 36, //process's LWPs are blocked
LWP = 37, //signal LWP
DANGER = 38, //swap space dangerously low
GRANT = 39, //monitor mode granted
RETRACT = 40, //need to relinquish monitor mode
MSG = 41, //monitor mode data available
SOUND = 42, //sound completed
SAK = 43, //secure attention
PRIO = 44, //SIGPRIO
SIG33 = 45, //real-time event 33
SIG34 = 46, //real-time event 34
SIG35 = 47, //real-time event 35
SIG36 = 48, //real-time event 36
SIG37 = 49, //real-time event 37
SIG38 = 50, //real-time event 38
SIG39 = 51, //real-time event 39
SIG40 = 52, //real-time event 40
SIG41 = 53, //real-time event 41
SIG42 = 54, //real-time event 42
SIG43 = 55, //real-time event 43
SIG44 = 56, //real-time event 44
SIG45 = 57, //real-time event 45
SIG46 = 58, //real-time event 46
SIG47 = 59, //real-time event 47
SIG48 = 60, //real-time event 48
SIG49 = 61, //real-time event 49
SIG50 = 62, //real-time event 50
SIG51 = 63, //real-time event 51
SIG52 = 64, //real-time event 52
SIG53 = 65, //real-time event 53
SIG54 = 66, //real-time event 54
SIG55 = 67, //real-time event 55
SIG56 = 68, //real-time event 56
SIG57 = 69, //real-time event 57
SIG58 = 70, //real-time event 58
SIG59 = 71, //real-time event 59
SIG60 = 72, //real-time event 60
SIG61 = 73, //real-time event 61
SIG62 = 74, //real-time event 62
SIG63 = 75, //real-time event 63
CANCEL = 76, //LWP internal signal
SIG32 = 77, //real-time event 32
SIG64 = 78, //real-time event 64
SIG65 = 79, //real-time event 65
SIG66 = 80, //real-time event 66
SIG67 = 81, //real-time event 67
SIG68 = 82, //real-time event 68
SIG69 = 83, //real-time event 69
SIG70 = 84, //real-time event 70
SIG71 = 85, //real-time event 71
SIG72 = 86, //real-time event 72
SIG73 = 87, //real-time event 73
SIG74 = 88, //real-time event 74
SIG75 = 89, //real-time event 75
SIG76 = 90, //real-time event 76
SIG77 = 91, //real-time event 77
SIG78 = 92, //real-time event 78
SIG79 = 93, //real-time event 79
SIG80 = 94, //real-time event 80
SIG81 = 95, //real-time event 81
SIG82 = 96, //real-time event 82
SIG83 = 97, //real-time event 83
SIG84 = 98, //real-time event 84
SIG85 = 99, //real-time event 85
SIG86 = 100, //real-time event 86
SIG87 = 101, //real-time event 87
SIG88 = 102, //real-time event 88
SIG89 = 103, //real-time event 89
SIG90 = 104, //real-time event 90
SIG91 = 105, //real-time event 91
SIG92 = 106, //real-time event 92
SIG93 = 107, //real-time event 93
SIG94 = 108, //real-time event 94
SIG95 = 109, //real-time event 95
SIG96 = 110, //real-time event 96
SIG97 = 111, //real-time event 97
SIG98 = 112, //real-time event 98
SIG99 = 113, //real-time event 99
SIG100 = 114, //real-time event 100
SIG101 = 115, //real-time event 101
SIG102 = 116, //real-time event 102
SIG103 = 117, //real-time event 103
SIG104 = 118, //real-time event 104
SIG105 = 119, //real-time event 105
SIG106 = 120, //real-time event 106
SIG107 = 121, //real-time event 107
SIG108 = 122, //real-time event 108
SIG109 = 123, //real-time event 109
SIG110 = 124, //real-time event 110
SIG111 = 125, //real-time event 111
SIG112 = 126, //real-time event 112
SIG113 = 127, //real-time event 113
SIG114 = 128, //real-time event 114
SIG115 = 129, //real-time event 115
SIG116 = 130, //real-time event 116
SIG117 = 131, //real-time event 117
SIG118 = 132, //real-time event 118
SIG119 = 133, //real-time event 119
SIG120 = 134, //real-time event 120
SIG121 = 135, //real-time event 121
SIG122 = 136, //real-time event 122
SIG123 = 137, //real-time event 123
SIG124 = 138, //real-time event 124
SIG125 = 139, //real-time event 125
SIG126 = 140, //real-time event 126
SIG127 = 141, //real-time event 127
INFO = 142, //information request
unknown = 143, //unknown signal
EXC_BAD_ACCESS = 145, //could not access memory
EXC_BAD_INSTRUCTION = 146, //illegal instruction/operand
EXC_ARITHMETIC = 147, //arithmetic exception
EXC_EMULATION = 148, //emulation instruction
EXC_SOFTWARE = 149, //software generated exception
EXC_BREAKPOINT = 150, //breakpoint
LIBRT = 151, //librt internal signal
};
}
#endif /* __BASE_GDB_SIGNALS_HH__ */


@@ -41,6 +41,6 @@ Import('env')
env.Library('iostream3', [env.SharedObject('zfstream.cc')])
env.Prepend(CPPPATH=Dir('.'))
env.Prepend(CPPPATH=Dir('.').srcnode())
env.Append(LIBS=['iostream3'])
env.Prepend(LIBPATH=[Dir('.')])


@@ -127,16 +127,19 @@ if not SCons.Tool.m4.exists(m4env):
# Setup m4 tool
m4env.Tool('m4')
m4env.Append(M4FLAGS=['-DSRCDIR=%s' % Dir('.').path])
m4env.Append(M4FLAGS=['-DSRCDIR=%s' % Dir('.').srcnode().path])
m4env['M4COM'] = '$M4 $M4FLAGS $SOURCES > $TARGET'
m4env.M4(target=File('libelf_convert.c'),
source=[File('elf_types.m4'), File('libelf_convert.m4')])
source=[File('elf_types.m4').srcnode(),
File('libelf_convert.m4').srcnode()])
m4env.M4(target=File('libelf_fsize.c'),
source=[File('elf_types.m4'), File('libelf_fsize.m4')])
source=[File('elf_types.m4').srcnode(),
File('libelf_fsize.m4').srcnode()])
m4env.M4(target=File('libelf_msize.c'),
source=[File('elf_types.m4'), File('libelf_msize.m4')])
source=[File('elf_types.m4').srcnode(),
File('libelf_msize.m4').srcnode()])
m4env.Append(CPPPATH=Dir('.'))
m4env.Append(CPPPATH=[Dir('.'), Dir('.').srcnode()])
# Build libelf as a static library with PIC code so it can be linked
# into either m5 or the library
@@ -146,6 +149,6 @@ m4env.Library('elf', [m4env.SharedObject(f) for f in elf_files])
m4env.Command(File('native-elf-format.h'), File('native-elf-format'),
'${SOURCE} > ${TARGET}')
env.Prepend(CPPPATH=Dir('.'))
env.Prepend(CPPPATH=Dir('.').srcnode())
env.Append(LIBS=[File('libelf.a')])
env.Prepend(LIBPATH=[Dir('.')])


@@ -44,6 +44,6 @@ FdtFile('fdt_empty_tree.c')
FdtFile('fdt_strerror.c')
env.Library('fdt', [env.SharedObject(f) for f in fdt_files])
env.Prepend(CPPPATH=Dir('.'))
env.Prepend(CPPPATH=Dir('.').srcnode())
env.Append(LIBS=['fdt'])
env.Prepend(LIBPATH=[Dir('.')])


@@ -39,7 +39,7 @@
Import('env')
env.Prepend(CPPPATH=Dir('./include'))
env.Prepend(CPPPATH=Dir('include').srcnode())
nomali = env.Clone()
nomali.Append(CCFLAGS=['-Wno-ignored-qualifiers'])


@@ -1,6 +1,6 @@
version: 1.0.{build}
image:
- Visual Studio 2015
- Visual Studio 2017
test: off
skip_branch_with_pr: true
build:
@@ -11,11 +11,9 @@ environment:
matrix:
- PYTHON: 36
CONFIG: Debug
- PYTHON: 27
CONFIG: Debug
install:
- ps: |
$env:CMAKE_GENERATOR = "Visual Studio 14 2015"
$env:CMAKE_GENERATOR = "Visual Studio 15 2017"
if ($env:PLATFORM -eq "x64") { $env:PYTHON = "$env:PYTHON-x64" }
$env:PATH = "C:\Python$env:PYTHON\;C:\Python$env:PYTHON\Scripts\;$env:PATH"
python -W ignore -m pip install --upgrade pip wheel


@@ -3,19 +3,36 @@
# clang-format --style=llvm --dump-config
BasedOnStyle: LLVM
AccessModifierOffset: -4
AlignConsecutiveAssignments: true
AllowShortLambdasOnASingleLine: true
AlwaysBreakTemplateDeclarations: Yes
BinPackArguments: false
BinPackParameters: false
BreakBeforeBinaryOperators: All
BreakConstructorInitializers: BeforeColon
ColumnLimit: 99
CommentPragmas: 'NOLINT:.*|^ IWYU pragma:'
IncludeBlocks: Regroup
IndentCaseLabels: true
IndentPPDirectives: AfterHash
IndentWidth: 4
Language: Cpp
SpaceAfterCStyleCast: true
# SpaceInEmptyBlock: true # too new
Standard: Cpp11
StatementMacros: ['PyObject_HEAD']
TabWidth: 4
IncludeCategories:
- Regex: '<pybind11/.*'
Priority: -1
- Regex: 'pybind11.h"$'
Priority: 1
- Regex: '^".*/?detail/'
Priority: 1
SortPriority: 2
- Regex: '^"'
Priority: 1
SortPriority: 3
- Regex: '<[[:alnum:]._]+>'
Priority: 4
- Regex: '.*'
Priority: 5
...


@@ -1,13 +1,77 @@
FormatStyle: file
Checks: '
llvm-namespace-comment,
modernize-use-override,
readability-container-size-empty,
modernize-use-using,
modernize-use-equals-default,
modernize-use-auto,
modernize-use-emplace,
'
Checks: |
*bugprone*,
*performance*,
clang-analyzer-optin.cplusplus.VirtualCall,
clang-analyzer-optin.performance.Padding,
cppcoreguidelines-init-variables,
cppcoreguidelines-prefer-member-initializer,
cppcoreguidelines-pro-type-static-cast-downcast,
cppcoreguidelines-slicing,
google-explicit-constructor,
llvm-namespace-comment,
misc-definitions-in-headers,
misc-misplaced-const,
misc-non-copyable-objects,
misc-static-assert,
misc-throw-by-value-catch-by-reference,
misc-uniqueptr-reset-release,
misc-unused-parameters,
modernize-avoid-bind,
modernize-loop-convert,
modernize-make-shared,
modernize-redundant-void-arg,
modernize-replace-auto-ptr,
modernize-replace-disallow-copy-and-assign-macro,
modernize-replace-random-shuffle,
modernize-shrink-to-fit,
modernize-use-auto,
modernize-use-bool-literals,
modernize-use-default-member-init,
modernize-use-emplace,
modernize-use-equals-default,
modernize-use-equals-delete,
modernize-use-noexcept,
modernize-use-nullptr,
modernize-use-override,
modernize-use-using,
readability-avoid-const-params-in-decls,
readability-braces-around-statements,
readability-const-return-type,
readability-container-size-empty,
readability-delete-null-pointer,
readability-else-after-return,
readability-implicit-bool-conversion,
readability-inconsistent-declaration-parameter-name,
readability-make-member-function-const,
readability-misplaced-array-index,
readability-non-const-parameter,
readability-qualified-auto,
readability-redundant-function-ptr-dereference,
readability-redundant-smartptr-get,
readability-redundant-string-cstr,
readability-simplify-subscript-expr,
readability-static-accessed-through-instance,
readability-static-definition-in-anonymous-namespace,
readability-string-compare,
readability-suspicious-call-argument,
readability-uniqueptr-delete-release,
-bugprone-easily-swappable-parameters,
-bugprone-exception-escape,
-bugprone-reserved-identifier,
-bugprone-unused-raii,
CheckOptions:
- key: modernize-use-equals-default.IgnoreMacros
value: false
- key: performance-for-range-copy.WarnOnAllAutoCopies
value: true
- key: performance-inefficient-string-concatenation.StrictMode
value: true
- key: performance-unnecessary-value-param.AllowedTypes
value: 'exception_ptr$;'
- key: readability-implicit-bool-conversion.AllowPointerConditions
value: true
HeaderFilterRegex: 'pybind11/.*h'


@@ -0,0 +1,24 @@
template <op_id id, op_type ot, typename L = undefined_t, typename R = undefined_t>
template <typename ThisT>
auto &this_ = static_cast<ThisT &>(*this);
if (load_impl<ThisT>(temp, false)) {
ssize_t nd = 0;
auto trivial = broadcast(buffers, nd, shape);
auto ndim = (size_t) nd;
int nd;
ssize_t ndim() const { return detail::array_proxy(m_ptr)->nd; }
using op = op_impl<id, ot, Base, L_type, R_type>;
template <op_id id, op_type ot, typename L, typename R>
template <detail::op_id id, detail::op_type ot, typename L, typename R, typename... Extra>
class_ &def(const detail::op_<id, ot, L, R> &op, const Extra &...extra) {
class_ &def_cast(const detail::op_<id, ot, L, R> &op, const Extra &...extra) {
@pytest.mark.parametrize("access", ["ro", "rw", "static_ro", "static_rw"])
struct IntStruct {
explicit IntStruct(int v) : value(v){};
~IntStruct() { value = -value; }
IntStruct(const IntStruct &) = default;
IntStruct &operator=(const IntStruct &) = default;
py::class_<IntStruct>(m, "IntStruct").def(py::init([](const int i) { return IntStruct(i); }));
py::implicitly_convertible<int, IntStruct>();
m.def("test", [](int expected, const IntStruct &in) {
[](int expected, const IntStruct &in) {

ext/pybind11/.gitattributes (new vendored file)

@@ -0,0 +1 @@
docs/*.svg binary

ext/pybind11/.github/CODEOWNERS (new vendored file)

@@ -0,0 +1,9 @@
*.cmake @henryiii
CMakeLists.txt @henryiii
*.yml @henryiii
*.yaml @henryiii
/tools/ @henryiii
/pybind11/ @henryiii
noxfile.py @henryiii
.clang-format @henryiii
.clang-tidy @henryiii


@@ -53,6 +53,33 @@ derivative works thereof, in binary and source code form.
## Development of pybind11
### Quick setup
To setup a quick development environment, use [`nox`](https://nox.thea.codes).
This will allow you to do some common tasks with minimal setup effort, but will
take more time to run and be less flexible than a full development environment.
If you use [`pipx run nox`](https://pipx.pypa.io), you don't even need to
install `nox`. Examples:
```bash
# List all available sessions
nox -l
# Run linters
nox -s lint
# Run tests on Python 3.9
nox -s tests-3.9
# Build and preview docs
nox -s docs -- serve
# Build SDists and wheels
nox -s build
```
### Full setup
To setup an ideal development environment, run the following commands on a
system with CMake 3.14+:
@@ -66,11 +93,10 @@ cmake --build build -j4
Tips:
* You can use `virtualenv` (from PyPI) instead of `venv` (which is Python 3
only).
* You can use `virtualenv` (faster, from PyPI) instead of `venv`.
* You can select any name for your environment folder; if it contains "env" it
will be ignored by git.
* If you dont have CMake 3.14+, just add cmake to the pip install command.
* If you don't have CMake 3.14+, just add "cmake" to the pip install command.
* You can use `-DPYBIND11_FINDPYTHON=ON` to use FindPython on CMake 3.12+
* In classic mode, you may need to set `-DPYTHON_EXECUTABLE=/path/to/python`.
FindPython uses `-DPython_ROOT_DIR=/path/to` or
@@ -78,7 +104,7 @@ Tips:
### Configuration options
In CMake, configuration options are given with -D. Options are stored in the
In CMake, configuration options are given with "-D". Options are stored in the
build directory, in the `CMakeCache.txt` file, so they are remembered for each
build directory. Two selections are special - the generator, given with `-G`,
and the compiler, which is selected based on environment variables `CXX` and
@@ -88,12 +114,12 @@ after the initial run.
The valid options are:
* `-DCMAKE_BUILD_TYPE`: Release, Debug, MinSizeRel, RelWithDebInfo
* `-DPYBIND11_FINDPYTHON=ON`: Use CMake 3.12+s FindPython instead of the
* `-DPYBIND11_FINDPYTHON=ON`: Use CMake 3.12+'s FindPython instead of the
classic, deprecated, custom FindPythonLibs
* `-DPYBIND11_NOPYTHON=ON`: Disable all Python searching (disables tests)
* `-DBUILD_TESTING=ON`: Enable the tests
* `-DDOWNLOAD_CATCH=ON`: Download catch to build the C++ tests
* `-DOWNLOAD_EIGEN=ON`: Download Eigen for the NumPy tests
* `-DDOWNLOAD_EIGEN=ON`: Download Eigen for the NumPy tests
* `-DPYBIND11_INSTALL=ON/OFF`: Enable the install target (on by default for the
master project)
* `-DUSE_PYTHON_INSTALL_DIR=ON`: Try to install into the python dir
@@ -132,8 +158,9 @@ tests with these targets:
* `test_cmake_build`: Install / subdirectory tests
If you want to build just a subset of tests, use
`-DPYBIND11_TEST_OVERRIDE="test_callbacks.cpp;test_pickling.cpp"`. If this is
empty, all tests will be built.
`-DPYBIND11_TEST_OVERRIDE="test_callbacks;test_pickling"`. If this is
empty, all tests will be built. Tests are specified without an extension if they need both a .py and
.cpp file.
You may also pass flags to the `pytest` target by editing `tests/pytest.ini` or
by using the `PYTEST_ADDOPTS` environment variable
@@ -203,16 +230,19 @@ of the pybind11 repo.
[`clang-tidy`][clang-tidy] performs deeper static code analyses and is
more complex to run, compared to `clang-format`, but support for `clang-tidy`
is built into the pybind11 CMake configuration. To run `clang-tidy`, the
following recipe should work. Files will be modified in place, so you can
use git to monitor the changes.
following recipe should work. Run the `docker` command from the top-level
directory inside your pybind11 git clone. Files will be modified in place,
so you can use git to monitor the changes.
```bash
docker run --rm -v $PWD:/pybind11 -it silkeh/clang:10
apt-get update && apt-get install python3-dev python3-pytest
cmake -S pybind11/ -B build -DCMAKE_CXX_CLANG_TIDY="$(which clang-tidy);-fix"
cmake --build build
docker run --rm -v $PWD:/mounted_pybind11 -it silkeh/clang:13
apt-get update && apt-get install -y python3-dev python3-pytest
cmake -S /mounted_pybind11/ -B build -DCMAKE_CXX_CLANG_TIDY="$(which clang-tidy);--use-color" -DDOWNLOAD_EIGEN=ON -DDOWNLOAD_CATCH=ON -DCMAKE_CXX_STANDARD=17
cmake --build build -j 2
```
You can add `--fix` to the options list if you want.
### Include what you use
To run include what you use, install (`brew install include-what-you-use` on
@@ -228,7 +258,7 @@ The report is sent to stderr; you can pipe it into a file if you wish.
### Build recipes
This builds with the Intel compiler (assuming it is in your path, along with a
recent CMake and Python 3):
recent CMake and Python):
```bash
python3 -m venv venv


@@ -1,28 +0,0 @@
---
name: Bug Report
about: File an issue about a bug
title: "[BUG] "
---
Make sure you've completed the following steps before submitting your issue -- thank you!
1. Make sure you've read the [documentation][]. Your issue may be addressed there.
2. Search the [issue tracker][] to verify that this hasn't already been reported. +1 or comment there if it has.
3. Consider asking first in the [Gitter chat room][].
4. Include a self-contained and minimal piece of code that reproduces the problem. If that's not possible, try to make the description as clear as possible.
a. If possible, make a PR with a new, failing test to give us a starting point to work on!
[documentation]: https://pybind11.readthedocs.io
[issue tracker]: https://github.com/pybind/pybind11/issues
[Gitter chat room]: https://gitter.im/pybind/Lobby
*After reading, remove this checklist and the template text in parentheses below.*
## Issue description
(Provide a short description, state the expected behavior and what actually happens.)
## Reproducible example code
(The code should be minimal, have no external dependencies, isolate the function(s) that cause breakage. Submit matched and complete C++ and Python snippets that can be easily compiled and run to diagnose the issue.)


@@ -0,0 +1,61 @@
name: Bug Report
description: File an issue about a bug
title: "[BUG]: "
labels: [triage]
body:
- type: markdown
attributes:
value: |
Please do your best to make the issue as easy to act on as possible, and only submit here if there is clearly a problem with pybind11 (ask first if unsure). **Note that a reproducer in a PR is much more likely to get immediate attention.**
- type: checkboxes
id: steps
attributes:
label: Required prerequisites
description: Make sure you've completed the following steps before submitting your issue -- thank you!
options:
- label: Make sure you've read the [documentation](https://pybind11.readthedocs.io). Your issue may be addressed there.
required: true
- label: Search the [issue tracker](https://github.com/pybind/pybind11/issues) and [Discussions](https:/pybind/pybind11/discussions) to verify that this hasn't already been reported. +1 or comment there if it has.
required: true
- label: Consider asking first in the [Gitter chat room](https://gitter.im/pybind/Lobby) or in a [Discussion](https:/pybind/pybind11/discussions/new).
required: false
- type: input
id: version
attributes:
label: What version (or hash if on master) of pybind11 are you using?
validations:
required: true
- type: textarea
id: description
attributes:
label: Problem description
placeholder: >-
Provide a short description, state the expected behavior and what
actually happens. Include relevant information like what version of
pybind11 you are using, what system you are on, and any useful commands
/ output.
validations:
required: true
- type: textarea
id: code
attributes:
label: Reproducible example code
placeholder: >-
The code should be minimal, have no external dependencies, isolate the
function(s) that cause breakage. Submit matched and complete C++ and
Python snippets that can be easily compiled and run to diagnose the
issue. — Note that a reproducer in a PR is much more likely to get
immediate attention: failing tests in the pybind11 CI are the best
starting point for working out fixes.
render: text
- type: input
id: regression
attributes:
label: Is this a regression? Put the last known working version here if it is.
description: Put the last known working version here if this is a regression.
value: Not a regression


@@ -1,5 +1,8 @@
blank_issues_enabled: false
contact_links:
- name: Ask a question
url: https://github.com/pybind/pybind11/discussions/new
about: Please ask and answer questions here, or propose new ideas.
- name: Gitter room
url: https://gitter.im/pybind/Lobby
about: A room for discussing pybind11 with an active community


@@ -1,16 +0,0 @@
---
name: Feature Request
about: File an issue about adding a feature
title: "[FEAT] "
---
Make sure you've completed the following steps before submitting your issue -- thank you!
1. Check if your feature has already been mentioned / rejected / planned in other issues.
2. If those resources didn't help, consider asking in the [Gitter chat room][] to see if this is interesting / useful to a larger audience and possible to implement reasonably,
4. If you have a useful feature that passes the previous items (or not suitable for chat), please fill in the details below.
[Gitter chat room]: https://gitter.im/pybind/Lobby
*After reading, remove this checklist.*


@@ -1,21 +0,0 @@
---
name: Question
about: File an issue about unexplained behavior
title: "[QUESTION] "
---
If you have a question, please check the following first:
1. Check if your question has already been answered in the [FAQ][] section.
2. Make sure you've read the [documentation][]. Your issue may be addressed there.
3. If those resources didn't help and you only have a short question (not a bug report), consider asking in the [Gitter chat room][]
4. Search the [issue tracker][], including the closed issues, to see if your question has already been asked/answered. +1 or comment if it has been asked but has no answer.
5. If you have a more complex question which is not answered in the previous items (or not suitable for chat), please fill in the details below.
6. Include a self-contained and minimal piece of code that illustrates your question. If that's not possible, try to make the description as clear as possible.
[FAQ]: http://pybind11.readthedocs.io/en/latest/faq.html
[documentation]: https://pybind11.readthedocs.io
[issue tracker]: https://github.com/pybind/pybind11/issues
[Gitter chat room]: https://gitter.im/pybind/Lobby
*After reading, remove this checklist.*

@@ -5,12 +5,3 @@ updates:
directory: "/"
schedule:
interval: "daily"
ignore:
# Official actions have moving tags like v1
# that are used, so they don't need updates here
- dependency-name: "actions/checkout"
- dependency-name: "actions/setup-python"
- dependency-name: "actions/cache"
- dependency-name: "actions/upload-artifact"
- dependency-name: "actions/download-artifact"
- dependency-name: "actions/labeler"

@@ -0,0 +1,32 @@
{
"problemMatcher": [
{
"severity": "warning",
"pattern": [
{
"regexp": "^([^:]+):(\\d+):(\\d+): ([A-DF-Z]\\d+): \\033\\[[\\d;]+m([^\\033]+).*$",
"file": 1,
"line": 2,
"column": 3,
"code": 4,
"message": 5
}
],
"owner": "pylint-warning"
},
{
"severity": "error",
"pattern": [
{
"regexp": "^([^:]+):(\\d+):(\\d+): (E\\d+): \\033\\[[\\d;]+m([^\\033]+).*$",
"file": 1,
"line": 2,
"column": 3,
"code": 4,
"message": 5
}
],
"owner": "pylint-error"
}
]
}
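The warning matcher's regular expression above can be sanity-checked outside of GitHub Actions. The sketch below (the file name and message are made up) assumes `\033` is interpreted as the ESC character, as Python's `re` module does:

```python
import re

# The "pylint-warning" pattern from the JSON above, with JSON escaping undone.
pattern = re.compile(
    r"^([^:]+):(\d+):(\d+): ([A-DF-Z]\d+): \033\[[\d;]+m([^\033]+).*$"
)

# A hypothetical colorized pylint warning line (\033[33m ... \033[0m is ANSI yellow).
line = "tests/conftest.py:12:4: W0611: \033[33mUnused import os (unused-import)\033[0m"

m = pattern.match(line)
# Groups map to file, line, column, code, message per the matcher config.
print(m.group(1), m.group(2), m.group(3), m.group(4), m.group(5))
```

Note that the character class `[A-DF-Z]` deliberately skips `E`, so `E…` codes do not match here and instead fall through to the second, error-severity matcher.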

@@ -1,3 +1,7 @@
<!--
Title (above): please place [branch_name] at the beginning if you are targeting a branch other than master. *Do not target stable*.
It is recommended to use conventional commit format, see conventionalcommits.org, but not required.
-->
## Description
<!-- Include relevant issues or PRs here, describe what changed and why -->

@@ -9,6 +9,17 @@ on:
- stable
- v*
concurrency:
group: test-${{ github.ref }}
cancel-in-progress: true
env:
PIP_ONLY_BINARY: numpy
FORCE_COLOR: 3
PYTEST_TIMEOUT: 300
# For cmake:
VERBOSE: 1
jobs:
# This is the "main" test suite, which tests a large number of different
# versions of default compilers and Python versions in GitHub Actions.
@@ -16,66 +27,66 @@ jobs:
strategy:
fail-fast: false
matrix:
runs-on: [ubuntu-latest, windows-latest, macos-latest]
runs-on: [ubuntu-20.04, windows-2022, macos-latest]
python:
- 2.7
- 3.5
- 3.6
- 3.9
# - 3.10-dev # Re-enable once 3.10.0a5 is released
- pypy2
- pypy3
- '3.6'
- '3.9'
- '3.10'
- '3.11'
- 'pypy-3.7'
- 'pypy-3.8'
- 'pypy-3.9'
# Items in here will either be added to the build matrix (if not
# present), or add new keys to an existing matrix element if all the
# existing keys match.
#
# We support three optional keys: args (both build), args1 (first
# build), and args2 (second build).
# We support an optional key: args, for cmake args
include:
# Just add a key
- runs-on: ubuntu-latest
python: 3.6
- runs-on: ubuntu-20.04
python: '3.6'
args: >
-DPYBIND11_FINDPYTHON=ON
- runs-on: windows-latest
python: 3.6
-DCMAKE_CXX_FLAGS="-D_=1"
- runs-on: ubuntu-20.04
python: 'pypy-3.8'
args: >
-DPYBIND11_FINDPYTHON=ON
# These items will be removed from the build matrix, keys must match.
exclude:
# Currently 32bit only, and we build 64bit
- runs-on: windows-latest
python: pypy2
- runs-on: windows-latest
python: pypy3
# TODO: PyPy2 7.3.3 segfaults, while 7.3.2 was fine.
- runs-on: ubuntu-latest
python: pypy2
- runs-on: windows-2019
python: '3.6'
args: >
-DPYBIND11_FINDPYTHON=ON
# Inject a couple Windows 2019 runs
- runs-on: windows-2019
python: '3.9'
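The `include`/`exclude` comments above describe how GitHub Actions expands this build matrix. A simplified model of that expansion (the function and the sample values here are illustrative, not the Actions implementation) is:

```python
from itertools import product


def expand_matrix(axes, include=(), exclude=()):
    # Cross product of the axis values (e.g. runs-on x python).
    combos = [dict(zip(axes, values)) for values in product(*axes.values())]
    # `exclude` drops combinations whose listed keys all match.
    combos = [
        c for c in combos
        if not any(all(c.get(k) == v for k, v in e.items()) for e in exclude)
    ]
    # `include` adds keys to matching combinations, or appends a new one.
    for inc in include:
        shared = {k: v for k, v in inc.items() if k in axes}
        matches = [c for c in combos
                   if all(c.get(k) == v for k, v in shared.items())]
        if matches:
            for c in matches:
                c.update(inc)
        else:
            combos.append(dict(inc))
    return combos


matrix = expand_matrix(
    {"runs-on": ["ubuntu-20.04", "windows-2022"], "python": ["3.6", "3.9"]},
    include=[{"runs-on": "ubuntu-20.04", "python": "3.6",
              "args": "-DPYBIND11_FINDPYTHON=ON"}],
    exclude=[{"runs-on": "windows-2022", "python": "3.6"}],
)
```

With these sample axes, the exclude removes one of the four combinations and the include attaches extra cmake `args` to the matching ubuntu-20.04 / 3.6 entry rather than creating a new one.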
name: "🐍 ${{ matrix.python }} • ${{ matrix.runs-on }} • x64 ${{ matrix.args }}"
runs-on: ${{ matrix.runs-on }}
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Setup Python ${{ matrix.python }}
uses: actions/setup-python@v2
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python }}
- name: Setup Boost (Windows / Linux latest)
shell: bash
run: echo "BOOST_ROOT=$BOOST_ROOT_1_72_0" >> $GITHUB_ENV
- name: Setup Boost (Linux)
# Can't use boost + define _
if: runner.os == 'Linux' && matrix.python != '3.6'
run: sudo apt-get install libboost-dev
- name: Setup Boost (macOS)
if: runner.os == 'macOS'
run: brew install boost
- name: Update CMake
uses: jwlawson/actions-setup-cmake@v1.7
uses: jwlawson/actions-setup-cmake@v1.13
- name: Cache wheels
if: runner.os == 'macOS'
uses: actions/cache@v2
uses: actions/cache@v3
with:
# This path is specific to macOS - we really only need it for PyPy NumPy wheels
# See https://github.com/actions/cache/blob/master/examples.md#python---pip
@@ -85,17 +96,20 @@ jobs:
key: ${{ runner.os }}-pip-${{ matrix.python }}-x64-${{ hashFiles('tests/requirements.txt') }}
- name: Prepare env
run: python -m pip install -r tests/requirements.txt --prefer-binary
run: |
python -m pip install -r tests/requirements.txt
- name: Setup annotations on Linux
if: runner.os == 'Linux'
run: python -m pip install pytest-github-actions-annotate-failures
# First build - C++11 mode and inplace
# More-or-less randomly adding -DPYBIND11_SIMPLE_GIL_MANAGEMENT=ON here.
- name: Configure C++11 ${{ matrix.args }}
run: >
cmake -S . -B .
-DPYBIND11_WERROR=ON
-DPYBIND11_SIMPLE_GIL_MANAGEMENT=ON
-DDOWNLOAD_CATCH=ON
-DDOWNLOAD_EIGEN=ON
-DCMAKE_CXX_STANDARD=11
@@ -109,7 +123,7 @@ jobs:
- name: C++11 tests
# TODO: Figure out how to load the DLL on Python 3.8+
if: "!(runner.os == 'Windows' && (matrix.python == 3.8 || matrix.python == 3.9 || matrix.python == '3.10-dev'))"
if: "!(runner.os == 'Windows' && (matrix.python == 3.8 || matrix.python == 3.9 || matrix.python == '3.10' || matrix.python == '3.11' || matrix.python == 'pypy-3.8'))"
run: cmake --build . --target cpptest -j 2
- name: Interface test C++11
@@ -119,15 +133,16 @@ jobs:
run: git clean -fdx
# Second build - C++17 mode and in a build directory
- name: Configure ${{ matrix.args2 }}
# More-or-less randomly adding -DPYBIND11_SIMPLE_GIL_MANAGEMENT=OFF here.
- name: Configure C++17
run: >
cmake -S . -B build2
-DPYBIND11_WERROR=ON
-DPYBIND11_SIMPLE_GIL_MANAGEMENT=OFF
-DDOWNLOAD_CATCH=ON
-DDOWNLOAD_EIGEN=ON
-DCMAKE_CXX_STANDARD=17
${{ matrix.args }}
${{ matrix.args2 }}
- name: Build
run: cmake --build build2 -j 2
@@ -137,32 +152,35 @@ jobs:
- name: C++ tests
# TODO: Figure out how to load the DLL on Python 3.8+
if: "!(runner.os == 'Windows' && (matrix.python == 3.8 || matrix.python == 3.9 || matrix.python == '3.10-dev'))"
if: "!(runner.os == 'Windows' && (matrix.python == 3.8 || matrix.python == 3.9 || matrix.python == '3.10' || matrix.python == '3.11' || matrix.python == 'pypy-3.8'))"
run: cmake --build build2 --target cpptest
# Third build - C++17 mode with unstable ABI
- name: Configure (unstable ABI)
run: >
cmake -S . -B build3
-DPYBIND11_WERROR=ON
-DDOWNLOAD_CATCH=ON
-DDOWNLOAD_EIGEN=ON
-DCMAKE_CXX_STANDARD=17
-DPYBIND11_INTERNALS_VERSION=10000000
"-DPYBIND11_TEST_OVERRIDE=test_call_policies.cpp;test_gil_scoped.cpp;test_thread.cpp"
${{ matrix.args }}
- name: Build (unstable ABI)
run: cmake --build build3 -j 2
- name: Python tests (unstable ABI)
run: cmake --build build3 --target pytest
- name: Interface test
run: cmake --build build2 --target test_cmake_build
# Eventually Microsoft might have an action for setting up
# MSVC, but for now, this action works:
- name: Prepare compiler environment for Windows 🐍 2.7
if: matrix.python == 2.7 && runner.os == 'Windows'
uses: ilammy/msvc-dev-cmd@v1
with:
arch: x64
# This makes two environment variables available in the following step(s)
- name: Set Windows 🐍 2.7 environment variables
if: matrix.python == 2.7 && runner.os == 'Windows'
shell: bash
run: |
echo "DISTUTILS_USE_SDK=1" >> $GITHUB_ENV
echo "MSSdk=1" >> $GITHUB_ENV
# This makes sure the setup_helpers module can build packages using
# setuptools
- name: Setuptools helpers test
run: pytest tests/extra_setuptools
if: "!(matrix.runs-on == 'windows-2022')"
deadsnakes:
@@ -170,30 +188,31 @@ jobs:
fail-fast: false
matrix:
include:
- python-version: 3.9
# TODO: Fails on 3.10, investigate
- python-version: "3.9"
python-debug: true
valgrind: true
- python-version: 3.10-dev
- python-version: "3.11"
python-debug: false
name: "🐍 ${{ matrix.python-version }}${{ matrix.python-debug && '-dbg' || '' }} (deadsnakes)${{ matrix.valgrind && ' • Valgrind' || '' }} • x64"
runs-on: ubuntu-latest
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Setup Python ${{ matrix.python-version }} (deadsnakes)
uses: deadsnakes/action@v2.1.1
uses: deadsnakes/action@v3.0.0
with:
python-version: ${{ matrix.python-version }}
debug: ${{ matrix.python-debug }}
- name: Update CMake
uses: jwlawson/actions-setup-cmake@v1.7
uses: jwlawson/actions-setup-cmake@v1.13
- name: Valgrind cache
if: matrix.valgrind
uses: actions/cache@v2
uses: actions/cache@v3
id: cache-valgrind
with:
path: valgrind
@@ -218,9 +237,12 @@ jobs:
sudo apt-get install libc6-dbg # Needed by Valgrind
- name: Prepare env
run: python -m pip install -r tests/requirements.txt --prefer-binary
run: |
python -m pip install -r tests/requirements.txt
- name: Configure
env:
SETUPTOOLS_USE_DISTUTILS: stdlib
run: >
cmake -S . -B build
-DCMAKE_BUILD_TYPE=Debug
@@ -261,16 +283,22 @@ jobs:
include:
- clang: 5
std: 14
- clang: 10
std: 20
- clang: 10
std: 17
- clang: 11
std: 20
- clang: 12
std: 20
- clang: 13
std: 20
- clang: 14
std: 20
name: "🐍 3 • Clang ${{ matrix.clang }} • C++${{ matrix.std }} • x64"
container: "silkeh/clang:${{ matrix.clang }}"
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Add wget and python3
run: apt-get update && apt-get install -y python3-dev python3-numpy python3-pytest libeigen3-dev
@@ -300,11 +328,11 @@ jobs:
# Testing NVCC; forces sources to behave like .cu files
cuda:
runs-on: ubuntu-latest
name: "🐍 3.8 • CUDA 11 • Ubuntu 20.04"
container: nvidia/cuda:11.0-devel-ubuntu20.04
name: "🐍 3.10 • CUDA 11.7 • Ubuntu 22.04"
container: nvidia/cuda:11.7.0-devel-ubuntu22.04
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
# tzdata will try to ask for the timezone, so set the DEBIAN_FRONTEND
- name: Install 🐍 3
@@ -328,7 +356,7 @@ jobs:
# container: centos:8
#
# steps:
# - uses: actions/checkout@v2
# - uses: actions/checkout@v3
#
# - name: Add Python 3 and a few requirements
# run: yum update -y && yum install -y git python3-devel python3-numpy python3-pytest make environment-modules
@@ -367,32 +395,32 @@ jobs:
# Testing on CentOS 7 + PGI compilers, which seems to require more workarounds
centos-nvhpc7:
runs-on: ubuntu-latest
name: "🐍 3 • CentOS7 / PGI 20.9 • x64"
name: "🐍 3 • CentOS7 / PGI 22.9 • x64"
container: centos:7
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Add Python 3 and a few requirements
run: yum update -y && yum install -y epel-release && yum install -y git python3-devel make environment-modules cmake3
run: yum update -y && yum install -y epel-release && yum install -y git python3-devel make environment-modules cmake3 yum-utils
- name: Install NVidia HPC SDK
run: yum -y install https://developer.download.nvidia.com/hpc-sdk/20.9/nvhpc-20-9-20.9-1.x86_64.rpm https://developer.download.nvidia.com/hpc-sdk/20.9/nvhpc-2020-20.9-1.x86_64.rpm
run: yum-config-manager --add-repo https://developer.download.nvidia.com/hpc-sdk/rhel/nvhpc.repo && yum -y install nvhpc-22.9
# On CentOS 7, we have to filter a few tests (compiler internal error)
# and allow deeper templete recursion (not needed on CentOS 8 with a newer
# and allow deeper template recursion (not needed on CentOS 8 with a newer
# standard library). On some systems, you may need further workarounds:
# https://github.com/pybind/pybind11/pull/2475
- name: Configure
shell: bash
run: |
source /etc/profile.d/modules.sh
module load /opt/nvidia/hpc_sdk/modulefiles/nvhpc/20.9
module load /opt/nvidia/hpc_sdk/modulefiles/nvhpc/22.9
cmake3 -S . -B build -DDOWNLOAD_CATCH=ON \
-DCMAKE_CXX_STANDARD=11 \
-DPYTHON_EXECUTABLE=$(python3 -c "import sys; print(sys.executable)") \
-DCMAKE_CXX_FLAGS="-Wc,--pending_instantiations=0" \
-DPYBIND11_TEST_FILTER="test_smart_ptr.cpp;test_virtual_functions.cpp"
-DPYBIND11_TEST_FILTER="test_smart_ptr.cpp"
# Building before installing Pip should produce a warning but not an error
- name: Build
@@ -419,20 +447,20 @@ jobs:
strategy:
fail-fast: false
matrix:
gcc:
- 7
- latest
std:
- 11
include:
- gcc: 10
std: 20
- { gcc: 7, std: 11 }
- { gcc: 7, std: 17 }
- { gcc: 8, std: 14 }
- { gcc: 8, std: 17 }
- { gcc: 10, std: 17 }
- { gcc: 11, std: 20 }
- { gcc: 12, std: 20 }
name: "🐍 3 • GCC ${{ matrix.gcc }} • C++${{ matrix.std }}• x64"
container: "gcc:${{ matrix.gcc }}"
steps:
- uses: actions/checkout@v1
- uses: actions/checkout@v3
- name: Add Python 3
run: apt-get update; apt-get install -y python3-dev python3-numpy python3-pytest python3-pip libeigen3-dev
@@ -441,7 +469,7 @@ jobs:
run: python3 -m pip install --upgrade pip
- name: Update CMake
uses: jwlawson/actions-setup-cmake@v1.7
uses: jwlawson/actions-setup-cmake@v1.13
- name: Configure
shell: bash
@@ -474,7 +502,7 @@ jobs:
name: "🐍 3 • ICC latest • x64"
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Add apt repo
run: |
@@ -495,7 +523,7 @@ jobs:
- name: Install dependencies
run: |
set +e; source /opt/intel/oneapi/setvars.sh; set -e
python3 -m pip install -r tests/requirements.txt --prefer-binary
python3 -m pip install -r tests/requirements.txt
- name: Configure C++11
run: |
@@ -569,29 +597,37 @@ jobs:
strategy:
fail-fast: false
matrix:
centos:
- 7 # GCC 4.8
- 8
container:
- "centos:7" # GCC 4.8
- "almalinux:8"
- "almalinux:9"
name: "🐍 3 • CentOS ${{ matrix.centos }} • x64"
container: "centos:${{ matrix.centos }}"
name: "🐍 3 • ${{ matrix.container }} • x64"
container: "${{ matrix.container }}"
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Add Python 3
- name: Add Python 3 (RHEL 7)
if: matrix.container == 'centos:7'
run: yum update -y && yum install -y python3-devel gcc-c++ make git
- name: Add Python 3 (RHEL 8+)
if: matrix.container != 'centos:7'
run: dnf update -y && dnf install -y python3-devel gcc-c++ make git
- name: Update pip
run: python3 -m pip install --upgrade pip
- name: Install dependencies
run: python3 -m pip install cmake -r tests/requirements.txt --prefer-binary
run: |
python3 -m pip install cmake -r tests/requirements.txt
- name: Configure
shell: bash
run: >
cmake -S . -B build
-DCMAKE_BUILD_TYPE=MinSizeRel
-DPYBIND11_WERROR=ON
-DDOWNLOAD_CATCH=ON
-DDOWNLOAD_EIGEN=ON
@@ -613,18 +649,18 @@ jobs:
# This tests an "install" with the CMake tools
install-classic:
name: "🐍 3.5 • Debian • x86 • Install"
name: "🐍 3.7 • Debian • x86 • Install"
runs-on: ubuntu-latest
container: i386/debian:stretch
container: i386/debian:buster
steps:
- uses: actions/checkout@v1
- uses: actions/checkout@v1 # Required to run inside docker
- name: Install requirements
run: |
apt-get update
apt-get install -y git make cmake g++ libeigen3-dev python3-dev python3-pip
pip3 install "pytest==3.1.*"
pip3 install "pytest==6.*"
- name: Configure for install
run: >
@@ -649,33 +685,32 @@ jobs:
-DPYTHON_EXECUTABLE=$(python3 -c "import sys; print(sys.executable)")
working-directory: /build-tests
- name: Run tests
- name: Python tests
run: make pytest -j 2
working-directory: /build-tests
# This verifies that the documentation is not horribly broken, and does a
# basic sanity check on the SDist.
# basic validation check on the SDist.
doxygen:
name: "Documentation build test"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- uses: actions/setup-python@v2
- uses: actions/setup-python@v4
with:
python-version: "3.x"
- name: Install Doxygen
run: sudo apt-get install -y doxygen librsvg2-bin # Changed to rsvg-convert in 20.04
- name: Install docs & setup requirements
run: python3 -m pip install -r docs/requirements.txt
- name: Build docs
run: python3 -m sphinx -W -b html docs docs/.build
run: pipx run nox -s docs
- name: Make SDist
run: python3 setup.py sdist
run: pipx run nox -s build -- --sdist
- run: git status --ignored
@@ -687,7 +722,7 @@ jobs:
- name: Compare Dists (headers only)
working-directory: include
run: |
python3 -m pip install --user -U ../dist/*
python3 -m pip install --user -U ../dist/*.tar.gz
installed=$(python3 -c "import pybind11; print(pybind11.get_include() + '/pybind11')")
diff -rq $installed ./pybind11
@@ -696,42 +731,43 @@ jobs:
fail-fast: false
matrix:
python:
- 3.5
- 3.6
- 3.7
- 3.8
- 3.9
- pypy3
# TODO: fix hang on pypy2
include:
- python: 3.9
args: -DCMAKE_CXX_STANDARD=20 -DDOWNLOAD_EIGEN=OFF
args: -DCMAKE_CXX_STANDARD=20
- python: 3.8
args: -DCMAKE_CXX_STANDARD=17
- python: 3.7
args: -DCMAKE_CXX_STANDARD=14
name: "🐍 ${{ matrix.python }} • MSVC 2019 • x86 ${{ matrix.args }}"
runs-on: windows-latest
runs-on: windows-2019
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Setup Python ${{ matrix.python }}
uses: actions/setup-python@v2
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python }}
architecture: x86
- name: Update CMake
uses: jwlawson/actions-setup-cmake@v1.7
uses: jwlawson/actions-setup-cmake@v1.13
- name: Prepare MSVC
uses: ilammy/msvc-dev-cmd@v1
uses: ilammy/msvc-dev-cmd@v1.12.0
with:
arch: x86
- name: Prepare env
run: python -m pip install -r tests/requirements.txt --prefer-binary
run: |
python -m pip install -r tests/requirements.txt
# First build - C++11 mode and inplace
- name: Configure ${{ matrix.args }}
@@ -745,102 +781,324 @@ jobs:
- name: Build C++11
run: cmake --build build -j 2
- name: Run tests
- name: Python tests
run: cmake --build build -t pytest
win32-msvc2015:
name: "🐍 ${{ matrix.python }} • MSVC 2015 • x64"
runs-on: windows-latest
win32-debug:
strategy:
fail-fast: false
matrix:
python:
- 2.7
- 3.6
- 3.7
# todo: check/cpptest does not support 3.8+ yet
steps:
- uses: actions/checkout@v2
- name: Setup 🐍 ${{ matrix.python }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python }}
- name: Update CMake
uses: jwlawson/actions-setup-cmake@v1.7
- name: Prepare MSVC
uses: ilammy/msvc-dev-cmd@v1
with:
toolset: 14.0
- name: Prepare env
run: python -m pip install -r tests/requirements.txt --prefer-binary
# First build - C++11 mode and inplace
- name: Configure
run: >
cmake -S . -B build
-G "Visual Studio 14 2015" -A x64
-DPYBIND11_WERROR=ON
-DDOWNLOAD_CATCH=ON
-DDOWNLOAD_EIGEN=ON
- name: Build C++14
run: cmake --build build -j 2
- name: Run all checks
run: cmake --build build -t check
win32-msvc2017:
name: "🐍 ${{ matrix.python }} • MSVC 2017 • x64"
runs-on: windows-2016
strategy:
fail-fast: false
matrix:
python:
- 2.7
- 3.5
- 3.7
std:
- 14
- 3.8
- 3.9
include:
- python: 2.7
std: 17
args: >
-DCMAKE_CXX_FLAGS="/permissive- /EHsc /GR"
- python: 3.9
args: -DCMAKE_CXX_STANDARD=20
- python: 3.8
args: -DCMAKE_CXX_STANDARD=17
name: "🐍 ${{ matrix.python }} • MSVC 2019 (Debug) • x86 ${{ matrix.args }}"
runs-on: windows-2019
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Setup 🐍 ${{ matrix.python }}
uses: actions/setup-python@v2
- name: Setup Python ${{ matrix.python }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python }}
architecture: x86
- name: Update CMake
uses: jwlawson/actions-setup-cmake@v1.7
uses: jwlawson/actions-setup-cmake@v1.13
- name: Prepare MSVC
uses: ilammy/msvc-dev-cmd@v1.12.0
with:
arch: x86
- name: Prepare env
run: python -m pip install -r tests/requirements.txt --prefer-binary
run: |
python -m pip install -r tests/requirements.txt
# First build - C++11 mode and inplace
- name: Configure
- name: Configure ${{ matrix.args }}
run: >
cmake -S . -B build
-G "Visual Studio 15 2017" -A x64
-G "Visual Studio 16 2019" -A Win32
-DCMAKE_BUILD_TYPE=Debug
-DPYBIND11_WERROR=ON
-DDOWNLOAD_CATCH=ON
-DDOWNLOAD_EIGEN=ON
-DCMAKE_CXX_STANDARD=${{ matrix.std }}
${{ matrix.args }}
- name: Build C++11
run: cmake --build build --config Debug -j 2
- name: Build ${{ matrix.std }}
- name: Python tests
run: cmake --build build --config Debug -t pytest
windows-2022:
strategy:
fail-fast: false
matrix:
python:
- 3.9
name: "🐍 ${{ matrix.python }} • MSVC 2022 C++20 • x64"
runs-on: windows-2022
steps:
- uses: actions/checkout@v3
- name: Setup Python ${{ matrix.python }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python }}
- name: Prepare env
run: |
python3 -m pip install -r tests/requirements.txt
- name: Update CMake
uses: jwlawson/actions-setup-cmake@v1.13
- name: Configure C++20
run: >
cmake -S . -B build
-DPYBIND11_WERROR=ON
-DDOWNLOAD_CATCH=ON
-DDOWNLOAD_EIGEN=ON
-DCMAKE_CXX_STANDARD=20
- name: Build C++20
run: cmake --build build -j 2
- name: Run all checks
run: cmake --build build -t check
- name: Python tests
run: cmake --build build --target pytest
- name: C++20 tests
run: cmake --build build --target cpptest -j 2
- name: Interface test C++20
run: cmake --build build --target test_cmake_build
mingw:
name: "🐍 3 • windows-latest • ${{ matrix.sys }}"
runs-on: windows-latest
defaults:
run:
shell: msys2 {0}
strategy:
fail-fast: false
matrix:
include:
- { sys: mingw64, env: x86_64 }
- { sys: mingw32, env: i686 }
steps:
- uses: msys2/setup-msys2@v2
with:
msystem: ${{matrix.sys}}
install: >-
git
mingw-w64-${{matrix.env}}-gcc
mingw-w64-${{matrix.env}}-python-pip
mingw-w64-${{matrix.env}}-python-numpy
mingw-w64-${{matrix.env}}-python-scipy
mingw-w64-${{matrix.env}}-cmake
mingw-w64-${{matrix.env}}-make
mingw-w64-${{matrix.env}}-python-pytest
mingw-w64-${{matrix.env}}-eigen3
mingw-w64-${{matrix.env}}-boost
mingw-w64-${{matrix.env}}-catch
- uses: actions/checkout@v3
- name: Configure C++11
# LTO leads to many undefined reference like
# `pybind11::detail::function_call::function_call(pybind11::detail::function_call&&)
run: cmake -G "MinGW Makefiles" -DCMAKE_CXX_STANDARD=11 -DPYBIND11_WERROR=ON -DDOWNLOAD_CATCH=ON -S . -B build
- name: Build C++11
run: cmake --build build -j 2
- name: Python tests C++11
run: cmake --build build --target pytest -j 2
- name: C++11 tests
run: PYTHONHOME=/${{matrix.sys}} PYTHONPATH=/${{matrix.sys}} cmake --build build --target cpptest -j 2
- name: Interface test C++11
run: PYTHONHOME=/${{matrix.sys}} PYTHONPATH=/${{matrix.sys}} cmake --build build --target test_cmake_build
- name: Clean directory
run: git clean -fdx
- name: Configure C++14
run: cmake -G "MinGW Makefiles" -DCMAKE_CXX_STANDARD=14 -DPYBIND11_WERROR=ON -DDOWNLOAD_CATCH=ON -S . -B build2
- name: Build C++14
run: cmake --build build2 -j 2
- name: Python tests C++14
run: cmake --build build2 --target pytest -j 2
- name: C++14 tests
run: PYTHONHOME=/${{matrix.sys}} PYTHONPATH=/${{matrix.sys}} cmake --build build2 --target cpptest -j 2
- name: Interface test C++14
run: PYTHONHOME=/${{matrix.sys}} PYTHONPATH=/${{matrix.sys}} cmake --build build2 --target test_cmake_build
- name: Clean directory
run: git clean -fdx
- name: Configure C++17
run: cmake -G "MinGW Makefiles" -DCMAKE_CXX_STANDARD=17 -DPYBIND11_WERROR=ON -DDOWNLOAD_CATCH=ON -S . -B build3
- name: Build C++17
run: cmake --build build3 -j 2
- name: Python tests C++17
run: cmake --build build3 --target pytest -j 2
- name: C++17 tests
run: PYTHONHOME=/${{matrix.sys}} PYTHONPATH=/${{matrix.sys}} cmake --build build3 --target cpptest -j 2
- name: Interface test C++17
run: PYTHONHOME=/${{matrix.sys}} PYTHONPATH=/${{matrix.sys}} cmake --build build3 --target test_cmake_build
windows_clang:
strategy:
matrix:
os: [windows-latest]
python: ['3.10']
runs-on: "${{ matrix.os }}"
name: "🐍 ${{ matrix.python }} • ${{ matrix.os }} • clang-latest"
steps:
- name: Show env
run: env
- name: Checkout
uses: actions/checkout@v3
- name: Set up Clang
uses: egor-tensin/setup-clang@v1
- name: Setup Python ${{ matrix.python }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python }}
- name: Update CMake
uses: jwlawson/actions-setup-cmake@v1.13
- name: Install ninja-build tool
uses: seanmiddleditch/gha-setup-ninja@v3
- name: Run pip installs
run: |
python -m pip install --upgrade pip
python -m pip install -r tests/requirements.txt
- name: Show Clang++ version
run: clang++ --version
- name: Show CMake version
run: cmake --version
# TODO: WERROR=ON
- name: Configure Clang
run: >
cmake -G Ninja -S . -B .
-DPYBIND11_WERROR=OFF
-DPYBIND11_SIMPLE_GIL_MANAGEMENT=OFF
-DDOWNLOAD_CATCH=ON
-DDOWNLOAD_EIGEN=ON
-DCMAKE_CXX_COMPILER=clang++
-DCMAKE_CXX_STANDARD=17
- name: Build
run: cmake --build . -j 2
- name: Python tests
run: cmake --build . --target pytest -j 2
- name: C++ tests
run: cmake --build . --target cpptest -j 2
- name: Interface test
run: cmake --build . --target test_cmake_build -j 2
- name: Clean directory
run: git clean -fdx
macos_brew_install_llvm:
name: "macos-latest • brew install llvm"
runs-on: macos-latest
env:
# https://apple.stackexchange.com/questions/227026/how-to-install-recent-clang-with-homebrew
LDFLAGS: '-L/usr/local/opt/llvm/lib -Wl,-rpath,/usr/local/opt/llvm/lib'
steps:
- name: Update PATH
run: echo "/usr/local/opt/llvm/bin" >> $GITHUB_PATH
- name: Show env
run: env
- name: Checkout
uses: actions/checkout@v3
- name: Show Clang++ version before brew install llvm
run: clang++ --version
- name: brew install llvm
run: brew install llvm
- name: Show Clang++ version after brew install llvm
run: clang++ --version
- name: Update CMake
uses: jwlawson/actions-setup-cmake@v1.13
- name: Run pip installs
run: |
python3 -m pip install --upgrade pip
python3 -m pip install -r tests/requirements.txt
python3 -m pip install numpy
python3 -m pip install scipy
- name: Show CMake version
run: cmake --version
- name: CMake Configure
run: >
cmake -S . -B .
-DPYBIND11_WERROR=ON
-DPYBIND11_SIMPLE_GIL_MANAGEMENT=OFF
-DDOWNLOAD_CATCH=ON
-DDOWNLOAD_EIGEN=ON
-DCMAKE_CXX_COMPILER=clang++
-DCMAKE_CXX_STANDARD=17
-DPYTHON_EXECUTABLE=$(python3 -c "import sys; print(sys.executable)")
- name: Build
run: cmake --build . -j 2
- name: Python tests
run: cmake --build . --target pytest -j 2
- name: C++ tests
run: cmake --build . --target cpptest -j 2
- name: Interface test
run: cmake --build . --target test_cmake_build -j 2
- name: Clean directory
run: git clean -fdx

Some files were not shown because too many files have changed in this diff.