base: use setjmp to speed up fiber

ucontext is an order of magnitude slower than most fiber
implementations, mainly because of the extra signal mask operation it
performs on every context switch.
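
To make the signal mask point concrete, here is a minimal sketch that is
not part of this change. swapcontext saves and restores the signal mask
much as sigsetjmp(env, 1)/siglongjmp do, while _setjmp/_longjmp leave the
mask alone. The helper names (setUsr1Blocked, usr1Blocked,
maskRestoredBySigsetjmp, maskRestoredBySetjmp) are hypothetical, and the
sketch assumes a POSIX system such as Linux with glibc.

```cpp
#include <cassert>
#include <setjmp.h>
#include <signal.h>

// Hypothetical helper: block or unblock SIGUSR1 for the calling thread.
static void setUsr1Blocked(bool blocked)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    int rc = sigprocmask(blocked ? SIG_BLOCK : SIG_UNBLOCK, &set, nullptr);
    assert(rc == 0);
    (void)rc;
}

// Hypothetical helper: query whether SIGUSR1 is currently blocked.
static bool usr1Blocked()
{
    sigset_t cur;
    sigemptyset(&cur);
    // A null "set" argument only reads the current mask.
    int rc = sigprocmask(SIG_BLOCK, nullptr, &cur);
    assert(rc == 0);
    (void)rc;
    return sigismember(&cur, SIGUSR1) == 1;
}

// Save a context with the mask, change the mask, jump back, and report
// whether the jump undid the mask change.
bool maskRestoredBySigsetjmp()
{
    sigjmp_buf env;
    setUsr1Blocked(false);
    if (sigsetjmp(env, 1) == 0) {
        setUsr1Blocked(true);
        siglongjmp(env, 1);
    }
    bool restored = !usr1Blocked();
    setUsr1Blocked(false);
    return restored;
}

// Same experiment with _setjmp/_longjmp, which do not touch the mask.
bool maskRestoredBySetjmp()
{
    jmp_buf env;
    setUsr1Blocked(false);
    if (_setjmp(env) == 0) {
        setUsr1Blocked(true);
        _longjmp(env, 1);
    }
    bool restored = !usr1Blocked();
    setUsr1Blocked(false);
    return restored;
}
```

Restoring the mask is what typically costs a system call per switch;
skipping it is where the cheaper path gets its speed.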

This change applies the trick described at
http://www.1024cores.net/home/lock-free-algorithms/tricks/fibers,
which uses _setjmp/_longjmp to switch between contexts created by
ucontext.
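
The trick can be sketched in isolation as follows. This is a simplified
illustration under stated assumptions, not gem5's Fiber class: MiniFiber,
switchTo, workerEntry and runDemo are hypothetical names, the stack size
is arbitrary, and the code assumes Linux with glibc. ucontext is used
only to create the worker's stack and enter it once; every later switch
goes through _setjmp/_longjmp. The fortify workaround mirrors the one in
this change.

```cpp
#include <ucontext.h>

// As in this change: keep fortify's longjmp check out of the way, since
// it may reject jumps between separately allocated stacks.
#pragma push_macro("__USE_FORTIFY_LEVEL")
#undef __USE_FORTIFY_LEVEL
#include <setjmp.h>
#pragma pop_macro("__USE_FORTIFY_LEVEL")

#include <cassert>
#include <string>
#include <vector>

// Hypothetical minimal fiber.
struct MiniFiber
{
    ucontext_t ctx;           // Used to build the stack and enter it once.
    jmp_buf jmp;              // Used for every later switch.
    std::vector<char> stack;  // Backing storage for the fiber's stack.
};

static MiniFiber mainFiber;
static MiniFiber worker;
static std::string trace;
static volatile bool workerDone = false;

// Fast path: save the current position in "from" and resume "to".
static void switchTo(MiniFiber &from, MiniFiber &to)
{
    if (_setjmp(from.jmp) == 0)
        _longjmp(to.jmp, 1);
}

static void workerEntry()
{
    // First activation: record a jmp_buf on this stack, then hand control
    // back to the creator through the slow ucontext path.
    if (_setjmp(worker.jmp) == 0) {
        int ret = swapcontext(&worker.ctx, &mainFiber.ctx);
        assert(ret == 0);
        (void)ret;
    }
    for (int step = 0; step < 3; ++step) {
        trace += 'w';
        switchTo(worker, mainFiber);
    }
    workerDone = true;
    switchTo(worker, mainFiber);  // Never resumed after this point.
}

std::string runDemo()
{
    trace.clear();
    workerDone = false;
    worker.stack.assign(64 * 1024, 0);

    int ret = getcontext(&worker.ctx);
    assert(ret == 0);
    worker.ctx.uc_stack.ss_sp = worker.stack.data();
    worker.ctx.uc_stack.ss_size = worker.stack.size();
    worker.ctx.uc_link = nullptr;
    makecontext(&worker.ctx, workerEntry, 0);

    // Slow path, used once, so the worker can record its jmp_buf.
    ret = swapcontext(&mainFiber.ctx, &worker.ctx);
    assert(ret == 0);
    (void)ret;

    while (!workerDone) {
        trace += 'm';
        switchTo(mainFiber, worker);
    }
    return trace;
}
```

Calling runDemo() alternates between the two fibers and returns the
interleaving "mwmwmwm".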

Combined with the NodeList improvement, we see an 81% speed improvement
on the example provided by Matthias Jung:
https://gist.github.com/myzinsky/557200aa04556de44a317e0a10f51840

gem5's SystemC was originally about 10x slower than Accellera's SystemC;
with this change it is about 1.8x slower.

Change-Id: I0ffb6978e83dc8be049b750dc1baebb3d251601c
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/34356
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
Tested-by: kokoro <noreply+kokoro@google.com>
Earl Ou
2020-09-11 09:21:52 +08:00
parent 8864c2ea24
commit 429b828e7b
2 changed files with 18 additions and 5 deletions


@@ -145,10 +145,12 @@ Fiber::start()
     setStarted();
-    // Swap back to the parent context which is still considered "current",
-    // now that we're ready to go.
-    int ret M5_VAR_USED = swapcontext(&ctx, &_currentFiber->ctx);
-    panic_if(ret == -1, strerror(errno));
+    if (_setjmp(jmp) == 0) {
+        // Swap back to the parent context which is still considered "current",
+        // now that we're ready to go.
+        int ret = swapcontext(&ctx, &_currentFiber->ctx);
+        panic_if(ret == -1, strerror(errno));
+    }
     // Call main() when we're been reactivated for the first time.
     main();
@@ -175,7 +177,8 @@ Fiber::run()
     Fiber *prev = _currentFiber;
     Fiber *next = this;
     _currentFiber = next;
-    swapcontext(&prev->ctx, &next->ctx);
+    if (_setjmp(prev->jmp) == 0)
+        _longjmp(next->jmp, 1);
 }
 Fiber *Fiber::currentFiber() { return _currentFiber; }


@@ -39,6 +39,12 @@
 #include <ucontext.h>
 #endif
+// Avoid fortify source for longjmp to work between ucontext stacks.
+#pragma push_macro("__USE_FORTIFY_LEVEL")
+#undef __USE_FORTIFY_LEVEL
+#include <setjmp.h>
+#pragma pop_macro("__USE_FORTIFY_LEVEL")
 #include <cstddef>
 #include <cstdint>
@@ -137,6 +143,10 @@ class Fiber
     void start();
     ucontext_t ctx;
+    // ucontext is slow in swapcontext. Here we use _setjmp/_longjmp to avoid
+    // the additional signals for speed up.
+    jmp_buf jmp;
     Fiber *link;
     // The stack for this context, or a nullptr if allocated elsewhere.