Merge pull request #174 from fwsGonzo/embedded_sandbox_code
Add initial support for embedded sandboxed code
fwsGonzo committed Jun 30, 2024
2 parents 4cff3d1 + d152fdb commit ee1e831
Showing 12 changed files with 303 additions and 59 deletions.
31 changes: 30 additions & 1 deletion README.md
@@ -28,7 +28,7 @@ Non goals:

## Benchmarks

[STREAM benchmark](https://gist.github.com/fwsGonzo/a594727a9429cb29f2012652ad43fb37) [CoreMark: 34997](https://gist.github.com/fwsGonzo/7ef100ba4fe7116e97ddb20cf26e6879) vs 41382 native (~85%).
[STREAM benchmark](https://gist.github.com/fwsGonzo/a594727a9429cb29f2012652ad43fb37) [CoreMark: 35475](https://gist.github.com/fwsGonzo/7ef100ba4fe7116e97ddb20cf26e6879) vs 41382 native (~86%).

Run [D00M 1 in libriscv](/examples/doom) and see for yourself. It should use around 8% CPU at 60 fps.

@@ -348,6 +348,35 @@ When binary translation is enabled, the option `RISCV_LIBTCC` is also available.

If you see the error `tcc: error: file 'libtcc1.a' not found`, you can switch to the distro package by enabling `RISCV_LIBTCC_DISTRO_PACKAGE`, where `libtcc1.a` is pre-installed; in that case, install your distro's equivalent of the `libtcc-dev` package. Otherwise, the CMake build script produces `libtcc1.a` and puts it at the root of the build folder, so a quick fix for the error is to create a symbolic link: `ln -fs build/libtcc1.a .`. `libtcc1.a` is a run-time dependency of TCC.

### Full binary translation as embeddable code

It is possible to generate freestanding C99 source files from a binary-translated program, embed them in a project at a later time, and have the emulator automatically load and use the translation at run time. This makes full binary translation usable on platforms where it is ordinarily not possible. If a RISC-V program is changed without regenerating the sources, the emulator will (intentionally) not find the embedded functions and will instead fall back to another mode, e.g. interpreter mode. Changing a RISC-V program therefore requires regenerating the sources and rebuilding the final program. In practice, this adds support for high-performance emulation on all console systems, for final/shipped builds.

In order to test this feature, follow these instructions:
```
$ cd emulator
$ ./build.sh --bintr
$ ./rvlinux -o test ~/github/coremark/coremark-rv32g_b
$ ls test*
test17A11122.cpp
$ ./build.sh --embed test17A11122.cpp
$ ./rvlinux -v ~/github/coremark/coremark-rv32g_b
* Loading program of size 75145 from 0x77e75e5b2010 to virtual 0x10000 -> 0x22589
* Program segment readable: 1 writable: 0 executable: 1
* Loading program of size 1864 from 0x77e75e5c459c to virtual 0x2358c -> 0x23cd4
* Program segment readable: 1 writable: 1 executable: 0
Found embedded translation for hash 17A11122
...
CoreMark 1.0 : 35475.669603 / GCC13.2.0 -O3 -DPERFORMANCE_RUN=1 / Static
```

- The original RISC-V binary is still needed, as the emulator treats it as the ultimate source of truth
- If the RISC-V program changes, the emulator will not use the outdated embedded code and will instead fall back to another emulation mode, e.g. interpreter mode
- Multiple files can be embedded, which allows a dynamic executable to be embedded together with all its dependencies
- The libriscv configuration settings are part of the hash in the filename, so the configurations must match exactly in order to use the generated code on other systems and platforms
- Embedded segments can be re-used by many emulators, for high scalability
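
The lookup-and-fallback behaviour in the bullets above can be pictured as a registry keyed by hash. This is a minimal sketch under stated assumptions, not libriscv's actual mechanism: `EmbeddedEntry`, `registry`, `register_embedded`, `select_mode` and `Mode` are hypothetical names, and a 32-bit hash is assumed only because the filename in the example carries eight hex digits.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical record for one embedded translation.
struct EmbeddedEntry {
    std::string source_name; // generated file the code came from
};

enum class Mode { Embedded, Interpreter };

// Process-wide table keyed by program hash (assumed 32-bit here).
inline std::unordered_map<uint32_t, EmbeddedEntry>& registry()
{
    static std::unordered_map<uint32_t, EmbeddedEntry> table;
    return table;
}

// Generated sources could call this from a static initializer to
// announce themselves. Returns false if the hash was already present.
inline bool register_embedded(uint32_t hash, std::string source_name)
{
    return registry().emplace(hash, EmbeddedEntry{std::move(source_name)}).second;
}

// Use embedded code only on an exact hash match; otherwise fall back.
inline Mode select_mode(uint32_t program_hash)
{
    return registry().count(program_hash) != 0 ? Mode::Embedded : Mode::Interpreter;
}
```

After registering `0x17A11122`, `select_mode(0x17A11122)` yields `Mode::Embedded`, while any other hash yields `Mode::Interpreter`, which mirrors the fallback described in the README text.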

### Experimental multiprocessing

There is multiprocessing support, but it is in its early stages. It is achieved by simultaneously calling a (C/SYSV ABI) function on many machines, each with a unique CPU ID. The input data to be processed should exist beforehand. It is not well tested, and potential page table races are not well understood. That said, it passes manual testing and there is a unit test for the basic cases.
4 changes: 4 additions & 0 deletions emulator/CMakeLists.txt
@@ -16,6 +16,10 @@ option(MICRO_EMULATOR "Build the special micro emulator (custom system calls)"
set(SOURCES
src/main.cpp
)
if (EMBED_FILES)
message(STATUS "Embedding code file ${EMBED_FILES}")
set(SOURCES ${SOURCES} ${EMBED_FILES})
endif()

if (NATIVE)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=native")
4 changes: 3 additions & 1 deletion emulator/build.sh
@@ -2,6 +2,7 @@
set -e

OPTS=""
EMBED_FILES=""

function usage()
{
@@ -26,6 +27,7 @@ while [[ "$#" -gt 0 ]]; do
-b|--bintr) OPTS="$OPTS -DRISCV_BINARY_TRANSLATION=ON -DRISCV_LIBTCC=OFF" ;;
-t|--tcc ) OPTS="$OPTS -DRISCV_BINARY_TRANSLATION=ON -DRISCV_LIBTCC=ON" ;;
-x|--expr ) OPTS="$OPTS -DRISCV_EXPERIMENTAL=ON -DRISCV_ENCOMPASSING_ARENA=ON" ;;
-e|--embed) EMBED_FILES="$EMBED_FILES;$2"; shift ;;
-v|--verbose ) set -x ;;
*) echo "Unknown parameter passed: $1"; exit 1 ;;
esac
@@ -34,7 +36,7 @@ done

mkdir -p .build
pushd .build
cmake .. -DCMAKE_BUILD_TYPE=Release $OPTS
cmake .. -DCMAKE_BUILD_TYPE=Release $OPTS -DEMBED_FILES="$EMBED_FILES"
make -j6
popd

30 changes: 25 additions & 5 deletions emulator/src/main.cpp
@@ -23,6 +23,7 @@ struct Arguments {
bool sandbox = false;
bool ignore_text = false;
uint64_t fuel = UINT64_MAX;
std::string output_file;
};

#ifdef HAVE_GETOPT_LONG
@@ -41,6 +42,7 @@ static const struct option long_options[] = {
{"trace", no_argument, 0, 'T'},
{"no-translate", no_argument, 0, 'n'},
{"mingw", no_argument, 0, 'm'},
{"output", required_argument, 0, 'o'},
{"from-start", no_argument, 0, 'F'},
{"sandbox", no_argument, 0, 'S'},
{"ignore-text", no_argument, 0, 'I'},
@@ -56,13 +58,14 @@ static void print_help(const char* name)
" -a, --accurate Accurate instruction counting\n"
" -d, --debug Enable CLI debugger\n"
" -1, --single-step One instruction at a time, enabling exact exceptions\n"
" -f, --fuel Set max instructions until program halts\n"
" -f, --fuel amt Set max instructions until program halts\n"
" -g, --gdb Start GDB server on port 2159\n"
" -s, --silent Suppress program completion information\n"
" -t, --timing Enable timing information in binary translator\n"
" -T, --trace Enable tracing in binary translator\n"
" -n, --no-translate Disable binary translation\n"
" -m, --mingw Cross-compile for Windows (MinGW)\n"
" -o, --output file Output embeddable binary translated code (C99)\n"
" -F, --from-start Start debugger from the beginning (_start)\n"
" -S, --sandbox Enable strict sandbox\n"
" -I, --ignore-text Ignore .text section, and use segments only\n"
@@ -110,7 +113,7 @@ static void print_help(const char* name)
static int parse_arguments(int argc, const char** argv, Arguments& args)
{
int c;
while ((c = getopt_long(argc, (char**)argv, "hvad1f:gstTnmFSI", long_options, nullptr)) != -1)
while ((c = getopt_long(argc, (char**)argv, "hvad1f:gstTnmo:FSI", long_options, nullptr)) != -1)
{
switch (c)
{
@@ -126,6 +129,7 @@ static int parse_arguments(int argc, const char** argv, Arguments& args)
case 'T': args.trace = true; break;
case 'n': args.no_translate = true; break;
case 'm': args.mingw = true; break;
case 'o': break;
case 'F': args.from_start = true; break;
case 'S': args.sandbox = true; break;
case 'I': args.ignore_text = true; break;
@@ -147,6 +151,11 @@ static int parse_arguments(int argc, const char** argv, Arguments& args)
if (args.verbose) {
printf("Fuel set to %" PRIu64 "\n", args.fuel);
}
} else if (c == 'o') {
args.output_file = optarg;
if (args.verbose) {
printf("Output file prefix set to %s\n", args.output_file.c_str());
}
}
}
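
The `o:` added to the option string marks `-o` as taking an argument, matching `required_argument` in the long-option table. A minimal, standalone sketch of that pairing follows. `parse_output` is a hypothetical helper, not part of the emulator, and it assumes a platform that provides `getopt_long` (the emulator itself guards this with `HAVE_GETOPT_LONG`).

```cpp
#include <getopt.h>
#include <string>
#include <vector>

// Parse only -o/--output from an argv-style list.
// Returns the option's value, or an empty string if it was not given.
inline std::string parse_output(std::vector<std::string> args)
{
    static const option long_opts[] = {
        {"output", required_argument, nullptr, 'o'},
        {nullptr, 0, nullptr, 0}
    };

    // getopt_long wants a mutable, null-terminated char* array.
    std::vector<char*> argv;
    for (auto& a : args) argv.push_back(&a[0]);
    argv.push_back(nullptr);

    optind = 1; // restart scanning so the helper can be called more than once
    std::string output;
    int c;
    while ((c = getopt_long(static_cast<int>(args.size()), argv.data(),
                            "o:", long_opts, nullptr)) != -1) {
        if (c == 'o') output = optarg; // optarg points at the option's argument
    }
    return output;
}
```

Both `parse_output({"rvlinux", "-o", "test", "program.elf"})` and the `--output test` spelling return `"test"`, because the short and long forms map to the same value `'o'`.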

@@ -170,6 +179,19 @@ static void run_program(
const bool is_dynamic,
const std::vector<std::string>& args)
{
if (cli_args.mingw && (!riscv::binary_translation_enabled || riscv::libtcc_enabled)) {
fprintf(stderr, "Error: Full binary translation must be enabled for MinGW cross-compilation\n");
exit(1);
}

std::vector<riscv::MachineTranslationOptions> cc;
if (cli_args.mingw) {
cc.push_back(riscv::MachineTranslationCrossOptions{});
}
if (!cli_args.output_file.empty()) {
cc.push_back(riscv::MachineTranslationEmbeddableCodeOptions{cli_args.output_file});
}

// Create a RISC-V machine with the binary as input program
riscv::Machine<W> machine { binary, {
.memory_max = MAX_MEMORY,
@@ -185,9 +207,7 @@ static void run_program(
.translation_prefix = "translations/rvbintr-",
.translation_suffix = ".dll",
#else
.cross_compile = cli_args.mingw ?
std::vector<riscv::MachineTranslationCrossOptions>{riscv::MachineTranslationCrossOptions{}}
: std::vector<riscv::MachineTranslationCrossOptions>{},
.cross_compile = cc,
#endif
#endif
}};
19 changes: 18 additions & 1 deletion lib/libriscv/common.hpp
@@ -9,6 +9,7 @@
#endif
#include <string>
#include <string_view>
#include <variant>
#include "util/function.hpp"
#include "types.hpp"

@@ -51,6 +52,18 @@ namespace riscv
/// @example ".dll"
std::string cross_suffix = ".dll";
};
/// @brief Options for generating embeddable C99 code into a C or C++ program.
struct MachineTranslationEmbeddableCodeOptions
{
/// @brief Provide a filename prefix for the embedded code output.
/// @example "mycode-"
std::string prefix = "mycode-";

/// @brief Provide a filename suffix for the embedded code output.
/// @example ".c" or ".cpp"
std::string suffix = ".cpp";
};
using MachineTranslationOptions = std::variant<MachineTranslationCrossOptions, MachineTranslationEmbeddableCodeOptions>;
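
Because `cross_compile` now holds a `std::variant`, code that consumes the vector has to branch on the active alternative. One common way to do that is `std::visit`. The sketch below uses local stand-in structs (`CrossOptions`, `EmbeddableOptions`) rather than the real libriscv types, and `Describe` and `describe_all` are hypothetical helpers; how libriscv itself consumes the vector is not shown in this diff.

```cpp
#include <string>
#include <variant>
#include <vector>

// Local stand-ins for the two option structs in this header.
struct CrossOptions {
    std::string suffix = ".dll";
};
struct EmbeddableOptions {
    std::string prefix = "mycode-";
    std::string suffix = ".cpp";
};
using TranslationOptions = std::variant<CrossOptions, EmbeddableOptions>;

// One overload per alternative; std::visit picks the matching one.
struct Describe {
    std::string operator()(const CrossOptions& o) const
    {
        return "cross-compile to *" + o.suffix;
    }
    std::string operator()(const EmbeddableOptions& o) const
    {
        return "emit source " + o.prefix + "<hash>" + o.suffix;
    }
};

// Summarize every requested output in the order it was added.
inline std::vector<std::string> describe_all(const std::vector<TranslationOptions>& opts)
{
    std::vector<std::string> out;
    for (const auto& o : opts)
        out.push_back(std::visit(Describe{}, o));
    return out;
}
```

A vector holding `CrossOptions{}` followed by `EmbeddableOptions{"test", ".cpp"}` produces two descriptions, one per alternative, which is the same shape as the `cc` vector built in `run_program`.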

/// @brief Options passed to Machine constructor
/// @tparam W The RISC-V architecture
@@ -119,6 +132,10 @@ namespace riscv
#ifdef RISCV_BINARY_TRANSLATION
/// @brief Enable the binary translator.
bool translate_enabled = true;
/// @brief Enable loading of embedded binary translated programs.
/// @details This will allow the machine to load and execute embedded
/// binary translated programs. They auto-register themselves.
bool translate_enable_embedded = true;
/// @brief Enable compiling execute segment on-demand during emulation.
/// @details Not available on most Windows systems.
#ifdef _WIN32
@@ -157,7 +174,7 @@ namespace riscv
/// @brief Allow the production of a secondary dependency-free DLL that can be
/// transferred to and loaded on Windows (or other) machines. It will be used
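/// to greatly accelerate the emulation of the RISC-V program.
/// Entries may instead request embeddable source output (MachineTranslationEmbeddableCodeOptions).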
std::vector<MachineTranslationCrossOptions> cross_compile {};
std::vector<MachineTranslationOptions> cross_compile {};

/// @brief Produce the translation output filename from the prefix, hash and suffix.
/// @param prefix A prefix for the filename.
2 changes: 1 addition & 1 deletion lib/libriscv/cpu.cpp
@@ -63,7 +63,7 @@ namespace riscv
trigger_exception(EXECUTION_SPACE_PROTECTION_FAULT, begin);

this->m_exec = &machine().memory.create_execute_segment(
{}, vdata, begin, vlength);
machine().options(), vdata, begin, vlength);
return *this->m_exec;
} // CPU::init_execute_area
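
Passing `machine().options()` instead of `{}` means an execute segment created later sees the settings the machine was constructed with, rather than defaults. The following is a simplified sketch of that difference. `Options`, `Segment` and `MiniMachine` are hypothetical stand-in types, and the free function borrows the name `create_execute_segment` purely for readability.

```cpp
// Hypothetical stand-ins; the real MachineOptions has many more fields.
struct Options {
    bool translate_enable_embedded = true;
};

struct Segment {
    bool embedded_allowed;
};

// Simplified: the segment just records the one setting it was given.
inline Segment create_execute_segment(const Options& options)
{
    return Segment{options.translate_enable_embedded};
}

class MiniMachine {
public:
    explicit MiniMachine(const Options& options) : m_options(options) {}

    // Mirrors the accessor added in machine.hpp.
    const Options& options() const noexcept { return m_options; }

    // Passing {} here would always yield the defaults;
    // forwarding the stored options preserves the caller's settings.
    Segment init_execute_area() const { return create_execute_segment(options()); }

private:
    Options m_options; // copied, so it stays valid for the machine's lifetime
};
```

With `translate_enable_embedded` set to `false` at construction, `init_execute_area()` reports `embedded_allowed == false`, whereas `create_execute_segment({})` would report `true`.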

2 changes: 1 addition & 1 deletion lib/libriscv/decoded_exec_segment.hpp
@@ -59,7 +59,7 @@ namespace riscv
void set_crc32c_hash(uint32_t hash) { m_crc32c_hash = hash; }

#ifdef RISCV_BINARY_TRANSLATION
bool is_binary_translated() const noexcept { return m_bintr_dl != nullptr; }
bool is_binary_translated() const noexcept { return !m_translator_mappings.empty(); }
void* binary_translation_so() const { return m_bintr_dl; }
void set_binary_translated(void* dl) const { m_bintr_dl = dl; }
uint32_t translation_hash() const { return m_bintr_hash; }
6 changes: 4 additions & 2 deletions lib/libriscv/machine.cpp
@@ -23,15 +23,17 @@ namespace riscv
inline Machine<W>::Machine(std::string_view binary, const MachineOptions<W>& options)
: cpu(*this, options.cpu_id),
memory(*this, binary, options),
m_arena(nullptr)
m_arena(nullptr),
m_options(options)
{
cpu.reset();
}
template <int W>
inline Machine<W>::Machine(const Machine& other, const MachineOptions<W>& options)
: cpu(*this, options.cpu_id, other),
memory(*this, other, options),
m_arena(nullptr)
m_arena(nullptr),
m_options(options)
{
this->m_counter = other.m_counter;
this->m_max_counter = other.m_max_counter;
7 changes: 7 additions & 0 deletions lib/libriscv/machine.hpp
@@ -66,6 +66,10 @@ namespace riscv
/// @brief Tears down the machine, freeing all owned memory and pages.
~Machine();

/// @brief Returns the machine options that were used to create the machine.
/// @return The machine options.
auto& options() const noexcept { return m_options; }

/// @brief Simulate RISC-V starting from the PC register, and
/// stopping when at most @max_instructions have been executed.
/// If Throw == true, the machine will throw a
@@ -443,6 +447,9 @@ namespace riscv
std::unique_ptr<FileDescriptors> m_fds = nullptr;
std::unique_ptr<Multiprocessing<W>> m_smp = nullptr;
std::unique_ptr<Signals<W>> m_signals = nullptr;

MachineOptions<W> m_options;

static_assert((W == 4 || W == 8 || W == 16), "Must be either 32-bit, 64-bit or 128-bit ISA");
static void default_printer(const Machine&, const char*, size_t);
static long default_stdin(const Machine&, char*, size_t);
2 changes: 2 additions & 0 deletions lib/libriscv/tr_api.cpp
@@ -224,11 +224,13 @@ static inline uint64_t MUL128(
return (middle << 32) | (uint32_t)p00;
}
#ifndef EMBEDDABLE_CODE
extern VISIBLE void init(struct CallbackTable* table, char* arena)
{
api = *table;
arena_ptr = arena;
}
#endif
typedef struct {
uint64_t counter;
6 changes: 4 additions & 2 deletions lib/libriscv/tr_emit.cpp
@@ -1472,18 +1472,20 @@ void Emitter<W>::emit()
switch (vi.OPVV.funct6)
{
case 0b000000: // VFADD.VF
code += "const float " + scalar + " = " + from_fpreg(vi.OPVV.vs1) + ".f32[0];\n";
code += "{ const float " + scalar + " = " + from_fpreg(vi.OPVV.vs1) + ".f32[0];\n";
for (unsigned i = 0; i < vlen; i++) {
const std::string f32 = ".f32[" + std::to_string(i) + "]";
code += from_rvvreg(vi.OPVV.vd) + f32 + " = " + from_rvvreg(vi.OPVV.vs2) + f32 + " + " + scalar + ";\n";
}
code += "}\n";
break;
case 0b100100: // VFMUL.VF
code += "const float " + scalar + " = " + from_fpreg(vi.OPVV.vs1) + ".f32[0];\n";
code += "{ const float " + scalar + " = " + from_fpreg(vi.OPVV.vs1) + ".f32[0];\n";
for (unsigned i = 0; i < vlen; i++) {
const std::string f32 = ".f32[" + std::to_string(i) + "]";
code += from_rvvreg(vi.OPVV.vd) + f32 + " = " + from_rvvreg(vi.OPVV.vs2) + f32 + " * " + scalar + ";\n";
}
code += "}\n";
break;
default:
UNKNOWN_INSTRUCTION();
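
The hunk above wraps each emitted scalar declaration in its own `{ ... }` block. The commit does not state why. In C, declaring the same identifier twice in one scope is an error, and in C99 a declaration cannot directly follow a label, so a compound statement sidesteps both. A minimal sketch of the emission pattern, where `emit_vf_op` and the register-name strings are hypothetical and the temporary is simply called `scalar`:

```cpp
#include <string>

// Emit C code for "vd[i] = vs2[i] <op> scalar" over vlen lanes.
// The scalar temporary lives inside its own { } block, so emitting the
// same temporary name again later in one function does not redeclare it.
inline std::string emit_vf_op(const std::string& vd, const std::string& vs2,
                              const std::string& fs1, char op, unsigned vlen)
{
    std::string code = "{ const float scalar = " + fs1 + ".f32[0];\n";
    for (unsigned i = 0; i < vlen; i++) {
        const std::string f32 = ".f32[" + std::to_string(i) + "]";
        code += vd + f32 + " = " + vs2 + f32 + " " + op + " scalar;\n";
    }
    code += "}\n"; // close the scope opened on the first line
    return code;
}
```

Calling `emit_vf_op` twice in a row produces two sibling blocks, each with its own `scalar`, which a C compiler accepts where two bare declarations in the same scope would not compile.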