GSoC 2020: Webassembly backend #147

Closed
wants to merge 81 commits into from

pkel
Collaborator

@pkel pkel commented Aug 30, 2020

I worked on translating Ergo smart contracts to WebAssembly (Wasm) as part of Google Summer of Code (GSoC) 2020. You can find an overview of my project in PR #777 on the Ergo repo. TL;DR: Ergo adds a smart contract language frontend to the Q*cert compiler. I extend Q*cert with a proof-of-concept WebAssembly backend.

This PR constitutes my GSoC project submission. It's meant to be frozen on the GSoC deadline on 8/31/2020. Future edits should go to Jerome's working PR #142.

In the following, I will try to wrap up some technical aspects of my implementation.

Overview

The Wasm specification contains a reference implementation in OCaml. I implement a translation from Imp(Ejson) to the AST of this reference implementation. We also hook up the reference interpreter as a Q*cert evaluation function for testing. All my code is OCaml and not verified. Jerome did some plumbing on the Coq side.

Preliminaries

  • We add the opam dependency wasm.1.0.1.
  • We switch to separate extraction to circumvent cyclic dependencies between Coq and OCaml.

Code generation and evaluation

Code generation and evaluation are implemented in /compiler/wasm/. The main component is the functor in wasm_backend.ml, which is instantiated in wasm_ast.ml. The latter provides all functions necessary to extend Q*cert with a new Wasm_ast language target. The functorization is needed because Ergo extracts its own Imp and OCaml cannot infer that the two extracted Imp definitions are equal.

The translation functions in wasm_backend.ml rely on an incomplete intermediate representation for Wasm (wasm_ir.mli). In Wasm, functions, types, and variables are addressed by their integer index within the module. The IR replaces these integers with OCaml variables for more convenient addressing. It also provides concise constructors for Wasm AST elements.

Another convenience module is wasm_util.ml. It contains a small lookup table implementation based on a hash table, which the backend uses throughout to get from OCaml values to integer indexes.
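To illustrate the idea of such a lookup table, here is a minimal sketch. It is not the code from wasm_util.ml: the module name Index_table and its functions are hypothetical, and I assume the standard library's Hashtbl as the underlying hash table. The table hands out consecutive integer indexes in the order in which values are first requested.

```ocaml
(* Hypothetical sketch, not the actual wasm_util.ml: a table that assigns
   consecutive integer indexes to OCaml values on first use. *)
module Index_table = struct
  type 'a t = { tbl : ('a, int) Hashtbl.t; mutable next : int }

  let create () = { tbl = Hashtbl.create 16; next = 0 }

  (* Return the index of [x], allocating a fresh one if [x] is new. *)
  let index t x =
    match Hashtbl.find_opt t.tbl x with
    | Some i -> i
    | None ->
      let i = t.next in
      Hashtbl.add t.tbl x i;
      t.next <- i + 1;
      i

  (* All registered values, ordered by their index. *)
  let elements t =
    Hashtbl.fold (fun x i acc -> (i, x) :: acc) t.tbl []
    |> List.sort compare
    |> List.map snd
end

let () =
  (* The function names below are made up for the example. *)
  let funcs = Index_table.create () in
  let a = Index_table.index funcs "runtime_equal" in
  let b = Index_table.index funcs "runtime_compare" in
  let a' = Index_table.index funcs "runtime_equal" in
  Printf.printf "%d %d %d\n" a b a' (* prints "0 1 0" *)
```

Calling elements at the end of code generation yields the values in index order, which is the order a Wasm module section would need.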

In contrast to Imp's block-scoped variables, Wasm has only function-scoped variables. The translation in wasm_backend.ml does not take this into account and relies on unique variable names. We pre-process the Imp using the translation in wasm_imp_scoping.ml in order to resolve this conflict (see #145).
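The following sketch shows the general technique of making variable names unique. It is not the code from wasm_imp_scoping.ml: the miniature AST (expr, stmt) and the function make_unique are hypothetical stand-ins for Imp, and the sketch assumes that source names never already end in a suffix of the form _N.

```ocaml
module Env = Map.Make (String)

(* Hypothetical miniature of an Imp-like language with block-scoped
   variables. This is not Q*cert's Imp. *)
type expr = Var of string | Const of int | Add of expr * expr

type stmt =
  | Assign of string * expr
  | Block of string list * stmt list (* declared variables, body *)

(* Variables without a binding in [env] keep their name. *)
let lookup env x = match Env.find_opt x env with Some y -> y | None -> x

let rec rename_expr env = function
  | Var x -> Var (lookup env x)
  | Const n -> Const n
  | Add (a, b) -> Add (rename_expr env a, rename_expr env b)

(* Give every declaration a fresh name, so that nested blocks no longer
   shadow each other and all variables can share one function scope. *)
let make_unique stmt =
  let counter = ref 0 in
  let fresh x =
    incr counter;
    Printf.sprintf "%s_%d" x !counter
  in
  let rec go env = function
    | Assign (x, e) -> Assign (lookup env x, rename_expr env e)
    | Block (decls, body) ->
      let env' =
        List.fold_left (fun acc x -> Env.add x (fresh x) acc) env decls
      in
      let decls' = List.map (lookup env') decls in
      let body' = List.map (go env') body in
      Block (decls', body')
  in
  go Env.empty stmt

(* An outer x that is shadowed by an inner x. *)
let example =
  Block
    ( [ "x" ],
      [ Assign ("x", Const 1);
        Block ([ "x" ], [ Assign ("x", Const 2) ]);
        Assign ("x", Add (Var "x", Const 1)) ] )
```

In make_unique example, the outer variable becomes x_1 and the inner one x_2, so the final assignment reads and writes x_1 and is unaffected by the inner block.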

I implement a binary encoding for EJson values in wasm_binary_ejson.ml. This encoding is used for storing constants in the linear memory of the generated Wasm module. It's also used at runtime for communication between the interpreter and the runtime module (see Ergo PR #777). I briefly describe the encoding in the header of the source file.
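For readers unfamiliar with this kind of serialization, here is a sketch of a tag-prefixed binary encoding. It is an assumption made for illustration and does not reproduce the format in wasm_binary_ejson.ml: the ejson type covers only a subset of values, and the tag numbers and little-endian layout are invented. It requires OCaml 4.08 or later for the Buffer.add_uint8 family of functions.

```ocaml
(* Hypothetical subset of EJson values, for illustration only. *)
type ejson =
  | Null
  | Bool of bool
  | Number of float
  | String of string
  | Array of ejson list

(* Invented layout: one tag byte, followed by a payload.
   Numbers are 8 bytes (IEEE 754, little endian); strings and arrays
   carry a 4-byte little-endian length before their contents. *)
let rec encode buf = function
  | Null -> Buffer.add_uint8 buf 0
  | Bool false -> Buffer.add_uint8 buf 1
  | Bool true -> Buffer.add_uint8 buf 2
  | Number f ->
    Buffer.add_uint8 buf 3;
    Buffer.add_int64_le buf (Int64.bits_of_float f)
  | String s ->
    Buffer.add_uint8 buf 4;
    Buffer.add_int32_le buf (Int32.of_int (String.length s));
    Buffer.add_string buf s
  | Array l ->
    Buffer.add_uint8 buf 5;
    Buffer.add_int32_le buf (Int32.of_int (List.length l));
    List.iter (encode buf) l

let to_bytes v =
  let buf = Buffer.create 16 in
  encode buf v;
  Buffer.contents buf
```

A string produced by to_bytes could be placed in a data segment of the generated module and decoded on the runtime side by reading the tag byte first.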

Runtime

The generated Wasm module relies on a Wasm runtime for execution (see Ergo PR #777). This runtime is implemented in AssemblyScript (NPM ecosystem) and lives in /runtimes/assemblyscript/. We build the runtime with npm run asbuild:untouched and run some tests with npm test. The AssemblyScript source resides in assembly/index.ts, the compiled Wasm in build/untouched.wasm.

We add a dune rule that encodes the binary runtime build/untouched.wasm into a wasm_runtime.ml file and bundles it into wasm_backend.ml.
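The encoding step can be sketched as follows. This is not the tool from the PR: the functions escape and to_ml_binding and the binding name runtime are hypothetical. The sketch escapes every byte as a \xHH hex code, so that arbitrary binary data becomes a valid OCaml string literal.

```ocaml
(* Escape every byte of [s] as \xHH, which is always a valid escape
   sequence inside an OCaml string literal. *)
let escape (s : string) : string =
  let buf = Buffer.create (4 * String.length s) in
  String.iter
    (fun c -> Buffer.add_string buf (Printf.sprintf "\\x%02x" (Char.code c)))
    s;
  Buffer.contents buf

(* Produce the text of an OCaml definition that holds the binary data. *)
let to_ml_binding ~name (s : string) : string =
  Printf.sprintf "let %s = \"%s\"\n" name (escape s)

let () =
  (* The first eight bytes of any Wasm binary: magic number and version 1. *)
  print_string (to_ml_binding ~name:"runtime" "\000asm\001\000\000\000")
  (* prints: let runtime = "\x00\x61\x73\x6d\x01\x00\x00\x00" *)
```

Escaping all bytes uniformly makes the generated file larger than necessary, but avoids any special cases for quotes, backslashes, and non-printable characters.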

Tests

We provide some rudimentary tests under /tests/imp_ejson/ which can be run all at once using the command make -C tests wasm-imp-spec. The tests first evaluate pieces of Imp(Ejson) on a given input in order to produce the expected output. Then the Imp code gets translated to Wasm and executed with the same input on the Wasm reference interpreter. A test passes if the resulting output equals the expected output.

Issues

jeromesimeon and others added 30 commits August 13, 2020 15:00
I'm struggling with RuntimeOpRecDot.

make -C tests wasm

I need debugger access to the generated wasm and provided ejson argument.
This PoC demonstrates how Assemblyscript could be used to implement a
runtime for imp_ejson. The next step on this road would be to implement
a pair of functions that read/write JS Ejson from/to Assemblyscript's
managed memory.
This commit adds a pair of functions that read/write JS Ejson from/to
Assemblyscript's managed memory. Executing an imp function compiled to
wasm is now easy.

Next steps:
- Test the set of implemented operators.
- Hook the new runtime into the compiler pipeline.
The runtime makes it possible to run qcert queries compiled to wasm
modules in Chrome. DevTools enable stepwise debugging. This predates the recent
effort to implement the IMP operators in Assemblyscript.
This commit removes the Imp(Wasm) runtime operators that have been
implemented before. Instead we call imported functions. These functions
are defined in an Assemblyscript module.

As a side effect, we lost support for constants and
EjsonRuntimeOperator(s).

Next steps:
- Unit test the set of implemented operators.
- Provide engine that links the compiled module with the runtime.
- Compile constants
- Support EjsonRuntimeOperator(s).
This commit adds a PoC engine that executes a qcert query compiled to
wasm on NodeJS. It dynamically links the IMP runtime that we implement
in AssemblyScript and compile to a separate wasm module.
We can compile the following OQL queries:
3.14
not (true or false)
3.14 <. 4.5
pi
pi <. e
greet

And execute them on input:
{ "pi" : 3.14, "e" : 2.72, "greet" : "Hello World!" }
Before, each use of an IMP constant led to a fresh allocation in the
AssemblyScript runtime. Additionally, strings were entirely encoded in
the AST (as functions that allocate and initialize the corresponding
string in the runtime).

Now, constants are serialized into the linear memory of the compiled
module. On first use, a corresponding value is allocated in the memory
of the AssemblyScript runtime. Repeated use of a constant uses the same
value on the runtime side.
We now use the AssemblyScript runtime. The old runtime is not needed
anymore.
- Make explicit that we transfer the bytes of a UTF-8 string.
- Allocate a single, correctly sized buffer on the runtime side.
Avoids headache on Javascript/Wasm interface.
pkel and others added 22 commits August 18, 2020 11:29
- fixes incompatibility with node
  ("invalid block type error" when loading compiled wasm module)
- removes dependencies on wat2wasm/wasm2wat
Before, if-then-else became if-then-then

Signed-off-by: Patrik Keller <[email protected]>
Q*cert's `EJsonRuntimeCompare a b` returns sign(b - a).
Before this fix the Wasm implementation of this operator returned
sign(a - b).

Signed-off-by: Patrik Keller <[email protected]>
Signed-off-by: Jerome Simeon <[email protected]>
Signed-off-by: Jerome Simeon <[email protected]>
- derive the expected output from Imp eval.
- avoid bash for loop for iteration
- add a small binary (tools/binary_to_string.(ml|exe)) that escapes a
  binary string to a valid OCaml string using hex codes.
- add a dune rule that reads binary runtime.wasm into a generated ml
  file.
- use the resulting OCaml value for the wasm spec eval.
@pkel pkel marked this pull request as ready for review August 31, 2020 16:33
@jeromesimeon jeromesimeon mentioned this pull request Sep 1, 2020
@pkel pkel mentioned this pull request Feb 18, 2021
@pkel
Collaborator Author

pkel commented Feb 18, 2021

Some other pages link to this PR as my GSoC project report. Let's keep it as is. We can track progress in #153.

@pkel pkel closed this Feb 18, 2021
@jeromesimeon jeromesimeon deleted the gsoc2020-wasm branch May 19, 2022 21:16
2 participants