This repository has been archived by the owner on May 11, 2020. It is now read-only.

wat: lisp-y text format for wasm #34

Open
sbinet opened this issue Sep 14, 2017 · 11 comments

@sbinet
Contributor

sbinet commented Sep 14, 2017

To make it easier to share the spec tests with WebAssembly, we should be able to parse the .wast files.

There are already a few Go-based lisp-y parsers/lexers:

@sbinet sbinet self-assigned this Sep 14, 2017
@Spriithy
Contributor

Spriithy commented Nov 1, 2017

Hello there,

I was wondering if you would like some help with this particular issue?
I don't have much experience with lisp-y languages, but I do have some with lexers / parsers.

I would be glad to contribute to this project in my free time.

@sbinet
Contributor Author

sbinet commented Nov 1, 2017

hi @Spriithy

yes. definitely yes!

@Spriithy
Contributor

Spriithy commented Nov 1, 2017

Thanks @sbinet for your consideration,

After looking at the file format specification and based on the knowledge that .wast files are lisp syntax based, should we implement an in-house lexer/parser to generate some pre-defined IR ?

From the above references, I think Schego and Zygomys look quite robust.

Anyways, what are the specifications for the wast format interaction ? Would it be meant for:

  • Execution
  • Sanitization
  • Verification
  • Optimization

We should first better define the needs.

@sbinet
Contributor Author

sbinet commented Nov 2, 2017

I am not super familiar with the design space of lexers/parsers, so I can't answer your question about the generation of a pre-defined IR.

The first and immediate value for a wat (or wast) parser is to be able to reduce the scaffolding in the tests of the interpreter.

the official wasm tests contain wast files that describe the content of wasm modules.
these modules are then fed to the interpreter, and the results are checked against the expectations that are also described in the wast files.

I'd like to closely reproduce this modus operandi.

I think this involves being able to parse a wast file, detect the parts that describe a wasm module, create a proper wasm.Module out of that and then let the exec package loose.

I am not yet sure whether directly creating a wasm.Module from a wast file is better than first creating a wasm file from the wast one and then parsing the wasm file to create a wasm.Module.
(I guess the second solution is more modular but it involves more moving pieces...)

what do you think?

@Spriithy
Contributor

Spriithy commented Nov 2, 2017

As stated in the official documentation, the .wast format is a superset of .wat.

Note: The .wast format understood by some of the listed tools is a superset of the .wat format that is intended for writing test scripts.

Anyways, after comparing a handful of wat and wast files, it seems to me that creating a wasm file from a wast one and then parsing the resulting file to create a wasm.Module is almost (I cannot emphasize this word enough 😄) doing the same work twice.

To give you an idea, wast files are to be seen as some sort of test files, where each top-level element is a test unit/case asserting one particular behavior (see the i64 test suite for instance). As you can see, the file contains a module, exporting several functions that are then tested.

It's like a combination of the source file and the test file, if you want.

It seems to me that parsing a wast file should generate []wasm.Module containing all the declared modules along with a test module wrapping all the test cases. The latter would include the former as needed and sequentially run the statements (after indexing possible declarations).
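As a rough illustration of the first step described above (finding the top-level elements of a wast script and telling module declarations apart from test statements), here is a minimal sketch. The names `TopLevelForm`, `SplitTopLevel` and `headOf` are hypothetical and not part of wagon; the sketch only handles `;;` line comments and string literals, not `(; ... ;)` block comments, and it does not build a `wasm.Module`.

```go
package main

import (
	"fmt"
	"strings"
)

// TopLevelForm is a hypothetical record for one top-level s-expression
// of a .wast script, e.g. (module ...) or (assert_return ...).
type TopLevelForm struct {
	Head string // first atom after the opening paren
	Text string // full source text of the form, parens included
}

// headOf returns the first atom of a form that starts with '('.
func headOf(form string) string {
	rest := strings.TrimSpace(form[1:])
	end := strings.IndexAny(rest, " \t\r\n()")
	if end < 0 {
		return rest
	}
	return rest[:end]
}

// SplitTopLevel splits src into its top-level parenthesized forms.
// It skips ";;" line comments and ignores parens inside string literals.
// Assumption: "(; ... ;)" block comments are not handled in this sketch.
func SplitTopLevel(src string) ([]TopLevelForm, error) {
	var forms []TopLevelForm
	depth, start := 0, -1
	for i := 0; i < len(src); i++ {
		switch src[i] {
		case ';':
			if i+1 < len(src) && src[i+1] == ';' {
				// Line comment: skip to the end of the line.
				for i < len(src) && src[i] != '\n' {
					i++
				}
			}
		case '"':
			// String literal: skip to the closing quote, honoring escapes.
			i++
			for i < len(src) && src[i] != '"' {
				if src[i] == '\\' {
					i++
				}
				i++
			}
			if i >= len(src) {
				return nil, fmt.Errorf("unterminated string literal")
			}
		case '(':
			if depth == 0 {
				start = i
			}
			depth++
		case ')':
			depth--
			if depth < 0 {
				return nil, fmt.Errorf("unbalanced ')' at offset %d", i)
			}
			if depth == 0 {
				text := src[start : i+1]
				forms = append(forms, TopLevelForm{Head: headOf(text), Text: text})
			}
		}
	}
	if depth != 0 {
		return nil, fmt.Errorf("unbalanced '(': %d form(s) left open", depth)
	}
	return forms, nil
}

func main() {
	src := `
;; a tiny script in the style of the spec tests
(module (func (export "add") (param i32 i32) (result i32)
  (i32.add (local.get 0) (local.get 1))))
(assert_return (invoke "add" (i32.const 1) (i32.const 2)) (i32.const 3))
`
	forms, err := SplitTopLevel(src)
	if err != nil {
		panic(err)
	}
	for _, f := range forms {
		fmt.Println(f.Head) // "module" forms declare code, "assert_*" forms test it
	}
}
```

A real implementation would presumably hand each `module` form to a proper parser and keep the `assert_*` forms as test cases to run against the most recently declared module.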

@Spriithy
Contributor

Spriithy commented Nov 2, 2017

Obviously, all the test suites from the official WebAssembly repositories are given as standalone wast files. I would like to add that parsing such files, generating the code from them and then executing it on-the-fly seems to be the right approach to me.

Nevertheless, it would be interesting to have some external feedback / suggestions.

@Spriithy
Contributor

Hey there,

I allow myself to bump this issue to notify those interested that I am currently working on it.
I have decided to go for an in-house lexer/parser, because the wast file format has very specific requirements.

I plan to create a reviewable pull request that is not meant for immediate merge, but rather to show the progress of the implementation and offer an early look at my work.

@sbinet
Contributor Author

sbinet commented Dec 18, 2017

great!

abourget added a commit to eoscanada/eos-bios that referenced this issue Mar 17, 2018
will not convert from `.wast` to `.wasm`.

JS and C/C++ use the toolchains in their language because they exist,
in Go we have wagon and plans to implement this here:
go-interpreter/wagon#34

The hash we check, in any case, would be the hash of the `.wasm`, and that
is what is checked on the blockchain when you `get code`.. it's the
end reference for the code.
@andreimatei

Just wondering, without knowing what I'm talking about: instead of building an assembler, have you considered at all using wabt through CGo? If you want to keep wagon buildable through go build, I think wabt's build system can't be used and instead one has to list all the .c files in CGo directives. It's ugly but probably doable - CockroachDB used to build large C++ dependencies that way.

@andreimatei

Btw, another option for using wabt as a dependency is to ask users of wagon to install and build wabt themselves; wagon would then link against it through a directive like

// #cgo LDFLAGS: -lwabt

This directive could be put in a file gated behind a build flag, if the integration is designed in such a way that it is optional.

@sbinet
Contributor Author

sbinet commented Jun 11, 2018

well, the whole point of wagon is to be pure-Go.
I wouldn't be against having another go-interpreter/repo with either a CGo package that links against wabt or one that shells out to one of its commands.

but I'd like to keep wagon free of any cgo business :)
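For reference, the "shells out to one of its commands" variant could look roughly like the sketch below, which stays pure Go and only needs wabt's `wat2wasm` binary on the `PATH` at run time. The function name `WatToWasm` and the file names are hypothetical; the only wabt detail assumed is the `wat2wasm input.wat -o output.wasm` invocation.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// WatToWasm is a hypothetical helper that runs a wat2wasm-compatible
// binary to assemble watPath into outPath. No cgo is involved.
func WatToWasm(bin, watPath, outPath string) error {
	path, err := exec.LookPath(bin)
	if err != nil {
		return fmt.Errorf("%s not available: %w", bin, err)
	}
	out, err := exec.Command(path, watPath, "-o", outPath).CombinedOutput()
	if err != nil {
		return fmt.Errorf("%s failed: %v: %s", bin, err, out)
	}
	return nil
}

func main() {
	in, out := "test.wat", "test.wasm" // hypothetical file names
	if len(os.Args) == 3 {
		in, out = os.Args[1], os.Args[2]
	}
	if err := WatToWasm("wat2wasm", in, out); err != nil {
		fmt.Println("conversion skipped:", err)
	}
}
```

The trade-off is an external tool dependency at run time rather than at build time, which is why it would fit better in a separate repository than in wagon itself.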

sbinet pushed a commit that referenced this issue Feb 27, 2020
 - `tokens.go`: Defines the Token type generated by the scanner and several utility functions around it
 - `test.wast`: A .wast test file extracted from [wasm's spec tests](https://github.com/WebAssembly/spec/blob/master/test/core/names.wast)
 - `scanner.go`: Implements the actual scanner and utility functions regarding runes
 - `scanner_test.go`: A dead simple test case against the `test.wast` file. It doesn't verify correctness yet.
 - Added several new token kind constants to list different typed instructions such as `i64.reinterpret/f64` or `i32.add`
 - Implemented the `(*Scanner).scanTypedReserved()` method to identify well and ill-formed typed instructions
 - Better Scanner test interface, the test suite now takes a `-wast-file` flag that must point to the `.wast` file to Scan and test against.

In order to test:
 - Create (or use a [`spec/test/core`](https://github.com/WebAssembly/spec/tree/master/test/core) ) wast file

Updates #34.
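
To illustrate what telling well-formed from ill-formed typed instructions might involve, here is a small sketch. It is not the code from the commit above: `TypedInstr` and `ParseTyped` are hypothetical names, and the check covers only the shape of the token (a value-type prefix, an operation name and an optional `/type` suffix), not whether the operation actually exists.

```go
package main

import (
	"fmt"
	"strings"
)

// valueTypes lists the four WebAssembly 1.0 value types.
var valueTypes = map[string]bool{"i32": true, "i64": true, "f32": true, "f64": true}

// TypedInstr is a hypothetical decomposition of a typed instruction token.
type TypedInstr struct {
	Type string // "i32", "i64", "f32" or "f64"
	Op   string // e.g. "add" or "reinterpret"
	From string // source type of a conversion, e.g. "f64"; empty otherwise
}

// ParseTyped splits tokens such as "i32.add" or "i64.reinterpret/f64".
// It reports false for ill-formed tokens. Assumption: only the shape is
// validated; the operation name is not checked against an opcode table.
func ParseTyped(tok string) (TypedInstr, bool) {
	typ, rest, found := strings.Cut(tok, ".")
	if !found || !valueTypes[typ] {
		return TypedInstr{}, false
	}
	op, from, hasFrom := strings.Cut(rest, "/")
	if op == "" || (hasFrom && !valueTypes[from]) {
		return TypedInstr{}, false
	}
	return TypedInstr{Type: typ, Op: op, From: from}, true
}

func main() {
	for _, tok := range []string{"i32.add", "i64.reinterpret/f64", "i16.add", "i32."} {
		instr, ok := ParseTyped(tok)
		fmt.Printf("%-20s ok=%-5v %+v\n", tok, ok, instr)
	}
}
```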