Use cargo build to build the entire crate #1

Merged
merged 5 commits into from
May 30, 2024

Contributor

@surma surma commented May 30, 2024

This is amazing work, @rafaelbeckel.

Thanks for the heartwarming shoutout! It makes me super happy to know that my blog post was of use! However, I was surprised at the approach of using LLVM IR, as I am not sure you can always rely on LLVM IR emitted by different compilers linking together seamlessly.

Instead, object files are somewhat standardized, so I wanted to have a play and see if I can get that to work. I also found a couple of minor things along the way that I wanted to give back to you.

Of course, feel free to cherry-pick, ignore or discard anything and everything :) This is your exploration after all.

I’ll leave some comments in the PR itself for explanation.

build_rust.sh Outdated
@@ -1,6 +1,13 @@
-RUSTFLAGS="--cfg=web_sys_unstable_apis --Z wasm_c_abi=spec" cargo rustc --target=wasm32-unknown-unknown --release -- --emit=llvm-ir
+export RUSTFLAGS="--cfg=web_sys_unstable_apis --Z wasm_c_abi=spec"
+rustc +nightly \

Contributor Author

@surma surma May 30, 2024

This is probably the biggest change in this PR.

cargo rustc injects some default flags into the rustc invocation, but in the end you were grabbing the .ll file anyway. So while the full crate was still being built (I think), you did not use the product of that work, and most of those extra flags have no effect on the per-file artifacts.

Instead, I’m invoking rustc directly here (with +nightly to make sure the right toolchain is used), which allows the same kind of invocation pattern as in build_c.sh.

Contributor Author

.o files are actually normal wasm files (you can use wasm-objdump) with some extra custom sections for the relocation info.

I was hoping I could build the entire crate/library this way, but sadly I couldn't convince cargo +nightly build to spit out a wasm file that wasm-ld would accept.

The upside of that would be that you are much closer to normal rust development, where your options in Cargo.toml are being respected etc.
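
To illustrate the point above that .o files are ordinary Wasm binaries with extra custom sections, here is a minimal sketch (not part of this PR) that walks the section list of a Wasm binary and collects the names of its custom sections. The function names `read_u32_leb` and `custom_section_names` are hypothetical helpers invented for this example. It assumes only the standard binary layout: an 8-byte header, then sections encoded as an id byte, a LEB128 size, and the content, where custom sections have id 0 and start with a length-prefixed name.

```rust
/// Decode an unsigned LEB128 u32 at `*pos`, advancing `*pos` past it.
/// Returns None on truncated or over-long input.
fn read_u32_leb(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut result: u32 = 0;
    let mut shift = 0;
    loop {
        if shift >= 32 {
            return None; // more than 5 bytes cannot encode a u32
        }
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        result |= ((byte & 0x7f) as u32) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

/// Hypothetical helper: list the names of all custom sections (id 0)
/// in a Wasm binary, in file order. Returns None if the input is not
/// a well-formed sequence of sections.
pub fn custom_section_names(wasm: &[u8]) -> Option<Vec<String>> {
    // Header: magic "\0asm" followed by a 4-byte version.
    if wasm.len() < 8 || !wasm.starts_with(b"\0asm") {
        return None;
    }
    let mut pos = 8;
    let mut names = Vec::new();
    while pos < wasm.len() {
        let id = wasm[pos];
        pos += 1;
        let size = read_u32_leb(wasm, &mut pos)? as usize;
        let end = pos.checked_add(size)?;
        if end > wasm.len() {
            return None;
        }
        if id == 0 {
            // Custom section content begins with a length-prefixed name.
            let mut p = pos;
            let name_len = read_u32_leb(wasm, &mut p)? as usize;
            let name_end = p.checked_add(name_len)?;
            if name_end > end {
                return None;
            }
            names.push(String::from_utf8(wasm[p..name_end].to_vec()).ok()?);
        }
        pos = end; // skip the rest of the section
    }
    Some(names)
}
```

Feeding it the bytes of an object file (for example via `std::fs::read` on whatever .o the build produced) should list sections such as `linking` and `reloc.*`, assuming the toolchain follows the usual Wasm object-file conventions; wasm-objdump remains the more complete tool for real inspection.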

Contributor Author

@surma surma May 30, 2024

Okay lol I just played around and found a way to do this. Ignore this PR comment thread (although I’ll leave unresolved for context), and I’ll push another commit.

@@ -8,7 +8,7 @@ <h1>Check console log</h1>
(async () => {
// Look ma, Rust and C in the same WASM binary!
const { instance } = await WebAssembly.instantiateStreaming(
-fetch("advanced_maths.wasm")
+fetch("/advanced_maths.wasm")

Contributor Author

I started the web server in the repo root, which means this command would try and find the .wasm file in the wasm folder, where it is not located. This fixes that.

@surma surma changed the title Minor fixes & suggestions Use cargo build to build the entire crate May 30, 2024

Owner

@rafaelbeckel rafaelbeckel left a comment

Thank you very much for your generosity and contribution!

It's good to know the .ll files are not strictly necessary, it greatly simplifies the build!

This will come in handy because I plan to introduce more complex scenarios, e.g., bringing in wasm-bindgen, importing external C libs, integrating the build with cmake, etc.

This LLVM IR idea was kind of an "A-HA" moment for me while reading the article for the second time. I first read it last year, but that IR part had gone mostly unnoticed. Last week I reread it in the middle of a weeks-long quest to map the state of the Rust ecosystem for bundling C/C++ with wasm-bindgen, which only supports the wasm32-unknown-unknown target. When I reached that IR part, my first thought was "wait a minute, what if I do that for both? This HAS to work!".

The result was those .sh files that I'd never use in production, but they served to prove the concept.

@rafaelbeckel rafaelbeckel merged commit 264f4b5 into rafaelbeckel:master May 30, 2024
@surma
Contributor Author

surma commented May 30, 2024

Oh I hope I didn’t come across like I was judging. The fact that I could go in and explore stuff was great. The shell scripts worked a treat :D

Really excited to see where you take this next. It’s gonna be interesting if there is a way to get a libc linked in and use Rust’s allocator to back malloc et al.

@curiousdannii

curiousdannii commented May 31, 2024

I don't know if C can use Rust's allocator (it almost certainly can, but how easily is the question), but if you compile to a staticlib then Rust will use the system allocator which makes it very easy to share things. I'm using Emscripten rather than wasm-bindgen, but it means I can even allocate a struct/array from JS, and then give the pointer to Rust which will take ownership of it and free it when finished. The JS side is very clean because I don't have to manually free it at any point.
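
The ownership hand-off described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Sample` struct and the functions `sample_new` and `sample_consume` are hypothetical names, and a Rust-side constructor stands in for the JS-side allocation so the example runs anywhere. Reclaiming a pointer with `Box::from_raw` is only sound when the memory came from the same allocator Rust uses for `Box` and has the layout of the pointed-to type, which is the property the staticlib plus system allocator setup is said to provide.

```rust
/// Hypothetical C-compatible struct shared across the FFI boundary.
#[repr(C)]
pub struct Sample {
    pub id: u32,
    pub value: f64,
}

/// Stand-in for the foreign side: allocates a `Sample` on the heap and
/// gives up ownership by returning a raw pointer.
pub extern "C" fn sample_new(id: u32, value: f64) -> *mut Sample {
    Box::into_raw(Box::new(Sample { id, value }))
}

/// Takes ownership of `ptr`, uses the data, and frees the allocation.
/// The caller must not touch `ptr` afterwards.
///
/// # Safety
/// `ptr` must be null or a pointer to a live `Sample` allocated by the
/// same allocator that backs `Box` (an assumption of this sketch).
pub unsafe extern "C" fn sample_consume(ptr: *mut Sample) -> f64 {
    if ptr.is_null() {
        return 0.0;
    }
    // Rebuild the Box so Rust's normal drop logic frees the memory.
    let owned = unsafe { Box::from_raw(ptr) };
    let result = owned.value * 2.0;
    drop(owned); // explicit for clarity; falling out of scope would also free it
    result
}
```

In a real build the two functions would additionally need to be exported under stable symbol names; that attribute is omitted here to keep the sketch independent of the Rust edition in use.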

@rafaelbeckel
Owner

> Oh I hope I didn’t come across like I was judging. The fact that I could go in and explore stuff was great. The shell scripts worked a treat :D

Not at all; I'm glad for your contribution! I do use similar shell scripts in production; I was referring to some sloppy stuff I threw in (e.g., the cd wasm and cd .. thing).

> Really excited to see where you take this next. It’s gonna be interesting if there is a way to get a libc linked in and use Rust’s allocator to back malloc et al.

In the wasi folder, there's a tiny example of injecting memory from JS and a shim for system calls. It was actually the first experiment in this repository before I had the LLVM IR idea from the article and moved things to different wasm and wasi folders. I like the idea of providing malloc and free from Rust and could explore this next.
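
As a rough idea of what providing malloc and free from Rust could look like, here is a minimal sketch, not code from this repository. The names `rust_malloc` and `rust_free` and the constants `HEADER` and `ALIGN` are hypothetical. C's `free` receives no size, while Rust's `dealloc` needs the original layout, so the sketch stores the total size in a small header in front of each block. It assumes a fixed 16-byte alignment is enough for the C code and uses the `std::alloc` functions, which route to whatever global allocator the crate is built with.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr;

/// Bytes reserved in front of every block to remember its total size.
/// 16 keeps the returned pointer 16-byte aligned (assumed sufficient).
const HEADER: usize = 16;
const ALIGN: usize = 16;

/// malloc-like entry point backed by Rust's global allocator.
/// Returns null on overflow or allocation failure.
///
/// # Safety
/// The returned pointer must only be released with `rust_free`.
pub unsafe extern "C" fn rust_malloc(size: usize) -> *mut u8 {
    let total = match size.checked_add(HEADER) {
        Some(t) => t,
        None => return ptr::null_mut(),
    };
    let layout = match Layout::from_size_align(total, ALIGN) {
        Ok(l) => l,
        Err(_) => return ptr::null_mut(),
    };
    unsafe {
        let base = alloc(layout);
        if base.is_null() {
            return base;
        }
        // Record the total size so `rust_free` can rebuild the layout.
        (base as *mut usize).write(total);
        base.add(HEADER)
    }
}

/// free-like entry point; accepts null like C's free.
///
/// # Safety
/// `ptr` must be null or a pointer previously returned by `rust_malloc`
/// that has not been freed yet.
pub unsafe extern "C" fn rust_free(ptr: *mut u8) {
    if ptr.is_null() {
        return;
    }
    unsafe {
        let base = ptr.sub(HEADER);
        let total = (base as *const usize).read();
        dealloc(base, Layout::from_size_align_unchecked(total, ALIGN));
    }
}
```

To stand in for the C library's allocator in a wasm32-unknown-unknown build, these would presumably have to be exported under the symbol names `malloc` and `free` (plus `calloc` and `realloc`, which are not covered here). The sketch keeps distinct names so it can also be compiled and exercised natively without clashing with the platform libc.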

Perhaps I could transform this project into a workspace with many crates to test different build strategies for each. My main interest at this moment is integrating it with Cmake because I have legacy external C/C++ dependencies that use it, but afaik Cmake doesn't support the unknown target. In my production project, I must avoid emscripten or wasi to keep the binary as small as possible.

@rafaelbeckel
Owner

rafaelbeckel commented May 31, 2024

> I don't know if C can use Rust's allocator (it almost certainly can, but how easily is the question), but if you compile to a staticlib then Rust will use the system allocator which makes it very easy to share things. I'm using Emscripten rather than wasm-bindgen, but it means I can even allocate a struct/array from JS, and then give the pointer to Rust which will take ownership of it and free it when finished. The JS side is very clean because I don't have to manually free it at any point.

wasm-bindgen and js-sys make it very easy to do this sort of thing, but the limitation is that wasm-bindgen only builds for wasm32-unknown-unknown. The internal design part of their docs describes how they allocate and pass objects from JS to Rust and from Rust to JS. One can, of course, do it manually for Emscripten.

I'm curious whether we can somehow link wasm32-unknown-unknown and wasm32-unknown-emscripten together now that the ABIs are compatible. In theory, they should be linkable.

@curiousdannii

Maybe, but I'm not sure why you'd want to, just use the Emscripten target from the beginning.

Unless you mean you want to use both wasm-bindgen and Emscripten. I don't know if that would be possible, wouldn't they both want to control the JS code?

@surma
Contributor Author

surma commented Jun 1, 2024

@rafaelbeckel Cmake just invokes a compiler, so as long as the compiler supports Wasm, so does Cmake. You can probably steal a lot from how the WASI SDK configures Cmake to build for Wasi to get there.

@rafaelbeckel
Owner

> Maybe, but I'm not sure why you'd want to, just use the Emscripten target from the beginning.
>
> Unless you mean you want to use both wasm-bindgen and Emscripten. I don't know if that would be possible, wouldn't they both want to control the JS code?

Yes, wasm-bindgen and emscripten together wouldn't make sense because both provide glue code, but the unknown target is just raw instructions. I didn't think about any practical applications, I was just curious if that would work (i.e. an emscripten project importing a module compiled to the unknown target).

@rafaelbeckel
Owner

rafaelbeckel commented Jun 2, 2024

> @rafaelbeckel Cmake just invokes a compiler, so as long as the compiler supports Wasm, so does Cmake. You can probably steal a lot from how the WASI SDK configures Cmake to build for Wasi to get there.

WASI SDK provides a shim for the wasi target. CMake recognizes the platform (third part of the triplet) for wasi and emscripten but does not work for the unknown target (there's a ticket in the CMake repository about that; I'll find it and link it here).

The issue I get while trying to compile a Rust project for the unknown target with the cmake crate (which invokes cmake, which in turn invokes the compiler) is exactly this one: Can't build a WASM target.

# (... a bunch of output stripped for brevity) 

The C compiler

      "/opt/homebrew/opt/llvm/bin/clang"

    is not able to compile a simple test program.

    It fails with the following output:

# (...) more stripped output

    wasm-ld: error: unknown argument: -search_paths_first
    wasm-ld: error: unknown argument: -headerpad_max_install_names
    wasm-ld: error: cannot open crt1.o: No such file or directory
    wasm-ld: error: unable to find library -lc

I didn't figure out a minimal solution yet. Maybe I need to provide my own shim.

As far as I know, crt1.o is responsible for loading/unloading the program and calling the main() function. I believe wasi-sdk provides it. I have to figure out if a similar solution exists for unknown.
