nvidia containers support #1602

Open
edgarriba opened this issue Jan 3, 2025 · 12 comments

Comments

@edgarriba

Is there an example, or has anyone tried cross with any of the NVIDIA L4T containers? https://catalog.ngc.nvidia.com/orgs/nvidia/containers/l4t-base

I made a couple of attempts to use an NVIDIA image as the base, but it seems to require too much of the configuration that happens in this repo.

Any help would be much appreciated, as Rust and CUDA builds are becoming common :)

@Emilgardis
Member

not sure I understand this issue. What's the end goal?

@Emilgardis
Member

do these containers contain a cross-compiler (x86_64 -> cuda?) for rust to use?

@edgarriba
Author

the end goal is to cross compile libraries that have to be linked against cuda dependencies (which are usually a pain to install; that's why L4T and related containers are so useful, as they come with all the goodies for cuda). One example would be this crate: https://github.com/mstallmo/tensorrt-rs

As I mentioned above, I tried to take an l4t-base image and install cargo myself, since by default those containers do not ship cargo.

@Emilgardis
Member

for cross, there is no need to have cargo in the container. Cross provides cargo from the host to the running container by mounting it. Alright, if it's just libraries, maybe just copying them into a suitable place could work?

@edgarriba
Author

when I try to cross build from an nvidia image nvcr.io/nvidia/l4t-base:r35.2.1

I get the following error:

WARNING: The requested image's platform (linux/arm64) does not match the detected host platform (linux/amd64/v4) and no specific platform was requested
/usr/bin/sh: 1: cargo: not found
==> ERROR: Build failed

@Emilgardis
Member

Emilgardis commented Jan 4, 2025

That just means that something funky is going on, not necessarily that cargo wasn't found, but probably that a lib dependency was not found, most likely libc.

I'm a bit confused about the ==> ERROR: Build failed in your snippet. Can you share the entire snippet, together with the output of running cross with -v for verbose output?
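
One way to test the "missing lib dependency" hypothesis, as a rough sketch: a dynamically linked binary records the path of the loader it needs, and if that loader is absent in the container the shell can report the binary itself as "not found". The helper names below (`parse_interp`, `loader_status`, `check_binary`) are hypothetical and not part of cross, and the sketch assumes `readelf` from binutils is installed in the container.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of cross.

# Extract the requested ELF interpreter (dynamic loader) from
# `readelf -l` output supplied on stdin.
parse_interp() {
    sed -n 's/.*Requesting program interpreter: \(.*\)]/\1/p'
}

# Report whether a given loader path exists in the current filesystem.
loader_status() {
    if [ -e "$1" ]; then
        echo "loader present: $1"
    else
        echo "loader missing: $1"
    fi
}

# Inspect a binary and check for the loader it asks for.
# Assumption: readelf (binutils) is available where this runs.
check_binary() {
    interp=$(readelf -l "$1" 2>/dev/null | parse_interp)
    if [ -z "$interp" ]; then
        echo "no interpreter recorded (static binary, or not an ELF file)"
    else
        loader_status "$interp"
    fi
}
```

Inside the container this could be called as `check_binary "$CROSS_RUST_SYSROOT/bin/cargo"`. A "loader missing" result would be consistent with the libc guess above, though it does not prove it.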

@edgarriba
Author

the script I'm running is here: https://github.com/kornia/bubbaloop/blob/main/scripts/cross_deploy.sh

by passing the verbose flag I get the following warning

WARNING: The requested image's platform (linux/arm64) does not match the detected host platform (linux/amd64/v4) and no specific platform was requested

@Emilgardis
Member

that's because you're using cross 0.2.5, you need to use cross from the main branch to fix that or use CROSS_BUILD_FLAGS. See #1214 (comment)
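
For context on what the warning compares, here is a small sketch that puts the host and image platforms side by side. The function names are hypothetical, and the mapping ignores CPU variants such as the `/v4` suffix shown in the warning.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical.

# Map a `uname -m` machine name to the os/arch form Docker prints.
machine_to_platform() {
    case "$1" in
        x86_64|amd64)  echo "linux/amd64" ;;
        aarch64|arm64) echo "linux/arm64" ;;
        *)             echo "linux/$1" ;;
    esac
}

# Compare the two platforms: $1 is the host, $2 is the image.
platform_check() {
    if [ "$1" = "$2" ]; then
        echo "native: $2"
    else
        echo "emulated: image is $2, host is $1"
    fi
}
```

On a machine with Docker available, the image side could be read with `docker image inspect --format '{{.Os}}/{{.Architecture}}' nvcr.io/nvidia/l4t-base:r35.2.1` and the host side with `machine_to_platform "$(uname -m)"`.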

@edgarriba
Author

updated to cross main and not much difference

+ /usr/bin/docker run --userns host -e 'XARGO_HOME=/home/edgar/.xargo' -e 'CARGO_HOME=/home/edgar/.cargo' -e 'CROSS_RUST_SYSROOT=/home/edgar/.rustup/toolchains/stable-x86_64-unknown-linux-gnu' -e 'CARGO_TARGET_DIR=/target' -e 'CROSS_RUNNER=' -e CROSS_CONTAINER_OPTS -e TERM -e CROSS_BUILD_OPTS -e 'USER=edgar' --platform linux/amd64 -e 'CROSS_RUSTC_MAJOR_VERSION=1' -e 'CROSS_RUSTC_MINOR_VERSION=83' -e 'CROSS_RUSTC_PATCH_VERSION=0' --name cross-stable-x86_64-unknown-linux-gnu-f7aad-90b35a623-aarch64-unknown-linux-gnu-b4932-1736286916293 --rm --user 1001:1001 -v /home/edgar/.xargo:/home/edgar/.xargo:z -v /home/edgar/.cargo:/home/edgar/.cargo:z -v /home/edgar/.cargo/bin -v /home/edgar/software/bubbaloop:/home/edgar/software/bubbaloop:z -v /home/edgar/.rustup/toolchains/stable-x86_64-unknown-linux-gnu:/home/edgar/.rustup/toolchains/stable-x86_64-unknown-linux-gnu:z,ro -v /home/edgar/software/bubbaloop/target:/target:z -w /home/edgar/software/bubbaloop -t localhost/cross-rs/cross-custom-bubbaloop:aarch64-unknown-linux-gnu-b4932 sh -c 'PATH="$PATH":"/home/edgar/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/bin" cargo build --target aarch64-unknown-linux-gnu --release -v --bin serve'
/bin/sh: 1: cargo: not found
+ rustup component list --toolchain stable-x86_64-unknown-linux-gnu
==> ERROR: Build failed

@Emilgardis
Member

if you do

cross-util run --target aarch64-unknown-linux-gnu -- ls /home/edgar/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/bin

does it show cargo? if yes then it's a problem with the container you've used, as I hinted at earlier, probably a cargo dependency missing. You can try debugging it without the unhelpful error message by using gdb in the container

$ cross-util run --target aarch64-unknown-linux-gnu -i -- sh
> gdb /home/edgar/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/bin/cargo
> r

@edgarriba
Author

does it show cargo?

yes, it does!

with gdb then I get

Reading symbols from /home/edgar/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/bin/cargo...done.
(gdb) r
Starting program: /home/edgar/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/bin/cargo 
warning: Error disabling address space randomization: Operation not permitted
warning: Could not trace the inferior process.
Error: 
warning: ptrace: Function not implemented
During startup program exited with code 127.

@Emilgardis
Member

ugh forgot that gdb doesn't work on qemu...

anyway, there are plenty of resources online on how to debug from this point, starting from the message "During startup program exited with code 127".
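
As a starting point for reading that status: in POSIX shells, 127 conventionally means the command could not be found and 126 means it was found but could not be executed. A minimal sketch follows, with hypothetical helper names:

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Translate the conventional shell exit statuses into a short description.
describe_exit() {
    case "$1" in
        0)   echo "success" ;;
        126) echo "found but could not be executed" ;;
        127) echo "command not found" ;;
        *)   echo "exited with status $1" ;;
    esac
}

# Run a command line through sh, discard its output, and describe the status.
try_run() {
    sh -c "$1" >/dev/null 2>&1
    describe_exit "$?"
}
```

Running `try_run /home/edgar/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/bin/cargo` inside the container should reproduce the status without gdb. On Linux, a binary whose dynamic loader is missing typically produces the same 127 as a missing file, which is why the message alone cannot tell the two cases apart.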
