RuntimeError: Error building extension '_hash_encoder_df' #1

Open
entangledothers opened this issue Mar 27, 2022 · 5 comments
@entangledothers
entangledothers commented Mar 27, 2022

I have tried various environment setups but keep getting stuck on the following error when running this command: OMP_NUM_THREADS=1 CUDA_VISIBLE_DEVICES=1 python main_nerf.py "cthulhu" --workspace trial --cuda_ray --fp16 --gui

Using /home/user/.cache/torch_extensions/py37_cu113 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/user/.cache/torch_extensions/py37_cu113/_hash_encoder_df/build.ninja...
Building extension module _hash_encoder_df...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] c++ -MMD -MF bindings.o.d -DTORCH_EXTENSION_NAME=_hash_encoder_df -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/TH -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/THC -isystem /home/user/anaconda3/envs/dreamfields/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -std=c++14 -c /home/user/test/dreamfields-torch/hashencoder/src/bindings.cpp -o bindings.o 
[2/3] /usr/bin/nvcc  -DTORCH_EXTENSION_NAME=_hash_encoder_df -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/TH -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/THC -isystem /home/user/anaconda3/envs/dreamfields/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -c /home/user/test/dreamfields-torch/hashencoder/src/hashencoder.cu -o hashencoder.cuda.o 
FAILED: hashencoder.cuda.o 
/usr/bin/nvcc  -DTORCH_EXTENSION_NAME=_hash_encoder_df -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/TH -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/THC -isystem /home/user/anaconda3/envs/dreamfields/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -c /home/user/test/dreamfields-torch/hashencoder/src/hashencoder.cu -o hashencoder.cuda.o 
/usr/include/c++/10/chrono: In substitution of ‘template<class _Rep, class _Period> template<class _Period2> using __is_harmonic = std::__bool_constant<(std::ratio<((_Period2::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)) * (_Period::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den))), ((_Period2::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den)) * (_Period::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)))>::den == 1)> [with _Period2 = _Period2; _Rep = _Rep; _Period = _Period]’:
/usr/include/c++/10/chrono:473:154:   required from here
/usr/include/c++/10/chrono:428:27: internal compiler error: Segmentation fault
  428 |  _S_gcd(intmax_t __m, intmax_t __n) noexcept
      |                           ^~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-10/README.Bugs> for instructions.
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1723, in _run_ninja_build
    env=env)
  File "/home/user/anaconda3/envs/dreamfields/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "main_nerf.py", line 57, in <module>
    from nerf.network import NeRFNetwork
  File "/home/user/test/dreamfields-torch/nerf/network.py", line 5, in <module>
    from encoding import get_encoder
  File "/home/user/test/dreamfields-torch/encoding.py", line 6, in <module>
    from hashencoder import HashEncoder
  File "/home/user/test/dreamfields-torch/hashencoder/__init__.py", line 1, in <module>
    from .hashgrid import HashEncoder
  File "/home/user/test/dreamfields-torch/hashencoder/hashgrid.py", line 9, in <module>
    from .backend import _backend
  File "/home/user/test/dreamfields-torch/hashencoder/backend.py", line 12, in <module>
    sources=[os.path.join(_src_path, 'src', f) for f in [
  File "/home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1136, in load
    keep_intermediates=keep_intermediates)
  File "/home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1347, in _jit_compile
    is_standalone=is_standalone)
  File "/home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1452, in _write_ninja_file_and_build_library
    error_prefix=f"Error building extension '{name}'")
  File "/home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1733, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension '_hash_encoder_df'

The only change I made to the repo was enabling verbose output for hashencoder. It also makes no difference whether tiny-cuda-nn is installed or not.
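
The decisive line in the log above is the `internal compiler error` raised from `/usr/include/c++/10/chrono`, not the Python traceback that follows it. Below is a minimal sketch of how one might pull that line out of a long ninja log. The function name `find_compiler_crash` and its return shape are illustrative assumptions, not part of this repo or of PyTorch.

```python
import re
from typing import Optional, Tuple

# Matches GCC diagnostics of the form "<path>:<line>:<col>: internal compiler error: <msg>".
_ICE_PATTERN = re.compile(
    r"^(?P<path>\S+):(?P<line>\d+):(?P<col>\d+): internal compiler error: (?P<msg>.*)$",
    re.MULTILINE,
)
# Extracts the libstdc++ major version from a header path such as /usr/include/c++/10/chrono.
_LIBSTDCXX_PATTERN = re.compile(r"/include/c\+\+/(?P<major>\d+)/")


def find_compiler_crash(log: str) -> Optional[Tuple[str, str, Optional[int]]]:
    """Return (header path, message, libstdc++ major version) for the first ICE, or None."""
    match = _ICE_PATTERN.search(log)
    if match is None:
        return None
    version = _LIBSTDCXX_PATTERN.search(match.group("path"))
    major = int(version.group("major")) if version else None
    return match.group("path"), match.group("msg").strip(), major


if __name__ == "__main__":
    # Two lines copied from the build log in this issue.
    sample = (
        "/usr/include/c++/10/chrono:473:154:   required from here\n"
        "/usr/include/c++/10/chrono:428:27: internal compiler error: Segmentation fault\n"
    )
    print(find_compiler_crash(sample))
```

If the reported major version is still 10 after switching compilers, that would suggest the build is still picking up the GCC 10 toolchain.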

@ashawkey
Owner

  1. Could you provide more details about the environment, such as the platform, GPU arch, and CUDA version?
  2. If you can successfully compile tiny-cuda-nn, you could comment out all uses of HashEncoder (e.g., in /home/user/test/dreamfields-torch/encoding.py) and use the hash encoder from tiny-cuda-nn instead by adding the --tcnn flag.
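
The environment details requested in point 1 can be gathered in one go. The following is a rough sketch rather than a script from this repo: `tool_version` and `collect_environment` are hypothetical helper names, and the PyTorch part is skipped if `torch` cannot be imported.

```python
import platform
import shutil
import subprocess
from typing import Dict


def tool_version(tool: str) -> str:
    """Return the output of `<tool> --version`, or a short note if the tool is unavailable."""
    if shutil.which(tool) is None:
        return "not found on PATH"
    try:
        result = subprocess.run(
            [tool, "--version"], capture_output=True, text=True, timeout=30, check=False
        )
    except (OSError, subprocess.SubprocessError) as exc:
        return f"failed to run: {exc}"
    output = (result.stdout or result.stderr).strip()
    return output if output else "no output"


def collect_environment() -> Dict[str, str]:
    """Collect platform, compiler, CUDA toolkit and (optionally) PyTorch/GPU details."""
    info = {
        "platform": platform.platform(),
        "python": platform.python_version(),
        "gcc": tool_version("gcc"),
        "g++": tool_version("g++"),
        "nvcc": tool_version("nvcc"),
    }
    try:
        import torch  # optional: only needed for the PyTorch and GPU fields
    except ImportError:
        info["torch"] = "not installed"
        return info
    info["torch"] = torch.__version__
    info["torch_cuda"] = str(torch.version.cuda)  # CUDA version PyTorch was built against
    if torch.cuda.is_available():
        major, minor = torch.cuda.get_device_capability(0)
        info["gpu"] = f"{torch.cuda.get_device_name(0)} (sm_{major}{minor})"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Comparing the `nvcc` entry with `torch_cuda` shows whether the system toolkit and the CUDA version PyTorch was built against agree.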

@entangledothers
Author

Of course!

  1. POP OS (Ubuntu 21.04), A6000 & RTX 6000, CUDA 11.4.
  2. Commented out lines 6, 65 & 66 (let me know if there were any further parts that need commenting out). The process now breaks on raymarching:

OMP_NUM_THREADS=1 CUDA_VISIBLE_DEVICES=0 python main_nerf.py "cthulhu" --workspace trial --cuda_ray --fp16 --tcnn --gui

Namespace(H=800, W=800, aug_copy=8, bound=1, cuda_ray=True, dir_text=False, ff=False, fovy=90, fp16=True, gui=True, h=128, max_ray_batch=4096, max_spp=64, num_rays=4096, num_steps=128, radius=3, seed=0, tau_0=0.5, tau_1=0.8, tau_step=500, tcnn=True, test=False, text='cthulhu', upsample_steps=128, w=128, workspace='trial')
Traceback (most recent call last):
  File "/home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1723, in _run_ninja_build
    env=env)
  File "/home/user/anaconda3/envs/dreamfields/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "main_nerf.py", line 55, in <module>
    from nerf.network_tcnn import NeRFNetwork
  File "/home/user/dreamfields-torch/nerf/network_tcnn.py", line 6, in <module>
    from .renderer import NeRFRenderer
  File "/home/user/dreamfields-torch/nerf/renderer.py", line 9, in <module>
    import raymarching
  File "/home/user/dreamfields-torch/raymarching/__init__.py", line 1, in <module>
    from .raymarching import *
  File "/home/user/dreamfields-torch/raymarching/raymarching.py", line 9, in <module>
    from .backend import _backend
  File "/home/user/dreamfields-torch/raymarching/backend.py", line 9, in <module>
    sources=[os.path.join(_src_path, 'src', f) for f in [
  File "/home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1136, in load
    keep_intermediates=keep_intermediates)
  File "/home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1347, in _jit_compile
    is_standalone=is_standalone)
  File "/home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1452, in _write_ninja_file_and_build_library
    error_prefix=f"Error building extension '{name}'")
  File "/home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1733, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'raymarching_df': [1/3] c++ -MMD -MF bindings.o.d -DTORCH_EXTENSION_NAME=raymarching_df -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/TH -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/THC -isystem /home/user/anaconda3/envs/dreamfields/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -std=c++14 -c /home/user/dreamfields-torch/raymarching/src/bindings.cpp -o bindings.o
[2/3] /usr/bin/nvcc -DTORCH_EXTENSION_NAME=raymarching_df -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/TH -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/THC -isystem /home/user/anaconda3/envs/dreamfields/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/user/dreamfields-torch/raymarching/src/raymarching.cu -o raymarching.cuda.o
FAILED: raymarching.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=raymarching_df -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/TH -isystem /home/user/anaconda3/envs/dreamfields/lib/python3.7/site-packages/torch/include/THC -isystem /home/user/anaconda3/envs/dreamfields/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/user/dreamfields-torch/raymarching/src/raymarching.cu -o raymarching.cuda.o
/home/user/dreamfields-torch/raymarching/src/raymarching.cu(271): warning: variable "d" was declared but never referenced
detected during instantiation of "void kernel_composite_rays_train_forward(const scalar_t *, const scalar_t *, const scalar_t *, const int *, float, uint32_t, uint32_t, scalar_t *, scalar_t *) [with scalar_t=double]"
(444): here

/home/user/dreamfields-torch/raymarching/src/raymarching.cu(271): warning: variable "d" was declared but never referenced
detected during instantiation of "void kernel_composite_rays_train_forward(const scalar_t *, const scalar_t *, const scalar_t *, const int *, float, uint32_t, uint32_t, scalar_t *, scalar_t *) [with scalar_t=float]"
(444): here

/home/user/dreamfields-torch/raymarching/src/raymarching.cu(535): warning: variable "near" was declared but never referenced
detected during instantiation of "void kernel_march_rays(uint32_t, uint32_t, const int *, const scalar_t *, const scalar_t *, const scalar_t *, float, uint32_t, const scalar_t *, float, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *, scalar_t *, uint32_t) [with scalar_t=float]"
(606): here

/usr/include/c++/10/chrono: In substitution of ‘template<class _Rep, class _Period> template<class _Period2> using __is_harmonic = std::__bool_constant<(std::ratio<((_Period2::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)) * (_Period::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den))), ((_Period2::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den)) * (_Period::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)))>::den == 1)> [with _Period2 = _Period2; _Rep = _Rep; _Period = _Period]’:
/usr/include/c++/10/chrono:473:154:   required from here
/usr/include/c++/10/chrono:428:27: internal compiler error: Segmentation fault
  428 |  _S_gcd(intmax_t __m, intmax_t __n) noexcept
      |                           ^~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-10/README.Bugs> for instructions.
ninja: build stopped: subcommand failed.

@ashawkey
Owner

@entangledothers It seems to be caused by the gcc version according to this issue. Could you try with a lower gcc version, such as gcc-9?
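
Installing or selecting an older gcc does not necessarily change the host compiler that nvcc invokes; in the logs above, the crash is reported from `/usr/include/c++/10/chrono`, i.e. the GCC 10 headers. One way to pin the host compiler explicitly is nvcc's `-ccbin` option, passed through the `extra_cuda_cflags` argument of `torch.utils.cpp_extension.load`. The sketch below assumes `g++-9` is installed; the helper names `host_compiler_flags` and `load_with_host_compiler` are hypothetical, and this is not how the repo's `backend.py` is written.

```python
import os
import shutil
from typing import List


def host_compiler_flags(compiler: str) -> List[str]:
    """Build nvcc flags that pin the host compiler via -ccbin."""
    path = shutil.which(compiler)
    if path is None:
        raise FileNotFoundError(f"{compiler!r} is not on PATH")
    return ["-ccbin", path]


def load_with_host_compiler(name: str, sources: List[str], compiler: str = "g++-9"):
    """JIT-build a CUDA extension, requesting the same host compiler for both build steps."""
    # Imported lazily so the helper above can be used without PyTorch installed.
    from torch.utils.cpp_extension import load

    flags = host_compiler_flags(compiler)
    # Assumption: the plain C++ step honours the CXX environment variable.
    os.environ["CXX"] = flags[1]
    return load(
        name=name,
        sources=sources,
        extra_cuda_cflags=flags,  # forwarded to nvcc
        verbose=True,  # print the ninja commands so the -ccbin flag can be checked
    )
```

With `verbose=True`, the printed nvcc command line should contain `-ccbin` followed by the chosen compiler; if the error still points at the `c++/10` headers, the older compiler is probably not the one being used.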

@entangledothers
Author

Sadly, using gcc-9 (and even gcc-8) made no difference; I get the same error as above.

@ashawkey
Owner

Sorry for the late reply! A major update has been pushed; please try again to see whether anything changes.
