FP8 format #354

stella-jxu · 2023-06-08T06:04:11Z

I am trying the PyTorch example from the Transformer Engine user guide:
https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/index.html?highlight=e4m3#pytorch
It works fine with the E4M3 data format. However, when I tried E5M2, I got the following error:

Traceback (most recent call last):
  File "transformer/transformer.py", line 16, in <module>
    fp8_recipe = recipe.DelayedScaling(margin=0, interval=1, fp8_format=recipe.Format.E5M2)
  File "pydantic/dataclasses.py", line 286, in pydantic.dataclasses._add_pydantic_validation_attributes.handle_extra_init
    f'default={self.default!r},'
  File "<string>", line 11, in __init__
  File "pydantic/dataclasses.py", line 305, in pydantic.dataclasses._add_pydantic_validation_attributes.new_post_init
    def __set_name__(self, owner, name):
  File "/usr/local/lib/python3.10/dist-packages/transformer_engine/common/recipe.py", line 135, in __post_init__
    assert self.fp8_format != Format.E5M2, "Pure E5M2 training is not supported."
AssertionError: Pure E5M2 training is not supported.

How can I enable the E5M2 format in this case? Thanks!

ksivaman (Maintainer) · 2023-08-03T06:00:04Z

@stella-jxu E5M2-only training is not supported, as the assertion error states. It cannot be enabled.
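
For readers who want to see how the check in the traceback behaves, here is a minimal sketch. It does not import Transformer Engine: `Format` and `DelayedScaling` below are simplified stand-ins written for illustration, modeling only the assertion quoted in the traceback. The enum values, the field defaults, and the `try_recipe` helper are assumptions of this sketch, not the library's implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Format(Enum):
    """Stand-in for transformer_engine.common.recipe.Format (not the real class)."""
    E4M3 = "e4m3"      # 4 exponent bits, 3 mantissa bits
    E5M2 = "e5m2"      # 5 exponent bits, 2 mantissa bits
    HYBRID = "hybrid"  # E4M3 in the forward pass, E5M2 in the backward pass


@dataclass
class DelayedScaling:
    """Simplified stand-in for the recipe; only the format check is modeled."""
    margin: int = 0
    interval: int = 1
    fp8_format: Format = Format.HYBRID  # default chosen for this sketch

    def __post_init__(self):
        # Same assertion as the one quoted in the traceback.
        assert self.fp8_format != Format.E5M2, "Pure E5M2 training is not supported."


def try_recipe(fmt):
    """Hypothetical helper: return 'ok' if the recipe builds, else the assertion message."""
    try:
        DelayedScaling(margin=0, interval=1, fp8_format=fmt)
        return "ok"
    except AssertionError as err:
        return str(err)


# Try every format and record which ones pass the check.
results = {fmt.name: try_recipe(fmt) for fmt in Format}
print(results)
```

In Transformer Engine itself, the closest supported option appears to be `recipe.Format.HYBRID`, which the library documents as using E4M3 for forward-pass tensors and E5M2 for backward-pass tensors. Passing it in place of `recipe.Format.E5M2` in the script from the question should avoid the assertion, although it is not pure E5M2 training.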
