BUG: Nondeterministic behaviour of MetaDriveEnv #758

olek-osikowicz · 2024-08-30T15:42:32Z

Hi MetaDrive team,

I believe I have discovered a bug in resetting MetaDriveEnv that results in nondeterminism.

MetaDrive simulation is supposed to be deterministic, but even when I reuse the environment and reset it with the same seed, the resulting traces are not identical. Consider the following code, adapted from the examples:

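# n_scenarios, seed, expert_driving and traces are assumed to be defined beforehand.
from metadrive.envs.metadrive_env import MetaDriveEnv
from metadrive.examples import expert

# One environment is created once and reused across all repetitions.
env = MetaDriveEnv(config={"map": "C", "num_scenarios": n_scenarios})
try:
    for rep in range(n_scenarios):
        # Reset with the same seed every time, so every repetition should be identical.
        obs, step_info = env.reset(seed)
        while True:
            # get action from expert driving, or a dummy action
            action = expert(env.agent, deterministic=True) if expert_driving else [0, 0.33]
            obs, reward, tm, tr, step_info = env.step(action)
            traces.append(step_info)

            # stop when the episode is terminated or truncated
            if tm or tr:
                break
finally:
    env.close()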

When analyzing the traces (step info for each timestep) from different repetitions, I found slight differences, probably coming from floating-point arithmetic. These differences (errors) between traces grow the longer the episode runs.
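A minimal sketch of how two traces could be compared step by step to find where they first stop matching exactly. It assumes each repetition's trace is kept as its own list of step-info dicts with scalar values; the helper name first_divergence and the "velocity" key in the toy data are illustrative, not taken from the notebook.

```python
def first_divergence(trace_a, trace_b):
    """Return (step, key, value_a, value_b) for the first exact mismatch, or None."""
    for step, (info_a, info_b) in enumerate(zip(trace_a, trace_b)):
        # Compare only keys present in both step-info dicts, in a stable order.
        for key in sorted(info_a.keys() & info_b.keys()):
            if info_a[key] != info_b[key]:
                return step, key, info_a[key], info_b[key]
    # No value differed over the common prefix; report a length mismatch if any.
    if len(trace_a) != len(trace_b):
        return min(len(trace_a), len(trace_b)), "<length>", len(trace_a), len(trace_b)
    return None


# Toy traces standing in for two repetitions (hypothetical values).
run_1 = [{"velocity": 0.0}, {"velocity": 1.25}, {"velocity": 2.5}]
run_2 = [{"velocity": 0.0}, {"velocity": 1.25}, {"velocity": 2.5000000001}]

print(first_divergence(run_1, run_2))
```

Knowing the first diverging step and field helps tell a mismatch that exists from the start apart from one that only builds up during the rollout.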

Suspecting that the .reset() function doesn't clear the state properly, I started initializing the environment for each repetition and closing it at the end.

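for rep in range(n_scenarios):
    # Create a fresh environment for every repetition instead of reusing one.
    env = MetaDriveEnv(config={"map": "C", "num_scenarios": n_scenarios})
    try:
        obs, step_info = env.reset(seed)
        while True:
            # get action from expert driving, or a dummy action
            action = expert(env.agent, deterministic=True) if expert_driving else [0, 0.33]
            obs, reward, tm, tr, step_info = env.step(action)
            # record step info, as in the first snippet
            traces.append(step_info)
            if tm or tr:
                break
    finally:
        # Close the environment even if the rollout raises.
        env.close()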

This solved the issue: the traces produced are exactly the same (fully deterministic).

Please see my notebook reproducing the bug.

Conda env

# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
@EXPLICIT
https://repo.anaconda.com/pkgs/main/linux-64/_libgcc_mutex-0.1-main.conda
https://conda.anaconda.org/conda-forge/linux-64/ca-certificates-2024.7.4-hbcca054_0.conda
https://repo.anaconda.com/pkgs/main/linux-64/ld_impl_linux-64-2.38-h1181459_1.conda
https://repo.anaconda.com/pkgs/main/linux-64/libstdcxx-ng-11.2.0-h1234567_1.conda
https://repo.anaconda.com/pkgs/main/noarch/tzdata-2024a-h04d1e81_0.conda
https://repo.anaconda.com/pkgs/main/linux-64/libgomp-11.2.0-h1234567_1.conda
https://repo.anaconda.com/pkgs/main/linux-64/_openmp_mutex-5.1-1_gnu.conda
https://repo.anaconda.com/pkgs/main/linux-64/libgcc-ng-11.2.0-h1234567_1.conda
https://repo.anaconda.com/pkgs/main/linux-64/bzip2-1.0.8-h5eee18b_6.conda
https://repo.anaconda.com/pkgs/main/linux-64/libffi-3.4.4-h6a678d5_1.conda
https://conda.anaconda.org/conda-forge/linux-64/libsodium-1.0.18-h36c2ea0_1.tar.bz2
https://repo.anaconda.com/pkgs/main/linux-64/libuuid-1.41.5-h5eee18b_0.conda
https://repo.anaconda.com/pkgs/main/linux-64/ncurses-6.4-h6a678d5_0.conda
https://repo.anaconda.com/pkgs/main/linux-64/openssl-3.0.14-h5eee18b_0.conda
https://repo.anaconda.com/pkgs/main/linux-64/xz-5.4.6-h5eee18b_1.conda
https://repo.anaconda.com/pkgs/main/linux-64/zlib-1.2.13-h5eee18b_1.conda
https://repo.anaconda.com/pkgs/main/linux-64/readline-8.2-h5eee18b_0.conda
https://repo.anaconda.com/pkgs/main/linux-64/tk-8.6.14-h39e8969_0.conda
https://repo.anaconda.com/pkgs/main/linux-64/zeromq-4.3.5-h6a678d5_0.conda
https://repo.anaconda.com/pkgs/main/linux-64/sqlite-3.45.3-h5eee18b_0.conda
https://repo.anaconda.com/pkgs/main/linux-64/python-3.10.14-h955ad1f_1.conda
https://repo.anaconda.com/pkgs/main/linux-64/debugpy-1.6.7-py310h6a678d5_0.conda
https://conda.anaconda.org/conda-forge/noarch/decorator-5.1.1-pyhd8ed1ab_0.tar.bz2
https://conda.anaconda.org/conda-forge/noarch/entrypoints-0.4-pyhd8ed1ab_0.tar.bz2
https://conda.anaconda.org/conda-forge/noarch/exceptiongroup-1.2.2-pyhd8ed1ab_0.conda
https://conda.anaconda.org/conda-forge/noarch/executing-2.0.1-pyhd8ed1ab_0.conda
https://conda.anaconda.org/conda-forge/noarch/nest-asyncio-1.6.0-pyhd8ed1ab_0.conda
https://conda.anaconda.org/conda-forge/noarch/packaging-24.1-pyhd8ed1ab_0.conda
https://conda.anaconda.org/conda-forge/noarch/parso-0.8.4-pyhd8ed1ab_0.conda
https://conda.anaconda.org/conda-forge/noarch/pickleshare-0.7.5-py_1003.tar.bz2
https://conda.anaconda.org/conda-forge/noarch/platformdirs-4.2.2-pyhd8ed1ab_0.conda
https://repo.anaconda.com/pkgs/main/linux-64/psutil-5.9.0-py310h5eee18b_0.conda
https://conda.anaconda.org/conda-forge/noarch/ptyprocess-0.7.0-pyhd3deb0d_0.tar.bz2
https://conda.anaconda.org/conda-forge/noarch/pure_eval-0.2.3-pyhd8ed1ab_0.conda
https://conda.anaconda.org/conda-forge/noarch/pygments-2.18.0-pyhd8ed1ab_0.conda
https://conda.anaconda.org/conda-forge/linux-64/python_abi-3.10-2_cp310.tar.bz2
https://repo.anaconda.com/pkgs/main/linux-64/pyzmq-25.1.2-py310h6a678d5_0.conda
https://repo.anaconda.com/pkgs/main/linux-64/setuptools-72.1.0-py310h06a4308_0.conda
https://conda.anaconda.org/conda-forge/noarch/six-1.16.0-pyh6c4a22f_0.tar.bz2
https://conda.anaconda.org/conda-forge/noarch/traitlets-5.14.3-pyhd8ed1ab_0.conda
https://conda.anaconda.org/conda-forge/noarch/typing_extensions-4.12.2-pyha770c72_0.conda
https://conda.anaconda.org/conda-forge/noarch/wcwidth-0.2.13-pyhd8ed1ab_0.conda
https://repo.anaconda.com/pkgs/main/linux-64/wheel-0.43.0-py310h06a4308_0.conda
https://conda.anaconda.org/conda-forge/noarch/asttokens-2.4.1-pyhd8ed1ab_0.conda
https://conda.anaconda.org/conda-forge/noarch/comm-0.2.2-pyhd8ed1ab_0.conda
https://conda.anaconda.org/conda-forge/noarch/jedi-0.19.1-pyhd8ed1ab_0.conda
https://conda.anaconda.org/conda-forge/linux-64/jupyter_core-5.7.2-py310hff52083_0.conda
https://conda.anaconda.org/conda-forge/noarch/matplotlib-inline-0.1.7-pyhd8ed1ab_0.conda
https://conda.anaconda.org/conda-forge/noarch/pexpect-4.9.0-pyhd8ed1ab_0.conda
https://repo.anaconda.com/pkgs/main/linux-64/pip-24.2-py310h06a4308_0.conda
https://conda.anaconda.org/conda-forge/noarch/prompt-toolkit-3.0.47-pyha770c72_0.conda
https://conda.anaconda.org/conda-forge/noarch/python-dateutil-2.9.0-pyhd8ed1ab_0.conda
https://conda.anaconda.org/conda-forge/linux-64/tornado-6.1-py310h5764c6d_3.tar.bz2
https://conda.anaconda.org/conda-forge/noarch/jupyter_client-7.3.4-pyhd8ed1ab_0.tar.bz2
https://conda.anaconda.org/conda-forge/noarch/stack_data-0.6.2-pyhd8ed1ab_0.conda
https://conda.anaconda.org/conda-forge/noarch/ipython-8.26.0-pyh707e725_0.conda
https://conda.anaconda.org/conda-forge/noarch/ipykernel-6.29.5-pyh3099207_0.conda


pengzhenghao · 2024-12-15T19:12:01Z

Thanks for raising this! May I ask whether this problem is caused by:

a floating-point problem with the expert policy, or
the reset function not clearing state correctly, or
the random state not being reset properly?

olek-osikowicz · 2024-12-16T15:00:55Z

Hi @pengzhenghao,
Since I submitted this issue, I have moved the notebook. I've now pushed it to the new repo.

Answering your questions:

I don't think it's a problem with the expert policy. I ran both the torch and numpy versions, and the bug appears consistently.
I don't think it's a problem with the random state. I ran the scenarios with the expert agent with deterministic=True. I looked at the code briefly, and the policy doesn't draw any random numbers.
So I presume it's the "reset function doesn't clear state correctly" option. The state is cleared completely only if you reinitialize the environment object MetaDriveEnv(...). If .reset() cleared the state completely, there wouldn't be a difference between traces.

However, these are only my suggestions; feel free to look at the code and share your thoughts.

pengzhenghao · 2024-12-16T18:29:10Z

Thanks for sharing the code! I ran some experiments:

reset + expert rollout: reproduces your result. Yes, the traces are different.
reset only: deterministic. If we don't call env.step at all, the initial states are deterministic. This is expected and is a major promise that current MD still keeps.
reset + single step from the expert: non-deterministic.
reset + single step with a fixed action: deterministic.
reset + rollout with a fixed action: non-deterministic.

I notice that you are using exact equality in your assertions. We usually use NumPy's almost-equal checks to avoid floating-point issues. So I've made a new test script for this:

#789

With this new script I can verify that the relative error is <0.1% if we roll out the expert for 50 steps, and <1e-6 if we roll out the expert for 1 step.
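For readers who want to see the difference between exact and tolerance-based comparison, here is a rough standard-library sketch. It is not the script from #789; the helper names, the "velocity" key and the toy values are hypothetical, and the two tolerances simply mirror the 0.1% and 1e-6 figures quoted above.

```python
import math


def max_relative_error(trace_a, trace_b, key):
    """Largest relative difference of one scalar field over two traces."""
    worst = 0.0
    for info_a, info_b in zip(trace_a, trace_b):
        a, b = float(info_a[key]), float(info_b[key])
        scale = max(abs(a), abs(b))
        if scale > 0.0:  # skip steps where both values are exactly zero
            worst = max(worst, abs(a - b) / scale)
    return worst


def traces_close(trace_a, trace_b, key, rel_tol):
    """True if both traces have equal length and the field matches within rel_tol."""
    return len(trace_a) == len(trace_b) and all(
        math.isclose(float(a[key]), float(b[key]), rel_tol=rel_tol, abs_tol=1e-12)
        for a, b in zip(trace_a, trace_b)
    )


# Toy traces standing in for two rollouts (hypothetical values).
run_1 = [{"velocity": 10.0}, {"velocity": 20.0}]
run_2 = [{"velocity": 10.0}, {"velocity": 20.001}]

print(run_1 == run_2)                                       # exact comparison fails
print(traces_close(run_1, run_2, "velocity", rel_tol=1e-3))  # passes at 0.1%
print(traces_close(run_1, run_2, "velocity", rel_tol=1e-6))  # fails at 1e-6
print(max_relative_error(run_1, run_2, "velocity"))
```

With NumPy available, np.testing.assert_allclose(actual, desired, rtol=...) plays the same role as traces_close here.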

Therefore, I think there is no bug in MetaDrive. Some tiny floating-point error is inevitable.
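As a generic illustration of how such tiny errors can arise (not a claim about where they originate inside MetaDrive), floating-point addition is not associative, so the same numbers combined in a different order can round differently:

```python
# The same three numbers, summed with different grouping.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(left == right)       # False: the two groupings round differently
print(abs(left - right))   # difference on the order of 1e-16
```

A discrepancy of this size in a single step is consistent with the observation earlier in the thread that differences between traces grow over longer episodes.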

pengzhenghao mentioned this issue Dec 16, 2024

Introduce a test script for determinism #789

Merged

2 tasks

pengzhenghao closed this as completed in #789 Dec 16, 2024
