Safety Gymnasium Environment

Safety Gymnasium Interface

class omnisafe.envs.safety_gymnasium_env.SafetyGymnasiumEnv(env_id, num_envs=1, device=DEVICE_CPU, **kwargs)

Safety Gymnasium Environment.

Parameters:
  • env_id (str) – Environment id.

  • num_envs (int, optional) – Number of environments. Defaults to 1.

  • device (torch.device, optional) – Device on which returned data is stored. Defaults to torch.device('cpu').

  • **kwargs (Any) – Other arguments.

Variables:
  • need_auto_reset_wrapper (bool) – Whether to use the auto-reset wrapper.

  • need_time_limit_wrapper (bool) – Whether to use the time-limit wrapper.

Initialize an instance of SafetyGymnasiumEnv.
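
Example (a minimal construction sketch; 'SafetyPointGoal1-v0' is assumed to be an available Safety Gymnasium task id):

>>> import torch
>>> from omnisafe.envs.safety_gymnasium_env import SafetyGymnasiumEnv
>>> # One environment copy, with all returned tensors kept on the CPU.
>>> env = SafetyGymnasiumEnv(
...     env_id='SafetyPointGoal1-v0',
...     num_envs=1,
...     device=torch.device('cpu'),
... )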

close()

Close the environment.

Return type:

None

render()

Render the environment.

Returns:

Rendered image.

Return type:

Any
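
Example (a hedged sketch: forwarding render_mode through **kwargs follows the usual Gymnasium convention and is an assumption here, not a parameter documented above):

>>> render_env = SafetyGymnasiumEnv('SafetyPointGoal1-v0', render_mode='rgb_array')
>>> _obs, _info = render_env.reset(seed=0)
>>> frame = render_env.render()  # the rendered image, e.g. an RGB array
>>> render_env.close()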

reset(seed=None)

Reset the environment.

Parameters:

seed (int or None, optional) – Seed used to reset the environment. Defaults to None.

Returns:
  • observation – Agent’s observation of the current environment.

  • info – Some information logged by the environment.

Return type:

tuple[torch.Tensor, dict[str, Any]]
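
Example (a sketch, continuing from the env constructed above):

>>> # A seeded reset gives a reproducible initial state; omitting the seed
>>> # continues from the environment's current random stream.
>>> obs, info = env.reset(seed=0)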

sample_action()

Sample a random action.

Returns:

A random action.

Return type:

Tensor
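
Example (handy for random rollouts and smoke tests):

>>> action = env.sample_action()  # random action for this task, as a torch.Tensor
>>> obs, reward, cost, terminated, truncated, info = env.step(action)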

set_seed(seed)

Set the seed for the environment.

Parameters:

seed (int) – Seed to set.

Return type:

None

step(action)

Step the environment.

Note

OmniSafe uses an auto-reset wrapper to reset the environment when an episode terminates, so the returned obs is the first observation of the next episode. The true final observation of the finished episode is stored under the final_observation key of info.

Parameters:

action (torch.Tensor) – Action to take.

Returns:
  • observation – The agent’s observation of the current environment.

  • reward – The amount of reward returned after the previous action.

  • cost – The amount of cost returned after the previous action.

  • terminated – Whether the episode has ended.

  • truncated – Whether the episode has been truncated due to a time limit.

  • info – Some information logged by the environment.

Return type:

tuple[Tensor, Tensor, Tensor, Tensor, Tensor, dict[str, Any]]
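
Example (a sketch of a short random rollout with a single environment; as the note explains, the last observation of a finished episode is read back from info['final_observation']):

>>> obs, info = env.reset(seed=0)
>>> for _ in range(200):
...     action = env.sample_action()  # random exploratory action
...     obs, reward, cost, terminated, truncated, info = env.step(action)
...     if bool(terminated) or bool(truncated):
...         # The auto-reset wrapper has already reset the environment, so obs
...         # belongs to the next episode; the finished episode's last
...         # observation is kept in info.
...         final_obs = info['final_observation']
>>> env.close()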