Safety Gymnasium Environment
Safety Gymnasium Interface
Documentation
- class omnisafe.envs.safety_gymnasium_env.SafetyGymnasiumEnv(env_id, num_envs=1, device=DEVICE_CPU, **kwargs)
Safety Gymnasium Environment.
- Parameters:
env_id (str) – Environment id.
num_envs (int, optional) – Number of environments. Defaults to 1.
device (torch.device, optional) – Device to store the data. Defaults to torch.device('cpu').
**kwargs (Any) – Other arguments.
- Variables:
need_auto_reset_wrapper (bool) – Whether to use auto reset wrapper.
need_time_limit_wrapper (bool) – Whether to use time limit wrapper.
Initialize an instance of SafetyGymnasiumEnv.
- reset(seed=None)
Reset the environment.
- Parameters:
seed (int or None, optional) – Seed to reset the environment. Defaults to None.
- Returns:
observation – Agent’s observation of the current environment.
info – Some information logged by the environment.
- Return type:
tuple[torch.Tensor, dict[str, Any]]
- set_seed(seed)
Set the seed for the environment.
- Parameters:
seed (int) – Seed to set.
- Return type:
None
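The seeding behaviour of reset and set_seed can be illustrated with a small, dependency-free sketch. ToyEnv below is a hypothetical stand-in and not part of OmniSafe: it only mirrors the documented signatures, returns plain lists where the real class returns torch.Tensor, and assumes that the same seed reproduces the same initial observation.

```python
from __future__ import annotations

import random
from typing import Any


class ToyEnv:
    """Hypothetical stand-in mirroring the documented reset/set_seed signatures."""

    def __init__(self) -> None:
        self._rng = random.Random()

    def set_seed(self, seed: int) -> None:
        # Assumption: seeding re-initialises the environment's random generator.
        self._rng = random.Random(seed)

    def reset(self, seed: int | None = None) -> tuple[list[float], dict[str, Any]]:
        # A seed passed to reset is applied before the first observation is drawn.
        if seed is not None:
            self.set_seed(seed)
        observation = [self._rng.random() for _ in range(3)]
        info: dict[str, Any] = {}
        return observation, info


env = ToyEnv()
obs_a, info_a = env.reset(seed=0)
obs_b, info_b = env.reset(seed=0)

# Under the stated assumption, identical seeds give identical first observations.
same_seed_matches = obs_a == obs_b
```

With the real class, the same pattern would unpack the observation tensor and the info dictionary from reset in the same order.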
- step(action)
Step the environment.
Note
OmniSafe uses an auto-reset wrapper to reset the environment when an episode ends, so the returned obs is the first observation of the next episode. The true final observation of the finished episode is stored under the final_observation key of info.
- Parameters:
action (torch.Tensor) – Action to take.
- Returns:
observation – The agent’s observation of the current environment.
reward – The amount of reward returned after the previous action.
cost – The amount of cost returned after the previous action.
terminated – Whether the episode has ended.
truncated – Whether the episode has been truncated due to a time limit.
info – Some information logged by the environment.
- Return type:
tuple[Tensor, Tensor, Tensor, Tensor, Tensor, dict[str, Any]]
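The auto-reset behaviour described in the note for step can be sketched with a toy environment. ToyAutoResetEnv, its horizon parameter, and the constant reward and cost values are all hypothetical and chosen for illustration; the sketch uses plain floats instead of torch.Tensor and is not OmniSafe's implementation. It shows only the documented contract: the six-element return tuple, and the final_observation key in info when an episode ends.

```python
from __future__ import annotations

from typing import Any


class ToyAutoResetEnv:
    """Hypothetical stand-in: truncates after `horizon` steps, then auto-resets."""

    def __init__(self, horizon: int = 3) -> None:
        self.horizon = horizon
        self.t = 0

    def reset(self, seed: int | None = None) -> tuple[float, dict[str, Any]]:
        self.t = 0
        return float(self.t), {}

    def step(self, action: float) -> tuple[float, float, float, bool, bool, dict[str, Any]]:
        self.t += 1
        obs = float(self.t)
        reward, cost = 1.0, 0.5  # illustrative constants
        terminated = False
        truncated = self.t >= self.horizon
        info: dict[str, Any] = {}
        if terminated or truncated:
            # Keep the true last observation, then hand back the first
            # observation of the next episode, as the note describes.
            info["final_observation"] = obs
            obs, _ = self.reset()
        return obs, reward, cost, terminated, truncated, info


env = ToyAutoResetEnv(horizon=3)
obs, info = env.reset(seed=0)
ep_return, ep_cost = 0.0, 0.0
final_obs = None

for _ in range(3):
    obs, reward, cost, terminated, truncated, info = env.step(0.0)
    ep_return += reward
    ep_cost += cost
    if terminated or truncated:
        # obs already belongs to the next episode; read the real last one from info.
        final_obs = info["final_observation"]
        break
```

Code that needs the last observation of an episode, for example to bootstrap a value estimate, would read it from info rather than from the returned obs.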