academia.environments.base module
Module contents
Base classes for all environments available in this package. All user-defined environments should inherit from one of these classes (see "Using your own environments" for more information).
Exported classes:
- class academia.environments.base.GenericAtariWrapper(difficulty: int, environment_id: str, n_frames_stacked: int = 1, append_step_count: bool = False, flatten_state: bool = False, random_state: int | None = None, **kwargs)
Bases:
GenericGymnasiumWrapper
A wrapper for Atari environments that makes them scalable.
- Parameters:
difficulty – Difficulty level from 0 to 3, where 0 is the easiest and 3 is the hardest.
n_frames_stacked – How many most recent states should be stacked together to form a final state representation. Defaults to 1.
append_step_count – Whether or not to append the current step count to each state. Defaults to False.
flatten_state – Whether or not to flatten the state if it is represented by an RGB or grayscale image. If obs_type is set to "ram", this parameter does nothing. Defaults to False.
random_state – Optional seed that controls the randomness of the environment. Defaults to None.
kwargs – Arguments passed down to gymnasium.make.
- Raises:
ValueError – If the specified difficulty level is invalid.
- step_count
Current step count since the last reset.
- Type:
int
- difficulty
Difficulty level. Higher values indicate more difficult environments.
- Type:
int
- n_frames_stacked
How many most recent states should be stacked together to form a final state representation.
- Type:
int
- append_step_count
Whether or not to append the current step count to each state.
- Type:
bool
- flatten_state
Whether or not to flatten the state if it is represented by an RGB or grayscale image.
- Type:
bool
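The combined effect of flatten_state and append_step_count on the state shape can be illustrated with a minimal sketch. The helper build_state below is hypothetical and not part of academia. How the wrapper builds its state internally is not shown on this page, so the order of operations here, and the restriction to 1-D states when appending, are assumptions of the sketch.

```python
import numpy as np


def build_state(frame, step_count, flatten_state=False, append_step_count=False):
    """Hypothetical helper (not part of academia) mimicking the two flags."""
    state = np.asarray(frame, dtype=np.float32)
    if flatten_state:
        # Collapse an image such as (height, width, channels) into one vector.
        state = state.ravel()
    if append_step_count:
        # Assumption of this sketch: the step count is only appended to 1-D states.
        if state.ndim != 1:
            raise ValueError("this sketch only appends the step count to 1-D states")
        state = np.concatenate([state, np.array([step_count], dtype=np.float32)])
    return state


# A standard Atari RGB frame has shape (210, 160, 3).
frame = np.zeros((210, 160, 3), dtype=np.uint8)
flat = build_state(frame, step_count=7, flatten_state=True, append_step_count=True)
print(flat.shape)  # (100801,)
```

With both flags left at False, the sketch returns the frame unchanged apart from the cast to float32.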
- class academia.environments.base.GenericGymnasiumWrapper(difficulty: int, environment_id: str, n_frames_stacked: int = 1, append_step_count: bool = False, random_state: int | None = None, **kwargs)
Bases:
ScalableEnvironment
A wrapper for Gymnasium environments. Its purpose is to hold the common Gymnasium boilerplate so that it does not have to be copied and pasted into every wrapper. At the same time, it aims to stay flexible enough to handle the generalized nature of Gymnasium’s API, such as varying state representations.
- Parameters:
difficulty – The difficulty level of the environment.
environment_id – Gymnasium environment ID.
n_frames_stacked – How many most recent states should be stacked together to form a final state representation. Defaults to 1.
append_step_count – Whether or not to append the current step count to each state. Defaults to False.
random_state – Optional seed that controls the randomness of the environment. Defaults to None.
kwargs – Arguments passed down to gymnasium.make.
- step_count
Current step count since the last reset.
- Type:
int
- difficulty
Difficulty level. Higher values indicate more difficult environments.
- Type:
int
- n_frames_stacked
How many most recent states should be stacked together to form a final state representation.
- Type:
int
- append_step_count
Whether or not to append the current step count to each state.
- Type:
bool
- get_legal_mask() ndarray[Any, dtype[int32]]
- Returns:
A binary mask with 0s in place for illegal actions (actions that have no effect) and 1s for legal actions.
Note
For all Gymnasium-based environments in this package it is hard to cheaply obtain a legal mask, so this default implementation always returns an array of ones.
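One common way an agent can consume such a mask is to exclude illegal actions before picking the greedy one. The sketch below is only an illustration: masked_greedy_action and the sample Q-values are hypothetical and not part of academia. With the all-ones mask described in the note, the selection reduces to a plain argmax.

```python
import numpy as np


def masked_greedy_action(q_values: np.ndarray, legal_mask: np.ndarray) -> int:
    """Hypothetical helper: pick the best action among those marked legal (1)."""
    masked = np.where(legal_mask == 1, q_values, -np.inf)
    return int(np.argmax(masked))


q_values = np.array([0.2, 1.5, -0.3, 0.9], dtype=np.float32)

# Default behaviour described above: every action is reported as legal.
all_legal = np.ones(4, dtype=np.int32)
print(masked_greedy_action(q_values, all_legal))  # 1

# A mask that forbids action 1 makes the next-best action win.
some_illegal = np.array([1, 0, 1, 1], dtype=np.int32)
print(masked_greedy_action(q_values, some_illegal))  # 3
```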
- observe() ndarray[Any, dtype[float32]]
Returns the current state of the environment. Performs state stacking if n_frames_stacked is greater than 1.
- Returns:
The current state of the environment.
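State stacking can be sketched with a fixed-length buffer of the most recent states. The FrameStacker class below is hypothetical and not the wrapper's actual implementation. Two details are assumptions of this sketch: the buffer is filled with copies of the first state on reset, and states are joined along axis 0.

```python
from collections import deque

import numpy as np


class FrameStacker:
    """Hypothetical helper (not part of academia) that stacks recent states."""

    def __init__(self, n_frames_stacked: int = 1):
        self.n_frames_stacked = n_frames_stacked
        # The deque drops the oldest state automatically once it is full.
        self.frames = deque(maxlen=n_frames_stacked)

    def reset(self, first_state) -> np.ndarray:
        # Assumption: pad the buffer by repeating the first state.
        self.frames.clear()
        for _ in range(self.n_frames_stacked):
            self.frames.append(np.asarray(first_state, dtype=np.float32))
        return self.observe()

    def push(self, state) -> np.ndarray:
        self.frames.append(np.asarray(state, dtype=np.float32))
        return self.observe()

    def observe(self) -> np.ndarray:
        if self.n_frames_stacked == 1:
            return self.frames[-1]
        # Assumption: states are joined along the first axis, oldest first.
        return np.concatenate(list(self.frames), axis=0)


stacker = FrameStacker(n_frames_stacked=3)
initial = stacker.reset([1.0, 2.0])
print(initial.shape)  # (6,)
stacked = stacker.push([3.0, 4.0])
print(stacked.tolist())  # [1.0, 2.0, 1.0, 2.0, 3.0, 4.0]
```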
- render() None
Renders the environment in the current render mode.
- reset() ndarray[Any, dtype[float32]]
Resets the environment to its initial state.
- Returns:
The new state after resetting the environment.
- step(action: int) tuple[ndarray[Any, dtype[float32]], float, bool]
Advances the environment by one step given the specified action.
- Parameters:
action – The action to take.
- Returns:
A tuple containing the new state, reward, and a flag indicating episode termination.
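The documented reset and step signatures suggest an episode loop along the following lines. ToyEnv is a hypothetical stand-in that only mimics the documented return shapes (a state from reset, and a state, reward, done tuple from step). It is not an academia class, and run_episode is likewise an illustrative helper.

```python
class ToyEnv:
    """Hypothetical stand-in mimicking the documented reset/step return shapes."""

    def __init__(self, horizon: int = 5):
        self.horizon = horizon
        self.step_count = 0

    def reset(self):
        self.step_count = 0
        return [0.0]

    def step(self, action: int):
        self.step_count += 1
        reward = 1.0 if action == 1 else 0.0
        done = self.step_count >= self.horizon
        return [float(self.step_count)], reward, done


def run_episode(env, policy) -> float:
    """Run one episode and return the total reward collected."""
    state = env.reset()
    total_reward, done = 0.0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total_reward += reward
    return total_reward


total = run_episode(ToyEnv(horizon=5), policy=lambda state: 1)
print(total)  # 5.0
```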
- class academia.environments.base.GenericMiniGridWrapper(difficulty: int, difficulty_envid_map: dict, n_frames_stacked: int = 1, append_step_count: bool = False, random_state: int | None = None, **kwargs)
Bases:
GenericGymnasiumWrapper
A wrapper for MiniGrid environments that makes them scalable.
- Parameters:
difficulty – Difficulty level from 0 to 3, where 0 is the easiest and 3 is the hardest.
difficulty_envid_map – A dict that maps numerical difficulty level to gymnasium environment ID.
n_frames_stacked – How many most recent states should be stacked together to form a final state representation. Defaults to 1.
append_step_count – Whether or not to append the current step count to each state. Defaults to False.
random_state – Optional seed that controls the randomness of the environment. Defaults to None.
kwargs – Arguments passed down to gymnasium.make.
- Raises:
ValueError – If the specified difficulty level is invalid.
- step_count
Current step count since the last reset.
- Type:
int
- difficulty
Difficulty level. Higher values indicate more difficult environments.
- Type:
int
- n_frames_stacked
How many most recent states should be stacked together to form a final state representation.
- Type:
int
- append_step_count
Whether or not to append the current step count to each state.
- Type:
bool
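A difficulty_envid_map could look like the dictionary below, together with a lookup that raises ValueError for an unknown level, as documented above. This is a sketch under assumptions: the particular mapping is illustrative and not necessarily one used by any wrapper in this package, and resolve_environment_id is a hypothetical helper, not the wrapper's own code.

```python
# Illustrative mapping from difficulty level to a MiniGrid environment ID.
DIFFICULTY_ENVID_MAP = {
    0: "MiniGrid-Empty-5x5-v0",
    1: "MiniGrid-Empty-6x6-v0",
    2: "MiniGrid-Empty-8x8-v0",
    3: "MiniGrid-Empty-16x16-v0",
}


def resolve_environment_id(difficulty: int, difficulty_envid_map: dict) -> str:
    """Hypothetical helper: look up the environment ID or fail loudly."""
    try:
        return difficulty_envid_map[difficulty]
    except KeyError:
        valid = sorted(difficulty_envid_map)
        raise ValueError(
            f"Invalid difficulty {difficulty!r}; expected one of {valid}"
        ) from None


print(resolve_environment_id(2, DIFFICULTY_ENVID_MAP))  # MiniGrid-Empty-8x8-v0
```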
- class academia.environments.base.ScalableEnvironment(difficulty: int, n_frames_stacked: int = 1, **kwargs)
Bases:
ABC
Base class for all environments used in this package. Scalability ensures environments can be used for Curriculum Learning.
- Parameters:
difficulty – Difficulty level. Higher values indicate more difficult environments.
n_frames_stacked – How many most recent states should be stacked together to form a final state representation. Defaults to 1.
- difficulty
Difficulty level. Higher values indicate more difficult environments.
- Type:
int
- n_frames_stacked
How many most recent states should be stacked together to form a final state representation.
- Type:
int
- N_ACTIONS: int
Number of available actions.
- STATE_SHAPE: tuple[int, ...]
Shape of the state representation. Can vary for each instance.
- abstract get_legal_mask() ndarray[Any, dtype[int32]]
- Returns:
A binary mask with 0s in place for illegal actions (actions that have no effect) and 1s for legal actions.
- abstract observe() Any
- Returns:
The current state.
- abstract render() None
Renders the environment.
- abstract reset() Any
Resets the environment.
- Returns:
A starting state.
- abstract step(action: int) tuple[Any, float, bool]
Takes the given action in the environment.
- Parameters:
action – An action to take.
- Returns:
A tuple consisting of a new state, reward and a flag indicating whether the state is terminal.
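To show what implementing this interface might involve, here is a minimal sketch of a user-defined environment. So that it runs on its own, it subclasses ScalableEnvironmentSketch, a stand-in ABC written to mirror the abstract methods and attributes documented above. It is not the real academia class, whose constructor may do more. The Corridor environment and its difficulty scaling are hypothetical.

```python
from abc import ABC, abstractmethod

import numpy as np


class ScalableEnvironmentSketch(ABC):
    """Stand-in mirroring the documented interface; NOT the real academia class."""

    N_ACTIONS: int
    STATE_SHAPE: tuple

    def __init__(self, difficulty: int, n_frames_stacked: int = 1, **kwargs):
        self.difficulty = difficulty
        self.n_frames_stacked = n_frames_stacked

    @abstractmethod
    def get_legal_mask(self): ...

    @abstractmethod
    def observe(self): ...

    @abstractmethod
    def render(self): ...

    @abstractmethod
    def reset(self): ...

    @abstractmethod
    def step(self, action: int): ...


class Corridor(ScalableEnvironmentSketch):
    """Hypothetical environment: walk right until the end of a corridor."""

    N_ACTIONS = 2  # 0 = move left, 1 = move right
    STATE_SHAPE = (1,)

    def __init__(self, difficulty: int, n_frames_stacked: int = 1, **kwargs):
        super().__init__(difficulty, n_frames_stacked, **kwargs)
        # Higher difficulty means a longer corridor.
        self.length = 3 + 2 * difficulty
        self.position = 0

    def get_legal_mask(self):
        # Moving left at the left wall has no effect, so it is marked illegal.
        return np.array([int(self.position > 0), 1], dtype=np.int32)

    def observe(self):
        return np.array([self.position], dtype=np.float32)

    def render(self):
        print("." * self.position + "A" + "." * (self.length - self.position))

    def reset(self):
        self.position = 0
        return self.observe()

    def step(self, action: int):
        if action == 1:
            self.position += 1
        elif self.position > 0:
            self.position -= 1
        done = self.position >= self.length
        return self.observe(), (1.0 if done else 0.0), done


env = Corridor(difficulty=1)
state = env.reset()
steps, done = 0, False
while not done:
    state, reward, done = env.step(1)
    steps += 1
print(steps, reward)  # 5 1.0
```

Deriving the corridor length from difficulty is what makes the sketch scalable in the sense described above: a curriculum can start at difficulty 0 and move to longer corridors.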