academia.environments.base module

Module contents

Base classes for all environments available in this package. All user-defined environments should inherit from one of these classes (see Using your own environments for more information).

Exported classes:

ScalableEnvironment
GenericMiniGridWrapper
GenericGymnasiumWrapper
GenericAtariWrapper

class academia.environments.base.GenericAtariWrapper(difficulty: int, environment_id: str, n_frames_stacked: int = 1, append_step_count: bool = False, flatten_state: bool = False, random_state: int | None = None, **kwargs)

Bases: GenericGymnasiumWrapper

A wrapper for Atari environments that makes them scalable.

Parameters:

difficulty – Difficulty level from 0 to 3, where 0 is the easiest and 3 is the hardest.
environment_id – Gymnasium environment ID.
n_frames_stacked – How many most recent states should be stacked together to form a final state representation. Defaults to 1.
append_step_count – Whether to append the current step count to each state. Defaults to False.
flatten_state – Whether to flatten the state if it is represented by an RGB or grayscale image. If obs_type is set to "ram", this parameter has no effect. Defaults to False.
random_state – Optional seed that controls the randomness of the environment. Defaults to None.
kwargs – Arguments passed down to gymnasium.make.

Raises:

ValueError – If the specified difficulty level is invalid.

step_count

Current step count since the last reset.

Type:: int

difficulty

Difficulty level. Higher values indicate more difficult environments.

Type:: int

n_frames_stacked

How many most recent states should be stacked together to form a final state representation.

Type:: int

append_step_count

Whether to append the current step count to each state.

Type:: bool

flatten_state

Whether to flatten the state if it is represented by an RGB or grayscale image.

Type:: bool
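The effect of flatten_state and append_step_count on the shape of a state can be illustrated with plain NumPy. This is only a sketch: build_state is a hypothetical helper, not part of the package, and the assumption that the step count ends up as the last element of a flat vector is ours, since this reference does not specify the exact layout.

```python
import numpy as np


def build_state(frame: np.ndarray, step_count: int,
                flatten_state: bool = False,
                append_step_count: bool = False) -> np.ndarray:
    """Hypothetical helper (not part of the package) that mimics the two options."""
    state = frame.astype(np.float32)
    if flatten_state:
        # Collapse an image of shape (height, width, channels) into a 1-D vector.
        state = state.reshape(-1)
    if append_step_count:
        # Assumption: the step count is stored as one extra trailing element,
        # which requires a 1-D state, so the state is flattened here as well.
        state = np.append(state.reshape(-1), np.float32(step_count))
    return state.astype(np.float32)


# A blank frame with the usual Atari RGB shape (210 x 160 pixels, 3 channels).
frame = np.zeros((210, 160, 3), dtype=np.uint8)
flat = build_state(frame, step_count=7, flatten_state=True, append_step_count=True)
print(flat.shape)  # (100801,) = 210 * 160 * 3 pixel values + 1 step count
```

A flat state like this is convenient for fully connected networks, while the unflattened image is the natural input for convolutional ones.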

class academia.environments.base.GenericGymnasiumWrapper(difficulty: int, environment_id: str, n_frames_stacked: int = 1, append_step_count: bool = False, random_state: int | None = None, **kwargs)

Bases: ScalableEnvironment

A wrapper for Gymnasium environments. It gathers the common Gymnasium boilerplate in one place so that it does not have to be duplicated in every wrapper, while staying flexible enough to handle the variation that Gymnasium’s API allows, such as differing state representations.

Parameters:

difficulty – The difficulty level of the environment.
environment_id – Gymnasium environment ID.
n_frames_stacked – How many most recent states should be stacked together to form a final state representation. Defaults to 1.
append_step_count – Whether to append the current step count to each state. Defaults to False.
random_state – Optional seed that controls the randomness of the environment. Defaults to None.
kwargs – Arguments passed down to gymnasium.make.

step_count

Current step count since the last reset.

Type:: int

difficulty

Difficulty level. Higher values indicate more difficult environments.

Type:: int

n_frames_stacked

How many most recent states should be stacked together to form a final state representation.

Type:: int

append_step_count

Whether to append the current step count to each state.

Type:: bool

get_legal_mask() → ndarray[Any, dtype[int32]]

Returns:: A binary mask with 0s in place for illegal actions (actions that have no effect) and 1s for legal actions.

Note

For all Gymnasium-based environments in this package it is hard to cheaply obtain a legal mask, so this default implementation always returns an array of ones.
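A legal mask of this form is typically consumed by the agent when it picks an action. The following is a minimal sketch of one way to do that; greedy_legal_action is a hypothetical helper, not part of the package, and the Q-values are made up for illustration.

```python
import numpy as np


def greedy_legal_action(q_values: np.ndarray, legal_mask: np.ndarray) -> int:
    """Hypothetical helper: pick the highest-valued action among the legal ones."""
    # Illegal actions (mask == 0) get -inf so that argmax can never select them.
    masked = np.where(legal_mask == 1, q_values, -np.inf)
    return int(np.argmax(masked))


q_values = np.array([0.2, 1.5, 0.9, -0.3], dtype=np.float32)

# An all-ones mask, as the default implementation returns, changes nothing.
all_legal = np.ones(4, dtype=np.int32)
print(greedy_legal_action(q_values, all_legal))  # 1

# With action 1 marked illegal, the next best action is chosen instead.
some_illegal = np.array([1, 0, 1, 1], dtype=np.int32)
print(greedy_legal_action(q_values, some_illegal))  # 2
```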

observe() → ndarray[Any, dtype[float32]]

Returns the current state of the environment. Performs state stacking if n_frames_stacked is greater than 1.

Returns:: The current state of the environment.
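State stacking can be pictured as a fixed-size buffer of the most recent frames. The sketch below is an illustration under stated assumptions, not the package's implementation: FrameStacker is a hypothetical class, and both the padding on reset (repeating the first frame) and the concatenation along the first axis are our choices.

```python
from collections import deque

import numpy as np


class FrameStacker:
    """Hypothetical buffer that keeps the n most recent frames."""

    def __init__(self, n_frames_stacked: int):
        # A deque with maxlen drops the oldest frame automatically.
        self.frames = deque(maxlen=n_frames_stacked)

    def reset(self, first_frame: np.ndarray) -> np.ndarray:
        # Assumption: pad the buffer by repeating the first frame.
        self.frames.clear()
        for _ in range(self.frames.maxlen):
            self.frames.append(first_frame)
        return self.observe()

    def push(self, frame: np.ndarray) -> np.ndarray:
        self.frames.append(frame)
        return self.observe()

    def observe(self) -> np.ndarray:
        # Oldest frame first, newest frame last.
        return np.concatenate(list(self.frames)).astype(np.float32)


stacker = FrameStacker(n_frames_stacked=3)
initial = stacker.reset(np.array([1.0, 2.0]))
stacked = stacker.push(np.array([3.0, 4.0]))
print(stacked)  # [1. 2. 1. 2. 3. 4.]
```

Stacking gives the agent short-term history, which helps when a single frame does not reveal quantities such as velocity.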

render() → None: Renders the environment in the current render mode.

reset() → ndarray[Any, dtype[float32]]

Resets the environment to its initial state.

Returns:: The new state after resetting the environment.

step(action: int) → tuple[ndarray[Any, dtype[float32]], float, bool]

Advances the environment by one step given the specified action.

Parameters:: action – The action to take.
Returns:: A tuple containing the new state, reward, and a flag indicating episode termination.
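The reset and step methods combine into the usual episode loop. The sketch below runs without the package installed, so it uses CountdownEnv, a hypothetical toy environment that only imitates the documented return shapes: reset returns a state, and step returns a (state, reward, done) tuple.

```python
import numpy as np


class CountdownEnv:
    """Hypothetical toy environment; it imitates only the reset/step shapes."""

    def __init__(self, start: int = 3):
        self.start = start
        self.remaining = start

    def reset(self) -> np.ndarray:
        self.remaining = self.start
        return np.array([self.remaining], dtype=np.float32)

    def step(self, action: int):
        # Action 1 decrements the counter; any other action does nothing.
        if action == 1:
            self.remaining -= 1
        done = self.remaining == 0
        reward = 1.0 if done else 0.0
        return np.array([self.remaining], dtype=np.float32), reward, done


def run_episode(env, policy, max_steps: int = 100) -> float:
    """Run one episode and return the total reward collected."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        state, reward, done = env.step(policy(state))
        total_reward += reward
        if done:
            break
    return total_reward


total = run_episode(CountdownEnv(start=3), policy=lambda state: 1)
print(total)  # 1.0
```

The max_steps cap is a safeguard added for this sketch, so that a policy that never finishes the episode cannot loop forever.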

class academia.environments.base.GenericMiniGridWrapper(difficulty: int, difficulty_envid_map: dict, n_frames_stacked: int = 1, append_step_count: bool = False, random_state: int | None = None, **kwargs)

Bases: GenericGymnasiumWrapper

A wrapper for MiniGrid environments that makes them scalable.

Parameters:

difficulty – Difficulty level from 0 to 3, where 0 is the easiest and 3 is the hardest.
difficulty_envid_map – A dict that maps numerical difficulty level to gymnasium environment ID.
n_frames_stacked – How many most recent states should be stacked together to form a final state representation. Defaults to 1.
append_step_count – Whether to append the current step count to each state. Defaults to False.
random_state – Optional seed that controls the randomness of the environment. Defaults to None.
kwargs – Arguments passed down to gymnasium.make.

Raises:

ValueError – If the specified difficulty level is invalid.

step_count

Current step count since the last reset.

Type:: int

difficulty

Difficulty level. Higher values indicate more difficult environments.

Type:: int

n_frames_stacked

How many most recent states should be stacked together to form a final state representation.

Type:: int

append_step_count

Whether to append the current step count to each state.

Type:: bool
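A difficulty_envid_map could look like the dictionary below. The DoorKey IDs are chosen purely for illustration, and resolve_environment_id is a hypothetical helper that sketches how an invalid difficulty might lead to the documented ValueError; the package's actual lookup may differ.

```python
# Illustrative mapping from difficulty level to a MiniGrid environment ID.
DIFFICULTY_ENVID_MAP = {
    0: "MiniGrid-DoorKey-5x5-v0",
    1: "MiniGrid-DoorKey-6x6-v0",
    2: "MiniGrid-DoorKey-8x8-v0",
    3: "MiniGrid-DoorKey-16x16-v0",
}


def resolve_environment_id(difficulty: int, difficulty_envid_map: dict) -> str:
    """Hypothetical helper: look up the environment ID for a difficulty level."""
    try:
        return difficulty_envid_map[difficulty]
    except KeyError:
        valid = sorted(difficulty_envid_map)
        raise ValueError(
            f"Invalid difficulty {difficulty!r}; expected one of {valid}"
        ) from None


print(resolve_environment_id(2, DIFFICULTY_ENVID_MAP))  # MiniGrid-DoorKey-8x8-v0
```

Keeping the mapping as data rather than as branching logic means a new difficulty level only requires a new dictionary entry.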

class academia.environments.base.ScalableEnvironment(difficulty: int, n_frames_stacked: int = 1, **kwargs)

Bases: ABC

Base class for all environments used in this package. Scalability ensures environments can be used for Curriculum Learning.

Parameters:

difficulty – Difficulty level. Higher values indicate more difficult environments.
n_frames_stacked – How many most recent states should be stacked together to form a final state representation. Defaults to 1.

difficulty

Difficulty level. Higher values indicate more difficult environments.

Type:: int

n_frames_stacked

How many most recent states should be stacked together to form a final state representation.

Type:: int

N_ACTIONS: int: Number of available actions.

STATE_SHAPE: tuple[int, ...]: Shape of the state representation. Can vary for each instance.

abstract get_legal_mask() → ndarray[Any, dtype[int32]]

Returns:: A binary mask with 0s in place for illegal actions (actions that have no effect) and 1s for legal actions.

abstract observe() → Any

Returns:: The current state.

abstract render() → None: Renders the environment.

abstract reset() → Any

Resets the environment.

Returns:: A starting state.

abstract step(action: int) → tuple[Any, float, bool]

Takes the given action in the environment.

Parameters:: action – An action to take.
Returns:: A tuple consisting of a new state, reward and a flag indicating whether the state is terminal.
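A user-defined environment implements the five abstract methods and sets N_ACTIONS and STATE_SHAPE. The sketch below runs without the package installed, so it declares ScalableEnvironmentSketch, a stand-in that mirrors the interface documented above; real code would inherit from academia.environments.base.ScalableEnvironment instead. LineWorld, its reward values and its difficulty scaling are all hypothetical.

```python
from abc import ABC, abstractmethod

import numpy as np


class ScalableEnvironmentSketch(ABC):
    """Stand-in for ScalableEnvironment (assumption: mirrors the documented API)."""

    def __init__(self, difficulty: int, n_frames_stacked: int = 1, **kwargs):
        self.difficulty = difficulty
        self.n_frames_stacked = n_frames_stacked

    @abstractmethod
    def step(self, action: int): ...

    @abstractmethod
    def observe(self): ...

    @abstractmethod
    def get_legal_mask(self): ...

    @abstractmethod
    def reset(self): ...

    @abstractmethod
    def render(self): ...


class LineWorld(ScalableEnvironmentSketch):
    """Hypothetical corridor: walk right to the last cell to finish."""

    N_ACTIONS = 2        # 0 = move left, 1 = move right
    STATE_SHAPE = (1,)   # the state is just the agent's position

    def __init__(self, difficulty: int, n_frames_stacked: int = 1, **kwargs):
        super().__init__(difficulty, n_frames_stacked, **kwargs)
        # Higher difficulty means a longer corridor.
        self.length = 3 + 2 * difficulty
        self.position = 0

    def observe(self) -> np.ndarray:
        return np.array([self.position], dtype=np.float32)

    def get_legal_mask(self) -> np.ndarray:
        mask = np.ones(self.N_ACTIONS, dtype=np.int32)
        if self.position == 0:
            mask[0] = 0  # moving left at the left wall has no effect
        return mask

    def reset(self) -> np.ndarray:
        self.position = 0
        return self.observe()

    def step(self, action: int):
        move = 1 if action == 1 else -1
        self.position = max(0, self.position + move)
        done = self.position == self.length - 1
        reward = 1.0 if done else -0.1
        return self.observe(), reward, done

    def render(self) -> None:
        print("".join("A" if i == self.position else "." for i in range(self.length)))


env = LineWorld(difficulty=1)
state = env.reset()
env.render()  # A....
```

Because difficulty only changes the corridor length, the same agent code can be trained on progressively harder instances, which is the property Curriculum Learning relies on.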