academia.environments.base module

Module contents

Base classes for all environments available in this package. All user-defined environments should inherit from one of these classes (see Using your own environments for more information).

Exported classes:

class academia.environments.base.GenericAtariWrapper(difficulty: int, environment_id: str, n_frames_stacked: int = 1, append_step_count: bool = False, flatten_state: bool = False, random_state: int | None = None, **kwargs)

Bases: GenericGymnasiumWrapper

A wrapper for Atari environments that makes them scalable.

Parameters:
  • difficulty – Difficulty level from 0 to 3, where 0 is the easiest and 3 is the hardest.

  • environment_id – Gymnasium environment ID.

  • n_frames_stacked – How many most recent states should be stacked together to form a final state representation. Defaults to 1.

  • append_step_count – Whether or not to append the current step count to each state. Defaults to False.

  • flatten_state – Whether or not to flatten the state if it is represented by an RGB or grayscale image. If obs_type is set to "ram", this parameter has no effect. Defaults to False.

  • random_state – Optional seed that controls the randomness of the environment. Defaults to None.

  • kwargs – Arguments passed down to gymnasium.make.

Raises:

ValueError – If the specified difficulty level is invalid.

step_count

Current step count since the last reset.

Type:

int

difficulty

Difficulty level. Higher values indicate more difficult environments.

Type:

int

n_frames_stacked

How many most recent states should be stacked together to form a final state representation.

Type:

int

append_step_count

Whether or not to append the current step count to each state.

Type:

bool

flatten_state

Whether or not to flatten the state if it is represented by an RGB or grayscale image.

Type:

bool
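
As a rough illustration of how flatten_state and append_step_count change the shape of a state, the sketch below uses plain nested lists in place of image arrays. The helper names flatten_frame and with_step_count are hypothetical and are not part of the package, and the wrapper itself may order or scale the values differently.

```python
from typing import Sequence


def flatten_frame(frame: Sequence[Sequence[float]]) -> list[float]:
    """Flatten a 2-D grayscale frame (rows of pixels) into a 1-D list."""
    return [float(pixel) for row in frame for pixel in row]


def with_step_count(flat_state: Sequence[float], step_count: int) -> list[float]:
    """Append the step count as one extra feature at the end of a flat state.

    Assumption: the count goes last; the source does not say where it is placed.
    """
    return list(flat_state) + [float(step_count)]


# A 2x2 "image" becomes 4 values, or 5 once the step count is appended.
frame = [[0.0, 0.5], [1.0, 0.25]]
flat = flatten_frame(frame)
state = with_step_count(flat, step_count=7)
print(len(flat), len(state))
```

Appending the step count lets an agent distinguish otherwise identical frames that occur at different points in an episode.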

class academia.environments.base.GenericGymnasiumWrapper(difficulty: int, environment_id: str, n_frames_stacked: int = 1, append_step_count: bool = False, random_state: int | None = None, **kwargs)

Bases: ScalableEnvironment

A wrapper for Gymnasium environments. It gathers the common Gymnasium boilerplate in one place so that it does not have to be copied into every wrapper. At the same time, it aims to stay flexible enough to handle the generalized nature of Gymnasium’s API, such as varying state representations.

Parameters:
  • difficulty – The difficulty level of the environment.

  • environment_id – Gymnasium environment ID.

  • n_frames_stacked – How many most recent states should be stacked together to form a final state representation. Defaults to 1.

  • append_step_count – Whether or not to append the current step count to each state. Defaults to False.

  • random_state – Optional seed that controls the randomness of the environment. Defaults to None.

  • kwargs – Arguments passed down to gymnasium.make.

step_count

Current step count since the last reset.

Type:

int

difficulty

Difficulty level. Higher values indicate more difficult environments.

Type:

int

n_frames_stacked

How many most recent states should be stacked together to form a final state representation.

Type:

int

append_step_count

Whether or not to append the current step count to each state.

Type:

bool

Returns:

A binary mask with 0s in place for illegal actions (actions that have no effect) and 1s for legal actions.

Note

For all Gymnasium-based environments in this package it is hard to cheaply obtain a legal mask, so this default implementation always returns an array of ones.
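
To make the mask format concrete, here is a minimal sketch of an all-ones mask and of one way an agent might use a mask to restrict action selection. The function names all_legal_mask and greedy_legal_action are hypothetical, and the source does not describe how agents in this package consume the mask.

```python
from typing import Sequence


def all_legal_mask(n_actions: int) -> list[int]:
    """Return a mask that marks every action as legal (all ones)."""
    return [1] * n_actions


def greedy_legal_action(q_values: Sequence[float], mask: Sequence[int]) -> int:
    """Pick the highest-valued action among those the mask marks as legal."""
    legal = [i for i, flag in enumerate(mask) if flag == 1]
    if not legal:
        raise ValueError("mask marks every action as illegal")
    return max(legal, key=lambda i: q_values[i])


q_values = [0.1, 0.9, 0.3]
# With the default all-ones mask, the choice is an ordinary argmax.
unmasked_choice = greedy_legal_action(q_values, all_legal_mask(3))
# With action 1 masked out, the best remaining action is chosen instead.
masked_choice = greedy_legal_action(q_values, [1, 0, 1])
```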

observe() ndarray[Any, dtype[float32]]

Returns the current state of the environment. Performs state stacking if n_frames_stacked is greater than 1.

Returns:

The current state of the environment.
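
State stacking can be pictured as a fixed-length buffer of the most recent frames. The sketch below shows the general technique and is not the package's implementation: the FrameStack class is hypothetical, and filling the buffer with copies of the first frame after a reset is one common convention that the source does not confirm.

```python
from collections import deque


class FrameStack:
    """Keep the n most recent frames and expose them as one stacked state."""

    def __init__(self, n_frames_stacked: int = 1):
        if n_frames_stacked < 1:
            raise ValueError("n_frames_stacked must be at least 1")
        self.n_frames_stacked = n_frames_stacked
        # A deque with maxlen silently drops the oldest frame when full.
        self._frames = deque(maxlen=n_frames_stacked)

    def reset(self, first_frame):
        """Assumption: pad the buffer with copies of the first frame."""
        self._frames.clear()
        for _ in range(self.n_frames_stacked):
            self._frames.append(first_frame)
        return self.observe()

    def push(self, frame):
        """Add the newest frame and return the updated stacked state."""
        self._frames.append(frame)
        return self.observe()

    def observe(self):
        """Return the frames ordered from oldest to newest."""
        return list(self._frames)


stack = FrameStack(n_frames_stacked=3)
after_reset = stack.reset([0.0])
after_one_step = stack.push([1.0])
```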

render() None

Renders the environment in the current render mode.

reset() ndarray[Any, dtype[float32]]

Resets the environment to its initial state.

Returns:

The new state after resetting the environment.

step(action: int) tuple[ndarray[Any, dtype[float32]], float, bool]

Advances the environment by one step given the specified action.

Parameters:

action – The action to take.

Returns:

A tuple containing the new state, reward, and a flag indicating episode termination.
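
The reset and step methods are enough to drive a full episode. Note that step returns three values here, whereas Gymnasium's own step returns five (observation, reward, terminated, truncated, info). The sketch below runs an episode against a hypothetical stand-in environment, CountdownEnv, which only imitates the documented return shapes; run_episode and its max_steps guard are likewise illustrative names.

```python
from typing import Callable


class CountdownEnv:
    """Hypothetical stand-in exposing reset() and step() with the documented shapes."""

    def __init__(self, start: int = 3):
        self.start = start
        self.remaining = start

    def reset(self) -> list[float]:
        self.remaining = self.start
        return [float(self.remaining)]

    def step(self, action: int) -> tuple[list[float], float, bool]:
        self.remaining -= 1
        reward = 1.0 if action == 0 else 0.0  # action 0 is the rewarded one
        done = self.remaining == 0
        return [float(self.remaining)], reward, done


def run_episode(env, policy: Callable[[list[float]], int], max_steps: int = 1000) -> float:
    """Run one episode and return the sum of rewards."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        state, reward, done = env.step(policy(state))
        total_reward += reward
        if done:
            break
    return total_reward


total = run_episode(CountdownEnv(start=3), policy=lambda state: 0)
```

The max_steps guard is a design choice for the sketch: it keeps the loop finite even if an environment never reports termination.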

class academia.environments.base.GenericMiniGridWrapper(difficulty: int, difficulty_envid_map: dict, n_frames_stacked: int = 1, append_step_count: bool = False, random_state: int | None = None, **kwargs)

Bases: GenericGymnasiumWrapper

A wrapper for MiniGrid environments that makes them scalable.

Parameters:
  • difficulty – Difficulty level from 0 to 3, where 0 is the easiest and 3 is the hardest.

  • difficulty_envid_map – A dict that maps numerical difficulty level to gymnasium environment ID.

  • n_frames_stacked – How many most recent states should be stacked together to form a final state representation. Defaults to 1.

  • append_step_count – Whether or not to append the current step count to each state. Defaults to False.

  • random_state – Optional seed that controls the randomness of the environment. Defaults to None.

  • kwargs – Arguments passed down to gymnasium.make.

Raises:

ValueError – If the specified difficulty level is invalid.

step_count

Current step count since the last reset.

Type:

int

difficulty

Difficulty level. Higher values indicate more difficult environments.

Type:

int

n_frames_stacked

How many most recent states should be stacked together to form a final state representation.

Type:

int

append_step_count

Whether or not to append the current step count to each state.

Type:

bool
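
The difficulty_envid_map parameter can be thought of as a simple lookup from difficulty level to environment ID, with a ValueError for levels that are not in the map. The sketch below is an assumption about that behaviour rather than the package's code: resolve_environment_id and EXAMPLE_MAP are hypothetical names, and the DoorKey IDs are illustrative and may not match the mapping used by the package's own wrappers.

```python
def resolve_environment_id(difficulty: int, difficulty_envid_map: dict[int, str]) -> str:
    """Look up the environment ID for a difficulty level, rejecting unknown levels."""
    try:
        return difficulty_envid_map[difficulty]
    except KeyError:
        valid = sorted(difficulty_envid_map)
        raise ValueError(
            f"Difficulty must be one of {valid}, got {difficulty}"
        ) from None


# Illustrative mapping: larger grids for higher difficulty levels.
EXAMPLE_MAP = {
    0: "MiniGrid-DoorKey-5x5-v0",
    1: "MiniGrid-DoorKey-6x6-v0",
    2: "MiniGrid-DoorKey-8x8-v0",
    3: "MiniGrid-DoorKey-16x16-v0",
}

easiest_id = resolve_environment_id(0, EXAMPLE_MAP)
```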

class academia.environments.base.ScalableEnvironment(difficulty: int, n_frames_stacked: int = 1, **kwargs)

Bases: ABC

Base class for all environments used in this package. Scalability ensures environments can be used for Curriculum Learning.

Parameters:
  • difficulty – Difficulty level. Higher values indicate more difficult environments.

  • n_frames_stacked – How many most recent states should be stacked together to form a final state representation. Defaults to 1.

difficulty

Difficulty level. Higher values indicate more difficult environments.

Type:

int

n_frames_stacked

How many most recent states should be stacked together to form a final state representation.

Type:

int

N_ACTIONS: int

Number of available actions.

STATE_SHAPE: tuple[int, ...]

Shape of the state representation. Can vary for each instance.

Returns:

A binary mask with 0s in place for illegal actions (actions that have no effect) and 1s for legal actions.

abstract observe() Any

Returns:

The current state.

abstract render() None

Renders the environment.

abstract reset() Any

Resets the environment.

Returns:

A starting state.

abstract step(action: int) tuple[Any, float, bool]

Takes the given action in the environment.

Parameters:

action – An action to take.

Returns:

A tuple consisting of a new state, reward and a flag indicating whether the state is terminal.
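
A user-defined environment implements the abstract methods listed above. The sketch below shows the pattern on a toy task. To stay runnable without the package installed, it declares a local stand-in base class, ScalableEnvironmentStandIn, that mirrors only the members documented here; in real use the subclass would inherit from academia.environments.base.ScalableEnvironment, which may require additional methods not shown. LineWalk and its reward scheme are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any


class ScalableEnvironmentStandIn(ABC):
    """Local stand-in mirroring the documented interface (not the real class)."""

    N_ACTIONS: int
    STATE_SHAPE: tuple

    def __init__(self, difficulty: int, n_frames_stacked: int = 1, **kwargs):
        self.difficulty = difficulty
        self.n_frames_stacked = n_frames_stacked

    @abstractmethod
    def observe(self) -> Any: ...

    @abstractmethod
    def render(self) -> None: ...

    @abstractmethod
    def reset(self) -> Any: ...

    @abstractmethod
    def step(self, action: int) -> tuple: ...


class LineWalk(ScalableEnvironmentStandIn):
    """Toy task: walk right along a line; higher difficulty means a longer line."""

    N_ACTIONS = 2  # 0 = move left, 1 = move right
    STATE_SHAPE = (1,)

    def __init__(self, difficulty: int, **kwargs):
        if difficulty not in range(4):
            raise ValueError("difficulty must be between 0 and 3")
        super().__init__(difficulty, **kwargs)
        self.goal = 2 + difficulty  # difficulty scales the distance to the goal
        self.position = 0

    def observe(self) -> list[float]:
        return [float(self.position)]

    def render(self) -> None:
        cells = ("A" if i == self.position else "." for i in range(self.goal + 1))
        print("".join(cells))

    def reset(self) -> list[float]:
        self.position = 0
        return self.observe()

    def step(self, action: int) -> tuple[list[float], float, bool]:
        move = 1 if action == 1 else -1
        self.position = max(0, self.position + move)  # cannot walk past the left end
        done = self.position == self.goal
        return self.observe(), (1.0 if done else 0.0), done


env = LineWalk(difficulty=0)
start_state = env.reset()
first_step = env.step(1)
second_step = env.step(1)
```

Tying the goal distance to difficulty is what makes the toy task scalable: the same class can be instantiated at increasing levels as a curriculum progresses.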