academia.tools module
Submodules
Module contents
Miscellaneous classes and functions that don’t belong anywhere else.
The idea behind academia.tools is that no other module depends on it, but it itself depends on other modules. That is different from academia.utils, which works the other way around.
Exported classes:
- class academia.tools.AgentDebugger(agent: Agent, env: ScalableEnvironment, start_greedy: bool = False, start_paused: bool = False, key_action_map: dict = {}, run: bool = False, run_verbose: int = 1)
Bases:
object
Class allowing for easy agent debugging. Using this class the user can investigate agent’s behavior step-by-step with ability to check what the agent thinks about the current state. The user can also toggle between greedy and non-greedy behavior mid-episode.
Additionally the user can take over the agent at any moment by overriding the actions taken by the agent. This allows the user to put the agent in new, difficult or otherwise interesting situations and check how the agent behaves.
The user can interact with the debugger using the following keys:
‘t’ - terminate the current episode (and start a new one)
‘p’ - pause the environment
‘g’ - toggle between greedy and non-greedy behavior
‘ ‘ (space) - perform one step (only works when paused is set to True)
esc (‘\x1b’) - quit the debugger
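The key handling above can be sketched as a simple dispatcher. This is only an illustration of the documented behavior, not academia's actual implementation; the function name dispatch_key and the command names it returns are hypothetical.

```python
# Hypothetical sketch of how the reserved keys listed above might be
# dispatched. The command names returned here are illustrative only.
RESERVED_KEYS = ['t', '\x1b', ' ', 'p', 'g']

def dispatch_key(key: str, paused: bool) -> str:
    """Map a pressed key to a debugger command name."""
    if key == 't':
        return 'terminate_episode'
    if key == 'p':
        return 'toggle_pause'
    if key == 'g':
        return 'toggle_greedy'
    if key == ' ':
        # a single manual step is only meaningful while paused
        return 'step' if paused else 'noop'
    if key == '\x1b':  # esc
        return 'quit'
    return 'noop'
```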
The user can also interact with the environment using a custom key_action_map.
- Parameters:
agent – Agent object to be debugged.
env – Environment object with which the agent will interact. The environment should be instantiated with render_mode set to "human" for the user to see it.
start_greedy – Whether the agent should start with greedy behavior. Defaults to False.
start_paused – Whether the environment should start in a paused state. Defaults to False.
key_action_map – Dictionary between keyboard keys and environment actions. It accepts one character per action. If a digit character is not present in the dictionary, it will be automatically converted to the corresponding action. If any other character is not present in the dictionary, it will be converted to None and ignored. The dictionary does not accept reserved_keys as its keys. Defaults to an empty dictionary.
run – Whether to run the debugger after initialization. Defaults to False.
run_verbose – Verbosity level with which to automatically run the debugger if run is True. Defaults to 1.
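The key_action_map lookup rules above (mapped keys win, unmapped digits fall back to their integer action, everything else is ignored) can be sketched as follows. The function resolve_action is not part of academia; it only illustrates the documented conversion rules.

```python
# Hypothetical sketch of the key-to-action lookup described for
# key_action_map. Not academia's actual implementation.
from typing import Optional


def resolve_action(key: str, key_action_map: dict) -> Optional[int]:
    """Resolve a pressed key to an environment action, or None to ignore it."""
    if key in key_action_map:
        # an explicit mapping always takes precedence
        return key_action_map[key]
    if key.isdigit():
        # unmapped digit characters become the corresponding action
        return int(key)
    # any other unmapped character is ignored
    return None
```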
- env
Environment with which the agent interacts.
- Type:
ScalableEnvironment
- key_action_map
Dictionary between keyboard keys and environment actions.
- Type:
dict
- greedy
Whether the agent behaves in a greedy manner.
- Type:
bool
- paused
Whether the environment is paused (allows for step-by-step execution).
- Type:
bool
- input_timeout
Time (in seconds) to wait for user input. If the user does not press any key in that time frame the execution continues (unless paused is True).
- Type:
float
- episodes
Number of episodes run in the environment.
- Type:
int
- steps
Number of steps in the current episode.
- Type:
int
- running
Whether the debugger is currently running.
- Type:
bool
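A timed key read like the one input_timeout describes can be sketched with select on a file descriptor. This is a POSIX-only illustration under assumed behavior; read_key is a hypothetical helper and academia's actual input handling may differ.

```python
# Hypothetical sketch of a timed single-key read (POSIX only).
# Returns the pressed character, or None once the timeout elapses.
import os
import select


def read_key(fd: int, timeout: float):
    """Read one character from fd, waiting at most timeout seconds."""
    ready, _, _ = select.select([fd], [], [], timeout)
    if ready:
        return os.read(fd, 1).decode()
    return None
```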
Examples
Initialization:
>>> from academia.tools import AgentDebugger
>>> from academia.environments import LavaCrossing
>>> from academia.agents import DQNAgent
>>> from academia.utils.models import lava_crossing
>>>
>>> agent = DQNAgent(lava_crossing.MLPDQN, 3)
>>> env = LavaCrossing(difficulty=0, render_mode='human')
>>>
>>> # auto running with keymap example
>>> AgentDebugger(agent, env, run=True, key_action_map={
>>>     'w': 2,
>>>     'a': 0,
>>>     'd': 1,
>>> })
>>>
>>> # manual running
>>> ad = AgentDebugger(agent, env)
>>> ad.run(verbose=5)
- reserved_keys = ['t', '\x1b', ' ', 'p', 'g']
A list of reserved keys that cannot be used by key_action_map.
- run(verbose: int = 0) → None
Runs the agent debugger with the specified verbosity level.

Verbosity level   What is logged
0                 no logging (except for errors)
1                 Episode Rewards
2                 Step Rewards
3                 Agent Thoughts

- Parameters:
verbose – Verbosity level.
- thoughts_handlers = {'DQNAgent': <function _dqnagent_thoughts_handler>, 'PPOAgent': <function _ppoagent_thoughts_handler>, 'QLAgent': <function _qlagent_thoughts_handler>, 'SarsaAgent': <function _sarsa_thoughts_handler>}
A class attribute that stores a global list of available agent thought handlers. Thought handlers are functions that accept an agent object and an observed state and return a user-defined “thought”, e.g. q-values predicted by the agent.
These functions are stored with the following signature:
>>> def my_thought_handler(agent: Agent, state: Any) -> str:
>>>     pass
where agent is the agent object to handle and state is the observed state of the environment on which we want to get the agent’s thoughts. There are a few default thought handlers corresponding to the implemented agents:
'PPOAgent' - returns the predicted probabilities of actions when in discrete mode, and the mean action when in continuous mode, as well as the state value as predicted by the critic.
'DQNAgent' - returns the predicted q-values of each action.
'QLAgent' - returns the predicted q-values of each action.
'SarsaAgent' - returns the predicted q-values of each action.
Example
>>> from academia.agents.base import Agent
>>>
>>> # custom agent class
>>> class MyAgent(Agent):
>>>     pass
>>>
>>> def my_agent_handler(agent: Agent, state: Any):
>>>     pass
>>>
>>> # adds a new handler to the dictionary
>>> # the key should be a string containing the name of the class
>>> AgentDebugger.thoughts_handlers['MyAgent'] = my_agent_handler
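A fleshed-out handler might look like the sketch below. To keep the example self-contained it uses a stand-in agent class instead of importing academia; FakeTabularAgent, its q_table attribute, and fake_q_thoughts_handler are all hypothetical names, and a real handler would read values from the actual agent (e.g. a Q-table lookup or a network forward pass).

```python
# Illustrative thought handler for a q-table-based agent.
# FakeTabularAgent is a stand-in; it is not part of academia.
from typing import Any


class FakeTabularAgent:
    """Stand-in agent exposing per-state q-values via a q-table."""
    def __init__(self):
        self.q_table = {'s0': [0.1, 0.5, 0.2]}


def fake_q_thoughts_handler(agent: Any, state: Any) -> str:
    """Format the agent's q-values for the observed state as a string."""
    q_values = agent.q_table.get(state, [])
    return 'q-values: ' + ', '.join(f'{q:.2f}' for q in q_values)
```

Registering it would then follow the same pattern as above, keyed by the class name string.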