Configuring tasks & curricula
Intro
In the academia package, there are two ways of initializing tasks and curricula.
The first is through the LearningTask and Curriculum constructors.
The other uses configuration files in the YAML format together with the
load_task_config() or load_curriculum_config() functions.
Here is an example of initializing a simple curriculum directly in a script:
from academia.curriculum import LearningTask, Curriculum
from academia.environments import LavaCrossing

# define tasks
task1 = LearningTask(
    env_type=LavaCrossing,
    env_args={'difficulty': 0, 'render_mode': 'human', 'append_step_count': True},
    stop_conditions={'max_episodes': 500},
)
task2 = LearningTask(
    env_type=LavaCrossing,
    env_args={'difficulty': 1, 'render_mode': 'human', 'append_step_count': True},
    stop_conditions={'max_episodes': 1000},
)

# define a curriculum
curriculum = Curriculum(
    tasks=[task1, task2],
    output_dir='./my_curriculum/',
)
This code creates a curriculum which comprises the first two levels of the Lava Crossing environment. An identical curriculum can be defined with the following configuration file:
output_dir: './my_curriculum/'
order:
- 0
- 1
tasks:
  0:
    env_args:
      difficulty: 0
      render_mode: human
      append_step_count: True
    env_type: academia.environments.LavaCrossing
    evaluation_interval: 100
    stop_conditions:
      max_episodes: 500
  1:
    env_args:
      difficulty: 1
      render_mode: human
      append_step_count: True
    env_type: academia.environments.LavaCrossing
    evaluation_interval: 100
    stop_conditions:
      max_episodes: 1000
This can then be loaded with a single line of code:
from academia.curriculum import load_curriculum_config

curriculum = load_curriculum_config('my_config.curriculum.yml')
Neither method is inherently better; which one to use is a matter of preference. Initializing through code gives more flexibility and can be easier for users not familiar with academia’s API. Configuration files, on the other hand, allow you to move most of the configuration logic out of the source code. They can also make large configurations more concise and readable, which might make them the better option for complex experiments.
To learn about the specific parameters for environments, tasks and curricula, explore the rest of the documentation to get familiar with academia’s functions and classes. The rest of this guide focuses on YAML configuration files. More specifically, it covers some special features which make this method flexible and allow users to avoid duplication in their configurations.
Note
While the configuration has to be in the YAML format, academia does not
enforce any particular file extensions. However, it is often good practice
to differentiate task and curricula configuration files by using extensions
such as .task.yml or .curriculum.yml.
Default task parameters inside a curriculum
Tasks inside a single curriculum often share similar sets of parameter values. For example, all of them could use the same environment, but with different difficulty levels. A curriculum configuration file allows you to define a set of default parameters for the tasks inside that curriculum.
In the example configuration above, both tasks share most of their configuration, which leads to a lot of duplication. The only pieces of configuration unique to each task are marked with comments below:
output_dir: './my_curriculum/'
order:
- 0
- 1
tasks:
  0:
    env_args:
      difficulty: 0  # unique to this task
      render_mode: human
      append_step_count: True
    env_type: academia.environments.LavaCrossing
    evaluation_interval: 100
    stop_conditions:
      max_episodes: 500  # unique to this task
  1:
    env_args:
      difficulty: 1  # unique to this task
      render_mode: human
      append_step_count: True
    env_type: academia.environments.LavaCrossing
    evaluation_interval: 100
    stop_conditions:
      max_episodes: 1000  # unique to this task
To address this issue, a special _default task can be defined for
the curriculum, which provides default parameter values for all tasks
defined or loaded in this curriculum (more on loading later). The
configuration listed above can be simplified in the following way:
output_dir: './my_curriculum/'
order:
- 0
- 1
tasks:
  _default:
    env_args:
      render_mode: human
      append_step_count: True
    env_type: academia.environments.LavaCrossing
    evaluation_interval: 100
  0:
    env_args:
      difficulty: 0
    stop_conditions:
      max_episodes: 500
  1:
    env_args:
      difficulty: 1
    stop_conditions:
      max_episodes: 1000
Now, all common configuration has been moved to the _default task, and
the tasks define only their unique arguments. Note that the _default task
can also be used in the curriculum, just as any other task. All we need to do
is to supply all required parameters to it. Consider the following configuration,
which again is equivalent to the ones listed before:
output_dir: './my_curriculum/'
order:
- easier
- _default
tasks:
  _default:
    env_args:
      difficulty: 1
      render_mode: human
      append_step_count: True
    env_type: academia.environments.LavaCrossing
    evaluation_interval: 100
    stop_conditions:
      max_episodes: 1000
  easier:
    env_args:
      difficulty: 0
    stop_conditions:
      max_episodes: 500
In curriculum learning, the final environment is treated as the most important one,
and all other tasks are only there to speed up the training. It makes sense then to
mark the target environment as _default in the configuration, and then for easier
tasks define just their unique pieces of configuration. This is exactly what we do
in the above example. Notice that both _default and easier tasks define
the environment difficulty, as well as a max episodes stop condition. Each task can
override the default configuration, and this is exactly what happens here.
For instance, the easier task will now end after 500 episodes. Had we not
specified this stop condition here, it would end after 1000 episodes,
just as declared in the _default task.
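The override behaviour described above can be pictured as a recursive merge of two dictionaries. The sketch below is only an illustration, not academia’s actual implementation: the helper name merge_with_defaults is hypothetical, and it assumes that nested mappings such as env_args are merged key by key, which is what the equivalence of the configurations above suggests.

```python
from copy import deepcopy


def merge_with_defaults(defaults: dict, task: dict) -> dict:
    """Return a new dict in which `task` values override `defaults`.

    Nested dicts are merged key by key (an assumption made for this sketch).
    """
    merged = deepcopy(defaults)
    for key, value in task.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # both sides are mappings: merge them instead of replacing
            merged[key] = merge_with_defaults(merged[key], value)
        else:
            # scalars, lists and new keys simply take the task's value
            merged[key] = deepcopy(value)
    return merged


# the `_default` and `easier` tasks from the configuration above, as plain dicts
default_task = {
    'env_type': 'academia.environments.LavaCrossing',
    'env_args': {'difficulty': 1, 'render_mode': 'human', 'append_step_count': True},
    'evaluation_interval': 100,
    'stop_conditions': {'max_episodes': 1000},
}
easier_task = {
    'env_args': {'difficulty': 0},
    'stop_conditions': {'max_episodes': 500},
}

resolved_easier = merge_with_defaults(default_task, easier_task)
print(resolved_easier['env_args'])
print(resolved_easier['stop_conditions'])
```

In this sketch the easier task keeps render_mode and append_step_count from the defaults while its own difficulty and max_episodes take precedence, and the default task itself is left unmodified.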
Loading configurations from external files
It is not uncommon for multiple curricula to share common tasks. Let us say we want to design two curricula for the Door Key environment. Consider the difficulty level of 2 as the target difficulty for this environment. In the first curriculum, we want an agent to go through all the difficulty levels up to the level 2, starting at level 0. In the other curriculum, we want it to skip the level 1 and go straight from level 0 to level 2. Below are example configurations for both scenarios:
# full curriculum: levels 0, 1 and 2
order:
  - 0
  - 1
  - 2
tasks:
  _default:
    env_type: academia.environments.DoorKey
    stop_conditions:
      min_evaluation_score: 0.9
    evaluation_interval: 100
    evaluation_count: 25
    include_init_eval: True
  0:
    name: 'Easy task'
    env_args:
      difficulty: 0
  1:
    name: 'Intermediate task'
    env_args:
      difficulty: 1
  2:
    name: 'Hard task'
    env_args:
      difficulty: 2
    stop_conditions:
      max_episodes: 1000

# task-skip curriculum: level 0, then level 2
order:
  - 0
  - 2
tasks:
  _default:
    env_type: academia.environments.DoorKey
    stop_conditions:
      min_evaluation_score: 0.9
    evaluation_interval: 100
    evaluation_count: 25
    include_init_eval: True
  0:
    name: 'Easy task'
    env_args:
      difficulty: 0
  2:
    name: 'Hard task'
    env_args:
      difficulty: 2
    stop_conditions:
      max_episodes: 1000
We use the _default task to avoid configuration duplication in each of the
files. Still, the configurations for tasks named “Easy task” and “Hard task” are identical
in both files. It would be nice to extract them to separate files and load them in
both of the above configurations. Luckily, we can do this using the special attribute
named _load. It tells the configuration loaders to load YAML attributes from another
file. This way, we can split the above configurations into multiple files to create
an equivalent configuration:
# easy.task.yml
name: 'Easy task'
env_args:
  difficulty: 0

# intermediate.task.yml
name: 'Intermediate task'
env_args:
  difficulty: 1

# hard.task.yml
name: 'Hard task'
env_args:
  difficulty: 2
stop_conditions:
  max_episodes: 1000
# full curriculum
order:
  - 0
  - 1
  - 2
tasks:
  _default:
    env_type: academia.environments.DoorKey
    stop_conditions:
      min_evaluation_score: 0.9
    evaluation_interval: 100
    evaluation_count: 25
    include_init_eval: True
  0:
    _load: ./easy.task.yml
  1:
    _load: ./intermediate.task.yml
  2:
    _load: ./hard.task.yml

# task-skip curriculum
order:
  - 0
  - 2
tasks:
  _default:
    env_type: academia.environments.DoorKey
    stop_conditions:
      min_evaluation_score: 0.9
    evaluation_interval: 100
    evaluation_count: 25
    include_init_eval: True
  0:
    _load: ./easy.task.yml
  2:
    _load: ./hard.task.yml
Note that the path provided for the _load attribute must be relative to the
current configuration file.
The _load special attribute can be used not just to load tasks. It is designed to be
able to load attributes from any YAML file, which makes it very versatile. For example,
in the above configurations, since the _default task is also shared across both curricula,
we could extract its parameters into a separate file. It could look as follows
for the full curriculum (and analogously for the task-skip curriculum):
# task-defaults.yml
env_type: academia.environments.DoorKey
stop_conditions:
  min_evaluation_score: 0.9
evaluation_interval: 100
evaluation_count: 25
include_init_eval: True

# full curriculum
order:
  - 0
  - 1
  - 2
tasks:
  _default:
    _load: ./task-defaults.yml
  0:
    _load: ./easy.task.yml
  1:
    _load: ./intermediate.task.yml
  2:
    _load: ./hard.task.yml
The _load special attribute can also be chained, i.e. you can load a file which
itself contains _load, and that nested load will be handled as well. Also, just like
with the _default task, attributes loaded with _load can be overridden if you specify
them alongside the _load attribute:
order:
  - 0
  - 1
  - 2
tasks:
  _default:
    _load: ./task-defaults.yml
    # this will override the evaluation_count of 25 from ./task-defaults.yml:
    evaluation_count: 10
  0:
    _load: ./easy.task.yml
  1:
    _load: ./intermediate.task.yml
  2:
    _load: ./hard.task.yml
This is just one way to transform these configurations, and there could possibly be even
better ways to structure them. Remember that the _load special attribute can be used in
both tasks and curricula configurations.
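To make chaining and overriding more concrete, here is a minimal sketch of how _load resolution could work. It is not academia’s actual loader: the resolve_loads helper, the in-memory FILES mapping standing in for files on disk, and the ./easy-short.task.yml file are all hypothetical. The sketch also skips resolving paths relative to the current file and only overrides top-level keys.

```python
from copy import deepcopy

# Hypothetical stand-in for parsed YAML files on disk (path -> content).
FILES = {
    './task-defaults.yml': {
        'env_type': 'academia.environments.DoorKey',
        'stop_conditions': {'min_evaluation_score': 0.9},
        'evaluation_interval': 100,
        'evaluation_count': 25,
    },
    './easy.task.yml': {
        'name': 'Easy task',
        'env_args': {'difficulty': 0},
    },
    # hypothetical file that chains another load and adds a stop condition
    './easy-short.task.yml': {
        '_load': './easy.task.yml',
        'stop_conditions': {'max_episodes': 200},
    },
}


def resolve_loads(node, files):
    """Recursively replace `_load` entries with the content they point to.

    Keys specified alongside `_load` override the loaded ones
    (top-level override only, an assumption made for this sketch).
    """
    if isinstance(node, dict):
        resolved = {}
        if '_load' in node:
            # the loaded content may contain `_load` itself, hence the recursion
            resolved = resolve_loads(deepcopy(files[node['_load']]), files)
        for key, value in node.items():
            if key != '_load':
                resolved[key] = resolve_loads(value, files)
        return resolved
    if isinstance(node, list):
        return [resolve_loads(item, files) for item in node]
    return node


curriculum_config = {
    'order': [0],
    'tasks': {
        '_default': {'_load': './task-defaults.yml', 'evaluation_count': 10},
        0: {'_load': './easy-short.task.yml'},
    },
}

resolved = resolve_loads(curriculum_config, FILES)
print(resolved['tasks']['_default']['evaluation_count'])
print(resolved['tasks'][0])
```

Here the _default task ends up with an evaluation_count of 10 instead of 25, and task 0 receives the attributes of ./easy.task.yml through the chained load together with its own stop condition.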
Variables in configuration files
So far, all the configuration files we looked at had their parameter values hardcoded. In some cases, however, we might want to provide some of the parameters dynamically. For example, let us say we want to run a task 10 times so that we can average the results of our experiment across independent runs. Consider the following configuration file and script:
# doorkey.task.yml
env_type: academia.environments.DoorKey
env_args:
  difficulty: 0
  append_step_count: True
  random_state: 123
stop_conditions:
  min_evaluation_score: 0.9
evaluation_interval: 100
evaluation_count: 25
include_init_eval: True

from academia.curriculum import load_task_config

stats = []

for run_no in range(10):
    agent = ...  # initialise some agent here
    task = load_task_config('./doorkey.task.yml')
    task.run(agent)
    stats.append(task.stats)
Note that we specify a random state to the Door Key environment to ensure reproducibility
of our experiments. However, it could be better to pass a different random seed to the
environment for each individual run. We can achieve this using variables inside our
configuration. Variables are marked by a dollar sign $ in the configuration files
and can be used as follows:
# doorkey.task.yml
env_type: academia.environments.DoorKey
env_args:
  difficulty: 0
  append_step_count: True
  random_state: $env_random_state
stop_conditions:
  min_evaluation_score: 0.9
evaluation_interval: 100
evaluation_count: 25
include_init_eval: True

from academia.curriculum import load_task_config

stats = []

for run_no in range(10):
    agent = ...  # initialise some agent here
    # use the run number as the environment's random seed
    task = load_task_config('./doorkey.task.yml', variables={
        'env_random_state': run_no,
    })
    task.run(agent)
    stats.append(task.stats)
The same syntax applies for the load_curriculum_config() function. Variables can also be
used in external files loaded via the _load attribute - the same variables dictionary
will be used to resolve variables in any loaded files.
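Conceptually, resolving variables amounts to walking the parsed configuration and replacing every value marked with $ by the matching entry of the variables dictionary. The sketch below illustrates that idea on plain dictionaries. The resolve_variables helper is hypothetical, and raising a KeyError for a missing variable is an assumption of this sketch rather than documented academia behaviour.

```python
def resolve_variables(node, variables: dict):
    """Return a copy of `node` with every '$name' string replaced by variables['name']."""
    if isinstance(node, dict):
        return {key: resolve_variables(value, variables) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_variables(item, variables) for item in node]
    if isinstance(node, str) and node.startswith('$'):
        # a missing variable raises KeyError here (assumption of this sketch)
        return variables[node[1:]]
    return node


# the task configuration from above, as it might look after YAML parsing
task_config = {
    'env_type': 'academia.environments.DoorKey',
    'env_args': {
        'difficulty': 0,
        'append_step_count': True,
        'random_state': '$env_random_state',
    },
    'stop_conditions': {'min_evaluation_score': 0.9},
}

# one resolved configuration per run, each with its own random seed
run_configs = [
    resolve_variables(task_config, {'env_random_state': run_no})
    for run_no in range(3)
]
print([config['env_args']['random_state'] for config in run_configs])
```

Because the substituted value is taken from a Python dictionary, it does not have to be a number or a string; any Python object can be passed this way.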
Variables can also be useful for setting parameters which cannot be set directly
in the configuration files. Good examples of such parameters are task_callback for
Curriculum and episode_callback for LearningTask. In the following
example, we use a variable to configure the former:
# my_curriculum.yml
output_dir: './my_curriculum/'
task_callback: $task_callback
order:
- 0
- 1
tasks:
  _default:
    env_args:
      render_mode: human
      append_step_count: True
    env_type: academia.environments.LavaCrossing
    evaluation_interval: 100
  0:
    env_args:
      difficulty: 0
    stop_conditions:
      max_episodes: 500
  1:
    env_args:
      difficulty: 1
    stop_conditions:
      max_episodes: 1000

from academia.agents.base import Agent
from academia.curriculum import LearningStats, load_curriculum_config


def my_task_callback(agent: Agent, stats: LearningStats, task_id: str) -> None:
    agent.reset_exploration(0.8)


# the callback is passed in as the value of the $task_callback variable
curriculum = load_curriculum_config('my_curriculum.yml', variables={
    'task_callback': my_task_callback,
})
These examples cover just the most common use cases. Variables have
been designed with versatility in mind, and can also be used to
specify full tasks inside a curriculum, or to order tasks in a curriculum.
Basically, any attribute in the configuration (except for _load) can have a
variable assigned to it, with a value provided at runtime upon loading.
Note
Variables cannot be used to dynamically provide paths for the _load attribute.
This is because by design all loads are handled before variables are resolved.