Flow Module

baconian.core.flow.train_test_flow.Flow

class baconian.core.flow.train_test_flow.Flow(func_dict)

Interface of the experiment flow module. It defines the workflow of reinforcement learning experiments.

__init__(func_dict)

Constructor for Flow.

Parameters:func_dict (dict) – the functions, together with their arguments, that the Flow will call
_call_func(key, **extra_kwargs)

Call a function that is pre-defined in self.func_dict

Parameters:
  • key (str) – name of the function, e.g., train, test, sample.
  • extra_kwargs – extra keyword arguments to pass to the called function
Returns:

The return value of the called function if self.func_dict contains that function; otherwise None.

Return type:

_launch() → bool

Abstract method to be implemented by subclass for a certain workflow.

Returns:True if the flow executed correctly and finished
Return type:bool
launch() → bool

Launch the flow and run it until it finishes or a system-allowed error is caught (e.g., running out of GPU memory), so that the log is saved safely.

Returns:True if the flow executed correctly and finished
Return type:bool
required_func = ()
required_key_dict = {}
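
The dispatch and launch behaviour described above can be illustrated with a minimal sketch. SketchFlow and OneShotFlow are hypothetical stand-ins, not the baconian classes; the sketch assumes the 'func'/'args'/'kwargs' entry layout documented for func_dict, and it uses MemoryError as a placeholder for the system-allowed errors, whose actual list is not given here.

```python
class SketchFlow:
    """Hypothetical stand-in for Flow; not the baconian implementation."""

    def __init__(self, func_dict):
        self.func_dict = func_dict

    def _call_func(self, key, **extra_kwargs):
        # Return None when no function is registered under this key.
        if key not in self.func_dict:
            return None
        entry = self.func_dict[key]
        # Merge the stored kwargs with the extra ones given at call time.
        kwargs = dict(entry.get("kwargs", {}))
        kwargs.update(extra_kwargs)
        return entry["func"](*entry.get("args", ()), **kwargs)

    def _launch(self):
        # Subclasses define the concrete workflow here.
        raise NotImplementedError

    def launch(self):
        # Assumption: MemoryError stands in for the "system-allowed" errors,
        # and a caught error is reported as an unfinished flow (False).
        try:
            return self._launch()
        except MemoryError:
            return False


class OneShotFlow(SketchFlow):
    """Hypothetical subclass that calls 'train' once and finishes."""

    def _launch(self):
        self._call_func("train", steps=2)
        return True


calls = []
flow = OneShotFlow({
    "train": {
        "func": lambda tag, steps: calls.append((tag, steps)),
        "args": ["demo"],
        "kwargs": {},
    }
})
finished = flow.launch()
missing_result = flow._call_func("test")  # no 'test' entry was registered
```

Keeping the workflow in _launch and the error handling in launch means a subclass only has to describe the order of calls.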
class baconian.core.flow.train_test_flow.TrainTestFlow(train_sample_count_func, config_or_config_dict: (<class 'baconian.config.dict_config.DictConfig'>, <class 'dict'>), func_dict: dict)

A typical sampling, training, and testing workflow used by most model-free and model-based reinforcement learning methods. It repeats three steps: sampling (saving to memory if off-policy) -> training (from memory if off-policy, from samples if on-policy) -> testing.

__init__(train_sample_count_func, config_or_config_dict: (<class 'baconian.config.dict_config.DictConfig'>, <class 'dict'>), func_dict: dict)

Constructor of TrainTestFlow

Parameters:
  • train_sample_count_func (method) – a function that returns how many training samples the agent has collected so far.
  • config_or_config_dict (Config or dict) – a Config or a dict that should have the keys: (TEST_EVERY_SAMPLE_COUNT, TRAIN_EVERY_SAMPLE_COUNT, START_TRAIN_AFTER_SAMPLE_COUNT, START_TEST_AFTER_SAMPLE_COUNT)
  • func_dict (dict) – function dict holding the keys ‘sample’, ‘train’, ‘test’. Each item in the dict should itself be a dict holding the keys ‘func’, ‘args’, ‘kwargs’.
_is_ended()
Returns:True if the experiment has ended
Return type:bool
_launch() → bool

Launch the flow and run it until it finishes or a system-allowed error is caught (e.g., running out of GPU memory), so that the log is saved safely.

Returns:True if the flow executed correctly and finished
Return type:bool
required_func = ('train', 'test', 'sample')
required_key_dict = {'START_TEST_AFTER_SAMPLE_COUNT': 1, 'START_TRAIN_AFTER_SAMPLE_COUNT': 1, 'TEST_EVERY_SAMPLE_COUNT': 1000, 'TRAIN_EVERY_SAMPLE_COUNT': 1000}
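
One way to read the four configuration keys is as thresholds on the collected sample count. The sketch below is an assumption about that scheduling, not the baconian implementation: run_sketch is a hypothetical function, it collects one sample per iteration, and its internal counter plays the role of train_sample_count_func. The values in the config are chosen small for illustration and are not the defaults listed above.

```python
config = {
    "TEST_EVERY_SAMPLE_COUNT": 4,
    "TRAIN_EVERY_SAMPLE_COUNT": 2,
    "START_TRAIN_AFTER_SAMPLE_COUNT": 2,
    "START_TEST_AFTER_SAMPLE_COUNT": 4,
}


def run_sketch(config, total_samples):
    """Hypothetical scheduler: sample once, then train and test when due."""
    log = []
    count = 0        # stands in for the value of train_sample_count_func()
    last_train = 0   # sample count at the most recent training step
    last_test = 0    # sample count at the most recent test step
    while count < total_samples:
        count += 1
        log.append("sample")
        if (count >= config["START_TRAIN_AFTER_SAMPLE_COUNT"]
                and count - last_train >= config["TRAIN_EVERY_SAMPLE_COUNT"]):
            log.append("train")
            last_train = count
        if (count >= config["START_TEST_AFTER_SAMPLE_COUNT"]
                and count - last_test >= config["TEST_EVERY_SAMPLE_COUNT"]):
            log.append("test")
            last_test = count
    return log


log = run_sketch(config, total_samples=4)
```

With these values, training happens after every second sample and the first test happens once four samples have been collected.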

baconian.core.flow.dyna_flow.DynaFlow

class baconian.core.flow.dyna_flow.DynaFlow(train_sample_count_func, config_or_config_dict: (<class 'baconian.config.dict_config.DictConfig'>, <class 'dict'>), func_dict: dict)

A typical flow for model-based algorithms. It is not restricted to Dyna algorithms and can be used by others.

__init__(train_sample_count_func, config_or_config_dict: (<class 'baconian.config.dict_config.DictConfig'>, <class 'dict'>), func_dict: dict)
Parameters:
  • train_sample_count_func (method) – a function that returns how many training samples the agent has collected so far.
  • config_or_config_dict (Config or dict) – a Config or a dict that should have the keys listed in this class's required_key_dict (e.g., TRAIN_DYNAMICS_EVERY_REAL_SAMPLE_COUNT, TEST_ALGO_EVERY_REAL_SAMPLE_COUNT, WARM_UP_DYNAMICS_SAMPLES)
  • func_dict (dict) – function dict holding the keys listed in this class's required_func (e.g., ‘train_algo’, ‘train_dynamics’, ‘sample_from_real_env’). Each item in the dict should itself be a dict holding the keys ‘func’, ‘args’, ‘kwargs’.
_is_ended()
Returns:True if the experiment has ended
Return type:bool
_launch() → bool

Launch the flow and run it until it finishes or a system-allowed error is caught (e.g., running out of GPU memory), so that the log is saved safely.

Returns:True if the flow executed correctly and finished
Return type:bool
required_func = ('train_algo', 'train_algo_from_synthesized_data', 'train_dynamics', 'test_algo', 'test_dynamics', 'sample_from_real_env', 'sample_from_dynamics_env')
required_key_dict = {'START_TEST_ALGO_AFTER_SAMPLE_COUNT': 1, 'START_TEST_DYNAMICS_AFTER_SAMPLE_COUNT': 1, 'START_TRAIN_ALGO_AFTER_SAMPLE_COUNT': 1, 'START_TRAIN_DYNAMICS_AFTER_SAMPLE_COUNT': 1, 'TEST_ALGO_EVERY_REAL_SAMPLE_COUNT': 1000, 'TEST_DYNAMICS_EVERY_REAL_SAMPLE_COUNT': 1000, 'TRAIN_ALGO_EVERY_REAL_SAMPLE_COUNT_FROM_DYNAMICS_ENV': 1000, 'TRAIN_ALGO_EVERY_REAL_SAMPLE_COUNT_FROM_REAL_ENV': 1000, 'TRAIN_DYNAMICS_EVERY_REAL_SAMPLE_COUNT': 1000, 'WARM_UP_DYNAMICS_SAMPLES': 1000}
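
Because DynaFlow expects seven named functions, it can help to check a func_dict before constructing the flow. The helper below is a hypothetical sketch and not part of baconian: REQUIRED_FUNC is copied from required_func above, and the 'func'/'args'/'kwargs' entry layout is assumed to match the one documented for func_dict.

```python
# Copied from DynaFlow.required_func as listed above.
REQUIRED_FUNC = (
    "train_algo",
    "train_algo_from_synthesized_data",
    "train_dynamics",
    "test_algo",
    "test_dynamics",
    "sample_from_real_env",
    "sample_from_dynamics_env",
)


def missing_entries(func_dict, required=REQUIRED_FUNC):
    """Hypothetical check: list required names that are absent or malformed."""
    problems = []
    for name in required:
        entry = func_dict.get(name)
        if entry is None:
            problems.append(name)
        elif not all(key in entry for key in ("func", "args", "kwargs")):
            problems.append(name)
    return problems


def noop(*args, **kwargs):
    """Placeholder callable used only to fill the example dict."""
    return None


func_dict = {
    name: {"func": noop, "args": [], "kwargs": {}} for name in REQUIRED_FUNC
}
complete = missing_entries(func_dict)

del func_dict["test_dynamics"]
incomplete = missing_entries(func_dict)
```

Running such a check first reports every missing name at once instead of failing on the first absent key during an experiment.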