amago.envs.amago_env#

Main environment wrapper stack for sequence data and parallel logging.

Classes

AMAGOEnv(env[, env_name, batched_envs])

A wrapper that connects gymnasium envs to the Experiment

EnvCreator(make_env, ...)

Utility that builds the SequenceWrapper(AMAGOEnv()) wrapper stack for AsyncVectorEnv

ReturnHistory(env_name)

Tracks rollout returns grouped by environment name.

SequenceWrapper(env, save_trajs_to[, ...])

A wrapper that saves trajectory files to disk and logs rollout metrics.

SpecialMetricHistory(env_name)

Tracks custom metrics logged through the env's info dict (keys prefixed with AMAGO_LOG_METRIC), grouped by environment name.

class AMAGOEnv(env, env_name=None, batched_envs=1)[source]#

Bases: Wrapper

A wrapper that connects gymnasium envs to the Experiment

Parameters:
  • env (Env) – The gymnasium env to wrap. If the env uses continuous control, its action space is rescaled to [-1, 1].

  • env_name (str | None) – The name of the environment. Used for logging metrics.

  • batched_envs (int) – Alerts the Experiment that this env is already vectorized, with observations batched along a leading dimension of size batched_envs (i.e. shaped (batched_envs, …)).
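
For illustration, a minimal sketch of wrapping a single gymnasium env (the "CartPole-v1" id and "cartpole" name are placeholder choices, not part of the API):

    import gymnasium as gym
    from amago.envs.amago_env import AMAGOEnv

    # Wrap a standard gymnasium env so the Experiment can interact with it.
    env = AMAGOEnv(gym.make("CartPole-v1"), env_name="cartpole", batched_envs=1)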

property env_name#

The name of the current environment. It can change dynamically between resets in a multi-task setting, and is used for logging metrics and naming saved .traj files.
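
In a multi-task setting, a subclass can override this property so that logs and saved .traj files are grouped by the active task. A hypothetical sketch (current_task is an assumed attribute, not part of the API):

    from amago.envs.amago_env import AMAGOEnv

    class MultiTaskEnv(AMAGOEnv):
        @property
        def env_name(self):
            # self.current_task is a hypothetical attribute set during reset()
            return f"atari-{self.current_task}"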

inner_reset(seed=None, options=None)[source]#
inner_step(action)[source]#
make_action_rep(action)[source]#
Return type:

ndarray

render(*args, **kwargs)[source]#

Uses the render() of the env, which can be overridden to change the returned data.

reset(seed=None, options=None)[source]#

Uses the reset() of the env, which can be overridden to change the returned data.

Return type:

Timestep

step(action)[source]#

Uses the step() of the env, which can be overridden to change the returned data.

Return type:

tuple[Timestep, ndarray, ndarray, ndarray, dict]

class EnvCreator(make_env, exploration_wrapper_type, save_trajs_to, save_every_low, save_every_high, save_trajs_as)[source]#

Bases: object

Utility that builds the SequenceWrapper(AMAGOEnv()) wrapper stack for AsyncVectorEnv

exploration_wrapper_type: Type[ExplorationWrapper]#
make_env: Callable#
save_every_high: int#
save_every_low: int#
save_trajs_as: str#
save_trajs_to: str | None#
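
A sketch of building a batch of parallel envs, assuming EnvCreator instances are the zero-argument callables that AsyncVectorEnv expects, and that exploration wrappers live in amago.envs.exploration (the EpsilonGreedy name, paths, and values below are assumptions/placeholders):

    import gymnasium as gym
    from amago.envs.amago_env import AMAGOEnv, EnvCreator
    from amago.envs.exploration import EpsilonGreedy  # assumed location/name

    def make_env():
        return AMAGOEnv(gym.make("CartPole-v1"), env_name="cartpole")

    creator = EnvCreator(
        make_env=make_env,
        exploration_wrapper_type=EpsilonGreedy,
        save_trajs_to="trajs/",  # placeholder directory
        save_every_low=1000,
        save_every_high=2000,
        save_trajs_as="npz",
    )
    # Each call to creator() builds SequenceWrapper(AMAGOEnv(...)):
    vec_env = gym.vector.AsyncVectorEnv([creator for _ in range(8)])
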
class ReturnHistory(env_name)[source]#

Bases: object

add_score(env_name, score)[source]#
class SequenceWrapper(env, save_trajs_to, save_every=None, save_trajs_as='npz')[source]#

Bases: Wrapper

A wrapper that saves trajectory files to disk and logs rollout metrics.

Automatically logs total return in all envs.

We also log any metric from the gym env’s info dict whose key begins with “AMAGO_LOG_METRIC” (amago.envs.env_utils.AMAGO_ENV_LOG_PREFIX).
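
For example, a hypothetical wrapper could report a custom “success” metric through the info dict (the key suffix and success condition below are placeholders):

    import gymnasium as gym
    from amago.envs.env_utils import AMAGO_ENV_LOG_PREFIX

    class SuccessLoggingWrapper(gym.Wrapper):
        def step(self, action):
            obs, reward, terminated, truncated, info = self.env.step(action)
            # Any info key beginning with the prefix is picked up by
            # SequenceWrapper and logged alongside the return.
            info[AMAGO_ENV_LOG_PREFIX + " success"] = float(terminated and reward > 0)
            return obs, reward, terminated, truncated, info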

Parameters:
  • env (AMAGOEnv) – The AMAGOEnv to wrap.

  • save_trajs_to (str | None) – The directory to save trajectory files to.

  • save_every (tuple[int, int] | None) – A (low, high) range of steps; a trajectory file is saved after a random number of steps drawn from this range (if the trajectory doesn’t terminate first).

  • save_trajs_as (str) – The format to save trajectory files as. “traj”, “npz”, or “npz-compressed”.
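
A minimal construction sketch (the inner env, directory, and save horizon are placeholders):

    import gymnasium as gym
    from amago.envs.amago_env import AMAGOEnv, SequenceWrapper

    env = SequenceWrapper(
        AMAGOEnv(gym.make("CartPole-v1"), env_name="cartpole"),
        save_trajs_to="trajs/",   # None presumably disables saving
        save_every=(1000, 2000),
        save_trajs_as="npz",
    )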

property current_timestep: tuple[ndarray, ndarray, ndarray]#

Get the current timestep of this environment in the format expected by the Agent

property env_name#
finish_active_traj(idx, new_init_timestep)[source]#
random_traj_length()[source]#
reset(seed=None)[source]#

Uses the reset() of the env, which can be overridden to change the returned data.

Return type:

Timestep

reset_stats()[source]#
save_finished_trajs()[source]#

Saves all completed trajectories to disk.

step(action)[source]#

Uses the step() of the env, which can be overridden to change the returned data.

property step_count#
property total_frames: int#

Returns the total number of env.step calls taken by this particular env wrapper.

property total_frames_by_env_name: dict[str, int]#

Returns a dict of {env_name: total env.step calls in this env_name}.

Environment names are dynamically generated by the AMAGOEnv in case they change on reset. For example, a multi-task setup might pick from N games every reset. This dict would reveal the total timesteps seen in each game.
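
For instance, in a hypothetical multi-task run over two games (env is a SequenceWrapper; the names and counts are made up):

    print(env.total_frames_by_env_name)
    # {"Pong": 120000, "Breakout": 80000}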

class SpecialMetricHistory(env_name)[source]#

Bases: object

add_score(env_name, key, value)[source]#
log_prefix = 'AMAGO_LOG_METRIC'#