Environment Creation¶
PCSE Gym environments are specified with the utils.Args dataclass and created by calling the utils.make_gym_env(args) function, which builds the Gym environment and saves and/or loads the config.yaml file that specifies the crop simulation.
For information on environment configuration, see Environment Configuration.
Caution
We do not recommend calling gymnasium.make(env-id, **kwargs) directly, as utils.make_gym_env(args) also handles correctly passing arguments to the gymnasium.make function.
As an example, a default environment can be created with the following:
import utils, tyro
args = tyro.cli(utils.Args)
env = utils.make_gym_env(args)
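For intuition, the Args-to-CLI pattern that tyro provides can be sketched with the standard library alone. This is a minimal sketch: the field names and defaults below are hypothetical and do not reflect the real utils.Args fields.

```python
# Hedged sketch of a tyro.cli-style parser built from a dataclass.
# The Args fields here are hypothetical, not pcse_gym's real utils.Args.
import argparse
from dataclasses import dataclass, fields

@dataclass
class Args:
    env_id: str = "my-env-v0"  # hypothetical default, not the real env id
    seed: int = 0

def cli(cls, argv=None):
    """Build an argparse parser from dataclass fields, mirroring tyro.cli."""
    parser = argparse.ArgumentParser()
    for f in fields(cls):
        # Each field becomes a --flag with the field's type and default.
        parser.add_argument(f"--{f.name.replace('_', '-')}",
                            type=f.type, default=f.default)
    ns = parser.parse_args(argv)
    return cls(**vars(ns))

args = cli(Args, ["--seed", "42"])
print(args.seed)  # 42
```

The real tyro library handles many more cases (nested dataclasses, enums, help text), but the core idea is the same: every dataclass field becomes a typed command line flag.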
Environment Wrappers¶
There are a few cases in which a gymnasium.Wrapper should be used.
If using a prespecified policy (see Prespecified Policies), the environment should be wrapped in a pcse_gym.wrappers.NPKDictActionWrapper and a pcse_gym.wrappers.NPKDictObservationWrapper. These two wrappers expose the environment's actions and observations as dictionaries that interface with the Policy class in pcse_gym.policies, which makes policy specification easy.
To wrap the environment in these wrappers, do the following:
import utils, tyro
import pcse_gym.wrappers

args = tyro.cli(utils.Args)
env = utils.make_gym_env(args)
env = pcse_gym.wrappers.NPKDictActionWrapper(env)
env = pcse_gym.wrappers.NPKDictObservationWrapper(env)
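The idea behind a dict action wrapper can be sketched without pcse_gym. Everything below (FakeEnv, the key names, the flat-action encoding) is illustrative only and does not reflect the library's real classes or action space.

```python
# Hedged sketch of what a dict action wrapper does. The key names
# ("n", "p", "k", "w") and the encoding are hypothetical, not the
# real NPKDictActionWrapper behavior.
class FakeEnv:
    """Stand-in environment whose native actions are flat integer codes."""
    def step(self, action: int):
        obs = [0.0] * 4
        return obs, 0.0, False, False, {"action": action}

class DictActionWrapper:
    """Translate {"n": ..., "p": ..., "k": ..., "w": ...} dicts into the
    flat action the underlying environment expects."""
    KEYS = ("n", "p", "k", "w")

    def __init__(self, env):
        self.env = env

    def step(self, action: dict):
        # Find the one nonzero entry and encode (key index, amount)
        # as a single flat action code.
        idx = next(i for i, k in enumerate(self.KEYS) if action.get(k, 0) > 0)
        flat = idx * 10 + action[self.KEYS[idx]]
        return self.env.step(flat)

env = DictActionWrapper(FakeEnv())
_, _, _, _, info = env.step({"n": 3})
print(info["action"])  # 3
```

The observation wrapper is the mirror image: it takes the environment's flat observation vector and returns it keyed by variable name, so a policy can read named state variables instead of indexing into an array.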
Caution
When using a trained RL agent policy, ensure that the environment is not wrapped in pcse_gym.wrappers.NPKDictActionWrapper or pcse_gym.wrappers.NPKDictObservationWrapper, as this will prevent the agent from executing its policy.
Additionally, the user may want to specify a different reward function (e.g. to penalize fertilizer runoff or set irrigation limits) and then train an RL agent with this reward function. To do so, instantiate the environment as follows:
import utils, tyro
import pcse_gym.wrappers

args = tyro.cli(utils.Args)
# Make the gym environment with wrappers
env = utils.make_gym_env(args)
env = pcse_gym.wrappers.NPKDictActionWrapper(env)
env = pcse_gym.wrappers.NPKDictObservationWrapper(env)
env = utils.wrap_env_reward(env, args)
The reward wrapper should then be specified as a command line argument or in the config.yaml file:
python3 test_wofost.py --env-reward RewardFertilizationThresholdWrapper
For examples of how to create a RewardWrapper, see Reward Wrappers.
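The general shape of such a wrapper can be sketched in plain Python. This is a hedged illustration of a threshold-style penalty; the real RewardFertilizationThresholdWrapper, the FakeEnv stand-in, and the "fertilizer" info key below are all hypothetical.

```python
# Hedged sketch of a threshold-style reward wrapper. Class names, the
# info key, and the penalty scheme are illustrative, not pcse_gym's API.
class FakeEnv:
    """Stand-in environment that reports how much fertilizer was applied."""
    def step(self, action):
        return None, 1.0, False, False, {"fertilizer": float(action)}

class ThresholdRewardWrapper:
    """Overwrite step() so fertilizing past a seasonal limit is penalized."""
    def __init__(self, env, max_fertilizer=100.0, penalty=-10.0):
        self.env = env
        self.max_fertilizer = max_fertilizer
        self.penalty = penalty
        self.applied = 0.0  # running total over the episode

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.applied += info.get("fertilizer", 0.0)
        if self.applied > self.max_fertilizer:
            reward += self.penalty  # discourage runoff-prone over-application
        return obs, reward, terminated, truncated, info

env = ThresholdRewardWrapper(FakeEnv(), max_fertilizer=50.0)
_, reward, *_ = env.step(60.0)  # exceeds the 50.0 limit
print(reward)  # -9.0
```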
Important
Always call env = utils.wrap_env_reward(env, args) after any calls to other pcse_gym.wrappers, as RewardWrappers overwrite the step function to create the desired reward.
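Why the ordering matters can be sketched with three toy classes (all names below are illustrative, not pcse_gym's): each wrapper's step() calls the one beneath it, so only the outermost wrapper sees, and can rewrite, the final reward.

```python
# Hedged sketch of wrapper ordering. Base, DictWrapper, and RewardWrapper
# are toy stand-ins, not real pcse_gym classes.
class Base:
    def step(self, action):
        return "obs", 1.0, False, False, {}

class DictWrapper:
    """Translates actions but leaves the reward untouched."""
    def __init__(self, env):
        self.env = env
    def step(self, action):
        return self.env.step(action)

class RewardWrapper:
    """Rewrites the reward on the way out."""
    def __init__(self, env):
        self.env = env
    def step(self, action):
        obs, r, term, trunc, info = self.env.step(action)
        return obs, r - 0.5, term, trunc, info

# Correct order: reward wrapper applied last (outermost), so its
# adjustment is the final word on the reward.
env = RewardWrapper(DictWrapper(Base()))
print(env.step(0)[1])  # 0.5
```

If the reward wrapper were applied first (innermost), any wrapper added afterward that replaces step would bypass or override its adjustment.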