An R6Class
representing a reinforcement learning environment. To define a custom environment, define an R6Class
that inherits from rlR::Environment.
Number of actions available to the agent in the environment
The dimension of the observation (or state) space of the environment. Must be a vector of integers. For example, c(28, 28, 3) is the dimension of a tensor of order 3.
A string to represent the name of the environment
A boolean variable indicating whether the action space is continuous
Constructor function to initialize environment
Function to make a step in the environment. Must return a named list with the elements state (array of size state_dim), reward (the reward the agent receives after making the step), done (boolean indicating whether the episode is finished) and info (list of anything). The step function must contain a stopping criterion that returns list(state = state, reward = reward, done = TRUE, info = list()) to end the interaction between the environment and the agent.
Reset the environment
Print information about the environment to the user; can be left empty
Actions to perform after learning is finished; can be left empty
Process vec_arm, a vector of the same length as the action count act_cnt, so that only legal actions are generated; does nothing by default
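
The contract described above for the step function can be illustrated with a minimal, self-contained sketch. It does not load rlR: the class below is a hypothetical toy environment (CountdownEnv, max_steps, steps_left and the reward rule are all invented for illustration), and in real use one would presumably add inherit = rlR::Environment to the class definition. Only act_cnt and state_dim are names taken from the text above; the field name "name" and the value returned by reset are assumptions.

```r
library(R6)

# Hypothetical toy environment, not part of rlR.
# In real use the class would be defined with inherit = rlR::Environment.
CountdownEnv <- R6Class("CountdownEnv",
  public = list(
    act_cnt = 2L,        # number of actions (name taken from the text above)
    state_dim = 1L,      # dimension of the state space, a vector of integers
    name = "countdown",  # assumed field name for the environment name
    max_steps = NULL,    # hypothetical: episode length
    steps_left = NULL,   # hypothetical: remaining steps in the episode

    initialize = function(max_steps = 5L) {
      self$max_steps <- max_steps
      self$steps_left <- max_steps
    },

    step = function(action) {
      self$steps_left <- self$steps_left - 1L
      state <- array(self$steps_left, dim = self$state_dim)
      # Hypothetical reward rule: action 1 pays 1, anything else pays 0.
      reward <- if (action == 1L) 1 else 0
      # Stopping criterion: done becomes TRUE once no steps are left.
      done <- self$steps_left <= 0L
      list(state = state, reward = reward, done = done, info = list())
    },

    reset = function() {
      self$steps_left <- self$max_steps
      state <- array(self$steps_left, dim = self$state_dim)
      # Assumption: reset returns the same named-list shape as step.
      list(state = state, reward = NULL, done = FALSE, info = list())
    }
  )
)

# Drive one episode by hand with a fixed action.
env <- CountdownEnv$new(max_steps = 3L)
res <- env$reset()
total_reward <- 0
while (!res$done) {
  res <- env$step(action = 1L)
  total_reward <- total_reward + res$reward
}
```

The loop ends only because step eventually returns done = TRUE, which is why the stopping criterion is required.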