Supplement functions in ReservoirTrajectory and BehaviorCloningPolicy #390

peterchen96 · 2021-07-20T10:57:18Z

Add update! and prob functions to ReservoirTrajectory and BehaviorCloningPolicy to address the problems mentioned in #386:

use reservoir_trajectory to collect data for sl_agent

replace average_learner with BehaviorCloningPolicy

With these changes, sl_agent can be defined as follows:

sl_agent = Agent(
    policy = BehaviorCloningPolicy(;
        approximator = NeuralNetworkApproximator(
            model = base_model |> _device,
            optimizer = Optimizer(sl_learning_rate),
        ),
        explorer = WeightedSoftmaxExplorer(),
        batch_size = batch_size,
        min_reservoir_history = min_buffer_size_to_learn,
        rng = rng,
    ),
    trajectory = ReservoirTrajectory(
        reservoir_buffer_capacity;
        :state => Vector{Int64},
        :action => Int
    ),
)
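
For readers unfamiliar with the idea behind a reservoir buffer, here is a minimal sketch of classic reservoir sampling (Algorithm R). It is not the ReservoirTrajectory implementation from this PR: the names SimpleReservoir, offer!, and sample_batch are hypothetical and exist only for illustration. It assumes the buffer should hold a uniform sample of everything offered so far once the capacity is reached.

```julia
using Random

# Hypothetical illustration type; NOT the ReservoirTrajectory from this PR.
mutable struct SimpleReservoir{T}
    buffer::Vector{T}   # currently stored items
    capacity::Int       # maximum number of stored items
    seen::Int           # total number of items offered so far
    rng::AbstractRNG
end

# Convenience constructor: start with an empty buffer.
SimpleReservoir{T}(capacity::Int; rng::AbstractRNG = Random.default_rng()) where {T} =
    SimpleReservoir{T}(T[], capacity, 0, rng)

# Offer one item. While the buffer is not full the item is always stored;
# afterwards it overwrites a random slot with probability capacity / seen.
function offer!(r::SimpleReservoir{T}, item::T) where {T}
    r.seen += 1
    if length(r.buffer) < r.capacity
        push!(r.buffer, item)
    else
        j = rand(r.rng, 1:r.seen)
        if j <= r.capacity
            r.buffer[j] = item
        end
    end
    return r
end

# Draw a batch (with replacement) from the stored items.
sample_batch(r::SimpleReservoir, n::Int) = rand(r.rng, r.buffer, n)

# Usage: offer 100 (state, action) pairs to a reservoir of capacity 5.
r = SimpleReservoir{Tuple{Vector{Int64},Int}}(5; rng = MersenneTwister(1))
for i in 1:100
    offer!(r, ([i, i + 1], i))
end
batch = sample_batch(r, 3)
```

Under this scheme every offered item has the same chance of being in the buffer, which is the property a supervised-learning agent relies on when it learns from an average over past behavior rather than only the most recent steps.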


peterchen96 and others added 12 commits July 20, 2021 18:30

add update! method for ReservoirTrajectory when :PreActStage

b5bf775

redefine BehaviorCloningPolicy and update (p::BehaviorCloningPolicy)(env) method

9403895

add update! method for BehaviorCloning with trajectory.

4ad5eb3

add prob method for BehaviorCloningPolicy.

d669976

add env and stage in update! for BehaviorCloningPolicy.

ab565e3

add ActionStyle judgement of env in BehaviorCloningPolicy's prob.

0c67138

supplement prob function in weighted_softmax_explorer.

55dc0c4

add ActionStyle judgement of env when get action from BehaviorCloingPolicy.

7cc5a03

delete update_freq and update_step in BehaviorCloningPolicy.

6807ed5

modify the judgment criteria in update!

9d448eb

Merge branch 'master' into peter/supplement

f1dc12b

update RNEWS.md

4c9e36a

findmyway merged commit 97f603f into JuliaReinforcementLearning:master Jul 22, 2021

peterchen96 deleted the peter/supplement branch August 8, 2021 18:07
