Skip to content

Commit

Permalink
Findings_02/08
Browse files Browse the repository at this point in the history
  • Loading branch information
sayalikandarkar authored Aug 2, 2021
1 parent feddcaa commit 6e8a3e5
Showing 1 changed file with 44 additions and 0 deletions.
44 changes: 44 additions & 0 deletions logical_reward_functions/SayaSings/Findings_02_8.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
Findings from training on 30th July, 2021.

Model 4 - {SAC, Mean squared} - off track for all 5 laps, on right edge, failed to take the right turn at the end
Model 3 - {SAC, Huber} - off track for 4 laps, failed to take the right turn at the end
Model 2 - {PPO, Mean squared} - On track for 4 laps, a bit slow
Model 1 - {PPO, Huber} - On track for all 5 laps, super slow

Functions to optimize for Model 1 -
1. goes zig-zag a lot during straight lines.
2. speed is less for even straight lines.
3. speed and steering angle action space to be optimized.
4. hyperparameters to be tried and tested.


Findings from training on 31st July, 2021. (finalized model 1 for further optimization)

Model 1-4 - {action space change, changed threshold angle} - off track for 3 laps.
Model 1-3 - {action space change, changed stay straight rewards} - off track for 1 lap, min speed - 26s
Model 1-2 - {action space change, learning rate 10^-5} - 40s
Model 1-1 - {action space change} - superfast but 95%, 17 secs

Findings from training on 1st August 2021.

Model 1-3-clone - {clone of 1-3} - off track for 1 lap, min speed - 26s.
Model 1-1-3 - {action space speed 1.5}, default, off track for all 5 laps
Model 1-1-2 - {action space speed 1.5}- SPEED_THRESHOLD_FAST, rewards, grad batch size - 128 - off track, eliminated
Model 1-1-1 - {action space speed 1.5}- SPEED_THRESHOLD_FAST, rewards - 87%, 12 secs, all tracks

Findings from training on 1st August 2021. (finalized model 1-1-1 for further optimization)

Model 1-1-1-4 - {discrete - new car, find_curve function change} - error
Model 1-1-1-3-1 - {Model 1-1-1, discrete - new car, lr -> 10^-3} - 1 lap completed in 14 secs, rest off track
Model 1-1-1-2 - {clone of Model 1-1-1}, grad batch size - 32 - 5 laps completed in 13.3 secs, best model so far
Model 1-1-1-1 - {clone of Model 1-1-1} - 5 laps completed in 13 secs, second best model till date, gogo is proud of himself.

1. rewards
2. action space - speed {car} - min speed at 2m/s , stay straight - 4/5;

4 versions
1. Model 1-1-1-2 clone
2. Change rewards /thresholds
3. car - min speed - 2 - new model
4. TBD

0 comments on commit 6e8a3e5

Please sign in to comment.