Merge pull request #90 from P-Schumacher/dev

fix time bug in chase tag
MyoHub · Oct 5, 2023 · b58e604
2 parents 8465c42 + a0cc2eb
commit b58e604

Showing 3 changed files with 3 additions and 1 deletion.
diff --git a/docs/source/baselines.rst b/docs/source/baselines.rst
@@ -90,7 +90,7 @@ Launch training
 
 DEP-RL baseline
 ```````````````
-We provide `deprl <https://github.com/martius-lab/depRL>`_ as an additional baseline for locomotion policies. The controller was adapted from the original paper and produces robust locomotion policies with the MyoLeg through the use of a self-organizing exploration method.
+We provide `deprl <https://github.com/martius-lab/depRL>`_ as an additional baseline for locomotion policies. You can find more detailed explanations and documentation on how to use it `here <https://deprl.readthedocs.io/en/latest/index.html>`__. The controller was adapted from the original paper and produces robust locomotion policies with the MyoLeg through the use of a self-organizing exploration method.
 While DEP-RL can be used for any kind of RL task, we provide a pre-trained controller and training settings for the `myoLegWalk-v0` task.
 See `this tutorial <https://github.com/facebookresearch/myosuite/blob/main/docs/source/tutorials/4a_deprl.ipynb>`_ for more detailed tutorials.
 

diff --git a/docs/source/tutorials.rst b/docs/source/tutorials.rst
@@ -165,6 +165,7 @@ When using ``mjrl`` it might be needed to resume training of a policy locally. I
 
 Load DEP-RL Baseline
 ====================
+See `here <https://deprl.readthedocs.io/en/latest/index.html>`__ for more detailed documentation of ``deprl``.
 
 If you want to load and execute the pre-trained DEP-RL baseline. Make sure that the ``deprl`` package is installed.
 

diff --git a/myosuite/envs/myo/myochallenge/chasetag_v0.py b/myosuite/envs/myo/myochallenge/chasetag_v0.py
@@ -454,6 +454,7 @@ def get_reward_dict(self, obs_dict):
         # The task is entirely defined by these 3 lines
         win_cdt = self._win_condition()
         lose_cdt = self._lose_condition()
+
         if self.current_task == 'chase':
             score = self._get_score(float(self.obs_dict['time'])) if win_cdt else 0
             self.obs_dict['time'] = self.maxTime if lose_cdt else self.obs_dict['time']
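
The two chase-branch lines in the hunk above can be read in isolation with a small standalone sketch. This is not the repository's implementation: `chase_step`, `linear_score`, and `MAX_TIME` are hypothetical names introduced here, and the real `_get_score`, `_win_condition`, and `_lose_condition` are not shown in this diff, so their behavior is assumed.

```python
from typing import Callable, Tuple


def chase_step(time: float, max_time: float, win: bool, lose: bool,
               get_score: Callable[[float], float]) -> Tuple[float, float]:
    """Mirror the two chase-branch lines from the hunk (hypothetical helper)."""
    # A score is only awarded on a win, computed from the current time.
    score = get_score(float(time)) if win else 0
    # On a loss the time is set to max_time; otherwise it is left unchanged.
    new_time = max_time if lose else time
    return score, new_time


MAX_TIME = 20.0  # hypothetical value, not taken from the repository


def linear_score(t: float) -> float:
    # Hypothetical stand-in for _get_score: earlier wins score higher.
    return 1.0 - t / MAX_TIME


win_result = chase_step(5.0, MAX_TIME, win=True, lose=False, get_score=linear_score)
lose_result = chase_step(5.0, MAX_TIME, win=False, lose=True, get_score=linear_score)
```

Under these assumptions, a win leaves the time untouched and returns a score derived from it, while a loss returns a score of 0 and replaces the time with the maximum.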