Reinforcement Learning with Human Feedback. ReTrain Models. #27
nealmick announced in Announcements
Today there is a new update live on the webserver. The main new feature adds the ability to fine-tune models and continuously retrain them as needed.
Notes:
- Only finished games with final scores can be used to retrain.
- Retraining uses the same model that originally predicted the game; you cannot currently ReTrain a model other than the one used to make the prediction.
- The option to retrain will only appear if a custom model was used.
- You cannot retrain the default model; you must have a custom model currently trained in the selected model slot.
If all of the above is true and you have a custom model with a finished game, you should now see a new option at the bottom labeled ReTrain:
When clicked, the game's dataset is loaded along with all of your custom model settings, and the model is trained for a single epoch on the game data. If you're running locally, you should see a single epoch fire in the console:
```
[04/Nov/2023 11:48:28] "GET /predict/edit/2/ HTTP/1.1" 200 118561
retraining model on game id: 2
1/1 [==============================] - 1s 825ms/step - loss: 83.6608 - accuracy: 1.0000
[04/Nov/2023 11:48:35] "GET /predict/retrain/2 HTTP/1.1" 302 0
```
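For anyone curious what that single-epoch pass looks like in code, here is a minimal self-contained Keras sketch. The architecture, feature shape, and label below are stand-ins invented for illustration; the real retrain uses the game's actual data and whatever custom model settings your slot is configured with.

```python
# Minimal sketch of a single-epoch retrain pass, assuming a Keras model.
# The architecture, feature layout, and label are hypothetical stand-ins.
import numpy as np
from tensorflow import keras

# Stand-in for a previously trained custom model.
model = keras.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(10,)),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# Stand-in for one finished game: a single feature row plus a label
# derived from the final score.
game_features = np.random.rand(1, 10)
game_outcome = np.array([1.0])

# One epoch on one sample -- this is what produces the
# "1/1 [==============================] ... loss ... accuracy" console line.
model.fit(game_features, game_outcome, epochs=1, verbose=1)
```

Because the batch is a single game, each pass nudges the weights toward that game's outcome, which is why repeated passes keep increasing accuracy on it.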
This is a very new feature, so it is still unclear how retraining will affect models over time. Initial testing shows it is relatively easy to change a model's outputs: retraining definitely makes a model more accurate on those individual games, but whether that improvement generalizes to future games is the open question. You can also retrain multiple times, which can further increase accuracy on that game.
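Since ReTrain is triggered by a GET to the retrain URL shown in the log above, repeated retraining could also be scripted. A rough sketch, assuming the default local Django dev server address (your host and game id may differ):

```python
# Hypothetical sketch: trigger the single-epoch retrain pass several times
# for game id 2. Host and port assume a local Django dev server.
import requests

for _ in range(3):
    # Each request fires one more epoch on the same game data
    # (the server responds with a 302 redirect, as in the log above).
    requests.get("http://localhost:8000/predict/retrain/2")
```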
Let me know if you have any feedback or ideas!