
5.2 Saving and loading the model

Notes

In this session we'll cover how to use the model in the future without retraining and re-evaluating it.

One option for saving the model we trained earlier is the pickle library:
- pickle is part of the Python standard library, so there is nothing to install; importing it is enough.
- After training the model and getting it ready for prediction, use this code to save it for later.
- ```
import pickle

with open('model.bin', 'wb') as f_out: # 'wb' means write-binary
    pickle.dump((dict_vectorizer, model), f_out)
```
- The code above creates a binary file named model.bin and writes the dict_vectorizer (used for one-hot encoding) and the model into it as a single tuple. (Pickle is a binary format, not human-readable text, so the file has to be opened in binary mode.)
- To use the model later without rerunning the training code, we need to open the binary file we saved before.
- ```
import pickle

with open('model.bin', 'rb') as f_in: # very important to use 'rb' here, it means read-binary
    dict_vectorizer, model = pickle.load(f_in)
# Note: never unpickle a file from a source you do not trust; loading it can execute arbitrary code!
```
- After unpacking the dict_vectorizer and the model, we can make predictions for new inputs without retraining the model.
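
The save and load steps above can be wrapped in two small helpers and checked with a round trip. This is only a minimal sketch: the helper names save_model and load_model are hypothetical, and the two dictionaries are stand-ins for the fitted dict_vectorizer and the trained model so that the example runs with the standard library alone.

```python
import os
import pickle
import tempfile


def save_model(path, dict_vectorizer, model):
    """Write the vectorizer and the model to one binary file as a tuple."""
    with open(path, 'wb') as f_out:
        pickle.dump((dict_vectorizer, model), f_out)


def load_model(path):
    """Read the (dict_vectorizer, model) tuple back. Only load trusted files."""
    with open(path, 'rb') as f_in:
        dict_vectorizer, model = pickle.load(f_in)
    return dict_vectorizer, model


# Stand-ins for the fitted objects (hypothetical values, for illustration only).
# In the course these would be the fitted dict_vectorizer and the trained model.
dict_vectorizer = {'feature_names': ['contract=monthly', 'tenure']}
model = {'weights': [0.5, -0.1], 'bias': 0.2}

# Use a temporary directory so the sketch does not leave files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'model.bin')
    save_model(path, dict_vectorizer, model)
    loaded_vectorizer, loaded_model = load_model(path)

# The loaded objects are equal copies of the originals, not the same objects.
print(loaded_vectorizer == dict_vectorizer, loaded_model == model)
```

With real scikit-learn objects the same helpers apply, but the environment that loads the file also needs scikit-learn installed, because pickle stores references to the classes and not their code.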

Add notes from the video (PRs are welcome)

⚠️	The notes are written by the community. If you see an error here, please create a PR with a fix.

Notes from Peter Ernicke
