- Ubuntu 18.04
- Python 3.6.9
sudo apt-get install python3
- pip3 20.3.1
sudo apt-get install python3-pip
pip3 install --upgrade pip
- TensorFlow 1.15.0
pip3 install tensorflow==1.15.0
- Keras 2.3.1
pip3 install keras==2.3.1
- Docker
Make sure Docker is installed.
- Kubernetes
Make sure Kubernetes is installed.
model_retrain/model_retrain.sh
- Data is located at model_retrain/data/.
- model_retrain/retrain.py generates a new model at model_retrain/model/.
- The training log is written to the file output_<version>_<accuracy>. The website reads this file to display information about the model.
- After training a new model, the .h5 file is copied to saved_model/input_models/.
- saved_model/export_saved_model.py generates a SavedModel at saved_model/x_test/.
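The log file name carries the model version and accuracy, so a script can read both without opening the file. The following is a minimal sketch, not code from this repository: it assumes the version is an integer and the accuracy is a decimal number such as 0.93, and the names `TrainingLog`, `parse_log_name`, and `newest_log` are hypothetical.

```python
import re
from collections import namedtuple

# Hypothetical record type for this sketch; not part of the repository.
TrainingLog = namedtuple("TrainingLog", ["version", "accuracy"])

# Assumed pattern: "output_<version>_<accuracy>" with an integer version
# and a decimal accuracy. The real scripts may format these differently.
_LOG_NAME = re.compile(r"^output_(\d+)_(\d+(?:\.\d+)?)$")


def parse_log_name(name):
    """Return a TrainingLog for a matching file name, or None otherwise."""
    match = _LOG_NAME.match(name)
    if match is None:
        return None
    return TrainingLog(version=int(match.group(1)),
                       accuracy=float(match.group(2)))


def newest_log(names):
    """Return the log with the highest version among the given file names."""
    logs = [log for log in map(parse_log_name, names) if log is not None]
    return max(logs, key=lambda log: log.version) if logs else None


if __name__ == "__main__":
    # Example file names are made up for illustration.
    print(newest_log(["output_1_0.91", "output_2_0.88", "notes.txt"]))
```

A check like this could help decide whether the newest model is worth deploying, for example by comparing `accuracy` against a threshold you choose.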
model_retrain/model_remove_bad_perf.sh
- If the newest model's performance is low, you can remove it with this script.
model_retrain/model_deploy.sh
- Deploys a specific model version to the cluster. It writes a new YAML file at deploy/tfserving_<version>.yml and creates a new pod.
model_retrain/model_delete.sh
- Removes the TF Serving pod for a specific model version.
1. Download the model
$ cd $HOME/DL-Pipeline-Tutorial/model_retrain/model
$ wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1zwrqgdkeHkxU7mwMHTtidkPK_10kNAW7' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1zwrqgdkeHkxU7mwMHTtidkPK_10kNAW7" -O top_model_weights.h5 && rm -rf /tmp/cookies.txt
2. Training script
$ cd $HOME/DL-Pipeline-Tutorial/model_retrain/ && ./model_retrain.sh
3. Deploy script
$ cd $HOME/DL-Pipeline-Tutorial/model_retrain/ && sudo ./model_deploy.sh $version $DockerName
- $version: the model version that you generated
- $DockerName: your Docker Hub account
Check that the pod is running and get the pod's IP:
$ kubectl get pod -o wide
Check the service and its NodePort:
$ kubectl get svc
- Get the model metadata
- via the node address
$ curl $NodeIP:$NodePort/v1/models/x_test/metadata
- via the pod address
$ curl $PodIP:$Port/v1/models/x_test/metadata
- Run a prediction on a picture
$ python3 test/client.py -i $picture -u $ip -p $port
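The curl commands above query TensorFlow Serving's REST API, which can also be called from Python. The sketch below is an illustration only and does not reproduce test/client.py: the helper names are hypothetical, the host and port are placeholders for the values reported by kubectl, and the shape of `instances` depends on the model's input, which is not described here.

```python
import json
import urllib.request

# Model name taken from the curl commands above.
MODEL_NAME = "x_test"


def metadata_url(host, port, model=MODEL_NAME):
    """Build the URL that the curl metadata commands query."""
    return "http://{}:{}/v1/models/{}/metadata".format(host, port, model)


def predict_url(host, port, model=MODEL_NAME):
    """Build the URL of the REST predict endpoint for the model."""
    return "http://{}:{}/v1/models/{}:predict".format(host, port, model)


def predict_body(instances):
    """Serialize inputs in TF Serving's row ("instances") JSON format."""
    return json.dumps({"instances": instances})


def fetch_metadata(host, port, timeout=5.0):
    """GET the model metadata; requires a reachable serving pod."""
    with urllib.request.urlopen(metadata_url(host, port), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder address; substitute the NodeIP and NodePort from kubectl.
    print(metadata_url("10.0.0.5", 30001))
    print(predict_url("10.0.0.5", 30001))
```

A predict request would POST the output of `predict_body` to `predict_url` with a `Content-Type: application/json` header; the image pre-processing needed before that step is whatever the model was trained with.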