Files and guidelines on how to set up containers on the EPFL Kubernetes cluster
kubectl get all
Navigate to the folder on your local machine that contains the YAML files for your configurations.
kubectl create -f pod-deep.yaml
kubectl create -f svc-deep.yaml
kubectl describe pod
kubectl exec -it deep -- /bin/bash
exit
kubectl get pod deep -o yaml | grep hostIP
Note that this only gives the IP address. The external port mapping can be found through kubectl get all.
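If you prefer to read both values programmatically, kubectl can print JSON instead of YAML (-o json). The following is a minimal sketch, assuming the service is of type NodePort; the helper names and the sample JSON are hypothetical, and real kubectl output contains many more fields.

```python
import json


def host_ip(pod_json: str) -> str:
    """Return status.hostIP from the output of `kubectl get pod <name> -o json`."""
    return json.loads(pod_json)["status"]["hostIP"]


def node_ports(service_json: str) -> dict:
    """Map each service port to its external nodePort.

    Expects the output of `kubectl get service <name> -o json`.
    Ports without a nodePort entry (e.g. ClusterIP services) are skipped.
    """
    ports = json.loads(service_json)["spec"]["ports"]
    return {p["port"]: p["nodePort"] for p in ports if "nodePort" in p}


# Hypothetical, heavily trimmed sample output (assumption, not real cluster data).
SAMPLE_POD = '{"status": {"hostIP": "10.0.0.5"}}'
SAMPLE_SVC = '{"spec": {"type": "NodePort", "ports": [{"port": 8888, "nodePort": 30888}]}}'

print(host_ip(SAMPLE_POD), node_ports(SAMPLE_SVC))
```

With real data, the two strings would come from kubectl get pod deep -o json and kubectl get service deep -o json.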
kubectl delete pod deep
kubectl delete service deep
kubectl cp ~/Documents/file.txt deep:/data/
rsync -r --progress <source> <target>
sudo bash install_script.sh
nvidia-smi
Replace the # in /dev/nvidia# with the GPU number (the example uses GPU 1):
for i in $(sudo lsof /dev/nvidia1 | grep python | awk '{print $2}' | sort -u); do kill -9 $i; done
source: https://forums.developer.nvidia.com/t/11-gb-of-gpu-ram-used-and-no-process-listed-by-nvidia-smi/44459/12
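Because kill -9 is not reversible, it can help to list the affected process IDs first. The sketch below mirrors the grep python | awk '{print $2}' | sort -u part of the pipeline in Python, without killing anything. The function name and the sample lsof output are hypothetical; the only assumption about lsof is that the PID is the second whitespace-separated column.

```python
def python_pids(lsof_output: str) -> list:
    """Return the unique PIDs of lines mentioning 'python' in lsof output."""
    pids = set()
    for line in lsof_output.splitlines():
        fields = line.split()
        # Skip the header and malformed lines: the PID column must be numeric.
        if len(fields) >= 2 and "python" in line and fields[1].isdigit():
            pids.add(int(fields[1]))
    return sorted(pids)


# Hypothetical sample of `sudo lsof /dev/nvidia1` output (assumption, not real data).
SAMPLE_LSOF = """\
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
python   1234 root  mem    CHR  195,1           500 /dev/nvidia1
python   1234 root   5u    CHR  195,1      0t0  500 /dev/nvidia1
Xorg      900 root   8u    CHR  195,1      0t0  500 /dev/nvidia1
python3  4321 root   6u    CHR  195,1      0t0  500 /dev/nvidia1
"""

for pid in python_pids(SAMPLE_LSOF):
    print("would kill", pid)
```

Once the list looks right, the same PIDs can be passed to kill -9.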
Kill all Python processes (sleeping, but claiming GPU memory):
pkill -9 python
Replace name with part of the executed command (e.g. see (h)top):
ps aux | grep Reptile
Replace id with the process ID listed in the previous step:
kill -9 id
python
import torch
torch.cuda.get_device_name(0)
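torch.cuda.get_device_name(0) raises an error when no GPU is visible to the container. A slightly more defensive check, as a sketch: describe_gpu is a hypothetical helper name, and the function only reports what it finds rather than fixing anything.

```python
import importlib.util


def describe_gpu() -> str:
    """Return the name of GPU 0, or a short message explaining why it is unavailable."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if not torch.cuda.is_available():
        return "torch is installed, but no CUDA device is visible"
    return torch.cuda.get_device_name(0)


print(describe_gpu())
```

If the second message appears inside the pod, nvidia-smi is a reasonable next thing to check.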
python
import tensorflow as tf
with tf.Session() as sess:
    print(sess.list_devices())
python -m http.server 8888
jupyter notebook --ip=0.0.0.0 --port=8888 --allow-root
jupyter lab --ip=0.0.0.0 --port=8888 --allow-root
sudo sh code-server --port 8080 --host 0.0.0.0 --auth none
pkill -f code-server
source: "How to kill all processes with a given partial name?" (Stack Overflow)
tensorboard --logdir=./runs --port=8888
ps -ef|grep tensorboard
kill -9 <first/second pid>
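Several of the servers above (http.server, Jupyter, TensorBoard) are started on port 8888, so a leftover process can block the next one. The following sketch checks whether something is already listening on a port before you start another server. port_in_use is a hypothetical helper, and the check assumes the server is reachable on 127.0.0.1 from where the script runs.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 on success and an error code otherwise.
        return s.connect_ex((host, port)) == 0


for port in (8080, 8888):
    state = "in use" if port_in_use(port) else "free"
    print(port, state)
```

If a port is reported as in use, the ps and kill steps above show how to find and stop the process holding it.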
See the code-server GitHub page
pip install matplotlib
0.4.1:
pip install http://download.pytorch.org/whl/cu90/torch-0.4.1-cp36-cp36m-linux_x86_64.whl
1.0.0 (ships with CUDA 9.0.176 embedded):
pip3 install https://download.pytorch.org/whl/cu90/torch-1.0.0-cp36-cp36m-linux_x86_64.whl
Follow https://medium.com/@zhanwenchen/install-cuda-and-cudnn-for-tensorflow-gpu-on-ubuntu-79306e4ac04e
pip install tensorflow-gpu==1.8
echo "defshell -bash" > ~/.screenrc
screen -X source ~/.screenrc
conda activate myenv
python -m ipykernel install --user --name myenv --display-name "Python (myenv)"