Init fails if one node doesn't have Portworx installed #187

Open
disrani-px opened this issue Sep 13, 2018 · 3 comments

Comments

@disrani-px
Contributor

I'm trying to run Torpedo on K8s on DC/OS. Portworx isn't installed on public nodes, but the cluster has a public K8s node, and Torpedo init fails in this case:

--- FAIL: TestMain (1.45s)
    --- FAIL: TestMain/setup (1.44s)
        Error Trace:    common_test.go:68
    	Error:      	Received unexpected error:
    	            	failed to find px node for node: {fa3f9b3e-c580-4461-831b-c69b3e94fc3f  kube-node-public-0-kubelet.kubernetes.mesos [192.168.65.60]  Worker}
    	Messages:   	Error initializing volume driver pxd
@disrani-px
Contributor Author

Torpedo should also not wait for PX to come up on that node:

time="2018-09-13T22:03:28Z" level=info msg="Using the Portworx volume driver under scheduler: k8s"
time="2018-09-13T22:03:28Z" level=info msg="Using http://10.100.112.142:9001 as endpoint for portworx volume driver"
time="2018-09-13T22:03:28Z" level=info msg="px on node 46512cbf-12ef-47c7-8f37-a942345d14ae is now up. status: STATUS_OK"
time="2018-09-13T22:03:28Z" level=info msg="px on node 9b3d2e27-27da-4c02-9fa4-0165ad2a6381 is now up. status: STATUS_OK"
time="2018-09-13T22:03:28Z" level=info msg="px on node c8e15ccd-e827-41f4-a713-0ef2257a5b5e is now up. status: STATUS_OK"
2018/09/13 22:03:28 Failed to wait for px status on: kube-node-public-0-kubelet.kubernetes.mesos due to err: 404 page not found
 Next retry in: 10s
2018/09/13 22:03:37 Failed to wait for px status on: kube-node-public-0-kubelet.kubernetes.mesos due to err: 404 page not found
 Next retry in: 10s
2018/09/13 22:03:47 Failed to wait for px status on: kube-node-public-0-kubelet.kubernetes.mesos due to err: 404 page not found
 Next retry in: 10s
2018/09/13 22:03:55 Failed to wait for px status on: kube-node-public-0-kubelet.kubernetes.mesos due to err: 404 page not found
 Next retry in: 10s
2018/09/13 22:04:05 Failed to wait for px status on: kube-node-public-0-kubelet.kubernetes.mesos due to err: 404 page not found
 Next retry in: 10s
2018/09/13 22:04:13 Failed to wait for px status on: kube-node-public-0-kubelet.kubernetes.mesos due to err: 404 page not found
 Next retry in: 10s
2018/09/13 22:04:23 Failed to wait for px status on: kube-node-public-0-kubelet.kubernetes.mesos due to err: 404 page not found
 Next retry in: 10s
2018/09/13 22:04:33 Failed to wait for px status on: kube-node-public-0-kubelet.kubernetes.mesos due to err: 404 page not found
 Next retry in: 10s
2018/09/13 22:04:42 Failed to wait for px status on: kube-node-public-0-kubelet.kubernetes.mesos due to err: 404 page not found
 Next retry in: 10s
2018/09/13 22:04:52 Failed to wait for px status on: kube-node-public-0-kubelet.kubernetes.mesos due to err: 404 page not found
 Next retry in: 10s
2018/09/13 22:05:00 Failed to wait for px status on: kube-node-public-0-kubelet.kubernetes.mesos due to err: 404 page not found
 Next retry in: 10s

@disrani-px
Contributor Author

The public node has a label (`kubernetes.dcos.io/node-type=public`) that can be used to ignore it:

$ kubectl describe nodes kube-node-public-0-kubelet.kubernetes.mesos
Name:               kube-node-public-0-kubelet.kubernetes.mesos
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.dcos.io/node-type=public
                    kubernetes.io/hostname=kube-node-public-0-kubelet.kubernetes.mesos
                    name=kube-node-public-0-kubelet.kubernetes.mesos
Annotations:        node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:             node-type.kubernetes.dcos.io/public=true:NoSchedule
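
A rough sketch of how that label could be used to drop public nodes before the volume driver is initialized. The `Node` type and `FilterOutLabeled` function are hypothetical, simplified stand-ins and not Torpedo's real types. The name `private-node-example` is made up; only the public node's name and label come from the output above.

```go
package main

import "fmt"

// Node is a hypothetical, simplified stand-in for a scheduler node.
type Node struct {
	Name   string
	Labels map[string]string
}

// FilterOutLabeled returns the nodes that do NOT carry the label key=value.
func FilterOutLabeled(nodes []Node, key, value string) []Node {
	kept := make([]Node, 0, len(nodes))
	for _, n := range nodes {
		if v, ok := n.Labels[key]; ok && v == value {
			continue // e.g. a DC/OS public node without Portworx
		}
		kept = append(kept, n)
	}
	return kept
}

func main() {
	nodes := []Node{
		// Hypothetical private node, for illustration only.
		{Name: "private-node-example", Labels: map[string]string{"beta.kubernetes.io/os": "linux"}},
		// The public node from the kubectl output above.
		{Name: "kube-node-public-0-kubelet.kubernetes.mesos", Labels: map[string]string{"kubernetes.dcos.io/node-type": "public"}},
	}
	for _, n := range FilterOutLabeled(nodes, "kubernetes.dcos.io/node-type", "public") {
		fmt.Println(n.Name)
	}
}
```

Passing the label key and value as parameters, rather than hard-coding them, would let the same filter work on clusters that mark non-storage nodes differently.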

@ram-infrac
Contributor

@harsh-px & @disrani-px this is fixed by #190. Can you please confirm and close this issue?
