[question]: sriov CreateEndpoint failure #4
Comments
Hi, I need some more information. Did you start the container with the --mac-address= option? Something looks wrong with the rest of the VF MAC addresses being zero. Some notes: To avoid such hassle, you can use this support script, Such as below, Now that you know the interested netdev to use, Or you can avoid the above steps and use this wrapper, Now you can do either, If you are not choosy about which VF to use, then you can depend completely on the plugin to find a free VF for you. In simpler configurations,
Hi,
docker run --net=mynet-sriov -it a1a3b055c1f9 bin/bash
I even tried to run the manual script and found another "error", which makes me think there might be some other issue (?):
./docker_sriov_roce_mgmt list_netdevs enp4s0
Any further hints to overcome this are most welcome!
This seems like an issue that is not related to this plugin.
Here they are:
Linux ct-analytcis-2 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
total 0
-rw-r--r-- 1 root root 4096 Mar 9 16:42 broken_parity_status
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
From the output of command 4, it appears that the netdevices for the VFs were not created for some reason. I suggest that you talk to Mellanox tech support first to check whether these netdevices are visible.
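One way to check for the missing VF netdevices without going through the plugin is to read sysfs directly. The following is a minimal sketch, not the plugin's own logic: it assumes the standard Linux sysfs layout in which each VF of a PF appears as `virtfnN` under the PF's `device` directory and exposes a `net/` subdirectory only when a driver has created a netdev for it. The function name `vf_netdevs` and the `sysfs_root` parameter are hypothetical and exist only for illustration.

```python
import os

def vf_netdevs(pf, sysfs_root="/sys"):
    """Map each VF index of the PF `pf` to the netdev names found for it.

    Assumes the usual sysfs layout:
    <sysfs_root>/class/net/<pf>/device/virtfnN/net/<netdev>
    """
    dev_dir = os.path.join(sysfs_root, "class", "net", pf, "device")
    if not os.path.isdir(dev_dir):
        return {}  # unknown PF, or not a PCI-backed netdev
    result = {}
    for entry in os.listdir(dev_dir):
        suffix = entry[len("virtfn"):]
        if not entry.startswith("virtfn") or not suffix.isdigit():
            continue
        net_dir = os.path.join(dev_dir, entry, "net")
        # A missing "net" directory means no netdev was created for this VF.
        result[int(suffix)] = sorted(os.listdir(net_dir)) if os.path.isdir(net_dir) else []
    return result
```

Calling `vf_netdevs("enp4s0")` on the affected host would be expected to return an empty list for every VF whose netdevice was not created.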
Thanks for the reply. Do you mean the output of command 4 or command 3? What is the expected outcome of the command?
The 4th command: ip link show
If you share the (pretty long) output of lspci -vvv, it will show whether mlx5_core or the vfio driver owns the VFs, which might shed light on why the netdevices are not created.
Trying to shorten the output (including the native card and one VF; mlx5_core seems to own both):
04:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
04:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function]
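Since the full lspci -vvv output is long, a small filter can reduce it to the one fact of interest here: which kernel driver, if any, owns each device. This is only a sketch. It assumes the output contains the usual indented "Kernel driver in use:" line under each device header, and the helper name `drivers_by_device` is hypothetical.

```python
import re

# A device header starts at column 0 with a PCI address such as "04:00.1"
# (optionally prefixed by a domain such as "0000:").
_HEADER = re.compile(
    r"^((?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s+"
)
# Detail lines are indented; this one names the bound kernel driver.
_DRIVER = re.compile(r"^\s+Kernel driver in use:\s*(\S+)")

def drivers_by_device(lspci_text):
    """Return {pci_address: driver name, or None if no driver line was seen}."""
    owners = {}
    current = None
    for line in lspci_text.splitlines():
        header = _HEADER.match(line)
        if header:
            current = header.group(1)
            owners[current] = None
            continue
        driver = _DRIVER.match(line)
        if driver and current is not None:
            owners[current] = driver.group(1)
    return owners
```

Feeding it the captured text of lspci -vvv gives a compact mapping in which a device with a `None` value has no bound driver.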
As to the /var/log/messages ... which one are we talking about? ls -l /var/log/
/var/log/syslog and /var/log/dmesg should have some driver failure logs for the VFs.
Right...dmesg shows some errors:
Mar 15 14:21:58 ct-analytcis-2 kernel: [73144.310139] (0000:04:00.0): E-Switch: E-Switch enable SRIOV: nvfs(4) mode (1)
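To pull the relevant lines out of a long kernel log, it can help to filter by the PCI address of the card. A rough sketch follows, assuming syslog-style lines that carry a bracketed kernel timestamp as in the line above. The helper name `kernel_lines_for` is hypothetical.

```python
import re

# Kernel timestamp in square brackets, followed by the message text.
_STAMP = re.compile(r"\[\s*(\d+\.\d+)\]\s*(.*)$")

def kernel_lines_for(log_text, pci_prefix):
    """Return (timestamp_seconds, message) pairs for lines mentioning pci_prefix.

    The timestamp is None when a matching line has no bracketed kernel timestamp.
    """
    hits = []
    for line in log_text.splitlines():
        if pci_prefix not in line:
            continue
        stamp = _STAMP.search(line)
        if stamp:
            hits.append((float(stamp.group(1)), stamp.group(2)))
        else:
            hits.append((None, line.strip()))
    return hits
```

Using the prefix "0000:04:00." keeps messages for both the PF (04:00.0) and the VFs that share its bus and device number (04:00.1 and so on).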
Now it makes sense. It seems the driver fails to load on the VF with the given error. This is helpful. I suggest you contact tech support to get this error resolved, without bringing any plugin or container components into the picture, to get faster results. I will add more checks at the plugin level to make sure that network creation fails if it encounters this kind of unexpected error (instead of failing at container creation time).
Hi,
I have a Mellanox Innova IPsec card and I am trying to set up docker with SR-IOV.
I managed to start a container via passthrough, however I get the following error when trying to boot one with SR-IOV:
"docker: Error response from daemon: failed to create endpoint kind_heisenberg on network mynet-sriov: NetworkDriver.CreateEndpoint: All devices in use [ f53229e321b1a7fdce364b6e8b7c749f34000b40075cd13839dc7d6eb98326ab ].."
Any help to overcome this would be appreciated. I tried to understand the problem, and according to the code it seems to be related to the MAC address assignment. Below is the log of the plugin:
time="2018-03-08T16:57:16Z" level=debug msg="CreateNetwork IPv4Data len : [ 1 ]\n"
time="2018-03-08T16:57:16Z" level=debug msg="parseNetworkGenericOptions map[mode:sriov netdevice:enp4s0]"
max_vfs = 4
cur_vfs = 0
max_vfs = 4
time="2018-03-08T16:57:25Z" level=debug msg="DiscoverVF vfDev list length : [4]"
time="2018-03-08T16:57:25Z" level=debug msg="SRIOV CreateNetwork : [f53229e321b1a7fdce364b6e8b7c749f34000b40075cd13839dc7d6eb98326ab] IPv4Data : [ &{AddressSpace:LocalDefault Pool:194.168.1.0/24 Gateway:194.168.1.1/24 AuxAddresses:map[]} ]\n"
time="2018-03-08T16:57:38Z" level=debug msg="CreateEndpoint Called: [ &{NetworkID:f53229e321b1a7fdce364b6e8b7c749f34000b40075cd13839dc7d6eb98326ab EndpointID:ebb3c7d220ade467b8174e70ebe39232faecb98ce0bee7369e48851896173d5c Interface:0xc4201b20c0 Options:map[com.docker.network.endpoint.exposedports:[] com.docker.network.portmap:[]]} ]"
time="2018-03-08T16:57:38Z" level=debug msg="r.Interface: [ &{Address:194.168.1.2/24 AddressIPv6: MacAddress:} ]"
And the output of "ip link show":
6: enp4s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
link/ether 24:8a:07:ad:54:f2 brd ff:ff:ff:ff:ff:ff
vf 0 MAC 00:22:33:44:55:66, spoof checking off, link-state auto
vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state auto
vf 2 MAC 00:00:00:00:00:00, spoof checking off, link-state auto
vf 3 MAC 00:00:00:00:00:00, spoof checking off, link-state auto
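The per-VF lines in the listing above can be checked mechanically. The sketch below only handles the `vf N MAC xx:xx:...` format shown in this output (other iproute2 versions may print these lines differently), and the helper names are hypothetical. It reports which VFs show an all-zero MAC and makes no claim about why.

```python
import re

# Matches lines such as: "vf 1 MAC 00:00:00:00:00:00, spoof checking off, ..."
_VF_LINE = re.compile(r"^\s*vf\s+(\d+)\s+MAC\s+([0-9a-fA-F:]{17})")

def vf_macs(ip_link_text):
    """Return {vf_index: mac} parsed from `ip link show <pf>` output."""
    macs = {}
    for line in ip_link_text.splitlines():
        match = _VF_LINE.match(line)
        if match:
            macs[int(match.group(1))] = match.group(2).lower()
    return macs

def zero_mac_vfs(ip_link_text):
    """Return the sorted indices of VFs whose MAC is all zeros."""
    return sorted(
        index for index, mac in vf_macs(ip_link_text).items()
        if mac == "00:00:00:00:00:00"
    )
```

For the output posted here, this would list VFs 1, 2 and 3, with only VF 0 carrying a non-zero MAC.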