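(Actually caused by running this project nested inside a host that was itself created by this project; non-nested deployments should be unaffected) Host: Debian 12 with PVE installed. After the host restarts its network, the VMs' tap devices are lost and are not recreated or reattached automatically; the VM itself has to be shut down and started again, or OVS used in place of the Linux bridge for NAT #21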

spiritLHLS · 2024-08-22T05:42:49Z

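I installed this virtualization project on a Debian 12 system. The NAT KVM VMs it creates lose network connectivity after running for a while. From the PVE console, inside a NAT KVM VM, ping 172.16.1.1 fails, and running reboot inside the VM does not help either. All NAT KVM VMs lose connectivity at the same time. The only fix is to select the VM in the PVE web console, open the shutdown menu at the top right, choose reboot, and wait for the restart; after that the VM's network recovers. The outage only shows up intermittently, and I don't know where the problem is. The Debian 12 host's own network is fine.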

auto lo
iface lo inet loopback
auto vmbr0
iface vmbr0 inet static
    address 110.xx.xx.xx/24
    gateway 110.xx.xx.1
    bridge_ports eth0
    bridge_stp off
    bridge_fd 0

iface vmbr0 inet6 static
    address 240e:x:x:x::x:x/128
    gateway 240e:x:x:x::5064:1
    up ip addr del fe80::be24:11ff:feb6:c5c2/64 dev eth0
auto vmbr1
iface vmbr1 inet static
    address 172.16.1.1
    netmask 255.255.255.0
    bridge_ports none
    bridge_stp off
    bridge_fd 0
    post-up echo 1 > /proc/sys/net/ipv4/ip_forward
    post-up echo 1 > /proc/sys/net/ipv4/conf/vmbr1/proxy_arp
    post-up iptables -t nat -A POSTROUTING -s '172.16.1.0/24' -o vmbr0 -j MASQUERADE
    post-down iptables -t nat -D POSTROUTING -s '172.16.1.0/24' -o vmbr0 -j MASQUERADE

iface vmbr1 inet6 static
    address 2001:db8:1::1/64
    post-up sysctl -w net.ipv6.conf.all.forwarding=1
    post-up ip6tables -t nat -A POSTROUTING -s 2001:db8:1::/64 -o vmbr0 -j MASQUERADE
    post-down sysctl -w net.ipv6.conf.all.forwarding=0
    post-down ip6tables -t nat -D POSTROUTING -s 2001:db8:1::/64 -o vmbr0 -j MASQUERADE

Originally posted by @wbews in #11 (comment)


spiritLHLS · 2024-08-22T05:46:21Z

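I suspect your gateway went down, knocked out by something.

Once this happens, have you tried pinging 172.16.1.1 from the host?

Also, please post the output of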

brctl show

@wbews

spiritLHLS · 2024-08-22T05:48:25Z

Inside the VM, run the following commands and send me the last 20 lines of each:

cat /var/log/messages

cat /var/log/syslog
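
The request above (the last 20 lines of each log) can be sketched as a small helper. This is only an illustration: `last_log_lines` is a hypothetical function name, and it assumes the guest writes classic text logs. On guests that log only to the systemd journal, `journalctl -n 20` would be the rough equivalent.

```shell
#!/bin/sh
# Print the last N lines of every readable log file given; note missing ones.
# Usage: last_log_lines N FILE...
last_log_lines() {
    count=$1
    shift
    for log in "$@"; do
        if [ -r "$log" ]; then
            echo "== $log =="
            tail -n "$count" "$log"
        else
            echo "== $log (not present) =="
        fi
    done
}

# Inside the guest one would run:
#   last_log_lines 20 /var/log/messages /var/log/syslog
```

Skipping files that do not exist avoids a confusing `cat` error on distributions that ship only one of the two log files.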

spiritLHLS · 2024-08-22T08:33:57Z

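> Hello, I just ran pve_delete.sh 107 to delete a VM that was not running, and then all the NAT KVM VMs lost their network. Is that normal?

Yes, that is expected.

https://github.com/oneclickvirt/pve/blob/main/scripts/pve_delete.sh#L63C1-L67C37

After a deletion, the script restarts the whole host's network and reloads the NAT mappings.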

spiritLHLS · 2024-08-22T08:34:54Z

Aug 22 14:45:00 VM102 systemd[1]: cloud-config.service: Failed with result 'exit-code'.
Aug 22 14:45:00 VM102 systemd[1]: Failed to start Apply the settings specified in cloud-config.

Aug 22 14:45:00 VM102 systemd[1]: cloud-final.service: Failed with result 'exit-code'.
Aug 22 14:45:00 VM102 systemd[1]: Failed to start Execute cloud user/final scripts.

The log shows that cloud-init has some kind of problem; I don't know whether that is the cause.

spiritLHLS · 2024-08-22T08:35:32Z

> Aug 22 14:45:00 VM102 systemd[1]: cloud-config.service: Failed with result 'exit-code'.
> Aug 22 14:45:00 VM102 systemd[1]: Failed to start Apply the settings specified in cloud-config.
>
> Aug 22 14:45:00 VM102 systemd[1]: cloud-final.service: Failed with result 'exit-code'.
> Aug 22 14:45:00 VM102 systemd[1]: Failed to start Execute cloud user/final scripts.
>
> The log shows that cloud-init has some kind of problem; I don't know whether that is the cause.

cat /etc/cloud/cloud.cfg

Take a look at the configuration inside the VM.

spiritLHLS · 2024-08-22T08:36:36Z

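If the gateway can be pinged from the host, the network configuration on the outside is fine; the problem is the configuration inside the VM.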

spiritLHLS · 2024-08-22T08:37:34Z

If you can, try creating VMs with different operating systems to see whether only one type of system is affected.

wbews · 2024-08-22T08:38:16Z

service networking restart
systemctl restart networking.service

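When the network is restarted during a deletion, the VMs do not recover on their own and can only be restarted from the console, right?

The VM's cloud.cfg: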

root@VM102:~# cat /etc/cloud/cloud.cfg
# The top level settings are used as module
# and system configuration.

# A set of users which may be applied and/or used by various modules
# when a 'default' entry is found it will reference the 'default_user'
# from the distro configuration specified below
users:
   - default

# If this is set, 'root' will not be able to ssh in and they
# will get a message to login instead as the above $user (debian)
disable_root: true

# This will cause the set+update hostname module to not operate (if true)
preserve_hostname: false

# This prevents cloud-init from rewriting apt's sources.list file,
# which has been a source of surprise.
apt_preserve_sources_list: true

# Example datasource config
# datasource:
#    Ec2:
#      metadata_urls: [ 'blah.com' ]
#      timeout: 5 # (defaults to 50 seconds)
#      max_wait: 10 # (defaults to 120 seconds)

# The modules that run in the 'init' stage
cloud_init_modules:
 - migrator
 - seed_random
 - bootcmd
 - write-files
 - growpart
 - resizefs
 - disk_setup
 - mounts
 - set_hostname
 - update_hostname
 - update_etc_hosts
 - ca-certs
 - rsyslog
 - users-groups
 - ssh

# The modules that run in the 'config' stage
cloud_config_modules:
# Emit the cloud config ready event
# this can be used by upstart jobs for 'start on cloud-config'.
 - emit_upstart
 - ssh-import-id
 - locale
 - set-passwords
 - grub-dpkg
 - apt-pipelining
 - apt-configure
 - ntp
 - timezone
 - disable-ec2-metadata
 - runcmd
 - byobu

# The modules that run in the 'final' stage
cloud_final_modules:
 - package-update-upgrade-install
 - fan
 - puppet
 - chef
 - salt-minion
 - mcollective
 - rightscale_userdata
 - scripts-vendor
 - scripts-per-once
 - scripts-per-boot
 - scripts-per-instance
 - scripts-user
 - ssh-authkey-fingerprints
 - keys-to-console
 - phone-home
 - final-message
 - power-state-change

# System and/or distro specific settings
# (not accessible to handlers/transforms)
system_info:
   # This will affect which distro class gets used
   distro: debian
   # Default user name + that default users groups (if added/used)
   default_user:
     name: debian
     lock_passwd: True
     gecos: Debian
     groups: [adm, audio, cdrom, dialout, dip, floppy, netdev, plugdev, sudo, video]
     sudo: ["ALL=(ALL) NOPASSWD:ALL"]
     shell: /bin/bash
   # Other config here will be given to the distro class and/or path classes
   paths:
      cloud_dir: /var/lib/cloud/
      templates_dir: /etc/cloud/templates/
      upstart_dir: /etc/init/
   package_mirrors:
     - arches: [default]
       failsafe:
         primary: http://deb.debian.org/debian
         security: http://security.debian.org/
   ssh_svcname: ssh
root@VM102:~#

spiritLHLS · 2024-08-22T08:50:24Z

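> service networking restart
> systemctl restart networking.service
> When the network is restarted during a deletion, the VMs do not recover on their own and can only be restarted from the console, right?

That is possible. The next time you hit this, you can try

systemctl restart pve-cluster
systemctl restart pvedaemon
systemctl restart pveproxy
systemctl restart pvestatd

Restart the PVE services and see whether the VM network comes back up.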

wbews · 2024-08-22T08:55:05Z

I restarted the PVE services; the VM network did not come back up.

spiritLHLS · 2024-08-22T08:57:30Z

ifdown vmbr0 && ifup vmbr0

ifdown vmbr1 && ifup vmbr1

What about bringing the bridges down and then back up?

wbews · 2024-08-22T09:05:27Z

That doesn't work either.
root@pve:~# ifdown vmbr1 && ifup vmbr1
root@pve:~# brctl show
bridge name bridge id STP enabled interfaces
vmbr0 8000.bc24111480bd no eth0
vmbr1 8000.000000000000 no
root@pve:~# brctl show
bridge name bridge id STP enabled interfaces
vmbr0 8000.bc24111480bd no eth0
vmbr1 8000.000000000000 no
root@pve:~# brctl show
bridge name bridge id STP enabled interfaces
vmbr0 8000.bc24111480bd no eth0
vmbr1 8000.000000000000 no
root@pve:~# ifdown vmbr0 && ifup vmbr0
warning: vmbr0: up cmd 'ip addr del fe80::be24:11ff:feb6:c5c2/64 dev eth0' failed: returned 2 (Error: ipv6: address not found.
)
root@pve:~# ifdown vmbr0 && ifup vmbr0
warning: vmbr0: up cmd 'ip addr del fe80::be24:11ff:feb6:c5c2/64 dev eth0' failed: returned 2 (Error: ipv6: address not found.
)
root@pve:~# brctl show
bridge name bridge id STP enabled interfaces
vmbr0 8000.bc24111480bd no eth0
vmbr1 8000.000000000000 no
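
The `brctl show` output above lists `vmbr1` with no attached interfaces, which is consistent with the VMs' tap devices no longer being ports of the bridge. A minimal sketch for spotting that condition automatically follows. `portless_bridges` is a hypothetical helper name, and the sketch assumes the classic `brctl show` layout, where a bridge's first port appears as a fourth field on the bridge's own line.

```shell
#!/bin/sh
# Read `brctl show` output on stdin and print the bridges that have no ports.
# A bridge line has the fields: name, bridge id, STP flag, [first interface].
# Lines with exactly three fields and a bridge-id-shaped second field are
# therefore bridges without any attached interface.
portless_bridges() {
    awk 'NR > 1 && NF == 3 && $2 ~ /^[0-9a-f]+\.[0-9a-f]+$/ { print $1 }'
}

# On the host one would run:
#   brctl show | portless_bridges
```

For the output posted in this thread the helper prints only `vmbr1`. `bridge link show` from iproute2 gives similar information on hosts without bridge-utils.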

spiritLHLS · 2024-08-22T09:06:13Z

Hmm, that really is strange: restarting the network outside, on the host, has no effect at all on the VMs' network.

wbews · 2024-08-22T09:11:26Z

service networking restart
systemctl restart networking.service
Once the host restarts its network, the VMs can only get their network back through a cold restart; reboot has no effect either.

spiritLHLS · 2024-08-22T09:13:15Z

qm agent $vmid network-interfaces-flush

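I don't know whether the image you use has the QEMU Guest Agent installed. If it does, refreshing the guest's network interfaces this way might help, though I'm not sure.

Replace $vmid with your VM's ID, e.g. 102 or 103.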

spiritLHLS · 2024-08-22T09:15:27Z

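Your original problem is probably the same kind of fault: the network outside restarted by itself, the VMs' tap devices were lost, and the VMs could then only get their network back through a cold restart.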
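
If the tap devices still exist on the host and have merely been detached from the recreated bridge (an assumption; the thread shows no `ip link` output), they could in principle be reattached without restarting the VM. The sketch below only prints the commands for review instead of running them. `tap_reattach_cmds` is a hypothetical helper name; it relies on PVE naming tap devices `tap<vmid>i<n>` and on `qm config` printing `netN:` lines that contain `bridge=<name>`, and it assumes the PVE firewall is not enabled on the NIC (with the firewall on, the tap is attached to an intermediate firewall bridge instead).

```shell
#!/bin/sh
# Read `qm config <vmid>` output on stdin and print the iproute2 commands that
# would re-enslave each of the VM's tap devices to its configured bridge.
# Usage: qm config 102 | tap_reattach_cmds 102
tap_reattach_cmds() {
    vmid=$1
    # Turn "net0: virtio=...,bridge=vmbr1,..." into "0 vmbr1".
    sed -n 's/^net\([0-9][0-9]*\):.*bridge=\([^,]*\).*$/\1 \2/p' |
    while read -r idx bridge; do
        echo "ip link set dev tap${vmid}i${idx} master ${bridge}"
        echo "ip link set dev tap${vmid}i${idx} up"
    done
}
```

Printing rather than executing keeps the sketch safe to try; the output can be piped to `sh` once it has been checked. Whether this actually restores connectivity in the nested setup discussed here is untested.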

wbews · 2024-08-22T09:23:02Z

Yes, that is probably the cause. But this command cannot do the refresh!
root@pve:~# qm agent 102 network-interfaces-flush
400 Parameter verification failed.
command: value 'network-interfaces-flush' does not have a value in the enumeration 'fsfreeze-freeze, fsfreeze-status, fsfreeze-thaw, fstrim, get-fsinfo, get-host-name, get-memory-block-info, get-memory-blocks, get-osinfo, get-time, get-timezone, get-users, get-vcpus, info, network-get-interfaces, ping, shutdown, suspend-disk, suspend-hybrid, suspend-ram'
qm guest cmd

spiritLHLS · 2024-08-22T09:26:00Z

> Yes, that is probably the cause. But this command cannot do the refresh!
> root@pve:~# qm agent 102 network-interfaces-flush
> 400 Parameter verification failed.
> command: value 'network-interfaces-flush' does not have a value in the enumeration 'fsfreeze-freeze, fsfreeze-status, fsfreeze-thaw, fstrim, get-fsinfo, get-host-name, get-memory-block-info, get-memory-blocks, get-osinfo, get-time, get-timezone, get-users, get-vcpus, info, network-get-interfaces, ping, shutdown, suspend-disk, suspend-hybrid, suspend-ram'
> qm guest cmd

A cold restart from the web panel should really just be shutting the VM down and starting it again, right?

qm shutdown 102
qm start 102

Does restarting it directly with these commands work as well? Give it a try.

wbews · 2024-08-22T09:32:32Z

That works. Is it really necessary to restart the host network when deleting a VM?

spiritLHLS · 2024-08-22T09:34:37Z

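> That works. Is it really necessary to restart the host network when deleting a VM?

No, it isn't. I have just removed that part of the script.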

spiritLHLS · 2024-08-22T09:37:04Z

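> Yes, that is probably the cause. But this command cannot do the refresh!
> root@pve:~# qm agent 102 network-interfaces-flush
> 400 Parameter verification failed.
> command: value 'network-interfaces-flush' does not have a value in the enumeration 'fsfreeze-freeze, fsfreeze-status, fsfreeze-thaw, fstrim, get-fsinfo, get-host-name, get-memory-block-info, get-memory-blocks, get-osinfo, get-time, get-timezone, get-users, get-vcpus, info, network-get-interfaces, ping, shutdown, suspend-disk, suspend-hybrid, suspend-ram'
> qm guest cmd
>
> A cold restart from the web panel should really just be shutting the VM down and starting it again, right?
>
> qm shutdown 102
> qm start 102
>
> Does restarting it directly with these commands work as well? Give it a try.

ifreload -a

This is the command that reloads the interfaces file.

That said, I doubt it will help either; the tap devices will probably still not be recreated and attached to the bridge.

Solving this for good probably means moving to OVS, an enhanced kind of bridge.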

spiritLHLS · 2024-08-22T09:51:09Z

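> A cold restart from the web panel should really just be shutting the VM down and starting it again, right?
>
> qm shutdown 102
> qm start 102
>
> Does restarting it directly with these commands work as well? Give it a try.

Here is an automated version that restarts the VMs: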

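#!/bin/bash
# Collect the IDs of all running VMs (the third column of `qm list` is the status).
running_vms=$(qm list | awk '$3 == "running" {print $1}')
if [ -z "$running_vms" ]; then
    echo "No running VMs."
    exit 0
fi
echo "The following VMs will be shut down and then started again:"
echo "$running_vms"
for vm in $running_vms; do
    qm shutdown "$vm"
    # Wait until the VM is no longer reported as running.
    while qm status "$vm" | grep -q running; do
        sleep 5
    done
    qm start "$vm"
    # Wait until the VM is reported as running again.
    while ! qm status "$vm" | grep -q running; do
        sleep 5
    done
    sleep 1
done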

wbews · 2024-08-22T09:57:45Z

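I've found that sometimes a cold restart of a VM from the host or from PVE gets no response and PVE reports a timeout. When that happens I first have to run reboot in the console and then immediately shut down again 🤣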
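
For the timeout case described above, one possible approach is to let `qm shutdown` fall back to a hard stop by itself. The sketch below uses the `--timeout` and `--forceStop` options of `qm shutdown`; `restart_vm` and the `QM` override variable are hypothetical names introduced only for illustration, and the 60-second default is an arbitrary choice.

```shell
#!/bin/sh
# QM is the command used to talk to PVE; override it (e.g. QM=echo) for a dry run.
QM=${QM:-qm}

# Shut a VM down cleanly, force-stop it if the guest does not react in time,
# then start it again.
# Usage: restart_vm VMID [TIMEOUT_SECONDS]
restart_vm() {
    vmid=$1
    timeout=${2:-60}
    # --timeout: seconds to wait for a clean guest shutdown.
    # --forceStop 1: stop the VM hard if it is still running after the timeout.
    $QM shutdown "$vmid" --timeout "$timeout" --forceStop 1 || return 1
    $QM start "$vmid"
}
```

Running `QM=echo` before calling `restart_vm 102` prints the two `qm` invocations without touching any VM. A forced stop is equivalent to pulling the power, so it is best kept as a fallback.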

spiritLHLS · 2024-08-22T10:46:35Z

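> I've found that sometimes a cold restart of a VM from the host or from PVE gets no response and PVE reports a timeout. When that happens I first have to run reboot in the console and then immediately shut down again 🤣

Which provider is this server from, to have so many problems?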

wbews · 2024-08-22T12:38:32Z

鸡仔云. With other providers, don't the VMs lose their network when the host network is restarted?

spiritLHLS · 2024-08-22T12:41:31Z

Damn, this provider again? See

#20

This project does not support nesting inside an already nested setup.

wbews · 2024-08-22T12:46:56Z

I just read #20; that provider is using this project too 🤣

spiritLHLS · 2024-08-22T12:47:39Z

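Unbelievable. It looks like the nesting is what breaks things, but I don't know exactly where the fault lies; that is beyond my knowledge.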

spiritLHLS · 2024-08-22T12:51:23Z

Let's leave it at this for now. I'll keep the issue open until someone tracks down the cause some day.

spiritLHLS · 2024-08-22T12:52:34Z

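> 鸡仔云. With other providers, don't the VMs lose their network when the host network is restarted?

I have never run into it, and no other user has reported this problem.

Using this project to set up PVE and then nesting PVE inside it again is very unusual.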

wbews · 2024-08-22T12:53:11Z

OK, thanks.

spiritLHLS · 2024-08-22T12:54:01Z

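> OK, thanks.

If you don't specifically need KVM, using LXD/INCUS would probably avoid this problem; the configurations shouldn't conflict that way.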

spiritLHLS added the bug Something isn't working label Aug 22, 2024

spiritLHLS added the help wanted Extra attention is needed label Aug 22, 2024

spiritLHLS added the enhancement New feature or request label Aug 22, 2024
