
[Bug] In TUN mode, replies to DNAT'ed inbound traffic never leave the box #1493

Open
7 tasks done
huzheyi opened this issue Sep 3, 2024 · 13 comments
Labels
bug Something isn't working

Comments

huzheyi commented Sep 3, 2024

Verify steps

  • I have read the documentation and understand the meaning of every configuration item I have written, avoiding piling up seemingly useful options or default values.
  • I have reviewed the documentation, and it did not resolve this issue.
  • I have searched the Issue Tracker for the problem I am going to raise and found no existing report.
  • I have tested with the latest Alpha branch version, and the issue still persists.
  • I have provided server and client configuration files and a procedure that can reproduce the issue locally, rather than a redacted, complex client configuration file.
  • I have provided the simplest configuration that reproduces the reported error, rather than one relying on remote servers, TUN, graphical client interfaces, or other closed-source software.
  • I have provided complete configuration files and logs, rather than only the parts I believe are useful out of confidence in my own intelligence.

Operating System

Linux

System Version

vyos 1.5

Mihomo Version

Mihomo Meta alpha-43f21c0 linux amd64 with go1.23.0 Mon Sep 2 08:24:57 UTC 2024
Use tags: with_gvisor

Configuration File

#port: 7890
#socks-port: 7891
mixed-port: 7890
#redir-port: 7892
#tproxy-port: 9898

allow-lan: true
bind-address: '*'

find-process-mode: strict

mode: rule

geox-url:
  geoip: "https://fastly.jsdelivr.net/gh/MetaCubeX/meta-rules-dat@release/geoip.dat"
  geosite: "https://fastly.jsdelivr.net/gh/MetaCubeX/meta-rules-dat@release/geosite.dat"
  mmdb: "https://fastly.jsdelivr.net/gh/MetaCubeX/meta-rules-dat@release/geoip.metadb"

# geodata-mode: true
geodata-loader: standard
geo-auto-update: true
geo-update-interval: 72

log-level: warning

ipv6: true

external-controller: 0.0.0.0:9090

tcp-concurrent: true

external-ui: /root/.config/mihomo/ui
external-ui-url: "https://github.com/MetaCubeX/metacubexd/archive/refs/heads/gh-pages.zip"

global-client-fingerprint: ios

profile:
  store-selected: true
  store-fake-ip: true

tun:
  enable: true
  stack: mixed
  dns-hijack:
    - 'any:53'
  auto-route: true
#  auto-redirect: true
  auto-detect-interface: true
  gso: true
  gso-max-size: 65536

sniffer:
  enable: true
  sniff:
    TLS:
      ports: [443, 8443]
    HTTP:
      ports: [80, 8080-8880]
      override-destination: true
    QUIC:
      ports: [443,8443]
  force-domain:
    - +.v2ex.com
  skip-domain:
     - Mijia Cloud

dns:
  cache-algorithm: arc
  enable: true
  prefer-h3: true
  listen: :5353
  ipv6: true

  default-nameserver:
    - 119.29.29.29
    - 223.5.5.5
    - system

  enhanced-mode: fake-ip
  fake-ip-range: 198.18.0.1/16
  # use-hosts: true

  respect-rules: false

  fake-ip-filter:
     - '*.lan'
     - '*.linksys.com'
     - '+.pool.ntp.org'
     - localhost.ptlogin2.qq.com
     - openpgpkey.kernel.org

  nameserver:
    - https://doh.pub/dns-query
    - https://dns.alidns.com/dns-query

  fallback:
    - https://1.1.1.1/dns-query
    - tls://1.0.0.1:853

  fallback-filter:
    geoip: true
    geoip-code: CN
    geosite:
      - gfw
    ipcidr:
      - 240.0.0.0/4
    domain:
      - '+.google.com'
      - '+.facebook.com'
      - '+.youtube.com'

  nameserver-policy:
    "geosite:private,cn,private,apple,microsoft@cn,category-games@cn":
      - https://doh.pub/dns-query
      - https://dns.alidns.com/dns-query

proxies:

... (omitted)

rule-providers:
  bypass-source:
    type: file
    behavior: classical
    path: "bypass-source.yaml"

rules:

  - RULE-SET,bypass-source,DIRECT
  - GEOIP,private,DIRECT
  - GEOIP,cn,DIRECT
  - GEOSITE,private,DIRECT
  - GEOSITE,cn,DIRECT
  - GEOSITE,apple,DIRECT
  - GEOSITE,microsoft@cn,DIRECT
  - GEOSITE,category-games@cn,DIRECT
  - GEOIP,telegram,PROXY,no-resolve
  - MATCH,PROXY

Description

I am running mihomo 1.18.8 on vyos in container host mode.

I have DNAT mappings configured on vyos. Packet captures show the following
when a host on the external network accesses a mapped service:

  1. The packet first arrives on the pppoe0 interface; a capture on pppoe0 shows
15:50:28.099265 IP 43.226.237.69.32153 > 123.117.170.178.4433: Flags [S], seq 2965468837, win 64240, options [mss 1448,sackOK,TS val 70108068 ecr 0,nop,wscale 7], length 0
  2. The packet is then forwarded to the LAN according to the DNAT rule; a capture on br0 shows
15:50:28.099505 IP 43.226.237.69.32153 > 192.168.1.41.443: Flags [S], seq 2965468837, win 64240, options [mss 1448,sackOK,TS val 70108068 ecr 0,nop,wscale 7], length 0
  3. The internal server then answers the TCP request with a SYN-ACK; a capture on br0 shows
15:50:28.099613 IP 192.168.1.41.443 > 43.226.237.69.32153: Flags [S.], seq 3703792511, ack 2965468838, win 31856, options [mss 1460,sackOK,TS val 1923171828 ecr 70108068,nop,wscale 7], length 0
  4. This packet then enters the Meta interface; a capture on Meta shows
15:50:28.099742 IP 123.117.170.178.4433 > 43.226.237.69.32153: Flags [S.], seq 3703792511, ack 2965468838, win 31856, options [mss 1460,sackOK,TS val 1923171828 ecr 70108068,nop,wscale 7], length 0
  5. Normally this packet should now be visible on pppoe0, but in fact it is not, and the host on the external network never receives any reply.

When I shut down mihomo, the relevant packets are captured normally on both pppoe0 and br0. I have confirmed this is unrelated to my vyos firewall: the behavior is the same with the firewall disabled (allow-all policy).

I am not sure what the correct behavior at step 4 should be:

  1. the reply packets from the internal server should not enter the Meta interface, but leave directly via pppoe0;
    or:
  2. the reply packets from the internal server should enter the Meta interface and then leave via pppoe0.

Either way, the host on the external network receives no reply at all.

Outbound access from LAN hosts to the internet works completely normally.
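For context, per-interface captures like the ones above can be taken with tcpdump; a minimal sketch (interface names are from this setup, the port filters are assumptions matching the 4433 -> 443 mapping in this issue):

```shell
# WAN side: SYN arriving on the public-facing port
tcpdump -ni pppoe0 'tcp port 4433'
# LAN side after DNAT: same flow rewritten to the internal server
tcpdump -ni br0 'tcp port 443'
# mihomo's tun device: where the reply unexpectedly shows up
tcpdump -ni Meta
```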

Reproduction Steps

vyos DNAT configuration

set nat destination rule 101 inbound-interface name pppoe0
set nat destination rule 101 destination port 4433
set nat destination rule 101 protocol tcp
set nat destination rule 101 translation address 192.168.1.41
set nat destination rule 101 translation port 443

vyos mihomo container configuration

 allow-host-networks
 capability net-admin
 device tun {
     destination /dev/net/tun
     source /dev/net/tun
 }
 host-name mihomo
 image docker.io/metacubex/mihomo:Alpha
 restart always
 volume mihomo {
     destination /root/.config/mihomo
     source /config/user-data/mihomo
 }

Tested on mihomo alpha, v1.18.8 and v1.18.5; the behavior is identical.

Logs

No relevant logs could be captured even at DEBUG level.
@huzheyi huzheyi added the bug Something isn't working label Sep 3, 2024

huzheyi commented Sep 4, 2024

Today I built a simple test environment on a virtualization platform to test this problem specifically:
vyos VM
eth0 (wan): static IP, 192.168.1.250/24
br0 (lan): dhcp-server, 172.20.0.1/24
Internal server: 172.20.0.107/24

I configured the same firewall policy, SNAT masquerade policy and DNAT policy on vyos, mapping port 22 of the internal server to port 2222 on the WAN.
Accessing 192.168.1.250:2222 from the external machine 192.168.1.93 works normally with mihomo enabled.

Capturing on the Meta interface shows no traffic at all, so the inbound and outbound traffic there is presumably not being processed by mihomo.

Very puzzling.

If the two environments differ, the biggest difference is probably that the vyos in my production environment has four ports, eth0 through eth3:
a pppoe0 interface on eth0 connects to the Internet;
eth1 is connected to the upstream router's IPTV network, has no address configured, and receives IGMP multicast traffic;
the eth1 sub-interface vif3010 is bridged to the upstream router's VoIP network and has a static address;
eth2 and eth3 are member interfaces of br0, forming the LAN network, with a switch downstream.

My guess: could mihomo have some bug when facing a larger number of network interfaces that causes traffic to be wrongly steered into the tun device?

To supplement, here are the ip rule entries and the 2022 routing table that mihomo creates on the vyos device:

//ip rule 
0:	from all lookup local
9000:	from all to 198.18.0.0/30 lookup 2022
9001:	not from all dport 53 lookup main suppress_prefixlength 0
9001:	from all ipproto icmp goto 9010
9001:	from all iif Meta goto 9010
9002:	not from all iif lo lookup 2022
9002:	from 0.0.0.0 iif lo lookup 2022
9002:	from 198.18.0.0/30 iif lo lookup 2022
9010:	from all nop
32766:	from all lookup main
32767:	from all lookup default

//route table 2022
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

VRF default table 2022:
K>* 0.0.0.0/0 [0/0] is directly connected, Meta, 1d01h12m

Comparison shows the rules and table above are identical in the production and test environments; the mihomo configuration is also identical.
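Given the rule list above, one conceivable workaround (a sketch only, not verified on VyOS) is to add a rule with priority below mihomo's 9000-series entries, so that replies sourced from the DNAT target are routed by the main table (and thus out via pppoe0) instead of table 2022. The priority 8999 is an assumption; 192.168.1.41 is the DNAT target from this thread:

```shell
# Route traffic sourced from the DNAT'ed internal server via the main table,
# bypassing mihomo's "lookup 2022" rules (which start at pref 9000).
ip rule add from 192.168.1.41 lookup main pref 8999

# Verify the new rule sits above mihomo's entries:
ip rule show
```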


huzheyi commented Sep 4, 2024

I also tested mapping the internal port to eth0 (instead of pppoe0) in the production environment; the service is reachable normally from the upstream device connected to eth0.

The cause is now becoming clearer. When I map to pppoe0:
Inbound request: packet (from public ip) --> wan --dnat--> br0 --> host; mihomo is not involved in this path.
Outbound traffic: packet (to public ip) --> host --> br0 --snat--> tun; after SNAT, mihomo may treat this traffic as a new connection request and hijack it.

Whether this traffic actually leaves the box I cannot yet determine. In theory it should be able to go out, and the fact that I capture nothing on the external server could also be because the packet's conntrack state is wrong and the cloud server's firewall drops it as a NEW connection; I need to test this further.

When I map to eth0 instead, eth0 has a private address, so the outbound traffic is not hijacked by mihomo, and naturally nothing shows up on the Meta interface.


huzheyi commented Sep 4, 2024

I found two issues in the OpenClash project; apparently they currently read OpenWrt's port-forwarding configuration and RETURN traffic for those ports early in iptables. Not very elegant...
vernesong/OpenClash#146
vernesong/OpenClash#525
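For reference, the OpenClash-style workaround mentioned above amounts to something like the following in an iptables redirect/tproxy setup (chain and match are assumptions; it does not apply directly to TUN mode, where steering is done by policy routing rather than iptables):

```shell
# RETURN reply traffic belonging to a forwarded port before it reaches the
# proxy redirection rules; 192.168.1.41:443 is the mapping from this issue.
iptables -t mangle -I PREROUTING -p tcp -s 192.168.1.41 --sport 443 -j RETURN
```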

@Wacchi-Lorie

Could this problem also be related to #281?
After using tun mode I tried every mode, and also tried running without any node; games all show strict NAT, and the UDP port forwards I configured have all stopped working too.


huzheyi commented Sep 6, 2024

Could this problem also be related to #281? After using tun mode I tried every mode, and also tried running without any node; games all show strict NAT, and the UDP port forwards I configured have all stopped working too.

In my tests, any internal service DNAT'ed to the WAN IP, whether TCP or UDP, ends up in the state I described.

@WangShayne

After enabling tun, port forwarding fails when accessed from outside:
MobileNetwork --> a.com:1234 (public ip) --> 192.168.1.99:4321 ❌
But accessing the external port from inside the LAN works:
Wifi --> a.com:1234 (public ip) --> 192.168.1.99:4321 ✔


huzheyi commented Sep 11, 2024

After enabling tun, port forwarding fails when accessed from outside: MobileNetwork --> a.com:1234 (public ip) --> 192.168.1.99:4321 ❌ But accessing the external port from inside the LAN works: Wifi --> a.com:1234 (public ip) --> 192.168.1.99:4321 ✔

Yes, because in that case hairpin NAT also performs a source address translation; when you access a.com you are in fact still reaching 192.168.1.99.

@WanQiyang

Try endpoint-independent-nat: true; for me it works as a temporary fix.
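If it helps: endpoint-independent-nat is a tun-section option, and as far as I understand it makes the tun stack's UDP NAT behave like full-cone NAT (one external mapping per internal endpoint, regardless of the remote peer), which is why it can relax strict-NAT symptoms. A minimal fragment based on the config in this issue:

```yaml
tun:
  enable: true
  stack: mixed
  endpoint-independent-nat: true  # full-cone style UDP NAT in the tun stack
```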


huzheyi commented Sep 12, 2024

Try endpoint-independent-nat: true; for me it works as a temporary fix.

I don't really understand what this parameter means; could you explain?

@WangShayne

For now I have changed the docker container to host mode.


huzheyi commented Sep 12, 2024

For now I have changed the docker container to host mode.

For now I am running tproxy mode instead; tproxy is actually somewhat more efficient.

@gitcook

gitcook commented Sep 18, 2024

How did you all solve this? I can't quite follow. With docker deployed in host mode, I cannot access my NAS from the external network.

@WangShayne

How did you all solve this? I can't quite follow. With docker deployed in host mode, I cannot access my NAS from the external network.

Did you open the ports in ufw?
