Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proxy and MTU matching #9

Closed
flplv opened this issue Jul 30, 2020 · 26 comments
Closed

Proxy and MTU matching #9

flplv opened this issue Jul 30, 2020 · 26 comments
Labels
bug Something isn't working

Comments

@flplv
Copy link

flplv commented Jul 30, 2020

Hello, I believe I found a bug and will submit a patch soon.

The bug is this:

When using the project as proxy the MTU may be different between clients and server, and that will break the raknet protocol, that is MTU sensitive, because packets between server and clients in the example proxy are passed as raw data, they are not assembled, while the MTUs are negotiated between proxy and server and proxy and clients inside the raknet handshake.

The solution I found was to connect to the server first, get the MTU of the server, and disconnect. Wait for client connections but limiting max MTU with client to match MTU of the server, and then when opening the connection to the server after a client connects, limit server MTU with client's. This will cover usecases where either client or server have bigger MTU than the other, by limiting the MTU of the connection to the least of the two sides.

@flplv
Copy link
Author

flplv commented Jul 30, 2020

During my tests, I confirmed the bug on the case that server has smaller MTU than client, I did not test the inverse to confirm the bug when the client has smaller MTU than the server. But I assume it would confirm too since server would send a packet that the client cannot decode due to bigger MTU than the one expected.

If you'd like I can share the tcpdumps.

@Sandertv
Copy link
Owner

I don't think I see the issue here. The raknet client and server establish an MTU size based on the other end and use that MTU size exclusively for that connection. I don't see why that would be a problem, as fragmentation of packets happens on both ends depending on their respective MTU sizes.

@Sandertv Sandertv added bug Something isn't working decision needed Extra attention is needed labels Jul 30, 2020
@flplv
Copy link
Author

flplv commented Jul 30, 2020

Yep, the Raknet handshakes will stabilish the MTU per connection basis in a proxy use case, not end to end.

Take this example:

Assume the MTU are stabilised as follows:
proxy to server: MTU is 1300
proxy to client: MTU is 1400

Now when client sends a packet of size 1400 to the proxy, proxy will forward it in raw mode to the server, and it will violate max MTU size of the interface. In tcpdump I observed the following messages:

17:24:34.216571 IP x.x.x.x.dynamic.adsl.gvt.net.br.63328 > instance-1.us-east1-b.c.propane-choir-284403.internal.19132: UDP, bad length 1456 > 1448

It is an UDP stack error, at the interface level, because it received a packet that was bigger than its maximum payload.

The key thing is that the suggested proxy.go implementation does not reassemble Raknet packets, instead, they are forwarded between sides in RAW mode. The fix could be to match MTUs or to do Raknet aware proxy during the payloads, not only the handhakes, but that would be impossible because connection after handshake is encrypted end to end.

@Sandertv
Copy link
Owner

The thing is that the proxy doesn't forward anything in 'raw' mode. It should all be fragmented where appropriate.

@flplv
Copy link
Author

flplv commented Jul 30, 2020

Interesting, let me do some more analysis here.

@Sandertv
Copy link
Owner

So if there is actually an issue where packets are sent that exceed the MTU size, I think that might be a bug of the actual fragmentation code.

@flplv
Copy link
Author

flplv commented Jul 30, 2020

Ok, I thought encryption was at raknet layer, but it seems it is at Minecraft layer.

I could not narrow down the cause of the bug yet, but this is the test I did:

I tested my use case in two manners: having this project as a server and having the Mojang's Dedicated Minecraft Bedrock as a server, in the same cloud machine, in the same interface and port, at different moments, and I observed a difference in the OpenConnetionReply1 packet, from server to client.

Mojang's implementation has reduced down the MTU, while this implementation did not.

Please find attached the two tcpdumps.
captures.zip

@Sandertv
Copy link
Owner

Sandertv commented Jul 30, 2020

I think the MTU size being 1400 is a different thing altogether. Mojang for some reason caps the MTU size at 1400 server side.

@flplv
Copy link
Author

flplv commented Jul 30, 2020

Got it. Yet it is an important data point that when I capped MTU down to 1400 with the fix I made the problem went away. Somehow the root cause must be related to MTU size... I am not sure how much more time I can put in this debug, I will let you know if I find out anything.

@Sandertv
Copy link
Owner

17:24:34.216571 IP x.x.x.x.dynamic.adsl.gvt.net.br.63328 > instance-1.us-east1-b.c.propane-choir-284403.internal.19132: UDP, bad length 1456 > 1448
This error is worrying though, it might be pointing at some bug in the fragmenting. I'll have a look at that. With this error, do you know what MTU size was established for the connection that this occurred on?

@flplv
Copy link
Author

flplv commented Jul 30, 2020

Let me double check.

@flplv
Copy link
Author

flplv commented Jul 30, 2020

Humm.. scrolling down my tcpdump capture look what I found:

Screenshot from 2020-07-30 17-46-52

The server is flooding the client with packets of packet size 1474, while the negotiated MTU was 1492.

Screenshot from 2020-07-30 17-48-24

Those packets are definitely too big for the MTU of the interface. Either my interface is reporting bad MTU to the application, or the application is not checking the right MTU of the interface.

@Sandertv
Copy link
Owner

Perhaps limiting it to 1400 like Mojang does isn't a bad idea. That way the packets probably won't be fragmented like this anymore and VPNs will work properly...

@flplv
Copy link
Author

flplv commented Jul 30, 2020

Indeed.

@flplv
Copy link
Author

flplv commented Jul 30, 2020

this is my interface MTU:

ens4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1460

@Sandertv
Copy link
Owner

Thanks a lot for all the debugging. I'm going to dive into this tonight/tomorrow. I'll keep you up to date.

@Sandertv Sandertv removed the decision needed Extra attention is needed label Jul 30, 2020
@flplv
Copy link
Author

flplv commented Jul 30, 2020

Isn't pk.MTUSize = int16(buf.Len()+1) + 28 computing MTUSize over a packet that was already fragmented and reassembled by the IP layer? Cos the OpenConnectionRequest1 received packet in the server had 1490 bytes, while the MTU of the interface was 1460, and wireshark pointed it as a fragemented IP packet too.

@flplv
Copy link
Author

flplv commented Jul 30, 2020

It doesn't make much sense, but this is what I got on my server regarding mtu compuation:

int16(buf.Len()+1) + 28 returns 1492
tcp packet size in wireshark is: 1490 size, with 1456 as payload.

These numbers makes no sense. buf.Len() might not reflect MTU at all...

@Sandertv
Copy link
Owner

The OpenConnectionRequest1 is padded to be able to get the maximum size of a packet that isn't dropped, but it doesn't provide any way to determine if the packet was fragmented on IP protocol level. The MTU size in that packet therefore isn't exactly right because it is only the biggest size of packets that is not dropped. I'm thinking of adding a 'PreferredMTUSize' field and establish an MTU size based on those.

@Sandertv
Copy link
Owner

Either that or I might just make the max 1400 like Mojang does.

@flplv
Copy link
Author

flplv commented Jul 30, 2020

Ah got it, so the implementation of the Raknet handshake expects packets to be discarded if they exceed the MTU of the path to destination. That is not reliable.

       discovery.  This means the kernel will keep track of the MTU to a
       specific target IP address and return EMSGSIZE when a UDP packet
       write exceeds it.  When this happens, the application should decrease
       the packet size.  Path MTU discovery can be also turned off using the
       IP_MTU_DISCOVER socket option or the
       /proc/sys/net/ipv4/ip_no_pmtu_disc file; see ip(7) for details.  When
       turned off, UDP will fragment outgoing UDP packets that exceed the
       interface MTU.  However, disabling it is not recommended for
       performance and reliability reasons.

https://man7.org/linux/man-pages/man7/udp.7.html

In my system, MTU path detection is disabled, that means IP fragmentation will occur.

Detecting MTU reliably and portably is probably impossible, I'd recommend a hard limit in the MTU as you suggested.

@Sandertv
Copy link
Owner

Could you check out a2bcbfd and see if it still happens? I've capped the MTU size at 1400 and packet sizes seem to be equal to those sent by the vanilla server.

@flplv
Copy link
Author

flplv commented Jul 31, 2020

tested and the patch fixed the issue

@Sandertv
Copy link
Owner

Okay excellent. I will close this issue and tag a new version.

@flplv
Copy link
Author

flplv commented Jul 31, 2020 via email

@ryanMushy
Copy link

Hey gang, I'm a year late to finding this thread but...

I also ran into mtu max packet size issues while working with raknet. We were sending large hd images via udp and getting severe dropouts. Wireshark showed no extra packet splitting beyond the initial fragmentation to make all packets 1500 bytes. However I also noticed that my packet sizes were not the correct length.

Long story short, when a packet is split the reliability of the packet changes and that affects the packet size, here was my novel fix facebookarchive/RakNet#145

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants