Performance Limitations of the Tofino Software Model in Mininet #64
Comments
I am not aware of any documented information on the throughput limitations of the Tofino model. My guess is that even on today's highest-performance CPU cores, several thousand packets per second is pretty close to its upper limit. As you say, it is intended for testing and debugging. If you are thinking of using it in a production network to process packets, you would be spending far more CPU cycles per packet than many other techniques need on a general-purpose CPU, e.g. writing a C or Golang program to process the packets. |
Dear @lorepap,

First of all, I am somewhat surprised by the numbers you report. Even if you use jumbo frames (9216 bytes), 40Mbps of traffic would be equivalent to about 550pps (packets per second), which is quite a bit more than I would expect. And I assume you didn't disable logging, right?

Here are some facts about the model performance (and performance measurements in general) that might help you make the right decisions. The Tofino model is designed for very accurate simulation of the actual Tofino hardware, not for performance. In fact, it is quite typical that even incorrectly compiled code will produce the same results on both the model and the ASIC. That's how accurate it is.

Tofino model performance depends on:
Tofino model performance does not depend on:
Given that the size of the payload does not affect P4 program performance on Tofino (as long as the average packet size is within the spec), it is customary (and more practical) to measure performance in packets per second rather than in bits per second, since the former number does not depend on the packet length. Note also that even though the model's performance is not very high (I typically quote 100pps, although I haven't personally measured it in a while), it can be used to simulate the processing of flows of any bandwidth. The "secret" is that it has a special feature called "manual time advancement" that acts like a very sophisticated time machine. This is something that is covered in the courses I teach. |
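To make the bits-per-second versus packets-per-second relationship above concrete, here is a minimal sketch of the conversion. It assumes fixed-size frames and ignores per-frame Ethernet overhead (preamble, inter-frame gap); the function names are illustrative and not part of any Tofino tooling. The 1500-byte case is an added assumption for comparison and does not come from the thread.

```python
def packets_per_second(bits_per_second: float, frame_bytes: int) -> float:
    """Convert a bit rate into a packet rate, assuming fixed-size frames."""
    return bits_per_second / (frame_bytes * 8)


def bits_per_second(pps: float, frame_bytes: int) -> float:
    """Convert a packet rate back into a bit rate for fixed-size frames."""
    return pps * frame_bytes * 8


if __name__ == "__main__":
    # 40 Mbps carried in 9216-byte jumbo frames (the case discussed above).
    print(round(packets_per_second(40e6, 9216)))  # 543
    # The same bit rate with 1500-byte frames (an assumed size for comparison).
    print(round(packets_per_second(40e6, 1500)))  # 3333
```

The same bit rate maps to very different packet rates depending on the frame size, which is why a pps figure is the more useful number when comparing model runs.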
Thank you for your insights, @vgurevich. Yes, I have disabled logging, so the reported numbers reflect the model's raw performance without logging overhead. For reference, I observed approximately 579 pps (~43Mbps with UDP packets). Regarding your point on "manual time advancement," I am curious whether this feature could be leveraged to approximate the performance I might expect under high-rate traffic conditions. Could you provide some guidance on how to use it effectively? Specifically, can it be used to infer the expected throughput on actual Tofino hardware, or is it only useful for logical validation of pipeline behavior? Looking forward to your insights. |
@lorepap -- great to hear that you disabled logging. How big is your program? The number I quoted above (50-100pps) was measured on a pretty complex one (switch.p4_16). Manual time advancement is, indeed, designed to accurately simulate the time-dependent aspects of the pipeline functionality. As for Tofino performance in general, I can say the following:
|
I just did a quick measurement of the model performance on my VM (AWS That minimal program minprog.p4.txt does not even use the match-action pipeline per se -- everything is done inside one parser state. On a more complicated program that used all 12 stages of the ingress pipeline (albeit minimally), the performance dropped to about 170pps. I would not be surprised to see that performance drop by another factor of two if the packet had to undergo both ingress and egress processing. If I used more complicated processing (e.g. involving more parser states, more logical tables, stateful externs, more hash calculations, etc.), it would surely go down even more. So, I'd say that my initial number (50-100pps) is still a good rule of thumb for most practical purposes, whereas it looks like it is possible to quote performance of "up to" 500 (or maybe even more) pps :) |
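As a back-of-the-envelope illustration of what these rates mean in practice, the sketch below estimates how long the model would need, in wall-clock time, to push through the packets of a given offered load. It is only arithmetic under stated assumptions: the model rates (100, 170, 500 pps) are the figures quoted in this thread, the 100Mbps load matches the link speed in the original question, and the 1500-byte frame size and the function name are assumptions made for illustration. It does not model manual time advancement.

```python
def wall_clock_seconds(offered_bps: float, frame_bytes: int,
                       traffic_seconds: float, model_pps: float) -> float:
    """Estimate wall-clock time for the model to process a traffic burst.

    offered_bps     -- offered load in bits per second
    frame_bytes     -- assumed fixed frame size in bytes
    traffic_seconds -- duration of the traffic at the offered load
    model_pps       -- packets per second the software model sustains
    """
    packets = offered_bps * traffic_seconds / (frame_bytes * 8)
    return packets / model_pps


if __name__ == "__main__":
    # One second of 100 Mbps traffic in 1500-byte frames (assumed frame size).
    for rate in (100, 170, 500):
        seconds = wall_clock_seconds(100e6, 1500, 1.0, rate)
        print(f"{rate} pps -> {seconds:.1f} s of wall-clock time")
```

Under these assumptions, one second of 100Mbps traffic is roughly 8,300 packets, so the model would need on the order of tens of seconds to process it.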
Hi,
I'm conducting a stress test using Mininet and the Tofino software model on a high-performance server with 32 CPU cores and 256GB RAM, running Ubuntu 20.04.
For my experiment, I set up two hosts and established an iperf session over links configured at 100Mbps. However, the Tofino software model seems to struggle beyond ~40Mbps of traffic.
I'm aware that the software model is meant for testing/debugging purposes, but I was wondering if there is any documented information on the throughput limitations. I haven't been able to find clear benchmarks online regarding its expected performance.
Any insights would be greatly appreciated!