Research checkpoint #09: varying the size of the packets with the Raspberry Pi
Expanding our tests, in this post I explore whether packet length is an influential factor in the performance of the Raspberry Pi running the Python libraries that we have been studying in the last posts: Pyshark, Scapy and Pypcap + dpkt. The tests were performed using the iperf3 application, as we did in post 07 where we studied the performance of the libraries with different network bit rates. All the scripts used, including the Bash script that automates the tests execution, are available in my Github repository for this research, with the Python libraries and tools used to collected the metrics below being the same as those reported in the cited post.
Tests with defined number and length of packets
Different from what we did in the last posts in which the duration of the packet transmission was defined (10 seconds), here the constant over the tests is only the bit rate at 2Mb/sec, since this is a number that all the libraries still can perform reasonably well and also still allows a considerable number of packets in a short period of time. So, the procedure to perform the test cases was similar to the that used in post 07, but with the packet transmission having now 42 seconds to end after its start and no longer a fixed time interval:
- Run the Python sniffer script and wait 10 seconds;
- Start the transmission of packets;
- After 42 seconds, end the test and collect the results.
The parameters considered for tests were chosen to be within this time range and to permit a gap at the end of the iperf3 execution. They are:
- 1000 packets with 148, 724 and 1448 bytes of length;
- 4000 packets with 148, 724 and 1448 bytes of length.
In the first cases with 1000 packets, the idea was to use a volume of packets that we knew all the libraries would be capable to handle, considering what we had already studied. For the second round of tests with 4000 packets, this number was chosen to understand whether a reduction in the length of the packets would improve the results we observed in the last post, where we saw that Pyshark and Scapy never reached a count above this. In all of the scenarios, the test was run five times and the metrics reported are based on all these executions.
The MTU (Maximum Transmission Unit) of the Raspberry Ethernet interface is 1500 bytes, being the TCP MSS (Maximum Segment Size) of 1448 bytes. This last one is the default length of the packets in the remote server test cases we analyzed in the last posts, even in the ones using UDP. Using this value as a basis, two others were chosen for the comparison: 148 bytes (close to 10% of 1448) and 724 bytes (50% of 1448). All the tests were performed using my notebook as a remote server and the Raspberry Pi as a client, in a local network without internet access (the specifications of the machines can be found at post 07).
Below, I report the metrics collected with each library and then analyze the results with more attention. One observation to keep in mind: the fixed number of packets refers to the ones in the iperf3 generated flow for the test, this application also establishes a separate control connection that exchanges packets between the client and server which ends up also being counted. So, it’s natural to the Python sniffing libraries report a number a little over 1000 or 4000 packets, since this other connection always has around 35 packets.
Pyshark
Test | Use of CPU (mean) | Peak of memory use (max) | N. of packets counted (mean) | iperf3 flow data |
---|---|---|---|---|
1000 packets of 148 bytes | 28.78% | 27648 KB | 1033.2 | 2.00 Mbits/sec, 145 KBytes in 0.59 sec |
1000 packets of 724 bytes | 37.60% | 27768 KB | 1032.6 | 2.00 Mbits/sec, 707 KBytes in 2.89 sec |
1000 packets of 1448 bytes | 47.72% | 28032 KB | 1033.0 | 2.00 Mbits/sec, 1.38 MBytes in 5.79 sec |
4000 packets of 128 bytes | 79.48% | 27640 KB | 2883.8 | 2.00 Mbits/sec, 578 KBytes in 2.37 sec |
4000 packets of 724 bytes | 79.96% | 27896 KB | 2192.4 | 2.00 Mbits/sec, 2.76 MBytes in 11.58 sec |
4000 packets of 1448 bytes | 79.16% | 27964 KB | 1815.8 | 2.00 Mbits/sec, 5.52 MBytes in 23.16 sec |
Scapy
Test | Use of CPU (mean) | Peak of memory use (max) | N. of packets counted (mean) | iperf3 flow data |
---|---|---|---|---|
1000 packets of 148 bytes | 2.40% | 67228 KB | 339.0 | 2.00 Mbits/sec, 145 KBytes in 0.59 sec |
1000 packets of 724 bytes | 6.20% | 67612 KB | 994.0 | 2.00 Mbits/sec, 707 KBytes in 2.89 sec |
1000 packets of 1448 bytes | 6.64% | 67608 KB | 1031.4 | 2.00 Mbits/sec, 1.38 MBytes in 5.79 sec |
4000 packets of 128 bytes | 5.60% | 67288 KB | 880.2 | 2.00 Mbits/sec, 578 KBytes in 2.37 sec |
4000 packets of 724 bytes | 23.36% | 67864 KB | 3958.0 | 2.00 Mbits/sec, 2.76 MBytes in 11.58 sec |
4000 packets of 1448 bytes | 24.96% | 68096 KB | 4035.0 | 2.00 Mbits/sec, 5.52 MBytes in 23.16 sec |
Pypcap + dpkt
Test | Use of CPU (mean) | Peak of memory use (max) | N. of packets counted (mean) | iperf3 flow data |
---|---|---|---|---|
1000 packets of 148 bytes | 0.38% | 21204 KB | 1032.4 | 2.00 Mbits/sec, 145 KBytes in 0.59 sec |
1000 packets of 724 bytes | 0.54% | 21204 KB | 1032.2 | 2.00 Mbits/sec, 707 KBytes in 2.89 sec |
1000 packets of 1448 bytes | 0.62% | 21204 KB | 1033.6 | 2.00 Mbits/sec, 1.38 MBytes in 5.79 sec |
4000 packets of 128 bytes | 1.18% | 21204 KB | 4035.2 | 2.00 Mbits/sec, 578 KBytes in 2.37 sec |
4000 packets of 724 bytes | 1.96% | 21204 KB | 4033.0 | 2.00 Mbits/sec, 2.76 MBytes in 11.58 sec |
4000 packets of 1448 bytes | 2.36% | 21204 KB | 4029.8 | 2.00 Mbits/sec, 5.52 MBytes in 23.16 sec |
Analyzing and comparing the performance
The first thing that can be noticed is that all the libraries have an increase of CPU usage when there is an increase in the size of the packets:
Comparison of the CPU usage by each Python library, standard deviation is also indicated. Click on the image to expand it.
However, this isn’t a metric that we can use to affirm a clear correlation: the tool used to collect the CPU usage (the psutil Python library) reports the ratio between the CPU time consumed by a process and the total time interval considered for evaluation, so tests that take more time are expected to have a greater value here, as we can see in the graphic above. Even so, the percentages used by each library are quite distinct.
Pyshark, being coherent with what we saw in the last post, utilizes much more CPU than the other libraries. The Python script runs for a total of 50 seconds and the first test has 0.59 seconds of duration, representing ~1,2% of the total time, while the usage of CPU time by Pyshark was of 28.78% on average, that is, this library had a much higher CPU time consumption than expected considering the duration of the test. In the tests with 4000 packets, we can see that the library stays in the same level. This observation is interesting to be seen together with the number of packets counted:
Comparison of the number of packets counted, standard deviation is also indicated. Click on the image to expand it.
We can see that in the 4000 packets scenarios Pyshark counted fewer packets than the other libraries in most cases, even with high CPU consumption. However, different from the plateau observed in the last graphic, here it is clear that this library is being affected by the size of the packets: the CPU usage is the same over all the packet lengths considered, but the number counted decreases as there is an increase in the packet size. Even that Pyshark couldn’t count the totality of data in the network traffic over the three different packet lengths in the 4000 packets scenario, clearly it struggled more in the ones with bigger packets. On the other hand, in the 1000 packets scenario, the library was able to count all the packets.
Scapy, in contrast, presents a strange behavior: even in the 1000 packets scenario it wasn’t able to count all the packets, and note that the standard deviation is not very significant. My supposition is that this library has a limitation in the number of packets it can read per unity of time (obviously none of the libraries can read unlimited packets per second, however we ended up bumping into this limit here). In the tests of the last post we observed that the limit reached by this library in the number of packets counted was close to 3500 during the 10 seconds of flow, so it is interesting to see here that this library was able to count only 339 packets in the test with 0.59 seconds (1000 packets of 148 bytes) and 994 packets in the one with 2.89 seconds (1000 packets of 724 bytes). The numbers aren’t exactly proportional, but are close: assuming that this library can read 350 packets per second, in 0.59 and 2.89 seconds we would expect 206.5 and 1011.5 packets respectively. The bit rate being fixed leads to differences in the number of packets per second, which might influence here. Even more interesting to see that in the tests that lasted more than 10 seconds (4000 packets of 724 and 1448 bytes), Scapy was able to read more than the usual 3500 packets, counting 3958 packets on average in the one with 11.58 seconds (the standard deviation is more expressive here, during the 5 repetitions the number of packets varied between 3775 and 4006), in which we would expect 4053 packets considering 350 packets per second.
The CPU usage by Scapy also seems to reflect the behavior described in the last paragraph: if we compare the consumption between the tests with 1000 packets of 724 bytes and the one with 1448 bytes, we can see that the difference is small (6.20% and 6.64%, respectively), as is the difference in the counted number, but the duration of the tests is much more significant (2.89 and 5.79 seconds). The use of CPU in the scenario with 4000 packets of 128 bytes, that lasted 2.37 seconds, also reinforces it: 5.60%, but the number jumps in the other 4000 packets tests that have a longer duration. Therefore, it seems that the most influential factor here isn’t the packet length, but the number of packets per second.
The memory usage of Scapy is also coherent with what we saw in the last posts:
Comparison of the memory usage, with standard deviation indicated. Click on the image to expand it.
In truth, the three libraries were coherent with what we saw in the memory usage. What isn’t coherent is the variations between the tests: neither of the libraries seems to respect the increase of packet length in their reports of memory consumption. Pyshark and Scapy had variations that aren’t proportional to the difference of length, and the duo Pypcap + dpkt was constant over all the tests. This is perhaps an indication that the structures used aren’t dynamic in the packet size.
Finally, Pypcap together with dpkt presented good results, with low CPU consumption and counting all the packets that were being generated by the iperf3 application. Note that the most stressful test (4000 packets of 1448 packets) had flow with duration of 23.16 seconds, which is approximately 46% of the total 50 seconds, but this duo of libraries presented only 2.36% on CPU consumption. Good metrics that make hard to tell if the packet length had an influence on its performance.
Tests with defined bit rate and length of packets
For comparison and record purposes, I also performed another battery of tests before the one reported above, in which I used a different configuration of parameters: different from the last one, I decided to fix the duration of the test and vary the packet length and the bit rate. So, the procedure was the same used in post 07, with the packet transmission lasting 10 seconds and then with a interval of 32 seconds before the Bash script ends. The parameters considered were:
- 1 Mb/sec of bit rate, packets with 48, 328, 608, 888 and 1168 bytes of length;
- 5 Mb/sec of bit rate, packets with 48, 328, 608, 888 and 1168 bytes of length.
I also included below the metrics collected in the execution of 1 Mb/sec and 5 Mb/sec tests in post 07, in which the packet length wasn’t manually defined. However, over all the runs, the default value used was 1448 bytes.
I chose to use the bit rates of 1 Mb/sec and 5 Mb/sec considering that they represent different scenarios well: in the first one, all the libraries presented a good performance without struggles (as seen in post 08); and in the second one Pyshark and Scapy had difficulty handling all the traffic, but it’s still an intermediary value for observation. The length variation between test cases is 280 bytes, from 48 to 1448 bytes.
In my understanding, the numbers observed below reaffirm the analysis of the last battery of tests. The most interesting metrics to be seen are, again, the CPU usage and the number of packets counted, with the memory usage being almost constant, as can be seen in the graphics. After them, I make some commentaries about each library separately, comparing the results with the ones from the last battery of tests.
Comparing the mean of CPU usage over the 5 repetitions. Click on the image to expand it.
Comparing the mean of the number of packets counted over the 5 repetitions. Click on the image to expand it.
Comparing the peak of memory in use over the 5 repetitions. Click on the image to expand it.
Pyshark
Test | Use of CPU (mean) | Peak of memory use (max) | N. of packets counted (mean) | iperf3 flow data |
---|---|---|---|---|
1 Mb/s and 48 bytes/packet | 78.54% | 27512 KB | 3042.6 | 1.00 Mbits/sec, 1.19 MBytes in 26040 packets |
1 Mb/s and 328 bytes/packet | 79.82% | 27648 KB | 2654.8 | 1.00 Mbits/sec, 1.19 MBytes in 3811 packets |
1 Mb/s and 608 bytes/packet | 71.44% | 27904 KB | 2090.2 | 1.00 Mbits/sec, 1.19 MBytes in 2056 packets |
1 Mb/s and 888 bytes/packet | 55.50% | 27768 KB | 1440.2 | 1.00 Mbits/sec, 1.19 MBytes in 1408 packets |
1 Mb/s and 1168 bytes/packet | 47.06% | 27896 KB | 1119.8 | 1.00 Mbits/sec, 1.19 MBytes in 1071 packets |
1 Mb/s and 1448 bytes/packet | 41.20% | 27904 KB | 898.0 | 1.00 Mbits/sec, 1.19 MBytes in 864 packets |
5 Mb/s and 48 bytes/packet | 78.34% | 27508 KB | 2879.8 | 5.00 Mbits/sec, 5.96 MBytes in 130196 packets |
5 Mb/s and 328 bytes/packet | 79.72% | 27648 KB | 2633.6 | 5.00 Mbits/sec, 5.96 MBytes in 19054 packets |
5 Mb/s and 608 bytes/packet | 80.10% | 27904 KB | 2344.4 | 5.00 Mbits/sec, 5.96 MBytes in 10279 packets |
5 Mb/s and 888 bytes/packet | 79.60% | 27764 KB | 2054.2 | 5.00 Mbits/sec, 5.96 MBytes in 7038 packets |
5 Mb/s and 1168 bytes/packet | 79.52% | 27892 KB | 1882.8 | 5.00 Mbits/sec, 5.96 MBytes in 5351 packets |
5 Mb/s and 1448 bytes/packet | 78.88% | 27904 KB | 1717.4 | 5.00 Mbits/sec, 5.96 MBytes in 4316 packets |
In the 5Mb/sec scenarios, Pyshark demonstrated variations in the number of packets counted, even with the CPU usage being almost constant over the tests. As there is an increase in the size of the packets, the library reported a smaller number, another indication that Pyshark is affected by the length of the traffic in the network. Considering that Pyshark is a wrapper for tshark and that it uses Wireshark dissectors, this is a natural behavior, as the size of the packet is possibly an influential factor in the dissection process.
Even considering this observation, the length seems to have a limited influence, as the variations aren’t proportionals: still in the 5 Mb/sec scenario, varying from 328 to 48 bytes we increased the number of packets from 19054 to 130196, an increase of approximately 583.3% and a reduction of ~85.4% in the size, but Pyshark was able to count 2633.6 and 2879.8 packets, respectively, an increase of ~9.3%. This behavior is not exclusive to this case in which the number of packets had a huge increase: varying from 1448 to 1168 bytes we increased the number of packets from 4316 to 5351, an increase of ~23.9% and a reduction of ~19.3% in the size, however the increase in the number counted by Pyshark was again of only ~9.6%.
In the 1 Mb/sec scenario, Pyshark can handle with all the packets up to a certain traffic volume limit, with high CPU usage and memory variations that aren’t exactly proportional to the increase of length in the packets.
Scapy
Test | Use of CPU (mean) | Peak of memory use (max) | N. of packets counted (mean) | iperf3 flow data |
---|---|---|---|---|
1 Mb/s and 48 bytes/packet | 20.92% | 67672 KB | 3419.2 | 1.00 Mbits/sec, 1.19 MBytes in 26040 packets |
1 Mb/s and 328 bytes/packet | 20.60% | 67608 KB | 3496.0 | 1.00 Mbits/sec, 1.19 MBytes in 3811 packets |
1 Mb/s and 608 bytes/packet | 13.12% | 67608 KB | 2092.0 | 1.00 Mbits/sec, 1.19 MBytes in 2056 packets |
1 Mb/s and 888 bytes/packet | 9.34% | 67664 KB | 1441.6 | 1.00 Mbits/sec, 1.19 MBytes in 1408 packets |
1 Mb/s and 1168 bytes/packet | 7.64% | 67608 KB | 1105.4 | 1.00 Mbits/sec, 1.19 MBytes in 1071 packets |
1 Mb/s and 1448 bytes/packet | 7.02% | 67736 KB | 899.4 | 1.00 Mbits/sec, 1.19 MBytes in 864 packets |
5 Mb/s and 48 bytes/packet | 20.80% | 67608 KB | 2983.6 | 5.00 Mbits/sec, 5.96 MBytes in 130205 packets |
5 Mb/s and 328 bytes/packet | 20.64% | 67612 KB | 3359.8 | 5.00 Mbits/sec, 5.96 MBytes in 19054 packets |
5 Mb/s and 608 bytes/packet | 20.60% | 67736 KB | 3405.4 | 5.00 Mbits/sec, 5.96 MBytes in 10279 packets |
5 Mb/s and 888 bytes/packet | 20.40% | 67864 KB | 3391.4 | 5.00 Mbits/sec, 5.96 MBytes in 7038 packets |
5 Mb/s and 1168 bytes/packet | 20.42% | 67864 KB | 3351.8 | 5.00 Mbits/sec, 5.96 MBytes in 5351 packets |
5 Mb/s and 1448 bytes/packet | 20.40% | 67740 KB | 3434.6 | 5.00 Mbits/sec, 5.96 MBytes in 4316 packets |
As the number of packets increases in the 1Mb/sec scenario, the use of CPU by Scapy is also increased. It wasn’t able again to count much more than 3400 packets in each run, and, more than that, the library almost read a constant number of packets over all the tests in the 5Mb/sec scenario. Considering the defined interval of 10 seconds for each flow, this is an indication of the library limitation in handling big volumes of packets per unit of time.
Interesting to observe that in the 5 Mb/sec with 48 bytes/packet the number of packets counted was reduced to only 2983.6 (with standard deviation of approximately 134.28 packets). It isn’t clear for me what caused this reduction, maybe with the high volume of packets the library wasn’t even able to reach its plateau. I think this will became more clear in the next post, when I report the tests with big volumes of traffic simulating a DDoS attack.
Pyshark + dpkt
Test | Use of CPU (mean) | Peak of memory use (max) | N. of packets counted (mean) | iperf3 flow data |
---|---|---|---|---|
1 Mb/s and 48 bytes/packet | 6.30% | 21204 KB | 26073.6 | 1.00 Mbits/sec, 1.19 MBytes in 26040 packets |
1 Mb/s and 328 bytes/packet | 1.92% | 21204 KB | 3842.4 | 1.00 Mbits/sec, 1.19 MBytes in 3811 packets |
1 Mb/s and 608 bytes/packet | 1.10% | 21204 KB | 2089.2 | 1.00 Mbits/sec, 1.19 MBytes in 2056 packets |
1 Mb/s and 888 bytes/packet | 0.82% | 21204 KB | 1442.2 | 1.00 Mbits/sec, 1.19 MBytes in 1408 packets |
1 Mb/s and 1168 bytes/packet | 0.66% | 21204 KB | 1104.0 | 1.00 Mbits/sec, 1.19 MBytes in 1071 packets |
1 Mb/s and 1448 bytes/packet | 0.52% | 21208 KB | 895.6 | 1.00 Mbits/sec, 1.19 MBytes in 864 packets |
5 Mb/s and 48 bytes/packet | 19.38% | 21204 KB | 92247.2 | 5.00 Mbits/sec, 5.96 MBytes in 130202 packets |
5 Mb/s and 328 bytes/packet | 5.48% | 21204 KB | 19087.0 | 5.00 Mbits/sec, 5.96 MBytes in 19054 packets |
5 Mb/s and 608 bytes/packet | 4.44% | 21200 KB | 10312.4 | 5.00 Mbits/sec, 5.96 MBytes in 10279 packets |
5 Mb/s and 888 bytes/packet | 3.26% | 21204 KB | 7072.0 | 5.00 Mbits/sec, 5.96 MBytes in 7038 packets |
5 Mb/s and 1168 bytes/packet | 2.78% | 21204 KB | 5387.6 | 5.00 Mbits/sec, 5.96 MBytes in 5351 packets |
5 Mb/s and 1448 bytes/packet | 2.26% | 21208 KB | 4349.2 | 5.00 Mbits/sec, 5.96 MBytes in 4316 packets |
These joint of libraries again performed well, only failing in the 5 Mb/sec and 48 bytes/packet case, in which it counted only ~70% of the packets (remember that there are more packets in the network than the reference metrics, which refers only to the flow generated by iperf3). Its use of memory was almost constant and the CPU usage presented is also in agreement with the variation of volume in the network, leading us to think that the most influential factor for its performance is the volume of packets in the network.
Next steps
In the next post, I will report the results of experiments simulating an DDoS attack against the Raspberry Pi, concluding for now these tests in the packet parsing context, as we defined in post 05:
- Packet Parsing
- Conduct experiments to verify the performance of Python libraries sniffing the network and processing the packages in a Raspberry Pi (different from what we did in post 4, in which we tested the lib’s with the CIC-IoT-2023 dataset and in a regular computer).
- Inference
- Run the baseline models proposed in post 02 in a Raspberry Pi to verify its performance.
- Training (centralized IDS)
- TBD.
This post was made as a record of the progress of the research project “DDoS Detection in the Internet of Things using Machine Learning Methods with Retraining”, supervised by professor Daniel Batista, BCC - IME - USP. Project supported by the São Paulo Research Foundation (FAPESP), process nº 2024/10240-3. The opinions, hypothesis and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of FAPESP.