Session G-4

Traffic Shaping and Inspection

May 18 Thu, 8:30 AM — 10:00 AM EDT
Babbio 221

Harry: A Scalable SIMD-based Multi-literal Pattern Matching Engine for Deep Packet Inspection

Hao Xu (Fudan University, China); Harry Chang, Wenjun Zhu, Yang Hong, Geoff Langdale and Kun Qiu (Intel, China); Jin Zhao (Fudan University, China)

Deep Packet Inspection (DPI) is an important network security technique that examines traffic by matching it against specific rules. Since every byte of every packet must be checked against many literal rules, multi-literal matching becomes the performance bottleneck of DPI. FDR, the fastest multi-literal matching engine on CPUs, takes advantage of Single-Instruction-Multiple-Data (SIMD) to alleviate this bottleneck and achieves a performance boost over the widely used Aho-Corasick (AC) algorithm. However, FDR does not fully exploit the data-level parallelism of SIMD, and its SIMD vector utilization is only 50%. In addition, because it depends on certain SIMD shift instructions, it cannot benefit from advanced SIMD instruction sets. To overcome these issues, we propose Harry, a scalable SIMD-based multi-literal matching engine. Harry adopts a column-vector-based matching algorithm to improve data-level parallelism and SIMD vector utilization. To support the algorithm, it uses two encoding methods to compress the mask table, and it uses shuffle instructions to implement shifts. We implement Harry on commodity CPUs and evaluate it with real network traffic and DPI rules. Our evaluation shows that Harry reaches a throughput of 30 to 70 Gbit/s, up to 52x that of AC and 2.09x that of FDR. It has been successfully deployed in Hyperscan.
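To make the multi-literal matching problem concrete, the following is a minimal Python sketch of the Aho-Corasick baseline that the abstract compares against. It is not Harry's or FDR's SIMD algorithm; the function names `build_automaton` and `scan` and the sample literals are illustrative assumptions, not taken from the paper.

```python
from collections import deque

def build_automaton(literals):
    """Build Aho-Corasick goto/fail/output tables for a list of byte strings."""
    goto = [{}]   # goto[state][byte] -> next state
    fail = [0]    # failure link for each state
    out = [[]]    # literals that end at each state
    for lit in literals:
        state = 0
        for b in lit:
            if b not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][b] = len(goto) - 1
            state = goto[state][b]
        out[state].append(lit)
    # Breadth-first pass: depth-1 states keep fail = 0, deeper states
    # derive their failure link from their parent's failure link.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for b, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and b not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(b, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out

def scan(payload, automaton):
    """Return (start_offset, literal) for every literal occurrence in payload."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for i, b in enumerate(payload):
        while state and b not in goto[state]:
            state = fail[state]
        state = goto[state].get(b, 0)
        for lit in out[state]:
            hits.append((i - len(lit) + 1, lit))
    return hits

if __name__ == "__main__":
    automaton = build_automaton([b"he", b"she", b"hers"])
    print(scan(b"ushers", automaton))
```

The scan follows one state transition per input byte, and each transition depends on the previous one. SIMD engines such as the ones described in the abstract instead process many bytes per instruction.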
Speaker Hao Xu (Fudan University)

Hao Xu is currently a second-year Master's student in the School of Computer Science Technology, Fudan University, advised by Prof. Jin Zhao. He received his B.Eng. from Northeastern University in 2019. His research interests lie in computer networking and systems.

COIN: Cost-Efficient Traffic Engineering with Various Pricing Schemes in Clouds

Gongming Zhao, Jingzhou Wang and Hongli Xu (University of Science and Technology of China, China); Zhuolong Yu (Johns Hopkins University, USA); Chunming Qiao (University at Buffalo, USA)

The rapid growth of cloud services has brought a significant increase in inter-datacenter traffic. To transfer data among geographically distributed datacenters, cloud providers need to purchase bandwidth from ISPs. The data transfer cost has become one of the major expenses for cloud providers. Therefore, it is essential for a cloud provider to carefully allocate inter-datacenter traffic among the ISPs' links to minimize costs. Existing solutions mainly focus on situations where all links adopt the same pricing scheme. However, in practice, ISPs usually provide multiple pricing schemes for their links due to market competition, which makes the existing solutions non-optimal. Thus, a new traffic engineering approach that considers various pricing schemes is needed. This paper presents COIN, a new framework for cost-efficient traffic engineering with various pricing schemes. We propose a partition rounding traffic engineering algorithm based on linear independence analysis. The approximation factors and time complexity are formally analyzed. We further conduct large-scale simulations with real-world topologies and datasets. Extensive simulation results show that COIN can reduce the data transfer cost by up to 54.54% compared with state-of-the-art solutions.
Speaker Jingzhou Wang (University of Science and Technology of China)

Jingzhou Wang is pursuing his Ph.D. degree in Computer Science at USTC. His research interests include cloud networks, quality of service, and quantum networks. He has published 5 papers in top-ranked conferences and journals, including INFOCOM, ICDCS, and ToN.

DeeP4R: Deep Packet Inspection in P4 using Packet Recirculation

Sahil Gupta (Rochester Institute of Technology, USA); Devashish Gosain (Max Planck Institute for Informatics, Germany); Minseok Kwon and Hrishikesh B Acharya (Rochester Institute of Technology, USA)

Software-defined networks are useful for multiple tasks, including firewalling, telemetry, and flow analysis. In particular, the P4 language makes it possible to carry out some simple packet processing tasks in the data plane, i.e., on the switch itself (without real-time support from the SDN controller or a server). However, owing to the limitations of packet parsing in P4, these tasks involve only the packet headers. In this paper, we present a novel approach that allows Deep Packet Inspection (DPI) - i.e., inspection of the packet payload - in the data plane, using P4 alone. We make use of the fact that in P4, a switch can clone and recirculate packets. One copy (clone) can be recirculated, slicing off a byte in each round, and using a finite-state machine to check if a target string has yet been seen. If the target string is found, the other copy (original packet) is discarded; if not, it is passed through. Our approach allows us to build the first application-layer firewall (URL filter) in the data plane, and to achieve essentially line-rate performance while filtering thousands of URLs, on a commodity programmable switch. It may in future also be used for other DPI tasks.
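The recirculation idea can be illustrated with a small Python simulation. This is only a sketch under stated assumptions, not P4 code and not the authors' implementation: it assumes a single non-empty target string, a KMP-style deterministic finite-state machine, and hypothetical names (`build_dfa`, `filter_packet`). Each loop iteration stands in for one recirculation round that slices one byte off the clone.

```python
def build_dfa(target):
    """KMP-style DFA over bytes: dfa[state][byte] -> next state.

    Missing entries mean "go back to state 0". Reaching state
    len(target) means the target string has been seen.
    Assumes target is non-empty.
    """
    m = len(target)
    dfa = [dict() for _ in range(m)]
    dfa[0][target[0]] = 1
    restart = 0  # state the machine would be in after a mismatch
    for state in range(1, m):
        dfa[state] = dict(dfa[restart])        # copy mismatch transitions
        dfa[state][target[state]] = state + 1  # match transition
        restart = dfa[restart].get(target[state], 0)
    return dfa

def filter_packet(payload, target):
    """Simulate the clone-and-recirculate check on one packet.

    Returns (decision, rounds): decision is "drop" if the target occurs
    in the payload and "forward" otherwise; rounds counts the simulated
    recirculations of the clone.
    """
    dfa = build_dfa(target)
    clone = payload  # recirculated copy; the original waits for the verdict
    state = 0
    rounds = 0
    while clone:
        byte, clone = clone[0], clone[1:]  # slice off one byte per round
        state = dfa[state].get(byte, 0)
        rounds += 1
        if state == len(target):
            return "drop", rounds
    return "forward", rounds

if __name__ == "__main__":
    print(filter_packet(b"GET http://bad.example/x", b"bad.example"))
    print(filter_packet(b"GET http://good.example/x", b"bad.example"))
```

In this simulation the number of rounds grows with the payload length up to the first match, since one byte is consumed per round.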
Speaker Sahil Gupta (Rochester Institute of Technology)

The author is a Ph.D. student at the Rochester Institute of Technology in the computer science department. The author is interested in Networks, Systems, and Network Security as research areas.

Burst can be Harmless: Achieving Line-rate Software Traffic Shaping by Inter-flow Batching

Danfeng Shan, Shihao Hu and Yuqi Liu (Xi'an Jiaotong University, China); Wanchun Jiang (Central South University, China); Hao Li, Peng Zhang, Yazhe Tang and Huanzhao Wang (Xi'an Jiaotong University, China); Fengyuan Ren (Tsinghua University, China)

Traffic shaping is a common function at end hosts. Compared with hardware shapers, software shapers are easier to develop and deploy, and are thus very attractive. Nevertheless, software approaches are still unsatisfactory, as they struggle to saturate links of 40 Gbps and higher. While much effort has been made to reduce the intrinsic overhead of software traffic shaping, we find that it is the extrinsic overhead, such as PCIe communications and interrupts, that prevents shaping from reaching 40 Gbps to 100 Gbps. Batching is an effective way to amortize these overheads. However, blind batching can degrade network performance, as it introduces bursts into the network. Examining this dilemma, we find that intra-flow bursts are to blame for harming network performance, while inter-flow bursts, which consist of packets from different flows, can be naturally demultiplexed in the network. Based on this insight, we present FlowBundler, which achieves efficient traffic shaping by inter-flow batching. Testbed experiments show that FlowBundler can achieve accurate shaping at 98 Gbps with a single CPU core, which is 2.6× better than state-of-the-art approaches. Large-scale simulations show that FlowBundler can batch packet transmissions without harming network performance.
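The difference between intra-flow and inter-flow batching can be sketched in a few lines of Python. This is an illustrative toy, not FlowBundler's design: the function names, the per-flow `deque` queues, and the "at most one packet per flow per batch" policy are assumptions made for the example, and a real shaper would also check each flow's rate limit before releasing a packet.

```python
from collections import deque

def inter_flow_batch(queues, batch_size):
    """Build a batch with at most one packet from each flow queue."""
    batch = []
    for flow_id, q in queues.items():
        if len(batch) == batch_size:
            break
        if q:
            batch.append((flow_id, q.popleft()))
    return batch

def intra_flow_batch(queues, batch_size):
    """Contrast case: fill the batch from the first backlogged flow only."""
    for flow_id, q in queues.items():
        if q:
            n = min(batch_size, len(q))
            return [(flow_id, q.popleft()) for _ in range(n)]
    return []

def make_queues():
    """Hypothetical backlog: three flows with a few queued packets each."""
    return {
        "A": deque(["a1", "a2", "a3"]),
        "B": deque(["b1", "b2"]),
        "C": deque(["c1"]),
    }

if __name__ == "__main__":
    print(inter_flow_batch(make_queues(), 3))  # one packet each from A, B, C
    print(intra_flow_batch(make_queues(), 3))  # three back-to-back packets of A
```

Both batches hand three packets to the NIC in one operation, so the per-batch overhead is amortized equally. Only the intra-flow batch places several packets of the same flow back to back on the wire.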
Speaker Danfeng Shan (Xi'an Jiaotong University)

Danfeng Shan is an associate professor at the School of Computer Science and Technology, Xi'an Jiaotong University. He received his Ph.D. degree from the Department of Computer Science and Technology, Tsinghua University, in 2018, and his B.E. degree from the Department of Computer Science and Technology, Xi'an Jiaotong University, in 2013. His research interests include traffic management, data center networking, and congestion control.

Session Chair

Marco Fiore
