Session E-1

Packets and Flows

2:00 PM — 3:30 PM EDT
May 3 Tue, 2:00 PM — 3:30 PM EDT

FlowShark: Sampling for High Flow Visibility in SDNs

Sogand Sadrhaghighi (University of Calgary, Canada); Mahdi Dolati (University of Tehran, Iran); Majid Ghaderi (University of Calgary, Canada); Ahmad Khonsari (University of Tehran, Iran)

As the scale and speed of modern networks continue to increase, traffic sampling has become an indispensable tool in network management. While there exist a plethora of sampling solutions, they either provide limited flow visibility or have poor scalability in large networks. This paper presents the design and evaluation of FlowShark, a high-visibility per-flow sampling system for Software-Defined Networks (SDNs). The key idea in FlowShark is to separate sampling decisions on short and long flows, whereby sampling short flows is managed locally on edge switches, while a central controller optimizes sampling decisions on long flows. To this end, we formulate flow sampling as an optimization problem and design an online algorithm with a bounded competitive ratio to solve the problem efficiently. To show the feasibility of our design, we have implemented FlowShark in a small OpenFlow network using Mininet. We present experimental results of our Mininet implementation as well as performance benchmarks obtained from packet-level simulations in larger networks. Our experiments with a machine learning based Traffic Classifier application show up to 27% and 19% higher classification recall and precision, respectively, with FlowShark compared to existing sampling approaches.

Joint Resource Management and Flow Scheduling for SFC Deployment in Hybrid Edge-and-Cloud Network

Yingling Mao, Xiaojun Shang and Yuanyuan Yang (Stony Brook University, USA)

Network Function Virtualization (NFV) migrates network functions (NF) from proprietary hardware to commercial servers on the edge or cloud, making network services more cost-efficient, manage-convenient, and flexible. To facilitate these advantages, it is critical to find an optimal deployment of the chained virtual NFs, i.e. service function chains (SFCs), in hybrid edge-and-cloud environment, considering both resource and latency. It is an NP-hard problem. In this paper, we first limit the problem at the edge and design a constant approximation algorithm named chained next fit (CNF), where a sub-algorithm called double spanning tree (DST) is designed to deal with virtual network embedding. Then we take both cloud and edge resources into consideration and create a promotional algorithm called decreasing sorted, chained next fit (DCNF), which also has a provable constant approximation ratio. The simulation results demonstrate that the ratio between DCNF and the optimal solution is much smaller than the theoretical bound, approaching an average of 1.25. Moreover, DCNF always has a better performance than the benchmarks, which implies that it is a good candidate for joint resource and latency optimization in hybrid edge-and-cloud networks.

NFlow and MVT Abstractions for NFV Scaling

Ziyan Wu and Yang Zhang (University of Minnesota, USA); Wendi Feng (Beijing Information Science and Technology University, China); Zhi-Li Zhang (University of Minnesota, USA)

The ability to dynamically scale out/in network functions (NFs) on multiple cores/servers to meet traffic demands is a key benefit of network function virtualization (NFV). The stateful NF operations make NFV scaling a challenging task: if care is not taken, NFV scaling can lead to incorrect operations and poor performance. We advocate two general abstractions, NFlow and Match-Value Table (MVT), for NFV packet processing pipelines. We present formal definitions of these concepts and discuss how they can facilitate NFV scaling by minimizing or eliminating shared states. Using NFs implemented with the proposed abstractions, we conduct extensive experiments and demonstrate their efficacy in terms of correctness and performance of NFV scaling.

The Information Velocity of Packet-Erasure Links

Elad Domanovitz (Tel Aviv University, Israel); Tal Philosof (Samsung, Israel); Anatoly Khina (Tel Aviv University, Israel)

We consider the problem of in-order packet transmission over a cascade of packet-erasure links with acknowledgment (ACK) signals, interconnected by relays. We treat first the case of transmitting a single packet, in which ACKs are unnecessary, over links with independent identically distributed erasures. For this case, we derive tight upper and lower bounds on the probability of arrive failure within an allowed end-to-end communication delay over a given number of links. When the number of links is commensurate with the allowed delay, we determine the maximal ratio between the two---coined information velocity---for which the arrive-failure probability decays to zero; we further derive bounds on the arrive-failure probability when the ratio is below the information velocity, determine the exponential arrive-failure decay rate, and extend the treatment to links with different erasure probabilities. We then elevate all these results for a stream of packets with independent geometrically distributed interarrival times, and prove that the information velocity and the exponential decay rate remain the same for any stationary ergodic arrival process and for deterministic interarrival times. We demonstrate the significance of the derived fundamental limits---the information velocity and the arrive-failure exponential decay rate---by comparing them to simulation results.

Session Chair

Ruidong Li (Kanazawa University)

Session E-2


4:00 PM — 5:30 PM EDT
May 3 Tue, 4:00 PM — 5:30 PM EDT

Mag-E4E: Trade Efficiency for Energy in Magnetic MIMO Wireless Power Transfer System

Xiang Cui, Hao Zhou, Jialin Deng and Wangqiu Zhou (University of Science and Technology of China, China); Xing Guo (Anhui University, China); Yu Gu (Hefei University of Technology, China)

Magnetic resonant coupling (MRC) wireless power transfer (WPT) is a convenient and potential power supply solution for smart devices. The scheduling problem in the multiple-input multiple-output (MIMO) scenarios is essential to concentrate energy at the receiver (RX) side. Meanwhile, strong TX-RX coupling could ensure better power transfer efficiency (PTE), but may cause lower power delivered to load (PDL) when transmitter voltages are bounded. In this paper, we propose the frequency adjustment based PDL maximization scheme for MIMO MRC-WPT systems. We formulate such joint optimization problem and decouple it into two sub-problems, i.e., high-level frequency adjustment and low-level voltage adaptation. We solve these two subproblems with gradient descent based and alternating direction method of multipliers (ADMM) based algorithms, respectively. We further design an energy-voltage transform matrix algebra based estimation mechanism to reduce context measurement overhead. We prototype the proposed system, and conduct extensive experiments to evaluate its performance. As compared with the PTE maximization solutions, our system trades smaller efficiency for larger energy, i.e., 361% PDL improvement with respect to 26% PTE losses when TX-RX distance is 10cm.

Minimal Total Deviation in TCAM Load Balancing

Yaniv Sadeh (Tel Aviv University, Israel); Ori Rottenstreich (Technion - Israel Institute of Technology, Israel); Haim Kaplan (Tel-Aviv University, Israel)

Traffic splitting is a required functionality in networks, for example for load balancing over multiple paths or among different servers. The capacities of the servers determine the partition by which traffic should be split. A recent approach implements traffic splitting within the ternary content addressable memory (TCAM), which is often available in switches. It is important to reduce the amount of memory allocated for this task since TCAMs are power consuming and are also required for other tasks such as classification and routing. Previous work showed how to compute the smallest prefix-matching TCAM necessary to implement a given partition exactly. In this paper we solve the more practical case, where at most n prefix-matching TCAM rules are available, restricting the ability to implement exactly the desired partition.
We consider the L1 distance between partitions, which is of interest when overloaded requests are simply dropped, and we want to minimize the total loss. We prove that the Niagara algorithm can be used to find the closest partition in L1 to the desired partition, that can be realized with n TCAM rules. Moreover, we prove it for arbitrary partitions, with (possibly) non-integer parts.

Performance and Scaling of Parallel Systems with Blocking Start and/or Departure Barriers

Brenton Walker (Leibniz Universität Hannover, Germany); Stefan Bora (Universität Hannover, Germany); Markus Fidler (Leibniz Universität Hannover, Germany)

Parallel systems divide jobs into smaller tasks that can be serviced by many workers at the same time. Some parallel systems have blocking barriers that require all of their tasks to start and/or depart in unison. This is true of many parallelized machine learning workloads, and the popular Spark processing engine has recently added support for Barrier Execution Mode, which allows users to add such barriers to their jobs. The drawback of these barriers is reduced performance and stability compared to equivalent non-blocking systems.

We derive analytical expressions for the stability regions for parallel systems with blocking start and/or departure barriers. We extend results from queuing theory to derive waiting and sojourn time bounds for systems with blocking start barriers. Our results show that for a given system utilization and number of servers, there is an optimal degree of parallelism that balances waiting time and job execution time. This observation leads us to propose and implement a class of self-adaptive schedulers, we call "Take-Half", that modulate the allowed degree of parallelism based on the instantaneous system load, improving mean performance and eliminating stability issues.

Short-Term Memory Sampling for Spread Measurement in High-Speed Networks

Yang Du, He Huang and Yu-e Sun (Soochow University, China); Shigang Chen (University of Florida, USA); Guoju Gao, Xiaoyu Wang and Shenghui Xu (Soochow University, China)

Per-flow spread measurement in high-speed networks can provide indispensable information to many practical applications. However, it is challenging to measure millions of flows at line speed because on-chip memory modules cannot simultaneously provide large capacity and large bandwidth. The prior studies address this mismatch by entirely using on-chip compact data structures or utilizing off-chip space to assist limited on-chip memory. Nevertheless, their on-chip data structures record massive transient elements, each of which only appears in a short time interval in a long-period measurement task, and thus waste significant on-chip space. This paper presents short-term memory sampling, a novel spread estimator that samples new elements while only holding elements for short periods. Our estimator can work with tiny on-chip space and provide accurate estimations for online queries. The key of our design is a short-term memory duplicate filter that reports new elements and filters duplicates effectively while allowing incoming elements to override the stale elements to reduce on-chip memory usage. We implement our approach on a NetFPGA-equipped prototype. Experimental results based on real Internet traces show that, compared to the state-of-the-art, short-term memory sampling reduces up to 99% of on-chip memory usage when providing the same probabilistic assurance on spread-estimation error.

Session Chair

Markus Fidler (Leibniz Universität Hannover)

Made with in Toronto · Privacy Policy · INFOCOM 2020 · INFOCOM 2021 · © 2022 Duetone Corp.