Session A-7

Video Streaming

Conference
10:00 AM — 11:30 AM EDT
Local
May 13 Thu, 7:00 AM — 8:30 AM PDT

Towards Video Streaming Analysis and Sharing for Multi-Device Interaction with Lightweight DNNs

Yakun Huang, Hongru Zhao and Xiuquan Qiao (Beijing University of Posts and Telecommunications, China); Jian Tang (Syracuse University, USA); Ling Liu (Georgia Tech, USA)

Multi-device interaction has attracted growing interest in both the mobile communication industry and the mobile computing research community as mobile-device-enabled social media and social networking continue to blossom. However, due to the stringent low-latency requirements and the complexity and intensity of computation, efficient multi-device interaction for real-time video streaming analysis and sharing is still in its infancy. Unlike previous approaches that rely on high network bandwidth and the high availability of a GPU-equipped cloud center to support intensive computations for multi-device interaction and for improving the service experience, we propose MIRSA, a novel edge-centric multi-device interaction framework with a lightweight end-to-end DNN for on-device visual odometry (VO) streaming analysis that leverages edge computing optimizations, with three main contributions. First, we design MIRSA to migrate computations from the cloud to the device side, reducing the high overhead of transmitting large video streams while alleviating the load on cloud servers. Second, we design a lightweight VO network that utilizes a temporal shift module to support on-device pose estimation. Third, we provide an on-device resource-aware scheduling algorithm to optimize task allocation. Extensive experiments show that MIRSA provides real-time, high-quality pose estimation as an interactive service and outperforms baseline methods.

AMIS: Edge Computing Based Adaptive Mobile Video Streaming

Phil K Mu, Jinkai Zheng, Tom H. Luan and Lina Zhu (Xidian University, China); Zhou Su (Shanghai University, China); Mianxiong Dong (Muroran Institute of Technology, Japan)

This work proposes AMIS, an edge computing-based adaptive video streaming system. AMIS explores the power of edge computing in three aspects. First, with video contents pre-cached in the local buffer, AMIS is content-aware: it adapts the video playout strategy based on the scene features of video contents and the quality of experience (QoE) of users. Second, AMIS is channel-aware: it measures channel conditions in real time and estimates the wireless bandwidth. Third, by integrating the content features and channel estimation, AMIS applies deep reinforcement learning to optimize the playout strategy towards the best QoE. AMIS is thus an intelligent content- and channel-aware scheme that fully explores the intelligence of edge computing and adapts to general environments and QoE requirements. Using trace-driven simulations, we show that AMIS improves the average QoE by 14%-46% compared with state-of-the-art adaptive bitrate algorithms.

Robust 360° Video Streaming via Non-Linear Sampling

Mijanur R Palash, Voicu Popescu, Amit Sheoran and Sonia Fahmy (Purdue University, USA)

We propose CoRE, a 360° video streaming approach that reduces bandwidth requirements compared to transferring the entire 360° video. CoRE uses non-linear sampling in both the spatial and temporal domains to achieve robustness to view direction prediction error and to transient wireless network bandwidth fluctuation. Each CoRE frame samples the environment in all directions, with full resolution over the predicted field of view and gradually decreasing resolution at the periphery, so that missing pixels are avoided, irrespective of the view prediction error magnitude. A CoRE video chunk has a main part at full frame rate, and an extension part at a gradually decreasing frame rate, which avoids stalls while waiting for a delayed transfer. We evaluate a prototype implementation of CoRE through trace-based experiments and a user study, and find that, compared to tiling with low-resolution padding, CoRE reduces data transfer amounts, stalls, and H.264 decoding overhead, increases frame rates, and eliminates missing pixels.

Popularity-Aware 360-Degree Video Streaming

Xianda Chen, Tianxiang Tan and Guohong Cao (The Pennsylvania State University, USA)

Tile-based streaming techniques have been widely used to save bandwidth in 360° video streaming. However, it is a challenge to determine the right tile size, which directly affects the bandwidth usage. To address this problem, we propose to encode the video by considering viewing popularity, where the popularly viewed areas are encoded as macrotiles to save bandwidth. We propose techniques to identify and build macrotiles, and to adjust their sizes considering practical issues such as head movement randomness. In some cases, a user's viewing area may not be covered by the constructed macrotiles, and then the conventional tiling scheme is used. To support popularity-aware 360° video streaming, the client selects the right tiles (a macrotile or a set of conventional tiles) with the right quality level to maximize the QoE under a bandwidth constraint. We formulate this as an optimization problem, show that it is NP-hard, and propose a heuristic algorithm to solve it. Through extensive evaluations based on real traces, we demonstrate that the proposed algorithm can significantly improve the QoE and reduce bandwidth usage.
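
To illustrate the flavor of the client-side selection step, the sketch below greedily upgrades tile quality levels by marginal utility per unit of bandwidth until the budget is exhausted. The tile set, bitrates, and utilities are hypothetical, and the paper's own heuristic may differ; this is only a minimal Python illustration of QoE maximization under a bandwidth constraint.

```python
# Illustrative sketch (hypothetical utilities/sizes): greedy quality
# selection for a chosen tile set under a bandwidth budget, in the spirit
# of the paper's heuristic (the actual algorithm is specified there).
tiles = {  # tile -> list of (bitrate in Mbps, utility) per quality level
    "macrotile": [(2.0, 5.0), (4.0, 8.0), (8.0, 10.0)],
    "edge-tile": [(0.5, 1.0), (1.0, 1.6)],
}
budget = 6.0  # available bandwidth in Mbps

def greedy_select(tiles, budget):
    choice = {t: 0 for t in tiles}           # start at the lowest level
    spent = sum(tiles[t][0][0] for t in tiles)
    while True:
        # Pick the upgrade with the best utility gain per extra Mbps.
        best, best_ratio = None, 0.0
        for t, lvl in choice.items():
            if lvl + 1 < len(tiles[t]):
                d_rate = tiles[t][lvl + 1][0] - tiles[t][lvl][0]
                d_util = tiles[t][lvl + 1][1] - tiles[t][lvl][1]
                if spent + d_rate <= budget and d_util / d_rate > best_ratio:
                    best, best_ratio = t, d_util / d_rate
        if best is None:
            return choice, spent
        spent += tiles[best][choice[best] + 1][0] - tiles[best][choice[best]][0]
        choice[best] += 1

print(greedy_select(tiles, budget))  # -> ({'macrotile': 1, 'edge-tile': 1}, 5.0)
```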

Session Chair

Yusheng Ji (National Institute of Informatics, Japan)

Session B-7

Cloud

Conference
10:00 AM — 11:30 AM EDT
Local
May 13 Thu, 7:00 AM — 8:30 AM PDT

Robust Service Mapping in Multi-Tenant Clouds

Jingzhou Wang, Gongming Zhao and Hongli Xu (University of Science and Technology of China, China); He Huang (Soochow University, China); Luyao Luo (University of Science and Technology of China, China); Yongqiang Yang (Huawei Technologies Co., Ltd, China)

In a multi-tenant cloud, cloud vendors provide services (e.g., elastic load balancing, virtual private networks) on service nodes for tenants. Thus, the mapping between tenants' traffic and service nodes is an important issue in multi-tenant clouds. In practice, the unreliability of service nodes and the uncertainty/dynamics of tenants' traffic are two critical challenges that affect tenants' QoS. However, previous works often ignore the impact of these two challenges, leading to poor system robustness when accidents occur.

To bridge the gap, this paper studies the problem of robust service mapping in multi-tenant clouds (RSMP). Due to traffic dynamics, we take a two-step approach: service node assignment and tenant traffic scheduling. For service node assignment, we prove its NP-hardness and analyze its difficulty. We then propose an efficient algorithm with bounded approximation factors based on randomized rounding and knapsack. For tenant traffic scheduling, we design an approximation algorithm based on a fully polynomial-time approximation scheme (FPTAS). The proposed algorithm achieves an approximation factor of 2+ε, where ε is an arbitrarily small value. Both small-scale experimental results and large-scale simulation results show the superior performance of our proposed algorithms compared with other alternatives.

Scalable On-Switch Rate Limiters for the Cloud

Yongchao He and Wenfei Wu (Tsinghua University, China); Xuemin Wen and Haifeng Li (Huawei, China); Yongqiang Yang (Huawei Technologies Co., Ltd, China)

While most clouds use on-server rate limiters for bandwidth allocation, we propose to implement them on switches. On-switch rate limiters can simplify network management and improve the performance of control-plane rate-limiting applications. We leverage recent progress in programmable switches to implement on-switch rate limiters, named SwRL. In the design of SwRL, we make design choices according to the characteristics of the programmable hardware, deeply optimize the memory usage of the algorithm so as to fit cloud-scale (one million) rate limiters in a single switch, and complement the missing computation primitives of the hardware using a pre-computed approximate table. We further develop three control-plane applications and integrate them with SwRL, showing its control-plane interoperability. We prototype and evaluate SwRL in both testbed and production environments, demonstrating its precise rate control, scalability, interoperability, and manageability (execution environment isolation).
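
For readers unfamiliar with the underlying mechanism, the canonical rate-limiting algorithm is the token bucket. The following is a minimal host-side Python sketch of the idea only; SwRL's switch pipeline cannot use floating-point arithmetic or per-packet division, which is precisely why the paper resorts to a pre-computed approximate table.

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter (illustrative sketch only).

    SwRL implements rate limiting on a programmable switch, where
    floating-point math like the below is unavailable; the paper
    replaces such arithmetic with a pre-computed approximate table.
    """

    def __init__(self, rate_bps: float, burst_bytes: float):
        self.rate = rate_bps / 8.0   # refill rate in bytes/second
        self.burst = burst_bytes     # bucket depth in bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def allow(self, pkt_bytes: int) -> bool:
        now = time.monotonic()
        # Refill tokens for the elapsed interval, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= pkt_bytes:
            self.tokens -= pkt_bytes
            return True   # packet conforms to the configured rate
        return False      # packet exceeds the rate; drop or mark it

limiter = TokenBucket(rate_bps=10e6, burst_bytes=15000)  # 10 Mbps, ~10 MTUs
print(limiter.allow(1500))
```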

Monitoring Cloud Service Unreachability at Scale

Kapil Agrawal (Microsoft Research, India); Viral Mehta (Google, India); Sundararajan Renganathan (Stanford, USA); Sreangsu Acharyya (Microsoft Research, India); Venkata N. Padmanabhan (Microsoft Research, USA); Chakri Kotipalli (Microsoft, USA); Liting Zhao (Microsoft, China)

We consider the problem of network unreachability in a global-scale cloud-hosted service that caters to hundreds of millions of users. Even when the service itself is up, the "last mile" between where users are and the cloud is often the weak link that can render the service unreachable. We present NetDetector, a tool for detecting network unreachability based on measurements from a client-based HTTP-ping service. NetDetector employs two models. The first, GA (Gaussian Alerts), models the temporally averaged raw success rate of the HTTP-pings as a Gaussian distribution and flags significant dips below the mean as unreachability episodes. The second, more sophisticated approach (BB, or Beta-Binomial) models the health of network connectivity as the probability of an access request succeeding, estimates health from noisy samples, and alerts on dips in health below a client-network-specific SLO (service-level objective) derived from data. These algorithms are enhanced by a drill-down technique that identifies a more precise scope of the unreachability event. We present promising results from GA, which has been in deployment, and the experimental BB detector over a 4-month period. For instance, GA flags 49 country-level unreachability incidents, of which 42 were labelled true positives based on investigation by on-call engineers (OCEs).
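
As a rough illustration of the GA idea, a Gaussian dip detector over windowed success rates can be sketched in a few lines of Python. The threshold k and the sample data here are hypothetical; NetDetector derives its parameters from deployment data.

```python
import statistics

def gaussian_alerts(success_rates, k=3.0):
    """Flag windows whose HTTP-ping success rate dips k standard
    deviations below the historical mean (illustrative sketch of the
    GA idea; NetDetector's actual thresholds are derived from data).
    """
    mean = statistics.mean(success_rates)
    std = statistics.stdev(success_rates)
    return [i for i, r in enumerate(success_rates) if r < mean - k * std]

# Hourly success rates for one client network; the 0.62 dip gets flagged.
rates = [0.99, 0.98, 0.99, 0.97, 0.99, 0.62, 0.98]
print(gaussian_alerts(rates, k=2.0))  # -> [5]
```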

Near Optimal and Dynamic Mechanisms Towards a Stable NFV Market in Multi-Tier Cloud Networks

Zichuan Xu and Haozhe Ren (Dalian University of Technology, China); Weifa Liang (The Australian National University, Australia); Qiufen Xia (Dalian University of Technology, China); Wanlei Zhou (University of Technology Sydney, Australia); Guowei Wu (Dalian University of Technology, China); Pan Zhou (Huazhong University of Science and Technology, China)

With the fast development of next-generation networking techniques, the Network Function Virtualization (NFV) market is emerging as a major market that allows network service providers to trade various network services with consumers. Efficient mechanisms that guarantee stable and efficient operation of the NFV market are therefore urgently needed. One fundamental problem in the NFV market is how to maximize the social welfare of all players, so that they have incentives to participate in market activities. In this paper, we first formulate the social welfare maximization problem, with the aim of maximizing the total revenue of all players in the NFV market. For this problem, we design an efficient incentive-compatible mechanism and analyze the existence of a Nash equilibrium of the mechanism. We also consider an online social welfare maximization problem without knowledge of future request arrivals. We devise an online learning algorithm based on Multi-Armed Bandits (MAB) that allows both customers and network service providers to make decisions under uncertainty about customers' strategies. We evaluate the performance of the proposed mechanisms through both simulations and test-bed implementations, and the results show that the proposed mechanisms obtain up to 23% higher social welfare than existing approaches.

Session Chair

Ruidong Li (Kanazawa University, Japan)

Session C-7

Containers and Data Centers

Conference
10:00 AM — 11:30 AM EDT
Local
May 13 Thu, 7:00 AM — 8:30 AM PDT

Exploring Layered Container Structure for Cost Efficient Microservice Deployment

Lin Gu (Huazhong University of Science and Technology, China); Deze Zeng (China University of Geosciences, China); Jie Hu and Hai Jin (Huazhong University of Science and Technology, China); Song Guo (Hong Kong Polytechnic University, Hong Kong); Albert Zomaya (The University of Sydney, Australia)

Containers, a lightweight virtualization technology with the advantages of continuous integration and easy deployment, have been widely adopted to support diverse microservices. At runtime, non-local container images need to be frequently pulled from remote registries to local servers, resulting in large pulling traffic and hence long startup times. A distinctive feature of container-based microservices, which has not been exploited, is that container images have a layered structure and some common base layers can be shared between co-located microservices.

In this paper, we propose a layer-sharing microservice deployment and image pulling strategy which exploits layer sharing to speed up microservice startup and lower image storage consumption. The problem is formulated as an Integer Linear Program (ILP). An Accelerated Distributed Augmented Lagrangian (ADAL) based distributed algorithm, executed cooperatively by registries and servers, is proposed. Through extensive trace-driven experiments, we validate the high efficiency of our ADAL-based algorithm: it accelerates microservice startup by 2.30 times on average and reduces storage consumption by 55.33%.
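
The following minimal Python sketch, using hypothetical images and layer sizes, illustrates why layer sharing reduces pulling traffic when microservices are co-located; it is not the paper's ILP formulation or ADAL algorithm.

```python
# Illustrative sketch (hypothetical image data): how sharing common base
# layers reduces the bytes a server must pull when co-locating microservices.
images = {                      # image -> {layer_id: size_in_MB}
    "web":   {"os": 80, "runtime": 120, "web-app": 30},
    "api":   {"os": 80, "runtime": 120, "api-app": 25},
    "cache": {"os": 80, "cache-bin": 40},
}

def pull_traffic(deployed, share_layers=True):
    """Total MB pulled to deploy the given images on one server."""
    if not share_layers:
        return sum(size for img in deployed for size in images[img].values())
    pulled = {}                          # each shared layer is pulled once
    for img in deployed:
        pulled.update(images[img])
    return sum(pulled.values())

deploy = ["web", "api", "cache"]
print(pull_traffic(deploy, share_layers=False))  # 575 MB without sharing
print(pull_traffic(deploy, share_layers=True))   # 295 MB with sharing
```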

NetMARKS: Network Metrics-AwaRe Kubernetes Scheduler Powered by Service Mesh

Łukasz Wojciechowski (Samsung R&D Institute Poland, Poland); Krzysztof Opasiak and Jakub Latusek (Warsaw University of Technology & Samsung R&D Institute Poland, Poland); Maciej Wereski (Samsung R&D Institute Poland, Poland); Victor Morales (Samsung Research America, USA); Taewan Kim (Samsung Research, Samsung Electronics Co., Ltd., Korea (South)); Moonki Hong (Samsung Electronics, Co., Ltd., Korea (South))

Container technology has revolutionized the way software is packaged and run. The telecommunications industry, now challenged with the 5G transformation, views containers as the best way to achieve an agile infrastructure that can serve as a stable base for high-throughput, low-latency 5G edge applications. These challenges make optimal scheduling of performance-sensitive containerized workflows a matter of emerging importance. Meanwhile, the wide adoption of Kubernetes across industries has made it the de-facto standard for container orchestration. Several attempts have been made to improve Kubernetes scheduling, but existing solutions either do not respect current scheduling rules or consider only a static infrastructure viewpoint. To address this, we propose NetMARKS - a novel approach to Kubernetes pod scheduling that uses dynamic network metrics collected with Istio Service Mesh. This solution improves Kubernetes scheduling while being fully backward compatible. We validated our solution using different workloads and processing layouts. Based on our analysis, NetMARKS can reduce application response time by up to 37 percent and save up to 50 percent of inter-node bandwidth in a fully automated manner. This significant improvement is crucial to Kubernetes adoption in 5G use cases, especially for multi-access edge computing and machine-to-machine communication.

Optimal Rack-Coordinated Updates in Erasure-Coded Data Centers

Guowen Gong, Zhirong Shen and Suzhen Wu (Xiamen University, China); Xiaolu Li and Patrick Pak-Ching Lee (The Chinese University of Hong Kong, Hong Kong)

Erasure coding has been extensively deployed in today's data centers to tolerate prevalent failures, yet it is prone to generate substantial cross-rack traffic for parity updates. In this paper, we propose a new rack-coordinated update mechanism to suppress the cross-rack update traffic, comprising two successive phases: a delta-collecting phase that collects data delta chunks, and a selective parity update phase that renews the parity chunks based on the update pattern and parity layout. We further design RackCU, an optimal rack-coordinated update solution that achieves the theoretical lower bound on cross-rack update traffic. We finally conduct extensive evaluations, via large-scale simulation and real-world data center experiments, showing that RackCU can reduce the cross-rack update traffic by 22.1%-75.1% and hence improve the update throughput by 34.2%-292.6%.
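
As background on why collecting data delta chunks suffices, recall the standard linear erasure-coding identity (textbook algebra, not the paper's notation): each parity chunk is a linear combination of data chunks, so an update to data chunk \(D_i\) can be folded into each parity chunk via its delta:

```latex
P_j \;=\; \sum_{i=1}^{k} \alpha_{ji} D_i
\quad\Longrightarrow\quad
P_j' \;=\; P_j + \alpha_{ji}\,\Delta D_i ,
\qquad \Delta D_i \;=\; D_i' - D_i ,
```

so only the delta chunks need to cross racks, and the two phases control where deltas are aggregated before the parity chunks are renewed.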

Primus: Fast and Robust Centralized Routing for Large-scale Data Center Networks

Guihua Zhou, Guo Chen, Fusheng Lin, Tingting Xu, Dehui Wei and Jianbing Wu (Hunan University, China); Li Chen (Huawei, Hong Kong); Yuanwei Lu and Andrew Qu (Tencent, China); Hua Shao (Tsinghua University & Tencent, China); Hongbo Jiang (Hunan University, China)

This paper presents a fast and robust centralized data center network (DCN) routing solution called Primus. For fast routing calculation, Primus uses a centralized controller to collect and disseminate the network's link states (LS), and offloads the actual routing calculation onto each switch. Observing that routing changes fall into a few fixed patterns in DCNs, which have regular topologies, we simplify each switch's routing calculation into a table lookup: LS changes are compared against a pre-installed base topology and routing paths are updated according to predefined rules. As such, the routing calculation at each switch takes only tens of microseconds, even in a large network topology containing more than 10K switches. For efficient controller fault tolerance, Primus purposely uses a reporter switch to ensure that LS updates are successfully delivered to all affected switches. As such, Primus can use multiple stateless controllers and little redundant traffic to tolerate failures, which incurs little overhead in the normal case and keeps the routing reaction time at tens of milliseconds even under complex data-/control-plane failures. We design, implement and evaluate Primus with extensive experiments on Linux-machine controllers and white-box switches. Primus provides ∼1200x and ∼100x shorter convergence time than the distributed protocol BGP and the state-of-the-art centralized routing solution, respectively.
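
A minimal Python sketch of the table-lookup idea, with a hypothetical topology and rule format (Primus's actual encoding is defined in the paper): instead of recomputing shortest paths, the switch matches an LS change against pre-installed rules.

```python
# Minimal illustration (hypothetical topology/rules, not Primus's format):
# a switch reacts to a link-state change by table lookup instead of
# rerunning a full shortest-path computation.
BASE_NEXT_HOPS = {"prefix-A": ["spine1", "spine2"]}   # pre-installed base topology
REPAIR_RULES = {                                      # pre-defined update rules
    ("spine1", "down"): {"prefix-A": ["spine2"]},     # pattern -> new next hops
    ("spine1", "up"):   {"prefix-A": ["spine1", "spine2"]},
}

def on_link_state_change(routes, link, state):
    """Update routes by looking up the (link, state) pattern."""
    for prefix, nhops in REPAIR_RULES.get((link, state), {}).items():
        routes[prefix] = nhops
    return routes

routes = dict(BASE_NEXT_HOPS)
print(on_link_state_change(routes, "spine1", "down"))  # {'prefix-A': ['spine2']}
```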

Session Chair

Wei Wang (Hong Kong University of Science and Technology)

Session D-7

Measurement and Monitoring

Conference
10:00 AM — 11:30 AM EDT
Local
May 13 Thu, 7:00 AM — 8:30 AM PDT

Self-Adaptive Sampling for Network Traffic Measurement

Yang Du, He Huang and Yu-e Sun (Soochow University, China); Shigang Chen (University of Florida, USA); Guoju Gao (Soochow University, China)

Per-flow traffic measurement in high-speed networks plays an important role in many practical applications. Due to the limited on-chip memory and the mismatch between off-chip memory speed and line rate, sampling-based methods select and forward a part of the flow traffic to off-chip memory, complementing sketch-based solutions in estimation accuracy and online query support. However, most current work uses the same sampling probability for all flows, overlooking the fact that different flows require different sampling rates to meet the same accuracy constraint. This wastes storage and communication resources. In this paper, we present self-adaptive sampling, a framework that samples each flow with a probability adapted to the flow's size/spread. We then propose two algorithms, SAS-LC and SAS-LOG, geared towards per-flow spread estimation and per-flow size estimation respectively, using different compression functions. Experimental results based on real Internet traces show that, compared to NDS in per-flow spread estimation, SAS-LC can save around 10% of on-chip space and reduce the communication cost for large flows by up to 40%. Moreover, in per-flow size estimation, SAS-LOG can save 40% of on-chip space and reduce the communication cost for large flows by up to 96% compared with NDS.
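
To see why a uniform sampling probability is wasteful, consider the textbook bound for simple Bernoulli packet sampling (an illustration of the general principle, not the paper's SAS-LC/SAS-LOG algorithms): the minimum probability meeting a target relative standard error shrinks roughly inversely with flow size.

```python
def min_sampling_prob(flow_size: int, target_rel_err: float) -> float:
    """Smallest sampling probability so that a plain Bernoulli packet
    sampler estimates this flow's size within the target relative
    standard error (textbook binomial-sampling bound; the paper's
    SAS-LC/SAS-LOG use tailored compression functions instead).

    Derivation: for count c ~ Binomial(s, p), the estimate s_hat = c/p
    has relative variance (1-p)/(p*s) <= e^2  =>  p >= 1/(1 + e^2 * s).
    """
    e2 = target_rel_err ** 2
    return 1.0 / (1.0 + e2 * flow_size)

# Larger flows tolerate far lower sampling rates for the same accuracy:
for size in (100, 10_000, 1_000_000):
    print(size, round(min_sampling_prob(size, target_rel_err=0.05), 6))
```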

MTP: Avoiding Control Plane Overload with Measurement Task Placement

Xiang Chen (Peking University, Pengcheng Lab, and Fuzhou University, China); Qun Huang (Peking University, China); Peiqiao Wang (Fuzhou University, China); Hongyan Liu (Zhejiang University, China); Yuxin Chen (University of Science and Technology of China, China); Dong Zhang (Fuzhou University, China); Haifeng Zhou (Zhejiang University, and Zhejiang Lab, China); Chunming Wu (Zhejiang Lab, and Zhejiang University, China)

In programmable networks, measurement tasks are placed on programmable switches to keep pace with high-speed traffic. At runtime, programmable switches send events to the control plane for further processing. However, existing solutions for task placement overlook the limitations of control plane resources. Thus, excessive events may overload the control plane. In this paper, we propose MTP, a system that eliminates control plane overload via careful task placement. For each task, MTP analyzes its structure to estimate its maximum possible rate of sending events to the control plane. Then it builds an optimization framework that addresses the resource restrictions of both switches and the control plane. We have implemented MTP on Barefoot Tofino switches. The experimental results indicate that MTP outperforms existing solutions with higher accuracy across four real use cases.

Low Cost Sparse Network Monitoring Based on Block Matrix Completion

Kun Xie and Jiazheng Tian (Hunan University, China); Gaogang Xie (Institute of Computing Technology, Chinese Academy of Sciences, China); Guangxing Zhang (Institute of Computing Technology Chinese Academy of Sciences, China); Dafang Zhang (Hunan University, China)

Due to the high cost of network measurement, network-wide monitoring faces many challenges. For a network consisting of n nodes, the cost of one round of network-wide monitoring is O(n²). To reduce the monitoring cost, and inspired by recent progress in matrix completion, a novel sparse network monitoring scheme is proposed to obtain network-wide monitoring data by sampling a few paths while inferring the monitoring data of the others. However, current sparse network monitoring schemes suffer from high measurement cost, high computational complexity in sampling scheduling, and long times to recover the un-sampled data. We propose a novel block matrix completion that can guarantee the quality of the un-sampled data inference by selecting as few as m = O(nr ln(r)) samples for a rank-r N × T matrix with n = max{N, T}, which largely reduces the sampling complexity compared with existing matrix completion algorithms. Based on block matrix completion, we further propose a lightweight sampling scheduling algorithm to select measurement samples and a lightweight data inference algorithm to quickly and accurately recover the un-sampled data. Extensive experiments on three real network monitoring data sets verify our theoretical claims and demonstrate the effectiveness of the proposed algorithms.
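
A quick worked example of the abstract's sample bound, taking the hidden constant as 1 purely for illustration:

```python
import math

# Worked example using the stated bound m = O(n * r * ln r),
# with the constant factor assumed to be 1 for illustration only.
N = T = 1000          # a 1000 x 1000 monitoring matrix: 10^6 entries
r = 10                # assumed low rank of the monitoring data
n = max(N, T)
m = n * r * math.log(r)
print(f"samples: {m:,.0f} of {N*T:,} entries "
      f"({100*m/(N*T):.1f}% measured, rest inferred)")
# -> samples: 23,026 of 1,000,000 entries (2.3% measured, rest inferred)
```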

Expectile Tensor Completion to Recover Skewed Network Monitoring Data

Kun Xie and Siqi Li (Hunan University, China); Xin Wang (Stony Brook University, USA); Gaogang Xie (Institute of Computing Technology, Chinese Academy of Sciences, China); Yudian Ouyang (Hunan University, China)

Network applications, such as network state tracking and forecasting, anomaly detection, and failure recovery, require complete network monitoring data. However, the monitoring data are often incomplete due to the use of partial measurements and the unavoidable loss of data during transmission. Tensor completion has recently attracted attention for its capability of exploiting the multi-dimensional data structure to infer unmeasured/missing data more accurately. Although conventional tensor completion algorithms work well when the application data follow a symmetric normal distribution, they cannot handle network monitoring data well, as such data are highly skewed with heavy tails. To better follow the data distribution and more accurately recover missing entries with large values, we propose a novel expectile tensor completion (ETC) formulation and a simple yet efficient tensor completion algorithm without hard-set parameters, for easy implementation. We prove the convergence of the proposed algorithm both theoretically and experimentally. Extensive experiments on two real-world network monitoring datasets demonstrate the effectiveness of the proposed ETC.
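
For context, expectile regression replaces the symmetric squared loss with an asymmetric one; in a standard form (not necessarily the paper's exact notation):

```latex
\rho_\tau(u) \;=\; \bigl|\,\tau - \mathbf{1}\{u < 0\}\,\bigr|\; u^{2},
\qquad 0 < \tau < 1 ,
```

where \(u\) is the residual. Choosing \(\tau > 0.5\) penalizes under-estimation of large entries more heavily than over-estimation, which suits heavy-tailed monitoring data.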

Session Chair

Xiaolong Zheng (Beijing University of Posts and Telecommunications, China)

Session E-7

Federated Learning 3

Conference
10:00 AM — 11:30 AM EDT
Local
May 13 Thu, 7:00 AM — 8:30 AM PDT

Device Sampling for Heterogeneous Federated Learning: Theory, Algorithms, and Implementation

Su Wang (Purdue University, USA); Mengyuan Lee (Zhejiang University, China); Seyyedali Hosseinalipour (Purdue University, USA); Roberto Morabito (Ericsson Research & Princeton University, Finland); Mung Chiang (Purdue University, USA); Christopher G. Brinton (Purdue University & Zoomi Inc., USA)

The conventional federated learning (FedL) architecture distributes machine learning (ML) across worker devices by having them train local models that are periodically aggregated by a server. FedL ignores two important characteristics of contemporary wireless networks, however: (i) the network may contain heterogeneous communication/computation resources, and (ii) there may be significant overlaps in devices' local data distributions. In this work, we develop a novel optimization methodology that jointly accounts for these factors via intelligent device sampling complemented by device-to-device (D2D) offloading. Our optimization aims to select the best combination of sampled nodes and data offloading configuration to maximize FedL training accuracy subject to realistic constraints on the network topology and device capabilities. Theoretical analysis of the D2D offloading subproblem leads to new FedL convergence bounds and an efficient sequential convex optimizer. Using this result, we develop a sampling methodology based on graph convolutional networks (GCNs) which learns the relationship between network attributes, sampled nodes, and the resulting offloading that maximizes FedL accuracy. Through evaluation on real-world datasets and network measurements from our IoT testbed, we find that our methodology, while sampling less than 5% of all devices, substantially outperforms conventional FedL in both trained model accuracy and required resource utilization.
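
For reference, the periodic aggregation step that defines conventional FedL is the standard FedAvg weighted average (textbook form, independent of this paper):

```latex
w^{(t+1)} \;=\; \sum_{k \in \mathcal{S}_t} \frac{n_k}{\sum_{j \in \mathcal{S}_t} n_j}\; w_k^{(t)} ,
```

where \(\mathcal{S}_t\) is the set of sampled devices and \(n_k\) the size of device \(k\)'s local dataset; the paper's contribution lies in how \(\mathcal{S}_t\) is chosen and how D2D offloading reshapes the local datasets.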

Sample-level Data Selection for Federated Learning

Anran Li, Lan Zhang, Juntao Tan, Yaxuan Qin, Junhao Wang and Xiang-Yang Li (University of Science and Technology of China, China)

Federated learning (FL) enables participants to collaboratively construct a global machine learning model without sharing their local training data with the remote server. In FL systems, the selection of training samples has a significant impact on model performance, e.g., selecting participants whose datasets have erroneous samples, skewed categorical distributions, or low content diversity would result in low accuracy and unstable models. In this work, we aim to solve the exigent optimization problem of selecting a collection of high-quality training samples for a given FL task under a monetary budget in a privacy-preserving way, which is extremely challenging without visibility into participants' local data and training process. We provide a systematic analysis of important data-related factors affecting model performance and propose a holistic design to privately and efficiently select high-quality data samples considering all these factors. We verify the merits of our proposed solution with extensive experiments on a real AIoT system with 50 clients, including 20 edge computers, 20 laptops, and 10 desktops. The experimental results validate that our solution achieves accurate and efficient selection of high-quality data samples, and consequently an FL model with faster convergence and higher accuracy than that achieved by existing solutions.

An Incentive Mechanism for Cross-Silo Federated Learning: A Public Goods Perspective

Ming Tang and Vincent W.S. Wong (University of British Columbia, Canada)

In cross-silo federated learning (FL), organizations cooperatively train a global model with their local data. The organizations, however, may be heterogeneous in terms of their valuation of the precision of the trained global model and their training cost. Meanwhile, the computational and communication resources of the organizations are non-excludable public goods: even if an organization does not perform any local training, other organizations cannot prevent it from using the outcome of their resources (i.e., the trained global model). To address the organization heterogeneity and the public-goods feature, we formulate a social welfare maximization problem and propose an incentive mechanism for cross-silo FL. With the proposed mechanism, organizations achieve not only social welfare maximization but also individual rationality and budget balance. Moreover, we propose a distributed algorithm that enables organizations to maximize the social welfare without knowing each other's valuation and cost. Our simulations with the MNIST dataset show that the proposed algorithm converges faster than a benchmark method. Furthermore, when organizations place a higher valuation on precision, the proposed mechanism and algorithm are more beneficial in the sense that the organizations can achieve higher social welfare by participating in cross-silo FL.

Learning-Driven Decentralized Machine Learning in Resource-Constrained Wireless Edge Computing

Zeyu Meng, Hongli Xu and Min Chen (University of Science and Technology of China, China); Yang Xu (University of Science and Technology of China & School of Computer Science and Technology, China); Yangming Zhao and Chunming Qiao (University at Buffalo, USA)

Data generated at the network edge can be processed locally by leveraging the paradigm of edge computing. To fully utilize the widely distributed data, we concentrate on a wireless edge computing system that conducts model training using decentralized peer-to-peer (P2P) methods. However, there are two major challenges on the way towards efficient P2P model training: limited resources (e.g., network bandwidth and battery life of mobile edge devices) and time-varying network connectivity due to device mobility or wireless channel dynamics, both of which have received little attention in recent years. To address these two challenges, this paper adaptively constructs a dynamic and efficient P2P topology, where model aggregation occurs at the edge devices. In a nutshell, we first formulate the topology construction for P2P learning (TCPL) problem with resource constraints as an integer programming problem. A learning-driven method is then proposed to adaptively construct a topology at each training epoch. We further provide a convergence analysis for training machine learning models, even with non-convex loss functions. Extensive simulation results show that our proposed method can improve model training efficiency by about 11% under resource constraints and reduce the communication cost by about 30% under the same accuracy requirement compared to the benchmarks.

Session Chair

Chuan Wu (The University of Hong Kong)

Session F-7

Caching 2

Conference
10:00 AM — 11:30 AM EDT
Local
May 13 Thu, 7:00 AM — 8:30 AM PDT

Attack Resilience of Cache Replacement Policies

Tian Xie (Pennsylvania State University, USA); Ting He (Penn State University, USA); Patrick McDaniel and Namitha Nambiar (Pennsylvania State University, USA)

Caches are pervasively used in computer networks to speed up access by reusing previous communications, and various replacement policies are used to manage the cached contents. The replacement policy of a cache plays a key role in its performance, and is thus extensively engineered to achieve a high hit ratio in benign environments. However, some studies have shown that a policy with a higher hit ratio in benign environments may be more vulnerable to denial-of-service (DoS) attacks that intentionally send requests for unpopular contents. To understand cache performance under such attacks, we analyze, under the framework of TTL approximation, how well a suite of representative replacement policies preserve the hit ratios of legitimate users, while incorporating the delay for the cache to obtain missing content. We further develop a scheme to adapt the cache replacement policy based on the perceived level of attack. Our analysis and validation on real traces show that although no single policy is resilient to all the attack strategies, suitably adapting the replacement policy can notably improve the attack resilience of the cache.
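
For background, the TTL approximation referenced here (Che's approximation for LRU under Poisson arrivals) treats each content as if it were cached for a fixed characteristic time \(T\):

```latex
h_i \;=\; 1 - e^{-\lambda_i T},
\qquad \text{with } T \text{ solving} \quad
\sum_{i} \bigl(1 - e^{-\lambda_i T}\bigr) \;=\; C ,
```

where \(\lambda_i\) is the request rate of content \(i\) and \(C\) the cache capacity. An attacker inflating the request rates of unpopular contents increases the left-hand sum, forcing \(T\) to shrink and the legitimate hit ratios \(h_i\) to drop.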

Rate Allocation and Content Placement in Cache Networks

Khashayar Kamran, Armin Moharrer, Stratis Ioannidis and Edmund Yeh (Northeastern University, USA)

We introduce the problem of optimal congestion control in cache networks, whereby both rate allocations and content placements are optimized jointly. We formulate this as a maximization problem with non-convex constraints, and propose solving this problem via (a) a Lagrangian barrier algorithm and (b) a convex relaxation. We prove different optimality guarantees for each of these two algorithms; our proofs exploit the fact that the non-convex constraints of our problem involve DR-submodular functions.

Joint Cache Size Scaling and Replacement Adaptation for Small Content Providers

Jiahui Ye, Zichun Li, Zhi Wang and Zhuobin Zheng (Tsinghua University, China); Han Hu (Beijing Institute of Technology, China); Wenwu Zhu (Tsinghua University, China)

Elastic Content Delivery Networks (Elastic CDNs) have been introduced to support explosive Internet traffic growth by providing small Content Providers (CPs) with just-in-time services. Due to their diverse requirements, small CPs need customized adaptive caching modules that help them adjust the cached contents to maximize their long-term utility. The traditional adaptive caching module is usually a built-in service in a cloud CDN, which adaptively changes cache contents using size-scaling-only or strategy-adaptation-only methods. A natural question is: can we jointly optimize size and strategy to achieve a better tradeoff and performance for small CPs when renting services from elastic CDNs? The problem is challenging because the two decision variables involve both discrete and categorical variables, where discrete variables have an intrinsic order while categorical variables do not. In this paper, we propose JEANA, a distribution-guided reinforcement learning framework that learns a joint cache size scaling and strategy adaptation policy. We design a distribution-guided regularizer to keep the intrinsic order of discrete variables. More importantly, we prove that our algorithm has a theoretical guarantee of performance improvement. Trace-driven experimental results demonstrate that our method improves the hit ratio while reducing the rental cost.

Self-adjusting Advertisement of Cache Indicators with Bandwidth Constraints

Itamar Cohen (Politecnico di Torino, Italy); Gil Einziger (Ben-Gurion University Of The Negev, Israel); Gabriel Scalosub (Ben-Gurion University of the Negev, Israel)

Cache advertisements reduce the access cost by allowing users to skip the cache when it does not contain their datum. Such advertisements are used in multiple networked domains such as 5G networks, wide area networks, and information-centric networking. The selection of an advertisement strategy exposes a trade-off between access cost and bandwidth consumption. Still, existing works mostly apply a trial-and-error approach to selecting the best strategy, as the rigorous foundations required for optimizing such decisions are lacking.

Our work shows that the desired advertisement policy depends on numerous parameters such as the cache policy, the workload, the cache size, and the available bandwidth. In particular, we show that there is no single ideal configuration. Therefore, we design an adaptive, self-adjusting algorithm that periodically selects an advertisement policy. Our algorithm does not require any prior information about the cache policy, cache size, or workload, and does not require any a priori configuration. Through extensive simulations, using several state-of-the-art cache policies and real workloads, we show that our approach attains a cost similar to that of the best static configuration (which is only identified in retrospect) in each case.

Session Chair

Sergey Gorinsky (IMDEA Networks Institute, Spain)

Session G-7

Wireless

Conference
10:00 AM — 11:30 AM EDT
Local
May 13 Thu, 7:00 AM — 8:30 AM PDT

GPU-Ether: GPU-native packet I/O for GPU applications on commodity Ethernet

Changue Jung, Suhwan Kim, Ikjun Yeom, Honguk Woo and Younghoon Kim (Sungkyunkwan University, Korea (South))

Despite the advent of various network enhancement technologies, it remains a challenge to provide high-performance networking for GPU-accelerated applications on commodity Ethernet. Kernel-bypass I/O, such as DPDK or netmap, is normally optimized for host-memory-based CPU applications and has limited ability to improve the performance of GPU-accelerated applications due to the data transfer overhead between host and GPU. In this paper, we propose GPU-Ether, GPU-native packet I/O on commodity Ethernet, which enables direct network access from the GPU via dedicated persistent kernel threads. We implement a GPU-Ether prototype on a commodity Ethernet NIC and perform extensive testing to evaluate it. The results show that GPU-Ether can provide high throughput and low latency for GPU applications.

On the Reliability of IEEE 802.1CB FRER

Doğanalp Ergenç and Mathias Fischer (University of Hamburg, Germany)

The introduction of IEEE Time-Sensitive Networking (TSN) enables the design of real-time and mission-critical networks based on Ethernet technologies. Apart from providing the necessary tools for near-deterministic scheduling, TSN comes with further functionality for configurability, security, and reliability. IEEE 802.1CB Frame Replication and Elimination for Reliability (FRER) is the only protocol in the TSN toolbox for adding fault tolerance by sending the same packets via redundant paths. Although its core functions are defined by the standard, its effective use depends mainly on the actual deployment scenario and the path selection strategy. In this paper, we show that FRER can induce unintentional elimination of packets when the paths chosen for a particular packet flow are non-disjoint. We propose a new metric, reassurance, for use in FRER path selection. Besides, we propose an additional enhancement to FRER that prevents unintended packet eliminations independent of the deployment scenario. Our simulation results indicate that reassurance-based path selection performs better than random or maximum-disjoint path selection in random failure scenarios.
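
The sketch below illustrates, on a hypothetical topology, why non-disjoint redundant paths weaken FRER: copies traversing a shared link can both be lost to a single failure. The paper defines its reassurance metric precisely; counting shared links is used here only as a simple stand-in.

```python
from itertools import combinations

# Illustrative sketch (hypothetical topology): redundant FRER paths that
# share links give less protection, since one failure can hit both copies.
paths = {
    "p1": [("A", "B"), ("B", "C"), ("C", "D")],
    "p2": [("A", "E"), ("E", "C"), ("C", "D")],   # shares link C-D with p1
    "p3": [("A", "F"), ("F", "G"), ("G", "D")],   # fully disjoint from p1
}

def shared_links(p, q):
    """Links traversed by both paths; any such link is a shared fate."""
    return set(paths[p]) & set(paths[q])

for p, q in combinations(paths, 2):
    print(p, q, "shared:", shared_links(p, q) or "none")
```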

Reversible Models for Wireless Multi-Channel Multiple Access

Michael Neely (University of Southern California, USA)

This paper presents a network layer model for a wireless multiple access system with both persistent and nonpersistent users. There is a single access point with multiple identical channels. Each user who wants to send a file first scans a subset of the channels to find one that is idle. If at least one idle channel is found, the user transmits a file over that channel. If no idle channel is found, a persistent user will repeat the access attempt at a later time, while a nonpersistent user will leave. This is a useful mathematical model for situations where a group of persistent users stay near an access point for an extended period of time while nonpersistent users come and go. Users have heterogeneous activity behavior, file upload rates, and service durations. The system is a complex multi-dimensional Markov chain. The steady state probabilities are found by exploiting a latent reversibility property and leveraging a discrete Fourier transform. This enables simple expressions for throughput and blocking probability.

Session Chair

Zhi Sun (SUNY Buffalo)

Session Break-1-May13

Virtual Coffee Break

Conference
11:30 AM — 12:00 PM EDT
Local
May 13 Thu, 8:30 AM — 9:00 AM PDT

Session A-8

Attacks

Conference
12:00 PM — 1:30 PM EDT
Local
May 13 Thu, 9:00 AM — 10:30 AM PDT

Launching Smart Selective Jamming Attacks in WirelessHART Networks

Xia Cheng, Junyang Shi and Mo Sha (State University of New York at Binghamton, USA); Linke Guo (Clemson University, USA)

As a leading industrial wireless standard, WirelessHART has been widely implemented to build wireless sensor-actuator networks (WSANs) in industrial facilities, such as oil refineries, chemical plants, and factories. For instance, 54,835 WSANs that implement the WirelessHART standard have been deployed globally by Emerson Process Management, a WirelessHART network supplier, to support process automation. While the existing research on improving industrial WSANs focuses mainly on enhancing network performance, the security aspects have not been given enough attention. We have identified a new threat to WirelessHART networks, namely smart selective jamming attacks, where the attacker first cracks the channel usage, routes, and parameter configuration of the victim network, and then jams the transmissions of interest on their specific communication channels in their specific time slots, which makes the attacks energy-efficient and hard to detect. In this paper, we present this severe, stealthy threat by demonstrating the step-by-step attack process on a 50-node network that runs a publicly accessible WirelessHART implementation. Experimental results show that the smart selective jamming attacks significantly reduce network reliability without triggering network updates.

Your Home is Insecure: Practical Attacks on Wireless Home Alarm Systems

Tao Li (IUPUI, USA); Dianqi Han, Jiawei Li, Ang Li and Yan Zhang (Arizona State University, USA); Rui Zhang (University of Delaware, USA); Yanchao Zhang (Arizona State University, USA)

Wireless home alarm systems are being widely deployed, but their security has not been well studied. Existing attacks on wireless home alarm systems exploit the vulnerabilities of networking protocols while neglecting the problems arising from the physical component of IoT devices. In this paper, we present new event-eliminating and event-spoofing attacks on commercial wireless home alarm systems by interfering with the reed switch found in almost all COTS alarm sensors. In both attacks, the external adversary uses his own magnet to control the state of the reed switch in order to either eliminate legitimate alarms or spoof false alarms. We also present a new battery-depletion attack that uses programmable electromagnets to quickly and stealthily deplete, within hours, an alarm sensor battery that is expected to last a few years. The efficacy of our attacks is confirmed by detailed experiments on a representative Ring alarm system.

Tornadoes In The Cloud: Worst-Case Attacks on Distributed Resources Systems

Jhonatan Tavori and Hanoch Levy (Tel Aviv University, Israel)

Geographically distributed cloud networks are used by a variety of applications and services worldwide. As the demand for these services increases, their data centers form an attractive target for malicious attackers aiming to harm the services. In this study we address sophisticated attackers who aim at causing maximal damage to the service.

A worst-case (damage-maximizing) attack is an attack which minimizes the revenue of the system operator by preventing users from being served. A sophisticated attacker needs to decide how many attacking agents should be launched at each of the system's regions in order to inflict maximal damage.

We characterize and analyze damage-maximization strategies for a number of attacks, including a deterministic attack, a concurrent stochastic-agents attack, an approximation of a virus-spread attack, and an over-size binomial attack. We also address a user-migration defense, which allows demands to be migrated dynamically among regions, and we provide efficient algorithms for deriving worst-case attacks given a system with arbitrary placement and demands. The results form a basis for devising resource allocation strategies that minimize attack damage.

Invisible Poison: A Blackbox Clean Label Backdoor Attack to Deep Neural Networks

Rui Ning, Jiang Li, ChunSheng Xin and Hongyi Wu (Old Dominion University, USA)

This paper reports a new clean-label data-poisoning backdoor attack, named Invisible Poison, which stealthily and aggressively plants a backdoor in neural networks. It converts a regular trigger into a noised trigger that can be easily concealed inside images used for training the network, with the objective of planting a backdoor that can later be activated by the trigger. Compared with existing data-poisoning backdoor attacks, this new attack has the following distinct properties. First, it is a black-box attack, requiring zero knowledge of the target model. Second, it uses "invisible poison" to achieve stealthiness: the trigger is disguised as 'noise' and thus easily evades human inspection, yet remains effective in the feature space for poisoning training data. Third, the attack is practical and aggressive. A backdoor can be effectively planted with a small amount of poisoned data and is robust to most data augmentation methods during training. The attack is fully tested on multiple benchmark datasets including MNIST, Cifar10, and ImageNet10, as well as application-specific datasets such as Yahoo Adblocker and GTSRB. Two countermeasures, namely Supervised and Unsupervised Poison Sample Detection, are introduced to defend against the attack.

Session Chair

Ruozhou Yu (North Carlolina State University)

Session B-8

Optimization

Conference
12:00 PM — 1:30 PM EDT
Local
May 13 Thu, 9:00 AM — 10:30 AM PDT

Blind Optimal User Association in Small-Cell Networks

Livia E. Chatzieleftheriou (Athens University of Economics and Business, Greece); Apostolos Destounis (Huawei Technologies France Research Center, France); Georgios Paschos (Amazon, Luxembourg); Iordanis Koutsopoulos (Athens University of Economics and Business, Greece)

We learn optimal policies for associating traffic from different locations to Access Points (APs), in the presence of unknown dynamic traffic demand. We aim at minimizing a broad family of α-fair cost functions that express various load-assignment objectives in the wireless downlink, such as total load or total delay minimization. Finding an optimal user association policy in dynamic environments is challenging because traffic demand fluctuations over time are non-stationary and difficult to characterize statistically, which obstructs the computation of cost-efficient associations. Assuming arbitrary traffic patterns over time, we formulate the problem of online learning of optimal user association policies using the Online Convex Optimization (OCO) framework. We introduce a periodic benchmark for OCO problems that generalizes state-of-the-art benchmarks. We exploit inherent properties of the online user association problem and propose PerOnE, a simple online learning scheme that dynamically adapts the association policy to arbitrary traffic demand variations. We compare PerOnE against our periodic benchmark and prove that it enjoys the no-regret property, with an additional sublinear dependence on the network size. To the best of our knowledge, this is the first work to introduce a periodic benchmark for OCO problems and a no-regret algorithm for the online user association problem. Our theoretical findings are validated on a real-trace dataset.
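
For readers outside the OCO literature, the standard static regret that a no-regret algorithm drives sublinear is (textbook definition; the paper's periodic benchmark generalizes the comparator):

```latex
R_T \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \min_{x \in \mathcal{X}} \sum_{t=1}^{T} f_t(x) ,
```

where \(f_t\) is the cost revealed at round \(t\) and \(x_t\) the decision taken. No-regret means \(R_T = o(T)\); a periodic benchmark lets the comparator follow a periodic sequence rather than a single fixed point.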

Dynamically Choosing the Candidate Algorithm with Ostasos in Online Optimization

Weirong Chen, Jiaqi Zheng and Haoyu Yu (Nanjing University, China)

The increasing challenge in designing online algorithms lies in distribution uncertainty. To cope with distribution variations in online optimization, an intuitive idea is to reselect, from the candidate set, an algorithm that will be more suitable for future distributions. In this paper, we propose Ostasos, an automatic algorithm selection framework that can choose the most suitable algorithm on the fly with provable guarantees. Rigorous theoretical analysis demonstrates that the performance of Ostasos is no worse than that of any candidate algorithm in terms of competitive ratio. Finally, we apply Ostasos to the online car-hailing problem, and trace-driven experiments verify its effectiveness.

Taming Time-Varying Information Asymmetry in Fresh Status Acquisition

Zhiyuan Wang (The Chinese University of Hong Kong, Hong Kong); Lin Gao (Harbin Institute of Technology (Shenzhen), China); Jianwei Huang (The Chinese University of Hong Kong, Shenzhen, China)

Many online platforms provide valuable real-time content (e.g., traffic conditions) by continuously acquiring the status of different Points of Interest (PoIs). In status acquisition, it is challenging to determine how frequently a PoI should upload its status to a platform, since PoIs are self-interested with private and possibly time-varying preferences. This paper considers a general multi-period status acquisition system, aiming to maximize the aggregate social welfare and ensure platform freshness. Freshness is measured by a metric termed age of information. To this end, we devise a long-term decomposition (LtD) mechanism to resolve the time-varying information asymmetry. The key idea is to construct a virtual social welfare that depends only on the current private information, and then decompose the per-period operation into multiple distributed bidding problems for the PoIs and platforms. The LtD mechanism enables the platforms to achieve a tunable trade-off between payoff maximization and freshness conditions. Moreover, the LtD mechanism retains the same social performance as the benchmark with symmetric information and asymptotically ensures the platform freshness conditions. Numerical results based on real-world data show that when the platforms pay more attention to payoff maximization, each PoI still obtains a non-negative payoff in the long term.
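
For reference, the age-of-information metric mentioned here is conventionally defined as (standard definition, not specific to this paper):

```latex
\Delta(t) \;=\; t - u(t),
\qquad
\bar{\Delta} \;=\; \frac{1}{T}\int_{0}^{T} \Delta(t)\,\mathrm{d}t ,
```

where \(u(t)\) is the generation time of the freshest status update received by time \(t\); freshness conditions bound how large the (average) age may grow.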

ToP: Time-dependent Zone-enhanced Points-of-interest Embedding-based Explainable Recommender system

En Wang, Yuanbo Xu, Yongjian Yang, Fukang Yang, Chunyu Liu and Yiheng Jiang (Jilin University, China)

Points-of-interest (POI) recommendation plays a vital role in introducing unexplored POIs to consumers and has drawn extensive attention from both academia and industry. Existing POI recommender systems usually learn latent vectors to represent both consumers and POIs from historical check-ins and make recommendations under spatio-temporal constraints. However, we argue that existing works still fall short of explaining consumers' complicated check-in actions. In this paper, we first explore the interpretability of recommendations from the POI aspect: for a specific POI, its function usually changes over time, so representing a POI with a single fixed latent vector is not sufficient to describe its dynamic function. Besides, check-in actions at a POI are also affected by the zone it belongs to. In other words, a zone's embedding, learned from POI distributions, road segments, and historical check-ins, can be jointly utilized to enhance the accuracy of POI recommendations. Along this line, we propose a Time-dependent Zone-enhanced POI embedding model (ToP), a recommender system that integrates a knowledge graph and a topic model to introduce spatio-temporal effects into POI embeddings for stronger interpretability of recommendations. Specifically, ToP learns multiple latent vectors for a POI at different times to capture its dynamic functions. By jointly combining these vectors with zone representations, ToP enhances the spatio-temporal interpretability of POI recommendations. With this hybrid architecture, some existing POI recommender systems can be treated as special cases of ToP. Extensive experiments on real-world Changchun city datasets demonstrate that ToP not only achieves state-of-the-art performance in terms of common metrics, but also provides more insights into consumers' POI check-in actions.

Session Chair

Bin Li (U. Rhode Island)

Session C-8

Sea, Space and Quantum Networks

Conference
12:00 PM — 1:30 PM EDT
Local
May 13 Thu, 9:00 AM — 10:30 AM PDT

PolarTracker: Attitude-aware Channel Access for Floating Low Power Wide Area Networks

Yuting Wang, Xiaolong Zheng, Liang Liu and Huadong Ma (Beijing University of Posts and Telecommunications, China)

Low Power Wide Area Networks (LPWANs) such as Long Range (LoRa) show great potential in emerging aquatic IoT applications. However, our deployment experience shows that floating LPWANs suffer significant performance degradation compared to static terrestrial deployments. Our measurement results reveal that the reason is the polarization and directivity of the antenna. The dynamic attitude of a floating node incurs varying signal strength losses, which is ignored by the attitude-oblivious link model adopted in most existing methods. When the channel is accessed at a misaligned attitude, packet errors can occur. In this paper, we propose an attitude-aware link model that explicitly quantifies the impact of node attitude on link quality. Based on the new model, we propose PolarTracker, a novel channel access method for floating LPWANs. PolarTracker tracks the node's attitude alignment state and schedules transmissions into the aligned periods with better link quality. We implement a prototype of PolarTracker on commercial LoRa platforms and extensively evaluate its performance in various real-world environments. The experimental results show that PolarTracker can improve the packet reception ratio by 48.8% compared with ALOHA in LoRaWAN.
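
The physics behind the attitude effect can be glimpsed from the textbook polarization loss factor (a standard antenna-theory identity, not the paper's full link model):

```latex
\mathrm{PLF} \;=\; \cos^{2}\psi,
\qquad
L_{\mathrm{pol}} \;=\; -10\,\log_{10}\!\bigl(\cos^{2}\psi\bigr)\ \text{dB},
```

where \(\psi\) is the misalignment angle between the transmit and receive polarization vectors; at \(\psi = 60^{\circ}\) the mismatch already costs about 6 dB, and the loss grows without bound as the antennas approach orthogonality.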

Mobility- and Load-Adaptive Controller Placement and Assignment in LEO Satellite Networks

Long Chen, Feilong Tang and Xu Li (Shanghai Jiao Tong University, China)

Software-defined networking (SDN) based LEO satellite networks can make full use of satellite resources through flexible function configuration and efficient resource management by controllers. Consequently, controllers have to be carefully deployed based on the dynamic topology and time-varying workload. However, existing work on controller placement and assignment is not applicable to LEO satellite networks with highly dynamic topologies and randomly fluctuating loads. In this paper, we first formulate the adaptive controller placement and assignment (ACPA) problem and prove its NP-hardness. Then, we propose the control relation graph (CRG) to quantitatively capture the control overhead in LEO satellite networks. Next, we propose the CRG-based controller placement and assignment (CCPA) algorithm with a bounded approximation ratio. Finally, using the predicted topology and estimated traffic load, a lookahead-based improvement algorithm is designed to further decrease the overall management costs. Extensive emulation results demonstrate that the CCPA algorithm outperforms related schemes in terms of response time and load balancing.

Time-Varying Resource Graph Based Resource Model for Space-Terrestrial Integrated Networks

Long Chen and Feilong Tang (Shanghai Jiao Tong University, China); Zhetao Li (Xiangtan University, China); Laurence T. Yang (St. Francis Xavier University, Canada); Jiadi Yu and Bin Yao (Shanghai Jiao Tong University, China)

It is critical but difficult to efficiently model resources in space-terrestrial integrated networks (STINs). Existing work is not applicable to STINs because it lacks joint consideration of different movement patterns and fluctuating loads.

In this paper, we propose the time-varying resource graph (TVRG) to model STINs from the resource perspective. First, we propose a STIN mobility model to uniformly capture the different movement patterns in STINs. Then, we propose a layered Resource Modeling and Abstraction (RMA) approach, where the evolution of node resources is modeled as a Markov process, encoding predictable topologies and the influence of fluctuating loads as states. Besides, we propose a low-complexity domain resource abstraction algorithm by defining two mobility-based and load-aware partial orders on resource abilities. Finally, we propose an efficient TVRG-based Resource Scheduling (TRS) algorithm for time-sensitive and bandwidth-intensive data flows, with multi-level on-demand scheduling ability. Comprehensive simulation results demonstrate that RMA-TRS outperforms related schemes in terms of throughput, end-to-end delay, and flow completion time.

Redundant Entanglement Provisioning and Selection for Throughput Maximization in Quantum Networks

Yangming Zhao and Chunming Qiao (University at Buffalo, USA)

1
Quantum communication using qubits based on the principle of entangled photons is a promising solution to improve network security. However, it is difficult to successfully create an entanglement link or connection between two nodes, especially when they are far apart. In addition, only one qubit can be exchanged over an established entanglement connection, resulting in low throughput.

In this paper, we propose Redundant Entanglement Provisioning and Selection (REPS) to maximize the throughput for multiple source-destination (SD) pairs in a circuit-switched, multi-hop quantum network. REPS has two distinct features: (i) it provisions backup resources for extra entanglement links between adjacent nodes for failure tolerance; and (ii) it provides flexibility in selecting successfully created entanglement links to establish entanglement connections for the SD pairs, achieving network-wide optimization. Extensive analysis and simulations show that REPS achieves optimal routing with high probability and improves the throughput by up to 68.35% over the best existing algorithms. It also improves fairness among the SD pairs in the network.
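
The selection stage can be pictured as flow routing over whichever redundant entanglement links happen to succeed. The sketch below is a toy stand-in, assuming a ring topology, a per-attempt success probability, and greedy per-pair max-flow; REPS itself solves a network-wide optimization.

```python
# Toy sketch of redundant provisioning + selection in the spirit of REPS.
# Topology, success probability, and the greedy per-pair routing are all
# illustrative assumptions.
import random
import networkx as nx

random.seed(1)
base = nx.cycle_graph(6)            # physical adjacencies
PROVISIONED = 3                     # redundant link attempts per edge
P_SUCCESS = 0.6                     # probability one attempt succeeds

g = nx.DiGraph()
for u, v in base.edges():
    created = sum(random.random() < P_SUCCESS for _ in range(PROVISIONED))
    if created:                     # usable entanglement links, both directions
        g.add_edge(u, v, capacity=created)
        g.add_edge(v, u, capacity=created)

throughput = 0
for s, d in [(0, 3), (1, 4)]:       # SD pairs, served greedily
    if s in g and d in g and nx.has_path(g, s, d):
        flow, flow_dict = nx.maximum_flow(g, s, d)
        throughput += flow
        for u, nbrs in flow_dict.items():   # consume the used capacity
            for v, f in nbrs.items():
                if f:
                    g[u][v]["capacity"] -= f
print("entanglement connections served:", throughput)
```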

Session Chair

Ana Aguiar (University of Porto, Portugal)

Session D-8

WiFi

Conference
12:00 PM — 1:30 PM EDT
Local
May 13 Thu, 9:00 AM — 10:30 AM PDT

BLESS: BLE-aided Swift Wi-Fi Scanning in Multi-protocol IoT Networks

Wonbin Park and Dokyun Ryoo (Seoul National University, Korea (South)); Changhee Joo (Korea University, Korea (South)); Saewoong Bahk (Seoul National University, Korea (South))

1
Wi-Fi scanning, which searches for neighboring access points (APs), is an essential prerequisite for Wi-Fi operations such as initial association and handover. As traffic demand increases, APs are more densely deployed and the number of operating Wi-Fi channels also increases, which results in additional scanning delay and makes scanning a burdensome task. In this paper, we note that the co-location of the Wi-Fi protocol with the BLE protocol is common practice in IoT networks, and we develop a Wi-Fi passive scanning framework that uses BLE to assist scanning. Although the framework has great potential to improve scanning performance without explicit message exchanges, there are technical challenges related to time synchronization and channel switching delay. We address these challenges and develop a practical passive scanning scheme, named BLESS-Sync. We verify its performance through testbed experiments and extensive simulations, and show that BLESS-Sync significantly outperforms legacy Wi-Fi scanning in terms of scanning delay and energy efficiency.

Efficient Association of Wi-Fi Probe Requests under MAC Address Randomization

Jiajie Tan and S.-H. Gary Chan (The Hong Kong University of Science and Technology, China)

0
Wi-Fi-enabled devices such as smartphones periodically search for available networks by broadcasting probe requests which encapsulate MAC addresses as the device identifiers. To protect privacy (user identity and location), modern devices embed random MAC addresses in their probe frames, the so-called MAC address randomization. Such randomization greatly hampers statistical analysis such as people counting and trajectory inference. To mitigate its impact while respecting privacy, we propose Espresso, a simple, novel and efficient approach which establishes probe request association under MAC address randomization. Espresso models the frame association as a flow network, with frames as nodes and frame correlation as edge cost. To estimate the correlation between any two frames, it considers the multimodality of request frames, including information elements, sequence numbers and received signal strength. It then associates frames with minimum-cost flow optimization. To the best of our knowledge, this is the first piece of work that formulates the probe request association problem as network flow optimization using frame correlation. We have implemented Espresso and conducted extensive experiments in a leading shopping mall. Our results show that Espresso outperforms the state-of-the-art schemes in terms of discrimination accuracy (> 80%) and V-measure scores (> 0.85).
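
Espresso formulates association as a minimum-cost flow over frame correlations. As a simplified, hedged illustration, the sketch below solves the special case of matching probe frames across two adjacent time windows with an optimal bipartite assignment; the feature names (seq, rssi, ie) and cost weights are illustrative assumptions.

```python
# Simplified stand-in for Espresso's flow formulation: a minimum-cost
# bipartite assignment between the probe requests of two adjacent windows.
import numpy as np
from scipy.optimize import linear_sum_assignment

def frame_cost(a, b):
    # Lower cost = more likely the same physical device. We combine the
    # sequence-number gap, RSSI difference, and information-element mismatch.
    seq_gap = (b["seq"] - a["seq"]) % 4096          # 12-bit sequence counter
    rssi_d = abs(a["rssi"] - b["rssi"])
    ie_d = 0.0 if a["ie"] == b["ie"] else 1.0
    return 0.01 * seq_gap + 0.1 * rssi_d + 5.0 * ie_d

win1 = [{"seq": 100, "rssi": -52, "ie": "vendorA"},
        {"seq": 800, "rssi": -70, "ie": "vendorB"}]
win2 = [{"seq": 803, "rssi": -69, "ie": "vendorB"},
        {"seq": 104, "rssi": -50, "ie": "vendorA"}]

cost = np.array([[frame_cost(a, b) for b in win2] for a in win1])
rows, cols = linear_sum_assignment(cost)            # optimal assignment
for i, j in zip(rows, cols):
    print(f"frame {i} in window 1 <-> frame {j} in window 2 "
          f"(cost {cost[i, j]:.2f})")
```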

Coexistence of Wi-Fi 6E and 5G NR-U: Can We Do Better in the 6 GHz Bands?

Gaurang Naik and Jung-Min (Jerry) Park (Virginia Tech, USA)

0
Regulators in the US and Europe have stepped up their efforts to open the 6 GHz bands for unlicensed access. The two unlicensed technologies likely to operate and coexist in these bands are Wi-Fi 6E and 5G New Radio Unlicensed (NR-U). The greenfield 6 GHz bands allow us to take a fresh look at the coexistence between Wi-Fi and 3GPP-based unlicensed technologies. In this paper, using tools from stochastic geometry, we study the impact of Multi-User Orthogonal Frequency Division Multiple Access (MU OFDMA), a feature introduced in 802.11ax, on this coexistence issue. Our results reveal that by disabling the use of the legacy contention mechanism (and allowing only MU OFDMA) for uplink access in Wi-Fi 6E, the performance of both NR-U networks and uplink Wi-Fi 6E can be improved. This is indeed feasible in the 6 GHz bands, where there are no operational Wi-Fi or NR-U users. In so doing, we also highlight the importance of accurate channel sensing at the entity that schedules uplink transmissions in Wi-Fi 6E and NR-U. If the channel is incorrectly detected as idle, factors that improve the uplink performance of one technology contribute negatively to the performance of the other technology.

LoFi: Enabling 2.4GHz LoRa and WiFi Coexistence by Detecting Extremely Weak Signals

Gonglong Chen, Wei Dong and Jiamei Lv (Zhejiang University, China)

1
Low-Power Wide Area Networks (LPWANs) have emerged as attractive communication technologies for connecting the Internet of Things. A new LoRa chip has been proposed to provide long-range and low-power support on 2.4GHz. Compared with previous LoRa radios operating on sub-gigahertz bands, the new one can transmit LoRa packets faster without strict channel duty-cycle limitations, and it has attracted much attention. Prior studies have shown that LoRa packets may suffer severe corruption under WiFi interference. However, existing approaches have many limitations, such as heavy signal processing overhead on weak devices or low detection accuracy. In this paper, we propose a novel weak signal detection approach, LoFi, to enable the coexistence of LoRa and WiFi. LoFi utilizes a typical physical phenomenon, Stochastic Resonance (SR), to boost weak signals at a specific frequency by adding appropriate white noise. Based on the detected spectrum occupancy of LoRa signals, LoFi reserves the spectrum for LoRa transmissions. We implement LoFi on USRP N210 and conduct extensive experiments to evaluate its performance. Results show that LoFi can enable the coexistence of LoRa and WiFi in 2.4GHz: the packet reception ratio of LoRa reaches 98% over an occupied 20MHz WiFi channel, and the WiFi throughput loss is reduced by up to 13%.
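
For intuition about the physics LoFi builds on, here is a minimal stochastic-resonance demo under stated assumptions (an overdamped bistable system driven by a weak, sub-threshold tone): the spectral peak at the tone's frequency typically grows once a moderate amount of white noise is added, then degrades as noise dominates. All parameters are illustrative, not LoFi's.

```python
# Minimal stochastic resonance illustration: Euler-Maruyama integration of
# dx/dt = x - x^3 + a*sin(2*pi*f*t) + noise, then the spectral amplitude at f.
import numpy as np

def tone_amplitude(noise_std, a=0.3, f_sig=0.05, fs=20.0, T=2000.0):
    n = int(fs * T)
    dt = 1.0 / fs
    t = np.arange(n) * dt
    drive = a * np.sin(2 * np.pi * f_sig * t)   # weak (sub-threshold) tone
    x = np.zeros(n)
    for i in range(1, n):
        dw = np.sqrt(dt) * np.random.randn()
        x[i] = x[i-1] + dt * (x[i-1] - x[i-1]**3 + drive[i-1]) + noise_std * dw
    spec = np.abs(np.fft.rfft(x)) * 2 / n       # single-sided amplitude
    freqs = np.fft.rfftfreq(n, dt)
    return spec[np.argmin(np.abs(freqs - f_sig))]

np.random.seed(0)
for std in (0.0, 0.5, 2.5):
    print(f"noise std {std:3.1f}: amplitude at the tone {tone_amplitude(std):.3f}")
```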

Session Chair

Christoph Sommer (TU Dresden, Germany)

Session E-8

Distributed ML

Conference
12:00 PM — 1:30 PM EDT
Local
May 13 Thu, 9:00 AM — 10:30 AM PDT

Live Gradient Compensation for Evading Stragglers in Distributed Learning

Jian Xu (Tsinghua University, China); Shao-Lun Huang (Tsinghua-Berkeley Shenzhen Institute, China); Linqi Song (City University of Hong Kong, Hong Kong); Tian Lan (George Washington University, USA)

1
The training efficiency of distributed learning systems is vulnerable to stragglers, i.e., slow worker nodes. A naive strategy is to perform distributed learning by incorporating only the fastest K workers and ignoring the stragglers, which may induce high deviation for non-IID data. To tackle this, we develop a Live Gradient Compensation (LGC) strategy that incorporates the one-step-delayed gradients from stragglers, aiming to accelerate the learning process and utilize the stragglers simultaneously. In the LGC framework, mini-batch data are divided into smaller blocks and processed separately, which makes the gradient computed from partial work accessible. In addition, we provide a theoretical convergence analysis of our algorithm for non-convex optimization problems under non-IID training data, showing that LGC-SGD has almost the same convergence error as fully synchronous SGD. The theoretical results also allow us to quantify a novel tradeoff between training time and error by selecting the optimal straggler threshold. Finally, extensive simulation experiments on image classification with the CIFAR-10 dataset are conducted, and the numerical results demonstrate the effectiveness of our proposed strategy.
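
A toy sketch of the compensation idea under stated assumptions (quadratic per-worker objectives, one straggler per iteration, one-step-stale gradients); it is meant only to show how delayed gradients keep stragglers' non-IID data in the update instead of dropping it.

```python
# Toy one-step-delayed gradient compensation in the spirit of LGC; the
# objective and straggler pattern are assumptions for illustration.
import numpy as np

np.random.seed(0)
dim, workers, lr = 5, 4, 0.1
targets = [np.random.randn(dim) for _ in range(workers)]  # non-IID optima
w = np.zeros(dim)
stale = [np.zeros(dim) for _ in range(workers)]           # last seen gradients

def grad(k, w):
    return w - targets[k]            # gradient of 0.5 * ||w - target_k||^2

for it in range(200):
    stragglers = {it % workers}      # one worker is slow each iteration
    g = np.zeros(dim)
    for k in range(workers):
        if k in stragglers:
            g += stale[k]            # compensate with one-step delayed gradient
        else:
            stale[k] = grad(k, w)
            g += stale[k]
    w -= lr * g / workers

# The global optimum of the averaged quadratics is the mean of the targets.
print("distance to global optimum:", np.linalg.norm(w - np.mean(targets, axis=0)))
```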

Exploiting Simultaneous Communications to Accelerate Data Parallel Distributed Deep Learning

Shaohuai Shi (The Hong Kong University of Science and Technology, Hong Kong); Xiaowen Chu (Hong Kong Baptist University, Hong Kong); Bo Li (Hong Kong University of Science and Technology, Hong Kong)

1
Synchronous stochastic gradient descent (S-SGD) with data parallelism is widely used for training deep learning (DL) models in distributed systems. A pipelined schedule of the computing and communication tasks of a DL training job is an effective scheme to hide some communication costs. In such pipelined S-SGD, tensor fusion (i.e., merging some consecutive layers' gradients for a single communication) is a key ingredient in improving communication efficiency. However, existing tensor fusion techniques schedule the communication tasks sequentially, overlooking the fact that they are mutually independent. In this paper, we expand the design space of scheduling by exploiting simultaneous All-Reduce communications. Through theoretical analysis and experiments, we show that simultaneous All-Reduce communications can effectively improve the communication efficiency of small tensors. We formulate an optimization problem of minimizing the training iteration time, in which both tensor fusion and simultaneous communications are allowed. We develop an efficient optimal scheduling solution and implement the distributed training algorithm ASC-WFBP with Horovod and PyTorch. We conduct real-world experiments on an 8-node GPU cluster of 32 GPUs with 10Gbps Ethernet. Experimental results on four modern DNNs show that ASC-WFBP can achieve about 1.09×-2.48× speedup over the baseline without tensor fusion, and 1.15×-1.35× speedup over the state-of-the-art tensor fusion solution.

Low Sample and Communication Complexities in Decentralized Learning: A Triple Hybrid Approach

Xin Zhang (Iowa State University, USA); Jia Liu (The Ohio State University, USA); Zhengyuan Zhu (Iowa State University, USA); Elizabeth Serena Bentley (AFRL, USA)

3
Network-consensus-based decentralized learning optimization algorithms have attracted a significant amount of attention in recent years due to their rapidly growing applications. However, most of the existing decentralized learning algorithms could not achieve low sample and communication complexities simultaneously, two important metrics in evaluating the trade-off between computation and communication costs of decentralized learning. To overcome these limitations, in this paper, we propose a triple hybrid decentralized stochastic gradient descent (TH-DSGD) algorithm for efficiently solving non-convex network-consensus optimization problems for decentralized learning. We show that to reach an ɛ^2-stationary solution, the total sample complexity of TH-DSGD is O(ɛ^{-3}) and the communication complexity is O(ɛ^{-3}), both of which are independent of dataset sizes and significantly improve the sample and communication complexities of the existing works. We conduct extensive experiments with a variety of learning models to verify our theoretical findings. We also show that our TH-DSGD algorithm is stable as the network topology gets sparse and enjoys better convergence in the large-system regime.

DC2: Delay-aware Compression Control for Distributed Machine Learning

Ahmed M. Abdelmoniem and Marco Canini (KAUST, Saudi Arabia)

1
Distributed training performs data-parallel training of DNN models, which is a necessity for increasingly complex models and large datasets. Recent works identify communication as a major bottleneck in distributed training and seek opportunities to speed up training in systems supporting distributed ML workloads. To reduce communication, compression techniques have been proposed to speed up the communication phase. However, compression comes at the cost of reduced model accuracy, especially when it is applied arbitrarily. Instead, we advocate a more controlled use of compression and propose DC2, a delay-aware compression control mechanism. DC2 couples compression control with network delays, applying compression adaptively. DC2 not only compensates for network variations but can also strike a better trade-off between training speed and accuracy. DC2 is implemented as a drop-in module for the communication library used by the ML toolkit and can operate in a variety of network settings. We empirically evaluate DC2 in network environments exhibiting low and high delay variations. Our evaluation on popular CNN models and datasets shows that DC2 achieves training speed-ups of up to 41× and 5.3× over baselines with no compression and uniform compression, respectively.
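
A hypothetical sketch of such a control loop, assuming top-k gradient sparsification and a linear delay model (both our stand-ins, not DC2's actual mechanism): the compression ratio is tightened when the measured delay overshoots the budget and relaxed otherwise.

```python
# Delay-aware compression control sketch: adapt the top-k ratio to keep the
# measured communication delay near a budget. All constants are illustrative.
import numpy as np

np.random.seed(1)
ratio, target_delay = 0.5, 80.0      # initial top-k ratio, delay budget (ms)

def topk_sparsify(grad, ratio):
    k = max(1, int(ratio * grad.size))
    idx = np.argpartition(np.abs(grad), -k)[-k:]   # indices of k largest |g|
    sparse = np.zeros_like(grad)
    sparse[idx] = grad[idx]
    return sparse

for step in range(10):
    grad = np.random.randn(10000)
    sent = topk_sparsify(grad, ratio)
    nnz = np.count_nonzero(sent)
    # Delay grows with the volume actually sent, plus random network jitter.
    delay = 0.02 * nnz + np.random.uniform(20, 60)
    error = delay / target_delay - 1.0              # >0: over budget
    ratio = float(np.clip(ratio * (1.0 - 0.5 * error), 0.01, 1.0))
    print(f"step {step}: delay {delay:6.1f} ms -> next ratio {ratio:.3f}")
```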

Session Chair

Zhichao Cao (Michigan State University)

Session F-8

LoRa

Conference
12:00 PM — 1:30 PM EDT
Local
May 13 Thu, 9:00 AM — 10:30 AM PDT

Modeling Communication Reliability in LoRa Networks with Device-level Accuracy

Verónica Toro-Betancur and Gopika Premsankar (Aalto University, Finland); Mariusz Slabicki (Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, Poland); Mario Di Francesco (Aalto University, Finland)

3
Long Range (LoRa) is a low-power wireless communication technology for long-range connectivity, extensively used in the Internet of Things. Several works in the literature have analytically characterized the performance of LoRa networks, with particular focus on scalability and reliability. However, most of the related models are limited, as they cannot account for factors that occur in practice, or make strong assumptions on how devices are deployed in the network. This article proposes an analytical model that describes the delivery ratio in a LoRa network with device-level granularity. Specifically, it considers the impact of several key factors that affect real deployments, including multiple gateways and channel variation. Therefore, the proposed model can effectively evaluate the delivery ratio in realistic network topologies, without any restrictions on device deployment or configuration. It also accurately characterizes the delivery ratio of each device in a network, as demonstrated by extensive simulations in a wide variety of conditions, including diverse networks in terms of node deployment and link-level parameter settings. The proposed model provides a level of detail that is not available in the state of the art, and it matches the simulation results within an error of a few percentage points.

Jamming of LoRa PHY and Countermeasure

Ningning Hou, Xianjin Xia and Yuanqing Zheng (The Hong Kong Polytechnic University, Hong Kong)

1
LoRaWAN forms a one-hop star topology where LoRa nodes send data via one-hop uplink transmission to a LoRa gateway. If the LoRa gateway is jammed by attackers, it may not be able to receive any data from any node in the network. Our empirical study shows that although the LoRa physical layer (PHY) is robust and resilient by design, it is still vulnerable to synchronized jamming chirps. Potential protection solutions (e.g., collision recovery, parallel decoding) may fail to extract LoRa packets if an attacker transmits synchronized jamming chirps at high power. To protect the LoRa PHY from such attacks, we propose a new protection method that can separate LoRa chirps from jamming chirps by leveraging their difference in received signal strength in the power domain. We note that this protection solution is orthogonal to existing solutions, which leverage chirp misalignment in the time domain or frequency disparity in the frequency domain. We conduct experiments with COTS LoRa nodes and software-defined radios. The results show that synchronized jamming chirps at high power can jam all previous solutions, while our protection solution can effectively protect LoRa gateways from the jamming attacks.

Radio Frequency Fingerprint Identification for LoRa Using Spectrogram and CNN

Guanxiong Shen, Junqing Zhang and Alan Marshall (University of Liverpool, United Kingdom (Great Britain)); Linning Peng (Southeast University, China); Xianbin Wang (Western University, Canada)

0
Radio frequency fingerprint identification (RFFI) is an emerging device authentication technique that relies on the intrinsic hardware characteristics of wireless devices. We designed an RFFI scheme for Long Range (LoRa) systems based on spectrograms and a convolutional neural network (CNN). Specifically, we used spectrograms to represent the fine-grained time-frequency characteristics of LoRa signals. In addition, we revealed that the instantaneous carrier frequency offset (CFO) is drifting, which results in misclassification and significantly compromises system stability; we demonstrated that CFO compensation is an effective mitigation. Finally, we designed a hybrid classifier that can adjust CNN outputs with the estimated CFO. The mean value of the CFO remains relatively stable, hence it can be used to rule out CNN predictions whose estimated CFO falls out of range. We performed experiments in real wireless environments using 20 LoRa devices under test (DUTs) and a Universal Software Radio Peripheral (USRP) N210 receiver. Compared with IQ-based and FFT-based RFFI schemes, our spectrogram-based scheme achieves the best classification accuracy, i.e., 97.61% for the 20 LoRa DUTs.

Pyramid: Real-Time LoRa Collision Decoding with Peak Tracking

Zhenqiang Xu, Pengjin Xie and Jiliang Wang (Tsinghua University, China)

1
LoRa, as a representative Low-Power Wide Area Network (LPWAN) technology, shows great potential in providing low-power and long-range wireless communication. Real LoRa deployments, however, suffer from severe collisions. Existing collision decoding methods do not work well for low-SNR LoRa signals, and most process collisions offline, so they cannot support real-time collision decoding in practice. To address these problems, we propose Pyramid, a real-time LoRa collision decoding approach. To the best of our knowledge, this is the first real-time multi-packet LoRa collision decoder for low SNR. Pyramid exploits the subtle packet offset to separate packets in a collision. The core of Pyramid is to combine signals across multiple windows, transferring the variation of chirp length across windows into robust features in the frequency domain that are resistant to noise. We address practical challenges including accurate peak recovery and feature extraction from low-SNR signals of collided packets. We theoretically prove that Pyramid incurs a very small SNR loss (< 0.56 dB) relative to original LoRa transmissions. We implement Pyramid using USRP N210 and evaluate its performance in a 20-node network. Evaluation results show that Pyramid achieves real-time collision decoding and improves throughput by 2.11×.

Session Chair

Janise McNair (University of Florida)

Session G-8

Routing

Conference
12:00 PM — 1:30 PM EDT
Local
May 13 Thu, 9:00 AM — 10:30 AM PDT

Grafting Arborescences for Extra Resilience of Fast Rerouting Schemes

Klaus-Tycho Foerster (University of Vienna, Austria); Andrzej Kamisiński (AGH University of Science and Technology, Poland); Yvonne-Anne Pignolet (DFINITY, Switzerland); Stefan Schmid (University of Vienna, Austria); Gilles Tredan (LAAS-CNRS, France)

0
To provide a high availability and to be able to quickly react to link failures, most communication networks feature fast rerouting (FRR) mechanisms in the data plane. However, configuring these mechanisms to provide a high resilience against multiple failures is algorithmically challenging, as rerouting rules can only depend on local failure information and need to be pre-defined. This paper is motivated by the observation that the common approach to design fast rerouting algorithms, based on spanning trees and covering arborescences, comes at a cost of reduced resilience as it does not fully exploit the available links in heterogeneous topologies. We present several novel fast rerouting algorithms which are not limited by spanning trees, but rather extend and combine ("graft") multiple spanning arborescences to improve resilience. We compare our algorithms analytically and empirically, and show that they can significantly improve not only the resilience, but also accelerate the preprocessing to generate the local fast failover rules.

A Fast-Convergence Routing of the Hot-Potato

Jean-Romain Luttringer and Quentin Bramas (University of Strasbourg, France); Cristel Pelsser (University of Strasbourg); Pascal Mérindol (Université de Strasbourg, France)

0
Interactions between intra- and inter-domain routing protocols have received little attention despite playing an important role in forwarding transit traffic. More precisely, by default, IGP distances are taken into account by BGP to select the closest exit gateway for transit traffic (hot-potato routing). Upon an IGP update, the best gateway may change and should be updated through the (full) re-convergence of BGP, causing superfluous BGP processing and updates in many cases. We propose OPTIC (Optimal Protection Technique for Inter-intra domain Convergence), an efficient way to assemble both protocols without losing the hot-potato property. OPTIC pre-computes sets of gateways (BGP next-hops) shared by groups of prefixes. Such sets are guaranteed to contain the post-convergence gateway after any single IGP event for the grouped prefixes. The new optimal exits can be found through a single walk-through of each set, allowing transit traffic to benefit from optimal BGP routes almost as soon as the IGP converges. Compared to vanilla BGP, OPTIC's structures allow it to consider a reduced number of entries: this number can be reduced by 99% for stub networks. The update of OPTIC's structures, which is not required as long as border routers remain at least bi-connected, scales linearly in time with the number of groups.
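
A toy sketch of the gateway-set idea, with illustrative names and distances: once a protecting set is pre-computed per prefix group, an IGP event triggers only a linear scan of that set rather than a full BGP re-convergence.

```python
# Hot-potato re-selection over pre-computed gateway sets, in the spirit of
# OPTIC. Gateways, groups, and distances are illustrative assumptions.
igp_distance = {"gw1": 10, "gw2": 12, "gw3": 25}      # current IGP distances
prefix_groups = {
    "group-A": ["gw1", "gw2"],    # pre-computed protecting gateway set
    "group-B": ["gw2", "gw3"],
}

def best_exit(group):
    # Hot-potato selection: the closest gateway wins; one linear scan per group.
    return min(prefix_groups[group], key=lambda g: igp_distance[g])

print({g: best_exit(g) for g in prefix_groups})
igp_distance["gw1"] = 30                              # IGP event: cost change
print({g: best_exit(g) for g in prefix_groups})       # updated without BGP churn
```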

Threshold-based rerouting and replication for resolving job-server affinity relations

Youri Raaijmakers and Onno Boxma (Eindhoven University of Technology, The Netherlands); Sem Borst (Eindhoven University of Technology & Nokia Bell Labs, USA)

0
We consider a system with several job types and two parallel server pools. Within each pool the servers are homogeneous, but across pools they may not be, in the sense that the service speed of a job may depend on its type as well as the server pool. Immediately upon arrival, jobs are assigned to a server pool, possibly based on (partial) knowledge of their type. In case such knowledge is not available upon arrival, it can be obtained while the job is in service; as the service progresses, the likelihood that the service speed of this job type is low increases, creating an incentive to execute the job on different, possibly faster, server(s). Two policies are considered: reroute the job to the other server pool, or replicate it there. We determine the effective load per server under both the rerouting and replication policies, for completely unknown as well as partly known job types. We also examine the impact of these policies on the stability bound, defined as the maximum arrival rate of jobs for which the effective load per server is smaller than one. We demonstrate that uncertainty in job types may significantly reduce the stability bound and that, for (highly) unbalanced service speeds, full replication achieves the largest stability bound. Finally, we discuss how threshold-based policies can help improve the expected latency for completely or partly unknown job types.
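
A small Monte-Carlo sketch of the two policies under stated assumptions (exponential service times, a hidden binary job type, and complementary speeds across the two pools); it illustrates how the threshold trades off learning the job's type against wasted work.

```python
# Threshold-based rerouting vs. replication, simulated with illustrative
# parameters; the paper's model and analysis are more general.
import random

random.seed(7)

def service_time(fast):                 # speed depends on the (hidden) type
    return random.expovariate(1.0 if fast else 0.2)

def job_latency(policy, threshold=2.0):
    fast_in_pool1 = random.random() < 0.5          # type unknown at arrival
    s1 = service_time(fast_in_pool1)
    if s1 <= threshold:
        return s1                                  # finished before acting
    s2 = service_time(not fast_in_pool1)           # the other pool suits it
    if policy == "reroute":
        return threshold + s2                      # abandon the pool-1 attempt
    else:                                          # "replicate": keep both
        return min(s1, threshold + s2)

for policy in ("reroute", "replicate"):
    mean = sum(job_latency(policy) for _ in range(100000)) / 100000
    print(f"{policy:9s}: mean latency {mean:.3f}")
```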

Session Chair

Kaushik Chowdhury (Northeastern University)

Session Break-2-May13

Virtual Lunch Break

Conference
1:30 PM — 2:30 PM EDT
Local
May 13 Thu, 10:30 AM — 11:30 AM PDT

Session A-9

Attack and Anomaly Detection

Conference
2:30 PM — 4:00 PM EDT
Local
May 13 Thu, 11:30 AM — 1:00 PM PDT

MANDA: On Adversarial Example Detection for Network Intrusion Detection System

Ning Wang (Virginia Tech, USA); Yimin Chen (Virginia Polytechnic Institute and State University, USA); Yang Hu (Virgina Tech, USA); Wenjing Lou and Thomas Hou (Virginia Tech, USA)

0
With the rapid advancement of machine learning (ML), ML-based Intrusion Detection Systems (IDSs) are widely deployed to protect networks from various attacks. Yet one of the biggest challenges is that ML-based IDSs suffer from adversarial example (AE) attacks. By applying small perturbations (e.g., slightly increasing packet inter-arrival time) to the intrusion traffic, an AE attack can flip the prediction of a well-trained IDS. We address this challenge by proposing MANDA, a MANifold and Decision boundary-based AE detection system. By analyzing AE attacks, we notice that 1) an AE tends to be close to its original manifold (i.e., the cluster of samples in its original class) regardless of which class it is misclassified into; and 2) AEs tend to be close to the decision boundary so as to minimize the perturbation scale. Based on these two observations, we design MANDA for accurate AE detection by exploiting the inconsistency between manifold evaluation and IDS model inference and by evaluating model uncertainty under small perturbations. We evaluate MANDA on NSL-KDD under three state-of-the-art AE attacks. Our experimental results show that MANDA achieves as high as a 98.41% true-positive rate at a 5% false-positive rate and can be applied to other problem spaces such as image recognition.
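
The two observations translate naturally into two test signals. The sketch below is a hedged stand-in on a toy 2-D dataset, assuming an SVM for the IDS model, a kNN proxy for the class manifolds, and arbitrary thresholds; none of these are MANDA's actual components.

```python
# Toy version of the two MANDA-style signals: (i) manifold inconsistency and
# (ii) prediction instability under small random perturbations.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(4, 1, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
ids_model = SVC().fit(X, y)                   # stands in for the trained IDS
manifold = KNeighborsClassifier(5).fit(X, y)  # proxy for the class manifolds

def is_adversarial(x, n_probe=50, eps=0.3):
    pred = ids_model.predict([x])[0]
    inconsistent = manifold.predict([x])[0] != pred          # signal (i)
    probes = x + rng.normal(0, eps, (n_probe, 2))
    flip_rate = np.mean(ids_model.predict(probes) != pred)   # signal (ii)
    return inconsistent or flip_rate > 0.3

benign = np.array([4.1, 3.9])                 # deep inside class 1
near_boundary = np.array([2.0, 2.0])          # mimics an AE near the boundary
print("benign flagged:", is_adversarial(benign))
print("boundary point flagged:", is_adversarial(near_boundary))
```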

Detecting Localized Adversarial Examples: A Generic Approach using Critical Region Analysis

Fengting Li, Xuankai Liu, XiaoLi Zhang and Qi Li (Tsinghua University, China); Kun Sun (George Mason University, USA); Kang Li (University of Georgia, USA)

1
Deep neural networks (DNNs) have been applied in a wide range of applications, e.g., face recognition and image classification; however, they are vulnerable to adversarial examples. By adding a small amount of imperceptible perturbations, an attacker can easily manipulate the outputs of a DNN. Particularly, the localized adversarial examples only perturb a small and contiguous region of the target object, so that they are robust and effective in both digital and physical worlds. Although the localized adversarial examples have more severe real-world impacts than traditional pixel attacks, they have not been well addressed in the literature. In this paper, we propose a generic defense system called TaintRadar to accurately detect localized adversarial examples via analyzing critical regions that have been manipulated by attackers. The main idea is that when removing critical regions from input images, the ranking changes of adversarial labels will be larger than those of benign labels. Compared with existing defense solutions, TaintRadar can effectively capture sophisticated localized partial attacks, e.g., the eye-glasses attack, while not requiring additional training or fine-tuning of the original model's structure. Comprehensive experiments have been conducted in both digital and physical worlds to verify the effectiveness and robustness of our defense.
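
A minimal sketch of the ranking-change test described above, assuming a tiny stub classifier whose scores key on region brightness; only the mechanics (occlude a candidate critical region, measure how far the top label falls in the new ranking) mirror TaintRadar, and the threshold is an illustrative assumption.

```python
# Occlusion-based ranking-change test on a stub "model".
import numpy as np

def model(image):
    # Stub classifier: 3 class scores driven by region brightness, so that
    # occluding the bright patch visibly reshuffles the ranking.
    s0 = image[:8, :8].mean()          # class 0 keys on the top-left patch
    s1 = image[8:, 8:].mean()
    s2 = image.mean()
    e = np.exp([s0, s1, s2])
    return e / e.sum()

def rank_drop_after_occlusion(image, region):
    top = int(np.argmax(model(image)))
    occluded = image.copy()
    occluded[region] = 0.0                     # gray out the critical region
    new_scores = model(occluded)
    new_rank = int(np.where(np.argsort(-new_scores) == top)[0][0])
    return new_rank                            # 0 = unchanged; larger = suspicious

img = np.random.rand(16, 16) * 0.1
img[:8, :8] += 5.0                             # a localized, high-impact patch
drop = rank_drop_after_occlusion(img, (slice(0, 8), slice(0, 8)))
print("suspicious (localized attack likely):", drop >= 1)
```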

Towards Cross-Modal Forgery Detection and Localization on Live Surveillance Videos

Yong Huang, Xiang Li, Wei Wang and Tao Jiang (Huazhong University of Science and Technology, China); Qian Zhang (Hong Kong University of Science and Technology, Hong Kong)

0
Cybersecurity breaches render surveillance systems vulnerable to video forgery attacks, in which authentic live video streams are tampered with to conceal illegal human activities in front of surveillance cameras. Traditional video forensics approaches can detect and localize forgery traces in each video frame using computationally expensive spatial-temporal analysis, but fall short in real-time verification of live video feeds. Recent work correlates time-series camera and wireless signals to recognize replayed surveillance videos using event-level timing information, but it cannot realize fine-grained forgery detection and localization on each frame. To fill this gap, this paper proposes Secure-Pose, a novel cross-modal forgery detection and localization system for live surveillance videos using WiFi signals near the camera spot. We observe that coexisting camera and WiFi signals convey common human semantic information and that forgery attacks on video frames decouple this correspondence. Secure-Pose extracts effective human pose features from synchronized multi-modal signals and detects and localizes forgery traces under both inter-frame and intra-frame attacks in each frame. We implement Secure-Pose using a commercial camera and two Intel 5300 NICs and evaluate it in real-world environments. Secure-Pose achieves a high detection accuracy of 95.1% and can effectively localize tampered objects under different forgery attacks.

CTF: Anomaly Detection in High-Dimensional Time Series with Coarse-to-Fine Model Transfer

Ming Sun and Ya Su (Tsinghua University, China); Shenglin Zhang, Yuanpu Cao and Yuqing Liu (Nankai University, China); Dan Pei and Wenfei Wu (Tsinghua University, China); Yongsu Zhang, Xiaozhou Liu and Junliang Tang (ByteDance, China)

1
Anomaly detection is indispensable in modern IT infrastructure management. However, the dimension explosion of monitoring data (large-scale machines, many key performance indicators, and frequent monitoring queries) causes a scalability issue for existing algorithms. We propose CTF, a coarse-to-fine model-transfer-based framework, to achieve scalable and accurate data-center-scale anomaly detection. CTF pre-trains a coarse-grained model, uses the model to extract and compress per-machine features into a distribution, clusters machines according to the distribution, and conducts model transfer to fine-tune per-cluster models for high accuracy. The framework takes advantage of clustering on the per-machine latent representation distribution, reusing the pre-trained model, and partial-layer model fine-tuning to boost overall training efficiency. We also justify design choices such as the clustering algorithm and the distance algorithm to achieve the best accuracy. We prototype CTF and experiment on production data to show its scalability and accuracy. We also release a labeling tool for multivariate time series and a labeled dataset to the research community.
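
A compact sketch of the pipeline under deliberately strong simplifications: the "coarse model" is reduced to a (mean, std) summary per machine, clustering is plain KMeans, and "fine-tuning" is a per-cluster threshold. The real CTF uses learned latent distributions and partial-layer fine-tuning; everything below is an illustrative assumption.

```python
# Coarse-to-fine sketch: summarize, cluster, then specialize per cluster.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# 60 machines x 500 time steps of one KPI; two latent behaviour families.
machines = np.vstack([rng.normal(0, 1, (30, 500)),
                      rng.normal(3, 2, (30, 500))])

def coarse_features(series):
    # Stand-in for the pre-trained coarse model's latent distribution.
    return np.array([series.mean(), series.std()])

feats = np.array([coarse_features(m) for m in machines])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(feats)

for c in np.unique(labels):
    cluster = machines[labels == c]
    # "Fine-tuning": fit a per-cluster anomaly threshold (mean + 3 sigma).
    threshold = cluster.mean() + 3 * cluster.std()
    print(f"cluster {c}: {len(cluster)} machines, threshold {threshold:.2f}")
```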

Session Chair

Tony Luo (Missouri Univ. Science and Technology)

Session B-9

Packets and Flows

Conference
2:30 PM — 4:00 PM EDT
Local
May 13 Thu, 11:30 AM — 1:00 PM PDT

ECLAT: An ECN Marking System for Latency Guarantee in Cellular Networks

Junseon Kim (Ulsan National Institute of Science and Technology (UNIST), Korea (South)); Youngbin Im (Ulsan National Institute of Science and Technology, Korea (South)); Kyunghan Lee (Seoul National University, Korea (South))

0
As the importance of latency performance increases, a number of multi-bit feedback-based congestion control mechanisms have been proposed for explicit latency control in cellular networks. However, due to their reactive nature and limited access to the network queue, they can reduce latency but cannot guarantee it. Also, because they require end-host modifications, it is hard to provide their latency benefit to all connected devices. To address this, we propose a novel network-assisted congestion control, ECLAT, which can always bound the queuing delay within a delay budget through ECN-based single-bit feedback while maintaining high link utilization for any device. To do so, ECLAT 1) calculates a target operating point for each flow, namely the maximum allowable cwnd that meets the delay budget under time-varying cellular networks, and 2) determines its single-bit feedback policy to limit the cwnd within the target operating point. Extensive experiments on our testbed demonstrate that ECLAT is able to bound the queuing delays of multiple flows within their delay budget and achieve high utilization even in dynamic cellular network environments.
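
A toy sketch of such a marking rule, assuming the target operating point is simply a bandwidth-delay product for the budgeted delay, split evenly across flows; this is our simplification of the paper's per-flow computation, with illustrative constants.

```python
# ECN marking against a delay-budget-derived cwnd target, in the spirit of
# the description above. All parameters are illustrative assumptions.
MSS = 1500                      # bytes

def target_cwnd(link_rate_mbps, base_rtt_ms, delay_budget_ms, n_flows):
    # Bandwidth-delay product for the budgeted delay, split across flows.
    bdp_bytes = link_rate_mbps * 1e6 / 8 * (base_rtt_ms + delay_budget_ms) / 1e3
    return max(MSS, bdp_bytes / n_flows)

def ecn_mark(inflight_bytes, link_rate_mbps, base_rtt_ms=20,
             delay_budget_ms=10, n_flows=4):
    # Single-bit feedback: set CE once the flow exceeds its operating point.
    return inflight_bytes > target_cwnd(link_rate_mbps, base_rtt_ms,
                                        delay_budget_ms, n_flows)

for rate in (100, 40, 10):      # time-varying cellular link rate (Mbps)
    print(f"rate {rate:3d} Mbps: mark 50KB in flight ->",
          ecn_mark(50_000, rate))
```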

PCL: Packet Classification with Limited knowledge

Vitalii Demianiuk (Ariel University, Israel); Chen Hajaj (Ariel University & Data Science and Artificial Intelligence Research Center, Israel); Kirill Kogan (Ariel University, Israel)

0
We introduce a novel representation of packet classifiers that can operate on partially available input data that varies dynamically. Given a packet classifier, the availability of fields (or the complexity of computing them), and the free target-specific resources, the proposed infrastructure computes a classifier representation satisfying performance and robustness requirements. We show that it is feasible to reconstruct a classification result in this noisy environment, allowing for improved performance and additional robustness of the network infrastructure. Our results are supported by extensive evaluations in various settings where only a partial input is available.

Towards the Fairness of Traffic Policer

Danfeng Shan and Peng Zhang (Xi'an Jiaotong University, China); Wanchun Jiang (Central South University, China); Hao Li (Xi'an Jiaotong University, China); Fengyuan Ren (Tsinghua University, China)

0
Traffic policing is widely used by ISPs to limit their customers' traffic rates. It has long been believed that a well-tuned traffic policer offers satisfactory performance for TCP. However, we find this belief breaks down with the emergence of new congestion control (CC) algorithms like BBR: flows using these new CC algorithms can easily occupy the majority of the bandwidth, starving traditional TCP flows. We confirm this problem with experiments and reveal its root cause as follows. Without a buffer in the traffic policer, congestion only causes packet losses, while new CC algorithms are loss-resilient, i.e., they adjust the sending rate based on other network feedback such as delay. Thus, when policed, they will not reduce their sending rate until an unacceptable loss ratio for TCP is reached, resulting in low throughput for TCP. Simply adding a buffer to the traffic policer improves fairness but incurs high latency. To address this, we propose FairPolicer, which can achieve fair bandwidth allocation without sacrificing latency. FairPolicer regards tokens as the basic unit of bandwidth and fairly allocates them to active flows in a round-robin manner. Testbed experiments show that FairPolicer significantly improves fairness and achieves much lower latency than other kinds of rate-limiters.
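
A minimal sketch of round-robin token allocation with an illustrative budget and demands: a flow demanding far more than its fair share (BBR-like) ends up capped at the same grant as the TCP flows.

```python
# Round-robin token dealing in the spirit of FairPolicer; budget and demands
# are illustrative assumptions.
from collections import deque

def allocate_tokens(budget, demands):
    """Deal `budget` tokens round-robin across flows with outstanding demand."""
    grant = {f: 0 for f in demands}
    queue = deque(f for f, d in demands.items() if d > 0)
    while budget > 0 and queue:
        f = queue.popleft()
        grant[f] += 1
        budget -= 1
        if grant[f] < demands[f]:
            queue.append(f)          # still has demand: stay in rotation
    return grant

# A BBR-like flow asking for far more than its fair share vs. two TCP flows.
print(allocate_tokens(budget=90, demands={"bbr": 300, "tcp1": 40, "tcp2": 40}))
# -> each flow ends up near the 30-token fair share
```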

Jellyfish: Locality-sensitive Subflow Sketching

Yongquan Fu (National University of Defense Technology, China); Lun An (Beijing University of Posts and Telecommunications, China); Siqi Shen (Xiamen University, China); Kai Chen (Hong Kong University of Science and Technology, China); Pere Barlet-Ros (Universitat Politècnica de Catalunya, Spain)

0
To cope with increasing network rates and massive traffic volumes, sketch-based methods have been extensively studied to trade accuracy for memory scalability and storage cost. However, sketches are sensitive to hash collisions due to skewed keys in real-world environments, and they need complicated performance control for line-rate packet streams. We present Jellyfish, a locality-sensitive sketching framework that addresses these issues. Jellyfish goes beyond network-flow-based sketching towards fragments of network flows called subflows. First, Jellyfish splits consecutive packets from each network flow into subflow records, which not only reduces rate contention but also provides intermediate subflow representations in the form of truncated counters. Next, Jellyfish maps similar subflow records to the same bucket array and merges those from the same network flow to reconstruct the network-flow-level counters. Real-world trace-driven experiments show that Jellyfish reduces the average estimation error by up to six orders of magnitude for per-flow queries, by six orders of magnitude for entropy queries, and by up to ten times for heavy-hitter queries.

Session Chair

Shuochao Yao (George Mason University, USA)

Session C-9

Social Networks and Applications

Conference
2:30 PM — 4:00 PM EDT
Local
May 13 Thu, 11:30 AM — 1:00 PM PDT

Medley: Predicting Social Trust in Time-Varying Online Social Networks

Wanyu Lin and Baochun Li (University of Toronto, Canada)

0
Social media, such as Reddit, has become a norm in our daily lives, where users routinely express their attitude using upvotes (likes) or downvotes. These social interactions may encourage users to interact frequently and form strong ties of trust between one another. It is therefore important to predict social trust from these interactions, as they facilitate routine features in social media, such as online recommendation and advertising.

Conventional methods for predicting social trust often accept static graphs as input, oblivious to the fact that social interactions are time-dependent. In this work, we propose Medley to explicitly model users' time-varying latent factors and to predict social trust that varies over time. We propose to use functional time encoding to capture continuous-time features and employ attention mechanisms to assign higher importance weights to social interactions that are more recent. By incorporating topological structures that evolve over time, our framework can infer pairwise social trust based on past interactions. Our experiments on benchmark datasets show that Medley utilizes time-varying interactions effectively for predicting social trust and achieves an accuracy up to 26% higher than its alternatives.

Setting the Record Straighter on Shadow Banning

Erwan Le Merrer (Inria, France); Benoit Morgan (IRIT-ENSEEIHT, University of Toulouse, France); Gilles Tredan (LAAS-CNRS, France)

1
Shadow banning consists, for an online social network, in limiting the visibility of some of its users without their being aware of it. Twitter declares that it does not use such a practice, sometimes invoking "bugs" to justify restrictions on some users. This paper is the first to address the plausibility of shadow banning on a major online platform, by adopting both a statistical and a graph topological approach. We first conduct an extensive data collection and analysis campaign, gathering occurrences of visibility limitations on user profiles (we crawl more than 2.5 million of them). In such a black-box observation setup, we highlight the salient user profile features that may explain a banning practice (using machine learning predictors). We then pose two hypotheses for the phenomenon: i) limitations are bugs, as claimed by Twitter, and ii) shadow banning propagates as an epidemic on user-interaction ego-graphs. We show that hypothesis i) is statistically unlikely with regard to the data we collected. We then show some interesting correlations with hypothesis ii), suggesting that the interaction topology is a good indicator of the presence of groups of shadow-banned users on the service.

MIERank: Co-ranking Individuals and Communities with Multiple Interactions in Evolving Networks

Shan Qu (Shanghai Jiaotong University, China); Luoyi Fu (Shanghai Jiao Tong University, China); Xinbing Wang (Shanghai Jiaotong University, China)

0
Ranking has significant applications in real life. It aims to evaluate the importance (or popularity) of two categories of objects: individuals and communities. Numerous efforts have been dedicated to these two types of rankings separately. In this paper, we instead explore, for the first time, the co-ranking of individuals and communities. Our insight is that co-ranking may enhance the mutual evaluation of both sides. To this end, we first establish an Evolving Coupled Graph that contains a series of smoothly weighted snapshots, each of which characterizes and couples the intricate interactions of both individuals and communities up to a certain evolution time in a single graph. Then we propose an algorithm, called MIERank, to implement the co-ranking of individuals and communities in the proposed evolving graph. The core idea of MIERank lies in a novel unbiased random walk which, when sampling the interplay among nodes over different generation times, incorporates preference knowledge of the ranking by utilizing nodes' future actions. MIERank returns the co-ranking of both individuals and communities by iteratively alternating between their corresponding stationary probabilities of the unbiased random walk in a mutually-reinforcing manner. We prove the efficiency of MIERank in terms of its convergence, optimality and extensibility. Our experiments on a big scholarly dataset of 606,862 papers and 1,215 fields further validate the superiority of MIERank, with fast convergence and an up to 26% ranking accuracy gain compared with its separate counterparts.

ProHiCo: A Probabilistic Framework to Hide Communities in Large Networks

Xuecheng Liu and Luoyi Fu (Shanghai Jiao Tong University, China); Xinbing Wang (Shanghai Jiaotong University, China); John Hopcroft (Cornell University, USA)

0
While community detection has been one of the cornerstones of network analysis and data science, its opposite, community obfuscation, has received little attention in recent years. With the increasing awareness of data security and privacy protection, the need to understand the impact of such attacks on traditional community detection algorithms emerges. To this end, we investigate the community obfuscation problem, which aims to hide a target set of communities from being detected by perturbing the network structure. We identify and analyze the Matthew effect incurred by the classical quality-function-based methods, which essentially results in an imbalanced allocation of perturbation resources. To mitigate this effect, we propose a probabilistic framework named ProHiCo to hide communities. The key idea of ProHiCo is to first allocate the perturbation resources randomly and fairly and then choose the appropriate edges to perturb via likelihood minimization. The ProHiCo framework provides the additional freedom to choose the generative graph model with community structure. By incorporating the stochastic block model and its degree-corrected variant into the ProHiCo framework, we develop two scalable and effective algorithms called SBM and DCSBM. Via extensive experiments on 8 real-world networks and 5 community detection algorithms, we show that both SBM and DCSBM are about 30× faster than the prominent baselines in the literature when there are around 500 target communities, while their performance is comparable to that of the baselines.

Session Chair

Fabricio Murai (Universidade Federal de Minas Gerais, Brasil)

Session D-9

Localization

Conference
2:30 PM — 4:00 PM EDT
Local
May 13 Thu, 11:30 AM — 1:00 PM PDT

VideoLoc: Video-based Indoor Localization with Text Information

Shusheng Li and Wenbo He (McMaster University, Canada)

0
Indoor localization plays an important role in various scenarios such as navigation in shopping malls or hospitals. However, existing technology usually requires additional deployment, and the signal suffers from strong environmental interference in complex indoor environments. In this paper, we propose video-based indoor localization with text information ("VideoLoc"), which requires no additional equipment. Videos taken by phone carriers cover more critical information (e.g., logos in malls), which a single photo may fail to capture. To reduce redundant information in the video, we propose keyframe selection based on a deep learning model and a clustering algorithm: video frames are characterized by deep visual descriptors, and the clustering algorithm efficiently groups these descriptors into a set of non-overlapping snippets. We select keyframes from these snippets in terms of the cluster centroid that represents each snippet. Then, we propose text detection and recognition with transformation to make full use of stable and discriminative text information (e.g., logos or room numbers) in keyframes for localization. After that, we obtain the location of the phone carrier via a triangulation algorithm. The experimental results show that VideoLoc achieves high localization precision and is robust to dynamic environments.

The Effect of Ground Truth Accuracy on the Evaluation of Localization Systems

Chen Gu (Google, USA); Ahmed Shokry and Moustafa Youssef (Alexandria University, Egypt)

1
The ability to accurately evaluate the performance of location determination systems is crucial for many applications. Typically, the performance of such systems is obtained by comparing ground truth locations with estimated locations. However, these ground truth locations are usually obtained by clicking on a map or using other worldwide available technologies like GPS. This introduces ground truth errors that are due to the marking process, map distortions, or inherent GPS inaccuracy.

In this paper, we present a theoretical framework for analyzing the effect of ground truth errors on the evaluation of localization systems. Based on that, we design two algorithms for computing the real algorithmic error from the validation error and marking/map ground truth errors, respectively. We further establish bounds on different performance metrics.

Validation of our theoretical assumptions and analysis using real data collected in a typical environment shows the ability of our theoretical framework to correct the estimated error of a localization algorithm in the presence of ground truth errors. Specifically, our marking-error algorithm matches the real error CDF within 4%, and our map-error algorithm improves the estimate of the median/tail error by 150%/72% when the map is shifted by 6m.

Train Once, Locate Anytime for Anyone: Adversarial Learning based Wireless Localization

Danyang Li, Jingao Xu, Zheng Yang, Yumeng Lu and Qian Zhang (Tsinghua University, China); Xinglin Zhang (South China University of Technology, China)

0
Among numerous indoor localization systems, WiFi fingerprint-based localization has been one of the most attractive solutions, as it is free of extra infrastructure and specialized hardware. To push this approach toward wide deployment, three crucial goals are desirable: deployment ubiquity, high localization accuracy, and low maintenance cost. However, due to severe challenges of signal variation, device heterogeneity, and database degradation rooted in environmental dynamics, prior works usually trade off among them. In this paper, we propose iToLoc, a deep learning based localization system that achieves all three goals simultaneously. Once trained, iToLoc provides accurate localization for everyone using different devices and under diverse network conditions, and it automatically updates itself to maintain reliable performance over time. iToLoc is purely based on WiFi fingerprints and does not rely on specific infrastructure. The core components of iToLoc are a domain-adversarial neural network and a co-training based semi-supervised learning framework. Extensive experiments across 7 months with 8 different devices demonstrate that iToLoc achieves remarkable performance, with an accuracy of 1.92m and a > 95% localization success rate. Even 7 months after the original fingerprint database was established, the success rate remains > 90%, significantly outperforming previous works.

Failure Localization through Progressive Network Tomography

Viviana Arrigoni (Sapienza, University of Rome, Italy); Novella Bartolini (Sapienza University of Rome, Italy); Annalisa Massini (Sapienza Università di Roma, Italy); Federico Trombetti (Sapienza, University of Rome, Italy)

4
Boolean Network Tomography (BNT) localizes network failures by means of end-to-end monitoring paths. Nevertheless, it falls short of providing efficient failure identification in real scenarios, due to the large combinatorial size of the solution space, especially when multiple failures occur concurrently. We aim at maximizing the identification capability of a bounded number of monitoring probes. To tackle this problem, we propose a progressive approach to failure localization based on stochastic optimization, whose solution is the optimal sequence of monitoring paths to probe. We address the complexity of the problem by proposing a greedy strategy in two variants: one computes the exact posterior probabilities of node failures given the observations, whereas the other approximates these values through a novel failure centrality metric. We discuss the approximation guarantees of the proposed approaches.
Then, by means of numerical experiments conducted on real network topologies, we demonstrate the practical applicability of our approach. The performance evaluation shows the superiority of our algorithms with respect to state-of-the-art solutions based on classic Boolean Network Tomography, as well as approaches based on sequential group testing.
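
A hedged sketch of the progressive idea, under a strong simplifying assumption of a single failed node (real BNT handles multiple concurrent failures): maintain a posterior over which node has failed, and greedily probe the path whose boolean outcome minimizes the expected posterior entropy.

```python
# Greedy progressive probing on a toy single-failure instance; topology,
# path set, and the one-failure posterior are illustrative assumptions.
import numpy as np

n_nodes = 6
paths = [(0, 1, 2), (2, 3), (3, 4, 5), (0, 4), (1, 5)]   # candidate probes
true_failed = 3
posterior = np.full(n_nodes, 1.0 / n_nodes)

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def expected_entropy(path, post):
    on = np.isin(np.arange(post.size), path)
    p_bad = post[on].sum()                     # P(probe observes a failure)
    h = 0.0
    for mask, prob in ((on, p_bad), (~on, 1.0 - p_bad)):
        if prob > 1e-12:
            h += prob * entropy(post * mask / prob)
    return h

remaining = list(paths)
for _ in range(3):
    best = min(remaining, key=lambda p: expected_entropy(p, posterior))
    remaining.remove(best)
    bad = true_failed in best                      # boolean probe outcome
    on = np.isin(np.arange(n_nodes), best)
    posterior = posterior * (on if bad else ~on)   # Bayes update (0/1 likelihood)
    posterior /= posterior.sum()
    print(f"probe {best}: bad={bad}, posterior={np.round(posterior, 2)}")
```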

Session Chair

Song Fang (University of Oklahoma)

Session E-9

Sensing and Learning

Conference
2:30 PM — 4:00 PM EDT
Local
May 13 Thu, 11:30 AM — 1:00 PM PDT

DeepSense: Fast Wideband Spectrum Sensing Through Real-Time In-the-Loop Deep Learning

Daniel Uvaydov, Salvatore D'Oro, Francesco Restuccia and Tommaso Melodia (Northeastern University, USA)

0
Spectrum sharing will be a key technology to tackle spectrum scarcity in the sub-6 GHz bands. To fairly access the shared bandwidth, wireless users will necessarily need to quickly sense large portions of spectrum and opportunistically access unutilized bands. The key unaddressed challenges of spectrum sensing are that (i) it has to be performed with extremely low latency over large bandwidths to detect tiny spectrum holes and to meet strict real-time digital signal processing (DSP) constraints; and (ii) its underlying algorithms need to be extremely accurate, and flexible enough to work with different wireless bands and protocols to find application in real-world settings. To the best of our knowledge, the literature lacks spectrum sensing techniques able to accomplish both requirements. In this paper, we propose DeepSense, a software/hardware framework for real-time wideband spectrum sensing that relies on real-time deep learning tightly integrated into the transceiver's baseband processing logic to detect and exploit unutilized spectrum bands. DeepSense uses a convolutional neural network (CNN) implemented in the wireless platform's hardware fabric to analyze a small portion of the unprocessed baseband waveform and automatically extract the maximum amount of information with the least amount of I/Q samples. We extensively validate the accuracy, latency and generality of DeepSense with (i) a 400 GB dataset containing hundreds of thousands of WiFi transmissions collected "in the wild" under different Signal-to-Noise-Ratio (SNR) conditions and over different days; (ii) a dataset of transmissions collected using our own software-defined radio testbed; and (iii) a synthetic dataset of LTE transmissions under controlled SNR conditions. We also measure the real-time latency of the CNNs trained on the three datasets with an FPGA implementation, and compare our approach with a fixed energy-threshold mechanism. Results show that our learning-based approach can deliver a precision and recall of 98% and 97%, respectively, and a latency as low as 0.61ms. For reproducibility and benchmarking purposes, we pledge to share the code and the datasets used in this paper with the community.

Bayesian Online Learning for Energy-Aware Resource Orchestration in Virtualized RANs

Jose A. Ayala-Romero (Trinity College Dublin, Ireland); Andres Garcia-Saavedra (NEC Labs Europe, Germany); Xavier Costa-Perez (NEC Laboratories Europe, Germany); George Iosifidis (Delft University of Technology, The Netherlands)

0
Radio Access Network Virtualization (vRAN) will spearhead the quest towards supple radio stacks that adapt to heterogeneous infrastructure: from energy-constrained platforms deploying cells-on-wheels (e.g., drones) or battery-powered cells to green edge clouds. We perform an in-depth experimental analysis of the energy consumption of virtualized Base Stations (vBSs) and draw two conclusions: (i) characterizing performance and power consumption is intricate, as it depends on human behavior such as network load or user mobility; and (ii) there are many control policies, and some of them have non-linear and monotonic relations with power and throughput. Driven by our experimental insights, we argue that machine learning holds the key to vBS control. We formulate two problems and propose two algorithms: (i) BP-vRAN, which uses Bayesian online learning to balance performance and energy consumption, and (ii) SBP-vRAN, which augments our Bayesian optimization approach with safe controls that maximize performance while respecting hard power constraints. We show that our approaches are data-efficient and offer provable performance guarantees, which is paramount for carrier-grade vRANs. We demonstrate the convergence and flexibility of our approach and assess its performance using an experimental prototype.
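
A compact sketch of a safe Bayesian-optimization loop in the spirit of SBP-vRAN, assuming a 1-D control knob, synthetic performance/power oracles, RBF-kernel Gaussian processes, and a UCB rule restricted to configurations whose pessimistic power estimate respects the budget. All of these are illustrative assumptions, not the paper's actual formulation.

```python
# Safe GP-based online optimization of a single vBS control knob.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

def perf(x):    # unknown performance curve (noisy oracle)
    return np.sin(3 * x) + 0.05 * rng.standard_normal()

def power(x):   # unknown power curve; must stay under the budget
    return 5 * x + 0.05 * rng.standard_normal()

BUDGET = 3.5
knobs = np.linspace(0, 1, 50).reshape(-1, 1)
X, yp, yw = [[0.0]], [perf(0.0)], [power(0.0)]

for _ in range(15):
    # alpha absorbs oracle noise and keeps the GP fit numerically stable.
    gp_perf = GaussianProcessRegressor(kernel=RBF(0.2), alpha=1e-4).fit(X, yp)
    gp_power = GaussianProcessRegressor(kernel=RBF(0.2), alpha=1e-4).fit(X, yw)
    mu, sd = gp_perf.predict(knobs, return_std=True)
    mw, sw = gp_power.predict(knobs, return_std=True)
    safe = mw + 2 * sw <= BUDGET            # pessimistic power check
    ucb = np.where(safe, mu + 2 * sd, -np.inf)
    x = float(knobs[int(np.argmax(ucb))][0])
    X.append([x]); yp.append(perf(x)); yw.append(power(x))

best = int(np.argmax(yp))
print(f"best safe knob: {X[best][0]:.2f}, performance: {yp[best]:.2f}")
```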

Multi-Agent Reinforcement Learning for Urban Crowd Sensing with For-Hire Vehicles

Rong Ding (Shanghai Jiao Tong University, China); Zhaoxing Yang, Yifei Wei and Haiming Jin (Shanghai Jiao Tong University, China); Xinbing Wang (Shanghai Jiaotong University, China)

2
Recently, vehicular crowd sensing (VCS), which leverages sensor-equipped urban vehicles to collect city-scale sensory data, has emerged as a promising paradigm for urban sensing. Nowadays, a wide spectrum of VCS tasks are carried out by for-hire vehicles (FHVs) due to various hardware and software constraints that are difficult for private vehicles to satisfy. However, such FHV-enabled VCS systems face a fundamental yet unsolved problem of striking a balance between order-serving and sensing outcomes. To address this problem, we propose a novel graph convolutional cooperative multi-agent reinforcement learning (GCC-MARL) framework, which helps FHVs make distributed routing decisions that cooperatively optimize the system-wide global objective. Specifically, GCC-MARL meticulously assigns credits to agents in the training process to effectively stimulate cooperation, represents agents' actions by carefully chosen statistics to cope with variable agent scales, and integrates graph convolution to capture useful spatial features from complex large-scale urban road networks. We conduct extensive experiments with a real-world dataset collected in Shenzhen, China, containing around 1 million trajectories and 50 thousand orders of 553 taxis per day from June 1st to 30th, 2017. Our experiment results show that GCC-MARL outperforms state-of-the-art baseline methods in order-serving revenue, as well as sensing coverage and quality.

Near-Optimal Topology-adaptive Parameter Synchronization in Distributed DNN Training

Zhe Zhang and Chuan Wu (The University of Hong Kong, Hong Kong); Zongpeng Li (Wuhan University & University of Calgary, China)

0
Distributed machine learning with multiple concurrent workers has been widely adopted to train large deep neural networks (DNNs). Parameter synchronization is a key component of each iteration of distributed training, where workers exchange locally computed gradients through an AllReduce operation or parameter servers for global parameter updates. Parameter synchronization often constitutes a significant portion of the training time; minimizing the communication time contributes substantially to DNN training speed-up. Standard ring-based AllReduce or PS architectures work efficiently mostly with homogeneous inter-worker connectivity. However, available bandwidth among workers in real-world clusters is often heterogeneous, due to different hardware configurations, switching topologies, and contention with concurrent jobs. This work investigates the best parameter synchronization topology and schedule among workers for the most expedited communication in distributed DNN training. We show that the optimal parameter synchronization topology should be comprised of trees with different workers as roots, each aggregating or broadcasting a partition of the gradients/parameters. We identify a near-optimal forest packing that maximally utilizes the available bandwidth, and we overlap the aggregation and broadcast stages to minimize communication time. We provide a theoretical analysis of the performance bound and show that our scheme outperforms state-of-the-art parameter synchronization schemes by up to 18.3 times in extensive evaluations under various settings.

Session Chair

Bo Ji (Virginia Tech)

Session F-9

Performance

Conference
2:30 PM — 4:00 PM EDT
Local
May 13 Thu, 11:30 AM — 1:00 PM PDT

On the Performance of Pipelined HotStuff

Jianyu Niu and Fangyu Gai (The University of British Columbia, Canada); Mohammad Jalalzai (The University of British Columbia); Chen Feng (University of British Columbia, Canada)

0
HotStuff is a state-of-the-art Byzantine fault-tolerant consensus protocol. It can be pipelined to build large-scale blockchains. One of its variants, LibraBFT, is adopted in Facebook's Libra blockchain. Although it is well known that pipelined HotStuff is secure against up to 1/3 of Byzantine nodes, its performance in terms of throughput and delay is still under-explored. In this paper, we develop a multi-metric evaluation framework to quantitatively analyze pipelined HotStuff's performance with respect to its chain growth rate, chain quality, and latency. We then propose several attack strategies and evaluate their effects on the performance of pipelined HotStuff. Our analysis shows that the chain growth rate (resp., chain quality) of pipelined HotStuff under our attacks can drop to as low as 4/9 (resp., 12/17) of that without attacks when 1/3 of the nodes are Byzantine. As another application, we use our framework to evaluate certain engineering optimizations adopted by LibraBFT. We find that these optimizations make the system more vulnerable to our attacks than the original pipelined HotStuff. Finally, we provide two countermeasures to thwart these attacks. We hope that our studies can shed light on the rigorous understanding of the state-of-the-art pipelined HotStuff protocol as well as its variants.

Practical Analysis of Replication-Based Systems

Florin Ciucu (University of Warwick, United Kingdom (Great Britain)); Felix Poloczek (University of Warwick / TU Berlin, Germany); Lydia Y. Chen (IBM Zurich Research Laboratory, Switzerland); Martin Chan (University of Warwick, United Kingdom (Great Britain))

0
Task replication has been advocated as a practical solution to reduce response times in parallel systems. The analysis of replication-based systems typically rests on some strong assumptions: Poisson arrivals, exponential service times, or independent service times of the replicas. This study is motivated not only by several studies indicating that these assumptions are unrealistic, but also by some elementary observations highlighting intriguing behaviour. For instance, when service times are not exponential, adding a replication factor can stabilize an unstable system, i.e., one having infinite delays, but a tempting higher replication factor can push the system back into a perilous unstable state. This behaviour disappears, however, if the replicas are sufficiently correlated, in which case any replication factor would even be detrimental. Motivated by the need to dispense with such common yet unrealistic and misleading assumptions, we provide a robust theoretical framework to compute stochastic bounds on response time distributions in general replication systems subject to Markovian arrivals, quite general service times, and correlated replicas. Numerical results show that our bounds are accurate and improve state-of-the-art bounds in the case of Markovian arrivals by as much as three orders of magnitude. We apply our results to a practical application and highlight that correctly setting the replication factor crucially depends on both the service time distributions of the replicas and the degree of correlation amongst them.
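
The non-monotone stability phenomenon is easy to reproduce with a small Monte Carlo experiment: with independent replicas of a bimodal service time, each job holds r servers until its fastest replica finishes, so the offered load is proportional to r times the expected minimum of r replicas. The distribution and all constants below are invented for illustration.

```python
import random

def service():
    """Bimodal service time: occasionally very slow (illustrative values)."""
    return 100.0 if random.random() < 0.1 else 1.0

def mean_min(r, samples=100_000):
    """Monte Carlo estimate of E[min of r independent replicas]."""
    return sum(min(service() for _ in range(r)) for _ in range(samples)) / samples

random.seed(0)
lam, servers = 2.0, 10       # arrival rate and server count (illustrative)
for r in range(1, 8):
    rho = lam * r * mean_min(r) / servers   # each job occupies r servers
    print(f"r={r}: offered load rho={rho:.2f}",
          "(unstable)" if rho >= 1 else "(stable)")
```

With these numbers the load first dips below 1 as replication trims the slow mode, then climbs again as the factor r dominates, matching the stabilize-then-destabilize behaviour described above.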

WebMythBusters: An In-depth Study of Mobile Web Experience

Seonghoon Park and Yonghun Choi (Yonsei University, Korea (South)); Hojung Cha (Yonsei University, Korea (South))

0
The quality of experience (QoE) is an important issue for users when accessing the web. Although many metrics have been designed to estimate the QoE in the desktop environment, few studies have confirmed whether these QoE metrics remain valid in the mobile environment. In this paper, we question the validity of using desktop-based QoE metrics for the mobile web and seek answers. We first classify the existing QoE metrics into several groups according to three criteria and then identify the differences between the mobile and desktop environments. Based on this analysis, we ask three research questions and develop a system, called WebMythBusters, for collecting and analyzing mobile web experiences. Through an extensive analysis of the collected user data, we find that (1) metrics focusing on fast completion or fast initiation of the page loading process cannot estimate the actual QoE, (2) the conventional scheme for calculating visual progress is not appropriate, and (3) focusing only on the above-the-fold area is not sufficient in the mobile environment. The findings indicate that QoE metrics designed for the desktop environment are not necessarily adequate for the mobile environment, and appropriate metrics should be devised to reflect the mobile web experience.

SOBA: Session optimal MDP-based network friendly recommendations

Theodoros Giannakas (EURECOM, France); Anastasios Giovanidis (Sorbonne Université & CNRS-LIP6, France); Thrasyvoulos Spyropoulos (EURECOM, France)

2
Caching content on CDNs or at the network edge has been solidified as a means to reduce network cost and offer a better streaming experience to users. Furthermore, nudging users towards low-cost content has recently gained momentum as a strategy to boost network performance. We focus on the problem of optimal policy design for Network Friendly Recommendations (NFR). We depart from recent modeling attempts and propose a Markov Decision Process (MDP) formulation. MDPs offer a unified framework that can model a user with a random session length. As it turns out, many state-of-the-art approaches can be cast as subcases of our MDP formulation. Moreover, the approach offers the flexibility to model users who are reactive to the quality of the received recommendations. In terms of performance, for users consuming an arbitrary number of contents in sequence, we show theoretically and through extensive validation over real traces that the MDP approach outperforms myopic algorithms both in session cost and in offered recommendation quality. Finally, even compared to optimal state-of-the-art algorithms targeting specific subcases, our MDP framework is significantly more efficient, speeding up execution by a factor of 10 and scaling better with the content catalog and recommendation batch sizes.
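
A minimal sketch of the MDP view (our illustrative reading, not the SOBA algorithm itself): states are contents, the action picks the next recommendation, the per-step cost reflects whether that content is cached, and a continuation probability q captures the random session length; value iteration then yields a network-friendly policy. Catalog size, cache contents, and relevance sets are all invented.

```python
import random

random.seed(1)
K, q = 20, 0.8                        # catalog size, continuation probability
cached = [random.random() < 0.3 for _ in range(K)]
cost = [0.0 if cached[j] else 1.0 for j in range(K)]
# which contents are relevant enough to recommend after content i (invented)
allowed = [[j for j in range(K) if j != i and random.random() < 0.4]
           or [(i + 1) % K] for i in range(K)]

# value iteration: V[i] = expected remaining network cost of a session at i
V = [0.0] * K
for _ in range(200):                  # q < 1 makes this a contraction
    V = [min(cost[j] + q * V[j] for j in allowed[i]) for i in range(K)]

policy = [min(allowed[i], key=lambda j: cost[j] + q * V[j]) for i in range(K)]
print("cost-to-go:", [round(v, 2) for v in V])
print("recommendation policy:", policy)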

Session Chair

Marie-Jose Montpetit (Concordia University, Canada)

Session G-9

Miscellaneous

Conference
2:30 PM — 4:00 PM EDT
Local
May 13 Thu, 11:30 AM — 1:00 PM PDT

De-anonymizing Social Networks Under Partial Overlap: An F-score Based Approach

Jiapeng Zhang and Luoyi Fu (Shanghai Jiao Tong University, China); Xinbing Wang (Shanghai Jiaotong University, China); Guihai Chen (Shanghai Jiao Tong University, China)

0
This paper studies the social network de-anonymization problem, which aims to identify users of an anonymized network by matching its user set with that of another auxiliary sanitized network. Prior art primarily assumes that both networks share exactly the same set of users, as opposed to the many real situations where the user sets only partially overlap. Different from the full matching case, which only needs to increase the number of correctly matched pairs, the partial overlapping case imposes the additional demand of avoiding wrong matches of users who do not have accounts across both networks. To this end, we establish a new cost function, which we call the structural F-score, to incorporate both the structural commonness and difference across networks. Intrinsically, the structural F-score computes the ratio of link agreements and disagreements, thus serving as the harmonic mean of precision and recall for any given matching function. Theoretically, we show that for networks parameterized by node overlap t and link overlap s, as long as the mean degree of the networks grows as Ω(t^{-2} s^{-3} log n), maximizing the structural F-score provably ensures the perfect matching, where the nodal precision and recall are both maximized to 1. Algorithmically, for small-scale networks, we propose a two-step heuristic of F-score based de-anonymization, which first finds the optimal full matching between networks and then removes the pairs hindering structural F-score maximization. Owing to the universal adaptability of the structural F-score, we further extend the algorithm to large-scale networks via a progressive matching process. Empirical results also validate the effectiveness of our methods in terms of improving the nodal F-score.
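
For reference, the harmonic-mean relation invoked above is the standard F-score; for a candidate matching π with precision P(π) and recall R(π) (written here in their nodal form; the paper's structural variant counts link agreements and disagreements instead):

```latex
F(\pi) \;=\; \frac{2\,P(\pi)\,R(\pi)}{P(\pi)+R(\pi)},
\qquad
P(\pi) \;=\; \frac{|\{\text{correct pairs in } \pi\}|}{|\pi|},
\qquad
R(\pi) \;=\; \frac{|\{\text{correct pairs in } \pi\}|}{|\{\text{users truly present in both networks}\}|}
```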

First-Order Efficient General-Purpose Clean-label Data Poisoning

Tianhang Zheng and Baochun Li (University of Toronto, Canada)

0
As one of the recently emerged threats to Deep Learning (DL) models, clean-label data poisoning can teach DL models to make wrong predictions on specific target data, such as images or network traffic packets, by injecting a small set of poisoning data with clean labels into the training datasets. Although several clean-label poisoning methods have been developed before, they have two main limitations. First, the methods developed with bi-level optimization or influence functions usually require second-order information, leading to substantial computational overhead. Second, the methods based on feature collision do not transfer well to unseen feature spaces or generalize to various scenarios. To address these limitations, we propose a first-order, efficient, general-purpose clean-label poisoning attack in this paper. In our attack, we first identify the first-order model update that can push the model towards predicting the target data as the attacker's targeted label. We then formulate a necessary condition based on the model update and other first-order information to optimize the poisoning data. Theoretically, we prove that our first-order poisoning method approximates a second-order approach with theoretically guaranteed performance. Empirically, extensive evaluations on image classification and network traffic classification demonstrate the outstanding efficiency, transferability, and generalizability of our poisoning method.

INT-label: Lightweight In-band Network-Wide Telemetry via Interval-based Distributed Labelling

Enge Song, Tian Pan and Chenhao Jia (Beijing University of Posts and Telecommunications, China); Wendi Cao (Peking University, China); Jiao Zhang, Tao Huang and Yunjie Liu (Beijing University of Posts and Telecommunications, China)

0
In-band Network Telemetry (INT) enables hop-by-hop exposure of device-internal state for reliably maintaining and troubleshooting data center networks. Achieving network-wide telemetry further requires orchestration on top of the INT primitive. One straightforward solution is to flood INT probe packets into the network topology for maximum measurement coverage, which, however, incurs huge bandwidth overhead. A refined solution is to let the SDN controller collect the topology and carry out centralized probing path planning, which, however, cannot seamlessly adapt to occasional topology changes. To tackle these problems, we propose INT-label, a lightweight in-band network-wide telemetry architecture based on interval-based distributed labelling. INT-label periodically labels device-internal states onto sampled packets, which is cost-effective, incurs minor bandwidth overhead, and seamlessly adapts to topology changes. Furthermore, to avoid telemetry resolution degradation due to loss of labelled packets, we design a feedback mechanism that adaptively changes the instantaneous label frequency. Evaluation on software P4 switches suggests that INT-label can achieve 99.72% measurement coverage at a label frequency of 20 times per second. With adaptive labelling enabled, coverage can still reach 92% even if 60% of the packets are lost in the data plane.
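
A minimal control-loop sketch of such a feedback mechanism (the update rule and every constant are our illustrative assumptions, not the paper's design): the collector reports the coverage it actually reconstructed, and each switch scales its label frequency up when labelled packets are being lost.

```python
def adapt_frequency(freq_hz, observed_coverage, target=0.99,
                    min_hz=1.0, max_hz=100.0, gain=0.5):
    """Raise the per-switch label frequency multiplicatively when the
    collector-observed coverage falls short of the target, and gently
    decay it otherwise. All constants are illustrative assumptions."""
    if observed_coverage < target:
        freq_hz *= 1.0 + gain * (target - observed_coverage) / target
    else:
        freq_hz *= 0.99            # slowly relax to save bandwidth
    return max(min_hz, min(max_hz, freq_hz))

freq = 20.0                        # label 20 times per second initially
for coverage in [0.99, 0.95, 0.70, 0.40, 0.90, 0.99]:
    freq = adapt_frequency(freq, coverage)
    print(f"coverage={coverage:.2f} -> label frequency {freq:.1f} Hz")
```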

Finding Critical Files from a Packet

JunNyung Hur, Hahoon Jeon, Hyeon gy Shon, Young Jae Kim and MyungKeun Yoon (Kookmin University, Korea (South))

0
Network-based intrusion detection and data leakage prevention systems inspect packets to detect whether critical files, such as malware or confidential documents, are being transferred. However, this kind of detection requires heavy computing resources for reassembling packets, and only well-known protocols can be interpreted. Moreover, finding similar files in storage requires pairwise comparisons. In this paper, we present a new network-based file identification scheme that inspects packets independently, without reassembly, and finds similar files through inverted indexing instead of pairwise comparison. We use a content-based chunking algorithm to consistently divide both files and packets into multiple byte sequences, called chunks. If a packet is part of a file, the two will share common chunks. The challenging problem is that packet chunking and inverted-index search should be fast and scalable enough for packet processing, and file identification should remain accurate even though many chunks are noise. We address these problems with a small Bloom filter and a delayed query strategy. To the best of our knowledge, this is the first scheme that identifies a specific critical file from a packet over unknown protocols. Experimental results show that the proposed scheme can successfully identify a critical file from a packet.
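
The pipeline can be sketched in a few lines under invented parameters (window size, hash, expected chunk size, Bloom filter dimensions; not the paper's exact design): content-defined chunking cuts wherever a rolling hash of the last few bytes matches a pattern, so a packet that is a slice of a file reproduces the file's interior chunk boundaries, and chunk digests can be probed against a Bloom filter built from the critical files.

```python
import hashlib, random

W, P, MASK = 16, 31, (1 << 8) - 1     # window, hash base, ~256 B avg chunks
PW = pow(P, W, 1 << 32)

def chunks(data):
    """Content-defined chunking: cut where a rolling hash of the last W bytes
    matches MASK, so boundaries depend only on local content and re-align
    even when `data` is an arbitrary slice of a larger file."""
    out, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = (h * P + b) & 0xFFFFFFFF
        if i >= W:
            h = (h - data[i - W] * PW) & 0xFFFFFFFF
        if i >= W - 1 and (h & MASK) == MASK:
            out.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        out.append(data[start:])
    return out

class Bloom:
    def __init__(self, bits=1 << 16, hashes=4):
        self.bits, self.hashes, self.v = bits, hashes, bytearray(bits // 8)
    def _idx(self, item):
        d = hashlib.sha256(item).digest()
        return [int.from_bytes(d[4 * i:4 * i + 4], 'big') % self.bits
                for i in range(self.hashes)]
    def add(self, item):
        for i in self._idx(item):
            self.v[i // 8] |= 1 << (i % 8)
    def __contains__(self, item):
        return all(self.v[i // 8] >> (i % 8) & 1 for i in self._idx(item))

random.seed(7)
critical = bytes(random.getrandbits(8) for _ in range(65536))  # stand-in file
bf = Bloom()
for c in chunks(critical):
    bf.add(hashlib.sha256(c).digest())

packet = critical[3000:4500]          # payload carved from mid-file
pkt_chunks = chunks(packet)
hits = sum(hashlib.sha256(c).digest() in bf for c in pkt_chunks)
print(f"{hits} of {len(pkt_chunks)} packet chunks hit the critical-file index")
```

Only the packet's first and last (partial) chunks can miss; the interior chunks coincide with the file's chunks by construction.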

Session Chair

Damla Turgut (University of Central Florida)

Session Break-3-May13

Virtual Coffee Break

Conference
4:00 PM — 4:30 PM EDT
Local
May 13 Thu, 1:00 PM — 1:30 PM PDT

Session A-10

Security

Conference
4:30 PM — 6:00 PM EDT
Local
May 13 Thu, 1:30 PM — 3:00 PM PDT

Bipartite Graph Matching Based Secret Key Generation

Hongbo Liu (University of Electronic Science and Technology of China, China); Yan Wang (Temple University, USA); Yanzhi Ren (University of Electronic Science and Technology of China, China); Yingying Chen (Rutgers University, USA)

1
Physical layer secret key generation exploiting wireless channel reciprocity has attracted considerable attention in the past two decades. Ongoing research has demonstrated its viability in various radio frequency (RF) systems. Most existing work relies on quantization to convert channel measurements into digital binaries suitable for secret key generation. However, non-simultaneous packet exchanges in time division duplex systems and noise effects in practice usually create random channel measurements between two users, leading to inconsistent quantization results and mismatched secret bits. While significant effort has been spent in recent research on mitigating such non-reciprocity, no efficient method has been found yet. Unlike existing quantization-based approaches, we take a different viewpoint and perform the secret key agreement by solving a bipartite graph matching problem. Specifically, an efficient dual-permutation secret key generation method, DP-SKG, is developed to match the randomly permuted channel measurements between a pair of users by minimizing their discrepancy holistically. DP-SKG allows two users to generate the same secret key based on the permutation order of channel measurements despite the non-reciprocity over wireless channels. Extensive experimental results show that DP-SKG achieves error-free key agreement on received signal strength (RSS) with low cost under various scenarios.
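
As a toy illustration of key agreement from the ordering of channel measurements (a drastically simplified stand-in for DP-SKG's dual-permutation matching, with invented RSS values and noise): in one dimension, the minimum-discrepancy assignment of measurements to rank positions is simply the sorting permutation, which two users recover identically when the measurement noise is small relative to the gaps between measurements.

```python
import random, hashlib

random.seed(42)
n = 16
channel = random.sample(range(-80, -30), n)          # true reciprocal RSS (dBm)
alice = [x + random.gauss(0, 0.1) for x in channel]  # noisy probe at Alice
bob   = [x + random.gauss(0, 0.1) for x in channel]  # noisy probe at Bob

def key_from(measurements):
    """Key = permutation that orders the measurements; in 1-D, the
    minimum-cost matching of values to rank slots reduces to sorting."""
    perm = sorted(range(len(measurements)), key=lambda i: measurements[i])
    return hashlib.sha256(bytes(perm)).hexdigest()

print("Alice:", key_from(alice))
print("Bob:  ", key_from(bob))
print("match:", key_from(alice) == key_from(bob))
```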

ScreenID: Enhancing QRCode Security by Fingerprinting Screens

Yijie Li and Yi-Chao Chen (Shanghai Jiao Tong University, China); Xiaoyu Ji (Zhejiang University, China); Hao Pan, Lanqing Yang, Guangtao Xue and Jiadi Yu (Shanghai Jiao Tong University, China)

0
Quick response (QR) codes have been widely used in mobile applications due to their convenience and the pervasiveness of built-in cameras on smartphones. Recently, however, attacks against QR codes have been reported in which an attacker captures the victim's QR code and replays it to complete a fraudulent transaction or intercept private information just before the original QR code is scanned. In this study, we enhance the security of a QR code by identifying its authenticity. We propose SCREENID, which embeds a QR code with information about the screen displaying it, so that the QR code reveals whether it has been reproduced by an adversary. SCREENID exploits the PWM frequency of a screen as its unique fingerprint. To improve the estimation accuracy of the PWM frequency, SCREENID incorporates a model of the interaction between camera and screen in the temporal and spatial domains. Extensive experiments demonstrate that SCREENID can differentiate screens of different models, types, and manufacturers, thus improving the security of QR codes.

Prison Break of Android Reflection Restriction and Defense

Zhen Ling and Ruizhao Liu (Southeast University, China); Yue Zhang (Jinan University, China); Kang Jia (Southeast University, China); Bryan Pearson (University of Central Florida, USA); Xinwen Fu (University of Massachusetts Lowell, USA); Luo Junzhou (Southeast University, China)

0
The Java reflection technique is pervasively used in the Android system. To reduce the risk of reflection abuse, Android restricts the use of reflection in the Android Runtime (ART) to hide potentially dangerous methods/fields. We perform the first comprehensive study of these reflection restrictions and discover three novel approaches to bypass them. Novel reflection-based attacks are also presented, including a password stealing attack. To mitigate the threats, we analyze the restriction bypassing approaches and identify three techniques crucial to them, i.e., double reflection, memory manipulation, and inline hooking. We propose a defense mechanism consisting of a classloader double checker, an ART variable protector, and an ART method protector to prohibit reflection restriction bypassing. Finally, we design and implement an automatic reflection detection framework and discover 5,531 reflection-powered apps out of 100,000 downloaded apps; these apps are installed on a Google Pixel 2 running our defense-enabled Android system to evaluate the effectiveness and efficiency of our defense mechanism. Extensive empirical results demonstrate that our defense-enabled system accurately obstructs malicious reflection attempts.

Counter-Collusion Smart Contracts for Watchtowers in Payment Channel Networks

Yuhui Zhang and Dejun Yang (Colorado School of Mines, USA); Guoliang Xue (Arizona State University, USA); Ruozhou Yu (North Carolina State University, USA)

0
Payment channel networks (PCNs) have been proposed to improve cryptocurrency scalability by settling transactions off-chain. However, PCNs introduce an undesirable assumption: a channel participant must stay online and synchronized with the blockchain to defend against fraud. To alleviate this issue, watchtowers have been introduced, such that a hiring party can employ a watchtower to monitor the channel for fraud. However, a watchtower might profit from colluding with a cheating counterparty and fail to perform this job. Existing solutions either rely on heavy cryptographic techniques or require a large collateral. In this work, we leverage smart contracts through economic approaches to counter collusion for watchtowers in PCNs. This creates mutual distrust between the watchtower and the counterparty, so that rational parties do not collude or cheat. We provide detailed analyses of the contracts and rigorously prove that they are effective in countering collusion with minimal on-chain operations. In particular, a watchtower only needs to lock a small collateral, which incentivizes the participation of watchtowers and users. We also provide an implementation of the contracts in Solidity and execute them on Ethereum to demonstrate their scalability and efficiency.

Session Chair

Satyajayant Misra (New Mexico State University)

Session B-10

Programmable Switches

Conference
4:30 PM — 6:00 PM EDT
Local
May 13 Thu, 1:30 PM — 3:00 PM PDT

Programmable Switches for in-Networking Classification

Bruno Missi Xavier and Rafael Silva Guimaraes (Federal Institute of Espirito Santo - Campus Cachoeiro de Itapemirim, Brazil); Giovanni Comarela (Universidade Federal do Espírito Santo, Brazil); Magnos Martinello (Federal University of Espirito Santo, Brazil)

2
Deploying accurate machine learning algorithms in a high-throughput networking environment is a challenging task. On the one hand, machine learning has proved useful for traffic classification in many contexts (e.g., intrusion detection, application classification, and early heavy-hitter identification). On the other hand, most work in the area is related to post-processing (i.e., training and testing are performed offline on previously collected samples) or to scenarios where the traffic has to leave the data plane to be classified (i.e., high latency). In this work, we tackle the problem of creating simple and reasonably accurate machine learning models that can be deployed in the data plane with acceptable performance degradation. To that end, we introduce a framework and discuss issues related to the translation of simple models, for handling individual packets or flows, into the P4 language. We validate our framework with an intrusion detection use case, deploying a single decision tree on a Netronome SmartNIC (Agilio CX 2x10GbE). Our results show that high accuracy is achievable (above 95%) with minor performance degradation, even for a large number of flows.
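
To make the translation idea concrete, here is a hedged sketch (our own illustration, not the authors' framework) that flattens a trained decision tree into a list of range-match rules of the kind a P4 match-action table could hold; the feature names and the toy dataset are invented.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# toy flow features: [packet size, inter-arrival time (us)]; 0 benign, 1 attack
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([500, 800], 80, (200, 2)),
               rng.normal([60, 20], 10, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

def to_rules(t, names):
    """Walk the fitted tree and emit one range-match rule per leaf."""
    tt, rules = t.tree_, []
    def walk(node, bounds):
        if tt.children_left[node] == -1:            # leaf node
            rules.append((dict(bounds), int(np.argmax(tt.value[node]))))
            return
        f, thr = names[tt.feature[node]], tt.threshold[node]
        lo, hi = bounds.get(f, (-np.inf, np.inf))
        walk(tt.children_left[node], {**bounds, f: (lo, min(hi, thr))})
        walk(tt.children_right[node], {**bounds, f: (max(lo, thr), hi)})
    walk(0, {})
    return rules

for bounds, label in to_rules(tree, ["pkt_size", "iat_us"]):
    print("match", bounds, "-> class", label)
```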

Fix with P6: Verifying Programmable Switches at Runtime

Apoorv Shukla (Huawei Munich Research Center, Germany); Kevin Hudemann (SAP, Germany); Zsolt Vági (SWISSCOM, Switzerland); Lily Hügerich (TU Berlin, Germany); Georgios Smaragdakis (TU Berlin and Max Planck Institute for Informatics, Germany); Artur Hecker (Huawei, Germany); Stefan Schmid (University of Vienna, Austria); Anja Feldmann (Max Planck Institute for Informatics & Saarland Informatics Campus / TU Berlin, Germany)

2
We design, develop, and evaluate P6, an automated approach to (a) detect, (b) localize, and (c) patch software bugs in P4 programs. Bugs are reported via a violation of pre-specified expected behavior that is captured by P6. P6 is based on machine-learning-guided fuzzing that tests a P4 switch non-intrusively, i.e., without modifying the P4 program, to detect runtime bugs. This enables automated, real-time localization and patching of bugs. We used a P6 prototype to detect and patch existing bugs in various publicly available P4 application programs deployed on two different switch platforms: the behavioral model (bmv2) and Tofino. Our evaluation shows that P6 significantly outperforms bug-detection baselines while generating fewer packets, and patches bugs in large P4 programs such as switch.p4 without triggering any regressions.

Making Multi-String Pattern Matching Scalable and Cost-Efficient with Programmable Switching ASICs

Shicheng Wang, Menghao Zhang, Guanyu Li, Chang Liu and Ying Liu (Tsinghua University, China); Xuya Jia (Huawei Technologies Co. Ltd., China); Mingwei Xu (Tsinghua University, China)

1
Multi-string pattern matching is a crucial building block for many network security applications and is thus of great importance. Since every byte of a packet has to be inspected against a large set of patterns, it often becomes the bottleneck of these applications and dominates the performance of the entire system. Many existing works have been devoted to alleviating this performance bottleneck, either by algorithm optimization or by hardware acceleration. However, neither provides the scalability and cost-efficiency needed to keep pace with the dramatic increase of network bandwidth and network traffic today. In this paper, we present BOLT, a scalable and cost-efficient multi-string pattern matching system leveraging the capability of emerging programmable switches. BOLT combines two techniques: a smart state encoding scheme to fit a large number of strings into the limited memory on the programmable switch, and a variable k-stride transition mechanism to increase the throughput significantly at the same level of memory cost. We implement a prototype of BOLT and make its source code publicly available. Extensive evaluations demonstrate that BOLT provides orders of magnitude improvement in throughput, scales with pattern sets and workloads, and significantly decreases the number of entries and the memory requirement.
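
For context, the classic baseline that such designs build on is the Aho-Corasick automaton, which matches all patterns in a single pass over the input; a compact sketch follows. BOLT's contributions (state encoding and variable k-stride transitions) sit on top of this kind of automaton and are not reproduced here.

```python
from collections import deque

def build_ac(patterns):
    """Build the Aho-Corasick trie with failure links and output sets."""
    goto, fail, out = [{}], [0], [set()]
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({}); fail.append(0); out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(p)
    q = deque(goto[0].values())          # depth-1 states fail to the root
    while q:
        s = q.popleft()
        for ch, t in goto[s].items():
            q.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] |= out[fail[t]]       # inherit matches ending here
    return goto, fail, out

def search(text, goto, fail, out):
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits

goto, fail, out = build_ac(["evil", "vile", "malware"])
print(search("xxevilemalwarevile", goto, fail, out))
```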

Traffic-aware Buffer Management in Shared Memory Switches

Sijiang Huang, Mowei Wang and Yong Cui (Tsinghua University, China)

0
Switch buffers play an important role in the modern Internet. For efficiency, today's switches often use on-chip shared memory and rely on buffer management policies to allocate buffer among ports. To avoid wasting buffer resources or letting a few ports occupy too much buffer, existing policies tend to maximize overall buffer utilization and pursue queue-length fairness. However, blind pursuit of utilization and a misleading queue-length-based definition of fairness lead to buffer occupancy that brings no benefit to throughput, yet extends queuing delay and undermines the burst absorption of other ports. We contend that a buffer management policy should proactively detect port traffic and adjust buffer allocation accordingly. In this paper, we propose the Traffic-aware Dynamic Threshold (TDT) policy. On the basis of the classic dynamic threshold policy, TDT proactively raises or lowers port thresholds to absorb burst traffic or evacuate meaningless buffer occupancy. We present detailed designs of the port control state transitions and the state decision module, which detect real-time traffic and change port thresholds accordingly. Simulations and a DPDK-based real testbed demonstrate that TDT simultaneously optimizes throughput, loss, and delay, and reduces flow completion time by up to 50%.
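
For background, the classic dynamic threshold policy admits a packet at a port only while that port's queue is below α times the unallocated buffer; TDT's traffic awareness can be caricatured as scaling that threshold by a detected per-port state. A minimal sketch with invented constants and states:

```python
def dt_threshold(free_buffer, alpha=1.0):
    """Classic dynamic threshold: T = alpha * (B - total occupancy)."""
    return alpha * free_buffer

def tdt_threshold(free_buffer, port_state):
    """Traffic-aware variant (illustrative): raise the threshold for a port
    absorbing a burst, lower it for a port with a standing queue."""
    scale = {"normal": 1.0, "burst": 4.0, "congested": 0.5}[port_state]
    return scale * free_buffer

B = 1000                                   # total shared buffer (cells)
queues = {"p0": 300, "p1": 50, "p2": 100}  # current queue lengths
states = {"p0": "congested", "p1": "burst", "p2": "normal"}
free = B - sum(queues.values())
for port, q in queues.items():
    t = tdt_threshold(free, states[port])
    print(f"{port}: queue={q}, threshold={t:.0f},",
          "admit" if q < t else "drop")
```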

Session Chair

Qiao Xiang (Xiamen University, China)

Session C-10

Memory Topics

Conference
4:30 PM — 6:00 PM EDT
Local
May 13 Thu, 1:30 PM — 3:00 PM PDT

Adaptive Batch Update in TCAM: How Collective Optimization Beats Individual Ones

Ying Wan (Tsinghua University, China); Haoyu Song (Futurewei Technologies, USA); Yang Xu (Fudan University, China); Chuwen Zhang (Tsinghua University, China); Yi Wang (Southern University of Science and Technology, China); Bin Liu (Tsinghua University, China)

1
Rule update in TCAM has long been identified as a key technical challenge due to the rule order constraint. Existing algorithms take each rule update as an independent task. However, emerging applications produce batch rule update requests, and processing the updates individually causes a high aggregate cost which can strain the processor and/or incur excessive TCAM lookup interrupts. This paper presents the first true batch update algorithm, ABUT. Unlike other purported batch update algorithms, ABUT collectively evaluates and optimizes the TCAM placement for whole batches throughout. By applying topology grouping and maintaining the group order invariance in TCAM, ABUT achieves a substantial reduction in computing time while still yielding best-in-class placement cost. Our evaluations show that ABUT is ideal for low-latency and high-throughput batch TCAM updates in modern high-performance switches.

HAVS: Hardware-accelerated Shared-memory-based VPP Network Stack

Shujun Zhuang and Jian Zhao (Shanghai Jiao Tong University, China); Jian Li (Shanghai Jiao Tong University, China); Ping Yu and Yuwei Zhang (Intel, China); Haibing Guan (Shanghai Jiao Tong University, China)

0
The number of requests to transfer large files is increasing rapidly in web server and remote-storage scenarios, and this increase demands higher processing capacity from the network stack. However, to fully decouple from applications, many recent userspace network stacks, such as VPP (vector packet processing) and Snap, adopt a shared-memory-based solution to communicate with upper-layer applications. During this communication, the application or network stack needs to copy data to or from shared memory queues. In our verification experiment, these multiple copy operations incur more than 50% CPU consumption and severe performance degradation when the transferred file is larger than 32 KB.
This paper adopts a hardware-accelerated solution and proposes HAVS, which integrates Intel I/O Acceleration Technology into the VPP network stack to achieve high-performance memory copy offloading. An asynchronous copy architecture is introduced in HAVS to free up CPU resources. Moreover, an abstract memcpy accelerator layer is constructed in HAVS to ease the use of different types of hardware accelerators and to sustain high availability with a fault-tolerance mechanism. A comprehensive evaluation shows that HAVS can provide an average 50%-60% throughput improvement over the original VPP stack when accelerating the nginx and SPDK iSCSI target applications.

Maximizing the Benefit of RDMA at End Hosts

Xiaoliang Wang (Nanjing University, China); Hexiang Song (NJU, China); Cam-Tu Nguyen (Nanjing University, Vietnam); Dongxu Cheng and Tiancheng Jin (NJU, China)

0
RDMA is increasingly deployed in data centers to meet the demands of ultra-low latency, high throughput, and low CPU overhead. However, it is not easy to migrate existing applications from the TCP/IP stack to RDMA. Developers usually need to carefully select communication primitives and manually tune parameters for each single-purpose system. From operating a high-speed RDMA network, we identify multiple hidden costs that can cause degraded and/or unpredictable performance of RDMA-based applications. We demonstrate these hidden costs, including complicated combinations of parameter settings, the scalability of Reliable Connections, two-sided memory management and page alignment, and resource contention among diverse traffic. To address these problems, we introduce Nem, a suite that allows developers to maximize the benefit of RDMA by i) eliminating resource contention in the NIC cache through asynchronous resource sharing; ii) introducing hybrid page management based on message sizes; and iii) isolating flows of different traffic classes based on hardware features. We implement a prototype of Nem and verify its effectiveness by rebuilding the RPC message service, demonstrating high throughput for large messages and low latency for small messages without compromising low CPU utilization, along with good scalability for a large number of active connections.

Session Chair

Xinwen Fu (U. Massachussets, Lowell)

Session D-10

SDN

Conference
4:30 PM — 6:00 PM EDT
Local
May 13 Thu, 1:30 PM — 3:00 PM PDT

Safety Critical Networks using Commodity SDNs

Ashish Kashinath (University of Illinois at Urbana-Champaign, USA); Monowar Hasan (University of Illinois Urbana-Champaign, USA); Rakesh Kumar (University of Illinois, Urbana-Champaign, USA); Sibin Mohan (University of Illinois at Urbana-Champaign, USA); Rakesh B. Bobba (Oregon State University, USA); Smruti Padhy (University of Texas at Austin, USA)

0
Safety-critical networks often have stringent real-time requirements; they must also be resilient to failures. In this paper, we propose RealFlow, a framework that uses commodity software-defined networks (SDNs) to realize networks with end-to-end timing guarantees, while also (a) increasing resiliency against link/switch failures and (b) increasing network utilization. The use of SDNs in this space also improves the system's management capabilities due to global visibility into the network. RealFlow is implemented as a northbound SDN controller application compatible with standard OpenFlow protocols, with little to no runtime overhead. We demonstrate feasibility on a real hardware testbed (Pica8 SDN switches + Raspberry Pi end hosts) and a practical avionics case study. Our evaluations show that RealFlow can accommodate 63% more network flows with safety-critical guarantees compared to current designs, and up to 18% more when link resiliency (via backup paths) is also considered.

Bandwidth Isolation Guarantee for SDN Virtual Networks

Gyeongsik Yang, Yeonho Yoo and Minkoo Kang (Korea University, Korea (South)); Heesang Jin (ETRI, Korea (South)); Chuck Yoo (Korea University, Korea (South))

1
We introduce TeaVisor, which provides a bandwidth isolation guarantee for software-defined networking (SDN)-based network virtualization (NV). SDN-NV provides topology and address virtualization while allowing flexible resource provisioning, control, and monitoring of virtual networks. However, to the best of our knowledge, the bandwidth isolation guarantee, which is essential for providing stable and reliable throughput for network services, is missing in SDN-NV. Without a bandwidth isolation guarantee, tenants suffer degraded service quality and significant revenue loss. In fact, we find that existing studies on bandwidth isolation guarantees are insufficient for SDN-NV: with SDN-NV, routing is performed by tenants, and existing studies have not addressed the overloaded link problem. To solve this problem, TeaVisor comprises three components: path virtualization, bandwidth reservation, and path establishment, which utilize multipath routing. With these, TeaVisor achieves the bandwidth isolation guarantee while preserving the routing of the tenants. In addition, TeaVisor guarantees the minimum and maximum amounts of bandwidth simultaneously. We fully implement TeaVisor, and comprehensive evaluation results show near-zero error rates in achieving the bandwidth isolation guarantee. We also present an overhead analysis of control traffic and memory consumption.

Online Joint Optimization on Traffic Engineering and Network Update in Software-defined WANs

Jiaqi Zheng, Yimeng Xu and Li Wang (Nanjing University, China); Haipeng Dai (Nanjing University & State Key Laboratory for Novel Software Technology, China); Guihai Chen (Shanghai Jiao Tong University, China)

0
State-of-the-art inter-datacenter WANs rely on centralized traffic engineering (TE) to improve network performance, where TE computation is a periodic procedure that performs routing configuration in a timely manner (i.e., enforces TE policies by adding, removing, and modifying forwarding rules) in response to changing network conditions. The TE computation determines the routing configuration corresponding to the current network conditions, and the network update operations change the routing configuration from the last TE solution to the current one. Existing works treat centralized TE computation and network update as two separate optimization procedures, which inevitably leads to suboptimal solutions in the long run. In this paper, we initiate the study of online joint optimization of TE computation and network update, with the objective of minimizing the sum of TE cost and network update cost. We formulate this problem as an optimization program and propose a set of provable online algorithms with rigorous competitive and regret analysis. Trace-driven simulations on two empirical topologies demonstrate that our algorithms can significantly decrease the total cost.

Modeling the Cost of Flexibility in Communication Networks

Alberto Martínez Alba (Technische Universität München, Germany); Péter Babarczi (Budapest University of Technology and Economics, Hungary & Technische Universität München, Germany); Andreas Blenk and Mu He (Technische Universität München, Germany); Patrick Kalmbach (Technical University of Munich, Germany); Johannes Zerwas and Wolfgang Kellerer (Technische Universität München, Germany)

0
Communication networks are evolving towards a more adaptive and reconfigurable nature due to the ever-growing demands they face. A framework for measuring network flexibility has been proposed recently, but the cost of rendering communication networks more flexible has not yet been mathematically modeled. As new technologies such as software-defined networking (SDN), network function virtualization (NFV), and network virtualization (NV) emerge to provide network flexibility, a way to estimate and compare the cost of different implementation options is needed. In this paper, we present a comprehensive model of the cost of a flexible network that takes into account its transient and stationary phases. This allows network researchers and operators not only to qualitatively argue about their new flexible network solutions, but also to analyze their cost for the first time in a quantitative way.

Session Chair

Y. Richard Yang (Yale University)

Session E-10

Learning Networks

Conference
4:30 PM — 6:00 PM EDT
Local
May 13 Thu, 1:30 PM — 3:00 PM PDT

Analyzing Learning-Based Networked Systems with Formal Verification

Arnaud Dethise and Marco Canini (KAUST, Saudi Arabia); Nina Narodytska (VMware Research Group, USA)

0
As more applications of (deep) neural networks emerge in the computer networking domain, the correctness and predictability of a neural agent's behavior on corner-case inputs are becoming crucial. To enable the formal analysis of agents with nontrivial properties, we bridge the gap between specifying intended high-level behavior and expressing low-level statements that can be directly encoded into an efficient verification framework. Our results show that, within minutes, one can establish the resilience of a neural network to adversarial attacks on its inputs, as well as formally prove properties that previously relied on educated guesses. Finally, we also show how formal verification can help create an accurate visual representation of agent behavior for visual inspection, improving its trustworthiness.
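
To give a flavor of what a low-level verifiable statement looks like, the sketch below propagates an input box through a tiny ReLU network with interval arithmetic (interval bound propagation, one of the simplest verification primitives; the weights are invented). If the certified output range already satisfies the property of interest, no input inside the box, adversarial or not, can violate it.

```python
import numpy as np

def affine_bounds(lo, hi, W, b):
    """Propagate an axis-aligned box through x -> Wx + b."""
    center, radius = (lo + hi) / 2, (hi - lo) / 2
    c = W @ center + b
    r = np.abs(W) @ radius
    return c - r, c + r

# tiny 2-layer ReLU network with invented weights
W1, b1 = np.array([[1.0, -1.0], [0.5, 2.0]]), np.array([0.1, -0.2])
W2, b2 = np.array([[1.0, 1.0]]), np.array([0.0])

lo, hi = np.array([0.4, 0.4]), np.array([0.6, 0.6])  # input box (x +/- 0.1)
lo, hi = affine_bounds(lo, hi, W1, b1)
lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)        # ReLU is monotone
lo, hi = affine_bounds(lo, hi, W2, b2)
print(f"certified output range: [{lo[0]:.3f}, {hi[0]:.3f}]")
```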

Bringing Fairness to Actor-Critic Reinforcement Learning for Network Utility Optimization

Jingdi Chen and Yimeng Wang (The George Washington University, USA); Tian Lan (George Washington University, USA)

0
Fairness is a crucial design objective in virtually all network optimization problems where limited system resources are shared by multiple agents. Recently, reinforcement learning has been successfully applied to autonomous online decision making in many network design and optimization problems. However, most approaches try to maximize the long-term (discounted) reward of all agents without taking fairness into account. In this paper, we propose a family of algorithms that bring fairness to actor-critic reinforcement learning for optimizing general fairness utility functions. In particular, we present a novel method for adjusting the rewards in standard reinforcement learning by a multiplicative weight that depends on both the shape of the fairness utility and some statistics of past rewards. It is shown that, for a proper choice of the adjusted rewards, a policy gradient update converges to at least a stationary point of general α-fairness utility optimization. This inspires the design of fairness optimization algorithms in actor-critic reinforcement learning. Evaluations show that the proposed algorithm can be easily deployed in real-world network optimization problems, such as wireless scheduling and video QoE optimization, and can significantly improve the fairness utility value over previous heuristics and learning algorithms.
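
The multiplicative adjustment can be illustrated in a few lines (our reading of the idea, with invented constants): weight each agent's instantaneous reward by the marginal α-fair utility of its running average reward, so agents that have received less get up-weighted.

```python
import numpy as np

def fairness_weights(avg_rewards, alpha=1.0, eps=1e-6):
    """Marginal alpha-fair utility U'(x) = x^(-alpha): under-served agents
    (small running average) receive larger multiplicative weights."""
    return np.power(np.maximum(avg_rewards, eps), -alpha)

avg = np.array([5.0, 1.0, 0.2])      # running average rewards of 3 agents
r = np.array([1.0, 1.0, 1.0])        # identical instantaneous rewards
for a in (0.0, 1.0, 2.0):            # utilitarian, proportional-fair, stronger
    print(f"alpha={a}: adjusted rewards =",
          np.round(r * fairness_weights(avg, a), 3))
```

At α = 0 the weights are all 1 (plain reward maximization); larger α pushes the policy gradient ever harder towards the worst-off agents.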

Incentive Mechanism Design for Distributed Coded Machine Learning

Ningning Ding (The Chinese University of Hong Kong, Hong Kong); Zhixuan Fang (Tsinghua University, China); Lingjie Duan (Singapore University of Technology and Design (SUTD), Singapore); Jianwei Huang (The Chinese University of Hong Kong, Shenzhen, China)

1
A distributed machine learning platform needs to recruit many heterogeneous worker nodes to finish computation simultaneously. As a result, the overall performance may be degraded due to straggling workers. By introducing redundancy into computation, coded machine learning can effectively improve the runtime performance by recovering the final computation result through the first k (out of the total n) workers that finish computation. While existing studies focus on designing efficient coding schemes, the issue of designing proper incentives to encourage worker participation is still under-explored. This paper studies the platform's optimal incentive mechanism for motivating proper worker participation in coded machine learning, despite incomplete information about heterogeneous workers' computation performance and costs. A key contribution of this work is to summarize workers' multi-dimensional heterogeneity in a one-dimensional metric, which guides the platform's efficient selection of workers under incomplete information with linear computational complexity. Moreover, we prove that the optimal recovery threshold k is linearly proportional to the participant number n if the widely adopted MDS codes are used for data encoding. We also show that the platform's increased cost due to incomplete information disappears when the number of workers is sufficiently large, but it does not decrease monotonically in the number of workers.
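
The runtime benefit of coding is easy to see numerically: with an (n, k) MDS code, the job finishes when the k-th fastest of n workers does, so the completion time is the k-th order statistic of the worker times. A Monte Carlo sketch with an assumed shifted-exponential worker model:

```python
import random

def completion_time(n, k, shift=1.0, rate=1.0):
    """Job finishes when the k-th fastest of n workers finishes (MDS (n, k));
    each worker's time is shifted-exponential (illustrative model)."""
    times = sorted(shift + random.expovariate(rate) for _ in range(n))
    return times[k - 1]

random.seed(3)
n, trials = 12, 20_000
for k in (4, 8, 12):   # k = n means no redundancy: wait for every straggler
    avg = sum(completion_time(n, k) for _ in range(trials)) / trials
    print(f"(n={n}, k={k}): mean completion time = {avg:.2f}")
```

A fuller model would also scale each worker's time with the 1/k share of the encoded work it performs; the sketch deliberately ignores this to isolate the straggler effect.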

Efficient Learning-based Scheduling for Information Freshness in Wireless Networks

Bin Li (University of Rhode Island, USA)

1
Motivated by the recent trend of integrating artificial intelligence into the Internet-of-Things (IoT), we consider the problem of scheduling packets from multiple sensing sources to a central controller over a wireless network. Here, packets from different sensing sources have different values or degrees of importance to the central controller for intelligent decision making. In such a setup, it is critical to provide timely and valuable information to the central controller. In this paper, we develop a parameterized maximum-weight-type scheduling policy that combines both AoI metrics and Upper Confidence Bound (UCB) estimates in its weight measure with a parameter η. Here, UCB estimates balance the tradeoff between exploration and exploitation in learning and are critical for yielding a small cumulative regret. We show that our proposed algorithm keeps the running average total age at most O(N²η). We also prove that our proposed algorithm achieves cumulative regret over the time horizon T of at most O(NT/η + √(NT log T)). This reveals a tradeoff between the cumulative regret and the running average total age: increasing η makes the cumulative regret smaller, but at the cost of a larger running average total age. Simulation results are provided to evaluate the efficiency of our proposed algorithm.
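
One plausible reading of such a weight measure (the exact combination below is our illustrative assumption, not necessarily the paper's form): at each slot, schedule the source maximizing a mix of its age and its UCB value estimate, with η trading off the two.

```python
import math, random

random.seed(5)
N, T, eta = 4, 10_000, 5.0
true_value = [0.9, 0.6, 0.4, 0.2]      # unknown importance of each source
age = [1.0] * N                        # Age of Information per source
mean, count = [0.0] * N, [0] * N

def ucb(i, t):
    if count[i] == 0:
        return float("inf")            # force initial exploration
    return mean[i] + math.sqrt(2 * math.log(t + 1) / count[i])

for t in range(T):
    # assumed weight: age plus eta-scaled UCB estimate of the source's value
    i = max(range(N), key=lambda j: age[j] + eta * ucb(j, t))
    v = true_value[i] + random.gauss(0, 0.1)           # observed reward
    count[i] += 1
    mean[i] += (v - mean[i]) / count[i]
    for j in range(N):
        age[j] = 1.0 if j == i else age[j] + 1.0       # AoI update

print("service counts per source:", count)
```

Larger η biases scheduling towards high-value sources (lower regret), while smaller η keeps every source's age bounded more tightly, mirroring the tradeoff stated above.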

Session Chair

WenZhan Song (University of Georgia)

Session F-10

Protocols

Conference
4:30 PM — 6:00 PM EDT
Local
May 13 Thu, 1:30 PM — 3:00 PM PDT

802.11ad in Smartphones: Energy Efficiency, Spatial Reuse, and Impact on Applications

Shivang Aggarwal (Northeastern University, USA); Moinak Ghoshal (Northeastern University, Boston, MA, USA); Piyali Banerjee (University at Buffalo, USA); Dimitrios Koutsonikolas (Northeastern University, USA); Joerg Widmer (IMDEA Networks Institute, Spain)

1
We present an extensive experimental evaluation of the performance and power consumption of the 60 GHz IEEE 802.11ad technology on commercial smartphones. We also compare 802.11ad against its main competitors in the 5 GHz band, 802.11ac and, for the first time, 802.11ax, on mobile devices. Our performance comparison focuses on two aspects that have not been extensively studied before: (i) dense multi-client and multi-AP topologies and (ii) popular mobile applications under realistic mobility patterns. Our power consumption study covers both non-communicating and communicating modes. We also present the first study of the power saving mode in 802.11ad-enabled smartphones and its impact on performance. Our results show that 802.11ad is better able to address the needs of emerging bandwidth-intensive applications in smartphones than its 5 GHz counterparts. At the same time, we identify several key research directions towards realizing its full potential.

Age-Dependent Distributed MAC for Ultra-Dense Wireless Networks

Dheeraj Narasimha (Arizona State University, USA); Srinivas G Shakkottai (Texas A&M University, USA); Lei Ying (University of Michigan, USA)

0
We consider an ultra-dense wireless network with N channels and M = N devices. Messages with fresh information are generated at each device according to a random process and need to be transmitted to an access point. The value of a message decreases as it ages, so each device searches for an idle channel to transmit the message as soon as it can. However, each channel probing is associated with a fixed cost (energy), so a device needs to adapt its probing rate based on the "age" of the message. At each device, the design of the optimal probing strategy can be formulated as an infinite horizon Markov Decision Process (MDP) where the devices compete with each other to find idle channels. While it is natural to view the system as a Bayesian game, it is often intractable to analyze such a system. Thus, we use the Mean Field Game (MFG) approach to analyze the system in a large-system regime, where the number of devices is very large, to understand the structure of the problem and to find efficient probing strategies. We present an analysis based on the MFG perspective. We begin by characterizing the space of valid policies and use this to show the existence of a Mean Field Nash Equilibrium (MFNE) in a constrained set for any general increasing cost functions with diminishing rewards. Further, we provide an algorithm for computing the equilibrium for any given device, and the corresponding age-dependent channel probing policy.

Delay-Tolerant Constrained OCO with Application to Network Resource Allocation

Juncheng Wang and Ben Liang (University of Toronto, Canada); Min Dong (Ontario Tech University, Canada); Gary Boudreau and Hatem Abou-zeid (Ericsson, Canada)

0
We consider online convex optimization (OCO) with multi-slot feedback delay, where an agent makes a sequence of online decisions to minimize the accumulation of time-varying convex loss functions, subject to short-term and long-term constraints that are possibly time-varying. The current convex loss function and the long-term constraint function are revealed to the agent only after the decision is made, and they may be delayed for multiple time slots. Existing work on OCO under this general setting has focused on the static regret, which measures the gap of losses between the online decision sequence and an offline benchmark that is fixed over time. In this work, we consider both the static regret and the more practically meaningful dynamic regret, where the benchmark is a time-varying sequence of per-slot optimizers. We propose an efficient algorithm, termed Delay-Tolerant Constrained-OCO (DTC-OCO), which uses a novel constraint penalty with double regularization to tackle the asynchrony between information feedback and decision updates. We derive upper bounds on its dynamic regret, static regret, and constraint violation, proving them to be sublinear under mild conditions. We further apply DTC-OCO to a general network resource allocation problem, which arises in many systems such as data networks and cloud computing. Simulation results demonstrate substantial performance gain of DTC-OCO over the known best alternative.

Multicast Communications with Varying Bandwidth Constraints

Yuval Emek and Shay Kutten (Technion, Israel); Mordechai Shalom (Tel-Hai College & Technion, Israel); Shmuel Zaks (Technion, Israel)

0
Finding the maximum number of communication requests that can be satisfied concurrently is a fundamental network scheduling problem. In this work, we investigate the problem of finding a maximum number of multicast requests that can be scheduled simultaneously in a tree network in which the nodes and links have heterogeneous bandwidth limitations. This problem generalizes two problems studied in the literature: maximum k-colorable subgraph in chordal graphs, and maximum multi-commodity flow in trees. The problem is NP-hard and admits a 1.585-approximation in the special case of homogeneous bandwidth limitations. We first show that the problem is harder to approximate when the bandwidth limitations are heterogeneous, i.e., when they vary from link to link and from node to node. We then generalize a classical algorithm and obtain an M-approximation, where M is the maximum number of leaves of the communication subtrees. Surprisingly, variants of the same algorithm have been used in the literature at least four times to solve related problems. A polynomial-time algorithm is known for the special case of unicast requests and a star topology. We generalize this result by relaxing the topology requirement: the unicast requests need only share a common vertex, with no restriction on the tree topology.

Session Chair

Jiangchuan Liu (Simon Fraser University, Canada)

Session G-10

Next Generation Challenges

Conference
4:30 PM — 6:00 PM EDT
Local
May 13 Thu, 1:30 PM — 3:00 PM PDT

BlendVLC: A Cell-free VLC Network Architecture Empowered by Beamspot Blending

Jona Beysens (KU Leuven, Belgium); Qing Wang (Delft University of Technology, The Netherlands); Maxim Van den Abeele and Sofie Pollin (KU Leuven, Belgium)

2
In visible light communication (VLC), the quality of communication is dominated primarily by line-of-sight links. To ensure appropriate link quality anywhere, beamsteering has been proposed, where transmitters (TXs) dynamically steer their beams to create beamspots on the users. However, these highly dynamic TXs face a beam tracking problem and produce highly variable illumination. In this work, we propose BlendVLC, a cell-free network architecture that improves the mobility robustness of users by blending beamspots from both steerable and fixed TXs. We solve beam tracking by designing a centimeter-level visible light positioning algorithm empowered by a neural network. Relying on this location information, we formulate and solve an optimization problem for beamspot blending, and design a fast and scalable heuristic for large networks. We build a proof-of-concept testbed as well as a simulator to evaluate BlendVLC, and show that it outperforms denser networks with fully fixed TXs. For example, in a large-scale VLC network of 8 m × 4 m, BlendVLC improves average system throughput by 30% while requiring only half the number of TXs.

Characterizing Ethereum's Mining Power Decentralization at a Deeper Level

Liyi Zeng (Tsinghua University, China); Yang Chen (Microsoft Research Asia, China); Shuo Chen (Microsoft Research, USA); Xian Zhang and Zhongxin Guo (Microsoft Research Asia, China); Wei Xu (Tsinghua University, China); Thomas Moscibroda (Microsoft Research, USA)

2
For proof-of-work blockchains such as Ethereum, mining power decentralization is an important discussion point in the community. Previous studies mostly focus on the aggregated power of the mining pools, neglecting the pool participants who are the source of the pools' power. In this paper, we present the first large-scale study of the pool participants in Ethereum's mining pools. Pool participants are not directly observable because they communicate with their pools via private channels. However, they leave "footprints" on chain, as they use Ethereum accounts to anonymously receive rewards from mining pools. For this study, we combine several data sources to identify 62,358,646 pool reward transactions sent by 47 pools to their participants over Ethereum's entire nearly five-year history. Our analyses of these transactions reveal interesting insights about three aspects of pool participants: the power decentralization at the participant level, their pool-switching behavior, and why they participate in pools. Our results provide a complementary and more balanced view of Ethereum's mining power decentralization at a deeper level.

Uplink Multi-User Beamforming on Single RF Chain mmWave WLANs

Keerthi Priya Dasala (Rice University, USA); Josep M Jornet (Northeastern University, USA); Edward W. Knightly (Rice University, USA)

1
Today's mmWave WLANs can realize simultaneous multi-user multi-stream transmission solely on the downlink. In this paper, we present Uplink Multi-user Beamforming on a single-RF-chain AP (UMBRA), a novel framework for supporting multi-stream multi-user uplink transmissions via a single RF chain. We design multi-user overlaid constellations and multi-user receiver mechanisms to enable concurrent time-triggered uplink multi-user transmissions received on a single-RF-chain AP. We devise exemplary beam selection policies to jointly adapt beams at the users and the AP, targeting aggregate rate maximization without increasing training requirements compared to single-user systems. We implement the key components of UMBRA on a programmable WLAN testbed with software-defined radios and commercial 60-GHz transceivers, and collect over-the-air measurements using phased-array antennas and horn antennas with varying beamwidth. We find that, in comparison to single-user transmissions, UMBRA achieves more than 1.45× improvement in aggregate rate regardless of the choice of user group, geometric separation, and receiver beamwidth.

Session Chair

Nirupama Bulusu (Portland State University)
