Papers by Vaneet Aggarwal
IEEE Transactions on Quantum Engineering
This article proposes circular hidden quantum Markov models (c-HQMMs), which can be applied for modeling temporal data. We show that c-HQMMs are equivalent to a tensor network (more precisely, circular local purified state) model. This equivalence enables us to provide an efficient learning model for c-HQMMs. The proposed learning approach is evaluated on six real datasets and demonstrates the advantage of c-HQMMs as compared to HQMMs and HMMs. Index Terms: Hidden quantum Markov model (HQMM), tensor network.
In this paper, we study the effect of channel output feedback on the sum capacity in a two-user symmetric deterministic interference channel. We find that having a single feedback link from one of the receivers to its own transmitter results in the same sum capacity as having a total of four feedback links from both the receivers to both the transmitters. Hence, from the sum capacity point of view, the three additional feedback links are not helpful. We also consider a half-duplex feedback model, where the forward and the feedback resources are symmetric and time-shared. Surprisingly, we find that there is no gain in sum capacity with feedback in a half-duplex feedback model when interference links have more capacity than direct links.

arXiv (Cornell University), Feb 10, 2019
Internet video traffic has been rapidly increasing and is expected to increase further with emerging 5G applications such as higher-definition videos, IoT, and augmented/virtual reality applications. As end users consume video in massive amounts and in an increasing number of ways, the content distribution network (CDN) should be efficiently managed to improve the system efficiency. The streaming service can include multiple caching tiers, at the distributed servers and the edge routers, and efficient content management at these locations affects the quality of experience (QoE) of the end users. In this paper, we propose a model for video streaming systems, typically composed of a centralized origin server, several CDN sites, and edge caches located closer to the end user. We comprehensively consider different system design factors, including the limited caching space at the CDN sites, allocation of CDN for a video request, choice of different ports (or paths) from the CDN and the central storage, bandwidth allocation, the edge-cache capacity, and the caching policy. We focus on minimizing a performance metric, stall duration tail probability (SDTP), and present a novel and efficient algorithm accounting for the multiple design flexibilities. The theoretical bounds with respect to the SDTP metric are also analyzed and presented. The implementation on a virtualized cloud system managed by OpenStack demonstrates that the proposed algorithms can significantly improve the SDTP metric compared to the baseline strategies.

2018 IEEE Global Communications Conference (GLOBECOM)
In this paper, we study the problem of Quality of Experience (QoE) aware resource allocation in wireless systems. In particular, we consider application-aware joint bandwidth-power allocation for a small cell. We optimize a QoE metric for multiuser video streaming in a small cell that maintains a trade-off between maximizing the playback rate of each user and ensuring proportional fairness (PF) among users. We formulate the application-driven joint bandwidth-power allocation as a non-convex optimization problem. However, we develop a polynomial complexity algorithm, and we show that the proposed algorithm achieves the optimal solution of the proposed optimization problem. Simulation results show that the proposed QoE-aware algorithm significantly improves the average QoE. Moreover, it outperforms the weighted sum rate allocation, which is the state-of-the-art physical resource allocation scheme.

IEEE Transactions on Information Theory, Nov 1, 2015
In this paper, we study the capacity regions of two-way diamond channels. We show that for a linear deterministic model the capacity of the diamond channel in each direction can be simultaneously achieved for all values of channel parameters, where the forward and backward channel parameters are not necessarily the same. We divide the achievability scheme into three cases, depending on the forward and backward channel parameters. For the first case, we use a reverse amplify-and-forward strategy in the relays. For the second case, we use four relay strategies based on the reverse amplify-and-forward with some modifications in terms of replacement and repetition of some stream levels. For the third case, we use two relay strategies based on performing two rounds of repetitions in a relay. The proposed schemes for deterministic channels are used to find the capacity regions within constant gaps for two special cases of the Gaussian two-way diamond channel. First, for the general Gaussian two-way relay channel with a simple coding scheme, the smallest gap is achieved compared to the prior works. Then, a special symmetric Gaussian two-way diamond model is considered and the capacity region is achieved within four bits. Index terms: Two-way diamond channel, reverse amplify-and-forward, cut-set bound, linear deterministic channel, Gaussian channel, two-way relay channel, rate region with constant gap.
In large wireless networks, acquiring full network state information is typically infeasible. Hence, nodes need to flow the information and manage the interference based on partial information about the network. In this paper, we consider multi-hop wireless networks and assume that each source only knows the channel gains that are on the routes from itself to other destinations in the network. We develop several distributed strategies to manage the interference among the users and prove their optimality in maximizing the achievable normalized sum-rate for some classes of networks.

IEEE/ACM Transactions on Networking
We propose cooperative edge-assisted dynamic federated learning (CE-FL). CE-FL introduces a distributed machine learning (ML) architecture, where data collection is carried out at the end devices, while the model training is conducted cooperatively at the end devices and the edge servers, enabled via data offloading from the end devices to the edge servers through base stations. CE-FL also introduces a floating aggregation point, where the local models generated at the devices and the servers are aggregated at an edge server, which varies from one model training round to another to cope with the network evolution in terms of data distribution and users' mobility. CE-FL considers the heterogeneity of network elements in terms of communication/computation models and their proximity to one another. CE-FL further presumes a dynamic environment with online variation of data at the network devices, which causes a drift in the ML model performance. We model the processes taken during CE-FL and conduct an analytical convergence analysis of its ML model training. We then formulate network-aware CE-FL, which aims to adaptively optimize all the network elements via tuning their contribution to the learning process, which turns out to be a non-convex mixed integer problem. Motivated by the large scale of the system, we propose a distributed optimization solver to break down the computation of the solution across the network elements. We finally demonstrate the effectiveness of our framework with the data collected from a real-world testbed.

2022 IEEE 61st Conference on Decision and Control (CDC)
In tabular multi-agent reinforcement learning with average-cost criterion, a team of agents sequentially interacts with the environment and observes local incentives. We focus on the case where the global reward is a sum of local rewards, the joint policy factorizes into agents' marginals, and the state is fully observable. To date, few global optimality guarantees exist even for this simple setting, as most results yield convergence to stationarity for parameterized policies in large/possibly continuous spaces. To solidify the foundations of MARL, we build upon linear programming (LP) reformulations, for which stochastic primal-dual methods yield a model-free approach to achieve optimal sample complexity in the centralized case. We develop multi-agent extensions, whereby agents solve their local saddle point problems and then perform local weighted averaging. We establish that the sample complexity to obtain near-globally optimal solutions matches tight dependencies on the cardinality of the state and action spaces, and exhibits classical scalings with respect to the network in accordance with multi-agent optimization. Experiments corroborate these results in practice.

ACM/IMS Transactions on Data Science
There are numerous real-world problems where a user must make decisions under uncertainty. For the problem of influence maximization on a social network, for example, the user must select a set of K influencers who will jointly have a large influence on many users. With the lack of prior knowledge about the diffusion process or even topological information, this problem becomes quite challenging. This problem can be cast as a combinatorial bandit problem, where the user can repeatedly choose a candidate set of K out of N arms at each time, with an aim to achieve an efficient trade-off between exploration and exploitation. In this work, we present the first combinatorial bandit algorithm for which the only feedback is a non-linear reward of the selected K arms. No other feedback is needed. In the context of influence maximization, this means no feedback in the form of which nodes or edges were activated needs to be available, just the amount of influence. The novel algorithm we propo...

Foundations and Trends® in Communications and Information Theory
As consumers are increasingly engaged in social networking and e-commerce activities, businesses grow to rely on Big Data analytics for intelligence, and traditional IT infrastructures continue to migrate to the cloud and edge, these trends cause distributed data storage demand to rise at an unprecedented speed. Erasure coding has quickly emerged as a promising technique to reduce storage cost while providing similar reliability as replicated systems, and has been widely adopted by companies like Facebook, Microsoft, and Google. However, it also brings new challenges in characterizing and optimizing the access latency when erasure codes are used in distributed storage. The aim of this monograph is to provide a review of recent progress (both theoretical and practical) on systems that employ erasure codes for distributed storage. In this monograph, we will first identify the key challenges and taxonomy of the research problems and then give an overview of different approaches that have been developed to quantify and model latency of erasure-coded storage. This includes recent work leveraging MDS-Reservation, Fork-Join, Probabilistic, and Delayed-Relaunch scheduling policies, as well as their applications to characterize access latency (e.g., mean, tail, asymptotic latency) of erasure-coded distributed storage systems. We will also extend the problem to the case when users are streaming videos from erasure-coded distributed storage systems. Next, we bridge the gap between theory and practice, and discuss lessons learned from prototype implementation. In particular, we will discuss exemplary implementations of erasure-coded storage, illuminate key design degrees of freedom and tradeoffs, and summarize remaining challenges in real-world storage systems such as in content delivery and caching.
Open problems for future research are discussed at the end of each chapter.
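
As a rough illustration of why access latency in erasure-coded storage is tied to order statistics, the sketch below computes the mean time to collect any k of n chunks when a read is forked to all n servers. It rests on simplifying assumptions that are not taken from the monograph: service times are i.i.d. exponential with rate `mu`, and queueing is ignored entirely. The function name `expected_kth_fastest` is hypothetical.

```python
from fractions import Fraction


def expected_kth_fastest(n, k, mu=Fraction(1)):
    """Mean time until k of n parallel chunk reads finish.

    Assumes i.i.d. exponential service times with rate mu and no queueing,
    so the result is the mean of the k-th order statistic of n exponentials:
    (1/mu) * sum_{i=n-k+1}^{n} 1/i.
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    harmonic_tail = sum(Fraction(1, i) for i in range(n - k + 1, n + 1))
    return harmonic_tail / mu


# Example under the stated assumptions: a (3, 2) code needs the 2 fastest of 3 reads.
mean_latency = expected_kth_fastest(3, 2)
```

Under this toy model, adding redundancy (larger n for fixed k) lowers the mean, which is only the starting point for the scheduling policies surveyed above, since real systems also have to account for queueing.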

arXiv (Cornell University), Sep 12, 2021
We consider the problem of tabular infinite horizon concave utility reinforcement learning (CURL) with convex constraints. For this, we propose a model-based learning algorithm that also achieves zero constraint violations. Assuming that the concave objective and the convex constraints have a solution interior to the set of feasible occupation measures, we solve a tighter optimization problem to ensure that the constraints are never violated despite the imprecise model knowledge and model stochasticity. We use Bellman error-based analysis for tabular infinite-horizon setups, which allows analyzing stochastic policies. Combining the Bellman error-based analysis and tighter optimization equation, for T interactions with the environment, we obtain a high-probability regret guarantee for the objective which grows as $\tilde{O}(1/\sqrt{T})$, excluding other factors. The proposed method can be applied for optimistic algorithms to obtain high-probability regret bounds and also be used for posterior sampling algorithms to obtain loose Bayesian regret bounds but with significant improvement in computational complexity.
Quantum Information and Measurement VI 2021
Inferring causality from observational data alone is one of the most important and challenging problems in statistical inference. We propose a greedy algorithm for quantum entropic causal inference that unifies classical and quantum causal inference.

arXiv (Cornell University), Mar 25, 2022
We propose cooperative edge-assisted dynamic federated learning (CE-FL). CE-FL introduces a distributed machine learning (ML) architecture, where data collection is carried out at the end devices, while the model training is conducted cooperatively at the end devices and the edge servers, enabled via data offloading from the end devices to the edge servers through base stations. CE-FL also introduces a floating aggregation point, where the local models generated at the devices and the servers are aggregated at an edge server, which varies from one model training round to another to cope with the network evolution in terms of data distribution and users' mobility. CE-FL considers the heterogeneity of network elements in terms of communication/computation models and their proximity to one another. CE-FL further presumes a dynamic environment with online variation of data at the network devices, which causes a drift in the ML model performance. We model the processes taken during CE-FL and conduct an analytical convergence analysis of its ML model training. We then formulate network-aware CE-FL, which aims to adaptively optimize all the network elements via tuning their contribution to the learning process, which turns out to be a non-convex mixed integer problem. Motivated by the large scale of the system, we propose a distributed optimization solver to break down the computation of the solution across the network elements. We finally demonstrate the effectiveness of our framework with the data collected from a real-world testbed.

2021 IEEE Global Communications Conference (GLOBECOM), 2021
In this paper, we propose an energy-efficient federated meta-learning framework. The objective is to enable learning a meta-model that can be fine-tuned to a new task with a small number of samples in a distributed setting and at low computation and communication energy consumption. We assume that each task is owned by a separate agent, so a limited number of tasks is used to train a meta-model. Assuming each task was trained offline on the agent's local data, we propose a lightweight algorithm that starts from the local models of all agents and, in a backward manner using projected stochastic gradient ascent (P-SGA), finds a meta-model. The proposed method avoids complex computations such as computing the Hessian, double looping, and matrix inversion, while achieving high performance at significantly less energy consumption compared to state-of-the-art methods such as MAML and iMAML in experiments on sinusoid regression and image classification tasks.

IEEE/ACM Transactions on Networking, 2022
Federated learning has generated significant interest, with nearly all works focused on a "star" topology where nodes/devices are each connected to a central server. We migrate away from this architecture and extend it through the network dimension to the case where there are multiple layers of nodes between the end devices and the server. Specifically, we develop multi-stage hybrid federated learning (MH-FL), a hybrid of intra- and inter-layer model learning that considers the network as a multi-layer cluster-based structure. MH-FL considers the topology structures among the nodes in the clusters, including local networks formed via device-to-device (D2D) communications, and presumes a semi-decentralized architecture for federated learning. It orchestrates the devices at different network layers in a collaborative/cooperative manner (i.e., using D2D interactions) to form local consensus on the model parameters and combines it with multi-stage parameter relaying between layers of the tree-shaped hierarchy. We derive the upper bound of convergence for MH-FL with respect to parameters of the network topology (e.g., the spectral radius) and the learning algorithm (e.g., the number of D2D rounds in different clusters). We obtain a set of policies for the D2D rounds at different clusters to guarantee either a finite optimality gap or convergence to the global optimum. We then develop a distributed control algorithm for MH-FL to tune the D2D rounds in each cluster over time to meet specific convergence criteria. Our experiments on real-world datasets verify our analytical results and demonstrate the advantages of MH-FL in terms of resource utilization metrics. Index Terms: Fog learning, device-to-device communications, peer-to-peer learning, cooperative learning, distributed machine learning, semi-decentralized federated learning.
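
To make the idea of "D2D rounds" forming a local consensus more concrete, here is a minimal sketch of plain distributed averaging inside one cluster. It is a generic consensus iteration, not the MH-FL algorithm itself: each device holds a single scalar standing in for a model parameter, the `neighbors` map and `weight` are illustrative assumptions, and the function name `d2d_consensus` is hypothetical. The link structure is assumed symmetric, with `weight` smaller than one over the largest node degree.

```python
def d2d_consensus(values, neighbors, rounds, weight):
    """Run `rounds` of neighbor averaging on per-device scalars.

    values:    initial scalar held by each device (stand-in for a parameter)
    neighbors: dict mapping device index -> list of neighbor indices
               (assumed symmetric: j in neighbors[i] iff i in neighbors[j])
    weight:    mixing weight, assumed < 1 / (max node degree)
    """
    x = list(values)
    for _ in range(rounds):
        # Each device moves toward its neighbors; all updates use the old x.
        x = [
            xi + weight * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)
        ]
    return x


# Four devices on a ring; more rounds bring every device closer to the mean 2.5.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
after_two_rounds = d2d_consensus([1.0, 2.0, 3.0, 4.0], ring, rounds=2, weight=0.25)
after_many_rounds = d2d_consensus([1.0, 2.0, 3.0, 4.0], ring, rounds=30, weight=0.25)
```

In this toy setting the remaining disagreement shrinks geometrically with the number of rounds, at a rate set by the topology, which is the kind of trade-off between D2D rounds and convergence that the abstract refers to.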

ICC 2019 - 2019 IEEE International Conference on Communications (ICC), 2019
One of the 5G promises is to provide ultra-reliable low-latency communications (URLLC), which targets an end-to-end communication latency of ≤ 1 ms. The very low latency requirement of URLLC entails a lot of work in all networking layers. In this paper, we focus on the physical layer, and in particular, we propose a novel formulation of the massive MIMO uplink detection problem. We introduce an objective function that is a sum of strictly convex and separable functions based on decomposing the received vector into multiple vectors. Each vector represents the contribution of one of the transmitted symbols in the received vector. Proximal Jacobian Alternating Direction Method of Multipliers (PJADMM) is used to solve the newly formulated problem in an iterative manner, where at every iteration all variables are updated in parallel and in a closed-form expression. The proposed algorithm provides lower complexity and much faster processing time compared to the conventional MMSE detection technique and other iterative techniques, especially when the number of single-antenna users is close to the number of base station (BS) antennas. This improvement is obtained without any matrix inversion. Simulation results demonstrate the efficacy of the proposed algorithm in reducing detection processing time in the multiuser uplink massive MIMO setting.

Most communication systems use some form of feedback, often related to channel state information. The common models used in analyses either assume perfect channel state information at the receiver and/or noiseless state feedback links. However, in practical systems, neither the channel estimate at the receiver nor the feedback link is perfect. In this paper, we study the achievable diversity-multiplexing tradeoff using i.i.d. Gaussian codebooks, considering the errors in training the receiver and the errors in the feedback link for FDD systems, where the forward and the feedback are independent MIMO channels. Our key result is that the maximum diversity order with one bit of feedback information is identical to systems with more feedback bits. Thus, asymptotically in SNR, more than one bit of feedback does not improve the system performance at constant rates. Furthermore, the one-bit diversity-multiplexing performance is identical to the system which...
2020 54th Annual Conference on Information Sciences and Systems (CISS), 2020
Gradient descent and its variants are widely used in machine learning. However, oracle access to the gradient may not be available in many applications, limiting the direct use of gradient descent. This paper proposes a method of estimating the gradient to perform gradient descent that converges to a stationary point for general non-convex optimization problems. Beyond the first-order stationary properties, the second-order stationary properties are important in machine learning applications to achieve better performance. We show that the proposed model-free non-convex optimization algorithm returns an $\epsilon$-second-order stationary point with $O\left(d^{2+\frac{\theta}{2}}/\epsilon^{8+\theta}\right)$ queries of the function for any arbitrary $\theta > 0$.
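
The general idea of running gradient descent with only function queries can be sketched as follows. This is not the estimator analyzed in the paper, whose details are not given in the abstract; it swaps in a simple coordinate-wise central-difference estimate, and it makes no attempt to reproduce the second-order (saddle-escaping) guarantee. The names `estimate_gradient` and `estimated_gradient_descent`, and the default step sizes, are hypothetical.

```python
def estimate_gradient(f, x, h=1e-5):
    """Approximate the gradient of f at x using 2*d function queries.

    Uses central differences along each coordinate (an illustrative choice,
    not the paper's estimator).
    """
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


def estimated_gradient_descent(f, x0, step=0.1, iterations=200):
    """Plain gradient descent driven only by function evaluations of f."""
    x = list(x0)
    for _ in range(iterations):
        g = estimate_gradient(f, x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


# Toy objective with minimizer at (1, 1); only f's values are ever queried.
def shifted_quadratic(x):
    return sum((xi - 1.0) ** 2 for xi in x)


solution = estimated_gradient_descent(shifted_quadratic, [3.0, -2.0])
```

Each iteration of this sketch costs 2d function queries in dimension d, which gives a feel for why query complexity in such methods scales with the dimension.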

ArXiv, 2019
Reinforcement Learning (RL) is being increasingly applied to optimize complex functions that may have a stochastic component. RL is extended to multi-agent systems to find policies to optimize systems that require agents to coordinate or to compete under the umbrella of Multi-Agent RL (MARL). A crucial factor in the success of RL is that the optimization problem is represented as the expected sum of rewards, which allows the use of backward induction for the solution. However, many real-world problems require a joint objective that is non-linear, and dynamic programming cannot be applied directly. For example, in a resource allocation problem, one of the objectives is to maximize long-term fairness among the users. This paper addresses and formalizes the problem of joint objective optimization, where not only the sum of rewards of each agent but a function of the sum of rewards of each agent needs to be optimized. The proposed algorithms at the centralized controller aim to learn the...

ArXiv, 2020
In the optimization of dynamic systems, the variables typically have constraints. Such problems can be modeled as a Constrained Markov Decision Process (CMDP). This paper considers peak constraints, where the agent chooses the policy to maximize the long-term average reward while satisfying the constraints at each time. We propose a model-free algorithm that converts the CMDP problem to an unconstrained problem, to which a Q-learning based approach is applied. The proposed algorithm achieves a $\tilde{O}(T^{\frac{1}{2}+\epsilon}\sqrt{H^4SA})$ bound for the regret and a $O(HT^{\frac{1}{2}+\epsilon})$ bound for the number of constraint violations, where $\epsilon>0$ is an arbitrary positive number, $T$ is the time horizon, $S$ and $A$ are the numbers of states and actions, respectively, and $H$ is the number of steps per episode. We note that these are the first results on regret analysis for CMDP with peak constraints where the transition probabilities are not known a priori. We demonstrate the prop...