A Restless Bandit Formulation of Opportunistic Access: Indexablity and Index Policy
2008 5th IEEE Annual Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks Workshops, 2008
... with time-varying states. With limited sensing, a user can only sense and access a subset of channels and accrue rewards determined by the state of the sensed channels. We formulate the problem of optimal sequential channel probing as a restless multi-armed bandit process ...
Distributed learning in cognitive radio networks: Multi-armed bandit with distributed multiple players
2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010
... The policy proposed here, however, applies to more general reward models (for example, Gaussian and Poisson ... Recently, a variation of centralized MAB in the context of cognitive radio has been considered in [6 ... 0, 1} and $f(y; \theta_i) = \theta_i^{y} (1 - \theta_i)^{1-y}$, i.e., the reward process on each ...
Logarithmic weak regret of non-Bayesian restless multi-armed bandit
2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011
... 27, pp. 1054-1078, 1995. [5] P. Auer, N. Cesa-Bianchi, P. Fischer, “Finite-time Analysis of the Multiarmed Bandit Problem,” Machine Learning, 47, 235-256, 2002. [6] C. Tekin, M. Liu, “Online Algorithms for the Multi-Armed Bandit Problem With Markovian Rewards,” Proc. ...
Fourth International Conference on Software Engineering Research, Management and Applications (SERA'06), 2006
With growing competition in the global marketplace, the call center has become an essential part of many companies. Outbound call center technology plays a very important role in Customer Relationship Management (CRM) and marketing. The call centers of large service companies are usually characterized by wide geographical distribution, high call volume, high efficiency and centralized data processing. With IP network infrastructure now established in many large companies, we discuss a solution for a distributed IP-based predictive dialing system, including its architectural design, network topology and system functions. Finally, we discuss the benefit of developing a predictive dialing system using the virtual call center.
Distributed learning under imperfect sensing in cognitive radio networks
2010 Conference Record of the Forty Fourth Asilomar Conference on Signals, Systems and Computers, 2010
We consider a cognitive radio network, where M distributed secondary users search for spectrum opportunities among N independent channels without information exchange. The occupancy of each channel by the primary network is modeled as a Bernoulli process with unknown mean which represents the unknown traffic load of the primary network. In each slot, a secondary transmitter chooses one channel to
Indexability and whittle index for restless bandit problems involving reset processes
IEEE Conference on Decision and Control and European Control Conference, 2011
We consider a class of restless multi-armed bandit (RMAB) problems, in which the active action resets the stochastic evolution of the system. We obtain the Whittle index in closed form, showing that it induces a policy that is equivalent to the myopic policy, and that it is optimal for stochastically identical arms. These results find applications in opportunistic spectrum access and supervisory control systems such as anomaly detection and control.
2010 Information Theory and Applications Workshop (ITA), 2010
We formulate and study a decentralized multi-armed bandit (MAB) problem. There are M distributed players competing for N independent arms. Each arm, when played, offers i.i.d. reward according to a distribution with an unknown parameter. At each time, each player chooses one arm to play without exchanging observations or any information with other players. Players choosing the same arm collide, and, depending on the collision model, either no one receives reward or the colliding players share the reward in an arbitrary way. We show that the minimum system regret of the decentralized MAB grows with time at the same logarithmic order as in the centralized counterpart where players act collectively as a single entity by exchanging observations and making decisions jointly. A decentralized policy is constructed to achieve this optimal order while ensuring fairness among players and without assuming any pre-agreement or information exchange among players. Based on a Time Division Fair Sharing (TDFS) of the M best arms, the proposed policy is constructed and its order optimality is proven under a general reward model. Furthermore, the basic structure of the TDFS policy can be used with any order-optimal single-player policy to achieve order optimality in the decentralized setting. We also establish a lower bound on the system regret growth rate for a general class of decentralized policies, to which the proposed policy belongs. This problem finds potential applications in cognitive radio networks, multi-channel communication systems, multi-agent systems, web search and advertising, and social networks.
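The time-sharing idea behind TDFS can be illustrated with a minimal Python sketch. It is not the policy from the paper: it assumes the arm means are already known and that each player holds a distinct offset, whereas the paper's policy learns the arm ranking online and assumes no pre-agreement. The names `tdfs_targets` and `simulate` are hypothetical. The sketch only shows how rotating through the M best arms avoids collisions and gives every player the same expected reward.

```python
def tdfs_targets(t, num_players):
    """Rank (0 = best arm) that each player targets in slot t.

    Assumption: player k uses the fixed offset k, so targets are distinct.
    """
    return [(t + k) % num_players for k in range(num_players)]


def simulate(means, num_players, horizon):
    """Round-robin sharing of the M best arms with known means (assumption)."""
    # Rank arms from best to worst and keep the M best ones.
    ranked = sorted(range(len(means)), key=lambda a: means[a], reverse=True)
    best = ranked[:num_players]

    # plays[p][r] counts how often player p played the r-th best arm.
    plays = [[0] * num_players for _ in range(num_players)]
    collisions = 0
    for t in range(horizon):
        targets = tdfs_targets(t, num_players)
        if len(set(targets)) < num_players:
            collisions += 1  # two players picked the same arm in this slot
        for player, rank in enumerate(targets):
            plays[player][rank] += 1

    # Expected total reward per player, using the known means.
    expected = [
        sum(means[best[r]] * plays[p][r] for r in range(num_players))
        for p in range(num_players)
    ]
    return plays, collisions, expected


if __name__ == "__main__":
    plays, collisions, expected = simulate([0.2, 0.9, 0.5, 0.7], 2, 10)
    print(plays, collisions, expected)
```

With two players and four arms, each player alternates between the two best arms, so neither is permanently assigned the second-best arm.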
Distributed Sensing and Access in Cognitive Radio Networks
2008 IEEE 10th International Symposium on Spread Spectrum Techniques and Applications, 2008
... A suboptimal randomized sensing policy is then proposed. This policy effectively addresses this design tradeoff and offers significant improvement in network throughput over the optimal single-user design. Index Terms: Cognitive radio, spectrum opportunity tracking, multi ...
Papers by Keqin Liu