Keqin Liu

On the myopic policy for a class of restless bandit problems with applications in dynamic multichannel access

Proceedings of the 48th IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference, 2009

A Restless Bandit Formulation of Opportunistic Access: Indexability and Index Policy

2008 5th IEEE Annual Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks Workshops, 2008

... with time-varying states. With limited sensing, a user can only sense and access a subset of ...

Dynamic intrusion detection in resource-constrained cyber networks

2012 IEEE International Symposium on Information Theory Proceedings, 2012

Multi-armed bandit problems with heavy-tailed reward distributions

2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2011

Distributed learning in cognitive radio networks: Multi-armed bandit with distributed multiple players

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, 2010

... The policy proposed here, however, applies to more general reward models (for example, Gaussian and Poisson ... Recently, a variation of centralized MAB in the context of cognitive radio has been considered in [6 ... 0, 1} and $f(y; \theta_i) = \theta_i^{y} (1 - \theta_i)^{1-y}$, i.e., the reward process on each ...

Dynamic probing for intrusion detection under resource constraints

2013 IEEE International Conference on Communications (ICC), 2013

Link throughput of multi-channel opportunistic access with limited sensing

2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 2008

Detecting, tracking, and exploiting spectrum opportunities in unslotted primary systems

2008 IEEE Radio and Wireless Symposium, 2008

Logarithmic weak regret of non-Bayesian restless multi-armed bandit

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011

... 27, pp. 1054–1078, 1995. [5] P. Auer, N. Cesa-Bianchi, P. Fischer, “Finite-time Analysis of t...

Learning and sharing in a changing world: Non-Bayesian restless bandit with multiple players

2011 Information Theory and Applications Workshop, 2011

Research on Predictive Dialing System Based on Distributed Call Center

Fourth International Conference on Software Engineering Research, Management and Applications (SERA'06), 2006

With the growing competition in the global marketplace, the call center has become an essential part of many companies. Outbound call center technology plays a very important role in Customer Relationship Management (CRM) and marketing. The call centers of large service companies are usually characterized by wide geographical distribution, high call volume, high efficiency, and centralized data processing. With the establishment of IP network infrastructure in many large companies, a solution for a distributed IP-based predictive dialing system is discussed, including its architectural design, network topology, and system functions. Finally, the benefit of developing a predictive dialing system using the virtual call center is discussed.

Distributed learning under imperfect sensing in cognitive radio networks

2010 Conference Record of the Forty Fourth Asilomar Conference on Signals, Systems and Computers, 2010

We consider a cognitive radio network, where M distributed secondary users search for spectrum opportunities among N independent channels without information exchange. The occupancy of each channel by the primary network is modeled as a Bernoulli process with unknown mean which represents the unknown traffic load of the primary network. In each slot, a secondary transmitter chooses one channel to

Indexability and Whittle index for restless bandit problems involving reset processes

IEEE Conference on Decision and Control and European Control Conference, 2011

We consider a class of restless multi-armed bandit (RMAB) problems, in which the active action resets the stochastic evolution of the system. We obtain the Whittle index in closed form, showing that it induces a policy that is equivalent to the myopic policy, and that it is optimal for stochastically identical arms. These results find applications in opportunistic spectrum access and supervisory control systems such as anomaly detection and control.

Channel probing for opportunistic access with multi-channel sensing

2008 42nd Asilomar Conference on Signals, Systems and Computers, 2008

Learning from collisions in cognitive radio networks: Time Division Fair Sharing without pre-agreement

MILCOM 2010 Military Communications Conference, 2010

Low-Complexity Approaches to Spectrum Opportunity Tracking

2007 2nd International Conference on Cognitive Radio Oriented Wireless Networks and Communications, 2007

Decentralized multi-armed bandit with multiple distributed players

2010 Information Theory and Applications Workshop (ITA), 2010

We formulate and study a decentralized multi-armed bandit (MAB) problem. There are M distributed players competing for N independent arms. Each arm, when played, offers i.i.d. reward according to a distribution with an unknown parameter. At each time, each player chooses one arm to play without exchanging observations or any information with other players. Players choosing the same arm collide, and, depending on the collision model, either no one receives reward or the colliding players share the reward in an arbitrary way. We show that the minimum system regret of the decentralized MAB grows with time at the same logarithmic order as in the centralized counterpart where players act collectively as a single entity by exchanging observations and making decisions jointly. A decentralized policy is constructed to achieve this optimal order while ensuring fairness among players and without assuming any pre-agreement or information exchange among players. Based on a Time Division Fair Sharing (TDFS) of the M best arms, the proposed policy is constructed and its order optimality is proven under a general reward model. Furthermore, the basic structure of the TDFS policy can be used with any order-optimal single-player policy to achieve order optimality in the decentralized setting. We also establish a lower bound on the system regret growth rate for a general class of decentralized policies, to which the proposed policy belongs. This problem finds potential applications in cognitive radio networks, multi-channel communication systems, multi-agent systems, web search and advertising, and social networks.

Online learning for stochastic linear optimization problems

2012 Information Theory and Applications Workshop, 2012

Distributed Learning in Multi-Armed Bandit With Multiple Players

We formulate and study a decentralized multi-armed bandit (MAB) problem. There are M distributed players competing for N independent arms. Each arm, when played, offers i.i.d. reward according to a distribution with an unknown parameter. At each time, each player chooses one arm to play without exchanging observations or any information with other players. Players choosing the same arm collide, and, depending on the collision model, either no one receives reward or the colliding players share the reward in an arbitrary way. We show that the minimum system regret of the decentralized MAB grows with time at the same logarithmic order as in the centralized counterpart where players act collectively as a single entity by exchanging observations and making decisions jointly. A decentralized policy is constructed to achieve this optimal order while ensuring fairness among players and without assuming any pre-agreement or information exchange among players. Based on a Time Division Fair Sharing (TDFS) of the M best arms, the proposed policy is constructed and its order optimality is proven under a general reward model. Furthermore, the basic structure of the TDFS policy can be used with any order-optimal single-player policy to achieve order optimality in the decentralized setting. We also establish a lower bound on the system regret growth rate for a general class of decentralized policies, to which the proposed policy belongs. This problem finds potential applications in cognitive radio networks, multi-channel communication systems, multi-agent systems, web search and advertising, and social networks.

Distributed Sensing and Access in Cognitive Radio Networks

2008 IEEE 10th International Symposium on Spread Spectrum Techniques and Applications, 2008

... A suboptimal randomized sensing policy is then proposed. This policy effectively addresses th...

Papers by Keqin Liu
