Journal of the American Statistical Association, Nov 2, 2022
In this paper, we study the contextual dynamic pricing problem where the market value of a product is linear in some observed features plus some market noise. Products are sold one at a time, and only a binary response indicating success or failure of a sale is observed. Our model setting is similar to Javanmard and Nazerzadeh (2019) except that we expand the demand curve to a semi-parametric model and need to learn both the parametric and non-parametric components dynamically. We propose a dynamic statistical learning and decision-making policy that combines semi-parametric estimation from a generalized linear model with online decision making to minimize regret (maximize revenue). Under mild conditions, we show that for a market noise c.d.f. F(·) with an m-th order derivative (m ≥ 2), our policy achieves a regret upper bound of O_d(T^{(2m+1)/(4m-1)}), where T is the time horizon and O_d is the order that hides logarithmic terms and the feature dimensionality d. The upper bound is further reduced to O_d(√T) if F is super smooth, i.e., its Fourier transform decays exponentially. In terms of dependence on the horizon T, these upper bounds are close to Ω(√T), the lower bound when F belongs to a parametric class. We further generalize these results to the case of dynamically dependent product features under a strong mixing condition.
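Under this demand model, a sale occurs when the market value x⊤θ plus noise exceeds the posted price, so the observed binary response is Bernoulli with probability 1 − F(price − x⊤θ). A minimal simulation sketch (the parameter values and the logistic choice of F are hypothetical, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
theta = np.array([0.5, -0.2, 0.8])            # hypothetical true parameter

def sale_probability(x, price, noise_cdf):
    # Market value v = x @ theta + noise; the sale succeeds iff v >= price,
    # so P(sale) = 1 - F(price - x @ theta).
    return 1.0 - noise_cdf(price - x @ theta)

# A smooth market-noise c.d.f. F (logistic, chosen only for illustration).
logistic_cdf = lambda z: 1.0 / (1.0 + np.exp(-z))

x = rng.normal(size=d)
price = 0.4
prob = sale_probability(x, price, logistic_cdf)
sale = bool(rng.random() < prob)              # only this binary response is seen
```

The seller never observes the market value itself, only `sale`, which is what makes the joint estimation of θ and F nontrivial.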
This paper studies the performance of the spectral method in the estimation and uncertainty quantification of the unobserved preference scores of compared entities in a very general and more realistic setup, in which the comparison graph consists of hyper-edges of possibly heterogeneous sizes and the number of comparisons can be as low as one for a given hyper-edge. Such a setting is pervasive in real applications, circumventing the need to specify the graph randomness and the restrictive homogeneous sampling assumption imposed in the commonly used Bradley-Terry-Luce (BTL) or Plackett-Luce (PL) models. Furthermore, in scenarios where the BTL or PL models are appropriate, we unravel the relationship between the spectral estimator and the Maximum Likelihood Estimator (MLE). We discover that a two-step spectral method, in which we apply the optimal weighting estimated from the equally weighted vanilla spectral method, can achieve the same asymptotic efficiency as the MLE. Given the asymptotic distributions of the estimated preference scores, we also introduce a comprehensive framework to carry out both one-sample and two-sample ranking inferences, applicable to both fixed and random graph settings. Notably, this is the first time that effective two-sample rank testing methods have been proposed. Finally, we substantiate our findings via comprehensive numerical simulations and subsequently apply our developed methodologies to perform statistical inferences on rankings of statistics journals and movies.
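A vanilla (equally weighted) spectral estimator of this kind can be sketched in the pairwise special case as a random walk whose stationary distribution recovers the preference scores. The win counts below are hypothetical, and this rank-centrality-style illustration is not the paper's two-step weighted method:

```python
import numpy as np

# wins[i, j] = number of times item i beat item j (hypothetical counts).
wins = np.array([[0, 3, 4],
                 [1, 0, 2],
                 [0, 2, 0]], dtype=float)
n = wins.shape[0]
total = wins + wins.T                         # comparisons per pair

# Random walk that moves from i to j in proportion to how often i lost to j;
# dividing by n keeps every row of P a probability distribution.
P = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j and total[i, j] > 0:
            P[i, j] = wins[j, i] / (total[i, j] * n)
    P[i, i] = 1.0 - P[i].sum()

# The stationary distribution (left eigenvector for eigenvalue 1) gives the
# estimated preference scores: frequent winners accumulate more mass.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi = pi / pi.sum()
```

Here item 0, which wins most of its comparisons, receives the largest stationary mass and hence the top rank.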
This paper concerns statistical estimation and inference for ranking problems based on pairwise comparisons with additional covariate information, such as the attributes of the compared items. Despite extensive studies, little prior literature investigates this problem under the more realistic setting where covariate information exists. To tackle this issue, we propose a novel model, the Covariate-Assisted Ranking Estimation (CARE) model, which extends the well-known Bradley-Terry-Luce (BTL) model by incorporating covariate information. Specifically, instead of assuming that every compared item has a fixed latent score {θ*_i}_{i=1}^n, we assume the underlying scores are given by {α*_i + x_i⊤β*}_{i=1}^n, where α*_i and x_i⊤β* represent the latent baseline and covariate score of the i-th item, respectively. We impose natural identifiability conditions and derive the ℓ∞- and ℓ2-optimal rates for the maximum likelihood estimator of {α*_i}_{i=1}^n and β* under a sparse comparison graph, using a novel 'leave-one-out' technique (Chen et al., 2019). To conduct statistical inference, we further derive the asymptotic distributions of the MLE of {α*_i}_{i=1}^n and β* with minimal sample complexity. This allows us to answer the question of whether some covariates have any explanatory power for the latent scores and to threshold some sparse parameters to improve ranking performance. We improve the approximation method used in Gao et al. (2021) for the BTL model and generalize it to the CARE model. Moreover, we validate our theoretical results through large-scale numerical studies and an application to the mutual fund stock holding dataset.
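The CARE score decomposition and the induced comparison probability can be sketched as follows; all numbers are hypothetical, with the standard BTL logistic comparison function standing in:

```python
import numpy as np

def care_scores(alpha, X, beta):
    # CARE latent score of item i: alpha_i (baseline) + x_i @ beta (covariates).
    return alpha + X @ beta

def win_prob(scores, i, j):
    # BTL probability that item i beats item j in a pairwise comparison.
    return 1.0 / (1.0 + np.exp(-(scores[i] - scores[j])))

alpha = np.array([0.2, -0.1, 0.0])            # hypothetical baseline scores
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.5, 0.5]])                    # hypothetical item covariates
beta = np.array([0.3, -0.3])
s = care_scores(alpha, X, beta)               # scores alpha_i + x_i @ beta
```

Setting `beta` to zero recovers the plain BTL model, which is what the testing framework in the abstract probes: whether the covariates add explanatory power beyond the baselines.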
This paper considers ranking inference for n items based on observed data on the top choice among M randomly selected items at each trial. This is a useful modification of the Plackett-Luce model for M-way ranking with only the top choice observed, and is an extension of the celebrated Bradley-Terry-Luce model, which corresponds to M = 2. Under a uniform sampling scheme in which any M distinct items are selected for comparison with probability p and the selected M items are compared L times with multinomial outcomes, we establish the statistical rates of convergence for the underlying n preference scores in both the ℓ2-norm and the ℓ∞-norm, with the minimum sampling complexity. In addition, we establish the asymptotic normality of the maximum likelihood estimator, which allows us to construct confidence intervals for the underlying scores. Furthermore, we propose a novel inference framework for ranking items through a sophisticated maximum pairwise difference statistic whose distribution is estimated via a valid Gaussian multiplier bootstrap. The estimated distribution is then used to construct simultaneous confidence intervals for the differences in the preference scores and for the ranks of individual items. These also enable us to address various inference questions on the ranks of the items. Extensive simulation studies lend further support to our theoretical results. A real data application illustrates the usefulness of the proposed methods convincingly.
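Under this Plackett-Luce-type model, the probability that a given item is the top choice among the M selected items is a softmax of the preference scores over the subset. A minimal sketch with hypothetical scores:

```python
import numpy as np

def top_choice_probs(theta, subset):
    # Probability that each item in `subset` is the top choice among the
    # M selected items: a softmax of the preference scores over the subset.
    w = np.exp(theta[subset])
    return w / w.sum()

theta = np.array([1.0, 0.5, 0.0, -0.5])       # hypothetical preference scores
subset = np.array([0, 2, 3])                  # M = 3 items drawn in this trial
p = top_choice_probs(theta, subset)           # multinomial outcome probabilities
```

Each trial then records L multinomial draws from `p`, and only these top-choice counts (not full rankings) enter the likelihood.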
A stylized feature of high-dimensional data is that many variables have heavy tails, and robust statistical inference is critical for valid large-scale statistical inference. Yet existing developments such as Winsorization, Huberization, and the median of means require bounded second moments and involve variable-dependent tuning parameters, which hamper their fidelity in applications to large-scale problems. To liberate these constraints, this paper revisits the celebrated Hodges-Lehmann (HL) estimator for estimating location parameters in both the one- and two-sample problems, from a non-asymptotic perspective. Our study develops a Berry-Esseen inequality and a Cramér-type moderate deviation result for the HL estimator based on a newly developed non-asymptotic Bahadur representation, and builds data-driven confidence intervals via a weighted bootstrap approach. These results allow us to extend the HL estimator to large-scale studies and to propose tuning-free and moment-free high-dimensional inference procedures for testing the global null and for large-scale multiple testing with false discovery proportion control. It is convincingly shown that the resulting tuning-free and moment-free methods control the false discovery proportion at a prescribed level. Simulation studies lend further support to our developed theory.
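The one-sample HL estimator itself is simple: the median of all pairwise (Walsh) averages. A minimal sketch showing why no tuning parameter or moment condition is needed, on data with a gross outlier:

```python
import numpy as np

def hodges_lehmann(x):
    # One-sample HL estimator: the median of all Walsh (pairwise) averages
    # (x_i + x_j) / 2 over i <= j. No tuning parameter, no moment condition.
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x))
    return np.median((x[i] + x[j]) / 2.0)

sample = [1.0, 2.0, 3.0, 100.0]               # one gross outlier
est = hodges_lehmann(sample)                  # stays near the bulk of the data
```

The sample mean of this data is 26.5, while the HL estimate stays at 2.75, near the bulk of the observations.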
We study offline reinforcement learning under a novel model called the strategic MDP, which characterizes the strategic interactions between a principal and a sequence of myopic agents with private types. Due to the bilevel structure and private types, the strategic MDP involves information asymmetry between the principal and the agents. We focus on the offline RL problem, where the goal is to learn the optimal policy of the principal with respect to a target population of agents, based on a pre-collected dataset of historical interactions. The unobserved private types confound such a dataset, as they affect both the rewards and the observations received by the principal. We propose a novel algorithm, pessimistic policy learning with algorithmic instruments (PLAN), which leverages the ideas of instrumental variable regression and the pessimism principle to learn a near-optimal principal's policy in the context of general function approximation. Our algorithm is based on the critical observation that the principal's actions serve as valid instrumental variables. In particular, under a partial coverage assumption on the offline dataset, we prove that PLAN outputs a 1/√K-optimal policy, with K being the number of collected trajectories. We further apply our framework to some special cases of the strategic MDP, including strategic regression (Harris et al., 2021b), strategic bandits, and noncompliance in recommendation systems (Robins, 1998).
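The instrumental-variable idea at the heart of PLAN can be illustrated in its simplest linear form: when an unobserved type confounds treatment and outcome, naive regression is biased, while two-stage least squares using a valid instrument recovers the causal effect. The simulation below is a generic 2SLS sketch with made-up coefficients, not the PLAN algorithm itself:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
u = rng.normal(size=n)                  # unobserved confounder (private type)
z = rng.normal(size=n)                  # instrument: the principal's action
a = z + u + 0.1 * rng.normal(size=n)    # treatment, confounded by u
y = 2.0 * a + u + 0.1 * rng.normal(size=n)   # outcome; true causal slope = 2

# Naive regression of y on a is biased upward by the confounder u.
ols = (a @ y) / (a @ a)

# Two-stage least squares: project a on z, then regress y on the projection.
a_hat = z * ((z @ a) / (z @ z))
iv = (a_hat @ y) / (a_hat @ a_hat)
```

The instrument `z` is correlated with the treatment but independent of the confounder, which is exactly the role the abstract assigns to the principal's actions.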
Journal of the American Statistical Association, 2022
In this paper, we leverage over-parameterization to design regularization-free algorithms for the high-dimensional single index model and provide theoretical guarantees for the induced implicit regularization phenomenon. Specifically, we study both vector and matrix single index models, where the link function is nonlinear and unknown, the signal parameter is either a sparse vector or a low-rank symmetric matrix, and the response variable can be heavy-tailed. To gain a better understanding of the role played by implicit regularization without excess technicality, we assume that the distribution of the covariates is known a priori. For both the vector and matrix settings, we construct an over-parameterized least-squares loss function by employing the score function transform and a robust truncation step designed specifically for heavy-tailed data. We propose to estimate the true parameter by applying regularization-free gradient descent to the loss function. When the initialization is close to the origin and the stepsize is sufficiently small, we prove that the obtained solution achieves minimax-optimal statistical rates of convergence in both the vector and matrix cases. In addition, our experimental results support our theoretical findings and also demonstrate that our methods empirically outperform classical methods with explicit regularization in terms of both the ℓ2 statistical rate and variable selection consistency.
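The vector case can be illustrated with the classical Hadamard over-parameterization w = u∘u − v∘v: plain gradient descent on an unregularized least-squares loss, started near the origin with a small stepsize, is implicitly biased toward sparse solutions. The sketch below uses a noiseless linear model for simplicity (no score transform or truncation, and all sizes are hypothetical), so it illustrates the implicit-regularization mechanism rather than the paper's full procedure:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 50
w_true = np.zeros(d)
w_true[:3] = [1.0, -2.0, 1.5]                 # sparse signal
X = rng.normal(size=(n, d))
y = X @ w_true                                # noiseless for simplicity

# Over-parameterize w = u*u - v*v and run plain gradient descent on the
# unregularized least-squares loss, starting near the origin.
alpha, eta = 1e-3, 1e-3                       # small init, small stepsize
u = np.full(d, alpha)
v = np.full(d, alpha)
for _ in range(20000):
    grad = X.T @ (X @ (u * u - v * v) - y) / n
    u, v = u - eta * 2.0 * u * grad, v + eta * 2.0 * v * grad
w_hat = u * u - v * v                         # implicitly biased toward sparsity
```

Coordinates aligned with the signal grow multiplicatively and converge, while the remaining coordinates stay at the scale of the tiny initialization, so no explicit sparsity penalty is ever applied.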
This paper is concerned with the statistical learning of the extreme smog (PM2.5) dynamics of a vast region in China. Differently from classical extreme value modeling approaches, this paper develops a dynamic model based on the conditional exponentiated Weibull distribution for modeling and analyzing regional smog extremes, particularly the worst scenarios observed each day. To gain higher modeling efficiency, weather factors are introduced in an enhanced model. The proposed model and the enhanced model are illustrated with temporal/spatial maxima of hourly PM2.5 observations each day from smog monitoring stations located in the Beijing–Tianjin–Hebei geographical region between 2014 and 2019. The proposed model fits more precisely than previous models dealing with maxima with autoregressive parameter dynamics, and provides relatively accurate prediction as well. The findings enhance the understanding of how severe extreme smog scenarios can be and ...
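The exponentiated Weibull c.d.f. underlying this family is F(x) = (1 − exp(−(x/λ)^k))^a for x ≥ 0, which reduces to the ordinary Weibull when a = 1. A minimal static sketch (the parameter names are generic; the paper's model makes these parameters dynamic and weather-dependent):

```python
import numpy as np

def exp_weibull_cdf(x, k, lam, a):
    # Exponentiated Weibull c.d.f.: F(x) = (1 - exp(-(x / lam)**k))**a, x >= 0;
    # a = 1 recovers the ordinary Weibull distribution.
    z = np.maximum(np.asarray(x, dtype=float), 0.0) / lam
    return (1.0 - np.exp(-(z ** k))) ** a

# Hypothetical parameters; F increases from 0 toward 1 on [0, infinity).
vals = exp_weibull_cdf(np.array([0.0, 0.5, 1.0, 5.0]), k=2.0, lam=1.0, a=1.0)
```

The extra exponent `a` gives the family more flexibility in the tail than the plain Weibull, which is what makes it attractive for daily smog maxima.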
We propose the Factor Augmented sparse linear Regression Model (FARM), which not only encompasses both latent factor regression and sparse linear regression as special cases but also bridges dimension reduction and sparse regression. We provide theoretical guarantees for the estimation of our model under the existence of sub-Gaussian and heavy-tailed noises (with bounded (1+θ)-th moment, for any θ > 0), respectively. In addition, existing works on supervised learning often assume that the latent factor regression or the sparse linear regression is the true underlying model without justifying its adequacy. To fill this important gap, we also leverage our model as the alternative model to test the sufficiency of the latent factor regression and the sparse linear regression models. To accomplish these goals, we propose the Factor-Adjusted deBiased Test (FabTest) and a two-stage ANOVA-type test, respectively. We also conduct large-scale numerical experiments including both sy...
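The factor-augmented structure can be sketched in two steps: estimate the latent factors from the design matrix by PCA, then regress the response on the estimated factors (FARM additionally runs a sparse regression on the idiosyncratic components, omitted here). All dimensions and coefficients below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, K = 500, 40, 2
F_lat = rng.normal(size=(n, K))               # latent factors
B = rng.normal(size=(p, K))                   # factor loadings
X = F_lat @ B.T + rng.normal(size=(n, p))     # observed covariates

# Step 1: estimate the factor space by PCA (top-K right singular vectors).
_, _, Vt = np.linalg.svd(X, full_matrices=False)
F_hat = X @ Vt[:K].T / np.sqrt(p)

# Step 2: regress y on the estimated factors (FARM would add a sparse
# regression on the idiosyncratic residuals as well).
y = F_lat @ np.array([1.0, -1.0]) + 0.1 * rng.normal(size=n)
coef, *_ = np.linalg.lstsq(F_hat, y, rcond=None)
y_hat = F_hat @ coef
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```

When the response depends only on the factors, as simulated here, the factor regression alone already explains most of the variance; FARM's test asks whether the sparse idiosyncratic part adds anything on top.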
Papers by Mengxin Yu